Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Ville Voutilainen
On 13 January 2017 at 09:56, Ville Voutilainen
 wrote:
>> problem with just going through all of them and splicing the contents
>> (if any) back to *this?
>
> Well, in addition to the computational complexity of it, not knowing
> which elements should be spliced
> back where. If a comparator given to sort() throws, trying to "unsort"
> with the same comparator
> can also throw, so I don't know how to reverse the operations done by
> that point.

Ah, I think I see what you're saying. Just splice them back in any
order. Ok, I'll give that a spin.


Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Tim Song
On Fri, Jan 13, 2017 at 3:00 AM, Ville Voutilainen
 wrote:
> On 13 January 2017 at 09:56, Ville Voutilainen
>  wrote:
>>> problem with just going through all of them and splicing the contents
>>> (if any) back to *this?
>>
>> Well, in addition to the computational complexity of it, not knowing
>> which elements should be spliced
>> back where. If a comparator given to sort() throws, trying to "unsort"
>> with the same comparator
>> can also throw, so I don't know how to reverse the operations done by
>> that point.
>
> Ah, I think I see what you're saying. Just splice them back in any
> order. Ok, I'll give that a spin.

Right, list::sort doesn't promise strong exception safety, so
"unsorting" is not needed.

Also, shouldn't merge() rethrow the caught exception rather than swallow it?


Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Ville Voutilainen
On 13 January 2017 at 10:02, Tim Song  wrote:
> On Fri, Jan 13, 2017 at 3:00 AM, Ville Voutilainen
>  wrote:
>> On 13 January 2017 at 09:56, Ville Voutilainen
>>  wrote:
 problem with just going through all of them and splicing the contents
 (if any) back to *this?
>>>
>>> Well, in addition to the computational complexity of it, not knowing
>>> which elements should be spliced
>>> back where. If a comparator given to sort() throws, trying to "unsort"
>>> with the same comparator
>>> can also throw, so I don't know how to reverse the operations done by
>>> that point.
>>
>> Ah, I think I see what you're saying. Just splice them back in any
>> order. Ok, I'll give that a spin.
>
> Right, list::sort doesn't promise strong exception safety, so
> "unsorting" is not needed.
>
> Also, shouldn't merge() rethrow the caught exception rather than swallow it?

Ha, yes, well spotted. I'll cook up an improved patch.


Re: [PATCH][GIMPLE FE] Add parsing of MEM_REFs

2017-01-13 Thread Richard Biener
On Thu, 12 Jan 2017, Richard Biener wrote:

> On Thu, 12 Jan 2017, Jakub Jelinek wrote:
> 
> > On Thu, Jan 12, 2017 at 01:35:32PM +0100, Richard Biener wrote:
> > > It also lacks expressiveness when comparing
> > > 
> > >   short s;
> > >   s_1 = (short) 1;
> > >   s_2 = short (1);
> > > 
> > > IMHO having a _Literal somewhere makes it more obvious what is meant
> > > (I'm still going to dump 1u or 1l instead of _Literal (unsigned) 1
> > > of course as that's even more easy to recognize).
> > 
> > So, with the _Literal (type) constant syntax, what are we going to emit
> > and parse for __int128 constants?
> > _Literal (__int128) (0x123456ULL << 64 || 0x123456ULL), something else?
> 
> Yes, unless we teach libcpp about larger than 64bit literals.
> 
> The following patch implements _Literal (but not proper dumping of
> > 64bit literals for __int128 nor parsing the above).
> 
> I think that's a good start which also covers pointer types nicely.
> For integers we might at some point want to adopt sN and uN suffixes
> which appearantly MSVC supports to handle native 128bit literals.
> 
> For
> 
> struct X { int a : 2; };
> void __GIMPLE ()
> foo (struct X * p)
> {
>   p->a = _Literal (int : 2) 1;
>   return;
> }
> 
> and thus 2 bit literals c_parser_type_name needs to be extended
> or for _Literal we can handle : 2 on our own.  But for loads
> from p->a into a temporary we need to handle that in declarations as well.
> 
> struct X { int a : 2; };
> int __GIMPLE ()
> foo (struct X * p)
> {
>   int : 2 a; // int a : 2;
>   a = p->a;
>   return a;
> }

I have bootstrapped and tested the patch below and installed it.  We
can add support for parsing vector and complex constants via the
same mechanism so having _Literal ( type ) x looks useful even if
we get another means of handling integer literals in the future.

Richard.

> 2017-01-12  Richard Biener  
> 
>   c/
>   * gimple-parser.c (c_parser_gimple_postfix_expression): Parse
>   _Literal ( type-name ) number.
> 
>   * tree-pretty-print.c (dump_generic_node): Dump INTEGER_CSTs
>   as _Literal ( type ) number in case usual suffixes do not
>   preserve all information.
> 
>   * gcc.dg/gimplefe-22.c: New testcase.
> 
> Index: gcc/c/gimple-parser.c
> ===
> --- gcc/c/gimple-parser.c (revision 244350)
> +++ gcc/c/gimple-parser.c (working copy)
> @@ -799,6 +799,32 @@ c_parser_gimple_postfix_expression (c_pa
>  type, ptr.value, alias_off);
> break;
>   }
> +   else if (strcmp (IDENTIFIER_POINTER (id), "_Literal") == 0)
> + {
> +   /* _Literal '(' type-name ')' number  */
> +   c_parser_consume_token (parser);
> +   tree type = NULL_TREE;
> +   if (c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
> + {
> +   struct c_type_name *type_name = c_parser_type_name (parser);
> +   tree tem;
> +   if (type_name)
> + type = groktypename (type_name, &tem, NULL);
> +   c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +  "expected %<)%>");
> + }
> +   tree val = c_parser_gimple_postfix_expression (parser).value;
> +   if (! type
> +   || ! val
> +   || val == error_mark_node
> +   || TREE_CODE (val) != INTEGER_CST)
> + {
> +   c_parser_error (parser, "invalid _Literal");
> +   return expr;
> + }
> +   expr.value = fold_convert (type, val);
> +   return expr;
> + }
> /* SSA name.  */
> unsigned version, ver_offset;
> if (! lookup_name (id)
> Index: gcc/tree-pretty-print.c
> ===
> --- gcc/tree-pretty-print.c   (revision 244350)
> +++ gcc/tree-pretty-print.c   (working copy)
> @@ -1664,6 +1664,16 @@ dump_generic_node (pretty_printer *pp, t
>break;
>  
>  case INTEGER_CST:
> +  if (flags & TDF_GIMPLE
> +   && (POINTER_TYPE_P (TREE_TYPE (node))
> +   || (TYPE_PRECISION (TREE_TYPE (node))
> +   < TYPE_PRECISION (integer_type_node))
> +   || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
> + {
> +   pp_string (pp, "_Literal (");
> +   dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
> +   pp_string (pp, ") ");
> + }
>if (TREE_CODE (TREE_TYPE (node)) == POINTER_TYPE
> && ! (flags & TDF_GIMPLE))
>   {
> @@ -1693,11 +1703,7 @@ dump_generic_node (pretty_printer *pp, t
>else if (tree_fits_shwi_p (node))
>   pp_wide_integer (pp, tree_to_shwi (node));
>else if (tree_fits_uhwi_p (node))
> - {
> -   pp_unsigned_wide_integer (pp, tree_to_uhwi (node));
> -   if (flags & TDF_GIMPLE)
> - pp_character (pp, 'U')

[PATCH][PR sanitizer/78887] Don't emit ODR indicators if -fsanitize=kernel-address is present.

2017-01-13 Thread Maxim Ostapenko

Hi,

as mentioned in PR, Linux kernel 4.9 fails to build with ASan due to 
wrong handling of emitted ODR indicator symbols. Although this might be 
a kernel bug (relying on specific pattern in symbol name sounds 
questionable), kernel doesn't need ODR indicators at all thus we can 
just disable them if -fsanitize=kernel-address is present.

Tested on x86_64-unknown-linux-gnu, OK for trunk?

-Maxim

gcc/ChangeLog:

2017-01-13  Maxim Ostapenko  

	PR sanitizer/78887
	* asan.c (asan_needs_odr_indicator_p): Don't emit ODR indicators
	if -fsanitize=kernel-address is present.

diff --git a/gcc/asan.c b/gcc/asan.c
index bc7ebc8..157d468 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2360,7 +2360,8 @@ create_odr_indicator (tree decl, tree type)
 static bool
 asan_needs_odr_indicator_p (tree decl)
 {
-  return !DECL_ARTIFICIAL (decl) && !DECL_WEAK (decl) && TREE_PUBLIC (decl);
+  return !(flag_sanitize & SANITIZE_KERNEL_ADDRESS) && !DECL_ARTIFICIAL (decl)
+	 && !DECL_WEAK (decl) && TREE_PUBLIC (decl);
 }
 
 /* Append description of a single global DECL into vector V.


Re: [PATCH] Introduce --with-gcc-major-version-only configure option (take 2)

2017-01-13 Thread Richard Biener
On Thu, Jan 12, 2017 at 9:16 PM, Jakub Jelinek  wrote:
> Hi!
>
> On Tue, Jan 10, 2017 at 07:56:36AM +0100, Jakub Jelinek wrote:
>> As I said on IRC, I think -dumpversion of 7 in this configuration is the
>> right thing to do.  In GCC sources, we have 3 uses of -dumpversion,
>> two of them look like:
>> gcc_version := $(shell $(GOC) -dumpversion)
>> ...
>> toolexeclibgodir = 
>> $(nover_glibgo_toolexeclibdir)/go/$(gcc_version)/$(target_alias)
>> libexecsubdir = $(libexecdir)/gcc/$(target_alias)/$(gcc_version)
>> (in libgo and gotools), one in config/tcl.m4 is like:
>> if test "`gcc -dumpversion | awk -F. '{print 
>> [$]1}'`" -lt "3" ; then
>> AC_MSG_WARN([64bit mode not supported with 
>> GCC < 3.2 on $system])
>> which works well whether it prints 7 or 7.1.1.
>> With --with-gcc-major-version-only, the spec_version is different from
>> BASEVER, and -dumpversion can print just one of those, when they are not the
>> same.  So, we either break users that expect they can do
>> `$CC -dumpmachine`-gcc-`$CC -dumpversion`, or find out the C++ includes by
>> g++ -dumpversion, etc., or we break users that expect 3 numbers separated
>> by dot or 2 numbers separated by dot with optional another one.
>> In the past, we have not always pointed 3 numbers, releases printed just
>> major.minor, like gcc -dumpversion printed 3.0 (as mentioned in the manual).
>> But the former 3.0 in the previous versioning scheme corresponds to just 7
>> in the new one.  So, users that expect 3 numbers are already broken,
>> and just one number is just adjusting those assumptions to the current
>> versioning scheme.  Yes, we can add a new option, but IMNSHO it should
>> be -dumpbaseversion or -dumpfullversion that will always print
>> major.minor.patchlevel.  From the SUSE bugzilla, it looks like SUSE has
>> been shipping compilers that printed just 5 or 6 for almost 2 years now,
>> so hopefully some changes if needed somewhere have been already upstreamed.
>
> Here is updated version of the patch that introduces the -dumpfullversion
> option and documents better -dumpversion as well as -dumpfullversion.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Looks good to me.  Please leave others some time to comment.

Thanks,
Richard.

> 2017-01-12  Jakub Jelinek  
>
> PR other/79046
> * configure: Regenerated.
> config/
> * acx.m4 (GCC_BASE_VER): New m4 function.
> (ACX_TOOL_DIRS): Require GCC_BASE_VER, for
> --with-gcc-major-version-only use just major number from BASE-VER.
> gcc/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.in (version): Use @get_gcc_base_ver@ instead of cat to get
> version from BASE-VER file.
> (CFLAGS-gcc.o): Add -DBASEVER=$(BASEVER_s).
> (gcc.o): Depend on $(BASEVER).
> * common.opt (dumpfullversion): New option.
> * gcc.c (driver_handle_option): Handle OPT_dumpfullversion.
> * doc/invoke.texi: Document -dumpfullversion.
> * doc/install.texi: Document --with-gcc-major-version-only.
> * configure: Regenerated.
> libatomic/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> * testsuite/Makefile.in: Regenerated.
> * configure: Regenerated.
> * Makefile.in: Regenerated.
> libgomp/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> * testsuite/Makefile.in: Regenerated.
> * configure: Regenerated.
> * Makefile.in: Regenerated.
> libgcc/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.in (version): Use @get_gcc_base_ver@ instead of cat to get
> version from BASE-VER file.
> * configure: Regenerated.
> libssp/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> * configure: Regenerated.
> * Makefile.in: Regenerated.
> liboffloadmic/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> * aclocal.m4: Include ../config/acx.m4.
> * configure: Regenerated.
> * Makefile.in: Regenerated.
> libquadmath/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> * configure: Regenerated.
> * Makefile.in: Regenerated.
> libmpx/
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> * configure: Regenerated.
> * 

Re: [PATCH][GIMPLE FE] Add parsing of MEM_REFs

2017-01-13 Thread Jakub Jelinek
On Fri, Jan 13, 2017 at 09:15:22AM +0100, Richard Biener wrote:
> > @@ -1710,6 +1716,24 @@ dump_generic_node (pretty_printer *pp, t
> >   print_hex (val, pp_buffer (pp)->digit_buffer);
> >   pp_string (pp, pp_buffer (pp)->digit_buffer);
> > }
> > +  if ((flags & TDF_GIMPLE)
> > + && (POINTER_TYPE_P (TREE_TYPE (node))
> > + || (TYPE_PRECISION (TREE_TYPE (node))
> > + < TYPE_PRECISION (integer_type_node))
> > + || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
> > +   {
> > + if (TYPE_UNSIGNED (TREE_TYPE (node)))
> > +   pp_character (pp, 'u');
> > + if (TYPE_PRECISION (TREE_TYPE (node))
> > + == TYPE_PRECISION (unsigned_type_node))
> > +   ;
> > + else if (TYPE_PRECISION (TREE_TYPE (node))
> > +  == TYPE_PRECISION (long_unsigned_type_node))
> > +   pp_character (pp, 'l');
> > + else if (TYPE_PRECISION (TREE_TYPE (node))
> > +  == TYPE_PRECISION (long_long_unsigned_type_node))
> > +   pp_string (pp, "ll");
> > +   }

Not sure if I understand this.  The outer if condition says that only the
sub-int or strange precision or pointer types do that, but then
you compare the precisions against long and long long.  That will be
true only for pointers.  Don't you want the u/l/ll/ul/ull suffixes emitted
for integers even when they have normal precision and _Literal is not used?
Or is that handled somewhere else?

Jakub


Re: [PATCH][PR sanitizer/78887] Don't emit ODR indicators if -fsanitize=kernel-address is present.

2017-01-13 Thread Jakub Jelinek
On Fri, Jan 13, 2017 at 11:19:19AM +0300, Maxim Ostapenko wrote:
> as mentioned in PR, Linux kernel 4.9 fails to build with ASan due to wrong
> handling of emitted ODR indicator symbols. Although this might be a kernel
> bug (relying on specific pattern in symbol name sounds questionable), kernel
> doesn't need ODR indicators at all thus we can just disable them if
> -fsanitize=kernel-address is present.
> Tested on x86_64-unknown-linux-gnu, OK for trunk?

> gcc/ChangeLog:
> 
> 2017-01-13  Maxim Ostapenko  
> 
>   PR sanitizer/78887
>   * asan.c (asan_needs_odr_indicator_p): Don't emit ODR indicators
>   if -fsanitize=kernel-address is present.
> 
> diff --git a/gcc/asan.c b/gcc/asan.c
> index bc7ebc8..157d468 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -2360,7 +2360,8 @@ create_odr_indicator (tree decl, tree type)
>  static bool
>  asan_needs_odr_indicator_p (tree decl)
>  {
> -  return !DECL_ARTIFICIAL (decl) && !DECL_WEAK (decl) && TREE_PUBLIC (decl);
> +  return !(flag_sanitize & SANITIZE_KERNEL_ADDRESS) && !DECL_ARTIFICIAL 
> (decl)
> +  && !DECL_WEAK (decl) && TREE_PUBLIC (decl);

As the condition is longer than a line, please use
  return (!(flag_sanitize & SANITIZE_KERNEL_ADDRESS)
  && !DECL_ARTIFICIAL (decl)
  && !DECL_WEAK (decl)
  && TREE_PUBLIC (decl));
instead (i.e. one sub-condition per line, and ()s around the whole
condition.  Perhaps a short comment why we don't emit those for
-fsanitize=kernel-address would be useful too.

Ok for trunk with those changes.

Jakub


Re: [PATCH][GIMPLE FE] Add parsing of MEM_REFs

2017-01-13 Thread Richard Biener
On Fri, 13 Jan 2017, Jakub Jelinek wrote:

> On Fri, Jan 13, 2017 at 09:15:22AM +0100, Richard Biener wrote:
> > > @@ -1710,6 +1716,24 @@ dump_generic_node (pretty_printer *pp, t
> > > print_hex (val, pp_buffer (pp)->digit_buffer);
> > > pp_string (pp, pp_buffer (pp)->digit_buffer);
> > >   }
> > > +  if ((flags & TDF_GIMPLE)
> > > +   && (POINTER_TYPE_P (TREE_TYPE (node))
> > > +   || (TYPE_PRECISION (TREE_TYPE (node))
> > > +   < TYPE_PRECISION (integer_type_node))
> > > +   || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
> > > + {
> > > +   if (TYPE_UNSIGNED (TREE_TYPE (node)))
> > > + pp_character (pp, 'u');
> > > +   if (TYPE_PRECISION (TREE_TYPE (node))
> > > +   == TYPE_PRECISION (unsigned_type_node))
> > > + ;
> > > +   else if (TYPE_PRECISION (TREE_TYPE (node))
> > > +== TYPE_PRECISION (long_unsigned_type_node))
> > > + pp_character (pp, 'l');
> > > +   else if (TYPE_PRECISION (TREE_TYPE (node))
> > > +== TYPE_PRECISION (long_long_unsigned_type_node))
> > > + pp_string (pp, "ll");
> > > + }
> 
> Not sure if I understand this.  The outer if condition says that only the
> sub-int or strange precision or pointer types do that, but then
> you compare the precisions against long and long long.  That will be
> true only for pointers.  Don't you want the u/l/ll/ul/ull suffixes emitted
> for integers even when they have normal precision and _Literal is not used?
> Or is that handled somewhere else?

Oops, you are right - it shouldn't be the same condition that controls
whether to add _Literal (type).  It should be the inverse.  Maybe
we need to add suffixes anyway even for say 61bit precision constants
to avoid warnings from libcpp.

I'll fix it up after testing a few cases.

Richard.


Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Ville Voutilainen
On 13 January 2017 at 10:09, Ville Voutilainen
 wrote:
>>> Ah, I think I see what you're saying. Just splice them back in any
>>> order. Ok, I'll give that a spin.
>>
>> Right, list::sort doesn't promise strong exception safety, so
>> "unsorting" is not needed.
>>
>> Also, shouldn't merge() rethrow the caught exception rather than swallow it?
>
> Ha, yes, well spotted. I'll cook up an improved patch.

Thus:

2017-01-13  Ville Voutilainen  

PR libstdc++/78389
* include/bits/list.tcc (merge(list&&)):
Adjust list sizes if the comparator throws.
(merge(list&&, _StrictWeakOrdering)): Likewise.
(sort()): Splice elements back from the scratch buffers
if the comparator throws.
(sort(_StrictWeakOrdering)): Likewise.
* testsuite/23_containers/list/operations/78389.cc: New.
diff --git a/libstdc++-v3/include/bits/list.tcc 
b/libstdc++-v3/include/bits/list.tcc
index c4f397f..46bb9d2 100644
--- a/libstdc++-v3/include/bits/list.tcc
+++ b/libstdc++-v3/include/bits/list.tcc
@@ -386,20 +386,30 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  iterator __last1 = end();
  iterator __first2 = __x.begin();
  iterator __last2 = __x.end();
- while (__first1 != __last1 && __first2 != __last2)
-   if (*__first2 < *__first1)
- {
-   iterator __next = __first2;
-   _M_transfer(__first1, __first2, ++__next);
-   __first2 = __next;
- }
-   else
- ++__first1;
- if (__first2 != __last2)
-   _M_transfer(__last1, __first2, __last2);
+ size_t __orig_size = __x.size();
+ __try {
+   while (__first1 != __last1 && __first2 != __last2)
+ if (*__first2 < *__first1)
+   {
+ iterator __next = __first2;
+ _M_transfer(__first1, __first2, ++__next);
+ __first2 = __next;
+   }
+ else
+   ++__first1;
+   if (__first2 != __last2)
+ _M_transfer(__last1, __first2, __last2);
 
- this->_M_inc_size(__x._M_get_size());
- __x._M_set_size(0);
+   this->_M_inc_size(__x._M_get_size());
+   __x._M_set_size(0);
+ }
+ __catch(...)
+   {
+ size_t __dist = distance(__first2, __last2);
+ this->_M_inc_size(__dist);
+ __x._M_set_size(__orig_size - __dist);
+ __throw_exception_again;
+   }
}
 }
 
@@ -423,20 +433,31 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
iterator __last1 = end();
iterator __first2 = __x.begin();
iterator __last2 = __x.end();
-   while (__first1 != __last1 && __first2 != __last2)
- if (__comp(*__first2, *__first1))
-   {
- iterator __next = __first2;
- _M_transfer(__first1, __first2, ++__next);
- __first2 = __next;
-   }
- else
-   ++__first1;
-   if (__first2 != __last2)
- _M_transfer(__last1, __first2, __last2);
-
-   this->_M_inc_size(__x._M_get_size());
-   __x._M_set_size(0);
+   size_t __orig_size = __x.size();
+   __try
+ {
+   while (__first1 != __last1 && __first2 != __last2)
+ if (__comp(*__first2, *__first1))
+   {
+ iterator __next = __first2;
+ _M_transfer(__first1, __first2, ++__next);
+ __first2 = __next;
+   }
+ else
+   ++__first1;
+   if (__first2 != __last2)
+ _M_transfer(__last1, __first2, __last2);
+
+   this->_M_inc_size(__x._M_get_size());
+   __x._M_set_size(0);
+ }
+   __catch(...)
+ {
+   size_t __dist = distance(__first2, __last2);
+   this->_M_inc_size(__dist);
+   __x._M_set_size(__orig_size - __dist);
+   __throw_exception_again;
+ }
  }
   }
 
@@ -453,27 +474,35 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 list __tmp[64];
 list * __fill = __tmp;
 list * __counter;
-
-do
+   __try
  {
-   __carry.splice(__carry.begin(), *this, begin());
-
-   for(__counter = __tmp;
-   __counter != __fill && !__counter->empty();
-   ++__counter)
+   do
  {
-   __counter->merge(__carry);
+   __carry.splice(__carry.begin(), *this, begin());
+   
+   for(__counter = __tmp;
+   __counter != __fill && !__counter->empty();
+   ++__counter)
+ {
+   __counter->merge(__carry);
+   __carry.swap(*__counter);
+ }
__carry.swap(*__counter);
+   if (__cou

Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Ville Voutilainen
On 13 January 2017 at 10:27, Ville Voutilainen
 wrote:
> On 13 January 2017 at 10:09, Ville Voutilainen
>  wrote:
 Ah, I think I see what you're saying. Just splice them back in any
 order. Ok, I'll give that a spin.
>>>
>>> Right, list::sort doesn't promise strong exception safety, so
>>> "unsorting" is not needed.
>>>
>>> Also, shouldn't merge() rethrow the caught exception rather than swallow it?
>>
>> Ha, yes, well spotted. I'll cook up an improved patch.
>
> Thus:
>
> 2017-01-13  Ville Voutilainen  
>
> PR libstdc++/78389
> * include/bits/list.tcc (merge(list&&)):
> Adjust list sizes if the comparator throws.
> (merge(list&&, _StrictWeakOrdering)): Likewise.
> (sort()): Splice elements back from the scratch buffers
> if the comparator throws.
> (sort(_StrictWeakOrdering)): Likewise.
> * testsuite/23_containers/list/operations/78389.cc: New.


..and yes, sigh, that patch has whitespace damage in it. I have
already fixed that, so that'll be corrected before
committing.


Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Ville Voutilainen
On 13 January 2017 at 10:29, Ville Voutilainen
 wrote:
> ..and yes, sigh, that patch has whitespace damage in it. I have
> already fixed that, so that'll be corrected before
> committing.

..as well as replacing those asserts with VERIFY.


Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Tim Song
On Fri, Jan 13, 2017 at 3:27 AM, Ville Voutilainen
 wrote:
> On 13 January 2017 at 10:09, Ville Voutilainen
>  wrote:
 Ah, I think I see what you're saying. Just splice them back in any
 order. Ok, I'll give that a spin.
>>>
>>> Right, list::sort doesn't promise strong exception safety, so
>>> "unsorting" is not needed.
>>>
>>> Also, shouldn't merge() rethrow the caught exception rather than swallow it?
>>
>> Ha, yes, well spotted. I'll cook up an improved patch.
>
> Thus:
>
> 2017-01-13  Ville Voutilainen  
>
> PR libstdc++/78389
> * include/bits/list.tcc (merge(list&&)):
> Adjust list sizes if the comparator throws.
> (merge(list&&, _StrictWeakOrdering)): Likewise.
> (sort()): Splice elements back from the scratch buffers
> if the comparator throws.
> (sort(_StrictWeakOrdering)): Likewise.
> * testsuite/23_containers/list/operations/78389.cc: New.

Shouldn't __carry be spliced back as well?


Re: [RFA][PATCH 1a/4] [PR tree-optimization/33562] New sbitmap functions

2017-01-13 Thread Richard Biener
On Fri, Jan 13, 2017 at 4:36 AM, Jeff Law  wrote:
> Per Richi's request, this breaks out the new sbitmap functions into their
> own patch.
>
> This patch should address all the concerns Richi raised in the prior review.
> Namely the performance of the range set/clear and bootstrapping with older
> or non-GCC C++ compilers and added missing docs.
>
> The range set/clear functions work by first handling any partial changes in
> the first word by building and applying a single bitmask.  Then all full
> word changes are handled by a call to memset, then finally residuals in the
> last word with another bitmask application.
>
> Internally I tested these by having both implementations and verifying they
> gave identical results during a bootstrap and regression test cycle (with
> the subsequent DSE patches that utilize the new functions).
>
> I also tested the old gcc/non-gcc path by forcing it to be used rather than
> the builtin popcount paths.  Again, this was bootstrapped and regression
> tested against the full DSE patches.
>
> After those tests were complete, I ripped out the debugging/verification
> bits and did another bootstrap and regression test of just the sbitmap
> changes as well as a bootstrap and regression test of each stage in the
> patchkit.
>
> OK for the trunk?

Ok.

Thanks,
Richard.

> Jeff
>
>
>
> PR tree-optimization/33562
> * sbitmap.h (bitmap_count_bits): Prototype.
> (bitmap_clear_range, bitmap_set_range): Likewise.
> * sbitmap.c (bitmap_clear_range): New function.
> (bitmap_set_range, sbitmap_popcount, bitmap_count_bits): Likewise.
>
> diff --git a/gcc/sbitmap.c b/gcc/sbitmap.c
> index c9bcea9..c065d13 100644
> --- a/gcc/sbitmap.c
> +++ b/gcc/sbitmap.c
> @@ -202,6 +202,170 @@ bitmap_empty_p (const_sbitmap bmap)
>return true;
>  }
>
> +/* Clear COUNT bits from START in BMAP.  */
> +
> +void
> +bitmap_clear_range (sbitmap bmap, unsigned int start, unsigned int count)
> +{
> +  if (count == 0)
> +return;
> +
> +  unsigned int start_word = start / SBITMAP_ELT_BITS;
> +  unsigned int start_bitno = start % SBITMAP_ELT_BITS;
> +
> +  /* Clearing less than a full word, starting at the beginning of a word.
> */
> +  if (start_bitno == 0 && count < SBITMAP_ELT_BITS)
> +{
> +  SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << count) - 1;
> +  bmap->elms[start_word] &= ~mask;
> +  return;
> +}
> +
> +  unsigned int end_word = (start + count) / SBITMAP_ELT_BITS;
> +  unsigned int end_bitno = (start + count) % SBITMAP_ELT_BITS;
> +
> +  /* Clearing starts somewhere in the middle of the first word.  Clear up
> to
> + the end of the first word or the end of the requested region,
> whichever
> + comes first.  */
> +  if (start_bitno != 0)
> +{
> +  unsigned int nbits = ((start_word == end_word)
> +   ? end_bitno - start_bitno
> +   : SBITMAP_ELT_BITS - start_bitno);
> +  SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << nbits) - 1;
> +  mask <<= start_bitno;
> +  bmap->elms[start_word] &= ~mask;
> +  start_word++;
> +  count -= nbits;
> +}
> +
> +  if (count == 0)
> +return;
> +
> +  /* Now clear words at a time until we hit a partial word.  */
> +  unsigned int nwords = (end_word - start_word);
> +  if (nwords)
> +{
> +  memset (&bmap->elms[start_word], 0, nwords * sizeof
> (SBITMAP_ELT_TYPE));
> +  count -= nwords * sizeof (SBITMAP_ELT_TYPE) * BITS_PER_UNIT;
> +  start_word += nwords;
> +}
> +
> +  if (count == 0)
> +return;
> +
> +  /* Now handle residuals in the last word.  */
> +  SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << count) - 1;
> +  bmap->elms[start_word] &= ~mask;
> +}
> +
> +/* Set COUNT bits from START in BMAP.  */
> +void
> +bitmap_set_range (sbitmap bmap, unsigned int start, unsigned int count)
> +{
> +  if (count == 0)
> +return;
> +
> +  unsigned int start_word = start / SBITMAP_ELT_BITS;
> +  unsigned int start_bitno = start % SBITMAP_ELT_BITS;
> +
> +  /* Setting less than a full word, starting at the beginning of a word.
> */
> +  if (start_bitno == 0 && count < SBITMAP_ELT_BITS)
> +{
> +  SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << count) - 1;
> +  bmap->elms[start_word] |= mask;
> +  return;
> +}
> +
> +  unsigned int end_word = (start + count) / SBITMAP_ELT_BITS;
> +  unsigned int end_bitno = (start + count) % SBITMAP_ELT_BITS;
> +
> +  /* Setting starts somewhere in the middle of the first word.  Set up to
> + the end of the first word or the end of the requested region,
> whichever
> + comes first.  */
> +  if (start_bitno != 0)
> +{
> +  unsigned int nbits = ((start_word == end_word)
> +   ? end_bitno - start_bitno
> +   : SBITMAP_ELT_BITS - start_bitno);
> +  SBITMAP_ELT_TYPE mask = ((SBITMAP_ELT_TYPE)1 << nbits) - 1;
> +  mask <<= start_bitno;
> +  bmap->elms[start_word] |= mask;
> + 

Re: [PATCH] Fix PR77283

2017-01-13 Thread Richard Biener
On Thu, 12 Jan 2017, Jeff Law wrote:

> On 01/12/2017 07:55 AM, Richard Biener wrote:
> > 
> > The following fixes PR77283, path splitting being overly aggressive
> > and causing loop unrolling not to happen (because how it distorts the
> > CFG).
> > 
> > It is a aim at creating a cost model (there's none apart from
> > not duplicating too much stmts) by means of the observation that
> > we'd have to have PHI nodes in the joiner to have any possibility
> > of CSE opportunities being exposed by duplicating it or threading
> > opportunities being exposed across the new latch.  That includes
> > virtual PHIs for CSE (so any load/store) but not for the threading
> > (but IMHO threading should figure all this out on its own without
> > the requirement for somebody else duplicating the joiner).
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, the s390x
> > libquantum regression is reportedly fixed by this.  I had to adjust
> > gcc.dg/tree-ssa/split-path-7.c to not expect any path splitting because
> > I (and of course the cost model) can't see any reason to do it.
> > 
> > Ok for trunk?
> > 
> > Thanks,
> > Richard.
> > 
> > 2017-01-12  Richard Biener  
> > 
> > PR tree-optimization/77283
> > * gimple-ssa-split-paths.c: Include gimple-ssa.h, tree-phinodes.h
> > and ssa-iterators.h.
> > (is_feasible_trace): Implement a cost model based on joiner
> > PHI node uses.
> > 
> > * gcc.dg/tree-ssa/split-path-7.c: Adjust.
> > * gcc.dg/tree-ssa/split-path-8.c: New testcase.
> > * gcc.dg/tree-ssa/split-path-9.c: Likewise.
> So I think the only concern is split-path-7.  My memory is hazy, but I suspect
> split-path-7 shows the case where path splitting's CFG manipulations can
> result in fewer jumps for diamond sub-graphs.  You might see assembly code
> improvements due to path splitting on this test for the microblaze port.
> 
> Certainly the code in gimple-ssa-split-paths.c that you're adding is an
> improvement and brings gimple path splitting closer to its intended purpose.
> I don't think regressing split-path-7 should block this improvement, but we
> would want a PR to track the code quality regression.
> 
> So I think it's OK for the trunk and if it shows a code quality regression for
> split-path-7 on the microblaze port that we should have a distinct PR to track
> that issue (which is probably best solved in bb-reorder).

Committed.  With the wrong variant of the -9 test, fixed as below.

Richard.

2017-01-13  Richard Biener  

PR tree-optimization/77283
* gcc.dg/tree-ssa/split-path-9.c: Fix.

Index: gcc/testsuite/gcc.dg/tree-ssa/split-path-9.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/split-path-9.c(revision 244393)
+++ gcc/testsuite/gcc.dg/tree-ssa/split-path-9.c(working copy)
@@ -9,8 +9,8 @@ foo(unsigned int size, unsigned int *sta
 
   for(i = 0; i < size; i++)
 {
-  if(*state & 1)
-   *state ^= 1;
+  if(state[i] & 1)
+   state[i] ^= 1;
 }
 }
 


Re: [PATCH][ARM]Use different startfile and endfile for elf target when generating shared object.

2017-01-13 Thread Christophe Lyon
Hi Renlin,


On 12 January 2017 at 16:50, Renlin Li  wrote:
> Hi Kugan,
>
> some of the targets do include pie, and use the same crtbegin file as shared
> object.
> For example, alpha/elf.h
>
> And there are targets which don't do that,
> For example, sh/elf.h
>
> Most of the elf target seem only consider the simple case.
>
> The purpose of this patch is to make it possible to correctly check whether
> current toolchain supports shared object.
>
> Current dejegnu target selector "shared" tries to compile a simple source
> code to with "-shared -fpic" options to check whether "-shared" is
> supported.
>
> For arm baremetal targets with, this is not sufficient.
>
> arm-none-eabi is built with multilib. When running this testcase, if it's
> compiled with "-march=armv7-a".
> The crtbegin.o for this architecture version contains relocations which
> cannot be used in shared object.
> This test will fail.
>
> if no cpu or architecture is specified, the default cpu will be arm7tdmi.
> The test will pass as crtbegin.o for this version doesn't contains any
> relocations
> only allowed in shared object.
>
> To make this "shared" target selector work for arm baremetal toolchain. The
> correct way is to use different startup file for shared and non-shared
> toolchain.
>
> If the toolchain doesn't support "-shared" option, and someone attempts to
> use it
> to create shared object, it will complaint that "crtbeginS.o" cannot not be
> found.
>

I have run validations with your patch, and noticed regressions on arm-none-eabi
when using default cpu or --with-cpu=cortex-m3:

  - PASS now FAIL [PASS => FAIL]:

  gcc.dg/lto/pr54709 c_lto_pr54709_0.o-c_lto_pr54709_1.o link,  -fPIC
-fvisibility=hidden -flto
  gcc.dg/lto/pr61526 c_lto_pr61526_0.o-c_lto_pr61526_1.o link,  -fPIC
-flto -flto-partition=1to1
  gcc.dg/lto/pr64415 c_lto_pr64415_0.o-c_lto_pr64415_1.o link,  -O -flto -fpic

on the same configurations, I've noticed these improvements:

  g++.dg/ipa/devirt-28a.C  -std=gnu++11 (test for excess errors)
  g++.dg/ipa/devirt-28a.C  -std=gnu++14 (test for excess errors)
  g++.dg/ipa/devirt-28a.C  -std=gnu++98 (test for excess errors)
are now unsupported rather than fail.

Why is it different when the toolchain is configured --with-cpu=cortex-a9
for instance? Are the tests involving -shared already skipped in this case?

> A full history of discussion is here:
> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00322.html
>
>
> Regards,
> Renlin
>
>
>
> On 12/01/17 11:47, kugan wrote:
>>
>> Hi,
>>
>> On 16/06/16 21:04, Renlin Li wrote:
>>>
>>>  /* Now we define the strings used to build the spec file.  */
>>> -#define UNKNOWN_ELF_STARTFILE_SPEC" crti%O%s crtbegin%O%s crt0%O%s"
>>> +#define UNKNOWN_ELF_STARTFILE_SPEC\
>>> +  "crti%O%s \
>>> +  %{!shared:crtbegin%O%s} %{shared:crtbeginS%O%s} \
>>> +  crt0%O%s"
>>
>>
>> Some targets seems to use shared|pie. When you change it, shouldn't it
>> also include for pie?
>>
>> Thanks,
>> Kugan


Re: [PATCH][PR sanitizer/78887] Don't emit ODR indicators if -fsanitize=kernel-address is present.

2017-01-13 Thread Maxim Ostapenko

On 13/01/17 11:24, Jakub Jelinek wrote:

On Fri, Jan 13, 2017 at 11:19:19AM +0300, Maxim Ostapenko wrote:

as mentioned in PR, Linux kernel 4.9 fails to build with ASan due to wrong
handling of emitted ODR indicator symbols. Although this might be a kernel
bug (relying on specific pattern in symbol name sounds questionable), kernel
doesn't need ODR indicators at all thus we can just disable them if
-fsanitize=kernel-address is present.
Tested on x86_64-unknown-linux-gnu, OK for trunk?
gcc/ChangeLog:

2017-01-13  Maxim Ostapenko  

PR sanitizer/78887
* asan.c (asan_needs_odr_indicator_p): Don't emit ODR indicators
if -fsanitize=kernel-address is present.

diff --git a/gcc/asan.c b/gcc/asan.c
index bc7ebc8..157d468 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2360,7 +2360,8 @@ create_odr_indicator (tree decl, tree type)
  static bool
  asan_needs_odr_indicator_p (tree decl)
  {
-  return !DECL_ARTIFICIAL (decl) && !DECL_WEAK (decl) && TREE_PUBLIC (decl);
+  return !(flag_sanitize & SANITIZE_KERNEL_ADDRESS) && !DECL_ARTIFICIAL (decl)
+&& !DECL_WEAK (decl) && TREE_PUBLIC (decl);

As the condition is longer than a line, please use
   return (!(flag_sanitize & SANITIZE_KERNEL_ADDRESS)
  && !DECL_ARTIFICIAL (decl)
  && !DECL_WEAK (decl)
  && TREE_PUBLIC (decl));
instead (i.e. one sub-condition per line, and ()s around the whole
condition.  Perhaps a short comment why we don't emit those for
-fsanitize=kernel-address would be useful too.

Ok for trunk with those changes.


OK, thanks, I'm going to apply following patch.

-Maxim



Jakub




gcc/ChangeLog:

2017-01-13  Maxim Ostapenko  

	PR sanitizer/78887
	* asan.c (asan_needs_odr_indicator_p): Don't emit ODR indicators
	if -fsanitize=kernel-address is present.

diff --git a/gcc/asan.c b/gcc/asan.c
index bc7ebc8..e3bf16d 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2360,7 +2360,16 @@ create_odr_indicator (tree decl, tree type)
 static bool
 asan_needs_odr_indicator_p (tree decl)
 {
-  return !DECL_ARTIFICIAL (decl) && !DECL_WEAK (decl) && TREE_PUBLIC (decl);
+  /* Don't emit ODR indicators for kernel because:
+ a) Kernel is written in C thus doesn't need ODR indicators.
+ b) Some kernel code may have assumptions about symbols containing specific
+patterns in their names.  Since ODR indicators contain original names
+of symbols they are emmitted for, these assumptions would be broken for
+ODR indicator symbols.  */
+  return (!(flag_sanitize & SANITIZE_KERNEL_ADDRESS)
+	  && !DECL_ARTIFICIAL (decl)
+	  && !DECL_WEAK (decl)
+	  && TREE_PUBLIC (decl));
 }
 
 /* Append description of a single global DECL into vector V.


Re: [PATCH][PR sanitizer/78887] Don't emit ODR indicators if -fsanitize=kernel-address is present.

2017-01-13 Thread Jakub Jelinek
On Fri, Jan 13, 2017 at 12:01:54PM +0300, Maxim Ostapenko wrote:
> +of symbols they are emmitted for, these assumptions would be broken 
> for

emitted rather than emmitted.

> +ODR indicator symbols.  */
> +  return (!(flag_sanitize & SANITIZE_KERNEL_ADDRESS)
> +   && !DECL_ARTIFICIAL (decl)
> +   && !DECL_WEAK (decl)
> +   && TREE_PUBLIC (decl));
>  }
>  
>  /* Append description of a single global DECL into vector V.


Jakub


[PATCH] Fix gcc.dg/tree-ssa/scev-[345].c testcases

2017-01-13 Thread Richard Biener

The following is an attempt to change those testcases to be less dependent
on previous passes.  The original motivation of the testcases seems to be
testing SCEV capabilities and in turn IVOPTs decisions, thus the testcases
are changed to check the IVO dump, use the GIMPLE FE feeding the loop
pipeline directly and skip lowering/store-motion we meanwhile do to
the testcase.

To avoid some existing issue with CFG construction after GIMPLE parsing
we need to be able to add GIMPLE_NOPs which the patch enables to generate
from empty stmts (previously those resulted in parse errors).

Tested the testcases on x86_64 with {,-m32} sofar I'll appreciate
testing on more targets.

Full bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-01-13  Richard Biener  

PR testsuite/52563
PR testsuite/71237
PR testsuite/77737
c/
* gimple-parser.c (c_parser_gimple_compound_statement): Handle
nops.

* gcc.dg/tree-ssa/scev-3.c: Re-write to a GIMPLE testcase for IVOPTs.
* gcc.dg/tree-ssa/scev-4.c: Likewise.
* gcc.dg/tree-ssa/scev-5.c: Likewise.

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 244393)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -211,6 +211,17 @@ c_parser_gimple_compound_statement (c_pa
}
  goto expr_stmt;
 
+   case CPP_SEMICOLON:
+ {
+   /* Empty stmt.  */
+   location_t loc = c_parser_peek_token (parser)->location;
+   c_parser_consume_token (parser);
+   gimple *nop = gimple_build_nop ();
+   gimple_set_location (nop, loc);
+   gimple_seq_add_stmt (seq, nop);
+   break;
+ }
+
default:
 expr_stmt:
  c_parser_gimple_statement (parser, seq);
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/scev-3.c  (revision 244393)
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-3.c  (working copy)
@@ -1,18 +1,43 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fgimple -fdump-tree-ivopts" } */
 
 int *a_p;
 int a[1000];
 
-void
-f(int k)
+void __GIMPLE (startwith ("loop"))
+f (int k)
 {
-   int i;
+  int i;
+  int * _1;
+
+bb_2:
+  i_5 = k_4(D);
+  if (i_5 <= 999)
+goto bb_4;
+  else
+goto bb_3;
+
+bb_3:
+  return;
+
+bb_4:
+  ;
+
+bb_5:
+  i_12 = __PHI (bb_6: i_9, bb_4: i_5);
+  _1 = &a[i_12];
+  a_p = _1;
+  __MEM  ((int *)&a)[i_12] = 100;
+  i_9 = i_5 + i_12;
+  if (i_9 <= 999)
+goto bb_6;
+  else
+goto bb_3;
+
+bb_6:
+  ;
+  goto bb_5;
 
-   for (i=k; i<1000; i+=k) {
-   a_p = &a[i];
-   *a_p = 100;
-}
 }
 
-/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&a" 1 "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/scev-4.c  (revision 244393)
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-4.c  (working copy)
@@ -1,23 +1,48 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fgimple -fdump-tree-ivopts" } */
 
 typedef struct {
-   int x;
-   int y;
+int x;
+int y;
 } S;
 
 int *a_p;
 S a[1000];
 
-void
-f(int k)
+void __GIMPLE (startwith ("loop"))
+f (int k)
 {
-   int i;
+  int i;
+  int * _1;
+
+bb_2:
+  i_5 = k_4(D);
+  if (i_5 <= 999)
+goto bb_4;
+  else
+goto bb_3;
+
+bb_3:
+  return;
+
+bb_4:
+  ;
+
+bb_5:
+  i_12 = __PHI (bb_6: i_9, bb_4: i_5);
+  _1 = &a[i_12].y;
+  a_p = _1;
+  __MEM  ((int *)&a)[i_12].y = 100;
+  i_9 = i_5 + i_12;
+  if (i_9 <= 999)
+goto bb_6;
+  else
+goto bb_3;
+
+bb_6:
+  ;
+  goto bb_5;
 
-   for (i=k; i<1000; i+=k) {
-   a_p = &a[i].y;
-   *a_p = 100;
-}
 }
 
-/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&a" 1 "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-5.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/scev-5.c  (revision 244393)
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-5.c  (working copy)
@@ -1,18 +1,43 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fgimple -fdump-tree-ivopts" } */
 
 int *a_p;
 int a[1000];
 
-void
-f(int k)
+void __GIMPLE (startwith ("loop"))
+f (int k)
 {
-long long i;
+  long long int i;
+  int * _1;
+
+bb_2:
+  i_5 = (long long int) k_4(D);
+  if (i_5 <= 999ll)
+goto bb_4;
+  else
+goto bb_3;
+
+bb_3:
+  return;
+
+bb_4:
+  ;
+
+bb_5:
+  i_12 = __PHI (bb_6: i_9, bb_4: i_5);
+  _1 = &a[i_12];
+  a_p = _1;
+  __MEM  ((int *)&a)[i_12] = 100;
+  i_9 = i_5 + i_12;
+  if (i_9 <= 999ll)
+goto bb_6;

[Ada] Static predicates on strings

2017-01-13 Thread Arnaud Charlet
RM 3.2.4 stipulates that comparison operators on strings are legal in the
expression for a Static_Predicate aspect of a string type. The implementation
of this capability was deferred because it conflicts with the definition of
static expression (RM 4.9) which excludes string comparisons from staticness.
This inconsistency will eventually be resolved by the ARG, but it is worth
implementing the wider scope of static predicates to include string comparison.

Executing:

   gnatmake -q -gnatws -gnata main
   main

must yield:

   Some_String OK
   Early_String OK
   Middle_String  OK
   Late_String  OK

---
with Text_IO; use Text_IO;
with support; use support;
procedure main is
  Maybe : Boolean := String'("ABC") < "CDE";
begin
  begin
 declare
Wrong : constant some_String := "abcdefg";
 begin
null;
 end;
  exception
 when others => Put_Line ("Some_String OK");
  end;

  begin
 declare
Wrong : Early_String := "ebcdefg";
 begin
null;
 end;
  exception
 when others => Put_Line ("Early_String OK");
  end;
  begin
 declare
Wrong : Middle_String := "abcdefg";
 begin
null;
 end;
  exception
 when others => Put_Line ("Middle_String  OK");
  end;
  begin
 declare
Wrong : Late_String := "abcdefg";
 begin
null;
 end;
  exception
 when others => Put_Line ("Late_String  OK");
  end;
end;
---
package Support is
   subtype My_String is String (1 .. 7);

   subtype My_Special_String is My_String with
 Static_Predicate => My_Special_String = "aaa";

   subtype My_short_String is My_String with
 Static_Predicate => My_short_String'length > 6;

   subtype Early_String is My_String with
 Static_Predicate => Early_String < "ddd";

   subtype Late_String is My_String with
 Static_Predicate => "ddd" < Late_String;

   subtype Middle_String is MY_String with
  Static_Predicate => Middle_String >= "aaa"
 and then "ggg" < Middle_String;

   subtype Some_String is My_String with
  Static_Predicate => Some_String in "aaa" | "zzz";
end Support;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-01-13  Ed Schonberg  

* sem_ch13.adb (Is_Predicate_Static): Following the intent of the RM,
treat comparisons on strings as legal in a Static_Predicate.
(Is_Predicate_Static, Is_Type_Ref): Predicate also returns true on
a function call that is the expansion of a string comparison.The
function call is built when compiling the corresponding predicate
function, but the expression has been found legal as a static
predicate during earlier analysis.
* sem_eval.adb (Real_Or_String_Static_Predicate_Matches): Handle
properly a function call that is the expansion of a string
comparison operation, in order to recover the Static_Predicate
expression and apply it to a static argument when needed.

Index: sem_eval.adb
===
--- sem_eval.adb(revision 244369)
+++ sem_eval.adb(working copy)
@@ -5469,6 +5469,40 @@
return Skip;
 end;
 
+ --  The predicate function may contain string-comparison operations
+ --  that have been converted into calls to run-time array-comparison
+ --  routines. To evaluate the predicate statically, we recover the
+ --  original comparison operation and replace the occurrence of the
+ --  formal by the static string value. The actuals of the generated
+ --  call are of the form X'Address.
+
+ elsif Nkind (N) in N_Op_Compare
+   and then Nkind (Left_Opnd (N)) = N_Function_Call
+ then
+declare
+   C : constant Node_Id := Left_Opnd (N);
+   F : constant Node_Id := First (Parameter_Associations (C));
+   L : constant Node_Id := Prefix (F);
+   R : constant Node_Id := Prefix (Next (F));
+
+begin
+   --  If an operand is an entity name, it is the formal of the
+   --  predicate function, so replace it with the string value.
+   --  It may be either operand in the call. The other operand
+   --  is a static string from the original predicate.
+
+   if Is_Entity_Name (L) then
+  Rewrite (Left_Opnd (N),  New_Copy (Val));
+  Rewrite (Right_Opnd (N), New_Copy (R));
+
+   else
+  Rewrite (Left_Opnd (N),  New_Copy (L));
+  Rewrite (Right_Opnd (N), New_Copy (Val));
+   end if;
+
+   return Skip;
+end;
+
  else
 return OK;
  end if;
Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 244396)
+++ sem_ch13.adb(working copy)
@@ -11603,11 +11603,18 @@
 
   functio

Re: [PATCH][ARM]Use different startfile and endfile for elf target when generating shared object.

2017-01-13 Thread Renlin Li

Hi Christophe,

Thanks for testing the patch!
I check the test case gcc.dg/lto/pr54709, it seems the test case is not 
properly written.

It add extra ld option -shared without checking the target support for that. After the 
change, this compilation will fail as a regression.

IIUC, '-shared' option is required, it should be gated with corresponding 
target selector.

"g++.dg/ipa/devirt-28a.C" now is skipped because of the target selector there.
// { dg-do link { target { { gld && fpic } && shared } } }

perhaps "gcc.dg/lto/pr54709" should do similar things like this:
// { dg-do link { target { shared } } }


As far as I know, with different cpu/arch configurations, different relocations are 
generated in the library, some of the relocations are not allowed to be used in shared

object.

With -march=armv7-a (and the --with-cpu=cortex-a9 option you mentioned), the linking stage 
of the test will fail because of this error:

"relocation XXX against external symbol `YYY' can not be used when making a shared 
object"
for instance: crtbegin.o: relocation R_ARM_MOVW_ABS_NC against `a local symbol` can not be 
used when making a shared object; recompile with -fPIC


If you are luck enough, for example with arm7tdmi cpu, no such relocation is generated in 
startup files. The "shared" target support check will pass for simple and naive code.
"--with-cpu=cortex-m3" should be this case. But the test cases which require shared object 
support will fail.



So this "shared" target checking mechanism is not reliable. The patch is to 
change this.



Regards,
Renlin



On 13/01/17 08:48, Christophe Lyon wrote:

Hi Renlin,


On 12 January 2017 at 16:50, Renlin Li  wrote:

Hi Kugan,

some of the targets do include pie, and use the same crtbegin file as shared
object.
For example, alpha/elf.h

And there are targets which don't do that,
For example, sh/elf.h

Most of the elf target seem only consider the simple case.

The purpose of this patch is to make it possible to correctly check whether
current toolchain supports shared object.

Current dejegnu target selector "shared" tries to compile a simple source
code to with "-shared -fpic" options to check whether "-shared" is
supported.

For arm baremetal targets with, this is not sufficient.

arm-none-eabi is built with multilib. When running this testcase, if it's
compiled with "-march=armv7-a".
The crtbegin.o for this architecture version contains relocations which
cannot be used in shared object.
This test will fail.

if no cpu or architecture is specified, the default cpu will be arm7tdmi.
The test will pass as crtbegin.o for this version doesn't contains any
relocations
only allowed in shared object.

To make this "shared" target selector work for arm baremetal toolchain. The
correct way is to use different startup file for shared and non-shared
toolchain.

If the toolchain doesn't support "-shared" option, and someone attempts to
use it
to create shared object, it will complaint that "crtbeginS.o" cannot not be
found.



I have run validations with your patch, and noticed regressions on arm-none-eabi
when using default cpu or --with-cpu=cortex-m3:

   - PASS now FAIL [PASS => FAIL]:

   gcc.dg/lto/pr54709 c_lto_pr54709_0.o-c_lto_pr54709_1.o link,  -fPIC
-fvisibility=hidden -flto
   gcc.dg/lto/pr61526 c_lto_pr61526_0.o-c_lto_pr61526_1.o link,  -fPIC
-flto -flto-partition=1to1
   gcc.dg/lto/pr64415 c_lto_pr64415_0.o-c_lto_pr64415_1.o link,  -O -flto -fpic

on the same configurations, I've noticed these improvements:

   g++.dg/ipa/devirt-28a.C  -std=gnu++11 (test for excess errors)
   g++.dg/ipa/devirt-28a.C  -std=gnu++14 (test for excess errors)
   g++.dg/ipa/devirt-28a.C  -std=gnu++98 (test for excess errors)
are now unsupported rather than fail.

Why is it different when the toolchain is configured --with-cpu=cortex-a9
for instance? Are the tests involving -shared already skipped in this case?


A full history of discussion is here:
https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00322.html


Regards,
Renlin



On 12/01/17 11:47, kugan wrote:


Hi,

On 16/06/16 21:04, Renlin Li wrote:


  /* Now we define the strings used to build the spec file.  */
-#define UNKNOWN_ELF_STARTFILE_SPEC" crti%O%s crtbegin%O%s crt0%O%s"
+#define UNKNOWN_ELF_STARTFILE_SPEC\
+  "crti%O%s \
+  %{!shared:crtbegin%O%s} %{shared:crtbeginS%O%s} \
+  crt0%O%s"



Some targets seems to use shared|pie. When you change it, shouldn't it
also include for pie?

Thanks,
Kugan


Re: [PATCH, Fortran, pr70696, v1] [Coarray] ICE on EVENT POST of host-associated EVENT_TYPE coarray

2017-01-13 Thread Andre Vehreschild
Hi Jerry,

thanks for the review. Committed as r244407.

Will backport to gcc-6 in a week or so.

Regards,
Andre

On Thu, 12 Jan 2017 10:11:37 -0800
Jerry DeLisle  wrote:

> On 01/12/2017 03:45 AM, Andre Vehreschild wrote:
> > Hi all,
> > 
> > attached patch fixes the ICE when using an event in a subroutine. The reason
> > for the ICE was that the backend_decl of the symbol to event on was not set
> > when accessed. The patch ensures this. Furthermore did I fix a invalid
> > memory access in the caf_single lib, where to less memory was allocated in
> > the registration of the event.
> > 
> > Bootstraps and regtests ok on x86_64-linux/F25. Ok for trunk?
> > 
> > Regards,
> > Andre
> >   
> 
> OK and thanks.
> 
> Jerry


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 244394)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2017-01-13  Andre Vehreschild  
+
+	PR fortran/70696
+	* trans-expr.c (gfc_get_tree_for_caf_expr): Ensure the backend_decl
+	is valid before accessing it.
+
 2017-01-09  Jakub Jelinek  
 
 	PR translation/79019
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(Revision 244394)
+++ gcc/fortran/trans-expr.c	(Arbeitskopie)
@@ -1838,6 +1838,10 @@
 		 "component at %L is not supported", &expr->where);
   }
 
+  /* Make sure the backend_decl is present before accessing it.  */
+  if (expr->symtree->n.sym->backend_decl == NULL_TREE)
+expr->symtree->n.sym->backend_decl
+	= gfc_get_symbol_decl (expr->symtree->n.sym);
   caf_decl = expr->symtree->n.sym->backend_decl;
   gcc_assert (caf_decl);
   if (expr->symtree->n.sym->ts.type == BT_CLASS)
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 244394)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,8 @@
+2017-01-13  Andre Vehreschild  
+
+	PR fortran/70696
+	* gfortran.dg/coarray/event_3.f08: New test.
+
 2017-01-13  Richard Biener  
 
 	PR tree-optimization/77283
Index: gcc/testsuite/gfortran.dg/coarray/event_3.f08
===
--- gcc/testsuite/gfortran.dg/coarray/event_3.f08	(nicht existent)
+++ gcc/testsuite/gfortran.dg/coarray/event_3.f08	(Arbeitskopie)
@@ -0,0 +1,20 @@
+! { dg-do run }
+!
+! Check PR fortran/70696 is fixed.
+
+program global_event
+  use iso_fortran_env , only : event_type
+  implicit none
+  type(event_type) :: x[*]
+  
+  call exchange
+  contains
+subroutine exchange
+  integer :: cnt
+  event post(x[1])
+  event post(x[1])
+  call event_query(x, cnt)
+  if (cnt /= 2) error stop 1
+  event wait(x, until_count=2)
+end subroutine
+end 
Index: libgfortran/ChangeLog
===
--- libgfortran/ChangeLog	(Revision 244394)
+++ libgfortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2017-01-13  Andre Vehreschild  
+
+	PR fortran/70696
+	* caf/single.c (_gfortran_caf_register): Allocate enough memory for
+	the event counter.
+
 2017-01-07  Andre Vehreschild  
 
 	PR fortran/78781
Index: libgfortran/caf/single.c
===
--- libgfortran/caf/single.c	(Revision 244394)
+++ libgfortran/caf/single.c	(Arbeitskopie)
@@ -141,9 +141,12 @@
   caf_single_token_t single_token;
 
   if (type == CAF_REGTYPE_LOCK_STATIC || type == CAF_REGTYPE_LOCK_ALLOC
-  || type == CAF_REGTYPE_CRITICAL || type == CAF_REGTYPE_EVENT_STATIC
-  || type == CAF_REGTYPE_EVENT_ALLOC)
+  || type == CAF_REGTYPE_CRITICAL)
 local = calloc (size, sizeof (bool));
+  else if (type == CAF_REGTYPE_EVENT_STATIC || type == CAF_REGTYPE_EVENT_ALLOC)
+/* In the event_(wait|post) function the counter for events is a uint32,
+   so better allocate enough memory here.  */
+local = calloc (size, sizeof (uint32_t));
   else if (type == CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY)
 local = NULL;
   else


Re: [PATCH, Fortran, pr70697, v1] [Coarray] ICE on EVENT WAIT with array element UNTIL_COUNT argument

2017-01-13 Thread Andre Vehreschild
Hi Jerry,

thanks again for the fast review. Committed as r244413.

Will backport to gcc-6 in about a week.

Regards,
Andre

On Thu, 12 Jan 2017 10:12:24 -0800
Jerry DeLisle  wrote:

> On 01/12/2017 05:43 AM, Andre Vehreschild wrote:
> > Hi all,
> > 
> > *** this is no duplicate, but +1 in the PR#! ***
> > 
> > attached patch fixes the ICE by resolving the expression in UNTIL_COUNT
> > correctly. The ICE was caused by the array-specification in UNTIL_COUNT not
> > correctly set.
> > 
> > Bootstraps and regtests ok on x86_64-linux/F25. Ok for trunk and gcc-6?
> > 
> > - Andre
> >   
> 
> OK and thanks,
> 
> Jerry


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 244409)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,5 +1,11 @@
 2017-01-13  Andre Vehreschild  
 
+	PR fortran/70697
+	* resolve.c (resolve_lock_unlock_event): Resolve the expression for
+	event's until_count.
+
+2017-01-13  Andre Vehreschild  
+
 	PR fortran/70696
 	* trans-expr.c (gfc_get_tree_for_caf_expr): Ensure the backend_decl
 	is valid before accessing it.
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(Revision 244409)
+++ gcc/fortran/resolve.c	(Arbeitskopie)
@@ -9158,10 +9158,13 @@
 return;
 
   /* Check for EVENT WAIT the UNTIL_COUNT.  */
-  if (code->op == EXEC_EVENT_WAIT && code->expr4
-  && (code->expr4->ts.type != BT_INTEGER || code->expr4->rank != 0))
-gfc_error ("UNTIL_COUNT= argument at %L must be a scalar INTEGER "
-	   "expression", &code->expr4->where);
+  if (code->op == EXEC_EVENT_WAIT && code->expr4)
+{
+  if (!gfc_resolve_expr (code->expr4) || code->expr4->ts.type != BT_INTEGER
+	  || code->expr4->rank != 0)
+	gfc_error ("UNTIL_COUNT= argument at %L must be a scalar INTEGER "
+		   "expression", &code->expr4->where);
+}
 }
 
 
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 244409)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,5 +1,10 @@
 2017-01-13  Andre Vehreschild  
 
+	PR fortran/70697
+	* gfortran.dg/coarray/event_4.f08: New test.
+
+2017-01-13  Andre Vehreschild  
+
 	PR fortran/70696
 	* gfortran.dg/coarray/event_3.f08: New test.
 
Index: gcc/testsuite/gfortran.dg/coarray/event_4.f08
===
--- gcc/testsuite/gfortran.dg/coarray/event_4.f08	(nicht existent)
+++ gcc/testsuite/gfortran.dg/coarray/event_4.f08	(Arbeitskopie)
@@ -0,0 +1,12 @@
+! { dg-do run }
+!
+! Check that pr 70697 is fixed.
+
+program event_4
+  use iso_fortran_env
+  integer :: nc(1)
+  type(event_type) done[*]
+  nc(1) = 1
+  event post(done[1])
+  event wait(done,until_count=nc(1))
+end


Restore Solaris/SPARC Ada bootstrap

2017-01-13 Thread Rainer Orth
This patch (which I didn't see on gcc-patches)

r244367 | charlet | 2017-01-12 16:57:45 +0100 (Thu, 12 Jan 2017) | 2 lines
Changed paths:
   M /trunk/gcc/ada/ChangeLog
   M /trunk/gcc/ada/gcc-interface/Makefile.in

* gcc-interface/Makefile.in: Clean up VxWorks targets.

included a typo which broke Solaris/SPARC bootstrap:

gcc-interface/Makefile:1274: *** missing separator.  Stop.
make[3]: Leaving directory '/var/gcc/regression/trunk/12-gcc/build/gcc/ada'
Makefile:122: recipe for target 'osconstool' failed
make[2]: *** [osconstool] Error 2
make[2]: Leaving directory 
'/var/gcc/regression/trunk/12-gcc/build/sparc-sun-solaris2.12/libada'
Makefile:23305: recipe for target 'all-target-libada' failed
make[1]: *** [all-target-libada] Error 2

Fixed like this, will install shortly as obvious.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-01-13  Rainer Orth  

* gcc-interface/Makefile.in (SPARC/Solaris): Fix typo.

diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -1270,7 +1270,7 @@ ifeq ($(strip $(filter-out sparc% sun so
   s-tpopsp.adb

Re: Restore Solaris/SPARC Ada bootstrap

2017-01-13 Thread Arnaud Charlet
> * gcc-interface/Makefile.in: Clean up VxWorks targets.
> 
> included a typo which broke Solaris/SPARC bootstrap:
> 
> gcc-interface/Makefile:1274: *** missing separator.  Stop.
> make[3]: Leaving directory
> '/var/gcc/regression/trunk/12-gcc/build/gcc/ada'
> Makefile:122: recipe for target 'osconstool' failed
> make[2]: *** [osconstool] Error 2
> make[2]: Leaving directory
> '/var/gcc/regression/trunk/12-gcc/build/sparc-sun-solaris2.12/libada'
> Makefile:23305: recipe for target 'all-target-libada' failed
> make[1]: *** [all-target-libada] Error 2
> 
> Fixed like this, will install shortly as obvious.

Great, thanks.


[Ada] Expression functions as completions and private types

2017-01-13 Thread Arnaud Charlet
An expression function that is a completion is a freeze point for all
entities referenced in its expression. As a consequence, a reference
to an uncompleted private type declared in an enclosing scope is illegal.
This patch adds the proper check to enforce this rule.

Compiling env_baselines.ads must yield:

   env_baseline.ads:16:51: premature use of private type "T"

---
package Env_Baseline is
   package Side is
  type T is (First, Last);
   end Side;

   type T is private;

   package General is
  type T is private;

  function First (Set : T) return Env_Baseline.T;

   private
  type T is array (Side.T) of Env_Baseline.T;

  function First (Set : T) return Env_Baseline.T is (Set (Side.First));
   end General;

private
   type T is record
  Major : Natural;
  Minor : Natural;
   end record;
end Env_Baseline;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-01-13  Ed Schonberg  

* sem_ch6.adb (Analyze_Expression_Function): If the expression
function is a completion, all entities referenced in the
expression are frozen. As a consequence, a reference to an
uncompleted private type from an enclosing scope is illegal.

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 244418)
+++ sem_ch6.adb (working copy)
@@ -274,6 +274,7 @@
   New_Spec : Node_Id;
   Orig_N   : Node_Id;
   Ret  : Node_Id;
+  Ret_Type : Entity_Id;
 
   Prev : Entity_Id;
   --  If the expression is a completion, Prev is the entity whose
@@ -366,17 +367,35 @@
   then
  Set_Has_Completion (Prev, False);
  Set_Is_Inlined (Prev);
+ Ret_Type := Etype (Prev);
 
  --  An expression function that is a completion freezes the
- --  expression. This means freezing the return type, and if it is
- --  an access type, freezing its designated type as well.
+ --  expression. This means freezing the return type, and if it is an
+ --  access type, freezing its designated type as well.
 
  --  Note that we cannot defer this freezing to the analysis of the
  --  expression itself, because a freeze node might appear in a nested
  --  scope, leading to an elaboration order issue in gigi.
 
- Freeze_Before (N, Etype (Prev));
+ --  An entity can only be frozen if it has a completion, so we must
+ --  check this explicitly. If it is declared elsewhere it will have
+ --  been frozen already, so only types declared in currently opend
+ --  scopes need to be tested.
 
+ if Ekind (Ret_Type) = E_Private_Type
+   and then In_Open_Scopes (Scope (Ret_Type))
+   and then not Is_Generic_Type (Ret_Type)
+   and then not Is_Frozen (Ret_Type)
+   and then No (Full_View (Ret_Type))
+ then
+Error_Msg_NE
+  ("premature use of private type&",
+   Result_Definition (Specification (N)), Ret_Type);
+
+ else
+Freeze_Before (N, Ret_Type);
+ end if;
+
  if Is_Access_Type (Etype (Prev)) then
 Freeze_Before (N, Designated_Type (Etype (Prev)));
  end if;


Re: [PATCH] Fix assertions along default switch labels (PR tree-optimization/78819)

2017-01-13 Thread Marek Polacek
On Fri, Dec 16, 2016 at 03:16:14PM +0100, Marek Polacek wrote:
> On Fri, Dec 16, 2016 at 02:58:59PM +0100, Richard Biener wrote:
> > On Fri, Dec 16, 2016 at 2:03 PM, Bernd Schmidt  wrote:
> > > On 12/16/2016 12:49 PM, Marek Polacek wrote:
> > >
> > >> But as this testcase shows, this breaks when the default label shares a
> > >> label
> > >> with another case.  On this testcase, when we reach the switch, we know
> > >> that
> > >> argc is either 1, 2, or 3.  So by the time we reach vrp2, the IR will 
> > >> have
> > >> been optimized to
> > >>
> > >>   switch (argc) {
> > >> default:
> > >> case 3:
> > >>   // argc will be considered 1 despite the case 3
> > >>   break;
> > >> case 2:
> > >>   ...
> > >>}
> > >
> > >
> > > Shouldn't we just remove the "case 3:" from the switch in this case? Would
> > > that fix things?
> > 
> > We probably should indeed.  But can we rely on this?
> 
> I think we should do both -- apply my fix + investigated why we kept case 3
> around.  I'm willing to look into this.

This is work of simplify_switch_using_ranges and then
11293   /* Update SWITCH_EXPR case label vector.  */
11294   FOR_EACH_VEC_ELT (to_update_switch_stmts, i, su)
11295 {
11296   size_t j;
11297   size_t n = TREE_VEC_LENGTH (su->vec);
11298   tree label;
11299   gimple_switch_set_num_labels (su->stmt, n);
11300   for (j = 0; j < n; j++)
11301 gimple_switch_set_label (su->stmt, j, TREE_VEC_ELT (su->vec, j));
11302   /* As we may have replaced the default label with a regular one
11303  make sure to make it a real default label again.  This ensures
11304  optimal expansion.  */
11305   label = gimple_switch_label (su->stmt, 0);
11306   CASE_LOW (label) = NULL_TREE;
11307   CASE_HIGH (label) = NULL_TREE;
^^ this

But only after I wrote a patch I noticed that cfgcleanup merges default: and
case 3: before we reach .optimized, so I think there's nothing to do here.

Marek


Re: [PATCH][ARM]Use different startfile and endfile for elf target when generating shared object.

2017-01-13 Thread Christophe Lyon
On 13 January 2017 at 11:22, Renlin Li  wrote:
> Hi Christophe,
>
> Thanks for testing the patch!
> I check the test case gcc.dg/lto/pr54709, it seems the test case is not
> properly written.
>
> It add extra ld option -shared without checking the target support for that.
> After the change, this compilation will fail as a regression.
> IIUC, '-shared' option is required, it should be gated with corresponding
> target selector.
>
> "g++.dg/ipa/devirt-28a.C" now is skipped because of the target selector
> there.
> // { dg-do link { target { { gld && fpic } && shared } } }
>
> perhaps "gcc.dg/lto/pr54709" should do similar things like this:
> // { dg-do link { target { shared } } }
Quite likely, indeed.

>
>
> As far as I know, with different cpu/arch configurations, different
> relocations are generated in the library, some of the relocations are not
> allowed to be used in shared
> object.
>
> With -march=armv7-a (and the --with-cpu=cortex-a9 option you mentioned), the
> linking stage of the test will fail because of this error:
> "relocation XXX against external symbol `YYY' can not be used when making a
> shared object"
> for instance: crtbegin.o: relocation R_ARM_MOVW_ABS_NC against `a local
> symbol` can not be used when making a shared object; recompile with -fPIC
>
> If you are luck enough, for example with arm7tdmi cpu, no such relocation is
> generated in startup files. The "shared" target support check will pass for
> simple and naive code.
> "--with-cpu=cortex-m3" should be this case. But the test cases which require
> shared object support will fail.
>
>
> So this "shared" target checking mechanism is not reliable. The patch is to
> change this.
>
Shouldn't your patch imply that several tests move from "fail" to
"unsupported" on armv7-a ? I'm surprised not to see any difference in the
results.

>
>
> Regards,
> Renlin
>
>
>
>
> On 13/01/17 08:48, Christophe Lyon wrote:
>>
>> Hi Renlin,
>>
>>
>> On 12 January 2017 at 16:50, Renlin Li  wrote:
>>>
>>> Hi Kugan,
>>>
>>> some of the targets do include pie, and use the same crtbegin file as
>>> shared
>>> object.
>>> For example, alpha/elf.h
>>>
>>> And there are targets which don't do that,
>>> For example, sh/elf.h
>>>
>>> Most of the elf target seem only consider the simple case.
>>>
>>> The purpose of this patch is to make it possible to correctly check
>>> whether
>>> current toolchain supports shared object.
>>>
>>> Current dejegnu target selector "shared" tries to compile a simple source
>>> code to with "-shared -fpic" options to check whether "-shared" is
>>> supported.
>>>
>>> For arm baremetal targets with, this is not sufficient.
>>>
>>> arm-none-eabi is built with multilib. When running this testcase, if it's
>>> compiled with "-march=armv7-a".
>>> The crtbegin.o for this architecture version contains relocations which
>>> cannot be used in shared object.
>>> This test will fail.
>>>
>>> if no cpu or architecture is specified, the default cpu will be arm7tdmi.
>>> The test will pass as crtbegin.o for this version doesn't contains any
>>> relocations
>>> only allowed in shared object.
>>>
>>> To make this "shared" target selector work for arm baremetal toolchain.
>>> The
>>> correct way is to use different startup file for shared and non-shared
>>> toolchain.
>>>
>>> If the toolchain doesn't support "-shared" option, and someone attempts
>>> to
>>> use it
>>> to create shared object, it will complaint that "crtbeginS.o" cannot not
>>> be
>>> found.
>>>
>>
>> I have run validations with your patch, and noticed regressions on
>> arm-none-eabi
>> when using default cpu or --with-cpu=cortex-m3:
>>
>>- PASS now FAIL [PASS => FAIL]:
>>
>>gcc.dg/lto/pr54709 c_lto_pr54709_0.o-c_lto_pr54709_1.o link,  -fPIC
>> -fvisibility=hidden -flto
>>gcc.dg/lto/pr61526 c_lto_pr61526_0.o-c_lto_pr61526_1.o link,  -fPIC
>> -flto -flto-partition=1to1
>>gcc.dg/lto/pr64415 c_lto_pr64415_0.o-c_lto_pr64415_1.o link,  -O -flto
>> -fpic
>>
>> on the same configurations, I've noticed these improvements:
>>
>>g++.dg/ipa/devirt-28a.C  -std=gnu++11 (test for excess errors)
>>g++.dg/ipa/devirt-28a.C  -std=gnu++14 (test for excess errors)
>>g++.dg/ipa/devirt-28a.C  -std=gnu++98 (test for excess errors)
>> are now unsupported rather than fail.
>>
>> Why is it different when the toolchain is configured --with-cpu=cortex-a9
>> for instance? Are the tests involving -shared already skipped in this
>> case?
>>
>>> A full history of discussion is here:
>>> https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00322.html
>>>
>>>
>>> Regards,
>>> Renlin
>>>
>>>
>>>
>>> On 12/01/17 11:47, kugan wrote:


 Hi,

 On 16/06/16 21:04, Renlin Li wrote:
>
>
>   /* Now we define the strings used to build the spec file.  */
> -#define UNKNOWN_ELF_STARTFILE_SPEC" crti%O%s crtbegin%O%s
> crt0%O%s"
> +#define UNKNOWN_ELF_STARTFILE_SPEC\
> +  "crti%O%s \
> +  %{!shared:crtbegin%O%s} %{shared:crtbeginS%O%s} \
>

Re: [PATCH] Fix gcc.dg/tree-ssa/scev-[345].c testcases

2017-01-13 Thread Bin.Cheng
On Fri, Jan 13, 2017 at 9:46 AM, Richard Biener  wrote:
>
> The following is an attempt to change those testcases to be less dependent
> on previous passes.  The original motivation of the testcases seems to be
> testing SCEV capabilities and in turn IVOPTs decisions, thus the testcases
> are changed to check the IVO dump, use the GIMPLE FE feeding the loop
> pipeline directly and skip lowering/store-motion we meanwhile do to
> the testcase.
>
> To avoid some existing issue with CFG construction after GIMPLE parsing
> we need to be able to add GIMPLE_NOPs which the patch enables to generate
> from empty stmts (previously those resulted in parse errors).
>
> Tested the testcases on x86_64 with {,-m32} sofar I'll appreciate
> testing on more targets.
I checked aarch64-elf/aarch64-linux with default configuration, all
passed with this change.

Thanks,
bin
>
> Full bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>

> -}
>  }
>
> -/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "&a" 1 "ivopts" } } */


[Ada] Spurious dimensional error in SPARK mode

2017-01-13 Thread Arnaud Charlet
THis patch removes a spurious error on mismatched dimensions in the body of a
subprogram that is inlined for SPARK purposes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-01-13  Ed Schonberg  

* sem_ch8.adb (Analyze_Expanded_Name): Perfrom dimension analysis
even if entity is already set, because the node may be renalyzed
after inlining transformations.

Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 244418)
+++ sem_ch8.adb (working copy)
@@ -609,11 +609,12 @@
 Set_Etype (N, Etype (Entity (N)));
  end if;
 
- return;
   else
  Find_Expanded_Name (N);
   end if;
 
+  --  In either case, propagate dimension of entity to expanded name
+
   Analyze_Dimension (N);
end Analyze_Expanded_Name;
 


[Ada] Inlining of expression function returning controlled object

2017-01-13 Thread Arnaud Charlet
Pragma Inline_Always has been extended to support inlining of calls
to expression functions that return a controlled object if the
expression function fulfills all the following requirements:
  1. Has pragma/aspect Inline_Always
  2. Has no formals
  3. Has no contracts
  4. Has no dispatching primitive
  5. Its result subtype is controlled (or with controlled components)
  6. Its result subtype not subject to type-invariant checks
  7. Its result subtype not a class-wide type
  8. Its return expression naming an object global to the function
  9. The nominal subtype of the returned object statically compatible with
 the result subtype of the expression function.

After this enhancement, using the following sources, the call to the
expression function Ada_Exception_Trace is now inlined.

with Ada.Finalization;
package Param is
   type T is abstract tagged private;
private
   type T is abstract new Ada.Finalization.Controlled with null record;

   procedure Initialize (Obj : in out T);
   procedure Adjust (Obj : in out T);
   procedure Finalize   (Obj : in out T);
end;

package Param.Debug is
   type T is private;

   function Value (Parameter : T) return Boolean with Inline_Always;
   function Ada_Exception_Trace return T with Inline_Always;
   procedure Do_Test;
private
   type Comp_T is new Param.T with record
  Value : Boolean := True;
   end record;

   type T is record
  Component : Comp_T;
   end record;

   function Value (Parameter : T) return Boolean
 is (Parameter.Component.Value);

   Private_Ada_Exception_Trace : T;

   function Ada_Exception_Trace return T  --  Test
 is (Private_Ada_Exception_Trace);
end Param.Debug;

with Ada.Text_IO; use Ada.Text_IO;
package body Param is
   procedure Initialize (Obj : in out T) is
   begin
  Put_Line ("Initialize()");
   end;

   procedure Adjust (Obj : in out T) is
   begin
  Put_Line ("Adjust()");
   end;

   procedure Finalize (Obj : in out T) is
   begin
  Put_Line ("Finalize()");
   end;
end;

with Ada.Text_IO; use Ada.Text_IO;
package body Param.Debug is
   procedure Do_Test is
   begin
  if Value (Ada_Exception_Trace) then-- Test
 Put_Line ("Do_Test()");
  end if;
   end;
end;

with Param.Debug;
procedure Main is
begin
   Param.Debug.Do_Test;
end;

Command: gnatmake main.adb; ./main
Output:
  Initialize()
  Do_Test()
  Finalize()

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-01-13  Javier Miranda  

* sem_res.adb (Resolve_Call): Do not establish a transient scope
for a call to inlinable expression function (since the call will
be replaced by its returned object).
* exp_ch6.ads (Is_Inlinable_Expression_Function): New subprogram.
* exp_ch6.adb (Expression_Of_Expression_Function): New subprogram.
(Expand_Call): For inlinable expression function call replace the
call by its returned object.
(Is_Inlinable_Expression_Function): New subprogram.

Index: sem_res.adb
===
--- sem_res.adb (revision 244418)
+++ sem_res.adb (working copy)
@@ -6260,7 +6260,10 @@
   --  within the specialized Exp_Ch6 procedures for expanding those
   --  build-in-place calls.
 
-  --  e) If the subprogram is marked Inline_Always, then even if it returns
+  --  e) Calls to inlinable expression functions do not use the secondary
+  --  stack (since the call will be replaced by its returned object).
+
+  --  f) If the subprogram is marked Inline_Always, then even if it returns
   --  an unconstrained type the call does not require use of the secondary
   --  stack. However, inlining will only take place if the body to inline
   --  is already present. It may not be available if e.g. the subprogram is
@@ -6281,6 +6284,7 @@
   elsif Ekind (Nam) = E_Enumeration_Literal
 or else Is_Build_In_Place_Function (Nam)
 or else Is_Intrinsic_Subprogram (Nam)
+or else Is_Inlinable_Expression_Function (Nam)
   then
  null;
 
Index: exp_ch6.adb
===
--- exp_ch6.adb (revision 244418)
+++ exp_ch6.adb (working copy)
@@ -219,6 +219,10 @@
--  reference to the object itself, and the call becomes a call to the
--  corresponding protected subprogram.
 
+   function Expression_Of_Expression_Function
+ (Subp : Entity_Id) return Node_Id;
+   --  Return the expression of the expression function Subp
+
function Has_Unconstrained_Access_Discriminants
  (Subtyp : Entity_Id) return Boolean;
--  Returns True if the given subtype is unconstrained and has one
@@ -3938,6 +3942,14 @@
  if not Is_Inlined (Subp) then
 null;
 
+ --  Frontend inlining of expression functions (performed also when
+ --  backend inlining is enabled)
+
+ elsif Is_Inlinable_Expression_Function (Subp) then
+Rewrite (N, New_Copy (Expressio

[PATCH] Adjust gcc.target/i386/pr45685.c testcase

2017-01-13 Thread Richard Biener

This makes it pass again.

Tested on x86_64-unknown-linux-gnu.

Richard.

2017-01-13  Richard Biener  

PR middle-end/78411
* gcc.target/i386/pr45685.c: Add -ftree-loop-if-convert.

Index: gcc/testsuite/gcc.target/i386/pr45685.c
===
--- gcc/testsuite/gcc.target/i386/pr45685.c (revision 244393)
+++ gcc/testsuite/gcc.target/i386/pr45685.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O3 -mno-sse4" } */
+/* { dg-options "-O3 -mno-sse4 -ftree-loop-if-convert" } */
 
 typedef unsigned long long int uint64_t;
 typedef long long int int64_t;


PR79066, non-PIC code generated for powerpc glibc with -fpic

2017-01-13 Thread Alan Modra
Bootstrapped and regression tested powerpc64-linux biarch.  elf_high
has said "Elf specific ways of loading addresses for non-PIC code"
 ^^^
right from the inital V4 support in 1995.

OK for mainline?

PR target/79066
* config/rs6000/rs6000.md (elf_high, elf_low): Disable when pic.
testsuite/
* gcc.target/powerpc/pr79066.c: New.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 80c10a7..98209fa 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -10597,14 +10597,14 @@ (define_insn_and_split "*tocref"
 (define_insn "elf_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b*r")
(high:SI (match_operand 1 "" "")))]
-  "TARGET_ELF && ! TARGET_64BIT"
+  "TARGET_ELF && !TARGET_64BIT && !flag_pic"
   "lis %0,%1@ha")
 
 (define_insn "elf_low"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
-   "TARGET_ELF && ! TARGET_64BIT"
+   "TARGET_ELF && !TARGET_64BIT && !flag_pic"
"la %0,%2@l(%1)")
 
 ;; Call and call_value insns
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79066.c 
b/gcc/testsuite/gcc.target/powerpc/pr79066.c
new file mode 100644
index 000..86b2014
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr79066.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { fpic && ilp32 } } } */
+/* { dg-options "-O2 -fpic" } */
+/* { dg-final { scan-assembler-not "lis.*@ha" } } */
+
+union U { double x; int i[2]; };
+
+double
+foo (double x)
+{
+  union U v;
+  v.i[0] = 0x7ff0;
+  v.i[1] = 0;
+  return x / v.x;
+}

-- 
Alan Modra
Australia Development Lab, IBM


Avoid PR72749 by not using unspecs

2017-01-13 Thread Alan Modra
Rather than using unspecs in doloop insns to stop combine creating
these insns, use legitimate_combined_insn.

I'm not sure why the original patch implementing
legitimate_combined_insn did not store the result of recog to the insn
but it seems good to me, and would allow the recog call in
ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
shown here.)

Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux,
x86_64-linux.  OK for mainline?

PR target/72749
* combine.c (recog_for_combine_1): Set INSN_CODE before calling
target legitimate_combined_insn.
* config/rs6000/rs6000.c (TARGET_LEGITIMATE_COMBINED_INSN): Define.
(rs6000_legitimate_combined_insn): New function.
* config/rs6000/rs6000.md (UNSPEC_DOLOOP): Delete, and remove
all uses.
(ctr_internal3): Rename from *ctr_internal5.
(ctr_internal4): Rename from *ctr_internal6.
(ctr_internal1, ctr_internal2): Remove '*' from name.

diff --git a/gcc/combine.c b/gcc/combine.c
index 3043f2a..73a74ac 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11199,6 +11199,7 @@ recog_for_combine_1 (rtx *pnewpat, rtx_insn *insn, rtx 
*pnotes)
   old_icode = INSN_CODE (insn);
   PATTERN (insn) = pat;
   REG_NOTES (insn) = notes;
+  INSN_CODE (insn) = insn_code_number;
 
   /* Allow targets to reject combined insn.  */
   if (!targetm.legitimate_combined_insn (insn))
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 02b521c..76ba81b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1558,6 +1558,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_CONST_NOT_OK_FOR_DEBUG_P
 #define TARGET_CONST_NOT_OK_FOR_DEBUG_P rs6000_const_not_ok_for_debug_p
 
+#undef TARGET_LEGITIMATE_COMBINED_INSN
+#define TARGET_LEGITIMATE_COMBINED_INSN rs6000_legitimate_combined_insn
+
 #undef TARGET_ASM_FUNCTION_PROLOGUE
 #define TARGET_ASM_FUNCTION_PROLOGUE rs6000_output_function_prologue
 #undef TARGET_ASM_FUNCTION_EPILOGUE
@@ -9076,6 +9079,49 @@ rs6000_const_not_ok_for_debug_p (rtx x)
   return false;
 }
 
+
+/* Implement the TARGET_LEGITIMATE_COMBINED_INSN hook.  */
+
+static bool
+rs6000_legitimate_combined_insn (rtx_insn *insn)
+{
+  switch (INSN_CODE (insn))
+{
+  /* Reject creating doloop insns.  Combine should not be allowed
+to create these for a number of reasons:
+1) In a nested loop, if combine creates one of these in an
+outer loop and the register allocator happens to allocate ctr
+to the outer loop insn, then the inner loop can't use ctr.
+Inner loops ought to be more highly optimized.
+2) Combine often wants to create one of these from what was
+originally a three insn sequence, first combining the three
+insns to two, then to ctrsi/ctrdi.  When ctrsi/ctrdi is not
+allocated ctr, the splitter takes use back to the three insn
+sequence.  It's better to stop combine at the two insn
+sequence.
+3) Faced with not being able to allocate ctr for ctrsi/crtdi
+insns, the register allocator sometimes uses floating point
+or vector registers for the pseudo.  Since ctrsi/ctrdi is a
+jump insn and output reloads are not implemented for jumps,
+the ctrsi/ctrdi splitters need to handle all possible cases.
+That's a pain, and it gets to be seriously difficult when a
+splitter that runs after reload needs memory to transfer from
+a gpr to fpr.  See PR70098 and PR71763 which are not fixed
+for the difficult case.  It's better to not create problems
+in the first place.  */
+case CODE_FOR_ctrsi_internal1:
+case CODE_FOR_ctrdi_internal1:
+case CODE_FOR_ctrsi_internal2:
+case CODE_FOR_ctrdi_internal2:
+case CODE_FOR_ctrsi_internal3:
+case CODE_FOR_ctrdi_internal3:
+case CODE_FOR_ctrsi_internal4:
+case CODE_FOR_ctrdi_internal4:
+  return false;
+}
+  return true;
+}
+
 /* Construct the SYMBOL_REF for the tls_get_addr function.  */
 
 static GTY(()) rtx rs6000_tls_symbol;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f7c1ab2..9442b07 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -149,7 +149,6 @@ (define_c_enum "unspec"
UNSPEC_IEEE128_MOVE
UNSPEC_IEEE128_CONVERT
UNSPEC_SIGNBIT
-   UNSPEC_DOLOOP
UNSPEC_SF_FROM_SI
UNSPEC_SI_FROM_SF
   ])
@@ -12740,7 +12739,6 @@ (define_expand "ctr"
  (set (match_dup 0)
   (plus:P (match_dup 0)
(const_int -1)))
- (unspec [(const_int 0)] UNSPEC_DOLOOP)
  (clobber (match_scratch:CC 2 ""))
  (clobber (match_scratch:P 3 ""))])]
   ""
@@ -12751,9 +12749,8 @@ (define_expand "ctr"
 ;; JUMP_INSNs.
 ;; For the length attribute to be calculated correctly, the
 ;; label MUST be operand 0.

Re: [PATCH] builtin expansion of strncmp for rs6000

2017-01-13 Thread Tulio Magno Quites Machado Filho
Segher Boessenkool  writes:

> On Thu, Jan 12, 2017 at 05:53:06PM +, Joseph Myers wrote:
>> On Thu, 15 Dec 2016, Aaron Sawdey wrote:
>> 
>> > +  emit_library_call_value (gen_rtx_SYMBOL_REF (Pmode, "strncmp"),
>> > + target, LCT_NORMAL, GET_MODE (target), 3,
>> > + force_reg (Pmode, XEXP (src1, 0)), Pmode,
>> > + force_reg (Pmode, XEXP (src2, 0)), Pmode,
>> > + len_rtx, GET_MODE (len_rtx));
>> 
>> Building glibc with GCC mainline for powerpc64le-linux-gnu (from my bot 
>> doing such builds and compilation parts of the testsuite for all GNU/Linux 
>> glibc ABIs daily), I'm seeing a failure of the elf/check-localplt test, 
>> "Extra PLT reference: libc.so: strncmp".
>> 
>> Without having bisected, I suspect this patch of being a likely cause of 
>> that failure.  glibc redirects internal calls to use hidden aliases such 
>> as __GI_strncmp to avoid them going through the PLT.  By using 
>> gen_rtx_SYMBOL_REF there with a hardcoded function name "strncmp", this 
>> code looks like it would render a declaration of strncmp with asm 
>> ("__GI_strncmp") ineffective, generating a direct call to strncmp when 
>> glibc expects __GI_strncmp to be called instead.
>> 
>> I don't know how readily this code can be made to respect asm renaming of 
>> strncmp.  If it's hard to fix in GCC, I suppose glibc needs a powerpc 
>> version of symbol-hacks.h to handle this redirection.
>
> Tulio will look at fixing it in glibc tomorrow.  Thanks for the report,

Does this code affects a specific rs6000 build? i.e. powerpc64le -mcpu=power8.
Or is it generic?

-- 
Tulio Magno



[patch,avr]: For PR78883 #3

2017-01-13 Thread Georg-Johann Lay

This is 3rd way to fix PR78883 by rejecting malicious expressions
from the start.

Ok for trunk?

Johann

gcc/
PR target/78883
* config/avr/avr.c (rtl-iter.h): Include it.
(TARGET_LEGITIMATE_COMBINED_INSN): New hook define...
(avr_legitimate_combined_insn): ...and implementation.

gcc/testsuite/
PR target/78883
* gcc.c-torture/compile/pr78883.c: New test.

Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 244001)
+++ config/avr/avr.c	(working copy)
@@ -49,6 +49,7 @@
 #include "context.h"
 #include "tree-pass.h"
 #include "print-rtl.h"
+#include "rtl-iter.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -12823,6 +12824,34 @@ avr_convert_to_type (tree type, tree exp
 }
 
 
+/* Implement `TARGET_LEGITIMATE_COMBINED_INSN'.  */
+
+/* PR78883: Filter out paradoxical SUBREGs of MEM which are not handled
+   properly by following passes.  As INSN_SCHEDULING is off and hence
+   general_operand accepts such expressions, ditch them now.  */
+
+static bool
+avr_legitimate_combined_insn (rtx_insn *insn)
+{
+  subrtx_iterator::array_type array;
+
+  FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
+{
+  const_rtx op = *iter;
+
+  if (SUBREG_P (op)
+  && MEM_P (SUBREG_REG (op))
+  && (GET_MODE_SIZE (GET_MODE (op))
+  > GET_MODE_SIZE (GET_MODE (SUBREG_REG (op)
+{
+  return false;
+}
+}
+
+  return true;
+}
+
+
 /* PR63633: The middle-end might come up with hard regs as input operands.
 
RMASK is a bit mask representing a subset of hard registers R0...R31:
@@ -14364,6 +14393,9 @@ avr_fold_builtin (tree fndecl, int n_arg
 #define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
   avr_use_by_pieces_infrastructure_p
 
+#undef  TARGET_LEGITIMATE_COMBINED_INSN
+#define TARGET_LEGITIMATE_COMBINED_INSN avr_legitimate_combined_insn
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 
Index: testsuite/gcc.c-torture/compile/pr78883.c
===
--- testsuite/gcc.c-torture/compile/pr78883.c	(nonexistent)
+++ testsuite/gcc.c-torture/compile/pr78883.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+int foo (int *p)
+{
+  int i;
+  for (i = 0; i < 5; i++)
+{
+  if (p[i] & 1)
+return i;
+}
+  return -1;
+}


Re: [patch] update zlib to the 1.2.10 release.

2017-01-13 Thread Matthias Klose
On 12.01.2017 22:39, Jeff Law wrote:
> On 01/12/2017 02:26 PM, Matthias Klose wrote:
>> On 12.01.2017 22:17, Jeff Law wrote:
>>> On 01/05/2017 07:45 AM, Matthias Klose wrote:
 These are the changes updating zlib from 1.2.8 to 1.2.10. It is only used 
 when
 building without a system zlib.  The new release includes fixes for 
 security
 issues CVE-2016-9840, CVE-2016-9841, CVE-2016-9842, CVE-2016-9843.

 Checked with a build with disabled system-zlib. Ok for the trunk?
>>> Were there any changes that we needed to carry forward or any changes you 
>>> needed
>>> to make to the upstream sources?
>>
>> I backed out the changes to the configure* and Makefile* changes (and only
>> these), which are completely different to zlib upstream. There are no
>> additions/deletions to zlib source files, so these build changes still work 
>> with
>> the updated zlib.
> One more note.  I think that, in general, backing out local changes which 
> don't
> have a strong need to be carried forward is absolutely the right thing to do. 
> The less hacking we do on these libraries we pull from other sources, the
> better, IMHO.

agreed, except for the build system (and a locally fixed typo in zlib's
ChangeLog), everything applied cleanly.

Please import it into the binutils-gdb repository.

Matthias



Re: [PATCH] Reload global options when strict aliasing is dropped (PR ipa/79043).

2017-01-13 Thread Richard Biener
On Tue, Jan 10, 2017 at 4:28 PM, Martin Liška  wrote:
> As mentioned in the PR, we currently do not properly reload global
> optimization options when we drop strict-aliasing flag on a function
> that equals to cfun.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

Ok.

Richard.

> Martin


[PATCH] PR78361 recognise noexcept functions as referenceable

2017-01-13 Thread Jonathan Wakely

This ensures that __is_referenceable gives the right answer for
functions with noexcept in the type, which exist in C++17.

I'll also review the stuff in  for similar problems.

Tested powerpc64le-linux, -std=gnu++{14,17}, committed to trunk.


commit 7029af834c490402beaab6336a1d11d286f65cc8
Author: Jonathan Wakely 
Date:   Fri Jan 13 11:47:29 2017 +

PR78361 recognise noexcept functions as referenceable

2017-01-13  Jonathan Wakely  

PR libstdc++/78361
* testsuite/20_util/add_pointer/value.cc: Test forming function
pointers.

2017-01-13  Michael Brune  

PR libstdc++/78361
* include/std/type_traits (__is_referenceable): Handle noexcept
function types.

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index d0fa390..a50f06c 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -641,13 +641,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public __or_, is_reference<_Tp>>::type
 { };
 
-  template
-struct __is_referenceable<_Res(_Args...)>
+  template
+struct __is_referenceable<_Res(_Args...) _GLIBCXX_NOEXCEPT_QUAL>
 : public true_type
 { };
 
-  template
-struct __is_referenceable<_Res(_Args..)>
+  template
+struct __is_referenceable<_Res(_Args..) _GLIBCXX_NOEXCEPT_QUAL>
 : public true_type
 { };
 
diff --git a/libstdc++-v3/testsuite/20_util/add_pointer/value.cc 
b/libstdc++-v3/testsuite/20_util/add_pointer/value.cc
index e0688cb..a2f1e67 100644
--- a/libstdc++-v3/testsuite/20_util/add_pointer/value.cc
+++ b/libstdc++-v3/testsuite/20_util/add_pointer/value.cc
@@ -34,3 +34,18 @@ void test01()
ClassType**>::value, "");
   static_assert(is_same::type, ClassType*>::value, "");
 }
+
+void test02()
+{
+  using std::add_pointer;
+  using std::is_same;
+
+  void f1();
+  using f1_type = decltype(f1);
+  using pf1_type = decltype(&f1);
+  static_assert(is_same::type, pf1_type>::value, "");
+  void f2() noexcept; // PR libstdc++/78361
+  using f2_type = decltype(f2);
+  using pf2_type = decltype(&f2);
+  static_assert(is_same::type, pf2_type>::value, "");
+}


[PATCH v5] add -fprolog-pad=N,M option

2017-01-13 Thread Torsten Duwe
Changes since v4: hopefully addressed all of Sandra's requests
and suggestions concerning the documentation snippets, thanks
for the feedback.  If it still isn't clear, feel free to rephrase
-- I'm a programmer, not a technical writer.

2017-01-13  Torsten Duwe : 

 * c-family/c-attribs.c : introduce prolog_pad attribute and create
   a handler for it.

 * lto/lto-lang.c : Likewise.

 * common.opt : introduce -fprolog_pad command line option
   and its variables prolog_nop_pad_size and prolog_nop_pad_entry.

 * doc/extend.texi : document prolog_pad attribute.

 * doc/invoke.texi : document -fprolog_pad command line option.

 * opts.c (OPT_fprolog_pad_): add parser.

 * doc/tm.texi.in (TARGET_ASM_PRINT_PROLOG_PAD): new target hook

 * doc/tm.texi : Likewise.

 * target.def (print_prolog_pad): Likewise.

 * targhooks.h (default_print_prolog_pad): new function.

 * targhooks.c (default_print_prolog_pad): Likewise.

 * testsuite/c-c++-common/attribute-prolog_pad-1.c : New test.

 * toplev.c (process_options): Switch off IPA-RA if
   prolog pads are being generated.

 * varasm.c (assemble_start_function): look at prolog-pad command
   line switch and function attributes and maybe generate NOP
   pads by calling print_prolog_pad.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ce7fcaa..5c6cf1c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -139,6 +139,7 @@ static tree handle_bnd_variable_size_attribute (tree *, 
tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -345,6 +346,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_instrument, false },
   { "fallthrough",   0, 0, false, false, false,
  handle_fallthrough_attribute, false },
+  { "prolog_pad",1, 2, true, false, false,
+ handle_prolog_pad_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -3173,3 +3176,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, 
int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int,
+bool *)
+{
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index 8ad5b77..37d4009 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; If we should generate NOP pads before each function prologue
+Variable
+HOST_WIDE_INT prolog_nop_pad_size
+
+; And how far the asm entry point is into this pad
+Variable
+HOST_WIDE_INT prolog_nop_pad_entry
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2019,6 +2026,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fprolog-pad=
+Common Joined Optimization
+Pad NOPs before each function prologue.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 6be113c..fb7d62d 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3076,6 +3076,23 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item prolog_pad
+@cindex @code{prolog_pad} function attribute
+@cindex function entry padded with NOPs
+In case the target's text segment can be made writable at run time
+by any means, padding the function entry with a number of NOPs can
+be used to provide a universal tool for instrumentation.  Usually,
+prolog padding is enabled globally using the @option{-fprolog-pad=N,M}
+command-line switch, and disabled with attribute @code{prolog_pad (0)}
+for functions that are part of the actual instrumentation framework.
+This conveniently avoids an endless recursion.
+Of course the @code{prolog_pad} function attribute can be used to
+change the pad size to any desired value.  The two-value syntax is
+the same as for the command-line switch @option{-fprolog-pad=N,M},
+generating a NOP pad of size @var{N}, with the function entry point
+@var{M} NOP instructions into the pad.  @var{M} defaults to 0
+if omitted e.g. function entry point is the beginning of the pad.
+
 @item pure
 @cindex @code{pure} function attribute
 @cindex functions that have no side effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 69f0adb

Re: [PATCH][ARM]Use different startfile and endfile for elf target when generating shared object.

2017-01-13 Thread Renlin Li

Hi Christophe,

On 13/01/17 11:14, Christophe Lyon wrote:

On 13 January 2017 at 11:22, Renlin Li  wrote:

Hi Christophe,

Thanks for testing the patch!
I check the test case gcc.dg/lto/pr54709, it seems the test case is not
properly written.

It add extra ld option -shared without checking the target support for that.
After the change, this compilation will fail as a regression.
IIUC, '-shared' option is required, it should be gated with corresponding
target selector.

"g++.dg/ipa/devirt-28a.C" now is skipped because of the target selector
there.
// { dg-do link { target { { gld && fpic } && shared } } }

perhaps "gcc.dg/lto/pr54709" should do similar things like this:
// { dg-do link { target { shared } } }

Quite likely, indeed.




As far as I know, with different cpu/arch configurations, different
relocations are generated in the library, some of the relocations are not
allowed to be used in shared
object.

With -march=armv7-a (and the --with-cpu=cortex-a9 option you mentioned), the
linking stage of the test will fail because of this error:
"relocation XXX against external symbol `YYY' can not be used when making a
shared object"
for instance: crtbegin.o: relocation R_ARM_MOVW_ABS_NC against `a local
symbol` can not be used when making a shared object; recompile with -fPIC

If you are luck enough, for example with arm7tdmi cpu, no such relocation is
generated in startup files. The "shared" target support check will pass for
simple and naive code.
"--with-cpu=cortex-m3" should be this case. But the test cases which require
shared object support will fail.


So this "shared" target checking mechanism is not reliable. The patch is to
change this.


Shouldn't your patch imply that several tests move from "fail" to
"unsupported" on armv7-a ? I'm surprised not to see any difference in the
results.





Oops, I reordered the explanation paragraphs in my last email, making it 
obscure.

The "shared" target check will fail on armv7-a architecture because of the 
reason
mentioned below. So they are already been ignored. After the change, they are 
still
marked as "unsupported", but with a different reason. "crtbeginS.o cannot be 
found"

The deja-gnu test framework will compose a small program to test whether the 
toolchain
supports "shared" option.

>> With -march=armv7-a (and the --with-cpu=cortex-a9 option you mentioned), the
>> linking stage of the test will fail because of this error:
>> "relocation XXX against external symbol `YYY' can not be used when making a
>> shared object"
>> for instance: crtbegin.o: relocation R_ARM_MOVW_ABS_NC against `a local
>> symbol` can not be used when making a shared object; recompile with -fPIC
>>

So the check won't pass, and the test case is marked as "unsupported".

>> If you are luck enough, for example with arm7tdmi cpu, no such relocation is
>> generated in startup files. The "shared" target support check will pass for
>> simple and naive code.
>> "--with-cpu=cortex-m3" should be this case. But the test cases which require
>> shared object support will fail.

So for the same test case,
With "--with-cpu=cortex-m3",
The "shared" target support check will pass. It is marked as supported, but fail to 
produce binary.

with --with-cpu=cortex-a9",
The "shared" target support check will fail. it is marked as "unsupported" and 
skipped.

After the change, the test case will marked as "unsupported" regardless of the
cpu/arch configuration.

Regards,
Renlin


Re: [PATCH][ARM]Use different startfile and endfile for elf target when generating shared object.

2017-01-13 Thread Christophe Lyon
On 13 January 2017 at 13:26, Renlin Li  wrote:
> Hi Christophe,
>
>
> On 13/01/17 11:14, Christophe Lyon wrote:
>>
>> On 13 January 2017 at 11:22, Renlin Li  wrote:
>>>
>>> Hi Christophe,
>>>
>>> Thanks for testing the patch!
>>> I check the test case gcc.dg/lto/pr54709, it seems the test case is not
>>> properly written.
>>>
>>> It add extra ld option -shared without checking the target support for
>>> that.
>>> After the change, this compilation will fail as a regression.
>>> IIUC, '-shared' option is required, it should be gated with corresponding
>>> target selector.
>>>
>>> "g++.dg/ipa/devirt-28a.C" now is skipped because of the target selector
>>> there.
>>> // { dg-do link { target { { gld && fpic } && shared } } }
>>>
>>> perhaps "gcc.dg/lto/pr54709" should do similar things like this:
>>> // { dg-do link { target { shared } } }
>>
>> Quite likely, indeed.
>>
>>>
>>>
>>> As far as I know, with different cpu/arch configurations, different
>>> relocations are generated in the library, some of the relocations are not
>>> allowed to be used in shared
>>> object.
>>>
>>> With -march=armv7-a (and the --with-cpu=cortex-a9 option you mentioned),
>>> the
>>> linking stage of the test will fail because of this error:
>>> "relocation XXX against external symbol `YYY' can not be used when making
>>> a
>>> shared object"
>>> for instance: crtbegin.o: relocation R_ARM_MOVW_ABS_NC against `a local
>>> symbol` can not be used when making a shared object; recompile with -fPIC
>>>
>>> If you are luck enough, for example with arm7tdmi cpu, no such relocation
>>> is
>>> generated in startup files. The "shared" target support check will pass
>>> for
>>> simple and naive code.
>>> "--with-cpu=cortex-m3" should be this case. But the test cases which
>>> require
>>> shared object support will fail.
>>>
>>>
>>> So this "shared" target checking mechanism is not reliable. The patch is
>>> to
>>> change this.
>>>
>> Shouldn't your patch imply that several tests move from "fail" to
>> "unsupported" on armv7-a ? I'm surprised not to see any difference in the
>> results.
>>
>>>
>
> Oops, I reordered the explanation paragraphs in my last email, making it
> obscure.
>
> The "shared" target check will fail on armv7-a architecture because of the
> reason
> mentioned below. So they are already been ignored. After the change, they
> are still
> marked as "unsupported", but with a different reason. "crtbeginS.o cannot be
> found"
>
> The deja-gnu test framework will compose a small program to test whether the
> toolchain
> supports "shared" option.
>
>>> With -march=armv7-a (and the --with-cpu=cortex-a9 option you mentioned),
>>> the
>>> linking stage of the test will fail because of this error:
>>> "relocation XXX against external symbol `YYY' can not be used when making
>>> a
>>> shared object"
>>> for instance: crtbegin.o: relocation R_ARM_MOVW_ABS_NC against `a local
>>> symbol` can not be used when making a shared object; recompile with -fPIC
>>>
>
> So the check won't pass, and the test case is marked as "unsupported".
>
>>> If you are luck enough, for example with arm7tdmi cpu, no such relocation
>>> is
>>> generated in startup files. The "shared" target support check will pass
>>> for
>>> simple and naive code.
>>> "--with-cpu=cortex-m3" should be this case. But the test cases which
>>> require
>>> shared object support will fail.
>
> So for the same test case,
> With "--with-cpu=cortex-m3",
> The "shared" target support check will pass. It is marked as supported, but
> fail to produce binary.
> with --with-cpu=cortex-a9",
> The "shared" target support check will fail. it is marked as "unsupported"
> and skipped.
>
> After the change, the test case will marked as "unsupported" regardless of
> the
> cpu/arch configuration.
>
OK thanks for the clarification.

> Regards,
> Renlin


Re: [PATCH][GIMPLE FE] Add parsing of MEM_REFs

2017-01-13 Thread Richard Biener
On Fri, 13 Jan 2017, Richard Biener wrote:

> On Fri, 13 Jan 2017, Jakub Jelinek wrote:
> 
> > On Fri, Jan 13, 2017 at 09:15:22AM +0100, Richard Biener wrote:
> > > > @@ -1710,6 +1716,24 @@ dump_generic_node (pretty_printer *pp, t
> > > >   print_hex (val, pp_buffer (pp)->digit_buffer);
> > > >   pp_string (pp, pp_buffer (pp)->digit_buffer);
> > > > }
> > > > +  if ((flags & TDF_GIMPLE)
> > > > + && (POINTER_TYPE_P (TREE_TYPE (node))
> > > > + || (TYPE_PRECISION (TREE_TYPE (node))
> > > > + < TYPE_PRECISION (integer_type_node))
> > > > + || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
> > > > +   {
> > > > + if (TYPE_UNSIGNED (TREE_TYPE (node)))
> > > > +   pp_character (pp, 'u');
> > > > + if (TYPE_PRECISION (TREE_TYPE (node))
> > > > + == TYPE_PRECISION (unsigned_type_node))
> > > > +   ;
> > > > + else if (TYPE_PRECISION (TREE_TYPE (node))
> > > > +  == TYPE_PRECISION (long_unsigned_type_node))
> > > > +   pp_character (pp, 'l');
> > > > + else if (TYPE_PRECISION (TREE_TYPE (node))
> > > > +  == TYPE_PRECISION (long_long_unsigned_type_node))
> > > > +   pp_string (pp, "ll");
> > > > +   }
> > 
> > Not sure if I understand this.  The outer if condition says that only the
> > sub-int or strange precision or pointer types do that, but then
> > you compare the precisions against long and long long.  That will be
> > true only for pointers.  Don't you want the u/l/ll/ul/ull suffixes emitted
> > for integers even when they have normal precision and _Literal is not used?
> > Or is that handled somewhere else?
> 
> Oops, you are right - it shouldn't be the same condition that controls
> whether to add _Literal (type).  It should be the inverse.  Maybe
> we need to add suffixes anyway even for say 61bit precision constants
> to avoid warnings from libcpp.
> 
> I'll fix it up after testing a few cases.

Fixed as follows, bootstrapped / tested on x86_64-unknown-linux-gnu
and committed with the GIMPLE_NOP parsing from the other patch.

Richard.

2017-01-13  Richard Biener  

* tree-pretty-print.c (dump_generic_node): Fix inverted condition
for dumping GIMPLE INTEGER_CSTs.

Index: gcc/tree-pretty-print.c
===
--- gcc/tree-pretty-print.c (revision 244393)
+++ gcc/tree-pretty-print.c (working copy)
@@ -1717,10 +1717,10 @@ dump_generic_node (pretty_printer *pp, t
  pp_string (pp, pp_buffer (pp)->digit_buffer);
}
   if ((flags & TDF_GIMPLE)
- && (POINTER_TYPE_P (TREE_TYPE (node))
- || (TYPE_PRECISION (TREE_TYPE (node))
- < TYPE_PRECISION (integer_type_node))
- || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
+ && ! (POINTER_TYPE_P (TREE_TYPE (node))
+   || (TYPE_PRECISION (TREE_TYPE (node))
+   < TYPE_PRECISION (integer_type_node))
+   || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
{
  if (TYPE_UNSIGNED (TREE_TYPE (node)))
pp_character (pp, 'u');


[driver, doc] Support escaping special characters in specs

2017-01-13 Thread Rainer Orth
As a prerequisite for (a corner case in) fixing PR target/40411, we need
the ability to escape special characters in specs.  Case at hand: we
want to match -std=iso9899:199409 which doesn't work right now.  The
option isn't an alias but the canonical form (though in theory one could
introduce -std=c94 and make the above an alias for that), so we can only
match it directly.  The goal is to have this work:

#define STARTFILE_ARCH_SPEC \
  "%{ansi|std=c90|std=iso9899\\:199409:values-Xc.o%s; :values-Xa.o%s} \
   %{std=c90|std=gnu90:values-xpg4.o%s; :values-xpg6.o%s}"

Jeff submitted a patch for this in the PR almost 8 years ago, and I've
just updated it slightly so it applies to mainline, and copied the docs
snippet to invoke.texi with the necessary markup.

Bootstrapped (together with the current proposed patch to fix the PR
above) on i386-pc-solaris2.12 and sparc-sun-solaris2.12.

I'm unsure if the patch is large enough to need a copyright assignment
(in which case it's almost certainly too late for GCC 7), and even if
not if it's appropriate at this point in the release cycle.

How should we deal with this?

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-01-10  Jeff Downs  
Rainer Orth  

gcc:
* gcc.c (handle_braces): Support escaping in switch matching
text.
* doc/invoke.texi (Spec Files): Document it.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -26391,6 +26391,13 @@ be as many clauses as you need.  This ma
 
 @end table
 
+The switch matching text @code{S} in a %@{@code{S}@},
+%@{@code{S}:@code{X}@} or similar construct can use a backslash to
+ignore the special meaning of the character following it, thus allowing
+literal matching of a character that is otherwise specially treated.
+For example, %@{@code{std=iso9899\:1999}:@code{X}@} would substitute
+@code{X} if the @option{-std=iso9899:1999} option were given.
+
 The conditional text @code{X} in a %@{@code{S}:@code{X}@} or similar
 construct may contain other nested @samp{%} constructs or spaces, or
 even newlines.  They are processed as usual, as described above.
diff --git a/gcc/gcc.c b/gcc/gcc.c
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -583,6 +583,12 @@ or with constant text in a single argume
 
  %(Spec) processes a specification defined in a specs file as *Spec:
 
+The switch matching text S in a %{S}, %{S:X}, or similar construct can use a 
+backslash to ignore the special meaning of the character following it, thus
+allowing literal matching of a character that is otherwise specially treated.
+For example, %{std=iso9899\:1999:X} would substitute X if the
+-std=iso9899:1999 option were given.
+
 The conditional text X in a %{S:X} or similar construct may contain
 other nested % constructs or spaces, or even newlines.  They are
 processed as usual, as described above.  Trailing white space in X is
@@ -6228,6 +6234,8 @@ handle_braces (const char *p)
 {
   const char *atom, *end_atom;
   const char *d_atom = NULL, *d_end_atom = NULL;
+  char *esc_buf = NULL, *d_esc_buf = NULL;
+  int esc;
   const char *orig = p;
 
   bool a_is_suffix;
@@ -6278,11 +6286,41 @@ handle_braces (const char *p)
 	p++, a_is_spectype = true;
 
 	  atom = p;
+	  esc = 0;
 	  while (ISIDNUM (*p) || *p == '-' || *p == '+' || *p == '='
-		 || *p == ',' || *p == '.' || *p == '@')
-	p++;
+		 || *p == ',' || *p == '.' || *p == '@' || *p == '\\')
+	{
+	  if (*p == '\\')
+		{
+		  p++;
+		  if (!*p)
+		fatal_error (input_location,
+ "braced spec %qs ends in escape", orig);
+		  esc++;
+		}
+	  p++;
+	}
 	  end_atom = p;
 
+	  if (esc)
+	{
+	  const char *ap;
+	  char *ep;
+	  if (esc_buf && esc_buf != d_esc_buf)
+		free(esc_buf);
+	  esc_buf = NULL;
+	  ep = esc_buf = (char *)xmalloc (end_atom - atom - esc + 1);
+	  for (ap = atom; ap != end_atom; ap++, ep++)
+		{
+		  if (*ap == '\\')
+		ap++;
+		  *ep = *ap;
+		}
+	  *ep = '\0';
+	  atom = esc_buf;
+	  end_atom = ep;
+	}
+
 	  if (*p == '*')
 	p++, a_is_starred = 1;
 	}
@@ -6349,6 +6387,7 @@ handle_braces (const char *p)
 		  disj_matched = true;
 		  d_atom = atom;
 		  d_end_atom = end_atom;
+		  d_esc_buf = esc_buf;
 		}
 		}
 	}
@@ -6360,7 +6399,7 @@ handle_braces (const char *p)
 	  p = process_brace_body (p + 1, d_atom, d_end_atom, disj_starred,
   disj_matched && !n_way_matched);
 	  if (p == 0)
-		return 0;
+		goto done;
 
 	  /* If we have an N-way choice, reset state for the next
 		 disjunction.  */
@@ -6381,6 +6420,12 @@ handle_braces (const char *p)
 }
   while (*p++ != '}');
 
+ done:
+  if (d_esc_buf && d_esc_buf != esc_buf)
+free(d_esc_buf);
+  if (esc_buf)
+free(esc_buf);
+
   return p;
 
  invalid:


Re: [PATCH] Reload global options when strict aliasing is dropped (PR ipa/79043).

2017-01-13 Thread Martin Liška
On 01/13/2017 01:16 PM, Richard Biener wrote:
> On Tue, Jan 10, 2017 at 4:28 PM, Martin Liška  wrote:
>> As mentioned in the PR, we currently do not properly reload global
>> optimization options when we drop strict-aliasing flag on a function
>> that equals to cfun.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
> 
> Ok.

Thanks, applied to trunk. May I install the patch to active branches if it 
survives
bootstrap and regression tests?

Martin

> 
> Richard.
> 
>> Martin



Re: [PATCH] builtin expansion of strncmp for rs6000

2017-01-13 Thread Joseph Myers
On Fri, 13 Jan 2017, Tulio Magno Quites Machado Filho wrote:

> > Tulio will look at fixing it in glibc tomorrow.  Thanks for the report,
> 
> Does this code affects a specific rs6000 build? i.e. powerpc64le -mcpu=power8.
> Or is it generic?

build-many-glibcs.py does not use any special configure options in GCC or 
glibc to select a particular processor (which I think means a POWER8 
default in GCC; I don't think powerpc glibc yet determines which sysdeps 
directories to use based on compiler defaults, however, though we've 
discussed that this approach, as used by the ARM port, is to be preferred 
over --with-cpu).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Mark test as XFAIL for C++17 mode

2017-01-13 Thread Jonathan Wakely

This test fails to compile in C++17 mode because the explicit
instantiation using a custom allocator runs into the fact that the new
node extraction/reinsertion functions for C++17 don't support fancy
pointers (yet).

Adding dg-xfail-if avoids the FAIL in C++17 mode, and changing it from
a run test to compile test means we don't get an UNRESOVLED when the
compilation fails. Running this test isn't especially important, it's
really just checking that everything compiles using fancy pointers.

* testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Mark
XFAIL for C++17 until node reinsertion supports fancy pointers.

Tested x86_64-linux, committed to trunk.

commit 6527a0847bf24f9174fe0dec2160c3813cb64a14
Author: Jonathan Wakely 
Date:   Fri Jan 13 12:20:48 2017 +

Mark test as XFAIL for C++17 mode

* testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Mark
XFAIL for C++17 until node reinsertion supports fancy pointers.

diff --git 
a/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
index f59b808..ef38a72 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-do run { target c++11 } }
+// { dg-do compile { target c++11 } }
 
 #include 
 #include 
@@ -30,6 +30,8 @@ struct E : std::equal_to { };
 
 using __gnu_test::CustomPointerAlloc;
 
+// { dg-xfail-if "node reinsertion assumes raw pointers" { c++1z } }
+// TODO when removing this xfail change the test back to "dg-do run".
 template class std::unordered_set>;
 
 void test01()


Re: [PATCH] Reload global options when strict aliasing is dropped (PR ipa/79043).

2017-01-13 Thread Richard Biener
On Fri, Jan 13, 2017 at 2:00 PM, Martin Liška  wrote:
> On 01/13/2017 01:16 PM, Richard Biener wrote:
>> On Tue, Jan 10, 2017 at 4:28 PM, Martin Liška  wrote:
>>> As mentioned in the PR, we currently do not properly reload global
>>> optimization options when we drop strict-aliasing flag on a function
>>> that equals to cfun.
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>
>>> Ready to be installed?
>>
>> Ok.
>
> Thanks, applied to trunk. May I install the patch to active branches if it 
> survives
> bootstrap and regression tests?

Yes, but please wait a few days to see if there's any fallout with odd
target stuff
done by set_cfun.

Richard.

>
> Martin
>
>>
>> Richard.
>>
>>> Martin
>


Re: Unreviewed fixincludes patch

2017-01-13 Thread Rainer Orth
Hi Bruce,

> *I* would certainly argue that.  I do occasionally shut down the
> internet and go on vacation :).  Looks good to me, not being a Solaris

good habit: should get into it myself ;-)

> person anymore.

Thanks.

> BTW, tiny notational/formatting thing:  the '<<-' "here text" marker says to
> strip leading tabs on each line.  In other words, the following can be 
> indented
> with tabs to be more eyeball friendly (for future reference):
>
>
> +fix = {
> +hackname  = solaris_gets_cxx14;
> +mach  = "*-*-solaris2*";
> +files = "iso/stdio_iso.h";
> +select= <<- _EOSelect_
> +(#if __STDC_VERSION__ < 201112L)
> +(extern char \*gets\(char \*\) __ATTR_DEPRECATED;)
> +_EOSelect_;
> +c_fix = format;
> +c_fix_arg = "%1 && __cplusplus < 201402L\n%2";

Great, I'll try to remember that.

In the meantime, I've tested the fix on the gcc-5 and gcc-6 branches,
too: it didn't need more than regenerating fixincl.x and fixed the
failing testcase as expected.

I'll wait a week or two on mainline and apply it to the those branches,
too.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[patch,avr]: Increase branch costs after reload.

2017-01-13 Thread Georg-Johann Lay

This adds a penalty of 4 to the post-reload branch costs.

Purpose is reduce the number of out-of-line blocks like in

unsigned long variant5 (unsigned in)
{
unsigned long out = 0;
if (in & (1 << 0)) out |= 0xful << (4*0);
if (in & (1 << 1)) out |= 0xful << (4*1);
if (in & (1 << 2)) out |= 0xful << (4*2);
if (in & (1 << 3)) out |= 0xful << (4*3);
return out;
}

without the patch, code is


variant5:
mov r18,r24  ;  67  movqi_insn/1[length = 1]
sbrs r24,0   ;  10  *sbrx_branchhi  [length = 2]
rjmp .L6
ldi r22,lo8(15)  ;  5   *movsi/5[length = 4]
ldi r23,0
ldi r24,0
ldi r25,0
.L2:

.L6:
ldi r22,0;  4   *movsi/2[length = 3]
ldi r23,0
movw r24,r22
rjmp .L2 ;  74  jump[length = 1]


and with the patch it reads:

variant5:
mov r18,r24  ;  67  movqi_insn/1[length = 1]
ldi r22,lo8(15)  ;  5   *movsi/5[length = 4]
ldi r23,0
ldi r24,0
ldi r25,0
sbrc r18,0   ;  10  *sbrx_branchhi  [length = 2]
rjmp .L2
ldi r22,0;  4   *movsi/2[length = 3]
ldi r23,0
movw r24,r22
.L2:



Using fall-through safes insn 74.

Main blocker for not increasing default branch costs in general
is do_store_flag which is a heap of assertions not using
rtx_costs and which gives the best results with the old
default of 0, which is not changed.

Tested without regressions.

Ok for trunk?

Johann

* config/avr/avr.h (BRANCH_COST) [reload_completed]: Increase by 4.
Index: config/avr/avr.h
===
--- config/avr/avr.h	(revision 244001)
+++ config/avr/avr.h	(working copy)
@@ -360,7 +360,12 @@ typedef struct avr_args
   } \
   } while (0)
 
-#define BRANCH_COST(speed_p, predictable_p) avr_branch_cost
+/* We increase branch costs after reload in order to keep basic-block
+   reordering from introducing out-of-line jumps and to prefer fall-through
+   edges instead.  The default branch costs are 0, mainly because otherwise
+   do_store_flag might come up with bloated code.  */
+#define BRANCH_COST(speed_p, predictable_p) \
+  (avr_branch_cost + (reload_completed ? 4 : 0))
 
 #define SLOW_BYTE_ACCESS 0
 


Re: [C++ PATCH] c++/61636 generic lambdas and this capture

2017-01-13 Thread Nathan Sidwell

On 12/21/2016 03:27 PM, Nathan Sidwell wrote:

This patch addresses bug 61636, which is an ICE during generic lambda
instantiation due to unexpected this capture.


Here's an updated patch.
Firstly I added the worker function to lambda.c, with a more descriptive 
name.  Secondly, this implements a check that the overload set contains 
at least one non-static member fn.  That's what Clang does, and 
discussion on the std list seems to be heading that way as the right 
approach.


ok?

nathan

--
Nathan Sidwell
2016-12-21  Nathan Sidwell  

	PR c++/61636
	* cp-tree.h (maybe_generic_this_capture): Declare.
	* lambda.c (resolvable_dummy): New, broken out of ...
	(maybe_resolve_dummy): ... here.  Call it.
	(maybe_generic_this_capture): New.
	* parser.c (cp_parser_postfix_expression): Speculatively capture
	this in generic lambda in unresolved member function call.
	* pt.c (tsubst_copy_and_build): Force hard error from failed
	member function lookup in generic lambda.

	PR c++/61636
	* g++.dg/cpp1y/pr61636-1.C: New.
	* g++.dg/cpp1y/pr61636-2.C: New.
	* g++.dg/cpp1y/pr61636-3.C: New.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 244382)
+++ cp/cp-tree.h	(working copy)
@@ -6551,6 +6551,7 @@ extern bool is_capture_proxy			(tree);
 extern bool is_normal_capture_proxy (tree);
 extern void register_capture_members		(tree);
 extern tree lambda_expr_this_capture(tree, bool);
+extern void maybe_generic_this_capture		(tree, tree);
 extern tree maybe_resolve_dummy			(tree, bool);
 extern tree current_nonlambda_function		(void);
 extern tree nonlambda_method_basetype		(void);
Index: cp/lambda.c
===
--- cp/lambda.c	(revision 244382)
+++ cp/lambda.c	(working copy)
@@ -793,16 +793,14 @@ lambda_expr_this_capture (tree lambda, b
   return result;
 }
 
-/* We don't want to capture 'this' until we know we need it, i.e. after
-   overload resolution has chosen a non-static member function.  At that
-   point we call this function to turn a dummy object into a use of the
-   'this' capture.  */
+/* Return the current LAMBDA_EXPR, if this is a resolvable dummy
+   object.  NULL otherwise..  */
 
-tree
-maybe_resolve_dummy (tree object, bool add_capture_p)
+static tree
+resolvable_dummy (tree object)
 {
   if (!is_dummy_object (object))
-return object;
+return NULL_TREE;
 
   tree type = TYPE_MAIN_VARIANT (TREE_TYPE (object));
   gcc_assert (!TYPE_PTR_P (type));
@@ -812,18 +810,55 @@ maybe_resolve_dummy (tree object, bool a
   && LAMBDA_TYPE_P (current_class_type)
   && lambda_function (current_class_type)
   && DERIVED_FROM_P (type, current_nonlambda_class_type ()))
-{
-  /* In a lambda, need to go through 'this' capture.  */
-  tree lam = CLASSTYPE_LAMBDA_EXPR (current_class_type);
-  tree cap = lambda_expr_this_capture (lam, add_capture_p);
-  if (cap && cap != error_mark_node)
+return CLASSTYPE_LAMBDA_EXPR (current_class_type);
+
+  return NULL_TREE;
+}
+
+/* We don't want to capture 'this' until we know we need it, i.e. after
+   overload resolution has chosen a non-static member function.  At that
+   point we call this function to turn a dummy object into a use of the
+   'this' capture.  */
+
+tree
+maybe_resolve_dummy (tree object, bool add_capture_p)
+{
+  if (tree lam = resolvable_dummy (object))
+if (tree cap = lambda_expr_this_capture (lam, add_capture_p))
+  if (cap != error_mark_node)
 	object = build_x_indirect_ref (EXPR_LOCATION (object), cap,
    RO_NULL, tf_warning_or_error);
-}
 
   return object;
 }
 
+/* When parsing a generic lambda containing an argument-dependent
+   member function call we defer overload resolution to instantiation
+   time.  But we have to know now whether to capture this or not.
+   Do that if FNS contains any non-static fns.
+   The std doesn't anticipate this case, but I expect this to be the
+   outcome of discussion.  */
+
+void
+maybe_generic_this_capture (tree object, tree fns)
+{
+  if (tree lam = resolvable_dummy (object))
+if (!LAMBDA_EXPR_THIS_CAPTURE (lam))
+  {
+	/* We've not yet captured, so look at the function set of
+	   interest.  */
+	if (BASELINK_P (fns))
+	  fns = BASELINK_FUNCTIONS (fns);
+	for (; fns; fns = OVL_NEXT (fns))
+	  if (DECL_NONSTATIC_MEMBER_FUNCTION_P (OVL_CURRENT (fns)))
+	{
+	  /* Found a non-static member.  Capture this.  */
+	  lambda_expr_this_capture (lam, true);
+	  break;
+	}
+  }
+}
+
 /* Returns the innermost non-lambda function.  */
 
 tree
Index: cp/parser.c
===
--- cp/parser.c	(revision 244382)
+++ cp/parser.c	(working copy)
@@ -6971,6 +6971,7 @@ cp_parser_postfix_expression (cp_parser
 			|| type_dependent_expression_p (fn)
 			|| any_type_dependent_arguments_p (args)))
 		  {
+		maybe_generic_this_capture (instance, fn);
 		 

Re: [PATCH, ARM] correctly encode the CC reg data flow

2017-01-13 Thread Richard Earnshaw (lists)
On 18/12/16 12:58, Bernd Edlinger wrote:
> Hi,
> 
> this is related to PR77308, the follow-up patch will depend on this one.
> 
> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned
> before reload, a mis-compilation in libgcc function __gnu_satfractdasq
> was discovered, see [1] for more details.
> 
> The reason seems to be that when the *arm_cmpdi_insn is directly
> followed by a *arm_cmpdi_unsigned instruction, both are split
> up into this:
> 
>[(set (reg:CC CC_REGNUM)
>  (compare:CC (match_dup 0) (match_dup 1)))
> (parallel [(set (reg:CC CC_REGNUM)
> (compare:CC (match_dup 3) (match_dup 4)))
>(set (match_dup 2)
> (minus:SI (match_dup 5)
>  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 
> 0])]
> 
>[(set (reg:CC CC_REGNUM)
>  (compare:CC (match_dup 2) (match_dup 3)))
> (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0))
>(set (reg:CC CC_REGNUM)
> (compare:CC (match_dup 0) (match_dup 1]
> 
> The problem is that the reg:CC from the *subsi3_carryin_compare
> is not mentioning that the reg:CC is also dependent on the reg:CC
> from before.  Therefore the *arm_cmpsi_insn appears to be
> redundant and thus got removed, because the data values are identical.
> 
> I think that applies to a number of similar pattern where data
> flow is happening through the CC reg.
> 
> So this is a kind of correctness issue, and should be fixed
> independently from the optimization issue PR77308.
> 
> Therefore I think the patterns need to specify the true
> value that will be in the CC reg, in order for cse to
> know what the instructions are really doing.
> 
> 
> Bootstrapped and reg-tested on arm-linux-gnueabihf.
> Is it OK for trunk?
> 

I agree you've found a valid problem here, but I have some issues with
the patch itself.


(define_insn_and_split "subdi3_compare1"
  [(set (reg:CC_NCV CC_REGNUM)
(compare:CC_NCV
  (match_operand:DI 1 "register_operand" "r")
  (match_operand:DI 2 "register_operand" "r")))
   (set (match_operand:DI 0 "register_operand" "=&r")
(minus:DI (match_dup 1) (match_dup 2)))]
  "TARGET_32BIT"
  "#"
  "&& reload_completed"
  [(parallel [(set (reg:CC CC_REGNUM)
   (compare:CC (match_dup 1) (match_dup 2)))
  (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
   (parallel [(set (reg:CC_C CC_REGNUM)
   (compare:CC_C
 (zero_extend:DI (match_dup 4))
 (plus:DI (zero_extend:DI (match_dup 5))
  (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
  (set (match_dup 3)
   (minus:SI (minus:SI (match_dup 4) (match_dup 5))
 (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0])]


This pattern is now no-longer self consistent in that before the split
the overall result for the condition register is in mode CC_NCV, but
afterwards it is just CC_C.

I think CC_NCV is correct mode (the N, C and V bits all correctly
reflect the result of the 64-bit comparison), but that then implies that
the cc mode of subsi3_carryin_compare is incorrect as well and should in
fact also be CC_NCV.  Thinking about this pattern, I'm inclined to agree
that CC_NCV is the correct mode for this operation

I'm not sure if there are other consequences that will fall out from
fixing this (it's possible that we might need a change to select_cc_mode
as well).

R.

> 
> Thanks
> Bernd.
> 
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00680.html
> 
> 
> patch-pr77308-5.diff
> 
> 
> 2016-12-10  Bernd Edlinger  
> 
>   PR target/77308
>   * config/arm/arm.md (adddi3_compareV, *addsi3_compareV_upper,
>   adddi3_compareC, *addsi3_compareC_upper, subdi3_compare1,
>   subsi3_carryin_compare, subsi3_carryin_compare_const,
>   negdi2_compare, *negsi2_carryin_compare,
>   *arm_cmpdi_insn): Fix the CC reg dataflow.
> 
> Index: gcc/config/arm/arm.md
> ===
> --- gcc/config/arm/arm.md (revision 243515)
> +++ gcc/config/arm/arm.md (working copy)
> @@ -669,17 +669,15 @@
> (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))])
> (parallel [(set (reg:CC_V CC_REGNUM)
>  (ne:CC_V
> - (plus:DI (plus:DI
> -   (sign_extend:DI (match_dup 4))
> -   (sign_extend:DI (match_dup 5)))
> -  (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
> - (plus:DI (sign_extend:DI
> -   (plus:SI (match_dup 4) (match_dup 5)))
> -  (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
> -  (set (match_dup 3) (plus:SI (plus:SI
> -   (match_dup 4) (match_dup 5))
> -  (ltu:SI (reg:CC_C

Re: [Ping~]Re: [5/5][AArch64, libgcc] Runtime support for AArch64 return address signing (also attached target macros version)

2017-01-13 Thread James Greenhalgh
On Thu, Jan 12, 2017 at 06:10:36PM +, Jiong Wang wrote:
> On 06/01/17 11:47, Jiong Wang wrote:
> >This is the update on libgcc unwinder support according to new
> >DWARF proposal.
> >
> >As Joseph commented, duplication of unwind-dw2.c is not encouraged
> >in libgcc,
> >But from this patch, you can see there are a few places we need to
> >modify for
> >AArch64 in unwind-aarch64.c, so the file duplication approach is
> >acceptable?
> >
> >
> >libgcc/
> >
> >2017-01-06  Jiong Wang  
> >
> >* config/aarch64/unwind-aarch64.c (DWARF_REGNUM_AARCH64_RA_STATE,
> >RA_A_SIGNED_BIT): New macros.
> >(execute_cfa_program): Multiplex DW_CFA_GNU_window_save on
> >AArch64.
> >(uw_frame_state_for): Clear bit[0] of
> >DWARF_REGNUM_AARCH64_RA_STATE.
> >(uw_update_context): Authenticate return address according to
> >DWARF_REGNUM_AARCH64_RA_STATE.
> >(uw_init_context_1): Strip signature of seed address.
> >(uw_install_context): Re-authenticate EH handler's address.
> >
> Ping~
> 
> For comparision, I have also attached the patch using the target macros.

Personally, I much prefer this approach.

I haven't looked at your patch in detail, I'll leave that for after a
libgcc maintainer has commented on whether these macros would be
acceptable.

Thanks,
James




[v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Ville Voutilainen
Update patch with splices for __carry added. Hopefully this resolves
the remaining concerns that we had.
diff --git a/libstdc++-v3/include/bits/list.tcc 
b/libstdc++-v3/include/bits/list.tcc
index c4f397f..9ba403c 100644
--- a/libstdc++-v3/include/bits/list.tcc
+++ b/libstdc++-v3/include/bits/list.tcc
@@ -380,26 +380,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   // 300. list::merge() specification incomplete
   if (this != std::__addressof(__x))
{
- _M_check_equal_allocators(__x); 
+ _M_check_equal_allocators(__x);
 
  iterator __first1 = begin();
  iterator __last1 = end();
  iterator __first2 = __x.begin();
  iterator __last2 = __x.end();
- while (__first1 != __last1 && __first2 != __last2)
-   if (*__first2 < *__first1)
- {
-   iterator __next = __first2;
-   _M_transfer(__first1, __first2, ++__next);
-   __first2 = __next;
- }
-   else
- ++__first1;
- if (__first2 != __last2)
-   _M_transfer(__last1, __first2, __last2);
+ size_t __orig_size = __x.size();
+ __try {
+   while (__first1 != __last1 && __first2 != __last2)
+ if (*__first2 < *__first1)
+   {
+ iterator __next = __first2;
+ _M_transfer(__first1, __first2, ++__next);
+ __first2 = __next;
+   }
+ else
+   ++__first1;
+   if (__first2 != __last2)
+ _M_transfer(__last1, __first2, __last2);
 
- this->_M_inc_size(__x._M_get_size());
- __x._M_set_size(0);
+   this->_M_inc_size(__x._M_get_size());
+   __x._M_set_size(0);
+ }
+ __catch(...)
+   {
+ size_t __dist = distance(__first2, __last2);
+ this->_M_inc_size(__dist);
+ __x._M_set_size(__orig_size - __dist);
+ __throw_exception_again;
+   }
}
 }
 
@@ -423,20 +433,31 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
iterator __last1 = end();
iterator __first2 = __x.begin();
iterator __last2 = __x.end();
-   while (__first1 != __last1 && __first2 != __last2)
- if (__comp(*__first2, *__first1))
-   {
- iterator __next = __first2;
- _M_transfer(__first1, __first2, ++__next);
- __first2 = __next;
-   }
- else
-   ++__first1;
-   if (__first2 != __last2)
- _M_transfer(__last1, __first2, __last2);
-
-   this->_M_inc_size(__x._M_get_size());
-   __x._M_set_size(0);
+   size_t __orig_size = __x.size();
+   __try
+ {
+   while (__first1 != __last1 && __first2 != __last2)
+ if (__comp(*__first2, *__first1))
+   {
+ iterator __next = __first2;
+ _M_transfer(__first1, __first2, ++__next);
+ __first2 = __next;
+   }
+ else
+   ++__first1;
+   if (__first2 != __last2)
+ _M_transfer(__last1, __first2, __last2);
+
+   this->_M_inc_size(__x._M_get_size());
+   __x._M_set_size(0);
+ }
+   __catch(...)
+ {
+   size_t __dist = distance(__first2, __last2);
+   this->_M_inc_size(__dist);
+   __x._M_set_size(__orig_size - __dist);
+   __throw_exception_again;
+ }
  }
   }
 
@@ -453,27 +474,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 list __tmp[64];
 list * __fill = __tmp;
 list * __counter;
-
-do
+   __try
  {
-   __carry.splice(__carry.begin(), *this, begin());
-
-   for(__counter = __tmp;
-   __counter != __fill && !__counter->empty();
-   ++__counter)
+   do
  {
-   __counter->merge(__carry);
+   __carry.splice(__carry.begin(), *this, begin());
+
+   for(__counter = __tmp;
+   __counter != __fill && !__counter->empty();
+   ++__counter)
+ {
+   __counter->merge(__carry);
+   __carry.swap(*__counter);
+ }
__carry.swap(*__counter);
+   if (__counter == __fill)
+ ++__fill;
  }
-   __carry.swap(*__counter);
-   if (__counter == __fill)
- ++__fill;
- }
-   while ( !empty() );
+   while ( !empty() );
 
-for (__counter = __tmp + 1; __counter != __fill; ++__counter)
-  __counter->merge(*(__counter - 1));
-swap( *(__fill - 1) );
+   for (__counter = __tmp + 1; __counter != __fill; ++__counter)
+ __counter->mer

Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Jonathan Wakely

On 13/01/17 16:26 +0200, Ville Voutilainen wrote:

Update patch with splices for __carry added. Hopefully this resolves
the remaining concerns that we had.


OK for trunk after fixing the ADL issue noted below.

There are also two stylistic comments ...


diff --git a/libstdc++-v3/include/bits/list.tcc 
b/libstdc++-v3/include/bits/list.tcc
index c4f397f..9ba403c 100644
--- a/libstdc++-v3/include/bits/list.tcc
+++ b/libstdc++-v3/include/bits/list.tcc
@@ -380,26 +380,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  // 300. list::merge() specification incomplete
  if (this != std::__addressof(__x))
{
- _M_check_equal_allocators(__x);
+ _M_check_equal_allocators(__x);

  iterator __first1 = begin();
  iterator __last1 = end();
  iterator __first2 = __x.begin();
  iterator __last2 = __x.end();
- while (__first1 != __last1 && __first2 != __last2)
-   if (*__first2 < *__first1)
- {
-   iterator __next = __first2;
-   _M_transfer(__first1, __first2, ++__next);
-   __first2 = __next;
- }
-   else
- ++__first1;
- if (__first2 != __last2)
-   _M_transfer(__last1, __first2, __last2);
+ size_t __orig_size = __x.size();


This could be const, just to ensure we don't accidentally modify it.


+ __try {
+   while (__first1 != __last1 && __first2 != __last2)
+ if (*__first2 < *__first1)
+   {
+ iterator __next = __first2;
+ _M_transfer(__first1, __first2, ++__next);
+ __first2 = __next;
+   }
+ else
+   ++__first1;
+   if (__first2 != __last2)
+ _M_transfer(__last1, __first2, __last2);

- this->_M_inc_size(__x._M_get_size());
- __x._M_set_size(0);
+   this->_M_inc_size(__x._M_get_size());
+   __x._M_set_size(0);
+ }
+ __catch(...)
+   {
+ size_t __dist = distance(__first2, __last2);


This must be std::distance, so we don't find an overload in a
namespace associated with our value_type or allocator_type.

Same comments for the second merge function as well.



diff --git a/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc 
b/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc
new file mode 100644
index 000..fded09d
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/list/operations/78389.cc
@@ -0,0 +1,85 @@
+// { dg-do run { target c++11 } }
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// 23.2.2.4 list operations [lib.list.ops]
+
+#include 
+
+#include 
+
+struct ThrowingComparator
+{
+unsigned int throw_after = 0;
+unsigned int count = 0;
+bool operator()(int, int) {
+if (++count >= throw_after) {
+throw 666;
+}
+return true;
+}
+};
+
+struct X
+{
+  X() = default;
+  X(int) {}
+};


The indentation in this test keeps changing from 2 spaces to 4 spaces
:-)


[PATCH 2/2] [msp430] Remove mpy.o from libgcc

2017-01-13 Thread Joe Seymour
This patch fixes "multiple definition of `__mspabi_mpyi'" errors
encountered with -mhwmult=f5series, by moving mpy.o out of libgcc and
into libmul_none.a. The other libmul_*.a archives already provide a
definition of _mspabi_mpyi.

mpy.c actually contains __mulhi3, but the msp430 backend renames that to
_mspabi_mpyi.

Built and tested (no regressions) as follows:
  Configured with: --target=msp430-elf --enable-languages=c
  Test variations:
msp430-sim/-mcpu=msp430
msp430-sim/-mcpu=msp430x
msp430-sim/-mcpu=msp430x/-mlarge/-mdata-region=either/-mcode-region=either
msp430-sim/-mhwmult=none
msp430-sim/-mhwmult=f5series

I've also compared the output from msp430-elf-nm for
lib/gcc/msp430-elf/7.0.0/lib*.a on before/after builds, confirming that
moving _msp430abi_mpyi from libgcc to libmul_none.a is the only change.

Test results actually show several progressions:
  # of expected passes  +721
  # of unexpected failures  -396
  # of unresolved testcases -26

If this patch is acceptable, I would appreciate it if someone would
commit it on my behalf.

Thanks,

2017-01-XX  Joe Seymour  

libgcc/
* config/msp430/t-msp430 (LIB2ADD): Remove mpy.c
(mpy.o): New rule.
(libmul_none.a): Add mpy.o

gcc/testsuite/
* gcc.target/msp430/mul_f5_muldef.c: New test.
---
 gcc/testsuite/gcc.target/msp430/mul_f5_muldef.c |   15 +++
 libgcc/config/msp430/t-msp430   |6 --
 2 files changed, 19 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/mul_f5_muldef.c

diff --git a/gcc/testsuite/gcc.target/msp430/mul_f5_muldef.c 
b/gcc/testsuite/gcc.target/msp430/mul_f5_muldef.c
new file mode 100644
index 000..da1b1bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/mul_f5_muldef.c
@@ -0,0 +1,15 @@
+/* { dg-do link } */
+/* { dg-options "-mhwmult=f5series" } */
+
+/* This program used to result in a multiple definition error:
+
+libmul_f5.a(lib2hw_mul_f5.o): In function `__mulhi2_f5':
+(.text.__mulhi2_f5+0x0): multiple definition of `__mspabi_mpyi'
+libgcc.a(mpy.o):mpy.c:(.text.__mulhi3+0x0): first defined here */
+
+#include 
+
+int main (void)
+{
+  printf ("%d", 430);
+}
diff --git a/libgcc/config/msp430/t-msp430 b/libgcc/config/msp430/t-msp430
index 107eb3d..668b943 100644
--- a/libgcc/config/msp430/t-msp430
+++ b/libgcc/config/msp430/t-msp430
@@ -30,7 +30,6 @@ LIB2ADD = \
$(srcdir)/config/msp430/lib2mul.c \
$(srcdir)/config/msp430/lib2shift.c \
$(srcdir)/config/msp430/epilogue.S \
-   $(srcdir)/config/msp430/mpy.c \
$(srcdir)/config/msp430/slli.S \
$(srcdir)/config/msp430/srai.S \
$(srcdir)/config/msp430/srli.S \
@@ -43,6 +42,9 @@ LIB2ADD = \
 
 HOST_LIBGCC2_CFLAGS += -Os -ffunction-sections -fdata-sections -mhwmult=none
 
+mpy.o: $(srcdir)/config/msp430/mpy.c
+   $(gcc_compile) $< -c
+
 lib2_mul_none.o: $(srcdir)/config/msp430/lib2mul.c
$(gcc_compile) $< -c -DMUL_NONE
 
@@ -58,7 +60,7 @@ lib2hw_mul_32.o: $(srcdir)/config/msp430/lib2hw_mul.S
 lib2hw_mul_f5.o: $(srcdir)/config/msp430/lib2hw_mul.S
$(gcc_compile) $< -c -DMUL_F5
 
-libmul_none.a: lib2_mul_none.o
+libmul_none.a: lib2_mul_none.o mpy.o
$(AR_CREATE_FOR_TARGET) $@ $(filter %.o,$^)
 
 libmul_16.a: lib2hw_mul_16.o lib2_mul_16bit.o
-- 
1.7.1


Re: [C++ Patch] PR 71737

2017-01-13 Thread Nathan Sidwell

On 01/13/2017 09:45 AM, Paolo Carlini wrote:

Hi,

in this error recovery issue get_underlying_template crashes when
TYPE_TEMPLATE_INFO_MAYBE_ALIAS is applied to a null orig_type. Simply
checking for that condition appears to solve the issue in a
straightforward way. Tested x86_64-linux.


Wouldn't it be better if a scrogged alias got error_mark_node as the 
underlying type?  (I have no idea whether that's an easy thing to 
accomplish)


nathan
--
Nathan Sidwell


Re: [v3 PATCH] PR libstdc++/78389

2017-01-13 Thread Jonathan Wakely

On 13/01/17 14:41 +, Jonathan Wakely wrote:

On 13/01/17 16:26 +0200, Ville Voutilainen wrote:

Update patch with splices for __carry added. Hopefully this resolves
the remaining concerns that we had.


OK for trunk after fixing the ADL issue noted below.


As discussed on IRC, the list::merge() part is a regression compared
to gcc4, because we don't store the size in the old list. That part
should be backported.




[C++ Patch] PR 71737

2017-01-13 Thread Paolo Carlini

Hi,

in this error recovery issue get_underlying_template crashes when 
TYPE_TEMPLATE_INFO_MAYBE_ALIAS is applied to a null orig_type. Simply 
checking for that condition appears to solve the issue in a 
straightforward way. Tested x86_64-linux.


Thanks, Paolo.

/

/cp
2017-01-13  Paolo Carlini  

PR c++/71737
* pt.c (get_underlying_template): Don't crash if orig_type
is NULL_TREE.

/testsuite
2017-01-13  Paolo Carlini  

PR c++/71737
* g++.dg/cpp0x/pr71737.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 244405)
+++ cp/pt.c (working copy)
@@ -5916,6 +5916,8 @@ get_underlying_template (tree tmpl)
 {
   /* Determine if the alias is equivalent to an underlying template.  */
   tree orig_type = DECL_ORIGINAL_TYPE (DECL_TEMPLATE_RESULT (tmpl));
+  if (!orig_type)
+   break;
   tree tinfo = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (orig_type);
   if (!tinfo)
break;
Index: testsuite/g++.dg/cpp0x/pr71737.C
===
--- testsuite/g++.dg/cpp0x/pr71737.C(revision 0)
+++ testsuite/g++.dg/cpp0x/pr71737.C(working copy)
@@ -0,0 +1,13 @@
+// PR c++/78765
+// { dg-do compile { target c++11 } }
+
+template  class TT>
+struct quote {
+  template 
+  using apply = TT;  // { dg-error "pack expansion" }
+};
+
+template 
+using to_int_t = int;
+
+using t = quote::apply>::apply;


[PATCH 1/2] [msp430] Remove source files from libmul archives

2017-01-13 Thread Joe Seymour
The msp430 libmul archives currently contain several source files. If
you run msp430-elf-nm on these archives it will produce output on
stderr:

> msp430-elf-nm: enable-execute-stack.c: File format not recognized
...
> msp430-elf-nm: libgcc_tm.h: File format not recognized

It looks like the source files are added as dependencies of libmul by
libgcc/Makefile:

> $(EXTRA_PARTS): $(LIBGCC_LINKS) libgcc_tm.h

This patch adds filters to t-msp430, to ensure that only object files
are added to the archives.

Built and tested (no regressions) as follows:
  Configured with: --target=msp430-elf --enable-languages=c
  Test variations:
msp430-sim/-mcpu=msp430
msp430-sim/-mcpu=msp430x
msp430-sim/-mcpu=msp430x/-mlarge/-mdata-region=either/-mcode-region=either
msp430-sim/-mhwmult=none
msp430-sim/-mhwmult=f5series

If this patch is acceptable, I would appreciate it if someone would
commit it on my behalf.

Thanks,

2017-01-XX  Joe Seymour  

libgcc/
* config/msp430/t-msp430 (libmul_none.a, libmul_16.a, libmul_32.a)
(libmul_f5.a): Filter archived prerequisites.
---
 libgcc/config/msp430/t-msp430 |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/libgcc/config/msp430/t-msp430 b/libgcc/config/msp430/t-msp430
index 2428f54..107eb3d 100644
--- a/libgcc/config/msp430/t-msp430
+++ b/libgcc/config/msp430/t-msp430
@@ -59,16 +59,16 @@ lib2hw_mul_f5.o: $(srcdir)/config/msp430/lib2hw_mul.S
$(gcc_compile) $< -c -DMUL_F5
 
 libmul_none.a: lib2_mul_none.o
-   $(AR_CREATE_FOR_TARGET) $@ $^
+   $(AR_CREATE_FOR_TARGET) $@ $(filter %.o,$^)
 
 libmul_16.a: lib2hw_mul_16.o lib2_mul_16bit.o
-   $(AR_CREATE_FOR_TARGET) $@ $^
+   $(AR_CREATE_FOR_TARGET) $@ $(filter %.o,$^)
 
 libmul_32.a: lib2hw_mul_32.o
-   $(AR_CREATE_FOR_TARGET) $@ $^
+   $(AR_CREATE_FOR_TARGET) $@ $(filter %.o,$^)
 
 libmul_f5.a: lib2hw_mul_f5.o
-   $(AR_CREATE_FOR_TARGET) $@ $^
+   $(AR_CREATE_FOR_TARGET) $@ $(filter %.o,$^)
 
 # Local Variables:
 # mode: Makefile
-- 
1.7.1


[PATCH] Fix wrong assumption in contains_type_p (PR ipa/71207).

2017-01-13 Thread Martin Liška
Hello.

As mentioned in the PR, having a diamond virtual inheritance can cause a wrong
assumption done in contains_type_p. I also decided to rename one argument of the
function as otr_type and outer_type names are very confusing. Apart from what 
was
written in bugzilla I also verified that after the hunk removal, there's no 
situation
on Firefox where the function would return true, which used to return false.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From 47e06bcf33719df390b9c49328235d9cbb48e4a7 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 12 Jan 2017 15:55:42 +0100
Subject: [PATCH] Fix wrong assumption in contains_type_p (PR ipa/71207).

gcc/testsuite/ChangeLog:

2017-01-12  Martin Liska  

	PR ipa/71207
	* g++.dg/ipa/pr71207.C: New test.

gcc/ChangeLog:

2017-01-12  Martin Liska  

	PR ipa/71207
	* ipa-polymorphic-call.c (contains_type_p): Fix wrong
	assumption, add comment and renamed otr_type to inner_type.
---
 gcc/ipa-polymorphic-call.c | 22 ++--
 gcc/testsuite/g++.dg/ipa/pr71207.C | 42 ++
 2 files changed, 53 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr71207.C

diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
index da64ce4c6e0..c13fc858c86 100644
--- a/gcc/ipa-polymorphic-call.c
+++ b/gcc/ipa-polymorphic-call.c
@@ -446,15 +446,15 @@ no_useful_type_info:
 }
 }
 
-/* Return true if OUTER_TYPE contains OTR_TYPE at OFFSET.
-   CONSIDER_PLACEMENT_NEW makes function to accept cases where OTR_TYPE can
+/* Return true if OUTER_TYPE contains INNER_TYPE at OFFSET.
+   CONSIDER_PLACEMENT_NEW makes function to accept cases where INNER_TYPE can
be built within OUTER_TYPE by means of placement new.  CONSIDER_BASES makes
-   function to accept cases where OTR_TYPE appears as base of OUTER_TYPE or as
+   function to accept cases where INNER_TYPE appears as base of OUTER_TYPE or as
base of one of fields of OUTER_TYPE.  */
 
 static bool
 contains_type_p (tree outer_type, HOST_WIDE_INT offset,
-		 tree otr_type,
+		 tree inner_type,
 		 bool consider_placement_new,
 		 bool consider_bases)
 {
@@ -463,18 +463,18 @@ contains_type_p (tree outer_type, HOST_WIDE_INT offset,
   /* Check that type is within range.  */
   if (offset < 0)
 return false;
-  if (TYPE_SIZE (outer_type) && TYPE_SIZE (otr_type)
-  && TREE_CODE (TYPE_SIZE (outer_type)) == INTEGER_CST
-  && TREE_CODE (TYPE_SIZE (otr_type)) == INTEGER_CST
-  && wi::ltu_p (wi::to_offset (TYPE_SIZE (outer_type)),
-		(wi::to_offset (TYPE_SIZE (otr_type)) + offset)))
-return false;
+
+  /* PR ipa/71207
+ As OUTER_TYPE can be a type which has a diamond virtual inheritance,
+ it's not necessary that INNER_TYPE will fit within OUTER_TYPE with
+ a given offset.  It can happen that INNER_TYPE also contains a base object,
+ however it would point to the same instance in the OUTER_TYPE.  */
 
   context.offset = offset;
   context.outer_type = TYPE_MAIN_VARIANT (outer_type);
   context.maybe_derived_type = false;
   context.dynamic = false;
-  return context.restrict_to_inner_class (otr_type, consider_placement_new,
+  return context.restrict_to_inner_class (inner_type, consider_placement_new,
 	  consider_bases);
 }
 
diff --git a/gcc/testsuite/g++.dg/ipa/pr71207.C b/gcc/testsuite/g++.dg/ipa/pr71207.C
new file mode 100644
index 000..19a03998460
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr71207.C
@@ -0,0 +1,42 @@
+/* PR ipa/71207 */
+/* { dg-do run } */
+
+class Class1
+{
+public:
+  Class1() {};
+  virtual ~Class1() {};
+
+protected:
+  unsigned Field1;
+};
+
+class Class2 : public virtual Class1
+{
+};
+
+class Class3 : public virtual Class1
+{
+public:
+  virtual void Method1() = 0;
+
+  void Method2()
+  {
+Method1();
+  }
+};
+
+class Class4 : public Class2, public virtual Class3
+{
+public:
+  Class4() {};
+  virtual void Method1() {};
+};
+
+int main()
+{
+  Class4 var1;
+  var1.Method2();
+
+  return 0;
+}
-- 
2.11.0



[PATCH] Define cxx11-abi effective target for libstdc++ tests

2017-01-13 Thread Jonathan Wakely

This effective target means we can skip/xfail tests that depend on the
new std::string, std::list etc.

I've added the check_effective_target_cxx11 procedure in libstdc++.exp
not in gcc/testsuite/lib/target-supports.exp because I don't think it
is useful outside the libstdc++ testsuite.

There are still lots of tests that fail when run with
-D_GLIBCXX_USE_CXX11_ABI=0 but I plan to fix them by adding the
missing features, not skipping the tests.

PR libstdc++/79075
* testsuite/lib/libstdc++.exp (check_v3_target_filesystem_ts): Remove
redundant option from cxxflags.
(check_effective_target_cxx11-abi): Define.
* testsuite/21_strings/basic_string/allocator/71964.cc: Use cxx11-abi
effective target.
* testsuite/21_strings/basic_string/allocator/char/copy.cc: Likewise.
* testsuite/21_strings/basic_string/allocator/char/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/minimal.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/move.cc: Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/swap.cc: Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/swap.cc:
Likewise.
* testsuite/23_containers/list/61347.cc: Likewise.
* testsuite/27_io/basic_fstream/cons/base.cc: Likewise.
* testsuite/27_io/ios_base/failure/cxx11.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.


commit 85ad51f9d27151ad8e070b5d19aef98f1c607bb1
Author: Jonathan Wakely 
Date:   Fri Jan 13 11:18:36 2017 +

Define cxx11-abi effective target for libstdc++ tests

PR libstdc++/79075
* testsuite/lib/libstdc++.exp (check_v3_target_filesystem_ts): Remove
redundant option from cxxflags.
(check_effective_target_cxx11-abi): Define.
* testsuite/21_strings/basic_string/allocator/71964.cc: Use cxx11-abi
effective target.
* testsuite/21_strings/basic_string/allocator/char/copy.cc: Likewise.
* testsuite/21_strings/basic_string/allocator/char/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/minimal.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/move.cc: Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/swap.cc: Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/swap.cc:
Likewise.
* testsuite/23_containers/list/61347.cc: Likewise.
* testsuite/27_io/basic_fstream/cons/base.cc: Likewise.
* testsuite/27_io/ios_base/failure/cxx11.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
index f5ef176..70e5cf8 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
@@ -16,6 +16,8 @@
 // .
 
 // { dg-do run { target c++11 } }
+// COW strings don't support C++11 allocators:
+// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
@@ -58,13 +60,10 @@ operator!=(const mv_allocator&, const mv_allocator&) 
{ return false; }
 void
 test01()
 {
-  // COW strings don't support C++11 allocators
-#if _GLIBCXX_USE_CXX11_ABI
   std::basic_string, mv_allocator> s;
   auto t = std::move(s);
   VERIFY( s.get_allocator().moved_from );
   VERIFY( t.get_allocator().moved_

[PATCH] Do not declare artificial variables in tree-profile.c to have a definition (PR lto/69188).

2017-01-13 Thread Martin Liška
Hello.

Nice example provided in the PR causes ICE as we have an artificial symbol
created in tree-profile.c once being removed by remove unreachable nodes (-O0)
and once not (-O1). Well, difference is in process_references where following 
hunk
prevent removal:

  || (((before_inlining_p
&& ((TREE_CODE (node->decl) != FUNCTION_DECL
 && optimize)

Anyway, these artificial symbols really should be just declarations as they are 
defined
in libgcov library.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From 87262b1b60009381fd943fb433bc38f5d5685ac9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 13 Jan 2017 13:12:57 +0100
Subject: [PATCH] Do not declare artificial variables in tree-profile.c to have
 a definition (PR lto/69188).

gcc/testsuite/ChangeLog:

2017-01-13  Martin Liska  

	PR lto/69188
	* gcc.dg/lto/pr69188_0.c: New test.
	* gcc.dg/lto/pr69188_1.c: New test.

gcc/ChangeLog:

2017-01-13  Martin Liska  

	PR lto/69188
	* tree-profile.c (init_ic_make_global_vars): Do not call
	finalize_decl.
	(gimple_init_gcov_profiler): Likewise.
---
 gcc/testsuite/gcc.dg/lto/pr69188_0.c |  7 +++
 gcc/testsuite/gcc.dg/lto/pr69188_1.c | 10 ++
 gcc/tree-profile.c   |  6 --
 3 files changed, 17 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr69188_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr69188_1.c

diff --git a/gcc/testsuite/gcc.dg/lto/pr69188_0.c b/gcc/testsuite/gcc.dg/lto/pr69188_0.c
new file mode 100644
index 000..8bee874a65b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69188_0.c
@@ -0,0 +1,7 @@
+/* PR ipa/69188 */
+/* { dg-lto-do link } */
+/* { dg-lto-options { { -flto -O0 -fprofile-generate } } } */
+
+void fn1(void) 
+{ 
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr69188_1.c b/gcc/testsuite/gcc.dg/lto/pr69188_1.c
new file mode 100644
index 000..3ed9d5560c6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69188_1.c
@@ -0,0 +1,10 @@
+/* PR ipa/69188 */
+/* { dg-options "-flto -O1 -fprofile-generate" } */
+
+extern void fn1(void);
+
+int main() {
+  fn1();
+  return 0;
+}
+
diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index 186cfdf7929..a49ec37f8bb 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -94,8 +94,6 @@ init_ic_make_global_vars (void)
   if (targetm.have_tls)
 set_decl_tls_model (ic_void_ptr_var, decl_default_tls_model (ic_void_ptr_var));
 
-  varpool_node::finalize_decl (ic_void_ptr_var);
-
   gcov_type_ptr = build_pointer_type (get_gcov_type ());
 
   ic_gcov_type_ptr_var
@@ -112,8 +110,6 @@ init_ic_make_global_vars (void)
   DECL_INITIAL (ic_gcov_type_ptr_var) = NULL;
   if (targetm.have_tls)
 set_decl_tls_model (ic_gcov_type_ptr_var, decl_default_tls_model (ic_gcov_type_ptr_var));
-
-  varpool_node::finalize_decl (ic_gcov_type_ptr_var);
 }
 
 /* Create the type and function decls for the interface with gcov.  */
@@ -210,8 +206,6 @@ gimple_init_gcov_profiler (void)
   DECL_ARTIFICIAL (tree_time_profiler_counter) = 1;
   DECL_INITIAL (tree_time_profiler_counter) = NULL;
 
-  varpool_node::finalize_decl (tree_time_profiler_counter);
-
   /* void (*) (gcov_type *, gcov_type)  */
   average_profiler_fn_type
 	  = build_function_type_list (void_type_node,
-- 
2.11.0



Add test for c++/71166

2017-01-13 Thread Marek Polacek
This bug was fixed in r238395, so what remains to do for GCC 7 is to add a
testcase and close the PR.

Tested on x86_64-linux, ok for trunk?

2017-01-13  Marek Polacek  

PR c++/71166
* g++.dg/cpp0x/constexpr-array18.C: New test.

diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-array18.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-array18.C
index e69de29..0f2d86e 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-array18.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-array18.C
@@ -0,0 +1,23 @@
+// PR c++/71166
+// { dg-do compile { target c++11 } }
+
+struct Foo { int value; };
+
+constexpr Foo MakeFoo() { return Foo{0}; }
+
+struct Bar {
+  Foo color = MakeFoo();
+};
+
+struct BarContainer {
+  Bar array[1];
+};
+
+Foo X ()
+{
+  return MakeFoo ();
+}
+
+void Foo() {
+  new BarContainer();
+}

Marek


Re: [PATCH] BRIG frontend: request for a global review

2017-01-13 Thread Pekka Jääskeläinen
Hi Richard,

Thanks for the review! Replies/questions inline.

On Fri, Jan 13, 2017 at 2:28 PM, Richard Biener
 wrote:
> On Thu, Jan 12, 2017 at 3:55 PM, Pekka Jääskeläinen  
> wrote:
>> Hi,
>>
>> A gentle ping...
>
> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index 140b9f9..06941c5 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -996,6 +996,11 @@ if test x"$enable_hsa" = x1 ; then
>  [Define this to enable support for generating HSAIL.])
>  fi
>
> +if echo "$enable_languages" | grep "brig" > /dev/null; then
> +  AC_DEFINE(ENABLE_BRIG_FE, 1,
> +[Define this to enable the BRIG (HSAIL) frontend.])
> +fi
> +
>
> this looks odd and I'd have expected this to be solely controlled via
> --enable-languages at configure time?

Yes, it is. This AC_DEFINE just adds an autoconf variable ENABLE_BRIG_FE
using which I guard unnecessary inclusion of the brig-builtins.def in
builtins.def in
case BRIG FE is disabled.

> @@ -1366,6 +1369,11 @@ assembler  assembler-with-cpp
>  ada
>  f77  f77-cpp-input f95  f95-cpp-input
>  go
> +<<< HEAD
> +java
> +brig
> +===
> +>>> gcc-master
>  @end smallexample
>
>  @item -x none
>
> unmerged hunk (java is gone).

Cleaned.

> diff --git a/include/hsa-interface.h b/include/hsa-interface.h
> new file mode 100644
> index 000..6765751
> --- /dev/null
> +++ b/include/hsa-interface.h
> @@ -0,0 +1,630 @@
> +/* HSA runtime API 1.0.1 representation description.
> +   Copyright (C) 2016 Free Software Foundation, Inc.
> +
> ...
>
> this looks like a sub/superset of libgomp/plugin/hsa.h, please work on
> retaining only one version.

Merged.

> Did you check whether libhsail-rt builds for all GCC targets?  If not
> you probably want to add
> a list where to disable the frontend and its runtime (see go / libgo
> for an example in the
> toplevel configure.ac).

I have tested only x86 of the GCC upstream targets so far.

Libgo seems to black list known broken ones with a separate --enable-libgo
switch to force enable.  However, BRIG FE and libhsail-rt are disabled
by default.
Should I still add a separate switch to force enable BRIG and enable
it by default for x86?

Or should I force disable it even when enabled with --enable-languages when
building for an untested target?


Thanks,
Pekka


Re: [PATCH TEST]Add test for PR78652

2017-01-13 Thread Bin.Cheng
On Mon, Dec 12, 2016 at 4:35 PM, Jakub Jelinek  wrote:
> On Mon, Dec 12, 2016 at 09:32:30AM -0700, Jeff Law wrote:
>> On 12/09/2016 03:20 AM, Bin Cheng wrote:
>> >Hi,
>> >PR78652 was fixed by patch for PR77856, this patch adds a test for it.  
>> >Test result checked, is it OK?
>> >
>> >Thanks,
>> >bin
>> >
>> >gcc/testsuite/ChangeLog
>> >2016-12-07  Bin Cheng  
>> >
>> > PR rtl-optimization/78652
>> > * gcc.c-torture/execute/pr78652.c: New test.
>> >
>> You need to restrict it to targets with named section support -- or --
>> remove the attribute section from main.
>>
>> /* { dg-require-effective-target named_sections } */
>
> If the testcase fails before r243038 without the attribute and succeeds
> with it, I'd strongly prefer the latter option - remove the attribute.
Hi,
Thank you both for reviewing, the attribute is irrelevant to test, so
this version has it removed.

Thanks,
bin
>
> Jakub
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78652.c 
b/gcc/testsuite/gcc.c-torture/execute/pr78652.c
new file mode 100644
index 000..0b9ba87
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr78652.c
@@ -0,0 +1,76 @@
+/* PR rtl-optimization/78652 */
+/* { dg-options "-fno-strict-aliasing" } */
+
+short a, b = 1, p, t;
+struct {
+  signed f1 : 9;
+} m = {3};
+
+int c, g = 0, h = 8, i, k, l, q, r, s, w, x = 9;
+long long d = 1;
+long long e = 1;
+char f[9][6][4] = {{{1}}};
+short j[6] = {1};
+
+unsigned n;
+static long long *o = &e;
+char u;
+short *v;
+char *y = &f[6][4][3];
+short fn1 (short p1) { return a == 0 ? p1 : p1 % a; }
+
+static int *fn2 (signed char p1, int p2, signed char p3) {
+lbl_2057:
+  if (n >= (q >= fn1 (3)))
+return &r;
+  if (p2)
+t = u;
+  *v = (t | *o) * p2;
+  for (;;) {
+for (; u <= 1;)
+  goto lbl_2057;
+if (p1)
+  s = g = 0;
+  }
+}
+
+void fn3 (int *p1) {
+  if (*p1)
+;
+  else {
+x = w;
+for (;;)
+  ;
+  }
+}
+
+int *fn4 (long p1) {
+  for (; p1; p1--)
+e = 1;
+  y = 0;
+  return &h;
+}
+
+__attribute__((noinline))
+void bar (int a)
+{
+  if (a != 3)
+__builtin_abort ();
+}
+
+int main () {
+  int *z;
+  long t1;
+  long long *t2 = &d;
+  *t2 = b;
+  z = fn4 (*t2);
+  fn3 (z);
+  short *t3 = &j[0];
+  *t3 = f[6][4][3] >= (b = i);
+  t1 = p < n;
+  fn2 (c < t1 >= m.f1, l, k);
+  j[5]++;
+  bar (m.f1);
+
+  return 0;
+}


[PATCH] Add string_view support to COW std::string

2017-01-13 Thread Jonathan Wakely

This adds the C++17 changes to make std::string_view interoperate with
std::string. The code is mostly just copied from the new std::string,
with two adjustments needed where the names of internal member
functions differ.

With this we have no more test failures when using the old ABI (either
via running tests with -D_GLIBCXX_USE_CXX11_ABI=0 or when configured
so that's the default).

PR libstdc++/79075
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI] (basic_string):
Make _If_sv private.
[!_GLIBCXX_USE_CXX11_ABI] (basic_string): Add member functions taking
basic_string_view arguments.

Tested powerpc64le-linux, committed to trunk.

commit 587d97a197fdb2216861a80c04ad70e0d7cef14a
Author: Jonathan Wakely 
Date:   Fri Jan 13 15:26:46 2017 +

Add string_view support to COW std::string

	PR libstdc++/79075
	* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI] (basic_string):
	Make _If_sv private.
	[!_GLIBCXX_USE_CXX11_ABI] (basic_string): Add member functions taking
	basic_string_view arguments.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index fe8f7e6..9dffcf9 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -111,6 +111,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 #if __cplusplus > 201402L
   // A helper type for avoiding boiler-plate.
   typedef basic_string_view<_CharT, _Traits> __sv_type;
+
+  template
+	using _If_sv = enable_if_t<
+	  __and_,
+		 __not_>>::value,
+	  _Res>;
 #endif
 
   // Use empty-base optimization: http://www.cantrip.org/emptyopt.html
@@ -585,12 +591,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 	{ _M_construct(__beg, __end); }
 
 #if __cplusplus > 201402L
-  template
-	using _If_sv = enable_if_t<
-	  __and_,
-		 __not_>>::value,
-	  _Res>;
-
   /**
*  @brief  Construct string from a substring of a string_view.
*  @param  __t   Source string view.
@@ -3320,6 +3320,17 @@ _GLIBCXX_END_NAMESPACE_CXX11
   _S_empty_rep() _GLIBCXX_NOEXCEPT
   { return _Rep::_S_empty_rep(); }
 
+#if __cplusplus > 201402L
+  // A helper type for avoiding boiler-plate.
+  typedef basic_string_view<_CharT, _Traits> __sv_type;
+
+  template
+	using _If_sv = enable_if_t<
+	  __and_,
+		 __not_>>::value,
+	  _Res>;
+#endif
+
 public:
   // Construct/copy/destroy:
   // NB: We overload ctors in some cases instead of using default
@@ -3441,6 +3452,29 @@ _GLIBCXX_END_NAMESPACE_CXX11
 basic_string(_InputIterator __beg, _InputIterator __end,
 		 const _Alloc& __a = _Alloc());
 
+#if __cplusplus > 201402L
+  /**
+   *  @brief  Construct string from a substring of a string_view.
+   *  @param  __t   Source string view.
+   *  @param  __pos The index of the first character to copy from __t.
+   *  @param  __n   The number of characters to copy from __t.
+   *  @param  __a   Allocator to use.
+   */
+  template>
+	basic_string(const _Tp& __t, size_type __pos, size_type __n,
+		 const _Alloc& __a = _Alloc())
+	: basic_string(__sv_type(__t).substr(__pos, __n), __a) { }
+
+  /**
+   *  @brief  Construct string from a string_view.
+   *  @param  __sv  Source string view.
+   *  @param  __a  Allocator to use (default is default allocator).
+   */
+  explicit
+  basic_string(__sv_type __sv, const _Alloc& __a = _Alloc())
+  : basic_string(__sv.data(), __sv.size(), __a) { }
+#endif // C++17
+
   /**
*  @brief  Destroy the string instance.
*/
@@ -3506,6 +3540,23 @@ _GLIBCXX_END_NAMESPACE_CXX11
   }
 #endif // C++11
 
+#if __cplusplus > 201402L
+  /**
+   *  @brief  Set value to string constructed from a string_view.
+   *  @param  __sv  A string_view.
+   */
+  basic_string&
+  operator=(__sv_type __sv)
+  { return this->assign(__sv); }
+
+  /**
+   *  @brief  Convert to a string_view.
+   *  @return A string_view.
+   */
+  operator __sv_type() const noexcept
+  { return __sv_type(data(), size()); }
+#endif // C++17
+
   // Iterators:
   /**
*  Returns a read/write iterator that points to the first character in
@@ -3910,6 +3961,17 @@ _GLIBCXX_END_NAMESPACE_CXX11
   { return this->append(__l.begin(), __l.size()); }
 #endif // C++11
 
+#if __cplusplus > 201402L
+  /**
+   *  @brief  Append a string_view.
+   *  @param __sv  The string_view to be appended.
+   *  @return  Reference to this string.
+   */
+  basic_string&
+  operator+=(__sv_type __sv)
+  { return this->append(__sv); }
+#endif // C++17
+
   /**
*  @brief  Append a string to this string.
*  @param __str  The string to append.
@@ -3990,6 +4052,34 @@ _GLIBCXX_END_NAMESPACE_CXX11
 append(_InputIterator __first, _InputIterator __last)
 { return this->replace(_M_iend(), _M_iend(), __first, __last

Re: [PATCH] BRIG frontend: request for a global review

2017-01-13 Thread Pekka Jääskeläinen
On Fri, Jan 13, 2017 at 2:34 PM, Richard Biener
 wrote:
> On Thu, Jan 12, 2017 at 3:55 PM, Pekka Jääskeläinen  
> wrote:
>> Hi,
>>
>> A gentle ping...
>
> Looking at 002/
>
> What's the reason of having brig-builtins.def?  I do not see any
> middle-end stuff using them
> and if you just emit calls to the runtime then you do not need
> "builtins" for this -- just build
> the function decls in the frontend.  The Fortran frontend has examples
> for how to do that
> for almost all entries to the libgfortran runtime.
>
> So all changes besides in brig/ look unnecessary to me.

Most of the HSAIL instructions represented as builtins here are fine
grained enough for
them to be candidates for expanding as sensible target instruction
sequences instead of runtime
function calls.

I understand that there is no code in any of the targets to do so yet.
But still I wonder is it
meaningful to convert them to direct runtime function calls -- if/when
a target wants
to optimize their HSAIL codegen they have to be moved back. Especially given
the BRIG FE is not enabled by default and the inclusion of the BRIG
builtins is done
only if enabled, does it cause harm?

BR,
--Pekka


Re: Avoid PR72749 by not using unspecs

2017-01-13 Thread Uros Bizjak
On Fri, Jan 13, 2017 at 12:50 PM, Alan Modra  wrote:
> Rather than using unspecs in doloop insns to stop combine creating
> these insns, use legitimate_combined_insn.
>
> I'm not sure why the original patch implementing
> legitimate_combined_insn did not store the result of recog to the insn
> but it seems good to me, and would allow the recog call in
> ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
> shown here.)

IIRC, I copied operand scanning loop from recog.c (around line 2580)
and the function was later enhanced with preferred alternatives
handling. The function worked well, and not being an expert in this
area, I didn't try to "optimize" the code that worked...

So, there is no particular reason for the current implementation.

Uros.


Re: [1/5][AArch64] Return address protection on AArch64

2017-01-13 Thread James Greenhalgh
On Fri, Jan 06, 2017 at 11:47:07AM +, Jiong Wang wrote:
> On 11/11/16 18:22, Jiong Wang wrote:
> gcc/
> 2017-01-06  Jiong Wang  
> 
> * config/aarch64/aarch64-opts.h (aarch64_function_type): New enum.
> * config/aarch64/aarch64-protos.h
> (aarch64_return_address_signing_enabled): New declaration.
> * config/aarch64/aarch64.c (aarch64_return_address_signing_enabled):
> New function.
> (aarch64_expand_prologue): Sign return address before it's pushed onto
> stack.
> (aarch64_expand_epilogue): Authenticate return address fetched from
> stack.
> (aarch64_override_options): Sanity check for ILP32 and ISA level.
> (aarch64_attributes): New function attributes for 
> "sign-return-address".
> * config/aarch64/aarch64.md (UNSPEC_AUTI1716, UNSPEC_AUTISP,
> UNSPEC_PACI1716, UNSPEC_PACISP, UNSPEC_XPACLRI): New unspecs.
> ("*do_return"): Generate combined instructions according to key index.
> ("sp", " * config/aarch64/iterators.md (PAUTH_LR_SP, PAUTH_17_16): New integer
> iterators.
> (pauth_mnem_prefix, pauth_hint_num_a): New integer attributes.
> * config/aarch64/aarch64.opt (msign-return-address=): New.
> * doc/extend.texi (AArch64 Function Attributes): Documents
> "sign-return-address=".
> * doc/invoke.texi (AArch64 Options): Documents 
> "-msign-return-address=".
> 
> gcc/testsuite/
> 2017-01-06  Jiong Wang  
> 
> * gcc.target/aarch64/return_address_sign_1.c: New testcase.
> * gcc.target/aarch64/return_address_sign_scope_1.c: New testcase.

I have a few comments on this patch, mostly around the wording of comments.

> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3670,6 +3690,16 @@ aarch64_expand_epilogue (bool for_sibcall)
>RTX_FRAME_RELATED_P (insn) = 1;
>  }
>  
> +  /* sibcall won't generate normally return, therefore we need to 
> authenticate
> + at here.  !TARGET_ARMV8_3 disallows GCC to use combined authentication
> + instruction which we prefer, so we need to generate a seperate
> + authentication. eh_return path can't do combined authentication, as it 
> will
> + do extra stack adjustment which updates CFA to EH handler's CFA while we
> + want to use the CFA of the function which calls eh_return.  */

This comment does not read correctly to me, and I'm not sure what you
mean by it. I'd rewrite it as so:

  We prefer to emit the combined return/authenticate instruction RETAA,
  however there are three cases in which we must instead emit an explicit
  authentication instruction.

1) Sibcalls don't return in a normal way, so if we're about to
   call one we must authenticate.
2) The RETAA instruction is not available before ARMv8.3-A, so if we
   are generating code for !TARGET_ARMV8_3 we can't use it and must
   explicitly authenticate.
3) On an eh_return path we make extra stack adjustments to update the
   canonical frame address to be the exception handler's CFA.  We want
   to authenticate using the CFA of the function which calls eh_return.

> +  if (aarch64_return_address_signing_enabled ()
> +  && (for_sibcall || !TARGET_ARMV8_3 || crtl->calls_eh_return))
> +emit_insn (gen_autisp ());
> +
>/* Stack adjustment for exception handler.  */
>if (crtl->calls_eh_return)
>  {
> @@ -8894,6 +8924,9 @@ aarch64_override_options (void)
>  error ("Assembler does not support -mabi=ilp32");
>  #endif
>  
> +  if (aarch64_ra_sign_scope != AARCH64_FUNCTION_NONE && TARGET_ILP32)
> +error ("Return address signing is only supported on LP64");

Should this be a sorry? I think this is something which has not been
implemented rather than a fundamental restriction.

Maybe:

  sorry ("Return address signing is only supported for -mabi=lp64");

Would be clear for our users.

> +
>/* Make sure we properly set up the explicit options.  */
>if ((aarch64_cpu_string && valid_cpu)
> || (aarch64_tune_string && valid_tune))
> @@ -9277,6 +9310,8 @@ static const struct aarch64_attribute_info 
> aarch64_attributes[] =
>{ "cpu", aarch64_attr_custom, false, aarch64_handle_attr_cpu, OPT_mcpu_ },
>{ "tune", aarch64_attr_custom, false, aarch64_handle_attr_tune,
>   OPT_mtune_ },
> +  { "sign-return-address", aarch64_attr_enum, false, NULL,
> + OPT_msign_return_address_ },
>{ NULL, aarch64_attr_custom, false, NULL, OPT }
>  };
>  
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> bde42317f1bfd91a9d38a4cfa94d4cedd5246003..9b6f99dee3cecc0f15f24ec2375d3641fe03892d
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -67,6 +67,8 @@
>  )
>  
>  (define_c_enum "unspec" [
> +UNSPEC_AUTI1716
> +UNSPEC_AUTISP
>  UNSPEC_CASESI
>  UNSPEC_CRC32B
>  UNSPEC_CRC32CB
> @@ -106,6 +108,8 @@
>  UNSPEC_LD4_LAN

Re: [2/5][AArch64] Generate dwarf information for -msign-return-address

2017-01-13 Thread Richard Earnshaw (lists)
On 06/01/17 11:47, Jiong Wang wrote:
> On 11/11/16 18:22, Jiong Wang wrote:
>> This patch generate DWARF description for pointer authentication. 
>> DWARF value
>> expression is used to describe the authentication action.
>>
>> Please see the cover letter and AArch64 DWARF specification for the
>> semantics
>> of AArch64 DWARF operations.
>>
>> When authentication key index is A key, we use compact DWARF
>> description which
>> can largely save DWARF frame size, otherwise we fallback to general
>> operator.
>>
>>
>>
>> Example
>> ===
>>
>>  int
>>  cal (int a, int b, int c)
>>  {
>>return a + dec (b) + c;
>>  }
>>
>> Compact DWARF description
>> (-march=armv8.3-a -msign-return-address)
>> ===
>>
>>DW_CFA_advance_loc: 4 to 0004
>>DW_CFA_val_expression: r30 (x30) (DW_OP_AARCH64_paciasp)
>>DW_CFA_advance_loc: 4 to 0008
>>DW_CFA_val_expression: r30 (x30) (DW_OP_AARCH64_paciasp_deref: -24)
>>
>> General DWARF description
>> ===
>> (-march=armv8.3-a -msign-return-address -mpauth-key=b_key)
>>
>>DW_CFA_advance_loc: 4 to 0004
>>DW_CFA_val_expression: r30 (x30) (DW_OP_breg30 (x30): 0;
>> DW_OP_AARCH64_pauth: 18)
>>DW_CFA_advance_loc: 4 to 0008
>>DW_CFA_val_expression: r30 (x30) (DW_OP_dup; DW_OP_const1s: -24;
>> DW_OP_plus; DW_OP_deref; DW_OP_AARCH64_pauth: 18)
>>
>>  From Linux kernel testing, -msign-return-address will introduce +24%
>> .debug_frame size increase when signing all functions and using compact
>> description, and about +45% .debug_frame size increase if using general
>> description.
>>
>>
>> gcc/
>> 2016-11-11  Jiong Wang
>>
>>  * config/aarch64/aarch64.h (aarch64_pauth_action_type): New
>> enum.
>>  * config/aarch64/aarch64.c
>> (aarch64_attach_ra_auth_dwarf_note): New function.
>>  (aarch64_attach_ra_auth_dwarf_general): New function.
>>  (aarch64_attach_ra_auth_dwarf_shortcut): New function.
>>  (aarch64_save_callee_saves): Generate dwarf information if LR
>> is signed.
>>  (aarch64_expand_prologue): Likewise.
>>  (aarch64_expand_epilogue): Likewise.
> 
> This patch is an update on DWARF generation for return address signing.
> 
> According to new proposal, we simply needs to generate REG_CFA_WINDOW_SAVE
> annotation.
> 
> gcc/
> 
> 2017-01-06  Jiong Wang  
> 
> * config/aarch64/aarch64.c (aarch64_expand_prologue): Generate
> dwarf
> annotation (REG_CFA_WINDOW_SAVE) for return address signing.
> (aarch64_expand_epilogue): Likewise.
> 
> 

I don't think we should be overloading REG_CFA_WINDOW_SAVE internally in
the compiler -- it's one thing to do it in the dwarf output tables, but
quite another to be doing it elsewhere in the compiler.

Instead we should create a new reg note kind and use that, but in the
final dwarf output then emit the overloaded opcode.

R.

> 
> 2.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 002895a167ce0deb45a5c1726527651af18bb4df..20ed79e5690f45ec121ef516245c686cc0cc82b5
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3553,7 +3553,11 @@ aarch64_expand_prologue (void)
>  
>/* Sign return address for functions.  */
>if (aarch64_return_address_signing_enabled ())
> -emit_insn (gen_pacisp ());
> +{
> +  insn = emit_insn (gen_pacisp ());
> +  add_reg_note (insn, REG_CFA_WINDOW_SAVE, const0_rtx);
> +  RTX_FRAME_RELATED_P (insn) = 1;
> +}
>  
>if (flag_stack_usage_info)
>  current_function_static_stack_size = frame_size;
> @@ -3698,7 +3702,11 @@ aarch64_expand_epilogue (bool for_sibcall)
>   want to use the CFA of the function which calls eh_return.  */
>if (aarch64_return_address_signing_enabled ()
>&& (for_sibcall || !TARGET_ARMV8_3 || crtl->calls_eh_return))
> -emit_insn (gen_autisp ());
> +{
> +  insn = emit_insn (gen_autisp ());
> +  add_reg_note (insn, REG_CFA_WINDOW_SAVE, const0_rtx);
> +  RTX_FRAME_RELATED_P (insn) = 1;
> +}
>  
>/* Stack adjustment for exception handler.  */
>if (crtl->calls_eh_return)
> 



Re: [PATCH, ARM] correctly encode the CC reg data flow

2017-01-13 Thread Bernd Edlinger
On 01/13/17 14:50, Richard Earnshaw (lists) wrote:
> On 18/12/16 12:58, Bernd Edlinger wrote:
>> Hi,
>>
>> this is related to PR77308, the follow-up patch will depend on this one.
>>
>> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned
>> before reload, a mis-compilation in libgcc function __gnu_satfractdasq
>> was discovered, see [1] for more details.
>>
>> The reason seems to be that when the *arm_cmpdi_insn is directly
>> followed by a *arm_cmpdi_unsigned instruction, both are split
>> up into this:
>>
>>[(set (reg:CC CC_REGNUM)
>>  (compare:CC (match_dup 0) (match_dup 1)))
>> (parallel [(set (reg:CC CC_REGNUM)
>> (compare:CC (match_dup 3) (match_dup 4)))
>>(set (match_dup 2)
>> (minus:SI (match_dup 5)
>>  (ltu:SI (reg:CC_C CC_REGNUM) (const_int
>> 0])]
>>
>>[(set (reg:CC CC_REGNUM)
>>  (compare:CC (match_dup 2) (match_dup 3)))
>> (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0))
>>(set (reg:CC CC_REGNUM)
>> (compare:CC (match_dup 0) (match_dup 1]
>>
>> The problem is that the reg:CC from the *subsi3_carryin_compare
>> is not mentioning that the reg:CC is also dependent on the reg:CC
>> from before.  Therefore the *arm_cmpsi_insn appears to be
>> redundant and thus got removed, because the data values are identical.
>>
>> I think that applies to a number of similar pattern where data
>> flow is happening through the CC reg.
>>
>> So this is a kind of correctness issue, and should be fixed
>> independently from the optimization issue PR77308.
>>
>> Therefore I think the patterns need to specify the true
>> value that will be in the CC reg, in order for cse to
>> know what the instructions are really doing.
>>
>>
>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>> Is it OK for trunk?
>>
>
> I agree you've found a valid problem here, but I have some issues with
> the patch itself.
>
>
> (define_insn_and_split "subdi3_compare1"
>   [(set (reg:CC_NCV CC_REGNUM)
>   (compare:CC_NCV
> (match_operand:DI 1 "register_operand" "r")
> (match_operand:DI 2 "register_operand" "r")))
>(set (match_operand:DI 0 "register_operand" "=&r")
>   (minus:DI (match_dup 1) (match_dup 2)))]
>   "TARGET_32BIT"
>   "#"
>   "&& reload_completed"
>   [(parallel [(set (reg:CC CC_REGNUM)
>  (compare:CC (match_dup 1) (match_dup 2)))
> (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
>(parallel [(set (reg:CC_C CC_REGNUM)
>  (compare:CC_C
>(zero_extend:DI (match_dup 4))
>(plus:DI (zero_extend:DI (match_dup 5))
> (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
> (set (match_dup 3)
>  (minus:SI (minus:SI (match_dup 4) (match_dup 5))
>(ltu:SI (reg:CC_C CC_REGNUM) (const_int 0])]
>
>
> This pattern is now no-longer self consistent in that before the split
> the overall result for the condition register is in mode CC_NCV, but
> afterwards it is just CC_C.
>
> I think CC_NCV is correct mode (the N, C and V bits all correctly
> reflect the result of the 64-bit comparison), but that then implies that
> the cc mode of subsi3_carryin_compare is incorrect as well and should in
> fact also be CC_NCV.  Thinking about this pattern, I'm inclined to agree
> that CC_NCV is the correct mode for this operation
>
> I'm not sure if there are other consequences that will fall out from
> fixing this (it's possible that we might need a change to select_cc_mode
> as well).
>

Yes, this is still a bit awkward...

The N and V bit will be the correct result for the subdi3_compare1
a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...)
only gets the C bit correct, the expression for N and V is a different
one.

It probably works, because the subsi3_carryin_compare instruction sets
more CC bits than the pattern does explicitly specify the value.
We know the subsi3_carryin_compare also computes the NV bits, but it is
hard to write down the correct rtl expression for it.

In theory the pattern should describe everything correctly,
maybe, like:

set (reg:CC_C CC_REGNUM)
 (compare:CC_C
   (zero_extend:DI (match_dup 4))
   (plus:DI (zero_extend:DI (match_dup 5))
(ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
set (reg:CC_NV CC_REGNUM)
 (compare:CC_NV
  (match_dup 4))
  (plus:SI (match_dup 5) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))
set (match_dup 3)
 (minus:SI (minus:SI (match_dup 4) (match_dup 5))
   (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)


But I doubt that will work to set CC_REGNUM with two different modes
in parallel?

Another idea would be to invent a CC_CNV_NOOV mode, that implicitly
defines C from the DImode result, and NV from the SImode result,
similar to the CC_NOOVmode, that also leaves someth

Re: [3/5][AArch64] New builtins required by libgcc unwinder

2017-01-13 Thread Richard Earnshaw (lists)
On 06/01/17 11:47, Jiong Wang wrote:
> On 11/11/16 18:22, Jiong Wang wrote:
>> This patch implements a few ARMv8.3-A new builtins for pointer sign and
>> authentication instructions.
>>
>> Currently, these builtins are supposed to be used by libgcc EH unwinder
>> only.  They are not public interface to external user.
>>
>> OK to install?
>>
>> gcc/
>> 2016-11-11  Jiong Wang
>>
>>  * config/aarch64/aarch64-builtins.c (enum aarch64_builtins):
>> New entries
>>  for AARCH64_PAUTH_BUILTIN_PACI1716,
>> AARCH64_PAUTH_BUILTIN_AUTIA1716,
>>  AARCH64_PAUTH_BUILTIN_AUTIB1716, AARCH64_PAUTH_BUILTIN_XPACLRI.
>>  (aarch64_init_v8_3_builtins): New.
>>  (aarch64_init_builtins): Call aarch64_init_builtins.
>>  (arch64_expand_builtin): Expand new builtins.
>>
>>
> This patch is an update on builtins support.  All these builtins are to be
> internally used by libgcc only, so the updates only keeps those used.
> 
> OK for trunk?
> 
> gcc/
> 
> 2017-01-06  Jiong Wang  
> 
> * config/aarch64/aarch64-builtins.c (enum aarch64_builtins): New
> entries
> for AARCH64_PAUTH_BUILTIN_XPACLRI, AARCH64_PAUTH_BUILTIN_PACIA1716,
> AARCH64_PAUTH_BUILTIN_AUTIA1716);
> (aarch64_init_pauth_hint_builtins): New.
> (aarch64_init_builtins): Call aarch64_init_pauth_hint_builtins.
> (aarch64_expand_builtin): Expand new builtins.
> 
> 
> 3.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index 
> 69fb756f0fbdc016f35ce1d08f2aaf092a034704..9ae9d9afc9c141235d7eee037d5571b9f35edc31
>  100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -376,6 +376,10 @@ enum aarch64_builtins
>AARCH64_CRC32_BUILTIN_BASE,
>AARCH64_CRC32_BUILTINS
>AARCH64_CRC32_BUILTIN_MAX,
> +  /* ARMv8.3-A Pointer Authentication Builtins.  */
> +  AARCH64_PAUTH_BUILTIN_AUTIA1716,
> +  AARCH64_PAUTH_BUILTIN_PACIA1716,
> +  AARCH64_PAUTH_BUILTIN_XPACLRI,
>AARCH64_BUILTIN_MAX
>  };
>  
> @@ -923,6 +927,33 @@ aarch64_init_fp16_types (void)
>aarch64_fp16_ptr_type_node = build_pointer_type (aarch64_fp16_type_node);
>  }
>  
> +/* Pointer authentication builtins that will become NOP on legacy platform.
> +   Currently, these builtins are for internal use only (libgcc EH unwinder). 
>  */
> +
> +void
> +aarch64_init_pauth_hint_builtins (void)
> +{
> +  /* Pointer Authentication builtins.  */
> +  tree ftype_pointer_auth
> += build_function_type_list (ptr_type_node, ptr_type_node,
> + unsigned_intDI_type_node, NULL_TREE);
> +  tree ftype_pointer_strip
> += build_function_type_list (ptr_type_node, ptr_type_node, NULL_TREE);
> +
> +  aarch64_builtin_decls[AARCH64_PAUTH_BUILTIN_AUTIA1716]
> += add_builtin_function ("__builtin_aarch64_autia1716", 
> ftype_pointer_auth,
> + AARCH64_PAUTH_BUILTIN_AUTIA1716, BUILT_IN_MD, NULL,
> + NULL_TREE);
> +  aarch64_builtin_decls[AARCH64_PAUTH_BUILTIN_PACIA1716]
> += add_builtin_function ("__builtin_aarch64_pacia1716", 
> ftype_pointer_auth,
> + AARCH64_PAUTH_BUILTIN_PACIA1716, BUILT_IN_MD, NULL,
> + NULL_TREE);
> +  aarch64_builtin_decls[AARCH64_PAUTH_BUILTIN_XPACLRI]
> += add_builtin_function ("__builtin_aarch64_xpaclri", ftype_pointer_strip,
> + AARCH64_PAUTH_BUILTIN_XPACLRI, BUILT_IN_MD, NULL,
> + NULL_TREE);
> +}
> +
>  void
>  aarch64_init_builtins (void)
>  {
> @@ -951,6 +982,10 @@ aarch64_init_builtins (void)
>  
>aarch64_init_crc32_builtins ();
>aarch64_init_builtin_rsqrt ();
> +
> +/* Initialize pointer authentication builtins which are backed by 
> instructions
> +   in NOP encoding space.  */
> +  aarch64_init_pauth_hint_builtins ();
>  }
>  
>  tree
> @@ -1293,6 +1328,43 @@ aarch64_expand_builtin (tree exp,
>   }
>emit_insn (pat);
>return target;

Add a blank line before case statements.


OK with that change.

R.

> +case AARCH64_PAUTH_BUILTIN_AUTIA1716:
> +case AARCH64_PAUTH_BUILTIN_PACIA1716:
> +case AARCH64_PAUTH_BUILTIN_XPACLRI:
> +  arg0 = CALL_EXPR_ARG (exp, 0);
> +  op0 = force_reg (Pmode, expand_normal (arg0));
> +
> +  if (!target)
> + target = gen_reg_rtx (Pmode);
> +  else
> + target = force_reg (Pmode, target);
> +
> +  emit_move_insn (target, op0);
> +
> +  if (fcode == AARCH64_PAUTH_BUILTIN_XPACLRI)
> + {
> +   rtx lr = gen_rtx_REG (Pmode, R30_REGNUM);
> +   icode = CODE_FOR_xpaclri;
> +   emit_move_insn (lr, op0);
> +   emit_insn (GEN_FCN (icode) ());
> +   emit_move_insn (target, lr);
> + }
> +  else
> + {
> +   tree arg1 = CALL_EXPR_ARG (exp, 1);
> +   rtx op1 = force_reg (Pmode, expand_normal (arg1));
> +   icode = (fcode == AARCH64_PAUTH_BUILTIN_PACIA1716
> +? CODE_FOR_paci1716 : CODE_

[PATCH, rs6000] Fix swap optimization to handle __builtin_vsx_xxspltd

2017-01-13 Thread Bill Schmidt
Hi,

There is a gap in swap optimization that does not properly handle code
generated by __builtin_vsx_xxspltd.  This is expanded into an 
UNSPEC_VSX_XXSPLTD, which is currently treated as ok to swap.  It should
instead be treated as ok to swap, with special handling to modify the lane
used as the source of the splat.  We have existing code to do this for
other splat forms, so the patch is quite simple.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this ok for trunk?  We also require backports for 5 and 6.

Thanks,
Bill


[gcc]

2017-01-13  Bill Schmidt  

* config/rs6000/rs6000.c (rtx_is_swappable_p): Change
UNSPEC_VSX__XXSPLTD to require special splat handling.

[gcc/testsuite]

2017-01-13  Bill Schmidt  

* gcc.target/powerpc/swaps-p8-27.c: New.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 244382)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -41271,6 +41271,7 @@ rtx_is_swappable_p (rtx op, unsigned int *special)
  case UNSPEC_VSX_VEC_INIT:
return 0;
  case UNSPEC_VSPLT_DIRECT:
+ case UNSPEC_VSX_XXSPLTD:
*special = SH_SPLAT;
return 1;
  case UNSPEC_REDUC_PLUS:
Index: gcc/testsuite/gcc.target/powerpc/swaps-p8-27.c
===
--- gcc/testsuite/gcc.target/powerpc/swaps-p8-27.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/swaps-p8-27.c  (working copy)
@@ -0,0 +1,36 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O3 " } */
+/* { dg-final { scan-assembler-times "lxvd2x" 2 } } */
+/* { dg-final { scan-assembler-times "stxvd2x" 1 } } */
+/* { dg-final { scan-assembler-times "xxpermdi" 3 } } */
+
+/* Verify that swap optimization works correctly for a VSX direct splat.
+   The three xxpermdi's that are generated correspond to two splats
+   and the __builtin_vsx_xxpermdi.  */
+
+int printf (const char *__restrict __format, ...);
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
+double s1[] = {2134.3343, 6678.346};
+double s2[] = {41124.234, 6678.346};
+long long dd[] = {1, 2}, d[2];
+union{long long l[2]; double d[2];} e;
+
+void
+foo ()
+{
+  __m128d source1, source2, dest;
+  __m128d a, b, c;
+
+  e.d[1] = s1[1];
+  e.l[0] = !__builtin_isunordered(s1[0], s2[0]) 
+&& s1[0] == s2[0] ? -1 : 0;
+  source1 = __builtin_vec_vsx_ld (0, s1);
+  source2 = __builtin_vec_vsx_ld (0, s2);
+  a = __builtin_vec_splat (source1, 0);
+  b = __builtin_vec_splat (source2, 0);
+  c = (__m128d)__builtin_vec_cmpeq (a, b);
+  dest = __builtin_vsx_xxpermdi (source1, c, 1);
+  *(__m128d *)d = dest;
+}




Re: [PATCH][AArch64][GCC 6] PR target/79041: Correct -mpc-relative-literal-loads logic in aarch64_classify_symbol

2017-01-13 Thread James Greenhalgh
On Wed, Jan 11, 2017 at 04:32:45PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> In this PR we generated ADRP/ADD instructions with :lo12: relocations on
> symbols even though -mpc-relative-literal-loads is used. This is due to the
> confusing double-negative logic of the nopcrelative_literal_loads aarch64
> variable and its relation to the aarch64_nopcrelative_literal_loads global
> variable in the GCC 6 branch.
> 
> Wilco fixed this on trunk as part of a larger patch (r237607) and parts of
> that patch were backported, but other parts weren't and that patch now
> doesn't apply cleanly to the branch.

As I commented to Jakub at the time he made the first partial backport,
I'd much rather just see all of Wilco's patch backported. We're not on
the verge of a 6 release now, so please just backport Wilco's patch (as
should have been done all along if this had been correctly identified as
a fix rather than a clean-up).

Thanks,
James




Re: [C++ Patch] PR 71737

2017-01-13 Thread Paolo Carlini

Hi,

On 13/01/2017 15:51, Nathan Sidwell wrote:

On 01/13/2017 09:45 AM, Paolo Carlini wrote:

Hi,

in this error recovery issue get_underlying_template crashes when
TYPE_TEMPLATE_INFO_MAYBE_ALIAS is applied to a null orig_type. Simply
checking for that condition appears to solve the issue in a
straightforward way. Tested x86_64-linux.


Wouldn't it be better if a scrogged alias got error_mark_node as the 
underlying type?  (I have no idea whether that's an easy thing to 
accomplish)
Your reply, Nathan, led me to investigate where exactly 
DECL_ORIGINAL_TYPE becomes null, and turns out that in tsubst_decl we 
have code actively doing that. That same code, a few lines below, only 
sets TYPE_DEPENDENT_P_VALID to false if type != error_mark_node. I 
cannot say to fully understand yet all the details, but certainly the 
patchlet below also passes testing. Do you have comments about this too?


Thanks!
Paolo.

///
Index: pt.c
===
--- pt.c(revision 244405)
+++ pt.c(working copy)
@@ -12868,7 +12868,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t com
/* Preserve a typedef that names a type.  */
if (is_typedef_decl (r))
  {
-   DECL_ORIGINAL_TYPE (r) = NULL_TREE;
+   if (type != error_mark_node)
+ DECL_ORIGINAL_TYPE (r) = NULL_TREE;
set_underlying_type (r);
if (TYPE_DECL_ALIAS_P (r) && type != error_mark_node)
  /* An alias template specialization can be dependent


Re: [1/5][AArch64] Return address protection on AArch64

2017-01-13 Thread Jiong Wang

On 13/01/17 16:04, James Greenhalgh wrote:

On Fri, Jan 06, 2017 at 11:47:07AM +, Jiong Wang wrote:

On 11/11/16 18:22, Jiong Wang wrote:
gcc/
2017-01-06  Jiong Wang  

 * config/aarch64/aarch64-opts.h (aarch64_function_type): New enum.
 * config/aarch64/aarch64-protos.h
 (aarch64_return_address_signing_enabled): New declaration.
 * config/aarch64/aarch64.c (aarch64_return_address_signing_enabled):
 New function.
 (aarch64_expand_prologue): Sign return address before it's pushed onto
 stack.
 (aarch64_expand_epilogue): Authenticate return address fetched from
 stack.
 (aarch64_override_options): Sanity check for ILP32 and ISA level.
 (aarch64_attributes): New function attributes for 
"sign-return-address".
 * config/aarch64/aarch64.md (UNSPEC_AUTI1716, UNSPEC_AUTISP,
 UNSPEC_PACI1716, UNSPEC_PACISP, UNSPEC_XPACLRI): New unspecs.
 ("*do_return"): Generate combined instructions according to key index.
 ("sp", "
I have a few comments on this patch


All fixed.  New patch attached.

gcc/
2017-01-13  Jiong Wang  

* config/aarch64/aarch64-opts.h (aarch64_function_type): New enum.
* config/aarch64/aarch64-protos.h
(aarch64_return_address_signing_enabled): New declaration.
* config/aarch64/aarch64.c (aarch64_return_address_signing_enabled):
New function.
(aarch64_expand_prologue): Sign return address before it's pushed onto
stack.
(aarch64_expand_epilogue): Authenticate return address fetched from
stack.
(aarch64_override_options): Sanity check for ILP32 and ISA level.
(aarch64_attributes): New function attributes for "sign-return-address".
* config/aarch64/aarch64.md (UNSPEC_AUTI1716, UNSPEC_AUTISP,
UNSPEC_PACI1716, UNSPEC_PACISP, UNSPEC_XPACLRI): New unspecs.
("*do_return"): Generate combined instructions according to key index.
("sp", "calls_eh_return)
+  return "retaa";
+
+return "ret";
+  }
   [(set_attr "type" "branch")]
 )
 
@@ -5341,6 +5353,36 @@
   [(set_attr "length" "0")]
 )
 
+;; Pointer authentication patterns are always provided.  In architecture
+;; revisions prior to ARMv8.3-A these HINT instructions operate as NOPs.
+;; This lets the user write portable software which authenticates pointers
+;; when run on something which implements ARMv8.3-A, and which runs
+;; correctly, but does not authenticate pointers, where ARMv8.3-A is not
+;; implemented.
+
+;; Signing/Authenticating R30 using SP as the salt.
+(define_insn "sp"
+  [(set (reg:DI R30_REGNUM)
+	(unspec:DI [(reg:DI R30_REGNUM) (reg:DI SP_REGNUM)] PAUTH_LR_SP))]
+  ""
+  "hint\t // asp";
+)
+
+;; Signing/Authenticating X17 using X16 as the salt.
+(define_insn "1716"
+  [(set (reg:DI R17_REGNUM)
+	(unspec:DI [(reg:DI R17_REGNUM) (reg:DI R16_REGNUM)] PAUTH_17_16))]
+  ""
+  "hint\t // a1716";
+)
+
+;; Stripping the signature in R30.
+(define_insn "xpaclri"
+  [(set (reg:DI R30_REGNUM) (unspec:DI [(reg:DI R30_REGNUM)] UNSPEC_XPACLRI))]
+  ""
+  "hint\t7 // xpaclri"
+)
+
 ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
 ;; all of memory.  This blocks insns from being moved across this point.
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 56b920d..5436884 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -149,6 +149,23 @@ mpc-relative-literal-loads
 Target Report Save Var(pcrelative_literal_loads) Init(2) Save
 PC relative literal loads.
 
+msign-return-address=
+Target RejectNegative Report Joined Enum(aarch64_ra_sign_scope_t) Var(aarch64_ra_sign_scope) Init(AARCH64_FUNCTION_NONE) Save
+Select return address signing scope.
+
+Enum
+Name(aarch64_ra_sign_scope_t) Type(enum aarch64_function_type)
+Supported AArch64 return address signing scope (for use with -msign-return-address= option):
+
+EnumValue
+Enum(aarch64_ra_sign_scope_t) String(none) Value(AARCH64_FUNCTION_NONE)
+
+EnumValue
+Enum(aarch64_ra_sign_scope_t) String(non-leaf) Value(AARCH64_FUNCTION_NON_LEAF)
+
+EnumValue
+Enum(aarch64_ra_sign_scope_t) String(all) Value(AARCH64_FUNCTION_ALL)
+
 mlow-precision-recip-sqrt
 Common Var(flag_mrecip_low_precision_sqrt) Optimization
 Enable the reciprocal square root approximation.  Enabling this reduces
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index e2377c1..c59d31e 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1032,6 +1032,10 @@
 (define_int_iterator FMAXMIN_UNS [UNSPEC_FMAX UNSPEC_FMIN
   UNSPEC_FMAXNM UNSPEC_FMINNM])
 
+(define_int_iterator PAUTH_LR_SP [UNSPEC_PACISP UNSPEC_AUTISP])
+
+(define_int_iterator PAUTH_17_16 [UNSPEC_PACI1716 UNSPEC_AUTI1716])
+
 (define_int_iterator VQDMULH [UNSPEC_SQDMULH UNSPEC_SQRDMULH])
 
 (define_int_iterator USSUQADD [UNSPEC_SUQADD UNSPEC_USQADD])
@@ -1218,6 +1222,18 @@
   (UNSPEC_FC

Re: Add test for c++/71166

2017-01-13 Thread Jeff Law

On 01/13/2017 08:40 AM, Marek Polacek wrote:

This bug was fixed in r238395, so what remains to do for GCC 7 is to add a
testcase and close the PR.

Tested on x86_64-linux, ok for trunk?

2017-01-13  Marek Polacek  

PR c++/71166
* g++.dg/cpp0x/constexpr-array18.C: New test.

OK.
jeff



Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-13 Thread Janne Blomqvist
On Thu, Jan 12, 2017 at 10:46 AM, FX  wrote:
>> I was finally able to get a 32-bit i686 compiler going (my attempts to
>> do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
>> running 32-bit builds/tests on a i686 container). At least on i686,
>> the patch below on top of the big charlen->size_t patch fixes the
>> failures
>
> Patch approved. The old code used gfc_extract_int, which bails out if a 
> non-constant expression is passed, so this is the right thing to do.
>
> FX

Committed as r28.

David, can you verify that it works alright on ppc?



-- 
Janne Blomqvist


Re: Avoid PR72749 by not using unspecs

2017-01-13 Thread Jeff Law

On 01/13/2017 04:50 AM, Alan Modra wrote:

Rather than using unspecs in doloop insns to stop combine creating
these insns, use legitimate_combined_insn.

I'm not sure why the original patch implementing
legitimate_combined_insn did not store the result of recog to the insn
but it seems good to me, and would allow the recog call in
ix86_legitimate_combined_insn to be omitted.  (I tested that too, not
shown here.)

Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux,
x86_64-linux.  OK for mainline?

PR target/72749
* combine.c (recog_for_combine_1): Set INSN_CODE before calling
target legitimate_combined_insn.
* config/rs6000/rs6000.c (TARGET_LEGITIMATE_COMBINED_INSN): Define.
(rs6000_legitimate_combined_insn): New function.
* config/rs6000/rs6000.md (UNSPEC_DOLOOP): Delete, and remove
all uses.
(ctr_internal3): Rename from *ctr_internal5.
(ctr_internal4): Rename from *ctr_internal6.
(ctr_internal1, ctr_internal2): Remove '*' from name.

OK on the combine.c changes.  ppc maintainers own the rest.

jeff



[PATCH] Remove unused include from Profile Mode header

2017-01-13 Thread Jonathan Wakely

This profile mode header includes , which includes
 and  in C++17 mode, and they try to include
the profile mode headers, and we end up in a cycle and so the content
of the headers is skipped by the header guards.

It doesn't look like profile mode needs  anyway, so we can
just remove it. In stage 1 I'll look at changing  to
include  and  directly,
skipping any Debug/Profile mode wrappers.

This fixes the last FAIL with --target_board=unix/-std=gnu++17 - yay!

* include/profile/base.h: Remove unused header that leads to header
cycle in C++17 mode.

Tested powerpc64le-linux, normal mode, profile mode, --disable-pch,
--target_board=unix/-std=gnu++17 and other variations too.

Committed to trunk.

commit 801c276e7b7dff4f4463b46c8ce1fe1ab22040b3
Author: Jonathan Wakely 
Date:   Fri Jan 13 15:51:06 2017 +

Remove unused include from Profile Mode header

* include/profile/base.h: Remove unused header that leads to header
cycle in C++17 mode.

diff --git a/libstdc++-v3/include/profile/base.h 
b/libstdc++-v3/include/profile/base.h
index e6ba43e..7e1a6a8 100644
--- a/libstdc++-v3/include/profile/base.h
+++ b/libstdc++-v3/include/profile/base.h
@@ -31,7 +31,6 @@
 #ifndef _GLIBCXX_PROFILE_BASE_H
 #define _GLIBCXX_PROFILE_BASE_H 1
 
-#include 
 #include 
 
 // Profiling mode namespaces.


Re: [PATCH] Do not declare artificial variables in tree-profile.c to have a definition (PR lto/69188).

2017-01-13 Thread Jeff Law

On 01/13/2017 08:08 AM, Martin Liška wrote:

Hello.

Nice example provided in the PR causes ICE as we have an artificial symbol
created in tree-profile.c once being removed by remove unreachable nodes (-O0)
and once not (-O1). Well, difference is in process_references where following 
hunk
prevent removal:

  || (((before_inlining_p
&& ((TREE_CODE (node->decl) != FUNCTION_DECL
 && optimize)

Anyway, these artificial symbols really should be just declarations as they are 
defined
in libgcov library.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin


0001-Do-not-declare-artificial-variables-in-tree-profile..patch


From 87262b1b60009381fd943fb433bc38f5d5685ac9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 13 Jan 2017 13:12:57 +0100
Subject: [PATCH] Do not declare artificial variables in tree-profile.c to have
 a definition (PR lto/69188).

gcc/testsuite/ChangeLog:

2017-01-13  Martin Liska  

PR lto/69188
* gcc.dg/lto/pr69188_0.c: New test.
* gcc.dg/lto/pr69188_1.c: New test.

gcc/ChangeLog:

2017-01-13  Martin Liska  

PR lto/69188
* tree-profile.c (init_ic_make_global_vars): Do not call
finalize_decl.
(gimple_init_gcov_profiler): Likewise.

OK.
jeff



Re: [PATCH 4/6] RISC-V Port: libsanitizer

2017-01-13 Thread Bernhard Reutner-Fischer
On 12 January 2017 03:30:36 CET, Palmer Dabbelt  wrote:

>--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
>+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
>@@ -64,7 +64,7 @@ namespace __sanitizer {
> 
>#if !defined(__powerpc64__) && !defined(__x86_64__) &&
>!defined(__aarch64__)\
>   && !defined(__mips__) && !defined(__s390__)\
>-&& !defined(__sparc__)
>+&& !defined(__sparc__) && &&
>!defined(__riscv)

Lots of ampersands.. how would that compile?


Re: [PATCH/AARCH64] Add scheduler for Thunderx2t99

2017-01-13 Thread James Greenhalgh
On Thu, Jan 12, 2017 at 03:43:52AM +, Hurugalawadi, Naveen wrote:
> Hi James,
> 
> The scheduling patch for vulcan was posted at the following link:-
> https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01205.html
> 
> We are working on the patch and addressed the comments for thunderx2t99.

Great, thanks.

> 
> >> I tried lowering the repeat expressions as so:
> Done.
> 
> >>split off the AdvSIMD/FP model from the main pipeline
> Done.
> 
> >> A change like wiring the vulcan_f0 and vulcan_f1 reservations
> >> to be cpu_units of a new define_automaton "vulcan_advsimd"
> Done.

Perfect, the automaton size is much more palatable now.

  Automaton `thunderx2t99'
  184 NDFA states,838 NDFA arcs
  184 DFA states, 838 DFA arcs
  184 minimal DFA states, 838 minimal DFA arcs
  370 all insns  8 insn equivalence classes
0 locked states
   1016 transition comb vector els,  1472 trans table els: use simple vect
   1472 min delay table els, compression factor 4

  Automaton `thunderx2t99_advsimd'
12231 NDFA states,  85833 NDFA arcs
12231 DFA states,   85833 DFA arcs
 9246 minimal DFA states,   66554 minimal DFA arcs
  370 all insns 11 insn equivalence classes
0 locked states
  84074 transition comb vector els, 101706 trans table els: use simple vect
  101706 min delay table els, compression factor 2

  Automaton `thunderx2t99_ldst'
   49 NDFA states,193 NDFA arcs
   49 DFA states, 193 DFA arcs
   16 minimal DFA states,  94 minimal DFA arcs
  370 all insns 13 insn equivalence classes
0 locked states
 91 transition comb vector els,   208 trans table els: use simple vect
208 min delay table els, compression factor 2

  Automaton `thunderx2t99_mult'
2 NDFA states,  5 NDFA arcs
2 DFA states,   5 DFA arcs
2 minimal DFA states,   5 minimal DFA arcs
  370 all insns  3 insn equivalence classes
0 locked states
6 transition comb vector els, 6 trans table els: use simple vect
6 min delay table els, compression factor 8

> 
> >> simplifying some of the remaining large expressions
> >> (vulcan_asimd_load*_mult, vulcan_asimd_load*_elts) can bring the size down
> Did not understand much about this comment.
> Can you please let me know about the simplification?

I wonder whether the current modeling of:

  (define_insn_reservation "thunderx2t99_asimd_load4_elts" 6

as:

  "thunderx2t99_ls_both*2,(thunderx2t99_ls0d1+thunderx2t99_ls1d1),\
   (thunderx2t99_ls0d2+thunderx2t99_ls1d2),\
   (thunderx2t99_ls0d3+thunderx2t99_ls1d3),thunderx2t99_f01")

Actually benefits the schedule in a meaningful way, or if it just increases
the size of the automaton. My guess is that given how many approximations
scheduling makes as to the actual working of the machine (e.g. always perfect
alignment, no cache misses, no other reason to restart loads) there is only
a very small benefit to accurately describing the flow of an instruction
through the pipelines in this way.

The size of the automaton is more reasonable now, so I won't insist on
changing it, but even with your changes the thunderx2t99 model is still
the largest automaton we're building.

> Please find attached the modified patch as per your suggestions and comments.
> Please review the patch and let us know if its okay?

I'd like for the size to come down again, or at least for you to comment on
whether the modelling you do for thunderx2t99_asimd_load*_mult and
thunderx2t99_asimd_load*_elts can be justified by an improved schedule.

> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index a7a4b33..4d39673 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -75,7 +75,7 @@ AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  
> AARCH64_FL_FOR_ARCH8, xge
>  
>  /* Broadcom ('B') cores. */
>  AARCH64_CORE("thunderx2t99",  thunderx2t99, cortexa57, 8_1A,  
> AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)

You'll want to update this to use your new scheduling model :-).

> -AARCH64_CORE("vulcan",  vulcan, cortexa57, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | 
> AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
> +AARCH64_CORE("vulcan",  vulcan, vulcan, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | 
> AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)

This line is incorrect, you should be changing vulcan to use the new
thunderx2t99 model. The tune attribute "vulcan" won't match your new model,
so with this change you'd get "generic" (i.e. no) scheduling for -mcpu=vulcan .

Otherwise, I don't have any concerns with this patch but it must be
retested as the new scheduling model can never be used with this patch
version.

Thanks,
James



Re: [C++ Patch] PR 71737

2017-01-13 Thread Jason Merrill
On Fri, Jan 13, 2017 at 11:42 AM, Paolo Carlini
 wrote:
> Hi,
>
> On 13/01/2017 15:51, Nathan Sidwell wrote:
>>
>> On 01/13/2017 09:45 AM, Paolo Carlini wrote:
>>>
>>> Hi,
>>>
>>> in this error recovery issue get_underlying_template crashes when
>>> TYPE_TEMPLATE_INFO_MAYBE_ALIAS is applied to a null orig_type. Simply
>>> checking for that condition appears to solve the issue in a
>>> straightforward way. Tested x86_64-linux.
>>
>>
>> Wouldn't it be better if a scrogged alias got error_mark_node as the
>> underlying type?  (I have no idea whether that's an easy thing to
>> accomplish)
>
> Your reply, Nathan, led me to investigate where exactly DECL_ORIGINAL_TYPE
> becomes null, and turns out that in tsubst_decl we have code actively doing
> that. That same code, a few lines below, only sets TYPE_DEPENDENT_P_VALID to
> false if type != error_mark_node. I cannot say to fully understand yet all
> the details, but certainly the patchlet below also passes testing. Do you
> have comments about this too?

The clearing of DECL_ORIGINAL_TYPE is to allow set_underlying_type to
then set it to something more appropriate.  That function currently
avoids setting DECL_ORIGINAL_TYPE to error_mark_node, perhaps that
should be changed.

Jason


[PATCH] PR65411 don't retry fclose on EINTR

2017-01-13 Thread Jonathan Wakely

After an interrupted fclose() we can't know if it's safe (or undefined
behaviour) to re-use the FILE*, so we shouldn't try calling fclose
again.

PR libstdc++/65411
* config/io/basic_file_stdio.cc (__basic_file::close()): Don't
retry fclose on EINTR.

Tested x86_64-linux, committed to trunk.

commit 852ab4a7f7619036ddf2d7263a40368c351ff19c
Author: Jonathan Wakely 
Date:   Fri Jan 13 17:25:06 2017 +

PR65411 don't retry fclose on EINTR

PR libstdc++/65411
* config/io/basic_file_stdio.cc (__basic_file::close()): Don't
retry fclose on EINTR.

diff --git a/libstdc++-v3/config/io/basic_file_stdio.cc 
b/libstdc++-v3/config/io/basic_file_stdio.cc
index a0ad82c..e736701 100644
--- a/libstdc++-v3/config/io/basic_file_stdio.cc
+++ b/libstdc++-v3/config/io/basic_file_stdio.cc
@@ -267,16 +267,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
int __err = 0;
if (_M_cfile_created)
- {
-   // In general, no need to zero errno in advance if checking
-   // for error first. However, C89/C99 (at variance with IEEE
-   // 1003.1, f.i.) do not mandate that fclose must set errno
-   // upon error.
-   errno = 0;
-   do
- __err = fclose(_M_cfile);
-   while (__err && errno == EINTR);
- }
+ __err = fclose(_M_cfile);
_M_cfile = 0;
if (!__err)
  __ret = this;


Re: [PATCH][AArch64 - v3] Simplify eh_return implementation

2017-01-13 Thread James Greenhalgh
On Fri, Sep 02, 2016 at 11:31:04AM +, Wilco Dijkstra wrote:
> Ramana Radhakrishnan wrote:
> > Can you please file a PR for this and add some testcases ?  This sounds 
> > like a serious enough problem that needs to be looked at probably going 
> > back since the dawn of time.
> 
> I've created PR77455. Updated patch below:
> 
> This patch simplifies the handling of the EH return value.  We force the use 
> of the
> frame pointer so the return location is always at FP + 8.  This means we can 
> emit
> a simple volatile access in EH_RETURN_HANDLER_RTX without needing md
> patterns, splitters and frame offset calculations.  The new implementation 
> also
> fixes various bugs in aarch64_final_eh_return_addr, which does not work with
> -fomit-frame-pointer, alloca or outgoing arguments.

I've been putting off reviewing this patch for a while now, because I don't
understand enough about the current eh_return code to understand why what
you're proposing is correct.

The best way to progress this patch would be to go in to more detail as to
what the current code does, why we don't need it, and why your new
implementation is sufficient. The current code looks like it covers special
cases, and calls out DSE and CSELIB as needing special handling - your
new code doesn't.

Could you explain your patch to me assuming I know very little about the
implementation, with particular focus on why we no longer need the special
cases?

Thanks,
James

> 
> Bootstrap OK, GCC Regression OK, OK for trunk? Would it be useful to backport
> this to GCC6.x?
> 
> ChangeLog:
> 
> 2016-09-02  Wilco Dijkstra  
> 
>   PR77455
> gcc/
>   * config/aarch64/aarch64.md (eh_return): Remove pattern and splitter.
>   * config/aarch64/aarch64.h (AARCH64_EH_STACKADJ_REGNUM): Remove.
>   (EH_RETURN_HANDLER_RTX): New define.
>   * config/aarch64/aarch64.c (aarch64_frame_pointer_required):
>   Force frame pointer in EH return functions.
>   (aarch64_expand_epilogue): Add barrier for eh_return.
>   (aarch64_final_eh_return_addr): Remove.
>   (aarch64_eh_return_handler_rtx): New function.
>   * config/aarch64/aarch64-protos.h (aarch64_final_eh_return_addr):
>   Remove.
>   (aarch64_eh_return_handler_rtx): New prototype.
> 
> testsuite/
>   * gcc.target/aarch64/eh_return.c: New test.
> --
 


Re: [2/5][AArch64] Generate dwarf information for -msign-return-address

2017-01-13 Thread Jiong Wang

On 13/01/17 16:09, Richard Earnshaw (lists) wrote:

On 06/01/17 11:47, Jiong Wang wrote:


This patch is an update on DWARF generation for return address signing.

According to new proposal, we simply needs to generate REG_CFA_WINDOW_SAVE
annotation.

gcc/

2017-01-06  Jiong Wang  

 * config/aarch64/aarch64.c (aarch64_expand_prologue): Generate
dwarf
 annotation (REG_CFA_WINDOW_SAVE) for return address signing.
 (aarch64_expand_epilogue): Likewise.



I don't think we should be overloading REG_CFA_WINDOW_SAVE internally in
the compiler -- it's one thing to do it in the dwarf output tables, but
quite another to be doing it elsewhere in the compiler.

Instead we should create a new reg note kind and use that, but in the
final dwarf output then emit the overloaded opcode.


I can see the reason for doing this is if you want to seperate the interpretion
of GCC CFA reg-note and the final DWARF CFA operation.  My understanding is all
reg notes defined in gcc/reg-note.def should have general meaning, even the
CFA_WINDOW_SAVE.  For those which are architecture specific we might need a
mechanism to define them in backend only.
   
For general reg-notes in gcc/reg-note.def, they are not always have the

corresponding standard DWARF CFA operation, for example CFA_WINDOW_SAVE,
therefore if we want to achieve what you described, I think we also need to
define a new target hook which maps a GCC CFA reg-note into architecture DWARF
CFA operation.

Regards,
Jiong




Re: [driver, doc] Support escaping special characters in specs

2017-01-13 Thread Joseph Myers
On Fri, 13 Jan 2017, Rainer Orth wrote:

> I'm unsure if the patch is large enough to need a copyright assignment
> (in which case it's almost certainly too late for GCC 7), and even if
> not if it's appropriate at this point in the release cycle.

I think it's big enough to need an assignment.

Note missing spaces before '(' in calls to free, and after ')' in a cast.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Allow building GCC with PTX offloading even without CUDA being installed (gcc and nvptx-tools patches)

2017-01-13 Thread Jakub Jelinek
Hi!

This is something that has been discussed already during the last Cauldron.
Especially for distributions it is undesirable to need to have proprietary
CUDA libraries and headers installed when building GCC.

These two patches allow building GCC without CUDA around in a way that later
on can offload to PTX if libcuda.so.1 is installed (and the NVidia kernel
driver is installed, haven't tried if it works with nouveau, nor tried some
free CUDA replacements).  This is important because the former step can be
done when building the distribution packages, while the latter is a decision
of the user.  If the nvptx libgomp plugin is installed, but libcuda.so.1
can't be found, then the plugin behaves as if there are no PTX devices
available.  In order to configure gcc to load libcuda.so.1 dynamically,
one has to either configure it --without-cuda-driver, or without
--with-cuda-driver=/--with-cuda-driver-lib=/--with-cuda-driver-include=
options if cuda.h and -lcuda aren't found in the default locations.

I've talked to our lawyers and they said that the cuda.h header included
in this patch doesn't infringe anyone's copyright or is otherwise a fair
use, it has been created by gathering all the cu*/CU* symbols from the
current and older nvptx plugin and some oacc tests, then stubbing the
pointer-ish typedefs, grabing most enum values and function prototypes from
https://raw.githubusercontent.com/shinpei0208/gdev/master/cuda/driver/cuda.h
and verifying assembly with that header against assembly when compiled
against NVidia's cuda.h.

The nvptx-tools change to the nvptx-none-as binary is an important part of
this, although it is not a change to gcc itself - the problem is that by
default nvptx-none-as was calling the ptxas program to verify the assembly
is correct, which of course doesn't work very well when the proprietary
ptxas is not available.  So the patch makes it invoke ptxas always only if
a new --verify option is used, if --no-verify is used, then as before it
is not invoked, and without either of these options the behavior is that if
ptxas is found in $PATH, then it invokes it, if not, it does only minimal
verification good enough for gcc/configure purposes (it turned out to be
sufficient to error out if .version directive is not the first non-comment
token (ptxas errors on that too).

Tested on x86_64-linux, with CUDA around
(--with-cuda-driver=/usr/local/cuda-8.0) as well as without, and tested
in that case also both with libcuda.so.1 available and without.

Can the OpenACC hackers as well as Alex (or his collegues) please also test
it?  Do you have any problems with the GCC patch (if not, I'd commit it
next week before stage3 closes)?  Is the nvptx-tools patch ok (and if so,
can you commit it; I guess I could create a github pull request for this
if needed).

P.S.: not sure what is the cuInit call in nvptx_init good for, doesn't
libgomp always call nvptx_get_num_devices first and thus call cuInit already
there (but I've kept it in the patch)?

2017-01-13  Jakub Jelinek  

* plugin/configfrag.ac: For --without-cuda-driver don't initialize
CUDA_DRIVER_INCLUDE nor CUDA_DRIVER_LIB.  If both
CUDA_DRIVER_INCLUDE and CUDA_DRIVER_LIB are empty and linking small
cuda program fails, define PLUGIN_NVPTX_DYNAMIC to 1 and use
plugin/include/cuda as include dir and -ldl instead of -lcuda as
library to link ptx plugin against.
* plugin/plugin-nvptx.c: Include dlfcn.h if PLUGIN_NVPTX_DYNAMIC.
(CUDA_CALLS): Define.
(cuda_lib, cuda_lib_inited): New variables.
(init_cuda_lib): New function.
(CUDA_CALL_PREFIX): Define.
(CUDA_CALL_ERET, CUDA_CALL_ASSERT): Use CUDA_CALL_PREFIX.
(CUDA_CALL): Use FN instead of (FN).
(CUDA_CALL_NOCHECK): Define.
(cuda_error, fini_streams_for_device, select_stream_for_async,
nvptx_attach_host_thread_to_device, nvptx_open_device, link_ptx,
event_gc, nvptx_exec, nvptx_async_test, nvptx_async_test_all,
nvptx_wait_all, nvptx_set_clocktick, GOMP_OFFLOAD_unload_image,
nvptx_stacks_alloc, nvptx_stacks_free, GOMP_OFFLOAD_run): Use
CUDA_CALL_NOCHECK.
(nvptx_init): Call init_cuda_lib, if it fails, return false.  Use
CUDA_CALL_NOCHECK.
(nvptx_get_num_devices): Call init_cuda_lib, if it fails, return 0.
Use CUDA_CALL_NOCHECK.
* plugin/cuda/cuda.h: New file.
* config.h.in: Regenerated.
* configure: Regenerated.
* Makefile.in: Regenerated.

--- libgomp/plugin/configfrag.ac.jj 2017-01-13 12:07:56.0 +0100
+++ libgomp/plugin/configfrag.ac2017-01-13 17:33:26.608240936 +0100
@@ -58,10 +58,12 @@ AC_ARG_WITH(cuda-driver-include,
 AC_ARG_WITH(cuda-driver-lib,
[AS_HELP_STRING([--with-cuda-driver-lib=PATH],
[specify directory for the installed CUDA driver library])])
-if test "x$with_cuda_driver" != x; then
-  CUDA_DRIVER_INCLUDE=$with_cuda_driver/include

Re: [driver, doc] Support escaping special characters in specs

2017-01-13 Thread Rainer Orth
Hi Joseph,

> On Fri, 13 Jan 2017, Rainer Orth wrote:
>
>> I'm unsure if the patch is large enough to need a copyright assignment
>> (in which case it's almost certainly too late for GCC 7), and even if
>> not if it's appropriate at this point in the release cycle.
>
> I think it's big enough to need an assignment.

ok, then I'll get the ball rolling.  Are the forms on

https://www.gnu.org/software/gnulib/Copyright/

current?  I didn't have to deal with the copyright assignment process so
far.

> Note missing spaces before '(' in calls to free, and after ')' in a cast.

Thanks, will fix.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [driver, doc] Support escaping special characters in specs

2017-01-13 Thread Joseph Myers
On Fri, 13 Jan 2017, Rainer Orth wrote:

> ok, then I'll get the ball rolling.  Are the forms on
> 
>   https://www.gnu.org/software/gnulib/Copyright/
> 
> current?  I didn't have to deal with the copyright assignment process so
> far.

As far as I know they are (the place I refer people to is actually gnulib 
git rather than the gnulib website).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, GCC/testsuite/ARM, ping] Skip optional_mthumb tests if GCC has a default mode

2017-01-13 Thread Thomas Preudhomme

Ping ARM maintainers? (target independent part ACKed by Jeff)

Best regards,

Thomas

On 03/01/17 17:19, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 12/12/16 17:52, Thomas Preudhomme wrote:

Hi,

The logic to make -mthumb optional for Thumb-only devices is only executed when
no -marm or -mthumb is given on the command-line. This includes configuring GCC
with --with-mode= because this makes the option to be passed before any others.
The optional_mthumb-* testcases are skipped when -marm or -mthumb is passed on
the command line but not when GCC was configured with --with-mode. Not only are
the tests meaningless in these configurations, they also spuriously FAIL if
--with-mode=arm was used since the test are built for Thumb-only devices, as
reported by Christophe Lyon in [1].

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html

This patch adds logic to target-support.exp to check how was GCC configured and
changes the optional_mthumb testcases to be skipped if there is a default mode
(--with-mode=). It also fixes a couple of typo on the selectors.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2016-12-09  Thomas Preud'homme  

* lib/target-supports.exp (check_configured_with): New procedure.
(check_effective_target_default_mode): new effective target.
* gcc.target/arm/optional_thumb-1.c: Skip if GCC was configured with a
default mode.  Fix dg-skip-if target selector syntax.
* gcc.target/arm/optional_thumb-2.c: Likewise.
* gcc.target/arm/optional_thumb-3.c: Fix dg-skip-if target selector
syntax.


Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
index 23df62887ba4aaa1d8717a34ecda9a40246f0552..99cb0c3f33b601fff4493feef72765f7590e18f6 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
@@ -1,5 +1,5 @@
-/* { dg-do compile } */
-/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
+/* { dg-do compile { target { ! default_mode } } } */
+/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
 /* { dg-options "-march=armv6-m" } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
index 4bd53a45eca97e62dd3b86d5a1a66c5ca21e7aad..280dfb3fec55570b6cfe934303c9bd3d50322b86 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
@@ -1,5 +1,5 @@
-/* { dg-do compile } */
-/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
+/* { dg-do compile { target { ! default_mode } } } */
+/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
 /* { dg-options "-mcpu=cortex-m4" } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
index f1fd5c8840b191e600c20a7817c611bb9bb645df..d9150e09e475dfbeb7b0b0c153c913b1ad6f0777 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
@@ -1,8 +1,8 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
-/* { dg-skip-if "-mthumb given" { *-*-*} { "-mthumb" } } */
+/* { dg-skip-if "-mthumb given" { *-*-* } { "-mthumb" } } */
 /* { dg-options "-marm" } */
-/* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-*} 0 } */
+/* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-* } 0 } */
 
 /* Check that -marm gives an error when compiling for a Thumb-only target.  */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 0fc0bafa67d8d34ec74ce2d1d7a2323a375615cc..f7511ef3aebca72c798496fb95ce43fcbbc08ed1 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -252,6 +252,20 @@ proc check_runtime {prop args} {
 }]
 }
 
+# Return 1 if GCC was configured with $pattern.
+proc check_configured_with { pattern } {
+global tool
+
+set gcc_output [${tool}_target_compile "-v" "" "none" ""]
+if { [ regexp "Configured with: \[^\n\]*$pattern" $gcc_output ] } {
+verbose "Matched: $pattern" 2
+return 1
+}
+
+verbose "Failed to match: $pattern" 2
+return 0
+}
+
 ###
 # proc check_weak_available { }
 ###
@@ -3797,6 +3811,12 @@ proc add_options_for_arm_arch_v7ve { flags } {
 return "$flags -march=armv7ve"
 }
 
+# 

Re: [PATCH, GCC/LRA, gcc-5/6-branch] Fix PR78617: Fix conflict detection in rematerialization

2017-01-13 Thread Thomas Preudhomme

Ping? I'm not sure if an ok from Valdimir is enough or if I also need RM 
approval.

Best regards,

Thomas

On 09/01/17 09:51, Thomas Preudhomme wrote:

Hi,

Is it ok to backport the fix for PR78617 (incorrect conflict detection in
rematerialization) to GCC 5 and GCC 6? The patch applies cleanly and the
testsuite showed no regression when performed with the following configurations:

- an arm-none-eabi GCC cross-compiler targeting Cortex-M0 and Cortex-M3
- a bootstrapped arm-linux-gnueabihf GCC native compiler
- a bootstrapped x86_64-linux-gnu GCC native compiler

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-01-03 Thomas Preud'homme 

Backport from mainline
2016-12-07 Thomas Preud'homme 

PR rtl-optimization/78617
* lra-remat.c (do_remat): Initialize live_hard_regs from live in
registers, also setting hard registers mapped to pseudo registers.


*** gcc/testsuite/ChangeLog ***


2017-01-03 Thomas Preud'homme 

Backport from mainline
2016-12-07 Thomas Preud'homme 

PR rtl-optimization/78617
* gcc.c-torture/execute/pr78617.c: New test.


Best regards,

Thomas
diff --git a/gcc/lra-remat.c b/gcc/lra-remat.c
index 5e5d62c50b011fe53c5652a4406d711feb448885..17da91b7f2144b2eaf48ce13f547239013c6e7c3 100644
--- a/gcc/lra-remat.c
+++ b/gcc/lra-remat.c
@@ -1124,6 +1124,7 @@ update_scratch_ops (rtx_insn *remat_insn)
 static bool
 do_remat (void)
 {
+  unsigned regno;
   rtx_insn *insn;
   basic_block bb;
   bitmap_head avail_cands;
@@ -1131,12 +1132,21 @@ do_remat (void)
   bool changed_p = false;
   /* Living hard regs and hard registers of living pseudos.  */
   HARD_REG_SET live_hard_regs;
+  bitmap_iterator bi;
 
   bitmap_initialize (&avail_cands, ®_obstack);
   bitmap_initialize (&active_cands, ®_obstack);
   FOR_EACH_BB_FN (bb, cfun)
 {
-  REG_SET_TO_HARD_REG_SET (live_hard_regs, df_get_live_out (bb));
+  CLEAR_HARD_REG_SET (live_hard_regs);
+  EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), 0, regno, bi)
+	{
+	  int hard_regno = regno < FIRST_PSEUDO_REGISTER
+			   ? regno
+			   : reg_renumber[regno];
+	  if (hard_regno >= 0)
+	SET_HARD_REG_BIT (live_hard_regs, hard_regno);
+	}
   bitmap_and (&avail_cands, &get_remat_bb_data (bb)->avin_cands,
 		  &get_remat_bb_data (bb)->livein_cands);
   /* Activating insns are always in the same block as their corresponding
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78617.c b/gcc/testsuite/gcc.c-torture/execute/pr78617.c
new file mode 100644
index ..89c4f6dea8cb507b963f91debb94cbe16eb1db90
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr78617.c
@@ -0,0 +1,25 @@
+int a = 0;
+int d = 1;
+int f = 1;
+
+int fn1() {
+  return a || 1 >> a;
+}
+
+int fn2(int p1, int p2) {
+  return p2 >= 2 ? p1 : p1 >> 1;
+}
+
+int fn3(int p1) {
+  return d ^ p1;
+}
+
+int fn4(int p1, int p2) {
+  return fn3(!d > fn2((f = fn1() - 1000) || p2, p1));
+}
+
+int main() {
+  if (fn4(0, 0) != 1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/lra-remat.c b/gcc/lra-remat.c
index 187ee3e7752d1ebe15ba8e8014620c0a94e11424..79504d4eb1a052d906a69178d847e6e618b468ec 100644
--- a/gcc/lra-remat.c
+++ b/gcc/lra-remat.c
@@ -1116,6 +1116,7 @@ update_scratch_ops (rtx_insn *remat_insn)
 static bool
 do_remat (void)
 {
+  unsigned regno;
   rtx_insn *insn;
   basic_block bb;
   bitmap_head avail_cands;
@@ -1123,12 +1124,21 @@ do_remat (void)
   bool changed_p = false;
   /* Living hard regs and hard registers of living pseudos.  */
   HARD_REG_SET live_hard_regs;
+  bitmap_iterator bi;
 
   bitmap_initialize (&avail_cands, ®_obstack);
   bitmap_initialize (&active_cands, ®_obstack);
   FOR_EACH_BB_FN (bb, cfun)
 {
-  REG_SET_TO_HARD_REG_SET (live_hard_regs, df_get_live_out (bb));
+  CLEAR_HARD_REG_SET (live_hard_regs);
+  EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), 0, regno, bi)
+	{
+	  int hard_regno = regno < FIRST_PSEUDO_REGISTER
+			   ? regno
+			   : reg_renumber[regno];
+	  if (hard_regno >= 0)
+	SET_HARD_REG_BIT (live_hard_regs, hard_regno);
+	}
   bitmap_and (&avail_cands, &get_remat_bb_data (bb)->avin_cands,
 		  &get_remat_bb_data (bb)->livein_cands);
   /* Activating insns are always in the same block as their corresponding
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78617.c b/gcc/testsuite/gcc.c-torture/execute/pr78617.c
new file mode 100644
index ..89c4f6dea8cb507b963f91debb94cbe16eb1db90
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr78617.c
@@ -0,0 +1,25 @@
+int a = 0;
+int d = 1;
+int f = 1;
+
+int fn1() {
+  return a || 1 >> a;
+}
+
+int fn2(int p1, int p2) {
+  return p2 >= 2 ? p1 : p1 >> 1;
+}
+
+int fn3(int p1) {
+  return d ^ p1;
+}
+
+int fn4(int p1, int p2) {
+  return fn3(!d > fn2((f = fn1() - 1000) || p2, p1));
+}
+
+int main() {
+  if (fn4(0, 0) != 1)
+__builtin_abort ();
+  return 0;
+}


Re: [PATCH] Allow building GCC with PTX offloading even without CUDA being installed (gcc and nvptx-tools patches)

2017-01-13 Thread Joseph Myers
On Fri, 13 Jan 2017, Jakub Jelinek wrote:

> --- libgomp/plugin/cuda/cuda.h.jj 2017-01-13 15:58:00.966544147 +0100
> +++ libgomp/plugin/cuda/cuda.h2017-01-13 17:02:47.355817896 +0100
> @@ -0,0 +1,174 @@
> +/* CUDA API description.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.

The new file should presumably have the runtime license exception.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, rs6000] Add vec_nabs builtin support

2017-01-13 Thread Carl E. Love
Segar:

The issues you pointed out below have been addressed in the following
updated patch.  Please let me know if the changes are acceptable.
Thanks for your help and feedback.

   Carl Love
> rs6000-c.c
> 
> > vector signed char vec_nabs (vector signed char)
> > vector signed short vec_nabs (vector signed short)
> > vector signed int vec_nabs (vector signed int)
> > vector signed long long vec_nabs (vector signed long long)
> > vector float vec_nabs (vector float)
> > vector double vec_nabs (vector double)
> 
> You should mention the name of the function or data etc. you modified here:

>   rs6000-c.c (altivec_overloaded_builtins): Blabla.
> 
> or something like that.
> 
> > * config/rs6000/rs6000-builtin.def: Add definitions for NABS functions
> > and NABS overload.
> > * config/rs6000/altivec.md: Add define to expand nabs2 types
> > * config/rs6000/altivec.h: Add define for vec_nabs built-in function.
> > * doc/extend.texi: Update the built-in documentation file for the
> > new built-in functions.
> 
> Here, too.
> 
> > +  int i, n_elt = GET_MODE_NUNITS (mode);
> 
> Two lines for this please, two separate declarations.  I realise you just
> copied this code ;-)
--

gcc/ChangeLog:

2017-01-08  Carl Love  

* config/rs6000/rs6000-c (altivec_overloaded_builtins): Add support
for built-in functions
vector signed char vec_nabs (vector signed char)
vector signed short vec_nabs (vector signed short)
vector signed int vec_nabs (vector signed int)
vector signed long long vec_nabs (vector signed long long)
vector float vec_nabs (vector float)
vector double vec_nabs (vector double)
* config/rs6000/rs6000-builtin.def: Add definitions for NABS functions
and NABS overload.
* config/rs6000/altivec.md: Add define to expand nabs2 types
* config/rs6000/altivec.h: Add define for vec_nabs built-in function.
* doc/extend.texi (section 6.60.22 PowerPC AltiVec Built-in Functions):
Update the documentation file for the new built-in functions.

gcc/testsuite/ChangeLog:

2017-01-08  Carl Love  

* gcc.target/powerpc/builtins-3.c: Add tests for the new built-ins
to the test suite file.
* gcc.target/powerpc/builtins-3-p8.c: Add tests for the new built-ins
to the test suite file.
---
 gcc/config/rs6000/altivec.h  |  1 +
 gcc/config/rs6000/altivec.md | 27 ++
 gcc/config/rs6000/rs6000-builtin.def |  9 +
 gcc/config/rs6000/rs6000-c.c | 12 ++
 gcc/doc/extend.texi  |  8 
 gcc/testsuite/gcc.target/powerpc/builtins-3-p8.c | 12 +-
 gcc/testsuite/gcc.target/powerpc/builtins-3.c| 47 +++-
 7 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 73567ff..17bc33e 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -189,6 +189,7 @@
 #define vec_vupklsh __builtin_vec_vupklsh
 #define vec_vupklsb __builtin_vec_vupklsb
 #define vec_abs __builtin_vec_abs
+#define vec_nabs __builtin_vec_nabs
 #define vec_abss __builtin_vec_abss
 #define vec_add __builtin_vec_add
 #define vec_adds __builtin_vec_adds
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index c2063d5..2a26007 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2741,6 +2741,33 @@
 })
 
 ;; Generate
+;;vspltisw SCRATCH1,0
+;;vsubu?m SCRATCH2,SCRATCH1,%1
+;;vmins? %0,%1,SCRATCH2"
+(define_expand "nabs2"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4)
+(minus:VI2 (match_dup 2)
+  (match_operand:VI2 1 "register_operand" "v")))
+   (set (match_operand:VI2 0 "register_operand" "=v")
+(smin:VI2 (match_dup 1) (match_dup 4)))]
+  ""
+{
+  int i;
+  int n_elt = GET_MODE_NUNITS (mode);
+
+  rtvec v = rtvec_alloc (n_elt);
+
+  /* Create an all 0 constant.  */
+  for (i = 0; i < n_elt; ++i)
+RTVEC_ELT (v, i) = const0_rtx;
+
+  operands[2] = gen_reg_rtx (mode);
+  operands[3] = gen_rtx_CONST_VECTOR (mode, v);
+  operands[4] = gen_reg_rtx (mode);
+})
+
+;; Generate
 ;;vspltisw SCRATCH1,-1
 ;;vslw SCRATCH2,SCRATCH1,SCRATCH1
 ;;vandc %0,%1,SCRATCH2
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 2329c1f..1cdf9a8 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1129,6 +1129,14 @@ BU_ALTIVEC_A (ABSS_V4SI,  "abss_v4si",   SAT,
altivec_abss_v4si)
 BU_ALTIVEC_A (ABSS_V8HI,  "abss_v8hi", SAT,altivec_abss_v8hi)
 BU_ALTIVEC_A (ABSS_V16QI, "abss_v16qi",SAT,altivec_abss_v16qi)
 
+/* Altivec NABS functions.  */
+BU_ALTIVEC_

Re: [PATCH] Allow building GCC with PTX offloading even without CUDA being installed (gcc and nvptx-tools patches)

2017-01-13 Thread Jakub Jelinek
On Fri, Jan 13, 2017 at 06:19:02PM +, Joseph Myers wrote:
> > --- libgomp/plugin/cuda/cuda.h.jj   2017-01-13 15:58:00.966544147 +0100
> > +++ libgomp/plugin/cuda/cuda.h  2017-01-13 17:02:47.355817896 +0100
> > @@ -0,0 +1,174 @@
> > +/* CUDA API description.
> > +   Copyright (C) 2017 Free Software Foundation, Inc.
> > +
> > +This file is part of GCC.
> > +
> > +GCC is free software; you can redistribute it and/or modify
> > +it under the terms of the GNU General Public License as published by
> > +the Free Software Foundation; either version 3, or (at your option)
> > +any later version.
> > +
> > +GCC is distributed in the hope that it will be useful,
> > +but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +GNU General Public License for more details.
> > +
> > +You should have received a copy of the GNU General Public License
> > +along with GCC; see the file COPYING3.  If not see
> > +.
> 
> The new file should presumably have the runtime license exception.

Agreed (though, most likely the file isn't copyrightable anyway, but
we use copyright boilerplate for various files that might not be
copyrightable).  But we should use it not just for cuda.h, but also
for hsa.h and hsa_ext_finalize.h (CCing Martin who has added those).

2017-01-13  Jakub Jelinek  

* plugin/cuda/cuda.h: Add GCC runtime library exception.
* plugin/hsa.h: Likewise.
* plugin/hsa_ext_finalize.h: Likewise.

--- libgomp/plugin/cuda/cuda.h.jj   2017-01-13 17:02:47.0 +0100
+++ libgomp/plugin/cuda/cuda.h  2017-01-13 19:21:06.307547518 +0100
@@ -13,8 +13,13 @@ but WITHOUT ANY WARRANTY; without even t
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .
 
 This header provides the minimum amount of typedefs, enums and function
--- libgomp/plugin/hsa.h.jj 2017-01-13 12:07:56.0 +0100
+++ libgomp/plugin/hsa.h2017-01-13 19:21:37.230153569 +0100
@@ -13,8 +13,13 @@ but WITHOUT ANY WARRANTY; without even t
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .
 
 The contents of the file was created by extracting data structures, enum,
--- libgomp/plugin/hsa_ext_finalize.h.jj2017-01-13 12:07:56.0 
+0100
+++ libgomp/plugin/hsa_ext_finalize.h   2017-01-13 19:22:05.388794833 +0100
@@ -13,8 +13,13 @@ but WITHOUT ANY WARRANTY; without even t
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .
 
 The contents of the file was created by extracting data structures, enum,


Jakub


Re: [PATCH, ARM] correctly encode the CC reg data flow

2017-01-13 Thread Bernd Edlinger
On 01/13/17 17:10, Bernd Edlinger wrote:
> On 01/13/17 14:50, Richard Earnshaw (lists) wrote:
>> On 18/12/16 12:58, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this is related to PR77308, the follow-up patch will depend on this one.
>>>
>>> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned
>>> before reload, a mis-compilation in libgcc function __gnu_satfractdasq
>>> was discovered, see [1] for more details.
>>>
>>> The reason seems to be that when the *arm_cmpdi_insn is directly
>>> followed by a *arm_cmpdi_unsigned instruction, both are split
>>> up into this:
>>>
>>>[(set (reg:CC CC_REGNUM)
>>>  (compare:CC (match_dup 0) (match_dup 1)))
>>> (parallel [(set (reg:CC CC_REGNUM)
>>> (compare:CC (match_dup 3) (match_dup 4)))
>>>(set (match_dup 2)
>>> (minus:SI (match_dup 5)
>>>  (ltu:SI (reg:CC_C CC_REGNUM) (const_int
>>> 0])]
>>>
>>>[(set (reg:CC CC_REGNUM)
>>>  (compare:CC (match_dup 2) (match_dup 3)))
>>> (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0))
>>>(set (reg:CC CC_REGNUM)
>>> (compare:CC (match_dup 0) (match_dup 1]
>>>
>>> The problem is that the reg:CC from the *subsi3_carryin_compare
>>> is not mentioning that the reg:CC is also dependent on the reg:CC
>>> from before.  Therefore the *arm_cmpsi_insn appears to be
>>> redundant and thus got removed, because the data values are identical.
>>>
>>> I think that applies to a number of similar pattern where data
>>> flow is happening through the CC reg.
>>>
>>> So this is a kind of correctness issue, and should be fixed
>>> independently from the optimization issue PR77308.
>>>
>>> Therefore I think the patterns need to specify the true
>>> value that will be in the CC reg, in order for cse to
>>> know what the instructions are really doing.
>>>
>>>
>>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>>> Is it OK for trunk?
>>>
>>
>> I agree you've found a valid problem here, but I have some issues with
>> the patch itself.
>>
>>
>> (define_insn_and_split "subdi3_compare1"
>>   [(set (reg:CC_NCV CC_REGNUM)
>> (compare:CC_NCV
>>   (match_operand:DI 1 "register_operand" "r")
>>   (match_operand:DI 2 "register_operand" "r")))
>>(set (match_operand:DI 0 "register_operand" "=&r")
>> (minus:DI (match_dup 1) (match_dup 2)))]
>>   "TARGET_32BIT"
>>   "#"
>>   "&& reload_completed"
>>   [(parallel [(set (reg:CC CC_REGNUM)
>>(compare:CC (match_dup 1) (match_dup 2)))
>>   (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
>>(parallel [(set (reg:CC_C CC_REGNUM)
>>(compare:CC_C
>>  (zero_extend:DI (match_dup 4))
>>  (plus:DI (zero_extend:DI (match_dup 5))
>>   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
>>   (set (match_dup 3)
>>(minus:SI (minus:SI (match_dup 4) (match_dup 5))
>>  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0])]
>>
>>
>> This pattern is now no-longer self consistent in that before the split
>> the overall result for the condition register is in mode CC_NCV, but
>> afterwards it is just CC_C.
>>
>> I think CC_NCV is correct mode (the N, C and V bits all correctly
>> reflect the result of the 64-bit comparison), but that then implies that
>> the cc mode of subsi3_carryin_compare is incorrect as well and should in
>> fact also be CC_NCV.  Thinking about this pattern, I'm inclined to agree
>> that CC_NCV is the correct mode for this operation
>>
>> I'm not sure if there are other consequences that will fall out from
>> fixing this (it's possible that we might need a change to select_cc_mode
>> as well).
>>
>
> Yes, this is still a bit awkward...
>
> The N and V bit will be the correct result for the subdi3_compare1
> a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...)
> only gets the C bit correct, the expression for N and V is a different
> one.
>
> It probably works, because the subsi3_carryin_compare instruction sets
> more CC bits than the pattern does explicitly specify the value.
> We know the subsi3_carryin_compare also computes the NV bits, but it is
> hard to write down the correct rtl expression for it.
>
> In theory the pattern should describe everything correctly,
> maybe, like:
>
> set (reg:CC_C CC_REGNUM)
> (compare:CC_C
>   (zero_extend:DI (match_dup 4))
>   (plus:DI (zero_extend:DI (match_dup 5))
>(ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
> set (reg:CC_NV CC_REGNUM)
> (compare:CC_NV
>  (match_dup 4))
>  (plus:SI (match_dup 5) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))
> set (match_dup 3)
> (minus:SI (minus:SI (match_dup 4) (match_dup 5))
>   (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)
>
>
> But I doubt that will work to set CC_REGNUM with two different modes
> in parallel?
>
> Another idea would be to invent a CC_CNV_NOOV mode, that implic

Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2017-01-13 Thread Aldy Hernandez

[Sorry for the delay, I was sick.]

On 01/09/2017 04:30 AM, Richard Biener wrote:

On Sat, Jan 7, 2017 at 1:54 PM, Aldy Hernandez  wrote:

On 01/04/2017 07:11 AM, Richard Biener wrote:


On Tue, Jan 3, 2017 at 6:36 PM, Aldy Hernandez  wrote:


On 12/20/2016 09:16 AM, Richard Biener wrote:



On Fri, Dec 16, 2016 at 3:41 PM, Aldy Hernandez 
wrote:



Hi folks.

This is a follow-up on Jeff and Richi's interaction on the
aforementioned
PR
here:

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01397.html

I decided to explore the idea of analyzing may-undefness on-demand,
which
actually looks rather cheap.

BTW, I don't understand why we don't have auto_bitmap's, as we already
have
auto_sbitmap's.  I've implemented the former based on auto_sbitmap's
code
we
already have.

The attached patch fixes the bug without introducing any regressions.

I also tested the patch by compiling 242 .ii files with -O3.  These
were
gathered from a stage1 build with -save-temps.  There is a slight time
degradation of 4 seconds within 27 minutes of user time:

tainted:26:52
orig:   26:48

This was the average aggregate time of two runs compiling all 242 .ii
files.
IMO, this looks reasonable.  It is after all, -O3.Is it acceptable?




+  while (!worklist.is_empty ())
+{
+  name = worklist.pop ();
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  if (ssa_undefined_value_p (name, true))
+   return true;
+
+  bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));

it should be already set as we use visited_ssa as "was it ever on the
worklist",
so maybe renaming it would be a good thing as well.




I don't understand what you would prefer here.



Set the bit when you put name on the worklist (and do not do that if the
bit is set).  Thus simply remove the above and add a bitmap_set_bit
for the initial name you put on the worklist.



+ if (TREE_CODE (name) == SSA_NAME)
+   {
+ /* If an SSA has already been seen, this may be a
loop.
+Fail conservatively.  */
+ if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION
(name)))
+   return false;

so to me "conservative" is returning true, not false.




OK



+ bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
+ worklist.safe_push (name);

but for loops we can just continue and ignore this use.  And
bitmap_set_bit
returns whether it set a bit, thus

if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION
(name)))
  worklist.safe_push (name);

should work?




Fixed.



+  /* Check that any SSA names used to define NAME is also fully
+defined.  */
+  use_operand_p use_p;
+  ssa_op_iter iter;
+  FOR_EACH_SSA_USE_OPERAND (use_p, def, iter, SSA_OP_USE)
+   {
+ name = USE_FROM_PTR (use_p);
+ if (TREE_CODE (name) == SSA_NAME)

always true.

+   {
+ /* If an SSA has already been seen, this may be a loop.
+Fail conservatively.  */
+ if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
+   return false;
+ bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
+ worklist.safe_push (name);

See above.

In principle the thing is sound but I'd like to be able to pass in a
bitmap of
known maybe-undefined/must-defined SSA names to have a cache for
multiple invocations from within a pass (like this unswitching case).




Done, though bitmaps are now calculated as part of the instantiation.



Also once you hit defs that are in a post-dominated region of the loop
entry
you can treat them as not undefined (as their use invokes undefined
behavior).  This is also how you treat function parameters (well,
ssa_undefined_value_p does), where the call site invokes undefined
behavior
when passing in undefined values.  So we need an extra parameter
specifying
the post-dominance region.




Done.



You do not handle memory or calls conservatively which means the
existing
testcase only needs some obfuscation to become a problem again.  To fix
that before /* Check that any SSA names used to define NAME is also
fully
defined.  */ bail out conservatively, like

   if (! is_gimple_assign (def)
  || gimple_assign_single_p (def))
return true;




As I asked previously, I understand the !is_gimple_assign, which will
skip
over GIMPLE_CALLs setting a value, but the "gimple_assign_single_p(def)
==
true"??

Won't this last one bail on things like e.3_7 = e, or x_77 = y_88? Don't
we
want to follow up the def chain precisely on these?

The attached implementation uses a cache, and a pre-computed
post-dominance
region.  A previous incantation of this patch (sans the post-dominance
stuff) had performance within the noise of the previous implementation.

I am testing again, and will do some performance comparisons in a bit,
but
for now-- are we on the same page here?  Is this what yo

Re: [PATCH] adding missing LTO to some warning options (PR 78606)

2017-01-13 Thread Andreas Schwab
On Jan 10 2017, Martin Sebor  wrote:

> Index: gcc/testsuite/gcc.dg/pr78768.c
> ===
> --- gcc/testsuite/gcc.dg/pr78768.c(revision 0)
> +++ gcc/testsuite/gcc.dg/pr78768.c(working copy)
> @@ -0,0 +1,13 @@
> +/* PR c/78768 - -Walloca-larger-than and -Wformat-length warnings disabled
> +   by -flto
> +  { dg-do run }
> +  { dg-options "-O2 -Walloca-larger-than=10 -Wformat -Wformat-length -flto" 
> } */
> +
> +int main (void)
> +{
> +  char *d = (char *)__builtin_alloca (12);  /* { dg-warning "argument to 
> .alloca. is too large" } */
> +
> +  __builtin_sprintf (d, "%32s", "x");   /* { dg-warning "directive writing 
> 32 bytes into a region of size 12" "-Wformat-length" { xfail *-*-* } } */
> +
> +  return 0;
> +}

Why is that a run test?  It cannot be usefully executed.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


  1   2   >