[PR57371] transform (double)i eq/ne 0 to i eq/ne 0

2016-08-03 Thread Prathamesh Kulkarni
Hi,
The attached patch tries to transform
(double)i eq/ne 0 to i eq/ne 0
AFAIU from Joseph's comment 1 in PR, the transform should be safe with
-fno-trapping-math ?
Bootstrap+tested on x86_64-unknown-linux-gnu in progress.

Thanks,
Prathamesh
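
For illustration only, here is a small sketch of the equivalence the
pattern relies on. The helper names are invented for this example and
are not part of the patch; it assumes a 32-bit int and an IEEE double,
so every int value is exactly representable after conversion.

```cpp
#include <climits>

// Hypothetical helpers, not from the patch: the comparison before and
// after the proposed simplification.
inline bool ne_via_double(int i) { return (double)i != 0; }
inline bool ne_via_int(int i)    { return i != 0; }

// Returns true if both forms agree on a handful of boundary values.
inline bool forms_agree()
{
  const int samples[] = { INT_MIN, -1, 0, 1, INT_MAX };
  for (int i : samples)
    if (ne_via_double(i) != ne_via_int(i))
      return false;
  return true;
}
```

This only samples a few values; it says nothing about trapping
behaviour, which is what -fno-trapping-math is about.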
2016-08-03  Prathamesh Kulkarni  

PR tree-optimization/57371
* match.pd ((double) i eq/ne 0 -> i eq/ne 0): New pattern.

testsuite/
* gcc.dg/pr57371-1.c: New test-case.
* gcc.dg/pr57371-2.c: Likewise.

diff --git a/gcc/match.pd b/gcc/match.pd
index 2380d90..63be2e9 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2611,6 +2611,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
{ constant_boolean_node (cmp == ORDERED_EXPR || cmp == LTGT_EXPR
? false : true, type); })))
 
+/* PR57371: Transform (double)i eq/ne 0 to i eq/ne 0.  */
+(for cmp (ne eq)
+ (simplify
+  (cmp (float @0) real_zerop@1)
+   (if (!flag_trapping_math && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(cmp @0 { build_zero_cst (TREE_TYPE (@0)); }))))
+
 /* bool_var != 0 becomes bool_var.  */
 (simplify
  (ne @0 integer_zerop)
diff --git a/gcc/testsuite/gcc.dg/pr57371-1.c b/gcc/testsuite/gcc.dg/pr57371-1.c
new file mode 100644
index 000..fd15509
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr57371-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-trapping-math -fdump-tree-gimple" } */
+
+int f1(int i)
+{
+  return (double)i != 0;
+}
+
+int f2(int i)
+{
+  return (double)i == 0;
+}
+
+/* { dg-final { scan-tree-dump-times "double" 0 "gimple" } } */
diff --git a/gcc/testsuite/gcc.dg/pr57371-2.c b/gcc/testsuite/gcc.dg/pr57371-2.c
new file mode 100644
index 000..e19d054
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr57371-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-trapping-math -fdump-tree-forwprop-details" } */
+
+int f1(int i)
+{
+  double x = (double) i;
+  return x != 0.0;
+}
+
+int f2(int i)
+{
+  double x = (double) i;
+  return x == 0.0;
+}
+
+/* { dg-final { scan-tree-dump "i_\[0-9\]*\\(D\\) != 0" "forwprop1" } } */
+/* { dg-final { scan-tree-dump "i_\[0-9\]*\\(D\\) == 0" "forwprop1" } } */


Re: [PR57371] transform (double)i eq/ne 0 to i eq/ne 0

2016-08-03 Thread Georg-Johann Lay

On 03.08.2016 09:53, Prathamesh Kulkarni wrote:

Hi,
The attached patch tries to transform
(double)i eq/ne 0 to i eq/ne 0
AFAIU from Joseph's comment 1 in PR, the transform should be safe with
-fno-trapping-math ?


What about signed zeroes?

Johann
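
One way to look at the signed-zero question, assuming IEEE 754
arithmetic, is sketched below. The function names are made up for this
example. It only shows that converting integer 0 gives +0.0 and that
-0.0 compares equal to +0.0; it does not settle what the match.pd
pattern itself should check.

```cpp
#include <cmath>

// Hypothetical check, not from the thread: integer-to-double conversion
// of 0 yields a zero whose sign bit is clear.
inline bool zero_converts_to_positive_zero()
{
  int i = 0;
  double d = (double)i;
  return d == 0.0 && !std::signbit(d);
}

// -0.0 has its sign bit set but still compares equal to +0.0.
inline bool negative_zero_compares_equal()
{
  double nz = -0.0;
  return nz == 0.0 && std::signbit(nz);
}
```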


Bootstrap+tested on x86_64-unknown-linux-gnu in progress.

Thanks,
Prathamesh





Re: [PATCH RFC] do not throw in std::make_exception_ptr

2016-08-03 Thread Jonathan Wakely

On 28/07/16 10:20 +0300, Gleb Natapov wrote:

[resent with hopefully correct libstdc++ mailing list address this time]

Here is my attempt to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68297. The resulting patch
is a little bit long because I had to split  and cxxabi.h


"A little bit", yes ;-)

A changelog would help review it, because it's not clear what has
moved to where. You've split files, but not said what parts end up
where (which obviously can be seen by the patch, but it would be
easier with a summary in ChangeLog form).


include files. The former had to be split due to circular dependency
that formed after including  in exception_ptr.h and the latter
is because of the inability to include cxxabi.h in exception_ptr.h because
it makes libstdc++/30586 reappear.




diff --git a/libstdc++-v3/libsupc++/eh_throw.cc 
b/libstdc++-v3/libsupc++/eh_throw.cc
index 9aac218..b368f8a 100644
--- a/libstdc++-v3/libsupc++/eh_throw.cc
+++ b/libstdc++-v3/libsupc++/eh_throw.cc
@@ -55,6 +55,22 @@ __gxx_exception_cleanup (_Unwind_Reason_Code code, 
_Unwind_Exception *exc)
#endif
}

+extern "C" __cxa_refcounted_exception*
+__cxxabiv1::__cxa_init_primary_exception(void *obj, const std::type_info 
*tinfo,
+ void (_GLIBCXX_CDTOR_CALLABI *dest) 
(void *))
+{
+  __cxa_refcounted_exception *header
+= __get_refcounted_exception_header_from_obj (obj);
+  header->referenceCount = 0;
+  header->exc.exceptionType = tinfo;
+  header->exc.exceptionDestructor = dest;
+  header->exc.unexpectedHandler = std::get_unexpected ();
+  header->exc.terminateHandler = std::get_terminate ();
+  __GXX_INIT_PRIMARY_EXCEPTION_CLASS(header->exc.unwindHeader.exception_class);
+  header->exc.unwindHeader.exception_cleanup = __gxx_exception_cleanup;
+
+  return header;
+}


I'd like to see any additions like this function discussed on the C++
ABI list, so we at least have an idea whether other vendors would
consider implementing it too.





extern "C" void
__cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
@@ -64,17 +80,10 @@ __cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,

  __cxa_eh_globals *globals = __cxa_get_globals ();
  globals->uncaughtExceptions += 1;
-
  // Definitely a primary.
-  __cxa_refcounted_exception *header
-= __get_refcounted_exception_header_from_obj (obj);
+  __cxa_refcounted_exception *header =
+__cxa_init_primary_exception(obj, tinfo, dest);
  header->referenceCount = 1;
-  header->exc.exceptionType = tinfo;
-  header->exc.exceptionDestructor = dest;
-  header->exc.unexpectedHandler = std::get_unexpected ();
-  header->exc.terminateHandler = std::get_terminate ();
-  __GXX_INIT_PRIMARY_EXCEPTION_CLASS(header->exc.unwindHeader.exception_class);
-  header->exc.unwindHeader.exception_cleanup = __gxx_exception_cleanup;

#ifdef __USING_SJLJ_EXCEPTIONS__
  _Unwind_SjLj_RaiseException (&header->exc.unwindHeader);
diff --git a/libstdc++-v3/libsupc++/exception b/libstdc++-v3/libsupc++/exception
index 63631f6..6f8b596 100644
--- a/libstdc++-v3/libsupc++/exception
+++ b/libstdc++-v3/libsupc++/exception
@@ -34,135 +34,7 @@

#pragma GCC visibility push(default)

-#include 
-#include 
-
-extern "C++" {
-
-namespace std
-{
-  /**
-   * @defgroup exceptions Exceptions
-   * @ingroup diagnostics
-   *
-   * Classes and functions for reporting errors via exception classes.
-   * @{
-   */
-
-  /**
-   *  @brief Base class for all library exceptions.
-   *
-   *  This is the base class for all exceptions thrown by the standard
-   *  library, and by certain language expressions.  You are free to derive
-   *  your own %exception classes, or use a different hierarchy, or to
-   *  throw non-class data (e.g., fundamental types).
-   */
-  class exception
-  {
-  public:
-exception() _GLIBCXX_USE_NOEXCEPT { }
-virtual ~exception() _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
-
-/** Returns a C-style character string describing the general cause
- *  of the current error.  */
-virtual const char*
-what() const _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
-  };
-
-  /** If an %exception is thrown which is not listed in a function's
-   *  %exception specification, one of these may be thrown.  */
-  class bad_exception : public exception
-  {
-  public:
-bad_exception() _GLIBCXX_USE_NOEXCEPT { }
-
-// This declaration is not useless:
-// http://gcc.gnu.org/onlinedocs/gcc-3.0.2/gcc_6.html#SEC118
-virtual ~bad_exception() _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
-
-// See comment in eh_exception.cc.
-virtual const char*
-what() const _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
-  };



Does bad_exception need to move to  ?



-  /// If you write a replacement %terminate handler, it must be of this type.
-  typedef void (*terminate_handler) ();
-
-  /// If you write a replacement %unexpected handler, it must be of this type.
-  typedef void (*unexpected_handler) ();


These typedefs are certainly needed in  because

Re: [PATCH RFC] do not throw in std::make_exception_ptr

2016-08-03 Thread Jonathan Wakely

On 03/08/16 10:48 +0100, Jonathan Wakely wrote:

On 28/07/16 10:20 +0300, Gleb Natapov wrote:

+extern "C" __cxa_refcounted_exception*
+__cxxabiv1::__cxa_init_primary_exception(void *obj, const std::type_info 
*tinfo,
+ void (_GLIBCXX_CDTOR_CALLABI *dest) 
(void *))
+{
+  __cxa_refcounted_exception *header
+= __get_refcounted_exception_header_from_obj (obj);
+  header->referenceCount = 0;
+  header->exc.exceptionType = tinfo;
+  header->exc.exceptionDestructor = dest;
+  header->exc.unexpectedHandler = std::get_unexpected ();
+  header->exc.terminateHandler = std::get_terminate ();
+  __GXX_INIT_PRIMARY_EXCEPTION_CLASS(header->exc.unwindHeader.exception_class);
+  header->exc.unwindHeader.exception_cleanup = __gxx_exception_cleanup;
+
+  return header;
+}


I'd like to see any additions like this function discussed on the C++
ABI list, so we at least have an idea whether other vendors would
consider implementing it too.


Oops, I meant to delete that comment. Please ignore it!

You're only suggesting a new function in the "GNU extensions" part of
our header, and it's only needed in our make_exception_ptr function,
not required by any of the actual runtime files in libsupc++. So
there's no need for it to be common to other implementations of the
ABI.




[PATCH] Do not enable -fprefetch-loop-arrays with -fprofile-use (PR gcov-profile/58250).

2016-08-03 Thread Martin Liška
Hi.

I've just grabbed a patch that was suggested in the PR (and IMHO makes sense).

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From e75e997f05d547017b3a962069fa4d1b024420cd Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 2 Aug 2016 09:21:22 +0200
Subject: [PATCH] Do not enable -fprefetch-loop-arrays with -fprofile-use (PR
 gcov-profile/58250).

gcc/ChangeLog:

2016-08-02  Martin Liska  

	* config/i386/i386.c (ix86_option_override_internal):
	Do not enable -fprefetch-loop-arrays with -fprofile-use.
---
 gcc/config/i386/i386.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7c8bb17..91cea25 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5847,7 +5847,7 @@ ix86_option_override_internal (bool main_args_p,
   /* Enable sw prefetching at -O3 for CPUS that prefetching is helpful.  */
   if (opts->x_flag_prefetch_loop_arrays < 0
   && HAVE_prefetch
-  && (opts->x_optimize >= 3 || opts->x_flag_profile_use)
+  && opts->x_optimize >= 3
   && !opts->x_optimize_size
   && TARGET_SOFTWARE_PREFETCHING_BENEFICIAL)
 opts->x_flag_prefetch_loop_arrays = 1;
-- 
2.9.2



Re: [PATCH][RFC] PR middle-end/22141 GIMPLE store widening pass

2016-08-03 Thread Kyrill Tkachov

Hi Richard,

On 18/07/16 13:22, Richard Biener wrote:




+  /* Record the original statements so that we can keep track of
+statements emitted in this pass and not re-process new
+statements.  */
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple *stmt = gsi_stmt (gsi);
+ if (!is_gimple_debug (stmt))
+   orig_stmts.add (stmt);
+ num_statements++;
+   }

please use gimple_set_visited () instead, that should be cheaper.


+  do
+   {
+ changes_made = false;
+ for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
...
+   }
+  while (changes_made);

looks pretty quadratic to me.  Instead of tracking things with m_curr_base_expr
why not use a hash-map to track stores related to a base?


I've implemented this scheme but I'm having trouble making it work.
In particular I have a hash_map keyed on a 'tree' that is the base
object (as extracted by get_inner_reference) but I can't get the hash_map
to properly extract the already recorded stores to the same base.
For example for the simple code:
struct bar {
  int a;
  char b;
  char c;
  char d;
  char e;
  char f;
  char g;
};

void
foo1 (struct bar *p)
{
  p->b = 0;
  p->a = 0;
  p->c = 0;
  p->d = 0;
  p->e = 0;
}

As we can see, the stores are all to the same object and should
be recognised as such.

The base of the first store is recorded as:

and for the second store as 
where the dumps of the two mem_refs are identical except for that first
hex number (their address in memory?)
In my first version of the patch I compare these with operand_equal_p and that
detects that they are the same, but in the hash_map they are not detected
as equal. Is there some special hashing function I must specify?

Thanks,
Kyrill
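
The symptom described above is what one would expect if the map hashes
its key by pointer identity while operand_equal_p compares structure.
The sketch below is plain C++ using std::unordered_map, not GCC's
hash_map, and Expr, ExprHash and ExprEqual are invented for
illustration. Whether GCC's hash_map behaves this way for 'tree' keys is
an assumption here, not something confirmed in this thread.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a base expression; this is not GCC's tree.
struct Expr {
  std::string base;   // e.g. the pointer the store goes through
  long offset;
};

// Structural hash and equality, playing the role that operand_equal_p
// played in the first version of the patch.
struct ExprHash {
  std::size_t operator()(const Expr *e) const {
    return std::hash<std::string>()(e->base)
           ^ (std::hash<long>()(e->offset) << 1);
  }
};
struct ExprEqual {
  bool operator()(const Expr *a, const Expr *b) const {
    return a->base == b->base && a->offset == b->offset;
  }
};

// Number of distinct keys when the map uses the default pointer hash.
inline std::size_t keys_by_pointer(const Expr *a, const Expr *b) {
  std::unordered_map<const Expr *, int> m;
  m[a]++;
  m[b]++;
  return m.size();
}

// Number of distinct keys when the map hashes and compares structure.
inline std::size_t keys_by_structure(const Expr *a, const Expr *b) {
  std::unordered_map<const Expr *, int, ExprHash, ExprEqual> m;
  m[a]++;
  m[b]++;
  return m.size();
}
```

Two separately allocated nodes with the same contents land in different
buckets in the first map and in the same bucket in the second.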


Re: [PATCH RFC] do not throw in std::make_exception_ptr

2016-08-03 Thread Gleb Natapov
On Wed, Aug 03, 2016 at 10:48:27AM +0100, Jonathan Wakely wrote:
> On 28/07/16 10:20 +0300, Gleb Natapov wrote:
> > [resent with hopefully correct libstdc++ mailing list address this time]
> > 
> > Here is my attempt to fix
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68297. The resulting patch
> > is a little bit long because I had to split  and cxxabi.h
> 
> "A little bit", yes ;-)
> 
> A changelog would help review it, because it's not clear what has
> moved to where. You've split files, but not said what parts end up
> where (which obviously can be seen by the patch, but it would be
> easier with a summary in ChangeLog form).
> 
Will do for next submission.

> > include files. The former had to be split due to circular dependency
> > that formed after including  in exception_ptr.h and the later
> > is because of inability to include cxxabi.h in exception_ptr.h because
> > it makes libstdc++/30586 to reappear again.
> 
> 
> > diff --git a/libstdc++-v3/libsupc++/eh_throw.cc 
> > b/libstdc++-v3/libsupc++/eh_throw.cc
> > index 9aac218..b368f8a 100644
> > --- a/libstdc++-v3/libsupc++/eh_throw.cc
> > +++ b/libstdc++-v3/libsupc++/eh_throw.cc
> > @@ -55,6 +55,22 @@ __gxx_exception_cleanup (_Unwind_Reason_Code code, 
> > _Unwind_Exception *exc)
> > #endif
> > }
> > 
> > +extern "C" __cxa_refcounted_exception*
> > +__cxxabiv1::__cxa_init_primary_exception(void *obj, const std::type_info 
> > *tinfo,
> > + void (_GLIBCXX_CDTOR_CALLABI 
> > *dest) (void *))
> > +{
> > +  __cxa_refcounted_exception *header
> > += __get_refcounted_exception_header_from_obj (obj);
> > +  header->referenceCount = 0;
> > +  header->exc.exceptionType = tinfo;
> > +  header->exc.exceptionDestructor = dest;
> > +  header->exc.unexpectedHandler = std::get_unexpected ();
> > +  header->exc.terminateHandler = std::get_terminate ();
> > +  
> > __GXX_INIT_PRIMARY_EXCEPTION_CLASS(header->exc.unwindHeader.exception_class);
> > +  header->exc.unwindHeader.exception_cleanup = __gxx_exception_cleanup;
> > +
> > +  return header;
> > +}
> 
> I'd like to see any additions like this function discussed on the C++
> ABI list, so we at least have an idea whether other vendors would
> consider implementing it too.
> 
> 
> 
> > 
> > extern "C" void
> > __cxxabiv1::__cxa_throw (void *obj, std::type_info *tinfo,
> > @@ -64,17 +80,10 @@ __cxxabiv1::__cxa_throw (void *obj, std::type_info 
> > *tinfo,
> > 
> >   __cxa_eh_globals *globals = __cxa_get_globals ();
> >   globals->uncaughtExceptions += 1;
> > -
> >   // Definitely a primary.
> > -  __cxa_refcounted_exception *header
> > -= __get_refcounted_exception_header_from_obj (obj);
> > +  __cxa_refcounted_exception *header =
> > +__cxa_init_primary_exception(obj, tinfo, dest);
> >   header->referenceCount = 1;
> > -  header->exc.exceptionType = tinfo;
> > -  header->exc.exceptionDestructor = dest;
> > -  header->exc.unexpectedHandler = std::get_unexpected ();
> > -  header->exc.terminateHandler = std::get_terminate ();
> > -  
> > __GXX_INIT_PRIMARY_EXCEPTION_CLASS(header->exc.unwindHeader.exception_class);
> > -  header->exc.unwindHeader.exception_cleanup = __gxx_exception_cleanup;
> > 
> > #ifdef __USING_SJLJ_EXCEPTIONS__
> >   _Unwind_SjLj_RaiseException (&header->exc.unwindHeader);
> > diff --git a/libstdc++-v3/libsupc++/exception 
> > b/libstdc++-v3/libsupc++/exception
> > index 63631f6..6f8b596 100644
> > --- a/libstdc++-v3/libsupc++/exception
> > +++ b/libstdc++-v3/libsupc++/exception
> > @@ -34,135 +34,7 @@
> > 
> > #pragma GCC visibility push(default)
> > 
> > -#include 
> > -#include 
> > -
> > -extern "C++" {
> > -
> > -namespace std
> > -{
> > -  /**
> > -   * @defgroup exceptions Exceptions
> > -   * @ingroup diagnostics
> > -   *
> > -   * Classes and functions for reporting errors via exception classes.
> > -   * @{
> > -   */
> > -
> > -  /**
> > -   *  @brief Base class for all library exceptions.
> > -   *
> > -   *  This is the base class for all exceptions thrown by the standard
> > -   *  library, and by certain language expressions.  You are free to derive
> > -   *  your own %exception classes, or use a different hierarchy, or to
> > -   *  throw non-class data (e.g., fundamental types).
> > -   */
> > -  class exception
> > -  {
> > -  public:
> > -exception() _GLIBCXX_USE_NOEXCEPT { }
> > -virtual ~exception() _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
> > -
> > -/** Returns a C-style character string describing the general cause
> > - *  of the current error.  */
> > -virtual const char*
> > -what() const _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
> > -  };
> > -
> > -  /** If an %exception is thrown which is not listed in a function's
> > -   *  %exception specification, one of these may be thrown.  */
> > -  class bad_exception : public exception
> > -  {
> > -  public:
> > -bad_exception() _GLIBCXX_USE_NOEXCEPT { }
> > -
> > -// This declaration is not useless:
> > -// http://gcc.gnu.o

Re: [ARM] FP16 ARM Alternative format variants of AAPCS tests.

2016-08-03 Thread Ramana Radhakrishnan
On Mon, Jun 27, 2016 at 11:09 AM, Matthew Wahab
 wrote:
> Hello,
>
> Tests added for FP16 argument and return values being passed in
> registers only check the case when the FP16 IEEE format is used. This
> patch adds equivalent tests that also check the behaviour when the
> ARM Alternative format is used.
>
> This patch depends on the testsuite directives added for the FP16 aapcs
> tests at https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01794.html.
>
> Tested arm-none-eabi with cross-compiled make check-gcc and
> arm-none-linux-gnueabihf with native make check.
>
> Ok for trunk?
> Matthew
>
> testsuite/
> 2016-06-27  Matthew Wahab  
>
> * gcc.target/arm/fp16-aapcs-3.c: New.
> * gcc.target/arm/fp16-aapcs-4.c: New.
> * gcc.target/arm/aapcs/aapcs/vfp22.c: New.
> * gcc.target/arm/aapcs/aapcs/vfp23.c: New.
> * gcc.target/arm/aapcs/aapcs/vfp24.c: New.
> * gcc.target/arm/aapcs/aapcs/vfp25.c: New.

OK once the pre-reqs are in place.

Thanks,
Ramana
>


Re: [PATCH RFC] do not throw in std::make_exception_ptr

2016-08-03 Thread Jonathan Wakely

On 03/08/16 14:26 +0300, Gleb Natapov wrote:

On Wed, Aug 03, 2016 at 10:48:27AM +0100, Jonathan Wakely wrote:

Does bad_exception need to move to  ?



I think only std::exception is really needed by . When I
did the header split I went with the "move as much as possible" approach,
not the other way around. You seem to suggest the opposite approach. I'll
try it.


Yes, that way  and  are as small as possible, and
only declare what they need.

Code that wants std::bad_exception or std::set_unexpected() should
include , as before.


> swap(exception_ptr& __lhs, exception_ptr& __rhs)
> { __lhs.swap(__rhs); }
>
> -  } // namespace __exception_ptr
> +template
> +  static void dest_thunk(void* x) {
> +  reinterpret_cast<_Ex*>(x)->_Ex::~_Ex();
> +  }

This isn't a name reserved for implementors, it needs to be uglified,
e.g. __dest_thunk.


OK.


This function should be declared 'inline' too.


We take a pointer to it. How can it be inline?


Well it certainly *can* be, declaring it inline doesn't mean you can't
take its address.

Currently it's 'static' which means every translation unit that calls
make_exception_ptr will instantiate a different copy of the function
template. That produces more object code than needed, so it should not
be static. It's a tiny function defined in a header, so should be
'inline'.

Practically speaking, making it 'inline' doesn't make much difference,
because an instantiated function template will produce a weak symbol
anyway, which has the same effect as declaring it inline. But there's
no reason _not_ to declare it inline.
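
To illustrate the point about taking the address of an inline function,
here is a minimal sketch. Widget, destroy_thunk, get_thunk and
run_thunk_once are names invented for this example; the thunk is only
modelled on the one in the patch.

```cpp
#include <new>

// Hypothetical type whose destructor has an observable effect.
struct Widget {
  int *counter;
  ~Widget() { ++*counter; }
};

// An inline function template; its address can still be taken.
template<typename T>
  inline void destroy_thunk(void *p)
  { static_cast<T *>(p)->~T(); }

typedef void (*thunk_t)(void *);

inline thunk_t get_thunk()
{ return &destroy_thunk<Widget>; }

// Builds a Widget in raw storage, destroys it through the function
// pointer, and returns how many times the destructor ran.
inline int run_thunk_once()
{
  int n = 0;
  alignas(Widget) unsigned char buf[sizeof(Widget)];
  Widget *w = new (buf) Widget{&n};
  get_thunk()(w);
  return n;
}
```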



> +  } // namespace __exception_ptr
>
>   /// Obtain an exception_ptr pointing to a copy of the supplied object.
>   template
> @@ -173,7 +184,15 @@ namespace std
> #if __cpp_exceptions
>   try
>{
> -throw __ex;
> +#if __cpp_rtti && !_GLIBCXX_HAVE_CDTOR_CALLABI
> +  void *e = __cxxabiv1::__cxa_allocate_exception(sizeof(_Ex));

Again, 'e' isn't a reserved name.

It is a local variable, why should it be reserved?


Because otherwise this valid C++ program won't compile:

#define e 2.71828
#include 
int main() { }
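
A compilable variant of the same idea is sketched below, with all names
invented for the example. A local called 'e' inside the header-like
function would be rewritten by the user's macro and fail to compile, so
the sketch uses a differently named local in place of a reserved name
such as __ptr, which only the implementation may use.

```cpp
// A user is allowed to define this before including a standard header.
#define e 2.71828

// Stand-in for code in a library header. Writing `int e = v;` here would
// expand to `int 2.71828 = v;`, so the local gets a name the user's
// macro cannot touch.
inline int header_like_increment(int v)
{
  int tmp_ = v;
  return tmp_ + 1;
}

// User code keeps using the macro as intended.
inline double user_value() { return e; }

#undef e
```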






> +  (void)__cxxabiv1::__cxa_init_primary_exception(e, &typeid(__ex),
> +   __exception_ptr::dest_thunk<_Ex>);
> +  new (e) _Ex(__ex);

If the copy constructor of _Ex throws an exception should we call
std::terminate here?

That's what would have happened previously, I believe.


I do not think so. throw compiles to something like:

 __cxa_allocate_exception
 call move_or_copy_constructor
 __cxa_throw

If move_or_copy_constructor throws the code does not terminate, but
catch() gets different exception instead.


I don't think we want to catch that exception and store it
in the exception_ptr in place of the __ex object we were asked to
store.


I wrote a test program below to check current behaviour and this is what code
does now.

#include <exception>
#include <iostream>

struct E {
   E()  {}
   E(const E&) { throw 5; }
};

int main() {
   auto x = std::make_exception_ptr(E());
   try {
   std::rethrow_exception(x);
   } catch(E& ep) {
   std::cout << "E" << std::endl;
   } catch (int& i) {
   std::cout << "int" << std::endl;
   }
}


Huh. If I'm reading the ABI spec correctly, we should terminate if the
copy constructor throws.

We don't seem to do that even without exception_ptr involved:

#include <exception>
#include <iostream>

struct E {
   E()  {}
   E(const E&) { throw 5; }
};

int main() {
   static E e;
   try {
   throw e;
   } catch(E& ep) {
   std::cout << "E" << std::endl;
   } catch (int& i) {
   std::cout << "int" << std::endl;
   }
}


So on that basis, do we need the try/catch around your new code?

Can we just do:

 template<typename _Ex>
   exception_ptr
   make_exception_ptr(_Ex __ex) _GLIBCXX_USE_NOEXCEPT
   {
#if __cpp_exceptions
# if __cpp_rtti && !_GLIBCXX_HAVE_CDTOR_CALLABI
 void *__ptr = __cxxabiv1::__cxa_allocate_exception(sizeof(_Ex));
 (void)__cxxabiv1::__cxa_init_primary_exception(__ptr, &typeid(__ex),
  __exception_ptr::__dest_thunk<_Ex>);
 new (__ptr) _Ex(__ex);
# else
 try
{
  throw __ex;
}
 catch(...)
{
  return current_exception();
}
# endif
#else
 return exception_ptr();
#endif
   }

The noexcept spec will cause it to terminate if the copy constructor
of _Ex throws.


> +  return exception_ptr(e);
> +#else
> +  throw __ex;
> +#endif
>}
>   catch(...)
>{


> diff --git a/libstdc++-v3/libsupc++/unwind-cxx.h 
b/libstdc++-v3/libsupc++/unwind-cxx.h
> index 9121934..11da4a7 100644
> --- a/libstdc++-v3/libsupc++/unwind-cxx.h
> +++ b/libstdc++-v3/libsupc++/unwind-cxx.h
> @@ -31,7 +31,7 @@
> // Level 2: C++ ABI
>
> #include 
> -#include 
> +#include 
> #include 
> #include "unwind.h"
> #include 
> @@ -62,7 +62,7 @@ namespace __cxxabiv1
> struct __cxa_exception
> {
>   // Manage the exception object itself.

Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-08-03 Thread Ramana Radhakrishnan
On Thu, Jul 28, 2016 at 12:37 PM, Ramana Radhakrishnan
 wrote:
> On Mon, Jul 4, 2016 at 3:02 PM, Matthew Wahab
>  wrote:
>> On 19/05/16 15:54, Matthew Wahab wrote:
>>> On 18/05/16 16:20, Joseph Myers wrote:
>>>> On Wed, 18 May 2016, Matthew Wahab wrote:
>>>>
>>>> In short: instructions for direct HFmode arithmetic should be described
>>>> with patterns with the standard names.  It's the job of the
>>>> architecture-independent compiler to ensure that fp16 arithmetic in the
>>>> user's source code only generates direct fp16 arithmetic in GIMPLE (and
>>>> thus ends up using those patterns) if that is a correct representation of
>>>> the source code's semantics according to ACLE.
>>>>
>>>> The intrinsics you provide can then be written to use direct arithmetic,
>>>> and rely on convert_to_real_1 eliminating the promotions, rather than
>>>> needing built-in functions at all, just like many arm_neon.h intrinsics
>>>> make direct use of GNU C vector arithmetic.
>>>
>>> I think it's clear that this has exhausted my knowledge of FP semantics.
>>>
>>> Forcing promotion to single-precision was to settle concerns brought up in
>>> internal discussions about __fp16 semantics. I'll see if anybody has any
>>> problem with the changes you suggest.
>>
>> This patch changes the implementation to use the standard names for the
>> HFmode arithmetic. Later patches will also be updated to use the
>> arithmetic operators where appropriate.
>>
>> Changes since the last version of this patch:
>> - The standard names for plus, minus, mult, div and fma are defined for
>>   HF mode.
>> - The patterns supporting the new ACLE intrinsics vnegh_f16, vaddh_f16,
>>   vsubh_f16, vmulh_f16 and vdivh_f16 are removed, the arithmetic
>>   operators will be used instead.
>> - The tests are updated to expect f16 instructions rather than the f32
>>   instructions that were previously emitted.
>>
>> Tested the series for arm-none-linux-gnueabihf with native bootstrap and
>> make check and for arm-none-eabi and armeb-none-eabi with make check on
>> an ARMv8.2-A emulator.
>
>
> All fine except -
>
> Why can we not extend the  and the l in
> vfp.md for fp16 and avoid all the unspecs for vcvta and vrnd*
> instructions ?
>

I now feel reasonably convinced that these can go away and be replaced
by extending the  and l expanders to
consider FP16 as well. Given that we are still only in the middle of
stage1 - I'm ok for you to apply this as is and then follow-up with a
patch that gets rid of the UNSPECs. If this holds for add, sub and
other patterns I don't see why it wouldn't hold for all these patterns
as well.

Joseph, do you have any opinions on whether we should be extending the
standard pattern names or not for btrunc, ceil, round, floor,
nearbyint, rint, lround, lfloor and lceil optabs for the HFmode
quantities ?

Thanks,
Ramana

> Ramana
>
>
>
>
>>
>> Ok for trunk?
>> Matthew
>>
>> 2016-07-04  Matthew Wahab  
>>
>> * config/arm/iterators.md (Code iterators): Fix some white-space
>> in the comments.
>> (GLTE): New.
>> (ABSNEG): New
>> (FCVT): Moved from vfp.md.
>> (VCVT_HF_US_N): New.
>> (VCVT_SI_US_N): New.
>> (VCVT_HF_US): New.
>> (VCVTH_US): New.
>> (FP16_RND): New.
>> (absneg_str): New.
>> (FCVTI32typename): Moved from vfp.md.
>> (sup): Add UNSPEC_VCVTA_S, UNSPEC_VCVTA_U, UNSPEC_VCVTM_S,
>> UNSPEC_VCVTM_U, UNSPEC_VCVTN_S, UNSPEC_VCVTN_U, UNSPEC_VCVTP_S,
>> UNSPEC_VCVTP_U, UNSPEC_VCVT_HF_S_N, UNSPEC_VCVT_HF_U_N,
>> UNSPEC_VCVT_SI_S_N, UNSPEC_VCVT_SI_U_N,  UNSPEC_VCVTH_S_N,
>> UNSPEC_VCVTH_U_N, UNSPEC_VCVTH_S and UNSPEC_VCVTH_U.
>>
>> (vcvth_op): New.
>> (fp16_rnd_str): New.
>> (fp16_rnd_insn): New.
>
>
>> * config/arm/unspecs.md (UNSPEC_VCVT_HF_S_N): New.
>> (UNSPEC_VCVT_HF_U_N): New.
>> (UNSPEC_VCVT_SI_S_N): New.
>> (UNSPEC_VCVT_SI_U_N): New.
>> (UNSPEC_VCVTH_S): New.
>> (UNSPEC_VCVTH_U): New.
>> (UNSPEC_VCVTA_S): New.
>> (UNSPEC_VCVTA_U): New.
>> (UNSPEC_VCVTM_S): New.
>> (UNSPEC_VCVTM_U): New.
>> (UNSPEC_VCVTN_S): New.
>> (UNSPEC_VCVTN_U): New.
>> (UNSPEC_VCVTP_S): New.
>> (UNSPEC_VCVTP_U): New.
>> (UNSPEC_VCVTP_S): New.
>> (UNSPEC_VCVTP_U): New.
>> (UNSPEC_VRND): New.
>> (UNSPEC_VRNDA): New.
>> (UNSPEC_VRNDI): New.
>> (UNSPEC_VRNDM): New.
>> (UNSPEC_VRNDN): New.
>> (UNSPEC_VRNDP): New.
>> (UNSPEC_VRNDX): New.
>> * config/arm/vfp.md (hf2): New.
>> (neon_vabshf): New.
>> (neon_vhf): New.
>> (neon_vrndihf): New.
>> (addhf3): New.
>> (subhf3): New.
>> (divhf3): New.
>> (mulhf3): New.
>> (*mulsf3neghf_vfp): New.
>> (*negmulhf3_vfp): New.
>> (*mulsf3addhf_vfp): New.
>> (*mulhf3subhf_vfp): New.
>> (*mulhf3ne

[Patch, testsuite] Fix some more bogus failures for avr

2016-08-03 Thread Senthil Kumar Selvaraj
Hi,

Committed below patch to trunk as obvious.

Regards
Senthil


2016-08-03  Senthil Kumar Selvaraj  

* gcc.dg/init-excess-2.c: Require int32plus.
* gcc.dg/pr44024.c: Skip if target keeps null pointer checks.
* gcc.dg/pr59963-2.c: Require int32plus.
* gcc.dg/pr71084.c: Cast pointer to intptr_t.
* gcc.dg/unroll-7.c: Require int32plus.


Index: gcc.dg/init-excess-2.c
===
--- gcc.dg/init-excess-2.c  (revision 239064)
+++ gcc.dg/init-excess-2.c  (working copy)
@@ -3,6 +3,7 @@
c/71115 - Missing warning: excess elements in struct initializer.  */
 /* { dg-do compile } */
 /* { dg-options "" } */
+/* { dg-require-effective-target int32plus } */
 
 #include 
 
Index: gcc.dg/pr44024.c
===
--- gcc.dg/pr44024.c(revision 239064)
+++ gcc.dg/pr44024.c(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do link } */
 /* { dg-options "-O1 -fdelete-null-pointer-checks -fdump-tree-ccp1" } */
+/* { dg-skip-if "" keeps_null_pointer_checks } */
 
 void foo();
 void link_error (void);
Index: gcc.dg/pr59963-2.c
===
--- gcc.dg/pr59963-2.c  (revision 239064)
+++ gcc.dg/pr59963-2.c  (working copy)
@@ -1,6 +1,7 @@
 /* PR c/59963 */
 /* { dg-do compile } */
 /* { dg-options "-Woverflow -Wconversion" } */
+/* { dg-require-effective-target int32plus } */
 
 extern void bar (unsigned char);
 extern void bar8 (unsigned char, unsigned char, unsigned char, unsigned char,
Index: gcc.dg/pr71084.c
===
--- gcc.dg/pr71084.c(revision 239064)
+++ gcc.dg/pr71084.c(working copy)
@@ -2,6 +2,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
 
+__extension__ typedef __INTPTR_TYPE__ intptr_t;
+
 void babl_format (void);
 void gimp_drawable_get_format (void);
 int _setjmp (void);
@@ -32,7 +34,7 @@
gimp_drawable_get_format();
   }
   for (; run_height;)
-for (; run_i < (long)fn1; ++run_i)
+for (; run_i < (long)(intptr_t)fn1; ++run_i)
   for (; width;)
 ;
 }
Index: gcc.dg/unroll-7.c
===
--- gcc.dg/unroll-7.c   (revision 239064)
+++ gcc.dg/unroll-7.c   (working copy)
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-rtl-loop2_unroll -funroll-loops" } */
+/* { dg-require-effective-target int32plus } */
+
 int t(int *a)
 {
   int i;


Re: [PR70920] transform (intptr_t) x eq/ne CST to x eq/ne (typeof x) cst

2016-08-03 Thread Matthew Wahab

On 29/07/16 15:32, Prathamesh Kulkarni wrote:

On 29 July 2016 at 12:42, Richard Biener  wrote:

On Fri, 29 Jul 2016, Prathamesh Kulkarni wrote:


On 28 July 2016 at 19:18, Richard Biener  wrote:

On Thu, 28 Jul 2016, Prathamesh Kulkarni wrote:


On 28 July 2016 at 15:58, Andreas Schwab  wrote:

On Mo, Jul 25 2016, Prathamesh Kulkarni  wrote:


diff --git a/gcc/testsuite/gcc.dg/pr70920-4.c b/gcc/testsuite/gcc.dg/pr70920-4.c
new file mode 100644
index 000..dedb895
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr70920-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp-details -Wno-int-to-pointer-cast" } */
+
+#include 
+
+void f1();
+void f2();
+
+void
+foo (int a)
+{
+  void *cst = 0;
+  if ((int *) a == cst)
+{
+  f1 ();
+  if (a)
+ f2 ();
+}
+}
+
+/* { dg-final { scan-tree-dump "gimple_simplified to if \\(_\[0-9\]* == 0\\)" 
"ccp1" } } */


This fails on all ilp32 platforms.

[..]


I don't think just matching == 0 is a good idea.  I suggest to
restrict the testcase to lp64 targets and maybe add a ilp32 variant.

Hi,
I restricted the test-case to lp64 targets.
Is this OK to commit ?


Hello,

The test case is failing for arm-none-linux-gnueabihf.

It is correctly skipped if the 'dg-require-effective-target lp64' you added is 
moved to the end of the directives (after the dg-options).


Matthew



Re: [PATCH RFC] do not throw in std::make_exception_ptr

2016-08-03 Thread Gleb Natapov
On Wed, Aug 03, 2016 at 12:47:30PM +0100, Jonathan Wakely wrote:
> > > 
> > > > +  } // namespace __exception_ptr
> > > >
> > > >   /// Obtain an exception_ptr pointing to a copy of the supplied object.
> > > >   template
> > > > @@ -173,7 +184,15 @@ namespace std
> > > > #if __cpp_exceptions
> > > >   try
> > > > {
> > > > - throw __ex;
> > > > +#if __cpp_rtti && !_GLIBCXX_HAVE_CDTOR_CALLABI
> > > > +  void *e = __cxxabiv1::__cxa_allocate_exception(sizeof(_Ex));
> > > 
> > > Again, 'e' isn't a reserved name.
> > It is local variable, why should it be reserved?
> 
> Because otherwise this valid C++ program won't compile:
> 
> #define e 2.71828
> #include <exception>
> int main() { }
> 
Ah, I missed the fact that the code is in a user-visible include.

> 
> > 
> > > 
> > > > +  (void)__cxxabiv1::__cxa_init_primary_exception(e, &typeid(__ex),
> > > > +   __exception_ptr::dest_thunk<_Ex>);
> > > > +  new (e) _Ex(__ex);
> > > 
> > > If the copy constructor of _Ex throws an exception should we call
> > > std::terminate here?
> > > 
> > > That's what would have happened previously, I believe.
> > > 
> > I do not think so. throw compiles to something like:
> > 
> >  __cxa_allocate_exception
> >  call move_or_copy_constructor
> >  __cxa_throw
> > 
> > If move_or_copy_constructor throws, the code does not terminate, but
> > catch() gets a different exception instead.
> > 
> > > I don't think we want to catch that exception and store it
> > > in the exception_ptr in place of the __ex object we were asked to
> > > store.
> > > 
> > I wrote the test program below to check the current behaviour, and this
> > is what the code does now.
> > 
> > #include <iostream>
> > #include <exception>
> > 
> > struct E {
> >    E()  {}
> >    E(const E&) { throw 5; }
> > };
> > 
> > int main() {
> >    auto x = std::make_exception_ptr(E());
> >    try {
> >        std::rethrow_exception(x);
> >    } catch(E& ep) {
> >        std::cout << "E" << std::endl;
> >    } catch (int& i) {
> >        std::cout << "int" << std::endl;
> >    }
> > }
> 
> Huh. If I'm reading the ABI spec correctly, we should terminate if the
> copy constructor throws.
> 
I'll make it terminate like you've suggested then.

> We don't seem to do that even without exception_ptr involved:
> 
Yes, that's the reason current make_exception_ptr behaves as it does,
but to fix your test case below the code that generates code for 'throw'
will have to be fixed.

> #include <iostream>
> #include <exception>
> 
> struct E {
>    E()  {}
>    E(const E&) { throw 5; }
> };
> 
> int main() {
>    static E e;
>    try {
>        throw e;
>    } catch(E& ep) {
>        std::cout << "E" << std::endl;
>    } catch (int& i) {
>        std::cout << "int" << std::endl;
>    }
> }
[skip]
> > > > -  std::type_info *exceptionType;
> > > > +  const std::type_info *exceptionType;
> > > >   void (_GLIBCXX_CDTOR_CALLABI *exceptionDestructor)(void *);
> > > 
> > > The __cxa_exception type is defined by
> > > https://mentorembedded.github.io/cxx-abi/abi-eh.html#cxx-data and this
> > > doesn't conform to that spec. Is this change necessary?
I missed this comment. typeid() returns const std::type_info so I
either need to add const here or cast the const away from  typeid()
return value.

--
Gleb.


Re: [PR57371] transform (double)i eq/ne 0 to i eq/ne 0

2016-08-03 Thread Richard Biener
On Wed, 3 Aug 2016, Prathamesh Kulkarni wrote:

> Hi,
> The attached patch tries to transform
> (double)i eq/ne 0 to i eq/ne 0
> AFAIU from Joseph's comment 1 in PR, the transform should be safe with
> -fno-trapping-math ?
> Bootstrap+tested on x86_64-unknown-linux-gnu in progress.

Couldn't this even be

 (cmp (float @0) REAL_CST@1)
 (with
  {
HOST_WIDE_INT n = real_to_integer (TREE_REAL_CST (@1));
REAL_VALUE_TYPE cint;
real_from_integer (&cint, VOIDmode, n, SIGNED);
  }
  (if (real_identical (&c, &cint))
   (cmp @0 { build_int_cst (TREE_TYPE (@0), n); }

with some additional type checks to make sure n fits the type of @0
(and otherwise fold to true/false directly).

Not sure whether we need to restrict it to float types that can
represent all values of the type of @0 exactly.

Richard.
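
The reasoning behind the generalised pattern can be illustrated outside of match.pd. What follows is a minimal C++ sketch, not GCC code: `fold_float_eq_const` is a hypothetical helper, and it assumes a 32-bit integer operand and an IEEE-754 double, so that every operand value converts exactly. It reports whether `(double)i == c` can be rewritten as `i == n`; when it cannot, the comparison is false for every `i`.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of GCC).  Assumes int32_t operands and an
// IEEE-754 double, which represents every int32_t value exactly.
// Returns true and sets N when (double)i == C is equivalent to i == N.
// Returns false when no int32_t converts to C, i.e. (double)i == C is
// false for every i (and (double)i != C is true for every i).
bool fold_float_eq_const (double c, std::int32_t &n)
{
  // NaN compares unequal to itself and therefore fails this check, as
  // does any constant with a fractional part.
  if (c != std::trunc (c))
    return false;

  // Infinities and integral values outside the int32_t range can never
  // be the result of converting an int32_t.
  if (c < std::numeric_limits<std::int32_t>::min ()
      || c > std::numeric_limits<std::int32_t>::max ())
    return false;

  n = static_cast<std::int32_t> (c);
  return true;
}
```

With a 64-bit integer operand the exactness assumption no longer holds for double, which is the case the question about restricting the float type is concerned with.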


Re: [PATCH RFC] do not throw in std::make_exception_ptr

2016-08-03 Thread Jonathan Wakely

On 03/08/16 15:02 +0300, Gleb Natapov wrote:

On Wed, Aug 03, 2016 at 12:47:30PM +0100, Jonathan Wakely wrote:

Huh. If I'm reading the ABI spec correctly, we should terminate if the
copy constructor throws.


I'll make it terminate like you've suggested then.


Let's leave it as you had it originally. I'm not sure if my reading of
the ABI is correct, so let's keep the behaviour consistent for now.

We can always revisit it later if we decide terminating is correct.


We don't seem to do that even without exception_ptr involved:


Yes, that's the reason current make_exception_ptr behaves as it does,
but to fix your test case below the code that generates code for 'throw'
will have to be fixed.


Right, and we won't be changing that as part of this patch, so let's
stay consistent with that too.


#include <iostream>
#include <exception>

struct E {
   E()  {}
   E(const E&) { throw 5; }
};

int main() {
   static E e;
   try {
   throw e;
   } catch(E& ep) {
   std::cout << "E" << std::endl;
   } catch (int& i) {
   std::cout << "int" << std::endl;
   }
}

[skip]

> > > -  std::type_info *exceptionType;
> > > +  const std::type_info *exceptionType;
> > >   void (_GLIBCXX_CDTOR_CALLABI *exceptionDestructor)(void *);
> >
> > The __cxa_exception type is defined by
> > https://mentorembedded.github.io/cxx-abi/abi-eh.html#cxx-data and this
> > doesn't conform to that spec. Is this change necessary?

I missed this comment. typeid() returns const std::type_info so I
either need to add const here or cast the const away from  typeid()
return value.


Please use const_cast. std::type_info doesn't have any non-const
member that would allow modifications through that pointer anyway.

Thanks.
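
For readers following along, the const_cast approach might look roughly like the sketch below. It is only an illustration under stated assumptions: `exception_header_sketch` and `record_type` are hypothetical names, and the struct models just the one field under discussion rather than the real `__cxa_exception` layout.

```cpp
#include <typeinfo>

// Hypothetical stand-in for the header the ABI document describes; only
// the one relevant field is modelled here.
struct exception_header_sketch
{
  std::type_info *exceptionType;   // non-const, as in the ABI layout
};

// Record the type of EX without changing the type of the field.
template<typename Ex>
void record_type (exception_header_sketch &header, const Ex &ex)
{
  // typeid yields a const std::type_info lvalue, so the address has to
  // lose its const qualifier before it fits the field.
  header.exceptionType = const_cast<std::type_info *> (&typeid (ex));
}
```

This keeps the field declaration as the ABI document spells it and confines the qualifier change to the single place that takes the address.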



[PATCH, COMMITTED] Add branch_changer.py script to maintainer-scripts

2016-08-03 Thread Martin Liška
Hello.

I've installed (r239066) the script that maintainers use to update PRs
in batch mode.

Martin
>From 2b63c1aebe452fc67ea60ff9ab4f3173300015a9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 Aug 2016 14:39:17 +0200
Subject: [PATCH] Add branch_changer.py script to maintainer-scripts

maintainer-scripts/ChangeLog:

2016-08-03  Martin Liska  

	* branch_changer.py: New file.
---
 maintainer-scripts/branch_changer.py | 195 +++
 1 file changed, 195 insertions(+)
 create mode 100755 maintainer-scripts/branch_changer.py

diff --git a/maintainer-scripts/branch_changer.py b/maintainer-scripts/branch_changer.py
new file mode 100755
index 000..5e1681b
--- /dev/null
+++ b/maintainer-scripts/branch_changer.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python3
+
+# The script requires simplejson, requests, semantic_version packages, in case
+# of openSUSE:
+# zypper in python3-simplejson python3-requests
+# pip3 install semantic_version
+
+import requests
+import json
+import argparse
+import re
+
+from semantic_version import Version
+
+base_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/'
+statuses = ['UNCONFIRMED', 'ASSIGNED', 'SUSPENDED', 'NEW', 'WAITING', 'REOPENED']
+search_summary = ' Regression]'
+regex = '(.*\[)([0-9\./]*)( [rR]egression])(.*)'
+
+class Bug:
+    def __init__(self, data):
+        self.data = data
+        self.versions = None
+        self.fail_versions = []
+        self.is_regression = False
+
+        self.parse_summary()
+        self.parse_known_to_fail()
+
+    def parse_summary(self):
+        m = re.match(regex, self.data['summary'])
+        if m != None:
+            self.versions = m.group(2).split('/')
+            self.is_regression = True
+            self.regex_match = m
+
+    def parse_known_to_fail(self):
+        v = self.data['cf_known_to_fail'].strip()
+        if v != '':
+            self.fail_versions = [x for x in re.split(' |,', v) if x != '']
+
+    def name(self):
+        return 'PR%d (%s)' % (self.data['id'], self.data['summary'])
+
+    def remove_release(self, release):
+        # Do not remove last value of [x Regression]
+        if len(self.versions) == 1:
+            return
+        self.versions = list(filter(lambda x: x != release, self.versions))
+
+    def add_release(self, releases):
+        parts = releases.split(':')
+        assert len(parts) == 2
+        for i, v in enumerate(self.versions):
+            if v == parts[0]:
+                self.versions.insert(i + 1, parts[1])
+                break
+
+    def add_known_to_fail(self, release):
+        if release in self.fail_versions:
+            return False
+        else:
+            self.fail_versions.append(release)
+            return True
+
+    def update_summary(self, api_key, doit):
+        summary = self.data['summary']
+        new_summary = self.serialize_summary()
+        if new_summary != summary:
+            print(self.name())
+            print('  changing summary: "%s" to "%s"' % (summary, new_summary))
+            self.modify_bug(api_key, {'summary': new_summary}, doit)
+
+            return True
+
+        return False
+
+    def change_milestone(self, api_key, old_milestone, new_milestone, comment, new_fail_version, doit):
+        old_major = Bug.get_major_version(old_milestone)
+        new_major = Bug.get_major_version(new_milestone)
+
+        print(self.name())
+        args = {}
+        if old_major == new_major:
+            args['target_milestone'] = new_milestone
+            print('  changing target milestone: "%s" to "%s" (same branch)' % (old_milestone, new_milestone))
+        elif self.is_regression and new_major in self.versions:
+            args['target_milestone'] = new_milestone
+            print('  changing target milestone: "%s" to "%s" (regresses with the new milestone)' % (old_milestone, new_milestone))
+        else:
+            print('  not changing target milestone: not a regression or does not regress with the new milestone')
+
+        if 'target_milestone' in args and comment != None:
+            print('  adding comment: "%s"' % comment)
+            args['comment'] = {'comment': comment }
+
+        if new_fail_version != None:
+            if self.add_known_to_fail(new_fail_version):
+                s = self.serialize_known_to_fail()
+                print('  changing known_to_fail: "%s" to "%s"' % (self.data['cf_known_to_fail'], s))
+                args['cf_known_to_fail'] = s
+
+        if len(args.keys()) != 0:
+            self.modify_bug(api_key, args, doit)
+            return True
+        else:
+            return False
+
+    def serialize_summary(self):
+        assert self.versions != None
+        assert self.is_regression == True
+
+        new_version = '/'.join(self.versions)
+        new_summary = self.regex_match.group(1) + new_version + self.regex_match.group(3) + self.regex_match.group(4)
+        return new_summary
+
+    def serialize_known_to_fail(self):
+        asse

Re: [PATCH 14/17][ARM] Add NEON FP16 intrinsics.

2016-08-03 Thread Ramana Radhakrishnan
On Mon, Jul 4, 2016 at 3:15 PM, Matthew Wahab
 wrote:
> On 17/05/16 15:46, Matthew Wahab wrote:
>> The ARMv8.2-A architecture introduces an optional FP16 extension adding
>> half-precision floating point data processing instructions to the
>> existing Adv.SIMD (NEON) support. A future version of the ACLE will add
>> support for these instructions and this patch implements that support.
>
> Updated to fix the vsqrte/vrsqrte spelling mistake.
>
> Tested the series for arm-none-linux-gnueabihf with native bootstrap and
> make check and for arm-none-eabi and armeb-none-eabi with make check on
> an ARMv8.2-A emulator.
>
> Ok for trunk?
> Matthew
>
> 2016-07-04  Matthew Wahab  
>
> * config/arm/arm_neon.h (vabd_f16): New.
>
> (vabdq_f16): New.
> (vabs_f16): New.
> (vabsq_f16): New.
> (vadd_f16): New.
> (vaddq_f16): New.
> (vcage_f16): New.
> (vcageq_f16): New.
> (vcagt_f16): New.
> (vcagtq_f16): New.
> (vcale_f16): New.
> (vcaleq_f16): New.
> (vcalt_f16): New.
> (vcaltq_f16): New.
> (vceq_f16): New.
> (vceqq_f16): New.
> (vceqz_f16): New.
> (vceqzq_f16): New.
> (vcge_f16): New.
> (vcgeq_f16): New.
> (vcgez_f16): New.
> (vcgezq_f16): New.
> (vcgt_f16): New.
> (vcgtq_f16): New.
> (vcgtz_f16): New.
> (vcgtzq_f16): New.
> (vcle_f16): New.
> (vcleq_f16): New.
> (vclez_f16): New.
> (vclezq_f16): New.
> (vclt_f16): New.
> (vcltq_f16): New.
> (vcltz_f16): New.
> (vcltzq_f16): New.
> (vcvt_f16_s16): New.
> (vcvt_f16_u16): New.
> (vcvt_s16_f16): New.
> (vcvt_u16_f16): New.
> (vcvtq_f16_s16): New.
> (vcvtq_f16_u16): New.
> (vcvtq_s16_f16): New.
> (vcvtq_u16_f16): New.
> (vcvta_s16_f16): New.
> (vcvta_u16_f16): New.
> (vcvtaq_s16_f16): New.
> (vcvtaq_u16_f16): New.
> (vcvtm_s16_f16): New.
> (vcvtm_u16_f16): New.
> (vcvtmq_s16_f16): New.
> (vcvtmq_u16_f16): New.
> (vcvtn_s16_f16): New.
> (vcvtn_u16_f16): New.
> (vcvtnq_s16_f16): New.
> (vcvtnq_u16_f16): New.
> (vcvtp_s16_f16): New.
> (vcvtp_u16_f16): New.
> (vcvtpq_s16_f16): New.
> (vcvtpq_u16_f16): New.
> (vcvt_n_f16_s16): New.
> (vcvt_n_f16_u16): New.
> (vcvtq_n_f16_s16): New.
> (vcvtq_n_f16_u16): New.
> (vcvt_n_s16_f16): New.
> (vcvt_n_u16_f16): New.
> (vcvtq_n_s16_f16): New.
> (vcvtq_n_u16_f16): New.
> (vfma_f16): New.
> (vfmaq_f16): New.
> (vfms_f16): New.
> (vfmsq_f16): New.
> (vmax_f16): New.
> (vmaxq_f16): New.
> (vmaxnm_f16): New.
> (vmaxnmq_f16): New.
> (vmin_f16): New.
> (vminq_f16): New.
> (vminnm_f16): New.
> (vminnmq_f16): New.
> (vmul_f16): New.
> (vmul_lane_f16): New.
> (vmul_n_f16): New.
> (vmulq_f16): New.
> (vmulq_lane_f16): New.
> (vmulq_n_f16): New.
> (vneg_f16): New.
> (vnegq_f16): New.
> (vpadd_f16): New.
> (vpmax_f16): New.
> (vpmin_f16): New.
> (vrecpe_f16): New.
> (vrecpeq_f16): New.
> (vrnd_f16): New.
> (vrndq_f16): New.
> (vrnda_f16): New.
> (vrndaq_f16): New.
> (vrndm_f16): New.
> (vrndmq_f16): New.
> (vrndn_f16): New.
> (vrndnq_f16): New.
> (vrndp_f16): New.
> (vrndpq_f16): New.
> (vrndx_f16): New.
> (vrndxq_f16): New.
> (vrsqrte_f16): New.
> (vrsqrteq_f16): New.
>
> (vrecps_f16): New.
> (vrecpsq_f16): New.
> (vrsqrts_f16): New.
> (vrsqrtsq_f16): New.
> (vsub_f16): New.
> (vsubq_f16): New.
>


OK ...

Thanks,
Ramana


Re: [ARM][PATCH] Add support for overflow add, sub, and neg operations

2016-08-03 Thread Christophe Lyon
On 2 August 2016 at 10:13, Michael Collison  wrote:
> Hi,
>
> This patch improves code generation for builtin arithmetic overflow
> operations for the arm backend. As an example, for a simple test case such as:
>
> int
> fn3 (int x, int y, int *ovf)
> {
>   int res;
>   *ovf = __builtin_sadd_overflow (x, y, &res);
>   return res;
> }
>
> Current trunk at -O2 generates
>
> fn3:
> @ args = 0, pretend = 0, frame = 0
> @ frame_needed = 0, uses_anonymous_args = 0
> @ link register save eliminated.
> cmp r1, #0
> mov r3, #0
> add r1, r0, r1
> blt .L4
> cmp r1, r0
> blt .L3
> .L2:
> str r3, [r2]
> mov r0, r1
> bx  lr
> .L4:
> cmp r1, r0
> ble .L2
> .L3:
> mov r3, #1
> b   .L2
>
> With the patch this now generates:
>
> adds r0, r0, r1
> movvs r3, #1
> movvc r3, #0
> str r3, [r2]
> bx  lr
>
> Ok for trunk?
>

Hi Michael,

I've run validations with your patch, and I am seeing several failures
during test execution:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239008-bugzilla-69663-upstream-final/report-build-info.html

I'm using qemu 2.6.0.

Did you run validation on actual HW? It could be a qemu bug, but I
haven't tried to manually reproduce the problem yet.

Christophe.
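
When checking execution failures like these by hand, it can help to compare the builtin against a plain reference model. The following is a minimal sketch under stated assumptions: `sadd_overflow_model` is a hypothetical helper, `int` is assumed to be 32 bits wide with a wider `long long`, and the conversion back to `int` relies on modular behaviour (which GCC provides and C++20 requires).

```cpp
#include <climits>

// Reference model of signed add-with-overflow for int (assumption: 32-bit
// int and a wider long long).  Stores the wrapped sum in *RES and returns
// true when the mathematical sum does not fit in int.
bool sadd_overflow_model (int x, int y, int *res)
{
  long long wide = static_cast<long long> (x) + y;
  // Unsigned arithmetic wraps by definition; converting the result back
  // to int is modular on GCC (and required to be so from C++20 on).
  *res = static_cast<int> (static_cast<unsigned> (x)
                           + static_cast<unsigned> (y));
  return wide < INT_MIN || wide > INT_MAX;
}
```

Running the model and `__builtin_sadd_overflow` over the same inputs on the target should give identical flags and results; a mismatch would point at the generated code or the emulator rather than at the test.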


> 2016-07-27  Michael Collison 
>  Michael Collison 
>
>  * config/arm/arm-modes.def: Add new condition code mode CC_V
>  to represent the overflow bit.
>  * config/arm/arm.c (maybe_get_arm_condition_code):
>  Add support for CC_Vmode.
>  (arm_gen_unlikely_cbranch): New function to generate common
>  rtl conditional branches for overflow patterns.
>  * config/arm/arm-protos.h: Add prototype for
>  arm_gen_unlikely_cbranch.
>  * config/arm/arm.md (addv4, add3_compareV,
>  addsi3_compareV_upper): New patterns to support signed
>  builtin overflow add operations.
>  (uaddv4, add3_compareC, addsi3_compareV_upper):
>  New patterns to support unsigned builtin add overflow operations.
>  (subv4, sub3_compare1): New patterns to support signed
>  builtin overflow subtract operations,
>  (usubv4): New patterns to support unsigned builtin subtract
>  overflow operations.
>  (negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New patterns
>  to support builtin overflow negate operations.
>  * gcc.target/arm/builtin_saddl.c: New testcase.
>  * gcc.target/arm/builtin_saddll.c: New testcase.
>  * gcc.target/arm/builtin_uaddl.c: New testcase.
>  * gcc.target/arm/builtin_uaddll.c: New testcase.
>  * gcc.target/arm/builtin_ssubl.c: New testcase.
>  * gcc.target/arm/builtin_ssubll.c: New testcase.
>  * gcc.target/arm/builtin_usubl.c: New testcase.
>  * gcc.target/arm/builtin_usubll.c: New testcase.


Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-08-03 Thread Matthew Wahab

On 03/08/16 12:52, Ramana Radhakrishnan wrote:

On Thu, Jul 28, 2016 at 12:37 PM, Ramana Radhakrishnan
 wrote:

On Mon, Jul 4, 2016 at 3:02 PM, Matthew Wahab
 wrote:

On 19/05/16 15:54, Matthew Wahab wrote:

On 18/05/16 16:20, Joseph Myers wrote:

On Wed, 18 May 2016, Matthew Wahab wrote:

In short: instructions for direct HFmode arithmetic should be described
with patterns with the standard names.  It's the job of the
architecture-independent compiler to ensure that fp16 arithmetic in the
user's source code only generates direct fp16 arithmetic in GIMPLE (and
thus ends up using those patterns) if that is a correct representation of
the source code's semantics according to ACLE.



This patch changes the implementation to use the standard names for the
HFmode arithmetic. Later patches will also be updated to use the
arithmetic operators where appropriate.



All fine except -

Why can we not extend the  and the l in
vfp.md for fp16 and avoid all the unspecs for vcvta and vrnd*
instructions ?



I now feel reasonably convinced that these can go away and be replaced
by extending the  and l expanders to
consider FP16 as well. Given that we are still only in the middle of
stage1 - I'm ok for you to apply this as is and then follow-up with a
patch that gets rid of the UNSPECs . If this holds for add, sub and
other patterns I don't see why it wouldn't hold for all these patterns
as well.

Joseph, do you have any opinions on whether we should be extending the
standard pattern names or not for btrunc, ceil, round, floor,
nearbyint, rint, lround, lfloor and lceil optabs for the HFmode
quantities ?



Sorry for the delay replying.

I didn't extend the lvrint_pattern and vrint_pattern expanders to HF mode
because of the general intention to do fp16 operations through the NEON
intrinsics. If extending them to HF mode produces the expected behaviour for
the standard names that they implement then I agree that the change should
be made.

I would prefer to do that as a separate patch though, to make sure that the
new operations are properly tested. Some of the existing tests (in
gcc.target/arm) use builtins that aren't available for HF mode so something
else will be needed.


Matthew




Re: [PATCH] Do not enable -fprefetch-loop-arrays with -fprofile-use (PR, gcov-profile/58250).

2016-08-03 Thread Richard Biener
On Wed, Aug 3, 2016 at 11:54 AM, Martin Liška  wrote:
> Hi.
>
> I've just grabbed a patch that was suggested in the PR (and IMHO makes sense).
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

Err, but what was suggested (not enabling it with -Os) is already done, and
you disable it unconditionally?

Richard.

> Martin


[PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)

2016-08-03 Thread Martin Liška
Hello.

As I've been going through all PRs related to gcov-profile, I noticed this PR.
The current implementation of cycle detection in gcov is very poor, leading to
extreme run time for cases like the one mentioned in the PR (which does not
contain a cycle). Thanks to Joshua, I've grabbed his patch and removed the
scaffolding (classes: Arc, Block, ...) he added. After doing that
the patch is quite subtle and fast (of course).

The patch survives the gcov.exp regression tests and I also verified that the
*.gcov files are identical before and after the patch for Inkscape:

$ find . -name '*.gcov' | wc -l
10752

I'm also thinking about adding [1] to the test-suite; however, it would
require implementing a 'timeout' argument in gcov.exp. Is it worth doing?

Ready to install?
Thanks,
Martin

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67992#c5
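
As a quick illustration of what happens once a cycle has been found, the per-cycle bookkeeping can be sketched in isolation. This is a simplified sketch, not the code from the patch: `arc_sketch` and `cancel_cycle` are hypothetical names, and the negative-count handling done in the patch is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for gcov's arc structure; only the count matters.
struct arc_sketch
{
  std::int64_t cs_count;
};

// Cancel one cycle: find the smallest count on its edges, subtract it
// from every edge and return it so the caller can add it to the line's
// execution count.  EDGES must not be empty.
std::int64_t cancel_cycle (const std::vector<arc_sketch *> &edges)
{
  std::int64_t cycle_count = edges.front ()->cs_count;
  for (const arc_sketch *e : edges)
    cycle_count = std::min (cycle_count, e->cs_count);
  for (arc_sketch *e : edges)
    e->cs_count -= cycle_count;
  return cycle_count;
}
```

For a cycle whose edges carry counts 5, 3 and 7, the call returns 3 and leaves 2, 0 and 4 on the edges, so the same cycle contributes nothing if it is visited again.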
>From faf7fb72d439974de68eb672edc6d76424f6022d Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 Aug 2016 09:56:45 +0200
Subject: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection
 (PR gcov-profile/67992)

gcc/ChangeLog:

2016-08-03  Martin Liska  
	Joshua Cranmer  

	PR gcov-profile/67992
	* gcov.c (line_t::has_block): New function.
	(handle_cycle): Likewise.
	(unblock): Likewise.
	(circuit): Likewise.
	(find_cycles): Likewise.
	(get_cycles_count): Likewise.
	(main): Fix GNU coding style.
	(output_intermediate_file): Likewise.
	(process_file): Likewise.
	(generate_results): Likewise.
	(release_structures): Likewise.
	(create_file_names): Likewise.
	(find_source): Likewise.
	(read_graph_file): Likewise.
	(find_exception_blocks): Likewise.
	(canonicalize_name): Likewise.
	(make_gcov_file_name): Likewise.
	(mangle_name): Likewise.
	(accumulate_line_counts): Use the new Hawick's algorithm.
	(output_branch_count): Fix GNU coding style.
	(read_line): Likewise.
---
 gcc/gcov.c | 378 +
 1 file changed, 229 insertions(+), 149 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 417b4f4..8855980 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -41,6 +41,11 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include 
 
+#include <vector>
+#include <algorithm>
+
+using namespace std;
+
 #define IN_GCOV 1
 #include "gcov-io.h"
 #include "gcov-io.c"
@@ -222,6 +227,9 @@ typedef struct coverage_info
 
 typedef struct line_info
 {
+  /* Return true when NEEDLE is one of basic blocks the line belongs to.  */
+  bool has_block (block_t *needle);
+
   gcov_type count;	   /* execution count */
   union
   {
@@ -235,6 +243,16 @@ typedef struct line_info
   unsigned unexceptional : 1;
 } line_t;
 
+bool
+line_t::has_block (block_t *needle)
+{
+  for (block_t *n = u.blocks; n; n = n->chain)
+if (n == needle)
+  return true;
+
+  return false;
+}
+
 /* Describes a file mentioned in the block graph.  Contains an array
of line info.  */
 
@@ -407,6 +425,164 @@ static void release_structures (void);
 static void release_function (function_t *);
 extern int main (int, char **);
 
+/* Cycle detection!
+   There are a bajillion algorithms that do this.  Boost's function is named
+   hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
+   "Enumerating Circuits and Loops in Graphs with Self-Arcs and Multiple-Arcs"
+   (url at ).
+
+   The basic algorithm is simple: effectively, we're finding all simple paths
+   in a subgraph (that shrinks every iteration).  Duplicates are filtered by
+   "blocking" a path when a node is added to the path (this also prevents non-
+   simple paths)--the node is unblocked only when it participates in a cycle.
+   */
+
+/* Flag that drives cycle detection after a negative cycle is seen.  */
+static bool did_negate = false;
+
+/* Handle cycle identified by EDGES, where the function finds minimum cs_count
+   and subtract the value from all counts.  The subtracted value is added
+   to COUNT.  */
+
+static void
+handle_cycle (const vector<arc_t *> &edges, int64_t &count)
+{
+  /* Find the minimum edge of the cycle, and reduce all nodes in the cycle by
+ that amount.  */
+  int64_t cycle_count = INT64_MAX;
+  for (unsigned i = 0; i < edges.size (); i++)
+{
+  int64_t ecount = edges[i]->cs_count;
+  if (cycle_count > ecount)
+	cycle_count = ecount;
+}
+  count += cycle_count;
+  for (unsigned i = 0; i < edges.size (); i++)
+edges[i]->cs_count -= cycle_count;
+
+  if (cycle_count < 0)
+did_negate = true;
+}
+
+/* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
+   blocked by U in BLOCK_LISTS.  */
+
+static void
+unblock (block_t *u, vector<block_t *> &blocked,
+	 vector<vector<block_t *> > &block_lists)
+{
+  vector<block_t *>::iterator it = find (blocked.begin (), blocked.end (), u);
+  if (it == blocked.end ())
+    return;
+
+  unsigned index = it - blocked.begin ();
+  blocked.erase (it);
+
+  for (vector<block_t *>::iterator it2 = block_lists[index].begin ();
+   it2 != block_lists[index].end (); it2++)
+unblock (*it2, blocked, block_lists);
+

Re: [PATCH] Do not enable -fprefetch-loop-arrays with -fprofile-use (PR, gcov-profile/58250).

2016-08-03 Thread Martin Liška
On 08/03/2016 03:28 PM, Richard Biener wrote:
> On Wed, Aug 3, 2016 at 11:54 AM, Martin Liška  wrote:
>> Hi.
>>
>> I've just grabbed a patch that was suggested in the PR (and IMHO makes
>> sense).
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
> 
> Err, but what was suggested (not enabling it with -Os) is already done, and
> you disable it unconditionally?

Sorry, I was wrong ;) It looks like the bug has been fixed since r222033,
where the following hunk was added:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 110ec4a..b442da9 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4168,6 +4168,7 @@ ix86_option_override_internal (bool main_args_p,
   if (opts->x_flag_prefetch_loop_arrays < 0
   && HAVE_prefetch
   && (opts->x_optimize >= 3 || opts->x_flag_profile_use)
+  && !opts->x_optimize_size
   && TARGET_SOFTWARE_PREFETCHING_BENEFICIAL)
 opts->x_flag_prefetch_loop_arrays = 1;

I'm going to close the PR.

Martin

> 
> Richard.
> 
>> Martin



Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-03 Thread Richard Biener
On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev  wrote:
> Hi Richard.
>
> It turned out that the fix you proposed does not work for the libgomp
> tests simd3 and simd4.
> The reason is that we can't change the safelen value for references not
> defined inside the loop. So I added the missing check for that to the patch.
> Is it OK for trunk?

Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as
that operation can end up being quadratic in the loop depth/width.

But I also wonder about correctness given that LIM "commons"
references.  So we can have

  for (;;)
    .. = ref;  (1)
    for (;;) // safelen == 2  (2)
      ... = ref;

and when looking at the ref at (1) which according to you should not
have safelen applied your function will happily return that ref is defined
in the inner loop.

So it looks like to be able to apply safelen the caller of ref_indep_loop_p
needs to pass down a ref plus a location (a stmt).  In which case your
function can simply use flow_loop_nested_p (loop, gimple_bb
(stmt)->loop_father);

Richard.
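
The location-based check can be pictured with a toy loop tree. The sketch below is not GCC code: `loop_sketch`, `nested_p` and `stmt_in_loop_p` are hypothetical names, `nested_p` is only assumed to mirror the strict-nesting test that flow_loop_nested_p performs, and treating a statement directly in the loop as "inside" is an assumption of this sketch.

```cpp
// Toy model of a loop tree (hypothetical; GCC's struct loop is richer).
struct loop_sketch
{
  const loop_sketch *outer;   // immediately enclosing loop, or nullptr
};

// True when INNER is strictly nested inside OUTER.
bool nested_p (const loop_sketch *outer, const loop_sketch *inner)
{
  for (const loop_sketch *l = inner->outer; l; l = l->outer)
    if (l == outer)
      return true;
  return false;
}

// True when a statement whose innermost loop is STMT_LOOP executes
// inside LOOP, either directly or in one of its subloops.
bool stmt_in_loop_p (const loop_sketch *loop, const loop_sketch *stmt_loop)
{
  return stmt_loop == loop || nested_p (loop, stmt_loop);
}
```

In the two-loop example above, the statement at (1) has the outer loop as its innermost loop, so the check against the safelen loop at (2) fails as intended, and the cost is bounded by the nesting depth rather than by a walk over all subloops.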

> ChangeLog:
> 2016-07-29  Yuri Rumyantsev  
>
> PR tree-optimization/71734
> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function.
> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP.
>
> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev :
>> Sorry H.J.
>>
>> I checked both these tests manually but forgot to pass "-fopenmp" option.
>> I will fix the issue asap.
>>
>> 2016-07-29 0:33 GMT+03:00 H.J. Lu :
>>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev  wrote:
>>>> Richard,
>>>>
>>>> I prepare a patch which is based on yours. New test is also included.
>>>> Bootstrapping and regression testing did not show any new failures.
>>>> Is it OK for trunk?
>>>>
>>>> Thanks.
>>>> ChangeLog:
>>>> 2016-07-28  Yuri Rumyantsev  
>>>>
>>>> PR tree-optimization/71734
>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen
>>>> attribute instead of REF_LOOP and use it.
>>>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and
>>>> set it for Loops having non-zero safelen attribute.
>>>> (ref_indep_loop_p): Pass zero as initial value for safelen.
>>>> gcc/testsuite/ChangeLog:
>>>> * g++.dg/vect/pr70729-nest.cc: New test.

>>>
>>> Does this cause
>>>
>>> FAIL: libgomp.fortran/pr71734-1.f90   -O3 -fomit-frame-pointer
>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>> test
>>> FAIL: libgomp.fortran/pr71734-1.f90   -O3 -g  execution test
>>> FAIL: libgomp.fortran/pr71734-2.f90   -O3 -fomit-frame-pointer
>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>> test
>>> FAIL: libgomp.fortran/pr71734-2.f90   -O3 -g  execution test
>>>
>>> on AVX machines and
>>>
>>> FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer
>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>> test
>>> FAIL: libgomp.fortran/simd3.f90   -O3 -g  execution test
>>> FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer
>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>> test
>>> FAIL: libgomp.fortran/simd4.f90   -O3 -g  execution test
>>>
>>> on non-AVX machines?
>>>
>>> --
>>> H.J.


Re: [PATCH] Teach VRP to truncate the case ranges of a switch

2016-08-03 Thread Richard Biener
On Wed, Aug 3, 2016 at 6:00 AM, Patrick Palka  wrote:
> VRP currently has functionality to eliminate case labels that lie
> completely outside of the switch operand's value range.  This patch
> complements this functionality by teaching VRP to also truncate the case
> label ranges that partially overlap with the operand's value range.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu.  Does this look like
> a reasonable optimization?  Admittedly, its effect will almost always be
> negligible except in cases where a case label range spans a large number
> of values which is a pretty rare thing.  The optimization triggered
> about 250 times during bootstrap.

I think it's most useful when the range collapses to a single value.

Ok.

Thanks,
Richard.
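
The truncation itself amounts to intersecting each case label range with the operand's value range. A minimal sketch of that step, assuming a plain range (not an anti-range) and using the hypothetical names `int_range` and `truncate_case_range`, could look like this:

```cpp
// Closed integer interval [lo, hi]; a hypothetical stand-in for both a
// case label range and the operand's value range.
struct int_range
{
  long lo;
  long hi;
};

// Clamp LABEL to the operand range VR.  Returns false when LABEL lies
// completely outside VR, in which case LABEL is left unchanged.
bool truncate_case_range (const int_range &vr, int_range &label)
{
  if (label.hi < vr.lo || label.lo > vr.hi)
    return false;
  if (label.lo < vr.lo)
    label.lo = vr.lo;
  if (label.hi > vr.hi)
    label.hi = vr.hi;
  return true;
}
```

With an operand range of [2, 8], as in the vrp107.c test, the label range 1 ... 2 collapses to the single value 2 and 7 ... 9 shrinks to 7 ... 8.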

> gcc/ChangeLog:
>
> * tree-vrp.c (simplify_switch_using_ranges): Try to truncate
> the case label ranges that partially overlap with OP's value
> range.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/vrp107.c: New test.
> * gcc.dg/tree-ssa/vrp108.c: New test.
> * gcc.dg/tree-ssa/vrp109.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/vrp107.c | 25 +++
>  gcc/testsuite/gcc.dg/tree-ssa/vrp108.c | 25 +++
>  gcc/testsuite/gcc.dg/tree-ssa/vrp109.c | 65 +++
>  gcc/tree-vrp.c | 80 
> +-
>  4 files changed, 194 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp107.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
> new file mode 100644
> index 000..b74f031
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
> @@ -0,0 +1,25 @@
> +/* { dg-options "-O2 -fdump-tree-vrp1" }  */
> +/* { dg-final { scan-tree-dump "case 2:" "vrp1" } }  */
> +/* { dg-final { scan-tree-dump "case 7 ... 8:" "vrp1" } }  */
> +
> +extern void foo (void);
> +extern void bar (void);
> +extern void baz (void);
> +
> +void
> +test (int i)
> +{
> +  if (i >= 2 && i <= 8)
> +  switch (i)
> +{
> +case 1: /* Redundant label.  */
> +case 2:
> +  bar ();
> +  break;
> +case 7:
> +case 8:
> +case 9: /* Redundant label.  */
> +  baz ();
> +  break;
> +}
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp108.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
> new file mode 100644
> index 000..49dbfb5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
> @@ -0,0 +1,25 @@
> +/* { dg-options "-O2 -fdump-tree-vrp1" }  */
> +/* { dg-final { scan-tree-dump "case 1:" "vrp1" } }  */
> +/* { dg-final { scan-tree-dump "case 9:" "vrp1" } }  */
> +
> +extern void foo (void);
> +extern void bar (void);
> +extern void baz (void);
> +
> +void
> +test (int i)
> +{
> +  if (i < 2 || i > 8)
> +  switch (i)
> +{
> +case 1:
> +case 2: /* Redundant label.  */
> +  bar ();
> +  break;
> +case 7: /* Redundant label.  */
> +case 8: /* Redundant label.  */
> +case 9:
> +  baz ();
> +  break;
> +}
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp109.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
> new file mode 100644
> index 000..86299a9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
> @@ -0,0 +1,65 @@
> +/* { dg-options "-O2 -fdump-tree-vrp1" }  */
> +/* { dg-final { scan-tree-dump "case 9 ... 10:" "vrp1" } }  */
> +/* { dg-final { scan-tree-dump "case 17 ... 18:" "vrp1" } }  */
> +/* { dg-final { scan-tree-dump "case 27 ... 30:" "vrp1" } }  */
> +
> +extern void foo (void);
> +extern void bar (void);
> +
> +void
> +test1 (int i)
> +{
> +  if (i != 7 && i != 8)
> +switch (i)
> +  {
> +  case 1:
> +  case 2:
> +foo ();
> +break;
> +  case 7: /* Redundant label.  */
> +  case 8: /* Redundant label.  */
> +  case 9:
> +  case 10:
> +bar ();
> +break;
> +  }
> +}
> +
> +void
> +test3 (int i)
> +{
> +  if (i != 19 && i != 20)
> +switch (i)
> +  {
> +  case 1:
> +  case 2:
> +foo ();
> +break;
> +  case 17:
> +  case 18:
> +  case 19: /* Redundant label.  */
> +  case 20: /* Redundant label.  */
> +bar ();
> +break;
> +  }
> +}
> +
> +void
> +test2 (int i)
> +{
> +  if (i != 28 && i != 29)
> +switch (i)
> +  {
> +  case 1:
> +  case 2:
> +foo ();
> +break;
> +  /* No redundancy.  */
> +  case 27:
> +  case 28:
> +  case 29:
> +  case 30:
> +bar ();
> +break;
> +  }
> +}
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index cb316b0..b654b1b 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -9586,7 +9586,7 @@ static bool
>  simplify_switch_using_ranges (gswitch *stmt)
>  {
>tree op = gimple_switch_index (stmt);
> -  value_range *vr;

Re: [PATCH][RFC] PR middle-end/22141 GIMPLE store widening pass

2016-08-03 Thread Richard Biener
On Wed, Aug 3, 2016 at 11:59 AM, Kyrill Tkachov
 wrote:
> Hi Richard,
>
> On 18/07/16 13:22, Richard Biener wrote:
>
> 
>
>> +  /* Record the original statements so that we can keep track of
>> +statements emitted in this pass and not re-process new
>> +statements.  */
>> +  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next
>> (&gsi))
>> +   {
>> + gimple *stmt = gsi_stmt (gsi);
>> + if (!is_gimple_debug (stmt))
>> +   orig_stmts.add (stmt);
>> + num_statements++;
>> +   }
>>
>> please use gimple_set_visited () instead, that should be cheaper.
>>
>>
>> +  do
>> +   {
>> + changes_made = false;
>> + for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next
>> (&gsi))
>> +   {
>> ...
>> +   }
>> +  while (changes_made);
>>
>> looks pretty quadratic to me.  Instead of tracking things with
>> m_curr_base_expr
>> why not use a hash-map to track stores related to a base?
>
>
> I've implemented this scheme but I'm having trouble making it work.
> In particular I have a hash_map keyed on a 'tree' that is the base
> object (as extracted by get_inner_reference) but I can't get the hash_map
> to properly extract the already recorded stores to the same base.
> For example for the simple code:
> struct bar {
>   int a;
>   char b;
>   char c;
>   char d;
>   char e;
>   char f;
>   char g;
> };
>
> void
> foo1 (struct bar *p)
> {
>   p->b = 0;
>   p->a = 0;
>   p->c = 0;
>   p->d = 0;
>   p->e = 0;
> }
>
> As we can see, the stores are all to the same object and should
> be recognised as such.
>
> The base of the first store is recorded as:
> 
> and for the second store as 
> where the dumps of the two mem_refs are identical except for that first
> hex number (their address in memory?)
> In my first version of the patch I compare these with operand_equal_p and
> that
> detects that they are the same, but in the hash_map they are not detected
> as equal. Is there some special hashing function I must specify?

If you just use hash_map  then it will hash on the pointer value.
I think you need to use tree_operand_hash.
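
To illustrate the difference, here is a minimal, self-contained sketch that does not use GCC's own types. `Expr`, `ExprStructuralHash` and `ExprStructuralEq` are hypothetical stand-ins: the structural functors play the role that tree_operand_hash (which, as far as I recall, hashes the expression and compares with operand_equal_p) plays for real trees.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a tree node describing a store base; NOT GCC's 'tree'.
struct Expr {
  std::string code;  // e.g. "mem_ref"
  std::string base;  // e.g. "p"
  long offset;
};

// Hash on the contents of the node rather than on its address.
struct ExprStructuralHash {
  std::size_t operator() (const Expr *e) const {
    std::size_t h = std::hash<std::string> () (e->code);
    h = h * 31 + std::hash<std::string> () (e->base);
    h = h * 31 + std::hash<long> () (e->offset);
    return h;
  }
};

// Equality must agree with the hash: equal contents, equal keys.
struct ExprStructuralEq {
  bool operator() (const Expr *a, const Expr *b) const {
    return a->code == b->code && a->base == b->base && a->offset == b->offset;
  }
};

// Default pointer hashing: every distinct node is its own key.
std::size_t count_groups_by_pointer (const std::vector<const Expr *> &stores) {
  std::unordered_map<const Expr *, int> groups;
  for (const Expr *e : stores)
    ++groups[e];
  return groups.size ();
}

// Structural hashing: nodes with identical contents share one key.
std::size_t count_groups_by_structure (const std::vector<const Expr *> &stores) {
  std::unordered_map<const Expr *, int, ExprStructuralHash, ExprStructuralEq>
    groups;
  for (const Expr *e : stores)
    ++groups[e];
  return groups.size ();
}

// Two separately allocated nodes that describe the same base.
void example () {
  Expr first{"mem_ref", "p", 0};
  Expr second{"mem_ref", "p", 0};
  std::vector<const Expr *> stores{&first, &second};
  assert (count_groups_by_pointer (stores) == 2);
  assert (count_groups_by_structure (stores) == 1);
}
```

The two nodes in `example` only differ in their address, which matches the symptom described above: they compare equal structurally but land in different buckets when the map hashes on the pointer.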

Richard.

> Thanks,
> Kyrill


Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)

2016-08-03 Thread Richard Biener
On Wed, Aug 3, 2016 at 3:31 PM, Martin Liška  wrote:
> Hello.
>
> As I'm going through all PRs related to gcov-profile, I've noticed this PR.
> The current implementation of cycle detection in gcov is very poor, leading
> to extreme run time for cases like the one mentioned in the PR (which does
> not contain a cycle). Thanks to Joshua, I've grabbed his patch and removed
> the scaffolding (classes: Arc, Block, ...) he did. After doing that the
> patch is quite subtle and fast (of course).
>
> The patch survives gcov.exp regression tests and I also verified that *.gcov
> is identical before and after the patch for Inkscape:
>
> $ find . -name '*.gcov' | wc -l
> 10752
>
> I'm also thinking about adding [1] to the test-suite, however it would
> require implementing a 'timeout' argument in gcov.exp. Is it worth doing?

We usually add such tests without any "timeout" and expect we'll notice
if compile-time goes through the roof for them.

Richard.

> Ready to install?
> Thanks,
> Martin
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67992#c5


Re: [PATCH][RFC] PR middle-end/22141 GIMPLE store widening pass

2016-08-03 Thread Kyrill Tkachov


On 03/08/16 14:50, Richard Biener wrote:

> On Wed, Aug 3, 2016 at 11:59 AM, Kyrill Tkachov wrote:
>> Hi Richard,
>>
>> On 18/07/16 13:22, Richard Biener wrote:
>>
>>> +  /* Record the original statements so that we can keep track of
>>> +statements emitted in this pass and not re-process new
>>> +statements.  */
>>> +  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next
>>> (&gsi))
>>> +   {
>>> + gimple *stmt = gsi_stmt (gsi);
>>> + if (!is_gimple_debug (stmt))
>>> +   orig_stmts.add (stmt);
>>> + num_statements++;
>>> +   }
>>>
>>> please use gimple_set_visited () instead, that should be cheaper.
>>>
>>>
>>> +  do
>>> +   {
>>> + changes_made = false;
>>> + for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next
>>> (&gsi))
>>> +   {
>>> ...
>>> +   }
>>> +  while (changes_made);
>>>
>>> looks pretty quadratic to me.  Instead of tracking things with
>>> m_curr_base_expr
>>> why not use a hash-map to track stores related to a base?
>>
>>
>> I've implemented this scheme but I'm having trouble making it work.
>> In particular I have a hash_map keyed on a 'tree' that is the base
>> object (as extracted by get_inner_reference) but I can't get the hash_map
>> to properly extract the already recorded stores to the same base.
>> For example for the simple code:
>> struct bar {
>>   int a;
>>   char b;
>>   char c;
>>   char d;
>>   char e;
>>   char f;
>>   char g;
>> };
>>
>> void
>> foo1 (struct bar *p)
>> {
>>   p->b = 0;
>>   p->a = 0;
>>   p->c = 0;
>>   p->d = 0;
>>   p->e = 0;
>> }
>>
>> As we can see, the stores are all to the same object and should
>> be recognised as such.
>>
>> The base of the first store is recorded as:
>>
>> and for the second store as
>> where the dumps of the two mem_refs are identical except for that first
>> hex number (their address in memory?)
>> In my first version of the patch I compare these with operand_equal_p and
>> that
>> detects that they are the same, but in the hash_map they are not detected
>> as equal. Is there some special hashing function I must specify?
>
> If you just use hash_map  then it will hash on the pointer value.
> I think you need to use tree_operand_hash.

Ah, thanks. That did the trick.

Kyrill

> Richard.
>
>> Thanks,
>> Kyrill





Re: [PATCH] gcov tool: Implement Hawick's algorithm for cycle detection, (PR gcov-profile/67992)

2016-08-03 Thread Nathan Sidwell

Martin,

> As I'm going through all PRs related to gcov-profile, I've noticed this PR.
> The current implementation of cycle detection in gcov is very poor, leading
> to extreme run time for cases like the one mentioned in the PR (which does
> not contain a cycle). Thanks to Joshua, I've grabbed his patch and removed
> the scaffolding (classes: Arc, Block, ...) he did. After doing that the
> patch is quite subtle and fast (of course).


sorry to be a pain, but could you split the patch into
a) formatting changes
b) the clever  bits

the formatting changes can then (probably) be applied as obvious.

nathan


Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-08-03 Thread Bernd Edlinger
Hi,

Is it OK for the trunk?

I guess so, but need an explicit OK.


Thanks
Bernd.

On 08/01/16 20:52, Bernd Edlinger wrote:
> Hi Jeff,
>
> On 08/01/16 19:54, Jeff Law wrote:
>> Looks like you've probably nailed it.  It'll be interesting see if
>> there's any fallout (though our RTL optimizer testing is pretty weak, so
>> even if there were, I doubt we'd catch it).
>>
>
> If there is, it will probably be a performance regression...
>
> Anyway I'd say these two patches do just disable actually wrong
> transformations.  So here are both patches as separate diffs
> with your suggestion for the comment in cse_insn.
>
> I believe that on x86_64 both patches do not change a single bit.
>
> However I think there are more paradoxical subregs generated all over,
> but the aarch64 insv code pattern did trigger more hidden bugs than
> any other port.  It is certainly unfortunate that the major source
> of paradoxical subreg is in a target-dependent code path :(
>
> I apologize that I am not able to reduce/finalize the aarch64 test
> case at this time, as I usually only work with arm and intel targets,
> but I made an exception here, because a bug like that may affect all
> targets sooner or later.
>
>
> Boot-strap and reg-testing on x86_64-linux-gnu.
> Plus aarch64 bootstrap and isl-testing by Andreas.
>
>
> Is it OK for trunk?
>
>
>
> Thanks
> Bernd.


[PATCH] Adjust some PRE testcases

2016-08-03 Thread Richard Biener

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-08-03  Richard Biener  

* gcc.dg/tree-ssa/loadpre2.c: Disable LIM.
* gcc.dg/tree-ssa/loadpre21.c: Likewise.
* gcc.dg/tree-ssa/loadpre22.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-23.c: Likewise.

Index: gcc.dg/tree-ssa/loadpre2.c
===
--- gcc.dg/tree-ssa/loadpre2.c  (revision 239066)
+++ gcc.dg/tree-ssa/loadpre2.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-pre-stats" } */
+/* { dg-options "-O2 -fno-tree-loop-im -fdump-tree-pre-stats" } */
 int main(int *a, int argc)
 {
   int i;
Index: gcc.dg/tree-ssa/loadpre21.c
===
--- gcc.dg/tree-ssa/loadpre21.c (revision 239066)
+++ gcc.dg/tree-ssa/loadpre21.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-pre-stats" } */
+/* { dg-options "-O2 -fno-tree-loop-im -fdump-tree-pre-stats" } */
 typedef int type[2];
 int main(type *a, int argc)
 {
Index: gcc.dg/tree-ssa/loadpre22.c
===
--- gcc.dg/tree-ssa/loadpre22.c (revision 239066)
+++ gcc.dg/tree-ssa/loadpre22.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-pre-stats" } */
+/* { dg-options "-O2 -fno-tree-loop-im -fdump-tree-pre-stats" } */
 typedef int type[2];
 int main(type *a, int argc)
 {
Index: gcc.dg/tree-ssa/ssa-pre-23.c
===
--- gcc.dg/tree-ssa/ssa-pre-23.c(revision 239066)
+++ gcc.dg/tree-ssa/ssa-pre-23.c(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-pre-stats" } */
+/* { dg-options "-O2 -fno-tree-loop-im -fdump-tree-pre-stats" } */
 
 struct { int x; int y; } global;
 void foo(int n)


[hsa-branch] Outline HSA function attribute modification

2016-08-03 Thread Martin Jambor
Hi,

since future changes will want to do this from two places, this patch
outlines function attribute changes to a special private method.

Martin

2016-07-20  Martin Jambor  

* hsa.h (hsa_summary_t): Add private member function
process_gpu_implementation_attributes.
* hsa.c (process_gpu_implementation_attributes): New function.
(link_functions): Move some functionality into it.
---
 gcc/hsa.c | 31 +++
 gcc/hsa.h |  3 +++
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/gcc/hsa.c b/gcc/hsa.c
index fdadcb1..caca939 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -813,6 +813,24 @@ hsa_get_declaration_name (tree decl)
   return name;
 }
 
+/* Add a flatten attribute and disable vectorization for gpu implementation
+   function decl GDECL.  */
+
+void hsa_summary_t::process_gpu_implementation_attributes (tree gdecl)
+{
+  DECL_ATTRIBUTES (gdecl)
+= tree_cons (get_identifier ("flatten"), NULL_TREE,
+DECL_ATTRIBUTES (gdecl));
+
+  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl);
+  if (fn_opts == NULL_TREE)
+fn_opts = optimization_default_node;
+  fn_opts = copy_node (fn_opts);
+  TREE_OPTIMIZATION (fn_opts)->x_flag_tree_loop_vectorize = false;
+  TREE_OPTIMIZATION (fn_opts)->x_flag_tree_slp_vectorize = false;
+  DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl) = fn_opts;
+}
+
 void
 hsa_summary_t::link_functions (cgraph_node *gpu, cgraph_node *host,
   hsa_function_kind kind, bool gridified_kernel_p)
@@ -832,18 +850,7 @@ hsa_summary_t::link_functions (cgraph_node *gpu, 
cgraph_node *host,
   gpu_summary->m_binded_function = host;
   host_summary->m_binded_function = gpu;
 
-  tree gdecl = gpu->decl;
-  DECL_ATTRIBUTES (gdecl)
-= tree_cons (get_identifier ("flatten"), NULL_TREE,
-DECL_ATTRIBUTES (gdecl));
-
-  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl);
-  if (fn_opts == NULL_TREE)
-fn_opts = optimization_default_node;
-  fn_opts = copy_node (fn_opts);
-  TREE_OPTIMIZATION (fn_opts)->x_flag_tree_loop_vectorize = false;
-  TREE_OPTIMIZATION (fn_opts)->x_flag_tree_slp_vectorize = false;
-  DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl) = fn_opts;
+  process_gpu_implementation_attributes (gpu->decl);
 
   /* Create reference between a kernel and a corresponding host implementation
  to quarantee LTO streaming to a same LTRANS.  */
diff --git a/gcc/hsa.h b/gcc/hsa.h
index f13e216..4d98bb3 100644
--- a/gcc/hsa.h
+++ b/gcc/hsa.h
@@ -1322,6 +1322,9 @@ public:
 
   void link_functions (cgraph_node *gpu, cgraph_node *host,
   hsa_function_kind kind, bool gridified_kernel_p);
+
+private:
+  void process_gpu_implementation_attributes (tree gdecl);
 };
 
 /* OMP simple builtin describes behavior that should be done for
-- 
2.9.0



[hsa-branch] Rename m_binded_function to m_bound_function

2016-08-03 Thread Martin Jambor
Hi,

the past participle of bind is bound, not binded (although ispell
accepts it for some reason), so I am going to fix this mistake in the
name of the field.  I know we normally do not do this, but I believe that
in this case the only out-of-tree patches affected are mine, so I will
make an exception.

Moreover, this patch allows for the field to be NULL, which will soon be
handy on the branch.
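
As a rough illustration of what allowing NULL means for callers, here is a small sketch with toy types. `Node`, `Summary` and `host_decl` are hypothetical names and only mirror the shape of the checks added in the patch; they are not the real cgraph_node or hsa_function_summary.

```cpp
#include <cassert>

// Toy stand-in for a call-graph node; 'decl' is just an int here.
struct Node {
  int decl;
};

// Toy stand-in for a function summary with an optional counterpart.
struct Summary {
  Node *bound_function = nullptr;  // may legitimately stay null
};

// Return the counterpart's decl, or nullptr when nothing is bound.
// Callers must be prepared for the null case.
const int *host_decl (const Summary &s) {
  return s.bound_function ? &s.bound_function->decl : nullptr;
}

void example () {
  Summary s;
  assert (host_decl (s) == nullptr);  // nothing bound yet

  Node host{42};
  s.bound_function = &host;
  assert (*host_decl (s) == 42);      // bound: the decl is reachable
}
```

The point of the sketch is only that every dereference of the field needs a guard once NULL becomes a valid state.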

Thanks,

Martin

2016-07-20  Martin Jambor  

gcc/

* hsa.h (hsa_function_summary): Rename m_binded_function to
m_bound_function.
* hsa-gen.c (hsa_get_host_function): Handle functions with no
bound CPU implementation.  Fix binded to bound.
(get_brig_function_name): Likewise.
* hsa.c (link_functions): Adjust after renaming m_binded_function
to m_bound_function.
* ipa-hsa.c (process_hsa_functions): Likewise.
(ipa_hsa_write_summary): Likewise.
(ipa_hsa_read_section): Likewise.
---
 gcc/hsa-gen.c |  8 +---
 gcc/hsa.c |  4 ++--
 gcc/hsa.h |  6 +++---
 gcc/ipa-hsa.c | 12 ++--
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 5208dab..e16a5c7 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -894,7 +894,7 @@ hsa_get_host_function (tree decl)
   gcc_assert (s->m_kind != HSA_NONE);
   gcc_assert (s->m_gpu_implementation_p);
 
-  return s->m_binded_function->decl;
+  return s->m_bound_function ? s->m_bound_function->decl : NULL;
 }
 
 /* Return true if function DECL has a host equivalent function.  */
@@ -905,8 +905,10 @@ get_brig_function_name (tree decl)
   tree d = decl;
 
   hsa_function_summary *s = hsa_summaries->get (cgraph_node::get_create (d));
-  if (s->m_kind != HSA_NONE && s->m_gpu_implementation_p)
-d = s->m_binded_function->decl;
+  if (s->m_kind != HSA_NONE
+  && s->m_gpu_implementation_p
+  && s->m_bound_function)
+d = s->m_bound_function->decl;
 
   /* IPA split can create a function that has no host equivalent.  */
   if (d == NULL)
diff --git a/gcc/hsa.c b/gcc/hsa.c
index caca939..9f02bba 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -847,8 +847,8 @@ hsa_summary_t::link_functions (cgraph_node *gpu, 
cgraph_node *host,
   gpu_summary->m_gridified_kernel_p = gridified_kernel_p;
   host_summary->m_gridified_kernel_p = gridified_kernel_p;
 
-  gpu_summary->m_binded_function = host;
-  host_summary->m_binded_function = gpu;
+  gpu_summary->m_bound_function = host;
+  host_summary->m_bound_function = gpu;
 
   process_gpu_implementation_attributes (gpu->decl);
 
diff --git a/gcc/hsa.h b/gcc/hsa.h
index 4d98bb3..092fd3b 100644
--- a/gcc/hsa.h
+++ b/gcc/hsa.h
@@ -1291,9 +1291,9 @@ struct hsa_function_summary
   hsa_function_kind m_kind;
 
   /* Pointer to a cgraph node which is a HSA implementation of the function.
- In case of the function is a HSA function, the binded function points
+ In case of the function is a HSA function, the bound function points
  to the host function.  */
-  cgraph_node *m_binded_function;
+  cgraph_node *m_bound_function;
 
   /* Identifies if the function is an HSA function or a host function.  */
   bool m_gpu_implementation_p;
@@ -1304,7 +1304,7 @@ struct hsa_function_summary
 
 inline
 hsa_function_summary::hsa_function_summary (): m_kind (HSA_NONE),
-  m_binded_function (NULL), m_gpu_implementation_p (false)
+  m_bound_function (NULL), m_gpu_implementation_p (false)
 {
 }
 
diff --git a/gcc/ipa-hsa.c b/gcc/ipa-hsa.c
index 769657f..9ab4927 100644
--- a/gcc/ipa-hsa.c
+++ b/gcc/ipa-hsa.c
@@ -79,7 +79,7 @@ process_hsa_functions (void)
   hsa_function_summary *s = hsa_summaries->get (node);
 
   /* A linked function is skipped.  */
-  if (s->m_binded_function != NULL)
+  if (s->m_bound_function != NULL)
continue;
 
   if (s->m_kind != HSA_NONE)
@@ -131,7 +131,7 @@ process_hsa_functions (void)
  hsa_function_summary *dst = hsa_summaries->get (e->callee);
  if (dst->m_kind != HSA_NONE && !dst->m_gpu_implementation_p)
{
- e->redirect_callee (dst->m_binded_function);
+ e->redirect_callee (dst->m_bound_function);
  if (dump_file)
fprintf (dump_file,
 "Redirecting edge to HSA function: %s->%s\n",
@@ -193,10 +193,10 @@ ipa_hsa_write_summary (void)
  bp = bitpack_create (ob->main_stream);
  bp_pack_value (&bp, s->m_kind, 2);
  bp_pack_value (&bp, s->m_gpu_implementation_p, 1);
- bp_pack_value (&bp, s->m_binded_function != NULL, 1);
+ bp_pack_value (&bp, s->m_bound_function != NULL, 1);
  streamer_write_bitpack (&bp);
- if (s->m_binded_function)
-   stream_write_tree (ob, s->m_binded_function->decl, true);
+ if (s->m_bound_function)
+   stream_write_tree (ob, s->m_bound_function->decl, true);
}
 }
 
@@ -249,7 +249,7 @@ ipa_hsa_read_section (struct lto_file_decl

[hsa-branch] Handle BRIG_OPCODE_DEBUGTRAP in op_output_p

2016-08-03 Thread Martin Jambor
Hi,

although BRIG_OPCODE_DEBUGTRAP isn't currently generated by gcc trunk,
some experimental patches that I have do generate it, and it is one of the
rare cases where a register in the zeroth operand is an input, so it should
be recognized as such by hsa_insn_basic::op_output_p.

Martin

2016-07-04  Martin Jambor  

* hsa.c (hsa_insn_basic::op_output_p): Add BRIG_OPCODE_DEBUGTRAP
to the list of instructions with no output registers.
---
 gcc/hsa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/hsa.c b/gcc/hsa.c
index 9f02bba..01520e8 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -170,6 +170,7 @@ hsa_insn_basic::op_output_p (unsigned opnum)
 case BRIG_OPCODE_SBR:
 case BRIG_OPCODE_ST:
 case BRIG_OPCODE_SIGNALNORET:
+case BRIG_OPCODE_DEBUGTRAP:
   /* FIXME: There are probably missing cases here, double check.  */
   return false;
 case BRIG_OPCODE_EXPAND:
-- 
2.9.0



[hsa-branch] Minor tweak to HSA_SORRY macros

2016-08-03 Thread Martin Jambor
Hi,

the whole point of having HSA_SORRY encapsulated in a rather useless
do-while statement is to enforce a semicolon after each of its expansions,
as if it were a function call.  To my surprise, I found that the semicolon
is already part of the macro definitions and is missing at two invocation
points.  I plan to change the macro on the hsa branch and fix it there, and
to minimize the difference I'd like to commit this to trunk eventually.
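
For readers unfamiliar with the idiom, the following is a small sketch of why the semicolon belongs at the call site rather than in the macro. `REPORT_SORRY`, `reports` and `count_reports` are hypothetical names made up for this example; only the do/while (false) shape is taken from the patch.

```cpp
#include <cassert>

static int reports = 0;  // how much the hypothetical macro has reported

// The definition ends WITHOUT a semicolon, so each use must supply one
// and therefore reads like an ordinary function call.
#define REPORT_SORRY(n) \
  do \
    { \
      reports += (n); \
    } \
  while (false)

int count_reports (bool sorry, bool warn)
{
  reports = 0;
  if (sorry)
    REPORT_SORRY (2);  // expands to exactly one statement
  else if (warn)       // ... so this 'else' still pairs with the 'if'
    REPORT_SORRY (1);
  return reports;
}
```

If the definition itself ended in `while (false);`, the call `REPORT_SORRY (2);` would expand to a statement followed by an empty statement, and the `else` would no longer have an `if` to attach to. Code that omits the semicolon at the call site happens to compile in that case, which is presumably how the two invocations mentioned above went unnoticed.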

Thanks,

Martin

2016-07-21  Martin Jambor  

* hsa-gen.c (HSA_SORRY_ATV): Remove semicolon after macro.
(HSA_SORRY_AT): Likewise.
(omp_simple_builtin::generate): Add missing semicolons.
---
 gcc/hsa-gen.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 24cc2c7..6accbd7 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -69,7 +69,7 @@ along with GCC; see the file COPYING3.  If not see
HSA_SORRY_MSG)) \
   inform (location, message, __VA_ARGS__); \
   } \
-  while (false);
+  while (false)
 
 /* Same as previous, but highlight a location.  */
 
@@ -81,7 +81,7 @@ along with GCC; see the file COPYING3.  If not see
HSA_SORRY_MSG)) \
   inform (location, message); \
   } \
-  while (false);
+  while (false)
 
 /* Default number of threads used by kernel dispatch.  */
 
@@ -4379,11 +4379,11 @@ omp_simple_builtin::generate (gimple *stmt, hsa_bb *hbb)
   if (m_sorry)
 {
   if (m_warning_message)
-   HSA_SORRY_AT (gimple_location (stmt), m_warning_message)
+   HSA_SORRY_AT (gimple_location (stmt), m_warning_message);
   else
HSA_SORRY_ATV (gimple_location (stmt),
   "Support for HSA does not implement calls to %s\n",
-  m_name)
+  m_name);
 }
   else if (m_warning_message != NULL)
 warning_at (gimple_location (stmt), OPT_Whsa, m_warning_message);
-- 
2.9.0



[hsa-branch] Cleanups of hsa_insns_signal and hsa_insns_queue

2016-08-03 Thread Martin Jambor
Hi,

currently, hsa_insn_signal is a descendant of hsa_insn_atomic, even
though its BRIG representation actually contains fewer fields than the
atomic BRIG representation so if anything, it should be the other way
around.  But this patch actually makes both direct descendants of
hsa_insn_basic because apart from construction, which needs to
differentiate between them anyway, there is no need for common
processing.

This patch also adds fields to hsa_insn_queue which we have so far
assumed to have one particular value in order to enable more flexibility
in subsequent patches on the hsa branch (Unlike those, I intend to
commit this one to trunk soon as well).
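
The hierarchy change can be pictured with the following sketch. The class and member names are toy stand-ins chosen for illustration (the real classes are hsa_insn_basic, hsa_insn_atomic and hsa_insn_signal, and their actual members differ); the only point is that the two instruction kinds become siblings.

```cpp
#include <cassert>
#include <type_traits>

// Toy base class holding what every instruction has in common.
struct InsnBasic {
  int opcode;
  explicit InsnBasic (int opc) : opcode (opc) {}
};

// Atomic instructions carry their own set of extra fields.
struct InsnAtomic : InsnBasic {
  int memory_order;
  int atomic_op;
  int memory_scope;
  InsnAtomic (int opc, int order, int op, int scope)
    : InsnBasic (opc), memory_order (order), atomic_op (op),
      memory_scope (scope) {}
};

// Signal instructions derive from the base directly, not from InsnAtomic,
// so they do not inherit fields they have no use for.
struct InsnSignal : InsnBasic {
  int memory_order;
  int signal_op;
  InsnSignal (int opc, int order, int op)
    : InsnBasic (opc), memory_order (order), signal_op (op) {}
};

// Code that only needs the common part keeps working on the base class.
int opcode_of (const InsnBasic &insn) { return insn.opcode; }

void example () {
  InsnSignal sig (7, 1, 2);
  assert (opcode_of (sig) == 7);
  assert ((std::is_base_of<InsnBasic, InsnSignal>::value));
  assert ((!std::is_base_of<InsnAtomic, InsnSignal>::value));
}
```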

Thanks,

Martin

2016-07-18  Martin Jambor  

* hsa.h (hsa_insn_signal): Make a direct descendant of
hsa_insn_basic.  Add memorder constructor parameter and
m_memory_order and m_signalop member variables.
(hsa_insn_queue): Changed constructor parameters to common form.
Added m_segment and m_memory_order member variables.
* hsa-brig.c (emit_signal_insn): Remove obsolete comment.  Update
member variable name, pick a type according to profile.
(emit_alloca_insn): Remove obsolete comment.
(emit_atomic_insn): Likewise.
(emit_queue_insn): Get segment and memory order from the IR object.
* hsa-dump.c (dump_hsa_insn_1): Update signal member variable
name.  Special dumping for queue objects.
* hsa-gen.c (hsa_insn_atomic): Fix function comment.
(hsa_insn_signal::hsa_insn_signal): Fix comment.  Update call to
ancestor constructor and initialization of new member variables.
(hsa_insn_queue::hsa_insn_queue): Added initialization of new
member variables.
(gen_hsa_insns_for_kernel_call): Use constructor arguments to
initialize signal memory order, remove signal, memory scope
initialization.  Use new constructor arguments to initialize queue
members.  Remove atomic instruction scope initialization.
---
 gcc/hsa-brig.c | 22 +---
 gcc/hsa-dump.c | 15 +++---
 gcc/hsa-gen.c  | 65 ++
 gcc/hsa.h  | 28 -
 4 files changed, 69 insertions(+), 61 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 716d8f5..1edd126 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -1333,10 +1333,6 @@ emit_signal_insn (hsa_insn_signal *mem)
 {
   struct BrigInstSignal repr;
 
-  /* This is necessary because of the erroneous typedef of
- BrigMemoryModifier8_t which introduces padding which may then contain
- random stuff (which we do not want so that we can test things don't
- change).  */
   memset (&repr, 0, sizeof (repr));
   repr.base.base.byteCount = lendian16 (sizeof (repr));
   repr.base.base.kind = lendian16 (BRIG_KIND_INST_SIGNAL);
@@ -1344,9 +1340,9 @@ emit_signal_insn (hsa_insn_signal *mem)
   repr.base.type = lendian16 (mem->m_type);
   repr.base.operands = lendian32 (emit_insn_operands (mem));
 
-  repr.memoryOrder = mem->m_memoryorder;
-  repr.signalOperation = mem->m_atomicop;
-  repr.signalType = BRIG_TYPE_SIG64;
+  repr.memoryOrder = mem->m_memory_order;
+  repr.signalOperation = mem->m_signalop;
+  repr.signalType = hsa_machine_large_p () ? BRIG_TYPE_SIG64 : BRIG_TYPE_SIG32;
 
   brig_code.add (&repr, sizeof (repr));
   brig_insn_count++;
@@ -1367,10 +1363,6 @@ emit_atomic_insn (hsa_insn_atomic *mem)
   else
 addr = as_a  (mem->get_op (1));
 
-  /* This is necessary because of the erroneous typedef of
- BrigMemoryModifier8_t which introduces padding which may then contain
- random stuff (which we do not want so that we can test things don't
- change).  */
   memset (&repr, 0, sizeof (repr));
   repr.base.base.byteCount = lendian16 (sizeof (repr));
   repr.base.base.kind = lendian16 (BRIG_KIND_INST_ATOMIC);
@@ -1447,10 +1439,6 @@ emit_alloca_insn (hsa_insn_alloca *alloca)
   struct BrigInstMem repr;
   gcc_checking_assert (alloca->operand_count () == 2);
 
-  /* This is necessary because of the erroneous typedef of
- BrigMemoryModifier8_t which introduces padding which may then contain
- random stuff (which we do not want so that we can test things don't
- change).  */
   memset (&repr, 0, sizeof (repr));
   repr.base.base.byteCount = lendian16 (sizeof (repr));
   repr.base.base.kind = lendian16 (BRIG_KIND_INST_MEM);
@@ -1747,8 +1735,8 @@ emit_queue_insn (hsa_insn_queue *insn)
   repr.base.base.kind = lendian16 (BRIG_KIND_INST_QUEUE);
   repr.base.opcode = lendian16 (insn->m_opcode);
   repr.base.type = lendian16 (insn->m_type);
-  repr.segment = BRIG_SEGMENT_GLOBAL;
-  repr.memoryOrder = BRIG_MEMORY_ORDER_SC_RELEASE;
+  repr.segment = insn->m_segment;
+  repr.memoryOrder = insn->m_memory_order;
   repr.base.operands = lendian32 (emit_insn_operands (insn));
   brig_data.round_size_up (4);
   brig_code.add (&repr, sizeof (repr));
diff --git a/gcc/hsa-dump.c 

[hsa-branch] Switch dynamic parallelism off by default

2016-08-03 Thread Martin Jambor
Hi,

the dynamic parallelism path (i.e. the ability to execute one HSA kernel
from another and in particular to wait for its completion) is unreliable
and problematic in many ways with no fix in sight.  I do not want to remove
the code just yet, it is likely to prove useful at least as a reference,
but I am going to switch it off by default even on the HSA branch (it is
not even present on trunk).  It can be switched on by means of a new
parameter, which is technically a switch, but I want to emphasize that
the interface is volatile.

Thanks,

Martin

2016-07-25  Martin Jambor  

* params.def (PARAM_HSA_EXPAND_GOMP_PARALLEL): New.
* hsa-gen.c (gen_hsa_insns_for_call): Only expand gomp_parallel if
the above parameter is set to one.
* invoke.texi (hsa-expand-omp-parallel): New.
---
 gcc/doc/invoke.texi |  6 ++
 gcc/hsa-gen.c   | 31 ++-
 gcc/params.def  |  6 ++
 3 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bddac9c..1ba10e4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9769,6 +9769,12 @@ Enable creation of gridified GPU kernels out of loops 
within target
 OpenMP constructs.  This conversion is enabled by default when
 offloading to HSA, to disable it, use @option{--param omp-gpu-gridify=0}
 
+@item hsa-expand-omp-parallel
+Enable compiling non-gridified OpenMP parallel constructs into HSAIL as
+invocations of child kernels in their own grid.  This behavior is
+disabled by default because in many scenarios it does not work
+properly.  To enable it, use @option{--param hsa-expand-omp-parallel=1}.
+
 @item hsa-gen-debug-stores
 Enable emission of special debug stores within HSA kernels which are
 then read and reported by libgomp plugin.  Generation of these stores
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 6accbd7..a944df4 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5870,20 +5870,25 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
 BRIG_WIDTH_ALL));
   break;
 case BUILT_IN_GOMP_PARALLEL:
-  {
-   gcc_checking_assert (gimple_call_num_args (stmt) == 4);
-   tree called = gimple_call_arg (stmt, 0);
-   gcc_checking_assert (TREE_CODE (called) == ADDR_EXPR);
-   called = TREE_OPERAND (called, 0);
-   gcc_checking_assert (TREE_CODE (called) == FUNCTION_DECL);
-
-   const char *name
- = hsa_brig_function_name (hsa_get_declaration_name (called));
-   hsa_add_kernel_dependency (hsa_cfun->m_decl, name);
-   gen_hsa_insns_for_kernel_call (hbb, as_a  (stmt));
+  if (PARAM_VALUE (PARAM_HSA_EXPAND_GOMP_PARALLEL) == 1)
+   {
+ gcc_checking_assert (gimple_call_num_args (stmt) == 4);
+ tree called = gimple_call_arg (stmt, 0);
+ gcc_checking_assert (TREE_CODE (called) == ADDR_EXPR);
+ called = TREE_OPERAND (called, 0);
+ gcc_checking_assert (TREE_CODE (called) == FUNCTION_DECL);
+
+ const char *name
+   = hsa_brig_function_name (hsa_get_declaration_name (called));
+ hsa_add_kernel_dependency (hsa_cfun->m_decl, name);
+ gen_hsa_insns_for_kernel_call (hbb, as_a  (stmt));
+   }
+  else
+   HSA_SORRY_AT (gimple_location (stmt), "expansion of ungridified "
+ "omp parallel is epxerimental, enable with "
+ "--param hsa-expand-omp-parallel");
+  break;
 
-   break;
-  }
 case BUILT_IN_OMP_GET_THREAD_NUM:
   {
query_hsa_grid_nodim (stmt, BRIG_OPCODE_WORKITEMFLATABSID, hbb);
diff --git a/gcc/params.def b/gcc/params.def
index 129da8f..632d5ef 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1242,6 +1242,12 @@ DEFPARAM (PARAM_OMP_GPU_GRIDIFY,
  "constructs",
  1, 0, 1)
 
+DEFPARAM (PARAM_HSA_EXPAND_GOMP_PARALLEL,
+ "hsa-expand-omp-parallel",
+ "Expand ungridified OpenMP parallel via dynamic parallelism "
+ "constructs",
+ 0, 0, 1)
+
 DEFPARAM (PARAM_HSA_GEN_DEBUG_STORES,
  "hsa-gen-debug-stores",
  "Level of hsa debug stores verbosity",
-- 
2.9.0



[hsa-branch] Outline some functionality in HSA libgomp plugin

2016-08-03 Thread Martin Jambor
Hi,

the patch below contains just code reorganization that I'd like to
commit to trunk to minimize non-functional differences between trunk and
the hsa branch, where I'll need these new functions also at different places.

Thanks,

Martin

2016-07-20  Martin Jambor  

libgomp/plugin/

* plugin-hsa.c (init_basic_kernel_info): New function.
(GOMP_OFFLOAD_load_image): Outline kernel initialization into it.
(run_kernel): New function.
(GOMP_OFFLOAD_run): Outlined most of functionality to the function
above.
---
 libgomp/plugin/plugin-hsa.c | 74 ++---
 1 file changed, 49 insertions(+), 25 deletions(-)

diff --git a/libgomp/plugin/plugin-hsa.c b/libgomp/plugin/plugin-hsa.c
index 7cd95cb..ef7a202 100644
--- a/libgomp/plugin/plugin-hsa.c
+++ b/libgomp/plugin/plugin-hsa.c
@@ -837,6 +837,29 @@ destroy_hsa_program (struct agent_info *agent)
   return true;
 }
 
+/* Initialize KERNEL from D and other parameters.  Return true on success. */
+
+static bool
+init_basic_kernel_info (struct kernel_info *kernel,
+   struct hsa_kernel_description *d,
+   struct agent_info *agent,
+   struct module_info *module)
+{
+  kernel->agent = agent;
+  kernel->module = module;
+  kernel->name = d->name;
+  kernel->omp_data_size = d->omp_data_size;
+  kernel->gridified_kernel_p = d->gridified_kernel_p;
+  kernel->dependencies_count = d->kernel_dependencies_count;
+  kernel->dependencies = d->kernel_dependencies;
+  if (pthread_mutex_init (&kernel->init_mutex, NULL))
+{
+  GOMP_PLUGIN_error ("Failed to initialize an HSA kernel mutex");
+  return false;
+}
+  return true;
+}
+
 /* Part of the libgomp plugin interface.  Load BRIG module described by struct
brig_image_desc in TARGET_DATA and return references to kernel descriptors
in TARGET_TABLE.  */
@@ -891,19 +914,8 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, void 
*target_data,
   pair->end = (uintptr_t) (kernel + 1);
 
   struct hsa_kernel_description *d = &image_desc->kernel_infos[i];
-  kernel->agent = agent;
-  kernel->module = module;
-  kernel->name = d->name;
-  kernel->omp_data_size = d->omp_data_size;
-  kernel->gridified_kernel_p = d->gridified_kernel_p;
-  kernel->dependencies_count = d->kernel_dependencies_count;
-  kernel->dependencies = d->kernel_dependencies;
-  if (pthread_mutex_init (&kernel->init_mutex, NULL))
-   {
- GOMP_PLUGIN_error ("Failed to initialize an HSA kernel mutex");
- return -1;
-   }
-
+  if (!init_basic_kernel_info (kernel, d, agent, module))
+   return -1;
   kernel++;
   pair++;
 }
@@ -1456,22 +1468,14 @@ packet_store_release (uint32_t* packet, uint16_t 
header, uint16_t rest)
   __atomic_store_n (packet, header | (rest << 16), __ATOMIC_RELEASE);
 }
 
-/* Part of the libgomp plugin interface.  Run a kernel on device N and pass it
-   an array of pointers in VARS as a parameter.  The kernel is identified by
-   FN_PTR which must point to a kernel_info structure.  */
+/* Run KERNEL on its agent, pass VARS to it as arguments and take
+   launchattributes from KLA.  */
 
 void
-GOMP_OFFLOAD_run (int n, void *fn_ptr, void *vars, void **args)
+run_kernel (struct kernel_info *kernel, void *vars,
+   struct GOMP_kernel_launch_attributes *kla)
 {
-  struct kernel_info *kernel = (struct kernel_info *) fn_ptr;
   struct agent_info *agent = kernel->agent;
-  struct GOMP_kernel_launch_attributes def;
-  struct GOMP_kernel_launch_attributes *kla;
-  if (!parse_target_attributes (args, &def, &kla))
-{
-  HSA_DEBUG ("Will not run HSA kernel because the grid size is zero\n");
-  return;
-}
   if (pthread_rwlock_rdlock (&agent->modules_rwlock))
 GOMP_PLUGIN_fatal ("Unable to read-lock an HSA agent rwlock");
 
@@ -1596,6 +1600,26 @@ GOMP_OFFLOAD_run (int n, void *fn_ptr, void *vars, void **args)
 GOMP_PLUGIN_fatal ("Unable to unlock an HSA agent rwlock");
 }
 
+/* Part of the libgomp plugin interface.  Run a kernel on device N (the number
+   is actually ignored, we assume the FN_PTR has been mapped using the correct
+   device) and pass it an array of pointers in VARS as a parameter.  The kernel
+   is identified by FN_PTR which must point to a kernel_info structure.  */
+
+void
+GOMP_OFFLOAD_run (int n __attribute__((unused)),
+ void *fn_ptr, void *vars, void **args)
+{
+  struct kernel_info *kernel = (struct kernel_info *) fn_ptr;
+  struct GOMP_kernel_launch_attributes def;
+  struct GOMP_kernel_launch_attributes *kla;
+  if (!parse_target_attributes (args, &def, &kla))
+{
+  HSA_DEBUG ("Will not run HSA kernel because the grid size is zero\n");
+  return;
+}
+  run_kernel (kernel, vars, kla);
+}
+
 /* Information to be passed to a thread running a kernel asycnronously.  */
 
 struct async_run_info
-- 
2.9.0



[hsa-branch] Refactoring of handling atomics

2016-08-03 Thread Martin Jambor
Hi,

the following patch is a part of an effort to implement handling of HSA
signal builtins but contains mainly refactoring and tiny fixes of issues
that I came across when changing the code.  I have divided this work in
this way so that this part can later be cherry-picked for trunk, unlike
the rest.

Martin

2016-07-19  Martin Jambor  

* hsa-gen.c (get_memory_order_name): Removed.
(get_memory_order): Likewise.
(hsa_memorder_from_tree): New function.
(gen_hsa_ternary_atomic_for_builtin): Renamed to
gen_hsa_atomic_for_builtin, can also create signals.
(gen_hsa_insns_for_call): Adjust to use hsa_memorder_from_tree
and gen_hsa_atomic_for_builtin.  Prepare for also handling signal
builtins.
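
The consolidation described in the ChangeLog above (two switch-based
helpers folded into one function that yields both the order and its
name) can be sketched roughly as follows.  This is only an
illustration: the enums mem_model and brig_order and the function
memorder_from_model are hypothetical stand-ins, not GCC's MEMMODEL_*
or BRIG_MEMORY_ORDER_* definitions, and the tree and location handling
of the real function is left out.  It does keep the patch's
convention of returning true on failure.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the memory model and BRIG order constants.
enum class mem_model { relaxed, consume, acquire, release, acq_rel, seq_cst,
                       unknown };
enum class brig_order { none, relaxed, sc_acquire, sc_release,
                        sc_acquire_release };

// Store the order and its name through the out-parameters.  Returns true
// if the model is not supported and false on success.
bool memorder_from_model(mem_model mm, brig_order *order, const char **name)
{
  switch (mm) {
  case mem_model::relaxed:
    *order = brig_order::relaxed;
    *name = "relaxed";
    return false;
  case mem_model::consume:
    // No direct equivalent, so fall back to the slightly stronger acquire.
    *order = brig_order::sc_acquire;
    *name = "consume";
    return false;
  case mem_model::acquire:
    *order = brig_order::sc_acquire;
    *name = "acquire";
    return false;
  case mem_model::release:
    *order = brig_order::sc_release;
    *name = "release";
    return false;
  case mem_model::acq_rel:
    *order = brig_order::sc_acquire_release;
    *name = "acq_rel";
    return false;
  case mem_model::seq_cst:
    *order = brig_order::sc_acquire_release;
    *name = "seq_cst";
    return false;
  default:
    break;
  }
  return true;  // unsupported model; out-parameters are left untouched
}
```

A caller would check the return value first and only then use the
order and the name, which avoids the separate name lookup the removed
helpers needed.
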
---
 gcc/hsa-gen.c | 227 --
 1 file changed, 111 insertions(+), 116 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index a944df4..ecd6f8a 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -5202,95 +5202,86 @@ expand_memory_set (gimple *stmt, unsigned HOST_WIDE_INT n,
   expand_lhs_of_string_op (stmt, n, merge_bb, builtin);
 }
 
-/* Return string for MEMMODEL.  */
+/* Store into MEMORDER the memory order specified by tree T, which must be an
+   integer constant representing a C++ memory order.  If it isn't, issue an HSA
+   sorry message using LOC and return true, otherwise return false and store
+   the name of the requested order to *MNAME.  */
 
-static const char *
-get_memory_order_name (unsigned memmodel)
+static bool
+hsa_memorder_from_tree (tree t, BrigMemoryOrder *memorder, const char **mname,
+   location_t loc)
 {
-  switch (memmodel & MEMMODEL_BASE_MASK)
+  if (!tree_fits_uhwi_p (t))
 {
-case MEMMODEL_RELAXED:
-  return "relaxed";
-case MEMMODEL_CONSUME:
-  return "consume";
-case MEMMODEL_ACQUIRE:
-  return "acquire";
-case MEMMODEL_RELEASE:
-  return "release";
-case MEMMODEL_ACQ_REL:
-  return "acq_rel";
-case MEMMODEL_SEQ_CST:
-  return "seq_cst";
-default:
-  return NULL;
+  HSA_SORRY_ATV (loc, "support for HSA does not implement memory model %E",
+t);
+  return true;
 }
-}
 
-/* Return memory order according to predefined __atomic memory model
-   constants.  LOCATION is provided to locate the problematic statement.  */
-
-static BrigMemoryOrder
-get_memory_order (unsigned memmodel, location_t location)
-{
-  switch (memmodel & MEMMODEL_BASE_MASK)
+  unsigned HOST_WIDE_INT mm = tree_to_uhwi (t);
+  switch (mm & MEMMODEL_BASE_MASK)
 {
 case MEMMODEL_RELAXED:
-  return BRIG_MEMORY_ORDER_RELAXED;
+  *memorder = BRIG_MEMORY_ORDER_RELAXED;
+  *mname = "relaxed";
+  break;
 case MEMMODEL_CONSUME:
   /* HSA does not have an equivalent, but we can use the slightly stronger
 ACQUIRE.  */
+  *memorder = BRIG_MEMORY_ORDER_SC_ACQUIRE;
+  *mname = "consume";
+  break;
 case MEMMODEL_ACQUIRE:
-  return BRIG_MEMORY_ORDER_SC_ACQUIRE;
+  *memorder = BRIG_MEMORY_ORDER_SC_ACQUIRE;
+  *mname = "acquire";
+  break;
 case MEMMODEL_RELEASE:
-  return BRIG_MEMORY_ORDER_SC_RELEASE;
+  *memorder = BRIG_MEMORY_ORDER_SC_RELEASE;
+  *mname = "release";
+  break;
 case MEMMODEL_ACQ_REL:
+  *memorder = BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
+  *mname = "acq_rel";
+  break;
 case MEMMODEL_SEQ_CST:
   /* Callers implementing a simple load or store need to remove the release
 or acquire part respectively.  */
-  return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
+  *memorder = BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
+  *mname = "seq_cst";
+  break;
 default:
   {
-   const char *mmname = get_memory_order_name (memmodel);
-   HSA_SORRY_ATV (location,
-  "support for HSA does not implement the specified "
-  " memory model%s %s",
-  mmname ? ": " : "", mmname ? mmname : "");
-   return BRIG_MEMORY_ORDER_NONE;
+   HSA_SORRY_AT (loc, "support for HSA does not implement the specified "
+ "memory model");
+   return true;
   }
 }
+  return false;
 }
 
-/* Helper function to create an HSA atomic binary operation instruction out of
-   calls to atomic builtins.  RET_ORIG is true if the built-in is the variant
-   that return s the value before applying operation, and false if it should
-   return the value after applying the operation (if it returns value at all).
-   ACODE is the atomic operation code, STMT is a gimple call to a builtin.  HBB
-   is the HSA BB to which the instruction should be added.  */
+/* Helper function to create an HSA atomic operation instruction out of calls
+   to atomic builtins.  RET_ORIG is true if the built-in is the variant that
+   return s the value before applying operation, and false if it should return
+   the value after 

Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-08-03 Thread James Greenhalgh
On Wed, Aug 03, 2016 at 12:52:42PM +0100, Ramana Radhakrishnan wrote:
> On Thu, Jul 28, 2016 at 12:37 PM, Ramana Radhakrishnan
>  wrote:
> > On Mon, Jul 4, 2016 at 3:02 PM, Matthew Wahab
> >  wrote:
> >> On 19/05/16 15:54, Matthew Wahab wrote:
> >>> On 18/05/16 16:20, Joseph Myers wrote:
>  On Wed, 18 May 2016, Matthew Wahab wrote:
> 
>  In short: instructions for direct HFmode arithmetic should be described
>  with patterns with the standard names.  It's the job of the
>  architecture-independent compiler to ensure that fp16 arithmetic in the
>  user's source code only generates direct fp16 arithmetic in GIMPLE (and
>  thus ends up using those patterns) if that is a correct representation of
>  the source code's semantics according to ACLE.
> 
>  The intrinsics you provide can then be written to use direct arithmetic,
>  and rely on convert_to_real_1 eliminating the promotions, rather than
>  needing built-in functions at all, just like many arm_neon.h intrinsics
>  make direct use of GNU C vector arithmetic.
> >>>
> >>> I think it's clear that this has exhausted my knowledge of FP semantics.
> >>>
> >>> Forcing promotion to single-precision was to settle concerns brought up in
> >>> internal discussions about __fp16 semantics. I'll see if anybody has any
> >>> problem with the changes you suggest.
> >>
> >> This patch changes the implementation to use the standard names for the
> >> HFmode arithmetic. Later patches will also be updated to use the
> >> arithmetic operators where appropriate.
> >>
> >> Changes since the last version of this patch:
> >> - The standard names for plus, minus, mult, div and fma are defined for
> >>   HF mode.
> >> - The patterns supporting the new ACLE intrinsics vnegh_f16, vaddh_f16,
> >>   vsubh_f16, vmulh_f16 and vdivh_f16 are removed, the arithmetic
> >>   operators will be used instead.
> >> - The tests are updated to expect f16 instructions rather than the f32
> >>   instructions that were previously emitted.
> >>
> >> Tested the series for arm-none-linux-gnueabihf with native bootstrap and
> >> make check and for arm-none-eabi and armeb-none-eabi with make check on
> >> an ARMv8.2-A emulator.
> >
> >
> > All fine except -
> >
> > Why can we not extend the  and the l in
> > vfp.md for fp16 and avoid all the unspecs for vcvta and vrnd*
> > instructions ?
> >
> 
> I now feel reasonably convinced that these can go away and be replaced
> by extending the  and l expanders to
> consider FP16 as well. Given that we are still only in the middle of
> stage1 - I'm ok for you to apply this as is and then follow-up with a
> patch that gets rid of the UNSPECs . If this holds for add, sub and
> other patterns I don't see why it wouldn't hold for all these patterns
> as well.
> 
> Joseph, do you have any opinions on whether we should be extending the
> standard pattern names or not for btrunc, ceil, round, floor,
> nearbyint, rint, lround, lfloor and lceil optabs for the HFmode
> quantities ?

Mapping these to standard pattern names is the right thing to do if they
implement the correct semantics for those standard pattern names. That's
true whether you access them by function name (as you would for _Float16),
or as intrinsics (as you may want to do for __fp16 in arm_fp16.h).

I see that the ARM port doesn't have as general a mechanism for specifying
intrinsics in config/arm/arm_neon_builtins.def as the AArch64 port has in
config/aarch64/aarch64-simd-builtins.def . In the AArch64 port it is
perfectly acceptable for a builtin to map on to a standard pattern name.
In the ARM port it seems there is a limitation such that all builtins *must*
map on to pattern names with the prefix "neon_".

Fixing this limitation (perhaps in the way that AArch64 goes about it with
a series of magic macros) would permit these to be Standard Pattern names.
See https://gcc.gnu.org/ml/gcc-patches/2013-04/msg01219.html for what I did
to AArch64 3 years ago.

I think that's probably the right way to go about resolving this, but I
haven't looked too hard in to what it would take in the ARM port to refactor
along those lines.
 
Thanks,
James



[PATCH] Fix c-c++-common/ubsan/pr71403-?.c testcases

2016-08-03 Thread Richard Biener

c-c++-common/ubsan is tortured, no need to specify -O3 (in fact
it's a waste of time).

Committed.

Richard.

2016-08-03  Richard Biener  

* c-c++-common/ubsan/pr71403-1.c: Use dg-additional-options
and remove -O3.
* c-c++-common/ubsan/pr71403-2.c: Likewise.
* c-c++-common/ubsan/pr71403-3.c: Likewise.

Index: c-c++-common/ubsan/pr71403-1.c
===
--- c-c++-common/ubsan/pr71403-1.c  (revision 239066)
+++ c-c++-common/ubsan/pr71403-1.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -fsanitize=unreachable" } */
+/* { dg-additional-options "-fsanitize=unreachable" } */
 
 char a = -97;
 int b, c, d, e;
Index: c-c++-common/ubsan/pr71403-2.c
===
--- c-c++-common/ubsan/pr71403-2.c  (revision 239066)
+++ c-c++-common/ubsan/pr71403-2.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -fsanitize=unreachable" } */
+/* { dg-additional-options "-fsanitize=unreachable" } */
 
 char a, c;
 short b;
Index: c-c++-common/ubsan/pr71403-3.c
===
--- c-c++-common/ubsan/pr71403-3.c  (revision 239066)
+++ c-c++-common/ubsan/pr71403-3.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -fsanitize=unreachable" } */
+/* { dg-additional-options "-fsanitize=unreachable" } */
 
 
 int a, b, c, d;



[hsa-branch] Update existing hsa builtin handling

2016-08-03 Thread Martin Jambor
Hi,

this patch updates the names of HSA builtins that are emitted by omp
lowering so that their names correspond to the HSA instruction names
they are implemented by.  This avoids a lot of confusion when we add
more HSA builtins for other special operations the architecture has.

Moreover, it cleans up how we handle them and also reimplements handling
of OpenMP builtins and calls to run-time OpenMP functions, so that they
work as expected in multi-dimensional grids.
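
As a rough illustration of what "work as expected in multi-dimensional
grids" can mean, the sketch below validates a grid dimension and
multiplies a per-dimension quantity across all three dimensions.  It
is an assumption based only on the function names in the ChangeLog
(the body of multiply_grid_dim_characteristics is not shown here);
valid_grid_dim and multiply_over_dims are hypothetical names, and the
query callback stands in for emitting a per-dimension HSA instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// HSA grids have at most three dimensions, numbered 0, 1 and 2.
constexpr unsigned max_grid_dims = 3;

// Accept only dimensions 0, 1 and 2.
bool valid_grid_dim(std::uint64_t dim)
{
  return dim < max_grid_dims;
}

// Multiply the value returned by QUERY for each of the three dimensions,
// e.g. a total thread count obtained from per-dimension sizes.
std::uint64_t
multiply_over_dims(const std::function<std::uint64_t(unsigned)> &query)
{
  std::uint64_t product = 1;
  for (unsigned d = 0; d < max_grid_dims; ++d)
    product *= query(d);
  return product;
}
```

With per-dimension sizes 4, 2 and 1, for instance, the combined value
is 8; a query that only looked at dimension 0 would report 4.
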

Thanks,

Martin

2016-07-20  Martin Jambor  

* hsa-builtins.def (BUILT_IN_HSA_GET_WORKGROUP_ID): Renamed to
BUILT_IN_HSA_WORKGROUPID.
(BUILT_IN_HSA_GET_WORKITEM_ID): Renamed to BUILT_IN_HSA_WORKITEMID.
(BUILT_IN_HSA_GET_WORKITEM_ABSID): Renamed to
BUILT_IN_HSA_WORKITEMABSID.
* hsa-gen.c (query_hsa_grid): Renamed to
query_hsa_grid_dim, reimplemented, cut down to two overloads.
(query_hsa_grid_nodim): New function.
(multiply_grid_dim_characteristics): Likewise.
(gen_get_num_threads): Likewise.
(gen_get_num_teams): Reimplemented.
(gen_get_team_num): Likewise.
(gen_hsa_insns_for_known_library_call): Updated calls to the above
helper functions.
(gen_hsa_insns_for_call): Updated names of builtins, use
the above helper functions to handle them.
* omp-low.c (grid_expand_omp_for_loop): Update builtin names.
---
 gcc/hsa-builtins.def |   6 +-
 gcc/hsa-gen.c| 173 ++-
 gcc/omp-low.c|   6 +-
 3 files changed, 123 insertions(+), 62 deletions(-)

diff --git a/gcc/hsa-builtins.def b/gcc/hsa-builtins.def
index 3f183f1..dcd0c55 100644
--- a/gcc/hsa-builtins.def
+++ b/gcc/hsa-builtins.def
@@ -27,9 +27,9 @@ along with GCC; see the file COPYING3.  If not see
 /* The reason why they aren't in gcc/builtins.def is that the Fortran front end
doesn't source those.  */
 
-DEF_HSA_BUILTIN (BUILT_IN_HSA_GET_WORKGROUP_ID, "hsa_get_workgroup_id",
+DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKGROUPID, "hsa_workgroupid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
-DEF_HSA_BUILTIN (BUILT_IN_HSA_GET_WORKITEM_ID, "hsa_get_workitem_id",
+DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKITEMID, "hsa_workitemid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
-DEF_HSA_BUILTIN (BUILT_IN_HSA_GET_WORKITEM_ABSID, "hsa_get_workitem_absid",
+DEF_HSA_BUILTIN (BUILT_IN_HSA_WORKITEMABSID, "hsa_workitemabsid",
 BT_FN_UINT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index e16a5c7..24cc2c7 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -3795,8 +3795,8 @@ hsa_insn_basic::set_output_in_type (hsa_op_reg *dest, unsigned op_index,
HBB.  */
 
 static void
-query_hsa_grid (hsa_op_reg *dest, BrigType16_t opcode,  hsa_op_immed *dimension,
-   hsa_bb *hbb)
+query_hsa_grid_dim (hsa_op_reg *dest, int opcode, hsa_op_immed *dimension,
+   hsa_bb *hbb)
 {
   hsa_insn_basic *insn = new hsa_insn_basic (2, opcode, BRIG_TYPE_U32, NULL,
 dimension);
@@ -3804,32 +3804,49 @@ query_hsa_grid (hsa_op_reg *dest, BrigType16_t opcode,  hsa_op_immed *dimension,
   insn->set_output_in_type (dest, 0, hbb);
 }
 
-/* Generate a special HSA-related instruction for gimple STMT.
-   Instructions are appended to basic block HBB.  */
+/* Generate instruction OPCODE to query a property of HSA grid along the given
+   dimension which is an immediate in first argument of STMT.  Store result
+   into the register corresponding to LHS of STMT and append the instruction to
+   HBB.  */
 
 static void
-query_hsa_grid (gimple *stmt, BrigOpcode16_t opcode, hsa_op_immed *dimension,
-   hsa_bb *hbb)
+query_hsa_grid_dim (gimple *stmt, int opcode, hsa_bb *hbb)
 {
   tree lhs = gimple_call_lhs (dyn_cast  (stmt));
   if (lhs == NULL_TREE)
 return;
 
-  hsa_op_reg *dest = hsa_cfun->reg_for_gimple_ssa (lhs);
+  tree arg = gimple_call_arg (stmt, 0);
+  unsigned HOST_WIDE_INT dim = 5;
+  if (tree_fits_uhwi_p (arg))
+dim = tree_to_uhwi (arg);
+  if (dim > 2)
+{
+  HSA_SORRY_AT (gimple_location (stmt),
+   "HSA grid query dimension must be immediate constant 0, 1 "
+   "or 2");
+  return;
+}
 
-  query_hsa_grid (dest, opcode, dimension, hbb);
+  hsa_op_immed *hdim = new hsa_op_immed (dim, (BrigKind16_t) BRIG_TYPE_U32);
+  hsa_op_reg *dest = hsa_cfun->reg_for_gimple_ssa (lhs);
+  query_hsa_grid_dim (dest, opcode, hdim, hbb);
 }
 
-/* Generate a special HSA-related instruction for gimple STMT.
-   Instructions are appended to basic block HBB.  */
+/* Generate instruction OPCODE to query a property of HSA grid that is
+   independent of any dimension.  Store result into the register corresponding
+   to LHS of STMT and append the instruction to HBB.  */
 
 static void
-query_hsa_grid (gimple *stmt, BrigOpcode16_t opcode, int dimension,
- 

Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-08-03 Thread Jeff Law

On 08/02/2016 02:16 PM, Bernd Schmidt wrote:

On 08/02/2016 06:46 PM, Segher Boessenkool wrote:

On Tue, Aug 02, 2016 at 09:21:34AM -0600, Jeff Law wrote:

However I think there are more paradoxical subregs generated all over,
but the aarch64 insv code pattern did trigger more hidden bugs than
any other port.  It is certainly unfortunate that the major source
of paradoxical subreg is in a target-dependent code path :(


It is certainly unfortunate that paradoxical subregs exist at all!  :-)

Yea.  It probably seemed like a good idea 25-30 years ago, but I always
cringe when I see them being used.  Yea it gives the compiler some more
freedom, but more often than not I think we'd be better off with real
extensions.


And then perhaps have some bits marked as "do not care", perhaps using
a register note...  This would help other cases as well.


I'm thinking maybe an any_extend code to go along with sign_extend and
zero_extend. If input and output registers are the same it would be
treated like a no-op move. That might be close enough to get us the
benefits of a paradoxical subreg.

I was kind of thinking along the same lines.  Not high priority though.

Jeff


Re: [PATCH][RFC] PR middle-end/22141 GIMPLE store widening pass

2016-08-03 Thread Kyrill Tkachov

Hi Richard,

On 18/07/16 13:22, Richard Biener wrote:

On Fri, Jul 15, 2016 at 5:13 PM, Kyrill Tkachov
 wrote:





+  /* Record the original statements so that we can keep track of
+statements emitted in this pass and not re-process new
+statements.  */
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple *stmt = gsi_stmt (gsi);
+ if (!is_gimple_debug (stmt))
+   orig_stmts.add (stmt);
+ num_statements++;
+   }

please use gimple_set_visited () instead, that should be cheaper.


+  do
+   {
+ changes_made = false;
+ for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
...
+   }
+  while (changes_made);

looks pretty quadratic to me.  Instead of tracking things with m_curr_base_expr
why not use a hash-map to track stores related to a base?
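
The hash-map suggestion above could look roughly like the sketch
below: one pass over the stores, each appended to the chain for its
base, instead of rescanning the block until nothing changes.  The
store_info record and group_by_base are hypothetical and heavily
simplified (a real pass would key on the base tree, not a string, and
would also have to handle intervening aliasing accesses).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified record of one constant store.
struct store_info {
  std::string base;    // stands in for the base object of the access
  std::size_t offset;  // byte offset from the base
  std::size_t size;    // size of the store in bytes
};

using store_chains = std::unordered_map<std::string, std::vector<store_info>>;

// Group stores by base in a single pass, preserving program order
// within each chain.
store_chains group_by_base(const std::vector<store_info> &stores)
{
  store_chains chains;
  for (const store_info &s : stores)
    chains[s.base].push_back(s);
  return chains;
}
```

Each chain can then be examined for contiguous runs on its own, so the
cost is proportional to the number of stores plus the work per chain.
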

+  /* Don't handle bitfields that are not a multiple
+ of BITS_PER_UNIT for now.  Can be extended
+ later.  */
+  && (bitsize % BITS_PER_UNIT == 0)

:(

+  && !stmt_interferes_with_mem_accesses_p (lhs))

given this function loops over all accesses this is quadratic as well.

I think alias queries could be simplified if you reduce them to alias
checks on the base object (and allow overlapping constant stores
which should be easy to handle during merging).


I've implemented that and it simplified the detection code and reduced its
complexity, but I'm now missing some cases that were being caught before.
An example is:
struct bar {
  int a;
  char b;
  char c;
  char d;
  char e;
  char g;
};

void
foo1 (struct bar *p, char tmp)
{
  p->a = 0;
  p->b = 0;
  p->g = tmp;
  p->c = 0;
  p->d = 0;
  p->e = 0;
}

The store to 'g' doesn't interfere with the contiguous stores to the earlier
fields, but because we perform the aliasing checks on the base object 'p'
rather than on the full LHS of the assignments, it is deemed to alias the
constant stores.  As a result only the first two and the last three constant
stores are merged, instead of all five.

I'll experiment with some solutions for this involving recording the
non-constant stores and performing some trickery during the merging phase.
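
To make the desired result concrete, here is a small sketch of the
fully merged form written by hand.  It is not the output of the pass;
foo1_fieldwise and foo1_merged are names made up for illustration.
The five constant stores cover every byte that precedes 'g', so they
can be replaced by one block store of zeros, with the store to 'g'
kept separate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct bar {
  int a;
  char b;
  char c;
  char d;
  char e;
  char g;
};

// The stores as written in foo1, one field at a time.
void foo1_fieldwise(bar *p, char tmp)
{
  p->a = 0;
  p->b = 0;
  p->g = tmp;
  p->c = 0;
  p->d = 0;
  p->e = 0;
}

// Hand-merged equivalent: zero all bytes before 'g' with one block store,
// then perform the non-constant store to 'g'.
void foo1_merged(bar *p, char tmp)
{
  std::memset(p, 0, offsetof(bar, g));
  p->g = tmp;
}
```

Both functions leave every named field with the same value, which is
why the store to 'g' should not have to split the group.
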

Thanks,
Kyrill


The VUSE/VDEF handling is somewhat odd.  All stores have both
a VDEF and a VUSE, if you merge a set of them you can simply
re-use the VDEF/VUSE of one of them, effectively replacing the
stmt.  For all other stores you remove you miss a
   unlink_stmt_vdef (stmt);
   release_defs (stmt);
to update virtual SSA form and properly release SSA names.

As you use TBAA in your alias checks you may only _sink_
stores.  Not sure if your merged store placement ensures this.

I think this kind of transforms is useful in early optimizations, not only
very late as you perform it.  Of course it should be really cheap there.

Note that options like -ftree-store-widening are discouraged
("tree" means nothing to our users, and store widening isn't
what is done; it's store merging).  Simply name it -fstore-merging.
Also the file name should be gimple-ssa-store-merging.c

Looking forward to an algorithmically enhanced version.

Richard.



Thanks,
Kyrill

N.B. I'm going on vacation until August so I won't be able to respond to any
feedback until I get back.

[1] https://gcc.gnu.org/ml/gcc-patches/2009-09/msg01745.html

2016-07-15  Kyrylo Tkachov  

 PR middle-end/22141
 * Makefile.in (OBJS): Add tree-ssa-store-widening.o.
 * common.opt (ftree-store-widening): New Optimization option.
 * opts.c (default_options_table): Add entry for
 OPT_ftree_store_widening.
 * params.def (PARAM_STORE_WIDENING_ALLOW_UNALIGNED): Define.
 * passes.def: Insert pass_tree_store_widening.
 * tree-pass.h (make_pass_tree_store_widening): Declare extern
 prototype.
 * tree-ssa-store-widening.c: New file.
 * doc/invoke.texi (Optimization Options): Document
 -ftree-store-widening.

2016-07-15  Kyrylo Tkachov  
 Jakub Jelinek  

 PR middle-end/22141
 * gcc.c-torture/execute/pr22141-1.c: New test.
 * gcc.c-torture/execute/pr22141-2.c: Likewise.
 * gcc.target/aarch64/ldp_stp_1.c: Adjust for -ftree-store-widening.
 * gcc.dg/store_widening_1.c: New test.
 * gcc.dg/store_widening_2.c: Likewise.
 * gcc.dg/store_widening_3.c: Likewise.
 * gcc.dg/store_widening_4.c: Likewise.
 * gcc.target/i386/pr22141.c: Likewise.




[PATCH 1/4] selftest.h: Add ASSERT_TRUE_AT and ASSERT_FALSE_AT

2016-08-03 Thread David Malcolm
I split out the selftest.h changes from v2 of the kit for ease of review;
here they are.

Successfully bootstrapped&regrtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* selftest.h (ASSERT_TRUE): Reimplement in terms of...
(ASSERT_TRUE_AT): New macro.
(ASSERT_FALSE): Reimplement in terms of...
(ASSERT_FALSE_AT): New macro.
(ASSERT_STREQ_AT): Fix typo in comment.
---
 gcc/selftest.h | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/gcc/selftest.h b/gcc/selftest.h
index 0bee476..397e998 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -104,13 +104,19 @@ extern int num_passes;
::selftest::fail if it false.  */
 
 #define ASSERT_TRUE(EXPR)  \
+  ASSERT_TRUE_AT (SELFTEST_LOCATION, (EXPR))
+
+/* Like ASSERT_TRUE, but treat LOC as the effective location of the
+   selftest.  */
+
+#define ASSERT_TRUE_AT(LOC, EXPR)  \
   SELFTEST_BEGIN_STMT  \
   const char *desc = "ASSERT_TRUE (" #EXPR ")";\
   bool actual = (EXPR);\
   if (actual)  \
-::selftest::pass (SELFTEST_LOCATION, desc);\
+::selftest::pass ((LOC), desc);\
   else \
-::selftest::fail (SELFTEST_LOCATION, desc);\
+::selftest::fail ((LOC), desc);\
   SELFTEST_END_STMT
 
 /* Evaluate EXPR and coerce to bool, calling
@@ -118,13 +124,19 @@ extern int num_passes;
::selftest::fail if it true.  */
 
 #define ASSERT_FALSE(EXPR) \
+  ASSERT_FALSE_AT (SELFTEST_LOCATION, (EXPR))
+
+/* Like ASSERT_FALSE, but treat LOC as the effective location of the
+   selftest.  */
+
+#define ASSERT_FALSE_AT(LOC, EXPR) \
   SELFTEST_BEGIN_STMT  \
-  const char *desc = "ASSERT_FALSE (" #EXPR ")";   \
-  bool actual = (EXPR);\
-  if (actual)  \
-::selftest::fail (SELFTEST_LOCATION, desc);\
-  else \
-::selftest::pass (SELFTEST_LOCATION, desc);\
+  const char *desc = "ASSERT_FALSE (" #EXPR ")";   \
+  bool actual = (EXPR);\
+  if (actual)  \
+::selftest::fail ((LOC), desc);\
+  else \
+::selftest::pass ((LOC), desc);\
   SELFTEST_END_STMT
 
 /* Evaluate EXPECTED and ACTUAL and compare them with ==, calling
@@ -169,7 +181,7 @@ extern int num_passes;
(EXPECTED), (ACTUAL));  \
   SELFTEST_END_STMT
 
-/* Like ASSERT_STREQ_AT, but treat LOC as the effective location of the
+/* Like ASSERT_STREQ, but treat LOC as the effective location of the
selftest.  */
 
 #define ASSERT_STREQ_AT(LOC, EXPECTED, ACTUAL) \
-- 
1.8.5.3
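
The point of the _AT variants is that a shared checking helper can
report results against its caller's location rather than its own.  The
following is a self-contained sketch of that idea only: everything in
namespace sketch, the SKETCH_* macros and check_positive are
hypothetical stand-ins and are not the selftest.h definitions; in
particular the results are recorded in a vector here purely so that
they can be inspected.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the location type behind SELFTEST_LOCATION.
struct location {
  const char *file;
  int line;
};

// One recorded result of an assertion.
struct outcome {
  bool passed;
  location loc;
  std::string desc;
};

inline std::vector<outcome> &outcomes()
{
  static std::vector<outcome> log;
  return log;
}

inline void pass(const location &loc, const char *desc)
{
  outcomes().push_back(outcome{true, loc, desc});
}

inline void fail(const location &loc, const char *desc)
{
  outcomes().push_back(outcome{false, loc, desc});
}

} // namespace sketch

#define SKETCH_LOCATION (::sketch::location{__FILE__, __LINE__})

// Evaluate EXPR once and record the result against LOC.
#define SKETCH_ASSERT_TRUE_AT(LOC, EXPR)             \
  do {                                               \
    const char *desc_ = "ASSERT_TRUE (" #EXPR ")";   \
    bool actual_ = (EXPR);                           \
    if (actual_)                                     \
      ::sketch::pass((LOC), desc_);                  \
    else                                             \
      ::sketch::fail((LOC), desc_);                  \
  } while (0)

// The plain form is expressed in terms of the _AT form.
#define SKETCH_ASSERT_TRUE(EXPR) \
  SKETCH_ASSERT_TRUE_AT(SKETCH_LOCATION, (EXPR))

// A shared helper: the result is attributed to the location passed in.
inline void check_positive(const sketch::location &loc, int value)
{
  SKETCH_ASSERT_TRUE_AT(loc, value > 0);
}
```

A test would call check_positive (SKETCH_LOCATION, n), so that a
failure points at the line in the test rather than at the line inside
the helper.
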



[PATCH 4/4] c-format.c: suggest the correct format string to use (PR c/64955)

2016-08-03 Thread David Malcolm
This adds fix-it hints to c-format.c so that it can (sometimes) suggest
the format string the user should have used.
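
A replacement fix-it hint amounts to "replace this byte range of the
format string with this text".  The sketch below applies such a
replacement to a plain string to show the intended effect; it is a
hypothetical helper (apply_replacement is not part of the patch) and
ignores everything the real code has to deal with, such as locations,
macros and string concatenation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace LEN bytes of FMT starting at START with CORRECTED.  If the range
// does not lie within FMT, return FMT unchanged.
std::string apply_replacement(const std::string &fmt, std::size_t start,
                              std::size_t len, const std::string &corrected)
{
  if (start > fmt.size() || len > fmt.size() - start)
    return fmt;
  std::string result = fmt;
  result.replace(start, len, corrected);
  return result;
}
```

For the format string "hello %i" with a string argument, replacing the
two bytes at offset 6 with "%s" gives "hello %s".
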

The patch adds selftests for the new code in c-format.c.  These
selftests are thus lang-specific.  This is the first time we've had
lang-specific selftests, and hence the patch also adds a langhook for
running them.  (Note that currently the Makefile only invokes the
selftests for cc1).

Successfully bootstrapped&regrtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

(The v2 version of the patch had a successful selftest run for stage 1 on
powerpc-ibm-aix7.1.3.0 (gcc111) in conjunction with the rest of the patch
kit, and a successful build of stage1 for all targets via config-list.mk;
the patch has only been rebased since)

OK for trunk if it passes testing?

gcc/c-family/ChangeLog:
PR c/64955
* c-common.h (selftest::c_format_c_tests): New declaration.
(selftest::run_c_tests): New declaration.
* c-format.c: Include "selftest.h".
(format_warning_va): Add param "corrected_substring" and use
it to add a replacement fix-it hint.
(format_warning_at_substring): Likewise.
(format_warning_at_char): Update for new param of
format_warning_va.
(check_format_info_main): Pass "fki" to check_format_types.
(check_format_types): Add param "fki" and pass it to
format_type_warning.
(deref_n_times): New function.
(get_modifier_for_format_len): New function.
(selftest::test_get_modifier_for_format_len): New function.
(get_format_for_type): New function.
(format_type_warning): Add param "fki" and use it to attempt
to provide hints for argument types when calling
format_warning_at_substring.
(selftest::get_info): New function.
(selftest::assert_format_for_type_streq): New function.
(ASSERT_FORMAT_FOR_TYPE_STREQ): New macro.
(selftest::test_get_format_for_type_printf): New function.
(selftest::test_get_format_for_type_scanf): New function.
(selftest::c_format_c_tests): New function.

gcc/c/ChangeLog:
PR c/64955
* c-lang.c (LANG_HOOKS_RUN_LANG_SELFTESTS): If CHECKING_P, wire
this up to selftest::run_c_tests.
(selftest::run_c_tests): New function.

gcc/ChangeLog:
PR c/64955
* langhooks-def.h (LANG_HOOKS_RUN_LANG_SELFTESTS): New default
do-nothing langhook.
(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_RUN_LANG_SELFTESTS.
* langhooks.h (struct lang_hooks): Add run_lang_selftests.
* selftest-run-tests.c: Include "tree.h" and "langhooks.h".
(selftest::run_tests): Call lang_hooks.run_lang_selftests.

gcc/testsuite/ChangeLog:
PR c/64955
* gcc.dg/format/diagnostic-ranges.c: Add fix-it hints to expected
output.
---
 gcc/c-family/c-common.h |   7 +
 gcc/c-family/c-format.c | 268 ++--
 gcc/c/c-lang.c  |  22 ++
 gcc/langhooks-def.h |   4 +-
 gcc/langhooks.h |   3 +
 gcc/selftest-run-tests.c|   5 +
 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c |  30 ++-
 7 files changed, 319 insertions(+), 20 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7b5da57..61f9ced 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1533,4 +1533,11 @@ extern bool valid_array_size_p (location_t, tree, tree);
 extern bool cilk_ignorable_spawn_rhs_op (tree);
 extern bool cilk_recognize_spawn (tree, tree *);
 
+#if CHECKING_P
+namespace selftest {
+  extern void c_format_c_tests (void);
+  extern void run_c_tests (void);
+} // namespace selftest
+#endif /* #if CHECKING_P */
+
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 5b79588..f5a4011 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "c-format.h"
 #include "diagnostic.h"
+#include "selftest.h"
 
 /* Handle attributes associated with format checking.  */
 
@@ -126,11 +127,21 @@ static int format_flags (int format_num);
  printf(fmt, msg);
 ^~~  ~~~
 
+   If CORRECTED_SUBSTRING is non-NULL, use it for cases 1 and 2 to provide
+   a fix-it hint, suggesting that it should replace the text within the
+   substring range.  For example:
+
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf ("hello %i", msg);
+~^
+%s
+
Return true if a warning was emitted, false otherwise.  */
 
-ATTRIBUTE_GCC_DIAG (4,0)
+ATTRIBUTE_GCC_DIAG (5,0)
 static bool
 format_warning_va (const substring_loc &fmt_loc, source_range *param_range,
+  const char *corrected_substring,
   int opt,

[PATCH 3/4] Use class substring_loc in c-format.c (PR c/52952)

2016-08-03 Thread David Malcolm
This patch updates c-format.c to use the new class substring_loc, added
in the previous patch, replacing location_column_from_byte_offset.
Hence with this patch, Wformat can underline the precise erroneous
format string in many more cases.

The patch also introduces two new functions for emitting Wformat
warnings: format_warning_at_substring and format_warning_at_char,
providing an inform in the face of macros where the pertinent part of
the format string may be separate from the function call.
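
For readers unfamiliar with the underlining style, the sketch below
builds the annotation line for a substring range, with the caret at
the end of the range.  It is a hypothetical helper written for
illustration (underline_with_caret_at_end is not in the patch) and it
assumes 1-based columns with one column per byte.

```cpp
#include <cassert>
#include <string>

// Build an annotation line for the inclusive column range
// [START_COL, END_COL]: spaces up to the range, '~' for every column but
// the last, and '^' at the last column.  Columns are 1-based.
std::string underline_with_caret_at_end(unsigned start_col, unsigned end_col)
{
  if (start_col == 0 || end_col < start_col)
    return std::string();
  std::string line(start_col - 1, ' ');
  line.append(end_col - start_col, '~');
  line.push_back('^');
  return line;
}
```

For a two-column directive such as "%i" occupying columns 16 and 17,
this yields fifteen spaces followed by "~^".
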

Successfully bootstrapped&regrtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

(The v2 version of the patch had a successful selftest run for stage 1 on
powerpc-ibm-aix7.1.3.0 (gcc111) in conjunction with the rest of the patch
kit, and a successful build of stage1 for all targets via config-list.mk;
the patch has only been rebased since)

OK for trunk if it passes individual testing? (on top of patches 1-2)

gcc/c-family/ChangeLog:
PR c/52952
* c-format.c: Include "diagnostic.h".
(location_column_from_byte_offset): Delete.
(location_from_offset): Delete.
(format_warning_va): New function.
(format_warning_at_substring): New function.
(format_warning_at_char): New function.
(check_format_arg): Capture location of format_tree and pass to
check_format_info_main.
(check_format_info_main): Add params FMT_PARAM_LOC and
FORMAT_STRING_CST.  Convert calls to warning_at to calls to
format_warning_at_char.  Pass a substring_loc instance to
check_format_types.
(check_format_types): Convert first param from a location_t
to a const substring_loc & and rename to "fmt_loc".  Attempt
to extract the range of the relevant parameter and pass it
to format_type_warning.
(format_type_warning): Convert first param from a location_t
to a const substring_loc & and rename to "fmt_loc".  Add
params "param_range" and "type".  Replace calls to warning_at
with calls to format_warning_at_substring.

gcc/testsuite/ChangeLog:
PR c/52952
* gcc.dg/cpp/pr66415-1.c: Likewise.
* gcc.dg/format/asm_fprintf-1.c: Update column numbers.
* gcc.dg/format/c90-printf-1.c: Likewise.
* gcc.dg/format/diagnostic-ranges.c: New test case.
---
 gcc/c-family/c-format.c | 476 +++-
 gcc/testsuite/gcc.dg/cpp/pr66415-1.c|   8 +-
 gcc/testsuite/gcc.dg/format/asm_fprintf-1.c |   6 +-
 gcc/testsuite/gcc.dg/format/c90-printf-1.c  |  14 +-
 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c | 222 +++
 5 files changed, 544 insertions(+), 182 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index c19c411..5b79588 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "langhooks.h"
 #include "c-format.h"
+#include "diagnostic.h"
 
 /* Handle attributes associated with format checking.  */
 
@@ -65,78 +66,169 @@ static int first_target_format_type;
 static const char *format_name (int format_num);
 static int format_flags (int format_num);
 
-/* Given a string S of length LINE_WIDTH, find the visual column
-   corresponding to OFFSET bytes.   */
+/* Emit a warning governed by option OPT, using GMSGID as the format
+   string and AP as its arguments.
 
-static unsigned int
-location_column_from_byte_offset (const char *s, int line_width,
- unsigned int offset)
-{
-  const char * c = s;
-  if (*c != '"')
-return 0;
+   Attempt to obtain precise location information within a string
+   literal from FMT_LOC.
+
+   Case 1: if substring location is available, and is within the range of
+   the format string itself, the primary location of the
+   diagnostic is the substring range obtained from FMT_LOC, with the
+   caret at the *end* of the substring range.
+
+   For example:
+
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf ("hello %i", msg);
+~^
+
+   Case 2: if the substring location is available, but is not within
+   the range of the format string, the primary location is that of the
+   format string, and an note is emitted showing the substring location.
+
+   For example:
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf("hello " INT_FMT " world", msg);
+^
+ test.c:19: note: format string is defined here
+ #define INT_FMT "%i"
+  ~^
+
+   Case 3: if precise substring information is unavailable, the primary
+   location is that of the whole string passed to FMT_LOC's constructor.
+   For example:
+
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf(fmt, msg);
+^~~
+
+   For

[PATCH 2/4] (v3) On-demand locations within string-literals

2016-08-03 Thread David Malcolm
Changes in v3:
- Avoid including cpplib.h from input.h
- Properly handle stringified macro arguments (with tests for this)
- Minor whitespace fixes
- Move selftest.h changes to a separate patch

Changes in v2:
- Tweaks to substring location selftests
- Many more selftests (EBCDIC, the various wide string types, etc)
- Clean up conditions in charset.c; require source == execution charset
  to have substring locations
- Make string_concat_db field private
- Return error messages rather than bool
- Fix source_range for charset.c:convert_escape
- Introduce class substring_loc
- Handle bad input locations more gracefully
- Ensure that we can read substring information for a token which
  starts in one linemap and ends in another (seen in
  gcc.dg/cpp/pr69985.c)

This version addresses Joseph's qn about stringification of macro
arguments (by failing gracefully on them), and the modularity
concerns noted by Manu.

Successfully bootstrapped&regrtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

v2 of the kit successfully passes a full config-list.mk and a successful
selftest run for stage 1 on powerpc-ibm-aix7.1.3.0 (gcc111), both in
conjunction with the rest of the patch kit; I plan to repeat those tests.

I believe I can self-approve the changes to input.c, input.h, libcpp,
and the testsuite; the remaining changes needing approval are those
to c-family and to gcc.c.

OK for trunk if it passes testing? (by itself)

Blurb from v2 follows, for context:

This patch implements precise tracking of source locations for the
individual chars within string literals, so that we can e.g. underline
specific ranges in -Wformat diagnostics.  It handles macros,
concatenated tokens, escaped characters etc.

The idea is to replace the limited implementation of this we currently
have in c-format.c (see r223470 [1]).  Doing so happens in patch 2 of
the kit; this patch just provides the infrastructure to do so.

As before the patch implements a new mode within libcpp's string literal
lexer.  It's disabled during the regular lexer, but it's available
through a low-level interface in input.{c|h} which can rerun the libcpp
code and capture the per-char source_ranges for when we need to issue a
diagnostic.  It also now adds a higher-level interface in c-common.h:
class substring_loc.

As before, to handle concatenation the patch adds some extra data
storage: every time a string concatenation happens in c-lex.c, it stores
the locations of the component tokens in a hash_map, keyed by the
spelling location of the start of the first token (see class
string_concat_db in input.h).

Hence it's only storing extra data for string concatenations,
not for simple string literals.
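
To make the storage scheme concrete, here is a minimal sketch of a table that records component-token locations only for concatenations, keyed by the location of the first token. It is not the code from the patch: `loc_t`, `concat_db`, `record` and `lookup` are hypothetical names chosen for illustration, and a plain `std::map` stands in for GCC's hash_map.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a source location; GCC's location_t is not used.
using loc_t = std::uint32_t;

// Toy database: extra data is stored only for string concatenations.
class concat_db
{
public:
  // Record a concatenation, keyed by the location of its first token.
  void record(const std::vector<loc_t> &token_locs)
  {
    if (token_locs.size() < 2)
      return;  // a simple string literal: nothing extra is stored
    m_table[token_locs.front()] = token_locs;
  }

  // Return the component locations, or nullptr for a simple literal.
  const std::vector<loc_t> *lookup(loc_t first_token_loc) const
  {
    auto it = m_table.find(first_token_loc);
    return it == m_table.end() ? nullptr : &it->second;
  }

  // Number of concatenations recorded so far.
  std::size_t size() const { return m_table.size(); }

private:
  std::map<loc_t, std::vector<loc_t>> m_table;
};
```

The point of keying on the first token is that a later diagnostic only has the location of the whole literal to start from, which is enough to find the remaining pieces on demand.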

As before, this doesn't support the C++ frontend yet, but it doesn't
regress the status quo for c-format.c from C++.  I have a patch for
the C++ FE that records string concatenation information to the lexer,
but given that it's not used yet, I didn't add that in this patch, as
the data would be redundant.

This version of the patch properly handles encodings (and adds a
lot of test coverage for this to input.c).  It makes the simplifying
restriction that precise source location information is only available
if source charset == execution charset, as discussed on this list,
failing gracefully when this isn't the case.

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=d5a2ddc76a109258297ff345957c35cb50116c94#patch2

gcc/c-family/ChangeLog:
* c-common.c: Include "substring-locations.h".
(get_cpp_ttype_from_string_type): New function.
(g_string_concat_db): New global.
(substring_loc::get_range): New method.
* c-common.h (g_string_concat_db): New declaration.
(class substring_loc): New class.
* c-lex.c (lex_string): When concatenating strings, capture the
locations of all tokens using a new obstack, and record the
concatenation locations within g_string_concat_db.
* c-opts.c (c_common_init_options): Construct g_string_concat_db
on the ggc-heap.

gcc/ChangeLog:
* gcc.c (cpp_options): Rename string to...
(cpp_options_): ...this, to avoid clashing with struct in
cpplib.h.
(static_specs): Update initialize for above renaming
* input.c (string_concat::string_concat): New constructor.
(string_concat_db::string_concat_db): New constructor.
(string_concat_db::record_string_concatenation): New method.
(string_concat_db::get_string_concatenation): New method.
(string_concat_db::get_key_loc): New method.
(class auto_cpp_string_vec): New class.
(get_substring_ranges_for_loc): New function.
(get_source_range_for_substring): New function.
(get_num_source_ranges_for_substring): New function.
(class selftest::lexer_test_options): New class.
(struct selftest::lexer_test): New struct.
(class selftest::ebcdic_execution_charset): New class.
(selftest:

Re: [PATCH] Teach VRP to truncate the case ranges of a switch

2016-08-03 Thread Jeff Law

On 08/03/2016 07:47 AM, Richard Biener wrote:

On Wed, Aug 3, 2016 at 6:00 AM, Patrick Palka  wrote:

VRP currently has functionality to eliminate case labels that lie
completely outside of the switch operand's value range.  This patch
complements this functionality by teaching VRP to also truncate the case
label ranges that partially overlap with the operand's value range.

Bootstrapped and regtested on x86_64-pc-linux-gnu.  Does this look like
a reasonable optimization?  Admittedly, its effect will almost always be
negligible except in cases where a case label range spans a large number
of values which is a pretty rare thing.  The optimization triggered
about 250 times during bootstrap.


I think it's most useful when the range collapses to a single value.

It's mostly a code/rodata savings.  It's something I've wanted for a
long time, though the priority dropped considerably once the PA died as
an architecture :-)


Jeff



[PATCH] Remove deprecated has_trivial_xxx traits

2016-08-03 Thread Jonathan Wakely

These non-standard traits have been deprecated since GCC 5.1 and as
proposed recently I'm removing them for GCC 7.
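
For code that still uses the removed names, the closest standard counterparts are the `is_trivially_*` traits. The sketch below is an illustration only, not part of the patch: the three structs are hypothetical, and the standard traits are not guaranteed to give the same answer as the old extensions for every type.

```cpp
#include <type_traits>

// Hypothetical test types.
struct pod_like { int x; };             // every special member is trivial
struct user_ctor { user_ctor() {} };    // user-provided default constructor
struct user_copy
{
  user_copy() = default;
  user_copy(const user_copy &) {}       // user-provided copy constructor
  user_copy &operator=(const user_copy &) { return *this; }
};

// has_trivial_default_constructor -> std::is_trivially_default_constructible
static_assert(std::is_trivially_default_constructible<pod_like>::value,
              "pod_like has a trivial default constructor");
static_assert(!std::is_trivially_default_constructible<user_ctor>::value,
              "user_ctor has a user-provided default constructor");

// has_trivial_copy_constructor -> std::is_trivially_copy_constructible
static_assert(std::is_trivially_copy_constructible<pod_like>::value,
              "pod_like has a trivial copy constructor");
static_assert(!std::is_trivially_copy_constructible<user_copy>::value,
              "user_copy has a user-provided copy constructor");

// has_trivial_copy_assign -> std::is_trivially_copy_assignable
static_assert(std::is_trivially_copy_assignable<pod_like>::value,
              "pod_like has a trivial copy assignment operator");
static_assert(!std::is_trivially_copy_assignable<user_copy>::value,
              "user_copy has a user-provided copy assignment operator");
```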

* include/std/type_traits (has_trivial_default_constructor): Remove.
(has_trivial_copy_constructor, has_trivial_copy_assign): Likewise.
* testsuite/20_util/has_trivial_copy_assign/requirements/
explicit_instantiation.cc: Remove test.
* testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error
line number.
* testsuite/20_util/has_trivial_copy_assign/requirements/typedefs.cc:
Likewise.
* testsuite/20_util/has_trivial_copy_assign/value.cc: Likewise.
* testsuite/20_util/has_trivial_copy_constructor/requirements/
explicit_instantiation.cc: Likewise.
* testsuite/20_util/has_trivial_copy_constructor/requirements/
typedefs.cc: Likewise.
* testsuite/20_util/has_trivial_copy_constructor/value.cc: Likewise.
* testsuite/20_util/has_trivial_default_constructor/requirements/
explicit_instantiation.cc: Likewise.
* testsuite/20_util/has_trivial_default_constructor/requirements/
typedefs.cc: Likewise.
* testsuite/20_util/has_trivial_default_constructor/value.cc:
Likewise.
* testsuite/20_util/headers/type_traits/types_std_c++0x_neg.cc:
Check has_trivial_default_constructor, has_trivial_copy_constructor,
and has_trivial_copy_assign are not defined.
* testsuite/20_util/pair/requirements/dr801.cc: Remove commented out
tests.
* testsuite/20_util/tuple/requirements/dr801.cc: Likewise.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust
dg-error line number.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc:
Likewise.

Tested powerpc64-linux, committed to trunk.

commit 2355273dde50a3858a065a3a823165cef7cd471d
Author: Jonathan Wakely 
Date:   Tue Aug 2 20:48:57 2016 +0100

Remove deprecated has_trivial_xxx traits

* include/std/type_traits (has_trivial_default_constructor): Remove.
(has_trivial_copy_constructor, has_trivial_copy_assign): Likewise.
* testsuite/20_util/has_trivial_copy_assign/requirements/
explicit_instantiation.cc: Remove test.
* testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error
line number.
* testsuite/20_util/has_trivial_copy_assign/requirements/typedefs.cc:
Likewise.
* testsuite/20_util/has_trivial_copy_assign/value.cc: Likewise.
* testsuite/20_util/has_trivial_copy_constructor/requirements/
explicit_instantiation.cc: Likewise.
* testsuite/20_util/has_trivial_copy_constructor/requirements/
typedefs.cc: Likewise.
* testsuite/20_util/has_trivial_copy_constructor/value.cc: Likewise.
* testsuite/20_util/has_trivial_default_constructor/requirements/
explicit_instantiation.cc: Likewise.
* testsuite/20_util/has_trivial_default_constructor/requirements/
typedefs.cc: Likewise.
* testsuite/20_util/has_trivial_default_constructor/value.cc:
Likewise.
* testsuite/20_util/headers/type_traits/types_std_c++0x_neg.cc:
Check has_trivial_default_constructor, has_trivial_copy_constructor,
and has_trivial_copy_assign are not defined.
* testsuite/20_util/pair/requirements/dr801.cc: Remove commented out
tests.
* testsuite/20_util/tuple/requirements/dr801.cc: Likewise.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Adjust
dg-error line number.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc:
Likewise.

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index bfdc3ba..dd9f57e 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1447,23 +1447,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __has_trivial_destructor(_Tp)>>
 { };
 
-  /// has_trivial_default_constructor (temporary legacy)
-  template
-struct has_trivial_default_constructor
-: public integral_constant
-{ } _GLIBCXX_DEPRECATED;
-
-  /// has_trivial_copy_constructor (temporary legacy)
-  template
-struct has_trivial_copy_constructor
-: public integral_constant
-{ } _GLIBCXX_DEPRECATED;
-
-  /// has_trivial_copy_assign (temporary legacy)
-  template
-struct has_trivial_copy_assign
-: public integral_constant
-{ } _GLIBCXX_DEPRECATED;
 
   /// has_virtual_destructor
   template
diff --git a/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc b/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc
index 558b8c6..1c05f61 100644
--- a/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc
@@ -19,7 +19,7 @@
 // with this library; see the file COPYING3.  If not see
 // 

Re: [PATCH] Teach VRP to truncate the case ranges of a switch

2016-08-03 Thread David Malcolm
On Wed, 2016-08-03 at 15:47 +0200, Richard Biener wrote:
> On Wed, Aug 3, 2016 at 6:00 AM, Patrick Palka 
> wrote:
> > VRP currently has functionality to eliminate case labels that lie
> > completely outside of the switch operand's value range.  This patch
> > complements this functionality by teaching VRP to also truncate the
> > case
> > label ranges that partially overlap with the operand's value range.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu.  Does this look
> > like
> > a reasonable optimization?  Admittedly, its effect will almost
> > always be
> > negligible except in cases where a case label range spans a large
> > number
> > of values which is a pretty rare thing.  The optimization triggered
> > about 250 times during bootstrap.
> 
> I think it's most useful when the range collapses to a single value.
> 
> Ok.

Is this always an improvement?   I can see that it can simplify things,
eliminate dead code etc, but could it make evaluating the switch less
efficient?

Consider e.g.

 void
 test (char ch)
 {
   if (ch > 17)
     return;

   switch (ch)
     {
     case 0:
       foo (); break;

     case 1 ... 255:
       bar (); break;
     }
 }

which (assuming this could survive this far in this form) previously
could be implemented as a simple "if (ch == 0)" but with this would get
simplified to:

 void
 test (char ch)
 {
   if (ch > 17)
     return;

   switch (ch)
     {
     case 0:
       foo (); break;

     case 1 ... 17:
       bar (); break;
     }
 }

which presumably introduces a compare against 17 in the implementation of the 
switch; does the new compare get optimized away by jump threading?
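
The semantic side of the question can be checked with a small sketch. This is an illustration under stated assumptions, not code from the patch: `classify_original`, `classify_truncated` and `forms_agree` are hypothetical names, the functions return a code instead of calling foo/bar, and `int` replaces `char` to sidestep signedness. It relies on the GNU case-range extension.

```cpp
// Return codes: 0 = early return or no matching label,
//               1 = the "foo" path, 2 = the "bar" path.

// Original form: the second label covers 1 ... 255.
int classify_original(int ch)
{
  if (ch > 17)
    return 0;
  switch (ch)
    {
    case 0:
      return 1;
    case 1 ... 255:   // GNU case-range extension
      return 2;
    }
  return 0;
}

// Truncated form: the range is clipped to the operand's value range.
int classify_truncated(int ch)
{
  if (ch > 17)
    return 0;
  switch (ch)
    {
    case 0:
      return 1;
    case 1 ... 17:
      return 2;
    }
  return 0;
}

// True when both forms agree on every input in [lo, hi].
bool forms_agree(int lo, int hi)
{
  for (int ch = lo; ch <= hi; ++ch)
    if (classify_original(ch) != classify_truncated(ch))
      return false;
  return true;
}
```

This only shows that the two forms behave identically; whether the generated code for the truncated switch is as cheap is the open question.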


Sorry if this is a silly qn.
Dave


> Thanks,
> Richard.
> 
> > gcc/ChangeLog:
> > 
> > * tree-vrp.c (simplify_switch_using_ranges): Try to
> > truncate
> > the case label ranges that partially overlap with OP's
> > value
> > range.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.dg/tree-ssa/vrp107.c: New test.
> > * gcc.dg/tree-ssa/vrp108.c: New test.
> > * gcc.dg/tree-ssa/vrp109.c: New test.
> > ---
> >  gcc/testsuite/gcc.dg/tree-ssa/vrp107.c | 25 +++
> >  gcc/testsuite/gcc.dg/tree-ssa/vrp108.c | 25 +++
> >  gcc/testsuite/gcc.dg/tree-ssa/vrp109.c | 65
> > +++
> >  gcc/tree-vrp.c | 80
> > +-
> >  4 files changed, 194 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
> > 
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
> > new file mode 100644
> > index 000..b74f031
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp107.c
> > @@ -0,0 +1,25 @@
> > +/* { dg-options "-O2 -fdump-tree-vrp1" }  */
> > +/* { dg-final { scan-tree-dump "case 2:" "vrp1" } }  */
> > +/* { dg-final { scan-tree-dump "case 7 ... 8:" "vrp1" } }  */
> > +
> > +extern void foo (void);
> > +extern void bar (void);
> > +extern void baz (void);
> > +
> > +void
> > +test (int i)
> > +{
> > +  if (i >= 2 && i <= 8)
> > +  switch (i)
> > +{
> > +case 1: /* Redundant label.  */
> > +case 2:
> > +  bar ();
> > +  break;
> > +case 7:
> > +case 8:
> > +case 9: /* Redundant label.  */
> > +  baz ();
> > +  break;
> > +}
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
> > new file mode 100644
> > index 000..49dbfb5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp108.c
> > @@ -0,0 +1,25 @@
> > +/* { dg-options "-O2 -fdump-tree-vrp1" }  */
> > +/* { dg-final { scan-tree-dump "case 1:" "vrp1" } }  */
> > +/* { dg-final { scan-tree-dump "case 9:" "vrp1" } }  */
> > +
> > +extern void foo (void);
> > +extern void bar (void);
> > +extern void baz (void);
> > +
> > +void
> > +test (int i)
> > +{
> > +  if (i < 2 || i > 8)
> > +  switch (i)
> > +{
> > +case 1:
> > +case 2: /* Redundant label.  */
> > +  bar ();
> > +  break;
> > +case 7: /* Redundant label.  */
> > +case 8: /* Redundant label.  */
> > +case 9:
> > +  baz ();
> > +  break;
> > +}
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
> > new file mode 100644
> > index 000..86299a9
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp109.c
> > @@ -0,0 +1,65 @@
> > +/* { dg-options "-O2 -fdump-tree-vrp1" }  */
> > +/* { dg-final { scan-tree-dump "case 9 ... 10:" "vrp1" } }  */
> > +/* { dg-final { scan-tree-dump "case 17 ... 18:" "vrp1" } }  */
> > +/* { dg-final { scan-tree-dump "case 27 ... 30:" "vrp1" } }  */
> > +
> > +extern void foo (void);
> > +extern void bar (void);
> > +
> > +void
> > +test1 (int i)
> > +{
> > +  if (i != 7 && i != 8)
> > +switch (i)
> > +

Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-08-03 Thread Jeff Law

On 08/01/2016 12:52 PM, Bernd Edlinger wrote:

Hi Jeff,

On 08/01/16 19:54, Jeff Law wrote:

> Looks like you've probably nailed it.  It'll be interesting see if
> there's any fallout (though our RTL optimizer testing is pretty weak, so
> even if there were, I doubt we'd catch it).
>

If there is, it will probably be a performance regression...

Anyway I'd say these two patches do just disable actually wrong
transformations.  So here are both patches as separate diffs
with your suggestion for the comment in cse_insn.

I believe that on x86_64 both patches do not change a single bit.

However I think there are more paradoxical subregs generated all over,
but the aarch64 insv code pattern did trigger more hidden bugs than
any other port.  It is certainly unfortunate that the major source
of paradoxical subreg is in a target-dependent code path :(

I apologize that I am not able to reduce/finalize the aarch64 test
case at this time, as I usually only work with arm and intel targets,
but I made an exception here, because a bug like that may affect all
targets sooner or later.


Boot-strap and reg-testing on x86_64-linux-gnu.
Plus aarch64 bootstrap and isl-testing by Andreas.


Is it OK for trunk?

cse.c changes look good, but I'd really like to see a testcase for each
issue in the dejagnu framework.  Extra points if you tried to build a
unit test using David M's framework, but that isn't required.


The testcase from 70903 ought to be trivial to add to the dejagnu suite.
71779 might be more difficult, but if you could take a stab, it'd be
appreciated.


jeff





[patch,avr] PR 55181 work around do_store_flag producing shifts for bit extractions

2016-08-03 Thread Georg-Johann Lay
do_store_flag has a hard-coded right shift for testing a bit, and I found no
way to let the backend direct expr.c into generating an extzv.  As rectifying
the middle-end is beyond my time frame, here is yet another kludge to catch
the situation by means of a pattern.



Also hints are welcome if I overlooked something, i.e. if there is a better 
approach to fix this in the avr BE.  FYI, avr has no barrel shifter and hence 
shifts are very costly.
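
As a rough illustration of the two ways to extract a bit, the sketch below contrasts the shift-and-mask form with a byte-select form that mirrors the `bitno / 8` and `bitno % 8` arithmetic used in the split of the pattern further down. The function names are hypothetical and this is plain C++ for a 16-bit value, not the RTL the backend generates.

```cpp
#include <cstdint>

// Shift-and-mask form: the variable right shift is what is costly on a
// target without a barrel shifter.
unsigned bit_by_shift(std::uint16_t value, unsigned bitno)
{
  return (value >> bitno) & 1u;
}

// Byte-select form: pick byte bitno / 8, then test bit bitno % 8 in it.
// On AVR the two bytes live in separate registers, so building the array
// below costs nothing there; it is only needed to model that in C++.
unsigned bit_by_byte(std::uint16_t value, unsigned bitno)
{
  const std::uint8_t bytes[2] = {
    static_cast<std::uint8_t>(value & 0xFFu),  // low byte, offset 0
    static_cast<std::uint8_t>(value >> 8)      // high byte, offset 1
  };
  return (bytes[bitno / 8] & (1u << (bitno % 8))) != 0 ? 1u : 0u;
}

// True when both forms agree for every bit position of VALUE.
bool all_bits_match(std::uint16_t value)
{
  for (unsigned bitno = 0; bitno < 16; ++bitno)
    if (bit_by_shift(value, bitno) != bit_by_byte(value, bitno))
      return false;
  return true;
}
```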


Ok for trunk if nobody comes up with a better solution?

Johann



gcc/
PR 55181
* config/avr/avr.md: New pattern to work around do_store_flag
generating shift instructions for bit extractions.



Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 238983)
+++ config/avr/avr.md	(working copy)
@@ -6691,6 +6691,29 @@ (define_insn_and_split "*extzv.qihi2"
 operands[4] = simplify_gen_subreg (QImode, operands[0], HImode, 1);
   })
 
+;; ??? do_store_flag emits a hard-coded right shift to extract a bit without
+;; even considering rtx_costs, extzv, or a bit-test.  See PR 55181 for an example.
+(define_insn_and_split "*extract.subreg.bit"
+  [(set (match_operand:QI 0 "register_operand"   "=r")
+(and:QI (subreg:QI (any_shiftrt:HISI (match_operand:HISI 1 "register_operand" "r")
+ (match_operand:QI 2 "const_int_operand"  "n"))
+   0)
+(const_int 1)))]
+  "INTVAL (operands[2]) < GET_MODE_BITSIZE (mode)"
+  { gcc_unreachable(); }
+  "&& reload_completed"
+  [;; "*extzv"
+   (set (match_dup 0)
+(zero_extract:QI (match_dup 3)
+ (const_int 1)
+ (match_dup 4)))]
+  {
+int bitno = INTVAL (operands[2]);
+operands[3] = simplify_gen_subreg (QImode, operands[1], mode, bitno / 8);
+operands[4] = GEN_INT (bitno % 8);
+  })
+
+
 
 ;; Fixed-point instructions
 (include "avr-fixed.md")


[PATCH] Enable Mathematical Special Functions for C++17

2016-08-03 Thread Jonathan Wakely

The contents of the Special Functions IS were added to the C++17
draft, so this enables them by default for C++17.
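
A minimal usage sketch, assuming a libstdc++ that ships these functions: `zeta_of_two` is a hypothetical helper, and the `#else` branch is only a fallback so the example still builds with a library that lacks the special functions. Defining `__STDCPP_WANT_MATH_SPEC_FUNCS__` before including `<cmath>` is what C++11/C++14 code needs per the ChangeLog below; for C++17 the define is harmless.

```cpp
// Request the special functions explicitly for pre-C++17 modes.
#define __STDCPP_WANT_MATH_SPEC_FUNCS__ 1
#include <cmath>

// Returns zeta(2), which equals pi^2 / 6.
double zeta_of_two()
{
#if defined(__cpp_lib_math_special_functions)
  // The special functions are declared in namespace std.
  return std::riemann_zeta(2.0);
#else
  // Fallback for libraries without the special functions: closed form.
  const double pi = 3.14159265358979323846;
  return pi * pi / 6.0;
#endif
}
```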

* include/bits/c++config (_GLIBCXX_USE_STD_SPEC_FUNCS): Define for
C++17, or for C++11/C++14 when __STDCPP_WANT_MATH_SPEC_FUNCS__ is
true.
* include/bits/specfun.h [!__STDCPP_WANT_MATH_SPEC_FUNCS__]: Don't
do #error for C++17.
* include/c_global/cmath: Check _GLIBCXX_USE_STD_SPEC_FUNCS instead
of __STDCPP_WANT_MATH_SPEC_FUNCS__.
* include/tr1/bessel_function.tcc: Likewise.
* include/tr1/beta_function.tcc: Likewise.
* include/tr1/cmath: Likewise.
* include/tr1/ell_integral.tcc: Likewise.
* include/tr1/exp_integral.tcc: Likewise.
* include/tr1/gamma.tcc: Likewise.
* include/tr1/hypergeometric.tcc: Likewise.
* include/tr1/legendre_function.tcc: Likewise.
* include/tr1/modified_bessel_func.tcc: Likewise.
* include/tr1/poly_hermite.tcc: Likewise.
* include/tr1/poly_laguerre.tcc: Likewise.
* include/tr1/riemann_zeta.tcc: Likewise.
* include/tr1/special_function_util.h: Likewise.
* testsuite/26_numerics/headers/cmath/functions_std_c++17.cc: New.

Tested x86_64-linux and powerpc64-linux, committed to trunk.


commit b2a59e048c1f125c4876229743bbb9d5475c4b8f
Author: Jonathan Wakely 
Date:   Tue Aug 2 16:09:50 2016 +0100

Enable Mathematical Special Functions for C++17

* include/bits/c++config (_GLIBCXX_USE_STD_SPEC_FUNCS): Define for
C++17, or for C++11/C++14 when __STDCPP_WANT_MATH_SPEC_FUNCS__ is
true.
* include/bits/specfun.h [!__STDCPP_WANT_MATH_SPEC_FUNCS__]: Don't
do #error for C++17.
* include/c_global/cmath: Check _GLIBCXX_USE_STD_SPEC_FUNCS instead
of __STDCPP_WANT_MATH_SPEC_FUNCS__.
* include/tr1/bessel_function.tcc: Likewise.
* include/tr1/beta_function.tcc: Likewise.
* include/tr1/cmath: Likewise.
* include/tr1/ell_integral.tcc: Likewise.
* include/tr1/exp_integral.tcc: Likewise.
* include/tr1/gamma.tcc: Likewise.
* include/tr1/hypergeometric.tcc: Likewise.
* include/tr1/legendre_function.tcc: Likewise.
* include/tr1/modified_bessel_func.tcc: Likewise.
* include/tr1/poly_hermite.tcc: Likewise.
* include/tr1/poly_laguerre.tcc: Likewise.
* include/tr1/riemann_zeta.tcc: Likewise.
* include/tr1/special_function_util.h: Likewise.
* testsuite/26_numerics/headers/cmath/functions_std_c++17.cc: New.

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index 4625607..8d2c361 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -532,6 +532,13 @@ namespace std
 #define _GLIBCXX_TXN_SAFE_DYN
 #endif
 
+#if __cplusplus > 201402L
+// In C++17 mathematical special functions are in namespace std.
+# define _GLIBCXX_USE_STD_SPEC_FUNCS 1
+#elif __cplusplus >= 201103L && __STDCPP_WANT_MATH_SPEC_FUNCS__ != 0
+// For C++11 and C++14 they are in namespace std when requested.
+# define _GLIBCXX_USE_STD_SPEC_FUNCS 1
+#endif
 
 // The remainder of the prewritten config is automatic; all the
 // user hooks are listed above.
diff --git a/libstdc++-v3/include/bits/specfun.h b/libstdc++-v3/include/bits/specfun.h
index 9f7bb87..93e1852 100644
--- a/libstdc++-v3/include/bits/specfun.h
+++ b/libstdc++-v3/include/bits/specfun.h
@@ -38,7 +38,7 @@
 
 #define __cpp_lib_math_special_functions 201603L
 
-#if __STDCPP_WANT_MATH_SPEC_FUNCS__ == 0
+#if __cplusplus <= 201403L && __STDCPP_WANT_MATH_SPEC_FUNCS__ == 0
 # error include  and define __STDCPP_WANT_MATH_SPEC_FUNCS__
 #endif
 
diff --git a/libstdc++-v3/include/c_global/cmath b/libstdc++-v3/include/c_global/cmath
index 6a24ebf..6db9dee 100644
--- a/libstdc++-v3/include/c_global/cmath
+++ b/libstdc++-v3/include/c_global/cmath
@@ -1790,7 +1790,7 @@ _GLIBCXX_END_NAMESPACE_VERSION
 
 #endif // C++11
 
-#if __STDCPP_WANT_MATH_SPEC_FUNCS__ == 1
+#if _GLIBCXX_USE_STD_SPEC_FUNCS
 #  include 
 #endif
 
diff --git a/libstdc++-v3/include/tr1/bessel_function.tcc b/libstdc++-v3/include/tr1/bessel_function.tcc
index a2655d8..692f6da 100644
--- a/libstdc++-v3/include/tr1/bessel_function.tcc
+++ b/libstdc++-v3/include/tr1/bessel_function.tcc
@@ -50,7 +50,7 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
-#if __STDCPP_WANT_MATH_SPEC_FUNCS__
+#if _GLIBCXX_USE_STD_SPEC_FUNCS
 # define _GLIBCXX_MATH_NS ::std
 #elif defined(_GLIBCXX_TR1_CMATH)
 namespace tr1
@@ -630,7 +630,7 @@ namespace tr1
   _GLIBCXX_END_NAMESPACE_VERSION
   } // namespace __detail
 #undef _GLIBCXX_MATH_NS
-#if ! __STDCPP_WANT_MATH_SPEC_FUNCS__ && defined(_GLIBCXX_TR1_CMATH)
+#if ! _GLIBCXX_USE_STD_SPEC_FUNCS && defined(_GLIBCXX_TR1_CMATH)
 } // namespace tr1
 #endif
 }
diff --git a/libstdc++-v3/include/tr1/beta_function.tcc b/libstdc++-v3/include/tr1/beta_function.tcc
index 76f5093..34b22d4 100644
--- a

Re: [PATCH 1/3] (v2) On-demand locations within string-literals

2016-08-03 Thread Jeff Law

On 07/29/2016 11:27 AM, David Malcolm wrote:

On Fri, 2016-07-29 at 17:53 +0100, Manuel López-Ibáñez wrote:

On 29 July 2016 at 16:25, David Malcolm  wrote:


FWIW, it appears that clang uses the on-demand approach; the
relevant
code appears to be StringLiteral::getLocationOfByte:
http://clang.llvm.org/doxygen/Expr_8cpp_source.html#l01008


As far as I know, llvm doesn't do language diagnostics from the
middle-end/LTO. Thus, they do not have those problems.


If you really want to have middle-end diagnostics from LTO, I can make
the on-demand approach work.

I can also do the stored-location approach, but it would mean rewriting
all the patches again and, I think, would be less efficient.

I would prefer the on-demand approach.

Who is empowered to make a decision here?

ISTM we've got a bit of a deadlock here with the two intertwined 
patches.  I'm wondering if we can move both forward, perhaps without the 
higher quality diagnostics for Martin's work initially.  Then iterate on 
what's in-tree to add the higher quality diagnostics, then figure out 
how to deal with some of the issues we have in the LTO space.


Martin's model of running early or late depending on flags is, IMHO, the 
right approach.  And more generally it's a good solution for other 
problems in this space.  With that in mind, finding a way to get at the 
diagnostics framework from within the middle end and eventually LTO is, 
IMHO, important.


Given that the diagnostics are the uncommon case, I would strongly 
prefer an on-demand approach rather than recording a ton of stuff in the 
front-end for the unlikely case that we're going to want a diagnostic in 
the middle-end or LTO.


Jeff


Re: [PATCH 1/3] (v2) On-demand locations within string-literals

2016-08-03 Thread Jeff Law

On 07/29/2016 03:42 PM, Joseph Myers wrote:

On Tue, 26 Jul 2016, David Malcolm wrote:


This patch implements precise tracking of source locations for the
individual chars within string literals, so that we can e.g. underline
specific ranges in -Wformat diagnostics.  It handles macros,
concatenated tokens, escaped characters etc.


What if the string literal results from stringizing other tokens (which
might have arisen in turn from macro expansion, including expansion of
built-in macros not just those defined in source files, etc.)?  "You don't
get precise locations" would be a fine answer for such cases - provided
there is good testsuite coverage of them to show they don't crash the
compiler or underline nonsensical characters.

I think losing precise locations in some circumstances would be fine as 
well -- as long as we understand the limitations.


And, yes, crashing or underlining nonsensical characters would be bad, 
so it'd be obviously good to test some of that to ensure the fallbacks 
work as expected.


jeff



Re: [PATCH] Teach VRP to truncate the case ranges of a switch

2016-08-03 Thread Jeff Law

On 08/03/2016 09:29 AM, David Malcolm wrote:

On Wed, 2016-08-03 at 15:47 +0200, Richard Biener wrote:

On Wed, Aug 3, 2016 at 6:00 AM, Patrick Palka 
wrote:

VRP currently has functionality to eliminate case labels that lie
completely outside of the switch operand's value range.  This patch
complements this functionality by teaching VRP to also truncate the
case
label ranges that partially overlap with the operand's value range.

Bootstrapped and regtested on x86_64-pc-linux-gnu.  Does this look
like
a reasonable optimization?  Admittedly, its effect will almost
always be
negligible except in cases where a case label range spans a large
number
of values which is a pretty rare thing.  The optimization triggered
about 250 times during bootstrap.


I think it's most useful when the range collapses to a single value.

Ok.


Is this always an improvement?   I can see that it can simplify things,
eliminate dead code etc, but could it make evaluating the switch less
efficient?

I don't think so.  We should recognize that the default fall-thru 
doesn't happen because of the range associated with CH as we enter the 
switch.


So, in theory we'd still be able to collapse down to

if (ch > 17)
  return
if (ch == 0)
  foo ();
else
  bar ();

I haven't actually checked though.
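
A quick sketch of that equivalence, under one assumption that matters: the operand is taken to be `unsigned char`, so that its range on entry to the switch really is [0, 17]. With a signed `char`, negative values would reach neither label in the switch but would reach bar in the collapsed form. The function names are hypothetical, and return codes replace the calls to foo/bar.

```cpp
// Return codes: 0 = early return, 1 = the "foo" path, 2 = the "bar" path.

// Switch form with the case range already truncated.
int classify_switch(unsigned char ch)
{
  if (ch > 17)
    return 0;
  switch (ch)
    {
    case 0:
      return 1;
    case 1 ... 17:   // GNU case-range extension
      return 2;
    }
  return 0;  // not reached while ch is in [0, 17]
}

// Collapsed form: a single equality test replaces the switch.
int classify_collapsed(unsigned char ch)
{
  if (ch > 17)
    return 0;
  if (ch == 0)
    return 1;
  return 2;
}

// True when both forms agree for every unsigned char value.
bool forms_agree()
{
  for (int v = 0; v <= 255; ++v)
    {
      const unsigned char ch = static_cast<unsigned char>(v);
      if (classify_switch(ch) != classify_collapsed(ch))
        return false;
    }
  return true;
}
```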


Jeff


Re: [PATCH 1/4] selftest.h: Add ASSERT_TRUE_AT and ASSERT_FALSE_AT

2016-08-03 Thread Jeff Law

On 08/03/2016 09:45 AM, David Malcolm wrote:

I split out the selftest.h changes from v2 of the kit for ease of review;
here they are.

Successfully bootstrapped&regrtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* selftest.h (ASSERT_TRUE): Reimplement in terms of...
(ASSERT_TRUE_AT): New macro.
(ASSERT_FALSE): Reimplement in terms of...
(ASSERT_FALSE_AT): New macro.
(ASSERT_STREQ_AT): Fix typo in comment.

OK.  Though I do wonder if these should just be normal functions...  I 
assume there's a good reason for the macro pain :)
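
One common reason for keeping a thin macro layer over such checks is that only a macro can capture the spelling of the expression and the call site. The sketch below illustrates that layering with hypothetical names (`check_location`, `check_true_at`, `CHECK_TRUE`, `CHECK_TRUE_AT`); it is not the selftest.h implementation and does not claim to be the reason behind it.

```cpp
#include <string>

// Hypothetical location record; selftest.h has its own type.
struct check_location
{
  const char *file;
  int line;
};

// Outcome of a check, with a message describing any failure.
struct check_result
{
  bool passed;
  std::string message;
};

// A plain function only sees the value, so the location and the text of
// the expression have to be handed to it by the caller.
inline check_result
check_true_at(check_location loc, bool value, const char *expr_text)
{
  if (value)
    return {true, ""};
  return {false, std::string(loc.file) + ":" + std::to_string(loc.line)
                 + ": check failed: " + expr_text};
}

// The macro layer supplies what a function cannot: the stringified EXPR
// and, in CHECK_TRUE, the file and line of the call site.
#define CHECK_TRUE_AT(LOC, EXPR) check_true_at((LOC), (EXPR), #EXPR)
#define CHECK_TRUE(EXPR) \
  CHECK_TRUE_AT((check_location{__FILE__, __LINE__}), EXPR)
```

Note that the work itself still happens in an ordinary function; the macros are only the part that needs the preprocessor.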



jeff



Re: [PATCH, contrib] download_prerequisites: check for existing symlinks before making new ones

2016-08-03 Thread Jeff Law

On 07/21/2016 01:39 PM, Eric Gallager wrote:

On 7/21/16, Jeff Law  wrote:

On 07/14/2016 01:57 PM, Eric Gallager wrote:



So apparently the "-f" flag properly overwrites symlinks that point to
regular files, but I also did this in my gcc builddir:

$ mkdir isl-0.1.2.3
$ ln -s isl-0.1.2.3 isl-s
$ ln -sfv isl isl-s
isl-s/isl -> isl
$ ln -sfFv isl isl-s
isl-s/isl -> isl
$ ls -l isl-s
lrwxr-xr-x  1 root  wheel  11 Jul 14 07:03 isl-s -> isl-0.1.2.3
$ unlink isl-s
$ ln -sfFv isl isl-s
isl-s -> isl
$ ls -l isl-s
lrwxr-xr-x  1 root  wheel  3 Jul 14 15:51 isl-s -> isl

...it just doesn't overwrite symlinks that point to a directory.

Joys :(

AFAIK unlink may not necessarily be available on the various host
systems GCC supports (solaris, aix, hpux, etc etc).

So rather than relying on ln to remove the link, why don't we just
explicitly remove it with rm -f?

Jeff



Sure, rm -f works, too; I just went with "unlink" in my patch because
it more clearly expresses programmer intent. But I guess portability
is more important. Updated patch attached, although someone else would
have to commit it, as I don't have commit access.

Thanks for your patience.  I've installed your patch.

Jeff


Re: [RFC, v2] Test coverage for --param boundary values

2016-08-03 Thread Jeff Law

On 08/01/2016 06:02 AM, Martin Liška wrote:

On 07/28/2016 11:26 PM, Joseph Myers wrote:

On Mon, 18 Jul 2016, Martin Liška wrote:


Well, I can imagine a guard which will test whether the
"$objdir/../../params.options" file exists, and if so, then the tests are
executed.  Is that an acceptable approach?


The correct way to test for build-tree testing is [info exists
TESTING_IN_BUILD_TREE].  When testing outside the build tree, you should
not assume anything about directories outside of the test and source
directories, meaning you should not test for existence of paths in
$objdir/../ in that case.


Thank you for the hint, I'm attaching patch.



(The preferable approach is to factor out the code generating this file so
it can be run from the testsuite.  Then you don't need to distinguish
build-tree and other testing at all.)



That would be the best approach, but I've got quite limited experience with
DejaGNU, so I would postpone it and put it on my TODO list.

May I install the suggested patch?

Yes.
jeff


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-08-03 Thread Jeff Law

On 07/29/2016 01:31 PM, Bernd Edlinger wrote:

On 07/29/16 19:28, Jeff Law wrote:

On 07/26/2016 09:48 AM, Bernd Edlinger wrote:


Richard, this was the latest version of the patch:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01481.html


Are you OK with my clean-up of the presumably dead function names,
or would you like to keep the quirks in special_function_p for now
and just remove ECF_NORETURN and ECF_LEAF?

The only worry I have left is we're not catching _ prefixed savectx,
vfork & getcontext (the latter two being the important ones).
Various ports still use USER_LABEL_PREFIX, so those symbols might be
showing up as _vfork & friends.

That might just be an oversight (using name instead of tname in the
simplified conditional).



no, that is actually intentional.

OK.  Thanks for clarifying.



If name contained USER_LABEL_PREFIX then special_function_p would
already be broken, and that would need to be fixed as well.

We need to handle __sigsetjmp, and that would not work with
one extra underscore.

I try to stay compatible with what is used in the header files of
glibc and newlib.

glibc defines setjmp to _setjmp, and sigsetjmp to __sigsetjmp.

and newlib does not do that.

Thus we actually need to handle these names:

setjmp
_setjmp
sigsetjmp
__sigsetjmp

vfork and getcontext are not defines, so I don't see why
we should match more names than absolutely necessary here.
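
As a rough illustration of the exact-name matching described above, here is
a minimal sketch in C.  It is not the code from special_function_p; the
helper name is_setjmp_like is hypothetical and the table only contains the
four names listed in this mail.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not GCC's special_function_p: return true only
   for the four setjmp-style names listed in the mail above.  */
static bool
is_setjmp_like (const char *name)
{
  static const char *const names[] = {
    "setjmp", "_setjmp", "sigsetjmp", "__sigsetjmp"
  };

  for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
    if (strcmp (name, names[i]) == 0)
      return true;

  /* Anything else, including _vfork or _getcontext, is not matched.  */
  return false;
}
```

With an exact table like this, a name carrying one extra leading underscore
(for example from USER_LABEL_PREFIX) simply does not match, which is the
behaviour argued for above.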

OK.  I understand and agree.

Jeff


[avr,RFC,patch] Add var attribute "absdata" to support LDS / STS on AVR_TINY.

2016-08-03 Thread Georg-Johann Lay

This is a proposal to support LDS / STS instructions on AVR_TINY.

Currently the fact that the compiler won't generate LDS / STS instructions is a 
major source of code bloat.  The patch adds a new variable attribute so that 
the user can assert that the address range of the respective static-storage data 
will be in the range of LDS / STS (0x40...0xbf).
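
For readers who want the address check in isolation, here is a minimal
sketch in plain C of the range test the patch applies to constant
addresses (IN_RANGE (addr, 0, 0xc0 - size)).  The function name
tiny_abs_addr_ok is hypothetical; the real check lives in
avr_address_tiny_absdata_p and works on RTL, not on plain integers.

```c
#include <stdbool.h>

/* Hypothetical stand-alone version of the constant-address test:
   an object of SIZE bytes starting at ADDR must end at or below 0xbf,
   the last address reachable by IN / OUT / LDS / STS on AVR_TINY.  */
static bool
tiny_abs_addr_ok (long addr, int size)
{
  return addr >= 0 && addr <= 0xc0 - size;
}
```

For example, a 1-byte object at 0xbf passes the test, while a 2-byte object
at 0xbf does not, because its second byte would sit at 0xc0.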


There is currently no support in Binutils, and IMO supporting it in Binutils 
would be kind of overkill...  Such support could implement new sections like 
.zdata, .zrodata, .zbss in the linker script which would be located before 
.data, .rodata, .bss.  The compiler would put absdata objects into one of the 
z-sections, but we would have to supply extended startup-code because bss data 
is no longer in one contiguous chunk of memory; same for [ro]data.


More thoughts on how to get better support for LDS / STS?

Johann



gcc/
* doc/extend.texi (AVR Variable Attributes) [absdata]: Document it.
* config/avr/avr.c (AVR_SYMBOL_FLAG_TINY_ABSDATA): New macro.
(avr_address_tiny_absdata_p): New static function.
(avr_legitimate_address_p, avr_legitimize_address) [AVR_TINY]: Use
it to determine validity of constant addresses.
(avr_attribute_table) [absdata]: New variable attribute...
(avr_handle_absdata_attribute): ...and handler.
(avr_decl_absdata_p): New static function.
(avr_encode_section_info) [AVR_TINY]: Use it to add flag
AVR_SYMBOL_FLAG_TINY_ABSDATA to respective symbols_refs.

Index: doc/extend.texi
===
--- doc/extend.texi	(revision 238983)
+++ doc/extend.texi	(working copy)
@@ -5957,6 +5957,25 @@ memory-mapped peripherals that may lie o
 volatile int porta __attribute__((address (0x600)));
 @end smallexample
 
+@item absdata
+@cindex @code{absdata} variable attribute, AVR
+Variables in static storage and with the @code{absdata} attribute can
+be accessed by the @code{LDS} and @code{STS} instructions which take
+absolute addresses.
+
+@itemize @bullet
+@item
+This attribute is only supported for the reduced AVR Tiny core
+like @code{ATtiny40}.
+
+@item
+There is currently no Binutils support for this attribute and the user has
+to make sure that respective data is located into an address range that
+can actually be handled by @code{LDS} and @code{STS}.
+This applies to addresses in the range @code{0x40}@dots{}@code{0xbf}.
+
+@end itemize
+
 @end table
 
 @node Blackfin Variable Attributes
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 238983)
+++ config/avr/avr.c	(working copy)
@@ -81,9 +81,13 @@
/ SYMBOL_FLAG_MACH_DEP)
 
 /* (AVR_TINY only): Symbol has attribute progmem */
-#define AVR_SYMBOL_FLAG_TINY_PM \
+#define AVR_SYMBOL_FLAG_TINY_PM \
   (SYMBOL_FLAG_MACH_DEP << 7)
 
+/* (AVR_TINY only): Symbol has attribute absdata */
+#define AVR_SYMBOL_FLAG_TINY_ABSDATA\
+  (SYMBOL_FLAG_MACH_DEP << 8)
+
 #define TINY_ADIW(REG1, REG2, I)\
 "subi " #REG1 ",lo8(-(" #I "))" CR_TAB  \
 "sbci " #REG2 ",hi8(-(" #I "))"
@@ -1802,6 +1806,28 @@ avr_mode_dependent_address_p (const_rtx
 }
 
 
+/* Return true if rtx X is a CONST_INT, CONST or SYMBOL_REF
+   address with the `absdata' variable attribute, i.e. respective
+   data can be read / written by LDS / STS instruction.
+   This is used only for AVR_TINY.  */
+
+static bool
+avr_address_tiny_absdata_p (rtx x, machine_mode mode)
+{
+  if (CONST == GET_CODE (x))
+x = XEXP (XEXP (x, 0), 0);
+
+  if (SYMBOL_REF_P (x))
+return SYMBOL_REF_FLAGS (x) & AVR_SYMBOL_FLAG_TINY_ABSDATA;
+
+  if (CONST_INT_P (x)
+  && IN_RANGE (INTVAL (x), 0, 0xc0 - GET_MODE_SIZE (mode)))
+return true;
+
+  return false;
+}
+
+
 /* Helper function for `avr_legitimate_address_p'.  */
 
 static inline bool
@@ -1886,8 +1912,7 @@ avr_legitimate_address_p (machine_mode m
   /* avrtiny's load / store instructions only cover addresses 0..0xbf:
  IN / OUT range is 0..0x3f and LDS / STS can access 0x40..0xbf.  */
 
-  ok = (CONST_INT_P (x)
-&& IN_RANGE (INTVAL (x), 0, 0xc0 - GET_MODE_SIZE (mode)));
+  ok = avr_address_tiny_absdata_p (x, mode);
 }
 
   if (avr_log.legitimate_address_p)
@@ -1929,8 +1954,7 @@ avr_legitimize_address (rtx x, rtx oldx,
   if (AVR_TINY)
 {
   if (CONSTANT_ADDRESS_P (x)
-  && !(CONST_INT_P (x)
-   && IN_RANGE (INTVAL (x), 0, 0xc0 - GET_MODE_SIZE (mode
+  && ! avr_address_tiny_absdata_p (x, mode))
 {
   x = force_reg (Pmode, x);
 }
@@ -9124,6 +9148,32 @@ avr_handle_fntype_attribute (tree *node,
 }
 
 static tree
+avr_handle_absdata_attribute (tree *node, tree name, tree /* args */,
+  int /* flags */, bool *no_add)
+{
+  location_t loc = DECL_SOURCE_LOCATION (*node);
+
+  if (AVR_TINY)
+{
+   

Go patch committed: stack allocate non-escaping expressions

2016-08-03 Thread Ian Lance Taylor
This patch by Chris Manghane allocates expressions that do not escape
on the stack.  This only happens when doing escape analysis, which is
still not enabled by default.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 239002)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-89a0b3a04f80df388242166b8835f12e82ceb194
+7d6c53910e52b7db2a77c1c1c3bc2c170283a1fa
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 238653)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -252,7 +252,9 @@ Expression::convert_type_to_interface(Ty
   else
 {
   // We are assigning a non-pointer value to the interface; the
-  // interface gets a copy of the value in the heap.
+  // interface gets a copy of the value in the heap if it escapes.
+  // TODO(cmang): Associate escape state state of RHS with newly
+  // created OBJ.
   obj = Expression::make_heap_expression(rhs, location);
 }
 
@@ -729,6 +731,13 @@ Var_expression::do_address_taken(bool es
   else
go_unreachable();
 }
+
+  if (this->variable_->is_variable()
+  && this->variable_->var_value()->is_in_heap())
+{
+  Node::make_node(this)->set_encoding(Node::ESCAPE_HEAP);
+  Node::make_node(this->variable_)->set_encoding(Node::ESCAPE_HEAP);
+}
 }
 
 // Get the backend representation for a reference to a variable.
@@ -831,6 +840,10 @@ Enclosed_var_expression::do_address_take
   else
go_unreachable();
 }
+
+  if (this->variable_->is_variable()
+  && this->variable_->var_value()->is_in_heap())
+Node::make_node(this->variable_)->set_encoding(Node::ESCAPE_HEAP);
 }
 
 // Ast dump for enclosed variable expression.
@@ -3769,9 +3782,18 @@ Unary_expression::do_flatten(Gogo* gogo,
   // value does not escape.  If this->escapes_ is true, we may be
   // able to set it to false if taking the address of a variable
   // that does not escape.
-  if (this->escapes_ && this->expr_->var_expression() != NULL)
+  Node* n = Node::make_node(this);
+  if ((n->encoding() & ESCAPE_MASK) == int(Node::ESCAPE_NONE))
+   this->escapes_ = false;
+
+  Named_object* var = NULL;
+  if (this->expr_->var_expression() != NULL)
+   var = this->expr_->var_expression()->named_object();
+  else if (this->expr_->enclosed_var_expression() != NULL)
+   var = this->expr_->enclosed_var_expression()->variable();
+
+  if (this->escapes_ && var != NULL)
{
- Named_object* var = this->expr_->var_expression()->named_object();
  if (var->is_variable())
this->escapes_ = var->var_value()->escapes();
  if (var->is_result_variable())
@@ -11658,7 +11680,9 @@ Allocation_expression::do_get_backend(Tr
   Gogo* gogo = context->gogo();
   Location loc = this->location();
 
-  if (this->allocate_on_stack_)
+  Node* n = Node::make_node(this);
+  if (this->allocate_on_stack_
+  || (n->encoding() & ESCAPE_MASK) == int(Node::ESCAPE_NONE))
 {
   int64_t size;
   bool ok = this->type_->backend_type_size(gogo, &size);
@@ -12344,7 +12368,15 @@ Slice_construction_expression::do_get_ba
   space->unary_expression()->set_is_slice_init();
 }
   else
-space = Expression::make_heap_expression(array_val, loc);
+{
+  space = Expression::make_heap_expression(array_val, loc);
+  Node* n = Node::make_node(this);
+  if ((n->encoding() & ESCAPE_MASK) == int(Node::ESCAPE_NONE))
+   {
+ n = Node::make_node(space);
+ n->set_encoding(Node::ESCAPE_NONE);
+   }
+}
 
   // Build a constructor for the slice.
 
@@ -13417,8 +13449,12 @@ Heap_expression::do_get_backend(Translat
   Location loc = this->location();
   Gogo* gogo = context->gogo();
   Btype* btype = this->type()->get_backend(gogo);
-  Bexpression* space = Expression::make_allocation(this->expr_->type(),
-  loc)->get_backend(context);
+
+  Expression* alloc = Expression::make_allocation(this->expr_->type(), loc);
+  Node* n = Node::make_node(this);
+  if ((n->encoding() & ESCAPE_MASK) == int(Node::ESCAPE_NONE))
+alloc->allocation_expression()->set_allocate_on_stack();
+  Bexpression* space = alloc->get_backend(context);
 
   Bstatement* decl;
   Named_object* fn = context->function();


[PATCH GCC]Simplify interface for simplify_using_initial_conditions

2016-08-03 Thread Bin Cheng
Hi,
When I introduced parameter STOP for expand_simple_operations, I also added it
for simplify_using_initial_conditions.  The STOP argument is also passed to
simplify_using_initial_conditions in
simple_iv_with_niters/loop_exits_before_overflow.  After analyzing the case
reported in PR72772, I think STOP expanding is only needed for
expand_simple_operations when handling IV.step in tree-ssa-loop-ivopts.c.  For
other cases like calls to simplify_using_initial_conditions, both cond and expr
should be expanded to check tree expression equality.  This patch does so.  It
simplifies the interface by removing parameter STOP, and also moves
expand_simple_operations from tree_simplify_using_condition_1 to its caller.

Bootstrapped and tested on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-08-02  Bin Cheng  

PR tree-optimization/72772
* tree-ssa-loop-niter.h (simplify_using_initial_conditions): Delete
parameter STOP.
* tree-ssa-loop-niter.c (tree_simplify_using_condition_1): Delete
parameter STOP and update calls.  Move expand_simple_operations
function call from here...
(simplify_using_initial_conditions): ...to here.  Delete parameter
STOP.
(tree_simplify_using_condition): Delete parameter STOP.
* tree-scalar-evolution.c (simple_iv_with_niters): Update call to
simplify_using_initial_conditions.

diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 7c5cefd..b8bfe51 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -3484,7 +3484,7 @@ simple_iv_with_niters (struct loop *wrto_loop, struct loop *use_loop,
   bool allow_nonconstant_step)
 {
   enum tree_code code;
-  tree type, ev, base, e, stop;
+  tree type, ev, base, e;
   wide_int extreme;
   bool folded_casts, overflow;
 
@@ -3601,8 +3601,7 @@ simple_iv_with_niters (struct loop *wrto_loop, struct loop *use_loop,
 return true;
   e = fold_build2 (code, boolean_type_node, base,
   wide_int_to_tree (type, extreme));
-  stop = (TREE_CODE (base) == SSA_NAME) ? base : NULL;
-  e = simplify_using_initial_conditions (use_loop, e, stop);
+  e = simplify_using_initial_conditions (use_loop, e);
   if (!integer_zerop (e))
 return true;
 
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index b7d7c32..7690f2f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1880,10 +1880,10 @@ expand_simple_operations (tree expr, tree stop)
expression (or EXPR unchanged, if no simplification was possible).  */
 
 static tree
-tree_simplify_using_condition_1 (tree cond, tree expr, tree stop)
+tree_simplify_using_condition_1 (tree cond, tree expr)
 {
   bool changed;
-  tree e, te, e0, e1, e2, notcond;
+  tree e, e0, e1, e2, notcond;
   enum tree_code code = TREE_CODE (expr);
 
   if (code == INTEGER_CST)
@@ -1895,17 +1895,17 @@ tree_simplify_using_condition_1 (tree cond, tree expr, tree stop)
 {
   changed = false;
 
-  e0 = tree_simplify_using_condition_1 (cond, TREE_OPERAND (expr, 0), stop);
+  e0 = tree_simplify_using_condition_1 (cond, TREE_OPERAND (expr, 0));
   if (TREE_OPERAND (expr, 0) != e0)
changed = true;
 
-  e1 = tree_simplify_using_condition_1 (cond, TREE_OPERAND (expr, 1), stop);
+  e1 = tree_simplify_using_condition_1 (cond, TREE_OPERAND (expr, 1));
   if (TREE_OPERAND (expr, 1) != e1)
changed = true;
 
   if (code == COND_EXPR)
{
- e2 = tree_simplify_using_condition_1 (cond, TREE_OPERAND (expr, 2), stop);
+ e2 = tree_simplify_using_condition_1 (cond, TREE_OPERAND (expr, 2));
  if (TREE_OPERAND (expr, 2) != e2)
changed = true;
}
@@ -1968,16 +1968,14 @@ tree_simplify_using_condition_1 (tree cond, tree expr, tree stop)
return boolean_true_node;
 }
 
-  te = expand_simple_operations (expr, stop);
-
   /* Check whether COND ==> EXPR.  */
   notcond = invert_truthvalue (cond);
-  e = fold_binary (TRUTH_OR_EXPR, boolean_type_node, notcond, te);
+  e = fold_binary (TRUTH_OR_EXPR, boolean_type_node, notcond, expr);
   if (e && integer_nonzerop (e))
 return e;
 
   /* Check whether COND ==> not EXPR.  */
-  e = fold_binary (TRUTH_AND_EXPR, boolean_type_node, cond, te);
+  e = fold_binary (TRUTH_AND_EXPR, boolean_type_node, cond, expr);
   if (e && integer_zerop (e))
 return e;
 
@@ -1992,11 +1990,11 @@ tree_simplify_using_condition_1 (tree cond, tree expr, tree stop)
the loop do not cause us to fail.  */
 
 static tree
-tree_simplify_using_condition (tree cond, tree expr, tree stop)
+tree_simplify_using_condition (tree cond, tree expr)
 {
-  cond = expand_simple_operations (cond, stop);
+  cond = expand_simple_operations (cond);
 
-  return tree_simplify_using_condition_1 (cond, expr, stop);
+  return tree_simplify_using_condition_1 (cond, expr);
 }
 
 /* Tries to simplify EXPR using the conditions on entry to LOOP.
@@ -2004,7 +2002

[PATCH PR72772]Also check equality for expanded iv base.

2016-08-03 Thread Bin Cheng
Hi,
Following the previous patch, this one fixes PR72772 by checking equality for
the expanded iv base.  Richard is fixing the PR by removing the degenerate PHI
in the first place, but I think this one also catches more cases.
Bootstrapped and tested on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-08-02  Bin Cheng  

PR tree-optimization/72772
* tree-ssa-loop-niter.c (loop_exits_before_overflow): Check equality
for expanded base.

gcc/testsuite/ChangeLog
2016-08-02  Bin Cheng  

PR tree-optimization/72772
* gcc.dg/tree-ssa/pr72772.c: New test.

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 2fa51ea..39183c6 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -4141,7 +4141,7 @@ loop_exits_before_overflow (tree base, tree step,
   for (civ = loop->control_ivs; civ; civ = civ->next)
{
  enum tree_code code;
- tree stepped, extreme, civ_type = TREE_TYPE (civ->step);
+ tree civ_type = TREE_TYPE (civ->step);
 
  /* Have to consider type difference because operand_equal_p ignores
 that for constants.  */
@@ -4154,11 +4154,13 @@ loop_exits_before_overflow (tree base, tree step,
continue;
 
  /* Done proving if this is a no-overflow control IV.  */
- if (operand_equal_p (base, civ->base, 0)
- /* Control IV is recorded after expanding simple operations,
-Here we compare it against expanded base too.  */
- || operand_equal_p (expand_simple_operations (base),
- civ->base, 0))
+ if (operand_equal_p (base, civ->base, 0))
+   return true;
+
+ /* Control IV is recorded after expanding simple operations,
+Here we expand base and compare it too.  */
+ tree expanded_base = expand_simple_operations (base);
+ if (operand_equal_p (expanded_base, civ->base, 0))
return true;
 
  /* If this is a before stepping control IV, in other words, we have
@@ -4180,9 +4182,14 @@ loop_exits_before_overflow (tree base, tree step,
  else
code = PLUS_EXPR;
 
- stepped = fold_build2 (code, TREE_TYPE (base), base, step);
- if (operand_equal_p (stepped, civ->base, 0))
+ tree stepped = fold_build2 (code, TREE_TYPE (base), base, step);
+ tree expanded_stepped = fold_build2 (code, TREE_TYPE (base),
+  expanded_base, step);
+ if (operand_equal_p (stepped, civ->base, 0)
+ || operand_equal_p (expanded_stepped, civ->base, 0))
{
+ tree extreme;
+
  if (tree_int_cst_sign_bit (step))
{
  code = LT_EXPR;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr72772.c b/gcc/testsuite/gcc.dg/tree-ssa/pr72772.c
new file mode 100644
index 000..7aa2b59
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr72772.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -fdump-tree-ldist-details" } */
+
+int foo (int flag, char *a)
+{
+  short i, j;
+  short l = 0;
+  if (flag == 1)
+l = 3;
+
+  for (i = 0; i < 4; i++)
+{
+  for (j = l - 1; j > 0; j--)
+a[j] = a[j - 1];
+  a[0] = i;
+}
+}
+
+/* Addresses of array reference a[j] and a[j - 1] are SCEVs.  */
+/* { dg-final { scan-tree-dump-not "failed: evolution of base is not affine." "ldist" } } */
+


[PATCH testsuite/PR33707]Add test case.

2016-08-03 Thread Bin Cheng
Hi,
The case has already been fixed by my unsigned improvement for scev/niter, and 
it can be vectorized successfully.  This patch simply adds a test for it.
Test result checked on x86_64.  Is it OK?

Thanks,
bin

gcc/testsuite/ChangeLog
2016-08-02  Bin Cheng  

PR tree-optimization/33707
* gcc.dg/vect/pr33707.c: New test.

diff --git a/gcc/testsuite/gcc.dg/vect/pr33707.c b/gcc/testsuite/gcc.dg/vect/pr33707.c
new file mode 100644
index 000..b553142
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr33707.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+int
+foo (char *a, unsigned n)
+{
+int i;
+a[0] = 0;
+for (i = 16; i < n; i++)
+  a[i] = a[i-16];
+}
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */


[PATCH] Define __cpp_lib_generic_associative_lookup feature-test macro

2016-08-03 Thread Jonathan Wakely

This feature is supported, so we should define the macro.

I've also added tests for a couple of other macros, which we already
define.

* include/bits/stl_function.h: Remove commented-out macro.
* include/bits/stl_tree.h (__cpp_lib_generic_associative_lookup):
Define feature-test macro.
* testsuite/experimental/feat-cxx14.cc: Add tests for more macros.

Tested powerpc64-linux, committed to trunk.


commit 236db9620d23e3d5b3b247e30b13e633affbc396
Author: Jonathan Wakely 
Date:   Wed Aug 3 16:57:41 2016 +0100

Define __cpp_lib_generic_associative_lookup feature-test macro

* include/bits/stl_function.h: Remove commented-out macro.
* include/bits/stl_tree.h (__cpp_lib_generic_associative_lookup):
Define feature-test macro.
* testsuite/experimental/feat-cxx14.cc: Add tests for more macros.

diff --git a/libstdc++-v3/include/bits/stl_function.h b/libstdc++-v3/include/bits/stl_function.h
index eabf9ba..68f39ff 100644
--- a/libstdc++-v3/include/bits/stl_function.h
+++ b/libstdc++-v3/include/bits/stl_function.h
@@ -225,7 +225,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus > 201103L
 
 #define __cpp_lib_transparent_operators 201210
-//#define __cpp_lib_generic_associative_lookup 201304
 
   template<>
 struct plus
diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index 7a9a4a6..8697a71 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -73,6 +73,10 @@ namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#if __cplusplus > 201103L
+# define __cpp_lib_generic_associative_lookup 201304
+#endif
+
   // Red-black tree class, designed for use in implementing STL
   // associative containers (set, multiset, map, and multimap). The
   // insertion and deletion algorithms are based on those in Cormen,
diff --git a/libstdc++-v3/testsuite/experimental/feat-cxx14.cc b/libstdc++-v3/testsuite/experimental/feat-cxx14.cc
index 2cc31ca..42b633f6 100644
--- a/libstdc++-v3/testsuite/experimental/feat-cxx14.cc
+++ b/libstdc++-v3/testsuite/experimental/feat-cxx14.cc
@@ -11,6 +11,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #ifndef  __cpp_lib_integer_sequence
 #  error "__cpp_lib_integer_sequence"
@@ -78,11 +80,11 @@
 #  error "__cpp_lib_complex_udls != 201309"
 #endif
 
-//#ifndef  __cpp_lib_generic_associative_lookup
-//#  error "__cpp_lib_generic_associative_lookup"
-//#elif  __cpp_lib_generic_associative_lookup != 201304
-//#  error "__cpp_lib_generic_associative_lookup != 201304"
-//#endif
+#ifndef  __cpp_lib_generic_associative_lookup
+#  error "__cpp_lib_generic_associative_lookup"
+#elif  __cpp_lib_generic_associative_lookup != 201304
+#  error "__cpp_lib_generic_associative_lookup != 201304"
+#endif
 
 //#ifndef  __cpp_lib_null_iterators
 //#  error "__cpp_lib_null_iterators"
@@ -119,3 +121,15 @@
 #elif  __cpp_lib_is_final != 201402
 #  error "__cpp_lib_is_final != 201402"
 #endif
+
+#ifndef  __cpp_lib_is_null_pointer
+#  error "__cpp_lib_is_null_pointer"
+#elif  __cpp_lib_is_null_pointer != 201309
+#  error "__cpp_lib_is_null_pointer != 201309"
+#endif
+
+#ifndef  __cpp_lib_make_reverse_iterator
+#  error "__cpp_lib_make_reverse_iterator"
+#elif  __cpp_lib_make_reverse_iterator != 201402
+#  error "__cpp_lib_make_reverse_iterator != 201402"
+#endif


Re: Implement -Wimplicit-fallthrough (take 2): questionable code

2016-08-03 Thread Jeff Law

On 07/27/2016 10:53 AM, Marek Polacek wrote:

These are the cases where I wasn't sure whether the fall-throughs were
intentional or not.

This patch has been tested on powerpc64le-unknown-linux-gnu, aarch64-linux-gnu,
and x86_64-redhat-linux.

2016-07-27  Marek Polacek  

PR c/7652
gcc/
* config/i386/i386.c (ix86_expand_branch): Add gcc_fallthrough.

I'm pretty sure this is an intended fallthru.


* cselib.c (cselib_expand_value_rtx_1): Likewise.
I'm pretty sure this is an intended fallthru.  But rather than fallthru, 
can we just "return orig;"?  If more code gets added here it'll be less 
and less clear if we continue to fall thru.




* gensupport.c (get_alternatives_number): Likewise.
(subst_dup): Likewise.
These are definitely intended fallthroughs.  E & V are essentially the 
same thing, except that for V the vector may be NULL.



* gimplify.c (goa_stabilize_expr): Likewise.
Definitely intended fallthrough.  We're "cleverly" handling the 2nd 
operand of binary expressions, then falling into the unary expression 
cases which handle the 1st operand.
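
To make that pattern concrete, here is a minimal sketch in C of a switch
that handles the extra operand of a binary node and then deliberately falls
into the unary case.  It is not the gimplify.c code; the enum and the
function count_operands_visited are hypothetical and only count how many
operands each kind would touch.

```c
/* Hypothetical node kinds, standing in for tree codes.  */
enum kind { KIND_BINARY, KIND_UNARY, KIND_LEAF };

/* Return how many operands would be visited for a node of kind K.  */
static int
count_operands_visited (enum kind k)
{
  int visited = 0;

  switch (k)
    {
    case KIND_BINARY:
      visited++;		/* Handle the 2nd operand first.  */
      /* FALLTHRU */
    case KIND_UNARY:
      visited++;		/* Shared handling of the 1st operand.  */
      break;
    case KIND_LEAF:
      break;
    }

  return visited;
}
```

The explicit marker before the second case label is what distinguishes an
intended fallthrough like this one from an accidentally missing break.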



* hsa-gen.c (gen_hsa_insn_for_internal_fn_call): Likewise.
I think this is a bug, but please contact Martin Jambor directly on this 
to confirm.




* reg-stack.c (get_true_reg): Likewise.
Pretty sure this is intentional.  Like the cselib.c case, I'd prefer to 
just duplicate the code from the fallthru path since it's trivial and 
makes the intent clearer.



* tree-complex.c (expand_complex_division): Likewise.

Not sure on this one.




* tree-data-ref.c (get_references_in_stmt): Likewise.
Intentional fallthru.  See how we change ref.is_read for IFN_MASK_LOAD, 
then test it in the fallthru path.



* tree-pretty-print.c (dump_generic_node): Likewise.
?!?  This looks like a false positive from your warning.  We're not 
falling into any case statements here AFAICT.



* var-tracking.c (adjust_mems): Likewise.
Definitely an intended fallthru.   Though I don't like the way this code 
is structured at all.





gcc/cp/
* call.c (add_builtin_candidate): Add gcc_fallthrough.
* cxx-pretty-print.c (pp_cxx_unqualified_id): Likewise.
* parser.c (cp_parser_skip_to_end_of_statement): Likewise.
(cp_parser_cache_defarg): Likewise.

No idea on these.



gcc/c-family/
* c-ada-spec.c (dump_generic_ada_node): Add gcc_fallthrough.
I don't think this is supposed to fall thru.  If it falls through then 
it's going to issue an error about an unexpected TREE_VEC node when we 
were actually working on a TREE_BINFO node.   I think a return 0 is 
appropriate here.




libcpp/
* pch.c (write_macdef): Add CPP_FALLTHRU.
* lex.c (_cpp_lex_direct): Likewise.

No idea on these.

Sooo.  For those where I've indicated fallthru is intentional, drop the 
comment and/or rewrite to avoid fallthru as indicated.


You've got one that looks like an error in your warning 
(tree-pretty-print.c).


Sync with Martin J. on the hsa warning.  I think it's a bug in the hsa 
code, but confirm with Martin.


I think the c-ada-spec should be twiddled to avoid the fallthru with a 
simple "return 0;"


For the rest, use your patch as-is.  That preserves behavior and 
indicates we're not sure if fallthru is appropriate or not.  For cp/ 
feel free to ping Jason directly on those for his thoughts.


jeff



Re: [patch,avr] PR 55181 work around do_store_flag producing shifts for bit extractions

2016-08-03 Thread Denis Chertykov
2016-08-03 18:41 GMT+03:00 Georg-Johann Lay :
> do_store_flag has hard-coded right shift for testing a bit, I found no way
> to let the backend direct expr.c into generating an extzv.  As rectifying
> the middle-end is beyond my time frame, here is yet another kludge to catch
> the situation by means of a pattern.
>
>
> Also hints are welcome if I overlooked something, i.e. if there is a better
> approach to fix this in the avr BE.  FYI, avr has no barrel shifter and
> hence shifts are very costly.
>
> Ok for trunk if nobody comes up with a better solution?
>
> Johann
>
>
>
> gcc/
> PR 55181
> * config/avr/avr.md: New pattern to work around do_store_flag
> generating shift instructions for bit extractions.
>
>

I have no objections.


[PTX] fix worker propagation ICE

2016-08-03 Thread Nathan Sidwell
The PTX backend could ICE when generating a state propagation sequence entering 
partitioned execution.  Although the stack frame is DImode aligned, nothing 
actually rounds the size up consistent with that.  That meant we could encounter 
frames that were not a DImode multiple in size.  Which broke the assert checking 
that.


Rather than faff around trying to copy just the extra bit on the end of such a 
frame, I changed the frame emission to round the size up, and adjust the 
propagation machinery likewise.  (Mostly one gets frames when not optimizing 
anyway).
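
The rounding itself is ordinary power-of-two arithmetic.  A minimal
stand-alone sketch in C follows; the helper names round_up_pow2 and
chunks_needed are hypothetical, and the patch open-codes the same
expressions with GET_MODE_SIZE (DImode) as the alignment.

```c
#include <stdint.h>

/* Round SIZE up to the next multiple of ALIGN.
   ALIGN must be a power of two (8 for a DImode-sized unit).  */
static int64_t
round_up_pow2 (int64_t size, int64_t align)
{
  return (size + align - 1) & ~(align - 1);
}

/* Number of ALIGN-sized chunks needed to cover SIZE bytes.  */
static int64_t
chunks_needed (int64_t size, int64_t align)
{
  return (size + align - 1) / align;
}
```

For instance, a 12-byte frame is declared as 16 bytes and is propagated as
two 8-byte chunks, so the declaration and the copy loop agree.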


Applied to trunk & gomp4.
2016-08-03  Nathan Sidwell  

	gcc/
	* config/nvptx/nvptx.c (nvptx_declare_function_name): Round frame
	size to DImode boundary.
	(nvptx_propagate): Likewise.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/crash-1.c: New.

Index: gcc/config/nvptx/nvptx.c
===
--- gcc/config/nvptx/nvptx.c	(revision 239084)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -999,11 +999,14 @@ nvptx_declare_function_name (FILE *file,
 init_frame (file, STACK_POINTER_REGNUM,
 		UNITS_PER_WORD, crtl->outgoing_args_size);
 
-  /* Declare a local variable for the frame.  */
+  /* Declare a local variable for the frame.  Force its size to be
+ DImode-compatible.  */
   HOST_WIDE_INT sz = get_frame_size ();
   if (sz || cfun->machine->has_chain)
 init_frame (file, FRAME_POINTER_REGNUM,
-		crtl->stack_alignment_needed / BITS_PER_UNIT, sz);
+		crtl->stack_alignment_needed / BITS_PER_UNIT,
+		(sz + GET_MODE_SIZE (DImode) - 1)
+		& ~(HOST_WIDE_INT)(GET_MODE_SIZE (DImode) - 1));
 
   /* Declare the pseudos we have as ptx registers.  */
   int maxregs = max_reg_num ();
@@ -3222,8 +3225,9 @@ nvptx_propagate (basic_block block, rtx_
   rtx pred = NULL_RTX;
   rtx_code_label *label = NULL;
 
-  gcc_assert (!(fs & (GET_MODE_SIZE (DImode) - 1)));
-  fs /= GET_MODE_SIZE (DImode);
+  /* The frame size might not be DImode compatible, but the frame
+	 array's declaration will be.  So it's ok to round up here.  */
+  fs = (fs + GET_MODE_SIZE (DImode) - 1) / GET_MODE_SIZE (DImode);
   /* Detect single iteration loop. */
   if (fs == 1)
 	fs = 0;
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/crash-1.c
===
--- libgomp/testsuite/libgomp.oacc-c-c++-common/crash-1.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/crash-1.c	(working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+
+/* ICEd in nvptx backend due to unexpected frame size.  */
+#pragma acc routine worker
+void
+worker_matmul (int *c, int i)
+{
+  int j;
+
+#pragma acc loop
+  for (j = 0; j < 4; j++)
+c[j] = j;
+}
+
+
+int
+main ()
+{
+  int c[4];
+
+#pragma acc parallel 
+  {
+worker_matmul (c, 0);
+  }
+  
+  return 0;
+}


Re: [PR57371] transform (double)i eq/ne 0 to i eq/ne 0

2016-08-03 Thread Joseph Myers
On Wed, 3 Aug 2016, Georg-Johann Lay wrote:

> On 03.08.2016 09:53, Prathamesh Kulkarni wrote:
> > Hi,
> > The attached patch tries to transform
> > (double)i eq/ne 0 to i eq/ne 0
> > AFAIU from Joseph's comment 1 in PR, the transform should be safe with
> > -fno-trapping-math ?
> 
> What about signed zeroes?

Comparing with -0 is exactly the same as comparing with +0 (-0 == +0), so 
there is no issue in that regard.
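
A small C sketch of both points, for illustration only.  The function names
are hypothetical, and the equivalence check assumes a 32-bit int, which
converts to double exactly.

```c
#include <stdbool.h>

/* The comparison as written in the source: convert, then compare.  */
static bool
cmp_via_double (int i)
{
  return (double) i == 0;
}

/* The comparison after the proposed transform.  */
static bool
cmp_direct (int i)
{
  return i == 0;
}

/* Negative zero compares equal to positive zero.  The volatile keeps
   the compiler from folding the comparison away.  */
static bool
neg_zero_equals_zero (void)
{
  volatile double nz = -0.0;
  return nz == 0.0;
}
```

Both comparison functions agree for every int value, and
neg_zero_equals_zero returns true on any IEEE 754 implementation.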

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-08-03 Thread Bernd Edlinger
On 08/03/16 17:38, Jeff Law wrote:
> cse.c changes look good, but I'd really like to see a testcase for each
> issue in the dejagnu framework.  Extra points if you tried to build a
> unit test using David M's framework, but that isn't required.
>
> The testcase from 70903 ought to be trivial to add to the dejagnu suite.
>   71779 might be more difficult, but if you could take a stab, it'd be
> appreciated.
>


Yes, sure.  I had assumed that the pr70903 test case is using some
target-specific vector types, but now I see that it even works as-is in
the gcc.c-torture/execute directory.

So I've added the test case to the cse patch.  And quickly verified that
it works on x86_64-linux-gnu.


The pr71779 test case will be pretty difficult to reduce, because it
depends on combine to do the incorrect transformation and lra to spill
the subreg, and on the stack content at runtime to be non-zero.

But technically it *is* already in the isl-test suite, so if isl is
in-tree, it is always executed by make check or make check-isl.

It is just that gmp/mpfr/mpc and isl test results are not included by
contrib/test_summary, but that should be fixable.  What do you think?

Actually that should not be too difficult, as there are test-suite.log
files that we could just add to the test_summary output as-is, for
instance:

cat isl/test-suite.log

==
isl 0.16.1: ./test-suite.log
==

# TOTAL: 5
# PASS:  5
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2


Are the patches OK now?


Thanks
Bernd.
2016-08-01  Bernd Edlinger  

	PR rtl-optimization/70903
	* cse.c (cse_insn): If DEST is a paradoxical SUBREG, don't record DEST.

testsuite:
2016-08-01  Bernd Edlinger  

	PR rtl-optimization/70903
	* gcc.c-torture/execute/pr70903.c: New test.

Index: gcc/cse.c
===
--- gcc/cse.c	(revision 238915)
+++ gcc/cse.c	(working copy)
@@ -5898,15 +5898,7 @@ cse_insn (rtx_insn *insn)
 	|| GET_MODE (dest) == BLKmode
 	/* If we didn't put a REG_EQUAL value or a source into the hash
 	   table, there is no point is recording DEST.  */
-	|| sets[i].src_elt == 0
-	/* If DEST is a paradoxical SUBREG and SRC is a ZERO_EXTEND
-	   or SIGN_EXTEND, don't record DEST since it can cause
-	   some tracking to be wrong.
-
-	   ??? Think about this more later.  */
-	|| (paradoxical_subreg_p (dest)
-		&& (GET_CODE (sets[i].src) == SIGN_EXTEND
-		|| GET_CODE (sets[i].src) == ZERO_EXTEND)))
+	|| sets[i].src_elt == 0)
 	  continue;
 
 	/* STRICT_LOW_PART isn't part of the value BEING set,
@@ -5925,6 +5917,11 @@ cse_insn (rtx_insn *insn)
 	  sets[i].dest_hash = HASH (dest, GET_MODE (dest));
 	}
 
+	/* If DEST is a paradoxical SUBREG, don't record DEST since the bits
+	   outside the mode of GET_MODE (SUBREG_REG (dest)) are undefined.  */
+	if (paradoxical_subreg_p (dest))
+	  continue;
+
 	elt = insert (dest, sets[i].src_elt,
 		  sets[i].dest_hash, GET_MODE (dest));
 
Index: gcc/testsuite/gcc.c-torture/execute/pr70903.c
===
--- gcc/testsuite/gcc.c-torture/execute/pr70903.c	(revision 0)
+++ gcc/testsuite/gcc.c-torture/execute/pr70903.c	(working copy)
@@ -0,0 +1,19 @@
+typedef unsigned char V8 __attribute__ ((vector_size (32)));
+typedef unsigned int V32 __attribute__ ((vector_size (32)));
+typedef unsigned long long V64 __attribute__ ((vector_size (32)));
+
+static V32 __attribute__ ((noinline, noclone))
+foo (V64 x)
+{
+  V64 y = (V64)(V8){((V8)(V64){65535, x[0]})[1]};
+  return (V32){y[0], 255};
+}
+
+int main ()
+{
+  V32 x = foo ((V64){});
+//  __builtin_printf ("%08x %08x %08x %08x %08x %08x %08x %08x\n", x[0], x[1], x[2], x[3], x[4], x[5], x[6], x[7]);
+  if (x[1] != 255)
+__builtin_abort();
+  return 0;
+}
2016-08-01  Bernd Edlinger  

	PR rtl-optimization/71779
	* emit-rtl.c (set_reg_attrs_from_value): Only propagate REG_POINTER,
	if the value was sign-extended according to POINTERS_EXTEND_UNSIGNED
	or if it was truncated.

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c	(revision 238915)
+++ gcc/emit-rtl.c	(working copy)
@@ -1156,7 +1156,11 @@ set_reg_attrs_from_value (rtx reg, rtx x)
 {
 #if defined(POINTERS_EXTEND_UNSIGNED)
   if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED)
-	   || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED))
+	   || (GET_CODE (x) == ZERO_EXTEND && ! POINTERS_EXTEND_UNSIGNED)
+	   || (paradoxical_subreg_p (x)
+	   && ! (SUBREG_PROMOTED_VAR_P (x)
+		 && SUBREG_CHECK_PROMOTED_SIGN (x,
+		POINTERS_EXTEND_UNSIGNED
 	  && !targetm.have_ptr_extend ())
 	can_be_reg_pointer = false;
 #endif


Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-08-03 Thread Joseph Myers
On Wed, 3 Aug 2016, Ramana Radhakrishnan wrote:

> Joseph, do you have any opinions on whether we should be extending the
> standard pattern names or not for btrunc, ceil, round, floor,
> nearbyint, rint, lround, lfloor and lceil optabs for the HFmode
> quantities ?

If the semantics match a standard pattern, you should use the standard 
name.

It may well be the case that many of those patterns would not actually be 
used for generic code even after my _FloatN patches, since (a) I only add 
a minimal set of built-in functions, not the full set of all libm 
functions for all _FloatN / _FloatNx types (given possible issues with 
enum size and initialization time when seven new variants of every libm 
function are added as built-in functions) and (b) many relevant 
optimizations only work for float, double and long double.  But I think 
the right pattern names should still be used.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PR57371] transform (double)i eq/ne 0 to i eq/ne 0

2016-08-03 Thread Joseph Myers
On Wed, 3 Aug 2016, Richard Biener wrote:

> Couldn't this even be
> 
>  (cmp (float @0) REAL_CST@1)
>  (with
>   {
> HOST_WIDE_INT n = real_to_integer (TREE_REAL_CST (@1));
> REAL_VALUE_TYPE cint;
> real_from_integer (&cint, VOIDmode, n, SIGNED);
>   }
>   (if (real_identical (&c, &cint))
>(cmp @0 { build_int_cst (TREE_TYPE (@0), n); }
> 
> with some additional type checks to make sure n fits the type of @0
> (and otherwise fold to true/false directly).

Well, real_identical is too strong.  Comparisons with -0 can be optimized 
just like those with +0.

> Not sure whether we need to restrict it to float types that can
> represent all values of the type of @0 exactly.

I discussed the conditions for this optimization in more detail in PR 
57371.  For an arbitrary comparison operator, between a converted integer 
and an arbitrary real constant: (a) you need -fno-trapping-math unless the 
(type, range) information for the integer implies it can be converted 
exactly to the floating-point type, (b) if the conversion may not be 
exact, you also need -fno-rounding-math unless you know from the value of 
the real constant that rounding for the conversion from integer cannot 
affect the result of the comparison.  Given that, all such floating-point 
comparisons could be converted to equivalent integer comparisons, *but* 
equality comparisons (== != islessgreater) may need to be converted to a 
range test on the integer rather than a comparison with a single integer 
value.
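
As an illustration of the last point, here is a small C++ sketch of an equality comparison that turns into a range test. It is only an example under stated assumptions (IEEE 754 double, default round-to-nearest-even, conversions done in double precision as on x86_64); the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// 2^53: beyond this, not every 64-bit integer is representable as a double.
constexpr std::int64_t kTwo53 = std::int64_t{1} << 53;

// Floating-point form: convert, then compare with the real constant 2^53.
bool equals_two53_via_double(std::int64_t i) {
  return static_cast<double>(i) == 9007199254740992.0;
}

// Integer form: 2^53 + 1 also rounds to 2^53 (ties-to-even), so the
// equivalent integer test is a two-element range, not a single value.
bool equals_two53_via_range(std::int64_t i) {
  return i >= kTwo53 && i <= kTwo53 + 1;
}

// Compares the two forms on the integers around 2^53.
void check_range_equivalence() {
  for (std::int64_t i = kTwo53 - 4; i <= kTwo53 + 4; ++i)
    assert(equals_two53_via_double(i) == equals_two53_via_range(i));
}
```

With a different rounding mode the set of integers mapping to 2^53 changes, which is where the -fno-rounding-math condition comes in.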

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-08-03 Thread Jason Merrill
On Tue, Aug 2, 2016 at 10:13 PM, Martin Sebor  wrote:
> The change let me replace an if statement and the ANONCTX data
> member with a new function plus a conditional.  I'll let you be
> the judge but to me it doesn't seem like an improvement.

You can use the existing context_for_name_lookup instead of a new function.

>> Hmm, doesn't that mean that
>>
>> typedef struct { int a[]; } B;
>>
>> is never handled?  Perhaps you want to do this check from
>> cp_parser_simple_declaration, when we know whether or not there's a
>> declarator?
>
> Yes, it does mean that and it's one of the outstanding bugs that
> still need fixing (in 7.0).  Same way top-level arrays aren't
> handled.

If you're going to temporarily handle member typedefs here, please add
a prominent FIXME about removing that handling when typedefs get fixed
properly.

Jason


[PATCH] Define C++17 feature-test macros

2016-08-03 Thread Jonathan Wakely

This just defines feature-test macros for some C++17 features that we
already support.

* include/bits/allocator.h (__cpp_lib_incomplete_container_elements):
Define feature-test macro.
* include/bits/range_access.h (__cpp_lib_array_constexpr): Likewise.
* include/std/shared_mutex (__cpp_lib_shared_mutex): Uncomment.
* include/std/type_traits (__cpp_lib_logical_traits): Fix value.
(__cpp_lib_type_trait_variable_templates): Define.

Tested powerpc64-linux, committed to trunk.
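
For readers who have not used these macros, a minimal sketch of how user code might test one of them. The all_integral trait is a made-up example, and the fallback branch is just one possible way to cope with a library that does not advertise std::conjunction:

```cpp
#include <type_traits>

#if defined(__cpp_lib_logical_traits) && __cpp_lib_logical_traits >= 201510
// The library advertises std::conjunction, so use it directly.
template <typename... Ts>
struct all_integral : std::conjunction<std::is_integral<Ts>...> {};
#else
// Fallback for older libraries: hand-written recursion over the pack.
template <typename... Ts>
struct all_integral : std::true_type {};

template <typename T, typename... Ts>
struct all_integral<T, Ts...>
    : std::integral_constant<bool, std::is_integral<T>::value &&
                                       all_integral<Ts...>::value> {};
#endif

static_assert(all_integral<int, long, char>::value, "all integral");
static_assert(!all_integral<int, double>::value, "double is not integral");
```

Either branch gives the same answers, so callers do not need to know which one was selected.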


commit b607d0af6b72cbcef36d2aa107e98285fe133245
Author: Jonathan Wakely 
Date:   Wed Aug 3 18:06:50 2016 +0100

Define C++17 feature-test macros

* include/bits/allocator.h (__cpp_lib_incomplete_container_elements):
Define feature-test macro.
* include/bits/range_access.h (__cpp_lib_array_constexpr): Likewise.
* include/std/shared_mutex (__cpp_lib_shared_mutex): Uncomment.
* include/std/type_traits (__cpp_lib_logical_traits): Fix value.
(__cpp_lib_type_trait_variable_templates): Define.

diff --git a/libstdc++-v3/include/bits/allocator.h 
b/libstdc++-v3/include/bits/allocator.h
index 597d305..984d800 100644
--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -49,6 +49,8 @@
 #include 
 #endif
 
+#define __cpp_lib_incomplete_container_elements 201505
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
diff --git a/libstdc++-v3/include/bits/range_access.h 
b/libstdc++-v3/include/bits/range_access.h
index e2ec072..d6f8fa1 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -38,6 +38,10 @@ namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#if __cplusplus >= 201402L
+# define __cpp_lib_array_constexpr 201603L
+#endif
+
   /**
*  @brief  Return an iterator pointing to the first element of
*  the container.
diff --git a/libstdc++-v3/include/std/shared_mutex 
b/libstdc++-v3/include/std/shared_mutex
index 6ca322b..9712b35 100644
--- a/libstdc++-v3/include/std/shared_mutex
+++ b/libstdc++-v3/include/std/shared_mutex
@@ -52,7 +52,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifdef _GLIBCXX_HAS_GTHREADS
 
 #if __cplusplus > 201402L
-// TODO: #define __cpp_lib_shared_mutex 201505
+#define __cpp_lib_shared_mutex 201505
   class shared_mutex;
 #endif
 
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index dd9f57e..693952a 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -156,7 +156,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus > 201402L
 
-#define __cpp_lib_logical_traits 201511
+#define __cpp_lib_logical_traits 201510
 
   template
 struct conjunction
@@ -2763,6 +2763,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif // __cplusplus >= 201402L
 
 #if __cplusplus > 201402L
+# define __cpp_lib_type_trait_variable_templates 201510L
 template 
   constexpr bool is_void_v = is_void<_Tp>::value;
 template 


[PATCH] Define std::owner_less specialization (P0074R0)

2016-08-03 Thread Jonathan Wakely

This was my proposal, so it's about time we supported it.

* include/bits/shared_ptr.h (owner_less): Add default template
argument.
* include/bits/shared_ptr_base.h (_Sp_owner_less): Define
specialization.
(owner_less): Define specialization.
* include/bits/stl_function.h (__cpp_lib_transparent_operators):
Update value.
* testsuite/20_util/owner_less/void.cc: New test.
* testsuite/experimental/feat-cxx14.cc: Update macro value tested.

Tested powerpc64-linux, committed to trunk.
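
A short usage sketch, assuming a C++17 compiler: with the void specialization one comparator object can compare shared_ptr and weak_ptr objects of different element types. The same_owner helper is hypothetical, not something the patch adds:

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: true when a and b share ownership, whatever their
// element types. Two owners are equivalent when neither orders before
// the other.
template <typename A, typename B>
bool same_owner(const A& a, const B& b) {
  std::owner_less<> less;  // i.e. std::owner_less<void>
  return !less(a, b) && !less(b, a);
}

void check_same_owner() {
  auto sp = std::make_shared<int>(42);
  std::shared_ptr<void> erased = sp;  // different element type, same owner
  std::weak_ptr<int> wp = sp;
  auto other = std::make_shared<int>(42);  // separate control block

  assert(same_owner(sp, erased));
  assert(same_owner(erased, wp));
  assert(!same_owner(sp, other));
}
```

Before this change each such comparison needed an owner_less specialization chosen for one particular pointer type.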

commit 56bd2d871a7d7ddb930c0221188446e339c873ab
Author: Jonathan Wakely 
Date:   Thu Apr 7 13:40:33 2016 +0100

Define std::owner_less specialization (P0074R0)

* include/bits/shared_ptr.h (owner_less): Add default template
argument.
* include/bits/shared_ptr_base.h (_Sp_owner_less): Define
specialization.
(owner_less): Define specialization.
* include/bits/stl_function.h (__cpp_lib_transparent_operators):
Update value.
* testsuite/20_util/owner_less/void.cc: New test.
* testsuite/experimental/feat-cxx14.cc: Update macro value tested.

diff --git a/libstdc++-v3/include/bits/shared_ptr.h 
b/libstdc++-v3/include/bits/shared_ptr.h
index b22477e..16f78f7 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -535,9 +535,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 
   /// Primary template owner_less
-  template<typename _Tp>
+  template<typename _Tp = void>
 struct owner_less;
 
+  /// Void specialization of owner_less
+  template<>
+struct owner_less<void> : _Sp_owner_less<void, void>
+{ };
+
   /// Partial specialization of owner_less for shared_ptr.
   template<typename _Tp>
 struct owner_less<shared_ptr<_Tp>>
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index e844c9c..1474df6 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1506,6 +1506,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return __lhs.owner_before(__rhs); }
 };
 
+  template<>
+struct _Sp_owner_less<void, void>
+{
+  template<typename _Tp, typename _Up>
+   auto
+   operator()(const _Tp& __lhs, const _Up& __rhs) const
+   -> decltype(__lhs.owner_before(__rhs))
+   { return __lhs.owner_before(__rhs); }
+
+  using is_transparent = void;
+};
+
   template<typename _Tp, _Lock_policy _Lp>
 struct owner_less<__shared_ptr<_Tp, _Lp>>
 : public _Sp_owner_less<__shared_ptr<_Tp, _Lp>, __weak_ptr<_Tp, _Lp>>
diff --git a/libstdc++-v3/include/bits/stl_function.h 
b/libstdc++-v3/include/bits/stl_function.h
index 68f39ff..1408da6 100644
--- a/libstdc++-v3/include/bits/stl_function.h
+++ b/libstdc++-v3/include/bits/stl_function.h
@@ -224,7 +224,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus > 201103L
 
-#define __cpp_lib_transparent_operators 201210
+#define __cpp_lib_transparent_operators 201510
 
   template<>
 struct plus
diff --git a/libstdc++-v3/testsuite/20_util/owner_less/void.cc 
b/libstdc++-v3/testsuite/20_util/owner_less/void.cc
new file mode 100644
index 000..4facbf5
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/owner_less/void.cc
@@ -0,0 +1,48 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do compile { target c++11 } }
+
+#include <memory>
+
+#if __cplusplus >= 201402L
+// The feature-test macro is only defined for C++14 and later.
+# if __cpp_lib_transparent_operators < 201510
+#  error "__cpp_lib_transparent_operators < 201510"
+# endif
+#endif
+
+void
+test01()
+{
+  using namespace std;
+
+  static_assert(is_same<owner_less<>, owner_less<void>>::value,
+"owner_less<> uses void specialization");
+
+  shared_ptr sp1;
+  shared_ptr sp2;
+  shared_ptr sp3;
+  weak_ptr wp1;
+
+  owner_less<> cmp;
+  cmp(sp1, sp2);
+  cmp(sp1, wp1);
+  cmp(sp1, sp3);
+  cmp(wp1, sp1);
+  cmp(wp1, wp1);
+}
diff --git a/libstdc++-v3/testsuite/experimental/feat-cxx14.cc 
b/libstdc++-v3/testsuite/experimental/feat-cxx14.cc
index 42b633f6..c61f7b0 100644
--- a/libstdc++-v3/testsuite/experimental/feat-cxx14.cc
+++ b/libstdc++-v3/testsuite/experimental/feat-cxx14.cc
@@ -40,8 +40,8 @@
 
 #ifndef  __cpp_lib_transparent_operators
 #  error "__cpp_lib_transparent_operators"
-#elif  __cpp_lib_transparent_operators != 201210
-#  error "__cpp_lib_

[PATCH] Define std::as_const

2016-08-03 Thread Jonathan Wakely

Another C++17 feature.

* include/std/utility (as_const): Define.
* testsuite/20_util/as_const/1.cc: New test.
* testsuite/20_util/as_const/rvalue_neg.cc: New test.

Tested powerpc64-linux, committed to trunk.
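
A brief usage sketch (C++17): std::as_const is handy for selecting a const overload without spelling out a cast. The Buffer type below is invented for the example:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type with const and non-const overloads of front().
struct Buffer {
  int mutable_reads = 0;
  int data = 7;
  int& front() { ++mutable_reads; return data; }
  const int& front() const { return data; }
};

void check_as_const() {
  Buffer b;
  static_assert(
      std::is_same<decltype(std::as_const(b)), const Buffer&>::value,
      "as_const yields a const lvalue reference");

  assert(std::as_const(b).front() == 7);  // picks the const overload
  assert(b.mutable_reads == 0);
  assert(b.front() == 7);                 // picks the non-const overload
  assert(b.mutable_reads == 1);
  // std::as_const(Buffer{}) would be rejected: the rvalue overload is deleted.
}
```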

commit c6d91a7d6d5ec64f8c4e84cd79aadde72f01a4f4
Author: Jonathan Wakely 
Date:   Wed Aug 3 18:43:20 2016 +0100

Define std::as_const

* include/std/utility (as_const): Define.
* testsuite/20_util/as_const/1.cc: New test.
* testsuite/20_util/as_const/rvalue_neg.cc: New test.

diff --git a/libstdc++-v3/include/std/utility b/libstdc++-v3/include/std/utility
index 106ba4d..0c03644 100644
--- a/libstdc++-v3/include/std/utility
+++ b/libstdc++-v3/include/std/utility
@@ -356,6 +356,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template 
 in_place_tag in_place(__in_place_index<_Idx>*) {terminate();}
 
+#define  __cpp_lib_as_const 201510
+  template<typename _Tp>
+constexpr add_const_t<_Tp>& as_const(_Tp& __t) noexcept { return __t; }
+
+  template<typename _Tp>
+void as_const(const _Tp&&) = delete;
+
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/testsuite/20_util/as_const/1.cc 
b/libstdc++-v3/testsuite/20_util/as_const/1.cc
new file mode 100644
index 000..2f257b4
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/as_const/1.cc
@@ -0,0 +1,30 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17" }
+// { dg-do compile }
+
+#include <utility>
+
+#if __cpp_lib_as_const != 201510
+# error "__cpp_lib_as_const != 201510"
+#endif
+
+int i;
+constexpr auto& ci = std::as_const(i);
+static_assert( &i == &ci );
+static_assert( std::is_same_v );
diff --git a/libstdc++-v3/testsuite/20_util/as_const/rvalue_neg.cc 
b/libstdc++-v3/testsuite/20_util/as_const/rvalue_neg.cc
new file mode 100644
index 000..3fe9e18
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/as_const/rvalue_neg.cc
@@ -0,0 +1,28 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17" }
+// { dg-do compile }
+
+#include <utility>
+
+void test01()
+{
+  int i = 0;
+  std::as_const(std::move(i)); // { dg-error "deleted function" }
+  std::as_const(0);// { dg-error "deleted function" }
+}


one more patch for PR72778

2016-08-03 Thread Vladimir N Makarov
  The following patch fixes a bug reported by Uros on a bootstrap with 
golang:


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72778

  The patch was bootstrapped on x86-64 with golang.

Committed as rev. 239091.
Index: ChangeLog
===
--- ChangeLog	(revision 239090)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2016-08-03  Vladimir Makarov  
+
+	PR middle-end/72778
+	* lra-spills.c (regno_in_use_p): Check bb and regno modification.
+	Don't stop on regular insns.
+
 2016-08-03  Nathan Sidwell  
 
 	* config/nvptx/nvptx.c (nvptx_declare_function_name): Round frame
Index: lra-spills.c
===
--- lra-spills.c	(revision 239000)
+++ lra-spills.c	(working copy)
@@ -686,16 +686,40 @@ return_regno_p (unsigned int regno)
   return false;
 }
 
-/* Return true if REGNO is one of subsequent USE after INSN.  */
+/* Return true if REGNO is in one of subsequent USE after INSN in the
+   same BB.  */
 static bool
 regno_in_use_p (rtx_insn *insn, unsigned int regno)
 {
+  static lra_insn_recog_data_t id;
+  static struct lra_static_insn_data *static_id;
+  struct lra_insn_reg *reg;
+  int i, arg_regno;
+  basic_block bb = BLOCK_FOR_INSN (insn);
+
   while ((insn = next_nondebug_insn (insn)) != NULL_RTX
-	 && INSN_P (insn) && GET_CODE (PATTERN (insn)) == USE)
+	 && bb == BLOCK_FOR_INSN (insn))
 {
-  if (REG_P (XEXP (PATTERN (insn), 0))
+  if (! INSN_P (insn))
+	continue;
+  if (GET_CODE (PATTERN (insn)) == USE
+	  && REG_P (XEXP (PATTERN (insn), 0))
 	  && regno == REGNO (XEXP (PATTERN (insn), 0)))
-	return TRUE;
+	return true;
+  /* Check that the regno is not modified.  */
+  id = lra_get_insn_recog_data (insn);
+  for (reg = id->regs; reg != NULL; reg = reg->next)
+	if (reg->type != OP_IN && reg->regno == (int) regno)
+	  return false;
+  static_id = id->insn_static_data;
+  for (reg = static_id->hard_regs; reg != NULL; reg = reg->next)
+	if (reg->type != OP_IN && reg->regno == (int) regno)
+	  return false;
+  if (id->arg_hard_regs != NULL)
+	for (i = 0; (arg_regno = id->arg_hard_regs[i]) >= 0; i++)
+	  if ((int) regno == (arg_regno >= FIRST_PSEUDO_REGISTER
+			  ? arg_regno : arg_regno - FIRST_PSEUDO_REGISTER))
+	return false;
 }
   return false;
 }


Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-08-03 Thread Martin Sebor

>>> Do you have ideas about how to improve the naming?  Perhaps change
>>> TYPE_ANONYMOUS_P to TYPE_NO_LINKAGE_NAME?
>>
>> I haven't thought about changing names but TYPE_NO_LINKAGE_NAME
>> seems better than TYPE_ANONYMOUS_P.
>
> Or perhaps TYPE_UNNAMED_P.


TYPE_UNNAMED_P would work but it wouldn't be a replacement for
TYPE_ANONYMOUS_P.

It sounds like TYPE_ANONYMOUS_P is the right name and the problem
is that the value it returns isn't accurate until the full context
to which it applies has been seen.

I wonder if the right solution to this class of problems (which
are probably unavoidable in the front end as the tree is being
constructed), is to design an API that prevents using these
"unreliable" queries until they can return a reliable result.
With this approach, as each tree node is being constructed, its
"dynamic" type would reflect only the most specific entity that
has been determined so far (e.g., here, the type would be STRUCT
but not ANONYMOUS_STRUCT, with the latter "derived" from the
former). I suspect that implementing this model using C++
polymorphism would be far too big and slow to be practical but
it could be done via some lighter-weight mechanism that would
avoid these problems.
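
To sketch what such a lighter-weight mechanism might look like: everything below is hypothetical and deliberately unrelated to GCC's actual tree API. The query reports "not known yet" until the surrounding context has been seen, so callers cannot mistake a provisional answer for a final one:

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a front-end node describing an unnamed struct.
class UnnamedStructNode {
 public:
  // Called once the parser knows whether a declarator follows the body;
  // in this simplified model, "no declarator" means an anonymous struct.
  void set_context_complete(bool has_declarator) {
    anonymous_ = !has_declarator;
  }

  // Empty until the context is complete, then a definite yes or no.
  std::optional<bool> is_anonymous_aggregate() const { return anonymous_; }

 private:
  std::optional<bool> anonymous_;
};

void check_deferred_query() {
  UnnamedStructNode node;
  assert(!node.is_anonymous_aggregate().has_value());  // too early to say
  node.set_context_complete(/*has_declarator=*/false);
  assert(node.is_anonymous_aggregate().value());
}
```

The cost is that every caller has to handle the "unknown" state explicitly, which is the kind of adjustment a real change would need across many uses.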

Martin


[PATCH] Define std::shared_ptr::weak_type

2016-08-03 Thread Jonathan Wakely

Another tiny C++17 feature.

* include/bits/shared_ptr.h (shared_ptr::weak_type): Define.
* include/bits/shared_ptr_base.h (__shared_ptr::weak_type): Define.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust dg-error.
* testsuite/20_util/shared_ptr/requirements/weak_type.cc: New test.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise.

Tested x86_64-linux, committed to trunk
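
A small usage sketch (C++17): the member typedef lets generic code name the matching weak pointer type without taking the shared pointer type apart. The observe helper is invented for this example:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical helper: returns a non-owning observer for any shared_ptr.
template <typename SharedPtr>
typename SharedPtr::weak_type observe(const SharedPtr& sp) {
  return sp;  // weak_ptr<T> converts implicitly from shared_ptr<T>
}

void check_observe() {
  auto sp = std::make_shared<int>(5);
  auto wp = observe(sp);
  static_assert(std::is_same<decltype(wp), std::weak_ptr<int>>::value,
                "weak_type of shared_ptr<int> is weak_ptr<int>");

  assert(!wp.expired());
  assert(*wp.lock() == 5);
  sp.reset();  // last owner goes away
  assert(wp.expired());
}
```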


commit b52141d2180fc30622cf7c4208c949861b7e9e29
Author: Jonathan Wakely 
Date:   Wed Aug 3 19:21:31 2016 +0100

Define std::shared_ptr::weak_type

* include/bits/shared_ptr.h (shared_ptr::weak_type): Define.
* include/bits/shared_ptr_base.h (__shared_ptr::weak_type): Define.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust dg-error.
* testsuite/20_util/shared_ptr/requirements/weak_type.cc: New test.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/shared_ptr.h 
b/libstdc++-v3/include/bits/shared_ptr.h
index 16f78f7..483c2bc 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -97,6 +97,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  = typename enable_if::value>::type;
 
 public:
+
+#if __cplusplus > 201402L
+# define __cpp_lib_shared_ptr_weak_type 201606
+  using weak_type = weak_ptr<_Tp>;
+#endif
   /**
*  @brief  Construct an empty %shared_ptr.
*  @post   use_count()==0 && get()==0
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index 1474df6..93ce901 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -876,6 +876,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   typedef _Tp   element_type;
 
+#if __cplusplus > 201402L
+  using weak_type = __weak_ptr<_Tp, _Lp>;
+#endif
+
   constexpr __shared_ptr() noexcept
   : _M_ptr(0), _M_refcount()
   { }
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
index 395094f..1b3dc1d 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
@@ -32,7 +32,7 @@ void test01()
 {
   X* px = 0;
   std::shared_ptr p1(px);   // { dg-error "here" }
-  // { dg-error "incomplete" "" { target *-*-* } 889 }
+  // { dg-error "incomplete" "" { target *-*-* } 893 }
 
   std::shared_ptr p9(ap());  // { dg-error "here" }
   // { dg-error "incomplete" "" { target *-*-* } 307 }
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc
index 8843ffe..399d2f0 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/void_neg.cc
@@ -25,5 +25,5 @@
 void test01()
 {
   std::shared_ptr p((void*)nullptr);   // { dg-error "here" }
-  // { dg-error "incomplete" "" { target *-*-* } 888 }
+  // { dg-error "incomplete" "" { target *-*-* } 892 }
 }
diff --git 
a/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/weak_type.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/weak_type.cc
new file mode 100644
index 000..38f9502
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/weak_type.cc
@@ -0,0 +1,31 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17" }
+// { dg-do compile }
+
+#include <memory>
+
+using std::shared_ptr;
+using std::weak_ptr;
+using std::is_same_v;
+
+static_assert( is_same_v::weak_type, weak_ptr> );
+static_assert( is_same_v::weak_type, weak_ptr> );
+
+struct X { };
+static_assert( is_same_v::weak_type, weak_ptr> );


[PATCH] Define feature-test macro for std::enable_shared_from_this

2016-08-03 Thread Jonathan Wakely

Another feature we already support, so just define the macro.

* include/bits/shared_ptr_base.h (__cpp_lib_enable_shared_from_this):
Define feature-test macro.
* testsuite/20_util/enable_shared_from_this/members/reinit.cc: Test
for the macro.

Tested x86_64-linux, committed to trunk.
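
A minimal sketch of how code could key off this macro, for example to use weak_from_this() only when the library advertises the C++17 behaviour. Session is a made-up type, and the fallback branch is only valid when the object is already owned by a shared_ptr:

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that hands out non-owning references to itself.
struct Session : std::enable_shared_from_this<Session> {
  std::weak_ptr<Session> observer() {
#if defined(__cpp_lib_enable_shared_from_this) && \
    __cpp_lib_enable_shared_from_this >= 201603
    return weak_from_this();
#else
    // Assumes *this is already managed by a shared_ptr.
    return shared_from_this();
#endif
  }
};

void check_observer() {
  auto s = std::make_shared<Session>();
  std::weak_ptr<Session> w = s->observer();
  assert(w.lock() == s);
  s.reset();
  assert(w.expired());
}
```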

commit 3a75677f2b6f2b3d2b01138c82bce5c051859e94
Author: Jonathan Wakely 
Date:   Wed Aug 3 19:47:53 2016 +0100

Define feature-test macro for std::enable_shared_from_this

* include/bits/shared_ptr_base.h (__cpp_lib_enable_shared_from_this):
Define feature-test macro.
* testsuite/20_util/enable_shared_from_this/members/reinit.cc: Test
for the macro.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index 93ce901..2698ba4 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1472,6 +1472,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void
   _M_assign(_Tp* __ptr, const __shared_count<_Lp>& __refcount) noexcept
   {
+#define __cpp_lib_enable_shared_from_this 201603
if (use_count() == 0)
  {
_M_ptr = __ptr;
diff --git 
a/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/reinit.cc 
b/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/reinit.cc
index 4ce23bc..1cf9148 100644
--- a/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/reinit.cc
+++ b/libstdc++-v3/testsuite/20_util/enable_shared_from_this/members/reinit.cc
@@ -20,6 +20,10 @@
 #include 
 #include 
 
+#if __cpp_lib_enable_shared_from_this < 201603
+# error "__cpp_lib_enable_shared_from_this < 201603"
+#endif
+
 struct X : public std::enable_shared_from_this { };
 
 bool


C++ PATCH to allow constexpr ctor with typedef declaration in C++11 (c++/70229)

2016-08-03 Thread Marek Polacek
In C++11, a constexpr constructor must have an empty body, with a few
exceptions, one of them being:
- typedef declarations and alias-declarations that do not define
  classes or enumerations
But we were rejecting constexpr constructors whose body consists only of
a typedef declaration.

Bootstrapped/regtested on x86_64-linux, ok for trunk?
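
For reference, a small standalone example of the kind of body C++11 permits; it is separate from the new test case, and the Meters type is made up:

```cpp
// Hypothetical literal type whose constexpr constructor body contains only
// declarations that C++11 allows there.
struct Meters {
  int value;
  constexpr explicit Meters(int v) : value(v) {
    typedef int rep;        // typedef defining no class or enum: allowed
    using also_rep = rep;   // alias-declaration: allowed
    static_assert(sizeof(rep) == sizeof(also_rep), "same underlying type");
  }
};

constexpr Meters kThreeMeters(3);
static_assert(kThreeMeters.value == 3, "usable in a constant expression");
```

A typedef that defines a class or enumeration inside the body remains ill-formed in C++11, which is what the S3 and S4 cases in the test check.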

2016-08-03  Marek Polacek  

PR c++/70229
* constexpr.c (check_constexpr_ctor_body_1): Allow typedef
declarations.

* g++.dg/cpp0x/constexpr-ctor19.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index edade48..41665c5 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -425,7 +425,8 @@ check_constexpr_ctor_body_1 (tree last, tree list)
   switch (TREE_CODE (list))
 {
 case DECL_EXPR:
-  if (TREE_CODE (DECL_EXPR_DECL (list)) == USING_DECL)
+  if (TREE_CODE (DECL_EXPR_DECL (list)) == USING_DECL
+ || TREE_CODE (DECL_EXPR_DECL (list)) == TYPE_DECL)
return true;
   return false;
 
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C
index e69de29..f5ef053 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C
@@ -0,0 +1,42 @@
+// PR c++/70229
+// { dg-do compile { target c++11 } }
+
+template 
+class S {
+  constexpr S (void) {
+typedef int T;
+  }
+};
+
+template 
+class S2 {
+  constexpr S2 (void) {
+;
+  }
+};
+
+template 
+class S3 {
+  constexpr S3 (void) {
+typedef enum { X } E;
+  } // { dg-error "does not have empty body" "" { target c++11_only } }
+};
+
+template 
+class S4 {
+  constexpr S4 (void) {
+typedef struct { int j; } U;
+  } // { dg-error "does not have empty body" "" { target c++11_only } }
+};
+
+struct V
+{
+  int i;
+};
+
+template 
+class S5 {
+  constexpr S5 (void) {
+typedef V W;
+  }
+};

Marek


Go patch committed: use a cache for interface methods

2016-08-03 Thread Ian Lance Taylor
The Go frontend constructs the list of interface methods in a few
different cases.  These can be called frequently, but the list is
reconstructed each time.  This patch by Than McIntosh uses a cache.
It reduces the memory usage of the frontend from around 16 MB to around
10 MB when compiling the fmt package.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
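
The caching idea can be sketched independently of the frontend. The types and the build counter below are hypothetical stand-ins, not gofrontend classes; the point is only the function-local static map keyed by the interface pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the interface type and the derived result.
struct Interface { std::vector<std::string> methods; };
struct MethodTable { std::size_t slots; };

// Counts real constructions; present only to make the caching observable.
std::size_t& tables_built() {
  static std::size_t count = 0;
  return count;
}

// Builds the table for ITYPE on first request and reuses it afterwards.
// Keys are addresses, so each Interface must outlive its cache entry.
const MethodTable* method_table_for(const Interface* itype) {
  static std::unordered_map<const Interface*, std::unique_ptr<MethodTable>>
      cache;
  auto it = cache.find(itype);
  if (it != cache.end())
    return it->second.get();

  // One slot for the type descriptor plus one per method.
  std::unique_ptr<MethodTable> built(
      new MethodTable{itype->methods.size() + 1});
  ++tables_built();
  const MethodTable* result = built.get();
  cache[itype] = std::move(built);
  return result;
}

void check_method_table_cache() {
  Interface reader;
  reader.methods.push_back("Read");
  reader.methods.push_back("Close");

  const std::size_t before = tables_built();
  const MethodTable* first = method_table_for(&reader);
  const MethodTable* second = method_table_for(&reader);
  assert(first == second);               // second call hits the cache
  assert(first->slots == 3);
  assert(tables_built() == before + 1);  // built exactly once
}
```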

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 239083)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7d6c53910e52b7db2a77c1c1c3bc2c170283a1fa
+0fb416a7bed076bdfef168480789bb2994a58de3
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 239083)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -14114,16 +14114,27 @@ Interface_info_expression::do_type()
 {
 case INTERFACE_INFO_METHODS:
   {
+typedef Unordered_map(Interface_type*, Type*) Hashtable;
+static Hashtable result_types;
+
+Interface_type* itype = this->iface_->type()->interface_type();
+
+Hashtable::const_iterator p = result_types.find(itype);
+if (p != result_types.end())
+  return p->second;
+
 Type* pdt = Type::make_type_descriptor_ptr_type();
-if (this->iface_->type()->interface_type()->is_empty())
-  return pdt;
+if (itype->is_empty())
+  {
+result_types[itype] = pdt;
+return pdt;
+  }
 
 Location loc = this->location();
 Struct_field_list* sfl = new Struct_field_list();
 sfl->push_back(
 Struct_field(Typed_identifier("__type_descriptor", pdt, loc)));
 
-Interface_type* itype = this->iface_->type()->interface_type();
 for (Typed_identifier_list::const_iterator p = itype->methods()->begin();
  p != itype->methods()->end();
  ++p)
@@ -14156,7 +14167,9 @@ Interface_info_expression::do_type()
 sfl->push_back(Struct_field(Typed_identifier(fname, mft, loc)));
   }
 
-return Type::make_pointer_type(Type::make_struct_type(sfl, loc));
+Pointer_type *pt = Type::make_pointer_type(Type::make_struct_type(sfl, loc));
+result_types[itype] = pt;
+return pt;
   }
 case INTERFACE_INFO_OBJECT:
   return Type::make_pointer_type(Type::make_void_type());
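
The caching idea in the patch above can be sketched in isolation.  This is only an illustration under stated assumptions: Interface, Result_type, build_result_type and build_count are hypothetical stand-ins, not gofrontend classes, and std::unordered_map is used in place of the frontend's Unordered_map macro.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the frontend's types; not the real gofrontend classes.
struct Interface { std::string name; };
struct Result_type { std::string description; };

// Counts how often the expensive path runs, for illustration only.
static int build_count = 0;

// Stands in for the costly construction of the struct/pointer type.
static Result_type* build_result_type(const Interface* itype)
{
  ++build_count;
  return new Result_type{"methods table for " + itype->name};
}

// Return the type for ITYPE, building it at most once per interface.
// The cached objects deliberately live for the rest of the program,
// as with the function-local static table in the patch.
Result_type* cached_result_type(const Interface* itype)
{
  assert(itype != nullptr);
  static std::unordered_map<const Interface*, Result_type*> cache;
  auto p = cache.find(itype);
  if (p != cache.end())
    return p->second;
  Result_type* rt = build_result_type(itype);
  cache[itype] = rt;
  return rt;
}
```

Calling cached_result_type twice with the same pointer returns the same object and runs build_result_type only once; the key is the pointer identity of the interface, which matches the patch's use of Interface_type* as the map key.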


Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-08-03 Thread Jason Merrill
On Wed, Aug 3, 2016 at 3:10 PM, Martin Sebor  wrote:
>>>> Do you have ideas about how to improve the naming?  Perhaps change
>>>> TYPE_ANONYMOUS_P to TYPE_NO_LINKAGE_NAME?
>>>
>>> I haven't thought about changing names but TYPE_NO_LINKAGE_NAME
>>> seems better than TYPE_ANONYMOUS_P.
>>
>> Or perhaps TYPE_UNNAMED_P.
>
> TYPE_UNNAMED_P would work but it wouldn't be a replacement for
> TYPE_ANONYMOUS_P.
>
> It sounds like TYPE_ANONYMOUS_P is the right name and the problem
> is that the value it returns isn't accurate until the full context
> to which it applies has been seen.

I think you're thinking of ANON_AGGR_TYPE_P, which identifies
anonymous structs/unions; TYPE_ANONYMOUS_P identifies unnamed classes.

> I wonder if the right solution to this class of problems (which
> are probably unavoidable in the front end as the tree is being
> constructed), is to design an API that prevents using these
> "unreliable" queries until they can return a reliable result.

It would be possible to change ANON_AGGR_TYPE_P to require
COMPLETE_TYPE_P, but a lot of uses will need to be adjusted to avoid
crashing.

Jason


Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-08-03 Thread Martin Sebor

On 08/03/2016 02:01 PM, Jason Merrill wrote:

On Wed, Aug 3, 2016 at 3:10 PM, Martin Sebor  wrote:

Do you have ideas about how to improve the naming?  Perhaps change
TYPE_ANONYMOUS_P to TYPE_NO_LINKAGE_NAME?


I haven't thought about changing names but TYPE_NO_LINKAGE_NAME
seems better than TYPE_ANONYMOUS_P.


Or perhaps TYPE_UNNAMED_P.


TYPE_UNNAMED_P would work but it wouldn't be a replacement for
TYPE_ANONYMOUS_P.

It sounds like TYPE_ANONYMOUS_P is the right name and the problem
is that the value it returns isn't accurate until the full context
to which it applies has been seen.


I think you're thinking of ANON_AGGR_TYPE_P, which identifies
anonymous structs/unions; TYPE_ANONYMOUS_P identifies unnamed classes.


Doh!  You're right.  I let the name confuse me again. Clearly
TYPE_ANONYMOUS_P isn't the best name since it doesn't correspond
to the C/C++ concept of an anonymous struct or union.  TYPE_UNNAMED
would be better (the same can be said about the C++ diagnostics that
refer to unnamed structs as .)




I wonder if the right solution to this class of problems (which
are probably unavoidable in the front end as the tree is being
constructed), is to design an API that prevents using these
"unreliable" queries until they can return a reliable result.


It would be possible to change ANON_AGGR_TYPE_P to require
COMPLETE_TYPE_P, but a lot of uses will need to be adjusted to avoid
crashing.


No, crashing shouldn't happen.  It shouldn't be possible to call
the function unless/until the node that represents the concept
has been fully constructed. Using C++ syntax:

  void foo (tree *t)
  {
    if (ANONYMOUS_STRUCT *anon = dynamic_cast<ANONYMOUS_STRUCT *>(t))
      anon->function_only_defined_in_anonymous_struct ();
  }

I was hoping something like this was close to what someone (Andrew?)
has been working on.

Martin
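
A self-contained sketch of the API shape described above, assuming a class hierarchy that GCC does not have (its tree is a discriminated union).  Node, Incomplete_struct, Anonymous_struct and anonymous_member_count are hypothetical names used only for illustration.

```cpp
#include <cassert>  // for the checks mentioned in the note below

// Hypothetical node hierarchy; none of these names exist in GCC.
struct Node
{
  virtual ~Node() {}
};

// A node whose kind is not known yet; it offers no anonymous-struct queries.
struct Incomplete_struct : Node
{
};

struct Anonymous_struct : Node
{
  // Only reachable once the node is known to be an anonymous struct.
  int member_count() const { return members; }
  int members = 0;
};

// Returns the member count, or -1 when NODE is not (yet) an anonymous struct.
int anonymous_member_count(Node* node)
{
  if (Anonymous_struct* anon = dynamic_cast<Anonymous_struct*>(node))
    return anon->member_count();
  return -1;
}
```

With this shape the query cannot be asked of a node of the wrong kind: for an Incomplete_struct (or a null pointer) the cast yields a null pointer and the function returns -1, so a check such as assert(anonymous_member_count(&incomplete) == -1) holds.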


Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-08-03 Thread Jason Merrill
On Wed, Aug 3, 2016 at 4:23 PM, Martin Sebor  wrote:
> On 08/03/2016 02:01 PM, Jason Merrill wrote:
>> On Wed, Aug 3, 2016 at 3:10 PM, Martin Sebor  wrote:
>>
>>>>>> Do you have ideas about how to improve the naming?  Perhaps change
>>>>>> TYPE_ANONYMOUS_P to TYPE_NO_LINKAGE_NAME?
>>>>>
>>>>> I haven't thought about changing names but TYPE_NO_LINKAGE_NAME
>>>>> seems better than TYPE_ANONYMOUS_P.
>>>>
>>>> Or perhaps TYPE_UNNAMED_P.
>>>
>>> TYPE_UNNAMED_P would work but it wouldn't be a replacement for
>>> TYPE_ANONYMOUS_P.
>>>
>>> It sounds like TYPE_ANONYMOUS_P is the right name and the problem
>>> is that the value it returns isn't accurate until the full context
>>> to which it applies has been seen.
>>
>> I think you're thinking of ANON_AGGR_TYPE_P, which identifies
>> anonymous structs/unions; TYPE_ANONYMOUS_P identifies unnamed classes.
>
> Doh!  You're right.  I let the name confuse me again. Clearly
> TYPE_ANONYMOUS_P isn't the best name since it doesn't correspond
> to the C/C++ concept of an anonymous struct or union.  TYPE_UNNAMED
> would be better (the same can be said about the C++ diagnostics that
> refer to unnamed structs as .)

I'll make this change; sorry for the merge conflict it will cause.

>>> I wonder if the right solution to this class of problems (which
>>> are probably unavoidable in the front end as the tree is being
>>> constructed), is to design an API that prevents using these
>>> "unreliable" queries until they can return a reliable result.
>>
>> It would be possible to change ANON_AGGR_TYPE_P to require
>> COMPLETE_TYPE_P, but a lot of uses will need to be adjusted to avoid
>> crashing.
>
> No, crashing shouldn't happen.  It shouldn't be possible to call
> the function unless/until the node that represents the concept
> has been fully constructed. Using C++ syntax:
>
>   void foo (tree *t)
>   {
>     if (ANONYMOUS_STRUCT *anon = dynamic_cast<ANONYMOUS_STRUCT *>(t))
>       anon->function_only_defined_in_anonymous_struct ();
>   }
>
> I was hoping something like this was close to what someone (Andrew?)
> has been working on.

I think for a while he was working on changing trees to use
inheritance rather than a discriminated union, but don't think he is
anymore.

Jason


Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-08-03 Thread Jeff Law

On 08/03/2016 11:41 AM, Bernd Edlinger wrote:

On 08/03/16 17:38, Jeff Law wrote:

cse.c changes look good, but I'd really like to see a testcase for each
issue in the dejagnu framework.  Extra points if you tried to build a
unit test using David M's framework, but that isn't required.

The testcase from 70903 ought to be trivial to add to the dejagnu suite.
  71779 might be more difficult, but if you could take a stab, it'd be
appreciated.




Yes, sure.  I had assumed that the pr70903 test case is using some
target-specific vector types, but now I see that it even works as-is in
the gcc.c-torture/execute directory.

So I've added the test case to the cse patch.  And quickly verified that
it works on x86_64-linux-gnu.


The pr71779 test case will be pretty difficult to reduce, because it
depends on combine to do the incorrect transformation and lra to spill
the subreg, and on the stack content at runtime to be non-zero.

But technically it *is* already in the isl-test suite, so if isl is
in-tree, it is always executed by make check or make check-isl.

It is just that gmp/mpfr/mpc and isl test results are not included by
contrib/test_summary, but that should be fixable.  What do you think?

Actually that should not be too difficult, as there are test-suite.log
files that we could just add to the test_summary output as-is, for
instance:

cat isl/test-suite.log

==
isl 0.16.1: ./test-suite.log
==

# TOTAL: 5
# PASS:  5
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2


Are the patches OK now?

Yes.  Thanks for taking care of this...

Jeff



fix fallout of pr22051-2.c on arm

2016-08-03 Thread Prathamesh Kulkarni
Hi,
The attached patch fixes pr22051-2.c which regressed due to
r238754. Matthew, could you please confirm if this patch fixes the
test-case for you ?

Bootstrapped and tested on x86_64-unknown-linux-gnu.
Cross tested on arm*-*-*.
OK for trunk ?

Thanks,
Prathamesh
2016-08-04  Prathamesh Kulkarni  

* match.pd ((intptr_t) x eq/ne CST to x eq/ne (typeof x) cst): Disable
transform if operand's type is pointer to function.

diff --git a/gcc/match.pd b/gcc/match.pd
index 2380d90..9b97aac 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2532,8 +2532,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for cmp (ne eq)
  (simplify
   (cmp (convert @0) INTEGER_CST@1)
-  (if ((POINTER_TYPE_P (TREE_TYPE (@0)) && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
-       || (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && POINTER_TYPE_P (TREE_TYPE (@1))))
+  (if ((POINTER_TYPE_P (TREE_TYPE (@0)) && !FUNCTION_POINTER_TYPE_P (TREE_TYPE (@0))
+       && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
+      || (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && POINTER_TYPE_P (TREE_TYPE (@1))
+         && !FUNCTION_POINTER_TYPE_P (TREE_TYPE (@1))))
    (cmp @0 (convert @1)))))
 
 /* Non-equality compare simplifications from fold_binary  */


Re: [PATCH testsuite/PR33707]Add test case.

2016-08-03 Thread Jeff Law

On 08/03/2016 10:36 AM, Bin Cheng wrote:

Hi,
The case has already been fixed by my unsigned improvement for scev/niter, and 
it can be vectorized successfully.  This patch simply adds a test for it.
Test result checked on x86_64.  Is it OK?

Thanks,
bin

gcc/testsuite/ChangeLog
2016-08-02  Bin Cheng  

PR tree-optimization/33707
* gcc.dg/vect/pr33707.c: New test.

OK.
jeff



Re: [PR70920] transform (intptr_t) x eq/ne CST to x eq/ne (typeof x) cst

2016-08-03 Thread Prathamesh Kulkarni
On 3 August 2016 at 17:27, Matthew Wahab  wrote:
> On 29/07/16 15:32, Prathamesh Kulkarni wrote:
>>
>> On 29 July 2016 at 12:42, Richard Biener  wrote:
>>>
>>> On Fri, 29 Jul 2016, Prathamesh Kulkarni wrote:
>>>
 On 28 July 2016 at 19:18, Richard Biener  wrote:
>
> On Thu, 28 Jul 2016, Prathamesh Kulkarni wrote:
>
>> On 28 July 2016 at 15:58, Andreas Schwab  wrote:
>>>
>>> On Mo, Jul 25 2016, Prathamesh Kulkarni
>>>  wrote:
>>>
 diff --git a/gcc/testsuite/gcc.dg/pr70920-4.c
 b/gcc/testsuite/gcc.dg/pr70920-4.c
 new file mode 100644
 index 000..dedb895
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/pr70920-4.c
 @@ -0,0 +1,21 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O2 -fdump-tree-ccp-details
 -Wno-int-to-pointer-cast" } */
 +
 +#include 
 +
 +void f1();
 +void f2();
 +
 +void
 +foo (int a)
 +{
 +  void *cst = 0;
 +  if ((int *) a == cst)
 +{
 +  f1 ();
 +  if (a)
 + f2 ();
 +}
 +}
 +
 +/* { dg-final { scan-tree-dump "gimple_simplified to if
 \\(_\[0-9\]* == 0\\)" "ccp1" } } */
>>>
>>>
>>> This fails on all ilp32 platforms.
>
> [..]
>>>
>>>
>>> I don't think just matching == 0 is a good idea.  I suggest to
>>> restrict the testcase to lp64 targets and maybe add a ilp32 variant.
>>
>> Hi,
>> I restricted the test-case to lp64 targets.
>> Is this OK to commit ?
>
>
> Hello,
>
> The test case is failing for arm-none-linux-gnueabihf.
Oops, sorry about that.
>
> It is correctly skipped if the 'dg-require-effective-target lp64' you added
> is moved to the end of the directives (after the dg-options).
Indeed, it is skipped after moving to end.
Is it OK to commit the attached patch ?

Thanks,
Prathamesh
>
> Matthew
>
diff --git a/gcc/testsuite/gcc.dg/pr70920-4.c b/gcc/testsuite/gcc.dg/pr70920-4.c
index ab2748b..c83ebf9 100644
--- a/gcc/testsuite/gcc.dg/pr70920-4.c
+++ b/gcc/testsuite/gcc.dg/pr70920-4.c
@@ -1,6 +1,6 @@
-/* { dg-require-effective-target lp64 } */
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-forwprop-details -Wno-int-to-pointer-cast" } */
+/* { dg-require-effective-target lp64 } */
 
 #include 
 


Re: C++ PATCH to allow constexpr ctor with typedef declaration in C++11 (c++/70229)

2016-08-03 Thread Jason Merrill
OK.

On Wed, Aug 3, 2016 at 3:49 PM, Marek Polacek  wrote:
> In C++11, constexpr constructor must have an empty body except for
> several cases, one of them being:
> - typedef declarations and alias-declarations that do not define
>   classes or enumerations
> But we were rejecting constexpr constructors consisting of a typedef
> declaration only.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2016-08-03  Marek Polacek  
>
> PR c++/70229
> * constexpr.c (check_constexpr_ctor_body_1): Allow typedef
> declarations.
>
> * g++.dg/cpp0x/constexpr-ctor19.C: New test.
>
> diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
> index edade48..41665c5 100644
> --- gcc/cp/constexpr.c
> +++ gcc/cp/constexpr.c
> @@ -425,7 +425,8 @@ check_constexpr_ctor_body_1 (tree last, tree list)
>switch (TREE_CODE (list))
>  {
>  case DECL_EXPR:
> -  if (TREE_CODE (DECL_EXPR_DECL (list)) == USING_DECL)
> +  if (TREE_CODE (DECL_EXPR_DECL (list)) == USING_DECL
> + || TREE_CODE (DECL_EXPR_DECL (list)) == TYPE_DECL)
> return true;
>return false;
>
> diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C
> index e69de29..f5ef053 100644
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-ctor19.C
> @@ -0,0 +1,42 @@
> +// PR c++/70229
> +// { dg-do compile { target c++11 } }
> +
> +template 
> +class S {
> +  constexpr S (void) {
> +typedef int T;
> +  }
> +};
> +
> +template 
> +class S2 {
> +  constexpr S2 (void) {
> +;
> +  }
> +};
> +
> +template 
> +class S3 {
> +  constexpr S3 (void) {
> +typedef enum { X } E;
> +  } // { dg-error "does not have empty body" "" { target c++11_only } }
> +};
> +
> +template 
> +class S4 {
> +  constexpr S4 (void) {
> +typedef struct { int j; } U;
> +  } // { dg-error "does not have empty body" "" { target c++11_only } }
> +};
> +
> +struct V
> +{
> +  int i;
> +};
> +
> +template 
> +class S5 {
> +  constexpr S5 (void) {
> +typedef V W;
> +  }
> +};
>
> Marek


Re: [PATCH GCC]Simplify interface for simplify_using_initial_conditions

2016-08-03 Thread Jeff Law

On 08/03/2016 10:35 AM, Bin Cheng wrote:

Hi,
When I introduced parameter STOP for expand_simple_operations, I also added it 
for simplify_using_initial_conditions.  The STOP argument is also passed to 
simplify_using_initial_conditions in 
simple_iv_with_niters/loop_exits_before_overflow.  After analyzing the case 
reported in PR72772, I think STOP expanding is only needed for 
expand_simple_operations when handling IV.step in tree-ssa-loop-ivopts.c.  For 
other cases, like calls to simplify_using_initial_conditions, both cond and expr 
should be expanded to check tree expression equality.  This patch does so.  It 
simplifies the interface by removing parameter STOP, and also moves 
expand_simple_operations from tree_simplify_using_condition_1 to its caller.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-08-02  Bin Cheng  

PR tree-optimization/72772
* tree-ssa-loop-niter.h (simplify_using_initial_conditions): Delete
parameter STOP.
* tree-ssa-loop-niter.c (tree_simplify_using_condition_1): Delete
parameter STOP and update calls.  Move expand_simple_operations
function call from here...
(simplify_using_initial_conditions): ...to here.  Delete parameter
STOP.
(tree_simplify_using_condition): Delete parameter STOP.
* tree-scalar-evolution.c (simple_iv_with_niters): Update call to
simplify_using_initial_conditions.


OK.
jeff


Re: [PATCH] adjust spelling of constant expression in C++ diagnostics

2016-08-03 Thread Jason Merrill
OK.

On Tue, Aug 2, 2016 at 2:44 PM, Martin Sebor  wrote:
> My recently committed patch for c++/60760 triggered test suite
> failures in ILP32 mode due to a couple of problems:
>
> 1) The test assumed that (void*)1 will appear in GCC diagnostics
>as 1ul, which is correct in LP64 but not in ILP32.
> 2) GCC is inconsistent in how it spells "constant expression."
>Some errors in the C++ front end hyphenate the words while
>others don't.  Depending on which one happens to get triggered
>a test that assumes one or the other will fail.  Some tests work
>around this inconsistency by using the regex period expression
>instead of the space but they aren't consistent about it either.
>
> The attached patch corrects (1), and partially also (2) for errors
> emitted from cp/constexpr.c.  There are still more places in the
> C++ front end that use the hyphenation that should be adjusted
> but I leave that for a follow-on patch.  I chose the spelling
> without the hyphen because it's the dominant form in GCC (32 vs
> 26 diagnostics) and also for consistency with the C front end
> and with other C++ compilers.
>
> While fixing (2) I spent quite a bit of time wrestling with the
> overflow-warn-1.C tests that started failing after the removal
> of the hyphen for no apparent reason.  For some reason I don't
> fully understand, the following test fails in the g++.dg/warn/ directory but
> passes when the order of the dg-error directives
> is reversed.  My guess is that it has something to do with
> the second one being a strict subset of the first and so maybe
> when DejaGnu processes the first one it removes the errors
> matching the second as well and then fails on the second
> directive.
>
> I mention it because it seems like a gotcha worth knowing about
> when writing these types of multi-diagnostic directives.
>
> void f ()
> {
>   switch (0)
> case 1 / 0: ;   // { dg-warning "division by zero" }
> }
>
> // { dg-error "not a constant" "#1" { target *-*-*-* } 4 }
> // { dg-error "division by zero is not a constant.expression" "#2" { target
> c++11 } 4 }
>
> Martin


Re: [PATCH PR72772]Also check equality for expanded iv base.

2016-08-03 Thread Jeff Law

On 08/03/2016 10:35 AM, Bin Cheng wrote:

Hi,
Following the previous patch, this one fixes PR72772 by checking equality for 
expanded iv base.  Richard is fixing the PR by removing the degenerate PHI in the 
first place, but I think this one also catches more cases.
Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-08-02  Bin Cheng  

PR tree-optimization/72772
* tree-ssa-loop-niter.c (loop_exits_before_overflow): Check equality
for expanded base.

gcc/testsuite/ChangeLog
2016-08-02  Bin Cheng  

PR tree-optimization/72772
* gcc.dg/tree-ssa/pr72772.c: New test.


OK.
jeff


Re: [PATCH] Fix ICE on invalid variable template instantiation (PR c++/72759)

2016-08-03 Thread Jason Merrill
Why not check for error_mark_node right after the tsubst_template_args?

On Tue, Aug 2, 2016 at 10:06 AM, Patrick Palka  wrote:
> This patch fixes PR c++/72759.  The problem seems to be that when
> instantiating a variable template, we fail to propagate error_mark_node
> when its template arguments are erroneous, and we instead build a bogus
> TEMPLATE_ID_EXPR which later confuses check_initializer().  Does this
> look OK to commit after bootstrap + regtesting?
>
> gcc/cp/ChangeLog:
>
> PR c++/72759
> * pt.c (tsubst_qualified_id): Return error_mark_node if
> template_args is error_mark_node.
>
> gcc/testsuite/ChangeLog:
>
> PR c++/72759
> * g++.dg/cpp1y/pr72759.C: New test.
> ---
>  gcc/cp/pt.c  |  3 +++
>  gcc/testsuite/g++.dg/cpp1y/pr72759.C | 18 ++
>  2 files changed, 21 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/pr72759.C
>
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index a23a05a..6b70a65 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -13907,6 +13907,9 @@ tsubst_qualified_id (tree qualified_id, tree args,
>
>if (is_template)
>  {
> +  if (template_args == error_mark_node)
> +   return error_mark_node;
> +
>if (variable_template_p (expr))
> expr = lookup_and_finish_template_variable (expr, template_args,
> complain);
> diff --git a/gcc/testsuite/g++.dg/cpp1y/pr72759.C b/gcc/testsuite/g++.dg/cpp1y/pr72759.C
> new file mode 100644
> index 000..4af6ea4
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp1y/pr72759.C
> @@ -0,0 +1,18 @@
> +// PR c++/72759
> +// { dg-do compile { target c++14 } }
> +
> +template  struct SpecPerType;
> +class Specializer {
> +  public:  template  static void MbrFnTempl();
> +  template  struct A { static void InnerMemberFn(); };
> +  void Trigger() { A<0>::InnerMemberFn; }
> +};
> +template <> struct SpecPerType {
> +  using FnType = void *;
> +  template 
> +  static constexpr FnType SpecMbrFnPtr = Specializer::MbrFnTempl;
> +};
> +template  void Specializer::A::InnerMemberFn() {
> +  using Spec = SpecPerType;
> +  Spec ErrorSite = Spec::SpecMbrFnPtr;  // { dg-error "not declared" }
> +}
> --
> 2.9.2.564.g4d4f0b7
>


[Committed] Add testcase that ICEs after loop splitting patch

2016-08-03 Thread Andrew Pinski
Hi,
  I committed a testcase that ICEs after applying the "Gimple loop
splitting v2" patch to GCC 6.  The IV and the bounds had two
different types, which caused the ICE.

Thanks,
Andrew Pinski

ChangeLog:

* gcc.c-torture/compile/20160802-1.c: New testcase.
Index: testsuite/gcc.c-torture/compile/20160802-1.c
===
--- testsuite/gcc.c-torture/compile/20160802-1.c(revision 0)
+++ testsuite/gcc.c-torture/compile/20160802-1.c(revision 0)
@@ -0,0 +1,13 @@
+long g (long width, unsigned long byte) {
+  long r_hi = 0;
+  unsigned long r_lo = 0;
+  int s;
+  for (s = 0; s < width; s += 8)
+{
+  int d = width - s - 8;
+  if (s < (8 * 8))
+r_hi |= byte << (d - (8 * 8));
+}
+  return r_lo + r_hi;
+}
+
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 239098)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2016-08-03  Andrew Pinski  
+
+   * gcc.c-torture/compile/20160802-1.c: New testcase.
+
 2016-08-03  Fritz Reese  
 
* gfortran.dg/dec_intrinsic_ints.f90: New testcase.


[PATCH/AARCH64] Add ThunderX vector cost model

2016-08-03 Thread Andrew Pinski
Hi,
  This patch adds a vector cost model to the thunderx tuning model.  I
benchmarked this on SPEC CPU INT 2006 and got a small speedup.  I
have a few more cost model patches that I am going to upstream, but they
are going to be split up.

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64.c (thunderx_vector_cost): New variable.
(thunderx_tunings): Use thunderx_vector_cost instead of generic_vector_cost.
Index: config/aarch64/aarch64.c
===
--- config/aarch64/aarch64.c(revision 239098)
+++ config/aarch64/aarch64.c(working copy)
@@ -376,6 +376,24 @@ static const struct cpu_vector_cost gene
   1 /* cond_not_taken_branch_cost  */
 };
 
+/* ThunderX costs for vector insn classes.  */
+static const struct cpu_vector_cost thunderx_vector_cost =
+{
+  1, /* scalar_stmt_cost  */
+  3, /* scalar_load_cost  */
+  1, /* scalar_store_cost  */
+  4, /* vec_stmt_cost  */
+  4, /* vec_permute_cost  */
+  2, /* vec_to_scalar_cost  */
+  2, /* scalar_to_vec_cost  */
+  3, /* vec_align_load_cost  */
+  10, /* vec_unalign_load_cost  */
+  10, /* vec_unalign_store_cost  */
+  1, /* vec_store_cost  */
+  3, /* cond_taken_branch_cost  */
+  3 /* cond_not_taken_branch_cost  */
+};
+
 /* Generic costs for vector insn classes.  */
 static const struct cpu_vector_cost cortexa57_vector_cost =
 {
@@ -677,7 +695,7 @@ static const struct tune_params thunderx
   &thunderx_extra_costs,
   &generic_addrcost_table,
   &thunderx_regmove_cost,
-  &generic_vector_cost,
+  &thunderx_vector_cost,
   &generic_branch_cost,
   &generic_approx_modes,
   6, /* memmov_cost  */


Re: libgo patch committed: Update to 1.7rc3

2016-08-03 Thread Ian Lance Taylor
On Thu, Jul 28, 2016 at 2:29 AM, Uros Bizjak  wrote:
>
>> I have committed a patch to update libgo to the 1.7rc3 release
>> candidate.  This is very close to the upcoming 1.7 release.  As usual
>> with libgo updates, the patch is too large to include in this e-mail
>> message.  I've appended the changes to the gccgo-specific directories.
>
> There is an issue with
>
> libgo/go/crypto/sha1/issue15617_test.go.
>
> The test crypto/sha1 fails on alpha-linux-gnu with:
>
> --- FAIL: TestOutOfBoundsRead (0.00s)
> panic: invalid argument [recovered]
> panic: invalid argument
> ...
>
> since the test hard-codes 4k pages, but alpha uses 8k pages.
>
> It looks that the second line of build directives in the test:
>
> // +build amd64
> // +build linux darwin
>
> overwrites the first one, so the test runs also on non-amd64
> architecture linux OS. I have confirmed this by removing the second
> build directive, and crypto/sha1 test then passed, since
> issue15617_test.go was not linked into the final executable.

Thanks.  Looking into this revealed some problems with the handling of
multiple +build lines.  Not only was the shell script not anding them
together as required, it wasn't even distinguishing them since the
shell drops newlines in backquoted data.  Fixed with the appended
patch, now committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 239095)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-0fb416a7bed076bdfef168480789bb2994a58de3
+3096ac81185edacbf800783f0f803d1c419dccdd
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/testsuite/gotest
===
--- libgo/testsuite/gotest  (revision 238653)
+++ libgo/testsuite/gotest  (working copy)
@@ -313,56 +313,60 @@ x)
esac
 
if test x$tag1 != xnonmatchingtag -a x$tag2 != xnonmatchingtag; then
-   taglines=`sed '/^package /q' < $f | fgrep '// +build '`
-   if test "$taglines" = ""; then
-   omatch=true
-   else
-   omatch=false
-   fi
-   for tags in $taglines; do
-   match=false
-   for tag in $tags; do
-   reverse=false
-   case $tag in
-   "!"*)
-   reverse=true
-   tag=`echo $tag | sed -e 's/^!//'`
-   ;;
-   esac
-
-   case $tag in
-   "//" | "+build")
-   ;;
-   $goos | $goarch | cgo)
-   match=true
-   ;;
-   *,*)
-   match=true
-   for ctag in `echo $tag | sed -e 's/,/ /g'`; do
-   case $ctag in
-   $goos | $goarch | cgo)
-   ;;
-   *)
-   match=false
-   ;;
-   esac
-   done
-   ;;
-   esac
+   tags=`sed '/^package /q' < $f | fgrep '// +build '`
+   omatch=true
+   first=true
+   match=false
+   for tag in $tags; do
+   reverse=false
+   case $tag in
+   "!"*)
+   reverse=true
+   tag=`echo $tag | sed -e 's/^!//'`
+   ;;
+   esac
 
-   if test "$reverse" = true; then
-   if test "$match" = true; then
+   case $tag in
+   "//")
+   ;;
+   "+build")
+   if test "$first" = "true"; then
+   first=false
+   elif test "$match" = "false"; then
+   omatch=false
+   fi
+   match=false
+   ;;
+   $goos | $goarch | cgo)
+   match=true
+   ;;
+   *,*)
+   match=true
+   for ctag in `echo $tag | sed -e 's/,/ /g'`; do
+   case $ctag in
+   $goos | $goarch | cgo)
+   ;;
+   *)
match=false
-   else
-   match=true
-   fi
+   ;;
+   esac
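
The rule the script has to implement can be sketched separately from the shell code.  This is a minimal illustration, assuming the classic "// +build" semantics: options on one line are ORed, comma-separated terms within an option are ANDed, a leading "!" negates a term, and separate lines are ANDed.  option_matches and should_build are hypothetical names, and the sketch is not a translation of the gotest script.

```cpp
#include <cassert>  // for the checks mentioned in the note below
#include <set>
#include <sstream>
#include <string>
#include <vector>

// True when one comma-separated option such as "linux,!cgo" is satisfied:
// every term must hold, and a leading '!' negates a term.
static bool option_matches(const std::string& option,
                           const std::set<std::string>& active)
{
  std::istringstream terms(option);
  std::string term;
  while (std::getline(terms, term, ','))
    {
      bool negate = !term.empty() && term[0] == '!';
      if (negate)
        term.erase(0, 1);
      bool present = active.count(term) != 0;
      if (present == negate)
        return false;
    }
  return true;
}

// LINES holds the text after "// +build " for each directive line.
// Options on one line are ORed; separate lines are ANDed.
bool should_build(const std::vector<std::string>& lines,
                  const std::set<std::string>& active)
{
  for (const std::string& line : lines)
    {
      std::istringstream options(line);
      std::string option;
      bool line_ok = false;
      while (options >> option)
        if (option_matches(option, active))
          line_ok = true;
      if (!line_ok)
        return false;
    }
  return true;
}
```

For the two directive lines quoted above ("amd64" and "linux darwin"), should_build is true for the active tag set {linux, amd64} and false for {linux, alpha}, which is the behaviour the alpha report expected.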

Re: libgo patch committed: Update to 1.7rc3

2016-08-03 Thread Ian Lance Taylor
On Thu, Jul 28, 2016 at 4:24 AM, Uros Bizjak  wrote:
>
> A new testsuite failure is introduced:
>
> FAIL: text/template
>
> on both, x86_64-linux-gnu and alpha-linux-gnu.
>
> The testcase corrupts stack with a too deep recursion.
>
> There is a part in libgo/go/text/template/exec.go that should handle
> this situation:
>
> // maxExecDepth specifies the maximum stack depth of templates within
> // templates. This limit is only practically reached by accidentally
> // recursive template invocations. This limit allows us to return
> // an error instead of triggering a stack overflow.
> const maxExecDepth = 10
>
> but the limit is either set too high, or the error handling code is
> inefficient on both, split-stack (x86_64) and non-split-stack (alpha)
> targets. Lowering this value to 1 "fixes" the testcase on both
> targets.

I can not recreate this problem on x86 or x86_64.

Does this patch work around the problem on Alpha?

Ian
diff --git a/libgo/go/text/template/exec_test.go b/libgo/go/text/template/exec_test.go
index 3ef065e..6319706 100644
--- a/libgo/go/text/template/exec_test.go
+++ b/libgo/go/text/template/exec_test.go
@@ -11,6 +11,7 @@ import (
"fmt"
"io/ioutil"
"reflect"
+   "runtime"
"strings"
"testing"
 )
@@ -1299,6 +1300,10 @@ func TestMissingFieldOnNil(t *testing.T) {
 }
 
 func TestMaxExecDepth(t *testing.T) {
+   // Don't try to run this test if stack space is limited.
+   if runtime.Compiler == "gccgo" && runtime.GOARCH != "amd64" && runtime.GOARCH != "386" {
+   t.Skipf("skipping on gccgo GOARCH %s", runtime.GOARCH)
+   }
tmpl := Must(New("tmpl").Parse(`{{template "tmpl" .}}`))
err := tmpl.Execute(ioutil.Discard, nil)
got := ""
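
The mechanism the quoted maxExecDepth comment describes, returning an error once a nesting limit is reached instead of running out of stack, can be sketched as follows.  kMaxDepth and execute are hypothetical, and the limit value is chosen only for illustration; it is not the value used by text/template.

```cpp
#include <cassert>  // for the checks mentioned in the note below
#include <string>

// Hypothetical limit, chosen for illustration only.
const int kMaxDepth = 1000;

// Simulates a template that invokes itself NESTED more times.  Returns an
// empty string on success, or an error message once the depth limit is hit.
std::string execute(int nested, int depth = 0)
{
  if (depth >= kMaxDepth)
    return "exceeded maximum execution depth";
  if (nested == 0)
    return "";
  return execute(nested - 1, depth + 1);
}
```

A request for a million nested invocations stops after kMaxDepth frames and reports the error.  Whether that is early enough depends on how much stack each frame needs on the target, which is the concern raised in the report above.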


Re: [PATCH, rs6000] Switch the rs6000 port over to LRA

2016-08-03 Thread Peter Bergner

On 8/2/16 3:17 PM, Peter Bergner wrote:

Now that Vlad has fixed PR69847, which was the last problem holding the
rs6000 port from switching from reload to LRA, we are ready to flip the
switch.

Is the following ok once bootstrap/regtesting on both LE and BE
(32 & 64 regtesting) comes out clean?


So we have two "regressions":

+FAIL: gcc.target/powerpc/bool3-p7.c scan-assembler-not [ \\t]xxlnor
+FAIL: gcc.target/powerpc/bool3-p8.c scan-assembler-not [ \\t]xxlnor


Looking into these "failures", they show up because when we enable
LRA, we also implicitly enable -mvsx-timode and these failures are
due to -mvsx-timode.  The same test cases fail when we use -mvsx-timode
with reload.

I'll note that these failures are not code correctness bugs, but
performance bugs.  I plan to open a bugzilla to track the fixing
of these failures.

My question is: since these failures are not due to LRA, do we
want to consider the switch to LRA OK to commit, or do we want to
wait until the -mvsx-timode performance bug is fixed?

Peter


