date:20130428

[patch, fortran] PR 57071, some power optimizations

2013-04-28 Thread Thomas Koenig


Hello world,

the attached patch does some optimization on
power using ishft and iand, as discussed in the PR.

I have left out handling real numbers, that should be left
to the middle-end (PR 57073).

Regression-tested.  OK for trunk?

Thomas

2013-04-28  Thomas Koenig  

PR fortran/57071
* frontend-passes (optimize_power):  New function.
(optimize_op):  Use it.

2013-04-28  Thomas Koenig  

PR fortran/57071
* gfortran.dg/power_3.f90:  New test.
* gfortran.dg/power_4.f90:  New test.

Index: frontend-passes.c
===
--- frontend-passes.c	(revision 198340)
+++ frontend-passes.c	(working copy)
@@ -1091,7 +1091,66 @@ combine_array_constructor (gfc_expr *e)
   return true;
 }
 
+/* Change (-1)**k into 1-ishft(iand(k,1),1) and
+ 2**k into ishft(1,k).  */
 
+static bool
+optimize_power (gfc_expr *e)
+{
+  gfc_expr *op1, *op2;
+  gfc_expr *iand, *ishft;
+
+  if (e->ts.type != BT_INTEGER)
+return false;
+
+  op1 = e->value.op.op1;
+
+  if (op1 == NULL || op1->expr_type != EXPR_CONSTANT)
+return false;
+
+  if (mpz_cmp_si (op1->value.integer, -1L) == 0)
+{
+  gfc_free_expr (op1);
+
+  op2 = e->value.op.op2;
+
+  if (op2 == NULL)
+	return false;
+
+  iand = gfc_build_intrinsic_call (current_ns, GFC_ISYM_IAND,
+   "_internal_iand", e->where, 2, op2,
+   gfc_get_int_expr (e->ts.kind,
+			 &e->where, 1));
+   
+  ishft = gfc_build_intrinsic_call (current_ns, GFC_ISYM_ISHFT,
+	"_internal_ishft", e->where, 2, iand,
+	gfc_get_int_expr (e->ts.kind,
+			  &e->where, 1));
+
+  e->value.op.op = INTRINSIC_MINUS;
+  e->value.op.op1 = gfc_get_int_expr (e->ts.kind, &e->where, 1);
+  e->value.op.op2 = ishft;
+  return true;
+}
+  else if (mpz_cmp_si (op1->value.integer, 2L) == 0)
+{
+  gfc_free_expr (op1);
+
+  op2 = e->value.op.op2;
+  if (op2 == NULL)
+	return false;
+
+  ishft = gfc_build_intrinsic_call (current_ns, GFC_ISYM_ISHFT,
+	"_internal_ishft", e->where, 2,
+	gfc_get_int_expr (e->ts.kind,
+			  &e->where, 1),
+	op2);
+  *e = *ishft;
+  return true;
+}
+  return false;
+}
+
 /* Recursive optimization of operators.  */
 
 static bool
@@ -1152,6 +1211,10 @@ optimize_op (gfc_expr *e)
 case INTRINSIC_DIVIDE:
   return combine_array_constructor (e) || changed;
 
+case INTRINSIC_POWER:
+  return optimize_power (e);
+  break;
+
 default:
   break;
 }
! { dg-do run }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 57071 - Check that (-1)**k is transformed into 1-2*iand(k,1).
program main
  implicit none
  integer, parameter :: n = 3
  integer(kind=8), dimension(-n:n) :: a, b
  integer, dimension(-n:n) :: c, d, e
  integer :: m
  integer :: i, v
  integer (kind=2) :: i2

  m = n
  v = -1
  ! Test in scalar expressions
  do i=-n,n
 if (v**i /= (-1)**i) call abort
  end do

  ! Test in array constructors
  a(-m:m) = [ ((-1)**i, i= -m, m) ]
  b(-m:m) = [ (   v**i, i= -m, m) ]
  if (any(a .ne. b)) call abort

  ! Test in array expressions
  c = [ ( i, i = -n , n ) ]
  d = (-1)**c
  e = v**c
  if (any(d .ne. e)) call abort

  ! Test in different kind expressions
  do i2=-n,n
 if (v**i2 /= (-1)**i2) call abort
  end do

end program main
! { dg-final { scan-tree-dump-times "_gfortran_pow_i4_i4" 4 "original" } }
! { dg-final { cleanup-tree-dump "original" } }
! { dg-do run }
! { dg-options "-ffrontend-optimize -fdump-tree-original" }
! PR 57071 - Check that 2**k is transformed into ishft(1,k).
program main
  implicit none
  integer :: i,m,v
  integer, parameter :: n=30
  integer, dimension(-n:n) :: a,b,c,d,e
  m = n

  v = 2
  ! Test scalar expressions.
  do i=-n,n
 if (2**i /= v**i) call abort
  end do

  ! Test array constructors
  b = [(2**i,i=-m,m)]
  c = [(v**i,i=-m,m)]
  if (any(b /= c)) call abort

  ! Test array expressions
  a = [(i,i=-m,m)]
  d = 2**a
  e = v**a
  if (any(d /= e)) call abort
end program main
! { dg-final { scan-tree-dump-times "_gfortran_pow_i4_i4" 3 "original" } }
! { dg-final { cleanup-tree-dump "original" } }
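
For readers who want to check the two identities behind the patch outside the compiler, here is a minimal C++ sketch. The names `ishft`, `minus_one_pow`, `two_pow` and `pow_ref` are hypothetical helpers written for this illustration, not gfortran code; the sketch assumes 32-bit two's-complement integers and emulates the Fortran ISHFT intrinsic by hand.

```cpp
#include <cstdint>

// Hypothetical emulation of Fortran ISHFT for 32-bit integers: a left shift
// for shift >= 0 and a logical right shift for shift < 0.
std::int32_t ishft(std::int32_t i, int shift) {
  if (shift >= 32 || shift <= -32) return 0;
  const std::uint32_t u = static_cast<std::uint32_t>(i);
  const std::uint32_t r = shift >= 0 ? u << shift : u >> -shift;
  return static_cast<std::int32_t>(r);
}

// (-1)**k rewritten as 1 - ishft(iand(k, 1), 1).
// Assumes two's complement, so k & 1 is the parity bit for negative k too.
std::int32_t minus_one_pow(std::int32_t k) { return 1 - ishft(k & 1, 1); }

// 2**k rewritten as ishft(1, k); negative k shifts right and yields 0.
std::int32_t two_pow(std::int32_t k) { return ishft(1, k); }

// Reference: integer power with Fortran-style semantics, where a negative
// exponent means 1 / base**|k| using integer division. Assumes base != 0.
std::int32_t pow_ref(std::int32_t base, std::int32_t k) {
  std::int64_t r = 1;
  const std::int32_t n = k < 0 ? -k : k;
  for (std::int32_t i = 0; i < n; ++i) r *= base;
  return static_cast<std::int32_t>(k >= 0 ? r : 1 / r);
}
```

Comparing `minus_one_pow(k)` and `two_pow(k)` against `pow_ref` for k from -30 to 30 mirrors the range used in power_4.f90.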

Re: [patch, fortran] PR 57071, some power optimizations

2013-04-28 Thread Tobias Burnus


  On 28.04.2013 10:32, Thomas Koenig wrote:

the attached patch does some optimization on
power using ishft and iand, as discussed in the PR.

I have left out handling real numbers, that should be left
to the middle-end (PR 57073).

Regression-tested.  OK for trunk?


OK - thanks for the patch.

I wonder whether one should also handle:
  1**k  == 1
That should only happen (in-real-world code) due to simplifying other 
expressions but it is simple to implement.


(0**k is also possible, but it gets more complicated: 0 for k > 0, 1 for 
k == 0 and invalid for k < 0 [which one might ignore]. As this is really 
special, one can also leave the library call.)
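
The case analysis above can be sketched in C++ as follows. This is only an illustration of the values involved, not a proposed front-end change; `one_pow` and `zero_pow` are hypothetical names.

```cpp
// 1**k is 1 for every integer k.
int one_pow(int /*k*/) { return 1; }

// 0**k: returns true and stores the value in result when it is defined,
// and returns false for k < 0, where the expression is invalid.
bool zero_pow(int k, int& result) {
  if (k < 0) return false;     // would divide by zero
  result = (k == 0) ? 1 : 0;   // 0**0 == 1, 0**k == 0 for k > 0
  return true;
}
```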


Tobias



New Swedish PO file for 'gcc' (version 4.8.0)

2013-04-28 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-4.8.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [patch, fortran] PR 57071, some power optimizations

2013-04-28 Thread Steve Kargl

On Sun, Apr 28, 2013 at 10:32:28AM +0200, Thomas Koenig wrote:
> Hello world,
> 
> the attached patch does some optimization on
> power using ishft and iand, as discussed in the PR.
> 
> I have left out handling real numbers, that should be left
> to the middle-end (PR 57073).
> 
> Regression-tested.  OK for trunk?
> 

% cat foo.f90
function foo(k)
   integer k
   integer foo
   foo = (1)**k
end function foo

% cat foo.f90.003t.original 
foo (integer(kind=4) & restrict k)
{
  integer(kind=4) __result_foo;

  __result_foo = _gfortran_pow_i4_i4 (1, *k);
  return __result_foo;
}

%  cat foo.f90.143t.optimized 

;; Function foo (foo_)

foo (integer(kind=4) & restrict k)
{
  integer(kind=4) __result_foo.0;
  integer(kind=4) D.1479;

:
  D.1479_2 = *k_1(D);
  __result_foo.0_3 = _gfortran_pow_i4_i4 (1, D.1479_2); [tail call]
  return __result_foo.0_3;

}

%  nm foo.o
 U _gfortran_pow_i4_i4
 T foo_

:-)

-- 
Steve

[patch] libstdc++/51365 for shared_ptr

2013-04-28 Thread Jonathan Wakely

This fixes shared_ptr to allow 'final' allocators to be used.

As an added bonus it also reduces the memory footprint of the
shared_ptr control block when constructing a shared_ptr with an empty
deleter or when using make_shared/allocate_shared.

I decided not to use std::tuple here, because it's a pretty heavy
template to instantiate, so added another EBO helper like the one in
hashtable_policy.h -- they should be merged and reused for the other
containers to fix the rest of PR 51365.
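
As a rough, standalone illustration of the helper described above (not the libstdc++ code itself), the sketch below derives from the stored type only when it is empty and not final, and otherwise holds it as a member. All names here (`ebo_helper`, `control_block`, `empty_deleter`, `final_deleter`) are hypothetical, and the size comparison relies on the compiler applying the empty base optimization, which common implementations do for this layout.

```cpp
#include <type_traits>

// Primary template: the third parameter selects whether EBO may be used.
template <int Nm, typename Tp,
          bool UseEbo = !std::is_final<Tp>::value && std::is_empty<Tp>::value>
struct ebo_helper;

// Empty, non-final type: store it as a private base so it takes no space.
template <int Nm, typename Tp>
struct ebo_helper<Nm, Tp, true> : private Tp {
  explicit ebo_helper(const Tp& tp) : Tp(tp) {}
  static Tp& get(ebo_helper& h) { return static_cast<Tp&>(h); }
};

// Final or non-empty type: store it as an ordinary member.
template <int Nm, typename Tp>
struct ebo_helper<Nm, Tp, false> {
  explicit ebo_helper(const Tp& tp) : m_tp(tp) {}
  static Tp& get(ebo_helper& h) { return h.m_tp; }
 private:
  Tp m_tp;
};

struct empty_deleter { void operator()(int* p) const { delete p; } };
struct final_deleter final { void operator()(int* p) const { delete p; } };

// Toy control block that owns an int* and a deleter.
template <typename Deleter>
class control_block : private ebo_helper<0, Deleter> {
  typedef ebo_helper<0, Deleter> del_base;
 public:
  control_block(int* p, const Deleter& d) : del_base(d), ptr_(p) {}
  void dispose() { del_base::get(*this)(ptr_); ptr_ = nullptr; }
 private:
  int* ptr_;
};
```

With this layout a `final` deleter still compiles, because it is never used as a base class, while an empty non-final deleter adds nothing to the size of the block.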

PR libstdc++/51365
* include/bits/shared_ptr_base (_Sp_ebo_helper): Helper class to
implement EBO safely.
(_Sp_counted_base::_M_get_deleter): Add noexcept.
(_Sp_counter_ptr): Use noexcept instead of comments.
(_Sp_counted_deleter): Likewise. Use _Sp_ebo_helper.
(_Sp_counted_ptr_inplace): Likewise.
* testsuite/20_util/shared_ptr/cons/51365.cc: New.
* testsuite/20_util/shared_ptr/cons/52924.cc: Add rebind member to
custom allocator and test construction with custom allocator.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust dg-error
line number.

Tested x86_64-linux, committed to trunk.
commit 4c5977aae386fa8c60519a8c7b9ba3e448f43c22
Author: Jonathan Wakely 
Date:   Sun Apr 28 12:10:00 2013 +0100

PR libstdc++/51365
* include/bits/shared_ptr_base (_Sp_ebo_helper): Helper class to
implement EBO safely.
(_Sp_counted_base::_M_get_deleter): Add noexcept.
(_Sp_counter_ptr): Use noexcept instead of comments.
(_Sp_counted_deleter): Likewise. Use _Sp_ebo_helper.
(_Sp_counted_ptr_inplace): Likewise.
* testsuite/20_util/shared_ptr/cons/51365.cc: New.
* testsuite/20_util/shared_ptr/cons/52924.cc: Add rebind member to
custom allocator and test construction with custom allocator.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust dg-error
line number.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index f463645..a0f513f 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -126,7 +126,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { delete this; }
   
   virtual void*
-  _M_get_deleter(const std::type_info&) = 0;
+  _M_get_deleter(const std::type_info&) noexcept = 0;
 
   void
   _M_add_ref_copy()
@@ -284,7 +284,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
 public:
   explicit
-  _Sp_counted_ptr(_Ptr __p)
+  _Sp_counted_ptr(_Ptr __p) noexcept
   : _M_ptr(__p) { }
 
   virtual void
@@ -296,14 +296,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { delete this; }
 
   virtual void*
-  _M_get_deleter(const std::type_info&)
-  { return 0; }
+  _M_get_deleter(const std::type_info&) noexcept
+  { return nullptr; }
 
   _Sp_counted_ptr(const _Sp_counted_ptr&) = delete;
   _Sp_counted_ptr& operator=(const _Sp_counted_ptr&) = delete;
 
-protected:
-  _Ptr _M_ptr;  // copy constructor must not throw
+private:
+  _Ptr _M_ptr;
 };
 
   template<>
@@ -318,59 +318,91 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 inline void
 _Sp_counted_ptr::_M_dispose() noexcept { }
 
+  template
+struct _Sp_ebo_helper;
+
+  /// Specialization using EBO.
+  template
+struct _Sp_ebo_helper<_Nm, _Tp, true> : private _Tp
+{
+  explicit _Sp_ebo_helper(const _Tp& __tp) : _Tp(__tp) { }
+
+  static _Tp&
+  _S_get(_Sp_ebo_helper& __eboh) { return static_cast<_Tp&>(__eboh); }
+};
+
+  /// Specialization not using EBO.
+  template
+struct _Sp_ebo_helper<_Nm, _Tp, false>
+{
+  explicit _Sp_ebo_helper(const _Tp& __tp) : _M_tp(__tp) { }
+
+  static _Tp&
+  _S_get(_Sp_ebo_helper& __eboh)
+  { return __eboh._M_tp; }
+
+private:
+  _Tp _M_tp;
+};
+
   // Support for custom deleter and/or allocator
   template
 class _Sp_counted_deleter final : public _Sp_counted_base<_Lp>
 {
-  // Helper class that stores the Deleter and also acts as an allocator.
-  // Used to dispose of the owned pointer and the internal refcount
-  // Requires that copies of _Alloc can free each other's memory.
-  struct _My_Deleter
-  : public _Alloc   // copy constructor must not throw
+  class _Impl : _Sp_ebo_helper<0, _Deleter>, _Sp_ebo_helper<1, _Alloc>
   {
-   _Deleter _M_del;// copy constructor must not throw
-   _My_Deleter(_Deleter __d, const _Alloc& __a)
-   : _Alloc(__a), _M_del(__d) { }
+   typedef _Sp_ebo_helper<0, _Deleter> _Del_base;
+   typedef _Sp_ebo_helper<1, _Alloc>   _Alloc_base;
+
+  public:
+   _Impl(_Ptr __p, _Deleter __d, const _Alloc& __a) noexcept
+   : _M_ptr(__p), _Del_base(__d), _Alloc_base(__a)
+   { }
+
+   _Deleter& _M_del() noexcept { return _Del_base::_S_get(*this); }
+   _Alloc& _M_alloc

Re: [PATCH] Preserve loops from CFG build until after RTL loop opts

2013-04-28 Thread Tom de Vries

On 26/04/13 16:27, Tom de Vries wrote:
> On 25/04/13 16:19, Richard Biener wrote:
> 
>> and compared to the previous patch changed the tree-ssa-tailmerge.c
>> part to deal with merging of loop latch and loop preheader (even
>> if that's a really bad idea) to not regress gcc.dg/pr50763.c.
>> Any suggestion on how to improve that part welcome.

> So I think this is really a cornercase, and we should disregard it if that 
> makes
> things simpler.
> 
> Rather than fixing up the loop structure, we could prevent tail-merge in these
> cases.
> 
> The current fix tests for current_loops == NULL, and I'm not sure that can 
> still
> happen there, given that we have PROP_loops.

Richard,

I've found that it happens in these g++ test-cases:
  g++.dg/ext/mv1.C
  g++.dg/ext/mv12.C
  g++.dg/ext/mv2.C
  g++.dg/ext/mv5.C
  g++.dg/torture/covariant-1.C
  g++.dg/torture/pr43068.C
  g++.dg/torture/pr47714.C
This seems rare enough to just bail out of tail-merge in those cases.

> It's not evident to me that the test bb2->loop_father->latch == bb2 is
> sufficient. Before calling tail_merge_optimize, we call 
> loop_optimizer_finalize
> in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in
> theory we might miss some latches.
> 
> But I guess that pre (having started out with simple latches) maintains simple
> latches throughout, and that tail-merge does the same.

I've added a comment related to this in the patch.

Bootstrapped and reg-tested (ada inclusive) on x86_64.

OK for trunk?

Thanks,
- Tom

2013-04-28  Tom de Vries  

* tree-ssa-tail-merge.c (find_same_succ_bb): Skip loop latch bbs.
(replace_block_by): Don't set LOOPS_NEED_FIXUP.
(tail_merge_optimize): Handle current_loops == NULL.

* gcc.dg/pr50763.c: Update test.

diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index f2ab7444..5467ae5 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -689,7 +689,15 @@ find_same_succ_bb (basic_block bb, same_succ *same_p)
   edge_iterator ei;
   edge e;
 
-  if (bb == NULL)
+  if (bb == NULL
+  /* Be conservative with loop structure.  It's not evident that this test
+	 is sufficient.  Before tail-merge, we've just called
+	 loop_optimizer_finalize, and LOOPS_MAY_HAVE_MULTIPLE_LATCHES is now
+	 set, so there's no guarantee that the loop->latch value is still valid.
+	 But we assume that, since we've forced LOOPS_HAVE_SIMPLE_LATCHES at the
+	 start of pre, we've kept that property intact throughout pre, and are
+	 keeping it throughout tail-merge using this test.  */
+  || bb->loop_father->latch == bb)
 return;
   bitmap_set_bit (same->bbs, bb->index);
   FOR_EACH_EDGE (e, ei, bb->succs)
@@ -1460,17 +1468,6 @@ replace_block_by (basic_block bb1, basic_block bb2)
   /* Mark the basic block as deleted.  */
   mark_basic_block_deleted (bb1);
 
-  /* ???  If we merge the loop preheader with the loop latch we are creating
- additional entries into the loop, eventually rotating it.
- Mark loops for fixup in this case.
- ???  This is a completely unwanted transform and will wreck most
- loops at this point - but with just not considering loop latches as
- merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c.
- A better fix to avoid that regression is needed.  */
-  if (current_loops
-  && bb2->loop_father->latch == bb2)
-loops_state_set (LOOPS_NEED_FIXUP);
-
   /* Redirect the incoming edges of bb1 to bb2.  */
   for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
 {
@@ -1612,7 +1609,19 @@ tail_merge_optimize (unsigned int todo)
   int iteration_nr = 0;
   int max_iterations = PARAM_VALUE (PARAM_MAX_TAIL_MERGE_ITERATIONS);
 
-  if (!flag_tree_tail_merge || max_iterations == 0)
+  if (!flag_tree_tail_merge
+  || max_iterations == 0
+  /* We try to be conservative with respect to loop structure, since:
+	 - the cases where tail-merging could both affect loop structure and be
+	   beneficial are rare,
+	 - it prevents us from having to fixup the loops using
+	   loops_state_set (LOOPS_NEED_FIXUP), and
+	 - keeping loop structure may allow us to simplify the pass.
+	 In order to be conservative, we need loop information.	 In rare cases
+	 (about 7 test-cases in the g++ testsuite) there is none (because
+	 loop_optimizer_finalize has been called before tail-merge, and
+	 PROP_loops is not set), so we bail out.  */
+  || current_loops == NULL)
 return 0;
 
   timevar_push (TV_TREE_TAIL_MERGE);
diff --git a/gcc/testsuite/gcc.dg/pr50763.c b/gcc/testsuite/gcc.dg/pr50763.c
index 60025e3..695b61c 100644
--- a/gcc/testsuite/gcc.dg/pr50763.c
+++ b/gcc/testsuite/gcc.dg/pr50763.c
@@ -12,5 +12,5 @@ foo (int c, int d)
   while (c == d);
 }
 
-/* { dg-final { scan-tree-dump-times "== 33" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "== 33" 2 "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */

[patch] tweak some libstdc++ comments

2013-04-28 Thread Jonathan Wakely

* include/bits/hashtable_policy.h (_Hashtable_ebo_helper): Fix
comment.
* include/std/mutex (__recursive_mutex_base): Likewise.

Tested x86_64-linux, committed to trunk.
commit eebe0bf329438168c002754c4a0d5b7e1b59e6f5
Author: Jonathan Wakely 
Date:   Sun Apr 28 12:49:20 2013 +0100

* include/bits/hashtable_policy.h (_Hashtable_ebo_helper): Fix
comment.
* include/std/mutex (__recursive_mutex_base): Likewise.

diff --git a/libstdc++-v3/include/bits/hashtable_policy.h 
b/libstdc++-v3/include/bits/hashtable_policy.h
index 1cf6cb2..1c76af0 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -844,8 +844,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /**
*  Primary class template _Hashtable_ebo_helper.
*
-   *  Helper class using EBO when it is not forbidden, type is not
-   *  final, and when it worth it, type is empty.
+   *  Helper class using EBO when it is not forbidden (the type is not
+   *  final) and when it is worth it (the type is empty.)
*/
   template
diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex
index 67f3418..3c666c1 100644
--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -78,7 +78,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __mutex_base& operator=(const __mutex_base&) = delete;
   };
 
-  // Common base class for std::recursive_mutex and std::timed_recursive_mutex
+  // Common base class for std::recursive_mutex and std::recursive_timed_mutex
   class __recursive_mutex_base
   {
   protected:

Re: [patch, fortran] PR 57071, some power optimizations

2013-04-28 Thread Thomas Koenig


On 28.04.2013 12:10, Tobias Burnus wrote:

OK - thanks for the patch.

I wonder whether one should also handle:
   1**k  == 1


Committed, thanks for the review!

I will do 1**k as a followup patch which should be quite obvious :-)

Regarding PR 57073, the real case - I have never worked with
GIMPLE, so I have no real idea how to start this.

Any volunteers?  Or should we handle this in the Fortran front
end after all?

Thomas

RE: [gomp4] Some progress on #pragma omp simd

2013-04-28 Thread Iyer, Balaji V



> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Aldy Hernandez
> Sent: Saturday, April 27, 2013 1:30 PM
> To: Iyer, Balaji V
> Cc: Jakub Jelinek; Richard Henderson; gcc-patches@gcc.gnu.org
> Subject: Re: [gomp4] Some progress on #pragma omp simd
> 
> Hi Balaji.
> 
> >> "The syntax and semantics of the various simd-openmp-data-clauses are
> >> detailed in the OpenMP specification.
> >> (http://www.openmp.org/mp-documents/spec30.pdf, Section 2.9.3)."
> >>
> >> Balaji, can you verify which is correct?  For that matter, which are
> >> the official specs from which we should be basing this work?
> >
> > Privatization clause makes a variable private for the simd lane. In
> > general,  I would follow the spec. If you have further questions,
> > please feel free to ask.
> 
> Ok, so the Cilk Plus 1.1 spec is incorrectly pointing to the OpenMP 3.0 spec,
> because the OpenMP 3.0 spec has the private clause being task/thread private.
> Since the OpenMP 4.0rc2 explicitly says that the private clause is for the 
> SIMD
> lane (as you've stated), can we assume that when the Cilk Plus 1.1 spec 
> mentions
> OpenMP, it is talking about the OpenMP 4.0 spec?

I don't know of all the references to the OMP manual in the spec, so I will be 
a bit hesitant to make a blanket assumption like that. In this case, I think 
you can assume that it behaves in the same way as 4.0. If you have further 
questions, please feel free to ask. In general, #pragma simd, array notation 
and elemental functions deal with vectorization, not threading. But the Cilk part 
(Cilk keywords and reducers) deals with threading. All these parts can be mixed 
and matched (with restrictions) to take advantage of both threading and 
vectorization.

> 
> One more question Balaji, the Cilk Plus spec says that for #pragma simd, the
> private, firstprivate, lastprivate, and reduction clauses are as OpenMP.
> However, for <#omp simd>, there is no firstprivate in the OpenMP 4.0rc2 spec.
> Is the firstprivate clause valid for Cilk Plus'
> <#pragma simd>?



> 
> Thanks.
> Aldy

Re: [wwwdocs] C++14 support for binary literals says No instead of Yes

2013-04-28 Thread Ed Smith-Rowland


On 04/27/2013 02:59 AM, Jakub Jelinek wrote:

On Sat, Apr 27, 2013 at 01:03:17AM -0400, Ed Smith-Rowland wrote:

In htdocs/projects/cxx1y.html it says no for support of binary
literals.  I think that's a Yes actually.

Here is a little patchlet.

Am I missing something?
So yes... ;-)  I had tested on g++-4.1 and I guess pedantic let that go 
through back then. I swear -pedantic gave nothing...  Oh well.


I like the patch.  It compiled and tested clean on x86_64-linux.

I'm working on a little proposal to add a std::bin manipulator and related 
stuff in analogy to std::hex, etc. to the library.  I think you should 
be able to write and extract binary literals.
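
Until something like the proposed manipulator exists, one way to approximate binary output and input with the current standard library is `std::bitset`. The sketch below is only a stand-in for the idea, with hypothetical helper names `to_binary` and `from_binary` and a fixed width of 8 bits chosen for the example; the literal itself needs `-std=c++1y` or later.

```cpp
#include <bitset>
#include <sstream>
#include <string>

// A C++14 binary literal.
constexpr unsigned sample = 0b110101;

// Formats the low 8 bits of value as a string of '0' and '1' characters.
std::string to_binary(unsigned value) {
  return std::bitset<8>(value).to_string();
}

// Parses up to 8 characters of '0'/'1' using bitset's stream extraction.
unsigned from_binary(const std::string& text) {
  std::istringstream in(text);
  std::bitset<8> bits;
  in >> bits;
  return static_cast<unsigned>(bits.to_ulong());
}
```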



Given
./xg++ -B ./ a.C -std=c++1y -pedantic-errors -S
a.C:1:9: error: binary constants are a GCC extension
  int i = 0b110101;
  ^

I'd say the fact that it is available as a GNU extension isn't sufficient to
mark this as supported.  I think you need something like (untested so far
except for make check-g++ RUNTESTFLAGS=*binary_const*):

2013-04-27  Jakub Jelinek

N3472 binary constants
* include/cpplib.h (struct cpp_options): Fix a typo in user_literals
field comment.  Add binary_constants field.
* init.c (struct lang_flags): Add binary_constants field.
(lang_defaults): Add bin_cst column to the table.
(cpp_set_lang): Initialize CPP_OPTION (pfile, binary_constants).
* expr.c (cpp_classify_number): Talk about C++11 instead of C++0x
in diagnostics.  Accept binary constants if
CPP_OPTION (pfile, binary_constants) even when pedantic.  Adjust
pedwarn message.

* g++.dg/cpp/limits.C: Adjust warning wording.
* g++.dg/system-binary-constants-1.C: Likewise.
* g++.dg/cpp1y/system-binary-constants-1.C: New test.

--- libcpp/include/cpplib.h.jj  2013-04-25 23:47:58.0 +0200
+++ libcpp/include/cpplib.h 2013-04-27 08:31:52.349122712 +0200
@@ -423,7 +423,7 @@ struct cpp_options
/* True for traditional preprocessing.  */
unsigned char traditional;
  
-  /* Nonzero for C++ 2011 Standard user-defnied literals.  */

+  /* Nonzero for C++ 2011 Standard user-defined literals.  */
unsigned char user_literals;
  
/* Nonzero means warn when a string or character literal is followed by a

@@ -434,6 +434,9 @@ struct cpp_options
   literal number suffixes as user-defined literal number suffixes.  */
unsigned char ext_numeric_literals;
  
+  /* Nonzero for C++ 2014 Standard binary constants.  */

+  unsigned char binary_constants;
+
/* Holds the name of the target (execution) character set.  */
const char *narrow_charset;
  
--- libcpp/init.c.jj	2013-04-25 23:47:58.0 +0200

+++ libcpp/init.c   2013-04-27 08:34:54.103120530 +0200
@@ -83,24 +83,25 @@ struct lang_flags
char uliterals;
char rliterals;
char user_literals;
+  char binary_constants;
  };
  
  static const struct lang_flags lang_defaults[] =

-{ /*  c99 c++ xnum xid std  //   digr ulit rlit user_literals */
-  /* GNUC89   */  { 0,  0,  1,   0,  0,   1,   1,   0,   0,0 },
-  /* GNUC99   */  { 1,  0,  1,   0,  0,   1,   1,   1,   1,0 },
-  /* GNUC11   */  { 1,  0,  1,   0,  0,   1,   1,   1,   1,0 },
-  /* STDC89   */  { 0,  0,  0,   0,  1,   0,   0,   0,   0,0 },
-  /* STDC94   */  { 0,  0,  0,   0,  1,   0,   1,   0,   0,0 },
-  /* STDC99   */  { 1,  0,  1,   0,  1,   1,   1,   0,   0,0 },
-  /* STDC11   */  { 1,  0,  1,   0,  1,   1,   1,   1,   0,0 },
-  /* GNUCXX   */  { 0,  1,  1,   0,  0,   1,   1,   0,   0,0 },
-  /* CXX98*/  { 0,  1,  1,   0,  1,   1,   1,   0,   0,0 },
-  /* GNUCXX11 */  { 1,  1,  1,   0,  0,   1,   1,   1,   1,1 },
-  /* CXX11*/  { 1,  1,  1,   0,  1,   1,   1,   1,   1,1 },
-  /* GNUCXX1Y */  { 1,  1,  1,   0,  0,   1,   1,   1,   1,1 },
-  /* CXX1Y*/  { 1,  1,  1,   0,  1,   1,   1,   1,   1,1 },
-  /* ASM  */  { 0,  0,  1,   0,  0,   1,   0,   0,   0,0 }
+{ /*  c99 c++ xnum xid std  //   digr ulit rlit udlit bin_cst */
+  /* GNUC89   */  { 0,  0,  1,   0,  0,   1,   1,   0,   0,   0,0 },
+  /* GNUC99   */  { 1,  0,  1,   0,  0,   1,   1,   1,   1,   0,0 },
+  /* GNUC11   */  { 1,  0,  1,   0,  0,   1,   1,   1,   1,   0,0 },
+  /* STDC89   */  { 0,  0,  0,   0,  1,   0,   0,   0,   0,   0,0 },
+  /* STDC94   */  { 0,  0,  0,   0,  1,   0,   1,   0,   0,   0,0 },
+  /* STDC99   */  { 1,  0,  1,   0,  1,   1,   1,   0,   0,   0,0 },
+  /* STDC11   */  { 1,  0,  1,   0,  1,   1,   1,   1,   0,   0,0 },
+  /* GNUCXX   */  { 0,  1,  1,   0,  0,   1,   1,   0,   0,   0,0 },
+  /* CXX98*/  { 0,  1,  1,   0,  1,   1,   1,   0,   0,   0,0 },
+  /* GNUCXX11 */  { 1,  1,  1,   0,  0,   1,   1,   1,   1,   1,0 },
+  /* CXX11*/  { 1,  1,  1,   0,  1,   1,   1,   1,   1,   1,0 },
+  /* GNUCXX1Y */  { 1,  1,  1,   0,  0,   1,   1,   1,   1,   1,1 },
+  /* CXX1Y*/  { 1,  1,

Re: RFA: enable LRA for rs6000

2013-04-28 Thread Vladimir Makarov

- Original Message -
From: "Michael Meissner" 
To: "Vladimir Makarov" 
Cc: "Michael Meissner" , "David Edelsohn" 
, "gcc-patches" , "Peter Bergner" 
, aavru...@redhat.com
Sent: Friday, April 26, 2013 7:13:55 PM
Subject: Re: RFA: enable LRA for rs6000

On Fri, Apr 26, 2013 at 07:00:37PM -0400, Vladimir Makarov wrote:
> 2013-04-26  Vladimir Makarov  
> 
> * lra.c (setup_operand_alternative): Ignore '?'.
> * lra-constraints.c (process_alt_operands): Print cost dump for
> alternatives.  Check only moves for cycling.
> (curr_insn_transform): Print insn name.

I'm not sure I'm comfortable with ignoring the '?' altogether.  For example, if
you do something in the GPR unit, instructions run at one cycle, while if you
do it in the vector unit, it runs in two cycles.  In the past, I've seen cases
where it wanted to spill floating point values from the floating point
registers to the CTR.  And if you spill to the LR, it can interfere with the
call cache.

Admittedly, when to use '!', '?', and '*' is unclear, and unfortunately it has
changed over time.

---

I don't want to change the '?' semantics either.  So I found another solution.
I've committed it to the branch, although it might not be the final solution --
I'd like to see how it affects x86/x86-64.

Mike, could you send me the config file (if it is possible of course) for
spec2006 you are using in order to be in sync.

Thanks, Vlad.

2013-04-28  Vladimir Makarov  

* lra.c (setup_operand_alternative): Restore taking into account
'?'.
* lra-constraints.c (process_alt_operands): Discourage a bit more using
memory for pseudos.  Remove printing undefined values.  Modify
cost values for conflicts with early clobbers.

Index: lra.c
===
--- lra.c   (revision 198350)
+++ lra.c   (working copy)
@@ -784,6 +784,9 @@ setup_operand_alternative (lra_insn_reco
  lra_assert (i != nop - 1);
  break;

+   case '?':
+ op_alt->reject += LRA_LOSER_COST_FACTOR;
+ break;
case '!':
  op_alt->reject += LRA_MAX_REJECT;
  break;
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 198373)
+++ lra-constraints.c   (working copy)
@@ -2007,7 +2007,7 @@ process_alt_operands (int only_alternati
 although it might takes the same number of
 reloads.  */
  if (no_regs_p && REG_P (op))
-   reject++;
+   reject += 2;

 #ifdef SECONDARY_MEMORY_NEEDED
  /* If reload requires moving value through secondary
@@ -2040,9 +2040,9 @@ process_alt_operands (int only_alternati
  if ((best_losers == 0 || losers != 0) && best_overall < overall)
{
  if (lra_dump_file != NULL)
-   fprintf (lra_dump_file, "  alt=%d,overall=%d,losers=%d,"
-"small_class_ops=%d,rld_nregs=%d -- reject\n",
-nalt, overall, losers, small_class_operands_num, reload_nregs);
+   fprintf (lra_dump_file,
+"  alt=%d,overall=%d,losers=%d -- reject\n",
+nalt, overall, losers);
  goto fail;
}

@@ -2139,7 +2139,10 @@ process_alt_operands (int only_alternati
  curr_alt_dont_inherit_ops[curr_alt_dont_inherit_ops_num++]
= last_conflict_j;
  losers++;
- overall += LRA_LOSER_COST_FACTOR;

+ /* Early clobber was already reflected in REJECT. */
+ lra_assert (reject > 0);
+ reject--;
+ overall += LRA_LOSER_COST_FACTOR - 1;
}
  else
{
@@ -2163,7 +2166,10 @@ process_alt_operands (int only_alternati
}
  curr_alt_win[i] = curr_alt_match_win[i] = false;
  losers++;
- overall += LRA_LOSER_COST_FACTOR;
+ /* Early clobber was already reflected in REJECT. */
+ lra_assert (reject > 0);
+ reject--;
+ overall += LRA_LOSER_COST_FACTOR - 1;
}
}
   small_class_operands_num = 0;

Fix minor regression with size functions

2013-04-28 Thread Eric Botcazou

It's a minor regression present on mainline and 4.8 branch: the size functions 
are output as (no-fn) in the .original dump file.

Bootstrapped/regtested on x86_64-suse-linux, applied on the mainline and 4.8 
branch as obvious (this only affects the Ada compiler).


2013-04-28  Eric Botcazou  

* stor-layout.c (finalize_size_functions): Allocate a structure and
reset cfun before dumping the functions.


-- 
Eric Botcazou
Index: stor-layout.c
===
--- stor-layout.c	(revision 198366)
+++ stor-layout.c	(working copy)
@@ -290,6 +290,8 @@ finalize_size_functions (void)
 
   for (i = 0; size_functions && size_functions->iterate (i, &fndecl); i++)
 {
+  allocate_struct_function (fndecl, false);
+  set_cfun (NULL);
   dump_function (TDI_original, fndecl);
   gimplify_function_tree (fndecl);
   dump_function (TDI_generic, fndecl);

Re: [patch, fortran] PR 57071, some power optimizations

2013-04-28 Thread Tobias Burnus


Thomas Koenig wrote:

Regarding PR 57073, the real case - I have never worked with
GIMPLE, so I have no real idea how to start this.

Any volunteers?  Or should we handle this in the Fortran front
end after all?


I think one should do it in the middle end. Actually, it shouldn't be 
that difficult. The starting point is gcc/builtins.c's fold_builtin_powi 
- all ingredients are there;  (1.0)**k, x**0, x**1 and x**(-1) are 
already handled. Thus, only (-1)**k is missing. For   "= (a)?(b):(c)" 
one uses "fold_build3_loc (loc, COND_EXPR, ..." and to check whether a 
real argument is -1, you can use "real_minus_onep".
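
The value such a fold would produce can be sketched outside GCC as a plain conditional on the parity of k. This is not GIMPLE or `fold_build3_loc` code, only the arithmetic the COND_EXPR would express; `minus_one_powi` is a hypothetical name, and two's-complement integers are assumed so that `k & 1` also works for negative k.

```cpp
#include <cmath>

// (-1.0)**k as a conditional: -1.0 for odd k, 1.0 for even k.
double minus_one_powi(int k) { return (k & 1) ? -1.0 : 1.0; }

// Reference value computed with the library power function.
double minus_one_pow_ref(int k) { return std::pow(-1.0, k); }
```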


How about writing your first middle end patch?

Tobias

[Patch, fortran] Fix sign error in SYSTEM_CLOCK kind=4 Windows version

2013-04-28 Thread Janne Blomqvist

Hi,

committed the patch below as obvious:

Index: ChangeLog
===
--- ChangeLog   (revision 198377)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2013-04-28  Janne Blomqvist  
+
+   * intrinsics/system_clock.c (system_clock_4): Fix sign error in
+   Windows version.
+
 2013-04-15  Tobias Burnus  

* list_read.c (finish_separator): Initialize variable.
@@ -37,7 +42,7 @@
(nml_get_obj_data): Likewise use the correct error mechanism.
* io/transfer.c (hit_eof): Do not set AFTER_ENDFILE if in namelist
mode.
-
+
 2013-03-29  Tobias Burnus  

PR fortran/56737
Index: intrinsics/system_clock.c
===
--- intrinsics/system_clock.c   (revision 198377)
+++ intrinsics/system_clock.c   (working copy)
@@ -134,7 +134,7 @@ system_clock_4(GFC_INTEGER_4 *count, GFC
 QueryPerformanceCounter has potential issues.  */
   uint32_t cnt = GetTickCount ();
   if (cnt > GFC_INTEGER_4_HUGE)
-   cnt -= GFC_INTEGER_4_HUGE - 1;
+   cnt = cnt - GFC_INTEGER_4_HUGE - 1;
   *count = cnt;
 }
   if (count_rate)
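
To see the effect of the one-line change, here is a small standalone sketch (not the libgfortran source; INT4_HUGE stands in for GFC_INTEGER_4_HUGE and the two function names are invented). The old statement subtracts 2**31 - 2, the new one subtracts 2**31, which is what maps the upper half of the uint32_t range back onto 0 .. 2**31 - 1.

```c
#include <stdint.h>

/* Stand-in for GFC_INTEGER_4_HUGE (assumed to be 2**31 - 1).  */
#define INT4_HUGE 2147483647u

/* Behaviour before the fix: "cnt -= HUGE - 1" subtracts 2**31 - 2,
   so large tick counts can stay above INT4_HUGE.  */
uint32_t
wrap_before (uint32_t cnt)
{
  if (cnt > INT4_HUGE)
    cnt -= INT4_HUGE - 1;
  return cnt;
}

/* Behaviour after the fix: "cnt - HUGE - 1" subtracts 2**31, so the
   result always fits into a non-negative 32-bit signed integer.  */
uint32_t
wrap_after (uint32_t cnt)
{
  if (cnt > INT4_HUGE)
    cnt = cnt - INT4_HUGE - 1;
  return cnt;
}
```

For a tick count of 2**32 - 1 the old code yields 2**31 + 1, which no longer fits in a signed 32-bit count, while the new code yields 2**31 - 1.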


--
Janne Blomqvist

Re: [Patch, fortran] PR 56981 Improve unbuffered unformatted performance

2013-04-28 Thread Janne Blomqvist

PING

On Fri, Apr 19, 2013 at 1:30 PM, Janne Blomqvist
 wrote:
> Hi,
>
> the attached patch improves the performance for unformatted and
> unbuffered files. Currently unbuffered unformatted really means that
> we don't buffer anything and use the POSIX I/O syscalls directly. With
> the patch, we use the buffer but flush it at the end of each I/O
> statement.
>
> (For formatted I/O we essentially already do this, as the format
> buffer (fbuf) buffers each record).
>
> For the ever important benchmark of writing small (containing a single
> scalar 4 byte value) unformatted sequential records to /dev/null, the
> patch reduces the number of syscalls by a factor of 6, and performance
> improves by more than double.
>
> For trunk, time for the benchmark in the PR:
>
> real    0m0.727s
> user    0m0.272s
> sys     0m0.452s
>
> With the patch:
>
> real    0m0.313s
> user    0m0.220s
> sys     0m0.092s
>
> For comparison, writing to a file where we use buffered I/O:
>
> real    0m0.202s
> user    0m0.180s
> sys     0m0.020s
>
>
> As a semi-unrelated minor improvement, the patch also changes the
> ordering when writing out unformatted sequential record markers.
> Currently we do
>
> write bogus marker
> write record data
> write tail marker
> seek back to before the bogus marker
> write the correct head marker
> seek to the end of the record, behind the tail marker
>
> With the patch we instead do
>
> write bogus marker
> write record data
> seek back to before the bogus marker
> write the correct head marker
> seek to the end of the record data
> write tail marker
>
> With the patch, the slightly shorter seek distances ever-so-slightly
> increase the chance that the seeks will be contained within the buffer
> so we don't have to flush.
>
> Regtested on x86_64-unknown-linux-gnu, Ok for trunk?
>
> 2013-04-19  Janne Blomqvist  
>
> PR fortran/56981
> * io/transfer.c (next_record_w_unf): First fix head marker, then
> write tail.
> (next_record): Call flush_if_unbuffered.
> * io/unix.c (struct unix_stream): Add field unbuffered.
> (flush_if_unbuffered): New function.
> (fd_to_stream): New argument.
> (open_external): Fix fd_to_stream call.
> (input_stream): Likewise.
> (output_stream): Likewise.
> (error_stream): Likewise.
> * io/unix.h (flush_if_unbuffered): New prototype.
>
>
> --
> Janne Blomqvist



--
Janne Blomqvist
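
For readers unfamiliar with the record layout, the new write ordering described in the quoted message can be sketched with plain stdio. This is an illustration only, not the libgfortran code: write_record is an invented name, the markers are assumed to be 4 bytes, and the length is passed in up front, whereas the library only knows it once the record data has been written (hence the bogus head marker).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: write one unformatted sequential record as
   head marker, payload, tail marker, using the ordering
   bogus head -> data -> seek back -> real head -> seek to end of
   data -> tail.  Returns 0 on success, -1 on error.  */
int
write_record (FILE *f, const void *data, uint32_t len)
{
  uint32_t bogus = 0;
  long head_pos, data_end;

  head_pos = ftell (f);
  if (head_pos < 0)
    return -1;
  if (fwrite (&bogus, sizeof bogus, 1, f) != 1)      /* bogus marker */
    return -1;
  if (len > 0 && fwrite (data, 1, len, f) != len)    /* record data */
    return -1;
  data_end = ftell (f);
  if (data_end < 0)
    return -1;
  if (fseek (f, head_pos, SEEK_SET) != 0)            /* back to head */
    return -1;
  if (fwrite (&len, sizeof len, 1, f) != 1)          /* correct head */
    return -1;
  if (fseek (f, data_end, SEEK_SET) != 0)            /* end of data */
    return -1;
  if (fwrite (&len, sizeof len, 1, f) != 1)          /* tail marker */
    return -1;
  return 0;
}
```

Both seeks stay within the span from the head marker to the end of the record data. The old ordering additionally had to seek across the tail marker.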

Re: [wwwdocs] C++14 support for binary literals says No instead of Yes

2013-04-28 Thread Jason Merrill


The patch looks good to me.

Jason

Re: [C++ Patch/RFC] PR 56450

2013-04-28 Thread Jason Merrill


On 04/27/2013 05:17 PM, Paolo Carlini wrote:

Assuming as obvious that we don't want to crash on it, the interesting
issue is whether we want the static_asserts to both fail or succeed: in
practice, a rather recent ICC I have at hand fails both; a rather recent
clang++ passes both (consistently with the expectation of Bug
submitter). In fact, as I'm reading now 7.1.6.2/4, since we are dealing
with a class member access in any case, it may well be possible that
*ICC* is right.


Yes, I think so.  Since it's a class member access, decltype should be 
the declared type, i.e. const int.


Jason

Re: vtables patch 1/3: allow empty array initializations

2013-04-28 Thread Mike Stump

On Apr 26, 2013, at 4:05 AM, Bernd Schmidt  wrote:
> On 04/24/2013 09:14 PM, DJ Delorie wrote:
>>> 24 bits stored as three bytes, or four? How does this affect vtable
>>> layout? I would have expected the C++ frontend and libsupc++ to
>>> currently be inconsistent with each other given such a setup.
>> 
>> In memory, four, I think.  The address registers really are three
>> bytes though.  They're PSImode and gcc doesn't really have a good way
>> of using any specified PSImode precision.

I have patches to let one specify a precision for partial int types, easy 
enough to do, and the rest of the compiler plays nicely for the most part with 
it...

Re: [PATCH] Allow nested use of attributes in MD-files

2013-04-28 Thread Mike Stump

On Apr 26, 2013, at 8:17 AM, Michael Zolotukhin 
 wrote:
> This patch allows attributes to be used inside other attributes in MD-files.

Sweet, I like it…

[Patch, Fortran] PR57093 - fix too-small malloc size with scalar coarrays of type character

2013-04-28 Thread Tobias Burnus


The problem is a bit nested but the solution is obvious:

The type (TREE_TYPE) of an array is an array type - and one needs to 
drill one level deeper to get the element type. For allocatable scalar 
coarrays, one has an array descriptor to handle the bounds (and, with 
-fcoarray=lib, to store the coarray token) but the type is a scalar. 
gfc_get_element_type by default applies TREE_TYPE twice - which is once 
too much for scalar coarrays. There was a check so that the second 
TREE_TYPE is only applied to arrays. Unfortunately, character strings 
are internally arrays as well. Thus, the second TREE_TYPE didn't give 
the array (with the proper string length) but an element of the string, 
which only had size 1 (for kind=1 characters). The solution is obvious, 
see patch.
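
A toy model of the problem, for illustration only: the struct and all names below are invented and are not GCC's tree API. The inner pointer plays the role of TREE_TYPE and string_flag the role of TYPE_STRING_FLAG.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_code { TOY_SCALAR, TOY_POINTER, TOY_ARRAY };

/* Invented stand-in for a type node.  */
struct toy_type
{
  enum toy_code code;
  bool string_flag;               /* models TYPE_STRING_FLAG */
  size_t size;                    /* size in bytes */
  const struct toy_type *inner;   /* models TREE_TYPE */
};

/* character(kind=1), a character(len=25) string, and the data pointer
   of the descriptor of a scalar coarray of that string type.  */
const struct toy_type toy_char = { TOY_SCALAR, false, 1, NULL };
const struct toy_type toy_str25 = { TOY_ARRAY, true, 25, &toy_char };
const struct toy_type toy_data_ptr
  = { TOY_POINTER, false, sizeof (void *), &toy_str25 };

/* Peel the pointer, then peel one array level.  With check_string set
   (the patched behaviour) a string type is left alone.  */
const struct toy_type *
toy_element_type (const struct toy_type *data_ptr, bool check_string)
{
  const struct toy_type *element = data_ptr->inner;

  if (element->code == TOY_ARRAY
      && !(check_string && element->string_flag))
    element = element->inner;
  return element;
}
```

Without the string check the element size comes out as 1; with it, as 25, which is the malloc size the new test case scans for.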


Committed as Rev. 198379 after bootstrapping and regtesting on 
x86-64-gnu-linux.


Tobias
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 198378)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,10 @@
+2013-04-28  Tobias Burnus  
+
+	PR fortran/57093
+	* trans-types.c (gfc_get_element_type): Fix handling
+	of scalar coarrays of type character.
+	* intrinsic.texi (PACK): Add missing ")".
+
 2013-04-28  Thomas Koenig  
 
 	PR fortran/57071
@@ -6,9 +13,9 @@
 
 2013-04-25  Janne Blomqvist  
 
-PR bootstrap/57028
-* Make-lang.in (f951): Link in ZLIB.
-(CFLAGS-fortran/module.o): Add zlib include directory.
+	PR bootstrap/57028
+	* Make-lang.in (f951): Link in ZLIB.
+	(CFLAGS-fortran/module.o): Add zlib include directory.
 
 2013-04-22  Janus Weil  
 
Index: gcc/fortran/intrinsic.texi
===
--- gcc/fortran/intrinsic.texi	(Revision 198378)
+++ gcc/fortran/intrinsic.texi	(Arbeitskopie)
@@ -9619,7 +9619,7 @@ Fortran 95 and later
 Transformational function
 
 @item @emph{Syntax}:
-@code{RESULT = PACK(ARRAY, MASK[,VECTOR]}
+@code{RESULT = PACK(ARRAY, MASK[,VECTOR])}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
Index: gcc/fortran/trans-types.c
===
--- gcc/fortran/trans-types.c	(Revision 198378)
+++ gcc/fortran/trans-types.c	(Arbeitskopie)
@@ -1179,7 +1179,7 @@ gfc_get_element_type (tree type)
   element = TREE_TYPE (element);
 
   /* For arrays, which are not scalar coarrays.  */
-  if (TREE_CODE (element) == ARRAY_TYPE)
+  if (TREE_CODE (element) == ARRAY_TYPE && !TYPE_STRING_FLAG (element))
 	element = TREE_TYPE (element);
 }
 
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 198378)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,8 @@
+2013-04-28  Tobias Burnus  
+
+	PR fortran/57093
+	* gfortran.dg/coarray_30.f90: New.
+
 2013-04-28  Thomas Koenig  
 
 	PR fortran/57071
@@ -43,7 +48,7 @@
 2013-04-25  Marek Polacek  
 
 	PR tree-optimization/57066
-* gcc.dg/torture/builtin-logb-1.c: Adjust testcase.
+	* gcc.dg/torture/builtin-logb-1.c: Adjust testcase.
 
 2013-04-25  James Greenhalgh  
 	Tejas Belagod  
Index: gcc/testsuite/gfortran.dg/coarray_30.f90
===
--- gcc/testsuite/gfortran.dg/coarray_30.f90	(Revision 0)
+++ gcc/testsuite/gfortran.dg/coarray_30.f90	(Arbeitskopie)
@@ -0,0 +1,15 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single -fdump-tree-original" }
+!
+! PR fortran/57093
+!
+! Contributed by Damian Rouson
+!
+program main
+  character(len=25), allocatable :: greeting[:]
+  allocate(greeting[*])
+  write(greeting,"(a)") "z"
+end
+
+! { dg-final { scan-tree-dump-times "greeting.data = \\(void . restrict\\) __builtin_malloc \\(25\\);" 1 "original" } }
+! { dg-final { cleanup-tree-dump "original" } }

[Patch, libfortran] Simplify SYSTEM_CLOCK implementation

2013-04-28 Thread Janne Blomqvist

Hi,

while looking at system_clock due to the recent Windows patches, it
occurred to me that the Unix versions can be simplified somewhat. The
attached patch does this.

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2013-04-28  Janne Blomqvist  

* intrinsics/system_clock.c (gf_gettime_mono): Use variable
resolution for fractional seconds argument.
(system_clock_4): Simplify, update for gf_gettime_mono change.
(system_clock_8): Likewise.



--
Janne Blomqvist


system_clock_simplify.diff
Description: Binary data

Re: [C++ Patch/RFC] PR 56450

2013-04-28 Thread Paolo Carlini


Hi,

On 04/28/2013 09:10 PM, Jason Merrill wrote:

On 04/27/2013 05:17 PM, Paolo Carlini wrote:

Assuming as obvious that we don't want to crash on it, the interesting
issue is whether we want the static_asserts to both fail or succeed: in
practice, a rather recent ICC I have at hand fails both; a rather recent
clang++ passes both (consistently with the expectation of Bug
submitter). In fact, as I'm reading now 7.1.6.2/4, since we are dealing
with a class member access in any case, it may well be possible that
*ICC* is right.


Yes, I think so.  Since it's a class member access, decltype should be 
the declared type, i.e. const int.

Thanks. Is the below Ok, then? Tested (again) on x86_64-linux.

Thanks,
Paolo.

//
/cp
2013-04-28  Paolo Carlini  

PR c++/56450
* semantics.c (finish_decltype_type): Handle COMPOUND_EXPR.

/testsuite
2013-04-28  Paolo Carlini  

PR c++/56450
* testsuite/g++.dg/cpp0x/decltype52.C: New.
Index: cp/semantics.c
===
--- cp/semantics.c  (revision 198366)
+++ cp/semantics.c  (working copy)
@@ -5398,6 +5398,7 @@ finish_decltype_type (tree expr, bool id_expressio
   break;
 
 case COMPONENT_REF:
+   case COMPOUND_EXPR:
  mark_type_use (expr);
   type = is_bitfield_expr_with_lowered_type (expr);
   if (!type)
Index: testsuite/g++.dg/cpp0x/decltype52.C
===
--- testsuite/g++.dg/cpp0x/decltype52.C (revision 0)
+++ testsuite/g++.dg/cpp0x/decltype52.C (working copy)
@@ -0,0 +1,18 @@
+// PR c++/56450
+// { dg-do compile { target c++11 } }
+
+template<typename T>
+T&& declval();
+
+template<typename, typename>
+struct is_same
+{ static constexpr bool value = false; };
+
+template<typename T>
+struct is_same<T, T>
+{ static constexpr bool value = true; };
+
+struct A { static const int dummy = 0; };
+
+static_assert(is_same<decltype(declval<A>().dummy), const int>::value, "");
+static_assert(!is_same<decltype(declval<A>().dummy), const int&>::value, "");

Re: vtables patch 1/3: allow empty array initializations

2013-04-28 Thread DJ Delorie


> I have patches to let one specify a precision for partial int types,
> easy enough to do, and the rest of the compiler plays nicely for the
> most part with it...

If you can make size_t truly be a 24-bit value, I'd be very happy :-)

[PATCH] Improve vec_widen_?mult_odd_* (take 2)

2013-04-28 Thread Jakub Jelinek

On Sat, Apr 27, 2013 at 11:07:50AM +0200, Uros Bizjak wrote:
> Yes, please add a new predicate, the pattern is much more descriptive
> this way. (Without the predicate, it looks like an expander that
> generates a RTX fragment, used instead of gen_RTX... sequences).

Ok, updated patch below.  Bootstrapped/regtested again on x86_64-linux and
i686-linux.

> OTOH, does vector mode "general_operand" still accept scalar
> immediates? The predicate, proposed above is effectively

general_operand doesn't accept most of CONST_VECTOR constants (because they
aren't targetm.legitimate_constant_p).  It won't accept CONST_INT
or CONST_DOUBLE due to:
  /* Don't accept CONST_INT or anything similar
 if the caller wants something floating.  */
  if (GET_MODE (op) == VOIDmode && mode != VOIDmode
  && GET_MODE_CLASS (mode) != MODE_INT
  && GET_MODE_CLASS (mode) != MODE_PARTIAL_INT)
return 0;

2013-04-28  Jakub Jelinek  

* config/i386/predicates.md (general_vector_operand): New predicate.
* config/i386/i386.c (const_vector_equal_evenodd_p): New function.
(ix86_expand_mul_widen_evenodd): Force op1 resp. op2 into register
if they aren't nonimmediate operands.  If their original values
satisfy const_vector_equal_evenodd_p, don't shift them.
* config/i386/sse.md (mul<mode>3): Use general_vector_operand
predicates.  For the SSE4.1 case force operands[{1,2}] into registers if
not nonimmediate_operand.
(vec_widen_smult_even_v4si): Use nonimmediate_operand predicates
instead of register_operand.
(vec_widen_<s>mult_odd_<mode>): Use general_vector_operand predicates.
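
The idea behind const_vector_equal_evenodd_p can be sketched on plain arrays. This is a scalar model with invented names, not the RTL expander: the odd widening multiply is built from the even one by first moving each odd element into the even slot below it (the lane shift), and that move is unnecessary when every odd element already equals its even neighbour. In the patch the check only applies to CONST_VECTOR operands at expand time; the sketch checks at run time and assumes at most 8 elements.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every odd-indexed element equals the element before it.
   n is assumed to be even.  */
bool
equal_evenodd_p (const uint32_t *v, int n)
{
  for (int i = 0; i < n; i += 2)
    if (v[i] != v[i + 1])
      return false;
  return true;
}

/* Widening multiply of the even elements 0, 2, ... of a and b.  */
void
mul_widen_even (const uint32_t *a, const uint32_t *b, uint64_t *dest, int n)
{
  for (int i = 0; i < n; i += 2)
    dest[i / 2] = (uint64_t) a[i] * b[i];
}

/* Models the 64-bit lane shift: each odd element moves into the even
   slot below it, and the odd slot is cleared.  */
void
shift_odd_to_even (const uint32_t *src, uint32_t *dst, int n)
{
  for (int i = 0; i < n; i += 2)
    {
      dst[i] = src[i + 1];
      dst[i + 1] = 0;
    }
}

/* Widening multiply of the odd elements, skipping the shift for any
   operand whose even and odd elements already agree.  n <= 8.  */
void
mul_widen_odd (const uint32_t *a, const uint32_t *b, uint64_t *dest, int n)
{
  uint32_t ta[8], tb[8];
  const uint32_t *pa = a, *pb = b;

  if (!equal_evenodd_p (a, n))
    {
      shift_odd_to_even (a, ta, n);
      pa = ta;
    }
  if (!equal_evenodd_p (b, n))
    {
      shift_odd_to_even (b, tb, n);
      pb = tb;
    }
  mul_widen_even (pa, pb, dest, n);
}
```

With a = {1, 2, 3, 4} and b = {5, 5, 7, 7}, only a needs the shift, and the odd products 2*5 and 4*7 come out the same either way.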

--- gcc/config/i386/predicates.md.jj   2013-04-28 21:29:40.0 +0200
+++ gcc/config/i386/predicates.md   2013-04-28 21:32:30.308907622 +0200
@@ -1303,3 +1303,8 @@ (define_predicate "avx2_pblendw_operand"
   HOST_WIDE_INT low = val & 0xff;
   return val == ((low << 8) | low);
 })
+
+;; Return true if OP is nonimmediate_operand or CONST_VECTOR.
+(define_predicate "general_vector_operand"
+  (ior (match_operand 0 "nonimmediate_operand")
+   (match_code "const_vector")))
--- gcc/config/i386/i386.c.jj   2013-04-28 21:29:44.248917523 +0200
+++ gcc/config/i386/i386.c  2013-04-28 21:32:07.274908996 +0200
@@ -40827,6 +40827,24 @@ ix86_expand_vecop_qihi (enum rtx_code co
   gen_rtx_fmt_ee (code, qimode, op1, op2));
 }
 
+/* Helper function of ix86_expand_mul_widen_evenodd.  Return true
+   if op is CONST_VECTOR with all odd elements equal to their
+   preceding element.  */
+
+static bool
+const_vector_equal_evenodd_p (rtx op)
+{
+  enum machine_mode mode = GET_MODE (op);
+  int i, nunits = GET_MODE_NUNITS (mode);
+  if (GET_CODE (op) != CONST_VECTOR
+  || nunits != CONST_VECTOR_NUNITS (op))
+return false;
+  for (i = 0; i < nunits; i += 2)
+if (CONST_VECTOR_ELT (op, i) != CONST_VECTOR_ELT (op, i + 1))
+  return false;
+  return true;
+}
+
 void
 ix86_expand_mul_widen_evenodd (rtx dest, rtx op1, rtx op2,
   bool uns_p, bool odd_p)
@@ -40834,6 +40852,12 @@ ix86_expand_mul_widen_evenodd (rtx dest,
   enum machine_mode mode = GET_MODE (op1);
   enum machine_mode wmode = GET_MODE (dest);
   rtx x;
+  rtx orig_op1 = op1, orig_op2 = op2;
+
+  if (!nonimmediate_operand (op1, mode))
+op1 = force_reg (mode, op1);
+  if (!nonimmediate_operand (op2, mode))
+op2 = force_reg (mode, op2);
 
   /* We only play even/odd games with vectors of SImode.  */
   gcc_assert (mode == V4SImode || mode == V8SImode);
@@ -40852,10 +40876,12 @@ ix86_expand_mul_widen_evenodd (rtx dest,
}
 
   x = GEN_INT (GET_MODE_UNIT_BITSIZE (mode));
-  op1 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op1),
- x, NULL, 1, OPTAB_DIRECT);
-  op2 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op2),
- x, NULL, 1, OPTAB_DIRECT);
+  if (!const_vector_equal_evenodd_p (orig_op1))
+   op1 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op1),
+   x, NULL, 1, OPTAB_DIRECT);
+  if (!const_vector_equal_evenodd_p (orig_op2))
+   op2 = expand_binop (wmode, lshr_optab, gen_lowpart (wmode, op2),
+   x, NULL, 1, OPTAB_DIRECT);
   op1 = gen_lowpart (mode, op1);
   op2 = gen_lowpart (mode, op2);
 }
--- gcc/config/i386/sse.md.jj   2013-04-28 21:29:44.244917523 +0200
+++ gcc/config/i386/sse.md  2013-04-28 21:32:07.280908995 +0200
@@ -5631,14 +5631,16 @@ (define_insn "*sse2_pmaddwd"
 (define_expand "mul<mode>3"
   [(set (match_operand:VI4_AVX2 0 "register_operand")
(mult:VI4_AVX2
- (match_operand:VI4_AVX2 1 "nonimmediate_operand")
- (match_operand:VI4_AVX2 2 "nonimmediate_operand")))]
+ (match_operand:VI4_AVX2 1 "general_vector_operand")
+ (match_operand:VI4_AVX2 2 "general_vector_operand")))]
   "TARGET_SSE2"
 {
   if (TARGET_SSE4_1)
 {
-  if (CONSTANT_P (op

Re: vtables patch 1/3: allow empty array initializations

2013-04-28 Thread Bernd Schmidt

On 04/28/2013 11:13 PM, DJ Delorie wrote:
> 
>> I have patches to let one specify a precision for partial int types,
>> easy enough to do, and the rest of the compiler plays nicely for the
>> most part with it...
> 
> If you can make size_t truly be a 24-bit value, I'd be very happy :-)

This confuses me a little. Currently, size_t is 16 bits, correct? How
large is the address space really? Are pointers 24 bit flat objects, or
are the upper 16 bit some kind of segment selector?

Bernd

Re: [C++ Patch/RFC] PR 56450

2013-04-28 Thread Jason Merrill


OK.

Jason

Re: vtables patch 1/3: allow empty array initializations

2013-04-28 Thread DJ Delorie


For m32c chips, the address space is a flat 24-bit address space.
Address registers are 24 bits (i.e. they cannot hold an SImode) but
size_t is 16 bits originally because there aren't enough 24-bit math
ops and 32-bit math is too expensive.  I've tried to use PSImode for
size_t recently (different port) and it just doesn't work, partly
because size_t is defined by a *string* that must match a C type, and
partly because PSImode turns into BLKmode in many cases (not 2**N
sized).

Re: [Patch, fortran] PR 56981 Improve unbuffered unformatted performance

2013-04-28 Thread Jerry DeLisle

On 04/28/2013 11:31 AM, Janne Blomqvist wrote:
> PING
> 
> On Fri, Apr 19, 2013 at 1:30 PM, Janne Blomqvist
>  wrote:
>> Hi,
>>
>> the attached patch improves the performance for unformatted and
>> unbuffered files. Currently unbuffered unformatted really means that
>> we don't buffer anything and use the POSIX I/O syscalls directly. With
>> the patch, we use the buffer but flush it at the end of each I/O
>> statement.
>>
>> (For formatted I/O we essentially already do this, as the format
>> buffer (fbuf) buffers each record).
>>
>> For the ever important benchmark of writing small (containing a single
>> scalar 4 byte value) unformatted sequential records to /dev/null, the
>> patch reduces the number of syscalls by a factor of 6, and performance
>> improves by more than double.
>>
>> For trunk, time for the benchmark in the PR:
>>
>> real    0m0.727s
>> user    0m0.272s
>> sys     0m0.452s
>>
>> With the patch:
>>
>> real    0m0.313s
>> user    0m0.220s
>> sys     0m0.092s
>>
>> For comparison, writing to a file where we use buffered I/O:
>>
>> real    0m0.202s
>> user    0m0.180s
>> sys     0m0.020s
>>
>>
>> As a semi-unrelated minor improvement, the patch also changes the
>> ordering when writing out unformatted sequential record markers.
>> Currently we do
>>
>> write bogus marker
>> write record data
>> write tail marker
>> seek back to before the bogus marker
>> write the correct head marker
>> seek to the end of the record, behind the tail marker
>>
>> With the patch we instead do
>>
>> write bogus marker
>> write record data
>> seek back to before the bogus marker
>> write the correct head marker
>> seek to the end of the record data
>> write tail marker
>>
>> With the patch, the slightly shorter seek distances ever-so-slightly
>> increase the chance that the seeks will be contained within the buffer
>> so we don't have to flush.
>>
>> Regtested on x86_64-unknown-linux-gnu, Ok for trunk?
>>

OK Janne and thanks for the patch.

What are your thoughts about special casing null devices?

Jerry

Re: [Patch, libfortran] Simplify SYSTEM_CLOCK implementation

2013-04-28 Thread Jerry DeLisle

On 04/28/2013 01:16 PM, Janne Blomqvist wrote:
> Hi,
> 
> while looking at system_clock due to the recent Windows patches, it
> occurred to me that the Unix versions can be simplified somewhat. The
> attached patch does this.
> 
> Regtested on x86_64-unknown-linux-gnu, Ok for trunk?
> 

OK for trunk. Thanks!

Jerry

Re: [PATCH] Improve vec_widen_?mult_odd_* (take 2)

2013-04-28 Thread Uros Bizjak

On Sun, Apr 28, 2013 at 11:43 PM, Jakub Jelinek  wrote:
> On Sat, Apr 27, 2013 at 11:07:50AM +0200, Uros Bizjak wrote:
>> Yes, please add a new predicate, the pattern is much more descriptive
>> this way. (Without the predicate, it looks like an expander that
>> generates a RTX fragment, used instead of gen_RTX... sequences).
>
> Ok, updated patch below.  Bootstrapped/regtested again on x86_64-linux and
> i686-linux.
>
>> OTOH, does vector mode "general_operand" still accept scalar
>> immediates? The predicate, proposed above is effectively
>
> general_operand doesn't accept most of CONST_VECTOR constants (because they
> aren't targetm.legitimate_constant_p).  It won't accept CONST_INT
> or CONST_DOUBLE due to:
>   /* Don't accept CONST_INT or anything similar
>  if the caller wants something floating.  */
>   if (GET_MODE (op) == VOIDmode && mode != VOIDmode
>   && GET_MODE_CLASS (mode) != MODE_INT
>   && GET_MODE_CLASS (mode) != MODE_PARTIAL_INT)
> return 0;
>
> 2013-04-28  Jakub Jelinek  
>
> * config/i386/predicates.md (general_vector_operand): New predicate.
> * config/i386/i386.c (const_vector_equal_evenodd_p): New function.
> (ix86_expand_mul_widen_evenodd): Force op1 resp. op2 into register
> if they aren't nonimmediate operands.  If their original values
> satisfy const_vector_equal_evenodd_p, don't shift them.
> * config/i386/sse.md (mul<mode>3): Use general_vector_operand
> predicates.  For the SSE4.1 case force operands[{1,2}] into registers if
> not nonimmediate_operand.
> (vec_widen_smult_even_v4si): Use nonimmediate_operand predicates
> instead of register_operand.
> (vec_widen_<s>mult_odd_<mode>): Use general_vector_operand predicates.

OK for mainline.

Thanks,
Uros.
