date:20110918

Re: [v3] libstdc++/50441

2011-09-18 Thread Marc Glisse


On Sun, 18 Sep 2011, Paolo Carlini wrote:


> tested x86_64-linux, committed to mainline.


Hello,

bugzilla seems to be down, so let me write it here:

the testsuite uses #ifdef __SIZEOF_INT128__ to test for the availability 
of a 128 bit integer type. I haven't seen a similar define for float128, 
you might want to request it. In any case, a way to check for the 
availability of these types should be documented in the manual...
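A minimal sketch of the preprocessor check described above. The only name taken from this thread is __SIZEOF_INT128__; the helper widest_int_bits is hypothetical, and no comparable macro is assumed for __float128.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: width in bits of the widest integer type that the
// predefined-macro check discussed in this thread makes visible.
constexpr std::size_t widest_int_bits()
{
#ifdef __SIZEOF_INT128__
  // The compiler advertises a 128-bit integer type and its size in bytes.
  return __SIZEOF_INT128__ * CHAR_BIT;
#else
  // No 128-bit integer type is advertised; fall back to long long.
  return sizeof(long long) * CHAR_BIT;
#endif
}
```

Code that needs the wider type can be guarded the same way, so it still compiles on targets where the macro is absent.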


--
Marc Glisse

Re: PATCH: Replace tmp with __tmp

2011-09-18 Thread Uros Bizjak

On Sat, Sep 17, 2011 at 11:26 PM, H.J. Lu  wrote:

>>> Agreed. Some parens are missing, though:
>>>
>>> -  unsigned long long tmp = (__X) ^ (__X - 1);
>>> -  return tmp;
>>> +  unsigned long long __tmp = (__X) ^ (__X - 1);
>>> +  return __tmp;
>>
>> There is none missing.  This is not a macro.
>>
>
> Here is the updated patch.  Tested on Linux/x86-64.  OK
> for trunk?

> 2011-09-17  H.J. Lu  
>
>        * config/i386/bmiintrin.h: Remove tmp.
>        * config/i386/tbmintrin.h: Likewise.

OK.

Thanks,
Uros.

Re: [v3] libstdc++/50441

2011-09-18 Thread Paolo Carlini


On 09/18/2011 09:03 AM, Marc Glisse wrote:
> the testsuite uses #ifdef __SIZEOF_INT128__ to test for the
> availability of a 128 bit integer type. I haven't seen a similar
> define for float128,

Thanks. For now I went for a configure test, consistently for int and 
float, which also allows to check whether the type is the same as an 
existing one (otherwise we risk bad errors due to duplicate 
specializations, well possible right now for __float128!).


Paolo.

[patch] Fix PR tree-optimization/50412

2011-09-18 Thread Ira Rosen

Hi,

Strided accesses of a single element, or with gaps, may require the
creation of an epilogue loop. At the moment we don't support peeling
for outer loops, so we should not allow such strided accesses in outer
loops.
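To make "strided access with gaps" concrete, here is a small sketch of such an access pattern. The names kStride and sum_group_heads are hypothetical. The comment on why an epilogue is needed reflects my understanding of the vectorizer's dump message ("Data access with gaps requires scalar epilogue loop"), not something stated in this message.

```cpp
#include <cstddef>

// Hypothetical group size: each group holds kStride elements.
constexpr std::size_t kStride = 4;

// Reads only the first element of every group, so each group ends with a
// gap of kStride - 1 unused elements.  A vector load covering a whole
// group would also touch the gap; for the last group that may lie past
// the end of the array, which is presumably why the final iterations are
// left to a scalar epilogue loop.
double sum_group_heads(const double *a, std::size_t groups)
{
  double sum = 0.0;
  for (std::size_t g = 0; g < groups; ++g)
    sum += a[g * kStride];
  return sum;
}
```

With two groups the last element read is a[4], so a five-element array is enough for the scalar loop, while a full-width vector load starting at a[4] would read beyond it.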

Bootstrapped and tested on powerpc64-suse-linux.
Committed to trunk.

Now testing for 4.6.
OK for 4.6 when the testing completes?

Thanks,
Ira

ChangeLog:

PR tree-optimization/50412
* tree-vect-data-refs.c (vect_analyze_group_access): Fail for
accesses that require epilogue loop if vectorizing outer loop.

testsuite/ChangeLog:

PR tree-optimization/50412
* gfortran.dg/vect/pr50412.f90: New.



Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 178939)
+++ tree-vect-data-refs.c   (working copy)
@@ -2060,7 +2060,11 @@ vect_analyze_group_access (struct data_reference *
   HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
   HOST_WIDE_INT stride, last_accessed_element = 1;
   bool slp_impossible = false;
+  struct loop *loop = NULL;

+  if (loop_vinfo)
+loop = LOOP_VINFO_LOOP (loop_vinfo);
+
   /* For interleaving, STRIDE is STEP counted in elements, i.e., the
size of the
  interleaving group (including gaps).  */
   stride = dr_step / type_size;
@@ -2090,11 +2094,18 @@ vect_analyze_group_access (struct data_reference *

  if (loop_vinfo)
{
- LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) = true;
-
  if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Data access with gaps requires scalar "
"epilogue loop");
+  if (loop->inner)
+{
+  if (vect_print_dump_info (REPORT_DETAILS))
+fprintf (vect_dump, "Peeling for outer loop is not"
+" supported");
+  return false;
+}
+
+  LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) = true;
}

  return true;
@@ -2277,10 +2288,17 @@ vect_analyze_group_access (struct data_reference *
   /* There is a gap in the end of the group.  */
   if (stride - last_accessed_element > 0 && loop_vinfo)
{
- LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) = true;
  if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Data access with gaps requires scalar "
"epilogue loop");
+  if (loop->inner)
+{
+  if (vect_print_dump_info (REPORT_DETAILS))
+fprintf (vect_dump, "Peeling for outer loop is not supported");
+  return false;
+}
+
+  LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) = true;
}
 }
Index: testsuite/gfortran.dg/vect/pr50412.f90
===
--- testsuite/gfortran.dg/vect/pr50412.f90  (revision 0)
+++ testsuite/gfortran.dg/vect/pr50412.f90  (revision 0)
@@ -0,0 +1,12 @@
+! { dg-do compile }
+
+  DOUBLE PRECISION AK,AI,AAE
+  COMMON/com/AK(36),AI(4,4),AAE(8,4),ii,jj
+  DO 20 II=1,4
+DO 21 JJ=1,4
+  AK(n)=AK(n)-AAE(I,II)*AI(II,JJ)
+   21   CONTINUE
+   20 CONTINUE
+  END
+
+! { dg-final { cleanup-tree-dump "vect" } }

Re: [v3] libstdc++/50441

2011-09-18 Thread Marc Glisse


On Sun, 18 Sep 2011, Paolo Carlini wrote:


> On 09/18/2011 09:03 AM, Marc Glisse wrote:
>> the testsuite uses #ifdef __SIZEOF_INT128__ to test for the availability of
>> a 128 bit integer type. I haven't seen a similar define for float128,
> Thanks. For now I went for a configure test, consistently for int and float,
> which also allows to check whether the type is the same as an existing one
> (otherwise we risk bad errors due to duplicate specializations, well possible
> right now for __float128!).


Indeed!
The documentation is not clear on whether __int128 and __float128 may be 
the same types as say long long and long double, or they are different 
types even if they have the same size (the doc was written for C, where it 
doesn't matter as much).


--
Marc Glisse

Re: PATCH: Replace tmp with __tmp

2011-09-18 Thread Paolo Carlini

... probably somebody will hate me, but stylistically I also don't 
understand why the uppercases.


Paolo.

[patch] Fix tree-optimization/50414

2011-09-18 Thread Ira Rosen

Hi,

This patch adds a missing handling of MAX/MIN_EXPR in SLP reduction.
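A sketch of why the reduction's initial value can serve as the "neutral" operand for MAX/MIN, which is what the patch below picks up from the loop preheader edge. The functions max_scalar and max_two_lanes are hypothetical illustrations in plain C++, not vectorizer code: seeding every lane with the same initial value is harmless because max(x, x) = x.

```cpp
#include <algorithm>

// Scalar reference: a single accumulator seeded with the initial value.
unsigned max_scalar(const unsigned *v, int pairs, unsigned init)
{
  unsigned m = init;
  for (int i = 0; i < 2 * pairs; ++i)
    m = std::max(m, v[i]);
  return m;
}

// Two independent "lanes", both seeded with the initial value, then
// combined at the end.  This mimics an unrolled, SLP-style reduction.
unsigned max_two_lanes(const unsigned *v, int pairs, unsigned init)
{
  unsigned lane0 = init;
  unsigned lane1 = init;
  for (int i = 0; i < pairs; ++i)
    {
      lane0 = std::max(lane0, v[2 * i]);
      lane1 = std::max(lane1, v[2 * i + 1]);
    }
  return std::max(lane0, lane1);
}
```

Unlike addition (neutral element 0), MAX and MIN have no constant that works for every input, so reusing the initial value in all lanes is the natural choice.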

Bootstrapped and tested on powerpc64-suse-linux.
Committed to trunk.

Ira

ChangeLog:

PR tree-optimization/50414
* tree-vect-slp.c (vect_get_constant_vectors): Handle MAX_EXPR and
MIN_EXPR.

testsuite/ChangeLog:

PR tree-optimization/50414
* gfortran.dg/vect/Ofast-pr50414.f90: New.
* gfortran.dg/vect/vect.exp: Run Ofast-* tests with -Ofast.
* gcc.dg/vect/no-scevccp-noreassoc-slp-reduc-7.c: New.


Index: tree-vect-slp.c
===
--- tree-vect-slp.c (revision 178939)
+++ tree-vect-slp.c (working copy)
@@ -1902,6 +1902,8 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
   bool constant_p, is_store;
   tree neutral_op = NULL;
   enum tree_code code = gimple_assign_rhs_code (stmt);
+  gimple def_stmt;
+  struct loop *loop;

   if (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def)
 {
@@ -1943,8 +1945,16 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
 neutral_op = build_int_cst (TREE_TYPE (op), -1);
 break;

+  case MAX_EXPR:
+  case MIN_EXPR:
+def_stmt = SSA_NAME_DEF_STMT (op);
+loop = (gimple_bb (stmt))->loop_father;
+neutral_op = PHI_ARG_DEF_FROM_EDGE (def_stmt,
+loop_preheader_edge (loop));
+break;
+
   default:
- neutral_op = NULL;
+neutral_op = NULL;
 }
 }

@@ -1997,8 +2007,8 @@ vect_get_constant_vectors (tree op, slp_tree slp_n

   if (reduc_index != -1)
 {
-  struct loop *loop = (gimple_bb (stmt))->loop_father;
-  gimple def_stmt = SSA_NAME_DEF_STMT (op);
+  loop = (gimple_bb (stmt))->loop_father;
+  def_stmt = SSA_NAME_DEF_STMT (op);

   gcc_assert (loop);
Index: testsuite/gfortran.dg/vect/Ofast-pr50414.f90
===
--- testsuite/gfortran.dg/vect/Ofast-pr50414.f90(revision 0)
+++ testsuite/gfortran.dg/vect/Ofast-pr50414.f90(revision 0)
@@ -0,0 +1,11 @@
+! { dg-do compile }
+
+  SUBROUTINE  SUB  (A,L,YMAX)
+  DIMENSION A(L)
+  YMA=A(1)
+  DO 2 I=1,L,2
+2 YMA=MAX(YMA,A(I),A(I+1))
+  CALL PROUND(YMA)
+  END
+
+! { dg-final { cleanup-tree-dump "vect" } }
Index: testsuite/gfortran.dg/vect/vect.exp
===
--- testsuite/gfortran.dg/vect/vect.exp (revision 178939)
+++ testsuite/gfortran.dg/vect/vect.exp (working copy)
@@ -84,6 +84,12 @@ lappend DEFAULT_VECTCFLAGS "-O3"
 dg-runtest [lsort [glob -nocomplain
$srcdir/$subdir/O3-*.\[fF\]{,90,95,03,08} ]]  \
 "" $DEFAULT_VECTCFLAGS

+# With -Ofast
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS "-Ofast"
+dg-runtest [lsort [glob -nocomplain
$srcdir/$subdir/Ofast-*.\[fF\]{,90,95,03,08} ]]  \
+"" $DEFAULT_VECTCFLAGS
+
 # Clean up.
 set dg-do-what-default ${save-dg-do-what-default}

Index: testsuite/gcc.dg/vect/no-scevccp-noreassoc-slp-reduc-7.c
===
--- testsuite/gcc.dg/vect/no-scevccp-noreassoc-slp-reduc-7.c(revision 0)
+++ testsuite/gcc.dg/vect/no-scevccp-noreassoc-slp-reduc-7.c(revision 0)
@@ -0,0 +1,42 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+#define N 16
+#define MAX 121
+
+unsigned int ub[N] = {0,3,6,9,12,15,18,121,24,27,113,33,36,39,42,45};
+
+/* Vectorization of reduction using loop-aware SLP (with unrolling).  */
+
+__attribute__ ((noinline))
+int main1 (int n)
+{
+  int i;
+  unsigned int max = 50;
+
+  for (i = 0; i < n; i++) {
+max = max < ub[2*i] ? ub[2*i] : max;
+max = max < ub[2*i + 1] ? ub[2*i + 1] : max;
+  }
+
+  /* Check results:  */
+  if (max != MAX)
+abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  main1 (N/2);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
xfail vect_no_int_max } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1
"vect" { xfail vect_no_int_max } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+

[patch] Fix PR testsuite/50435

2011-09-18 Thread Ira Rosen

Hi,

This patch adds an if-statement to avoid loop vectorization and fixes
underscores around restrict in gcc.dg/vect/bb-slp-25.c.

Tested by Dominique on x86_64-apple-darwin10 and on x86_64-suse-linux.

Committed to trunk.

Ira


2011-09-18  Dominique d'Humieres  
  Ira Rosen  

PR testsuite/50435
* gcc.dg/vect/bb-slp-25.c: Add an if to avoid loop vectorization.
Fix underscores around restrict.

Index: testsuite/gcc.dg/vect/bb-slp-25.c
===
--- testsuite/gcc.dg/vect/bb-slp-25.c   (revision 178940)
+++ testsuite/gcc.dg/vect/bb-slp-25.c   (working copy)
@@ -9,7 +9,7 @@

 short src[N], dst[N];

-void foo (short * __restrict dst, short * __restrict src, int h, int stride)
+void foo (short * __restrict__ dst, short * __restrict__ src, int h,
int stride, int dummy)
 {
   int i;
   h /= 16;
@@ -25,6 +25,8 @@ void foo (short * __restrict dst, short
   dst[7] += A*src[7] + src[7+stride];
   dst += 8;
   src += 8;
+  if (dummy == 32)
+abort ();
}
 }

@@ -41,7 +43,7 @@ int main (void)
src[i] = i;
 }

-  foo (dst, src, N, 8);
+  foo (dst, src, N, 8, 0);

   for (i = 0; i < N/2; i++)
 {

Re: [v3] libstdc++/50441

2011-09-18 Thread Paolo Carlini


On 09/18/2011 11:07 AM, Marc Glisse wrote:

> Indeed!
> The documentation is not clear on whether __int128 and __float128 may
> be the same types as say long long and long double, or they are
> different types even if they have the same size (the doc was written
> for C, where it doesn't matter as much).

For sure __float80 is just long double on x86. And by the way, given the 
infrastructure in place, it would be easy to safely add the former too, 
if somebody asks for it (I think it's currently supported only for 
targets where it actually just boils down to long double, though)


Paolo.

Re: [PATCH 6/7] Kill pedantic warnings on system headers macros

2011-09-18 Thread Dodji Seketeli

Jason Merrill  writes:

> On 09/16/2011 04:46 AM, Dodji Seketeli wrote:
>>  struct c_declspecs *
>> -finish_declspecs (struct c_declspecs *specs)
>> +finish_declspecs (struct c_declspecs *specs,
>> + location_t where)
>
> Let's call this first_token_loc, too.  And mention it in the function
> comment.
>
> OK with that change.

Thanks.  For the record, this is the updated patch.

From: Dodji Seketeli 
Date: Sat, 4 Dec 2010 18:35:47 +0100
Subject: [PATCH 6/7] Kill pedantic warnings on system headers macros

This patch leverages the virtual location infrastructure to avoid
emitting pedantic warnings related to macros defined in system headers
but expanded in normal TUs.

The point is to make diagnostic routines use virtual locations of
tokens instead of their spelling locations.  The diagnostic routines
in turn indirectly use linemap_location_in_system_header_p to know if
a given virtual location originated from a system header.

The patch has two main parts.

The libcpp part makes diagnostic routines called from the preprocessor
expression parsing and number conversion code use virtual locations.

The C FE part makes diagnostic routines called from the type
specifiers validation code use virtual locations.

This fixes the relevant examples presented in the comments of the bug
but I guess, as usual, libcpp and the FEs will need on-going care to
use more and more virtual locations of tokens instead of spelling
locations.

The combination of the patch and the previous ones bootstrapped with
--enable-languages=all,ada and passed regression tests on
x86_64-unknown-linux-gnu.

libcpp/

* include/cpplib.h (cpp_classify_number): Add a location parameter
to the declaration.
* expr.c (SYNTAX_ERROR_AT, SYNTAX_ERROR2_AT): New macros to emit
syntax error using a virtual location.
(cpp_classify_number): Add a virtual location parameter.  Use
SYNTAX_ERROR_AT instead of SYNTAX_ERROR, cpp_error_with_line
instead of cpp_error and cpp_warning_with_line instead of
cpp_warning.  Pass the new virtual location parameter to those
diagnostic routines.
(eval_token): Add a virtual location parameter.  Pass it down to
cpp_classify_number.  Use cpp_error_with_line instead of
cpp_error, cpp_warning_with_line instead of cpp_warning, and pass
the new virtual location parameter to these.
(_cpp_parse_expr): Use cpp_get_token_with_location instead of
cpp_get_token, to get the virtual location of the token. Use
SYNTAX_ERROR2_AT instead of SYNTAX_ERROR2, cpp_error_with_line
instead of cpp_error. Use the virtual location instead of the
spelling location.
* macro.c (maybe_adjust_loc_for_trad_cpp): Define new static
function.
(cpp_get_token_with_location): Use it.

gcc/c-family

* c-lex.c (c_lex_with_flags): Adjust to pass the virtual location
to cpp_classify_number.

gcc/

* c-tree.h (finish_declspecs): Add a virtual location parameter.
* c-decl.c (finish_declspecs): Add a virtual location parameter.
Use error_at instead of error and pass down the virtual location
to pedwarn and error_at.
(declspecs_add_type): Use in_system_header_at instead of
in_system_header.
* c-parser.c (c_parser_declaration_or_fndef): Pass virtual
location of the relevant token to finish_declspecs.
(c_parser_struct_declaration, c_parser_parameter_declaration):
Likewise.
(c_parser_type_name): Likewise.

gcc/testsuite/

* gcc.dg/cpp/syshdr3.h: New test header.
* gcc.dg/cpp/syshdr3.c: New test file.
* gcc.dg/nofixed-point-2.c: Adjust to more precise location.
---
 gcc/c-decl.c   |   22 +++--
 gcc/c-family/c-lex.c   |4 +-
 gcc/c-parser.c |   12 ++-
 gcc/c-tree.h   |2 +-
 gcc/testsuite/gcc.dg/cpp/syshdr3.c |   16 +++
 gcc/testsuite/gcc.dg/cpp/syshdr3.h |7 ++
 gcc/testsuite/gcc.dg/nofixed-point-2.c |6 +-
 libcpp/expr.c  |  173 +++-
 libcpp/include/cpplib.h|3 +-
 9 files changed, 153 insertions(+), 92 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cpp/syshdr3.c
 create mode 100644 gcc/testsuite/gcc.dg/cpp/syshdr3.h

diff --git a/gcc/c-decl.c b/gcc/c-decl.c
index 5d4564a..cd1b276 100644
--- a/gcc/c-decl.c
+++ b/gcc/c-decl.c
@@ -8983,7 +8983,7 @@ declspecs_add_type (location_t loc, struct c_declspecs 
*specs,
  break;
case RID_COMPLEX:
  dupe = specs->complex_p;
- if (!flag_isoc99 && !in_system_header)
+ if (!flag_isoc99 && !in_system_header_at (loc))
pedwarn (loc, OPT_pedantic,
 "ISO C90 does not support complex types");
  if (specs->typespec_word == cts_void)
@@ -9508,10 +9508,12

Re: [PATCH 7/7] Reduce memory waste due to non-power-of-2 allocs

2011-09-18 Thread Dodji Seketeli

Jason Merrill  writes:

> On 09/17/2011 07:08 AM, Dodji Seketeli wrote:
>> OK, so the patch below extracts a public ggc_alloced_size_for_request
>> function from the different implementations of the ggc allocator's
>> interface, and lets new_linemap use that.
>
> Maybe "ggc_round_alloc_size"?

OK, updated the patch below accordingly.

>  OK with that change if nobody else has comments this week.

Thanks.

Below is the updated patch.

From: Dodji Seketeli 
Date: Tue, 17 May 2011 16:48:01 +0200
Subject: [PATCH 7/7] Reduce memory waste due to non-power-of-2 allocs

This patch arranges for the allocation size of line_map buffers to be
as close as possible to a power of two.  This *significantly*
decreases peak memory consumption, as (macro) maps are numerous and
stay live throughout the compilation.

The patch adds a new ggc_round_alloc_size interface to the ggc
allocator.  In each of the two main allocator implementations ('page'
and 'zone'), the function has been extracted from the main allocation
function code and returns the actual size of the allocated memory
region, giving the caller a chance to maximize the amount of memory it
actually uses from that region.  In the 'none' allocator
implementation (which uses xmalloc), ggc_round_alloc_size just returns
the requested allocation size.
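A simplified sketch of how a caller might use such a rounding interface. Both functions are hypothetical stand-ins: round_alloc_size assumes pure power-of-two size classes, whereas the real 'page' allocator consults a size lookup table, and usable_count only illustrates the idea of filling the slack with extra elements.

```cpp
#include <cstddef>

// Hypothetical stand-in for the allocator's rounding function: the
// smallest power of two that is >= requested (assumes power-of-two
// size classes only).
std::size_t round_alloc_size(std::size_t requested)
{
  std::size_t size = 1;
  while (size < requested)
    size *= 2;
  return size;
}

// Number of elements of elem_size bytes that fit in the region the
// allocator would actually hand out for `wanted` elements.
std::size_t usable_count(std::size_t wanted, std::size_t elem_size)
{
  return round_alloc_size(wanted * elem_size) / elem_size;
}
```

Under these assumptions, asking for three 24-byte elements (72 bytes) yields a 128-byte region with room for five, so the caller can grow its buffer to five entries at no extra memory cost.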

Tested on x86_64-unknown-linux-gnu against trunk for each allocator.

libcpp/

* include/line-map.h (struct line_maps::alloced_size_for_request):
New member.
* line-map.c (new_linemap): Use set->alloced_size_for_request to
get the actual allocated size of line maps.

gcc/

* ggc.h (ggc_round_alloc_size): Declare new public entry point.
* ggc-none.c (ggc_round_alloc_size): New public stub function.
* ggc-page.c (ggc_alloced_size_order_for_request): New static
function.  Factorized from ggc_internal_alloc_stat.
(ggc_round_alloc_size): New public function.  Uses
ggc_alloced_size_order_for_request.
(ggc_internal_alloc_stat): Use ggc_alloced_size_order_for_request.
* ggc-zone.c (ggc_round_alloc_size): New public function extracted
from ggc_internal_alloc_zone_stat.
(ggc_internal_alloc_zone_stat): Use ggc_round_alloc_size.
* toplev.c (general_init): Initialize
line_table->alloced_size_for_request.
---
 gcc/ggc-none.c|9 +++
 gcc/ggc-page.c|   53 +++-
 gcc/ggc-zone.c|   27 --
 gcc/ggc.h |2 +
 gcc/toplev.c  |1 +
 libcpp/include/line-map.h |8 ++
 libcpp/line-map.c |   39 -
 7 files changed, 114 insertions(+), 25 deletions(-)

diff --git a/gcc/ggc-none.c b/gcc/ggc-none.c
index 97d25b9..e57d617 100644
--- a/gcc/ggc-none.c
+++ b/gcc/ggc-none.c
@@ -39,6 +39,15 @@ ggc_alloc_typed_stat (enum gt_types_enum ARG_UNUSED (gte), 
size_t size
   return xmalloc (size);
 }
 
+/* For a given size of memory requested for allocation, return the
+   actual size that is going to be allocated.  */
+
+size_t
+ggc_round_alloc_size (size_t requested_size)
+{
+  return requested_size;
+}
+
 void *
 ggc_internal_alloc_stat (size_t size MEM_STAT_DECL)
 {
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 624f029..f919a6b 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -1054,6 +1054,47 @@ static unsigned char size_lookup[NUM_SIZE_LOOKUP] =
   9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9
 };
 
+/* For a given size of memory requested for allocation, return the
+   actual size that is going to be allocated, as well as the size
+   order.  */
+
+static void
+ggc_round_alloc_size_1 (size_t requested_size,
+   size_t *size_order,
+   size_t *alloced_size)
+{
+  size_t order, object_size;
+
+  if (requested_size < NUM_SIZE_LOOKUP)
+{
+  order = size_lookup[requested_size];
+  object_size = OBJECT_SIZE (order);
+}
+  else
+{
+  order = 10;
+  while (requested_size > (object_size = OBJECT_SIZE (order)))
+order++;
+}
+
+  if (size_order)
+*size_order = order;
+  if (alloced_size)
+*alloced_size = object_size;
+}
+
+/* For a given size of memory requested for allocation, return the
+   actual size that is going to be allocated.  */
+
+size_t
+ggc_round_alloc_size (size_t requested_size)
+{
+  size_t size = 0;
+  
+  ggc_round_alloc_size_1 (requested_size, NULL, &size);
+  return size;
+}
+
 /* Typed allocation function.  Does nothing special in this collector.  */
 
 void *
@@ -1072,17 +1113,7 @@ ggc_internal_alloc_stat (size_t size MEM_STAT_DECL)
   struct page_entry *entry;
   void *result;
 
-  if (size < NUM_SIZE_LOOKUP)
-{
-  order = size_lookup[size];
-  object_size = OBJECT_SIZE (order);
-}
-  else
-{
-  order = 10;
-  while (size > (object_size = OBJECT_SIZE (order)))
-   order++;
-}
+  ggc

Re: [rs6000] Fix PR target/50091

2011-09-18 Thread David Edelsohn

Tested on PowerPC/Darwin by Iain and on PowerPC/Linux by me.  OK for mainline
and the 4.6/4.5 branches?


2011-09-06  Eric Botcazou  
Iain Sandoe  

PR target/50091
* config/rs6000/rs6000.md (probe_stack): Use explicit operand.
* config/rs6000/rs6000.c (output_probe_stack_range): Likewise.


Okay everywhere.

Thanks, David

Re: [v3] libstdc++/50441

2011-09-18 Thread Joseph S. Myers

On Sun, 18 Sep 2011, Marc Glisse wrote:

> On Sun, 18 Sep 2011, Paolo Carlini wrote:
> 
> > On 09/18/2011 09:03 AM, Marc Glisse wrote:
> > > the testsuite uses #ifdef __SIZEOF_INT128__ to test for the availability
> > > of a 128 bit integer type. I haven't seen a similar define for float128,
> > Thanks. For now I went for a configure test, consistently for int and float,
> > which also allows to check whether the type is the same as an existing one
> > (otherwise we risk bad errors due to duplicate specializations, well
> > possible right now for __float128!).
> 
> Indeed!
> The documentation is not clear on whether __int128 and __float128 may be the
> same types as say long long and long double, or they are different types even
> if they have the same size (the doc was written for C, where it doesn't matter
> as much).

__int128 and unsigned __int128 are currently separate types, just like 
long and long long are always distinct.  I'm not sure you should rely on 
them being distinct on any hypothetical future target where long long is 
128-bit; if we added __int64 I'm not sure having it a distinct type would 
be the most useful implementation.

__int128_t and __uint128_t are legacy typedefs for __int128 and unsigned 
__int128.

__float128 and __float80 are typedefs.  It appears (without testing) that 
IA64 __float80 is always a distinct type but otherwise those names will be 
typedefs for long double if they have the same representation and 
alignment as long double.
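A small sketch of how the type relationships described above could be checked at compile time. It assumes a C++11 compiler for std::is_same and relies on __SIZEOF_INT128__ to detect the 128-bit type; the two function names are hypothetical, and the __float128/__float80 cases are left out because they are target-dependent.

```cpp
#include <type_traits>

// True if the legacy typedef names denote the same types as __int128 and
// unsigned __int128 (vacuously true when no 128-bit type is available).
constexpr bool legacy_int128_typedefs_match()
{
#ifdef __SIZEOF_INT128__
  return std::is_same<__int128_t, __int128>::value
         && std::is_same<__uint128_t, unsigned __int128>::value;
#else
  return true;
#endif
}

// True if the signed and unsigned 128-bit types are distinct types
// (vacuously true when no 128-bit type is available).
constexpr bool int128_signedness_distinct()
{
#ifdef __SIZEOF_INT128__
  return !std::is_same<__int128, unsigned __int128>::value;
#else
  return true;
#endif
}
```

A configure test along these lines can tell whether a specialization for one of these names would duplicate an existing one.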

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [v3] libstdc++/50441

2011-09-18 Thread Paolo Carlini


On 09/18/2011 08:36 PM, Joseph S. Myers wrote:
> __int128_t and __uint128_t are legacy typedefs for __int128 and
> unsigned __int128.

I didn't realize this. Thus I guess, for 50441 and also for 40856 (which 
I'm about to do) better doing everything in terms of __int128 and 
unsigned __int128.


Paolo.

Re: [v3] libstdc++/50441

2011-09-18 Thread Paolo Carlini


On 09/18/2011 08:58 PM, Paolo Carlini wrote:

> On 09/18/2011 08:36 PM, Joseph S. Myers wrote:
>> __int128_t and __uint128_t are legacy typedefs for __int128 and
>> unsigned __int128.
> I didn't realize this. Thus I guess, for 50441 and also for 40856
> (which I'm about to do) better doing everything in terms of __int128
> and unsigned __int128.

I'm currently blocked by the following issue. If I try to compile, with 
-std=gnu++98 (the default for C++) and -pedantic-errors:


  template<typename T>
  struct limits;

  template<>
  struct limits<__int128> { };

  template<>
  struct limits<unsigned __int128> { };

I get:

a.cc:8:26: error: ISO C++ does not support ‘__int128’ for ‘type name’ 
[-pedantic]

a.cc:8:10: error: redefinition of ‘struct limits<__int128>’
a.cc:5:10: error: previous definition of ‘struct limits<__int128>’

This of course does *not* happen with __int128_t and __uint128_t.
Apparently I can suppress such -pedantic and -pedantic-errors issues in
pragma system_header headers, but they then resurface when using PCHs.
Please let me know whether the above is supposed to work in a gnu++98
(or gnu++0x) system header together with any -pedantic options, or
whether I should really use __int128_t and __uint128_t for the time
being; I would certainly be OK with the latter.


Thanks!
Paolo.

Re: PowerPC shrink-wrap support 0 of 3

2011-09-18 Thread Alan Modra

On Sat, Sep 17, 2011 at 03:26:21PM +0200, Bernd Schmidt wrote:
> On 09/17/11 09:16, Alan Modra wrote:
> > This patch series adds shrink-wrap support for PowerPC.  The patches
> > are on top of Bernd's "Initial shrink-wrapping patch":
> > http://gcc.gnu.org/ml/gcc-patches/2011-08/msg02557.html, but with the
> > tm.texi patch applied to tm.texi.in.  Bootstrapped and regression
> > tested powerpc64-linux all langs except ada, and spec CPU2006 tested.
> > The spec results were a little disappointing as I expected to see some
> > gains, but my baseline was a -O3 run and I suppose most of the
> > shrink-wrap opportunities were lost to inlining.
> 
> The last posted version had a bug that crept in during the review cycle,
> and which made it quite ineffective.

I wasn't complaining!  My disappointment really stemmed from having
unrealistically high expectations.  I still think this optimization is
a great feature.  Thanks for contributing it!

-- 
Alan Modra
Australia Development Lab, IBM

Re: [v3] libstdc++/50441

2011-09-18 Thread Paolo Carlini


Hi again,

just a few more details:

> I'm currently blocked by the following issue. If I try to compile,
> with -std=gnu++98 (the default for C++) and -pedantic-errors:
>
>   template<typename T>
>   struct limits;
>
>   template<>
>   struct limits<__int128> { };
>
>   template<>
>   struct limits<unsigned __int128> { };
>
> I get:
>
> a.cc:8:26: error: ISO C++ does not support ‘__int128’ for ‘type name’
> [-pedantic]
>
> a.cc:8:10: error: redefinition of ‘struct limits<__int128>’
> a.cc:5:10: error: previous definition of ‘struct limits<__int128>’

If I remove the second specialization:

  template<typename T>
  struct limits;

template<>
struct limits<__int128> { };

then it compiles just fine, with -pedantic and -pedantic-errors too.
Thus, it looks like something is definitely wrong here. The first and
second lines of the error message above, which wrongly mention
__int128 instead of unsigned __int128, should also be a hint...
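For reference, a self-contained sketch of the two-specialization case under discussion. The is_signed member and the function both_specializations_usable are hypothetical additions, used only to tell the specializations apart; the sketch assumes a compiler that treats __int128 and unsigned __int128 as distinct types here, and it says nothing about behaviour under -pedantic.

```cpp
// Primary template, intentionally left undefined.
template<typename T>
struct limits;

#ifdef __SIZEOF_INT128__
template<>
struct limits<__int128> { static const bool is_signed = true; };

template<>
struct limits<unsigned __int128> { static const bool is_signed = false; };

// Both specializations are selected independently of each other.
bool both_specializations_usable()
{
  return limits<__int128>::is_signed
         && !limits<unsigned __int128>::is_signed;
}
#else
// No 128-bit integer type on this target; nothing to specialize.
bool both_specializations_usable()
{
  return true;
}
#endif
```

If the second specialization were wrongly treated as limits<__int128>, this would fail to compile with the redefinition error quoted above.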


Thanks again,
Paolo.

Re: [PATCH 7/7] Reduce memory waste due to non-power-of-2 allocs

2011-09-18 Thread Laurynas Biveinis

2011/9/17 Dodji Seketeli :
> OK, so the patch below extracts a public ggc_alloced_size_for_request
> function from the different implementations of the ggc allocator's
> interface, and lets new_linemap use that.
> libcpp/
>
>        * include/line-map.h (struct line_maps::alloced_size_for_request):
>        New member.
>        * line-map.c (new_linemap): Use set->alloced_size_for_request to
>        get the actual allocated size of line maps.
>
> gcc/
>
>        * ggc.h (ggc_alloced_size_for_request): Declare new public entry
>        point.
>        * ggc-none.c (ggc_alloced_size_for_request): New public stub
>        function.
>        * ggc-page.c (ggc_alloced_size_order_for_request): New static
>        function.  Factorized from ggc_internal_alloc_stat.
>        (ggc_alloced_size_for_request): New public function.  Uses
>        ggc_alloced_size_order_for_request.
>        (ggc_internal_alloc_stat): Use ggc_alloced_size_order_for_request.
>        * ggc-zone.c (ggc_alloced_size_for_request): New public function
>        extracted from ggc_internal_alloc_zone_stat.
>        (ggc_internal_alloc_zone_stat): Use ggc_alloced_size_for_request.
>        * toplev.c (general_init): Initialize
>        line_table->alloced_size_for_request.

For the record, the patch is fine with me. (I cannot approve it
though, but you already got the approval)

Thanks,
-- 
Laurynas

[ARM] pass "--be8" to linker when linking for M profile

2011-09-18 Thread Bin Cheng

Hi,
Attached is the second version of the patch, with the changes mentioned previously.

Is it ok?

Thanks-chengbin

2011-09-16  Cheng Bin 

* config/arm/bpabi.h (BE8_LINK_SPEC): Add cortex-m arch and
processors.


> -----Original Message-----
> From: Richard Earnshaw
> Sent: Thursday, September 15, 2011 6:46 PM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [ARM] pass "--be8" to linker when linking for M profile
> 
> On 15/09/11 03:41, Bin Cheng wrote:
> > Hi,
> > The linker should do endian swizzling at link-time according to "--be8"
> > option.
> > This patch modifies BE8_LINK_SPEC by adding cortex-m processors in 
> > the specs string.
> >
> > Since R-profile supports configurable big-endian instruction fetch, 
> > I didn't include it here.
> >
> > Is it ok? Thanks.
> >
> > 2011-09-15  Cheng Bin 
> > * config/arm/bpabi.h (BE8_LINK_SPEC): add cortex-m 
> > arch and processors.
> >
> > Thanks-chengbin
> >
> >
> > gcc-be8-for-m-profile.patch
> >
> >
> 
> +#define BE8_LINK_SPEC  \
> +  " %{mbig-endian:%{march=armv7-a|mcpu=cortex-a5 \
> +   |mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15 \
> +   |march=armv7-m|march=armv7e-m|mcpu=cortex-m3|mcpu=cortex-m4 \
> +   |march=armv6-m|mcpu=cortex-m0:%{!r:--be8}}}"
> 
> 
> Please sort this so that the list is ordered alphabetically by 
> architecture/cpu (with architectures first).
> 
> It might save some patch churn in the future if each element was put 
> on a line on its own.
> 
> OK with that change.
> 
> R.

gcc-be8-for-m-profile-20110916.patch
Description: Binary data

[arm-embedded] Backport mainline 171225

2011-09-18 Thread Joey Ye

Committed

Backport r171225 from mainline
2011-03-21  Rainer Orth  

PR bootstrap/48120:
* configure.ac (pwllib): Use LIBS instead of LDFLAGS.
Add -lstdc++ -lm to LIBS.
* configure: Regenerate.

Index: configure
===
--- configure   (revision 171224)
+++ configure   (revision 171225)
@@ -5725,8 +5725,8 @@
 
 if test "x$with_ppl" != xno; then
   if test "x$pwllib" = x; then
-saved_LDFLAGS="$LDFLAGS"
-LDFLAGS="$LDFLAGS $ppllibs"
+saved_LIBS="$LIBS"
+LIBS="$LIBS $ppllibs -lstdc++ -lm"
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for
PWL_handle_timeout in -lpwl" >&5
 $as_echo_n "checking for PWL_handle_timeout in -lpwl... " >&6; }
 if test "${ac_cv_lib_pwl_PWL_handle_timeout+set}" = set; then :
@@ -5767,7 +5767,7 @@
   pwllib="-lpwl"
 fi
 
-LDFLAGS="$saved_LDFLAGS"
+LIBS="$saved_LIBS"
   fi
 
   ppllibs="$ppllibs -lppl_c -lppl $pwllib -lgmpxx"
Index: configure.ac
===
--- configure.ac(revision 171224)
+++ configure.ac(revision 171225)
@@ -1677,10 +1677,10 @@
 
 if test "x$with_ppl" != xno; then
   if test "x$pwllib" = x; then
-saved_LDFLAGS="$LDFLAGS"
-LDFLAGS="$LDFLAGS $ppllibs"
-AC_CHECK_LIB(pwl,PWL_handle_timeout,[pwllib="-lpwl"])
-LDFLAGS="$saved_LDFLAGS"
+saved_LIBS="$LIBS"
+LIBS="$LIBS $ppllibs -lstdc++ -lm"
+AC_CHECK_LIB(pwl, PWL_handle_timeout, [pwllib="-lpwl"])
+LIBS="$saved_LIBS"
   fi
 
   ppllibs="$ppllibs -lppl_c -lppl $pwllib -lgmpxx"

[arm-embedded] Backport mainline 171096 .. 174035

2011-09-18 Thread Joey Ye

Backport from mainline to arm-embedded branch r171096, r171251, r171379,
r171632, r171978, r172297, r174035.

Committed.

2011-09-19  chengbin  

Backport r174035 from mainline
2011-05-22  Tom de Vries  

PR middle-end/48689
* fold-const.c (fold_checksum_tree): Guard TREE_CHAIN use with
CODE_CONTAINS_STRUCT (TS_COMMON).

Backport r172297 from mainline
2011-04-11  Chung-Lin Tang  
Richard Earnshaw  

PR target/48250
* config/arm/arm.c (arm_legitimize_reload_address): Update cases
to use sign-magnitude offsets. Reject unsupported unaligned
cases. Add detailed description in comments.
* config/arm/arm.md (reload_outdf): Disable for ARM mode; change
condition from TARGET_32BIT to TARGET_ARM.

Backport r171978 from mainline
2011-04-05  Tom de Vries  

PR target/43920
* config/arm/arm.h (BRANCH_COST): Set to 1 for Thumb-2 when
optimizing for size.

Backport r171632 from mainline
2011-03-28  Richard Sandiford  

* builtins.c (expand_builtin_memset_args): Use gen_int_mode
instead of GEN_INT.

Backport r171379 from mainline
2011-03-23  Chung-Lin Tang  

PR target/46934
* config/arm/arm.md (casesi): Use the gen_int_mode() function
to subtract lower bound instead of GEN_INT().

Backport r171251 from mainline
2011-03-21  Daniel Jacobowitz  

* config/arm/unwind-arm.c (__gnu_unwind_pr_common): Correct test
for barrier handlers.

Backport r171096 from mainline
2011-03-17  Chung-Lin Tang  

PR target/43872
* config/arm/arm.c (arm_get_frame_offsets): Adjust early
return condition with !cfun->calls_alloca.

RE: [arm-embedded] Simply enable GCC to support -march=armv6s-m as GAS does.

2011-09-18 Thread Terry Guo

Hello,

I patched arm-arches.def and regenerated arm-tables.opt by running
"./genopt.sh ../arm > arm-tables.opt" in the gcc/config/arm directory. The
updated patch is below. Is it OK for trunk?

BR,
Terry

2011-09-19  Terry Guo  

  * config/arm/arm-arches.def (armv6s-m): New.
  * config/arm/arm-tables.opt: Regenerate.


diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index 1086233..3123426 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -49,6 +49,7 @@ ARM_ARCH("armv6z",  arm1176jzs, 6Z,  FL_CO_PROC | FL_FOR_ARCH6Z
 ARM_ARCH("armv6zk", arm1176jzs, 6ZK, FL_CO_PROC | FL_FOR_ARCH6ZK)
 ARM_ARCH("armv6t2", arm1156t2s, 6T2, FL_CO_PROC | FL_FOR_ARCH6T2)
 ARM_ARCH("armv6-m", cortexm1,  6M,   FL_FOR_ARCH6M)
+ARM_ARCH("armv6s-m", cortexm1, 6M,   FL_FOR_ARCH6M)
 ARM_ARCH("armv7",   cortexa8,  7,   FL_CO_PROC | FL_FOR_ARCH7)
 ARM_ARCH("armv7-a", cortexa8,  7A,  FL_CO_PROC | FL_FOR_ARCH7A)
 ARM_ARCH("armv7-r", cortexr4,  7R,  FL_CO_PROC | FL_FOR_ARCH7R)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index d86e376..23339c7 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -323,28 +323,31 @@ EnumValue
 Enum(arm_arch) String(armv6-m) Value(16)
 
 EnumValue
-Enum(arm_arch) String(armv7) Value(17)
+Enum(arm_arch) String(armv6s-m) Value(17)
 
 EnumValue
-Enum(arm_arch) String(armv7-a) Value(18)
+Enum(arm_arch) String(armv7) Value(18)
 
 EnumValue
-Enum(arm_arch) String(armv7-r) Value(19)
+Enum(arm_arch) String(armv7-a) Value(19)
 
 EnumValue
-Enum(arm_arch) String(armv7-m) Value(20)
+Enum(arm_arch) String(armv7-r) Value(20)
 
 EnumValue
-Enum(arm_arch) String(armv7e-m) Value(21)
+Enum(arm_arch) String(armv7-m) Value(21)
 
 EnumValue
-Enum(arm_arch) String(ep9312) Value(22)
+Enum(arm_arch) String(armv7e-m) Value(22)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt) Value(23)
+Enum(arm_arch) String(ep9312) Value(23)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt2) Value(24)
+Enum(arm_arch) String(iwmmxt) Value(24)
+
+EnumValue
+Enum(arm_arch) String(iwmmxt2) Value(25)
 
 Enum
 Name(arm_fpu) Type(int)
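
The renumbering in arm-tables.opt follows from the values being assigned in table order: adding armv6s-m after armv6-m pushes every later entry up by one. A minimal sketch of that effect, using a C X-macro table in place of the real arm-arches.def and genopt.sh machinery (all DEMO_* and demo_* names are hypothetical):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical three-entry table standing in for arm-arches.def.  */
#define DEMO_ARCHES(X)        \
  X (armv6_m,  "armv6-m")     \
  X (armv6s_m, "armv6s-m")    \
  X (armv7,    "armv7")

/* One enumerator per table row; values follow table order.  */
#define AS_ENUM(id, name) DEMO_ARCH_##id,
enum demo_arch { DEMO_ARCHES (AS_ENUM) DEMO_ARCH_COUNT };

/* Parallel name table, indexed by the enum.  */
#define AS_NAME(id, name) name,
static const char *const demo_arch_names[] = { DEMO_ARCHES (AS_NAME) };

/* Return the index assigned to NAME, or -1 if it is not in the table.  */
static int
demo_arch_lookup (const char *name)
{
  assert (name != NULL);
  for (int i = 0; i < DEMO_ARCH_COUNT; i++)
    if (strcmp (demo_arch_names[i], name) == 0)
      return i;
  return -1;
}
```

With the middle row present, "armv7" maps to 2; deleting that row would move it back to 1, which is the same shift the Value(N) lines in the diff show.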

Re: [RFC] Split -mrecip

2011-09-18 Thread Uros Bizjak

On Sat, Sep 3, 2011 at 11:11 PM, Uros Bizjak  wrote:

>>> > I've decided to not use four new bits from target_flags, and instead
>>> > created a new mask (recip_mask).  Four bits would have fit in target
>>> > bits right now,  but in the future we might want to add more
>>> > specialization, like modes for which the reciprocals are active.
>>> >
>>> > What do you think?
>>>
>>> These new flags look like a nice addition, but I wonder why we need
>>> separate options to handle vector recip. A vector rsqrt or rdiv is
>>> generated automatically in the same way as scalar rsqrt or rdiv is
>>> generated, so IMO, -mrecip-sqrt and -mrecip-div should be enough.
>>
>> No, the difference does matter.  Using reciprocal estimates for scalar
>> divs often results in errors in benchmarks because those sometimes are
>> used to feed integer conversions for either index calculations or
>> printouts.  The small rounding errors with the reciprocals lead to
>> incorrect outputs then.  Contexts where the div can be vectorized often
>> don't have this problem (they're then used purely for calculations over
>> arrays of float data).  For instance spec2006 and polyhedron break with
>> -mrecip purely because of the scalar reciprocals, but work with only
>> vectorized ones.  I.e. users really want to distinguish between the two.
>
> I agree with your analysis.
>
>> Also, when this patch goes in I plan to submit another one that activates
>> vectorized rcp/rsqrt under -ffast-math already (that's what ICC happens to
>> do too).
>
> Great! In the past, we tried to use -mrecip with -ffast-math. IIRC,
> polyhedron broke on scalar rdiv and spec2006 broke on rsqrt. Taking
> into account your analysis above, using separate options and
> activating vectorized ones for -ffast-math makes much sense.
>
>>> For the future - could rs6000 and x86 use the same compile options to
>>> handle reciprocals?
>>
>> I'd guess so.  rs6000 uses a hand-written comma-splitter, which we could
>> reuse.
>
> Perhaps rs6000 could adopt our approach in addition to its
> comma-splitter? OTOH, whatever is more convenient, I don't care that
> much. I have CC'd rs6000 maintainer for his opinion.

Looking at this topic again, I'd propose that x86 adopt the approach from
rs6000. The rs6000 approach is more extensible and offers the same
flexibility, thanks to "!".

So, x86 could have "-mrecip=", taking combinations of all, default, none,
div, vec-div, divf, vec-divf, rsqrt, etc., perhaps some day
also using divd & co.

rs6000 probably needs to extend its options with a vec- prefix to
conditionally enable vector reciprocals, for the same reason x86 has
to.

Uros.
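
To make the proposal concrete, here is a rough sketch of how a comma-separated "-mrecip=" argument with "!" negation could be folded into a mask. It is an illustration only, not the rs6000 or x86 code: the RECIP_* bits, the recip_names table and parse_recip are all hypothetical, "vec-rsqrt" is an assumed name, and "default" is left out because the mail does not say what it would enable.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mask bits, one per kind of reciprocal estimate.  */
enum
{
  RECIP_DIV       = 1 << 0,
  RECIP_VEC_DIV   = 1 << 1,
  RECIP_RSQRT     = 1 << 2,
  RECIP_VEC_RSQRT = 1 << 3,
  RECIP_ALL       = (1 << 4) - 1
};

/* "none" is modelled as "all" with the sense inverted.  */
static const struct
{
  const char *name;
  unsigned mask;
  int invert;
} recip_names[] = {
  { "all",       RECIP_ALL,       0 },
  { "none",      RECIP_ALL,       1 },
  { "div",       RECIP_DIV,       0 },
  { "vec-div",   RECIP_VEC_DIV,   0 },
  { "rsqrt",     RECIP_RSQRT,     0 },
  { "vec-rsqrt", RECIP_VEC_RSQRT, 0 },
};

/* Parse a list such as "all,!div" into *MASK, left to right.
   Returns 0 on success, -1 on an unknown or empty name.  */
static int
parse_recip (const char *arg, unsigned *mask)
{
  assert (arg != NULL && mask != NULL);
  const size_t n = sizeof recip_names / sizeof recip_names[0];
  unsigned result = 0;

  while (*arg)
    {
      size_t len = strcspn (arg, ",");   /* length of this token */
      const char *tok = arg;
      size_t toklen = len;
      int invert = 0;
      size_t i;

      if (toklen > 0 && tok[0] == '!')   /* leading "!" disables */
        {
          invert = 1;
          tok++;
          toklen--;
        }

      for (i = 0; i < n; i++)
        if (strlen (recip_names[i].name) == toklen
            && strncmp (recip_names[i].name, tok, toklen) == 0)
          break;
      if (i == n)
        return -1;

      if (invert ^ recip_names[i].invert)
        result &= ~recip_names[i].mask;
      else
        result |= recip_names[i].mask;

      arg += len;
      if (*arg == ',')
        arg++;
    }

  *mask = result;
  return 0;
}
```

Under these assumptions, "all,!div" enables everything except scalar division, and "vec-div,vec-rsqrt" enables only the vectorized estimates, which is the split argued for earlier in the thread.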
