date:20120715

[Fortran-dev][Patch] Fix cshift1

2012-07-15 Thread Tobias Burnus

This patch fixes the stride setting for cshift1; hence, it fixes 
gfortran.dg/optional_dim_3.f90.
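
As a rough illustration of the stride-multiplier computation described above, here is a minimal sketch in C. It assumes the stride multiplier is a byte count that starts at the element size and grows by the previous dimension's extent; `dim_sketch` and `set_contiguous_dims` are hypothetical names, not the real libgfortran descriptor or macros.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one dimension of an array
   descriptor; this is not the real libgfortran layout.  */
struct dim_sketch
{
  ptrdiff_t lower_bound;
  ptrdiff_t extent;
  ptrdiff_t sm;			/* stride multiplier, assumed to be in bytes */
};

/* Fill in DIM[0..rank-1] for a contiguous array whose extents are given
   by EXTENTS and whose elements are ELEM_SIZE bytes each.  */
static void
set_contiguous_dims (struct dim_sketch *dim, const ptrdiff_t *extents,
		     int rank, size_t elem_size)
{
  ptrdiff_t sm = (ptrdiff_t) elem_size;
  ptrdiff_t ext = 1;

  for (int i = 0; i < rank; i++)
    {
      sm *= ext;		/* bytes covered by one step in dimension i */
      ext = extents[i];
      dim[i].lower_bound = 0;
      dim[i].extent = ext;
      dim[i].sm = sm;
    }
}
```

For a 3x4x5 array of 4-byte elements, this sketch yields stride multipliers of 4, 12 and 48 bytes.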


Built and regtested on x86-64-linux - 13 failing tests remain.
OK?

Tobias
2012-07-15  Tobias Burnus  

	* m4/cshift1.m4 (cshift1): Correctly set stride multiplier.
	* generated/cshift1_16.c: Regenerate.
	* generated/cshift1_4.c: Regenerate.
	* generated/cshift1_8.c: Regenerate.

Index: libgfortran/m4/cshift1.m4
===
--- libgfortran/m4/cshift1.m4	(revision 189480)
+++ libgfortran/m4/cshift1.m4	(working copy)
@@ -80,22 +80,18 @@ cshift1 (gfc_array_char * const restrict ret,
   if (ret->base_addr == NULL)
 {
   int i;
+  index_type sm, ext;
 
   ret->base_addr = xmalloc (size * arraysize);
   ret->offset = 0;
   ret->dtype = array->dtype;
+  sm = sizeof ('atype_name`);
+  ext = 1;
   for (i = 0; i < GFC_DESCRIPTOR_RANK (array); i++)
 {
-	  index_type ext, sm;
-
+  sm *= ext;
   ext = GFC_DESCRIPTOR_EXTENT (array, i);
 
-  if (i == 0)
-sm = 1;
-  else
-	sm = GFC_DESCRIPTOR_EXTENT (ret, i-1)
-		 * GFC_DESCRIPTOR_SM (ret, i-1);
-
 	  GFC_DIMENSION_SET (ret->dim[i], 0, ext, sm);
 }
 }
Index: libgfortran/generated/cshift1_16.c
===
--- libgfortran/generated/cshift1_16.c	(revision 189480)
+++ libgfortran/generated/cshift1_16.c	(working copy)
@@ -79,22 +79,18 @@ cshift1 (gfc_array_char * const restrict ret,
   if (ret->base_addr == NULL)
 {
   int i;
+  index_type sm, ext;
 
   ret->base_addr = xmalloc (size * arraysize);
   ret->offset = 0;
   ret->dtype = array->dtype;
+  sm = sizeof (GFC_INTEGER_16);
+  ext = 1;
   for (i = 0; i < GFC_DESCRIPTOR_RANK (array); i++)
 {
-	  index_type ext, sm;
-
+	  sm *= ext;
   ext = GFC_DESCRIPTOR_EXTENT (array, i);
 
-  if (i == 0)
-sm = 1;
-  else
-	sm = GFC_DESCRIPTOR_EXTENT (ret, i-1)
-		 * GFC_DESCRIPTOR_SM (ret, i-1);
-
 	  GFC_DIMENSION_SET (ret->dim[i], 0, ext, sm);
 }
 }
Index: libgfortran/generated/cshift1_4.c
===
--- libgfortran/generated/cshift1_4.c	(revision 189480)
+++ libgfortran/generated/cshift1_4.c	(working copy)
@@ -79,22 +79,18 @@ cshift1 (gfc_array_char * const restrict ret,
   if (ret->base_addr == NULL)
 {
   int i;
+  index_type sm, ext;
 
   ret->base_addr = xmalloc (size * arraysize);
   ret->offset = 0;
   ret->dtype = array->dtype;
+  sm = sizeof (GFC_INTEGER_4);
+  ext = 1;
   for (i = 0; i < GFC_DESCRIPTOR_RANK (array); i++)
 {
-	  index_type ext, sm;
-
+	  sm *= ext;
   ext = GFC_DESCRIPTOR_EXTENT (array, i);
 
-  if (i == 0)
-sm = 1;
-  else
-	sm = GFC_DESCRIPTOR_EXTENT (ret, i-1)
-		 * GFC_DESCRIPTOR_SM (ret, i-1);
-
 	  GFC_DIMENSION_SET (ret->dim[i], 0, ext, sm);
 }
 }
Index: libgfortran/generated/cshift1_8.c
===
--- libgfortran/generated/cshift1_8.c	(revision 189480)
+++ libgfortran/generated/cshift1_8.c	(working copy)
@@ -79,22 +79,18 @@ cshift1 (gfc_array_char * const restrict ret,
   if (ret->base_addr == NULL)
 {
   int i;
+  index_type sm, ext;
 
   ret->base_addr = xmalloc (size * arraysize);
   ret->offset = 0;
   ret->dtype = array->dtype;
+  sm = sizeof (GFC_INTEGER_8);
+  ext = 1;
   for (i = 0; i < GFC_DESCRIPTOR_RANK (array); i++)
 {
-	  index_type ext, sm;
-
+	  sm *= ext;
   ext = GFC_DESCRIPTOR_EXTENT (array, i);
 
-  if (i == 0)
-sm = 1;
-  else
-	sm = GFC_DESCRIPTOR_EXTENT (ret, i-1)
-		 * GFC_DESCRIPTOR_SM (ret, i-1);
-
 	  GFC_DIMENSION_SET (ret->dim[i], 0, ext, sm);
 }
 }

Re: [PATCH] Improve andq $0xffffffff, %reg handling (PR target/53110)

2012-07-15 Thread Uros Bizjak

On Sun, Jul 15, 2012 at 1:56 AM, H.J. Lu  wrote:
> On Wed, Apr 25, 2012 at 12:14 PM, Jakub Jelinek  wrote:
>
>> We have a splitter for reg1 = reg2 & 0xffffffff, but only if regnums
>> are different.  But movl %edi, %edi is a cheaper variant of
>> andq $0xffffffff, %rdi even with the same register and doesn't clobber
>> flags, so this patch attempts to expand it as a zero extension early.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>
>> 2012-04-25  Jakub Jelinek  
>>
>> PR target/53110
>> * config/i386/i386.md (and3): For andq $0xffffffff, reg
>> instead expand it as zero extension.
>>
>> --- gcc/config/i386/i386.md.jj  2012-04-25 12:14:54.0 +0200
>> +++ gcc/config/i386/i386.md 2012-04-25 14:50:48.708925963 +0200
>> @@ -7694,7 +7694,17 @@ (define_expand "and3"
>> (and:SWIM (match_operand:SWIM 1 "nonimmediate_operand")
>>   (match_operand:SWIM 2 "")))]
>>""
>> -  "ix86_expand_binary_operator (AND, mode, operands); DONE;")
>> +{
>> +  if (mode == DImode
>> +  && GET_CODE (operands[2]) == CONST_INT
>> +  && INTVAL (operands[2]) == (HOST_WIDE_INT) 0xffffffff
>> +  && REG_P (operands[1]))
>> +emit_insn (gen_zero_extendsidi2 (operands[0],
>> +gen_lowpart (SImode, operands[1])));
>> +  else
>> +ix86_expand_binary_operator (AND, mode, operands);
>> +  DONE;
>> +})
>>
>>  (define_insn "*anddi_1"
>>[(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r")
>>
>> Jakub
>
> Can it be backported to 4.7 branch?  It also fixed:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53961
>
> on hjl/x32/gcc-4_7-branch.

I have backported the patch to the 4.7 branch. OTOH, those LEA patterns
are really in need of some cleanup in the future.
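
For readers following the thread, the equivalence the backported patch relies on can be sketched in C: masking a 64-bit value with 0xffffffff gives the same result as zero-extending its low 32 bits. The function names below are illustrative only and do not appear in GCC.

```c
#include <stdint.h>

/* Mask form: the value that "andq $0xffffffff, %reg" computes.  */
static uint64_t
mask_low32 (uint64_t x)
{
  return x & UINT64_C (0xffffffff);
}

/* Zero-extension form: truncate to 32 bits, then widen again, which is
   what a 32-bit register move does implicitly on x86-64.  */
static uint64_t
zero_extend_low32 (uint64_t x)
{
  return (uint64_t) (uint32_t) x;
}
```

Both functions return the same value for every input, which is why the expander can pick the zero-extension form.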

Uros.

[Fortran-dev][committed] Fix associate

2012-07-15 Thread Tobias Burnus

The following patch compares the stride multiplier rather than the 
stride; that is not only faster, as it avoids a useless division, but it 
also fixes gfortran.dg/associated_2.f90 by not dividing by zero.
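
A minimal C sketch of the idea, assuming (this is not stated above) that the stride is derived by dividing the byte stride multiplier by the element length; `stride_from_sm` and `same_spacing` are hypothetical helpers, not libgfortran functions.

```c
#include <stddef.h>

/* Hypothetical: element stride derived from the byte stride multiplier.
   The caller must ensure ELEM_LEN is nonzero; a zero-length element
   would make this a division by zero.  */
static ptrdiff_t
stride_from_sm (ptrdiff_t sm, ptrdiff_t elem_len)
{
  return sm / elem_len;
}

/* Comparing the stride multipliers directly needs no division, so it is
   also safe when the element length is zero.  */
static int
same_spacing (ptrdiff_t sm_a, ptrdiff_t sm_b)
{
  return sm_a == sm_b;
}
```

Under that assumption, two descriptors with the same element length have equal strides exactly when their stride multipliers are equal, so the direct comparison loses nothing.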


Built and regtested on x86-64-gnu-linux, and committed as Rev. 189492.

Remaining 12 (branch-only) regressions (without pending patches):

gfortran.dg/auto_char_dummy_array_1.f90
gfortran.dg/auto_char_len_3.f90
gfortran.dg/class_array_1.f03
gfortran.dg/class_array_2.f03
gfortran.dg/class_array_3.f03
gfortran.dg/class_to_type_1.f03
gfortran.dg/proc_decl_23.f90
gfortran.dg/select_type_26.f03
gfortran.dg/select_type_27.f03
gfortran.dg/read_eof_all.f90
gfortran.dg/transfer_intrinsic_3.f90
gfortran.dg/subref_array_pointer_2.f90

I think when we are down to zero (branch-only) regressions, we should 
try a bunch of real-world programs – there are probably more issues.


Tobias
2012-07-15  Tobias Burnus  

	* trans-intrinsic.c (gfc_conv_associated): Compare sm
	instead of stride.

2012-07-15  Tobias Burnus  

	* intrinsics/associated.c (associated): Compare sm
	instead of stride.

Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c	(revision 189481)
+++ gcc/fortran/trans-intrinsic.c	(working copy)
@@ -5849,19 +5849,19 @@ gfc_conv_associated (gfc_se *se, gfc_exp
   se->expr = fold_build2_loc (input_location, TRUTH_AND_EXPR,
   boolean_type_node, tmp, tmp2);
 }
   else
 {
 	  /* An array pointer of zero length is not associated if target is
 	 present.  */
 	  arg1se.descriptor_only = 1;
 	  gfc_conv_expr_lhs (&arg1se, arg1->expr);
-	  tmp = gfc_conv_descriptor_stride_get (arg1se.expr,
+	  tmp = gfc_conv_descriptor_sm_get (arg1se.expr,
 	gfc_rank_cst[arg1->expr->rank - 1]);
 	  nonzero_arraylen = fold_build2_loc (input_location, NE_EXPR,
 	  boolean_type_node, tmp,
 	  build_int_cst (TREE_TYPE (tmp), 0));
 
   /* A pointer to an array, call library function _gfor_associated.  */
   gcc_assert (ss2 != gfc_ss_terminator);
   arg1se.want_pointer = 1;
   gfc_conv_expr_descriptor (&arg1se, arg1->expr, ss1);
Index: libgfortran/intrinsics/associated.c
===
--- libgfortran/intrinsics/associated.c	(revision 189480)
+++ libgfortran/intrinsics/associated.c	(working copy)
@@ -42,17 +42,17 @@ associated (const gfc_array_void *pointe
 
   rank = GFC_DESCRIPTOR_RANK (pointer);
   for (n = 0; n < rank; n++)
 {
   long extent;
   extent = GFC_DESCRIPTOR_EXTENT(pointer,n);
 
   if (extent != GFC_DESCRIPTOR_EXTENT(target,n))
 return 0;
-  if (GFC_DESCRIPTOR_STRIDE(pointer,n) != GFC_DESCRIPTOR_STRIDE(target,n) && extent != 1)
+  if (GFC_DESCRIPTOR_SM (pointer,n) != GFC_DESCRIPTOR_SM (target,n) && extent != 1)
 return 0;
   if (extent <= 0)
 	return 0;
 }
 
   return 1;
 }

Re: [Fortran-dev][Patch] Fix cshift1

2012-07-15 Thread Thomas Koenig


Hi Tobias,


> This patch fixes the stride setting for cshift1; hence, it fixes
> gfortran.dg/optional_dim_3.f90.
>
> Build and regtested on x86-64-linux - 13 failing tests remain.
> OK?


OK. Thanks for the patch!

Thomas

Re: [testsuite] Allow for / comments in g++.dg/debug/dwarf2/pubnames-2.C

2012-07-15 Thread Andreas Schwab

Installed.

Andreas.

* g++.dg/debug/dwarf2/pubnames-2.C: Support all known comment
characters.

diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/pubnames-2.C 
b/gcc/testsuite/g++.dg/debug/dwarf2/pubnames-2.C
index 375b856..3b7f95e 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/pubnames-2.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/pubnames-2.C
@@ -1,63 +1,63 @@
 // { dg-do compile }
 // { dg-options "-gpubnames -gdwarf-4 -std=c++0x -dA" }
 // { dg-final { scan-assembler ".section\t.debug_pubnames" } }
-// { dg-final { scan-assembler "\"\\(anonymous namespace\\)0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"one0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
-// { dg-final { scan-assembler "\"one::G_A0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::G_B0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::G_C0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::\\(anonymous namespace\\)0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"two0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
-// { dg-final { scan-assembler "\"F_A0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
-// { dg-final { scan-assembler "\"F_B0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
-// { dg-final { scan-assembler "\"F_C0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
-// { dg-final { scan-assembler "\"inline_func_10\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::c1::c10\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::c1::~c10\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::c1::val0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"check_enum0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"main0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
-// { dg-final { scan-assembler "\"two::c2::c20\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::c20\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::c20\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"check0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"check \\>0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"check \\>0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"check \\>0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::val0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::val0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::val0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler 
"\"__static_initialization_and_destruction_00\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::~c20\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::~c20\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2::~c20\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"_GLOBAL__sub_I__ZN3one3c1vE0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"anonymous_union_var0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::ci0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2v10\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2v20\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"two::c2v30\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::c1v0\"+\[ \t\]+\[#;/]+\[ 
\t\]+external name" } }
-// { dg-final { scan-assembler "\"one::\\(anonymous 
namespace\\)::one_anonymous_var0\"+\[ \t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"\\(anonymous 
namespace\\)::c1_count0\"+\[ \t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"\\(anonymous 
namespace\\)::c2_count0\"+\[ \t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"\\(anonymous namespace\\)::three0\"+\[ 
\t\]+\[#;/]+\[ \t\]+external name" } }
-// { dg-final { scan-assembler "\"\\(anonymous 
namespace\\)::three::anonymous_three_var0\"+\[ \t\]+\[#;/]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"\\(anonymous namespace\\)0\"+\[ 
\t\]+\[#;/|@!]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"one0\"+\[ \t\]+\[#;/|@!]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "

[Fortran-dev][Patch] Some (ubound-lbound+1) -> extent cleanup

2012-07-15 Thread Tobias Burnus

This patch cleans up the source code and the generated code ("dump") by 
replacing (ubound-lbound+1) calculations with a direct use of the "extent". 
Apart from faster code at -O0 and saving some cycles during code 
generation, the resulting code should be the same.



The only real code change I made was to gfc_grow_array. I didn't 
understand the previous code; I think the new code makes more sense. 
I believe the old code did:


  desc->extent = desc.extent + extra;
  realloc (&desc->data, (desc.extent + 1)*element_size);

while I think the latter should be "+ extra" and not "+ 1". From the 
callee, I couldn't deduce whether extra is always unity in practice; in 
any case, it doesn't look like it.



For fcncall_realloc_result, I didn't check whether the new code 
effectively matches the old one; I just followed the comment, which 
states: "Check that the shapes are the same between lhs and expression.".



There are some more cases, but there the code wasn't as obvious. For 
instance "extent = ubound - lbound"; there I was unsure whether that's a 
bug (missing "+1") or valid.



Built and regtested with no new failures.
OK for the branch?

Tobias
2012-07-15  Tobias Burnus  

	* trans-intrinsic.c (gfc_conv_intrinsic_size,
	gfc_conv_intrinsic_sizeof): Replace (ubound-lbound+1) calculation
	by "extent".
	* trans-expr.c (fcncall_realloc_result): Ditto.
	* trans-io.c (gfc_convert_array_to_string): Ditto.
	* trans-openmp.c (gfc_omp_clause_default_ctor,
	gfc_omp_clause_copy_ctor, gfc_omp_clause_assign_op,
	gfc_trans_omp_array_reduction): Ditto.
	* trans-array.c (array_parameter_size): Ditto.
	(gfc_grow_array): Ditto - and fix size calculation for realloc.

2012-07-15  Tobias Burnus  

	* gfortran.dg/array_section_2.f90: Update scan-tree-dump pattern.

Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c	(revision 189492)
+++ gcc/fortran/trans-intrinsic.c	(working copy)
@@ -5145,17 +5145,8 @@ gfc_conv_intrinsic_size (gfc_se * se, gfc_expr * e
 
   if (se->expr == NULL_TREE)
 {
-  tree ubound, lbound;
-
-  arg1 = build_fold_indirect_ref_loc (input_location,
-  arg1);
-  ubound = gfc_conv_descriptor_ubound_get (arg1, argse.expr);
-  lbound = gfc_conv_descriptor_lbound_get (arg1, argse.expr);
-  se->expr = fold_build2_loc (input_location, MINUS_EXPR,
-  gfc_array_index_type, ubound, lbound);
-  se->expr = fold_build2_loc (input_location, PLUS_EXPR,
-  gfc_array_index_type,
-  se->expr, gfc_index_one_node);
+  arg1 = build_fold_indirect_ref_loc (input_location, arg1);
+  se->expr = gfc_conv_descriptor_extent_get (arg1, argse.expr);
   se->expr = fold_build2_loc (input_location, MAX_EXPR,
   gfc_array_index_type, se->expr,
   gfc_index_zero_node);
@@ -5194,8 +5185,6 @@ gfc_conv_intrinsic_sizeof (gfc_se *se, gfc_expr *e
   tree source_bytes;
   tree type;
   tree tmp;
-  tree lower;
-  tree upper;
   int n;
 
   arg = expr->value.function.actual->expr;
@@ -5240,12 +5229,7 @@ gfc_conv_intrinsic_sizeof (gfc_se *se, gfc_expr *e
 	{
 	  tree idx;
 	  idx = gfc_rank_cst[n];
-	  lower = gfc_conv_descriptor_lbound_get (argse.expr, idx);
-	  upper = gfc_conv_descriptor_ubound_get (argse.expr, idx);
-	  tmp = fold_build2_loc (input_location, MINUS_EXPR,
- gfc_array_index_type, upper, lower);
-	  tmp = fold_build2_loc (input_location, PLUS_EXPR,
- gfc_array_index_type, tmp, gfc_index_one_node);
+	  tmp = gfc_conv_descriptor_extent_get (argse.expr, idx);
 	  tmp = fold_build2_loc (input_location, MULT_EXPR,
  gfc_array_index_type, tmp, source_bytes);
 	  gfc_add_modify (&argse.pre, source_bytes, tmp);
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(revision 189481)
+++ gcc/fortran/trans-expr.c	(working copy)
@@ -6476,19 +6481,10 @@ fcncall_realloc_result (gfc_se *se, int rank)
   for (n = 0 ; n < rank; n++)
 {
   tree tmp1;
-  tmp = gfc_conv_descriptor_lbound_get (desc, gfc_rank_cst[n]);
-  tmp1 = gfc_conv_descriptor_lbound_get (res_desc, gfc_rank_cst[n]);
-  tmp = fold_build2_loc (input_location, MINUS_EXPR,
-			 gfc_array_index_type, tmp, tmp1);
-  tmp1 = gfc_conv_descriptor_ubound_get (desc, gfc_rank_cst[n]);
-  tmp = fold_build2_loc (input_location, MINUS_EXPR,
-			 gfc_array_index_type, tmp, tmp1);
-  tmp1 = gfc_conv_descriptor_ubound_get (res_desc, gfc_rank_cst[n]);
-  tmp = fold_build2_loc (input_location, PLUS_EXPR,
-			 gfc_array_index_type, tmp, tmp1);
+  tmp = gfc_conv_descriptor_extent_get (desc, gfc_rank_cst[n]);
+  tmp1 = gfc_conv_descriptor_extent_get (res_desc, gfc_rank_cst[n]);
   tmp = fold_build2_loc (input_location, NE_EXPR,
-			 boolean_type_node, tmp,
-			 gfc_index_zero_node);
+			 boolean_type_node, tmp, tmp1);
   tmp = gfc_evaluate_now (tmp, &se->post);
   zero_cond = fold_build2_loc (input_lo

[patch, Fortran] Fix PR 53824

2012-07-15 Thread Thomas Koenig


Hello world,

this fixes an ICE with allocation of coarrays.  Regression-tested.

OK for trunk?  What about 4.7?
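
The guard that the patch adds can be pictured with the following C sketch. It is only an analogy: `compare_sketch` is a hypothetical stand-in for a comparison routine that cannot give a meaningful answer for missing expressions, and plain `int` pointers stand in for the start-index expressions.

```c
#include <stddef.h>

/* Hypothetical comparison: returns 0 if equal, -1 or 1 if ordered, and
   -2 if either operand is missing and nothing can be concluded.  */
static int
compare_sketch (const int *a, const int *b)
{
  if (a == NULL || b == NULL)
    return -2;
  return (*a > *b) - (*a < *b);
}

/* Two references are treated as differing only if at least one start
   index is present and the comparison does not report equality.  When
   both are absent, the comparison is skipped entirely.  */
static int
starts_differ (const int *p, const int *q)
{
  return (p != NULL || q != NULL) && compare_sketch (p, q) != 0;
}
```

With the guard in place, two references that both lack a start index are treated as referring to the same object instead of reaching the comparison.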

Thomas

2012-07-15  Thomas König  

PR fortran/53824
* resolve.c (resolve_allocate_deallocate):  If both
start indices are NULL, skip the test for equality.

2012-07-15  Thomas König  

PR fortran/53824
* gfortran.dg/coarray_allocate_1.f90:  New test.
Index: resolve.c
===
--- resolve.c	(revision 189478)
+++ resolve.c	(working copy)
@@ -7326,8 +7326,8 @@ resolve_allocate_deallocate (gfc_code *code, const
 	  }
 }
 
-  /* Check that an allocate-object appears only once in the statement.  
- FIXME: Checking derived types is disabled.  */
+  /* Check that an allocate-object appears only once in the statement.  */
+
   for (p = code->ext.alloc.list; p; p = p->next)
 {
   pe = p->expr;
@@ -7377,9 +7377,10 @@ resolve_allocate_deallocate (gfc_code *code, const
 			{
 			  gfc_array_ref *par = &(pr->u.ar);
 			  gfc_array_ref *qar = &(qr->u.ar);
-			  if (gfc_dep_compare_expr (par->start[0],
-		qar->start[0]) != 0)
-			  break;
+			  if ((par->start[0] != NULL || qar->start[0] != NULL)
+			  && gfc_dep_compare_expr (par->start[0],
+		   qar->start[0]) != 0)
+			break;
 			}
 		}
 		  else
! { dg-do compile }
! { dg-options "-fcoarray=single" }
! PR 53824 - this used to ICE.
! Original test case by Vladimír Fuka
program Jac
 implicit none

 integer,parameter:: KND=KIND(1.0)

 type Domain
  real(KND),dimension(:,:,:),allocatable:: A,B
  integer :: n=64,niter=2,blockit=1000
  integer :: starti,endi
  integer :: startj,endj
  integer :: startk,endk
  integer,dimension(:),allocatable :: startsi,startsj,startsk
  integer,dimension(:),allocatable :: endsi,endsj,endsk
 end type

 type(Domain),allocatable :: D[:,:,:]
! real(KND),codimension[*] :: sumA,sumB,diffAB
 integer i,j,k,ncom
 integer nims,nxims,nyims,nzims
 integer im,iim,jim,kim
 character(20):: ch

 nims = num_images()
 nxims = nint(nims**(1./3.))
 nyims = nint(nims**(1./3.))
 nzims = nims / (nxims*nyims)

 im = this_image()
 if (im==1) write(*,*) "n: [",nxims,nyims,nzims,"]"

 kim = (im-1) / (nxims*nyims) + 1
 jim = ((im-1) - (kim-1)*(nxims*nyims)) / nxims + 1
 iim = (im-1) - (kim-1)*(nxims*nyims) - (jim-1)*(nxims) + 1

 write (*,*) im,"[",iim,jim,kim,"]"

 allocate(D[nxims,nyims,*])

 ncom=command_argument_count()
 if (command_argument_count() >=2) then
  call get_command_argument(1,value=ch)
  read (ch,*) D%n
  call get_command_argument(2,value=ch)
  read (ch,*) D%niter
  call get_command_argument(3,value=ch)
  read (ch,*) D%blockit
 end if

 allocate(D%startsi(nxims))
 allocate(D%startsj(nyims))
 allocate(D%startsk(nzims))
 allocate(D%endsi(nxims))
 allocate(D%endsj(nyims))
 allocate(D%endsk(nzims))

 D%startsi(1) = 1
 do i=2,nxims
   D%startsi(i) = D%startsi(i-1) + D%n/nxims
 end do
 D%endsi(nxims) = D%n
 D%endsi(1:nxims-1) = D%startsi(2:nxims) - 1

 D%startsj(1) = 1
 do j=2,nyims
   D%startsj(j) = D%startsj(j-1) + D%n/nyims
 end do
 D%endsj(nyims) = D%n
 D%endsj(1:nyims-1) = D%startsj(2:nyims) - 1

 D%startsk(1) = 1
 do k=2,nzims
   D%startsk(k) = D%startsk(k-1) + D%n/nzims
 end do
 D%endsk(nzims) = D%n
 D%endsk(1:nzims-1) = D%startsk(2:nzims) - 1

 D%starti = D%startsi(iim)
 D%endi = D%endsi(iim)
 D%startj = D%startsj(jim)
 D%endj = D%endsj(jim)
 D%startk = D%startsk(kim)
 D%endk = D%endsk(kim)

 write(*,*) D%startsi,D%endsi
 write(*,*) D%startsj,D%endsj
 write(*,*) D%startsk,D%endsk

 !$hmpp JacKernel allocate, args[A,B].size={0:D%n+1,0:D%n+1,0:D%n+1}
 allocate(D%A(D%starti-1:D%endi+1,D%startj-1:D%endj+1,D%startk-1:D%endk+1),&
  D%B(D%starti-1:D%endi+1,D%startj-1:D%endj+1,D%startk-1:D%endk+1))
end program Jac

Re: G++ namespace association extension

2012-07-15 Thread Gerald Pfeifer

On Tue, 10 Jul 2012, Jonathan Wakely wrote:
>> Yes, but people should use inline namespaces instead; we should deprecate
>> this form and then remove it in 4.9.
> 
> * doc/extend.texi (Namespace Association): Alter cautionary text.

I think this also should go into the GCC 4.8 release notes
(gcc-4.8/changes.html)?

Gerald

Re: [Fortran-dev][Patch] Some (ubound-lbound+1) -> extent cleanup

2012-07-15 Thread Mikael Morin

On 15/07/2012 13:24, Tobias Burnus wrote:
> This patch cleans up the source code and generated code ("dump") by
> changing (ubound-lbound+1) calculations to directly taking the "extent".
> Except for a faster -O0 performance and saving some cycles during code
> generation, the code should be the same.
> 
A small, yet welcome simplification :-).

> 
> The only real code change I did was to gfc_grow_array. I didn't
> understand the previous code, I think the current code makes more sense.
> I believe the old code did:
> 
>   desc->extent = desc.extent + extra;
>   realloc (&desc->data, (desc.extent + 1)*element_size);

If you are talking about the old code, I think it does:

desc.ubound[0] += extra;
realloc (desc.data, (desc.ubound[0]+1)*element_size);

> 
> while I think the latter should be "+ extra" and not "+ 1". From the
> callee, I couldn't deduce whether extra is always unity in practice, in
> any case it doesn't look like.
> 
I think the old code is correct, under the assumption that desc.rank is
1, and desc.lbound[0] is 0, which is the case in the contexts where the
function is called (array constructors).
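
To make that assumption concrete, here is a small C sketch of growing a rank-1, zero-based array. `vec_sketch` and `grow_sketch` are hypothetical names, not the compiler's generated code; an empty array is represented by ubound == -1.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical rank-1 descriptor whose lower bound is always 0.  */
struct vec_sketch
{
  void *data;
  ptrdiff_t lbound;		/* assumed to stay 0 */
  ptrdiff_t ubound;		/* -1 for an empty array */
};

/* Grow V by EXTRA elements.  Because lbound is 0, the element count
   after growing is simply the new ubound + 1.  Returns 0 on success.  */
static int
grow_sketch (struct vec_sketch *v, ptrdiff_t extra, size_t element_size)
{
  ptrdiff_t new_ubound = v->ubound + extra;
  size_t count = (size_t) (new_ubound + 1);
  void *p = realloc (v->data, count * element_size);

  if (p == NULL)
    return -1;
  v->data = p;
  v->ubound = new_ubound;
  return 0;
}
```

The "+ 1" here converts the already-updated upper bound into an element count, so it does not depend on the value of extra.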

> 
> For fcncall_realloc_result, I didn't check whether the new code
> effectively matches the old one, I just followed the comment which
> states: "Check that the shapes are the same between lhs and expression.".
> 
I think the old code is:
res_desc.ubound[n] - res_desc.lbound[n]
- (desc.ubound[n] - desc.lbound[n]) != 0

while the new one is:
res_desc.extent[n] != desc.extent[n]
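
A short C sketch of why the two checks agree, assuming the extent is always ubound - lbound + 1 (this ignores any clamping of empty arrays). The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical bounds of one dimension.  */
struct bounds_sketch
{
  ptrdiff_t lbound;
  ptrdiff_t ubound;
};

static ptrdiff_t
extent_of (struct bounds_sketch b)
{
  return b.ubound - b.lbound + 1;
}

/* Old-style check: the difference of the two spans is nonzero.  */
static int
shape_differs_old (struct bounds_sketch res, struct bounds_sketch lhs)
{
  return (res.ubound - res.lbound) - (lhs.ubound - lhs.lbound) != 0;
}

/* New-style check: compare the extents of the two descriptors.  */
static int
shape_differs_new (struct bounds_sketch res, struct bounds_sketch lhs)
{
  return extent_of (res) != extent_of (lhs);
}
```

The "+ 1" cancels on both sides, so the two predicates agree whenever the extent matches that formula.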


> 
> There are some more cases, but there the code wasn't as obvious. For
> instance "extent = ubound - lbound"; there I was unsure whether that's a
> bug (missing "+1") or valid.
> 
> 
> Build and regtested with no new failures.
> OK for the branch?
> 
Yes, thanks.

Mikael

[wwwdocs] Add note about C++11 ABI to gcc-4.7/changes.html

2012-07-15 Thread Jonathan Wakely

Added a caveat to the gcc-4.7/changes.html page about C++11 ABI
incompatibilities.

Committed to wwwdocs.
Index: htdocs/gcc-4.7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.120
diff -u -r1.120 changes.html
--- htdocs/gcc-4.7/changes.html 20 Jun 2012 23:34:49 -  1.120
+++ htdocs/gcc-4.7/changes.html 15 Jul 2012 15:04:29 -
@@ -97,6 +97,17 @@
 It is no longer possible to use the "l"
 constraint in MIPS16 asm statements.
 
+GCC versions 4.7.0 and 4.7.1 had changes to the C++ standard library
+which affected the ABI in C++11 mode: a data member was added to
+std::list changing its size and altering the definitions of
+some member functions, and std::pair's move constructor was
+non-trivial which altered the calling convention for functions with
+std::pair arguments or return types.  The ABI incompatibilities
+have been fixed for GCC version 4.7.2 but as a result C++11 code compiled
+with GCC 4.7.0 or 4.7.1 may be incompatible with C++11 code compiled with
+different GCC versions and with C++98/C++03 code compiled with any version.
+
+
 More information on porting to GCC 4.7 from previous versions
 of GCC can be found in
 the http://gcc.gnu.org/gcc-4.7/porting_to.html";>porting

Re: G++ namespace association extension

2012-07-15 Thread Jonathan Wakely

On 15 July 2012 12:26, Gerald Pfeifer wrote:
> On Tue, 10 Jul 2012, Jonathan Wakely wrote:
>>> Yes, but people should use inline namespaces instead; we should deprecate
>>> this form and then remove it in 4.9.
>>
>> * doc/extend.texi (Namespace Association): Alter cautionary text.
>
> I think this also should go into the GCC 4.8 release notes
> (gcc-4.8/changes.html)?

I can do that too.  There's no gcc-4.8 dir yet, do I need to copy over
the various other files from the gcc-4.7 dir or can I just create
changes.html and leave the RM to do the rest at the appropriate time?

Re: [PATCH][MIPS] NetLogic XLP scheduling

2012-07-15 Thread Richard Sandiford

Chung-Lin Tang  writes:
> This patch adds scheduling support for the NetLogic XLP, including a new
> pipeline description, and associated changes.
>
> Aside from the new xlp.md description file, there are also some sync
> primitive attribute modifications, for better scheduling of sync loops
> (Maxim should be able to better explain this).

Rather than add a "type" attribute to each sync loop, please just add:

  (not (eq_attr "sync_mem" "none"))
  (symbol_ref "syncloop")

to the default value of the "type" attribute.  You'll probably need
to swap the order of the sync* attributes with the "type" attribute
in order for this to compile.

The patch is effectively changing the type of the sync loops from
"unknown" to "syncloop".  That's certainly OK, but you'll need to
add "syncloop" to the "unknown" reservations of all other schedulers
(except for generic.md, where what you've done instead is fine).
It might be easier if you split out the addition of syncloop
as a separate patch.

> Other generic changes include a new "hilo" insn attribute, to mark which
> of HI/LO does a m[ft]hilo insn access.

The way other schedulers handle this is with things like:

(define_insn_reservation "ir_sb1_mfhi" 1
  (and (eq_attr "cpu" "sb1,sb1a")
       (and (eq_attr "type" "mfhilo")
            (not (match_operand 1 "lo_operand"))))
  "sb1_ex1")

which seems simpler.  mfhilo and mthilo are required to read operand 1
and write to operand 0 (respectively) in order to support this kind of
construct.

That said, even the above is a hold-over from when we tried to allow
high registers to store independent values.  These days we can be a bit
more precise, as with the patch below.  (As the comment says:

 ;; If a doubleword move uses these expensive instructions,
 ;; it is usually better to schedule them in the same way
 ;; as the singleword form, rather than as "multi".

I'm continuing to assume that mflo and mtlo are the best type choices
for unsplit double-register moves.  That path should be taken very rarely
outside of MIPS16 anyway -- just by sched1 if hi and lo are exposed
directly -- and no current scheduler tries to model a doubleword hi/lo
move separately from single-register ones.  The information is available
via the dword_mode attribute if required.)

Tested on mips64-elf, and by making sure that there were no changes in
-O2 output for a recent set of cc1 .ii files.  Applied.

I'm probably punishing you for being honest here, but the only other
thing is that you've listed NetLogic Microsystems Inc. as one of the
authors.  I think that means they'll need to sign a copyright assignment.
Have they already done that?

Thanks,
Richard

gcc/
* config/mips/mips.md (move_type): Replace mfhilo and mthilo
with mflo and mtlo.
(type): Split mfhilo into mfhi and mflo.  Split mthilo into mthi
and mtlo.  Adjust move_type->type mapping.
(may_clobber_hilo): Split mthilo into mthi and mtlo.
(*movdi_32bit, *movdi_32bit_mips16, *movdi_64bit, *movdi_64bit_mips16)
(*mov_internal, *mov_mips16, *movhi_internal)
(*movhi_mips16, *movqi_internal, *movqi_mips16): Use mtlo and mflo
instead of mthilo and mfhilo.
(mfhi_): Use mfhi instead of mfhilo.
(mthi_): Use mthi instead of mthilo.
* config/mips/mips-dsp.md (mips_extr_w, mips_extr_r_w, mips_extr_rs_w)
(mips_extr_s_h, mips_extp, mips_extpdp, mips_shilo, mips_mthlip):
Use mflo instead of mfhilo.
* config/mips/1.md (r10k_arith): Split mthilo.
(r10k_mfhi, r10k_mflo): Use mfhi and mflo directly.
* config/mips/sb1.md (ir_sb1_mfhi, ir_sb1_mflo): Likewise.
(ir_sb1_mthilo): Split mthilo into mthi and mtlo.
* config/mips/20kc.md (r20kc_imthilo, r20kc_imfhilo): Split
mthilo and mfhilo.
* config/mips/24k.md (r24k_int_mfhilo, r24k_int_mthilo): Likewise.
* config/mips/4130.md (vr4130_class, vr4130_mfhilo, vr4130_mthilo):
Likewise.
* config/mips/4k.md (r4k_int_mthilo, r4k_int_mfhilo): Likewise.
* config/mips/5400.md (ir_vr54_hilo): Likewise.
* config/mips/5500.md (ir_vr55_mthilo, ir_vr55_mfhilo): Likewise.
* config/mips/5k.md (r5k_int_mthilo, r5k_int_mfhilo): Likewise.
* config/mips/7000.md (rm7_mthilo, rm7_mfhilo): Likewise.
* config/mips/74k.md (r74k_int_mfhilo, r74k_int_mthilo): Likewise.
* config/mips/9000.md (rm9k_mfhilo, rm9k_mthilo): Likewise.
* config/mips/generic.md (generic_hilo): Likewise.
* config/mips/loongson2ef.md (ls2_alu): Likewise.
* config/mips/loongson3a.md (ls3a_mfhilo): Likewise.
* config/mips/octeon.md (octeon_imul_o1, octeon_imul_o2)
(octeon_mfhilo_o1, octeon_mfhilo_o2): Likewise.
* config/mips/sr71k.md (ir_sr70_hilo): Likewise.
* config/mips/xlr.md (xlr_hilo): Likewise.

Index: gcc/config/mips/mips.md
=

[SH] Remove old mov peepholes

2012-07-15 Thread Oleg Endo

Hello,

The attached patch removes old peephole patterns that seem to be unused.
Tested with 'make all'.  CSiBE result-size (-m4-single -ml -O2
-mpretend-cmove) does not show any changes.

OK?

Cheers,
Oleg


ChangeLog:

* config/sh/sh.md: Delete mov related define_peephole patterns.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 186311)
+++ gcc/config/sh/sh.md	(working copy)
@@ -11726,73 +11726,9 @@
 	(mem:HI (plus:SI (match_dup 1) (match_dup 2]
   "")
 
-;; These convert sequences such as `mov #k,r0; add r15,r0; mov.l @r0,rn'
-;; to `mov #k,r0; mov.l @(r0,r15),rn'.  These sequences are generated by
-;; reload when the constant is too large for a reg+offset address.
-
-;; ??? We would get much better code if this was done in reload.  This would
-;; require modifying find_reloads_address to recognize that if the constant
-;; is out-of-range for an immediate add, then we get better code by reloading
-;; the constant into a register than by reloading the sum into a register,
-;; since the former is one instruction shorter if the address does not need
-;; to be offsettable.  Unfortunately this does not work, because there is
-;; only one register, r0, that can be used as an index register.  This register
-;; is also the function return value register.  So, if we try to force reload
-;; to use double-reg addresses, then we end up with some instructions that
-;; need to use r0 twice.  The only way to fix this is to change the calling
-;; convention so that r0 is not used to return values.
-
 (define_peephole
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
-   (set (mem:SI (match_dup 0))
-	(match_operand:SI 2 "general_movsrc_operand" ""))]
-  "TARGET_SH1 && REGNO (operands[0]) == 0 && reg_unused_after (operands[0], insn)"
-  "mov.l	%2,@(%0,%1)")
-
-(define_peephole
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
-   (set (match_operand:SI 2 "general_movdst_operand" "")
-	(mem:SI (match_dup 0)))]
-  "TARGET_SH1 && REGNO (operands[0]) == 0 && reg_unused_after (operands[0], insn)"
-  "mov.l	@(%0,%1),%2")
-
-(define_peephole
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
-   (set (mem:HI (match_dup 0))
-	(match_operand:HI 2 "general_movsrc_operand" ""))]
-  "TARGET_SH1 && REGNO (operands[0]) == 0 && reg_unused_after (operands[0], insn)"
-  "mov.w	%2,@(%0,%1)")
-
-(define_peephole
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
-   (set (match_operand:HI 2 "general_movdst_operand" "")
-	(mem:HI (match_dup 0)))]
-  "TARGET_SH1 && REGNO (operands[0]) == 0 && reg_unused_after (operands[0], insn)"
-  "mov.w	@(%0,%1),%2")
-
-(define_peephole
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
-   (set (mem:QI (match_dup 0))
-	(match_operand:QI 2 "general_movsrc_operand" ""))]
-  "TARGET_SH1 && REGNO (operands[0]) == 0 && reg_unused_after (operands[0], insn)"
-  "mov.b	%2,@(%0,%1)")
-
-(define_peephole
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
-   (set (match_operand:QI 2 "general_movdst_operand" "")
-	(mem:QI (match_dup 0)))]
-  "TARGET_SH1 && REGNO (operands[0]) == 0 && reg_unused_after (operands[0], insn)"
-  "mov.b	@(%0,%1),%2")
-
-(define_peephole
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI (match_dup 0) (match_operand:SI 1 "register_operand" "r")))
    (set (mem:SF (match_dup 0))
 	(match_operand:SF 2 "general_movsrc_operand" ""))]
   "TARGET_SH1 && REGNO (operands[0]) == 0

[SH] Reorg some CONST_OK_ macros

2012-07-15 Thread Oleg Endo

Hello,

This patch replaces usages of CONST_OK_FOR_I06 with
satisfies_constraint_I06 and moves the CONST_OK_FOR_I10 macro to sh.c.
Tested with 'make all-gcc'.

OK?

Cheers,
Oleg

ChangeLog:

* config/sh/sh.h (CONST_OK_FOR_I06): Delete.
(CONST_OK_FOR_I10): Move macro to ...
* config/sh/sh.c: ... here.
(sh_legitimate_index_p): Use satisfies_constraint_I06
instead of CONST_OK_FOR_I06.
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 189427)
+++ gcc/config/sh/sh.c	(working copy)
@@ -63,6 +63,9 @@
 #define LSW (TARGET_LITTLE_ENDIAN ? 0 : 1)
 
 /* These are some macros to abstract register modes.  */
+#define CONST_OK_FOR_I10(VALUE) (((HOST_WIDE_INT)(VALUE)) >= -512 \
+ && ((HOST_WIDE_INT)(VALUE)) <= 511)
+
 #define CONST_OK_FOR_ADD(size) \
   (TARGET_SHMEDIA ? CONST_OK_FOR_I10 (size) : CONST_OK_FOR_I08 (size))
 #define GEN_MOV (*(TARGET_SHMEDIA64 ? gen_movdi : gen_movsi))
@@ -9776,7 +9779,7 @@
 
   /* Check if this is the address of an unaligned load / store.  */
   if (mode == VOIDmode)
-	return CONST_OK_FOR_I06 (INTVAL (op));
+	return satisfies_constraint_I06 (op);
 
   size = GET_MODE_SIZE (mode);
   return (!(INTVAL (op) & (size - 1))
Index: gcc/config/sh/sh.h
===
--- gcc/config/sh/sh.h	(revision 189427)
+++ gcc/config/sh/sh.h	(working copy)
@@ -1213,12 +1213,8 @@
 
 /* Defines for sh.md and constraints.md.  */
 
-#define CONST_OK_FOR_I06(VALUE) (((HOST_WIDE_INT)(VALUE)) >= -32 \
- && ((HOST_WIDE_INT)(VALUE)) <= 31)
 #define CONST_OK_FOR_I08(VALUE) (((HOST_WIDE_INT)(VALUE))>= -128 \
  && ((HOST_WIDE_INT)(VALUE)) <= 127)
-#define CONST_OK_FOR_I10(VALUE) (((HOST_WIDE_INT)(VALUE)) >= -512 \
- && ((HOST_WIDE_INT)(VALUE)) <= 511)
 #define CONST_OK_FOR_I16(VALUE) (((HOST_WIDE_INT)(VALUE)) >= -32768 \
  && ((HOST_WIDE_INT)(VALUE)) <= 32767)
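
The CONST_OK_FOR_Inn macros in this patch all ask the same question: does a value fit in a signed immediate field of n bits?  As a minimal sketch of that check, assuming nothing beyond the ranges visible in the macros (-32..31 for I06, -128..127 for I08, -512..511 for I10), it could be written as one plain C function.  The name fits_signed_bits is hypothetical and is not part of GCC.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: return true if VALUE fits in a
   signed two's-complement field of BITS bits (1 <= BITS <= 62).  */
bool
fits_signed_bits (long long value, int bits)
{
  long long min = -(1LL << (bits - 1));      /* e.g. -32 for 6 bits */
  long long max = (1LL << (bits - 1)) - 1;   /* e.g.  31 for 6 bits */
  return value >= min && value <= max;
}
```

With this, fits_signed_bits (v, 6) accepts the same range as the deleted CONST_OK_FOR_I06, and fits_signed_bits (v, 10) the same range as CONST_OK_FOR_I10.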

Re: [patch, Fortran] Fix PR 53824

2012-07-15 Thread Tobias Burnus


Thomas Koenig wrote:

this fixes an ICE with allocation of coarrays.  Regression-tested.
OK for trunk?  What about 4.7?


OK. Thanks for the patch. Regarding 4.7, I don't have a strong opinion. 
Given that it is a simple patch and given that (single-image) coarrays 
work rather well in 4.7, maybe one should.


Tobias


2012-07-15 Thomas König  

PR fortran/53824
* resolve.c (resolve_allocate_deallocate):  If both
start indices are NULL, skip the test for equality.

2012-07-15  Thomas König  

PR fortran/53824
* gfortran.dg/coarray_allocate_1.f90:  New test.

Re: PATCH: PR target/53383: Allow -mpreferred-stack-boundary=3 on x86-64

2012-07-15 Thread Gerald Pfeifer

On Fri, 22 Jun 2012, H.J. Lu wrote:
> I am not sure if news.html is the best place for this.

news.html definitely is not a good place for this, cf. the comment
  
in that file. ;-)

> How about putting it in gcc-4.8/changes.html?

Yes, that fits.

> Does it look OK?

Index: ./gcc-4.8/changes.html
===
+Allow -mpreferred-stack-boundary=3 for the x86-64
+architecture with SSE extensions disabled.  Since x86-64 ABI require

the...ABI
requires

+used in controlled environment where stack space is important limitation.

is an important limitation

+long double and __int128), leading to wrong results.  You must build all

...

And the header for the supersection was missing.  

All fixed with the patch below which I committed.


Index: gcc-4.8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.4
diff -u -3 -p -r1.4 changes.html
--- gcc-4.8/changes.html	2 Jul 2012 11:35:28 -	1.4
+++ gcc-4.8/changes.html	15 Jul 2012 21:23:19 -
@@ -63,22 +63,22 @@ more information about requirements to b
 Java (GCJ)
 -->
 
-
 
 IA-32/x86-64
   
 Allow -mpreferred-stack-boundary=3 for the x86-64
-architecture with SSE extensions disabled.  Since x86-64 ABI require
-16 byte stack alignment, this is ABI incompatible and intended to be
-used in controlled environment where stack space is important limitation.
+architecture with SSE extensions disabled.  Since the x86-64 ABI
+requires 16 byte stack alignment, this is ABI incompatible and
+intended to be used in controlled environments where stack space
+is an important limitation.
 This option will lead to wrong code when functions compiled with 16 byte
 stack alignment (such as functions from a standard library) are called
 with misaligned stack.  In this case, SSE instructions may lead to
 misaligned memory access traps.  In addition, variable arguments will
 be handled incorrectly for 16 byte aligned objects (including x87
-long double and __int128), leading to wrong results.  You must build all
+long double and __int128), leading to
+wrong results.  You must build all
 modules with -mpreferred-stack-boundary=3, including any
 libraries.  This includes the system libraries and startup modules.

[PATCH] Enable vectorizer cost model by default at -O3

2012-07-15 Thread William J. Schmidt

The auto-vectorizer is overly aggressive when not constrained by the
vectorizer cost model.  Although the cost model is by no means perfect,
it does a reasonable job of avoiding many poor vectorization decisions.
Since the auto-vectorizer is enabled by default at -O3 and above, we
should also enable the vectorizer cost model by default at -O3 and
above.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions.  Ok for trunk?

Thanks,
Bill


2012-07-15  Bill Schmidt  

* opts.c (default_option): Add -fvect-cost-model to default options
at -O3 and above.


Index: gcc/opts.c
===
--- gcc/opts.c  (revision 189481)
+++ gcc/opts.c  (working copy)
@@ -501,6 +501,7 @@ static const struct default_options default_option
 { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_ftree_vectorize, NULL, 1 },
+{ OPT_LEVELS_3_PLUS, OPT_fvect_cost_model, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },

[wwwdocs] SH - add 'b' target characteristic

2012-07-15 Thread Oleg Endo

Hello,

If I'm not mistaken, the SH target does not use the '"* ..."' notation
for output template code.  The patch below updates the table in
backends.html to reflect the current status.

Cheers,
Oleg


Index: htdocs/backends.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.45
diff -u -r1.45 backends.html
--- htdocs/backends.html	23 Feb 2012 13:24:36 -	1.45
+++ htdocs/backends.html	15 Jul 2012 22:52:00 -
@@ -96,7 +96,7 @@
 pdp11|L   ICqrcp   e 
 rs6000   | Q   Cqr  da   
 s390 |   ? Qqr p g bda e 
-sh   | Q   CB   qr  da   
+sh   | Q   CB   qr bda   
 sparc| Q   CB   qr pda   
 spu  |   ? Q  *C   p g bd
 stormy16 | ???L  FIC D l   p  m  a

Re: G++ namespace association extension

2012-07-15 Thread Gerald Pfeifer

On Sun, 15 Jul 2012, Jonathan Wakely wrote:
>> I think this also should go into the GCC 4.8 release notes
>> (gcc-4.8/changes.html)?
> I can do that too.  There's no gcc-4.8 dir yet, do I need to copy over
> the various other files from the gcc-4.7 dir or can I just create
> changes.html and leave the RM to do the rest at the appropriate time?

If you run `cvs up -PAd` it should magically appear. :-)

Gerald

Re: [wwwdocs] Buildstat update for 4.7

2012-07-15 Thread Gerald Pfeifer

On Sun, 1 Jul 2012, Tom G. Christensen wrote:
> Latest results for 4.7.x

Thanks, Tom!

Gerald

Re: [wwwdocs] Document ARM/embedded-x_y-branch family

2012-07-15 Thread Gerald Pfeifer

Hi Terry,

On Mon, 9 Jul 2012, Terry Guo wrote:
> As it becomes our long term goal to deliver arm-none-eabi tool chain 
> based on GCC 4.6/4.7/4.8 and future branches, I am going to use the 
> following pattern to document this branch family. Is it ok to commit?

yep, this looks good (and sorry for somehow missing this at first).

The only question I'd have, that may make sense addressing in the
description, is "What kind of changes do you expect to take there,
but not the corresponding release branches? And why?"

Gerald

Re: [patch] PR web/53919 - Add note to install.texi

2012-07-15 Thread Gerald Pfeifer

Hi Jonathan,

On Sat, 14 Jul 2012, Jonathan Wakely wrote:
> Attached this time, here's the original mail again:
> 
> PR c++/53919
> * doc/install.texi (Installing GCC): Refer to instructions for
> released versions. Fix hyphenation.
> 
> Whether or not we want the release-specific installation instructions
> online, I don't think it hurts to point out that the docs at
> http://gcc.gnu.org/install/ refer to the SVN trunk.

This is good, with one caveat.  Can we please not refer to SVN? ;-)

This is one of those lessons learned, where just "current development
sources" or similar says the same, de facto, and makes it easier to
change things.  (You'd be surprised how long we kept finding references
to CVS.)

 The latest version of this document is always available at
 @uref{http://gcc.gnu.org/install/,,http://gcc.gnu.org/install/}.
+The latest version refers to the SVN development sources, instructions for
+specific released versions are included with the sources.

And here, instead of repeating "The latest version", how about just
"It refers..."?

Thanks!

Gerald

Re: [SH] Reorg some CONST_OK_ macros

2012-07-15 Thread Kaz Kojima

Oleg Endo  wrote:
> This patch replaces usages of CONST_OK_FOR_I06 with
> satisfies_constraint_I06 and moves the CONST_OK_FOR_I10 macro to sh.c.
> Tested with 'make all-gcc'.
> 
> OK?

OK.

Regards,
kaz

Re: [SH] Remove old mov peepholes

2012-07-15 Thread Kaz Kojima

Oleg Endo  wrote:
> The attached patch removes old peephole patterns that seem to be unused.
> Tested with 'make all'.  CSiBE result-size (-m4-single -ml -O2
> -mpretend-cmove) does not show any changes.
> 
> OK?

OK.

Regards,
kaz

Re: [wwwdocs] SH - add 'b' target characteristic

2012-07-15 Thread Kaz Kojima

Oleg Endo  wrote:
> If I'm not mistaken, the SH target does not use the '"* ..."' notation
> for output template code.  The patch below updates the table in
> backends.html to reflect the current status.

Looks obvious and OK to me.

Regards,
kaz

Re: [RFA/ARM 1/3] Add VFP support for VFMA and friends

2012-07-15 Thread Michael Hope

On 5 July 2012 21:13, Matthew Gretton-Dann  wrote:
> On 26/06/12 14:44, Richard Earnshaw wrote:
>>
>> On 25/06/12 15:59, Matthew Gretton-Dann wrote:
>>>
>>> All,
>>>
>>> This patch adds support to the ARM backend for generating floating-point
>>> fused multiply-accumulate.
>>>
>>> OK?
>>>
>>> gcc/ChangeLog:
>>>
>>> 2012-06-25  Matthew Gretton-Dann  
>>>
>>> * config/arm/iterators.md (SDF): New mode iterator.
>>> (V_if_elem): Add support for SF and DF modes.
>>> (V_reg): Likewise.
>>> (F_w_constraint): New mode iterator attribute.
>>> (F_r_constraint): Likewise.
>>> (F_fma_type): Likewise.
>>> (F_target): Likewise.
>>> config/arm/vfp.md (fma4): New pattern.
>>> (*fmsub4): Likewise.
>>> (*fmnsub4): Likewise.
>>> (*fmnadd4): Likewise.
>>>
>>
>> F_target as an attribute name doesn't tell me anything useful.  I
>> suggest F_maybe_not_df.
>>
>>> +  "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_FMA <F_target>"
>>
>>
>> This should be written as
>>
>> "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_FMA && <F_maybe_not_df>"
>>
>> Then the attribute should expand
>>
>>(define_mode_attr F_maybe_not_df [(SF "1") (DF "TARGET_VFP_DOUBLE")])
>>
>> As a style nit, I would also suggest using the iterator name when it
>> appears in the pattern name, even though it is redundant.  This avoids
>> potential ambiguities when there are multiple iterators operating on
>> different expansions.  That is, instead of:
>>
>>   (define_insn "fma<mode>4"
>>
>> use:
>>
>>   (define_insn "fma<SDF:mode>4"
>>
>> OK with those changes.
>>
>> R.
>>
>
> Now checked in with some changes (see attached patch for what was committed)
> - changes approved off list.

Hi Matt.  Your new patterns require TARGET_HARD_FLOAT but the
testsuite doesn't, giving failures when building for soft float[1] or
softfp[2].  Which should it be?

-- Michael

[1] 
http://builds.linaro.org/toolchain/gcc-4.8~svn189401/logs/armv7l-natty-cbuild344-tcpanda06-armv5r2/gcc-testsuite.txt
[2] 
http://builds.linaro.org/toolchain/gcc-4.8~svn189401/logs/armv7l-natty-cbuild344-tcpanda02-cortexa9r1/gcc-testsuite.txt

CRIS atomics revisited 0/4: summary

2012-07-15 Thread Hans-Peter Nilsson

These were spotted while debugging usage of atomics within
glibc.  The kind of changes are microoptimizations,
nanooptimizations, a buglet and a major issue.  Micro: the
load-store-conditional sequence for compare-and-swap that I
originally committed was an earlier version, which I had since improved.
Nanooptimizations: choosing better-fitting operands for the
atomic operator insn.  Buglet: a post-increment could have
sneaked into the (non-atomic) arithmetic operator operand;
better make it nonmemory_operand altogether.  I also threw in
use of the now generic need_atomic_barrier_p, let's call that a
microoptimization.  The major issue, giving up on alignment of
atomic data by default, is last.  Tested together and some
separately, no regressions for cris-elf nor crisv32-elf.
Committed separately.

brgds, H-P

CRIS atomics revisited 1/4: use need_atomic_barrier_p.

2012-07-15 Thread Hans-Peter Nilsson

Use the new need_atomic_barrier_p.

gcc:
* config/cris/sync.md ("atomic_fetch_<atomic_op_name><mode>")
("atomic_compare_and_swap<mode>"): Gate expand_mem_thread_fence
calls on result of call to need_atomic_barrier_p.

Index: config/cris/sync.md
===
--- config/cris/sync.md (revision 189499)
+++ config/cris/sync.md (working copy)
@@ -93,11 +93,15 @@ (define_expand "atomic_fetch_mode != QImode && TARGET_TRAP_UNALIGNED_ATOMIC)
 cris_emit_trap_for_misalignment (operands[1]);
 
-  expand_mem_thread_fence (mmodel);
+  if (need_atomic_barrier_p (mmodel, true))
+expand_mem_thread_fence (mmodel);
+
   emit_insn (gen_cris_atomic_fetch_<atomic_op_name><mode>_1 (operands[0],
 operands[1],
 operands[2]));
-  expand_mem_thread_fence (mmodel);
+  if (need_atomic_barrier_p (mmodel, false))
+expand_mem_thread_fence (mmodel);
+
   DONE;
 })
 
@@ -196,13 +200,17 @@ (define_expand "atomic_compare_and_swap<
   if (<MODE>mode != QImode && TARGET_TRAP_UNALIGNED_ATOMIC)
 cris_emit_trap_for_misalignment (operands[2]);
 
-  expand_mem_thread_fence (mmodel);
+  if (need_atomic_barrier_p (mmodel, true))
+expand_mem_thread_fence (mmodel);
+
   emit_insn (gen_cris_atomic_compare_and_swap<mode>_1 (operands[0],
   operands[1],
   operands[2],
   operands[3],
   operands[4]));
-  expand_mem_thread_fence (mmodel);
+  if (need_atomic_barrier_p (mmodel, false))
+expand_mem_thread_fence (mmodel);
+
   DONE;
 })
 
brgds, H-P
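
The patch above emits a fence before or after the atomic sequence only when a predicate says the memory model requires one.  As a rough sketch of what such a predicate plausibly computes, assuming the usual C11 meaning of each memory model, the decision could look like the following.  The enum and function names are local stand-ins invented for this illustration; GCC's own need_atomic_barrier_p may differ in detail.

```c
#include <stdbool.h>

/* Local stand-in for GCC's memory model enumeration; the names are
   assumptions for this sketch only.  */
enum sketch_memmodel
{
  SK_RELAXED, SK_CONSUME, SK_ACQUIRE, SK_RELEASE, SK_ACQ_REL, SK_SEQ_CST
};

/* Return true if a fence is needed before (PRE true) or after
   (PRE false) an atomic operation with memory model MODEL.  */
bool
sketch_need_barrier_p (enum sketch_memmodel model, bool pre)
{
  switch (model)
    {
    case SK_RELAXED:
    case SK_CONSUME:
      return false;
    case SK_RELEASE:
      return pre;       /* Earlier accesses must not sink below.  */
    case SK_ACQUIRE:
      return !pre;      /* Later accesses must not hoist above.  */
    case SK_ACQ_REL:
    case SK_SEQ_CST:
      return true;
    }
  return true;          /* Conservative answer for unknown models.  */
}
```

Under these assumptions a relaxed operation gets no fence at all, while a release operation gets only the leading one, which is the saving the patch is after.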

CRIS atomics revisited 2/4: don't allow a memory operand (with possible side-effects)

2012-07-15 Thread Hans-Peter Nilsson

Buglet in "atomic_compare_and_swap<mode>", allowing (in theory)
a volatile or post-increment memory operand.  The simplest and
safest fix is to exclude all memory operands.

gcc:
* config/cris/sync.md ("atomic_compare_and_swap<mode>"): Change
predicate to nonmemory_operand for operand 3.  Add FIXME.
("cris_atomic_compare_and_swap<mode>_1"): Change predicates and
constraints for operand 3 to exclude memory.

Index: config/cris/sync.md
===
--- config/cris/sync.md (revision 189500)
+++ config/cris/sync.md (working copy)
@@ -184,11 +184,12 @@ (define_insn "cris_atomic_fetch_ over this, but having both would be
 ;; redundant.
+;; FIXME: handle memory without side-effects for operand[3].
 (define_expand "atomic_compare_and_swap<mode>"
   [(match_operand:SI 0 "register_operand")
(match_operand:BWD 1 "register_operand")
(match_operand:BWD 2 "memory_operand")
-   (match_operand:BWD 3 "general_operand")
+   (match_operand:BWD 3 "nonmemory_operand")
(match_operand:BWD 4 "register_operand")
(match_operand 5)
(match_operand 6)
@@ -218,7 +219,7 @@ (define_insn "cris_atomic_compare_and_sw
   [(set (match_operand:SI 0 "register_operand" "=&r")
(unspec_volatile:SI
 [(match_operand:BWD 2 "memory_operand" "+Q")
- (match_operand:BWD 3 "general_operand" "g")]
+ (match_operand:BWD 3 "nonmemory_operand" "ri")]
 CRIS_UNSPEC_ATOMIC_SWAP_BOOL))
(set (match_operand:BWD 1 "register_operand" "=&r") (match_dup 2))
(set (match_dup 2)

brgds, H-P

CRIS atomics revisited 3/4: pattern improvements

2012-07-15 Thread Hans-Peter Nilsson

Microoptimizations for the atomic patterns themselves.  Constant
operands are so common that it seems wasteful not to handle the
most common cases and avoid wasting a register.

gcc/testsuite:
* gcc.target/cris/20011127-1.c: Adjust to %P being a
valid register operand output modifier.

gcc:
* config/cris/cris.c (cris_print_operand) : New cases.
* config/cris/sync.md (atomic_op_op_cnstr): New code_attr.
(atomic_op_op_pred): Ditto.
(atomic_op_mnem_pre_op2): Renamed from atomic_op_mnem_pre; to
reflect the change to include %2 in expansion.  All callers changed.
(qm3): New mode_attr.
("atomic_fetch_<atomic_op_name><mode>"): Use <atomic_op_op_pred>
as predicate for operand 2.
("cris_atomic_fetch_<atomic_op_name><mode>_1"): Update FIXME.  Use
"<atomic_op_op_pred>" "<atomic_op_op_cnstr>" for predicate and
constraint for operand 2.
("atomic_compare_and_swap<mode>"): Add FIXME.  Change predicate to
nonmemory_operand for operand 3.
("cris_atomic_compare_and_swap<mode>_1"): Change operand 3 to
exclude memory.  Improve emitted sync code for v10 and v32.  Use
<qm3> instead of <m> for size designator for cmp.

Index: config/cris/cris.c
===
--- config/cris/cris.c  (revision 189499)
+++ config/cris/cris.c  (working copy)
@@ -981,6 +981,53 @@ cris_print_operand (FILE *file, rtx x, i
   fprintf (file, INTVAL (operand) < 0 ? "adds.w" : "addq");
   return;
 
+case 'P':
+  /* For const_int operands, print the additive mnemonic and the
+modified operand (byte-sized operands don't save anything):
+  N=MIN_INT..-65536: add.d N
+  -65535..-64: subu.w -N
+  -63..-1: subq -N
+  0..63: addq N
+  64..65535: addu.w N
+  65536..MAX_INT: add.d N.
+(Emitted mnemonics are capitalized to simplify testing.)
+For anything else (N.B: only register is valid), print "add.d".  */
+  if (REG_P (operand))
+   {
+ fprintf (file, "Add.d ");
+
+ /* Deal with printing the operand by dropping through to the
+normal path.  */
+ break;
+   }
+  else
+   {
+ int val;
+ gcc_assert (CONST_INT_P (operand));
+
+ val = INTVAL (operand);
+ if (!IN_RANGE (val, -65535, 65535))
+ fprintf (file, "Add.d %d", val);
+ else if (val <= -64)
+   fprintf (file, "Subu.w %d", -val);
+ else if (val <= -1)
+   fprintf (file, "Subq %d", -val);
+ else if (val <= 63)
+ fprintf (file, "Addq %d", val);
+ else if (val <= 65535)
+   fprintf (file, "Addu.w %d", val);
+ return;
+   }
+  break;
+
+case 'q':
+  /* If the operand is an integer -31..31, print "q" else ".d".  */
+  if (CONST_INT_P (operand) && IN_RANGE (INTVAL (operand), -31, 31))
+   fprintf (file, "q");
+  else
+   fprintf (file, ".d");
+  return;
+
 case 'd':
   /* If this is a GOT symbol, force it to be emitted as :GOT and
 :GOTPLT regardless of -fpic (i.e. not as :GOT16, :GOTPLT16).
Index: config/cris/sync.md
===
--- config/cris/sync.md (revision 189501)
+++ config/cris/sync.md (working copy)
@@ -73,17 +73,32 @@ (define_code_iterator atomic_op [plus mi
 (define_code_attr atomic_op_name
  [(plus "add") (minus "sub") (and "and") (ior "or") (xor "xor") (mult "nand")])
 
+;; The operator nonatomic-operand can be memory, constant or register
+;; for all but xor.  We can't use memory or addressing modes with
+;; side-effects though, so just use registers and literal constants.
+(define_code_attr atomic_op_op_cnstr
+ [(plus "ri") (minus "ri") (and "ri") (ior "ri") (xor "r") (mult "ri")])
+
+(define_code_attr atomic_op_op_pred
+ [(plus "nonmemory_operand") (minus "nonmemory_operand")
+  (and "nonmemory_operand") (ior "nonmemory_operand")
+  (xor "register_operand") (mult "nonmemory_operand")])
+
 ;; Pairs of these are used to insert the "not" after the "and" for nand.
-(define_code_attr atomic_op_mnem_pre ;; Upper-case only to sinplify testing.
- [(plus "Add.d") (minus "Sub.d") (and "And.d") (ior "Or.d") (xor "Xor")
-  (mult "aNd.d")])
+(define_code_attr atomic_op_mnem_pre_op2 ;; Upper-case only to simplify testing.
+ [(plus "%P2") (minus "Sub.d %2") (and "And%q2 %2") (ior "Or%q2 %2") (xor "Xor %2")
+  (mult "aNd%q2 %2")])
+
 (define_code_attr atomic_op_mnem_post_op3
  [(plus "") (minus "") (and "") (ior "") (xor "") (mult "not %3\;")])
 
+;; For SImode, emit "q" for operands -31..31.
+(define_mode_attr qm3 [(SI "%q3") (HI ".w") (QI ".b")])
+
 (define_expand "atomic_fetch_<atomic_op_name><mode>"
   [(match_operand:BWD 0 "register_operand")
(match_operand:BWD 1 "memory_operand")
-   (match_operand:BWD 2 "register_operand")
+   (match_operand:BWD 2 "<atomic_op_op_pred>")
(match_operand 3)
(atomic_op:BWD (match_dup 0) (match_dup 1))]
   ""
@@ -109,8 +124,9 @@ (define_insn "cris_atomic_fetch_" "")))
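
The 'P' and 'q' output modifiers added to cris_print_operand in this patch pick the shortest encoding from the value of a constant operand.  As a stand-alone sketch of that range table, following the comment in the patch, the selection can be written without any GCC types.  The function names here are hypothetical, and the register case (which falls through to normal operand printing in the patch) is left out.

```c
#include <stdio.h>

/* Hypothetical stand-alone version of the constant case of 'P':
   write the additive mnemonic and the adjusted operand for adding
   VAL into BUF, which holds LEN bytes.  */
void
sketch_print_add (char *buf, size_t len, int val)
{
  if (val < -65535 || val > 65535)
    snprintf (buf, len, "Add.d %d", val);     /* Full 32-bit operand.  */
  else if (val <= -64)
    snprintf (buf, len, "Subu.w %d", -val);   /* -65535..-64 */
  else if (val <= -1)
    snprintf (buf, len, "Subq %d", -val);     /* -63..-1 */
  else if (val <= 63)
    snprintf (buf, len, "Addq %d", val);      /* 0..63 */
  else
    snprintf (buf, len, "Addu.w %d", val);    /* 64..65535 */
}

/* Hypothetical stand-alone version of 'q': the quick form applies to
   -31..31, anything else uses the ".d" size suffix.  */
const char *
sketch_size_suffix (int val)
{
  return (val >= -31 && val <= 31) ? "q" : ".d";
}
```

Negative constants are printed negated because the mnemonic switches from add to subtract, which is why the same value can use the short quick or 16-bit forms in both directions.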

CRIS atomics revisited 4/4: give up on alignment of atomic data, RFC for is_lock_free hook

2012-07-15 Thread Hans-Peter Nilsson

Well, give up by default that is, and fix it up in a helper
function in glibc to hold a global byte-sized atomic lock for
the duration.  (Sorry!)  Yes, this means that
fold_builtin_atomic_always_lock_free is wrong.  It knows about
alignment in general but doesn't handle the case where the
default alignment of the underlying type is too small for atomic
accesses.  It should probably be augmented by a target hook;
alternatively, change the allow_libcall argument in the call to
can_compare_and_swap_p to false.  I guess I should open a PR for
this and add a test-case.  Later.

Too many library API writers don't cater for the possibility
that atomic (lockless) data may need to have certain properties
that may not be matched by the basic underlying data type,
specifically alignment, and fixing the failing instances by hand
is...challenging.  About half the cases have the atomic data
defined in the proximity of the atomic operations, and are
easily locally fixable.  The other half are of increasing
complexity; may have the data defined elsewhere, where the need
for atomicity is surprising and fixing it would be a kludge.
(But with a proper API this could be easily handled, e.g. if a
data type defined specifically for the purpose was used; one
different from the underlying type or other common derived
type used in the library.)

So, we'll change things for cris*-linux.  By default, call a
helper function.  Users can change the default at the caller
site where atomic alignment is known good or where there is
interest in fixing it up when failure is seen; executing a trap
insn was the old default.

Regarding changing fold_builtin_atomic_always_lock_free or
adding a hook, I posit that a better default is to assume atomic
data has to be naturally aligned on *all* existing GCC targets
to accomplish lockless operations, at least for those that don't
just punt to a system call, so maybe
fold_builtin_atomic_always_lock_free should be changed to check
that, by default.  Now, it just assumes that the default
type-alignment is ok and that only a smaller alignment is not
always atomic.  People with counter-examples are asked to please
explain how the counter-example handles data straddling a page
boundary.  Yes, it can be done, but how does the kernel-equivalent
accomplish atomicity; are the pages locked while the instruction
(assumed to cause an exception) is emulated, or is kernel
re-entrance impossible or what?

The default remains the same for non-*-linux-* cris-* and
crisv32-* subtargets, since the code compiled for those targets
is expected to have a different focus, one where fixing
non-aligned data definitions is feasible and desirable.

I deliberately make it optional and use weasel-wording whether
the library functions are actually called or an atomic insn
sequence emitted when suitable; when GCC knows the alignment of
the data (for example, for local static data or through
deliberate attributes) it should be allowed to emit the atomic
instruction sequence even without alignment checks.  Right now
(or maybe it was just the 4.7 branch), GCC's handling of
alignment is so poor that the emitted alignment checks (those
that conditionally execute a trap insn) aren't eliminated for
atomic data with explicit large-enough attribute declarations.
From what I (with limited tree-foo) could see, IIRC basically
everything about alignment for the specific data is discarded
and the underlying default type alignment is reported.  (Right,
a PR is in order; I know I've entered a PR for the related
__alignof__.)

gcc:
* config/cris/cris.c (cris_init_libfuncs): Handle initialization
of library functions for basic atomic compare-and-swap.
* config/cris/cris.h (TARGET_ATOMICS_MAY_CALL_LIBFUNCS): New macro.
* config/cris/cris.opt (munaligned-atomic-may-use-library): New
option.
* config/cris/sync.md ("atomic_fetch_<atomic_op_name><mode>")
("cris_atomic_fetch_<atomic_op_name><mode>_1")
("atomic_compare_and_swap<mode>")
("cris_atomic_compare_and_swap<mode>_1"): Make
conditional on TARGET_ATOMICS_MAY_CALL_LIBFUNCS for
sizes larger than byte.

gcc/testsuite:
* gcc.target/cris/sync-2i.c, gcc.target/cris/sync-2s.c,
gcc.target/cris/sync-3i.c, gcc.target/cris/sync-3s.c,
gcc.target/cris/sync-4i.c, gcc.target/cris/sync-4s.c,
gcc.target/cris/sync-1-v10.c,
gcc.target/cris/sync-1-v32.c: For cris*-*-linux*, also
pass -mno-unaligned-atomic-may-use-library.
* gcc.target/cris/sync-xchg-1.c: New test.

diff --git gcc/config/cris/cris.c gcc/config/cris/cris.c
index 22b254f..e4c11fd 100644
--- gcc/config/cris/cris.c
+++ gcc/config/cris/cris.c
@@ -3130,6 +3176,16 @@ cris_init_libfuncs (void)
   set_optab_libfunc (udiv_optab, SImode, "__Udiv");
   set_optab_libfunc (smod_optab, SImode, "__Mod");
   set_optab_libfunc (umod_optab, SImode, "__Umod");
+
+  /* Atomic data being unaligned is unfortunately a reality.
+ Deal with it.  */
+  if (TARGET_ATOMICS_MAY_CALL_LIBFUNCS)
+
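
The mail above describes falling back to a helper function that holds a global byte-sized lock while it performs the compare-and-swap on possibly misaligned data.  A minimal sketch of that idea in C11 follows.  It is not the actual glibc or libgcc helper; the names sketch_lock and sketch_cas_uint are invented for this illustration, and the result is atomic only with respect to other callers that take the same lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* One global byte-sized lock shared by all fallback operations.  */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

/* Hypothetical fallback: compare-and-swap an unsigned int stored at
   PTR, which may be misaligned.  Returns true and stores DESIRED if
   the stored value equalled *EXPECTED; otherwise copies the stored
   value into *EXPECTED and returns false.  */
bool
sketch_cas_uint (void *ptr, unsigned int *expected, unsigned int desired)
{
  unsigned int current;
  bool ok;

  /* Spin until the global lock is ours.  */
  while (atomic_flag_test_and_set_explicit (&sketch_lock,
                                            memory_order_acquire))
    ;

  memcpy (&current, ptr, sizeof current);   /* Safe for any alignment.  */
  ok = (current == *expected);
  if (ok)
    memcpy (ptr, &desired, sizeof desired);
  else
    *expected = current;

  atomic_flag_clear_explicit (&sketch_lock, memory_order_release);
  return ok;
}
```

The cost of this design is that every fallback operation in the process serializes on one lock, which is the trade-off the mail apologizes for.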

Fixing gcc.c-torture/compile/pr44707.c for CRIS v32 1/2.

2012-07-15 Thread Hans-Peter Nilsson

Buglet in cris_preferred_reload_class, incidental, apparently
without effect at least regarding failing test-cases.  A class
disjoint from the input was returned as "preferred".  It could
arguably be gcc_asserted as a sanity-check by the caller that
the returned class is a subset of the original class.  ...and I
guess I'll add such a gcc_assert *inside*
cris_preferred_reload_class.  Later.  No regressions, cris-elf
and crisv32-elf.  Committed.

gcc:
* config/cris/cris.c (cris_preferred_reload_class):
Don't return GENERAL_REGS as preferred to MOF_SRP_REGS.

Index: gcc/config/cris/cris.c
===
--- gcc/config/cris/cris.c  (revision 189470)
+++ gcc/config/cris/cris.c  (working copy)
@@ -1503,6 +1550,7 @@ cris_preferred_reload_class (rtx x ATTRI
 {
   if (rclass != ACR_REGS
   && rclass != MOF_REGS
+  && rclass != MOF_SRP_REGS
   && rclass != SRP_REGS
   && rclass != CC0_REGS
   && rclass != SPECIAL_REGS)


brgds, H-P
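
The sanity check suggested above, that a preferred reload class must be a subset of the class it was derived from, is easy to state when register classes are viewed as bit sets.  The following is only a sketch under that assumption.  The bit layout is invented for the example and does not match the real CRIS port, and sketch_class_subset_p is a hypothetical name.

```c
#include <stdbool.h>

/* One bit per hard register; the layout below is an assumption made
   for this sketch only.  */
typedef unsigned int sketch_regset;

#define SK_GENERAL_REGS  0x0000ffffu
#define SK_MOF_REGS      0x00010000u
#define SK_SRP_REGS      0x00020000u
#define SK_MOF_SRP_REGS  (SK_MOF_REGS | SK_SRP_REGS)

/* Return true if every register in SUB is also in SUPER.  */
bool
sketch_class_subset_p (sketch_regset sub, sketch_regset super)
{
  return (sub & ~super) == 0;
}
```

With this layout, returning the general registers as "preferred" for the combined MOF/SRP class fails the check, which is the situation the patch removes.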

Fixing gcc.c-torture/compile/pr44707.c for CRIS v32 2/2: RFC: CONSTANT_ADDRESS_P and its default are evil!

2012-07-15 Thread Hans-Peter Nilsson

I think CONSTANT_ADDRESS_P can and should be eliminated,
replaced by something like
 CONSTANT_P (x) && targetm.legitimate_address_p (QImode, x, false)
(or QImode replaced by the known used mode) in the code
currently calling it.

It should, because the default definition is redundant and evil;
easy to miss for targets where (mem (const x)) is not valid for
any arbitrary generic x (symbol_ref, label_ref or const_int,
including offset ones, i.e. (plus x (const_int N))).

This is the case for CRIS v32, for which only (mem reg) and (mem
(post_inc reg)) are valid.  Like ia64 it has no offsettable
addressing mode.  For example, the constraint in
gcc.c-torture/compile/pr44707.c of "nro" can only match for the
"r" part.

If your target fails gcc.c-torture/compile/pr44707.c, this might
be the reason.

No regressions for cris-elf nor crisv32-elf; fixes
gcc.c-torture/compile/pr44707.c for the latter.  Committed.

* config/cris/cris-protos.h (cris_legitimate_address_p): Declare.
* config/cris/cris.h (CONSTANT_ADDRESS_P): Define in terms of
CONSTANT_P and cris_legitimate_address_p.
* config/cris/cris.c (cris_legitimate_address_p): Make non-static.

Index: config/cris/cris.c
===
--- config/cris/cris.c  (revision 189506)
+++ config/cris/cris.c  (working copy)
@@ -127,8 +127,6 @@ static void cris_init_libfuncs (void);
 
 static reg_class_t cris_preferred_reload_class (rtx, reg_class_t);
 
-static bool cris_legitimate_address_p (enum machine_mode, rtx, bool);
-
 static int cris_register_move_cost (enum machine_mode, reg_class_t, reg_class_t);
 static int cris_memory_move_cost (enum machine_mode, reg_class_t, bool);
 static bool cris_rtx_costs (rtx, int, int, int, int *, bool);
@@ -1414,7 +1412,7 @@ cris_biap_index_p (const_rtx x, bool str
here (but is thankfully a general_operand in itself).  A local PIC
symbol is valid for the plain "symbol + offset" case.  */
 
-static bool
+bool
 cris_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
 {
   const_rtx x1, x2;
Index: config/cris/cris.h
===
--- config/cris/cris.h  (revision 189504)
+++ config/cris/cris.h  (working copy)
@@ -778,6 +778,9 @@ struct cum_args {int regs;};
 
 #define HAVE_POST_INCREMENT 1
 
+#define CONSTANT_ADDRESS_P(X) \
+  (CONSTANT_P (X) && cris_legitimate_address_p (QImode, X, false))
+
 /* Must be a compile-time constant, so we go with the highest value
among all CRIS variants.  */
 #define MAX_REGS_PER_ADDRESS 2
Index: config/cris/cris-protos.h
===
--- config/cris/cris-protos.h   (revision 189499)
+++ config/cris/cris-protos.h   (working copy)
@@ -40,6 +40,7 @@ extern bool cris_base_p (const_rtx, bool
 extern bool cris_base_or_autoincr_p (const_rtx, bool);
 extern bool cris_bdap_index_p (const_rtx, bool);
 extern bool cris_biap_index_p (const_rtx, bool);
+extern bool cris_legitimate_address_p (enum machine_mode, rtx, bool);
 extern bool cris_store_multiple_op_p (rtx);
 extern bool cris_movem_load_rest_p (rtx, int);
 extern void cris_asm_output_symbol_ref (FILE *, rtx);

brgds, H-P
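
The CONSTANT_ADDRESS_P definition in the patch above combines two checks: the operand must be a constant, and it must also be accepted as an address. The following is a minimal C sketch of that composition, not the real GCC code. The rtx type, CONSTANT_P, the body of cris_legitimate_address_p and every toy_/TOY_ name are hypothetical stand-ins; which constants the real function rejects is not modelled here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals; illustration only.  */
enum machine_mode { QImode, HImode, SImode };
enum toy_kind { TOY_REG, TOY_CONST_OK, TOY_CONST_BAD_ADDR };
struct toy_rtx { enum toy_kind kind; };
typedef struct toy_rtx *rtx;

/* Toy CONSTANT_P: everything except a register counts as constant.  */
#define CONSTANT_P(X) ((X)->kind != TOY_REG)

/* Toy address check (assumption): registers and "good" constants are
   valid addresses; TOY_CONST_BAD_ADDR models a constant that is not.  */
bool
cris_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
{
  (void) mode;
  (void) strict;
  return x->kind != TOY_CONST_BAD_ADDR;
}

/* Same shape as the macro added to cris.h in the patch.  */
#define CONSTANT_ADDRESS_P(X) \
  (CONSTANT_P (X) && cris_legitimate_address_p (QImode, X, false))
```

In this toy model a register is a valid address but not a constant, and TOY_CONST_BAD_ADDR is a constant but not a valid address, so the macro rejects both and accepts only TOY_CONST_OK.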

[Ping, ARM]PR53189: optimizations of 64bit logic operation with constant

2012-07-15 Thread Carrot Wei

Hi

The following patches implement the optimizations suggested by
PR53189 (optimizations of 64-bit logic operations with constants).
Could a maintainer please review them?

http://gcc.gnu.org/ml/gcc-patches/2012-07/msg00087.html
http://gcc.gnu.org/ml/gcc-patches/2012-07/msg00169.html
http://gcc.gnu.org/ml/gcc-patches/2012-07/msg00226.html

thanks
Carrot

Re: CRIS atomics revisited 4/4: give up on alignment of atomic data

2012-07-15 Thread Hans-Peter Nilsson

> From: Hans-Peter Nilsson 
> Date: Mon, 16 Jul 2012 05:49:00 +0200

> gcc:

>   * config/cris/sync.md ("atomic_fetch_")
>   ("cris_atomic_fetch__1")
>   ("atomic_compare_and_swap")
>   ("cris_atomic_compare_and_swap_1"): Make
>   conditional on TARGET_ATOMICS_MAY_CALL_LIBFUNCS for
>   sizes larger than byte.

A sync goof (the VC kind): the committed and sent patch, but
not the changelog, was missing the first hunk, now committed:

Index: config/cris/sync.md
===
--- config/cris/sync.md (revision 189504)
+++ config/cris/sync.md (working copy)
@@ -101,7 +101,7 @@ (define_expand "atomic_fetch_")
(match_operand 3)
(atomic_op:BWD (match_dup 0) (match_dup 1))]
-  ""
+  "mode == QImode || !TARGET_ATOMICS_MAY_CALL_LIBFUNCS"
 {
   enum memmodel mmodel = (enum memmodel) INTVAL (operands[3]);
 

brgds, H-P
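
The condition added in the hunk above can be read as a small truth table: the expander stays enabled for byte-sized operations, and for larger sizes only when TARGET_ATOMICS_MAY_CALL_LIBFUNCS is off. The following is a rough C sketch of that predicate. The toy_ names are hypothetical, and the target flag is modelled as a plain boolean parameter rather than GCC's real option machinery.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's machine modes; illustration only.  */
enum toy_mode { TOY_QImode, TOY_HImode, TOY_SImode };

/* Mirrors the condition in the hunk: the expander is enabled for
   byte-sized operations always, and for larger sizes only when the
   target is not allowed to call library functions.  */
bool
toy_atomic_expander_enabled (enum toy_mode mode, bool may_call_libfuncs)
{
  return mode == TOY_QImode || !may_call_libfuncs;
}
```

When the predicate is false the expander is unavailable, and presumably a library call is used for the operation instead.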

Re: [PATCH][MIPS] NetLogic XLP scheduling

2012-07-15 Thread Chung-Lin Tang

On 2012/7/16 12:28 AM, Richard Sandiford wrote:
> Chung-Lin Tang  writes:
>> This patch adds scheduling support for the NetLogic XLP, including a new
>> pipeline description, and associated changes.
>>
>> Aside from the new xlp.md description file, there are also some sync
>> primitive attribute modifications, for better scheduling of sync loops
>> (Maxim should be able to better explain this).
> 
> Rather than add a "type" attribute to each sync loop, please just add:
> 
> (not (eq_attr "sync_mem" "none"))
> (symbol_ref "syncloop")
> 
> to the default value of the "type" attribute.  You'll probably need
> to swap the order of the sync* attributes with the "type" attribute
> in order for this to compile.
> 
> The patch is effectively changing the type of the sync loops from
> "unknown" to "syncloop".  That's certainly OK, but you'll need to
> add "syncloop" to the "unknown" reservations of all other schedulers
> (except for generic.md, where what you've done instead is fine).
> It might be easier if you split out the addition of syncloop
> as a separate patch.

I'll leave it to Maxim to respond to the sync parts.

>> Other generic changes include a new "hilo" insn attribute, to mark which
>> of HI/LO does a m[ft]hilo insn access.
> 
> The way other schedulers handle this is with things like:
> 
> (define_insn_reservation "ir_sb1_mfhi" 1
>   (and (eq_attr "cpu" "sb1,sb1a")
>        (and (eq_attr "type" "mfhilo")
>             (not (match_operand 1 "lo_operand"))))
>   "sb1_ex1")
> 
> which seems simpler.  mfhilo and mthilo are required to read operand 1
> and write to operand 0 (respectively) in order to support this kind of
> construct.
> 
> That said, even the above is a hold-over from when we tried to allow
> high registers to store independent values.  These days we can be a bit
> more precise, as with the patch below.  (As the comment says:
> 
>;; If a doubleword move uses these expensive instructions,
>;; it is usually better to schedule them in the same way
>;; as the singleword form, rather than as "multi".
> 
> I'm continuing to assume that mflo and mtlo are the best type choices
> for unsplit double-register moves.  That path should be very rarely taken
> outside of MIPS16 anyway -- just by sched1 if hi and lo are exposed
> directly -- and no current scheduler tries to model a doubleword hi/lo
> move separately from single-register ones.  The information is available
> via the dword_mode attribute if required.)

I suppose this means that moves should almost never actually be generated
as mfhi/mthi under normal conditions?

> Tested on mips64-elf, and by making sure that there were no changes in
> -O2 output for a recent set of cc1 .ii files.  Applied.
> 
> I'm probably punishing you for being honest here, but the only other
> thing is that you've listed NetLogic Microsystems Inc. as one of the
> authors.  I think that means they'll need to sign a copyright assignment.
> Have they already done that?

They have assigned the copyright to Mentor Graphics, so it should mean
the code can be contributed by us.

Thanks,
Chung-Lin

Re: [PATCH][MIPS] NetLogic XLP scheduling

2012-07-15 Thread Maxim Kuvyrkov

On 16/07/2012, at 6:37 PM, Chung-Lin Tang wrote:

> On 2012/7/16 12:28 AM, Richard Sandiford wrote:
>> Chung-Lin Tang  writes:
>>> This patch adds scheduling support for the NetLogic XLP, including a new
>>> pipeline description, and associated changes.
>>> 
>>> Aside from the new xlp.md description file, there are also some sync
>>> primitive attribute modifications, for better scheduling of sync loops
>>> (Maxim should be able to better explain this).
>> 
>> Rather than add a "type" attribute to each sync loop, please just add:
>> 
>>(not (eq_attr "sync_mem" "none"))
>>(symbol_ref "syncloop")
>> 
>> to the default value of the "type" attribute.  You'll probably need
>> to swap the order of the sync* attributes with the "type" attribute
>> in order for this to compile.
>> 
>> The patch is effectively changing the type of the sync loops from
>> "unknown" to "syncloop".  That's certainly OK, but you'll need to
>> add "syncloop" to the "unknown" reservations of all other schedulers
>> (except for generic.md, where what you've done instead is fine).
>> It might be easier if you split out the addition of syncloop
>> as a separate patch.
> 
> I'll leave it to Maxim to respond to the sync parts.

Richard, that's indeed simpler, thanks.

Chung-Lin, I'll try to make a patch for the patch in the next couple of days 
and will send it to you.  Let me know if you'd rather fix this yourself.

...

>> Tested on mips64-elf, and by making sure that there were no changes in
>> -O2 output for a recent set of cc1 .ii files.  Applied.
>> 
>> I'm probably punishing you for being honest here, but the only other
>> thing is that you've listed NetLogic Microsystems Inc. as one of the
>> authors.  I think that means they'll need to sign a copyright assignment.
>> Have they already done that?
> 
> They have assigned the copyright to Mentor Graphics, so it should mean
> the code can be contributed by us.

That is correct.  NetLogic developed the original xlp.md description, which 
Chung-Lin essentially rewrote.  In any case, Mentor has copyright assignment 
for the original xlp.md specifically so that we can contribute this upstream.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
