Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs

2012-06-28 Thread Alexandre Oliva
On Jun 27, 2012, Richard Henderson  wrote:

> On 06/26/2012 01:54 PM, Alexandre Oliva wrote:
>> +  track_stack_pointer (dst, src1, src2);

> Why does this function return a value then?

During testing, I used an assert on the return value to catch cases that
couldn't be handled.  The comments before that function say:

+   ??? The return value, that was useful during testing, ended up
+   unused, but this single-use static function will be inlined, and
+   then the return value computation will be optimized out, so I'm
+   leaving it in.
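
The reasoning in that comment can be illustrated with a minimal sketch. The names below (track_sum, use_checked, use_unchecked) are hypothetical and are not taken from the patch. Whether a compiler actually removes the unused return-value computation depends on inlining and optimization level; the sketch assumes it does and does not verify it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: does its real work through *dst and also
   reports whether the case was one it could handle.  */
static bool
track_sum (int *dst, int src1, int src2)
{
  bool handled = (src1 >= 0 && src2 >= 0);
  *dst = handled ? src1 + src2 : 0;
  return handled;
}

/* Testing-time caller: assert on the return value to catch cases
   that could not be handled.  */
int
use_checked (int a, int b)
{
  int dst;
  bool handled = track_sum (&dst, a, b);
  assert (handled);
  (void) handled;  /* Avoid an unused-variable warning under NDEBUG.  */
  return dst;
}

/* Final caller: the return value is ignored, so once the helper is
   inlined the computation of `handled' feeds nothing but the
   conditional store to *dst.  */
int
use_unchecked (int a, int b)
{
  int dst;
  track_sum (&dst, a, b);
  return dst;
}
```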

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Alexandre Oliva
On Jun 27, 2012, Mike Stump  wrote:

> On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
>> Why?  We don't demand a working plugin.  Indeed, we disable the use of
>> the plugin if we find a linker that doesn't support it.  We just don't
>> account for the possibility of finding a linker that supports plugins,
>> but that doesn't support the one we'll build later.

> If this is the preferred solution, then having configure check the
> 64-bitness of ld and turning off the plugin altogether on mismatches
> sounds like a reasonable course of action to me.

I'd be very surprised if I asked for an i686 native build to package and
install elsewhere, and didn't get a plugin just because the build-time
linker wouldn't have been able to run the plugin.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Wed, Jun 27, 2012 at 02:37:08PM -0700, Richard Henderson wrote:
> 
> I was sitting on this patch until I got around to fixing up Jakub's
> existing vector divmod code to use it.  But seeing as how he's adding
> more uses, I think it's better to get it in earlier.
> 
> Tested via a patch sent under separate cover that changes
> __builtin_alpha_umulh to immediately fold to MULT_HIGHPART_EXPR.

Thanks.  Here is an incremental patch on top of my patch from yesterday
which expands some of the vector divisions/modulos using MULT_HIGHPART_EXPR
instead of VEC_WIDEN_MULT_*_EXPR + VEC_PERM_EXPR if the backend supports that.
Improves code generated for ushort or short / or % on i?86 (slightly
complicated by the fact that unfortunately even -mavx2 doesn't support
vector by vector shifts for V{8,16}HImode (nor V{16,32}QImode), XOP does
though).

Ok for trunk?

I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and
vectorizing it as either of the sequences next.
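
For readers unfamiliar with the idiom, here is a scalar sketch of what a high-part multiply buys for division by a constant. It is not code from the patch: the function names are made up, and the divisor 3 with multiplier 0xAAAB and post-shift 1 is just one illustrative choice for unsigned 16-bit operands.

```c
#include <assert.h>
#include <stdint.h>

/* High half of the 32-bit product of two unsigned 16-bit values:
   the scalar analogue of "t1 h* ml".  */
static uint16_t
mulhi_u16 (uint16_t a, uint16_t b)
{
  return (uint16_t) (((uint32_t) a * (uint32_t) b) >> 16);
}

/* x / 3 for unsigned 16-bit x, using ml = 0xAAAB = ceil (2^17 / 3),
   no pre-shift and post_shift = 1.  */
uint16_t
div3_u16 (uint16_t x)
{
  uint16_t t2 = mulhi_u16 (x, 0xAAAB);
  return (uint16_t) (t2 >> 1);
}

/* Exhaustive self-check over every 16-bit input.  */
void
check_div3_u16 (void)
{
  for (uint32_t x = 0; x <= 0xFFFF; x++)
    assert (div3_u16 ((uint16_t) x) == x / 3);
}
```

The widening variant computes the same high half by forming the full double-width products and then selecting the upper halves with a permutation, which is why a direct high-part multiply can replace that pair when the target provides one.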

2012-06-28  Jakub Jelinek  

PR tree-optimization/53645
* tree-vect-generic.c (expand_vector_divmod): Use MULT_HIGHPART_EXPR
instead of VEC_WIDEN_MULT_{HI,LO}_EXPR followed by VEC_PERM_EXPR
if possible.

* gcc.c-torture/execute/pr53645-2.c: New test.

--- gcc/tree-vect-generic.c.jj  2012-06-28 08:32:50.0 +0200
+++ gcc/tree-vect-generic.c 2012-06-28 09:10:51.436748834 +0200
@@ -455,7 +455,7 @@ expand_vector_divmod (gimple_stmt_iterat
   unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type)));
   optab op;
   tree *vec;
-  unsigned char *sel;
+  unsigned char *sel = NULL;
   tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem;
 
   if (prec > HOST_BITS_PER_WIDE_INT)
@@ -744,26 +744,34 @@ expand_vector_divmod (gimple_stmt_iterat
   if (mode == -2 || BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
 return NULL_TREE;
 
-  op = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR, type, optab_default);
-  if (op == NULL
-  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
-return NULL_TREE;
-  op = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR, type, optab_default);
-  if (op == NULL
-  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
-return NULL_TREE;
-  sel = XALLOCAVEC (unsigned char, nunits);
-  for (i = 0; i < nunits; i++)
-sel[i] = 2 * i + (BYTES_BIG_ENDIAN ? 0 : 1);
-  if (!can_vec_perm_p (TYPE_MODE (type), false, sel))
-return NULL_TREE;
-  wider_type
-= build_vector_type (build_nonstandard_integer_type (prec * 2, unsignedp),
-nunits / 2);
-  if (GET_MODE_CLASS (TYPE_MODE (wider_type)) != MODE_VECTOR_INT
-  || GET_MODE_BITSIZE (TYPE_MODE (wider_type))
-!= GET_MODE_BITSIZE (TYPE_MODE (type)))
-return NULL_TREE;
+  op = optab_for_tree_code (MULT_HIGHPART_EXPR, type, optab_default);
+  if (op != NULL
+  && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+wider_type = NULL_TREE;
+  else
+{
+  op = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR, type, optab_default);
+  if (op == NULL
+ || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+   return NULL_TREE;
+  op = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR, type, optab_default);
+  if (op == NULL
+ || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+   return NULL_TREE;
+  sel = XALLOCAVEC (unsigned char, nunits);
+  for (i = 0; i < nunits; i++)
+   sel[i] = 2 * i + (BYTES_BIG_ENDIAN ? 0 : 1);
+  if (!can_vec_perm_p (TYPE_MODE (type), false, sel))
+   return NULL_TREE;
+  wider_type
+   = build_vector_type (build_nonstandard_integer_type (prec * 2,
+unsignedp),
+nunits / 2);
+  if (GET_MODE_CLASS (TYPE_MODE (wider_type)) != MODE_VECTOR_INT
+ || GET_MODE_BITSIZE (TYPE_MODE (wider_type))
+!= GET_MODE_BITSIZE (TYPE_MODE (type)))
+   return NULL_TREE;
+}
 
   cur_op = op0;
 
@@ -772,7 +780,7 @@ expand_vector_divmod (gimple_stmt_iterat
 case 0:
   gcc_assert (unsignedp);
   /* t1 = oprnd0 >> pre_shift;
-t2 = (type) (t1 w* ml >> prec);
+t2 = t1 h* ml;
 q = t2 >> post_shift;  */
   cur_op = add_rshift (gsi, type, cur_op, pre_shifts);
   if (cur_op == NULL_TREE)
@@ -801,30 +809,37 @@ expand_vector_divmod (gimple_stmt_iterat
   for (i = 0; i < nunits; i++)
 vec[i] = build_int_cst (TREE_TYPE (type), mulc[i]);
   mulcst = build_vector (type, vec);
-  for (i = 0; i < nunits; i++)
-vec[i] = build_int_cst (TREE_TYPE (type), sel[i]);
-  perm_mask = build_vector (type, vec);
-  mhi = gimplify_build2 (gsi, VEC_WIDEN_MULT_HI_EXPR, wider_type,
-cur_op, mulcst);
-  mhi = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, mhi);
-  mlo = gimplify_build2 (gsi, VEC_WIDEN_MULT_LO_EXPR, wider_type,
-cur_op, mulcst);
-  mlo = gimplify_build1 (gsi, VIEW_CONVERT

Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
> On Jun 27, 2012, Mike Stump  wrote:
> 
> > On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
> >> Why?  We don't demand a working plugin.  Indeed, we disable the use of
> >> the plugin if we find a linker that doesn't support it.  We just don't
> >> account for the possibility of finding a linker that supports plugins,
> >> but that doesn't support the one we'll build later.
> 
> > If this is the preferred solution, then having configure check the
> > 64-bitness of ld and turning off the plugin altogether on mismatches
> > sounds like a reasonable course of action to me.
> 
> I'd be very surprised if I asked for an i686 native build to package and
> install elsewhere, and didn't get a plugin just because the build-time
> linker wouldn't have been able to run the plugin.

Not disable plugin support altogether, but disable assuming the linker
supports the plugin.  If the user explicitly passes -f{,no-}use-linker-plugin,
it is his problem to make sure that the linker has support.  But the problem
is that when the build-time ld is new enough, gcc assumes it supports
the plugin.  And that is not the case.

Jakub


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-28 Thread Alexandre Oliva
On Jun 27, 2012, Christophe Lyon  wrote:

>> I looked at the patch in there, and I'm afraid I don't understand how it
>> achieves the ChangeLog-suggested purpose of ensuring -O2 makes it to
>> C*FLAGS_FOR_TARGET, when all it appears to do is to prepend -g.  Can you
>> please clarify?

> With more context, the current code fragment is:
>   CFLAGS_FOR_TARGET=$CFLAGS
>   case " $CFLAGS " in
> *" -O2 "*) ;;
> *) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
>   esac
>   case " $CFLAGS " in
> *" -g "* | *" -g3 "*) ;;
> *) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
>   esac

> where pre-pending -g discards -O2 if it was pre-pended just above.

I see, thanks for clarifying.

I suggest changing both occurrences of $CFLAGS within the case
statements, then; the more uniform logic is more appealing to me.
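
To make the difference concrete, here is a small model of the two prepend steps, written in C rather than shell purely for illustration. The function names and the `fixed` switch are hypothetical; only the flag logic mirrors the quoted configure fragment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nonzero if FLAGS contains FLAG as a whole space-separated word,
   mirroring the shell test  case " $CFLAGS " in *" -O2 "*)  */
static int
has_flag (const char *flags, const char *flag)
{
  char padded[256], needle[64];
  snprintf (padded, sizeof padded, " %s ", flags);
  snprintf (needle, sizeof needle, " %s ", flag);
  return strstr (padded, needle) != NULL;
}

/* With FIXED == 0 each prepend restarts from CFLAGS (the quoted
   fragment); with FIXED != 0 each prepend builds on the value
   accumulated so far (the suggested change).  */
void
flags_for_target (const char *cflags, int fixed, char *out, size_t outsz)
{
  char tmp[256];
  snprintf (out, outsz, "%s", cflags);
  if (!has_flag (cflags, "-O2"))
    {
      snprintf (tmp, sizeof tmp, "-O2 %s", fixed ? out : cflags);
      snprintf (out, outsz, "%s", tmp);
    }
  if (!has_flag (cflags, "-g") && !has_flag (cflags, "-g3"))
    {
      snprintf (tmp, sizeof tmp, "-g %s", fixed ? out : cflags);
      snprintf (out, outsz, "%s", tmp);
    }
}

/* The second prepend loses -O2 unless it starts from the
   accumulated value.  */
void
check_flags_for_target (void)
{
  char out[256];
  flags_for_target ("-Wall", 0, out, sizeof out);
  assert (strcmp (out, "-g -Wall") == 0);
  flags_for_target ("-Wall", 1, out, sizeof out);
  assert (strcmp (out, "-g -O2 -Wall") == 0);
}
```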

Patch approved with these changes.

Thanks,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[Patch, Fortran] Handle C_F_POINTER with a noncontiguous SHAPE=

2012-06-28 Thread Tobias Burnus
This patch generates inline code for C_F_POINTER with an array argument. 
One reason is that GCC didn't handle SHAPE= arguments which were 
noncontiguous.


However, the real motivation is the fortran-dev branch with the new 
array-descriptor: C_F_POINTER needs then to set the stride multiplier, 
but as it doesn't know the size of a single element, one had either to 
pass the value or handle it partially in the front end. Hence, doing it 
all in the front-end was simpler. The C_F_Pointer issue is the main 
cause for failing test cases on the branch, though several other issues 
remain.


Built and regtested on x86-64-linux.
OK for the trunk?

* * *

If you wonder why I had some problems before: 
http://gcc.gnu.org/ml/fortran/2012-04/msg00115.html


The reason is that I called pushlevel() twice for "body":

+  gfc_start_block (&body);
+  gfc_start_scalarized_body (&loop, &body);


I removed the first one - and now it works. (Well, there were also some 
other issues in the patch, which are now fixed.)


Tobias

PS: After committal, I will update the patch for the branch; let's see 
how many failures will remain on the branch.


PPS: The offset handling in gfortran is really complicated. I wonder 
whether we have to (or at least should) change it for the new array 
descriptor.

2012-06-27  Tobias Burnus  

	* trans-expr.c (conv_isocbinding_procedure): Generate c_f_pointer code
	inline.

2012-06-27  Tobias Burnus  


	* gfortran.dg/c_f_pointer_shape_tests_5.f90: New.
	* gfortran.dg/c_f_pointer_tests_3.f90: Update
	scan-tree-dump-times pattern.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 7d1a6d4..9ebde9d 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -3307,14 +3351,17 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym,
   
   return 1;
 }
-  else if ((sym->intmod_sym_id == ISOCBINDING_F_POINTER
-	&& arg->next->expr->rank == 0)
+  else if (sym->intmod_sym_id == ISOCBINDING_F_POINTER
 	   || sym->intmod_sym_id == ISOCBINDING_F_PROCPOINTER)
 {
-  /* Convert c_f_pointer if fptr is a scalar
-	 and convert c_f_procpointer.  */
+  /* Convert c_f_pointer and c_f_procpointer.  */
   gfc_se cptrse;
   gfc_se fptrse;
+  gfc_se shapese;
+  gfc_ss *ss, *shape_ss;
+  tree desc, dim, tmp, stride, offset;
+  stmtblock_t body, block, ifblock;
+  gfc_loopinfo loop;
 
   gfc_init_se (&cptrse, NULL);
   gfc_conv_expr (&cptrse, arg->expr);
@@ -3322,25 +3369,113 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym,
   gfc_add_block_to_block (&se->post, &cptrse.post);
 
   gfc_init_se (&fptrse, NULL);
-  if (sym->intmod_sym_id == ISOCBINDING_F_POINTER
-	  || gfc_is_proc_ptr_comp (arg->next->expr, NULL))
-	fptrse.want_pointer = 1;
+  if (arg->next->expr->rank == 0)
+	{
+	  if (sym->intmod_sym_id == ISOCBINDING_F_POINTER
+	  || gfc_is_proc_ptr_comp (arg->next->expr, NULL))
+	fptrse.want_pointer = 1;
+
+	  gfc_conv_expr (&fptrse, arg->next->expr);
+	  gfc_add_block_to_block (&se->pre, &fptrse.pre);
+	  gfc_add_block_to_block (&se->post, &fptrse.post);
+	  if (arg->next->expr->symtree->n.sym->attr.proc_pointer
+	  && arg->next->expr->symtree->n.sym->attr.dummy)
+	fptrse.expr = build_fold_indirect_ref_loc (input_location,
+		   fptrse.expr);
+ 	  se->expr = fold_build2_loc (input_location, MODIFY_EXPR,
+  TREE_TYPE (fptrse.expr),
+  fptrse.expr,
+  fold_convert (TREE_TYPE (fptrse.expr),
+		cptrse.expr));
+	  return 1;
+	}
 
-  gfc_conv_expr (&fptrse, arg->next->expr);
-  gfc_add_block_to_block (&se->pre, &fptrse.pre);
-  gfc_add_block_to_block (&se->post, &fptrse.post);
-  
-  if (arg->next->expr->symtree->n.sym->attr.proc_pointer
-	  && arg->next->expr->symtree->n.sym->attr.dummy)
-	fptrse.expr = build_fold_indirect_ref_loc (input_location,
-		   fptrse.expr);
-  
-  se->expr = fold_build2_loc (input_location, MODIFY_EXPR,
-  TREE_TYPE (fptrse.expr),
-  fptrse.expr,
-  fold_convert (TREE_TYPE (fptrse.expr),
-		cptrse.expr));
+  gfc_start_block (&block);
+
+  /* Get the descriptor of the Fortran pointer.  */
+  ss = gfc_walk_expr (arg->next->expr);
+  gcc_assert (ss != gfc_ss_terminator);
+  fptrse.descriptor_only = 1;
+  gfc_conv_expr_descriptor (&fptrse, arg->next->expr, ss);
+  gfc_add_block_to_block (&block, &fptrse.pre);
+  desc = fptrse.expr;
+
+  /* Set data value, dtype, and offset.  */
+  tmp = GFC_TYPE_ARRAY_DATAPTR_TYPE (TREE_TYPE (desc));
+  gfc_conv_descriptor_data_set (&block, desc,
+fold_convert (tmp, cptrse.expr));
+  gfc_add_modify (&block, gfc_conv_descriptor_dtype (desc),
+		  gfc_get_dtype (TREE_TYPE (desc)));
+
+  /* Start scalarization of the bounds, using the shape argument.  */
+
+  shape_ss = gfc_walk_expr (arg->next->next->expr);
+  gcc_assert (shape_ss != gfc_ss_terminator);
+  gf

Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Andreas Schwab
libgomp.texi is still using gpl.texi, although libgomp has been
relicensed to GPLv3 in 2009.  OK?

(This is the last use of gpl.texi in the gcc sources.  Perhaps it should
be removed and gpl_v3.texi renamed back to gpl.texi?)

Andreas.

* libgomp.texi: Include gpl_v3.texi instead of gpl.texi.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 29c078b..f8996f4 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -7,7 +7,7 @@
 
 
 @copying
-Copyright @copyright{} 2006, 2007, 2008, 2010, 2011 Free Software Foundation, Inc.
+Copyright @copyright{} 2006, 2007, 2008, 2010, 2011, 2012 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
@@ -1737,7 +1737,7 @@ Bugs in the GNU OpenMP implementation should be reported via
 @c GNU General Public License
 @c -
 
-@include gpl.texi
+@include gpl_v3.texi
 
 
 
-- 
1.7.11.1


-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 10:18:49AM +0200, Andreas Schwab wrote:
> libgomp.texi is still using gpl.texi, although libgomp has been
> relicensed to GPLv3 in 2009.  OK?

Yes.

> 
>   * libgomp.texi: Include gpl_v3.texi instead of gpl.texi.

Jakub


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Matthew Gretton-Dann

On 27/06/12 21:35, Andrew Pinski wrote:

On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
 wrote:

All,

This patch enables the dump-noaddr test to work in out-of-build-tree
testing.

[snip]


I created a much simpler patch which I have been meaning to submit.
I attached it for reference.


Thanks,
Andrew Pinski

ChangeLog:
* testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
an absolute dump base instead of a relative one.

Index: gcc.c-torture/unsorted/dump-noaddr.x
===
--- gcc.c-torture/unsorted/dump-noaddr.x	(revision 61452)
+++ gcc.c-torture/unsorted/dump-noaddr.x	(revision 61453)
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
  foreach option $option_list {
file delete -force dump1
file mkdir dump1
-   c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+   c-torture-compile $src "$option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
file delete -force dump2
file mkdir dump2
-   c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
+   c-torture-compile $src "$option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
foreach dump1 [lsort [glob -nocomplain dump1/*]] {
regsub dump1/ $dump1 dump2/ dump2
set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"


What I don't like about this approach is that dump1 and dump2 are created in
the current working directory.  With out-of-build-tree testing this may not
(I believe) be the same as $tmpdir (where temporaries are normally created).
Also, the current directory may already contain directories/files called
dump1 or dump2, which will get destroyed by running the testsuite.


Hence why my approach used tmpdir.

Does this reasoning make sense?

I've not committed my version yet in case I am missing something in my 
reasoning above with regards to the relationship between the current working 
directory and $tmpdir.


Thanks,

Matt

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd




RE: [PATCH] Disable loop2_invariant for -Os

2012-06-28 Thread Zhenqiang Chen
>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
>> index 03f8f61..5d8cf73 100644
>> --- a/gcc/loop-init.c
>> +++ b/gcc/loop-init.c
>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>  static bool
>>  gate_rtl_move_loop_invariants (void)
>>  {
>> +  /* In general, invariant motion can not reduce code size. But it will
>> +     change the liverange of the invariant, which increases the register
>> +     pressure and might lead to more spilling.  */
>> +  if (optimize_function_for_size_p (cfun))
>> +    return false;
>> +
>
>Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

Update it according to the comments.

Thanks!
-Zhenqiang

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f8405dd..b0e84a7 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1931,7 +1931,8 @@ move_loop_invariants (void)
   curr_loop = loop;
   /* move_single_loop_invariants for very large loops
 is time consuming and might need a lot of memory.  */
-  if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
+  if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
+ && ! optimize_loop_nest_for_size_p (loop))
move_single_loop_invariants (loop);
 }

ChangeLog:
2012-06-28  Zhenqiang Chen 

* loop-invariant.c (move_loop_invariants): Skip
move_single_loop_invariants when optimizing loop for size
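
As a reminder of what the pass does at the source level, here is a hand-written sketch of invariant motion. The function names are hypothetical, and the sketch shows only the transformation and the longer live range it creates. Whether the hoisted form ends up larger or smaller after register allocation depends on the target and on register pressure, which this example does not measure.

```c
#include <assert.h>
#include <stddef.h>

/* Invariant left in the loop: a * b is recomputed on every iteration,
   but no extra value has to stay live across the whole loop.  */
long
sum_scaled_in_loop (const long *v, size_t n, long a, long b)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += v[i] * (a * b);
  return sum;
}

/* Invariant hoisted by hand: one multiply runs instead of n, but
   `scale' is now live for the entire loop and competes for a
   register with everything else in the loop body.  */
long
sum_scaled_hoisted (const long *v, size_t n, long a, long b)
{
  long scale = a * b;
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += v[i] * scale;
  return sum;
}

/* Both forms compute the same value.  */
void
check_same_result (void)
{
  const long v[] = { 1, 2, 3, 4 };
  assert (sum_scaled_in_loop (v, 4, 3, 5) == 150);
  assert (sum_scaled_hoisted (v, 4, 3, 5) == 150);
}
```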






RE: [PATCH] Disable loop2_invariant for -Os

2012-06-28 Thread Zhenqiang Chen
>-----Original Message-----
>From: Steven Bosscher [mailto:stevenb@gmail.com]
>Sent: June 27, 2012 16:54
>To: Zhenqiang Chen
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: [PATCH] Disable loop2_invariant for -Os
>
>On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen
> wrote:
>> Hi,
>>
>> In general, invariant motion itself can not reduce code size. But it
>> will change the liverange of the invariant, which might lead to more
>> spilling.
>
>This may be true for ARM but it's not true in general. Sometimes
>loop-invariant

Benchmark tests show it also benefits MIPS, PPC and X86 for code size.

>address arithmetic, that is not exposed in GIMPLE, is profitable to hoist
>out of the loop. See e.g. PR41026 (for which I still have a patch in the queue).

>If this goes in anyway, please mention PR39837 in your ChangeLog entry.

It can not handle the case.

Thanks!
-Zhenqiang





Re: [PATCH] Move Graphite from using PPL over to ISL

2012-06-28 Thread Tobias Grosser

On 06/27/2012 05:06 PM, Richard Guenther wrote:


This merges from the graphite branch the move of PPL to ISL,
and completes it where it was lacking - thanks to Micha.
It leaves unmerged the addition of a pluto-like ISL optimizer
as well as a bugfix for stride > 1 which did not come with
a testcase.

With this patch (on top of the one requiring CLooG 0.17.0)
we will require ISL 0.10 for enabling Graphite.

I've bootstrapped and built various combinations with in-tree
and out-of-tree CLooG and ISL, so I'm pretty confident that
this works.

With out-of-tree CLooG and ISL a slightly older patch on top of its
prerequisite passed bootstrap and testing on x86_64-unknown-linux-gnu.

Currently re-bootstrapping and testing on x86_64-unknown-linux-gnu.

Ok for trunk?


Hi Richard, hi Micha,

thanks a lot for pushing this forward. Especially the fast 
implementation of the interchange heuristic was impressive!
I am fine with the general goal and think the patch is close to getting in, 
but I would like to give feedback on the interchange heuristic. I will 
try to review it today or tomorrow.


Thanks again!!

Tobias


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Carrot Wei
Hi Ramana

Thanks for the review, please see my inlined comments.

On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
 wrote:
>
> On 8 June 2012 10:12, Carrot Wei  wrote:
> > Hi
> >
> > In rtl expressions, subtracting a constant c is expressed as adding the
> > value -c, so it is also processed by adddi3, and I extended it further to
> > handle subtraction of a 64bit constant. I created an insn pattern
> > arm_subdi3_immediate to specifically represent subtraction of a 64bit
> > constant while continuing to keep the add rtl expression.
> >
>
> Sorry about the time it has taken to review this patch. Thanks for
> tackling this, but I'm not convinced that this patch is correct, and
> it can definitely be more efficient.
>
> The valid 64 bit constants would, in my opinion, be the following,
> obtained by dividing the 64 bit constant into two 32 bit halves
> (upper32 and lower32, referred to as upper and lower below):
>
>  arm_not_operand (upper) && arm_add_operand (lower) which boils down
> to the valid combinations of
>
>  adds lo ; adc hi - both positive constants.
>  adds lo ; sbc hi  - lower positive, upper negative
I assume you mean "sbc -hi" or "sbc abs(hi)", similar for following instructions

>
>  subs lo ; sbc hi - lower negative, upper negative
>  subs lo ; adc hi  - lower negative, upper positive
>
My first version did a similar thing, but in some cases subs and
adds may generate a different carry flag. Assume the low word is 0 and
the high word is negative; your method will generate

adds r0, r0, 0
sbc   r1, r1, abs(hi)

My method generates

subs r0, r0, 0
sbc   r1, r1, abs(hi)

ARM's definition of subs is

(result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);

So the subs instruction will set the carry flag, but adds will clear it,
and the two sequences finally generate different results in r1.
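
The carry difference for a zero immediate can be checked with a small model of AddWithCarry. This is a sketch following the pseudo-code quoted above, restricted to the result and the carry-out; the C function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of AddWithCarry for 32-bit operands: returns the result and
   stores the carry-out (0 or 1) in *carry.  */
static uint32_t
add_with_carry (uint32_t x, uint32_t y, unsigned carry_in, unsigned *carry)
{
  uint64_t wide = (uint64_t) x + y + carry_in;
  *carry = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* adds rd, rn, #imm  is  AddWithCarry (rn, imm, 0).  */
uint32_t
arm_adds (uint32_t rn, uint32_t imm, unsigned *carry)
{
  return add_with_carry (rn, imm, 0, carry);
}

/* subs rd, rn, #imm  is  AddWithCarry (rn, NOT (imm), 1).  */
uint32_t
arm_subs (uint32_t rn, uint32_t imm, unsigned *carry)
{
  return add_with_carry (rn, (uint32_t) ~imm, 1, carry);
}

/* With a zero immediate both leave the register unchanged, but they
   disagree on the carry flag that the following adc/sbc consumes.  */
void
check_zero_immediate (void)
{
  unsigned c;
  assert (arm_adds (5, 0, &c) == 5 && c == 0);
  assert (arm_subs (5, 0, &c) == 5 && c == 1);
}
```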

>
> Therefore I'd do the following -
>
> * Don't make *arm_adddi3 a named pattern - we don't need that.
> * Change the *addsi3_carryin_ pattern to be something like this :
>
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -1001,12 +1001,14 @@
>  )
>
>  (define_insn "*addsi3_carryin_"
> -  [(set (match_operand:SI 0 "s_register_operand" "=r")
> -       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r")
> -                         (match_operand:SI 2 "arm_rhs_operand" "rI"))
> +  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
> +       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r,r"
> +                         (match_operand:SI 2 "arm_not_operand" "rI,K"

Do you mean arm_add_operand?

>                 (LTUGEU:SI (reg: CC_REGNUM) (const_int 0]
>   "TARGET_32BIT"
> -  "adc%?\\t%0, %1, %2"
> +  "@
> +  adc%?\\t%0, %1, %2
> +  sbc%?\\t%0, %1, %#n2"
>   [(set_attr "conds" "use")]
>  )
>
> * I'd like a new const_ok_for_dimode_op function that dealt with each
> of these operations, thus your plus operation with a DImode constant
> would just be a check similar to what I've said above.

Good idea, it will make the interface cleaner. I will do it later.

> * You then don't need the new subdi3_immediate pattern and the split
> can happen after reload. Adjust predicates and constraints
> accordingly, delete it. Also please use CONST_INT_P instead of

Even if I delete the subdi3_immediate pattern, we still need the
predicates and constraints to represent the negative di numbers in
other patterns.

thanks
Carrot


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-28 Thread Christophe Lyon

On 28.06.2012 09:32, Alexandre Oliva wrote:

I suggest changing both occurrences of $CFLAGS within the case
statements, then; the more uniform logic is more appealing to me.

Patch approved with these changes.

Thanks,


Thanks; here is an updated version taking your comment into account.

Can you commit it for me? (I don't have write access.)

Thanks.

Christophe.

2012-06-28  Christophe Lyon 

* configure.ac (CFLAGS_FOR_TARGET, CXXFLAGS_FOR_TARGET): Make sure
they contain -O2.
* configure: Regenerate.

diff --git a/configure b/configure
index 083f2ce..1ab12db 100755
--- a/configure
+++ b/configure
@@ -6690,11 +6690,11 @@ if test "x$CFLAGS_FOR_TARGET" = x; then
   CFLAGS_FOR_TARGET=$CFLAGS
   case " $CFLAGS " in
 *" -O2 "*) ;;
-*) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
+*) CFLAGS_FOR_TARGET="-O2 $CFLAGS_FOR_TARGET" ;;
   esac
   case " $CFLAGS " in
 *" -g "* | *" -g3 "*) ;;
-*) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
+*) CFLAGS_FOR_TARGET="-g $CFLAGS_FOR_TARGET" ;;
   esac
 fi
 
@@ -6703,11 +6703,11 @@ if test "x$CXXFLAGS_FOR_TARGET" = x; then
   CXXFLAGS_FOR_TARGET=$CXXFLAGS
   case " $CXXFLAGS " in
 *" -O2 "*) ;;
-*) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS" ;;
+*) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS_FOR_TARGET" ;;
   esac
   case " $CXXFLAGS " in
 *" -g "* | *" -g3 "*) ;;
-*) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS" ;;
+*) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS_FOR_TARGET" ;;
   esac
 fi
 
diff --git a/configure.ac b/configure.ac
index 378e9f5..82dbe4c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2145,11 +2145,11 @@ if test "x$CFLAGS_FOR_TARGET" = x; then
   CFLAGS_FOR_TARGET=$CFLAGS
   case " $CFLAGS " in
 *" -O2 "*) ;;
-*) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
+*) CFLAGS_FOR_TARGET="-O2 $CFLAGS_FOR_TARGET" ;;
   esac
   case " $CFLAGS " in
 *" -g "* | *" -g3 "*) ;;
-*) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
+*) CFLAGS_FOR_TARGET="-g $CFLAGS_FOR_TARGET" ;;
   esac
 fi
 AC_SUBST(CFLAGS_FOR_TARGET)
@@ -2158,11 +2158,11 @@ if test "x$CXXFLAGS_FOR_TARGET" = x; then
   CXXFLAGS_FOR_TARGET=$CXXFLAGS
   case " $CXXFLAGS " in
 *" -O2 "*) ;;
-*) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS" ;;
+*) CXXFLAGS_FOR_TARGET="-O2 $CXXFLAGS_FOR_TARGET" ;;
   esac
   case " $CXXFLAGS " in
 *" -g "* | *" -g3 "*) ;;
-*) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS" ;;
+*) CXXFLAGS_FOR_TARGET="-g $CXXFLAGS_FOR_TARGET" ;;
   esac
 fi
 AC_SUBST(CXXFLAGS_FOR_TARGET)


Re: [PATCH] Disable loop2_invariant for -Os

2012-06-28 Thread Richard Guenther
On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen  wrote:
>>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
>>> index 03f8f61..5d8cf73 100644
>>> --- a/gcc/loop-init.c
>>> +++ b/gcc/loop-init.c
>>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>>  static bool
>>>  gate_rtl_move_loop_invariants (void)
>>>  {
>>> +  /* In general, invariant motion can not reduce code size. But it will
>>> +     change the liverange of the invariant, which increases the register
>>> +     pressure and might lead to more spilling.  */
>>> +  if (optimize_function_for_size_p (cfun))
>>> +    return false;
>>> +
>>
>>Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?
>
> Update it according to the comments.
>
> Thanks!
> -Zhenqiang
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index f8405dd..b0e84a7 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1931,7 +1931,8 @@ move_loop_invariants (void)
>       curr_loop = loop;
>       /* move_single_loop_invariants for very large loops
>         is time consuming and might need a lot of memory.  */
> -      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
> +      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
> +         && ! optimize_loop_nest_for_size_p (loop))
>        move_single_loop_invariants (loop);

Wait - move_single_loop_invariants itself already uses
optimize_loop_for_speed_p.
And looking down it seems to have support for tracking spill cost (possibly
only with -fira-loop-pressure) - please work out why this support is not working
for you.

Richard.

>     }
>
> ChangeLog:
> 2012-06-28  Zhenqiang Chen 
>
>        * loop-invariant.c (move_loop_invariants): Skip
>        move_single_loop_invariants when optimizing loop for size
>
>
>
>


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Ramana Radhakrishnan
On 28 June 2012 10:03, Carrot Wei  wrote:
> Hi Ramana
>
> Thanks for the review, please see my inlined comments.
>
> On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
>  wrote:
>>
>> On 8 June 2012 10:12, Carrot Wei  wrote:
>> > Hi
>> >
>> > In rtl expressions, subtracting a constant c is expressed as adding the
>> > value -c, so it is also processed by adddi3, and I extended it further to
>> > handle subtraction of a 64bit constant. I created an insn pattern
>> > arm_subdi3_immediate to specifically represent subtraction of a 64bit
>> > constant while continuing to keep the add rtl expression.
>> >
>>
>> Sorry about the time it has taken to review this patch. Thanks for
>> tackling this, but I'm not convinced that this patch is correct, and
>> it can definitely be more efficient.
>>
>> The valid 64 bit constants would, in my opinion, be the following,
>> obtained by dividing the 64 bit constant into two 32 bit halves
>> (upper32 and lower32, referred to as upper and lower below):
>>
>>  arm_not_operand (upper) && arm_add_operand (lower) which boils down
>> to the valid combinations of
>>
>>  adds lo ; adc hi - both positive constants.
>>  adds lo ; sbc hi  - lower positive, upper negative

> I assume you mean "sbc -hi" or "sbc abs(hi)", similar for following 
> instructions

hi = ~upper32

lower = lower 32 bits of the constant
hi =  ~ (upper32 bits) of the constant ( bitwise twiddle not a negate :) )

For example,

unsigned long long foo4 (unsigned long long x)
{
 return x - 0x25ULL;
}

should be
subs r0, r0, #37
sbc   r1, r1, #0

Notice that it's #0 and not #1. :)
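
The example can be checked with a scalar model of the two instructions. This is only a sketch: the helper and function names are made up, and the instruction semantics are modeled as AddWithCarry with the second operand complemented for subs and sbc. Subtracting 0x25 is adding the 64-bit constant 0xFFFFFFFFFFFFFFDB, whose upper half is 0xFFFFFFFF, so ~upper32 is 0, which is the sbc immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Model of AddWithCarry for 32-bit operands: returns the result and
   stores the carry-out (0 or 1) in *carry.  */
static uint32_t
add_with_carry (uint32_t x, uint32_t y, unsigned carry_in, unsigned *carry)
{
  uint64_t wide = (uint64_t) x + y + carry_in;
  *carry = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* x - 0x25 as the sequence  subs r0, r0, #37 ; sbc r1, r1, #0.  */
uint64_t
sub_0x25 (uint64_t x)
{
  uint32_t r0 = (uint32_t) x;
  uint32_t r1 = (uint32_t) (x >> 32);
  unsigned c;

  r0 = add_with_carry (r0, (uint32_t) ~(uint32_t) 37, 1, &c); /* subs #37 */
  r1 = add_with_carry (r1, (uint32_t) ~(uint32_t) 0, c, &c);  /* sbc  #0  */
  return ((uint64_t) r1 << 32) | r0;
}

/* adc with immediate imm and sbc with immediate ~imm give the same
   result for either value of the incoming carry.  */
void
check_adc_sbc_identity (uint32_t rn, uint32_t imm)
{
  for (unsigned cin = 0; cin <= 1; cin++)
    {
      unsigned c1, c2;
      uint32_t adc = add_with_carry (rn, imm, cin, &c1);
      uint32_t sbc = add_with_carry (rn, (uint32_t) ~(uint32_t) ~imm,
                                     cin, &c2);
      assert (adc == sbc && c1 == c2);
    }
}
```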



>
>>
>>  subs lo ; sbc hi - lower negative, upper negative
>>  subs lo ; adc hi  - lower negative, upper positive
>>
> My first version did a similar thing, but in some cases subs and
> adds may generate a different carry flag. Assume the low word is 0 and
> the high word is negative; your method will generate
>
> adds r0, r0, 0
> sbc   r1, r1, abs(hi)

No, it will generate

adds r0, r0, #0
sbc   r1, r1, ~hi

and not abs (hi)



>
> My method generates
>
> subs r0, r0, 0
> sbc   r1, r1, abs(hi)
>
> ARM's definition of subs is
>
> (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);
>
> So the subs instruction will set the carry flag, but adds will clear it,
> and the two sequences finally generate different results in r1.
>
>>
>> Therefore I'd do the following -
>>
>> * Don't make *arm_adddi3 a named pattern - we don't need that.
>> * Change the *addsi3_carryin_ pattern to be something like this :
>>
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -1001,12 +1001,14 @@
>>  )
>>
>>  (define_insn "*addsi3_carryin_"
>> -  [(set (match_operand:SI 0 "s_register_operand" "=r")
>> -       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r")
>> -                         (match_operand:SI 2 "arm_rhs_operand" "rI"))
>> +  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>> +       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r,r"
>> +                         (match_operand:SI 2 "arm_not_operand" "rI,K"
>
> Do you mean arm_add_operand?

No I mean arm_not_operand and it was a deliberate choice as explained above.

>
>>                 (LTUGEU:SI (reg: CC_REGNUM) (const_int 0]
>>   "TARGET_32BIT"
>> -  "adc%?\\t%0, %1, %2"
>> +  "@
>> +  adc%?\\t%0, %1, %2
>> +  sbc%?\\t%0, %1, %#n2"
>>   [(set_attr "conds" "use")]
>>  )
>>
>> * I'd like a new const_ok_for_dimode_op function that dealt with each
>> of these operations, thus your plus operation with a DImode constant
>> would just be a check similar to what I've said above.
>
> Good idea, it will make the interface cleaner. I will do it later.

I think it should help with a clean interface for all the operations
you plan to add.

>
>> * You then don't need the new subdi3_immediate pattern and the split
>> can happen after reload. Adjust predicates and constraints
>> accordingly, delete it. Also please use CONST_INT_P instead of
>
> Even if I delete subdi3_immediate pattern, we still need the
> predicates and constraints to represent the negative di numbers in
> other patterns.

I agree you need the predicate - I suspect you can get away with a
single constraint for all valid add immediate DImode operands
especially if you are splitting it later to the constituent forms.



regards,
Ramana


>
> thanks
> Carrot


Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-28 Thread Iain Buclaw
On 27 June 2012 19:17, Mike Stump  wrote:
> On Jun 27, 2012, at 7:45 AM, Iain Buclaw wrote:
>> I do have a question though, what is available for the transition of
>> development from git to svn?  Other than a lot of ready and getting
>> used to the various switches and commands on my part.
>
> Why transition?  Quite a few people around here use git on a day to day basis 
> and just push and pull to/from svn as they see fit.  gcc has a read-only git 
> repo you can track and pull from.  For pushing into svn, you can use git to 
> do that as well (dcommit).  You'll want to read up on work flows on the 
> net... as dcommit and merges require a little extra caution that isn't 
> obvious.
>

I did not know of this, thanks. I'll be sure to look it up.


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Richard Earnshaw
On 28/06/12 10:03, Carrot Wei wrote:
> Hi Ramana
> 
> Thanks for the review, please see my inlined comments.
> 
> On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
>  wrote:
>>
>> On 8 June 2012 10:12, Carrot Wei  wrote:
>>> Hi
>>>
>>> In rtl expression, subtracting a constant c is expressed as adding the value -c,
>>> so it is also processed by adddi3, and I extended it to handle subtraction of a
>>> 64bit constant. I created an insn pattern arm_subdi3_immediate to specifically
>>> represent subtraction with a 64bit constant while continuing to keep the add rtl
>>> expression.
>>>
>>
>> Sorry about the time it has taken to review this patch -Thanks for
>> tackling this but I'm not convinced that this patch is correct and
>> definitely can be more efficient.
>>
>> The range of valid 64 bit constants allowed would be in my opinion are
>> the following- obtained by dividing the 64 bit constant into 2 32 bit
>> halves (upper32 and lower32 referred to as upper and lower below)
>>
>>  arm_not_operand (upper) && arm_add_operand (lower) which boils down
>> to the valid combination of
>>
>>  adds lo : adc hi - both positive constants.
>>  adds lo ; sbc hi  - lower positive, upper negative
> I assume you mean "sbc -hi" or "sbc abs(hi)", similar for following 
> instructions
> 

No, it's sbc ~hi -- bitwise inversion

It all falls out from the specification, where

adc == X + Y + C
and
sbc == X + ~Y + C.

Hence the need to use arm_not_operand.

R.
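
The identity above can be checked with a minimal C sketch.  The function
names adc32, sbc32 and sbc_not_matches_adc are hypothetical, chosen only
for illustration; they model the arithmetic (X + Y + C and X + ~Y + C,
modulo 2^32) and do not model the flag updates.

```c
#include <stdint.h>

/* Hypothetical model of ARM adc: X + Y + C, modulo 2^32.  */
uint32_t adc32 (uint32_t x, uint32_t y, unsigned carry)
{
  return (uint32_t) (x + y + (carry & 1u));
}

/* Hypothetical model of ARM sbc: X + ~Y + C, modulo 2^32.  */
uint32_t sbc32 (uint32_t x, uint32_t y, unsigned carry)
{
  return (uint32_t) (x + (uint32_t) ~y + (carry & 1u));
}

/* Return 1 when "sbc x, #~hi" and "adc x, #hi" give the same result
   for the same incoming carry.  */
int sbc_not_matches_adc (uint32_t x, uint32_t hi, unsigned carry)
{
  return sbc32 (x, (uint32_t) ~hi, carry) == adc32 (x, hi, carry);
}
```

Since ~(~hi) == hi, the two forms agree for every x, hi and carry, so an
upper word that is not encodable for adc can still be handled when its
bitwise inverse is a valid immediate for sbc.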



Re: [patch] support for multiarch systems

2012-06-28 Thread Thomas Schwinge
Hi!

On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose  wrote:
> On 25.06.2012 15:56, Joseph S. Myers wrote:
> > On Mon, 25 Jun 2012, Matthias Klose wrote:
> > 
> >> Please find attached the patch updated for trunk 20120625, x86 only, 
> >> tested on
> >> x86-linux-gnu, KFreeBSD and the Hurd.

> 2012-06-25  Matthias Klose  
> 
>   * doc/invoke.texi: Document -print-multiarch.
>   * doc/install.texi: Document --enable-multiarch.
>   * doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
>   * configure.ac: Add --enable-multiarch option.
>   * configure.in: Regenerate.
>   * Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
>   enable_multiarch, with_float: New macros.
>   if_multiarch: New macro, define in terms of enable_multiarch.
>   * genmultilib: Add new argument for the multiarch name.
>   * gcc.c (multiarch_dir): Define.
>   (for_each_path): Search for multiarch suffixes.
>   (driver_handle_option): Handle multiarch option.
>   (do_spec_1): Pass -imultiarch if defined.
>   (main): Print multiarch.
>   (set_multilib_dir): Separate multilib and multiarch names
>   from multilib_select.
>   (print_multilib_info): Ignore multiarch names in multilib_select.
>   * incpath.c (add_standard_paths): Search the multiarch include dirs.
>   * cppdeault.h (default_include): Document multiarch in multilib
>   member.
>   * cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
> include directory for multiarch directories.
>   * common.opt: New options --print-multiarch and -imultilib.
>   * config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd
>   for i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu
>   for i[34567]86-*-gnu*).
>   * config/i386/t-kfreebsd: Add multiarch names in
>   MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME.
>   * config/i386/t-linux64: Likewise.
>   * config/i386/t-linux: Define MULTIARCH_DIRNAME.
>   * config/i386/t-gnu: Likewise.

As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files.
Instead of repeating: my comments from

as well as the follow-up still hold.

> Index: genmultilib
> ===
> --- genmultilib   (revision 188931)
> +++ genmultilib   (working copy)
> @@ -84,6 +84,8 @@
>  # This argument can be used together with MULTILIB_EXCEPTIONS and will take
>  # effect after the MULTILIB_EXCEPTIONS.
>  
> +# The optional eight argument is the multiarch name.

»ninth argument«.


Grüße,
 Thomas



Re: [C++ RFC / Patch] PR 51213 ("access control under SFINAE")

2012-06-28 Thread Paolo Carlini

On 06/15/2012 04:27 PM, Paolo Carlini wrote:

Hi,

as I mentioned a few days ago, I'm working on implementing this
feature, which I personally consider rather high priority, from the
library point of view too (eg, ).

I have been making some progress - I'm attaching below what I have so
far in my local tree - but I also think it's time to get feedback both
about the general approach and about more specific issues with the
testsuite.

... any comments on this?

Thanks!
Paolo.



Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Eric Botcazou
> 2012-06-26  Yannick Moy  
>
>   * sem_attr.adb (Analyze_Attribute): Detect if 'Old is used outside a
>   postcondition, and issue an error in such a case.

This has introduced the following failures in the gnat.dg testsuite:

FAIL: gnat.dg/deep_old.adb (test for excess errors)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 7)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 16)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 28)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 34)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 38)
FAIL: gnat.dg/old_errors.adb  (test for warnings, line 40)
FAIL: gnat.dg/old_errors.adb  (test for errors, line 44)
FAIL: gnat.dg/old_errors.adb (test for excess errors)

What should we do about them?

-- 
Eric Botcazou


Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Arnaud Charlet
> > * sem_attr.adb (Analyze_Attribute): Detect if 'Old is used outside a
> > postcondition, and issue an error in such a case.
> 
> This has introduced the following failures in the gnat.dg testsuite:
> 
> FAIL: gnat.dg/deep_old.adb (test for excess errors)
> FAIL: gnat.dg/old_errors.adb  (test for errors, line 7)
> FAIL: gnat.dg/old_errors.adb  (test for errors, line 16)
> FAIL: gnat.dg/old_errors.adb  (test for errors, line 28)
> FAIL: gnat.dg/old_errors.adb  (test for errors, line 34)
> FAIL: gnat.dg/old_errors.adb  (test for errors, line 38)
> FAIL: gnat.dg/old_errors.adb  (test for warnings, line 40)
> FAIL: gnat.dg/old_errors.adb  (test for errors, line 44)
> FAIL: gnat.dg/old_errors.adb (test for excess errors)
> 
> What should we do about them?

Probably suppress both, since they no longer make sense (they are testing
an early implementation of 'Old, before 'Old was standardized in Ada 2012).

I'll take care of it.

Arno


Re: [PATCH] Add generic vector lowering for integer division and modulus (PR tree-optimization/53645)

2012-06-28 Thread Richard Guenther
On Wed, 27 Jun 2012, Jakub Jelinek wrote:

> Hi!
> 
> This patch makes veclower2 attempt to emit integer division/modulus of
> vectors by constants using vector multiplication, shifts or masking.
> 
> It is somewhat similar to the vect_recog_divmod_pattern, but it needs
> to analyze everything first, see if all divisions or modulos are doable
> using the same sequence of vector insns, and then emit vector insns
> as opposed to the scalar ones the pattern recognizer adds.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

I wonder what to do for -O0 though - shouldn't we not call
expand_vector_divmod in that case?  Thus,

+ if (!optimize
  || !VECTOR_INTEGER_TYPE_P (type) || TREE_CODE (rhs2) != VECTOR_CST)
+   break;

?

Thanks,
Richard.
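
As a rough illustration of the kind of sequence the quoted description is
after, here is a C sketch that divides every lane of a small "vector" by a
constant using a multiplication and a shift, or a shift and a mask for a
power of two.  It is not the patch's code: the patch computes multipliers
and shift counts per element, while this sketch hard-codes the well-known
multiplier 0xAAAAAAAB for unsigned 32-bit division by 3.  NUNITS,
div3_lanes and divmod8_lanes are made-up names.

```c
#include <stdint.h>

#define NUNITS 4  /* illustrative lane count, not taken from the patch */

/* Divide each unsigned 32-bit lane by 3 with a widening multiply and a
   shift: floor (x / 3) == (x * 0xAAAAAAAB) >> 33 for every 32-bit x.  */
void div3_lanes (const uint32_t *in, uint32_t *out)
{
  for (int i = 0; i < NUNITS; i++)
    out[i] = (uint32_t) (((uint64_t) in[i] * 0xAAAAAAABu) >> 33);
}

/* Power-of-two divisors need only a shift (quotient) or a mask
   (remainder) when the lanes are unsigned.  */
void divmod8_lanes (const uint32_t *in, uint32_t *quot, uint32_t *rem)
{
  for (int i = 0; i < NUNITS; i++)
    {
      quot[i] = in[i] >> 3;
      rem[i] = in[i] & 7u;
    }
}
```

Every lane here uses the same multiplier and shift count, which is the
easy case; lanes with different divisors need per-lane constants and, as
the description notes, a target with vector >> vector shifts.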

> The testcase additionally eyeballed even for -mavx2, which unlike -mavx
> has vector >> vector shifts.
> 
> 2012-06-27  Jakub Jelinek  
> 
>   PR tree-optimization/53645
>   * tree-vect-generic.c (add_rshift): New function.
>   (expand_vector_divmod): New function.
>   (expand_vector_operation): Use it for vector integer
>   TRUNC_{DIV,MOD}_EXPR by VECTOR_CST.
>   * tree-vect-patterns.c (vect_recog_divmod_pattern): Replace
>   unused lguup variable with dummy_int.
> 
>   * gcc.c-torture/execute/pr53645.c: New test.
> 
> --- gcc/tree-vect-generic.c.jj	2012-06-26 10:00:42.935832834 +0200
> +++ gcc/tree-vect-generic.c   2012-06-27 10:15:20.534103045 +0200
> @@ -391,6 +391,515 @@ expand_vector_comparison (gimple_stmt_it
>return t;
>  }
>  
> +/* Helper function of expand_vector_divmod.  Gimplify a RSHIFT_EXPR in type
> +   of OP0 with shift counts in SHIFTCNTS array and return the temporary holding
> +   the result if successful, otherwise return NULL_TREE.  */
> +static tree
> +add_rshift (gimple_stmt_iterator *gsi, tree type, tree op0, int *shiftcnts)
> +{
> +  optab op;
> +  unsigned int i, nunits = TYPE_VECTOR_SUBPARTS (type);
> +  bool scalar_shift = true;
> +
> +  for (i = 1; i < nunits; i++)
> +{
> +  if (shiftcnts[i] != shiftcnts[0])
> + scalar_shift = false;
> +}
> +
> +  if (scalar_shift && shiftcnts[0] == 0)
> +return op0;
> +
> +  if (scalar_shift)
> +{
> +  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);
> +  if (op != NULL
> +   && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
> + return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
> + build_int_cst (NULL_TREE, shiftcnts[0]));
> +}
> +
> +  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
> +  if (op != NULL
> +  && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
> +{
> +  tree *vec = XALLOCAVEC (tree, nunits);
> +  for (i = 0; i < nunits; i++)
> + vec[i] = build_int_cst (TREE_TYPE (type), shiftcnts[i]);
> +  return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
> +   build_vector (type, vec));
> +}
> +
> +  return NULL_TREE;
> +}
> +
> +/* Try to expand integer vector division by constant using
> +   widening multiply, shifts and additions.  */
> +static tree
> +expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
> +   tree op1, enum tree_code code)
> +{
> +  bool use_pow2 = true;
> +  bool has_vector_shift = true;
> +  int mode = -1, this_mode;
> +  int pre_shift = -1, post_shift;
> +  unsigned int nunits = TYPE_VECTOR_SUBPARTS (type);
> +  int *shifts = XALLOCAVEC (int, nunits * 4);
> +  int *pre_shifts = shifts + nunits;
> +  int *post_shifts = pre_shifts + nunits;
> +  int *shift_temps = post_shifts + nunits;
> +  unsigned HOST_WIDE_INT *mulc = XALLOCAVEC (unsigned HOST_WIDE_INT, nunits);
> +  int prec = TYPE_PRECISION (TREE_TYPE (type));
> +  int dummy_int;
> +  unsigned int i, unsignedp = TYPE_UNSIGNED (TREE_TYPE (type));
> +  unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type)));
> +  optab op;
> +  tree *vec;
> +  unsigned char *sel;
> +  tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem;
> +
> +  if (prec > HOST_BITS_PER_WIDE_INT)
> +return NULL_TREE;
> +
> +  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
> +  if (op == NULL
> +  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
> +has_vector_shift = false;
> +
> +  /* Analysis phase.  Determine if all op1 elements are either power
> + of two and it is possible to expand it using shifts (or for remainder
> + using masking).  Additionally compute the multiplicative constants
> + and pre and post shifts if the division is to be expanded using
> + widening or high part multiplication plus shifts.  */
> +  for (i = 0; i < nunits; i++)
> +{
> +  tree cst = VECTOR_CST_ELT (op1, i);
> +  unsigned HOST_WIDE_INT ml;
> +
> +  if (!host_integerp (cst, unsignedp) || integer_zerop (cst))
> + return NULL_TREE;
> +  pre_shifts[i] = 0;
> +  post_shifts[i] = 

Re: [patch] support for multiarch systems

2012-06-28 Thread Matthias Klose
On 28.06.2012 12:01, Thomas Schwinge wrote:
> Hi!
> 
> On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose 
> wrote:
>> On 25.06.2012 15:56, Joseph S. Myers wrote:
>>> On Mon, 25 Jun 2012, Matthias Klose wrote:
>>> 
>>>> Please find attached the patch updated for trunk 20120625, x86 only,
>>>> tested on x86-linux-gnu, KFreeBSD and the Hurd.
> 
>> 2012-06-25  Matthias Klose  
>> 
>> * doc/invoke.texi: Document -print-multiarch. * doc/install.texi:
>> Document --enable-multiarch. * doc/fragments.texi: Document
>> MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME. * configure.ac: Add
>> --enable-multiarch option. * configure.in: Regenerate. * Makefile.in
>> (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib. enable_multiarch,
>> with_float: New macros. if_multiarch: New macro, define in terms of
>> enable_multiarch. * genmultilib: Add new argument for the multiarch
>> name. * gcc.c (multiarch_dir): Define. (for_each_path): Search for
>> multiarch suffixes. (driver_handle_option): Handle multiarch option. 
>> (do_spec_1): Pass -imultiarch if defined. (main): Print multiarch. 
>> (set_multilib_dir): Separate multilib and multiarch names from
>> multilib_select. (print_multilib_info): Ignore multiarch names in
>> multilib_select. * incpath.c (add_standard_paths): Search the multiarch
>> include dirs. * cppdeault.h (default_include): Document multiarch in
>> multilib member. * cppdefault.c: [LOCAL_INCLUDE_DIR,
>> STANDARD_INCLUDE_DIR] Add an include directory for multiarch
>> directories. * common.opt: New options --print-multiarch and -imultilib. 
>> * config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd for
>> i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu for
>> i[34567]86-*-gnu*). * config/i386/t-kfreebsd: Add multiarch names in 
>> MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME. * config/i386/t-linux64:
>> Likewise. * config/i386/t-linux: Define MULTIARCH_DIRNAME. *
>> config/i386/t-gnu: Likewise.
> 
> As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files. 
> Instead of repeating: my comments from 
> 
>
> 
> as well as the follow-up still hold.

Like

* config/i386/t-gnu: New, define MULTIARCH_DIRNAME.

?

>> Index: genmultilib 
>> === ---
>> genmultilib  (revision 188931) +++ genmultilib   (working copy) @@ -84,6
>> +84,8 @@ # This argument can be used together with MULTILIB_EXCEPTIONS
>> and will take # effect after the MULTILIB_EXCEPTIONS.
>> 
>> +# The optional eight argument is the multiarch name.
> 
> »ninth argument«.

fixed.


Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Gerald Pfeifer
On Thu, 28 Jun 2012, Andreas Schwab wrote:
> libgomp.texi is still using gpl.texi, although libgomp has been
> relicensed to GPLv3 in 2009.  OK?

Looks good, thank you.

> (This is the last use of gpl.texi in the gcc sources.  Perhaps it
> should be removed and gpl_v3.texi renamed back to gpl.texi?)

If it's not used any more, yes, please go ahead and remove it.

As for renaming gpl_v3.texi to gpl.texi, I'm not sure.

Gerald


Re: [patch] support for multiarch systems

2012-06-28 Thread Thomas Schwinge
Hi!

On Thu, 28 Jun 2012 12:42:23 +0200, Matthias Klose  wrote:
> On 28.06.2012 12:01, Thomas Schwinge wrote:
> > On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose 
> > wrote:
> >> On 25.06.2012 15:56, Joseph S. Myers wrote:
> >>> On Mon, 25 Jun 2012, Matthias Klose wrote:
> >>> 
> >>>> Please find attached the patch updated for trunk 20120625, x86 only,
> >>>> tested on x86-linux-gnu, KFreeBSD and the Hurd.
> > 
> >> 2012-06-25  Matthias Klose  
> >> 
> >> * doc/invoke.texi: Document -print-multiarch. * doc/install.texi:
> >> Document --enable-multiarch. * doc/fragments.texi: Document
> >> MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME. * configure.ac: Add
> >> --enable-multiarch option. * configure.in: Regenerate. * Makefile.in
> >> (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib. enable_multiarch,
> >> with_float: New macros. if_multiarch: New macro, define in terms of
> >> enable_multiarch. * genmultilib: Add new argument for the multiarch
> >> name. * gcc.c (multiarch_dir): Define. (for_each_path): Search for
> >> multiarch suffixes. (driver_handle_option): Handle multiarch option. 
> >> (do_spec_1): Pass -imultiarch if defined. (main): Print multiarch. 
> >> (set_multilib_dir): Separate multilib and multiarch names from
> >> multilib_select. (print_multilib_info): Ignore multiarch names in
> >> multilib_select. * incpath.c (add_standard_paths): Search the multiarch
> >> include dirs. * cppdeault.h (default_include): Document multiarch in
> >> multilib member. * cppdefault.c: [LOCAL_INCLUDE_DIR,
> >> STANDARD_INCLUDE_DIR] Add an include directory for multiarch
> >> directories. * common.opt: New options --print-multiarch and -imultilib. 
> >> * config.gcc: Add tmake fragments to tmake_file ( i386/t-kfreebsd for
> >> i[34567]86-*-kfreebsd*-gnu and x86_64-*-kfreebsd*-gnu, i386/t-gnu for
> >> i[34567]86-*-gnu*). * config/i386/t-kfreebsd: Add multiarch names in 
> >> MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME. * config/i386/t-linux64:
> >> Likewise. * config/i386/t-linux: Define MULTIARCH_DIRNAME. *
> >> config/i386/t-gnu: Likewise.
> > 
> > As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files. 
> > Instead of repeating: my comments from 
> > 
> >
> > 
> as well as the follow-up still hold.
> 
> Like
> 
>   * config/i386/t-gnu: New, define MULTIARCH_DIRNAME.
> 
> ?

I'd use:

* config/i386/t-gnu: New file.
* config/i386/t-kfreebsd: Likewise.
* config/i386/t-linux: Likewise.

Plus the following instead of your changes:

gcc/
* config.gcc  (tmake_file):
Include i386/t-linux.
 (tmake_file):
Include i386/t-kfreebsd.
 (tmake_file): Include i386/t-gnu.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7ec184c..39c70f2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3481,9 +3481,14 @@ case ${target} in
 
i[34567]86-*-darwin* | x86_64-*-darwin*)
;;
-   i[34567]86-*-linux* | x86_64-*-linux* | \
- i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
- i[34567]86-*-gnu*)
+   i[34567]86-*-linux* | x86_64-*-linux*)
+   tmake_file="$tmake_file i386/t-linux"
+   ;;
+   i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
+   tmake_file="$tmake_file i386/t-kfreebsd"
+   ;;
+   i[34567]86-*-gnu*)
+   tmake_file="$tmake_file i386/t-gnu"
;;
i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
;;

Otherwise, I can't imagine how that would work.


Grüße,
 Thomas



Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Carrot Wei
On Thu, Jun 28, 2012 at 5:37 PM, Ramana Radhakrishnan
 wrote:
> On 28 June 2012 10:03, Carrot Wei  wrote:
>> Hi Ramana
>>
>> Thanks for the review, please see my inlined comments.
>>
>> On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan
>>  wrote:
>>>
>>> On 8 June 2012 10:12, Carrot Wei  wrote:
>>> > Hi
>>> >
>>> > In rtl expression, subtracting a constant c is expressed as adding the value -c,
>>> > so it is also processed by adddi3, and I extended it to handle subtraction of a
>>> > 64bit constant. I created an insn pattern arm_subdi3_immediate to specifically
>>> > represent subtraction with a 64bit constant while continuing to keep the add rtl
>>> > expression.
>>> >
>>>
>>> Sorry about the time it has taken to review this patch -Thanks for
>>> tackling this but I'm not convinced that this patch is correct and
>>> definitely can be more efficient.
>>>
>>> The range of valid 64 bit constants allowed would be in my opinion are
>>> the following- obtained by dividing the 64 bit constant into 2 32 bit
>>> halves (upper32 and lower32 referred to as upper and lower below)
>>>
>>>  arm_not_operand (upper) && arm_add_operand (lower) which boils down
>>> to the valid combination of
>>>
>>>  adds lo : adc hi - both positive constants.
>>>  adds lo ; sbc hi  - lower positive, upper negative
>
>> I assume you mean "sbc -hi" or "sbc abs(hi)", similar for following 
>> instructions
>
> hi = ~upper32
>
> lower = lower 32 bits of the constant
> hi =  ~ (upper32 bits) of the constant ( bitwise twiddle not a negate :) )
>
> For e.g.
>
> unsigned long long foo4 (unsigned long long x)
> {
>  return x - 0x25ULL;
> }
>
> should be
> subs r0, r0, #37
> sbc   r1, r1, #0
>
> Notice that it's #0 and not 1 . :)
>
>
>
>>
>>>
>>>  subs lo ; sbc hi - lower negative, upper negative
>>>  subs lo ; adc hi  - lower negative, upper positive
>>>

Thank you for the detailed explanation. So the four cases should be

 adds lo ; adc hi - both positive constants.
 adds lo ; sbc ~hi  - lower positive, upper negative
 subs -lo ; sbc ~hi - lower negative, upper negative
 subs -lo ; adc hi  - lower negative, upper positive
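
The four pairs can be checked against plain 64-bit addition with a small
C model.  This is only a sketch under stated assumptions: add_with_carry
follows the AddWithCarry pseudocode quoted in this thread, and
add_di_const with its use_subs / use_sbc switches is a hypothetical helper
written for illustration, not anything in the patch.

```c
#include <stdint.h>

/* Model of AddWithCarry on 32-bit values: returns the low 32 bits of
   x + y + carry_in and stores the carry-out flag in *carry_out.  */
uint32_t add_with_carry (uint32_t x, uint32_t y, unsigned carry_in,
                         unsigned *carry_out)
{
  uint64_t wide = (uint64_t) x + y + (carry_in & 1u);
  *carry_out = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* Compute x + c with one of the four instruction pairs.  USE_SUBS picks
   "subs -lo" instead of "adds lo"; USE_SBC picks "sbc ~hi" instead of
   "adc hi".  */
uint64_t add_di_const (uint64_t x, uint64_t c, int use_subs, int use_sbc)
{
  uint32_t x_lo = (uint32_t) x, x_hi = (uint32_t) (x >> 32);
  uint32_t lo = (uint32_t) c, hi = (uint32_t) (c >> 32);
  uint32_t neg_lo = (uint32_t) (0u - lo);   /* immediate for subs */
  uint32_t not_hi = (uint32_t) ~hi;         /* immediate for sbc */
  unsigned carry, ignored;
  uint32_t r_lo, r_hi;

  if (use_subs)
    /* subs r0, r0, #-lo: AddWithCarry (x_lo, NOT (-lo), 1).  */
    r_lo = add_with_carry (x_lo, (uint32_t) ~neg_lo, 1, &carry);
  else
    /* adds r0, r0, #lo: AddWithCarry (x_lo, lo, 0).  */
    r_lo = add_with_carry (x_lo, lo, 0, &carry);

  if (use_sbc)
    /* sbc r1, r1, #~hi: AddWithCarry (x_hi, NOT (~hi), C).  */
    r_hi = add_with_carry (x_hi, (uint32_t) ~not_hi, carry, &ignored);
  else
    /* adc r1, r1, #hi: AddWithCarry (x_hi, hi, C).  */
    r_hi = add_with_carry (x_hi, hi, carry, &ignored);

  return ((uint64_t) r_hi << 32) | r_lo;
}
```

In this model the adds and subs forms produce the same carry whenever the
low word of the constant is nonzero, so all four pairs match x + c.  When
the low word is zero, "subs #0" always sets the carry while "adds #0"
clears it, which is the difference discussed earlier in the thread; the
subs form must then not be combined with ~hi.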


>> My first version did the similar thing, but in some cases subs and
>> adds may generate different carry flag. Assume the low word is 0 and
>> high word is negative, your method will generate
>>
>> adds r0, r0, 0
>> sbc   r1, r1, abs(hi)
>
> No it will generate
>
> adds r0, r0, #0
> sbc    r1, r1, ~hi
>
> and not abs (hi)
>
>
>
>>
>> My method generates
>>
>> subs r0, r0, 0
>> sbc   r1, r1, abs(hi)
>>
>> ARM's definition of subs is
>>
>> (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);
>>
>> So the subs instruction will set carry flag, but adds clear carry
>> flag, and finally generate different result in r1.
>>
>>>
>>> Therefore I'd do the following -
>>>
>>> * Don't make *arm_adddi3 a named pattern - we don't need that.
>>> * Change the *addsi3_carryin_ pattern to be something like this :
>>>
>>> --- a/gcc/config/arm/arm.md
>>> +++ b/gcc/config/arm/arm.md
>>> @@ -1001,12 +1001,14 @@
>>>  )
>>>
>>>  (define_insn "*addsi3_carryin_"
>>> -  [(set (match_operand:SI 0 "s_register_operand" "=r")
>>> -       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r")
>>> -                         (match_operand:SI 2 "arm_rhs_operand" "rI"))
>>> +  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>>> +       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r,r"
>>> +                         (match_operand:SI 2 "arm_not_operand" "rI,K"
>>
>> Do you mean arm_add_operand?
>
> No I mean arm_not_operand and it was a deliberate choice as explained above.
>
>>
>>>                 (LTUGEU:SI (reg: CC_REGNUM) (const_int 0]
>>>   "TARGET_32BIT"
>>> -  "adc%?\\t%0, %1, %2"
>>> +  "@
>>> +  adc%?\\t%0, %1, %2
>>> +  sbc%?\\t%0, %1, %#n2"

Since constraint "K" is a logical not, not a negation, should the last
line be the following?

+  sbc%?\\t%0, %1, #%B2"

thanks
Carrot


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Alexandre Oliva
On Jun 28, 2012, Jakub Jelinek  wrote:

> On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
>> I'd very be surprised if I asked for an i686 native build to package and
>> install elsewhere, and didn't get a plugin just because the build-time
>> linker wouldn't have been able to run the plugin.

> Not disable plugin support altogether, but disable assuming the linker
> supports the plugin.

That still doesn't sound right to me: why should the compiler refrain
from using a perfectly functional linker plugin on the machine where
it's installed (not where it's built)?

Also, this scenario of silently deciding whether or not to use the
linker plugin could bring us to different test results for the same
command lines.  I don't like that.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[PATCH] Fix PR53790

2012-06-28 Thread Richard Guenther

This fixes PR53790 - with MEM_REF you can get base decls of
incomplete type.  Deal with that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied
everywhere.

Richard.

2012-06-28  Richard Guenther  

PR middle-end/53790
* expr.c (expand_expr_real_1): Verify if the type is complete
before inspecting its size.

* gcc.dg/torture/pr53790.c: New testcase.

Index: gcc/expr.c
===
*** gcc/expr.c  (revision 189041)
--- gcc/expr.c  (working copy)
*************** expand_expr_real_1 (tree exp, rtx target
*** 9832,9837 ****
--- 9832,9838 ----
orig_op0 = op0
  = expand_expr (tem,
 (TREE_CODE (TREE_TYPE (tem)) == UNION_TYPE
+ && COMPLETE_TYPE_P (TREE_TYPE (tem))
  && (TREE_CODE (TYPE_SIZE (TREE_TYPE (tem)))
  != INTEGER_CST)
  && modifier != EXPAND_STACK_PARM
Index: gcc/testsuite/gcc.dg/torture/pr53790.c
===
*** gcc/testsuite/gcc.dg/torture/pr53790.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr53790.c  (working copy)
***************
*** 0 ****
--- 1,17 ----
+ /* { dg-do compile } */
+ 
+ typedef struct s {
+ int value;
+ } s_t;
+ 
+ static inline int 
+ read(s_t const *var)
+ {
+   return var->value;
+ }
+ 
+ int main()
+ {
+   extern union u extern_var;
+   return read((s_t *)&extern_var);
+ }


Re: [onlinedocs]: No more automatic rebuilt?

2012-06-28 Thread Andreas Schwab
Gerald Pfeifer  writes:

> If it's not used any more, yes, please go ahead and remove it.

Done as follows; tested with make info.

Andreas.

* doc/include/gpl.texi: Remove.
* doc/sourcebuild.texi (Texinfo Manuals): Don't mention gpl.texi.

diff --git a/gcc/doc/include/gpl.texi b/gcc/doc/include/gpl.texi
deleted file mode 100644
index bcb5535..000
[omitted]
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 3d834ee..dc5cc47 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1,4 +1,4 @@
-@c Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
+@c Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012
 @c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.
@@ -368,8 +368,7 @@ The GNU Free Documentation License.
 The section ``Funding Free Software''.
 @item gcc-common.texi
 Common definitions for manuals.
-@item gpl.texi
-@itemx gpl_v3.texi
+@item gpl_v3.texi
 The GNU General Public License.
 @item texinfo.tex
 A copy of @file{texinfo.tex} known to work with the GCC manuals.
-- 
1.7.11.1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Richard Guenther
On Thu, Jun 28, 2012 at 1:39 PM, Alexandre Oliva  wrote:
> On Jun 28, 2012, Jakub Jelinek  wrote:
>
>> On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
>>> I'd very be surprised if I asked for an i686 native build to package and
>>> install elsewhere, and didn't get a plugin just because the build-time
>>> linker wouldn't have been able to run the plugin.
>
>> Not disable plugin support altogether, but disable assuming the linker
>> supports the plugin.
>
> That still doesn't sound right to me: why should the compiler refrain
> from using a perfectly functional linker plugin on the machine where
> it's installed (not where it's built)?
>
> Also, this scenario of silently deciding whether or not to use the
> linker plugin could bring us to different test results for the same
> command lines.  I don't like that.

I don't like that we derive the default setting this way either.  In the end
I would like us to arrive at the point that LTO does not work at all without
a linker plugin.

Richard.

> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-28 Thread Ramana Radhakrishnan
>  subs -lo ; sbc ~hi - lower negative, upper negative
>  subs -lo ; adc hi  - lower negative, upper positive

Yes.



>>
>>>
>>>>                 (LTUGEU:SI (reg: CC_REGNUM) (const_int 0]
>>>>   "TARGET_32BIT"
>>>> -  "adc%?\\t%0, %1, %2"
>>>> +  "@
>>>> +  adc%?\\t%0, %1, %2
>>>> +  sbc%?\\t%0, %1, %#n2"
>
> Since constraint "K" is logical not, not negative, should the last
> line be following?
>
> +  sbc%?\\t%0, %1, #%B2"

Indeed that was a typo on my part. Sorry about that.

Ramana


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-28 Thread Alexandre Oliva
On Jun 28, 2012, Christophe Lyon  wrote:

> Can you commit it for me (I don't have write access).

Done, GCC SVN and src CVS trees.  Thanks!

> 2012-06-28  Christophe Lyon 

>   * configure.ac (CFLAGS_FOR_TARGET, CXXFLAGS_FOR_TARGET): Make sure
>   they contain -O2.
>   * configure: Regenerate.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[patch]: Fix PR53595 (hard_regno_call_part_clobbered called with invalid regno)

2012-06-28 Thread Georg-Johann Lay
This patch returns false in HARD_REGNO_CALL_PART_CLOBBERED if
!HARD_REGNO_MODE_OK.

Returning true for such registers might lead to performance
degradation that eats up all the performance gained from 4.6 to 4.7,
for example.

Ok to apply?

Johann

PR 53595
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
* config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
* config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
avr_hard_regno_call_part_clobbered.
Index: config/avr/avr-protos.h
===
--- config/avr/avr-protos.h	(revision 189011)
+++ config/avr/avr-protos.h	(working copy)
@@ -47,6 +47,7 @@ extern void init_cumulative_args (CUMULA
 #endif /* TREE_CODE */
 
 #ifdef RTX_CODE
+extern int avr_hard_regno_call_part_clobbered (unsigned, enum machine_mode);
 extern const char *output_movqi (rtx insn, rtx operands[], int *l);
 extern const char *output_movhi (rtx insn, rtx operands[], int *l);
 extern const char *output_movsisf (rtx insn, rtx operands[], int *l);
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 189011)
+++ config/avr/avr.c	(working copy)
@@ -8856,6 +8856,28 @@ avr_hard_regno_mode_ok (int regno, enum
 }
 
 
+/* Implement `HARD_REGNO_CALL_PART_CLOBBERED'.  */
+
+int
+avr_hard_regno_call_part_clobbered (unsigned regno, enum machine_mode mode)
+{
+  /* FIXME: This hook gets called with MODE:REGNO combinations that don't
+represent valid hard registers like, e.g. HI:29.  Returning TRUE
+for such registers can lead to performance degradation as mentioned
+in PR53595.  Thus, report invalid hard registers as FALSE.  */
+  
+  if (!avr_hard_regno_mode_ok (regno, mode))
+return 0;
+  
+  /* Return true if any of the following boundaries is crossed:
+ 17/18, 27/28 and 29/30.  */
+  
+  return ((regno < 18 && regno + GET_MODE_SIZE (mode) > 18)
+  || (regno < REG_Y && regno + GET_MODE_SIZE (mode) > REG_Y)
+  || (regno < REG_Z && regno + GET_MODE_SIZE (mode) > REG_Z));
+}
+
+
 /* Implement `MODE_CODE_BASE_REG_CLASS'.  */
 
 enum reg_class
Index: config/avr/avr.h
===
--- config/avr/avr.h	(revision 189011)
+++ config/avr/avr.h	(working copy)
@@ -402,10 +402,8 @@ enum reg_class {
 
 #define REGNO_OK_FOR_INDEX_P(NUM) 0
 
-#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE)\
-  (((REGNO) < 18 && (REGNO) + GET_MODE_SIZE (MODE) > 18)   \
-   || ((REGNO) < REG_Y && (REGNO) + GET_MODE_SIZE (MODE) > REG_Y)  \
-   || ((REGNO) < REG_Z && (REGNO) + GET_MODE_SIZE (MODE) > REG_Z))
+#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \
+  avr_hard_regno_call_part_clobbered (REGNO, MODE)
 
 #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P hook_bool_mode_true
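
To illustrate what the new guard changes, here is a standalone C sketch of the boundary test.  It is not the code from the patch: sketch_mode_ok is a simplified stand-in for avr_hard_regno_mode_ok (it only models the "multi-byte values start in an even register" rule), sizes are passed in bytes instead of machine modes, and the REG_Y/REG_Z values 28 and 30 are assumed from the 27/28 and 29/30 boundaries named in the comment.

```c
#include <assert.h>

/* Assumed register numbers of the AVR Y and Z pointer pairs.  */
enum { SKETCH_REG_Y = 28, SKETCH_REG_Z = 30 };

/* Simplified stand-in for avr_hard_regno_mode_ok: values wider than
   one byte must start in an even register.  */
int
sketch_mode_ok (unsigned regno, unsigned size)
{
  return size == 1 || regno % 2 == 0;
}

/* The old macro's logic: true if the value crosses one of the
   boundaries 17/18, 27/28 or 29/30.  */
int
sketch_crosses_boundary (unsigned regno, unsigned size)
{
  assert (size >= 1);
  return ((regno < 18 && regno + size > 18)
          || (regno < SKETCH_REG_Y && regno + size > SKETCH_REG_Y)
          || (regno < SKETCH_REG_Z && regno + size > SKETCH_REG_Z));
}

/* The patched behaviour: invalid register/size combinations are
   reported as not clobbered before the boundary test runs.  */
int
sketch_call_part_clobbered (unsigned regno, unsigned size)
{
  if (!sketch_mode_ok (regno, size))
    return 0;
  return sketch_crosses_boundary (regno, size);
}
```

With these definitions the HI:29 case from the comment (register 29, size 2) crosses the 29/30 boundary under the old logic but is rejected by the guard, while a valid combination such as register 16 with size 4 is still reported as partly clobbered.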
 


[PATCH] MIPS/libgcc: Add soft-fp support for SDE bare-iron targets

2012-06-28 Thread Maciej W. Rozycki
Hello,

 This change adds soft-fp support for SDE bare-iron targets.

 The settings have been mostly based on the version already present in 
glibc, except that the ABI variations have been merged into a single file 
and conditionalised on preprocessor macros (and the file reformatted to 
follow the GNU coding standard that the glibc variants don't).  Only n32 
has to be treated somewhat specially as it is ILP32 but its "long long" 
type is 64-bit with native support (using single registers rather than 
pairs).  The rest is handled generically, based on the width of the types 
chosen.

 This has been regression tested for the mips-sde-elf target with no new 
failures, using the o32 and n64 ABI multilibs, with and without 
-msoft-float, o32 also with MIPS16 variants.

 There's currently no SDE runtime support for n32; however, despite the 
inability to test it, I decided the configuration shouldn't be pessimised by 
default (by avoiding the special exception and using the 32-bit "long" 
type), as glibc already uses such an arrangement, so it's been verified 
elsewhere, and if a platform that supports the n32 ABI decides later on to 
enable soft-fp too, it will be verified in libgcc anyway.  I believe this 
is reasonable and avoids the risk of someone choosing the "long" type by 
omission.

 Comments or questions are welcome, otherwise OK to apply?

2012-06-28  Catherine Moore  
Maciej W. Rozycki  

libgcc/
* config/mips/sfp-machine.h: New file.
* config.host : Enable soft-fp.

  Maciej

gcc-mips-softfp.diff
Index: gcc-trunk-4.6/libgcc/config/mips/sfp-machine.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc-trunk-4.6/libgcc/config/mips/sfp-machine.h  2012-06-24 14:38:40.083663725 +0100
@@ -0,0 +1,101 @@
+#if defined _ABIN32 && _MIPS_SIM == _ABIN32
+
+#define _FP_W_TYPE_SIZE 64
+#define _FP_W_TYPE unsigned long long
+#define _FP_WS_TYPE signed long long
+#define _FP_I_TYPE long long
+
+#else
+
+#define _FP_W_TYPE_SIZE _MIPS_SZLONG
+#define _FP_W_TYPE unsigned long
+#define _FP_WS_TYPE signed long
+#define _FP_I_TYPE long
+
+#endif
+
+#if _FP_W_TYPE_SIZE < 64
+
+#define _FP_MUL_MEAT_S(R, X, Y)\
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_S, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_D(R, X, Y)\
+  _FP_MUL_MEAT_2_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y)\
+  _FP_MUL_MEAT_4_wide (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y)\
+  _FP_DIV_MEAT_1_udiv_norm (S, R, X, Y)
+#define _FP_DIV_MEAT_D(R, X, Y)\
+  _FP_DIV_MEAT_2_udiv (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y)\
+  _FP_DIV_MEAT_4_udiv (Q, R, X, Y)
+
+#else
+
+#define _FP_MUL_MEAT_S(R, X, Y)\
+  _FP_MUL_MEAT_1_imm (_FP_WFRACBITS_S, R, X, Y)
+#define _FP_MUL_MEAT_D(R, X, Y)\
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y)\
+  _FP_MUL_MEAT_2_wide_3mul (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y)\
+  _FP_DIV_MEAT_1_imm (S, R, X, Y, _FP_DIV_HELP_imm)
+#define _FP_DIV_MEAT_D(R, X, Y)\
+  _FP_DIV_MEAT_1_udiv_norm (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y)\
+  _FP_DIV_MEAT_2_udiv (Q, R, X, Y)
+
+#endif
+
+#define _FP_NANFRAC_S  ((_FP_QNANBIT_S << 1) - 1)
+#define _FP_NANFRAC_D  ((_FP_QNANBIT_D << 1) - 1), -1
+#define _FP_NANFRAC_Q  ((_FP_QNANBIT_Q << 1) - 1), -1, -1, -1
+#define _FP_NANSIGN_S  0
+#define _FP_NANSIGN_D  0
+#define _FP_NANSIGN_Q  0
+
+#define _FP_KEEPNANFRACP 1
+/* From my experiments it seems X is chosen unless one of the
+   NaNs is sNaN,  in which case the result is NANSIGN/NANFRAC.  */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
+  do { \
+if ((_FP_FRAC_HIGH_RAW_##fs (X)\
+| _FP_FRAC_HIGH_RAW_##fs (Y)) & _FP_QNANBIT_##fs)  \
+  {\
+   R##_s = _FP_NANSIGN_##fs;   \
+_FP_FRAC_SET_##wc (R, _FP_NANFRAC_##fs);   \
+  }\
+else   \
+  {\
+   R##_s = X##_s; 

Re: [PATCH] Move Graphite from using PPL over to ISL

2012-06-28 Thread Diego Novillo

On 12-06-27 11:06 , Richard Guenther wrote:


2012-06-27  Richard Guenther  
Michael Matz  
Tobias Grosser 
Sebastian Pop 

config/
* cloog.m4: Set up to work against ISL only.
* isl.m4: New file.

* Makefile.def: Add ISL host module, remove PPL host module.
Adjust ClooG host module to use the proper ISL.
* Makefile.tpl: Pass ISL include flags instead of PPL ones.
* configure.ac: Include config/isl.m4.  Add ISL host library,
remove PPL.  Remove PPL configury, add ISL configury, adjust
ClooG configury.
* Makefile.in: Regenerated.
* configure: Likewise.

gcc/
* Makefile.in: Remove PPL flags in favor of ISL ones.
(BACKENDLIBS): Remove PPL libs.
(INCLUDES): Remove PPL includes in favor of ISL ones.
(graphite-clast-to-gimple.o): Remove graphite-dependences.h and
graphite-cloog-compat.h dependencies.
(graphite-dependences.o): Likewise.
(graphite-poly.o): Likewise.
* configure.ac: Declare ISL vars instead of PPL ones.
* configure: Regenerated.
* doc/install.texi: Replace PPL requirement documentation
with ISL one.
* graphite-blocking.c: Remove PPL code, add ISL equivalent.
* graphite-clast-to-gimple.c: Likewise.
* graphite-dependences.c: Likewise.
* graphite-interchange.c: Likewise.
* graphite-poly.h: Likewise.
* graphite-poly.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.c: Likewise.
* graphite-scop-detection.c: Re-arrange includes.
* graphite-cloog-util.c: Remove.
* graphite-cloog-util.h: Likewise.
* graphite-ppl.h: Likewise.
* graphite-ppl.c: Likewise.
* graphite-dependences.h: Likewise.

libgomp/
* testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
* testsuite/libgomp.graphite/force-parallel-5.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-7.c: Likewise.
* testsuite/libgomp.graphite/force-parallel-8.c: Likewise.


OK.


Diego.




Re: [patch]: Fix PR53595 (hard_regno_call_part_clobbered called with invalid regno)

2012-06-28 Thread Denis Chertykov
2012/6/28 Georg-Johann Lay :
> This patch returns false in HARD_REGNO_CALL_PART_CLOBBERED if
> !HARD_REGNO_MODE_OK.
>
> Returning true for such registers might lead to performance
> degradation that eats up all performance gained from 4.6 to 4.7
> for example.
>
> Ok to apply?
>
> Johann
>
>        PR 53595
>        * config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
>        * config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
>        * config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
>        avr_hard_regno_call_part_clobbered.

Please, apply.

Denis


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann 
 wrote:
> On 27/06/12 21:35, Andrew Pinski wrote:
>> On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
>>  wrote:
>>> All,
>>> 
>>> This patch enables the dump-noaddr test to work in out-of-build-tree
>>> testing.
> [snip]
>> 
>> I created a much simpler patch which I have been meaning to submit.
>> I attached it for reference.
>> 
>> 
>> Thanks,
>> Andrew Pinski
>> 
>> ChangeLog:
>> * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
>> an absolute dump base instead of a relative one.
>> 
>> Index: gcc.c-torture/unsorted/dump-noaddr.x
>> ===
>> --- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452)
>> +++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453)
>> @@ -11,10 +11,10 @@ proc dump_compare { src options } {
>>  foreach option $option_list {
>>  file delete -force dump1
>>  file mkdir dump1
>> -c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase 
>> -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all 
>> -fdump-tree-all -fdump-noaddr"
>> +c-torture-compile $src "$option $options -dumpbase 
>> [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 
>> -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
>>  file delete -force dump2
>>  file mkdir dump2
>> -c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase 
>> -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
>> +c-torture-compile $src "$option $options -dumpbase 
>> [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all 
>> -fdump-tree-all -fdump-noaddr"
>>  foreach dump1 [lsort [glob -nocomplain dump1/*]] {
>>  regsub dump1/ $dump1 dump2/ dump2
>>  set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
> 
> What I don't like about this approach is that dump1 and dump2 are created in 
> the current working directory.

On vxworks as I recall we did a cd to tmpdir, is that generally true?  Also, if 
one telnets in or sshes into the host under test, the cd is mandatory... as 
otherwise one would dump turds (that's a technical term) in the home directory 
which would be very uncool.  Maybe a better approach would be to cd to the 
right place if all the Canadian setups cd, as that then unifies them.

> With out of build-tree testing this may not (I believe) be the same as 
> $tmpdir (where temporaries are normally created).  Also the current directory 
> may already contain directories/files called dump1 or dump2 which will get 
> destroyed by running the 

The point of the cd was to get to a place where temps can be created freely...

> I've not committed my version yet in case I am missing something in my 
> reasoning above with regards to the relationship between the current working 
> directory and $tmpdir.

So the question would be, does his patch work for you?  It was unclear to me if 
the answer is no.

Oh, wait, I know what I don't like about Andrew's patch: pwd.  Is that the 
directory on the target, the host or the build machine?  And is that going to 
the host machine?  They are not the same.  One needs a directory on the host 
machine.


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Matthew Gretton-Dann

On 28/06/12 14:38, Mike Stump wrote:
> On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann wrote:
> > On 27/06/12 21:35, Andrew Pinski wrote:
> >> On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann wrote:
> >>> All,
> >>>
> >>> This patch enables the dump-noaddr test to work in out-of-build-tree
> >>> testing.
> > [snip]
> >>
> >> I created a much simpler patch which I have been meaning to submit.
> >> I attached it for reference.
> >>
> >>
> >> Thanks,
> >> Andrew Pinski
> >>
> >> ChangeLog:
> >> * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
> >> an absolute dump base instead of a relative one.
> >>
> >> Index: gcc.c-torture/unsorted/dump-noaddr.x
> >> ===
> >> --- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452)
> >> +++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453)
> >> @@ -11,10 +11,10 @@ proc dump_compare { src options } {
> >>   foreach option $option_list {
> >>   file delete -force dump1
> >>   file mkdir dump1
> >> -c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase -DMASK=1 -x
> >> c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all
> >> -fdump-noaddr"
> >> +c-torture-compile $src "$option $options -dumpbase [pwd]/dump1/$dumpbase
> >> -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all
> >> -fdump-noaddr"
> >>   file delete -force dump2
> >>   file mkdir dump2
> >> -c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase -DMASK=2 -x
> >> c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
> >> +c-torture-compile $src "$option $options -dumpbase [pwd]/dump2/$dumpbase
> >> -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
> >>   foreach dump1 [lsort [glob -nocomplain dump1/*]] {
> >>   regsub dump1/ $dump1 dump2/ dump2
> >>   set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
> >
> > What I don't like about this approach is that dump1 and dump2 are
> > created in the current working directory.
>
> On vxworks as I recall we did a cd to tmpdir, is that generally true?
> Also, if one telnets in or sshes into the host under test, the cd is
> mandatory... as otherwise one would dump turds (that's a technical term)
> in the home directory which would be very uncool.  Maybe a better
> approach would be to cd to the right place if all the Canadian setups cd,
> as that then unifies them.
>
> > With out of build-tree testing this may not (I believe) be the same as
> > $tmpdir (where temporaries are normally created).  Also the current
> > directory may already contain directories/files called dump1 or dump2
> > which will get destroyed by running the
>
> The point of the cd was to get to a place where temps can be created
> freely...
>
> > I've not committed my version yet in case I am missing something in my
> > reasoning above with regards to the relationship between the current
> > working directory and $tmpdir.
>
> So the question would be, does his patch work for you?  It was unclear to
> me if the answer is no.

Sorry - the patch works for my use case (build==host), but I was concerned 
over the use of [pwd] vs $tmpdir.

> Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the
> directory on the target, the host or the build machine?  And is that
> going to the host machine?  They are not the same.  One needs a directory
> on the host machine.


I don't think this applies to my patch though, so are you still okay for my 
version to go in or is there something else I haven't considered?


Thanks,

Matt

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd





Re: [PATCH] MIPS/libgcc: Add soft-fp support for SDE bare-iron targets

2012-06-28 Thread Joseph S. Myers
On Thu, 28 Jun 2012, Maciej W. Rozycki wrote:

>   * config/mips/sfp-machine.h: New file.
>   * config.host : Enable soft-fp.

The compiler uses MIPS NaN conventions on MIPS; fp-bit knows about those 
but soft-fp does not.  Are you not concerned about that regression?  (Is 
this code only ever going to be used in software floating-point 
configurations, without exception support, so the choice of NaN doesn't 
matter much?)

libgcc/config/mips/t-mips sets FPBIT and DPBIT.  Shouldn't you do 
something to override those settings?  Even if the libgcc logic is to 
build soft-fp if both soft-fp and fp-bit are configured, it would seem 
cleaner for the fragments to configure only the relevant one.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 12:16 AM, Alexandre Oliva  wrote:
> On Jun 27, 2012, Mike Stump  wrote:
>> On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
>>> Why?  We don't demand a working plugin.  Indeed, we disable the use of
>>> the plugin if we find a linker that doesn't support it.  We just don't
>>> account for the possibility of finding a linker that supports plugins,
>>> but that doesn't support the one we'll build later.
> 
>> If this is the preferred solution, then having configure check the
>> 64-bitness of ld and turning off the plugin altogether on mismatches
>> sounds like a reasonable course of action to me.
> 
> I'd be very surprised if I asked for an i686 native build to package and
> install elsewhere, and didn't get a plugin just because the build-time
> linker wouldn't have been able to run the plugin.

The architecture of the compiler, last I knew it, was to smell out the feature 
set of the system, including libraries, headers, assemblers and linkers.  It 
uses this as static configuration parameters for the build.  One is not free to 
take the built compiler to a differently configured system at run time.

Now, with that as a backdrop, how exactly do you ever plan on using the plugin? 
 If there is no possible use for it, why then build it?

So, even if there is a way to toggle the feature on, which would mean the 
plug-in should be built, it should still be off initially, which it isn't.


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 4:39 AM, Alexandre Oliva  wrote:
> On Jun 28, 2012, Jakub Jelinek  wrote:
> 
>> On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
>>> I'd be very surprised if I asked for an i686 native build to package and
>>> install elsewhere, and didn't get a plugin just because the build-time
>>> linker wouldn't have been able to run the plugin.
> 
>> Not disable plugin support altogether, but disable assuming the linker
>> supports the plugin.
> 
> That still doesn't sound right to me: why should the compiler refrain
> from using a perfectly functional linker plugin on the machine where
> it's installed (not where it's built)?

See your point below for one reason.  The next would be because it would be a 
speed hit to re-check at runtime the qualities of the linker and do something 
different.  If the system had an architecture to avoid the speed hit and people 
wanted to do the work to support the runtime reconfigure, that'd be fine with 
me.  I don't think your system supports this, and I don't think you want to do 
that work, do you?

> Also, this scenario of silently deciding whether or not to use the
> linker plugin could bring us to different test results for the same
> command lines.  I don't like that.

Right, which is why the static configuration of the host system at build time 
is forever after an invariant.  The linker is smelled, it doesn't support 
plugins, therefore we can't ever use it, therefore we never build it...


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 09:17:55AM +0200, Jakub Jelinek wrote:
> I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and
> vectorizing it as either of the sequences next.

And here is corresponding pattern recognizer and vectorizer patch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
to pessimize the generated code for gcc.dg/vect/pr51581-3.c
testcase (at least with -O3 -mavx) compared to when the hooks aren't
present, because i?86 has more natural support for widen mult lo/hi
compared to widen mult even/odd, but I assume that on powerpc it is the
other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
and builtin_mul_widen_* are possible for the particular vectype which one
will be cheaper?

2012-06-28  Jakub Jelinek  

PR tree-optimization/51581
* tree-vect-stmts.c (permute_vec_elements): Add forward decl.
(vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR
also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus
VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Use
MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts.

* gcc.dg/vect/pr51581-4.c: New test.

--- gcc/tree-vect-stmts.c.jj2012-06-26 11:38:28.0 +0200
+++ gcc/tree-vect-stmts.c   2012-06-28 13:27:50.475158271 +0200
@@ -3288,6 +3288,10 @@ vectorizable_shift (gimple stmt, gimple_
 }
 
 
+static tree permute_vec_elements (tree, tree, tree, gimple,
+ gimple_stmt_iterator *);
+
+
 /* Function vectorizable_operation.
 
Check if STMT performs a binary, unary or ternary operation that can
@@ -3300,17 +3304,18 @@ static bool
 vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
gimple *vec_stmt, slp_tree slp_node)
 {
-  tree vec_dest;
+  tree vec_dest, vec_dest2 = NULL_TREE;
+  tree vec_dest3 = NULL_TREE, vec_dest4 = NULL_TREE;
   tree scalar_dest;
   tree op0, op1 = NULL_TREE, op2 = NULL_TREE;
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  tree vectype;
+  tree vectype, wide_vectype = NULL_TREE;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   enum tree_code code;
   enum machine_mode vec_mode;
   tree new_temp;
   int op_type;
-  optab optab;
+  optab optab, optab2 = NULL;
   int icode;
   tree def;
   gimple def_stmt;
@@ -3327,6 +3332,8 @@ vectorizable_operation (gimple stmt, gim
   tree vop0, vop1, vop2;
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
   int vf;
+  unsigned char *sel = NULL;
+  tree decl1 = NULL_TREE, decl2 = NULL_TREE, perm_mask = NULL_TREE;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
 return false;
@@ -3451,31 +3458,97 @@ vectorizable_operation (gimple stmt, gim
   optab = optab_for_tree_code (code, vectype, optab_default);
 
   /* Supportable by target?  */
-  if (!optab)
+  if (!optab && code != MULT_HIGHPART_EXPR)
 {
   if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "no optab.");
   return false;
 }
   vec_mode = TYPE_MODE (vectype);
-  icode = (int) optab_handler (optab, vec_mode);
+  icode = optab ? (int) optab_handler (optab, vec_mode) : CODE_FOR_nothing;
+
+  if (icode == CODE_FOR_nothing
+  && code == MULT_HIGHPART_EXPR
+  && VECTOR_MODE_P (vec_mode)
+  && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN)
+{
+  /* If MULT_HIGHPART_EXPR isn't supported by the backend, see
+if we can emit VEC_WIDEN_MULT_{LO,HI}_EXPR followed by VEC_PERM_EXPR
+or builtin_mul_widen_{even,odd} followed by VEC_PERM_EXPR.  */
+  unsigned int prec = TYPE_PRECISION (TREE_TYPE (scalar_dest));
+  unsigned int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scalar_dest));
+  tree wide_type
+   = build_nonstandard_integer_type (prec * 2, unsignedp);
+  wide_vectype
+= get_same_sized_vectype (wide_type, vectype);
+
+  sel = XALLOCAVEC (unsigned char, nunits_in);
+  if (VECTOR_MODE_P (TYPE_MODE (wide_vectype))
+ && GET_MODE_SIZE (TYPE_MODE (wide_vectype))
+== GET_MODE_SIZE (vec_mode))
+   {
+ if (targetm.vectorize.builtin_mul_widen_even
+ && (decl1 = targetm.vectorize.builtin_mul_widen_even (vectype))
+ && targetm.vectorize.builtin_mul_widen_odd
+ && (decl2 = targetm.vectorize.builtin_mul_widen_odd (vectype))
+ && TYPE_MODE (TREE_TYPE (TREE_TYPE (decl1)))
+== TYPE_MODE (wide_vectype))
+   {
+ for (i = 0; i < nunits_in; i++)
+   sel[i] = !BYTES_BIG_ENDIAN + (i & ~1)
++ ((i & 1) ? nunits_in : 0);
+ if (0 && can_vec_perm_p (vec_mode, false, sel))
+   icode = 0;
+   }
+ if (icode == CODE_FOR_nothing)
+   {
+ decl1 = NULL_TREE;
+   

Re: [wwwdocs] Update coding conventions for C++

2012-06-28 Thread Joseph S. Myers
On Wed, 27 Jun 2012, Lawrence Crowl wrote:

> >> +Namespaces
> >> +
> >> +
> >> +Namespaces are encouraged.
> >> +All separable libraries should have a unique global namespace.
> >> +All individual tools should have a unique global namespace.
> >> +Nested include directories names should map to nested namespaces when
> >> possible.
> >> +
> >
> > Do all people have a consensus on the use of namespace ?
> 
> Well, we really only know about objections, and I have not seen any.

I certainly think namespaces are a useful feature to use in GCC (with a 
namespace for the gcc/ directory, or as you imply separate ones for the 
driver and the compilers proper, one for libcpp, one for each front end, 
etc.).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 07:03:37AM -0700, Mike Stump wrote:
> > Also, this scenario of silently deciding whether or not to use the
> > linker plugin could bring us to different test results for the same
> > command lines.  I don't like that.
> 
> Right, which is why the static configuration of the host system at build
> time is forever after an invariant.  The linker is smelled, it doesn't
> support plugins, therefore we can't ever use it, therefore we never build
> it...

This test is not about whether to build the plugin, but whether to force
using it by default.  And to be able to use it by default, you need a
guarantee that all the linkers you'll use it with do support the plugin.
Therefore, if the build-time linker doesn't support it, I think it is just
fine to assume that not all of your linkers support the plugin and not
enable it by default.

Jakub


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Richard Guenther
On Thu, Jun 28, 2012 at 4:08 PM, Jakub Jelinek  wrote:
> On Thu, Jun 28, 2012 at 07:03:37AM -0700, Mike Stump wrote:
>> > Also, this scenario of silently deciding whether or not to use the
>> > linker plugin could bring us to different test results for the same
>> > command lines.  I don't like that.
>>
>> Right, which is why the static configuration of the host system at build
>> time is forever after an invariant.  The linker is smelled, it doesn't
>> support plugins, therefore we can't ever use it, therefore we never build
>> it...
>
> This test is not about whether to build the plugin, but whether to force
> using it by default.  And to be able to use it by default, you need a
> guarantee that all the linkers you'll use it with do support the plugin.
> Therefore, if the build-time linker doesn't support it, I think it is just
> fine not all of your linkers support the plugin and not enable it by
> default.

I'd like to have a more reliable way to enable/disable the default use
of the linker-plugin then.  Something in config.gcc maybe, or at least
a flag I can specify at configure time.  If the default in config.gcc is
detected to not work then explicitly changing that (or confirming it)
would be required - otherwise we'd error out.

Richard.


Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Eric Botcazou
> Probably suppress both, since they no longer make sense (they are testing
> an early implementation of 'Old, before 'Old was standardized in Ada 2012).
>
> I'll take care of it.

Thanks!

-- 
Eric Botcazou


Re: [Ada] Attribute 'Old should only be used in postconditions

2012-06-28 Thread Arnaud Charlet
> > Probably suppress both, since they no longer make sense (they are testing
> > an early implementation of 'Old, before 'Old was standardized in Ada
> > 2012).
> >
> > I'll take care of it.
> 
> Thanks!

Sure, done for the record (revision 189042).


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 07:05, Jakub Jelinek wrote:
> Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
> to pessimize the generated code for gcc.dg/vect/pr51581-3.c
> testcase (at least with -O3 -mavx) compared to when the hooks aren't
> present, because i?86 has more natural support for widen mult lo/hi
> compared to widen mult even/odd, but I assume that on powerpc it is the
> other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
> and builtin_mul_widen_* are possible for the particular vectype which one
> will be cheaper?

I would assume that if the builtin exists, then it is cheaper.

I disagree about "x86 has more natural support for hi/lo".  The basic sse2 
multiplication is even.  One shift per input is needed to generate odd.  On the 
other hand, one interleave per input is required for both hi/lo.  So 4 setup 
insns for hi/lo, and 2 setup insns for even/odd.  And on top of all that, XOP 
includes multiply odd at least for signed V4SI.
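
As a scalar illustration of the even/odd route under discussion, the sketch below emulates a four-lane unsigned highpart multiply in plain C: two widening multiplies (even and odd lanes) followed by a permutation that picks the high half of each product.  The function names are hypothetical, the lane count is fixed at 4, and the little-endian lane layout is an assumption; the selector follows the same formula as sel[] in the patch with !BYTES_BIG_ENDIAN equal to 1.

```c
#include <assert.h>
#include <stdint.h>

#define NUNITS 4

/* Widening multiply of the even (PARITY 0) or odd (PARITY 1) lanes.  */
void
widen_mul_parity (const uint32_t *a, const uint32_t *b, unsigned parity,
                  uint64_t out[NUNITS / 2])
{
  for (unsigned i = 0; i < NUNITS / 2; i++)
    out[i] = (uint64_t) a[2 * i + parity] * b[2 * i + parity];
}

/* Highpart multiply built from even/odd widening multiplies plus a
   permutation, emulated on scalars.  */
void
mul_highpart_even_odd (const uint32_t a[NUNITS], const uint32_t b[NUNITS],
                       uint32_t hi[NUNITS])
{
  uint64_t even[NUNITS / 2], odd[NUNITS / 2];
  uint32_t concat[2 * NUNITS];

  widen_mul_parity (a, b, 0, even);
  widen_mul_parity (a, b, 1, odd);

  /* View both product vectors as 32-bit lanes (low half first) and
     concatenate them, as the two inputs of a permutation would be.  */
  for (unsigned i = 0; i < NUNITS / 2; i++)
    {
      concat[2 * i] = (uint32_t) even[i];
      concat[2 * i + 1] = (uint32_t) (even[i] >> 32);
      concat[NUNITS + 2 * i] = (uint32_t) odd[i];
      concat[NUNITS + 2 * i + 1] = (uint32_t) (odd[i] >> 32);
    }

  /* Select the high half of product I: even results come from the
     first vector, odd results from the second.  */
  for (unsigned i = 0; i < NUNITS; i++)
    {
      unsigned sel = 1 + (i & ~1u) + ((i & 1) ? NUNITS : 0);
      assert (sel < 2 * NUNITS);
      hi[i] = concat[sel];
    }
}
```

For four lanes the selector evaluates to {1, 5, 3, 7}, so a single permutation after the two multiplies is enough; the hi/lo route would need an analogous selector over differently ordered products.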

I'll have a look at the test case you mention while I re-look at the patches...


r~


Re: [PATCH, GCC][AArch64] Use Enums for code models option selection

2012-06-28 Thread Tejas Belagod

Tejas Belagod wrote:

Marcus Shawcroft wrote:

On 13/06/12 14:38, Sofiane Naci wrote:

Hi,

I discovered a bug in my previous patch, so I attach a new one.
The ChangeLog hasn't changed.
OK to commit?

Thanks
Sofiane


-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Sofiane Naci
Sent: 31 May 2012 10:55
To: gcc-patches@gcc.gnu.org
Subject: [PATCH, GCC][AArch64] Use Enums for code models option selection

Hi,

This patch re-factors code models option selection in the AArch64 port:

  . Renaming variables such as mem_model to cmodel, for better clarity.
  . Using the generic support for enumerated option arguments.
  . Fixing touched code layout and formatting issues.

Thanks
Sofiane

-

ChangeLog:

2012-05-31  Sofiane Naci

[AArch64] Use Enums for code models option selection.

* config/aarch64/aarch64-elf-raw.h (AARCH64_DEFAULT_MEM_MODEL):
Delete.
* config/aarch64/aarch64-linux.h (AARCH64_DEFAULT_MEM_MODEL):
Delete.
* config/aarch64/aarch64-opts.h (enum aarch64_code_model): New.
* config/aarch64/aarch64-protos.h: Update comments.
* config/aarch64/aarch64.c: Update comments.
(aarch64_default_mem_model): Rename to aarch64_code_model.
(aarch64_expand_mov_immediate): Remove error message.
(aarch64_select_rtx_section): Remove assertion and update comment.
(aarch64_override_options): Move memory model initialization from
here.
(struct aarch64_mem_model): Delete.
(aarch64_memory_models[]): Delete.
(initialize_aarch64_memory_model): Rename to
initialize_aarch64_code_model
and update.
(aarch64_classify_symbol): Handle AARCH64_CMODEL_TINY and
AARCH64_CMODEL_TINY_PIC
* config/aarch64/aarch64.h
(enum aarch64_memory_model): Delete.
(aarch64_default_mem_model): Rename to aarch64_cmodel.
(HAS_LONG_COND_BRANCH): Update.
(HAS_LONG_UNCOND_BRANCH): Update.
* config/aarch64/aarch64.opt
(cmodel): New.
(mcmodel): Update.

OK




I've checked this in on aarch64-branch upstream for Sofiane.

Tejas.


Sorry, I broke the build when I applied this patch. Attached is a patch that 
fixes this. Build and regressions are happy. OK to commit?


Thanks,
Tejas Belagod.
ARM.

Changelog

2012-06-28  Tejas Belagod  

gcc/
* config/aarch64/aarch64.h (aarch64_cmodel): Fix enum name.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index ce2f899..5e24cd7 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -802,7 +802,7 @@ enum aarch64_builtins
 /* Check TLS Descriptors mechanism is selected.  */
 #define TARGET_TLS_DESC (aarch64_tls_dialect == TLS_DESCRIPTORS)
 
-extern enum aarch64_memory_model aarch64_cmodel;
+extern enum aarch64_code_model aarch64_cmodel;
 
 /* When using the tiny addressing model conditional and unconditional branches
can span the whole of the available address space (1MB).  */

[lra] trunk merged into the branch

2012-06-28 Thread Vladimir Makarov

I merged trunk at 188913 into the lra branch.  Some changes were required to 
make the lra branch bootstrap on x86/x86-64 and ppc.

2012-06-23  Vladimir Makarov

* lra.c (check_rtl): Add arg to insn_invalid_p call.

* lra-assigns.c (init_regno_assign_info): Use
ira_class_hard_regs_num instead of ira_available_class_regs.
(reload_pseudo_compare_func): Ditto.

* lra-constraints.c (extract_loc_address_regs): Set up disp_loc
first.  Transfer true for context_p only when base_reg_loc is
defined.  Add processing UNSPEC.
(process_addr_reg): Reload always for non-reg.
(equiv_address_substitution): Add arg to plus_constant calls.
(curr_insn_transform): Don't process addresses for operators.
Change duplication updates.
(inherit_reload_reg): Use ira_class_hard_regs_num instead of
ira_available_class_regs.

* lra-eliminations.c (for_sum, lra_eliminate_regs_1): Add arg to
plus_constant calls.
(eliminate_regs_in_insn): Ditto.

2012-06-25  Vladimir Makarov

* output.h (alter_subreg): Add new argument.

* sdbout.c (sdbout_symbol): Pass new argument to alter_subreg.

* dbxout.c (dbxout_symbol_location): Ditto.

* final.c (final_scan_insn, cleanup_subreg_operands): Ditto.
(walk_alter_subreg, output_operand): Ditto.
(alter_subreg): Add new argument.

* emit-rtl.c (gen_rtx_REG): Add lra_in_progress.

* config/rs6000/rs6000.c (rs6000_legitimate_offset_address_p):
Always pass true to legitimate_constant_pool_address_p when
lra_in_progress.
(rs6000_legitimate_address_p): Ditto.

* lra-int.h (lra_update_operator_dups): New.

* lra.c (lra): Put lra_in_progress after
lra_hard_reg_substitution.

* lra-spills.c (lra_hard_reg_substitution): Pass new argument to
alter_subreg.  Call lra_update_operator_dups.

* lra-eliminations.c (lra_eliminate_regs_1):  Pass new argument to
alter_subreg.

* lra-constraints.c (simplify_operand_subreg): Ditto.
(curr_insn_transform): Use lra_update_operator_dups.




Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Thu, Jun 28, 2012 at 08:57:23AM -0700, Richard Henderson wrote:
> On 2012-06-28 07:05, Jakub Jelinek wrote:
> > Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
> > to pessimize the generated code for gcc.dg/vect/pr51581-3.c
> > testcase (at least with -O3 -mavx) compared to when the hooks aren't
> > present, because i?86 has more natural support for widen mult lo/hi
> > compared to widen mult even/odd, but I assume that on powerpc it is the
> > other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
> > and builtin_mul_widen_* are possible for the particular vectype which one
> > will be cheaper?
> 
> I would assume that if the builtin exists, then it is cheaper.
> 
> I disagree about "x86 has more natural support for hi/lo".  The basic sse2
> multiplication is even.  One shift per input is needed to generate odd. 
> On the other hand, one interleave per input is required for both hi/lo. 
> So 4 setup insns for hi/lo, and 2 setup insns for even/odd.  And on top of
> all that, XOP includes multiply odd at least for signed V4SI.

Perhaps the problem is then that the permutation is much more expensive
for even/odd.  With even/odd the f2 routine is:
        vmovdqa     d(%rip), %xmm2
        vmovdqa     .LC1(%rip), %xmm0
        vpsrlq      $32, %xmm2, %xmm4
        vmovdqa     d+16(%rip), %xmm1
        vpmuludq    %xmm0, %xmm2, %xmm5
        vpsrlq      $32, %xmm0, %xmm3
        vpmuludq    %xmm3, %xmm4, %xmm4
        vpmuludq    %xmm0, %xmm1, %xmm0
        vmovdqa     .LC2(%rip), %xmm2
        vpsrlq      $32, %xmm1, %xmm1
        vpmuludq    %xmm3, %xmm1, %xmm3
        vmovdqa     .LC3(%rip), %xmm1
        vpshufb     %xmm2, %xmm5, %xmm5
        vpshufb     %xmm1, %xmm4, %xmm4
        vpshufb     %xmm2, %xmm0, %xmm2
        vpshufb     %xmm1, %xmm3, %xmm1
        vpor        %xmm4, %xmm5, %xmm4
        vpor        %xmm1, %xmm2, %xmm1
        vpsrld      $1, %xmm4, %xmm4
        vmovdqa     %xmm4, c(%rip)
        vpsrld      $1, %xmm1, %xmm1
        vmovdqa     %xmm1, c+16(%rip)
        ret
and with lo/hi it is:
        vmovdqa     d(%rip), %xmm2
        vpunpckhdq  %xmm2, %xmm2, %xmm3
        vpunpckldq  %xmm2, %xmm2, %xmm2
        vmovdqa     .LC1(%rip), %xmm0
        vpmuludq    %xmm0, %xmm3, %xmm3
        vmovdqa     d+16(%rip), %xmm1
        vpmuludq    %xmm0, %xmm2, %xmm2
        vshufps     $221, %xmm2, %xmm3, %xmm2
        vpsrld      $1, %xmm2, %xmm2
        vmovdqa     %xmm2, c(%rip)
        vpunpckhdq  %xmm1, %xmm1, %xmm2
        vpunpckldq  %xmm1, %xmm1, %xmm1
        vpmuludq    %xmm0, %xmm2, %xmm2
        vpmuludq    %xmm0, %xmm1, %xmm0
        vshufps     $221, %xmm0, %xmm2, %xmm0
        vpsrld      $1, %xmm0, %xmm0
        vmovdqa     %xmm0, c+16(%rip)
        ret

Jakub


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread H.J. Lu
On Thu, Jun 28, 2012 at 8:57 AM, Richard Henderson  wrote:
> On 2012-06-28 07:05, Jakub Jelinek wrote:
>> Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
>> to pessimize the generated code for gcc.dg/vect/pr51581-3.c
>> testcase (at least with -O3 -mavx) compared to when the hooks aren't
>> present, because i?86 has more natural support for widen mult lo/hi
>> compared to widen mult even/odd, but I assume that on powerpc it is the
>> other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
>> and builtin_mul_widen_* are possible for the particular vectype which one
>> will be cheaper?
>
> I would assume that if the builtin exists, then it is cheaper.
>
> I disagree about "x86 has more natural support for hi/lo".  The basic sse2 
> multiplication is even.  One shift per input is needed to generate odd.  On 
> the other hand, one interleave per input is required for both hi/lo.  So 4 
> setup insns for hi/lo, and 2 setup insns for even/odd.  And on top of all 
> that, XOP includes multiply odd at least for signed V4SI.
>
> I'll have a look at the test case you mention while I re-look at the 
> patches...
>

The upper 128 bits of 256-bit AVX instructions aren't a good fit with the
current vectorizer infrastructure.


-- 
H.J.


Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans

2012-06-28 Thread Janis Johnson
On 06/27/2012 05:05 PM, Mike Stump wrote:
> On Jun 27, 2012, at 3:36 PM, Janis Johnson wrote:
>> These scans from gcc.dg/vect/vect-50.c, and others similar to them in
>> other vect tests, hurt my brain:
>>
>> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 
>> "vect" { xfail { vect_no_align } } } }  */
>> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 
>> "vect" { target vect_hw_misalign } } } */
>>
>> Both of these PASS for i686-pc-linux-gnu, causing duplicate lines in the
>> gcc test summary.  I'm pretty sure the following accomplishes the same
>> goal:
>>
>> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 
>> "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
> 
> I don't think so?  The first sets the xfail status for the testcase.  If you 
> change the condition, you change the xfail state for some targets, which would 
> be wrong (without a vec person chiming in).

The two checks are run separately.  The first one runs everywhere and
is expected to fail for vect_no_align.  The second is only run for
vect_hw_misalign.  Targets for which vect_no_align is false and
vect_hw_misalign is true get two PASS reports.

> I'd like to think you can compose the two with some spelling...  I just don't 
> think this one is it.?

No, there is no way to combine "target" and "xfail", although since we
intercept them we could presumably come up with a way to do that, with
syntax and semantics we design.

> I grepped around and found:
> 
>   /* { dg-message "does break strict-aliasing" "" { target { *-*-* && lp64 } 
> xfail *-*-* } 8 } */
> 
> which might have the right way to spell it, though, I always test to ensure 
> the construct does what I want.

Nope.  That should be flagged as an error by dg-message but it's
passed through GCC's process-message, which ignores errors (a bug)
and simply ignores the directive.  I'm currently trying a fix to
not ignore errors from dg-error/dg-warning/dg-message and will then
fix up the broken tests.

>> That is, run the check everywhere
> 
> We don't want to run the test on other than vect_hw_misalign targets, right?
> 

I don't know, but right now it's run everywhere at least once.

Janis


Re: [PATCH, GCC][AArch64] Use Enums for code models option selection

2012-06-28 Thread Richard Earnshaw
On 28/06/12 16:58, Tejas Belagod wrote:
> 
> Sorry, I broke the build when I applied this patch. Attached is a patch that 
> fixes this. Build and regressions are happy. OK to commit?
> 
> Thanks,
> Tejas Belagod.
> ARM.
> 
> Changelog
> 
> 2012-06-28  Tejas Belagod  
> 
> gcc/
>   * config/aarch64/aarch64.h (aarch64_cmodel): Fix enum name.
> 
> 

OK.

R.




Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 09:20, Jakub Jelinek wrote:
> Perhaps the problem is then that the permutation is much more expensive
> for even/odd.  With even/odd the f2 routine is:
...
> vpshufb %xmm2, %xmm5, %xmm5
> vpshufb %xmm1, %xmm4, %xmm4
> vpor    %xmm4, %xmm5, %xmm4
...
> and with lo/hi it is:
> vshufps $221, %xmm2, %xmm3, %xmm2

Hmm.  That second has a reformatting delay.

Last week when I pulled the mulv4si3 routine out to i386.c,
I experimented with a few different options, including that
interleave+shufps sequence seen here for lo/hi.  See the 
comment there discussing options and timing.

This also shows a deficiency in our vec_perm logic:

0L 0H 2L 2H 1L 1H 3L 3H
0H 2H 0H 2H 1H 3H 1H 3H 2*pshufd
0H 1H 2H 3H punpckldq

without the permutation constants in memory.


r~


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 00:17, Jakub Jelinek wrote:
> 2012-06-28  Jakub Jelinek  
> 
>   PR tree-optimization/53645
>   * tree-vect-generic.c (expand_vector_divmod): Use MULT_HIGHPART_EXPR
>   instead of VEC_WIDEN_MULT_{HI,LO}_EXPR followed by VEC_PERM_EXPR
>   if possible.
> 
>   * gcc.c-torture/execute/pr53645-2.c: New test.

Ok.


r~


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Richard Henderson
On 2012-06-28 07:05, Jakub Jelinek wrote:
>   PR tree-optimization/51581
>   * tree-vect-stmts.c (permute_vec_elements): Add forward decl.
>   (vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR
>   also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus
>   VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported.
>   * tree-vect-patterns.c (vect_recog_divmod_pattern): Use
>   MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts.
> 
>   * gcc.dg/vect/pr51581-4.c: New test.

Ok, except,

> +   if (0 && can_vec_perm_p (vec_mode, false, sel))
> + icode = 0;

Testing hack left in.


r~


[C++ Pubnames Patch] Anonymous namespaces enclosed in named namespaces. (issue6343052)

2012-06-28 Thread Sterling Augustine
The enclosed patch fixes the pubnames for anonymous namespaces contained
within named namespaces, and adds an extensive test for the various pubnames.

The bug is that, when printing at verbosity level 1, if lang_decl_name sees a
namespace decl that is not in the global namespace, it prints the namespace's
enclosing scopes--so far so good. However, the code I added earlier this month
to handle anonymous namespaces also prints the enclosing scopes, so one would
get foo::foo::(anonymous namespace) instead of foo::(anonymous namespace).

The solution is to stop the added code from printing the enclosing scope, which
is correct for both verbosity levels 0 and 1. Level 2 is handled elsewhere and
so not relevant.

I have formalized the tests I have been using to be sure pubnames are correct
and included them in this patch. The test is based on ccoutant's
gdb_index_test.cc from the gold test suite.

OK for mainline?

Sterling


gcc/cp/ChangeLog

2012-06-28  Sterling Augustine  

* error.c (lang_decl_name): Use TFF_UNQUALIFIED_NAME flag.

gcc/testsuite/ChangeLog

2012-06-28  Sterling Augustine  

* g++.dg/debug/dwarf2/pubnames-2.C: New.

Index: cp/error.c
===
--- cp/error.c  (revision 189025)
+++ cp/error.c  (working copy)
@@ -2633,7 +2633,7 @@
 dump_function_name (decl, TFF_PLAIN_IDENTIFIER);
   else if ((DECL_NAME (decl) == NULL_TREE)
&& TREE_CODE (decl) == NAMESPACE_DECL)
-dump_decl (decl, TFF_PLAIN_IDENTIFIER);
+dump_decl (decl, TFF_PLAIN_IDENTIFIER | TFF_UNQUALIFIED_NAME);
   else
 dump_decl (DECL_NAME (decl), TFF_PLAIN_IDENTIFIER);

Index: testsuite/g++.dg/debug/dwarf2/pubnames-2.C
===
--- testsuite/g++.dg/debug/dwarf2/pubnames-2.C  (revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pubnames-2.C  (revision 0)
@@ -0,0 +1,194 @@
+// { dg-do compile }
+// { dg-options "-gpubnames -gdwarf-4 -std=c++0x -dA" }
+// { dg-final { scan-assembler ".section\t.debug_pubnames" } }
+// { dg-final { scan-assembler "\"\\(anonymous namespace\\)0\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"one0\"+\[ \t\]+\[#;]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"one::G_A0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"one::G_B0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"one::G_C0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"one::\\(anonymous namespace\\)0\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"two0\"+\[ \t\]+\[#;]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"F_A0\"+\[ \t\]+\[#;]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"F_B0\"+\[ \t\]+\[#;]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"F_C0\"+\[ \t\]+\[#;]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"inline_func_10\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"one::c1::c10\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"one::c1::~c10\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"one::c1::val0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"check_enum0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"main0\"+\[ \t\]+\[#;]+\[ \t\]+external 
name" } }
+// { dg-final { scan-assembler "\"two::c2::c20\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::c20\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::c20\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"check0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"check \\>0\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"check \\>0\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"check \\>0\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::val0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::val0\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::val0\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler 
"\"__static_initialization_and_destruction_00\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::~c20\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::~c20\"+\[ \t\]+\[#;]+\[ 
\t\]+external name" } }
+// { dg-final { scan-assembler "\"two::c2::~c20\"+\[ 
\t\]+\[#;]+\[ \t\]+external name" } }
+// { dg-final { scan-assembler "\"_GLOBAL__sub_I__ZN3o

[lra] a patch to fix last testsuite regression on x86/x86-64

2012-06-28 Thread Vladimir Makarov
The following patch fixes the last GCC testsuite regression (in comparison 
with reload) on x86/x86-64 after the last merge of trunk into lra.


The patch actually implements Bernd's recent optimization (restoring an 
argument pseudo value from the call result) in LRA.


The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 189051.

2012-06-28  Vladimir Makarov 

* lra-constraints.c (inherit_in_ebb): Implement restoring argument
pseudo value from the call result.


Index: lra-constraints.c
===
--- lra-constraints.c	(revision 189016)
+++ lra-constraints.c	(working copy)
@@ -4344,7 +4344,7 @@ inherit_in_ebb (rtx head, rtx tail)
 {
   int i, src_regno, dst_regno;
   bool change_p, succ_p;
-  rtx prev_insn, next_usage_insns, set,  first_insn, last_insn, next_insn;
+  rtx prev_insn, next_usage_insns, set, first_insn, last_insn, next_insn;
   enum reg_class cl;
   struct lra_insn_reg *reg;
   basic_block last_processed_bb, curr_bb = NULL;
@@ -4354,7 +4354,6 @@ inherit_in_ebb (rtx head, rtx tail)
   bitmap_iterator bi;
   bool head_p, after_p;
 
-
   change_p = false;
   curr_usage_insns_check++;
   reloads_num = calls_num = 0;
@@ -4536,7 +4535,41 @@ inherit_in_ebb (rtx head, rtx tail)
   to_inherit[i].insns))
 	  change_p = true;
 	  if (CALL_P (curr_insn))
-	calls_num++;
+	{
+	  rtx cheap, pat, dest, restore;
+	  int regno, hard_regno;
+
+	  calls_num++;
+	  if ((cheap = find_reg_note (curr_insn,
+	  REG_RETURNED, NULL_RTX)) != NULL_RTX
+		  && ((cheap = XEXP (cheap, 0)), true)
+		  && (regno = REGNO (cheap)) >= FIRST_PSEUDO_REGISTER
+		  && (hard_regno = reg_renumber[regno]) >= 0
+		  /* If there are pending saves/restores, the
+		 optimization is not worth.  */
+		  && usage_insns[regno].calls_num == calls_num - 1
+		  && TEST_HARD_REG_BIT (call_used_reg_set, hard_regno))
+		{
+		  /* Restore the pseudo from the call result as
+		 REG_RETURNED note says that the pseudo value is
+		 in the call result and the pseudo is an argument
+		 of the call.  */
+		  pat = PATTERN (curr_insn);
+		  if (GET_CODE (pat) == PARALLEL)
+		pat = XVECEXP (pat, 0, 0);
+		  dest = SET_DEST (pat);
+		  start_sequence ();
+		  emit_move_insn (cheap, copy_rtx (dest));
+		  restore = get_insns ();
+		  end_sequence ();
+		  lra_process_new_insns (curr_insn, NULL, restore,
+	 "Inserting call parameter restore");
+		  /* We don't need to save/restore of the pseudo from
+		 this call.  */
+		  usage_insns[regno].calls_num = calls_num;
+		  bitmap_set_bit (&check_only_regs, regno);
+		}
+	}
 	  to_inherit_num = 0;
 	  /* Process insn usages.  */
 	  for (reg = curr_id->regs; reg != NULL; reg = reg->next)


[PATCH][RFC, Reload]. Reload bug?

2012-06-28 Thread Tejas Belagod


Hi,

Attached is a fix for what seems to be a reload bug while handling 
subreg(mem...). I ran into this problem while implementing support for struct 
load/store in AArch64 using the standard patterns 
vec_load_lanes and vec_store_lanes, along the same lines as the ARM 
backend. The test case that caused the issue was:


void SexiALI_Convert(void *vdest, void *vsrc, unsigned int frames, int n)
{
 unsigned int x;
 short *src = vsrc;
 unsigned char *dest = vdest;
 for(x=0;x<256;x++)
 {
  int tmp;
  tmp = *src;
  src++;
  tmp += *src;
  src++;
  *dest++ = tmp;
 }
}

Before reload, this is the RTL dump I see:

.
(insn 110 114 111 4 (set (reg:V8HI 158 [ vect_var_.21 ])
(subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 0)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (nil))

(insn 111 110 115 4 (set (reg:V8HI 159 [ vect_var_.22 ])
(subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 16)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (expr_list:REG_DEAD (reg:OI 530 [ vect_array.20 ])
(nil)))

(insn 115 111 116 4 (set (reg:V8HI 161 [ vect_var_.24 ])
(subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 0)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (nil))

(insn 116 115 117 4 (set (reg:V8HI 162 [ vect_var_.25 ])
(subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 16)) ice.i:9 512 
{*aarch64_simd_movv8hi}

 (expr_list:REG_DEAD (reg:OI 529 [ vect_array.23 ])
(nil)))

(insn 117 116 118 4 (set (reg:V4SI 544 [ vect_var_.27 ])
(sign_extend:V4SI (vec_select:V4HI (reg:V8HI 159 [ vect_var_.22 ])
(parallel:V8HI [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
] ice.i:11 700 {aarch64_simd_vec_unpacks_lo_v8hi}
 (nil))

(insn 118 117 124 4 (set (reg:V4SI 545 [ vect_var_.26 ])
(sign_extend:V4SI (vec_select:V4HI (reg:V8HI 158 [ vect_var_.21 ])
(parallel:V8HI [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
] ice.i:9 700 {aarch64_simd_vec_unpacks_lo_v8hi}
 (nil))

.

In insn 116, reg_equiv_mem () of the pseudoreg 529 is (mem:OI (reg sp)), and the 
subreg is equivalent to:

(subreg:V8HI (mem:OI (reg sp)) 16)
which does not get folded into
(mem:V8HI (plus:DI (reg sp) (const_int 16)))
because, in reload.c:find_reloads_toplev (), where such subregs are narrowed into
narrower memrefs, the memref supplied to strict_memory_address_addr_space_p () is
just (mem:OI (reg sp)) and the SUBREG_BYTE is forgotten. Therefore
strict_memory_address_addr_space_p () thinks that (mem:OI (reg sp)) is a
valid target address, lets it pass as a subreg, and does not narrow the subreg
into a narrower memref. find_reloads_toplev () should in fact have given
strict_memory_address_addr_space_p () the address in
(mem:OI (plus:DI (reg sp) (const_int 16))), for which it returns false because
base+offset is invalid for NEON addressing modes, so the operand would be
reloaded into a narrower memref.

Also, I tried writing a secondary reload for this, but at no time is the RTL
(subreg:V8HI (mem:OI (reg sp)) 16)
available to the target secondary reload for it to fix it up.

Therefore, I've fixed find_reloads_toplev () to pass the full address to 
strict_memory_address_addr_space_p () in the case of subregs.


Does this look like a sane fix?

I've tested this patch on arm-none-eabi and bootstrapped on x86_64-pc-linux and
all is well.

Thanks,
Tejas Belagod.
ARM.

Changelog:

2012-06-28  Tejas Belagod  

gcc/
* reload.c (find_reloads_toplev): Include the subreg byte in the address
of memrefs when converting subregs of mems into narrower memrefs.

diff --git a/gcc/reload.c b/gcc/reload.c
index e42cc5c..b6d4ce9 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -4771,15 +4771,27 @@ find_reloads_toplev (rtx x, int opnum, enum reload_type 
type,
 #ifdef LOAD_EXTEND_OP
  && !paradoxical_subreg_p (x)
 #endif
- && (reg_equiv_address (regno) != 0
- || (reg_equiv_mem (regno) != 0
- && (! strict_memory_address_addr_space_p
- (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
-  MEM_ADDR_SPACE (reg_equiv_mem (regno)))
- || ! offsettable_memref_p (reg_equiv_mem (regno))
- || num_not_at_initial_offset
-   x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
-  insn, address_reloaded);
+)
+   {
+ if (reg_equiv_address (regno) != 0)
+   x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
+insn, address_reloaded);
+ else if (reg_equiv_mem (regno) != 0)
+   {
+ tem =
+   simplify_gen_subreg (GET_MODE (x), reg_equiv_mem (regno),
+  

Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Andrew Pinski
On Thu, Jun 28, 2012 at 6:50 AM, Matthew Gretton-Dann
 wrote:
> On 28/06/12 14:38, Mike Stump wrote:
>>
>> On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann wrote:
>>>
>>> On 27/06/12 21:35, Andrew Pinski wrote:
>>>>
>>>> On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann wrote:
>>>>>
>>>>> All,
>>>>>
>>>>> This patch enables the dump-noaddr test to work in out-of-build-tree
>>>>> testing.
>>>
>>> [snip]
>>>>
>>>>
>>>> I created a much simpler patch which I have been meaning to submit.
>>>> I attached it for reference.
>>>>
>>>>
>>>> Thanks,
>>>> Andrew Pinski
>>>>
>>>> ChangeLog:
>>>> * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
>>>> an absolute dump base instead of a relative one.
>>>>
>>>> Index: gcc.c-torture/unsorted/dump-noaddr.x
>>>> ===
>>>> --- gcc.c-torture/unsorted/dump-noaddr.x    (revision 61452)
>>>> +++ gcc.c-torture/unsorted/dump-noaddr.x    (revision 61453)
>>>> @@ -11,10 +11,10 @@ proc dump_compare { src options } {
>>>>      foreach option $option_list {
>>>>      file delete -force dump1
>>>>      file mkdir dump1
>>>> -    c-torture-compile $src "$option $options -dumpbase dump1/$dumpbase
>>>> -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all
>>>> -fdump-tree-all -fdump-noaddr"
>>>> +    c-torture-compile $src "$option $options -dumpbase
>>>> [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1
>>>> -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
>>>>      file delete -force dump2
>>>>      file mkdir dump2
>>>> -    c-torture-compile $src "$option $options -dumpbase dump2/$dumpbase
>>>> -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr"
>>>> +    c-torture-compile $src "$option $options -dumpbase
>>>> [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all
>>>> -fdump-tree-all -fdump-noaddr"
>>>>      foreach dump1 [lsort [glob -nocomplain dump1/*]] {
>>>>          regsub dump1/ $dump1 dump2/ dump2
>>>>          set dumptail "gcc.c-torture/unsorted/[file tail $dump1]"
>>>
>>>
>>> What I don't like about this approach is that dump1 and dump2 are
>>> created in the current working directory.
>>
>>
>> On vxworks as I recall we did a cd to tmpdir, is that generally true?
>> Also, if one telnets in or sshes into the host under test, the cd is
>> mandatory... as otherwise one would dump turds (that's a technical term)
>> in the home directory which would be very uncool.  Maybe a better
>> approach would be to cd to the right place if all the Canadian setups cd,
>> as that then unifies them.
>>
>>> With out of build-tree testing this may not (I believe) be the same as
>>> $tmpdir (where temporaries are normally created).  Also the current
>>> directory may already contain directories/files called dump1 or dump2
>>> which will get destroyed by running the
>>
>>
>> The point of the cd was to get to a place where temps can be created
>> freely...
>>
>>> I've not committed my version yet in case I am missing something in my
>>> reasoning above with regards to the relationship between the current
>>> working directory and $tmpdir.
>>
>>
>> So the question would be, does his patch work for you?  It was unclear to
>> me if the answer is no.
>
>
> Sorry - the patch works for my use case (build==host), but I was concerned
> over the use of [pwd] vs $tmpdir.

Both will work in the case of build==host.  I don't even know if we
really support build!=host testing at all.  I have never seen it done
and I have no idea how to control it via dejagnu.  Has anyone tested
build!=host recently?

Thanks,
Andrew Pinski

>
>> Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the
>> directory on the target, the host or the build machine?  And is that
>> going to the host machine?  They are not the same.  One needs a directory
>> on the host machine.
>
>
> I don't think this applies to my patch though, so are you still okay for my
> version to go in or is there something else I haven't considered?
>
>
> Thanks,
>
> Matt
>
> --
> Matthew Gretton-Dann
> Principal Engineer, PD Software - Tools, ARM Ltd
>
>


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 11:42 AM, Andrew Pinski wrote:
> Both will work in the case of build==host.  I don't even know if we
> really support build!=host testing at all.

Sure...  works just fine, last I knew.  Generally easy enough to fixup, if 
people get it wrong.

> I have never seen it done and I have no idea how to control it via dejagnu.  
> Has anyone tested
> build!=host recently?

Be curious to know if people do this anymore.  Host testing a lame OS, like 
MS-DOS...  was why it was put in.


[PATCH] Fix PR46556 (straight-line strength reduction, part 2)

2012-06-28 Thread William J. Schmidt
Here's a relatively small piece of strength reduction that solves that
pesky addressing bug that got me looking at this in the first place...

The main part of the code is the stuff that was reviewed last year, but
which needed to find a good home.  So hopefully that's in pretty good
shape.  I recast base_cand_map as an htab again since I now need to look
up trees other than SSA names.  I plan to put together a follow-up patch
to change code and commentary references so that "base_name" becomes
"base_expr".  Doing that now would clutter up the patch too much.

Bootstrapped and tested on powerpc64-linux-gnu with no new regressions.
Ok for trunk?

Thanks,
Bill


gcc:

PR tree-optimization/46556
* gimple-ssa-strength-reduction.c (enum cand_kind): Add CAND_REF.
(base_cand_map): Change to hash table.
(base_cand_hash): New function.
(base_cand_free): Likewise.
(base_cand_eq): Likewise.
(lookup_cand): Change base_cand_map to hash table.
(find_basis_for_candidate): Likewise.
(base_cand_from_table): Exclude CAND_REF.
(restructure_reference): New function.
(slsr_process_ref): Likewise.
(find_candidates_in_block): Call slsr_process_ref.
(dump_candidate): Handle CAND_REF.
(base_cand_dump_callback): New function.
(dump_cand_chains): Change base_cand_map to hash table.
(replace_ref): New function.
(replace_refs): Likewise.
(analyze_candidates_and_replace): Call replace_refs.
(execute_strength_reduction): Change base_cand_map to hash table.

gcc/testsuite:

PR tree-optimization/46556
* testsuite/gcc.dg/tree-ssa/slsr-27.c: New.
* testsuite/gcc.dg/tree-ssa/slsr-28.c: New.
* testsuite/gcc.dg/tree-ssa/slsr-29.c: New.


Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+}
+
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 3 "dom2" } } */
+/* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 12)
+foo (p->a[n], p->c[n], p->b[n]);
+  else if (n > 3)
+foo (p->b[n], p->a[n], p->c[n]);
+}
+
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 9 "dom2" } } */
+/* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 3)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 12)
+   foo (p->b[n], p->a[n], p->c[n]);
+}
+}
+
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 9 "dom2" } } */
+/* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 189025)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -32,7 +32,7 @@ along with GCC; see the file COPYING3.  If not see
2) Explicit multiplies, unknown constant multipliers,
   no conditional increments. (data gathering complete,
   replacements pending)
-   3) Implicit multiplies in addressing expressions. (pending)
+   3) Implicit multiplies in addressing expressions. (complete)
4) Explicit multiplies, conditional inc

Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Bernhard Reutner-Fischer
On Thu, Jun 28, 2012 at 04:05:58PM +0200, Jakub Jelinek wrote:
>On Thu, Jun 28, 2012 at 09:17:55AM +0200, Jakub Jelinek wrote:
>> I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and
>> vectorizing it as either of the sequences next.
>
>And here is corresponding pattern recognizer and vectorizer patch.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
>Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems
>to pessimize the generated code for gcc.dg/vect/pr51581-3.c
>testcase (at least with -O3 -mavx) compared to when the hooks aren't
>present, because i?86 has more natural support for widen mult lo/hi
>compared to widen mult even/odd, but I assume that on powerpc it is the
>other way around.  So, how should I find out if both VEC_WIDEN_MULT_*_EXPR
>and builtin_mul_widen_* are possible for the particular vectype which one
>will be cheaper?
>
>2012-06-28  Jakub Jelinek  
>
>   PR tree-optimization/51581
>   * tree-vect-stmts.c (permute_vec_elements): Add forward decl.
>   (vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR
>   also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus
>   VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported.
>   * tree-vect-patterns.c (vect_recog_divmod_pattern): Use
>   MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts.
>
>   * gcc.dg/vect/pr51581-4.c: New test.
>
>--- gcc/tree-vect-stmts.c.jj   2012-06-26 11:38:28.0 +0200
>+++ gcc/tree-vect-stmts.c  2012-06-28 13:27:50.475158271 +0200
>@@ -3300,17 +3304,18 @@ static bool

>+  icode = optab ? (int) optab_handler (optab, vec_mode) : CODE_FOR_nothing;
>+
>+  if (icode == CODE_FOR_nothing
>+  && code == MULT_HIGHPART_EXPR
>+  && VECTOR_MODE_P (vec_mode)
>+  && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN)
>+{
>+  /* If MULT_HIGHPART_EXPR isn't supported by the backend, see
>+   if we can emit VEC_WIDEN_MULT_{LO,HI}_EXPR followed by VEC_PERM_EXPR
>+   or builtin_mul_widen_{even,odd} followed by VEC_PERM_EXPR.  */
>+  unsigned int prec = TYPE_PRECISION (TREE_TYPE (scalar_dest));
>+  unsigned int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scalar_dest));
>+  tree wide_type
>+  = build_nonstandard_integer_type (prec * 2, unsignedp);
>+  wide_vectype
>+= get_same_sized_vectype (wide_type, vectype);
>+
>+  sel = XALLOCAVEC (unsigned char, nunits_in);
>+  if (VECTOR_MODE_P (TYPE_MODE (wide_vectype))
>+&& GET_MODE_SIZE (TYPE_MODE (wide_vectype))
>+   == GET_MODE_SIZE (vec_mode))
>+  {
>+if (targetm.vectorize.builtin_mul_widen_even
>+&& (decl1 = targetm.vectorize.builtin_mul_widen_even (vectype))
>+&& targetm.vectorize.builtin_mul_widen_odd
>+&& (decl2 = targetm.vectorize.builtin_mul_widen_odd (vectype))
>+&& TYPE_MODE (TREE_TYPE (TREE_TYPE (decl1)))
>+   == TYPE_MODE (wide_vectype))
>+  {
>+for (i = 0; i < nunits_in; i++)
>+  sel[i] = !BYTES_BIG_ENDIAN + (i & ~1)
>+   + ((i & 1) ? nunits_in : 0);
>+if (0 && can_vec_perm_p (vec_mode, false, sel))
>+  icode = 0;
>+  }
>+if (icode == CODE_FOR_nothing)
>+  {
>+decl1 = NULL_TREE;
>+decl2 = NULL_TREE;
>+optab = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
>+ vectype, optab_default);
>+optab2 = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
>+  vectype, optab_default);

Really both HI? If so optab2 could be removed from that fn altogether..

>+if (optab != NULL
>+&& optab2 != NULL
>+&& optab_handler (optab, vec_mode) != CODE_FOR_nothing
>+&& optab_handler (optab2, vec_mode) != CODE_FOR_nothing)
>+  {
>+for (i = 0; i < nunits_in; i++)
>+  sel[i] = !BYTES_BIG_ENDIAN + 2 * i;
>+if (can_vec_perm_p (vec_mode, false, sel))
>+  icode = optab_handler (optab, vec_mode);
>+  }
>+  }
>+  }
>+  if (icode == CODE_FOR_nothing)
>+  {
>+if (optab_for_tree_code (code, vectype, optab_default) == NULL)
>+  {
>+if (vect_print_dump_info (REPORT_DETAILS))
>+  fprintf (vect_dump, "no optab.");
>+return false;
>+  }
>+wide_vectype = NULL_TREE;
>+optab2 = NULL;
>+  }
>+}
>+
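
For readers following along, here is a minimal scalar sketch in C of what
the VEC_WIDEN_MULT_{LO,HI}_EXPR plus VEC_PERM_EXPR fallback above computes.
It is not code from the patch: NUNITS, mulhi_ref, host_little_endian,
mulhi_emulated and mulhi_selfcheck are made-up names, and the model assumes
unsigned 32-bit elements on a host whose byte and word order agree (the
condition the patch checks with BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN).  The
widened products are viewed as twice as many narrow elements, and the
selector !BYTES_BIG_ENDIAN + 2 * i keeps the high half of each product.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUNITS 4   /* hypothetical number of 32-bit lanes */

/* Reference: high 32 bits of each unsigned 32x32-bit product.  */
void
mulhi_ref (const uint32_t *a, const uint32_t *b, uint32_t *out)
{
  for (int i = 0; i < NUNITS; i++)
    out[i] = (uint32_t) (((uint64_t) a[i] * b[i]) >> 32);
}

/* Return 1 on a little-endian host; stands in for !BYTES_BIG_ENDIAN.  */
int
host_little_endian (void)
{
  uint16_t one = 1;
  unsigned char first;

  memcpy (&first, &one, 1);
  return first == 1;
}

/* Emulation: widening multiply, then a permutation that keeps only the
   high half of every wide element.  */
void
mulhi_emulated (const uint32_t *a, const uint32_t *b, uint32_t *out)
{
  uint64_t wide[NUNITS];
  uint32_t narrow[2 * NUNITS];

  /* Models the pair of widening multiplies: one 64-bit product per lane.  */
  for (int i = 0; i < NUNITS; i++)
    wide[i] = (uint64_t) a[i] * b[i];

  /* Reinterpret the wide products as narrow elements, as a
     VIEW_CONVERT-style cast would.  */
  memcpy (narrow, wide, sizeof wide);

  /* Models the permutation with sel[i] = !BYTES_BIG_ENDIAN + 2 * i.  */
  for (int i = 0; i < NUNITS; i++)
    out[i] = narrow[host_little_endian () + 2 * i];
}

/* Compare the emulation against the reference on a few inputs.  */
void
mulhi_selfcheck (void)
{
  const uint32_t a[NUNITS] = { 0xFFFFFFFFu, 0x80000000u, 3u, 0x12345678u };
  const uint32_t b[NUNITS] = { 0xFFFFFFFFu, 2u, 5u, 0x9ABCDEF0u };
  uint32_t ref[NUNITS], emu[NUNITS];

  mulhi_ref (a, b, ref);
  mulhi_emulated (a, b, emu);
  for (int i = 0; i < NUNITS; i++)
    assert (ref[i] == emu[i]);
}
```

Calling mulhi_selfcheck () from a small main () exercises both paths.  The
even/odd variant in the patch differs only in which lanes each widening
multiply covers and in the selector used afterwards.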


Re: [wwwdocs] Update coding conventions for C++

2012-06-28 Thread Lawrence Crowl
On 6/27/12, Lawrence Crowl  wrote:
> ..., does anyone object to removing the permission to use C++
> streams?

Having heard no objection, I removed the permission.

The following patch is the current state of the changes.  Since the
discussion appears to have died down, can I commit this patch?

BTW, as before, I have removed the  tags from this patch,
as they cause the mail server to reject the patch.


Index: htdocs/codingconventions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
retrieving revision 1.66
diff -u -u -r1.66 codingconventions.html
--- htdocs/codingconventions.html   19 Feb 2012 00:45:34 -  1.66
+++ htdocs/codingconventions.html   28 Jun 2012 22:03:38 -
@@ -15,8 +19,73 @@
 code to follow these conventions, it is best to send changes to follow
 the conventions separately from any other changes to the code.

+
+Documentation
+ChangeLogs
+Portability
+Makefiles
+Testsuite Conventions
+Diagnostics Conventions
+Spelling, terminology and markup
+C and C++ Language Conventions
+
+Compiler Options
+Language Use
+
+Assertions
+Character Testing
+Error Node Testing
+Parameters Affecting Generated Code
+Inlining Functions
+
+
+Formatting Conventions
+
+Line Length
+Names
+Expressions
+
+
+
+
+C++ Language Conventions
+
+Language Use
+
+Variable Definitions
+Struct Definitions
+Class Definitions
+Constructors and Destructors
+Conversions
+Overloading Functions
+Overloading Operators
+Default Arguments
+Inlining Functions
+Templates
+Namespaces
+RTTI and dynamic_cast
+Other Casts
+Exceptions
+The Standard Library
+
+
+Formatting Conventions
+
+Names
+Struct Definitions
+Class Definitions
+Class Member Definitions
+Templates
+Extern "C"
+Namespaces
+
+
+
+
+

-Documentation
+
+Documentation

 Documentation, both of user interfaces and of internals, must be
 maintained and kept up to date.  In particular:
@@ -43,7 +112,7 @@
 


-ChangeLogs
+ChangeLogs

 GCC requires ChangeLog entries for documentation changes; for the web
 pages (apart from java/ and libstdc++/) the CVS
@@ -71,20 +140,40 @@
 java/58 is the actual number of the PR) at the top
 of the ChangeLog entry.

-Portability
+Portability

 There are strict requirements for portability of code in GCC to
-older systems whose compilers do not implement all of the ISO C standard.
-GCC requires at least an ANSI C89 or ISO C90 host compiler, and code
-should avoid pre-standard style function definitions, unnecessary
-function prototypes and use of the now deprecated @code{PARAMS} macro.
+older systems whose compilers do not implement all of the
+latest ISO C and C++ standards.
+
+
+
+The directories
+gcc, libcpp and fixincludes
+may use C++03.
+They may also use the long long type
+if the host C++ compiler supports it.
+These directories should use reasonably portable parts of C++03,
+so that it is possible to build GCC with C++ compilers other than GCC itself.
+If testing reveals that
+reasonably recent versions of non-GCC C++ compilers cannot compile GCC,
+then GCC code should be adjusted accordingly.
+(Avoiding unusual language constructs helps immensely.)
+Furthermore,
+these directories should also be compatible with C++11.
+
+
+
+The directories libiberty and libdecnumber must use C
+and require at least an ANSI C89 or ISO C90 host compiler.
+C code should avoid pre-standard style function definitions, unnecessary
+function prototypes and use of the now deprecated PARAMS macro.
 See http://gcc.gnu.org/cgi-bin/cvsweb.cgi/~checkout~/gcc/gcc/README.Portability?content-type=text/plain&only_with_tag=HEAD (README.Portability)
 for details of some of the portability problems that may arise.  Some
 of these problems are warned about by gcc -Wtraditional,
 which is included in the default warning options in a bootstrap.
-(Code outside the C front end is only compiled by GCC, so such
-requirements do not apply to it.)
+

 The programs included in GCC are linked with the
 libiberty library, which will replace some standard
@@ -108,12 +197,6 @@
 the release cycle, to reduce the risk involved in fixing a problem
 that only shows up on one particular system.

-Avoid the use of identifiers or idioms that would prevent code
-compiling with a C++ compiler.  Identifiers such as new
-or class, that are reserved words in C++, should not be
-used as variables or field names.  Explicit casts should be used to
-convert between void* and other pointer types.
-
 Function prototypes for extern functions should only occur in
 header files.  Functions should be ordered within source files to
 minimize the number of function pr

Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-28 Thread Alexandre Oliva
On Jun 28, 2012, Mike Stump  wrote:

> On Jun 28, 2012, at 4:39 AM, Alexandre Oliva  wrote:
>> That still doesn't sound right to me: why should the compiler refrain
>> from using a perfectly functional linker plugin on the machine where
>> it's installed (not where it's built)?

> See your point below for one reason.

My point below suggests a reason for us to *verbosely* indicate the
change, e.g., in the test command line, like my patch does.

> The next would be because it would be a speed hit to re-check at
> runtime the qualities of the linker and do something different.

But then, our testsuite *does* re-check at runtime; without my patch,
we just don't make full use of the result of that test.

> If the system had an architecture to avoid the speed hit and people
> wanted to do the work to support the runtime reconfigure, that'd be
> fine with me.

Me too, but I'm not arguing for or against that.  I'm just arguing for a
change to the test harness that will use the result of the dynamic test,
and verbosely so.

>> Also, this scenario of silently deciding whether or not to use the
>> linker plugin could bring us to different test results for the same
>> command lines.  I don't like that.

> Right, which is why the static configuration of the host system at
> build time is forever after an invariant.

That doesn't even match *current* reality.

We can run the testsuite on a machine that's neither the build system
nor the run-time target.  That's presumably why the test harness tests
whether the plugin works.  And that's one reason why we should use that
result instead of letting the compiler override it.

> The linker is smelled, it doesn't support plugins, therefore we can't
> ever use it, therefore we never build it...

'cept even in the build system it *does* support plugins, so it's just
reasonable for us to build the plugin, and for the compiler to expect to
be able to use it.

Now, this will work just fine if the compiler is installed on a system
that matches the host=target (i.e., native compiler) triplet specified
when building the compiler.  It might not work on the build machine, but
that's irrelevant, for we're not supposed to be able to use the compiler
on the build machine.  It might not work on the test machine, and that's
why the test harness tests for plugin support.  But the test harness
doesn't communicate back to the compiler its findings without my patch,
so if the test system doesn't happen to support plugins, we'd get tons
of pointless failures.

If we change the compiler configuration so that it disables the plugin
just because it guesses some potential incompatibility between the
linker and the plugin we're about to build, we'll lose features and
testing.

If we change the compiler to detect it dynamically, we'll get ambiguous
test results.  “did this -flto test use the plugin or not?”

Why would you want any of the scenarios in the two paragraphs above?

If you wouldn't, what do you have against the patch that complements the
plugin detection on the test machine in the test harness?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [Patch, libgfortran] Add FPU Support for powerpc

2012-06-28 Thread Steven Bosscher
On Tue, May 22, 2012 at 3:45 AM, rbmj  wrote:
> Hi everyone,
>
> This patch adds FPU support for powerpc on platforms that do not have glibc.
>  It is basically the same code as glibc has.  The motivation for this was
> that right now there is no fpu-target.h that works for powerpc-*-vxworks.
>
> Again, 90% of this code comes directly from glibc.  But on vxworks targets
> there is no glibc.
>
> I also patched the configure.host script in order to add this in.
>
> Any opinions?

Since AFAICT nobody has responded...

I suppose this is something you need, or you would probably not be
working on it. I wouldn't have thought of VxWorks as an obvious target
platform for a Fortran compiler. :-)

The copying of the code from glibc (LGPL code) to libgfortran
(GPL+exception) is something that you probably need permission for
from the FSF. For the VxWorks specific bits, you could poke the only
listed VxWorks maintainer in MAINTAINERS (hi Nathan!).

For the configure.host bits,

+  powerpc)

Not powerpc64? Or at least "powerpc|ppc"?
IIUC this test is overridden for powerpc-linux by a glibc test
following your new code, right? What happens for e.g. powerpc-aix?
Shouldn't your test also be conditional on have_feenableexcept?

Ciao!
Steven


Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Bernhard Reutner-Fischer
Rehi Janis,

Good to see you active again :)

Perhaps you want to pursue this? We'd need to suggest this to dejagnu,
have it in a release and bump the minimum required deja version of gcc.
So it may take time but IMO would be a worthwhile cleanup.
Or do you see a better way to handle this properly?

The first patch below is the dejagnu part, the other patch is the
corresponding follow-up for gcc.
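
To make the idea concrete without reading the Tcl, here is a rough C
sketch of the lookup that the dejagnu patch below gives load_lib: try the
fixed directories first, then any extra ones the testsuite registered.
All names here (exists_fn, search_dirs, find_lib, fake_exists,
find_lib_selfcheck) and the paths are invented for illustration, and the
fixed list is shortened; this is not how runtest.exp is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Callback that reports whether PATH exists (hypothetical; a real
   caller might wrap access () or fopen ()).  */
typedef int (*exists_fn) (const char *path);

/* Try FILE in each of DIRS in order.  On success leave the full path in
   OUT and return 1; otherwise empty OUT and return 0.  */
int
search_dirs (const char *const *dirs, size_t ndirs, const char *file,
             exists_fn exists, char *out, size_t outsz)
{
  for (size_t i = 0; i < ndirs; i++)
    {
      int n = snprintf (out, outsz, "%s/%s", dirs[i], file);
      if (n < 0 || (size_t) n >= outsz)
        continue;               /* path would not fit, skip it */
      if (exists (out))
        return 1;
    }
  if (outsz > 0)
    out[0] = '\0';
  return 0;
}

/* Search a (shortened, made-up) fixed list first, then the extra
   directories in LIBDIRS, mirroring the new "libdirs" global.  */
int
find_lib (const char *file, const char *const *libdirs, size_t nlibdirs,
          exists_fn exists, char *out, size_t outsz)
{
  static const char *const builtin[] = { "../lib", "." };

  if (search_dirs (builtin, sizeof builtin / sizeof builtin[0],
                   file, exists, out, outsz))
    return 1;
  return search_dirs (libdirs, nlibdirs, file, exists, out, outsz);
}

/* Stand-in for the filesystem: only one file "exists".  */
int
fake_exists (const char *path)
{
  return strcmp (path, "/src/gcc/testsuite/lib/gcc-dg.exp") == 0;
}

/* Exercise the lookup with one extra directory.  */
void
find_lib_selfcheck (void)
{
  const char *const extra[] = { "/src/gcc/testsuite/lib" };
  char buf[256];

  assert (find_lib ("gcc-dg.exp", extra, 1, fake_exists, buf, sizeof buf));
  assert (strcmp (buf, "/src/gcc/testsuite/lib/gcc-dg.exp") == 0);
  assert (!find_lib ("missing.exp", extra, 1, fake_exists, buf, sizeof buf));
}
```

With something like this in place, a testsuite only has to append its
directory to the extra list once instead of loading every dependency by
hand.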

cheers,
Bernhard

On Fri, Mar 16, 2012 at 03:59:58PM +0100, Bernhard Reutner-Fischer wrote:
>On Fri, Mar 16, 2012 at 11:04:45AM +0100, Bernhard Reutner-Fischer wrote:
>
>>The underlying problem is that dejagnu's runtest.exp only allows for a
>>single "libdir" where it searches for includes -- see comment in
>>libgomp.exp and libitm.exp
>>
>>While just adding more and more load_gcc_lib calls to users outside of
>>gcc/ is the easy way out, it is (IMHO) error prone (i ran make check
>>just in gcc and not in toplevel, fixed my script now).
>>
>>It would be desirable if dejagnu would just find all the currently
>>load_gcc_lib'ed files on its own, via load_lib.
>>One could
>>- teach dejagnu to treat libdir as a list of paths
>
>The attached works for me for a toplevel make -k check (double-checked
>with individual make check in lib{gomp,itm}). I do not intend to pursue
>this any further.

>runtest.exp: add libdirs list for load_lib()
>
>libgomp wants to load .exp files from ../gcc/testsuite/lib.
>Instrument load_lib to be able to find the files.
>Previously we used to have a helper proc that had to first load all
>dependent .exp manually and then, again manually, the desired .exp.
>
>2012-03-16  Bernhard Reutner-Fischer  
>
>   * runtest.exp (libdirs): New global list.
>   (load_lib): Append libdirs to search_and_load_files directories.
>
>diff --git a/runtest.exp b/runtest.exp
>index 4bfed83..8e6a7de 100644
>--- a/runtest.exp
>+++ b/runtest.exp
>@@ -589,7 +589,7 @@ proc lookfor_file { dir name } {
> # source tree, (up one or two levels), then in the current dir.
> #
> proc load_lib { file } {
>-global verbose libdir srcdir base_dir execpath tool
>+global verbose libdir libdirs srcdir base_dir execpath tool
> global loaded_libs
> 
> if {[info exists loaded_libs($file)]} {
>@@ -597,8 +597,11 @@ proc load_lib { file } {
> }
> 
> set loaded_libs($file) ""
>-
>-if { [search_and_load_file "library file" $file [list ../lib $libdir $libdir/lib [file dirname [file dirname $srcdir]]/dejagnu/lib $srcdir/lib $execpath/lib . [file dirname [file dirname [file dirname $srcdir]]]/dejagnu/lib]] == 0 } {
>+set search_dirs [list ../lib $libdir $libdir/lib [file dirname [file dirname $srcdir]]/dejagnu/lib $srcdir/lib $execpath/lib . [file dirname [file dirname [file dirname $srcdir]]]/dejagnu/lib]
>+if {[info exists libdirs]} {
>+lappend search_dirs $libdirs
>+}
>+if { [search_and_load_file "library file" $file $search_dirs ] == 0 } {
>   send_error "ERROR: Couldn't find library file $file.\n"
>   exit 1
> }
>@@ -652,6 +655,8 @@ set libdir   [file dirname $execpath]/dejagnu
> if {[info exists env(DEJAGNULIBS)]} {
> set libdir $env(DEJAGNULIBS)
> }
>+# list of extra directories for load_lib
>+set libdirs {}
> 
> verbose "Using $libdir to find libraries"
> 

>libgomp/ChangeLog
>
>2012-03-16  Bernhard Reutner-Fischer  
>
>   * testsuite/lib/libgomp.exp: Set libdirs. Remove now redundant
>   manual inclusion of gfortran-dg's dependencies.
>
>libitm/ChangeLog
>
>2012-03-16  Bernhard Reutner-Fischer  
>
>   * testsuite/lib/libitm.exp: Set libdirs. Remove now redundant
>   manual inclusion of gcc-dg's dependencies.
>
>
>diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
>index 02909f8..54e1e652 100644
>--- a/libgomp/testsuite/lib/libgomp.exp
>+++ b/libgomp/testsuite/lib/libgomp.exp
>@@ -1,32 +1,12 @@
>-# Damn dejagnu for not having proper library search paths for load_lib.
>-# We have to explicitly load everything that gcc-dg.exp wants to load.
>+global libdirs
>+lappend libdirs $srcdir/../../gcc/testsuite/lib
> 
>-proc load_gcc_lib { filename } {
>-global srcdir loaded_libs
>+load_lib dg.exp
> 
>-load_file $srcdir/../../gcc/testsuite/lib/$filename
>-set loaded_libs($filename) ""
>-}
>+# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs!
>+load_lib gcc-defs.exp
> 
>-load_lib dg.exp
>-load_gcc_lib file-format.exp
>-load_gcc_lib target-supports.exp
>-load_gcc_lib target-supports-dg.exp
>-load_gcc_lib scanasm.exp
>-load_gcc_lib scandump.exp
>-load_gcc_lib scanrtl.exp
>-load_gcc_lib scantree.exp
>-load_gcc_lib scanipa.exp
>-load_gcc_lib prune.exp
>-load_gcc_lib target-libpath.exp
>-load_gcc_lib wrapper.exp
>-load_gcc_lib gcc-defs.exp
>-load_gcc_lib torture-options.exp
>-load_gcc_lib timeout.exp
>-load_gcc_lib timeout-dg.exp
>-load_gcc_lib fortran-modules.exp
>-load_gcc_lib gcc-dg.exp
>-load_gcc_lib gfortran-dg.exp
>+load_lib gfortran-dg.exp
> 
> set dg-do-what-default run
> 
>di

Fwd: [Bug debug/53754] [4.8 Regression][lto] ICE in lhd_decl_printable_name, at langhooks.c:222 (with -g)

2012-06-28 Thread Cary Coutant
[resending in plain text. Sorry, gmail defaulted to HTML.]

Ping. I'm not looking for commit approval yet, just advice on how
thorough we need to be to support -g and LTO together.

(What's the right way to send a patch to fix a PR? I'm not even sure
whether you were cc'ed on my response.)

-cary


-- Forwarded message --
From: ccoutant at gcc dot gnu.org 
Date: Mon, Jun 25, 2012 at 2:19 PM
Subject: [Bug debug/53754] [4.8 Regression][lto] ICE in
lhd_decl_printable_name, at langhooks.c:222 (with -g)
To: ccout...@google.com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53754

Cary Coutant  changed:

          What    |Removed                     |Added

            Status|NEW                         |ASSIGNED
        AssignedTo|unassigned at gcc dot       |ccoutant at gcc dot gnu.org
                  |gnu.org                     |

--- Comment #4 from Cary Coutant 2012-06-25 21:19:17 UTC ---
Created attachment 27705
 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27705
Patch to fix ICE with -g -flto and anonymous namespace

> You can't delay producing pubnames this way with LTO.  Please fix.

The obvious problem is that we're calling langhooks.dwarf_name (in
gen_namespace_die) for an anonymous namespace, even with the default
-gno-pubnames. I can fix that by adding a check for want_pubnames just before
the call to add_pubname_string, as in the patch below. But this is still going
to ICE if you turn on -gpubnames with -flto. The only way I can think of to fix
that is to relax the assert in lhd_decl_printable_name, and just have it return an
empty string in the DECL_NAMELESS case. That will not produce the right results
for an anonymous namespace, but without front-end langhooks available to us
(and until we implement the lazy debug plan), how can we do better?

How much is expected to work today with LTO and -g? Aren't we still stuck with
calling langhooks from dwarf2out.c back-end routines? I can understand that we
don't want to ICE, but what guarantees do we make about debug info?

-cary

--
Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.





Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote:
> Perhaps you want to pursue this? We'd need to suggest this to dejagnu,

Actually, we have the technology, so that isn't necessary.  :-)  You can 
install replacements for any procs you want, not pretty, but... it does work.  
I think this is a more deterministic path forward than waiting for a mythical 
dejagnu release.  Also, we then can avoid the hassle of requiring a new dejagnu.


Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Bernhard Reutner-Fischer
On Thu, Jun 28, 2012 at 04:43:05PM -0700, Mike Stump wrote:
>On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote:
>> Perhaps you want to pursue this? We'd need to suggest this to dejagnu,
>
>Actually, we have the technology, so that isn't necessary.  :-)  You can 
>install replacements for any procs you want, not pretty, but... it does work.  
>I think this is a more deterministic path forward than waiting for a mythical 
>dejagnu release.  Also, we then can avoid the hassle of requiring a new 
>dejagnu.

Wouldn't that mean that we have to completely replace proc load_lib?
But anyway.
Mike, it would be nice if you could fix
>+# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs!

if you did not do that already -- TIA :)

That's under the assumption that one should be able to use the major
lib/*exp without including their pre-requisites first.

cheers,


[testsuite] gcc.dg/Wstrict-aliasing-converted-assigned.c: fix dg-message errors

2012-06-28 Thread Janis Johnson
Test gcc.dg/Wstrict-aliasing-converted-assigned.c uses a combination of
"target" and "xfail" selectors in a way that would be nice if it worked,
but it doesn't.  Unfortunately the local code to override dg-error and
friends ignores errors, so directives with errors have been silently
skipped.  I plan to fix that after fixing the affected tests.

This patch causes the affected dg-message directives in this test to be
XFAIL'd everywhere, with a comment asking that when the test starts
passing on the relevant targets, the "xfail" be replaced with a "target"
list.  It also adds comments to the dg-message directives to make their
messages unique in the test summary.

Tested on i686-pc-linux-gnu; OK for trunk?

Janis
2012-06-28  Janis Johnson  

* gcc.dg/Wstrict-aliasing-converted-assigned.c: Fix syntax
errors in dg-message directives, add comments.

Index: gcc.dg/Wstrict-aliasing-converted-assigned.c
===
--- gcc.dg/Wstrict-aliasing-converted-assigned.c(revision 189025)
+++ gcc.dg/Wstrict-aliasing-converted-assigned.c(working copy)
@@ -5,9 +5,12 @@
 int foo()
 {
   int i;
-  *(long*)&i = 0;  /* { dg-warning "type-punn" } */
+  *(long*)&i = 0;  /* { dg-warning "type-punn" "type-punn" } */
   return i;
 }
 
-/* { dg-message "does break strict-aliasing" "" { target { *-*-* && lp64 } xfail *-*-* } 8 } */
-/* { dg-message "initialized" "" { target { *-*-* && lp64 } xfail *-*-* } 8 } */
+/* These messages are only expected for lp64, but fail there.  When they
+   pass for lp64, replace "xfail *-*-*" with "target lp64".  */
+
+/* { dg-message "does break strict-aliasing" "break" { xfail *-*-* } 8 } */
+/* { dg-message "initialized" "init" { xfail *-*-* } 8 } */


[testsuite] add required comments to dg-message directives in g++.dg

2012-06-28 Thread Janis Johnson
Several tests in g++.dg use dg-message with a target list and line
number but without the comment field, which is required when those
additional arguments are used.  The local replacement of dg-message
silently ignores errors (something I plan to fix), so the checks have
been ignored.  Unprocessed notes (as opposed to errors and warnings)
in compiler output are intentionally ignored, so this wasn't noticed
before.

This patch adds the required comments, and the tests now pass on
i686-pc-linux-gnu.  OK for trunk?

Janis
2012-06-28  Janis Johnson  

* g++.dg/template/error46.C: Add missing comment to dg-message.
* g++.dg/template/crash107.C: Likewise.
* g++.dg/template/error47.C: Likewise.
* g++.dg/template/crash108.C: Likewise.
* g++.dg/overload/operator5.C: Likewise.

Index: g++.dg/template/error46.C
===
--- g++.dg/template/error46.C   (revision 189025)
+++ g++.dg/template/error46.C   (working copy)
@@ -8,4 +8,4 @@
 {
   foo(A<0>(), A<1>()); // { dg-error "no matching" }
 }
-// { dg-message "candidate|parameter 'N' ('0' and '1')" { target *-*-* } 9 }
+// { dg-message "candidate|parameter 'N' ('0' and '1')" "" { target *-*-* } 9 }
Index: g++.dg/template/crash107.C
===
--- g++.dg/template/crash107.C  (revision 189025)
+++ g++.dg/template/crash107.C  (working copy)
@@ -14,7 +14,7 @@
 }
 };
 Vec v(3,4,12); // { dg-error "no matching" }
-// { dg-message "note" { target *-*-* } 16 }
+// { dg-message "note" "note" { target *-*-* } 16 }
 Vec V(12,4,3);  // { dg-error "no matching" }
-// { dg-message "note" { target *-*-* } 18 }
+// { dg-message "note" "note" { target *-*-* } 18 }
 Vec c = v^V;   // { dg-message "required" }
Index: g++.dg/template/error47.C
===
--- g++.dg/template/error47.C   (revision 189025)
+++ g++.dg/template/error47.C   (working copy)
@@ -6,4 +6,4 @@
 {
   foo(0, p); // { dg-error "no matching" }
 }
-// { dg-message "candidate|parameter 'T' ('int' and 'void*')" { target *-*-* } 7 }
+// { dg-message "candidate|parameter 'T' ('int' and 'void*')" "" { target *-*-* } 7 }
Index: g++.dg/template/crash108.C
===
--- g++.dg/template/crash108.C  (revision 189025)
+++ g++.dg/template/crash108.C  (working copy)
@@ -2,4 +2,4 @@
 
 template struct A {A(int b=k(0));}; // { dg-error "arguments" }
 void f(int k){A a;} // // { dg-error "parameter|declared" }
-// { dg-message "note" { target *-*-* } 3 }
+// { dg-message "note" "note" { target *-*-* } 3 }
Index: g++.dg/overload/operator5.C
===
--- g++.dg/overload/operator5.C (revision 189025)
+++ g++.dg/overload/operator5.C (working copy)
@@ -13,4 +13,4 @@
   const String& b,
   bool ignoreCase) {
   return ignoreCase ? equalIgnoringCase(a, b) : (a == b); } // { dg-error "ambiguous" }
-// { dg-message "note" { target *-*-* } 15 }
+// { dg-message "note" "note" { target *-*-* } 15 }


[testsuite] g++.dg/cpp0x/nullptr19.c: remove duplicate dg-message

2012-06-28 Thread Janis Johnson
Test g++.dg/cpp0x/nullptr19.c contains the following:

char* k( char* );  /* { dg-message "note" } { dg-message "note" } */
nullptr_t k( nullptr_t ); /* { dg-message "note" } { dg-message "note" } */

Having two test directives on a line should have resulted in an ERROR
but the local replacement of dg-warning silently ignores errors
(something I plan to fix).  There are two notes for each of these lines,
identical but after different candidate lists.  Since they are identical
DejaGnu removes both of them after one has been processed, and there is
apparently no way to check for both of them.  At least with this patch
we'll correctly check for one for each line.

Tested on i686-pc-linux-gnu; OK for trunk?

Janis
2012-06-28  Janis Johnson  

* g++.dg/cpp0x/nullptr19.C: Remove extra directives on same line.

Index: g++.dg/cpp0x/nullptr19.C
===
--- g++.dg/cpp0x/nullptr19.C(revision 189025)
+++ g++.dg/cpp0x/nullptr19.C(working copy)
@@ -5,8 +5,8 @@
 
 typedef decltype(nullptr) nullptr_t;
 
-char* k( char* );  /* { dg-message "note" } { dg-message "note" } */
-nullptr_t k( nullptr_t ); /* { dg-message "note" } { dg-message "note" } */
+char* k( char* );  /* { dg-message "note" } */
+nullptr_t k( nullptr_t ); /* { dg-message "note" } */
 
 void test_k()
 {


Re: [testsuite] g++.dg/cpp0x/nullptr19.c: remove duplicate dg-message

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:57 PM, Janis Johnson wrote:
> Test g++.dg/cpp0x/nullptr19.c contains the following:

> OK for trunk?

Ok.


Re: [testsuite] add required comments to dg-message directives in g++.dg

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:56 PM, Janis Johnson wrote:
> Several tests in g++.dg use dg-message with a target list and line
> number but without the comment field, which is required when those
> additional arguments are used.

> OK for trunk?

Ok.


Re: [testsuite] gcc.dg/Wstrict-aliasing-converted-assigned.c: fix dg-message errors

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:55 PM, Janis Johnson wrote:
> Test gcc.dg/Wstrict-aliasing-converted-assigned.c uses a combination of
> "target" and "xfail" selectors in a way that would be nice if it worked,

> OK for trunk?

Ok.  I prefer no spacing between the comment and the dg-message lines...  ok 
either way.


Re: [PATCH] gfortran testsuite: implicitly cleanup-modules

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 5:15 PM, Bernhard Reutner-Fischer wrote:
> On Thu, Jun 28, 2012 at 04:43:05PM -0700, Mike Stump wrote:
>> On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote:
>>> Perhaps you want to pursue this? We'd need to suggest this to dejagnu,
>> 
>> Actually, we have the technology, so that isn't necessary.  :-)  You can 
>> install replacements for any procs you want, not pretty, but... it does 
>> work.  I think this is a more deterministic path forward than waiting for a 
>> mythical dejagnu release.  Also, we then can avoid the hassle of requiring a 
>> new dejagnu.
> 
> Wouldn't that mean that we have to completely replace proc load_lib?

Yes; worse, it is a cut-n-paste from dejagnu and can effectively rev lock us to 
the current dejagnu release...  One can delegate, but I don't think any pre or 
post processing in this case is enough to `fix' the issue, so it would be a 
wholesale replacement.

> But anyway.
> Mike, it would be nice if you could fix
>> +# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load 
>> gcc-defs!

Sounds like a single line fix.  It is the testing of that fix that is the 
annoying part.


Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans

2012-06-28 Thread Mike Stump
On Jun 28, 2012, at 10:26 AM, Janis Johnson wrote:
> No, there is no way to combine "target" and "xfail",

Ah...  Grrr  I hate non-composability.  Given that, I think the original 
patch is fine, subject of course to the wants and wishes of vect people.


Ping: Reorganized documentation for warnings -- attempt 2

2012-06-28 Thread David Stone
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01208.html


Re: New option to turn off stack reuse for temporaries

2012-06-28 Thread Xinliang David Li
(re-post in plain text)

Moving this to cfgexpand time is simple, and it can also be extended to
handle scoped variables. However, Jakub raised a good point: this may be
too late, because stack space overlay is not the only way to cause
trouble when the lifetime of a stack object is extended beyond the
clobber stmt.

thanks,

David

On Tue, Jun 26, 2012 at 1:28 AM, Richard Guenther
 wrote:
> On Mon, Jun 25, 2012 at 6:25 PM, Xinliang David Li  wrote:
>> Are there any more concerns about this patch? If not, I'd like to check it 
>> in.
>
> No - the fact that the flag is C++ specific but in common.opt is odd enough
> and -ftemp-reuse-stack sounds very very generic - which in fact it is not,
> it's a no-op in C.  Is there a more formal phrase for the temporary kind that
> is affected?  For me "temp" is synonymous to "auto" so I'd have expected
> the switch to turn off stack slot sharing for
>
>  {
>   int a[5];
>  }
>  {
>   int a[5];
>  }
>
> but that is not what it does.  So - a little kludgy but probably more to what
> I'd like it to be would be to move the option to c-family/c.opt enabled only
> for C++ and Obj-C++ and export it to the middle-end via a new langhook
> (the gimplifier code should be in Frontend code that lowers to GENERIC
> really and the WITH_CLEANUP_EXPR code should be C++ frontend specific ...).
>
> Thanks,
> Richard.
>
>> thanks,
>>
>> David
>>
>> On Fri, Jun 22, 2012 at 8:51 AM, Xinliang David Li  
>> wrote:
>>> On Fri, Jun 22, 2012 at 2:39 AM, Richard Guenther
>>>  wrote:
>>>> On Fri, Jun 22, 2012 at 11:29 AM, Jason Merrill wrote:
>>>>> On 06/22/2012 01:30 AM, Richard Guenther wrote:
>>>>>>>
>>>>>>> What other issues? It enables more potential code motion, but on the
>>>>>>> other hand, causes more conservative stack reuse. As far as I can tell,
>>>>>>> the handling of temporaries was added independently, after the clobbers
>>>>>>> for scoped variables were introduced. This option can be used to
>>>>>>> restore the older behavior (in handling temps).
>>>>>>
>>>>>> Well, it does not really restore the old behavior (if you mean before
>>>>>> adding CLOBBERs, not before the single patch that might have used those
>>>>>> for gimplifying WITH_CLEANUP_EXPR).  You say it disables stack-slot
>>>>>> sharing for those decls but it also does other things via side-effects
>>>>>> of no longer emitting the CLOBBER.  I say it's better to disable the
>>>>>> stack-slot sharing.
>>>>>
>>>>> The patch exactly restores the behavior of temporaries from before my
>>>>> change to add CLOBBERs for temporaries.  The primary effect of that
>>>>> change was to provide stack-slot sharing, but if there are other effects
>>>>> they are probably desirable as well, since the broken code depended on
>>>>> the old behavior.
>>>>
>>>> So you see it as a workaround option, like -fno-strict-aliasing, rather
>>>> than a debugging aid?
>>>
>>> It can be used for both purposes -- if the violations are as pervasive
>>> as the strict-aliasing cases (which seems to be so).
>>>
>>> thanks,
>>>
>>> David
>>>

>>>> Richard.
>>>>
>>>>> Jason
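
For readers following the thread, here is a minimal sketch of the lifetime
rule the CLOBBERs for temporaries rely on.  It is not taken from the patch;
the names Temp, first, demo and events are hypothetical and exist only for
illustration.  It records that a C++ temporary is destroyed at the end of
its full-expression, which is the point after which its stack slot may be
shared.  Code that keeps a pointer into the temporary past that point is
the kind of broken code the option under discussion is meant to work around.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Log of construction/destruction order (hypothetical helper).
static std::vector<std::string> events;

// A class type whose temporaries occupy a stack slot.
struct Temp {
  int data[5];
  Temp() : data{1, 2, 3, 4, 5} { events.push_back("ctor"); }
  ~Temp() { events.push_back("dtor"); }
  const int *get() const { return data; }
};

static int first(const int *p) { return p[0]; }

// Well-defined use: the temporary is read inside its own full-expression.
int demo() {
  events.clear();
  int v = first(Temp().get());      // Temp lives until the end of this statement
  events.push_back("next statement");
  // Broken pattern, deliberately not executed here:
  //   const int *p = Temp().get(); // the temporary dies at the ';'
  //   use(*p);                     // undefined: the slot may have been reused
  return v;
}
```

Calling demo() returns the value read while the temporary was alive, and
events shows the destructor running before the next statement starts.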


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-28 Thread Jakub Jelinek
On Fri, Jun 29, 2012 at 12:00:10AM +0200, Bernhard Reutner-Fischer wrote:
> Really both HI? If so optab2 could be removed from that fn altogether..

Of course, thanks for pointing that out.  I've additionally added a result
mode check (similar to what supportable_widening_operation does).
The reason for not using supportable_widening_operation is that it only
tests even/odd calls for reductions, while we can use them everywhere.

Committed as obvious.

2012-06-29  Jakub Jelinek  

* tree-vect-stmts.c (vectorizable_operation): Check both
VEC_WIDEN_MULT_LO_EXPR and VEC_WIDEN_MULT_HI_EXPR optabs.
Verify that operand[0]'s mode is TYPE_MODE (wide_vectype).

--- gcc/tree-vect-stmts.c   (revision 189053)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -3504,14 +3504,19 @@ vectorizable_operation (gimple stmt, gim
{
  decl1 = NULL_TREE;
  decl2 = NULL_TREE;
- optab = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
+ optab = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR,
   vectype, optab_default);
  optab2 = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
vectype, optab_default);
  if (optab != NULL
  && optab2 != NULL
  && optab_handler (optab, vec_mode) != CODE_FOR_nothing
- && optab_handler (optab2, vec_mode) != CODE_FOR_nothing)
+ && optab_handler (optab2, vec_mode) != CODE_FOR_nothing
+ && insn_data[optab_handler (optab, vec_mode)].operand[0].mode
+== TYPE_MODE (wide_vectype)
+ && insn_data[optab_handler (optab2,
+ vec_mode)].operand[0].mode
+== TYPE_MODE (wide_vectype))
{
  for (i = 0; i < nunits_in; i++)
sel[i] = !BYTES_BIG_ENDIAN + 2 * i;


Jakub
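
As a reading aid for the hunk above, here is a scalar emulation of how a
high-part multiply can be assembled from the two widening multiplies plus a
permutation.  This is only a sketch under stated assumptions: 8 unsigned
16-bit lanes, little-endian lane layout (so !BYTES_BIG_ENDIAN is 1), and the
"lo" half meaning the first four lanes.  The names widen_mult_half,
view_as_narrow and mult_highpart are hypothetical, not GCC functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 8;                      // lanes in the narrow vector
using Narrow = std::array<std::uint16_t, N>;
using Wide = std::array<std::uint32_t, N / 2>;

// Widening multiply of N/2 lanes starting at lane `first`
// (stands in for the LO half when first == 0, the HI half when first == N/2).
Wide widen_mult_half(const Narrow &a, const Narrow &b, std::size_t first) {
  Wide w{};
  for (std::size_t i = 0; i < N / 2; ++i)
    w[i] = std::uint32_t(a[first + i]) * std::uint32_t(b[first + i]);
  return w;
}

// View each wide lane as two narrow lanes, low half first (little-endian).
void view_as_narrow(const Wide &w, std::uint16_t *out) {
  for (std::size_t i = 0; i < N / 2; ++i) {
    out[2 * i] = std::uint16_t(w[i] & 0xFFFFu);
    out[2 * i + 1] = std::uint16_t(w[i] >> 16);
  }
}

// High part of each 16x16 product, selected like sel[i] = 1 + 2 * i.
Narrow mult_highpart(const Narrow &a, const Narrow &b) {
  std::array<std::uint16_t, 2 * N> lanes{};
  view_as_narrow(widen_mult_half(a, b, 0), lanes.data());
  view_as_narrow(widen_mult_half(a, b, N / 2), lanes.data() + N);
  Narrow r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = lanes[1 + 2 * i];                      // odd lanes hold the high halves
  return r;
}
```

Both widening multiplies are needed to cover all lanes, which matches the
patch checking the LO and HI optabs rather than the HI optab twice.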


[patch][RFA] Move the C front end to gcc/c/

2012-06-28 Thread Steven Bosscher
On Thu, Jun 21, 2012 at 4:51 PM, Joseph S. Myers
 wrote:
> On Wed, 20 Jun 2012, Steven Bosscher wrote:
>
>> I'm posting this as an RFC: Does this look like the right approach?
>> Have I overlooked other things than just documentation updates? I hope
>> this would not cause too much trouble for branches like the
>> cxx-conversion branch?
>
> Yes, this looks like the right approach.  I'm not aware of other things;
> "special" directories such as c-family need special treatment in
> po/exgettext, but that doesn't apply to normal front-end directories
> containing a config-lang.in.

Alright then. Here is the patch that, I think, is ready for the trunk.

Bootstrapped&tested on x86_64-unknown-linux-gnu (-m64/-m32) with
c,c++,objc,obj-c++,fortran enabled.
Compared the pre- and post-patch results of "make install".
OK for trunk?

Ciao!
Steven


move_C_fe.diff
Description: Binary data