[wwwdocs][committed] Update vectorizer's webpage

2011-10-23 Thread Ira Rosen
Hi,

I committed the attached update.

Ira
? yy
cvs diff: Diffing .
Index: vectorization.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/tree-ssa/vectorization.html,v
retrieving revision 1.27
diff -r1.27 vectorization.html
9,12c9,10
< The goal of this project is to develop a loop vectorizer in
< GCC, based on the tree-ssa framework. This
< work is taking place in the autovect-branch, and is merged periodically
< to mainline.
---
> The goal of this project is to develop a loop and basic block 
> vectorizer in
> GCC, based on the tree-ssa framework.
36a35,69
>   2011-10-23
>  
>  
> Vectorization of reduction in loop SLP.
> Both  multiple reduction cycles and 
>  reduction chains are supported. 
> Various  basic block vectorization (SLP)
> improvements, such as
> better data dependence analysis, support of misaligned accesses
> and multiple types, cost model.
> Detection of vector size:
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00441.html.
> Vectorization of loads with  negative 
> step.
> Improved realignment scheme: 
> http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02301.html.
> A new built-in, 
> __builtin_assume_aligned, has been added,
> through which the compiler can be hinted about pointer 
> alignment.
> Support of  strided accesses using
> memory instructions that have
> the interleaving "built in", such as NEON's vldN and vstN.
> The vectorizer now attempts to reduce over-promotion of operands 
> in some vector
> operations:
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01472.html.
>  Widening shifts are now detected and 
> vectorized
> if supported by the target.
> Vectorization of conditions with  mixed 
> types.
> Support of loops with  bool.
> 
> 
> 
> 
> 
44c77
< other then reduction cycles in nested loops) (2009-06-16)
---
> other than reduction cycles in nested loops) (2009-06-16)
82c115,116
< to this project include Revital Eres, Richard Guenther, and Ira Rosen.
---
> to this project include Revital Eres, Richard Guenther, Jakub Jelinek, 
> Michael Matz,
> Richard Sandiford, and Ira Rosen.
279c313
< example11: 
---
> example11:
323d356
<  
341d373
< 
356d387
< 
361d391
< 
371a402,498
> 
> 
> example18: Simple reduction in SLP:
> 
> int sum1;
> int sum2;
> int a[128];
> void foo (void)
> {
>   int i;
> 
>   for (i = 0; i < 64; i++)
> {
>   sum1 += a[2*i];
>   sum2 += a[2*i+1];
> }
> }
> 
> 
> example19: Reduction chain in SLP:
> 
> int sum;
> int a[128];
> void foo (void)
> {
>   int i;
> 
>   for (i = 0; i < 64; i++)
> {
>   sum += a[2*i];
>   sum += a[2*i+1];
> }
> }
> 
> 
> example20: Basic block SLP with
> multiple types, loads with different offsets, misaligned load,
> and not-affine accesses:
> 
> void foo (int * __restrict__ dst, short * __restrict__ src,
>   int h, int stride, short A, short B)
> {
>   int i;
>   for (i = 0; i < h; i++)
> {
>   dst[0] += A*src[0] + B*src[1];
>   dst[1] += A*src[1] + B*src[2];
>   dst[2] += A*src[2] + B*src[3];
>   dst[3] += A*src[3] + B*src[4];
>   dst[4] += A*src[4] + B*src[5];
>   dst[5] += A*src[5] + B*src[6];
>   dst[6] += A*src[6] + B*src[7];
>   dst[7] += A*src[7] + B*src[8];
>   dst += stride;
>   src += stride;
> }
> }
> 
> 
> example21: Backward access:
> 
> int foo (int *b, int n)
> {
>   int i, a = 0;
> 
>   for (i = n-1; i >= 0; i--)
> a += b[i];
> 
>   return a;
> }
> 
> 
> example22: Alignment hints:
> 
> void foo (int *out1, int *in1, int *in2, int n)
> {
>   int i;
> 
>   out1 = __builtin_assume_aligned (out1, 32, 16);
>   in1 = __builtin_assume_aligned (in1, 32, 16);
>   in2 = __builtin_assume_aligned (in2, 32, 0);
> 
>   for (i = 0; i < n; i++)
> out1[i] = in1[i] * in2[i];
> }
> 
> 
> example23: Widening shift:
> 
> void foo (unsigned short *src, unsigned int *dst)
> {
>   int i;
> 
>   for (i = 0; i < 256; i++)
> *dst++ = *src++ << 7;
> }
> 
372a500,530
> example24: Condition with mixed types:
> 
> #define N 1024
> float a[N], b[N];
> int c[N];
> 
> void foo (short x, short y)
> {
>   int i;
>   for (i = 0; i < N; i++)
> c[i] = a[i] < b[i] ? x : y;
> }
> 
> 
> example25: Loop with bool:
> 
> #define N 1024
> float a[N], b[N], c[N], d[N];
> int j[N];
> 
> void foo (void)
> {
>   int i;
>   _Bool x, y;
>   for (i = 0; i < N; i++)
> {
>   x = (a[i] < b[i]);
>   y = (c[i] < d[i]);
>   j[i] = x & y;
> }
> }
1355a

[patch] Fix inconsistency in invert_tree_comparison

2011-10-23 Thread Eric Botcazou
Hi,

the comment of the function reads:

/* Given a tree comparison code, return the code that is the logical inverse
   of the given code.  It is not safe to do this for floating-point
   comparisons, except for NE_EXPR and EQ_EXPR, so we receive a machine mode
   as well: if reversing the comparison is unsafe, return ERROR_MARK.  */

but the function starts with:

  if (honor_nans && flag_trapping_math)
return ERROR_MARK;

so, for example, it refuses to fold !(x == y) to x != y for FP, which is valid.
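
A small C sketch of why EQ_EXPR and NE_EXPR are the safe cases: with a NaN operand every ordered comparison is false, so !(x < y) and x >= y disagree, while !(x == y) and x != y always agree. The helper names are made up for this illustration and are not part of fold-const.c, and the sketch only looks at result values, not at floating-point exception flags.

```c
#include <math.h>
#include <stdbool.h>

/* True if !(x == y) and (x != y) give the same answer for this pair.  */
bool
eq_inversion_agrees (double x, double y)
{
  return (!(x == y)) == (x != y);
}

/* True if !(x < y) and (x >= y) give the same answer for this pair.
   This fails when either operand is a NaN, because x < y and x >= y
   are then both false.  */
bool
lt_inversion_agrees (double x, double y)
{
  return (!(x < y)) == (x >= y);
}
```

Calling lt_inversion_agrees (NAN, 1.0) returns false, whereas eq_inversion_agrees returns true for any pair of operands, NaNs included.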

Fixed by letting EQ_EXPR and NE_EXPR go through.  This makes tree-opt/44683 
regress though, but it's clear that the original fix only papered over the 
problem, as you can't infer a simple equivalence from a condition when you can 
have signed zeros around; so the patch also includes the proper fix.
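
To make the signed-zero point concrete, here is a hedged sketch (the function names are hypothetical, not taken from tree-ssa-dom.c). Since -0.0 == 0.0 is true, knowing that x == 0.0 holds does not mean x can be replaced by the constant 0.0: the two values still behave differently as a divisor.

```c
/* Uses x itself after the x == 0.0 test.  */
double
reciprocal_if_zero (double x)
{
  if (x == 0.0)
    return 1.0 / x;      /* -0.0 yields -inf, +0.0 yields +inf */
  return 0.0;
}

/* What an unsafe "x == 0.0, therefore x is 0.0" substitution would
   compute on the true edge.  */
double
reciprocal_if_zero_substituted (double x)
{
  const double zero = 0.0;
  if (x == 0.0)
    return 1.0 / zero;   /* always +inf, even when x is -0.0 */
  return 0.0;
}
```

For x = -0.0 the first function returns negative infinity and the second returns positive infinity, which is the kind of difference that recording a simple equivalence would silently introduce.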

Tested on x86_64-suse-linux, OK for mainline?


2011-10-23  Eric Botcazou  

* fold-const.c (invert_tree_comparison): Always invert EQ_EXPR/NE_EXPR.

PR tree-optimization/44683
* tree-ssa-dom.c (record_edge_info): Record simple equivalences only if
we can be sure that there are no signed zeros involved.


-- 
Eric Botcazou
Index: fold-const.c
===
--- fold-const.c	(revision 180235)
+++ fold-const.c	(working copy)
@@ -2100,15 +2100,14 @@ pedantic_non_lvalue_loc (location_t loc,
   return protected_set_expr_location_unshare (x, loc);
 }
 
-/* Given a tree comparison code, return the code that is the logical inverse
-   of the given code.  It is not safe to do this for floating-point
-   comparisons, except for NE_EXPR and EQ_EXPR, so we receive a machine mode
-   as well: if reversing the comparison is unsafe, return ERROR_MARK.  */
+/* Given a tree comparison code, return the code that is the logical inverse.
+   It is generally not safe to do this for floating-point comparisons, except
+   for EQ_EXPR and NE_EXPR, so we return ERROR_MARK in this case.  */
 
 enum tree_code
 invert_tree_comparison (enum tree_code code, bool honor_nans)
 {
-  if (honor_nans && flag_trapping_math)
+  if (honor_nans && flag_trapping_math && code != EQ_EXPR && code != NE_EXPR)
 return ERROR_MARK;
 
   switch (code)
Index: tree-ssa-dom.c
===
--- tree-ssa-dom.c	(revision 180235)
+++ tree-ssa-dom.c	(working copy)
@@ -1610,12 +1610,15 @@ record_edge_info (basic_block bb)
 {
   tree cond = build2 (code, boolean_type_node, op0, op1);
   tree inverted = invert_truthvalue_loc (loc, cond);
+  enum machine_mode mode = TYPE_MODE (TREE_TYPE (op0));
+  bool can_infer_simple_equiv
+= !(HONOR_SIGNED_ZEROS (mode) && real_zerop (op0));
   struct edge_info *edge_info;
 
   edge_info = allocate_edge_info (true_edge);
   record_conditions (edge_info, cond, inverted);
 
-  if (code == EQ_EXPR)
+  if (can_infer_simple_equiv && code == EQ_EXPR)
 {
   edge_info->lhs = op1;
   edge_info->rhs = op0;
@@ -1624,7 +1627,7 @@ record_edge_info (basic_block bb)
   edge_info = allocate_edge_info (false_edge);
   record_conditions (edge_info, inverted, cond);
 
-  if (TREE_CODE (inverted) == EQ_EXPR)
+  if (can_infer_simple_equiv && TREE_CODE (inverted) == EQ_EXPR)
 {
   edge_info->lhs = op1;
   edge_info->rhs = op0;
@@ -1632,17 +1635,21 @@ record_edge_info (basic_block bb)
 }
 
   else if (TREE_CODE (op0) == SSA_NAME
-   && (is_gimple_min_invariant (op1)
-   || TREE_CODE (op1) == SSA_NAME))
+   && (TREE_CODE (op1) == SSA_NAME
+   || is_gimple_min_invariant (op1)))
 {
   tree cond = build2 (code, boolean_type_node, op0, op1);
   tree inverted = invert_truthvalue_loc (loc, cond);
+  enum machine_mode mode = TYPE_MODE (TREE_TYPE (op1));
+  bool can_infer_simple_equiv
+= !(HONOR_SIGNED_ZEROS (mode)
+&& (TREE_CODE (op1) == SSA_NAME || real_zerop (op1)));
   struct edge_info *edge_info;
 
   edge_info = allocate_edge_info (true_edge);
   record_conditions (edge_info, cond, inverted);
 
-  if (code == EQ_EXPR)
+  if (can_infer_simple_equiv && code == EQ_EXPR)
 {
   edge_info->lhs = op0;
   edge_info->rhs = op1;
@@ -1651,7 +1658,7 @@ record_edge_info (basic_block bb)
   edge_info = allocate_edge_info (false_edge);
   record_conditions (edge_info, inverted, cond);
 
-  if (TREE_CODE (inverted) == EQ_EXPR)
+  if (can_infer_simple_equiv && TREE_CODE (inverted) == EQ_EXPR)
 {
   

Re: [patch] Fix inconsistency in invert_tree_comparison

2011-10-23 Thread Richard Guenther
On Sun, Oct 23, 2011 at 10:56 AM, Eric Botcazou  wrote:
> Hi,
>
> the comment of the function reads:
>
> /* Given a tree comparison code, return the code that is the logical inverse
>   of the given code.  It is not safe to do this for floating-point
>   comparisons, except for NE_EXPR and EQ_EXPR, so we receive a machine mode
>   as well: if reversing the comparison is unsafe, return ERROR_MARK.  */
>
> but the function starts with:
>
>  if (honor_nans && flag_trapping_math)
>    return ERROR_MARK;

Do you have an idea why we test flag_trapping_math here?

> so, for example, it refuses to fold !(x == y) to x != y for FP, which is 
> valid.
>
> Fixed by letting EQ_EXPR and NE_EXPR go through.  This makes tree-opt/44683
> regress though, but it's clear that the original fix only papered over the
> problem, as you can't infer a simple equivalence from a condition when you can
> have signed zeros around; so the patch also includes the proper fix.
>
> Tested on x86_64-suse-linux, OK for mainline?

Ok.

Thanks,
Richard.

>
> 2011-10-23  Eric Botcazou  
>
>        * fold-const.c (invert_tree_comparison): Always invert EQ_EXPR/NE_EXPR.
>
>        PR tree-optimization/44683
>        * tree-ssa-dom.c (record_edge_info): Record simple equivalences only if
>        we can be sure that there are no signed zeros involved.
>
>
> --
> Eric Botcazou
>


Re: [patch tree-optimization]: allow branch-cost optimization for truth-and/or on mode-expanded simple boolean-operands

2011-10-23 Thread Richard Guenther
On Fri, Oct 21, 2011 at 2:19 PM, Kai Tietz  wrote:
> 2011/10/21 Richard Guenther :
>> On Thu, Oct 20, 2011 at 3:08 PM, Kai Tietz  wrote:
>>> Hello,
>>>
>>> this patch re-enables the branch-cost optimization on simple boolean-typed
>>> operands that are cast to a wider integral type.  This happens because
>>> casts from boolean types are preserved, but the FE might expand a simple
>>> expression to a wider mode.
>>>
>>> I added two tests for the already working branch-cost optimization on the
>>> IA architecture and two that explicitly check boolean types.
>>>
>>> ChangeLog
>>>
>>> 2011-10-20  Kai Tietz  
>>>
>>>        * fold-const.c (simple_operand_p_2): Handle integral
>>>        casts from boolean-operands.
>>>
>>> 2011-10-20  Kai Tietz  
>>>
>>>        * gcc.target/i386/branch-cost1.c: New test.
>>>        * gcc.target/i386/branch-cost2.c: New test.
>>>        * gcc.target/i386/branch-cost3.c: New test.
>>>        * gcc.target/i386/branch-cost4.c: New test.
>>>
>>> Bootstrapped and regression tested on x86_64-unknown-linux-gnu for all 
>>> languages including Ada and Obj-C++.  Ok for apply?
>>>
>>> Regards,
>>> Kai
>>>
>>> Index: gcc-head/gcc/testsuite/gcc.target/i386/branch-cost2.c
>>> ===
>>> --- /dev/null
>>> +++ gcc-head/gcc/testsuite/gcc.target/i386/branch-cost2.c
>>> @@ -0,0 +1,16 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-gimple -mbranch-cost=2" } */
>>> +
>>> +extern int doo (void);
>>> +
>>> +int
>>> +foo (int a, int b)
>>> +{
>>> +  if (a && b)
>>> +   return doo ();
>>> +  return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "if " 1 "gimple" } } */
>>> +/* { dg-final { scan-tree-dump-times " & " 1 "gimple" } } */
>>> +/* { dg-final { cleanup-tree-dump "gimple" } } */
>>> Index: gcc-head/gcc/testsuite/gcc.target/i386/branch-cost3.c
>>> ===
>>> --- /dev/null
>>> +++ gcc-head/gcc/testsuite/gcc.target/i386/branch-cost3.c
>>> @@ -0,0 +1,16 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-gimple -mbranch-cost=2" } */
>>> +
>>> +extern int doo (void);
>>> +
>>> +int
>>> +foo (_Bool a, _Bool b)
>>> +{
>>> +  if (a && b)
>>> +   return doo ();
>>> +  return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "if " 1 "gimple" } } */
>>> +/* { dg-final { scan-tree-dump-times " & " 1 "gimple" } } */
>>> +/* { dg-final { cleanup-tree-dump "gimple" } } */
>>> Index: gcc-head/gcc/testsuite/gcc.target/i386/branch-cost4.c
>>> ===
>>> --- /dev/null
>>> +++ gcc-head/gcc/testsuite/gcc.target/i386/branch-cost4.c
>>> @@ -0,0 +1,16 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-gimple -mbranch-cost=0" } */
>>> +
>>> +extern int doo (void);
>>> +
>>> +int
>>> +foo (_Bool a, _Bool b)
>>> +{
>>> +  if (a && b)
>>> +   return doo ();
>>> +  return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "if " 2 "gimple" } } */
>>> +/* { dg-final { scan-tree-dump-not " & " "gimple" } } */
>>> +/* { dg-final { cleanup-tree-dump "gimple" } } */
>>> Index: gcc-head/gcc/fold-const.c
>>> ===
>>> --- gcc-head.orig/gcc/fold-const.c
>>> +++ gcc-head/gcc/fold-const.c
>>> @@ -3706,6 +3706,19 @@ simple_operand_p_2 (tree exp)
>>>   /* Strip any conversions that don't change the machine mode.  */
>>>   STRIP_NOPS (exp);
>>>
>>> +  /* Handle integral widening casts from boolean-typed
>>> +     expressions as simple.  This happens due casts from
>>> +     boolean-types are preserved, but FE might expands
>>> +     simple-expression to wider mode.  */
>>> +  if (INTEGRAL_TYPE_P (TREE_TYPE (exp))
>>> +      && CONVERT_EXPR_P (exp)
>>> +      && TREE_CODE (TREE_TYPE (TREE_OPERAND (exp, 0)))
>>> +        == BOOLEAN_TYPE)
>>> +    {
>>> +      exp = TREE_OPERAND (exp, 0);
>>> +      STRIP_NOPS (exp);
>>> +    }
>>> +
>>
>> Huh, well.  I think the above is just too special and you instead should
>> replace the existing STRIP_NOPS by
>>
>> while (CONVERT_EXPR_P (exp))
>>  exp = TREE_OPERAND (exp, 0);
>>
>> with a comment that conversions are considered simple.
>>
>> Ok with that change, if it bootstraps & tests ok.
>>
>> Richard.
>
> Ok, bootstrapped and regression-tested on x86_64-unknown-linux-gnu and
> applied to trunk with the modification you suggested.
>
> I have one question about the handling of TRUTH binaries in general in
> fold-const.c.  Why don't we already enforce in fold_binary that the
> operands of those operations get boolean type?  I see some advantages
> for C-AST folding here.  I've tested it and saw that we get slightly
> better results even later in the SSA passes.

Because we do not want to mess with the front ends' AST.

Richard.

> Regards,
> Kai
>
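
As a rough illustration of what the branch-cost tests in this thread scan for: when both operands are simple and free of side effects, a && b can be evaluated as one bitwise AND followed by a single branch instead of two branches. The two functions below are hand-written stand-ins with hypothetical names, not output of the folder, and they assume operands that are already 0 or 1.

```c
#include <stdbool.h>

/* Short-circuit form: conceptually one conditional branch per operand.  */
int
two_branches (bool a, bool b)
{
  if (a)
    if (b)
      return 1;
  return 0;
}

/* Branch-cost form: both operands are always evaluated, combined with
   a bitwise AND, and tested with a single branch.  Only valid because
   evaluating b has no side effects.  */
int
one_branch (bool a, bool b)
{
  if (a & b)
    return 1;
  return 0;
}
```

Both functions return the same value for all four input combinations; which form is cheaper depends on the target's branch cost.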


Re: [PATCH] Fix PR46556 (poor address generation)

2011-10-23 Thread Richard Guenther
On Fri, Oct 21, 2011 at 2:22 PM, William J. Schmidt
 wrote:
>
>
> On Fri, 2011-10-21 at 11:26 +0200, Richard Guenther wrote:
>> On Tue, Oct 18, 2011 at 4:14 PM, William J. Schmidt
>>  wrote:
>
> 
>
>> > +
>> > +  /* We don't use get_def_for_expr for S1 because TER doesn't forward
>> > +     S1 in some situations where this transform is useful, such as
>> > +     when S1 is the base of two MEM_REFs fitting the pattern.  */
>> > +  s1_stmt = SSA_NAME_DEF_STMT (TREE_OPERAND (exp, 0));
>>
>> You can't do this - this will possibly generate wrong code.  You _do_
>> have to use get_def_for_expr.  Or do it when we are still in "true" SSA 
>> form...
>>
>> Richard.
>>
>
> OK.  get_def_for_expr always returns NULL here for the cases I was
> targeting, so doing this in expand isn't going to be helpful.
>
> Rather than cram this in somewhere else upstream, it might be better to
> just wait and let this case be handled by the new strength reduction
> pass.  This is one of the easy cases with explicit multiplies in the
> instruction stream, so it shouldn't require any special handling there.
> Seem reasonable?

Yes, sure.

Richard.

> Bill
>
>


Re: [PATCH] Fix mv8plus, allow targetting Linux or Solaris from other sparc host.

2011-10-23 Thread Eric Botcazou
> This is precisely what I tried initially, and my posting was
> explicitly trying to explain that this kind of approach cannot
> work. :-)

It will work for Richard's case though and that's clearly the most glaring 
problem.  Moreover, it will bring Linux on par with Solaris, which is also a 
good thing.  And of course it will unbreak Solaris.

> Personally, I tend to build a 32-bit compiler and test 64-bit things by
> giving -m64.  Richard has been building 64-bit compilers and using -m32
> to test 32-bit stuff.
>
> Furthermore, consider that we need to solve this issue for things
> other than MASK_V8PLUS.  For example VIS2, VIS3, and FMAF all need
> similar treatment.
>
> Given that, I don't think we want to keep banging the specs for every
> new MASK we add, even if it could work.  I think the specs are quite
> convoluted as-is.

The specs are the only mechanism that discriminates between user input and the 
rest.  I don't think that rejecting them upfront is a good approach.

> I'll try to brainstorm on this, thanks for letting me know about the
> Solaris target problem.

Let's fix the regression quickly though.

-- 
Eric Botcazou


Re: [PATCH] New port for TILEPro and TILE-Gx: 5/7 libgcc port

2011-10-23 Thread Walter Lee
Here is a resubmission of the libgcc patch, removing the dependence on a header
(arch/atomic.h) that's not installed by linux.

Walter

* config.host: Handle tilegx and tilepro.
* config/tilegx/sfp-machine.h: New file.
* config/tilegx/sfp-machine32.h: New file.
* config/tilegx/sfp-machine64.h: New file.
* config/tilegx/t-softfp: New file.
* config/tilegx/t-tilegx: New file.
* config/tilepro/atomic.c: New file.
* config/tilepro/atomic.h: New file.
* config/tilepro/sfp-machine.h: New file.
* config/tilepro/softdivide.c: New file.
* config/tilepro/softmpy.S: New file.
* config/tilepro/t-tilepro: New file.



libgcc.diff.gz
Description: GNU Zip compressed data


Re: [patch] Fix inconsistency in invert_tree_comparison

2011-10-23 Thread Eric Botcazou
> Do you have an idea why we test flag_trapping_math here?

Not really, the test was added with the contradictory comment:
  http://gcc.gnu.org/ml/gcc-patches/2004-05/msg01674.html

-- 
Eric Botcazou


Re: new patches using -fopt-info (issue5294043)

2011-10-23 Thread Richard Guenther
On Fri, Oct 21, 2011 at 6:48 PM, Xinliang David Li  wrote:
> There are two proposals here. One is -fopt-info, which prints
> informational notes to stderr, and the other is -fopt-report, which is
> a more elaborate form of dump files. Do you object to both or just the
> opt-report one?

What?  I object to adding _two_ variants.  I didn't even realize
you proposed that.

>  The former is no different from any other
> informational notes we already have -- the only difference is that
> they are suppressed by default.

We do not have many informational notes, so it is different.

>>>    ..
>>>  ...
>>
>> I very well understand the intent.  But I disagree with where you start
>> to implement this.  Dump files are _not_ only for developers - after
>> all we don't have anything else.  -fopt-report can get as big and unmanageable
>> to read as dump files - in fact I argue it will be worse than dump files if
>> you go beyond very very coarse reporting.
>
> The problem with using dump files for optimization reports is that all
> optimization decisions are 'distributed' across phase-specific dump files.
> For a whole program report, the number of files that are created is
> not manageable (think about a program with 4000 sources each dumping
> 200 files).  If we create a dummy pass and suck in all optimization
> decisions in that pass's dump file -- it will be no different from
> opt-report.

Well, -fopt-whatever will just funnel selected pieces also to stderr.
I object to duplicate dumping when we just need a way to filter
what goes to dump files.

>
>>
>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>> same time annotate dump file pieces so _automatic_ filtering and
>> redirecting to stdout with something like -fopt-report would do something
>> sensible?  I don't see why dump files have to stay messy while you at
>> the same time would need to add _new_ code to dump to stdout for
>> -fopt-report.
>
> In my mind, I would like to separate all dumps into three categories.
>
> 1) IR dumps, and support dump before and after (this reminds me my
> patches are still pending :) )    -fdump-tree-pre-[before|after]-
>  Dump into .after, .before files
> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
> into .debug files.
> 3) opt report : -fdump-opt or -fopt-report
>
> Changes for 1) and 2) are mechanical but require lots of work.

You can do that, but I want the passes to use a single mechanism to
feed all three "separated dumps".

>>
>> So, no, please do it the right way that benefits both compiler developers
>> and your "power users".
>>
>> And yes, the right way is not to start adding that -fopt-report switch.
>> The right way is to make dump-files consumable by mere mortals first.
>
> I agree we need to do the right way which needs to be discussed first.
> I would argue that mere mortals will really appreciate opt-info
> (separate from dump file and opt-report).

Well, whatever you print with opt-info should still also be present
with opt-report and in dump files.  Thus it all boils down to being able
to filter what passes put in their dump files.

Richard.
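
One way to picture the "single mechanism plus filtering" idea argued for above is sketched below. Everything in it is hypothetical: the enum values, struct dump_sink and dump_printf_filtered are invented for illustration and are not an existing GCC interface. A pass tags each message with a kind, and every configured sink (a dump file, stderr, ...) receives only the kinds it asked for.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message categories, mirroring the three kinds of dumps
   discussed in the thread.  */
enum dump_kind
{
  DUMP_IR = 1 << 0,        /* IR before/after a pass */
  DUMP_DEBUG = 1 << 1,     /* debug tracing */
  DUMP_OPT_INFO = 1 << 2   /* optimization notes for users */
};

/* A destination together with the kinds of messages it accepts.  */
struct dump_sink
{
  FILE *stream;
  unsigned kinds;
};

/* Print one message to every sink whose filter includes KIND.
   Returns the number of sinks that received it.  */
int
dump_printf_filtered (const struct dump_sink *sinks, int n_sinks,
                      unsigned kind, const char *fmt, ...)
{
  int written = 0;
  for (int i = 0; i < n_sinks; i++)
    {
      va_list ap;
      if (sinks[i].stream == NULL || !(sinks[i].kinds & kind))
        continue;
      va_start (ap, fmt);   /* restart the argument list for each sink */
      vfprintf (sinks[i].stream, fmt, ap);
      va_end (ap);
      written++;
    }
  return written;
}
```

With a sink for stderr that accepts only DUMP_OPT_INFO and a dump-file sink that accepts everything, a pass issues each message once and the same text lands in both places without duplicated dumping code.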

> thanks,
>
> David
>
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Thanks,
>>>
>>> David
>>>

 So, please fix dump-files instead.  And for coverage/profiling, fill
 in stuff in a dump-file!

 Richard.

> It would be interesting to have some warnings about missed SRA
> opportunities in =1 or =2. I found that sometimes fixing those can give a
> large speedup.
>
> Right now a common case that prevents SRA on structure field
> is simply a memset or memcpy.
>
> -Andi
>
>
> --
> a...@linux.intel.com -- Speaking for myself only
>

>>>
>>
>


Re: [PATCH 2/3] Free large chunks in ggc

2011-10-23 Thread Richard Guenther
On Fri, Oct 21, 2011 at 8:30 PM, Andi Kleen  wrote:
>> > diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
>> > index ba88e3f..eb0eeef 100644
>> > --- a/gcc/ggc-page.c
>> > +++ b/gcc/ggc-page.c
>> > @@ -972,6 +972,54 @@ release_pages (void)
>> >   page_entry *p, *start_p;
>> >   char *start;
>> >   size_t len;
>> > +  size_t mapped_len;
>> > +  page_entry *next, *prev, *newprev;
>> > +  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
>> > +
>> > +  /* First free larger continuous areas to the OS.
>> > +     This allows other allocators to grab these areas if needed.
>> > +     This is only done on larger chunks to avoid fragmentation.
>> > +     This does not always work because the free_pages list is only
>> > +     sorted over a single GC cycle. */
>>
>> But release_pages is only called from ggc_collect, or what do you
>
> If there was a spike in GC usage and we end up with lots of free
> space in the free list afterward we free it back on the next GC cycle.
> Then if there's a malloc or other allocator later it can grab
> the address space we freed.
>
> That was done to address your earlier concern.
>
> This will only happen on ggc_collect of course.
>
> So one difference from before the madvise patch is that different
> generations of free pages can accumulate in the freelist. Before madvise
> the freelist would never contain more than one generation.
> Normally it's sorted by address due to the way GC works, but there's no
> attempt to keep the sort order over multiple generations.
>
> The "free in batch" heuristic requires sorting, so it will only
> work if all the pages are freed in a single gc cycle.
>
> I considered sorting, but it seemed to be too slow.
>
> I can expand the comment on that.

Ah, now I see ... but that's of course bad - I expect large regions to be
free only after multiple collections.  Can you measure what difference
sorting would make?
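
To show why the batch-free heuristic depends on sort order, here is a simplified model (struct free_page and count_releasable_runs are hypothetical and much simpler than the page_entry list in ggc-page.c). It walks a free list and counts runs of address-contiguous pages that reach a minimum size; pages that are adjacent in memory but not adjacent in the list are never joined.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified free-list node: start address and size of one free page.  */
struct free_page
{
  uintptr_t addr;
  size_t bytes;
  struct free_page *next;
};

/* Count the runs of list-adjacent, address-contiguous pages whose
   total size is at least MIN_BYTES.  Only such runs would be handed
   back to the OS in one batch.  */
size_t
count_releasable_runs (const struct free_page *p, size_t min_bytes)
{
  size_t runs = 0;
  while (p != NULL)
    {
      uintptr_t end = p->addr + p->bytes;
      size_t len = p->bytes;
      const struct free_page *q = p->next;

      /* Extend the run while the next entry starts where this one ends.  */
      while (q != NULL && q->addr == end)
        {
          end += q->bytes;
          len += q->bytes;
          q = q->next;
        }
      if (len >= min_bytes)
        runs++;
      p = q;
    }
  return runs;
}
```

Three contiguous pages listed in address order form one run of three pages; the same pages listed out of order form three runs of one page each, so nothing reaches the threshold.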

>
>> mean with the above?  Would the hitrate using the quire size increase
>> if we change how we allocate from the freelist or is it real fragmentation
>> that causes it?
>
> Not sure really about the hitrate. I haven't measured it. If hitrate
> was a concern the free list should be probably split into an array.
> I'm sure there are lots of other tunings that could be done on the GC,
> but probably not by me for now :)

Heh.  Yeah, I suppose the freelist could be changed into a list of
allocation groups with free pages and a bitmap.

Richard.

>>
>> I'm a bit hesitant to approve the new param, I'd be ok if we just hard-code
>> quire-size / 2.
>
> Ok replacing it with a hardcoded value.
>
> -Andi
>


Re: [PATCH 2/3] Free large chunks in ggc

2011-10-23 Thread Richard Guenther
On Sun, Oct 23, 2011 at 12:23 PM, Richard Guenther
 wrote:
> On Fri, Oct 21, 2011 at 8:30 PM, Andi Kleen  wrote:
>>> > diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
>>> > index ba88e3f..eb0eeef 100644
>>> > --- a/gcc/ggc-page.c
>>> > +++ b/gcc/ggc-page.c
>>> > @@ -972,6 +972,54 @@ release_pages (void)
>>> >   page_entry *p, *start_p;
>>> >   char *start;
>>> >   size_t len;
>>> > +  size_t mapped_len;
>>> > +  page_entry *next, *prev, *newprev;
>>> > +  size_t free_unit = PARAM_VALUE (GGC_FREE_UNIT) * G.pagesize;
>>> > +
>>> > +  /* First free larger continuous areas to the OS.
>>> > +     This allows other allocators to grab these areas if needed.
>>> > +     This is only done on larger chunks to avoid fragmentation.
>>> > +     This does not always work because the free_pages list is only
>>> > +     sorted over a single GC cycle. */
>>>
>>> But release_pages is only called from ggc_collect, or what do you
>>
>> If there was a spike in GC usage and we end up with lots of free
>> space in the free list afterward we free it back on the next GC cycle.
>> Then if there's a malloc or other allocator later it can grab
>> the address space we freed.
>>
>> That was done to address your earlier concern.
>>
>> This will only happen on ggc_collect of course.
>>
>> So one difference from before the madvise patch is that different
>> generations of free pages can accumulate in the freelist. Before madvise
>> the freelist would never contain more than one generation.
>> Normally it's sorted by address due to the way GC works, but there's no
>> attempt to keep the sort order over multiple generations.
>>
>> The "free in batch" heuristic requires sorting, so it will only
>> work if all the pages are freed in a single gc cycle.
>>
>> I considered sorting, but it seemed to be too slow.
>>
>> I can expand the comment on that.
>
> Ah, now I see ... but that's of course bad - I expect large regions to be
> free only after multiple collections.  Can you measure what difference
> sorting would make?

I wonder if the free list that falls out of a single collection is sorted
(considering also ggc_free) - if it is, building a new one at each collection
and then merging the two sorted lists should be reasonably fast.
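
A minimal sketch of that merge step, assuming each collection really does yield an address-sorted list (which is exactly the open question above). The node type and function name are hypothetical and not the ggc-page.c data structures; the merge is linear in the combined length of the two lists.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified free-list node, ordered by start address.  */
struct free_page
{
  uintptr_t addr;
  size_t bytes;
  struct free_page *next;
};

/* Merge two address-sorted lists into one address-sorted list by
   relinking the existing nodes.  No allocation is needed.  */
struct free_page *
merge_by_address (struct free_page *a, struct free_page *b)
{
  struct free_page head = { 0, 0, NULL };   /* dummy head node */
  struct free_page *tail = &head;

  while (a != NULL && b != NULL)
    {
      if (a->addr <= b->addr)
        {
          tail->next = a;
          a = a->next;
        }
      else
        {
          tail->next = b;
          b = b->next;
        }
      tail = tail->next;
    }
  /* Append whatever remains of the list that is not yet exhausted.  */
  tail->next = (a != NULL) ? a : b;
  return head.next;
}
```

Merging the list left over from earlier collections with the list produced by the current one would keep the combined free list sorted without a full sort.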

>>
>>> mean with the above?  Would the hitrate using the quire size increase
>>> if we change how we allocate from the freelist or is it real fragmentation
>>> that causes it?
>>
>> Not sure really about the hitrate. I haven't measured it. If hitrate
>> was a concern the free list should be probably split into an array.
>> I'm sure there are lots of other tunings that could be done on the GC,
>> but probably not by me for now :)
>
> Heh.  Yeah, I suppose the freelist could be changed into a list of
> allocation groups with free pages and a bitmap.
>
> Richard.
>
>>>
>>> I'm a bit hesitant to approve the new param, I'd be ok if we just hard-code
>>> quire-size / 2.
>>
>> Ok replacing it with a hardcoded value.
>>
>> -Andi
>>
>


[C++ Patch] PR 50810

2011-10-23 Thread Paolo Carlini

Hi,

this is essentially about enabling -Wnarrowing as part of -Wc++0x-compat 
(see audit trail for details). Tested x86_64-linux.
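
The default handling in the c-opts.c part of this patch boils down to a three-state flag: -1 means "not set by the user", and the final value is then chosen from the C++ dialect. A stand-alone sketch of that logic follows; the enum and function name are hypothetical, and only the -1/0/1 convention is taken from the patch.

```c
/* Hypothetical stand-in for the compiler's dialect setting.  */
enum dialect
{
  DIALECT_CXX98,
  DIALECT_CXX0X
};

/* Resolve a three-state warning flag.  REQUESTED is -1 when the user
   gave no explicit option, otherwise 0 or 1.  An explicit request
   always wins; the default is on for C++0x and off for C++98.  */
int
resolve_warn_narrowing (int requested, enum dialect d)
{
  if (requested != -1)
    return requested;
  return d == DIALECT_CXX0X ? 1 : 0;
}
```

So an explicit request for the warning in C++98 mode still enables it, and an explicit -Wno-narrowing still disables it in C++0x mode.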


Ok for mainline?

Thanks,
Paolo.

/
/c-family
2011-10-23  Paolo Carlini  

PR c++/50810
* c-opts.c (c_common_handle_option): Enable -Wnarrowing as part
of -Wall; include -Wnarrowing in -Wc++0x-compat; adjust default
Wnarrowing for C++0x and C++98.
* c.opt ([Wnarrowing]): Adjust.

/cp
2011-10-23  Paolo Carlini  

PR c++/50810
* typeck2.c (check_narrowing): Adjust OPT_Wnarrowing diagnostics.
(digest_init_r): Call check_narrowing irrespective of the C++ dialect.
* decl.c (check_initializer): Likewise.
* semantics.c (finish_compound_literal): Likewise.

/testsuite
2011-10-23  Paolo Carlini  

PR c++/50810
* g++.dg/cpp0x/warn_cxx0x.C: Rename to...
* g++.dg/cpp0x/warn_cxx0x1.C: ... this.
* g++.dg/cpp0x/warn_cxx0x2.C: New.
* g++.dg/cpp0x/warn_cxx0x3.C: Likewise.

2011-10-23  Paolo Carlini  

PR c++/50810
* doc/invoke.texi ([-Wnarrowing], [-Wc++0x-compat]): Update.
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 180333)
+++ doc/invoke.texi (working copy)
@@ -2365,17 +2365,16 @@ an instance of a derived class through a pointer t
 base class does not have a virtual destructor.  This warning is enabled
 by @option{-Wall}.
 
-@item -Wno-narrowing @r{(C++ and Objective-C++ only)}
+@item -Wnarrowing @r{(C++ and Objective-C++ only)}
 @opindex Wnarrowing
 @opindex Wno-narrowing
-With -std=c++0x, suppress the diagnostic required by the standard for
-narrowing conversions within @samp{@{ @}}, e.g.
+Warn when a narrowing conversion occurs within @samp{@{ @}}, e.g.
 
 @smallexample
 int i = @{ 2.2 @}; // error: narrowing from double to int
 @end smallexample
 
-This flag can be useful for compiling valid C++98 code in C++0x mode.
+This flag is included in @option{-Wall} and @option{-Wc++0x-compat}.
 
 @item -Wnoexcept @r{(C++ and Objective-C++ only)}
 @opindex Wnoexcept
@@ -4066,7 +4065,8 @@ ISO C and ISO C++, e.g.@: request for implicit con
 @item -Wc++0x-compat @r{(C++ and Objective-C++ only)}
 Warn about C++ constructs whose meaning differs between ISO C++ 1998 and
 ISO C++ 200x, e.g., identifiers in ISO C++ 1998 that will become keywords
-in ISO C++ 200x.  This warning is enabled by @option{-Wall}.
+in ISO C++ 200x.  This warning turns on @option{-Wnarrowing} and is
+enabled by @option{-Wall}.
 
 @item -Wcast-qual
 @opindex Wcast-qual
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 180333)
+++ c-family/c.opt  (working copy)
@@ -490,8 +490,8 @@ C ObjC C++ ObjC++ Warning
 Warn about use of multi-character character constants
 
 Wnarrowing
-C ObjC C++ ObjC++ Warning Var(warn_narrowing) Init(1)
--Wno-narrowing   In C++0x mode, ignore ill-formed narrowing conversions within 
{ }
+C ObjC C++ ObjC++ Warning Var(warn_narrowing) Init(-1) Warning
+Warn about ill-formed narrowing conversions within { }
 
 Wnested-externs
 C ObjC Var(warn_nested_externs) Warning
Index: c-family/c-opts.c
===
--- c-family/c-opts.c   (revision 180333)
+++ c-family/c-opts.c   (working copy)
@@ -406,6 +406,7 @@ c_common_handle_option (size_t scode, const char *
  warn_reorder = value;
   warn_cxx0x_compat = value;
   warn_delnonvdtor = value;
+ warn_narrowing = value;
}
 
   cpp_opts->warn_trigraphs = value;
@@ -436,6 +437,10 @@ c_common_handle_option (size_t scode, const char *
   cpp_opts->warn_cxx_operator_names = value;
   break;
 
+case OPT_Wc__0x_compat:
+  warn_narrowing = value;
+  break;
+
 case OPT_Wdeprecated:
   cpp_opts->cpp_warn_deprecated = value;
   break;
@@ -997,11 +1002,18 @@ c_common_post_options (const char **pfilename)
   if (warn_implicit_function_declaration == -1)
 warn_implicit_function_declaration = flag_isoc99;
 
-  /* If we're allowing C++0x constructs, don't warn about C++0x
- compatibility problems.  */
   if (cxx_dialect == cxx0x)
-warn_cxx0x_compat = 0;
+{
+  /* If we're allowing C++0x constructs, don't warn about C++98
+identifiers which are keywords in C++0x.  */
+  warn_cxx0x_compat = 0;
 
+  if (warn_narrowing == -1)
+   warn_narrowing = 1;
+}
+  else if (warn_narrowing == -1)
+warn_narrowing = 0;
+
   if (flag_preprocess_only)
 {
   /* Open the output now.  We must do so even if flag_no_output is
Index: testsuite/g++.dg/cpp0x/warn_cxx0x2.C
===
--- testsuite/g++.dg/cpp0x/warn_cxx0x2.C(revision 0)
+++ testsuite/g++.dg/cpp0x/warn_cxx0x2.C(revision 0)
@@ -0,0 +1,4 @@
+// PR c++/50810
+// { dg-options "-std=gn

[patch] SLP data dependence testing - PR 50819

2011-10-23 Thread Ira Rosen
Hi,

When there is a pair of data-refs with unknown dependence in basic block
SLP, we currently require all the loads in the basic block to come before
all the stores in order to avoid load-after-store dependencies. But
this is too conservative. It's enough to check that, in each pair of a
load and a store with unknown or known dependence, the load comes
first. This is already done for the known case. This patch adds such a
check for unknown dependencies and removes
vect_bb_vectorizable_with_dependencies.
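
The relaxed check can be pictured with a small stand-alone model. This is
only a sketch: Access, DependentPair, pair_is_safe and block_is_safe are
hypothetical names, not GCC data structures, and rejecting store/store
pairs is an assumption of the sketch rather than something stated above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one memory access in a basic block.
struct Access {
  int position;  // statement order within the block (smaller = earlier)
  bool is_load;  // true for a load, false for a store
};

// A pair of accesses whose dependence is known to exist or cannot be
// ruled out (the "known" and "unknown" cases from the message).
struct DependentPair {
  Access a;
  Access b;
};

// Returns true when the pair does not form a load-after-store dependence.
bool pair_is_safe(const DependentPair &p) {
  if (p.a.is_load && p.b.is_load)
    return true;   // two loads never conflict
  if (!p.a.is_load && !p.b.is_load)
    return false;  // assumption of this sketch: reject store/store pairs
  const Access &load = p.a.is_load ? p.a : p.b;
  const Access &store = p.a.is_load ? p.b : p.a;
  return load.position < store.position;  // the load must come first
}

// Only the possibly dependent pairs are inspected; the block as a whole
// may still contain loads that follow unrelated stores.
bool block_is_safe(const std::vector<DependentPair> &pairs) {
  for (const DependentPair &p : pairs)
    if (!pair_is_safe(p))
      return false;
  return true;
}
```

In this model a block is accepted whenever every listed pair has its load
first, even if some other load sits after some other store, which is the
situation the old all-loads-before-all-stores requirement rejected.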

Bootstrapped and tested on powerpc64-suse-linux.
Committed.

Ira

ChangeLog:

PR tree-optimization/50819
* tree-vectorizer.h (vect_analyze_data_ref_dependences): Remove
the last argument.
* tree-vect-loop.c (vect_analyze_loop_2): Update call to
vect_analyze_data_ref_dependences.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Remove
the last argument.  Check load-after-store dependence for unknown
dependencies in basic blocks.
(vect_analyze_data_ref_dependences): Update call to
vect_analyze_data_ref_dependences.
* tree-vect-patterns.c (vect_recog_widen_shift_pattern): Fix typo.
* tree-vect-slp.c (vect_bb_vectorizable_with_dependencies): Remove.
(vect_slp_analyze_bb_1): Update call to
vect_analyze_data_ref_dependences.  Don't call
vect_bb_vectorizable_with_dependencies.

testsuite/ChangeLog:

PR tree-optimization/50819
* g++.dg/vect/vect.exp: Set target dependent flags for slp-* tests.
* g++.dg/vect/slp-pr50819.cc: New test.
Index: ChangeLog
===
--- ChangeLog   (revision 180333)
+++ ChangeLog   (working copy)
@@ -1,3 +1,21 @@
+2011-10-23  Ira Rosen  
+
+   PR tree-optimization/50819
+   * tree-vectorizer.h (vect_analyze_data_ref_dependences): Remove
+   the last argument.
+   * tree-vect-loop.c (vect_analyze_loop_2): Update call to
+   vect_analyze_data_ref_dependences.
+   * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Remove
+   the last argument.  Check load-after-store dependence for unknown
+   dependencies in basic blocks.
+   (vect_analyze_data_ref_dependences): Update call to
+   vect_analyze_data_ref_dependences.
+   * tree-vect-patterns.c (vect_recog_widen_shift_pattern): Fix typo.
+   * tree-vect-slp.c (vect_bb_vectorizable_with_dependencies): Remove.
+   (vect_slp_analyze_bb_1): Update call to
+   vect_analyze_data_ref_dependences.  Don't call
+   vect_bb_vectorizable_with_dependencies.
+
 2011-10-22  David S. Miller  
 
* config/sparc/sparc.h (SECONDARY_INPUT_RELOAD_CLASS,
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 180333)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2011-10-23  Ira Rosen  
+
+   PR tree-optimization/50819
+   * g++.dg/vect/vect.exp: Set target dependent flags for slp-* tests.
+   * g++.dg/vect/slp-pr50819.cc: New test.
+
 2011-10-21  Paolo Carlini  
 
PR c++/45385
Index: testsuite/g++.dg/vect/vect.exp
===
--- testsuite/g++.dg/vect/vect.exp  (revision 180333)
+++ testsuite/g++.dg/vect/vect.exp  (working copy)
@@ -42,12 +42,6 @@ set DEFAULT_VECTCFLAGS ""
 # These flags are used for all targets.
 lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fno-vect-cost-model"
 
-set VECT_SLP_CFLAGS $DEFAULT_VECTCFLAGS
-
-lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
-lappend VECT_SLP_CFLAGS "-fdump-tree-slp-details"
-
-
 # Skip these tests for targets that do not support generating vector
 # code.  Set additional target-dependent vector flags, which can be
 # overridden by using dg-options in individual tests.
@@ -55,6 +49,11 @@ if ![check_vect_support_and_set_flags] {
 return
 }
 
+set VECT_SLP_CFLAGS $DEFAULT_VECTCFLAGS
+
+lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
+lappend VECT_SLP_CFLAGS "-fdump-tree-slp-details"
+
 # Initialize `dg'.
 dg-init
 
Index: testsuite/g++.dg/vect/slp-pr50819.cc
===
--- testsuite/g++.dg/vect/slp-pr50819.cc(revision 0)
+++ testsuite/g++.dg/vect/slp-pr50819.cc(revision 0)
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_float } */
+
+typedef float Value;
+
+struct LorentzVector
+{
+
+  LorentzVector(Value x=0, Value  y=0, Value  z=0, Value  t=0) :
+theX(x),theY(y),theZ(z),theT(t){}
+  LorentzVector & operator+=(const LorentzVector & a) {
+theX += a.theX;
+theY += a.theY;
+theZ += a.theZ;
+theT += a.theT;
+return *this;
+  }
+
+  Value theX;
+  Value theY;
+  Value theZ;
+  Value theT;
+}  __attribute__ ((aligned(16)));
+
+inline LorentzVector
+operator+(LorentzVector const & a, LorentzVector const & b) {
+  return
+LorentzVector(a.

Re: [C++ Patch] PR 50810

2011-10-23 Thread Jason Merrill

On 10/23/2011 07:23 AM, Paolo Carlini wrote:

-@item -Wno-narrowing @r{(C++ and Objective-C++ only)}
+@item -Wnarrowing @r{(C++ and Objective-C++ only)}
 @opindex Wnarrowing
 @opindex Wno-narrowing
-With -std=c++0x, suppress the diagnostic required by the standard for
-narrowing conversions within @samp{@{ @}}, e.g.
+Warn when a narrowing conversion occurs within @samp{@{ @}}, e.g.

 @smallexample
 int i = @{ 2.2 @}; // error: narrowing from double to int
 @end smallexample

-This flag can be useful for compiling valid C++98 code in C++0x mode.
+This flag is included in @option{-Wall} and @option{-Wc++0x-compat}.


Please still also talk about using -Wno-narrowing in C++0x mode here.


* g++.dg/cpp0x/warn_cxx0x.C: Rename to...
* g++.dg/cpp0x/warn_cxx0x1.C: ... this.


I wouldn't bother renaming, you can just add the new tests.

Jason


[PATCH, i386]: Fix PR50788, [4.7 Regression] ICE: in merge_overlapping_regs, at regrename.c:318 with -mavx -fpeel-loops -fstack-protector-all and __builtin_ia32_maskloadpd256

2011-10-23 Thread Uros Bizjak
Hello!

As discussed in the PR, the avx{,2}_maskload pattern writes a zero element
to the destination register when the corresponding mask selector is not set.
So there is no dependency on the target register value.
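
A scalar model may make the point concrete. This is a sketch, not the
intrinsic: masked_load is a hypothetical helper that mimics a 4 x double
masked load, with the most significant bit of each mask element acting as
the selector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a 4 x double masked load.  A lane is loaded when the
// sign (most significant) bit of its mask element is set; otherwise the
// lane is zero.  The result never reads a previous destination value.
std::array<double, 4> masked_load(const double *mem,
                                  const std::array<std::int64_t, 4> &mask) {
  std::array<double, 4> dst{};  // every lane starts as 0.0
  for (std::size_t i = 0; i < 4; ++i)
    if (mask[i] < 0)            // sign bit set: lane is selected
      dst[i] = mem[i];          // memory is only read for selected lanes
  return dst;
}
```

Because the function takes no "old destination" argument at all, the model
has the same shape as the pattern once (match_dup 0) is dropped.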

While the attached patch fixes mainline, the following one-liner is enough
to fix the other release branches.

Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 180334)
+++ config/i386/sse.md  (working copy)
@@ -12007,8 +12007,7 @@
   [(set (match_operand:AVXMODEF2P 0 "register_operand" "=x")
(unspec:AVXMODEF2P
  [(match_operand:AVXMODEF2P 1 "memory_operand" "m")
-  (match_operand: 2 "register_operand" "x")
-  (match_dup 0)]
+  (match_operand: 2 "register_operand" "x")]
  UNSPEC_MASKLOAD))]
   "TARGET_AVX"
   "vmaskmov\t{%1, %2, %0|%0, %2, %1}"


2011-10-23  Uros Bizjak  

PR target/50788
* config/i386/sse.md (avx2_maskload):
Remove (match_dup 0).
(*avx2_maskload): New insn pattern.
(*avx_maskload): Ditto.
(*avx2_maskstore): Ditto.
(*avx_maskstore): Ditto.
(*avx2_maskmov): Remove insn pattern.
(*avx_maskmov): Ditto.

testsuite/ChangeLog:

2011-10-23  Uros Bizjak  

PR target/50788
* testsuite/gcc.target/i386/pr50788.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}. I will commit this patch to mainline and 4.6 branch as soon
as regression tests finish.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 180333)
+++ config/i386/sse.md  (working copy)
@@ -12279,11 +12279,36 @@
   [(set (match_operand:V48_AVX2 0 "register_operand" "")
(unspec:V48_AVX2
  [(match_operand: 2 "register_operand" "")
-  (match_operand:V48_AVX2 1 "memory_operand" "")
-  (match_dup 0)]
+  (match_operand:V48_AVX2 1 "memory_operand" "")]
  UNSPEC_MASKMOV))]
   "TARGET_AVX")
 
+(define_insn "*avx2_maskload"
+  [(set (match_operand:VI48_AVX2 0 "register_operand" "=x")
+   (unspec:VI48_AVX2
+ [(match_operand: 1 "register_operand" "x")
+  (match_operand:VI48_AVX2 2 "memory_operand" "m")]
+ UNSPEC_MASKMOV))]
+  "TARGET_AVX2"
+  "vpmaskmov\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "")])
+
+(define_insn "*avx_maskload"
+  [(set (match_operand:VF 0 "register_operand" "=x")
+   (unspec:VF
+ [(match_operand: 1 "register_operand" "x")
+  (match_operand:VF 2 "memory_operand" "m")]
+ UNSPEC_MASKMOV))]
+  "TARGET_AVX"
+  "vmaskmov\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "")])
+
 (define_expand "_maskstore"
   [(set (match_operand:V48_AVX2 0 "memory_operand" "")
(unspec:V48_AVX2
@@ -12293,30 +12318,28 @@
  UNSPEC_MASKMOV))]
   "TARGET_AVX")
 
-(define_insn "*avx2_maskmov"
-  [(set (match_operand:VI48_AVX2 0 "nonimmediate_operand" "=x,m")
+(define_insn "*avx2_maskstore"
+  [(set (match_operand:VI48_AVX2 0 "memory_operand" "=m")
(unspec:VI48_AVX2
- [(match_operand: 1 "register_operand" "x,x")
-  (match_operand:VI48_AVX2 2 "nonimmediate_operand" "m,x")
+ [(match_operand: 1 "register_operand" "x")
+  (match_operand:VI48_AVX2 2 "register_operand" "x")
   (match_dup 0)]
  UNSPEC_MASKMOV))]
-  "TARGET_AVX2
-   && (REG_P (operands[0]) == MEM_P (operands[2]))"
+  "TARGET_AVX2"
   "vpmaskmov\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog1")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "vex")
(set_attr "mode" "")])
 
-(define_insn "*avx_maskmov"
-  [(set (match_operand:VF 0 "nonimmediate_operand" "=x,m")
+(define_insn "*avx_maskstore"
+  [(set (match_operand:VF 0 "memory_operand" "=m")
(unspec:VF
- [(match_operand: 1 "register_operand" "x,x")
-  (match_operand:VF 2 "nonimmediate_operand" "m,x")
+ [(match_operand: 1 "register_operand" "x")
+  (match_operand:VF 2 "register_operand" "x")
   (match_dup 0)]
  UNSPEC_MASKMOV))]
-  "TARGET_AVX
-   && (REG_P (operands[0]) == MEM_P (operands[2]))"
+  "TARGET_AVX"
   "vmaskmov\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog1")
(set_attr "prefix_extra" "1")
Index: testsuite/gcc.target/i386/pr50788.c
===
--- testsuite/gcc.target/i386/pr50788.c (revision 0)
+++ testsuite/gcc.target/i386/pr50788.c (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx -fpeel-loops -fstack-protector-all" } */
+
+typedef long long __m256i __attribute__ ((__vector_size__ (32)));
+typedef double __m256d __attribute__ ((__vector_size__ (32)));
+
+__m256d foo (__m256d *__P, _

Inline heuristics tweak

2011-10-23 Thread Jan Hubicka
Hi,
while looking into the inlining dumps I noticed that, by adding extra logic to
identify inlining benefits, the badness metric is now really scaled down. For
tramp3d it ranges from 0 to 400, which is quite coarse for thousands of
functions.  This patch scales it up and makes the overflow situations be
handled better.
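
The arithmetic in the patch below can be tried in isolation. The following
is a sketch under assumptions: scaled_badness is a hypothetical stand-alone
function, std::int64_t stands in for gcov_type, and the caller is assumed
to pass a positive divisor and a growth_for_all that fits in half the int
range.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-alone version of the scaling step from the patch.
int scaled_badness(int growth, std::int64_t div, int growth_for_all) {
  assert(div > 0);
  assert(growth_for_all >= 0 && growth_for_all <= INT_MAX / 2);

  // Scale growth up by 2^(19+8) before dividing, so small ratios keep
  // some fractional precision.
  std::int64_t badness =
      static_cast<std::int64_t>(growth) * (std::int64_t{1} << (19 + 8));
  badness = (badness + div / 2) / div;  // division rounded to nearest

  // Clamp before adding the overall growth, so the sum still fits in int
  // and growth_for_all still distinguishes clamped candidates.
  if (badness > INT_MAX / 2)
    badness = INT_MAX / 2;

  badness += growth_for_all;
  return static_cast<int>(badness);
}
```

With growth = 10 and div = 1 the scaled value exceeds INT_MAX / 2 and is
clamped, yet two such candidates with different growth_for_all still get
different results.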

Bootstrapped/regtested x86_64-linux, committed.

Honza

Index: ipa-inline.c
===
--- ipa-inline.c(revision 180328)
+++ ipa-inline.c(working copy)
@@ -822,8 +822,10 @@
   /* Result must be integer in range 0...INT_MAX.
 Set the base of fixed point calculation so we don't lose much of
 precision for small bandesses (those are interesting) yet we don't
-overflow for growths that are still in interesting range.  */
-  badness = ((gcov_type)growth) * (1<<18);
+overflow for growths that are still in interesting range.
+
+Fixed point arithmetic with point at 8th bit. */
+  badness = ((gcov_type)growth) * (1<<(19+8));
   badness = (badness + div / 2) / div;
 
   /* Overall growth of inlining all calls of function matters: we want to
@@ -838,10 +840,14 @@
 We might mix the valud into the fraction by taking into account
 relative growth of the unit, but for now just add the number
 into resulting fraction.  */
+  if (badness > INT_MAX / 2)
+   {
+ badness = INT_MAX / 2;
+ if (dump)
+   fprintf (dump_file, "Badness overflow\n");
+   }
   growth_for_all = estimate_growth (callee);
   badness += growth_for_all;
-  if (badness > INT_MAX - 1)
-   badness = INT_MAX - 1;
   if (dump)
{
  fprintf (dump_file,


Re: [C++ Patch] PR 50810

2011-10-23 Thread Paolo Carlini

Hi,

On 10/23/2011 07:23 AM, Paolo Carlini wrote:

-@item -Wno-narrowing @r{(C++ and Objective-C++ only)}
+@item -Wnarrowing @r{(C++ and Objective-C++ only)}
 @opindex Wnarrowing
 @opindex Wno-narrowing
-With -std=c++0x, suppress the diagnostic required by the standard for
-narrowing conversions within @samp{@{ @}}, e.g.
+Warn when a narrowing conversion occurs within @samp{@{ @}}, e.g.

 @smallexample
 int i = @{ 2.2 @}; // error: narrowing from double to int
 @end smallexample

-This flag can be useful for compiling valid C++98 code in C++0x mode.
+This flag is included in @option{-Wall} and @option{-Wc++0x-compat}.


Please still also talk about using -Wno-narrowing in C++0x mode here.

I change it like this. Better?

Thanks,
Paolo.

///
/c-family
2011-10-23  Paolo Carlini  

PR c++/50810
* c-opts.c (c_common_handle_option): Enable -Wnarrowing as part
of -Wall; include -Wnarrowing in -Wc++0x-compat; adjust default
Wnarrowing for C++0x and C++98.
* c.opt ([Wnarrowing]): Update.

/cp
2011-10-23  Paolo Carlini  

PR c++/50810
* typeck2.c (check_narrowing): Adjust OPT_Wnarrowing diagnostics.
(digest_init_r): Call check_narrowing irrespective of the C++ dialect.
* decl.c (check_initializer): Likewise.
* semantics.c (finish_compound_literal): Likewise.

/testsuite
2011-10-23  Paolo Carlini  

PR c++/50810
* g++.dg/cpp0x/warn_cxx0x2.C: New.
* g++.dg/cpp0x/warn_cxx0x3.C: Likewise.

2011-10-23  Paolo Carlini  

PR c++/50810
* doc/invoke.texi ([-Wnarrowing], [-Wc++0x-compat]): Update.
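
The way the -Wnarrowing default is resolved in the patch below can be
modelled as a tri-state flag. This is a simplified sketch: Options,
Dialect, handle_wall, handle_wcxx0x_compat and post_options are
hypothetical stand-ins for the c-opts.c logic, and -1 means "not given on
the command line".

```cpp
#include <cassert>

enum class Dialect { cxx98, cxx0x };

// Hypothetical stand-in for the relevant option variables.
struct Options {
  int warn_narrowing = -1;    // -1: not set explicitly (Init(-1) in c.opt)
  int warn_cxx0x_compat = 0;
};

// -Wall (value 1) or -Wno-all (value 0) also switches narrowing warnings.
void handle_wall(Options &o, int value) {
  o.warn_cxx0x_compat = value;
  o.warn_narrowing = value;
}

// -Wc++0x-compat includes -Wnarrowing.
void handle_wcxx0x_compat(Options &o, int value) {
  o.warn_cxx0x_compat = value;
  o.warn_narrowing = value;
}

// After option parsing: pick the dialect-dependent default if the user
// did not decide, and drop the compat warning in C++0x mode.
void post_options(Options &o, Dialect dialect) {
  if (dialect == Dialect::cxx0x) {
    o.warn_cxx0x_compat = 0;
    if (o.warn_narrowing == -1)
      o.warn_narrowing = 1;
  } else if (o.warn_narrowing == -1) {
    o.warn_narrowing = 0;
  }
}
```

In this model an explicit -Wno-narrowing (warn_narrowing set to 0 before
post_options) survives in C++0x mode, while an untouched flag becomes 1
for C++0x and 0 for C++98.
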
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 180333)
+++ doc/invoke.texi (working copy)
@@ -2365,17 +2365,18 @@ an instance of a derived class through a pointer t
 base class does not have a virtual destructor.  This warning is enabled
 by @option{-Wall}.
 
-@item -Wno-narrowing @r{(C++ and Objective-C++ only)}
+@item -Wnarrowing @r{(C++ and Objective-C++ only)}
 @opindex Wnarrowing
 @opindex Wno-narrowing
-With -std=c++0x, suppress the diagnostic required by the standard for
-narrowing conversions within @samp{@{ @}}, e.g.
+Warn when a narrowing conversion occurs within @samp{@{ @}}, e.g.
 
 @smallexample
 int i = @{ 2.2 @}; // error: narrowing from double to int
 @end smallexample
 
-This flag can be useful for compiling valid C++98 code in C++0x mode.
+With -std=c++0x, @option{-Wno-narrowing} suppresses the diagnostic
+required by the standard.  This flag is included in @option{-Wall} and
+@option{-Wc++0x-compat}.
 
 @item -Wnoexcept @r{(C++ and Objective-C++ only)}
 @opindex Wnoexcept
@@ -4066,7 +4067,8 @@ ISO C and ISO C++, e.g.@: request for implicit con
 @item -Wc++0x-compat @r{(C++ and Objective-C++ only)}
 Warn about C++ constructs whose meaning differs between ISO C++ 1998 and
 ISO C++ 200x, e.g., identifiers in ISO C++ 1998 that will become keywords
-in ISO C++ 200x.  This warning is enabled by @option{-Wall}.
+in ISO C++ 200x.  This warning turns on @option{-Wnarrowing} and is
+enabled by @option{-Wall}.
 
 @item -Wcast-qual
 @opindex Wcast-qual
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 180333)
+++ c-family/c.opt  (working copy)
@@ -490,8 +490,8 @@ C ObjC C++ ObjC++ Warning
 Warn about use of multi-character character constants
 
 Wnarrowing
-C ObjC C++ ObjC++ Warning Var(warn_narrowing) Init(1)
--Wno-narrowing   In C++0x mode, ignore ill-formed narrowing conversions within { }
+C ObjC C++ ObjC++ Warning Var(warn_narrowing) Init(-1) Warning
+Warn about ill-formed narrowing conversions within { }
 
 Wnested-externs
 C ObjC Var(warn_nested_externs) Warning
Index: c-family/c-opts.c
===
--- c-family/c-opts.c   (revision 180333)
+++ c-family/c-opts.c   (working copy)
@@ -406,6 +406,7 @@ c_common_handle_option (size_t scode, const char *
  warn_reorder = value;
   warn_cxx0x_compat = value;
   warn_delnonvdtor = value;
+ warn_narrowing = value;
}
 
   cpp_opts->warn_trigraphs = value;
@@ -436,6 +437,10 @@ c_common_handle_option (size_t scode, const char *
   cpp_opts->warn_cxx_operator_names = value;
   break;
 
+case OPT_Wc__0x_compat:
+  warn_narrowing = value;
+  break;
+
 case OPT_Wdeprecated:
   cpp_opts->cpp_warn_deprecated = value;
   break;
@@ -997,11 +1002,18 @@ c_common_post_options (const char **pfilename)
   if (warn_implicit_function_declaration == -1)
 warn_implicit_function_declaration = flag_isoc99;
 
-  /* If we're allowing C++0x constructs, don't warn about C++0x
- compatibility problems.  */
   if (cxx_dialect == cxx0x)
-warn_cxx0x_compat = 0;
+{
+  /* If we're allowing C++0x constructs, don't warn about C++98
+identifiers which are keywords

Re: new patches using -fopt-info (issue5294043)

2011-10-23 Thread Xinliang David Li
On Sun, Oct 23, 2011 at 3:18 AM, Richard Guenther
 wrote:
> On Fri, Oct 21, 2011 at 6:48 PM, Xinliang David Li  wrote:
>> There are two proposals here. One is -fopt-info which prints out
>> informational notes to stderr, and the other is -fopt-report which is
>> more elaborate form of dump files. Are you object to both or just the
>> opt-report one?
>
> What?  I'm objected to adding _two_ variants.  Didn't even realize
> you proposed that.

They are different -- -fopt-info is on the fly -- the notes are
emitted as the transformations are done while -fopt-report is for more
structured report so it requires more compiler changes.  Bringing in
-fopt-report is a little distraction as the main discussion is on
-fopt-info.

>
>>  The former is no different from any other
>> informational notes we already have -- the only difference is that
>> they are suppressed by default.
>
> We do not have many informational notes, so it is different.

Why different? opt information notes are not even emitted by default.

>
    ..
  ...
>>>
>>> I very well understand the intent.  But I disagree with where you start
>>> to implement this.  Dump files are _not_ only for developers - after
>>> all we don't have anything else.  -fopt-report can get as big and unmanageable
>>> to read as dump files - in fact I argue it will be worse than dump files if
>>> you go beyond very very coarse reporting.
>>
>> The problem of using dump files for optimization report is that all
>> optimization decisions are 'distributed' in phase specific dumps file.
>> For a whole program report, the number of files that are created is
>> not manageable (think about a program with 4000 sources each dumping
>> 200 files).  If we create a dummy pass and suck in all optimization
>> decisions in that pass's dump file -- it will be no different from
>> opt-report.
>
> Well, -fopt-whatever will just funnel selected pieces also to stderr.
> I object to duplicate dumping when we just need a way to filter
> what goes to dump files.
>

That is the main point -- using dump files is not scalable. If you
are just against using stderr and propose dumping the selected
information into a single shared dump file per build, I don't see the
difference from using stderr -- the notes are not emitted by default and
won't contaminate the build log.

>>
>>>
>>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>>> same time annotate dump file pieces so _automatic_ filtering and
>>> redirecting to stdout with something like -fopt-report would do something
>>> sensible?  I don't see why dump files have to stay messy while you at
>>> the same time would need to add _new_ code to dump to stdout for
>>> -fopt-report.
>>
>> In my mind, I would like to separate all dumps into three categories.
>>
>> 1) IR dumps, and support dump before and after (this reminds me my
>> patches are still pending :) )    -fdump-tree-pre-[before|after]-
>>  Dump into .after, .before files
>> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
>> into .debug files.
>> 3) opt report : -fdump-opt or -fopt-report
>>
>> Changes for 1) and 2) are mechanic but requires lots of work.
>
> You can do that, but I want the passes to use a single mechanism to
> feed all three "separated dumps".
>

Can you elaborate on the single mechanism here? A set of well-defined
dumping APIs (instead of the free-form  if (dump_file) fprintf
(dump_file, ...) )?

   debug_print (message, dump_flags, message_verbose_level, ...)
   trace_enter (trace_header_note)
   trace_exit (trace_header_note)
   opt_info_print (location, message_template, insertion)

Or  how dump files are organized?

I am all for cleaning up dumping, but I don't see how -fopt-info gets
in the way of that.

>>>
>>> So, no, please do it the right way that benefits both compiler developers
>>> and your "power users".
>>>
>>> And yes, the right way is not to start adding that -fopt-report switch.
>>> The right way is to make dump-files consumable by mere mortals first.
>>
>> I agree we need to do the right way which needs to be discussed first.
>> I would argue that mere mortals will really appreciate opt-info
>> (separate from dump file and opt-report).
>
> Well, still what you print with opt-info should be better also be present
> with opt-report and in dump files.  Thus it all boils down to be able
> to filter what passes put in their dump files.

opt-report is different (it needs to buffer information and dump it at
the end of compilation).   Dump files and -fopt-info can share the same
dumping format -- whatever gets emitted by opt-info should also be
emitted in the dump file (or replace the less well formatted
transformation messages that are already available in dump files).
However, simply filtering the dump info does not solve the scalability
issue I mentioned.

thanks,

David

>
> Richard.
>
>> thanks,
>>
>> David
>>
>>>
>>> Thanks,
>>> Richard.
>>>

 Thanks,

 David

>
> So, please fix dump-file

Re: [PATCH 2/3] Free large chunks in ggc

2011-10-23 Thread Andi Kleen
On Sun, Oct 23, 2011 at 12:24:46PM +0200, Richard Guenther wrote:
> >> space in the free list afterward we free it back on the next GC cycle.
> >> Then if there's a malloc or other allocator later it can grab
> >> the address space we freed.
> >>
> >> That was done to address your earlier concern.
> >>
> >> This will only happen on ggc_collect of course.
> >>
> >> So one difference from before the madvise patch is that different
> >> generations of free pages can accumulate in the freelist. Before madvise
> >> the freelist would never contain more than one generation.
> >> Normally it's sorted by address due to the way GC works, but there's no
> >> attempt to keep the sort order over multiple generations.
> >>
> >> The "free in batch" heuristic requires sorting, so it will only
> >> work if all the pages are freed in a single gc cycle.
> >>
> >> I considered sorting, but it seemed to be too slow.
> >>
> >> I can expand the comment on that.
> >
> > Ah, now I see ... but that's of course bad - I expect large regions to be
> > free only after multiple collections.  Can you measure what sorting would
> > make for a difference?
> 
> I wonder if the free list that falls out of a single collection is sorted

The original author seemed to have assumed it usually is. The
allocation part tries hard to insert sorted. So I thought it
was ok to assume.

I stuck in an assert now and it triggers in a bootstrap on the large
files, so it's not always true (so my earlier assumption was not fully correct).

I suppose it's just another heuristic which is true often enough.

So madvise may not have made it that much worse.

> (considering also ggc_free) - if it is, building a new one at each collection

ggc_free does not put into the freelist I believe.

> and then merging the two sorted lists should be reasonably fast.

It's definitely not O(1). Ok, one could assume it's usually sorted
and do a merge sort with at most one pass. But I'm sceptical
it's worth the effort, at least without anyone having a test case.
At least for 64bit it's not needed anyway.
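
For reference, the merge of two already sorted free lists suggested above
is one linear pass. The sketch below is illustrative only: it uses plain
vectors of page start addresses rather than the ggc-page.c page entries,
and merge_free_lists is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Merge the free list kept from earlier collections with the list
// produced by the latest collection.  Both inputs must already be sorted
// by address; the cost is linear in the total number of pages.
std::vector<std::uintptr_t>
merge_free_lists(const std::vector<std::uintptr_t> &older,
                 const std::vector<std::uintptr_t> &fresh) {
  assert(std::is_sorted(older.begin(), older.end()));
  assert(std::is_sorted(fresh.begin(), fresh.end()));

  std::vector<std::uintptr_t> merged;
  merged.reserve(older.size() + fresh.size());
  std::merge(older.begin(), older.end(), fresh.begin(), fresh.end(),
             std::back_inserter(merged));
  return merged;
}
```

The precondition is exactly the property the assert mentioned above found
not to hold in every bootstrap, so a real version would need a fallback
for unsorted input.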

-Andi


Re: [C++ Patch] PR 50810

2011-10-23 Thread Jason Merrill

On 10/23/2011 11:00 AM, Paolo Carlini wrote:

+With -std=c++0x, @option{-Wno-narrowing} suppresses the diagnostic
+required by the standard.  This flag is included in @option{-Wall} and
+@option{-Wc++0x-compat}.


I'd swap those two sentences.  OK with that change.

Jason


[PATCH, i386]: Macroize maskload/maskstore patterns some more and cleanup gather patterns a bit

2011-10-23 Thread Uros Bizjak
Hello!

2011-10-23  Uros Bizjak  

* config/i386/sse.md (sseintprefix): Rename from gthrfirstp.
(_maskload): Delete expander.
(_maskload): Merge insn
pattern from *avx2_maskload and
*avx_maskload using V48_AVX mode
iterator.  Use sseintprefix mode attribute.
(_maskstore): Delete expander.
(_maskstore): Merge insn
pattern from *avx2_maskstore and
*avx_maskstore using V48_AVX mode
iterator.  Use sseintprefix mode attribute.
(*avx2_gathersi): Use sseintprefix and ssemodesuffix mode
attributes.
(*avx2_gatherdi): Ditto.
(*avx2_gatherdi256): Ditto.
(VI48_AVX2): Remove mode iterator.
(gthrlastp): Remove mode attribute.

Bootstrapped and regression tested on x86_64-pc-linux-gnu, committed
to mainline SVN.

Uros.
Index: sse.md
===
--- sse.md  (revision 180339)
+++ sse.md  (working copy)
@@ -125,9 +125,6 @@
(V8SI "TARGET_AVX2") V4SI
(V4DI "TARGET_AVX2") V2DI])
 
-(define_mode_iterator VI48_AVX2
-  [V8SI V4SI V4DI V2DI])
-
 (define_mode_iterator VI4SD_AVX2
   [V4SI V4DI])
 
@@ -246,8 +243,7 @@
(V8SI "V8SI") (V4DI "V4DI")
(V4SI "V4SI") (V2DI "V2DI")
(V16HI "V16HI") (V8HI "V8HI")
-   (V32QI "V32QI") (V16QI "V16QI")
-  ])
+   (V32QI "V32QI") (V16QI "V16QI")])
 
 ;; Mapping of vector modes to a vector mode of double size
 (define_mode_attr ssedoublevecmode
@@ -277,6 +273,13 @@
(V8SF "8") (V4DF "4")
(V4SF "4") (V2DF "2")])
 
+;; SSE prefix for integer vector modes
+(define_mode_attr sseintprefix
+  [(V2DI "p") (V2DF "")
+   (V4DI "p") (V4DF "")
+   (V4SI "p") (V4SF "")
+   (V8SI "p") (V8SF "")])
+
 ;; SSE scalar suffix for vector modes
 (define_mode_attr ssescalarmodesuffix
   [(SF "ss") (DF "sd")
@@ -319,16 +322,6 @@
   (V4DI "V4DI") (V4DF "V4DI")
   (V4SI "V2DI") (V4SF "V2DI")
   (V8SI "V4DI") (V8SF "V4DI")])
-(define_mode_attr gthrfirstp
- [(V2DI "p") (V2DF "")
-  (V4DI "p") (V4DF "")
-  (V4SI "p") (V4SF "")
-  (V8SI "p") (V8SF "")])
-(define_mode_attr gthrlastp
- [(V2DI "q") (V2DF "pd")
-  (V4DI "q") (V4DF "pd")
-  (V4SI "d") (V4SF "ps")
-  (V8SI "d") (V8SF "ps")])
 
 (define_mode_iterator FMAMODE [SF DF V4SF V2DF V8SF V4DF])
 
@@ -12275,77 +12268,33 @@
(set_attr "prefix" "vex")
(set_attr "mode" "OI")])
 
-(define_expand "_maskload"
-  [(set (match_operand:V48_AVX2 0 "register_operand" "")
+(define_insn "_maskload"
+  [(set (match_operand:V48_AVX2 0 "register_operand" "=x")
(unspec:V48_AVX2
- [(match_operand: 2 "register_operand" "")
-  (match_operand:V48_AVX2 1 "memory_operand" "")]
+ [(match_operand: 2 "register_operand" "x")
+  (match_operand:V48_AVX2 1 "memory_operand" "m")]
  UNSPEC_MASKMOV))]
-  "TARGET_AVX")
-
-(define_insn "*avx2_maskload"
-  [(set (match_operand:VI48_AVX2 0 "register_operand" "=x")
-   (unspec:VI48_AVX2
- [(match_operand: 1 "register_operand" "x")
-  (match_operand:VI48_AVX2 2 "memory_operand" "m")]
- UNSPEC_MASKMOV))]
-  "TARGET_AVX2"
-  "vpmaskmov\t{%2, %1, %0|%0, %1, %2}"
+  "TARGET_AVX"
+  "vmaskmov\t{%1, %2, %0|%0, %2, %1}"
   [(set_attr "type" "sselog1")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "vex")
(set_attr "mode" "")])
 
-(define_insn "*avx_maskload"
-  [(set (match_operand:VF 0 "register_operand" "=x")
-   (unspec:VF
- [(match_operand: 1 "register_operand" "x")
-  (match_operand:VF 2 "memory_operand" "m")]
- UNSPEC_MASKMOV))]
-  "TARGET_AVX"
-  "vmaskmov\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "type" "sselog1")
-   (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "")])
-
-(define_expand "_maskstore"
-  [(set (match_operand:V48_AVX2 0 "memory_operand" "")
+(define_insn "_maskstore"
+  [(set (match_operand:V48_AVX2 0 "memory_operand" "=m")
(unspec:V48_AVX2
- [(match_operand: 1 "register_operand" "")
-  (match_operand:V48_AVX2 2 "register_operand" "")
-  (match_dup 0)]
- UNSPEC_MASKMOV))]
-  "TARGET_AVX")
-
-(define_insn "*avx2_maskstore"
-  [(set (match_operand:VI48_AVX2 0 "memory_operand" "=m")
-   (unspec:VI48_AVX2
  [(match_operand: 1 "register_operand" "x")
-  (match_operand:VI48_AVX2 2 "register_operand" "x")
+  (match_operand:V48_AVX2 2 "register_operand" "x")
   (match_dup 0)]
  UNSPEC_MASKMOV))]
-  "TARGET_AVX2"
-  "vpmaskmov\t{%2, %1, %0|%0, %1, %2}"
+  "TARGET_AVX"
+  "vmaskmov\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog1")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "vex")
(set_attr "mode" "")])
 
-(define_insn "*avx_maskstore"
-  [(set (match_operand:VF 0

Go patch committed: Implement new syscall package

2011-10-23 Thread Ian Lance Taylor
This patch is a rewrite of the syscall package in the Go library.  This
rewrite moves it from libgo/syscalls to libgo/go/syscall, to more
closely match the master Go library.  More importantly, it changes most
library calls to call new entersyscall and exitsyscall functions.  These
functions currently do nothing.  However, they are a step toward
multiplexing multiple goroutines onto a single thread, which will make
the implementation of goroutines more efficient.  When multiplexing
goroutines, it is of course essential to know when a goroutine is
calling a library function which may block.  This patch makes that
possible.

There are still some existing calls to possibly blocking library
functions in other parts of the library.  Those will also have to be
updated.

It's possible that this patch will once again break the Solaris and Irix
support.  I've tried to ensure that I didn't make any stupid errors, but
I haven't done any actual testing.  Sorry about any problems.

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian



foo.patch.bz2
Description: patch


[PATCH, i386]: Remove avx2_lshl3 insn pattern

2011-10-23 Thread Uros Bizjak
Hello!

We have the same pattern with generic name just below this one.

2011-10-23  Uros Bizjak  

* config/i386/sse.md (avx2_lshl3): Remove insn pattern.
(VI248_256): Remove mode iterator.
* config/i386/i386.c (ix86_expand_vec_perm): Use gen_ashlv4di3
instead of gen_avx2_lshlv4di3.
(bdesc_args): Use CODE_FOR_ashl{v16hi,v8si,v4di}3 instead of
CODE_FOR_avx2_lshl{v16hi,v8si,v4di}3.

Bootstrapped and regression tested on x86_64-pc-linux-gnu. Committed
to mainline SVN.

Uros.
Index: sse.md
===
--- sse.md  (revision 180344)
+++ sse.md  (working copy)
@@ -196,7 +196,6 @@
 
 ;; Random 256bit vector integer mode combinations
 (define_mode_iterator VI124_256 [V32QI V16HI V8SI])
-(define_mode_iterator VI248_256 [V16HI V8SI V4DI])
 
 ;; Int-float size matches
 (define_mode_iterator VI4F_128 [V4SI V4SF])
@@ -5804,21 +5803,6 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "")])
 
-(define_insn "avx2_lshl3"
-  [(set (match_operand:VI248_256 0 "register_operand" "=x")
-   (ashift:VI248_256
- (match_operand:VI248_256 1 "register_operand" "x")
- (match_operand:SI 2 "nonmemory_operand" "xN")))]
-  "TARGET_AVX2"
-  "vpsll\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "type" "sseishft")
-   (set_attr "prefix" "vex")
-   (set (attr "length_immediate")
- (if_then_else (match_operand 2 "const_int_operand" "")
-   (const_string "1")
-   (const_string "0")))
-   (set_attr "mode" "OI")])
-
 (define_insn "ashl3"
   [(set (match_operand:VI248_AVX2 0 "register_operand" "=x,x")
(ashift:VI248_AVX2
Index: i386.c
===
--- i386.c  (revision 180339)
+++ i386.c  (working copy)
@@ -19490,9 +19490,9 @@ ix86_expand_vec_perm (rtx operands[])
 stands for other 12 bytes.  */
  /* The bit whether element is from the same lane or the other
 lane is bit 4, so shift it up by 3 to the MSB position.  */
- emit_insn (gen_avx2_lshlv4di3 (gen_lowpart (V4DImode, t1),
-gen_lowpart (V4DImode, mask),
-GEN_INT (3)));
+ emit_insn (gen_ashlv4di3 (gen_lowpart (V4DImode, t1),
+   gen_lowpart (V4DImode, mask),
+   GEN_INT (3)));
  /* Clear MSB bits from the mask just in case it had them set.  */
  emit_insn (gen_avx2_andnotv32qi3 (t2, vt, mask));
  /* After this t1 will have MSB set for elements from other lane.  */
@@ -26289,12 +26289,12 @@ static const struct builtin_description bdesc_args
   { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_psignv16hi3, 
"__builtin_ia32_psignw256", IX86_BUILTIN_PSIGNW256, UNKNOWN, (int) 
V16HI_FTYPE_V16HI_V16HI },
   { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_psignv8si3 , 
"__builtin_ia32_psignd256", IX86_BUILTIN_PSIGND256, UNKNOWN, (int) 
V8SI_FTYPE_V8SI_V8SI },
   { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_ashlv2ti3, 
"__builtin_ia32_pslldqi256", IX86_BUILTIN_PSLLDQI256, UNKNOWN, (int) 
V4DI_FTYPE_V4DI_INT_CONVERT },
-  { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_lshlv16hi3, 
"__builtin_ia32_psllwi256", IX86_BUILTIN_PSLLWI256 , UNKNOWN, (int) 
V16HI_FTYPE_V16HI_SI_COUNT },
-  { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_lshlv16hi3, "__builtin_ia32_psllw256", 
IX86_BUILTIN_PSLLW256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V8HI_COUNT },
-  { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_lshlv8si3, "__builtin_ia32_pslldi256", 
IX86_BUILTIN_PSLLDI256, UNKNOWN, (int) V8SI_FTYPE_V8SI_SI_COUNT },
-  { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_lshlv8si3, "__builtin_ia32_pslld256", 
IX86_BUILTIN_PSLLD256, UNKNOWN, (int) V8SI_FTYPE_V8SI_V4SI_COUNT },
-  { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_lshlv4di3, "__builtin_ia32_psllqi256", 
IX86_BUILTIN_PSLLQI256, UNKNOWN, (int) V4DI_FTYPE_V4DI_INT_COUNT },
-  { OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_lshlv4di3, "__builtin_ia32_psllq256", 
IX86_BUILTIN_PSLLQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V2DI_COUNT },
+  { OPTION_MASK_ISA_AVX2, CODE_FOR_ashlv16hi3, "__builtin_ia32_psllwi256", 
IX86_BUILTIN_PSLLWI256 , UNKNOWN, (int) V16HI_FTYPE_V16HI_SI_COUNT },
+  { OPTION_MASK_ISA_AVX2, CODE_FOR_ashlv16hi3, "__builtin_ia32_psllw256", 
IX86_BUILTIN_PSLLW256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V8HI_COUNT },
+  { OPTION_MASK_ISA_AVX2, CODE_FOR_ashlv8si3, "__builtin_ia32_pslldi256", 
IX86_BUILTIN_PSLLDI256, UNKNOWN, (int) V8SI_FTYPE_V8SI_SI_COUNT },
+  { OPTION_MASK_ISA_AVX2, CODE_FOR_ashlv8si3, "__builtin_ia32_pslld256", 
IX86_BUILTIN_PSLLD256, UNKNOWN, (int) V8SI_FTYPE_V8SI_V4SI_COUNT },
+  { OPTION_MASK_ISA_AVX2, CODE_FOR_ashlv4di3, "__builtin_ia32_psllqi256", 
IX86_BUILTIN_PSLLQI256, UNKNOWN, (int) V4DI_FTYPE_V4DI_INT_COUNT },
+  { OPTION_MASK_ISA_AVX2, CODE_FOR_ashlv4di3, "__builtin_ia32_psllq256", 
IX86_BUILTIN_PSLLQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V2DI_COUNT },
   { OPTION_MASK_ISA_AVX2, CODE_FOR_ashrv16hi3, "__builtin_

[CRIS] Hookize GO_IF_MODE_DEPENDENT_ADDRESS

2011-10-23 Thread Anatoly Sokolov
 Hello.

  This patch removes the obsolete GO_IF_LEGITIMATE_ADDRESS macro from the CRIS
back end in GCC and introduces the equivalent TARGET_LEGITIMATE_ADDRESS_P
target hook.

  Regression tested on cris-axis-elf.

  OK to install?

* config/cris/cris.c (reg_ok_for_base_p, reg_ok_for_index_p,
cris_constant_index_p, cris_base_p, cris_index_p,
cris_base_or_autoincr_p, cris_bdap_index_p, cris_biap_index_p,
cris_legitimate_address_p): New functions.
(TARGET_LEGITIMATE_ADDRESS_P): Define.
(cris_pic_symbol_type, cris_valid_pic_const): Change arguments type
from rtx to const_rtx.
(cris_print_operand_address, cris_address_cost,
cris_side_effect_mode_ok):  Use
cris_constant_index_p, cris_base_p, cris_base_or_autoincr_p,
cris_biap_index_p and cris_bdap_index_p.
* config/cris/cris.h (CONSTANT_INDEX_P, BASE_P, BASE_OR_AUTOINCR_P,
BDAP_INDEX_P, BIAP_INDEX_P, GO_IF_LEGITIMATE_ADDRESS,
REG_OK_FOR_BASE_P, REG_OK_FOR_INDEX_P): Remove.
(EXTRA_CONSTRAINT_Q, EXTRA_CONSTRAINT_R, EXTRA_CONSTRAINT_T): Use
cris_constant_index_p, cris_base_p, cris_base_or_autoincr_p,
cris_biap_index_p and cris_bdap_index_p.
* config/cris/cris.md (moversideqi movemsideqi peephole2): Use
cris_base_p.
* config/cris/cris-protos.h (cris_constant_index_p, cris_base_p,
cris_base_or_autoincr_p, cris_bdap_index_p, cris_biap_index_p): New
prototype.
(cris_pic_symbol_type, cris_valid_pic_const): Update prototype.
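
For readers unfamiliar with the strict/non-strict distinction carried by the
bool argument of the new hook, a toy model may help. This is only a sketch:
the register numbers and helper names below are hypothetical stand-ins for
HARD_REGISTER_P and REGNO_OK_FOR_BASE_P, not the CRIS definitions. It mirrors
the shape of the reg_ok_for_base_p test in the patch: in non-strict mode any
pseudo register is accepted, in strict mode only a suitable hard register is.

```cpp
// Toy model of a strict vs. non-strict base-register check.
// All names and register numbers are hypothetical, not taken from GCC.
constexpr unsigned kFirstPseudoRegister = 16;  // assumed hard/pseudo boundary
constexpr unsigned kLastBaseRegister = 14;     // assumed last usable base reg

// Stand-in for HARD_REGISTER_P.
constexpr bool is_hard_register (unsigned regno)
{
  return regno < kFirstPseudoRegister;
}

// Stand-in for REGNO_OK_FOR_BASE_P.
constexpr bool regno_ok_for_base (unsigned regno)
{
  return regno <= kLastBaseRegister;
}

// Same shape as the check in the patch: pseudos pass only when not strict.
constexpr bool reg_ok_for_base (unsigned regno, bool strict)
{
  return (!strict && !is_hard_register (regno)) || regno_ok_for_base (regno);
}
```

With these assumed numbers, pseudo register 20 is accepted before register
allocation (strict == false) and rejected afterwards (strict == true).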

Index: gcc/config/cris/cris.c
===
--- gcc/config/cris/cris.c  (revision 180345)
+++ gcc/config/cris/cris.c  (working copy)
@@ -125,6 +125,8 @@
 
 static reg_class_t cris_preferred_reload_class (rtx, reg_class_t);
 
+static bool cris_legitimate_address_p (enum machine_mode, rtx, bool);
+
 static int cris_register_move_cost (enum machine_mode, reg_class_t, 
reg_class_t);
 static int cris_memory_move_cost (enum machine_mode, reg_class_t, bool);
 static bool cris_rtx_costs (rtx, int, int, int, int *, bool);
@@ -200,6 +202,9 @@
 #undef TARGET_INIT_LIBFUNCS
 #define TARGET_INIT_LIBFUNCS cris_init_libfuncs
 
+#undef TARGET_LEGITIMATE_ADDRESS_P
+#define TARGET_LEGITIMATE_ADDRESS_P cris_legitimate_address_p
+
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS cris_preferred_reload_class
 
@@ -1122,7 +1127,7 @@
 
   if (CONSTANT_ADDRESS_P (x))
 cris_output_addr_const (file, x);
-  else if (BASE_OR_AUTOINCR_P (x))
+  else if (cris_base_or_autoincr_p (x, true))
 cris_print_base (x, file);
   else if (GET_CODE (x) == PLUS)
 {
@@ -1130,12 +1135,12 @@
 
   x1 = XEXP (x, 0);
   x2 = XEXP (x, 1);
-  if (BASE_P (x1))
+  if (cris_base_p (x1, true))
{
  cris_print_base (x1, file);
  cris_print_index (x2, file);
}
-  else if (BASE_P (x2))
+  else if (cris_base_p (x2, true))
{
  cris_print_base (x2, file);
  cris_print_index (x1, file);
@@ -1272,6 +1277,136 @@
   gcc_unreachable ();
 }
 
+/* Nonzero if X is a hard reg that can be used as a base.  */
+static inline bool
+reg_ok_for_base_p (const_rtx x, bool strict)
+{
+  return ((! strict && ! HARD_REGISTER_P (x))
+  || REGNO_OK_FOR_BASE_P (REGNO (x)));
+}
+
+/* Nonzero if X is a hard reg that can be used as an index.  */
+static inline bool
+reg_ok_for_index_p (const_rtx x, bool strict)
+{
+  return reg_ok_for_base_p (x, strict);
+}
+
+/* No symbol can be used as an index (or more correct, as a base) together
+   with a register with PIC; the PIC register must be there.  */
+
+bool
+cris_constant_index_p (const_rtx x)
+{
+  return (CONSTANT_P (x) && (!flag_pic || cris_valid_pic_const (x, true)));
+}
+
+/* True if X is a valid base register.  */
+
+bool
+cris_base_p (const_rtx x, bool strict)
+{
+  return (REG_P (x) && reg_ok_for_base_p (x, strict));
+}
+
+/* True if X is a valid index register.  */
+
+static inline bool
+cris_index_p (const_rtx x, bool strict)
+{
+  return (REG_P (x) && reg_ok_for_index_p (x, strict));
+}
+
+/* True if X is a valid base register with or without autoincrement.  */
+
+bool
+cris_base_or_autoincr_p (const_rtx x, bool strict)
+{
+  return (cris_base_p (x, strict)
+ || (GET_CODE (x) == POST_INC
+ && cris_base_p (XEXP (x, 0), strict)
+ && REGNO (XEXP (x, 0)) != CRIS_ACR_REGNUM));
+}
+
+/* True if X is a valid (register) index for BDAP, i.e. [Rs].S or [Rs+].S.  */
+
+bool
+cris_bdap_index_p (const_rtx x, bool strict)
+{
+  return ((MEM_P (x)
+  && GET_MODE (x) == SImode
+  && cris_base_or_autoincr_p (XEXP (x, 0), strict))
+ || (GET_CODE (x) == SIGN_EXTEND
+ && MEM_P (XEXP (x, 0))
+ && (GET_MODE (XEXP (x, 0)) == HImode
+ || GET_MODE (XEXP (x, 0)) == QImode)
+ && cris_base_or_autoincr_p (XEXP (XEXP (x, 0), 0), strict)))

Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Gerald Pfeifer
Is it possible that this is responsible for a bootstrap failure introduced 
in the last 27 hours or so?

/scratch/tmp/gerald/gcc-HEAD/gcc/tree-object-size.c:44:59: error: narrowing 
conversion of '-0x1' from 'int' to 'long unsigned int' inside { 
} [-Werror=narrowing]
/scratch/tmp/gerald/gcc-HEAD/gcc/tree-object-size.c:44:59: error: narrowing 
conversion of '-0x1' from 'int' to 'long unsigned int' inside { 
} [-Werror=narrowing]
cc1plus: all warnings being treated as errors
gmake[3]: *** [tree-object-size.o] Error 1
gmake[3]: Leaving directory `/local0/scratch/gerald/OBJ-1023-1848/gcc'
gmake[2]: *** [all-stage2-gcc] Error 2
gmake[2]: Leaving directory `/local0/scratch/gerald/OBJ-1023-1848'
gmake[1]: *** [stage2-bubble] Error 2

The code in question is

  static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };

This is on amd64-unknown-freebsd8.0, though I am puzzled it does not
seem to trigger for other 64-bit platforms?

I also filed PR 50841 for the bootstrap failure, especially if it's
not yours.

Gerald
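
A minimal sketch of why the initializer above is rejected once -Wnarrowing is
an error, and one way to spell it without narrowing. The typedef wide_uint is
an assumed stand-in for unsigned HOST_WIDE_INT on a 64-bit host; it is not the
GCC definition.

```cpp
#include <climits>

typedef unsigned long wide_uint;  // assumption: stand-in for unsigned HOST_WIDE_INT

// In C++11 a braced initializer may not narrow.  The constant -1 has type
// int and its value cannot be represented in an unsigned type, so this
// line would be diagnosed:
//
//   static wide_uint unknown_sizes[4] = { -1, -1, 0, 0 };
//
// An explicit conversion states the intent and is not a narrowing
// conversion.  The constants 0 fit in the target type and need no cast.
static wide_uint unknown_sizes[4]
  = { static_cast<wide_uint> (-1), static_cast<wide_uint> (-1), 0, 0 };
```

The converted value is the maximum of the unsigned type, which is what the
original initializer relied on implicitly.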


2011-10-23  Paolo Carlini  

PR c++/50810
* c-opts.c (c_common_handle_option): Enable -Wnarrowing as part
of -Wall; include -Wnarrowing in -Wc++0x-compat; adjust default
Wnarrowing for C++0x and C++98.
* c.opt ([Wnarrowing]): Update.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

On 10/23/2011 10:07 PM, Gerald Pfeifer wrote:

> Is it possible that this is responsible for a bootstrap failure introduced
> in the last 27 hours or so?
>
> /scratch/tmp/gerald/gcc-HEAD/gcc/tree-object-size.c:44:59: error: narrowing 
> conversion of '-0x1' from 'int' to 'long unsigned int' inside { 
> } [-Werror=narrowing]
> /scratch/tmp/gerald/gcc-HEAD/gcc/tree-object-size.c:44:59: error: narrowing 
> conversion of '-0x1' from 'int' to 'long unsigned int' inside { 
> } [-Werror=narrowing]

So, to be clear, this is for bootstrapping with a C++ compiler, right? 
Honestly, didn't try that... It's definitely possible that there are 
glitches in the tree wrt -Wnarrowing in C++.


Paolo.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Eric Botcazou
> The code in question is
>
>   static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };
>
> This is on amd64-unknown-freebsd8.0, though I am puzzled it does not
> seem to trigger for other 64-bit platforms?

It does trigger on Linux.  I guess the patch wasn't bootstrapped.

There is another problem in Ada.  Fixed thusly.


2011-10-23  Eric Botcazou  

* gcc-interface/decl.c (create_concat_name): Add explicit cast.


-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 180235)
+++ gcc-interface/decl.c	(working copy)
@@ -8976,7 +8976,7 @@ create_concat_name (Entity_Id gnat_entit
 
   if (suffix)
 {
-  String_Template temp = {1, strlen (suffix)};
+  String_Template temp = {1, (int) strlen (suffix)};
   Fat_Pointer fp = {suffix, &temp};
   Get_External_Name_With_Suffix (gnat_entity, fp);
 }
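
The Ada fix above is the non-constant flavour of the same rule: strlen
returns size_t, and a size_t value of unknown magnitude may not initialize an
int member inside braces without an explicit cast. A small sketch follows;
string_bounds and bounds_for are hypothetical names used for illustration,
not the gigi types.

```cpp
#include <cstring>

// Hypothetical stand-in for a two-int bounds record.
struct string_bounds
{
  int low;
  int high;
};

inline string_bounds bounds_for (const char *suffix)
{
  // Would be diagnosed in C++11, because size_t -> int narrows and the
  // value is not a constant expression:
  //
  //   string_bounds b = {1, std::strlen (suffix)};
  //
  string_bounds b = {1, static_cast<int> (std::strlen (suffix))};
  return b;
}
```

The cast assumes the string length fits in an int, which is the same
assumption the original code made silently.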


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Eric Botcazou
> So, to be clear, this is for bootstrapping with a C++ compiler, right?
> Honestly, didn't try that... It's definitely possible that there are
> glitches in the tree wrt -Wnarrowing in C++.

Bootstrapping with the C++ compiler has been the default for months...

-- 
Eric Botcazou


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

On 10/23/2011 10:19 PM, Eric Botcazou wrote:

>> So, to be clear, this is for bootstrapping with a C++ compiler, right?
>> Honestly, didn't try that... It's definitely possible that there are
>> glitches in the tree wrt -Wnarrowing in C++.
>
> Bootstrapping with the C++ compiler has been the default for months...

Oh my, I thought I was still using C here... Ok, I'll fix that.

Paolo.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Eric Botcazou
> Oh my, I thought I was still using C here... Ok, I'll fix that.

The base compiler is a C compiler, stage 2/3 are built with the C++ compiler.

-- 
Eric Botcazou


Re: [PATCH] Fix mv8plus, allow targetting Linux or Solaris from other sparc host.

2011-10-23 Thread David Miller
From: Eric Botcazou 
Date: Sun, 23 Oct 2011 11:58:57 +0200

>> I'll try to brainstorm on this, thanks for letting me know about the
>> Solaris target problem.
> 
> Let's fix the regression quickly though.

I'll fix it by the end of tonight.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

On 10/23/2011 10:25 PM, Eric Botcazou wrote:

>> Oh my, I thought I was still using C here... Ok, I'll fix that.
>
> The base compiler is a C compiler, stage 2/3 are built with the C++ compiler.

Yes, yes. Sorry about this.

Anyway, the below appears to work for me. Eric shall I commit it?

Thanks,
Paolo.


Index: tree-ssa-ccp.c
===
--- tree-ssa-ccp.c  (revision 180346)
+++ tree-ssa-ccp.c  (working copy)
@@ -2011,7 +2011,9 @@ ccp_visit_stmt (gimple stmt, edge *taken_edge_p, t
  Mark them VARYING.  */
   FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS)
 {
-  prop_value_t v = { VARYING, NULL_TREE, { -1, (HOST_WIDE_INT) -1 } };
+  prop_value_t v =
+   { VARYING, NULL_TREE, { (unsigned HOST_WIDE_INT) -1,
+   (HOST_WIDE_INT) -1 } };
   set_lattice_value (def, v);
 }
 
Index: tree-object-size.c
===
--- tree-object-size.c  (revision 180346)
+++ tree-object-size.c  (working copy)
@@ -41,7 +41,9 @@ struct object_size_info
   unsigned int *stack, *tos;
 };
 
-static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };
+static unsigned HOST_WIDE_INT unknown[4]
+= { (unsigned HOST_WIDE_INT)-1, (unsigned HOST_WIDE_INT)-1,
+(unsigned HOST_WIDE_INT)0, (unsigned HOST_WIDE_INT)0 };
 
 static tree compute_object_offset (const_tree, const_tree);
 static unsigned HOST_WIDE_INT addr_object_size (struct object_size_info *,


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Eric Botcazou
> Anyway, the below appears to work for me. Eric shall I commit it?

I have other errors for config/i386/i386.c on my x86-64 machine.  But are we 
sure that we want to warn on

static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };

with -Wall?  This seems overly picky to me.

-- 
Eric Botcazou


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing

2011-10-23 Thread Paolo Carlini

On 10/23/2011 10:39 PM, Paolo Carlini wrote:

> On 10/23/2011 10:25 PM, Eric Botcazou wrote:
>
>>> Oh my, I thought I was still using C here... Ok, I'll fix that.
>> The base compiler is a C compiler, stage 2/3 are built with the C++
>> compiler.
>
> Yes, yes. Sorry about this.
>
> Anyway, the below appears to work for me. Eric shall I commit it?

Nope, doesn't work, there are *many* more issues in gcc/config.

I'm afraid we are not ready to enable this yet; target maintainers have
to help clean up gcc/config first. I'm going to revert my patch.


Paolo.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing

2011-10-23 Thread Paolo Carlini

On 10/23/2011 10:47 PM, Paolo Carlini wrote:

> On 10/23/2011 10:39 PM, Paolo Carlini wrote:
>
>> On 10/23/2011 10:25 PM, Eric Botcazou wrote:
>>
>>>> Oh my, I thought I was still using C here... Ok, I'll fix that.
>>> The base compiler is a C compiler, stage 2/3 are built with the C++
>>> compiler.
>>
>> Yes, yes. Sorry about this.
>>
>> Anyway, the below appears to work for me. Eric shall I commit it?
>
> Nope, doesn't work, there are *many* more issues in gcc/config.
>
> I'm afraid we are not ready to enable this yet; target maintainers have
> to help clean up gcc/config first. I'm going to revert my patch.

Done.

Paolo.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Gabriel Dos Reis
On Sun, Oct 23, 2011 at 3:45 PM, Eric Botcazou  wrote:
>> Anyway, the below appears to work for me. Eric shall I commit it?
>
> I have other errors for config/i386/i386.c on my x86-64 machine.  But are we
> sure that we want to warn on
>
> static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };
>
> with -Wall?  This seems overly picky to me.
>

The warning probably should not be in -Wall.  It is fairly recent in C++, and I
think we should allow users to adapt before enabling it by default.


Re: [C++-11] User defined literals

2011-10-23 Thread Ed Smith-Rowland

On 10/21/2011 05:20 PM, Jason Merrill wrote:

I think we're down to minor cosmetic issues:

On 10/21/2011 03:55 PM, Tom Tromey wrote:

There are a few spots like this that are missing a space before an open
paren.



+  if (DECL_LANGUAGE(decl) == lang_c)


Another one.


-  if (warn_cxx0x_compat
- && C_RID_CODE (token->u.value) >= RID_FIRST_CXX0X
- && C_RID_CODE (token->u.value) <= RID_LAST_CXX0X)



This code doesn't seem to have actually changed, so let's not adjust 
its whitespace.



+  /* Fill in PARMVEC with all of the parameters.  */
+  parmvec = make_tree_vec (len);


Let's call it 'charvec'; the characters are template arguments, not 
parameters.


+/* Parse a user-defined numeric constant.  returns a call to a user-defined
+   literal operator.  */
+static tree
+cp_parser_userdef_numeric_literal (cp_parser *parser)


Add a blank line between comment and function.

While looking at the embedded string issue I found that if you apply 
the suffix of a raw literal to a string it errors as it should but 
the error complained that there were too many arguments for the 
function.  This was not helpful so I made a nicer error message.



+  if (result == error_mark_node)
+error ("invalid string literal prefix %<\"%s\"%> for user-defined"
+  " raw literal operator %qD", TREE_STRING_POINTER (value), name);


I think that we want a combination of the two errors; the new error 
doesn't help the user to fix their code as much.  It should remind 
them that for a string literal the function is called with a length 
argument as well.


Concerning this error, the only way to get here is to mis-use a raw 
literal operator by giving it a quoted string.  The prefix must be 
interpretable as a number of some kind.  I think I'll tell the user to 
drop the quotes.
The length of a string literal is supplied implicitly by the compiler to 
a string literal operator when a string user defined literal is 
encountered.  The user doesn't explicitly call the operator (not here 
anyway).
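
To make the distinction concrete, here is a small C++11 sketch of the two
kinds of operator being discussed. The suffixes _len and _digits and the
helper count_chars are made up for illustration and do not come from the
patch. The string literal operator receives the length from the compiler,
while the raw literal operator only ever sees the spelling of a numeric
literal.

```cpp
#include <cstddef>

// String literal operator: for "abc"_len the compiler passes both the
// pointer and the length (3); the user never supplies the length.
constexpr std::size_t operator""_len (const char *, std::size_t length)
{
  return length;
}

// Helper: length of a null-terminated string, written as a single return
// statement so it is a valid C++11 constexpr function.
constexpr std::size_t count_chars (const char *s)
{
  return *s ? 1 + count_chars (s + 1) : 0;
}

// Raw literal operator: for 123_digits it is called with the spelling
// "123" as a null-terminated string, with no length argument.
constexpr std::size_t operator""_digits (const char *spelling)
{
  return count_chars (spelling);
}

// "123"_digits would not compile: a quoted string needs an operator taking
// (const char *, std::size_t), and none is declared for _digits.
```

This is the misuse described above: applying a raw-operator suffix to a
quoted string, where dropping the quotes is the fix.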


+   error ("literal operator template %qD has invalid parameter list",
+  decl);


Similarly, this message should say that the parameter list needs to be 





+
+/* Return true if a user-defined literal operator is a raw operator.  */

+


We don't need the extra newline before the comment.

Should be ready to go with these tweaks.

Jason


I've made these corrections.  They'll be in the next patch.

Unfortunately, as I was testing raw operators on very long strings I 
observed two things:
1. A bad error - the argument to a raw literal operator must be a 
null-terminated string.
2. If a very long number is given as the prefix to a numeric literal, a 
warning is issued ("integer constant is too large for its type")
If the receiving operator is either the raw operator or the operator 
template then this should not be given.  I anticipate people might like 
to have multi-precision numbers someday, for example.


Before I release another patch I have to fix 1.  The warning might be 
fixed in-tree if that's OK.


Ed



Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

On 10/23/2011 11:05 PM, Gabriel Dos Reis wrote:

> On Sun, Oct 23, 2011 at 3:45 PM, Eric Botcazou  wrote:
>
>>> Anyway, the below appears to work for me. Eric shall I commit it?
>>
>> I have other errors for config/i386/i386.c on my x86-64 machine.  But are we
>> sure that we want to warn on
>>
>> static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };
>>
>> with -Wall?  This seems overly picky to me.
>
> The warning probably should not be in -Wall.  It is fairly recent in C++, and I
> think we should allow users to adapt before enabling it by default.

The issue is that we wanted -Wconversion to be enabled by -Wc++0x-compat 
(after all, it's what the PR asks) but the latter is *already* in -Wall.


Personally, I would be in favor of taking -Wc++0x-compat out of -Wall.

Paolo.


[PATCH] Use a macro instead of a constant to test for sparc integer regnos.

2011-10-23 Thread David Miller

Since there is a mixture of signed vs. unsigned regnos used in these
tests, I had to code this as ((unsigned) x <= 31) otherwise we get
warnings in the unsigned cases for x >= 0.
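
A short sketch of the idiom described above, written as a C++ function
template rather than a macro. The names int_reg_p and kLastIntReg are
illustrative only; the bound of 31 is the one mentioned in this message.
Converting to unsigned first means a negative regno wraps to a very large
value and fails the single comparison, so no separate x >= 0 test (which
compilers flag as always true for unsigned operands) is needed.

```cpp
// Illustrative names; only the bound of 31 comes from the message above.
constexpr unsigned kLastIntReg = 31;

// Works for both signed and unsigned register numbers: a negative value
// converts to a large unsigned value and is rejected by the one comparison.
template <typename RegnoType>
constexpr bool int_reg_p (RegnoType regno)
{
  return static_cast<unsigned> (regno) <= kLastIntReg;
}
```

The same single comparison therefore covers callers holding an int and
callers holding an unsigned int.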

Perhaps the signedness should be shored up at some point, but I left
that for some other time.

I'm currently just trying to make the VIS3 fp<-->int move patch more
readable.

Committed to trunk.

gcc/

* config/sparc/sparc.h (SPARC_FIRST_INT_REG, SPARC_LAST_INT_REG,
SPARC_INT_REG_P): Define.
(HARD_REGNO_NREGS): Use SPARC_INT_REG_P.
(REGNO_OK_FOR_INDEX_P): Likewise.
* config/sparc/sparc.c (gen_df_reg): Likewise.
(eligible_for_return_delay): Likewise.
(eligible_for_sibcall_delay): Likewise.
(sparc_legitimate_address_p): Likewise.
(emit_save_or_restore_regs): Likewise.
(registers_ok_for_ldd_peep): Likewise.
* config/sparc/sparc.md (DI mode splitters): Likewise.
(SF mode const splitters): Likewise.
(DF mode splitters): Likewise.
(32-bit DI mode logical op splitters): Likewise.
---
 gcc/ChangeLog |   17 +
 gcc/config/sparc/sparc.c  |   18 +-
 gcc/config/sparc/sparc.h  |   12 +---
 gcc/config/sparc/sparc.md |   40 
 4 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 42766f1..e647a60 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,20 @@
+2011-10-23  David S. Miller  
+
+   * config/sparc/sparc.h (SPARC_FIRST_INT_REG, SPARC_LAST_INT_REG,
+   SPARC_INT_REG_P): Define.
+   (HARD_REGNO_NREGS): Use SPARC_INT_REG_P.
+   (REGNO_OK_FOR_INDEX_P): Likewise.
+   * config/sparc/sparc.c (gen_df_reg): Likewise.
+   (eligible_for_return_delay): Likewise.
+   (eligible_for_sibcall_delay): Likewise.
+   (sparc_legitimate_address_p): Likewise.
+   (emit_save_or_restore_regs): Likewise.
+   (registers_ok_for_ldd_peep): Likewise.
+   * config/sparc/sparc.md (DI mode splitters): Likewise.
+   (SF mode const splitters): Likewise.
+   (DF mode splitters): Likewise.
+   (32-bit DI mode logical op splitters): Likewise.
+
 2011-10-23  Paolo Carlini  
 
PR c++/50841
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index ba88315..415ece8 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -2640,7 +2640,7 @@ gen_df_reg (rtx reg, int low)
   int regno = REGNO (reg);
 
   if ((WORDS_BIG_ENDIAN == 0) ^ (low != 0))
-regno += (TARGET_ARCH64 && regno < 32) ? 1 : 2;
+regno += (TARGET_ARCH64 && SPARC_INT_REG_P (regno)) ? 1 : 2;
   return gen_rtx_REG (DFmode, regno);
 }
 
@@ -3124,7 +3124,7 @@ eligible_for_return_delay (rtx trial)
   /* If this instruction sets up floating point register and we have a return
  instruction, it can probably go in.  But restore will not work
  with FP_REGS.  */
-  if (regno >= 32)
+  if (! SPARC_INT_REG_P (regno))
 return (TARGET_V9
&& !epilogue_renumber (&pat, 1)
&& get_attr_in_uncond_branch_delay (trial)
@@ -3166,7 +3166,7 @@ eligible_for_sibcall_delay (rtx trial)
  a `restore' insn can go into the delay slot.  */
   if (GET_CODE (SET_DEST (pat)) != REG
   || (REGNO (SET_DEST (pat)) >= 8 && REGNO (SET_DEST (pat)) < 24)
-  || REGNO (SET_DEST (pat)) >= 32)
+  || ! SPARC_INT_REG_P (REGNO (SET_DEST (pat))))
 return 0;
 
   /* If it mentions %o7, it can't go in, because sibcall will clobber it
@@ -3486,11 +3486,11 @@ sparc_legitimate_address_p (enum machine_mode mode, rtx 
addr, bool strict)
 }
   else
 {
-  if ((REGNO (rs1) >= 32
+  if ((! SPARC_INT_REG_P (REGNO (rs1))
   && REGNO (rs1) != FRAME_POINTER_REGNUM
   && REGNO (rs1) < FIRST_PSEUDO_REGISTER)
  || (rs2
- && (REGNO (rs2) >= 32
+ && (! SPARC_INT_REG_P (REGNO (rs2))
  && REGNO (rs2) != FRAME_POINTER_REGNUM
  && REGNO (rs2) < FIRST_PSEUDO_REGISTER)))
return 0;
@@ -4729,17 +4729,17 @@ emit_save_or_restore_regs (unsigned int low, unsigned 
int high, rtx base,
 
  if (reg0 && reg1)
{
- mode = i < 32 ? DImode : DFmode;
+ mode = SPARC_INT_REG_P (i) ? DImode : DFmode;
  regno = i;
}
  else if (reg0)
{
- mode = i < 32 ? SImode : SFmode;
+ mode = SPARC_INT_REG_P (i) ? SImode : SFmode;
  regno = i;
}
  else if (reg1)
{
- mode = i < 32 ? SImode : SFmode;
+ mode = SPARC_INT_REG_P (i) ? SImode : SFmode;
  regno = i + 1;
  offset += 4;
}
@@ -7794,7 +7794,7 @@ registers_ok_for_ldd_peep (rtx reg1, rtx reg2)
 return 0;
 
   /* Integer ldd is deprecated in SPARC V9 */
-  if (TARGET_V9 && REGNO (reg1) < 32)
+  if (TARGET_V9 && SPARC_INT_REG_P (REGNO (reg1)))
 retur

[PATCH] Fix sparc so that reload doesn't try to load non-trivial vector consts directly.

2011-10-23 Thread David Miller

While working on the vis3 fp move support, I came to find that we
weren't making sure that non-trivial vector constants went to memory.

The vector move patterns only support -1 and 0, so we have to force
all other values to memory.
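
The rule can be pictured with a toy predicate over the elements of a vector
constant. This is a sketch only: it models a constant as a std::vector of
32-bit integers, and the function names are invented here; the actual patch
uses const_zero_operand and const_all_ones_operand on RTL.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// True if every element of the modelled vector constant equals VALUE.
static bool all_elements_equal (const std::vector<std::int32_t> &elements,
                                std::int32_t value)
{
  return std::all_of (elements.begin (), elements.end (),
                      [value] (std::int32_t e) { return e == value; });
}

// Toy version of the rule: only all-zeros and all-ones (-1 in every
// element) may be moved directly; anything else has to go through memory.
bool can_load_directly (const std::vector<std::int32_t> &elements)
{
  return !elements.empty ()
         && (all_elements_equal (elements, 0)
             || all_elements_equal (elements, -1));
}
```

Under this model a constant such as {0, -1} or {1, 2} is "non-trivial" and
would be forced to memory.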

Committed to trunk.

gcc/

* config/sparc/predicates.md (input_operand): Disallow vector
constants other than 0 and -1.
* config/sparc/sparc.c (sparc_preferred_reload_class): Return
NO_REGS for vector constants other than 0 and -1.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180351 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |5 +
 gcc/config/sparc/predicates.md |8 ++--
 gcc/config/sparc/sparc.c   |   15 ---
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e647a60..3dc4ba9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2011-10-23  David S. Miller  
 
+   * config/sparc/predicates.md (input_operand): Disallow vector
+   constants other than 0 and -1.
+   * config/sparc/sparc.c (sparc_preferred_reload_class): Return
+   NO_REGS for vector constants other than 0 and -1.
+
* config/sparc/sparc.h (SPARC_FIRST_INT_REG, SPARC_LAST_INT_REG,
SPARC_INT_REG_P): Define.
(HARD_REGNO_NREGS): Use SPARC_INT_REG_P.
diff --git a/gcc/config/sparc/predicates.md b/gcc/config/sparc/predicates.md
index f0be149..4dd734f 100644
--- a/gcc/config/sparc/predicates.md
+++ b/gcc/config/sparc/predicates.md
@@ -427,8 +427,12 @@
   && (GET_CODE (op) == CONST_DOUBLE || GET_CODE (op) == CONST_INT))
 return true;
 
-  if ((mclass == MODE_FLOAT && GET_CODE (op) == CONST_DOUBLE)
-  || (mclass == MODE_VECTOR_INT && GET_CODE (op) == CONST_VECTOR))
+  if (mclass == MODE_FLOAT && GET_CODE (op) == CONST_DOUBLE)
+return true;
+
+  if (mclass == MODE_VECTOR_INT && GET_CODE (op) == CONST_VECTOR
+  && (const_zero_operand (op, mode)
+  || const_all_ones_operand (op, mode)))
 return true;
 
   if (register_operand (op, mode))
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 415ece8..df0d825 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -6,17 +6,26 @@ sparc_conditional_register_usage (void)
 static reg_class_t
 sparc_preferred_reload_class (rtx x, reg_class_t rclass)
 {
+  enum machine_mode mode = GET_MODE (x);
   if (CONSTANT_P (x))
 {
   if (FP_REG_CLASS_P (rclass)
  || rclass == GENERAL_OR_FP_REGS
  || rclass == GENERAL_OR_EXTRA_FP_REGS
- || (GET_MODE_CLASS (GET_MODE (x)) == MODE_FLOAT && ! TARGET_FPU)
- || (GET_MODE (x) == TFmode && ! const_zero_operand (x, TFmode)))
+ || (GET_MODE_CLASS (mode) == MODE_FLOAT && ! TARGET_FPU)
+ || (mode == TFmode && ! const_zero_operand (x, mode)))
return NO_REGS;
 
-  if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
+  if (GET_MODE_CLASS (mode) == MODE_INT)
return GENERAL_REGS;
+
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+   {
+ if (! FP_REG_CLASS_P (rclass)
+ || !(const_zero_operand (x, mode)
+  || const_all_ones_operand (x, mode)))
+   return NO_REGS;
+   }
 }
 
   return rclass;
-- 
1.7.6.401.g6a319



[PATCH] Add missing fzero/fone cases to DImode move on 32-bit v9 sparc.

2011-10-23 Thread David Miller

A good example of a case where this matters is pdist.c in the testsuite.
Before this change we get code for the beginning of function 'foo' like:

add %sp, -112, %sp
std %o0, [%sp+96]
stx %g0, [%sp+104]
ldd [%sp+96], %f10
std %o2, [%sp+96]
ldd [%sp+104], %f8
ldd [%sp+96], %f12
pdist   %f10, %f12, %f8

now it will look like:

add %sp, -88, %sp
fzero   %f8
std %o0, [%sp+72]
ldd [%sp+72], %f10
std %o2, [%sp+72]
ldd [%sp+72], %f12
pdist   %f10, %f12, %f8

And it will get a lot better when the VIS3 moves are available.

Now that we've added this case, we have to make sure the DI mode
const_int --> reg splitter doesn't trigger for float regs.

Committed to trunk.

gcc/

* config/sparc/sparc.md (*movdi_insn_sp32_v9): Add alternatives for
generating fzero and fone instructions.
(DImode const_int --> reg splitter): Only trigger for integer regs.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180352 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog |4 
 gcc/config/sparc/sparc.md |   22 +++---
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3dc4ba9..be79367 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2011-10-23  David S. Miller  
 
+   * config/sparc/sparc.md (*movdi_insn_sp32_v9): Add alternatives for
+   generating fzero and fone instructions.
+   (DImode const_int --> reg splitter): Only trigger for integer regs.
+
* config/sparc/predicates.md (input_operand): Disallow vector
constants other than 0 and -1.
* config/sparc/sparc.c (sparc_preferred_reload_class): Return
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index c6454f5..fa27bba 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -1488,9 +1488,9 @@
 
 (define_insn "*movdi_insn_sp32_v9"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-   "=T,o,T,U,o,r,r,r,?T,?f,?f,?o,?e,?e,?W")
+   
"=T,o,T,U,o,r,r,r,?T,?f,?f,?o,?e,?e,?W,b,b")
 (match_operand:DI 1 "input_operand"
-   " J,J,U,T,r,o,i,r, f, T, o, f, e, W, 
e"))]
+   " J,J,U,T,r,o,i,r, f, T, o, f, e, W, 
e,J,P"))]
   "! TARGET_ARCH64
&& TARGET_V9
&& (register_operand (operands[0], DImode)
@@ -1510,10 +1510,12 @@
#
fmovd\\t%1, %0
ldd\\t%1, %0
-   std\\t%1, %0"
-  [(set_attr "type" 
"store,store,store,load,*,*,*,*,fpstore,fpload,*,*,fpmove,fpload,fpstore")
-   (set_attr "length" "*,2,*,*,2,2,2,2,*,*,2,2,*,*,*")
-   (set_attr "fptype" "*,*,*,*,*,*,*,*,*,*,*,*,double,*,*")])
+   std\\t%1, %0
+   fzero\t%0
+   fone\t%0"
+  [(set_attr "type" 
"store,store,store,load,*,*,*,*,fpstore,fpload,*,*,fpmove,fpload,fpstore,fga,fga")
+   (set_attr "length" "*,2,*,*,2,2,2,2,*,*,2,2,*,*,*,*,*")
+   (set_attr "fptype" "*,*,*,*,*,*,*,*,*,*,*,*,double,*,*,double,double")])
 
 (define_insn "*movdi_insn_sp64"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,m,?e,?e,?W,b,b")
@@ -1757,7 +1759,13 @@
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 (match_operand:DI 1 "const_int_operand" ""))]
-  "! TARGET_ARCH64 && reload_completed"
+  "! TARGET_ARCH64
+   && ((GET_CODE (operands[0]) == REG
+&& SPARC_INT_REG_P (REGNO (operands[0])))
+   || (GET_CODE (operands[0]) == SUBREG
+   && GET_CODE (SUBREG_REG (operands[0])) == REG
+   && SPARC_INT_REG_P (REGNO (SUBREG_REG (operands[0])))))
+   && reload_completed"
   [(clobber (const_int 0))]
 {
 #if HOST_BITS_PER_WIDE_INT == 32
-- 
1.7.6.401.g6a319



Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Gabriel Dos Reis
On Sun, Oct 23, 2011 at 4:28 PM, Paolo Carlini  wrote:
> On 10/23/2011 11:05 PM, Gabriel Dos Reis wrote:
>>
>> On Sun, Oct 23, 2011 at 3:45 PM, Eric Botcazou
>>  wrote:

>>>> Anyway, the below appears to work for me. Eric shall I commit it?
>>>
>>> I have other errors for config/i386/i386.c on my x86-64 machine.  But are
>>> we
>>> sure that we want to warn on
>>>
>>> static unsigned HOST_WIDE_INT unknown[4] = { -1, -1, 0, 0 };
>>>
>>> with -Wall?  This seems overly picky to me.
>>>
>> The warning probably should not be in -Wall.  It is fairly recent in C++,
>> and I
>> think we should allow users to adapt before enabling it by default.
>
> The issue is that we wanted -Wconversion to be enabled by -Wc++0x-compat
> (after all, it's what the PR asks) but the latter is *already* in -Wall.

yes.

>
> Personally, I would be in favor of taking -Wc++0x-compat out of -Wall.
>

Patch pre-approved.
It makes sense though that -Wextra implies -Wc++0x-compat.


[COMMITTED] Fortran -- fix memory leak in array transformations.

2011-10-23 Thread Steve Kargl
2011-10-23  Steven G. Kargl  

* simplify.c (simplify_transformation_to_array): Fix memory leak. 

Index: simplify.c
===
--- simplify.c  (revision 180352)
+++ simplify.c  (working copy)
@@ -516,6 +516,7 @@ simplify_transformation_to_array (gfc_ex
  linked-list traversal. Masked elements are set to NULL.  */
   gfc_array_size (array, &size);
   arraysize = mpz_get_ui (size);
+  mpz_clear (size);
 
   arrayvec = XCNEWVEC (gfc_expr*, arraysize);
 
-- 
Steve


[PATCH] Factor out common tests in 8-byte reg/reg move splitters on 32-bit sparc.

2011-10-23 Thread David Miller

These tests are already growing big and ugly, and they will get even
more conditions when the VIS3 FP moves arrive.

Furthermore, we will need to do the same check in a third location,
in a new splitter for the 64-bit vector moves that will be added for VIS3.

Committed to trunk.

* config/sparc/sparc.c (sparc_split_regreg_legitimate): New
function.
* config/sparc/sparc-protos.h (sparc_split_regreg_legitimate):
Declare it.
* config/sparc/sparc.md (DImode reg/reg split): Use it.
(DFmode reg/reg split): Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180354 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog   |7 +++
 gcc/config/sparc/sparc-protos.h |1 +
 gcc/config/sparc/sparc.c|   25 +
 gcc/config/sparc/sparc.md   |   14 --
 4 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index be79367..dfa4caf 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2011-10-23  David S. Miller  
 
+   * config/sparc/sparc.c (sparc_split_regreg_legitimate): New
+   function.
+   * config/sparc/sparc-protos.h (sparc_split_regreg_legitimate):
+   Declare it.
+   * config/sparc/sparc.md (DImode reg/reg split): Use it.
+   (DFmode reg/reg split): Likewise.
+
* config/sparc/sparc.md (*movdi_insn_sp32_v9): Add alternatives for
generating fzero and fone instructions.
(DImode const_int --> reg splitter): Only trigger for integer regs.
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index 2890532..bb6fb07 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -68,6 +68,7 @@ extern void sparc_defer_case_vector (rtx, rtx, int);
 extern bool sparc_expand_move (enum machine_mode, rtx *);
 extern void sparc_emit_set_symbolic_const64 (rtx, rtx, rtx);
 extern int sparc_splitdi_legitimate (rtx, rtx);
+extern int sparc_split_regreg_legitimate (rtx, rtx);
 extern int sparc_absnegfloat_split_legitimate (rtx, rtx);
 extern const char *output_ubranch (rtx, int, rtx);
 extern const char *output_cbranch (rtx, rtx, int, int, int, rtx);
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index df0d825..29d2847 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -7762,6 +7762,31 @@ sparc_splitdi_legitimate (rtx reg, rtx mem)
   return 1;
 }
 
+/* Like sparc_splitdi_legitimate but for REG <--> REG moves.  */
+
+int
+sparc_split_regreg_legitimate (rtx reg1, rtx reg2)
+{
+  int regno1, regno2;
+
+  if (GET_CODE (reg1) == SUBREG)
+reg1 = SUBREG_REG (reg1);
+  if (GET_CODE (reg1) != REG)
+return 0;
+  regno1 = REGNO (reg1);
+
+  if (GET_CODE (reg2) == SUBREG)
+reg2 = SUBREG_REG (reg2);
+  if (GET_CODE (reg2) != REG)
+return 0;
+  regno2 = REGNO (reg2);
+
+  if (SPARC_INT_REG_P (regno1) && SPARC_INT_REG_P (regno2))
+return 1;
+
+  return 0;
+}
+
 /* Return 1 if x and y are some kind of REG and they refer to
different hard registers.  This test is guaranteed to be
run after reload.  */
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index fa27bba..b84699a 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -1834,11 +1834,8 @@
   "reload_completed
&& (! TARGET_V9
|| (! TARGET_ARCH64
-   && ((GET_CODE (operands[0]) == REG
-&& SPARC_INT_REG_P (REGNO (operands[0])))
-   || (GET_CODE (operands[0]) == SUBREG
-   && GET_CODE (SUBREG_REG (operands[0])) == REG
-   && SPARC_INT_REG_P (REGNO (SUBREG_REG (operands[0])))"
+   && sparc_split_regreg_legitimate (operands[0],
+ operands[1])))"
   [(clobber (const_int 0))]
 {
   rtx set_dest = operands[0];
@@ -2247,11 +2244,8 @@
 (match_operand:DF 1 "register_operand" ""))]
   "(! TARGET_V9
 || (! TARGET_ARCH64
-&& ((GET_CODE (operands[0]) == REG
- && SPARC_INT_REG_P (REGNO (operands[0])))
-|| (GET_CODE (operands[0]) == SUBREG
-&& GET_CODE (SUBREG_REG (operands[0])) == REG
-&& SPARC_INT_REG_P (REGNO (SUBREG_REG (operands[0])))
+&& sparc_split_regreg_legitimate (operands[0],
+  operands[1])))
&& reload_completed"
   [(clobber (const_int 0))]
 {
-- 
1.7.6.401.g6a319
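

The factored-out test in the patch above can be modelled outside GCC. This is a toy sketch only: ToyRtx, ToyCode, toy_int_reg_p and the sample operands are all hypothetical stand-ins for the real rtx machinery, and the "registers 0..31 are integer registers" rule is an assumption made for the example.

```cpp
#include <cassert>

// Toy stand-ins for GCC's rtx codes; none of this is the real API.
enum ToyCode { TOY_REG, TOY_SUBREG, TOY_MEM };

struct ToyRtx
{
  ToyCode code;
  const ToyRtx *inner;  // operand of a SUBREG, otherwise null
  int regno;            // meaningful only for TOY_REG
};

// Assumption for this sketch: hard registers 0..31 are the integer registers.
static bool toy_int_reg_p (int regno)
{
  return regno >= 0 && regno < 32;
}

// Look through one SUBREG and return the register number, or -1 if the
// operand is not a register.
static int toy_regno (const ToyRtx *x)
{
  if (x->code == TOY_SUBREG)
    x = x->inner;
  if (x == nullptr || x->code != TOY_REG)
    return -1;
  return x->regno;
}

// Same shape as the new helper: both operands must be integer registers.
bool toy_split_regreg_legitimate (const ToyRtx *reg1, const ToyRtx *reg2)
{
  int regno1 = toy_regno (reg1);
  int regno2 = toy_regno (reg2);
  return toy_int_reg_p (regno1) && toy_int_reg_p (regno2);
}

// Sample operands for trying the predicate.
static const ToyRtx int_reg = { TOY_REG, nullptr, 8 };
static const ToyRtx fp_reg = { TOY_REG, nullptr, 40 };
static const ToyRtx sub_of_int = { TOY_SUBREG, &int_reg, 0 };
static const ToyRtx mem_operand = { TOY_MEM, nullptr, 0 };
```

Keeping the SUBREG handling in one helper means a later change to the rule is made in one place instead of in every splitter condition.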



libstdc++/50834 - update thread safety docs w.r.t. C++11

2011-10-23 Thread Jonathan Wakely
PR libstdc++/50834
* doc/xml/manual/using.xml: Update thread safety docs w.r.t. C++11.

committed to trunk
Index: doc/xml/manual/using.xml
===
--- doc/xml/manual/using.xml	(revision 180334)
+++ doc/xml/manual/using.xml	(working copy)
@@ -1281,9 +1281,16 @@ A quick read of the relevant part of the
 Thread Safety
   
 
-
 
-We currently use the http://www.w3.org/1999/xlink"; xlink:href="http://www.sgi.com/tech/stl/thread_safety.html";>SGI STL definition of thread safety.
+In the terms of the 2011 C++ standard a thread-safe program is one which
+does not perform any conflicting non-atomic operations on memory locations
+and so does not contain any data races.
+The standard places requirements on the library to ensure that no data
+races are caused by the library itself or by programs which use the
+library correctly (as described below).
+The C++11 memory model and library requirements are a more formal version
+of the http://www.w3.org/1999/xlink"; xlink:href="http://www.sgi.com/tech/stl/thread_safety.html";>SGI STL definition of thread safety, which the library used
+prior to the 2011 standard.
 
 
 
@@ -1329,17 +1336,25 @@ gcc version 4.1.2 20070925 (Red Hat 4.1.

 
   
-  The user-code must guard against concurrent method calls which may
-	 access any particular library object's state.  Typically, the
-	 application programmer may infer what object locks must be held
-	 based on the objects referenced in a method call.  Without getting
+
+  The user code must guard against concurrent function calls which
+ access any particular library object's state when one or more of
+ those accesses modifies the state. An object will be modified by
+ invoking a non-const member function on it or passing it as a
+ non-const argument to a library function. An object will not be
+ modified by invoking a const member function on it or passing it to
+ a function as a pointer- or reference-to-const.
+ Typically, the application
+ programmer may infer what object locks must be held based on the
+ objects referenced in a function call and whether the objects are
+ accessed as const or non-const.  Without getting
 	 into great detail, here is an example which requires user-level
 	 locks:
   
   
  library_class_a shared_object_a;
 
- thread_main () {
+ void thread_main () {
library_class_b *object_b = new library_class_b;
shared_object_a.add_b (object_b);   // must hold lock for shared_object_a
shared_object_a.mutate ();  // must hold lock for shared_object_a
@@ -1347,25 +1362,84 @@ gcc version 4.1.2 20070925 (Red Hat 4.1.
 
  // Multiple copies of thread_main() are started in independent threads.
   Under the assumption that object_a and object_b are never exposed to
-	 another thread, here is an example that should not require any
+	 another thread, here is an example that does not require any
 	 user-level locks:
   
   
- thread_main () {
+ void thread_main () {
library_class_a object_a;
library_class_b *object_b = new library_class_b;
object_a.add_b (object_b);
object_a.mutate ();
  } 
-  All library objects are safe to use in a multithreaded program as
-	 long as each thread carefully locks out access by any other
-	 thread while it uses any object visible to another thread, i.e.,
-	 treat library objects like any other shared resource.  In general,
-	 this requirement includes both read and write access to objects;
-	 unless otherwise documented as safe, do not assume that two threads
-	 may access a shared standard library object at the same time.
+
+  All library types are safe to use in a multithreaded program
+ if objects are not shared between threads or as
+	 long each thread carefully locks out access by any other
+	 thread while it modifies any object visible to another thread.
+	 Unless otherwise documented, the only exceptions to these rules
+ are atomic operations on the types in
+ 
+ and lock/unlock operations on the standard mutex types in
+ . These
+ atomic operations allow concurrent accesses to the same object
+ without introducing data races.
   
 
+  The following member functions of standard containers can be
+ considered to be const for the purposes of avoiding data races:
+ begin, end, rbegin, rend,
+ front, back, data,
+ find, lower_bound, upper_bound,
+ equal_range, at 
+ and, except in associative or unordered associative containers,
+ operator[]. In other words, although they are non-const
+ so that they can return mutable iterators, those member functions
+ will not modify the container.
+ Accessing an iterator might cause a non-modify
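

The locking rule in the documentation patch above can be tried with the C++11 thread and mutex types. This is a minimal sketch, not taken from the manual: shared_values, append_range, sum_private and run_writers are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

std::vector<int> shared_values;   // visible to every thread
std::mutex shared_values_mutex;   // user-level lock guarding shared_values

// Modifies the shared container, so each modification holds the lock.
void append_range (int first, int count)
{
  for (int i = 0; i < count; ++i)
    {
      std::lock_guard<std::mutex> lock (shared_values_mutex);
      shared_values.push_back (first + i);
    }
}

// Works only on an object no other thread can see, so no lock is needed.
int sum_private (int count)
{
  std::vector<int> local;
  for (int i = 0; i < count; ++i)
    local.push_back (i);
  int total = 0;
  for (int v : local)
    total += v;
  return total;
}

// Runs two writer threads and returns the final size of shared_values.
// Reading the size after both joins is safe: join() synchronizes with
// the completion of the joined thread.
std::size_t run_writers (int per_thread)
{
  shared_values.clear ();
  std::thread a (append_range, 0, per_thread);
  std::thread b (append_range, per_thread, per_thread);
  a.join ();
  b.join ();
  return shared_values.size ();
}
```

Without the lock_guard in append_range, the two push_back calls would be conflicting non-atomic modifications of the same object, which is a data race in the terms used above.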

Re: [CRIS] Hookize GO_IF_MODE_DEPENDENT_ADDRESS

2011-10-23 Thread Hans-Peter Nilsson
> Date: Mon, 24 Oct 2011 00:03:45 +0400
> From: Anatoly Sokolov 

For future reference, please add the diff "-p" option for
readability.  With subversion, you need to add the equivalent
of "diff-cmd = /home/hp/.scripts/svn-diff"
under [helpers] in your ~/.subversion/config and have a svn-diff
equivalent to:
---
#!/bin/bash
diff=/usr/bin/diff
args="-up -F ^(define"  # Additional -F for .md files

exec ${diff} ${args} "$@"
---

>   Regression tested on cris-axis-elf.
> 
>   OK to install?

Meh, lots of churn, but I suppose inevitable.

> * config/cris/cris.c (reg_ok_for_base_p, reg_ok_for_index_p,
> cris_constant_index_p, cris_base_p, cris_index_p,
> cris_base_or_autoincr_p, cris_bdap_index_p, cris_biap_index_p,
> cris_legitimate_address_p): New functions.
> (TARGET_LEGITIMATE_ADDRESS_P): Define.
> (cris_pic_symbol_type, cris_valid_pic_const): Change arguments type
> from rtx to const_rtx.
> (cris_print_operand_address, cris_address_cost,
> cris_side_effect_mode_ok):  Use
> cris_constant_index_p, cris_base_p, cris_base_or_autoincr_p,
> cris_biap_index_p and cris_bdap_index_p.
> * config/cris/cris.h (CONSTANT_INDEX_P, BASE_P, BASE_OR_AUTOINCR_P,
> BDAP_INDEX_P, BIAP_INDEX_P, GO_IF_LEGITIMATE_ADDRESS,
> REG_OK_FOR_BASE_P, REG_OK_FOR_INDEX_P): Remove.
> (EXTRA_CONSTRAINT_Q, EXTRA_CONSTRAINT_R, EXTRA_CONSTRAINT_T): Use
> cris_constant_index_p, cris_base_p, cris_base_or_autoincr_p,
> cris_biap_index_p and cris_bdap_index_p.
> * config/cris/cris.md (moversideqi movemsideqi peephole2): Use
> cris_base_p.
> * config/cris/cris-protos.h (cris_constant_index_p, cris_base_p,
> cris_base_or_autoincr_p, cris_bdap_index_p, cris_biap_index_p): New
> prototype.
> (cris_pic_symbol_type, cris_valid_pic_const): Update prototype.
> 
> Index: gcc/config/cris/cris.c

> @@ -2030,12 +2165,12 @@

(With "-p", I'd see cris_side_effect_mode_ok here...)

Hm, non-strict checking...  Please change the strictness "false"
to "reload_in_progress || reload_completed" in that function.
It might not actually make a difference now, but since that
function is called in both strictness contexts, it's just
better.

> Index: gcc/config/cris/cris.h
> ===
> --- gcc/config/cris/cris.h  (revision 180345)
> +++ gcc/config/cris/cris.h  (working copy)
> @@ -676,17 +676,18 @@
>/* Just an indirect register (happens to also be \
>   "all" slottable memory addressing modes not   \
>   covered by other constraints, i.e. '>').  */  \
> -  MEM_P (X) && BASE_P (XEXP (X, 0))\
> +  MEM_P (X)\
> +  && cris_base_p (XEXP (X, 0), reload_in_progress | reload_completed) \

Everywhere, use "reload_in_progress || reload_completed", not
"reload_in_progress | reload_completed".
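
The difference between the two spellings can be seen in isolation. This is a sketch with hypothetical helper names; the parameters stand in for GCC's int-valued reload_in_progress and reload_completed. When the callee takes a bool parameter, both forms convert to the same value, so the request is mainly about stating the logical intent.

```cpp
#include <cassert>

// Bitwise OR: both operands are always evaluated and the result is an int
// that combines their bits.
int strict_bitwise (int in_progress, int completed)
{
  return in_progress | completed;
}

// Logical OR: the result is a bool, and the right operand is not evaluated
// when the left one is nonzero.
bool strict_logical (int in_progress, int completed)
{
  return in_progress || completed;
}
```

The two only disagree observably when the result is used as an integer, for example compared against 1.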

> Index: gcc/config/cris/cris.md
> ===
> --- gcc/config/cris/cris.md (revision 180345)
> +++ gcc/config/cris/cris.md (working copy)
> @@ -4680,7 +4680,7 @@
> (match_operator 4 "cris_mem_op" [(match_dup 0)]))]
>"GET_MODE_SIZE (GET_MODE (operands[4])) <= UNITS_PER_WORD
> && REGNO (operands[3]) != REGNO (operands[0])
> -   && (BASE_P (operands[1]) || BASE_P (operands[2]))
> +   && (cris_base_p (operands[1], false) || cris_base_p (operands[2], false))
> && !CRIS_CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'J')
> && !CRIS_CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'N')
> && (INTVAL (operands[2]) >= -128 && INTVAL (operands[2]) < 128)
> @@ -4716,7 +4716,7 @@
> (match_operand 4 "register_operand" ""))]
>"GET_MODE_SIZE (GET_MODE (operands[4])) <= UNITS_PER_WORD
> && REGNO (operands[4]) != REGNO (operands[0])
> -   && (BASE_P (operands[1]) || BASE_P (operands[2]))
> +   && (cris_base_p (operands[1], false) || cris_base_p (operands[2], false))
> && !CRIS_CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'J')
> && !CRIS_CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'N')
> && (INTVAL (operands[2]) >= -128 && INTVAL (operands[2]) < 128)

Why false?  Peephole2 is always post-reload, so the strict
argument should be true.

With the above requests fixed and re-tested, ok.

brgds, H-P


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

Hi,

>> Personally, I would be in favor of taking -Wc++0x-compat out of -Wall.
>
> Patch pre-approved.

Thanks.

> It makes sense though that -Wextra implies -Wc++0x-compat.

Indeed, it would. However, unfortunately, we are using -W to bootstrap 
(it just failed on me). Thus I'm bootstrapping and testing the below, 
which just takes -Wc++0x-compat out from -Wall without adding it to -Wextra.


I'll wait anyway until tomorrow in case of further comments.

Thanks again,
Paolo.

//


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

... and the patch ;)

Paolo.


/c-family
2011-10-23  Paolo Carlini  

PR c++/50810
* c-opts.c (c_common_handle_option): Do not enable -Wc++0x-compat
as part of -Wall; handle -Wc++0x-compat.
(c_common_post_options): -std=c++0x enables -Wnarrowing.
* c.opt ([Wnarrowing]): Update.

/cp
2011-10-23  Paolo Carlini  

PR c++/50810
* typeck2.c (check_narrowing): Adjust OPT_Wnarrowing diagnostics.
(digest_init_r): Call check_narrowing irrespective of the C++ dialect.
* decl.c (check_initializer): Likewise.
* semantics.c (finish_compound_literal): Likewise.

/testsuite
2011-10-23  Paolo Carlini  

PR c++/50810
* g++.dg/cpp0x/warn_cxx0x2.C: New.
* g++.dg/cpp0x/warn_cxx0x3.C: Likewise.

2011-10-23  Paolo Carlini  

PR c++/50810
* doc/invoke.texi ([-Wall], [-Wnarrowing], [-Wc++0x-compat]): Update.
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 180348)
+++ doc/invoke.texi (working copy)
@@ -2365,17 +2365,18 @@ an instance of a derived class through a pointer t
 base class does not have a virtual destructor.  This warning is enabled
 by @option{-Wall}.
 
-@item -Wno-narrowing @r{(C++ and Objective-C++ only)}
+@item -Wnarrowing @r{(C++ and Objective-C++ only)}
 @opindex Wnarrowing
 @opindex Wno-narrowing
-With -std=c++0x, suppress the diagnostic required by the standard for
-narrowing conversions within @samp{@{ @}}, e.g.
+Warn when a narrowing conversion occurs within @samp{@{ @}}, e.g.
 
 @smallexample
 int i = @{ 2.2 @}; // error: narrowing from double to int
 @end smallexample
 
-This flag can be useful for compiling valid C++98 code in C++0x mode
+This flag is included in @option{-Wc++0x-compat}.
+With -std=c++0x, @option{-Wno-narrowing} suppresses the diagnostic
+required by the standard.
 
 @item -Wnoexcept @r{(C++ and Objective-C++ only)}
 @opindex Wnoexcept
@@ -2993,7 +2994,6 @@ Options} and @ref{Objective-C and Objective-C++ Di
 
 @gccoptlist{-Waddress   @gol
 -Warray-bounds @r{(only with} @option{-O2}@r{)}  @gol
--Wc++0x-compat  @gol
 -Wchar-subscripts  @gol
 -Wenum-compare @r{(in C/Objc; this is on by default in C++)} @gol
 -Wimplicit-int @r{(C and Objective-C only)} @gol
@@ -4066,7 +4066,7 @@ ISO C and ISO C++, e.g.@: request for implicit con
 @item -Wc++0x-compat @r{(C++ and Objective-C++ only)}
 Warn about C++ constructs whose meaning differs between ISO C++ 1998 and
 ISO C++ 200x, e.g., identifiers in ISO C++ 1998 that will become keywords
-in ISO C++ 200x.  This warning is enabled by @option{-Wall}.
+in ISO C++ 200x.  This warning turns on @option{-Wnarrowing}.
 
 @item -Wcast-qual
 @opindex Wcast-qual
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 180348)
+++ c-family/c.opt  (working copy)
@@ -490,8 +490,8 @@ C ObjC C++ ObjC++ Warning
 Warn about use of multi-character character constants
 
 Wnarrowing
-C ObjC C++ ObjC++ Warning Var(warn_narrowing) Init(1)
--Wno-narrowing   In C++0x mode, ignore ill-formed narrowing conversions within 
{ }
+C ObjC C++ ObjC++ Warning Var(warn_narrowing) Init(-1) Warning
+Warn about ill-formed narrowing conversions within { }
 
 Wnested-externs
 C ObjC Var(warn_nested_externs) Warning
Index: c-family/c-opts.c
===
--- c-family/c-opts.c   (revision 180348)
+++ c-family/c-opts.c   (working copy)
@@ -404,7 +404,6 @@ c_common_handle_option (size_t scode, const char *
  /* C++-specific warnings.  */
   warn_sign_compare = value;
  warn_reorder = value;
-  warn_cxx0x_compat = value;
   warn_delnonvdtor = value;
}
 
@@ -436,6 +435,10 @@ c_common_handle_option (size_t scode, const char *
   cpp_opts->warn_cxx_operator_names = value;
   break;
 
+case OPT_Wc__0x_compat:
+  warn_narrowing = value;
+  break;
+
 case OPT_Wdeprecated:
   cpp_opts->cpp_warn_deprecated = value;
   break;
@@ -997,11 +1000,18 @@ c_common_post_options (const char **pfilename)
   if (warn_implicit_function_declaration == -1)
 warn_implicit_function_declaration = flag_isoc99;
 
-  /* If we're allowing C++0x constructs, don't warn about C++0x
- compatibility problems.  */
   if (cxx_dialect == cxx0x)
-warn_cxx0x_compat = 0;
+{
+  /* If we're allowing C++0x constructs, don't warn about C++98
+identifiers which are keywords in C++0x.  */
+  warn_cxx0x_compat = 0;
 
+  if (warn_narrowing == -1)
+   warn_narrowing = 1;
+}
+  else if (warn_narrowing == -1)
+warn_narrowing = 0;
+
   if (flag_preprocess_only)
 {
   /* Open the output now.  We must do so even if flag_no_output is
Index: testsuite/g++.dg/cpp0x/warn_cxx0x2.C
===
--- testsuite/g++.dg/cpp0x/warn_cxx0x2.

Re: [wwwdocs] gcc-4.6/porting_to.html

2011-10-23 Thread Gerald Pfeifer
On Mon, 10 Oct 2011, Gerald Pfeifer wrote:
> I realized this one hasn't made it in, but is really nice.  I made a 
> number of minor edits (typos, markup, simplifying headings,... among 
> others).  What do you think -- should we include this?

Checking mailing list archives I realized that Jakub had provided
feedback ( http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00987.html )
that the strict overflow warnings had been fixed.

Hence I went ahead and committed the removal below.

Gerald

Index: porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/porting_to.html,v
retrieving revision 1.3
diff -u -r1.3 porting_to.html
--- porting_to.html 12 Oct 2011 16:16:54 -  1.3
+++ porting_to.html 24 Oct 2011 00:52:53 -
@@ -65,24 +65,6 @@
 -Wno-unused-but-set-variable or
 -Wno-unused-but-set-parameter.
 
-Strict overflow warnings
-
-Using the -Wstrict-overflow flag with
--Werror and optmization flags above -O2
-may result in compile errors when using glibc optimizations
-for strcmp.
-
-For example,
-
-#include 
-void do_rm_rf (const char *p) { if (strcmp (p, "/") == 0) return; }
-
-Results in the following diagnostic:
-
-error: assuming signed overflow does not occur when changing X +- C1 cmp C2 to 
X cmp C1 +- C2 [-Werror=strict-overflow]
-
-
-To work around this, use -D__NO_STRING_INLINES.
 
 C++ language issues
 
@@ -139,11 +121,6 @@
 to fix build failures with new GCC versions
 
 
-
-Jim Meyering,
- http://lists.fedoraproject.org/pipermail/devel/2011-March/149355.html";>gcc-4.6.0-0.12.fc15.x86_64
 breaks strcmp?
-
-
 
 
   


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Gabriel Dos Reis
On Sun, Oct 23, 2011 at 7:56 PM, Paolo Carlini  wrote:
> ... and the patch ;)

I am a bit puzzled by this:

+This flag is included in @option{-Wc++0x-compat}.
+With -std=c++0x, @option{-Wno-narrowing} suppresses the diagnostic
+required by the standard.

and this:
-  /* If we're allowing C++0x constructs, don't warn about C++0x
- compatibility problems.  */
   if (cxx_dialect == cxx0x)
-warn_cxx0x_compat = 0;
+{
+  /* If we're allowing C++0x constructs, don't warn about C++98
+identifiers which are keywords in C++0x.  */
+  warn_cxx0x_compat = 0;

+  if (warn_narrowing == -1)
+   warn_narrowing = 1;
+}




We do not use -W or -Wno- to suppress *required* diagnostics.  So, when
-std=c++0x is in effect, -Wno-narrowing should not have any effect.  However,
with -Wc++0x-compat, it could make sense to have -Wno-narrowing suppress the
diagnostic.

The point is this:  we do not use -W flags to change a standard's semantics.
But we use -W to make suggestions (e.g. warnings), so a suggestion should be
suppressed only in the context of another suggestion (-Wc++0x-compat).


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

Hi,

On 10/24/2011 03:30 AM, Gabriel Dos Reis wrote:
> We do not use -W or -Wno- to suppress *required* diagnostics. So,
> when -std=c++0x, -Wno-narrowing should not have any effect.

Personally, I have no problem with this, but note, I'm not inventing 
anything new here, the behavior you are discussing *pre*-dates my patch 
and I feel a little nervous about changing it. If you think you can 
approve this part of the patch, I'll change it as you want and resend.


Paolo.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Gabriel Dos Reis
On Sun, Oct 23, 2011 at 8:48 PM, Paolo Carlini  wrote:
> Hi,
>
> On 10/24/2011 03:30 AM, Gabriel Dos Reis wrote:
>>
>> We do not use -W or -Wno- to suppress *required* diagnostics. So, when
>> -std=c++0x, -Wno-narrowing should not have any effect.
>
> Personally, I have no problem with this, but note, I'm not inventing
> anything new here, the behavior you are discussing *pre*-dates my patch and
> I feel a little nervous about changing it. If you think you can approve this
> part of the patch, I'll change it as you want and resend.

Let me quote again the part of the patch under discussion:

-  /* If we're allowing C++0x constructs, don't warn about C++0x
- compatibility problems.  */
   if (cxx_dialect == cxx0x)
-warn_cxx0x_compat = 0;
+{
+  /* If we're allowing C++0x constructs, don't warn about C++98
+identifiers which are keywords in C++0x.  */
+  warn_cxx0x_compat = 0;

+  if (warn_narrowing == -1)
+   warn_narrowing = 1;
+}
+  else if (warn_narrowing == -1)
+warn_narrowing = 0;
+

Before the patch, -std=c++0x effectively put off -Wc++0x-compat because we
are compiling c++98/c++03 code, so we can only *warn* about possible
compatibility conflict with C++11.   However, the narrowing diagnostic required
by C++11 is NOT a warning.  It is a diagnostic.  The way we alter a standard
mandate is through some -fflag, e.g. -fpermissive.

What the above patch fragment is doing is to turn on a *warning*.
When -std=c++0x
is in effect, narrowing is no longer a warning.  It is an error by default.


Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Paolo Carlini

On 10/24/2011 04:10 AM, Gabriel Dos Reis wrote:
> Before the patch, -std=c++0x effectively put off -Wc++0x-compat
> because we are compiling c++98/c++03 code, so we can only *warn* about
> possible compatibility conflict with C++11. However, the narrowing
> diagnostic required by C++11 is NOT a warning. It is a diagnostic. The
> way we alter a standard mandate is through some -fflag, e.g.
> -fpermissive. What the above patch fragment is doing is to turn on a
> *warning*. When -std=c++0x is in effect, narrowing is no longer a
> warning. It is an error by default.

I'm missing your point, I'm sorry: I maintain that *before* and after 
the patch -Wno-narrowing in C++0x mode was able to suppress the 
narrowing warnings. I'm 100% sure, I double checked for you one second 
ago. Are we on the same page on this? If we are, and you think gcc 
should do something new, I have no problem changing my patch to, eg:


  if (cxx_dialect == cxx0x)
    {
      /* If we're allowing C++0x constructs, don't warn about C++98
         identifiers which are keywords in C++0x.  */
      warn_cxx0x_compat = 0;
      warn_narrowing = 1;
    }
  else if (warn_narrowing == -1)
    warn_narrowing = 0;

Paolo.



[wwwdocs] Refer to GNU/Linux in the GCC 3.4 release notes

2011-10-23 Thread Gerald Pfeifer
Original patch against the generated NEWS file by Karl Berry.

Installed.

Gerald

Index: gcc-3.4/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.4/changes.html,v
retrieving revision 1.157
diff -u -r1.157 changes.html
--- gcc-3.4/changes.html7 Nov 2010 13:30:36 -   1.157
+++ gcc-3.4/changes.html24 Oct 2011 02:23:46 -
@@ -867,7 +867,7 @@
 M32R
   
 Support for the M32R/2 processor has been added by Renesas.
-Support for an M32R Linux target and PIC code generation has
+Support for an M32R GNU/Linux target and PIC code generation has
 been added by Renesas.
   
 



[wwwdocs] Refer to GNU/Linux in the GCC 3.3 release notes

2011-10-23 Thread Gerald Pfeifer
Original patch against the generated NEWS file by Karl Berry.

Installed (and, yes, I updated the underlying PR 11137, too).

Gerald

Index: gcc-3.3/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.3/changes.html,v
retrieving revision 1.56
diff -u -r1.56 changes.html
--- gcc-3.3/changes.html11 Jul 2010 20:37:31 -  1.56
+++ gcc-3.3/changes.html24 Oct 2011 02:34:00 -
@@ -218,7 +218,7 @@
The 32-bit port now supports weak symbols under HP-UX 11.
The handling of initializers and finalizers has been improved
under HP-UX 11.  The 64-bit port no longer uses collect2.
-   Dwarf2 EH support has been added to the 32-bit linux port.
+   Dwarf2 EH support has been added to the 32-bit GNU/Linux port.
ABI fixes to correct the passing of small structures by value.

 The SPARC, HP-PA, SH4, and x86/pentium ports have been converted to
@@ -793,7 +793,7 @@
 http://gcc.gnu.org/PR11062";>11062 (libstdc++) avoid 
__attribute__ ((unused)); say  "__unused__" instead
 http://gcc.gnu.org/PR11095";>11095 C++ iostream 
manipulator causes segfault when called with negative argument
 http://gcc.gnu.org/PR11098";>11098 g++ doesn't emit complete 
debugging information for local variables in destructors
-http://gcc.gnu.org/PR11137";>11137 Linux shared library 
constructors not called unless there's one global object
+http://gcc.gnu.org/PR11137";>11137 GNU/Linux shared library 
constructors not called unless there's one global object
 http://gcc.gnu.org/PR11154";>11154 spurious ambiguity report 
for template class specialization
 http://gcc.gnu.org/PR11329";>11329 Compiler cannot find user 
defined implicit typecast
 http://gcc.gnu.org/PR11332";>11332 Spurious error with casts 
in ?: expression


[PATCH] Add support for sparc VIS3 fp<-->int moves.

2011-10-23 Thread David Miller

The non-trivial aspects (and what took the most time for me) of these
changes are:

1) Getting the register move costs and class preferencing right such
   that the VIS3 moves do get effectively used for incoming
   float/vector argument passing on 32-bit, yet IRA and reload don't
   go nuts allocating integer registers to float/vector mode values
   and vice versa.

   Non-optimized compiles are particularly sensitive to this because
   there's simply a lot of moves that don't get cleaned up.  So we
   might have 6 moves, 3 on each side of a single real calculation, so
   in the IRA costs the register classes of the moves dominate.

2) Making sure we don't merge a VIS3 move into a restore instruction.

3) Dealing with the restriction that we can't operate on 32-bit pieces
   of values contained in the upper 32 v9 float registers.

   We deal with this using two elements.

   First, we indicate a FP_REGS or GENERAL_OR_FP_REGS preferred
   reload class when we see reload try to load an integer register
   into class EXTRA_FP_REGS or GENERAL_OR_EXTRA_FP_REGS.

   Second, we teach reload that if it tries to move between float and
   integer regs, and some register class involving EXTRA_FP_REGS is
   involved, that an intermediate FP_REGS class register will possibly
   be needed to successfully complete the reload.

The rest is mostly mechanical work of splitting the existing v9/64-bit
move patterns into non-vis3 and vis3 variants.

Because of how float arguments are passed on 32-bit, these instructions
help a lot.  This is evident in even the simplest examples.  This C code:

float fnegs (float a) { return -a; }
double fnegd (double a) { return -a; }

would generate:

fnegs:
add %sp, -104, %sp
st  %o0, [%sp+100]
ld  [%sp+100], %f8
sub %sp, -104, %sp
jmp %o7+8
 fnegs  %f8, %f0
fnegd:
add %sp, -104, %sp
std %o0, [%sp+96]
ldd [%sp+96], %f8
sub %sp, -104, %sp
jmp %o7+8
 fnegd  %f8, %f0

but with VIS3 moves we get:

fnegs:
movwtos %o0, %f8
jmp %o7+8
 fnegs  %f8, %f0
fnegd:
movwtos %o0, %f8
movwtos %o1, %f9
jmp %o7+8
 fnegd  %f8, %f0

And with our good friend pdist.c we get the following code for
function 'foo' with VIS3 moves:

foo:
fzero   %f8
movwtos %o0, %f10
movwtos %o1, %f11
movwtos %o2, %f12
movwtos %o3, %f13
pdist   %f10, %f12, %f8
movstouw        %f8, %o0
jmp %o7+8
 movstouw   %f9, %o1

Another good example of significantly improved code generation
can be found when looking at the output of libgcc2.c:_mulsc3().

Of course, sometimes we generate spurious secondary reloads because
the use of the EXTRA_FP_REGS (and GENERAL_OR_EXTRA_FP_REGS) register
class doesn't necessarily result in using one of the upper 32 v9 float
registers.  Maybe if we used segregated register classes for the lower
and upper float regs we could attack this issue effectively.

These VIS3 patterns can also in the future be used for more crafty
constant and non-constant vec_init sequences.

This was regstrapped both with the compiler defaulting to vis3, and
without.

Committed to trunk.

gcc/

* config/sparc/sparc.h (SECONDARY_MEMORY_NEEDED): We can move
between float and non-float regs when VIS3.
* config/sparc/sparc.c (eligible_for_restore_insn): We can't
use a restore when the source is a float register.
(sparc_split_regreg_legitimate): When VIS3 allow moves between
float and integer regs.
(sparc_register_move_cost): Adjust to account for VIS3 moves.
(sparc_preferred_reload_class): On 32-bit with VIS3 when moving an
integer reg to a class containing EXTRA_FP_REGS, constrain to
FP_REGS.
(sparc_secondary_reload): On 32-bit with VIS3 when moving between
float and integer regs we sometimes need a FP_REGS class
intermediate move to satisfy the reload.  When this happens
specify an extra cost of 2.
(*movsi_insn): Rename to have "_novis3" suffix and add !VIS3
guard.
(*movdi_insn_sp32_v9): Likewise.
(*movdi_insn_sp64): Likewise.
(*movsf_insn): Likewise.
(*movdf_insn_sp32_v9): Likewise.
(*movdf_insn_sp64): Likewise.
(*zero_extendsidi2_insn_sp64): Likewise.
(*sign_extendsidi2_insn): Likewise.
(*movsi_insn_vis3): New insn.
(*movdi_insn_sp32_v9_vis3): New insn.
(*movdi_insn_sp64_vis3): New insn.
(*movsf_insn_vis3): New insn.
(*movdf_insn_sp32_v9_vis3): New insn.
(*movdf_insn_sp64_vis3): New insn.
(*zero_extendsidi2_insn_sp64_vis3): New insn.
(*sign_extendsidi2_insn_vis3): New insn.
(TFmode reg/reg split): Make sure both REG operands are float.
(*mov_insn): Add "_novis3" suffix and !VIS3 guard. Remove
easy co

Re: Bootstrap failure in tree-object-size.c due to -Wnarrowing (was: [C++ Patch] PR 50810)

2011-10-23 Thread Gabriel Dos Reis
On Sun, Oct 23, 2011 at 9:16 PM, Paolo Carlini  wrote:
> On 10/24/2011 04:10 AM, Gabriel Dos Reis wrote:
>>
>> Before the patch, -std=c++0x effectively put off -Wc++0x-compat because we
>> are compiling c++98/c++03 code, so we can only *warn* about possible
>> compatibility conflict with C++11. However, the narrowing diagnostic
>> required by C++11 is NOT a warning. It is a diagnostic. The way we alter a
>> standard mandate is through some -fflag, e.g. -fpermissive. What the above
>> patch fragment is doing is to turn on a *warning*. When -std=c++0x is in
>> effect, narrowing is no longer a warning. It is an error by default.
>
> I'm missing your point, I'm sorry: I maintain that *before* and after the
> patch -Wno-narrowing in C++0x mode was able to suppress the narrowing
> warnings.

and I am saying that is a bug.

> I'm 100% sure, I double checked for you one second ago. Are we on
> the same page on this? If we are, and you think gcc should do something new,

It is new only in the sense that a bug will be fixed.  Otherwise no,
it is not new.

> I have no problem changing my patch to, eg:
>
>  if (cxx_dialect == cxx0x)
>    {
>      /* If we're allowing C++0x constructs, don't warn about C++98
>     identifiers which are keywords in C++0x.  */
>      warn_cxx0x_compat = 0;
>      warn_narrowing = 1;
>    }

Yes, -Wno-narrowing should not suppress the narrowing diagnostic in C++11 mode.

>  else if (warn_narrowing == -1)
>    warn_narrowing = 0;
>

OK.


Go patch committed: Rename is_open_array_type to is_slice_type

2011-10-23 Thread Ian Lance Taylor
A long time ago, before the public release of Go, the Go type known as a
"slice" was known as an "open array".  I used the name
is_open_array_type in a Go frontend function.  This mechanical patch
renames that function to be is_slice_type.  Bootstrapped on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r ad067f060093 go/expressions.cc
--- a/go/expressions.cc	Sun Oct 23 10:44:39 2011 -0700
+++ b/go/expressions.cc	Sun Oct 23 21:03:47 2011 -0700
@@ -234,8 +234,7 @@
   else if (rhs_type->interface_type() != NULL)
 return Expression::convert_interface_to_type(context, lhs_type, rhs_type,
 		 rhs_tree, location);
-  else if (lhs_type->is_open_array_type()
-	   && rhs_type->is_nil_type())
+  else if (lhs_type->is_slice_type() && rhs_type->is_nil_type())
 {
   // Assigning nil to an open array.
   go_assert(TREE_CODE(lhs_type_tree) == RECORD_TYPE);
@@ -3315,7 +3314,7 @@
   mpfr_clear(imag);
 }
 
-  if (type->is_open_array_type() && type->named_type() == NULL)
+  if (type->is_slice_type() && type->named_type() == NULL)
 {
   Type* element_type = type->array_type()->element_type()->forwarded();
   bool is_byte = element_type == Type::lookup_integer_type("uint8");
@@ -3663,7 +3662,7 @@
    len);
 	}
 }
-  else if (type->is_open_array_type() && expr_type->is_string_type())
+  else if (type->is_slice_type() && expr_type->is_string_type())
 {
   Type* e = type->array_type()->element_type()->forwarded();
   go_assert(e->integer_type() != NULL);
@@ -3831,9 +3830,9 @@
   source_location loc = this->location();
 
   bool use_view_convert = false;
-  if (t->is_open_array_type())
-{
-  go_assert(et->is_open_array_type());
+  if (t->is_slice_type())
+{
+  go_assert(et->is_slice_type());
   use_view_convert = true;
 }
   else if (t->map_type() != NULL)
@@ -7302,7 +7301,7 @@
   if (args == NULL || args->empty())
 	return this;
   Type* slice_type = args->front()->type();
-  if (!slice_type->is_open_array_type())
+  if (!slice_type->is_slice_type())
 	{
 	  error_at(args->front()->location(), "argument 1 must be a slice");
 	  this->set_is_error();
@@ -7342,7 +7341,7 @@
   bool is_slice = false;
   bool is_map = false;
   bool is_chan = false;
-  if (type->is_open_array_type())
+  if (type->is_slice_type())
 is_slice = true;
   else if (type->map_type() != NULL)
 is_map = true;
@@ -7554,7 +7553,7 @@
 
 	if (arg_type->points_to() != NULL
 	&& arg_type->points_to()->array_type() != NULL
-	&& !arg_type->points_to()->is_open_array_type())
+	&& !arg_type->points_to()->is_slice_type())
 	  arg_type = arg_type->points_to();
 
 	if (arg_type->array_type() != NULL
@@ -7633,7 +7632,7 @@
 
   if (arg_type->points_to() != NULL
 	  && arg_type->points_to()->array_type() != NULL
-	  && !arg_type->points_to()->is_open_array_type())
+	  && !arg_type->points_to()->is_slice_type())
 	arg_type = arg_type->points_to();
 
   if (arg_type->array_type() != NULL
@@ -8080,7 +8079,7 @@
 	Type* arg_type = this->one_arg()->type();
 	if (arg_type->points_to() != NULL
 		&& arg_type->points_to()->array_type() != NULL
-		&& !arg_type->points_to()->is_open_array_type())
+		&& !arg_type->points_to()->is_slice_type())
 	  arg_type = arg_type->points_to();
 	if (this->code_ == BUILTIN_CAP)
 	  {
@@ -8135,7 +8134,7 @@
 		|| type->channel_type() != NULL
 		|| type->map_type() != NULL
 		|| type->function_type() != NULL
-		|| type->is_open_array_type())
+		|| type->is_slice_type())
 		  ;
 		else
 		  this->report_error(_("unsupported argument type to "
@@ -8192,7 +8191,7 @@
 	  break;
 
 	Type* e1;
-	if (arg1_type->is_open_array_type())
+	if (arg1_type->is_slice_type())
 	  e1 = arg1_type->array_type()->element_type();
 	else
 	  {
@@ -8201,7 +8200,7 @@
 	  }
 
 	Type* e2;
-	if (arg2_type->is_open_array_type())
+	if (arg2_type->is_slice_type())
 	  e2 = arg2_type->array_type()->element_type();
 	else if (arg2_type->is_string_type())
 	  e2 = Type::lookup_integer_type("uint8");
@@ -8321,7 +8320,7 @@
 	  {
 	arg_type = arg_type->points_to();
 	go_assert(arg_type->array_type() != NULL
-		   && !arg_type->is_open_array_type());
+		   && !arg_type->is_slice_type());
 	go_assert(POINTER_TYPE_P(TREE_TYPE(arg_tree)));
 	arg_tree = build_fold_indirect_ref(arg_tree);
 	  }
@@ -8515,7 +8514,7 @@
 			fnname = "__go_print_interface";
 		  }
 		  }
-		else if (type->is_open_array_type())
+		else if (type->is_slice_type())
 		  {
 		static tree print_slice_fndecl;
 		pfndecl = &print_slice_fndecl;
@@ -8694,7 +8693,7 @@
 	Type* arg2_type = arg2->type();
 	tree arg2_val;
 	tree arg2_len;
-	if (arg2_type->is_open_array_type())
+	if (arg2_type->is_slice_type())
 	  {
 	at = arg2_type->array_type();
 	arg2_tree = save_expr(arg2_tree);
@@ -9078,7 +9077,7 @@
   source_location loc = this->location();
 
   go_assert(param_count > 0);
-  go_assert(varargs_type->is_op

Re: [PATCH] Fix mv8plus, allow targetting Linux or Solaris from other sparc host.

2011-10-23 Thread David Miller
From: David Miller 
Date: Sun, 23 Oct 2011 16:32:36 -0400 (EDT)

> From: Eric Botcazou 
> Date: Sun, 23 Oct 2011 11:58:57 +0200
> 
>>> I'll try to brainstorm on this, thanks for letting me know about the
>>> Solaris target problem.
>> 
>> Let's fix the regression quickly though.
> 
> I'll fix it by the end of tonight.

Ok, I committed your suggestion to trunk for now.

--
[PATCH] Fix sol2 sparc -mv8 regression.

* config/sparc/sparc.c (sparc_option_override): Remove -mv8plus
cpu adjustment.
* config/sparc/linux64.h (CC1_SPEC): When defaulting to 64-bit,
append -mcpu=v9 when -mv8plus is given.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180362 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |5 +
 gcc/config/sparc/linux64.h |2 ++
 gcc/config/sparc/sparc.c   |4 
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1842402..54e1a4f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2011-10-23  David S. Miller  
 
+   * config/sparc/sparc.c (sparc_option_override): Remove -mv8plus
+   cpu adjustment.
+   * config/sparc/linux64.h (CC1_SPEC): When defaulting to 64-bit,
+   append -mcpu=v9 when -mv8plus is given.
+
* config/sparc/sparc.h (SECONDARY_MEMORY_NEEDED): We can move
between float and non-float regs when VIS3.
* config/sparc/sparc.c (eligible_for_restore_insn): We can't
diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index a51a2f0..7604fa0 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -166,6 +166,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 %{m32:%{m64:%emay not use both -m32 and -m64}} \
 %{m32:-mptr32 -mno-stack-bias %{!mlong-double-128:-mlong-double-64} \
   %{!mcpu*:-mcpu=cypress}} \
+%{mv8plus:-mptr32 -mno-stack-bias %{!mlong-double-128:-mlong-double-64} \
+  %{!mcpu*:-mcpu=v9}} \
 %{!m32:%{!mcpu*:-mcpu=ultrasparc}} \
 %{!mno-vis:%{!m32:%{!mcpu=v9:-mvis}}} \
 "
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 79bb821..29d2922 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -1029,10 +1029,6 @@ sparc_option_override (void)
   sparc_cpu_and_features = def->processor;
 }
 
-  if ((target_flags & MASK_V8PLUS)
-  && sparc_cpu_and_features < PROCESSOR_V9)
-sparc_cpu_and_features = PROCESSOR_V9;
-
   if (!global_options_set.x_sparc_cpu)
 sparc_cpu = sparc_cpu_and_features;
 
-- 
1.7.6.401.g6a319



Go patch committed: Some syscall fixes

2011-10-23 Thread Ian Lance Taylor
This patch makes a couple of fixes to the syscall rewrite I committed
earlier today.  There was a typo in the umask library call.  The gettid
call is actually GNU/Linux-specific.  In any case, gettid and tgkill are
apparently not in libc, and are only available by calling syscall
directly.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
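
For readers wondering what "calling syscall directly" looks like, here is a minimal sketch. It assumes a GNU/Linux host and the present-day standard Go syscall package (syscall.Syscall and syscall.SYS_GETTID), not the libgo interface touched by this patch; the helper name rawGettid is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// rawGettid is a hypothetical helper: it issues the gettid system call
// by number instead of going through a libc wrapper.
func rawGettid() (int, error) {
	tid, _, errno := syscall.Syscall(syscall.SYS_GETTID, 0, 0, 0)
	if errno != 0 {
		return 0, errno
	}
	return int(tid), nil
}

func main() {
	// Pin the goroutine to its OS thread so the reported thread id
	// stays meaningful for the rest of main.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	tid, err := rawGettid()
	if err != nil {
		fmt.Println("gettid failed:", err)
		return
	}
	fmt.Println("thread id:", tid, "process id:", syscall.Getpid())
}
```

The same pattern would apply to tgkill, passing the thread-group id, the thread id and the signal number as the three syscall arguments.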

Ian

diff -r a856d4c9f3bf libgo/go/syscall/libcall_linux.go
--- a/libgo/go/syscall/libcall_linux.go	Sun Oct 23 21:05:06 2011 -0700
+++ b/libgo/go/syscall/libcall_linux.go	Sun Oct 23 21:54:06 2011 -0700
@@ -196,6 +196,10 @@
 // //sys	Fstatfs(fd int, buf *Statfs_t) (errno int)
 // //fstatfs(fd int, buf *Statfs_t) int
 
+// FIXME: Only available as a syscall.
+// //sysnb	Gettid() (tid int)
+// //gettid() Pid_t
+
 //sys	Ioperm(from int, num int, on int) (errno int)
 //ioperm(from _C_long, num _C_long, on int) int
 
@@ -298,8 +302,9 @@
 //sys	Tee(rfd int, wfd int, len int, flags int) (n int64, errno int)
 //tee(rfd int, wfd int, len Size_t, flags uint) Ssize_t
 
-//sysnb	Tgkill(tgid int, tid int, sig int) (errno int)
-//tgkill(tgid int, tid int, sig int) int
+// FIXME: Only available as a syscall.
+// //sysnb	Tgkill(tgid int, tid int, sig int) (errno int)
+// //tgkill(tgid int, tid int, sig int) int
 
 //sys	unlinkat(dirfd int, path string, flags int) (errno int)
 //unlinkat(dirfd int, path *byte, flags int) int
diff -r a856d4c9f3bf libgo/go/syscall/libcall_posix.go
--- a/libgo/go/syscall/libcall_posix.go	Sun Oct 23 21:05:06 2011 -0700
+++ b/libgo/go/syscall/libcall_posix.go	Sun Oct 23 21:54:06 2011 -0700
@@ -236,9 +236,6 @@
 //sysnb	Getrusage(who int, rusage *Rusage) (errno int)
 //getrusage(who int, rusage *Rusage) int
 
-//sysnb	Gettid() (tid int)
-//gettid() Pid_t
-
 //sysnb	gettimeofday(tv *Timeval, tz *byte) (errno int)
 //gettimeofday(tv *Timeval, tz *byte) int
 func Gettimeofday(tv *Timeval) (errno int) {
@@ -334,7 +331,7 @@
 // //times(tms *Tms) _clock_t
 
 //sysnb	Umask(mask int) (oldmask int)
-//umark(mask Mode_t) Mode_t
+//umask(mask Mode_t) Mode_t
 
 //sys	Unlink(path string) (errno int)
 //unlink(path *byte) int


Go patch committed: Implement append([]byte, string...)

2011-10-23 Thread Ian Lance Taylor
The Go language was extended to permit calling the builtin function
append with the first argument having type []byte and the second
argument having type string, followed by an ellipsis.  This appends the
string to the slice.  This patch implements this new functionality.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.
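
As an illustration of the language feature described above (not of the frontend change itself), the following sketch shows the special case in user code. The helper name appendString is hypothetical.

```go
package main

import "fmt"

// appendString is a hypothetical helper that appends the bytes of s
// to buf using the append([]byte, string...) special case.
func appendString(buf []byte, s string) []byte {
	return append(buf, s...)
}

func main() {
	buf := []byte("hello")
	buf = appendString(buf, ", world")
	fmt.Println(string(buf)) // prints "hello, world"

	// The long form converts the string to a []byte first; the special
	// case makes that explicit conversion unnecessary.
	long := append([]byte("hello"), []byte(", world")...)
	fmt.Println(string(long) == string(buf)) // prints "true"
}
```

The ellipsis is still required: append(buf, s) without it is rejected, since a string is not assignable to a byte element.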

Ian

diff -r 118281731edd go/expressions.cc
--- a/go/expressions.cc	Sun Oct 23 21:56:14 2011 -0700
+++ b/go/expressions.cc	Sun Oct 23 21:58:35 2011 -0700
@@ -8228,6 +8228,17 @@
 	this->report_error(_("too many arguments"));
 	break;
 	  }
+
+	// The language permits appending a string to a []byte, as a
+	// special case.
+	if (args->back()->type()->is_string_type())
+	  {
+	const Array_type* at = args->front()->type()->array_type();
+	const Type* e = at->element_type()->forwarded();
+	if (e == Type::lookup_integer_type("uint8"))
+	  break;
+	  }
+
 	std::string reason;
 	if (!Type::are_assignable(args->front()->type(), args->back()->type(),
   &reason))
@@ -8766,30 +8777,50 @@
 	  return error_mark_node;
 
 	Array_type* at = arg1->type()->array_type();
-	Type* element_type = at->element_type();
-
-	arg2_tree = Expression::convert_for_assignment(context, at,
-		   arg2->type(),
-		   arg2_tree,
-		   location);
-	if (arg2_tree == error_mark_node)
-	  return error_mark_node;
-
-	arg2_tree = save_expr(arg2_tree);
-	tree arg2_val = at->value_pointer_tree(gogo, arg2_tree);
-	tree arg2_len = at->length_tree(gogo, arg2_tree);
-	if (arg2_val == error_mark_node || arg2_len == error_mark_node)
-	  return error_mark_node;
+	Type* element_type = at->element_type()->forwarded();
+
+	tree arg2_val;
+	tree arg2_len;
+	tree element_size;
+	if (arg2->type()->is_string_type()
+	&& element_type == Type::lookup_integer_type("uint8"))
+	  {
+	arg2_tree = save_expr(arg2_tree);
+	arg2_val = String_type::bytes_tree(gogo, arg2_tree);
+	arg2_len = String_type::length_tree(gogo, arg2_tree);
+	element_size = size_int(1);
+	  }
+	else
+	  {
+	arg2_tree = Expression::convert_for_assignment(context, at,
+			   arg2->type(),
+			   arg2_tree,
+			   location);
+	if (arg2_tree == error_mark_node)
+	  return error_mark_node;
+
+	arg2_tree = save_expr(arg2_tree);
+
+	 arg2_val = at->value_pointer_tree(gogo, arg2_tree);
+	 arg2_len = at->length_tree(gogo, arg2_tree);
+
+	 Btype* element_btype = element_type->get_backend(gogo);
+	 tree element_type_tree = type_to_tree(element_btype);
+	 if (element_type_tree == error_mark_node)
+	   return error_mark_node;
+	 element_size = TYPE_SIZE_UNIT(element_type_tree);
+	  }
+
 	arg2_val = fold_convert_loc(location, ptr_type_node, arg2_val);
 	arg2_len = fold_convert_loc(location, size_type_node, arg2_len);
-
-	tree element_type_tree = type_to_tree(element_type->get_backend(gogo));
-	if (element_type_tree == error_mark_node)
-	  return error_mark_node;
-	tree element_size = TYPE_SIZE_UNIT(element_type_tree);
 	element_size = fold_convert_loc(location, size_type_node,
 	element_size);
 
+	if (arg2_val == error_mark_node
+	|| arg2_len == error_mark_node
+	|| element_size == error_mark_node)
+	  return error_mark_node;
+
 	// We rebuild the decl each time since the slice types may
 	// change.
 	tree append_fndecl = NULL_TREE;


Re: [PATCH] Fix mv8plus, allow targetting Linux or Solaris from other sparc host.

2011-10-23 Thread Eric Botcazou
> Ok, I committed your suggestion to trunk for now.

Thanks!

-- 
Eric Botcazou