Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Jason Merrill

On 07/27/2011 01:54 PM, Dodji Seketeli wrote:

+  /*  Set of typedefs that are used in this function.  */
+  struct pointer_set_t * GTY((skip)) used_local_typedefs;


Is there a reason not to just use TREE_USED for this?


+  /* Vector of locally defined typedefs, for
+ -Wunused-local-typedefs.  */
+  VEC(tree,gc) *local_typedefs;


If the accessors are in c-common, this field should be in 
c_language_function.



+  /* We are only interested in a typedef declared locally.  */
+  if (DECL_CONTEXT (typedef_decl) != current_function_decl)
+   return;


What if it's used in a nested function/local class/lambda?


@@ -4175,6 +4175,9 @@ mark_used (tree decl)

   /* Set TREE_USED for the benefit of -Wunused.  */
   TREE_USED (decl) = 1;
+
+  maybe_record_local_typedef_use (TREE_TYPE (decl));


Why is this needed?  If the decl has the typedef for a type, we should 
have already marked it as used in grokdeclarator.


Actually, couldn't we just mark a typedef as used when lookup finds 
it?  That would avoid having to mark in so many places and avoid the 
need for walk_tree.


I think -Wunused and -Wall should imply -Wunused-local-typedefs unless 
the user specifies -Wno-unused-local-typedefs.


Jason
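
For readers unfamiliar with the case under discussion, here is a minimal
sketch of what the proposed warning is meant to catch.  The function and
typedef names are made up for illustration and do not come from the patch.

```c
/* Illustrative only: names are hypothetical, not from the patch.  */
int
scale (int x)
{
  typedef int unused_t;   /* Never referenced: candidate for the warning.  */
  typedef long wide_t;    /* Referenced below, so it counts as used.  */

  wide_t tmp = (wide_t) x * 2;
  return (int) tmp;
}
```

Under the "mark on lookup" idea, the name lookup for wide_t in the
declaration of tmp is what would mark that typedef as used; nothing ever
looks up unused_t.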


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-29 Thread Paolo Bonzini

On 07/29/2011 08:34 AM, Paolo Bonzini wrote:

+  temp = rtl_hooks.gen_lowpart_no_emit (mode, x);


This line is obviously spurious, sorry.


+  if (no_emit)
+ return rtl_hooks.gen_lowpart_no_emit (mode, x);
+  else
+return gen_lowpart (mode, x);


Paolo


[PATCH] Fix PR49893

2011-07-29 Thread Richard Guenther

This fixes a latent issue in predictive commoning - we shouldn't try
to optimize invariant volatile references.  The following patch simply
disables handling of all volatile references similar to possibly
throwing ones.
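
A minimal sketch of the kind of loop this is about, assuming a
hypothetical function poll_sum that is not part of the PR testcase: the
address read in the loop is invariant, but the object is volatile.

```c
/* Illustrative only.  The address of *status is loop-invariant, but
   because the object is volatile every iteration must perform its own
   read; the loads may not be combined or reused across iterations.  */
unsigned
poll_sum (volatile unsigned *status, int n)
{
  unsigned sum = 0;
  for (int i = 0; i < n; i++)
    sum += *status;
  return sum;
}
```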

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2011-07-29  Richard Guenther  

PR tree-optimization/49893
* tree-predcom.c (suitable_reference_p): Volatile references
are not suitable.

Index: gcc/tree-predcom.c
===
*** gcc/tree-predcom.c  (revision 176869)
--- gcc/tree-predcom.c  (working copy)
*** suitable_reference_p (struct data_refere
*** 598,603 
--- 598,604 
tree ref = DR_REF (a), step = DR_STEP (a);
  
if (!step
+   || TREE_THIS_VOLATILE (ref)
|| !is_gimple_reg_type (TREE_TYPE (ref))
|| tree_could_throw_p (ref))
  return false;


[PATCH] Fix VRP handling of undefined state

2011-07-29 Thread Richard Guenther

I noticed that for binary expressions VRP contains the same bugs
that CCP once did (it treats UNDEFINED * 0 as UNDEFINED).  Then
I noticed we never hit this bug because we never end up with
any range being UNDEFINED - which is bad, because this way we miss
value-ranges for all conditionally initialized variables.

Thus, the following patch fixes this and conservatively handles
the binary expression case (which is only of academic interest
anyway - the important part is to handle UNDEFINED in PHIs).
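
To illustrate why UNDEFINED matters in PHIs, consider a variable that is
only assigned on one path, e.g. "int x; if (c) x = 5;".  The PHI merges
the constant 5 with an uninitialized default definition.  The following
is a toy model of such a merge, not GCC's data structures: struct vr and
vr_meet are hypothetical names, and only the three lattice states are
borrowed from the patch.

```c
/* Toy model of a value-range lattice; not GCC's implementation.  */
enum vr_kind { VR_UNDEFINED, VR_RANGE, VR_VARYING };

struct vr
{
  enum vr_kind kind;
  int min, max;           /* Meaningful only for VR_RANGE.  */
};

/* Merge two ranges arriving at a PHI node.  UNDEFINED acts as the
   identity, so an uninitialized argument does not widen the result.  */
struct vr
vr_meet (struct vr a, struct vr b)
{
  if (a.kind == VR_UNDEFINED)
    return b;
  if (b.kind == VR_UNDEFINED)
    return a;
  if (a.kind == VR_VARYING || b.kind == VR_VARYING)
    {
      struct vr v = { VR_VARYING, 0, 0 };
      return v;
    }
  struct vr r = { VR_RANGE,
                  a.min < b.min ? a.min : b.min,
                  a.max > b.max ? a.max : b.max };
  return r;
}
```

If the default definition were VARYING instead, the merge with [5, 5]
would yield VARYING and the range for x would be lost.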

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-07-29  Richard Guenther  

* tree-vrp.c (get_value_range): Only set parameter default
definitions to varying, leave others at undefined.
(extract_range_from_binary_expr): Fix undefined handling.
(vrp_visit_phi_node): Handle merged undefined state.

* gcc.dg/uninit-suppress.c: Also disable VRP.
* gcc.dg/uninit-suppress_2.c: Likewise.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 176869)
+++ gcc/tree-vrp.c  (working copy)
@@ -692,16 +692,16 @@ get_value_range (const_tree var)
   /* Defer allocating the equivalence set.  */
   vr->equiv = NULL;
 
-  /* If VAR is a default definition, the variable can take any value
- in VAR's type.  */
+  /* If VAR is a default definition of a parameter, the variable can
+ take any value in VAR's type.  */
   sym = SSA_NAME_VAR (var);
-  if (SSA_NAME_IS_DEFAULT_DEF (var))
+  if (SSA_NAME_IS_DEFAULT_DEF (var)
+  && TREE_CODE (sym) == PARM_DECL)
 {
   /* Try to use the "nonnull" attribute to create ~[0, 0]
 anti-ranges for pointers.  Note that this is only valid with
 default definitions of PARM_DECLs.  */
-  if (TREE_CODE (sym) == PARM_DECL
- && POINTER_TYPE_P (TREE_TYPE (sym))
+  if (POINTER_TYPE_P (TREE_TYPE (sym))
  && nonnull_arg_p (sym))
set_value_range_to_nonnull (vr, TREE_TYPE (sym));
   else
@@ -2225,12 +2225,20 @@ extract_range_from_binary_expr (value_ra
   else
 set_value_range_to_varying (&vr1);
 
-  /* If either range is UNDEFINED, so is the result.  */
-  if (vr0.type == VR_UNDEFINED || vr1.type == VR_UNDEFINED)
+  /* If both ranges are UNDEFINED, so is the result.  */
+  if (vr0.type == VR_UNDEFINED && vr1.type == VR_UNDEFINED)
 {
   set_value_range_to_undefined (vr);
   return;
 }
+  /* If one of the ranges is UNDEFINED drop it to VARYING for the following
+ code.  At some point we may want to special-case operations that
+ have UNDEFINED result for all or some value-ranges of the not UNDEFINED
+ operand.  */
+  else if (vr0.type == VR_UNDEFINED)
+set_value_range_to_varying (&vr0);
+  else if (vr1.type == VR_UNDEFINED)
+set_value_range_to_varying (&vr1);
 
   /* The type of the resulting value range defaults to VR0.TYPE.  */
   type = vr0.type;
@@ -6642,6 +6650,8 @@ vrp_visit_phi_node (gimple phi)
 
   if (vr_result.type == VR_VARYING)
 goto varying;
+  else if (vr_result.type == VR_UNDEFINED)
+goto update_range;
 
   old_edges = vr_phi_edge_counts[SSA_NAME_VERSION (lhs)];
   vr_phi_edge_counts[SSA_NAME_VERSION (lhs)] = edges;
@@ -6713,6 +6723,7 @@ vrp_visit_phi_node (gimple phi)
 
   /* If the new range is different than the previous value, keep
  iterating.  */
+update_range:
   if (update_value_range (lhs, &vr_result))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))

Index: gcc/testsuite/gcc.dg/uninit-suppress.c
===
*** gcc/testsuite/gcc.dg/uninit-suppress.c  (revision 176869)
--- gcc/testsuite/gcc.dg/uninit-suppress.c  (working copy)
***
*** 1,5 
  /* { dg-do compile } */
! /* { dg-options "-fno-tree-ccp -O2 -Wuninitialized -Wno-maybe-uninitialized" 
} */
  void blah();
  int gflag;
  
--- 1,5 
  /* { dg-do compile } */
! /* { dg-options "-fno-tree-ccp -fno-tree-vrp -O2 -Wuninitialized 
-Wno-maybe-uninitialized" } */
  void blah();
  int gflag;
  
Index: gcc/testsuite/gcc.dg/uninit-suppress_2.c
===
*** gcc/testsuite/gcc.dg/uninit-suppress_2.c(revision 176869)
--- gcc/testsuite/gcc.dg/uninit-suppress_2.c(working copy)
***
*** 1,5 
  /* { dg-do compile } */
! /* { dg-options "-fno-tree-ccp -O2 -Wuninitialized -Werror=uninitialized 
-Wno-error=maybe-uninitialized" } */
  void blah();
  int gflag;
  
--- 1,5 
  /* { dg-do compile } */
! /* { dg-options "-fno-tree-ccp -fno-tree-vrp -O2 -Wuninitialized 
-Werror=uninitialized -Wno-error=maybe-uninitialized" } */
  void blah();
  int gflag;
  


[testsuite] XFAIL gcc.dg/tree-ssa/pr42585.c on Tru64 UNIX (PR tree-optimization/47407)

2011-07-29 Thread Rainer Orth
As Martin analyzed in the PR, those failures are expected, so I'm
XFAILing them.

Tested with the appropriate runtest invocations on alpha-dec-osf5.1b and
i386-pc-solaris2.11, installed on mainline.

Rainer


2011-07-29  Rainer Orth  

PR tree-optimization/47407
* gcc.dg/tree-ssa/pr42585.c: XFAIL scan-tree-dump-times on
alpha*-dec-osf*.
Sort target list.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr42585.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr42585.c (revision 176918)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr42585.c (working copy)
@@ -35,6 +35,6 @@
 /* Whether the structs are totally scalarized or not depends on the
MOVE_RATIO macro defintion in the back end.  The scalarization will
not take place when using small values for MOVE_RATIO.  */
-/* { dg-final { scan-tree-dump-times "struct _fat_ptr _ans" 0 "optimized" { 
target { ! "powerpc*-*-* arm-*-* sh*-*-* s390*-*-*" } } } } */
-/* { dg-final { scan-tree-dump-times "struct _fat_ptr _T2" 0 "optimized" { 
target { ! "powerpc*-*-* arm-*-* sh*-*-* s390*-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "struct _fat_ptr _ans" 0 "optimized" { 
target { ! "alpha*-dec-osf* arm-*-* powerpc*-*-* s390*-*-* sh*-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "struct _fat_ptr _T2" 0 "optimized" { 
target { ! "alpha*-dec-osf* arm-*-* powerpc*-*-* s390*-*-* sh*-*-*" } } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Paolo Carlini

Hi,
> I think -Wunused and -Wall should imply -Wunused-local-typedefs unless
> the user specifies -Wno-unused-local-typedefs.

IMHO, this is a very good idea looking forward, but then I think we 
should make sure the warning plays well with system headers, either as-is 
or together with some other pending work of Dodji's. In particular, as I 
probably mentioned already in the thread, we really want to double-check 
that debug-mode does not trigger warnings. I'm a bit worried because 
many people use and like it.


Paolo.


Re: Allow IRIX Ada bootstrap with C++

2011-07-29 Thread Rainer Orth
Andreas,

> Wouldn't it be cleanest to adjust the prototype of __gnat_error_handler
> to reality, and cast it when assigning to sa_handler (not sa_sigaction,
> which is only valid if SA_SIGINFO is set)?

probably, provided g++ accepts that.  I'd have to run two bootstraps (C
and C++) to check this.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [C++0x] contiguous bitfields race implementation

2011-07-29 Thread Richard Guenther
On Fri, Jul 29, 2011 at 4:12 AM, Aldy Hernandez  wrote:
> On 07/28/11 06:40, Richard Guenther wrote:
>
>> Looking at the C++ memory model what you need is indeed simple enough
>> to recover here.  Still this loop does quadratic work for a struct with
>> N bitfield members and a function which stores into all of them.
>> And that with a big constant factor as you build a component-ref
>> and even unshare trees (which isn't necessary here anyway).  In fact
>> you could easily manually keep track of bitpos when walking adjacent
>> bitfield members.  An initial call to get_inner_reference on
>> TREE_OPERAND (exp, 0) would give you the starting position of the record.
>>
>> That would still be quadratic of course.
>
> Actually, we don't need to call get_inner_reference at all.  It seems
> DECL_FIELD_BIT_OFFSET has all the information we need.
>
> How about we simplify things further as in the attached patch?
>
> Tested on x86-64 Linux.
>
> OK for mainline?

Well ... byte pieces of the offset can be in the tree offset
(DECL_FIELD_OFFSET).  Only up to DECL_OFFSET_ALIGN bits
are tracked in DECL_FIELD_BIT_OFFSET (and DECL_FIELD_OFFSET
can be a non-constant - at least for Ada, not sure about C++).

But - can you please expand a bit on the desired semantics of
get_bit_range?  Especially, relative to what is *bitstart / *bitend
supposed to be?  Why do you pass in bitpos and bitsize - they
seem to be used as local variables only.  Why is the check for
thread-local storage in this function and not in the caller (and
what's the magic [0,0] bit-range relative to?)?

The existing get_inner_reference calls give you a bitpos relative
to the start of the containing object - but

  /* If this is the last element in the structure, include the padding
 at the end of structure.  */
  *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;

will set *bitend to the size of the direct parent structure size, not the
size of the underlying object.  Your proposed patch changes
bitpos to be relative to the direct parent structure.

So - I guess you need to play with some testcases like

struct {
   int some_padding;
   struct {
  int bitfield :1;
   } x;
};

and split / clarify some of get_bit_range comments.
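
A small sketch of why the two reference points differ for the testcase
above.  The tag names outer and inner and the helper function are
hypothetical additions so that offsetof can be applied.

```c
#include <limits.h>
#include <stddef.h>

/* The testcase from the message, with tag names added.  */
struct outer
{
  int some_padding;
  struct inner
  {
    int bitfield : 1;
  } x;
};

/* Offset, in bits, of the inner structure within the containing
   object.  A bit position measured from the start of x has to be
   adjusted by this amount to be relative to the whole object.  */
size_t
inner_offset_bits (void)
{
  return offsetof (struct outer, x) * CHAR_BIT;
}
```

Since x follows some_padding, this offset is at least the width of an
int, so a position relative to the direct parent and one relative to the
containing object cannot be used interchangeably.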

Thanks,
Richard.

>


Re: [Patch,AVR]: PR49313

2011-07-29 Thread Georg-Johann Lay
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02424.html

Richard Henderson wrote:
> On 07/27/2011 09:12 AM, Georg-Johann Lay wrote:
>>  PR target/49313
>>  * config/avr/libgcc.S (__ffshi2): Don't skip 2-word instruction.
>>  (__ctzsi2): Result for 0 may be undefined.
>>  (__ctzhi2): Result for 0 may be undefined.
>>  (__popcounthi2): Don't clobber r30. Use __popcounthi2_tail.
>>  (__popcountsi2): Ditto. And don't clobber r26.
>>  (__popcountdi2): Ditto. And don't clobber r27.
>>  * config/avr/avr.md (UNSPEC_COPYSIGN): New c_enum.
>>  (parityhi2): New expand.
>>  (paritysi2): New expand.
>>  (popcounthi2): New expand.
>>  (popcountsi2): New expand.
>>  (clzhi2): New expand.
>>  (clzsi2): New expand.
>>  (ctzhi2): New expand.
>>  (ctzsi2): New expand.
>>  (ffshi2): New expand.
>>  (ffssi2): New expand.
>>  (copysignsf2): New insn.
>>  (bswapsi2): New expand.
>>  (*parityhi2.libgcc): New insn.
>>  (*parityqihi2.libgcc): New insn.
>>  (*paritysihi2.libgcc): New insn.
>>  (*popcounthi2.libgcc): New insn.
>>  (*popcountsi2.libgcc): New insn.
>>  (*popcountqi2.libgcc): New insn.
>>  (*popcountqihi2.libgcc): New insn-and-split.
>>  (*clzhi2.libgcc): New insn.
>>  (*clzsihi2.libgcc): New insn.
>>  (*ctzhi2.libgcc): New insn.
>>  (*ctzsihi2.libgcc): New insn.
>>  (*ffshi2.libgcc): New insn.
>>  (*ffssihi2.libgcc): New insn.
>>  (*bswapsi2.libgcc): New insn.
> 
> Looks good.
> 
> 
> r~

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176920

Committed with the following changes imposed by
   http://gcc.gnu.org/viewcvs?view=revision&revision=176862
i.e. don't generate zero_extends with hard register and replace

(define_expand ...
...
   (set (match_operand:SI 0 "register_operand" "")
(zero_extend:SI (reg:HI 24)))]
  ""
  "")

with

(define_expand ...
...
   (set (match_dup 2)
(reg:HI 24))
   (set (match_operand:SI 0 "register_operand" "")
(zero_extend:SI (match_dup 2)))]
  ""
  {
operands[2] = gen_reg_rtx (HImode);
  })

Replacing explicit hard registers in expanders/splits with
insns that have corresponding hard register constraints leads
to extraordinarily bad code, see
   http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02412.html
and the following discussion.  Unfortunately, the relevant post
has not shown up in the mail archives yet (24 hours and counting), so
I cannot link it, but Richard evidently received it, as he replied
to it.

Passed without regressions.

Johann


[PATCH] Fix comparison type in builtin folding

2011-07-29 Thread Richard Guenther

I noticed the following when LTOing libgfortran into polyhedron
with -Ofast which delays signbit folding and exposes the bogus
comparison type to the new stricter type checking.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-07-29  Richard Guenther  

* builtins.c (fold_builtin_signbit): Build the comparison
with a proper type.

Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 176920)
+++ gcc/builtins.c  (working copy)
@@ -8645,8 +8645,9 @@ fold_builtin_signbit (location_t loc, tr
 
   /* If ARG's format doesn't have signed zeros, return "arg < 0.0".  */
   if (!HONOR_SIGNED_ZEROS (TYPE_MODE (TREE_TYPE (arg))))
-return fold_build2_loc (loc, LT_EXPR, type, arg,
-   build_real (TREE_TYPE (arg), dconst0));
+return fold_convert (type,
+fold_build2_loc (loc, LT_EXPR, boolean_type_node, arg,
+   build_real (TREE_TYPE (arg), dconst0)));
 
   return NULL_TREE;
 }
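
For context, a short sketch of why the "arg < 0.0" form is only used when
the format has no signed zeros.  The two helper functions are
hypothetical and only contrast the two expressions at the source level.

```c
#include <math.h>

/* Illustrative only.  signbit reports the sign bit itself, so it
   distinguishes -0.0 from 0.0, whereas an ordered comparison does not.  */
int
sign_by_signbit (double x)
{
  return signbit (x) != 0;
}

int
sign_by_compare (double x)
{
  return x < 0.0;
}
```

The two agree everywhere except for negative zero (NaNs aside) on
formats that have signed zeros.  The comparison itself yields a truth
value, which is why the patch builds it with boolean_type_node and then
converts to the builtin's return type.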


[trans-mem] verify_types_in_gimple_seq_2 glitch

2011-07-29 Thread Patrick Marlier

In tree-cfg.c (line ~3921), there is a little glitch.

Index: tree-cfg.c
===
--- tree-cfg.c  (revision 176864)
+++ tree-cfg.c  (working copy)
@@ -3918,7 +3918,7 @@
  break;

case GIMPLE_TRANSACTION:
- err |= verify_types_in_gimple_seq_2 (gimple_omp_body (stmt));
+ err |= verify_types_in_gimple_seq_2 (gimple_transaction_body 
(stmt));

  break;

default:

Patrick Marlier.


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Dodji Seketeli
Jason Merrill  writes:

> On 07/27/2011 01:54 PM, Dodji Seketeli wrote:
>> +  /*  Set of typedefs that are used in this function.  */
>> +  struct pointer_set_t * GTY((skip)) used_local_typedefs;
>
> Is there a reason not to just use TREE_USED for this?

I wasn't sure if that flag wasn't (or couldn't be) used for some "core"
functionality with another meaning for this type of tree, and so would
cause some conflict.

>
>> +  /* Vector of locally defined typedefs, for
>> + -Wunused-local-typedefs.  */
>> +  VEC(tree,gc) *local_typedefs;
>
> If the accessors are in c-common, this field should be in
> c_language_function.
>

Thanks, I didn't realize this existed.

> Actually, couldn't we just mark a typedef as used when lookup
> finds it?  That would avoid having to mark in so many places and avoid
> the need for walk_tree.

This would indeed simplify things.  I'll try it.

> I think -Wunused and -Wall should imply -Wunused-local-typedefs unless
> the user specifies -Wno-unused-local-typedefs.

I actually first tried this (adding it to -Wall, -Wextra and
-Wunused) and ran into the following issue.

A typedef can be defined in a macro in a system header, be expanded in a
function and not be used by the function.  In this case we shouldn't
warn, but PR preprocessor/7263 makes us warn nonetheless.  There are
many spots of that kind in the libstdc++ test suite.
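
A minimal sketch of the situation described above, assuming a
hypothetical macro COMPILE_TIME_ASSERT standing in for one that a system
header might provide; it is not taken from libstdc++.

```c
/* Hypothetical stand-in for a macro from a system header: a pre-C11
   compile-time assertion implemented with a typedef.  */
#define COMPILE_TIME_ASSERT(cond, name) \
  typedef char name[(cond) ? 1 : -1]

int
bytes_in_int (void)
{
  /* Expands to a local typedef that nothing refers to afterwards.  */
  COMPILE_TIME_ASSERT (sizeof (int) >= 2, int_is_at_least_16_bits);
  return (int) sizeof (int);
}
```

The typedef exists only for its side effect at compile time, so the
function never uses it.  If the macro lives in a system header, the user
cannot reasonably fix the definition, which is why warning here is
undesirable.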

Paolo Carlini  writes:

> Hi,
>> I think -Wunused and -Wall should imply -Wunused-local-typedefs
>> unless the user specifies -Wno-unused-local-typedefs.
> IMHO, this is a very good idea looking forward, but then I think we
> should make sure the warning plays well with system headers either
> as-is or together with some other pending work of Dodji. In
> particular, as I probably mentioned already in the trail, we really
> want to double check that debug-mode does not trigger warnings, I'm a
> bit of worried because many people use and like it.

Exactly.  This would be a side effect of PR preprocessor/7263?

So do you guys think we should add it nonetheless and just add
-Wno-unused-local-typedefs to the tests that exhibit the above issue
before fixing PR preprocessor/7263?

Thanks.

-- 
Dodji


Commit: 4.5: Fix typo in rx.c

2011-07-29 Thread Nick Clifton
Hi Guys,

  I am applying the patch below to the 4.5 branch to fix a typo in the
  code to check the setpsw builtin.
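
  The effect of the misplaced label can be modelled with a toy switch.
  The enumerators and return values below are hypothetical stand-ins
  and only mirror the control flow, not the rx.c code itself.

```c
/* Toy model of the control flow only.  */
enum builtin { B_CLRPSW, B_SETPSW, B_INT };

/* Label placed after the body: the SETPSW handling directly follows a
   return and is unreachable, and B_SETPSW falls through to B_INT.  */
int
expand_misplaced (enum builtin b)
{
  switch (b)
    {
    case B_CLRPSW:
      return 1;
      return 2;          /* Intended SETPSW handling, never reached.  */
    case B_SETPSW:
    case B_INT:
      return 3;
    }
  return 0;
}

/* Label placed before the body, as in the fix.  */
int
expand_fixed (enum builtin b)
{
  switch (b)
    {
    case B_CLRPSW:
      return 1;
    case B_SETPSW:
      return 2;
    case B_INT:
      return 3;
    }
  return 0;
}
```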

Cheers
  Nick

gcc/ChangeLog
2011-07-29  Nick Clifton  

* config/rx/rx.c (rx_expand_builtin): Fix typo checking the setpsw
builtin.
  
Index: gcc/config/rx/rx.c
===
--- gcc/config/rx/rx.c  (revision 176922)
+++ gcc/config/rx/rx.c  (working copy)
@@ -2158,10 +2158,10 @@
   if (! valid_psw_flag (op, "clrpsw"))
return NULL_RTX;
   return rx_expand_void_builtin_1_arg (op, gen_clrpsw, false);
+case RX_BUILTIN_SETPSW:  
   if (! valid_psw_flag (op, "setpsw"))
return NULL_RTX;
   return rx_expand_void_builtin_1_arg (op, gen_setpsw, false);
-case RX_BUILTIN_SETPSW:  
 case RX_BUILTIN_INT: return rx_expand_void_builtin_1_arg
(op, gen_int, false);
 case RX_BUILTIN_MACHI:   return rx_expand_builtin_mac (exp, gen_machi);
  


Re: Mention avx2 patch

2011-07-29 Thread Kirill Yukhin
Agreed, but I have no idea how to work with wwwdocs at all.

H.J., could you please move ix86/avx to "inactive" section?

Thanks, K

On Thu, Jul 28, 2011 at 2:26 PM, Gerald Pfeifer  wrote:
> On Thu, 28 Jul 2011, Kirill Yukhin wrote:
>> Ping
>
> Oh, sure.  I had somehow thought this had been applied already.
>
> Instead of just removing ix86/avx, would you mind moving it to
> the "Inactive Development Branches" section?
>
> Gerald
>


Re: [trans-mem] verify_types_in_gimple_seq_2 glitch

2011-07-29 Thread Aldy Hernandez

On 07/29/11 05:25, Patrick Marlier wrote:

In tree-cfg.c (line ~3921), there is a little glitch.

Index: tree-cfg.c
===
--- tree-cfg.c (revision 176864)
+++ tree-cfg.c (working copy)
@@ -3918,7 +3918,7 @@
break;

case GIMPLE_TRANSACTION:
- err |= verify_types_in_gimple_seq_2 (gimple_omp_body (stmt));
+ err |= verify_types_in_gimple_seq_2 (gimple_transaction_body (stmt));
break;

default:

Patrick Marlier.


Have you tested this patch?

ChangeLog entry?


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-29 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 3:24 PM, H.J. Lu  wrote:

 In x32, thread pointer is 32bit and choice of segment register for the
 thread base ptr load should be based on TARGET_64BIT.  This patch
 implements it.  OK for trunk?
>>>
>>> -ENOTESTCASE.
>>>
>>
>> There is no standalone testcase.  The symptom is in glibc build, I
>> got
>>
>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>  -E -x c-header'
>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>> --library-path 
>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>> ../scripts -h rpcsvc/yppasswd.x -o
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>> Segmentation fault
>> make[5]: *** Waiting for unfinished jobs
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>> Segmentation fault
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>> Segmentation fault
>>
>> since thread pointer is 32bit in x32.
>>
>
> If we load thread pointer (fs segment register) in x32 with 64bit
> load, the upper 32bits are garbage.
> We must load 32bit

So, instead of huge complications with new mode iterator, just
introduce two new patterns that will shadow existing ones for
TARGET_X32.

Like in attached (untested) patch.

Uros.
Index: i386.md
===
--- i386.md (revision 176860)
+++ i386.md (working copy)
@@ -12442,6 +12442,17 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])
 
 ;; Load and add the thread base pointer from %:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%:0, %k0|%k0, DWORD PTR :0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(const_int 0)] UNSPEC_TP))]
@@ -12453,6 +12464,19 @@
(set_attr "memory" "load")
(set_attr "imm_disp" "false")])
 
+(define_insn "*add_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (plus:DI (unspec:DI [(const_int 0)] UNSPEC_TP)
+(match_operand:DI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%:0, %k0|%k0, DWORD PTR :0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(plus:P (unspec:P [(const_int 0)] UNSPEC_TP)


[PATCH, ARM] Support NEON's VABD with combine pass

2011-07-29 Thread Dmitry Melnik
This patch adds two define_insn patterns for NEON vabd instruction to 
make combine pass recognize expressions matching (vabs (vsub ...)) 
patterns as vabd.
This patch reduces code size of x264 binary from 649143 to 648343 (800 
bytes, or 0.12%) and increases its performance on average by 2.5% on 
plain C version of x264 with -O2 -ftree-vectorize.
On SPEC2K it didn't make any difference -- all vabs instructions found 
in SPEC2K binaries are either using .f64 mode or scalar .f32 which are 
not supported by NEON's vabd.
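
For illustration, the scalar form of the subtract-then-abs sequence that
the new patterns target could look like the loop below.  The function
abs_diff is hypothetical, and whether the vectorizer and combine actually
produce vabd for it depends on the target and options.

```c
#include <stdlib.h>

/* Illustrative scalar form of the subtract-then-abs sequence.  Inputs
   are assumed small enough that a[i] - b[i] does not overflow.  */
void
abs_diff (const int *a, const int *b, int *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = abs (a[i] - b[i]);
}
```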

Regtested with QEMU.

Ok for trunk?


--
Best regards,
   Dmitry
2011-07-21  Sevak Sargsyan 

* config/arm/neon.md (neon_vabd_2, neon_vabd_3): New define_insn patterns for combine.

gcc/testsuite:

* gcc.target/arm/neon-combine-sub-abs-into-vabd.c: New test.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index a8c1b87..f457365 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -5607,3 +5607,32 @@
   emit_insn (gen_neon_vec_pack_trunc_ (operands[0], tempreg));
   DONE;
 })
+
+(define_insn "neon_vabd_2"
+ [(set (match_operand:VDQ 0 "s_register_operand" "=w")
+   (abs:VDQ (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
+   (match_operand:VDQ 2 "s_register_operand" "w"))))]
+ "TARGET_NEON"
+ "vabd. %0, %1, %2"
+ [(set (attr "neon_type")
+   (if_then_else (ne (symbol_ref "") (const_int 0))
+ (if_then_else (ne (symbol_ref "") (const_int 0))
+   (const_string "neon_fp_vadd_ddd_vabs_dd")
+   (const_string "neon_fp_vadd_qqq_vabs_qq"))
+ (const_string "neon_int_5")))]
+)
+
+(define_insn "neon_vabd_3"
+ [(set (match_operand:VDQ 0 "s_register_operand" "=w")
+   (abs:VDQ (unspec:VDQ [(match_operand:VDQ 1 "s_register_operand" "w")
+ (match_operand:VDQ 2 "s_register_operand" "w")]
+ UNSPEC_VSUB)))]
+ "TARGET_NEON"
+ "vabd. %0, %1, %2"
+ [(set (attr "neon_type")
+   (if_then_else (ne (symbol_ref "") (const_int 0))
+ (if_then_else (ne (symbol_ref "") (const_int 0))
+   (const_string "neon_fp_vadd_ddd_vabs_dd")
+   (const_string "neon_fp_vadd_qqq_vabs_qq"))
+ (const_string "neon_int_5")))]
+)
diff --git a/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
new file mode 100644
index 000..aae4117
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -funsafe-math-optimizations" } */
+/* { dg-add-options arm_neon } */
+
+#include 
+float32x2_t f_sub_abs_to_vabd_32()
+{
+
+   float32x2_t val1 = vdup_n_f32 (10); 
+   float32x2_t val2 = vdup_n_f32 (30);
+   float32x2_t sres = vsub_f32(val1, val2);
+   float32x2_t res = vabs_f32 (sres); 
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.f32" } }*/
+
+#include 
+int8x8_t sub_abs_to_vabd_8()
+{
+   
+   int8x8_t val1 = vdup_n_s8 (10); 
+int8x8_t val2 = vdup_n_s8 (30);
+int8x8_t sres = vsub_s8(val1, val2);
+int8x8_t res = vabs_s8 (sres); 
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.s8" } }*/
+
+int16x4_t sub_abs_to_vabd_16()
+{
+   
+   int16x4_t val1 = vdup_n_s16 (10); 
+int16x4_t val2 = vdup_n_s16 (30);
+int16x4_t sres = vsub_s16(val1, val2);
+int16x4_t res = vabs_s16 (sres); 
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.s16" } }*/
+
+int32x2_t sub_abs_to_vabd_32()
+{
+
+int32x2_t val1 = vdup_n_s32 (10);
+int32x2_t val2 = vdup_n_s32 (30);
+int32x2_t sres = vsub_s32(val1, val2);
+int32x2_t res = vabs_s32 (sres);
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.s32" } }*/


[Committed,AVR]: Addendum to fix thinko in PR49687

2011-07-29 Thread Georg-Johann Lay
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02391.html

(reg:DI 18) does not cover (reg:HI 26) which also contributes to
the register footprint of implicit libgcc calls.

I should return to elementary school and learn counting again...

Installed as obvious, passed without regressions.

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176923

Johann

PR target/49687
* config/avr/avr.md (mulsi3, *mulsi3, mulusi3,
mulssi3, mulohisi3, mulhisi3, umulhisi3, usmulhisi3,

*mulsi3):
Add X to register footprint: Clobber r26/r27.
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 176920)
+++ config/avr/avr.md	(working copy)
@@ -1373,6 +1373,7 @@ (define_expand "mulsi3"
   [(parallel [(set (match_operand:SI 0 "register_operand" "")
(mult:SI (match_operand:SI 1 "register_operand" "")
 (match_operand:SI 2 "nonmemory_operand" "")))
+  (clobber (reg:HI 26))
   (clobber (reg:DI 18))])]
   "AVR_HAVE_MUL"
   {
@@ -1395,6 +1396,7 @@ (define_insn_and_split "*mulsi3"
   [(set (match_operand:SI 0 "pseudo_register_operand"  "=r")
 (mult:SI (match_operand:SI 1 "pseudo_register_operand"  "r")
  (match_operand:SI 2 "pseudo_register_or_const_int_operand" "rn")))
+   (clobber (reg:HI 26))
(clobber (reg:DI 18))]
   "AVR_HAVE_MUL && !reload_completed"
   { gcc_unreachable(); }
@@ -1431,6 +1433,7 @@ (define_insn_and_split "mulusi3"
   [(set (match_operand:SI 0 "pseudo_register_operand"   "=r")
 (mult:SI (zero_extend:SI (match_operand:QIHI 1 "pseudo_register_operand" "r"))
  (match_operand:SI 2 "pseudo_register_or_const_int_operand"  "rn")))
+   (clobber (reg:HI 26))
(clobber (reg:DI 18))]
   "AVR_HAVE_MUL && !reload_completed"
   { gcc_unreachable(); }
@@ -1466,6 +1469,7 @@ (define_insn_and_split "mulssi3"
   [(set (match_operand:SI 0 "pseudo_register_operand"   "=r")
 (mult:SI (sign_extend:SI (match_operand:QIHI 1 "pseudo_register_operand" "r"))
  (match_operand:SI 2 "pseudo_register_or_const_int_operand"  "rn")))
+   (clobber (reg:HI 26))
(clobber (reg:DI 18))]
   "AVR_HAVE_MUL && !reload_completed"
   { gcc_unreachable(); }
@@ -1509,6 +1513,7 @@ (define_insn_and_split "mulohisi3"
 (mult:SI (not:SI (zero_extend:SI 
   (not:HI (match_operand:HI 1 "pseudo_register_operand" "r"))))
  (match_operand:SI 2 "pseudo_register_or_const_int_operand" "rn")))
+   (clobber (reg:HI 26))
(clobber (reg:DI 18))]
   "AVR_HAVE_MUL && !reload_completed"
   { gcc_unreachable(); }
@@ -1528,6 +1533,7 @@ (define_expand "mulhisi3"
   [(parallel [(set (match_operand:SI 0 "register_operand" "")
(mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" ""))
 (sign_extend:SI (match_operand:HI 2 "register_operand" ""))))
+  (clobber (reg:HI 26))
   (clobber (reg:DI 18))])]
   "AVR_HAVE_MUL"
   "")
@@ -1536,6 +1542,7 @@ (define_expand "umulhisi3"
   [(parallel [(set (match_operand:SI 0 "register_operand" "")
(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" ""))
 (zero_extend:SI (match_operand:HI 2 "register_operand" ""))))
+  (clobber (reg:HI 26))
   (clobber (reg:DI 18))])]
   "AVR_HAVE_MUL"
   "")
@@ -1544,6 +1551,7 @@ (define_expand "usmulhisi3"
   [(parallel [(set (match_operand:SI 0 "register_operand" "")
(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" ""))
 (sign_extend:SI (match_operand:HI 2 "register_operand" ""))))
+  (clobber (reg:HI 26))
   (clobber (reg:DI 18))])]
   "AVR_HAVE_MUL"
   "")
@@ -1557,6 +1565,7 @@ (define_insn_and_split
   [(set (match_operand:SI 0 "pseudo_register_operand""=r")
 (mult:SI (any_extend:SI (match_operand:QIHI 1 "pseudo_register_operand"   "r"))
  (any_extend2:SI (match_operand:QIHI2 2 "pseudo_register_operand" "r"))))
+   (clobber (reg:HI 26))
(clobber (reg:DI 18))]
   "AVR_HAVE_MUL && !reload_completed"
   { gcc_unreachable(); }


C++ PATCH for c++/49808 (failure with reference non-type template argument)

2011-07-29 Thread Jason Merrill
Here, the problem was that we weren't calling convert_from_reference 
when substituting the argument for a non-type template parameter as we 
do in other places that might produce a reference.  Once that change was 
made, I had to adjust a couple of other functions to deal with getting a 
reference ref instead of a raw reference.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 176d3b3cbf489dfc485576a3d8a204da3e95acde
Author: Jason Merrill 
Date:   Wed Jul 27 00:01:15 2011 -0700

	PR c++/49808
	* pt.c (tsubst) [TEMPLATE_PARM_INDEX]: Call convert_from_reference.
	(convert_nontype_argument, tsubst_template_arg): Handle its output.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b9e09af..a3cd956 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -5556,41 +5556,45 @@ convert_nontype_argument (tree type, tree expr, tsubst_flags_t complain)
  function). We just strip everything and get to the arg.
  See g++.old-deja/g++.oliva/template4.C and g++.dg/template/nontype9.C
  for examples.  */
-  if (TREE_CODE (expr) == NOP_EXPR)
+  if (TYPE_REF_OBJ_P (type) || TYPE_REFFN_P (type))
 {
-  if (TYPE_REF_OBJ_P (type) || TYPE_REFFN_P (type))
+  tree probe_type, probe = expr;
+  if (REFERENCE_REF_P (probe))
+	probe = TREE_OPERAND (probe, 0);
+  probe_type = TREE_TYPE (probe);
+  if (TREE_CODE (probe) == NOP_EXPR)
 	{
 	  /* ??? Maybe we could use convert_from_reference here, but we
 	 would need to relax its constraints because the NOP_EXPR
 	 could actually change the type to something more cv-qualified,
 	 and this is not folded by convert_from_reference.  */
-	  tree addr = TREE_OPERAND (expr, 0);
-	  gcc_assert (TREE_CODE (expr_type) == REFERENCE_TYPE);
+	  tree addr = TREE_OPERAND (probe, 0);
+	  gcc_assert (TREE_CODE (probe_type) == REFERENCE_TYPE);
 	  gcc_assert (TREE_CODE (addr) == ADDR_EXPR);
 	  gcc_assert (TREE_CODE (TREE_TYPE (addr)) == POINTER_TYPE);
 	  gcc_assert (same_type_ignoring_top_level_qualifiers_p
-		  (TREE_TYPE (expr_type),
+		  (TREE_TYPE (probe_type),
 		   TREE_TYPE (TREE_TYPE (addr))));
 
 	  expr = TREE_OPERAND (addr, 0);
 	  expr_type = TREE_TYPE (expr);
 	}
+}
 
-  /* We could also generate a NOP_EXPR(ADDR_EXPR()) when the
-	 parameter is a pointer to object, through decay and
-	 qualification conversion. Let's strip everything.  */
-  else if (TYPE_PTROBV_P (type))
-	{
-	  STRIP_NOPS (expr);
-	  gcc_assert (TREE_CODE (expr) == ADDR_EXPR);
-	  gcc_assert (TREE_CODE (TREE_TYPE (expr)) == POINTER_TYPE);
-	  /* Skip the ADDR_EXPR only if it is part of the decay for
-	 an array. Otherwise, it is part of the original argument
-	 in the source code.  */
-	  if (TREE_CODE (TREE_TYPE (TREE_OPERAND (expr, 0))) == ARRAY_TYPE)
-	expr = TREE_OPERAND (expr, 0);
-	  expr_type = TREE_TYPE (expr);
-	}
+  /* We could also generate a NOP_EXPR(ADDR_EXPR()) when the
+ parameter is a pointer to object, through decay and
+ qualification conversion. Let's strip everything.  */
+  else if (TREE_CODE (expr) == NOP_EXPR && TYPE_PTROBV_P (type))
+{
+  STRIP_NOPS (expr);
+  gcc_assert (TREE_CODE (expr) == ADDR_EXPR);
+  gcc_assert (TREE_CODE (TREE_TYPE (expr)) == POINTER_TYPE);
+  /* Skip the ADDR_EXPR only if it is part of the decay for
+	 an array. Otherwise, it is part of the original argument
+	 in the source code.  */
+  if (TREE_CODE (TREE_TYPE (TREE_OPERAND (expr, 0))) == ARRAY_TYPE)
+	expr = TREE_OPERAND (expr, 0);
+  expr_type = TREE_TYPE (expr);
 }
 
   /* [temp.arg.nontype]/5, bullet 1
@@ -8941,6 +8945,10 @@ tsubst_template_arg (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 		   /*integral_constant_expression_p=*/true);
   if (!(complain & tf_warning))
 	--c_inhibit_evaluation_warnings;
+  /* Preserve the raw-reference nature of T.  */
+  if (TREE_TYPE (t) && TREE_CODE (TREE_TYPE (t)) == REFERENCE_TYPE
+	  && REFERENCE_REF_P (r))
+	r = TREE_OPERAND (r, 0);
 }
   return r;
 }
@@ -10981,7 +10989,7 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 	  }
 	else
 	  /* TEMPLATE_TEMPLATE_PARM or TEMPLATE_PARM_INDEX.  */
-	  return unshare_expr (arg);
+	  return convert_from_reference (unshare_expr (arg));
 	  }
 
 	if (level == 1)
diff --git a/gcc/testsuite/g++.dg/template/nontype24.C b/gcc/testsuite/g++.dg/template/nontype24.C
new file mode 100644
index 000..57fbe43
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/nontype24.C
@@ -0,0 +1,16 @@
+// PR c++/49808
+
+template 
+struct A
+{
+  A() { float r = g(0); }
+};
+
+struct f_t
+{
+  float operator() (float) const { return 1; }
+};
+
+f_t f;
+
+A x;


Re: [C++0x] contiguous bitfields race implementation

2011-07-29 Thread Aldy Hernandez

On 07/28/11 06:40, Richard Guenther wrote:


Looking at the C++ memory model what you need is indeed simple enough
to recover here.  Still this loop does quadratic work for a struct with
N bitfield members and a function which stores into all of them.
And that with a big constant factor as you build a component-ref
and even unshare trees (which isn't necessary here anyway).  In fact
you could easily manually keep track of bitpos when walking adjacent
bitfield members.  An initial call to get_inner_reference on
TREE_OPERAND (exp, 0) would give you the starting position of the record.

That would still be quadratic of course.


Actually, we don't need to call get_inner_reference at all.  It seems 
DECL_FIELD_BIT_OFFSET has all the information we need.
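
To illustrate the idea outside of GCC, here is a minimal sketch of a single linear walk that relies only on per-field bit offsets. The struct field_info type and the bit_range_for_field function are hypothetical stand-ins invented for this note (they are not GCC's tree API), and zero-width bitfield separators are ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a FIELD_DECL: only what the sketch needs.  */
struct field_info
{
  unsigned long bit_offset;   /* plays the role of DECL_FIELD_BIT_OFFSET */
  unsigned long bit_size;     /* width of the field in bits */
  bool is_bitfield;           /* true for bitfield members */
};

/* Find the run of adjacent bitfields that contains FIELDS[TARGET].
   On success store the first bit of the run in *START and one past
   its last bit in *END, and return true.  Return false if the target
   is not a bitfield.  Each field is visited at most once.  */
static bool
bit_range_for_field (const struct field_info *fields, size_t n,
                     size_t target, unsigned long *start,
                     unsigned long *end)
{
  size_t i;
  bool in_run = false, found = false;
  unsigned long run_start = 0, run_end = 0;

  assert (target < n);
  for (i = 0; i < n; i++)
    {
      if (!fields[i].is_bitfield)
        {
          /* A non-bitfield member terminates the current run.  */
          if (found)
            break;
          in_run = false;
          continue;
        }
      if (!in_run)
        {
          run_start = fields[i].bit_offset;
          in_run = true;
        }
      run_end = fields[i].bit_offset + fields[i].bit_size;
      if (i == target)
        found = true;
    }

  if (!found)
    return false;
  *start = run_start;
  *end = run_end;
  return true;
}
```

No reference tree is built and nothing is unshared; the offsets stored with each field are enough to delimit the run.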


How about we simplify things further as in the attached patch?

Tested on x86-64 Linux.

OK for mainline?

* expr.c (get_bit_range): Get field bit offset from
DECL_FIELD_BIT_OFFSET.

Index: expr.c
===
--- expr.c  (revision 176891)
+++ expr.c  (working copy)
@@ -4179,18 +4179,10 @@ get_bit_range (unsigned HOST_WIDE_INT *b
   prev_field_is_bitfield = true;
   for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
 {
-  tree t, offset;
-  enum machine_mode mode;
-  int unsignedp, volatilep;
-
   if (TREE_CODE (fld) != FIELD_DECL)
continue;
 
-  t = build3 (COMPONENT_REF, TREE_TYPE (exp),
- unshare_expr (TREE_OPERAND (exp, 0)),
- fld, NULL_TREE);
-  get_inner_reference (t, &bitsize, &bitpos, &offset,
-  &mode, &unsignedp, &volatilep, true);
+  bitpos = TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld));
 
   if (field == fld)
found_field = true;


Re: [C++0x] contiguous bitfields race implementation

2011-07-29 Thread Aldy Hernandez



Yes.  Together with the above it looks then optimal.


Attached patch tested on x86-64 Linux.

OK for mainline?
* expr.c (get_bit_range): Handle *MEM_REF's.

Index: expr.c
===
--- expr.c  (revision 176824)
+++ expr.c  (working copy)
@@ -4158,7 +4158,10 @@ get_bit_range (unsigned HOST_WIDE_INT *b
 
   /* If other threads can't see this value, no need to restrict stores.  */
   if (ALLOW_STORE_DATA_RACES
-  || (!ptr_deref_may_alias_global_p (innerdecl)
+  || ((TREE_CODE (innerdecl) == MEM_REF ||
+  TREE_CODE (innerdecl) == TARGET_MEM_REF)
+ && !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))
+  || (DECL_P (innerdecl)
  && (DECL_THREAD_LOCAL_P (innerdecl)
  || !TREE_STATIC (innerdecl))))
 {


PATCH, v2: PR target/47715: [x32] Use SImode for thread pointer

2011-07-29 Thread Uros Bizjak
Hello!

ABI specifies that TP is loaded in ptr_mode. Attached patch implements
this requirement.

2011-07-29  Uros Bizjak  

* config/i386/i386.md (*load_tp_x32): New.
(*load_tp_x32_zext): Ditto.
(*add_tp_x32): Ditto.
(*add_tp_x32_zext): Ditto.
(*load_tp_): Disable for !TARGET_X32 targets.
(*add_tp_): Ditto.
* config/i386/i386.c (get_thread_pointer): Load thread pointer in
ptr_mode and convert to Pmode if needed.

Testing on x86_64-pc-linux-gnu in progress. H.J., please test this
version on x32.

Uros.
Index: i386.md
===
--- i386.md (revision 176915)
+++ i386.md (working copy)
@@ -12444,10 +12444,32 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])
 
 ;; Load and add the thread base pointer from %:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
+(define_insn "*load_tp_x32_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI (unspec:SI [(const_int 0)] UNSPEC_TP)))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(const_int 0)] UNSPEC_TP))]
-  ""
+  "!TARGET_X32"
   "mov{}\t{%%:0, %0|%0,  PTR :0}"
   [(set_attr "type" "imov")
(set_attr "modrm" "0")
@@ -12455,12 +12477,39 @@
(set_attr "memory" "load")
(set_attr "imm_disp" "false")])
 
+(define_insn "*add_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (plus:SI (unspec:SI [(const_int 0)] UNSPEC_TP)
+(match_operand:SI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
+(define_insn "*add_tp_x32_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (plus:SI (unspec:SI [(const_int 0)] UNSPEC_TP)
+  (match_operand:SI 1 "register_operand" "0"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(plus:P (unspec:P [(const_int 0)] UNSPEC_TP)
(match_operand:P 1 "register_operand" "0")))
(clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_X32"
   "add{}\t{%%:0, %0|%0,  PTR :0}"
   [(set_attr "type" "alu")
(set_attr "modrm" "0")
Index: i386.c
===
--- i386.c  (revision 176915)
+++ i386.c  (working copy)
@@ -12118,17 +12118,15 @@ legitimize_pic_address (rtx orig, rtx re
 static rtx
 get_thread_pointer (bool to_reg)
 {
-  rtx tp, reg, insn;
+  rtx tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
 
-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
-  if (!to_reg)
-return tp;
+  if (GET_MODE (tp) != Pmode)
+tp = convert_to_mode (Pmode, tp, 1);
 
-  reg = gen_reg_rtx (Pmode);
-  insn = gen_rtx_SET (VOIDmode, reg, tp);
-  insn = emit_insn (insn);
+  if (to_reg)
+tp = copy_addr_to_reg (tp);
 
-  return reg;
+  return tp;
 }
 
 /* Construct the SYMBOL_REF for the tls_get_addr function.  */


Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-29 Thread Tom de Vries
On 07/28/2011 12:22 PM, Richard Guenther wrote:
> On Wed, 27 Jul 2011, Tom de Vries wrote:
> 
>> On 07/27/2011 05:27 PM, Richard Guenther wrote:
>>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>>
 On 07/27/2011 02:12 PM, Richard Guenther wrote:
> On Wed, 27 Jul 2011, Tom de Vries wrote:
>
>> On 07/27/2011 01:50 PM, Tom de Vries wrote:
>>> Hi Richard,
>>>
>>> I have a patch set for bug 43513 - The stack pointer is adjusted twice.
>>>
>>> 01_pr43513.3.patch
>>> 02_pr43513.3.test.patch
>>> 03_pr43513.3.mudflap.patch
>>>
>>> The patch set has been bootstrapped and reg-tested on x86_64.
>>>
>>> I will send out the patches individually.
>>>
>>
>> The patch replaces a vla __builtin_alloca that has a constant argument with an
>> array declaration.
>>
>> OK for trunk?
>
> I don't think it is safe to try to get at the VLA type the way you do.

 I don't understand in what way it's not safe. Do you mean I don't manage 
 to find
 the type always, or that I find the wrong type, or something else?
>>>
>>> I think you might get the wrong type,
>>
>> Ok, I'll review that code one more time.
>>
>>> you also do not transform code
>>> like
>>>
>>>   int *p = alloca(4);
>>>   *p = 3;
>>>
>>> as there is no array type involved here.
>>>
>>
>> I was trying to stay away from non-vla allocas.  A source declared alloca has
>> function lifetime, so we could have a single alloca in a loop, called 10 times,
>> with all 10 instances live at the same time. This patch does not detect such
>> cases, and thus stays away from non-vla allocas. A vla decl does not have such
>> problems, the lifetime ends when it goes out of scope.
> 
> Yes indeed - that probably would require more detailed analysis.
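
The lifetime difference discussed above can be shown with a small standalone sketch. The function names and the ROUNDS constant are made up for this note; the code assumes a platform that provides <alloca.h> (for example glibc) and a compiler that supports VLAs.

```c
#include <alloca.h>
#include <assert.h>
#include <stdbool.h>

enum { ROUNDS = 10 };

/* alloca blocks live until the function returns, so after the loop
   all ROUNDS blocks exist at the same time and keep their values.  */
static bool
alloca_blocks_stay_live (void)
{
  int *blocks[ROUNDS];
  int i;

  for (i = 0; i < ROUNDS; i++)
    {
      /* Each iteration carves out a fresh block; none is released
         before the function returns.  */
      blocks[i] = alloca (sizeof (int));
      *blocks[i] = i;
    }

  for (i = 0; i < ROUNDS; i++)
    if (*blocks[i] != i)
      return false;
  return true;
}

/* A VLA declared in the loop body ends its lifetime at the end of
   each iteration, so only one instance is live at any time.  */
static int
vla_sum (int n)
{
  int total = 0;
  int i;

  assert (n > 0);
  for (i = 0; i < ROUNDS; i++)
    {
      int buf[n];
      buf[0] = i;
      total += buf[0];
    }
  return total;
}
```

Replacing the alloca in the first function with one fixed-size array would be wrong, because ten blocks must coexist; for the VLA in the second function a single array would do.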
> 
> In fact I would simply do sth like
>
>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>   n_elem = size * 8 / BITS_PER_UNIT;
>   array_type = build_array_type_nelts (elem_type, n_elem);
>   var = create_tmp_var (array_type, NULL);
>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
>

 I tried this code on the example, and it works, but the newly declared 
 type has
 an 8-bit alignment, while the vla base type has a 32 bit alignment.  This 
 make
 the memory access in the example potentially unaligned, which prohibits an
 ivopts optimization, so the resulting text size is 68 instead of the 64 
 achieved
 with my current patch.
>>>
>>> Ok, so then set DECL_ALIGN of the variable to something reasonable
>>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
>>> alignment that the targets alloca function would guarantee.
>>>
>>
>> I tried that, but that doesn't help. It's the alignment of the type that
>> matters, not of the decl.
> 
> It shouldn't.  All accesses are performed with the original types and
> alignment comes from that (plus the underlying decl).
> 

I managed to get it all working by using build_aligned_type rather than
DECL_ALIGN.
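
As a rough plain-C analogy (not the GCC internals themselves), the sketch below contrasts a byte buffer that only promises the alignment of its element type with one whose declaration requests int alignment. All names are invented for this note, and the code assumes a C11 compiler for _Alignas and _Alignof.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { VLA_BYTES = 40 };   /* size of the alloca in the PR's example */

/* Byte buffer that only promises the alignment of its element type.  */
static unsigned char plain_buf[VLA_BYTES];

/* Same storage size, but the declaration requests int alignment.  */
static _Alignas (int) unsigned char aligned_buf[VLA_BYTES];

/* Alignment, in bits, that an unsigned char element guarantees.  */
static unsigned
char_element_align_bits (void)
{
  return (unsigned) (_Alignof (unsigned char) * CHAR_BIT);
}

/* True if P is suitably aligned for an int access.  */
static bool
is_aligned_for_int (const void *p)
{
  assert (p != NULL);
  return (uintptr_t) p % _Alignof (int) == 0;
}

static const void *
plain_buffer (void)
{
  return plain_buf;
}

static const void *
aligned_buffer (void)
{
  return aligned_buf;
}
```

Only aligned_buffer () is guaranteed to pass is_aligned_for_int; plain_buffer () may pass by accident, which is exactly what an optimizer cannot rely on.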

>> So should we try to find the base type of the vla, and use that, or use the
>> nonstandard char type?
> 
> I don't think we can reliably find the base type of the vla - well,
> in practice we may because we control how we lower VLAs during
> gimplification, but nothing in the IL constraints say that the
> resulting pointer type should be special.
> 
> Using a char[] decl shouldn't be a problem IMHO.
> 
> And obviously you lose the optimization we arrange with inserting
> __builtin_stack_save/restore pairs that way - stack space will no
> longer be shared for subsequent VLAs.  Which means that you'd
> better limit the size you allow this promotion.
>

 Right, I could introduce a parameter for this.
>>>
>>> I would think you could use PARAM_LARGE_STACK_FRAME for now and say,
>>> allow a size of PARAM_LARGE_STACK_FRAME / 10?
>>>
>>
>> That unfortunately is too small for the example from bug report. The default
>> value of the param is 250, so that would be a threshold of 25, and the alloca
>> size of the example is 40.  Perhaps we can try a threshold of
>> PARAM_LARGE_STACK_FRAME - estimated_stack_size or some such?
> 
> Hm.  estimated_stack_size is not O(1), so no.  I think we need to
> find a sensible way of allowing stack sharing.  Eventually Michas
> patch for introducing points-of-death would help here, if we'd
> go for folding this during stack-save/restore optimization.
> 

I changed the heuristics to this:

+  /* Heuristic: don't fold large vlas.  */
+  threshold = (unsigned HOST_WIDE_INT)PARAM_VALUE (PARAM_LARGE_STACK_FRAME);
+  /* In case a vla is declared at function scope, it has the same lifetime as a
+ declared array, so we allow a larger size.  */
+  block = gimple_block (stmt);
+  if (!(cfun->after_inlining
+&& TREE_CODE (BLOCK_SUPERCONTEXT (block)) == FUNCTION_DECL))
+threshold /= 10;
+  if (size > thres
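
A standalone model of the size heuristic above, for readers who want to try the numbers from this thread. The function name and parameters are hypothetical; at_function_scope stands for the combined condition on cfun->after_inlining and BLOCK_SUPERCONTEXT in the excerpt, and the final comparison assumes that the truncated test rejects sizes above the threshold.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdbool.h>

/* Return true if a VLA of SIZE bytes is small enough to be turned
   into a fixed-size array.  LARGE_STACK_FRAME models the value of
   PARAM_LARGE_STACK_FRAME.  */
static bool
vla_small_enough_to_fold (unsigned long size, bool at_function_scope,
                          unsigned long large_stack_frame)
{
  unsigned long threshold = large_stack_frame;

  /* A VLA in an inner scope loses stack sharing when folded, so it
     only gets a tenth of the budget.  */
  if (!at_function_scope)
    threshold /= 10;

  return size <= threshold;
}
```

With the default parameter value of 250 mentioned earlier, assert (!vla_small_enough_to_fold (40, false, 250)) and assert (vla_small_enough_to_fold (40, true, 250)) both hold: the 40-byte alloca from the bug report is folded only when it is declared at function scope.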

[PATCH, ARM] Support NEON's VABD with combine pass

2011-07-29 Thread Dmitry Melnik
This patch adds two define_insn patterns for NEON vabd instruction to 
make combine pass recognize expressions matching (vabs (vsub ...)) 
patterns as vabd.
This patch reduces code size of x264 binary from 649143 to 648343 (800 
bytes, or 0.12%) and increases its performance on average by 2.5% on 
plain C version of x264 with -O2 -ftree-vectorize.
On SPEC2K it didn't make any difference -- all vabs instructions found 
in SPEC2K binaries are either using .f64 mode or scalar .f32 which are 
not supported by NEON's vabd.
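
For reference, a scalar sketch of what the combined pattern computes per lane in the floating-point case. The function name is invented for this note, and the code is a plain C model rather than NEON intrinsics.

```c
#include <assert.h>
#include <stddef.h>

/* Lane-wise absolute difference: the subtract-then-abs sequence that
   the new patterns let combine express as a single operation.  */
static void
abs_diff_f32 (const float *a, const float *b, float *out, size_t lanes)
{
  size_t i;

  assert (a != NULL && b != NULL && out != NULL);
  for (i = 0; i < lanes; i++)
    {
      float diff = a[i] - b[i];              /* the vsub step */
      out[i] = diff < 0.0f ? -diff : diff;   /* the vabs step */
    }
}
```

With the inputs used in the test case below (10 and 30 in every lane), each output lane is 20.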

Regtested with QEMU.

Ok for trunk?


--
Best regards,
   Dmitry

2011-07-21  Sevak Sargsyan 

* config/arm/neon.md (neon_vabd_2, neon_vabd_3): New define_insn patterns for combine.

gcc/testsuite:

* gcc.target/arm/neon-combine-sub-abs-into-vabd.c: New test.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index a8c1b87..f457365 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -5607,3 +5607,32 @@
   emit_insn (gen_neon_vec_pack_trunc_ (operands[0], tempreg));
   DONE;
 })
+
+(define_insn "neon_vabd_2"
+ [(set (match_operand:VDQ 0 "s_register_operand" "=w")
+   (abs:VDQ (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
+   (match_operand:VDQ 2 "s_register_operand" "w"))))]
+ "TARGET_NEON"
+ "vabd. %0, %1, %2"
+ [(set (attr "neon_type")
+   (if_then_else (ne (symbol_ref "") (const_int 0))
+ (if_then_else (ne (symbol_ref "") (const_int 0))
+   (const_string "neon_fp_vadd_ddd_vabs_dd")
+   (const_string "neon_fp_vadd_qqq_vabs_qq"))
+ (const_string "neon_int_5")))]
+)
+
+(define_insn "neon_vabd_3"
+ [(set (match_operand:VDQ 0 "s_register_operand" "=w")
+   (abs:VDQ (unspec:VDQ [(match_operand:VDQ 1 "s_register_operand" "w")
+ (match_operand:VDQ 2 "s_register_operand" "w")]
+ UNSPEC_VSUB)))]
+ "TARGET_NEON"
+ "vabd. %0, %1, %2"
+ [(set (attr "neon_type")
+   (if_then_else (ne (symbol_ref "") (const_int 0))
+ (if_then_else (ne (symbol_ref "") (const_int 0))
+   (const_string "neon_fp_vadd_ddd_vabs_dd")
+   (const_string "neon_fp_vadd_qqq_vabs_qq"))
+ (const_string "neon_int_5")))]
+)
diff --git a/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
new file mode 100644
index 000..aae4117
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-combine-sub-abs-into-vabd.c
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -funsafe-math-optimizations" } */
+/* { dg-add-options arm_neon } */
+
+#include 
+float32x2_t f_sub_abs_to_vabd_32()
+{
+
+   float32x2_t val1 = vdup_n_f32 (10); 
+   float32x2_t val2 = vdup_n_f32 (30);
+   float32x2_t sres = vsub_f32(val1, val2);
+   float32x2_t res = vabs_f32 (sres); 
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.f32" } }*/
+
+#include 
+int8x8_t sub_abs_to_vabd_8()
+{
+   
+   int8x8_t val1 = vdup_n_s8 (10); 
+int8x8_t val2 = vdup_n_s8 (30);
+int8x8_t sres = vsub_s8(val1, val2);
+int8x8_t res = vabs_s8 (sres); 
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.s8" } }*/
+
+int16x4_t sub_abs_to_vabd_16()
+{
+   
+   int16x4_t val1 = vdup_n_s16 (10); 
+int16x4_t val2 = vdup_n_s16 (30);
+int16x4_t sres = vsub_s16(val1, val2);
+int16x4_t res = vabs_s16 (sres); 
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.s16" } }*/
+
+int32x2_t sub_abs_to_vabd_32()
+{
+
+int32x2_t val1 = vdup_n_s32 (10);
+int32x2_t val2 = vdup_n_s32 (30);
+int32x2_t sres = vsub_s32(val1, val2);
+int32x2_t res = vabs_s32 (sres);
+
+   return res;
+}
+/* { dg-final { scan-assembler "vabd\.s32" } }*/


Re: [Patch,AVR]: PR49687 (better widening 32-bit mul)

2011-07-29 Thread Georg-Johann Lay
Richard Henderson wrote:
> On 07/27/2011 06:21 AM, Georg-Johann Lay wrote:
>> +(define_insn_and_split "*mulsi3"
>> +  [(set (match_operand:SI 0 "pseudo_register_operand"  
>> "=r")
>> +(mult:SI (match_operand:SI 1 "pseudo_register_operand"  
>> "r")
>> + (match_operand:SI 2 "pseudo_register_or_const_int_operand" 
>> "rn")))
>> +   (clobber (reg:DI 18))]
>> +  "AVR_HAVE_MUL && !reload_completed"
>> +  { gcc_unreachable(); }
>> +  "&& 1"
>> +  [(set (reg:SI 18)
>> +(match_dup 1))
> 
> That seems like it's guaranteed to force an unnecessary move.
> Have you tried defining special-purpose register classes to
> force reload to move the data into the right hard regs?
> 
> E.g.  "Y" prefix
>   "QHS" size
>   two digit starting register number, as needed.
> 
> You'll probably end up with quite a few register classes 
> out of this, but hopefully reload can do a better job than
> you can manually...
> 
> 
> r~

Waahh, I introduced register classes and constraints to tell the register
allocator what the intention of the insns is. The "..." parts just deal with
CONST_INTs and are not needed for the rest of this discussion:

(define_expand "mulsi3"
  [(parallel [(set (match_operand:SI 0 "register_operand" "")
   (mult:SI (match_operand:SI 1 "register_operand" "")
(match_operand:SI 2 "nonmemory_operand" "")))
  (clobber (reg:HI 26))])]
  "AVR_HAVE_MUL"
  {
 ...
  })

(define_insn_and_split "*mulsi3"
  [(set (match_operand:SI 0 "register_operand"   "=RS22")
(mult:SI (match_operand:SI 1 "register_operand"  "%RS22")
 (match_operand:SI 2 "nonmemory_operand"  "RS18")))
   (clobber (reg:HI 26))]
  "AVR_HAVE_MUL"
  "%~call __mulsi3"
  "&& !reload_completed"
  [(clobber (const_int 0))]
  {
...
FAIL;
  }
  [(set_attr "type" "xcall")
   (set_attr "cc" "clobber")])


Again, I used the simple test case from above:

long mul (long a, long b)
{
return a*b;
}

long mul2 (long a, long b)
{
return b*a;
}

Compiled with -Os -mmcu=atmega8 -fno-split-wide-types:

mul:
/* prologue: function */
rcall __mulsi3   ;  7   *mulsi3 [length = 1]
/* epilogue start */
ret  ;  21  return  [length = 1]

mul2:
push r8  ;  22  *pushqi/1   [length = 1]
push r9  ;  23  *pushqi/1   [length = 1]
push r10 ;  24  *pushqi/1   [length = 1]
push r11 ;  25  *pushqi/1   [length = 1]
push r12 ;  26  *pushqi/1   [length = 1]
push r13 ;  27  *pushqi/1   [length = 1]
push r14 ;  28  *pushqi/1   [length = 1]
push r15 ;  29  *pushqi/1   [length = 1]
push r28 ;  30  *pushqi/1   [length = 1]
push r29 ;  31  *pushqi/1   [length = 1]
rcall .  ;  35  *addhi3_sp_R_pc2[length = 2]
rcall .
in r28,__SP_L__  ;  36  *movhi_sp/2 [length = 2]
in r29,__SP_H__
/* prologue: function */
/* frame size = 4 */
/* stack size = 14 */
.L__stack_usage = 14
movw r12,r22 ;  2   *movsi/1[length = 2]
movw r14,r24
movw r24,r20 ;  19  *movsi/1[length = 2]
movw r22,r18
movw r20,r14 ;  21  *movsi/1[length = 2]
movw r18,r12
rcall __mulsi3   ;  7   *mulsi3 [length = 1]
/* epilogue start */
pop __tmp_reg__  ;  41  *addhi3_sp_R_pc2[length = 4]
pop __tmp_reg__
pop __tmp_reg__
pop __tmp_reg__
pop r29  ;  42  popqi   [length = 1]
pop r28  ;  43  popqi   [length = 1]
pop r15  ;  44  popqi   [length = 1]
pop r14  ;  45  popqi   [length = 1]
pop r13  ;  46  popqi   [length = 1]
pop r12  ;  47  popqi   [length = 1]
pop r11  ;  48  popqi   [length = 1]
pop r10  ;  49  popqi   [length = 1]
pop r9   ;  50  popqi   [length = 1]
pop r8   ;  51  popqi   [length = 1]
ret  ;  52  return_from_epilogue[length = 1]


With -fsplit-wide-types (which is on by default) the code is even
worse and the first function inflates to unacceptable code, too.

Using constraints "=RS22,%0,RS18" instead of "=RS22,%RS22,RS18"
the code of the second function is a bit better:

mul2:
push r28 ;  20  *pushqi/1   [length = 1]
push r29 ;  21  *pushqi/1   [length = 1]
rcall .  ;  25  *addhi3_sp_R_pc2[length = 2]
rcall .
in r28,__SP_L__  ;  26  *movhi_sp/2 [length = 2]
in r29,__SP_H__
/* prologue: function */
/* frame size = 4 */
/* stack size = 6 */
.L__stack_usage = 6
std Y+1,r22  ;  2   *movsi/4[length = 4]
std Y+2,r23
std Y+3,r24
std Y+4,r25
movw r24,r20 ;  3   *movsi/1[length = 2]
movw r22,r18
ldd r18,Y+1  ;  19  *movsi/3[length = 4]
l

[patch tree-optimization]: Fix for PR/49806

2011-07-29 Thread Kai Tietz
Hello,

this patch fixes the regression reported in PR middle-end/49806.  It was caused
by type-cast patterns that went unhandled after the boolification of compares
and the preservation of type casts from/to boolean types.
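
The rewrites this patch relies on are only valid when the operands are known to be 0 or 1, which is what the range check is for. Below is a small standalone check of the six identities listed in the comment above simplify_converted_bool_expr_using_ranges further down. The function names are invented for this note, and C's logical negation is used to model the bitwise NOT on a one-bit boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if all six rewrites give the same value as the
   original boolean-typed expression for OP1 and OP2.  */
static bool
rewrites_hold (int op1, int op2)
{
  return (int) !(bool) op1 == (op1 ^ 1)                      /* case 1 */
         && (int) ((bool) op1 & (bool) op2) == (op1 & op2)   /* case 2 */
         && (int) ((bool) op1 | (bool) op2) == (op1 | op2)   /* case 3 */
         && (int) ((bool) op1 ^ (bool) op2) == (op1 ^ op2)   /* case 4 */
         && (int) (op1 == 0) == (op1 ^ 1)                    /* case 5 */
         && (int) (op1 != 0) == op1;                         /* case 6 */
}

/* Exhaustively confirm the identities over the boolean range.  */
static void
self_check (void)
{
  int a, b;

  for (a = 0; a <= 1; a++)
    for (b = 0; b <= 1; b++)
      assert (rewrites_hold (a, b));
}
```

Outside the range the identities break: rewrites_hold (2, 0) is false, because the negated boolean is 0 while 2 ^ 1 is 3.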


ChangeLog

2011-07-29  Kai Tietz  

PR middle-end/49806
* tree-vrp.c (has_operand_boolean_range): Helper function.
(simplify_truth_ops_using_ranges): Factored out code pattern
into new has_operand_boolean_range function and use it.
(simplify_converted_bool_expr_using_ranges): New function.
(simplify_stmt_using_ranges): Add new simplification function
call.

* gcc.dg/tree-ssa/vrp47.c: Remove dom-dump and adjusted
scan test for vrp result.

Bootstrapped and regression tested for all languages (+ Ada, Obj-C++) on host
x86_64-pc-linux-gnu.  OK to apply?

Regards,
Kai

Index: gcc-head/gcc/tree-vrp.c
===
--- gcc-head.orig/gcc/tree-vrp.c
+++ gcc-head/gcc/tree-vrp.c
@@ -6747,15 +6747,46 @@ varying:
   return SSA_PROP_VARYING;
 }
 
+/* Returns true if operand OP has either a one-bit type precision
+   or a value range between zero and one.  Otherwise false is
+   returned.  *PSOP is set to true if a sign overflow occurs during
+   the range check.  PSOP may be NULL.  */
+static bool
+has_operand_boolean_range (tree op, bool *psop)
+{
+  tree val = NULL;
+  value_range_t *vr;
+  bool sop = false;
+
+  if (TYPE_PRECISION (TREE_TYPE (op)) == 1)
+{
+  if (psop)
+*psop = false;
+  return true;
+}
+  if (TREE_CODE (op) != SSA_NAME)
+return false;
+  vr = get_value_range (op);
+
+  val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
+  if (!val || !integer_onep (val))
+return false;
+
+  val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
+  if (!val || !integer_onep (val))
+return false;
+  if (psop)
+*psop = sop;
+  return true;
+}
+
 /* Simplify boolean operations if the source is known
to be already a boolean.  */
 static bool
 simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
 {
   enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
-  tree val = NULL;
   tree op0, op1;
-  value_range_t *vr;
   bool sop = false;
   bool need_conversion;
 
@@ -6763,20 +6794,8 @@ simplify_truth_ops_using_ranges (gimple_
   gcc_assert (rhs_code == EQ_EXPR || rhs_code == NE_EXPR);
 
   op0 = gimple_assign_rhs1 (stmt);
-  if (TYPE_PRECISION (TREE_TYPE (op0)) != 1)
-{
-  if (TREE_CODE (op0) != SSA_NAME)
-   return false;
-  vr = get_value_range (op0);
-
-  val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
-  if (!val || !integer_onep (val))
-return false;
-
-  val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
-  if (!val || !integer_onep (val))
-return false;
-}
+  if (!has_operand_boolean_range (op0, &sop))
+return false;
 
   op1 = gimple_assign_rhs2 (stmt);
 
@@ -6802,17 +6821,8 @@ simplify_truth_ops_using_ranges (gimple_
   if (rhs_code == EQ_EXPR)
return false;
 
-  if (TYPE_PRECISION (TREE_TYPE (op1)) != 1)
-   {
- vr = get_value_range (op1);
- val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
- if (!val || !integer_onep (val))
-   return false;
-
- val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
- if (!val || !integer_onep (val))
-   return false;
-   }
+  if (!has_operand_boolean_range (op1, &sop))
+return false;
 }
 
   if (sop && issue_strict_overflow_warning (WARN_STRICT_OVERFLOW_MISC))
@@ -7320,6 +7330,126 @@ simplify_switch_using_ranges (gimple stm
   return false;
 }
 
+/* Simplify an integral boolean-typed casted expression for the
+   following cases:
+   1) (type) ~ (bool) op1 -> op1 ^ 1
+   2) (type) ((bool)op1[0..1] & (bool)op2[0..1]) -> op1 & op2
+   3) (type) ((bool)op1[0..1] | (bool)op2[0..1]) -> op1 | op2
+   4) (type) ((bool)op1[0..1] ^ (bool)op2[0..1]) -> op1 ^ op2
+   5) (type) (val[0..1] == 0) -> val ^ 1
+   6) (type) (val[0..1] != 0) -> val
+
+   Assuming op1 and op2 having type TYPE.  */
+
+static bool
+simplify_converted_bool_expr_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
+{
+  tree finaltype, expr, op1, op2 = NULL_TREE;
+  gimple def;
+  enum tree_code expr_code;
+
+  finaltype = TREE_TYPE (gimple_assign_lhs (stmt));
+  if (!INTEGRAL_TYPE_P (finaltype))
+return false;
+  expr = gimple_assign_rhs1 (stmt);
+
+  /* Check that cast is from a boolean-typed expression.  */
+  if (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE)
+return false;
+  /* Check for assignment.  */
+  def = SSA_NAME_DEF_STMT (expr);
+  if (!is_gimple_assign (def))
+return false;
+
+  expr_code = gimple_assign_rhs_code (def);
+
+  op1 = gimple_assign_rhs1 (def);
+
+  switch (expr_code)
+{
+/* (TY

Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-29 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:31 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:30 PM, H.J. Lu  wrote:
>
>> TP is 32bit in x32.  For load_tp_x32, we load SImode value and
>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>> must take SImode TP.
>>
>
> I will see what I can do.
>

 Here is the updated patch to use 32bit TP for x32.
>>>
>>> Why??
>>>
>>> This part makes no sense:
>>>
>>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>>> +  if (ptr_mode != Pmode)
>>> +    tp = convert_to_mode (Pmode, tp, 1);
>>>
>>> You will create zero_extend (unspec ...), that won't be matched by any 
>>> pattern.
>>
>> No.  I created zero_extend from (reg:SI) to (reg:DI).
>>
>>> Can you please explain, how is this pattern different than DImode
>>> pattern, proposed in my patch?
>>>
>>> +(define_insn "*load_tp_x32"
>>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>>> +       (unspec:SI [(const_int 0)] UNSPEC_TP))]
>>> +  "TARGET_X32"
>>> +  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
>>> +  [(set_attr "type" "imov")
>>> +   (set_attr "modrm" "0")
>>> +   (set_attr "length" "7")
>>> +   (set_attr "memory" "load")
>>> +   (set_attr "imm_disp" "false")])
>>>
>>> vs:
>>>
>>> +(define_insn "*load_tp_x32"
>>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>>> +       (unspec:DI [(const_int 0)] UNSPEC_TP))]
>>
>> That is wrong since source (TP)  is 32bit.  This pattern tells compiler
>> source is 64bit.
>
> Where?
>

Here is the revised patch.  The difference is I changed *add_tp_x32 to SImode.
For

---
extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec")));

int *
__errno_location (void)
{
  return &__libc_errno;
}
---

compiled with -mx32 -O2 -fPIC  DImode *add_tp_x32 generates:

	movq	__libc_errno@gottpoff(%rip), %rax
	addl	%fs:0, %eax
	mov	%eax, %eax
	ret

SImode *add_tp_x32 generates:

	movl	%fs:0, %eax
	addl	__libc_errno@gottpoff(%rip), %eax
	ret
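
A small C model of the two sequences above, as a sketch only.  It assumes
the usual x86-64 rule that a write to a 32-bit register clears the upper
32 bits of the full register; the function names are made up for
illustration and are not part of the patch.  Both variants end up with
the same zero-extended address, which is why the SImode pattern can drop
the extra mov.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the DImode *add_tp_x32 sequence: the offset arrives in a
   64-bit register, the add is a 32-bit addl, and "mov %eax, %eax"
   zero-extends the result.  */
static uint64_t
tls_address_dimode (uint64_t got_offset, uint32_t tp)
{
  uint32_t eax = (uint32_t) got_offset;	/* low half of %rax */
  eax += tp;				/* addl %fs:0, %eax */
  return (uint64_t) eax;		/* mov %eax, %eax */
}

/* Model of the SImode *add_tp_x32 sequence: load the 32-bit thread
   pointer, then add the 32-bit offset.  The 32-bit write already
   leaves the upper half of the register zero.  */
static uint64_t
tls_address_simode (uint32_t got_offset, uint32_t tp)
{
  uint32_t eax = tp;			/* movl %fs:0, %eax */
  eax += got_offset;			/* addl offset, %eax */
  return (uint64_t) eax;
}
```

With a negative offset such as 0xfffffff0 the 32-bit add wraps around,
and both models still agree on the resulting address.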

OK for trunk?

Thanks.

-- 
H.J.
---
2011-07-28  Uros Bizjak  
H.J. Lu  

PR target/47715
* config/i386/i386.md (*load_tp_x32):  New.
(*add_tp_x32): Likewise.
(*load_tp_<mode>): Disabled for TARGET_X32.
(*add_tp_<mode>): Likewise.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f33b8a0..7658522 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12444,10 +12452,21 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])
 
 ;; Load and add the thread base pointer from %<tp_seg>:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
 	(unspec:P [(const_int 0)] UNSPEC_TP))]
-  ""
+  "!TARGET_X32"
   "mov{<imodesuffix>}\t{%%<tp_seg>:0, %0|%0, <iptrsize> PTR <tp_seg>:0}"
   [(set_attr "type" "imov")
(set_attr "modrm" "0")
@@ -12455,12 +12474,25 @@
(set_attr "memory" "load")
(set_attr "imm_disp" "false")])
 
+(define_insn "*add_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(plus:SI (unspec:SI [(const_int 0)] UNSPEC_TP)
+		 (match_operand:SI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
 	(plus:P (unspec:P [(const_int 0)] UNSPEC_TP)
 		(match_operand:P 1 "register_operand" "0")))
(clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_X32"
   "add{<imodesuffix>}\t{%%<tp_seg>:0, %0|%0, <iptrsize> PTR <tp_seg>:0}"
   [(set_attr "type" "alu")
(set_attr "modrm" "0")


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Paolo Carlini

Hi,

I think -Wunused and -Wall should imply -Wunused-local-typedefs unless
the user specifies -Wno-unused-local-typedefs.

I actually first tried this (actually adding it to -Wall -extra and
-Wunused) and found out the following issue.

A typedef can be defined in a macro in a system header, be expanded in a
function and not be used by the function.  In this case we shouldn't
warn, but PR preprocessor/7263 makes us warn nonetheless.  There are
many spots of that kind in the libstdc++ test suite.
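
To make the case concrete, here is a sketch of the kind of code that
trips the warning.  The macro name and the typedef name are hypothetical;
the pattern (a compile-time check implemented as a typedef of an array
type) is an assumption about what such a header macro could look like,
not a quote from any particular system header.

```c
/* Hypothetical compile-time assertion macro, of the sort a system
   header might provide.  It declares an array typedef whose size is
   invalid (-1) when COND is false.  */
#define MY_STATIC_ASSERT(cond, name) typedef char name[(cond) ? 1 : -1]

int
size_checked_add (int a, int b)
{
  /* Expands to a local typedef that nothing in this function refers
     to, so a warning on unused local typedefs would fire here even
     though the user never wrote the typedef.  */
  MY_STATIC_ASSERT (sizeof (int) >= 2, int_is_at_least_16_bits);
  return a + b;
}
```

The function itself is fine; the only candidate for a diagnostic is the
typedef produced by the macro expansion.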

Paolo Carlini  writes:


Hi,

I think -Wunused and -Wall should imply -Wunused-local-typedefs
unless the user specifies -Wno-unused-local-typedefs.

IMHO, this is a very good idea looking forward, but then I think we
should make sure the warning plays well with system headers either
as-is or together with some other pending work of Dodji. In
particular, as I probably mentioned already in the trail, we really
want to double check that debug-mode does not trigger warnings; I'm a
bit worried because many people use and like it.

Exactly.  This would be a side effect of PR preprocessor/7263?

So do you guys think we should add it nonetheless and just add
-Wno-unused-local-typedefs to the tests that exhibit the above issue
before fixing PR preprocessor/7263?
Personally, I don't have a strong opinion, but I think it's very 
important to have a solid plan for 4.7.0: we don't want to regress in 
terms of warnings spilled from system headers, we slowly made good 
progress over the years and now the situation is pretty good and much 
less confusing than it used to be to the users. For sure, anyway, I'm 
available to clean up a bit some of the warnings emitted by the library, 
if that can help the process.


Paolo.


Re: [PATCH 4/6] Shrink-wrapping

2011-07-29 Thread Bernd Schmidt
On 07/29/11 00:31, Richard Earnshaw wrote:

> This causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49891

Fixed with this.


Bernd
Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 176904)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-07-29  Bernd Schmidt  
+
+   PR rtl-optimization/49891
+   * cfgrtl.c (force_nonfallthru_and_redirect): Set JUMP_LABEL for
+   newly created returnjumps.
+
 2011-07-28  DJ Delorie  
 
* expr.c (expand_expr_addr_expr_1): Detect a user request for a
Index: gcc/cfgrtl.c
===
--- gcc/cfgrtl.c(revision 176881)
+++ gcc/cfgrtl.c(working copy)
@@ -1254,6 +1254,7 @@ force_nonfallthru_and_redirect (edge e,
 {
 #ifdef HAVE_return
emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
+   JUMP_LABEL (BB_END (jump_block)) = ret_rtx;
 #else
gcc_unreachable ();
 #endif


Re: [PATCH] Fix PR48648: Handle CLAST assignments.

2011-07-29 Thread Sebastian Pop
Hi Tobi,

On Thu, Jul 28, 2011 at 12:13, Tobias Grosser  wrote:
>> +  struct clast_user_stmt *body
>> +    = clast_get_body_of_loop ((struct clast_stmt *) stmt);
>
> I am not a big fan of using clast_get_body_of_loop as it is buggy.
> Introducing new uses of it is not something I would support. Do we really
> need this?

No, because of ...

>
>> +  poly_bb_p pbb = (poly_bb_p) cloog_statement_usr (body->statement);
>
> What about some more meaningful names like bound_one, bound_two?

Ok, see the second patch attached.

>> +
>> +  compute_bounds_for_level (pbb, level, v1, v2);
>
> Mh. I do not completely understand all the code. But can't we get v1 and v2
> set without the need for the compute_bounds_for_level function? Isn't
> type_for_clast_expression setting them?
>

... this.
You are right.  type_for_clast_expr would provide the bounds for the
RHS of the assign and so we don't need to compute the bounds on
the loop level, as we would have done on a real loop.  Attached the
amended patch.  I'm regstrapping these patches on amd64-linux.
Ok for trunk after?

Thanks for your review!
Sebastian
From 51223eba857c030659bb0e592f186d329ca85a1c Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Fri, 22 Jul 2011 17:51:14 -0500
Subject: [PATCH 1/2] Fix PR48648: Handle CLAST assignments.

The CLAST produced by CLooG-ISL contains an assignment and GCC chokes
on it.  The exact CLAST contains an assignment followed by an if:

scat_1 = max(0,ceild(T_4-7,8));
if (scat_1 <= min(1,floord(T_4-1,8))) {
  S7(scat_1);
}

This is equivalent to a loop that iterates only once, and so CLooG
generates an assignment followed by an if instead of a loop.  This is
an important optimization that was improved in ISL, that allows
if-conversion: imagine GCC having to figure out that a loop like the
following actually iterates only once, and can be converted to an if:

for (scat_1 = max(0,ceild(T_4-7,8)); scat_1 <= min(1,floord(T_4-1,8)); scat_1++)
  S7(scat_1);

This patch implements the translation of CLAST assignments.
Bootstrapped and tested on amd64-linux.
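
The equivalence claimed above can be checked with a small standalone C
sketch.  The helpers floord, ceild, imin and imax below are local
stand-ins written for this example (assuming a positive divisor), not
the CLooG definitions; the two functions count how often S7 would run in
the loop form and in the assignment-plus-if form.

```c
/* Floor and ceiling division for a positive divisor D.  */
static int floord (int n, int d) { return n >= 0 ? n / d : -((-n + d - 1) / d); }
static int ceild (int n, int d) { return -floord (-n, d); }
static int imax (int a, int b) { return a > b ? a : b; }
static int imin (int a, int b) { return a < b ? a : b; }

/* Loop form: number of times S7 would execute.  */
static int
runs_loop_form (int T_4)
{
  int runs = 0;
  int scat_1;
  for (scat_1 = imax (0, ceild (T_4 - 7, 8));
       scat_1 <= imin (1, floord (T_4 - 1, 8)); scat_1++)
    runs++;
  return runs;
}

/* Assignment followed by an if, as in the CLAST shown above.  */
static int
runs_assign_if_form (int T_4)
{
  int scat_1 = imax (0, ceild (T_4 - 7, 8));
  if (scat_1 <= imin (1, floord (T_4 - 1, 8)))
    return 1;
  return 0;
}
```

For every T_4 the two forms agree and S7 runs at most once: the bounds
0 and 1 would both be reachable only if T_4 <= 7 and T_4 >= 9 held at
the same time.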

Sebastian

2011-07-22  Sebastian Pop  

	PR middle-end/48648
	* graphite-clast-to-gimple.c (clast_get_body_of_loop): Handle
	CLAST assignments.
	(translate_clast): Same.
	(translate_clast_assignment): New.

	* gcc.dg/graphite/id-pr48648.c: New.
---
 gcc/ChangeLog  |8 +
 gcc/graphite-clast-to-gimple.c |   45 
 gcc/testsuite/ChangeLog|5 +++
 gcc/testsuite/gcc.dg/graphite/id-pr48648.c |   21 +
 4 files changed, 79 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/id-pr48648.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a565c18..1742a85 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2011-07-22  Sebastian Pop  
+
+	PR middle-end/48648
+	* graphite-clast-to-gimple.c (clast_get_body_of_loop): Handle
+	CLAST assignments.
+	(translate_clast): Same.
+	(translate_clast_assignment): New.
+
 2011-07-27  Sebastian Pop  
 
 	PR tree-optimization/49876
diff --git a/gcc/graphite-clast-to-gimple.c b/gcc/graphite-clast-to-gimple.c
index a911eb6..7bb1d23 100644
--- a/gcc/graphite-clast-to-gimple.c
+++ b/gcc/graphite-clast-to-gimple.c
@@ -816,6 +816,9 @@ clast_get_body_of_loop (struct clast_stmt *stmt)
   if (CLAST_STMT_IS_A (stmt, stmt_block))
 return clast_get_body_of_loop (((struct clast_block *) stmt)->body);
 
+  if (CLAST_STMT_IS_A (stmt, stmt_ass))
+return clast_get_body_of_loop (stmt->next);
+
   gcc_unreachable ();
 }
 
@@ -1125,6 +1128,44 @@ translate_clast_for (loop_p context_loop, struct clast_for *stmt, edge next_e,
   return last_e;
 }
 
+/* Translates a clast assignment STMT to gimple.
+
+   - NEXT_E is the edge where new generated code should be attached.
+   - BB_PBB_MAPPING is a basic_block and its related poly_bb_p mapping.  */
+
+static edge
+translate_clast_assignment (struct clast_assignment *stmt, edge next_e,
+			int level, ivs_params_p ip)
+{
+  gimple_seq stmts;
+  mpz_t v1, v2;
+  tree type, new_name, var;
+  edge res = single_succ_edge (split_edge (next_e));
+  struct clast_expr *expr = (struct clast_expr *) stmt->RHS;
+
+  mpz_init (v1);
+  mpz_init (v2);
+  type = type_for_clast_expr (expr, ip, v1, v2);
+  var = create_tmp_var (type, "graphite_var");
+  new_name = force_gimple_operand (clast_to_gcc_expression (type, expr, ip),
+   &stmts, true, var);
+  add_referenced_var (var);
+  if (stmts)
+{
+  gsi_insert_seq_on_edge (next_e, stmts);
+  gsi_commit_edge_inserts ();
+}
+
+  save_clast_name_index (ip->newivs_index, stmt->LHS,
+			 VEC_length (tree, *(ip->newivs)), level, v1, v2);
+  VEC_safe_push (tree, heap, *(ip->newivs), new_name);
+
+  mpz_clear (v1);
+  mpz_clear (v2);
+
+  return res;
+}
+
 /* Translates a clast guard statement STMT to gimple.
 
- NEXT_E is the edge where new generated code should be attached.
@@ -1175,6 +1216,10 @@ translate_clast (loop_p cont

Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-29 Thread H.J. Lu
On Thu, Jul 28, 2011 at 9:00 AM, H.J. Lu  wrote:
> On Thu, Jul 28, 2011 at 8:59 AM, H.J. Lu  wrote:
>> On Thu, Jul 28, 2011 at 7:59 AM, Uros Bizjak  wrote:
>>> On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:
>>>
> In x32, thread pointer is 32bit and choice of segment register for the
> thread base ptr load should be based on TARGET_64BIT.  This patch
> implements it.  OK for trunk?

 -ENOTESTCASE.

>>>
>>> There is no standalone testcase.  The symptom is in glibc build, I
>>> got
>>>
>>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>>  -E -x c-header'
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>>> --library-path 
>>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>>> ../scripts -h rpcsvc/yppasswd.x -o
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>>> Segmentation fault
>>> make[5]: *** Waiting for unfinished jobs
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>>> Segmentation fault
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>>> Segmentation fault
>>>
>>> since thread pointer is 32bit in x32.
>>>
>>
>> If we load thread pointer (fs segment register) in x32 with 64bit
>> load, the upper 32bits are garbage.
>> We must load 32bit
>
> So, instead of huge complications with new mode iterator, just
> introduce two new patterns that will shadow existing ones for
> TARGET_X32.
>
> Like in attached (untested) patch.
>

 I tried the following patch with typos fixed.  It almost worked,
 except for this failure in glibc testsuite:

 gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
 I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
 -c -f $charmap -i $input ${common_objpfx}localedata/$out
 Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
 failed
 make[4]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
 Error 1

 I will add:

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 8723dc5..d32d64d 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
  {
   rtx tp, reg, insn;

 -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
 +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
 +  if (ptr_mode != Pmode)
 +    tp = convert_to_mode (Pmode, tp, 1);
   if (!to_reg)
     return tp;

 since TP must be 32bit.
>>>
>>> No, this won't have the desired effect. It will change the UNSPEC, so
>>> it won't match patterns in i386.md.
>>>
>>> Can you debug the failure a bit more? With my patterns, add{l} and
>>> mov{l} should clear top 32bits.
>>>
>>
>> TP is 32bit in x32.  For load_tp_x32, we load SImode value and
>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>> must take SImode TP.
>>
>
> I will see what I can do.
>

Here is the updated patch to use 32bit TP for x32.  OK for trunk?

Thanks.


-- 
H.J.
-
2011-07-28  Uros Bizjak  
H.J. Lu  

PR target/47715
* config/i386/i386.c (get_thread_pointer): Use ptr_mode
instead of Pmode with UNSPEC_TP.

* config/i386/i386.md (*load_tp_x32):  New.
(*add_tp_x32): Likewise.
(*load_tp_<mode>): Disabled for TARGET_X32.
(*add_tp_<mode>): Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8723dc5..8d20849 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
 {
   rtx tp, reg, insn;
 
-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (

Re: PATCH: Fix config/i386/morestack.S for x32

2011-07-29 Thread H.J. Lu
On Thu, Jul 28, 2011 at 1:07 PM, Richard Henderson  wrote:
> On 07/28/2011 12:42 PM, H.J. Lu wrote:
>> +#ifdef __LP64__
>>       movq    %rax,%fs:0x70           # Save the new stack boundary.
>> +#else
>> +     movl    %eax,%fs:0x40           # Save the new stack boundary.
>> +#endif
>
> Please macro-ize this.
>
>

Here is the updated patch.  OK for trunk?
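
For readers less used to cpp tricks in .S files, here is a C sketch of
the same idea.  It is an illustration only: the patch pastes the
register suffix with ## inside an assembler line, while this example
uses # (stringification) so the selected instruction can be inspected as
a string.  The macro and function names are made up for the example.

```c
#include <string.h>

/* Pick the store instruction by data model, as the patch does:
   LP64 saves a 64-bit register at %fs:0x70, otherwise (x32) a
   32-bit register at %fs:0x40.  REG is a bare suffix like ax.  */
#ifdef __LP64__
#define SAVE_NEW_STACK_BOUNDARY_TEXT(reg) "movq %r" #reg ",%fs:0x70"
#else
#define SAVE_NEW_STACK_BOUNDARY_TEXT(reg) "movl %e" #reg ",%fs:0x40"
#endif

/* The two register suffixes the patch passes to the macro.  */
static const char *
save_insn_ax (void)
{
  return SAVE_NEW_STACK_BOUNDARY_TEXT (ax);
}

static const char *
save_insn_cx (void)
{
  return SAVE_NEW_STACK_BOUNDARY_TEXT (cx);
}
```

Keeping both the register width and the TCB offset inside one macro
means each call site only names the register suffix.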

Thanks.

-- 
H.J.
---
2011-07-28  H.J. Lu  

* config/i386/morestack.S (X86_64_SAVE_NEW_STACK_BOUNDARY): New.
Use X86_64_SAVE_NEW_STACK_BOUNDARY to save the new stack boundary
for x86-64.  Properly check __x86_64__ and __LP64__.

diff --git a/libgcc/config/i386/morestack.S b/libgcc/config/i386/morestack.S
index 16279c7..b09ac76 100644
--- a/libgcc/config/i386/morestack.S
+++ b/libgcc/config/i386/morestack.S
@@ -353,7 +353,13 @@ __morestack:
 	# FIXME: The offset must match
 	# TARGET_THREAD_SPLIT_STACK_OFFSET in
 	# gcc/config/i386/linux64.h.
-	movq	%rax,%fs:0x70		# Save the new stack boundary.
+	# Macro to save the new stack boundary.
+#ifdef __LP64__
+#define X86_64_SAVE_NEW_STACK_BOUNDARY(reg)	movq	%r##reg,%fs:0x70
+#else
+#define X86_64_SAVE_NEW_STACK_BOUNDARY(reg)	movl	%e##reg,%fs:0x40
+#endif
+	X86_64_SAVE_NEW_STACK_BOUNDARY (ax)
 
 	call	__morestack_unblock_signals
 
@@ -391,7 +397,7 @@ __morestack:
 	subq	0(%rsp),%rax		# Subtract available space.
 	addq	$BACKOFF,%rax		# Back off 1024 bytes.
 .LEHE0:
-	movq	%rax,%fs:0x70		# Save the new stack boundary.
+	X86_64_SAVE_NEW_STACK_BOUNDARY (ax)
 
 	addq	$16,%rsp		# Remove values from stack.
 
@@ -433,7 +439,7 @@ __morestack:
 	movq	%rbp,%rcx		# Get the stack pointer.
 	subq	%rax,%rcx		# Subtract available space.
 	addq	$BACKOFF,%rcx		# Back off 1024 bytes.
-	movq	%rcx,%fs:0x70		# Save new stack boundary.
+	X86_64_SAVE_NEW_STACK_BOUNDARY (cx)
 	movq	(%rsp),%rdi		# Restore exception data for call.
 #ifdef __PIC__
 	call	_Unwind_Resume@PLT	# Resume unwinding.
@@ -493,7 +499,7 @@ __x86.get_pc_thunk.bx:
 	.section .data.DW.ref.__gcc_personality_v0,"awG",@progbits,DW.ref.__gcc_personality_v0,comdat
 	.type	DW.ref.__gcc_personality_v0, @object
 DW.ref.__gcc_personality_v0:
-#ifndef __x86_64
+#ifndef __LP64__
 	.align 4
 	.size	DW.ref.__gcc_personality_v0, 4
 	.long	__gcc_personality_v0
@@ -504,7 +510,7 @@ DW.ref.__gcc_personality_v0:
 #endif
 #endif
 
-#ifdef __x86_64__
+#if defined __x86_64__ && defined __LP64__
 
 # This entry point is used for the large model.  With this entry point
 # the upper 32 bits of %r10 hold the argument size and the lower 32
@@ -537,7 +543,7 @@ __morestack_large_model:
.size	__morestack_large_model, . - __morestack_large_model
 #endif
 
-#endif /* __x86_64__ */
+#endif /* __x86_64__ && __LP64__ */
 
 # Initialize the stack test value when the program starts or when a
 # new thread starts.  We don't know how large the main stack is, so we
@@ -570,7 +576,7 @@ __stack_split_initialize:
 #else /* defined(__x86_64__) */
 
 	leaq	-16000(%rsp),%rax	# We should have at least 16K.
-	movq	%rax,%fs:0x70
+	X86_64_SAVE_NEW_STACK_BOUNDARY (ax)
 	movq	%rsp,%rdi
 	movq	$16000,%rsi
 #ifdef __PIC__
@@ -592,7 +598,7 @@ __stack_split_initialize:
 
 	.section	.ctors.65535,"aw",@progbits
 
-#ifndef __x86_64__
+#ifndef __LP64__
 	.align	4
 	.long	__stack_split_initialize
 	.long	__morestack_load_mmap


Re: [PATCH] Fix PR49876: Continue code generation with integer_zero_node on gloog_error

2011-07-29 Thread Sebastian Pop
On Thu, Jul 28, 2011 at 02:58, Richard Guenther
 wrote:
> On Wed, Jul 27, 2011 at 8:49 PM, Sebastian Pop  wrote:
>> When setting gloog_error, graphite should continue code generation
>> without early returns, as otherwise the SSA representation would not
>> be complete.  So set the new expression to integer_zero_node, that
>> would not require more SSA updates, and continue code generation as
>> nothing happened.
>
> I suppose you have to watch for correct types?  Or does the code get
> discarded again before it eventually reaches the verifier?  Ok in that case.

Attached is the amended patch using build_int_cst.
Thanks for pointing out this type problem.
Regstrapping again.

Sebastian
From 1ce1367e3a0439b374025104eb2e7558c9b957fc Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Wed, 27 Jul 2011 13:42:29 -0500
Subject: [PATCH] Fix PR49876: Continue code generation with integer_zero_node on gloog_error

When setting gloog_error, graphite should continue code generation
without early returns, as otherwise the SSA representation would not
be complete.  So set the new expression to integer_zero_node, that
would not require more SSA updates, and continue code generation as
nothing happened.

Regstrapping on amd64-linux.

2011-07-27  Sebastian Pop  

	PR tree-optimization/49876
	* sese.c (rename_uses): Do not return false on gloog_error: set
	the new_expr to integer_zero_node and continue code generation.
	(graphite_copy_stmts_from_block): Remove early exit on gloog_error.
---
 gcc/ChangeLog |7 +++
 gcc/sese.c|   18 --
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b07d494..a565c18 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2011-07-27  Sebastian Pop  
 
+	PR tree-optimization/49876
+	* sese.c (rename_uses): Do not return false on gloog_error: set
+	the new_expr to integer_zero_node and continue code generation.
+	(graphite_copy_stmts_from_block): Remove early exit on gloog_error.
+
+2011-07-27  Sebastian Pop  
+
 	PR tree-optimization/49471
 	* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Build an unsigned
 	iv only when the largest type is unsigned.  Do not call
diff --git a/gcc/sese.c b/gcc/sese.c
index ec96dfb..cd92527 100644
--- a/gcc/sese.c
+++ b/gcc/sese.c
@@ -527,10 +527,10 @@ rename_uses (gimple copy, htab_t rename_map, gimple_stmt_iterator *gsi_tgt,
   if (chrec_contains_undetermined (scev))
 	{
 	  *gloog_error = true;
-	  return false;
+	  new_expr = build_int_cst (TREE_TYPE (old_name), 0);
 	}
-
-  new_expr = chrec_apply_map (scev, iv_map);
+  else
+	new_expr = chrec_apply_map (scev, iv_map);
 
   /* The apply should produce an expression tree containing
 	 the uses of the new induction variables.  We should be
@@ -540,12 +540,13 @@ rename_uses (gimple copy, htab_t rename_map, gimple_stmt_iterator *gsi_tgt,
 	  || tree_contains_chrecs (new_expr, NULL))
 	{
 	  *gloog_error = true;
-	  return false;
+	  new_expr = build_int_cst (TREE_TYPE (old_name), 0);
 	}
+  else
+	/* Replace the old_name with the new_expr.  */
+	new_expr = force_gimple_operand (unshare_expr (new_expr), &stmts,
+	 true, NULL_TREE);
 
-  /* Replace the old_name with the new_expr.  */
-  new_expr = force_gimple_operand (unshare_expr (new_expr), &stmts,
-   true, NULL_TREE);
   gsi_insert_seq_before (gsi_tgt, stmts, GSI_SAME_STMT);
   replace_exp (use_p, new_expr);
 
@@ -621,9 +622,6 @@ graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
 		   gloog_error))
 	fold_stmt_inplace (copy);
 
-  if (*gloog_error)
-	break;
-
   update_stmt (copy);
 }
 }
-- 
1.7.4.1



Re: [trans-mem] verify_types_in_gimple_seq_2 glitch

2011-07-29 Thread Patrick Marlier

Thanks for reminding me (once again) of the rules...

Bootstrapped and tested successfully with:
make check-gcc RUNTESTFLAGS=tm.exp

Changelog:
* tree-cfg.c: Fix typo.

--
Patrick Marlier

On 07/29/2011 01:14 PM, Aldy Hernandez wrote:

On 07/29/11 05:25, Patrick Marlier wrote:

In tree-cfg.c (line ~3921), there is a little glitch.

Index: tree-cfg.c
===
--- tree-cfg.c (revision 176864)
+++ tree-cfg.c (working copy)
@@ -3918,7 +3918,7 @@
break;

case GIMPLE_TRANSACTION:
- err |= verify_types_in_gimple_seq_2 (gimple_omp_body (stmt));
+ err |= verify_types_in_gimple_seq_2 (gimple_transaction_body (stmt));
break;

default:

Patrick Marlier.


Have you tested this patch?

ChangeLog entry?


[PATCH][1/2] Fix PR49806, promote/demote binary operations in VRP

2011-07-29 Thread Richard Guenther

This factors out a worker for extract_range_from_binary_expr that
operates on two value-ranges instead of two stmt operands and adjusts
predicates used as needed.  This is a prerequisite for [2/2] which
will introduce the ability to promote/demote binary operations
using range information to both fix PR49806 and to allow to do
shorten_* like transformations at a later state (thus recover from
C integer promotions for the sake of vectorization).
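
As a rough illustration of what "recover from C integer promotions"
means, consider the following sketch.  It is not taken from the patch;
the function names are invented, and the 8-bit operation is only
modelled by masking, since C itself always promotes to int.

```c
#include <stdint.h>

/* The shape the C front end produces: both operands are widened to
   int, the add happens in int, and the result is narrowed again.  */
static uint8_t
add_via_int (uint8_t a, uint8_t b)
{
  int wide = (int) a + (int) b;
  return (uint8_t) wide;
}

/* Model of the same add carried out directly in an 8-bit lane: only
   the low 8 bits are ever kept.  */
static uint8_t
add_in_8bit_lane (uint8_t a, uint8_t b)
{
  unsigned lane = ((unsigned) a + (unsigned) b) & 0xffu;
  return (uint8_t) lane;
}
```

For unsigned operands the two always agree, so the widening and
narrowing conversions carry no information.  For other type
combinations the agreement depends on the operand ranges, which is
where value-range information comes in.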

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-07-29  Richard Guenther  

* tree-vrp.c (vrp_expr_computes_nonnegative): Remove.
(value_range_nonnegative_p): New function.
(ssa_name_nonnegative_p): Use it.
(value_range_constant_singleton): New function.
(op_with_constant_singleton_value_range): Use it.
(extract_range_from_binary_expr_1): New function, split out from ...
(extract_range_from_binary_expr): ... this.  Remove fallback
constant folding done here.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 176922)
+++ gcc/tree-vrp.c  (working copy)
@@ -875,17 +875,6 @@ usable_range_p (value_range_t *vr, bool
 }
 
 
-/* Like tree_expr_nonnegative_warnv_p, but this function uses value
-   ranges obtained so far.  */
-
-static bool
-vrp_expr_computes_nonnegative (tree expr, bool *strict_overflow_p)
-{
-  return (tree_expr_nonnegative_warnv_p (expr, strict_overflow_p)
- || (TREE_CODE (expr) == SSA_NAME
- && ssa_name_nonnegative_p (expr)));
-}
-
 /* Return true if the result of assignment STMT is know to be non-negative.
If the return value is based on the assumption that signed overflow is
undefined, set *STRICT_OVERFLOW_P to true; otherwise, don't change
@@ -1404,6 +1393,25 @@ range_includes_zero_p (value_range_t *vr
   return (value_inside_range (zero, vr) == 1);
 }
 
+/* Return true if *VR is known to only contain nonnegative values.  */
+
+static inline bool
+value_range_nonnegative_p (value_range_t *vr)
+{
+  if (vr->type == VR_RANGE)
+{
+  int result = compare_values (vr->min, integer_zero_node);
+  return (result == 0 || result == 1);
+}
+  else if (vr->type == VR_ANTI_RANGE)
+{
+  int result = compare_values (vr->max, integer_zero_node);
+  return result == -1;
+}
+
+  return false;
+}
+
 /* Return true if T, an SSA_NAME, is known to be nonnegative.  Return
false otherwise or if no value range information is available.  */
 
@@ -1419,15 +1427,21 @@ ssa_name_nonnegative_p (const_tree t)
   if (!vr)
 return false;
 
-  /* Testing for VR_ANTI_RANGE is not useful here as any anti-range
- which would return a useful value should be encoded as a VR_RANGE.  */
-  if (vr->type == VR_RANGE)
-{
-  int result = compare_values (vr->min, integer_zero_node);
+  return value_range_nonnegative_p (vr);
+}
 
-  return (result == 0 || result == 1);
-}
-  return false;
+/* If *VR has a value range that is a single constant value return that,
+   otherwise return NULL_TREE.  */
+
+static tree
+value_range_constant_singleton (value_range_t *vr)
+{
+  if (vr->type == VR_RANGE
+  && operand_equal_p (vr->min, vr->max, 0)
+  && is_gimple_min_invariant (vr->min))
+return vr->min;
+
+  return NULL_TREE;
 }
 
 /* If OP has a value range with a single constant value return that,
@@ -1437,21 +1451,13 @@ ssa_name_nonnegative_p (const_tree t)
 static tree
 op_with_constant_singleton_value_range (tree op)
 {
-  value_range_t *vr;
-
   if (is_gimple_min_invariant (op))
 return op;
 
   if (TREE_CODE (op) != SSA_NAME)
 return NULL_TREE;
 
-  vr = get_value_range (op);
-  if (vr->type == VR_RANGE
-  && operand_equal_p (vr->min, vr->max, 0)
-  && is_gimple_min_invariant (vr->min))
-return vr->min;
-
-  return NULL_TREE;
+  return value_range_constant_singleton (get_value_range (op));
 }
 
 
@@ -2157,19 +2163,19 @@ zero_nonzero_bits_from_vr (value_range_t
 }
 
 
-/* Extract range information from a binary expression EXPR based on
-   the ranges of each of its operands and the expression code.  */
+/* Extract range information from a binary operation CODE based on
+   the ranges of each of its operands, *VR0 and *VR1 with resulting
+   type EXPR_TYPE.  The resulting range is stored in *VR.  */
 
 static void
-extract_range_from_binary_expr (value_range_t *vr,
-   enum tree_code code,
-   tree expr_type, tree op0, tree op1)
+extract_range_from_binary_expr_1 (value_range_t *vr,
+ enum tree_code code, tree expr_type,
+ value_range_t *vr0_, value_range_t *vr1_)
 {
+  value_range_t vr0 = *vr0_, vr1 = *vr1_;
   enum value_range_type type;
   tree min, max;
   int cmp;
-  value_range_t vr0 = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL };
-  value_range_t vr1 = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL

[PATCH][2/2][RFC] Fix PR49806, promote/demote binary operations in VRP

2011-07-29 Thread Richard Guenther

This is the actual patch for fixing the binary expression cases in
PR49806 - adding unary expressions is easy but will slightly convolute
the code.

Only lightly tested so far, bootstrap and regtest pending for
x86_64-unknown-linux-gnu.

Any comments on general profitability issues?
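
The core test in range_fits_type_p below (convert both range ends to the
target precision and signedness and check that nothing changed) can be
modelled on plain 64-bit integers.  This is a simplified sketch under
stated assumptions: it uses int64_t instead of double_int, assumes a
precision between 1 and 63, relies on the usual two's-complement
conversion behavior, and the names extend_to and range_fits_type are
made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Truncate V to PRECISION bits, then zero- or sign-extend it back.
   PRECISION is assumed to be in 1..63.  */
static int64_t
extend_to (int64_t v, unsigned precision, bool unsigned_p)
{
  uint64_t mask = (1ULL << precision) - 1;
  uint64_t low = (uint64_t) v & mask;
  uint64_t sign = 1ULL << (precision - 1);

  if (unsigned_p)
    return (int64_t) low;
  /* Sign-extend: flip the sign bit, then subtract it again.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* True if every value in [MIN, MAX] survives a round trip through a
   type with the given precision and signedness.  */
static bool
range_fits_type (int64_t min, int64_t max, unsigned precision,
		 bool unsigned_p)
{
  return extend_to (min, precision, unsigned_p) == min
	 && extend_to (max, precision, unsigned_p) == max;
}
```

For example, [0, 255] fits an unsigned 8-bit type but not a signed one,
and [-128, 127] fits a signed 8-bit type.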

Thanks,
Richard.

2011-07-29  Richard Guenther  

PR tree-optimization/49806
* tree-vrp.c (range_fits_type_p): Move earlier.
(simplify_converted_operation_using_ranges): New function.
(simplify_stmt_using_ranges): Call it.

* gcc.dg/tree-ssa/vrp47.c: Adjust.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2011-07-29 14:31:59.0 +0200
--- gcc/tree-vrp.c  2011-07-29 14:32:00.0 +0200
*** simplify_switch_using_ranges (gimple stm
*** 7314,7319 
--- 7314,7532 
return false;
  }
  
+ /* Return whether the value range *VR fits in an integer type specified
+by PRECISION and UNSIGNED_P.  */
+ 
+ static bool
+ range_fits_type_p (value_range_t *vr, unsigned precision, bool unsigned_p)
+ {
+   tree src_type;
+   unsigned src_precision;
+   double_int tem;
+ 
+   /* We can only handle integral and pointer types.  */
+   src_type = TREE_TYPE (vr->min);
+   if (!INTEGRAL_TYPE_P (src_type)
+   && !POINTER_TYPE_P (src_type))
+ return false;
+ 
+   /* An extension is always fine, so is an identity transform.  */
+   src_precision = TYPE_PRECISION (TREE_TYPE (vr->min));
+   if (src_precision < precision
+   || (src_precision == precision
+ && TYPE_UNSIGNED (src_type) == unsigned_p))
+ return true;
+ 
+   /* Now we can only handle ranges with constant bounds.  */
+   if (vr->type != VR_RANGE
+   || TREE_CODE (vr->min) != INTEGER_CST
+   || TREE_CODE (vr->max) != INTEGER_CST)
+ return false;
+ 
+   /* For precision-preserving sign-changes the MSB of the double-int
+  has to be clear.  */
+   if (src_precision == precision
+   && (TREE_INT_CST_HIGH (vr->min) | TREE_INT_CST_HIGH (vr->max)) < 0)
+ return false;
+ 
+   /* Then we can perform the conversion on both ends and compare
+  the result for equality.  */
+   tem = double_int_ext (tree_to_double_int (vr->min), precision, unsigned_p);
+   if (!double_int_equal_p (tree_to_double_int (vr->min), tem))
+ return false;
+   tem = double_int_ext (tree_to_double_int (vr->max), precision, unsigned_p);
+   if (!double_int_equal_p (tree_to_double_int (vr->max), tem))
+ return false;
+ 
+   return true;
+ }
+ 
+ /* Convert an operation to the type of its converted result.  The conversion
+statement is STMT.  Return true if we performed the replacement.  */
+ 
+ static bool
+ simplify_converted_operation_using_ranges (gimple stmt)
+ {
+   tree lhs, op_result, rhs1, rhs1_def_rhs, rhs2, rhs2_def_rhs, tem, var, iptype;
+   enum tree_code op_code;
+   gimple op, rhs1_def, rhs2_def, newop;
+   gimple_stmt_iterator gsi;
+   value_range_t rhs1_vr = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL };
+   value_range_t rhs2_vr = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL };
+   value_range_t res_vr = { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL };
+   unsigned orig_precision, precision;
+   bool orig_unsigned_p, unsigned_p;
+ 
+   lhs = gimple_assign_lhs (stmt);
+   op_result = gimple_assign_rhs1 (stmt);
+   if (TREE_CODE (op_result) != SSA_NAME
+   || !has_single_use (op_result))
+ return false;
+   op = SSA_NAME_DEF_STMT (op_result);
+   if (!is_gimple_assign (op))
+ return false;
+   op_code = gimple_assign_rhs_code (op);
+   if (TREE_CODE_CLASS (op_code) != tcc_binary)
+ return false;
+   rhs1 = gimple_assign_rhs1 (op);
+   rhs2 = gimple_assign_rhs2 (op);
+   if (TREE_CODE (rhs1) != SSA_NAME
+   || (TREE_CODE (rhs2) != SSA_NAME
+ && TREE_CODE (rhs2) != INTEGER_CST))
+ return false;
+   rhs1_def = SSA_NAME_DEF_STMT (rhs1);
+   if (!has_single_use (rhs1)
+   || !is_gimple_assign (rhs1_def)
+   || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (rhs1_def)))
+ return false;
+   rhs1_def_rhs = gimple_assign_rhs1 (rhs1_def);
+   if (TREE_CODE (rhs2) == SSA_NAME)
+ {
+   rhs2_def = SSA_NAME_DEF_STMT (rhs2);
+   if (!has_single_use (rhs2)
+ || !is_gimple_assign (rhs2_def)
+ || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (rhs2_def)))
+   return false;
+   rhs2_def_rhs = gimple_assign_rhs1 (rhs2_def);
+ }
+   else
+ rhs2_def_rhs = rhs2;
+ 
+   /* Now we have matched the statement pattern
+ 
+rhs1 = (T1)x;
+rhs2 = (T1)y;
+op_result = rhs1 OP rhs2;
+lhs = (T2)op_result;
+ 
+  We want to compute rhs1 OP rhs2 in type T2 to get rid of the
+  final conversion, either eliminating the conversions to T1
+  as well, or adjusting them accordingly.
+  For this to be valid we need to simulate rhs1 OP rhs2 in
+  infinite precision and verify the result fits both in T1 and T2.  */
+   orig_preci
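
As a rough illustration of the range test in the hunk above ("perform the
conversion on both ends and compare the result for equality"), here is a
minimal standalone sketch.  It models the double-int arithmetic with plain
64-bit integers; the names ext and range_fits are made up for this example
and are not GCC functions, and precision is assumed to be between 1 and 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Truncate V to PRECISION bits, then sign- or zero-extend it back to
   64 bits.  Hypothetical stand-in for double_int_ext; assumes
   1 <= PRECISION <= 64.  The final cast relies on the usual
   two's-complement conversion (as GCC and Clang define it).  */
static int64_t ext(int64_t v, unsigned precision, bool unsigned_p)
{
  if (precision >= 64)
    return v;
  uint64_t mask = (UINT64_C(1) << precision) - 1;
  uint64_t low = (uint64_t) v & mask;
  /* For a signed target type, replicate the sign bit upwards.  */
  if (!unsigned_p && ((low >> (precision - 1)) & 1))
    low |= ~mask;
  return (int64_t) low;
}

/* A constant range [MIN, MAX] fits the target type if converting each
   bound to that type leaves the bound unchanged.  */
static bool range_fits(int64_t min, int64_t max,
                       unsigned precision, bool unsigned_p)
{
  return ext(min, precision, unsigned_p) == min
         && ext(max, precision, unsigned_p) == max;
}
```

With this model, [0, 255] fits an unsigned 8-bit type but [-1, 10] does
not, because -1 converts to 255.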

Re: [RS6000] asynch exceptions and unwind info

2011-07-29 Thread Alan Modra
On Fri, Jul 29, 2011 at 10:57:48AM +0930, Alan Modra wrote:
> Except that any info about r2 in an indirect call sequence really
> belongs to the *called* function frame, not the caller.  I woke up
> this morning with the realization that what I'd done in
> frob_update_context for indirect call sequences was wrong.  Ditto for
> the r2 store that Michael moved into the prologue.  The only time we
> want the unwinder to restore from that particular save is if r2 isn't
> saved in the current frame.
> 
> Untested patch follows.

Here's a tested patch that fixes an issue with TOC_SINGLE_PIC_BASE and
enables Michael's save_toc_in_prologue optimization for all functions
except those that make dynamic stack adjustments.

Incidentally, the rs6000_emit_prologue comment I added below suggests
another solution.  Since all we need is the toc pointer for the frame,
it would be possible to tell the unwinder to simply load r2 from the
.opd entry.  I think..

libgcc/
* config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
Restore for indirect call bctrl from correct stack slot, and only
if cfa+40 isn't valid.
gcc/
* config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
* config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
(rs6000_emit_prologue): Don't prematurely return when
TARGET_SINGLE_PIC_BASE.  Don't emit eh_frame info in
save_toc_in_prologue case.
(rs6000_call_indirect_aix): Only disallow save_toc_in_prologue for
calls_alloca.

Index: libgcc/config/rs6000/linux-unwind.h
===
--- libgcc/config/rs6000/linux-unwind.h (revision 176905)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -354,20 +354,22 @@ frob_update_context (struct _Unwind_Cont
  /* We are in a plt call stub or r2 adjusting long branch stub,
 before r2 has been saved.  Keep REG_UNSAVED.  */
}
-  else if (pc[0] == 0x4E800421
-  && pc[1] == 0xE8410028)
-   {
- /* We are at the bctrl instruction in a call via function
-pointer.  gcc always emits the load of the new r2 just
-before the bctrl.  */
- _Unwind_SetGRPtr (context, 2, context->cfa + 40);
-   }
   else
{
  unsigned int *insn
= (unsigned int *) _Unwind_GetGR (context, R_LR);
  if (insn && *insn == 0xE8410028)
_Unwind_SetGRPtr (context, 2, context->cfa + 40);
+ else if (pc[0] == 0x4E800421
+  && pc[1] == 0xE8410028)
+   {
+ /* We are at the bctrl instruction in a call via function
+pointer.  gcc always emits the load of the new R2 just
+before the bctrl so this is the first and only place
+we need to use the stored R2.  */
+ _Unwind_Word sp = _Unwind_GetGR (context, 1);
+ _Unwind_SetGRPtr (context, 2, sp + 40);
+   }
}
 }
 #endif
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 176905)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -172,8 +172,6 @@ extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
 extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
-extern bool rs6000_save_toc_in_prologue_p (void);
-
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 176905)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1178,6 +1178,7 @@ static void rs6000_conditional_register_
 static void rs6000_trampoline_init (rtx, tree, rtx);
 static bool rs6000_cannot_force_const_mem (enum machine_mode, rtx);
 static bool rs6000_legitimate_constant_p (enum machine_mode, rtx);
+static bool rs6000_save_toc_in_prologue_p (void);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -20478,14 +20504,12 @@ rs6000_emit_prologue (void)
   insn = emit_insn (generate_set_vrsave (reg, info, 0));
 }
 
-  if (TARGET_SINGLE_PIC_BASE)
-return; /* Do not set PIC register */
-
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
-  if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-  || (DEFAULT_ABI == ABI_V4
- && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
- && df_regs_ever_live_p (RS6000_PIC_OFFSET_TABLE_REGNUM)))
+  if (!TARGET_SINGLE_PIC_BASE
+  && ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
+ || (DEFAULT_ABI == ABI_V4
+ && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
+ && df_regs_ever_live_p (RS6000_PIC_OFFSET_TABLE_REGNUM)

Re: [PATCH] Saner return value for gen_lowpart_no_emit

2011-07-29 Thread Richard Sandiford
Paolo Bonzini  writes:
> For some reason, when I "invented" gen_lowpart_no_emit I defaulted it
> to returning the original value of X.  Since gen_lowpart_no_emit is
> mostly used to return simplifications, the correct thing to return when
> conversion fails is NULL.  As a follow-up, every use in simplify-rtx.c
> could be changed to try other simplifications if gen_lowpart_no_emit
> fails; for now, I'm just avoiding a NULL pointer dereference.
>
> 2011-07-25  Paolo Bonzini  
>
> * rtlhooks.c (gen_lowpart_no_emit_general): Remove.
> * rtlhooks-def.h (gen_lowpart_no_emit_general): Remove prototype.
> (RTL_HOOKS_GEN_LOWPART_NO_EMIT): Default to gen_lowpart_if_possible.

OK, thanks.

Richard


Re: PATCH: PR target/47715: [x32] TLS doesn't work

2011-07-29 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 3:47 PM, H.J. Lu  wrote:

 TLS on X32 is almost identical to TLS on x86-64.  The only difference is
 x32 address space is 32bit.  That means TLS symbols can be in either
 SImode or DImode with upper 32bit zero.  This patch updates
 tls_global_dynamic_64 to support x32.  OK for trunk?
>>
>> Please also change 64bit GNU2_TLS patterns, so -mtls-dialect=gnu2 will
>> also work.  Please see attached patch.
>>
>
> Yes, it works.  Can you apply it?

This is what I have committed:

2011-07-28  Uros Bizjak  

PR target/47715
* config/i386/i386.md (*tls_global_dynamic_64): Remove mode from
tls_symbolic_operand check.  Update code sequence for TARGET_X32.
(tls_global_dynamic_64): Remove mode from tls_symbolic_operand check.
(tls_dynamic_gnu2_64): Ditto.
(*tls_dynamic_gnu2_lea_64): Ditto.
(*tls_dynamic_gnu2_call_64): Ditto.
(*tls_dynamic_gnu2_combine_64): Ditto.

Uros.
Index: i386.md
===
--- i386.md (revision 176870)
+++ i386.md (working copy)
@@ -12327,11 +12327,12 @@
(call:DI
 (mem:QI (match_operand:DI 2 "constant_call_address_operand" "z"))
 (match_operand:DI 3 "" "")))
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
  UNSPEC_TLS_GD)]
   "TARGET_64BIT"
 {
-  fputs (ASM_BYTE "0x66\n", asm_out_file);
+  if (!TARGET_X32)
+fputs (ASM_BYTE "0x66\n", asm_out_file);
   output_asm_insn
 ("lea{q}\t{%a1@tlsgd(%%rip), %%rdi|rdi, %a1@tlsgd[rip]}", operands);
   fputs (ASM_SHORT "0x\n", asm_out_file);
@@ -12349,7 +12350,7 @@
  (call:DI
   (mem:QI (match_operand:DI 2 "constant_call_address_operand" ""))
   (const_int 0)))
- (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+ (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
UNSPEC_TLS_GD)])])
 
 (define_insn "*tls_local_dynamic_base_32_gnu"
@@ -12553,7 +12554,7 @@
 
 (define_expand "tls_dynamic_gnu2_64"
   [(set (match_dup 2)
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
   UNSPEC_TLSDESC))
(parallel
 [(set (match_operand:DI 0 "register_operand" "")
@@ -12568,7 +12569,7 @@
 
 (define_insn "*tls_dynamic_lea_64"
   [(set (match_operand:DI 0 "register_operand" "=r")
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
   UNSPEC_TLSDESC))]
   "TARGET_64BIT && TARGET_GNU2_TLS"
   "lea{q}\t{%a1@TLSDESC(%%rip), %0|%0, %a1@TLSDESC[rip]}"
@@ -12579,7 +12580,7 @@
 
 (define_insn "*tls_dynamic_call_64"
   [(set (match_operand:DI 0 "register_operand" "=a")
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")
(match_operand:DI 2 "register_operand" "0")
(reg:DI SP_REG)]
   UNSPEC_TLSDESC))
@@ -12598,7 +12599,7 @@
 (reg:DI SP_REG)]
UNSPEC_TLSDESC)
 (const:DI (unspec:DI
-   [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   [(match_operand 1 "tls_symbolic_operand" "")]
UNSPEC_DTPOFF
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && TARGET_GNU2_TLS"


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-29 Thread H.J. Lu
On Thu, Jul 28, 2011 at 2:31 AM, Paolo Bonzini  wrote:
> On 07/28/2011 11:30 AM, Uros Bizjak wrote:
>>
>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>> >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
>> >  for all Pmode != ptr_mode cases.  OK for trunk?
>> >  2011-06-11  H.J. Lu
>> >
>> >        PR middle-end/47727
>> >        * explow.c (convert_memory_address_addr_space): Permute the
>> >        conversion and addition if one operand is a constant.
>>
>> Do we still need this patch? With recent target changes the testcase
>> from PR can be compiled without problems with a gcc from an unpatched
>> trunk.
>
> Given the communication difficulties, I hope not...
>
> Paolo
>

Here is the updated patch.  OK for trunk?

Thanks.

-- 
H.J.
---

2011-07-28  H.J. Lu  

PR middle-end/49721
* explow.c (convert_memory_address_addr_space_1): New.
(convert_memory_address_addr_space): Use it.

* expr.c (convert_modes_1): New.
(convert_modes): Use it.

* expr.h (convert_modes_1): New.

* rtl.h (convert_memory_address_addr_space_1): New.
(convert_memory_address_1): Likewise.

* simplify-rtx.c (simplify_unary_operation_1): Call
convert_memory_address_1 instead of convert_memory_address.
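
To make the permutation question in this patch concrete, here is a small
standalone sketch (plain C, not GCC code; both function names are invented
for the example).  It models a 32-bit ptr_mode and a 64-bit Pmode and
assumes the constant offset is sign-extended in the wider mode, as
CONST_INTs are.  Narrowing always commutes with the addition;
zero-extension does not when the 32-bit addition wraps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Narrowing case: truncating after a 64-bit add gives the same result
   as adding the truncated operands in 32 bits.  */
static bool narrowing_commutes(uint64_t base, uint64_t offset)
{
  uint32_t truncate_after = (uint32_t) (base + offset);
  uint32_t truncate_before = (uint32_t) ((uint32_t) base + (uint32_t) offset);
  return truncate_after == truncate_before;
}

/* Widening case: compare zero-extending the 32-bit sum with adding the
   sign-extended constant to the zero-extended base in 64 bits.  */
static bool zero_extend_commutes(uint32_t base, int32_t offset)
{
  uint64_t extend_after = (uint64_t) (uint32_t) (base + (uint32_t) offset);
  uint64_t extend_before = (uint64_t) base + (uint64_t) (int64_t) offset;
  return extend_after == extend_before;
}
```

In this model the widening case only agrees when the 32-bit addition
neither carries nor borrows, for example base 0x1000 with offset 16, which
is why the patch treats the narrowing case separately.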

diff --git a/gcc/explow.c b/gcc/explow.c
index 3c692f4..069a68a 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -320,8 +320,9 @@ break_out_memory_refs (rtx x)
arithmetic insns can be used.  */
 
 rtx
-convert_memory_address_addr_space (enum machine_mode to_mode ATTRIBUTE_UNUSED,
-   rtx x, addr_space_t as ATTRIBUTE_UNUSED)
+convert_memory_address_addr_space_1 (enum machine_mode to_mode ATTRIBUTE_UNUSED,
+ rtx x, addr_space_t as ATTRIBUTE_UNUSED,
+ bool no_emit ATTRIBUTE_UNUSED)
 {
 #ifndef POINTERS_EXTEND_UNSIGNED
   gcc_assert (GET_MODE (x) == to_mode || GET_MODE (x) == VOIDmode);
@@ -377,28 +378,27 @@ convert_memory_address_addr_space (enum machine_mode to_mode ATTRIBUTE_UNUSED,
   break;
 
 case CONST:
-  return gen_rtx_CONST (to_mode,
-			convert_memory_address_addr_space
-			  (to_mode, XEXP (x, 0), as));
+  temp = convert_memory_address_addr_space_1 (to_mode, XEXP (x, 0),
+		  as, no_emit);
+  return temp ? gen_rtx_CONST (to_mode, temp) : temp;
   break;
 
 case PLUS:
 case MULT:
-  /* For addition we can safely permute the conversion and addition
-	 operation if one operand is a constant and converting the constant
-	 does not change it or if one operand is a constant and we are
-	 using a ptr_extend instruction  (POINTERS_EXTEND_UNSIGNED < 0).
+  /* FIXME: Is this really safe for POINTERS_EXTEND_UNSIGNED < 0?
+ For addition, we can safely permute the conversion and addition
+	 operation if one operand is a constant and we are using a
+	 ptr_extend instruction (POINTERS_EXTEND_UNSIGNED < 0).
+	 
 	 We can always safely permute them if we are making the address
 	 narrower.  */
   if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
 	  || (GET_CODE (x) == PLUS
 	  && CONST_INT_P (XEXP (x, 1))
-	  && (XEXP (x, 1) == convert_memory_address_addr_space
-   (to_mode, XEXP (x, 1), as)
- || POINTERS_EXTEND_UNSIGNED < 0)))
+	  && POINTERS_EXTEND_UNSIGNED < 0))
 	return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
-			   convert_memory_address_addr_space
- (to_mode, XEXP (x, 0), as),
+			   convert_memory_address_addr_space_1
+ (to_mode, XEXP (x, 0), as, no_emit),
 			   XEXP (x, 1));
   break;
 
@@ -406,10 +406,17 @@ convert_memory_address_addr_space (enum machine_mode to_mode ATTRIBUTE_UNUSED,
   break;
 }
 
-  return convert_modes (to_mode, from_mode,
-			x, POINTERS_EXTEND_UNSIGNED);
+  return convert_modes_1 (to_mode, from_mode, x,
+			  POINTERS_EXTEND_UNSIGNED, no_emit);
 #endif /* defined(POINTERS_EXTEND_UNSIGNED) */
 }
+
+rtx
+convert_memory_address_addr_space (enum machine_mode to_mode,
+   rtx x, addr_space_t as)
+{
+  return convert_memory_address_addr_space_1 (to_mode, x, as, false);
+}
 
 /* Return something equivalent to X but valid as a memory address for something
of mode MODE in the named address space AS.  When X is not itself valid,
diff --git a/gcc/expr.c b/gcc/expr.c
index 0988c51..8aec0a5 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -696,13 +696,16 @@ convert_to_mode (enum machine_mode mode, rtx x, int unsignedp)
Both modes may be floating, or both integer.
UNSIGNEDP is nonzero if X is

Re: [RS6000] asynch exceptions and unwind info

2011-07-29 Thread David Edelsohn
On Thu, Jul 28, 2011 at 9:27 PM, Alan Modra  wrote:

> Right, but I was talking about the normal case, where the unwinder
> won't even look at .glink unwind info.
>
>> The whole problem is that toc pointer copy in 40(1) is only valid
>> during indirect call sequences, and iff ld inserted a stub?  I.e.
>> direct calls between functions that share toc pointers never save
>> the copy?
>
> Yes.
>
>> Would it make sense, if a function has any indirect call, to move
>> the toc pointer save into the prologue?  You'd get to avoid that
>> store all the time.  Of course you'd not be able to sink the load
>> after the call, but it might still be a win.  And in that special
>> case you can annotate the r2 save slot just once, correctly.
>
> Except that any info about r2 in an indirect call sequence really
> belongs to the *called* function frame, not the caller.  I woke up
> this morning with the realization that what I'd done in
> frob_update_context for indirect call sequences was wrong.  Ditto for
> the r2 store that Michael moved into the prologue.  The only time we
> want the unwinder to restore from that particular save is if r2 isn't
> saved in the current frame.

This discussion seems to be referencing both PLT stubs and pointer
glue.  Indirect calls through a function pointer create a frame, save
R2, and the unwinder can visit that frame.  PLT stub calls are tail
calls, save R2, and the unwinder only would visit the frame if an
exception occurs in the middle of a call.  One also can add lazy
resolution using the glink code, which performs additional work in the
dynamic linker on the first call.

Which has the problem?  Which are you trying to solve?  And how is
your change solving it?

Thanks, David


Re: PATCH, v2: PR target/47715: [x32] Use SImode for thread pointer

2011-07-29 Thread H.J. Lu
On Fri, Jul 29, 2011 at 4:01 AM, Uros Bizjak  wrote:
> [ For some reason this post didn't reach gcc-patches@ ML archives... ]
>
> Hello!
>
> ABI specifies that TP is loaded in ptr_mode. Attached patch implements
> this requirement.
>
> 2011-07-29  Uros Bizjak  
>
>        * config/i386/i386.md (*load_tp_x32): New.
>        (*load_tp_x32_zext): Ditto.
>        (*add_tp_x32): Ditto.
>        (*add_tp_x32_zext): Ditto.
>        (*load_tp_<mode>): Disable for !TARGET_X32 targets.
>        (*add_tp_<mode>): Ditto.
>        * config/i386/i386.c (get_thread_pointer): Load thread pointer in
>        ptr_mode and convert to Pmode if needed.
>
> Testing on x86_64-pc-linux-gnu in progress. H.J., please test this
> version on x32.
>

It works.  Can you check it in?

Thanks.

-- 
H.J.


Re: PATCH, v2: PR target/47715: [x32] Use SImode for thread pointer

2011-07-29 Thread H.J. Lu
On Fri, Jul 29, 2011 at 6:18 AM, H.J. Lu  wrote:
> On Fri, Jul 29, 2011 at 4:01 AM, Uros Bizjak  wrote:
>> [ For some reason this post didn't reach gcc-patches@ ML archives... ]
>>
>> Hello!
>>
>> ABI specifies that TP is loaded in ptr_mode. Attached patch implements
>> this requirement.
>>
>> 2011-07-29  Uros Bizjak  
>>
>>        * config/i386/i386.md (*load_tp_x32): New.
>>        (*load_tp_x32_zext): Ditto.
>>        (*add_tp_x32): Ditto.
>>        (*add_tp_x32_zext): Ditto.
>>        (*load_tp_<mode>): Disable for !TARGET_X32 targets.
^^^

It should be "Disable for TARGET_X32 targets."

>>        (*add_tp_<mode>): Ditto.
>>        * config/i386/i386.c (get_thread_pointer): Load thread pointer in
>>        ptr_mode and convert to Pmode if needed.
>>
>> Testing on x86_64-pc-linux-gnu in progress. H.J., please test this
>> version on x32.
>>
>
> It works.  Can you check it in?
>
> Thanks.
>
> --
> H.J.
>



-- 
H.J.


Re: PING Re: PATCH: move Valgrind header checks from "valgrind" to "misc" checking

2011-07-29 Thread Laurynas Biveinis
Pinging yet again...

2011/7/22 Laurynas Biveinis :
> PING again...
>
> 2011/5/28 Laurynas Biveinis :
>> PING
>>
>> http://codereview.appspot.com/4250047
>> http://gcc.gnu.org/ml/gcc/2011-01/msg00363.html
>>
>> This patch moves Valgrind header detection from "valgrind" checking to
>> "misc" and enables the latter whenever the former is enabled.
>>
>> If only "misc" is enabled, then Valgrind header presence is optional.
>>
>> I plan to follow up with another patch that adds a new configure option,
>> --enable-valgrind-annotations, that is orthogonal to checking and causes
>> configure to fail if headers are not present.  Also, I will update wwwdocs
>> once this is accepted to trunk.
>>
>> --
>> Laurynas
>>
>
>
>
> --
> Laurynas
>



-- 
Laurynas


Re: [RS6000] asynch exceptions and unwind info

2011-07-29 Thread Alan Modra
On Fri, Jul 29, 2011 at 09:16:09AM -0400, David Edelsohn wrote:
> Which has the problem?  Which are you trying to solve?  And how is
> your change solving it?

Michael's save_toc_in_prologue emit_frame_save writes unwind info for
the wrong frame.  That r2 save is the current r2.  What we need is
info about the previous r2, so we can restore when unwinding.

I made a similar mistake in frob_update_context in that the value
saved by an indirect function call sequence is the r2 for the current
function.  I also restored from the wrong location.
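
For readers following the patch: the two magic numbers tested in
frob_update_context are the encodings of bctrl and of ld r2,40(r1), the
reload of the TOC pointer from its save slot.  The sketch below rebuilds
both constants from the instruction fields as a sanity check.  The helper
names are invented for this example and are not part of libgcc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encode "ld rt, ds(ra)": DS-form, primary opcode 58, extended opcode 0.
   DS is a byte displacement and must be a multiple of 4.  */
static uint32_t encode_ld(unsigned rt, unsigned ra, int16_t ds)
{
  return (58u << 26) | (rt << 21) | (ra << 16) | ((uint16_t) ds & 0xFFFCu);
}

/* Encode "bctrl": bcctr (opcode 19, extended opcode 528) with BO = 20
   (branch always), BI = 0 and LK = 1.  */
static uint32_t encode_bctrl(void)
{
  return (19u << 26) | (20u << 21) | (0u << 16) | (528u << 1) | 1u;
}

/* True if PC points at the bctrl of an indirect call sequence, that is,
   a bctrl immediately followed by the reload of r2 from 40(r1).  */
static bool at_indirect_call(const uint32_t *pc)
{
  return pc[0] == encode_bctrl() && pc[1] == encode_ld(2, 1, 40);
}
```

encode_bctrl () yields 0x4E800421 and encode_ld (2, 1, 40) yields
0xE8410028, matching the values compared against pc[0] and pc[1].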

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, RFC] PR49749 biased reassociation for accumulator patterns

2011-07-29 Thread William J. Schmidt
I found a handful of degradations with this patch from an earlier test
version, which demonstrate the incorrectness of this comment:

On Wed, 2011-07-27 at 10:11 -0500, William J. Schmidt wrote:

> +   However, the rank of a value that depends on the result of a loop-
> +   carried phi should still be higher than the rank of a value that
> +   depends on values from more distant blocks.  */

On further review, it was smarter to treat the phi's rank as zero for
propagation purposes.  (Among other things, an expression of the form
"phi + constant" gets a ludicrously high rank otherwise.)

Otherwise, things looked good.  I'm testing a revised version of the
patch with the change in propagation rules.  I hope to post it shortly,
assuming the numbers come out as I expect.

Thanks,
Bill



[PATCH, i386]: Re-define pic_32bit_operand back to define_predicate

2011-07-29 Thread Uros Bizjak
Hello!

With recent developments, there is no longer any need for pic_32bit_operand
to be defined as a special predicate with explicit mode checks.  The
implicit mode checks (including the VOIDmode bypass) of normal predicates
work OK now.

2011-07-28  Uros Bizjak  

* config/i386/predicates.md (pic_32bit_operand): Do not define as
special predicate.  Remove explicit mode checks.

Tested on x86_64-pc-linux-gnu {,-m32}.  There is a remote chance this
patch breaks x32, so let's alert H.J.

Committed to mainline SVN.

Uros.
Index: predicates.md
===
--- predicates.md   (revision 176870)
+++ predicates.md   (working copy)
@@ -366,15 +366,12 @@
 
 ;; Return true when operand is PIC expression that can be computed by lea
 ;; operation.
-(define_special_predicate "pic_32bit_operand"
+(define_predicate "pic_32bit_operand"
   (match_code "const,symbol_ref,label_ref")
 {
-  if (GET_MODE (op) != SImode
-  && GET_MODE (op) != DImode)
-return false;
-
   if (!flag_pic)
 return false;
+
   /* Rule out relocations that translate into 64bit constants.  */
   if (TARGET_64BIT && GET_CODE (op) == CONST)
 {
@@ -386,6 +383,7 @@
  || XINT (op, 1) == UNSPEC_GOT))
return false;
 }
+
   return symbolic_operand (op, mode);
 })
 


Re: [Patch,AVR]: Fix PR29560 (map 16-bit shift to 8-bit)

2011-07-29 Thread Georg-Johann Lay
Richard Henderson wrote:
> On 07/27/2011 10:00 AM, Georg-Johann Lay wrote:
>> Richard Henderson wrote:
 +;; "*ashluqihiqi3.mem"
 +;; "*ashlsqihiqi3.mem"
 +(define_insn_and_split "*ashlqihiqi3.mem"
 +  [(set (match_operand:QI 0 "memory_operand" "=m")
 +(subreg:QI (ashift:HI (any_extend:HI (match_operand:QI 1 
 "register_operand" "r"))
 +  (match_operand:QI 2 "register_operand" "r"))
 +   0))]
 +  "!reload_completed"
 +  { gcc_unreachable(); }
>>> Surely this isn't necessary.  Why would you ever be matching a memory 
>>> output?
>>>
 +(define_insn_and_split "*ashlhiqi3"
 +  [(set (match_operand:QI 0 "nonimmediate_operand" "=r")
 +(subreg:QI (ashift:HI (match_operand:HI 1 "register_operand" "0")
 +  (match_operand:QI 2 "register_operand" 
 "r")) 0))]
 +  "!reload_completed"
 +  { gcc_unreachable(); }
>>> Likewise.
>>>
>>> But the first pattern and the peep2 look good.
>>>
>> It's that what combine comes up with, and combine is not smart enough
>> to find a split point between the mem and the subreg.  I don't know
>> enough of combine, maybe it's because can_create_pseudo_p is false
>> during combine, combine has no spare reg.  A combine-split won't
>> help as it needs a pseudo/spare reg.
> 
> Hmm.  Perhaps.  Have you a test case for this?
> 
> r~

char y;

void shift2 (char x, unsigned char s)
{
y = x << s;
}


Combiner tries:

...

Trying 9 -> 10:
Failed to match this instruction:
(set (mem/c/i:QI (symbol_ref:HI ("y")  ) [0 y+0 S1 A8])
(subreg:QI (ashift:HI (reg:HI 48 [ x ])
(reg/v:QI 47 [ s ])) 0))

Trying 7, 9 -> 10:
Failed to match this instruction:
(set (mem/c/i:QI (symbol_ref:HI ("y")  ) [0 y+0 S1 A8])
(subreg:QI (ashift:HI (zero_extend:HI (reg/v:QI 46 [ x ]))
(reg/v:QI 47 [ s ])) 0))
Successfully matched this instruction:
(set (reg:HI 50)
(zero_extend:HI (reg/v:QI 46 [ x ])))
Failed to match this instruction:
(set (mem/c/i:QI (symbol_ref:HI ("y")  ) [0 y+0 S1 A8])
(subreg:QI (ashift:HI (reg:HI 50)
(reg/v:QI 47 [ s ])) 0))
starting the processing of deferred insns
ending the processing of deferred insns

It sees that it can split out the zero_extend but it does not try to factor
out the set, i.e. try to split like this where both instructions would match:

(set (reg:QI foo)
(subreg:QI (ashift:HI (reg:HI 50)
(reg/v:QI 47 [ s ])) 0))

(set (mem/c/i:QI (symbol_ref:HI ("y")  ) [0 y+0 S1 A8])
 (reg:QI foo))


This is not specific to the test case; I frequently see combine miss such
opportunities even when only non-volatile memory is involved.

If volatile memory is involved it's even worse because opportunities like
load-modify-store, sign- and zero-extends or extracting/inserting are not
detected, see PR49807.
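
As background for why the 16-bit shift in the test case can be mapped to
an 8-bit one, here is a small host-side check (an illustrative sketch, not
AVR code; count_mismatches is a name made up for this example).  It
assumes shift counts below 16 and verifies that the low byte of the
16-bit left shift does not depend on whether x was sign- or
zero-extended, and equals the 8-bit shift result.

```c
#include <stdint.h>

/* Count (x, s) pairs for which the low byte of the 16-bit shift differs
   between zero- and sign-extended input, or differs from a pure 8-bit
   shift.  Shifts are done in uint32_t to avoid signed overflow.  */
static unsigned count_mismatches(void)
{
  unsigned mismatches = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned s = 0; s < 16; s++)
      {
        uint16_t zext = (uint16_t) x;
        /* Sign-extend the 8-bit value to 16 bits (two's complement).  */
        uint16_t sext = (uint16_t) (int16_t) (int8_t) x;
        uint8_t low_z = (uint8_t) ((uint32_t) zext << s);
        uint8_t low_s = (uint8_t) ((uint32_t) sext << s);
        /* An 8-bit shift by 8 or more leaves nothing in the byte.  */
        uint8_t narrow = s < 8 ? (uint8_t) ((uint32_t) x << s) : 0;
        if (low_z != low_s || low_z != narrow)
          mismatches++;
      }
  return mismatches;
}
```

The extension only changes bits 8 to 15, and a left shift never moves
those into the low byte, so the function reports no mismatches.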

Johann



Re: PING Re: PATCH: move Valgrind header checks from "valgrind" to "misc" checking

2011-07-29 Thread Paolo Bonzini

On 07/29/2011 03:27 PM, Laurynas Biveinis wrote:

Pinging yet again...

2011/7/22 Laurynas Biveinis:

PING again...

2011/5/28 Laurynas Biveinis:

PING

http://codereview.appspot.com/4250047
http://gcc.gnu.org/ml/gcc/2011-01/msg00363.html

This patch moves Valgrind header detection from "valgrind" checking to
"misc" and enables the latter whenever the former is enabled.

If only "misc" is enabled, then Valgrind header presence is optional.

I plan to follow up with another patch that adds a new configure option,
--enable-valgrind-annotations, that is orthogonal to checking and causes
configure to fail if headers are not present.  Also, I will update wwwdocs
once this is accepted to trunk.


Ok, I missed this.

Paolo



Re: [PATCH] ARM fixed-point support [4/6]: allow overriding of fixed-point helper function names

2011-07-29 Thread Bernd Schmidt
On 07/29/11 16:58, Julian Brown wrote:
> * fixed-bit.c (BUILDING_FIXED_BIT): Define macro.

This appears to be unused in this patch?


Bernd


Re: [backport] arm,rx: don't ICE on naked functions with local vars

2011-07-29 Thread DJ Delorie

> Probably.  Thanks for the leg-work.  I'll approve the patch as-is.

May I apply it to the 4.5 and 4.6 branches?  The same patch applies
as-is to both.


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Dodji Seketeli
Jason Merrill  writes:

> On 07/27/2011 01:54 PM, Dodji Seketeli wrote:
>> +  /*  Set of typedefs that are used in this function.  */
>> +  struct pointer_set_t * GTY((skip)) used_local_typedefs;
>
> Is there a reason not to just use TREE_USED for this?
>
>> +  /* Vector of locally defined typedefs, for
>> + -Wunused-local-typedefs.  */
>> +  VEC(tree,gc) *local_typedefs;
>
> If the accessors are in c-common, this field should be in
> c_language_function.

Looking into this a bit, it seems to me that I can access
cfun->language->base (of type c_language_function) from inside either
the C or C++ FE only, as the type of cfun->language -- which is of type
struct language_function -- is only defined either in c-lang.h or
cp-tree.h.  I cannot access it from c-common.c.

This is consistent with the comment of the language field of the
function struct:

  /* Language-specific code can use this to store whatever it likes.  */
  struct language_function * language;

What am I missing?

-- 
Dodji


Re: [trans-mem] verify_types_in_gimple_seq_2 glitch

2011-07-29 Thread Aldy Hernandez

On 07/29/11 07:40, Patrick Marlier wrote:

Thanks for reminding me (once again) of the rules...

Bootstrapped and tested successfully with:
make check-gcc RUNTESTFLAGS=tm.exp


OK for branch.  Thanks.


Re: [PATCH] ARM fixed-point support [6/6]: target-specific parts

2011-07-29 Thread Julian Brown
On Thu, 30 Jun 2011 14:42:54 +0100
Richard Earnshaw  wrote:

> > OK to apply? Tested alongside the rest of the patch series, in both
> > big & little-endian mode.
> > [snip]
> 
> Please put the iterator definitions in iterators.md

I've done this, and made a few other changes as required by the new
version of [4/6] in this series:

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02680.html

OK to apply? (I'll assume so if the previous patch gets approved, and
tests look OK).

Julian

ChangeLog

gcc/
* configure.ac (fixed-point): Add ARM support.
* configure: Regenerate.
* config/arm/arm.c (arm_fixed_mode_set): New struct.
(arm_set_fixed_optab_libfunc): New.
(arm_set_fixed_conv_libfunc): New.
(arm_init_libfuncs): Initialise fixed-point helper libfuncs with
ARM-specific names.
(aapcs_libcall_value): Return sub-word-size fixed-point libcall
return values in SImode.
(arm_return_in_msb): Return fixed-point types in the msb.
(arm_pad_reg_upwards, arm_pad_arg_upwards): Pad fixed-point types
upwards.
(arm_scalar_mode_supported_p): Support fixed-point modes.
(arm_vector_mode_supported_p): Support vector fixed-point modes.
* config/arm/arm.h (SHORT_FRACT_TYPE_SIZE, FRACT_TYPE_SIZE)
(LONG_FRACT_TYPE_SIZE, LONG_LONG_FRACT_TYPE_SIZE)
(SHORT_ACCUM_TYPE_SIZE, ACCUM_TYPE_SIZE, LONG_ACCUM_TYPE_SIZE)
(LONG_LONG_ACCUM_TYPE_SIZE, MAX_FIXED_MODE_SIZE): Define.
* config/arm/iterators.md (FIXED, ADDSUB, UQADDSUB, QADDSUB, QMUL):
New mode iterators.
(qaddsub_suf): New mode attribute.
* config/arm/arm-modes.def (FRACT, UFRACT, ACCUM, UACCUM): Declare
vector modes.
* config/arm/predicates.md (sat_shift_operator): New predicate.
* config/arm/arm-fixed.md: New.
* config/arm/arm.md: Include arm-fixed.md.
* config/arm/t-arm (MD_INCLUDES): Add arm-fixed.md.
 
libgcc/
* config.host (arm*-*-linux*, arm*-*-uclinux*, arm*-*-eabi*)
(arm*-*-symbianelf*): Add t-fixedpoint-gnu-prefix makefile fragment.
* config/arm/bpabi-lib.h (LIBGCC2_GNU_PREFIX): Define, if
BUILDING_FIXED_BIT is set.

gcc/testsuite/
* gcc.target/arm/fixed-point-exec.c: New test.
commit 51a5cdba96c5e583456b24bfb71aaad75c86ec8b
Author: Julian Brown 
Date:   Thu May 26 09:12:05 2011 -0700

Fixed-point extension support for ARM.

diff --git a/gcc/config/arm/arm-fixed.md b/gcc/config/arm/arm-fixed.md
new file mode 100644
index 000..bd33ce2
--- /dev/null
+++ b/gcc/config/arm/arm-fixed.md
@@ -0,0 +1,384 @@
+;; Copyright 2011 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+;;
+;; This file contains ARM instructions that support fixed-point operations.
+
+(define_insn "add<mode>3"
+  [(set (match_operand:FIXED 0 "s_register_operand" "=r")
+	(plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r")
+		(match_operand:FIXED 2 "s_register_operand" "r")))]
+  "TARGET_32BIT"
+  "add%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")])
+
+(define_insn "add<mode>3"
+  [(set (match_operand:ADDSUB 0 "s_register_operand" "=r")
+	(plus:ADDSUB (match_operand:ADDSUB 1 "s_register_operand" "r")
+		 (match_operand:ADDSUB 2 "s_register_operand" "r")))]
+  "TARGET_INT_SIMD"
+  "sadd<qaddsub_suf>%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")])
+
+(define_insn "usadd<mode>3"
+  [(set (match_operand:UQADDSUB 0 "s_register_operand" "=r")
+	(us_plus:UQADDSUB (match_operand:UQADDSUB 1 "s_register_operand" "r")
+			  (match_operand:UQADDSUB 2 "s_register_operand" "r")))]
+  "TARGET_INT_SIMD"
+  "uqadd<qaddsub_suf>%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")])
+
+(define_insn "ssadd<mode>3"
+  [(set (match_operand:QADDSUB 0 "s_register_operand" "=r")
+	(ss_plus:QADDSUB (match_operand:QADDSUB 1 "s_register_operand" "r")
+			 (match_operand:QADDSUB 2 "s_register_operand" "r")))]
+  "TARGET_INT_SIMD"
+  "qadd<qaddsub_suf>%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:FIXED 0 "s_register_operand" "=r")
+	(minus:FIXED (match_operand:FIXED 1 "s_register_operand" "r")
+		 (match_operand:FIXED 2 "s_register_operand" "r")))]
+  "TARGET_32BIT"
+  "sub%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ADDSUB 0 "s_register_operand" "=r")
+	(minus:ADDSUB (match_operand:ADDSUB 1 "s_register_operand" "r")
+		  (match_operand:ADDSUB 2 "s_register_operand" "r")))]
+  

Re: [PATCH] ARM fixed-point support [4/6]: allow overriding of fixed-point helper function names

2011-07-29 Thread Julian Brown
On Fri, 29 Jul 2011 17:26:07 +0200
Bernd Schmidt wrote:

> On 07/29/11 16:58, Julian Brown wrote:
> > * fixed-bit.c (BUILDING_FIXED_BIT): Define macro.
> 
> This appears to be unused in this patch?

It's used by the later patch:

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02686.html

Cheers,

Julian


Re: [PATCH] ARM fixed-point support [4/6]: allow overriding of fixed-point helper function names

2011-07-29 Thread Bernd Schmidt
On 07/29/11 17:48, Julian Brown wrote:
> On Fri, 29 Jul 2011 17:26:07 +0200
> Bernd Schmidt wrote:
> 
>> On 07/29/11 16:58, Julian Brown wrote:
>>> * fixed-bit.c (BUILDING_FIXED_BIT): Define macro.
>>
>> This appears to be unused in this patch?
> 
> It's used by the later patch:
> 
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02686.html

Oh, that hadn't arrived yet.

I think it would be slightly less convoluted to have an additional
macro, LIBGCC2_FIXEDBIT_GNU_PREFIX, and set only one of them for ARM.
Otherwise OK for 4/6.


Bernd


[PATCH, i386]: Remove tp_or_register_operand predicate

2011-07-29 Thread Uros Bizjak
Hello!

tp_or_register_operand predicate is not used.

2011-07-29  Uros Bizjak  

* config/i386/predicates.md (tp_or_register_operand): Remove predicate.

Tested on x86_64-pc-linux-gnu, committed to mainline SVN.

Uros.

Index: predicates.md
===
--- predicates.md   (revision 176924)
+++ predicates.md   (working copy)
@@ -490,11 +490,6 @@
   (and (match_code "symbol_ref")
(match_test "op == ix86_tls_module_base ()")))

-(define_predicate "tp_or_register_operand"
-  (ior (match_operand 0 "register_operand")
-   (and (match_code "unspec")
-   (match_test "XINT (op, 1) == UNSPEC_TP"))))
-
 ;; Test for a pc-relative call operand
 (define_predicate "constant_call_address_operand"
   (match_code "symbol_ref")


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Jason Merrill

On 07/29/2011 08:36 AM, Dodji Seketeli wrote:

> Looking into this a bit, it seems to me that I can access
> cfun->language->base (of type c_language_function) from inside either
> the C or C++ FE only, as the type of cfun->language -- which is of type
> struct language_function -- is only defined either in c-lang.h or
> cp-tree.h.  I cannot access it from c-common.c.


I think you can use (struct c_language_function *)cfun->language.

Jason
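
For illustration only (not part of the patch): the cast suggested above relies on the usual C rule that a pointer to a structure, suitably converted, points to its first member. Below is a minimal sketch of that idiom. The names struct common_state, struct frontend_state and bump_typedef_count are hypothetical stand-ins, not the real c_language_function / language_function definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the state shared by both front ends.  */
struct common_state
{
  int local_typedef_count;
};

/* Hypothetical stand-in for a front-end-specific structure.  The shared
   state must be the FIRST member for the cast below to be valid.  */
struct frontend_state
{
  struct common_state base;
  int frontend_only_data;
};

/* Code that only knows struct common_state can still reach the shared
   part through an opaque pointer to the enclosing object.  */
int
bump_typedef_count (void *opaque)
{
  struct common_state *common = (struct common_state *) opaque;
  assert (common != NULL);
  return ++common->local_typedef_count;
}
```

A caller in a front end would pass the address of its own struct frontend_state; the common code never needs to see the full definition, only that the shared part sits at offset zero.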


Re: [PATCH] ARM fixed-point support [6/6]: target-specific parts

2011-07-29 Thread Richard Earnshaw
On 29/07/11 16:47, Julian Brown wrote:
> On Thu, 30 Jun 2011 14:42:54 +0100
> Richard Earnshaw wrote:
> 
>>> OK to apply? Tested alongside the rest of the patch series, in both
>>> big & little-endian mode.
>>> [snip]
>>
>> Please put the iterator definitions in iterators.md
> 
> I've done this, and made a few other changes as required by the new
> version of [4/6] in this series:
> 
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02680.html
> 
> OK to apply? (I'll assume so if the previous patch gets approved, and
> tests look OK).
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * configure.ac (fixed-point): Add ARM support.
> * configure: Regenerate.
> * config/arm/arm.c (arm_fixed_mode_set): New struct.
> (arm_set_fixed_optab_libfunc): New.
> (arm_set_fixed_conv_libfunc): New.
> (arm_init_libfuncs): Initialise fixed-point helper libfuncs with
> ARM-specific names.
> (aapcs_libcall_value): Return sub-word-size fixed-point libcall
> return values in SImode.
> (arm_return_in_msb): Return fixed-point types in the msb.
> (arm_pad_reg_upwards, arm_pad_arg_upwards): Pad fixed-point types
> upwards.
> (arm_scalar_mode_supported_p): Support fixed-point modes.
> (arm_vector_mode_supported_p): Support vector fixed-point modes.
> * config/arm/arm.h (SHORT_FRACT_TYPE_SIZE, FRACT_TYPE_SIZE)
> (LONG_FRACT_TYPE_SIZE, LONG_LONG_FRACT_TYPE_SIZE)
> (SHORT_ACCUM_TYPE_SIZE, ACCUM_TYPE_SIZE, LONG_ACCUM_TYPE_SIZE)
> (LONG_LONG_ACCUM_TYPE_SIZE, MAX_FIXED_MODE_SIZE): Define.
> * config/arm/iterators.md (FIXED, ADDSUB, UQADDSUB, QADDSUB, QMUL):
> New mode iterators.
> (qaddsub_suf): New mode attribute.
> * config/arm/arm-modes.def (FRACT, UFRACT, ACCUM, UACCUM): Declare
> vector modes.
> * config/arm/predicates.md (sat_shift_operator): New predicate.
> * config/arm/arm-fixed.md: New.
> * config/arm/arm.md: Include arm-fixed.md.
> * config/arm/t-arm (MD_INCLUDES): Add arm-fixed.md.
>  
> libgcc/
> * config.host (arm*-*-linux*, arm*-*-uclinux*, arm*-*-eabi*)
> (arm*-*-symbianelf*): Add t-fixedpoint-gnu-prefix makefile fragment.
> * config/arm/bpabi-lib.h (LIBGCC2_GNU_PREFIX): Define, if
> BUILDING_FIXED_BIT is set.
> 
> gcc/testsuite/
> * gcc.target/arm/fixed-point-exec.c: New test.
> 
> 

OK

R.



Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Jason Merrill

On 07/29/2011 03:35 AM, Dodji Seketeli wrote:

> So do you guys think we should add it nonetheless and just add
> -Wno-unused-local-typedefs to the tests that exhibit the above issue
> before fixing PR preprocessor/7263?


Does your set of linemap patches fix the issue?  In that case, we can 
add it when those go in.  Speaking of which, sorry I haven't found the 
time to review them yet.


Jason


Re: [pph] Free buffers used during tree encoding/decoding

2011-07-29 Thread Gabriel Charette
I just stashed all my changes and pulled in the latest svn HEAD this
morning to check if I was seeing these failures:

Repository Root: svn+ssh://gcc.gnu.org/svn/gcc
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 176906
Node Kind: directory
Schedule: normal
Last Changed Author: crowl
Last Changed Rev: 176906
Last Changed Date: 2011-07-28 16:18:55 -0700 (Thu, 28 Jul 2011)

I did a successful build + pph check of both debug and opt builds
(incremental build only, I didn't actually need to start from scratch;
however I was stashing changes to some headers in libcpp, so
potentially that rebuilt some things that weren't rebuilt in a smaller
incremental build if there is a missing dependency..?)

Gab

On Thu, Jul 28, 2011 at 10:01 PM, Diego Novillo wrote:
> On Thu, Jul 28, 2011 at 16:30, Lawrence Crowl wrote:
>> I'm getting massive failures after incorporating this change:
>>
>>   bytecode stream: trying to read 1735 bytes after the end of the
>>   input buffer
>>
>> where the number of bytes changes.  Suggestions?
>
> Odd.  I'm getting the usual results with:
>
> $ git svn info
> Path: .
> URL: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph/gcc
> Repository Root: svn+ssh://gcc.gnu.org/svn/gcc
> Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
> Revision: 176671
> Node Kind: directory
> Schedule: normal
> Last Changed Author: gchare
> Last Changed Rev: 176671
> Last Changed Date: 2011-07-22 21:04:48 -0400 (Fri, 22 Jul 2011)
>
> Perhaps a file did not get rebuilt after you updated your tree?  That
> may point to a Makefile dependency bug.  Or maybe you have some local
> patch?
>
>
> Diego.
>


-freorder-function is broken

2011-07-29 Thread Xinliang David Li
The attached patch fixes the problem. The root cause is the change in
ordering between the profile_estimation and tree_profile passes. In
trunk, the function/node frequency is not computed after profile
annotation is done, so that information is missing. As a side effect,
the optimize_function_for_size query is also broken, which leads to
larger-than-necessary binaries with FDO.

Bootstrapped on x86_64/linux. All FDO testing passed.

Ok after regression test?

David




Re: [PATCH] [google] [annotalysis] Fix remove operation from pointer_set in case of hash collisions

2011-07-29 Thread Ollie Wild
Okay for google/gcc-4_6.

Ollie

On Tue, Jul 26, 2011 at 7:27 PM, Delesley Hutchins wrote:
>
> Le-Chun added the additional routine to remove pointers from a set;
> that code is unique to annotalysis.  I can't easily include a test
> case, because the bug is difficult to trigger.  It occurs only when
> there is a hash collision between two pointers in the set, and the
> first pointer is removed before the second.  I do have a test case,
> but it will only work for my particular build on my machine, since the
> actual pointer addresses involved will change as soon as you touch
> something.  I could write a unit test using bogus pointer values that
> are engineered to trigger a collision, but it wouldn't be a normal
> compiler test case; where would I put it?
>
>  -DeLesley
>
> On Tue, Jul 26, 2011 at 5:59 PM, Diego Novillo wrote:
> > On Tue, Jul 26, 2011 at 16:13, Delesley Hutchins wrote:
> >> This patch fixes a bug in pointer_set.c, where removing a pointer from
> >> a pointer set would corrupt the hash table if the pointer was involved
> >> in any hash collisions.
> >
> > Could you include a test case?  It's not clear to me what you are
> > fixing and when this happens.  Is this a bug in trunk as well?  The
> > pointer-set implementation has been around for a while, so I'm
> > surprised that you are running into this now.  Or is this something
> > that only happens with the pointer set changes we have in for
> > annotalysis?
> >
> >
> > Thanks.  Diego.
> >
>
>
>
> --
> DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315
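
For readers following the thread, here is a hedged sketch of the failure mode described above, using a toy open-addressing set with linear probing. It is not the code in pointer_set.c and not the fix from the patch (which is not shown here); every toy_* name and the deliberately weak hash are assumptions made for illustration. It uses bogus integer "pointer" values engineered to collide, along the lines of the unit test DeLesley mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 8   /* power of two, tiny so collisions are easy to force */

struct toy_set
{
  uintptr_t slot[TOY_SLOTS];   /* 0 marks an empty slot */
};

/* Deliberately weak hash: keys that differ by TOY_SLOTS collide.  */
size_t
toy_hash (uintptr_t key)
{
  return (size_t) (key & (TOY_SLOTS - 1));
}

/* Return the slot holding KEY, or the empty slot where it would go.
   The set must always keep at least one empty slot.  */
size_t
toy_find_slot (const struct toy_set *set, uintptr_t key)
{
  size_t n = toy_hash (key);
  while (set->slot[n] != 0 && set->slot[n] != key)
    n = (n + 1) & (TOY_SLOTS - 1);
  return n;
}

int
toy_contains (const struct toy_set *set, uintptr_t key)
{
  return set->slot[toy_find_slot (set, key)] == key;
}

/* Insert KEY, which must be nonzero.  */
void
toy_insert (struct toy_set *set, uintptr_t key)
{
  assert (key != 0);
  set->slot[toy_find_slot (set, key)] = key;
}

/* Buggy removal: clearing the slot cuts the probe chain, so a colliding
   key stored after it can no longer be found.  */
void
toy_remove_naive (struct toy_set *set, uintptr_t key)
{
  size_t n = toy_find_slot (set, key);
  if (set->slot[n] == key)
    set->slot[n] = 0;
}

/* One common repair: clear the slot, then re-insert every entry that
   follows it in the same run of occupied slots.  */
void
toy_remove (struct toy_set *set, uintptr_t key)
{
  size_t n = toy_find_slot (set, key);
  if (set->slot[n] != key)
    return;
  set->slot[n] = 0;
  for (n = (n + 1) & (TOY_SLOTS - 1); set->slot[n] != 0;
       n = (n + 1) & (TOY_SLOTS - 1))
    {
      uintptr_t moved = set->slot[n];
      set->slot[n] = 0;
      toy_insert (set, moved);
    }
}
```

With keys 8 and 16 (both hash to slot 0), removing 8 via toy_remove_naive makes 16 unreachable, while toy_remove keeps it findable. Tombstones would be another option; re-insertion was picked here only because it keeps the sketch short.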


Re: [PATCH, RFC] PR49749 biased reassociation for accumulator patterns

2011-07-29 Thread William J. Schmidt
Here is the final version of the reassociation patch.  There are two
differences from the version I published on 7/27.  I removed the
function call from within the MAX macro per Michael's comment, and I
changed the propagation of the rank of loop-carried phis to be zero.
This involved a small change to propagate_rank, and re-casting
phi_propagation_rank to a predicate function loop_carried_phi.

Performance numbers look good, with some nice gains and no significant
regressions for CPU2006 on powerpc64-linux.  Bootstrapped and regression
tested on powerpc64-linux with no regressions.

Ok for trunk?

Thanks,
Bill
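
As a reading aid (not part of the patch), the accumulator pattern in question looks roughly like the sketch below. The function names are made up for illustration, and unsigned arithmetic is used so that regrouping the additions is always value-preserving. As I read the patch comments, the bias is meant to steer reassociation toward the second shape, where the loop-carried value is added last and the rest of the expression does not depend on the previous iteration.

```c
#include <stddef.h>

/* Left-to-right association: every addition depends on SUM, so the three
   adds of one iteration form a chain hanging off the loop-carried value.  */
unsigned long
sum_chained (const unsigned long *a, const unsigned long *b,
             const unsigned long *c, size_t n)
{
  unsigned long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = ((sum + a[i]) + b[i]) + c[i];
  return sum;
}

/* Accumulator added last: (a[i] + b[i]) + c[i] is independent of SUM, so
   only one add per iteration sits on the loop-carried dependence.  */
unsigned long
sum_accumulator_last (const unsigned long *a, const unsigned long *b,
                      const unsigned long *c, size_t n)
{
  unsigned long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = sum + ((a[i] + b[i]) + c[i]);
  return sum;
}
```

Both functions return the same value; the difference is only in the length of the dependence chain through sum.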


2011-07-29  Bill Schmidt  

PR tree-optimization/49749
* tree-ssa-reassoc.c (get_rank): New forward declaration.
(PHI_LOOP_BIAS): New macro.
(phi_rank): New function.
(loop_carried_phi): Likewise.
(propagate_rank): Likewise.
(get_rank): Add calls to phi_rank and propagate_rank.

Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 176585)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -190,7 +190,115 @@ static long *bb_rank;
 /* Operand->rank hashtable.  */
 static struct pointer_map_t *operand_rank;
 
+/* Forward decls.  */
+static long get_rank (tree);
 
+
+/* Bias amount for loop-carried phis.  We want this to be larger than
+   the depth of any reassociation tree we can see, but not larger than
+   the rank difference between two blocks.  */
+#define PHI_LOOP_BIAS (1 << 15)
+
+/* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
+   an innermost loop, and the phi has only a single use which is inside
+   the loop, then the rank is the block rank of the loop latch plus an
+   extra bias for the loop-carried dependence.  This causes expressions
+   calculated into an accumulator variable to be independent for each
+   iteration of the loop.  If STMT is some other phi, the rank is the
+   block rank of its containing block.  */
+static long
+phi_rank (gimple stmt)
+{
+  basic_block bb = gimple_bb (stmt);
+  struct loop *father = bb->loop_father;
+  tree res;
+  unsigned i;
+  use_operand_p use;
+  gimple use_stmt;
+
+  /* We only care about real loops (those with a latch).  */
+  if (!father->latch)
+return bb_rank[bb->index];
+
+  /* Interesting phis must be in headers of innermost loops.  */
+  if (bb != father->header
+  || father->inner)
+return bb_rank[bb->index];
+
+  /* Ignore virtual SSA_NAMEs.  */
+  res = gimple_phi_result (stmt);
+  if (!is_gimple_reg (SSA_NAME_VAR (res)))
+return bb_rank[bb->index];
+
+  /* The phi definition must have a single use, and that use must be
+ within the loop.  Otherwise this isn't an accumulator pattern.  */
+  if (!single_imm_use (res, &use, &use_stmt)
+  || gimple_bb (use_stmt)->loop_father != father)
+return bb_rank[bb->index];
+
+  /* Look for phi arguments from within the loop.  If found, bias this phi.  */
+  for (i = 0; i < gimple_phi_num_args (stmt); i++)
+{
+  tree arg = gimple_phi_arg_def (stmt, i);
+  if (TREE_CODE (arg) == SSA_NAME
+ && !SSA_NAME_IS_DEFAULT_DEF (arg))
+   {
+ gimple def_stmt = SSA_NAME_DEF_STMT (arg);
+ if (gimple_bb (def_stmt)->loop_father == father)
+   return bb_rank[father->latch->index] + PHI_LOOP_BIAS;
+   }
+}
+
+  /* Must be an uninteresting phi.  */
+  return bb_rank[bb->index];
+}
+
+/* If EXP is an SSA_NAME defined by a PHI statement that represents a
+   loop-carried dependence of an innermost loop, return TRUE; else
+   return FALSE.  */
+static bool
+loop_carried_phi (tree exp)
+{
+  gimple phi_stmt;
+  long block_rank;
+
+  if (TREE_CODE (exp) != SSA_NAME
+  || SSA_NAME_IS_DEFAULT_DEF (exp))
+return false;
+
+  phi_stmt = SSA_NAME_DEF_STMT (exp);
+
+  if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI)
+return false;
+
+  /* Non-loop-carried phis have block rank.  Loop-carried phis have
+ an additional bias added in.  If this phi doesn't have block rank,
+ it's biased and should not be propagated.  */
+  block_rank = bb_rank[gimple_bb (phi_stmt)->index];
+
+  if (phi_rank (phi_stmt) != block_rank)
+return true;
+
+  return false;
+}
+
+/* Return the maximum of RANK and the rank that should be propagated
+   from expression OP.  For most operands, this is just the rank of OP.
+   For loop-carried phis, the value is zero to avoid undoing the bias
+   in favor of the phi.  */
+static long
+propagate_rank (long rank, tree op)
+{
+  long op_rank;
+
+  if (loop_carried_phi (op))
+return rank;
+
+  op_rank = get_rank (op);
+
+  return MAX (rank, op_rank);
+}
+
 /* Look up the operand rank structure for expression E.  */
 
 static inline long
@@ -232,11 +340,38 @@ get_rank (tree e)
  I make no claims that this is optimal, however, it gives good
  results.  */
 
+  /* We make an exception to the normal ranking system to break
+  

[gomp3.1] Allow const qualified static data members with no mutable members in firstprivate clauses

2011-07-29 Thread Jakub Jelinek
Hi!

My clarification request for ambiguity in the new OpenMP 3.1 standard
http://www.openmp.org/forum/viewtopic.php?f=10&t=1198
has been resolved in that const qualified static data members with no
mutable members should be allowed in firstprivate clauses.
The following patch implements it, regtested on x86_64-linux,
committed to gomp-3_1-branch.

2011-07-29  Jakub Jelinek  

* cp-tree.h (cxx_omp_const_qual_no_mutable): New prototype.
* cp-gimplify.c (cxx_omp_const_qual_no_mutable): New function.
(cxx_omp_predetermined_sharing): Use it.
* semantics.c (finish_omp_clauses): Allow const qualified
static data members having no mutable member in firstprivate
clauses.

* g++.dg/gomp/sharing-2.C: New test.

--- gcc/cp/cp-tree.h.jj 2011-07-11 17:43:48.0 +0200
+++ gcc/cp/cp-tree.h 2011-07-29 18:34:03.0 +0200
@@ -5761,6 +5761,7 @@ extern void init_shadowed_var_for_decl
 extern int cp_gimplify_expr(tree *, gimple_seq *,
 gimple_seq *);
 extern void cp_genericize  (tree);
+extern bool cxx_omp_const_qual_no_mutable  (tree);
 extern enum omp_clause_default_kind cxx_omp_predetermined_sharing (tree);
 extern tree cxx_omp_clause_default_ctor(tree, tree, tree);
 extern tree cxx_omp_clause_copy_ctor   (tree, tree, tree);
--- gcc/cp/cp-gimplify.c.jj 2011-07-11 19:57:49.0 +0200
+++ gcc/cp/cp-gimplify.c 2011-07-29 18:33:14.0 +0200
@@ -1367,26 +1367,15 @@ cxx_omp_privatize_by_reference (const_tr
   return is_invisiref_parm (decl);
 }
 
-/* True if OpenMP sharing attribute of DECL is predetermined.  */
-
-enum omp_clause_default_kind
-cxx_omp_predetermined_sharing (tree decl)
+/* Return true if DECL is const qualified var having no mutable member.  */
+bool
+cxx_omp_const_qual_no_mutable (tree decl)
 {
-  tree type;
-
-  /* Static data members are predetermined as shared.  */
-  if (TREE_STATIC (decl))
-{
-  tree ctx = CP_DECL_CONTEXT (decl);
-  if (TYPE_P (ctx) && MAYBE_CLASS_TYPE_P (ctx))
-   return OMP_CLAUSE_DEFAULT_SHARED;
-}
-
-  type = TREE_TYPE (decl);
+  tree type = TREE_TYPE (decl);
   if (TREE_CODE (type) == REFERENCE_TYPE)
 {
   if (!is_invisiref_parm (decl))
-   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
+   return false;
   type = TREE_TYPE (type);
 
   if (TREE_CODE (decl) == RESULT_DECL && DECL_NAME (decl))
@@ -1410,11 +1399,32 @@ cxx_omp_predetermined_sharing (tree decl
 }
 
   if (type == error_mark_node)
-return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
+return false;
 
   /* Variables with const-qualified type having no mutable member
  are predetermined shared.  */
   if (TYPE_READONLY (type) && !cp_has_mutable_p (type))
+return true;
+
+  return false;
+}
+
+/* True if OpenMP sharing attribute of DECL is predetermined.  */
+
+enum omp_clause_default_kind
+cxx_omp_predetermined_sharing (tree decl)
+{
+  /* Static data members are predetermined shared.  */
+  if (TREE_STATIC (decl))
+{
+  tree ctx = CP_DECL_CONTEXT (decl);
+  if (TYPE_P (ctx) && MAYBE_CLASS_TYPE_P (ctx))
+   return OMP_CLAUSE_DEFAULT_SHARED;
+}
+
+  /* Const qualified vars having no mutable member are predetermined
+ shared.  */
+  if (cxx_omp_const_qual_no_mutable (decl))
 return OMP_CLAUSE_DEFAULT_SHARED;
 
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
--- gcc/cp/semantics.c.jj   2011-07-11 21:43:48.0 +0200
+++ gcc/cp/semantics.c  2011-07-29 18:34:34.0 +0200
@@ -4085,12 +4085,9 @@ finish_omp_clauses (tree clauses)
case OMP_CLAUSE_DEFAULT_UNSPECIFIED:
  break;
case OMP_CLAUSE_DEFAULT_SHARED:
- /* const vars may be specified in firstprivate clause,
-but don't allow static data members.  */
+ /* const vars may be specified in firstprivate clause.  */
  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE
- && (!TREE_STATIC (t)
- || !TYPE_P (CP_DECL_CONTEXT (t))
- || !MAYBE_CLASS_TYPE_P (CP_DECL_CONTEXT (t))))
+ && cxx_omp_const_qual_no_mutable (t))
break;
  share_name = "shared";
  break;
--- gcc/testsuite/g++.dg/gomp/sharing-2.C.jj 2011-07-29 18:48:33.0 +0200
+++ gcc/testsuite/g++.dg/gomp/sharing-2.C   2011-07-29 18:48:23.0 +0200
@@ -0,0 +1,47 @@
+// { dg-do compile }
+
+struct T
+{
+  int i;
+  mutable int j;
+};
+struct S
+{
+  const static int d = 1;
+  const static T e;
+  void foo (int, T);
+};
+
+const int S::d;
+const T S::e = { 2, 3 };
+
+void bar (const int &);
+
+void
+S::foo (const int x, const T y)
+{
+  #pragma omp parallel firstprivate (x)
+bar (x);
+  #pragma omp parallel firstprivate (d)
+bar (d);
+  #pragma omp parallel firstprivate (y)
+bar (y.i);
+  #pragma omp parallel first

Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Dodji Seketeli
Jason Merrill writes:

> On 07/29/2011 03:35 AM, Dodji Seketeli wrote:
>> So do you guys think we should add it nonetheless and just add
>> -Wno-unused-local-typedefs to the tests that exhibit the above issue
>> before fixing PR preprocessor/7263?
>
> Does your set of linemap patches fix the issue?

Yes it does.  Particularly this patch
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01318.html 

>  In that case, we can add it when those go in.

OK.

>  Speaking of which, sorry I haven't found the time to review them yet.

No problem.

-- 
Dodji


[DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Dimitrios Apostolou

Hello list,

the attached patch achieves a performance improvement by first recording 
DF_REF_BASE DEFs within df_get_call_refs() before the DF_REF_REGULARs are 
recorded in df_defs_record(). BASE DEFs are also recorded in REGNO order. 
The improvement was measured as follows when compiling tcp_ipv4.c of 
the Linux kernel at -O0:


trunk  : 1438.4 M instr, 0.627s
patched: 1376.5 M instr, 0.604s

It also includes suggested changes from Paolo discussed on list (subject: 
what can be in a group set). Many thanks to him for the invaluable help 
while writing the patch.


For whoever is interested, you can see the two profiles with fully 
annotated source before and after the change, at the following links. The 
big difference is that qsort() is now called only 33 times instead of 
thousands, from df_sort_and_compress_refs().
Further measurements, comments and ideas for further improvements are 
welcome.


http://gcc.gnu.org/wiki/OptimisingGCC?action=AttachFile&do=view&target=callgrind-tcp_ipv4-trunk-co-109439-prod.txt
http://gcc.gnu.org/wiki/OptimisingGCC?action=AttachFile&do=view&target=callgrind-tcp_ipv4-df2-co-prod.txt


Changelog:

2011-07-29  Dimitrios Apostolou  
Paolo Bonzini  

(df_def_record_1): Assert a parallel must contain an EXPR_LIST at
this point.  Receive the LOC and move its extraction...
(df_defs_record): ... here. Rewrote logic with a switch statement
instead of multiple if-else.
(df_find_hard_reg_defs, df_find_hard_reg_defs_1): New functions
that duplicate the logic of df_defs_record() and df_def_record_1()
but without actually recording any DEFs, only marking them in
the defs HARD_REG_SET.
(df_get_call_refs): Call df_find_hard_reg_defs() to mark DEFs that
are the result of the call. Record DF_REF_BASE DEFs in REGNO
order. Use regs_invalidated_by_call HARD_REG_SET instead of
regs_invalidated_by_call_regset bitmap.
(df_insn_refs_collect): Record DF_REF_REGULAR DEFs after
df_get_call_refs().


Thanks,
Dimitris


P.S. maraz: that's 4.3% improvement in instruction count, should you start 
worrying or is it too late? ;-)
=== modified file 'gcc/df-scan.c'
--- gcc/df-scan.c   2011-02-02 20:08:06 +
+++ gcc/df-scan.c   2011-07-29 16:01:50 +
@@ -111,7 +111,7 @@ static void df_ref_record (enum df_ref_c
   rtx, rtx *,
   basic_block, struct df_insn_info *,
   enum df_ref_type, int ref_flags);
-static void df_def_record_1 (struct df_collection_rec *, rtx,
+static void df_def_record_1 (struct df_collection_rec *, rtx *,
 basic_block, struct df_insn_info *,
 int ref_flags);
 static void df_defs_record (struct df_collection_rec *, rtx,
@@ -2916,40 +2916,27 @@ df_read_modify_subreg_p (rtx x)
 }
 
 
-/* Process all the registers defined in the rtx, X.
+/* Process all the registers defined in the rtx pointed by LOC.
Autoincrement/decrement definitions will be picked up by
df_uses_record.  */
 
 static void
 df_def_record_1 (struct df_collection_rec *collection_rec,
- rtx x, basic_block bb, struct df_insn_info *insn_info,
+ rtx *loc, basic_block bb, struct df_insn_info *insn_info,
 int flags)
 {
-  rtx *loc;
-  rtx dst;
-
- /* We may recursively call ourselves on EXPR_LIST when dealing with PARALLEL
- construct.  */
-  if (GET_CODE (x) == EXPR_LIST || GET_CODE (x) == CLOBBER)
-loc = &XEXP (x, 0);
-  else
-loc = &SET_DEST (x);
-  dst = *loc;
+  rtx dst = *loc;
 
   /* It is legal to have a set destination be a parallel. */
   if (GET_CODE (dst) == PARALLEL)
 {
   int i;
-
   for (i = XVECLEN (dst, 0) - 1; i >= 0; i--)
{
  rtx temp = XVECEXP (dst, 0, i);
- if (GET_CODE (temp) == EXPR_LIST || GET_CODE (temp) == CLOBBER
- || GET_CODE (temp) == SET)
-   df_def_record_1 (collection_rec,
- temp, bb, insn_info,
-GET_CODE (temp) == CLOBBER
-? flags | DF_REF_MUST_CLOBBER : flags);
+ gcc_assert (GET_CODE (temp) == EXPR_LIST);
+ df_def_record_1 (collection_rec, &XEXP (temp, 0),
+  bb, insn_info, flags);
}
   return;
 }
@@ -3003,26 +2990,98 @@ df_defs_record (struct df_collection_rec
int flags)
 {
   RTX_CODE code = GET_CODE (x);
+  int i;
 
-  if (code == SET || code == CLOBBER)
-{
-  /* Mark the single def within the pattern.  */
-  int clobber_flags = flags;
-  clobber_flags |= (code == CLOBBER) ? DF_REF_MUST_CLOBBER : 0;
-  df_def_record_1 (collection_rec, x, bb, insn_info, clobber_flags);
-}
-  else if (code == COND_EXEC)
+  switch (code)
 {
+case SET:
+  df_def_record_1 (collection_rec, &SET_DEST (x), bb, insn_info, flags);
+  

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Paolo Bonzini

On 07/29/2011 07:23 PM, Dimitrios Apostolou wrote:


2011-07-29  Dimitrios Apostolou 
 Paolo Bonzini 

 (df_def_record_1): Assert a parallel must contain an EXPR_LIST at
 this point.  Receive the LOC and move its extraction...
 (df_defs_record): ... here. Rewrote logic with a switch statement
 instead of multiple if-else.
 (df_find_hard_reg_defs, df_find_hard_reg_defs_1): New functions
 that duplicate the logic of df_defs_record() and df_def_record_1()
 but without actually recording any DEFs, only marking them in
 the defs HARD_REG_SET.
 (df_get_call_refs): Call df_find_hard_reg_defs() to mark DEFs that
 are the result of the call. Record DF_REF_BASE DEFs in REGNO
 order. Use regs_invalidated_by_call HARD_REG_SET instead of
 regs_invalidated_by_call_regset bitmap.
 (df_insn_refs_collect): Record DF_REF_REGULAR DEFs after
 df_get_call_refs().


Ok for mainline.  I will commit it for you after rebootstrapping (just 
to be safe).



P.S. maraz: that's 4.3% improvement in instruction count, should you start 
worrying or is it too late?


Now I'm curious!

Paolo


Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Dodji Seketeli
Jason Merrill writes:

> On 07/29/2011 08:36 AM, Dodji Seketeli wrote:
>> Looking into this a bit, it seems to me that I can access
>> cfun->language->base (of type c_language_function) from inside either
>> the C or C++ FE only, as the type of cfun->language -- which is of type
>> struct language_function -- is only defined either in c-lang.h or
>> cp-tree.h.  I cannot access it from c-common.c.
>
> I think you can use (struct c_language_function *)cfun->language.

I see.

Looking a bit further, it looks like the C FE uses cfun->language only
to store the context of the outer function when faced with a nested
function.  This is done by c_push_function_context, called by
c_parser_declaration_or_fndef.  Otherwise, cfun->language is not
allocated.  Is it appropriate that -Wunused-local-typedefs allocates it
as well?

-- 
Dodji


Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Kenneth Zadeck

were these tested on any platform aside from x86?

On 07/29/2011 01:26 PM, Paolo Bonzini wrote:

On 07/29/2011 07:23 PM, Dimitrios Apostolou wrote:


2011-07-29  Dimitrios Apostolou 
 Paolo Bonzini 

 (df_def_record_1): Assert a parallel must contain an EXPR_LIST at
 this point.  Receive the LOC and move its extraction...
 (df_defs_record): ... here. Rewrote logic with a switch statement
 instead of multiple if-else.
 (df_find_hard_reg_defs, df_find_hard_reg_defs_1): New functions
 that duplicate the logic of df_defs_record() and df_def_record_1()
 but without actually recording any DEFs, only marking them in
 the defs HARD_REG_SET.
 (df_get_call_refs): Call df_find_hard_reg_defs() to mark DEFs that
 are the result of the call. Record DF_REF_BASE DEFs in REGNO
 order. Use regs_invalidated_by_call HARD_REG_SET instead of
 regs_invalidated_by_call_regset bitmap.
 (df_insn_refs_collect): Record DF_REF_REGULAR DEFs after
 df_get_call_refs().


Ok for mainline.  I will commit it for you after rebootstrapping (just 
to be safe).


P.S. maraz: that's 4.3% improvement in instruction count, should you 
start worrying or is it too late?


Now I'm curious!

Paolo


Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Dimitrios Apostolou


Completely forgot it: Tested on i386, no regressions.


Dimitrios


Patch for C++ build on HP-UX and to implement -static-libstdc++

2011-07-29 Thread Steve Ellcey
I recently discovered that after moving to the C++ bootstrap, the IA64
HP-UX GCC compiler would build and test OK, but if you tried to run the
installed compiler you would get an error about not being able to find
libgcc_s.so.0.

Joseph and Rainer pointed me at defining gcc_cv_ld_static_dynamic,
gcc_cv_ld_static_option, and gcc_cv_ld_dynamic_option in the
configure.ac script and that helped but resulted in a new problem.

In addition to allowing GCC to link with the static libgcc and libstdc++
these changes resulted in GCC trying to link in a static libunwind.
On IA64 HP-UX we use the system libunwind and it only comes in a shared
version.  So my solution was to change gcc.c so that we only link in the
archive libunwind if we are not using the system libunwind.  I did this by
having unwind_ipinfo.m4 set a new macro (USE_SYSTEM_LIBUNWIND) when
we are using the system libunwind and then checking for this in
gcc.c.

I am not sure if any other platforms use the system libunwind and if
this will cause them problems.  IA64 HP-UX is the only platform to use the
system libunwind by default and I'd rather not invent a new config flag
to handle the static vs. dynamic libunwind issue unless we need to.

In addition to this problem on IA64 I found that the 32 bit PA compiler
was not building (due to a problem with not finding libgcc_s) and that
the 64 bit PA compiler was not handling -static-libstdc++.  This patch
fixes those problems as well with the change to configure.ac.

Hopefully, there is no issue with the configure.ac change, but I would
like some feedback on the unwind_ipinfo.m4 and gcc.c changes for
USE_SYSTEM_LIBUNWIND.  I don't know if there is a better way to handle
this or not.

Tested on IA64 HP-UX and Linux and on 32 and 64 bit PA.

OK for checkin?

Steve Ellcey
s...@cup.hp.com


config/ChangeLog

2011-07-28  Steve Ellcey  

* unwind_ipinfo.m4 (USE_SYSTEM_LIBUNWIND): Define.


gcc/ChangeLog


2011-07-28  Steve Ellcey  

* configure.ac (gcc_cv_ld_static_dynamic): Define for *-*-hpux*.
(gcc_cv_ld_static_option): Ditto.
(gcc_cv_ld_dynamic_option): Ditto.
* gcc.c (init_spec): Use USE_SYSTEM_LIBUNWIND.



Index: config/unwind_ipinfo.m4
===
--- config/unwind_ipinfo.m4 (revision 176899)
+++ config/unwind_ipinfo.m4 (working copy)
@@ -34,4 +34,8 @@
   if test x$have_unwind_getipinfo = xyes; then
 AC_DEFINE(HAVE_GETIPINFO, 1, [Define if _Unwind_GetIPInfo is available.])
   fi
+
+  if test x$with_system_libunwind = xyes; then
+AC_DEFINE(USE_SYSTEM_LIBUNWIND, 1, [Define if using system unwind library.])
+  fi
 ])
Index: gcc/configure.ac
===
--- gcc/configure.ac    (revision 176899)
+++ gcc/configure.ac    (working copy)
@@ -3240,6 +3240,13 @@
   *-*-solaris2*)
 gcc_cv_ld_static_dynamic=yes
 ;;
+  *-*-hpux*)
+   if test x"$gnu_ld" = xno; then
+ gcc_cv_ld_static_dynamic=yes
+ gcc_cv_ld_static_option="-aarchive"
+ gcc_cv_ld_dynamic_option="-adefault"
+   fi
+   ;;
 esac
   fi
 fi
Index: gcc/gcc.c
===
--- gcc/gcc.c   (revision 176899)
+++ gcc/gcc.c   (working copy)
@@ -1389,7 +1389,7 @@
"-lgcc",
"-lgcc_eh"
 #ifdef USE_LIBUNWIND_EXCEPTIONS
-# ifdef HAVE_LD_STATIC_DYNAMIC
+# if defined(HAVE_LD_STATIC_DYNAMIC) && !defined(USE_SYSTEM_LIBUNWIND)
" %{!static:" LD_STATIC_OPTION "} -lunwind"
" %{!static:" LD_DYNAMIC_OPTION "}"
 # else


Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Kenneth Zadeck
i really think that patches of this magnitude having to do with the rtl 
level should be tested on more than one platform.


kenny

On 07/29/2011 01:39 PM, Dimitrios Apostolou wrote:


Completely forgot it: Tested on i386, no regressions.


Dimitrios


[committed] Fix OpenMP shared var handling in nested parallels (PR middle-end/49897)

2011-07-29 Thread Jakub Jelinek
Hi!

This is something that happened to work in 4.4 and earlier, before
DECL_GIMPLE_FORMAL_TEMP_P removal.  If use_pointer_for_field needs to return
true because something is shared in a nested parallel and thus in-out
wouldn't work, as each thread would have its own location, and that var
isn't addressable, before DECL_GIMPLE_FORMAL_TEMP_P removal
.omp_data_2.o.y = &y;
would be gimplified as is and nothing complained about the missing
TREE_ADDRESSABLE on y, supposedly because ompexp cleaned it up.
But after that change, i.e. in 4.5+, the above is gimplified into
  y.3 = y;
  .omp_data_2.o.y = &y.3;
and thus the inner parallel modifies a wrong variable.
Fixed by treating it like the other case where we need to make
the var addressable for tasks.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
and 4.6.
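
To illustrate why the extra temporary matters, here is a minimal sketch in
plain C (no OpenMP needed).  The struct and function names are hypothetical
stand-ins for the compiler-generated .omp_data record and the outlined inner
region; this is not GCC's actual lowering, only the same pointer-vs-copy
situation in miniature:

```c
/* Hypothetical stand-in for the compiler-generated .omp_data record;
   the field holds the address of the shared variable.  */
struct omp_data
{
  int *y;
};

/* Stand-in for the body of the inner parallel region: it writes to
   the shared variable through the pointer it was handed.  */
static void
inner_region (struct omp_data *d)
{
  *d->y = 42;
}

/* Shape of the 4.4 gimplification: the record points at Y itself,
   so the store made by the inner region is visible in Y.  */
int
share_by_address (void)
{
  int y = 0;
  struct omp_data d;
  d.y = &y;
  inner_region (&d);
  return y;
}

/* Shape of the 4.5+ gimplification before the fix: Y is first copied
   into a temporary and the record points at the temporary, so Y is
   never updated.  */
int
share_through_temporary (void)
{
  int y = 0;
  int y_3 = y;
  struct omp_data d;
  d.y = &y_3;
  inner_region (&d);
  return y;
}
```

In the second function the write lands in the temporary, which corresponds to
the "inner parallel modifies a wrong variable" symptom described above.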

2011-07-29  Jakub Jelinek  

PR middle-end/49897
PR middle-end/49898
* omp-low.c (use_pointer_for_field): If disallowing copy-in/out
in nested parallel and outer is a gimple_reg, mark it as addressable
and set its bit in task_shared_vars bitmap too.

* testsuite/libgomp.c/pr49897-1.c: New test.
* testsuite/libgomp.c/pr49897-2.c: New test.
* testsuite/libgomp.c/pr49898-1.c: New test.
* testsuite/libgomp.c/pr49898-2.c: New test.

--- gcc/omp-low.c.jj  2011-07-21 09:54:49.0 +0200
+++ gcc/omp-low.c   2011-07-29 16:31:08.0 +0200
@@ -781,7 +781,7 @@ use_pointer_for_field (tree decl, omp_co
  break;
 
  if (c)
-   return true;
+   goto maybe_mark_addressable_and_ret;
}
}
 
@@ -791,7 +791,9 @@ use_pointer_for_field (tree decl, omp_co
 returns, the task hasn't necessarily terminated.  */
   if (!TREE_READONLY (decl) && is_task_ctx (shared_ctx))
{
- tree outer = maybe_lookup_decl_in_outer_ctx (decl, shared_ctx);
+ tree outer;
+   maybe_mark_addressable_and_ret:
+ outer = maybe_lookup_decl_in_outer_ctx (decl, shared_ctx);
  if (is_gimple_reg (outer))
{
  /* Taking address of OUTER in lower_send_shared_vars
--- libgomp/testsuite/libgomp.c/pr49897-1.c.jj  2011-07-29 17:03:07.0 +0200
+++ libgomp/testsuite/libgomp.c/pr49897-1.c 2011-07-29 17:00:23.0 +0200
@@ -0,0 +1,31 @@
+/* PR middle-end/49897 */
+/* { dg-do run } */
+
+extern void abort (void);
+
+int
+main ()
+{
+  int i, j, x = 0, y, sum = 0;
+#pragma omp parallel reduction(+:sum)
+  {
+  #pragma omp for firstprivate(x) lastprivate(x, y)
+for (i = 0; i < 10; i++)
+  {
+   x = i;
+   y = 0;
+  #pragma omp parallel reduction(+:sum)
+   {
+   #pragma omp for firstprivate(y) lastprivate(y)
+ for (j = 0; j < 10; j++)
+   {
+ y = j;
+ sum += y;
+   }
+   }
+  }
+  }
+  if (x != 9 || y != 9 || sum != 450)
+abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/pr49897-2.c.jj  2011-07-29 17:03:07.0 +0200
+++ libgomp/testsuite/libgomp.c/pr49897-2.c 2011-07-29 17:00:07.0 +0200
@@ -0,0 +1,25 @@
+/* PR middle-end/49897 */
+/* { dg-do run } */
+
+extern void abort (void);
+
+int
+main ()
+{
+  int i, j, x = 0, y, sum = 0;
+#pragma omp parallel for reduction(+:sum) firstprivate(x) lastprivate(x, y)
+  for (i = 0; i < 10; i++)
+{
+  x = i;
+  y = 0;
+#pragma omp parallel for reduction(+:sum) firstprivate(y) lastprivate(y)
+  for (j = 0; j < 10; j++)
+   {
+ y = j;
+ sum += y;
+   }
+}
+  if (x != 9 || y != 9 || sum != 450)
+abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/pr49898-1.c.jj  2011-07-29 17:03:07.0 +0200
+++ libgomp/testsuite/libgomp.c/pr49898-1.c 2011-07-29 17:01:44.0 +0200
@@ -0,0 +1,26 @@
+/* PR middle-end/49898 */
+/* { dg-do run } */
+
+extern void abort (void);
+
+int
+main ()
+{
+  int i, j, sum = 0;
+#pragma omp parallel
+  {
+  #pragma omp for reduction(+:sum)
+for (i = 0; i < 10; i++)
+  {
+  #pragma omp parallel
+   {
+   #pragma omp for reduction(+:sum)
+ for (j = 0; j < 10; j++)
+   sum += j;
+   }
+  }
+  }
+  if (sum != 450)
+abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/pr49898-2.c.jj  2011-07-29 17:03:07.0 +0200
+++ libgomp/testsuite/libgomp.c/pr49898-2.c 2011-07-29 17:02:28.0 +0200
@@ -0,0 +1,18 @@
+/* PR middle-end/49898 */
+/* { dg-do run } */
+
+extern void abort (void);
+
+int
+main ()
+{
+  int i, j, sum = 0;
+#pragma omp parallel for reduction(+:sum)
+  for (i = 0; i < 10; i++)
+#pragma omp parallel for reduction(+:sum)
+for (j = 0; j < 10; j++)
+  sum += j;
+  if (sum != 450)
+abort ();
+  return 0;
+}

Jakub


Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Dimitrios Apostolou

On Fri, 29 Jul 2011, Kenneth Zadeck wrote:

> i really think that patches of this magnitude having to do with the rtl level
> should be tested on more than one platform.


I'd really appreciate further testing on alternate platforms from whoever 
does it casually, for me it would take too much time to setup my testing 
platform on GCC compile farm, and deadlines are approaching.



Thanks,
Dimitris



Re: Patch for C++ build on HP-UX and to implement -static-libstdc++

2011-07-29 Thread Rainer Orth
Steve,

> Index: gcc/configure.ac
> ===
> --- gcc/configure.ac  (revision 176899)
> +++ gcc/configure.ac  (working copy)
> @@ -3240,6 +3240,13 @@
>*-*-solaris2*)
>  gcc_cv_ld_static_dynamic=yes
>  ;;
> +  *-*-hpux*)
> + if test x"$gnu_ld" = xno; then
> +   gcc_cv_ld_static_dynamic=yes
> +   gcc_cv_ld_static_option="-aarchive"
> +   gcc_cv_ld_dynamic_option="-adefault"
> + fi
> + ;;
>  esac
>fi
>  fi

just a nit, but could you keep the cases sorted alphabetically?

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Patch for C++ build on HP-UX and to implement -static-libstdc++

2011-07-29 Thread Steve Ellcey
On Fri, 2011-07-29 at 20:00 +0200, Rainer Orth wrote:
> Steve,
> 
> > Index: gcc/configure.ac
> > ===
> > --- gcc/configure.ac (revision 176899)
> > +++ gcc/configure.ac (working copy)
> > @@ -3240,6 +3240,13 @@
> >*-*-solaris2*)
> >  gcc_cv_ld_static_dynamic=yes
> >  ;;
> > +  *-*-hpux*)
> > +   if test x"$gnu_ld" = xno; then
> > + gcc_cv_ld_static_dynamic=yes
> > + gcc_cv_ld_static_option="-aarchive"
> > + gcc_cv_ld_dynamic_option="-adefault"
> > +   fi
> > +   ;;
> >  esac
> >fi
> >  fi
> 
> just a nit, but could you keep the cases sorted alphabetically?
> 
> Thanks.
> Rainer

I can do that.  And add a comment line like the other entries have.

Steve Ellcey
s...@cup.hp.com



[lra] another patch to decrease ARM code size degradation.

2011-07-29 Thread Vladimir Makarov
The patch decreases the code size degradation on ARM (SPEC2000 was used) by 
improving hard regno preferences for reload and inheritance pseudos, 
evaluating the cost of alternatives with early clobbers more accurately, 
fixing missed copies for shuffling reload and inheritance pseudos, and 
fixing missed removal of reversed equiv insns (i.e. EQUIV_MEM <- pseudo).


The patch was successfully bootstrapped on x86-64 and ppc64 
(unfortunately, i64 is broken on the branch).


2011-07-29  Vladimir Makarov 

* lra-constraints.c (LOSER_COST_FACTOR, MAX_OVERALL_COST_BOUND):
New macros.
(process_alt_operands): Use the macros.  Adjust losers and
overall for reloads because of early clobbers.
(curr_insn_transform): Use MAX_OVERALL_COST_BOUND.
(lra_constraints): Fix typo with parentheses.

* lra-lives.c (process_bb_lives): Permit creation of copies
involving all new pseudos.

* lra-assigns.c (curr_update_hard_regno_preference_check): New variable.
(update_hard_regno_preference_check): Ditto.
(update_hard_regno_preference): New function.
(lra_setup_reg_renumber): Use update_hard_regno_preference.
(assign_by_spills): Initialize update_hard_regno_preference_check
and curr_update_hard_regno_preference_check.

Index: lra-lives.c
===
--- lra-lives.c (revision 176797)
+++ lra-lives.c (working copy)
@@ -533,13 +533,14 @@ process_bb_lives (basic_block bb)
  /* Check that source regno does not conflict with
 destination regno to exclude most impossible
 preferences.  */
- && (src_regno = REGNO (SET_SRC (set))) >= FIRST_PSEUDO_REGISTER
-&& ! sparseset_bit_p (pseudos_live, src_regno))
-   || (src_regno < FIRST_PSEUDO_REGISTER
-   && ! TEST_HARD_REG_BIT (hard_regs_live, src_regno)))
-  /* It might be 'inheritance pseudo <- reload pseudo'.  */
-  || (src_regno >= lra_constraint_new_regno_start
-  && (int) ORIGINAL_REGNO (SET_DEST (set)) == src_regno
+ && src_regno = REGNO (SET_SRC (set))) >= FIRST_PSEUDO_REGISTER
+   && ! sparseset_bit_p (pseudos_live, src_regno))
+  || (src_regno < FIRST_PSEUDO_REGISTER
+  && ! TEST_HARD_REG_BIT (hard_regs_live, src_regno)))
+ /* It might be 'inheritance pseudo <- reload pseudo'.  */
+ || (src_regno >= lra_constraint_new_regno_start
+ && ((int) REGNO (SET_DEST (set))
+ >= lra_constraint_new_regno_start
{
  int hard_regno = -1, regno = -1;
 
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 176806)
+++ lra-constraints.c   (working copy)
@@ -1313,6 +1313,13 @@ uses_hard_regs_p (rtx *loc, HARD_REG_SET
   return false;
 }
 
+/* Cost factor for each additional reload and maximal cost bound for
+   insn reloads.  One might ask about such strange numbers.  Their
+   values occurred historically from former reload pass. In some way,
+   even machine descriptions.  */
+#define LOSER_COST_FACTOR 6
+#define MAX_OVERALL_COST_BOUND 600
+
 /* Major function to choose the current insn alternative and what
operands should be reload and how.  If ONLY_ALTERNATIVE is not
negative we should consider only this alternative.  Return false if
@@ -1915,12 +1922,12 @@ process_alt_operands (int only_alternati
{
  if (targetm.preferred_reload_class
  (op, this_alternative) == NO_REGS)
-   reject = 600;
+   reject = MAX_OVERALL_COST_BOUND;
  
  if (curr_static_id->operand[nop].type == OP_OUT
  && (targetm.preferred_output_reload_class
  (op, this_alternative) == NO_REGS))
-   reject = 600;
+   reject = MAX_OVERALL_COST_BOUND;
}
   
  /* We prefer to reload pseudos over reloading other
@@ -1958,7 +1965,7 @@ process_alt_operands (int only_alternati
  /* ??? Should we update the cost because early clobber
 register reloads or it is a rare thing to be worth to do
 it.  */
- overall = losers * 6 + reject;
+ overall = losers * LOSER_COST_FACTOR + reject;
  if ((best_losers == 0 || losers != 0) && best_overall < overall)
goto fail;
 
@@ -2027,7 +2034,11 @@ process_alt_operands (int only_alternati
  /* We need to reload early clobbered register.  */
  for (j = 0; j < n_operands; j++)
if (curr_alt_matches[j] == i)
- curr_alt_match_win[j] = false;
+ {
+   curr_alt_match_win[j] = false;
+   losers++;
+   overall += LOSER_COST_FACTOR;
+ }
  if (! cur

Re: Patches ping

2011-07-29 Thread Ayal Zaks
> [PATCH, SMS 3/4] Optimize stage count
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01341.html

This patch for minimizing the stage count (which also refactors and
cleans up the code) is approved. Have some minor comments below,
followed by some thoughts for possible follow-up improvements.
Thanks,
Ayal.

>Changelog:
>
>       (sms_schedule_by_order): Update call to get_sched_window.
>       all set_must_precede_follow.
    ^^^
    call


>+/* Set bitmaps TMP_FOLLOW and TMP_PRECEDE to MUST_FOLLOW and MUST_PRECEDE
>+   respectively only if cycle C falls in the scheduling window boundaries
    ^^
 on the border of


>   sbitmap tmp_precede = NULL;
>   sbitmap tmp_follow = NULL;
are redundantly reset in set_must_precede_follow().



>+/* Update the sched_params for node U using the II,
    ^
(time, row and stage)
 >+   the CYCLE of U and MIN_CYCLE.  */
Please also clarify that we're not simply taking
  SCHED_STAGE (u) = CALC_STAGE_COUNT (SCHED_TIME (u), min_cycle, ii);
because the stages are aligned on cycle 0.


>+  /* First, normailize the partial schedualing.  */
   ^   ^

>+   /* Try to achieve optimized SC by normalizing the partial
>+  schedule (having the cycles start from cycle zero). The branch
   ^
>+  location must be placed in row ii-1 in the final scheduling.

>+  If that's not the case after the normalization then try to
    ^^
>+  move the branch to that row if possible.  */
    ^
    If failed, shift all instructions to position the branch in row ii-1.


For consistency and clarity, maybe instead of:

>+   /* Bring the branch to cycle -1.  */
>+   int amount = SCHED_TIME (g->closing_branch) + 1;
it would be better to have:

+   /* Bring the branch to cycle ii-1.  */
+   int amount = SCHED_TIME (g->closing_branch) - (ii - 1);
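
As a rough check that both forms land the branch in the same row, here is a
small sketch.  It assumes that a row is the schedule time taken modulo ii with
a non-negative result, and that the shift is applied by subtracting amount
from every schedule time; the helper names are made up for illustration and
are not taken from modulo-sched.c:

```c
/* Non-negative remainder of TIME modulo II; assumed here to be how a
   cycle maps to a row of the kernel.  */
int
row_of (int time, int ii)
{
  int r = time % ii;
  return r < 0 ? r + ii : r;
}

/* Shift that brings the branch to cycle -1 (the form in the patch).  */
int
amount_to_cycle_minus_one (int branch_time)
{
  return branch_time + 1;
}

/* Shift that brings the branch to cycle ii-1 (the suggested form).  */
int
amount_to_cycle_ii_minus_one (int branch_time, int ii)
{
  return branch_time - (ii - 1);
}
```

Under those assumptions, subtracting either amount from the branch's time
leaves it in row ii-1; the two shifts differ by exactly ii, so the choice
affects readability rather than the resulting row.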


Some thoughts on possible improvements (not mandatory; especially
given the delay in approval, sorry..thanks for the ping):
o Have optimize_sc() take care of all possible rotations doing the
best it can, without returning an indication which leaves subsequent
(suboptimal) processing to the caller.
o Instead of removing the branch and then trying to get it back into
the same cycle if you can't place it in row ii-1, consider keeping it
in its place and removing it only if you succeed to schedule it
(also..) in row ii-1.
o May be worthwhile to apply more refactoring, so that the code to
reschedule the branch in row ii-1 reuses more of the general code for
scheduling an instruction (possibly setting end = start + end), but
avoid splitting a row.
o Would be interesting to learn of loops that still have suboptimal
SC's, to see if it's still an issue.
Ayal.


2011/7/20 Revital Eres 
>
> Hello,
>
> [PATCH, SMS 3/4] Optimize stage count
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01341.html
>
> [PATCH, SMS 4/4] Misc. fixes
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01342.html
>
> [PATCH, SMS] Fix calculation of issue_rate
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01344.html
>
> Thanks,
> Revital


Re: Patches ping

2011-07-29 Thread Ayal Zaks
> [PATCH, SMS 4/4] Misc. fixes
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01342.html
Sure, this is fine.
(Sorry for all the previous '?'s..).
Thanks,
Ayal.

2011/7/20 Revital Eres 
>
> Hello,
>
> [PATCH, SMS 3/4] Optimize stage count
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01341.html
>
> [PATCH, SMS 4/4] Misc. fixes
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01342.html
>
> [PATCH, SMS] Fix calculation of issue_rate
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01344.html
>
> Thanks,
> Revital


Re: PATCH: Use long long for 64bit int in config/i386/64/sfp-machine.h

2011-07-29 Thread NightStrike
On Thu, Jul 28, 2011 at 3:45 PM, H.J. Lu  wrote:
> Hi Ian,
>
> For 64bit x86 targets, long is 32bit for x32 and win64.  But long long
> is always 64bit.  This patch removes _WIN64 check.  OK for trunk?

Isn't that what int64_t is for?
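
For reference, a minimal sketch of the size question, assuming a C99
<stdint.h> is available (whether sfp-machine.h can rely on that header is a
separate matter not settled in this thread).  The function names are made up
for illustration:

```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of each candidate type on the current target.
   long is 64 bits on LP64 targets but 32 bits on x32 and win64.  */
int
bits_of_long (void)
{
  return (int) (sizeof (long) * CHAR_BIT);
}

/* long long is at least 64 bits on every conforming C99 target.  */
int
bits_of_long_long (void)
{
  return (int) (sizeof (long long) * CHAR_BIT);
}

/* int64_t, where provided, is exactly 64 bits with no padding.  */
int
bits_of_int64 (void)
{
  return (int) (sizeof (int64_t) * CHAR_BIT);
}
```

Only the first of the three results varies between the x86-64 ABIs mentioned
above.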


Re: Patches ping

2011-07-29 Thread Ayal Zaks
> [PATCH, SMS] Fix calculation of issue_rate
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01344.html

This is ok (with the updated Changelog). Alternatively, we can have a
local variable for holding the issue_rate.
Ayal.


2011/7/20 Revital Eres :
> Hello,
>
> [PATCH, SMS 3/4] Optimize stage count
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01341.html
>
> [PATCH, SMS 4/4] Misc. fixes
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01342.html
>
> [PATCH, SMS] Fix calculation of issue_rate
> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01344.html
>
> Thanks,
> Revital
>


Re: [PATCH, PR 49886] Prevent fnsplit from changing signature when there are type attributes

2011-07-29 Thread Martin Jambor
Hi,

On Thu, Jul 28, 2011 at 06:52:05PM +0200, Martin Jambor wrote:
> pass_split_functions is happy to split functions which have type
> attributes but cannot update them if the new clone has in any way
> different parameters than the original.  This can lead to
> miscompilations in cases like the testcase.
> 
> This patch solves it by 1) making the inliner set the
> can_change_signature flag to false for them because their signature
> cannot be changed (this step is also necessary to make IPA-CP operate
> on them and handle them correctly), and 2) make the splitting pass
> keep all parameters if the flag is set.  The second step might involve
> inventing some default definitions if the parameters did not really
> have any.
> 
> I spoke about this with Honza and he claimed that the new function is
> really an entirely different thing and that the parameters may
> correspond only very loosely and thus the type attributes should be
> cleared.  I'm not sure I agree, but in any case I needed this to work
> to allow me continue with promised IPA-CP polishing and so I decided
> to do this because it was easier.  (My own opinion is that the current
> representation of parameter-describing function type attributes is
> evil and will cause harm no matter what we do.)
> 

Actually, I'd like to commit the patch below which also clears
can_change_signature for BUILT_IN_VA_START.  It is not really
necessary for this fix but fixes some problems in a followup patch and
is also the correct thing to do because if we clone a function calling
it and pass non-NULL for args_to_skip, the new clone would not have a
stdarg_p type and fold_builtin_next_arg could error when dealing with
it.
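
To make the va_start dependency concrete, here is a small sketch of an
ordinary variadic function (the name sum_ints is made up for illustration).
va_start needs both the last named parameter and a variadic prototype, which
is why a clone with a changed parameter list could not keep such a call
valid:

```c
#include <stdarg.h>

/* Sum COUNT int arguments.  va_start uses the last named parameter
   (COUNT) to locate the anonymous arguments, so the function must
   keep its original, variadic signature.  */
int
sum_ints (int count, ...)
{
  va_list ap;
  int total = 0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    total += va_arg (ap, int);
  va_end (ap);
  return total;
}
```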

Also bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2011-07-29  Martin Jambor  

PR middle-end/49886
* ipa-inline-analysis.c (compute_inline_parameters): Set
can_change_signature of nodes with type attributes.
* ipa-split.c (split_function): Do not skip any arguments if
can_change_signature is set.

* testsuite/gcc.c-torture/execute/pr49886.c: New testcase.

Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -1658,18 +1658,28 @@ compute_inline_parameters (struct cgraph
   /* Can this function be inlined at all?  */
   info->inlinable = tree_inlinable_function_p (node->decl);
 
-  /* Inlinable functions always can change signature.  */
-  if (info->inlinable)
-node->local.can_change_signature = true;
+  /* Type attributes can use parameter indices to describe them.  */
+  if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
+node->local.can_change_signature = false;
   else
 {
-  /* Functions calling builtin_apply can not change signature.  */
-  for (e = node->callees; e; e = e->next_callee)
-   if (DECL_BUILT_IN (e->callee->decl)
-   && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
-   && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
- break;
-  node->local.can_change_signature = !e;
+  /* Otherwise, inlinable functions always can change signature.  */
+  if (info->inlinable)
+   node->local.can_change_signature = true;
+  else
+   {
+ /* Functions calling builtin_apply can not change signature.  */
+ for (e = node->callees; e; e = e->next_callee)
+   {
+ tree cdecl = e->callee->decl;
+ if (DECL_BUILT_IN (cdecl)
+ && DECL_BUILT_IN_CLASS (cdecl) == BUILT_IN_NORMAL
+ && (DECL_FUNCTION_CODE (cdecl) == BUILT_IN_APPLY_ARGS
+ || DECL_FUNCTION_CODE (cdecl) == BUILT_IN_VA_START))
+   break;
+   }
+ node->local.can_change_signature = !e;
+   }
 }
   estimate_function_body_sizes (node, early);
 
Index: src/gcc/ipa-split.c
===
--- src.orig/gcc/ipa-split.c
+++ src/gcc/ipa-split.c
@@ -945,10 +945,10 @@ static void
 split_function (struct split_point *split_point)
 {
   VEC (tree, heap) *args_to_pass = NULL;
-  bitmap args_to_skip = BITMAP_ALLOC (NULL);
+  bitmap args_to_skip;
   tree parm;
   int num = 0;
-  struct cgraph_node *node;
+  struct cgraph_node *node, *cur_node = cgraph_get_node (current_function_decl);
   basic_block return_bb = find_return_bb ();
   basic_block call_bb;
   gimple_stmt_iterator gsi;
@@ -968,17 +968,30 @@ split_function (struct split_point *spli
   dump_split_point (dump_file, split_point);
 }
 
+  if (cur_node->local.can_change_signature)
+args_to_skip = BITMAP_ALLOC (NULL);
+  else
+args_to_skip = NULL;
+
   /* Collect the parameters of new function and args_to_skip bitmap.  */
   for (parm = DECL_ARGUMENTS (current_function_decl);
parm; parm = DECL_CHAIN (parm), num++)
-if (!is_gimple_reg (parm)
-   || !gim

Re: [PATCH] PR c++/33255 - Support -Wunused-local-typedefs warning

2011-07-29 Thread Jason Merrill

On 07/29/2011 10:27 AM, Dodji Seketeli wrote:
> Jason Merrill  writes:
>
>> On 07/29/2011 08:36 AM, Dodji Seketeli wrote:
>>> Looking into this a bit, it seems to me that I can access
>>> cfun->language->base (of type c_language_function) from inside either
>>> the C or C++ FE only, as the type of cfun->language -- which is of type
>>> struct language_function -- is only defined either in c-lang.h or
>>> cp-tree.h.  I cannot access it from c-common.c.
>>
>> I think you can use (struct c_language_function *)cfun->language.
>
> I see.
>
> Looking a bit further, it looks like the C FE uses cfun->language only
> to store the context of the outer function when faced with a nested
> function.  This is done by c_push_function_context, called by
> c_parser_declaration_or_fndef.  Otherwise, cfun->language is not
> allocated.  Is it appropriate that -Wunused-local-typedefs allocates it
> as well?


I think so.  Joseph?

Jason


Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Steven Bosscher
On Fri, Jul 29, 2011 at 7:57 PM, Dimitrios Apostolou wrote:
> On Fri, 29 Jul 2011, Kenneth Zadeck wrote:
>
>> i really think that patches of this magnitude having to do with the rtl level
>> should be tested on more than one platform.
>
> I'd really appreciate further testing on alternate platforms from whoever
> does it casually, for me it would take too much time to setup my testing
> platform on GCC compile farm, and deadlines are approaching.

"I love deadlines. I love the whooshing sound they make as they go by."
--Douglas Adams

I'll see if I can test the patch on the compile farm this weekend,
just to be sure.

Ciao!
Steven


Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Kenneth Zadeck

you are the best

kenny

On 07/29/2011 05:48 PM, Steven Bosscher wrote:
> On Fri, Jul 29, 2011 at 7:57 PM, Dimitrios Apostolou wrote:
>> On Fri, 29 Jul 2011, Kenneth Zadeck wrote:
>>
>>> i really think that patches of this magnitude having to do with the rtl level
>>> should be tested on more than one platform.
>>
>> I'd really appreciate further testing on alternate platforms from whoever
>> does it casually, for me it would take too much time to setup my testing
>> platform on GCC compile farm, and deadlines are approaching.
>
> "I love deadlines. I love the whooshing sound they make as they go by."
> --Douglas Adams
>
> I'll see if I can test the patch on the compile farm this weekend,
> just to be sure.
>
> Ciao!
> Steven


Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-29 Thread H.J. Lu
On Thu, Jul 28, 2011 at 1:03 PM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 9:03 PM, H.J. Lu  wrote:
>
>>>> This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?
>>>
>>> http://gcc.gnu.org/contribute.html#patches
>>>
>>
>> Sorry. I should have mentioned testcase in:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47766
>>
>> Actually, they are in gcc testsuite.  I noticed them when
>> I run gcc testsuite on x32.
>
> This looks like a middle-end problem to me.
>
> According to the documentation:
>
> --quote--
> `stack_protect_set'
>     This pattern, if defined, moves a `Pmode' value from the memory in
>     operand 1 to the memory in operand 0 without leaving the value in
>     a register afterward.  This is to avoid leaking the value some
>     place that an attacker might use to rewrite the stack guard slot
>     after having clobbered it.
>
>     If this pattern is not defined, then a plain move pattern is
>     generated.
>
> `stack_protect_test'
>     This pattern, if defined, compares a `Pmode' value from the memory
>     in operand 1 with the memory in operand 0 without leaving the
>     value in a register afterward and branches to operand 2 if the
>     values weren't equal.
>
>     If this pattern is not defined, then a plain compare pattern and
>     conditional branch pattern is used.
> --quote--
>
> According to the documentation, x86 patterns are correct. However,
> middle-end fails to extend ptr_mode value to Pmode, and in function.c,
> stack_protect_prologue/stack_protect_epilogue, we already have
> ptr_mode (SImode) operand:
>
> (mem/v/f/c/i:SI (plus:DI (reg/f:DI 54 virtual-stack-vars)
>        (const_int -4 [0xfffc])) [2 D.2704+0 S4 A32])
>
> (mem/v/f/c/i:SI (symbol_ref:DI ("__stack_chk_guard") [flags 0x40]
> ) [2 __stack_chk_guard+0 S4
> A32])
>
> An opinion of a RTL maintainer (CC'd) is needed here. Target
> definition is OK in its current form.

When -fstack-protector was added, the code was:

tree
default_stack_protect_guard (void)
{
  tree t = stack_chk_guard_decl;

  if (t == NULL)
{
  rtx x;

  t = build_decl (UNKNOWN_LOCATION,
  VAR_DECL, get_identifier ("__stack_chk_guard"),
  ptr_type_node);
  ^

I think ptr_mode is intended and Pmode is just a typo.  Jakub, Richard,
what are your thoughts on this?

Thanks.


-- 
H.J.


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-29 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:34 PM, Paolo Bonzini  wrote:
> Ok, you removed ignore_address_wrap_around, so we're almost there.
>
> On 07/28/2011 07:59 PM, H.J. Lu wrote:
>>
>> @@ -712,7 +715,16 @@ convert_modes (enum machine_mode mode, enum
>> machine_mode oldmode, rtx x, int uns
>>    if (GET_CODE (x) == SUBREG&&  SUBREG_PROMOTED_VAR_P (x)
>>        &&  GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)))>= GET_MODE_SIZE
>> (mode)
>>        &&  SUBREG_PROMOTED_UNSIGNED_P (x) == unsignedp)
>> -    x = gen_lowpart (mode, x);
>> +    {
>> +      temp = rtl_hooks.gen_lowpart_no_emit (mode, x);
>> +      if (temp)
>> +       x = temp;
>> +      else
>> +       {
>> +         gcc_assert (!no_emit);
>> +         x = gen_lowpart (mode, x);
>> +       }
>> +    }
>
> +    {
> +       /* gen_lowpart_no_emit should always succeed here.  */
> +       x = rtl_hooks.gen_lowpart_no_emit (mode, x);
> +    }
>
>>
>>    if (GET_MODE (x) != VOIDmode)
>>      oldmode = GET_MODE (x);
>> @@ -776,6 +788,10 @@ convert_modes (enum machine_mode mode, enum
>> machine_mode oldmode, rtx x, int uns
>>          return gen_int_mode (val, mode);
>>        }
>>
>> +      temp = rtl_hooks.gen_lowpart_no_emit (mode, x);
>> +      if (temp)
>> +       return temp;
>> +      gcc_assert (!no_emit);
>>        return gen_lowpart (mode, x);
>
> Right now, gen_lowpart_no_emit will never return NULL, so these tests in
> convert_modes are dead.  Instead, please include in your patch mine at
> http://permalink.gmane.org/gmane.comp.gcc.patches/242085 and adjust as
> follows.
>
> +      temp = rtl_hooks.gen_lowpart_no_emit (mode, x);
> +      if (no_emit)
> +        return rtl_hooks.gen_lowpart_no_emit (mode, x);
> +      else
> +        return gen_lowpart (mode, x);
>
>>      }
>
> If it does not work, PLEASE say why instead of posting another "updated
> patch".

The whole approach doesn't work. The testcase at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721#c1

shows GCC depends on transforming:

(zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))

to

(plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))

Otherwise we get either a compiler crash or wrong code.
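
For readers less used to RTL, the two forms can be mimicked in C as follows.
This is only an arithmetic sketch with made-up function names; it assumes the
constant Y is sign-extended when used in the 64-bit addition, and it says
nothing about how GCC itself decides when the permutation is applied:

```c
#include <stdint.h>

/* (zero_extend:DI (plus:SI x y)): add in 32 bits, then widen.  */
uint64_t
extend_after_add (uint32_t x, int32_t y)
{
  return (uint64_t) (uint32_t) (x + (uint32_t) y);
}

/* (plus:DI (zero_extend:DI x) y): widen, then add in 64 bits.  */
uint64_t
add_after_extend (uint32_t x, int32_t y)
{
  return (uint64_t) x + (uint64_t) (int64_t) y;
}
```

The two functions return the same value as long as the 32-bit addition does
not wrap, for example x = 16, y = 8.  With x = 0xfffffffc and y = 8 the first
returns 4 while the second returns 0x100000004.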


-- 
H.J.


PATCH: [x32]: Check TARGET_LP64 for SIZE_TYPE/PTRDIFF_TYPE

2011-07-29 Thread H.J. Lu
Hi,

X32 is 32bit.  This patch checks TARGET_LP64 for SIZE_TYPE/PTRDIFF_TYPE.
OK for trunk?

Thanks.


H.J.
---
2011-07-29  H.J. Lu  

* config/i386/x86-64.h (SIZE_TYPE): Check TARGET_LP64 instead
of TARGET_64BIT.
(PTRDIFF_TYPE): Likewise.

diff --git a/gcc/config/i386/x86-64.h b/gcc/config/i386/x86-64.h
index b85dab9..d20f326 100644
--- a/gcc/config/i386/x86-64.h
+++ b/gcc/config/i386/x86-64.h
@@ -38,10 +38,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define MCOUNT_NAME "mcount"
 
 #undef SIZE_TYPE
-#define SIZE_TYPE (TARGET_64BIT ? "long unsigned int" : "unsigned int")
+#define SIZE_TYPE (TARGET_LP64 ? "long unsigned int" : "unsigned int")
 
 #undef PTRDIFF_TYPE
-#define PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
+#define PTRDIFF_TYPE (TARGET_LP64 ? "long int" : "int")
 
 #undef WCHAR_TYPE
 #define WCHAR_TYPE "int"


Re: [pph] Free buffers used during tree encoding/decoding

2011-07-29 Thread Lawrence Crowl
I removed the build directories and rebuilt.  Everything worked.
There are gremlins in the machine.

On 7/29/11, Gabriel Charette  wrote:
> I just stashed all my changes and pulled in the latest svn HEAD this
> morning to check if I was seeing these failures:
>
> Repository Root: svn+ssh://gcc.gnu.org/svn/gcc
> Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
> Revision: 176906
> Node Kind: directory
> Schedule: normal
> Last Changed Author: crowl
> Last Changed Rev: 176906
> Last Changed Date: 2011-07-28 16:18:55 -0700 (Thu, 28 Jul 2011)
>
> I did a successful build + pph check of both debug and opt builds
> (incremental build only, I didn't actually need to start from scratch;
> however I was stashing changes to some headers in libcpp, so
> potentially that rebuilt somethings that weren't rebuilt in a smaller
> incremental build if there is a missing dependency..?)
>
> Gab
>
> On Thu, Jul 28, 2011 at 10:01 PM, Diego Novillo  wrote:
>> On Thu, Jul 28, 2011 at 16:30, Lawrence Crowl  wrote:
>>> I'm getting massive failures after incorporating this change:
>>>
>>>   bytecode stream: trying to read 1735 bytes after the end of the
>>>   input buffer
>>>
>>> where the number of bytes changes.  Suggestions?
>>
>> Odd.  I'm getting the usual results with:
>>
>> $ git svn info
>> Path: .
>> URL: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph/gcc
>> Repository Root: svn+ssh://gcc.gnu.org/svn/gcc
>> Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
>> Revision: 176671
>> Node Kind: directory
>> Schedule: normal
>> Last Changed Author: gchare
>> Last Changed Rev: 176671
>> Last Changed Date: 2011-07-22 21:04:48 -0400 (Fri, 22 Jul 2011)
>>
>> Perhaps a file did not get rebuilt after you updated your tree?  That
>> may point to a Makefile dependency bug.  Or maybe you have some local
>> patch?
>>
>>
>> Diego.
>>
>


-- 
Lawrence Crowl


Re: [PATCH 1/7] Linemap infrastructure for virtual locations

2011-07-29 Thread Jason Merrill

On 07/16/2011 07:37 AM, Dodji Seketeli wrote:

+/* Returns the highest location [of a token resulting from macro
+   expansion] encoded in this line table.  */
+#define LINEMAPS_MACRO_HIGHEST_LOCATION(SET) \
+  LINEMAPS_HIGHEST_LOCATION(SET, true)
+
+/* Returns the location of the begining of the highest line
+   -- containing a token resulting from macro expansion --  encoded
+   in the line table SET.  */
+#define LINEMAPS_MACRO_HIGHEST_LINE(SET) \
+  LINEMAPS_HIGHEST_LINE(SET, true)


What is the use of these?  The ordinary highest line/location are used 
for various things, but I can't think of a reason you would want the 
above, nor are they used in any of the patches.  Maybe these should be 
in line_maps instead of maps_info?


Jason


Re: [PATCH] Use HOST_WIDE_INTs in gcd and least_common_multiple.

2011-07-29 Thread Richard Henderson
On 07/28/2011 10:47 PM, Sebastian Pop wrote:
> 2011-01-28  Sebastian Pop  
>   Joseph Myers  
> 
>   * Makefile.in (hwint.o): Depend on DIAGNOSTIC_CORE_H.
>   * hwint.c: Include diagnostic-core.h.
>   (abs_hwi): New.
>   (gcd): Moved here...
>   (pos_mul_hwi): New.
>   (mul_hwi): New.
>   (least_common_multiple): Moved here...
>   * hwint.h (gcd): ... from here.
>   (least_common_multiple): ... from here.
>   (HOST_WIDE_INT_MIN): New.
>   (HOST_WIDE_INT_MAX): New.
>   (abs_hwi): Declared.
>   (gcd): Declared.
>   (pos_mul_hwi): Declared.
>   (mul_hwi): Declared.
>   (least_common_multiple): Declared.
>   * omega.c (check_pos_mul): Removed.
>   (check_mul): Removed.
>   (omega_solve_geq): Use pos_mul_hwi instead of check_pos_mul and
>   mul_hwi instead of check_mul.

OK.


r~


Re: [PATCH] libtool -- don't print warnings with --silent

2011-07-29 Thread John David Anglin

Ping?

On 9-Jul-11, at 7:03 PM, John David Anglin wrote:

The attached patch fixes the boehm-gc testsuite on hppa2.0w-hp-hpux11.11.
Without it, libtool always generates an informational warning when linking,
causing the entire boehm-gc testsuite to fail.

Ok?  Ralf, would you please install in the libtool tree if ok.

2011-07-09  John David Anglin  

PR boehm-gc/48494
	* ltmain.sh (func_warning): Don't print warnings if opt_silent is true.


Index: ltmain.sh
===
--- ltmain.sh   (revision 176045)
+++ ltmain.sh   (working copy)
@@ -437,7 +437,9 @@
# Echo program name prefixed warning message to standard error.
func_warning ()
{
-$opt_warning && $ECHO "$progname${mode+: }$mode: warning: "${1+"$@"} 1>&2
+${opt_silent-false} || {
+  $opt_warning && $ECHO "$progname${mode+: }$mode: warning: "${1+"$@"} 1>&2
+}

# bash bug again:
:

Dave
--
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)




[v3] docbook biblioid/imagedata markup fixes

2011-07-29 Thread Benjamin Kosnik

As noted earlier today, this removes various warnings when processing
doc/xml/* files.

tested x86/linux

-benjamin

2011-07-29  Benjamin Kosnik  

	* doc/xml/manual/build_hacking.xml: Markup imagedata changes.
	* doc/xml/manual/policy_data_structures.xml: Same.

	* doc/xml/class.txml: Remove biblioid.
	* doc/xml/manual/allocator.xml: Same.
	* doc/xml/manual/ctype.xml: Same.
	* doc/xml/manual/codecvt.xml: Same.
	* doc/xml/manual/backwards_compatibility.xml: Same.
	* doc/xml/manual/abi.xml: Same.
	* doc/xml/manual/shared_ptr.xml: Same.
	* doc/xml/manual/using_exceptions.xml: Same.
	* doc/xml/manual/messages.xml: Same.

Index: doc/xml/class.txml
===
--- doc/xml/class.txml	(revision 176956)
+++ doc/xml/class.txml	(working copy)
@@ -108,48 +108,57 @@

 
 
-Bibliography
+
+	  
+	
+	Bibliography
+		
+
 
 
 
 
-  
-  
 
 
 
Index: doc/xml/manual/allocator.xml
===
--- doc/xml/manual/allocator.xml	(revision 176956)
+++ doc/xml/manual/allocator.xml	(working copy)
@@ -504,11 +504,12 @@
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.drdobbs.com/cpp/184403759"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.drdobbs.com/cpp/184403759";>
   The Standard Librarian: What Are Allocators Good For?
-
+	
+  
 
 MattAustern
 
@@ -519,21 +520,23 @@
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.cs.umass.edu/~emery/hoard/"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.cs.umass.edu/~emery/hoard";>
   The Hoard Memory Allocator
-
+	
+  
 
 EmeryBerger
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf";>
   Reconsidering Custom Memory Allocation
-
+	
+  
 
 EmeryBerger
 BenZorn
@@ -546,12 +549,14 @@
 
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.angelikalanger.com/Articles/C++Report/Allocators/Allocators.html"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.angelikalanger.com/Articles/C++Report/Allocators/Allocators.html";>
   Allocator Types
-
+	
+  
 
+
 KlausKreft
 AngelikaLanger
 
Index: doc/xml/manual/ctype.xml
===
--- doc/xml/manual/ctype.xml	(revision 176956)
+++ doc/xml/manual/ctype.xml	(working copy)
@@ -166,11 +166,12 @@
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.unix.org/version3/ieee_std.html"; class="uri">
-  
-  
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.unix.org/version3/ieee_std.html";>
 	The Open Group Base Specifications, Issue 6 (IEEE Std. 1003.1-2004)
-  
+	
+  
 
 
   1999
Index: doc/xml/manual/codecvt.xml
===
--- doc/xml/manual/codecvt.xml	(revision 176956)
+++ doc/xml/manual/codecvt.xml	(working copy)
@@ -586,11 +586,13 @@
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.opengroup.org/austin/"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.opengroup.org/austin";>
   System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008)
-
+	
+  
+
 
   2008
   
@@ -639,33 +641,37 @@
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.lysator.liu.se/c/na1.html"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.lysator.liu.se/c/na1.html";>
   A brief description of Normative Addendum 1
-
+	
+  
 
 FeatherClive
 Extended Character Sets
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://tldp.org/HOWTO/Unicode-HOWTO.html"; class="uri">
-	
-	
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://tldp.org/HOWTO/Unicode-HOWTO.html";>
 	  The Unicode HOWTO
-	
+	
+  
 
 HaibleBruno
   
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.cl.cam.ac.uk/~mgk25/unicode.html"; class="uri">
-
-
+  
+	http://www.w3.org/1999/xlink";
+	  xlink:href="http://www.cl.cam.ac.uk/~mgk25/unicode.html";>
   UTF-8 and Unicode FAQ for Unix/Linux
-
+	
+  
 
+
 KhunMarkus
   
 
Index: doc/xml/manual/backwards_compatibility.xml
===
--- doc/xml/manual/backwards_compatibility.xml	(revision 176956)
+++ doc/xml/manual/backwards_compatibility.xml	(working copy)
@@ -1249,31 +1249,33 @@
 
 
   
-http://www.w3.org/1999/xlink"; xlink:href="http://www.kegel.com/gcc/gcc4.html"; class="uri">
-
-
+   

Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-29 Thread Sebastian Pop
Hi Richi,

On Thu, Jul 28, 2011 at 03:58, Richard Guenther  wrote:
> So maybe we can instead try to avoid using unsigned arithmetic
> for symbolic niters if the source does not have it unsigned?

Ok, so what about the attached patch that makes niter use the original
type as much as possible?  I.e. for the trivial cases that don't use division.
Regstrap in progress on amd64-linux.
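
To illustrate why the type used for the iteration count matters, here is a small standalone sketch in plain C. It is not GCC code; the function names are hypothetical. It shows that a count computed in an unsigned type and then widened is zero-extended, while the same count kept in the original signed type is sign-extended.

```c
#include <stdint.h>

/* Compute n - 1 in a 32-bit unsigned type, then widen to 64 bits.
   The subtraction wraps modulo 2^32 and the widening zero-extends.  */
int64_t
widen_via_unsigned (int32_t n)
{
  uint32_t niter = (uint32_t) n - 1u;
  return (int64_t) niter;
}

/* Compute n - 1 in the original signed type, then widen.  The sign is
   preserved.  Assumes n > INT32_MIN so that n - 1 does not overflow.  */
int64_t
widen_via_signed (int32_t n)
{
  int32_t niter = n - 1;
  return (int64_t) niter;
}
```

For n == 0 the first function yields 4294967295 and the second yields -1; for positive n both agree. A consumer working in wider or arbitrary precision sees two different values in the first case.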

Sebastian
From 992e0e8c7b15610bf7b9092f0723fb77e323de3a Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Fri, 29 Jul 2011 11:08:47 -0500
Subject: [PATCH 1/2] Use build_zero_cst or build_one_cst.

---
 gcc/tree-ssa-loop-niter.c |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 4acfc67..2ee3f6e 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -101,7 +101,7 @@ split_to_var_and_offset (tree expr, tree *var, mpz_t offset)
   break;
 
 case INTEGER_CST:
-  *var = build_int_cst_type (type, 0);
+  *var = build_zero_cst (type);
   off = tree_to_double_int (expr);
   mpz_set_double_int (offset, off, TYPE_UNSIGNED (type));
   break;
@@ -522,7 +522,7 @@ inverse (tree x, tree mask)
 }
   else
 {
-  rslt = build_int_cst (type, 1);
+  rslt = build_one_cst (type);
   for (; ctr; ctr--)
 	{
 	  rslt = int_const_binop (MULT_EXPR, rslt, x);
@@ -670,7 +670,7 @@ number_of_iterations_ne (tree type, affine_iv *iv, tree final,
 - tree_low_cst (bits, 1)));
 
   d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
-			   build_int_cst (niter_type, 1), bits);
+			   build_one_cst (niter_type), bits);
   s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);
 
   if (!exit_must_be_taken)
@@ -679,7 +679,7 @@ number_of_iterations_ne (tree type, affine_iv *iv, tree final,
 	 assumptions for divisibility of c.  */
   assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
   assumption = fold_build2 (EQ_EXPR, boolean_type_node,
-assumption, build_int_cst (niter_type, 0));
+assumption, build_zero_cst (niter_type));
   if (!integer_nonzerop (assumption))
 	niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
 	  niter->assumptions, assumption);
@@ -846,7 +846,7 @@ assert_no_overflow_lt (tree type, affine_iv *iv0, affine_iv *iv1,
 	}
   else
 	diff = fold_build2 (MINUS_EXPR, niter_type, step,
-			build_int_cst (niter_type, 1));
+			build_one_cst (niter_type));
   bound = fold_build2 (MINUS_EXPR, type,
 			   TYPE_MAX_VALUE (type), fold_convert (type, diff));
   assumption = fold_build2 (LE_EXPR, boolean_type_node,
@@ -867,7 +867,7 @@ assert_no_overflow_lt (tree type, affine_iv *iv0, affine_iv *iv1,
 	}
   else
 	diff = fold_build2 (MINUS_EXPR, niter_type, step,
-			build_int_cst (niter_type, 1));
+			build_one_cst (niter_type));
   bound = fold_build2 (PLUS_EXPR, type,
 			   TYPE_MIN_VALUE (type), fold_convert (type, diff));
   assumption = fold_build2 (GE_EXPR, boolean_type_node,
@@ -963,7 +963,7 @@ assert_loop_rolls_lt (tree type, affine_iv *iv0, affine_iv *iv1,
   if (integer_nonzerop (iv0->step))
 {
   diff = fold_build2 (MINUS_EXPR, type1,
-			  iv0->step, build_int_cst (type1, 1));
+			  iv0->step, build_one_cst (type1));
 
   /* We need to know that iv0->base >= MIN + iv0->step - 1.  Since
 	 0 address never belongs to any object, we can assume this for
@@ -985,7 +985,7 @@ assert_loop_rolls_lt (tree type, affine_iv *iv0, affine_iv *iv1,
   else
 {
   diff = fold_build2 (PLUS_EXPR, type1,
-			  iv1->step, build_int_cst (type1, 1));
+			  iv1->step, build_one_cst (type1));
 
   if (!POINTER_TYPE_P (type))
 	{
@@ -1083,7 +1083,7 @@ number_of_iterations_lt (tree type, affine_iv *iv0, affine_iv *iv1,
 {
   affine_iv zps;
 
-  zps.base = build_int_cst (niter_type, 0);
+  zps.base = build_zero_cst (niter_type);
   zps.step = step;
   /* number_of_iterations_lt_to_ne will add assumptions that ensure that
 	 zps does not overflow.  */
@@ -1102,7 +1102,7 @@ number_of_iterations_lt (tree type, affine_iv *iv0, affine_iv *iv1,
   assert_loop_rolls_lt (type, iv0, iv1, niter, bnds);
 
   s = fold_build2 (MINUS_EXPR, niter_type,
-		   step, build_int_cst (niter_type, 1));
+		   step, build_one_cst (niter_type));
   delta = fold_build2 (PLUS_EXPR, niter_type, delta, s);
   niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, delta, step);
 
@@ -1167,13 +1167,13 @@ number_of_iterations_le (tree type, affine_iv *iv0, affine_iv *iv1,
 	iv1->base = fold_build_pointer_plus_hwi (iv1->base, 1);
   else
 	iv1->base = fold_build2 (PLUS_EXPR, type1, iv1->base,
- build_int_cst (type1, 1));
+ build_one_cst (type1));
 }
   else if (POINTER_TYPE_P (type))
 iv0->base = fold_build_pointer_plus_hwi (iv0->base, -1);
   else
 iv0->base = fold_build2 (MINUS_EXPR, type1,
-			 iv0->base, build_int_cst (type1, 1));
+			 

C++ PATCH for c++/49867 (ICE with case label in lambda)

2011-07-29 Thread Jason Merrill
When we enter a local class, including a lambda, we are no longer in the 
scope of any enclosing loops or switches, and we should adjust the 
parser state accordingly.
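
The save/clear/restore pattern described above can be sketched in isolation as follows. This is a simplified model, not the parser's real data structures: `struct parser_sketch`, the `IN_SWITCH` and `IN_ITERATION` flags, and all function names are hypothetical, and the class body is modelled as a callback that inspects the parser state.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch.  */
enum { IN_SWITCH = 1, IN_ITERATION = 2 };

/* Reduced model of the two fields touched by the fix.  */
struct parser_sketch
{
  unsigned char in_statement;
  bool in_switch_statement_p;
};

/* A case label is only acceptable directly inside a switch.  */
bool
case_label_ok (const struct parser_sketch *p)
{
  return p->in_switch_statement_p;
}

/* A break is only acceptable inside a loop or switch.  */
bool
break_ok (const struct parser_sketch *p)
{
  return p->in_statement != 0;
}

/* Model of entering a local class or lambda body: save the enclosing
   statement context, clear it while BODY runs, then restore it.  */
bool
parse_local_class_body (struct parser_sketch *p,
			bool (*body) (const struct parser_sketch *))
{
  unsigned char saved_in_statement = p->in_statement;
  bool saved_in_switch_statement_p = p->in_switch_statement_p;
  bool ok;

  p->in_statement = 0;
  p->in_switch_statement_p = false;

  ok = body (p);

  p->in_statement = saved_in_statement;
  p->in_switch_statement_p = saved_in_switch_statement_p;
  return ok;
}
```

With the parser state set as if inside a switch, `case_label_ok` succeeds when called directly but fails when run through `parse_local_class_body`, and the original state is intact afterwards.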


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 72900624cbb7b7ded0deb615c5175fe34531914a
Author: Jason Merrill 
Date:   Fri Jul 29 15:22:27 2011 -0700

	PR c++/49867
	* parser.c (cp_parser_lambda_expression): Also clear in_statement
	and in_switch_statement_p.
	(cp_parser_class_specifier): Likewise.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b7410d5..3828ca9 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7437,8 +7437,12 @@ cp_parser_lambda_expression (cp_parser* parser)
 /* Inside the class, surrounding template-parameter-lists do not apply.  */
 unsigned int saved_num_template_parameter_lists
 = parser->num_template_parameter_lists;
+unsigned char in_statement = parser->in_statement;
+bool in_switch_statement_p = parser->in_switch_statement_p;
 
 parser->num_template_parameter_lists = 0;
+parser->in_statement = 0;
+parser->in_switch_statement_p = false;
 
 /* By virtue of defining a local class, a lambda expression has access to
the private variables of enclosing classes.  */
@@ -7471,6 +7475,8 @@ cp_parser_lambda_expression (cp_parser* parser)
 type = finish_struct (type, /*attributes=*/NULL_TREE);
 
 parser->num_template_parameter_lists = saved_num_template_parameter_lists;
+parser->in_statement = in_statement;
+parser->in_switch_statement_p = in_switch_statement_p;
   }
 
   pop_deferring_access_checks ();
@@ -17007,6 +17013,8 @@ cp_parser_class_specifier_1 (cp_parser* parser)
   bool nested_name_specifier_p;
   unsigned saved_num_template_parameter_lists;
   bool saved_in_function_body;
+  unsigned char in_statement;
+  bool in_switch_statement_p;
   bool saved_in_unbraced_linkage_specification_p;
   tree old_scope = NULL_TREE;
   tree scope = NULL_TREE;
@@ -17060,6 +17068,12 @@ cp_parser_class_specifier_1 (cp_parser* parser)
   /* We are not in a function body.  */
   saved_in_function_body = parser->in_function_body;
   parser->in_function_body = false;
+  /* Or in a loop.  */
+  in_statement = parser->in_statement;
+  parser->in_statement = 0;
+  /* Or in a switch.  */
+  in_switch_statement_p = parser->in_switch_statement_p;
+  parser->in_switch_statement_p = false;
   /* We are not immediately inside an extern "lang" block.  */
   saved_in_unbraced_linkage_specification_p
 = parser->in_unbraced_linkage_specification_p;
@@ -17254,6 +17268,8 @@ cp_parser_class_specifier_1 (cp_parser* parser)
   pop_deferring_access_checks ();
 
   /* Restore saved state.  */
+  parser->in_switch_statement_p = in_switch_statement_p;
+  parser->in_statement = in_statement;
   parser->in_function_body = saved_in_function_body;
   parser->num_template_parameter_lists
 = saved_num_template_parameter_lists;
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-switch.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-switch.C
new file mode 100644
index 000..c306771
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-switch.C
@@ -0,0 +1,26 @@
+// PR c++/49867
+// { dg-options -std=c++0x }
+
+int
+main ()
+{
+  void (*l)();
+  while (true)
+{
+  switch (3)
+	{
+	  struct A {
+	void f()
+	{
+	case 4:		// { dg-error "case" }
+	  break;		// { dg-error "break" }
+	}
+	  };
+	  l = []()
+	{
+	case 3:		// { dg-error "case" }
+	  break;		// { dg-error "break" }
+	};
+	}
+}
+}