[PATCH, c]: Fix PR52290, [4.4/4.5/4.6/4.7 Regression] internal compiler error: tree check: expected function_decl, have var_decl in start_function, at c-decl.c:7712

2012-02-23 Thread Uros Bizjak
Hello!

With invalid code, we can trick grokdeclarator into returning a VAR_DECL
even when FUNCDEF context is requested. The attached one-liner detects
this situation and exits early from start_function. The new error
stream looks correct to me; with the following invalid testcase we get:

$ cat pr52290.c
int f()[j]

$ ~/gcc-build-fast/gcc/cc1 pr52290.c
pr52290.c:3:9: error: ‘j’ undeclared here (not in a function)
pr52290.c:3:1: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’
at end of input
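
The early exit described above can be sketched outside of GCC. This is a minimal illustration, assuming a toy tagged record in place of GCC's tree nodes; `toy_code`, `toy_decl` and `toy_start_function` are hypothetical names, not compiler internals:

```c
#include <stddef.h>

/* Toy stand-ins for GCC's tree codes; not the real tree representation.  */
enum toy_code { TOY_VAR_DECL, TOY_FUNCTION_DECL };

struct toy_decl {
  enum toy_code code;
  int source_line;
};

/* Returns 1 if a function definition can start, 0 to signal a syntax
   error to the caller.  */
int toy_start_function (const struct toy_decl *decl1)
{
  /* Reject both a missing decl and a decl of the wrong kind before
     touching any function-only fields.  */
  if (decl1 == NULL || decl1->code != TOY_FUNCTION_DECL)
    return 0;
  return 1;
}
```

Checking the kind before any function-specific accessor runs is what turns the internal compiler error into an ordinary early return.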

2012-02-23  Uros Bizjak  

* c-decl.c (start_function): Exit early if decl1 is not FUNCTION_DECL.

testsuite/ChangeLog:

2012-02-23  Uros Bizjak  

* gcc.dg/noncompile/pr52290.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu.

OK for mainline and release branches?

Uros.
Index: c-decl.c
===
--- c-decl.c(revision 184501)
+++ c-decl.c(working copy)
@@ -7702,7 +7702,8 @@ start_function (struct c_declspecs *declspecs, str
 
   /* If the declarator is not suitable for a function definition,
  cause a syntax error.  */
-  if (decl1 == 0)
+  if (decl1 == 0
+  || TREE_CODE (decl1) != FUNCTION_DECL)
 return 0;
 
   loc = DECL_SOURCE_LOCATION (decl1);
Index: testsuite/gcc.dg/noncompile/pr52290.c
===
--- testsuite/gcc.dg/noncompile/pr52290.c   (revision 0)
+++ testsuite/gcc.dg/noncompile/pr52290.c   (revision 0)
@@ -0,0 +1,3 @@
+/* { dg-error "undeclared here" "" { target *-*-* } 3 } */
+/* { dg-error "expected" "" { target *-*-* } 3 } */
+int f()[j]


Re: [PATCH, i386, Android] Enable exceptions and RTTI by default for Android

2012-02-23 Thread Richard Guenther
On Wed, Feb 22, 2012 at 8:20 PM, Jing Yu  wrote:
> So far, Android ARM toolchain, which builds Android platform for ARM
> boards, does not enable RTTI and exceptions by default. There are
> license concerns with the use of GNU libstdc++ and libsupc++.

That, of course, does not answer my question why

>>> +  "%{!fexceptions:%{!fno-exceptions: -fexceptions}} "  \
>>> +  "%{!frtti:%{!fno-rtti: -frtti}}"

is not a no-op.
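
For reference, a minimal sketch of how a spec of the form `%{!A:%{!B: X}}` behaves, assuming the documented meaning of `%{!S:X}` (substitute X only when switch -S was not given). The helper names are hypothetical and this is not the driver's implementation:

```c
#include <string.h>

/* Returns 1 if switch SW (without the leading '-') appears in ARGV.  */
static int has_switch (int argc, const char *const *argv, const char *sw)
{
  for (int i = 0; i < argc; i++)
    if (argv[i][0] == '-' && strcmp (argv[i] + 1, sw) == 0)
      return 1;
  return 0;
}

/* Models "%{!fexceptions:%{!fno-exceptions: -fexceptions}}": the
   default is appended only when the user gave neither switch.  */
const char *exceptions_spec (int argc, const char *const *argv)
{
  if (!has_switch (argc, argv, "fexceptions")
      && !has_switch (argc, argv, "fno-exceptions"))
    return "-fexceptions";
  return "";
}
```

If exceptions are already the compiler default, as suggested earlier in the thread, the only case in which the spec adds anything is the case in which the addition changes nothing.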

Richard.

> Thanks,
> Jing
>
> On Wed, Feb 22, 2012 at 7:07 AM, Richard Guenther
>  wrote:
>> On Wed, Feb 22, 2012 at 3:57 PM, Ilya Enkovich  
>> wrote:
>>> Hello,
>>>
>>> Here is a simple patch which enables exceptions and RTTI by default
>>> for Android target. OK for trunk?
>>
>> Err - isn't that the default?  Thus, simply delete the bogus spec?
>>
>> Richard.
>>
>>
>>> Thanks,
>>> Ilya
>>> --
>>>
>>> 2012-02-22  Enkovich Ilya  
>>>
>>>        * gcc/config/linux-android.h (ANDROID_CC1PLUS_SPEC): Enable
>>>        exceptions and rtti by default.
>>>
>>>
>>> diff --git a/gcc/config/linux-android.h b/gcc/config/linux-android.h
>>> index 94c5274..7256082 100644
>>> --- a/gcc/config/linux-android.h
>>> +++ b/gcc/config/linux-android.h
>>> @@ -46,8 +46,8 @@
>>>   "%{!fno-pic:%{!fno-PIC:%{!fpic:%{!fPIC: -fPIC"
>>>
>>>  #define ANDROID_CC1PLUS_SPEC                                           \
>>> -  "%{!fexceptions:%{!fno-exceptions: -fno-exceptions}} "               \
>>> -  "%{!frtti:%{!fno-rtti: -fno-rtti}}"
>>> +  "%{!fexceptions:%{!fno-exceptions: -fexceptions}} "          \
>>> +  "%{!frtti:%{!fno-rtti: -frtti}}"
>>>
>>>  #define ANDROID_LIB_SPEC \
>>>   "%{!static: -ldl}"


RE: [Ping] RE: CR16 Port addition

2012-02-23 Thread Gerald Pfeifer
On Mon, 20 Feb 2012, Jayant R. Sonar wrote:
> PFA, the patch modified as per your suggestion.

This is cool, please go ahead.  (It would have been fine to commit
this based on my previous mail. :-)

> I have also attached here another patch for contrib.texi file changes.

Very well.

+Sumanth Gundapaneni and Jayant Sonar for contributing CR16 port.

Can you please make this two separate items, one for you and one
for your colleague (and sort them in properly)?

"CR16 port" -> "the CR16 port".

Please go ahead and commit with these changes.

Thanks,
Gerald


[PATCH] Re: Serious regressions due to newlib's HAVE_INITFINI_ARRAY

2012-02-23 Thread Ulrich Weigand
Jakub Jelinek wrote:
> On Wed, Feb 22, 2012 at 03:55:34PM +0100, Ulrich Weigand wrote:
> > However, the macro HAVE_INITFINI_ARRAY is defined anyway; this
> > definition is done by an internal "newlib.h" header that is pulled
> > in via the  include in GCC's "tsystem.h".  [ This is clearly
> > a violation of C namespace rules, but this has been the situation
> > for all newlib releases since about 2005 ... ]
> 
> Ugh, clearly newlib bug...
> 
> > Any suggestions how to proceed with this welcome!  I'd really
> > like to see this fixed for 4.7, otherwise the compiler will be
> > seriously broken ...
> 
> I guess the easiest would be just to rename the gcc HAVE_INITFINI_ARRAY
> macro to something else, HAVE_INITFINI_ARRAY_SUPPORT or whatever.

Indeed, the following patch fixes the problem for me.
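
The collision can be reproduced in isolation. The sketch below is only an illustration: the first `#define` stands in for the definition leaked by the newlib header, and `old_check`/`new_check` are hypothetical names:

```c
/* Simulates what an unrelated system header does: it defines a macro
   with the same name that GCC's configure uses.  */
#define HAVE_INITFINI_ARRAY 1   /* leaked by the "system header" */

/* GCC's own configure result for this target: the feature was NOT
   detected, so the renamed macro is left undefined.  */
/* #undef HAVE_INITFINI_ARRAY_SUPPORT */

/* Old check: wrongly enabled by the leaked definition.  */
int old_check (void)
{
#ifdef HAVE_INITFINI_ARRAY
  return 1;
#else
  return 0;
#endif
}

/* New check: depends only on the name GCC's configure controls.  */
int new_check (void)
{
#ifdef HAVE_INITFINI_ARRAY_SUPPORT
  return 1;
#else
  return 0;
#endif
}
```

With the rename, the `#ifdef` in GCC's sources no longer sees the foreign definition.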

Tested on spu-elf.

OK for mainline?

Bye,
Ulrich


ChangeLog:

gcc/
* acinclude.m4: Use HAVE_INITFINI_ARRAY_SUPPORT instead of
HAVE_INITFINI_ARRAY to work around namespace pollution in
certain versions of newlib system headers.
* config.in: Regenerate.
* configure: Regenerate.
* config/initfini-array.h: Use HAVE_INITFINI_ARRAY_SUPPORT
instead of HAVE_INITFINI_ARRAY.

libgcc/
* config/ia64/crtbegin.S: Use HAVE_INITFINI_ARRAY_SUPPORT
instead of HAVE_INITFINI_ARRAY.
* config/ia64/crtend.S: Likewise.

Index: libgcc/config/ia64/crtend.S
===
*** libgcc/config/ia64/crtend.S (revision 184438)
--- libgcc/config/ia64/crtend.S (working copy)
*** __DTOR_END__:
*** 39,48 
  __JCR_END__:
data8   0
  
! #ifdef HAVE_INITFINI_ARRAY
.global __do_global_ctors_aux
.hidden __do_global_ctors_aux
! #else /* !HAVE_INITFINI_ARRAY */
  /*
   * Fragment of the ELF _init routine that invokes our dtor cleanup.
   *
--- 39,48 
  __JCR_END__:
data8   0
  
! #ifdef HAVE_INITFINI_ARRAY_SUPPORT
.global __do_global_ctors_aux
.hidden __do_global_ctors_aux
! #else /* !HAVE_INITFINI_ARRAY_SUPPORT */
  /*
   * Fragment of the ELF _init routine that invokes our dtor cleanup.
   *
*** __JCR_END__:
*** 71,77 
  br.call.sptk.many b0 = b6
  ;;
}
! #endif /* !HAVE_INITFINI_ARRAY */
  
  .text
.align 32
--- 71,77 
  br.call.sptk.many b0 = b6
  ;;
}
! #endif /* !HAVE_INITFINI_ARRAY_SUPPORT */
  
  .text
.align 32
Index: libgcc/config/ia64/crtbegin.S
===
*** libgcc/config/ia64/crtbegin.S   (revision 184438)
--- libgcc/config/ia64/crtbegin.S   (working copy)
*** __dso_handle:
*** 61,67 
.hidden __dso_handle
  
  
! #ifdef HAVE_INITFINI_ARRAY
  
  .section .fini_array, "a"
data8 @fptr(__do_global_dtors_aux)
--- 61,67 
.hidden __dso_handle
  
  
! #ifdef HAVE_INITFINI_ARRAY_SUPPORT
  
  .section .fini_array, "a"
data8 @fptr(__do_global_dtors_aux)
*** __dso_handle:
*** 70,76 
data8 @fptr(__do_jv_register_classes)
data8 @fptr(__do_global_ctors_aux)
  
! #else /* !HAVE_INITFINI_ARRAY */
  /*
   * Fragment of the ELF _fini routine that invokes our dtor cleanup.
   *
--- 70,76 
data8 @fptr(__do_jv_register_classes)
data8 @fptr(__do_global_ctors_aux)
  
! #else /* !HAVE_INITFINI_ARRAY_SUPPORT */
  /*
   * Fragment of the ELF _fini routine that invokes our dtor cleanup.
   *
*** __dso_handle:
*** 117,123 
  mov b6 = r2
  br.call.sptk.many b0 = b6
}
! #endif /* !HAVE_INITFINI_ARRAY */
  
  .section .text
.align  32
--- 117,123 
  mov b6 = r2
  br.call.sptk.many b0 = b6
}
! #endif /* !HAVE_INITFINI_ARRAY_SUPPORT */
  
  .section .text
.align  32
Index: gcc/configure
===
*** gcc/configure   (revision 184438)
--- gcc/configure   (working copy)
*** fi
*** 22515,22521 
  
  if test $enable_initfini_array = yes; then
  
! $as_echo "#define HAVE_INITFINI_ARRAY 1" >>confdefs.h
  
  fi
  
--- 22515,22521 
  
  if test $enable_initfini_array = yes; then
  
! $as_echo "#define HAVE_INITFINI_ARRAY_SUPPORT 1" >>confdefs.h
  
  fi
  
Index: gcc/config.in
===
*** gcc/config.in   (revision 184438)
--- gcc/config.in   (working copy)
***
*** 1123,1129 
  
  /* Define .init_array/.fini_array sections are available and working. */
  #ifndef USED_FOR_TARGET
! #undef HAVE_INITFINI_ARRAY
  #endif
  
  
--- 1123,1129 
  
  /* Define .init_array/.fini_array sections are available and working. */
  #ifndef USED_FOR_TARGET
! #undef HAVE_INITFINI_ARRAY_SUPPORT
  #endif
  
  
Index: gcc/acinclude.m4
==

Re: [PATCH] Re: Serious regressions due to newlib's HAVE_INITFINI_ARRAY

2012-02-23 Thread Jakub Jelinek
On Thu, Feb 23, 2012 at 11:03:47AM +0100, Ulrich Weigand wrote:
> OK for mainline?

Yes, thanks.

Jakub


[PATCH] PR libffi/52223: Define FLAGS_TO_PASS for libffi

2012-02-23 Thread Mikael Pettersson
This fixes a "make prefix=... mandir=... install" installation error
in libffi on multilib targets like {x86_64,m68k}-linux:

make prefix=/tmp/buildroot/usr mandir=/tmp/buildroot/usr/share/man install
...
Making install in man
make[5]: Entering directory
`/tmp/objdir/x86_64-unknown-linux-gnu/32/libffi/man'
make[6]: Entering directory
`/tmp/objdir/x86_64-unknown-linux-gnu/32/libffi/man'
make[6]: Nothing to be done for `install-exec-am'.
test -z "/usr/share/man/man3" || /bin/mkdir -p "/usr/share/man/man3"
 /usr/bin/install -c -m 644 /tmp/gcc-4.7-20120211/libffi/man/ffi.3
/tmp/gcc-4.7-20120211/libffi/man/ffi_call.3
/tmp/gcc-4.7-20120211/libffi/man/ffi_prep_cif.3 '/usr/share/man/man3'
/usr/bin/install: cannot create regular file `/usr/share/man/man3/ffi.3':
Permission denied
/usr/bin/install: cannot create regular file `/usr/share/man/man3/ffi_call.3':
Permission denied
/usr/bin/install: cannot create regular file
`/usr/share/man/man3/ffi_prep_cif.3': Permission denied
make[6]: *** [install-man3] Error 1

The problem is that the Makefile variable FLAGS_TO_PASS isn't defined.
This causes multilib subdirs to not see the install-time settings and
instead use the configure-time settings, which breaks installations via
a separate staging area.

This problem affects trunk and the 4.6 and 4.5 branches.  4.4 is not
affected since it doesn't have these man pages.

See  for a
similar problem in libquadmath that Joseph Myers fixed last year.

OK for trunk and affected branches?

(If approved, I'll need someone with svn write access to apply it.)

/Mikael

libffi/

2012-02-23  Mikael Pettersson  

PR libffi/52223
* Makefile.am (FLAGS_TO_PASS): Define.
* Makefile.in: Regenerate.

--- gcc-4.7-20120218/libffi/Makefile.am.~1~ 2010-07-02 18:52:38.0 +0200
+++ gcc-4.7-20120218/libffi/Makefile.am 2012-02-23 12:49:21.0 +0100
@@ -76,6 +76,9 @@ AM_MAKEFLAGS = \
"RANLIB=$(RANLIB)" \
"DESTDIR=$(DESTDIR)"
 
+# Subdir rules rely on $(FLAGS_TO_PASS)
+FLAGS_TO_PASS = $(AM_MAKEFLAGS)
+
 MAKEOVERRIDES=
 
 toolexeclib_LTLIBRARIES = libffi.la
--- gcc-4.7-20120218/libffi/Makefile.in.~1~ 2012-02-18 23:44:28.0 +0100
+++ gcc-4.7-20120218/libffi/Makefile.in 2012-02-23 12:49:21.0 +0100
@@ -455,6 +455,9 @@ AM_MAKEFLAGS = \
"RANLIB=$(RANLIB)" \
"DESTDIR=$(DESTDIR)"
 
+
+# Subdir rules rely on $(FLAGS_TO_PASS)
+FLAGS_TO_PASS = $(AM_MAKEFLAGS)
 MAKEOVERRIDES = 
 toolexeclib_LTLIBRARIES = libffi.la
 noinst_LTLIBRARIES = libffi_convenience.la


[Patch,AVR]: Tweak xor similar to ior

2012-02-23 Thread Georg-Johann Lay
This mini-patch extends some combine patterns that used IOR to also match
for XOR in a similar way.

The bottom line is always that when performing IOR or XOR, only the
portions in non-zero bytes matter, and zero bytes that come from
zero-extends or shifts need not be XORed/IORed.
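
The byte-wise argument can be checked with plain integers. This is a minimal sketch in C rather than in machine-description terms; `xor_wide` and `xor_one_byte` are hypothetical helpers:

```c
#include <stdint.h>

/* Reference: full-width XOR with a zero-extended, shifted byte.  */
uint32_t xor_wide (uint32_t x, uint8_t b, unsigned byteno)
{
  return x ^ ((uint32_t) b << (8 * byteno));
}

/* Same result, but touching only the one byte that can change.  */
uint32_t xor_one_byte (uint32_t x, uint8_t b, unsigned byteno)
{
  unsigned shift = 8 * byteno;
  uint8_t byte = (uint8_t) (x >> shift);
  byte ^= b;
  return (x & ~((uint32_t) 0xff << shift)) | ((uint32_t) byte << shift);
}
```

The same holds for IOR, since OR-ing or XOR-ing with zero leaves the other bytes unchanged.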

The patch just replaces ior with the new xior code iterator.

Passes avr test suite without regressions.

Ok for trunk?

Johann

* config/avr/avr.md (code_stdname): Add ior, xor.
(xior): New code iterator.
(*qi.byte0): Use xior instead of ior.
(*qi.byte1-3): Ditto.
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 184469)
+++ config/avr/avr.md	(working copy)
@@ -231,6 +231,8 @@ (define_mode_iterator MOVMODE [(QI "") (
 (define_code_iterator any_extend  [sign_extend zero_extend])
 (define_code_iterator any_extend2 [sign_extend zero_extend])
 
+(define_code_iterator xior [xor ior])
+
 ;; Define code attributes
 (define_code_attr extend_su
   [(sign_extend "s")
@@ -254,6 +256,8 @@ (define_code_attr code_stdname
   [(ashift   "ashl")
(ashiftrt "ashr")
(lshiftrt "lshr")
+   (ior  "ior")
+   (xor  "xor")
(rotate   "rotl")])
 
 ;;
@@ -5960,24 +5964,28 @@ (define_insn "*insv.reg"
 ;; in particular when subreg lowering (-fsplit-wide-types) is turned on.
 ;; That switch obfuscates things here and in many other places.
 
-(define_insn_and_split "*iorqi.byte0"
+;; "*iorhiqi.byte0"   "*iorpsiqi.byte0"   "*iorsiqi.byte0"
+;; "*xorhiqi.byte0"   "*xorpsiqi.byte0"   "*xorsiqi.byte0"
+(define_insn_and_split "*qi.byte0"
   [(set (match_operand:HISI 0 "register_operand" "=r")
-(ior:HISI
+(xior:HISI
  (zero_extend:HISI (match_operand:QI 1 "register_operand" "r"))
  (match_operand:HISI 2 "register_operand" "0")))]
   ""
   "#"
   "reload_completed"
   [(set (match_dup 3)
-(ior:QI (match_dup 3)
-(match_dup 1)))]
+(xior:QI (match_dup 3)
+ (match_dup 1)))]
   {
 operands[3] = simplify_gen_subreg (QImode, operands[0], mode, 0);
   })
 
-(define_insn_and_split "*iorqi.byte1-3"
+;; "*iorhiqi.byte1-3"  "*iorpsiqi.byte1-3"  "*iorsiqi.byte1-3"
+;; "*xorhiqi.byte1-3"  "*xorpsiqi.byte1-3"  "*xorsiqi.byte1-3"
+(define_insn_and_split "*qi.byte1-3"
   [(set (match_operand:HISI 0 "register_operand"  "=r")
-(ior:HISI
+(xior:HISI
  (ashift:HISI (zero_extend:HISI (match_operand:QI 1 "register_operand" "r"))
   (match_operand:QI 2 "const_8_16_24_operand"  "n"))
  (match_operand:HISI 3 "register_operand"  "0")))]
@@ -5985,8 +5993,8 @@ (define_insn_and_split "*iorqi.byt
   "#"
   "&& reload_completed"
   [(set (match_dup 4)
-(ior:QI (match_dup 4)
-(match_dup 1)))]
+(xior:QI (match_dup 4)
+ (match_dup 1)))]
   {
 int byteno = INTVAL(operands[2]) / BITS_PER_UNIT;
 operands[4] = simplify_gen_subreg (QImode, operands[0], mode, byteno);


libitm: Use ml_wt as default TM methods for >1 thread.

2012-02-23 Thread Torvald Riegel
This patch changes the default TM method that is used if more than one
thread is registered to ml_wt (from serialirr_onwrite previously).  For
one registered thread, it's still serialirr.
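
The selection rule can be written down as a small sketch. The enum and function below are hypothetical, and the fallback to serialirr when ml_wt cannot handle the thread count is an assumption of this sketch, not something stated in the message:

```c
/* Hypothetical labels for TM methods named in the message.  */
enum tm_method { TM_SERIALIRR, TM_SERIALIRR_ONWRITE, TM_ML_WT };

/* Picks the default method for NOW registered threads.  ML_WT_OK models
   whether ml_wt supports that thread count.  Treating zero threads like
   one thread, and falling back to serialirr, are assumptions.  */
enum tm_method default_method (unsigned now, int ml_wt_ok)
{
  if (now <= 1)
    return TM_SERIALIRR;
  return ml_wt_ok ? TM_ML_WT : TM_SERIALIRR;
}
```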

Right now, this will not be a performance advantage for every workload
and number of threads.  However, of the TM methods we have implemented
so far, ml_wt is the only one that allows concurrent update
transactions.  gl_wt update txns essentially abort all other txns that
have read something.  serialirr_onwrite serializes all update txns.
ml_wt can execute disjoint-access-parallel txns in parallel, unless we
get false sharing via the address-to-orec mapping.  Therefore, even
though ml_wt performance isn't great currently, it's the only option
that doesn't suffer from a principal scalability bottleneck.

In the long term, this should be improved by a less trivial choice
of which TM method to use (e.g., based on actual abort rate and
other factors). Likewise, the ml_wt lock-mapping can hopefully be
improved, so that we can at least get better scalability even if the
single-thread overheads are higher.  We might also want to add other TM
methods.  But all of that will take some time, so until then I think
the better choice is to go with ml_wt as-is.

OK?

PS: I also noticed that libitm/testsuite/libitm.c/memcpy-1.c takes
surprisingly long when using ml_wt compared to serialirr.  I will
investigate this later. 
commit ce7f7cd1797cb3c4136c53ff038d2f8f7f0bfad7
Author: Torvald Riegel 
Date:   Thu Feb 23 14:16:14 2012 +0100

libitm: Use ml_wt as default TM methods for >1 thread.

libitm/
* retry.cc (GTM::gtm_thread::number_of_threads_changed): Change
default dispatch for more than 1 thread to ml_wt.

diff --git a/libitm/retry.cc b/libitm/retry.cc
index 2c1483e..660bf52 100644
--- a/libitm/retry.cc
+++ b/libitm/retry.cc
@@ -314,7 +314,7 @@ GTM::gtm_thread::number_of_threads_changed(unsigned previous, unsigned now)
set_default_dispatch(default_dispatch_user);
   else
{
- abi_dispatch* a = dispatch_serialirr_onwrite();
+ abi_dispatch* a = dispatch_ml_wt();
  if (a->supports(now))
set_default_dispatch(a);
  else


[PATCH] Ignore CLOBBER stmts in ipa-split (PR tree-optimization/52019)

2012-02-23 Thread Jakub Jelinek
Hi!

IMHO we should treat CLOBBER stmts in various places like debug stmts:
they shouldn't affect the decisions of the optimization phases, but
if a pass does some transformation that somehow invalidates them or
moves them (not in this pass), they need to be adjusted.  Otherwise we end
up with the CLOBBERs inhibiting optimizations.
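
As a rough illustration of that idea, here is a sketch of a backward scan that skips statements which should not influence a decision. The statement kinds and function names are toy stand-ins, not GIMPLE:

```c
#include <stddef.h>

enum toy_kind { TOY_LABEL, TOY_DEBUG, TOY_CLOBBER, TOY_ASSIGN, TOY_RETURN };

/* Statements that should not influence an optimization decision.  */
static int toy_ignorable (enum toy_kind k)
{
  return k == TOY_LABEL || k == TOY_DEBUG || k == TOY_CLOBBER;
}

/* Walks a block backwards and returns the index of the last statement
   that matters, or -1 if there is none.  */
int toy_last_real_stmt (const enum toy_kind *stmts, size_t n)
{
  for (size_t i = n; i-- > 0; )
    if (!toy_ignorable (stmts[i]))
      return (int) i;
  return -1;
}
```

Without the clobber case in `toy_ignorable`, a trailing clobber would hide the return statement from the scan.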

This seems to work for ipa-split, the actual transformations DTRT
apparently.  Bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2012-02-23  Jakub Jelinek  

PR tree-optimization/52019
* ipa-split.c (find_return_bb, find_retval, visit_bb): Ignore
CLOBBER stmts.

* gcc.dg/tree-ssa/ipa-split-6.c: New test.

--- gcc/ipa-split.c.jj  2012-01-23 18:23:55.0 +0100
+++ gcc/ipa-split.c 2012-02-23 12:30:21.877609719 +0100
@@ -1,5 +1,5 @@
 /* Function splitting pass
-   Copyright (C) 2010, 2011
+   Copyright (C) 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Jan Hubicka  
 
@@ -624,7 +624,9 @@ find_return_bb (void)
   for (bsi = gsi_last_bb (e->src); !gsi_end_p (bsi); gsi_prev (&bsi))
 {
   gimple stmt = gsi_stmt (bsi);
-  if (gimple_code (stmt) == GIMPLE_LABEL || is_gimple_debug (stmt))
+  if (gimple_code (stmt) == GIMPLE_LABEL
+ || is_gimple_debug (stmt)
+ || gimple_clobber_p (stmt))
;
   else if (gimple_code (stmt) == GIMPLE_ASSIGN
   && found_return
@@ -657,7 +659,8 @@ find_retval (basic_block return_bb)
   for (bsi = gsi_start_bb (return_bb); !gsi_end_p (bsi); gsi_next (&bsi))
 if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
   return gimple_return_retval (gsi_stmt (bsi));
-else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN)
+else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN
+&& !gimple_clobber_p (gsi_stmt (bsi)))
   return gimple_assign_rhs1 (gsi_stmt (bsi));
   return NULL;
 }
@@ -733,6 +736,9 @@ visit_bb (basic_block bb, basic_block re
   if (is_gimple_debug (stmt))
continue;
 
+  if (gimple_clobber_p (stmt))
+   continue;
+
   /* FIXME: We can split regions containing EH.  We can not however
 split RESX, EH_DISPATCH and EH_POINTER referring to same region
 into different partitions.  This would require tracking of
--- gcc/testsuite/gcc.dg/tree-ssa/ipa-split-6.c.jj  2012-02-23 12:33:20.578790182 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/ipa-split-6.c 2012-02-23 12:34:05.050612776 +0100
@@ -0,0 +1,10 @@
+/* PR tree-optimization/52019 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-tree-sra -fdump-tree-fnsplit -fdump-tree-optimized" } */
+
+#include "ipa-split-5.c"
+
+/* { dg-final { scan-tree-dump-times "Splitting function" 1 "fnsplit"} } */
+/* { dg-final { cleanup-tree-dump "fnsplit" } } */
+/* { dg-final { scan-tree-dump "part" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Jakub


[PATCH] Fix PR52349

2012-02-23 Thread Richard Guenther

Approved in the PR, tested by Jakub, applied.

Richard.

2012-02-23  Richard Guenther  

go/
* go-gcc.cc (Gcc_backend::placeholder_pointer_type): Use
build_distinct_type_copy.

Index: go-gcc.cc
===
--- go-gcc.cc   (revision 184506)
+++ go-gcc.cc   (working copy)
@@ -602,7 +602,7 @@ Btype*
 Gcc_backend::placeholder_pointer_type(const std::string& name,
  Location location, bool)
 {
-  tree ret = build_variant_type_copy(ptr_type_node);
+  tree ret = build_distinct_type_copy(ptr_type_node);
   if (!name.empty())
 {
   tree decl = build_decl(location.gcc_location(), TYPE_DECL,


Re: [PATCH] Ignore CLOBBER stmts in ipa-split (PR tree-optimization/52019)

2012-02-23 Thread Richard Guenther
On Thu, 23 Feb 2012, Jakub Jelinek wrote:

> Hi!
> 
> IMHO we should treat CLOBBER stmts in various places like debug stmts,
> they shouldn't affect the decisions of the optimization phases, but
> if the pass does some transformation that somehow invalidates them or
> moves them (not in this pass), it needs to be adjusted.  Otherwise we end
> up with the CLOBBERs inhibiting optimizations.
> 
> This seems to work for ipa-split, the actual transformations DTRT
> apparently.  Bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

Ok.

Thanks,
Richard.

> 2012-02-23  Jakub Jelinek  
> 
>   PR tree-optimization/52019
>   * ipa-split.c (find_return_bb, find_retval, visit_bb): Ignore
>   CLOBBER stmts.
> 
>   * gcc.dg/tree-ssa/ipa-split-6.c: New test.
> 
> --- gcc/ipa-split.c.jj2012-01-23 18:23:55.0 +0100
> +++ gcc/ipa-split.c   2012-02-23 12:30:21.877609719 +0100
> @@ -1,5 +1,5 @@
>  /* Function splitting pass
> -   Copyright (C) 2010, 2011
> +   Copyright (C) 2010, 2011, 2012
> Free Software Foundation, Inc.
> Contributed by Jan Hubicka  
>  
> @@ -624,7 +624,9 @@ find_return_bb (void)
>for (bsi = gsi_last_bb (e->src); !gsi_end_p (bsi); gsi_prev (&bsi))
>  {
>gimple stmt = gsi_stmt (bsi);
> -  if (gimple_code (stmt) == GIMPLE_LABEL || is_gimple_debug (stmt))
> +  if (gimple_code (stmt) == GIMPLE_LABEL
> +   || is_gimple_debug (stmt)
> +   || gimple_clobber_p (stmt))
>   ;
>else if (gimple_code (stmt) == GIMPLE_ASSIGN
>  && found_return
> @@ -657,7 +659,8 @@ find_retval (basic_block return_bb)
>for (bsi = gsi_start_bb (return_bb); !gsi_end_p (bsi); gsi_next (&bsi))
>  if (gimple_code (gsi_stmt (bsi)) == GIMPLE_RETURN)
>return gimple_return_retval (gsi_stmt (bsi));
> -else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN)
> +else if (gimple_code (gsi_stmt (bsi)) == GIMPLE_ASSIGN
> +  && !gimple_clobber_p (gsi_stmt (bsi)))
>return gimple_assign_rhs1 (gsi_stmt (bsi));
>return NULL;
>  }
> @@ -733,6 +736,9 @@ visit_bb (basic_block bb, basic_block re
>if (is_gimple_debug (stmt))
>   continue;
>  
> +  if (gimple_clobber_p (stmt))
> + continue;
> +
>/* FIXME: We can split regions containing EH.  We can not however
>split RESX, EH_DISPATCH and EH_POINTER referring to same region
>into different partitions.  This would require tracking of
> --- gcc/testsuite/gcc.dg/tree-ssa/ipa-split-6.c.jj  2012-02-23 12:33:20.578790182 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/ipa-split-6.c   2012-02-23 12:34:05.050612776 +0100
> @@ -0,0 +1,10 @@
> +/* PR tree-optimization/52019 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fno-tree-sra -fdump-tree-fnsplit -fdump-tree-optimized" } */
> +
> +#include "ipa-split-5.c"
> +
> +/* { dg-final { scan-tree-dump-times "Splitting function" 1 "fnsplit"} } */
> +/* { dg-final { cleanup-tree-dump "fnsplit" } } */
> +/* { dg-final { scan-tree-dump "part" "optimized"} } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> 
>   Jakub
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [Patch,AVR]: Tweak xor similar to ior

2012-02-23 Thread Denis Chertykov
2012/2/23 Georg-Johann Lay :
> This mini-patch extends some combine patterns that used IOR to also match
> for XOR in a similar way.
>
> Bottom line is always that when performing IOR or XOR, only portions
> in non-zero bytes matter and zero-bytes that come from zero-extends
> or shifts need not to be XORed/IORed.
>
> The patch just replaces ior with the new xior code iterator.
>
> Passes avr test suite without regressions.
>
> Ok for trunk?
>
> Johann
>
>        * config/avr/avr.md (code_stdname): Add ior, xor.
>        (xior): New code iterator.
>        (*qi.byte0): Use xior instead of ior.
>        (*qi.byte1-3): Ditto.

Please, commit.

Denis.


Re: Simulator testing for sh and sh64

2012-02-23 Thread Kaz Kojima
[I've moved to gcc-patches because this includes a patch anyway.]

Thomas Schwinge  wrote:
> /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c: In 
> function '__powisf2':
> /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c:1779:1: 
> error: unrecognizable insn:
> (insn 10 9 11 3 (set (reg:SI 162 [ D.2769 ])
> (abs:SI (reg/v:SI 168 [ m ]))) 
> /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c:1770 -1
>  (nil))
> /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c:1779:1: 
> internal compiler error: in extract_insn, at recog.c:2123

BTW, I have a patch below which restores sh64-elf build on trunk.
The hunks for sh_dwarf_register_span and abssi2 are almost obvious.
Those for sh_register_move_cost and CASE_USE_BIT_TESTS would be
suspicious, though.

Regards,
kaz
--
diff -up ORIG/trunk/gcc/config/sh/sh.c trunk/gcc/config/sh/sh.c
--- ORIG/trunk/gcc/config/sh/sh.c   2011-12-30 09:22:01.0 +0900
+++ trunk/gcc/config/sh/sh.c2012-02-23 21:23:44.0 +0900
@@ -8133,10 +8133,8 @@ sh_dwarf_register_span (rtx reg)
   return
 gen_rtx_PARALLEL (VOIDmode,
  gen_rtvec (2,
-gen_rtx_REG (SFmode,
- DBX_REGISTER_NUMBER (regno+1)),
-gen_rtx_REG (SFmode,
- DBX_REGISTER_NUMBER (regno;
+gen_rtx_REG (SFmode, regno + 1),
+gen_rtx_REG (SFmode, regno)));
 }
 
 static enum machine_mode
@@ -11499,7 +11497,7 @@ sh_register_move_cost (enum machine_mode
&& REGCLASS_HAS_GENERAL_REG (srcclass))
   || (REGCLASS_HAS_GENERAL_REG (dstclass)
  && REGCLASS_HAS_FP_REG (srcclass)))
-return ((TARGET_SHMEDIA ? 4 : TARGET_FMOVD ? 8 : 12)
+return (((TARGET_SHMEDIA ? 4 : TARGET_FMOVD ? 8 : 12) + 64)
* ((GET_MODE_SIZE (mode) + 7) / 8U));
 
   if ((dstclass == FPUL_REGS
diff -up ORIG/trunk/gcc/config/sh/sh.h trunk/gcc/config/sh/sh.h
--- ORIG/trunk/gcc/config/sh/sh.h   2011-12-30 09:22:01.0 +0900
+++ trunk/gcc/config/sh/sh.h2012-02-23 20:54:23.0 +0900
@@ -2435,6 +2435,10 @@ extern int current_function_interrupt;
 #define MD_CAN_REDIRECT_BRANCH(INSN, SEQ) \
   sh_can_redirect_branch ((INSN), (SEQ))
 
+#define CASE_USE_BIT_TESTS  (!TARGET_SHMEDIA \
+&& (optab_handler (ashl_optab, word_mode) \
+!= CODE_FOR_nothing))
+
 #define DWARF_FRAME_RETURN_COLUMN \
   (TARGET_SH5 ? DWARF_FRAME_REGNUM (PR_MEDIA_REG) : DWARF_FRAME_REGNUM (PR_REG))
 
diff -up ORIG/trunk/gcc/config/sh/sh.md trunk/gcc/config/sh/sh.md
--- ORIG/trunk/gcc/config/sh/sh.md  2012-02-23 15:20:01.0 +0900
+++ trunk/gcc/config/sh/sh.md   2012-02-23 15:20:29.0 +0900
@@ -4464,7 +4464,7 @@ label:
   [(set (match_operand:SI 0 "arith_reg_dest" "")
(abs:SI (match_operand:SI 1 "arith_reg_operand" "")))
(clobber (reg:SI T_REG))]
-  ""
+  "TARGET_SH1"
   "")
 
 (define_insn_and_split "*abssi2"


[PATCH][RFC] Preserve loop info from tree loop opts to after RTL loop opts (PR44688)

2012-02-23 Thread Richard Guenther

The attached patch blob makes us preserve loop information (the loop
tree) from the start of tree loop optimizations until the end of
RTL loop optimizations.  The motivation for this is to fix excessive
prefetching and loop unrolling we perform on (for example) prologue
loops created by the vectorizer.  The reason why we do so is that
we are not able to analyze/bound their number of iterations.  But
of course the vectorizer perfectly knows a bound to its prologue loops,
so why not record that information ... this is what the inlined patch
does, as well as adjust passes to actually _use_ an upper bound if
available.
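
The effect of a recorded bound on the unrolling decision can be sketched as follows. This is a simplified model with plain integers; the struct and function names are hypothetical and only loosely follow the fields used in the patch:

```c
#include <stdint.h>

/* Toy loop record; not GCC's struct loop.  */
struct toy_loop {
  int any_upper_bound;
  uint64_t upper_bound;
};

/* Tightens an analysis result with a bound recorded earlier, e.g. by
   the pass that created the loop.  */
uint64_t toy_clamp_niter_max (const struct toy_loop *loop, uint64_t niter_max)
{
  if (loop->any_upper_bound && loop->upper_bound < niter_max)
    return loop->upper_bound;
  return niter_max;
}

/* Mirrors the "doesn't roll" test: skip unrolling when the loop cannot
   run at least 2 * NUNROLL iterations.  */
int toy_should_unroll (uint64_t niter_max, unsigned nunroll)
{
  return niter_max >= 2 * (uint64_t) nunroll;
}
```

A prologue loop known to run at most three times would then be rejected for an unroll factor of four, even if the analysis alone could not bound it.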

The whole patch does not yet pass bootstrap, but the C/C++ testsuites
are fine (and the target libs build).

Thus, I inline the "meat" of the patch that makes us perform less
unrolling/prefetching.  For example, on 437.leslie3d this reduces
code size from

   text    data     bss     dec     hex filename
 438423       0    4184  442607   6c0ef tml.o

to

   text    data     bss     dec     hex filename
 368903       0    4184  373087   5b15f tml.o

at -Ofast -funroll-loops and from

   text    data     bss     dec     hex filename
 741167       0    4184  745351   b5f87 tml.o

to

   text    data     bss     dec     hex filename
 561479       0    4184  565663   8a19f tml.o

at -Ofast -funroll-loops -march=barcelona.

Attached you will find the collection of changes I had to make to preserve
loops.  The main idea is to make loop_optimizer_finalize a no-op if
PROP_loops is set on the current function.  I added tons of checking
to make sure loop info is correct as well as dominators (loop
verification needs dominators).  I plan to split out the verification
bits (or at least its fixes), then the generic CFG bits that preserve
loops on the RTL side (and the few tree cases I caught).

Any comments on that plan?

Thanks,
Richard.


Index: gcc/loop-iv.c
===
--- gcc/loop-iv.c.orig  2011-07-11 17:02:51.0 +0200
+++ gcc/loop-iv.c   2012-02-23 15:22:14.0 +0100
@@ -2764,6 +2764,10 @@ iv_number_of_iterations (struct loop *lo
 {
   if (!desc->niter_max)
desc->niter_max = determine_max_iter (loop, desc, old_niter);
+  if (loop->any_upper_bound
+ && double_int_fits_in_uhwi_p (loop->nb_iterations_upper_bound)
+ && loop->nb_iterations_upper_bound.low < desc->niter_max)
+   desc->niter_max = loop->nb_iterations_upper_bound.low;
 
   /* simplify_using_initial_values does a copy propagation on the registers
 in the expression for the number of iterations.  This prolongs life
Index: gcc/loop-unroll.c
===
--- gcc/loop-unroll.c.orig  2011-12-02 10:14:44.0 +0100
+++ gcc/loop-unroll.c   2012-02-23 15:26:46.0 +0100
@@ -859,7 +859,8 @@ decide_unroll_runtime_iterations (struct
 }
 
   /* If we have profile feedback, check whether the loop rolls.  */
-  if (loop->header->count && expected_loop_iterations (loop) < 2 * nunroll)
+  if ((loop->header->count && expected_loop_iterations (loop) < 2 * nunroll)
+  || desc->niter_max < 2 * nunroll)
 {
   if (dump_file)
fprintf (dump_file, ";; Not unrolling loop, doesn't roll\n");
Index: gcc/tree-ssa-loop-niter.c
===
--- gcc/tree-ssa-loop-niter.c.orig  2011-09-01 12:08:51.0 +0200
+++ gcc/tree-ssa-loop-niter.c   2012-02-23 14:56:11.0 +0100
@@ -1383,6 +1383,10 @@ number_of_iterations_cond (struct loop *
   gcc_unreachable ();
 }
 
+  if (loop->any_upper_bound
+  && double_int_ucmp (loop->nb_iterations_upper_bound, niter->max) < 0)
+niter->max = loop->nb_iterations_upper_bound;
+
   mpz_clear (bnds.up);
   mpz_clear (bnds.below);
 
@@ -3030,7 +3034,7 @@ estimate_numbers_of_iterations_loop (str
   if (loop->estimate_state != EST_NOT_COMPUTED)
 return;
   loop->estimate_state = EST_AVAILABLE;
-  loop->any_upper_bound = false;
+  /* loop->any_upper_bound = false; */
   loop->any_estimate = false;
 
   exits = get_loop_exit_edges (loop);
Index: gcc/tree-ssa-loop-prefetch.c
===
--- gcc/tree-ssa-loop-prefetch.c.orig   2011-10-12 13:14:10.0 +0200
+++ gcc/tree-ssa-loop-prefetch.c2012-02-23 15:05:45.0 +0100
@@ -1801,6 +1801,8 @@ loop_prefetch_arrays (struct loop *loop)
 
   ahead = (PREFETCH_LATENCY + time - 1) / time;
   est_niter = max_stmt_executions_int (loop, false);
+  if (est_niter == -1)
+est_niter = max_stmt_executions_int (loop, true);
 
   /* Prefetching is not likely to be profitable if the trip count to ahead
  ratio is too small.  */
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c.orig 2012-02-23 14:45:11.0 +0100
+++ gcc/tree-vect-loop-manip.c  2012-02-23 1

Re: libitm: Use ml_wt as default TM methods for >1 thread.

2012-02-23 Thread Richard Henderson
On 02/23/12 05:34, Torvald Riegel wrote:
> libitm: Use ml_wt as default TM methods for >1 thread.
> 
>   libitm/
>   * retry.cc (GTM::gtm_thread::number_of_threads_changed): Change
>   default dispatch for more than 1 thread to ml_wt.


Ok.


r~


Re: [RFC, 4.8] Magic matching for flags clobbering and setting

2012-02-23 Thread Richard Henderson
On 02/22/12 17:16, Hans-Peter Nilsson wrote:
>> What I know is missing off the top of my head are:
> 
>>  (2) Can't be usefully used with define_insn_and_split, and no way to tell.
>>  This problem should simply be documented in the .texi file as user 
>> error.
> 
> Not sure I see the problem or the impact of the absence.  Would
> it help if there was a way to match_dup the clobber/set?  Maybe
> as a match_op_flags, the same as match_flags but with the first
> argument being an assigning operand number.  You probably
> wouldn't want to use this very often.)

Yes, one could probably have some use of it with a split, if the
match_flags is assigned an operand number.  You'd have to be
very careful about preserving the contents of the compare if you
adjust the other instruction data at all...

I've also thought of using the operand number as a quick way to
test whether the flags are actually live.  I.e.

  if (GET_CODE (operands[n]) == CLOBBER)

within the split or the C output template.


r~


Re: [PATCH, c]: Fix PR52290, [4.4/4.5/4.6/4.7 Regression] internal compiler error: tree check: expected function_decl, have var_decl in start_function, at c-decl.c:7712

2012-02-23 Thread Joseph S. Myers
On Thu, 23 Feb 2012, Uros Bizjak wrote:

> 2012-02-23  Uros Bizjak  
> 
>   * c-decl.c (start_function): Exit early if decl1 is not FUNCTION_DECL.
> 
> testsuite/ChangeLog:
> 
> 2012-02-23  Uros Bizjak  
> 
>   * gcc.dg/noncompile/pr52290.c: New test.
> 
> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu.
> 
> OK for mainline and release branches?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[lra] patch to fix SPARC bootstrap failure

2012-02-23 Thread Vladimir Makarov

The following patch finally fixes the SPARC64 bootstrap failure.

The patch was successfully bootstrapped on x86/x86-64, SPARC64, and ARM.

Committed as rev. 184510.

2012-02-22  Vladimir Makarov 

* lra-constraints.c (narrow_reload_pseudo_class): New function.
(match_reload): Call it.
(SLOW_UNALIGNED_ACCESS): Define the macro.
(simplify_operand_subreg): Do not change address of improperly
aligned memory.
(curr_insn_transform): Invalidate used alternative if the insn is
changed.


Index: lra-constraints.c
===
--- lra-constraints.c   (revision 18)
+++ lra-constraints.c   (working copy)
@@ -800,6 +800,31 @@ get_op_mode (int nop)
   return find_mode (&PATTERN (curr_insn), VOIDmode, loc);
 }
 
+/* If REG is a reload pseudo got from the current insn, try to make
+   its class satisfying CL.  */
+static void
+narrow_reload_pseudo_class (rtx reg, enum reg_class cl)
+{
+  int regno;
+  enum reg_class rclass;
+
+  /* Do not make more accurate class from reloads generated.  They are
+ mostly moves with a lot of constraints.  Making more accurate
+ class may results in very narrow class and impossibility of find
+ registers for several reloads of one insn.  */
+  if (INSN_UID (curr_insn) >= new_insn_uid_start)
+return;
+  if (GET_CODE (reg) == SUBREG)
+reg = SUBREG_REG (reg);
+  if (! REG_P (reg) || (regno = REGNO (reg)) < new_regno_start)
+return;
+  rclass = get_reg_class (regno);
+  rclass = ira_reg_class_subset[rclass][cl];
+  if (rclass == NO_REGS)
+return;
+  change_class (regno, rclass, "  Change", true);
+}
+
 /* Generate reloads for matching OUT and INS (array of input operand
numbers with end marker -1) with reg class GOAL_CLASS.  Add input
and output reloads correspondingly to the lists *BEFORE and
@@ -853,6 +878,11 @@ match_reload (signed char out, signed ch
goal_class, "");
   bitmap_set_bit (&lra_matched_pseudos, REGNO (new_in_reg));
 }
+  /* In and out operand can be got from transformations before
+ processing constraints.  So the pseudos might have inaccurate
+ class and we should make their classes more accurate.  */
+  narrow_reload_pseudo_class (in_rtx, goal_class);
+  narrow_reload_pseudo_class (out_rtx, goal_class);
   push_to_sequence (*before);
   lra_emit_move (new_in_reg, in_rtx);
   *before = get_insns ();
@@ -1252,6 +1282,10 @@ process_addr_reg (rtx *loc, rtx *before,
   return change_p;
 }
 
+#ifndef SLOW_UNALIGNED_ACCESS
+#define SLOW_UNALIGNED_ACCESS(mode, align) 0
+#endif
+
 /* Make reloads for subreg in operand NOP with internal subreg mode
REG_MODE, add new reloads for further processing.  Return true if
any reload was generated.  */
@@ -1272,7 +1306,14 @@ simplify_operand_subreg (int nop, enum m
   
   mode = GET_MODE (operand);
   reg = SUBREG_REG (operand);
-  if (MEM_P (reg) || (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER))
+  /* If we change address for paradoxical subreg of memory, the
+ address might violate the necessary alignment or the access might
+ be slow.  So take this into consideration.  */
+  if ((MEM_P (reg)
+   && ((! STRICT_ALIGNMENT
+   && ! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg)))
+  || MEM_ALIGN (reg) >= GET_MODE_ALIGNMENT (mode)))
+  || (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER))
 {
   alter_subreg (curr_id->operand_loc[nop]);
   return true;
@@ -1296,7 +1337,7 @@ simplify_operand_subreg (int nop, enum m
   enum reg_class rclass
= (enum reg_class) targetm.preferred_reload_class (reg, ALL_REGS);
 
-  if (get_reload_reg (type, reg_mode, reg, rclass, NULL, &new_reg)
+  if (get_reload_reg (type, reg_mode, reg, rclass, "subreg reg", &new_reg)
  && type != OP_OUT)
{
  push_to_sequence (before);
@@ -2796,6 +2837,11 @@ curr_insn_transform (void)
lra_update_dup (curr_id, i);
   }
   
+  if (change_p)
+/* Changes in the insn might result in that we can not satisfy
+   constraints in lately used alternative of the insn.  */
+lra_set_used_insn_alternative (curr_insn, -1);
+
  try_swapped:
 
   reused_alternative_num = curr_id->used_insn_alternative;
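
The alignment test added to simplify_operand_subreg above can be read as a small predicate.  What follows is only a minimal sketch of that condition, not the LRA code: struct target_model and mem_subreg_can_be_altered are hypothetical names, and the two booleans merely stand in for what STRICT_ALIGNMENT and SLOW_UNALIGNED_ACCESS would report for a given target.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target properties that the real
   macros STRICT_ALIGNMENT and SLOW_UNALIGNED_ACCESS describe.  */
struct target_model
{
  bool strict_alignment;       /* models STRICT_ALIGNMENT */
  bool slow_unaligned_access;  /* models SLOW_UNALIGNED_ACCESS (mode, align) */
};

/* Return true if a subreg of a memory operand may be folded into a
   plain memory access: either the target handles unaligned accesses
   cheaply, or the memory is at least as aligned as the mode requires.
   Both alignments are in bits.  */
bool
mem_subreg_can_be_altered (const struct target_model *t,
                           unsigned mem_align, unsigned mode_align)
{
  return ((!t->strict_alignment && !t->slow_unaligned_access)
          || mem_align >= mode_align);
}
```

With a strict-alignment model, a 32-bit-aligned memory accessed in a mode needing 64-bit alignment is rejected, so the address is left alone, which seems to be the case the patch is guarding against.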


[patch i386]: Add support of delegitimize of UNSPEC_PCREL plus displacement

2012-02-23 Thread Kai Tietz
Hi,

This patch adds the missing handling for delegitimizing CONST
(UNSPEC_PCREL + displacement).
Some testcases, such as gcc.c-torture/execute/930930-1.c, were failing because of this.

ChangeLog

2012-02-23  Kai Tietz  

* config/i386/i386.c (ix86_delegitimize_address): Handle
UNSPEC_PCREL plus displacement.

Regression tested for x86_64-w64-mingw32 and
x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai

Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 184486)
+++ config/i386/i386.c  (working copy)
@@ -13241,6 +13241,22 @@

   if (TARGET_64BIT)
 {
+  if (GET_CODE (x) == CONST
+  && GET_CODE (XEXP (x, 0)) == PLUS
+  && GET_MODE (XEXP (x, 0)) == Pmode
+  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+  && GET_CODE (XEXP (XEXP (x, 0), 0)) == UNSPEC
+  && XINT (XEXP (XEXP (x, 0), 0), 1) == UNSPEC_PCREL)
+{
+ rtx x2 = XVECEXP (XEXP (XEXP (x, 0), 0), 0, 0);
+ x = gen_rtx_PLUS (Pmode, XEXP (XEXP (x, 0), 1), x2);
+ if (MEM_P (orig_x))
+   x = replace_equiv_address_nv (orig_x, x);
+ return x;
+   }
+  if (GET_CODE (x) == UNSPEC
+  && XINT (x, 1) == UNSPEC_PCREL)
+return XVECEXP (x, 0, 0);
   if (GET_CODE (x) != CONST
  || GET_CODE (XEXP (x, 0)) != UNSPEC
  || (XINT (XEXP (x, 0), 1) != UNSPEC_GOTPCREL


Re: [patch i386]: Add support of delegitimize of UNSPEC_PCREL plus displacement

2012-02-23 Thread Jakub Jelinek
On Thu, Feb 23, 2012 at 05:38:37PM +0100, Kai Tietz wrote:
> --- config/i386/i386.c(revision 184486)
> +++ config/i386/i386.c(working copy)
> @@ -13241,6 +13241,22 @@
> 
>if (TARGET_64BIT)
>  {
> +  if (GET_CODE (x) == CONST
> +  && GET_CODE (XEXP (x, 0)) == PLUS
> +  && GET_MODE (XEXP (x, 0)) == Pmode
> +  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
> +  && GET_CODE (XEXP (XEXP (x, 0), 0)) == UNSPEC
> +  && XINT (XEXP (XEXP (x, 0), 0), 1) == UNSPEC_PCREL)
> +{
> +   rtx x2 = XVECEXP (XEXP (XEXP (x, 0), 0), 0, 0);
> +   x = gen_rtx_PLUS (Pmode, XEXP (XEXP (x, 0), 1), x2);
> +   if (MEM_P (orig_x))
> + x = replace_equiv_address_nv (orig_x, x);
> +   return x;
> + }
> +  if (GET_CODE (x) == UNSPEC
> +  && XINT (x, 1) == UNSPEC_PCREL)
> +return XVECEXP (x, 0, 0);

Here you don't need the MEM_P (orig_x) handling?
That's strange.

>if (GET_CODE (x) != CONST
> || GET_CODE (XEXP (x, 0)) != UNSPEC
> || (XINT (XEXP (x, 0), 1) != UNSPEC_GOTPCREL

Jakub


[patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Kai Tietz
Hello,

This patch corrects some testcases in gcc.dg and gcc.c-torture for x64
and x86 Windows targets.

ChangeLog

2012-02-23  Kai Tietz  

* gcc.dg/pack-test-5.c: Add -mno-ms-bitfields option
for mingw-targets.
* gcc.dg/Wpadded.c: Likewise.
* gcc.dg/bf-ms-layout-2.c: Adjust offsets to fit ms-bitfield
structure-layout.
* gcc.dg/di-sync-multithread.c: For mingw targets, replace sleep
with Sleep and include windows.h for this function.
* gcc.dg/format/dfp-printf-1.c: Adjust dg-skip-if rule for mingw
targets.
* gcc.dg/stack-usage-1.c (SIZE): Provide proper SIZE for x64 mingw
target.
* gcc.dg/tls/thr-cse-1.c: Provide proper pattern for x64 mingw
target.
* gcc.dg/tls/opt-11.c (memset): Use __extension__ to avoid fail
on x64 mingw target.
* gcc.dg/bf-ms-attrib.c: Adjust expected size for ms_struct layout.
* gcc.dg/pr50251.c: Disable test for x64 mingw target.
* gcc.c-torture/execute/930930-1.c (long): Replace by ptr_t to avoid
failure on LLP64 target.

Tested for i686-w64-mingw32, x86_64-w64-mingw32, and
x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai
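
The di-sync-multithread.c change described in the ChangeLog above swaps POSIX sleep (seconds) for the Windows Sleep (milliseconds).  A minimal sketch of the same idea as a wrapper follows; pause_seconds and seconds_to_ms are hypothetical helper names and are not part of the patch.

```c
#ifdef _WIN32
#include <windows.h>   /* Sleep, which takes milliseconds */
#else
#include <unistd.h>    /* sleep, which takes seconds */
#endif

/* Hypothetical helper: convert whole seconds to milliseconds.  */
unsigned long
seconds_to_ms (unsigned int seconds)
{
  return (unsigned long) seconds * 1000ul;
}

/* Hypothetical helper: pause the calling thread for SECONDS seconds
   using whichever primitive the platform provides.  */
void
pause_seconds (unsigned int seconds)
{
#ifdef _WIN32
  Sleep ((DWORD) seconds_to_ms (seconds));
#else
  sleep (seconds);
#endif
}
```

The patch itself keeps the #ifdef inline at the call site; a wrapper like this is only worth it when the test sleeps in more than one place.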

Index: gcc.dg/pack-test-5.c
===
--- gcc.dg/pack-test-5.c(revision 184486)
+++ gcc.dg/pack-test-5.c(working copy)
@@ -1,6 +1,7 @@
 /* PR c/11446: packed on a struct takes precedence over aligned on the type
of a field.  */
 /* { dg-do run } */
+/* { dg-options "-mno-ms-bitfields" { target *-*-mingw* } } */

 extern void abort (void);

Index: gcc.dg/bf-ms-layout-2.c
===
--- gcc.dg/bf-ms-layout-2.c (revision 184503)
+++ gcc.dg/bf-ms-layout-2.c (working copy)
@@ -158,27 +158,27 @@
   struct ten test_ten;

 #if defined (_TEST_MS_LAYOUT) || defined (_MSC_VER)
-  size_t exp_sizeof_one = 12;
-  size_t exp_sizeof_two = 16;
+  size_t exp_sizeof_one = 8;
+  size_t exp_sizeof_two = 12;
   size_t exp_sizeof_three =6;
   size_t exp_sizeof_four = 8;
   size_t exp_sizeof_five = 3;
   size_t exp_sizeof_six = 8;
   size_t exp_sizeof_seven = 3;
-  size_t exp_sizeof_eight = 4;
+  size_t exp_sizeof_eight = 2;
   size_t exp_sizeof_nine = 8;
-  size_t exp_sizeof_ten = 16;
+  size_t exp_sizeof_ten = 8;

-  unsigned char exp_one_c = 8;
-  unsigned char exp_two_c  = 12;
+  unsigned char exp_one_c = 7;
+  unsigned char exp_two_c  = 9;
   unsigned char exp_three_c = 4;
   unsigned char exp_four_c = 4;
   char exp_five_c = 2;
   char exp_six_c = 5;
   char exp_seven_c = 2;
-  char exp_eight_c = 2;
+  char exp_eight_c = 1;
   char exp_nine_c = 0;
-  char exp_ten_c = 8;
+  char exp_ten_c = 1;

 #else /* testing -mno-ms-bitfields */

Index: gcc.dg/di-sync-multithread.c
===
--- gcc.dg/di-sync-multithread.c(revision 184486)
+++ gcc.dg/di-sync-multithread.c(working copy)
@@ -10,6 +10,9 @@

 #include 
 #include 
+#ifdef _WIN32
+#include <windows.h>
+#endif

 /*#define DEBUGIT 1 */

@@ -175,7 +178,11 @@
t, err);
   };

+#ifdef _WIN32
+  Sleep (5000);
+#else
   sleep (5);
+#endif

   /* Stop please.  */
   __sync_lock_test_and_set (&doquit, 1ll);
Index: gcc.dg/format/dfp-printf-1.c
===
--- gcc.dg/format/dfp-printf-1.c(revision 184486)
+++ gcc.dg/format/dfp-printf-1.c(working copy)
@@ -3,7 +3,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target dfp } */
 /* { dg-options "-Wformat" } */
-/* { dg-skip-if "No scanf/printf dfp support" { *-*-mingw* } } */
+/* { dg-skip-if "No scanf/printf dfp support" { *-*-mingw* } { "*" } { "" } } */

 extern int printf (const char *restrict, ...);

Index: gcc.dg/Wpadded.c
===
--- gcc.dg/Wpadded.c(revision 184486)
+++ gcc.dg/Wpadded.c(working copy)
@@ -6,6 +6,7 @@
We won't get a warning anyway if the target has "packed" structure
layout.  */
 /* { dg-options "-Wpadded -fpack-struct=8" } */
+/* { dg-options "-mno-ms-bitfields" { target *-*-mingw* } } */

 struct foo {
   char bar;
Index: gcc.dg/stack-usage-1.c
===
--- gcc.dg/stack-usage-1.c  (revision 184486)
+++ gcc.dg/stack-usage-1.c  (working copy)
@@ -10,7 +10,11 @@
 #if defined(__i386__)
 #  define SIZE 248
 #elif defined(__x86_64__)
-#  define SIZE 356
+#  ifndef _WIN64
+#define SIZE 356
+#  else
+#define SIZE (256 - 24)
+#  endif
 #elif defined (__sparc__)
 #  if defined (__arch64__)
 #define SIZE 76
Index: gcc.dg/pr49544.c
===
--- gcc.dg/pr49544.c(revision 184486)
+++ gcc.dg/pr49544.c(working copy)
@@ -3,6 +3,8 @@
 /* { dg-options "-g -O2" } */
 /* { dg-require-effective-target ptr32plus } */

+_

Re: [patch i386]: Add support of delegitimize of UNSPEC_PCREL plus displacement

2012-02-23 Thread Kai Tietz
2012/2/23 Jakub Jelinek :
> On Thu, Feb 23, 2012 at 05:38:37PM +0100, Kai Tietz wrote:
>> --- config/i386/i386.c        (revision 184486)
>> +++ config/i386/i386.c        (working copy)
>> @@ -13241,6 +13241,22 @@
>>
>>    if (TARGET_64BIT)
>>      {
>> +      if (GET_CODE (x) == CONST
>> +          && GET_CODE (XEXP (x, 0)) == PLUS
>> +          && GET_MODE (XEXP (x, 0)) == Pmode
>> +          && CONST_INT_P (XEXP (XEXP (x, 0), 1))
>> +          && GET_CODE (XEXP (XEXP (x, 0), 0)) == UNSPEC
>> +          && XINT (XEXP (XEXP (x, 0), 0), 1) == UNSPEC_PCREL)
>> +        {
>> +       rtx x2 = XVECEXP (XEXP (XEXP (x, 0), 0), 0, 0);
>> +       x = gen_rtx_PLUS (Pmode, XEXP (XEXP (x, 0), 1), x2);
>> +       if (MEM_P (orig_x))
>> +         x = replace_equiv_address_nv (orig_x, x);
>> +       return x;
>> +     }
>> +      if (GET_CODE (x) == UNSPEC
>> +          && XINT (x, 1) == UNSPEC_PCREL)
>> +        return XVECEXP (x, 0, 0);
>
> Here you don't need the MEM_P (orig_x) handling?
> That's strange.
>
>>        if (GET_CODE (x) != CONST
>>         || GET_CODE (XEXP (x, 0)) != UNSPEC
>>         || (XINT (XEXP (x, 0), 1) != UNSPEC_GOTPCREL
>
>        Jakub

Hmm, those three lines are not necessary.  Sorry, I forgot to remove
them from my patch.
>> +  if (GET_CODE (x) == UNSPEC
>> +  && XINT (x, 1) == UNSPEC_PCREL)
>> +return XVECEXP (x, 0, 0);

But indeed the MEM case isn't necessary here, as this pattern never matches.

I will retest without those three lines and then post an updated patch.

Regards,
Kai


Re: [patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Jakub Jelinek
On Thu, Feb 23, 2012 at 05:56:11PM +0100, Kai Tietz wrote:
> --- gcc.dg/pack-test-5.c  (revision 184486)
> +++ gcc.dg/pack-test-5.c  (working copy)
> @@ -1,6 +1,7 @@
>  /* PR c/11446: packed on a struct takes precedence over aligned on the type
> of a field.  */
>  /* { dg-do run } */
> +/* { dg-options "-mno-ms-bitfields" { target *-*-mingw* } } */

Shouldn't this be dg-additional-options instead?

> --- gcc.dg/Wpadded.c  (revision 184486)
> +++ gcc.dg/Wpadded.c  (working copy)
> @@ -6,6 +6,7 @@
> We won't get a warning anyway if the target has "packed" structure
> layout.  */
>  /* { dg-options "-Wpadded -fpack-struct=8" } */
> +/* { dg-options "-mno-ms-bitfields" { target *-*-mingw* } } */

And above too?

> --- gcc.dg/tls/opt-11.c   (revision 184486)
> +++ gcc.dg/tls/opt-11.c   (working copy)
> @@ -3,7 +3,7 @@
>  /* { dg-add-options tls } */
> 
>  extern void abort (void);
> -extern void *memset (void *, int, __SIZE_TYPE__);
> +__extension__ extern void *memset (void *, int, __SIZE_TYPE__);

Why?  I don't see extensions anywhere.  Or is __SIZE_TYPE__ on mingw
long long and requires __extension__?  If yes, it should at least go
right before the argument, but perhaps then it should be already
in the __SIZE_TYPE__ macro.

Jakub


Re: [patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Kai Tietz
2012/2/23 Jakub Jelinek :
> On Thu, Feb 23, 2012 at 05:56:11PM +0100, Kai Tietz wrote:
>> --- gcc.dg/pack-test-5.c      (revision 184486)
>> +++ gcc.dg/pack-test-5.c      (working copy)
>> @@ -1,6 +1,7 @@
>>  /* PR c/11446: packed on a struct takes precedence over aligned on the type
>>     of a field.  */
>>  /* { dg-do run } */
>> +/* { dg-options "-mno-ms-bitfields" { target *-*-mingw* } } */
>
> Shouldn't this be dg-additional-options instead?

Right, I will adjust it.

>> --- gcc.dg/Wpadded.c  (revision 184486)
>> +++ gcc.dg/Wpadded.c  (working copy)
>> @@ -6,6 +6,7 @@
>>     We won't get a warning anyway if the target has "packed" structure
>>     layout.  */
>>  /* { dg-options "-Wpadded -fpack-struct=8" } */
>> +/* { dg-options "-mno-ms-bitfields" { target *-*-mingw* } } */
>
> And above too?

Yes, I will do.

>> --- gcc.dg/tls/opt-11.c       (revision 184486)
>> +++ gcc.dg/tls/opt-11.c       (working copy)
>> @@ -3,7 +3,7 @@
>>  /* { dg-add-options tls } */
>>
>>  extern void abort (void);
>> -extern void *memset (void *, int, __SIZE_TYPE__);
>> +__extension__ extern void *memset (void *, int, __SIZE_TYPE__);
>
> Why?  I don't see extensions anywhere.  Or is __SIZE_TYPE__ on mingw
> long long and requires __extension__?  If yes, it should at least go
> right before the argument, but perhaps then it should be already
> in the __SIZE_TYPE__ macro.

The issue is that for LLP64 target __SIZE_TYPE__ is 'unsigned long long'.

Regards,
Kai


Re: [patch i386]: Add support of delegitimize of UNSPEC_PCREL plus displacement

2012-02-23 Thread Kai Tietz
Hi,

The tests are now complete (a full test run on x86_64-unknown-linux-gnu and
a rerun of the failing tests on i686-w64-mingw32 and x86_64-w64-mingw32).

ChangeLog

2012-02-23  Kai Tietz  

* config/i386/i386.c (ix86_delegitimize_address): Handle
UNSPEC_PCREL plus displacement.

Regression tested for x86_64-w64-mingw32 and
x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai

Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 184486)
+++ config/i386/i386.c  (working copy)
@@ -13241,6 +13241,19 @@

   if (TARGET_64BIT)
 {
+  if (GET_CODE (x) == CONST
+  && GET_CODE (XEXP (x, 0)) == PLUS
+  && GET_MODE (XEXP (x, 0)) == Pmode
+  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+  && GET_CODE (XEXP (XEXP (x, 0), 0)) == UNSPEC
+  && XINT (XEXP (XEXP (x, 0), 0), 1) == UNSPEC_PCREL)
+{
+ rtx x2 = XVECEXP (XEXP (XEXP (x, 0), 0), 0, 0);
+ x = gen_rtx_PLUS (Pmode, XEXP (XEXP (x, 0), 1), x2);
+ if (MEM_P (orig_x))
+   x = replace_equiv_address_nv (orig_x, x);
+ return x;
+   }
   if (GET_CODE (x) != CONST
  || GET_CODE (XEXP (x, 0)) != UNSPEC
  || (XINT (XEXP (x, 0), 1) != UNSPEC_GOTPCREL
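
The shape the new hunk matches is (const (plus (unspec [sym] UNSPEC_PCREL) (const_int N))), which it rewrites to (plus (const_int N) sym).  The following is a toy model of that rewrite, assuming a made-up expression type; toy_rtx, toy_code and toy_delegitimize are hypothetical and are not GCC's RTL API.

```c
#include <stddef.h>

/* Toy expression codes, loosely mirroring the RTL codes in the patch.  */
enum toy_code
{
  TOY_CONST, TOY_PLUS, TOY_UNSPEC_PCREL, TOY_CONST_INT, TOY_SYMBOL
};

struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *op0;  /* first operand, or the symbol wrapped by an unspec */
  struct toy_rtx *op1;  /* second operand (TOY_PLUS only) */
  long value;           /* payload for TOY_CONST_INT */
};

/* If X is (const (plus (unspec_pcrel SYM) (const_int N))), fill *OUT
   with (plus (const_int N) SYM) and return 1.  Otherwise leave *OUT
   untouched and return 0.  */
int
toy_delegitimize (const struct toy_rtx *x, struct toy_rtx *out)
{
  const struct toy_rtx *plus;

  if (x == NULL || x->code != TOY_CONST)
    return 0;
  plus = x->op0;
  if (plus == NULL || plus->code != TOY_PLUS
      || plus->op0 == NULL || plus->op0->code != TOY_UNSPEC_PCREL
      || plus->op1 == NULL || plus->op1->code != TOY_CONST_INT)
    return 0;

  out->code = TOY_PLUS;
  out->op0 = plus->op1;       /* the displacement comes first, as in the patch */
  out->op1 = plus->op0->op0;  /* the symbol wrapped by the unspec */
  out->value = 0;
  return 1;
}
```

The real function additionally checks the mode of the PLUS and re-wraps the result with replace_equiv_address_nv when the original operand was a MEM; the toy leaves both out.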


Re: [patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Kai Tietz
2012/2/23 Kai Tietz :
> 2012/2/23 Jakub Jelinek :
>> Why?  I don't see extensions anywhere.  Or is __SIZE_TYPE__ on mingw
>> long long and requires __extension__?  If yes, it should at least go
>> right before the argument, but perhaps then it should be already
>> in the __SIZE_TYPE__ macro.
>
> The issue is that for LLP64 target __SIZE_TYPE__ is 'unsigned long long'.

And the underlying issue is that __extension__ can't be used before
__SIZE_TYPE__ here; GCC fails to parse that properly.  You would get

opt-11.c:6:35: error: expected declaration specifiers or '...' before
'__extension__'
opt-11.c: In function 'main':
opt-11.c:28:3: warning: incompatible implicit declaration of built-in
function 'memset' [enabled by default]

So here is the updated version.

ChangeLog

2012-02-23  Kai Tietz  

* gcc.dg/pack-test-5.c: Add -mno-ms-bitfields option
for mingw-targets.
* gcc.dg/Wpadded.c: Likewise.
* gcc.dg/bf-ms-layout-2.c: Adjust offsets to fit ms-bitfield
structure-layout.
* gcc.dg/di-sync-multithread.c: For mingw targets, replace sleep
with Sleep and include windows.h for this function.
* gcc.dg/format/dfp-printf-1.c: Adjust dg-skip-if rule for mingw
targets.
* gcc.dg/stack-usage-1.c (SIZE): Provide proper SIZE for x64 mingw
target.
* gcc.dg/tls/thr-cse-1.c: Provide proper pattern for x64 mingw
target.
* gcc.dg/tls/opt-11.c (memset): Use __extension__ to avoid fail
on x64 mingw target.
* gcc.dg/bf-ms-attrib.c: Adjust expected size for ms_struct layout.
* gcc.dg/pr50251.c: Disable test for x64 mingw target.
* gcc.c-torture/execute/930930-1.c (long): Replace by ptr_t to avoid
failure on LLP64 target.

Tested for i686-w64-mingw32, x86_64-w64-mingw32, and
x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai

Index: gcc.dg/pack-test-5.c
===
--- gcc.dg/pack-test-5.c(revision 184486)
+++ gcc.dg/pack-test-5.c(working copy)
@@ -1,6 +1,7 @@
 /* PR c/11446: packed on a struct takes precedence over aligned on the type
of a field.  */
 /* { dg-do run } */
+/* { dg-additional-options "-mno-ms-bitfields" { target *-*-mingw* } } */

 extern void abort (void);

Index: gcc.dg/bf-ms-layout-2.c
===
--- gcc.dg/bf-ms-layout-2.c (revision 184503)
+++ gcc.dg/bf-ms-layout-2.c (working copy)
@@ -158,27 +158,27 @@
   struct ten test_ten;

 #if defined (_TEST_MS_LAYOUT) || defined (_MSC_VER)
-  size_t exp_sizeof_one = 12;
-  size_t exp_sizeof_two = 16;
+  size_t exp_sizeof_one = 8;
+  size_t exp_sizeof_two = 12;
   size_t exp_sizeof_three =6;
   size_t exp_sizeof_four = 8;
   size_t exp_sizeof_five = 3;
   size_t exp_sizeof_six = 8;
   size_t exp_sizeof_seven = 3;
-  size_t exp_sizeof_eight = 4;
+  size_t exp_sizeof_eight = 2;
   size_t exp_sizeof_nine = 8;
-  size_t exp_sizeof_ten = 16;
+  size_t exp_sizeof_ten = 8;

-  unsigned char exp_one_c = 8;
-  unsigned char exp_two_c  = 12;
+  unsigned char exp_one_c = 7;
+  unsigned char exp_two_c  = 9;
   unsigned char exp_three_c = 4;
   unsigned char exp_four_c = 4;
   char exp_five_c = 2;
   char exp_six_c = 5;
   char exp_seven_c = 2;
-  char exp_eight_c = 2;
+  char exp_eight_c = 1;
   char exp_nine_c = 0;
-  char exp_ten_c = 8;
+  char exp_ten_c = 1;

 #else /* testing -mno-ms-bitfields */

Index: gcc.dg/di-sync-multithread.c
===
--- gcc.dg/di-sync-multithread.c(revision 184486)
+++ gcc.dg/di-sync-multithread.c(working copy)
@@ -10,6 +10,9 @@

 #include 
 #include 
+#ifdef _WIN32
+#include <windows.h>
+#endif

 /*#define DEBUGIT 1 */

@@ -175,7 +178,11 @@
t, err);
   };

+#ifdef _WIN32
+  Sleep (5000);
+#else
   sleep (5);
+#endif

   /* Stop please.  */
   __sync_lock_test_and_set (&doquit, 1ll);
Index: gcc.dg/format/dfp-printf-1.c
===
--- gcc.dg/format/dfp-printf-1.c(revision 184486)
+++ gcc.dg/format/dfp-printf-1.c(working copy)
@@ -3,7 +3,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target dfp } */
 /* { dg-options "-Wformat" } */
-/* { dg-skip-if "No scanf/printf dfp support" { *-*-mingw* } } */
+/* { dg-skip-if "No scanf/printf dfp support" { *-*-mingw* } { "*" } { "" } } */

 extern int printf (const char *restrict, ...);

Index: gcc.dg/Wpadded.c
===
--- gcc.dg/Wpadded.c(revision 184486)
+++ gcc.dg/Wpadded.c(working copy)
@@ -5,7 +5,8 @@
packing to be larger than 1, which cannot be guaranteed for all targets.
We won't get a warning anyway if the target has "packed" structure
layout.  */
-/* { dg-options "-Wpadded -fpack-struct=8" } */
+/* { dg-additional-options "-Wpadded -fpack-struct=8" } */
+/* { dg-additional-opt

Re: [patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Jakub Jelinek
On Thu, Feb 23, 2012 at 06:23:39PM +0100, Kai Tietz wrote:
> And the underlying issue is that __extension__ can't be used before
> __SIZE_TYPE__ here; GCC fails to parse that properly.  You would get

Then please use
__extension__ typedef __SIZE_TYPE__ size_t;
and use size_t in the prototype.  __extension__ before the prototype
is simply too confusing.
Anyway, will defer the rest to the testsuite maintainers.

Jakub
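
The suggestion above can be sketched as follows.  This is a minimal example under a few assumptions: the compiler predefines __SIZE_TYPE__ (GCC does), test_size_t is a hypothetical name chosen so it cannot clash with a size_t from a system header, and clear_bytes is a made-up helper that only exists to exercise the prototype.

```c
/* __SIZE_TYPE__ is predefined by GCC.  Per this thread it is
   'unsigned long long' on an LLP64 target such as x86_64-w64-mingw32,
   and __extension__ on the typedef keeps pedantic diagnostics about
   that type away from the prototype itself.  */
__extension__ typedef __SIZE_TYPE__ test_size_t;  /* hypothetical name */

extern void *memset (void *, int, test_size_t);

/* Hypothetical helper: zero LEN bytes of BUF through the prototype
   declared above.  */
void
clear_bytes (void *buf, test_size_t len)
{
  memset (buf, 0, len);
}
```

Putting __extension__ on the typedef means it appears once, at the point where the possibly long long type is spelled out, rather than in front of every prototype that uses it.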


Re: [patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Kai Tietz
2012/2/23 Jakub Jelinek :
> On Thu, Feb 23, 2012 at 06:23:39PM +0100, Kai Tietz wrote:
>> And the underlying issue is that __extension__ can't be used before
>> __SIZE_TYPE__ here; GCC fails to parse that properly.  You would get
>
> Then please use
> __extension__ typedef __SIZE_TYPE__ size_t;
> and use size_t in the prototype.  __extension__ before the prototype
> is simply too confusing.
> Anyway, will defer the rest to the testsuite maintainers.
>
>        Jakub

OK, the hunk for gcc.dg/tls/opt-11.c is now modified as follows:

Index: gcc.dg/tls/opt-11.c
===
--- gcc.dg/tls/opt-11.c (revision 184486)
+++ gcc.dg/tls/opt-11.c (working copy)
@@ -2,8 +2,10 @@
 /* { dg-require-effective-target tls_runtime } */
 /* { dg-add-options tls } */

+__extension__ typedef __SIZE_TYPE__ size_t;
+
 extern void abort (void);
-extern void *memset (void *, int, __SIZE_TYPE__);
+extern void *memset (void *, int, size_t);

 struct A
 {


Re: [patch testsuite]: Adjust some tests of gcc.dg and gcc.c-torture for mingw targets

2012-02-23 Thread Mike Stump
On Feb 23, 2012, at 9:25 AM, Jakub Jelinek wrote:
> Anyway, will defer the rest to the testsuite maintainers.

Ok.


[PR51752] publication safety violations in loop invariant motion pass

2012-02-23 Thread Aldy Hernandez
In this PR we have a publication safety violation in a transaction.  The 
loop invariant motion pass hoists a load out of a loop, creating a load 
data race.  Those unfamiliar with load data races in transactions, 
please see the PR, as this has been confusing to most (me included).


In the snippet below, we are not allowed to hoist DATA out unless it 
will be loaded on every path out of the loop:


 __transaction_atomic
   {
 for (i = 0; i < 10; i++)
   if (x[i])
 x[i] += DATA;
 // OK to hoist DATA above out if also loaded here:
 // blah = DATA;
   }
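
To make the effect of the hoist concrete, here is a plain C sketch with the transaction left out so that it builds without -fgnu-tm.  The load is routed through a counting helper so the extra read becomes observable; load_data, data_load_count and the two add_data_* functions are hypothetical names used only for this illustration.

```c
#include <stddef.h>

static int DATA = 5;         /* stands in for the shared global */
static unsigned data_loads;  /* counts how often DATA is read */

static int
load_data (void)
{
  data_loads++;
  return DATA;
}

unsigned
data_load_count (void)
{
  return data_loads;
}

/* As written: DATA is read only when some x[i] is nonzero.  */
void
add_data_original (int *x, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (x[i])
      x[i] += load_data ();
}

/* After loop invariant motion: DATA is read once, unconditionally,
   before the loop, even if no x[i] is nonzero.  */
void
add_data_hoisted (int *x, size_t n)
{
  int tmp = load_data ();
  size_t i;

  for (i = 0; i < n; i++)
    if (x[i])
      x[i] += tmp;
}
```

On an all-zero array the original form never touches DATA, while the hoisted form reads it once.  That extra read on a path where the source had none is the load the pass must not introduce inside a transaction.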

rth gave me the evil eye on a previous incantation of this patch, and 
I'm sure this one is not totally devoid of grime.


The main problem is how to record all loads in a given function in an 
efficient manner, so we can properly tease the information out in 
every_path_out_of_loop_has_load().  In my first revision I had some data 
flow equations that calculated loads on different paths, and rth just 
about hit me.  Instead now I save all loads in a function and iterate 
through them in a brute force way.  I'd like to rewrite this into a hash 
of some sort, but before I go any further I'm interested to know if the 
main idea is ok.


FYI, it has been suggested to use the mem_ref_p information already 
available in the pass, but I need information on all loads, not just 
those occurring inside loops.


BTW, this only fixes the loop invariant motion pass.  I'm sure there are 
other passes that will need equally painful fixes.


Aldy
* tree-ssa-loop-im.c: New global all_tm_blocks.
(every_path_out_of_loop_has_load): New.
(movement_possibility): Restrict movement of transaction loads.
(tree_ssa_lim_initialize): Call tree_ssa_lim_tm_initialize.
(tree_ssa_lim_finalize): Call tree_ssa_lim_tm_finalize.
(tree_ssa_lim_tm_initialize): New.
(tree_ssa_lim_tm_finalize): New.
* tree.h (get_all_tm_blocks): Protoize.
* trans-mem.c (tm_region_init): Use the heap to store BB
auxiliary data.
(get_all_tm_blocks): New.

Index: tree-ssa-loop-im.c
===
--- tree-ssa-loop-im.c  (revision 184445)
+++ tree-ssa-loop-im.c  (working copy)
@@ -150,7 +150,7 @@ typedef struct mem_ref
 
   bitmap indep_ref;/* The set of memory references on that
   this reference is independent.  */
-  bitmap dep_ref;  /* The complement of DEP_REF.  */
+  bitmap dep_ref;  /* The complement of INDEP_REF.  */
 } *mem_ref_p;
 
 DEF_VEC_P(mem_ref_p);
@@ -189,6 +189,13 @@ static struct
 
 static bool ref_indep_loop_p (struct loop *, mem_ref_p);
 
+/* All basic blocks that are within a transaction in the current
+   function.  */
+static bitmap all_tm_blocks;
+
+/* All the loads in the current function.  */
+static mem_ref_locs_p all_loads;
+
 /* Minimum cost of an expensive expression.  */
 #define LIM_EXPENSIVE ((unsigned) PARAM_VALUE (PARAM_LIM_EXPENSIVE))
 
@@ -337,6 +344,26 @@ for_each_index (tree *addr_p, bool (*cbc
 }
 }
 
+/* Return true if every path out of the loop containing STMT loads the
+   decl loaded in STMT.  */
+
+static bool
+every_path_out_of_loop_has_load (gimple stmt)
+{
+  basic_block bb = gimple_bb (stmt);
+  basic_block header = bb->loop_father->header;
+  unsigned int i;
+  mem_ref_loc_p aref;
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  /* Return true if the same load occurs on every path out of the loop.  */
+  FOR_EACH_VEC_ELT (mem_ref_loc_p, all_loads->locs, i, aref)
+if (rhs == *aref->ref
+   && dominated_by_p (CDI_POST_DOMINATORS, header, gimple_bb (aref->stmt)))
+  return true;
+  return false;
+}
+
 /* If it is possible to hoist the statement STMT unconditionally,
returns MOVE_POSSIBLE.
If it is possible to hoist the statement STMT, but we must avoid making
@@ -412,6 +439,26 @@ movement_possibility (gimple stmt)
   || gimple_could_trap_p (stmt))
 return MOVE_PRESERVE_EXECUTION;
 
+  /* Non local loads in a transaction cannot be hoisted out unless the
+ load happens on every path out of the loop.  */
+  if (flag_tm
+  && bitmap_bit_p (all_tm_blocks, gimple_bb (stmt)->index)
+  && gimple_assign_single_p (stmt))
+{
+  tree rhs = gimple_assign_rhs1 (stmt);
+  if (DECL_P (rhs) && is_global_var (rhs)
+ && !every_path_out_of_loop_has_load (stmt))
+   {
+ if (dump_file)
+   {
+ fprintf (dump_file, "Cannot hoist conditional load of ");
+ print_generic_expr (dump_file, rhs, TDF_SLIM);
+ fprintf (dump_file, " because it is in a transaction.\n");
+   }
+ return MOVE_IMPOSSIBLE;
+   }
+}
+
   return ret;
 }
 
@@ -2358,6 +2405,42 @@ fill_always_executed_in (struct loop *lo
 fill_always_executed_in (loop, contains_call);
 }
 
+/* Compute global information needed for transactional restrictions in
+   the loop 

[PATCH 2/5] passes.texi: Fix typo in Full redundancy elimination

2012-02-23 Thread Bernhard Reutner-Fischer
gcc/ChangeLog

2012-02-23  Bernhard Reutner-Fischer  

* doc/passes.texi (Full redundancy elimination): Fix typo.

Signed-off-by: Bernhard Reutner-Fischer 
---
 gcc/doc/passes.texi |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index 1fee7d9..8329ddd 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -393,7 +393,7 @@ in @file{tree-ssa-math-opts.c} and is described by
 @item Full redundancy elimination
 
 This is a simpler form of PRE that only eliminates redundancies that
-occur an all paths.  It is located in @file{tree-ssa-pre.c} and
+occur on all paths.  It is located in @file{tree-ssa-pre.c} and
 described by @code{pass_fre}.
 
 @item Loop optimization
-- 
1.7.9



[PATCH 3/5] tree-if-conv: Commentary typo fix

2012-02-23 Thread Bernhard Reutner-Fischer
gcc/ChangeLog:

2012-02-23  Bernhard Reutner-Fischer  

* tree-if-conv.c (predicate_scalar_phi): Commentary typo fix.

Signed-off-by: Bernhard Reutner-Fischer 
---
 gcc/tree-if-conv.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index cdbbe5b..ca9503f 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1262,7 +1262,7 @@ find_phi_replacement_condition (struct loop *loop,
arguments.
 
For example,
- S1: A = PHI 
is converted into,
  S2: A = cond ? x1 : x2;
 
-- 
1.7.9



[PATCH 5/5] dump_file whitespace nitpicks

2012-02-23 Thread Bernhard Reutner-Fischer
gcc/ChangeLog:

2012-02-23  Bernhard Reutner-Fischer  

* tree-into-ssa.c (update_ssa): Avoid trailing whitespace in
dump_file.
* tree-ssa-sccvn.c (print_scc): Ditto.

Signed-off-by: Bernhard Reutner-Fischer 
---
 gcc/tree-into-ssa.c  |4 ++--
 gcc/tree-ssa-sccvn.c |4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
index 7eaed2a..6ca52c1 100644
--- a/gcc/tree-into-ssa.c
+++ b/gcc/tree-into-ssa.c
@@ -3519,9 +3519,9 @@ update_ssa (unsigned update_flags)
 
   if (dump_flags & TDF_DETAILS)
{
- fprintf (dump_file, "Affected blocks: ");
+ fprintf (dump_file, "Affected blocks:");
  EXECUTE_IF_SET_IN_BITMAP (blocks_to_update, 0, i, bi)
-   fprintf (dump_file, "%u ", i);
+   fprintf (dump_file, " %u", i);
  fprintf (dump_file, "\n");
}
 
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index fdebe47..ddb1ba6 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -2462,11 +2462,11 @@ print_scc (FILE *out, VEC (tree, heap) *scc)
   tree var;
   unsigned int i;
 
-  fprintf (out, "SCC consists of: ");
+  fprintf (out, "SCC consists of:");
   FOR_EACH_VEC_ELT (tree, scc, i, var)
 {
-  print_generic_expr (out, var, 0);
   fprintf (out, " ");
+  print_generic_expr (out, var, 0);
 }
   fprintf (out, "\n");
 }
-- 
1.7.9
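
Both hunks above use the same trick: print the separator before each element instead of after it, so the line never ends in a space.  A minimal standalone sketch of that pattern, writing into a buffer rather than dump_file, could look like this; format_id_list is a hypothetical helper, not a GCC function.

```c
#include <stddef.h>
#include <stdio.h>

/* Write LABEL followed by each id into BUF, emitting the separating
   space before every element so the result never ends in whitespace.
   Return the number of characters written, not counting the
   terminating NUL, or -1 if BUF is too small.  */
int
format_id_list (char *buf, size_t size, const char *label,
                const unsigned *ids, size_t count)
{
  size_t used;
  size_t i;
  int n = snprintf (buf, size, "%s", label);

  if (n < 0 || (size_t) n >= size)
    return -1;
  used = (size_t) n;
  for (i = 0; i < count; i++)
    {
      n = snprintf (buf + used, size - used, " %u", ids[i]);
      if (n < 0 || (size_t) n >= size - used)
        return -1;
      used += (size_t) n;
    }
  return (int) used;
}
```

With an empty list the output is just the label, which is the case where the old "label plus trailing space" form left stray whitespace in the dump.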



[PATCH 4/5] make_phi_node can be static

2012-02-23 Thread Bernhard Reutner-Fischer
gcc/ChangeLog:

2012-02-23  Bernhard Reutner-Fischer  

* tree-phinodes.c (make_phi_node): Mark static.
* tree-flow.h (make_phi_node): Remove extern decl.
* doc/gimple.texi (make_phi_node): Remove documentation.

Signed-off-by: Bernhard Reutner-Fischer 
---
 gcc/doc/gimple.texi |4 
 gcc/tree-flow.h |1 -
 gcc/tree-phinodes.c |2 +-
 3 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi
index b75dc72..fa31eb0 100644
--- a/gcc/doc/gimple.texi
+++ b/gcc/doc/gimple.texi
@@ -1963,10 +1963,6 @@ Set @code{CLAUSES} to be the clauses associated with 
@code{OMP_SINGLE} @code{G}.
 @subsection @code{GIMPLE_PHI}
 @cindex @code{GIMPLE_PHI}
 
-@deftypefn {GIMPLE function} gimple make_phi_node (tree var, int len)
-Build a @code{PHI} node with len argument slots for variable var.
-@end deftypefn
-
 @deftypefn {GIMPLE function} unsigned gimple_phi_capacity (gimple g)
 Return the maximum number of arguments supported by @code{GIMPLE_PHI} @code{G}.
 @end deftypefn
diff --git a/gcc/tree-flow.h b/gcc/tree-flow.h
index f4c4d5c..319be2b 100644
--- a/gcc/tree-flow.h
+++ b/gcc/tree-flow.h
@@ -504,7 +504,6 @@ extern void find_referenced_vars_in (gimple);
 /* In tree-phinodes.c  */
 extern void reserve_phi_args_for_new_edge (basic_block);
 extern void add_phi_node_to_bb (gimple phi, basic_block bb);
-extern gimple make_phi_node (tree var, int len);
 extern gimple create_phi_node (tree, basic_block);
 extern void add_phi_arg (gimple, tree, edge, source_location);
 extern void remove_phi_args (edge);
diff --git a/gcc/tree-phinodes.c b/gcc/tree-phinodes.c
index 1d7e5c2..218a551 100644
--- a/gcc/tree-phinodes.c
+++ b/gcc/tree-phinodes.c
@@ -204,7 +204,7 @@ ideal_phi_node_len (int len)
 
 /* Return a PHI node with LEN argument slots for variable VAR.  */
 
-gimple
+static gimple
 make_phi_node (tree var, int len)
 {
   gimple phi;
-- 
1.7.9



[PATCH 1/5] invoke.texi: remove duplicate pass-flag entries

2012-02-23 Thread Bernhard Reutner-Fischer
gcc/ChangeLog

2012-02-23  Bernhard Reutner-Fischer  

* doc/invoke.texi (-fdse, -fdce): Remove duplicate entries.

Signed-off-by: Bernhard Reutner-Fischer 
---
 gcc/doc/invoke.texi |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7562273..6cb80cb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -356,8 +356,8 @@ Objective-C and Objective-C++ Dialects}.
 -fcompare-elim -fcprop-registers -fcrossjumping @gol
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
 -fcx-limited-range @gol
--fdata-sections -fdce -fdce -fdelayed-branch @gol
--fdelete-null-pointer-checks -fdse -fdevirtualize -fdse @gol
+-fdata-sections -fdce -fdelayed-branch @gol
+-fdelete-null-pointer-checks -fdevirtualize -fdse @gol
 -fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol
 -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} 
@gol
 -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
-- 
1.7.9



[PATCH 0/5] misc janitorial tweaks

2012-02-23 Thread Bernhard Reutner-Fischer
Hi,

A few tweaks I found in an old local branch.
Ok for trunk?

The duplicate pass-flag entries in invoke.texi are something that one of
the generator scripts should ideally have warned about. Perhaps we
should remember this as a feature bug.

Bernhard Reutner-Fischer (5):
  invoke.texi: remove duplicate pass-flag entries
  passes.texi: Fix typo in Full redundancy elimination
  tree-if-conv: Commentary typo fix
  make_phi_node can be static
  dump_file whitespace nitpicks

 gcc/doc/gimple.texi  |4 
 gcc/doc/invoke.texi  |4 ++--
 gcc/doc/passes.texi  |2 +-
 gcc/tree-flow.h  |1 -
 gcc/tree-if-conv.c   |2 +-
 gcc/tree-into-ssa.c  |4 ++--
 gcc/tree-phinodes.c  |2 +-
 gcc/tree-ssa-sccvn.c |4 ++--
 8 files changed, 9 insertions(+), 14 deletions(-)

-- 
1.7.9



Re: [PATCH] Fix PR52298

2012-02-23 Thread Ulrich Weigand
Richard Guenther wrote:

>   PR tree-optimization/52298
>   * tree-vect-stmts.c (vectorizable_store): Properly use
>   STMT_VINFO_DR_STEP instead of DR_STEP when vectorizing
>   outer loops.
>   (vectorizable_load): Likewise.
>   * tree-vect-data-refs.c (vect_analyze_data_ref_access):
>   Access DR_STEP after ensuring it is not NULL.

This causes a bunch of regressions on SPU:

FAIL: gcc.dg/vect/vect-outer-fir-big-array.c (internal compiler error)
FAIL: gcc.dg/vect/vect-outer-fir-big-array.c (test for excess errors)
WARNING: gcc.dg/vect/vect-outer-fir-big-array.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-outer-fir-big-array.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/vect-outer-fir-lb-big-array.c (internal compiler error)
FAIL: gcc.dg/vect/vect-outer-fir-lb-big-array.c (test for excess errors)
WARNING: gcc.dg/vect/vect-outer-fir-lb-big-array.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-outer-fir-lb-big-array.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/vect-outer-fir-lb.c (internal compiler error)
FAIL: gcc.dg/vect/vect-outer-fir-lb.c (test for excess errors)
WARNING: gcc.dg/vect/vect-outer-fir-lb.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-outer-fir-lb.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
FAIL: gcc.dg/vect/vect-outer-fir.c (internal compiler error)
FAIL: gcc.dg/vect/vect-outer-fir.c (test for excess errors)
WARNING: gcc.dg/vect/vect-outer-fir.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-outer-fir.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2

all due to ICEs of the same type:

 internal compiler error: in vectorizable_load, at tree-vect-stmts.c:4665

The assert in question looks like:

  if (nested_in_vect_loop
  && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
  % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
{ 
  gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);
  compute_in_loop = true;
}

where your patch changed DR_STEP to STMT_VINFO_DR_STEP (reverting just this
one change makes the ICEs go away).

However, at the place where the decision to use the
dr_explicit_realign_optimized strategy is made
(tree-vect-data-refs.c:vect_supportable_dr_alignment), we still have:

  if ((nested_in_vect_loop
   && (TREE_INT_CST_LOW (DR_STEP (dr))
       != GET_MODE_SIZE (TYPE_MODE (vectype))))
  || !loop_vinfo)
return dr_explicit_realign;
  else
return dr_explicit_realign_optimized;

Should this now also use STMT_VINFO_DR_STEP?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Backport a patch to 4.6 branch

2012-02-23 Thread Jakub Jelinek
Hi!

I've backported this to 4.6 branch, after bootstrapping/regtesting
it on x86_64-linux and i686-linux.

2012-02-23  Jakub Jelinek  

Backported from trunk
2012-02-20  Georg-Johann Lay  

* gcc.c-torture/execute/pr52286.c: Fix FAIL on 16-bit int platforms.

2012-02-20  Jakub Jelinek  

PR tree-optimization/52286
* fold-const.c (fold_binary_loc): For (X & C1) | C2
optimization use double_int_to_tree instead of build_int_cst_wide,
rewrite to use double_int vars.

* gcc.c-torture/execute/pr52286.c: New test.

--- gcc/fold-const.c(revision 184390)
+++ gcc/fold-const.c(revision 184391)
@@ -10959,66 +10959,50 @@ fold_binary_loc (location_t loc,
  && TREE_CODE (arg1) == INTEGER_CST
  && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST)
{
- unsigned HOST_WIDE_INT hi1, lo1, hi2, lo2, hi3, lo3, mlo, mhi;
+ double_int c1, c2, c3, msk;
  int width = TYPE_PRECISION (type), w;
- hi1 = TREE_INT_CST_HIGH (TREE_OPERAND (arg0, 1));
- lo1 = TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1));
- hi2 = TREE_INT_CST_HIGH (arg1);
- lo2 = TREE_INT_CST_LOW (arg1);
+ c1 = tree_to_double_int (TREE_OPERAND (arg0, 1));
+ c2 = tree_to_double_int (arg1);
 
  /* If (C1&C2) == C1, then (X&C1)|C2 becomes (X,C2).  */
- if ((hi1 & hi2) == hi1 && (lo1 & lo2) == lo1)
+ if (double_int_equal_p (double_int_and (c1, c2), c1))
return omit_one_operand_loc (loc, type, arg1,
-TREE_OPERAND (arg0, 0));
+TREE_OPERAND (arg0, 0));
 
- if (width > HOST_BITS_PER_WIDE_INT)
-   {
- mhi = (unsigned HOST_WIDE_INT) -1
-   >> (2 * HOST_BITS_PER_WIDE_INT - width);
- mlo = -1;
-   }
- else
-   {
- mhi = 0;
- mlo = (unsigned HOST_WIDE_INT) -1
-   >> (HOST_BITS_PER_WIDE_INT - width);
-   }
+ msk = double_int_mask (width);
 
  /* If (C1|C2) == ~0 then (X&C1)|C2 becomes X|C2.  */
- if ((~(hi1 | hi2) & mhi) == 0 && (~(lo1 | lo2) & mlo) == 0)
+ if (double_int_zero_p (double_int_and_not (msk,
+double_int_ior (c1, c2))))
return fold_build2_loc (loc, BIT_IOR_EXPR, type,
-   TREE_OPERAND (arg0, 0), arg1);
+   TREE_OPERAND (arg0, 0), arg1);
 
  /* Minimize the number of bits set in C1, i.e. C1 := C1 & ~C2,
 unless (C1 & ~C2) | (C2 & C3) for some C3 is a mask of some
 mode which allows further optimizations.  */
- hi1 &= mhi;
- lo1 &= mlo;
- hi2 &= mhi;
- lo2 &= mlo;
- hi3 = hi1 & ~hi2;
- lo3 = lo1 & ~lo2;
+ c1 = double_int_and (c1, msk);
+ c2 = double_int_and (c2, msk);
+ c3 = double_int_and_not (c1, c2);
  for (w = BITS_PER_UNIT;
   w <= width && w <= HOST_BITS_PER_WIDE_INT;
   w <<= 1)
{
  unsigned HOST_WIDE_INT mask
= (unsigned HOST_WIDE_INT) -1 >> (HOST_BITS_PER_WIDE_INT - w);
- if (((lo1 | lo2) & mask) == mask
- && (lo1 & ~mask) == 0 && hi1 == 0)
+ if (((c1.low | c2.low) & mask) == mask
+ && (c1.low & ~mask) == 0 && c1.high == 0)
{
- hi3 = 0;
- lo3 = mask;
+ c3 = uhwi_to_double_int (mask);
  break;
}
}
- if (hi3 != hi1 || lo3 != lo1)
+ if (!double_int_equal_p (c3, c1))
return fold_build2_loc (loc, BIT_IOR_EXPR, type,
-   fold_build2_loc (loc, BIT_AND_EXPR, type,
-TREE_OPERAND (arg0, 0),
-build_int_cst_wide (type,
-lo3, hi3)),
-   arg1);
+   fold_build2_loc (loc, BIT_AND_EXPR, type,
+TREE_OPERAND (arg0, 0),
+double_int_to_tree (type,
+c3)),
+   arg1);
}
 
   /* (X & Y) | Y is (X, Y).  */
--- gcc/testsuite/gcc.c-torture/execute/pr52286.c   (revision 0)
+++ gcc/testsuite/gcc.c-torture/execute/pr52286.c   (revision 184394)
@@ -0,0 +1,20 @@
+/* PR tree-optimization/52286 */
+
+extern void abort (void);
+
+int
+main ()
+{
+#if __SIZEOF_INT__ > 2
+  int a, b;
+  asm ("" : "=r" (a) : "0" (0));
+  b = (~a | 1) & -2038094497;
+#else
+  long a, b;
+  asm ("" : "=r" (a) : "0" (0));
+  b = (~a | 1) & -20380944

[lra] a patch to fix a testsuite regression on SPARC.

2012-02-23 Thread Vladimir Makarov

The following patch fixes a few SPARC GCC testsuite regressions.

The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 184512.

2012-02-23  Vladimir Makarov 

* lra-eliminations.c (update_reg_eliminate): Make hard register
unallocatable when we make it a result of the elimination.

Index: lra-eliminations.c
===
--- lra-eliminations.c  (revision 184177)
+++ lra-eliminations.c  (working copy)
@@ -1103,6 +1103,9 @@ update_reg_eliminate (bitmap insns_with_
  if (lra_dump_file != NULL)
fprintf (lra_dump_file, "Using elimination %d to %d now\n",
 ep1->from, ep1->to);
+ /* Prevent the hard register into which we eliminate now
+from the usage for pseudos.  */
+ SET_HARD_REG_BIT (temp_hard_reg_set, ep1->to);
  gcc_assert (ep1->previous_offset == 0);
  ep1->previous_offset = ep->offset;
}


Re: [PATCH][ARM] 64-bit shifts in NEON.

2012-02-23 Thread Andrew Stubbs

On 21/02/12 15:23, Andrew Stubbs wrote:

On 06/02/12 13:13, Andrew Stubbs wrote:

This patch adds DImode shift support in NEON registers/instructions.

The patch delays any lowering until the split2 pass, after the
register allocator has chosen whether to do the shift in NEON (VFP)
registers, or in core-registers.

The core-registers case depends on the patch I previously posted here:
http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01472.html

The NEON right-shifts make life more interesting by using a left-shift
instruction with a negative offset. This means that the amount has to be
negated. Ideally you'd want to do this at expand time, but the delayed
NEON/core decision makes this impossible, so I've chosen to expand this
in the post-reload split pass. Unfortunately, NEON does not provide a
suitable instruction for negating the shift amount, so that ends up
happening in core-registers.

Another complication is that the NEON shift instructions use a 64-bit
register for the shift amount, but they only pay attention to the bottom
8 bits. I did experiment with using a DImode shift amount, but that
didn't work out well; there were unnecessary extends and the
core-registers fall back was less efficient.

Therefore, I've chosen to create a new register class, VFP_LO_REGS_EVEN,
which includes only the 32-bit low-part of the DImode NEON registers so
the shift amount can be loaded into VFP regs without extending them.
This required a new print format 'E' that converts the low-part name to
the full register name the instructions need. Unfortunately, this does
artificially limit the shift amount to the bottom half of the register
set, but hopefully that's not going to be a big problem.

The register allocator is causing me trouble though. The problem is that
the compiler just refused to use the NEON variant in all of my toy
examples. It turns out to be simply that the IRA & reload passes do not
change hard-registers already present in the RTL (function parameters,
return values, etc.) unless there is absolutely no alternative that
works with that register. I'm not sure if there's anything that can be
done about this, or not. I'm not even sure if it isn't the right choice
much of the time, cost wise.


I've now updated the patch to take into account size optimization.

Currently, if optimizing for size the compiler prefers to call the
libgcc function, rather than do the shift inline.

With my old patch, when NEON is enabled it always used the inline code
(either in NEON or core-registers) no matter which optimization flags
were set. This is more-or-less correct if the register allocator chooses
to do the operation in NEON, but much less space efficient otherwise.

The update simply disables the core-registers fall-back option when
optimizing for size. Transferring the values to NEON registers and back
should be roughly the same size as calling a function, so there
shouldn't be a big loss.

I'm in two minds about the shift-by-constant cases though, since they
expand to fewer instructions. Any thoughts?


And yet another update.

This time I noticed that I didn't discard the "clobber"s after the split 
has determined they're not necessary any more. Presumably the 
unallocated "match_scratch"es were harmless, but the unnecessary CC 
clobbers could affect if-conversion and scheduling.


This patch is the same as the previous, except that I've broken out the 
alternatives that don't need any clobbers.


Ok for 4.8?

Andrew
2012-02-21  Andrew Stubbs  

	gcc/
	* config/arm/arm.c (arm_print_operand): Add new 'E' format code.
	* config/arm/arm.h (enum reg_class): Add VFP_LO_REGS_EVEN.
	(REG_CLASS_NAMES, REG_CLASS_CONTENTS, IS_VFP_CLASS): Likewise.
	* config/arm/arm.md (ashldi3): Add TARGET_NEON case.
	(ashrdi3, lshrdi3): Likewise.
	* config/arm/constraints.md (T): New register constraint.
	(Pe, P1, Pf, Pg): New constraints.
	* config/arm/neon.md (signed_shift_di3_neon, unsigned_shift_di3_neon,
	ashldi3_neon, ashldi3_neon_noclobber, ashrdi3_neon_imm,
	ashrdi3_neon_reg, ashrdi3_neon, ashrdi3_neon_imm_noclobber,
	lshrdi3_neon_imm, ashrdi3_neon, lshrdi3_neon_imm_noclobber,
	lshrdi3_neon_imm, lshrdi3_neon_reg, lshrdi3_neon): New patterns.
	* config/arm/predicates.md (int_0_to_63): New predicate.
	(shift_amount_64): New predicate.

---
 gcc/config/arm/arm.c  |   18 +++
 gcc/config/arm/arm.h  |5 +
 gcc/config/arm/arm.md |   33 -
 gcc/config/arm/constraints.md |   30 
 gcc/config/arm/neon.md|  290 +
 gcc/config/arm/predicates.md  |8 +
 6 files changed, 374 insertions(+), 10 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 386231a..65ccd91 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17661,6 +17661,24 @@ arm_print_operand (FILE *stream, rtx x, int code)
   }
   return;
 
+/* Print the VFP/Neon double precision register name that overlaps the
+   given single-precision register.  

4.6 branch now frozen

2012-02-23 Thread Jakub Jelinek
Hi!

4.6.3-rc1 snapshot is being built right now, the 4.6 branch is from now on
frozen, all commits require an extra ack from one of the RMs.
I hope we can release 4.6.3 late next week.

Jakub


[PATCH]: Fix PR52179 and remove hack from PR48299

2012-02-23 Thread Jack Howarth
  The attached patch implements the fix for supporting ASLR on darwin11
and later that exists in current upstream boehm-gc, as discussed in
https://github.com/ivmai/bdwgc/issues/13; see
https://github.com/ivmai/bdwgc/commit/faef04e7cb3741163dfdf65900ef5d2a0530be0f.
This change eliminates the test failures in the boehm-gc testsuite
except for random failures

WARNING: program timed out.
FAIL: boehm-gc.c/thread_leak_test.c -O2 execution test
Running
/sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20120222/boehm-gc/testsuite/boehm-gc.lib/lib.exp

due to PR48299. Fixing PR52179 allows us to stop passing -Wl,-no_pie on
SYSTEMSPEC, by reverting r175182, without regressions in the libjava
testsuite. Bootstrapped and regression tested on x86_64-apple-darwin11.
Okay for gcc trunk for gcc 4.7 since this is target specific?
   Jack


2012-02-23  Patrick Marlier  
Jack Howarth  

boehm-gc/

PR boehm-gc/52179
* include/gc_config.h.in: Undefine HAVE_PTHREAD_GET_STACKADDR_NP.
* include/private/gcconfig.h (DARWIN): Define STACKBOTTOM with
pthread_get_stackaddr_np when available.
* configure.ac (THREADS): Check availability of 
pthread_get_stackaddr_np.
* configure: Regenerate.

libjava/

PR target/49461
* configure.ac (SYSTEMSPEC): No longer pass -no_pie for darwin11.
* configure: Regenerate.


Index: boehm-gc/configure.ac
===
--- boehm-gc/configure.ac   (revision 184521)
+++ boehm-gc/configure.ac   (working copy)
@@ -380,6 +380,7 @@ esac
 oldLIBS="$LIBS"
 LIBS="$LIBS $THREADLIBS"
 AC_CHECK_FUNCS([pthread_getattr_np])
+AC_CHECK_FUNCS([pthread_get_stackaddr_np])
 LIBS="$oldLIBS"
 
 # Configuration of machine-dependent code
Index: boehm-gc/include/gc_config.h.in
===
--- boehm-gc/include/gc_config.h.in (revision 184521)
+++ boehm-gc/include/gc_config.h.in (working copy)
@@ -87,6 +87,9 @@
 /* Define to 1 if you have the `pthread_getattr_np' function. */
 #undef HAVE_PTHREAD_GETATTR_NP
 
+/* Define to 1 if you have the `pthread_get_stackaddr_np_np' function. */
+#undef HAVE_PTHREAD_GET_STACKADDR_NP
+
 /* Define to 1 if you have the <stdint.h> header file. */
 #undef HAVE_STDINT_H
 
Index: boehm-gc/include/private/gcconfig.h
===
--- boehm-gc/include/private/gcconfig.h (revision 184521)
+++ boehm-gc/include/private/gcconfig.h (working copy)
@@ -1331,7 +1331,11 @@
 These aren't used when dyld support is enabled (it is by default) */
 # define DATASTART ((ptr_t) get_etext())
 # define DATAEND   ((ptr_t) get_end())
-# define STACKBOTTOM ((ptr_t) 0xc000)
+# ifdef HAVE_PTHREAD_GET_STACKADDR_NP
+#   define STACKBOTTOM (ptr_t)pthread_get_stackaddr_np(pthread_self())
+# else
+#   define STACKBOTTOM ((ptr_t) 0xc000)
+# endif
 # define USE_MMAP
 # define USE_MMAP_ANON
 # define USE_ASM_PUSH_REGS
@@ -2014,7 +2018,11 @@
 These aren't used when dyld support is enabled (it is by default) */
 # define DATASTART ((ptr_t) get_etext())
 # define DATAEND   ((ptr_t) get_end())
-# define STACKBOTTOM ((ptr_t) 0x7fff5fc0)
+# ifdef HAVE_PTHREAD_GET_STACKADDR_NP
+#   define STACKBOTTOM (ptr_t)pthread_get_stackaddr_np(pthread_self())
+# else
+#   define STACKBOTTOM ((ptr_t) 0x7fff5fc0)
+# endif
 # define USE_MMAP
 # define USE_MMAP_ANON
 # ifdef GC_DARWIN_THREADS
Index: libjava/configure.ac
===
--- libjava/configure.ac(revision 184521)
+++ libjava/configure.ac(working copy)
@@ -898,14 +898,9 @@ case "${host}" in
 SYSTEMSPEC="-lunicows $SYSTEMSPEC"
   fi
 ;;
-*-*-darwin9*)
+*-*-darwin[[912]]*)
   SYSTEMSPEC="%{!Zdynamiclib:%{!Zbundle:-allow_stack_execute}}"
 ;;
-*-*-darwin[[12]]*)
-  # Something is incompatible with pie, would be nice to fix it and
-  # remove -no_pie.  PR49461
-  SYSTEMSPEC="-no_pie %{!Zdynamiclib:%{!Zbundle:-allow_stack_execute}}"
-;;
 *)
   SYSTEMSPEC=
 ;;


Re: [PR51752] publication safety violations in loop invariant motion pass

2012-02-23 Thread Aldy Hernandez

On 02/23/12 12:19, Aldy Hernandez wrote:


about hit me. Instead now I save all loads in a function and iterate
through them in a brute force way. I'd like to rewrite this into a hash
of some sort, but before I go any further I'm interested to know if the
main idea is ok.


For the record, it may be ideal to reuse some of the iterations we 
already do over the function's basic blocks, so we don't have to iterate 
yet again over the IL stream.  Though it may complicate the pass 
unnecessarily.


Re: [PATCH 1/5] invoke.texi: remove duplicate pass-flag entries

2012-02-23 Thread Joseph S. Myers
On Thu, 23 Feb 2012, Bernhard Reutner-Fischer wrote:

> gcc/ChangeLog
> 
> 2012-02-23  Bernhard Reutner-Fischer  
> 
>   * doc/invoke.texi (-fdse, -fdce): Remove duplicate entries.

OK (actually, you should commit this as obvious).

Now we have the arrangements for docstring relicensing it may make sense 
to get documentation for each individual option out of the .opt files, and 
generate these lists more automatically, but that's certainly not a 
priority.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/5] passes.texi: Fix typo in Full redundancy elimination

2012-02-23 Thread Joseph S. Myers
On Thu, 23 Feb 2012, Bernhard Reutner-Fischer wrote:

> gcc/ChangeLog
> 
> 2012-02-23  Bernhard Reutner-Fischer  
> 
>   * doc/passes.texi (Full redundancy elimination): Fix typo.

Should be committed as obvious.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH][ARM] core -> NEON extend

2012-02-23 Thread Andrew Stubbs

Hi All,

This patch converts SImode to DImode extends that also move from core 
registers to VFP/NEON registers.


Currently, the compiler does the extend in core registers first, and
then does the move. This adds to register pressure, which I would
imagine to be a bad thing. If the value is not in a properly aligned
register (the first parameter to a function never is) then it also has
to be moved around.


With my patch, it first moves the SImode value into the NEON register, 
and then extends it, which uses no extra registers.


Zero extend, before and after (assuming the value is passed in r0):

mov     r2, r0       | vdup.32   d16, r0
movs    r3, #0       | vshr.u64  d16, d16, #32
fmdrr   d16, r2, r3  |

Sign extend:

mov     r2, r0       | vdup.32   d16, r0
asrs    r3, r0, #31  | vshr.s64  d16, d16, #32
fmdrr   d16, r2, r3  |

OK for 4.8?

Andrew


P.S.

I have experimented with doing zero-extends something like

vmov.i64  d7, #0
fmsr  s14, r0

But, somehow the immediate load doesn't seem to work, and it limits the 
target register to VFP_LO_REGS. It's also not possible to load into only 
s15, so I'm not sure there's any advantage.
2012-02-23  Andrew Stubbs  

	gcc/
	* config/arm/arm.md (zero_extenddi2): Add extra alternatives
	for NEON registers.
	(extenddi2): Likewise.
	Prevent extend splitters doing NEON alternatives.
	* config/arm/iterators.md (qhs_extenddi_cstr, qhs_zextenddi_cstr):
	Adjust constraints to add new alternatives.
	* config/arm/neon.md: Add splitters for zero- and sign-extend.

	gcc/testsuite/
	* gcc.target/arm/neon-extend-1.c: New file.
	* gcc.target/arm/neon-extend-2.c: New file.

---
 gcc/config/arm/arm.md|   26 +++---
 gcc/config/arm/iterators.md  |4 ++--
 gcc/config/arm/neon.md   |   22 ++
 gcc/testsuite/gcc.target/arm/neon-extend-1.c |   13 +
 gcc/testsuite/gcc.target/arm/neon-extend-2.c |   13 +
 5 files changed, 65 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-extend-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-extend-2.c

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 182c52a..35bf688 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4479,33 +4479,35 @@
 ;; Zero and sign extension instructions.
 
 (define_insn "zero_extenddi2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
+  [(set (match_operand:DI 0 "s_register_operand" "=w, r")
 (zero_extend:DI (match_operand:QHSI 1 ""
 	"")))]
   "TARGET_32BIT "
   "#"
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "predicable" "yes")]
+  [(set_attr "length" "8,8")
+   (set_attr "ce_count" "2,2")
+   (set_attr "predicable" "yes,yes")]
 )
 
 (define_insn "extenddi2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
+  [(set (match_operand:DI 0 "s_register_operand" "=w,r")
 (sign_extend:DI (match_operand:QHSI 1 ""
 	"")))]
   "TARGET_32BIT "
   "#"
-  [(set_attr "length" "8")
-   (set_attr "ce_count" "2")
-   (set_attr "shift" "1")
-   (set_attr "predicable" "yes")]
+  [(set_attr "length" "8,8")
+   (set_attr "ce_count" "2,2")
+   (set_attr "shift" "1,1")
+   (set_attr "predicable" "yes,yes")]
 )
 
 ;; Splits for all extensions to DImode
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
 (zero_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT"
+  "TARGET_32BIT && (!TARGET_NEON
+		|| (reload_completed
+			&& !(IS_VFP_REGNUM (REGNO (operands[0])))))"
   [(set (match_dup 0) (match_dup 1))]
 {
   rtx lo_part = gen_lowpart (SImode, operands[0]);
@@ -4531,7 +4533,9 @@
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
 (sign_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT"
+  "TARGET_32BIT && (!TARGET_NEON
+		|| (reload_completed
+			&& !(IS_VFP_REGNUM (REGNO (operands[0])))))"
   [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (const_int 31)))]
 {
   rtx lo_part = gen_lowpart (SImode, operands[0]);
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 1567264..07ac5da 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -409,8 +409,8 @@
 (define_mode_attr qhs_extenddi_op [(SI "s_register_operand")
    (HI "nonimmediate_operand")
    (QI "arm_reg_or_extendqisi_mem_op")])
-(define_mode_attr qhs_extenddi_cstr [(SI "r") (HI "rm") (QI "rUq")])
-(define_mode_attr qhs_zextenddi_cstr [(SI "r") (HI "rm") (QI "rm")])
+(define_mode_attr qhs_extenddi_cstr [(SI "r,r") (HI "r,rm") (QI "r,rUq")])
+(define_mode_attr qhs_zextenddi_cstr [(SI "r,r") (HI "r,rm") (QI "r,rm")])
 
 ;; Mode attributes used for fixed-point support.
 (define_mode_attr qaddsub_suf [(V4UQQ "8") (V2UHQ "16") (UQQ "8") (UHQ "16")
diff --git a/gcc/config/arm/neon.md b/gcc

[lra] a patch to fix GCC crash on s390

2012-02-23 Thread Vladimir Makarov
The following patch fixes a compiler crash on s390 during the 
bootstrap.   Unfortunately it is not enough to fix the current s390 
bootstrap failure.


The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 184524.

2012-02-23  Vladimir Makarov 

* lra-constraints.c (inherit_in_ebb): Don't do inheritance from
output of a jump.

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 184510)
+++ lra-constraints.c   (working copy)
@@ -4363,8 +4363,9 @@ inherit_in_ebb (rtx head, rtx tail)
{
  reloads_num++;
  /* 'original_pseudo <- reload_pseudo'.  */
- if (inherit_reload_reg (true, false, dst_regno, cl,
- curr_insn, next_usage_insns))
+ if (! JUMP_P (curr_insn)
+ && inherit_reload_reg (true, false, dst_regno, cl,
+curr_insn, next_usage_insns))
change_p = true;
  /* Invalidate.  */
  usage_insns[dst_regno].check = 0;
@@ -4425,9 +4426,11 @@ inherit_in_ebb (rtx head, rtx tail)
/* Invalidate.  */
usage_insns[dst_regno].check = 0;
  }
- for (i = 0; i < to_inherit_num; i++)
-   if (inherit_reload_reg (true, false, to_inherit[i].regno, ALL_REGS,
-   curr_insn, to_inherit[i].insns))
+ if (! JUMP_P (curr_insn))
+   for (i = 0; i < to_inherit_num; i++)
+ if (inherit_reload_reg (true, false, to_inherit[i].regno,
+ ALL_REGS, curr_insn,
+ to_inherit[i].insns))
  change_p = true;
  if (CALL_P (curr_insn))
calls_num++;


Re: [PATCH]: Fix PR52179 and remove hack from PR48299

2012-02-23 Thread Mike Stump
On Feb 23, 2012, at 1:03 PM, Jack Howarth wrote:
> Okay for gcc trunk for gcc 4.7 since this is target specific?

Ok.

> 2012-02-23  Patrick Marlier  
>   Jack Howarth  
> 
> boehm-gc/
> 
>   PR boehm-gc/52179
>   * include/gc_config.h.in: Undefine HAVE_PTHREAD_GET_STACKADDR_NP.
>   * include/private/gcconfig.h (DARWIN): Define STACKBOTTOM with
>   pthread_get_stackaddr_np when available.
>   * configure.ac (THREADS): Check availability of 
> pthread_get_stackaddr_np.
>* configure: Regenerate.
> 
> libjava/
> 
>   PR target/49461
>   * configure.ac (SYSTEMSPEC): No longer pass -no_pie for darwin11.
>   * configure: Regenerate.


Fix PR bootstrap/52287

2012-02-23 Thread Eric Botcazou
This fixes the recent bootstrap failure of the mainline on SPARC/Solaris 8 by 
stabilizing the sort of the ready list of the Haïfa scheduler in the presence 
of debug insns.

Bootstrapped/regtested on SPARC/Solaris 8 and x86-64/Linux, pre-approved by 
Bernd, applied on the mainline.


2012-02-23  Eric Botcazou  

PR bootstrap/52287
* haifa-sched.c (rank_for_schedule): Stabilize sort for debug insns.


-- 
Eric Botcazou

Index: haifa-sched.c
===
--- haifa-sched.c	(revision 184352)
+++ haifa-sched.c	(working copy)
@@ -1647,8 +1647,10 @@ rank_for_schedule (const void *x, const
   /* Schedule debug insns as early as possible.  */
   if (DEBUG_INSN_P (tmp) && !DEBUG_INSN_P (tmp2))
 	return -1;
-  else if (DEBUG_INSN_P (tmp2))
+  else if (!DEBUG_INSN_P (tmp) && DEBUG_INSN_P (tmp2))
 	return 1;
+  else if (DEBUG_INSN_P (tmp) && DEBUG_INSN_P (tmp2))
+	return INSN_LUID (tmp) - INSN_LUID (tmp2);
 }
 
   /* The insn in a schedule group should be issued the first.  */


Re: [PATCH 2/5] passes.texi: Fix typo in Full redundancy elimination

2012-02-23 Thread Bernhard Reutner-Fischer
On 23 February 2012 22:33, Joseph S. Myers  wrote:
> On Thu, 23 Feb 2012, Bernhard Reutner-Fischer wrote:
>
>> gcc/ChangeLog
>>
>> 2012-02-23  Bernhard Reutner-Fischer  
>>
>>       * doc/passes.texi (Full redundancy elimination): Fix typo.
>
> Should be committed as obvious.

Presumably the same goes for the rest of these 5 obvious trivia, which,
btw, were bootstrapped and successfully regtested on x86_64-linux-gnu
with --enable-languages=c,fortran,lto,c++, which I forgot to mention.


[pph] Write tree headers for mutated trees (issue5699055)

2012-02-23 Thread Lawrence Crowl
This patch writes the tree header for mutated trees.  This write is
necessary to get updated attribute information from the info bits.
On read, these bits are then merged into the existing tree.

Test x2incomplete4.cc is now generating assembly.  However, it
erroneously generates an inline default constructor.  This error
likely occurs because pph_in_symtab actions are handled after reading
each pph file, rather than being merged and emitted after all pph files
are read.  Diego will address this symtab merge.

Since we are now merging types, this patch includes a fix to more
aggressively trace types.

In addition, there are a couple of fixes to protect against null
pointers.

Tested on x64.


Index: gcc/testsuite/ChangeLog.pph

2012-02-23   Lawrence Crowl  

* g++.dg/pph/x2incomplete4.cc: Make expected assembly diff.

Index: gcc/cp/ChangeLog.pph

2012-02-23   Lawrence Crowl  

* pph-core.c (pph_trace_tree): Trace trees as well as decls.
* cp-tree.h (class_of_this_parm): Protect against null type.
* error.c (dump_function_decl): Protect against null type.
* pph-out.c (pph_out_tree): Write tree header for mutated trees.
* pph-in.c (pph_merge_tree_attributes): New.
(pph_in_merge_key_tree): Merge tree attributes.
(pph_in_tree): Read tree header for mutated trees.
Merge their attributes.


Index: gcc/testsuite/g++.dg/pph/x2incomplete4.cc
===
--- gcc/testsuite/g++.dg/pph/x2incomplete4.cc   (revision 184521)
+++ gcc/testsuite/g++.dg/pph/x2incomplete4.cc   (working copy)
@@ -1,5 +1,5 @@
-// { dg-xfail-if "ICE" { "*-*-*" } { "-fpph-map=pph.map" } }
-// { dg-bogus "internal compiler error: in import_export_decl, at cp/decl2.c" 
"" { xfail *-*-* } 0 }
+// pph asm xdiff 21766
+// copies::copies() is wrongly generated
 
 #include "x1incomplete3.h"
 #include "a0incomplete4.cci"
Index: gcc/cp/pph-core.c
===
--- gcc/cp/pph-core.c   (revision 184521)
+++ gcc/cp/pph-core.c   (working copy)
@@ -381,7 +381,7 @@ pph_trace_tree (tree t, const char *name
enum pph_trace_end end, enum pph_trace_kind kind)
 {
   char end_char, kind_char, decl_char;
-  bool is_merge, is_decl;
+  bool is_merge, is_decl, is_type;
   bool emit = false;
 
   switch (kind)
@@ -418,6 +418,7 @@ pph_trace_tree (tree t, const char *name
   end_char = end == pph_trace_front ? '{' : '}';
 
   is_decl = DECL_P (t);
+  is_type = TYPE_P (t);
   if (is_decl)
 decl_char = 'D';
   else if (TYPE_P (t))
@@ -425,9 +426,9 @@ pph_trace_tree (tree t, const char *name
   else
 decl_char = '.';
 
-  if (is_merge && is_decl && flag_pph_tracer >= 2)
+  if (is_merge && flag_pph_tracer >= 2)
 emit = true;
-  else if ((is_merge || is_decl) && flag_pph_tracer >= 3)
+  else if ((is_merge || is_decl || is_type) && flag_pph_tracer >= 3)
 emit = true;
   else if (!EXPR_P (t) && flag_pph_tracer >= 4)
 emit = true;
Index: gcc/cp/error.c
===
--- gcc/cp/error.c  (revision 184521)
+++ gcc/cp/error.c  (working copy)
@@ -1446,8 +1446,13 @@ dump_function_decl (tree t, int flags)
 
   if (TREE_CODE (fntype) == METHOD_TYPE)
{
+ tree type_this = type_of_this_parm (fntype);
+ tree type_class = type_this ? TREE_TYPE (type_this) : NULL;
  pp_base (cxx_pp)->padding = pp_before;
- pp_cxx_cv_qualifier_seq (cxx_pp, class_of_this_parm (fntype));
+ if (type_class)
+   pp_cxx_cv_qualifier_seq (cxx_pp, class_of_this_parm (fntype));
+ else
+   pp_cxx_ws_string (cxx_pp, M_(""));;
}
 
   if (flags & TFF_EXCEPTION_SPECIFICATION)
Index: gcc/cp/pph-out.c
===
--- gcc/cp/pph-out.c(revision 184521)
+++ gcc/cp/pph-out.c(working copy)
@@ -2355,16 +2355,7 @@ pph_out_tree (pph_stream *stream, tree e
 
   if (marker == PPH_RECORD_START || marker == PPH_RECORD_START_MUTATED)
 {
-  /* This is the first time we see EXPR, write it out.  */
-  if (marker == PPH_RECORD_START)
-{
-  /* We only need to write EXPR's header if it needs to be
- re-allocated when reading.  If we are writing the mutated
- state of an existing tree, then we only need to write its
- body.  */
-  pph_out_tree_header (stream, expr);
-}
-
+  pph_out_tree_header (stream, expr);
   pph_out_tree_body (stream, expr);
 }
   else if (marker == PPH_RECORD_START_MERGE_BODY)
Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h(revision 184521)
+++ gcc/cp/cp-tree.h(working copy)
@@ -4836,7 +4836,8 @@ type_of_this_parm (const_tree fntype)
 static inline tree
 class_of_this_parm (const_tree fntype)
 {
-  return TREE_TYPE (type_of_this_parm (fntype));
+  tree t

v3 PATCH to include/bits/locale_facets.h to revert reordering of virtual functions

2012-02-23 Thread Jason Merrill
Benjamin's patch of 2011-08-06 (r177542) to clean up doxygen markup 
changed the order of the num_get::do_get virtual functions.  This breaks 
ABI compatibility, so I'm reverting the change.
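
To see why declaration order matters, here is a small model in plain C
rather than C++. It assumes the usual implementation in which virtual
functions are reached through a table of function pointers indexed by
declaration order. The names below are invented for the sketch and are
not libstdc++ identifiers.

```c
/* Kinds of value a slot extracts; stand-ins for the do_get overloads.  */
enum kind { KIND_BOOL, KIND_LONG, KIND_FLOAT };

typedef enum kind (*slot_fn) (void);

static enum kind get_bool (void)  { return KIND_BOOL; }
static enum kind get_long (void)  { return KIND_LONG; }
static enum kind get_float (void) { return KIND_FLOAT; }

/* Slot order that already-compiled callers were built against.  */
const slot_fn vtable_original[] = { get_bool, get_long, get_float };

/* The same functions after the bool overload is moved below long.  */
const slot_fn vtable_reordered[] = { get_long, get_bool, get_float };

/* An old caller has the slot index of the bool overload baked in.  */
enum { OLD_SLOT_BOOL = 0 };

enum kind
old_caller_get_bool (const slot_fn *vtable)
{
  return vtable[OLD_SLOT_BOOL] ();
}
```

Against vtable_original the old caller reaches the bool function.
Against vtable_reordered the same compiled index silently lands on the
long function, which is the kind of mismatch the revert avoids.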


Tested x86_64-pc-linux-gnu, applied to trunk.
commit f8566d7934d5ab7006548fa67d596cac2c346d3e
Author: Jason Merrill 
Date:   Thu Feb 23 16:59:04 2012 -0500

	* include/bits/locale_facets.h (class num_get): Undo reordering of
	do_get virtual functions.

diff --git a/libstdc++-v3/include/bits/locale_facets.h b/libstdc++-v3/include/bits/locale_facets.h
index 3b3139f..dc95f5a 100644
--- a/libstdc++-v3/include/bits/locale_facets.h
+++ b/libstdc++-v3/include/bits/locale_facets.h
@@ -2169,6 +2169,9 @@ _GLIBCXX_BEGIN_NAMESPACE_LDBL
*  @return  Iterator after reading.
   */
   virtual iter_type
+  do_get(iter_type, iter_type, ios_base&, ios_base::iostate&, bool&) const;
+
+  virtual iter_type
   do_get(iter_type __beg, iter_type __end, ios_base& __io,
 	 ios_base::iostate& __err, long& __v) const
   { return _M_extract_int(__beg, __end, __io, __err, __v); }
@@ -2201,9 +2204,6 @@ _GLIBCXX_BEGIN_NAMESPACE_LDBL
 #endif
 
   virtual iter_type
-  do_get(iter_type, iter_type, ios_base&, ios_base::iostate&, bool&) const;
-
-  virtual iter_type
   do_get(iter_type, iter_type, ios_base&, ios_base::iostate&, float&) const;
 
   virtual iter_type


Re: v3 PATCH to include/bits/locale_facets.h to revert reordering of virtual functions

2012-02-23 Thread Benjamin Kosnik

Thanks Jason.

-benjamin


PR 52060 - fixed in 4.6.3?

2012-02-23 Thread Kenny Simpson
PR rtl-optimization/52060 is marked as fixed with a target of 4.6.4, but it 
looks like it was backported to 4.6 a couple of weeks ago.

Should the target be adjusted so that it gets listed in the bugs fixed for 
4.6.3?

-Kenny


PATCH: PR target/52364: The unnecessary second form in *movabs_[12]

2012-02-23 Thread H.J. Lu
Hi,

The second form is redundant in

;; Stores and loads of ax to arbitrary constant address.
;; We fake an second form of instruction to force reload to load address
;; into register when rax is not available
(define_insn "*movabs_1"
  [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
(match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
  "TARGET_64BIT && ix86_check_movabs (insn, 0)"
  "@
   movabs{}\t{%1, %P0|%P0, %1}
   mov{}\t{%1, %a0|%a0, %1}"
  [(set_attr "type" "imov")
   (set_attr "modrm" "0,*")
   (set_attr "length_address" "8,0")
   (set_attr "length_immediate" "0,*")
   (set_attr "memory" "store")
   (set_attr "mode" "")])

since it is just normal mov.  Tested on Linux/x86-64.  OK for stage1?

Thanks.


H.J.

2012-02-23  H.J. Lu  

PR target/52352
PR target/52364
* config/i386/i386.md (*movabs_1): Remove the second form.
(*movabs_2): Likewise.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ec3993a..9242926 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2335,32 +2335,26 @@
   (const_string "QI")))])
 
 ;; Stores and loads of ax to arbitrary constant address.
-;; We fake an second form of instruction to force reload to load address
-;; into register when rax is not available
 (define_insn "*movabs_1"
-  [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
-   (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
+  [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i"))
+   (match_operand:SWI1248x 1 "register_operand" "a"))]
   "TARGET_64BIT && ix86_check_movabs (insn, 0)"
-  "@
-   movabs{}\t{%1, %P0|%P0, %1}
-   mov{}\t{%1, %a0|%a0, %1}"
+  "movabs{}\t{%1, %P0|%P0, %1}"
   [(set_attr "type" "imov")
-   (set_attr "modrm" "0,*")
-   (set_attr "length_address" "8,0")
-   (set_attr "length_immediate" "0,*")
+   (set_attr "modrm" "0")
+   (set_attr "length_address" "8")
+   (set_attr "length_immediate" "0")
(set_attr "memory" "store")
(set_attr "mode" "")])
 
 (define_insn "*movabs_2"
-  [(set (match_operand:SWI1248x 0 "register_operand" "=a,r")
-(mem:SWI1248x (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))]
+  [(set (match_operand:SWI1248x 0 "register_operand" "=a")
+(mem:SWI1248x (match_operand:DI 1 "x86_64_movabs_operand" "i")))]
   "TARGET_64BIT && ix86_check_movabs (insn, 1)"
-  "@
-   movabs{}\t{%P1, %0|%0, %P1}
-   mov{}\t{%a1, %0|%0, %a1}"
+  "movabs{}\t{%P1, %0|%0, %P1}"
   [(set_attr "type" "imov")
-   (set_attr "modrm" "0,*")
-   (set_attr "length_address" "8,0")
+   (set_attr "modrm" "0")
+   (set_attr "length_address" "8")
(set_attr "length_immediate" "0")
(set_attr "memory" "load")
(set_attr "mode" "")])


any chance of further wrong-code fixes for 4.6.3?

2012-02-23 Thread Kenny Simpson
issue w/ cmov+volatile (very small)
PR rtl-optimization/47698 (2 line code move)
PR target/45771 (small change to a conditional)

PR c++/41449
PR middle-end/52314 (one expression change)

thanks,
-Kenny


Re: [gimplefe][patch] The symbol table for declarations

2012-02-23 Thread Sandeep Soni
On Wed, Nov 30, 2011 at 9:48 AM, Sandeep Soni  wrote:
> Sorry. Wrong patch. New patch attached.
>
> I am getting the following error in the new patch though.
>
> ../../gimple-front-end/gcc/gimple/parser.c: In function ‘gp_parse_var_decl’:
> ../../gimple-front-end/gcc/gimple/parser.c:927:3: error: implicit
> declaration of function ‘ggc_alloc_cleared_gimple_symtab_entry_def’
> [-Werror=implicit-function-declaration]
> ../../gimple-front-end/gcc/gimple/parser.c:927:5: error: assignment
> makes pointer from integer without a cast [-Werror]
> cc1: all warnings being treated as errors
>
> make[3]: *** [gimple/parser.o] Error 1
> make[3]: *** Waiting for unfinished jobs
> rm gfdl.pod cpp.pod gcov.pod gfortran.pod fsf-funding.pod gcc.pod
> make[3]: Leaving directory `/home/Sandy/gimple_build/gcc'
> make[2]: *** [all-stage2-gcc] Error 2
> make[2]: Leaving directory `/home/Sandy/gimple_build'
> make[1]: *** [stage2-bubble] Error 2
> make[1]: Leaving directory `/home/Sandy/gimple_build'
> make: *** [all] Error 2
>
> Is there anything that needs to be initialized to use the
> ggc_alloc_cleared_* function?

I was finally able to get past the error. What I had missed was
marking the symbol table (the global variable) for garbage
collection. Once I did that, the error went away.

Builds correctly on x86. Up for review.
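
For readers unfamiliar with the pattern, the lookup-before-insert step
that the patch performs on the symbol table can be sketched in
standalone C. This is not the libiberty htab API and leaves garbage
collection out entirely: the fixed-size table, the string keys, and
all names below are assumptions made for the sketch, and an int stands
in for the DECL.

```c
#include <stddef.h>
#include <string.h>

#define SYMTAB_SIZE 64

/* Hypothetical entry: a name and a stand-in for the declaration.  */
struct symtab_entry
{
  const char *id;  /* NULL marks an empty slot */
  int decl;
};

struct symtab
{
  struct symtab_entry slots[SYMTAB_SIZE];
  size_t count;
};

static size_t
hash_name (const char *s)
{
  size_t h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h;
}

/* Return the entry for NAME, or NULL if NAME was never declared.  */
struct symtab_entry *
symtab_lookup (struct symtab *t, const char *name)
{
  size_t i = hash_name (name) % SYMTAB_SIZE;
  for (size_t probes = 0; probes < SYMTAB_SIZE; probes++)
    {
      struct symtab_entry *e = &t->slots[i];
      if (e->id == NULL)
        return NULL;
      if (strcmp (e->id, name) == 0)
        return e;
      i = (i + 1) % SYMTAB_SIZE;  /* linear probing */
    }
  return NULL;
}

/* Declare NAME.  Return 1 if a new entry was added, 0 if NAME was
   already declared, and -1 if the table is full.  NAME must outlive
   the table, since only the pointer is stored.  */
int
symtab_declare (struct symtab *t, const char *name, int decl)
{
  if (symtab_lookup (t, name) != NULL)
    return 0;
  if (t->count == SYMTAB_SIZE)
    return -1;
  size_t i = hash_name (name) % SYMTAB_SIZE;
  while (t->slots[i].id != NULL)
    i = (i + 1) % SYMTAB_SIZE;
  t->slots[i].id = name;
  t->slots[i].decl = decl;
  t->count++;
  return 1;
}
```

A zero-initialized struct symtab is an empty table. Declaring the same
name twice returns 0 the second time and leaves the first entry
untouched, which gives a parser a natural place to report a
redeclaration.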


-- 
Cheers
Sandy

PS: We finally have the symbol table :)
Index: gcc/gimple/parser.h
===
--- gcc/gimple/parser.h	(revision 174754)
+++ gcc/gimple/parser.h	(working copy)
@@ -27,6 +27,19 @@
 #include "vec.h"
 
 
+/* The GIMPLE symbol table entry.  */
+
+struct GTY(()) gimple_symtab_entry_def 
+{
+  /* symbol table entry key, an identifier.  */
+  tree id;
+
+  /* symbol table entry, a DECL.  */
+  tree decl;
+};
+
+typedef struct gimple_symtab_entry_def *gimple_symtab_entry_t;
+
 /* A GIMPLE token.  */
 
 typedef struct GTY(()) gimple_token {
@@ -81,7 +94,7 @@
   struct GTY((skip)) ht *ident_hash;
 } gimple_parser;
 
-
+ 
 /* In parser.c  */
 extern void gimple_main (void);
 
Index: gcc/gimple/parser.c
===
--- gcc/gimple/parser.c	(revision 174754)
+++ gcc/gimple/parser.c	(working copy)
@@ -28,6 +28,7 @@
 #include "tree.h"
 #include "gimple.h"
 #include "parser.h"
+#include "hashtab.h"
 #include "ggc.h"
 
 /* The GIMPLE parser.  Note: do not use this variable directly.  It is
@@ -44,6 +45,52 @@
 /* EOF token.  */
 static gimple_token gl_eof_token = { CPP_EOF, 0, 0, 0 };
 
+/* Gimple symbol table.  */
+
+static GTY ((if_marked ("gimple_symtab_entry_marked_p"),
+	 param_is (struct gimple_symtab_entry_def)))
+  htab_t gimple_symtab;
+
+/* Return the hash value of the declaration name of a gimple_symtab_entry_def
+   object pointed by ENTRY.  */
+
+static hashval_t
+gimple_symtab_entry_hash (const void *entry)
+{
+  const struct gimple_symtab_entry_def *base =
+(const struct gimple_symtab_entry_def *)entry;
+  return IDENTIFIER_HASH_VALUE (base->id);
+}
+
+/* Returns non-zero if ENTRY1 and ENTRY2 point to gimple_symtab_entry_def
+   objects corresponding to the same declaration.  */
+
+static int
+gimple_symtab_eq_hash (const void *entry1, const void *entry2)
+{
+  const struct gimple_symtab_entry_def *base1 =
+(const struct gimple_symtab_entry_def *)entry1;
+  const struct gimple_symtab_entry_def *base2 =
+(const struct gimple_symtab_entry_def *)entry2;
+
+  return (base1->id == base2->id);
+}
+
+/* Returns non-zero if P points to an gimple_symtab_entry_def struct that needs
+   to be marked for GC.  */
+
+static int
+gimple_symtab_entry_marked_p (const void *p)
+{
+  const struct gimple_symtab_entry_def *base =
+ (const struct gimple_symtab_entry_def *) p;
+
+  /* Keep this only if the common IDENTIFIER_NODE of the symtab chain
+ is marked which it will be if at least one of the DECLs in the
+ chain is marked.  */
+  return ggc_marked_p (base->id);
+}
+
 /* Return the string representation of token TOKEN.  */
 
 static const char *
@@ -807,6 +854,7 @@
 }
 }
 
+
 /* The Declaration section within a .gimple file can consist of 
a) Declaration of variables.
b) Declaration of functions.
@@ -870,18 +918,35 @@
 static void
 gp_parse_var_decl (gimple_parser *parser)
 {
-  const gimple_token *next_token;
+  const gimple_token *next_token, *name_token;
+  const char *name;
   enum tree_code code ;
+  gimple_symtab_entry_t e;
+  void **slot;
+  void **new_entry;
 
   gl_consume_expected_token (parser->lexer, CPP_LESS);
-  gl_consume_expected_token (parser->lexer, CPP_NAME);
+  name_token = gl_consume_expected_token (parser->lexer, CPP_NAME);
+  name = gl_token_as_text (name_token);
+
+  e = ggc_alloc_cleared_gimple_symtab_entry_def ();
+  e->id = get_identifier(name);
+  slot = htab_find_slot (gimple_symtab, e, NO_INSERT); 
+  if (!slot)
+{
+  e->decl = build_decl (name_token->location, VAR_DECL, get_identifier(name), void_typ