Re: [Patch] Move Objective-C runtime flags to modern options system.

2011-11-11 Thread Iain Sandoe


On 11 Nov 2011, at 00:30, Mike Stump wrote:


On Nov 10, 2011, at 9:40 AM, Iain Sandoe wrote:
Thanks for catching that --- brainstorm on my part ... the code  
under discussion should have been #ifndef OBJCPLUS


There is no prohibition against C having exceptions, so, doesn't  
matter if you turn C++ off, you can still throw through C code, so  
turning on exceptions is reasonable.


Moreover, there is no personality routine in m32 NeXT libobjc, so  
if one tries to engage the zero-cost exceptions, one gets a link  
error (and generates a load of unused eh data).  I can work around  
that if there is still reason to have "-fexceptions" on.


No, this must be wrong:

$ cat t.c
void bar() {
}

void foo() {
 bar();
}


int main() {
 return 0;
}
$ gcc -fexceptions t.c
$ gcc -m32 -fexceptions t.c
$

Like I said, it does work, one can count on it working and it is  
useful, you can't break it.  And next week, they'll add catching and  
throwing to C, and when they do, it still has to just work.  :-)



FWIW your example doesn't reproduce the problem because it contains no  
objective c exceptions code.


However, OK - I see your point (I also see where the problem came from).

In the code before the split there is this path (note the gcc-4.2.1 system
version):


$ gcc-4.2 ../gcc-live-trunk/gcc/testsuite/objc.dg/exceptions-2.m -lobjc -fobjc-exceptions -fno-objc-sjlj-exceptions -o t

Undefined symbols:
  "___gnu_objc_personality_v0", referenced from:
  ___gnu_objc_personality_v0$non_lazy_ptr in ccOk5CMv.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

I have incorrectly made that path apply to ABI=0,1 NeXT regardless of  
the setting of fobjc-sjlj-exceptions.


This doesn't affect GNU runtime or NeXT m64.

Patch under test to fix this (will post later).

Iain



Re: PR c++/30195

2011-11-11 Thread Fabien Chêne
2011/11/10 Dodji Seketeli :
> Fabien Chêne  a écrit:
>
>> Index: gcc/dbxout.c
>> ===
>> --- gcc/dbxout.c      (revision 178088)
>> +++ gcc/dbxout.c      (working copy)
>> @@ -1518,6 +1518,8 @@ dbxout_type_fields (tree type)
>>        if (TREE_CODE (tem) == TYPE_DECL
>>         /* Omit here the nameless fields that are used to skip bits.  */
>>         || DECL_IGNORED_P (tem)
>> +       /* Omit USING_DECL */
>> +       || TREE_CODE (tem) >= LAST_AND_UNUSED_TREE_CODE
>>         /* Omit fields whose position or size are variable or too large to
>>            represent.  */
>>         || (TREE_CODE (tem) == FIELD_DECL
>
> As this dbxout backend code already ignores DECLs marked DECL_IGNORED_P,
> maybe it would be best to have the front-end mark the USING_DECL as
> DECL_IGNORED_P; possibly in finish_member_declaration?

Are the other debugging backends not interested at all in USING_DECLs ?

-- 
Fabien


[PATCH] Revert sparc vec_init improvements as they cause 64-bit regressions.

2011-11-11 Thread David Miller

Eric, I tried my best to get the new code working properly on 64-bit
and I just couldn't figure out a reasonable way to do so.

Therefore I simply reverted the changes.  I'll come back to this at
some point in the future.

One thing that really irks me is how pseudos can only be subreg'd
on UNITS_PER_WORD boundaries.  That's the real reason this stuff
doesn't work and it's nearly impossible to subreg 32-bit values
that end up in float regs on sparc when compiling 64-bit.

We have REGMODE_NATURAL_SIZE which basically describes the
subreg'ability of the hard registers that a pseudo in a given mode
will end up using.

I started playing around with using REGMODE_NATURAL_SIZE in place of
UNITS_PER_WORD in the subreg code but it got way out of the scope of
fixing this regression.

Anyway, committed to trunk and all the 64-bit failures should be gone.

gcc/

Revert
2011-11-05  David S. Miller  
---
 gcc/ChangeLog |5 +
 gcc/config/sparc/sparc.c  |  440 ++--
 gcc/config/sparc/sparc.md |   54 --
 3 files changed, 105 insertions(+), 394 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5f19470..cf4e66b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-11-11  David S. Miller  
+
+   Revert
+   2011-11-05  David S. Miller  
+
 2011-11-11  Jakub Jelinek  
 
* opts-common.c (generate_canonical_option): Free opt_text
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 1f2a27a..55759a0 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -11285,357 +11285,88 @@ output_v8plus_mult (rtx insn, rtx *operands, const char *opcode)
 }
 }
 
-/* Subroutine of sparc_expand_vector_init.  Emit code to initialize TARGET to
-   the N_ELTS values for individual fields contained in LOCS by means of VIS2
-   BSHUFFLE insn.  MODE and INNER_MODE are the modes describing TARGET.  */
+/* Subroutine of sparc_expand_vector_init.  Emit code to initialize
+   all fields of TARGET to ELT by means of VIS2 BSHUFFLE insn.  MODE
+   and INNER_MODE are the modes describing TARGET.  */
 
 static void
-vector_init_bshuffle (rtx target, rtx *locs, int n_elts,
- enum machine_mode mode,
+vector_init_bshuffle (rtx target, rtx elt, enum machine_mode mode,
  enum machine_mode inner_mode)
 {
-  rtx mid_target, r0_high, r0_low, r1_high, r1_low;
-  enum machine_mode partial_mode;
-  int bmask, i, idxs[8];
+  rtx t1, final_insn;
+  int bmask;
 
-  partial_mode = (mode == V4HImode
- ? V2HImode
- : (mode == V8QImode
-? V4QImode : mode));
+  t1 = gen_reg_rtx (mode);
 
-  r0_high = r0_low = NULL_RTX;
-  r1_high = r1_low = NULL_RTX;
+  elt = convert_modes (SImode, inner_mode, elt, true);
+  emit_move_insn (gen_lowpart(SImode, t1), elt);
 
-  /* Move the pieces into place, as needed, and calculate the nibble
- indexes for the bmask calculation.  After we execute this loop the
- locs[] array is no longer needed.  Therefore, to simplify things,
- we set entries that have been processed already to NULL_RTX.  */
-
-  for (i = 0; i < n_elts; i++)
-{
-  int j;
-
-  if (locs[i] == NULL_RTX)
-   continue;
-
-  if (!r0_low)
-   {
- r0_low = locs[i];
- idxs[i] = 0x7;
-   }
-  else if (!r1_low)
-   {
- r1_low = locs[i];
- idxs[i] = 0xf;
-   }
-  else if (!r0_high)
-   {
- r0_high = gen_highpart (partial_mode, r0_low);
- emit_move_insn (r0_high, gen_lowpart (partial_mode, locs[i]));
- idxs[i] = 0x3;
-   }
-  else if (!r1_high)
-   {
- r1_high = gen_highpart (partial_mode, r1_low);
- emit_move_insn (r1_high, gen_lowpart (partial_mode, locs[i]));
- idxs[i] = 0xb;
-   }
-  else
-   gcc_unreachable ();
-
-  for (j = i + 1; j < n_elts; j++)
-   {
- if (locs[j] == locs[i])
-   {
- locs[j] = NULL_RTX;
- idxs[j] = idxs[i];
-   }
-   }
-  locs[i] = NULL_RTX;
-}
-
-  bmask = 0;
-  for (i = 0; i < n_elts; i++)
-{
-  int v = idxs[i];
-
-  switch (GET_MODE_SIZE (inner_mode))
-   {
-   case 2:
- bmask <<= 8;
- bmask |= (((v - 1) << 4) | v);
- break;
-
-   case 1:
- bmask <<= 4;
- bmask |= v;
- break;
-
-   default:
- gcc_unreachable ();
-   }
-}
-
-  emit_insn (gen_bmasksi_vis (gen_reg_rtx (SImode), CONST0_RTX (SImode),
- force_reg (SImode, GEN_INT (bmask))));
-
-  mid_target = target;
-  if (GET_MODE_SIZE (mode) == 4)
-{
-  mid_target = gen_reg_rtx (mode == V2HImode
-   ? V4HImode : V8QImode);
-}
-
-  if (!r1_low)
-r1_low = r0_low;
-
-  switch (GET_MODE (mid_target))
+  switch (mode)
 {
+case V2SImode:
+  final_insn = gen_bshufflev2si_vis 

Re: Selective Scheduling Reviews

2011-11-11 Thread Andrey Belevantsev

On 10.11.2011 21:31, Jeff Law wrote:


[ This should have gone out some time ago...  Sorry for the long delay ]

I'm pleased to announce that the GCC steering committee has approved
the nomination of Andrey Belevantsev, Alexander Monakov, and Dmitry
Melnik as selective scheduling reviewers.

Thanks a lot, I've committed the patch below.

Btw, don't we want to keep the reviewers section sorted by component?  I
can then move the LTO folks' entries to the appropriate place.


Andrey

2011-11-11  Andrey Belevantsev  

* MAINTAINERS (Selective Scheduling): Add myself as a reviewer.

Index: MAINTAINERS
===
*** MAINTAINERS (revision 181283)
--- MAINTAINERS (working copy)
*** Plugin  Le-Chun Wu  l...@google.com
*** 286,291 
--- 286,292 
  register allocation   Peter Bergner   berg...@vnet.ibm.com
  register allocation   Kenneth Zadeck  zad...@naturalbridge.com
  register allocation   Seongbae Park   seongbae.p...@gmail.com
+ Selective Scheduling  Andrey Belevantsev  a...@ispras.ru
  LTO   Diego Novillo   dnovi...@google.com
  LTO   Richard Guenther   rguent...@suse.de
  LTO plugin   Cary Coutant   ccout...@google.com



Re: [PATCH] PR target/50038 fix: redundant zero extensions removal

2011-11-11 Thread Eric Botcazou
> Great! I'll be back with patch covering all non functional changes.
> Will it be OK to have everything in one patch (including current
> functional changes) or I should split it?

Let's also rename the file while we are at it.  I'd suggest Redundant Extension 
Elimination for the name of the pass, so ree.c for the filename (suggestions 
for a more descriptive name welcome).  So we rename implicit-zee.c into ree.c 
and add a new header along these lines:

/* Redundant Extension Elimination pass for the GNU compiler.
   Copyright (C) 2010-2011 Free Software Foundation, Inc.
   Contributed by 

   Based on the Redundant Zero-extension elimination pass contributed by
   Sriraman Tallam (tmsri...@google.com) and Silvius Rus (r...@google.com).

   This file is part of GCC.
[...]

The general comment must be adjusted: "Problem Description" extended, "How does 
this pass work" adjusted and so on.  Hardcoded references to zero-extensions 
and specific pair of modes, both in the comment and the function names, must 
be eliminated.

Since implicit-zee.c has essentially no revision history (the original commit + 
a patch of mine to fix rough edges), let's pretend we start from scratch, so 
the ChangeLog will be


* implicit-zee.c: Delete.
* ree.c: New file.
[...]

and you post the complete file.  I'll do the review.

Last but not least, since this is a significant contribution, you need to have 
a copyright assignment on file with the FSF in order for us to accept it.

-- 
Eric Botcazou


Re: PR c++/30195

2011-11-11 Thread Dodji Seketeli
Fabien Chêne  a écrit:

> Are the other debugging backends not interested at all in USING_DECLs ?

The way debug info is generated for USING_DECLs is that
handle_using_decl (via cp_emit_debug_info_for_using) asks the backend to
generate debug info for the DECLs the USING_DECL resolves to, basically.
AIUI, the backend is not supposed to handle the USING_DECL itself.

-- 
Dodji


Re: [PATCH] PR target/50038 fix: redundant zero extensions removal

2011-11-11 Thread Ilya Enkovich
Hello Eric,

2011/11/11 Eric Botcazou :
>> Great! I'll be back with patch covering all non functional changes.
>> Will it be OK to have everything in one patch (including current
>> functional changes) or I should split it?
>
> Let's also rename the file while we are at it.  I'd suggest Redundant Extension
> Elimination for the name of the pass, so ree.c for the filename (suggestions
> for a more descriptive name welcome).  So we rename implicit-zee.c into ree.c
> and add a new header along these lines:
>
> /* Redundant Extension Elimination pass for the GNU compiler.
>   Copyright (C) 2010-2011 Free Software Foundation, Inc.
>   Contributed by 
>
>   Based on the Redundant Zero-extension elimination pass contributed by
>   Sriraman Tallam (tmsri...@google.com) and Silvius Rus (r...@google.com).
>
>   This file is part of GCC.
> [...]
>
> The general comment must be adjusted: "Problem Description" extended, "How does
> this pass work" adjusted and so on.  Hardcoded references to zero-extensions
> and specific pair of modes, both in the comment and the function names, must
> be eliminated.
>
> Since implicit-zee.c has essentially no revision history (the original commit +
> a patch of mine to fix rough edges), let's pretend we start from scratch, so
> the ChangeLog will be
>
>
>        * implicit-zee.c: Delete.
>        * ree.c: New file.
> [...]
>
> and you post the complete file.  I'll do the review.
>
> Last but not least, since this is a significant contribution, you need to have
> a copyright assignment on file with the FSF in order for us to accept it.
>
> --
> Eric Botcazou
>

Thanks for the tips! I'll make these changes after my two-week vacation, or
my colleagues will cover for me.

I have already signed a copyright agreement with the FSF. Will I need
a separate one for this particular commit?

Thanks
Ilya


Re: [PATCH] Revert sparc vec_init improvements as they cause 64-bit regressions.

2011-11-11 Thread Eric Botcazou
> Eric, I tried my best to get the new code working properly on 64-bit
> and I just couldn't figure out a reasonable way to do so.

Same here, this looks really tricky.  On the one hand we could give in a little 
and tolerate inferior code quality in 64-bit mode, but on the other hand this 
is a bit hard to swallow as the instructions we can play with are the same.

> One thing that really irks me is how pseudos can only be subreg'd
> on UNITS_PER_WORD boundaries.  That's the real reason this stuff
> doesn't work and it's nearly impossible to subreg 32-bit values
> that end up in float regs on sparc when compiling 64-bit.

Yes, this was done on purpose to solve very nasty RA/reload problems, but the
irregularity of the SPARC register file in 64-bit mode clearly conflicts with 
it.  And not all issues were solved, so we used CANNOT_CHANGE_MODE_CLASS to 
mask some of the remaining ones on SPARC (and on PA).  

> Anyway, committed to trunk and all the 64-bit failures should be gone.

Do we have the same problem in VIS2/3 mode as in VIS1 mode?  If so, then I 
agree that this is probably the best course of action in the short term.

-- 
Eric Botcazou


Re: [PATCH] PR target/50038 fix: redundant zero extensions removal

2011-11-11 Thread Eric Botcazou
> I have already signed a copyright agreement with the FSF. Will I need
> a separate one for this particular commit?

No, if your contributions are already covered by a copyright agreement with the 
FSF, nothing more needs to be done.

-- 
Eric Botcazou


[Patch ObjC/NeXT] use correct personality routine for Objective-C/NeXT/ABI0/1

2011-11-11 Thread Iain Sandoe
This corrects a mistake I made when splitting the runtime code up -  
which causes the GNU eh personality routine to be specified for NeXT  
ABI 0&1.


This causes a linkage error if "-fexceptions" is specified for NeXT at m32
(although there's no functional effect, since there is no ZCE
implementation of ObjC exceptions at m32).

Tested on powerpc,i686-darwin9 and x86-64-darwin10 with
-m32/-fobjc-abi-version=0, -m32/-fobjc-abi-version=1 and -m64.


OK for trunk/4.6?
Iain

gcc/objc:

* objc-next-runtime-abi-01.c (objc_eh_personality): Use gcc personality
for Objective-C m32.

Index: gcc/objc/objc-next-runtime-abi-01.c
===
--- gcc/objc/objc-next-runtime-abi-01.c (revision 181250)
+++ gcc/objc/objc-next-runtime-abi-01.c (working copy)
@@ -2872,12 +2872,15 @@ make_err_class:
   return eh_id;
 }

+/* For NeXT ABI 0 and 1, the personality routines are just those of the
+   underlying language.  */
+
 static tree
 objc_eh_personality (void)
 {
   if (!objc_eh_personality_decl)
 #ifndef OBJCPLUS
-objc_eh_personality_decl = build_personality_function ("objc");
+objc_eh_personality_decl = build_personality_function ("gcc");
 #else
 objc_eh_personality_decl = build_personality_function ("gxx");
 #endif




Re: [PATCH] [Annotalysis] Fix ICE caused by ipa-sra optimization.

2011-11-11 Thread Martin Jambor
Hi,

On Fri, Nov 04, 2011 at 08:01:41AM -0700, Delesley Hutchins wrote:
> Thanks for the suggestion.  Unfortunately, knowing the original
> declaration doesn't help me; I also need to know the original
> arguments that were passed at the call site, before those arguments
> were removed by ipa-sra.

I see, that is tough.  Once you re-base the analysis on 4.7, you might
be able to use the new debugging stuff for this purpose:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00649.html

Apart from that, I think that the information about the original
actual arguments is indeed lost in the transformation.

Martin


> 
> > (Of course, ipa-sra removes scalar parameters only when they are not
> > used in the first place and so there should be nothing to analyze.)
> 
> The problem is that the static analysis may be using the parameters,
> even if those parameters are not used in the body of the function.
> For example:
> 
> void dummyLock   (Mutex* mu) EXCLUSIVE_LOCK_FUNCTION(mu) { }
> void dummyUnlock(Mutex* mu) UNLOCK_FUNCTION(mu) { }
> 
> Mutex* mutex;
> int a GUARDED_BY(mutex);
> 
> void foo() {
>   // add mutex to set of held locks
>   dummyLock(mutex); // gets rewritten by ipa-sra to dummyLock().  Oops!
>   // okay to modify a, because we've "locked" mutex
>   a = 0;
>   // remove mutex from set of held locks
>   dummyUnlock(mutex);   // gets rewritten by ipa-sra to dummyUnlock().  Oops!
> }
> 
> The annotations here tell the static analyzer to treat dummyLock and
> dummyUnlock as valid lock functions, even though they don't
> technically do anything.  Such a pattern is not quite as deranged as
> it may at first appear -- it is used, for example, when creating a
> template class that may choose to either acquire a lock, or not,
> depending on its template parameter.  Ipa-sra kills the arguments, so
> I no longer know which mutex was locked.
> 
>   -DeLesley
> 
> 
> > Martin
> >
> >
> >>
> >> Bootstrapped and passed gcc regression testsuite on
> >> x86_64-unknown-linux-gnu.  Okay for google/gcc-4_6?
> >>
> >>  -DeLesley
> >>
> >> Changelog.google-4_6:
> >> 2011-11-02  DeLesley Hutchins  
> >>    * tree-threadsafe-analyze.c:
> >>      Ignores invalid attributes, issues a warning, recovers gracefully.
> >>    * common.opt:
> >>      Adds new thread safety warning.
> >>
> >> testsuite/Changelog.google-4_6:
> >> 2011-11-02  DeLesley Hutchins 
> >>    * g++.dg/thread-ann/thread_annot_lock-82.C:
> >>      Expanded regression test
> >>
> >> --
> >> DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315
> >
> >> Index: testsuite/g++.dg/thread-ann/thread_annot_lock-82.C
> >> ===
> >> --- testsuite/g++.dg/thread-ann/thread_annot_lock-82.C        (revision 180716)
> >> +++ testsuite/g++.dg/thread-ann/thread_annot_lock-82.C        (working copy)
> >> @@ -1,7 +1,7 @@
> >> -// Test template methods in the presence of cloned constructors.
> >> -// Regression test for bugfix.
> >> +// Regression tests: fix ICE issues when IPA-SRA deletes formal
> >> +// function parameters.
> >>  // { dg-do compile }
> >> -// { dg-options "-Wthread-safety -O3" }
> >> +// { dg-options "-Wthread-safety -Wthread-warn-optimization -O3" }
> >>
> >>  #include "thread_annot_common.h"
> >>
> >> @@ -10,6 +10,7 @@ void do_something(void* a);
> >>
> >>  class Foo {
> >>    Mutex mu_;
> >> +  int a GUARDED_BY(mu_);
> >>
> >>    // with optimization turned on, ipa-sra should eliminate the hidden
> >>    // "this" argument, thus invalidating EXCLUSIVE_LOCKS_REQUIRED.
> >> @@ -18,6 +19,7 @@ class Foo {
> >>    }
> >>
> >>    void foo(Foo* f);
> >> +  void bar();
> >>  };
> >>
> >>  void Foo::foo(Foo* f) {
> >> @@ -28,3 +30,17 @@ void Foo::foo(Foo* f) {
> >>    mu_.Unlock();
> >>  }
> >>
> >> +
> >> +class SCOPED_LOCKABLE DummyMutexLock {
> >> +public:
> >> +  // IPA-SRA should kill the parameters to these functions
> >> +  explicit DummyMutexLock(Mutex* mutex) EXCLUSIVE_LOCK_FUNCTION(mutex) {}
> >> +  ~DummyMutexLock() UNLOCK_FUNCTION() {}
> >> +};
> >> +
> >> +
> >> +void Foo::bar() {
> >> +  // Matches two warnings:
> >> +  DummyMutexLock dlock(&mu_);  // { dg-warning "attribute has been removed by optimization." }
> >> +  a = 1;  // warning here should be suppressed, due to errors handling dlock
> >> +}
> >> Index: common.opt
> >> ===
> >> --- common.opt        (revision 180716)
> >> +++ common.opt        (working copy)
> >> @@ -680,6 +680,10 @@ Wthread-attr-bind-param
> >>  Common Var(warn_thread_attr_bind_param) Init(1) Warning
> >>  Make the thread safety analysis try to bind the function parameters used in the attributes
> >>
> >> +Wthread-warn-optimization
> >> +Common Var(warn_thread_optimization) Init(0) Warning
> >> +Warn when optimizations invalidate the thread safety analysis.
> >> +
> >>  Wtype-limits
> >>  Common Var(warn_type_limits) Init(-1) Warning
> >>  Warn if 

Re: Mark objects death at end of scope

2011-11-11 Thread Michael Matz
Hi,

On Thu, 10 Nov 2011, Steve Ellcey wrote:

> This patch (r181172) has broken my bootstrap of IA64 Linux and I am
> trying to figure out what to do about it.
> 
> The failure happens while building libunwind (I did not configure with
> --with-system-libunwind):
> 
> /ctires/gcc/nightly/build-ia64-redhat-linux-gnu-trunk/obj_gcc/./gcc/xgcc 
> -B/ctires/gcc/nightly/build-ia64-redhat-linux-gnu-trunk/obj_gcc/./gcc/ 
> -B/ctires/gcc/nightly/gcc-ia64-redhat-linux-gnu-trunk/ia64-redhat-linux-gnu/bin/
>  
> -B/ctires/gcc/nightly/gcc-ia64-redhat-linux-gnu-trunk/ia64-redhat-linux-gnu/lib/
>  
> -isystem 
> /ctires/gcc/nightly/gcc-ia64-redhat-linux-gnu-trunk/ia64-redhat-linux-gnu/include
>  
> -isystem 
> /ctires/gcc/nightly/gcc-ia64-redhat-linux-gnu-trunk/ia64-redhat-linux-gnu/sys-include
>  
> -O2 -g -O2 -DIN_GCC -DUSE_LIBUNWIND_EXCEPTIONS -W -Wall -Wno-narrowing 
> -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes 
> -Wold-style-definition -isystem ./include -fPIC -DUSE_GAS_SYMVER -g 
> -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -shared 
> -nodefaultlibs -Wl,-h,libunwind.so.7 -Wl,-z,text -Wl,-z,defs -o 
> ./libunwind.so.7.tmp -g -O2 -B./ fde-glibc_s.o unwind-ia64_s.o -lc && rm 
> -f ./libunwind.so && if [ -f ./libunwind.so.7 ]; then mv -f 
> ./libunwind.so.7 ./libunwind.so.7.backup; else true; fi && mv 
> ./libunwind.so.7.tmp ./libunwind.so.7 && ln -s libunwind.so.7 ./libunwind.so
> 
> fde-glibc_s.o:(.IA_64.unwind_info+0x28): undefined reference to 
> `__gcc_personality_v0'

Hmm, this is defined in libgcc_s and in libgcc_eh, but libunwind is linked 
with -nodefaultlibs.  I think it makes sense to require the unwinder to 
not throw or catch exceptions itself, hence -fno-exceptions should be the 
correct flag to compile it ...

> Looking at fde-glibc_s.o and unwind-ia64_s.o before your patch I see 
> that there are no references to __gcc_personality_v0.  Looking at the 
> email and PR 50857 made me think that maybe we should compile these 
> files with -fno-exceptions but the Makefile is currently explicitly 
> compiling them with -fexceptions

... so this seems incorrect.  I'd try adding -fno-exceptions for the 
LIBUNWIND objects, it should work.


Ciao,
Michael.


Re: [libitm] Work around missing AVX support

2011-11-11 Thread Iain Sandoe

An update .. in case anyone is following...

On 11 Nov 2011, at 00:21, Richard Henderson wrote:


On 11/10/2011 03:29 PM, Iain Sandoe wrote:

The m64 build fails because of the -Wl,-undefined -Wl,dynamic_lookup


FAOD, Is there some reason that this library needs to resolve symbols
from some external source at load time?


Not that I know of.  I think that's generic libtool giving you that.


hmmm. Some things are not stacking up.

If I:
a) patch around PR50596.
b) patch sjlj.S to include the leading "_" on _ITM_beginTransaction
and GTM_longjmp.
c) patch varasm to use "__DATA,__tm_clone_table" for the  
tm_clone_table section name.


most of the test-suite runs on x86-64-darwin10 (with fails on clone,  
memcpy, memset).
mem{cpy,set} are caused by a different naming for MAP_ANON (vs.  
MAP_ANONYMOUS).
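
A naming difference of this kind is usually papered over with a small preprocessor fallback. The following is only a sketch of that idiom, not what the libitm testsuite does: `map_anon_pages` is a hypothetical helper, and the `_DEFAULT_SOURCE` define is an assumption that only matters for strict-ISO builds against glibc.

```c
/* Assumption: ask glibc to expose MAP_ANONYMOUS even in strict ISO mode.  */
#define _DEFAULT_SOURCE 1

#include <stddef.h>
#include <sys/mman.h>

/* Some systems only provide the MAP_ANON spelling; map the other
   spelling onto it when it is missing.  */
#if !defined (MAP_ANONYMOUS) && defined (MAP_ANON)
# define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: map LEN bytes of zero-filled, private memory.
   Returns NULL on failure instead of MAP_FAILED.  */
static void *
map_anon_pages (size_t len)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}
```

With a fallback like this, a test can use the MAP_ANONYMOUS spelling throughout and still build where only MAP_ANON exists.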


however, most of the suite fails on darwin9 - with an undefined  
reference to delete(void*).


This is all puzzling me - because the Makefile.am contains

# Force link with C, not C++.  For now, while we're using C++ we don't
# want or need libstdc++.
libitm_la_LINK = $(LINK)
libitm_la_LDFLAGS = $(libitm_version_info) $(libitm_version_script) \
-no-undefined

===

however, if I hack the libtool to remove the -Wl,-undefined
-Wl,dynamic_lookup ...

... I get :

/GCC/gcc-4-7-trunk-build/./gcc/xgcc -B/GCC/gcc-4-7-trunk-build/./gcc/
-B/GCC/gcc-4-7-install/i686-apple-darwin9/bin/
-B/GCC/gcc-4-7-install/i686-apple-darwin9/lib/
-isystem /GCC/gcc-4-7-install/i686-apple-darwin9/include
-isystem /GCC/gcc-4-7-install/i686-apple-darwin9/sys-include
-dynamiclib -o .libs/libitm.0.dylib .libs/aatree.o .libs/alloc.o
.libs/alloc_c.o .libs/alloc_cpp.o .libs/barrier.o .libs/beginend.o
.libs/clone.o .libs/eh_cpp.o .libs/local.o .libs/query.o .libs/retry.o
.libs/rwlock.o .libs/useraction.o .libs/util.o .libs/sjlj.o .libs/tls.o
.libs/method-serial.o .libs/method-gl.o .libs/x86_sse.o .libs/x86_avx.o
-march=i486 -mtune=i686 -pthread
-install_name /GCC/gcc-4-7-install/lib/gcc/i686-apple-darwin9/4.7.0/libitm.0.dylib
-compatibility_version 1 -current_version 1.0 -Wl,-single_module

Undefined symbols:
  "operator delete(void*, std::nothrow_t const&)", referenced from:
  _del_opnt in alloc_cpp.o
  "operator delete(void*)", referenced from:
  __ZdlPv$non_lazy_ptr in alloc_cpp.o
  "___cxa_tm_cleanup", referenced from:
  GTM::gtm_thread::revert_cpp_exceptions(GTM::gtm_transaction_cp*) in eh_cpp.o
  GTM::gtm_thread::revert_cpp_exceptions(GTM::gtm_transaction_cp*) in eh_cpp.o
  "operator new[](unsigned long)", referenced from:
  transaction clone for operator new[](unsigned long) in alloc_cpp.o

  "operator delete[](void*)", referenced from:
  __ZdaPv$non_lazy_ptr in alloc_cpp.o
  "___cxa_begin_catch", referenced from:
  __ITM_cxa_begin_catch in eh_cpp.o
  "operator delete[](void*, std::nothrow_t const&)", referenced from:
  _del_opvnt in alloc_cpp.o
  "operator new[](unsigned long, std::nothrow_t const&)", referenced from:
  transaction clone for operator new[](unsigned long, std::nothrow_t const&) in alloc_cpp.o
  "operator new(unsigned long, std::nothrow_t const&)", referenced from:
  transaction clone for operator new(unsigned long, std::nothrow_t const&) in alloc_cpp.o

  "operator new(unsigned long)", referenced from:
  transaction clone for operator new(unsigned long) in alloc_cpp.o
  "___cxa_allocate_exception", referenced from:
  __ITM_cxa_allocate_exception in eh_cpp.o
  "___cxa_throw", referenced from:
  __ITM_cxa_throw in eh_cpp.o
  "___cxa_end_catch", referenced from:
  __ITM_cxa_end_catch in eh_cpp.o
ld: symbol(s) not found

i.e. a bunch of undefined c++ symbols...

(have to do some other stuff for a while.. will try to get back to this
later).


Iain



[Patch, Fortran] PR 51073: fix for zero-sized coarray arrays

2011-11-11 Thread Tobias Burnus

Dear all,

attached is a patch for an issue found by Joel when testing gfortran on
RTEMS.


Coarrays with -fcoarray=lib: For zero-sized static ("save") coarray
arrays, we should allocate a single byte rather than 0 bytes, as "ptr =
malloc (0)" might return either NULL or a unique pointer. If it returns
NULL, we regard it as an error condition and abort.


The patch does the same as we do for nonstatic variables: It allocates a
single byte in this case; as it is done in the front-end and as it is a
compile-time constant, there is no performance problem ;-)
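
The reasoning can be illustrated in plain C. This is only a sketch and not the libgfortran or front-end code: `caf_alloc_size` and `caf_alloc` are hypothetical names, the clamp is applied to the total byte count at run time (the patch builds a MAX_EXPR at compile time instead), and multiplication overflow is deliberately ignored.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: number of bytes to request for NELEMS elements
   of ELEM_SIZE bytes each, never less than one byte.  Overflow of the
   multiplication is not checked in this sketch.  */
static size_t
caf_alloc_size (size_t nelems, size_t elem_size)
{
  size_t bytes = nelems * elem_size;
  return bytes > 0 ? bytes : 1;
}

/* Hypothetical allocator: because the request is never zero, a NULL
   result can only mean a real allocation failure, so the caller may
   treat NULL as an error without misreporting empty arrays.  */
static void *
caf_alloc (size_t nelems, size_t elem_size)
{
  return malloc (caf_alloc_size (nelems, elem_size));
}
```

The point of the clamp is that the caller's "NULL means error" check stays valid for empty arrays and zero-component types alike.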


Example (compile with -fcoarray=lib and search for _gfortran_caf_register):
  integer, save :: caf(1:0)[*]
  print *, size(caf)
  end

The existing test case is gfortran.dg/coarray/lock_1.f90 which contains 
a zero-component derived type.



[There is another issue related to backtracing support: libgfortran 
assumes that getenv("PATH") is set. While that's true for most systems, 
it is not true for all - and if I understood RTEMS correctly, it also 
only allows a single process (but multiple threads) such that setting 
the PATH is pointless and wastes memory. However, Janne wants to take 
care of that patch.]


Built and regtested on x86-64-Linux.
OK for the trunk?

Tobias

PS: The last regression count I saw for RTEMS is as follows 
(mips-unknown-rtems4.11). The numbers look really good:


=== gfortran Summary ===

# of expected passes            38236
# of unexpected failures        277
# of expected failures          50
# of unsupported tests          374

If I look at the log, we have ~22 failures due to chmod (PR36755) and ~8
failures due to pattern matching of the output. I do not understand
why; the pattern looks OK to me. cray_pointers_2.f90 crashes
badly, gfortran.dg/default_format_1.f90 (9 failures) also crashes, and a
handful of other programs also segfault.



By the way, RTEMS (Real-Time Executive for Multiprocessor Systems) is a
real-time operating system for embedded systems, which has been used,
e.g., for space missions [e.g. for the Electra software radio of Mars
Reconnaissance Orbiter]. (Space projects typically mean very old*
hardware which is hardened and thus extremely expensive. [* long time
between project start and launch.]) It's also used in the embedded system
of a car manufacturer etc. Not surprisingly for embedded/real-time
systems, the available memory is small and one has additional
constraints; in the case of RTEMS, that there is only a single process (which
might have multiple threads). Wikipedia lists around 16 supported
architectures [depending on how one counts].

2011-11-11  Tobias Burnus  

	PR fortran/51073
	* trans-decl.c (generate_coarray_sym_init): Handle zero-sized arrays.

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index b90b0ab..eb74e16 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -4234,12 +4238,16 @@ generate_coarray_sym_init (gfc_symbol *sym)
 
   size = TYPE_SIZE_UNIT (gfc_get_element_type (TREE_TYPE (decl)));
 
+  /* Ensure that we do not have size=0 for zero-sized arrays.  */ 
+  size = fold_build2_loc (input_location, MAX_EXPR, size_type_node,
+			  fold_convert (size_type_node, size),
+			  build_int_cst (size_type_node, 1));
+
   if (GFC_TYPE_ARRAY_RANK (TREE_TYPE (decl)))
 {
   tmp = GFC_TYPE_ARRAY_SIZE (TREE_TYPE (decl));
   size = fold_build2_loc (input_location, MULT_EXPR, size_type_node,
-			  fold_convert (size_type_node, tmp),
-			  fold_convert (size_type_node, size));
+			  fold_convert (size_type_node, tmp), size);
 }
 
   gcc_assert (GFC_TYPE_ARRAY_CAF_TOKEN (TREE_TYPE (decl)) != NULL_TREE);


Re: [libitm] Work around missing AVX support

2011-11-11 Thread Rainer Orth
Iain Sandoe  writes:

> however, most of the suite fails on darwin9 - with an undefined reference
> to delete(void*).

Could this be the same issue I've been seeing on Tru64 UNIX, i.e. lack
of weakdef support?

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01426.html

At least my weakdef.c testcase also fails to link on
i386-apple-darwin9.8.0.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH, PR 50605] Handle MEM_REFs in is_gimple_ip_invariant_address

2011-11-11 Thread Martin Jambor
Hi,

the problem in PR 50605 is that is_gimple_ip_invariant returns false
for

  &MEM[(struct tRecorderImp *)&recorder + 8B]

where &recorder is an IP gimple invariant.  This patch fixes that by
copying the code that handles MEM_REFs from
is_gimple_invariant_address (and only changing
decl_address_invariant_p to decl_address_ip_invariant_p).

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin



2011-11-11  Martin Jambor  

PR tree-optimization/50605
* gimple.c (is_gimple_ip_invariant_address): Also handle MEM_REFs
of IPA invariant decls.

* testsuite/g++.dg/ipa/pr50605.C: New test.


Index: src/gcc/testsuite/g++.dg/ipa/pr50605.C
===
--- /dev/null
+++ src/gcc/testsuite/g++.dg/ipa/pr50605.C
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-early-inlining" } */
+
+class A
+{
+public:
+  int a;
+  void *stuff;
+};
+
+class B
+{
+public:
+  int b;
+  void *other_stuff;
+  A array[50];
+};
+
+extern B gb;
+
+int process_A (A *a)
+{
+  return a->a;
+}
+
+int process_A_complex (A *a)
+{
+  return process_A (a+3);
+}
+
+int process_B (B *b)
+{
+  return process_A_complex (&b->array[0]);
+}
+
+int foo (void)
+{
+  return process_B (&gb);
+}
+
Index: src/gcc/gimple.c
===
--- src.orig/gcc/gimple.c
+++ src/gcc/gimple.c
@@ -2850,8 +2850,18 @@ is_gimple_ip_invariant_address (const_tr
 return false;
 
   op = strip_invariant_refs (TREE_OPERAND (t, 0));
+  if (!op)
+return false;
+
+  if (TREE_CODE (op) == MEM_REF)
+{
+  const_tree op0 = TREE_OPERAND (op, 0);
+  return (TREE_CODE (op0) == ADDR_EXPR
+ && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
+ || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0))));
+}
 
-  return op && (CONSTANT_CLASS_P (op) || decl_address_ip_invariant_p (op));
+  return CONSTANT_CLASS_P (op) || decl_address_ip_invariant_p (op);
 }
 
 /* Return true if T is a GIMPLE minimal invariant.  It's a restricted



[PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074, take 2)

2011-11-11 Thread Jakub Jelinek
Hi!

On Thu, Nov 10, 2011 at 12:00:53PM -0800, Richard Henderson wrote:
> VEC_PERM_EXPR is explicitly modulo.  Don't fail, mask.

Here is an updated patch.

In addition to the creation of subroutines this performs the permutation
folding using an unsigned char array for selector and folds VEC_PERM_EXPR
with out of bounds constant indices to masked ones if the first two
arguments aren't constants.
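
As a rough illustration of the modulo behaviour, here is a standalone sketch
of a two-input permutation over plain int arrays.  The toy_fold_vec_perm name
and the use of int elements are assumptions for the example; it is not the
fold-const.c code, and it assumes the element count is a power of two so that
masking with 2 * nelts - 1 is the same as reducing modulo 2 * nelts.

```c
#include <assert.h>
#include <stddef.h>

/* Permute the concatenation of A and B (NELTS elements each) using SEL.
   Each selector is masked to 0 .. 2 * NELTS - 1, so out-of-range values
   wrap around instead of making the fold fail.  OUT gets NELTS elements.
   NELTS is assumed to be a power of two.  */
static void
toy_fold_vec_perm (const int *a, const int *b, const unsigned char *sel,
                   size_t nelts, int *out)
{
  size_t mask = 2 * nelts - 1;
  size_t i;

  for (i = 0; i < nelts; i++)
    {
      size_t idx = sel[i] & mask;
      out[i] = idx < nelts ? a[idx] : b[idx - nelts];
    }
}
```

With four-element inputs, a selector value of 15 is masked to 7 and therefore
picks the last element of the second input.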

The valid_gimple_rhs_p change was needed because otherwise the VEC_PERM_EXPR
to VEC_PERM_EXPR with masked indices folding wasn't applied, and
the tree-vect-generic change was needed because if for whatever reason
the VEC_PERM_EXPR folding doesn't happen, lower_vec_perm wasn't masking it
and i386 target hook on it would ICE.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-11-11  Jakub Jelinek  

PR tree-optimization/51074
* fold-const.c (vec_cst_ctor_to_array, fold_vec_perm): New functions.
(fold_binary_loc): Handle VEC_EXTRACT_EVEN_EXPR,
VEC_EXTRACT_ODD_EXPR, VEC_INTERLEAVE_HIGH_EXPR and
VEC_INTERLEAVE_LOW_EXPR with VECTOR_CST or CONSTRUCTOR operands.
(fold_ternary_loc): Handle VEC_PERM_EXPR with VECTOR_CST or
CONSTRUCTOR operands.
* tree-ssa-propagate.c (valid_gimple_rhs_p): Handle ternary
expressions.
* tree-vect-generic.c (lower_vec_perm): Mask sel_int elements
to 0 .. 2 * elements - 1.

--- gcc/fold-const.c.jj 2011-11-10 18:08:54.328438535 +0100
+++ gcc/fold-const.c 2011-11-11 11:23:09.393186463 +0100
@@ -9528,6 +9528,86 @@ get_pointer_modulus_and_residue (tree ex
   return 1;
 }
 
+/* Helper function for fold_vec_perm.  Store elements of VECTOR_CST or
+   CONSTRUCTOR ARG into array ELTS and return true if successful.  */
+
+static bool
+vec_cst_ctor_to_array (tree arg, tree *elts)
+{
+  unsigned int nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg)), i;
+
+  if (TREE_CODE (arg) == VECTOR_CST)
+{
+  tree t;
+
+  for (i = 0, t = TREE_VECTOR_CST_ELTS (arg);
+  i < nelts && t; i++, t = TREE_CHAIN (t))
+   elts[i] = TREE_VALUE (t);
+  if (t)
+   return false;
+}
+  else if (TREE_CODE (arg) == CONSTRUCTOR)
+{
+  constructor_elt *elt;
+
+  FOR_EACH_VEC_ELT (constructor_elt, CONSTRUCTOR_ELTS (arg), i, elt)
+   if (i >= nelts)
+ return false;
+   else
+ elts[i] = elt->value;
+}
+  else
+return false;
+  for (; i < nelts; i++)
+elts[i]
+  = fold_convert (TREE_TYPE (TREE_TYPE (arg)), integer_zero_node);
+  return true;
+}
+
+/* Attempt to fold vector permutation of ARG0 and ARG1 vectors using SEL
+   selector.  Return the folded VECTOR_CST or CONSTRUCTOR if successful,
+   NULL_TREE otherwise.  */
+
+static tree
+fold_vec_perm (tree type, tree arg0, tree arg1, const unsigned char *sel)
+{
+  unsigned int nelts = TYPE_VECTOR_SUBPARTS (type), i;
+  tree *elts;
+  bool need_ctor = false;
+
+  gcc_assert (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)) == nelts
+ && TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1)) == nelts);
+  if (TREE_TYPE (TREE_TYPE (arg0)) != TREE_TYPE (type)
+  || TREE_TYPE (TREE_TYPE (arg1)) != TREE_TYPE (type))
+return NULL_TREE;
+
+  elts = XALLOCAVEC (tree, nelts * 3);
+  if (!vec_cst_ctor_to_array (arg0, elts)
+  || !vec_cst_ctor_to_array (arg1, elts + nelts))
+return NULL_TREE;
+
+  for (i = 0; i < nelts; i++)
+{
+  if (!CONSTANT_CLASS_P (elts[sel[i]]))
+   need_ctor = true;
+  elts[i + 2 * nelts] = unshare_expr (elts[sel[i]]);
+}
+
+  if (need_ctor)
+{
+  VEC(constructor_elt,gc) *v = VEC_alloc (constructor_elt, gc, nelts);
+  for (i = 0; i < nelts; i++)
+   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, elts[2 * nelts + i]);
+  return build_constructor (type, v);
+}
+  else
+{
+  tree vals = NULL_TREE;
+  for (i = 0; i < nelts; i++)
+   vals = tree_cons (NULL_TREE, elts[3 * nelts - i - 1], vals);
+  return build_vector (type, vals);
+}
+}
 
 /* Fold a binary expression of code CODE and type TYPE with operands
OP0 and OP1.  LOC is the location of the resulting expression.
@@ -13381,6 +13461,41 @@ fold_binary_loc (location_t loc,
   /* An ASSERT_EXPR should never be passed to fold_binary.  */
   gcc_unreachable ();
 
+case VEC_EXTRACT_EVEN_EXPR:
+case VEC_EXTRACT_ODD_EXPR:
+case VEC_INTERLEAVE_HIGH_EXPR:
+case VEC_INTERLEAVE_LOW_EXPR:
+  if ((TREE_CODE (arg0) == VECTOR_CST
+  || TREE_CODE (arg0) == CONSTRUCTOR)
+ && (TREE_CODE (arg1) == VECTOR_CST
+ || TREE_CODE (arg1) == CONSTRUCTOR))
+   {
+ unsigned int nelts = TYPE_VECTOR_SUBPARTS (type), i;
+ unsigned char *sel = XALLOCAVEC (unsigned char, nelts);
+
+ for (i = 0; i < nelts; i++)
+   switch (code)
+ {
+ case VEC_EXTRACT_EVEN_EXPR:
+   sel[i] = i * 2;
+   break;
+ case VEC_EXTRACT_ODD_EXPR:
+   sel[i]

[PATCH] Don't ICE on SLP calls if the same call is used in multiple SLP instances (PR tree-optimization/51058)

2011-11-11 Thread Jakub Jelinek
Hi!

Removing the scalar call in vectorizable_call is too early for SLP
vectorization: when another SLP instance refers to the same scalar call,
we ICE because that stmt no longer has a bb or gsi_for_stmt
doesn't succeed for it.

Fixed by postponing replacement of calls with zeroing of lhs for later
in the SLP case.
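
The ordering problem can be shown with a small standalone model.  Everything
here (toy_stmt, toy_instance, the eager flag) is hypothetical and only mimics
the shape of the issue: two instances share a statement, and removing it
while handling the first instance breaks the second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy statement: LIVE says whether it is still in its basic block.  */
struct toy_stmt { bool live; bool is_call; };

/* Toy SLP instance: a list of indices into the statement array.  */
struct toy_instance { const size_t *stmts; size_t n; };

/* Process one instance.  Returns false when it meets a statement that has
   already been removed, the situation that corresponds to the ICE.  With
   EAGER set, scalar calls are removed as soon as they are processed.  */
static bool
toy_schedule_instance (struct toy_stmt *all, const struct toy_instance *inst,
                       bool eager)
{
  size_t i;

  for (i = 0; i < inst->n; i++)
    {
      struct toy_stmt *s = &all[inst->stmts[i]];
      if (!s->live)
        return false;
      if (eager && s->is_call)
        s->live = false;
    }
  return true;
}

/* Process all instances.  Without EAGER, removal of scalar calls happens
   in a separate pass after every instance has been handled.  */
static bool
toy_schedule_all (struct toy_stmt *all, const struct toy_instance *insts,
                  size_t ninsts, bool eager)
{
  size_t i, j;

  for (i = 0; i < ninsts; i++)
    if (!toy_schedule_instance (all, &insts[i], eager))
      return false;

  if (!eager)
    for (i = 0; i < ninsts; i++)
      for (j = 0; j < insts[i].n; j++)
        if (all[insts[i].stmts[j]].is_call)
          all[insts[i].stmts[j]].live = false;

  return true;
}
```

The deferred pass tolerates visiting a shared statement twice, much as the
patch skips statements whose gimple_bb is already NULL.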

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-11-11  Jakub Jelinek  

PR tree-optimization/51058
* tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
(vect_schedule_slp): Call it.
* tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
don't replace scalar calls with clearing of their lhs here.

* gcc.dg/vect/fast-math-vect-call-1.c: Add f4 test.
* gfortran.fortran-torture/compile/pr51058.f90: New test.

--- gcc/tree-vect-slp.c.jj  2011-11-10 18:09:12.0 +0100
+++ gcc/tree-vect-slp.c 2011-11-11 13:18:42.157292895 +0100
@@ -2898,6 +2898,46 @@ vect_schedule_slp_instance (slp_tree nod
   return is_store;
 }
 
+/* Replace scalar calls from SLP node NODE with clearing of their lhs.
+   For loop vectorization this is done in vectorizable_call, but for SLP
+   it needs to be deferred until end of vect_schedule_slp, because multiple
+   SLP instances may refer to the same scalar stmt.  */
+
+static void
+vect_remove_slp_scalar_calls (slp_tree node)
+{
+  gimple stmt, new_stmt;
+  gimple_stmt_iterator gsi;
+  int i;
+  slp_void_p child;
+  tree lhs;
+  stmt_vec_info stmt_info;
+
+  if (!node)
+return;
+
+  FOR_EACH_VEC_ELT (slp_void_p, SLP_TREE_CHILDREN (node), i, child)
+vect_remove_slp_scalar_calls ((slp_tree) child);
+
+  FOR_EACH_VEC_ELT (gimple, SLP_TREE_SCALAR_STMTS (node), i, stmt)
+{
+  if (!is_gimple_call (stmt) || gimple_bb (stmt) == NULL)
+   continue;
+  stmt_info = vinfo_for_stmt (stmt);
+  if (stmt_info == NULL
+ || is_pattern_stmt_p (stmt_info)
+ || !PURE_SLP_STMT (stmt_info))
+   continue;
+  lhs = gimple_call_lhs (stmt);
+  new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
+  set_vinfo_for_stmt (new_stmt, stmt_info);
+  set_vinfo_for_stmt (stmt, NULL);
+  STMT_VINFO_STMT (stmt_info) = new_stmt;
+  gsi = gsi_for_stmt (stmt);
+  gsi_replace (&gsi, new_stmt, false);
+  SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
+}
+}
 
 /* Generate vector code for all SLP instances in the loop/basic block.  */
 
@@ -2937,6 +2977,8 @@ vect_schedule_slp (loop_vec_info loop_vi
   unsigned int j;
   gimple_stmt_iterator gsi;
 
+  vect_remove_slp_scalar_calls (root);
+
   for (j = 0; VEC_iterate (gimple, SLP_TREE_SCALAR_STMTS (root), j, store)
   && j < SLP_INSTANCE_GROUP_SIZE (instance); j++)
 {
--- gcc/tree-vect-stmts.c.jj 2011-11-10 18:09:12.0 +0100
+++ gcc/tree-vect-stmts.c   2011-11-11 13:17:55.957565252 +0100
@@ -1886,6 +1886,9 @@ vectorizable_call (gimple stmt, gimple_s
  it defines is mapped to the new definition.  So just replace
  rhs of the statement with something harmless.  */
 
+  if (slp_node)
+return true;
+
   type = TREE_TYPE (scalar_dest);
   if (is_pattern_stmt_p (stmt_info))
 lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info));
@@ -1893,8 +1896,7 @@ vectorizable_call (gimple stmt, gimple_s
 lhs = gimple_call_lhs (stmt);
   new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
   set_vinfo_for_stmt (new_stmt, stmt_info);
-  if (!slp_node)
-set_vinfo_for_stmt (stmt, NULL);
+  set_vinfo_for_stmt (stmt, NULL);
   STMT_VINFO_STMT (stmt_info) = new_stmt;
   gsi_replace (gsi, new_stmt, false);
   SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
--- gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c.jj 2011-11-08 23:35:11.0 +0100
+++ gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c 2011-11-11 13:11:30.348891934 +0100
@@ -38,6 +38,18 @@ f3 (void)
 a[i] = copysignf (b[i], c[i]) + 1.0f + sqrtf (d[i]);
 }
 
+__attribute__((noinline, noclone)) void
+f4 (int n)
+{
+  int i;
+  for (i = 0; i < 2 * n; i++)
+{
+  a[3 * i + 0] = copysignf (b[3 * i + 0], c[3 * i + 0]) + 1.0f + sqrtf (d[3 * i + 0]);
+  a[3 * i + 1] = copysignf (b[3 * i + 1], c[3 * i + 1]) + 2.0f + sqrtf (d[3 * i + 1]);
+  a[3 * i + 2] = copysignf (b[3 * i + 2], c[3 * i + 2]) + 3.0f + sqrtf (d[3 * i + 2]);
+}
+}
+
 __attribute__((noinline, noclone)) int
 main1 ()
 {
@@ -66,6 +78,12 @@ main1 ()
   for (i = 0; i < 64; i++)
 if (fabsf (((i & 2) ? -4 * i : 4 * i) + 1 + i - a[i]) >= 0.0001f)
   abort ();
+else
+  a[i] = 131.25;
+  f4 (10);
+  for (i = 0; i < 60; i++)
+if (fabsf (((i & 2) ? -4 * i : 4 * i) + 1 + (i % 3) + i - a[i]) >= 0.0001f)
+  abort ();
   return 0;
 }
 
@@ -76,6 +94,6 @@ main ()
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { target { 
vect_call_copysignf && vect_call_s

[PATCH] Emit vzeroupper even from gen_return and gen_simple_return

2011-11-11 Thread Jakub Jelinek
Hi!

The avx-vzeroupper-14.c testcase now fails, because normal epilogue isn't
emitted and before simple_return or return we forgot to emit the vzeroupper
insn.  Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2011-11-11  Jakub Jelinek  

* config/i386/i386-protos.h (ix86_maybe_emit_epilogue_vzeroupper):
New prototype.
* config/i386/i386.c (ix86_maybe_emit_epilogue_vzeroupper): New
function.
(ix86_expand_epilogue): Use it.
* config/i386/i386.md (return, simple_return): Call it in the
expanders.

--- gcc/config/i386/i386-protos.h.jj 2011-11-07 12:40:55.0 +0100
+++ gcc/config/i386/i386-protos.h   2011-11-11 13:39:48.322746212 +0100
@@ -32,6 +32,7 @@ extern void ix86_setup_frame_addresses (
 
 extern HOST_WIDE_INT ix86_initial_elimination_offset (int, int);
 extern void ix86_expand_prologue (void);
+extern void ix86_maybe_emit_epilogue_vzeroupper (void);
 extern void ix86_expand_epilogue (int);
 extern void ix86_expand_split_stack_prologue (void);
 
--- gcc/config/i386/i386.c.jj   2011-11-10 18:09:12.0 +0100
+++ gcc/config/i386/i386.c  2011-11-11 13:39:22.375900662 +0100
@@ -10614,6 +10614,17 @@ ix86_emit_restore_sse_regs_using_mov (HO
   }
 }
 
+/* Emit vzeroupper if needed.  */
+
+void
+ix86_maybe_emit_epilogue_vzeroupper (void)
+{
+  if (TARGET_VZEROUPPER
+  && !TREE_THIS_VOLATILE (cfun->decl)
+  && !cfun->machine->caller_return_avx256_p)
+emit_insn (gen_avx_vzeroupper (GEN_INT (call_no_avx256)));
+}
+
 /* Restore function stack, frame, and registers.  */
 
 void
@@ -10911,10 +10922,7 @@ ix86_expand_epilogue (int style)
 }
 
   /* Emit vzeroupper if needed.  */
-  if (TARGET_VZEROUPPER
-  && !TREE_THIS_VOLATILE (cfun->decl)
-  && !cfun->machine->caller_return_avx256_p)
-emit_insn (gen_avx_vzeroupper (GEN_INT (call_no_avx256)));
+  ix86_maybe_emit_epilogue_vzeroupper ();
 
   if (crtl->args.pops_args && crtl->args.size)
 {
--- gcc/config/i386/i386.md.jj  2011-11-08 09:27:01.0 +0100
+++ gcc/config/i386/i386.md 2011-11-11 13:40:41.081432858 +0100
@@ -11736,6 +11736,7 @@ (define_expand "return"
   [(simple_return)]
   "ix86_can_use_return_insn_p ()"
 {
+  ix86_maybe_emit_epilogue_vzeroupper ();
   if (crtl->args.pops_args)
 {
   rtx popc = GEN_INT (crtl->args.pops_args);
@@ -11752,6 +11753,7 @@ (define_expand "simple_return"
   [(simple_return)]
   "!TARGET_SEH"
 {
+  ix86_maybe_emit_epilogue_vzeroupper ();
   if (crtl->args.pops_args)
 {
   rtx popc = GEN_INT (crtl->args.pops_args);

Jakub


Re: Mark objects death@end of scope

2011-11-11 Thread Ulrich Weigand
Michael Matz wrote:

>   * gimplify.c (gimplify_bind_expr): Add clobbers for all variables
>   that go out of scope and live in memory.

This seems to have completely broken SPU exception handling (note that
SPU is currently completely broken anyway due to the libgcc move).

What happens is that with that patch, some of the core internal routines
of the unwinder itself, including _Unwind_SjLj_Resume, themselves get
exception regions.

While the DWARF unwinder may be able to cope with this, the SjLj unwinder
- which SPU uses - appears to get totally confused by this.  We end up in
an endless loop where _Unwind_SjLj_Resume "resumes" to a location within
itself.

One reason why this happens is that the unwind*.c files are specifically
built with -fexceptions.  I think this is for the benefit of the DWARF
unwinder, to ensure CFI records are available for those routines.  But
for the SjLj unwinder, it's a bit counter-productive ...

Any thoughts how to fix this?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: [PATCH] Emit vzeroupper even from gen_return and gen_simple_return

2011-11-11 Thread Uros Bizjak
On Fri, Nov 11, 2011 at 4:33 PM, Jakub Jelinek  wrote:

> The avx-vzeroupper-14.c testcase now fails, because normal epilogue isn't
> emitted and before simple_return or return we forgot to emit the vzeroupper
> insn.  Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

OK.

Thanks,
Uros.


[PATCH][Cilkplus] Patch to fix Finish Call Expr

2011-11-11 Thread Iyer, Balaji V
Hello Everyone,
This patch is for the Cilk Plus branch, mainly affecting the G++ 
compiler. This patch will add the extra parameter (enum call_context) into 
finish_call_expr that was added in the weekly merge.

I am planning to send approximately 4 patches today, and so please 
commit them in order. This is PATCH #1.


Thanks,

Balaji V. Iyer.
diff --git a/gcc/cp/ChangeLog.cilk b/gcc/cp/ChangeLog.cilk
index cbf022f..fc498f1 100644
--- a/gcc/cp/ChangeLog.cilk
+++ b/gcc/cp/ChangeLog.cilk
@@ -1,3 +1,10 @@
+2011-11-11  Balaji V. Iyer  
+
+   * parser.c (cp_parser_userdef_char_literal): Added CALL_NORMAL
+   parameter to finish_call_expr function call.
+   (cp_parser_userdef_numeral_literal): Likewise.
+   (cp_parser_userdef_string_literal): Likewise.
+
 2011-10-22  Balaji V. Iyer  
 
* typeck2.c (split_nonconstant_init_1): Added "CALL_NORMAL" parameter
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 7f03785..07572d4 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -3602,7 +3602,8 @@ cp_parser_userdef_char_literal (cp_parser *parser)
   release_tree_vector (vec);
   return error_mark_node;
 }
-  result = finish_call_expr (decl, &vec, false, true, tf_warning_or_error);
+  result = finish_call_expr (decl, &vec, false, true, CALL_NORMAL,
+tf_warning_or_error);
   release_tree_vector (vec);
 
   return result;
@@ -3662,7 +3663,8 @@ cp_parser_userdef_numeric_literal (cp_parser *parser)
   decl = lookup_function_nonclass (name, args, /*block_p=*/false);
   if (decl && decl != error_mark_node)
 {
-  result = finish_call_expr (decl, &args, false, true, tf_none);
+  result = finish_call_expr (decl, &args, false, true, CALL_NORMAL,
+tf_none);
   if (result != error_mark_node)
{
  release_tree_vector (args);
@@ -3679,7 +3681,8 @@ cp_parser_userdef_numeric_literal (cp_parser *parser)
   decl = lookup_function_nonclass (name, args, /*block_p=*/false);
   if (decl && decl != error_mark_node)
 {
-  result = finish_call_expr (decl, &args, false, true, tf_none);
+  result = finish_call_expr (decl, &args, false, true, CALL_NORMAL,
+tf_none);
   if (result != error_mark_node)
{
  release_tree_vector (args);
@@ -3697,7 +3700,8 @@ cp_parser_userdef_numeric_literal (cp_parser *parser)
 {
   tree tmpl_args = make_char_string_pack (num_string);
   decl = lookup_template_function (decl, tmpl_args);
-  result = finish_call_expr (decl, &args, false, true, tf_none);
+  result = finish_call_expr (decl, &args, false, true, CALL_NORMAL,
+tf_none);
   if (result != error_mark_node)
{
  release_tree_vector (args);
@@ -3743,7 +3747,7 @@ cp_parser_userdef_string_literal (cp_token *token)
   release_tree_vector (vec);
   return error_mark_node;
 }
-  result = finish_call_expr (decl, &vec, false, true, tf_none);
+  result = finish_call_expr (decl, &vec, false, true, CALL_NORMAL, tf_none);
   if (result == error_mark_node)
 error ("unable to find valid user-defined string literal operator %qD."
   "  Possible missing length argument in string literal operator.",


Re: PR c++/30195

2011-11-11 Thread Jason Merrill

On 11/11/2011 04:42 AM, Dodji Seketeli wrote:

Fabien Chêne  a écrit:


Are the other debugging backends not interested at all in USING_DECLs ?


The way debug info is generated for USING_DECLs is that
handle_using_decl (via cp_emit_debug_info_for_using) asks the backend to
generate debug info for the DECLs the USING_DECL resolves to, basically.
AIUI, the backend is not supposed to handle the USING_DECL himself.


That seems to be only used for namespace- or function-scope 
using-declarations; we don't do anything for class-scope 
using-declarations currently.  DWARF2 is interested in them, but that 
seems like a separate chunk of work, so for now let's just set 
DECL_IGNORED_P like Dodji suggested.


Jason


[PATCH][Cilkplus] Patch to finish const tree

2011-11-11 Thread Iyer, Balaji V
Hello Everyone,
This patch is for the Cilkplus branch, affecting the C and C++ compilers. 
This patch will fix a warning about converting a const void * to tree.
   
 This patch is PATCH #2.
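
For readers unfamiliar with the warning: a direct cast from const void * to a
non-const pointer type drops the qualifier, which compilers can flag.  The
patch sidesteps that by reinterpreting the pointer variable itself.  The
sketch below shows two other ways to get the same pointer value back without
a qualifier-dropping cast.  The toy_* names are hypothetical, neither variant
is taken from the patch, and both assume that all object pointers share one
representation, as on mainstream targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a pointer-to-node type such as GCC's tree.  */
struct toy_node { int value; };
typedef struct toy_node *toy_tree;

/* Round-trip through an integer type wide enough to hold a pointer.  */
static toy_tree
toy_unconst_via_uintptr (const void *key)
{
  return (toy_tree) (uintptr_t) key;
}

/* Copy the pointer's bytes into a variable of the target pointer type.  */
static toy_tree
toy_unconst_via_memcpy (const void *key)
{
  toy_tree result;

  memcpy (&result, &key, sizeof result);
  return result;
}
```

Either helper returns the original address, so a hash-table callback that
receives its key as const void * can hand it on as a mutable node pointer.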

Thanks,

Balaji V. Iyer.
diff --git a/gcc/ChangeLog.cilk b/gcc/ChangeLog.cilk
index 285f059..511d8f7 100644
--- a/gcc/ChangeLog.cilk
+++ b/gcc/ChangeLog.cilk
@@ -1,3 +1,10 @@
+2011-11-11  Balaji V. Iyer  
+
+   * cilk-spawn.c (wrapper_parm_cb): Changed a const tree cast
+   to (tree *).
+   (for_local_cb): Likewise.
+   (wrapper_local_cb): Likewise.
+
 2011-10-22  Balaji V. Iyer  
 
* cilk.c (install_builtin): Changed implicit_built_in_decls[] to
diff --git a/gcc/cilk-spawn.c b/gcc/cilk-spawn.c
index b004b11..17b9c51 100644
--- a/gcc/cilk-spawn.c
+++ b/gcc/cilk-spawn.c
@@ -1290,7 +1290,7 @@ static bool
 wrapper_parm_cb (const void *key0, void **val0, void *data)
 {
   struct wrapper_data *wd = (struct wrapper_data *)data;
-  tree arg = (const tree)key0;
+  tree arg = * (tree *)&key0;
   tree val = (tree)*val0;
   tree parm;
 
@@ -1450,7 +1450,7 @@ build_cilk_wrapper_body (tree stmt,
 static bool
 for_local_cb (const void *k_v, void **vp, void *p)
 {
-  tree k = (const tree) k_v; /* const cast */
+  tree k = *(tree *) &k_v; /* const cast */
   tree v = (tree) *vp;
 
   if (v == error_mark_node)
@@ -1462,7 +1462,7 @@ static bool
 wrapper_local_cb (const void *k_v, void **vp, void *data)
 {
   copy_body_data *id = (copy_body_data *)data;
-  tree key = (const tree) k_v;
+  tree key = *(tree *) &k_v;
   tree val = (tree) *vp;
 
   if (val == error_mark_node)
diff --git a/gcc/cp/ChangeLog.cilk b/gcc/cp/ChangeLog.cilk
index fc498f1..ce83aa6 100644
--- a/gcc/cp/ChangeLog.cilk
+++ b/gcc/cp/ChangeLog.cilk
@@ -4,6 +4,8 @@
parameter to finish_call_expr function call.
(cp_parser_userdef_numeral_literal): Likewise.
(cp_parser_userdef_string_literal): Likewise.
+   * cilk.c (for_local_cb): Changed a const tree cast to (tree *).
+   (wrapper_local_cb): Likewise.
 
 2011-10-22  Balaji V. Iyer  
 
diff --git a/gcc/cp/cilk.c b/gcc/cp/cilk.c
index 08a196e..b5e58f6 100644
--- a/gcc/cp/cilk.c
+++ b/gcc/cp/cilk.c
@@ -289,7 +289,7 @@ copy_decl_for_cilk (tree decl, copy_body_data *id)
 static bool
 for_local_cb (const void *k_v, void **vp, void *p)
 {
-  tree k = (tree) k_v;
+  tree k = *(tree *) &k_v;
   tree v = (tree) *vp;
 
 
@@ -334,7 +334,7 @@ static bool
 wrapper_local_cb (const void *k_v, void **vp, void *data)
 {
   copy_body_data *id = (copy_body_data *)data;
-  tree key = (tree)k_v;
+  tree key = *(tree *) &k_v;
   tree val = (tree)*vp;
 
   if (val == error_mark_node)


[PATCH][Cilkplus] Patch to remove CILKPLUS IMPLEMENTED macro

2011-11-11 Thread Iyer, Balaji V
Hello Everyone,
This patch is for the Cilkplus branch, affecting both C and C++ compilers. 
This patch will remove the CILKPLUS_IMPLEMENTED macro and all the #ifdef and
#ifndef blocks that use it.

   This is patch #3.

Thanks,

Balaji V. Iyer.
diff --git a/gcc/ChangeLog.cilk b/gcc/ChangeLog.cilk
index 511d8f7..ea3dc6e 100644
--- a/gcc/ChangeLog.cilk
+++ b/gcc/ChangeLog.cilk
@@ -4,6 +4,8 @@
to (tree *).
(for_local_cb): Likewise.
(wrapper_local_cb): Likewise.
+   * opts.c: Removed the CILKPLUS_IMPLEMENTED macro and the #ifdef and
+   #ifndefs that uses this macro.
 
 2011-10-22  Balaji V. Iyer  
 
diff --git a/gcc/opts.c b/gcc/opts.c
index d6732d5..10be701 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -36,7 +36,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "common/common-target.h"
 
 
-#define CILKPLUS_IMPLEMENTED /* bviyer: Please do not remove this #define */
 
 /* Indexed by enum debug_info_type.  */
 const char *const debug_type_names[] =
@@ -424,9 +423,6 @@ maybe_default_options (struct gcc_options *opts,
 static const struct default_options default_options_table[] =
   {
 /* -O1 optimizations.  */
-#ifdef CILKPLUS_IMPLEMENTED
-{ OPT_LEVELS_1_PLUS, OPT_ftree_vectorize, NULL, 1 },
-#endif
 { OPT_LEVELS_1_PLUS, OPT_fdefer_pop, NULL, 1 },
 #ifdef DELAY_SLOTS
 { OPT_LEVELS_1_PLUS, OPT_fdelayed_branch, NULL, 1 },
@@ -503,9 +499,7 @@ static const struct default_options default_options_table[] =
 { OPT_LEVELS_1_PLUS, OPT_finline_functions_called_once, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
-#ifndef CILKPLUS_IMPLEMENTED
 { OPT_LEVELS_3_PLUS, OPT_ftree_vectorize, NULL, 1 },
-#endif
 { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
 
 /* -Ofast adds optimizations to -O3.  */


[PATCH][Cilkplus] Patch to fix unused variable

2011-11-11 Thread Iyer, Balaji V
Hello Everyone,
This patch is for the Cilkplus branch affecting both C and C++ compilers. 
This patch will fix an unused variable warning in collect2.c by enclosing the 
variable inside #ifdef TARGET_AIX_VERSION.
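
The pattern is simply to give the declaration the same guard as its only use.
A minimal sketch follows, with TOY_TARGET_AIX as a hypothetical stand-in for
TARGET_AIX_VERSION and a made-up function body; it is not the collect2.c
code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the arguments that look like object files (names ending in ".o").  */
static int
toy_count_objects (int argc, const char **argv)
{
#ifdef TOY_TARGET_AIX
  int object_nbr = argc;  /* only read by the guarded code below */
#endif
  int count = 0;
  int i;

  for (i = 1; i < argc; i++)
    {
      size_t len = strlen (argv[i]);
      if (len >= 2 && strcmp (argv[i] + len - 2, ".o") == 0)
        count++;
    }

#ifdef TOY_TARGET_AIX
  /* Hypothetical target-specific use of the guarded variable.  */
  if (count > object_nbr)
    count = object_nbr;
#endif

  return count;
}
```

When TOY_TARGET_AIX is not defined, the variable is never declared, so there
is nothing for an unused-variable warning to report.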

Thanks,

Balaji V. Iyer.
diff --git a/gcc/ChangeLog.cilk b/gcc/ChangeLog.cilk
index ea3dc6e..8893b9a 100644
--- a/gcc/ChangeLog.cilk
+++ b/gcc/ChangeLog.cilk
@@ -6,6 +6,8 @@
(wrapper_local_cb): Likewise.
* opts.c: Removed the CILKPLUS_IMPLEMENTED macro and the #ifdef and
#ifndefs that uses this macro.
+   * collect2.c (main): Enclosed object_nbr declaration inside a
+   #ifdef TARGET_AIX_VERSION.
 
 2011-10-22  Balaji V. Iyer  
 
diff --git a/gcc/collect2.c b/gcc/collect2.c
index 9240bc8..92ef7ba 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -1091,7 +1091,9 @@ main (int argc, char **argv)
   const char **ld2;
   char **object_lst;
   const char **object;
+#ifdef TARGET_AIX_VERSION
   int object_nbr = argc;
+#endif
   int first_file;
   int num_c_args;
   char **old_argv;


FW: [PATCH][Cilkplus] Patch to fix unused variable

2011-11-11 Thread Iyer, Balaji V
Forgot to mention.. this is the 4th and final patch in this sequence.

Thanks,

Balaji V. Iyer.



Re: Continue strict-volatile-bitfields fixes

2011-11-11 Thread Bernd Schmidt
On 11/11/11 16:30, Joey Ye wrote:
> -fstrict-volatile-bitfields doesn't work correctly in some cases 
> when storing into a volatile bit-field. 
> 
> Bernd provided a fix here about 1 year ago:
> http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00217.html.
> But it is pending to trunk. Here are my humble opinions and hopefully 
> we can revive it:
> 
> 1. The fix could have helped lots of those who use volatile bit-fields, 
> but has been blocked for 1 year by ABI version 1, a feature that I believe 
> no one nowadays is using with latest gcc. Either error out ABI version 1
> for some target, or just revising the failed ABI test case is OK for me.

Yeah. At the time I thought the objections were a bit pointless. At
worst, the added code in some of the target ports is irrelevant, as
admitted by DJ later in the thread, but nothing stops port maintainers
from adding code to disallow -fabi-version for their port. Since none do
this (AFAIK), I still believe it's best to make all ports behave
identically with this patch.

So, I still think this patch is the best way to go forward, and it does
fix incorrect code generation. Would appreciate an OK.


Bernd


Re: SPU build broken (Re: CFT: [build] Move libgcc2 to toplevel libgcc)

2011-11-11 Thread Rainer Orth
"Ulrich Weigand"  writes:

> Rainer Orth wrote:
>
>> diff --git a/gcc/config/spu/t-spu-elf b/gcc/config/spu/t-spu-elf
>
>> -# We exclude those because the libgcc2.c default versions do not support
>> -# the SPU single-precision format (round towards zero).  We provide our
>> -# own versions below and/or via direct expansion.
>> -LIB2FUNCS_EXCLUDE = _floatdisf _floatundisf _floattisf _floatunstisf
>
>> diff --git a/libgcc/config/spu/t-elf b/libgcc/config/spu/t-elf
>   
>> +# We exclude those because the libgcc2.c default versions do not support
>> +# the SPU single-precision format (round towards zero).  We provide our
>> +# own versions below and/or via direct expansion.
>> +LIB2ADD = _floatdisf _floatundisf _floattisf _floatunstisf
>
>
> This seems to have caused:
>
> make[2]: Entering directory 
> `/home/kwerner/dailybuild/spu-tc-2011-11-05/gcc-build/spu/libgcc'
> Makefile:792: *** Unsupported files in LIB2ADD or LIB2ADD_ST..  Stop.

Sorry for the delay.  Indeed, the test at that line only accepts .c, .S,
and .asm files, where the latter should probably be removed now that
we've standardized on .S.

> Shouldn't the variable still be called LIB2FUNCS_EXCLUDE after the
> move to libgcc?  LIB2ADD seems to expect full file names ...

Of course, the change is bogus.  I can only (half) explain this by the
change from LIB2FUNCS_STATIC_EXTRA to LIB2ADD_ST extra.

The trivial patch allowed an x86_64-unknown-linux-gnu x spu-elf cross to
finish the libgcc build, and at least the set of objects built before my
patch series is identical to the set built now.

Ok for mainline?

Rainer


2011-11-11  Rainer Orth  

* config/spu/t-elf (LIB2ADD): Use LIB2FUNCS_EXCLUDE instead.

# HG changeset patch
# Parent 4b61b438da8a6a11ab1e06abe67fd26fa715c25e
Fix SPU libgcc build

diff --git a/libgcc/config/spu/t-elf b/libgcc/config/spu/t-elf
--- a/libgcc/config/spu/t-elf
+++ b/libgcc/config/spu/t-elf
@@ -5,7 +5,7 @@ CRTSTUFF_T_CFLAGS =
 # We exclude those because the libgcc2.c default versions do not support
 # the SPU single-precision format (round towards zero).  We provide our
 # own versions below and/or via direct expansion.
-LIB2ADD = _floatdisf _floatundisf _floattisf _floatunstisf
+LIB2FUNCS_EXCLUDE = _floatdisf _floatundisf _floattisf _floatunstisf
 
 LIB2ADD_ST = $(srcdir)/config/spu/float_unssidf.c \
 	 $(srcdir)/config/spu/float_unsdidf.c \

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: SPU build broken (Re: CFT: [build] Move libgcc2 to toplevel libgcc)

2011-11-11 Thread Paolo Bonzini

On 11/11/2011 05:23 PM, Rainer Orth wrote:

The trivial patch allowed a x86_64-unknown-linux-gnu x spu-elf cross to
finish the libgcc build, and at least the set of objects built before my
patch series is identical to the set built now.

Ok for mainline?


Ok.  Have you checked for other occurrences?

Paolo


Re: [PATCH] Optimize in RTL vector AND { -1, -1, ... }, IOR { -1, -1, ... } and XOR { -1, -1, ... } (take 2)

2011-11-11 Thread Bernd Schmidt
On 10/31/11 14:24, Henderson, Stuart wrote:
>> 2011-09-26  Jakub Jelinek  
>>
>>   * rtl.h (const_tiny_rtx): Change into array of 4 x MAX_MACHINE_MODE
>>   from 3 x MAX_MACHINE_MODE.
>>   (CONSTM1_RTX): Define.
>>   * emit-rtl.c (const_tiny_rtx): Change into array of 4 x 
>> MAX_MACHINE_MODE
>>   from 3 x MAX_MACHINE_MODE.
>>   (gen_rtx_CONST_VECTOR): Use CONSTM1_RTX if all inner constants are
>>   CONSTM1_RTX.
>>   (init_emit_once): Initialize CONSTM1_RTX for MODE_INT and
>>   MODE_VECTOR_INT modes.
>>   * simplify-rtx.c (simplify_binary_operation_1) :
>>   Optimize if one operand is CONSTM1_RTX.
>>   * config/i386/i386.c (ix86_expand_sse_movcc): Optimize mask ? -1 : x
>>   into mask | x.
> 
> FYI - this patch (179238) breaks the Blackfin compiler build with an internal 
> compiler error during configure of libgcc:
> conftest.c:1:0: internal compiler error: in gen_const_vector, at 
> emit-rtl.c:5491

This fixes it. Looks obvious enough to me; hence will commit Monday if
no objections.


Bernd

* emit-rtl.c (init_emit_once): Initialize const_tiny_rtx[3] for
partial integer modes.

Index: emit-rtl.c
===
--- emit-rtl.c  (revision 181252)
+++ emit-rtl.c  (working copy)
@@ -5706,6 +5706,11 @@ init_emit_once (void)
mode = GET_MODE_WIDER_MODE (mode))
 const_tiny_rtx[3][(int) mode] = constm1_rtx;
 
+  for (mode = GET_CLASS_NARROWEST_MODE (MODE_PARTIAL_INT);
+   mode != VOIDmode;
+   mode = GET_MODE_WIDER_MODE (mode))
+const_tiny_rtx[3][(int) mode] = constm1_rtx;
+  
   for (mode = GET_CLASS_NARROWEST_MODE (MODE_COMPLEX_INT);
mode != VOIDmode;
mode = GET_MODE_WIDER_MODE (mode))


[Patch, libfortran] PR 51090 Check getenv result before proceeding

2011-11-11 Thread Janne Blomqvist
Hi,

committed the patch below as obvious.

2011-11-11  Janne Blomqvist  

PR libfortran/51090
* runtime/main.c (find_addr2line): NULL check before proceeding.

Index: main.c
===
--- main.c  (revision 181287)
+++ main.c  (working copy)
@@ -149,6 +149,8 @@ find_addr2line (void)
 #ifdef HAVE_ACCESS
 #define A2L_LEN 10
   char *path = getenv ("PATH");
+  if (!path)
+  return;
   size_t n = strlen (path);
   char ap[n + 1 + A2L_LEN];
   size_t ai = 0;
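
The same guard in a standalone form: getenv returns NULL for an unset
variable, and passing NULL to strlen is undefined behaviour.  The function
name and its return convention below are assumptions made for the example,
not libgfortran code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the length of the first colon-separated entry of the environment
   variable NAME, or 0 when the variable is not set.  */
static size_t
toy_first_path_entry_len (const char *name)
{
  const char *path = getenv (name);
  const char *colon;

  if (!path)  /* unset variable: nothing to search */
    return 0;

  colon = strchr (path, ':');
  return colon ? (size_t) (colon - path) : strlen (path);
}
```

Calling it with the name of an unset variable returns 0 instead of crashing,
which is the behaviour the NULL check in find_addr2line is after.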


-- 
Janne Blomqvist


Fix various minor issues in cprop.c

2011-11-11 Thread Eric Botcazou
While reviewing PR rtl-opt/50663, I ran into some minor issues in cprop.c that 
can easily be addressed:
  - a few outdated comments,
  - non-obvious naming of variables (pavloc, absaltered),
  - reversed naming (transp instead of kill in compute_local_properties),
  - inconsistent prototype for constprop_register,
  - a few long lines and a few typos left and right.

No functional changes.  Tested on i586-suse-linux, applied on mainline.


2011-11-11  Eric Botcazou  

* cprop.c: Adjust outdated comments throughout.
(hash_scan_set): Rename PAT parameter into SET.
(cprop_pavloc): Rename into...
(cprop_avloc): ...this.
(cprop_absaltered): Rename into...
(cprop_kill): ...this.
(alloc_cprop_mem): Adjust for above renaming.
(free_cprop_mem): Likewise.
(compute_cprop_data): Likewise.
(compute_local_properties): Rename TRANSP parameter into KILL and
adjust throughout.  Rework comments.
(try_replace_reg): Fix long line.
(cprop_jump): Likewise.
(constprop_register): Fix prototype and take INSN last.
(cprop_insn): Adjust calls to above function.  Fix long lines.
(bypass_block): Likewise.
(one_cprop_pass): Likewise.


-- 
Eric Botcazou
Index: cprop.c
===
--- cprop.c	(revision 181267)
+++ cprop.c	(working copy)
@@ -69,7 +69,7 @@ typedef struct occr *occr_t;
 DEF_VEC_P (occr_t);
 DEF_VEC_ALLOC_P (occr_t, heap);
 
-/* Hash table entry for an assignment expressions.  */
+/* Hash table entry for assignment expressions.  */
 
 struct expr
 {
@@ -83,8 +83,8 @@ struct expr
   struct expr *next_same_hash;
   /* List of available occurrence in basic blocks in the function.
  An "available occurrence" is one that is the last occurrence in the
- basic block and the operands are not modified by following statements in
- the basic block [including this insn].  */
+ basic block and whose operands are not modified by following statements
+ in the basic block [including this insn].  */
   struct occr *avail_occr;
 };
 
@@ -136,7 +136,6 @@ static int local_copy_prop_count;
 static int global_const_prop_count;
 /* Number of global copies propagated.  */
 static int global_copy_prop_count;
-
 
 #define GOBNEW(T)		((T *) cprop_alloc (sizeof (T)))
 #define GOBNEWVAR(T, S)		((T *) cprop_alloc ((S)))
@@ -256,14 +255,13 @@ cprop_constant_p (const_rtx x)
   return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x));
 }
 
-/* Scan pattern PAT of INSN and add an entry to the hash TABLE (set or
-   expression one).  */
+/* Scan SET present in INSN and add an entry to the hash TABLE.  */
 
 static void
-hash_scan_set (rtx pat, rtx insn, struct hash_table_d *table)
+hash_scan_set (rtx set, rtx insn, struct hash_table_d *table)
 {
-  rtx src = SET_SRC (pat);
-  rtx dest = SET_DEST (pat);
+  rtx src = SET_SRC (set);
+  rtx dest = SET_DEST (set);
 
   if (REG_P (dest)
   && ! HARD_REGISTER_P (dest)
@@ -288,7 +286,7 @@ hash_scan_set (rtx pat, rtx insn, struct
 	  && REG_NOTE_KIND (note) == REG_EQUAL
 	  && !REG_P (src)
 	  && cprop_constant_p (XEXP (note, 0)))
-	src = XEXP (note, 0), pat = gen_rtx_SET (VOIDmode, dest, src);
+	src = XEXP (note, 0), set = gen_rtx_SET (VOIDmode, dest, src);
 
   /* Record sets for constant/copy propagation.  */
   if ((REG_P (src)
@@ -300,16 +298,7 @@ hash_scan_set (rtx pat, rtx insn, struct
 }
 }
 
-/* Process INSN and add hash table entries as appropriate.
-
-   Only available expressions that set a single pseudo-reg are recorded.
-
-   Single sets in a PARALLEL could be handled, but it's an extra complication
-   that isn't dealt with right now.  The trick is handling the CLOBBERs that
-   are also in the PARALLEL.  Later.
-
-   If SET_P is nonzero, this is for the assignment hash table,
-   otherwise it is for the expression hash table.  */
+/* Process INSN and add hash table entries as appropriate.  */
 
 static void
 hash_scan_insn (rtx insn, struct hash_table_d *table)
@@ -332,6 +321,8 @@ hash_scan_insn (rtx insn, struct hash_ta
   }
 }
 
+/* Dump the hash table TABLE to file FILE under the name NAME.  */
+
 static void
 dump_hash_table (FILE *file, const char *name, struct hash_table_d *table)
 {
@@ -373,6 +364,7 @@ dump_hash_table (FILE *file, const char
 }
 
 /* Record as unavailable all registers that are DEF operands of INSN.  */
+
 static void
 make_set_regs_unavailable (rtx insn)
 {
@@ -383,7 +375,7 @@ make_set_regs_unavailable (rtx insn)
 SET_REGNO_REG_SET (reg_set_bitmap, DF_REF_REGNO (*def_rec));
 }
 
-/* Top level function to create an assignments hash table.
+/* Top level function to create an assignment hash table.
 
Assignment entries are placed in the hash table if
- they are of the form (set (pseudo-reg) src),
@@ -541,13 +533,12 @@ mark_oprs_set (rtx insn)
   for (def_rec = DF_INSN_INFO_DEFS (insn_info); *def_rec; def_rec++)
 SET_REG

Re: [PATCH RFA] rtl-optimization/PR50663, conditional propagation missed in cprop.c pass

2011-11-11 Thread Eric Botcazou
> 2011-11-07  Bin Cheng  
>
>   PR rtl-optimization/50663
>   * cprop.c (bb_implicit): New global variable.
>   (insert_set_in_table): Add additional parameter, record implicit set
>   info. 
>   (hash_scan_set): Add additional parameter.
>   (compute_hash_table_work): And
>   (hash_scan_insn): Pass implicit to hash_scan_set.
>   (compute_cprop_data): Add implicit set to AVIN of block which the
>   implicit set is recorded for.
>   (one_cprop_pass): Handle bb_implicit array.

[80 columns at most for ChangeLog entries as well]

The patch is OK with the following changes:

@@ -116,6 +116,10 @@
 /* Array of implicit set patterns indexed by basic block index.  */
 static rtx *implicit_sets;
 
+/* Array of bitmap_index of corresponding implicit set, indexed by
+   basic block index.  */
+static int *bb_implicit;

A better name is implicit_set_indexes:

/* Array of indexes of expressions for implicit set patterns indexed by basic
   block index.  In other words, implicit_set_indexes[i] is the bitmap_index
   of the expression whose RTX is implicit_sets[i].  */
static int *implicit_set_indexes;


+  /* Record bitmap_index of the implicit set in bb_implicit.  */
+  if (implicit)
+bb_implicit[BLOCK_FOR_INSN(cur_occr->insn)->index] =
+  cur_expr->bitmap_index;

cur_occr->insn is just insn.


+  /* Merge implicit set into CPROP_AVIN. There are always
+ available at the entry of corresponding basic block.  */

"...implicit sets into CPROP_AVIN.  They are..."

+  FOR_EACH_BB (bb)
+{
+  int index = bb_implicit[bb->index];
+  if (index != -1)
+   SET_BIT (cprop_avin[bb->index], (unsigned int)index);

The cast is superfluous.

I think that an explanation as to why we need to do this is in order (after 
all, this went unnoticed until now) along the lines of: "We need to do this 
because 1) implicit sets aren't recorded for the local pass so they cannot
be propagated within their basic block by this pass and 2) the global pass 
would otherwise propagate them only in the successors of their basic block."
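
The merge step discussed above can be pictured with a toy model.  This is a
sketch under stated assumptions, not the cprop.c code: the arrays are
hypothetical stand-ins, a plain 64-bit mask replaces GCC's sbitmap, and
N_BLOCKS is an arbitrary size chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCKS 4

/* Toy stand-ins for the arrays discussed above.  implicit_set_indexes[i]
   is the bit index of the implicit set recorded for block i, or -1 if
   the block has none.  avin[i] is the available-in set of block i.  */
static int implicit_set_indexes[N_BLOCKS];
static uint64_t avin[N_BLOCKS];

/* Mark the implicit set of each block as available on entry to that
   block, leaving blocks without an implicit set untouched.  */
static void
merge_implicit_sets (void)
{
  int bb;

  for (bb = 0; bb < N_BLOCKS; bb++)
    {
      int index = implicit_set_indexes[bb];

      if (index != -1)
        {
          assert (index >= 0 && index < 64);
          avin[bb] |= UINT64_C (1) << index;
        }
    }
}
```

The point of the model is only that the bit is set in the block the implicit
set is recorded for, rather than being left to reach its successors.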

Btw, you'll need to slightly adjust the patch because of my changes to cprop.c.

Thanks for investigating and addressing this issue.

-- 
Eric Botcazou


Re: [PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074, take 2)

2011-11-11 Thread Richard Henderson
On 11/11/2011 07:24 AM, Jakub Jelinek wrote:
>   PR tree-optimization/51074
>   * fold-const.c (vec_cst_ctor_to_array, fold_vec_perm): New functions.
>   (fold_binary_loc): Handle VEC_EXTRACT_EVEN_EXPR,
>   VEC_EXTRACT_ODD_EXPR, VEC_INTERLEAVE_HIGH_EXPR and
>   VEC_INTERLEAVE_LOW_EXPR with VECTOR_CST or CONSTRUCTOR operands.
>   (fold_ternary_loc): Handle VEC_PERM_EXPR with VECTOR_CST or
>   CONSTRUCTOR operands.
>   * tree-ssa-propagate.c (valid_gimple_rhs_p): Handle ternary
>   expressions.
>   * tree-vect-generic.c (lower_vec_perm): Mask sel_int elements
>   to 0 .. 2 * elements - 1.

Ok, except

> +elts[i]
> +  = fold_convert (TREE_TYPE (TREE_TYPE (arg)), integer_zero_node);

build_int_cst.


r~


Re: SPU build broken (Re: CFT: [build] Move libgcc2 to toplevel libgcc)

2011-11-11 Thread Rainer Orth
Paolo Bonzini  writes:

> On 11/11/2011 05:23 PM, Rainer Orth wrote:
>> The trivial patch allowed a x86_64-unknown-linux-gnu x spu-elf cross to
>> finish the libgcc build, and at least the set of objects built before my
>> patch series is identical to the set built now.
>>
>> Ok for mainline?
>
> Ok.  Have you checked for other occurrences?

Yup.  LIB2FUNCS_EXCLUDE only lists function names, and LIB2ADD* only
lists filenames with supported extensions, all of which reference
$(srcdir) with the exception of cris/t-cris and m68k/t-floatlib, where
the sources are generated at build time.

This has been a visual inspection since grep cannot easily deal with the
continuation lines.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074, take 2)

2011-11-11 Thread Jakub Jelinek
On Fri, Nov 11, 2011 at 08:36:36AM -0800, Richard Henderson wrote:
> Ok, except
> 
> > +elts[i]
> > +  = fold_convert (TREE_TYPE (TREE_TYPE (arg)), integer_zero_node);
> 
> build_int_cst.

That would work for integer modes only, but here the type can be REAL_TYPE
too.  I think fold_convert from integer_zero_node to any type is the most
common idiom to create zero INTEGER_CSTs/COMPLEX_CSTs/REAL_CSTs/FIXED_CSTs
of any type.

Jakub


[PATCH, alpha]: Restore bootstrap, broken due to config/elfos.h renames.

2011-11-11 Thread Uros Bizjak
Hello!

2011-11-11  Uros Bizjak  

* config/alpha/elf.h (ELF_ASCII_ESCAPES): Rename from ESCAPES.
(ELF_STRING_LIMIT): Rename from STRING_LIMIT.

Tested on alphaev68-pc-linux-gnu, committed to mainline SVN.

Uros.
Index: elf.h
===
--- elf.h   (revision 181283)
+++ elf.h   (working copy)
@@ -344,8 +344,8 @@
the i386) don't know about that.  Also, we don't use \v
since some versions of gas, such as 2.2 did not accept it.  */
 
-#undef  ESCAPES
-#define ESCAPES \
+#undef  ELF_ASCII_ESCAPES
+#define ELF_ASCII_ESCAPES \
 "\1\1\1\1\1\1\1\1btn\1fr\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\
 \0\0\"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\\\0\0\0\
@@ -366,8 +366,9 @@
If your target assembler doesn't support the .string directive, you
should define this to zero.  */
 
-#undef  STRING_LIMIT
-#define STRING_LIMIT   ((unsigned) 256)
+#undef  ELF_STRING_LIMIT
+#define ELF_STRING_LIMIT   ((unsigned) 256)
+
 #undef  STRING_ASM_OP
 #define STRING_ASM_OP  "\t.string\t"
 


Re: SPU build broken (Re: CFT: [build] Move libgcc2 to toplevel libgcc)

2011-11-11 Thread Ulrich Weigand
Rainer Orth wrote:

> "Ulrich Weigand"  writes:
> > Shouldn't the variable still be called LIB2FUNCS_EXCLUDE after the
> > move to libgcc?  LIB2ADD seems to expect full file names ...
> 
> Of course, the change is bogus.  I can only (half) explain this by the
> change from LIB2FUNCS_STATIC_EXTRA to LIB2ADD_ST extra.
> 
> The trivial patch allowed a x86_64-unknown-linux-gnu x spu-elf cross to
> finish the libgcc build, and at least the set of objects built before my
> patch series is identical to the set built now.

Thanks, this patch fixes the SPU build for me.  Of course, there are
currently a large number of testsuite failures, but those seem to be
due to unrelated problems (e.g. the "objects death" issue) ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: Mark objects death@end of scope

2011-11-11 Thread Michael Matz
Hi,

On Fri, 11 Nov 2011, Ulrich Weigand wrote:

> One reason why this happens is that the unwind*.c files are specifically 
> built with -fexceptions.  I think this is for the benefit of the DWARF 
> unwinder, to ensure CFI records are available for those routines.

Except for the routines that start the backtracing (e.g.
_Unwind_RaiseException) I don't see how descriptors for them are useful.  
But even those use local variables that would be handled by the scope-end 
clobbers.  Hmpf.  Why does the sjlj unwinder go into an endless loop, and
only in _Resume, not already in the first phase (i.e. from
_RaiseException), which also iterates over the backtrace?

If we can't fix the sjlj unwinder to cope with this situation I don't see
much choice other than implementing a command line flag disabling the
clobbers and using that for compiling the unwinder :-/


Ciao,
Michael.


Re: [PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074, take 2)

2011-11-11 Thread Richard Henderson
On 11/11/2011 08:41 AM, Jakub Jelinek wrote:
> On Fri, Nov 11, 2011 at 08:36:36AM -0800, Richard Henderson wrote:
>> Ok, except
>>
>>> +elts[i]
>>> +  = fold_convert (TREE_TYPE (TREE_TYPE (arg)), integer_zero_node);
>>
>> build_int_cst.
> 
> That would work for integer modes only, but here the type can be REAL_TYPE
> too.  I think fold_convert from integer_zero_node to any type is the most
> common idiom to create zero INTEGER_CSTs/COMPLEX_CSTs/REAL_CSTs/FIXED_CSTs
> of any type.

Ah, right.  Patch is ok as-is.


r~


Re: [PATCH] Don't ICE on SLP calls if the same call is used in multiple SLP instances (PR tree-optimization/51058)

2011-11-11 Thread Ira Rosen
On 11 November 2011 17:32, Jakub Jelinek  wrote:
> Hi!

Hi,

>
> Removing the scalar call in vectorizable_call for SLP vectorization
> is too early, when another SLP instance refers to the same scalar call,
> we'll ICE because that stmt doesn't have bb anymore or gsi_for_stmt
> doesn't succeed for it.
>
> Fixed by postponing replacement of calls with zeroing of lhs for later
> in the SLP case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2011-11-11  Jakub Jelinek  
>
>        PR tree-optimization/51058
>        * tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
>        (vect_schedule_slp): Call it.
>        * tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
>        don't replace scalar calls with clearing of their lhs here.

I think it's rhs.

>
>        * gcc.dg/vect/fast-math-vect-call-1.c: Add f4 test.
>        * gfortran.fortran-torture/compile/pr51058.f90: New test.
>
> --- gcc/tree-vect-slp.c.jj      2011-11-10 18:09:12.0 +0100
> +++ gcc/tree-vect-slp.c 2011-11-11 13:18:42.157292895 +0100
> @@ -2898,6 +2898,46 @@ vect_schedule_slp_instance (slp_tree nod
>   return is_store;
>  }
>
> +/* Replace scalar calls from SLP node NODE with clearing of their lhs.

Here too.

> +   For loop vectorization this is done in vectorizable_call, but for SLP
> +   it needs to be deferred until end of vect_schedule_slp, because multiple
> +   SLP instances may refer to the same scalar stmt.  */
> +
> +static void
> +vect_remove_slp_scalar_calls (slp_tree node)
> +{

...

> --- gcc/testsuite/gfortran.fortran-torture/compile/pr51058.f90.jj       
> 2011-11-11 13:26:14.665615842 +0100
> +++ gcc/testsuite/gfortran.fortran-torture/compile/pr51058.f90  2011-11-11 
> 13:25:50.0 +0100
> @@ -0,0 +1,18 @@
> +! PR tree-optimization/51058
> +! { dg-do compile }
> +subroutine pr51058(n, u, v, w, z)
> +  double precision :: x(3,-2:16384), y(3,-2:16384), b, u, v, w, z
> +  integer :: i, n
> +  common /c/ x, y
> +  do i = 1, n
> +    b = u * int(x(1,i)) + sign(z,x(1,i))
> +    x(1,i) = x(1,i) - b
> +    y(1,i) = y(1,i) - b
> +    b = v * int(x(2,i)) + sign(z,x(2,i))
> +    x(2,i) = x(2,i) - b
> +    y(2,i) = y(2,i) - b
> +    b = w * int(x(3,i)) + sign(z,x(3,i))
> +    x(3,i) = x(3,i) - b
> +    y(3,i) = y(3,i) - b
> +  end do
> +end subroutine

Please add
! { dg-final { cleanup-tree-dump "vect" } }


OK otherwise.

Thanks,
Ira

>
>        Jakub
>


Re: [PATCH] Don't ICE on SLP calls if the same call is used in multiple SLP instances (PR tree-optimization/51058)

2011-11-11 Thread Jakub Jelinek
On Fri, Nov 11, 2011 at 06:57:58PM +0200, Ira Rosen wrote:
> On 11 November 2011 17:32, Jakub Jelinek  wrote:
> > 2011-11-11  Jakub Jelinek  
> >
> >        PR tree-optimization/51058
> >        * tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
> >        (vect_schedule_slp): Call it.
> >        * tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
> >        don't replace scalar calls with clearing of their lhs here.
> 
> I think it's rhs.

I think it is lhs.  The scalar call is
  lhs = __builtin_copysign (arg1, arg2);
etc. and we transform it to
  lhs = 0.0;

> > --- gcc/testsuite/gfortran.fortran-torture/compile/pr51058.f90.jj       
> > 2011-11-11 13:26:14.665615842 +0100
> > +++ gcc/testsuite/gfortran.fortran-torture/compile/pr51058.f90  2011-11-11 
> > 13:25:50.0 +0100
> > @@ -0,0 +1,18 @@
> > +! PR tree-optimization/51058
> > +! { dg-do compile }
> > +subroutine pr51058(n, u, v, w, z)
> > +  double precision :: x(3,-2:16384), y(3,-2:16384), b, u, v, w, z
> > +  integer :: i, n
> > +  common /c/ x, y
> > +  do i = 1, n
> > +    b = u * int(x(1,i)) + sign(z,x(1,i))
> > +    x(1,i) = x(1,i) - b
> > +    y(1,i) = y(1,i) - b
> > +    b = v * int(x(2,i)) + sign(z,x(2,i))
> > +    x(2,i) = x(2,i) - b
> > +    y(2,i) = y(2,i) - b
> > +    b = w * int(x(3,i)) + sign(z,x(3,i))
> > +    x(3,i) = x(3,i) - b
> > +    y(3,i) = y(3,i) - b
> > +  end do
> > +end subroutine
> 
> Please add
> ! { dg-final { cleanup-tree-dump "vect" } }
> 
> OK otherwise.

This is not a /vect/ testcase, but fortran torture.  I guess
if you really want I could move it over to gfortran.dg/vect/ instead,
then the ! { dg-final { cleanup-tree-dump "vect" } }
would be indeed needed there.

Jakub


Re: [PATCH] Don't ICE on SLP calls if the same call is used in multiple SLP instances (PR tree-optimization/51058)

2011-11-11 Thread Jakub Jelinek
On Fri, Nov 11, 2011 at 06:06:18PM +0100, Jakub Jelinek wrote:
> > Please add
> > ! { dg-final { cleanup-tree-dump "vect" } }
> > 
> > OK otherwise.
> 
> This is not a /vect/ testcase, but fortran torture.  I guess
> if you really want I could move it over to gfortran.dg/vect/ instead,
> then the ! { dg-final { cleanup-tree-dump "vect" } }
> would be indeed needed there.

That would be the following, incrementally tested with
make check-gfortran RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} vect.exp=pr51*'
with both unpatched and patched f951.

2011-11-11  Jakub Jelinek  

PR tree-optimization/51058
* tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
(vect_schedule_slp): Call it.
* tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
don't replace scalar calls with clearing of their lhs here.

* gcc.dg/vect/fast-math-vect-call-1.c: Add f4 test.
* gfortran.dg/vect/pr51058-2.f90: New test.

--- gcc/tree-vect-slp.c.jj  2011-11-11 16:02:40.475359160 +0100
+++ gcc/tree-vect-slp.c 2011-11-11 18:08:29.784708271 +0100
@@ -2902,6 +2902,46 @@ vect_schedule_slp_instance (slp_tree nod
   return is_store;
 }
 
+/* Replace scalar calls from SLP node NODE with clearing of their lhs.
+   For loop vectorization this is done in vectorizable_call, but for SLP
+   it needs to be deferred until end of vect_schedule_slp, because multiple
+   SLP instances may refer to the same scalar stmt.  */
+
+static void
+vect_remove_slp_scalar_calls (slp_tree node)
+{
+  gimple stmt, new_stmt;
+  gimple_stmt_iterator gsi;
+  int i;
+  slp_void_p child;
+  tree lhs;
+  stmt_vec_info stmt_info;
+
+  if (!node)
+return;
+
+  FOR_EACH_VEC_ELT (slp_void_p, SLP_TREE_CHILDREN (node), i, child)
+vect_remove_slp_scalar_calls ((slp_tree) child);
+
+  FOR_EACH_VEC_ELT (gimple, SLP_TREE_SCALAR_STMTS (node), i, stmt)
+{
+  if (!is_gimple_call (stmt) || gimple_bb (stmt) == NULL)
+   continue;
+  stmt_info = vinfo_for_stmt (stmt);
+  if (stmt_info == NULL
+ || is_pattern_stmt_p (stmt_info)
+ || !PURE_SLP_STMT (stmt_info))
+   continue;
+  lhs = gimple_call_lhs (stmt);
+  new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
+  set_vinfo_for_stmt (new_stmt, stmt_info);
+  set_vinfo_for_stmt (stmt, NULL);
+  STMT_VINFO_STMT (stmt_info) = new_stmt;
+  gsi = gsi_for_stmt (stmt);
+  gsi_replace (&gsi, new_stmt, false);
+  SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
+}
+}
 
 /* Generate vector code for all SLP instances in the loop/basic block.  */
 
@@ -2941,6 +2981,8 @@ vect_schedule_slp (loop_vec_info loop_vi
   unsigned int j;
   gimple_stmt_iterator gsi;
 
+  vect_remove_slp_scalar_calls (root);
+
   for (j = 0; VEC_iterate (gimple, SLP_TREE_SCALAR_STMTS (root), j, store)
   && j < SLP_INSTANCE_GROUP_SIZE (instance); j++)
 {
--- gcc/tree-vect-stmts.c.jj  2011-11-11 16:02:28.343433924 +0100
+++ gcc/tree-vect-stmts.c   2011-11-11 18:08:29.786708241 +0100
@@ -1886,6 +1886,9 @@ vectorizable_call (gimple stmt, gimple_s
  it defines is mapped to the new definition.  So just replace
  rhs of the statement with something harmless.  */
 
+  if (slp_node)
+return true;
+
   type = TREE_TYPE (scalar_dest);
   if (is_pattern_stmt_p (stmt_info))
 lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info));
@@ -1893,8 +1896,7 @@ vectorizable_call (gimple stmt, gimple_s
 lhs = gimple_call_lhs (stmt);
   new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
   set_vinfo_for_stmt (new_stmt, stmt_info);
-  if (!slp_node)
-set_vinfo_for_stmt (stmt, NULL);
+  set_vinfo_for_stmt (stmt, NULL);
   STMT_VINFO_STMT (stmt_info) = new_stmt;
   gsi_replace (gsi, new_stmt, false);
   SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
--- gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c.jj  2011-11-11 16:02:28.026435857 +0100
+++ gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c   2011-11-11 18:08:29.792708207 +0100
@@ -38,6 +38,18 @@ f3 (void)
 a[i] = copysignf (b[i], c[i]) + 1.0f + sqrtf (d[i]);
 }
 
+__attribute__((noinline, noclone)) void
+f4 (int n)
+{
+  int i;
+  for (i = 0; i < 2 * n; i++)
+{
+  a[3 * i + 0] = copysignf (b[3 * i + 0], c[3 * i + 0]) + 1.0f + sqrtf (d[3 * i + 0]);
+  a[3 * i + 1] = copysignf (b[3 * i + 1], c[3 * i + 1]) + 2.0f + sqrtf (d[3 * i + 1]);
+  a[3 * i + 2] = copysignf (b[3 * i + 2], c[3 * i + 2]) + 3.0f + sqrtf (d[3 * i + 2]);
+}
+}
+
 __attribute__((noinline, noclone)) int
 main1 ()
 {
@@ -66,6 +78,12 @@ main1 ()
   for (i = 0; i < 64; i++)
 if (fabsf (((i & 2) ? -4 * i : 4 * i) + 1 + i - a[i]) >= 0.0001f)
   abort ();
+else
+  a[i] = 131.25;
+  f4 (10);
+  for (i = 0; i < 60; i++)
+if (fabsf (((i & 2) ? -4 * i : 4 * i) + 1 + (i % 3) + i - a[i]) >= 0.0001f)
+  abort ();
   return 0;
 }
 
@@ -76,6 +94,6 @@ main ()
 

Re: Mark objects death@end of scope

2011-11-11 Thread Ulrich Weigand
Michael Matz wrote:
> On Fri, 11 Nov 2011, Ulrich Weigand wrote:
> 
> > One reason why this happens is that the unwind*.c files are specifically 
> > built with -fexceptions.  I think this is for the benefit of the DWARF 
> > unwinder, to ensure CFI records are available for those routines.
> 
> Except for the routines that start the backtracing (e.g.
> _Unwind_RaiseException) I don't see how descriptors for them are useful.  
> But even those use local variables that would be handled by the scope-end 
> clobbers.  Hmpf.  Why does the sjlj unwinder go into an endless loop, and 
> only in _Resume, not already in the first phase (i.e. from 
> _RaiseException), which also iterates over the backtrace.

I haven't fully debugged it yet, but it seems to be related to the linked
list of unwind contexts that are maintained by the SjLj logic.  During
unwinding, those are pulled off the list one by one; it seems the routines
that do that don't expect that new contexts for the _Unwind routines
themselves are being implicitly pushed onto that list while the unwinding
happens ...
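
The hypothesis above can be pictured with a toy model.  This is only a
sketch of the suspected mechanism, not the libgcc SjLj unwinder: every name
below is hypothetical, and the loop is capped by max_steps so that the
non-terminating case can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: one registered context per active frame.  */
struct context
{
  struct context *prev;
};

static struct context *context_list;

/* Push context C onto the list.  */
static void
register_context (struct context *c)
{
  c->prev = context_list;
  context_list = c;
}

/* Pop the most recently registered context.  */
static void
unregister_context (void)
{
  assert (context_list != NULL);
  context_list = context_list->prev;
}

/* Pop contexts until the list is empty or MAX_STEPS is reached.  If
   SELF_REGISTERS is nonzero, each step first pushes a context of its
   own, modelling an unwinder routine that registers itself while it
   runs.  Return the number of steps taken.  */
static int
drain_contexts (int self_registers, int max_steps)
{
  static struct context own;
  int steps = 0;

  while (context_list != NULL && steps < max_steps)
    {
      if (self_registers)
        register_context (&own);
      unregister_context ();
      steps++;
    }
  return steps;
}
```

With self_registers set, each step pops only the context it has just pushed,
so the original list is never shortened and only the step cap ends the loop.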

> If we can't fix the sjlj unwinder to cope with this situation I don't see 
> much choice than implementing a command line flag disabling the clobbers 
> and use that for compiling the unwinder :-/

I guess one attempt might be to build the unwinder files with
-funwind-tables instead of -fexceptions ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: Mark objects death@end of scope

2011-11-11 Thread Michael Matz
Hi,

On Fri, 11 Nov 2011, Ulrich Weigand wrote:

> I haven't fully debugged it yet, but it seems to be related to the 
> linked list of unwind contexts that are maintained by the SjLj logic.  
> During unwinding, those are pulled off the list one by one; it seems the 
> routines that do that don't expect that new contexts for the _Unwind 
> routines themselves are being implicitly pushed onto that list while the 
> unwinding happens ...
> 
> > If we can't fix the sjlj unwinder to cope with this situation I don't 
> > see much choice than implementing a command line flag disabling the 
> > clobbers and use that for compiling the unwinder :-/
> 
> I guess one attempt might be to build the unwinder files with 
> -funwind-tables instead of -fexceptions ...

Hmm, that could work.  Or marking all routines that the unwinder calls as 
nothrow.


Ciao,
Michael.


Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-11 Thread Benjamin Kosnik

> > I just realized I may be feeding you an inconsistent
> > configuration, see the atomicity stuff in
> > libstdc++-v3/config/cpu/cris.  Is that just obsolete and unused
> > now or what do I need to add for that to work?
> >
> 
> You don't need to do anything there. I think that atomicity stuff
> will soon be obsolete, but bkoz will have to answer that question.
> It looks to me like those were some gnu atomic extensions which predate 
> atomic support in the standard.  In theory, that would all be able to
> go away or be integrated into the gcc machine description with the
> modern patterns, if it's not already there.

That is very obsolete and should be removed as it's now just
causing confusion. That was pre-builtin days, let's call this
atomics-try-1. So, all:

config/cpu/*/atomicity.h
./i486/atomicity.h
./i386/atomicity.h
./sparc/atomicity.h
./sh/atomicity.h
./m68k/atomicity.h
./cris/atomicity.h
./hppa/atomicity.h
./generic/atomicity_mutex/atomicity.h
./generic/atomicity_builtins/atomicity.h

ATOMICITY_SRCDIR
ATOMIC_WORD_SRCDIR
ATOMIC_FLAGS

Should go. I'll look into peeling off this cruft sharpish.
 
> bkoz: As relates to the existing problem, how is the legacy support 
> invoked in compatibility-atomic-c++0x.cc?  That has the old style 
> implementation of atomic_flag with a lock, which would allow this
> target to compile...  which is another option perhaps.  or is that
> purely for previous releases somehow?

compatibility-atomic-c++0x.cc is the support for the previous builtins
attempt, let's call this atomics-try-2. We need to keep these symbols
exported and doing the same thing they did in previous releases. (Or else
abi-check will fail.)

If this system used to use a lock to work, then that is what it should
still do. The behavior shouldn't change.

 -benjamin



Re: [PATCH] Don't ICE on SLP calls if the same call is used in multiple SLP instances (PR tree-optimization/51058)

2011-11-11 Thread Ira Rosen
On 11 November 2011 19:06, Jakub Jelinek  wrote:
> On Fri, Nov 11, 2011 at 06:57:58PM +0200, Ira Rosen wrote:
>> On 11 November 2011 17:32, Jakub Jelinek  wrote:
>> > 2011-11-11  Jakub Jelinek  
>> >
>> >PR tree-optimization/51058
>> >* tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
>> >(vect_schedule_slp): Call it.
>> >* tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
>> >don't replace scalar calls with clearing of their lhs here.
>>
>> I think it's rhs.
>
> I think it is lhs.  The scalar call is
>  lhs = __builtin_copysign (arg1, arg2);
> etc. and we transform it to
>  lhs = 0.0;

I still think it's clearing of rhs, but this is not really important :)

On 11 November 2011 19:13, Jakub Jelinek  wrote:
> On Fri, Nov 11, 2011 at 06:06:18PM +0100, Jakub Jelinek wrote:
>> > Please add
>> > ! { dg-final { cleanup-tree-dump "vect" } }
>> >
>> > OK otherwise.
>>
>> This is not a /vect/ testcase, but fortran torture.

Ah, sorry, indeed I thought it's a vect test.

>> I guess
>> if you really want I could move it over to gfortran.dg/vect/ instead,
>> then the ! { dg-final { cleanup-tree-dump "vect" } }
>> would be indeed needed there.
>
> That would be following, incrementally tested with
> make check-gfortran RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
> vect.exp=pr51*'
> with both unpatched and patched f951.
>
> 2011-11-11  Jakub Jelinek  
>
>        PR tree-optimization/51058
>        * tree-vect-slp.c (vect_remove_slp_scalar_calls): New function.
>        (vect_schedule_slp): Call it.
>        * tree-vect-stmts.c (vectorizable_call): If slp_node != NULL,
>        don't replace scalar calls with clearing of their lhs here.
>
>        * gcc.dg/vect/fast-math-vect-call-1.c: Add f4 test.
>        * gfortran.dg/vect/pr51058-2.f90: New test.

Looks good.

Thanks,
Ira

>
> --- gcc/tree-vect-slp.c.jj      2011-11-11 16:02:40.475359160 +0100
> +++ gcc/tree-vect-slp.c 2011-11-11 18:08:29.784708271 +0100
> @@ -2902,6 +2902,46 @@ vect_schedule_slp_instance (slp_tree nod
>   return is_store;
>  }
>
> +/* Replace scalar calls from SLP node NODE with clearing of their lhs.
> +   For loop vectorization this is done in vectorizable_call, but for SLP
> +   it needs to be deferred until end of vect_schedule_slp, because multiple
> +   SLP instances may refer to the same scalar stmt.  */
> +
> +static void
> +vect_remove_slp_scalar_calls (slp_tree node)
> +{
> +  gimple stmt, new_stmt;
> +  gimple_stmt_iterator gsi;
> +  int i;
> +  slp_void_p child;
> +  tree lhs;
> +  stmt_vec_info stmt_info;
> +
> +  if (!node)
> +    return;
> +
> +  FOR_EACH_VEC_ELT (slp_void_p, SLP_TREE_CHILDREN (node), i, child)
> +    vect_remove_slp_scalar_calls ((slp_tree) child);
> +
> +  FOR_EACH_VEC_ELT (gimple, SLP_TREE_SCALAR_STMTS (node), i, stmt)
> +    {
> +      if (!is_gimple_call (stmt) || gimple_bb (stmt) == NULL)
> +       continue;
> +      stmt_info = vinfo_for_stmt (stmt);
> +      if (stmt_info == NULL
> +         || is_pattern_stmt_p (stmt_info)
> +         || !PURE_SLP_STMT (stmt_info))
> +       continue;
> +      lhs = gimple_call_lhs (stmt);
> +      new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
> +      set_vinfo_for_stmt (new_stmt, stmt_info);
> +      set_vinfo_for_stmt (stmt, NULL);
> +      STMT_VINFO_STMT (stmt_info) = new_stmt;
> +      gsi = gsi_for_stmt (stmt);
> +      gsi_replace (&gsi, new_stmt, false);
> +      SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
> +    }
> +}
>
>  /* Generate vector code for all SLP instances in the loop/basic block.  */
>
> @@ -2941,6 +2981,8 @@ vect_schedule_slp (loop_vec_info loop_vi
>       unsigned int j;
>       gimple_stmt_iterator gsi;
>
> +      vect_remove_slp_scalar_calls (root);
> +
>       for (j = 0; VEC_iterate (gimple, SLP_TREE_SCALAR_STMTS (root), j, store)
>                   && j < SLP_INSTANCE_GROUP_SIZE (instance); j++)
>         {
> --- gcc/tree-vect-stmts.c.jj    2011-11-11 16:02:28.343433924 +0100
> +++ gcc/tree-vect-stmts.c       2011-11-11 18:08:29.786708241 +0100
> @@ -1886,6 +1886,9 @@ vectorizable_call (gimple stmt, gimple_s
>      it defines is mapped to the new definition.  So just replace
>      rhs of the statement with something harmless.  */
>
> +  if (slp_node)
> +    return true;
> +
>   type = TREE_TYPE (scalar_dest);
>   if (is_pattern_stmt_p (stmt_info))
>     lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info));
> @@ -1893,8 +1896,7 @@ vectorizable_call (gimple stmt, gimple_s
>     lhs = gimple_call_lhs (stmt);
>   new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
>   set_vinfo_for_stmt (new_stmt, stmt_info);
> -  if (!slp_node)
> -    set_vinfo_for_stmt (stmt, NULL);
> +  set_vinfo_for_stmt (stmt, NULL);
>   STMT_VINFO_STMT (stmt_info) = new_stmt;
>   gsi_replace (gsi, new_stmt, false);
>   SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
> --- gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c.jj        2011-11

Re: [Patch, Fortran] PR 51073: fix for zero-sized coarray arrays

2011-11-11 Thread Steve Kargl
On Fri, Nov 11, 2011 at 03:24:45PM +0100, Tobias Burnus wrote:
> 
> The patch does the same we do for nonstatic variables: It allocates a 
> single byte in this case; as it is done in the front-end and as it is a 
> compile-time constant, there is no performance problem ;-)

What about memory pressure?   (I'm joking ;-)

> Build and regtested on x86-64-Linux.
> OK for the trunk?

OK.

-- 
Steve


[PATCH, testsuite]: Do not run simulate-thread on alpha*-*-linux*

2011-11-11 Thread Uros Bizjak
Hello!

For some reason, single-stepping an executable between the ldl_l and
stl_c insns in gdb [1] breaks LL/SC chaining, so atomic operations never
finish. This calls for a gdb bug report.
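
For readers unfamiliar with this failure mode, here is a minimal sketch in portable C11. It is not taken from the testsuite, and the function name is made up for illustration. On LL/SC targets a weak compare-and-swap is typically built from a load-locked/store-conditional pair, and the surrounding loop retries whenever the store-conditional fails. If something makes the store-conditional fail on every attempt, as single-stepping apparently does here, the loop never exits.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically add DELTA to *P and return the old value.
   The weak compare-exchange may fail spuriously (on LL/SC targets, when the
   store-conditional loses its reservation), so it sits in a retry loop.  */
int
fetch_add_retry (atomic_int *p, int delta)
{
  int old = atomic_load_explicit (p, memory_order_relaxed);

  /* On failure OLD is refreshed with the current value and we try again.  */
  while (!atomic_compare_exchange_weak_explicit (p, &old, old + delta,
                                                 memory_order_seq_cst,
                                                 memory_order_relaxed))
    ;

  return old;
}
```

Run normally, the loop usually finishes on the first or second iteration; the hang only shows up when every attempt is interrupted.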

Also, the dejagnu timeout didn't trigger during an unattended testsuite
run, and the gdb session wrote a huge amount of information to the
testsuite log, so altogether this was quite devastating to disk
space...

We can simply claim that gdb on alpha*-*-linux* is unusable for the
purpose of simulate-thread tests.

2011-11-11  Uros Bizjak  

* lib/gcc-simulate-thread.exp (simulate-thread): Do not run on
alpha*-*-linux* targets.

Tested on alphaev68-pc-linux-gnu.

OK for mainline SVN?

[1] GNU gdb (Gentoo 7.3.1 p1) 7.3.1

Uros.
Index: lib/gcc-simulate-thread.exp
===
--- lib/gcc-simulate-thread.exp (revision 181284)
+++ lib/gcc-simulate-thread.exp (working copy)
@@ -22,6 +22,11 @@
 # Call 'fail' if a given test printed "FAIL:", otherwise call 'pass'.
 
 proc simulate-thread { args } {
+
+# ??? Exit immediately if this is alpha*-*-linux* target, single-stepping
+# executable between ldl_l and stl_c insns in gdb breaks LL/SC chaining.
+if { [istarget alpha*-*-linux*] } { return }
+
 if { ![isnative] || [is_remote target] } { return }
 
 if { [llength $args] == 1 } {


Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-11 Thread Andrew MacLeod

On 11/11/2011 12:43 AM, Benjamin Kosnik wrote:



bkoz: As relates to the existing problem, how is the legacy support
invoked in compatibility-atomic-c++0x.cc?  That has the old style
implementation of atomic_flag with a lock, which would allow this
target to compile...  which is another option perhaps.  or is that
purely for previous releases somehow?

compatibility-atomic-c++0x.cc is the support for previous builtins
attempt, let's call this atomics-try-2. We need to keep these symbols
exported and doing the same thing done in previous releases. (Or else
abi-check will fail.)

If this system used to use a lock to work, then that is what it should
still do. The behavior shouldn't change.

I think there is also an argument for single threaded-ness vs multi 
threaded.  If there is no atomic support and it's single threaded, we 
don't really need the lock... and I'm not sure how you can detect the 
change in behaviour if test_and_set and clear just store 1 and 0 rather 
than create a lock, then do the store of 1 or 0.
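
To make the single-threaded case concrete, here is a minimal sketch, with hypothetical type and function names that are not part of libstdc++. With only one thread of execution, test_and_set and clear reduce to plain loads and stores, and a caller cannot observe any difference from a version that takes a lock around the same stores.

```c
#include <stdbool.h>

/* Hypothetical flag for a target with one thread and no atomic insns.  */
typedef struct
{
  bool set;
} st_flag;

/* Return the previous state and leave the flag set.  No lock is taken:
   nothing else can run between the load and the store.  */
bool
st_flag_test_and_set (st_flag *f)
{
  bool old = f->set;
  f->set = true;
  return old;
}

/* Reset the flag with a plain store of 0.  */
void
st_flag_clear (st_flag *f)
{
  f->set = false;
}
```

This only holds as long as signal handlers or interrupts do not touch the same flag; that caveat is an assumption of the sketch.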


If the target is multithreaded, well, we'll have to go to a lock I 
guess...   Are there any multithreaded targets without atomic support?  
ie, is this one?


Andrew





[PATCH] Fix tree-stdarg after "Mark objects death@end of scope" changes

2011-11-11 Thread Jakub Jelinek
Hi!

This fixes stdarg-2.c failures on i?86-linux, bootstrapped/regtested
on i686-linux, will commit as obvious tonight.

2011-11-11  Jakub Jelinek  

PR tree-optimization/51091
* tree-stdarg.c (execute_optimize_stdarg): Ignore TREE_CLOBBER_P
rhs also in the va_list_simple_ptr case.

--- gcc/tree-stdarg.c.jj        2011-11-08 23:35:12.0 +0100
+++ gcc/tree-stdarg.c   2011-11-11 15:02:59.005511553 +0100
@@ -847,8 +847,12 @@ execute_optimize_stdarg (void)
  if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
  == GIMPLE_SINGLE_RHS)
{
+ /* Check for ap ={v} {}.  */
+ if (TREE_CLOBBER_P (rhs))
+   continue;
+
  /* Check for tem = ap.  */
- if (va_list_ptr_read (&si, rhs, lhs))
+ else if (va_list_ptr_read (&si, rhs, lhs))
continue;
 
  /* Check for the last insn in:
@@ -875,6 +879,7 @@ execute_optimize_stdarg (void)
  /* Check for ap ={v} {}.  */
  if (TREE_CLOBBER_P (rhs))
continue;
+
  /* Check for ap[0].field = temp.  */
  else if (va_list_counter_struct_op (&si, lhs, rhs, true))
continue;

Jakub


[Gcc.amd] [Patch 001] Document bdver1/btver1 in invoke.texi

2011-11-11 Thread venkataramanan.kumar
> Subject: Re: [Gcc.amd] [Patch 001] [x86 backend] Define march/mtune for
> upcoming AMD Bulldozer processor.
> 
> > Hello!
> >
> > > This patch defines -march=bdver1 and -mtune=bdver1 flag for the upcoming
> > > AMD Bulldozer processor.
> Hi,
> it seems that bdver/btver is not mentioned in invoke.texi nor changes.html.
> Could you please add documentation?
> 
> Honza

Hi Honza,  

I have added documentation for bdver1/btver1 in invoke.texi.

Is it OK to commit?

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 181283)
+++ gcc/doc/invoke.texi (working copy)
@@ -12803,6 +12803,15 @@
 AMD Family 10h core based CPUs with x86-64 instruction set support.  (This
 supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!, enhanced 3DNow!, ABM and 64-bit
 instruction set extensions.)
+@item bdver1
+AMD Family 15h core based CPUs with x86-64 instruction set support.  (This
+supersets FMA4, AVX, XOP, LWP, AES, PCL_MUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A,
+SSSE3, SSE4.1, SSE4.2, 3DNow!, enhanced 3DNow!, ABM and 64-bit
+instruction set extensions.)
+@item btver1
+AMD Family 14h core based CPUs with x86-64 instruction set support.  (This
+supersets MMX, SSE, SSE2, SSE3, SSSE3, SSE4A, CX16, ABM and 64-bit
+instruction set extensions.)
 @item winchip-c6
 IDT Winchip C6 CPU, dealt in same way as i486 with additional MMX instruction
 set support.
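
As a rough illustration of what the new entries mean for user code, the sketch below checks a few of GCC's predefined ISA macros (`__XOP__`, `__FMA4__`, `__SSE4A__`). The function name is hypothetical. Going by the documentation text above, compiling with -march=bdver1 should take the first branch and -march=btver1 the second, but I have not verified this on every GCC version.

```c
/* Report which of a few ISA extensions the compiler was told it may use.
   The macros are predefined by GCC according to -march and the -m<isa>
   options; the returned strings are just for illustration.  */
const char *
isa_summary (void)
{
#if defined (__XOP__) && defined (__FMA4__)
  return "XOP and FMA4 enabled";
#elif defined (__SSE4A__)
  return "SSE4A enabled";
#else
  return "baseline";
#endif
}
```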




[Gcc.amd] [Patch 002] Document bdver1 in changes.html for GCC4.6

2011-11-11 Thread venkataramanan.kumar
> Subject: Re: [Gcc.amd] [Patch 001] [x86 backend] Define march/mtune for
> upcoming AMD Bulldozer processor.
> 
> > Hello!
> >
> > > This patch defines -march=bdver1 and -mtune=bdver1 flag for the upcoming
> > > AMD Bulldozer processor.
> Hi,
> it seems that bdver/btver is not mentioned in invoke.texi nor changes.html.
> Could you please add documentation?
> 
> Honza

Hi Honza,  

I have added bdver1 information to changes.html for GCC 4.6.

Is it OK to commit?

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/changes.html,v
retrieving revision 1.136
diff -u -r1.136 changes.html
--- changes.html30 Oct 2011 12:55:43 -  1.136
+++ changes.html11 Nov 2011 12:26:03 -
@@ -813,6 +813,9 @@
 Support for AMD Bobcat (family 14) processors is now available through
the -march=btver1 and -mtune=btver1
options.
+Support for AMD Bulldozer (family 15) processors is now available
+   through the -march=bdver1 and -mtune=bdver1
+   options.
 The default setting (when not optimizing for size) for 32-bit
   GNU/Linux and Darwin x86 targets has been changed to
   -fomit-frame-pointer.  The default can be reverted




Re: FW: [PATCH][Cilkplus] Patch to fix unused variable

2011-11-11 Thread H.J. Lu
On Fri, Nov 11, 2011 at 8:21 AM, Iyer, Balaji V  wrote:
> Forgot to mention.. this is the 4th and final patch in this sequence.
>
> Thanks,
>
> Balaji V. Iyer.
> 
> From: Iyer, Balaji V
> Sent: Friday, November 11, 2011 11:16 AM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][Cilkplus] Patch to fix unused variable
>
> Hello Everyone,
>    This patch is for the Cilkplus branch affecting both C and C++ compilers. 
> This patch will fix an unused variable warning in collect2.c by enclosing the 
> variable inside #ifdef TARGET_AIX_VERSION.
>
> Thanks,
>
> Balaji V. Iyer.
>

I believe this has been fixed on mainline.  You should simply merge with
mainline to get the proper fix.

I checked in the first 3 patches for you.  Please always use a complete
separate ChangeLog entry for each change when someone else has to
check it in for you.

Thanks.


-- 
H.J.


Implement openmp atomic load/store with __atomic builtins

2011-11-11 Thread Richard Henderson
With the __sync builtins, we weren't able to implement the raw loads
and stores efficiently.  But with the __atomic builtins we can.

Tested on x86_64-linux and committed.
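
As a rough picture of the result, the functions below are hand-written C equivalents of an OpenMP atomic read, write and capture-style exchange using the `__atomic` builtins. The function names are made up. The relaxed memory model matches the MEMMODEL_RELAXED visible in the load expander of the patch; using it for the store and exchange as well is my assumption.

```c
/* Roughly what "#pragma omp atomic read" of *ADDR can expand to.  */
int
atomic_read_relaxed (int *addr)
{
  return __atomic_load_n (addr, __ATOMIC_RELAXED);
}

/* Roughly what "#pragma omp atomic write" can expand to.  */
void
atomic_write_relaxed (int *addr, int val)
{
  __atomic_store_n (addr, val, __ATOMIC_RELAXED);
}

/* If the old value is needed, the store becomes an exchange.  */
int
atomic_swap_relaxed (int *addr, int val)
{
  return __atomic_exchange_n (addr, val, __ATOMIC_RELAXED);
}
```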


r~
* gimple-pretty-print.c (dump_gimple_omp_atomic_load): Dump needed.
(dump_gimple_omp_atomic_store): Likewise.
* optabs.c (can_atomic_exchange_p): New.
* optabs.h (can_atomic_exchange_p): Declare.
* omp-low.c (expand_omp_atomic_load): Implement.
(expand_omp_atomic_store): Likewise.
(expand_omp_atomic): Update for new arguments to load/store.


diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index df703b4..f0e7c50 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1768,6 +1768,8 @@ dump_gimple_omp_atomic_load (pretty_printer *buffer, gimple gs, int spc,
   else
 {
   pp_string (buffer, "#pragma omp atomic_load");
+  if (gimple_omp_atomic_need_value_p (gs))
+   pp_string (buffer, " [needed]");
   newline_and_indent (buffer, spc + 2);
   dump_generic_node (buffer, gimple_omp_atomic_load_lhs (gs),
 spc, flags, false);
@@ -1795,7 +1797,10 @@ dump_gimple_omp_atomic_store (pretty_printer *buffer, gimple gs, int spc,
 }
   else
 {
-  pp_string (buffer, "#pragma omp atomic_store (");
+  pp_string (buffer, "#pragma omp atomic_store ");
+  if (gimple_omp_atomic_need_value_p (gs))
+   pp_string (buffer, "[needed] ");
+  pp_character (buffer, '(');
   dump_generic_node (buffer, gimple_omp_atomic_store_val (gs),
 spc, flags, false);
   pp_character (buffer, ')');
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index dc61c0b..a4bfb84 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -4977,25 +4977,125 @@ expand_omp_synch (struct omp_region *region)
operation as a normal volatile load.  */
 
 static bool
-expand_omp_atomic_load (basic_block load_bb, tree addr, tree loaded_val)
+expand_omp_atomic_load (basic_block load_bb, tree addr,
+   tree loaded_val, int index)
 {
-  /* FIXME */
-  (void) load_bb;
-  (void) addr;
-  (void) loaded_val;
-  return false;
+  enum built_in_function tmpbase;
+  gimple_stmt_iterator gsi;
+  basic_block store_bb;
+  location_t loc;
+  gimple stmt;
+  tree decl, call, type, itype;
+
+  gsi = gsi_last_bb (load_bb);
+  stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_ATOMIC_LOAD);
+  loc = gimple_location (stmt);
+
+  /* ??? If the target does not implement atomic_load_optab[mode], and mode
+ is smaller than word size, then expand_atomic_load assumes that the load
+ is atomic.  We could avoid the builtin entirely in this case.  */
+
+  tmpbase = (enum built_in_function) (BUILT_IN_ATOMIC_LOAD_N + index + 1);
+  decl = builtin_decl_explicit (tmpbase);
+  if (decl == NULL_TREE)
+return false;
+
+  type = TREE_TYPE (loaded_val);
+  itype = TREE_TYPE (TREE_TYPE (decl));
+
+  call = build_call_expr_loc (loc, decl, 2, addr,
+ build_int_cst (NULL, MEMMODEL_RELAXED));
+  if (!useless_type_conversion_p (type, itype))
+call = fold_build1_loc (loc, VIEW_CONVERT_EXPR, type, call);
+  call = build2_loc (loc, MODIFY_EXPR, void_type_node, loaded_val, call);
+
+  force_gimple_operand_gsi (&gsi, call, true, NULL_TREE, true, GSI_SAME_STMT);
+  gsi_remove (&gsi, true);
+
+  store_bb = single_succ (load_bb);
+  gsi = gsi_last_bb (store_bb);
+  gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_ATOMIC_STORE);
+  gsi_remove (&gsi, true);
+
+  if (gimple_in_ssa_p (cfun))
+update_ssa (TODO_update_ssa_no_phi);
+
+  return true;
 }
 
 /* A subroutine of expand_omp_atomic.  Attempt to implement the atomic
operation as a normal volatile store.  */
 
 static bool
-expand_omp_atomic_store (basic_block load_bb, tree addr)
+expand_omp_atomic_store (basic_block load_bb, tree addr,
+tree loaded_val, tree stored_val, int index)
 {
-  /* FIXME */
-  (void) load_bb;
-  (void) addr;
-  return false;
+  enum built_in_function tmpbase;
+  gimple_stmt_iterator gsi;
+  basic_block store_bb = single_succ (load_bb);
+  location_t loc;
+  gimple stmt;
+  tree decl, call, type, itype;
+  enum machine_mode imode;
+  bool exchange;
+
+  gsi = gsi_last_bb (load_bb);
+  stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_ATOMIC_LOAD);
+
+  /* If the load value is needed, then this isn't a store but an exchange.  */
+  exchange = gimple_omp_atomic_need_value_p (stmt);
+
+  gsi = gsi_last_bb (store_bb);
+  stmt = gsi_stmt (gsi);
+  gcc_assert (gimple_code (stmt) == GIMPLE_OMP_ATOMIC_STORE);
+  loc = gimple_location (stmt);
+
+  /* ??? If the target does not implement atomic_store_optab[mode], and mode
+ is smaller than word size, then expand_atomic_store assumes that the store
+ is atomic.  We could avoid the builtin entirely in this case.  */
+
+  tmpbase = (exchange ? BUILT_IN_ATOMIC_EXCHANGE_N : BUILT_IN_ATOMIC_STORE_N);
+

Re: [Patch ObjC/NeXT] use correct personality routine for Objective-C/NeXT/ABI0/1

2011-11-11 Thread Mike Stump
On Nov 11, 2011, at 2:32 AM, Iain Sandoe wrote:
> This corrects a mistake I made when splitting the runtime code up - which 
> causes the GNU eh personality routine to be specified for NeXT ABI 0&1.

> OK for trunk/4.6?

Ok.


Re: [Patch] Move Objective-C runtime flags to modern options system.

2011-11-11 Thread Mike Stump
On Nov 11, 2011, at 12:25 AM, Iain Sandoe wrote:
> FWIW your example doesn't reproduce the problem because it contains no 
> objective c exceptions code.

Ah, but it can be seen to contradict what you said.  It also found a bug.

> However, OK - I see your point (I also see where the problem came from).

:-)  Which is why it is occasionally important to restate the obvious, to 
ensure we're all on the same page.

> Patch under test to fix this (will post later).

Thanks.


Re: [patch tree-optimization 1/2]: Branch-cost optimizations

2011-11-11 Thread Jeff Law
On 11/07/11 13:28, Kai Tietz wrote:

> Sure.  A more general question, which was raised by Richi here.
> For BC optimization it is of course interesting to know real 
> instruction costs and not just guesses.   The current code in 
> fold-const guesses a bit too optimistically,  the code in that patch is
> in some cases too pessimistic.  So is there a facility to get
> prediction of instruction costs on gimple-tree?  I would say this
> isn't possible, as here we would need final RTL form to tell this,
> isn't it?
We don't have much costing information in the gimple IL; that is
largely by design.  We generally wanted to avoid introducing cost
information into the gimple/ssa optimizers so that their behavior
would be more predictable.

The fact that BRANCH_COST is utilized in fold-const.c is a historical
blunder that we haven't fixed yet.
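
For context, the choice that a branch-cost heuristic influences looks roughly like the two functions below. This is only an illustration with made-up names, not what fold-const.c literally emits. Both compute the same result; which one is cheaper depends on how expensive a conditional branch is on the target, and that is exactly the information the gimple level largely lacks.

```c
/* Short-circuit form: B is only tested when A is nonzero, which normally
   costs a conditional branch.  */
int
both_nonzero_branchy (int a, int b)
{
  return a != 0 && b != 0;
}

/* Non-short-circuit form: both tests are always evaluated and combined
   with a bitwise AND, trading a branch for an extra comparison.  */
int
both_nonzero_flat (int a, int b)
{
  return (a != 0) & (b != 0);
}
```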

jeff


Re: [PATCH] Improve VEC_BASE

2011-11-11 Thread Jeff Law
On 11/11/11 00:53, Jakub Jelinek wrote:

> 
> I've actually committed it yesterday after discussion with Richi on
> IRC.
No problem.

> While his patch optimizes it, it doesn't do so for -O0
Funny, I almost made this argument for accepting your patch, then
decided it wasn't that compelling.

Anyway, I'm fine with the patch, just wanted to make sure it was
resolved one way or another.

Thanks,
jeff


Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-11 Thread Hans-Peter Nilsson
> From: Benjamin Kosnik 
> Date: Fri, 11 Nov 2011 06:43:29 +0100

> So, all:
> 
> config/cpu/*/atomicity.h

And config/cpu/*/atomic_word.h presumably?

> Should go. I'll look in to peeling off this cruft sharpish.

brgds, H-P


PATCH: Assert DWARF register size <= saved reg size

2011-11-11 Thread H.J. Lu
Hi,

I am working on 32bit Pmode for x32:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50797

It removes all LEAs, which convert 32bit address to 64bit, and uses 0x67
address prefix instead.  I got 5% speed up in SPEC CPU 2K/2006.

But the assert in _Unwind_SetGRValue:

gcc_assert (dwarf_reg_size_table[index] == sizeof (_Unwind_Context_Reg_Val));

failed on the return column since init_return_column_size uses Pmode, not
word_mode.  In this case, _Unwind_Context_Reg_Val is 64bit, but the return
column size is 32bit.  This patch changes it to assert DWARF register
size <= saved reg size.  OK for trunk?
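
A minimal stand-alone sketch of the relaxed check, with hypothetical names: `saved_reg` stands in for _Unwind_Context_Reg_Val, and the masking step is purely illustrative and not something the patch does. The point is only that a 32-bit register fits in a 64-bit slot, so "<=" is the condition that matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t saved_reg;	/* stands in for _Unwind_Context_Reg_Val */

/* Store VAL for a register that is DWARF_REG_SIZE bytes wide.  */
saved_reg
save_by_value (uint64_t val, size_t dwarf_reg_size)
{
  /* The relaxed check: the register may be narrower than the slot.  */
  assert (dwarf_reg_size <= sizeof (saved_reg));

  /* Illustration only: keep just the bytes the register really has.  */
  if (dwarf_reg_size < sizeof (saved_reg))
    val &= (UINT64_C (1) << (8 * dwarf_reg_size)) - 1;

  return val;
}
```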

Thanks.


H.J.
---
2011-11-11  H.J. Lu  

* unwind-dw2.c (_Unwind_SetGRValue): Assert DWARF register size
<= saved reg size.

diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
index 475ad00..db1c757 100644
--- a/libgcc/unwind-dw2.c
+++ b/libgcc/unwind-dw2.c
@@ -294,7 +294,8 @@ _Unwind_SetGRValue (struct _Unwind_Context *context, int index,
 {
   index = DWARF_REG_TO_UNWIND_COLUMN (index);
   gcc_assert (index < (int) sizeof(dwarf_reg_size_table));
-  gcc_assert (dwarf_reg_size_table[index] == sizeof (_Unwind_Context_Reg_Val));
+  /* Return column size may be smaller than _Unwind_Context_Reg_Val.  */
+  gcc_assert (dwarf_reg_size_table[index] <= sizeof (_Unwind_Context_Reg_Val));
 
   context->by_value[index] = 1;
   context->reg[index] = _Unwind_Get_Unwind_Context_Reg_Val (val);


Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-11 Thread Hans-Peter Nilsson
> From: Andrew MacLeod 
> Date: Fri, 11 Nov 2011 18:45:11 +0100

> On 11/11/2011 12:43 AM, Benjamin Kosnik wrote:
> I think there is also an argument for single threaded-ness vs multi 
> threaded.  If there is no atomic support and it's single threaded, we 
> don't really need the lock... and I'm not sure how you can detect the 
> change in behaviour if test_and_set and clear just store 1 and 0 rather 
> than create a lock, then do the store of 1 or 0.
> 
> If the target is multithreaded, well, we'll have to go to a lock I 
> guess...   Are there any multithreaded targets without atomic support?  
> ie, is this one?

No, cris-elf is not a multithreaded target.

(FWIW, cris-*-linux* and crisv32-*-linux* are, but the lack of
update to the atomicity support for them is a port bug.)

brgds, H-P


Re: Revert "PowerPC shrink-wrap support 3 of 3"

2011-11-11 Thread Hans-Peter Nilsson
> From: Hans-Peter Nilsson 
> Date: Thu, 10 Nov 2011 18:52:39 +0100

> > From: Hans-Peter Nilsson 
> > Date: Thu, 10 Nov 2011 15:12:54 +0100
> 
> > > From: Bernd Schmidt 
> > > Date: Thu, 10 Nov 2011 14:29:04 +0100
> > 
> > > HP, can you run full tests?
> > 
> > Cross-test to cris-elf in progress.
> > Thanks!
> 
> Works, no regressions compared to before the breakage (r181187).
> Thanks!  According to
>  it fixes
> building for arm-unknown-linux-gnueabi too.

AFAICT, your patch has got sufficient testing now (on three
targets to boot) to be considered safe to check in.  Or is
something amiss?

(If it's the unchecked code quality you mentioned, that can be
just as well dealt with having the tree in a working state;
having the tree broken for some targets accomplishes nothing.)

brgds, H-P


Re: Selective Scheduling Reviews

2011-11-11 Thread Jeff Law
On 11/11/11 02:24, Andrey Belevantsev wrote:
> On 10.11.2011 21:31, Jeff Law wrote:
>> [ This should have gone out some time ago...  Sorry for the long
>> delay ]
>> 
>> I'm pleased to announce that the GCC steering committee has
>> approved the nomination of Andrey Belevantsev, Alexander Monakov,
>> and Dmitry Melnik as selective scheduling reviewers.
> Thanks a lot, I've committed the patch below.
> 
> Btw, don't we want to keep the reviewers section sorted by
> component?  I then can move LTO folks entries to the appropriate
> place.
Probably wise.  Feel free to update the MAINTAINERS file appropriately.

Thanks,
Jeff


Re: [libitm] Work around missing AVX support

2011-11-11 Thread Iain Sandoe


On 11 Nov 2011, at 14:33, Rainer Orth wrote:


Iain Sandoe  writes:

however, most of the suite fails on darwin9 - with an undefined
reference to delete(void*).


Could this be the same issue I've been seeing on Tru64 UNIX, i.e. lack
of weakdef support?

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01426.html

At least my weakdef.c testcase also fails to link on
i386-apple-darwin9.8.0.


It certainly looks similar to point 2

I have discussed this with Mike and tried out some experiments.

For Darwin, the symbol can be absent at runtime (and will compare to  
NULL as per the elf case).


However,

a) (with two-level namespace) it can't be absent at link time  - so  
there has to be a dummy provided there.


b) (If you force flat_namespace ... which IMO would be a Very Bad  
Thing for the compiler) you can do

-flat_namespace -undefined suppress  ...
... see the discussion in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51059

- so I think that is quite similar to what your quoted page is saying?
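
For reference, the pattern under discussion looks roughly like the sketch below; `optional_hook` and `call_hook_if_present` are made-up names. On ELF targets this links even when nothing defines the hook, and the address then compares equal to NULL at run time. On Darwin with two-level namespaces the link step is where it fails unless a dummy definition is available, which is the difference described in (a) above.

```c
#include <stddef.h>

/* Hypothetical optional hook.  Declared weak, so on ELF targets the
   program links even if no definition exists anywhere.  */
extern int optional_hook (void) __attribute__ ((weak));

/* Call the hook if it is present, otherwise report that it is absent.  */
int
call_hook_if_present (void)
{
  if (optional_hook != NULL)
    return optional_hook ();

  return -1;			/* hook absent */
}
```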

However,

...  it doesn't explain why x86-64 Darwin10 does NOT seem to  
experience this (there should be no change in weak semantics AFAIK  
between 9 and 10) .
.. and i686-darwin9 does (unless there's a tool bug in Darwin 9  
that this is exposing).


... and the comment stands that the Makefile.am explicitly says "we  
don't want or need libstdc++" but the missing symbols (in darwin9)  
seem to be from there.


cheers
Iain



[Patch testsuite/darwin] fix PR testsuite/51059

2011-11-11 Thread Iain Sandoe
This probably qualifies as obvious - but having discussed some of the  
background with Mike ..
.. there are other ways of solving the problem - although probably  
rather heavy-weight for this problem.


.. So, I'll let him have the say...

OK for trunk?
Iain

testsuite:

PR testsuite/51059
* gcc.misc-tests/gcov-14.c (dg-options): Force flat namespace for
Darwin targets and allow the Foo symbol to be undefined.

Index: gcc/testsuite/gcc.misc-tests/gcov-14.c
===
--- gcc/testsuite/gcc.misc-tests/gcov-14.c  (revision 181293)
+++ gcc/testsuite/gcc.misc-tests/gcov-14.c  (working copy)
@@ -1,6 +1,7 @@
 /* Test gcov extern inline.  */

 /* { dg-options "-O2 -fprofile-arcs -ftest-coverage" } */
+/* { dg-options "-O2 -fprofile-arcs -ftest-coverage  -flat_namespace -undefined suppress" { target *-*-darwin* }  } */

 /* { dg-require-weak "" } */
 /* { dg-do run { target native } } */






Go patch committed: Introduce g variable

2011-11-11 Thread Ian Lance Taylor
As another step toward multiplexing goroutines onto OS threads, this
patch adopts the g variable used in the other Go compiler's runtime
support library.  g is a thread-local global which holds
goroutine-specific information, as opposed to the existing thread-local
global m which holds thread-specific information.  The only
goroutine-specific information at present is the panic/defer stuff,
which was formerly stored in __go_panic_defer.  This patch essentially
replaces __go_panic_defer.  Much of the patch is mechanical.
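
The relationship between the two thread-local globals can be sketched in a few lines of stand-alone C. The struct layouts and function names here are simplified stand-ins, not the real libgo definitions; only the m->curg and g->m links and the defer/panic fields mirror what the patch below does.

```c
#include <stddef.h>

typedef struct G G;
typedef struct M M;

/* Goroutine-specific state (simplified stand-in).  */
struct G
{
  M *m;			/* thread currently running this goroutine */
  void *defer;		/* head of the defer stack */
  void *panic;		/* head of the panic stack */
};

/* Thread-specific state (simplified stand-in).  */
struct M
{
  G *curg;		/* goroutine currently running on this thread */
  int id;
};

static _Thread_local M *cur_m;
static _Thread_local G *cur_g;

/* Link NEWM and NEWG together and make them current for this thread.  */
void
thread_enter (M *newm, G *newg)
{
  newm->curg = newg;
  newg->m = newm;
  cur_m = newm;
  cur_g = cur_m->curg;
}

/* Clear both globals; code after this must not assume they are valid.  */
void
thread_exit (void)
{
  cur_m = NULL;
  cur_g = NULL;
}

G *
current_g (void)
{
  return cur_g;
}
```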

The use of g0 is temporary until full support for multiplexing
goroutines goes in.

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r b9bf54c65ba3 libgo/Makefile.am
--- a/libgo/Makefile.am	Fri Nov 04 16:02:15 2011 -0700
+++ b/libgo/Makefile.am	Fri Nov 11 12:57:37 2011 -0800
@@ -434,7 +434,6 @@
 	runtime/go-new.c \
 	runtime/go-note.c \
 	runtime/go-panic.c \
-	runtime/go-panic-defer.c \
 	runtime/go-print.c \
 	runtime/go-rand.c \
 	runtime/go-rec-big.c \
diff -r b9bf54c65ba3 libgo/runtime/go-defer.c
--- a/libgo/runtime/go-defer.c	Fri Nov 04 16:02:15 2011 -0700
+++ b/libgo/runtime/go-defer.c	Fri Nov 11 12:57:37 2011 -0800
@@ -6,6 +6,7 @@
 
 #include 
 
+#include "runtime.h"
 #include "go-alloc.h"
 #include "go-panic.h"
 #include "go-defer.h"
@@ -17,18 +18,14 @@
 {
   struct __go_defer_stack *n;
 
-  if (__go_panic_defer == NULL)
-__go_panic_defer = ((struct __go_panic_defer_struct *)
-			__go_alloc (sizeof (struct __go_panic_defer_struct)));
-
   n = (struct __go_defer_stack *) __go_alloc (sizeof (struct __go_defer_stack));
-  n->__next = __go_panic_defer->__defer;
+  n->__next = g->defer;
   n->__frame = frame;
-  n->__panic = __go_panic_defer->__panic;
+  n->__panic = g->panic;
   n->__pfn = pfn;
   n->__arg = arg;
   n->__retaddr = NULL;
-  __go_panic_defer->__defer = n;
+  g->defer = n;
 }
 
 /* This function is called when we want to undefer the stack.  */
@@ -36,22 +33,19 @@
 void
 __go_undefer (_Bool *frame)
 {
-  if (__go_panic_defer == NULL)
-return;
-  while (__go_panic_defer->__defer != NULL
-	 && __go_panic_defer->__defer->__frame == frame)
+  while (g->defer != NULL && g->defer->__frame == frame)
 {
   struct __go_defer_stack *d;
   void (*pfn) (void *);
 
-  d = __go_panic_defer->__defer;
+  d = g->defer;
   pfn = d->__pfn;
   d->__pfn = NULL;
 
   if (pfn != NULL)
 	(*pfn) (d->__arg);
 
-  __go_panic_defer->__defer = d->__next;
+  g->defer = d->__next;
   __go_free (d);
 
   /* Since we are executing a defer function here, we know we are
@@ -69,7 +63,7 @@
 _Bool
 __go_set_defer_retaddr (void *retaddr)
 {
-  if (__go_panic_defer != NULL && __go_panic_defer->__defer != NULL)
-__go_panic_defer->__defer->__retaddr = retaddr;
+  if (g->defer != NULL)
+g->defer->__retaddr = retaddr;
   return 0;
 }
diff -r b9bf54c65ba3 libgo/runtime/go-deferred-recover.c
--- a/libgo/runtime/go-deferred-recover.c	Fri Nov 04 16:02:15 2011 -0700
+++ b/libgo/runtime/go-deferred-recover.c	Fri Nov 11 12:57:37 2011 -0800
@@ -6,6 +6,7 @@
 
 #include 
 
+#include "runtime.h"
 #include "go-panic.h"
 #include "go-defer.h"
 
@@ -78,9 +79,7 @@
 struct __go_empty_interface
 __go_deferred_recover ()
 {
-  if (__go_panic_defer == NULL
-  || __go_panic_defer->__defer == NULL
-  || __go_panic_defer->__defer->__panic != __go_panic_defer->__panic)
+  if (g->defer == NULL || g->defer->__panic != g->panic)
 {
   struct __go_empty_interface ret;
 
diff -r b9bf54c65ba3 libgo/runtime/go-go.c
--- a/libgo/runtime/go-go.c	Fri Nov 04 16:02:15 2011 -0700
+++ b/libgo/runtime/go-go.c	Fri Nov 11 12:57:37 2011 -0800
@@ -115,6 +115,7 @@
  any code from here to thread exit must not assume that m is
  valid.  */
   m = NULL;
+  g = NULL;
 
   i = pthread_mutex_unlock (&__go_thread_ids_lock);
   __go_assert (i == 0);
@@ -135,10 +136,11 @@
 
 #ifdef __rtems__
   __wrap_rtems_task_variable_add ((void **) &m);
-  __wrap_rtems_task_variable_add ((void **) &__go_panic_defer);
+  __wrap_rtems_task_variable_add ((void **) &g);
 #endif
 
   m = newm;
+  g = m->curg;
 
   pthread_cleanup_push (remove_current_thread, NULL);
 
@@ -230,6 +232,9 @@
 
   newm->list_entry = list_entry;
 
+  newm->curg = __go_alloc (sizeof (G));
+  newm->curg->m = newm;
+
   newm->id = __sync_fetch_and_add (&mcount, 1);
   newm->fastrand = 0x49f6428aUL + newm->id;
 
@@ -299,9 +304,6 @@
   }
 #endif
 
-  /* FIXME: Perhaps we should just move __go_panic_defer into M.  */
-  m->gc_panic_defer = __go_panic_defer;
-
   /* Tell the garbage collector that we are ready by posting to the
  semaphore.  */
   i = sem_post (&__go_thread_ready_sem);
@@ -433,10 +435,6 @@
   --c;
 }
 
-  /* The gc_panic_defer field should now be set for all M's except the
- one in this thread.  Set this one now.  */
-  m->gc_panic_defer = __go_panic_defer;
-
   /* Leave with __go_thread_ids_lock held.  */
 }
 
diff -r b9bf54c65ba3 libgo/runtime/go

Re: [Patch testsuite/darwin] fix PR testsuite/51059

2011-11-11 Thread Mike Stump
On Nov 11, 2011, at 12:26 PM, Iain Sandoe wrote:
> This probably qualifies as obvious - but having discussed some of the 
> background with Mike ..
> .. there are other ways of solving the problem - although probably rather 
> heavy-weight for this problem.
> 
> .. So, I'll let him have the say...
> 
> OK for trunk?

Ok.  The other way to do this would be to split requires-weak into two, and 
have darwin return false for one and true for the other and then make this use 
the second form.  Please add a comment, like, darwin doesn't have elf weak 
import, or some such.  The idea being, when darwin gets it, then, we could 
remove the line.  Also, this helps people have a slight clue why such options 
are necessary.


Re: [PATCH, testsuite]: Do not run simulate-thread on alpha*-*-linux*

2011-11-11 Thread Richard Henderson
On 11/11/2011 09:44 AM, Uros Bizjak wrote:
> 2011-11-11  Uros Bizjak  
> 
>   * lib/gcc-simulate-thread.exp (simulate-thread): Do not run on
>   alpha*-*-linux* targets.

Ok.

r~


Re: cxx-mem-model merge [6 of 9] - libstdc++-v3

2011-11-11 Thread Torvald Riegel
On Fri, 2011-11-11 at 12:45 -0500, Andrew MacLeod wrote:
> On 11/11/2011 12:43 AM, Benjamin Kosnik wrote:
> >
> >> bkoz: As relates to the existing problem, how is the legacy support
> >> invoked in compatibility-atomic-c++0x.cc?  That has the old style
> >> implementation of atomic_flag with a lock, which would allow this
> >> target to compile...  which is another option perhaps.  or is that
> >> purely for previous releases somehow?
> > compatibility-atomic-c++0x.cc is the support for previous builtins
> > attempt, let's call this atomics-try-2. We need to keep these symbols
> > exported and doing the same thing done in previous releases. (Or else
> > abi-check will fail.)
> >
> > If this system used to use a lock to work, then that is what it should
> > still do. The behavior shouldn't change.
> >
> I think there is also an argument for single threaded-ness vs multi 
> threaded.  If there is no atomic support and it's single threaded, we 
> don't really need the lock... and I'm not sure how you can detect the 
> change in behaviour if test_and_set and clear just store 1 and 0 rather 
> than create a lock, then do the store of 1 or 0.

Also, if you're replacing it with atomic ops you can't see different
behavior because, well, they are atomic.  However, this requires that
_nobody_ tries to synchronize using this lock anymore (only observing
the lock is fine, but you must never modify it).  Likewise on a
single-threaded target.

Torvald



Re: [PATCH] Revert sparc vec_init improvements as they cause 64-bit regressions.

2011-11-11 Thread David Miller
From: Eric Botcazou 
Date: Fri, 11 Nov 2011 11:05:06 +0100

>> One thing that really irks me is how pseudos can only be subreg'd
>> on UNITS_PER_WORD boundaries.  That's the real reason this stuff
>> doesn't work and it's nearly impossible to subreg 32-bit values
>> that end up in float regs on sparc when compiling 64-bit.
> 
> Yes, this was done on purpose to solve very nasty RA/reload problems, but the
> irregularity of the SPARC register file in 64-bit mode clearly conflicts with 
> it.  And not all issues were solved, so we used CANNOT_CHANGE_MODE_CLASS to 
> mask some of the remaining ones on SPARC (and on PA).  

The first problem I ran into was combine, it uses word boundaries to
decide if a subreg store clobbers an entire register, and if so it
treats the SET_DEST as completely clobbered.

>> Anyways, committed to trunk and all the 64-bit failures should be gone.
> 
> Do we have the same problem in VIS2/3 mode as in VIS1 mode?  If so, then I 
> agree that this is probably the best course of action in the short term.

Yes, all of the VIS cases have the same subregging issue on 64-bit.


[PATCH 0/4][CFT] Handle legacy __sync libcalls

2011-11-11 Thread Richard Henderson
These are the targets that used external __sync calls in gcc 4.6.
I've been intending to test them myself, but since these aren't
bare *-elf targets, it's taking me some time to get the various
cross-environments set up.

Port maintainers, please test.
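
For anyone wanting a quick smoke test, a minimal sketch of user code that
exercises the legacy builtins follows.  The helper names bump and swap_if are
hypothetical.  On targets without suitable native instructions, GCC is
expected to lower such builtins to out-of-line calls (for example
__sync_fetch_and_add_4 for a 4-byte operand), but the exact lowering depends
on the target and options, so treat that as an assumption to verify in the
generated assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically increment *COUNTER and return its previous value.  */
static int32_t
bump (int32_t *counter)
{
  assert (counter != NULL);
  return __sync_fetch_and_add (counter, 1);
}

/* Store NEWVAL into *SLOT only if it still holds EXPECTED.
   Return the contents of *SLOT seen before the operation.  */
static int32_t
swap_if (int32_t *slot, int32_t expected, int32_t newval)
{
  assert (slot != NULL);
  return __sync_val_compare_and_swap (slot, expected, newval);
}
```

Compiling this for one of the affected targets and checking that it links
against the expected __sync_* routines is one way to see whether the
libfuncs were installed.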


r~


Richard Henderson (4):
  arm: Install __sync libfuncs for Linux.
  mips: Install the __sync libfuncs for mips16
  hppa: Install __sync libfuncs for linux.
  sh-linux: Install __sync libfuncs.

 gcc/config/arm/arm.c |4 
 gcc/config/mips/mips.c   |8 ++--
 gcc/config/pa/pa-linux.h |3 +++
 gcc/config/pa/pa.c   |3 +++
 gcc/config/pa/pa.h   |5 +
 gcc/config/sh/linux.h|4 
 gcc/config/sh/sh.c   |8 
 7 files changed, 33 insertions(+), 2 deletions(-)

-- 
1.7.6.4



[PATCH 2/4] mips: Install the __sync libfuncs for mips16

2011-11-11 Thread Richard Henderson
Cc: Richard Sandiford 
---
 gcc/config/mips/mips.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index ff72e28..75e73bd 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -11218,9 +11218,13 @@ mips_init_libfuncs (void)
 }
 
   /* The MIPS16 ISA does not have an encoding for "sync", so we rely
- on an external non-MIPS16 routine to implement __sync_synchronize.  */
+ on an external non-MIPS16 routine to implement __sync_synchronize.
+ Similarly for the rest of the ll/sc libfuncs.  */
   if (TARGET_MIPS16)
-synchronize_libfunc = init_one_libfunc ("__sync_synchronize");
+{
+  synchronize_libfunc = init_one_libfunc ("__sync_synchronize");
+  init_sync_libfuncs (UNITS_PER_WORD);
+}
 }
 
 /* Build up a multi-insn sequence that loads label TARGET into $AT.  */
-- 
1.7.6.4



[PATCH 3/4] hppa: Install __sync libfuncs for linux.

2011-11-11 Thread Richard Henderson
Cc: John David Anglin 
---
 gcc/config/pa/pa-linux.h |3 +++
 gcc/config/pa/pa.c   |3 +++
 gcc/config/pa/pa.h   |5 +
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/gcc/config/pa/pa-linux.h b/gcc/config/pa/pa-linux.h
index 6c6cf21..addc0e1 100644
--- a/gcc/config/pa/pa-linux.h
+++ b/gcc/config/pa/pa-linux.h
@@ -136,3 +136,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Linux always uses gas.  */
 #undef TARGET_GAS
 #define TARGET_GAS 1
+
+#undef TARGET_SYNC_LIBCALL
+#define TARGET_SYNC_LIBCALL 1
diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index 66574ba..134f1f8 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -5587,6 +5587,9 @@ pa_init_libfuncs (void)
   set_conv_libfunc (ufloat_optab, TFmode, DImode,
"_U_Qfcnvxf_udbl_to_quad");
 }
+
+  if (TARGET_SYNC_LIBCALL)
+init_sync_libfuncs (UNITS_PER_WORD);
 }
 
 /* HP's millicode routines mean something special to the assembler.
diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index 2f1295b..c52e3d5 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -74,6 +74,11 @@ extern unsigned long total_code_bytes;
 #define HPUX_LONG_DOUBLE_LIBRARY 0
 #endif
 
+/* Linux kernel atomic operation support.  */
+#ifndef TARGET_SYNC_LIBCALL
+#define TARGET_SYNC_LIBCALL 0
+#endif
+
 /* The following three defines are potential target switches.  The current
defines are optimal given the current capabilities of GAS and GNU ld.  */
 
-- 
1.7.6.4



[PATCH 1/4] arm: Install __sync libfuncs for Linux.

2011-11-11 Thread Richard Henderson
Cc: Richard Earnshaw 
Cc: Ramana Radhakrishnan 
---
 gcc/config/arm/arm.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6ef6f62..abf8ce1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1096,6 +1096,10 @@ arm_set_fixed_conv_libfunc (convert_optab optable, enum 
machine_mode to,
 static void
 arm_init_libfuncs (void)
 {
+  /* For Linux, we have access to kernel support for atomic operations.  */
+  if (arm_abi == ARM_ABI_AAPCS_LINUX)
+init_sync_libfuncs (8);
+
   /* There are no special library functions unless we are using the
  ARM BPABI.  */
   if (!TARGET_BPABI)
-- 
1.7.6.4



[PATCH 4/4] sh-linux: Install __sync libfuncs.

2011-11-11 Thread Richard Henderson
Cc: Kaz Kojima 
---
 gcc/config/sh/linux.h |4 
 gcc/config/sh/sh.c|8 
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/gcc/config/sh/linux.h b/gcc/config/sh/linux.h
index edfd99b..7a75341 100644
--- a/gcc/config/sh/linux.h
+++ b/gcc/config/sh/linux.h
@@ -131,3 +131,7 @@ along with GCC; see the file COPYING3.  If not see
 #define SH_DIV_STRATEGY_DEFAULT SH_DIV_CALL2
 #undef SH_DIV_STR_FOR_SIZE
 #define SH_DIV_STR_FOR_SIZE "call2"
+
+/* Install the __sync libcalls.  */
+#undef TARGET_INIT_LIBFUNCS
+#define TARGET_INIT_LIBFUNCS  sh_init_sync_libfuncs
diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index 03c3c48..2545a63 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -302,6 +302,8 @@ static void sh_trampoline_init (rtx, tree, rtx);
 static rtx sh_trampoline_adjust_address (rtx);
 static void sh_conditional_register_usage (void);
 static bool sh_legitimate_constant_p (enum machine_mode, rtx);
+
+static void sh_init_sync_libfuncs (void) ATTRIBUTE_UNUSED;
 
 static const struct attribute_spec sh_attribute_table[] =
 {
@@ -12499,4 +12501,10 @@ sh_legitimate_constant_p (enum machine_mode mode, rtx 
x)
 
 enum sh_divide_strategy_e sh_div_strategy = SH_DIV_STRATEGY_DEFAULT;
 
+static void
+sh_init_sync_libfuncs (void)
+{
+  init_sync_libfuncs (UNITS_PER_WORD);
+}
+
 #include "gt-sh.h"
-- 
1.7.6.4



[PATCH] Fix Linux/sparc build after generic asm output optimizations.

2011-11-11 Thread David Miller

Any ELF target that overrides ASM_GENERATE_INTERNAL_LABEL is at risk
of not building any more due to the recent elfos.h changes.

Those changes require that the label formats generated by
ASM_GENERATE_INTERNAL_LABEL and TARGET_ASM_INTERNAL_LABEL are in sync,
but that is only being ensured for targets that use elfos.h as-is.
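
As a rough illustration of what "in sync" means here, the sketch below models
the two sides as plain string formatters.  generate_internal_label follows
the format string of the macro deleted by this patch; emit_internal_label and
labels_agree are hypothetical helpers, and the ".L<prefix><num>:" definition
form is an assumption about what the ELF label hook writes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reference form, modelled on the deleted macro.  The leading '*'
   asks the name-output routine to print the rest verbatim.  */
static void
generate_internal_label (char *buf, size_t size, const char *prefix, long num)
{
  snprintf (buf, size, "*.L%s%ld", prefix, num);
}

/* Definition form: a hypothetical model of what the label hook
   writes into the assembly file.  */
static void
emit_internal_label (char *buf, size_t size, const char *prefix,
                     unsigned long num)
{
  snprintf (buf, size, ".L%s%lu:", prefix, num);
}

/* Return 1 if reference REF names the label defined by DEF.  */
static int
labels_agree (const char *ref, const char *def)
{
  size_t len;

  assert (ref != NULL && def != NULL);
  if (ref[0] == '*')
    ref++;                      /* The verbatim marker is not part of the symbol.  */
  len = strlen (def);
  if (len == 0 || def[len - 1] != ':')
    return 0;
  len--;                        /* Ignore the trailing colon.  */
  return strlen (ref) == len && strncmp (ref, def, len) == 0;
}
```

If a target overrides only one side, for instance by dropping the ".L" from
the definition, labels_agree reports a mismatch, which corresponds to the
undefined-label failures seen at assembly time.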

It turns out that Linux/sparc's override is unnecessary, so just
getting rid of it is the best thing to do.

Eric, it seems that most if not all of the other ELF sparc targets
will need something like this as well but I was only able to validate
Linux at the moment.

Committed to trunk.

gcc/

* config/sparc/linux.h (ASM_GENERATE_INTERNAL_LABEL): Delete.
* config/sparc/linux64.h (ASM_GENERATE_INTERNAL_LABEL): Delete.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181307 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |5 +
 gcc/config/sparc/linux.h   |9 -
 gcc/config/sparc/linux64.h |9 -
 3 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 73bec22..62ae4a1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-11-11  David S. Miller  
+
+   * config/sparc/linux.h (ASM_GENERATE_INTERNAL_LABEL): Delete.
+   * config/sparc/linux64.h (ASM_GENERATE_INTERNAL_LABEL): Delete.
+
 2011-11-11  Jakub Jelinek  
 
* config/i386/i386-protos.h (ix86_maybe_emit_epilogue_vzeroupper):
diff --git a/gcc/config/sparc/linux.h b/gcc/config/sparc/linux.h
index 443c796..60dc869 100644
--- a/gcc/config/sparc/linux.h
+++ b/gcc/config/sparc/linux.h
@@ -118,15 +118,6 @@ do {   
\
 #undef  LOCAL_LABEL_PREFIX
 #define LOCAL_LABEL_PREFIX  "."
 
-/* This is how to store into the string LABEL
-   the symbol_ref name of an internal numbered label where
-   PREFIX is the class of label and NUM is the number within the class.
-   This is suitable for output with `assemble_name'.  */
-
-#undef  ASM_GENERATE_INTERNAL_LABEL
-#define ASM_GENERATE_INTERNAL_LABEL(LABEL,PREFIX,NUM)  \
-  sprintf (LABEL, "*.L%s%ld", PREFIX, (long)(NUM))
-
 
 /* Define for support of TFmode long double.
SPARC ABI says that long double is 4 words.  */
diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index bec279d..14966b9 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -236,15 +236,6 @@ do {   
\
 #undef  LOCAL_LABEL_PREFIX
 #define LOCAL_LABEL_PREFIX  "."
 
-/* This is how to store into the string LABEL
-   the symbol_ref name of an internal numbered label where
-   PREFIX is the class of label and NUM is the number within the class.
-   This is suitable for output with `assemble_name'.  */
-
-#undef  ASM_GENERATE_INTERNAL_LABEL
-#define ASM_GENERATE_INTERNAL_LABEL(LABEL,PREFIX,NUM)  \
-  sprintf (LABEL, "*.L%s%ld", PREFIX, (long)(NUM))
-
 /* DWARF bits.  */
 
 /* Follow Irix 6 and not the Dwarf2 draft in using 64-bit offsets. 
-- 
1.7.6.401.g6a319



PATCH [1/n] addr32: Properly use Pmode and word_mode

2011-11-11 Thread H.J. Lu
Hi,

The current x32 implementation uses LEAs to convert 32bit address to
64bit.  However, we can use addr32 prefix to use 32bit address directly.
It improves performance by 5% in SPEC CPU 2K/2006.  All changes are done
in x86 backend, except for a small unwind library assert change:

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html

due to return column size difference.

For x86-64, Pmode can be 32bit or 64bit, but word_mode is always 64bit.
push/pop only work on word_mode.  Also string instructions take Pmode
pointers.
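
A small sketch of the Pmode/word_mode distinction, written as ordinary C
rather than backend code: a 32-bit address has to be widened to a 64-bit
word before it can occupy a push/pop slot or a return register.  The helper
names are hypothetical, and the choice of zero-extension reflects the
POINTERS_EXTEND_UNSIGNED setting used in the patch below; sign-extension is
shown only for contrast.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend a 32-bit address into a 64-bit, word-sized slot.  */
static uint64_t
pointer_to_word (uint32_t addr)
{
  return (uint64_t) addr;
}

/* Sign-extension of a 32-bit integer, for contrast with the above.  */
static uint64_t
int_to_word_signed (int32_t value)
{
  return (uint64_t) (int64_t) value;
}

/* Recover the 32-bit address; the upper half must be zero.  */
static uint32_t
word_to_pointer (uint64_t word)
{
  assert ((word >> 32) == 0);
  return (uint32_t) word;
}
```

An address with the top bit set, such as 0x80000000, stays a valid 64-bit
address under zero-extension, whereas sign-extension would turn it into
0xFFFFFFFF80000000.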

I will submit a set of patches to use 32bit Pmode for x32.  This is
the first patch to properly use Pmode and word_mode.  It also adds
addr32 prefix to string instructions if needed.  OK for trunk?

Thanks.


H.J.
---
2011-11-11  H.J. Lu  

* config/i386/i386.c (function_value_64): Return pointers in
word_mode instead of Pmode.
(ix86_promote_function_mode): Likewise.
(setup_incoming_varargs_64): Use word_mode with integer
parameters in registers.
(gen_push): Push register in word_mode instead of Pmode.
(ix86_emit_save_regs): Likewise.
(ix86_emit_save_regs_using_mov): Save integer registers in
word_mode.
(gen_pop): Pop register in word_mode instead of Pmode.
(ix86_emit_restore_regs_using_pop): Likewise.
(ix86_expand_prologue): Replace Pmode with word_mode for push
immediate.  Use ix86_gen_pro_epilogue_adjust_stack.  Save and
restore RAX and R10 in word_mode.
(ix86_emit_restore_regs_using_mov): Restore integer registers
in word_mode.
(ix86_expand_split_stack_prologue): Save R10_REG and restore in
word_mode.
(ix86_decompose_address): Disallow fs:(reg) if Pmode !=
word_mode. 
(legitimize_tls_address): Load TP into register for
TLS_MODEL_INITIAL_EXEC and TLS_MODEL_LOCAL_EXEC modes in x32.
(ix86_print_operand): Output register in DImode for 64bit
indirect branch.
(ix86_split_to_parts): Use word_mode with PUT_MODE for push.
(ix86_split_long_move): Likewise.
(ix86_zero_extend_to_Pmode): Handle Pmode != DImode.
(ix86_expand_movmem): Use word_mode for size needed for loop.
(ix86_trampoline_init): Use movl for 64bit if ptr_mode == SImode.
Replace DImode with Pmode or ptr_mode.
(x86_this_parameter): Replace DImode with Pmode.

* config/i386/i386.md (W): New.
(*push2_prologue): Replace :P with :W.
(*pop1): Likewise.
(*pop1_epilogue): Likewise.
(*rep_movdi_rex64): Replace :DI with :P.  Add addr32 if needed.
(*rep_stosdi_rex64): Likewise.
(*rep_movsi): Add addr32 if needed.
(*rep_movqi): Likewise.
(*rep_stossi): Likewise.
(*rep_stosqi): Likewise.
(*cmpstrnqi_nz_1): Likewise.
(*cmpstrnqi_1): Likewise.
(*strlenqi_1): Likewise.
(push/pop peephole2): Use word_mode scratch registers.
(lwp_slwpcb): Check Pmode instead of TARGET_64BIT.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 01f4fbe..fd82389 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -7193,8 +7193,8 @@ function_value_64 (enum machine_mode orig_mode, enum 
machine_mode mode,
 }
   else if (POINTER_TYPE_P (valtype))
 {
-  /* Pointers are always returned in Pmode. */
-  mode = Pmode;
+  /* Pointers are always returned in word_mode.  */
+  mode = word_mode;
 }
 
   ret = construct_container (mode, orig_mode, valtype, 1,
@@ -7265,7 +7265,8 @@ ix86_function_value (const_tree valtype, const_tree 
fntype_or_decl,
   return ix86_function_value_1 (valtype, fntype_or_decl, orig_mode, mode);
 }
 
-/* Pointer function arguments and return values are promoted to Pmode.  */
+/* Pointer function arguments and return values are promoted to
+   word_mode.  */
 
 static enum machine_mode
 ix86_promote_function_mode (const_tree type, enum machine_mode mode,
@@ -7275,7 +7276,7 @@ ix86_promote_function_mode (const_tree type, enum 
machine_mode mode,
   if (type != NULL_TREE && POINTER_TYPE_P (type))
 {
   *punsignedp = POINTERS_EXTEND_UNSIGNED;
-  return Pmode;
+  return word_mode;
 }
   return default_promote_function_mode (type, mode, punsignedp, fntype,
for_return);
@@ -7553,12 +7554,13 @@ setup_incoming_varargs_64 (CUMULATIVE_ARGS *cum)
 
   for (i = cum->regno; i < max; i++)
 {
-  mem = gen_rtx_MEM (Pmode,
+  mem = gen_rtx_MEM (word_mode,
 plus_constant (save_area, i * UNITS_PER_WORD));
   MEM_NOTRAP_P (mem) = 1;
   set_mem_alias_set (mem, set);
-  emit_move_insn (mem, gen_rtx_REG (Pmode,
-   x86_64_int_parameter_registers[i]));
+  emit_move_insn (mem,
+ gen_rtx_REG (word_mode,
+  x86_64_int_parameter_registers[i]));
 }
 
   if 

Re: [PATCH] Fix Linux/sparc build after generic asm output optimizations.

2011-11-11 Thread Dimitrios Apostolou

Hi David,

I couldn't imagine such breakage... If too many platforms break perhaps we 
should undo the optimisation - see attached patch.



Thanks,
Dimitris


P.S. See also bug #51094; I've attached some more fixes.
=== modified file 'gcc/config/elfos.h'
--- gcc/config/elfos.h  2011-10-30 01:45:46 +
+++ gcc/config/elfos.h  2011-11-12 02:51:39 +
@@ -125,9 +125,6 @@ see the files COPYING3 and COPYING.RUNTI
 }  \
   while (0)
 
-#undef TARGET_ASM_INTERNAL_LABEL
-#define TARGET_ASM_INTERNAL_LABEL default_elf_internal_label
-
 /* Output the label which precedes a jumptable.  Note that for all svr4
systems where we actually generate jumptables (which is to say every
svr4 target except i386, where we use casesi instead) we put the jump-


Re: [PATCH 0/4][CFT] Handle legacy __sync libcalls

2011-11-11 Thread Kaz Kojima
Richard Henderson  wrote:
> These are the targets that used external __sync calls in gcc 4.6.
> I've been intending to test them myself, but since these aren't
> bare *-elf targets, it's taking me some time to get the various
> cross-environments set up.
> 
> Port maintainers, please test.

The SH patch appears to work fine, though I got an ICE when
regtesting:

FAIL: gcc.c-torture/compile/20061005-1.c  -O0  (internal compiler error)

In function 'testc2':
trunk/gcc/testsuite/gcc.c-torture/compile/20061005-1.c:22:3: internal compiler 
error: in emit_move_insn, at expr.c:3438

#0  fancy_abort (file=0x89448fc "../../LOCAL/trunk/gcc/expr.c", line=3438, 
function=0x894553b "emit_move_insn")
at ../../LOCAL/trunk/gcc/diagnostic.c:899
#1  0x082a7e92 in emit_move_insn (x=0xb7de633c, y=0xb7e71970)
at ../../LOCAL/trunk/gcc/expr.c:3437
#2  0x081b8000 in emit_library_call_value_1 (retval=1, orgfun=0xb7e2c770, 
value=0xb7de633c, fn_type=LCT_NORMAL, outmode=QImode, nargs=3, 
p=) at ../../LOCAL/trunk/gcc/calls.c:4103
#3  0x081b8271 in emit_library_call_value (orgfun=0xb7e2c770, 
value=0xb7de633c, fn_type=LCT_NORMAL, outmode=QImode, nargs=3)
at ../../LOCAL/trunk/gcc/calls.c:4184
#4  0x08444ea7 in expand_atomic_compare_and_swap (ptarget_bool=0x0, 
ptarget_oval=0xbfffeba0, mem=0xb7e42654, expected=0xb7de6330, 
desired=0xb7de6318, is_weak=0 '\000', succ_model=MEMMODEL_SEQ_CST, 
fail_model=MEMMODEL_SEQ_CST) at ../../LOCAL/trunk/gcc/optabs.c:7513
#5  0x0818b26f in expand_builtin_compare_and_swap (mode=QImode, 
exp=, is_bool=0 '\000', target=0xb7de633c)
at ../../LOCAL/trunk/gcc/builtins.c:5199

(gdb) fr 1
#1  0x082a7e92 in emit_move_insn (x=0xb7de633c, y=0xb7e71970)
at ../../LOCAL/trunk/gcc/expr.c:3437
3437  gcc_assert (mode != BLKmode

(gdb) call debug_rtx(x)
(const_int 0 [0])

(gdb) fr 6
#6  0x0819d19b in expand_builtin (exp=0xb7e622ec, target=0xb7de633c, 
subtarget=0x0, mode=, ignore=1)
at ../../LOCAL/trunk/gcc/builtins.c:6529
6529  target = expand_builtin_compare_and_swap (mode, exp, false, 
target);

It seems that expand_builtin sets the "target" variable to
const0_rtx when the "ignore" argument is set, and this causes
the above ICE.  I'm trying a patch

--- ORIG/trunk/gcc/optabs.c 2011-11-11 08:00:04.0 +0900
+++ trunk/gcc/optabs.c  2011-11-12 12:34:18.0 +0900
@@ -7440,6 +7440,7 @@ expand_atomic_compare_and_swap (rtx *pta
  just in case we need that path down below.  */
   if (ptarget_oval == NULL
   || (target_oval = *ptarget_oval) == NULL
+  || !register_operand (target_oval, mode)
   || reg_overlap_mentioned_p (expected, target_oval))
 target_oval = gen_reg_rtx (mode);
 
though I'm not sure that this is the right thing to do.

Regards,
kaz


[PATCH 1/3] rs6000: fix*_trunc insns use nonimmediate_operand

2011-11-11 Thread Richard Henderson
From: Richard Henderson 

---
 gcc/config/rs6000/rs6000.md |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 331aa79..93b0b6c 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6787,7 +6787,7 @@
 ; register allocation so that it can allocate the memory slot if it
 ; needed
 (define_insn_and_split "fix_truncsi2_stfiwx"
-  [(set (match_operand:SI 0 "general_operand" "=rm")
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
(fix:SI (match_operand:SFDF 1 "gpc_reg_operand" "d")))
(clobber (match_scratch:DI 2 "=d"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
@@ -6883,7 +6883,7 @@
 }")
 
 (define_insn_and_split "fixuns_truncsi2_stfiwx"
-  [(set (match_operand:SI 0 "general_operand" "=rm")
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
(unsigned_fix:SI (match_operand:SFDF 1 "gpc_reg_operand" "d")))
(clobber (match_scratch:DI 2 "=d"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS &&  && TARGET_FCTIWUZ
-- 
1.7.6.4



[PATCH 0/3] Conversion to __atomic builtins

2011-11-11 Thread Richard Henderson
Well, most of it.

The first patch removes two avoidable warnings in rs6000.md.
It seems like we could avoid many more of the remaining warnings,
but those are harder; this one was obvious.

The second patch fixes a build error.  It has appeared on this
list previously, but has not yet been applied.

The third implements the atomic operations (mostly) as described in

 http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2011.03.04a.html

There are a couple of instances in which the paper doesn't cover the
handling of memory_model_consume, and I made a best guess.  These
are indicated by /* ??? */ markers.  I would be obliged if someone
could verify what's supposed to happen in these cases.  I attempted
to handle them conservatively.

Tested on ppc64-linux, with a reduced set of languages.  I could
not get libjava to build for some reason.  Missing symbols when linking?

Please double-check.


r~


Richard Henderson (3):
  rs6000: fix*_trunc insns use nonimmediate_operand
  ppc-linux: Fix call to _Unwind_SetGRPtr
  rs6000: Rewrite sync patterns for atomic; expand early.

 gcc/config/rs6000/rs6000-protos.h   |   10 +-
 gcc/config/rs6000/rs6000.c  |  675 +-
 gcc/config/rs6000/rs6000.md |6 +-
 gcc/config/rs6000/sync.md   |  705 +--
 libgcc/config/rs6000/linux-unwind.h |2 +-
 5 files changed, 531 insertions(+), 867 deletions(-)

-- 
1.7.6.4



[PATCH 2/3] ppc-linux: Fix call to _Unwind_SetGRPtr

2011-11-11 Thread Richard Henderson
From: Richard Henderson 

---
 libgcc/config/rs6000/linux-unwind.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/libgcc/config/rs6000/linux-unwind.h 
b/libgcc/config/rs6000/linux-unwind.h
index 2011632..13bf413 100644
--- a/libgcc/config/rs6000/linux-unwind.h
+++ b/libgcc/config/rs6000/linux-unwind.h
@@ -368,7 +368,7 @@ frob_update_context (struct _Unwind_Context *context, 
_Unwind_FrameState *fs ATT
 before the bctrl so this is the first and only place
 we need to use the stored R2.  */
  _Unwind_Word sp = _Unwind_GetGR (context, 1);
- _Unwind_SetGRPtr (context, 2, sp + 40);
+ _Unwind_SetGRPtr (context, 2, (void *)(sp + 40));
}
}
 }
-- 
1.7.6.4



[PATCH 3/3] rs6000: Rewrite sync patterns for atomic; expand early.

2011-11-11 Thread Richard Henderson
From: Richard Henderson 

The conversion of the __sync post-reload splitters was half
complete.  Since there are nearly no restrictions on what may
appear between LL and SC, expand all the patterns immediately.
This allows significantly easier code generation for subword
atomic operations.
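
For readers unfamiliar with the subword case, the idea can be sketched in
portable C: operate on the containing aligned 32-bit word, and use a shift
and mask to confine the update to one byte.  This is only an illustration
under stated assumptions.  The retry loop uses a C11 compare-exchange in
place of a real LL/SC pair, the function name subword_fetch_add is
hypothetical, and the lane is passed in directly (0 = least significant
byte) instead of being derived from the address and endianness as the
backend does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add VAL to byte lane LANE (0 = least significant) of *WORD
   and return the lane's previous value.  Other lanes are left untouched.  */
static uint8_t
subword_fetch_add (_Atomic uint32_t *word, unsigned lane, uint8_t val)
{
  unsigned shift;
  uint32_t mask, old, new_word;

  assert (lane < 4);
  shift = lane * 8;
  mask = (uint32_t) 0xff << shift;
  old = atomic_load (word);
  do
    {
      /* Compute the new byte, discarding any carry out of the lane.  */
      uint32_t sum = ((old >> shift) + val) & 0xff;
      new_word = (old & ~mask) | (sum << shift);
    }
  while (!atomic_compare_exchange_weak (word, &old, new_word));
  return (uint8_t) (old >> shift);
}
```

On failure the compare-exchange reloads old, so the loop recomputes the
merged word from the latest contents, which mirrors retrying after a failed
store-conditional.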
---
 gcc/config/rs6000/rs6000-protos.h |   10 +-
 gcc/config/rs6000/rs6000.c|  675 ++--
 gcc/config/rs6000/rs6000.md   |2 +-
 gcc/config/rs6000/sync.md |  705 ++---
 4 files changed, 528 insertions(+), 864 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 23d2d2a..af4c954 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -103,13 +103,9 @@ extern rtx rs6000_emit_set_const (rtx, enum machine_mode, 
rtx, int);
 extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
 extern int rs6000_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx);
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
-extern void rs6000_emit_sync (enum rtx_code, enum machine_mode,
- rtx, rtx, rtx, rtx, bool);
-extern void rs6000_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
-extern void rs6000_split_compare_and_swap (rtx, rtx, rtx, rtx, rtx);
-extern void rs6000_expand_compare_and_swapqhi (rtx, rtx, rtx, rtx);
-extern void rs6000_split_compare_and_swapqhi (rtx, rtx, rtx, rtx, rtx, rtx);
-extern void rs6000_split_lock_test_and_set (rtx, rtx, rtx, rtx);
+extern void rs6000_expand_atomic_compare_and_swap (rtx op[]);
+extern void rs6000_expand_atomic_exchange (rtx op[]);
+extern void rs6000_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 extern void rs6000_emit_swdiv (rtx, rtx, rtx, bool);
 extern void rs6000_emit_swrsqrt (rtx, rtx);
 extern void output_toc (FILE *, rtx, int, enum machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 89b79ab..65ed6e4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17132,199 +17132,6 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx 
op0, rtx op1)
 emit_move_insn (dest, target);
 }
 
-/* Emit instructions to perform a load-reserved/store-conditional operation.
-   The operation performed is an atomic
-   (set M (CODE:MODE M OP))
-   If not NULL, BEFORE is atomically set to M before the operation, and
-   AFTER is set to M after the operation (that is, (CODE:MODE M OP)).
-   If SYNC_P then a memory barrier is emitted before the operation.
-   Either OP or M may be wrapped in a NOT operation.  */
-
-void
-rs6000_emit_sync (enum rtx_code code, enum machine_mode mode,
- rtx m, rtx op, rtx before_param, rtx after_param,
- bool sync_p)
-{
-  enum machine_mode used_mode;
-  rtx the_op, set_before, set_after, set_atomic, cc_scratch, before, after;
-  rtx used_m;
-  rtvec vec;
-  HOST_WIDE_INT imask = GET_MODE_MASK (mode);
-  rtx shift = NULL_RTX;
-
-  if (sync_p)
-emit_insn (gen_lwsync ());
-
-used_m = m;
-
-  /* If this is smaller than SImode, we'll have to use SImode with
- adjustments.  */
-  if (mode == QImode || mode == HImode)
-{
-  rtx newop, oldop;
-
-  if (MEM_ALIGN (used_m) >= 32)
-   {
- int ishift = 0;
- if (BYTES_BIG_ENDIAN)
-   ishift = GET_MODE_BITSIZE (SImode) - GET_MODE_BITSIZE (mode);
-
- shift = GEN_INT (ishift);
- used_m = change_address (used_m, SImode, 0);
-   }
-  else
-   {
- rtx addrSI, aligned_addr;
- int shift_mask = mode == QImode ? 0x18 : 0x10;
-
- addrSI = gen_lowpart_common (SImode,
-  force_reg (Pmode, XEXP (used_m, 0)));
- addrSI = force_reg (SImode, addrSI);
- shift = gen_reg_rtx (SImode);
-
- emit_insn (gen_rlwinm (shift, addrSI, GEN_INT (3),
-GEN_INT (shift_mask)));
- emit_insn (gen_xorsi3 (shift, shift, GEN_INT (shift_mask)));
-
- aligned_addr = expand_binop (Pmode, and_optab,
-  XEXP (used_m, 0),
-  GEN_INT (-4), NULL_RTX,
-  1, OPTAB_LIB_WIDEN);
- used_m = change_address (used_m, SImode, aligned_addr);
- set_mem_align (used_m, 32);
-   }
-  /* It's safe to keep the old alias set of USED_M, because
-the operation is atomic and only affects the original
-USED_M.  */
-  m = used_m;
-
-  if (GET_CODE (op) == NOT)
-   {
- oldop = lowpart_subreg (SImode, XEXP (op, 0), mode);
- oldop = gen_rtx_NOT (SImode, oldop);
-   }
-  else
-   oldop = lowpart_subreg (SImode, op, mode);
-
-  switch (code)
-   {
-   case IOR:
-   case XOR:
- newop = expand_binop (SImode, and_optab,
-   oldop, GEN_INT (imask), NULL_RTX,
- 

Re: [Patch Darwin/Ada] work around PR target/50678

2011-11-11 Thread Eric Botcazou
> This has been filed as radar #10302855, but we need a work-around
> until that is resolved (possibly forever on older systems).
>
> OK for trunk?
> (what opinion about 4.6?)

Did you apply it to the 4.6 branch?  I think that this would be appropriate.

> ada:
>
>   PR target/50678
>   * init.c (Darwin/__gnat_error_handler): Transpose rbx and rdx in the
>   handler.

Sorry, I overlooked something here: there is a specific procedure for making
this kind of adjustment.  The reason is that the adjustment needs to be made
in the tasking case as well, and __gnat_error_handler isn't used for that case.

So HAVE_GNAT_ADJUST_CONTEXT_FOR_RAISE must be defined in the Darwin-specific 
section and __gnat_adjust_context_for_raise implemented (with the standard 
prototype) and __gnat_error_handler changed to call it instead.

Would you mind adjusting the fix that way?  Thanks in advance.

-- 
Eric Botcazou