Go patch committed: Copy channel implementation from master library

2011-12-01 Thread Ian Lance Taylor
This patch copies the channel implementation from the master Go library
to the gccgo library.  This is a follow-on patch to the earlier change to
multiplex goroutines onto threads.  With this patch channels now use the
goroutine scheduler directly, rather than taking up a thread by using a
condition variable.  This required changing the runtime calls that the
compiler generates for select statements.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian



foo.patch.bz2
Description: patch


Re: [PATCH] Avoid messages about non-existent gnatls command (PR bootstrap/51201)

2011-12-01 Thread Arnaud Charlet
> := assigned variable gets evaluated right away, including the case when
> host doesn't have any Ada compiler installed.  In that case we remove ada
> from enabled languages, but still RTS_DIR is sometimes computed.

Can you elaborate here? When is RTS_DIR computed if Ada is not enabled?
That seems surprising to me, so I'd like to understand more before OKing
this patch.

Arno


Re: [PATCH] Remove dead labels to increase superblock scope

2011-12-01 Thread Tom de Vries
On 27/11/11 23:59, Eric Botcazou wrote:
>> No, DELETED_LABEL notes still work just fine. It depends on how you
>> remove the label and replace it with a note, and Tom isn't showing
>> what he did, so...
> 
> I agree that there is no obvious reason why just calling delete_insn would
> not work, so this should be investigated first.
> 

It didn't work because, after turning a label into a
NOTE_INSN_DELETED_LABEL, one needs to move it after the NOTE_INSN_BASIC_BLOCK,
as in cfgcleanup.c:try_optimize_cfg():
...
  delete_insn_chain (label, label, false);
  /* If the case label is undeletable, move it after the
     BASIC_BLOCK note.  */
  if (NOTE_KIND (BB_HEAD (b)) == NOTE_INSN_DELETED_LABEL)
    {
      rtx bb_note = NEXT_INSN (BB_HEAD (b));

      reorder_insns_nobb (label, label, bb_note);
      BB_HEAD (b) = bb_note;
      if (BB_END (b) == bb_note)
        BB_END (b) = label;
    }
...

Attached patch factors out this piece of code and reuses it in 
fixup_reorder_chain.
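
As a rough illustration of what the factored-out helper does, here is a
self-contained toy model of the fix-up.  It is only a sketch: struct node,
struct block, move_after and toy_fixup_deleted_label are hypothetical
stand-ins for the insn chain, basic_block, reorder_insns_nobb and
fixup_deleted_label, not GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's insn chain; none of these are GCC types.  */
enum kind { DELETED_LABEL_NOTE, BASIC_BLOCK_NOTE, INSN };

struct node
{
  enum kind kind;
  struct node *prev, *next;
};

struct block
{
  struct node *head, *end;
};

/* Unlink N from the chain and re-insert it right after AFTER.  */
static void
move_after (struct node *n, struct node *after)
{
  if (n->prev)
    n->prev->next = n->next;
  if (n->next)
    n->next->prev = n->prev;

  n->prev = after;
  n->next = after->next;
  if (after->next)
    after->next->prev = n;
  after->next = n;
}

/* If BB starts with a deleted-label note, move that note after the
   basic-block note and update the block boundaries.  */
static void
toy_fixup_deleted_label (struct block *bb)
{
  struct node *label = bb->head, *bb_note;

  if (label == NULL || label->kind != DELETED_LABEL_NOTE)
    return;

  bb_note = label->next;
  assert (bb_note != NULL && bb_note->kind == BASIC_BLOCK_NOTE);

  move_after (label, bb_note);
  bb->head = bb_note;
  if (bb->end == bb_note)
    bb->end = label;
}
```

For a chain "deleted label, block note, insn", the call leaves the block
note at the head of the block with the deleted-label note right behind it.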

Bootstrapped and reg-tested on x86_64.

OK for next stage1?

Thanks,
- Tom

2011-12-01  Tom de Vries  

* cfgcleanup.c (fixup_deleted_label): New function, factored out of ...
(try_optimize_cfg): Use fixup_deleted_label.
* cfglayout.c (fixup_reorder_chain): Delete unused label, and fixup
using fixup_deleted_label.

* gcc.dg/superblock.c: New test.
Index: gcc/cfgcleanup.c
===
--- gcc/cfgcleanup.c (revision 181652)
+++ gcc/cfgcleanup.c (working copy)
@@ -2518,6 +2518,27 @@ trivially_empty_bb_p (basic_block bb)
 }
 }
 
+/* Move a DELETED_LABEL note after the BASIC_BLOCK note of BB.  */
+
+void
+fixup_deleted_label (basic_block bb)
+{
+  rtx deleted_label = BB_HEAD (bb), bb_note;
+
+  if (deleted_label == NULL_RTX
+      || !NOTE_P (deleted_label)
+      || NOTE_KIND (deleted_label) != NOTE_INSN_DELETED_LABEL)
+    return;
+
+  bb_note = NEXT_INSN (deleted_label);
+  gcc_assert (bb_note != NULL_RTX && NOTE_INSN_BASIC_BLOCK_P (bb_note));
+
+  reorder_insns_nobb (deleted_label, deleted_label, bb_note);
+  BB_HEAD (bb) = bb_note;
+  if (BB_END (bb) == bb_note)
+    BB_END (bb) = deleted_label;
+}
+
 /* Do simple CFG optimizations - basic block merging, simplifying of jump
instructions etc.  Return nonzero if changes were made.  */
 
@@ -2637,15 +2658,7 @@ try_optimize_cfg (int mode)
 		  delete_insn_chain (label, label, false);
 		  /* If the case label is undeletable, move it after the
 		 BASIC_BLOCK note.  */
-		  if (NOTE_KIND (BB_HEAD (b)) == NOTE_INSN_DELETED_LABEL)
-		{
-		  rtx bb_note = NEXT_INSN (BB_HEAD (b));
-
-		  reorder_insns_nobb (label, label, bb_note);
-		  BB_HEAD (b) = bb_note;
-		  if (BB_END (b) == bb_note)
-			BB_END (b) = label;
-		}
+		  fixup_deleted_label (b);
 		  if (dump_file)
 		fprintf (dump_file, "Deleted label in block %i.\n",
 			 b->index);
Index: gcc/cfglayout.c
===
--- gcc/cfglayout.c (revision 181652)
+++ gcc/cfglayout.c (working copy)
@@ -857,6 +857,12 @@ fixup_reorder_chain (void)
    (e_taken->src, e_taken->dest));
 		  e_taken->flags |= EDGE_FALLTHRU;
 		  update_br_prob_note (bb);
+		  if (LABEL_NUSES (ret_label) == 0
+		  && single_pred_p (e_taken->dest))
+		{
+		  delete_insn (ret_label);
+		  fixup_deleted_label (e_taken->dest);
+		}
 		  continue;
 		}
 	}
Index: gcc/basic-block.h
===
--- gcc/basic-block.h (revision 181652)
+++ gcc/basic-block.h (working copy)
@@ -813,6 +813,7 @@ extern void rtl_make_eh_edge (sbitmap, b
 enum replace_direction { dir_none, dir_forward, dir_backward, dir_both };
 
 /* In cfgcleanup.c.  */
+extern void fixup_deleted_label (basic_block);
 extern bool cleanup_cfg (int);
 extern int flow_find_cross_jump (basic_block, basic_block, rtx *, rtx *,
  enum replace_direction*);
Index: gcc/testsuite/gcc.dg/superblock.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/superblock.c (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-asynchronous-unwind-tables -fsched2-use-superblocks -fdump-rtl-sched2 -fdump-rtl-bbro" } */
+
+typedef int aligned __attribute__ ((aligned (64)));
+extern void abort (void);
+
+int bar (void *p);
+
+void
+foo (void)
+{
+  char *p = __builtin_alloca (13);
+  aligned i;
+
+  if (bar (p) || bar (&i))
+    abort ();
+}
+
+/* { dg-final { scan-rtl-dump-times "0 uses" 0 "bbro"} } */
+/* { dg-final { scan-rtl-dump-times "ADVANCING TO" 2 "sched2"} } */
+/* { dg-final { cleanup-rtl-dump "bbro" } } */
+/* { dg-final { cleanup-rtl-dump "sche

Re: [PATCH] Avoid messages about non-existent gnatls command (PR bootstrap/51201)

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 09:08:27AM +0100, Arnaud Charlet wrote:
> > := assigned variable gets evaluated right away, including the case when
> > host doesn't have any Ada compiler installed.  In that case we remove ada
> > from enabled languages, but still RTS_DIR is sometimes computed.
> 
> Can you elaborate here? When is RTS_DIR computed if Ada is not enabled?
> That seems surprising to me, so I'd like to understand more before OKing
> this patch.

As written in the PR, even if you run, say on x86_64-linux,
../configure --enable-languages=c --target=i686-pc-linux-gnu
you end up with:
CONFIG_LANGUAGES =  c++ lto
LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
...
LANG_MAKEFRAGS =  $(srcdir)/ada/gcc-interface/Make-lang.in 
$(srcdir)/cp/Make-lang.in $(srcdir)/fortran/Make-lang.in 
$(srcdir)/go/Make-lang.in $(srcdir)/java/Make-lang.in 
$(srcdir)/lto/Make-lang.in $(srcdir)/objc/Make-lang.in 
$(srcdir)/objcp/Make-lang.in
...
# per-language makefile fragments
ifneq ($(LANG_MAKEFRAGS),)
include $(LANG_MAKEFRAGS)
endif

in gcc/Makefile.  When Ada isn't installed on the host, you end
up with something like:
...
checking whether g++ accepts -g... yes
checking for alloca... yes
checking for x86_64-unknown-linux-gnu-gnatbind... no
checking for x86_64-unknown-linux-gnu-gnatmake... no
checking whether compiler driver understands Ada... no
checking how to run the C preprocessor... yes
gcc -E
checking for ANSI C header files... (cached) yes
...
config.status: creating ada/gcc-interface/Makefile
config.status: creating ada/Makefile
config.status: creating auto-host.h
config.status: executing default commands
/bin/sh: gnatls: command not found
make[2]: Entering directory `/usr/src/gcc/obj3/gcc'
/bin/sh ../../gcc/../mkinstalldirs po
/bin/sh ../../gcc/../mkinstalldirs po
/bin/sh ../../gcc/../mkinstalldirs po
...
in the output.  That is because ada/gcc-interface/Make-lang.in is included
even when no goals from it are actually made, and merely including
that makefile already evaluates
ifeq ($(build), $(host))
  ifeq ($(host), $(target))
...
  else
...
RTS_DIR:=$(strip $(subst \,/,$(shell gnatls -v | grep adalib )))
...
  endif
else
...
endif

$(build) and $(host) are equal here, while $(host) and $(target) are not,
and thus the RTS_DIR variable is initialized from the command, regardless
of whether the host GNAT was detected or is present.
By changing that variable to a deferred one:
RTS_DIR=$(strip $(subst \,/,$(shell gnatls -v | grep adalib )))
no command is run when the makefile fragment is included;
instead the command is executed when the variable is actually used:
ADA_TOOLS_FLAGS_TO_PASS=\
CC="$(CC)" \
$(COMMON_FLAGS_TO_PASS) $(ADA_FLAGS_TO_PASS) \
ADA_INCLUDES="-I$(RTS_DIR)../adainclude -I$(RTS_DIR)" \
GNATMAKE="gnatmake" \
GNATBIND="gnatbind" \
GNATLINK="gnatlink" \
LIBGNAT=""
...
gnattools: $(GCC_PARTS) $(CONFIG_H) prefix.o force
$(MAKE) -C ada $(ADA_TOOLS_FLAGS_TO_PASS) gnattools1
$(MAKE) -C ada $(ADA_TOOLS_FLAGS_TO_PASS) gnattools2

For the gnattools target, this means that with the patch gnatls is
invoked 4 times, twice for the gnattools1 subcommand and twice
for the gnattools2 subcommand, but it is evaluated before spawning
the submake (i.e. we don't run gnatls per command line).

Jakub


Re: [PATCH] Avoid messages about non-existent gnatls command (PR bootstrap/51201)

2011-12-01 Thread Arnaud Charlet
> As written in the PR, even if you on say x86_64-linux
> ../configure --enable-languages=c --target=i686-pc-linux-gnu
> you end up with:
> CONFIG_LANGUAGES =  c++ lto
> LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
> ...
> LANG_MAKEFRAGS =  $(srcdir)/ada/gcc-interface/Make-lang.in
> $(srcdir)/cp/Make-lang.in $(srcdir)/fortran/Make-lang.in
> $(srcdir)/go/Make-lang.in $(srcdir)/java/Make-lang.in
> $(srcdir)/lto/Make-lang.in $(srcdir)/objc/Make-lang.in
> $(srcdir)/objcp/Make-lang.in

Do we know why LANG_MAKEFRAGS contains all the disabled languages?
Is that a feature/requirement or a bug/limitation?

If the latter, I'd rather fix the issue there if possible.

Arno


Re: [PATCH] Avoid messages about non-existent gnatls command (PR bootstrap/51201)

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 09:50:15AM +0100, Arnaud Charlet wrote:
> > As written in the PR, even if you on say x86_64-linux
> > ../configure --enable-languages=c --target=i686-pc-linux-gnu
> > you end up with:
> > CONFIG_LANGUAGES =  c++ lto
> > LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
> > ...
> > LANG_MAKEFRAGS =  $(srcdir)/ada/gcc-interface/Make-lang.in
> > $(srcdir)/cp/Make-lang.in $(srcdir)/fortran/Make-lang.in
> > $(srcdir)/go/Make-lang.in $(srcdir)/java/Make-lang.in
> > $(srcdir)/lto/Make-lang.in $(srcdir)/objc/Make-lang.in
> > $(srcdir)/objcp/Make-lang.in
> 
> Do we know why LANG_MAKEFRAGS contains all the disabled languages?
> > Is that a feature/requirement or a bug/limitation?
> 
> If the latter, I'd rather fix the issue there if possible.

I believe it is because we want
cd obj/gcc
make f951
to work even with --enable-languages=c,c++; it is quite handy
not to have to reconfigure gcc because you forgot one language, especially
when you have lots of cross-compilers around.
If LANG_MAKEFRAGS were limited to only the chosen languages,
this would suddenly not work at all.

Jakub


[Patch, fortran] Don't call stat before fopen

2011-12-01 Thread Janne Blomqvist
Hi,

there's no need to stat() the filename before we fopen() it in
gfc_open_file(). If the file doesn't exist, fopen() will return NULL
anyway, and if the user really wants to read source or module data
from a special file, so be it.
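
A minimal sketch of the point, assuming a POSIX-like host: fopen() already
returns NULL for a missing file, so a preceding stat() adds nothing for
that case.  The helper name open_source_file is hypothetical; it only
mirrors the simplified logic and is not the gfortran function itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the simplified logic: an empty name
   means stdin; otherwise fopen reports failure by returning NULL.  */
static FILE *
open_source_file (const char *name)
{
  assert (name != NULL);

  if (!*name)
    return stdin;

  return fopen (name, "r");
}
```

Calling it with an empty string yields stdin, and calling it with a path
that does not exist yields NULL, with no stat() call needed.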

Committed to trunk as obvious.

2011-12-01  Janne Blomqvist  

* misc.c (gfc_open_file): Don't call stat.


Index: misc.c
===
--- misc.c  (revision 181874)
+++ misc.c  (working copy)
@@ -58,17 +58,9 @@ gfc_clear_ts (gfc_typespec *ts)
 FILE *
 gfc_open_file (const char *name)
 {
-  struct stat statbuf;
-
   if (!*name)
 return stdin;

-  if (stat (name, &statbuf) < 0)
-return NULL;
-
-  if (!S_ISREG (statbuf.st_mode))
-return NULL;
-
   return fopen (name, "r");
 }


-- 
Janne Blomqvist


Re: [PATCH] Avoid messages about non-existent gnatls command (PR bootstrap/51201)

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 09:55:04AM +0100, Jakub Jelinek wrote:
> I believe because we want
> cd obj/gcc
> make f951
> work even when --enable-languages=c,c++ , it is quite handy
> not having to reconfigure gcc because you forgot one language, especially
> when you have lots of cross-compilers around.
> If LANG_MAKEFRAGS would be limited only to the chosen languages,
> this would suddenly not work at all.

http://gcc.gnu.org/ml/gcc-patches/2006-01/msg02100.html
is the change that went into 4.2.

Jakub


Re: [PATCH] Avoid messages about non-existent gnatls command (PR bootstrap/51201)

2011-12-01 Thread Arnaud Charlet
> > when you have lots of cross-compilers around.
> > If LANG_MAKEFRAGS would be limited only to the chosen languages,
> > this would suddenly not work at all.
> 
> http://gcc.gnu.org/ml/gcc-patches/2006-01/msg02100.html
> is the change that went into 4.2.

OK well, maybe the above patch is partly OBE because c++ is now required,
but I guess other people might argue similarly for other languages.

I personally don't think that's such a good idea, and reconfiguring gcc
is needed frequently anyway, and as we can see in this discussion, this does
have unwanted side effects.

This specific patch is "OK", but if we end up finding out that more of such
hacks are needed, then we'll have to revisit the whole issue and find
another solution.

Same thing if we find out that your patch is causing unexpected side
effects, we'll have to revert the patch.

Arno


Re: [C++ Patch] PR 51327

2011-12-01 Thread Paolo Carlini

On 12/01/2011 07:13 AM, Jason Merrill wrote:
> Ah, I see.  I guess what we want here is the GCC 4.5 version of
> locate_ctor instead of the new one; once we've checked that we have a
> default ctor and no user-provided default ctor, there must be a unique
> defaulted ctor so just walking CLASSTYPE_CONSTRUCTORS is correct.  And
> then we can call maybe_explain_implicit_delete if it's deleted.

Ok, thanks for the guidance, I'll work along those lines.

Paolo.


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Torvald Riegel
On Wed, 2011-11-30 at 21:41 -0500, David Edelsohn wrote:
> On Wed, Nov 30, 2011 at 8:05 PM, Richard Henderson  wrote:
> > This is a tad rough, but not too bad.
> 
> Cool.
> 
> Maybe I don't understand what they are suppose to represent, but why
> the choice of values for cacheline size?  Is that suppose to be a
> value chosen by ITM or suppose to be the hardware cacheline used as
> the granularity for transactions?

CACHELINE_SIZE is supposed to be the size of hardware cachelines so
that we can add proper padding to shared variables to avoid false
sharing.

It also was used as the granularity of transactional access by some TM
methods that aren't part of libitm currently, but might be revived in
the future.
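
To make the padding use concrete, here is a small sketch.  The value 64 is
only an assumption for illustration (real cacheline sizes are
target-specific), and struct padded_counter, counters and is_line_aligned
are hypothetical names, not libitm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size for illustration; real values are target-specific.  */
#define CACHELINE_SIZE 64

/* Aligning the member to a full line also pads the struct out to a
   full line, so neighbouring array elements never share one.  */
struct padded_counter
{
  unsigned long value __attribute__ ((aligned (CACHELINE_SIZE)));
};

/* Two counters meant to be updated by different threads.  */
static struct padded_counter counters[2];

/* Return nonzero if P starts on a cacheline boundary.  */
static int
is_line_aligned (const void *p)
{
  return ((uintptr_t) p % CACHELINE_SIZE) == 0;
}
```

With this layout counters[0] and counters[1] start exactly CACHELINE_SIZE
bytes apart, so a write to one does not invalidate the line holding the
other.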


Torvald



Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Torvald Riegel
On Wed, 2011-11-30 at 17:05 -0800, Richard Henderson wrote:
> Oh, for the record, I think we should probably be saving and restoring the fp 
> state on all targets.  If we restart a transaction, we're really saying that 
> absolutely nothing happened.  Something like
> 
>   double a, b, c;
>   __transaction_atomic { a = b+c; }
> 
> shouldn't erroneously set the overflow flag if the first iteration of the 
> transaction generates an infinity but the final iteration doesn't.  The x86 
> port is currently wrong for this, as is the port I just posted for ARM, but I 
> make the attempt here.

The ABI defines the pr_hasNoFloatUpdate and pr_hasNoVectorUpdate flags
for _ITM_beginTransaction but we don't handle these currently.  I guess
we should do the save/restore unless those flags are set?

How difficult would it be to set these flags if there is no float/vector
update?  (I guess inter-procedural analysis could be sufficient as a first
step.)


Torvald



Use atomics in remaining libgomp/config/linux sources

2011-12-01 Thread Alan Modra
This converts the remaining files in libgomp/config/linux/ to use
atomics.  gomp_init_thread_affinity fairly obviously needs no
barriers;  I wasn't so sure about ptrlock, so tried without the
acquire/release barriers and found a number of the loop tests failed.
So it seems the usual lock barriers are needed.  I also fixed a
slightly confusing use of compare_and_swap in gomp_ptrlock_set,
followed by an unconditional store in gomp_ptrlock_set_slow.  If we're
going to write to the thing anyway, we may as well not use the compare
form.

Bootstrapped etc. powerpc-linux.  I did see a libgomp testsuite
failure when testing this (pr51376), but virgin gcc also has the same
failure.

* config/linux/affinity.c: Use atomic rather than sync builtin.
* config/linux/lock.c: Likewise.
* config/linux/ptrlock.h: Likewise.
* config/linux/ptrlock.c: Likewise.
* config/linux/ptrlock.h (gomp_ptrlock_set): Always write here..
* config/linux/ptrlock.c (gomp_ptrlock_set_slow): ..not here.
* config/linux/futex.h (atomic_write_barrier): Delete unused function.
* config/linux/alpha/futex.h (atomic_write_barrier): Likewise.
* config/linux/ia64/futex.h (atomic_write_barrier): Likewise.
* config/linux/mips/futex.h (atomic_write_barrier): Likewise.
* config/linux/powerpc/futex.h (atomic_write_barrier): Likewise.
* config/linux/s390/futex.h (atomic_write_barrier): Likewise.
* config/linux/sparc/futex.h (atomic_write_barrier): Likewise.
* config/linux/x86/futex.h (atomic_write_barrier): Likewise.

Index: libgomp/config/linux/affinity.c
===
--- libgomp/config/linux/affinity.c (revision 181830)
+++ libgomp/config/linux/affinity.c (working copy)
@@ -109,7 +109,7 @@ gomp_init_thread_affinity (pthread_attr_
   unsigned int cpu;
   cpu_set_t cpuset;
 
-  cpu = __sync_fetch_and_add (&affinity_counter, 1);
+  cpu = __atomic_fetch_add (&affinity_counter, 1, MEMMODEL_RELAXED);
   cpu %= gomp_cpu_affinity_len;
   CPU_ZERO (&cpuset);
   CPU_SET (gomp_cpu_affinity[cpu], &cpuset);
Index: libgomp/config/linux/lock.c
===
--- libgomp/config/linux/lock.c (revision 181830)
+++ libgomp/config/linux/lock.c (working copy)
@@ -1,4 +1,4 @@
-/* Copyright (C) 2005, 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2005, 2008, 2009, 2011 Free Software Foundation, Inc.
Contributed by Richard Henderson .
 
This file is part of the GNU OpenMP Library (libgomp).
@@ -62,7 +62,10 @@ gomp_unset_lock_30 (omp_lock_t *lock)
 int
 gomp_test_lock_30 (omp_lock_t *lock)
 {
-  return __sync_bool_compare_and_swap (lock, 0, 1);
+  int oldval = 0;
+
+  return __atomic_compare_exchange_n (lock, &oldval, 1, false,
+ MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
 }
 
 void
@@ -104,11 +107,14 @@ int
 gomp_test_nest_lock_30 (omp_nest_lock_t *lock)
 {
   void *me = gomp_icv (true);
+  int oldval;
 
   if (lock->owner == me)
 return ++lock->count;
 
-  if (__sync_bool_compare_and_swap (&lock->lock, 0, 1))
+  oldval = 0;
+  if (__atomic_compare_exchange_n (&lock->lock, &oldval, 1, false,
+  MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
 {
   lock->owner = me;
   lock->count = 1;
@@ -184,8 +190,9 @@ gomp_set_nest_lock_25 (omp_nest_lock_25_
 
   while (1)
 {
-  otid = __sync_val_compare_and_swap (&lock->owner, 0, tid);
-  if (otid == 0)
+  otid = 0;
+  if (__atomic_compare_exchange_n (&lock->owner, &otid, tid, false,
+  MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
{
  lock->count = 1;
  return;
@@ -207,7 +214,7 @@ gomp_unset_nest_lock_25 (omp_nest_lock_2
 
   if (--lock->count == 0)
 {
-  __sync_lock_release (&lock->owner);
+  __atomic_store_n (&lock->owner, 0, MEMMODEL_RELEASE);
   futex_wake (&lock->owner, 1);
 }
 }
@@ -217,8 +224,9 @@ gomp_test_nest_lock_25 (omp_nest_lock_25
 {
   int otid, tid = gomp_tid ();
 
-  otid = __sync_val_compare_and_swap (&lock->owner, 0, tid);
-  if (otid == 0)
+  otid = 0;
+  if (__atomic_compare_exchange_n (&lock->owner, &otid, tid, false,
+  MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
 {
   lock->count = 1;
   return 1;
Index: libgomp/config/linux/ptrlock.h
===
--- libgomp/config/linux/ptrlock.h  (revision 181830)
+++ libgomp/config/linux/ptrlock.h  (working copy)
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
Contributed by Jakub Jelinek .
 
This file is part of the GNU OpenMP Library (libgomp).
@@ -24,7 +24,14 @@
 
 /* This is a Linux specific implementation of a mutex synchronization
mechanism for libgomp.  This type is private 

Re: [PATCH] Fix up VEC_INTERLEAVE_*_EXPR folding and expansion for big endian (PR tree-optimization/51074)

2011-12-01 Thread Richard Guenther
On Tue, 22 Nov 2011, Jakub Jelinek wrote:

> Hi!
> 
> VEC_INTERLEAVE_*_EXPR trees are unfortunately dependent on BYTES_BIG_ENDIAN,
> what is HIGH vs. LOW is different based on endianity.

Huh, that looks bogus.  Both tree codes operate on registers and no
other codes care about "endianness" of vector registers.
(What about VEC_WIDEN_LSHIFT_{HI,LO}_EXPR?)

Can't we simply push the difference to expansion time?  Or even later?

Richard.

> The only place that creates these in the IL is:
>   if (BYTES_BIG_ENDIAN)
> {
>   high_code = VEC_INTERLEAVE_HIGH_EXPR;
>   low_code = VEC_INTERLEAVE_LOW_EXPR;
> }
>   else
> {
>   low_code = VEC_INTERLEAVE_HIGH_EXPR;
>   high_code = VEC_INTERLEAVE_LOW_EXPR;
> }
>   perm_stmt = gimple_build_assign_with_ops (high_code, perm_dest,
> vect1, vect2);
> ...
> so either folding (and expansion if only vec_perm* is supported) needs to
> be adjusted as done in the patch below, or we'd need to rename them
> to VEC_INTERLEAVE_{FIRST,SECOND}_EXPR or similar and adjust all the patterns
> etc.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the
> testcase using powerpc cross.  Ok for trunk?
> 
> 2011-11-22  Jakub Jelinek  
> 
>   PR tree-optimization/51074
>   * fold-const.c (fold_binary_loc): Fix up VEC_INTERLEAVE_*_EXPR
>   handling for BYTES_BIG_ENDIAN.
>   * optabs.c (can_vec_perm_for_code_p): Likewise.
> 
>   * gcc.dg/vect/pr51074.c: New test.
> 
> --- gcc/fold-const.c.jj   2011-11-21 16:22:02.0 +0100
> +++ gcc/fold-const.c  2011-11-22 09:59:15.606739333 +0100
> @@ -13483,10 +13483,12 @@ fold_binary_loc (location_t loc,
>   sel[i] = i * 2 + 1;
>   break;
> case VEC_INTERLEAVE_HIGH_EXPR:
> - sel[i] = (i + nelts) / 2 + ((i & 1) ? nelts : 0);
> + sel[i] = (i + (BYTES_BIG_ENDIAN ? 0 : nelts)) / 2
> +  + ((i & 1) ? nelts : 0);
>   break;
> case VEC_INTERLEAVE_LOW_EXPR:
> - sel[i] = i / 2 + ((i & 1) ? nelts : 0);
> + sel[i] = (i + (BYTES_BIG_ENDIAN ? nelts : 0)) / 2
> +  + ((i & 1) ? nelts : 0);
>   break;
> default:
>   gcc_unreachable ();
> --- gcc/optabs.c.jj   2011-11-21 16:22:02.0 +0100
> +++ gcc/optabs.c  2011-11-22 10:17:04.820399126 +0100
> @@ -6932,9 +6932,9 @@ can_vec_perm_for_code_p (enum tree_code
> break;
>  
>   case VEC_INTERLEAVE_HIGH_EXPR:
> -   alt = nelt / 2;
> -   /* FALLTHRU */
>   case VEC_INTERLEAVE_LOW_EXPR:
> +   if ((BYTES_BIG_ENDIAN != 0) ^ (code == VEC_INTERLEAVE_HIGH_EXPR))
> + alt = nelt / 2;
> for (i = 0; i < nelt / 2; ++i)
>   {
> data[i * 2] = i + alt;
> --- gcc/testsuite/gcc.dg/vect/pr51074.c.jj 2011-11-22 10:22:44.247377928 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr51074.c 2011-11-22 10:22:16.0 +0100
> @@ -0,0 +1,24 @@
> +/* PR tree-optimization/51074 */
> +
> +#include "tree-vect.h"
> +
> +struct S { int a, b; } s[8];
> +
> +int
> +main ()
> +{
> +  int i;
> +  check_vect ();
> +  for (i = 0; i < 8; i++)
> +{
> +  s[i].b = 0;
> +  s[i].a = i;
> +}
> +  asm volatile ("" : : : "memory");
> +  for (i = 0; i < 8; i++)
> +if (s[i].b != 0 || s[i].a != i)
> +  abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> 
>   Jakub
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Richard Guenther
On Wed, 23 Nov 2011, Diego Novillo wrote:

> On Sat, Nov 5, 2011 at 07:02, Iain Sandoe
>  wrote:
> >
> > On 28 Oct 2011, at 13:57, Richard Guenther wrote:
> >
> >>
> >> We fail to keep the cannot-inline flag up-to-date when turning
> >> indirect to direct calls.  The following patch arranges to do
> >> this during statement folding (which should always be called
> >> when that happens).  It also makes sure to copy the updated flag
> >> to the edge when iterating early inlining.
> >
> > This: http://gcc.gnu.org/ml/gcc-cvs/2011-11/msg00046.html
> >
> > regresses:
> > acats/c740203a (x86-64-darwin10)
> > gnat/aliasing3.adb  (m64 i486-darwin9 and x86-64-darwin10)
> > ... don't know about other platforms at present.
> 
> I am also seeing a regression in some C++ code, specifically, this
> call to gimple_call_set_cannot_inline() is not updating the
> call_stmt_cannot_inline_p field in the corresponding call graph edge
> 
> !   if (callee
> !   && !gimple_check_call_matching_types (stmt, callee))
> ! gimple_call_set_cannot_inline (stmt, true);
> 
> In this code I'm trying to build, we fail the assertion in can_inline_edge_p:
> 
>   /* Be sure that the cannot_inline_p flag is up to date.  */
>   gcc_checking_assert (!e->call_stmt
>|| (gimple_call_cannot_inline_p (e->call_stmt)
>== e->call_stmt_cannot_inline_p)
> 
> because gimple_fold_call did not update the inline flag on the edge.
> 
> I grepped for calls to gimple_call_set_cannot_inline() and we don't
> always bother to update the corresponding edge.  I think the safest
> approach here would be to make sure that we always do (patch below).
> 
> Thoughts?

Ick.

Well.  Which pass makes the flag change and why are edges not
recomputed before inlining (they are, always!?).

Well.  It's a hack we have the flag duplicated.  But the reason
is we throw away the cgraph edges all the time (bah!) and at WPA
time we don't have the stmt to lookup the flag.

I'd rather remove the asserts than fixing up like this (btw, the
inliner can handle all mismatches now).

Richard.

> 
> Diego.
> 
> diff --git a/gcc/gimple.c b/gcc/gimple.c
> index 071c651..e2b082a 100644
> --- a/gcc/gimple.c
> +++ b/gcc/gimple.c
> @@ -5558,4 +5558,31 @@ gimple_asm_clobbers_memory_p (const_gimple stmt)
> 
>return false;
>  }
> +
> +
> +/* Set the inlinable status of GIMPLE_CALL S to INLINABLE_P.  */
> +
> +void
> +gimple_call_set_cannot_inline (gimple s, bool inlinable_p)
> +{
> +  bool prev_inlinable_p;
> +
> +  GIMPLE_CHECK (s, GIMPLE_CALL);
> +
> +  prev_inlinable_p = gimple_call_cannot_inline_p (s);
> +
> +  if (inlinable_p)
> +s->gsbase.subcode |= GF_CALL_CANNOT_INLINE;
> +  else
> +s->gsbase.subcode &= ~GF_CALL_CANNOT_INLINE;
> +
> +  if (prev_inlinable_p != inlinable_p)
> +{
> +  struct cgraph_node *n = cgraph_get_node (current_function_decl);
> +  struct cgraph_edge *e = cgraph_edge (n, s);
> +  if (e)
> +   e->call_stmt_cannot_inline_p = inlinable_p;
> +}
> +}
> +
>  #include "gt-gimple.h"
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index 8536c70..df31bf3 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -1035,6 +1035,7 @@ extern bool walk_stmt_load_store_ops (gimple, void *,
>  extern bool gimple_ior_addresses_taken (bitmap, gimple);
>  extern bool gimple_call_builtin_p (gimple, enum built_in_function);
>  extern bool gimple_asm_clobbers_memory_p (const_gimple);
> +extern void gimple_call_set_cannot_inline (gimple, bool);
> 
>  /* In gimplify.c  */
>  extern tree create_tmp_var_raw (tree, const char *);
> @@ -2343,19 +2344,6 @@ gimple_call_tail_p (gimple s)
>  }
> 
> 
> -/* Set the inlinable status of GIMPLE_CALL S to INLINABLE_P.  */
> -
> -static inline void
> -gimple_call_set_cannot_inline (gimple s, bool inlinable_p)
> -{
> -  GIMPLE_CHECK (s, GIMPLE_CALL);
> -  if (inlinable_p)
> -s->gsbase.subcode |= GF_CALL_CANNOT_INLINE;
> -  else
> -s->gsbase.subcode &= ~GF_CALL_CANNOT_INLINE;
> -}
> -
> -
>  /* Return true if GIMPLE_CALL S cannot be inlined.  */
> 
>  static inline bool
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [PATCH] Fix up VEC_INTERLEAVE_*_EXPR folding and expansion for big endian (PR tree-optimization/51074)

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 10:53:57AM +0100, Richard Guenther wrote:
> On Tue, 22 Nov 2011, Jakub Jelinek wrote:
> > VEC_INTERLEAVE_*_EXPR trees are unfortunately dependent on BYTES_BIG_ENDIAN,
> > what is HIGH vs. LOW is different based on endianity.
> 
> Huh, that looks bogus.  Both tree codes operate on registers and no
> other codes care about "endianness" of vector registers.
> (What about VEC_WIDEN_LSHIFT_{HI,LO}_EXPR?)
> 
> Can't we simply push the difference to expansion time?  Or even later?

As RTH said, the best fix is to remove VEC_INTERLEAVE_*_EXPR altogether
and just use VEC_PERM_EXPR always, it is redundant with that.  But that
might be too invasive for 4.8.
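
The redundancy is visible in the fold-const.c hunk quoted earlier in this
thread: an interleave is just a fixed permutation selector over the two
concatenated input vectors.  The standalone sketch below computes that
selector with the same formula; enum interleave and interleave_selector
are hypothetical names, and the endianness is passed as a parameter
instead of using BYTES_BIG_ENDIAN.

```c
#include <assert.h>

enum interleave { INTERLEAVE_HIGH, INTERLEAVE_LOW };

/* Fill SEL[0..NELTS-1] with indices into the concatenation of two
   NELTS-element vectors, following the formula from the patch.  */
static void
interleave_selector (enum interleave code, int big_endian,
                     unsigned nelts, unsigned char *sel)
{
  unsigned i, offset;

  assert (nelts % 2 == 0);

  if (code == INTERLEAVE_HIGH)
    offset = big_endian ? 0 : nelts;
  else
    offset = big_endian ? nelts : 0;

  for (i = 0; i < nelts; i++)
    sel[i] = (i + offset) / 2 + ((i & 1) ? nelts : 0);
}
```

For four elements on a little-endian target this gives { 2, 6, 3, 7 } for
HIGH and { 0, 4, 1, 5 } for LOW; on a big-endian target the two selectors
swap.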

Jakub


[C++ Patch] PR 51326

2011-12-01 Thread Paolo Carlini

Hi,

In this ICE on invalid, a 4.7 regression, we ICE at the beginning of
build_user_type_conversion_1 because expr is NULL_TREE. The function is
called as such from reference_binding which, in 4.6, instead called
convert_class_to_reference, which does check for a NULL_TREE
expr. Thus I think adjusting build_user_type_conversion_1 likewise
should be fine. Tested x86_64-linux.


Thanks,
Paolo.

//
/cp
2011-12-01  Paolo Carlini  

PR c++/51326
* call.c (build_user_type_conversion_1): Early return NULL if
expr is NULL_TREE.

/testsuite
2011-12-01  Paolo Carlini  

PR c++/51326
* g++.dg/inherit/crash3.C: New.

Index: testsuite/g++.dg/inherit/crash3.C
===
--- testsuite/g++.dg/inherit/crash3.C   (revision 0)
+++ testsuite/g++.dg/inherit/crash3.C   (revision 0)
@@ -0,0 +1,11 @@
+// PR c++/51326
+
+struct A
+{
+  virtual int& foo(); // { dg-error "overriding" }
+};
+
+struct B : A
+{
+  B& foo();   // { dg-error "conflicting return type" }
+};
Index: cp/call.c
===
--- cp/call.c   (revision 181875)
+++ cp/call.c   (working copy)
@@ -3373,7 +3373,7 @@ static struct z_candidate *
 build_user_type_conversion_1 (tree totype, tree expr, int flags)
 {
   struct z_candidate *candidates, *cand;
-  tree fromtype = TREE_TYPE (expr);
+  tree fromtype;
   tree ctors = NULL_TREE;
   tree conv_fns = NULL_TREE;
   conversion *conv = NULL;
@@ -3382,6 +3382,11 @@ build_user_type_conversion_1 (tree totype, tree ex
   bool any_viable_p;
   int convflags;
 
+  if (!expr)
+return NULL;
+
+  fromtype = TREE_TYPE (expr);
+
   /* We represent conversion within a hierarchy using RVALUE_CONV and
  BASE_CONV, as specified by [over.best.ics]; these become plain
  constructor calls, as specified in [dcl.init].  */


Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Richard Guenther
On Tue, 29 Nov 2011, Diego Novillo wrote:

> Iain, could you let me know if the attached patch fixes your problem?
> The patch changes gimple_call_set_cannot_inline to update the
> corresponding callgraph edge, if needed.  I did not touch any of the
> other calls, because sometimes we are calling this function in IPA
> mode, and so we don't know what function the call belongs to.
> 
> I've tested it on x86_64.  I will be committing it shortly.

Btw, I don't think this "hammer" solution should be applied (what
about the other direction?).

Richard.


Re: [PATCH 2/5] arm: Emit swp for pre-armv6.

2011-12-01 Thread Richard Earnshaw

Sorry, no.

It's essential we don't emit SWP instructions directly into code on any
platform where that code may possibly be run on a later core that has
LDREX/STREX.  If we do that we'll end up with a mess that can't be resolved.

In the absence of known OS helper functions the only solution to this is
to call library helper functions that can be replaced once to fix the
whole application (including any shared libraries if applicable).  There
cannot be a mix of the two strategies.

I also think that GCC should NOT provide those helper functions, though
we should probably write a document describing how a user might do so.

The model we've been using on Linux is, I think, the only viable one: if
we have LDREX/STREX in the instruction set, then we use it directly;
otherwise we call a library function.  The system developer must then
ensure that their helper function does the "right thing" to make all the
code work.
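
As a rough illustration of that model (a sketch only, not code from GCC
or from any OS): the macro EXAMPLE_HAVE_LDREX and both function names
below are made up for this example, and the fallback body is a
single-threaded placeholder standing in for whatever the system
developer would really provide.

```c
#include <assert.h>

/* Hypothetical out-of-line helper.  On a real pre-ARMv6 system the
   system developer would supply this once, for the application and
   all of its shared libraries.  This placeholder body is NOT atomic;
   it only lets the sketch link and run single-threaded.  */
unsigned int
example_atomic_exchange_helper (volatile unsigned int *ptr,
                                unsigned int newval)
{
  unsigned int old = *ptr;
  *ptr = newval;
  return old;
}

/* Exchange *PTR with NEWVAL and return the previous value.  The choice
   between the two strategies is made once, at compile time, so the two
   are never mixed within one build.  */
unsigned int
example_atomic_exchange (volatile unsigned int *ptr, unsigned int newval)
{
  assert (ptr != 0);
#ifdef EXAMPLE_HAVE_LDREX
  /* Exclusive load/store is available: let the compiler expand the
     operation inline.  */
  return __atomic_exchange_n (ptr, newval, __ATOMIC_SEQ_CST);
#else
  /* Otherwise always go through the single replaceable helper.  */
  return example_atomic_exchange_helper (ptr, newval);
#endif
}
```

The point of the sketch is only that every exchange funnels through one
symbol, so fixing that symbol fixes the whole program.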

R.

On 01/12/11 00:44, Richard Henderson wrote:
> ---
>  gcc/config/arm/arm.h   |6 
>  gcc/config/arm/sync.md |   63 
> +++-
>  2 files changed, 68 insertions(+), 1 deletions(-)
> 
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 31f4856..33e5b8e 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -276,6 +276,12 @@ extern void 
> (*arm_lang_output_object_attributes_hook)(void);
>  /* Nonzero if this chip implements a memory barrier instruction.  */
>  #define TARGET_HAVE_MEMORY_BARRIER (TARGET_HAVE_DMB || TARGET_HAVE_DMB_MCR)
>  
> +/* Nonzero if this chip supports swp and swpb.  These are technically present
> +   post-armv6, but deprecated.  Never use it if we have OS support, as swp is
> +   not well-defined on SMP systems.  */
> +#define TARGET_HAVE_SWP \
> +  (TARGET_ARM && arm_arch4 && !arm_arch6 && arm_abi != ARM_ABI_AAPCS_LINUX)
> +
>  /* Nonzero if this chip supports ldrex and strex */
>  #define TARGET_HAVE_LDREX((arm_arch6 && TARGET_ARM) || arm_arch7)
>  
> diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
> index 124ebf0..72e7181 100644
> --- a/gcc/config/arm/sync.md
> +++ b/gcc/config/arm/sync.md
> @@ -26,6 +26,10 @@
> (DI "TARGET_HAVE_LDREXD && ARM_DOUBLEWORD_ALIGN
>   && TARGET_HAVE_MEMORY_BARRIER")])
>  
> +(define_mode_attr swp_predtab
> +  [(QI "TARGET_HAVE_SWP") (HI "false")
> +   (SI "TARGET_HAVE_SWP") (DI "false")])
> +
>  (define_code_iterator syncop [plus minus ior xor and])
>  
>  (define_code_attr sync_optab
> @@ -132,7 +136,41 @@
>  DONE;
>})
>  
> -(define_insn_and_split "atomic_exchange"
> +(define_expand "atomic_exchange"
> +  [(match_operand:QHSD 0 "s_register_operand" "")
> +   (match_operand:QHSD 1 "mem_noofs_operand" "")
> +   (match_operand:QHSD 2 "s_register_operand" "r")
> +   (match_operand:SI 3 "const_int_operand" "")]
> +  " || "
> +{
> +  if ()
> +emit_insn (gen_atomic_exchange_rex (operands[0], operands[1],
> +   operands[2], operands[3]));
> +  else
> +{
> +  /* Memory barriers are introduced in armv6, which also gains the
> +  ldrex insns.  Therefore we can ignore the memory model argument
> +  when issuing a SWP instruction.  */
> +  gcc_checking_assert (!TARGET_HAVE_MEMORY_BARRIER);
> +
> +  if (mode == QImode)
> + {
> +   rtx x = gen_reg_rtx (SImode);
> +  emit_insn (gen_atomic_exchangeqi_swp (x, operands[1], 
> operands[2]));
> +   emit_move_insn (operands[0], gen_lowpart (QImode, x));
> + }
> +  else if (mode == SImode)
> + {
> +   emit_insn (gen_atomic_exchangesi_swp
> +  (operands[0], operands[1], operands[2]));
> + }
> +  else
> + gcc_unreachable ();
> +}
> +  DONE;
> +})
> +
> +(define_insn_and_split "atomic_exchange_rex"
>[(set (match_operand:QHSD 0 "s_register_operand" "=&r");; output
>   (match_operand:QHSD 1 "mem_noofs_operand" "+Ua"))   ;; memory
> (set (match_dup 1)
> @@ -152,6 +190,29 @@
>  DONE;
>})
>  
> +(define_insn "atomic_exchangeqi_swp"
> +  [(set (match_operand:SI 0 "s_register_operand" "=&r")  ;; 
> output
> + (zero_extend:SI
> +   (match_operand:QI 1 "mem_noofs_operand" "+Ua")))  ;; memory
> +   (set (match_dup 1)
> + (unspec_volatile:QI
> +   [(match_operand:QI 2 "s_register_operand" "r")]   ;; input
> +   VUNSPEC_ATOMIC_XCHG))]
> +  "TARGET_HAVE_SWP"
> +  "swpb%?\t%0, %2, %C1"
> +  [(set_attr "predicable" "yes")])
> +
> +(define_insn "atomic_exchangesi_swp"
> +  [(set (match_operand:SI 0 "s_register_operand" "=&r")  ;; 
> output
> + (match_operand:SI 1 "mem_noofs_operand" "+Ua")) ;; memory
> +   (set (match_dup 1)
> + (unspec_volatile:SI
> +   [(match_operand:SI 2 "s_register_operand" "r")]   ;; input
> +   VUNSPEC_ATOMIC_XCHG))]
> +  "TARGET_HAVE_SWP"
> +  "swp%?\t%0, %2, %C1"
> +  [(set_attr "predicable" "yes")])
> +
>  (define_mode_attr atomic_op_operand
>[(QI "r

Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Richard Guenther
On Tue, 29 Nov 2011, Diego Novillo wrote:

> On Tue, Nov 29, 2011 at 12:49, H.J. Lu  wrote:
> 
> > This caused:
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51346
> 
> Thanks.  I'm on it.

The patch was wrong, please revert it.  At the gimple stmt
modification level we shouldn't modify the cgraph.  That's
a layering violation at least.

Please file a bug with a reduced testcase that still fails
without your fix.

Richard.


Re: Tidy up MD_INCLUDES in config/arm/t-arm

2011-12-01 Thread Georg-Johann Lay
Richard Earnshaw wrote:
> On 29/11/11 09:42, Matthew Gretton-Dann wrote:
>> All,
>>
>> Whilst developing the Cortex-A15 integer pipeline patch it was noted 
>> that the MD_INCLUDES variable in config/arm/t-arm has not been kept 
>> up-to-date.
>>
>> The attached patch fixes this, and rearranges the list of md files into 
>> alphabetical order.
>>
>> The list was generated using `ls -1 *.md | grep -v arm\\.md`.
>>
>> Tested by doing a arm-none-eabi build.
>>
>> Can someone please review, and if appropriate apply?
>>
>> Thanks,
>>
>> Matt
>>
>> gcc/ChangeLog:
>> 2011-11-29  Matthew Gretton-Dann  
>>
>>  * config/arm/t-arm (MD_INCLUDES): Ensure all md files are
>>  listed.
>>
> 
> OK.
> 
> R.

Is each entry mandatory in that list?

I thought the gen tools already arrange for a great part of MD_INCLUDES?

For example, after adding (include "avr-dimode.md") to avr.md, ./gcc/mddeps.mk
reads:

MD_INCLUDES = \
../../../gcc.gnu.org/trunk/gcc/config/avr/predicates.md \
../../../gcc.gnu.org/trunk/gcc/config/avr/constraints.md \
../../../gcc.gnu.org/trunk/gcc/config/avr/avr-dimode.md

../../../gcc.gnu.org/trunk/gcc/config/avr/predicates.md:

../../../gcc.gnu.org/trunk/gcc/config/avr/constraints.md:

../../../gcc.gnu.org/trunk/gcc/config/avr/avr-dimode.md:

so that maintaining such a list might be considerably easier.

Johann





Re: [PATCH] Fix up VEC_INTERLEAVE_*_EXPR folding and expansion for big endian (PR tree-optimization/51074)

2011-12-01 Thread Richard Guenther
On Thu, 1 Dec 2011, Jakub Jelinek wrote:

> On Thu, Dec 01, 2011 at 10:53:57AM +0100, Richard Guenther wrote:
> > On Tue, 22 Nov 2011, Jakub Jelinek wrote:
> > > VEC_INTERLEAVE_*_EXPR trees are unfortunately dependent on 
> > > BYTES_BIG_ENDIAN,
> > > what is HIGH vs. LOW is different based on endianity.
> > 
> > Huh, that looks bogus.  Both tree codes operate on registers and no
> > other codes care about "endianness" of vector registers.
> > (What about VEC_WIDEN_LSHIFT_{HI,LO}_EXPR?)
> > 
> > Can't we simply push the difference to expansion time?  Or even later?
> 
> As RTH said, the best fix is to remove VEC_INTERLEAVE_*_EXPR altogether
> and just use VEC_PERM_EXPR always, it is redundant with that.  But that
> might be too invasive for 4.8.

Yes, sorry - I'm recovering from a 3 week e-mail lag ;)  I agree
using VEC_PERM_EXPR would be best - but that would also affect
backend patterns.  Can we have a middle-ground that leaves those
untouched?  We're still in stage 3, so fixing the bug with using
VEC_PERM_EXPR sounds appealing to me ;)

Thanks,
Richard.


Re: [PATCH 4/5] arm: Set predicable on more instructions.

2011-12-01 Thread Ramana Radhakrishnan
Hi Richard,



> ---
>  gcc/config/arm/arm.md |   40 ++--
>  1 files changed, 30 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index b01343c..3b24627 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md

<...snip>

>  ; By splitting (IOR (AND (NOT A) (NOT B)) C) as D = AND (IOR A B) (NOT C),
>  ; (NOT D) we can sometimes merge the final NOT into one of the following
> @@ -4702,7 +4719,8 @@
>                         (const_int 0)))]
>   "TARGET_32BIT"
>   "tst\\t%0, #255"
> -  [(set_attr "conds" "set")]
> +  [(set_attr "conds" "set")
> +   (set_attr "predicable" "yes")]
>  )

It should be tst%? . Otherwise in the predicable case we wouldn't have
the condition code printed out.

Hmmm that looks like it's been latent for a while.

>
>  (define_expand "extendhisi2"
> @@ -7458,7 +7476,8 @@
>    cmn%?\\t%0, #%n1"
>   [(set_attr "conds" "set")
>    (set_attr "arch" "t2,t2,any,any")
> -   (set_attr "length" "2,2,4,4")]
> +   (set_attr "length" "2,2,4,4")
> +   (set_attr "predicable" "yes")]
>  )
>
>  (define_insn "*cmpsi_shiftsi"
> @@ -7499,7 +7518,8 @@
>   [(set_attr "conds" "set")
>    (set (attr "type") (if_then_else (match_operand 3 "const_int_operand" "")
>                                    (const_string "alu_shift")
> -                                   (const_string "alu_shift_reg")))]
> +                                   (const_string "alu_shift_reg")))
> +   (set_attr "predicable" "yes")]
>  )
>


Otherwise looks OK to me.

Ramana


Re: [Patch,AVR] Light-weight DImode implementation.

2011-12-01 Thread Georg-Johann Lay
Denis Chertykov wrote:

 The only question that remains is what the -m64 option should be like?
 
 [ ] Omit it altogether
 [ ] Leave it as is (off per default)
 [ ] Set it on per default

 As soon as the direction is clear, I'll post a follow-up patch to
 add the missing bits like, e.g., documentation for the new switch.
>>> I'll leave the decision to Denis, but I'm for omitting it.
>> I will also defer to Denis, but I'd rather avoid having another option,
>> if we can. Keep it simple for the users.

It might also be a hidden option like -morder2 that is on by default.
Such a thing is nice for developers to play with :-)

 > I'm agree with Richard. I'm for omitting it.
> 
> Denis.

So here is a follow-up patch atop
http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02136.html
to remove the -m64 option.

The variable avr_have_dimode is still there but is always set to true in avr.c,
and avr.opt remains unchanged.

Ok?

gcc/
* config/avr/avr-dimode.md: New file.
* config/avr/avr.md: Include it.
(adjust_len): Add plus64, compare64.
(HIDI): Remove code iterator.
(code_stdname): New code attribute.
(rotx, rotsmode): Remove DI.
(rotl3, *rotw, *rotb): Use HISI instead of HIDI
as code iterator.
* config/avr/avr-protos.h (avr_have_dimode): New.
(avr_out_plus64, avr_out_compare64): New.
* config/avr/avr.c (avr_out_compare): Handle DImode.
(avr_have_dimode): New variable definition and initialization.
(avr_out_compare64, avr_out_plus64): New functions.
(avr_out_plus_1): Use simplify_unary_operation to negate xval.
(adjust_insn_length): Handle ADJUST_LEN_COMPARE64, ADJUST_LEN_PLUS64.
(avr_compare_pattern): Skip DImode comparisons.

libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _adddi3, _adddi3_s8,
_subdi3, _cmpdi2, _cmpdi2_s8, _rotldi3.
* config/avr/lib1funcs.S (__adddi3, __adddi3_s8, __subdi3,
__cmpdi2, __cmpdi2_s8, __rotldi3): New functions.

diff --git a/config/avr/avr-protos.h b/config/avr/avr-protos.h
index fd00a4e..a95e611 100644
--- a/config/avr/avr-protos.h
+++ b/config/avr/avr-protos.h
@@ -132,6 +132,8 @@ extern bool avr_xload_libgcc_p (enum machine_mode);
 extern void asm_output_float (FILE *file, REAL_VALUE_TYPE n);
 #endif
 
+extern bool avr_have_dimode;
+
 /* From avr-log.c */
 
 #define avr_edump (avr_log_set_caller_e (__FUNCTION__))
diff --git a/config/avr/avr.c b/config/avr/avr.c
index 551d7c6..1806ac8 100644
--- a/config/avr/avr.c
+++ b/config/avr/avr.c
@@ -145,6 +145,9 @@ static const char * const progmem_section_prefix[6] =
 ".progmem5.data"
   };
 
+/* Condition for insns/expanders from avr-dimode.md.  */
+bool avr_have_dimode = true;
+
 /* To track if code will use .bss and/or .data.  */
 bool avr_need_clear_bss_p = false;
 bool avr_need_copy_data_p = false;
diff --git a/config/avr/avr.opt b/config/avr/avr.opt
index eaa8df5..bb9c90e 100644
--- a/config/avr/avr.opt
+++ b/config/avr/avr.opt
@@ -77,7 +77,3 @@ When accessing RAM, use X as imposed by the hardware, i.e. just use pre-decremen
 mbranch-cost=
 Target Report RejectNegative Joined UInteger Var(avr_branch_cost) Init(0)
 Set the cost of a branch instruction.  Default value is 0.
-
-m64
-Target Report Var(avr_have_dimode) Init(0)
-Experimental.


PR libgomp/51376 fix

2011-12-01 Thread Alan Modra
The simple one-line fix in GOMP_taskwait took many hours to find.
Shared memory problems are a pain to debug, especially when adding
code to dump some state turns a testcase that fails every hundred or
so runs into one that takes thousands of times longer to fail.

What happens here is that GOMP_taskwait is called in the parent thread
some time after gomp_barrier_handle_tasks has run in the child to the
point of clearing the parent's "children" field.  However, since there
is no acquire barrier in the parent and the child may or may not have
reached the release barrier in the mutex unlock, the memory stores in
the child are not guaranteed to be seen in order in the parent thread.
Thus the parent can see "task->children" clear but not yet see stores
done as part of the real work of the child, ie. to "a" and "n" in the
testcase.
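
A minimal sketch of the locked approach, assuming a toy spinlock in
place of the team's task_lock (all example_* names are made up here and
this is not libgomp code): because the parent only reads "children"
after acquiring the same lock the child released, it also sees the
child's earlier stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the team task_lock: a minimal spinlock.  */
struct example_lock { bool locked; };

void
example_lock_acquire (struct example_lock *l)
{
  /* Acquire ordering: later accesses cannot move before the lock.  */
  while (__atomic_test_and_set (&l->locked, __ATOMIC_ACQUIRE))
    ;
}

void
example_lock_release (struct example_lock *l)
{
  /* Release ordering: earlier stores are visible to the next owner.  */
  __atomic_clear (&l->locked, __ATOMIC_RELEASE);
}

struct example_task
{
  struct example_lock lock;
  void *children;   /* Non-NULL while a child is still running.  */
  int result;       /* Stands in for the child's "real work" stores.  */
};

/* Child side: do the work, then clear "children" under the lock.  */
void
example_child_finish (struct example_task *parent, int value)
{
  assert (parent != NULL);
  parent->result = value;
  example_lock_acquire (&parent->lock);
  parent->children = NULL;
  example_lock_release (&parent->lock);
}

/* Parent side: only look at "children" inside the locked region.  */
bool
example_children_done (struct example_task *parent)
{
  bool done;

  assert (parent != NULL);
  example_lock_acquire (&parent->lock);
  done = parent->children == NULL;
  example_lock_release (&parent->lock);
  return done;
}
```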

The GOMP_task change fixes a similar potential problem.  Bootstrapped
and regression tested powerpc-linux.  OK to apply?

PR libgomp/51376
* task.c (GOMP_taskwait): Don't access task->children outside of
task_lock mutex region.
(GOMP_task): Likewise.

Index: libgomp/task.c
===
--- libgomp/task.c  (revision 181833)
+++ libgomp/task.c  (working copy)
@@ -116,12 +116,12 @@ GOMP_task (void (*fn) (void *), void *da
}
   else
fn (data);
-  if (task.children)
-   {
- gomp_mutex_lock (&team->task_lock);
- gomp_clear_parent (task.children);
- gomp_mutex_unlock (&team->task_lock);
-   }
+  if (team != NULL)
+   gomp_mutex_lock (&team->task_lock);
+  if (task.children != NULL)
+   gomp_clear_parent (task.children);
+  if (team != NULL)
+   gomp_mutex_unlock (&team->task_lock);
   gomp_end_task ();
 }
   else
@@ -290,8 +290,9 @@ GOMP_taskwait (void)
   struct gomp_task *child_task = NULL;
   struct gomp_task *to_free = NULL;
 
-  if (task == NULL || task->children == NULL)
+  if (task == NULL)
 return;
+
   gomp_mutex_lock (&team->task_lock);
   while (1)
 {

-- 
Alan Modra
Australia Development Lab, IBM


Re: [Patch.AVR,4.6] Fix PR51002

2011-12-01 Thread Georg-Johann Lay
Joerg Wunsch wrote:
> As Georg-Johann Lay wrote:
> 
>> Then avr-mcus.def adopted this bug from the manual for ATtiny4313 at least:
>>
>> AVR_MCU ("attiny4313", ARCH_AVR25, "__AVR_ATtiny4313__", 1 /* short_sp, 
>> should
>> be 0 ? */, 0, 0x0060, "tn4313")
> 
> Not unlikely.
> 
> I just ordered one.  Hopefully, it will be here by tomorrow, so I can
> test it on a live device.

Jörg, do you have an easy way to review avr-mcus.def?

http://gcc.gnu.org/viewcvs/trunk/gcc/config/avr/avr-mcus.def?content-type=text%2Fplain&view=co

If you have XML hardware descriptions, which are more accurate and much easier
to use than 1E2...1E3 PDFs, that might not be too painful.

The column in question is the one after the built-in define, such as
"__AVR_AT90S2313__", i.e. the 4th column.

Johann


Re: PR libgomp/51376 fix

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 09:58:08PM +1030, Alan Modra wrote:
> The GOMP_task change fixes a similar potential problem.  Bootstrapped
> and regression tested powerpc-linux.  OK to apply?
> 
>   PR libgomp/51376
>   * task.c (GOMP_taskwait): Don't access task->children outside of
>   task_lock mutex region.
>   (GOMP_task): Likewise.

Can't this be solved just by adding a barrier?  The access to the var
outside of the lock has been quite intentional, to avoid locking in the
common case where there are no children.
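
Roughly what I have in mind, as a sketch only (the example_* names are
made up, this is not the libgomp code): the unlocked check stays, but it
becomes an acquire load, and it only helps if the side that clears the
field does so with release semantics after finishing its other stores.

```c
#include <assert.h>
#include <stddef.h>

struct example_task
{
  void *children;   /* Non-NULL while a child is still running.  */
  int result;       /* Stands in for the child's "real work" stores.  */
};

/* Child side: the release store orders the earlier store to "result"
   before the clearing of "children".  */
void
example_child_finish (struct example_task *parent, int value)
{
  assert (parent != NULL);
  parent->result = value;
  __atomic_store_n (&parent->children, NULL, __ATOMIC_RELEASE);
}

/* Parent side: lock-free fast path.  If the acquire load sees NULL,
   the child's earlier stores are visible too and the lock can be
   skipped.  */
int
example_no_children_p (struct example_task *task)
{
  return task == NULL
         || __atomic_load_n (&task->children, __ATOMIC_ACQUIRE) == NULL;
}
```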

> --- libgomp/task.c(revision 181833)
> +++ libgomp/task.c(working copy)
> @@ -116,12 +116,12 @@ GOMP_task (void (*fn) (void *), void *da
>   }
>else
>   fn (data);
> -  if (task.children)
> - {
> -   gomp_mutex_lock (&team->task_lock);
> -   gomp_clear_parent (task.children);
> -   gomp_mutex_unlock (&team->task_lock);
> - }
> +  if (team != NULL)
> + gomp_mutex_lock (&team->task_lock);
> +  if (task.children != NULL)
> + gomp_clear_parent (task.children);
> +  if (team != NULL)
> + gomp_mutex_unlock (&team->task_lock);
>gomp_end_task ();
>  }
>else
> @@ -290,8 +290,9 @@ GOMP_taskwait (void)
>struct gomp_task *child_task = NULL;
>struct gomp_task *to_free = NULL;
>  
> -  if (task == NULL || task->children == NULL)
> +  if (task == NULL)
>  return;
> +
>gomp_mutex_lock (&team->task_lock);
>while (1)
>  {

Jakub


Re: [RFA/ARM][Patch 01/02]: Thumb2 epilogue in RTL

2011-12-01 Thread Sameera Deshpande
On Tue, 2011-11-22 at 10:37 +, Ramana Radhakrishnan wrote:

> Xinyu: I seem to have mis-remembered that one of your patches was
> turning on Thumb2 for wMMX.
> >
> > Ramana, in that case, should I add the change you suggested in ARM RTL
> > epilogue patch only?
> 
> The comment in Thumb2 epilogues should remain and yes - it should be
> added to the ARM RTL epilogue patch only. I'm also ok with that being
> in with a #if 0 around it but given it's in the epilogue whoever tries
> turning on Thumb2 for iwMMX will surely notice that in the first
> testrun :)

Ramana,

Please find attached updated patch which sets CFA_RESTORE note for
single register pop and fixing new ICEs in check-gcc at trunk.

The patch is tested with check-gcc, bootstrap and check-gdb without
regression.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 23a29c6..2c38883 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -65,6 +65,7 @@ extern int thumb1_legitimate_address_p (enum machine_mode, rtx, int);
 extern int arm_const_double_rtx (rtx);
 extern int neg_const_double_rtx_ok_for_fpa (rtx);
 extern int vfp3_const_double_rtx (rtx);
+extern bool load_multiple_operation_p (rtx, bool, enum machine_mode, bool);
 extern int neon_immediate_valid_for_move (rtx, enum machine_mode, rtx *, int *);
 extern int neon_immediate_valid_for_logic (rtx, enum machine_mode, int, rtx *,
 	   int *);
@@ -176,10 +177,13 @@ extern int arm_float_words_big_endian (void);
 
 /* Thumb functions.  */
 extern void arm_init_expanders (void);
-extern const char *thumb_unexpanded_epilogue (void);
+extern const char *thumb1_unexpanded_epilogue (void);
 extern void thumb1_expand_prologue (void);
 extern void thumb1_expand_epilogue (void);
 extern const char *thumb1_output_interwork (void);
+extern void thumb2_expand_epilogue (void);
+extern void thumb2_output_return (rtx);
+extern void thumb2_expand_return (void);
 #ifdef TREE_CODE
 extern int is_called_in_ARM_mode (tree);
 #endif
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e3b0b88..40c8b44 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8906,6 +8906,139 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse,
 #undef CHECK
 }
 
+/* Return true if OP is a valid load multiple operation for MODE mode.
+   CONSECUTIVE is true if the registers in the operation must form
+   a consecutive sequence in the register bank.  STACK_ONLY is true
+   if the base register must be the stack pointer.  RETURN_PC is true
+   if value is to be loaded in PC.  */
+bool
+load_multiple_operation_p (rtx op, bool consecutive, enum machine_mode mode,
+   bool return_pc)
+{
+  HOST_WIDE_INT count = XVECLEN (op, 0);
+  unsigned dest_regno, first_dest_regno;
+  rtx src_addr;
+  HOST_WIDE_INT i = 1, base = 0;
+  HOST_WIDE_INT offset = 0;
+  rtx elt;
+  bool addr_reg_loaded = false;
+  bool update = false;
+  int reg_increment, regs_per_val;
+  int offset_adj;
+
+  /* If DFmode, we must be asking for consecutive,
+ since fldmdd can only do consecutive regs.  */
+  gcc_assert ((mode != DFmode) || consecutive);
+
+  /* Set up the increments and the regs per val based on the mode.  */
+  reg_increment = GET_MODE_SIZE (mode);
+  regs_per_val = mode == DFmode ? 2 : 1;
+  offset_adj = return_pc ? 1 : 0;
+
+  if (count <= 1
+  || GET_CODE (XVECEXP (op, 0, offset_adj)) != SET
+  || !REG_P (SET_DEST (XVECEXP (op, 0, offset_adj))))
+return false;
+
+  /* Check to see if this might be a write-back.  */
+  elt = XVECEXP (op, 0, offset_adj);
+  if (GET_CODE (SET_SRC (elt)) == PLUS)
+{
+  i++;
+  base = 1;
+  update = true;
+
+  /* The offset adjustment should be same as number of registers being
+ popped * size of single register.  */
+  if (!REG_P (SET_DEST (elt))
+  || !REG_P (XEXP (SET_SRC (elt), 0))
+  || !CONST_INT_P (XEXP (SET_SRC (elt), 1))
+  || INTVAL (XEXP (SET_SRC (elt), 1)) !=
+  ((count - 1 - offset_adj) * reg_increment))
+return false;
+}
+
+  i = i + offset_adj;
+  base = base + offset_adj;
+  /* Perform a quick check so we don't blow up below.  */
+  if (GET_CODE (XVECEXP (op, 0, i - 1)) != SET
+  || !REG_P (SET_DEST (XVECEXP (op, 0, i - 1)))
+  || !MEM_P (SET_SRC (XVECEXP (op, 0, i - 1))))
+return false;
+
+  /* If only one reg being loaded, success depends on the type:
+ FLDMDD can do just one reg, LDM must do at least two.  */
+  if (count <= i)
+return mode == DFmode ? true : false;
+
+  first_dest_regno = REGNO (SET_DEST (XVECEXP (op, 0, i - 1)));
+  dest_regno = first_dest_regno;
+
+  src_addr = XEXP (SET_SRC (XVECEXP (op, 0, i - 1)), 0);
+
+  if (GET_CODE (src_addr) == PLUS)
+{
+  if (!CONST_INT_P (XEXP (src_addr, 1)))
+return false;
+  offset = INTVAL (XEXP (src_addr, 1));
+  src_addr = XEXP (src_addr, 0);
+}
+
+  if (!REG_P (src_addr))
+return false;
+
+  /* T

Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Diego Novillo
On Thu, Dec 1, 2011 at 05:59, Richard Guenther  wrote:
> On Tue, 29 Nov 2011, Diego Novillo wrote:
>
>> On Tue, Nov 29, 2011 at 12:49, H.J. Lu  wrote:
>>
>> > This caused:
>> >
>> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51346
>>
>> Thanks.  I'm on it.
>
> The patch was wrong, please revert it.  At the gimple stmt
> modification level we shouldn't modify the cgraph.  That's
> a layering violation at least.

No, this is a pre-existing problem that got aggravated with the new
changes to the inline attribute in fold.  I think we need to either
toss out the edge attribute or make it such that they are more
automatically sync'd.  Unfortunately, we cannot get rid of it, since
we sometimes do not have the statement.

So, we have to live with the two attributes.  How about, we make the
edge attribute always dependent on the statement?  If the statement
exists, the edge attribute always takes its value from it.  Only when
the statement doesn't exist, we take its value from the call.  All
this can be put into a small predicate.
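
Something along these lines, as a sketch with made-up types (the
example_* structs only stand in for the real statement and cgraph edge
and are not GCC's data structures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a call statement carrying the flag.  */
struct example_call_stmt
{
  bool cannot_inline;
};

/* Hypothetical stand-in for a call graph edge.  CALL_STMT may be NULL
   when the statement is not available.  */
struct example_call_edge
{
  struct example_call_stmt *call_stmt;
  bool cannot_inline;
};

/* Return true if the call represented by E must not be inlined.  The
   statement, when present, is the single source of truth; the flag on
   the edge is consulted only when there is no statement.  */
bool
example_edge_cannot_inline_p (const struct example_call_edge *e)
{
  assert (e != NULL);
  if (e->call_stmt != NULL)
    return e->call_stmt->cannot_inline;
  return e->cannot_inline;
}
```

With every reader going through one predicate, a stale edge flag cannot
disagree with the statement as long as the statement exists.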

> Please file a bug with a reduced testcase that still fails
> without your fix.

I'll add a test to the final patch after it finishes reducing (the
original test case is huge).


Diego.


Re: [PATCH] Implement stap probe on ARM's unwinder

2011-12-01 Thread Ramana Radhakrishnan
Sergio: Other than a few minor tweaks to the Changelog it largely
looks obvious to me.

Bernd, could you take another look at this since this is now shared
with the c6x backend ?

> Thanks,
>
> Sergio.
>
> diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
> index 305e8ad..f6e9dec 100644
> --- a/libgcc/ChangeLog
> +++ b/libgcc/ChangeLog
> @@ -1,3 +1,15 @@
> +2011-11-22  Sergio Durigan Junior  
> +
> +       Implement ARM Unwinder SystemTap probe.

This line is not required.

> +       * unwind-arm-common.inc: Include `tconfig.h', `tsystem.h' and
> +       `sys/sdt.h'.
> +       (_Unwind_DebugHook): New function.
> +       (uw_restore_core_regs): New define.
> +       (unwind_phase2): Use `uw_restore_core_regs' instead of
> +       `restore_core_regs'.

You don't need the `' quoting of the function names in the ChangeLog.

> +       (unwind_phase2_forced): Likewise.
> +       (__gnu_Unwind_Resume): Likewise.
> +
>  2011-11-22  Iain Sandoe  
>
>        * config/darwin-crt-tm.c: New file.
> diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc
> index 0713056..bf16902 100644
> --- a/libgcc/unwind-arm-common.inc
> +++ b/libgcc/unwind-arm-common.inc
> @@ -21,8 +21,15 @@
>    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>    .  */
>
> +#include "tconfig.h"
> +#include "tsystem.h"
>  #include "unwind.h"
>
> +/* Used for SystemTap unwinder probe.  */
> +#ifdef HAVE_SYS_SDT_H
> +#include <sys/sdt.h>
> +#endif
> +
>  /* We add a prototype for abort here to avoid creating a dependency on
>    target headers.  */
>  extern void abort (void);
> @@ -105,6 +112,44 @@ static inline _uw selfrel_offset31 (const _uw *p);
>
>  static _uw __gnu_unwind_get_pr_addr (int idx);
>
> +static void _Unwind_DebugHook (void *, void *)
> +  __attribute__ ((__noinline__, __used__, __noclone__));
> +
> +/* This function is called during unwinding.  It is intended as a hook
> +   for a debugger to intercept exceptions.  CFA is the CFA of the
> +   target frame.  HANDLER is the PC to which control will be
> +   transferred.  */
> +
> +static void
> +_Unwind_DebugHook (void *cfa __attribute__ ((__unused__)),
> +                  void *handler __attribute__ ((__unused__)))
> +{
> +  /* We only want to use stap probes starting with v3.  Earlier
> +     versions added too much startup cost.  */
> +#if defined (HAVE_SYS_SDT_H) && defined (STAP_PROBE2) && _SDT_NOTE_TYPE >= 3
> +  STAP_PROBE2 (libgcc, unwind, cfa, handler);
> +#else
> +  asm ("");
> +#endif
> +}
> +
> +/* This is a wrapper to be called when we need to restore core registers.
> +   It will call `_Unwind_DebugHook' before restoring the registers, thus
> +   making it possible to intercept and debug exceptions.
> +
> +   When calling `_Unwind_DebugHook', the first argument (the CFA) is zero
> +   because we are not interested in it.  However, it must be there (even
> +   being zero) because GDB expects to find it when using the probe.  */
> +
> +#define uw_restore_core_regs(TARGET, CORE)                                   \
> +  do                                                                         \
> +    {                                                                        \
> +      void *handler = __builtin_frob_return_addr ((void *) VRS_PC (TARGET)); \
> +      _Unwind_DebugHook (0, handler);                                        \
> +      restore_core_regs (CORE);                                              \
> +    }                                                                        \
> +  while (0)
> +
>  /* Perform a binary search for RETURN_ADDRESS in TABLE.  The table contains
>    NREC entries.  */
>
> @@ -253,8 +298,8 @@ unwind_phase2 (_Unwind_Control_Block * ucbp, phase2_vrs * 
> vrs)
>
>   if (pr_result != _URC_INSTALL_CONTEXT)
>     abort();
> -
> -  restore_core_regs (&vrs->core);
> +
> +  uw_restore_core_regs (vrs, &vrs->core);
>  }
>
>  /* Perform phase2 forced unwinding.  */
> @@ -339,7 +384,7 @@ unwind_phase2_forced (_Unwind_Control_Block *ucbp, 
> phase2_vrs *entry_vrs,
>       return _URC_FAILURE;
>     }
>
> -  restore_core_regs (&saved_vrs.core);
> +  uw_restore_core_regs (&saved_vrs, &saved_vrs.core);
>  }
>
>  /* This is a very limited implementation of _Unwind_GetCFA.  It returns
> @@ -450,7 +495,7 @@ __gnu_Unwind_Resume (_Unwind_Control_Block * ucbp, 
> phase2_vrs * entry_vrs)
>     {
>     case _URC_INSTALL_CONTEXT:
>       /* Upload the registers to enter the landing pad.  */
> -      restore_core_regs (&entry_vrs->core);
> +      uw_restore_core_regs (entry_vrs, &entry_vrs->core);
>
>     case _URC_CONTINUE_UNWIND:
>       /* Continue unwinding the next frame.  */

Otherwise looks ok to me .

Ramana


Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Richard Guenther
On Thu, 1 Dec 2011, Diego Novillo wrote:

> On Thu, Dec 1, 2011 at 05:59, Richard Guenther  wrote:
> > On Tue, 29 Nov 2011, Diego Novillo wrote:
> >
> >> On Tue, Nov 29, 2011 at 12:49, H.J. Lu  wrote:
> >>
> >> > This caused:
> >> >
> >> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51346
> >>
> >> Thanks.  I'm on it.
> >
> > The patch was wrong, please revert it.  At the gimple stmt
> > modification level we shouldn't modify the cgraph.  That's
> > a layering violation at least.
> 
> No, this is a pre-existing problem that got aggravated with the new
> changes to the inline attribute in fold.  I think we need to either
> toss out the edge attribute or make it such that they are more
> automatically sync'd.  Unfortunately, we cannot get rid of it, since
> we sometimes do not have the statement.
>
> So, we have to live with the two attributes.  How about, we make the

Yes.  And I've fixed all places I could find so far to update them.

> edge attribute always dependent on the statement?  If the statement
> exists, the edge attribute always take its value from it.  Only when
> the statement doesn't exist, we take its value from the call.  All
> this can be put into a small predicate.

Sure, but then you can still have the issue of an inconsistency.
Thus, would you then remove the remaining asserts?

I believe in the end the proper fix is to _not_ throw away
cgraph edges all the time, but keep them up-to-date and thus
make the stmt flag not necessary.  (we can define "up-to-date"
in a way so that we only require that existing edges that
still have a call stmt have to be valid, thus still require
incremental recomputation to remove dead edges and create
new ones)

> > Please file a bug with a reduced testcase that still fails
> > without your fix.
> 
> I'll add a test to the final patch after it finishes reducing (the
> original test case is huge).

Which pass did the folding of the stmt but did not adjust the
edge flag?

Richard.

Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-01 Thread Richard Guenther
On Fri, Nov 18, 2011 at 6:47 PM, Martin Jambor  wrote:
> Hi,
>
> PR 50744 is an issue with an integer overflow when we propagate the
> estimated size and time effects from callees to callers.  Because such
> paths in the parameter value graph can be arbitrarily long, we simply
> need to introduce an artificial cap on these values.  This is what the
> patch below does.  The cap should be more than enough for any
> reasonable values.
>
> Moreover, I have looked at how we then process the accumulated
> estimates and decided to compute evaluation ratio in
> good_cloning_opportunity_p in HOST_WIDEST_INT.  Call graph frequencies
> are numerators of fractions with denominator 1000 and therefore
> capping the size and cost estimate as well as the frequency sums so
> that their multiplication would not overflow an int seems too
> constraining on 32bit hosts.
>
> Bootstrapped and tested on x86_64-linux without any problems, OK for
> trunk?

This introduces host-dependent code generation differences, right?
You can simply use int64_t for code that is run on the host only.
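
Roughly like this, as a sketch only (the example_* names and the cap
value are made up and this is not the code from the patch): accumulate
with a saturating add and do the one wide multiplication in a
fixed-width type, so the result does not depend on the host's int or
long size.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical cap on accumulated size/time effects.  */
#define EXAMPLE_EFFECT_CAP (INT_MAX / 2)

/* Add two non-negative estimates, saturating at the cap instead of
   overflowing.  */
int
example_capped_add (int a, int b)
{
  assert (a >= 0 && b >= 0);
  if (a > EXAMPLE_EFFECT_CAP - b)
    return EXAMPLE_EFFECT_CAP;
  return a + b;
}

/* Compute benefit * frequency / cost in 64 bits, so the product of two
   32-bit ints cannot overflow on any host.  */
int64_t
example_evaluation (int time_benefit, int freq_sum, int size_cost)
{
  assert (size_cost > 0);
  return (int64_t) time_benefit * freq_sum / size_cost;
}
```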

Richard.

> Thanks,
>
> Martin
>
>
>
> 2011-11-15  Martin Jambor  
>
>        PR tree-optimization/50744
>        * ipa-cp.c (good_cloning_opportunity_p): Assert size_cost is positive,
>        compute evaluation in HOST_WIDEST_INT.
>        (safe_add): New function
>        (propagate_effects): Use safe_add to accumulate effects.
>
>        * testsuite/gcc.dg/ipa/pr50744.c: New test.
>
>
> Index: src/gcc/testsuite/gcc.dg/ipa/pr50744.c
> ===
> --- /dev/null
> +++ src/gcc/testsuite/gcc.dg/ipa/pr50744.c
> @@ -0,0 +1,119 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fno-optimize-sibling-calls" } */
> +
> +extern int use_data (void *p_01, void *p_02, void *p_03, void *p_04, void 
> *p_05,
> +                    void *p_06, void *p_07, void *p_08, void *p_09, void 
> *p_10,
> +                    void *p_11, void *p_12, void *p_13, void *p_14, void 
> *p_15,
> +                    void *p_16, void *p_17, void *p_18, void *p_19, void 
> *p_20,
> +                    void *p_21, void *p_22, void *p_23, void *p_24, void 
> *p_25,
> +                    void *p_26, void *p_27, void *p_28, void *p_29,
> +                    void *p_30);
> +
> +extern int idx (int i, int j, int n);
> +
> +struct stuff
> +{
> +  int decision;
> +  int *a, *b, *c;
> +  int res;
> +};
> +
> +
> +#define some_large_stuff(stuff, n) { \
> +  int i, j, k; \
> +  for (i = 0; i < n; i++) \
> +    for (j = 0; j < n; j++) \
> +      { \
> +       int v = stuff->c[idx(i, j, n)]; \
> +       for (k = 0; k < n; k++) \
> +         v += stuff->a[idx(i, k, n)] * stuff->b[idx(k,j,n)]; \
> +       stuff->c[idx(i, j, n)] = v; \
> +      } \
> +}
> +
> +#define recursion if (iter > 0) \
> +    foo (stuff, iter - 1, (void *) -1, p_01, p_02, p_03, p_04, p_05, p_06, \
> +      p_07, p_08, p_09, p_10, p_11, p_12, p_13, p_14, p_15, p_16, p_17, \
> +     p_18, p_19, p_20, p_21, p_22, p_23, p_24, p_25, p_26, p_27, p_28, p_29); \
> +    else \
> +      foo (stuff, iter, p_01, p_02, p_03, p_04, p_05, p_06, p_07, p_08, p_09, \
> +       p_10, p_11, p_12, p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20, \
> +        p_21,p_22, p_23, p_24, p_25, p_26, p_27, p_28, p_29, p_30)
> +
> +void
> +foo (struct stuff *stuff,
> +     int iter,
> +     void *p_01, void *p_02, void *p_03, void *p_04, void *p_05,
> +     void *p_06, void *p_07, void *p_08, void *p_09, void *p_10,
> +     void *p_11, void *p_12, void *p_13, void *p_14, void *p_15,
> +     void *p_16, void *p_17, void *p_18, void *p_19, void *p_20,
> +     void *p_21, void *p_22, void *p_23, void *p_24, void *p_25,
> +     void *p_26, void *p_27, void *p_28, void *p_29, void *p_30)
> +{
> + switch (stuff->decision)
> +   {
> +   case 0:
> +     some_large_stuff (stuff, 83);
> +     stuff->res =
> +       use_data (p_01, p_02, p_03, p_04, p_05, p_06, p_07, p_08, p_09, p_10,
> +                p_11, p_12, p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20,
> +                p_21, p_22, p_23, p_24, p_25, p_26, p_27, p_28, p_29, p_30);
> +     recursion;
> +     break;
> +
> +   case 1:
> +     some_large_stuff (stuff, 25);
> +     stuff->res =
> +       use_data (p_11, p_02, p_03, p_04, p_05, p_06, p_07, p_08, p_09, p_10,
> +                p_21, p_12, p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20,
> +                p_01, p_22, p_23, p_24, p_25, p_26, p_27, p_28, p_29, p_30);
> +     recursion;
> +     break;
> +
> +   case 3:
> +     some_large_stuff (stuff, 139);
> +     stuff->res =
> +       use_data (p_01, p_12, p_03, p_04, p_05, p_06, p_07, p_08, p_09, p_10,
> +                p_11, p_22, p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20,
> +                p_21, p_02, p_23, p_24, p_25, p_26, p_27, p_28, p_29, p_30);
> +     recursion;
> +     break;
> +
> +   case 4:
> +     some_large_stuff (stuff, 32);
> +     stuff->res =
> +       use_data (p_01, p_02, p_13, p_04, p_05, p_06, 

Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Diego Novillo
On Thu, Dec 1, 2011 at 07:08, Richard Guenther  wrote:
> Sure, but then you can still have the issue of an inconsistency.

Not if we make the edge attribute secondary to the statement
attribute.  Given that can_inline_edge_p() is the *only* tester for
this attribute, what I was thinking was to change can_inline_edge_p()
to:

diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 3dadf8d..e3c6b3c 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -246,6 +246,14 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
   struct function *caller_cfun = DECL_STRUCT_FUNCTION (e->caller->decl);
   struct function *callee_cfun
 = callee ? DECL_STRUCT_FUNCTION (callee->decl) : NULL;
+  bool call_stmt_cannot_inline_p;
+
+  /* If E has a call statement in it, use the inline attribute from
+ the statement, otherwise use the inline attribute in E.  Edges
+ will not have statements when working in WPA mode.  */
+  call_stmt_cannot_inline_p = (e->call_stmt)
+ ? gimple_call_cannot_inline_p (e->call_stmt)
+ : e->call_stmt_cannot_inline_p;

   if (!caller_cfun && e->caller->clone_of)
 caller_cfun = DECL_STRUCT_FUNCTION (e->caller->clone_of->decl);
@@ -270,7 +278,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
   e->inline_failed = CIF_OVERWRITABLE;
   return false;
 }
-  else if (e->call_stmt_cannot_inline_p)
+  else if (call_stmt_cannot_inline_p)
 {
   e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
   inlinable = false;
@@ -343,14 +351,6 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
}
 }

-  /* Be sure that the cannot_inline_p flag is up to date.  */
-  gcc_checking_assert (!e->call_stmt
-  || (gimple_call_cannot_inline_p (e->call_stmt)
-  == e->call_stmt_cannot_inline_p)
-  /* In -flto-partition=none mode we really keep things out of
- sync because call_stmt_cannot_inline_p is set at cgraph
- merging when function bodies are not there yet.  */
-  || (in_lto_p && !gimple_call_cannot_inline_p (e->call_stmt)));
   if (!inlinable && report)
 report_inline_failed_reason (e);
   return inlinable;



> Thus, would you then remove the remaining asserts?

The asserts disappear because we have weakened the meaning of the edge
attribute.  It is only usable when there is no statement on it.  The
question now is, how do we know that the attribute is not lying?  This
only happens in WPA mode, so it would then become an issue of
pessimization, not correctness.
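
As a self-contained illustration of the precedence rule described above, here is a small model. The struct and function names are hypothetical stand-ins, not GCC's real cgraph_edge or gimple types; only the "statement flag wins when a statement exists" logic is taken from the proposed diff:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a gimple call statement.  */
struct call_stmt_model
{
  bool cannot_inline;
};

/* Hypothetical stand-in for a call graph edge.  */
struct edge_model
{
  struct call_stmt_model *call_stmt;  /* NULL when no body is loaded (WPA).  */
  bool call_stmt_cannot_inline_p;     /* Copy of the flag; may be stale.  */
};

/* The statement flag takes precedence whenever a statement is present;
   the edge flag is only a fallback.  */
static bool
edge_cannot_inline_p (const struct edge_model *e)
{
  assert (e != NULL);
  return e->call_stmt ? e->call_stmt->cannot_inline
                      : e->call_stmt_cannot_inline_p;
}
```

With this arrangement a stale edge flag is never observed while the statement is available, which is why the consistency assert can go away. The stale value can only matter once the statement is gone.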

> I believe in the end the proper fix is to _not_ throw away
> cgraph edges all the time, but keep them up-to-date and thus
> make the stmt flag not necessary.

Make it a pure cgraph attribute?  Sure, anything that gets rid of the
dual attribute is the way to go.  There are not very many invocations
to the gimple attribute, but I don't know how big a change that is.

> Which pass did the folding of the stmt but did not adjust the
> edge flag?

The new call to gimple_call_set_cannot_inline added by this patch:

commit 3aa6ac67f5f7d3a6aabce9ada30e99e2a82c0114
Author: rguenth 
Date:   Wed Nov 2 08:46:08 2011 +
   2010-11-02  Richard Guenther  

   PR tree-optimization/50890
   * gimple.h (gimple_fold_call): Remove.
   * gimple-fold.c (fold_stmt_1): Move all call related code to ...
   (gimple_fold_call): ... here.  Make static.  Update the
   cannot-inline flag on direct calls.
   * ipa-inline.c (early_inliner): Copy the cannot-inline flag
   from the statements to the edges.
   * gcc.dg/torture/pr50890.c: New testcase.


   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180763 138bc75d-0d04-0410-961f-82ee72b054a4


Diego.


Re: [RFC] Use REG_EXPR in back-end (introduced by optimization to conditional and/or in ARM back-end)

2011-12-01 Thread Richard Guenther
On Tue, Nov 22, 2011 at 2:55 AM, Richard Henderson  wrote:
> On 11/21/2011 05:31 PM, Jiangning Liu wrote:
>> My question is essentially "May I really use REG_EXPR in back-end code?",
>> as in the patch I gave below?
>
> I suppose.

I'm not so sure ;)  At least checking for BOOLEAN_TYPE is incomplete - you
miss 1-bit precision types of the same signedness as bool.  The middle-end
doesn't distinguish between them anymore (semantically, that is)...

> Another alternative is to use BImode for booleans.  Dunno how much of that
> you'd be able to glean from mere rtl expansion or if you'd need boolean
> decls to be expanded with BImode.

... which probably also means consistently getting BImode won't work?

Richard.

> The latter would probably need a target hook.  I've often wondered how much
> ia64 would benefit from that too, being able to store bool variables directly
> in predicate registers.
>
> All of this almost certainly must wait until stage1 opens up again though...
>
>
> r~


Re: [PATCH] Fix predcom ICE introduced by var clobber changes (PR tree-optimization/51246)

2011-12-01 Thread Richard Guenther
On Thu, Nov 24, 2011 at 5:28 PM, Michael Matz  wrote:
> Hi,
>
> On Thu, 24 Nov 2011, Jakub Jelinek wrote:
>
>> When stmt is mem = {v} {CLOBBER};, then lhs is not an SSA_NAME,
>> but it doesn't satisfy gimple_assign_copy_p either.
>> With this patch it will set the new_tree also to the clobber,
>> making it clear that the next iteration uses an uninitialized variable.
>
> Hmm.  My guts don't like clobbers on the RHS of normal ssa operations.
> Usually our uninitialized values are the default defs of SSA names that
> aren't PARM_DECLs.  I don't like having two ways of expressing
> uninitializedness.
>
> As the default defs are already automatically handled by all our ssa
> infrastructure (including warning and propagation machinery) I think it
> would be best to generate such one instead of a clobber for the RHS.

I think the patch is ok.  Does the CLOBBER get re-placed anywhere?

Richard.

>
> Ciao,
> Michael.


Re: [PATCH] Fix predcom ICE introduced by var clobber changes (PR tree-optimization/51246, take 2)

2011-12-01 Thread Richard Guenther
On Thu, Nov 24, 2011 at 5:51 PM, Michael Matz  wrote:
> Hi,
>
> On Thu, 24 Nov 2011, Jakub Jelinek wrote:
>
>> On Thu, Nov 24, 2011 at 05:28:00PM +0100, Michael Matz wrote:
>> > As the default defs are already automatically handled by all our ssa
>> > infrastructure (including warning and propagation machinery) I think it
>> > would be best to generate such one instead of a clobber for the RHS.
>>
>> So like this?
>
> I would feel comfortable with this one, yes.

What happens when new_tree is a PARM_DECL?  If that can't happen the patch is ok.

Richard.

>
> Ciao,
> Michael.


Re: [PATCH] Fix predcom ICE introduced by var clobber changes (PR tree-optimization/51246, take 2)

2011-12-01 Thread Michael Matz
Hi,

On Thu, 1 Dec 2011, Richard Guenther wrote:

> >> > As the default defs are already automatically handled by all our 
> >> > ssa infrastructure (including warning and propagation machinery) I 
> >> > think it would be best to generate such one instead of a clobber 
> >> > for the RHS.
> >>
> >> So like this?
> >
> > I would feel comfortable with this one, yes.
> 
> > What happens when new_tree is a PARM_DECL?  If that can't happen the
> > patch is ok.

new_tree always will be a newly generated predcom temporary.


Ciao,
Michael.


Re: RFA: Fix PR tree-optimization/50802

2011-12-01 Thread Richard Guenther
On Sat, Nov 26, 2011 at 10:17 AM, Joern Rennecke  wrote:
> With this rewrite of simplify_conversion_using_ranges we go back to the
> original problem of considering if a single conversion is sufficient
> considering the known input range.
>
> Bootstrapped and regtested on i686-pc-linux-gnu.

Ok.

Thanks,
Richard.


Re: [Patch, fortran] PR 25708 Reduce seeks when loading module files

2011-12-01 Thread Mikael Morin
On Wednesday 30 November 2011 23:49:58 Janne Blomqvist wrote:
> > With the updated patch, the number of lseeks when compiling
> > aermod.f90 drops to 38, which is a factor of 15 reduction compared
> > to the current trunk. And a factor of 55 compared to trunk a few days
> > ago before Thomas' patch.
Nice :)

> > 
> > Updated patch attached. Regtested on x86_64-unknown-linux-gnu, Ok for
> > trunk?
OK
Thanks

Mikael


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread David Edelsohn
On Thu, Dec 1, 2011 at 4:36 AM, Torvald Riegel  wrote:
> On Wed, 2011-11-30 at 21:41 -0500, David Edelsohn wrote:
>> On Wed, Nov 30, 2011 at 8:05 PM, Richard Henderson  wrote:
>> > This is a tad rough, but not too bad.
>>
>> Cool.
>>
>> Maybe I don't understand what they are supposed to represent, but why
>> the choice of values for cacheline size?  Is that supposed to be a
>> value chosen by ITM or supposed to be the hardware cacheline used as
>> the granularity for transactions?
>
> CACHELINE_SIZE is supposed to be the size of hardware cachelines so
> that we can add proper padding to shared variables to avoid false
> sharing.
>
> It also was used as the granularity of transactional access by some TM
> methods that aren't part of libitm currently, but might be revived in
> the future.

So where did you get the values used in the PowerPC port of ITM?

- David


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Iain Sandoe


On 1 Dec 2011, at 01:05, Richard Henderson wrote:

> No support for non-ELF, aka AIX and Darwin.  I'm not 100% sure how to
> handle the assembly markup for those, and I couldn't test it anyway.
> Again, I'd prefer someone else figure that stuff out.


I've started to take a look at Darwin [sjlj.S], and I have a feeling
that it's going to be nearly 'change every line' (our asm syntax is
ancient), so probably, in this case, better to have a separate file
(but I will try to get to a working version before making that
judgement finally).

I presume that we should treat this as a normal prologue, and it looks
very much like "save the world", so I have a question:

Is there any reason that we should avoid using out-of-line saves in
this specific case?


thanks
Iain



Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Richard Guenther
On Thu, 1 Dec 2011, Diego Novillo wrote:

> On Thu, Dec 1, 2011 at 07:08, Richard Guenther  wrote:
> > Sure, but then you can still have the issue of an inconsistency.
> 
> Not if we make the edge attribute secondary to the statement
> attribute.  Given that can_inline_edge_p() is the *only* tester for
> this attribute, what I was thinking was to change can_inline_edge_p()
> to:
> 
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 3dadf8d..e3c6b3c 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -246,6 +246,14 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
>struct function *caller_cfun = DECL_STRUCT_FUNCTION (e->caller->decl);
>struct function *callee_cfun
>  = callee ? DECL_STRUCT_FUNCTION (callee->decl) : NULL;
> +  bool call_stmt_cannot_inline_p;
> +
> +  /* If E has a call statement in it, use the inline attribute from
> + the statement, otherwise use the inline attribute in E.  Edges
> + will not have statements when working in WPA mode.  */
> +  call_stmt_cannot_inline_p = (e->call_stmt)
> + ? gimple_call_cannot_inline_p (e->call_stmt)
> + : e->call_stmt_cannot_inline_p;
> 
>if (!caller_cfun && e->caller->clone_of)
>  caller_cfun = DECL_STRUCT_FUNCTION (e->caller->clone_of->decl);
> @@ -270,7 +278,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
>e->inline_failed = CIF_OVERWRITABLE;
>return false;
>  }
> -  else if (e->call_stmt_cannot_inline_p)
> +  else if (call_stmt_cannot_inline_p)
>  {
>e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
>inlinable = false;
> @@ -343,14 +351,6 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
> }
>  }
> 
> -  /* Be sure that the cannot_inline_p flag is up to date.  */
> -  gcc_checking_assert (!e->call_stmt
> -  || (gimple_call_cannot_inline_p (e->call_stmt)
> -  == e->call_stmt_cannot_inline_p)
> -  /* In -flto-partition=none mode we really keep things out of
> - sync because call_stmt_cannot_inline_p is set at cgraph
> - merging when function bodies are not there yet.  */
> -  || (in_lto_p && !gimple_call_cannot_inline_p (e->call_stmt)));
>if (!inlinable && report)
>  report_inline_failed_reason (e);
>return inlinable;
> 
> 
> 
> > Thus, would you then remove the remaining asserts?
> 
> The asserts disappear because we have weakened the meaning of the edge
> attribute.  It is only usable when there is no statement on it.  The
> question now is, how do we know that the attribute is not lying?  This
> only happens in WPA mode, so it would then become an issue of
> pessimization, not correctness.

The above looks ok to me, but I don't want the
gimple_call_set_cannot_inline change (if it is in the tree - I have
not yet recovered from three weeks of vacation).  The edge attribute
is "recomputed" when necessary.

> > I believe in the end the proper fix is to _not_ throw away
> > cgraph edges all the time, but keep them up-to-date and thus
> > make the stmt flag not necessary.
> 
> Make it a pure cgraph attribute?  Sure, anything that gets rid of the
> dual attribute is the way to go.  There are not very many invocations
> to the gimple attribute, but I don't know how big a change that is.

The issue with that change would be to preserve the cgraph edges.
Though when we create them we always have the call stmt available
and thus could re-compute that flag.  Honza?

> > Which pass did the folding of the stmt but did not adjust the
> > edge flag?
> 
> The new call to gimple_call_set_cannot_inline added by this patch:

Sure, but what _pass_ changed the call stmt and called fold_stmt
on it?  The patch merely changes the flag during folding.

Richard.


Re: [PATCH] Don't ICE on label DEBUG_INSN in rtl loop unrolling (PR rtl-optimization/51014)

2011-12-01 Thread Richard Guenther
On Tue, Nov 29, 2011 at 12:16 AM, Jakub Jelinek  wrote:
> Hi!
>
> DEBUG_INSN with LABEL_DECL var isn't duplicated in bb copies (we want
> just one definition of the label), which breaks apply_opt_in_copies
> attempt to match insn in bb copy with orig_insn from the orig_bb.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

Ok.

Thanks,
Richard.
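
The lockstep walk that the patch repairs can be modelled with a small standalone sketch. The enum, the arrays and match_insns are hypothetical simplifications of the insn chains in loop-unroll.c. The sketch only shows why both sides must skip label debug insns: the copy lacks them, so the walk over the original block has to step past them to stay aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified insn kinds; DEBUG_LABEL models a DEBUG_INSN whose variable
   is a LABEL_DECL, which is not duplicated into block copies.  */
enum insn_kind { NOTE, REAL_INSN, DEBUG_VAR, DEBUG_LABEL };

/* True for entries the walk must skip: non-insns and label debug insns.  */
static int
skip_p (enum insn_kind k)
{
  return k == NOTE || k == DEBUG_LABEL;
}

/* For each non-skipped entry of COPY, store in MATCH the index of the
   corresponding entry of ORIG; skipped entries get -1.  Returns the
   number of matched pairs.  */
static size_t
match_insns (const enum insn_kind *copy, size_t n_copy,
             const enum insn_kind *orig, size_t n_orig, int *match)
{
  size_t i, j = 0, pairs = 0;

  for (i = 0; i < n_copy; i++)
    {
      match[i] = -1;
      if (skip_p (copy[i]))
        continue;
      while (j < n_orig && skip_p (orig[j]))
        j++;
      /* Without skipping DEBUG_LABEL above, this would trip.  */
      assert (j < n_orig && orig[j] == copy[i]);
      match[i] = (int) j++;
      pairs++;
    }
  return pairs;
}
```

For an original block of NOTE, REAL_INSN, DEBUG_LABEL, REAL_INSN and a copy of NOTE, REAL_INSN, REAL_INSN, the second real insn of the copy pairs with index 3 of the original rather than with the label debug insn at index 2.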

> 2011-11-28  Jakub Jelinek  
>
>        PR rtl-optimization/51014
>        * loop-unroll.c (apply_opt_in_copies): Ignore label DEBUG_INSNs
>        both from bb and orig_bb.
>
>        * g++.dg/opt/pr51014.C: New test.
>
> --- gcc/loop-unroll.c.jj        2011-02-15 15:42:26.0 +0100
> +++ gcc/loop-unroll.c   2011-11-28 21:03:58.089497366 +0100
> @@ -2262,10 +2262,15 @@ apply_opt_in_copies (struct opt_info *op
>       for (insn = BB_HEAD (bb); insn != NEXT_INSN (BB_END (bb)); insn = next)
>         {
>           next = NEXT_INSN (insn);
> -          if (!INSN_P (insn))
> +         if (!INSN_P (insn)
> +             || (DEBUG_INSN_P (insn)
> +                 && TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == LABEL_DECL))
>             continue;
>
> -          while (!INSN_P (orig_insn))
> +         while (!INSN_P (orig_insn)
> +                || (DEBUG_INSN_P (orig_insn)
> +                    && (TREE_CODE (INSN_VAR_LOCATION_DECL (orig_insn))
> +                        == LABEL_DECL)))
>             orig_insn = NEXT_INSN (orig_insn);
>
>           ivts_templ.insn = orig_insn;
> --- gcc/testsuite/g++.dg/opt/pr51014.C.jj       2011-11-28 21:08:19.518986308 +0100
> +++ gcc/testsuite/g++.dg/opt/pr51014.C  2011-11-28 21:07:24.0 +0100
> @@ -0,0 +1,16 @@
> +// PR rtl-optimization/51014
> +// { dg-do compile }
> +// { dg-options "-O2 -funroll-loops -fcompare-debug" }
> +
> +struct S
> +{
> +  ~S() { delete s; }
> +  int *s;
> +};
> +
> +void
> +f (S *x, S *y)
> +{
> +  for (; x != y; ++x)
> +    x->~S();
> +}
>
>        Jakub


Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Diego Novillo
On Thu, Dec 1, 2011 at 09:04, Richard Guenther  wrote:

> The above looks ok to me, but I don't want the
> gimple_call_set_cannot_inline change (if it is in the tree - I have
> not yet recovered from three weeks of vacation).  The edge attribute
> is "recomputed" when necessary.

The original patch is no longer necessary given this change of
semantics we are discussing.  For the original semantics, it was
needed because every change to the statement state must be reflected
in the edge.  If we make the edge attribute invisible in the presence
of a call statement, then it doesn't matter if we update it or not.

The only issue I have now is that if we allow the edge attribute to go
stale, when we save it to a bytecode file, and use it during WPA, we
will be using the stale value.

We could update the edge attribute every time can_inline_edge_p is
called, but I'm not sure I like that.

>> > Which pass did the folding of the stmt but did not adjust the
>> > edge flag?
>>
>> The new call to gimple_call_set_cannot_inline added by this patch:
>
> Sure, but what _pass_ changed the call stmt and called fold_stmt
> on it?  The patch merely changes the flag during folding.

Ah.

#0  gimple_call_set_cannot_inline (s=0x70b88ab0, inlinable_p=true)
at pph/gcc/gimple.c:5570
#1  0x008c6d0b in gimple_fold_call (inplace=,
gsi=)
at pph/gcc/gimple-fold.c:1121
#2  fold_stmt_1 (gsi=0x7fffd580, inplace=false)
at pph/gcc/gimple-fold.c:1198
#3  0x00a7a775 in fold_marked_statements (first=6,
statements=0x16c9090)
at pph/gcc/tree-inline.c:4174
#4  0x00a88a0b in tree_function_versioning (old_decl=,
new_decl=0x74be1000, tree_map=, update_clones=224,
args_to_skip=, blocks_to_copy=0x1a28d30,
new_entry=0x704c7958)
at pph/gcc/tree-inline.c:5259
#5  0x00782c27 in cgraph_function_versioning (
old_version_node=, redirect_callers=0x0, tree_map=0x0,
args_to_skip=0x1a25b50, bbs_to_copy=0x1a28d30,
new_entry_block=0x704c7958, clone_name=0x10227f4 "part")
at pph/gcc/cgraphunit.c:2383
#6  0x00ee2452 in split_function (split_point=)
at pph/gcc/ipa-split.c:1102
#7  execute_split_functions ()
at pph/gcc/ipa-split.c:1412
#8  0x00990f15 in execute_one_pass (pass=0x1413060)

Line numbers are relative to the PPH branch, which is trunk as of a
couple of weeks ago.  I *think* the bug only triggers with -m32
-march=pentium3, but I am not sure (need to rebuild the .ii file).


Diego.


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Iain Sandoe


On 1 Dec 2011, at 14:04, Iain Sandoe wrote:

> I presume that we should treat this as a normal prologue - and it
> looks very much like "save the world" -

BTW, if this is true (i.e. we should be preserving all call-saved
regs around the call to GTM_begin_transaction), then I guess we should
be saving CR2-CR4 (at least on Darwin, possibly elsewhere too).





Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Richard Guenther
On Thu, 1 Dec 2011, Diego Novillo wrote:

> On Thu, Dec 1, 2011 at 09:04, Richard Guenther  wrote:
> 
> > The above looks ok to me, but I don't want the
> > gimple_call_set_cannot_inline change (if it is in the tree - I have
> > not yet recovered from three weeks of vacation).  The edge attribute
> > is "recomputed" when necessary.
> 
> The original patch is no longer necessary given this change of
> semantics we are discussing.  For the original semantics, it was
> needed because every change to the statement state must be reflected
> in the edge.  If we make the edge attribute invisible in the presence
> of a call statement, then it doesn't matter if we update it or not.
> 
> The only issue I have now is that if we allow the edge attribute to go
> stale, when we save it to a bytecode file, and use it during WPA, we
> will be using the stale value.

We still update it in a few selected places.

> We could update the edge attribute every time can_inline_edge_p is
> called, but I'm not sure I like that.

Me neither.

> >> > Which pass did the folding of the stmt but did not adjust the
> >> > edge flag?
> >>
> >> The new call to gimple_call_set_cannot_inline added by this patch:
> >
> > Sure, but what _pass_ changed the call stmt and called fold_stmt
> > on it?  The patch merely changes the flag during folding.
> 
> Ah.
> 
> #0  gimple_call_set_cannot_inline (s=0x70b88ab0, inlinable_p=true)
> at pph/gcc/gimple.c:5570
> #1  0x008c6d0b in gimple_fold_call (inplace=,
> gsi=)
> at pph/gcc/gimple-fold.c:1121
> #2  fold_stmt_1 (gsi=0x7fffd580, inplace=false)
> at pph/gcc/gimple-fold.c:1198
> #3  0x00a7a775 in fold_marked_statements (first=6,
> statements=0x16c9090)
> at pph/gcc/tree-inline.c:4174
> #4  0x00a88a0b in tree_function_versioning (old_decl=,
> new_decl=0x74be1000, tree_map=, update_clones=224,
> args_to_skip=, blocks_to_copy=0x1a28d30,
> new_entry=0x704c7958)
> at pph/gcc/tree-inline.c:5259
> #5  0x00782c27 in cgraph_function_versioning (
> old_version_node=, redirect_callers=0x0, tree_map=0x0,
> args_to_skip=0x1a25b50, bbs_to_copy=0x1a28d30,
> new_entry_block=0x704c7958, clone_name=0x10227f4 "part")
> at pph/gcc/cgraphunit.c:2383
> #6  0x00ee2452 in split_function (split_point=)
> at pph/gcc/ipa-split.c:1102
> #7  execute_split_functions ()
> at pph/gcc/ipa-split.c:1412
> #8  0x00990f15 in execute_one_pass (pass=0x1413060)
> 
> Line numbers are relative to the PPH branch, which is trunk as of a
> couple of weeks ago.  I *think* the bug only triggers with -m32
> -march=pentium3, but I am not sure (need to rebuild the .ii file).

ISTR updating the function cloning path, so you might be simply on
a too old trunk version.  Do you have the 2011-11-06 change?  And
the 2011-11-09 change?

But your proposed change looks ok anyway with reverting the original
patch.

Thanks,
Richard.

[Patch, fortran] Make a few helper functions static

2011-12-01 Thread Janne Blomqvist
Hi,

committed the patch below as obvious.

2011-12-01  Janne Blomqvist  

* module.c (dt_lower_string): Make static.
(dt_upper_string): Likewise.


Index: module.c
===
--- module.c	(revision 181880)
+++ module.c	(working copy)
@@ -435,7 +435,7 @@ resolve_fixups (fixup_t *f, void *gp)
to convert the symtree name of a derived-type to the symbol name or to
the name of the associated generic function.  */

-const char *
+static const char *
 dt_lower_string (const char *name)
 {
   if (name[0] != (char) TOLOWER ((unsigned char) name[0]))
@@ -450,7 +450,7 @@ dt_lower_string (const char *name)
symtree/symbol name of the associated generic function start with a lower-
case character.  */

-const char *
+static const char *
 dt_upper_string (const char *name)
 {
   if (name[0] != (char) TOUPPER ((unsigned char) name[0]))


-- 
Janne Blomqvist


Re: [Patch PPC/Darwin] add fp/gp save routines to ppc64 case.

2011-12-01 Thread Mike Stump
On Nov 29, 2011, at 9:06 AM, Iain Sandoe wrote:
> As Rainer recently pointed out, libgcc/config/rs6000/t-darwin64
> overrides the t-darwin version.
>
> This would make it miss the out-of-line saves.
>
> corrected as attached,
> OK for trunk?


Ok.


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Richard Henderson
On 12/01/2011 06:00 AM, David Edelsohn wrote:
> On Thu, Dec 1, 2011 at 4:36 AM, Torvald Riegel  wrote:
>> On Wed, 2011-11-30 at 21:41 -0500, David Edelsohn wrote:
>>> On Wed, Nov 30, 2011 at 8:05 PM, Richard Henderson  wrote:
>>>> This is a tad rough, but not too bad.
>>>
>>> Cool.
>>>
>>> Maybe I don't understand what they are supposed to represent, but why
>>> the choice of values for cacheline size?  Is that supposed to be a
>>> value chosen by ITM or supposed to be the hardware cacheline used as
>>> the granularity for transactions?
>>
>> CACHELINE_SIZE is supposed to be the size of hardware cachelines so
>> that we can add proper padding to shared variables to avoid false
>> sharing.
>>
>> It also was used as the granularity of transactional access by some TM
>> methods that aren't part of libitm currently, but might be revived in
>> the future.
> 
> So where did you get the values used in the PowerPC port of ITM?

I made it up.  As he said, it's only used for padding to *attempt to* avoid 
false sharing.  Currently sources won't actually fail with the wrong cacheline 
value, but they'll work more efficiently with the right value.
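
To make the padding point concrete, here is a minimal C11 sketch. The value 64 and the names padded_counter, counters and counter_stride are assumptions for illustration only; they are not taken from libitm, and the right cacheline size is target-specific:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed value for illustration; not the value used by any libitm port.  */
#define CACHELINE_SIZE 64

/* Aligning the member to CACHELINE_SIZE also rounds the struct size up to
   a multiple of it, so adjacent array elements never share a line.  */
struct padded_counter
{
  alignas (CACHELINE_SIZE) unsigned long value;
};

static_assert (sizeof (struct padded_counter) % CACHELINE_SIZE == 0,
               "each counter must occupy whole cache lines");

static struct padded_counter counters[4];

/* Distance in bytes between two adjacent counters.  */
static size_t
counter_stride (void)
{
  return (size_t) ((char *) &counters[1] - (char *) &counters[0]);
}
```

If CACHELINE_SIZE is smaller than the real hardware line, two counters can still land on one line and the code stays correct but slower, which matches the remark that a wrong value costs efficiency rather than correctness.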


r~


Re: [PATCH, lto]: Handle *tm regparm attribute

2011-12-01 Thread Richard Guenther
On Wed, Nov 30, 2011 at 6:03 PM, Jan Hubicka  wrote:
>> Hello!
>>
>> Attached patch handles "*tm regparm" attribute, to avoid "*tm regparm"
>> attribute ignored warnings in lto compile on non-x86 targets.
>>
>> 2011-11-30  Uros Bizjak  
>>
>>       * lto-lang.c (lto_attribute_table): Handle *tm regparm.
>>       (ignore_attribute): New.
>>
>> Tested on alphaev68-pc-linux-gnu and x86_64-pc-linux-gnu {,-m32}.
>>
>> OK for mainline?
>
> Won't similar change be needed for other tm attributes?  Perhaps we could just
> silence the warning with in_lto_p predicate.

Doesn't it need to be handled as well, not just ignored?

Richard.

> Honza


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Richard Henderson
On 12/01/2011 01:42 AM, Torvald Riegel wrote:
> The ABI defines the pr_hasNoFloatUpdate and pr_hasNoVectorUpdate flags
> for _ITM_beginTransaction but we don't handle these currently.  I guess
> we should do the save/restore unless those flags are set?
> 
> How difficult would it be to set these flags if there is no float/vector
> update (I guess inter-procedural analysis could be sufficient as a first
> step).

It's actually quite difficult.

We'd need to mark a region so that the register allocator doesn't attempt to
allocate call-saved fp/vector registers within that region.  That is much
stricter than seeing that there's no fp/vector operation, since fp/vector
registers can be used for other things, such as memcpy.


r~


Re: [PATCH] Fix PR 51198, DECL_INITIAL still contains stuff for FIELD_DECLs

2011-12-01 Thread Richard Guenther
On Wed, Nov 30, 2011 at 8:05 PM, Andrew Pinski
 wrote:
> Hi,
>  With C++11's decl initialization for non static members, the
> DECL_INITIAL for FIELD_DECLs contains stuff which we don't need to
> keep around after the front-end is done.  This patch clears them in
> the free_lang_data pass.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Ok.

Thanks,
Richard.

> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * tree.c (free_lang_data_in_decl): Clear FIELD_DECL's DECL_INITIAL also.
>
> testsuite/ChangeLog:
> * g++.dg/torture/pr51198.C: New testcase.


Re: [testsuite] xfail target-specific asms, & gcov

2011-12-01 Thread Mike Stump

On Nov 27, 2011, at 3:41 PM, Nathan Sidwell  wrote:
this patch extends scan-assembler (and scan-assembler-not) to allow  
something like:
/* { dg-final { scan-assembler "\\.hidden 
\t__gcov___ZN1X4FinkEv" { target { ! *-*-darwin* } } { xfail *-*- 
* } } } */



ok?


Almost...  Every test that can fail should be able to pass, and every
test that can pass should be able to fail.  The usual structure:

if (test fails)
  fail foo
else
  pass foo

makes this trivially true.  In your patch, you alter the spelling of a
test; never do that.  Once that is fixed, Ok.


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Richard Henderson
On 12/01/2011 06:42 AM, Iain Sandoe wrote:
>> I presume that we should treat this as a normal prologue - and it
>> looks very much like "save the world" -
> 
> BTW, if this is true ( i.e. we should be preserving all call-saved
> regs around the call to GTM_begin_transaction), then I guess we
> should be saving CR2-CR4 (at least on Darwin, possibly elsewhere too).

I didn't notice CR registers being saved in the linux setjmp function, but 
perhaps I just missed it?

This is really an implementation of setjmp that allows for a call to 
GTM_begin_transaction in the middle.  I.e. not something you could actually do 
with any variant of libc's setjmp.

So, if you can lay out the registers in some reasonable way with out-of-line 
saves, feel free.  You'll probably have to adjust the layout of gtm_jmpbuf in 
config/powerpc/target.h to match.

If it turns out to be horrible to share code between these different ABIs, feel 
free to make use of the config/ directory hierarchy and arrange for 
config/darwin/powerpc to be searched before config/powerpc/, and make new 
sjlj.S and target.h files.  See configure.tgt for details there.


r~




Re: [PATCH, lto]: Handle *tm regparm attribute

2011-12-01 Thread Jan Hubicka
> On Wed, Nov 30, 2011 at 6:03 PM, Jan Hubicka  wrote:
> >> Hello!
> >>
> >> Attached patch handles "*tm regparm" attribute, to avoid "*tm regparm"
> >> attribute ignored warnings in lto compile on non-x86 targets.
> >>
> >> 2011-11-30  Uros Bizjak  
> >>
> >>       * lto-lang.c (lto_attribute_table): Handle *tm regparm.
> >>       (ignore_attribute): New.
> >>
> >> Tested on alphaev68-pc-linux-gnu and x86_64-pc-linux-gnu {,-m32}.
> >>
> >> OK for mainline?
> >
> > Won't similar change be needed for other tm attributes?  Perhaps we could just
> > silence the warning with in_lto_p predicate.
> 
> Doesn't it need to be handled as well, not just ignored?
I would expect stuff to be handled at parsing time and thus streamed into IL.

Honza


Re: [PATCH 2/5] arm: Emit swp for pre-armv6.

2011-12-01 Thread Richard Henderson
On 12/01/2011 02:59 AM, Richard Earnshaw wrote:
> It's essential we don't emit SWP instructions directly into code on any
> platform where that code may possibly be run on a later core that has
> LDREX/STREX.  If we do that we'll end up with a mess that can't be resolved.

Ok.  It's easy enough to drop that patch.

> I also think that GCC should NOT provide those helper functions, though
> we should probably write a document describing how a user might do so.

I'll refer you to MacLeod at this point and the atomics support library...


r~


Re: [PATCH] Fix up VEC_INTERLEAVE_*_EXPR folding and expansion for big endian (PR tree-optimization/51074)

2011-12-01 Thread Richard Henderson
On 12/01/2011 03:21 AM, Richard Guenther wrote:
> Yes, sorry - I'm recovering from a 3 week e-mail lag ;)  I agree
> using VEC_PERM_EXPR would be best - but that would also affect
> backend patterns.  Can we have a middle-ground that leaves those
> untouched?  We're still in stage 3, so fixing the bug with using
> VEC_PERM_EXPR sounds appealing to me ;)

If we agree that we want to fix this with vec_perm_expr, then we need a 
relatively minor patch to the vectorizer, and cleanups in the targets.

In particular, powerpc, spu, and ia64 will need to recognize various constant 
permutations so that they can continue using the specialized instructions for 
interleave.  This shouldn't be particularly difficult; a few testcases added to 
make sure we don't regress to full permutation wouldn't be amiss.

The x86 port is the only one that really does aggressive constant permutation 
pattern recognition atm.  That is, of course, because the ISA support for 
permutation there is all over the map and we had no choice.

I've already zapped the target patterns that expanded interleave/even_odd back 
into a permutation operation.

If we think this is ok for stage3, we can certainly give it a whack.  I'll take 
care of the backends if Jakub takes care of the vectorizer?


r~


Re: [PATCH] Fix up VEC_INTERLEAVE_*_EXPR folding and expansion for big endian (PR tree-optimization/51074)

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 07:57:48AM -0800, Richard Henderson wrote:
> On 12/01/2011 03:21 AM, Richard Guenther wrote:
> > Yes, sorry - I'm recovering from a 3 week e-mail lag ;)  I agree
> > using VEC_PERM_EXPR would be best - but that would also affect
> > backend patterns.  Can we have a middle-ground that leaves those
> > untouched?  We're still in stage 3, so fixing the bug with using
> > VEC_PERM_EXPR sounds appealing to me ;)
> 
> If we agree that we want to fix this with vec_perm_expr, then we need a
> relatively minor patch to the vectorizer, and cleanups in the targets.
> 
> In particular, powerpc, spu, and ia64 will need to recognize various
> constant permutations so that they can continue using the specialized
> instructions for interleave.  This shouldn't be particularly difficult; a
> few testcases added to make sure we don't regress to full permutation
> wouldn't be amiss.
> 
> The x86 port is the only one that really does aggressive constant
> permutation pattern recognition atm.  That is, of course, because the ISA
> support for permutation there is all over the map and we had no choice.
> 
> I've already zapped the target patterns that expanded interleave/even_odd
> back into a permutation operation.
> 
> If we think this is ok for stage3, we can certainly give it a whack.  I'll
> take care of the backends if Jakub takes care of the vectorizer?

Here is the vectorizer part (untested so far) + some small i386 tweaks.
This patch as is regresses code quality for powerpc/ia64/sparc/mips
(I don't think spu has vec_interleave* patterns in *.md).

If it works out, I guess we could also zap VEC_EXTRACT_{EVEN,ODD}_EXPR
similarly.
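
For illustration, the interleaving masks mentioned above can be sketched in
plain C.  This is only a sketch under stated assumptions: the helper names
(interleave_first_half_mask, interleave_second_half_mask, permute2) are
hypothetical and not taken from the patch, and the selector is assumed to
index into the concatenation of the two input vectors, as VEC_PERM_EXPR does.
It says "first half" and "second half" rather than high and low on purpose,
since which one counts as "high" depends on endianness.

```c
#include <stddef.h>

/* Build the selector that interleaves the first halves of two
   NELT-element vectors: {0, NELT, 1, NELT + 1, ...}.
   Hypothetical helper, not the function from the patch.  */
static void
interleave_first_half_mask (unsigned char *sel, unsigned int nelt)
{
  unsigned int i;

  for (i = 0; i < nelt / 2; i++)
    {
      sel[2 * i] = i;            /* Element i of the first input.  */
      sel[2 * i + 1] = i + nelt; /* Element i of the second input.  */
    }
}

/* Same for the second halves: every index is shifted by NELT / 2.  */
static void
interleave_second_half_mask (unsigned char *sel, unsigned int nelt)
{
  unsigned int i;

  interleave_first_half_mask (sel, nelt);
  for (i = 0; i < nelt; i++)
    sel[i] += nelt / 2;
}

/* Apply a two-input permutation: SEL[i] < NELT picks from A,
   otherwise from B.  */
static void
permute2 (int *out, const int *a, const int *b,
          const unsigned char *sel, unsigned int nelt)
{
  unsigned int i;

  for (i = 0; i < nelt; i++)
    out[i] = sel[i] < nelt ? a[sel[i]] : b[sel[i] - nelt];
}
```

With NELT == 4 the two selectors come out as {0, 4, 1, 5} and {2, 6, 3, 7},
so a target that wants to keep using its dedicated interleave instructions
would have to recognize exactly these constant selectors.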

2011-12-01  Jakub Jelinek  

* tree.def (VEC_INTERLEAVE_HIGH_EXPR, VEC_INTERLEAVE_LOW_EXPR): Remove.
* gimple-pretty-print.c (dump_binary_rhs): Don't handle
VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
* expr.c (expand_expr_real_2): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
* cfgexpand.c (expand_debug_expr): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-vect-generic.c (expand_vector_operations_1): Likewise.
* fold-const.c (fold_binary_loc): Likewise.
* doc/generic.texi (VEC_INTERLEAVE_HIGH_EXPR,
VEC_INTERLEAVE_LOW_EXPR): Remove documentation.
* optabs.c (optab_for_tree_code): Don't handle
VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
(expand_binop, init_optabs): Remove vec_interleave_high_optab
and vec_interleave_low_optab.
* genopinit.c (optabs): Likewise.
* optabs.h (OTI_vec_interleave_high, OTI_vec_interleave_low): Remove.
(vec_interleave_high_optab, vec_interleave_low_optab): Remove.
* doc/md.texi (vec_interleave_high, vec_interleave_low): Remove
documentation.
* tree-vect-stmts.c (gen_perm_mask): Renamed to...
(vect_gen_perm_mask): ... this.  No longer static.
(perm_mask_for_reverse, vectorizable_load): Adjust callers.
* tree-vectorizer.h (vect_gen_perm_mask): New prototype.
* tree-vect-data-refs.c (vect_strided_store_supported): Don't try
VEC_INTERLEAVE_*_EXPR, use can_vec_perm_p instead of
can_vec_perm_for_code_p.
(vect_permute_store_chain): Generate VEC_PERM_EXPR with interleaving
masks instead of VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
* config/i386/i386.c (expand_vec_perm_interleave2): If
expand_vec_perm_interleave3 would handle it, return false.
(expand_vec_perm_broadcast_1): Don't use vec_interleave_*_optab.

--- gcc/tree.def.jj 2011-12-01 11:44:55.0 +0100
+++ gcc/tree.def2011-12-01 13:37:32.071771156 +0100
@@ -1192,10 +1192,6 @@ DEFTREECODE (VEC_PACK_FIX_TRUNC_EXPR, "v
 DEFTREECODE (VEC_EXTRACT_EVEN_EXPR, "vec_extract_even_expr", tcc_binary, 2)
 DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_extract_odd_expr", tcc_binary, 2)
 
-/* Merge input vectors interleaving their fields.  */
-DEFTREECODE (VEC_INTERLEAVE_HIGH_EXPR, "vec_interleave_high_expr", tcc_binary, 2)
-DEFTREECODE (VEC_INTERLEAVE_LOW_EXPR, "vec_interleave_low_expr", tcc_binary, 2)
-
 /* Widening vector shift left in bits.
Operand 0 is a vector to be shifted with N elements of size S.
Operand 1 is an integer shift amount in bits.
--- gcc/gimple-pretty-print.c.jj2011-12-01 11:44:54.0 +0100
+++ gcc/gimple-pretty-print.c   2011-12-01 13:39:26.611099281 +0100
@@ -347,8 +347,6 @@ dump_binary_rhs (pretty_printer *buffer,
 case VEC_PACK_FIX_TRUNC_EXPR:
 case VEC_EXTRACT_EVEN_EXPR:
 case VEC_EXTRACT_ODD_EXPR:
-case VEC_INTERLEAVE_HIGH_EXPR:
-case VEC_INTERLEAVE_LOW_EXPR:
 case VEC_WIDEN_LSHIFT_HI_EXPR:
 case VEC_WIDEN_LSHIFT_LO_EXPR:
   for (p = tree_code_name [(int) code]; *p; p++)
--- gcc/expr.c.jj   2011-12-01 11:44:53.0 +0100
+++ gcc/expr

Re: [PATCH] Fix up VEC_INTERLEAVE_*_EXPR folding and expansion for big endian (PR tree-optimization/51074)

2011-12-01 Thread Richard Guenther
On Thu, 1 Dec 2011, Jakub Jelinek wrote:

> On Thu, Dec 01, 2011 at 07:57:48AM -0800, Richard Henderson wrote:
> > On 12/01/2011 03:21 AM, Richard Guenther wrote:
> > > Yes, sorry - I'm recovering from a 3 week e-mail lag ;)  I agree
> > > using VEC_PERM_EXPR would be best - but that would also affect
> > > backend patterns.  Can we have a middle-ground that leaves those
> > > untouched?  We're still in stage 3, so fixing the bug with using
> > > VEC_PERM_EXPR sounds appealing to me ;)
> > 
> > If we agree that we want to fix this with vec_perm_expr, then we need a
> > relatively minor patch to the vectorizer, and cleanups in the targets.
> > 
> > In particular, powerpc, spu, and ia64 will need to recognize various
> > constant permutations so that they can continue using the specialized
> > instructions for interleave.  This shouldn't be particularly difficult; a
> > few testcases added to make sure we don't regress to full permutation
> > wouldn't be amiss.
> > 
> > The x86 port is the only one that really does aggressive constant
> > permutation pattern recognition atm.  That is, of course, because the ISA
> > support for permutation there is all over the map and we had no choice.
> > 
> > I've already zapped the target patterns that expanded interleave/even_odd
> > back into a permutation operation.
> > 
> > If we think this is ok for stage3, we can certainly give it a whack.  I'll
> > take care of the backends if Jakub takes care of the vectorizer?
> 
> Here is the vectorizer part (untested so far) + some small i386 tweaks.
> This patch as is regresses code quality for powerpc/ia64/sparc/mips
> (I don't think spu has vec_interleave* patterns in *.md).
> 
> If it works out, I guess we could also zap VEC_EXTRACT_{EVEN,ODD}_EXPR
> similarly.

Yeah, I think it's a good cleanup opportunity.

Thanks,
Richard.

> 2011-12-01  Jakub Jelinek  
> 
>   * tree.def (VEC_INTERLEAVE_HIGH_EXPR, VEC_INTERLEAVE_LOW_EXPR): Remove.
>   * gimple-pretty-print.c (dump_binary_rhs): Don't handle
>   VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
>   * expr.c (expand_expr_real_2): Likewise.
>   * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>   * cfgexpand.c (expand_debug_expr): Likewise.
>   * tree-inline.c (estimate_operator_cost): Likewise.
>   * tree-pretty-print.c (dump_generic_node): Likewise.
>   * tree-vect-generic.c (expand_vector_operations_1): Likewise.
>   * fold-const.c (fold_binary_loc): Likewise.
>   * doc/generic.texi (VEC_INTERLEAVE_HIGH_EXPR,
>   VEC_INTERLEAVE_LOW_EXPR): Remove documentation.
>   * optabs.c (optab_for_tree_code): Don't handle
>   VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
>   (expand_binop, init_optabs): Remove vec_interleave_high_optab
>   and vec_interleave_low_optab.
>   * genopinit.c (optabs): Likewise.
>   * optabs.h (OTI_vec_interleave_high, OTI_vec_interleave_low): Remove.
>   (vec_interleave_high_optab, vec_interleave_low_optab): Remove.
>   * doc/md.texi (vec_interleave_high, vec_interleave_low): Remove
>   documentation.
>   * tree-vect-stmts.c (gen_perm_mask): Renamed to...
>   (vect_gen_perm_mask): ... this.  No longer static.
>   (perm_mask_for_reverse, vectorizable_load): Adjust callers.
>   * tree-vectorizer.h (vect_gen_perm_mask): New prototype.
>   * tree-vect-data-refs.c (vect_strided_store_supported): Don't try
>   VEC_INTERLEAVE_*_EXPR, use can_vec_perm_p instead of
>   can_vec_perm_for_code_p.
>   (vect_permute_store_chain): Generate VEC_PERM_EXPR with interleaving
>   masks instead of VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
>   * config/i386/i386.c (expand_vec_perm_interleave2): If
>   expand_vec_perm_interleave3 would handle it, return false.
>   (expand_vec_perm_broadcast_1): Don't use vec_interleave_*_optab.
> 
> --- gcc/tree.def.jj   2011-12-01 11:44:55.0 +0100
> +++ gcc/tree.def  2011-12-01 13:37:32.071771156 +0100
> @@ -1192,10 +1192,6 @@ DEFTREECODE (VEC_PACK_FIX_TRUNC_EXPR, "v
>  DEFTREECODE (VEC_EXTRACT_EVEN_EXPR, "vec_extract_even_expr", tcc_binary, 2)
>  DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_extract_odd_expr", tcc_binary, 2)
>  
> -/* Merge input vectors interleaving their fields.  */
> -DEFTREECODE (VEC_INTERLEAVE_HIGH_EXPR, "vec_interleave_high_expr", 
> tcc_binary, 2)
> -DEFTREECODE (VEC_INTERLEAVE_LOW_EXPR, "vec_interleave_low_expr", tcc_binary, 
> 2)
> -
>  /* Widening vector shift left in bits.
> Operand 0 is a vector to be shifted with N elements of size S.
> Operand 1 is an integer shift amount in bits.
> --- gcc/gimple-pretty-print.c.jj  2011-12-01 11:44:54.0 +0100
> +++ gcc/gimple-pretty-print.c 2011-12-01 13:39:26.611099281 +0100
> @@ -347,8 +347,6 @@ dump_binary_rhs (pretty_printer *buffer,
>  case VEC_PACK_FIX_TRUNC_EXPR:
>  case VEC_EXTRACT_EVEN_EXPR:
>  case VEC_EXTRACT_ODD_EXPR:
> -case VEC_INTERLEAVE_HIGH_EXPR:
> -case VEC_I

[PATCH, PR 50622] Force a gimple operand in load_assign_lhs_subreplacements when necessary

2011-12-01 Thread Martin Jambor
Hi,

PR 50622 is an omission in load_assign_lhs_subreplacements, which
should force a gimple operand on a RHS of a gimple assignment if both
sides are new replacements of scalar types which are not gimple
registers, because they are partially modified (which can happen to
complex numbers and bit-fields).

Fixed with the patch below.  It passes bootstrap and the testsuite on
x86_64-linux; I am about to do the same on the 4.6 branch because I'd
like to commit it there as well.  OK for trunk and the 4.6 branch?

Thanks,

Martin


2011-12-01  Martin Jambor  

PR tree-optimization/50622
* tree-sra.c (load_assign_lhs_subreplacements): Force gimple operand
if both lacc and racc are grp_partial_lhs.

* testsuite/g++.dg/tree-ssa/pr50622.C: New test.

Index: src/gcc/tree-sra.c
===
--- src.orig/gcc/tree-sra.c
+++ src/gcc/tree-sra.c
@@ -2692,6 +2692,10 @@ load_assign_lhs_subreplacements (struct
  rhs = get_access_replacement (racc);
  if (!useless_type_conversion_p (lacc->type, racc->type))
rhs = fold_build1_loc (loc, VIEW_CONVERT_EXPR, lacc->type, rhs);
+
+ if (racc->grp_partial_lhs && lacc->grp_partial_lhs)
+   rhs = force_gimple_operand_gsi (old_gsi, rhs, true, NULL_TREE,
+   true, GSI_SAME_STMT);
}
  else
{
Index: src/gcc/testsuite/g++.dg/tree-ssa/pr50622.C
===
--- /dev/null
+++ src/gcc/testsuite/g++.dg/tree-ssa/pr50622.C
@@ -0,0 +1,30 @@
+// { dg-do compile }
+// { dg-options "-O2" }
+
+typedef __complex__ double Value;
+struct LorentzVector
+{
+  LorentzVector & operator+=(const LorentzVector & a) {
+theX += a.theX;
+theY += a.theY;
+theZ += a.theZ;
+theT += a.theT;
+return *this;
+  }
+
+  Value theX;
+  Value theY;
+  Value theZ;
+  Value theT;
+};
+
+inline LorentzVector
+operator+(LorentzVector a, const LorentzVector & b) {
+  return a += b;
+}
+
+Value ex, et;
+LorentzVector sum() {
+  LorentzVector v1; v1.theX =ex; v1.theY =ex+et; v1.theZ =ex-et;   v1.theT =et;
+  return v1+v1;
+}


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-01 Thread Jan Hubicka
> On Fri, Nov 18, 2011 at 6:47 PM, Martin Jambor  wrote:
> > Hi,
> >
> > PR 50744 is an issue with an integer overflow when we propagate the
> > estimated size and time effects from callees to callers.  Because such
> > paths in the parameter value graph can be arbitrarily long, we simply
> > need to introduce an artificial cap on these values.  This is what the
> > patch below does.  The cap should be more than enough for any
> > reasonable values.
> >
> > Moreover, I have looked at how we then process the accumulated
> > estimates and decided to compute evaluation ratio in
> > good_cloning_opportunity_p in HOST_WIDEST_INT.  Call graph frequencies
> > are numerators of fractions with denominator 1000 and therefore
> > capping the size and cost estimate as well as the frequency sums so
> > that their multiplication would not overflow an int seems too
> > constraining on 32bit hosts.
> >
> > Bootstrapped and tested on x86_64-linux without any problems, OK for
> > trunk?
> 
> This introduces host-dependent code generation differences, right?
> You can simply use int64_t for code that is run on the host only.

Well, if we rely on int64_t being around now (that is probably the case with
C++ switch), HOST_WIDEST_INT is always equivalent, isn't it?
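
To make the discussion concrete, here is a minimal sketch of the two ideas in
the quoted description: capping accumulated estimates, and doing the
multiplication in a 64-bit type.  Everything named here (EST_CAP, capped_add,
evaluation_ratio) is hypothetical; the cap value and the exact formula used in
good_cloning_opportunity_p are not reproduced, and the inputs are assumed to
be non-negative with a non-zero size cost.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical cap on accumulated estimates; the value used by the
   actual patch is not shown in this thread.  */
#define EST_CAP (INT_MAX / 2)

/* Add two non-negative estimates, saturating at EST_CAP instead of
   overflowing.  */
static int
capped_add (int a, int b)
{
  if (a > EST_CAP - b)
    return EST_CAP;
  return a + b;
}

/* Compute a benefit * frequency / cost style ratio in 64 bits, so the
   product cannot overflow a 32-bit int on any host.  SIZE_COST is
   assumed to be non-zero.  */
static int64_t
evaluation_ratio (int time_benefit, int freq_sum, int size_cost)
{
  return ((int64_t) time_benefit * freq_sum) / size_cost;
}
```

Because int64_t has the same width on every host, a computation like this
gives the same answer regardless of how wide the host's int or long is, which
is the point of the host-dependence question above.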

Honza


Re: [PATCH, lto]: Handle *tm regparm attribute

2011-12-01 Thread Uros Bizjak
2011/12/1 Jan Hubicka :

>> >> Attached patch handles "*tm regparm" attribute, to avoid "*tm regparm"
>> >> attribute ignored warnings in lto compile on non-x86 targets.
>> >>
>> >> 2011-11-30  Uros Bizjak  
>> >>
>> >>       * lto-lang.c (lto_attribute_table): Handle *tm regparm.
>> >>       (ignore_attribute): New.
>> >>
>> >> Tested on alphaev68-pc-linux-gnu and x86_64-pc-linux-gnu {,-m32}.
>> >>
>> >> OK for mainline?
>> >
>> > Won't similar change be needed for other tm attributes?  Perhaps we could 
>> > just
>> > silence the warning with in_lto_p predicate.
>>
>> Doesn't it need to be handled as well, not just ignored?
> I would expect stuff to be handled at parsing time and thus streamed into IL.

Please see the comment. This attribute is intended to be redefined by
the target-dependent handling, so my patch just prevents the warning
(the code is copied from c-family/c-common.c). x86 targets that
redefine the attribute handling work OK even without the patch,
hinting at the fact that redefinition works OK. We just need to
prevent warning for non-x86 targets.

Uros.


Re: [PATCH 4/5] arm: Set predicable on more instructions.

2011-12-01 Thread Richard Henderson
On 12/01/2011 03:22 AM, Ramana Radhakrishnan wrote:
>>   "tst\\t%0, #255"
>> -  [(set_attr "conds" "set")]
>> +  [(set_attr "conds" "set")
>> +   (set_attr "predicable" "yes")]
>>  )
> 
> It should be tst%? . Otherwise in the predicable case we wouldn't have
> the condition code printed out.

Yes, it should.


r~


Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-01 Thread Jakub Jelinek
On Thu, Dec 01, 2011 at 05:16:21PM +0100, Jan Hubicka wrote:
> > This introduces host-dependent code generation differences, right?
> > You can simply use int64_t for code that is run on the host only.
> 
> Well, if we rely on int64_t being around now (that is probably the case with
> C++ switch), HOST_WIDEST_INT is always equivalent, isn't it?

I don't think that is related to C++ switch, because C++03 doesn't have long long,
only C++11 and C99 have it.  We apparently are using int64_t or uint64_t in a
couple of places already though:

lto-streamer-out.c:  uint64_t size;
lto-streamer-out.c: ? (uint64_t) int_size_in_bytes (TREE_TYPE (t))
lto-streamer-out.c: : (((uint64_t) TREE_INT_CST_HIGH (DECL_SIZE_UNIT (t))) << 32)
ada/tb-gcc.c:  uwx_get_reg ((struct uwx_env *) uw_context, UWX_REG_IP, (uint64_t *) &pc);
lto/lto.c:int64_t
lto/lto.c:  uint64_t ret = 0;
lto/lto.c:  int64_t offset;
lto/lto.h:int64_t lto_parse_hex (const char *p);

Jakub


Re: [PATCH, lto]: Handle *tm regparm attribute

2011-12-01 Thread Richard Guenther
On Thu, Dec 1, 2011 at 5:19 PM, Uros Bizjak  wrote:
> 2011/12/1 Jan Hubicka :
>
>>> >> Attached patch handles "*tm regparm" attribute, to avoid "*tm regparm"
>>> >> attribute ignored warnings in lto compile on non-x86 targets.
>>> >>
>>> >> 2011-11-30  Uros Bizjak  
>>> >>
>>> >>       * lto-lang.c (lto_attribute_table): Handle *tm regparm.
>>> >>       (ignore_attribute): New.
>>> >>
>>> >> Tested on alphaev68-pc-linux-gnu and x86_64-pc-linux-gnu {,-m32}.
>>> >>
>>> >> OK for mainline?
>>> >
>>> > Won't similar change be needed for other tm attributes?  Perhaps we could 
>>> > just
>>> > silence the warning with in_lto_p predicate.
>>>
>>> Doesn't it need to be handled as well, not just ignored?
>> I would expect stuff to be handled at parsing time and thus streamed into IL.
>
> Please see the comment. This attribute is intended to be redefined by
> the target-dependent handling, so my patch just prevents the warning
> (the code is copied from c-family/c-common.c). x86 targets that
> redefine the attribute handling work OK even without the patch,
> hinting at the fact that redefinition works OK. We just need to
> prevent warning for non-x86 targets.

The patch is ok.

Thanks,
Richard.

> Uros.


Re: Go patch committed: Multiplex goroutines onto OS threads

2011-12-01 Thread Rainer Orth
Ian Lance Taylor  writes:

> This patch changes the Go library to multiplex goroutines onto operating
> system threads.  Previously, each new goroutine ran in a separate
> thread.  That is inefficient for programs with lots of goroutines.  This
> patch changes the library such that it runs a certain number of
> threads, and lets each thread switch between goroutines.  This is how
> the master Go library works, and this patch brings in code from the
> master Go library, adjusted for use by gccgo.
[...]
> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.  Tested
> both with and without -fsplit-stack support.  Committed to mainline.

Unfortunately, this patch broke Solaris bootstrap (and would break IRIX
bootstrap if this ever started working again):

/vol/gcc/src/hg/trunk/local/libgo/runtime/go-signal.c:221:1: error: conflicting types for 'sigignore'
In file included from /vol/gcc/src/hg/trunk/local/libgo/runtime/go-signal.c:7:0:
/var/gcc/regression/trunk/8-gcc/build/./gcc/include-fixed/signal.h:100:12: note: previous declaration of 'sigignore' was here
make[4]: *** [go-signal.lo] Error 1

<signal.h> on all of Solaris, IRIX, and Tru64 UNIX has

extern int sigignore(int);

I've fixed this by using sig_ignore instead.
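
As a self-contained illustration of the rename (a sketch, not the libgo
code): a no-op handler called sig_ignore avoids the clash with the
`int sigignore (int)` declaration that the system header provides on those
targets.  The helper install_noop_handler is hypothetical and exists only to
make the example complete.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* No-op handler.  It is named sig_ignore rather than sigignore because
   <signal.h> on some systems already declares `int sigignore (int)'.  */
static void
sig_ignore (int sig __attribute__ ((unused)))
{
}

/* Hypothetical helper: install the no-op handler for SIG.
   Returns 0 on success and -1 on error, like sigaction.  */
static int
install_noop_handler (int sig)
{
  struct sigaction sa;

  memset (&sa, 0, sizeof sa);
  sigemptyset (&sa.sa_mask);
  sa.sa_handler = sig_ignore;
  sa.sa_flags = SA_RESTART;
  return sigaction (sig, &sa, NULL);
}
```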

Rainer


diff --git a/libgo/runtime/go-signal.c b/libgo/runtime/go-signal.c
--- a/libgo/runtime/go-signal.c
+++ b/libgo/runtime/go-signal.c
@@ -218,7 +218,7 @@ sighandler (int sig)
 /* Ignore a signal.  */
 
 static void
-sigignore (int sig __attribute__ ((unused)))
+sig_ignore (int sig __attribute__ ((unused)))
 {
 }
 
@@ -247,7 +247,7 @@ runtime_initsig (int32 queue)
   if (signals[i].catch || signals[i].queue)
sa.sa_handler = sighandler;
   else
-   sa.sa_handler = sigignore;
+   sa.sa_handler = sig_ignore;
   sa.sa_flags = signals[i].restart ? SA_RESTART : 0;
   if (sigaction (signals[i].sig, &sa, NULL) != 0)
__go_assert (0);


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Don't change DR_STMT if vect_pattern_recog_1 would fail (PR tree-optimization/51356)

2011-12-01 Thread Jakub Jelinek
Hi!

As mentioned in the PR, vect_pattern_recog_1 attempts to find out
if the computed type_in and type_out are already vector types or not,
and uses VECTOR_MODE_P (TYPE_MODE (type_in)) as the test.  Unfortunately,
get_vectype_for_scalar_type on some targets (e.g. PowerPC) returns a
VECTOR_TYPE with TImode for a DImode integer/boolean scalar type.
If that happens, vect_recog_bool_pattern assumes it will succeed and changes
DR_STMT, but vect_mark_pattern_stmts isn't called and we ICE later on.
Not sure what actually can be vectorized using scalar mode vectors,
so either we adjust vect_recog_bool_pattern like this, or perhaps
vect_pattern_recog_1 could use a different test (TREE_CODE (type_in) ==
VECTOR_TYPE)?

This has been bootstrapped/regtested on x86_64-linux and i686-linux
and fixes the testcase on PowerPC.

2011-12-01  Jakub Jelinek  

PR tree-optimization/51356
* tree-vect-patterns.c (vect_recog_bool_pattern): Give up if
vectype doesn't have VECTOR_MODE_P.

--- gcc/tree-vect-patterns.c.jj 2011-11-29 15:09:18.0 +0100
+++ gcc/tree-vect-patterns.c2011-11-30 17:57:42.183149742 +0100
@@ -2078,6 +2078,8 @@ vect_recog_bool_pattern (VEC (gimple, he
   stmt_vec_info pattern_stmt_info;
   vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
   gcc_assert (vectype != NULL_TREE);
+  if (!VECTOR_MODE_P (TYPE_MODE (vectype)))
+   return NULL;
   if (!check_bool_pattern (var, loop_vinfo))
return NULL;
 

Jakub


Re: Go patch committed: New lock/note implementation

2011-12-01 Thread Rainer Orth
Ian Lance Taylor  writes:

>> ... and also Solaris 8 and 9 bootstrap which lack sem_timedwait:
>>
>> /vol/gcc/src/hg/trunk/local/libgo/runtime/thread-sema.c: In function 
>> 'runtime_semasleep':
>> /vol/gcc/src/hg/trunk/local/libgo/runtime/thread-sema.c:42:7: error: 
>> implicit declaration of function 'sem_timedwait' 
>> [-Werror=implicit-function-declaration]
>
> This one was somewhat trickier, but I think this patch will do the job.
> This uses pthread_cond_timedwait instead of sem_timedwait (I hope that
> Solaris 8 and 9 have pthread_cond_timedwait).  Bootstrapped and ran Go

They do, as does IRIX.
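
For readers following along, a timed semaphore sleep built on
pthread_cond_timedwait might look roughly like this.  It is a sketch under
assumptions, not the code that was committed: the cv_sema type and the
cv_sema_* functions are hypothetical names, and the timeout is assumed to be
a relative nanosecond count with a negative value meaning "wait forever".

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical counting semaphore built from a mutex and a condition
   variable, for systems without sem_timedwait.  */
struct cv_sema
{
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  unsigned int count;
};

static void
cv_sema_init (struct cv_sema *s)
{
  pthread_mutex_init (&s->mutex, NULL);
  pthread_cond_init (&s->cond, NULL);
  s->count = 0;
}

/* Wait for at most NS nanoseconds (NS < 0 means wait forever).
   Return 0 if the semaphore was acquired, -1 on timeout.  */
static int
cv_sema_sleep (struct cv_sema *s, int64_t ns)
{
  struct timespec deadline = { 0, 0 };
  int rc = 0;

  if (ns >= 0)
    {
      /* pthread_cond_timedwait takes an absolute deadline.  */
      clock_gettime (CLOCK_REALTIME, &deadline);
      deadline.tv_sec += ns / 1000000000;
      deadline.tv_nsec += ns % 1000000000;
      if (deadline.tv_nsec >= 1000000000)
        {
          deadline.tv_sec += 1;
          deadline.tv_nsec -= 1000000000;
        }
    }

  pthread_mutex_lock (&s->mutex);
  /* Loop to cope with spurious wakeups.  */
  while (s->count == 0 && rc != ETIMEDOUT)
    {
      if (ns >= 0)
        rc = pthread_cond_timedwait (&s->cond, &s->mutex, &deadline);
      else
        rc = pthread_cond_wait (&s->cond, &s->mutex);
    }
  if (s->count > 0)
    {
      s->count--;
      rc = 0;
    }
  else
    rc = -1;
  pthread_mutex_unlock (&s->mutex);
  return rc;
}

/* Release the semaphore and wake one sleeper.  */
static void
cv_sema_wakeup (struct cv_sema *s)
{
  pthread_mutex_lock (&s->mutex);
  s->count++;
  pthread_cond_signal (&s->cond);
  pthread_mutex_unlock (&s->mutex);
}
```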

> testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Together with the sigignore -> sig_ignore change, I'm back to bootstrap
land again, thanks.  Unfortunately, I'm now (as of r181837) seeing
considerable testsuite regressions in the go and libgo test results:

* many 64-bit go execution failures on i386-pc-solaris2.1[01]:

Running target unix/-m64
FAIL: go.go-torture/execute/go-1.go execution,  -O0
FAIL: go.go-torture/execute/go-1.go execution,  -O1
FAIL: go.go-torture/execute/go-1.go execution,  -O2
FAIL: go.go-torture/execute/go-1.go execution,  -O2 -fomit-frame-pointer -finline-functions
FAIL: go.go-torture/execute/go-1.go execution,  -O2 -fomit-frame-pointer -finline-functions -funroll-loops
FAIL: go.go-torture/execute/go-1.go execution,  -O2 -fbounds-check
FAIL: go.go-torture/execute/go-1.go execution,  -O3 -g
FAIL: go.go-torture/execute/go-1.go execution,  -Os
FAIL: go.go-torture/execute/go-2.go execution,  -O0
FAIL: go.go-torture/execute/go-2.go execution,  -O1
FAIL: go.go-torture/execute/go-2.go execution,  -O2
FAIL: go.go-torture/execute/go-2.go execution,  -O2 -fomit-frame-pointer -finline-functions
FAIL: go.go-torture/execute/go-2.go execution,  -O2 -fomit-frame-pointer -finline-functions -funroll-loops
FAIL: go.go-torture/execute/go-2.go execution,  -O2 -fbounds-check
FAIL: go.go-torture/execute/go-2.go execution,  -O3 -g
FAIL: go.go-torture/execute/go-2.go execution,  -Os
FAIL: go.go-torture/execute/go-3.go execution,  -O0
FAIL: go.go-torture/execute/go-3.go execution,  -O1
FAIL: go.go-torture/execute/go-3.go execution,  -O2
FAIL: go.go-torture/execute/go-3.go execution,  -O2 -fomit-frame-pointer -finline-functions
FAIL: go.go-torture/execute/go-3.go execution,  -O2 -fomit-frame-pointer -finline-functions -funroll-loops
FAIL: go.go-torture/execute/go-3.go execution,  -O2 -fbounds-check
FAIL: go.go-torture/execute/go-3.go execution,  -O3 -g
FAIL: go.go-torture/execute/go-3.go execution,  -Os
FAIL: go.go-torture/execute/select-1.go execution,  -O0
FAIL: go.go-torture/execute/select-1.go execution,  -O1
FAIL: go.go-torture/execute/select-1.go execution,  -O2
FAIL: go.go-torture/execute/select-1.go execution,  -O2 -fomit-frame-pointer -finline-functions
FAIL: go.go-torture/execute/select-1.go execution,  -O2 -fomit-frame-pointer -finline-functions -funroll-loops
FAIL: go.go-torture/execute/select-1.go execution,  -O2 -fbounds-check
FAIL: go.go-torture/execute/select-1.go execution,  -O3 -g
FAIL: go.go-torture/execute/select-1.go execution,  -Os
FAIL: go.test/test/235.go execution,  -O2 -g 
FAIL: go.test/test/bigalg.go execution,  -O2 -g 
FAIL: go.test/test/chan/doubleselect.go execution,  -O2 -g 
FAIL: go.test/test/chan/fifo.go execution,  -O2 -g 
FAIL: go.test/test/chan/goroutines.go execution,  -O2 -g 
FAIL: go.test/test/chan/nonblock.go execution,  -O2 -g 
FAIL: go.test/test/chan/powser1.go execution,  -O2 -g 
FAIL: go.test/test/chan/powser2.go execution,  -O2 -g 
FAIL: go.test/test/chan/select2.go execution,  -O2 -g 
FAIL: go.test/test/chan/select3.go execution,  -O2 -g 
FAIL: go.test/test/chan/select6.go execution,  -O2 -g 
FAIL: go.test/test/chan/sieve1.go execution,  -O2 -g 
FAIL: go.test/test/chan/sieve2.go execution,  -O2 -g 
FAIL: go.test/test/closure.go execution,  -O2 -g 
FAIL: go.test/test/escape.go execution,  -O2 -g 
FAIL: go.test/test/fixedbugs/bug067.go execution,  -O2 -g 
FAIL: go.test/test/fixedbugs/bug130.go execution,  -O2 -g 
FAIL: go.test/test/fixedbugs/bug147.go execution,  -O2 -g 
FAIL: go.test/test/fixedbugs/bug243.go execution,  -O2 -g 
FAIL: go.test/test/func5.go execution,  -O2 -g 
FAIL: go.test/test/goprint.go execution,  -O2 -g 
FAIL: go.test/test/ken/chan.go execution,  -O2 -g 
FAIL: go.test/test/ken/chan1.go execution,  -O2 -g 
FAIL: go.test/test/ken/cplx5.go execution,  -O2 -g 
FAIL: go.test/test/mallocfin.go execution,  -O2 -g 
FAIL: go.test/test/nil.go execution,  -O2 -g 
FAIL: go.test/test/range.go execution,  -O2 -g 
FAIL: go.test/test/stack.go execution,  -O2 -g 

* All 64-bit libgo tests fail on the same target:

FAIL: asn1
/vol/gcc/src/hg/trunk/local/libgo/testsuite/gotest[422]: gotest-timeout: cannot create [No such file or directory]
checkId: 65 should be 1
checkId: 66 should be 1

I've not yet investigated what's going on here.

Rainer

-- 
--

[PATCH] Improve debug info if tree DCE removes stores (PR debug/50317)

2011-12-01 Thread Jakub Jelinek
Hi!

As discussed in the PR, in 4.7 we regressed some GDB testcases, because
previously unused addressable vars were first optimized into
non-addressable and only afterwards removed (which results in correct debug
stmts covering those assignments), but after some recent changes it is
CDDCE that removes them before they are update_address_taken optimized.

In the PR I've offered a patch to schedule another update_address_taken
pass before first cddce, but Michael is right that perhaps some other
DCE pass could have a similar issue.

So instead, with this patch, if the DCEd var stores have an addressable lhs
with an is_gimple_reg_type type, we add debug stmts even for them.
Such variables aren't target_for_debug_bind though, which breaks
var-tracking.  So, if all occurrences of the var are optimized away, the
patch just clears the TREE_ADDRESSABLE bit like update_address_taken would,
and, if that didn't happen until expansion, just ignores those debug
stmts so that var-tracking isn't upset by seeing non-tracked vars in debug
insns.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-12-01  Jakub Jelinek  

PR debug/50317
* tree-ssa-dce.c (remove_dead_stmt): Add a debug stmt when removing
as unnecessary a store to a variable with gimple reg type.
* tree-ssa-live.c (remove_unused_locals): Clear TREE_ADDRESSABLE bit
on local unreferenced variables.
* cfgexpand.c (expand_gimple_basic_block): Don't emit DEBUG_INSNs
for !target_for_debug_bind variables.

--- gcc/tree-ssa-live.c.jj  2011-11-28 15:41:46.376749700 +0100
+++ gcc/tree-ssa-live.c 2011-12-01 12:04:12.920595572 +0100
@@ -1,5 +1,5 @@
 /* Liveness for SSA trees.
-   Copyright (C) 2003, 2004, 2005, 2007, 2008, 2009, 2010
+   Copyright (C) 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
Free Software Foundation, Inc.
Contributed by Andrew MacLeod 
 
@@ -814,7 +814,15 @@ remove_unused_locals (void)
  bitmap_set_bit (global_unused_vars, DECL_UID (var));
}
  else
-   continue;
+   {
+ /* For unreferenced local vars drop TREE_ADDRESSABLE
+bit in case it is referenced from debug stmts.  */
+ if (DECL_CONTEXT (var) == current_function_decl
+ && TREE_ADDRESSABLE (var)
+ && is_gimple_reg_type (TREE_TYPE (var)))
+   TREE_ADDRESSABLE (var) = 0;
+ continue;
+   }
}
   else if (TREE_CODE (var) == VAR_DECL
   && DECL_HARD_REGISTER (var)
--- gcc/tree-ssa-dce.c.jj   2011-11-28 15:41:46.376749700 +0100
+++ gcc/tree-ssa-dce.c  2011-12-01 12:04:12.920595572 +0100
@@ -1215,6 +1215,26 @@ remove_dead_stmt (gimple_stmt_iterator *
  ei_next (&ei);
 }
 
+  /* If this is a store into a variable that is being optimized away,
+ add a debug bind stmt if possible.  */
+  if (MAY_HAVE_DEBUG_STMTS
+  && gimple_assign_single_p (stmt)
+  && is_gimple_val (gimple_assign_rhs1 (stmt)))
+{
+  tree lhs = gimple_assign_lhs (stmt);
+  if ((TREE_CODE (lhs) == VAR_DECL || TREE_CODE (lhs) == PARM_DECL)
+ && !DECL_IGNORED_P (lhs)
+ && is_gimple_reg_type (TREE_TYPE (lhs))
+ && !is_global_var (lhs)
+ && !DECL_HAS_VALUE_EXPR_P (lhs))
+   {
+ tree rhs = gimple_assign_rhs1 (stmt);
+ gimple note
+   = gimple_build_debug_bind (lhs, unshare_expr (rhs), stmt);
+ gsi_insert_after (i, note, GSI_SAME_STMT);
+   }
+}
+
   unlink_stmt_vdef (stmt);
   gsi_remove (i, true);
   release_defs (stmt);
--- gcc/cfgexpand.c.jj  2011-12-01 11:44:56.156345109 +0100
+++ gcc/cfgexpand.c 2011-12-01 12:37:57.764791257 +0100
@@ -3903,6 +3903,11 @@ expand_gimple_basic_block (basic_block b
  rtx val;
  enum machine_mode mode;
 
+ if (TREE_CODE (var) != DEBUG_EXPR_DECL
+ && TREE_CODE (var) != LABEL_DECL
+ && !target_for_debug_bind (var))
+   goto delink_debug_stmt;
+
  if (gimple_debug_bind_has_value_p (stmt))
value = gimple_debug_bind_get_value (stmt);
  else
@@ -3932,6 +3937,7 @@ expand_gimple_basic_block (basic_block b
  PAT_VAR_LOCATION_LOC (val) = (rtx)value;
}
 
+   delink_debug_stmt:
  /* In order not to generate too many debug temporaries,
 we delink all uses of debug statements we already expanded.
 Therefore debug statements between definition and real

Jakub


Re: Use atomics in remaining libgomp/config/linux sources

2011-12-01 Thread Richard Henderson
On 12/01/2011 01:44 AM, Alan Modra wrote:
>   * config/linux/affinity.c: Use atomic rather than sync builtin.
>   * config/linux/lock.c: Likewise.
>   * config/linux/ptrlock.h: Likewise.
>   * config/linux/ptrlock.c: Likewise.
>   * config/linux/ptrlock.h (gomp_ptrlock_set): Always write here..
>   * config/linux/ptrlock.c (gomp_ptrlock_set_slow): ..not here.
>   * config/linux/futex.h (atomic_write_barrier): Delete unused function.
>   * config/linux/alpha/futex.h (atomic_write_barrier): Likewise.
>   * config/linux/ia64/futex.h (atomic_write_barrier): Likewise.
>   * config/linux/mips/futex.h (atomic_write_barrier): Likewise.
>   * config/linux/powerpc/futex.h (atomic_write_barrier): Likewise.
>   * config/linux/s390/futex.h (atomic_write_barrier): Likewise.
>   * config/linux/sparc/futex.h (atomic_write_barrier): Likewise.
>   * config/linux/x86/futex.h (atomic_write_barrier): Likewise.

Ok.
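
For context, the difference between the two builtin families named in the
ChangeLog can be sketched as follows.  The wrapper names are hypothetical and
the memory orders are chosen only for illustration; they are not claimed to
match what libgomp uses.

```c
#include <stdbool.h>

/* Legacy __sync builtin: compare-and-swap with an implied full barrier.  */
static bool
cas_sync (int *p, int expected, int desired)
{
  return __sync_bool_compare_and_swap (p, expected, desired);
}

/* __atomic builtin: the caller states the memory order explicitly
   (acquire on success, relaxed on failure in this sketch).  */
static bool
cas_atomic (int *p, int expected, int desired)
{
  return __atomic_compare_exchange_n (p, &expected, desired, false,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* An explicit release store takes the place of a plain store paired
   with a separate write barrier.  */
static void
store_release (int *p, int value)
{
  __atomic_store_n (p, value, __ATOMIC_RELEASE);
}

/* Matching acquire load.  */
static int
load_acquire (int *p)
{
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}
```

With explicit release stores available, a standalone helper such as
atomic_write_barrier no longer has a caller, which fits with it being deleted
as unused in the per-target futex.h files.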


r~


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Peter Bergner
On Thu, 2011-12-01 at 07:42 -0800, Richard Henderson wrote:
> I didn't notice CR registers being saved in the linux setjmp function,
> but perhaps I just missed it?

I believe the setjmp/getcontext functions save the entire CR rather than
just the non-volatile CR fields.  Looking at the glibc code, I do see:

  mfcr r0

followed by a store of r0.

Peter





Re: [PATCH 2/5] arm: Emit swp for pre-armv6.

2011-12-01 Thread Andrew MacLeod

On 12/01/2011 10:49 AM, Richard Henderson wrote:
> On 12/01/2011 02:59 AM, Richard Earnshaw wrote:
>> It's essential we don't emit SWP instructions directly into code on any
>> platform where that code may possibly be run on a later core that has
>> LDREX/STREX.  If we do that we'll end up with a mess that can't be resolved.
>
> Ok.  It's easy enough to drop that patch.
>
>> I also think that GCC should NOT provide those helper functions, though
>> we should probably write a document describing how a user might do so.
>
> I'll refer you to MacLeod at this point and the atomics support library...



What helper functions?  The __atomic_* ones used when lock-free routines
cannot be provided?  They are defined here:

http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary

and recently I tried starting a discussion on its future:

http://gcc.gnu.org/ml/gcc/2011-11/msg00503.html

Follow up with any opinions there, I guess :-)

Andrew
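
For readers unfamiliar with the builtins under discussion, here is a minimal
sketch of how user code reaches them.  It assumes a GCC that provides the
__atomic builtins; the spin_lock_t, spin_try_lock and spin_unlock names are
made up for this illustration and are not part of GCC or the library on the
wiki page.  When the target has no lock-free sequence for the operation, GCC
emits a call to an external routine with the same parameters instead of
expanding it inline.

```c
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = held.  */
typedef int spin_lock_t;

/* Try once to take the lock; returns true on success.  */
static bool spin_try_lock(spin_lock_t *lock)
{
    /* Atomically store 1 and fetch the previous value.  */
    return __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE) == 0;
}

/* Release the lock so that a later spin_try_lock can succeed.  */
static void spin_unlock(spin_lock_t *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
```

The source never names SWP or LDREX/STREX; the choice of instruction, or of a
library call, is left to the compiler and the target.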




Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Iain Sandoe


On 1 Dec 2011, at 16:50, Peter Bergner wrote:

> On Thu, 2011-12-01 at 07:42 -0800, Richard Henderson wrote:
>> I didn't notice CR registers being saved in the linux setjmp function,
>> but perhaps I just missed it?
>
> I believe the setjmp/getcontext functions save the entire CR rather than
> just the non-volatile CR fields.  Looking at the glibc code, I do see:
>
>   mfcr r0
>
> followed by a store of r0.

yes, I wasn't clear - we do the same on Darwin (rather than splitting
out the non-volatile).

However, we (on Darwin) don't seem to save the FPSCR - and there's no
mention of it in "preserved regs" section of the ABI doc.  I wonder if
that's an oversight.


Iain



Re: [PATCH, PR 50744] Prevent overflows in IPA-CP

2011-12-01 Thread Jan Hubicka
> On Thu, Dec 01, 2011 at 05:16:21PM +0100, Jan Hubicka wrote:
> > > This introduces host-dependent code generation differences, right?
> > > You can simply use int64_t for code that is run on the host only.
> > 
> > Well, if we rely on int64_t being around now (that is probably the case with
> > C++ switch), HOST_WIDEST_INT is always equivalent, isn't it?
> 
> I don't think that is related to C++ switch, because C++03 doesn't have long long,
> only C++11 and C99 has it.  We apparently are using int64_t or uint64_t in a
> couple of places already though:
> 
> lto-streamer-out.c:  uint64_t size;
> lto-streamer-out.c:   ? (uint64_t) int_size_in_bytes (TREE_TYPE (t))
> lto-streamer-out.c:   : (((uint64_t) TREE_INT_CST_HIGH (DECL_SIZE_UNIT (t))) << 32)
> ada/tb-gcc.c:  uwx_get_reg ((struct uwx_env *) uw_context, UWX_REG_IP, (uint64_t *) &pc);
> lto/lto.c:int64_t
> lto/lto.c:  uint64_t ret = 0;
> lto/lto.c:  int64_t offset;
> lto/lto.h:int64_t lto_parse_hex (const char *p);

Yep, all of that is relatively new code.
I originally suggested HOST_WIDEST_INT to Martin since a bootstrapped compiler
will have it 64-bit, and the overflow happens only in the very extreme case where
a function passes constant arguments very many times.
int64_t would work for me too.

Honza
> 
>   Jakub
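
The overflow being discussed can be illustrated with a small sketch.  This is
not the IPA-CP code; scaled_benefit and saturate_to_int are hypothetical
helpers that only show the general idea of doing the arithmetic in a 64-bit
host type and clamping afterwards.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: scale a per-call benefit by the number of calls,
   doing the multiplication in 64 bits so it cannot wrap in int.  */
static int64_t scaled_benefit(int benefit, int call_count)
{
    return (int64_t) benefit * call_count;
}

/* Clamp a 64-bit value back into int range.  */
static int saturate_to_int(int64_t v)
{
    if (v > INT_MAX)
        return INT_MAX;
    if (v < INT_MIN)
        return INT_MIN;
    return (int) v;
}
```

Because int64_t has the same width on every host, the result does not depend
on which machine the compiler was built on.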


Re: Go patch committed: Multiplex goroutines onto OS threads

2011-12-01 Thread Ian Lance Taylor
Rainer Orth  writes:

> Unfortunately, this patch broke Solaris bootstrap (and would break IRIX
> bootstrap if this ever started working again):
>
> /vol/gcc/src/hg/trunk/local/libgo/runtime/go-signal.c:221:1: error: conflicting types for 'sigignore'
> In file included from /vol/gcc/src/hg/trunk/local/libgo/runtime/go-signal.c:7:0:
> /var/gcc/regression/trunk/8-gcc/build/./gcc/include-fixed/signal.h:100:12: note: previous declaration of 'sigignore' was here
> make[4]: *** [go-signal.lo] Error 1
>
> <signal.h> on all of Solaris, IRIX, and Tru64 UNIX has
>
> extern int sigignore(int);
>
> I've fixed this by using sig_ignore instead.

Thanks.  Patch committed.

Ian
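
A minimal sketch of the kind of rename described above, assuming only ISO C's
signal() and raise().  The handler body and the install_sig_ignore helper are
illustrative and are not the libgo implementation; the point is that a
file-local handler named sig_ignore cannot clash with a system declaration of
int sigignore(int).

```c
#include <signal.h>

/* Counts deliveries so the effect of the handler can be observed.  */
static volatile sig_atomic_t ignored_count;

/* Named sig_ignore rather than sigignore, so it cannot conflict with
   the `extern int sigignore(int);' found in some <signal.h> headers.  */
static void sig_ignore(int sig)
{
    (void) sig;
    ignored_count++;
}

/* Hypothetical helper: install the handler; returns 0 on success.  */
static int install_sig_ignore(int sig)
{
    return signal(sig, sig_ignore) == SIG_ERR ? -1 : 0;
}
```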


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Peter Bergner
On Thu, 2011-12-01 at 17:01 +, Iain Sandoe wrote:
> However, we (on Darwin) don't seem to save the FPSCR  - and there's no  
> mention of it in "preserved regs" section of the ABI doc.  I wonder if  
> that's an oversight.

The ppc* linux ABIs state the FPSCR is volatile, maybe it's the same
on Darwin?

Peter





Re: PR middle-end/51273: call cgraph_call_node_duplication_hooks

2011-12-01 Thread Richard Henderson
On 11/27/2011 04:44 PM, Patrick Marlier wrote:
>   PR middle-end/51273
>   * cgraph.h (cgraph_call_node_duplication_hooks): Declare.
>   * cgraph.c (cgraph_call_node_duplication_hooks): Make global. 
>   * cgraphunit.c (cgraph_copy_node_for_versioning): Call duplication
>   hooks.

Applied.


r~


Re: [PATCH] Don't change DR_STMT if vect_pattern_recog_1 would fail (PR tree-optimization/51356)

2011-12-01 Thread Ira Rosen
On 1 December 2011 18:41, Jakub Jelinek  wrote:
> Hi!

Hi,

>
> As mentioned in the PR, vect_pattern_recog_1 attempts to find out
> if the computed type_in and type_out are already vector types or not,
> and uses VECTOR_MODE_P (TYPE_MODE (type_in)) as the test.  Unfortunately,
> get_vectype_for_scalar_type on some targets (e.g. PowerPC) returns a
> VECTOR_TYPE with TImode for a DImode integer/boolean scalar type.
> If that happens, vect_recog_bool_pattern assumes it will succeed and changes
> DR_STMT, but vect_mark_pattern_stmts isn't called and we ICE later on.
> Not sure what actually can be vectorized using scalar mode vectors,
> so either we adjust vect_recog_bool_pattern like this, or perhaps
> vect_pattern_recog_1 could use a different test (TREE_CODE (type_in) ==
> VECTOR_TYPE)?

But AFAIU in the later case we would fail to vectorize anyway, so I am
OK with your patch.

Thanks,
Ira

>
> This has been bootstrapped/regtested on x86_64-linux and i686-linux
> and fixes the testcase on PowerPC.
>
> 2011-12-01  Jakub Jelinek  
>
>        PR tree-optimization/51356
>        * tree-vect-patterns.c (vect_recog_bool_pattern): Give up if
>        vectype doesn't have VECTOR_MODE_P.
>
> --- gcc/tree-vect-patterns.c.jj 2011-11-29 15:09:18.0 +0100
> +++ gcc/tree-vect-patterns.c    2011-11-30 17:57:42.183149742 +0100
> @@ -2078,6 +2078,8 @@ vect_recog_bool_pattern (VEC (gimple, he
>       stmt_vec_info pattern_stmt_info;
>       vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
>       gcc_assert (vectype != NULL_TREE);
> +      if (!VECTOR_MODE_P (TYPE_MODE (vectype)))
> +       return NULL;
>       if (!check_bool_pattern (var, loop_vinfo))
>        return NULL;
>
>
>        Jakub
>


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread David Edelsohn
On Thu, Dec 1, 2011 at 10:30 AM, Richard Henderson  wrote:

> I made it up.  As he said, it's only used for padding to *attempt to* avoid
> false sharing.  Currently sources won't actually fail with the wrong
> cacheline value, but they'll work more efficiently with the right value.

The cache line size is 128 bytes for PPC64 and 32 bytes for PPC32.

The L1 cache line size is also a GCC parameter

PARAM_L1_CACHE_LINE_SIZE

and is set by most ports.

- David
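
To make the padding idea concrete, here is a small sketch, assuming C11
alignas.  CACHELINE_SIZE is set to the PPC64 figure quoted above purely for
illustration, and struct padded_counter is a hypothetical type, not something
taken from libitm.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumed value for illustration: the PPC64 line size quoted above.  */
#define CACHELINE_SIZE 128

/* Aligning the member forces the whole struct to occupy a full line.  */
struct padded_counter {
    alignas(CACHELINE_SIZE) unsigned long value;
};

/* One slot per thread; adjacent slots never share a cache line.  */
static struct padded_counter counters[4];

/* Distance in bytes between two neighbouring slots.  */
static size_t counter_stride(void)
{
    return (size_t) ((char *) &counters[1] - (char *) &counters[0]);
}
```

A value that is too small still yields correct code; neighbouring slots may
then share a line and cost performance, which matches the remark quoted above.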


Re: [PATCH] Improve debug info if tree DCE removes stores (PR debug/50317)

2011-12-01 Thread Jeff Law
On 12/01/11 09:49, Jakub Jelinek wrote:
> Hi!
> 
> As discussed in the PR, in 4.7 we regressed some GDB testcases,
> because previously unused addressable vars were first optimized
> into non-addressable and only afterwards removed (which
> results in correct debug stmts covering those assignments), but
> after some recent changes it is CDDCE that removes them before they
> are update_address_taken optimized.
> 
> In the PR I've offered a patch to schedule another
> update_address_taken pass before first cddce, but Michael is right
> that perhaps some other DCE pass could have similar issue.
> 
> So with this patch, if the DCEd var stores have an addressable lhs
> but an is_gimple_reg_type type, we add debug stmts even for
> them.  Such variables aren't target_for_debug_bind though, which
> breaks var-tracking.  So if all occurrences of the var
> are optimized away, the patch just clears the TREE_ADDRESSABLE bit like
> update_address_taken would, and, if that didn't happen until
> expansion, just ignores those debug stmts so that var-tracking
> isn't upset by seeing non-tracked vars in debug insns.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
> 
> 2011-12-01  Jakub Jelinek  
> 
>   PR debug/50317
>   * tree-ssa-dce.c (remove_dead_stmt): Add a debug stmt when
>   removing as unnecessary a store to a variable with gimple reg type.
>   * tree-ssa-live.c (remove_unused_locals): Clear TREE_ADDRESSABLE
>   bit on local unreferenced variables.
>   * cfgexpand.c (expand_gimple_basic_block): Don't emit DEBUG_INSNs
>   for !target_for_debug_bind variables.
OK.
Jeff


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Joseph S. Myers
On Thu, 1 Dec 2011, Iain Sandoe wrote:

> However, we (on Darwin) don't seem to save the FPSCR  - and there's no mention
> of it in "preserved regs" section of the ABI doc.  I wonder if that's an
> oversight.

As I previously noted in the ARM discussion, C specifically says that 
setjmp/longjmp should *not* save/restore floating-point exceptions and 
rounding modes.  Think of the floating-point state as being a global 
variable (well, thread-local).

-- 
Joseph S. Myers
jos...@codesourcery.com


Go patch committed: Export and import of predeclared error type

2011-12-01 Thread Ian Lance Taylor
This patch from Rémy Oudompheng fixes the export and import of the
predeclared error type.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 2769c29f2014 go/export.cc
--- a/go/export.cc	Thu Dec 01 09:07:49 2011 -0800
+++ b/go/export.cc	Thu Dec 01 10:31:54 2011 -0800
@@ -337,6 +337,7 @@
   this->register_builtin_type(gogo, "uintptr", BUILTIN_UINTPTR);
   this->register_builtin_type(gogo, "bool", BUILTIN_BOOL);
   this->register_builtin_type(gogo, "string", BUILTIN_STRING);
+  this->register_builtin_type(gogo, "error", BUILTIN_ERROR);
 }
 
 // Register one builtin type in the export table.
diff -r 2769c29f2014 go/export.h
--- a/go/export.h	Thu Dec 01 09:07:49 2011 -0800
+++ b/go/export.h	Thu Dec 01 10:31:54 2011 -0800
@@ -39,8 +39,9 @@
   BUILTIN_STRING = -16,
   BUILTIN_COMPLEX64 = -17,
   BUILTIN_COMPLEX128 = -18,
+  BUILTIN_ERROR = -19,
 
-  SMALLEST_BUILTIN_CODE = -18
+  SMALLEST_BUILTIN_CODE = -19
 };
 
 // This class manages exporting Go declarations.  It handles the main
diff -r 2769c29f2014 go/import.cc
--- a/go/import.cc	Thu Dec 01 09:07:49 2011 -0800
+++ b/go/import.cc	Thu Dec 01 10:31:54 2011 -0800
@@ -706,6 +706,7 @@
   this->register_builtin_type(gogo, "uintptr", BUILTIN_UINTPTR);
   this->register_builtin_type(gogo, "bool", BUILTIN_BOOL);
   this->register_builtin_type(gogo, "string", BUILTIN_STRING);
+  this->register_builtin_type(gogo, "error", BUILTIN_ERROR);
 }
 
 // Register a single builtin type.


Re: [PATCH, PR 50622] Force a gimple operand in load_assign_lhs_subreplacements when necessary

2011-12-01 Thread Jeff Law
On 12/01/11 09:15, Martin Jambor wrote:
> Hi,
> 
> PR 50622 is an omission in load_assign_lhs_subreplacements, which 
> should force a gimple operand on a RHS of a gimple assignment if
> both sides are new replacements of scalar types which are not
> gimple registers, because they are partially modified (which can
> happen to complex numbers and bit-fields).
> 
> Fixed with the patch below.  It passes bootstrap and testsuite on 
> x86_64-linux, I am about to do the same on the 4.6 branch because
> I'd like to commit it there as well.  OK for trunk and the 4.6
> branch?
> 
> Thanks,
> 
> Martin
> 
> 
> 2011-12-01  Martin Jambor  
> 
>   PR tree-optimization/50622
>   * tree-sra.c (load_assign_lhs_subreplacements): Force gimple
>   operand if both lacc and racc are grp_partial_lhs.
> 
>   * testsuite/g++.dg/tree-ssa/pr50622.C: New test.
OK for trunk.  Release manager owns 4.6 branch.

jeff


[Gcov] Unbreak C++ coverage

2011-12-01 Thread Nathan Sidwell
I've committed this patch, which unbreaks the firefox build problem Markus
found.  The problem is that the list of functions to emit coverage data is
determined before the final culling of functions that don't need emitting.
There's a circular dependency here with the cgraph machinery, and I need to look
carefully as to how that might be broken.

This does revert some of the new features I was implementing, and the xfailed
gcov test is a case of this.  I decided to simply remove the tests checking
coverage object visibility, rather than augment the scan-assembler machinery.

Tested on i686-pc-linux-gnu.  Thanks to Markus for verifying this patch does
indeed unbreak firefox.


nathan
2011-12-01  Nathan Sidwell  

PR gcov-profile/51113
* coverage.c (build_var): Keep coverage variables static.

testsuite/
* lib/gcov.exp (verify-lines): Add support for xfailing.
(run-gcov): Likewise.
* gcc.misc-tests/gcov-13.c: Xfail weak function.
* gcc.misc-tests/gcov-16.c: Remove.
* gcc.misc-tests/gcov-17.c: Remove.
* g++.dg/gcov-8.C: Remove.
* g++.dg/gcov-9.C: Remove.
* g++.dg/gcovpart-12b.C: New.
* g++.dg/gcov-12.C: New.

Index: coverage.c
===
--- coverage.c  (revision 181858)
+++ coverage.c  (working copy)
@@ -657,8 +657,7 @@ coverage_end_function (unsigned lineno_c
 }
 
 /* Build a coverage variable of TYPE for function FN_DECL.  If COUNTER
-   >= 0 it is a counter array, otherwise it is the function structure.
-   Propagate appropriate linkage and visibility from the function decl.  */
+   >= 0 it is a counter array, otherwise it is the function structure.  */
 
 static tree
 build_var (tree fn_decl, tree type, int counter)
@@ -675,21 +674,6 @@ build_var (tree fn_decl, tree type, int
   TREE_STATIC (var) = 1;
   TREE_ADDRESSABLE (var) = 1;
   DECL_ALIGN (var) = TYPE_ALIGN (type);
-  DECL_WEAK (var) = DECL_WEAK (fn_decl);
-  TREE_PUBLIC (var)
-= TREE_PUBLIC (fn_decl) && (counter < 0 || DECL_WEAK (fn_decl));
-  if (DECL_ONE_ONLY (fn_decl))
-make_decl_one_only (var, DECL_COMDAT_GROUP (fn_decl));
-  
-  if (TREE_PUBLIC (var))
-{
-  DECL_VISIBILITY (var) = DECL_VISIBILITY (fn_decl);
-  DECL_VISIBILITY_SPECIFIED (var)
-   = DECL_VISIBILITY_SPECIFIED (fn_decl);
-
-  /* Initialize assembler name so we can stream out. */
-  DECL_ASSEMBLER_NAME (var);
-}
 
   return var;
 }
Index: testsuite/lib/gcov.exp
===
--- testsuite/lib/gcov.exp  (revision 181858)
+++ testsuite/lib/gcov.exp  (working copy)
@@ -39,19 +39,28 @@ proc clean-gcov { testcase } {
 #
 proc verify-lines { testcase file } {
 #send_user "verify-lines\n"
+global subdir
 set failed 0
 set fd [open $file r]
 while { [gets $fd line] >= 0 } {
 # We want to match both "-" and "#" as count as well as numbers,
 # since we want to detect lines that shouldn't be marked as covered.
-   if [regexp "^ *(\[^:]*): *(\[0-9\\-#]+):.*count\\((\[0-9\\-#]+)\\)" \
-   "$line" all is n shouldbe] {
+   if [regexp "^ *(\[^:]*): *(\[0-9\\-#]+):.*count\\((\[0-9\\-#]+)\\)(.*)" \
+   "$line" all is n shouldbe rest] {
+   if [regexp "^ *{(.*)}" $rest all xfailed] {
+   switch [dg-process-target $xfailed] {
+   "N" { continue }
+   "F" { setup_xfail "*-*-*" }
+   }
+   }
if { $is == "" } {
-   fail "$n:no data available for this line"
+   fail "$subdir/$testcase:$n:no data available for this line"
incr failed
} elseif { $is != $shouldbe } {
-   fail "$n:is $is:should be $shouldbe"
+   fail "$subdir/$testcase:$n:is $is:should be $shouldbe"
incr failed
+   } else {
+   pass "$subdir/$testcase:$n line count"
}
}
 }
@@ -230,32 +239,36 @@ proc run-gcov { args } {
 global GCOV
 global srcdir subdir
 
-set gcov_args [lindex $args end]
-
+set gcov_args ""
 set gcov_verify_calls 0
 set gcov_verify_branches 0
-set gcov_execute_xfail ""
-set gcov_verify_xfail ""
+set xfailed 0
 
 foreach a $args {
if { $a == "calls" } {
  set gcov_verify_calls 1
} elseif { $a == "branches" } {
  set gcov_verify_branches 1
+   } elseif { $gcov_args == "" } {
+   set gcov_args $a
+   } else {
+   switch [dg-process-target $a] {
+   "N" { return }
+   "F" { set xfailed 1 }
+   }
}
 }
 
 # Extract the test name from the arguments.
 set testcase [lindex $gcov_args end]
 
-if { $gcov_execute_xfail != "" } {
-   eval setup_xfail [split $gcov_execute_xfail]
-}
-
 verbose "Running $GCOV $testcase" 2
 set testc

Re: [testsuite] xfail target-specific asms, & gcov

2011-12-01 Thread Nathan Sidwell

On 12/01/11 15:39, Mike Stump wrote:
> Makes this trivially true. In your patch, you alter the spelling of a test,
> never do that. Once that is fixed, Ok.

Oh dear,  I didn't mean to do that.  Anyway, I decided to remove the visibility
tests here for the moment, so the scanasm change is moot and not committed.
Thanks for review anyway.


nathan


Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Richard Henderson
On 12/01/2011 10:47 AM, Joseph S. Myers wrote:
> As I previously noted in the ARM discussion, C specifically says that 
> setjmp/longjmp should *not* save/restore floating-point exceptions and 
> rounding modes.  Think of the floating-point state as being a global 
> variable (well, thread-local).

Exactly.

If the fpscr _were_ a TLS variable, seen to be modified inside a transaction, 
we would log its initial value so that we could restore that original value on 
a transaction restart or transaction cancel.

So, as you say the ARM libc setjmp/longjmp implementation is wrong, but we do 
need that save and restore here.


r~
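
The "log its initial value, restore on restart or cancel" idea can be sketched
as a tiny undo log.  This is a hypothetical single-entry illustration with
made-up names (struct undo_entry, log_first_write, rollback); libitm's real
logging is more general and is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical single-entry undo log, for illustration only.  */
struct undo_entry {
    unsigned int *addr;   /* location modified inside the transaction */
    unsigned int  saved;  /* value it held when first touched */
    int           valid;  /* nonzero once an original value is recorded */
};

/* Record the original value the first time ADDR is about to be written.
   Later writes leave the recorded value untouched.  */
static void log_first_write(struct undo_entry *e, unsigned int *addr)
{
    if (!e->valid) {
        e->addr = addr;
        e->saved = *addr;
        e->valid = 1;
    }
}

/* On restart or cancel, put the original value back.  */
static void rollback(struct undo_entry *e)
{
    if (e->valid) {
        *e->addr = e->saved;
        e->valid = 0;
    }
}
```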


[PR bootstrap/51346] Fix lto profiledbootstrap (issue5437103)

2011-12-01 Thread Diego Novillo
These two patches fix the profiledbootstrap failure I caused earlier
this week.

The first patch reverts my original fix.  The second one implements a
different approach for this problem.  Instead of trying to keep the
edge attribute in sync with the statement, we do not use the edge
attribute as long as there is a statement on that edge.

This is still sub-optimal.  We should only have a single no-inline
attribute.  Given that we sometimes have a callgraph without code, the
attribute should be on the edge.  Honza, how hard would it be to
implement that?

In addition to fixing my internal build failure and the bootstrap,
this fixes two tests:

g++.dg/lto/20101020-1
gcc.c-torture/execute/920501-1.c


Tested on x86_64 with profiledbootstrap.  Committed to trunk.


Diego.
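
The second approach can be sketched with toy types.  struct toy_stmt and
struct toy_edge are hypothetical stand-ins for the gimple statement and the
cgraph edge; only the selection rule mirrors the description above: prefer the
statement's flag whenever the edge has a statement, and fall back to the
edge's own flag otherwise.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, unrelated to GCC's real data structures.  */
struct toy_stmt {
    bool cannot_inline;
};

struct toy_edge {
    const struct toy_stmt *call_stmt;      /* NULL when no code is loaded */
    bool call_stmt_cannot_inline_p;        /* copy kept on the edge */
};

/* Use the statement's indicator if there is a statement, else the edge's.  */
static bool edge_cannot_inline_p(const struct toy_edge *e)
{
    return e->call_stmt ? e->call_stmt->cannot_inline
                        : e->call_stmt_cannot_inline_p;
}
```

With this rule a stale flag on the edge is harmless as long as a statement is
present, so the two copies no longer have to be kept in sync.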

PR bootstrap/51346
Revert
2011-11-29   Diego Novillo  

* gimple.c (gimple_call_set_cannot_inline): Move from gimple.h.
Update field call_stmt_cannot_inline_p from call
graph edge, if needed.
* gimple.h (gimple_call_set_cannot_inline): Move to gimple.c.


PR bootstrap/51346
* ipa-inline.c (can_inline_edge_p): If the edge E has a
statement, use the statement's inline indicator instead
of E's.
Remove consistency check.


diff --git a/gcc/gimple.c b/gcc/gimple.c
index d27e94b..071c651 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -5558,34 +5558,4 @@ gimple_asm_clobbers_memory_p (const_gimple stmt)
 
   return false;
 }
-
-
-/* Set the inlinable status of GIMPLE_CALL S to INLINABLE_P.  */
-
-void
-gimple_call_set_cannot_inline (gimple s, bool inlinable_p)
-{
-  bool prev_inlinable_p;
-
-  GIMPLE_CHECK (s, GIMPLE_CALL);
-
-  prev_inlinable_p = gimple_call_cannot_inline_p (s);
-
-  if (inlinable_p)
-s->gsbase.subcode |= GF_CALL_CANNOT_INLINE;
-  else
-s->gsbase.subcode &= ~GF_CALL_CANNOT_INLINE;
-
-  /* If we have changed the inlinable attribute, and there is a call
- graph edge going out of this statement, update its inlinable
- attribute as well.  */
-  if (current_function_decl && prev_inlinable_p != inlinable_p)
-{
-  struct cgraph_node *n = cgraph_get_node (current_function_decl);
-  struct cgraph_edge *e = cgraph_edge (n, s);
-  if (e)
-   e->call_stmt_cannot_inline_p = inlinable_p;
-}
-}
-
 #include "gt-gimple.h"
diff --git a/gcc/gimple.h b/gcc/gimple.h
index df31bf3..8536c70 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1035,7 +1035,6 @@ extern bool walk_stmt_load_store_ops (gimple, void *,
 extern bool gimple_ior_addresses_taken (bitmap, gimple);
 extern bool gimple_call_builtin_p (gimple, enum built_in_function);
 extern bool gimple_asm_clobbers_memory_p (const_gimple);
-extern void gimple_call_set_cannot_inline (gimple, bool);
 
 /* In gimplify.c  */
 extern tree create_tmp_var_raw (tree, const char *);
@@ -2344,6 +2343,19 @@ gimple_call_tail_p (gimple s)
 }
 
 
+/* Set the inlinable status of GIMPLE_CALL S to INLINABLE_P.  */
+
+static inline void
+gimple_call_set_cannot_inline (gimple s, bool inlinable_p)
+{
+  GIMPLE_CHECK (s, GIMPLE_CALL);
+  if (inlinable_p)
+s->gsbase.subcode |= GF_CALL_CANNOT_INLINE;
+  else
+s->gsbase.subcode &= ~GF_CALL_CANNOT_INLINE;
+}
+
+
 /* Return true if GIMPLE_CALL S cannot be inlined.  */
 
 static inline bool
-- 
1.7.3.1



diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 3dadf8d..e3c6b3c 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -246,6 +246,14 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
   struct function *caller_cfun = DECL_STRUCT_FUNCTION (e->caller->decl);
   struct function *callee_cfun
 = callee ? DECL_STRUCT_FUNCTION (callee->decl) : NULL;
+  bool call_stmt_cannot_inline_p;
+
+  /* If E has a call statement in it, use the inline attribute from
+ the statement, otherwise use the inline attribute in E.  Edges
+ will not have statements when working in WPA mode.  */
+  call_stmt_cannot_inline_p = (e->call_stmt)
+ ? gimple_call_cannot_inline_p (e->call_stmt)
+ : e->call_stmt_cannot_inline_p;
 
   if (!caller_cfun && e->caller->clone_of)
 caller_cfun = DECL_STRUCT_FUNCTION (e->caller->clone_of->decl);
@@ -270,7 +278,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
   e->inline_failed = CIF_OVERWRITABLE;
   return false;
 }
-  else if (e->call_stmt_cannot_inline_p)
+  else if (call_stmt_cannot_inline_p)
 {
   e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
   inlinable = false;
@@ -343,14 +351,6 @@ can_inline_edge_p (struct cgraph_edge *e, bool report)
}
 }
 
-  /* Be sure that the cannot_inline_p flag is up to date.  */
-  gcc_checking_assert (!e->call_stmt
-  || (gimple_call_cannot_inline_p (e->call_stmt)
-  == e->call_stmt_cannot_inline_p)
-  /* In -flto-partition=none mode we really keep things 
out of
-  

Re: [patch] original function and TM clone has to be marked needed

2011-12-01 Thread Richard Henderson
On 11/30/2011 08:25 PM, Patrick Marlier wrote:
> In the current version, the original function and its clone are marked as 
> taken with cgraph_mark_address_taken_node. But it seems not enough and it has 
> to be marked as needed.
> It comes with a testcase (testsuite/g++.dg/tm/ctor-used.C).
> 
> Passed all TM tests.
> 
> PS: Note that there is still a problem with the testcase because 
> _ITM_getTMCloneOrIrrevocable is called instead of _ITM_getTMCloneSafe (it is 
> an atomic not relaxed transaction). I can have a look if you want.
> 
> 2011-11-30  Patrick Marlier  
> 
> * trans-mem.c (ipa_tm_insert_gettmclone_call): mark original
> and clone as needed.
> 

There are more problems than that.  This function is defined in this 
file, and somehow we declined to clone it.  We should not have attempted
to call _ITM_getTMCloneOrIrrevocable or _ITM_getTMCloneSafe.


r~


Re: [C++ Patch] PR 51326

2011-12-01 Thread Jason Merrill

OK.

Jason


[PATCH] Fix PR middle-end/39976, 200.sixtrack degradation

2011-12-01 Thread William J. Schmidt
Greetings,

Bug 39976 reported a degradation to 200.sixtrack wherein a hot
single-block loop is broken into two blocks.  Investigation showed the
cause to be a redundant PHI statement in the block, which the
tree-outof-ssa logic doesn't handle well.  Currently we don't have code
following the introduction of the redundant PHI that can clean it up.

This patch modifies the dom pass to include redundant PHIs in the logic
that removes redundant computations.  With the patch applied, the extra
block is no longer created and the 200.sixtrack degradation is removed.
This improves its performance by 7.3% on PowerPC64 32-bit and by 5.0% on
PowerPC64 64-bit.

Bootstrapped and regtested on powerpc64-linux.  OK for trunk?

Thanks,
Bill
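
As a rough illustration of what "redundant PHI" means here, the following toy
sketch compares PHI argument lists position by position, which is the same
comparison the EXPR_PHI case performs on real operands.  SSA names are
modelled as small integers, and struct toy_phi, toy_phi_equal_p and
find_redundant are hypothetical names that do not exist in GCC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a PHI node in one basic block.  */
struct toy_phi {
    int        result;  /* name defined by the PHI */
    size_t     nargs;   /* one argument per incoming edge */
    const int *args;
};

/* Two PHIs in the same block compute the same value when their
   argument lists match position by position.  */
static bool toy_phi_equal_p(const struct toy_phi *a, const struct toy_phi *b)
{
    if (a->nargs != b->nargs)
        return false;
    for (size_t i = 0; i < a->nargs; i++)
        if (a->args[i] != b->args[i])
            return false;
    return true;
}

/* Return the result of an earlier equivalent PHI, or -1 if none.  */
static int find_redundant(const struct toy_phi *phis, size_t n, size_t idx)
{
    for (size_t j = 0; j < idx && j < n; j++)
        if (toy_phi_equal_p(&phis[j], &phis[idx]))
            return phis[j].result;
    return -1;
}
```

The patch uses a hash table rather than this linear scan; the scan is used
here only to keep the sketch short.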


2011-11-29  Bill Schmidt  

PR middle-end/39976
* tree-ssa-dom.c (enum expr_kind): Add EXPR_PHI.
(struct hashable_expr): Add struct phi field.
(initialize_hash_element): Handle phis.
(hashable_expr_equal_p): Likewise.
(iterative_hash_hashable_expr): Likewise.
(print_expr_hash_elt): Likewise.
(dom_opt_enter_block): Create equivalences from redundant phis.
(eliminate_redundant_computations): Handle redundant phis.


Index: gcc/tree-ssa-dom.c
===
--- gcc/tree-ssa-dom.c  (revision 181501)
+++ gcc/tree-ssa-dom.c  (working copy)
@@ -52,7 +52,8 @@ enum expr_kind
   EXPR_UNARY,
   EXPR_BINARY,
   EXPR_TERNARY,
-  EXPR_CALL
+  EXPR_CALL,
+  EXPR_PHI
 };
 
 struct hashable_expr
@@ -65,6 +66,7 @@ struct hashable_expr
 struct { enum tree_code op;  tree opnd0, opnd1; } binary;
 struct { enum tree_code op;  tree opnd0, opnd1, opnd2; } ternary;
 struct { gimple fn_from; bool pure; size_t nargs; tree *args; } call;
+struct { size_t nargs; tree *args; } phi;
   } ops;
 };
 
@@ -281,6 +283,19 @@ initialize_hash_element (gimple stmt, tree lhs,
   expr->kind = EXPR_SINGLE;
   expr->ops.single.rhs = gimple_goto_dest (stmt);
 }
+  else if (code == GIMPLE_PHI)
+{
+  size_t nargs = gimple_phi_num_args (stmt);
+  size_t i;
+
+  expr->type = TREE_TYPE (gimple_phi_result (stmt));
+  expr->kind = EXPR_PHI;
+  expr->ops.phi.nargs = nargs;
+  expr->ops.phi.args = (tree *) xcalloc (nargs, sizeof (tree));
+
+  for (i = 0; i < nargs; i++)
+expr->ops.phi.args[i] = gimple_phi_arg_def (stmt, i);
+}
   else
 gcc_unreachable ();
 
@@ -439,6 +454,21 @@ hashable_expr_equal_p (const struct hashable_expr
 return true;
   }
 
+case EXPR_PHI:
+  {
+size_t i;
+
+if (expr0->ops.phi.nargs !=  expr1->ops.phi.nargs)
+  return false;
+
+for (i = 0; i < expr0->ops.phi.nargs; i++)
+  if (! operand_equal_p (expr0->ops.phi.args[i],
+ expr1->ops.phi.args[i], 0))
+return false;
+
+return true;
+  }
+
 default:
   gcc_unreachable ();
 }
@@ -516,6 +546,15 @@ iterative_hash_hashable_expr (const struct hashabl
   }
   break;
 
+case EXPR_PHI:
+  {
+size_t i;
+
+for (i = 0; i < expr->ops.phi.nargs; i++)
+  val = iterative_hash_expr (expr->ops.phi.args[i], val);
+  }
+  break;
+
 default:
   gcc_unreachable ();
 }
@@ -588,6 +627,22 @@ print_expr_hash_elt (FILE * stream, const struct e
   fprintf (stream, ")");
 }
 break;
+
+  case EXPR_PHI:
+{
+  size_t i;
+  size_t nargs = element->expr.ops.phi.nargs;
+
+  fprintf (stream, "PHI <");
+  for (i = 0; i < nargs; i++)
+{
+  print_generic_expr (stream, element->expr.ops.phi.args[i], 0);
+  if (i + 1 < nargs)
+fprintf (stream, ", ");
+}
+  fprintf (stream, ">");
+}
+break;
 }
   fprintf (stream, "\n");
 
@@ -1688,6 +1743,10 @@ dom_opt_enter_block (struct dom_walk_data *walk_da
   /* PHI nodes can create equivalences too.  */
   record_equivalences_from_phis (bb);
 
+  /* Create equivalences from redundant PHIs.  */
+  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+eliminate_redundant_computations (&gsi);
+
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 optimize_stmt (bb, gsi);
 
@@ -1818,13 +1877,27 @@ eliminate_redundant_computations (gimple_stmt_iter
 {
   tree expr_type;
   tree cached_lhs;
+  tree def;
   bool insert = true;
   bool assigns_var_p = false;
+  size_t i;
 
   gimple stmt = gsi_stmt (*gsi);
 
-  tree def = gimple_get_lhs (stmt);
+  /* If this is a PHI, we only want to consider it if all of its
+ arguments are SSA names (which are known to be defined in a
+ single place).  This avoids errors when dealing with if-temps,
+ for example.  */
+  if (gimple_code (stmt) == GIMPLE_PHI)
+for (i = 0; i < gimple_phi_num_args (stmt); i++)
+  if (TREE_CODE (gimple_phi_arg_def (s

Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Iain Sandoe


On 1 Dec 2011, at 20:20, Richard Henderson wrote:

> On 12/01/2011 10:47 AM, Joseph S. Myers wrote:
>> As I previously noted in the ARM discussion, C specifically says that
>> setjmp/longjmp should *not* save/restore floating-point exceptions and
>> rounding modes.  Think of the floating-point state as being a global
>> variable (well, thread-local).
>
> Exactly.
>
> If the fpscr _were_ a TLS variable, seen to be modified inside a
> transaction, we would log its initial value so that we could restore
> that original value on a transaction restart or transaction cancel.

well, I saw ...

        mffs    f0
        stfd    f14, 0+OFS_FR+BASE(r1)

[snip]

        stfd    f0, OFS_FPSCR+BASE(r1)

in your posted code ... and wondered ..

> So, as you say the ARM libc setjmp/longjmp implementation is wrong,
> but we do need that save and restore here.

now I'm slightly confused - do we need to preserve it across the call
or not?


cheers
Iain



RTEMS Specific Ada Patch

2011-12-01 Thread Joel Sherrill

Hi,

The attached patch is necessary to let the gcc head
compile Ada for *-*-rtems*.  Other than terminals.c,
the files impacted are RTEMS specific.  OK to commit?

I have posted ACATS results for sparc-rtems4.11 at

http://gcc.gnu.org/ml/gcc-testresults/2011-12/msg00108.html

These are better results than what I posted back in
April for a 4.6.1 prerelease:

http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00209.html

2011-12-01  Joel Sherrill 

* gcc/ada/s-tpopsp-rtems.adb: Use ATCB_Key rather than
RTEMS_Ada_Self variable for consistency with other ports.
* gcc/ada/s-osinte-rtems.adb: Add body for dummy implementation
of pthread_rwlockattr_setkind_np().
* gcc/ada/s-osinte-rtems.ads: Add missing clock and rwlock bindings.

* gcc/ada/terminals.c: Add __rtems__ conditionals to account
for differences in termios implementation.

--
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985


Index: gcc/ada/s-tpopsp-rtems.adb
===
--- gcc/ada/s-tpopsp-rtems.adb  (revision 181881)
+++ gcc/ada/s-tpopsp-rtems.adb  (working copy)
@@ -10,7 +10,7 @@
 -- $Revision: 1.2 $
 --  --
 --Copyright (C) 1991-2003, Florida State University --
---Copyright (C) 2008, Free Software Foundation, Inc.--
+--Copyright (C) 2008-2011, Free Software Foundation, Inc.   --
 --  --
 -- GNARL is free software; you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -48,8 +48,8 @@
--  The following gives the Ada run-time direct access to a variable
--  context switched by RTEMS at the lowest level.
 
-   RTEMS_Ada_Self : System.Address;
-   pragma Import (C, RTEMS_Ada_Self, "rtems_ada_self");
+   ATCB_Key : System.Address;
+   pragma Import (C, ATCB_Key, "rtems_ada_self");
 

-- Initialize --
@@ -59,8 +59,7 @@
   pragma Warnings (Off, Environment_Task);
 
begin
-  ATCB_Key := No_Key;
-  RTEMS_Ada_Self := To_Address (Environment_Task);
+  ATCB_Key := To_Address (Environment_Task);
end Initialize;
 
---
@@ -69,7 +68,7 @@
 
function Is_Valid_Task return Boolean is
begin
-  return RTEMS_Ada_Self /= System.Null_Address;
+  return ATCB_Key /= System.Null_Address;
end Is_Valid_Task;
 
-
@@ -78,7 +77,7 @@
 
procedure Set (Self_Id : Task_Id) is
begin
-  RTEMS_Ada_Self := To_Address (Self_Id);
+  ATCB_Key := To_Address (Self_Id);
end Set;
 
--
@@ -102,7 +101,7 @@
   Result : System.Address;
 
begin
-  Result := RTEMS_Ada_Self;
+  Result := ATCB_Key;
 
   --  If the key value is Null, then it is a non-Ada task.
 
Index: gcc/ada/s-osinte-rtems.adb
===
--- gcc/ada/s-osinte-rtems.adb  (revision 181881)
+++ gcc/ada/s-osinte-rtems.adb  (working copy)
@@ -122,4 +122,17 @@
   return 0;
end sigaltstack;
 
+   ---
+   -- pthread_rwlockattr_setkind_np --
+   ---
+
+   function pthread_rwlockattr_setkind_np
+ (attr : access pthread_rwlockattr_t;
+  pref : int) return int is
+  pragma Unreferenced (attr);
+  pragma Unreferenced (pref);
+   begin
+  return 0;
+   end pthread_rwlockattr_setkind_np;
+
 end System.OS_Interface;
Index: gcc/ada/s-osinte-rtems.ads
===
--- gcc/ada/s-osinte-rtems.ads  (revision 181881)
+++ gcc/ada/s-osinte-rtems.ads  (working copy)
@@ -6,7 +6,7 @@
 --  --
 --   S p e c--
 --  --
---  Copyright (C) 1997-2009 Free Software Foundation, Inc.  --
+--  Copyright (C) 1997-2011 Free Software Foundation, Inc.  --
 --  

[patch committed] Fix target/50814

2011-12-01 Thread Kaz Kojima
Hi,

The attached patch is to fix PR50814.  The sh2a support wrongly
assumed that shad/shld instructions on sh2a are 4-byte long.
The patch is tested on sh-elf.

Regards,
kaz
--
2011-12-01  Kaz Kojima  

PR target/50814.
* config/sh/sh.c (expand_ashiftrt): Handle TARGET_SH2A same as
TARGET_SH3.
(shl_sext_kind): Likewise.
* config/sh/sh.h (SH_DYNAMIC_SHIFT_COST): Likewise.
* config/sh/sh.md (ashlsi3_sh2a, ashrsi3_sh2a, lshrsi3_sh2a):
Remove.
(ashlsi3_std): Handle TARGET_SH2A same as TARGET_SH3.
(ashlsi3): Likewise.
(ashrsi3_d): Likewise.
(lshrsi3_d): Likewise.
(lshrsi3): Likewise.
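
As a rough illustration of the SH_DYNAMIC_SHIFT_COST entry above, the following C sketch models the cost selection before and after the change.  The struct and function names are hypothetical stand-ins for the TARGET_* flags and optimize_size that the real macro reads; this is not the GCC code itself, only the same decision written as plain functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target flags consulted by the macro. */
struct sh_target {
    bool hard_sh4;       /* models TARGET_HARD_SH4 */
    bool sh3;            /* models TARGET_SH3 */
    bool sh2a;           /* models TARGET_SH2A */
    bool optimize_size;  /* models optimize_size */
};

/* Before the patch: SH2A is not mentioned, so it falls through to the
   expensive default cost of 20.  */
int dynamic_shift_cost_old(const struct sh_target *t)
{
    if (t->hard_sh4)
        return 1;
    if (t->sh3)
        return t->optimize_size ? 1 : 2;
    return 20;
}

/* After the patch: SH2A is costed the same way as SH3.  */
int dynamic_shift_cost_new(const struct sh_target *t)
{
    if (t->hard_sh4)
        return 1;
    if (t->sh3 || t->sh2a)
        return t->optimize_size ? 1 : 2;
    return 20;
}
```

With only the sh2a field set, the old function yields 20 and the new one yields 2, which is the kind of difference that would make dynamic shifts look attractive to the cost-driven paths such as shl_sext_kind.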

diff -up ORIG/trunk/gcc/config/sh/sh.c trunk/gcc/config/sh/sh.c
--- ORIG/trunk/gcc/config/sh/sh.c   2011-11-13 09:19:44.0 +0900
+++ trunk/gcc/config/sh/sh.c2011-11-28 09:45:17.0 +0900
@@ -3266,7 +3266,7 @@ expand_ashiftrt (rtx *operands)
   char func[18];
   int value;
 
-  if (TARGET_SH3)
+  if (TARGET_SH3 || TARGET_SH2A)
 {
   if (!CONST_INT_P (operands[2]))
{
@@ -3715,7 +3715,7 @@ shl_sext_kind (rtx left_rtx, rtx size_rt
}
}
 }
-  if (TARGET_SH3)
+  if (TARGET_SH3 || TARGET_SH2A)
 {
   /* Try to use a dynamic shift.  */
   cost = shift_insns[32 - insize] + 1 + SH_DYNAMIC_SHIFT_COST;
diff -up ORIG/trunk/gcc/config/sh/sh.h trunk/gcc/config/sh/sh.h
--- ORIG/trunk/gcc/config/sh/sh.h   2011-11-15 09:37:11.0 +0900
+++ trunk/gcc/config/sh/sh.h2011-11-28 09:45:00.0 +0900
@@ -2394,7 +2394,8 @@ extern int current_function_interrupt;
 #define ACCUMULATE_OUTGOING_ARGS TARGET_ACCUMULATE_OUTGOING_ARGS
 
 #define SH_DYNAMIC_SHIFT_COST \
-  (TARGET_HARD_SH4 ? 1 : TARGET_SH3 ? (optimize_size ? 1 : 2) : 20)
+  (TARGET_HARD_SH4 ? 1 \
+   : (TARGET_SH3 || TARGET_SH2A) ? (optimize_size ? 1 : 2) : 20)
 
 
 #define NUM_MODES_FOR_MODE_SWITCHING { FP_MODE_NONE }
diff -up ORIG/trunk/gcc/config/sh/sh.md trunk/gcc/config/sh/sh.md
--- ORIG/trunk/gcc/config/sh/sh.md  2011-10-16 10:18:53.0 +0900
+++ trunk/gcc/config/sh/sh.md   2011-11-28 09:54:17.0 +0900
@@ -1,6 +1,6 @@
 ;;- Machine description for Renesas / SuperH SH.
 ;;  Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
-;;  2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+;;  2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
 ;;  Free Software Foundation, Inc.
 ;;  Contributed by Steve Chamberlain (s...@cygnus.com).
 ;;  Improved by Jim Wilson (wil...@cygnus.com).
@@ -3568,15 +3568,6 @@ label:
 ;;
 ;; shift left
 
-(define_insn "ashlsi3_sh2a"
-  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
-   (ashift:SI (match_operand:SI 1 "arith_reg_operand" "0")
-  (match_operand:SI 2 "arith_reg_operand" "r")))]
-  "TARGET_SH2A"
-  "shad%2,%0"
-  [(set_attr "type" "arith")
-   (set_attr "length" "4")])
-
 ;; This pattern is used by init_expmed for computing the costs of shift
 ;; insns.
 
@@ -3585,14 +3576,14 @@ label:
(ashift:SI (match_operand:SI 1 "arith_reg_operand" "0,0,0,0")
   (match_operand:SI 2 "nonmemory_operand" "r,M,P27,?ri")))
(clobber (match_scratch:SI 3 "=X,X,X,&r"))]
-  "TARGET_SH3
+  "(TARGET_SH3 || TARGET_SH2A)
|| (TARGET_SH1 && satisfies_constraint_P27 (operands[2]))"
   "@
shld%2,%0
add %0,%0
shll%O2 %0
#"
-  "TARGET_SH3
+  "(TARGET_SH3 || TARGET_SH2A)
&& reload_completed
&& CONST_INT_P (operands[2])
&& ! satisfies_constraint_P27 (operands[2])"
@@ -3671,7 +3662,7 @@ label:
   if (CONST_INT_P (operands[2])
   && sh_dynamicalize_shift_p (operands[2]))
 operands[2] = force_reg (SImode, operands[2]);
-  if (TARGET_SH3)
+  if (TARGET_SH3 || TARGET_SH2A)
 {
   emit_insn (gen_ashlsi3_std (operands[0], operands[1], operands[2]));
   DONE;
@@ -3728,15 +3719,6 @@ label:
 ; arithmetic shift right
 ;
 
-(define_insn "ashrsi3_sh2a"
-  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
-   (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
-  (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))]
-  "TARGET_SH2A"
-  "shad%2,%0"
-  [(set_attr "type" "dyn_shift")
-   (set_attr "length" "4")])
-
 (define_insn "ashrsi3_k"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")
(ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
@@ -3831,7 +3813,7 @@ label:
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")
(ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
 (neg:SI (match_operand:SI 2 "arith_reg_operand" "r"))))]
-  "TARGET_SH3"
+  "TARGET_SH3 || TARGET_SH2A"
   "shad%2,%0"
   [(set_attr "type" "dyn_shift")])
 
@@ -3879,20 +3861,11 @@ label:
 
 ;; logical shift right
 
-(define_insn "lshrsi3_sh2a"
-  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
-   (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
-(neg:SI (match_operand:SI 2 

Re: [RFC] Port libitm to powerpc

2011-12-01 Thread Iain Sandoe


On 1 Dec 2011, at 22:42, Iain Sandoe wrote:


now I'm slightly confused - do we need to preserve it across the
call or not?


erm.  not well phrased.

I am trying to get a grasp on what determines the set of registers  
that should be saved.


Initially, I was thinking that it was the "call-saved" set.  The Darwin
ABI is silent about the FPSCR there (consistent with Joseph's remark),
although I note that the ABI doc, in most cases, states YES/NO for each
register.


Now I'm wondering if the saved set needs to include most/all of the  
set that are saved for exceptions?


cheers
Iain



Re: Go patch committed: New lock/note implementation

2011-12-01 Thread Ian Lance Taylor
Rainer Orth  writes:

> FAIL: go.go-torture/execute/go-1.go execution,  -O0 

There should be more information in gcc/testsuite/go/go.log.


> * All 64-bit libgo tests fail on the same target:
>
> FAIL: asn1
> /vol/gcc/src/hg/trunk/local/libgo/testsuite/gotest[422]: gotest-timeout: 
> cannot create [No such file or directory]
> checkId: 65 should be 1
> checkId: 66 should be 1
>
> I've not yet investigated what's going on here.

I don't know what is going on either.

gotest-timeout is just a file created by the gotest script:

(sleep `expr $timeout + 10`
echo > gotest-timeout
echo "timed out in gotest" 1>&2
kill -9 $pid) &

The error above implies that gotest is running in a directory which was
removed while the script was running.  I don't really see how libgo
could cause that to happen.  Could something else have been going on?
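
The inference can be checked with a small C sketch: it enters a scratch directory, removes it, and then tries to create gotest-timeout there.  The function name and the /tmp template are made up for the illustration, and I am assuming Linux-like behaviour, where removing the current directory is allowed and the later create fails with ENOENT; other systems may differ.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Try to create a file in a working directory that has been removed.
   Returns the errno of the failed create, 0 if the create unexpectedly
   succeeded, or -1 if the setup itself failed.  */
int create_in_removed_cwd(void)
{
    char dir[] = "/tmp/gotest-demo-XXXXXX";  /* hypothetical scratch dir */
    int result, fd;
    int oldcwd = open(".", O_RDONLY);        /* so we can return later */

    if (oldcwd < 0)
        return -1;
    if (mkdtemp(dir) == NULL) {
        close(oldcwd);
        return -1;
    }
    if (chdir(dir) != 0 || rmdir(dir) != 0) {
        result = -1;
    } else {
        /* Same operation as "echo > gotest-timeout" in the script.  */
        fd = open("gotest-timeout", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd >= 0) {
            close(fd);
            result = 0;
        } else {
            result = errno;
        }
    }
    if (fchdir(oldcwd) != 0)
        result = -1;
    close(oldcwd);
    return result;
}
```

If the call returns ENOENT, that matches the "cannot create [No such file or directory]" message in the log, which is why a vanished working directory looks like the most likely explanation.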

At this point I have no idea what is going wrong.

Ian


Go patch committed: Remove temporary function

2011-12-01 Thread Ian Lance Taylor
I added the function runtime_cond_wait to libgo temporarily during the
conversion to multiplexing goroutines.  It is no longer needed and this
patch removes it.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 2ce500c576dc libgo/runtime/runtime.h
--- a/libgo/runtime/runtime.h	Thu Dec 01 10:55:24 2011 -0800
+++ b/libgo/runtime/runtime.h	Thu Dec 01 15:41:16 2011 -0800
@@ -337,6 +337,3 @@
 #ifdef __rtems__
 void __wrap_rtems_task_variable_add(void **);
 #endif
-
-/* Temporary.  */
-void	runtime_cond_wait(pthread_cond_t*, pthread_mutex_t*);
diff -r 2ce500c576dc libgo/runtime/thread.c
--- a/libgo/runtime/thread.c	Thu Dec 01 10:55:24 2011 -0800
+++ b/libgo/runtime/thread.c	Thu Dec 01 15:41:16 2011 -0800
@@ -90,27 +90,3 @@
 	if(sigaltstack(&ss, nil) < 0)
 		*(int *)0xf1 = 0xf1;
 }
-
-// Temporary functions, which will be removed when we stop using
-// condition variables.
-
-void
-runtime_cond_wait(pthread_cond_t* cond, pthread_mutex_t* mutex)
-{
-	int i;
-
-	runtime_entersyscall();
-
-	i = pthread_cond_wait(cond, mutex);
-	if(i != 0)
-		runtime_throw("pthread_cond_wait");
-	i = pthread_mutex_unlock(mutex);
-	if(i != 0)
-		runtime_throw("pthread_mutex_unlock");
-
-	runtime_exitsyscall();
-
-	i = pthread_mutex_lock(mutex);
-	if(i != 0)
-		runtime_throw("pthread_mutex_lock");
-}


Re: [PATCH] Fix early inliner inlining uninlinable functions

2011-12-01 Thread Jan Hubicka


Sure, but then you can still have the issue of an inconsistency.
Thus, would you then remove the remaining asserts?

I believe in the end the proper fix is to _not_ throw away
cgraph edges all the time, but keep them up-to-date and thus
make the stmt flag not necessary.  (we can define "up-to-date"
in a way so that we only require that existing edges that
still have a call stmt have to be valid, thus still require
incremental recomputation to remove dead edges and create
new ones)


Well, the stmt flag always looked redundant to me.  Why don't we just initialize
the edge flag at cgraph construction time?  We do have the statement then.

Honza



Ping Re: Fix doloop bug with maximum-length loops

2011-12-01 Thread Joseph S. Myers
Ping.  This patch is pending review.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Ping Re: Fix doloop bug with maximum-length loops

2011-12-01 Thread Andrew Pinski
On Thu, Dec 1, 2011 at 6:08 PM, Joseph S. Myers  wrote:
> Ping.  This patch is pending review.
>

From my point of view, reverting my patch is fine as the testcase
which I was trying to optimize was not even optimized on the trunk
after this patch anyways.

Thanks,
Andrew Pinski


Re: Ping Re: Fix doloop bug with maximum-length loops

2011-12-01 Thread Andrew Pinski
On Thu, Dec 1, 2011 at 6:28 PM, Andrew Pinski  wrote:
> On Thu, Dec 1, 2011 at 6:08 PM, Joseph S. Myers wrote:
>> Ping.  This patch is pending review.
>>
>
> From my point of view, reverting my patch is fine as the testcase
> which I was trying to optimize was not even optimized on the trunk
> after this patch anyways.
>
> Thanks,
> Andrew Pinski

