Re: [RFC PATCH] update to libtool-2.4.2 and regenerate

2011-10-29 Thread Markus Trippelsdorf
On 2011.10.28 at 07:20 +0200, Markus Trippelsdorf wrote:
> On 2011.10.27 at 17:29 -0700, Andi Kleen wrote:
> > Markus Trippelsdorf  writes:
> > 
> > > By popular demand, I've prepared a patch that updates the in-tree
> > > libtool to version 2.4.2. It is needed for lto-bootstrap with
> > > -fno-fat-lto-objects and FreeBSD10.x versions. 
> > > It's a pretty big update as you can see by the following diffstat. I
> > > cannot attach the patch even as a gzip file, because of its size:
> > >
> > >  417745 Oct 28 00:47 0001-update-to-libtool-2.4.2-and-regenerate.patch.gz
> > >
> > > Bootstrapped on x86_64-pc-linux-gnu. 
> > 
> > Can you put it up for download somewhere? Does the slim bootstrap
> > now work on Linux?
> 
> http://trippelsdorf.de/0001-update-to-libtool-2.4.2-and-regenerate.patch.bz2
> 
> > I presume it needs at least the gcc-ar patch too, unless you use
> > custom AR/RANLIB wrappers.
> 
> Yes, slim bootstrap now works on Linux. One has to point AR,
> AR_FOR_TARGET, etc. to the wrappers and add "-fuse-linker-plugin" to the
> various CFLAGS (BOOT_CFLAGS, STAGE1_CFLAGS and CFLAGS_FOR_TARGET).

Here is what I use right now:

 % cat config/slim-lto-bootstrap.mk
# This option enables slim LTO for stage2 and stage3.

STAGE2_CFLAGS += -flto=jobserver -fno-fat-lto-objects -frandom-seed=1
STAGE3_CFLAGS += -flto=jobserver -fno-fat-lto-objects -frandom-seed=1
STAGE_CFLAGS += -fuse-linker-plugin
STAGEprofile_CFLAGS += -fno-lto
AR=/usr/local/bin/ar
NM=/usr/local/bin/nm
RANLIB=/usr/local/bin/ranlib
AR_FOR_TARGET=/usr/local/bin/ar
NM_FOR_TARGET=/usr/local/bin/nm
RANLIB_FOR_TARGET=/usr/local/bin/ranlib

And the following patch to force fixincludes to honor CFLAGS:

diff --git a/Makefile.in b/Makefile.in
index d1206bd..3f3f9e0 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -2819,6 +2819,7 @@ configure-build-fixincludes:
test ! -f $(BUILD_SUBDIR)/fixincludes/Makefile || exit 0; \
$(SHELL) $(srcdir)/mkinstalldirs $(BUILD_SUBDIR)/fixincludes ; \
$(BUILD_EXPORTS)  \
+   CFLAGS="$(STAGE_CFLAGS)"; export CFLAGS; \
echo Configuring in $(BUILD_SUBDIR)/fixincludes; \
cd "$(BUILD_SUBDIR)/fixincludes" || exit 1; \
case $(srcdir) in \
@@ -2854,6 +2855,7 @@ all-build-fixincludes: configure-build-fixincludes
$(BUILD_EXPORTS)  \
(cd $(BUILD_SUBDIR)/fixincludes && \
  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_BUILD_FLAGS)  \
+   CFLAGS="$(STAGE_CFLAGS)" \
$(TARGET-build-fixincludes))
 @endif build-fixincludes
 
@@ -7729,6 +7731,7 @@ configure-fixincludes:
test ! -f $(HOST_SUBDIR)/fixincludes/Makefile || exit 0; \
$(SHELL) $(srcdir)/mkinstalldirs $(HOST_SUBDIR)/fixincludes ; \
$(HOST_EXPORTS)  \
+   CFLAGS="$(STAGE_CFLAGS)"; export CFLAGS; \
echo Configuring in $(HOST_SUBDIR)/fixincludes; \
cd "$(HOST_SUBDIR)/fixincludes" || exit 1; \
case $(srcdir) in \
@@ -7763,6 +7766,7 @@ all-fixincludes: configure-fixincludes
$(HOST_EXPORTS)  \
(cd $(HOST_SUBDIR)/fixincludes && \
  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS)  \
+   CFLAGS="$(STAGE_CFLAGS)" \
$(TARGET-fixincludes))
 @endif fixincludes
 

With this in place you can configure and build gcc with
--with-build-config=slim-lto-bootstrap.

I haven't figured out yet how to use gcc-ar et al. instead of the shell
wrappers. The plugin path seems to be hardcoded, which is unfortunate.
Would it be possible to make the wrappers more flexible, so that they
could be used during bootstrap?

-- 
Markus


Re: [Patch,AVR]: Support -maccumulate-args option

2011-10-29 Thread Denis Chertykov
2011/10/27 Georg-Johann Lay :
> This is support of a new option -maccumulate-args that implements
> ACCUMULATE_OUTGOING_ARGS as proposed by Richard.
>
> As 4.7 will be released very soon, I'd like to supply the documentation part
> later and use the remaining stage I time for extension/improvements.
>
> The tests ran 4 times with either combination of
>  -m[no-]call-prologues
>  -m[no-]accumulate-args
>
> The results for the C testsuite are:
>
> -maccumulate-args
>    PASS -> FAIL
>    gcc.dg/compat/struct-by-value-16a
>    gcc.dg/compat/struct-by-value-17a
>    gcc.dg/compat/struct-by-value-18a
>
> -maccumulate-args
> -mcall-prologues
>    PASS -> FAIL
>    gcc.dg/compat/struct-by-value-16a
>    gcc.dg/compat/struct-by-value-17a
>    gcc.dg/compat/struct-by-value-18a
>    gcc.dg/sibcall-3.c
>    gcc.dg/sibcall-4.c
>
>    FAIL -> PASS
>    gcc.dg/torture/pta-ptrarith-3.c
>
> -mcall-prologues
>    PASS -> FAIL
>    gcc.dg/sibcall-3.c
>    gcc.dg/sibcall-4.c
>
>    FAIL -> PASS
>    gcc.dg/torture/pta-ptrarith-3.c
>
> All FAILs are runtime failures.
>  - struct-by-value-* use extreme amounts of RAM (overflow)
>  - sibcall-* fail because that optimization is off
>   with -mcall-prologues
>  - the FAILs of pta-ptrarith-3 are PR50063.
>
> There are no changes in the C++ testsuite.
>
> Ok for trunk?
>
> Johann
>
>        * config/avr/avr.opt (-maccumulate-args): New option.
>        * config/avr/avr.h (STARTING_FRAME_OFFSET): Redefine to
>        avr_starting_frame_offset.
>        (ACCUMULATE_OUTGOING_ARGS): Define to avr_accumulate_outgoing_args.
>        * config/avr/avr.md (UNSPECV_WRITE_SP_IRQ_ON): Remove.
>        (UNSPECV_WRITE_SP_IRQ_OFF): Remove.
>        (UNSPECV_WRITE_SP): New constant.
>        (*addhi3_sp_R): Rewrite to...
>        (*addhi3_sp): ...this new insn.
>        (movhi_sp_r_irq_off, movhi_sp_r_irq_on): Combine to...
>        (movhi_sp_r): ...this new insn.
>        * config/avr/avr-protos.h (avr_accumulate_outgoing_args): New.
>        (avr_starting_frame_offset): New.
>        * config/avr/avr.c (avr_accumulate_outgoing_args): New function.
>        (avr_starting_frame_offset): New function.
>        (avr_outgoing_args_size): New static function.
>        (avr_initial_elimination_offset): Use it.
>        (avr_simple_epilogue): Use it.
>        (avr_asm_function_end_prologue): Use it.
>        (expand_epilogue): Use it.
>        (expand_prologue): Use it.  Break out code to...
>        (avr_prologue_setup_frame): ...this new static function.
>        (avr_can_eliminate): Allow eliminating to frame pointer if there
>        is one.
>        (avr_frame_pointer_required_p): Use frame pointer if target has a
>        nonlocal label.
>        * config/avr/constraints.md (Csp): New constraint.
>        * config/avr/predicates.md (avr_sp_immediate_operand): Use it.
>


Please commit.

Denis.


Re: [Patch, libfortran, 3/3] Update file position lazily

2011-10-29 Thread Janne Blomqvist
On Sat, Oct 29, 2011 at 01:48, Mikael Morin  wrote:
> On Tuesday 18 October 2011 17:11:24 Janne Blomqvist wrote:
>> Also, I think I've found a small standards conformance bug. From F2008
>> (N1830) 9.10.2.23 (page 256): "... ASIS if the connection was opened
>> without changing its position." and "If the file has been repositioned
>> since the connection, the scalar-default-char-variable
>> is assigned a processor-dependent value, which shall not be REWIND
>> unless the file is positioned at its initial
>> point and shall not be APPEND unless the file is positioned so that its
>> endfile record is the next record or at its
>> terminal point if it has no endfile record.
>> "
>>
>> If my understanding of the above is correct, returning ASIS is
>> incorrect unless the position is unchanged since the OPEN statement.
>> Currently we return ASIS by default if it's neither REWIND nor APPEND.
>> So the patch changes the implementation to return the
>> processor-dependent value UNSPECIFIED in this case.
>>
> If my reading is correct, returning ASIS is as valid as returning UNSPECIFIED
> ("processor-dependent"). I have a preference for UNSPECIFIED and see your
> patch as OK, but shouldn't it be avoided if it breaks backwards compatibility?

My thinking was that the first sentence I quoted would prohibit ASIS
even though it's not explicitly forbidden in the second quoted
sentence. Fixing the implementation would thus be correcting a
standards-conformance bug.
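
Put as a decision function, the reading above looks roughly like the following. This is only a C sketch: `struct unit_state` and `inquire_position` are hypothetical names made up for illustration, not the libgfortran data structures, and the string in the last branch stands for whichever processor-dependent value ends up being chosen (UNSPECIFIED in the patch as posted).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of a connection; not the real libgfortran unit.  */
struct unit_state
{
  bool repositioned;          /* Has the file been moved since OPEN?  */
  bool at_initial;            /* Positioned at its initial point.  */
  bool at_terminal;           /* Endfile record is next, or at terminal point.  */
  const char *open_position;  /* POSITION= value in effect at OPEN.  */
};

/* Value reported by INQUIRE(POSITION=) under the reading argued above.  */
const char *
inquire_position (const struct unit_state *u)
{
  assert (u != NULL && u->open_position != NULL);
  if (!u->repositioned)
    return u->open_position;  /* Untouched since OPEN: report as opened.  */
  if (u->at_initial)
    return "REWIND";          /* Permitted only at the initial point.  */
  if (u->at_terminal)
    return "APPEND";          /* Permitted only at the terminal point.  */
  return "UNSPECIFIED";       /* Processor-dependent; never ASIS here.  */
}
```

With this shape, a caller that sees ASIS can rely on the position being unchanged since the OPEN statement.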

FWIW, it seems ifort 12.0 uses "UNDEFINED" in this case; I suppose a
case could be made for using the same. Comments?

> I'm also afraid of testsuite changes of the following kind.
> Was there no reason for the "-std=legacy"?
>
> diff --git a/gcc/testsuite/gfortran.dg/inquire_5.f90
> b/gcc/testsuite/gfortran.dg/inquire_5.f90
> index fe107a1..064f96d 100644
> --- a/gcc/testsuite/gfortran.dg/inquire_5.f90
> +++ b/gcc/testsuite/gfortran.dg/inquire_5.f90
> @@ -1,11 +1,10 @@
>  ! { dg-do run { target fd_truncate } }
> -! { dg-options "-std=legacy" }
>  !

I changed the declaration of "chr" from "character*20" to
"character(len=20)" which made std=legacy unnecessary. As the testcase
doesn't test any legacy functionality per se, I thought this change
would slightly simplify it. See also

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40881

which mass-added std=legacy to a number of testcases (including this
one) as a result of some frontend warnings changes.

Thanks for the reviews!

-- 
Janne Blomqvist


Re: Build broken due to "[PATCH] Add gcc-ar/nm/ranlib wrappers for slim LTO v2"

2011-10-29 Thread Andi Kleen
> This broke cross to cris-elf and I guess many other targets
> with TOT binutils, as follows:
> 
> mv -f Tlto-wrapper lto-wrapper

Oops.  Can you please confirm this patch fixes it?

-Andi

diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c
index fc7e4a2..1e86d20 100644
--- a/gcc/gcc-ar.c
+++ b/gcc/gcc-ar.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 
 static const char standard_libexec_prefix[] = STANDARD_LIBEXEC_PREFIX;
 static const char standard_bin_prefix[] = STANDARD_BINDIR_PREFIX;
+static const char *const target_machine = TARGET_MACHINE;
 
 static const char dir_separator[] = { DIR_SEPARATOR, 0 };
 


Re: [RFC PATCH] update to libtool-2.4.2 and regenerate

2011-10-29 Thread Andi Kleen
> I haven't figured out yet how to use gcc-ar et al. instead of the shell
> wrappers. The plugin path seems to be hardcoded, which is unfortunate.
> Would it be possible to make the wrappers more flexible, so that they
> could be used during bootstrap?

The path can be set with GCC_EXEC_PREFIX currently. Maybe the makefile
could set that.
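
As a rough illustration of that idea (this is not the gcc-ar.c code), a wrapper could take its search prefix from the environment and fall back to a compiled-in default. Everything below is an assumption made for the sketch: the helper names `choose_prefix`, `join_path` and `resolve_tool` are hypothetical, and so are any fallback directory and file name a caller passes in. Only the GCC_EXEC_PREFIX variable name comes from the discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The environment value wins when it is set and non-empty; otherwise use
   the compiled-in default.  */
const char *
choose_prefix (const char *env_value, const char *fallback)
{
  return (env_value != NULL && env_value[0] != '\0') ? env_value : fallback;
}

/* Join PREFIX and NAME into BUF with exactly one '/' between them.
   Returns 0 on success, -1 if BUF is too small.  */
int
join_path (char *buf, size_t size, const char *prefix, const char *name)
{
  size_t len = strlen (prefix);
  const char *sep = (len > 0 && prefix[len - 1] == '/') ? "" : "/";
  int n = snprintf (buf, size, "%s%s%s", prefix, sep, name);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Resolve NAME relative to $GCC_EXEC_PREFIX, or FALLBACK if it is unset.  */
int
resolve_tool (char *buf, size_t size, const char *fallback, const char *name)
{
  assert (buf != NULL && fallback != NULL && name != NULL);
  const char *prefix = choose_prefix (getenv ("GCC_EXEC_PREFIX"), fallback);
  return join_path (buf, size, prefix, name);
}
```

A bootstrap makefile would then only need to export the variable before invoking the wrapper.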

I considered implementing -B, but I was afraid it would conflict with
some wrapped program option. Maybe there's a way around that.

-Andi


Re: Build broken due to "[PATCH] Add gcc-ar/nm/ranlib wrappers for slim LTO v2"

2011-10-29 Thread Andi Kleen
On Sat, Oct 29, 2011 at 10:09:48AM +0200, Andi Kleen wrote:
> > This broke cross to cris-elf and I guess many other targets
> > with TOT binutils, as follows:
> > 
> > mv -f Tlto-wrapper lto-wrapper
> 
> Oops.  Can you please confirm this patch fixes it?

I committed the patch with ChangeLog as obvious now, after running
a full test and testing the if path manually.
-Andi

> 
> -Andi
> 
> diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c
> index fc7e4a2..1e86d20 100644
> --- a/gcc/gcc-ar.c
> +++ b/gcc/gcc-ar.c
> @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
>  
>  static const char standard_libexec_prefix[] = STANDARD_LIBEXEC_PREFIX;
>  static const char standard_bin_prefix[] = STANDARD_BINDIR_PREFIX;
> +static const char *const target_machine = TARGET_MACHINE;
>  
>  static const char dir_separator[] = { DIR_SEPARATOR, 0 };
>  
> 

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: Build broken due to "[PATCH] Add gcc-ar/nm/ranlib wrappers for slim LTO v2"

2011-10-29 Thread Hans-Peter Nilsson
> From: Andi Kleen 
> Date: Sat, 29 Oct 2011 10:09:48 +0200

> Oops.  Can you please confirm this patch fixes it?

My autotester is still busy after your commit, but has passed
the point of failure.  Thanks for fixing.

brgds, H-P


Re: [C++ Patch] PR 50901

2011-10-29 Thread Jason Merrill
OK.

Jason


Re: [Patch, libfortran, 3/3] Update file position lazily

2011-10-29 Thread Mikael Morin
On Saturday 29 October 2011 10:09:07 Janne Blomqvist wrote:
> On Sat, Oct 29, 2011 at 01:48, Mikael Morin  wrote:
> > On Tuesday 18 October 2011 17:11:24 Janne Blomqvist wrote:
> >> Also, I think I've found a small standards conformance bug. From F2008
> >> (N1830) 9.10.2.23 (page 256): "... ASIS if the connection was opened
> >> without changing its position." and "If the file has been repositioned
> >> since the connection, the scalar-default-char-variable
> >> is assigned a processor-dependent value, which shall not be REWIND
> >> unless the file is positioned at its initial
> >> point and shall not be APPEND unless the file is positioned so that its
> >> endfile record is the next record or at its
> >> terminal point if it has no endfile record.
> >> "
> >> 
> >> If my understanding of the above is correct, returning ASIS is
> >> incorrect unless the position is unchanged since the OPEN statement.
> >> Currently we return ASIS by default if it's neither REWIND nor APPEND.
> >> So the patch changes the implementation to return the
> >> processor-dependent value UNSPECIFIED in this case.
> > 
> > If my reading is correct, returning ASIS is as valid as returning
> > UNSPECIFIED ("processor-dependent"). I have a preference for UNSPECIFIED
> > and see your patch as OK, but shouldn't it be avoided if it breaks
> > backwards compatibility?
> 
> My thinking was that the first sentence I quoted would prohibit ASIS
> even though it's not explicitly forbidden in the second quoted
> sentence. Fixing the implementation would thus be correcting a
> standards-conformance bug.

Well, the first sentence does require the value to be ASIS if the position
hasn't changed, but it does not forbid ASIS if the position has changed.
On the other hand, let's think about the use cases: if we return ASIS
even when the position has changed, that value can't be used reliably to
tell that the position hasn't changed.
Thus, I think your patch is OK with the following changes.

> 
> FWIW, it seems ifort 12.0 uses "UNDEFINED" in this case; I suppose a
> case could be made for using the same. Comments?
Let's go for UNDEFINED then.

> 
> > I'm also afraid of testsuite changes of the following kind.
> > Was there no reason for the "-std=legacy"?
> > 
> > diff --git a/gcc/testsuite/gfortran.dg/inquire_5.f90
> > b/gcc/testsuite/gfortran.dg/inquire_5.f90
> > index fe107a1..064f96d 100644
> > --- a/gcc/testsuite/gfortran.dg/inquire_5.f90
> > +++ b/gcc/testsuite/gfortran.dg/inquire_5.f90
> > @@ -1,11 +1,10 @@
> >  ! { dg-do run { target fd_truncate } }
> > -! { dg-options "-std=legacy" }
> >  !
> 
> I changed the declaration of "chr" from "character*20" to
> "character(len=20)" which made std=legacy unnecessary. As the testcase
> doesn't test any legacy functionality per se, I thought this change
> would slightly simplify it. See also
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40881
> 
> which mass-added std=legacy to a number of testcases (including this
> one) as a result of some frontend warnings changes.
OK for that part.

>
> @@ -31,7 +30,8 @@
>write(7,*)'this is another record'
>backspace(7)
>inquire(7,position=chr)
> -  if (chr.NE.'ASIS') CALL ABORT
> +  if (chr.eq.'ASIS' .or. chr .eq. 'REWIND' &
> +   .or. chr .eq. 'APPEND') CALL ABORT
I think it's better to keep the more restrictive:
 if (chr.NE.'UNDEFINED') CALL ABORT
Just in case we have a memory leak some day, which makes us return some junk 
here.

Please give the others some time to comment on our discussion before
committing.
Thanks for the patch.

Mikael


[Patch ObjC/Committed] fix PR47997 (part 2).

2011-10-29 Thread Iain Sandoe

As approved on the PR thread,
Iain

Index: gcc/objc/ChangeLog
===
--- gcc/objc/ChangeLog  (revision 180650)
+++ gcc/objc/ChangeLog  (working copy)
@@ -1,3 +1,10 @@
+2011-10-29  Iain Sandoe  
+
+   PR target/47997
+   * objc-act.c (objc_build_string_object): Remove redundant second
+   call to fix_string_type ().  Add a checking assert that we are,
+   indeed, passed a STRING_CST.
+
 2011-10-18  Mikael Pettersson  
 
PR objc/50743
Index: gcc/objc/objc-act.c
===
--- gcc/objc/objc-act.c (revision 180650)
+++ gcc/objc/objc-act.c (working copy)
@@ -3128,9 +3128,8 @@ objc_build_string_object (tree string)
   struct string_descriptor *desc, key;
   void **loc;
 
-  /* Prep the string argument.  */
-  string = fix_string_type (string);
-  TREE_SET_CODE (string, STRING_CST);
+  /* We should be passed a STRING_CST.  */
+  gcc_checking_assert (TREE_CODE (string) == STRING_CST);
   length = TREE_STRING_LENGTH (string) - 1;
 
   /* The target may have different ideas on how to construct an ObjC string




Re: [trans-mem] Explicitly go irrevocable even if transaction will always go irrevocable.

2011-10-29 Thread Torvald Riegel
On Fri, 2011-10-28 at 07:53 -0500, Aldy Hernandez wrote:
> > diff --git a/gcc/testsuite/gcc.dg/tm/memopt-1.c 
> > b/gcc/testsuite/gcc.dg/tm/memopt-1.c
> > index 06d4f64..9a48dcb 100644
> > --- a/gcc/testsuite/gcc.dg/tm/memopt-1.c
> > +++ b/gcc/testsuite/gcc.dg/tm/memopt-1.c
> > @@ -2,8 +2,8 @@
> >  /* { dg-options "-fgnu-tm -O -fdump-tree-tmmemopt" } */
> >
> >  long g, xxx, yyy;
> > -extern george() __attribute__((transaction_callable));
> > -extern ringo(long int);
> > +extern george() __attribute__((transaction_safe));
> > +extern ringo(long int) __attribute__((transaction_safe));
> >  int i;
> 
> The patch looks fine, but...

Looking closer at this, we were pretending to have an uninstrumented code
path, so not explicitly requesting irrevocable mode was okay. However, we
still had calls to the TM library in that code, which is not really
what we want (mostly for performance reasons; it is supposed to still
work because the runtime has to change to a suitable dispatch internally
after going irrevocable).
I'll prepare a different patch, after looking at Richard's recent
changes.

> Was the original test wrong, or are you testing something new?

Yes, sort of. It was testing for optimizations that were correctly
performed but which should not have been applicable in this particular
test case _and_ with the current state of the code. However, after the
fix for when to request irrevocable mode that I have in mind, this test
should work as is.

Torvald



Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-29 Thread Toon Moene

On 10/26/2011 11:56 PM, Jakub Jelinek wrote:

Hi!

This patch implements gather vectorization with -mavx2, if
dr_may_alias (which apparently doesn't use tbaa :(( ) can figure out
there is no overlap with stores in the loop (if any).
The testcases show what is possible to get vectorized.
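
For readers without the testcases at hand, the kind of loop in question looks roughly like the sketch below. It is an illustrative example only, not one of the avx2-gather-*.c tests, and `gather_scale` is a made-up name. The `restrict` qualifiers are there to state that the loads cannot overlap the stores. Whether a given compiler version actually vectorizes it depends on the flags and on the alias analysis described above.

```c
#include <assert.h>
#include <stddef.h>

/* out[i] = table[idx[i]] * scale.  The load address depends on idx[i],
   so consecutive iterations read non-contiguous elements of TABLE.  */
void
gather_scale (float *restrict out, const float *restrict table,
              const int *restrict idx, size_t n, float scale)
{
  assert (out != NULL && table != NULL && idx != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = table[idx[i]] * scale;
}
```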

I chose to add 4 extra (internal only) gather builtins in addition to the
16 ones needed for the intrinsics, because the builtins using different
sizes of the index vs. src/mask/ret vectors would complicate the generic
code way too much (we don't have a VEC_SELECT_EXPR nor VEC_CONCAT_EXPR
and interleaving/extract even/odd is undesirable here).
With these 4 extra builtins the generic code always sees same sized
src/mask/ret vs. index vectors, either they have same number of units,
then just one vgather* insn is needed, or the index has more elements
(int index and double/long long load) - then for one loaded index vector
there is one vgather* insn using the first half of the index vector and
one using the second half of that vector, or long index with float/int
load, then two index vectors are processed by two vgather* insns and
the result gets concatenated first halves of both results.

All this is so far unconditional only, we'd need some tree representation
of conditional loads resp. conditional stores (and could already with AVX
use vmaskmov* insns for that).

Bootstrapped/regtested on x86_64-linux and i686-linux, testcases tested
also under sde.  Ok for trunk?

2011-10-26  Jakub Jelinek

PR tree-optimization/50789
* tree-vect-stmts.c (process_use): Add force argument, avoid
exist_non_indexing_operands_for_use_p check if true.
(vect_mark_stmts_to_be_vectorized): Adjust callers.  Handle
STMT_VINFO_GATHER_P.
(gen_perm_mask): New function.
(perm_mask_for_reverse): Use it.
(reverse_vec_element): Rename to...
(permute_vec_elements): ... this.  Add Y and MASK_VEC arguments,
generalize for any permutations.
(vectorizable_load): Adjust caller.  Handle STMT_VINFO_GATHER_P.
* target.def (TARGET_VECTORIZE_BUILTIN_GATHER): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_GATHER): Document it.
* doc/tm.texi: Regenerate.
* tree-data-ref.c (initialize_data_dependence_relation,
compute_self_dependence): No longer static.
* tree-data-ref.h (initialize_data_dependence_relation,
compute_self_dependence): New prototypes.
* tree-vect-data-refs.c (vect_check_gather): New function.
(vect_analyze_data_refs): Detect possible gather load data
refs.
* tree-vectorizer.h (struct _stmt_vec_info): Add gather_p field.
(STMT_VINFO_GATHER_P): Define.
(vect_check_gather): New prototype.
* config/i386/i386-builtin-types.def: Add types for alternate
gather builtins.
* config/i386/sse.md (AVXMODE48P_DI): Remove.
(VEC_GATHER_MODE): Rename mode_attr to...
(VEC_GATHER_IDXSI): ... this.
(VEC_GATHER_IDXDI, VEC_GATHER_SRCDI): New mode_attrs.
(avx2_gathersi, *avx2_gathersi): Use
instead of.
(avx2_gatherdi): Use  instead of
<  and  instead of VEC_GATHER_MODE
on src and mask operands.
(*avx2_gatherdi): Likewise.  Use VEC_GATHER_MODE iterator
instead of AVXMODE48P_DI.
(avx2_gatherdi256, *avx2_gatherdi256): Removed.
* config/i386/i386.c (enum ix86_builtins): Add
IX86_BUILTIN_GATHERALTSIV4DF, IX86_BUILTIN_GATHERALTDIV8SF,
IX86_BUILTIN_GATHERALTSIV4DI and IX86_BUILTIN_GATHERALTDIV8SI.
(ix86_init_mmx_sse_builtins): Create those builtins.
(ix86_expand_builtin): Handle those builtins and adjust expansions
of other gather builtins.
(ix86_vectorize_builtin_gather): New function.
(TARGET_VECTORIZE_BUILTIN_GATHER): Define.

* gcc.target/i386/avx2-gather-1.c: New test.
* gcc.target/i386/avx2-gather-2.c: New test.
* gcc.target/i386/avx2-gather-3.c: New test.

--- gcc/tree-vect-stmts.c.jj2011-10-26 14:19:11.0 +0200
+++ gcc/tree-vect-stmts.c   2011-10-26 16:54:23.0 +0200
@@ -332,6 +332,8 @@ exist_non_indexing_operands_for_use_p (t
 - LIVE_P, RELEVANT - enum values to be set in the STMT_VINFO of the stmt
   that defined USE.  This is done by calling mark_relevant and passing it
   the WORKLIST (to add DEF_STMT to the WORKLIST in case it is relevant).
+   - FORCE is true if exist_non_indexing_operands_for_use_p check shouldn't
+ be performed.

 Outputs:
 Generally, LIVE_P and RELEVANT are used to define the liveness and
@@ -351,7 +353,8 @@ exist_non_indexing_operands_for_use_p (t

  static bool
  process_use (gimple stmt, tree use, loop_vec_info loop_vinfo, bool live_p,
-enum vect_relevant relevant, VEC(gimple,heap) **worklist)
+enum vect_relevant relevant, VEC(gimple,heap) **worklist,
+bool force)
  {
stru

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-29 Thread Toon Moene

On 10/26/2011 11:56 PM, Jakub Jelinek wrote:


Hi!

This patch implements gather vectorization with -mavx2, if
dr_may_alias (which apparently doesn't use tbaa :(( ) can figure out
there is no overlap with stores in the loop (if any).
The testcases show what is possible to get vectorized.


Hmmm,

I wonder whether it will work with the attached Fortran routine - it 
sure would mean a boost to the 18%+ heaviest CPU user in our code.
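
To make the access pattern easier to see, here is a C sketch of what the linear branch (KINT = 1) of the attached routine does per grid point: the cell to read comes from index arrays, and the eight corner values are combined with per-point weights. The names (`struct field`, `field_at`, `interp_linear`, `interp_points`), the 0-based flattened layout and the flat weight arrays are assumptions of this sketch, not part of the Fortran source.

```c
#include <assert.h>
#include <stddef.h>

/* Flattened 3-D field, x index fastest (Fortran order), 0-based.  */
struct field
{
  const double *data;
  size_t nx, ny, nz;
};

static double
field_at (const struct field *f, size_t i, size_t j, size_t k)
{
  assert (i < f->nx && j < f->ny && k < f->nz);
  return f->data[i + f->nx * (j + f->ny * k)];
}

/* One output point: weighted sum over the 8 corners of the cell whose upper
   corner is (i, j, k).  a, b, g are the two weights in x, y and vertical.  */
double
interp_linear (const struct field *f, size_t i, size_t j, size_t k,
               const double a[2], const double b[2], const double g[2])
{
  assert (i >= 1 && j >= 1 && k >= 1);
  double sum = 0.0;
  for (size_t dk = 0; dk < 2; dk++)
    {
      double plane = 0.0;
      for (size_t dj = 0; dj < 2; dj++)
        {
          double row = a[0] * field_at (f, i - 1, j - 1 + dj, k - 1 + dk)
                       + a[1] * field_at (f, i, j - 1 + dj, k - 1 + dk);
          plane += b[dj] * row;
        }
      sum += g[dk] * plane;
    }
  return sum;
}

/* Gather-style driver: point p reads the cell named by kp[p], kq[p], kr[p];
   its weights are stored as pairs at a[2*p], b[2*p], g[2*p].  */
void
interp_points (double *out, size_t n, const struct field *f,
               const size_t *kp, const size_t *kq, const size_t *kr,
               const double *a, const double *b, const double *g)
{
  for (size_t p = 0; p < n; p++)
    out[p] = interp_linear (f, kp[p], kq[p], kr[p],
                            a + 2 * p, b + 2 * p, g + 2 * p);
}
```

The loads from the field are the indirect ones; the weight and index arrays themselves are read contiguously.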


What follows is the single-CPU breakdown of the most demanding routines in
our weather forecasting code (from my 2006 GCC Summit "contribution",
which wasn't approved):


Flat profile:
% time  calls name
 18.34  85684 verint_ <-- That's the one attached
  9.34   1380 invlo4_
  7.84  85684 bixint_
  6.76    133 sl2tim_
  5.30  14950 condcv_
  4.74  14950 radia_
  4.65  14950 vcbr_
  3.25    133 sldyn_
  2.98  14950 phtask_
  2.42    133 sldynm_
  2.29  14950 phys_
  2.19  14950 prevap_

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290  | 4 more
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands   | 4 44
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
# 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F"
# 1 ""
# 1 ""
# 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F"
c Library:grdy $RCSfile$, $Revision: 7536 $
c checked in by $Author: ovignes $ at $Date: 2009-12-18 14:23:36 +0100 (Fri, 18 Dec 2009) $
c $State$, $Locker$
c $Log$
c Revision 1.3  1999/04/22 09:30:45  DagBjoerge
c MPP code
c
c Revision 1.2  1999/03/09 10:23:13  GerardCats
c Add SGI paralllellisation directives DOACROSS
c
c Revision 1.1  1996/09/06 13:12:18  GCats
c Created from grdy.apl, 1 version 2.6.1, by Gerard Cats
c
  SUBROUTINE VERINT (
 I   KLON   , KLAT   , KLEV   , KINT  , KHALO
 I , KLON1  , KLON2  , KLAT1  , KLAT2
 I , KP , KQ , KR
 R , PARG   , PRES
 R , PALFH  , PBETH
 R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON  NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT  NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV  NUMBER OF VERTICAL LEVELS
C  KINT  TYPE OF INTERPOLATION
C        = 1 - LINEAR
C        = 2 - QUADRATIC
C        = 3 - CUBIC
C        = 4 - MIXED CUBIC/LINEAR
C  KLON1 FIRST GRIDPOINT IN X-DIRECTION
C  KLON2 LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1 FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2 LAST  GRIDPOINT IN Y-DIRECTION
C  KP    ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ    ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR    ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG  ARRAY OF ARGUMENTS
C  PALFH ALFA HAT
C  PBETH BETA HAT
C  PALFA ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES  INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
  IMPLICIT NONE
C
  INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
 I       KLON1  , KLON2  , KLAT1  , KLAT2
C
  INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
  REAL    PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
 R        PRES(KLON,KLAT) ,
 R   PALFH(KLON,KLAT) ,  PBETH(KLON,KLAT)  ,
 R   PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4),
 R   PGAMA(KLON,KLAT,4)
C
  INTEGER JX, JY, IDX, IDY, ILEV
  REAL Z1MAH, Z1MBH
C
  IF (KINT.EQ.1) THEN
C  LINEAR INTERPOLATION
C
  DO JY = KLAT1,KLAT2
  DO JX = KLON1,KLON2
 IDX  = KP(JX,JY)
 IDY  = KQ(JX,JY)
 ILEV = KR(JX,JY)
C
 PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
 +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
 +  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
 + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
 +  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C+
 +   + PGAMA(JX,JY,2)*(
C+
 +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
 +  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
 + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
 +  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
  ENDDO
  ENDDO
C
  ELSE
 +IF (KINT.EQ.2) THEN
C  QUADRATIC INTERPOLATION
C
  DO JY = KLAT1,KLAT2
  DO JX = KLON1,KLON2
 IDX  = KP(JX,JY)
 IDY  = KQ(JX,JY)
 ILEV = KR(JX,JY)
C
 PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
 +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
 +  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1)
 +  + PALFA(JX,JY,3)*PARG(IDX+1,IDY

Re: [Patch, fortran] [00/66] PR fortran/43829 Inline sum and product (AKA scalarization of reductions)

2011-10-29 Thread Jack Howarth
On Fri, Oct 28, 2011 at 06:30:35PM +0200, Mikael Morin wrote:
> On Friday 28 October 2011 15:56:36 Jack Howarth wrote:
> > Mikael,
> > The complete patch bootstraps current FSF gcc trunk on
> > x86_64-apple-darwin11 and the resulting gfortran compiler can compile the
> > Polyhedron 2005 benchmarks using...
> > 
> > Compile Command : gfortran-fsf-4.7 -O3 -ffast-math -funroll-loops -flto
> > -fwhole-program %n.f90 -o %n
> > 
> > without runtime regressions. However I don't seem to see any particular
> > performance improvements with your patches applied. In fact, a few
> > benchmarks including nf and test_fpu seem to show slower runtimes
> > (~8-11%). Have you done any benchmarking with and without the proposed
> > patches? Jack
> 
> Not myself, but the previous versions of the patch have been reported to give
> noticeable improvement on "tonto" here:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43829#c26
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43829#c35
> 
> Since those versions, the array constructor handling has been improved, and a
> few mostly cosmetic changes have been applied, so I expect the posted patch
> to be on par with the previous ones, possibly slightly better.
> 
> Now regarding your regressions, it is quite a lot worse, and quite unexpected.
> I have just looked at test_fpu.f90 and nf.f90 from a polyhedron source I have 
> found at http://www.polyhedron.com/web_images/documents/pb05.zip. 
> There is no call to product in them, and both use only single-argument sum 
> calls, which are not (or shouldn't be) impacted by my patch (scalar cases). 
> Indeed, if I compare the code produced using -fdump-tree-original, there is 
> zero difference in nf.f90, and in test_fpu.f90 only slight variations which 
> are very very unlikely to cause the regression you see (see attached diff).
> 
> Could you double check your figures, and/or that the regressions are really 
> caused by my patch?

Mikael,
   The problem was the quick.par testing with the patch applied. Full
standard.par testing suggests that identical binaries are produced for
pb05 (by size anyway)...

Using built-in specs.
COLLECT_GCC=gcc-fsf-4.7
COLLECT_LTO_WRAPPER=/sw/lib/gcc4.7/libexec/gcc/x86_64-apple-darwin11.2.0/4.7.0/lto-wrapper
Target: x86_64-apple-darwin11.2.0
Configured with: ../gcc-4.7-20111028/configure --prefix=/sw 
--prefix=/sw/lib/gcc4.7 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.7/info 
--with-build-config=bootstrap-lto --enable-stage1-languages=c,lto 
--enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw 
--with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw 
--with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib 
--program-suffix=-fsf-4.7 --enable-checking=yes --enable-cloog-backend=isl
Thread model: posix
gcc version 4.7.0 20111028 (experimental) (GCC) 

prepatch at r180613

Date & Time : 28 Oct 2011 13:47:42
Test Name   : gfortran_lin_O3_wholeprogram
Compile Command : gfortran-fsf-4.7 -O3 -ffast-math -funroll-loops -flto 
-fwhole-program %n.f90 -o %n
Benchmarks  : ac aermod air capacita channel doduc fatigue gas_dyn induct 
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   : 2000.0
Target Error %  :  0.100
Minimum Repeats :10
Maximum Repeats :   100

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      6.75       55000      8.16      10  0.0522
      aermod    119.95     1237720     16.83      13  0.0956
         air     18.38      106960      5.77      33  0.0949
    capacita      6.48       77240     32.61      17  0.0903
     channel      2.21       34904      2.05      19  0.0493
       doduc     20.19      196496     25.98      17  0.0978
     fatigue      7.20       81616      5.98      16  0.0998
     gas_dyn     13.58      119824      4.11      44  0.0854
      induct     12.90      145096     12.86      13  0.0936
       linpk      1.90       26104     15.51      22  0.0667
        mdbx      6.52       81104     11.32      23  0.0995
          nf      6.66       71872     27.17      38  0.0891
     protein     21.47      127264     31.24      15  0.0726
      rnflow     19.51      131056     24.42      19  0.0776
    test_fpu     12.09       97272      7.89      22  0.0399
        tfft      1.63       22464      1.87      21  0.0169

Geometric Mean Execution Time =  10.54 seconds
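
For anyone wanting to recompute that summary line, here is a small C sketch. It is not how the benchmark harness computes it (that is not shown here); it simply takes the n-th root of the product of the "Ave Run" column with a Newton iteration, which avoids needing libm. The names `geometric_mean`, `ipow` and `pb05_prepatch_runtimes` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* "Ave Run" column of the prepatch table above, in seconds.  */
const double pb05_prepatch_runtimes[16] = {
  8.16, 16.83, 5.77, 32.61, 2.05, 25.98, 5.98, 4.11,
  12.86, 15.51, 11.32, 27.17, 31.24, 24.42, 7.89, 1.87
};

/* x raised to a small non-negative integer power.  */
static double
ipow (double x, size_t n)
{
  double r = 1.0;
  for (size_t i = 0; i < n; i++)
    r *= x;
  return r;
}

/* Geometric mean of n positive values: the n-th root of their product.
   Fine for a short table; a long one could overflow the product.  */
double
geometric_mean (const double *v, size_t n)
{
  assert (v != NULL && n > 0);
  double product = 1.0, sum = 0.0;
  for (size_t i = 0; i < n; i++)
    {
      assert (v[i] > 0.0);
      product *= v[i];
      sum += v[i];
    }
  /* Newton iteration for x^n = product.  Start at the arithmetic mean,
     which is never below the geometric mean, so iterates only decrease.  */
  double x = sum / (double) n;
  for (int iter = 0; iter < 200; iter++)
    {
      double next = ((double) (n - 1) * x + product / ipow (x, n - 1))
                    / (double) n;
      int converged = (x - next) < 1e-12 * x;
      x = next;
      if (converged)
        break;
    }
  return x;
}
```

Calling geometric_mean (pb05_prepatch_runtimes, 16) should agree with the reported figure to the two decimals shown.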

postpatch at r180613

Date & Time : 28 Oct 2011 16:42:27
Test Name   : gfortran_lin_O3_wholeprogram
Compile Command : gfortran-fsf-4.7 -O3 -ffast-math -funroll-loops -flto 
-fwhole-program %n.f90 -o %n
Benchmarks  : ac aermod air capacita channel doduc fatigue gas_dyn induct 
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   : 2000.0
Target Error %  :  0.100
Minimum Repeats :10
Maximum Repeats :   100

   Benchmark   Compile  Executable   Ave Ru

Re: [cxx-mem-model][PATCH 0/9] Convert i386 to new atomic optabs.

2011-10-29 Thread Andrew MacLeod

On 10/28/2011 12:07 AM, Richard Henderson wrote:

> This exposed a wealth of problems in code that has heretofore never
> been tested.  The fourth patch makes certain that all expansions of
> compare-and-swap go through a single routine.
>
> I've tested the whole series with and without the last patch.  So
> that I've tested both the sync_ and atomic_ paths.  I've not attempted
> to test if both are present.  I rather assume that'll never be the
> case for any target.

Excellent.  The code is written to always check first for the new atomic 
patterns, and use them if present.  If it works with either one present, 
it should be fine with both, should it ever happen...   but we can leave 
that until some person actually decides that needs to happen :-)
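
A minimal sketch of the "new patterns first" selection described above. This is not the code from the patch series: the struct, the function-pointer type and every name below are hypothetical, and only the order of the checks is the point.

```c
#include <stddef.h>

/* Hypothetical expander signature; returns nonzero on success.  */
typedef int (*expander_fn) (int *mem, int expected, int desired);

/* Hypothetical per-target table: either entry may be NULL when the
   target does not provide that pattern.  */
struct target_patterns
{
  expander_fn atomic_cas;   /* new atomic_ pattern */
  expander_fn sync_cas;     /* legacy sync_ pattern */
};

/* Dummy expanders, present only so the sketch can be exercised.  */
static int demo_atomic_cas (int *mem, int expected, int desired)
{
  if (*mem != expected)
    return 0;
  *mem = desired;
  return 1;
}

static int demo_sync_cas (int *mem, int expected, int desired)
{
  return demo_atomic_cas (mem, expected, desired);
}

/* Prefer the new atomic pattern; fall back to the legacy sync one.
   Returns NULL when the target provides neither.  */
expander_fn select_cas_expander (const struct target_patterns *t)
{
  if (t->atomic_cas != NULL)
    return t->atomic_cas;
  return t->sync_cas;
}
```

With this ordering, a target that defines both patterns simply never reaches the legacy entry, which is why the "both present" case should need no extra handling.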


Thanks for exercising and fixing that untested code path.  Now other 
targets should be easier to convert too.


Andrew


Re: [Patch, libfortran, 3/3] Update file position lazily

2011-10-29 Thread Mikael Morin
On Saturday 29 October 2011 14:43:22 Mikael Morin wrote:
> > FWIW, it seems ifort 12.0 uses "UNDEFINED" in this case; I suppose a
> > case could be made for using the same. Comments?
> 
> Let's go for UNDEFINED then.
On second thought, UNSPECIFIED is better as UNDEFINED is for another case.


New Japanese PO file for 'gcc' (version 4.6.1)

2011-10-29 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Japanese team of translators.  The file is available at:

http://translationproject.org/latest/gcc/ja.po

(This file, 'gcc-4.6.1.ja.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[committed] Fix handling of TLS_MODEL_GLOBAL_DYNAMIC and TLS_MODEL_LOCAL_DYNAMIC symbol references on PA

2011-10-29 Thread John David Anglin
TLS_MODEL_GLOBAL_DYNAMIC and TLS_MODEL_LOCAL_DYNAMIC symbol references
are not legitimate constants because they may require a function call.
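
The rule can be sketched in isolation as below. This is a simplified stand-in, not pa_legitimate_constant_p itself: the enum and the function name are hypothetical, and the real predicate performs further checks that are omitted here.

```c
#include <stdbool.h>

/* Hypothetical, simplified copy of the TLS access models.  */
enum tls_model_sketch
{
  SKETCH_TLS_NONE,
  SKETCH_TLS_GLOBAL_DYNAMIC,
  SKETCH_TLS_LOCAL_DYNAMIC,
  SKETCH_TLS_INITIAL_EXEC,
  SKETCH_TLS_LOCAL_EXEC
};

/* The two dynamic models may need a function call to resolve the
   address, so a reference using them cannot be a plain constant.  */
bool tls_ref_is_legitimate_constant (enum tls_model_sketch model)
{
  if (model == SKETCH_TLS_GLOBAL_DYNAMIC
      || model == SKETCH_TLS_LOCAL_DYNAMIC)
    return false;
  return true;
}
```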

This change fixes a bug exposed by the mpfr-3.1.0 testsuite.

Tested on hppa-unknown-linux, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.
Committed to trunk.

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

2011-10-29  John David Anglin  

PR target/50691
config/pa/pa.c (emit_move_sequence): Legitimize TLS symbol references.
(pa_legitimate_constant_p): Return false for TLS_MODEL_GLOBAL_DYNAMIC
and TLS_MODEL_LOCAL_DYNAMIC symbol references.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 180156)
+++ config/pa/pa.c  (working copy)
@@ -1781,6 +1781,11 @@
   /* Handle the most common case: storing into a register.  */
   else if (register_operand (operand0, mode))
 {
+  /* Legitimize TLS symbol references.  This happens for references
+that aren't a legitimate constant.  */
+  if (PA_SYMBOL_REF_TLS_P (operand1))
+   operand1 = legitimize_tls_address (operand1);
+
   if (register_operand (operand1, mode)
  || (GET_CODE (operand1) == CONST_INT
  && cint_ok_for_move (INTVAL (operand1)))
@@ -10271,6 +10276,16 @@
   if (!NEW_HP_ASSEMBLER && !TARGET_GAS && GET_CODE (x) == LABEL_REF)
 return false;
 
+  /* TLS_MODEL_GLOBAL_DYNAMIC and TLS_MODEL_LOCAL_DYNAMIC are not
+ legitimate constants.  */
+  if (PA_SYMBOL_REF_TLS_P (x))
+   {
+ enum tls_model model = SYMBOL_REF_TLS_MODEL (x);
+
+ if (model == TLS_MODEL_GLOBAL_DYNAMIC || model == TLS_MODEL_LOCAL_DYNAMIC)
+   return false;
+   }
+
   if (TARGET_64BIT && GET_CODE (x) == CONST_DOUBLE)
 return false;
 


[PATCH, testsuite]: Use return 0 instead of exit(0) in gcc.target/i386/*-check.h

2011-10-29 Thread Uros Bizjak
Hello!

2011-10-29  Uros Bizjak  

* gcc.target/i386/fma-check.h (main): Use return 0 instead of exit (0).
* gcc.target/i386/fma4-check.h (main): Ditto.
* gcc.target/i386/xop-check.h (main): Ditto.

Committed as trivial change to mainline SVN.

Uros.
Index: fma4-check.h
===
--- fma4-check.h(revision 180650)
+++ fma4-check.h(working copy)
@@ -23,5 +23,5 @@ main ()
   if (ecx & bit_FMA4)
 do_test ();
 
-  exit (0);
+  return 0;
 }
Index: xop-check.h
===
--- xop-check.h (revision 180650)
+++ xop-check.h (working copy)
@@ -24,5 +24,5 @@ main ()
   if (ecx & bit_XOP)
 do_test ();
 
-  exit (0);
+  return 0;
 }
Index: fma-check.h
===
--- fma-check.h (revision 180650)
+++ fma-check.h (working copy)
@@ -21,5 +21,5 @@ main ()
   if (ecx & bit_FMA)
 do_test ();
 
-  exit (0);
+  return 0;
 }


Re: PING: [C++-11 PATCH] Trailing comma in enum

2011-10-29 Thread Ville Voutilainen
>Could someone please review this?

+ if (cxx_dialect < cxx0x && !in_system_header)
+   pedwarn (input_location, OPT_pedantic,
+ "comma at end of enumerator list");

Why not use maybe_warn_cpp0x there?


[PATCH, i386]: Remove lshlv16qi3 and add lshrv16qi3 XOP expander

2011-10-29 Thread Uros Bizjak
Hello!

lshlv16qi3 is not a generic expander name, and ashlv16qi3 already
covers this case. The attached patch adds lshrv16qi3 to generate the
logical shift-right XOP instruction.

2011-10-29  Uros Bizjak  

* config/i386/i386.md (lshlv16qi3): Remove expander.
(lshrv16qi3): New expander.
(v16qi3): Macroize expander from ashrv16qi3 and lshrv16qi3
using any_shiftrt code iterator. Cleanup.
(ashlv16qi3): Cleanup.
(ashrv2di3): Ditto.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} and committed to SVN mainline.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 180650)
+++ config/i386/sse.md  (working copy)
@@ -11382,7 +11385,7 @@
(set_attr "prefix_extra" "2")
(set_attr "mode" "TI")])
 
-;; SSE2 doesn't have some shift varients, so define versions for XOP
+;; SSE2 doesn't have some shift variants, so define versions for XOP
 (define_expand "ashlv16qi3"
   [(set (match_operand:V16QI 0 "register_operand" "")
(ashift:V16QI
@@ -11390,65 +11393,52 @@
  (match_operand:SI 2 "nonmemory_operand" "")))]
   "TARGET_XOP"
 {
-  rtvec vs = rtvec_alloc (16);
-  rtx par = gen_rtx_PARALLEL (V16QImode, vs);
   rtx reg = gen_reg_rtx (V16QImode);
+  rtx par;
   int i;
+
+  par = gen_rtx_PARALLEL (V16QImode, rtvec_alloc (16));
   for (i = 0; i < 16; i++)
-RTVEC_ELT (vs, i) = operands[2];
+XVECEXP (par, 0, i) = operands[2];
 
   emit_insn (gen_vec_initv16qi (reg, par));
   emit_insn (gen_xop_ashlv16qi3 (operands[0], operands[1], reg));
   DONE;
 })
 
-(define_expand "lshlv16qi3"
-  [(match_operand:V16QI 0 "register_operand" "")
-   (match_operand:V16QI 1 "register_operand" "")
-   (match_operand:SI 2 "nonmemory_operand" "")]
-  "TARGET_XOP"
-{
-  rtvec vs = rtvec_alloc (16);
-  rtx par = gen_rtx_PARALLEL (V16QImode, vs);
-  rtx reg = gen_reg_rtx (V16QImode);
-  int i;
-  for (i = 0; i < 16; i++)
-RTVEC_ELT (vs, i) = operands[2];
-
-  emit_insn (gen_vec_initv16qi (reg, par));
-  emit_insn (gen_xop_lshlv16qi3 (operands[0], operands[1], reg));
-  DONE;
-})
-
-(define_expand "ashrv16qi3"
+(define_expand "v16qi3"
   [(set (match_operand:V16QI 0 "register_operand" "")
-   (ashiftrt:V16QI
+   (any_shiftrt:V16QI
  (match_operand:V16QI 1 "register_operand" "")
  (match_operand:SI 2 "nonmemory_operand" "")))]
   "TARGET_XOP"
 {
-  rtvec vs = rtvec_alloc (16);
-  rtx par = gen_rtx_PARALLEL (V16QImode, vs);
   rtx reg = gen_reg_rtx (V16QImode);
+  rtx par;
+  bool negate = false;
+  rtx (*shift_insn)(rtx, rtx, rtx);
   int i;
-  rtx ele = ((CONST_INT_P (operands[2]))
-? GEN_INT (- INTVAL (operands[2]))
-: operands[2]);
 
+  if (CONST_INT_P (operands[2]))
+operands[2] = GEN_INT (-INTVAL (operands[2]));
+  else
+negate = true;
+
+  par = gen_rtx_PARALLEL (V16QImode, rtvec_alloc (16));
   for (i = 0; i < 16; i++)
-RTVEC_ELT (vs, i) = ele;
+XVECEXP (par, 0, i) = operands[2];
 
   emit_insn (gen_vec_initv16qi (reg, par));
 
-  if (!CONST_INT_P (operands[2]))
-{
-  rtx neg = gen_reg_rtx (V16QImode);
-  emit_insn (gen_negv16qi2 (neg, reg));
-  emit_insn (gen_xop_ashlv16qi3 (operands[0], operands[1], neg));
-}
+  if (negate)
+emit_insn (gen_negv16qi2 (reg, reg));
+
+  if ( == LSHIFTRT)
+shift_insn = gen_xop_lshlv16qi3;
   else
-emit_insn (gen_xop_ashlv16qi3 (operands[0], operands[1], reg));
+shift_insn = gen_xop_ashlv16qi3;
 
+  emit_insn (shift_insn (operands[0], operands[1], reg));
   DONE;
 })
 
@@ -11459,29 +11449,25 @@
  (match_operand:DI 2 "nonmemory_operand" "")))]
   "TARGET_XOP"
 {
-  rtvec vs = rtvec_alloc (2);
-  rtx par = gen_rtx_PARALLEL (V2DImode, vs);
   rtx reg = gen_reg_rtx (V2DImode);
-  rtx ele;
+  rtx par;
+  bool negate = false;
+  int i;
 
   if (CONST_INT_P (operands[2]))
-ele = GEN_INT (- INTVAL (operands[2]));
-  else if (GET_MODE (operands[2]) != DImode)
-{
-  rtx move = gen_reg_rtx (DImode);
-  ele = gen_reg_rtx (DImode);
-  convert_move (move, operands[2], false);
-  emit_insn (gen_negdi2 (ele, move));
-}
+operands[2] = GEN_INT (-INTVAL (operands[2]));
   else
-{
-  ele = gen_reg_rtx (DImode);
-  emit_insn (gen_negdi2 (ele, operands[2]));
-}
+negate = true;
 
-  RTVEC_ELT (vs, 0) = ele;
-  RTVEC_ELT (vs, 1) = ele;
+  par = gen_rtx_PARALLEL (V2DImode, rtvec_alloc (2));
+  for (i = 0; i < 2; i++)
+XVECEXP (par, 0, i) = operands[2];
+
   emit_insn (gen_vec_initv2di (reg, par));
+
+  if (negate)
+emit_insn (gen_negv2di2 (reg, reg));
+
   emit_insn (gen_xop_ashlv2di3 (operands[0], operands[1], reg));
   DONE;
 })


[pph] Fix remaining cgraph ICEs (issue5325050)

2011-10-29 Thread Diego Novillo

When emitting the symbols and cgraph nodes in the symbol table, we were
using the same pointer set to decide whether to emit decls and cgraph
nodes.

So, if a function decl F was sent to rest_of_decl_compilation, we would
later refuse to call cgraph_finalize_function on its node because F had
already been emitted.

Fixed by separating the already-emitted test for decls and cgraph nodes.
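
A small stand-alone sketch of the idea, assuming a toy fixed-size set in place of GCC's pointer_set_t. All names here are hypothetical; only the use of two independent sets mirrors the fix.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEEN 64

/* Toy pointer set; silently stops recording once it is full, which
   is good enough for a sketch.  */
struct seen_set
{
  const void *items[MAX_SEEN];
  size_t count;
};

/* Record P and report whether it had been recorded before.  */
static bool seen_insert (struct seen_set *s, const void *p)
{
  for (size_t i = 0; i < s->count; i++)
    if (s->items[i] == p)
      return true;
  if (s->count < MAX_SEEN)
    s->items[s->count++] = p;
  return false;
}

/* One set per kind of entity, so emitting a decl does not mark the
   matching cgraph node as already handled.  */
static struct seen_set emitted_decls;
static struct seen_set emitted_nodes;

bool decl_already_emitted (const void *decl)
{
  return seen_insert (&emitted_decls, decl);
}

bool node_already_emitted (const void *node)
{
  return seen_insert (&emitted_nodes, node);
}
```

With a single shared set keyed on the decl, the second query for the same function would return true and the node would be skipped; with two sets the first query of each kind returns false.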

Tested on x86_64.  Committed.


Diego.

cp/ChangeLog.pph

* pph-streamer-in.c (pph_node_already_emitted): New.
(pph_in_symtab): Call it.

testsuite/ChangeLog.pph

* g++.dg/pph/x1keyed.cc: Mark fixed.
* g++.dg/pph/x1keyno.cc: Likewise.
* g++.dg/pph/x6rtti.cc: Remove ICE failure.  Document operator match
problem.
* g++.dg/pph/x7rtti.cc: Likewise.
* g++.dg/pph/x1tmplclass2.cc: Document asm diff to out-of-order diff.
* g++.dg/pph/z4tmplclass2.cc: Likewise.
* g++.dg/pph/x4keyex.cc: Likewise.
* g++.dg/pph/x4keyed.cc: Change failure to typeinfo redefinition.
* g++.dg/pph/x4keyno.cc: Likewise.
---
 gcc/cp/ChangeLog.pph |5 +
 gcc/cp/pph-streamer-in.c |   17 -
 gcc/testsuite/ChangeLog.pph  |   13 +
 gcc/testsuite/g++.dg/pph/x1keyed.cc  |3 ---
 gcc/testsuite/g++.dg/pph/x1keyno.cc  |3 ---
 gcc/testsuite/g++.dg/pph/x1tmplclass2.cc |9 ++---
 gcc/testsuite/g++.dg/pph/x4keyed.cc  |5 ++---
 gcc/testsuite/g++.dg/pph/x4keyex.cc  |6 +-
 gcc/testsuite/g++.dg/pph/x4keyno.cc  |8 ++--
 gcc/testsuite/g++.dg/pph/x6rtti.cc   |   10 +++---
 gcc/testsuite/g++.dg/pph/x7rtti.cc   |5 ++---
 gcc/testsuite/g++.dg/pph/z4tmplclass2.cc |2 +-
 12 files changed, 47 insertions(+), 39 deletions(-)

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index 0dd8e20..5c64739 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,8 @@
+2011-10-29   Diego Novillo  
+
+   * pph-streamer-in.c (pph_node_already_emitted): New.
+   (pph_in_symtab): Call it.
+
 2011-10-28   Lawrence Crowl  
 
* pph.c (pph_dump_tree_name): Remove dead code.  Dump tree_code also.
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 40f82c3..40d3fc2 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -2209,6 +2209,19 @@ pph_decl_already_emitted (tree decl)
 }
 
 
+/* Have we already emitted this cgraph NODE?  */
+
+static bool
+pph_node_already_emitted (struct cgraph_node *node)
+{
+  static struct pointer_set_t *emitted_nodes = NULL;
+  gcc_assert (node != NULL);
+  if (!emitted_nodes)
+emitted_nodes = pointer_set_create ();
+  return pointer_set_insert (emitted_nodes, node) != 0;
+}
+
+
 /* Read the symbol table from STREAM.  When this image is read into
another translation unit, we want to guarantee that the IL
instances taken from this image are instantiated in the same order
@@ -2241,6 +2254,7 @@ pph_in_symtab (pph_stream *stream)
  at_end = bp_unpack_value (&bp, 1);
   if (pph_decl_already_emitted (decl))
 continue;
+
  cp_rest_of_decl_compilation (decl, top_level, at_end);
}
   else if (action == PPH_SYMTAB_EXPAND)
@@ -2251,8 +2265,9 @@ pph_in_symtab (pph_stream *stream)
  node = pph_in_cgraph_node (stream);
  if (node && node->local.finalized)
{
- if (pph_decl_already_emitted (decl))
+ if (pph_node_already_emitted (node))
continue;
+
  /* Since the writer had finalized this cgraph node,
 we have to re-play its actions.  To do that, we need
 to clear the finalized and reachable bits in the
diff --git a/gcc/testsuite/ChangeLog.pph b/gcc/testsuite/ChangeLog.pph
index b8d2e27..8994e63 100644
--- a/gcc/testsuite/ChangeLog.pph
+++ b/gcc/testsuite/ChangeLog.pph
@@ -1,3 +1,16 @@
+2011-10-29   Diego Novillo  
+
+   * g++.dg/pph/x1keyed.cc: Mark fixed.
+   * g++.dg/pph/x1keyno.cc: Likewise.
+   * g++.dg/pph/x6rtti.cc: Remove ICE failure.  Document operator match
+   problem.
+   * g++.dg/pph/x7rtti.cc: Likewise.
+   * g++.dg/pph/x1tmplclass2.cc: Document asm diff to out-of-order diff.
+   * g++.dg/pph/z4tmplclass2.cc: Likewise.
+   * g++.dg/pph/x4keyex.cc: Likewise.
+   * g++.dg/pph/x4keyed.cc: Change failure to typeinfo redefinition.
+   * g++.dg/pph/x4keyno.cc: Likewise.
+
 2011-10-26   Diego Novillo  
 
* g++.dg/pph/c4inline.cc: Mark fixed.
diff --git a/gcc/testsuite/g++.dg/pph/x1keyed.cc 
b/gcc/testsuite/g++.dg/pph/x1keyed.cc
index 18eb4c8..6ef9b9a 100644
--- a/gcc/testsuite/g++.dg/pph/x1keyed.cc
+++ b/gcc/testsuite/g++.dg/pph/x1keyed.cc
@@ -1,6 +1,3 @@
-// { dg-xfail-if "ICE CGRAPH" { "*-*-*" } { "-fpph-map=pph.map" } }
-// { dg-bogus "x1keyed.cc:12:1: internal compiler error: in 
cgraph_analyze_functions, at cgraphunit.c:1193" "" { xfail *-*-* } 0

[PATCH, i386]: Rename xop_ashl -> xop_sha, xop_lshl -> xop_shl

2011-10-29 Thread Uros Bizjak
Hello!

These pattern names are misleading, implying that these are "logical
shift left" and "arithmetic shift left". They are not; they are "shift
logical" and "shift arithmetic". The attached (trivial) patch renames
these patterns to the insn mnemonic they generate.
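
A scalar model of one byte lane may help show why "shift left" is the wrong reading. It assumes the usual description of the XOP shifts (a non-negative count shifts left, a negative count shifts right by its magnitude), which is also why the right-shift expanders in sse.md negate the count. The function names are made up, and the result for counts outside [-7, 7] is an arbitrary choice of this sketch, not a statement about the hardware.

```c
#include <stdint.h>

/* "Shift logical": left for count >= 0, zero-filling right shift
   for count < 0.  */
uint8_t shl_model (uint8_t value, int8_t count)
{
  if (count >= 0)
    return count < 8 ? (uint8_t) (value << count) : 0;

  int n = -(int) count;
  return n < 8 ? (uint8_t) (value >> n) : 0;
}

/* "Shift arithmetic": left for count >= 0, sign-filling right shift
   for count < 0.  */
uint8_t sha_model (uint8_t value, int8_t count)
{
  if (count >= 0)
    return count < 8 ? (uint8_t) (value << count) : 0;

  int n = -(int) count;
  if (n > 7)
    n = 7;                      /* sketch-only clamp */

  uint8_t shifted = (uint8_t) (value >> n);
  if (value & 0x80)             /* replicate the sign bit by hand */
    shifted |= (uint8_t) (0xFF << (8 - n));
  return shifted;
}
```

Both models agree for non-negative counts and differ only in how a right shift fills the vacated bits.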

2011-10-29  Uros Bizjak  

* config/i386/i386.md (xop_sha3): Rename from xop_ashl3.
Update all uses.
(xop_shl3): Rename from xop_lshl3.  Update all uses.
* config/i386/i386.c: Update all uses.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 180657)
+++ config/i386/sse.md  (working copy)
@@ -11253,7 +11253,7 @@
 {
   rtx neg = gen_reg_rtx (mode);
   emit_insn (gen_neg2 (neg, operands[2]));
-  emit_insn (gen_xop_lshl3 (operands[0], operands[1], neg));
+  emit_insn (gen_xop_shl3 (operands[0], operands[1], neg));
   DONE;
 })
 
@@ -11268,7 +11268,7 @@
 {
   rtx neg = gen_reg_rtx (mode);
   emit_insn (gen_neg2 (neg, operands[2]));
-  emit_insn (gen_xop_lshl3 (operands[0], operands[1], neg));
+  emit_insn (gen_xop_shl3 (operands[0], operands[1], neg));
   DONE;
 }
 })
@@ -11289,7 +11289,7 @@
 {
   rtx neg = gen_reg_rtx (mode);
   emit_insn (gen_neg2 (neg, operands[2]));
-  emit_insn (gen_xop_ashl3 (operands[0], operands[1], neg));
+  emit_insn (gen_xop_sha3 (operands[0], operands[1], neg));
   DONE;
 })
 
@@ -11303,7 +11303,7 @@
 {
   rtx neg = gen_reg_rtx (V4SImode);
   emit_insn (gen_negv4si2 (neg, operands[2]));
-  emit_insn (gen_xop_ashlv4si3 (operands[0], operands[1], neg));
+  emit_insn (gen_xop_shav4si3 (operands[0], operands[1], neg));
   DONE;
 }
 })
@@ -11321,7 +11321,7 @@
  (match_operand:VI12_128 2 "nonimmediate_operand" "")))]
   "TARGET_XOP"
 {
-  emit_insn (gen_xop_ashl3 (operands[0], operands[1], operands[2]));
+  emit_insn (gen_xop_sha3 (operands[0], operands[1], operands[2]));
   DONE;
 })
 
@@ -11335,7 +11335,7 @@
   if (!TARGET_AVX2)
 {
   operands[2] = force_reg (mode, operands[2]);
-  emit_insn (gen_xop_ashl3 (operands[0], operands[1], operands[2]));
+  emit_insn (gen_xop_sha3 (operands[0], operands[1], operands[2]));
   DONE;
 }
 })
@@ -11347,7 +11347,7 @@
  (match_operand:VI48_256 2 "nonimmediate_operand" "")))]
   "TARGET_AVX2")
 
-(define_insn "xop_ashl3"
+(define_insn "xop_sha3"
   [(set (match_operand:VI_128 0 "register_operand" "=x,x")
(if_then_else:VI_128
 (ge:VI_128
@@ -11366,7 +11366,7 @@
(set_attr "prefix_extra" "2")
(set_attr "mode" "TI")])
 
-(define_insn "xop_lshl3"
+(define_insn "xop_shl3"
   [(set (match_operand:VI_128 0 "register_operand" "=x,x")
(if_then_else:VI_128
 (ge:VI_128
@@ -11402,7 +11402,7 @@
 XVECEXP (par, 0, i) = operands[2];
 
   emit_insn (gen_vec_initv16qi (reg, par));
-  emit_insn (gen_xop_ashlv16qi3 (operands[0], operands[1], reg));
+  emit_insn (gen_xop_shav16qi3 (operands[0], operands[1], reg));
   DONE;
 })
 
@@ -11434,9 +11434,9 @@
 emit_insn (gen_negv16qi2 (reg, reg));
 
   if ( == LSHIFTRT)
-shift_insn = gen_xop_lshlv16qi3;
+shift_insn = gen_xop_shlv16qi3;
   else
-shift_insn = gen_xop_ashlv16qi3;
+shift_insn = gen_xop_shav16qi3;
 
   emit_insn (shift_insn (operands[0], operands[1], reg));
   DONE;
@@ -11468,7 +11468,7 @@
   if (negate)
 emit_insn (gen_negv2di2 (reg, reg));
 
-  emit_insn (gen_xop_ashlv2di3 (operands[0], operands[1], reg));
+  emit_insn (gen_xop_shav2di3 (operands[0], operands[1], reg));
   DONE;
 })
 
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 180650)
+++ config/i386/i386.c  (working copy)
@@ -26538,14 +26538,14 @@ static const struct builtin_description bdesc_mult
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_rotlv4si3, 
"__builtin_ia32_vprotdi", IX86_BUILTIN_VPROTD_IMM,  UNKNOWN,  
(int)MULTI_ARG_2_SI_IMM },
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_rotlv8hi3, 
"__builtin_ia32_vprotwi", IX86_BUILTIN_VPROTW_IMM,  UNKNOWN,  
(int)MULTI_ARG_2_HI_IMM },
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_rotlv16qi3,
"__builtin_ia32_vprotbi", IX86_BUILTIN_VPROTB_IMM,  UNKNOWN,  
(int)MULTI_ARG_2_QI_IMM },
-  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_ashlv2di3, 
"__builtin_ia32_vpshaq",  IX86_BUILTIN_VPSHAQ,  UNKNOWN,  
(int)MULTI_ARG_2_DI },
-  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_ashlv4si3, 
"__builtin_ia32_vpshad",  IX86_BUILTIN_VPSHAD,  UNKNOWN,  
(int)MULTI_ARG_2_SI },
-  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_ashlv8hi3, 
"__builtin_ia32_vpshaw",  IX86_BUILTIN_VPSHAW,  UNKNOWN,  
(int)MULTI_ARG_2_HI },
-  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_ashlv16qi3,
"__builtin_ia32_vpshab",  IX86_BUILTIN_VPSHAB,  UNKNOWN,  
(int)MULTI_ARG_2_QI },
-  { OPTION_MASK_ISA_XO

[wwwdocs] Streamline GCC 4.6 release notes

2011-10-29 Thread Gerald Pfeifer
...a bit, by disabling the headers for empty sections.

Committed.

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/changes.html,v
retrieving revision 1.134
diff -u -r1.134 changes.html
--- changes.html13 Oct 2011 13:18:20 -  1.134
+++ changes.html29 Oct 2011 19:01:57 -
@@ -609,7 +609,9 @@
   support is in progress.  It may or may not work on other
   platforms.
 
+
 
 Objective-C and Objective-C++
 
@@ -1064,7 +1066,9 @@
   MinGW and Cygwin.
   
 
+
 
 Other significant improvements
 


[wwwdocs] Use for table header in java/done.html

2011-10-29 Thread Gerald Pfeifer
Applied.

Gerald

2011-10-29  Gerald Pfeifer  

* done.html: Use  for the header.

Index: done.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/java/done.html,v
retrieving revision 1.49
diff -u -r1.49 done.html
--- done.html   12 Sep 2010 03:18:23 -  1.49
+++ done.html   29 Oct 2011 19:11:43 -
@@ -18,9 +18,11 @@
 
   
+
   
-ProjectDescriptionContact
+ProjectDescriptionContact
   
+
   
   
 http://irate.sourceforge.net/";>iRATE radio


[PR50764, PATCH] Fix for ICE in maybe_record_trace_start with -fsched2-use-superblocks

2011-10-29 Thread Tom de Vries
Richard,

I have a tentative fix for PR50764.

In the example from the test-case, -fsched2-use-superblocks moves an insn from
block 4 to block 3.

   2
  bar
   |
---+-
   / \
  *   *
  5 * 3
abortbar
  |
  |
  *
  4
return


The insn has a REG_CFA_DEF_CFA note and is frame-related.
...
(insn/f 51 50 52 4 (set (reg:DI 39 r10)
(mem/c:DI (plus:DI (reg/f:DI 6 bp)
(const_int -8 [0xfff8])) [3 S8 A8])) pr50764.c:13
62 {*movdi_internal_rex64}
 (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
(nil)))
...

This causes the assert in maybe_record_trace_start to trigger:
...
  /* We ought to have the same state incoming to a given trace no
 matter how we arrive at the trace.  Anything else means we've
 got some kind of optimization error.  */
  gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
...

The assert does not occur with -fno-tree-tail-merge, but that is due to the
following:
- -fsched2-use-superblocks does not handle dead labels explicitly
- -freorder-blocks introduces a dead label, which is not removed until after
  sched2
- -ftree-tail-merge makes a difference in which block -freorder-blocks
  introduces the dead label. In the case of -ftree-tail-merge, the dead label
  is introduced at the start of block 3, and block 3 and 4 end up in the same
  ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at the
  start of block 4, and block 3 and 4 don't end up in the same ebb.

The attached untested patch fixes PR50764 in a similar way to the patch for
PR49994, which is also about an ICE in maybe_record_trace_start with
-fsched2-use-superblocks.

The patch for PR49994 makes sure frame-related instructions are not moved past
the following jump.

The attached patch makes sure frame-related instructions are not moved past the
preceding jump.

Is this the way to fix this PR?

Thanks,
- Tom

2011-10-29  Tom de Vries  

PR rtl-optimization/50764
* sched-deps.c (sched_analyze_insn): Make sure frame-related insns are
not moved past preceding jump.



Index: gcc/sched-deps.c
===
--- gcc/sched-deps.c (revision 180521)
+++ gcc/sched-deps.c (working copy)
@@ -2812,8 +2812,13 @@ sched_analyze_insn (struct deps_desc *de
  during prologue generation and avoid marking the frame pointer setup
  as frame-related at all.  */
   if (RTX_FRAME_RELATED_P (insn))
-deps->sched_before_next_jump
-  = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
+{
+  deps->sched_before_next_jump
+	= alloc_INSN_LIST (insn, deps->sched_before_next_jump);
+
+  if (deps->pending_jump_insns)
+	add_dependence (insn, XEXP (deps->pending_jump_insns, 0), REG_DEP_ANTI);
+}
 
   if (code == COND_EXEC)
 {


[wwwdocs] Various tweaks to java/build-snapshot.html

2011-10-29 Thread Gerald Pfeifer
2011-10-29  Gerald Pfeifer  
 
* build-snapshot.html: Adjust title to not refer to CVS any more.
Adjust formatting of title.
Change link from CVS instructions to SVN.

Installed.

Gerald

Index: build-snapshot.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/java/build-snapshot.html,v
retrieving revision 1.16
diff -u -r1.16 build-snapshot.html
--- build-snapshot.html 29 Jul 2002 16:50:01 -  1.16
+++ build-snapshot.html 29 Oct 2011 19:33:54 -
@@ -1,23 +1,18 @@
 
 
 
-How to build GCJ/LIBGCJ from snapshots or cvs
+How to build GCJ/LIBGCJ from snapshots or checkouts
 
 
 
 
-
- 
-   
-Howto build and run libgcj/gcj snapshots or cvs
-  
-
-
+Howto build and run libgcj/gcj snapshots or checkouts
 
+
 
 
 
-1. Get a GCC snapshot or obtain GCC via CVS.
+1. Get a GCC snapshot or check out the sources.
 
 
 


[wwwdocs] Streamline GCC 4.5 release notes

2011-10-29 Thread Gerald Pfeifer
...a bit, by disabling the headers for empty sections.

Installed.

Gerald


Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.5/changes.html,v
retrieving revision 1.101
diff -u -r1.101 changes.html
--- changes.html11 May 2011 16:38:39 -  1.101
+++ changes.html29 Oct 2011 19:49:38 -
@@ -575,8 +575,6 @@
 
   
 
-Java (GCJ)
-
 New Targets and Target Specific Improvements
 
 AIX
@@ -703,8 +701,6 @@
 for more details about these attributes.
   
 
-picochip
-
 RS/6000 (POWER/PowerPC)
   
 GCC now supports the Power ISA 2.06, which includes the VSX
@@ -757,7 +753,10 @@
 enhancements to the Fortran language support library.
   
 
+
+>
 
 Other significant improvements
 


[wwwdocs] GNU textutils is GNU coreutils

2011-10-29 Thread Gerald Pfeifer
...and has been for a while.

Committed.

Index: snapshots.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/snapshots.html,v
retrieving revision 1.20
diff -u -r1.20 snapshots.html
--- snapshots.html  21 Sep 2006 14:17:36 -  1.20
+++ snapshots.html  29 Oct 2011 20:00:24 -
@@ -33,10 +33,10 @@
 files so that autoconf et al aren't needed. This is documented in
 comments in contrib/gcc_update.
 
-The program md5sum—which is included with the 
-http://www.gnu.org/software/textutils/textutils.html";>GNU
-Text Utilities—can be used to verify the integrity of a
-snapshot or release.  The release script generates the file 
+The program md5sum — which is included with the 
+https://www.gnu.org/software/coreutils/";>GNU Coreutils
+— can be used to verify the integrity of a snapshot or release.
+The release script generates the file 
 MD5SUMS that provides a 128-bit checksum for every
 file in the tarball.  Use the following command to verify the
 sources:


Re: [PATCH][RFC] Re-write LTO option merging

2011-10-29 Thread Richard Guenther
On Fri, Oct 28, 2011 at 4:42 PM, Diego Novillo  wrote:
> On 11-10-27 01:46 , Richard Guenther wrote:
>>
>> On Wed, 26 Oct 2011, Richard Guenther wrote:
>>
>>>
>>> This completely rewrites LTO option merging.  At compile (uselessly
>>> now at WPA?) time we now stream a COLLECT_GCC_OPTIONS like string
>>> as it comes from argv of the compiler binary.  Those options are
>>> read in by the LTO driver (lto-wrapper), merged into a single
>>> set (very simple merge function right now ;)) and given a place to
>>> complain about incompatible arguments.  The merged set is then
>>> prepended to the arguments from the linker driver line
>>> (what we get in COLLECT_GCC_OPTIONS for lto-wrapper), thus the
>>> linker command-line may override what the compiler command-line(s)
>>> provided.
>>>
>>> One visible change is that omitting optimization options on the link
>>> line no longer means -O0; to get -O0 you must now specify it explicitly
>>> at link time.
>>>
>>> There are probably more obscure differences, especially due to the
>>> very simple merge and complain function ;))  But this is a RFC ...
>>>
>>> If WPA partitioning at any point wants to do something clever with
>>> a set of incompatible functions it can re-parse the options and
>>> do that (we then have to arrange for lto-wrapper to let the options
>>> slip through).
>>>
>>> I'm LTO bootstrapping and testing this simple variant right now
>>> (I believe we do not exercise funny option combinations right now).
>>>
>>> I'll still implement a very simple merge/complain function.
>>> Suggestions for that welcome (I'll probably simply compute the
>>> intersection of options ... in the long run we'd want to annotate
>>> our options as to whether they should be unioned/intersected).
>
> Are you thinking of having some table of options with hints?  An NxN matrix
> of options?  Given two arbitrary options OPT1 and OPT2, how do we decide
> whether they can go together?  That's one big matrix.

No, not really.  I'd divide them into two classes, one where we have to
merge by union and one that we have to merge by intersection (for IL
changing options there (usually) is a conservative setting, like -fwrapv,
or -fno-strict-aliasing, or -fno-fast-math).
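
To make the two classes concrete, here is a rough sketch that merges one boolean option across translation units. It is not lto-wrapper code: the names are hypothetical, and the assignment of options to classes in the comments is my reading of the above (a conservative flag such as -fwrapv wins if any unit used it, an aggressive one such as -ffast-math survives only if every unit used it).

```c
#include <stdbool.h>
#include <stddef.h>

enum merge_kind
{
  MERGE_UNION,          /* set if any unit set it, e.g. -fwrapv */
  MERGE_INTERSECTION    /* set only if all units set it, e.g. -ffast-math */
};

/* Merge the per-unit settings of a single flag.  With no units at
   all the flag is reported as unset for either kind.  */
bool merge_flag (enum merge_kind kind, const bool *per_unit, size_t n_units)
{
  if (n_units == 0)
    return false;

  bool merged = (kind == MERGE_INTERSECTION);
  for (size_t i = 0; i < n_units; i++)
    {
      if (kind == MERGE_UNION)
        merged = merged || per_unit[i];
      else
        merged = merged && per_unit[i];
    }
  return merged;
}
```

Annotating each option with its merge kind would then reduce the merge function to a table lookup plus this loop, without any NxN compatibility matrix.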

> Perhaps we could group options in classes?  There's really only a subset of
> options that need to be checked: -f, -m, -O, -g, ...
> Perhaps start with an if-tree checking an incoming option against the set of
> accumulated options so far.

For non-IL changing options things are more difficult - how should we merge
-O1 and -O2, or, -O2 -fno-tree-pre and -O2?  If there are optimization options
specified at link time we could just ignore those at compile time - but what
if there are none specified at link time?

> In fact, if we simply cataloged the set of options that can affect gimple
> bytecode generation, we can then make sure that those don't change at link
> time.

Or rather choose a conservative setting at link time if they differ between
units at compile-time (and of course make sure to transfer them otherwise).

>
>>
>> !       if (i != 1)
>> !       obstack_grow (&temporary_obstack, " ", 1);
>> !       obstack_grow (&temporary_obstack, "'", 1);
>> !       q = option->canonical_option[0];
>> !       while ((p = strchr (q, '\'')))
>> !       {
>> !         obstack_grow (&temporary_obstack, q, p - q);
>> !         obstack_grow (&temporary_obstack, "'\\''", 4);
>> !         q = ++p;
>> !       }
>> !       obstack_grow (&temporary_obstack, q, strlen (q));
>> !       obstack_grow (&temporary_obstack, "'", 1);
>>
>> !       for (j = 1; j<  option->canonical_option_num_elements; ++j)
>>        {
>> !         obstack_grow (&temporary_obstack, " '", 2);
>> !         q = option->canonical_option[j];
>> !         while ((p = strchr (q, '\'')))
>> !           {
>> !             obstack_grow (&temporary_obstack, q, p - q);
>> !             obstack_grow (&temporary_obstack, "'\\''", 4);
>> !             q = ++p;
>> !           }
>> !         obstack_grow (&temporary_obstack, q, strlen (q));
>> !         obstack_grow (&temporary_obstack, "'", 1);
>
> Ugh.

Basically copied from gcc.c ... (which doesn't use cl_options, thus
I can't really share code).
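
For reference, the quoting that the obstack_grow sequence performs can be written as one small helper. This is a stand-alone sketch using malloc rather than obstacks, and shell_quote is a hypothetical name, not an existing function in gcc.c or lto-wrapper.c.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of ARG wrapped in single quotes,
   with every embedded ' rewritten as '\'' so that a POSIX shell
   reads the argument back unchanged.  The caller frees the result.
   Returns NULL if allocation fails.  */
char *shell_quote (const char *arg)
{
  size_t quotes = 0;
  for (const char *p = arg; *p; p++)
    if (*p == '\'')
      quotes++;

  /* Two surrounding quotes, three extra characters per embedded
     quote, and the terminating NUL.  */
  char *out = malloc (strlen (arg) + 3 * quotes + 3);
  if (out == NULL)
    return NULL;

  char *w = out;
  *w++ = '\'';
  for (const char *p = arg; *p; p++)
    {
      if (*p == '\'')
        {
          memcpy (w, "'\\''", 4);
          w += 4;
        }
      else
        *w++ = *p;
    }
  *w++ = '\'';
  *w = '\0';
  return out;
}
```

A helper of this shape would remove the duplicated loop for canonical_option[0] and the remaining elements.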

>> +   /* ???  For now the easiest thing would be to warn about
>> +      mismatches.  */
>> +
>> +   if (*decoded_options_count != fdecoded_options_count)
>> +     {
>> +       /* ???  Warn?  */
>> +       return;
>> +     }
>
> Yes, please.  We don't want to silently accept anything we don't fully
> understand.

Dropped in the followup patch (for now)

>
>> +   for (i = 0; i<  *decoded_options_count; ++i)
>> +     {
>> +       struct cl_decoded_option *option =&(*decoded_options)[i];
>> +       struct cl_decoded_option *foption =&fdecoded_options[i];
>> +       if (strcmp (option->orig_option_with_args_text,
>> +                 foption->orig_option_with_args_text) != 0)
>> +       {
>> +         /* ???  Warn?  */
>> +         return;
>
> Likewise.  If the warning proves too noisy in common scenarios, w

Re: [PATCH][2/n] LTO option handling/merging rewrite

2011-10-29 Thread Richard Guenther
On Fri, Oct 28, 2011 at 5:48 PM, Joseph S. Myers
 wrote:
> On Fri, 28 Oct 2011, Richard Guenther wrote:
>
>> +       /* Fallthru.  */
>> +     case OPT_fPIC:
>> +     case OPT_fpic:
>> +     case OPT_fpie:
>> +     case OPT_fcommon:
>> +     case OPT_fexceptions:
>> +       append_option (decoded_options, decoded_options_count, foption);
>> +       break;
>
> No doubt this is what the previous code did, but in this case shouldn't
> "union" mean the biggest PIC status of any file wins (thus, if -fPIC was
> the PIC option that actually had effect on some object, that wins over an
> explicit -fno-PIC or -fpic on another object)?  In general whether the
> options are positive or negative matters, and I don't see that handled
> here.

I tried to look at what we get for -fno-pic vs. -fpic and -fno-pic is completely
dropped from the decoded options list (not sure what happens on targets
with -fpic as default).  So it seems at most one state (the non-default one)
survives here.

But maybe I'm missing something - how can I reliably check whether an
option has a negative form and, if so, whether a given option is in that
negative form?

> (Actually, maybe the smallest PIC status should win - i.e. if any object
> is not PIC then the final code can be presumed to be non-PIC.)
>
> (Using Negative in .opt files for groups of options such as -fPIC/-fpic
> would ensure that at most one survives from any one object, but you still
> need to work out what you want to do for merging.)

Sure - I suppose we can fix that as a followup - the patch tries to do what
we do now, just at a different place (the driver).  Thus I tried to preserve
all bugs as well ;)

Richard.

> --
> Joseph S. Myers
> jos...@codesourcery.com
>
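
To make the merging question above concrete, here is a small sketch of
the "smallest PIC status wins" policy from Joseph's second suggestion. It
is not what the patch does (the patch appends the options as before), and
it does not use the GCC option machinery: PicLevel, pic_level_of and
merge_pic are hypothetical names, options are plain strings, and a
non-PIC default target is assumed.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical ordering of PIC states, smallest first.
enum class PicLevel { None = 0, SmallPic = 1, BigPic = 2 };

// Effective PIC level of one object file: the last PIC-related option
// wins, and no such option means non-PIC (assumed target default).
PicLevel pic_level_of(const std::vector<std::string>& opts) {
    PicLevel level = PicLevel::None;
    for (const std::string& o : opts) {
        if (o == "-fPIC")
            level = PicLevel::BigPic;
        else if (o == "-fpic")
            level = PicLevel::SmallPic;
        else if (o == "-fno-PIC" || o == "-fno-pic")
            level = PicLevel::None;
    }
    return level;
}

// Merge across all objects: a single non-PIC object makes the result
// non-PIC, and -fpic beats -fPIC.
PicLevel merge_pic(const std::vector<std::vector<std::string>>& objects) {
    if (objects.empty())
        return PicLevel::None;
    PicLevel merged = PicLevel::BigPic;
    for (const auto& opts : objects)
        merged = std::min(merged, pic_level_of(opts));
    return merged;
}
```

Switching to the "biggest wins" policy from the first suggestion would
only mean starting from PicLevel::None and using std::max instead.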


Re: Patch committed: Use GNU/Linux in comment

2011-10-29 Thread Richard Guenther
On Sat, Oct 29, 2011 at 7:06 AM, Ian Lance Taylor  wrote:
> "Joseph S. Myers"  writes:
>
>> On Fri, 28 Oct 2011, Ian Lance Taylor wrote:
>>
>>> This patch changes "Linux" to "GNU/Linux" in a comment.  Bootstrapped
>>> and ran libiberty testsuite on x86_64-unknown-linux-gnu.  Committed to
>>> mainline as libiberty maintainer.
>>
>> prctl is a Linux-kernel-specific syscall.  The comment is describing what
>> that syscall does in terms of Linux /proc datastructures.  Are you using
>> GNU/ because it also refers to userspace programs rather than purely to
>> the kernel, even though "top" and "ps" are not GNU programs?
>
> I'm using it because RMS commented that there were uses of "linux" in
> libiberty which should be "GNU/Linux," and it seemed easier to change
> the only one I could see.

Interesting reason ;)  But maybe there is no /Linux with !=GNU
so it doesn't matter.  Otherwise it would be technically incorrect and we should
change it to "Linux, not only GNU/Linux" ;)

Richard.

> Ian
>


Re: C++ PATCH for c++/50500 (DR 1082, implicitly declared copy in class with move)

2011-10-29 Thread Eric Botcazou
> DR 1082 changed the rules for implicitly declared copy constructors and
> assignment operators in the presence of move ctor/op= such that if
> either move operation is present, instead of being suppressed the copy
> operations will still be declared, but as deleted.

We have detected a side effect of this change by means of -fdump-ada-spec: 
implicit copy assignment operators are now generated in simple cases where 
they were not previously generated, for example:

template <class T, class U> class Generic_Array
{
  Generic_Array();
  void mf(T t, U u);
};

template class Generic_Array;  // explicit instantiation


This is because, during the call to lazily_declare_fn on sfk_copy_constructor, 
the new code:

  /* [class.copy]/8 If the class definition declares a move constructor or
     move assignment operator, the implicitly declared copy constructor is
     defined as deleted */
  if ((sfk == sfk_copy_assignment
       || sfk == sfk_copy_constructor)
      && (type_has_user_declared_move_constructor (type)
          || type_has_user_declared_move_assign (type)))
    DECL_DELETED_FN (fn) = true;

is invoked, and type_has_user_declared_move_assign has the side effect of 
causing lazily_declare_fn to be called on sfk_copy_assignment through the call 
to lookup_fnfields_slot:

bool
type_has_user_declared_move_assign (tree t)
{
  tree fns;

  if (CLASSTYPE_LAZY_MOVE_ASSIGN (t))
    return false;

  for (fns = lookup_fnfields_slot (t, ansi_assopname (NOP_EXPR));
       fns; fns = OVL_NEXT (fns))
    {
      tree fn = OVL_CURRENT (fns);
      if (move_fn_p (fn) && !DECL_ARTIFICIAL (fn))
        return true;
    }

  return false;
}


Is that expected?

-- 
Eric Botcazou
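
For readers less familiar with DR 1082, the rule quoted at the top of this
message can be observed from user code. A minimal sketch, assuming a C++11
compiler and standard library; MoveOnly and Plain are illustrative names
that do not appear in the patch:

```cpp
#include <type_traits>

// Declares a move constructor, so under DR 1082 the copy constructor and
// copy assignment operator are implicitly declared as deleted.
struct MoveOnly {
    MoveOnly() = default;
    MoveOnly(MoveOnly&&) = default;
};

// No move operations declared: the implicit copy operations stay usable.
struct Plain {
    int value = 0;
};

static_assert(!std::is_copy_constructible<MoveOnly>::value,
              "copy constructor is deleted");
static_assert(!std::is_copy_assignable<MoveOnly>::value,
              "copy assignment is deleted");
static_assert(std::is_move_constructible<MoveOnly>::value,
              "move constructor is available");
static_assert(std::is_copy_constructible<Plain>::value,
              "copy constructor is implicitly defined");
static_assert(std::is_copy_assignable<Plain>::value,
              "copy assignment is implicitly defined");
```

This only shows the language-level effect. It says nothing about when the
front end lazily declares the functions, which is the question raised above.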


Re: [Patch, libfortran, 3/3] Update file position lazily

2011-10-29 Thread Janne Blomqvist
On Sat, Oct 29, 2011 at 18:35, Mikael Morin  wrote:
> On Saturday 29 October 2011 14:43:22 Mikael Morin wrote:
>> > FWIW, it seems ifort 12.0 uses "UNDEFINED" in this case; I suppose a
>> > case could be made for using the same. Comments?
>>
>> Let's go for UNDEFINED then.
> On second thought, UNSPECIFIED is better as UNDEFINED is for another case.

Hmm, indeed, on second thought I agree as well.

Further comparisons:

pathf95 3.2.99: UNKNOWN

pgf95 11.2-0: REWIND (which is clearly wrong; inquire_5.f90 also
failed earlier because querying the position of a direct access
file returned ASIS, while the standard requires UNDEFINED in this
case).


-- 
Janne Blomqvist


Re: [PATCH] Update html docs for -mno-r11 and --param case-value-threshold

2011-10-29 Thread Gerald Pfeifer
On Wed, 6 Jul 2011, Michael Meissner wrote:
> I updated the html documents for my two recent changes:

I made the small follow-up patch below which tweaks markup and
refers to GNU/Linux instead of Linux.

Gerald

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.50
diff -u -r1.50 changes.html
--- changes.html   26 Oct 2011 19:48:57 -  1.50
+++ changes.html   30 Oct 2011 00:41:34 -
@@ -434,8 +434,9 @@
This will also be fixed in the GCC 4.6.1 and 4.5.4 releases.
 
  A new option (-mno-r11) was added to allow AIX
-   32-bit/64-bit and Linux 64-bit PowerPC users to specify that the compiler
-   should not load up the chain register (r11) before calling a
+   32-bit/64-bit and GNU/Linux 64-bit PowerPC users to specify that
+   the compiler should not load up the chain register
+   (r11) before calling a
function through a pointer.  If you use this option, you cannot call
nested functions through a pointer, or call other languages that might
use the static chain.


C++ PATCH to add -std=c++11 ??

2011-10-29 Thread Paolo Carlini

Hi,

today, by chance, I noticed this:

  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01756.html

and it occurred to me that maybe it's time to do this?

Thanks,
Paolo.

//
2011-10-30  Paolo Carlini  

* c.opt: Add -std=c++11.
Index: c.opt
===================================================================
--- c.opt   (revision 180671)
+++ c.opt   (working copy)
@@ -1182,6 +1182,10 @@ become a part of the upcoming ISO C++ standard, du
 extensions enabled by this mode are experimental and may be removed in
 future releases of GCC.
 
+std=c++11
+C++ ObjC++ Alias(std=c++0x)
+Conform to the ISO 2011 C++ standard
+
 std=c1x
 C ObjC
 Conform to the ISO 201X C standard draft (experimental and incomplete support)


Re: C++ PATCH to add -std=c++11 ??

2011-10-29 Thread Paolo Carlini

On 10/30/2011 02:12 AM, Paolo Carlini wrote:

> Hi,
>
> today, by chance, I noticed this:
>
>   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01756.html
>
> and it occurred to me that maybe it's time to do this?

... or maybe we want, at the same time, to tweak the description of
c++0x a bit?


Paolo.