Re: [Patch,AVR] Print no-return functions as JMP

2011-10-14 Thread Richard Kenner
> +@item -mjump-to-noreturn
> +@opindex mjump-to-noreturn
> +Use a jump instruction instead of a call instruction when calling a
> +no-return function.  This option is active if optimization is turned
> +on and just affects the way a call instruction is printed out.
> 
> Would "emit" be better here than "printed out"?

Maybe "generated"?
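
For readers unfamiliar with the term, here is a minimal sketch of a "no-return function" in C, assuming GCC's noreturn attribute. The names bail_out and check_value are hypothetical and are not part of the patch under discussion. Because control never comes back from bail_out, a call to it is the kind of call the proposed option would output as a jump.

```c
#include <setjmp.h>

/* Hypothetical example, not taken from the patch.  */
static jmp_buf recovery_point;
static volatile int last_code;

/* The attribute tells GCC that this function never returns to its caller.  */
static void bail_out (int code) __attribute__ ((noreturn));

static void
bail_out (int code)
{
  last_code = code;
  longjmp (recovery_point, 1);
}

/* Return 0 for a non-negative VALUE; otherwise return the code that
   bail_out recorded before jumping back to the recovery point.  */
int
check_value (int value)
{
  if (setjmp (recovery_point) != 0)
    return last_code;
  if (value < 0)
    bail_out (-value);  /* Nothing after this call is reachable.  */
  return 0;
}
```

Calling check_value (5) takes the normal path, while check_value (-3) goes through bail_out and comes back via the recovery point.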


Re: [C++ Patch] PR 50732

2011-10-14 Thread Paolo Carlini

On 10/15/2011 12:20 AM, Jason Merrill wrote:

> That should work.

Excellent. Then we can do something like the below, a great improvement. 
I'm finishing testing it (already past g++.dg), Ok if it passes?


Thanks,
Paolo.

///
/cp
2011-10-14  Paolo Carlini  

PR c++/50732
* semantics.c (finish_trait_expr): Do not try to instantiate the
base type of an __is_base_of trait.
(check_trait_type): Return a tree; use complete_type_or_else.

/testsuite
2011-10-14  Paolo Carlini  

PR c++/50732
* g++.dg/ext/is_base_of_incomplete.C: New.
* g++.dg/ext/is_base_of_diagnostic.C: Adjust dg-errors.
* g++.dg/ext/unary_trait_incomplete.C: Likewise.
Index: testsuite/g++.dg/ext/unary_trait_incomplete.C
===
--- testsuite/g++.dg/ext/unary_trait_incomplete.C   (revision 180007)
+++ testsuite/g++.dg/ext/unary_trait_incomplete.C   (working copy)
@@ -1,76 +1,76 @@
 // PR c++/39475
 
-struct I;
+struct I; // { dg-error "forward declaration" }
 struct C { };
 
 bool nas1 = __has_nothrow_assign(I); // { dg-error "incomplete type" }
 bool nas2 = __has_nothrow_assign(C[]);
-bool nas3 = __has_nothrow_assign(I[]); // { dg-error "incomplete type" }
+bool nas3 = __has_nothrow_assign(I[]); // { dg-error "unspecified bounds" }
 bool nas4 = __has_nothrow_assign(void);
 bool nas5 = __has_nothrow_assign(const void);
 
 bool tas1 = __has_trivial_assign(I); // { dg-error "incomplete type" }
 bool tas2 = __has_trivial_assign(C[]);
-bool tas3 = __has_trivial_assign(I[]); // { dg-error "incomplete type" }
+bool tas3 = __has_trivial_assign(I[]); // { dg-error "unspecified bounds" }
 bool tas4 = __has_trivial_assign(void);
 bool tas5 = __has_trivial_assign(const void);
 
 bool nco1 = __has_nothrow_constructor(I); // { dg-error "incomplete type" }
 bool nco2 = __has_nothrow_constructor(C[]);
-bool nco3 = __has_nothrow_constructor(I[]); // { dg-error "incomplete type" }
+bool nco3 = __has_nothrow_constructor(I[]); // { dg-error "unspecified bounds" }
 bool nco4 = __has_nothrow_constructor(void);
 bool nco5 = __has_nothrow_constructor(const void);
 
 bool tco1 = __has_trivial_constructor(I); // { dg-error "incomplete type" }
 bool tco2 = __has_trivial_constructor(C[]);
-bool tco3 = __has_trivial_constructor(I[]); // { dg-error "incomplete type" }
+bool tco3 = __has_trivial_constructor(I[]); // { dg-error "unspecified bounds" }
 bool tco4 = __has_trivial_constructor(void);
 bool tco5 = __has_trivial_constructor(const void);
 
 bool ncp1 = __has_nothrow_copy(I); // { dg-error "incomplete type" }
 bool ncp2 = __has_nothrow_copy(C[]);
-bool ncp3 = __has_nothrow_copy(I[]); // { dg-error "incomplete type" }
+bool ncp3 = __has_nothrow_copy(I[]); // { dg-error "unspecified bounds" }
 bool ncp4 = __has_nothrow_copy(void);
 bool ncp5 = __has_nothrow_copy(const void);
 
 bool tcp1 = __has_trivial_copy(I); // { dg-error "incomplete type" }
 bool tcp2 = __has_trivial_copy(C[]);
-bool tcp3 = __has_trivial_copy(I[]); // { dg-error "incomplete type" }
+bool tcp3 = __has_trivial_copy(I[]); // { dg-error "unspecified bounds" }
 bool tcp4 = __has_trivial_copy(void);
 bool tcp5 = __has_trivial_copy(const void);
 
 bool vde1 = __has_virtual_destructor(I); // { dg-error "incomplete type" }
 bool vde2 = __has_virtual_destructor(C[]);
-bool vde3 = __has_virtual_destructor(I[]); // { dg-error "incomplete type" }
+bool vde3 = __has_virtual_destructor(I[]); // { dg-error "unspecified bounds" }
 bool vde4 = __has_virtual_destructor(void);
 bool vde5 = __has_virtual_destructor(const void);
 
 bool tde1 = __has_trivial_destructor(I); // { dg-error "incomplete type" }
 bool tde2 = __has_trivial_destructor(C[]);
-bool tde3 = __has_trivial_destructor(I[]); // { dg-error "incomplete type" }
+bool tde3 = __has_trivial_destructor(I[]); // { dg-error "unspecified bounds" }
 bool tde4 = __has_trivial_destructor(void);
 bool tde5 = __has_trivial_destructor(const void);
 
 bool abs1 = __is_abstract(I); // { dg-error "incomplete type" }
 bool abs2 = __is_abstract(C[]);
-bool abs3 = __is_abstract(I[]); // { dg-error "incomplete type" }
+bool abs3 = __is_abstract(I[]); // { dg-error "unspecified bounds" }
 bool abs4 = __is_abstract(void);
 bool abs5 = __is_abstract(const void);
 
 bool pod1 = __is_pod(I); // { dg-error "incomplete type" }
 bool pod2 = __is_pod(C[]);
-bool pod3 = __is_pod(I[]); // { dg-error "incomplete type" }
+bool pod3 = __is_pod(I[]); // { dg-error "unspecified bounds" }
 bool pod4 = __is_pod(void);
 bool pod5 = __is_pod(const void);
 
 bool emp1 = __is_empty(I); // { dg-error "incomplete type" }
 bool emp2 = __is_empty(C[]);
-bool emp3 = __is_empty(I[]); // { dg-error "incomplete type" }
+bool emp3 = __is_empty(I[]); // { dg-error "unspecified bounds" }
 bool emp4 = __is_empty(void);
 bool emp5 = __is_empty(const void);
 
 bool pol1 = __is_polymorphic(I); // { dg-error "incomplete type" }
 bool pol2 = __is_polymorphic(C

Re: [PATCH] Fix target default on biarch Linux/Sparc

2011-10-14 Thread Eric Botcazou
> Agreed, please install this patch if you haven't already.

Thanks, done after successfully bootstrapping on SPARC64/Linux.

-- 
Eric Botcazou


Tweak again gnat.dg/specs/debug1.ads

2011-10-14 Thread Eric Botcazou
The test doesn't pass on SPARC because a different comment character is used...

Fixed by not scanning comment characters at all; tested on x86 and SPARC/Linux, 
applied on the mainline.


2011-10-14  Eric Botcazou  

* gnat.dg/specs/debug1.ads: Tweak.


-- 
Eric Botcazou
Index: gnat.dg/specs/debug1.ads
===
--- gnat.dg/specs/debug1.ads	(revision 179844)
+++ gnat.dg/specs/debug1.ads	(working copy)
@@ -11,4 +11,4 @@ package Debug1 is
 
 end Debug1;
 
--- { dg-final { scan-assembler-times "# DW_AT_artificial" 4 } }
+-- { dg-final { scan-assembler-times "DW_AT_artificial" 8 } }


Re: [google] support for building Linux kernel with FDO (issue4523061)

2011-10-14 Thread Rong Xu
You can access the gcc side of the patch from the following two branches:
svn://gcc.gnu.org/svn/gcc/branches/google/main or
svn://gcc.gnu.org/svn/gcc/branches/google/gcc-4_6_branch
I have the patch for the 2.6.36 kernel and the 2.6.34 kernel, and I believe I
attached the 2.6.36 patch to one of the early review emails. Let me know
if you cannot find it.

We do plan to submit both the gcc and kernel changes to trunk, but we
don't expect that to happen soon.

-Rong

> From: vulcansh 
> Date: Fri, Oct 14, 2011 at 2:53 PM
> Subject: Re: [google] support for building Linux kernel with FDO 
> (issue4523061)
> To: gcc-patches@gcc.gnu.org
>
>
>
> I found this thread through a Google search so I didn't have much context.
> Can you point me to the google/main source repo for gcc and the kernel?  Are
> there plans to migrate this patch to the gcc and Linux mainline?
>
> Sorry, but I have never used the PGO option in icc.
>
> Steve
>
>
> Xinliang David Li wrote:
> >
> > This patch is for google/main which is 4.7 based, but the validated
> > version is in google_46 branch (which is based on 4.6).
> >
> > By the way (given that you are from intel),  do you know if linux
> > kernel can be built with icc with PGO turned on? Our intern Xiaotian
> > has tried to use icc (12.0) to built kernel, and had some problems.
> > The bootable kernel built with icc + gcc (for those failed with icc)
> > does not perform quite well.
> >
>
> --
> View this message in context:
> http://old.nabble.com/-google---support-for-building-Linux-kernel-with-FDO-%28issue4523061%29-tp31607746p32655618.html
> Sent from the gcc - patches mailing list archive at Nabble.com.


Re: [pph] Make libcpp symbol validation a warning (issue5235061)

2011-10-14 Thread Gabriel Charette
Yes, I understand that.

But when the second 2.pph is skipped when reading foo.pph, the reading
of its line_table is also skipped (as foo.pph doesn't contain the
line_table information for 2.h, 2.pph does and adds it when it's
included as a child, but if it's skipped, the line_table info for 2.h
should never make it in the line_table), so I don't see why this is an
issue for the line_table (other than the assert about the number of
line table entries read). What I was suggesting is that as far as the
assert is concerned it would be stronger to count the number of
skipped child headers on read and assert num_read+num_skipped ==
num_expected_childs basically (it is still only an assert so no big
deal I guess).
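
The accounting suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: struct child_image, struct line_table_sketch and merge_children are hypothetical names and do not correspond to the actual pph reader; the point is just that entries belonging to skipped children are counted rather than ignored, so the check stays an equality.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a child header referenced by a PPH image.  */
struct child_image
{
  unsigned int num_line_entries;  /* Entries it contributed when written.  */
  bool already_read;              /* True if an earlier include read it.  */
};

/* Hypothetical stand-in for the line table; only the entry count matters.  */
struct line_table_sketch
{
  unsigned int used;
};

/* Merge the line table entries of CHILDREN into TABLE, skipping children
   that were already read.  Return true if the entries added plus the
   entries skipped match EXPECTED_IN, the count recorded at write time.  */
bool
merge_children (struct line_table_sketch *table,
                const struct child_image *children, size_t num_children,
                unsigned int expected_in)
{
  unsigned int used_before = table->used;
  unsigned int num_skipped = 0;

  for (size_t i = 0; i < num_children; i++)
    {
      if (children[i].already_read)
        num_skipped += children[i].num_line_entries;
      else
        table->used += children[i].num_line_entries;
    }

  /* Counting the skipped entries keeps the check an equality.  */
  return (table->used - used_before) + num_skipped == expected_in;
}
```

A caller would assert on the return value; with this bookkeeping a skipped 2.pph no longer forces the check to be relaxed.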

Essentially this patch fixes the last bug we had in the line_table
merging (i.e. that guarded out headers in the non-pph version weren't
guarded out in the pph version) and this is a good thing. I'm just
being picky about weakening asserts!


I still think it would be nice to have a way to test constructs like
the line_table at the end of parsing (maybe a new flag, as I was
suggesting in my previous email, as gcc doesn't allow for modular
testing) and compare pph and non-pph versions. Testing at this level
would potentially be much better than trying to understand tricky test
failures from the ground up. This is beyond the scope of this patch of
course, but something to keep in mind I think.

Gab

On Fri, Oct 14, 2011 at 8:16 AM, Diego Novillo  wrote:
>
> On 11-10-13 17:55 , Gabriel Charette wrote:
>
>> I'm not sure exactly how you skip headers already parsed now (we
>> didn't used to when I wrote this code and that was the only problem
>> remaining in the line_table (i.e. duplicate entries for guarded
>> headers in the non-pph compile)), but couldn't you count the number of
>> skipped entries and assert ((line_table->used - used_before) +
>> numSkipped == expected_in) ?
>
> The problem is that the compilation process of foo.h -> foo.pph may generate 
> different line tables than a compile that includes foo.pph. For instance,
>
> foo.h:
> #include "1.pph"
> #include "2.pph"
> #include "3.pph"
>
> foo.cc:
> #include "2.pph"
> #include "foo.pph"
>
>
> When we compile foo.h, the line table incorporates the effects of including 
> 2.pph, and that's what we save to foo.pph.  However, when compiling foo.cc, 
> the first thing we do is include 2.pph, so when processing the include for 
> foo.pph, we will completely skip over 2.pph.
>
> That's why we cannot really have the same line table that we had when we 
> generated foo.pph.
>
>
> Diego.


[SH] PR 49263 - underutilized "TST #imm, R0" instruction

2011-10-14 Thread Oleg Endo
Hello,

the attached patch is the same as the last proposed patch in the PR but
with some fixed formatting and comments. Hope it's fine like that.

Tested against trunk rev 179778 with 

make -k -j4 check RUNTESTFLAGS="--target_board=sh-sim
\{-m2,-m2a-single,-m4-single,-m4a-single\}\{-mb,-ml\}"

and no new failures (ignoring the impossible -m2a-single -mb
combination).

Cheers,
Oleg

ChangeLog:

2011-10-15  Oleg Endo  

PR target/49263
* config/sh/sh.h (ZERO_EXTRACT_ANDMASK): New macro.
* config/sh/sh.c (sh_rtx_costs): Add test instruction case.
* config/sh/sh.md (tstsi_t): Name existing insn.  Make inner
and instruction commutative.
(tsthi_t, tstqi_t, tstqi_t_zero, tstsi_t_and_not,
tstsi_t_zero_extract_eq, tstsi_t_zero_extract_xor,
tstsi_t_zero_extract_subreg_xor_little,
tstsi_t_zero_extract_subreg_xor_big): New insns.
(*movsicc_t_false, *movsicc_t_true): Replace space with tab in
asm output.
(*andsi_compact): Reorder alternatives so that K08 is considered
first.

testsuite/ChangeLog:

2011-10-15  Oleg Endo  

PR target/49263
* gcc.target/sh/pr49263.c: New.
Index: gcc/testsuite/gcc.target/sh/pr49263.c
===
--- gcc/testsuite/gcc.target/sh/pr49263.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr49263.c	(revision 0)
@@ -0,0 +1,86 @@
+/* Verify that TST #imm, R0 instruction is generated if the constant
+   allows it.  Under some circumstances another compare instruction might
+   be selected, which is also fine.  Any AND instructions are considered
+   counter productive and fail the test.  */
+/* { dg-do compile { target "sh*-*-*" } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "and" } } */
+
+#define make_func(__valtype__, __valget__, __tstval__, __suff__)\
+  int test_imm_##__tstval__##__suff__ (__valtype__ val) \
+{\
+  return ((__valget__) & (0x##__tstval__  << 0)) ? -20 : -40;\
+}
+
+#define make_func_0_F(__valtype__, __valget__, __y__, __suff__)\
+  make_func (__valtype__, __valget__, __y__##0, __suff__)\
+  make_func (__valtype__, __valget__, __y__##1, __suff__)\
+  make_func (__valtype__, __valget__, __y__##2, __suff__)\
+  make_func (__valtype__, __valget__, __y__##3, __suff__)\
+  make_func (__valtype__, __valget__, __y__##4, __suff__)\
+  make_func (__valtype__, __valget__, __y__##5, __suff__)\
+  make_func (__valtype__, __valget__, __y__##6, __suff__)\
+  make_func (__valtype__, __valget__, __y__##7, __suff__)\
+  make_func (__valtype__, __valget__, __y__##8, __suff__)\
+  make_func (__valtype__, __valget__, __y__##9, __suff__)\
+  make_func (__valtype__, __valget__, __y__##A, __suff__)\
+  make_func (__valtype__, __valget__, __y__##B, __suff__)\
+  make_func (__valtype__, __valget__, __y__##C, __suff__)\
+  make_func (__valtype__, __valget__, __y__##D, __suff__)\
+  make_func (__valtype__, __valget__, __y__##E, __suff__)\
+  make_func (__valtype__, __valget__, __y__##F, __suff__)\
+
+#define make_funcs_0_FF(__valtype__, __valget__, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 0, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 1, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 2, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 3, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 4, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 5, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 6, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 7, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 8, __suff__)\
+  make_func_0_F (__valtype__, __valget__, 9, __suff__)\
+  make_func_0_F (__valtype__, __valget__, A, __suff__)\
+  make_func_0_F (__valtype__, __valget__, B, __suff__)\
+  make_func_0_F (__valtype__, __valget__, C, __suff__)\
+  make_func_0_F (__valtype__, __valget__, D, __suff__)\
+  make_func_0_F (__valtype__, __valget__, E, __suff__)\
+  make_func_0_F (__valtype__, __valget__, F, __suff__)\
+
+make_funcs_0_FF (signed char*, *val, int8_mem)
+make_funcs_0_FF (signed char, val, int8_reg)
+
+make_funcs_0_FF (unsigned char*, *val, uint8_mem)
+make_funcs_0_FF (unsigned char, val, uint8_reg)
+
+make_funcs_0_FF (short*, *val, int16_mem)
+make_funcs_0_FF (short, val, int16_reg)
+
+make_funcs_0_FF (unsigned short*, *val, uint16_mem)
+make_funcs_0_FF (unsigned short, val, uint16_reg)
+
+make_funcs_0_FF (int*, *val, int32_mem)
+make_funcs_0_FF (int, val, int32_reg)
+
+make_funcs_0_FF (unsigned int*, *val, uint32_mem)
+make_funcs_0_FF (unsigned int, val, uint32_reg)
+
+make_funcs_0_FF (long long*, *val, int64_lowword_mem)
+make_funcs_0_FF (long long, val, int64_lowword_reg)
+
+make_funcs_0_FF (unsigned long long*, *val, uint64_lowword_mem)
+make_funcs_0_FF (unsigned long long, val, uint64_lowword_reg)
+
+make_funcs_0_FF (long long*, *val >> 32, int64_highword_mem)
+make_funcs_0_FF (long long, val >> 32, int64_highword_reg)
+
+make_funcs_0_FF (unsi

Re: [SH] PR 49263 - underutilized "TST #imm, R0" instruction

2011-10-14 Thread Kaz Kojima
Oleg Endo  wrote:
> the attached patch is the same as the last proposed patch in the PR but
> with some fixed formatting and comments. Hope it's fine like that.
> 
> Tested against trunk rev 179778 with 
> 
> make -k -j4 check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2,-m2a-single,-m4-single,-m4a-single\}\{-mb,-ml\}"
> 
> and no new failures (ignoring the impossible -m2a-single -mb
> combination).

This patch is OK.  Thanks for working on this issue.
I've applied it on trunk as revision 180020.

Regards,
kaz


[PATCH] Fix mv8plus, allow targeting Linux or Solaris from other sparc host.

2011-10-14 Thread David Miller

Richard reported to me that we wouldn't have a mulsi3 pattern with
"-m32 -mv8plus", which left me dumbfounded. It seemed impossible.

Clarifying further, the case is when gcc is built for a target of
sparc64-linux.  And indeed I was able to reproduce this: a 32-bit
multiply results in a libcall.

The problem is that the default cpu when 32-bit is v7, and applying
the cpu disable bits in sparc_option_override() clears MASK_ISA
which includes MASK_V8PLUS.

While diagnosing this I added all kinds of debugging helpers to
sparc_option_override() so that the next guy can figure this kind of
thing out more quickly.  "-mdebug=x,y,z" is added, and currently
recognizes "options" and "all".  More can be added later, as needed,
to ease sparc backend maintenance.

You can't just fix this -mv8plus problem universally using spec
tricks.  Spec rules such as "{!-mcpu*:-mcpu=v9}" never trigger for the
default bitness, because OPTION_DEFAULT_SPECS appends "-mcpu=v7" or
similar to the command line first.

Therefore, I put the cpu bump to v9 into sparc_option_override()
itself, which handles all possible cases.

sparc64-sun-solaris2* wouldn't hit this problem, because the
default cpu there is v9.
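
As a rough illustration of the failure and the fix described above, here is a small sketch. The bit values, enum cpu_level and both function names are assumptions made only for this example; the real masks and cpu table live in the SPARC backend and look different. It shows how clearing the ISA mask loses the -mv8plus bit, and how bumping the cpu choice to v9 can be decided from the flags.

```c
/* Hypothetical bit values, chosen only for this sketch.  */
#define MASK_V8      0x1u
#define MASK_V9      0x2u
#define MASK_V8PLUS  0x4u
#define MASK_FPU     0x8u
/* Following the description above, the ISA mask covers MASK_V8PLUS.  */
#define MASK_ISA     (MASK_V8 | MASK_V9 | MASK_V8PLUS)

/* Hypothetical ordering of cpu levels, oldest first.  */
enum cpu_level { CPU_V7, CPU_V8, CPU_V9 };

/* Apply the disable bits of a cpu entry to FLAGS.  */
unsigned int
apply_cpu_disable (unsigned int flags, unsigned int disable)
{
  return flags & ~disable;
}

/* If -mv8plus was requested, make sure the cpu is at least v9.  */
enum cpu_level
bump_cpu_for_v8plus (unsigned int flags, enum cpu_level cpu)
{
  if ((flags & MASK_V8PLUS) != 0 && cpu < CPU_V9)
    return CPU_V9;
  return cpu;
}
```

With these assumed values, applying MASK_ISA as the disable mask to flags that contain MASK_V8PLUS drops that bit, which is the symptom reported; bump_cpu_for_v8plus models the decision that avoids falling back to the v7 default.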

While testing to see what happens on Solaris, I noticed that it has
become impossible to cross from one sparc target to Linux/Sparc or
Solaris/Sparc.  This is because the ifdef guard for EXTRA_SPEC_FUNCTIONS
is not identical to the test used to decide if driver-sparc.o should
be included in the build, so we get a link failure in such cross
situations.

That issue is fixed here too.

Committed to trunk.

gcc/

* config/sparc/sol2.h: Protect -m{cpu,tune}=native handling
with a more complete cpp test.
* config/sparc/linux64.h: Likewise.
* config/sparc/linux.h: Likewise.
* config/sparc/sparc.opt (sparc_debug): New target variable.
(mdebug): New target option.
* config/sparc/sparc.h (MASK_DEBUG_OPTIONS, MASK_DEBUG_ALL,
TARGET_DEBUG_OPTIONS): New defines.
* config/sparc/sparc.c (debug_target_flag_bits,
debug_target_flags): New functions.
(sparc_option_override): Add name strings back to cpu_table[].
Parse -mdebug string.  When TARGET_DEBUG_OPTIONS is true, print
out the target flags before and after override processing as well
as the selected cpu.  If MASK_V8PLUS, make sure that the selected
cpu is at least v9.
---
 gcc/ChangeLog  |   18 +
 gcc/config/sparc/linux.h   |2 +-
 gcc/config/sparc/linux64.h |2 +-
 gcc/config/sparc/sol2.h|2 +-
 gcc/config/sparc/sparc.c   |  161 ++--
 gcc/config/sparc/sparc.h   |6 ++
 gcc/config/sparc/sparc.opt |8 ++
 7 files changed, 174 insertions(+), 25 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8eac26e..2bc40b0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,21 @@
+2011-10-14  David S. Miller  
+
+   * config/sparc/sol2.h: Protect -m{cpu,tune}=native handling
+   with a more complete cpp test.
+   * config/sparc/linux64.h: Likewise.
+   * config/sparc/linux.h: Likewise.
+   * config/sparc/sparc.opt (sparc_debug): New target variable.
+   (mdebug): New target option.
+   * config/sparc/sparc.h (MASK_DEBUG_OPTIONS, MASK_DEBUG_ALL,
+   TARGET_DEBUG_OPTIONS): New defines.
+   * config/sparc/sparc.c (debug_target_flag_bits,
+   debug_target_flags): New functions.
+   (sparc_option_override): Add name strings back to cpu_table[].
+   Parse -mdebug string.  When TARGET_DEBUG_OPTIONS is true, print
+   out the target flags before and after override processing as well
+   as the selected cpu.  If MASK_V8PLUS, make sure that the selected
+   cpu is at least v9.
+
 2011-10-15  Oleg Endo  
 
PR target/49263
diff --git a/gcc/config/sparc/linux.h b/gcc/config/sparc/linux.h
index 0ad4b34..443c796 100644
--- a/gcc/config/sparc/linux.h
+++ b/gcc/config/sparc/linux.h
@@ -41,7 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* -mcpu=native handling only makes sense with compiler running on
a SPARC chip.  */
-#if defined(__sparc__)
+#if defined(__sparc__) && defined(__linux__)
 extern const char *host_detect_local_cpu (int argc, const char **argv);
 # define EXTRA_SPEC_FUNCTIONS  \
   { "local_cpu_detect", host_detect_local_cpu },
diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index b87116a..a51a2f0 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -139,7 +139,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* -mcpu=native handling only makes sense with compiler running on
a SPARC chip.  */
-#if defined(__sparc__)
+#if defined(__sparc__) && defined(__linux__)
 extern const char *host_detect_local_cpu (int argc, const char **argv);
 # define EXTRA_SPEC_FUNCTIONS  \
   { "loc

Re: Fix for PR obj-c++/48275 ("getter=namespace failing with .mm")

2011-10-14 Thread Mike Stump
On Oct 14, 2011, at 10:51 AM, Nicola Pero wrote:
> Can I apply this fix to the 4.6 branch as well ?

> OK to commit to the 4.6 branch ?

Ok.


Re: [Patch,AVR] Fix PR46278, Take #3

2011-10-14 Thread Georg-Johann Lay
Weddington, Eric wrote:
> 
>> This is yet another attempt to fix PR46278 (fake X addressing).
>> 
>> After the previous clean-ups it is just a small change.
>> 
>> caller-saves.c tries to eliminate call-clobbered hard-regs allocated to 
>> pseudos around function calls, and that leads to situations where reload is 
>> no longer able to perform all requested spills because the AVR has very few 
>> address registers.
>> 
>> Thus, the patch adds a new target option -mstrict-X so that the user can
>> turn that option on if desired, and then -fcaller-save is disabled.
>> 
>> The patch passes the testsuite without regressions. Moreover, the 
>> testsuite passes without regressions if all test cases are run with 
>> -mstrict-X and all libraries (libgcc, avr-libc) are built with the new
>> option turned on.
> 
> Hi Johann,
> 
> Sorry, I haven't been keeping up with the discussion on this PR.
> 
> But if all test cases pass with running -mstrict-X and everything built with
> that option on, then why is this even an option? Is it because that it may 
> not always reduce code size?...

An alternative would be to set -mstrict-X by default at -O or higher.
Let's see what Denis thinks.

Johann

> 
> Thanks, Eric
> 



Re: [C++ Patch] PR 17212

2011-10-14 Thread Mike Stump
On Oct 13, 2011, at 4:19 PM, Paolo Carlini wrote:
> Ok for mainline?

Ok.


Re: ObjC/ObjC++ Patch: rewrite objc/objc++ frontend hashtables

2011-10-14 Thread Mike Stump
On Oct 13, 2011, at 4:54 PM, Nicola Pero wrote:
> This patch finally rewrites the hashtables used by the ObjC (and ObjC++) 
> frontend.  The
> new code speeds up the compiler by about 4% when compiling the standard 
> GNUstep ObjC
> system headers with -fsyntax-only.  That's quite good for a change that does 
> nothing
> but swap a hashtable implementation with another one.
> 
> PS: This also supersedes the two small ObjC hashtable patches that I sent in 
> the past 12
> months or so and that were never applied.  The hashtable implemented by the 
> current patch 
> is polished and fast.
> 
> Bootstrapped and regtested on gnu-linux i686.
> 
> Ok to commit ?

Ok.


Re: [Patch, Darwin] fix PR50699.

2011-10-14 Thread Iain Sandoe


On 13 Oct 2011, at 23:38, Iain Sandoe wrote:



On 13 Oct 2011, at 23:22, Mike Stump wrote:

+/* Add $LDBL128 suffix to long double builtins for ppc darwin.  */

static void
-darwin_patch_builtin (int fncode)
+darwin_patch_builtin (enum built_in_function fncode)


This is a property of the target machine.  DARWIN_PPC is a property  
of the target machine; maybe if (DARWIN_PPC) { } will do what you  
want?


yes, that should be right - there was no reason that this should  
have broken x86 darwin.


regstrapped on *-darwin9, x86-64-darwin10, applied as r179962.

gcc:

PR bootstrap/50699
* config/darwin.c (darwin_patch_builtin): Adjust argument type.  Only
build for powerpc targets.
(darwin_patch_builtins): Only build for powerpc targets.

Index: gcc/config/darwin.c
===
--- gcc/config/darwin.c (revision 179961)
+++ gcc/config/darwin.c (working copy)
@@ -2957,10 +2957,11 @@ darwin_override_options (void)
   darwin_running_cxx = (strstr (lang_hooks.name, "C++") != 0);
 }

-/* Add $LDBL128 suffix to long double builtins.  */
+#if DARWIN_PPC
+/* Add $LDBL128 suffix to long double builtins for ppc darwin.  */

 static void
-darwin_patch_builtin (int fncode)
+darwin_patch_builtin (enum built_in_function fncode)
 {
   tree fn = builtin_decl_explicit (fncode);
   tree sym;
@@ -2998,6 +2999,7 @@ darwin_patch_builtins (void)
 #undef PATCH_BUILTIN_NO64
 #undef PATCH_BUILTIN_VARIADIC
 }
+#endif

 /*  CFStrings implementation.  */
 static GTY(()) tree cfstring_class_reference = NULL_TREE;



Re: ObjC/ObjC++ Patch: rewrite objc/objc++ frontend hashtables

2011-10-14 Thread Mike Stump
On Oct 13, 2011, at 5:02 PM, Nicola Pero wrote:
> I actually forgot to post a tiny bit that is required to support
> the additional objc/objc-map.h and objc/objc-map.c files.  It's
> part of the same patch.  Apologies.

Hum, looks fairly obvious to me.  :-)


Re: [patch] dwarf2out: Drop the size + performance overhead of DW_AT_sibling

2011-10-14 Thread Tristan Gingold

On Oct 13, 2011, at 10:40 PM, Jan Kratochvil wrote:

> On Wed, 12 Oct 2011 16:18:07 +0200, Jan Kratochvil wrote:
>> On Wed, 12 Oct 2011 16:07:24 +0200, Tristan Gingold wrote:
>>> I fear that this may degrade performance of other debuggers.  What about
>>> adding a command line option ?
>> 
>> I can test idb,
> 
> I do not find the difference measurable.  Dropping DW_AT_sibling is 0.25%
> performance _improvement_ but I guess it is just less than the measurement
> error.
> 
> libstdc++ built with gcc -gdwarf-2 as with gcc -gdwarf-4 -fdebug-types-section
> it crashes.  i7-920 x86_64 used for testing:
> Intel(R) Debugger for applications running on Intel(R) 64, Version 12.1, 
> Build [76.472.14]
> 
> with DW_AT_sibling
> real 2m34.206s 2m31.822s 2m31.709s 2m32.316s
> avg = 152.51325 seconds
> 
> patched GCC without DW_AT_sibling
> real 2m32.528s 2m30.524s 2m33.767s 2m31.719s
> avg = 152.1345 seconds
> 
> I do not see a point in keeping DW_AT_sibling there.

I am not against this patch, my only concern is that there are many many dwarf 
consumers and I have no idea how they will react to this change.

Tristan.
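
The averages and the roughly 0.25% figure quoted above can be re-derived from the eight timings with a few lines of C. This is only a sketch for checking the arithmetic; the array and function names are made up for the example, and the timings are the quoted "real" values converted to seconds.

```c
#include <stddef.h>

/* Quoted wall-clock timings converted to seconds.  */
static const double with_sibling[] = { 154.206, 151.822, 151.709, 152.316 };
static const double without_sibling[] = { 152.528, 150.524, 153.767, 151.719 };

/* Arithmetic mean of COUNT samples.  */
double
average (const double *samples, size_t count)
{
  double sum = 0.0;
  for (size_t i = 0; i < count; i++)
    sum += samples[i];
  return sum / (double) count;
}

/* Relative improvement of PATCHED over BASELINE, in percent.  */
double
percent_improvement (double baseline, double patched)
{
  return (baseline - patched) / baseline * 100.0;
}
```

Feeding average (with_sibling, 4) and average (without_sibling, 4) into percent_improvement reproduces the quoted averages of 152.51325 and 152.1345 seconds and a difference of about a quarter of a percent.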


Re: [PATCH] Handle COND_EXPR/VEC_COND_EXPR in walk_stmt_load_store_addr_ops and ssa verification

2011-10-14 Thread Richard Guenther
On Thu, 13 Oct 2011, Jakub Jelinek wrote:

> Hi!
> 
> Andrew mentioned on IRC he found walk_stmt_load_store_addr_ops
> doesn't handle COND_EXPR weirdo first argument well, the following
> patch is an attempt to handle that.
> 
> I've noticed similar spot in verify_ssa, though in that case I'm not
> sure about whether the change is so desirable, as it doesn't seem to
> handle SSA_NAMEs embedded in MEM_EXPRs, ARRAY_REFs etc. either.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> Or just the gimple.c part?

The verify-ssa code is somewhat odd, I'd have expected a
if (count != 0)
  error ();
after that loop, but that of course would have triggered already ;)

The code tries to be more something like verify_operands () which
verifies that update_stmt () was called.  Thus I'd say we
should rather (at the end of processing the stmt) do sth like

saved_need_update = need_ssa_update ();
need_ssa_update = false;
record-state-of-use-operands
update_stmt
compare state-of-use-operands
assert (!need_ssa_update ());
need_ssa_update = saved_need_update;

unfortunately update_stmt may change the operand list even
if no changes occur (IIRC).

But I'm not sure.  I think we should delete this check from
verify_ssa and instead have a corresponding check in
verify_stmts (which already properly walks trees) that
for an SSA name we encounter we do have a properly linked use
(see verify_expr, maybe it's easy to do that for the SSA_NAME
case - at least it's easy without trying to avoid a
FOR_EACH_SSA_USE_OPERAND (, SSA_OP_USE) on the stmt for
each SSA_NAME we encounter).

The gimple.c part is ok.

Thanks,
Richard.

> 2011-10-13  Jakub Jelinek  
> 
>   * gimple.c (walk_stmt_load_store_addr_ops): Call visit_addr
>   also on COND_EXPR/VEC_COND_EXPR comparison operands if they are
>   ADDR_EXPRs.
> 
>   * tree-ssa.c (verify_ssa): For COND_EXPR/VEC_COND_EXPR count
>   SSA_NAMEs in comparison operand as well.
> 
> --- gcc/gimple.c.jj   2011-10-13 11:13:39.0 +0200
> +++ gcc/gimple.c  2011-10-13 11:15:25.0 +0200
> @@ -5313,9 +5313,24 @@ walk_stmt_load_store_addr_ops (gimple st
>  || gimple_code (stmt) == GIMPLE_COND))
>  {
>for (i = 0; i < gimple_num_ops (stmt); ++i)
> - if (gimple_op (stmt, i)
> - && TREE_CODE (gimple_op (stmt, i)) == ADDR_EXPR)
> -   ret |= visit_addr (stmt, TREE_OPERAND (gimple_op (stmt, i), 0), data);
> + {
> +   tree op = gimple_op (stmt, i);
> +   if (op == NULL_TREE)
> + ;
> +   else if (TREE_CODE (op) == ADDR_EXPR)
> + ret |= visit_addr (stmt, TREE_OPERAND (op, 0), data);
> +   /* COND_EXPR and VCOND_EXPR rhs1 argument is a comparison
> +  tree with two operands.  */
> +   else if (i == 1 && COMPARISON_CLASS_P (op))
> + {
> +   if (TREE_CODE (TREE_OPERAND (op, 0)) == ADDR_EXPR)
> + ret |= visit_addr (stmt, TREE_OPERAND (TREE_OPERAND (op, 0),
> +0), data);
> +   if (TREE_CODE (TREE_OPERAND (op, 1)) == ADDR_EXPR)
> + ret |= visit_addr (stmt, TREE_OPERAND (TREE_OPERAND (op, 1),
> +0), data);
> + }
> + }
>  }
>else if (is_gimple_call (stmt))
>  {
> --- gcc/tree-ssa.c.jj 2011-10-07 10:03:28.0 +0200
> +++ gcc/tree-ssa.c2011-10-13 11:19:30.0 +0200
> @@ -1069,14 +1069,27 @@ verify_ssa (bool check_modified_stmt)
> for (i = 0; i < gimple_num_ops (stmt); i++)
>   {
> op = gimple_op (stmt, i);
> -   if (op && TREE_CODE (op) == SSA_NAME && --count < 0)
> +   if (op == NULL_TREE)
> + continue;
> +   if (TREE_CODE (op) == SSA_NAME)
> + --count;
> +   /* COND_EXPR and VCOND_EXPR rhs1 argument is a comparison
> +  tree with two operands.  */
> +   else if (i == 1 && COMPARISON_CLASS_P (op))
>   {
> -   error ("number of operands and imm-links don%'t agree"
> -  " in statement");
> -   print_gimple_stmt (stderr, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
> -   goto err;
> +   if (TREE_CODE (TREE_OPERAND (op, 0)) == SSA_NAME)
> + --count;
> +   if (TREE_CODE (TREE_OPERAND (op, 1)) == SSA_NAME)
> + --count;
>   }
>   }
> +   if (count < 0)
> + {
> +   error ("number of operands and imm-links don%'t agree"
> +  " in statement");
> +   print_gimple_stmt (stderr, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
> +   goto err;
> + }
>  
> FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE|SSA_OP_VUSE)
>   {
> 
>   Jakub
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

resent [PATCH] Fix PR50496

2011-10-14 Thread Markus Trippelsdorf
This patch, originally from Chung-Lin Tang, fixes PR50496.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50496

Can someone please review and commit it? 
Thanks.

Bootstrapped and tested on x86_64-pc-linux-gnu.

PR middle-end/50496
* cfgrtl.c (try_redirect_by_replacing_jump): Treat EXIT_BLOCK_PTR case
separately before call to redirect_jump(). Add assertion.
(patch_jump_insn): Same.

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index b3f045b..57f561f 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -846,11 +846,10 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
   if (dump_file)
fprintf (dump_file, "Redirecting jump %i from %i to %i.\n",
 INSN_UID (insn), e->dest->index, target->index);
-  if (!redirect_jump (insn, block_label (target), 0))
-   {
- gcc_assert (target == EXIT_BLOCK_PTR);
- return NULL;
-   }
+  if (target == EXIT_BLOCK_PTR)
+   return NULL;
+  if (! redirect_jump (insn, block_label (target), 0))
+   gcc_unreachable ();
 }
 
   /* Cannot do anything for target exit block.  */
@@ -1030,11 +1029,10 @@ patch_jump_insn (rtx insn, rtx old_label, basic_block new_bb)
  /* If the substitution doesn't succeed, die.  This can happen
 if the back end emitted unrecognizable instructions or if
 target is exit block on some arches.  */
- if (!redirect_jump (insn, block_label (new_bb), 0))
-   {
- gcc_assert (new_bb == EXIT_BLOCK_PTR);
- return false;
-   }
+ if (new_bb == EXIT_BLOCK_PTR)
+   return false;
+ if (! redirect_jump (insn, block_label (new_bb), 0))
+   gcc_unreachable ();
}
 }
   return true;

-- 
Markus


[Patch Darwin/PPC] implement out-of-line FPR/GPR saves/restores.

2011-10-14 Thread Iain Sandoe
We've been building the FPR routines for ages (for compatibility with
system tools). This patch implements their use and adds the GPR routines.
The latter give an appreciable reduction in code size; the former are
neutral for non-FP-intensive code but, for example, appreciably reduce
the size of gnat1.

This has been in my local tree for quite a few bootstrap/regtest
cycles (incl. Ada and Java).

OK for trunk?
Iain

gcc:

	* config/rs6000/t-darwin (LIB2FUNCS_STATIC_EXTRA): Move
	darwin-fpsave.asm from here to ... LIB2FUNCS_EXTRA.
	(LIB2FUNCS_EXTRA): Add darwin-gpsave.asm.
	(TARGET_LIBGCC2_CFLAGS): Ensure that fPIC and -pipe are inherited from
	config/t-darwin.
	* config/rs6000/darwin.h (FP_SAVE_INLINE): Adjust to enable.
	(GP_SAVE_INLINE): Likewise.
	(SAVE_FP_PREFIX, SAVE_FP_SUFFIX, RESTORE_FP_PREFIX, RESTORE_FP_SUFFIX):
	Set to empty strings.
	* config/rs6000/rs6000.c (rs6000_savres_strategy): Implement for
	Darwin.
	(debug_stack_info): Print savres_strategy.
	(rs6000_savres_routine_name): Implement for Darwin.
	(rs6000_make_savres_rtx): Adjust used register for Darwin.
	(rs6000_emit_prologue): Implement out-of-line saves for Darwin.
	(rs6000_output_function_prologue): Do not emit .extern for Mach-O
	targets.
	(rs6000_emit_epilogue): Implement out-of-line saves for Darwin.
	* config/rs6000/darwin-gpsave.asm: New file.
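
To make the new cutoffs in the darwin.h hunk easier to read, here is a small standalone sketch. The two macros are copied from the patch; the helper functions are hypothetical, and the register numbering (GPRs as hard regs 0..31, FPRs as 32..63, FIRST_REG being the lowest-numbered register to save) is an assumption on my part rather than something the patch states.

```c
#include <assert.h>
#include <stdbool.h>

/* Cutoff macros as they appear in the darwin.h hunk of this patch.  */
#define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) > 60 && (FIRST_REG) < 64)
#define GP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) > 29 && (FIRST_REG) < 32)

/* Assumption: GPRs are hard regs 0..31 and FPRs are 32..63, and
   FIRST_REG is the lowest-numbered register that must be saved.  */
static int gprs_to_save (int first_reg)
{
  assert (first_reg >= 0 && first_reg <= 32);
  return 32 - first_reg;
}

static int fprs_to_save (int first_reg)
{
  assert (first_reg >= 32 && first_reg <= 64);
  return 64 - first_reg;
}

/* Thin wrappers so the cutoffs can be queried as functions.  */
static bool gp_save_inline_p (int first_reg)
{
  return GP_SAVE_INLINE (first_reg);
}

static bool fp_save_inline_p (int first_reg)
{
  return FP_SAVE_INLINE (first_reg);
}
```

Under those assumptions, and as far as the macros alone are concerned, at most two GPRs or three FPRs are still saved inline; larger save sets no longer satisfy the inline test.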

Index: gcc/config/rs6000/t-darwin
===
--- gcc/config/rs6000/t-darwin  (revision 179962)
+++ gcc/config/rs6000/t-darwin  (working copy)
@@ -19,21 +19,21 @@
 
 LIB2FUNCS_EXTRA = $(srcdir)/config/rs6000/darwin-tramp.asm \
$(srcdir)/config/darwin-64.c \
+   $(srcdir)/config/rs6000/darwin-fpsave.asm  \
+   $(srcdir)/config/rs6000/darwin-gpsave.asm  \
$(srcdir)/config/rs6000/darwin-world.asm
 
 LIB2FUNCS_STATIC_EXTRA = \
-   $(srcdir)/config/rs6000/darwin-fpsave.asm  \
$(srcdir)/config/rs6000/darwin-vecsave.asm
 
-# The .asm files above are designed to run on all processors,
-# even though they use AltiVec instructions.  -Wa is used because
-# -force_cpusubtype_ALL doesn't work with -dynamiclib.
-#
-# -pipe because there's an assembler bug, 4077127, which causes
-# it to not properly process the first # directive, causing temporary
-# file names to appear in stabs, causing the bootstrap to fail.  Using -pipe
-# works around this by not having any temporary file names.
-TARGET_LIBGCC2_CFLAGS = -Wa,-force_cpusubtype_ALL -pipe -mmacosx-version-min=10.4
+# The .asm files above are designed to run on all processors, even though
+# they use AltiVec instructions.
+# -Wa is used because -force_cpusubtype_ALL doesn't work with -dynamiclib.
+# -mmacosx-version-min=10.4 is used to provide compatibility for code from
+# earlier OSX versions.
 
+TARGET_LIBGCC2_CFLAGS += -Wa,-force_cpusubtype_ALL -mmacosx-version-min=10.4
+
 darwin-fpsave.o:   $(srcdir)/config/rs6000/darwin-asm.h
+darwin-gpsave.o:   $(srcdir)/config/rs6000/darwin-asm.h
 darwin-tramp.o:$(srcdir)/config/rs6000/darwin-asm.h
Index: gcc/config/rs6000/darwin.h
===
--- gcc/config/rs6000/darwin.h  (revision 179962)
+++ gcc/config/rs6000/darwin.h  (working copy)
@@ -173,18 +173,27 @@ extern int darwin_emit_branch_islands;
   (RS6000_ALIGN (crtl->outgoing_args_size, 16) \
+ (STACK_POINTER_OFFSET))
 
-/* Define cutoff for using external functions to save floating point.
-   Currently on Darwin, always use inline stores.  */
+/* Define cutoff for using out-of-line functions to save registers.
+   Currently on Darwin, we implement FP and GPR out-of-line-saves plus the
+   special routine for 'save everything'.  */
 
-#undef FP_SAVE_INLINE
-#define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 64)
+#undef FP_SAVE_INLINE
+#define FP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) > 60 && (FIRST_REG) < 64)
+
 #undef GP_SAVE_INLINE
-#define GP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) < 32)
+#define GP_SAVE_INLINE(FIRST_REG) ((FIRST_REG) > 29 && (FIRST_REG) < 32)
 
 /* Darwin uses a function call if everything needs to be saved/restored.  */
+
 #undef WORLD_SAVE_P
 #define WORLD_SAVE_P(INFO) ((INFO)->world_save_p)
 
+/* We don't use these on Darwin, they are just place-holders.  */
+#define SAVE_FP_PREFIX ""
+#define SAVE_FP_SUFFIX ""
+#define RESTORE_FP_PREFIX ""
+#define RESTORE_FP_SUFFIX ""
+
 /* The assembler wants the alternate register names, but without
leading percent sign.  */
 #undef REGISTER_NAMES
@@ -234,12 +243,6 @@ extern int darwin_emit_branch_islands;
 #undef ASM_COMMENT_START
 #define ASM_COMMENT_START ";"
 
-/* FP save and restore routines.  */
-#define SAVE_FP_PREFIX "._savef"
-#define SAVE_FP_SUFFIX ""
-#define RESTORE_FP_PREFIX "._restf"
-#define RESTORE_FP_SUFFIX ""
-
 /* This is how to output an assembler line that says to advance
the 

Re: [Patch Darwin/PPC] implement out-of-line FPR/GPR saves/restores.

2011-10-14 Thread Mike Stump
On Oct 14, 2011, at 2:05 AM, Iain Sandoe wrote:
> This implements their use and also the GPRs - the latter makes an appreciable 
> reduction in code size,

> OK for trunk?

Ok.  Watch for problems with async stack walking (hitting sample in Activity 
Monitor, or the walking done by CrashReporter)...  that's the only thing I can 
think of that might be strange.


[Patch Darwin/PR49992 1/2] remove ranlib special-casing from the darwin port.

2011-10-14 Thread Iain Sandoe
As per the PR audit trail, there is no reason to retain this special-casing
for Darwin (given that current GCC is not buildable using Darwin toolsets of
the vintage that required the case).


Mike has OK'd this off-list - but, since Ralf commented on the  
previous version, I'd like to give him the opportunity to comment here.

OK for trunk?
Iain

* configure.ac: Remove ranlib special case for Darwin port.
* gcc/configure.ac: Likewise.
* configure: Regenerate.
* gcc/configure: Regenerate.

Index: configure.ac
===
--- configure.ac(revision 179962)
+++ configure.ac(working copy)
@@ -2274,10 +2274,6 @@ case "${target}" in
 extra_arflags_for_target=" -X32_64"
 extra_nmflags_for_target=" -B -X32_64"
 ;;
-  *-*-darwin[[3-9]]*)
-# ranlib before Darwin10 requires the -c flag to look at common symbols.
-extra_ranlibflags_for_target=" -c"
-;;
 esac

 alphaieee_frag=/dev/null
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 179962)
+++ gcc/configure.ac(working copy)
@@ -829,17 +829,7 @@ esac
 gcc_AC_PROG_LN_S
 ACX_PROG_LN($LN_S)
 AC_PROG_RANLIB
-case "${host}" in
-*-*-darwin*)
-  # By default, the Darwin ranlib will not treat common symbols as
-  # definitions when  building the archive table of contents.  Other
-  # ranlibs do that; pass an option to the Darwin ranlib that makes
-  # it behave similarly.
-  ranlib_flags="-c"
-  ;;
-*)
-  ranlib_flags=""
-esac
+ranlib_flags=""
 AC_SUBST(ranlib_flags)

 gcc_AC_PROG_INSTALL





[Patch Darwin/PR49992 2/2] remove ranlib special-casing from the darwin port.

2011-10-14 Thread Iain Sandoe
As per the PR audit trail, there is no reason to retain this in the
building of GCC.

As for its use as a general option in tool builds: with current Darwin
toolsets it has the potential to cause issues when using convenience libs
containing common symbols.

OK for trunk?
Iain

gcc/ada:

PR target/49992
* mlib-tgt-specific-darwin.adb: Remove ranlib special case.
* gcc-interface/Makefile.in (darwin): Likewise.


Index: gcc/ada/mlib-tgt-specific-darwin.adb
===
--- gcc/ada/mlib-tgt-specific-darwin.adb(revision 179962)
+++ gcc/ada/mlib-tgt-specific-darwin.adb(working copy)
@@ -68,7 +68,7 @@ package body MLib.Tgt.Specific is

function Archive_Indexer_Options return String_List_Access is
begin
-  return new String_List'(1 => new String'("-c"));
+  return new String_List'(1 => new String'(""));
end Archive_Indexer_Options;

---
Index: gcc/ada/gcc-interface/Makefile.in
===
--- gcc/ada/gcc-interface/Makefile.in   (revision 179962)
+++ gcc/ada/gcc-interface/Makefile.in   (working copy)
@@ -2179,7 +2179,6 @@ ifeq ($(strip $(filter-out darwin%,$(osys))),)

   EH_MECHANISM=-gcc
   GNATLIB_SHARED = gnatlib-shared-darwin
-  RANLIB = ranlib -c
   GMEM_LIB = gmemlib
   LIBRARY_VERSION := $(LIB_VERSION)
   soext = .dylib




[PATCH] Fix PR50723, GSI_NEW subtlety

2011-10-14 Thread Richard Guenther

This fixes PR50723, a common error when inserting stmts after some
others and using GSI_NEW (which will make the iterator point to
the _first_ newly inserted stmt).
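
The subtlety is easier to see in a toy model. The sketch below is not the GIMPLE iterator API: a statement sequence is modelled as a string with one character per statement, the iterator is an index, and `TOY_NEW_STMT` / `TOY_CONTINUE_LINKING` are hypothetical names that only mimic the two update modes as described above (iterator left on the first versus the last newly inserted statement).

```c
#include <assert.h>
#include <string.h>

/* Toy model only: a statement sequence is a string, one char per stmt,
   and the iterator is an index into it.  These names are hypothetical.  */
enum update_mode { TOY_NEW_STMT, TOY_CONTINUE_LINKING };

struct seq { char stmts[32]; size_t len; };

/* Insert all of NEW_STMTS after position *IT and update *IT per MODE.  */
static void insert_after (struct seq *s, size_t *it,
                          const char *new_stmts, enum update_mode mode)
{
  size_t n = strlen (new_stmts);
  size_t at = *it + 1;
  assert (s->len + n < sizeof s->stmts);
  /* Shift the tail (including the terminating NUL) to make room.  */
  memmove (s->stmts + at + n, s->stmts + at, s->len - at + 1);
  memcpy (s->stmts + at, new_stmts, n);
  s->len += n;
  /* NEW_STMT: first inserted stmt; CONTINUE_LINKING: last one.  */
  *it = (mode == TOY_NEW_STMT) ? at : at + n - 1;
}

/* Emit two argument-setup stmts "ab", then the call "C" that uses them.  */
static void emit_args_then_call (struct seq *s, enum update_mode mode)
{
  size_t it = 0;                      /* points at the existing stmt 'x' */
  insert_after (s, &it, "ab", mode);
  insert_after (s, &it, "C", TOY_NEW_STMT);
}
```

Starting from the sequence "x", the first mode yields "xaCb" (the call lands before the second setup statement), while the second yields "xabC". That is the ordering problem in miniature: a later insert-after only follows the whole new sequence if the iterator was left on its last statement.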

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2011-10-14  Richard Guenther  

PR tree-optimization/50723
* ipa-split.c (split_function): Use GSI_CONTINUE_LINKING.

* gcc.dg/torture/pr50723.c: New testcase.

Index: gcc/ipa-split.c
===
*** gcc/ipa-split.c (revision 179962)
--- gcc/ipa-split.c (working copy)
*** split_function (struct split_point *spli
*** 1134,1140 
  if (!is_gimple_val (arg))
{
arg = force_gimple_operand_gsi (&gsi, arg, true, NULL_TREE,
!   false, GSI_NEW_STMT);
VEC_replace (tree, args_to_pass, i, arg);
}
call = gimple_build_call_vec (node->decl, args_to_pass);
--- 1134,1140 
  if (!is_gimple_val (arg))
{
arg = force_gimple_operand_gsi (&gsi, arg, true, NULL_TREE,
!   false, GSI_CONTINUE_LINKING);
VEC_replace (tree, args_to_pass, i, arg);
}
call = gimple_build_call_vec (node->decl, args_to_pass);
Index: gcc/testsuite/gcc.dg/torture/pr50723.c
===
*** gcc/testsuite/gcc.dg/torture/pr50723.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr50723.c  (revision 0)
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ 
+ typedef short unsigned int wchar_t;
+ typedef unsigned int size_t;
+ int* _errno(void);
+ int WideCharToMultiByte (wchar_t *);
+ int __attribute__ ((__nonnull__ (1)))
+ __wcrtomb_cp (char *dst, wchar_t wc, const unsigned int cp,
+ const unsigned int mb_max)
+ {
+   if (cp == 0) {
+   if (wc > 255) 
+   (*_errno()) = 42;
+   return 1;
+   }
+   else 
+ return WideCharToMultiByte (&wc);
+ }
+ void wcsrtombs (char *dst, const wchar_t *pwc, unsigned int cp,
+   unsigned int mb_max)
+ {
+   if ((__wcrtomb_cp (dst, *pwc, cp, mb_max)) <= 0)
+ return;
+   if ((__wcrtomb_cp (dst, *pwc, cp, mb_max)) <= 0)
+ return;
+ }


Re: [Patch,AVR] Fix PR46278, Take #3

2011-10-14 Thread Denis Chertykov
2011/10/14 Georg-Johann Lay :
> Weddington, Eric schrieb:
>>
>>> This is yet another attempt to fix PR46278 (fake X addressing).
>>>
>>> After the previous clean-ups it is just a small change.
>>>
>>> caller-saves.c tries to eliminate call-clobbered hard-regs allocated to
>>> pseudos around function calls, and that leads to situations where reload
>>> is no longer able to perform all requested spills because AVR has so few
>>> address registers.
>>>
>>> Thus, the patch adds a new target option -mstrict-X so that the user can
>>> turn that option on if desired, and then -fcaller-save is disabled.
>>>
>>> The patch passes the testsuite without regressions. Moreover, the
>>> testsuite passes without regressions if all test cases are run with
>>> -mstrict-X and all libraries (libgcc, avr-libc) are built with the new
>>> option turned on.
>>
>> Hi Johann,
>>
>> Sorry, I haven't been keeping up with the discussion on this PR.
>>
>> But if all test cases pass with running -mstrict-X and everything built with
>> that option on, then why is this even an option? Is it because that it may
>> not always reduce code size?...
>
> An alternative would be to set -mstrict-X per default if -O or higher.
> Let's see what Denis thinks.

I think these are just great results.
I vote for committing this patch.
About "to set -mstrict-X per default": if it's possible to print
something like "Please use -mno-strict-X" instead of "Spill error
failure" then we can use -mstrict-X by default.
I.e., how can the user learn about the correlation between "Spill
error..." and -mstrict-X?

Denis.


Re: [PR50672, PATCH] Fix ice triggered by -ftree-tail-merge: verify_ssa failed: no immediate_use list

2011-10-14 Thread Richard Guenther
On Fri, Oct 14, 2011 at 1:12 AM, Tom de Vries  wrote:
> On 10/12/2011 02:19 PM, Richard Guenther wrote:
>> On Wed, Oct 12, 2011 at 8:35 AM, Tom de Vries  wrote:
>>> Richard,
>>>
>>> I have a patch for PR50672.
>>>
>>> When compiling the testcase from the PR with -ftree-tail-merge, the 
>>> scenario is
>>> as follows:
>>>
>>> We start out tail_merge_optimize with blocks 14 and 20, which are alike, 
>>> but not
>>> equal, since they have different successors:
>>> ...
>>>  # BLOCK 14 freq:690
>>>  # PRED: 25 [61.0%]  (false,exec)
>>>
>>>  if (wD.2197_57(D) != 0B)
>>>    goto ;
>>>  else
>>>    goto ;
>>>  # SUCC: 15 [78.4%]  (true,exec) 16 [21.6%]  (false,exec)
>>>
>>>
>>>  # BLOCK 20 freq:2900
>>>  # PRED: 29 [100.0%]  (fallthru) 31 [100.0%]  (fallthru)
>>>
>>>  # .MEMD.2447_209 = PHI <.MEMD.2447_125(29), .MEMD.2447_129(31)>
>>>  if (wD.2197_57(D) != 0B)
>>>    goto ;
>>>  else
>>>    goto ;
>>>  # SUCC: 5 [85.0%]  (true,exec) 6 [15.0%]  (false,exec)
>>> ...
>>>
>>> In the first iteration, we merge block 5 with block 15 and block 6 with 
>>> block
>>> 16. After that, the blocks 14 and 20 are equal.
>>>
>>> In the second iteration, the blocks 14 and 20 are merged, by redirecting the
>>> incoming edges of block 20 to block 14, and removing block 20.
>>>
>>> Block 20 also contains the definition of .MEMD.2447_209. Removing the 
>>> definition
>>> delinks the vuse of .MEMD.2447_209 in block 5:
>>> ...
>>>  # BLOCK 5 freq:6036
>>>  # PRED: 20 [85.0%]  (true,exec)
>>>
>>>  # PT = nonlocal escaped
>>>  D.2306_58 = &thisD.2200_10(D)->D.2156;
>>>  # .MEMD.2447_132 = VDEF <.MEMD.2447_209>
>>>  # USE = anything
>>>  # CLB = anything
>>>  drawLineD.2135 (D.2306_58, wD.2197_57(D), gcD.2198_59(D));
>>>  goto ;
>>>  # SUCC: 17 [100.0%]  (fallthru,exec)
>>> ...
>>
>> And block 5 is retained and block 15 is discarded?
>>
>
> Indeed.
>
>>> After the pass, when executing the TODO_update_ssa_only_virtuals, we update 
>>> the
>>> drawLine call in block 5 using rewrite_update_stmt, which calls
>>> maybe_replace_use for the vuse operand.
>>>
>>> However, maybe_replace_use doesn't have an effect since the old vuse and 
>>> the new
>>> vuse happen to be the same (rdef == use), so SET_USE is not called and the 
>>> vuse
>>> remains delinked:
>>> ...
>>>  if (rdef && rdef != use)
>>>    SET_USE (use_p, rdef);
>>> ...
>>>
>>> The patch fixes this by forcing SET_USE for delinked uses.
>>
>> That isn't the correct fix.  Whoever unlinks the vuse (by removing its
>> definition) has to replace it with something valid, which is either the
>> bare symbol .MEM, or the VUSE associated with the removed VDEF
>> (thus, as unlink_stmt_vdef does).
>>
>
> Another try. For each deleted bb, we call unlink_stmt_vdef for the statements,
> and replace the .MEM phi uses with the bare .MEM symbol.
>
> Bootstrapped and reg-tested on x86_64.
>
> Ok for trunk?

Better.  For

+
+  FOR_EACH_IMM_USE_STMT (use_stmt, iter, res)
+   {
+ FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+   SET_USE (use_p, SSA_NAME_VAR (res));
+   }

you can use mark_virtual_phi_result_for_renaming (phi) instead.

+  for (i = gsi_last_bb (bb); !gsi_end_p (i); gsi_prev_nondebug (&i))
+unlink_stmt_vdef (gsi_stmt (i));

is that actually necessary?  That is, isn't the block that follows a
deleted block always starting with a virtual PHI?  If not it should
be enough to walk to the first stmt that uses a virtual operand
and similar to the PHI case replace all its uses with the bare
symbol.  But as I said, I believe handling PHIs should be sufficient?

Thanks,
Richard.


> Thanks,
> - Tom
>
>> Richard.
>>
>
>
> 2011-10-14  Tom de Vries  
>
>        PR tree-optimization/50672
>        * tree-ssa-tail-merge.c (release_vdefs): New function.
>        (purge_bbs): Add update_vops parameter.  Call release_vdefs for each
>        deleted basic block.
>        (tail_merge_optimize): Add argument to call to purge_bbs.
>


Re: RFC: Add ADD_RESTRICT tree code

2011-10-14 Thread Richard Guenther
On Wed, Oct 12, 2011 at 7:16 PM, Michael Matz  wrote:
> Hi,
>
> this adds a means to retain restrict information without relying on
> restrict casts.  In the patch it's emitted by the gimplifier when it sees
> a norestrict->restrict cast (which from then on is useless), at which
> point also the tag of that restrict pointer is generated.  That's later
> used by the aliasing machinery to associate it with a restrict tag uid.
>
> In particular it will be possible to associate pointers coming from
> different inline instances of the same function with the same restrict tag,
> and hence make them conflict.
>
> This patch will fix the currently XFAILed tree-ssa/restrict-4.c again, as
> well as fix PR 50419.  It also still fixes the original testcase of
> PR 49279.  But it will break the checked in testcase for this bug report.
> I believe the checked in testcase is invalid as follows:
>
> struct S { int a; int *__restrict p; };
>
> int
> foo (int *p, int *q)
> {
>  struct S s, *t;
>  s.a = 1;
>  s.p = p;       // 1
>  t = wrap(&s);  // 2 t=&s in effect, but GCC doesn't see this
>  t->p = q;      // 3
>  s.p[0] = 0;    // 4
>  t->p[0] = 1;   // 5
>  return s.p[0]; // 6
> }
>
> Assignment 2 means that t->p points to s.p.  Assignment 3 changes t->p and
> s.p, but the change to s.p doesn't occur through a pointer based on t->p
> or any other restrict pointer, in fact it doesn't occur through any
> explicit initialization or assignment, but rather through an indirect
> access via a different pointer.  Hence the accesses to the same memory
> object at s.p[0] and t->p[0] were undefined because both accesses weren't
> through pointers based on each other.

Ick, that actually shows a bug in points-to handling (well, in
pt_solutions_same_restrict_base handling).  While we correctly
see that p escapes through wrap() and that t points to ESCAPED
we don't check in

bool
pt_solutions_same_restrict_base (struct pt_solution *pt1,
 struct pt_solution *pt2)
{
  /* If we deal with points-to solutions of two restrict qualified
 pointers solely rely on the pointed-to variable bitmap intersection.
 For two pointers that are based on each other the bitmaps will
 intersect.  */
  if (pt1->vars_contains_restrict
  && pt2->vars_contains_restrict)
{
  gcc_assert (pt1->vars && pt2->vars);
  return bitmap_intersect_p (pt1->vars, pt2->vars);
}

whether the solutions overlap in the ESCAPED bit (remember
ESCAPED contents are not expanded into the pointers pt->vars
bitmaps but just noted as the pt->escaped flag).  We ignore
pt->nonlocal as well, but that was by design ... so, re-try with

Index: tree-ssa-structalias.c
===
--- tree-ssa-structalias.c  (revision 179962)
+++ tree-ssa-structalias.c  (working copy)
@@ -6079,12 +6079,15 @@ pt_solutions_intersect_1 (struct pt_solu
 return true;

   /* If either points to unknown global memory and the other points to
- any global memory they alias.  */
-  if ((pt1->nonlocal
-   && (pt2->nonlocal
-  || pt2->vars_contains_global))
-  || (pt2->nonlocal
- && pt1->vars_contains_global))
+ any global memory they alias.  If both points-to sets are based
+ off a restrict qualified pointer ignore any overlaps with NONLOCAL.  */
+  if (!(pt1->vars_contains_restrict
+   && pt2->vars_contains_restrict)
+  && ((pt1->nonlocal
+  && (pt2->nonlocal
+  || pt2->vars_contains_global))
+ || (pt2->nonlocal
+ && pt1->vars_contains_global)))
 return true;

   /* Check the escaped solution if required.  */
@@ -6148,18 +6151,7 @@ bool
 pt_solutions_same_restrict_base (struct pt_solution *pt1,
 struct pt_solution *pt2)
 {
-  /* If we deal with points-to solutions of two restrict qualified
- pointers solely rely on the pointed-to variable bitmap intersection.
- For two pointers that are based on each other the bitmaps will
- intersect.  */
-  if (pt1->vars_contains_restrict
-  && pt2->vars_contains_restrict)
-{
-  gcc_assert (pt1->vars && pt2->vars);
-  return bitmap_intersect_p (pt1->vars, pt2->vars);
-}
-
-  return true;
+  return pt_solutions_intersect (pt1, pt2);
 }



> I expect some bike shedding about the name of the tree code, hence only
> RFC, no Changelog or anything.  It's only lightly tested in so far as it's
> currently in the target libs of a regstrap on x86_64-linux.
>
>
> Ciao,
> Michael.
>
> Index: tree-pretty-print.c
> ===
> *** tree-pretty-print.c (revision 179855)
> --- tree-pretty-print.c (working copy)
> *** dump_generic_node (pretty_printer *buffe
> *** 1708,1713 
> --- 1708,1721 
>        pp_character (buffer, '>');
>        break;
>
> +     case ADD_RESTRICT:
> +       pp_string (buffer, "ADD_RESTRICT <");
> +       dump_generic_node (buf

Re: Vector alignment tracking

2011-10-14 Thread Richard Guenther
On Thu, Oct 13, 2011 at 6:57 PM, Andi Kleen  wrote:
>> Or am I missing something?
>
> I often see the x86 vectorizer with -mtune=generic generate a lot of
> complicated code just to adjust for potential misalignment.
>
> My thought was just if the alias oracle knows what the original
> declaration is, and it's available for changes (e.g. LTO), it would
> likely be better to just add an __attribute__((aligned()))
> there.
>
> In the general case it's probably harder, you would need some
> cost model to decide when it's worth it.
>
> Your approach of course would still be needed for cases where this
> isn't possible. But it sounded like the infrastructure you're building
> could in principle do both.

The vectorizer already does that.
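
For reference, raising the alignment of a declaration looks roughly like this at the source level. It is only a sketch of the attribute being discussed, not of what the vectorizer does internally; `samples`, `is_aligned` and the 32-byte figure are illustrative choices.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative declaration: request 32-byte alignment so that aligned
   256-bit vector accesses are possible.  The name and the size are
   arbitrary choices for this sketch.  */
static float samples[1024] __attribute__ ((aligned (32)));

/* Return nonzero if P is a multiple of ALIGN bytes.  */
static int is_aligned (const void *p, uintptr_t align)
{
  return ((uintptr_t) p % align) == 0;
}
```

A caller can confirm the effect with `is_aligned (samples, 32)`.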

Richard.

> -Andi
>


[Patch, Fortran] PR50718 Fix -fcheck=pointer 4.6/4.7 regression

2011-10-14 Thread Tobias Burnus

Hello,

while testing my constructor draft patch with FGSL, I found two bugs 
(4.6/4.7 regressions): PR target/50721 (segfault on x86-64 after 
execution - one of the rare -O0 only bugs) - and while trying to debug 
it, I found this bug.


The problem is that for -fcheck=pointer, gfortran expects a pointer for 
the "if (actual_arg == NULL)" check; however, if the dummy has the VALUE 
attribute, one has "*actual_arg". The solution is to strip off the "*" 
for the pointer check. Without the patch, one gets an ICE when comparing 
a derived type (instead of a pointer to a DT) with NULL.
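
The idea of "stripping the *" can be sketched with a toy expression tree. None of this is the gfortran front end: `struct expr`, `build_addr` and `pointer_for_null_check` are hypothetical, and the folding of "&*p" back to "p" is my assumption about what taking the address of the by-value argument amounts to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree; these are not GCC's tree nodes.  */
enum kind { VAR, DEREF, ADDR };

struct expr
{
  enum kind kind;
  bool is_pointer;          /* does the expression have pointer type?  */
  struct expr *operand;     /* NULL for VAR  */
};

/* Build "&e", folding "&*p" back to "p".  STORAGE receives the new
   node when no folding is possible.  */
static struct expr *build_addr (struct expr *e, struct expr *storage)
{
  if (e->kind == DEREF)
    return e->operand;
  storage->kind = ADDR;
  storage->is_pointer = true;
  storage->operand = e;
  return storage;
}

/* Return the expression that a "== NULL" check should test.  */
static struct expr *pointer_for_null_check (struct expr *arg,
                                            struct expr *storage)
{
  if (arg->is_pointer)
    return arg;
  return build_addr (arg, storage);
}
```

For an argument of the form "*y", the check is made against "y" itself; an argument that already has pointer type is used unchanged.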


While writing the test case, I realized that the %VAL() method of
generating a nonpointer should also be tested. As the
actual argument in the test case (*_12.f90) is not a record type (DT)
but a simple integer, comparing "%VAL(p) == NULL" works (but is bogus).
Result without the patch: segfault at run time instead of an ICE. With
the patch, one gets the nice run-time error message.


Built and regtested on x86-64-linux.
OK for the trunk and 4.6?

Tobias

PS: *Pre-ping* for the pending review of a simple documentation/flag
patch: http://gcc.gnu.org/ml/fortran/2011-10/msg00073.html

2011-10-14  Tobias Burnus  

	PR fortran/50718
	* trans-expr.c (gfc_conv_procedure_call): Fix -fcheck=pointer
	for dummy arguments with VALUE attribute.

2011-10-14  Tobias Burnus  

	PR fortran/50718
	* gfortran.dg/pointer_check_11.f90: New.
	* gfortran.dg/pointer_check_12.f90: New.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index ca0523f..a3847aa 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -3357,10 +3357,16 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	  else
 		goto end_pointer_check;
 
+	  tmp = parmse.expr;
+
+	  /* If the argument is passed by value, we need to strip the
+		 INDIRECT_REF.  */
+	  if (!POINTER_TYPE_P (TREE_TYPE (parmse.expr)))
+		tmp = gfc_build_addr_expr (NULL_TREE, tmp);
 
 	  cond = fold_build2_loc (input_location, EQ_EXPR,
-  boolean_type_node, parmse.expr,
-  fold_convert (TREE_TYPE (parmse.expr),
+  boolean_type_node, tmp,
+  fold_convert (TREE_TYPE (tmp),
 		null_pointer_node));
 	}
  
--- /dev/null	2011-10-14 07:41:04.295638041 +0200
+++ gcc/gcc/testsuite/gfortran.dg/pointer_check_11.f90	2011-10-14 09:33:41.0 +0200
@@ -0,0 +1,24 @@
+! { dg-do run }
+! { dg-options "-fcheck=all" }
+!
+! { dg-shouldfail "Pointer check" }
+! { dg-output "Fortran runtime error: Pointer actual argument 'y' is not associated" }
+!
+!
+! PR fortran/50718
+!
+! Was failing (ICE) with -fcheck=pointer if the dummy had the value attribute.
+
+type t
+  integer :: p
+end type t
+
+type(t), pointer :: y => null()
+
+call sub(y) ! Invalid: Nonassociated pointer
+
+contains
+  subroutine sub (x)
+type(t), value :: x
+  end subroutine
+end
--- /dev/null	2011-10-14 07:41:04.295638041 +0200
+++ gcc/gcc/testsuite/gfortran.dg/pointer_check_12.f90	2011-10-14 09:42:53.0 +0200
@@ -0,0 +1,22 @@
+! { dg-do run }
+! { dg-options "-fcheck=all" }
+!
+! { dg-shouldfail "Pointer check" }
+! { dg-output "Fortran runtime error: Pointer actual argument 'p' is not associated" }
+!
+! PR fortran/50718
+!
+! Was failing with -fcheck=pointer: Segfault at run time
+
+integer, pointer :: p => null()
+
+call sub2(%val(p)) ! Invalid: Nonassociated pointer
+end
+
+! Not quite correct dummy, but if one uses VALUE, gfortran
+! complains about a missing interface - which we cannot use
+! if we want to use %VAL().
+
+subroutine sub2(p)
+  integer :: p
+end subroutine sub2


Re: [testsuite] require arm_little_endian in two tests

2011-10-14 Thread Julian Brown
On Thu, 13 Oct 2011 16:12:17 +0100
Richard Earnshaw  wrote:

> On 13/10/11 15:56, Joseph S. Myers wrote:
> > Indeed, vector initializers are part of the target-independent GNU
> > C language and have target-independent semantics that the elements
> > go in memory order, corresponding to the target-independent
> > semantics of lane numbers where they appear in GENERIC, GIMPLE and
> > (non-UNSPEC) RTL and any target-independent built-in functions that
> > use such numbers.  (The issue here being, as you saw, that the lane
> > numbers used in ARM-specific NEON intrinsics are for big-endian not
> > the same as those used in target-independent features of GNU C and
> > target-independent internal representations in GCC - hence various
> > code to translate them between the two conventions when processing
> > intrinsics into non-UNSPEC RTL, and to translate back when
> > generating assembly instructions that encode lane numbers with the
> > ARM conventions, as expounded at greater length at
> > .)
> > 
> 
> This is all rather horrible, and leads to THREE different layouts for
> a 128-bit vector for big-endian Neon.
> 
> GCC format
> 'VLD1.n' format
> 'ABI' format
> 
> GCC format and 'ABI' format differ in that the 64-bit words of the
> 128-bit vector are swapped.
> 
> All this and they are all expected to share a single machine mode.
> 
> Furthermore, the definitions in GCC are broken, in that the types
> defined in arm_neon.h (eg int8x16_t) are supposed to be ABI format,
> not GCC format.
> 
> Eukk! :-(

FWIW, I thought long and hard about this problem, and eventually gave
up trying to solve it. Note that many operations which depend on the
ordering of vectors are now disabled entirely (at least for Q regs) in
neon.md in big-endian mode to try and limit the damage. NEON is
basically only supported properly in little-endian mode, IMO.

I'd love to see this resolved properly. Some random observations:

 * The vectorizer can use whatever layout it wants for vectors in
   either endianness. Vectorizer vectors never interact with either
   GCC generic (source-level) vectors, nor the NEON intrinsics. Also
   they never cross ABI boundaries.

 * GCC generic vectors aren't specified very formally, particularly wrt.
   their interaction with NEON intrinsics. If you stick *entirely* to
   accessing vectors via NEON intrinsics, the problems in big-endian
   mode (I think) don't ever materialise. This includes not using
   indirection to load/store vectors, and (of course) not constructing
   vectors using { x, y, z... } syntax. One possibility might be to
   detect and *disallow* code which attempts to mix vector operations
   like that.
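
As a concrete reference point for the "memory order" semantics of vector initializers mentioned in the quoted text, here is a small GNU C sketch. It uses only the generic vector extension (no NEON intrinsics) and says nothing about how the intrinsics number lanes; `v4si` and `store_lanes` are names chosen for this example.

```c
#include <assert.h>
#include <string.h>

/* GNU C generic vector of four 32-bit ints (GCC/Clang extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Copy the vector's bytes out so the caller can inspect memory order.  */
static void store_lanes (const v4si *v, int out[4])
{
  assert (sizeof (int) == 4);
  memcpy (out, v, sizeof *v);
}
```

Initializing `v4si v = { 1, 2, 3, 4 };` and calling `store_lanes (&v, out)` leaves `out` holding the elements in the order written in the initializer.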

I don't quite understand your comment about the GCC definitions of
int8x16_t etc. being broken, tbh...

Cheers,

Julian


[PATCH, i386 tests] New tests to check vectorization for AVX2 insns.

2011-10-14 Thread Kirill Yukhin
Hello guys,
Here is a bunch of tests which check basic vectorization abilities to
generate AVX2 instructions.

testsuite/ChangeLog entry is:
2011-10-14  Kirill Yukhin  

* gcc.target/i386/avx2-vpaddd-3.c: New test.
* gcc.target/i386/avx2-vpaddw-3.c: Ditto.
* gcc.target/i386/avx2-vpaddb-3.c: New.
* gcc.target/i386/avx2-vpaddq-3.c: Ditto.
* gcc.target/i386/avx2-vpand-3.c: Ditto.
* gcc.target/i386/avx2-vpmulld-3.c: Ditto.
* gcc.target/i386/avx2-vpmullw-3.c: Ditto.
* gcc.target/i386/avx2-vpsrad-3.c: Ditto.
* gcc.target/i386/avx2-vpsraw-3.c: Ditto.
* gcc.target/i386/avx2-vpsrld-3.c: Ditto.
* gcc.target/i386/avx2-vpsrlw-3.c: Ditto.
* gcc.target/i386/avx2-vpsubb-3.c: Ditto.
* gcc.target/i386/avx2-vpsubd-3.c: Ditto.
* gcc.target/i386/avx2-vpsubq-3.c: Ditto.
* gcc.target/i386/avx2-vpsubw-3.c: Ditto.

Could you please have a look?

Thanks, K


avx2.vect.tests.gcc.patch
Description: Binary data


Re: resent [PATCH] Fix PR50496

2011-10-14 Thread Eric Botcazou
> This patch, originally from Chung-Lin Tang, fixes PR50496.
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50496
>
> Can someone please review and commit it?

A proper patch submission should include a description of the problem and a 
rationale for the proposed fix (unless it is trivial).  

-- 
Eric Botcazou


Re: [PATCH, Atom] Fix performance regression with -mtune=atom

2011-10-14 Thread Vladimir Yakovlev
This is a ping. The change affects Atom only and was made because it
really gives better performance on this architecture. This
suggests that the old value was simply a typo.
Please take a look.

Vladimir

2011/9/30 Vladimir Yakovlev :
> This patch fixes performance regression with -mtune=atom. Changing
> atom cost removes regression in several tests of EEMBC and spec2000.
> Bootstrap and make check OK both with and without -mtune=atom.
> OK for trunk?
>
> 2011-09-30  Yakovlev Vladimir  vladimir.b.yakov...@intel.com
>
>      * gcc/config/i386/i386.c (atom_cost): Changed cost for loading
>       QImode using movzbl.
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 7e89dbd..8a512a7 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -1672,7 +1672,7 @@ struct processor_costs atom_cost = {
>   COSTS_N_INSNS (1),                   /* cost of movzx */
>   8,                                   /* "large" insn */
>   17,                                  /* MOVE_RATIO */
> -  2,                                /* cost for loading QImode using movzbl */
> +  4,                                   /* cost for loading QImode using movzbl */
>   {4, 4, 4},                           /* cost of loading integer registers
>                                           in QImode, HImode and SImode.
>                                           Relative to reg-reg move (2).  */
>


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-14 Thread Richard Guenther
On Thu, Oct 13, 2011 at 3:25 PM, Kai Tietz  wrote:
> Hello,
>
> this new version addresses the comments from you.
> On gimplify.c's gimplify_expr we didn't handle the case that operands
> for TRUTH-AND/OR/XOR expressions need to have same operand-size in
> case  of transformation to bitwise-binary operation.  This shows up
> for Fortran, as there are more than one boolean-kind type with
> different mode-sizes.  I added a testcase for this,

The gimplify.c bits and the new testcase are ok.  They should have been
submitted separately.

Please re-submit the fold-const.c part.

Thanks,
Richard.

> ChangeLog
>
> 2011-10-13  Kai Tietz  
>
>        * fold-const.c (simple_operand_p_2): New function.
>        (fold_truthop): Rename to
>        (fold_truth_andor_1): function name.
>        Additionally remove branching creation for logical and/or.
>        (fold_truth_andor): Handle branching creation for logical and/or here.
>        * gimplify.c (gimplify_expr): Take care that for bitwise-binary
>        transformation the operands have compatible types.
>
> 2011-10-13  Kai Tietz  
>
>        * gfortran.fortran-torture/compile/logical-2.f90: New test.
>
> Bootstrapped and regression-tested for all languages plus Ada and
> Obj-C++ on x86_64-pc-linux-gnu.
> Ok for apply?
>
> Regards,
> Kai
>
> Index: gcc/gcc/fold-const.c
> ===
> --- gcc.orig/gcc/fold-const.c
> +++ gcc/gcc/fold-const.c
> @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
>  static int all_ones_mask_p (const_tree, int);
>  static tree sign_bit_p (tree, const_tree);
>  static int simple_operand_p (const_tree);
> +static bool simple_operand_p_2 (tree);
>  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
>  static tree range_predecessor (tree);
>  static tree range_successor (tree);
>  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
>  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
> tree, tree);
>  static tree unextend (tree, int, int, tree);
> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
>  static tree optimize_minmax_comparison (location_t, enum tree_code,
>                                        tree, tree, tree);
>  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
> @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
>   return lhs;
>  }
>
> -/* Subroutine for fold_truthop: decode a field reference.
> +/* Subroutine for fold_truth_andor_1: decode a field reference.
>
>    If EXP is a comparison reference, we return the innermost reference.
>
> @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
>   return NULL_TREE;
>  }
>
> -/* Subroutine for fold_truthop: determine if an operand is simple enough
> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
>    to be evaluated unconditionally.  */
>
>  static int
> @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
>   STRIP_NOPS (exp);
>
>   return (CONSTANT_CLASS_P (exp)
> -         || TREE_CODE (exp) == SSA_NAME
> +         || TREE_CODE (exp) == SSA_NAME
>          || (DECL_P (exp)
>              && ! TREE_ADDRESSABLE (exp)
>              && ! TREE_THIS_VOLATILE (exp)
> @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
>                 registers aren't expensive.  */
>              && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
>  }
> +
> +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
> +   to be evaluated unconditionally.
> +   In addition to simple_operand_p, we assume that comparisons and logic-not
> +   operations are simple, if their operands are simple, too.  */
> +
> +static bool
> +simple_operand_p_2 (tree exp)
> +{
> +  enum tree_code code;
> +
> +  /* Strip any conversions that don't change the machine mode.  */
> +  STRIP_NOPS (exp);
> +
> +  code = TREE_CODE (exp);
> +
> +  if (TREE_CODE_CLASS (code) == tcc_comparison)
> +    return (!tree_could_trap_p (exp)
> +           && simple_operand_p_2 (TREE_OPERAND (exp, 0))
> +           && simple_operand_p_2 (TREE_OPERAND (exp, 1)));
> +
> +  if (TREE_SIDE_EFFECTS (exp)
> +      || tree_could_trap_p (exp))
> +    return false;
> +
> +  switch (code)
> +    {
> +    case SSA_NAME:
> +      return true;
> +    case TRUTH_NOT_EXPR:
> +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
> +    case BIT_NOT_EXPR:
> +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
> +       return false;
> +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
> +    default:
> +      return simple_operand_p (exp);
> +    }
> +}
> +
>
>  /* The following functions are subroutines to fold_range_test and allow it to
>    try to change a logical combination of comparisons into a range test.
> @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
>   return 0;
>  }
>
> -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
> +/* Subroutine for 

Re: [Patch,AVR] Fix PR46278, Take #3

2011-10-14 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> Georg-Johann Lay :
>> Weddington, Eric schrieb:
 This is yet another attempt to fix PR46278 (fake X addressing).

 After the previous clean-ups it is just a small change.

 caller-saves.c tries to eliminate call-clobbered hard-regs allocated to
 pseudos around function calls and that leads to situations that reload is
 no more capable to perform all requested spills because of the very few
 AVR's address registers.

 Thus, the patch adds a new target option -mstrict-X so that the user can
 turn that option if he like to do so, and then -fcaller-save is disabled.

 The patch passes the testsuite without regressions. Moreover, the
 testsuite passes without regressions if all test cases are run with
 -mstrict-X and all libraries (libgcc, avr-libc) are built with the new
 option turned on.
>>> Hi Johann,
>>>
>>> Sorry, I haven't been keeping up with the discussion on this PR.
>>>
>>> But if all test cases pass with running -mstrict-X and everything built with
>>> that option on, then why is this even an option? Is it because that it may
>>> not always reduce code size?...
>> An alternative would be to set -mstrict-X per default if -O or higher.
>> Let's see what Denis thinks.
> 
> I think these are just great results.
> I vote for committing this patch.

So I will proceed and commit this patch if there are no objections or
propositions to improve it.

> About "to set -mstrict-X per default": if it's possible to print
> something like "Please use -mno-strict-X" instead of "Spill error
> failure" then we can use -mstrict-X by default.

I don't see a way to hook in spill_failure or find_reload_regs or have backend
target specific customization of diagnostics.

Except someone experienced in that field recommends, say, hooking into
diagnostic printer somehow.  If so, I'd prefer to do that in a separate patch.

> i.e. how user can get a knowledge about a correlation between "Spill
> error..." and -mstrict-X

...or between "spill error" and -fcaller-saves...

Johann

> Denis.



Re: [PATCH, i386 tests] New tests to check vectorization for AVX2 insns.

2011-10-14 Thread Jakub Jelinek
On Fri, Oct 14, 2011 at 03:13:45PM +0400, Kirill Yukhin wrote:

--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-vpaddb-3.c
@@ -0,0 +1,49 @@
+/* { dg-do run } */
+/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-require-effective-target avx2 } */
+
...
+
+/* { dg-final { scan-assembler-times "vpaddb\[ \\t\]+\[^\n\]*%ymm\[0-9\]" 1 } } */

You need
/* { dg-final { cleanup-saved-temps } } */
in all the testcases compiled with -save-temps. 

Jakub


Re: resent2 [PATCH] Fix ICE in redirect_jump, at jump.c:1497 PR50496

2011-10-14 Thread Markus Trippelsdorf
Consider this testcase:

 $ cat test.cpp
class GCAlloc {
};
class BaseAlloc {
};
class String;
class Base {
public:
 virtual void destroy( String *str ) const =0;
};
class String: public GCAlloc {
 const Base *m_class;
public:
 enum constants {
 };
 String( const char *data );
 ~String() {
  m_class->destroy( this );
 }
 void copy( const String &other );
 String & operator=( const char *other ) {
  copy( String( other ) );
 }
};
class ListElement: public BaseAlloc {
};
class List: public BaseAlloc {
 ListElement *m_head;
 void (*m_deletor)( void *);
public:
 List():   m_deletor(0) {
 }
 const void *back() const {
 }
 bool empty() const {
  return m_head == 0;
 }
 void popBack();
};
class FalconData: public BaseAlloc {
public:
 virtual ~FalconData() {
 }
};
class Stream: public FalconData {
};
class SrcLexer: public BaseAlloc {
 List m_streams;
 String m_whiteLead;
 void reset();
};
void SrcLexer::reset()
{
 m_whiteLead = "";
 while( ! m_streams.empty() ) {
  Stream *s = (Stream *) m_streams.back();
  m_streams.popBack();
  if ( !m_streams.empty() )  delete s;
 }
}

 % g++ -O2 test.cpp
test.cpp: In member function ‘void SrcLexer::reset()’:
test.cpp:59:1: internal compiler error: in redirect_jump, at jump.c:1497

It hits the following assertion:
 gcc_assert (nlabel != NULL_RTX);

In this case target(or new_bb)=EXIT_BLOCK_PTR and 
block_label(EXIT_BLOCK_PTR)==NULL_RTX.
Fix this by treating the target=EXIT_BLOCK_PTR case before calling
redirect_jump in gcc/cfgrtl.c.
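
For illustration only, the ordering issue can be sketched in plain C outside of GCC.  Everything below (toy_block, toy_exit_block, toy_redirect_jump, toy_patch_jump) is a made-up stand-in and not GCC's real API; the sketch only assumes that the exit block has no label and that the redirect routine asserts on a null label, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a basic block; NOT GCC's basic_block.  */
struct toy_block {
  const char *label;   /* NULL for the exit block, which has no label.  */
};

/* Toy stand-in for EXIT_BLOCK_PTR.  */
struct toy_block toy_exit_block = { NULL };

/* Toy stand-in for redirect_jump: it must never see a null label.  */
int
toy_redirect_jump (const char **jump_target, const char *new_label)
{
  assert (new_label != NULL);
  *jump_target = new_label;
  return 1;
}

/* Return 1 if the jump was redirected, 0 if TARGET is the exit block.
   The exit-block test comes first, so the assertion above cannot fire.  */
int
toy_patch_jump (const char **jump_target, struct toy_block *target)
{
  if (target == &toy_exit_block)
    return 0;
  return toy_redirect_jump (jump_target, target->label);
}
```

With the check in this order, asking to redirect to the toy exit block simply reports failure and leaves the jump untouched, instead of tripping the assertion.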


PR middle-end/50496
* cfgrtl.c (try_redirect_by_replacing_jump): Treat EXIT_BLOCK_PTR case
separately before call to redirect_jump(). Add assertion.
(patch_jump_insn): Same.

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index b3f045b..57f561f 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -846,11 +846,10 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
   if (dump_file)
fprintf (dump_file, "Redirecting jump %i from %i to %i.\n",
 INSN_UID (insn), e->dest->index, target->index);
-  if (!redirect_jump (insn, block_label (target), 0))
-   {
- gcc_assert (target == EXIT_BLOCK_PTR);
- return NULL;
-   }
+  if (target == EXIT_BLOCK_PTR)
+   return NULL;
+  if (! redirect_jump (insn, block_label (target), 0))
+   gcc_unreachable ();
 }
 
   /* Cannot do anything for target exit block.  */
@@ -1030,11 +1029,10 @@ patch_jump_insn (rtx insn, rtx old_label, basic_block new_bb)
  /* If the substitution doesn't succeed, die.  This can happen
 if the back end emitted unrecognizable instructions or if
 target is exit block on some arches.  */
- if (!redirect_jump (insn, block_label (new_bb), 0))
-   {
- gcc_assert (new_bb == EXIT_BLOCK_PTR);
- return false;
-   }
+ if (new_bb == EXIT_BLOCK_PTR)
+   return false;
+ if (! redirect_jump (insn, block_label (new_bb), 0))
+   gcc_unreachable ();
}
 }
   return true;

-- 
Markus


[pph] Unify chain streaming (issue5262045)

2011-10-14 Thread Diego Novillo

This cleans up the chain streaming code under a single core function
and various entry points.  In the process it fixes the timeout problem
in c1limits-externalid.cc (we were just messing up the chain when
streaming it in reverse).
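
As a rough sketch of the "single core function plus a REVERSE flag" idea, here is a toy version in plain C.  The names toy_node and toy_write_chain are hypothetical and the int buffer merely stands in for the stream; none of this is the pph code.  The only point it tries to show is that the chain can be emitted in either order without modifying the links.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a chained tree node; NOT GCC's tree.  */
struct toy_node {
  int value;
  struct toy_node *next;
};

/* Copy the values reachable from FIRST into OUT (capacity CAP), in
   chain order or, if REVERSE is nonzero, in reverse order.  The chain
   itself is only read, never relinked.  Return the number written.  */
size_t
toy_write_chain (const struct toy_node *first, int reverse,
                 int *out, size_t cap)
{
  size_t n = 0, i = 0;
  const struct toy_node *p;

  /* First pass: count the nodes so the output index can be mirrored.  */
  for (p = first; p != NULL; p = p->next)
    n++;
  assert (n <= cap);

  /* Second pass: place each value at its forward or mirrored index.  */
  for (p = first; p != NULL; p = p->next, i++)
    out[reverse ? n - 1 - i : i] = p->value;

  return n;
}
```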

Tested on x86_64.  Committed to branch.

* pph-streamer-out.c (pph_out_mergeable_tree): Remove.  Update
all users.
(pph_tree_matches): Handle PPHF_NONE.
(vec2vec_filter): New.
(pph_out_tree_vec_1): Rename from pph_out_tree_vec.
Add arguments FILTER, MERGEABLE and REVERSE.
Call vec2vec_filter.
(pph_out_tree_vec): Call pph_out_tree_vec_1.
(pph_out_tree_vec_filtered): Likewise.
(chain2vec_filter): New.
(pph_out_mergeable_links): Remove.
(pph_out_chain_1): Rename from pph_out_chain_1.
(pph_out_chain): Call pph_out_chain_1.
(pph_out_chain_filtered): Likewise.
(pph_out_mergeable_chain_filtered): Likewise.
(pph_out_tree_common): Assert that T is not a DECL.
(pph_out_tcc_type): Remove FIXME comment.

testsuite/ChangeLog.pph

* g++.dg/pph/c1limits-externalid.cc: Mark fixed.

diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index b5020f2..ffbe710 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -632,15 +632,6 @@ pph_out_tree (pph_stream *stream, tree t)
 }
 
 
-/* Output mergable AST T to STREAM.  */
-
-static void
-pph_out_mergeable_tree (pph_stream *stream, tree t)
-{
-  pph_out_any_tree (stream, t, true);
-}
-
-
 /** lexical elements */
 
 
@@ -764,49 +755,14 @@ pph_out_token_cache (pph_stream *f, cp_token_cache *cache)
 /*** vectors */
 
 
-/* Write all the trees in VEC V to STREAM.  */
-
-static void
-pph_out_tree_vec (pph_stream *stream, VEC(tree,gc) *v)
-{
-  unsigned i;
-  tree t;
-
-  /* Note that we use the same format used by streamer_write_chain.
- This is to support pph_out_chain_filtered, which writes the
- filtered chain as a VEC.  Since the reader always reads chains
- using streamer_read_chain, we have to write VECs in exactly the
- same way as tree chains.  */
-  pph_out_hwi (stream, VEC_length (tree, v));
-  FOR_EACH_VEC_ELT (tree, v, i, t)
-pph_out_tree (stream, t);
-}
-
-
-/* Write all the trees in VEC V to STREAM.  */
-
-static void
-pph_out_mergeable_tree_vec (pph_stream *stream, VEC(tree,gc) *v)
-{
-  unsigned i;
-  tree t;
-
-  /* Note that we use the same format used by streamer_write_chain.
- This is to support pph_out_chain_filtered, which writes the
- filtered chain as a VEC.  Since the reader always reads chains
- using streamer_read_chain, we have to write VECs in exactly the
- same way as tree chains.  */
-  pph_out_hwi (stream, VEC_length (tree, v));
-  FOR_EACH_VEC_ELT_REVERSE (tree, v, i, t)
-pph_out_mergeable_tree (stream, t);
-}
-
-
 /* Return true if T matches FILTER for STREAM.  */
 
 static inline bool
 pph_tree_matches (pph_stream *stream, tree t, unsigned filter)
 {
+  if (filter == PPHF_NONE)
+return true;
+
   if ((filter & PPHF_NO_BUILTINS)
   && DECL_P (t)
   && DECL_IS_BUILTIN (t))
@@ -825,31 +781,83 @@ pph_tree_matches (pph_stream *stream, tree t, unsigned filter)
 }
 
 
-/* Write all the trees matching FILTER in VEC V to STREAM.  */
+/* Return a heap vector with all the trees in V that match FILTER.
+   The caller is responsible for freeing the returned vector.  */
 
-static void
-pph_out_tree_vec_filtered (pph_stream *stream, VEC(tree,gc) *v, unsigned filter)
+static inline VEC(tree,heap) *
+vec2vec_filter (pph_stream *stream, VEC(tree,gc) *v, unsigned filter)
 {
   unsigned i;
   tree t;
-  VEC(tree, heap) *to_write = NULL;
+  VEC(tree, heap) *filtered_v = NULL;
 
-  /* Special case.  If the caller wants no filtering, it is much
- faster to just call pph_out_tree_vec.  */
-  if (filter == PPHF_NONE)
-{
-  pph_out_tree_vec (stream, v);
-  return;
-}
+  /* Do not accept the nil filter.  The caller is responsible for
+ freeing the returned vector and they may inadvertently free
+ a vector they assumed to be allocated by this function.  */
+  gcc_assert (filter != PPHF_NONE);
 
   /* Collect all the nodes that match the filter.  */
   FOR_EACH_VEC_ELT (tree, v, i, t)
 if (pph_tree_matches (stream, t, filter))
-  VEC_safe_push (tree, heap, to_write, t);
+  VEC_safe_push (tree, heap, filtered_v, t);
 
-  /* Write them.  */
-  pph_out_tree_vec (stream, (VEC(tree,gc) *)to_write);
-  VEC_free (tree, heap, to_write);
+  return filtered_v;
+}
+
+
+/* Write all the trees in VEC V to STREAM.  REVERSE is true if V should
+   be written in reverse.  MERGEABLE is true if the tree nodes in V
+   are mergeable trees (see pph_out_any_tree).  If FILTER is set,
+   only emit the elements in V that match it.  */
+
+static void
+pph_out_tree_vec_1 (pph_stream *stream

[pph] Misc cleanups (1/3) (issue5282043)

2011-10-14 Thread Diego Novillo

* pph-streamer-out.c (pph_out_tree_1): Rename from pph_out_any_tree.
Update all users.


diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index ffbe710..e337a31 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -619,16 +619,15 @@ pph_out_start_tree_record (pph_stream *stream, tree t)
 
 
 /* The core tree writer is defined much later.  */
+static void pph_out_tree_1 (pph_stream *stream, tree t, bool mergeable);
 
-static void pph_out_any_tree (pph_stream *stream, tree t, bool mergeable);
 
-
-/* Output non-mergeable AST T to STREAM  */
+/* Output non-mergeable tree T to STREAM.  */
 
 void
 pph_out_tree (pph_stream *stream, tree t)
 {
-  pph_out_any_tree (stream, t, false);
+  pph_out_tree_1 (stream, t, false);
 }
 
 
@@ -807,7 +806,7 @@ vec2vec_filter (pph_stream *stream, VEC(tree,gc) *v, unsigned filter)
 
 /* Write all the trees in VEC V to STREAM.  REVERSE is true if V should
be written in reverse.  MERGEABLE is true if the tree nodes in V
-   are mergeable trees (see pph_out_any_tree).  If FILTER is set,
+   are mergeable trees (see pph_out_tree_1).  If FILTER is set,
only emit the elements in V that match it.  */
 
 static void
@@ -832,10 +831,10 @@ pph_out_tree_vec_1 (pph_stream *stream, VEC(tree,gc) *v, unsigned filter,
 
   if (!reverse)
 FOR_EACH_VEC_ELT (tree, to_write, i, t)
-  pph_out_any_tree (stream, t, mergeable);
+  pph_out_tree_1 (stream, t, mergeable);
   else
 FOR_EACH_VEC_ELT_REVERSE (tree, to_write, i, t)
-  pph_out_any_tree (stream, t, mergeable);
+  pph_out_tree_1 (stream, t, mergeable);
 
   /* If we did not have to filter, TO_WRITE == V.  Do not free it!  */
   if (filter != PPHF_NONE)
@@ -918,7 +917,7 @@ chain2vec_filter (pph_stream *stream, tree chain, unsigned filter)
 /* Write a chain of trees to STREAM starting with FIRST (if REVERSE is
false) or the last element reachable from FIRST (if REVERSE is
true).  If FILTER is given, use it to decide what nodes should be
-   emitted.  MERGEABLE is as in pph_out_any_tree.  */
+   emitted.  MERGEABLE is as in pph_out_tree_1.  */
 
 static void
 pph_out_chain_1 (pph_stream *stream, tree first, unsigned filter,
@@ -1068,7 +1067,7 @@ pph_out_binding_level_1 (pph_stream *stream, cp_binding_level *bl,
   pph_out_mergeable_chain_filtered (stream, bl->namespaces, aux_filter);
   pph_out_mergeable_chain_filtered (stream, bl->usings, aux_filter);
   pph_out_mergeable_chain_filtered (stream, bl->using_directives,
-aux_filter);
+aux_filter);
 }
   else
 {
@@ -1886,7 +1885,7 @@ pph_out_merge_name (pph_stream *stream, tree expr)
 /* Write a tree EXPR (MERGEABLE or not) to STREAM.  */
 
 static void
-pph_out_any_tree (pph_stream *stream, tree expr, bool mergeable)
+pph_out_tree_1 (pph_stream *stream, tree expr, bool mergeable)
 {
   enum pph_record_marker marker;
 
-- 
1.7.3.1


--
This patch is available for review at http://codereview.appspot.com/5282043


[pph] Misc cleanups (2/3) (issue5263044)

2011-10-14 Thread Diego Novillo
* pph-streamer-in.c (pph_in_tree_1): Rename from pph_in_any_tree.
Update all users.
Do not call pph_trace_tree unless flag_pph_tracer is set.

 
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index f8d6393..5cdf4d5 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -512,25 +512,24 @@ pph_in_start_record (pph_stream *stream, unsigned *include_ix_p,
 
 
 /* The core tree reader is defined much later.  */
+static tree pph_in_tree_1 (pph_stream *stream, tree *chain);
 
-static tree pph_in_any_tree (pph_stream *stream, tree *chain);
 
-
-/* Load an AST from STREAM.  Return the corresponding tree.  */
+/* Load a non-mergeable AST from STREAM.  Return the corresponding tree.  */
 
 tree
 pph_in_tree (pph_stream *stream)
 {
-  tree t = pph_in_any_tree (stream, NULL);
-  return t;
+  return pph_in_tree_1 (stream, NULL);
 }
 
 
 /* Load an AST into CHAIN from STREAM.  */
+
 static void
 pph_in_mergeable_tree (pph_stream *stream, tree *chain)
 {
-  pph_in_any_tree (stream, chain);
+  pph_in_tree_1 (stream, chain);
 }
 
 
@@ -1922,11 +1921,11 @@ pph_in_tree_header (pph_stream *stream, enum LTO_tags tag)
 }
 
 
-/* Read a tree from the STREAM.  If CHAIN is not null, the tree may be
+/* Read a tree from the STREAM.  If CHAIN is not NULL, the tree may be
unified with an existing tree in that chain.  */
 
 static tree
-pph_in_any_tree (pph_stream *stream, tree *chain)
+pph_in_tree_1 (pph_stream *stream, tree *chain)
 {
   struct lto_input_block *ib = stream->encoder.r.ib;
   struct data_in *data_in = stream->encoder.r.data_in;
@@ -1965,7 +1964,7 @@ pph_in_any_tree (pph_stream *stream, tree *chain)
   return streamer_read_integer_cst (ib, data_in);
 }
 
-  /* Materialize a new node from IB.  This will also read all the
+  /* Materialize a new node from STREAM.  This will also read all the
  language-independent bitfields for the new tree.  */
   expr = read = pph_in_tree_header (stream, tag);
   gcc_assert (read != NULL);
@@ -1992,11 +1991,11 @@ pph_in_any_tree (pph_stream *stream, tree *chain)
   /* Add the new tree to the cache and read its body.  The tree
  is added to the cache before we read its body to handle
  circular references and references from children nodes.  */
-  /* FIXME pph: We should not insert when read == expr, but it fails.  */
   pph_cache_insert_at (&stream->cache, expr, ix, pph_tree_code_to_tag (expr));
   pph_in_tree_body (stream, expr);
 
-  pph_trace_tree (expr, chain != NULL);
+  if (flag_pph_tracer)
+pph_trace_tree (expr, chain != NULL);
 
   /* If needed, sign the recently materialized tree to detect
  mutations.  Note that we only need to compute signatures
-- 
1.7.3.1


--
This patch is available for review at http://codereview.appspot.com/5263044


[pph] Misc cleanups (3/3) (issue5274043)

2011-10-14 Thread Diego Novillo
* pph-streamer-out.c (pph_out_tree_1): Do not call pph_trace_tree
unless flag_pph_tracer is set.


diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index e337a31..d534b42 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -1916,7 +1916,8 @@ pph_out_tree_1 (pph_stream *stream, tree expr, bool mergeable)
 }
   else if (marker == PPH_RECORD_START || marker == PPH_RECORD_START_MUTATED)
 {
-  pph_trace_tree (expr, mergeable);
+  if (flag_pph_tracer)
+   pph_trace_tree (expr, mergeable);
 
   /* This is the first time we see EXPR, write it out.  */
   if (marker == PPH_RECORD_START)
-- 
1.7.3.1


--
This patch is available for review at http://codereview.appspot.com/5274043


Re: [pph] Make libcpp symbol validation a warning (issue5235061)

2011-10-14 Thread Diego Novillo

On 11-10-13 17:55 , Gabriel Charette wrote:


I'm not sure exactly how you skip headers already parsed now (we
didn't used to when I wrote this code and that was the only problem
remaining in the line_table (i.e. duplicate entries for guarded
headers in the non-pph compile)), but couldn't you count the number of
skipped entries and assert (line_table->used - used_before) +
numSkipped == expected_in) ?


The problem is that the compilation process of foo.h -> foo.pph may 
generate different line tables than a compile that includes foo.pph. 
For instance,


foo.h:
#include "1.pph"
#include "2.pph"
#include "3.pph"

foo.cc:
#include "2.pph"
#include "foo.pph"


When we compile foo.h, the line table incorporates the effects of 
including 2.pph, and that's what we save to foo.pph.  However, when 
compiling foo.cc, the first thing we do is include 2.pph, so when 
processing the include for foo.pph, we will completely skip over 2.pph.


That's why we cannot really have the same line table that we had when we 
generated foo.pph.
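
To make the counting argument concrete, here is a small toy model in C.  It is only a sketch under simplifying assumptions: each include adds exactly one line-table entry, an already-seen file is skipped entirely, and the names toy_reader, toy_include and toy_include_foo are invented for the example rather than taken from libcpp or the pph branch.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_SEEN 16

/* Toy reader state: which files were seen, and how many line-table
   entries have been created so far.  */
struct toy_reader {
  const char *seen[TOY_MAX_SEEN];
  int n_seen;
  int line_table_used;
};

/* Include NAME.  Add one line-table entry unless NAME was already
   included, in which case it is skipped.  Return 1 if an entry was
   added, 0 if the include was skipped.  */
int
toy_include (struct toy_reader *r, const char *name)
{
  int i;

  for (i = 0; i < r->n_seen; i++)
    if (strcmp (r->seen[i], name) == 0)
      return 0;

  assert (r->n_seen < TOY_MAX_SEEN);
  r->seen[r->n_seen++] = name;
  r->line_table_used++;
  return 1;
}

/* Include foo.pph, which in turn includes 1.pph, 2.pph and 3.pph.
   Return the number of line-table entries this added.  */
int
toy_include_foo (struct toy_reader *r)
{
  static const char *const nested[] = { "1.pph", "2.pph", "3.pph" };
  int before = r->line_table_used;
  size_t i;

  toy_include (r, "foo.pph");
  for (i = 0; i < sizeof nested / sizeof nested[0]; i++)
    toy_include (r, nested[i]);

  return r->line_table_used - before;
}
```

In this model, a fresh reader gets four entries from toy_include_foo, while a reader that has already included 2.pph gets only three, so the count recorded when foo.pph was generated does not match what a later compile observes.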



Diego.


Re: [PATCH, Atom] Fix performance regression with -mtune=atom

2011-10-14 Thread Uros Bizjak
Hello!

> This is a ping. Change affects Atom only and was made because it
> really gives better performance on this architecture. This fact
> actually leads to the thought that old value is just a simple
> misprint.
>
> > This patch fixes performance regression with -mtune=atom. Changing
> > atom cost removes regression in several tests of EEMBC and spec2000.
> > Bootstrap and make check Ok both with and without -mtune=atom.
> > OK for trunk?
> >
> > 2011-09-30  Yakovlev Vladimir  vladimir.b.yakov...@intel.com
> >
> >      * gcc/config/i386/i386.c (atom_cost): Changed cost for loading
> >       QImode using movzbl.

OK.

Thanks,
Uros.


Re: [PATCH] Add mulv4di3 expander

2011-10-14 Thread Uros Bizjak
On Fri, Oct 14, 2011 at 8:18 AM, Jakub Jelinek  wrote:

> mulv4di3 can be expanded the same as mulv2di3.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2011-10-14  Jakub Jelinek  
>
>        * config/i386/sse.md (mulv2di3): Macroize using VI8_AVX2
>        iterator.
>        (ashl3): Use VI248_AVX2 iterator instead of VI248_128.
>        Use  instead of TI in mode attr.

OK.

Thanks,
Uros.


[PATCH] Simplify and fix restrict handling

2011-10-14 Thread Richard Guenther

This follows up Micha's testcase where we fail to handle the
conservatively propagated restrict tags properly.  The following
patch simplifies handling of restrict in the oracle and thus
only excludes NONLOCAL (as designed), but not ESCAPED from
conflict checking.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-10-14  Richard Guenther  

* tree-ssa-alias.h (pt_solutions_same_restrict_base): Remove.
* tree-ssa-alias.c (ptr_derefs_may_alias_p): Remove call to
pt_solutions_same_restrict_base.
* tree-ssa-structalias.c (pt_solutions_same_restrict_base): Remove.
(pt_solutions_intersect_1): Integrate restrict handling here.

Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 179966)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -316,11 +316,6 @@ ptr_derefs_may_alias_p (tree ptr1, tree
   if (!pi1 || !pi2)
 return true;
 
-  /* If both pointers are restrict-qualified try to disambiguate
- with restrict information.  */
-  if (!pt_solutions_same_restrict_base (&pi1->pt, &pi2->pt))
-return false;
-
   /* ???  This does not use TBAA to prune decls from the intersection
  that not both pointers may access.  */
   return pt_solutions_intersect (&pi1->pt, &pi2->pt);
Index: gcc/tree-ssa-alias.h
===
--- gcc/tree-ssa-alias.h(revision 179966)
+++ gcc/tree-ssa-alias.h(working copy)
@@ -130,8 +130,6 @@ extern bool pt_solution_singleton_p (str
 extern bool pt_solution_includes_global (struct pt_solution *);
 extern bool pt_solution_includes (struct pt_solution *, const_tree);
 extern bool pt_solutions_intersect (struct pt_solution *, struct pt_solution *);
-extern bool pt_solutions_same_restrict_base (struct pt_solution *,
-struct pt_solution *);
 extern void pt_solution_reset (struct pt_solution *);
 extern void pt_solution_set (struct pt_solution *, bitmap, bool, bool);
 extern void pt_solution_set_var (struct pt_solution *, tree);
Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 179966)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -6079,12 +6079,15 @@ pt_solutions_intersect_1 (struct pt_solu
 return true;
 
   /* If either points to unknown global memory and the other points to
- any global memory they alias.  */
-  if ((pt1->nonlocal
-   && (pt2->nonlocal
-  || pt2->vars_contains_global))
-  || (pt2->nonlocal
- && pt1->vars_contains_global))
+ any global memory they alias.  If both points-to sets are based
+ off a restrict qualified pointer ignore any overlaps with NONLOCAL.  */
+  if (!(pt1->vars_contains_restrict
+   && pt2->vars_contains_restrict)
+  && ((pt1->nonlocal
+  && (pt2->nonlocal
+  || pt2->vars_contains_global))
+ || (pt2->nonlocal
+ && pt1->vars_contains_global)))
 return true;
 
   /* Check the escaped solution if required.  */
@@ -6141,27 +6144,6 @@ pt_solutions_intersect (struct pt_soluti
   return res;
 }
 
-/* Return true if both points-to solutions PT1 and PT2 for two restrict
-   qualified pointers are possibly based on the same pointer.  */
-
-bool
-pt_solutions_same_restrict_base (struct pt_solution *pt1,
-struct pt_solution *pt2)
-{
-  /* If we deal with points-to solutions of two restrict qualified
- pointers solely rely on the pointed-to variable bitmap intersection.
- For two pointers that are based on each other the bitmaps will
- intersect.  */
-  if (pt1->vars_contains_restrict
-  && pt2->vars_contains_restrict)
-{
-  gcc_assert (pt1->vars && pt2->vars);
-  return bitmap_intersect_p (pt1->vars, pt2->vars);
-}
-
-  return true;
-}
-
 
 /* Dump points-to information to OUTFILE.  */
 


Re: [ARM] Fix PR49641

2011-10-14 Thread Bernd Schmidt
On 07/13/11 16:03, Richard Earnshaw wrote:
>>  * config/arm/arm.c (store_multiple_sequence): Avoid cases where
>>  the base reg is stored iff compiling for Thumb1.
>>
>>  * gcc.target/arm/pr49641.c: New test.

Ping.  Richard, you replied to the mail but didn't comment on the patch.


Bernd


Re: New warning for expanded vector operations

2011-10-14 Thread Artem Shinkarov
On Thu, Oct 13, 2011 at 10:40 AM, Artem Shinkarov
 wrote:
> On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther
>  wrote:
>> On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump  wrote:
>>> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote:
 This patch fixed PR50704.

 gcc/testsuite:
        * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target.
        * gcc.target/i386/warn-vect-op-1.c: Ditto.
        * gcc.target/i386/warn-vect-op-2.c: Ditto.

 Ok for trunk?
>>>
>>> Ok.  Is this x32 clean?  :-)  If not, HJ will offer an even better spelling.
>>
>> I suppose you instead want sth like
>>
>> { dg-require-effective-target lp64 }
>>
>> ?
>>
>
> See our discussion with HJ here:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704
> /* { dg-do compile { target { ! { ia32 } } } } */ was his idea.  As
> far as x32 sets UNITS_PER_WORD to 8, these tests should work fine.
>
> Artem.
>

Ping.

So can I commit the changes?


Thanks,
Artem.


Re: [PATCH, i386 tests] New tests to check vectorization for AVX2 insns.

2011-10-14 Thread Kirill Yukhin
Thanks, done.

Anything else?

K

On Fri, Oct 14, 2011 at 3:53 PM, Jakub Jelinek  wrote:
> On Fri, Oct 14, 2011 at 03:13:45PM +0400, Kirill Yukhin wrote:
>
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx2-vpaddb-3.c
> @@ -0,0 +1,49 @@
> +/* { dg-do run } */
> +/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
> +/* { dg-require-effective-target avx2 } */
> +
> ...
> +
> +/* { dg-final { scan-assembler-times "vpaddb\[ \\t\]+\[^\n\]*%ymm\[0-9\]" 1 } } */
>
> You need
> /* { dg-final { cleanup-saved-temps } } */
> in all the testcases compiled with -save-temps.
>
>        Jakub
>


avx2.vect-2.tests.gcc.patch
Description: Binary data


Re: [C++ Patch / RFC] PR 38174

2011-10-14 Thread Jason Merrill

On 10/13/2011 10:37 PM, Paolo Carlini wrote:

+  if ((TYPE_PTR_P (type1) && TYPE_PTR_P (type2))
+ || (TYPE_PTRMEM_P (type1) && TYPE_PTRMEM_P (type2))
+ || TYPE_PTRMEMFUNC_P (type1))


You don't need to check TYPE_PTR_P or TYPE_PTRMEM_P for type2 here (or 
in the condition above) because we already established that type1 and 
type2 have the same TREE_CODE.  OK with that change.
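
The redundancy can be illustrated with a toy model in C.  The enum and function names (toy_code, toy_check_both, toy_check_first) are hypothetical and only mimic the shape of the condition; the TYPE_PTRMEMFUNC_P arm is left out.  Under the precondition that both codes are equal, testing the first operand alone gives the same answer as testing both.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a type's tree code.  */
enum toy_code { TOY_POINTER, TOY_PTRMEM, TOY_OTHER };

/* The longer form: test the property on both operands.  */
bool
toy_check_both (enum toy_code c1, enum toy_code c2)
{
  return (c1 == TOY_POINTER && c2 == TOY_POINTER)
         || (c1 == TOY_PTRMEM && c2 == TOY_PTRMEM);
}

/* The shorter form: valid only when the caller has already
   established that both codes are the same.  */
bool
toy_check_first (enum toy_code c1, enum toy_code c2)
{
  assert (c1 == c2);
  return c1 == TOY_POINTER || c1 == TOY_PTRMEM;
}
```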


Jason


Re: [C++ Patch] PR 17212

2011-10-14 Thread Jason Merrill

On 10/13/2011 07:19 PM, Paolo Carlini wrote:

-@item -Wno-format-zero-length @r{(C and Objective-C only)}
+@item -Wno-format-zero-length @r{(C, C++, Objective-C and Objective-C++ only)}


I would just remove the {...only} to match the other -Wformat items.

OK.

Jason



Re: New warning for expanded vector operations

2011-10-14 Thread Richard Guenther
On Fri, Oct 14, 2011 at 3:42 PM, Artem Shinkarov
 wrote:
> On Thu, Oct 13, 2011 at 10:40 AM, Artem Shinkarov
>  wrote:
>> On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther
>>  wrote:
>>> On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump  wrote:
 On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote:
> This patch fixed PR50704.
>
> gcc/testsuite:
>        * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target.
>        * gcc.target/i386/warn-vect-op-1.c: Ditto.
>        * gcc.target/i386/warn-vect-op-2.c: Ditto.
>
> Ok for trunk?

 Ok.  Is this x32 clean?  :-)  If not, HJ will offer an even better 
 spelling.
>>>
>>> I suppose you instead want sth like
>>>
>>> { dg-require-effective-target lp64 }
>>>
>>> ?
>>>
>>
>> See our discussion with HJ here:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704
>> /* { dg-do compile { target { ! { ia32 } } } } */ was his idea.  As
>> far as x32 sets UNITS_PER_WORD to 8, these tests should work fine.
>>
>> Artem.
>>
>
> Ping.
>
> So can I commit the changes?

Yes.

Thanks,
Richard.

>
> Thanks,
> Artem.
>


Re: [patch] dwarf2out: Drop the size + performance overhead of DW_AT_sibling

2011-10-14 Thread Tom Tromey
> "Tristan" == Tristan Gingold  writes:

Tristan> I am not against this patch, my only concern is that there are many
Tristan> many dwarf consumers and I have no idea how they will react to this
Tristan> change.

I tend to think that this is the wrong standard to apply.  In this case
we would be avoiding a beneficial change -- as measured in both
performance in a couple of cases, and in size -- for the sake of unknown
and possibly nonexistent consumers.  I think instead the burden of proof
should be on those consumers, both to give their evidence and reasoning
and to engage with GCC.

Another way to look at it is that there have been many changes to GCC's
DWARF output in the last few years.  Surely these have broken these
DWARF consumers more than this change possibly could.

Tom


Re: [C++ Patch] PR 17212

2011-10-14 Thread Paolo Carlini

On 10/14/2011 03:57 PM, Jason Merrill wrote:

On 10/13/2011 07:19 PM, Paolo Carlini wrote:

-@item -Wno-format-zero-length @r{(C and Objective-C only)}
+@item -Wno-format-zero-length @r{(C, C++, Objective-C and Objective-C++ only)}


I would just remove the {...only} to match the other -Wformat items.

Done.

Thanks,
Paolo.


Re: [PATCH] Handle COND_EXPR/VEC_COND_EXPR in walk_stmt_load_store_addr_ops and ssa verification

2011-10-14 Thread Michael Matz
Hi,

On Fri, 14 Oct 2011, Richard Guenther wrote:

> But I'm not sure.  I think we should delete this check from
> verify_ssa and instead have a corresponding check in
> verify_stmts (which already properly walks trees) that
> for an SSA name we encounter we do have a properly linked use
> (see verify_expr, maybe it's easy to do that for the SSA_NAME
> case - at least it's easy without trying to avoid a
> FOR_EACH_SSA_USE_OPERAND (, SSA_OP_USE) on the stmt for
> each SSA_NAME we encounter).

Whatever we do with this check, we should ensure that it still
triggers on gcc.dg/pr45415.c at revision r163821.  IIRC, finding the cause
of that bug gave me some more gray hair :)


Ciao,
Michael.


Re: [PATCH, i386 tests] New tests to check vectorization for AVX2 insns.

2011-10-14 Thread Jakub Jelinek
On Fri, Oct 14, 2011 at 05:53:28PM +0400, Kirill Yukhin wrote:
> Thanks, done.
> 
> Anything else?

First of all, most of the testcases look very similar; the only differences
between many of them are the (unimportant) function names and the type.
So I think it would be much better to write just one testcase that uses
TYPE instead of int (or whatever the type is), with
#ifndef TYPE
#define TYPE int
#endif
early in the testcase.  Then all the other type variants can just include
the base variant of the testcase, and would contain just

/* { dg-do run } */
/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */

#define TYPE long long int
#include "avx2-vpaddd-3.c"

/* { dg-final { scan-assembler-times "vpaddq\[ \\t\]+\[^\n\]*%ymm\[0-9\]" 1 } } */
/* { dg-final { cleanup-saved-temps } } */
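
A minimal sketch of what such a base testcase could look like follows.
It is not taken from the submitted patch: SIZE, the array names and the
helper functions are assumptions made up for illustration, and the dg-
directives are left out so that it builds as ordinary C.  A variant file
would define TYPE before including the base file, as in the example
above.

```c
#include <string.h>

/* A variant testcase overrides TYPE before including this file.  */
#ifndef TYPE
#define TYPE int
#endif

/* Hypothetical size; the real testcases may use a different one.  */
#define SIZE 64

static TYPE a[SIZE], b[SIZE], c[SIZE], c_ref[SIZE];

/* The loop the vectorizer is expected to turn into vector additions.  */
static void
add_arrays (void)
{
  int i;

  for (i = 0; i < SIZE; ++i)
    c[i] = a[i] + b[i];
}

/* Fill the inputs, compute a reference result and compare.
   Returns 0 when the result matches the reference.  */
static int
check_add (void)
{
  int i;

  for (i = 0; i < SIZE; ++i)
    {
      a[i] = i * i + i;
      b[i] = i * i * i;
      c_ref[i] = a[i] + b[i];
    }

  add_arrays ();

  return memcmp (c, c_ref, SIZE * sizeof (TYPE)) != 0;
}
```

Using sizeof (TYPE) in the comparison keeps the check correct for every
variant, which is the point of the memcmp remark further down.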


The indentation is not the GNU standard one, though perhaps for the
testcases it is less important.  For testcases we import from somewhere
else we often keep it in the weirdo formatting it originally had
(after all, it doesn't hurt to verify that e.g. our lexer etc. isn't
surprised by whitespace missing or being present at unexpected spots).
These testcases, on the other hand, are written for GCC just to check
the vectorization, so perhaps they should follow the GNU coding conventions.

+  for (i = 0; i < 4; ++i ) {
+for ( j = 0; j < SIZE; ++j ) {
+  a[i] = i*i+i;
+  b[i] = i*i*i;
+}

The { should be on the next line, indented by two columns from the for.
Also put spaces around the operators:
= i * i + i;
= i * i * i;

+if ( memcmp(c, c_ref, SIZE * sizeof (char) ) )

if (memcmp (c, c_ref, SIZE * sizeof (TYPE)))

+  abort();
+  }
+}
+

Jakub


Re: [PATCH, Atom] Fix performance regression with -mtune=atom

2011-10-14 Thread Vladimir Yakovlev
Could anyone check that in?

Thanks,
Vladimir

2011/10/14 Uros Bizjak :
> Hello!
>
>> This is a ping. The change affects Atom only and was made because it
>> really gives better performance on this architecture. That in turn
>> suggests that the old value was simply a typo.
>>
>> > This patch fixes a performance regression with -mtune=atom. Changing
>> > the atom cost removes the regression in several tests of EEMBC and spec2000.
>> > Bootstrap and make check are OK both with and without -mtune=atom.
>> > OK for trunk?
>> >
>> > 2011-09-30  Yakovlev Vladimir  vladimir.b.yakov...@intel.com
>> >
>> >      * gcc/config/i386/i386.c (atom_cost): Changed cost for loading
>> >      QImode using movzbl.
>
> OK.
>
> Thanks,
> Uros.
>


[PATCH] AVX2 vec_widen_[su]mult_{hi,lo}*, sdot_prod* and udot_prod*

2011-10-14 Thread Jakub Jelinek
Hi!

This patch improves generated code for SSE4.1 and even more
for AVX2 on the attached testcases.  SSE4.1 has pmuldq
(where SSE2 only had pmuludq), so it can handle signed widening multiplies
fine too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-10-14  Jakub Jelinek  

* config/i386/sse.md (vec_widen_smult_hi_v8hi,
vec_widen_smult_lo_v8hi, vec_widen_umult_hi_v8hi,
vec_widen_umult_lo_v8hi): Macroize using VI2_AVX2
mode iterator and any_extend code iterator.
(vec_widen_mult_hi_v8si, vec_widen_mult_lo_v8si): New
expanders.
(vec_widen_smult_hi_v4si, vec_widen_smult_lo_v4si): Enable
also for TARGET_SSE4_1 using pmuldq insn.
(sdot_prodv8hi): Macroize using VI2_AVX2 iterator.
(sse2_sse4_1): New code attr.
(udot_prodv4si): Macroize using any_extend code iterator.
(dot_prodv8si): New expander.

* gcc.target/i386/sse2-mul-1.c: New test.
* gcc.target/i386/sse4_1-mul-1.c: New test.
* gcc.target/i386/avx-mul-1.c: New test.
* gcc.target/i386/xop-mul-1.c: New test.
* gcc.target/i386/avx2-mul-1.c: New test.

--- gcc/config/i386/sse.md.jj   2011-10-14 08:38:47.0 +0200
+++ gcc/config/i386/sse.md  2011-10-14 13:05:58.0 +0200
@@ -5507,83 +5507,100 @@ (define_insn_and_split "mul3"
   DONE;
 })
 
-(define_expand "vec_widen_smult_hi_v8hi"
-  [(match_operand:V4SI 0 "register_operand" "")
-   (match_operand:V8HI 1 "register_operand" "")
-   (match_operand:V8HI 2 "register_operand" "")]
+(define_expand "vec_widen_mult_hi_"
+  [(match_operand: 0 "register_operand" "")
+   (any_extend:
+ (match_operand:VI2_AVX2 1 "register_operand" ""))
+   (match_operand:VI2_AVX2 2 "register_operand" "")]
   "TARGET_SSE2"
 {
   rtx op1, op2, t1, t2, dest;
 
   op1 = operands[1];
   op2 = operands[2];
-  t1 = gen_reg_rtx (V8HImode);
-  t2 = gen_reg_rtx (V8HImode);
-  dest = gen_lowpart (V8HImode, operands[0]);
+  t1 = gen_reg_rtx (mode);
+  t2 = gen_reg_rtx (mode);
+  dest = gen_lowpart (mode, operands[0]);
 
-  emit_insn (gen_mulv8hi3 (t1, op1, op2));
-  emit_insn (gen_smulv8hi3_highpart (t2, op1, op2));
-  emit_insn (gen_vec_interleave_highv8hi (dest, t1, t2));
+  emit_insn (gen_mul3 (t1, op1, op2));
+  emit_insn (gen_mul3_highpart (t2, op1, op2));
+  emit_insn (gen_vec_interleave_high (dest, t1, t2));
   DONE;
 })
 
-(define_expand "vec_widen_smult_lo_v8hi"
-  [(match_operand:V4SI 0 "register_operand" "")
-   (match_operand:V8HI 1 "register_operand" "")
-   (match_operand:V8HI 2 "register_operand" "")]
+(define_expand "vec_widen_mult_lo_"
+  [(match_operand: 0 "register_operand" "")
+   (any_extend:
+ (match_operand:VI2_AVX2 1 "register_operand" ""))
+   (match_operand:VI2_AVX2 2 "register_operand" "")]
   "TARGET_SSE2"
 {
   rtx op1, op2, t1, t2, dest;
 
   op1 = operands[1];
   op2 = operands[2];
-  t1 = gen_reg_rtx (V8HImode);
-  t2 = gen_reg_rtx (V8HImode);
-  dest = gen_lowpart (V8HImode, operands[0]);
+  t1 = gen_reg_rtx (mode);
+  t2 = gen_reg_rtx (mode);
+  dest = gen_lowpart (mode, operands[0]);
 
-  emit_insn (gen_mulv8hi3 (t1, op1, op2));
-  emit_insn (gen_smulv8hi3_highpart (t2, op1, op2));
-  emit_insn (gen_vec_interleave_lowv8hi (dest, t1, t2));
+  emit_insn (gen_mul3 (t1, op1, op2));
+  emit_insn (gen_mul3_highpart (t2, op1, op2));
+  emit_insn (gen_vec_interleave_low (dest, t1, t2));
   DONE;
 })
 
-(define_expand "vec_widen_umult_hi_v8hi"
-  [(match_operand:V4SI 0 "register_operand" "")
-   (match_operand:V8HI 1 "register_operand" "")
-   (match_operand:V8HI 2 "register_operand" "")]
-  "TARGET_SSE2"
+(define_expand "vec_widen_mult_hi_v8si"
+  [(match_operand:V4DI 0 "register_operand" "")
+   (any_extend:V4DI (match_operand:V8SI 1 "nonimmediate_operand" ""))
+   (match_operand:V8SI 2 "nonimmediate_operand" "")]
+  "TARGET_AVX2"
 {
-  rtx op1, op2, t1, t2, dest;
-
-  op1 = operands[1];
-  op2 = operands[2];
-  t1 = gen_reg_rtx (V8HImode);
-  t2 = gen_reg_rtx (V8HImode);
-  dest = gen_lowpart (V8HImode, operands[0]);
+  rtx t1, t2, t3, t4, rperm[8], vperm;
+  int i;
 
-  emit_insn (gen_mulv8hi3 (t1, op1, op2));
-  emit_insn (gen_umulv8hi3_highpart (t2, op1, op2));
-  emit_insn (gen_vec_interleave_highv8hi (dest, t1, t2));
+  t1 = gen_reg_rtx (V8SImode);
+  t2 = gen_reg_rtx (V8SImode);
+  t3 = gen_reg_rtx (V8SImode);
+  t4 = gen_reg_rtx (V8SImode);
+  /* This would be 2 insns shorter if
+ rperm[i] = GEN_INT (((~i & 1) << 2) + i / 2);
+ has been used instead (both vpslrq insns wouldn't be needed),
+ but vec_widen_*mult_hi_* is usually used together with
+ vec_widen_*mult_lo_* and by writing it this way the load
+ of the constant and the two vpermd instructions (cross-lane)
+ can be CSEd together.  */
+  for (i = 0; i < 8; ++i)
+rperm[i] = GEN_INT (((i & 1) << 2) + i / 2);
+  vperm = gen_rtx_CONST_VECTOR (V8SImode, gen_rtvec_v (8, rperm));
+  vperm = force_reg (V8SImode, vperm);
+  emit_insn (gen_avx2_permvarv8si (t1, v
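
The selector values built by the loop above can be tabulated with a
small stand-alone C sketch.  It uses only the two index formulas quoted
in the patch comment; the function names are made up for illustration
and nothing here calls into GCC.  How vpermd consumes the selector is
not modelled.

```c
/* Selector entry i as computed by the posted expander.  */
static int
perm_index (int i)
{
  return ((i & 1) << 2) + i / 2;
}

/* The alternative mentioned in the comment, which the patch does not use.  */
static int
perm_index_alt (int i)
{
  return ((~i & 1) << 2) + i / 2;
}

/* Fill OUT with the eight selector entries produced by F.  */
static void
fill_selector (int out[8], int (*f) (int))
{
  int i;

  for (i = 0; i < 8; ++i)
    out[i] = f (i);
}
```

With these definitions the first formula gives 0 4 1 5 2 6 3 7 and the
alternative gives 4 0 5 1 6 2 7 3, i.e. the two selectors differ only in
which half supplies the even-numbered entries.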

[PATCH] negv{32qi,16hi,8si,4di}

2011-10-14 Thread Jakub Jelinek
Hi!

This patch allows negations to be vectorized using 32-byte vectors.

2011-10-14  Jakub Jelinek  

* config/i386/sse.md (neg2): Use VI_AVX2 iterator instead
of VI_128.

--- gcc/config/i386/sse.md.jj   2011-10-14 13:05:58.0 +0200
+++ gcc/config/i386/sse.md  2011-10-14 13:56:55.0 +0200
@@ -4860,10 +4860,10 @@ (define_insn "*vec_concatv2df"
 ;
 
 (define_expand "neg2"
-  [(set (match_operand:VI_128 0 "register_operand" "")
-   (minus:VI_128
+  [(set (match_operand:VI_AVX2 0 "register_operand" "")
+   (minus:VI_AVX2
  (match_dup 2)
- (match_operand:VI_128 1 "nonimmediate_operand" "")))]
+ (match_operand:VI_AVX2 1 "nonimmediate_operand" "")))]
   "TARGET_SSE2"
   "operands[2] = force_reg (mode, CONST0_RTX (mode));")
 

Jakub
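
For illustration, the kind of source loop this affects might look like
the sketch below.  The function name is hypothetical, and the body is
written as 0 - x only to mirror the (minus (match_dup 2) ...) form of
the expander, where operand 2 is forced to a zero register.  Whether a
given loop is actually vectorized with 32-byte vectors depends on the
options and target, which this sketch does not check.

```c
#include <stddef.h>

/* Negate N ints from SRC into DST, expressed as subtraction from zero.  */
static void
negate_ints (int *dst, const int *src, size_t n)
{
  size_t i;

  for (i = 0; i < n; ++i)
    dst[i] = 0 - src[i];
}
```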


Re: [C++ Patch / RFC] PR 38174

2011-10-14 Thread Paolo Carlini

Hi,

On 10/13/2011 10:37 PM, Paolo Carlini wrote:

+  if ((TYPE_PTR_P (type1) && TYPE_PTR_P (type2))
+  || (TYPE_PTRMEM_P (type1) && TYPE_PTRMEM_P (type2))
+  || TYPE_PTRMEMFUNC_P (type1))


You don't need to check TYPE_PTR_P or TYPE_PTRMEM_P for type2 here (or 
in the condition above) because we already established that type1 and 
type2 have the same TREE_CODE.  OK with that change.

Thanks. I'm finishing regtesting the variant below again and intend to
commit it if everything goes well.


Paolo.

//
/cp
2011-10-14  Paolo Carlini  

PR c++/38174
* call.c (add_builtin_candidate): If two pointers have a composite
pointer type, generate a single candidate with that type.

/testsuite
2011-10-14  Paolo Carlini  

PR c++/38174
* g++.dg/overload/operator4.C: New.
Index: testsuite/g++.dg/overload/operator4.C
===
--- testsuite/g++.dg/overload/operator4.C   (revision 0)
+++ testsuite/g++.dg/overload/operator4.C   (revision 0)
@@ -0,0 +1,14 @@
+// PR c++/38174
+
+struct VolatileIntPtr {
+  operator int volatile *();
+};
+
+struct ConstIntPtr {
+  operator int const *();
+};
+
+void test_with_ptrs(VolatileIntPtr vip, ConstIntPtr cip) {
+  bool b1 = (vip == cip);
+  long p1 = vip - cip;
+}
Index: cp/call.c
===
--- cp/call.c   (revision 179978)
+++ cp/call.c   (working copy)
@@ -2582,6 +2582,21 @@ add_builtin_candidate (struct z_candidate **candid
  || MAYBE_CLASS_TYPE_P (type1)
  || TREE_CODE (type1) == ENUMERAL_TYPE))
 {
+  if (TYPE_PTR_P (type1) || TYPE_PTR_TO_MEMBER_P (type1))
+   {
+ tree cptype = composite_pointer_type (type1, type2,
+   error_mark_node,
+   error_mark_node,
+   CPO_CONVERSION,
+   tf_none);
+ if (cptype != error_mark_node)
+   {
+ build_builtin_candidate
+   (candidates, fnname, cptype, cptype, args, argtypes, flags);
+ return;
+   }
+   }
+
   build_builtin_candidate
(candidates, fnname, type1, type1, args, argtypes, flags);
   build_builtin_candidate


[PATCH/RFA] Fix up gcc.dg/vect/pr30858.c expected output

2011-10-14 Thread Matthew Gretton-Dann

All,

The attached patch corrects the expected output of the
gcc.dg/vect/pr30858.c testcase.

Historically it has expected the output "Unknown def-use cycle pattern." 
just once.


However, recent changes to GCC for ARM targets mean that vectorization 
is attempted twice: once with a vector size of 128 bits and once with a 
vector size of 64 bits.  This means that the output appears more than once.


The patch works around this by making the testcase expect one or more 
instances of "Unknown def-use cycle pattern".


Can someone review please?

Thanks,

Matt

gcc/testsuite/ChangeLog:

2011-10-13  Matthew Gretton-Dann  

 * gcc.dg/vect/pr30858.c: Update expected output for
 architectures with multiple vector sizes.

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd
diff --git a/gcc/testsuite/gcc.dg/vect/pr30858.c b/gcc/testsuite/gcc.dg/vect/pr30858.c
index 0af2f8e..0e7f7e1 100644
--- a/gcc/testsuite/gcc.dg/vect/pr30858.c
+++ b/gcc/testsuite/gcc.dg/vect/pr30858.c
@@ -11,5 +11,5 @@ foo (int ko)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Unknown def-use cycle pattern." 1 "vect" } } */
+/* { dg-final { scan-tree-dump "Unknown def-use cycle pattern." "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */


Re: [PATCH] Add explicit VIS intrinsics for addition and subtraction.

2011-10-14 Thread Vladimir Makarov

On 09/28/2011 06:38 PM, Eric Botcazou wrote:

[Vlad, if you have a few minutes, would you mind having a look at the couple of
questions at the end of the message?  Thanks in advance].


No problem.

Here are the results of the investigation.  Pseudo 116 needs to be assigned a
hard register.  It is used mostly in vector instructions so we would like it
to be assigned a FP reg, but it is initialized in insn 2:

(insn 2 5 3 2 (set (reg/v:V4HI 116 [ a ])
 (reg:V4HI 24 %i0 [ a ])) combined-1.c:7 93 {*movdf_insn_sp32_v9}
  (expr_list:REG_DEAD (reg:V4HI 24 %i0 [ a ])
 (nil)))

so it ends up being assigned the (integer) argument register %i0 instead.  It
used to be assigned a FP reg as expected with the GCC 4.6.x series.


The register class preference discovery is OK:

 r116: preferred EXTRA_FP_REGS, alternative GENERAL_OR_EXTRA_FP_REGS,
allocno GENERAL_OR_EXTRA_FP_REGS
 a2 (r116,l0) best EXTRA_FP_REGS, allocno GENERAL_OR_EXTRA_FP_REGS

i.e. EXTRA_FP_REGS is "preferred"/"best".  Then it seems that this preference
is dropped and only the class of the allocno, GENERAL_OR_EXTRA_FP_REGS, is
handed down to the coloring stage.  By contrast, in the GCC 4.6 series, the
cover_class of the allocno is EXTRA_FP_REGS.

The initial cost for %i0 is twice as high (24000) as the cost of FP regs.  But
then it is reduced by 12000 when process_bb_node_for_hard_reg_moves sees insn
2 above and then again by 12000 when process_regs_for_copy sees the same insn.
So, in the end, %i0 is given cost 0 and thus beats every other register.  This
doesn't happen in the GCC 4.6 series because %i0 isn't in the cover_class.
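
The arithmetic described above can be written out as a toy model.  The
three constants come straight from the paragraph (24000 being twice the
FP register cost, and each reduction being 12000); the enum and function
names are invented for illustration and do not correspond to anything in
ira-costs.c.

```c
/* Costs as quoted in the message; names are hypothetical.  */
enum
{
  FP_REG_COST = 12000,
  ARG_REG_INITIAL_COST = 2 * FP_REG_COST,  /* 24000 for %i0 */
  MOVE_REDUCTION = 12000                   /* applied once per mechanism */
};

/* Cost of the hard register after MECHANISMS independent reductions
   have been applied for the same insn.  */
static int
cost_after_reductions (int initial, int mechanisms)
{
  return initial - mechanisms * MOVE_REDUCTION;
}
```

With both mechanisms counting insn 2, the cost of %i0 drops from 24000
to 0, below the 12000 of the FP registers; with a single reduction it
would only come down to 12000.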

This is at -O1.  At -O2, there is an extra pass at the discovery stage and it
sets the class of the allocno to EXTRA_FP_REGS, like with the GCC 4.6 series,
so a simple workaround is

Index: gcc.target/sparc/combined-1.c
===
--- gcc.target/sparc/combined-1.c   (revision 179316)
+++ gcc.target/sparc/combined-1.c   (working copy)
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-O -mcpu=ultrasparc -mvis" } */
+/* { dg-options "-O2 -mcpu=ultrasparc -mvis" } */
  typedef short vec16 __attribute__((vector_size(8)));
  typedef int vec32 __attribute__((vector_size(8)));


Finally the couple of questions:

  1. Is it expected that the register class preference be dropped at -O1?

  2. Is it expected that a single insn be processed by 2 different mechanisms
that independently halve the initial cost of a hard register?



Sorry for the delay with the answer.  I missed this email.

About the 1st question.  Before GCC 4.7, the only class (allocno class) 
that could be used for coloring was a cover class.  So it was not possible 
to use GENERAL_OR_EXTRA_FP_REGS in GCC 4.6 and older versions.  Starting 
with GCC 4.7, the class used for coloring can be any class which is more 
profitable than memory.  However, the cost calculations are inaccurate at 
-O1 because only one pass is used for them (it is a very expensive pass).  
To get better cost evaluations, more passes should be used.  But again, we 
don't do more than 2 passes because even one pass is not cheap.


In brief, I don't see anything wrong with the class calculation being 
different for -O1 and -O2.


About the 2nd question.  That seems wrong to me.  I'd remove the function 
process_bb_node_for_hard_reg_moves and its call from 
setup_allocno_cover_class_and_costs, because the function 
process_regs_for_copy is more accurate (it works with subregs).  
However, I might be missing something here.  There have been a lot of 
problems with, and tunings of, the cost calculation code.  Generated code 
*performance* (and even generation of *valid* code) is very sensitive to 
changes in ira-costs.c.  So even if such a change looks obvious, a lot of 
testing and benchmarking should be done.  I could do that, but it would 
take a week or two before committing such a change, if everything is OK.





[Ada] Checks fail on right operand of "and" and "or" with Short_Circuit_And_Or

2011-10-14 Thread Arnaud Charlet
When the pragma Short_Circuit_And_Or is used, no part of the right operand
of an "and" or "or" operator should be executed if the left operand would
short-circuit the evaluation of the corresponding "and then" or "or else".
However, run-time checks associated with such operands were being evaluated
unconditionally, because they were placed before the condition prior to the
rewriting as short-circuit forms during expansion. This is corrected by doing the
rewrite during analysis of the logical operators rather than waiting until
expansion.
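
The difference in evaluation can be shown with a rough C analogy, where
& stands in for Ada's "and" (both operands evaluated) and && for
"and then".  This is only an analogy and not how GNAT implements the
pragma; the counter and function names are invented for illustration.

```c
/* Counts how often the right operand, with its check, is evaluated.  */
static int checks_run;

/* Stands in for a right operand that carries a run-time check.  */
static int
right_operand (int divisor)
{
  checks_run++;
  return divisor != 0 && 100 / divisor > 1;
}

/* Like Ada "and": the right operand is always evaluated.  */
static int
eager_and (int left, int divisor)
{
  return (left != 0) & right_operand (divisor);
}

/* Like Ada "and then": the right operand is skipped when LEFT is false.  */
static int
short_circuit_and (int left, int divisor)
{
  return left != 0 && right_operand (divisor);
}
```

Both functions return the same value, but only the short-circuit form
leaves the check unexecuted when the left operand is false.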

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-14  Gary Dismukes  

* exp_ch4.adb (Expand_N_Op_And): Remove Short_Circuit_And_Or
expansion code (moved to sem_res) (Expand_N_Op_Or): Remove
Short_Circuit_And_Or expansion code (moved to sem_res).
* sem_res.adb (Resolve_Logical_Op): Add code to rewrite Boolean
"and" and "or" operators as short-circuit "and then" and "or
else", when pragma Short_Circuit_And_Or is active.

Index: sem_res.adb
===
--- sem_res.adb (revision 179984)
+++ sem_res.adb (working copy)
@@ -7356,6 +7356,48 @@
  Check_For_Visible_Operator (N, B_Typ);
   end if;
 
+  --  Replace AND by AND THEN, or OR by OR ELSE, if Short_Circuit_And_Or
+  --  is active and the result type is standard Boolean (do not mess with
+  --  ops that return a nonstandard Boolean type, because something strange
+  --  is going on).
+
+  --  Note: you might expect this replacement to be done during expansion,
+  --  but that doesn't work, because when the pragma Short_Circuit_And_Or
+  --  is used, no part of the right operand of an "and" or "or" operator
+  --  should be executed if the left operand would short-circuit the
+  --  evaluation of the corresponding "and then" or "or else". If we left
+  --  the replacement to expansion time, then run-time checks associated
+  --  with such operands would be evaluated unconditionally, due to being
+  --  before to the condition prior to the rewriting as short-circuit forms
+  --  during expansion.
+
+  if Short_Circuit_And_Or
+and then B_Typ = Standard_Boolean
+and then Nkind_In (N, N_Op_And, N_Op_Or)
+  then
+ if Nkind (N) = N_Op_And then
+Rewrite (N,
+  Make_And_Then (Sloc (N),
+Left_Opnd  => Relocate_Node (Left_Opnd (N)),
+Right_Opnd => Relocate_Node (Right_Opnd (N;
+Analyze_And_Resolve (N, B_Typ);
+
+ --  Case of OR changed to OR ELSE
+
+ else
+Rewrite (N,
+  Make_Or_Else (Sloc (N),
+Left_Opnd  => Relocate_Node (Left_Opnd (N)),
+Right_Opnd => Relocate_Node (Right_Opnd (N;
+Analyze_And_Resolve (N, B_Typ);
+ end if;
+
+ --  Return now, since analysis of the rewritten ops will take care of
+ --  other reference bookkeeping and expression folding.
+
+ return;
+  end if;
+
   Resolve (Left_Opnd (N), B_Typ);
   Resolve (Right_Opnd (N), B_Typ);
 
Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 179984)
+++ exp_ch4.adb (working copy)
@@ -5579,27 +5579,11 @@
  Expand_Boolean_Operator (N);
 
   elsif Is_Boolean_Type (Etype (N)) then
+ Adjust_Condition (Left_Opnd (N));
+ Adjust_Condition (Right_Opnd (N));
+ Set_Etype (N, Standard_Boolean);
+ Adjust_Result_Type (N, Typ);
 
- --  Replace AND by AND THEN if Short_Circuit_And_Or active and the
- --  type is standard Boolean (do not mess with AND that uses a non-
- --  standard Boolean type, because something strange is going on).
-
- if Short_Circuit_And_Or and then Typ = Standard_Boolean then
-Rewrite (N,
-  Make_And_Then (Sloc (N),
-Left_Opnd  => Relocate_Node (Left_Opnd (N)),
-Right_Opnd => Relocate_Node (Right_Opnd (N;
-Analyze_And_Resolve (N, Typ);
-
- --  Otherwise, adjust conditions
-
- else
-Adjust_Condition (Left_Opnd (N));
-Adjust_Condition (Right_Opnd (N));
-Set_Etype (N, Standard_Boolean);
-Adjust_Result_Type (N, Typ);
- end if;
-
   elsif Is_Intrinsic_Subprogram (Entity (N)) then
  Expand_Intrinsic_Call (N, Entity (N));
 
@@ -7535,27 +7519,11 @@
  Expand_Boolean_Operator (N);
 
   elsif Is_Boolean_Type (Etype (N)) then
+ Adjust_Condition (Left_Opnd (N));
+ Adjust_Condition (Right_Opnd (N));
+ Set_Etype (N, Standard_Boolean);
+ Adjust_Result_Type (N, Typ);
 
- --  Replace OR by OR ELSE if Short_Circuit_And_Or active and the type
- --  is standard Boolean (do not mess with AND that uses a non-standard
- --  Boolean type, be

Re: [testsuite] require arm_little_endian in two tests

2011-10-14 Thread Richard Earnshaw
On 14/10/11 11:42, Julian Brown wrote:
> On Thu, 13 Oct 2011 16:12:17 +0100
> Richard Earnshaw  wrote:
> 
>> On 13/10/11 15:56, Joseph S. Myers wrote:
>>> Indeed, vector initializers are part of the target-independent GNU
>>> C language and have target-independent semantics that the elements
>>> go in memory order, corresponding to the target-independent
>>> semantics of lane numbers where they appear in GENERIC, GIMPLE and
>>> (non-UNSPEC) RTL and any target-independent built-in functions that
>>> use such numbers.  (The issue here being, as you saw, that the lane
>>> numbers used in ARM-specific NEON intrinsics are for big-endian not
>>> the same as those used in target-independent features of GNU C and
>>> target-independent internal representations in GCC - hence various
>>> code to translate them between the two conventions when processing
>>> intrinsics into non-UNSPEC RTL, and to translate back when
>>> generating assembly instructions that encode lane numbers with the
>>> ARM conventions, as expounded at greater length at
>>> .)
>>>
>>
>> This is all rather horrible, and leads to THREE different layouts for
>> a 128-bit vector for big-endian Neon.
>>
>> GCC format
>> 'VLD1.n' format
>> 'ABI' format
>>
>> GCC format and 'ABI' format differ in that the 64-bit words of the
>> 128-bit vector are swapped.
>>
>> All this and they are all expected to share a single machine mode.
>>
>> Furthermore, the definitions in GCC are broken, in that the types
>> defined in arm_neon.h (eg int8x16_t) are supposed to be ABI format,
>> not GCC format.
>>
>> Eukk! :-(
> 
> FWIW, I thought long and hard about this problem, and eventually gave
> up trying to solve it. Note that many operations which depend on the
> ordering of vectors are now disabled entirely (at least for Q regs) in
> neon.md in big-endian mode to try and limit the damage. NEON is
> basically only supported properly in little-endian mode, IMO.
> 
> I'd love to see this resolved properly. Some random observations:
> 
>  * The vectorizer can use whatever layout it wants for vectors in
>either endianness. Vectorizer vectors never interact with either
>GCC generic (source-level) vectors, nor the NEON intrinsics. Also
>they never cross ABI boundaries.
> 
>  * GCC generic vectors aren't specified very formally, particularly wrt.
>their interaction with NEON intrinsics. If you stick *entirely* to
>accessing vectors via NEON intrinsics, the problems in big-endian
>mode (I think) don't ever materialise. This includes not using
>indirection to load/store vectors, and (of course) not constructing
>vectors using { x, y, z... } syntax. One possibility might be to
>detect and *disallow* code which attempts to mix vector operations
>like that.
> 
> I don't quite understand your comment about the GCC definitions of
> int8x16_t etc. being broken, tbh...
> 

The 128-bit vectors are loaded as a pair of D regs, with D<n> holding
the lower addressed D-word and D<n+1> holding the higher addressed
D-word; but these are treated in a Q reg as {D<n+1>:D<n>}. On a
big-endian machine that means D<n> contains the most significant lanes
of the vector and D<n+1> the least significant lanes.  For a big-endian
view we really need to see these as {D<n>:D<n+1>} (read {a:b} as
bit-wise concatenation of a and b).

One way we might address this is to redefine our 128-bit vector types as
structs of low/high Dwords.  Each Dword remains a vector (apart from
64-bit lane types), but the Dword order then matches the ABI
specification correctly.  For example, the definition of uint8x16_t becomes

typedef struct { uint8x8_t _val[2]; } uint8x16_t;

that is we consider this to be a pair of 64-bit vectors.  Obviously
there would be similar definitions for the other vector types.  This
then gives the correct view on the world because D<n> is always _val[0]
and D<n+1> is always _val[1].

Secondly, all vector loads/stores should really be changed to use
vld1.64 (with {d<n>, d<n+1>} as the register list for 128-bit accesses)
rather than vldm; this then sorts out any issues with unaligned accesses
without changing the memory format.
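
The memory layout implied by the struct definition can be checked with a
portable C sketch.  The sketch_ types below are stand-ins built from
plain byte arrays, not the real arm_neon.h types, and the helper name is
made up; the sketch only shows that _val[0] is the lower-addressed
D-word and _val[1] the higher-addressed one regardless of endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 64-bit vector of eight 8-bit lanes.  */
typedef struct { uint8_t lane[8]; } sketch_uint8x8_t;

/* The proposed shape: a 128-bit vector as a pair of 64-bit vectors.  */
typedef struct { sketch_uint8x8_t _val[2]; } sketch_uint8x16_t;

/* Byte offset of D-word IDX (0 or 1) within the 128-bit type.  */
static size_t
dword_offset (int idx)
{
  return offsetof (sketch_uint8x16_t, _val)
         + (size_t) idx * sizeof (sketch_uint8x8_t);
}
```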

> Cheers,
> 
> Julian
> 




[Ada] Aliasing and objects in extended return statements

2011-10-14 Thread Arnaud Charlet
AI05-0053 forbids the use of the aliased keyword in the object declaration of an
extended return statement. This avoids semantic complications with return
objects that are not of an immutably limited type, and which therefore are not
necessarily built in place.

Compiling the following in -gnat12 mode must yield:

   illegal_alias.adb:15:26: "aliased" not allowed in extended return

Compiling it in -gnat05 mode must yield:

illegal_alias.adb:15:26: warning:
"aliased" not allowed in extended return in Ada2012

---
procedure Illegal_Alias is
   type Outer;
   type Inner (Ref : access Outer) is limited null record;
   type Outer is limited
  record
  Self : Inner (Outer'Access);
  Data : Integer := 0;
  end record;

   type Vector is array (Positive range <>) of Outer;
   subtype S is Vector (1 .. 10);

   function F return Vector is
   begin
  return X : aliased Vector := S'(others => <>); --  ERROR
   end F;

begin
   null;
end Illegal_Alias;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-14  Ed Schonberg  

* par-ch6.adb (P_Return_Object_Declaration): In Ada 2012 mode,
reject an aliased keyword on the object declaration of an extended
return statement. In older versions of the language indicate
that this is illegal in the standard.

Index: par-ch6.adb
===
--- par-ch6.adb (revision 179984)
+++ par-ch6.adb (working copy)
@@ -1677,6 +1677,14 @@
  Scan; -- past ALIASED
  Set_Aliased_Present (Decl_Node);
 
+ if Ada_Version < Ada_2012 then
+Error_Msg_SC -- CODEFIX
+  ("ALIASED not allowed in extended return in Ada2012?");
+ else
+Error_Msg_SC -- CODEFIX
+  ("ALIASED not allowed in extended return");
+ end if;
+
  if Token = Tok_Constant then
 Scan; -- past CONSTANT
 Set_Constant_Present (Decl_Node);


Re: [testsuite] require arm_little_endian in two tests

2011-10-14 Thread Joseph S. Myers
On Fri, 14 Oct 2011, Julian Brown wrote:

>  * The vectorizer can use whatever layout it wants for vectors in
>either endianness. Vectorizer vectors never interact with either
>GCC generic (source-level) vectors, nor the NEON intrinsics. Also
>they never cross ABI boundaries.

I don't think it makes sense to refer to the vectorizer as using a layout.  
The vectorizer transforms GIMPLE to GIMPLE, and both the input and output 
GIMPLE have target-independent semantics that may be relied upon anywhere 
that processes GIMPLE (meaning the transformations should be valid as 
target-independent transformations of GIMPLE), except insofar as built-in 
functions are used.  Of course which transformations are made depends on 
what operations can be implemented efficiently on the target processor.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch,AVR] Print no-return functions as JMP

2011-10-14 Thread Richard Henderson
On 10/13/2011 11:31 PM, Georg-Johann Lay wrote:
> Richard Henderson schrieb:
>> On 10/13/2011 12:00 PM, Georg-Johann Lay wrote:
>>
>>> What do you propose?
>>>
>>> o A command line option that is on per default like
>>>  -mnoreturn-tail-calls or -mjmp-noreturn
>>
>> The command-line-option.  I think I prefer -mjump-noreturn,
>> as the inverse -mno-noreturn-tail-calls is too awkward.
> 
> What about flag_optimize_sibling_calls?
> What we are seeing here is actually a tail call.

Because we explicitly don't tail call noreturn functions, for the
reason previously explained.


r~


Re: [Patch, Fortran] PR50718 Fix -fcheck=pointer 4.6/4.7 regression

2011-10-14 Thread Tobias Burnus

On 10/14/2011 12:19 PM, Tobias Burnus wrote:
while testing my constructor draft patch with FGSL, I found two bugs 
(4.6/4.7 regressions): PR target/50721 (segfault on x86-64 after 
execution - one of the rare -O0 only bugs)


That bug turned out to be an out-of-bounds problem between Fortran and C, 
which is very difficult to diagnose; the fact that the stack was destroyed 
didn't help. As Bernd pointed out, the bug is in the Fortran program 
(FGSL's poly.f90) and not in the compiler.


Regarding the true GCC bug: The patch for -fcheck=pointer was approved 
off list by Janus. I have committed the trunk version as Rev. 179988. I 
will soon backport it for 4.6 - in time for the 4.6.2 release, which is 
to be expected next week.


If you think any other 4.6 regression should be fixed in 4.6.2: Well, 
you have the whole weekend to work on a patch :-)


Tobias


Re: [PATCH] negv{32qi,16hi,8si,4di}

2011-10-14 Thread Richard Henderson
On 10/14/2011 07:19 AM, Jakub Jelinek wrote:
> 2011-10-14  Jakub Jelinek  
> 
>   * config/i386/sse.md (neg2): Use VI_AVX2 iterator instead
>   of VI_128.

Ok.


r~


Re: [testsuite] require arm_little_endian in two tests

2011-10-14 Thread Joseph S. Myers
On Fri, 14 Oct 2011, Richard Earnshaw wrote:

> One way we might address this is to redefine our 128-bit vector types as
> structs of low/high Dwords.  Each Dword remains a vector (apart from
> 64-bit lane types), but the Dword order then matches the ABI
> specification correctly.  For example, the definition of uint8x16_t becomes
> 
>   typedef struct { uint8x8_t _val[2]; } uint8x16_t;

Those types have different ABIs for argument passing and return, so you'd 
need some magic for special handling of the uint8x16_t type as defined in 
the header.

> Secondly, all vector loads/stores should really be changed to use
> vld1.64 (with {d<n>, d<n+1>} as the register list for 128-bit accesses)
> rather than vldm; this then sorts out any issues with unaligned accesses
> without changing the memory format.

vld1 runs into problems for big-endian of not being able to do core 
register loads / stores / transfers between core and NEON registers that 
way, and needing to convert to the other format for argument passing / 
return.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Predication during scheduling

2011-10-14 Thread Vladimir Makarov

On 10/13/2011 05:01 PM, Bernd Schmidt wrote:

On 09/30/11 17:29, Bernd Schmidt wrote:

This patch allows a backend to set a new scheduler flag, DO_PREDICATION,
which will make the haifa scheduler try to move insns across jumps by
predicating them. On C6X, the primary benefit is to fill jump delay slots.

Ping.

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02053.html



It is hard to read the patch without function names.

As I understand it, the changes to tree/sra.c are in the patch by accident.

Thanks for additional scheduler code cleaning.

The scheduler part of the patch is OK with me (the changes to other parts are
obvious).  Could you please commit it only at the beginning of next week?


Thanks, Bernd.


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-14 Thread H.J. Lu
On Thu, Oct 13, 2011 at 11:51 PM, Paolo Bonzini  wrote:
> On 10/13/2011 10:07 PM, H.J. Lu wrote:
>>
>> On Thu, Oct 13, 2011 at 11:15 AM, Richard Kenner
>>   wrote:

>>>> The answer to H.J.'s "Why do we do it for MEM then?" is simply
>>>> "because no one ever thought about not doing it"
>>>
>>> No, that's false.  The same expand_compound_operation /
>>> make_compound_operation
>>> pair is present in the MEM case as in the SET case.  It's just that
>>> there's some bug here that's noticeable in not making proper MEMs that
>>> doesn't show up in the SET case because of the way the insns are
>>> structured.
>>>
>>
>> When we have (and (OP) M) where
>>
>> (and (OP) M) == (and (OP) ((1<<  ceil_log2 (M)) - 1) ))
>>
>> (and (OP) M) is zero_extract bits 0 to ceil_log2 (M).
>>
>> Does it look OK?
>
> Yes, it does.  How did you test it?
>

There is a testcase at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50696

It passes with my patch.

-- 
H.J.


Re: New warning for expanded vector operations

2011-10-14 Thread Artem Shinkarov
On Fri, Oct 14, 2011 at 2:57 PM, Richard Guenther
 wrote:
> On Fri, Oct 14, 2011 at 3:42 PM, Artem Shinkarov
>  wrote:
>> On Thu, Oct 13, 2011 at 10:40 AM, Artem Shinkarov
>>  wrote:
>>> On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther
>>>  wrote:
>>>> On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump  wrote:
>>>>> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote:
>>>>>> This patch fixed PR50704.
>>>>>>
>>>>>> gcc/testsuite:
>>>>>>        * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target.
>>>>>>        * gcc.target/i386/warn-vect-op-1.c: Ditto.
>>>>>>        * gcc.target/i386/warn-vect-op-2.c: Ditto.
>>>>>>
>>>>>> Ok for trunk?
>>>>>
>>>>> Ok.  Is this x32 clean?  :-)  If not, HJ will offer an even better
>>>>> spelling.
>>>>
>>>> I suppose you instead want something like
>>>>
>>>> { dg-require-effective-target lp64 }
>>>>
>>>> ?
>>>>
>>>
>>> See our discussion with HJ here:
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704
>>> /* { dg-do compile { target { ! { ia32 } } } } */ was his idea.  As
>>> far as x32 sets UNITS_PER_WORD to 8, these tests should work fine.
>>>
>>> Artem.
>>>
>>
>> Ping.
>>
>> So can I commit the changes?
>
> Yes.
>
> Thanks,
> Richard.

Committed with 179991.

>>
>> Thanks,
>> Artem.
>>
>


[pph] Fix chain merging (issue5264044)

2011-10-14 Thread Diego Novillo

So, I had not fixed c1limits-externalid.cc.  I had simply messed up
merging decls into chains.

The problem here is that we insert decls into a chain *before* we
finish reading them.  So, when pph_merge_into_chain inserts the
partially read DECL into the chain (or returns an existing DECL), the
call to pph_in_tree_body proceeds to clobber DECL_CHAIN with whatever
value it had when we had generated that PPH file.

I tried not streaming DECL_CHAINs at all, but we chain decls outside
of "chain" contexts, so that breaks things too.  I've added some
documentation on that in the code.

What I ended up doing is saving the value of DECL_CHAIN after
insertion and restoring it if pph_in_tree_body clobbered it.
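
A minimal sketch of this save/restore idea, using a hypothetical singly linked
toy_decl type rather than real trees.  toy_read_body stands in for
pph_in_tree_body and, as an assumption made only for illustration, overwrites
the link with a stale value taken from the "file":

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl with a DECL_CHAIN-like link.  */
struct toy_decl
{
  int uid;
  int body;
  struct toy_decl *chain;
};

/* Stand-in for the body reader: fills in every field found in the
   "file", including the link that was valid when the file was written.  */
static void
toy_read_body (struct toy_decl *d, int body, struct toy_decl *stale_chain)
{
  d->body = body;
  d->chain = stale_chain;
}

/* Insert D at the head of *CHAIN before its body is read, then read the
   body and undo any clobbering of the link set up by the insertion.  */
static struct toy_decl *
toy_merge_then_read (struct toy_decl **chain, struct toy_decl *d,
                     int body, struct toy_decl *stale_chain)
{
  struct toy_decl *saved_chain;

  assert (chain != NULL && d != NULL);
  d->chain = *chain;
  *chain = d;
  saved_chain = d->chain;

  toy_read_body (d, body, stale_chain);

  /* Restore the link if the body reader overwrote it.  */
  if (d->chain != saved_chain)
    d->chain = saved_chain;
  return d;
}
```

Calling toy_merge_then_read twice on an initially empty chain should leave the
second decl at the head, linked to the first, regardless of the stale link the
reader tried to install.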

c1limits-externalid.cc is timing out because we are doing this N^2
search/insert with 100,000 symbols.  I've got some ideas on how to fix
that.  I'll be trying that next.

Tested on x86_64.  Committed to branch.


* pph-streamer-in.c (pph_in_mergeable_tree): Remove.  Update all users.
(pph_in_chain_1): Rename from pph_in_chain.
Add argument CHAIN.
(pph_in_chain): Call pph_in_chain_1.
(pph_in_mergeable_chain): Likewise.
(pph_in_tcc_declaration): Add documentation on why reading
DECL_CHAIN is necessary.
(pph_in_tree_1): Prevent getting DECL_CHAIN clobbered after
merging into CHAIN.
* pph-streamer-out.c (pph_out_tree_vec_1): Add argument UNCHAIN.
Update all users.
(pph_out_tcc_declaration): Add documentation on why writing
DECL_CHAIN is necessary.

testsuite/ChangeLog.pph:

* testsuite/g++.dg/pph/c1limits-externalid.cc: Restore timeout.

diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 5cdf4d5..23a28bf 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -524,15 +524,6 @@ pph_in_tree (pph_stream *stream)
 }
 
 
-/* Load an AST into CHAIN from STREAM.  */
-
-static void
-pph_in_mergeable_tree (pph_stream *stream, tree *chain)
-{
-  pph_in_tree_1 (stream, chain);
-}
-
-
 /** lexical elements */
 
 
@@ -711,13 +702,29 @@ pph_in_tree_pair_vec (pph_stream *stream)
 / chains */
 
 
-/* Read a chain of ASTs from STREAM.  */
+/* Read a chain of ASTs from STREAM.  If CHAIN is set, the ASTs are
+   incorporated at the head of *CHAIN as they are read.  */
+
 static tree
-pph_in_chain (pph_stream *stream)
+pph_in_chain_1 (pph_stream *stream, tree *chain)
 {
-  tree t = streamer_read_chain (stream->encoder.r.ib,
+  HOST_WIDE_INT i, count;
+
+  if (chain == NULL)
+return streamer_read_chain (stream->encoder.r.ib,
 stream->encoder.r.data_in);
-  return t;
+
+  count = pph_in_hwi (stream);
+  for (i = 0; i < count; i++)
+pph_in_tree_1 (stream, chain);
+
+  return *chain;
+}
+
+static tree
+pph_in_chain (pph_stream *stream)
+{
+  return pph_in_chain_1 (stream, NULL);
 }
 
 
@@ -726,11 +733,7 @@ pph_in_chain (pph_stream *stream)
 static void
 pph_in_mergeable_chain (pph_stream *stream, tree *chain)
 {
-  int i, count;
-
-  count = pph_in_hwi (stream);
-  for (i = 0; i < count; i++)
-pph_in_mergeable_tree (stream, chain);
+  pph_in_chain_1 (stream, chain);
 }
 
 
@@ -816,7 +819,7 @@ pph_match_to_link (tree expr, location_t where, const char *idstr, tree *link)
 
 static tree
 pph_search_in_chain (tree expr, location_t where, const char *idstr,
-   tree *chain)
+tree *chain)
 {
   /* FIXME pph: This could resultin O(POW(n,2)) compilation.  */
   tree *link = chain;
@@ -869,6 +872,7 @@ pph_merge_into_chain (pph_stream *stream, tree expr, tree *chain)
 
   if (flag_pph_debug >= 3)
 fprintf (pph_logfile, "PPH: %s FOUND on chain\n", idstr);
+
   return found;
 }
 
@@ -1629,8 +1633,11 @@ pph_in_tcc_declaration (pph_stream *stream, tree decl)
   pph_in_lang_specific (stream, decl);
   DECL_INITIAL (decl) = pph_in_tree (stream);
 
-  /* The tree streamer only writes DECL_CHAIN for PARM_DECL nodes.  */
-  /* FIXME pph: almost redundant.  */
+  /* The tree streamer only writes DECL_CHAIN for PARM_DECL nodes.
+ We need to read DECL_CHAIN for variables and functions because
+ they are sometimes chained together in places other than regular
+ tree chains.  For example in BINFO_VTABLEs, the decls are chained
+ together).  */
   if (TREE_CODE (decl) == VAR_DECL
   || TREE_CODE (decl) == FUNCTION_DECL)
 DECL_CHAIN (decl) = pph_in_tree (stream);
@@ -1934,6 +1941,7 @@ pph_in_tree_1 (pph_stream *stream, tree *chain)
   enum pph_record_marker marker;
   unsigned image_ix, ix;
   enum LTO_tags tag;
+  tree saved_expr_chain = NULL;
 
   /* Read record start and test cache.  */
   marker = pph_in_start_record (stream, &image_ix, &ix, PPH_any_tree);
@@ -1967,9 +1975,20 @@ pph_in_tree_1 (pph_stream *stream, tree *chain)
   /* Materialize a new node from STREAM.  This 

Re: [PATCH] sel-sched: forbid differing modes in substitution (PR 50205)

2011-10-14 Thread Vladimir Makarov

On 09/07/2011 05:38 AM, Alexander Monakov wrote:

Hello,

The patch repairs a problem when we attempt to substitute an insn like
(... (cmp (mem (reg:DI ax)) (reg:SI ax))) (note different modes) through
(set (reg:DI ax) (reg:DI dx)), which leaves the (reg:SI ax) part of the
comparison intact, causing an ICE later on when we notice that the dependency
on ax is still present.  As this is quite rare, we can simply forbid
substitution in such circumstances, much like substitution of multiple hard
reg references is forbidden now.

En passant, the patch simplifies the code a little, as we never try to
substitute anything but registers.

Bootstrapped and regtested on x86_64-linux with sel-sched enabled at -O2, OK?
(I'll add the testcase from Bugzilla when committing)


Ok.  Thanks for the patch.

2011-09-07  Alexander Monakov

PR rtl-optimization/50205
* sel-sched.c (count_occurrences_1): Simplify on the assumption that
p->x is a register.  Forbid substitution when the same register is
found in a different mode.
(count_occurrences_equiv): Assert that 'what' is a register.





Re: [PATCH] sel-sched: fix merging of LHS reg availability (PR 50340)

2011-10-14 Thread Vladimir Makarov

On 09/13/2011 12:42 PM, Alexander Monakov wrote:

Fixed as follows, bootstrapped and regtested on x86_64-linux and ia64-linux
(without java, with one recent SRA patch reverted to unbreak bootstrap) with
sel-sched enabled at -O2.  OK for trunk?


Ok with a small code format change below.

(a small testcase is not available at the moment, but I can try to produce one
using delta before committing)


2011-09-13  Andrey Belevantsev

* sel-sched-ir.c (update_target_availability): LHS register
availability is not known if the unavailable LHS of the other
expression is a different register.

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index 4878460..b132392 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -1746,7 +1746,13 @@ update_target_availability (expr_t to, expr_t from, insn_t split_point)
  EXPR_TARGET_AVAILABLE (to) = -1;
  }
else
-EXPR_TARGET_AVAILABLE (to)&= EXPR_TARGET_AVAILABLE (from);

Please put else and if on the same line, with proper indentation.

+if (EXPR_TARGET_AVAILABLE (from) == 0
+&&  EXPR_LHS (from)
+&&  REG_P (EXPR_LHS (from))
+&&  REGNO (EXPR_LHS (to)) != REGNO (EXPR_LHS (from)))
+  EXPR_TARGET_AVAILABLE (to) = -1;
+else
+  EXPR_TARGET_AVAILABLE (to)&= EXPR_TARGET_AVAILABLE (from);
  }
  }




Re: Intrinsics for N2965: Type traits and base classes

2011-10-14 Thread Jason Merrill

On 10/13/2011 01:35 PM, Michael Spertus wrote:

+int main() {
+  assert(typeid(b::type)
+ == typeid(types));
+  assert(typeid(db::type) == typeid(types));
+  assert(typeid(db::type) == typeid(types<>));
+  return 0;
+}


Let's make this a compile-time test using something like

template <class T, class U> struct assert_same_type;
template <class T> struct assert_same_type<T,T> {};

Jason


Re: Predication during scheduling

2011-10-14 Thread Bernd Schmidt
On 10/14/11 17:35, Vladimir Makarov wrote:
> On 10/13/2011 05:01 PM, Bernd Schmidt wrote:
>> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02053.html
>>
>>
> It is hard to read the patch without function names.

Oh, you mean without -p? Not sure how that happened, svn diff seems to
add them when run on my machine. I may have run it on gcc60 during ia64
testing.

> As I understand it, the changes to tree/sra.c are in the patch by accident.

Yes - it was necessary to get ia64 to bootstrap.

> Thanks for additional scheduler code cleaning.
> 
> The scheduler part of the patch is OK with me (the changes to other parts are
> obvious).  Could you please commit it only at the beginning of next week?

Thank you!


Bernd


[google] AddressSanitizer for gcc, first attempt. (issue5272048)

2011-10-14 Thread Kostya Serebryany
Index: tree-asan.c
===
--- tree-asan.c (revision 0)
+++ tree-asan.c (revision 0)
@@ -0,0 +1,512 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "tm_p.h"
+#include "basic-block.h"
+#include "flags.h"
+#include "function.h"
+#include "tree-inline.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "tree-pass.h"
+#include "diagnostic.h"
+#include "demangle.h"
+#include "langhooks.h"
+#include "ggc.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "tree-asan.h"
+
+/*
+ AddressSanitizer finds out-of-bounds and use-after-free bugs 
+ with <2x slowdown on average.
+
+ The tool consists of two parts:
+ instrumentation module (this file) and a run-time library.
+ The instrumentation module adds a run-time check before every memory insn.
+   For an 8-byte load accessing address X:
+ ShadowAddr = (X >> 3) + Offset
+ ShadowValue = *(char*)ShadowAddr;
+ if (ShadowValue)
+   __asan_report_load8(X);
+   For a load of N bytes (N=1, 2 or 4) from address X:
+ ShadowAddr = (X >> 3) + Offset
+ ShadowValue = *(char*)ShadowAddr;
+ if (ShadowValue)
+   if ((X & 7) + N - 1 > ShadowValue)
+ __asan_report_loadN(X);
+ Stores are instrumented similarly, but using __asan_report_storeN functions.
+ A call to __asan_init() is inserted into the list of module CTORs.
+
+ The run-time library redefines malloc (so that redzones are inserted around
+ the allocated memory) and free (so that reuse of free-ed memory is delayed),
+ and provides __asan_report* and __asan_init functions.
+ 
+ Read more:
+ http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
+
+ Implementation details:
+ This is my first code in gcc. I wrote it by copying tree-mudflap.c,
+ stripping 70% of irrelevant code and modifying the instrumentation routine
+ build_check_stmt. The code seems to work, but I don't feel I understand it.
+ In particular, transform_derefs() and transform_statements() seem too complex.
+ Suggestions are welcome on how to simplify them.
+ (All I need is to traverse *all* memory accesses and instrument them).
+
+ Future work:
+ The current implementation supports only detection of out-of-bounds and
+ use-after-free bugs in heap.
+ In order to support out-of-bounds for stack and globals we will need
+ to create redzones for stack and global object and poison them.
+*/
+
+/* The shadow address is computed as (X>>asan_scale) + (1dest;
+
+  /* A recap at this point: join_bb is the basic block at whose head
+ is the gimple statement for which this check expression is being
+ built.  cond_bb is the (possibly new, synthetic) basic block the
+ end of which will contain the cache-lookup code, and a
+ conditional that jumps to the cache-miss code or, much more
+ likely, over to join_bb.  */
+
+  /* Create the bb that contains the crash block.  */
+  then_bb = create_empty_bb (cond_bb);
+  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+
+  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
+  e = find_edge (cond_bb, join_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = REG_BR_PROB_BASE;
+
+  /* Update dominance info.  Note that bb_join's data was
+ updated by split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+{
+  set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+  set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+}
+
+  base_addr = make_rename_temp (uintptr_type, "base_addr");
+
+  seq = gimple_seq_alloc ();
+  t = fold_convert_loc (location, uintptr_type,
+unshare_expr (base));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  g = gimple_build_assign (base_addr, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
+  build_int_cst(uintptr_type, asan_scale)
+ );
+  t 
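
The check described in the header comment of the patch can be modelled in
plain C.  A rough sketch, assuming a toy shadow array indexed from zero (so
Offset is 0) and assuming the encoding in which a shadow value k in 1..7 means
that only the first k bytes of the 8-byte granule are addressable, while a
negative value poisons the whole granule.  Under that assumed encoding the
partial-granule test needs >=, whereas the comment as posted writes >.
toy_asan_would_report is a hypothetical name, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if an access of SIZE bytes (1, 2, 4 or 8) at toy address ADDR
   would be reported, 0 otherwise.  SHADOW holds one byte per 8 bytes of
   toy memory.  Like the scheme in the comment, only the granule that
   contains ADDR is consulted.  */
static int
toy_asan_would_report (const int8_t *shadow, uintptr_t addr, unsigned size)
{
  int8_t shadow_value;

  assert (size == 1 || size == 2 || size == 4 || size == 8);
  shadow_value = shadow[addr >> 3];

  /* Zero means the whole granule is addressable.  */
  if (shadow_value == 0)
    return 0;

  /* An 8-byte access needs the whole granule.  */
  if (size == 8)
    return 1;

  /* Otherwise compare the offset of the last accessed byte with the
     number of addressable bytes (assumed encoding, see above).  */
  return (int) ((addr & 7) + size - 1) >= shadow_value;
}
```

With a shadow of { 0, 4, -1 }, a 4-byte access at address 8 stays inside the
four addressable bytes of the second granule, while the same access at
address 9 would be reported.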

Re: [PATCH] AVX2 vec_widen_[su]mult_{hi,lo}*, sdot_prod* and udot_prod*

2011-10-14 Thread Richard Henderson
On 10/14/2011 07:18 AM, Jakub Jelinek wrote:
> +  /* This would be 2 insns shorter if
> + rperm[i] = GEN_INT (((~i & 1) << 2) + i / 2);
> + has been used instead (both vpslrq insns wouldn't be needed),
> + but vec_widen_*mult_hi_* is usually used together with
> + vec_widen_*mult_lo_* and by writing it this way the load
> + of the constant and the two vpermd instructions (cross-lane)
> + can be CSEd together.  */
> +  for (i = 0; i < 8; ++i)
> +rperm[i] = GEN_INT (((i & 1) << 2) + i / 2);
> +  vperm = gen_rtx_CONST_VECTOR (V8SImode, gen_rtvec_v (8, rperm));
> +  vperm = force_reg (V8SImode, vperm);
> +  emit_insn (gen_avx2_permvarv8si (t1, vperm, operands[1]));
> +  emit_insn (gen_avx2_permvarv8si (t2, vperm, operands[2]));
> +  emit_insn (gen_lshrv4di3 (gen_lowpart (V4DImode, t3),
> + gen_lowpart (V4DImode, t1), GEN_INT (32)));
> +  emit_insn (gen_lshrv4di3 (gen_lowpart (V4DImode, t4),
> + gen_lowpart (V4DImode, t2), GEN_INT (32)));
> +  emit_insn (gen_avx2_mulv4siv4di3 (operands[0], t3, t4));

So what you're doing here is the low-part permutation:

0 4 1 5 2 6 3 7

followed by a shift to get

4 . 5 . 6 . 7 .

But you need to load a 256-bit constant from memory to get it.

I wonder if it wouldn't be better to use VPERMQ to handle the lane change:

0   2   1   3
0 1 4 5 2 3 6 7

shared between the hi/lo, and a VPSHUFD to handle the in-lane ordering:

0 0 1 1 2 2 3 3
4 4 5 5 6 6 7 7

In the end we get 2+(2+2)=6 insns as setup prior to the VPMULDQs, as compared
to your 1+2+(0+2)=5 insns, but no need to wait for the constant load.  Of 
course, if the constant load gets hoisted out of the loop, yours will likely
win on throughput.
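
The selector order quoted above can be checked with a few lines of C.  A small
sketch (toy_widen_mult_perm is a hypothetical helper, not from the patch) that
evaluates both index formulas from the quoted comment; the formula the patch
uses yields the 0 4 1 5 2 6 3 7 order, and the "2 insns shorter" alternative
yields 4 0 5 1 6 2 7 3:

```c
#include <assert.h>

/* Fill OUT with the eight vpermd selector indices.  ALT == 0 uses the
   formula from the patch; ALT != 0 uses the alternative mentioned in
   its comment.  */
static void
toy_widen_mult_perm (int alt, int out[8])
{
  int i;

  assert (out != 0);
  for (i = 0; i < 8; ++i)
    {
      /* Odd (or, for ALT, even) positions select from the upper lane.  */
      int lane_bit = alt ? (~i & 1) : (i & 1);
      out[i] = (lane_bit << 2) + i / 2;
    }
}
```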

Thoughts, Uros and those looking in from Intel?

Otherwise it looks ok.


r~


Re: [PATCH, Atom] Fix performance regression with -mtune=atom

2011-10-14 Thread H.J. Lu
On Fri, Oct 14, 2011 at 7:15 AM, Vladimir Yakovlev  wrote:
> Could anyone checkin that?

Please provide a suitable patch which can be applied.

H.J.

> Thanks,
> Vladimir
>
> 2011/10/14 Uros Bizjak :
>> Hello!
>>
>>> This is a ping. Change affects Atom only and was made because it
>>> really gives better performance on this architecture. This fact
>>> actually leads to the thought that old value is just a simple
>>> misprint.
>>>
>>> > This patch fixes performance regression with -mtune=atom. Changing
>>> > atom cost removes regression in several tests of EEMBC and spec2000.
>>> > Bootstrap and make check OK both with and without -mtune=atom.
>>> > OK for trunk?
>>> >
>>> > 2011-09-30  Yakovlev Vladimir  vladimir.b.yakov...@intel.com
>>> >
>>> >      * gcc/config/i386/i386.c (atom_cost): Changed cost for loading
>>> >       QImode using movzbl.
>>
>> OK.
>>
>> Thanks,
>> Uros.
>>
>



-- 
H.J.


Re: [testsuite] require arm_little_endian in two tests

2011-10-14 Thread Richard Earnshaw
On 14/10/11 16:21, Joseph S. Myers wrote:
> On Fri, 14 Oct 2011, Richard Earnshaw wrote:
> 
>> One way we might address this is to redefine our 128-bit vector types as
>> structs of low/high Dwords.  Each Dword remains a vector (apart from
>> 64-bit lane types), but the Dword order then matches the ABI
>> specification correctly.  For example, the definition of uint8x16_t becomes
>>
>>  typedef struct { uint8x8_t _val[2]; } uint8x16_t;
> 
> Those types have different ABIs for argument passing and return, so you'd 
> need some magic for special handling of the uint8x16_t type as defined in 
> the header
> 

Yes, it's not a simple substitution, but it more correctly describes the
data type that the architecture supports.  It might be necessary to
create a special internal type to distinguish it from user types that
are equivalent.


>> Secondly, all vector loads/stores should really be changed to use
>> vld1.64 (with {d, d} as the register list for 128-bit accesses)
>> rather than vldm; this then sorts out any issues with unaligned accesses
>> without changing the memory format.
> 
> vld1 runs into problems for big-endian of not being able to do core 
> register loads / stores / transfers between core and NEON registers that 
> way, and needing to convert to the other format for argument passing / 
> return.
> 

Note that I said vld1.64 (not vld1..  That has the same
memory format as vldm, except that it can also deal with unaligned accesses.



Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-14 Thread Paolo Bonzini

On 10/14/2011 05:36 PM, H.J. Lu wrote:

There is a testcase at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50696

It passes with my patch.


Cool, so let's wait for the results of testing.

Paolo


Re: [Patch,AVR] Print no-return functions as JMP

2011-10-14 Thread Georg-Johann Lay
Richard Henderson schrieb:
> On 10/13/2011 11:31 PM, Georg-Johann Lay wrote:
>> Richard Henderson schrieb:
>>> On 10/13/2011 12:00 PM, Georg-Johann Lay wrote:
>>>
>>>> What do you propose?
>>>>
>>>> o A command line option that is on by default, like
>>>>   -mnoreturn-tail-calls or -mjmp-noreturn
>>> The command-line-option.  I think I prefer -mjump-noreturn,
>>> as the inverse -mno-noreturn-tail-calls is too awkward.
>> What about flag_optimize_sibling_calls?
>> What we are seeing here is actually a tail call.
> 
> Because we explicitly don't tail call noreturn for the
> reason previously explained.

Thanks for the explanation.

Here is the new patch for review with the new option -mjump-to-noreturn

Ok for trunk?

Johann

* doc/invoke.texi (AVR Options): Document -mjump-to-noreturn.

* config/avr/avr-protos.h (avr_out_call): New prototype.
* config/avr/avr.md (adjust_len): Add alternative "call".
(call_insn, call_value_insn): Use it.  Use avr_out_call to print
assembler.
* config/avr/avr.c (avr_out_call): New function.
(adjust_insn_length): Handle ADJUST_LEN_CALL.

* common/config/avr/avr-common.c (-mjump-to-noreturn): Turn on for
-O and higher.

Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 179993)
+++ doc/invoke.texi	(working copy)
@@ -487,7 +487,7 @@ Objective-C and Objective-C++ Dialects}.
 
 @emph{AVR Options}
 @gccoptlist{-mmcu=@var{mcu}  -mno-interrupts @gol
--mcall-prologues  -mtiny-stack  -mint8  -mstrict-X}
+-mcall-prologues  -mtiny-stack  -mint8  -mstrict-X -mjump-to-noreturn}
 
 @emph{Blackfin Options}
 @gccoptlist{-mcpu=@var{cpu}@r{[}-@var{sirevision}@r{]} @gol
@@ -10690,6 +10690,13 @@ and long long will be 4 bytes.  Please n
 comply to the C standards, but it will provide you with smaller code
 size.
 
+@item -mjump-to-noreturn
+@opindex mjump-to-noreturn
+Use a jump instruction instead of a call instruction when calling a
+no-return functions.  This option is active if optimization is turned
+on and just affects the way a call instruction is printed out.
+Besides that, it has no effect on code generation or debug information.
+
 @item -mstrict-X
 @opindex mstrict-X
 Use register @code{X} in a way proposed by the hardware.  This means
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 179992)
+++ config/avr/avr.md	(working copy)
@@ -133,11 +133,10 @@ (define_attr "length" ""
 ;; Following insn attribute tells if and how the adjustment has to be
 ;; done:
 ;; no No adjustment needed; attribute "length" is fine.
-;; yesAnalyse pattern in adjust_insn_length by hand.
 ;; Otherwise do special processing depending on the attribute.
 
 (define_attr "adjust_len"
-  "out_bitop, out_plus, addto_sp, tsthi, tstsi, compare,
+  "out_bitop, out_plus, addto_sp, tsthi, tstsi, compare, call,
mov8, mov16, mov32, reload_in16, reload_in32,
ashlqi, ashrqi, lshrqi,
ashlhi, ashrhi, lshrhi,
@@ -3634,21 +3633,12 @@ (define_insn "call_insn"
   ;; Operand 1 not used on the AVR.
   ;; Operand 2 is 1 for tail-call, 0 otherwise.
   ""
-  "@
-%!icall
-%~call %x0
-%!ijmp
-%~jmp %x0"
+  {
+return avr_out_call (insn, operands[0], 0 != INTVAL (operands[2]));
+  }
   [(set_attr "cc" "clobber")
-   (set_attr_alternative "length"
- [(const_int 1)
-  (if_then_else (eq_attr "mcu_mega" "yes")
-(const_int 2)
-(const_int 1))
-  (const_int 1)
-  (if_then_else (eq_attr "mcu_mega" "yes")
-(const_int 2)
-(const_int 1))])])
+   (set_attr "length" "1,*,1,*")
+   (set_attr "adjust_len" "*,call,*,call")])
 
 (define_insn "call_value_insn"
   [(parallel[(set (match_operand 0 "register_operand"   "=r,r,r,r")
@@ -3658,21 +3648,12 @@ (define_insn "call_value_insn"
   ;; Operand 2 not used on the AVR.
   ;; Operand 3 is 1 for tail-call, 0 otherwise.
   ""
-  "@
-%!icall
-%~call %x1
-%!ijmp
-%~jmp %x1"
+  {
+return avr_out_call (insn, operands[1], 0 != INTVAL (operands[3]));
+  }
   [(set_attr "cc" "clobber")
-   (set_attr_alternative "length"
- [(const_int 1)
-  (if_then_else (eq_attr "mcu_mega" "yes")
-(const_int 2)
-(const_int 1))
-  (const_int 1)
-  (if_then_else (eq_attr "mcu_mega" "yes")
-(const_int 2)
-(const_int 1))])])
+   (set_attr "length" "1,*,1,*")
+   (set_attr "adjust_len" "*,call,*,call")])
 
 (define_insn "nop"
   [(const_

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-14 Thread H.J. Lu
On Fri, Oct 14, 2011 at 9:23 AM, Paolo Bonzini  wrote:
> On 10/14/2011 05:36 PM, H.J. Lu wrote:
>>
>> There is a testcase at
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50696
>>
>> It passes with my patch.
>
> Cool, so let's wait for the results of testing.
>
> Paolo
>

Here is the complete patch with a testcase. I will
check it in if there are no performance regressions
with SPEC CPU 2K/2006 on Linux/ia32/x86-64/x32.

Thanks.

-- 
H.J.
---
gcc/

2011-10-13  H.J. Lu  

PR rtl-optimization/50696
* combine.c (make_compound_operation): Turn (and (OP) M) into
extraction if M is an extraction mask.

gcc/testsuite/

2011-10-13  H.J. Lu  

PR rtl-optimization/50696
* gcc.target/i386/pr50696.c: New.

diff --git a/gcc/combine.c b/gcc/combine.c
index 6c3b17c..4b57b88 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7739,16 +7739,6 @@ make_compound_operation (rtx x, enum rtx_code in_code)
  XEXP (XEXP (x, 0), 1)));
 	}
 
-  /* If the constant is one less than a power of two, this might be
-	 representable by an extraction even if no shift is present.
-	 If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
-	 we are in a COMPARE.  */
-  else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0)
-	new_rtx = make_extraction (mode,
-			   make_compound_operation (XEXP (x, 0),
-			next_code),
-			   0, NULL_RTX, i, 1, 0, in_code == COMPARE);
-
   /* If we are in a comparison and this is an AND with a power of two,
 	 convert this into the appropriate bit extract.  */
   else if (in_code == COMPARE
@@ -7758,6 +7748,26 @@ make_compound_operation (rtx x, enum rtx_code in_code)
 			next_code),
 			   i, NULL_RTX, 1, 1, 0, 1);
 
+  /* If the constant is an extraction mask with the zero bits in
+	 the first operand ignored, this might be representable by an
+	 extraction even if no shift is present.  If it doesn't end up
+	 being a ZERO_EXTEND, we will ignore it unless we are in a
+	 COMPARE.  */
+  else
+	{
+	  unsigned HOST_WIDE_INT nonzero =
+	nonzero_bits (XEXP (x, 0), GET_MODE (XEXP (x, 0)));
+	  unsigned HOST_WIDE_INT mask = UINTVAL (XEXP (x, 1));
+	  unsigned HOST_WIDE_INT len = ceil_log2 (mask);
+	  if ((nonzero & (((unsigned HOST_WIDE_INT) 1 << len) - 1))
+	  == (nonzero & mask))
+	{
+	  new_rtx = make_compound_operation (XEXP (x, 0), next_code);
+	  new_rtx = make_extraction (mode, new_rtx, 0, NULL_RTX,
+	 len, 1, 0, in_code == COMPARE);
+	}
+	}
+
   break;
 
 case LSHIFTRT:
diff --git a/gcc/testsuite/gcc.target/i386/pr50696.c b/gcc/testsuite/gcc.target/i386/pr50696.c
new file mode 100644
index 000..b1ec2c5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr50696.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { x32 } } } */
+/* { dg-options "-O2 -mtune=generic" } */
+
+struct s { int val[16]; };
+
+extern void f (struct s pb);
+
+void
+foo ()
+{
+  struct s x;
+  int i;
+
+  for (i = 0; i < 16; i++)
+x.val[i] = i + 1;
+  f (x);
+}
+
+/* { dg-final { scan-assembler-not "lea\[lq\]" } } */
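
The condition added to make_compound_operation can be tried outside the
compiler.  A minimal sketch, assuming 64-bit values; toy_ceil_log2 is a
hypothetical stand-in for GCC's ceil_log2 (valid for nonzero arguments only),
and NONZERO plays the role of the nonzero_bits result for the first operand:

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n such that (1 << n) >= x, capped at 64.  Stand-in for
   ceil_log2; only meant for nonzero X.  */
static unsigned
toy_ceil_log2 (uint64_t x)
{
  unsigned n = 0;

  while (n < 64 && ((uint64_t) 1 << n) < x)
    n++;
  return n;
}

/* Return 1 if ANDing a value whose possibly-set bits are NONZERO with
   MASK keeps the same bits as extracting its low ceil_log2 (MASK) bits.  */
static int
toy_and_is_low_extract (uint64_t nonzero, uint64_t mask)
{
  unsigned len = toy_ceil_log2 (mask);
  uint64_t low;

  assert (len <= 64);
  /* Avoid shifting by the full width, which is undefined in C.  */
  low = len >= 64 ? ~(uint64_t) 0 : (((uint64_t) 1 << len) - 1);
  return (nonzero & low) == (nonzero & mask);
}
```

For example, if the operand is known to have its low two bits clear, an AND
with 0xfc keeps exactly the bits that an 8-bit extraction would keep, so the
check succeeds; if the low two bits may be set, it fails.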


Re: [PATCH] 32-byte integer vec_interleave_{high,low}

2011-10-14 Thread Richard Henderson
On 10/13/2011 11:16 PM, Jakub Jelinek wrote:
> 2011-10-14  Jakub Jelinek  
> 
>   * config/i386/sse.md (vec_interleave_high,
>   vec_interleave_low): Add AVX2 expanders for VI_256
>   modes.
>   * config/i386/i386.c (expand_vec_perm_interleave3): New function.
>   (ix86_expand_vec_perm_builtin_1): Call it.

Ok.


r~


Re: [Patch,AVR] Print no-return functions as JMP

2011-10-14 Thread Paolo Bonzini

On 10/14/2011 06:23 PM, Georg-Johann Lay wrote:

+@item -mjump-to-noreturn
+@opindex mjump-to-noreturn
+Use a jump instruction instead of a call instruction when calling a
+no-return functions.  This option is active if optimization is turned
+on and just affects the way a call instruction is printed out.
+Besides that, it has no effect on code generation or debug information.


I think this is not really accurate given Richard's input.

Paolo


[Patch,AVR,Comitted]: Avoid unwind warning from toplev.c [fix thinko]

2011-10-14 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/10/10 Georg-Johann Lay :
>> toplev.c complains about "unwind tables currently require a frame pointer for
>> correctness".
>>
>> This patchlet supplies a fix to avoid build warnings/test fails in that it 
>> sets
>> flag_omit_frame_pointer to 0 if unwind needs FP.
>>
>> toplev.c:process_options sets flag_unwind_tables depending on
>> flag_non_call_exceptions and flag_asynchronous_unwind_tables after calling
>> targetm.target_option.override() so that the test includes these flags, too.
>>
>> Ok?
>>
>> Johann
>>
>>* config/avr/avr.c (avr_option_override): Set
>>flag_omit_frame_pointer to 0 if frame pointer is needed for
>>unwinding.
>>
> 
> Approved.
> 
> Denis.

flag_omit_frame_pointer must not be overridden when there is no need to do
so, so I removed the else branch; with it, -fno-omit-frame-pointer would
have no effect.  flag_omit_frame_pointer is already set up correctly in
avr_option_optimization_table.

Committed as obvious.

http://gcc.gnu.org/viewcvs?view=revision&revision=179994

Johann

Fix thinko from r179765
* config/avr/avr.c (avr_option_override): Don't override
flag_omit_frame_pointer if not actually needed.


Index: config/avr/avr.c
===
--- config/avr/avr.c(revision 179993)
+++ config/avr/avr.c(working copy)
@@ -372,10 +372,6 @@ avr_option_override (void)
 {
   flag_omit_frame_pointer = 0;
 }
-  else
-{
-  flag_omit_frame_pointer = (optimize >= 1);
-}

   avr_current_device = &avr_mcu_types[avr_mcu_index];
   avr_current_arch = &avr_arch_types[avr_current_device->arch];
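
The effect of dropping the else branch can be illustrated with two toy
functions.  This is only a sketch: the names are hypothetical, FLAG stands for
the value already chosen by the user or by the default optimization table, and
UNWIND_NEEDS_FP stands for the condition tested in avr_option_override:

```c
#include <assert.h>

/* Behaviour before the fix: the else branch recomputes the flag from
   the optimization level and so discards what the user asked for.  */
static int
toy_omit_fp_before_fix (int flag, int optimize, int unwind_needs_fp)
{
  assert (flag == 0 || flag == 1);
  return unwind_needs_fp ? 0 : (optimize >= 1);
}

/* Behaviour after the fix: the flag is only forced to 0 when unwinding
   needs the frame pointer; otherwise it is left alone.  */
static int
toy_omit_fp_after_fix (int flag, int optimize, int unwind_needs_fp)
{
  assert (flag == 0 || flag == 1);
  (void) optimize;
  return unwind_needs_fp ? 0 : flag;
}
```

With optimization on and -fno-omit-frame-pointer given (FLAG == 0), the first
variant would still return 1, while the second keeps 0.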


libobjc/50002: Applied fix to 4.6 branch as well

2011-10-14 Thread Nicola Pero
I applied the following patch to the 4.6 branch to backport the fix for
libobjc/50002.  It makes sense to backport it to 4.6.x so that it appears in
4.6.2.  It really is quite a bug, and the fix is simple/safe (and the ObjFW
guys were particularly keen on it).

Thanks

Index: class.c
===
--- class.c (revision 179967)
+++ class.c (working copy)
@@ -850,35 +850,57 @@ __objc_update_classes_with_methods (struct objc_me
   
   while (node != NULL)
{
- /* Iterate over all methods in the class.  */
- Class class = node->pointer;
- struct objc_method_list * method_list = class->methods;
+ /* We execute this loop twice: the first time, we iterate
+over all methods in the class (instance methods), while
+the second time we iterate over all methods in the meta
+class (class methods).  */
+ Class class = Nil;
+ BOOL done = NO;
 
- while (method_list)
+ while (done == NO)
{
- int i;
+ struct objc_method_list * method_list;
 
- for (i = 0; i < method_list->method_count; ++i)
+ if (class == Nil)
{
- struct objc_method *method = &method_list->method_list[i];
+ /* The first time, we work on the class.  */
+ class = node->pointer;
+   }
+ else
+   {
+ /* The second time, we work on the meta class.  */
+ class = class->class_pointer;
+ done = YES;
+   }
 
- /* If the method is one of the ones we are looking
-for, update the implementation.  */
- if (method == method_a)
-   sarray_at_put_safe (class->dtable,
-   (sidx) method_a->method_name->sel_id,
-   method_a->method_imp);
+ method_list = class->methods;
 
- if (method == method_b)
+ while (method_list)
+   {
+ int i;
+ 
+ for (i = 0; i < method_list->method_count; ++i)
{
- if (method_b != NULL)
+ struct objc_method *method = &method_list->method_list[i];
+ 
+ /* If the method is one of the ones we are
+looking for, update the implementation.  */
+ if (method == method_a)
sarray_at_put_safe (class->dtable,
-   (sidx) 
method_b->method_name->sel_id,
-   method_b->method_imp);
+   (sidx) 
method_a->method_name->sel_id,
+   method_a->method_imp);
+ 
+ if (method == method_b)
+   {
+ if (method_b != NULL)
+   sarray_at_put_safe (class->dtable,
+   (sidx) 
method_b->method_name->sel_id,
+   method_b->method_imp);
+   }
}
+ 
+ method_list = method_list->method_next;
}
- 
- method_list = method_list->method_next;
}
  node = node->next;
}
Index: ChangeLog
===
--- ChangeLog   (revision 179967)
+++ ChangeLog   (working copy)
@@ -1,3 +1,13 @@
+2011-10-14  Nicola Pero  
+
+   Backport from mainline
+   2011-08-06  Nicola Pero  
+
+   PR libobjc/50002
+   * class.c (__objc_update_classes_with_methods): Iterate over meta
+   classes as well as normal classes when refreshing the method
+   implementations.  This fixes replacing class methods.
+
 2011-06-27  Release Manager
 
* GCC 4.6.1 released.
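
The two-pass idea in the patch above (first the class, then its meta class) can be sketched in isolation roughly like this. The `Method` and `Class` types below are simplified, hypothetical stand-ins for the runtime structures, with a `std::map` playing the role of the dispatch table; the sketch assumes every class has a meta class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for the runtime structures.
struct Method {
  std::string selector;
  int imp;  // stands in for the IMP function pointer
};

struct Class {
  Class *class_pointer = nullptr;     // meta class of a normal class
  std::vector<Method *> methods;      // methods declared on this class
  std::map<std::string, int> dtable;  // selector -> cached implementation
};

// Refresh the cached implementation of method_a / method_b.  The loop body
// runs twice: once for the class itself (instance methods) and once for its
// meta class (class methods).  Assumes cls->class_pointer is not null.
inline void update_class_and_meta(Class *cls, const Method *method_a,
                                  const Method *method_b) {
  Class *current = nullptr;
  bool done = false;
  while (!done) {
    if (current == nullptr) {
      current = cls;                     // first pass: the class
    } else {
      current = current->class_pointer;  // second pass: the meta class
      done = true;
    }
    for (const Method *m : current->methods)
      if (m == method_a || (method_b != nullptr && m == method_b))
        current->dtable[m->selector] = m->imp;
  }
}
```

Without the second pass, a replaced class method would keep its stale dispatch table entry in the meta class, which is the behaviour the PR describes.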



Re: [Patch,AVR] Print no-return functions as JMP

2011-10-14 Thread Georg-Johann Lay
Paolo Bonzini schrieb:
> On 10/14/2011 06:23 PM, Georg-Johann Lay wrote:
>> +@item -mjump-to-noreturn
>> +@opindex mjump-to-noreturn
>> +Use a jump instruction instead of a call instruction when calling a
>> +no-return function.  This option is active if optimization is turned
>> +on and just affects the way a call instruction is printed out.
>> +Besides that, it has no effect on code generation or debug information.
> 
> I think this is not really accurate given Richard's input.
> 
> Paolo
> 

Confused.  The conclusion was to introduce a new command line option in order
to have individual control over this feature.  The option is named
-mjump-to-noreturn now instead of -mjump-noreturn. Is that what you mean?

Johann





libobjc/49883: Applied fix to 4.6 branch as well

2011-10-14 Thread Nicola Pero
I applied the following patch to backport the fix for libobjc/49883 to GCC 4.6
so that it appears in 4.6.2.  This is the clang-related problem that was
recently discussed.  Again, it's an important fix, but safe, with users (the
ObjFW guys) asking for it in 4.6.2, which made total sense, so I backported the
fix and committed it to the 4.6 branch.

Thanks

Index: init.c
===
--- init.c  (revision 179967)
+++ init.c  (working copy)
@@ -643,6 +643,15 @@
   assert (CLS_ISMETA (class->class_pointer));
   DEBUG_PRINTF (" installing class '%s'\n", class->name);
 
+  /* Workaround for a bug in clang: Clang may set flags other than
+_CLS_CLASS and _CLS_META even when compiling for the
+traditional ABI (version 8), confusing our runtime.  Try to
+wipe these flags out.  */
+  if (CLS_ISCLASS (class))
+   __CLS_INFO (class) = _CLS_CLASS;
+  else
+   __CLS_INFO (class) = _CLS_META;
+
   /* Initialize the subclass list to be NULL.  In some cases it
 isn't and this crashes the program.  */
   class->subclass_list = NULL;
Index: ChangeLog
===
--- ChangeLog   (revision 179996)
+++ ChangeLog   (working copy)
@@ -1,6 +1,19 @@
 2011-10-14  Nicola Pero  
 
Backport from mainline
+   2011-10-09  Nicola Pero  
+
+   PR libobjc/49883
+   * init.c (__objc_exec_class): Work around a bug in clang's code
+   generation.  Clang sets the class->info field to values different
+   from 0x1 or 0x2 (the only allowed values in the traditional GNU
+   Objective-C runtime ABI) to store some additional information, but
+   this breaks backwards compatibility.  Wipe out all the bits in the
+   fields other than the first two upon loading a class.
+
+2011-10-14  Nicola Pero  
+
+   Backport from mainline
2011-08-06  Nicola Pero  
 
PR libobjc/50002
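
As a rough illustration of the workaround described in the ChangeLog entry above, the flag cleanup amounts to the following. The constants mirror the 0x1 / 0x2 values mentioned in the entry; the names `kClsClass`, `kClsMeta` and `sanitize_class_info` are made up for this sketch and are not the runtime's own macros.

```cpp
#include <cassert>

// The only two info values allowed by the traditional ABI, per the
// ChangeLog entry (0x1 for a class, 0x2 for a meta class).
constexpr unsigned long kClsClass = 0x1UL;
constexpr unsigned long kClsMeta = 0x2UL;

// Drop every bit except the one that says "class" or "meta class".
inline unsigned long sanitize_class_info(unsigned long info) {
  return (info & kClsClass) ? kClsClass : kClsMeta;
}
```

Any extra bits a compiler stored in the field are discarded on load, so the rest of the runtime only ever sees one of the two expected values.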



Re: [PATCH, Atom] Fix performance regression with -mtune=atom

2011-10-14 Thread H.J. Lu
On Fri, Oct 14, 2011 at 10:13 AM, Vladimir Yakovlev
 wrote:
> Patch is attached.
>
> Thanks,
> Vladimir
>
> 2011-10-14  Yakovlev Vladimir  vladimir.b.yakov...@intel.com

^ Should be in "<>".
>
>     * gcc/config/i386/i386.c (atom_cost): Changed cost for loading
 ^ File name is relative to ChangeLog.
>      QImode using movzbl.
>
>

I checked it in for you.  Please provide the correct ChangeLog entry
next time.

-- 
H.J.


[PATCH] Fix typo in Builtin infrastructure change

2011-10-14 Thread Michael Meissner
David pointed out that I had a typo in the AIX code in my builtin changes on
October 11th.  I've checked this patch in as obvious.

2011-10-14  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_init_builtins): Fix typo in my
change on October 11th, 2011.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 179997)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -12213,7 +12213,7 @@ rs6000_init_builtins (void)
 
 #if TARGET_XCOFF
   /* AIX libm provides clog as __clog.  */
-  if ((tdecl = builtin_decl_explicit ([BUILT_IN_CLOG))) != NULL_TREE)
+  if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
 set_user_assembler_name (tdecl, "__clog");
 #endif
 

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899



Re: Fix for PR obj-c++/48275 ("getter=namespace failing with .mm")

2011-10-14 Thread Nicola Pero
Can I apply this fix to the 4.6 branch as well ?

Some users are asking for it to be backported to the 4.6 branch.

It's not a regression, because the property implementation is new in GCC 4.6,
but it is still a serious bug if you're trying to use properties with ObjC++,
as you can't use "namespace" as a getter name.

It's a simple fix, quite obvious, and very limited in scope (even if it was
wrong, it can't really do much harm other than breaking property declarations
in ObjC++ which are already broken without the patch), so I'd go for it.

OK to commit to the 4.6 branch ?

Thanks

On 6 Jun 2011, at 20:22, Nicola Pero wrote:

> This patch fixes PR obj-c++/48275.  It's a routine parser ingenuity.
> 
> OK to commit ?
> 
> Thanks
> 
> Index: testsuite/ChangeLog
> ===
> --- testsuite/ChangeLog (revision 174657)
> +++ testsuite/ChangeLog (working copy)
> @@ -1,3 +1,9 @@
> +2011-06-06  Nicola Pero  
> +
> +   PR obj-c++/48275
> +   * obj-c++.dg/property/cxx-property-1.mm: New.   
> +   * obj-c++.dg/property/cxx-property-2.mm: New.
> +
> 2011-06-05  Nicola Pero  
> 
>PR testsuite/49287
> Index: testsuite/obj-c++.dg/property/cxx-property-2.mm
> ===
> --- testsuite/obj-c++.dg/property/cxx-property-2.mm (revision 0)
> +++ testsuite/obj-c++.dg/property/cxx-property-2.mm (revision 0)
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +
> +/* All these C++ keywords are acceptable in ObjC method names, hence
> +   should be accepted for property getters and setters.  */
> +
> +@interface Test
> +{
> +  Class isa;
> +}
> +@property (getter=namespace) int p0;
> +@property (setter=namespace:) int p1;
> +@property (getter=and) int p2;
> +@property (setter=and:) int p3;
> +@property (getter=class) int p4;
> +@property (setter=class:) int p5;
> +@property (getter=new) int p6;
> +@property (setter=new:) int p7;
> +@property (getter=delete) int p8;
> +@property (setter=delete:) int p9;
> +@property (getter=delete) int p10;
> +@property (setter=delete:) int p11;
> +@end
> Index: testsuite/obj-c++.dg/property/cxx-property-1.mm
> ===
> --- testsuite/obj-c++.dg/property/cxx-property-1.mm (revision 0)
> +++ testsuite/obj-c++.dg/property/cxx-property-1.mm (revision 0)
> @@ -0,0 +1,10 @@
> +/* Testcase from PR obj-c++/48275.  */
> +/* { dg-do compile } */
> +
> +@interface Test
> +{
> +int ns;
> +}
> +@property (getter=namespace) int ns;
> +
> +@end
> Index: cp/ChangeLog
> ===
> --- cp/ChangeLog(revision 174656)
> +++ cp/ChangeLog(working copy)
> @@ -1,3 +1,9 @@
> +2011-06-06  Nicola Pero  ,
> +
> +   PR obj-c++/48275
> +   * parser.c (cp_parser_objc_at_property_declaration): Allow setter
> +   and getter names to use all the allowed method names.
> +
> 2011-06-04  Jonathan Wakely  
> 
>* init.c (build_delete): Warn when deleting type with non-virtual
> Index: cp/parser.c
> ===
> --- cp/parser.c (revision 174656)
> +++ cp/parser.c (working copy)
> @@ -23187,7 +23187,7 @@ cp_parser_objc_at_property_declaration (cp_parser
>  break;
>}
>  cp_lexer_consume_token (parser->lexer); /* eat the = */
> - if (cp_lexer_next_token_is_not (parser->lexer, CPP_NAME))
> + if (!cp_parser_objc_selector_p (cp_lexer_peek_token 
> (parser->lexer)->type))
>{
>  cp_parser_error (parser, "expected identifier");
>  syntax_error = true;
> @@ -23196,10 +23196,12 @@ cp_parser_objc_at_property_declaration (cp_parser
>  if (keyword == RID_SETTER)
>{
>  if (property_setter_ident != NULL_TREE)
> -   cp_parser_error (parser, "the %<setter%> attribute may only be specified once");
> +   {
> + cp_parser_error (parser, "the %<setter%> attribute may only be specified once");
> + cp_lexer_consume_token (parser->lexer);
> +   }
>  else
> -   property_setter_ident = cp_lexer_peek_token 
> (parser->lexer)->u.value;
> - cp_lexer_consume_token (parser->lexer);
> +   property_setter_ident = cp_parser_objc_selector (parser);
>  if (cp_lexer_next_token_is_not (parser->lexer, CPP_COLON))
>cp_parser_error (parser, "setter name must terminate with 
> %<:%>");
>  else
> @@ -23208,10 +23210,12 @@ cp_parser_objc_at_property_declaration (cp_parser
>  else
>{
>  if (property_getter_ident != NULL_TREE)
> -   cp_parser_error (parser, "the %<getter%> attribute may only be specified once");
> +  

C++ PATCH for c++/50563 (NSDMI and multiple declarator list)

2011-10-14 Thread Jason Merrill
The problem here was that we were saving away the second declarator as 
part of the NSDMI for the first declarator.  Fixed by stopping at a 
non-nested comma.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 09f16be794e871607c2bb46bf74206ee40af1b74
Author: Jason Merrill 
Date:   Thu Oct 13 17:51:13 2011 -0400

	PR c++/50563
	* parser.c (cp_parser_cache_group): Handle end==CPP_COMMA.
	(cp_parser_save_nsdmi): Pass it.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index cabe9aa..ea0c4dc 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -20617,7 +20617,8 @@ cp_parser_save_nsdmi (cp_parser* parser)
   cp_token *last;
   tree node;
 
-  cp_parser_cache_group (parser, CPP_CLOSE_PAREN, /*depth=*/0);
+  /* Save tokens until the next comma or semicolon.  */
+  cp_parser_cache_group (parser, CPP_COMMA, /*depth=*/0);
 
   last = parser->lexer->next_token;
 
@@ -21719,6 +21720,12 @@ cp_parser_cache_group (cp_parser *parser,
 	   kind of syntax error.  */
 	return true;
 
+  /* If we're caching something finished by a comma (or semicolon),
+	 such as an NSDMI, don't consume the comma.  */
+  if (end == CPP_COMMA
+	  && (token->type == CPP_SEMICOLON || token->type == CPP_COMMA))
+	return false;
+
   /* Consume the token.  */
   cp_lexer_consume_token (parser->lexer);
   /* See if it starts a new group.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-list1.C b/gcc/testsuite/g++.dg/cpp0x/nsdmi-list1.C
new file mode 100644
index 000..526f29a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-list1.C
@@ -0,0 +1,14 @@
+// PR c++/50563
+// { dg-options -std=c++0x }
+
+struct S1 {
+  int a{10}, b{20}; // OK
+};
+
+struct S2 {
+  int a, b = 20;// OK
+};
+
+struct S3 {
+  int a = 10, b = 20;
+};
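
To make "stopping at a non-nested comma" concrete, here is a small character-level sketch. It is not the parser's token-based code: `first_initializer` is a hypothetical helper that only tracks (), {} and [] nesting, and it ignores complications such as template angle brackets or string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the text of the first initializer in a declarator list, i.e.
// everything up to the first ',' or ';' that is not nested inside
// parentheses, braces or brackets.  The terminator is not consumed.
inline std::string first_initializer(const std::string &text) {
  int depth = 0;
  for (std::size_t i = 0; i < text.size(); ++i) {
    const char c = text[i];
    if (c == '(' || c == '{' || c == '[')
      ++depth;
    else if (c == ')' || c == '}' || c == ']')
      --depth;
    else if (depth == 0 && (c == ',' || c == ';'))
      return text.substr(0, i);
  }
  return text;  // no terminator found
}
```

For the `S3` case in the new test, scanning `10, b = 20;` this way yields `10`, so the second declarator is no longer swallowed into the first member's initializer.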


C++ PATCH for c++/50507 (NSDMI for const field)

2011-10-14 Thread Jason Merrill
We should check for an initializer before complaining about lack 
thereof.  :)


Tested x86_64-pc-linux-gnu, applying to trunk.
commit fc150101bb79b3b328263e272a44eecd1083e6cd
Author: Jason Merrill 
Date:   Thu Oct 13 21:57:21 2011 -0400

	PR c++/50507
	* method.c (walk_field_subobs): Check for NSDMI before
	complaining about uninitialized fields.

diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index f4a3ea6..0718f47 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1016,25 +1016,7 @@ walk_field_subobs (tree fields, tree fnname, special_function_kind sfk,
 	}
   else if (sfk == sfk_constructor)
 	{
-	  bool bad = true;
-	  if (CP_TYPE_CONST_P (mem_type)
-	  && default_init_uninitialized_part (mem_type))
-	{
-	  if (msg)
-		error ("uninitialized non-static const member %q#D",
-		   field);
-	}
-	  else if (TREE_CODE (mem_type) == REFERENCE_TYPE)
-	{
-	  if (msg)
-		error ("uninitialized non-static reference member %q#D",
-		   field);
-	}
-	  else
-	bad = false;
-
-	  if (bad && deleted_p)
-	*deleted_p = true;
+	  bool bad;
 
 	  if (DECL_INITIAL (field))
 	{
@@ -1057,6 +1039,26 @@ walk_field_subobs (tree fields, tree fnname, special_function_kind sfk,
 	  continue;
 	}
 
+	  bad = false;
+	  if (CP_TYPE_CONST_P (mem_type)
+	  && default_init_uninitialized_part (mem_type))
+	{
+	  if (msg)
+		error ("uninitialized non-static const member %q#D",
+		   field);
+	  bad = true;
+	}
+	  else if (TREE_CODE (mem_type) == REFERENCE_TYPE)
+	{
+	  if (msg)
+		error ("uninitialized non-static reference member %q#D",
+		   field);
+	  bad = true;
+	}
+
+	  if (bad && deleted_p)
+	*deleted_p = true;
+
 	  /* For an implicitly-defined default constructor to be constexpr,
 	 every member must have a user-provided default constructor or
 	 an explicit initializer.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-const1.C b/gcc/testsuite/g++.dg/cpp0x/nsdmi-const1.C
new file mode 100644
index 000..ddf9f04
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-const1.C
@@ -0,0 +1,10 @@
+// PR c++/50707
+// { dg-options -std=c++0x }
+
+int g;
+
+struct S {
+   int const v=g;
+};
+
+S s;
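
The behaviour this patch is after can also be checked from user code with a standard type trait. The snippet below is a sketch under the assumption of a C++11 compiler that already has the fix; `WithInit` and `WithoutInit` are illustrative names, not part of the testsuite.

```cpp
#include <cassert>
#include <type_traits>

int g = 42;

// A const member with an NSDMI: the implicit default constructor is usable.
struct WithInit {
  int const v = g;
};

// A const member without any initializer: the implicit default constructor
// is deleted, so the type is not default constructible.
struct WithoutInit {
  int const v;
};

static_assert(std::is_default_constructible<WithInit>::value,
              "NSDMI provides the initializer for the const member");
static_assert(!std::is_default_constructible<WithoutInit>::value,
              "uninitialized const member deletes the default constructor");
```

Only the second struct should still be diagnosed when an object of it is default-initialized.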


Re: New warning for expanded vector operations

2011-10-14 Thread Mike Stump
On Oct 14, 2011, at 8:37 AM, Artem Shinkarov wrote:
> Committed with 179991.

Please don't send these...  If you commit for a person, you can tell them
directly that you committed the work.  If people want to know when work goes
in, be sure to use a PR and put yourself on the cc list; then you will get the
commit email from the version control system once it hits spinning disk.
Thanks.


[Patch, Fortran, committed] PR 50570: [4.6/4.7 Regression] Incorrect error for assignment to intent(in) pointer

2011-10-14 Thread Janus Weil
Hi all,

I just committed a one-line fix for this regression (approved by
Tobias in a private mail):

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=18

Cheers,
Janus


Btw: I just noticed that ABORT throws a backtrace in recent trunk
versions (which didn't happen in 4.6 or earlier). I think this can be
quite useful, but maybe it should be documented in the manual?

http://gcc.gnu.org/onlinedocs/gfortran/ABORT.html#ABORT


Re: [Patch, Fortran, committed] PR 50570: [4.6/4.7 Regression] Incorrect error for assignment to intent(in) pointer

2011-10-14 Thread Janus Weil
Forgot to mention: Will backport to 4.6 during the weekend (hopefully
in time for 4.6.2).

Cheers,
Janus



2011/10/14 Janus Weil :
> Hi all,
>
> I just committed a one-line fix for this regression (approved by
> Tobias in a private mail):
>
> http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=18
>
> Cheers,
> Janus
>
>
> Btw: I just noticed that ABORT throws a backtrace in recent trunk
> versions (which didn't happen in 4.6 or earlier). I think this can be
> quite useful, but maybe it should be documented in the manual?
>
> http://gcc.gnu.org/onlinedocs/gfortran/ABORT.html#ABORT
>


[C++ Patch] PR 50732

2011-10-14 Thread Paolo Carlini

Hi,

submitter complains that, at variance with C++11, __is_base_of doesn't 
handle an incomplete base type (the first parameter). The reason seems 
simple: in finish_trait_expr we try to complete *both* types instead of 
doing it where/when necessary.


Tested x86_64-linux. Ok?

Thanks,
Paolo.

///
/cp
2011-10-14  Paolo Carlini  

PR c++/50732
* semantics.c (finish_trait_expr): Do not try to instantiate the
base type of an __is_base_of trait.

/testsuite
2011-10-14  Paolo Carlini  

PR c++/50732
* g++.dg/ext/is_base_of_incomplete.C: New.
Index: testsuite/g++.dg/ext/is_base_of_incomplete.C
===
--- testsuite/g++.dg/ext/is_base_of_incomplete.C(revision 0)
+++ testsuite/g++.dg/ext/is_base_of_incomplete.C(revision 0)
@@ -0,0 +1,7 @@
+template 
+struct non_instantiable
+{
+  typedef typename T::THIS_TYPE_CANNOT_BE_INSTANTIATED type;
+};
+
+int check[__is_base_of(non_instantiable, void) ? -1 : 1];
Index: cp/semantics.c
===
--- cp/semantics.c  (revision 179997)
+++ cp/semantics.c  (working copy)
@@ -5276,10 +5276,6 @@ finish_trait_expr (cp_trait_kind kind, tree type1,
   return trait_expr;
 }
 
-  complete_type (type1);
-  if (type2)
-complete_type (type2);
-
   switch (kind)
 {
 case CPTK_HAS_NOTHROW_ASSIGN:
@@ -5297,6 +5293,7 @@ finish_trait_expr (cp_trait_kind kind, tree type1,
 case CPTK_IS_POLYMORPHIC:
 case CPTK_IS_STD_LAYOUT:
 case CPTK_IS_TRIVIAL:
+  complete_type (type1);
   if (!check_trait_type (type1))
{
  error ("incomplete type %qT not allowed", type1);
@@ -5305,6 +5302,7 @@ finish_trait_expr (cp_trait_kind kind, tree type1,
   break;
 
 case CPTK_IS_BASE_OF:
+  complete_type (type2);
   if (NON_UNION_CLASS_TYPE_P (type1) && NON_UNION_CLASS_TYPE_P (type2)
  && !same_type_ignoring_top_level_qualifiers_p (type1, type2)
  && !COMPLETE_TYPE_P (type2))


Re: [PATCH] Fix target default on biarch Linux/Sparc

2011-10-14 Thread Eric Botcazou
> If you configure a biarch Linux/Sparc compiler defaulting to
> 32-bit, but give --with-cpu= for a v9 cpu it erroneously
> turns on 64-bit in TARGET_DEFAULT.

PR target/50354 reports the breakage of the opposite case after the change:
configuring for sparc64-linux --with-cpu=v8 used to build a 32-bit compiler, 
now the build aborts because of an architecture mismatch.

> The right thing to do is what the Solaris/Sparc target does,
> which is to key things off of a cpp macro (TARGET_64BIT_DEFAULT)
> which is defined by a header that gets prepended to the target
> header list based upon the target triplet.

There is a discrepancy between Solaris and Linux: on Solaris, we don't support 
the weird sparc64-* --with-cpu=v8 combination, whereas it is the default for 
some Linux distros (e.g. Debian).  So we cannot have the full symmetry here.

Something like the attached patch would be sufficient.  What do you think?


PR target/50354
* config/sparc/linux64.h (TARGET_DEFAULT): Only override if the default
processor is at least V9 and TARGET_64BIT_DEFAULT is defined.


-- 
Eric Botcazou
Index: config/sparc/linux64.h
===
--- config/sparc/linux64.h	(revision 179894)
+++ config/sparc/linux64.h	(working copy)
@@ -31,7 +31,9 @@ along with GCC; see the file COPYING3.
 }		\
   while (0)
 
-#ifdef TARGET_64BIT_DEFAULT
+/* On Linux, the combination sparc64-* --with-cpu=v8 is supported and
+   selects a 32-bit compiler.  */
+#if defined(TARGET_64BIT_DEFAULT) && TARGET_CPU_DEFAULT >= TARGET_CPU_v9
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT \
   (MASK_V9 + MASK_PTR64 + MASK_64BIT + MASK_STACK_BIAS + \
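
The new preprocessor condition boils down to a two-input decision, which can be written out as a small sketch. The enumerator values below are hypothetical ranks chosen only so that v8 orders before v9; they are not the real `TARGET_CPU_*` constants.

```cpp
#include <cassert>

// Hypothetical ordering of default CPUs; only the relative order matters.
enum CpuDefault { kCpuV8 = 1, kCpuV9 = 2, kCpuUltrasparc = 3 };

// Mirrors the proposed #if: the 64-bit TARGET_DEFAULT is selected only when
// the target triplet asks for 64-bit by default *and* the default CPU is at
// least V9.
constexpr bool selects_64bit_default(bool target_64bit_default,
                                     CpuDefault cpu) {
  return target_64bit_default && cpu >= kCpuV9;
}
```

Under this rule a sparc64-* configuration with --with-cpu=v8 falls back to the 32-bit defaults, which is the combination PR target/50354 reports as broken.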


Re: [C++ Patch] PR 50732

2011-10-14 Thread Paolo Carlini

On 10/14/2011 08:23 PM, Paolo Carlini wrote:

> Hi,
>
> submitter complains that, at variance with C++11, __is_base_of doesn't
> handle an incomplete base type (the first parameter). The reason seems
> simple: in finish_trait_expr we try to complete *both* types instead
> of doing it where/when necessary.

Hmm, maybe we should be even more careful and call complete_type (type2) only
when


NON_UNION_CLASS_TYPE_P (type1) && NON_UNION_CLASS_TYPE_P (type2) && 
!same_type_ignoring_top_level_qualifiers_p (type1, type2)


is true?

Paolo.

