Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-24 Thread Andrew Stubbs

On 23/06/11 17:26, Richard Guenther wrote:
> On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs wrote:
>> There are many cases where the widening_mult pass does not recognise
>> widening multiply-and-accumulate cases simply because there is a type
>> conversion step between the multiply and add statements.
>>
>> This patch should rectify that simply by looking beyond those conversions.
>
> That's surely wrong for (int)(short)int_var.  You have to constrain
> the conversions you look through properly.


To be clear, it only skips past NOP_EXPR. Is it not the case that what 
you're describing would need a CONVERT_EXPR?


Andrew


[Patch ARM] Add predefine for availability of DSP multiplication functions.

2011-06-24 Thread James Greenhalgh
Hi,

This patch adds a builtin macro __ARM_FEATURE_DSP which is defined
when the ARMv5E DSP multiplication extensions are available for use.
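
As a rough illustration of how user code might consume such a macro (a sketch only, not part of the patch): only __ARM_FEATURE_DSP comes from the patch itself; the helper macro HAVE_DSP_MULTIPLY and the function dot16 are names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* HAVE_DSP_MULTIPLY is a hypothetical helper macro for this sketch;
   only __ARM_FEATURE_DSP itself comes from the patch.  */
#if defined (__ARM_FEATURE_DSP)
# define HAVE_DSP_MULTIPLY 1
#else
# define HAVE_DSP_MULTIPLY 0
#endif

/* 16x16->32 multiply-accumulate over two arrays.  Where the macro is
   defined, the compiler is free to use the DSP multiply instructions
   for the loop body; elsewhere this is plain portable C.  */
static int32_t
dot16 (const int16_t *a, const int16_t *b, size_t n)
{
  int32_t acc = 0;
  size_t i;

  for (i = 0; i < n; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}
```

Code that wants a hand-written DSP path could test HAVE_DSP_MULTIPLY (or the macro directly) and fall back to a loop like the one above otherwise.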

Thanks,
James Greenhalgh

2011-06-22  James Greenhalgh  

* TARGET_CPU_CPP_BUILTINS: Add __ARM_FEATURE_DSP.
diff --git gcc/config/arm/arm.h gcc/config/arm/arm.h
index c32ef1a..892065b 100644
--- gcc/config/arm/arm.h
+++ gcc/config/arm/arm.h
@@ -45,6 +45,8 @@ extern char arm_arch_name[];
 #define TARGET_CPU_CPP_BUILTINS()			\
   do			\
 {			\
+	if (TARGET_DSP_MULTIPLY)\
+	  builtin_define ("__ARM_FEATURE_DSP");			\
 	/* Define __arm__ even when in thumb mode, for	\
 	   consistency with armcc.  */			\
 	builtin_define ("__arm__");			\

Re: [RFA:] Removing target-libiberty on branches

2011-06-24 Thread Richard Guenther
On Thu, Jun 23, 2011 at 8:23 PM, Hans-Peter Nilsson
 wrote:
> Here's the patch I tested for 4.6, native
> x86_64-unknown-linux-gnu, cross to cris-axis-elf, both with old
> and new ("breaking") newlib.
>
> Ok for 4.6 and after testing, earlier branches?

Ok for 4.6.2 and 4.5.

Thanks,
Richard.

>
> 2011-06-22  Hans-Peter Nilsson  
>
>        PR regression/47836
>        PR bootstrap/23656
>        PR other/47733
>        PR bootstrap/49247
>        PR c/48825
>        * configure.ac (target_libraries): Remove target-libiberty.
>        Remove all target-specific settings adding target-libiberty to
>        skipdirs and noconfigdirs.  Remove checking target_configdirs
>        and removing target-libiberty but keeping target-libgcc if
>        otherwise empty.
>        * Makefile.def (target_modules): Don't add libiberty.
>        (dependencies): Remove all traces of target-libiberty.
>        * configure, Makefile.in: Regenerate.
>
> Index: configure.ac
> ===
> --- configure.ac        (revision 175300)
> +++ configure.ac        (working copy)
> @@ -186,9 +186,8 @@ libgcj="target-libffi \
>
>  # these libraries are built for the target environment, and are built after
>  # the host libraries and the host tools (which may be a cross compiler)
> -#
> +# Note that libiberty is not a target library.
>  target_libraries="target-libgcc \
> -               target-libiberty \
>                target-libgloss \
>                target-newlib \
>                target-libgomp \
> @@ -595,14 +594,14 @@ case "${target}" in
>     ;;
>   *-*-kaos*)
>     # Remove unsupported stuff on all kaOS configurations.
> -    skipdirs="target-libiberty ${libgcj} target-libstdc++-v3 target-librx"
> +    skipdirs="${libgcj} target-libstdc++-v3 target-librx"
>     skipdirs="$skipdirs target-libobjc target-examples target-groff target-gperf"
>     skipdirs="$skipdirs zlib fastjar target-libjava target-boehm-gc target-zlib"
>     noconfigdirs="$noconfigdirs target-libgloss"
>     ;;
>   *-*-netbsd*)
>     # Skip some stuff on all NetBSD configurations.
> -    noconfigdirs="$noconfigdirs target-newlib target-libiberty target-libgloss"
> +    noconfigdirs="$noconfigdirs target-newlib target-libgloss"
>
>     # Skip some stuff that's unsupported on some NetBSD configurations.
>     case "${target}" in
> @@ -614,21 +613,20 @@ case "${target}" in
>     esac
>     ;;
>   *-*-netware*)
> -    noconfigdirs="$noconfigdirs target-newlib target-libiberty target-libgloss ${libgcj} target-libmudflap"
> +    noconfigdirs="$noconfigdirs target-newlib target-libgloss ${libgcj} target-libmudflap"
>     ;;
>   *-*-rtems*)
> -    skipdirs="${skipdirs} target-libiberty"
>     noconfigdirs="$noconfigdirs target-libgloss ${libgcj}"
>     ;;
>     # The tpf target doesn't support gdb yet.
>   *-*-tpf*)
> -    noconfigdirs="$noconfigdirs target-newlib target-libgloss target-libiberty ${libgcj} target-libmudflap gdb tcl tk libgui itcl"
> +    noconfigdirs="$noconfigdirs target-newlib target-libgloss ${libgcj} target-libmudflap gdb tcl tk libgui itcl"
>     ;;
>   *-*-uclinux*)
>     noconfigdirs="$noconfigdirs target-newlib target-libgloss target-rda ${libgcj}"
>     ;;
>   *-*-vxworks*)
> -    noconfigdirs="$noconfigdirs target-newlib target-libgloss target-libiberty target-libstdc++-v3 ${libgcj}"
> +    noconfigdirs="$noconfigdirs target-newlib target-libgloss target-libstdc++-v3 ${libgcj}"
>     ;;
>   alpha*-dec-osf*)
>     # ld works, but does not support shared libraries.
> @@ -656,7 +654,7 @@ case "${target}" in
>   sh*-*-pe|mips*-*-pe|*arm-wince-pe)
>     noconfigdirs="$noconfigdirs ${libgcj}"
>     noconfigdirs="$noconfigdirs target-examples"
> -    noconfigdirs="$noconfigdirs target-libiberty texinfo send-pr"
> +    noconfigdirs="$noconfigdirs texinfo send-pr"
>     noconfigdirs="$noconfigdirs tcl tk itcl libgui sim"
>     noconfigdirs="$noconfigdirs expect dejagnu"
>     # the C++ libraries don't build on top of CE's C libraries
> @@ -690,7 +688,7 @@ case "${target}" in
>     libgloss_dir=arm
>     ;;
>   arm*-*-symbianelf*)
> -    noconfigdirs="$noconfigdirs ${libgcj} target-libiberty"
> +    noconfigdirs="$noconfigdirs ${libgcj}"
>     libgloss_dir=arm
>     ;;
>   arm-*-pe*)
> @@ -709,7 +707,7 @@ case "${target}" in
>     noconfigdirs="$noconfigdirs ld target-libgloss ${libgcj}"
>     ;;
>   avr-*-*)
> -    noconfigdirs="$noconfigdirs target-libiberty target-libstdc++-v3 ${libgcj} target-libssp"
> +    noconfigdirs="$noconfigdirs target-libstdc++-v3 ${libgcj} target-libssp"
>     ;;
>   bfin-*-*)
>     unsupported_languages="$unsupported_languages java"
> @@ -888,7 +886,7 @@ case "${target}" in
>     noconfigdirs="$noconfigdirs ${libgcj}"
>     ;;
>   m68hc11-*-*|m6811-*-*|m68hc12-*-*|m6812-*-*)
> -    noconfigdirs="$noconfigdirs target-libiberty target-libstdc++-v3 ${libgcj}"
> +    noconfigdirs="$noconfigdirs target-libstdc++-v3 ${libgcj}"
>     li

Re: [PATCH, PR 49516] Avoid SRA mem-refing its scalar replacements

2011-06-24 Thread Richard Guenther
On Thu, Jun 23, 2011 at 9:53 PM, Martin Jambor  wrote:
> Hi,
>
> When SRA tries to modify an assignment where on one side it should put
> a new scalar replacement but the other is actually an aggregate with
> a number of replacements for it, it will generate MEM-REFs into the
> former replacement which can lead to miscompilations.
>
> This is avoided by the simple patch below.  With it, we deal with
> these situations like with other type-casts that SRA cannot handle: we
> channel the data through the original variable and the original
> statement.
>
> The testcase is not miscompiled with 4.6 gcc but the bug is just
> latent there.
>
> I have verified the problem goes away on i686-linux.  I have
> bootstrapped and tested the patch on x86_64-linux too.  I intend to do
> a full i686 bootstrap and test but so far have not managed to do it.
>
> OK for trunk and 4.6 after it is unfrozen?

Ok.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> 2011-06-22  Martin Jambor  
>
>        PR tree-optimization/49516
>        * tree-sra.c (sra_modify_assign): Choose the safe path for
>        aggregate copies if we also did scalar replacements.
>
>        * testsuite/g++.dg/tree-ssa/pr49516.C: New test.
>
> Index: src/gcc/tree-sra.c
> ===
> --- src.orig/gcc/tree-sra.c
> +++ src/gcc/tree-sra.c
> @@ -2804,7 +2804,8 @@ sra_modify_assign (gimple *stmt, gimple_
>      there to do the copying and then load the scalar replacements of the LHS.
>      This is what the first branch does.  */
>
> -  if (gimple_has_volatile_ops (*stmt)
> +  if (modify_this_stmt
> +      || gimple_has_volatile_ops (*stmt)
>       || contains_vce_or_bfcref_p (rhs)
>       || contains_vce_or_bfcref_p (lhs))
>     {
> Index: src/gcc/testsuite/g++.dg/tree-ssa/pr49516.C
> ===
> --- /dev/null
> +++ src/gcc/testsuite/g++.dg/tree-ssa/pr49516.C
> @@ -0,0 +1,86 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +extern "C" void abort (void);
> +
> +typedef int int32;
> +typedef unsigned int uint32;
> +typedef unsigned long long uint64;
> +typedef short int16;
> +
> +class Tp {
> + public:
> +  Tp(int, const int segment, const int index) __attribute__((noinline));
> +
> +  inline bool operator==(const Tp& other) const;
> +  inline bool operator!=(const Tp& other) const;
> +  int GetType() const { return type_; }
> +  int GetSegment() const { return segment_; }
> +  int GetIndex() const { return index_; }
> + private:
> +  inline static bool IsValidSegment(const int segment);
> +  static const int kSegmentBits = 28;
> +  static const int kTypeBits = 4;
> +  static const int kMaxSegment = (1 << kSegmentBits) - 1;
> +
> +  union {
> +
> +    struct {
> +      int32 index_;
> +      uint32 segment_ : kSegmentBits;
> +      uint32 type_ : kTypeBits;
> +    };
> +    struct {
> +      int32 dummy_;
> +      uint32 type_and_segment_;
> +    };
> +    uint64 value_;
> +  };
> +};
> +
> +Tp::Tp(int t, const int segment, const int index)
> + : index_(index), segment_(segment), type_(t) {}
> +
> +inline bool Tp::operator==(const Tp& other) const {
> +  return value_ == other.value_;
> +}
> +inline bool Tp::operator!=(const Tp& other) const {
> +  return value_ != other.value_;
> +}
> +
> +class Range {
> + public:
> +  inline Range(const Tp& position, const int count) __attribute__((always_inline));
> +  inline Tp GetBeginTokenPosition() const;
> +  inline Tp GetEndTokenPosition() const;
> + private:
> +  Tp position_;
> +  int count_;
> +  int16 begin_index_;
> +  int16 end_index_;
> +};
> +
> +inline Range::Range(const Tp& position,
> +                    const int count)
> +    : position_(position), count_(count), begin_index_(0), end_index_(0)
> +    { }
> +
> +inline Tp Range::GetBeginTokenPosition() const {
> +  return position_;
> +}
> +inline Tp Range::GetEndTokenPosition() const {
> +  return Tp(position_.GetType(), position_.GetSegment(),
> +            position_.GetIndex() + count_);
> +}
> +
> +int main ()
> +{
> +  Range range(Tp(0, 0, 3), 0);
> +  if (!(range.GetBeginTokenPosition() == Tp(0, 0, 3)))
> +    abort ();
> +
> +  if (!(range.GetEndTokenPosition() == Tp(0, 0, 3)))
> +    abort();
> +
> +  return 0;
> +}
>


RE: [Patch ARM] Add predefine for availability of DSP multiplication functions.

2011-06-24 Thread James Greenhalgh
Apologies, I sent an incomplete ChangeLog entry.

Thanks,
James Greenhalgh


2011-06-22  James Greenhalgh  

* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Add
__ARM_FEATURE_DSP.

> Hi,
> 
> This patch adds a builtin macro __ARM_FEATURE_DSP which is defined
> when the ARMv5E DSP multiplication extensions are available for use.
> 
> Thanks,
> James Greenhalgh
> 
> 2011-06-22  James Greenhalgh  
> 
>   * TARGET_CPU_CPP_BUILTINS: Add __ARM_FEATURE_DSP.





Re: Mark variables addressable if they are copied using libcall in RTL expander

2011-06-24 Thread Richard Guenther
On Thu, Jun 23, 2011 at 11:47 PM, Easwaran Raman  wrote:
> On Thu, Jun 23, 2011 at 12:16 PM, Jakub Jelinek  wrote:
>> On Thu, Jun 23, 2011 at 12:02:35PM -0700, Easwaran Raman wrote:
>>> +      if (y_expr)
>>> +        mark_addressable (y_expr);
>>
>> Please watch formatting, a tab should be used instead of 8 spaces.
>>
>>> +      if (x_expr)
>>> +        mark_addressable (x_expr);
>>
>> Ditto.
>>
>>> @@ -1084,6 +1084,8 @@ initialize_argument_information (int num_actuals A
>>>                 && TREE_CODE (base) != SSA_NAME
>>>                 && (!DECL_P (base) || MEM_P (DECL_RTL (base))))
>>>           {
>>> +              mark_addressable (args[i].tree_value);
>>> +
>>
>> Likewise, plus the line is indented too much, each level should be indented
>> by 2 chars.
>>
>>        Jakub
>>
>
> I have attached a new patch that fixes the formatting issues.

Ok.

Thanks,
Richard.

> Thanks,
> Easwaran
>


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-24 Thread Richard Guenther
On Fri, Jun 24, 2011 at 10:05 AM, Andrew Stubbs
 wrote:
> On 23/06/11 17:26, Richard Guenther wrote:
>>
>> On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs
>>  wrote:
>>>
>>> There are many cases where the widening_mult pass does not recognise
>>> widening multiply-and-accumulate cases simply because there is a type
>>> conversion step between the multiply and add statements.
>>>
>>> This patch should rectify that simply by looking beyond those
>>> conversions.
>>
>> That's surely wrong for (int)(short)int_var.  You have to constrain
>> the conversions
>> you look through properly.
>
> To be clear, it only skips past NOP_EXPR. Is it not the case that what
> you're describing would need a CONVERT_EXPR?

NOP_EXPR is the same as CONVERT_EXPR.
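
The (int)(short)int_var case can be seen in plain C. A minimal sketch, assuming a 16-bit short and a 32-bit int; the function name through_short is made up for this example and does not appear in the patch:

```c
/* Round-trip an int through short.  The narrowing conversion drops the
   upper bits (the result is implementation-defined in ISO C; GCC
   reduces the value modulo 2^16), so the pair of conversions is not a
   no-op and cannot simply be skipped when matching a
   multiply-and-accumulate.  */
static int
through_short (int x)
{
  return (int) (short) x;
}
```

Values that fit in a short come back unchanged, while something like 0x12345 does not, which is why the set of conversions that may be looked through has to be constrained.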

Richard.

> Andrew
>


Re: PATCH: PR rtl-optimization/49504: Invalid optimization for Pmode != ptr_mode

2011-06-24 Thread Eric Botcazou
> I just don't see how nonzero_bits1 can assume that, if pointers extend
> unsigned and this is an addition or subtraction to a pointer in Pmode, all
> the bits above ptr_mode are known to be zero.  We never ran into it before
> x32 since x32 is the first such target.

I agree that this is overly optimistic, but this was done on purpose:
  http://gcc.gnu.org/ml/gcc-patches/2001-02/msg00316.html

Could you evaluate the pessimization that the patch introduces (if any) for the 
generated code on x32?  If there is none or it is negligible, the patch is OK 
if completed by the removal of the equivalent code in num_sign_bit_copies1.
If it isn't negligible, we may need to be clever and devise something else.
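
To make the concern concrete, here is a small sketch in plain C, assuming a 32-bit ptr_mode and a 64-bit Pmode as on x32. The function names are invented for illustration and have nothing to do with the actual code in rtlanal.c:

```c
#include <stdint.h>

/* Model a 32-bit pointer (ptr_mode) zero-extended into a 64-bit
   register (Pmode), followed by an addition performed in Pmode.  */
static uint64_t
add_in_pmode (uint32_t ptr, uint64_t offset)
{
  uint64_t extended = (uint64_t) ptr;	/* upper 32 bits are zero here */
  return extended + offset;		/* ... but not necessarily here */
}

/* Return the bits above ptr_mode, i.e. the upper 32 bits.  */
static uint32_t
upper_bits (uint64_t x)
{
  return (uint32_t) (x >> 32);
}
```

An addition near the top of the 32-bit range carries into bit 32, so the bits above ptr_mode are not guaranteed to be zero after arithmetic done in Pmode.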

-- 
Eric Botcazou


RE: [PATCH, ARM] iWMMXT maintenance

2011-06-24 Thread Xinyu Qi
Hi, Ramana and Joseph,

Thank you for reviewing! Sorry for the late response.
Before I submit the new modified patch, I want to clarify a few points.

> The -mwmmxt option is not acceptable as it stands today.  IIRC the msimd
>   option was the plan long term when we talked about this last. It is a
> good idea to revisit this now that we are finalizing the options /
> multilib rework and the iwmmx port is getting some maintenance.
> 

So I have decided to remove the option from my patch.
I plan to submit three patches this time: one for iWMMXt intrinsic maintenance
and the WMMX pipeline description (containing no auto-vectorization or address
fixes), another for the iWMMXt testsuite, and a third for the doc update.
Do you think it's better to split the iWMMXt intrinsic maintenance and the WMMX
pipeline description into two patches?

> Also based on a quick reading I find that
> 1. The documentation for the new intrinsics added is missing and that
> needs to be contributed along with the documentation to invoke.texi
> about the new options that are being added.

About the documentation, I found there is no iWMMXt intrinsic doc in
extend.texi (it only has the WMMX built-in function doc). Following the
example of NEON (which has an intrinsic doc), should an iWMMXt intrinsic doc be
provided, or should I simply update the WMMX built-in function doc? Is it
possible to postpone the doc patch, since it may take a long time to prepare?

> There is a lot of restructuring of pattern names in neon.md. When you
> say you tested arm-linux-gnueabi did you specifically test the neon port
> with your patches applied to be sure that nothing broke there since I
> notice this churn ?

I have tested all the NEON tests under gcc.target/arm and gcc.target/arm/neon. I
would prefer to hold the WMMX auto-vectorization patch for a while.

> Based on a quick skim of the patch:
> In a number of places, e.g. in your pipeline descriptions, I noticed
> that you have
> (ior (eq_attr "wtype" "waligni")
>      (eq_attr "wtype" "walignr"))
> etc.
> You could rationalize these with
> (eq_attr "wtype" "waligni,walignr"), which makes these things smaller :)

Thanks for the pointer! That's really convenient.

Thanks,
Xinyu


Re: [RFC, ARM] Convert thumb1 prologue completely to rtl

2011-06-24 Thread Richard Earnshaw
On 18/06/11 20:02, Richard Henderson wrote:
> I couldn't find anything terribly tricky about the conversion.
> 
> The existing push_mult pattern would service thumb1 with just
> a tweak or two to the memory predicate and the length.
> 
> The existing emit_multi_reg_push wasn't set up to handle a
> complete switch of registers for unwind info.  I thought about
> trying to merge them, but then chickened out.
> 
> I haven't cleaned out the code that is now dead in thumb_pushpop.
> I'd been thinking about maybe converting epilogues completely
> to rtl as well, which would allow the function to be deleted
> completely, rather than incrementally.
> 
> I'm unsure what testing should be applied.  I'm currently doing
> arm-elf, which does at least have a thumb1 multilib, and uses
> newlib so I don't have to fiddle with setting up a full native
> cross environment.  What else should be done?  arm-eabi?
> 

Testing this on arm-eabi is essential since this may affect C++ unwind
table generation (I can't see any obvious problems, but you never know).

> This is the only substantive bit of code left that tries to emit
> dwarf2 unwind info while emitting assembly as text.  So I'd like
> to get rid of this as soon as possible...
> 
> 
> r~
> 
> 
> d-thumb-1
> 
> 
>   * config/arm/arm.c (arm_output_function_prologue): Don't call
>   thumb1_output_function_prologue.
>   (arm_expand_prologue): Avoid dead store.
>   (number_of_first_bit_set): Use ctz_hwi.
>   (thumb1_emit_multi_reg_push): New.
>   (thumb1_expand_prologue): Merge thumb1_output_function_prologue
>   to emit the entire prologue as rtl.
>   (thumb1_output_interwork): Split out from
>   thumb1_output_function_prologue.
>   (thumb1_output_function_prologue): Remove.
>   (arm_attr_length_push_multi): Handle thumb1.
>   * config/arm/arm.md (VUNSPEC_THUMB1_INTERWORK): New.
>   (prologue_thumb1_interwork): New.
>   (*push_multi): Allow thumb1; use push_mult_memory_operand.
>   * config/arm/predicates.md (push_mult_memory_operand): New.
> 

OK if the arm-eabi tests are OK.

R.



Re: [testsuite] ARM tests vfp-ldm*.c and vfp-stm*.c

2011-06-24 Thread Joseph S. Myers
On Thu, 23 Jun 2011, Janis Johnson wrote:

> Tests target/arm/vfp-ldm*.c and vfp-stm*.c add -mfloat-abi=softfp but
> fail if multilib flags override that option.  This patch skips the test
> for multilibs that specify a different value for -mfloat-abi.

While they need to be skipped for -mfloat-abi=soft, I'd think they ought 
to pass for -mfloat-abi=hard - why do they fail there?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [testsuite] ARM ivopts tests: skip for no thumb support

2011-06-24 Thread Tom de Vries
On 06/24/2011 12:07 AM, Janis Johnson wrote:
> On 06/23/2011 02:56 PM, Ramana Radhakrishnan wrote:
>> On 23 June 2011 22:36, Janis Johnson  wrote:
>>> Tests gcc.target/arm/ivopts*.c add -mthumb but fail on targets without
>>> thumb support; skip those targets.  The tests save temporary files and
>>> need to remove them at the end, easily done with cleanup-saved-temps.
>>>
>>> Test ivopts-6.c is the only one of the set that does not require thumb2
>>> support in the check for object-size, and it fails for -march=iwmmxt
>>> and iwmmxt2; the check should probably be used on that test as well,
>>> although I haven't included it here.
>>
>> I'm not sure I understand the change for ivopts-6.c :
>>
>> It's skipping if there is no Thumb support by default but the test
>> assumes the test will run with -marm on the command line ?
>>
>> Ramana
> 
> Oops, I got carried away and didn't notice that it uses -marm rather
> than -mthumb.  I'll take another look at that one.
> 
> Janis

How about this patch? I removed all -mthumb/-marm option settings, and instead
focused on trying to guard the object-size tests properly.

I introduced 2 new arm-related effective targets to accomplish this.
- arm_thumb2: Tests if we're compiling for thumb2.
- arm_nothumb: Tests if we're not compiling for any thumb.
I don't know how to get the same effect with the existing arm-related effective
targets.
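
For illustration, the distinction the two new effective targets draw can be expressed with the compiler's predefined macros __thumb__ and __thumb2__. This is a sketch only: the enum and function names are invented here, and the real checks live in lib/target-supports.exp, which is not shown in this excerpt.

```c
/* Which instruction set the current translation unit is compiled for.
   The enum and function names are hypothetical, for this sketch only.  */
enum arm_isa
{
  ISA_NOT_THUMB,	/* roughly what arm_nothumb is meant to detect */
  ISA_THUMB1,
  ISA_THUMB2		/* roughly what arm_thumb2 is meant to detect */
};

static enum arm_isa
current_isa (void)
{
#if defined (__thumb2__)
  return ISA_THUMB2;
#elif defined (__thumb__)
  return ISA_THUMB1;
#else
  return ISA_NOT_THUMB;
#endif
}
```

The point of testing the macros rather than passing -mthumb or -marm is that the result reflects whatever options the multilib under test actually selected.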

Thanks,
- Tom

2011-06-24  Janis Johnson  
Tom de Vries  

* lib/target-supports.exp (check_effective_target_arm_nothumb)
(check_effective_target_arm_thumb2): New effective targets.
* gcc.target/arm/ivopts.c: Remove -mthumb/-marm.  Guard object-size
properly.  Clean up temporary files.
* gcc.target/arm/ivopts-2.c: Likewise.
* gcc.target/arm/ivopts-3.c: Likewise.
* gcc.target/arm/ivopts-4.c: Likewise.
* gcc.target/arm/ivopts-5.c: Likewise.
* gcc.target/arm/ivopts-6.c: Remove duplicate of ivopts.c.
diff -u gcc/testsuite/gcc.target/arm/ivopts-3.c (revision 0) gcc/testsuite/gcc.target/arm/ivopts-3.c (revision 0)
--- gcc/testsuite/gcc.target/arm/ivopts-3.c (revision 0)
+++ gcc/testsuite/gcc.target/arm/ivopts-3.c (revision 0)
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */
+/* { dg-options "-Os -fdump-tree-ivopts -save-temps" } */
 
 extern unsigned int foo2 (short*) __attribute__((pure));
 
@@ -19,2 +19,3 @@
-/* { dg-final { object-size text <= 30 { target arm_thumb2_ok } } } */
+/* { dg-final { object-size text <= 30 { target arm_thumb2 } } } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
+/* { dg-final { cleanup-saved-temps "ivopts" } } */
diff -u gcc/testsuite/gcc.target/arm/ivopts-4.c (revision 0) gcc/testsuite/gcc.target/arm/ivopts-4.c (revision 0)
--- gcc/testsuite/gcc.target/arm/ivopts-4.c (revision 0)
+++ gcc/testsuite/gcc.target/arm/ivopts-4.c (revision 0)
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-mthumb -Os -fdump-tree-ivopts -save-temps" } */
+/* { dg-options "-Os -fdump-tree-ivopts -save-temps" } */
 
 extern unsigned int foo (int*) __attribute__((pure));
 
@@ -20,2 +20,3 @@
-/* { dg-final { object-size text <= 36 { target arm_thumb2_ok } } } */
+/* { dg-final { object-size text <= 36 { target arm_thumb2 } } } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
+/* { dg-final { cleanup-saved-temps "ivopts" } } */
diff -u gcc/testsuite/gcc.target/arm/ivopts-5.c (revision 0) gcc/testsuite/gcc.target/arm/ivopts-5.c (revision 0)
--- gcc/testsuite/gcc.target/arm/ivopts-5.c (revision 0)
+++ gcc/testsuite/gcc.target/arm/ivopts-5.c (revision 0)
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */
+/* { dg-options "-Os -fdump-tree-ivopts -save-temps" } */
 
 extern unsigned int foo (int*) __attribute__((pure));
 
@@ -19,2 +19,3 @@
-/* { dg-final { object-size text <= 30 { target arm_thumb2_ok } } } */
+/* { dg-final { object-size text <= 30 { target arm_thumb2 } } } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
+/* { dg-final { cleanup-saved-temps "ivopts" } } */
diff -u gcc/testsuite/gcc.target/arm/ivopts-2.c (revision 0) gcc/testsuite/gcc.target/arm/ivopts-2.c (revision 0)
--- gcc/testsuite/gcc.target/arm/ivopts-2.c (revision 0)
+++ gcc/testsuite/gcc.target/arm/ivopts-2.c (revision 0)
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-Os -mthumb -fdump-tree-ivopts -save-temps" } */
+/* { dg-options "-Os -fdump-tree-ivopts -save-temps" } */
 
 extern void foo2 (short*);
 
@@ -17,2 +17,3 @@
-/* { dg-final { object-size text <= 26 { target arm_thumb2_ok } } } */
+/* { dg-final { object-size text <= 26 { target arm_thumb2 } } } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
+/* { dg-final { cleanup-saved-temps "ivopts" } } */
diff -u gcc/testsuite/gcc.target/arm/ivopts.c (revision 0) gcc/testsuite/gcc.target/arm/ivopts.c (revision 0)
--- gcc/testsuite/gcc.target/arm/ivopts.c (revision 0)
+++ gc

Re: varpool alias reorg

2011-06-24 Thread Jan Hubicka
Hi,
I also tested the attached variant that simply disables the builtins streaming.
It fixes the testcase, too, bootstraps/regtests on x86_64 with and without the
plugin, and builds Mozilla.  It also solves the decl sharing problems that lead
to debug info confusion.
However, it breaks the memops-asm.c testcase.  What the testcase does is give
builtins an asm name (i.e. my_memcpy instead of memcpy) and then test that the
new calls to the builtins introduced via folding actually use these asm names.

The code this patch removes was probably invented specifically for this
testcase: instead of streaming the builtin decl like we do all other streaming,
we just stream a reference to the builtin + an asm name, and the single builtin
decl in the LTO unit gets the name.

The problem is that this won't work when LTOing multiple such units each giving
a different asm name (i.e. one of the units will win).  We can't quite fix this
because we don't want to keep track of where the code we are folding is coming
from.

So I wonder, do we really need to preserve this behaviour?  
I do not think it is documented anywhere and it seems to me that both variants:
ignoring the asm names or honoring them are sane.

Honza

Index: testsuite/gcc.c-torture/execute/builtins/memops-asm-lib.c
===
*** testsuite/gcc.c-torture/execute/builtins/memops-asm-lib.c   (revision 175350)
--- testsuite/gcc.c-torture/execute/builtins/memops-asm-lib.c   (working copy)
*** typedef __SIZE_TYPE__ size_t;
*** 6,12 
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
 actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void *
  my_memcpy (void *d, const void *s, size_t n)
  {
--- 6,11 
*** my_memcpy (void *d, const void *s, size_
*** 19,25 
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
 actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void
  my_bcopy (const void *s, void *d, size_t n)
  {
--- 18,23 
*** my_bcopy (const void *s, void *d, size_t
*** 39,45 
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
 actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void *
  my_memset (void *d, int c, size_t n)
  {
--- 37,42 
*** my_memset (void *d, int c, size_t n)
*** 51,57 
  
  /* LTO code is at the present to able to track that asm alias my_bcopy on builtin
 actually refers to this function.  See PR47181. */
- __attribute__ ((used))
  void
  my_bzero (void *d, size_t n)
  {
--- 48,53 
Index: lto-streamer-out.c
===
*** lto-streamer-out.c  (revision 175350)
--- lto-streamer-out.c  (working copy)
*** pack_ts_function_decl_value_fields (stru
*** 484,491 
  {
/* For normal/md builtins we only write the class and code, so they
   should never be handled here.  */
-   gcc_assert (!lto_stream_as_builtin_p (expr));
- 
bp_pack_enum (bp, built_in_class, BUILT_IN_LAST,
DECL_BUILT_IN_CLASS (expr));
bp_pack_value (bp, DECL_STATIC_CONSTRUCTOR (expr), 1);
--- 484,489 
*** lto_output_tree_header (struct output_bl
*** 1306,1346 
  }
  
  
- /* Write the code and class of builtin EXPR to output block OB.  IX is
-the index into the streamer cache where EXPR is stored.*/
- 
- static void
- lto_output_builtin_tree (struct output_block *ob, tree expr)
- {
-   gcc_assert (lto_stream_as_builtin_p (expr));
- 
-   if (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_MD
-   && !targetm.builtin_decl)
- sorry ("gimple bytecode streams do not support machine specific builtin "
-  "functions on this target");
- 
-   output_record_start (ob, LTO_builtin_decl);
-   lto_output_enum (ob->main_stream, built_in_class, BUILT_IN_LAST,
-  DECL_BUILT_IN_CLASS (expr));
-   output_uleb128 (ob, DECL_FUNCTION_CODE (expr));
- 
-   if (DECL_ASSEMBLER_NAME_SET_P (expr))
- {
-   /* When the assembler name of a builtin gets a user name,
-the new name is always prefixed with '*' by
-set_builtin_user_assembler_name.  So, to prevent the
-reader side from adding a second '*', we omit it here.  */
-   const char *str = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (expr));
-   if (strlen (str) > 1 && str[0] == '*')
-   lto_output_string (ob, ob->main_stream, &str[1], true);
-   else
-   lto_output_string (ob, ob->main_stream, NULL, true);
- }
-   else
- lto_output_string (ob, ob->main_stream, NULL, true);
- }
- 
- 
  /* Write a physical representation of tree node EXPR to output block
 OB.  If REF_P is true, the leaves of EXPR are emitted as references
 via lto_output_tree_ref.  IX is the index into the streamer cache
--- 1304,1309 
*** lto_output_tree (st

[PATCH, MELT] pragma support in MELT

2011-06-24 Thread Pierre Vittet

Hello,

This patch completes the pragma support in MELT. Now, a plugin can
register several pragmas (with different names) in the following format
(for GCC > 4.6):


#pragma MELT   (,...).

This pragma can easily be handled in a MELT function, which receives the
operator and the list of arguments as parameters.


For GCC <= 4.6, there is minimal pragma support; we can handle the following
pragma:


#pragma GCCPLUGIN melt  (,...)

with only melt as name.

ChangeLog:

2011-06-24  Pierre Vittet 

* melt-runtime.c (GCC_PRAGMA_BAD): Macro to return an error from the
pragma handling system.
[__GNUC__>4.6](melt_handle_melt_pragma, handle_melt_pragma,
melt_pragma_callback): Add functions for full pragma handling.
[__GNUC__<=4.6](melt_handle_melt_pragma, handle_melt_pragma,
melt_pragma_callback): Add functions for limited pragma handling.
(melt_startunit_callback): Register a callback for pragma.
* Makefile.in (CFAMILYINC): We need c-family header in include 
headers.
Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 175348)
+++ gcc/Makefile.in (working copy)
@@ -359,6 +359,9 @@ DECNUMFMT = $(srcdir)/../libdecnumber/$(enable_dec
 DECNUMINC = -I$(DECNUM) -I$(DECNUMFMT) -I../libdecnumber
 LIBDECNUMBER = ../libdecnumber/libdecnumber.a
 
+#c-family header
+CFAMILYINC=-I$(srcdir)/c-family
+
 # Target to use when installing include directory.  Either
 # install-headers-tar, install-headers-cpio or install-headers-cp.
 INSTALL_HEADERS_DIR = @build_install_headers_dir@
@@ -1096,8 +1099,8 @@ INCLUDES = -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@
   -I$(srcdir)/melt/generated \
   -I$(srcdir)/../include @INCINTL@ \
   $(CPPINC) $(GMPINC) $(DECNUMINC) \
-  $(PPLINC) $(CLOOGINC)
-
+  $(PPLINC) $(CLOOGINC) \
+  $(CFAMILYINC)
 .c.o:
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $< $(OUTPUT_OPTION)
 
Index: gcc/melt-runtime.c
===
--- gcc/melt-runtime.c  (revision 175348)
+++ gcc/melt-runtime.c  (working copy)
@@ -74,7 +74,9 @@ along with GCC; see the file COPYING3.   If not se
 #include "md5.h"
 #include "plugin.h"
 #include "cppdefault.h"
+#include "c-pragma.h"
 
+
 #if BUILDING_GCC_VERSION > 4005
 /* GCC 4.6 has realmpfr.h which includes   */
 #include "realmpfr.h"
@@ -8938,7 +8940,284 @@ melt_attribute_callback(void *gcc_data ATTRIBUTE_U
   register_attribute(&melt_attr_spec);
 }
 
+/* We declare weak functions because they cannot be linked when we use lto (it
+   loses langage specific informations).
+   If you use one of those functions you must check them to be not NULL.
+*/
+extern enum cpp_ttype __attribute__((weak)) pragma_lex (tree *);
 
+
+
+#define GCC_PRAGMA_BAD(gmsgid) \
+  do { warning (OPT_Wpragmas, gmsgid); goto end; } while (0)
+
+
+
+/* Test for GCC > 4.6.0 */
+#if __GNUC__ > 4 || \
+(__GNUC__ == 4 && (__GNUC_MINOR__ > 6))
+/*Full pragma with data support.*/
+
+void melt_handle_melt_pragma (melt_ptr_t optreev, melt_ptr_t listargtreev,
+  int indice_handler);
+
+extern void __attribute__((weak)) c_register_pragma_with_expansion_and_data
+(const char *space, const char *name,
+ pragma_handler_2arg handler, void *data);
+
+/* handle a melt pragma: data contain the name of the command (as a string)*/
+static void
+handle_melt_pragma (cpp_reader *ARG_UNUSED(dummy), void * data)
+{
+  enum cpp_ttype token;
+  /*list containing the pragma argument*/
+  tree x;
+  int ihandler = (int) data;
+  MELT_ENTERFRAME (3, NULL);
+#define seqv  meltfram__.mcfr_varptr[0]
+#define treev meltfram__.mcfr_varptr[1]
+#define optreev   meltfram__.mcfr_varptr[2]
+  if(! pragma_lex || ! c_register_pragma_with_expansion_and_data)
+GCC_PRAGMA_BAD("Cannot use pragma symbol at this level \
+   (maybe you use -flto which is incompatible).");
+
+  token = pragma_lex (&x);
+  if(token != CPP_NAME)
+GCC_PRAGMA_BAD ("malformed #pragma melt, ignored");
+  optreev = meltgc_new_tree((meltobject_ptr_t) MELT_PREDEF (DISCR_TREE), x);
+  /*If the pragma has the form #pragma PLUGIN melt id (...) then optreev is the
+  tree containing "id".
+  Next element should be a parenthese opening. */
+  token = pragma_lex (&x);
+  if (token != CPP_OPEN_PAREN){
+if (token != CPP_EOF)
+  GCC_PRAGMA_BAD ("malformed #pragma melt, ignored");
+
+else{ /* we have a pragma of the type '#pragma PLUGIN melt instr' */
+  melt_handle_melt_pragma ((melt_ptr_t ) optreev, (melt_ptr_t ) NULL,
+   ihandler);
+}
+  }
+  else{/* opening parenthesis */
+seqv = meltgc_new_list ((meltobject_ptr_t) MELT_PREDEF (DISCR_LIST));
+do {
+  token = pragma_lex (&x);
+  if(token != CPP_NAME && token != CPP_STRING && token != CPP_NUMBER)
+GCC_PRAGMA_BAD ("malformed #pragma 

Re: [PATCH, MELT] pragma support in MELT

2011-06-24 Thread Basile Starynkevitch
On Fri, Jun 24, 2011 at 02:04:20PM +0200, Pierre Vittet wrote:
> Hello,
> 
> This patch completes the pragma support in MELT. Now, a plugin can
> register several pragmas (with different names) in the following
> format (for GCC > 4.6):
> 
> #pragma MELT   (,...).
> 
> This pragma can easily be handled in a MELT function, which receives the
> operator and the list of arguments as parameters.
> 
> For GCC<=4.6, there is minimal pragma support; we can handle the
> following pragma:
> 
> #pragma GCCPLUGIN melt  (,...)
> 
> with only melt as name.
> 
> ChangeLog:
> 
> 2011-06-24  Pierre Vittet 
> 
> * melt-runtime.c (GCC_PRAGMA_BAD): Macro to return an error from the
> pragma handling system.
> [__GNUC__>4.6](melt_handle_melt_pragma, handle_melt_pragma,
> melt_pragma_callback): Add functions for full pragma handling.
> [__GNUC__<=4.6](melt_handle_melt_pragma, handle_melt_pragma,
> melt_pragma_callback): Add functions for limited pragma handling.
> (melt_startunit_callback): Register a callback for pragma.
> * Makefile.in (CFAMILYINC): We need c-family header in include
> headers.

Thanks for the work. Almost perfect!

Your ChangeLog entry is not properly aligned. And what happens when
MELT is running in LTO mode?

Cheers.

PS. I will call you. 
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
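
On the LTO question: the patch declares pragma_lex and
c_register_pragma_with_expansion_and_data as weak and tests them against NULL
before use.  A minimal sketch of that pattern follows, assuming an ELF target
and GCC's weak attribute; optional_hook and call_hook_or_default are
hypothetical names, not part of MELT.

```c
#include <stddef.h>

/* Hypothetical optional hook; nothing in this file defines it.  Because the
   declaration is weak, the link still succeeds and the address is NULL.  */
extern int optional_hook (int) __attribute__ ((weak));

/* Call the hook when it was linked in, otherwise return FALLBACK.  */
int
call_hook_or_default (int arg, int fallback)
{
  if (!optional_hook)
    return fallback;
  return optional_hook (arg);
}
```

With no definition of optional_hook in the link, the function takes the
fallback path instead of crashing on a call through a null address.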


Re: varpool alias reorg

2011-06-24 Thread Jan Hubicka
Hi,
this is yet another variant of the fix.  This time we stream builtin decls as
usual, but at fixup time we copy the assembler names (if set) into the
builtin decls used by folders.  Not sure if it is any better than breaking
memops-asm, but I can imagine that things like glibc actually rename string
functions into their internal variants (and thus with this version of the
patch we would be able to LTO such a library, but still we won't be able to
LTO such a library into something else because something else would end up
referencing the internal versions of builtins).  I doubt we could do any
better, however.

__attribute__ ((used)) is still needed in memops-asm-lib.c because the LTO
symtab of course doesn't see the future references to builtins that we will
emit later via folding.  I think it is a reasonable requirement, as discussed
at the time of enabling the plugin.

Honza

Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 175350)
+++ lto-streamer-out.c  (working copy)
@@ -484,8 +484,6 @@ pack_ts_function_decl_value_fields (stru
 {
   /* For normal/md builtins we only write the class and code, so they
  should never be handled here.  */
-  gcc_assert (!lto_stream_as_builtin_p (expr));
-
   bp_pack_enum (bp, built_in_class, BUILT_IN_LAST,
DECL_BUILT_IN_CLASS (expr));
   bp_pack_value (bp, DECL_STATIC_CONSTRUCTOR (expr), 1);
@@ -1121,7 +1119,7 @@ lto_output_ts_binfo_tree_pointers (struc
  together large portions of programs making it harder to partition.  Becuase
  devirtualization is interesting before inlining, only, there is no real
  need to ship it into ltrans partition.  */
-  lto_output_tree_or_ref (ob, flag_wpa ? NULL : BINFO_VIRTUALS (expr), ref_p);
+  lto_output_tree_or_ref (ob, flag_wpa || 1 ? NULL : BINFO_VIRTUALS (expr), ref_p);
   lto_output_tree_or_ref (ob, BINFO_VPTR_FIELD (expr), ref_p);
 
   output_uleb128 (ob, VEC_length (tree, BINFO_BASE_ACCESSES (expr)));
@@ -1306,41 +1304,6 @@ lto_output_tree_header (struct output_bl
 }
 
 
-/* Write the code and class of builtin EXPR to output block OB.  IX is
-   the index into the streamer cache where EXPR is stored.*/
-
-static void
-lto_output_builtin_tree (struct output_block *ob, tree expr)
-{
-  gcc_assert (lto_stream_as_builtin_p (expr));
-
-  if (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_MD
-  && !targetm.builtin_decl)
-sorry ("gimple bytecode streams do not support machine specific builtin "
-  "functions on this target");
-
-  output_record_start (ob, LTO_builtin_decl);
-  lto_output_enum (ob->main_stream, built_in_class, BUILT_IN_LAST,
-  DECL_BUILT_IN_CLASS (expr));
-  output_uleb128 (ob, DECL_FUNCTION_CODE (expr));
-
-  if (DECL_ASSEMBLER_NAME_SET_P (expr))
-{
-  /* When the assembler name of a builtin gets a user name,
-the new name is always prefixed with '*' by
-set_builtin_user_assembler_name.  So, to prevent the
-reader side from adding a second '*', we omit it here.  */
-  const char *str = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (expr));
-  if (strlen (str) > 1 && str[0] == '*')
-   lto_output_string (ob, ob->main_stream, &str[1], true);
-  else
-   lto_output_string (ob, ob->main_stream, NULL, true);
-}
-  else
-lto_output_string (ob, ob->main_stream, NULL, true);
-}
-
-
 /* Write a physical representation of tree node EXPR to output block
OB.  If REF_P is true, the leaves of EXPR are emitted as references
via lto_output_tree_ref.  IX is the index into the streamer cache
@@ -1456,15 +1419,6 @@ lto_output_tree (struct output_block *ob
   lto_output_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
   lto_tree_code_to_tag (TREE_CODE (expr)));
 }
-  else if (lto_stream_as_builtin_p (expr))
-{
-  /* MD and NORMAL builtins do not need to be written out
-completely as they are always instantiated by the
-compiler on startup.  The only builtins that need to
-be written out are BUILT_IN_FRONTEND.  For all other
-builtins, we simply write the class and code.  */
-  lto_output_builtin_tree (ob, expr);
-}
   else
 {
   /* This is the first time we see EXPR, write its fields
Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 175350)
+++ lto-streamer-in.c   (working copy)
@@ -1736,18 +1736,7 @@ unpack_ts_function_decl_value_fields (st
   DECL_PURE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
   DECL_LOOPING_CONST_OR_PURE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
   if (DECL_BUILT_IN_CLASS (expr) != NOT_BUILT_IN)
-{
-  DECL_FUNCTION_CODE (expr) = (enum built_in_function) bp_unpack_value (bp, 11);
-  if (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_NORMAL
- && DECL_FUNCTION_CODE (expr) >= END_BUILTINS)
-   fatal_error ("machine independent builtin code out of range");
- 

New German PO file for 'gcc' (version 4.6.0)

2011-06-24 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

http://translationproject.org/latest/gcc/de.po

(This file, 'gcc-4.6.0.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: C++ PATCH for c++/35255 (address of template-id)

2011-06-24 Thread Diego Novillo
On Thu, Jun 23, 2011 at 22:10, Jason Merrill  wrote:
> Per DR 115, if the context of a template-id doesn't give enough type
> information to resolve it and the template-id fully resolves exactly one
> specialization, we should use that one.  The code in
> resolve_overloaded_unification was trying to do this, but was failing to
> handle the case where there are additional templates that aren't fully
> resolved.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.

Patch missing.


Re: Removed unused cp_binding_level field names_size. (issue4662052)

2011-06-24 Thread Diego Novillo
On Thu, Jun 23, 2011 at 22:02, Jason Merrill  wrote:
> OK.
>
> Jason

Applied to trunk rev. 175373.


Diego.


Re: [pph contrib] Add support for multiple spawn patterns in repro_fail (issue4571061)

2011-06-24 Thread Diego Novillo
On Tue, Jun 21, 2011 at 14:36, Alexandre Oliva  wrote:
> On Jun 10, 2011, dnovi...@google.com (Diego Novillo) wrote:
>
>> I'm thinking that this script is better written in python, but that
>> may make it less generic and I don't know whether we accept python in
>> gcc/contrib.  Alex?
>
> I guess anything goes in gcc/contrib, so it could be rewritten in
> Python, but the bash script you checked in looks good enough to me.

Thanks.  I committed the script to trunk r175374.


Diego.


Re: [testsuite] ARM test pr42093.c: thumb2 or thumb1

2011-06-24 Thread Ramana Radhakrishnan

On 24/06/11 01:40, Janis Johnson wrote:

Test gcc.target/arm/pr42093.c, added by Ramana, requires support for
arm_thumb2 but fails for those targets.  The patch for which it was
added modified support for thumb1.  Should the test instead require
arm_thumb1_ok, as in this patch?


No, this is for a Thumb2 defect, so the test is valid for Thumb2: we
shouldn't be generating a tbb / tbh with signed offsets, and that's what
was happening there.


I think this test ends up being fragile because the generation of tbb /
tbh depends on how the blocks have been laid out.  It would be
interesting to try to get a test that works reliably in T2.


cheers
Ramana



Janis




Re: [Patch ARM] Add predefine for availability of DSP multiplication functions.

2011-06-24 Thread Ramana Radhakrishnan
On 24 June 2011 09:20, James Greenhalgh  wrote:
> Apologies, I sent an incomplete ChangeLog entry.

This is OK.

cheers
Ramana


Re: PATCH: PR rtl-optimization/49504: Invalid optimization for Pmode != ptr_mode

2011-06-24 Thread H.J. Lu
On Fri, Jun 24, 2011 at 1:58 AM, Eric Botcazou  wrote:
>> I just don't see how nonzero_bits1 can assume that, if pointers extend
>> unsigned and this is an addition or subtraction to a pointer in Pmode, all
>> the bits above ptr_mode are known to be zero.  We never ran into it before
>> x32 since x32 is the first such target.
>
> I agree that this is overly optimistic, but this was done on purpose:
>  http://gcc.gnu.org/ml/gcc-patches/2001-02/msg00316.html
>
> Could you evaluate the pessimization that the patch introduces (if any) for
> the generated code on x32?  If there is none or it is negligible, the patch
> is OK if completed by the removal of the equivalent code in
> num_sign_bit_copies1.  If it isn't negligible, we may need to be clever and
> devise something else.
>

I compared x32 glibc built with the old x32 gcc against x32 glibc built with
this patch and

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00913.html

reverted; the new glibc is a little smaller:

New:

[hjl@gnu-33 build-x86_64-linux]$ size libc.so
   text    data     bss     dec     hex filename
1537648   10076   12944 1560668  17d05c libc.so
[hjl@gnu-33 build-x86_64-linux]$

Old:

[hjl@gnu-33 build-x86_64-linux.old]$ size libc.so
   text    data     bss     dec     hex filename
1538968   10076   12944 1561988  17d584 libc.so
[hjl@gnu-33 build-x86_64-linux.old]$

I looked at the generated assembly code.  The new one is better.
I will check it in.

Thanks.

-- 
H.J.
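
The concern quoted above can be modelled in plain C.  This is only a sketch,
assuming a 32-bit ptr_mode and a 64-bit Pmode as on x32; add_in_pmode and
upper_bits_zero_p are hypothetical helpers, not code from rtlanal.c.

```c
#include <stdint.h>

/* Model a 32-bit pointer (ptr_mode) zero-extended to a 64-bit register
   (Pmode), followed by an addition carried out in the 64-bit mode.  */
uint64_t
add_in_pmode (uint32_t ptr, uint32_t offset)
{
  uint64_t wide = (uint64_t) ptr;   /* upper 32 bits are zero here */
  return wide + offset;             /* ...but the sum may carry into bit 32 */
}

/* Return nonzero when the bits above ptr_mode are still all zero.  */
int
upper_bits_zero_p (uint64_t value)
{
  return (value >> 32) == 0;
}
```

A pointer near the top of the 32-bit range plus a small offset carries into
bit 32, so the bits above ptr_mode cannot be assumed zero after the addition.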


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-24 Thread Stubbs, Andrew
On 24/06/11 09:28, Richard Guenther wrote:
>> >  To be clear, it only skips past NOP_EXPR. Is it not the case that what
>> >  you're describing would need a CONVERT_EXPR?
> NOP_EXPR is the same as CONVERT_EXPR.

Are you sure?

I thought this was safe because the internals manual says:

   NOP_EXPR
   These nodes are used to represent conversions that do not require any
   code-generation 

   CONVERT_EXPR
   These nodes are similar to NOP_EXPRs, but are used in those
   situations where code may need to be generated 

So, I tried this example:

int
foo (int a, short b, short c)
{
   int bc = b * c;
   return a + (short)bc;
}

Both before and after my patch, GCC gives:

 mul r2, r1, r2
 sxtah   r0, r0, r2

(where, SXTAH means sign-extend the third operand from HImode to SImode 
and add to the second operand.)

The dump after the widening_mult pass is:

foo (int a, short int b, short int c)
{
   int bc;
   int D.2018;
   short int D.2017;
   int D.2016;
   int D.2015;
   int D.2014;

<bb 2>:
   D.2014_2 = (int) b_1(D);
   D.2015_4 = (int) c_3(D);
   bc_5 = b_1(D) w* c_3(D);
   D.2017_6 = (short int) bc_5;
   D.2018_7 = (int) D.2017_6;
   D.2016_9 = D.2018_7 + a_8(D);
   return D.2016_9;

}

Where you can clearly see that the addition has not been recognised as a 
multiply-and-accumulate.

When I step through convert_plusminus_to_widen, I can see that the 
reason it has not matched is because "D.2017_6 = (short int) bc_5" is 
encoded with a CONVERT_EXPR, just as the manual said it would be.

So, according to the manual, and my (admittedly limited) experiments, 
skipping over NOP_EXPR does appear to be safe.

But you say that it isn't safe. So now I'm confused. :(

I can certainly add checks to make sure that the skipped operations 
actually don't make any important changes to the value, but do I need to?

Andrew
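
For reference, a small C sketch of why the (short) conversion in the example
cannot simply be skipped.  It assumes a 16-bit short and GCC's wrap-around
behaviour for out-of-range conversions to a signed type (implementation-defined
in ISO C); both function names are hypothetical.

```c
/* Multiply two shorts in int, then add with and without the intermediate
   narrowing conversion from the example above.  */
int
mac_with_truncation (int a, short b, short c)
{
  int bc = b * c;
  return a + (short) bc;   /* the (short) cast may change the value */
}

int
mac_without_truncation (int a, short b, short c)
{
  int bc = b * c;
  return a + bc;           /* what a plain multiply-and-accumulate computes */
}
```

For small operands both functions agree; once b * c no longer fits in a
short, the results differ, so a conversion may only be looked through when it
is known not to change the value.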


Ping: C-family stack check for threads

2011-06-24 Thread Thomas Klein

Hi

This is a ping of 
(http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html).

Repeating my request.

I would like to have a stack check for threads that have only a small
amount of stack space each.
(I'm using an ARM Cortex-M3 microcontroller with a stack size of 1 KByte
per thread.)

Each thread has its own limit address.
The thread scheduler can then calculate the limit and store this value
in a global variable.
The compiler can generate code to check the stack for overflow at
function entry.

In principle this can be done as follows:
  - push registers as usual
  - figure out whether one or two work registers are available that can
    be used directly without an extra push
  - if not enough registers are found, push the required work registers
    to the stack
  - load the limit address into the first work register
  - load the value at the limit address (into the same register)
  - if the function will extend the stack (e.g. for local variables),
    load this size value too (here the second work register can be used)
  - compare for overflow
  - if an overflow occurs, "call" the stack_failure function
  - pop the work registers that were pushed before
  - continue the function prologue as usual, e.g. extend the stack pointer
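
The comparison in the steps above can be sketched in C.  This is a model
only, assuming a stack that grows downwards; stack_limit_var stands for the
scheduler-maintained variable and stack_would_overflow is a hypothetical
helper, not code emitted by the patch.

```c
#include <stdint.h>

/* Hypothetical per-thread limit, updated by the scheduler on each switch.  */
uintptr_t stack_limit_var;

/* Model of the prologue check for a downward-growing stack: return nonzero
   when lowering SP by FRAME_SIZE bytes would go below the current limit.  */
int
stack_would_overflow (uintptr_t sp, uintptr_t frame_size)
{
  uintptr_t limit = stack_limit_var;   /* load limit address, then its value */
  if (sp < limit)
    return 1;                          /* already below the limit */
  return sp - limit < frame_size;      /* not enough room for the frame */
}
```

Comparing the remaining room against the frame size, rather than computing
sp - frame_size first, avoids an unsigned wrap-around when the frame is larger
than the stack pointer value itself.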

The ARM target has an option "-mapcs-stack-check", but it is more or
less not working (the implementation seems to be missing).

There are also architecture-independent options that can be used, such as
"-fstack-check=generic", "-fstack-limit-symbol=current_stack_limit" or
"-fstack-limit-register=r6".

The generic stack check does a probe at the end of the function prologue
(e.g. by writing 12K ahead of the current stack pointer position).
If this stack space is not available, the probe may generate a fault.
This requires the CPU to have an MPU or an MMU.
For machines with a small memory space an additional mechanism should be
available.

The option "-fstack-check" can be extended by the switches "direct" and
"indirect" to emit compare code in the function prologue.
If the switch "direct" is given, the address of the "-fstack-limit-symbol"
symbol is the limit itself.

If the switch "indirect" is given, the "-fstack-limit-symbol" symbol is a
global variable that needs to be read before the comparison.
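
The difference between the two proposed switches can be illustrated in C.
This is a sketch under the reading given above; stack_limit_symbol,
limit_direct and limit_indirect are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical symbol passed to -fstack-limit-symbol.  */
uintptr_t stack_limit_symbol;

/* "direct": the address of the symbol is itself the limit.  */
uintptr_t
limit_direct (void)
{
  return (uintptr_t) &stack_limit_symbol;
}

/* "indirect": the symbol is a variable that holds the limit, so one extra
   load is needed before the comparison.  */
uintptr_t
limit_indirect (void)
{
  return stack_limit_symbol;
}
```

The indirect form is what lets a scheduler change the limit at run time on
every thread switch; the direct form fixes the limit at link time.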

I have added a proposal to show how this behavior can be integrated
for the ARM architecture.

The generated code looks like this,
e.g. when using "-fstack-check=indirect -fstack-limit-symbol=stack_limit_var":
->   push {r0}
->   ldr r0, .LSPCHK0
->   ldr r0, [r0]
->   cmp sp, r0
->   bhs .LSPCHK1
->   push {lr}
->   bl __thumb_stack_failure
-> .align 2
-> .LSPCHK0:
-> .word stack_limit_var
-> .LSPCHK1:
->   pop {r0}

Regards
  Thomas Klein

gcc/ChangeLog

2011-06-24  Thomas Klein  
* opts.c (common_handle_option): introduce additional stack checking
parameters "direct" and "indirect"
* flag-types.h (enum stack_check_type): Likewise

* explow.c (allocate_dynamic_stack_space):
- suppress stack probing if parameter "direct", "indirect" or if a
stack-limit is given
- do additional read of limit value if parameter "indirect" and a
stack-limit symbol is given
- emit a call to a stack_failure function [as an alternative to a trap
call]
(function probe_stack_range): if allowed to override the range probe
emit generic_limit_check_stack

* config/arm/arm.c (stack_check_output_function): new function to 
write

the stack check code sequence to the assember file (inside prologue)
(stack_check_work_registers): new function to find possible working
registers [only used by "stack check"]
(arm_expand_prologue): stack check integration for ARM and Thumb-2
(thumb1_output_function_prologue): stack check integration for Thumb-1

* config/arm/arm.md (probe_stack): do not emit code when parameters
"direct" or "indirect" given, emit move code as in gcc/explow.c
[function emit_stack_probe]
(probe_stack_done): dummy to make sure probe_stack insns are not
optimized away
(generic_limit_check_stack): if stack-limit and parameter "generic" is
given use the limit the same way as in function
allocate_dynamic_stack_space
(stack_check): ARM/Thumb-2 insn to output function
stack_check_output_function
(stack_failure): failure call used in function
allocate_dynamic_stack_space [similar to a trap but avoid conflict 
with

builtin_trap]

Index: gcc/opts.c
===
--- gcc/opts.c(revision 175346)
+++ gcc/opts.c(working copy)
@@ -1629,6 +1629,12 @@ common_handle_option (struct gcc_options *opts,
: STACK_CHECK_STATIC_BUILTIN
  ? STATIC_BUILTIN_STACK_CHECK
  : GENERIC_STACK_CHECK;
+  else if (!strcmp (arg, "indirect"))
+/* This is an other stack checking method.  */
+opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+  else if (!strcmp (arg, "direct"))
+/* This is an other stack checking method.  */
+opts->x_flag_stack_

Re: PATCH: PR rtl-optimization/49504: Invalid optimization for Pmode != ptr_mode

2011-06-24 Thread Eric Botcazou
> I compared x32 glibc built with the old x32 gcc against x32 glibc built
> with this patch and
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00913.html
>
> reverted, the new glibc is little smaller:
>
> New:
>
> [hjl@gnu-33 build-x86_64-linux]$ size libc.so
>    text    data     bss     dec     hex filename
> 1537648   10076   12944 1560668  17d05c libc.so
> [hjl@gnu-33 build-x86_64-linux]$
>
> Old:
>
> [hjl@gnu-33 build-x86_64-linux.old]$ size libc.so
>    text    data     bss     dec     hex filename
> 1538968   10076   12944 1561988  17d584 libc.so
> [hjl@gnu-33 build-x86_64-linux.old]$
>
> I looked at assembly codes.  The new one is better.
> I will check it in.

OK, but remove the equivalent code in num_sign_bit_copies1 then, otherwise 
someone in a couple of years from now will adapt it to nonzero_bits1 and we 
will be back to square one.

-- 
Eric Botcazou


Patch committed: Remove entry from gcc/go/ChangeLog

2011-06-24 Thread Ian Lance Taylor
I committed this patch to remove an entry from gcc/go/ChangeLog.  The
file in question is part of the gofrontend which lives in a separate
repository.  ChangeLog patches do not themselves get ChangeLog entries.

Ian

Index: ChangeLog
===
--- ChangeLog	(revision 175357)
+++ ChangeLog	(working copy)
@@ -1,8 +1,3 @@
-2011-06-21  Andrew MacLeod  
-
-	* gogo-tree.cc (Gogo::define_builtin_function_trees): Change
-	BUILT_IN_ADD_AND_FETCH to BUILT_IN_SYNC_ADD_AND_FETCH.
-
 2011-06-14  Joseph Myers  
 
 	* Make-lang.in (go/go-lang.o, go/go-backend.o): Update


Re: PATCH: PR rtl-optimization/49504: Invalid optimization for Pmode != ptr_mode

2011-06-24 Thread H.J. Lu
On Fri, Jun 24, 2011 at 7:07 AM, Eric Botcazou  wrote:
>> I compared x32 glibc built with the old x32 gcc against x32 glibc built
>> with this patch and
>>
>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00913.html
>>
>> reverted, the new glibc is little smaller:
>>
>> New:
>>
>> [hjl@gnu-33 build-x86_64-linux]$ size libc.so
>>    text          data     bss     dec     hex filename
>> 1537648         10076   12944 1560668  17d05c libc.so
>> [hjl@gnu-33 build-x86_64-linux]$
>>
>> Old:
>>
>> [hjl@gnu-33 build-x86_64-linux.old]$ size libc.so
>>    text          data     bss     dec     hex filename
>> 1538968         10076   12944 1561988  17d584 libc.so
>> [hjl@gnu-33 build-x86_64-linux.old]$
>>
>> I looked at assembly codes.  The new one is better.
>> I will check it in.
>
> OK, but remove the equivalent code in num_sign_bit_copies1 then, otherwise
> someone in a couple of years from now will adapt it to nonzero_bits1 and we
> will be back to square one.
>

I am testing this patch on x32 branch.  I will compare glibc binaries before and
after.

Thanks.

-- 
H.J.
---
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index e5c045d..0be6504 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -4605,21 +4605,6 @@ num_sign_bit_copies1 (const_rtx x, enum
machine_mode mode, const_rtx known_x,
 known_x, known_mode, known_ret);
   result = MAX (1, MIN (num0, num1) - 1);

-#ifdef POINTERS_EXTEND_UNSIGNED
-  /* If pointers extend signed and this is an addition or subtraction
-to a pointer in Pmode, all the bits above ptr_mode are known to be
-sign bit copies.  */
-  /* As we do not know which address space the pointer is refering to,
-we can do this only if the target does not support different pointer
-or address modes depending on the address space.  */
-  if (target_default_pointer_address_modes_p ()
- && ! POINTERS_EXTEND_UNSIGNED && GET_MODE (x) == Pmode
- && (code == PLUS || code == MINUS)
- && REG_P (XEXP (x, 0)) && REG_POINTER (XEXP (x, 0)))
-   result = MAX ((int) (GET_MODE_BITSIZE (Pmode)
-- GET_MODE_BITSIZE (ptr_mode) + 1),
- result);
-#endif
   return result;

 case MULT:


[PATCH] __builtin_assume_aligned

2011-06-24 Thread Jakub Jelinek
Hi!

This patch introduces a new extension, to hint the compiler
that a pointer is guaranteed to be somehow aligned (or misaligned).
It is designed as a pass-thru builtin which just returns its first
argument, so that it is more obvious where we can assume how it is aligned.
Otherwise it is similar to ICC's __assume_aligned, so for lvalue first
argument ICC's __assume_aligned can be emulated using
#define __assume_aligned(lvalueptr, align) \
  lvalueptr = __builtin_assume_aligned (lvalueptr, align)
ICC doesn't allow side-effects in the arguments of this, GCC does,
so one can e.g. write:
void
foo (std::vector<double> &vec)
{
  double *__restrict data
    = (double *) __builtin_assume_aligned (vec.data (), 16);
...
}
to hint gcc that it can assume the vector has its data () 16 byte aligned
(which is true e.g. on x86_64-linux if using standard malloc based
allocator, which guarantees 2 * sizeof (void*) alignment).  E.g. vectorizer
can use that hint to generate aligned stores/loads instead of unaligned
ones.

Maybe we should have also __builtin_likely_aligned, which would be similar,
just wouldn't guarantee such an alignment, just say it is very likely.  If
vectorizer decided to version a loop, for the fast alternative it could
check the alignment in the versioning condition and assume the likely
aligned alignment in the fast vectorized version and let the unlikely
non-aligned case use slower scalar loop.  But that can be done separately.
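
The versioning idea can be written out by hand in C.  This is a sketch of
what such a transformation might amount to, not what the vectorizer emits;
sum_versioned is a hypothetical name and the 16-byte figure is taken from the
example above.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum N doubles.  When DATA happens to be 16-byte aligned, take a path in
   which the compiler is told so; otherwise use the plain scalar loop.  */
double
sum_versioned (const double *data, size_t n)
{
  double total = 0.0;
  if (((uintptr_t) data & 15) == 0)
    {
      /* Fast version: alignment was checked in the versioning condition.  */
      const double *aligned
        = (const double *) __builtin_assume_aligned (data, 16);
      for (size_t i = 0; i < n; i++)
        total += aligned[i];
    }
  else
    {
      /* Fallback: no alignment assumption.  */
      for (size_t i = 0; i < n; i++)
        total += data[i];
    }
  return total;
}
```

Both paths compute the same sum; only the alignment knowledge available to
the compiler differs.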

The builtin can have either two or three arguments, the second is
alignment and third is misalignment (i.e. that
((uintptr_t) ((char *) firstarg - misalign) & (align - 1)) == 0
).  I've been contemplating to make the builtin overloaded, have return
type be always the type of the first argument if it is pointer/reference
type, like template  T __builtin_assume_aligned (T, size_t, ...);
both in C and C++, but I think it would be too difficult to make it work
that way, so the builtin is instead
void *__builtin_assume_aligned (const void *, size_t, ...);
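
A short C sketch of the two- and three-argument forms with the signature
given above.  The function names are hypothetical, and the caller is
responsible for the stated alignment actually holding.

```c
#include <stddef.h>

/* Two-argument form: P is assumed to be 16-byte aligned.  */
double
sum_aligned (const double *p, size_t n)
{
  const double *q = (const double *) __builtin_assume_aligned (p, 16);
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += q[i];
  return total;
}

/* Three-argument form: (char *) P - 8 is assumed to be 16-byte aligned,
   i.e. P sits 8 bytes past a 16-byte boundary.  */
double
sum_misaligned_by_8 (const double *p, size_t n)
{
  const double *q = (const double *) __builtin_assume_aligned (p, 16, 8);
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += q[i];
  return total;
}
```

Because the builtin returns void *, the result has to be converted back to
the pointer type in use, which is the cost of not making it overloaded.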

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-06-24  Jakub Jelinek  

* builtin-types.def (BT_FN_PTR_CONST_PTR_SIZE_VAR): New.
* builtins.def (BUILT_IN_ASSUME_ALIGNED): New builtin.
* tree-ssa-structalias.c (find_func_aliases_for_builtin_call,
find_func_clobbers): Handle BUILT_IN_ASSUME_ALIGNED.
* tree-ssa-ccp.c (bit_value_assume_aligned): New function.
(evaluate_stmt, execute_fold_all_builtins): Handle
BUILT_IN_ASSUME_ALIGNED.
* tree-ssa-dce.c (propagate_necessity): Likewise.
* tree-ssa-alias.c (ref_maybe_used_by_call_p_1,
call_may_clobber_ref_p_1): Likewise.
* builtins.c (is_simple_builtin, fold_builtin_varargs,
expand_builtin): Likewise.
(expand_builtin_assume_aligned, fold_builtin_assume_aligned):
New functions.
* doc/extend.texi (__builtin_assume_aligned): Document.

* gcc.dg/builtin-assume-aligned-1.c: New test.
* gcc.dg/builtin-assume-aligned-2.c: New test.
* gcc.target/i386/builtin-assume-aligned-1.c: New test.

--- gcc/builtin-types.def.jj2011-06-21 16:45:42.0 +0200
+++ gcc/builtin-types.def   2011-06-23 11:25:03.0 +0200
@@ -454,6 +454,8 @@ DEF_FUNCTION_TYPE_VAR_2 (BT_FN_INT_CONST
 BT_INT, BT_CONST_STRING, BT_CONST_STRING)
 DEF_FUNCTION_TYPE_VAR_2 (BT_FN_INT_INT_CONST_STRING_VAR,
 BT_INT, BT_INT, BT_CONST_STRING)
+DEF_FUNCTION_TYPE_VAR_2 (BT_FN_PTR_CONST_PTR_SIZE_VAR, BT_PTR,
+BT_CONST_PTR, BT_SIZE)
 
 DEF_FUNCTION_TYPE_VAR_3 (BT_FN_INT_STRING_SIZE_CONST_STRING_VAR,
 BT_INT, BT_STRING, BT_SIZE, BT_CONST_STRING)
--- gcc/builtins.def.jj 2011-06-21 16:46:01.0 +0200
+++ gcc/builtins.def2011-06-23 11:25:03.0 +0200
@@ -1,7 +1,7 @@
 /* This file contains the definitions and documentation for the
builtins used in the GNU compiler.
Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
-   2010 Free Software Foundation, Inc.
+   2010, 2011 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -638,6 +638,7 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_EXE
 DEF_EXT_LIB_BUILTIN(BUILT_IN_EXECVE, "execve", BT_FN_INT_CONST_STRING_PTR_CONST_STRING_PTR_CONST_STRING, ATTR_NOTHROW_LIST)
 DEF_LIB_BUILTIN(BUILT_IN_EXIT, "exit", BT_FN_VOID_INT, ATTR_NORETURN_NOTHROW_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_EXPECT, "expect", BT_FN_LONG_LONG_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_ASSUME_ALIGNED, "assume_aligned", BT_FN_PTR_CONST_PTR_SIZE_VAR, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_EXTEND_POINTER, "extend_pointer", BT_FN_UNWINDWORD_PTR, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_EXTRACT_RETURN_ADDR, "extract_return_addr", BT_FN_PTR_PTR, ATTR_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_FFS, "ffs", BT_FN_INT_INT, 
ATTR_CONST

PR 49169: testing the alignment of a function

2011-06-24 Thread Richard Sandiford
This patch fixes PR 49169, where GCC is incorrectly optimising away
a test for whether a function is Thumb rather than ARM.  The patch
was posted by Richard in the PR:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49169

See the PR for a discussion about whether a target hook is better
(or not, IMO).

Tested on arm-linux-gnueabi, where it fixes the attached testcase.
OK to install?

Richard


gcc/
2011-07-24  Richard Guenther  

PR tree-optimization/49169
* fold-const.c (get_pointer_modulus_and_residue): Don't rely on
the alignment of function decls.

gcc/testsuite/
2011-07-24  Michael Hope  
Richard Sandiford  

PR tree-optimization/49169
* gcc.dg/torture/pr49169.c: New test.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c2011-06-22 16:48:38.0 +0100
+++ gcc/fold-const.c2011-06-23 17:50:33.0 +0100
@@ -9216,7 +9216,8 @@ get_pointer_modulus_and_residue (tree ex
   *residue = 0;
 
   code = TREE_CODE (expr);
-  if (code == ADDR_EXPR)
+  if (code == ADDR_EXPR
+  && TREE_CODE (TREE_OPERAND (expr, 0)) != FUNCTION_DECL)
 {
   unsigned int bitalign;
   bitalign = get_object_alignment_1 (TREE_OPERAND (expr, 0), residue);
Index: gcc/testsuite/gcc.dg/torture/pr49169.c
===
--- /dev/null   2011-06-20 08:31:41.268810499 +0100
+++ gcc/testsuite/gcc.dg/torture/pr49169.c  2011-06-23 17:52:24.0 +0100
@@ -0,0 +1,13 @@
+#include <stdlib.h>
+#include <stdint.h>
+
+int
+main (void)
+{
+  void *p = main;
+  if ((intptr_t) p & 1)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "abort" } } */


[PATCH] Use TYPE_NEXT_VARIANT instead of TREE_CHAIN as chain_next for types during GC (PR c++/46400)

2011-06-24 Thread Jakub Jelinek
Hi!

On the huge testcase in the PR we ICE, because GC needs too deep recursion.
I've noticed that part of the problem is deep recursion through
TYPE_NEXT_VARIANT pointers.
As TREE_CHAIN on types is TYPE_STUB_DECL, therefore should point
to a different kind of tree and thus isn't really a pointer through
which long chain of trees are linked together, I think we should
use TYPE_NEXT_VARIANT as such chain instead.
In fact, the C FE already uses it, just for INTEGER_TYPEs only.
This patch changes both C and C++ FEs to use it for all types.

The patch fixes the testcase (while without the patch it needs
roughly 35MB of stack, with the patch 5.25MB is enough) and has been
bootstrapped/regtested on x86_64-linux and i686-linux.  Ok for trunk?
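
The effect of choosing a good chain_next can be modelled in C.  This is a
simplified sketch, not the gengtype-generated marker; struct node and
mark_chain are hypothetical, with NEXT playing the role TYPE_NEXT_VARIANT
plays for types.

```c
#include <stddef.h>

/* Hypothetical node: CHILD is followed by recursion, NEXT is the long chain
   (the role TYPE_NEXT_VARIANT plays for types).  */
struct node
{
  int marked;
  struct node *child;
  struct node *next;
};

/* Mark everything reachable from N and return the deepest recursion level
   used.  NEXT links are walked in a loop, so only CHILD links add depth.  */
size_t
mark_chain (struct node *n, size_t depth)
{
  size_t deepest = depth;
  for (; n != NULL && !n->marked; n = n->next)
    {
      n->marked = 1;
      if (n->child != NULL)
        {
          size_t d = mark_chain (n->child, depth + 1);
          if (d > deepest)
            deepest = d;
        }
    }
  return deepest;
}
```

A chain of a thousand nodes linked through NEXT is marked at constant
recursion depth; had NEXT been followed by recursion instead, the depth would
grow with the length of the chain.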

2011-06-24  Jakub Jelinek  

PR c++/46400
* cp-tree.h (union lang_tree_node): Use TYPE_NEXT_VARIANT
instead of TREE_CHAIN for chain_next for types.

* c-decl.c (union lang_tree_node): Use TYPE_NEXT_VARIANT
instead of TREE_CHAIN for chain_next for types.

--- gcc/cp/cp-tree.h.jj 2011-06-21 16:45:52.0 +0200
+++ gcc/cp/cp-tree.h2011-06-24 12:07:42.0 +0200
@@ -729,7 +729,7 @@ enum cp_tree_node_structure_enum {
 
 /* The resulting tree type.  */
 union GTY((desc ("cp_tree_node_structure (&%h)"),
-   chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL"))) lang_tree_node {
+   chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_TYPE_COMMON) ? ((union lang_tree_node *) TYPE_NEXT_VARIANT (&%h.generic)) : CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL"))) lang_tree_node {
   union tree_node GTY ((tag ("TS_CP_GENERIC"),
desc ("tree_node_structure (&%h)"))) generic;
   struct template_parm_index_s GTY ((tag ("TS_CP_TPI"))) tpi;
--- gcc/c-decl.c.jj 2011-06-17 11:02:19.0 +0200
+++ gcc/c-decl.c2011-06-24 12:13:35.0 +0200
@@ -238,7 +238,7 @@ extern char C_SIZEOF_STRUCT_LANG_IDENTIF
 /* The resulting tree type.  */
 
 union GTY((desc ("TREE_CODE (&%h.generic) == IDENTIFIER_NODE"),
-   chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE ? (union lang_tree_node *) TYPE_NEXT_VARIANT (&%h.generic) : CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL")))  lang_tree_node
+   chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_TYPE_COMMON) ? (union lang_tree_node *) TYPE_NEXT_VARIANT (&%h.generic) : CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL")))  lang_tree_node
  {
   union tree_node GTY ((tag ("0"),
desc ("tree_node_structure (&%h)")))

Jakub


[lra] a patch to fix mesa compilation crash on ppc64

2011-06-24 Thread Vladimir Makarov

The following patch fixes compilation abort of SPECFP2000 mesa on ppc64.

The patch was successfully bootstrapped on x86-64, ppc64, and ia64.

 2011-06-24  Vladimir Makarov 

* lra-constraints.c (lra_constraints): Check subreg for equiv init
insn.


Index: lra-constraints.c
===
--- lra-constraints.c   (revision 175313)
+++ lra-constraints.c   (working copy)
@@ -3121,7 +3121,7 @@ lra_constraints (bool first_p)
   bool changed_p;
   int i, hard_regno, new_insns_num;
   unsigned int min_len;
-  rtx set, x;
+  rtx set, x, dest_reg;
   basic_block last_bb;
 
   lra_constraint_iter++;
@@ -3193,43 +3193,52 @@ lra_constraints (bool first_p)
   new_insns_num++;
   if (NONDEBUG_INSN_P (curr_insn))
{
- if ((set = single_set (curr_insn)) != NULL_RTX
- && x = get_equiv_substitution (SET_DEST (set)))
-   != SET_DEST (set))
-  /* Remove insns which set up a pseudo whose value
- can not be changed.  Such insns might be not in
- init_insns because we don't update equiv data
- during insn transformations.
-
- As an example, let suppose that a pseudo got
- hard register and on the 1st pass was not
- changed to equivalent constant.  We generate an
- additional insn setting up the pseudo because of
- secondary memory movement.  Then the pseudo is
- spilled and we use the equiv constant.  In this
- case we should remove the additional insn and
- this insn is not init_insns list.  */
-  && (! MEM_P (x) || MEM_READONLY_P (x)
-  || in_list_p (curr_insn,
-ira_reg_equiv
-[REGNO (SET_DEST (set))].init_insns)))
- || (SET_SRC (set) != get_equiv_substitution (SET_SRC (set))
- && in_list_p (curr_insn,
-   ira_reg_equiv
-   [REGNO (SET_SRC (set))].init_insns
+ if ((set = single_set (curr_insn)) != NULL_RTX)
{
- /* This is equiv init insn of pseudo which did not get a
-hard register -- remove the insn.  */
- if (lra_dump_file != NULL)
+ dest_reg = SET_DEST (set);
+ /* The equivalence pseudo could be set up as SUBREG in a
+case when it is a call restore insn in a mode
+different from the pseudo mode.  */
+ if (GET_CODE (dest_reg) == SUBREG)
+   dest_reg = SUBREG_REG (dest_reg);
+ if (REG_P (dest_reg)
+ && x = get_equiv_substitution (dest_reg)) != dest_reg)
+  /* Remove insns which set up a pseudo whose
+ value can not be changed.  Such insns might
+ be not in init_insns because we don't update
+ equiv data during insn transformations.
+ 
+ As an example, let suppose that a pseudo got
+ hard register and on the 1st pass was not
+ changed to equivalent constant.  We generate
+ an additional insn setting up the pseudo
+ because of secondary memory movement.  Then
+ the pseudo is spilled and we use the equiv
+ constant.  In this case we should remove the
+ additional insn and this insn is not
+ init_insns list.  */
+  && (! MEM_P (x) || MEM_READONLY_P (x)
+  || in_list_p (curr_insn,
+ira_reg_equiv
+[REGNO (SET_DEST (set))].init_insns)))
+ || (SET_SRC (set) != get_equiv_substitution (SET_SRC 
(set))
+ && in_list_p (curr_insn,
+   ira_reg_equiv
+   [REGNO (SET_SRC (set))].init_insns
{
- fprintf (lra_dump_file,
-  "  Removing equiv init insn %i (freq=%d)\n",
-  INSN_UID (curr_insn),
-  BLOCK_FOR_INSN (curr_insn)->frequency);
- print_rtl_slim (lra_dump_file, curr_insn, curr_insn, -1, 0);
+ /* This is equiv init insn of pseudo which did not get a
+hard register -- remove the insn.  */
+ if (lra_dump_file != NULL)
+   {
+ fprintf (lra_dump_file,
+  "  Removing equiv init insn %i (freq=%d)\n",
+  

Re: [testsuite] ARM tests vfp-ldm*.c and vfp-stm*.c

2011-06-24 Thread Janis Johnson
On 06/24/2011 03:29 AM, Joseph S. Myers wrote:
> On Thu, 23 Jun 2011, Janis Johnson wrote:
> 
>> Tests target/arm/vfp-ldm*.c and vfp-stm*.c add -mfloat-abi=softfp but
>> fail if multilib flags override that option.  This patch skips the test
>> for multilibs that specify a different value for -mfloat-abi.
> 
> While they need to be skipped for -mfloat-abi=soft, I'd think they ought 
> to pass for -mfloat-abi=hard - why do they fail there?
> 

They don't, this would be better:

/* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */

Janis


Re: [testsuite] ARM ivopts tests: skip for no thumb support

2011-06-24 Thread Janis Johnson
On 06/24/2011 03:29 AM, Tom de Vries wrote:
> On 06/24/2011 12:07 AM, Janis Johnson wrote:
>> On 06/23/2011 02:56 PM, Ramana Radhakrishnan wrote:
>>> On 23 June 2011 22:36, Janis Johnson  wrote:
>>>> Tests gcc.target/arm/ivopts*.c add -mthumb but fail on targets without
>>>> thumb support; skip those targets.  The tests save temporary files and
>>>> need to remove them at the end, easily done with cleanup-saved-temps.
>>>>
>>>> Test ivopts-6.c is the only one of the set that does not require thumb2
>>>> support in the check for object-size, and it fails for -march=iwmmxt
>>>> and iwmmxt2; the check should probably be used on that test as well,
>>>> although I haven't included it here.
>>>
>>> I'm not sure I understand the change for ivopts-6.c :
>>>
>>> It's skipping if there is no Thumb support by default but the test
>>> assumes the test will run with  -marm on the command line ?
>>>
>>> Ramana
>>
>> Oops, I got carried away and didn't notice that it uses -marm rather
>> than -mthumb.  I'll take another look at that one.
>>
>> Janis
> 
> How about this patch? I removed all -mthumb/-marm option settings, and instead
> focused on trying to guard the object-size tests properly.
> 
> I introduced 2 new arm-related effective targets to accomplish this.
> - arm_thumb2: Tests if we're compiling for thumb2.
> - arm_nothumb: Tests if we're not compiling for any thumb.
> I don't know how to get the same effect with the existing arm-related
> effective targets.

That looks good to me, and those effective targets will be very useful.

Reviewers, is Tom's patch OK?

Janis


Re: [testsuite] ARM ivopts tests: skip for no thumb support

2011-06-24 Thread Ramana Radhakrishnan
>> I introduced 2 new arm-related effective targets to accomplish this.
>> - arm_thumb2: Tests if we're compiling for thumb2.
>> - arm_nothumb: Tests if we're not compiling for any thumb.
>> I don't know how to get the same effect with the existing arm-related
>> effective targets.
>
> That looks good to me, and those effective targets will be very useful.

How is this different from arm_thumb2_ok and !arm_thumb2_ok ?

If I look at arm_thumb2 that appears to be identical to what
arm_thumb2_ok does.

proc check_effective_target_arm_thumb2_ok { } {
return [check_no_compiler_messages arm_thumb2_ok assembly {
#if !defined(__thumb2__)
#error FOO
#endif
} "-mthumb"]
}

+# Return 1 if this is an ARM target where Thumb-2 is used.
+
+proc check_effective_target_arm_thumb2 { } {
+return [check_no_compiler_messages arm_thumb2 assembly {
+   #if !defined(__thumb2__)
+   #error FOO
+   #endif
+} ""]
+}
+

Or am I missing something ?

Ramana


Re: [testsuite] ARM ivopts tests: skip for no thumb support

2011-06-24 Thread Janis Johnson
On 06/24/2011 08:03 AM, Ramana Radhakrishnan wrote:
>>> I introduced 2 new arm-related effective targets to accomplish this.
>>> - arm_thumb2: Tests if we're compiling for thumb2.
>>> - arm_nothumb: Tests if we're not compiling for any thumb.
>>> I don't know how to get the same effect with the existing arm-related
>>> effective targets.
>>
>> That looks good to me, and those effective targets will be very useful.
> 
> How is this different from arm_thumb2_ok and !arm_thumb2_ok ?
> 
> If I look at arm_thumb2 that appears to be identical to what
> arm_thumb2_ok does.
> 
> proc check_effective_target_arm_thumb2_ok { } {
> return [check_no_compiler_messages arm_thumb2_ok assembly {
> #if !defined(__thumb2__)
> #error FOO
> #endif
> } "-mthumb"]<=== HERE
> }
> 
> +# Return 1 if this is an ARM target where Thumb-2 is used.
> +
> +proc check_effective_target_arm_thumb2 { } {
> +return [check_no_compiler_messages arm_thumb2 assembly {
> + #if !defined(__thumb2__)
> + #error FOO
> + #endif
> +} ""]
> +}
> +
> 
> Or am I missing something ?
> 
> Ramana

arm_thumb_ok and arm_thumb2_ok check whether the target will be as expected
when compiling with -mthumb, and the tests that use them add -mthumb to the
options.  The new ones check whether the target is thumb with the current
multilib options, so they can safely be used for dg-final.

Janis
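
As a rough illustration of what the probes discussed above look at: both kinds of effective-target check compile a small snippet that tests GCC's predefined __thumb__ and __thumb2__ macros, and they differ only in whether -mthumb is added to the options. The sketch below uses the same macros; thumb_state is a hypothetical helper, not part of the testsuite.

```c
/* Report which instruction set the current options select, based on the
   macros GCC predefines for ARM targets.
   Returns 2 for Thumb-2, 1 for Thumb-1, 0 for no Thumb at all
   (ARM state, or a non-ARM target).  */
int
thumb_state (void)
{
#if defined (__thumb2__)
  return 2;
#elif defined (__thumb__)
  return 1;
#else
  return 0;
#endif
}
```

Built with the multilib options alone, this answers the arm_thumb2 / arm_nothumb question; built with -mthumb appended, it answers the arm_thumb2_ok question instead.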



[google] Backport contrib/repro_fail to google/gcc-4_6

2011-06-24 Thread Diego Novillo
I backported contrib/repro_fail to google/gcc-4_6.  The other
google branches will get it when they start tracking trunk.

* repro_fail: New.

diff --git a/contrib/repro_fail b/contrib/repro_fail
new file mode 100755
index 000..8100456
--- /dev/null
+++ b/contrib/repro_fail
@@ -0,0 +1,82 @@
+#!/bin/bash -eu
+#
+# Script to reproduce a test failure from a dejagnu .log file.
+#
+# Contributed by Diego Novillo 
+#
+# Copyright (C) 2011 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING.  If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
+
+# This script will search a line starting with 'spawn' that includes the
+# pattern you are looking for (typically a source file name).
+#
+# Once it finds that pattern, it re-executes the whole command
+# in the spawn line.  If the pattern matches more than one spawn
+# command, it asks which one you want.
+
+if [ $# -lt 2 ] ; then
+echo "usage: $0 pattern file.log [additional-args]"
+echo
+echo "Finds the 'spawn' line matching PATTERN in FILE.LOG and executes"
+echo "the command with any arguments in ADDITIONAL-ARGS."
+echo
+exit 1
+fi
+
+pattern="$1"
+logf="$2"
+shift 2
+
+# Find the commands in LOGF that reference PATTERN.
+lines=$(grep -E "^spawn .*$pattern" $logf | sed -e 's/^spawn //')
+if [ -z "$lines" ] ; then
+echo "Could not find a spawn command for pattern $pattern"
+exit 1
+fi
+
+# Collect all the command lines into the COMMANDS array.
+old_IFS="$IFS"
+IFS="
"
+num_lines=0
+for line in $lines ; do
+num_lines=$[$num_lines + 1]
+echo "[$num_lines] $line"
+commands[$num_lines]=$line
+done
+
+# If we found more than one line for PATTERN, ask which one we should run.
+cmds_to_run='0'
+if [ $num_lines -gt 1 ] ; then
+echo
+echo
+echo -n "Enter the list of commands to run or '0' to run them all: "
+read cmds_to_run
+fi
+if [ "$cmds_to_run" = "0" ] ; then
+cmds_to_run=$(seq 1 $num_lines)
+fi
+IFS="$old_IFS"
+
+# Finally, execute all the commands we were told to execute.
+for cmd_num in $cmds_to_run ; do
+cmd=${commands[$cmd_num]}
+set -x +e
+$cmd "$@"
+set +x -e
+done


Re: PATCH: PR rtl-optimization/49504: Invalid optimization for Pmode != ptr_mode

2011-06-24 Thread H.J. Lu
On Fri, Jun 24, 2011 at 7:17 AM, H.J. Lu  wrote:
> On Fri, Jun 24, 2011 at 7:07 AM, Eric Botcazou  wrote:
>>> I compared x32 glibc built with the old x32 gcc against x32 glibc built
>>> with this patch and
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00913.html
>>>
>>> reverted, the new glibc is little smaller:
>>>
>>> New:
>>>
>>> [hjl@gnu-33 build-x86_64-linux]$ size libc.so
>>>    text          data     bss     dec     hex filename
>>> 1537648         10076   12944 1560668  17d05c libc.so
>>> [hjl@gnu-33 build-x86_64-linux]$
>>>
>>> Old:
>>>
>>> [hjl@gnu-33 build-x86_64-linux.old]$ size libc.so
>>>    text          data     bss     dec     hex filename
>>> 1538968         10076   12944 1561988  17d584 libc.so
>>> [hjl@gnu-33 build-x86_64-linux.old]$
>>>
>>> I looked at assembly codes.  The new one is better.
>>> I will check it in.
>>
>> OK, but remove the equivalent code in num_sign_bit_copies1 then, otherwise
>> someone in a couple of years from now will adapt it to nonzero_bits1 and we
>> will be back to square one.
>>
>
> I am testing this patch on x32 branch.  I will compare glibc binaries
> before and after.
>

There are no code differences in glibc.  I will check it in.

Thanks.

H.J.
> --
> H.J.
> ---
> diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
> index e5c045d..0be6504 100644
> --- a/gcc/rtlanal.c
> +++ b/gcc/rtlanal.c
> @@ -4605,21 +4605,6 @@ num_sign_bit_copies1 (const_rtx x, enum
> machine_mode mode, const_rtx known_x,
>                                         known_x, known_mode, known_ret);
>       result = MAX (1, MIN (num0, num1) - 1);
>
> -#ifdef POINTERS_EXTEND_UNSIGNED
> -      /* If pointers extend signed and this is an addition or subtraction
> -        to a pointer in Pmode, all the bits above ptr_mode are known to be
> -        sign bit copies.  */
> -      /* As we do not know which address space the pointer is refering to,
> -        we can do this only if the target does not support different pointer
> -        or address modes depending on the address space.  */
> -      if (target_default_pointer_address_modes_p ()
> -         && ! POINTERS_EXTEND_UNSIGNED && GET_MODE (x) == Pmode
> -         && (code == PLUS || code == MINUS)
> -         && REG_P (XEXP (x, 0)) && REG_POINTER (XEXP (x, 0)))
> -       result = MAX ((int) (GET_MODE_BITSIZE (Pmode)
> -                            - GET_MODE_BITSIZE (ptr_mode) + 1),
> -                     result);
> -#endif
>       return result;
>
>     case MULT:
>



-- 
H.J.


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-24 Thread Richard Guenther
On Fri, Jun 24, 2011 at 3:46 PM, Stubbs, Andrew
 wrote:
> On 24/06/11 09:28, Richard Guenther wrote:
>>> >  To be clear, it only skips past NOP_EXPR. Is it not the case that what
>>> >  you're describing would need a CONVERT_EXPR?
>> NOP_EXPR is the same as CONVERT_EXPR.
>
> Are you sure?

Yes, definitely.  They are synonyms of each other (an unfinished merging
process), the usual check for them is via CONVERT_EXPR_P.

> I thought this was safe because the internals manual says:
>
>   NOP_EXPR
>   These nodes are used to represent conversions that do not require any
>   code-generation 
>
>   CONVERT_EXPR
>   These nodes are similar to NOP_EXPRs, but are used in those
>   situations where code may need to be generated 

Which is wrong (sorry).

> So, I tried this example:
>
> int
> foo (int a, short b, short c)
> {
>   int bc = b * c;
>   return a + (short)bc;
> }
>
> Both before and after my patch, GCC gives:
>
>         mul     r2, r1, r2
>         sxtah   r0, r0, r2
>
> (where, SXTAH means sign-extend the third operand from HImode to SImode
> and add to the second operand.)
>
> The dump after the widening_mult pass is:
>
> foo (int a, short int b, short int c)
> {
>   int bc;
>   int D.2018;
>   short int D.2017;
>   int D.2016;
>   int D.2015;
>   int D.2014;
>
> <bb 2>:
>   D.2014_2 = (int) b_1(D);
>   D.2015_4 = (int) c_3(D);
>   bc_5 = b_1(D) w* c_3(D);
>   D.2017_6 = (short int) bc_5;
>   D.2018_7 = (int) D.2017_6;
>   D.2016_9 = D.2018_7 + a_8(D);
>   return D.2016_9;
>
> }
>
> Where you can clearly see that the addition has not been recognised as a
> multiply-and-accumulate.
>
> When I step through convert_plusminus_to_widen, I can see that the
> reason it has not matched is because "D.2017_6 = (short int) bc_5" is
> encoded with a CONVERT_EXPR, just as the manual said it would be.

A NOP_EXPR in this place would be valid as well.  The merging hasn't
been completed and at least the C frontend still generates CONVERT_EXPRs
in some cases.

> So, according to the manual, and my (admittedly limited) experiments,
> skipping over NOP_EXPR does appear to be safe.
>
> But you say that it isn't safe. So now I'm confused. :(
>
> I can certainly add checks to make sure that the skipped operations
> actually don't make any important changes to the value, but do I need to?

Yes.

Thanks,
Richard.

> Andrew
>
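
A small sketch of why the narrowing step cannot simply be skipped, using the same source shape as the example quoted above. mac_widening and mac_narrowed are names made up here. The out-of-range int-to-short conversion is implementation-defined in ISO C; GCC documents it as reduction modulo 2^N, and the sketch assumes a 16-bit short and a 32-bit int.

```c
/* What a widening multiply-and-accumulate computes: the full int
   product of the two shorts is added to A.  */
int
mac_widening (int a, short b, short c)
{
  return a + b * c;
}

/* What the source actually asks for: the product is narrowed to short
   before the addition, so its high bits are discarded first.  */
int
mac_narrowed (int a, short b, short c)
{
  int bc = b * c;
  return a + (short) bc;
}
```

For small inputs such as (1, 3, 4) both functions return 13. For (1, 300, 300) the product 90000 does not fit in a short, so mac_narrowed no longer agrees with mac_widening; a pass that looked through the (short) conversion would change the result.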


Re: PR 49169: testing the alignment of a function

2011-06-24 Thread Richard Guenther
On Fri, Jun 24, 2011 at 4:42 PM, Richard Sandiford
 wrote:
> This patch fixes PR 49169, where GCC is incorrectly optimising away
> a test for whether a function is Thumb rather than ARM.  The patch
> was posted by Richard in the PR:
>
>    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49169
>
> See the PR for a discussion about whether a target hook is better
> (or not, IMO).
>
> Tested on arm-linux-gnueabi, where it fixes the attached testcase.
> OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> gcc/
> 2011-07-24  Richard Guenther  
>
>        PR tree-optimization/49169
>        * fold-const.c (get_pointer_modulus_and_residue): Don't rely on
>        the alignment of function decls.
>
> gcc/testsuite/
> 2011-07-24  Michael Hope  
>            Richard Sandiford  
>
>        PR tree-optimization/49169
>        * gcc.dg/torture/pr49169.c: New test.
>
> Index: gcc/fold-const.c
> ===
> --- gcc/fold-const.c    2011-06-22 16:48:38.0 +0100
> +++ gcc/fold-const.c    2011-06-23 17:50:33.0 +0100
> @@ -9216,7 +9216,8 @@ get_pointer_modulus_and_residue (tree ex
>   *residue = 0;
>
>   code = TREE_CODE (expr);
> -  if (code == ADDR_EXPR)
> +  if (code == ADDR_EXPR
> +      && TREE_CODE (TREE_OPERAND (expr, 0)) != FUNCTION_DECL)
>     {
>       unsigned int bitalign;
>       bitalign = get_object_alignment_1 (TREE_OPERAND (expr, 0), residue);
> Index: gcc/testsuite/gcc.dg/torture/pr49169.c
> ===
> --- /dev/null   2011-06-20 08:31:41.268810499 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr49169.c      2011-06-23 17:52:24.0 +0100
> @@ -0,0 +1,13 @@
> +#include <stdlib.h>
> +#include <stdint.h>
> +
> +int
> +main (void)
> +{
> +  void *p = main;
> +  if ((intptr_t) p & 1)
> +    abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler "abort" } } */
>


[PATCH, MELT] correct meltgc_read_from_val without location

2011-06-24 Thread Pierre Vittet

Hello,

The function meltgc_read_from_val (in melt-runtime.c) takes two
arguments: a string value and a location.
The comments say that a NULL pointer may be passed when there is no
location (that is, when reading a direct string). However, this causes
MELT to crash, because the absence of a file is not handled correctly.


This patch corrects that: if there is no file, it creates a "virtual"
one named "stringBuffer".


Pierre Vittet
Index: gcc/melt-runtime.c
===
--- gcc/melt-runtime.c  (revision 175348)
+++ gcc/melt-runtime.c  (working copy)
@@ -6326,7 +6326,7 @@ melt_linemap_compute_current_location (struct read
 {
   int colnum = 1;
   int cix = 0;
-  if (!rd || !rd->rcurlin) 
+  if (!rd || !rd->rcurlin || !rd->rpfilnam)
 return;
   for (cix=0; cix<rd->rcol; cix++) {
 char c = rd->rcurlin[cix];
@@ -6702,13 +6702,22 @@ static melt_ptr_t
 makesexpr (struct reading_st *rd, int lineno, melt_ptr_t contents_p,
   location_t loc, bool ismacrostring)
 {
-  MELT_ENTERFRAME (4, NULL);
+  MELT_ENTERFRAME (5, NULL);
 #define sexprv  meltfram__.mcfr_varptr[0]
 #define contsv   meltfram__.mcfr_varptr[1]
 #define locmixv meltfram__.mcfr_varptr[2]
 #define sexpclassv meltfram__.mcfr_varptr[3]
+#define locnamv meltfram__.mcfr_varptr[4]
   contsv = contents_p;
   gcc_assert (melt_magic_discr ((melt_ptr_t) contsv) == MELTOBMAG_LIST);
+  /* If there is no filename associated, we create a false one, named
+"stringBuffer".  */
+  if(rd->rpfilnam == NULL)
+{
+  locnamv = meltgc_new_string ((meltobject_ptr_t) 
MELT_PREDEF(DISCR_STRING),
+   "stringBuffer");
+  rd->rpfilnam = (melt_ptr_t *) &locnamv;
+}
   if (loc == 0)
 locmixv = meltgc_new_mixint ((meltobject_ptr_t) MELT_PREDEF 
(DISCR_MIXED_INTEGER),
*rd->rpfilnam, (long) lineno);
@@ -6728,6 +6737,7 @@ makesexpr (struct reading_st *rd, int lineno, melt
   meltgc_touch (sexprv);
   MELT_EXITFRAME ();
   return (melt_ptr_t) sexprv;
+#undef locnamv
 #undef sexprv
 #undef contsv
 #undef locmixv
@@ -8414,6 +8424,7 @@ meltgc_read_from_val (melt_ptr_t strv_p, melt_ptr_
   strv = strv_p;
   locnamv = locnam_p;
   rbuf = 0;
+  seqv = meltgc_new_list ((meltobject_ptr_t) MELT_PREDEF (DISCR_LIST));
   strmagic = melt_magic_discr ((melt_ptr_t) strv);
   switch (strmagic)
 {
@@ -8442,7 +8453,10 @@ meltgc_read_from_val (melt_ptr_t strv_p, melt_ptr_
   rds.rlineno = 0;
   rds.rcurlin = rbuf;
   rd = &rds;
-  rds.rpfilnam = (melt_ptr_t *) & locnamv;
+  if(locnamv == NULL)
+rds.rpfilnam = NULL;
+  else
+rds.rpfilnam = (melt_ptr_t *) & locnamv;
   while (rdcurc ())
 {
   bool got = FALSE;
2011-06-24  Pierre Vittet  

* melt-runtime.c (melt_linemap_compute_current_location, makesexpr,
meltgc_read_from_val): Handle the case of reading a string sexp without
given location.


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-24 Thread Stubbs, Andrew
On 24/06/11 16:47, Richard Guenther wrote:
>> >  I can certainly add checks to make sure that the skipped operations
>> >  actually don't make any important changes to the value, but do I need to?
> Yes.

Ok, I'll go away and do that then.

BTW, I see useless_type_conversion_p, but that's not quite what I want. 
Is there an equivalent existing function to determine whether a 
conversion changes the logical/arithmetic meaning of a type?

I mean, conversion to a wider mode is not "useless", but it is harmless, 
whereas conversion to a narrower mode may truncate the value.

Andrew
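
One way to picture the "harmless versus truncating" distinction asked about above, as a sketch only: struct int_type and conversion_preserves_value are hypothetical and are not an existing GCC function. The sketch models an integer type by its precision and signedness and asks whether every value of the source type survives the conversion unchanged.

```c
#include <stdbool.h>

/* Hypothetical model of an integer type.  */
struct int_type
{
  unsigned precision;	/* number of value bits, including the sign bit */
  bool is_unsigned;
};

/* Return true if converting FROM to TO keeps every value intact.  */
bool
conversion_preserves_value (struct int_type from, struct int_type to)
{
  /* Same signedness: widening or same width is safe, narrowing may
     truncate.  */
  if (from.is_unsigned == to.is_unsigned)
    return to.precision >= from.precision;

  /* Unsigned to signed: the target needs one extra bit for the sign.  */
  if (from.is_unsigned)
    return to.precision > from.precision;

  /* Signed to unsigned: negative values never survive.  */
  return false;
}
```

Under this model short to int is harmless, int to short is not, and unsigned short to int is harmless while unsigned int to int is not.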


Re: [pph] Fix binding_level's names_size streaming (issue4634071)

2011-06-24 Thread Gabriel Charette
This was committed to trunk. Diego, can you commit this patch to pph as well?

Thanks,
Gab

On Thu, Jun 23, 2011 at 4:07 AM, Diego Novillo  wrote:
> On Wed, Jun 22, 2011 at 20:17, Gabriel Dos Reis
>  wrote:
>> On Wed, Jun 22, 2011 at 7:05 PM, Gabriel Charette  wrote:
>>> And it looks like this wasn't sent to anyone directly...
>>> Adding back dnovillo and crowl (Diego I don't think Jason was ever
>>> added to the original message...?)
>>
>> should not this go to mainline too?
>
> Yes, I CC'd Jason in the original thread that started this discussion.
>  Gab, could you send a patch for trunk?  Please CC Jason when you do.
>
>
> Diego.
>


[lra] patch to fix crafty compilation on ppc64

2011-06-24 Thread Vladimir Makarov

The following patch fixes compilation abort of SPEC2000 crafty on ppc64.

The patch was successfully bootstrapped on x86-64, ppc64, and ia64.

2011-06-24  Vladimir Makarov 

* lra-constraints.c (curr_insn_transform): Process operator
duplications.

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 175381)
+++ lra-constraints.c   (working copy)
@@ -3051,8 +3051,22 @@ curr_insn_transform (void)
   if (before != NULL_RTX || after != NULL_RTX || max_regno_before != 
max_reg_num ())
 change_p = true;
   if (change_p)
-/* Something changes -- process the insn.  */
-lra_update_insn_regno_info (curr_insn);
+{
+  /* Process operator duplications.  We do it here to guarantee
+their processing after operands processing.  Generally
+speaking, we could do this probably in the previous loop
+because a common practice is to enumerate the operators after
+their operands.  */
+  for (i = 0; i < n_dups; i++)
+   {
+ int ndup = curr_static_id->dup_num[i];
+
+ if (curr_static_id->operand[ndup].is_operator)
+   *curr_id->dup_loc[i] = *curr_id->operand_loc[ndup];
+   }
+  /* Something changes -- process the insn.  */
+  lra_update_insn_regno_info (curr_insn);
+}
   lra_process_new_insns (curr_insn, before, after, "Inserting insn reload");
   return change_p;
 }


Re: [PATCH] Use TYPE_NEXT_VARIANT instead of TREE_CHAIN as chain_next for types during GC (PR c++/46400)

2011-06-24 Thread Joseph S. Myers
On Fri, 24 Jun 2011, Jakub Jelinek wrote:

>   * c-decl.c (union lang_tree_node): Use TYPE_NEXT_VARIANT
>   instead of TREE_CHAIN for chain_next for types.

The C changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [pph] Fix binding_level's names_size streaming (issue4634071)

2011-06-24 Thread dnovillo

On 2011/06/24 17:35:51, Gabriel Charette wrote:

This was committed to trunk. Diego can you commit this patch to pph as

well?

Done.  r175387.


Diego.

http://codereview.appspot.com/4634071/


Re: [patch, 4.6/4.7] fix installation of plugin header files

2011-06-24 Thread Matthias Klose
On 06/20/2011 05:18 PM, Joseph S. Myers wrote:
> On Mon, 20 Jun 2011, Matthias Klose wrote:
> 
>>  - PR45078; vxworks-dummy.h is included for cpu_type in arm,
>>i386, mips, sh and sparc but only installed when it's i386; copy it
>>manually anytime.
> 
> I don't think you should list particular config/ headers in PLUGIN_HEADERS 
> in Makefile.in; provide a way for targets to specify their additions to 
> this list in config.gcc instead.  Is the issue headers that are directly 
> #included from tm.h headers (for whatever reason) rather than listed in 
> tm_file?  (Some of those #includes may be avoidable, but the .def ones 
> probably do need listing explicitly.)
> 
> The aim should be to get the extra files in tm_file_list, which is 
> included in PLUGIN_HEADERS, so that they appear in $(TM_H) dependencies as 
> well.

updated patch attached.

  Matthias

2011-06-24  Matthias Klose  

	PR plugin/45078
	* Makefile.in (PLUGIN_HEADERS): Add config/arm/arm-cores.def.
	(install-plugin): Install c-family headers into a c-family subdir.
	* config.gcc: Add vxworks-dummy.h to tm_file for arm, mips, sh and
	sparc targets.

--- gcc/Makefile.in
+++ gcc/Makefile.in
@@ -4503,6 +4503,7 @@
   $(EXCEPT_H) tree-ssa-sccvn.h real.h output.h $(IPA_UTILS_H) \
   $(C_PRAGMA_H)  $(CPPLIB_H)  $(FUNCTION_H) \
   cppdefault.h flags.h $(MD5_H) params.def params.h prefix.h tree-inline.h \
+  config/arm/arm-cores.def \
   $(IPA_PROP_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) version.h
 
 # generate the 'build fragment' b-header-vars
@@ -4527,7 +4528,7 @@
 	  else continue; \
 	  fi; \
 	  case $$path in \
-	  "$(srcdir)"/config/* | "$(srcdir)"/*.def ) \
+	  "$(srcdir)"/config/* | "$(srcdir)"/c-family/* | "$(srcdir)"/*.def ) \
 	base=`echo "$$path" | sed -e "s|$$srcdirstrip/||"`;; \
 	  *) base=`basename $$path` ;; \
 	  esac; \
--- gcc/config.gcc
+++ gcc/config.gcc
@@ -467,6 +467,9 @@
 	fi
 	tm_file="vxworks-dummy.h ${tm_file}"
 	;;
+arm*-*-*|mips*-*-*|sh*-*-*|sparc*-*-*)
+	tm_file="vxworks-dummy.h ${tm_file}"
+	;;
 esac
 
 # On a.out targets, we need to use collect2.


Re: [patch, 4.6/4.7] fix installation of plugin header files

2011-06-24 Thread Joseph S. Myers
On Fri, 24 Jun 2011, Matthias Klose wrote:

> On 06/20/2011 05:18 PM, Joseph S. Myers wrote:
> > On Mon, 20 Jun 2011, Matthias Klose wrote:
> > 
> >>  - PR45078; vxworks-dummy.h is included for cpu_type in arm,
> >>i386, mips, sh and sparc but only installed when it's i386; copy it
> >>manually anytime.
> > 
> > I don't think you should list particular config/ headers in PLUGIN_HEADERS 
> > in Makefile.in; provide a way for targets to specify their additions to 
> > this list in config.gcc instead.  Is the issue headers that are directly 
> > #included from tm.h headers (for whatever reason) rather than listed in 
> > tm_file?  (Some of those #includes may be avoidable, but the .def ones 
> > probably do need listing explicitly.)
> > 
> > The aim should be to get the extra files in tm_file_list, which is 
> > included in PLUGIN_HEADERS, so that they appear in $(TM_H) dependencies as 
> > well.
> 
> updated patch attached.

That doesn't sufficiently address the issues I pointed out.

* Listing arm-cores.def in Makefile.in is still wrong.

* If you add a header to tm_file (which needs a more detailed analysis of 
why including it there in the list of headers is safe for all targets 
affected) then you should also remove the #include directives that 
directly include it from other headers.

* There are other files included in tm.h headers that this patch is silent 
on.

I believe you don't need to do anything about headers listed in 
HeaderInclude in a .opt file that are also explicitly #included.  Apart 
from those, all #include directives in tm.h headers should be 
investigated.  If they can be replaced by entries in tm_file, by all means 
do so, but if not, then *don't* add them explicitly to Makefile.in, 
provide a way for them to get into tm_file_list in the Makefile without 
them getting into tm_include_list there (which may mean a new config.gcc 
variable).  This new mechanism is where arm-cores.def and other such 
headers should be listed - not directly in Makefile.in.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: C++ PATCH for c++/35255 (address of template-id)

2011-06-24 Thread Jason Merrill

On 06/24/2011 08:33 AM, Diego Novillo wrote:

Patch missing.


Oops.
commit 124387ceea38a3c0204c9f91d17bbfa68063d42e
Author: Jason Merrill 
Date:   Wed Jun 22 16:07:10 2011 -0400

	PR c++/35255
	* pt.c (resolve_overloaded_unification): Fix DR 115 handling.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 08ce5af..b3dd85f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14524,6 +14524,7 @@ resolve_overloaded_unification (tree tparms,
 	 the affected templates before we try to unify, in case the
 	 explicit args will completely resolve the templates in question.  */
 
+  int ok = 0;
   tree expl_subargs = TREE_OPERAND (arg, 1);
   arg = TREE_OPERAND (arg, 0);
 
@@ -14538,7 +14539,7 @@ resolve_overloaded_unification (tree tparms,
 	  ++processing_template_decl;
 	  subargs = get_bindings (fn, DECL_TEMPLATE_RESULT (fn),
   expl_subargs, /*check_ret=*/false);
-	  if (subargs)
+	  if (subargs && !any_dependent_template_arguments_p (subargs))
 	{
 	  elem = tsubst (TREE_TYPE (fn), subargs, tf_none, NULL_TREE);
 	  if (try_one_overload (tparms, targs, tempargs, parm,
@@ -14549,8 +14550,16 @@ resolve_overloaded_unification (tree tparms,
 		  ++good;
 		}
 	}
+	  else if (subargs)
+	++ok;
 	  --processing_template_decl;
 	}
+  /* If no templates (or more than one) are fully resolved by the
+	 explicit arguments, this template-id is a non-deduced context; it
+	 could still be OK if we deduce all template arguments for the
+	 enclosing call through other arguments.  */
+  if (good != 1)
+	good = ok;
 }
   else if (TREE_CODE (arg) != OVERLOAD
 	   && TREE_CODE (arg) != FUNCTION_DECL)
diff --git a/gcc/testsuite/g++.dg/template/partial10.C b/gcc/testsuite/g++.dg/template/partial10.C
new file mode 100644
index 000..53a48fb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/partial10.C
@@ -0,0 +1,18 @@
+// PR c++/35255, DR 115
+// { dg-do link }
+
+// 14.8.1: In contexts where deduction is done and fails, or in contexts
+// where deduction is not done, if a template argument list is specified
+// and it, along with any default template arguments, identifies a single
+// function template specialization, then the template-id is an lvalue for
+// the function template specialization.
+
+template <class Fn> void def(Fn fn) {}
+
+template <class T1, class T2> T2 fn(T1, T2);
+template <class T1> int fn(T1) { }
+
+int main()
+{
+  def(fn<int>);
+}
diff --git a/gcc/testsuite/g++.dg/template/partial11.C b/gcc/testsuite/g++.dg/template/partial11.C
new file mode 100644
index 000..b5ceaa8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/partial11.C
@@ -0,0 +1,24 @@
+// DR 115
+
+// 14.8.1: In contexts where deduction is done and fails, or in contexts
+// where deduction is not done, if a template argument list is specified
+// and it, along with any default template arguments, identifies a single
+// function template specialization, then the template-id is an lvalue for
+// the function template specialization.
+
+// Here, deduction is not done to resolve fn because the target type
+// is a template parameter, so we resolve to the second template, and then
+// the call to def fails because we deduce different values of Fn for the
+// two function arguments.
+
+template <class Fn> void def(Fn fn, Fn fn2);
+
+template <class T1, class T2> T2 fn(T1, T2);
+template <class T1> int fn(T1);
+
+int f(int,int);
+
+int main()
+{
+  def(fn<int>,f);		// { dg-error "" }
+}


Re: [PATCH] Use TYPE_NEXT_VARIANT instead of TREE_CHAIN as chain_next for types during GC (PR c++/46400)

2011-06-24 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Use TYPE_NEXT_VARIANT instead of TREE_CHAIN as chain_next for types during GC (PR c++/46400)

2011-06-24 Thread Richard Henderson
On 06/24/2011 07:43 AM, Jakub Jelinek wrote:
> +   chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), 
> TS_TYPE_COMMON) ? ((union lang_tree_node *) TYPE_NEXT_VARIANT (&%h.generic)) 
> : CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_COMMON) ? ((union 
> lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL"))) lang_tree_node {

Is it possible to break this out into an inline (or, I suppose, out-of-line)
function?  This is getting fairly unreadable...
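
As a rough illustration of the suggestion, the selection logic in that
chain_next string could be expressed as a small helper. The sketch below is
only a toy model: toy_node, toy_kind and toy_chain_next are hypothetical
names, and the kind field stands in for the CODE_CONTAINS_STRUCT tests rather
than using GCC's real tree accessors.

```c
#include <stddef.h>

/* Hypothetical node kinds standing in for the CODE_CONTAINS_STRUCT
   tests on TS_TYPE_COMMON and TS_COMMON.  */
enum toy_kind { TOY_KIND_TYPE, TOY_KIND_COMMON, TOY_KIND_OTHER };

/* Toy stand-in for a tree node; not GCC's lang_tree_node.  */
struct toy_node
{
  enum toy_kind kind;
  struct toy_node *next_variant;  /* plays the role of TYPE_NEXT_VARIANT */
  struct toy_node *chain;         /* plays the role of TREE_CHAIN */
};

/* Return the node the collector should visit after NODE: the next
   variant for types, the ordinary chain for other chained nodes, and
   NULL for nodes that have no chain at all.  */
struct toy_node *
toy_chain_next (struct toy_node *node)
{
  switch (node->kind)
    {
    case TOY_KIND_TYPE:
      return node->next_variant;
    case TOY_KIND_COMMON:
      return node->chain;
    default:
      return NULL;
    }
}
```

Whether gengtype would accept a function call in place of the inline
expression is not something this sketch tries to answer.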


r~


[pph] Stream chain of struct fields (issue4631072)

2011-06-24 Thread Gabriel Charette
We were only streaming the first field of every struct. Struct fields have a
chain link to the next field, so we also need to stream the DECL_CHAIN of
every field, recursively.
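
To make the idea concrete, here is a minimal sketch of streaming a chain
recursively. It is a toy model, not the pph streamer: toy_decl, toy_out_decl
and toy_in_decl are hypothetical names, and a plain int buffer stands in for
the stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a field declaration; not GCC's tree type.  */
struct toy_decl
{
  int id;                  /* payload identifying the field */
  struct toy_decl *chain;  /* next field, playing the role of DECL_CHAIN */
};

/* Write DECL and, recursively, everything reachable through its chain.
   A 0 marks "no node"; a 1 is followed by the node's id.  Returns the
   number of ints written to OUT, which must be large enough.  */
size_t
toy_out_decl (const struct toy_decl *decl, int *out)
{
  if (decl == NULL)
    {
      out[0] = 0;
      return 1;
    }
  out[0] = 1;
  out[1] = decl->id;
  return 2 + toy_out_decl (decl->chain, out + 2);
}

/* Rebuild a chain from IN, advancing *POS past the ints consumed.  */
struct toy_decl *
toy_in_decl (const int *in, size_t *pos)
{
  struct toy_decl *decl;

  if (in[(*pos)++] == 0)
    return NULL;
  decl = malloc (sizeof *decl);
  assert (decl != NULL);
  decl->id = in[(*pos)++];
  decl->chain = toy_in_decl (in, pos);
  return decl;
}
```

Writing only the first node and stopping would lose every later field, which
is the symptom described above; following the chain on both the write and the
read side keeps the two in step.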

I limited this to VAR_DECL and FUNCTION_DECL for now (which fixes all of our
current bugs with regard to struct fields). Are there any other DECLs
that can potentially be fields of a struct?

I was tempted to stream the DECL_CHAIN for the whole "group" VAR_DECL belongs
to in pph_read_tree, but as it turns out we don't want to do this for
NAMESPACE_DECL: it gets a new chain link when re-added to the bindings early
on, and changing it here causes an ICE in one of the test cases.
Hence, I reverted to the conservative approach of only streaming it for what
we know we need it for.

Syntax-wise: Is it OK to play this 'case' fall-through trick with VAR_DECL, or
should I make a separate case entry with its own break?
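
For readers unfamiliar with the trick in question, the sketch below shows the
same shape in isolation: one case does an extra step and then deliberately
falls into the shared handling. All names here (toy_code, toy_steps and the
TOY_STEP_* flags) are made up for illustration, and the comment marking the
fall-through is one common way to signal intent, not a statement about what
the pph branch requires.

```c
/* Hypothetical tree codes; only the control flow matters here.  */
enum toy_code { TOY_VAR_DECL, TOY_FIELD_DECL, TOY_TYPE_DECL };

/* Flags recording which steps were taken.  */
enum { TOY_STEP_CHAIN = 1, TOY_STEP_INITIAL = 2 };

/* Return the set of steps performed for CODE.  TOY_VAR_DECL does the
   chain step and then shares the TOY_FIELD_DECL handling.  */
int
toy_steps (enum toy_code code)
{
  int steps = 0;

  switch (code)
    {
    case TOY_VAR_DECL:
      steps |= TOY_STEP_CHAIN;
      /* Fall through.  */
    case TOY_FIELD_DECL:
      steps |= TOY_STEP_INITIAL;
      break;

    default:
      break;
    }
  return steps;
}
```

A separate case with its own break would duplicate the shared statements; the
fall-through keeps them in one place at the cost of relying on case order.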

2011-06-24  Gabriel Charette  

* gcc/cp/pph-streamer-in.c (pph_read_tree):
Stream in DECL_CHAIN of VAR_DECL.
Stream in DECL_CHAIN of FUNCTION_DECL.
* gcc/cp/pph-streamer-out.c (pph_write_tree):
Stream out DECL_CHAIN of VAR_DECL.
Stream out DECL_CHAIN of FUNCTION_DECL.
* gcc/testsuite/g++.dg/pph/x1functions.cc: Fixed bogus, now asm diff.
* gcc/testsuite/g++.dg/pph/x1variables.cc: Fixed bogus, now asm diff.

diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 0e6763f..8d12839 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1216,12 +1216,13 @@ pph_read_tree (struct lto_input_block *ib ATTRIBUTE_UNUSED,
   DECL_INITIAL (expr) = pph_in_tree (stream);
   break;
 
+case VAR_DECL:
+  DECL_CHAIN (expr) = pph_in_tree (stream);
 case CONST_DECL:
 case FIELD_DECL:
 case NAMESPACE_DECL:
 case PARM_DECL:
 case USING_DECL:
-case VAR_DECL:
   /* FIXME pph: Should we merge DECL_INITIAL into lang_specific? */
   DECL_INITIAL (expr) = pph_in_tree (stream);
   pph_in_lang_specific (stream, expr);
@@ -1232,6 +1233,7 @@ pph_read_tree (struct lto_input_block *ib ATTRIBUTE_UNUSED,
   pph_in_lang_specific (stream, expr);
   DECL_SAVED_TREE (expr) = pph_in_tree (stream);
   DECL_STRUCT_FUNCTION (expr) = pph_in_struct_function (stream);
+  DECL_CHAIN (expr) = pph_in_tree (stream);
   break;
 
 case TYPE_DECL:
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index 613cdcd..e85a629 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -1062,12 +1062,13 @@ pph_write_tree (struct output_block *ob, tree expr, bool ref_p)
   pph_out_tree_or_ref_1 (stream, DECL_INITIAL (expr), ref_p, 3);
   break;
 
+case VAR_DECL:
+  pph_out_tree_or_ref_1 (stream, DECL_CHAIN (expr), ref_p, 3);
 case CONST_DECL:
 case FIELD_DECL:
 case NAMESPACE_DECL:
 case PARM_DECL:
 case USING_DECL:
-case VAR_DECL:
   /* FIXME pph: Should we merge DECL_INITIAL into lang_specific? */
   pph_out_tree_or_ref_1 (stream, DECL_INITIAL (expr), ref_p, 3);
   pph_out_lang_specific (stream, expr, ref_p);
@@ -1078,6 +1079,7 @@ pph_write_tree (struct output_block *ob, tree expr, bool ref_p)
   pph_out_lang_specific (stream, expr, ref_p);
   pph_out_tree_or_ref_1 (stream, DECL_SAVED_TREE (expr), ref_p, 3);
   pph_out_struct_function (stream, DECL_STRUCT_FUNCTION (expr), ref_p);
+  pph_out_tree_or_ref_1 (stream, DECL_CHAIN (expr), ref_p, 3);
   break;
 
 case TYPE_DECL:
diff --git a/gcc/testsuite/g++.dg/pph/x1functions.cc b/gcc/testsuite/g++.dg/pph/x1functions.cc
index 20cde5c..78df01b 100644
--- a/gcc/testsuite/g++.dg/pph/x1functions.cc
+++ b/gcc/testsuite/g++.dg/pph/x1functions.cc
@@ -1,5 +1,4 @@
-// { dg-xfail-if "BOGUS" { "*-*-*" } { "-fpph-map=pph.map" } }
-// { dg-bogus "'mbr_decl_inline' was not declared in this scope" "" { xfail *-*-* } 0 }
+// pph asm xdiff
 
 #include "x1functions.h"
 
diff --git a/gcc/testsuite/g++.dg/pph/x1variables.cc b/gcc/testsuite/g++.dg/pph/x1variables.cc
index bac3136..0f0814f 100644
--- a/gcc/testsuite/g++.dg/pph/x1variables.cc
+++ b/gcc/testsuite/g++.dg/pph/x1variables.cc
@@ -1,6 +1,4 @@
-// { dg-xfail-if "BOGUS" { "*-*-*" } { "-fpph-map=pph.map" } }
-// { dg-bogus "c1variables.h:5:8: error: 'int D::mbr_uninit_plain' is not a static member of 'struct D'" "" { xfail *-*-* } 0 }
-// { dg-bogus "c1variables.h:6:14: error: 'const int D::mbr_init_const' is not a static member of 'struct D'" "" { xfail *-*-* } 0 }
+// pph asm xdiff
 
 #include "x1variables.h"
 

--
This patch is available for review at http://codereview.appspot.com/4631072


Re: [pph] Stream chain of struct fields (issue4631072)

2011-06-24 Thread gchare

Fixes two pph BOGUS bugs which now result in an asm diff.

Tested with bootstrap build and pph regression testing.

http://codereview.appspot.com/4631072/


RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-24 Thread Fang, Changpeng
Hi,

I have no preference about how the tune feature is coded, but I agree with you
that it's better to put similar things together. I modified the code following
your suggestion.

Is it OK to commit this modified patch?

Thanks,

Changpeng




From: Jan Hubicka [hubi...@ucw.cz]
Sent: Thursday, June 23, 2011 6:20 PM
To: Fang, Changpeng
Cc: Uros Bizjak; gcc-patches@gcc.gnu.org; hubi...@ucw.cz; rguent...@suse.de
Subject: Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

Hi,
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2128,6 +2128,9 @@ static const unsigned int 
> x86_avx256_split_unaligned_load
>  static const unsigned int x86_avx256_split_unaligned_store
>= m_COREI7 | m_BDVER1 | m_GENERIC;
>
> +static const unsigned int x86_prefer_avx128
> +  = m_BDVER1;

What is the reason for stuff like this not to go into
initial_ix86_tune_features? I sort of liked them better when they were
individual flags, but having the target tuning flags spread across multiple
places seems unnecessary.

Honza

From a325395439a314f87b3c79a5b9ce79a6a976a710 Mon Sep 17 00:00:00 2001
From: Changpeng Fang 
Date: Wed, 22 Jun 2011 15:03:05 -0700
Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1

	* config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.

	* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
	(TARGET_AVX128_OPTIMAL): New definition.

	* config/i386/i386.c (initial_ix86_tune_features): Initialize
	X86_TUNE_AVX128_OPTIMAL entry.
	(ix86_option_override_internal): Enable the generation
	of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
	(ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
	(ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
---
 gcc/config/i386/i386.c   |   16 
 gcc/config/i386/i386.h   |4 +++-
 gcc/config/i386/i386.opt |2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 014401b..b3434dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2089,7 +2089,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
   /* X86_SOFTARE_PREFETCHING_BENEFICIAL: Enable software prefetching
  at -O3.  For the moment, the prefetching seems badly tuned for Intel
  chips.  */
-  m_K6_GEODE | m_AMD_MULTIPLE
+  m_K6_GEODE | m_AMD_MULTIPLE,
+
+  /* X86_TUNE_AVX128_OPTIMAL: Enable 128-bit AVX instruction generation for
+ the auto-vectorizer.  */
+  m_BDVER1
 };
 
 /* Feature tests against the various architecture variations.  */
@@ -2623,6 +2627,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
 { "-mvzeroupper",			MASK_VZEROUPPER },
 { "-mavx256-split-unaligned-load",	MASK_AVX256_SPLIT_UNALIGNED_LOAD},
 { "-mavx256-split-unaligned-store",	MASK_AVX256_SPLIT_UNALIGNED_STORE},
+{ "-mprefer-avx128",		MASK_PREFER_AVX128},
   };
 
   const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2];
@@ -3672,6 +3677,9 @@ ix86_option_override_internal (bool main_args_p)
 	  if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
 	  && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
 	target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
+	  /* Enable 128-bit AVX instruction generation for the auto-vectorizer.  */
+	  if (TARGET_AVX128_OPTIMAL && !(target_flags_explicit & MASK_PREFER_AVX128))
+	target_flags |= MASK_PREFER_AVX128;
 	}
 }
   else 
@@ -34614,7 +34622,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
   return V2DImode;
 
 case SFmode:
-  if (TARGET_AVX && !flag_prefer_avx128)
+  if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V8SFmode;
   else
 	return V4SFmode;
@@ -34622,7 +34630,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
 case DFmode:
   if (!TARGET_VECTORIZE_DOUBLE)
 	return word_mode;
-  else if (TARGET_AVX && !flag_prefer_avx128)
+  else if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V4DFmode;
   else if (TARGET_SSE2)
 	return V2DFmode;
@@ -34639,7 +34647,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
 static unsigned int
 ix86_autovectorize_vector_sizes (void)
 {
-  return (TARGET_AVX && !flag_prefer_avx128) ? 32 | 16 : 0;
+  return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
 }
 
 /* Initialize the GCC target structure.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8badcbb..d9317ed 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -312,6 +312,7 @@ enum ix86_tune_indices {
   X86_TUNE_OPT_AGU,
   X86_TUNE_VECTORIZE_DOUBLE,
   X86_TUNE_SOFTWARE_PREFETCHING_BENEFICIAL,
+  X86_TUNE_AVX128_OPTIMAL,
 
   X86_TUNE_LAST
 };
@@ -410,7 +411,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 	ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE]
 #define TARGET_SOFTWARE_PREFETCHING_BENEFICIAL \
 	ix86_tune_features[X86_TUNE_SOF
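
The ix86_option_override_internal hunk above follows a common pattern: a
tuning default turns an option on only when the user has not set that option
explicitly. A minimal sketch of that pattern, with hypothetical names
(toy_apply_tune_default and its parameters are not GCC's) and plain unsigned
masks in place of target_flags:

```c
/* Return FLAGS with MASK set when the tuning wants it (TUNE_WANTS is
   nonzero) and the user did not mention the option on the command
   line (MASK is clear in EXPLICIT_FLAGS).  An explicit user choice,
   on or off, is left untouched.  */
unsigned int
toy_apply_tune_default (unsigned int flags, unsigned int explicit_flags,
			unsigned int mask, int tune_wants)
{
  if (tune_wants && !(explicit_flags & mask))
    flags |= mask;
  return flags;
}
```

Under this model, a user who explicitly disables the option keeps it
disabled even on a CPU whose tuning would otherwise enable it.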

[lra] patch to fix a x86 wrong code generation

2011-06-24 Thread Vladimir Makarov
The following patch fixes a bug that results in wrong code generation for
Maeno's matrix multiplication algorithm on x86.


The patch was successfully bootstrapped on x86, ppc64, ia64.

2011-06-24  Vladimir Makarov 

* lra-constraints.c (extract_loc_address_regs): Add an argument.
Don't process the reg as an index reg on the top address level
when it is added to the symbol.
(extract_address_regs): Pass new argument to
extract_loc_address_regs.


Index: lra-constraints.c
===
--- lra-constraints.c   (revision 175386)
+++ lra-constraints.c   (working copy)
@@ -383,8 +383,8 @@ ok_for_base_p_nonstrict (rtx reg, enum m
   return ok_for_base_p_1 (regno, mode, outer_code, index_code);
 }
 
-/* Process address part with location *LOC to extract address
-   characteristics.
+/* Process address part (or all address if TOP_P) with location *LOC
+   to extract address characteristics.
 
If CONTEXT_P is false, we are looking at the base part of an
address, otherwise we are looking at the index part.
@@ -393,7 +393,7 @@ ok_for_base_p_nonstrict (rtx reg, enum m
give the context that the rtx appears in; MODIFY_P if *LOC is
modified.  */
 static void
-extract_loc_address_regs (enum machine_mode mode,
+extract_loc_address_regs (bool top_p, enum machine_mode mode,
  rtx *loc, bool context_p, enum rtx_code outer_code,
  enum rtx_code index_code,
  bool modify_p, struct address *ad)
@@ -447,8 +447,8 @@ extract_loc_address_regs (enum machine_m
   must be in the first operand.  */
if (MAX_REGS_PER_ADDRESS == 1)
  {
-   extract_loc_address_regs (mode, arg0_loc, false, PLUS, code1,
- modify_p, ad);
+   extract_loc_address_regs (false, mode, arg0_loc, false, PLUS,
+ code1, modify_p, ad);
gcc_assert (CONSTANT_P (arg1)); /* It should be a displacement.  */
ad->disp_loc = arg1_loc;
  }
@@ -458,11 +458,11 @@ extract_loc_address_regs (enum machine_m
   addresses are in canonical form.  */
else if (INDEX_REG_CLASS == base_reg_class (VOIDmode, PLUS, SCRATCH))
  {
-   extract_loc_address_regs (mode, arg0_loc, false, PLUS, code1,
- modify_p, ad);
+   extract_loc_address_regs (false, mode, arg0_loc, false, PLUS,
+ code1, modify_p, ad);
if (! CONSTANT_P (arg1))
- extract_loc_address_regs (mode, arg1_loc, true, PLUS, code0,
-   modify_p, ad);
+ extract_loc_address_regs (false, mode, arg1_loc, true, PLUS,
+   code0, modify_p, ad);
else
  ad->disp_loc = arg1_loc;
  }
@@ -471,15 +471,17 @@ extract_loc_address_regs (enum machine_m
   change what class the first operand must be.  */
else if (code1 == CONST_INT || code1 == CONST_DOUBLE)
  {
-   extract_loc_address_regs (mode, arg0_loc, context_p, PLUS, code1,
- modify_p, ad);
+   extract_loc_address_regs (false, mode, arg0_loc, context_p, PLUS,
+ code1, modify_p, ad);
ad->disp_loc = arg1_loc;
  }
/* If the second operand is a symbolic constant, the first
-  operand must be an index register.  */
+  operand must be an index register but only if this part is
+  all the address.  */
else if (code1 == SYMBOL_REF || code1 == CONST || code1 == LABEL_REF)
  {
-   extract_loc_address_regs (mode, arg0_loc, true, PLUS, code1,
+   extract_loc_address_regs (false, mode, arg0_loc,
+ top_p ? true : context_p, PLUS, code1,
  modify_p, ad);
ad->disp_loc = arg1_loc;
  }
@@ -493,9 +495,9 @@ extract_loc_address_regs (enum machine_m
 || ok_for_index_p_nonstrict (arg0)))
  {
extract_loc_address_regs
- (mode, arg0_loc, ! base_ok_p, PLUS, REG, modify_p, ad);
+ (false, mode, arg0_loc, ! base_ok_p, PLUS, REG, modify_p, ad);
extract_loc_address_regs
- (mode, arg1_loc, base_ok_p, PLUS, REG, modify_p, ad);
+ (false, mode, arg1_loc, base_ok_p, PLUS, REG, modify_p, ad);
  }
else if (code0 == REG && code1 == REG
 && REGNO (arg1) < FIRST_PSEUDO_REGISTER
@@ -504,38 +506,38 @@ extract_loc_address_regs (enum machine_m
 || ok_for_index_p_nonstrict (arg1)))
  {
extract_loc_address_regs 
- (mode, arg0_loc, base_ok_p, PLUS, REG, modify_p, ad);
+ (false, mode, arg0_loc, base_ok_p,

PATCH TRUNK: better format output for time reports.

2011-06-24 Thread Basile Starynkevitch
Hello All,

When cc1 reports timing, the column for the timing variable name is too narrow
for "df reg dead/unused notes":

Execution times (seconds)
 phase setup           :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1089 kB ( 1%) ggc
 trivially dead code   :   0.02 ( 1%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall       0 kB ( 0%) ggc
 df scan insns         :   0.07 ( 2%) usr   0.00 ( 0%) sys   0.11 ( 2%) wall      42 kB ( 0%) ggc
 df live regs          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.02 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1395 kB ( 1%) ggc
 register information  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     393 kB ( 0%) ggc
 rebuild jump labels   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.11 ( 3%) usr   0.10 (16%) sys   0.39 ( 8%) wall   11550 kB (10%) ggc
 lexical analysis      :   0.02 ( 1%) usr   0.11 (17%) sys   0.31 ( 7%) wall       0 kB ( 0%) ggc

The following trivial patch should fix that:
### patch against trunk 175201
Index: gcc/timevar.c
===
--- gcc/timevar.c   (revision 175201)
+++ gcc/timevar.c   (working copy)
@@ -478,7 +478,7 @@ timevar_print (FILE *fp)
continue;
 
   /* The timing variable name.  */
-  fprintf (fp, " %-22s:", tv->name);
+  fprintf (fp, " %-24s:", tv->name);
 
 #ifdef HAVE_USER_TIME
   /* Print user-mode time for this process.  */
### gcc/ChangeLog entry
2011-06-25  Basile Starynkevitch  

* timevar.c (timevar_print): Increase width for display of timevar name.
###
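
The effect of the width change can be checked in isolation. The sketch below
formats a name the same way as the fprintf call in the patch and reports the
column where the colon lands; toy_colon_column is a hypothetical helper
written for this illustration, not part of timevar.c.

```c
#include <assert.h>
#include <stdio.h>

/* Format NAME as " %-<WIDTH>s:" and return the zero-based column of
   the trailing colon.  Names longer than WIDTH are not truncated, so
   they push the colon to the right.  */
int
toy_colon_column (const char *name, int width)
{
  char buf[128];
  int len = snprintf (buf, sizeof buf, " %-*s:", width, name);

  assert (len > 0 && (size_t) len < sizeof buf);
  return len - 1;  /* The colon is the last character written.  */
}
```

"df reg dead/unused notes" is 24 characters long, so with a width of 22 its
colon sits two columns to the right of shorter names such as "df scan insns";
with a width of 24 both end up in the same column.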

Ok for trunk?

Regards
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***