Re: PING: [ARM] Model automodified addresses in the Cortex A8 and A9 schedulers -- NEON

2011-09-14 Thread Richard Sandiford
Ramana Radhakrishnan  writes:
> I do have one nit on the ML bit, though I must say I'm not an ML
> expert, which is why I resisted for a while. The one comment that I
> have, and should have realized earlier, is that the file had been
> parameterized by the core in quite a few places, and I would like to
> retain that capability.
>
> Does this look better?  I have retained your original logic and only
> parameterized wblatency on the core.

Looks good, thanks.

Richard


[RFC] Add FMA support to sparc backend

2011-09-14 Thread David Miller

Eric, this is a preliminary version of the FMA patch I've been
working on.  Just so you can see what I'm doing.

First, ignore the fact that there are two configure tests for the
presence of support for these instructions.  I'm busy normalizing the
-xarch options that binutils supports so that they match Sun AS, after
which only one test will be necessary.

Second, as on rs6000, the sparc negated fused multiply instructions
negate the full result, not just the multiply result, so we cannot use
those instructions for the fnmadf4/fnmsdf4/fnmasf4/fnmssf4 patterns.
Since rs6000 provides patterns for such negated operations (presumably
just in case the combiner creates a match), I have done so for sparc
as well.
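
(For concreteness, here is my reading of the two formulas; the real
operations are of course fused with a single rounding, the plain C below
is only meant to show where the negation lands, and is not code from the
patch:)

/* GCC's fnma optab negates only the product... */
double optab_fnma (double a, double b, double c) { return -(a * b) + c; }

/* ...while the hardware negate form, as on rs6000, negates the whole
   fused result.  */
double hw_fnmadd (double a, double b, double c) { return -(a * b + c); }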

I was really surprised that cpu designers haven't settled on a
consistent formula for negated fused multiply add/sub instructions.
Ho hum...

For now my plan is to turn these fused multiply instructions on if you
ask to compile targeting a cpu that supports them.

I'll write a suitable changelog etc. once everything is finalized; this
patch posting is just to elicit feedback.

Thanks!

diff --git a/gcc/config.in b/gcc/config.in
index d202b038..4bb6271 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -272,6 +272,12 @@
 #endif
 
 
+/* Define if your assembler supports FMAF instructions. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_GAS_FMAF
+#endif
+
+
 /* Define if your assembler supports the --gdwarf2 option. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_GDWARF2_DEBUG_FLAG
@@ -479,6 +485,12 @@
 #endif
 
 
+/* Define if your assembler supports FMAF instructions. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_SUN_FMAF
+#endif
+
+
 /* Define if your assembler and linker support thread-local storage. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_TLS
@@ -1047,12 +1059,6 @@
 #endif
 
 
-/* Define if _Unwind_GetIPInfo is available. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GETIPINFO
-#endif
-
-
 /* Define to 1 if you have the `getrlimit' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_GETRLIMIT
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index cf9e197..ce15730 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -752,9 +752,9 @@ sparc_option_override (void)
 /* UltraSPARC T2 */
 { MASK_ISA, MASK_V9},
 /* UltraSPARC T3 */
-{ MASK_ISA, MASK_V9},
+{ MASK_ISA, MASK_V9 | MASK_FMAF},
 /* UltraSPARC T4 */
-{ MASK_ISA, MASK_V9},
+{ MASK_ISA, MASK_V9 | MASK_FMAF},
   };
   const struct cpu_table *cpu;
   unsigned int i;
@@ -833,9 +833,9 @@ sparc_option_override (void)
   if (target_flags_explicit & MASK_FPU)
 target_flags = (target_flags & ~MASK_FPU) | fpu;
 
-  /* Don't allow -mvis if FPU is disabled.  */
+  /* Don't allow -mvis or -mfmaf if FPU is disabled.  */
   if (! TARGET_FPU)
-target_flags &= ~MASK_VIS;
+target_flags &= ~(MASK_VIS | MASK_FMAF);
 
   /* -mvis assumes UltraSPARC+, so we are sure v9 instructions
  are available.
@@ -9505,6 +9505,25 @@ sparc_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED,
*total = COSTS_N_INSNS (1);
   return false;
 
+case FMA:
+  {
+   rtx sub;
+
+   gcc_assert (float_mode_p);
+   *total = sparc_costs->float_mul;
+
+   sub = XEXP (x, 0);
+   if (GET_CODE (sub) == NEG)
+ sub = XEXP (sub, 0);
+   *total += rtx_cost (sub, FMA, 0, speed);
+
+   sub = XEXP (x, 2);
+   if (GET_CODE (sub) == NEG)
+ sub = XEXP (sub, 0);
+   *total += rtx_cost (sub, FMA, 2, speed);
+   return true;
+  }
+
 case MULT:
   if (float_mode_p)
*total = sparc_costs->float_mul;
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index afdca1e..c8174ce 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -1880,6 +1880,11 @@ extern int sparc_indent_opcode;
 #define TARGET_SUN_TLS TARGET_TLS
 #define TARGET_GNU_TLS 0
 
+#if !(defined(HAVE_AS_GAS_FMAF) || defined(HAVE_AS_SUN_FMAF))
+#undef TARGET_FMAF
+#define TARGET_FMAF 0
+#endif
+
 /* The number of Pmode words for the setjmp buffer.  */
 #define JMP_BUF_SIZE 12
 
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 721db93..58ac6d7 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -5345,6 +5345,78 @@
   "fmuls\t%1, %2, %0"
   [(set_attr "type" "fpmul")])
 
+(define_insn "fmadf4"
+  [(set (match_operand:DF 0 "register_operand" "=e")
+(fma:DF (match_operand:DF 1 "register_operand" "e")
+   (match_operand:DF 2 "register_operand" "e")
+   (match_operand:DF 3 "register_operand" "e")))]
+  "TARGET_FMAF"
+  "fmaddd\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "fmsdf4"
+  [(set (match_operand:DF 0 "register_operand" "=e")
+(fma:DF (match_operand:DF 1 "register_operand" "e")
+   (match_operand:DF 2 "register_operand" "e")
+   (neg:DF (match_operand:DF 3 "register_operand" "e"))))]
+  "TARGET_FMAF"
+  "fmsubd\t%1, %2,

Re: [PATCH, PR50251] set DRAP type stack realignment for stack_restore

2011-09-14 Thread Tom de Vries
On 09/04/2011 03:44 PM, Tom de Vries wrote:
> On 09/04/2011 11:10 AM, Richard Guenther wrote:
>> On Sun, Sep 4, 2011 at 11:00 AM, Tom de Vries  wrote:
>>> Hi,
>>>
>>> this patch fixes PR50251, which was caused by r178353.
>>>
>>> The patch was bootstrapped and reg-tested on i686 and x86_64.
>>> On i686, the test-cases reported failing in PR50251 pass again.
>>>
>>> The patch selects the DRAP type stack realignment method in case a
>>> stack_restore is used. If that is not done, the presence of the
>>> stack_restore at reload leaves FRAME_POINTER without an elimination rule
>>> for i386 port.
>>>
>>> OK for trunk?
>>
>> Shouldn't we eventually simply set cfun->calls_alloca when expanding either
>> stack save or restore?  Similar to how it is set from
>> allocate_dynamic_stack_space?
>> I'm not sure we encountered the dead stack save/restore pair before
>> this new folding,
>> so, just to make other targets not confused about them as well?
> 
> Setting cfun->calls_alloca will work as well, but doesn't seem necessary
> to me.  AFAIU, since other targets don't define MAX_STACK_ALIGNMENT, they
> don't need to do a realign themselves.  If I disable MAX_STACK_ALIGNMENT
> on i386, the middle-end handles the realign and the stack_restore causes
> no problems.  So to me this seems a problem with the i386-specific
> implementation of realignment.
> 
> I'm more worried about other similar cases not working for i386 than
> about other ports.  Another, more involved way to fix this would be, in
> reload, to:
> - calculate which registers we cannot use for elimination (which we
>   already do)
> - pass that as context to target.can_eliminate.  The i386 target can then
>   fulfill its requirement to be able to eliminate the frame pointer by
>   setting need_drap if the stack pointer is not available.
> I think this way we could remove most if not all of the
> 'crtl->need_drap = true' snippets, and have a more complete and perhaps
> more optimal solution.
> 
> Is this simple crtl->need_drap fix ok for now, or should I start testing
> the cfun->calls_alloca fix?
> 

Ping.

I have tested the attached patch on both x86_64 and i686 and it fixes all
failures in PR50251 without any regressions.

Since I don't feel like the right person to dig into reload guts to implement
the solution above, I want to check in this fix to i386 backend.

Uros, Richard H., OK for trunk?

Thank you,
- Tom

2011-09-14  Tom de Vries  

* explow.c (emit_stack_restore): Set crtl->need_drap if
stack_restore is emitted.
Index: gcc/explow.c
===
--- gcc/explow.c (revision 178145)
+++ gcc/explow.c (working copy)
@@ -1062,6 +1062,20 @@ emit_stack_restore (enum save_level save
   /* The default is that we use a move insn.  */
   rtx (*fcn) (rtx, rtx) = gen_move_insn;
 
+  /* If stack_realign_drap, the x86 backend emits a prologue that aligns both
+ STACK_POINTER and HARD_FRAME_POINTER.
+ If stack_realign_fp, the x86 backend emits a prologue that aligns only
+ STACK_POINTER. This renders the HARD_FRAME_POINTER unusable for accessing
+ aligned variables, which is reflected in ix86_can_eliminate.
+ We normally still have the realigned STACK_POINTER that we can use.
+ But if there is a stack restore still present at reload, it can trigger 
+ mark_not_eliminable for the STACK_POINTER, leaving no way to eliminate
+ FRAME_POINTER into a hard reg.
+ To prevent this situation, we force need_drap if we emit a stack
+ restore.  */
+  if (SUPPORTS_STACK_ALIGNMENT)
+crtl->need_drap = true;
+
   /* See if this machine has anything special to do for this kind of save.  */
   switch (save_level)
 {


Re: PING: [ARM] Model automodified addresses in the Cortex A8 and A9 schedulers -- NEON

2011-09-14 Thread Ramana Radhakrishnan
On 14 September 2011 08:47, Richard Sandiford
 wrote:
> Ramana Radhakrishnan  writes:
>> I do have one nit on the ML bit, though I must say I'm not an ML
>> expert, which is why I resisted for a while. The one comment that I
>> have, and should have realized earlier, is that the file had been
>> parameterized by the core in quite a few places, and I would like to
>> retain that capability.
>>
>> Does this look better?  I have retained your original logic and only
>> parameterized wblatency on the core.
>
> Looks good, thanks.


OK - please apply if no one objects in 24 hours.

Ramana
>
> Richard
>


Re: Missing SCALAR_INT_MODE_P checks in simplify_subreg

2011-09-14 Thread Eric Botcazou
> Tested on arm-linux-gnueabi and x86_64-linux-gnu.  Although I suppose
> I could self-approve this, I'd appreciate it if someone would double-check.
>
> Richard
>
>
> gcc/
>   * simplify-rtx.c (simplify_subreg): Check that the inner mode is
>   a scalar integer before applying integer-only optimisations to
>   inner arithmetic.

Looks good to me.

-- 
Eric Botcazou


Re: [S390] Use match_test rather than eq/ne symbol_ref

2011-09-14 Thread Andreas Krebbel
> gcc/
>   * config/s390/s390.md: Use match_test rather than eq/ne symbol_ref
>   throughout file.

This is ok. Thanks!

-Andreas-


Re: [Patch, Ada, Darwin] Remove use of "flat_namespace" linker flag.

2011-09-14 Thread Arnaud Charlet
Iain,

> So is the following OK for trunk?
> (no Ada regressions on *-darwin9, x86_64-darwin10).
> 
> cheers
> Iain
> 
> ada:
> 
>   * gcc-interface/Makefile.in (darwin): Do not issue the
>   '-flat_namespace'  linker flag during Ada build.
>* mlib-tgt-specific-darwin.adb: Remove 'flat_namespace' flag from
>   the default shared library options.

After double checking on our various test suites, this change is OK,
thanks.

Arno


Re: [PLUGIN] Fix PLUGIN_FINISH_TYPE

2011-09-14 Thread Romain Geissler
Hi,

I tried to fix PLUGIN_FINISH_DECL as well to include typedefs in C++.

The following do not currently trigger PLUGIN_FINISH_DECL
(or not in all cases), but should they?
 - function parameters (in the function prototype)
 - definition (with a function body) of a top-level function (while the exact
   same function definition enclosed in a class definition will trigger
   PLUGIN_FINISH_DECL)
 - label declaration
 - constants defined by enums
 - namespace

Romain.


Re: Prevent inliner from removing aliases of used comdats

2011-09-14 Thread Jan Hubicka
> Jan,
> 
> Any testcase you can add?

Sadly not really - in both cases (openoffice and mozilla) the modules
involved are too large for the testsuite.  I tried to delta it but didn't
get anything useful (about 100k of C++).

Honza
> 
> Graham


Re: [0/4] Modulo scheduling with haifa-sched for C6X

2011-09-14 Thread Richard Sandiford
Bernd Schmidt  writes:
> I have added support for this to haifa-sched.c. I expect the question
> "why not use SMS" to come up; there were a number of reasons why I felt
> that code is unsuitable:

Fully agree that SMS is unsuitable here FWIW, but...

> There are (or were, when I started) some glaring weaknesses in SMS, such
> as giving up when the loop contains autoincrement addresses (which is
> the case in the example above), and by the looks of it fairly poor
> memory disambiguation compared to sched-ebb with cselib.

...I didn't see from an admittedly quick read of the patch how you
handle memory disambiguation between iterations.  If a loop includes:

 lb $3,($4)
 sb $5,1($4)

then the two instructions can be reordered by normal ebb scheduling,
but the inter-iteration conflict is important for modulo scheduling.
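
A C analogue of the kind of loop I mean, purely for illustration (the
function and names are made up):

int
sum_and_fill (unsigned char *p, unsigned char v, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum += p[i];      /* the "lb" */
      p[i + 1] = v;     /* the "sb" at offset 1 */
    }
  return sum;
}

Within one iteration the two accesses never overlap, but the load in
iteration i+1 touches the byte stored in iteration i, so the ordering
across iterations still matters.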

Richard


Re: [PATCH, PR50251] set DRAP type stack realignment for stack_restore

2011-09-14 Thread Uros Bizjak
On Wed, Sep 14, 2011 at 10:20 AM, Tom de Vries  wrote:

 this patch fixes PR50251, which was caused by r178353.

 The patch was bootstrapped and reg-tested on i686 and x86_64.
 On i686, the test-cases reported failing in PR50251 pass again.

 The patch selects the DRAP type stack realignment method in case a
 stack_restore is used. If that is not done, the presence of the
 stack_restore at reload leaves FRAME_POINTER without an elimination rule
 for i386 port.

 OK for trunk?
>>>
>>> Shouldn't we eventually simply set cfun->calls_alloca when expanding either
>>> stack save or restore?  Similar to how it is set from
>>> allocate_dynamic_stack_space?
>>> I'm not sure we encountered the dead stack save/restore pair before
>>> this new folding,
>>> so, just to make other targets not confused about them as well?
>>
>> Setting cfun->calls_alloca will work as well, but doesn't seem necessary
>> to me.  AFAIU, since other targets don't define MAX_STACK_ALIGNMENT, they
>> don't need to do a realign themselves.  If I disable MAX_STACK_ALIGNMENT
>> on i386, the middle-end handles the realign and the stack_restore causes
>> no problems.  So to me this seems a problem with the i386-specific
>> implementation of realignment.
>>
>> I'm more worried about other similar cases not working for i386 than
>> about other ports.  Another, more involved way to fix this would be, in
>> reload, to:
>> - calculate which registers we cannot use for elimination (which we
>>   already do)
>> - pass that as context to target.can_eliminate.  The i386 target can then
>>   fulfill its requirement to be able to eliminate the frame pointer by
>>   setting need_drap if the stack pointer is not available.
>> I think this way we could remove most if not all of the
>> 'crtl->need_drap = true' snippets, and have a more complete and perhaps
>> more optimal solution.
>>
>> Is this simple crtl->need_drap fix ok for now, or should I start testing
>> the cfun->calls_alloca fix?
>>
>
> Ping.
>
> I have tested the attached patch on both x86_64 and i686 and it fixes all
> failures in PR50251 without any regressions.
>
> Since I don't feel like the right person to dig into reload guts to implement
> the solution above, I want to check in this fix to i386 backend.
>
> Uros, Richard H., OK for trunk?
>
> Thank you,
> - Tom
>
> 2011-09-14  Tom de Vries  
>
>        * explow.c (emit_stack_restore): Set crtl->need_drap if
>        stack_restore is emitted.

Since this approach just follows the approach taken in other builtin_*
functions, I'd say this is an oversight for stack_save/restore
builtins and the patch is OK. Any enhancements in this area can be
implemented independently of the fix.

Please also add the test from PR, with -mpreferred-stack-boundary=12
that currently fails for 32bit and 64bit x86 targets.
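
(Something along these lines, I assume -- the authoritative version is the
one attached to the PR; a block-scope VLA is one way to get the
stack_save/stack_restore pair:)

/* Sketch only, not the actual test from PR50251.  */
/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
/* { dg-options "-O2 -mpreferred-stack-boundary=12" } */

void use (char *);

void
foo (int n)
{
  {
    char buf[n];        /* VLA scope emits stack_save/stack_restore.  */
    use (buf);
  }
}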

Thanks,
Uros.


Re: [0/4] Modulo scheduling with haifa-sched for C6X

2011-09-14 Thread Bernd Schmidt
On 09/14/11 11:03, Richard Sandiford wrote:
> ...I didn't see from an admittedly quick read of the patch how you
> handle memory disambiguation between iterations.  If a loop includes:
> 
>  lb $3,($4)
>  sb $5,1($4)
> 
> then the two instructions can be reordered by normal ebb scheduling,
> but the inter-iteration conflict is important for modulo scheduling.

There's nothing special to handle, I think. sched-deps should see that
the ld in iteration 1 is DEP_ANTI against the sb in iteration 0
(assuming there's also an increment).


Bernd


Re: PING: [ARM] Model automodified addresses in the Cortex A8 and A9 schedulers -- NEON

2011-09-14 Thread Julian Brown
On Tue, 13 Sep 2011 21:44:54 +0100
Ramana Radhakrishnan  wrote:

> On 9 September 2011 13:56, Richard Sandiford
>  wrote:
> > Ping for this patch:
> 
> I do have one nit on the ML bit, though I must say I'm not an ML
> expert, which is why I resisted for a while. The one comment that I
> have, and should have realized earlier, is that the file had been
> parameterized by the core in quite a few places, and I would like to
> retain that capability.
> 
> Does this look better?  I have retained your original logic and only
> parameterized wblatency on the core.
> 
> Thoughts?  Can someone else also give the ML bits a once-over?  It
> appears to generate identical descriptions to yours.

A couple of stylistic points:

+(* The latency to use on all address register writeback dependencies.  *)
+let writebackLatency whichcore = 
+match whichcore with
+CortexA8 -> 1
+| CortexA9 -> 1 

This always returns 1? Why not just:

let writebackLatency _ = 1

Also you could write:

let writebackLatency = function
CortexA8 -> 1
  | CortexA9 -> 1

which is a little more concise.
 
+  if (try ignore (Hashtbl.find ht (guard, latency)); false
+  with Not_found -> true)

There's no need for this construct. Use Hashtbl.mem ("membership") instead, 
e.g.:

if not (Hashtbl.mem ht (guard, latency)) then ...

(I see that you're just modifying existing code, but still...)

+let comp_fn (guard1, latency1) (guard2, latency2) =
+  if latency1 > latency2 then -1
+  else if latency1 < latency2 then 1
+  else if guard1 > guard2 then -1
+  else if guard2 > guard1 then 1
+  else 0

You can use the built-in function compare (e.g. "compare guard2 guard1") here, 
instead of the if/else chain. It has exactly the semantics you need, i.e. it 
returns -1/0/1 for less-than/equal/greater-than.

I have no comments on the patch algorithmically speaking.

Thanks,

Julian


[PATCH, PR50322] Fix test-case ivopts-lt.c to use int of same size as pointer.

2011-09-14 Thread Tom de Vries
Mike,
Zdenek,

The attached patch fixes PR50322.

The test-case is supposed to succeed if the loop counter data-type has the same
size as a pointer. The patch defines TYPE to be an int datatype of the same size
as a pointer, and uses it. After this fix, there's no need for the avr xfails
anymore.

tested with avr, x86_64 and x86_64 -m32.

OK for trunk?

Thanks,
- Tom

2011-09-14  Tom de Vries  

PR testsuite/50322
* gcc.dg/tree-ssa/ivopts-lt.c: Define TYPE.  Use TYPE in f1.  Undo avr
xfails.
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (revision 178804)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (working copy)
@@ -1,8 +1,18 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-ivopts" } */
 
+#if (__SIZEOF_LONG_LONG__ == __SIZEOF_POINTER__)
+#define TYPE unsigned long long int
+#elif (__SIZEOF_LONG__ == __SIZEOF_POINTER__)
+#define TYPE unsigned long int
+#elif (__SIZEOF_INT__ == __SIZEOF_POINTER__)
+#define TYPE unsigned int
+#else
+#error Add target support here
+#endif
+
 void
-f1 (char *p, unsigned long int i, unsigned long int n)
+f1 (char *p, TYPE i, TYPE n)
 {
   p += i;
   do
@@ -14,8 +24,7 @@ f1 (char *p, unsigned long int i, unsign
   while (i < n);
 }
 
-/* For the fails on avr see PR tree-optimization/50322.  */
-/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" { xfail { "avr-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" } } */
 /* { dg-final { scan-tree-dump-times "PHI 

[patch] Allow not simple ivs in SLP

2011-09-14 Thread Ira Rosen
Hi,

This patch makes the data-ref analysis not fail when simple_iv returns
false in basic block SLP.

Bootstrapped and tested on powerpc64-suse-linux.
OK for mainline?

Thanks,
Ira

ChangeLog:

 * tree-data-ref.c (dr_analyze_innermost): Rename to...
 (dr_analyze_innermost_1): ... this.  Add new argument.
 Allow not simple iv if analyzing basic block.
 (dr_analyze_innermost): Call dr_analyze_innermost_1.
 (create_data_ref): Call dr_analyze_innermost_1.

testsuite/ChangeLog:

 * gcc.dg/vect/bb-slp-24.c: New test.

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 178755)
+++ tree-data-ref.c (working copy)
@@ -722,11 +722,11 @@ canonicalize_base_object_address (tree addr)
 }

 /* Analyzes the behavior of the memory reference DR in the innermost loop or
-   basic block that contains it. Returns true if analysis succeed or false
+   basic block that contains it.  Returns true if analysis succeed or false
otherwise.  */

-bool
-dr_analyze_innermost (struct data_reference *dr)
+static bool
+dr_analyze_innermost_1 (struct data_reference *dr, struct loop *nest)
 {
   gimple stmt = DR_STMT (dr);
   struct loop *loop = loop_containing_stmt (stmt);
@@ -769,14 +769,25 @@ canonicalize_base_object_address (tree addr)
 }
   else
 base = build_fold_addr_expr (base);
+
   if (in_loop)
 {
   if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv,
   false))
 {
-  if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "failed: evolution of base is not affine.\n");
-  return false;
+  if (nest)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "failed: evolution of base is not"
+" affine.\n");
+return false;
+}
+  else
+{
+  base_iv.base = base;
+  base_iv.step = ssize_int (0);
+  base_iv.no_overflow = true;
+}
 }
 }
   else
@@ -801,10 +812,18 @@ canonicalize_base_object_address (tree addr)
   else if (!simple_iv (loop, loop_containing_stmt (stmt),
poffset, &offset_iv, false))
 {
-  if (dump_file && (dump_flags & TDF_DETAILS))
-fprintf (dump_file, "failed: evolution of offset is not"
-" affine.\n");
-  return false;
+  if (nest)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+fprintf (dump_file, "failed: evolution of offset is not"
+" affine.\n");
+  return false;
+}
+  else
+{
+  offset_iv.base = poffset;
+  offset_iv.step = ssize_int (0);
+}
 }
 }

@@ -832,6 +851,19 @@ canonicalize_base_object_address (tree addr)
   return true;
 }

+/* Analyzes the behavior of the memory reference DR in the innermost loop or
+   basic block that contains it.  Returns true if analysis succeed or false
+   otherwise.  */
+
+bool
+dr_analyze_innermost (struct data_reference *dr)
+{
+  gimple stmt = DR_STMT (dr);
+  struct loop *loop = loop_containing_stmt (stmt);
+
+  return dr_analyze_innermost_1 (dr, loop);
+}
+
 /* Determines the base object and the list of indices of memory reference
DR, analyzed in LOOP and instantiated in loop nest NEST.  */

@@ -972,7 +1004,7 @@ create_data_ref (loop_p nest, loop_p loop, tree me
   DR_REF (dr) = memref;
   DR_IS_READ (dr) = is_read;

-  dr_analyze_innermost (dr);
+  dr_analyze_innermost_1 (dr, nest);
   dr_analyze_indices (dr, nest, loop);
   dr_analyze_alias (dr);

Index: testsuite/gcc.dg/vect/bb-slp-24.c
===
--- testsuite/gcc.dg/vect/bb-slp-24.c   (revision 0)
+++ testsuite/gcc.dg/vect/bb-slp-24.c   (revision 0)
@@ -0,0 +1,55 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+#define A 3
+#define N 256
+
+short src[N], dst[N];
+
+void foo (short * __restrict dst, short * __restrict src, int h, int stride)
+{
+int i;
+h /= 8;
+for (i = 0; i < h; i++) {
+dst[0] += A*src[0];
+dst[1] += A*src[1];
+dst[2] += A*src[2];
+dst[3] += A*src[3];
+dst[4] += A*src[4];
+dst[5] += A*src[5];
+dst[6] += A*src[6];
+dst[7] += A*src[7];
+dst += stride;
+src += stride;
+}
+}
+
+
+int main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+{
+  dst[i] = 0;
+  src[i] = i;
+}
+
+  foo (dst, src, N, 8);
+
+  for (i = 0; i < N; i++)
+{
+  if (dst[i] != A * i)
+abort ();
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized using
SLP" 1 "slp" { target vect_element_a

Re: Missing SCALAR_INT_MODE_P checks in simplify_subreg

2011-09-14 Thread Richard Sandiford
Eric Botcazou  writes:
>> Tested on arm-linux-gnueabi and x86_64-linux-gnu.  Although I suppose
>> I could self-approve this, I'd appreciate it if someone would double-check.
>>
>> Richard
>>
>>
>> gcc/
>>  * simplify-rtx.c (simplify_subreg): Check that the inner mode is
>>  a scalar integer before applying integer-only optimisations to
>>  inner arithmetic.
>
> Looks good to me.

Thanks.  I went ahead and applied it.

Richard


[ARM] Optimise handling of neon_vget_high/low

2011-09-14 Thread Richard Sandiford
neon_vget_high and neon_vget_low extract one half of a vector.
The patterns look like:

(define_insn "neon_vget_highv16qi"
  [(set (match_operand:V8QI 0 "s_register_operand" "=w")
(vec_select:V8QI (match_operand:V16QI 1 "s_register_operand" "w")
 (parallel [(const_int 8) (const_int 9)
(const_int 10) (const_int 11)
(const_int 12) (const_int 13)
(const_int 14) (const_int 15)])))]
  "TARGET_NEON"
{
  int dest = REGNO (operands[0]);
  int src = REGNO (operands[1]);

  if (dest != src + 2)
return "vmov\t%P0, %f1";
  else
return "";
}
  [(set_attr "neon_type" "neon_bp_simple")]
)

But there's nothing here to tell the register allocator what's expected
of it, so we do often get the move.  The patch below makes the patterns
expand to normal subreg moves instead.

Unfortunately, when I first tried this, I ran across some bugs in
simplify-rtx.c.  They should be fixed now.  Of course, I can't
guarantee that there aren't other similar bugs elsewhere, but I'll
try to fix any that crop up.

The new patterns preserve the current treatment on big-endian targets.
Namely, the "low" half is always in the lower-numbered registers
(subreg byte offset 0).

Tested on arm-linux-gnueabi.  OK to install?

Richard


gcc/
* config/arm/neon.md (neon_vget_highv16qi, neon_vget_highv8hi)
(neon_vget_highv4si, neon_vget_highv4sf, neon_vget_highv2di)
(neon_vget_lowv16qi, neon_vget_lowv8hi, neon_vget_lowv4si)
(neon_vget_lowv4sf, neon_vget_lowv2di): Turn into define_expands
that produce subreg moves.  Define using VQX iterators.

Index: gcc/config/arm/neon.md
===
--- gcc/config/arm/neon.md  2011-09-13 13:33:29.0 +0100
+++ gcc/config/arm/neon.md  2011-09-13 16:21:23.189304498 +0100
@@ -2969,183 +2969,27 @@ (define_insn "neon_vcombine"
(set_attr "neon_type" "neon_bp_simple")]
 )
 
-(define_insn "neon_vget_highv16qi"
-  [(set (match_operand:V8QI 0 "s_register_operand" "=w")
-   (vec_select:V8QI (match_operand:V16QI 1 "s_register_operand" "w")
- (parallel [(const_int 8) (const_int 9)
-   (const_int 10) (const_int 11)
-   (const_int 12) (const_int 13)
-   (const_int 14) (const_int 15)])))]
-  "TARGET_NEON"
-{
-  int dest = REGNO (operands[0]);
-  int src = REGNO (operands[1]);
-
-  if (dest != src + 2)
-return "vmov\t%P0, %f1";
-  else
-return "";
-}
-  [(set_attr "neon_type" "neon_bp_simple")]
-)
-
-(define_insn "neon_vget_highv8hi"
-  [(set (match_operand:V4HI 0 "s_register_operand" "=w")
-   (vec_select:V4HI (match_operand:V8HI 1 "s_register_operand" "w")
-(parallel [(const_int 4) (const_int 5)
-   (const_int 6) (const_int 7)])))]
-  "TARGET_NEON"
-{
-  int dest = REGNO (operands[0]);
-  int src = REGNO (operands[1]);
-
-  if (dest != src + 2)
-return "vmov\t%P0, %f1";
-  else
-return "";
-}
-  [(set_attr "neon_type" "neon_bp_simple")]
-)
-
-(define_insn "neon_vget_highv4si"
-  [(set (match_operand:V2SI 0 "s_register_operand" "=w")
-   (vec_select:V2SI (match_operand:V4SI 1 "s_register_operand" "w")
-(parallel [(const_int 2) (const_int 3)])))]
-  "TARGET_NEON"
-{
-  int dest = REGNO (operands[0]);
-  int src = REGNO (operands[1]);
-
-  if (dest != src + 2)
-return "vmov\t%P0, %f1";
-  else
-return "";
-}
-  [(set_attr "neon_type" "neon_bp_simple")]
-)
-
-(define_insn "neon_vget_highv4sf"
-  [(set (match_operand:V2SF 0 "s_register_operand" "=w")
-   (vec_select:V2SF (match_operand:V4SF 1 "s_register_operand" "w")
-(parallel [(const_int 2) (const_int 3)])))]
-  "TARGET_NEON"
-{
-  int dest = REGNO (operands[0]);
-  int src = REGNO (operands[1]);
-
-  if (dest != src + 2)
-return "vmov\t%P0, %f1";
-  else
-return "";
-}
-  [(set_attr "neon_type" "neon_bp_simple")]
-)
-
-(define_insn "neon_vget_highv2di"
-  [(set (match_operand:DI 0 "s_register_operand" "=w")
-   (vec_select:DI (match_operand:V2DI 1 "s_register_operand" "w")
-  (parallel [(const_int 1)])))]
-  "TARGET_NEON"
-{
-  int dest = REGNO (operands[0]);
-  int src = REGNO (operands[1]);
-
-  if (dest != src + 2)
-return "vmov\t%P0, %f1";
-  else
-return "";
-}
-  [(set_attr "neon_type" "neon_bp_simple")]
-)
-
-(define_insn "neon_vget_lowv16qi"
-  [(set (match_operand:V8QI 0 "s_register_operand" "=w")
-   (vec_select:V8QI (match_operand:V16QI 1 "s_register_operand" "w")
- (parallel [(const_int 0) (const_int 1)
-   (const_int 2) (const_int 3)
-   (const_int 4) (const_int 5)
-   (const_int 6) (const_int 7)])))

[C++ Patch] PR 50391

2011-09-14 Thread Paolo Carlini

Hi,

here I'm just avoiding calling build_exception_variant when specs is an 
error_mark_node.


Tested x86_64-linux. Ok for mainline?

Thanks,
Paolo.

/
/cp
2011-09-14  Paolo Carlini  

PR c++/50391
* pt.c (regenerate_decl_from_template): Don't pass an error_mark_node
to build_exception_variant.

/testsuite
2011-09-14  Paolo Carlini  

PR c++/50391
* g++.dg/cpp0x/noexcept15.C: New.
Index: testsuite/g++.dg/cpp0x/noexcept15.C
===
--- testsuite/g++.dg/cpp0x/noexcept15.C (revision 0)
+++ testsuite/g++.dg/cpp0x/noexcept15.C (revision 0)
@@ -0,0 +1,34 @@
+// PR c++/50391
+// { dg-options -std=c++0x }
+
+#include 
+
+template
+  struct single
+  {
+Tp elem;  // { dg-error "incomplete type" }
+
+constexpr single(const Tp& e)
+: elem(e) { }   // { dg-error "invalid field" }
+
+single(single&& s) // { dg-error "not a member" }
+noexcept(std::is_nothrow_move_constructible::value) 
+: elem(s.elem) { } // { dg-error "invalid field|no member" }
+  };
+
+template
+  constexpr single::type>
+  make_single(Tp&& x)
+  {
+return single::type>(x);
+  }
+
+class Blob;  // { dg-error "forward declaration" }
+
+void
+foo(Blob *b)
+{
+  make_single(*b);
+}
+
+// { dg-prune-output "include" }
Index: cp/pt.c
===
--- cp/pt.c (revision 178845)
+++ cp/pt.c (working copy)
@@ -17729,7 +17729,7 @@ regenerate_decl_from_template (tree decl, tree tmp
   specs = tsubst_exception_specification (TREE_TYPE (code_pattern),
  args, tf_error, NULL_TREE,
  /*defer_ok*/false);
-  if (specs)
+  if (specs && specs != error_mark_node)
TREE_TYPE (decl) = build_exception_variant (TREE_TYPE (decl),
specs);
 


[PATCH] [MELT] Fix CUMULMD5 leading space

2011-09-14 Thread Alexandre Lissy
Since v3.82, make has issued some strange warnings about duplicated
targets. It turns out that those were due to a leading space hidden in the
CUMULMD5 variables, which broke the affected targets.
---
 gcc/ChangeLog.MELT |4 
 gcc/melt-build.mk  |   18 +-
 gcc/melt-build.tpl |2 +-
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT
index 78c6ae6..6efd8d9 100644
--- a/gcc/ChangeLog.MELT
+++ b/gcc/ChangeLog.MELT
@@ -1,3 +1,7 @@
+2011-09-14  Alexandre Lissy  
+   * melt-build.tpl: Remove leading space on CUMULMD5, breaking some make
+   targets.
+   * melt-build.mk: Regenerate.
 
 2011-09-14  Basile Starynkevitch  
* melt-runtime.c (melt_dbgshortbacktrace): Print just the basename
diff --git a/gcc/melt-build.mk b/gcc/melt-build.mk
index af9e266..fd9d8a2 100644
--- a/gcc/melt-build.mk
+++ b/gcc/melt-build.mk
@@ -15,7 +15,7 @@
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 # GNU General Public License for more details.
 # 
-#@ from melt-build.tpl line 21 generated by Autogen 5.12 using melt-build.def
+#@ from melt-build.tpl line 21 generated by Autogen 5.11.9 using melt-build.def
 
 
 ## the following Makefile variables are expected to be set
@@ -147,7 +147,7 @@ MELT_GENERATED_FIRST_C_FILES= \
 MELT_GENERATED_FIRST_BASE= \
   $(basename $(notdir $(MELT_GENERATED_FIRST_C_FILES)))
 
-MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-first+meltdesc.c) 
+MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-first+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_BASE_C_FILES= \
@@ -156,7 +156,7 @@ MELT_GENERATED_BASE_C_FILES= \
 MELT_GENERATED_BASE_BASE= \
   $(basename $(notdir $(MELT_GENERATED_BASE_C_FILES)))
 
-MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-base+meltdesc.c) 
+MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-base+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_DEBUG_C_FILES= \
@@ -165,7 +165,7 @@ MELT_GENERATED_DEBUG_C_FILES= \
 MELT_GENERATED_DEBUG_BASE= \
   $(basename $(notdir $(MELT_GENERATED_DEBUG_C_FILES)))
 
-MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c) 
+MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_MACRO_C_FILES= \
@@ -174,7 +174,7 @@ MELT_GENERATED_MACRO_C_FILES= \
 MELT_GENERATED_MACRO_BASE= \
   $(basename $(notdir $(MELT_GENERATED_MACRO_C_FILES)))
 
-MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c) 
+MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_NORMAL_C_FILES= \
@@ -183,7 +183,7 @@ MELT_GENERATED_NORMAL_C_FILES= \
 MELT_GENERATED_NORMAL_BASE= \
   $(basename $(notdir $(MELT_GENERATED_NORMAL_C_FILES)))
 
-MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c) 
+MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_NORMATCH_C_FILES= \
@@ -192,7 +192,7 @@ MELT_GENERATED_NORMATCH_C_FILES= \
 MELT_GENERATED_NORMATCH_BASE= \
   $(basename $(notdir $(MELT_GENERATED_NORMATCH_C_FILES)))
 
-MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c) 
+MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_GENOBJ_C_FILES= \
@@ -201,7 +201,7 @@ MELT_GENERATED_GENOBJ_C_FILES= \
 MELT_GENERATED_GENOBJ_BASE= \
   $(basename $(notdir $(MELT_GENERATED_GENOBJ_C_FILES)))
 
-MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c) 
+MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT

Re: [PATCH, ARM] Unaligned accesses for builtin memcpy [2/2]

2011-09-14 Thread Julian Brown
On Fri, 6 May 2011 14:13:32 +0100
Julian Brown  wrote:

> Hi,
> 
> This is the second of two patches to add unaligned-access support to
> the ARM backend. It builds on the first patch to provide support for
> unaligned accesses when expanding block moves (i.e. for builtin memcpy
> operations). It makes some effort to use load/store multiple
> instructions where appropriate (when accessing sufficiently-aligned
> source or destination addresses), and also makes some effort to
> generate fast code (for -O1/2/3) or small code (for -Os), though some
> of the heuristics may need tweaking still.

The preceding patch to this one has now been approved & applied:

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00575.html

FAOD, is this one OK too?

Thanks,

Julian


Re: [RFC] Builtin infrastructure change

2011-09-14 Thread Joseph S. Myers
On Tue, 13 Sep 2011, Michael Meissner wrote:

>   implicit_built_in_decls to the new macros/functions, and poison
>   the old names. Make sure 0 is not a legitimate builtin index.

I think the poisoning belongs in system.h; direct dependencies on the host 
compiler (such as GCC_VERSION and __GNUC__ tests) should be kept as local 
as possible, in just a few headers such as system.h and hwint.h, with 
other places testing feature macros defined in system.h (or through 
autoconf) if it's really necessary to have such dependencies there.
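
(I.e., next to the existing poison lists in system.h, something along
these lines -- a sketch, with the exact set of poisoned names depending on
the final patch:)

#if (GCC_VERSION >= 3000)
 #pragma GCC poison built_in_decls implicit_built_in_decls
#endif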

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, testsuite, ARM] Change to expected failure for g++.dg/abi/local1.C on ARM target

2011-09-14 Thread Joseph S. Myers
On Wed, 14 Sep 2011, Jiangning Liu wrote:

> +// { dg-do run { xfail { arm*-*-eabi* } } }

That's specific to bare-metal targets rather than all ARM EABI (such as 
arm-none-linux-gnueabi or arm-none-symbianelf), is that your intent?  If 
you wanted to cover all ARM EABI targets you'd use the arm_eabi 
effective-target.  But maybe you actually want a new effective-target 
keyword for targets defining __GXX_TYPEINFO_EQUALITY_INLINE to 0 rather 
than just ARM EABI?  I've seen this test fail on i686-mingw32 and 
m68k-uclinux as well.

-- 
Joseph S. Myers
jos...@codesourcery.com


RFC: allowing fwprop to propagate subregs

2011-09-14 Thread Richard Sandiford
At the moment, fwprop will propagate constants and registers
even if no further rtl simplifications are possible:

  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
flags |= PR_CAN_APPEAR;

What do you think about extending this to subregs?  The reason for
asking is that on NEON, vector loads like vld4 are represented as a load
of a single monolithic register followed by subreg extractions of each
vector:

  (set (reg:OI FULL) (...))
  (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0))
  (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 16))
  (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 32))
  (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 48))

Nothing ever propagates these subregs, so the separate moves
survive until IRA.  This has three problems:

  - We generally want the registers allocated to V0...V3 to be the same
as FULL, so that the four subreg moves become nops.  And this often
happens in simple examples.  But if register pressure is relatively
high, these moves can sometimes cause IRA to spill in cases where
it doesn't if the subregs are used instead of each Vi.

  - Perhaps related, register pressure becomes harder to estimate.

  - These moves can interfere with pre-reload scheduling.

In combination with the MODES_TIEABLE_P patch that I posted here:

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html

this patch significantly improves the code generated for several libav
loops.  Unfortunately, I don't have a setup that can do meaningful
x86_64 performance measurements, but a diff of the before and after
output for libav showed many cases where the patch removed moves.

What do you think?  Alternatives include propagating in lower-subreg,
or maybe only in the second fwprop pass.

Richard


gcc/
* fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs.

Index: gcc/fwprop.c
===
--- gcc/fwprop.c2011-08-26 09:58:28.829540497 +0100
+++ gcc/fwprop.c2011-08-26 10:14:03.767707504 +0100
@@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode 
 return NULL_RTX;
 
   flags = 0;
-  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
+  if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == SUBREG)
 flags |= PR_CAN_APPEAR;
   if (!for_each_rtx (&new_rtx, varying_mem_p, NULL))
 flags |= PR_HANDLE_MEM;


Re: RFC: allowing fwprop to propagate subregs

2011-09-14 Thread H.J. Lu
On Wed, Sep 14, 2011 at 8:24 AM, Richard Sandiford
 wrote:
> At the moment, fwprop will propagate constants and registers
> even if no further rtl simplifications are possible:
>
>  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
>    flags |= PR_CAN_APPEAR;
>
> What do you think about extending this to subregs?  The reason for
> asking is that on NEON, vector loads like vld4 are represented as a load
> of a single monolithic register followed by subreg extractions of each
> vector:
>
>  (set (reg:OI FULL) (...))
>  (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0))
>  (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 16))
>  (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 32))
>  (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 48))
>
> Nothing ever propagates these subregs, so the separate moves
> survive until IRA.  This has three problems:
>
>  - We generally want the registers allocated to V0...V3 to be the same
>    as FULL, so that the four subreg moves become nops.  And this often
>    happens in simple examples.  But if register pressure is relatively
>    high, these moves can sometimes cause IRA to spill in cases where
>    it doesn't if the subregs are used instead of each Vi.
>
>  - Perhaps related, register pressure becomes harder to estimate.
>
>  - These moves can interfere with pre-reload scheduling.
>
> In combination with the MODES_TIEABLE_P patch that I posted here:
>
>    http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html
>
> this patch significantly improves the code generated for several libav
> loops.  Unfortunately, I don't have a setup that can do meaningful
> x86_64 performance measurements, but a diff of the before and after
> output for libav showed many cases where the patch removed moves.
>
> What do you think?  Alternatives include propagating in lower-subreg,
> or maybe only in the second fwprop pass.
>
> Richard
>
>
> gcc/
>        * fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs.
>
> Index: gcc/fwprop.c
> ===
> --- gcc/fwprop.c        2011-08-26 09:58:28.829540497 +0100
> +++ gcc/fwprop.c        2011-08-26 10:14:03.767707504 +0100
> @@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode
>     return NULL_RTX;
>
>   flags = 0;
> -  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
> +  if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == 
> SUBREG)
>     flags |= PR_CAN_APPEAR;
>   if (!for_each_rtx (&new_rtx, varying_mem_p, NULL))
>     flags |= PR_HANDLE_MEM;
>

A SUBREG may not be REG nor CONSTANT. Don't you need
to check REG_P/CONSTANT_P on SUBREG?


-- 
H.J.


Re: [MICROBLAZE] Use match_test rather than eq/ne symbol_ref

2011-09-14 Thread Michael Eager

On 09/13/2011 10:58 AM, Richard Sandiford wrote:

As per the subject.  Tested by making sure that there were no new
warnings building microblaze-elf, and that there were no changes
in the assembly output for the C and C++ testsuite.  OK to install?


OK.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [C++ Patch] PR 50391

2011-09-14 Thread Jason Merrill

OK.

Jason


Re: RFC: allowing fwprop to propagate subregs

2011-09-14 Thread Richard Sandiford
"H.J. Lu"  writes:
> On Wed, Sep 14, 2011 at 8:24 AM, Richard Sandiford
>  wrote:
>> At the moment, fwprop will propagate constants and registers
>> even if no further rtl simplifications are possible:
>>
>>  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
>>    flags |= PR_CAN_APPEAR;
>>
>> What do you think about extending this to subregs?  The reason for
>> asking is that on NEON, vector loads like vld4 are represented as a load
>> of a single monolithic register followed by subreg extractions of each
>> vector:
>>
>>  (set (reg:OI FULL) (...))
>>  (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0))
>>  (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 16))
>>  (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 32))
>>  (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 48))
>>
>> Nothing ever propagates these subregs, so the separate moves
>> survive until IRA.  This has three problems:
>>
>>  - We generally want the registers allocated to V0...V3 to be the same
>>    as FULL, so that the four subreg moves become nops.  And this often
>>    happens in simple examples.  But if register pressure is relatively
>>    high, these moves can sometimes cause IRA to spill in cases where
>>    it doesn't if the subregs are used instead of each Vi.
>>
>>  - Perhaps related, register pressure becomes harder to estimate.
>>
>>  - These moves can interfere with pre-reload scheduling.
>>
>> In combination with the MODES_TIEABLE_P patch that I posted here:
>>
>>    http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html
>>
>> this patch significantly improves the code generated for several libav
>> loops.  Unfortunately, I don't have a setup that can do meaningful
>> x86_64 performance measurements, but a diff of the before and after
>> output for libav showed many cases where the patch removed moves.
>>
>> What do you think?  Alternatives include propagating in lower-subreg,
>> or maybe only in the second fwprop pass.
>>
>> Richard
>>
>>
>> gcc/
>>        * fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs.
>>
>> Index: gcc/fwprop.c
>> ===
>> --- gcc/fwprop.c        2011-08-26 09:58:28.829540497 +0100
>> +++ gcc/fwprop.c        2011-08-26 10:14:03.767707504 +0100
>> @@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode
>>     return NULL_RTX;
>>
>>   flags = 0;
>> -  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
>> +  if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == 
>> SUBREG)
>>     flags |= PR_CAN_APPEAR;
>>   if (!for_each_rtx (&new_rtx, varying_mem_p, NULL))
>>     flags |= PR_HANDLE_MEM;
>>
>
> A SUBREG may not be REG nor CONSTANT. Don't you need
> to check REG_P/CONSTANT_P on SUBREG?

Yeah, good point.  There should be a "&& REG_P (SUBREG_REG (new_rtx))"
in there.  Probably also worth checking for non-paradoxical subregs.
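
Something like this, I suppose (untested; the size comparison is just one
way of spelling "not a paradoxical subreg"):

  if (REG_P (new_rtx)
      || CONSTANT_P (new_rtx)
      || (GET_CODE (new_rtx) == SUBREG
          && REG_P (SUBREG_REG (new_rtx))
          && (GET_MODE_SIZE (GET_MODE (new_rtx))
              <= GET_MODE_SIZE (GET_MODE (SUBREG_REG (new_rtx))))))
    flags |= PR_CAN_APPEAR;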

Richard


[Patch, committed] Add myself to MAINTAINERS

2011-09-14 Thread Tom de Vries
Added myself to MAINTAINERS (write after approval).

Thanks,
- Tom

2011-09-14  Tom de Vries  

* MAINTAINERS (Write After Approval): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 178804)
+++ MAINTAINERS (working copy)
@@ -505,6 +505,7 @@ Konrad Trifunovickonrad.trifunovic@i
 David Ung	dav...@mips.com
 Neil Vachharajaninvach...@gmail.com
 Kris Van Hees	kris.van.h...@oracle.com
+Tom de Vries	t...@codesourcery.com
 Nenad Vukicevic	ne...@intrepid.com
 Feng Wang	fengw...@nudt.edu.cn
 Stephen M. Webb	stephen.w...@bregmasoft.com


Re: [PATCH] [Annotalysis] Fix internal compiler errors in thread safety analysis.

2011-09-14 Thread Diego Novillo
On Tue, Sep 13, 2011 at 18:27, Delesley Hutchins  wrote:
> This patch fixes two bugs which cause an internal compiler error in
> annotalysis. The first change fixes a failure when a variable is
> typecast from an unknown (forward-defined)  type.  The second change
> fixes a case in which an assert was triggered, because subexpressions
> were not canonicalized correctly.
>
> Bootstrapped and passed gcc regression testsuite on
> x86_64-unknown-linux-gnu.  Tested on google core components.
>
> Okay for google/gcc-4_6?

OK with a few nits below.

> 2011-9-13  DeLesley Hutchins  
>
>       * gcc/cp/class.c (cp_get_virtual_function_decl) bugfix where
> type is uknown
>       * gcc/tree-threadsafe-analyze.c (get_canonical_lock_expr)
> don't remove & on recursive call

Missing ':' after ')'.  This entry should be split, the change to
tree-threadsafe-analyze.c goes in gcc/ChangeLog.google-4_6.  The
change to gcc/cp/class.c goes in gcc/cp/ChangeLog.google-4_6.

Also, you need a ChangeLog entry for
gcc/testsuite/ChangeLog.google-4_6 to describe the changes to the
testsuite files.

> @@ -8384,9 +8384,15 @@ cp_get_virtual_function_decl (tree ref, tree known
>  {
>   HOST_WIDE_INT index = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1);
>   HOST_WIDE_INT i = 0;
> -  tree v = BINFO_VIRTUALS (TYPE_BINFO (known_type));
> +  tree binfo = TYPE_BINFO(known_type);

space before '('.

> @@ -884,10 +884,12 @@ get_canonical_lock_expr (tree lock, tree base_obj,
>         {
>           tree base = TREE_OPERAND (lock, 0);
>           tree canon_base;
> -          /* When the expr is a pointer to a lockable type (i.e. mu.Lock()
> +          /* For expressions that denote locks,
> +             When the expr is a pointer to a lockable type (i.e. mu.Lock()
>              or Lock(&mu) internally), we don't need the address-taken
>              operator (&).  */
> -          if (lookup_attribute("lockable", TYPE_ATTRIBUTES (TREE_TYPE 
> (base
> +          if (!is_temp_expr &&
> +              lookup_attribute("lockable", TYPE_ATTRIBUTES (TREE_TYPE 
> (base

'&&' goes in the line below.


Diego.


Re: [RFC] Builtin infrastructure change

2011-09-14 Thread Michael Meissner
On Wed, Sep 14, 2011 at 02:33:12PM +, Joseph S. Myers wrote:
> On Tue, 13 Sep 2011, Michael Meissner wrote:
> 
> > implicit_built_in_decls to the new macros/functions, and poison
> > the old names. Make sure 0 is not a legitimate builtin index.
> 
> I think the poisoning belongs in system.h; direct dependencies on the host 
> compiler (such as GCC_VERSION and __GNUC__ tests) should be kept as local 
> as possible, in just a few headers such as system.h and hwint.h, with 
> other places testing feature macros defined in system.h (or through 
> autoconf) if it's really necessary to have such dependencies there.

Ok, I wasn't sure where the best place to put the poison was; system.h
sounds like the best place.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899


Re: RFC: allowing fwprop to propagate subregs

2011-09-14 Thread Paolo Bonzini

On 09/14/2011 05:44 PM, Richard Sandiford wrote:

>  A SUBREG may not be REG nor CONSTANT. Don't you need
>  to check REG_P/CONSTANT_P on SUBREG?


Yeah, good point.  There should be a "&&  REG_P (SUBREG_REG (new_rtx))"
in there.  Probably also worth checking for non-paradoxical subregs.


I was going to suggest the former.

Paradoxical subregs are already handled elsewhere in fwprop, but it also 
shouldn't hurt to include them.


Paolo


Re: [PATCH, PR50322] Fix test-case ivopts-lt.c to use int of same size as pointer.

2011-09-14 Thread Zdenek Dvorak
Hi,

> The attached patch fixes PR50322.
> 
> The test-case is supposed to succeed if the loop counter data-type has the 
> same
> size as a pointer. The patch defines TYPE to be an int datatype of the same 
> size
> as a pointer, and uses it. After this fix, there's no need for the avr xfails
> anymore.
> 
> tested with avr, x86_64 and x86_64 -m32.

what about using uintptr_t?
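
(I.e., roughly -- just a sketch, with the dg directives and the loop body
staying as they are in the existing test:)

#include <stdint.h>

void
f1 (char *p, uintptr_t i, uintptr_t n);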

Zdenek

> Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (revision 178804)
> +++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (working copy)
> @@ -1,8 +1,18 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fdump-tree-ivopts" } */
>  
> +#if (__SIZEOF_LONG_LONG__ == __SIZEOF_POINTER__)
> +#define TYPE unsigned long long int
> +#elif (__SIZEOF_LONG__ == __SIZEOF_POINTER__)
> +#define TYPE unsigned long int
> +#elif (__SIZEOF_INT__ == __SIZEOF_POINTER__)
> +#define TYPE unsigned int
> +#else
> +#error Add target support here
> +#endif
> +
>  void
> -f1 (char *p, unsigned long int i, unsigned long int n)
> +f1 (char *p, TYPE i, TYPE n)
>  {
>p += i;
>do
> @@ -14,8 +24,7 @@ f1 (char *p, unsigned long int i, unsign
>while (i < n);
>  }
>  
> -/* For the fails on avr see PR tree-optimization/50322.  */
> -/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" { xfail { "avr-*-*" } 
> } } } */
> +/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" } } */
>  /* { dg-final { scan-tree-dump-times "PHI  -/* { dg-final { scan-tree-dump-times "p_\[0-9\]* <" 1 "ivopts" { xfail { 
> "avr-*-*" } } } } */
> +/* { dg-final { scan-tree-dump-times "p_\[0-9\]* <" 1 "ivopts" } } */
>  /* { dg-final { cleanup-tree-dump "ivopts" } } */



[PATCH] [MELT] Fix CUMULMD5 leading space

2011-09-14 Thread Alexandre Lissy
Since v3.82, make issued some strange warning on duplicated targets. It
turns out that those were due to a leading space hidden in CUMULMD5
variable, leading to target breakage.
---
 gcc/ChangeLog.MELT |4 
 gcc/melt-build.mk  |   18 +-
 gcc/melt-build.tpl |2 +-
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT
index 78c6ae6..6efd8d9 100644
--- a/gcc/ChangeLog.MELT
+++ b/gcc/ChangeLog.MELT
@@ -1,3 +1,7 @@
+2011-09-14  Alexandre Lissy  
+   * melt-build.tpl: Remove leading space on CUMULMD5, breaking some make
+   targets.
+   * melt-build.mk: Regenerate.
 
 2011-09-14  Basile Starynkevitch  
* melt-runtime.c (melt_dbgshortbacktrace): Print just the basename
diff --git a/gcc/melt-build.mk b/gcc/melt-build.mk
index af9e266..fd9d8a2 100644
--- a/gcc/melt-build.mk
+++ b/gcc/melt-build.mk
@@ -15,7 +15,7 @@
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 # GNU General Public License for more details.
 # 
-#@ from melt-build.tpl line 21 generated by Autogen 5.12 using melt-build.def
+#@ from melt-build.tpl line 21 generated by Autogen 5.11.9 using melt-build.def
 
 
 ## the following Makefile variables are expected to be set
@@ -147,7 +147,7 @@ MELT_GENERATED_FIRST_C_FILES= \
 MELT_GENERATED_FIRST_BASE= \
   $(basename $(notdir $(MELT_GENERATED_FIRST_C_FILES)))
 
-MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-first+meltdesc.c) 
+MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-first+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_BASE_C_FILES= \
@@ -156,7 +156,7 @@ MELT_GENERATED_BASE_C_FILES= \
 MELT_GENERATED_BASE_BASE= \
   $(basename $(notdir $(MELT_GENERATED_BASE_C_FILES)))
 
-MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-base+meltdesc.c) 
+MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-base+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_DEBUG_C_FILES= \
@@ -165,7 +165,7 @@ MELT_GENERATED_DEBUG_C_FILES= \
 MELT_GENERATED_DEBUG_BASE= \
   $(basename $(notdir $(MELT_GENERATED_DEBUG_C_FILES)))
 
-MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c) 
+MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_MACRO_C_FILES= \
@@ -174,7 +174,7 @@ MELT_GENERATED_MACRO_C_FILES= \
 MELT_GENERATED_MACRO_BASE= \
   $(basename $(notdir $(MELT_GENERATED_MACRO_C_FILES)))
 
-MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c) 
+MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_NORMAL_C_FILES= \
@@ -183,7 +183,7 @@ MELT_GENERATED_NORMAL_C_FILES= \
 MELT_GENERATED_NORMAL_BASE= \
   $(basename $(notdir $(MELT_GENERATED_NORMAL_C_FILES)))
 
-MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c) 
+MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_NORMATCH_C_FILES= \
@@ -192,7 +192,7 @@ MELT_GENERATED_NORMATCH_C_FILES= \
 MELT_GENERATED_NORMATCH_BASE= \
   $(basename $(notdir $(MELT_GENERATED_NORMATCH_C_FILES)))
 
-MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c) 
+MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" 
'/melt_cumulated_hexmd5/{print $$2}' 
$(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT_GENERATED_GENOBJ_C_FILES= \
@@ -201,7 +201,7 @@ MELT_GENERATED_GENOBJ_C_FILES= \
 MELT_GENERATED_GENOBJ_BASE= \
   $(basename $(notdir $(MELT_GENERATED_GENOBJ_C_FILES)))
 
-MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c) 
+MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c)
 
 #@ from melt-build.tpl line 135
 MELT

Re: Prevent inliner from removing aliases of used comdats

2011-09-14 Thread Dmitry Gorbachev
Would a testcase from PR50226 suffice?


Re: [trans-mem] Fix compilation of libitm with recent gcc

2011-09-14 Thread Aldy Hernandez
Patrick Marlier  writes:

> Here the error when gcc compiles libitm:
>
> ../../../transactional-memory/libitm/aatree.h: In constructor
> ‘GTM::aa_node_base::aa_node_base(GTM::aa_node_base::level_type)’:
> ../../../transactional-memory/libitm/aatree.h:53:16: error:
> list-initializer for non-class type must not be parenthesized
> [-Werror]
>
> The fix just consists of removing the parentheses.

I have run into the same problem while doing a merge from trunk.  This
will be fixed (with your patch ;-)) once I commit in the next few days.

Oh yeah, expect many changes to the branch once I commit.  We hadn't
done a merge in almost 1.5 years.

Thanks Patrick.


Go patch committed: Fix defer/recover at high optimization levels

2011-09-14 Thread Ian Lance Taylor
The Go defer/panic/recover handling works through a little dance that
includes comparing return addresses to see if a particular call to
recover should pick up a particular call to panic.  This dance only
works if the appropriate functions are not inlined.  I messed up the
condition for whether the functions could be inlined, such that it
usually worked but broke at -O3.  This patch fixes the problem.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r b740d140ea5f go/gogo-tree.cc
--- a/go/gogo-tree.cc	Tue Sep 13 14:32:18 2011 -0700
+++ b/go/gogo-tree.cc	Wed Sep 14 11:30:56 2011 -0700
@@ -1159,8 +1159,11 @@
 
 	  // If a function calls the predeclared recover function, we
 	  // can't inline it, because recover behaves differently in a
-	  // function passed directly to defer.
-	  if (this->calls_recover_ && !this->is_recover_thunk_)
+	  // function passed directly to defer.  If this is a recover
+	  // thunk that we built to test whether a function can be
+	  // recovered, we can't inline it, because that will mess up
+	  // our return address comparison.
+	  if (this->calls_recover_ || this->is_recover_thunk_)
 	DECL_UNINLINABLE(decl) = 1;
 
 	  // If this is a thunk created to call a function which calls
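
The hazard described above is a general one: any logic keyed on a function's
own return address stops working once the inliner folds that function into its
caller.  A minimal C++ analogy of the pitfall (not the Go runtime mechanism
itself), using GCC's __builtin_return_address:

  #include <cstdio>

  // If this helper were inlined, __builtin_return_address (0) would report
  // the caller's caller rather than the helper's own call site, so any
  // comparison against a recorded call site would silently stop matching --
  // the same reason the patch marks recover thunks DECL_UNINLINABLE.
  __attribute__ ((noinline))
  static void *
  observed_call_site (void)
  {
    return __builtin_return_address (0);
  }

  int
  main (void)
  {
    std::printf ("called from %p\n", observed_call_site ());
    return 0;
  }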


Re: Prevent inliner from removing aliases of used comdats

2011-09-14 Thread Jan Hubicka
> Would a testcase from PR50226 suffice?
Sure, fell free to commit it!

Honza


Re: hashtable exception safety patch

2011-09-14 Thread François Dumont

On 09/13/2011 10:14 PM, Paolo Carlini wrote:

Hi,


Rebasing would have forced me to delay the patch by one day, which I tried to 
avoid... without success. Here it is again.

Sorry about that, I was annoyed seeing a bit of - surely, trivial - work I had 
done on the orig patch nullified. Thanks for your patience.

Paolo

Attached patch applied.

2011-09-14  François Dumont
	    Paolo Carlini

	* include/bits/hashtable.h (_Hashtable<>::_M_rehash): Take and
	restore hash policy _M_prev_resize on exception.
	(_Hashtable<>::_M_insert_bucket): Capture hash policy next resize
	before using it and use the latter method to have it restored on
	exception.
	(_Hashtable<>::_M_insert(_Arg&& __v, std::false_type)): Likewise.
	(_Hashtable<>::insert(_InputIterator, _InputIterator)): Likewise.
	(_Hashtable<>::rehash): Likewise.
	* testsuite/23_containers/unordered_set/insert/hash_policy.cc: New.
	* testsuite/23_containers/unordered_multiset/insert/hash_policy.cc:
	Likewise.

François

Index: include/bits/hashtable.h
===
--- include/bits/hashtable.h	(revision 178792)
+++ include/bits/hashtable.h	(working copy)
@@ -458,8 +458,9 @@
   // reserve, if present, comes from _Rehash_base.
 
 private:
-  // Unconditionally change size of bucket array to n.
-  void _M_rehash(size_type __n);
+  // Unconditionally change size of bucket array to n, restore hash policy
+  // resize value to __next_resize on exception.
+  void _M_rehash(size_type __n, size_type __next_resize);
 };
 
 
@@ -743,7 +744,7 @@
   _M_rehash_policy = __pol;
   size_type __n_bkt = __pol._M_bkt_for_elements(_M_element_count);
   if (__n_bkt > _M_bucket_count)
-	_M_rehash(__n_bkt);
+	_M_rehash(__n_bkt, __pol._M_next_resize);
 }
 
   template __do_rehash
 	  = _M_rehash_policy._M_need_rehash(_M_bucket_count,
 	_M_element_count, 1);
@@ -920,14 +922,14 @@
 	__n = this->_M_bucket_index(__k, __code, __do_rehash.second);
 	  }
 
-	// Allocate the new node before doing the rehash so that we don't
-	// do a rehash if the allocation throws.
-	_Node* __new_node = _M_allocate_node(std::forward<_Arg>(__v));
-
+	_Node* __new_node = 0;
 	__try
 	  {
+	// Allocate the new node before doing the rehash so that we
+	// don't do a rehash if the allocation throws.
+	__new_node = _M_allocate_node(std::forward<_Arg>(__v));
 	if (__do_rehash.first)
-	  _M_rehash(__do_rehash.second);
+	  _M_rehash(__do_rehash.second, __saved_next_resize);
 
 	__new_node->_M_next = _M_buckets[__n];
 	this->_M_store_code(__new_node, __code);
@@ -939,7 +941,10 @@
 	  }
 	__catch(...)
 	  {
-	_M_deallocate_node(__new_node);
+	if (!__new_node)
+	  _M_rehash_policy._M_next_resize = __saved_next_resize;
+	else
+	  _M_deallocate_node(__new_node);
 	__throw_exception_again;
 	  }
   }
@@ -981,11 +986,12 @@
 		 _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>::
   _M_insert(_Arg&& __v, std::false_type)
   {
+	const size_type __saved_next_resize = _M_rehash_policy._M_next_resize;
	std::pair<bool, std::size_t> __do_rehash
 	  = _M_rehash_policy._M_need_rehash(_M_bucket_count,
 	_M_element_count, 1);
 	if (__do_rehash.first)
-	  _M_rehash(__do_rehash.second);
+	  _M_rehash(__do_rehash.second, __saved_next_resize);
 
 	const key_type& __k = this->_M_extract(__v);
 	typename _Hashtable::_Hash_code_type __code = this->_M_hash_code(__k);
@@ -1024,11 +1030,12 @@
   insert(_InputIterator __first, _InputIterator __last)
   {
 	size_type __n_elt = __detail::__distance_fw(__first, __last);
+	const size_type __saved_next_resize = _M_rehash_policy._M_next_resize;
	std::pair<bool, std::size_t> __do_rehash
 	  = _M_rehash_policy._M_need_rehash(_M_bucket_count,
 	_M_element_count, __n_elt);
 	if (__do_rehash.first)
-	  _M_rehash(__do_rehash.second);
+	  _M_rehash(__do_rehash.second, __saved_next_resize);
 
 	for (; __first != __last; ++__first)
 	  this->insert(*__first);
@@ -1184,9 +1191,11 @@
 	   _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>::
 rehash(size_type __n)
 {
+  const size_type __saved_next_resize = _M_rehash_policy._M_next_resize;
   _M_rehash(std::max(_M_rehash_policy._M_next_bkt(__n),
 			 _M_rehash_policy._M_bkt_for_elements(_M_element_count
-			  + 1)));
+			  + 1)),
+		__saved_next_resize);
 }
 
   template::
-_M_rehash(size_type __n)
+_M_rehash(size_type __n, size_type __next_resize)
 {
-  _Node** __new_array = _M_allocate_buckets(__n);
+  _Node** __new_array = 0;
   __try
 	{
+	  __new_array = _M_allocate_buckets(__n);
 	  _M_begin_bucket_index = __n;
 	  for (size_type __i = 0; __i < _M_bucket_count; ++__i)
 	while (_Node* __p = _M_buckets[__i])
@@ -1218,15 +1228,23 @@
 	}
   __catch(...)
 	{
-	  // A failure here means that a hash function threw an exception.
-	  // We can't

Re: [RFC] Builtin infrastructure change

2011-09-14 Thread Richard Henderson
On 09/13/2011 04:27 PM, Michael Meissner wrote:
> This patch touches a bunch of files, but most of the changes are fairly
> mechanical.  What this does is change the infrastructure of how machine
> independent builtin functions are handled.  At this point, there is no change
> to the compiler from a user perspective.  Instead of having two parallel 
> arrays
> that have tree pointers for built-in functions:
> 
>   built_in_decls
>   implicit_built_in_decls
> 
> I have replaced these by two macros that call inline functions:
> 
>   BUILT_IN_DECLS
>   IMPLICIT_BUILT_IN_DECLS
> 
> and changed the other parts that want to modify the builtins.  By changing it
> to a functional approach instead of an array, we can more easily change the
> code underneath.

Why bother with the case change and poisoning?  Merely changing the type of the 
symbol from object to function ought to be sufficient to catch all users at 
compile time.

I say this because I'm not keen on functions using all-caps names...


r~
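
As an aside for readers following the thread, a rough sketch of the difference
under discussion: direct indexing of a global array versus going through an
accessor so the storage can change behind it.  Names and bodies here are
illustrative only, not the actual GCC declarations beyond those quoted above:

  #include <cassert>

  struct tree_node;
  typedef tree_node *tree;

  enum built_in_function { BUILT_IN_MEMCPY, BUILT_IN_MEMSET, END_BUILTINS };

  // Old interface: every caller indexes the array directly, so changing the
  // underlying representation means touching every user.
  tree built_in_decls_array[END_BUILTINS];

  // Proposed interface: callers go through an inline function (wrapped in a
  // macro in the patch), so the implementation underneath can change freely.
  static inline tree
  built_in_decl (built_in_function fncode)
  {
    assert (fncode < END_BUILTINS);
    return built_in_decls_array[fncode];
  }
  #define BUILT_IN_DECLS(FNCODE) built_in_decl (FNCODE)

  int
  main ()
  {
    tree old_style = built_in_decls_array[BUILT_IN_MEMCPY];
    tree new_style = BUILT_IN_DECLS (BUILT_IN_MEMCPY);
    return old_style == new_style ? 0 : 1;
  }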


Re: [PATCH 2/7] Generate virtual locations for tokens

2011-09-14 Thread Jason Merrill

On 09/14/2011 04:50 AM, Dodji Seketeli wrote:

To comply with this invariant, the initial patch of Tom was cloning T
into T', and was using T'->src_loc to track the virtual location of T.
Which is close to what you are proposing, while respecting the
invariant.


Yes, that's what I had in mind.  I didn't realize that there was only 
one token for FOO.



But it turned out that was using too much memory :-(.  So
we devised this scheme instead.


Ah. :(


+  void *macro;


This should be a union rather than an untyped pointer.


+  else if (context->tokens_kind == TOKENS_KIND_EXTENDED)
+   {
+ /* So we are in presence of an extended token context, which
+means that each token in this context has a virtual
+location attached to it.  So let's not forget to update
+the pointer to the current virtual location of the
+current token when we update the pointer to the current
+token */
+
+ rhs = *FIRST (context).ptoken++;
+ if (context->macro)


The other places that deal with TOKENS_KIND_EXTENDED don't test that 
context->macro is non-null.  Why is it needed here?



+{
+  cpp_hashnode *macro;
+  if (context->tokens_kind == TOKENS_KIND_EXTENDED)
+   {
+ macro_context *mc = (macro_context *) context->macro;
+ macro = mc->macro_node;
+ /* If context->buff is set, it means the life time of tokens
+is bound to the life time of this context; so we must
+free the tokens; that means we must free the virtual
+locations of these tokens too.  */
+ if (context->buff && mc->virt_locs)
+   {
+ free (mc->virt_locs);
+ mc->virt_locs = NULL;
+   }
+ free (mc);
+ context->macro = NULL;
+   }
+  else
+   macro = (cpp_hashnode *) context->macro;
+
+  if (macro != NULL)
+   macro->flags &= ~NODE_DISABLED;


How can macro end up NULL if context->macro was set?


+/* In the traditionnal mode of the preprocessor, if we are currently
+   location if we are in the traditionnal mode, and just returns


"traditional"

I don't think we need to talk about virtual locations before 
cpp_get_token_1; it's not an external interface, and it's redundant with 
the description before cpp_get_token_with_location.



+  result = cpp_get_token_1 (pfile, loc);
   if (pfile->context->macro)
-*loc = pfile->invocation_location;
+{
+  if (!CPP_OPTION (pfile, track_macro_expansion))
+   *loc = pfile->invocation_location;
+}
   else
 *loc = result->src_loc;

+  *loc = maybe_adjust_loc_for_trad_cpp (pfile, *loc);


Let's move this code into cpp_get_token_1 so that all the location 
tweaking is in one place.



+  switch (it->kind)
+{
+case MACRO_ARG_TOKEN_NORMAL:
+case MACRO_ARG_TOKEN_EXPANDED:
+  it->token_ptr++;
+  if (track_macro_exp_p)
+   it->location_ptr++;
+  break;
+case MACRO_ARG_TOKEN_STRINGIFIED:
+#ifdef ENABLE_CHECKING
+  if (it->num_forwards > 0)
+   abort ();
+  it->num_forwards++;
+#endif
+  break;
+}


Don't you want to increment num_forwards in the normal/expanded cases, too?


+tokens_buff_append_token (cpp_reader *pfile,
+ _cpp_buff *buffer,
+ source_location *virt_locs,
+ const cpp_token *token,
+ source_location def_loc,
+ source_location parm_def_loc,
+ const struct line_map *map,
+ unsigned int *macro_token_index)


Why is macro_token_index a pointer?  Nothing seems to modify the referent.


+/* Appends a token to the end of the token buffer BUFFER.  Note that
+   this function doesn't enlarge BUFFER; it overwrite the last memory
+   location of BUFFER that holds a token.


That doesn't sound like appending.

Jason
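
On the union point, a small sketch of what a discriminated union for
context->macro could look like.  The types here are hypothetical stand-ins,
not libcpp's actual declarations:

  #include <cstdio>

  struct macro_context { int dummy; };
  struct cpp_hashnode { int dummy; };

  enum context_tokens_kind { TOKENS_KIND_DIRECT, TOKENS_KIND_EXTENDED };

  struct cpp_context_sketch
  {
    context_tokens_kind tokens_kind;
    union
    {
      cpp_hashnode *macro;   /* live when tokens_kind == TOKENS_KIND_DIRECT */
      macro_context *mc;     /* live when tokens_kind == TOKENS_KIND_EXTENDED */
    } m;
  };

  int
  main ()
  {
    cpp_hashnode node = { 0 };
    cpp_context_sketch ctx;
    ctx.tokens_kind = TOKENS_KIND_DIRECT;
    ctx.m.macro = &node;

    /* The discriminant says which member is live, instead of every use site
       casting a void *.  */
    if (ctx.tokens_kind == TOKENS_KIND_DIRECT)
      std::printf ("direct context, macro node %p\n", (void *) ctx.m.macro);
    return 0;
  }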


Go patch committed: Fix handling of structs with embedded builtin types

2011-09-14 Thread Ian Lance Taylor
The Go frontend was mishandling structs with embedded builtin types, as
in

struct {
int
}

This is because builtin type names are not packed with the package name,
whereas field names are.  Rather than try to sort out that confusion, I
just added another hack around it.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 24e17efc20cf go/expressions.cc
--- a/go/expressions.cc	Wed Sep 14 11:46:20 2011 -0700
+++ b/go/expressions.cc	Wed Sep 14 15:23:39 2011 -0700
@@ -12640,6 +12640,16 @@
 		  {
 			const Struct_field* sf = st->field(fre->field_index());
 			name = sf->field_name();
+
+			// See below.  FIXME.
+			if (!Gogo::is_hidden_name(name)
+			&& name[0] >= 'a'
+			&& name[0] <= 'z')
+			  {
+			if (gogo->lookup_global(name.c_str()) != NULL)
+			  name = gogo->pack_hidden_name(name, false);
+			  }
+
 			char buf[20];
 			snprintf(buf, sizeof buf, "%u", fre->field_index());
 			size_t buflen = strlen(buf);
@@ -12671,7 +12681,7 @@
 
 	  // A predefined name won't be packed.  If it starts with a
 	  // lower case letter we need to check for that case, because
-	  // the field name will be packed.
+	  // the field name will be packed.  FIXME.
 	  if (!Gogo::is_hidden_name(name)
 	  && name[0] >= 'a'
 	  && name[0] <= 'z')
@@ -13505,7 +13515,10 @@
 Ast_dump_context* ast_dump_context) const
 {
   ast_dump_context->ostream() <<  "unsafe.Offsetof(";
-  ast_dump_context->ostream() << this->field_->field_name();
+  ast_dump_context->dump_type(this->type_);
+  ast_dump_context->ostream() << '.';
+  ast_dump_context->ostream() <<
+Gogo::message_name(this->field_->field_name());
   ast_dump_context->ostream() << ")";
 }
 
diff -r 24e17efc20cf go/types.cc
--- a/go/types.cc	Wed Sep 14 11:46:20 2011 -0700
+++ b/go/types.cc	Wed Sep 14 15:23:39 2011 -0700
@@ -1297,28 +1297,28 @@
   if (!this->has_pointer())
 runtime_type_kind |= RUNTIME_TYPE_KIND_NO_POINTERS;
   Struct_field_list::const_iterator p = fields->begin();
-  go_assert(p->field_name() == "Kind");
+  go_assert(p->is_field_name("Kind"));
   mpz_t iv;
   mpz_init_set_ui(iv, runtime_type_kind);
   vals->push_back(Expression::make_integer(&iv, p->type(), bloc));
 
   ++p;
-  go_assert(p->field_name() == "align");
+  go_assert(p->is_field_name("align"));
   Expression::Type_info type_info = Expression::TYPE_INFO_ALIGNMENT;
   vals->push_back(Expression::make_type_info(this, type_info));
 
   ++p;
-  go_assert(p->field_name() == "fieldAlign");
+  go_assert(p->is_field_name("fieldAlign"));
   type_info = Expression::TYPE_INFO_FIELD_ALIGNMENT;
   vals->push_back(Expression::make_type_info(this, type_info));
 
   ++p;
-  go_assert(p->field_name() == "size");
+  go_assert(p->is_field_name("size"));
   type_info = Expression::TYPE_INFO_SIZE;
   vals->push_back(Expression::make_type_info(this, type_info));
 
   ++p;
-  go_assert(p->field_name() == "hash");
+  go_assert(p->is_field_name("hash"));
   mpz_set_ui(iv, this->hash_for_method(gogo));
   vals->push_back(Expression::make_integer(&iv, p->type(), bloc));
 
@@ -1327,7 +1327,7 @@
   this->type_functions(&hash_fn, &equal_fn);
 
   ++p;
-  go_assert(p->field_name() == "hashfn");
+  go_assert(p->is_field_name("hashfn"));
   Function_type* fntype = p->type()->function_type();
   Named_object* no = Named_object::make_function_declaration(hash_fn, NULL,
 			 fntype,
@@ -1336,14 +1336,14 @@
   vals->push_back(Expression::make_func_reference(no, NULL, bloc));
 
   ++p;
-  go_assert(p->field_name() == "equalfn");
+  go_assert(p->is_field_name("equalfn"));
   fntype = p->type()->function_type();
   no = Named_object::make_function_declaration(equal_fn, NULL, fntype, bloc);
   no->func_declaration_value()->set_asm_name(equal_fn);
   vals->push_back(Expression::make_func_reference(no, NULL, bloc));
 
   ++p;
-  go_assert(p->field_name() == "string");
+  go_assert(p->is_field_name("string"));
   Expression* s = Expression::make_string((name != NULL
 	   ? name->reflection(gogo)
 	   : this->reflection(gogo)),
@@ -1351,7 +1351,7 @@
   vals->push_back(Expression::make_unary(OPERATOR_AND, s, bloc));
 
   ++p;
-  go_assert(p->field_name() == "uncommonType");
+  go_assert(p->is_field_name("uncommonType"));
   if (name == NULL && methods == NULL)
 vals->push_back(Expression::make_nil(bloc));
   else
@@ -1365,7 +1365,7 @@
 }
 
   ++p;
-  go_assert(p->field_name() == "ptrToThis");
+  go_assert(p->is_field_name("ptrToThis"));
   if (name == NULL)
 vals->push_back(Expression::make_nil(bloc));
   else
@@ -1402,10 +1402,10 @@
   vals->reserve(3);
 
   Struct_field_list::const_iterator p = fields->begin();
-  go_assert(p->field_name() == "name");
+  go_assert(p->is_field_name("name"));
 
   ++p;
-  go_assert(p->field_name() == "pkgPath");
+  go_assert(p->is_field_name("pkgPath"));
 
   if (name == NULL)
 {
@@ -1444,7 +1444,7 @@
 }
 
   ++p;
-  go_assert(p->field_name() == "methods");
+  go_assert(p->is_field_name("
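
The mechanical part of the patch replaces string comparisons on field_name()
with an is_field_name() predicate that accepts both the plain builtin spelling
and a package-qualified spelling.  A rough sketch of that idea, with invented
helpers (the real Gogo name-packing scheme is more involved):

  #include <iostream>
  #include <string>

  // Invented helper: strip a "pkg." prefix if present.  Only illustrates the
  // shape of the check; the frontend packs hidden names differently.
  static std::string
  strip_package_prefix (const std::string &stored)
  {
    std::string::size_type dot = stored.rfind ('.');
    return dot == std::string::npos ? stored : stored.substr (dot + 1);
  }

  struct Struct_field_sketch
  {
    std::string stored_name;  // "Kind" (builtin) or "runtime.Kind" (packed)

    bool
    is_field_name (const std::string &name) const
    {
      return stored_name == name || strip_package_prefix (stored_name) == name;
    }
  };

  int
  main ()
  {
    Struct_field_sketch plain = { "Kind" };
    Struct_field_sketch packed = { "runtime.Kind" };
    std::cout << plain.is_field_name ("Kind") << ' '
              << packed.is_field_name ("Kind") << '\n';  // prints "1 1"
    return 0;
  }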

Re: Initial shrink-wrapping patch

2011-09-14 Thread Richard Henderson
On 09/13/2011 08:36 AM, Bernd Schmidt wrote:
> On 09/13/11 15:05, Richard Sandiford wrote:
>> It just feels like checking for trap_if or turning off cross-jumping
>> are working around problems in the representation of shrink-wrapped
>> functions.  There should be something in the IL to say that those
>> two blocks cannot be merged for CFI reasons. 
> 
> There is - JUMP_LABELs and such, and the simple_return vs return
> distinction. This works for essentially all the interesting cases. The
> problem here is that we don't have a jump as the last insn. So how about
> the solution in crossjumping as below?
> 
>> Maybe two flags on
>> the basic block to say whether they start (resp. end) with the
>> "wrapped" version of the CFI?  (Which unfortunately would need
>> to be checked explicitly.)
> 
> I think that's overdesigning it, and it breaks as soon as something
> discards the bb info (reorg...) or puts a label in the middle of a
> prologue or epilogue.  Keeping that up-to-date would be much more
> fragile than just manually dealing with the few cases where we can't
> tell what's going on.
> 
>> OTOH, if another reviewer thinks that's unreasonable, I'll happily
>> defer to them.
> 
> Cc'ing rth for a second opinion...

It feels hacky, but I don't have anything better to suggest.
I think the patch is ok.


r~


Handle multi-word registers in REG_CFA_RESTORE notes

2011-09-14 Thread Bernd Schmidt
While testing the latest iteration of shrink-wrapping on mips-elf, a new
failure showed up in gcc.dg/pr43139.c. When restoring floating-point
registers, we attach REG_CFA_RESTORE notes for DFmode registers, but the
dwarf2cfi code only records a single regno for such a multi-word hard reg.

Fixed with this patch, tested with shrink-wrapping on mips-elf (default
multilib plus two with -mips16). Ok?


Bernd
* dwarf2cfi.c (dwarf2out_frame_debug_cfa_restore): Handle
multi-word registers.

Index: gcc/dwarf2cfi.c
===
--- gcc/dwarf2cfi.c (revision 178734)
+++ gcc/dwarf2cfi.c (working copy)
@@ -1236,10 +1236,15 @@ dwarf2out_frame_debug_cfa_expression (rt
 static void
 dwarf2out_frame_debug_cfa_restore (rtx reg)
 {
-  unsigned int regno = dwf_regno (reg);
+  unsigned int orig_regno = REGNO (reg);
+  int nregs = hard_regno_nregs[orig_regno][GET_MODE (reg)];
+  while (nregs-- > 0)
+{
+  unsigned int regno = DWARF_FRAME_REGNUM (orig_regno + nregs);
 
-  add_cfi_restore (regno);
-  update_row_reg_save (cur_row, regno, NULL);
+  add_cfi_restore (regno);
+  update_row_reg_save (cur_row, regno, NULL);
+}
 }
 
 /* A subroutine of dwarf2out_frame_debug, process a REG_CFA_WINDOW_SAVE.
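
The shape of the fix: instead of emitting a single restore for the register
named in the note, iterate over every hard register the mode occupies.  A
hedged sketch of that expansion with invented helpers (in reality both the
word size and the DWARF register numbering are target-specific):

  #include <cstdio>
  #include <vector>

  // Invented stand-ins: assume 4-byte hard registers and DWARF numbers that
  // coincide with hard register numbers.  Real targets map both.
  static std::vector<unsigned>
  dwarf_restores_for_reg (unsigned first_regno, unsigned mode_size_in_bytes)
  {
    unsigned nregs = (mode_size_in_bytes + 3) / 4;  // words covered by the mode
    std::vector<unsigned> regs;
    while (nregs-- > 0)
      regs.push_back (first_regno + nregs);   // mirrors the loop in the patch
    return regs;
  }

  int
  main ()
  {
    // A DFmode (8-byte) value starting at hard register 44 spans two words,
    // so it needs two restore entries, not one.
    for (unsigned r : dwarf_restores_for_reg (44, 8))
      std::printf ("restore DWARF reg %u\n", r);
    return 0;
  }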


[4.6] Fix set-but-not-used warning [committed] (issue5013044)

2011-09-14 Thread Diego Novillo

The GCC version we use internally to build 4.6 treats unused variable
warnings as errors.  This was causing bootstraps to fail.

Tested on x86_64.  Committed to gcc-4_6-branch.


* tree-vect-stmts.c (vect_transform_stmt): Remove unused
local variable ORIG_SCALAR_STMT.

Index: tree-vect-stmts.c
===
--- tree-vect-stmts.c   (revision 178875)
+++ tree-vect-stmts.c   (working copy)
@@ -4814,7 +4814,7 @@ vect_transform_stmt (gimple stmt, gimple
   bool is_store = false;
   gimple vec_stmt = NULL;
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  gimple orig_stmt_in_pattern, orig_scalar_stmt = stmt;
+  gimple orig_stmt_in_pattern;
   bool done;
 
   switch (STMT_VINFO_TYPE (stmt_info))

--
This patch is available for review at http://codereview.appspot.com/5013044


[ARM testsuite] remove option "-march=armv5te" for pr46934.c testcase

2011-09-14 Thread Bin Cheng
Hi,
Bug PR46934 exists on all thumb1 targets, for example cortex-m0.
This patch removes the "-march=armv5te" option from testcase pr46934.c,
allowing the test to be run on all thumb1 targets, not just armv5te.

Is it ok? Thanks.

gcc/testsuite/ChangeLog:
2011-09-15  Cheng Bin 
   * gcc.target/arm/pr46934.c: Remove -march=armv5te option.

Thanks-chengbin

relax-option-for-pr46934-testcase.patch
Description: Binary data


[ARM] pass "--be8" to linker when linking for M profile

2011-09-14 Thread Bin Cheng
Hi,
The linker should do endian swizzling at link time according to the "--be8"
option. This patch modifies BE8_LINK_SPEC by adding the cortex-m architectures
and processors to the specs string.

Since R-profile supports configurable big-endian instruction fetch, I didn't
include it here.

Is it ok? Thanks.

2011-09-15  Cheng Bin 
* config/arm/bpabi.h (BE8_LINK_SPEC): Add cortex-m architectures and
processors.

Thanks-chengbin

gcc-be8-for-m-profile.patch
Description: Binary data