[PATCH] Fix PR56150

2013-01-31 Thread Richard Biener

This fixes PR56150 - mixed store/loads are not equal.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-01-30  Richard Biener  

PR tree-optimization/56150
* tree-ssa-tail-merge.c (gimple_equal_p): Properly handle
mixed store/non-store stmts.

* gcc.dg/torture/pr56150.c: New testcase.

Index: gcc/tree-ssa-tail-merge.c
===
*** gcc/tree-ssa-tail-merge.c   (revision 195574)
--- gcc/tree-ssa-tail-merge.c   (working copy)
*** gimple_equal_p (same_succ same_succ, gim
*** 1119,1135 
  case GIMPLE_ASSIGN:
lhs1 = gimple_get_lhs (s1);
lhs2 = gimple_get_lhs (s2);
!   if (gimple_vdef (s1))
!   {
! if (vn_valueize (gimple_vdef (s1)) != vn_valueize (gimple_vdef (s2)))
!   return false;
! if (TREE_CODE (lhs1) != SSA_NAME
! && TREE_CODE (lhs2) != SSA_NAME)
!   return true;
!   }
!   return (TREE_CODE (lhs1) == SSA_NAME
! && TREE_CODE (lhs2) == SSA_NAME
! && vn_valueize (lhs1) == vn_valueize (lhs2));
  
  case GIMPLE_COND:
t1 = gimple_cond_lhs (s1);
--- 1119,1132 
  case GIMPLE_ASSIGN:
lhs1 = gimple_get_lhs (s1);
lhs2 = gimple_get_lhs (s2);
!   if (TREE_CODE (lhs1) != SSA_NAME
! && TREE_CODE (lhs2) != SSA_NAME)
!   return (vn_valueize (gimple_vdef (s1))
!   == vn_valueize (gimple_vdef (s2)));
!   else if (TREE_CODE (lhs1) == SSA_NAME
!  && TREE_CODE (lhs2) == SSA_NAME)
!   return vn_valueize (lhs1) == vn_valueize (lhs2);
!   return false;
  
  case GIMPLE_COND:
t1 = gimple_cond_lhs (s1);
Index: gcc/testsuite/gcc.dg/torture/pr56150.c
===
*** gcc/testsuite/gcc.dg/torture/pr56150.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr56150.c  (working copy)
***
*** 0 
--- 1,17 
+ /* { dg-do compile } */
+ 
+ struct {
+ int f4;
+ } g1;
+ 
+ long g2;
+ 
+ volatile long g3;
+ 
+ void func_1 ()
+ {
+   if (g2)
+ g1 = g1;
+   else
+ g3;
+ }
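The fixed comparison logic can be sketched as a small standalone model (the struct and function below are illustrative stand-ins, not GCC's internal API): two stores compare by their valueized virtual definition, two register assignments compare by their valueized lhs, and a mixed store/non-store pair never compares equal.

```c
#include <stdbool.h>

/* Toy model of the fixed GIMPLE_ASSIGN case of gimple_equal_p.
   "stmt" and "assign_equal_p" are illustrative names, not GCC's API:
   a statement is reduced to whether its lhs is an SSA name (a
   register assignment) or not (a store), plus one value number
   standing in for vn_valueize of the lhs resp. the vdef.  */
struct stmt
{
  bool lhs_is_ssa_name;	/* register assignment if true, store if false */
  int value;		/* valueized lhs (SSA name) or vdef (store) */
};

static bool
assign_equal_p (const struct stmt *s1, const struct stmt *s2)
{
  /* Two stores: equal iff their virtual definitions valueize alike.  */
  if (!s1->lhs_is_ssa_name && !s2->lhs_is_ssa_name)
    return s1->value == s2->value;
  /* Two register assignments: equal iff the valueized lhs match.  */
  if (s1->lhs_is_ssa_name && s2->lhs_is_ssa_name)
    return s1->value == s2->value;
  /* Mixed store/non-store: never equal -- the bug was that the old
     code could fall through and treat such pairs as equal.  */
  return false;
}
```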


[PATCH] More PR56150 slow compile-time fixes

2013-01-31 Thread Richard Biener

This reverts the change to go into loop-closed SSA form for
virtual operands.  Nothing relies on it, and the verifier doesn't
check that it holds.  Furthermore, SSA updating will destroy loop-closed
virtual SSA form very quickly.  As the PR shows, it can be quite
costly to go into loop-closed SSA form for virtuals (50% of
compile-time).

So the following patch reverts the change for now.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-01-31  Richard Biener  

PR tree-optimization/56150
* tree-ssa-loop-manip.c (find_uses_to_rename_stmt): Do not
visit virtual operands.
(find_uses_to_rename_bb): Likewise.

Index: gcc/tree-ssa-loop-manip.c
===
*** gcc/tree-ssa-loop-manip.c   (revision 195574)
--- gcc/tree-ssa-loop-manip.c   (working copy)
*** find_uses_to_rename_stmt (gimple stmt, b
*** 402,408 
if (is_gimple_debug (stmt))
  return;
  
!   FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_USES)
  find_uses_to_rename_use (bb, var, use_blocks, need_phis);
  }
  
--- 402,408 
if (is_gimple_debug (stmt))
  return;
  
!   FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_USE)
  find_uses_to_rename_use (bb, var, use_blocks, need_phis);
  }
  
*** find_uses_to_rename_bb (basic_block bb,
*** 422,429 
  for (bsi = gsi_start_phis (e->dest); !gsi_end_p (bsi); gsi_next (&bsi))
{
  gimple phi = gsi_stmt (bsi);
!   find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e),
!use_blocks, need_phis);
}
  
for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
--- 422,430 
  for (bsi = gsi_start_phis (e->dest); !gsi_end_p (bsi); gsi_next (&bsi))
{
  gimple phi = gsi_stmt (bsi);
!   if (! virtual_operand_p (gimple_phi_result (phi)))
! find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e),
!  use_blocks, need_phis);
}
  
for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))


Re: [PATCH][RFC] Add -fno-aggressive-loop-optimizations

2013-01-31 Thread Richard Biener
On Wed, 30 Jan 2013, Pat Haugen wrote:

> On 01/29/2013 04:53 AM, Richard Biener wrote:
> > I'm curious about the affect of -fno-aggressive-loop-optimizations
> > on SPEC CPU 2006 numbers (not curious enough to try for myself
> > though).  Both on extra PASSes for official latest sources
> > (I have no access to those) and on performance.
> The patch/option result in both 464.h264ref and 416.gamess passing (as opposed
> to infinite loop). As for performance, I didn't see any difference outside of
> noise range for both 32-bit and 64-bit runs on PowerPC.

Ok, I'll go ahead and apply the patch then.

Richard.


Re: [PATCH][RFC] Add -fno-aggressive-loop-optimizations

2013-01-31 Thread Richard Biener
On Thu, 31 Jan 2013, Richard Biener wrote:

> On Wed, 30 Jan 2013, Pat Haugen wrote:
> 
> > On 01/29/2013 04:53 AM, Richard Biener wrote:
> > > I'm curious about the affect of -fno-aggressive-loop-optimizations
> > > on SPEC CPU 2006 numbers (not curious enough to try for myself
> > > though).  Both on extra PASSes for official latest sources
> > > (I have no access to those) and on performance.
> > The patch/option result in both 464.h264ref and 416.gamess passing (as 
> > opposed
> > to infinite loop). As for performance, I didn't see any difference outside 
> > of
> > noise range for both 32-bit and 64-bit runs on PowerPC.
> 
> Ok, I'll go ahead and apply the patch then.

Done.  I propose the following addition to changes.html.

Ok?

Thanks,
Richard.

2013-01-31  Richard Biener  

* changes.html: Mention -fno-aggressive-loop-optimizations.

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.89
diff -u -r1.89 changes.html
--- changes.html27 Jan 2013 13:44:12 -  1.89
+++ changes.html31 Jan 2013 09:05:20 -
@@ -30,6 +30,13 @@
 The G++ namespace association extension, __attribute 
((strong)),
 has been deprecated. Inline namespaces should be used instead.
 
+GCC now uses a more aggressive analysis to derive an upper bound for
+the number of iterations of loops using constraints imposed by language
+standards.  This may cause non-conforming programs to no longer work as
+expected, such as SPEC CPU 2006 464.h264ref and 416.gamess.  A new
+option, -fno-aggressive-loop-optimizations, was added
+to disable this aggressive analysis.
+
 On ARM, a bug has been fixed in GCC's implementation of the AAPCS
 rules for the layout of vectors that could lead to wrong code being
 generated.  Vectors larger than 8 bytes in size are now by default
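As a sketch of what the changes.html note describes (a hypothetical example, not code from SPEC): in the conforming function below, an access a[i] with i >= 4 would be undefined behavior, so the analysis may conclude the loop body executes at most 4 times regardless of n.

```c
/* Hypothetical illustration of the aggressive upper-bound analysis:
   since a[i] with i >= 4 would be undefined behavior, GCC may derive
   that this loop iterates at most 4 times whatever N is, and e.g.
   unroll it completely.  A non-conforming program that relies on
   reading past the end of the array would then misbehave, which is
   what -fno-aggressive-loop-optimizations disables.  */
static int a[4];

int
first_zero (int n)
{
  for (int i = 0; i < n; i++)
    if (a[i] == 0)
      return i;
  return -1;
}
```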


Re: [patch libiberty]: Fix PR 543413

2013-01-31 Thread Rainer Orth
Kai Tietz  writes:

> this patch fixes wrong handling of cases that bitness of size_t is
> wider as 32-bit.
>
> ChangeLog
>
> 2013-01-30  Kai Tietz  
>
>   PR other/543413

 ^ this is clearly wrong (6 digits)

>   * md5.c (md5_process_block):  Handle case that size_t is
>   a wider-integer-scalar a 32-bit unsigned integer.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: SLP for vectors

2013-01-31 Thread Richard Biener
On Wed, Jan 30, 2013 at 9:45 PM, Marc Glisse  wrote:
> On Tue, 29 Jan 2013, Richard Biener wrote:
>
>> On Sun, Jan 27, 2013 at 4:28 PM, Marc Glisse  wrote:
>>>
>>> Hello,
>>>
>>> this message is to check that I am not doing something absurd and ask for
>>> a
>>> bit of advice.
>>>
>>> In the attached patch, I let SLP recognize vector loads/stores just like
>>> it
>>> recognizes those in an array. It has a few issues: the cost of the
>>> vectorized version is overestimated, the base object is wrong (doesn't
>>> strip
>>> the bit_field_ref, but since the base address is right and the base
>>> object
>>> is almost unused...), but most importantly it only works if the vectors
>>> have
>>> their address taken, otherwise (after asking gimple_vuse?) SLP doesn't
>>> detect their use as loads or stores at all.
>>
>>
>> Yes, if they have not their address taken they are not loads.
>>
>>> Also, it only works if you write
>>> result[0]=..., result[1]=... and does not recognize a constructor as a
>>> series of stores.
>>>
>>> Is slowly extending SLP the right way to go? Should
>>> get_references_in_stmt
>>> report vector element accesses?
>>
>>
>> It does if it is a memory access.
>>
>> You didn't provide a new testcase so it's hard for me to guess what you
>> are trying to do.  I suppose you want to vectorize
>>
>>   w[0] = v[0] + v[0];
>>   w[1] = v[1] + v[1];
>>
>> into
>>
>>   w = v + v;
>>
>> As it would work if w and v are array accesses instead of vector
>> subscripts?
>
>
> Yes, sorry for not being more precise. Vectorization that works for an array
> should work (even better) for a vector.
>
>
>> Note that similar issues (and bugreports) exist for complex component
>> accesses.
>
>
> One extension at a time :-)
>
>
>> As of handling non-memory BIT_FIELD_REFs - I suggest to split out
>> the test of whether a stmt is considered a "load" for the purpose of
>> vectorization and simply special-case this therein, not requiring a VUSE.
>
>
> Ok. The hardest part will be preventing the pass from creating ADDR_EXPR of
> those vectors, or at least folding them before the pass is done.

Oh ... I suppose the "loads"/"stores" need simply be specially handled.

>
>> I suppose the data-ref analysis parts are not strictly required, nor
>> the get_addr_base_and_unit_offset_1 parts?
>
>
> Actually it is necessary (at least the get_addr_base_and_unit_offset_1 part
> is) otherwise I get a gimple verification failure because we have an
> ADDR_EXPR of a BIT_FIELD_REF.

Hm.  Certainly handling BIT_FIELD_REF in get_addr_base_and_unit_offset_1
is ok, but we should ensure by other means that the address of a
BIT_FIELD_REF is never taken ...

>> They should be split out
>> and separately tested anyway.  The data-ref part at least will confuse
>> analysis of 'a[0]' or 'a[1]' vs. 'a', but I suppose independent of the
>> patch.
>
>
> Ah... Could you be more specific? Are you thinking about arrays of vectors?

No, I was thinking of code that accesses both the whole vector and the
vector piecewise.  Mixing this kind of accesses will confuse data-ref
analysis I believe.  But it also will disable vectorization because there are
already vector types in the IL ... so it's probably a minor issue for now.
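For reference, the transformation under discussion can be written out with GCC's generic vector extension (a sketch of the desired optimization result, not of the patch itself): the element-wise statements would ideally become a single vector addition.

```c
/* What SLP would ideally do for vector subscripts: turn the four
   element-wise statements into the single statement w = v + v.
   Uses GCC's vector extension; v4si is a local convenience typedef.  */
typedef int v4si __attribute__ ((vector_size (4 * sizeof (int))));

v4si
add_elementwise (v4si v)
{
  v4si w;
  w[0] = v[0] + v[0];
  w[1] = v[1] + v[1];
  w[2] = v[2] + v[2];
  w[3] = v[3] + v[3];
  return w;	/* the vectorized form would be: return v + v; */
}
```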

Richard.

> Thanks for the help,
>
> --
> Marc Glisse


Re: [patch libiberty]: Fix PR 543413

2013-01-31 Thread Kai Tietz
2013/1/31 Rainer Orth :
> Kai Tietz  writes:
>
>> this patch fixes wrong handling of cases that bitness of size_t is
>> wider as 32-bit.
>>
>> ChangeLog
>>
>> 2013-01-30  Kai Tietz  
>>
>>   PR other/543413
>
>  ^ this is clearly wrong (6 digits)
>
>>   * md5.c (md5_process_block):  Handle case that size_t is
>>   a wider-integer-scalar a 32-bit unsigned integer.
>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University

Thanks, but your heads-up is too late; I corrected this already.

Cheers,
Kai


[PATCH] Avoid .debug_loc loclist entries that look like loclist terminators (PR debug/56154)

2013-01-31 Thread Jakub Jelinek
Hi!

As the pr56154-{1,2,3}.c testcases show, sometimes we start a location
list with an empty range (either due to PR49382, or because the first
insn in the function is inline asm (or many of them) which has zero
size and the compiler doesn't know it).  If
!have_multiple_function_sections, we emit .debug_loc addresses relative
to .Ltext0 symbol and DW_AT_low_pc of the CU is .Ltext0, otherwise they
are emitted as absolute addresses, there is DW_AT_ranges on the CU and
DW_AT_low_pc is 0.  If an empty range, either known by the compiler
or unknown (because of empty inline asm, or say inline asm which is
.section someothersection; emits lots of stuff there; .previous)
is emitted for the beginning of the first function emitted for the CU,
then we can end up with e.g.
  .LVL0 - .Ltext0
  .LVL0 - .Ltext0
range where .LVL0 == .Ltext0, or say
  .LVL0 - .Ltext0
  .LVL23 - .Ltext0
range where .LVL0 == .LVL1 == ... == .LVL23 == .Ltext0.  Unfortunately
that is
  0
  0
and 0, 0 is location list terminator entry, rather than valid empty range.
In that case, neither the empty range is usable by the debug info consumer,
nor if there are any other ranges in the location list after it.
E.g. readelf -wo then complains there is garbage hole in the .debug_loc
section.
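The ambiguity can be modeled in a few lines of C (a toy model of the .debug_loc encoding, not code from GCC or any consumer): entries are (begin, end) address pairs relative to a base symbol, and the pair (0, 0) is the list terminator, so an empty range starting exactly at the base address is indistinguishable from the end of the list.

```c
/* Toy model of a .debug_loc location list: (begin, end) address pairs
   relative to a base symbol, terminated by the pair (0, 0).  An empty
   range that starts exactly at the base also encodes as (0, 0), so a
   consumer stops there and every entry after it is lost.  */
struct loc_entry
{
  unsigned long begin, end;
};

/* Number of entries a consumer reads before hitting what looks like
   the terminator.  */
static int
visible_entries (const struct loc_entry *list)
{
  int n = 0;
  while (list[n].begin != 0 || list[n].end != 0)
    n++;
  return n;
}
```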

Fixed by checking for this on the first function, and if any empty range
is detected at the beginning of the function in some location list,
we force have_multiple_function_sections (and thus, DW_AT_ranges for
the CU, DW_AT_low_pc 0 and absolute .debug_loc addresses).

This patch doesn't seem to affect .debug_{info,loc,ranges} sizes on
stage3 cc1plus binaries on either x86_64-linux or i686-linux, so I guess
it doesn't hit that often, and even when it hits, it results in tiny
decrease of size of .debug_info, tiny growth in .debug_ranges, no growth in
.debug_loc, just possibly many relocations against .debug_loc (but that
affects just *.o/*.a files).  The alternative discussed was to force
in such loclists a base selection entry, but that would be more work
on the gcc side and .debug_loc size could grow more.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-01-31  Jakub Jelinek  

PR debug/56154
* dwarf2out.c (dwarf2_debug_hooks): Set end_function hook to
dwarf2out_end_function.
(in_first_function_p, maybe_at_text_label_p,
first_loclabel_num_not_at_text_label): New variables.
(dwarf2out_var_location): In the first function find out
lowest loclabel_num N where .LVLN is known not to be equal
to .Ltext0.
(find_empty_loc_ranges_at_text_label, dwarf2out_end_function): New
functions.

* gcc.dg/guality/pr56154-1.c: New test.
* gcc.dg/guality/pr56154-2.c: New test.
* gcc.dg/guality/pr56154-3.c: New test.
* gcc.dg/guality/pr56154-4.c: New test.
* gcc.dg/guality/pr56154-aux.c: New file.

--- gcc/dwarf2out.c.jj  2013-01-11 09:02:48.0 +0100
+++ gcc/dwarf2out.c 2013-01-30 16:18:58.362552894 +0100
@@ -2351,6 +2351,7 @@ static void dwarf2out_imported_module_or
 static void dwarf2out_abstract_function (tree);
 static void dwarf2out_var_location (rtx);
 static void dwarf2out_begin_function (tree);
+static void dwarf2out_end_function (unsigned int);
 static void dwarf2out_set_name (tree, tree);
 
 /* The debug hooks structure.  */
@@ -2378,7 +2379,7 @@ const struct gcc_debug_hooks dwarf2_debu
 #endif
   dwarf2out_end_epilogue,
   dwarf2out_begin_function,
-  debug_nothing_int,   /* end_function */
+  dwarf2out_end_function,  /* end_function */
   dwarf2out_function_decl, /* function_decl */
   dwarf2out_global_decl,
   dwarf2out_type_decl, /* type_decl */
@@ -20627,6 +20628,14 @@ dwarf2out_set_name (tree decl, tree name
 add_name_attribute (die, dname);
 }
 
+/* True if before or during processing of the first function being emitted.  */
+static bool in_first_function_p = true;
+/* True if loc_note during dwarf2out_var_location call might still be
+   before first real instruction at address equal to .Ltext0.  */
+static bool maybe_at_text_label_p = true;
+/* One above highest N where .LVLN label might be equal to .Ltext0 label.  */
+static unsigned int first_loclabel_num_not_at_text_label;
+
 /* Called by the final INSN scan whenever we see a var location.  We
use it to drop labels in the right places, and throw the location in
our lookup table.  */
@@ -20734,6 +20743,45 @@ dwarf2out_var_location (rtx loc_note)
   ASM_OUTPUT_DEBUG_LABEL (asm_out_file, "LVL", loclabel_num);
   loclabel_num++;
   last_label = ggc_strdup (loclabel);
+  /* See if loclabel might be equal to .Ltext0.  If yes,
+bump first_loclabel_num_not_at_text_label.  */
+  if (!have_multiple_function_sections
+ && in_first_function_p
+ && maybe_at_text_label_p)
+   {
+ static rtx last_start;
+ rtx insn;
+ for (insn = loc_note; insn; insn = previous_insn 

[PATCH] If possible, include range of profile hunk before prologue in .debug_loc ranges (PR debug/54793)

2013-01-31 Thread Jakub Jelinek
Hi!

The Linux kernel started (recently?) using -mfentry -p on x86_64/i686
for some of its instrumentation; I believe it normally overwrites the -mfentry
prologues with nops and only when some function needs to be instrumented
overwrites them with some call or jump.
The problem in this PR is that var-tracking notes that are meant to
live on the very first insn in the function (info about fn arguments and the
like) no longer do so; they all start after the profile prologue, so when
gdb/systemtap or a similar debug info consumer inspects debug info at the
beginning of the function, it thinks the arguments are optimized away.

This patch attempts to extend those ranges over the profiler snippet before
prologue on some targets.  I'm not doing it everywhere, because some targets
emit some assembly in their targetm.asm_out.function_prologue hook
even for HAVE_prologue, and it is likely the profiler hook isn't prepared
to handle it (think about e.g. ia64 or a few other targets).
Also, sometimes the profiler code might e.g. use push/pop in it and
invalidate potential sp-based locations; we could stop doing this
if cfun->returns_struct or cfun->static_chain_decl, but Alex argued
that it isn't any different from inline asm which uses push/pop inside of
it.  The debug info there is also correct only on the first insn in the
inline asm and right after the inline asm, not necessarily inside of it,
and similarly with this patch it isn't necessarily right inside of the
profiler snippet (the kernel won't care, as it overwrites the snippet
with nops or call/jmp).
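The scan the patch adds to final_start_function can be sketched as follows (a simplified model with made-up enum names, not GCC's insn representation): the profiler call may be deferred past the prologue only when nothing but ignorable notes precedes the first basic-block or function-begin note.

```c
/* Simplified model of the leading-insn scan: the enum values are
   illustrative stand-ins for GCC's insn stream.  Deferring the
   profiler call past the prologue is safe only if every insn before
   the first NOTE_BASIC_BLOCK / NOTE_FUNCTION_BEG is an ignorable
   note; any real insn (or running off the end) forces the old
   behavior of emitting the profiler code up front.  */
enum kind
{
  NOTE_DELETED, NOTE_VAR_LOCATION, NOTE_BASIC_BLOCK,
  NOTE_FUNCTION_BEG, REAL_INSN
};

static int
can_defer_profile (const enum kind *insns, int n)
{
  for (int i = 0; i < n; i++)
    switch (insns[i])
      {
      case NOTE_DELETED:
      case NOTE_VAR_LOCATION:
	continue;		/* ignorable, keep scanning */
      case NOTE_BASIC_BLOCK:
      case NOTE_FUNCTION_BEG:
	return 1;		/* safe to defer the profiler call */
      default:
	return 0;		/* a real insn: emit the profiler now */
      }
  return 0;
}
```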

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-01-31  Jakub Jelinek  

PR debug/54793
* final.c (need_profile_function): New variable.
(final_start_function): Drop ATTRIBUTE_UNUSED from first argument.
If first of NOTE_INSN_BASIC_BLOCK or NOTE_INSN_FUNCTION_BEG
is only preceded by NOTE_INSN_VAR_LOCATION or NOTE_INSN_DELETED
notes, targetm.asm_out.function_prologue doesn't emit anything,
HAVE_prologue and profiler should be emitted before prologue,
set need_profile_function instead of emitting it.
(final_scan_insn): If need_profile_function, emit
profile_function on the first NOTE_INSN_BASIC_BLOCK or
NOTE_INSN_FUNCTION_BEG note.

--- gcc/final.c.jj  2013-01-15 17:20:37.0 +0100
+++ gcc/final.c 2013-01-25 10:31:48.848897310 +0100
@@ -200,6 +200,9 @@ rtx current_insn_predicate;
 /* True if printing into -fdump-final-insns= dump.  */   
 bool final_insns_dump_p;
 
+/* True if profile_function should be called, but hasn't been called yet.  */
+static bool need_profile_function;
+
 static int asm_insn_count (rtx);
 static void profile_function (FILE *);
 static void profile_after_prologue (FILE *);
@@ -1668,13 +1671,15 @@ reemit_insn_block_notes (void)
  test and compare insns.  */
 
 void
-final_start_function (rtx first ATTRIBUTE_UNUSED, FILE *file,
+final_start_function (rtx first, FILE *file,
  int optimize_p ATTRIBUTE_UNUSED)
 {
   block_depth = 0;
 
   this_is_asm_operands = 0;
 
+  need_profile_function = false;
+
   last_filename = LOCATION_FILE (prologue_location);
   last_linenum = LOCATION_LINE (prologue_location);
   last_discriminator = discriminator = 0;
@@ -1695,7 +1700,41 @@ final_start_function (rtx first ATTRIBUT
   /* The Sun386i and perhaps other machines don't work right
  if the profiling code comes after the prologue.  */
   if (targetm.profile_before_prologue () && crtl->profile)
-profile_function (file);
+{
+  if (targetm.asm_out.function_prologue
+ == default_function_pro_epilogue
+#ifdef HAVE_prologue
+ && HAVE_prologue
+#endif
+)
+   {
+ rtx insn;
+ for (insn = first; insn; insn = NEXT_INSN (insn))
+   if (!NOTE_P (insn))
+ {
+   insn = NULL_RTX;
+   break;
+ }
+   else if (NOTE_KIND (insn) == NOTE_INSN_BASIC_BLOCK
+|| NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG)
+ break;
+   else if (NOTE_KIND (insn) == NOTE_INSN_DELETED
+|| NOTE_KIND (insn) == NOTE_INSN_VAR_LOCATION)
+ continue;
+   else
+ {
+   insn = NULL_RTX;
+   break;
+ }
+
+ if (insn)
+   need_profile_function = true;
+ else
+   profile_function (file);
+   }
+  else
+   profile_function (file);
+}
 
   /* If debugging, assign block numbers to all of the blocks in this
  function.  */
@@ -2075,6 +2114,12 @@ final_scan_insn (rtx insn, FILE *file, i
  break;
 
case NOTE_INSN_BASIC_BLOCK:
+ if (need_profile_function)
+   {
+ profile_function (asm_out_file);
+ need_profile_function = false;
+   }
+
  if (targetm.asm_out.unwind_emit)
targetm.asm_out.unwind_emit (

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-01-31 Thread Christophe Lyon
Hello,

Sorry for the long delay (ref http://patchwork.ozlabs.org/patch/199397/)



On 6 December 2012 20:26, Teresa Johnson  wrote:
>
>
>
> On Wed, Nov 28, 2012 at 7:48 AM, Christophe Lyon
>  wrote:
>>
>> I have updated my trunk checkout, and I can confirm that eval.c now
>> compiles with your patch (and the other 4 patches I added to PR55121).
>
>
> good
>
>>
>>
>> Now, when looking at the whole Spec2k results:
>> - vpr passes now (used to fail)
>
>
> good
>
>>
>> - gcc, parser, perlbmk bzip2 and twolf no longer build: they all fail
>> with the same error from gas:
>> can't resolve `.text.unlikely' {.text.unlikely section} - `.LBB171'
>> {.text section}
>> - gap still does not build (same error as above)
>>
>> I haven't looked in detail, so I may be missing an obvious patch here.
>
>
> Finally had a chance to get back to this. I was able to reproduce the
> failure using x86_64 linux with "-freorder-blocks-and-partition -g".
> However, I am also getting the same failure with a pristine copy of trunk.
> Can you confirm whether you were seeing any of these failures without my
> patches, because I believe they are probably a limitation with function
> splitting and debug info that is orthogonal to my patch.
>
Yes I confirm that I see these failures without your patch too; and
both -freorder-blocks-and-partition and -g are present in my
command-line.
And now gap's integer.c fails to compile with a similar error message too.

>>
>> And I still observe runtime mis-behaviour on crafty, galgel, facerec and
>> fma3d.
>
>
> I'm not seeing this on x86_64, unfortunately, so it might require some
> follow-on work to triage and fix.
>
> I'll look into the gas failure, but if someone could review this patch in
> the meantime given that it does improve things considerably (at least
> without -g), that would be great.
>
Indeed.

> Thanks,
> Teresa
>

Thanks
Christophe


Fix for PR55561 race condition in libgomp

2013-01-31 Thread VandeVondele Joost
The attached patch fixes a race condition in libgomp.  It is based on the
suggestions by Dmitry, and I verified that it fixes the corresponding
sanitizer warnings.

2012-12-30  Dmitry Vyukov  

PR libgomp/55561
* config/linux/wait.h (do_spin): Use atomic load for addr.
* config/linux/ptrlock.c (gomp_ptrlock_get_slow): Use atomic
for intptr and ptrlock.
* config/linux/ptrlock.h (gomp_ptrlock_get): Use atomic load
for ptrlock.



Index: libgomp/config/linux/wait.h
===
--- libgomp/config/linux/wait.h	(revision 194730)
+++ libgomp/config/linux/wait.h	(working copy)
@@ -51,7 +51,7 @@ static inline int do_spin (int *addr, in
   if (__builtin_expect (gomp_managed_threads > gomp_available_cpus, 0))
 count = gomp_throttled_spin_count_var;
   for (i = 0; i < count; i++)
-if (__builtin_expect (*addr != val, 0))
+if (__builtin_expect (__atomic_load_n (addr, MEMMODEL_RELAXED) != val, 0))
   return 0;
 else
   cpu_relax ();
Index: libgomp/config/linux/ptrlock.c
===
--- libgomp/config/linux/ptrlock.c	(revision 194730)
+++ libgomp/config/linux/ptrlock.c	(working copy)
@@ -50,9 +50,9 @@ gomp_ptrlock_get_slow (gomp_ptrlock_t *p
 #endif
   do
 do_wait (intptr, 2);
-  while (*intptr == 2);
+  while (__atomic_load_n (intptr, MEMMODEL_RELAXED) == 2);
   __asm volatile ("" : : : "memory");
-  return *ptrlock;
+  return (void *) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE);
 }
 
 void
Index: libgomp/config/linux/ptrlock.h
===
--- libgomp/config/linux/ptrlock.h	(revision 194730)
+++ libgomp/config/linux/ptrlock.h	(working copy)
@@ -48,8 +48,9 @@ static inline void *gomp_ptrlock_get (go
 {
   uintptr_t oldval;
 
-  if ((uintptr_t) *ptrlock > 2)
-return *ptrlock;
+  uintptr_t v = (uintptr_t) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE);
+  if (v > 2)
+return (void *) v;
 
   oldval = 0;
   if (__atomic_compare_exchange_n (ptrlock, &oldval, 1, false,
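The key point of the ptrlock.h hunk — load the pointer once with an atomic acquire load, then both test and return that single loaded value — can be sketched in isolation (a simplified stand-in, not libgomp's actual code):

```c
#include <stdint.h>

/* Sketch of the fixed ptrlock fast path: values 0..2 are lock states,
   anything larger is the published pointer.  Reading *ptrlock twice
   (once to test, once to return) lets the compiler or a concurrent
   writer produce two different values; loading once with
   __atomic_load_n closes that race.  */
static void *
ptrlock_get_fast (void **ptrlock)
{
  uintptr_t v = (uintptr_t) __atomic_load_n (ptrlock, __ATOMIC_ACQUIRE);
  if (v > 2)
    return (void *) v;		/* pointer already published */
  return (void *) 0;		/* caller would take the slow path */
}
```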


Re: Fix for PR55561 race condition in libgomp

2013-01-31 Thread Jakub Jelinek
On Thu, Jan 31, 2013 at 02:50:51PM +, VandeVondele  Joost wrote:
> The attached patch fixes a race condition in libgomp. Based on the 
> suggestions by Dmitry, and verified that it fixes the corresponding sanitizer 
> warnings. 
> 
> 2012-12-30  Dmitry Vyukov  

Ok, but please use current date rather than 2012-12-30, and also add
your name and email address below Dmitry's.

Jakub


C++ PATCH for c++/56162 (pmf1.C failure on arm)

2013-01-31 Thread Jason Merrill
In my 56104 patch I failed to notice that in the ARM (vbit in delta) 
case, we need to adjust the delta even if we know we're doing a 
non-virtual call.


Tested x86_64-pc-linux-gnu, applying to trunk.

commit 6a52d34876377448990d550cb763171180f4
Author: Jason Merrill 
Date:   Thu Jan 31 09:45:49 2013 -0500

	PR c++/56162
	PR c++/56104
	* typeck.c (get_member_function_from_ptrfunc): Fix
	ptrmemfunc_vbit_in_delta case.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index bfac394..688c266 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -3176,9 +3176,7 @@ get_member_function_from_ptrfunc (tree *instance_ptrptr, tree function,
   e3 = pfn_from_ptrmemfunc (function);
   delta = delta_from_ptrmemfunc (function);
   idx = build1 (NOP_EXPR, vtable_index_type, e3);
-  if (nonvirtual)
-	e1 = integer_zero_node;
-  else switch (TARGET_PTRMEMFUNC_VBIT_LOCATION)
+  switch (TARGET_PTRMEMFUNC_VBIT_LOCATION)
 	{
 	case ptrmemfunc_vbit_in_pfn:
 	  e1 = cp_build_binary_op (input_location,


C++ PATCH for c++/54410 (duplicate template bindings in DWARF)

2013-01-31 Thread Jason Merrill
We ended up inserting the type twice into the array of types that need
template parameters added, so we added the parameters twice.  Fixed by only
adding it the first time; it doesn't matter whether we happen to be
creating a declaration or definition DIE at this point, since when we
get to EOF we'll add the bindings to the definition DIE (if any).


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1e9546939b355bd9f84e1928e1599b3a9f3bbe21
Author: Jason Merrill 
Date:   Wed Jan 30 17:21:49 2013 -0500

	PR c++/54410
	* dwarf2out.c (gen_struct_or_union_type_die): Always schedule template
	parameters the first time.
	(gen_scheduled_generic_parms_dies): Check completeness here.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 5a03280..3106dd9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19061,6 +19061,10 @@ gen_struct_or_union_type_die (tree type, dw_die_ref context_die,
 
   scope_die = scope_die_for (type, context_die);
 
+  /* Generate child dies for template paramaters.  */
+  if (!type_die && debug_info_level > DINFO_LEVEL_TERSE)
+schedule_generic_params_dies_gen (type);
+
   if (! type_die || (nested && is_cu_die (scope_die)))
 /* First occurrence of type or toplevel definition of nested class.  */
 {
@@ -19078,11 +19082,6 @@ gen_struct_or_union_type_die (tree type, dw_die_ref context_die,
   else
 remove_AT (type_die, DW_AT_declaration);
 
-  /* Generate child dies for template paramaters.  */
-  if (debug_info_level > DINFO_LEVEL_TERSE
-  && COMPLETE_TYPE_P (type))
-schedule_generic_params_dies_gen (type);
-
   /* If this type has been completed, then give it a byte_size attribute and
  then give a list of members.  */
   if (complete && !ns_decl)
@@ -20592,7 +20591,8 @@ gen_scheduled_generic_parms_dies (void)
 return;
   
   FOR_EACH_VEC_ELT (*generic_type_instances, i, t)
-gen_generic_params_dies (t);
+if (COMPLETE_TYPE_P (t))
+  gen_generic_params_dies (t);
 }
 
 
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/template-params-11.C b/gcc/testsuite/g++.dg/debug/dwarf2/template-params-11.C
new file mode 100644
index 000..8000295
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/template-params-11.C
@@ -0,0 +1,9 @@
+// PR c++/54410
+// { dg-options "-g -dA" }
+// { dg-final { scan-assembler-times "DIE \\(\[^\n\]*\\) DW_TAG_template_type_param" 1 } }
+
+namespace N {
+  template  struct A { };
+}
+
+N::A<int> a;


[PATCH] Fix PR56157

2013-01-31 Thread Richard Biener

This fixes PR56157 - when looking up operands for a vectorized
SLP stmt we have to match up SLP children with operands.  We do
so by looking for the operand in the DEF position of the SLP
child.  But that gets complicated when either stmt is
a pattern stmt and its operands are DEFed by pattern stmts
(in case this is a multiple-stmt pattern) or by the scalar
variant (of possibly another pattern the SLP child points to).

I gave up trying to be clever and to handle all
situations with a single compare, and instead compare against both the
pattern and the non-pattern def.  False matches are not possible; we
just must not miss a match.
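The matching rule the patch settles on can be modeled compactly (illustrative names and strings standing in for SSA names; GCC's real code compares tree operands with operand_equal_p): an operand matches an SLP child if it equals the lhs of either the child's first scalar stmt or that stmt's related original stmt.

```c
#include <string.h>

/* Toy model of the PR56157 operand matching.  Each SLP child exposes
   the lhs of its (possibly pattern) first scalar stmt and, when that
   stmt belongs to a pattern, the lhs of the related original stmt.
   Matching against both cannot yield a false positive -- the two
   lhs values are distinct names -- but it avoids missing a match.  */
struct slp_child
{
  const char *def_lhs;		/* lhs of the first scalar stmt */
  const char *related_lhs;	/* lhs of the related stmt, or NULL */
};

static int
child_matches_p (const struct slp_child *child, const char *oprnd)
{
  if (strcmp (oprnd, child->def_lhs) == 0)
    return 1;
  if (child->related_lhs && strcmp (oprnd, child->related_lhs) == 0)
    return 1;
  return 0;
}
```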

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok?

Thanks,
Richard.

2013-01-31  Richard Biener  

PR tree-optimization/
* tree-vect-slp.c (vect_get_slp_defs): More thoroughly try to
match up operand with SLP child.

* gcc.dg/torture/pr56157.c: New testcase.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 195610)
--- gcc/tree-vect-slp.c (working copy)
*** void
*** 2616,2628 
  vect_get_slp_defs (vec ops, slp_tree slp_node,
 vec *vec_oprnds, int reduc_index)
  {
!   gimple first_stmt, first_def;
int number_of_vects = 0, i;
unsigned int child_index = 0;
HOST_WIDE_INT lhs_size_unit, rhs_size_unit;
slp_tree child = NULL;
vec *vec_defs;
!   tree oprnd, def_lhs;
bool vectorized_defs;
  
first_stmt = SLP_TREE_SCALAR_STMTS (slp_node)[0];
--- 2616,2628 
  vect_get_slp_defs (vec ops, slp_tree slp_node,
 vec *vec_oprnds, int reduc_index)
  {
!   gimple first_stmt;
int number_of_vects = 0, i;
unsigned int child_index = 0;
HOST_WIDE_INT lhs_size_unit, rhs_size_unit;
slp_tree child = NULL;
vec *vec_defs;
!   tree oprnd;
bool vectorized_defs;
  
first_stmt = SLP_TREE_SCALAR_STMTS (slp_node)[0];
*** vect_get_slp_defs (vec ops, slp_tr
*** 2638,2666 
if (SLP_TREE_CHILDREN (slp_node).length () > child_index)
  {
child = (slp_tree) SLP_TREE_CHILDREN (slp_node)[child_index];
-   first_def = SLP_TREE_SCALAR_STMTS (child)[0];
  
! /* In the end of a pattern sequence we have a use of the original 
stmt,
!so we need to compare OPRND with the original def.  */
!   if (is_pattern_stmt_p (vinfo_for_stmt (first_def))
! && !STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (first_stmt))
!   && !is_pattern_stmt_p (vinfo_for_stmt (first_stmt)))
! first_def = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (first_def));
  
!   if (is_gimple_call (first_def))
! def_lhs = gimple_call_lhs (first_def);
!   else
! def_lhs = gimple_assign_lhs (first_def);
! 
!   if (operand_equal_p (oprnd, def_lhs, 0))
! {
!   /* The number of vector defs is determined by the number of
!  vector statements in the node from which we get those
 statements.  */
!  number_of_vects = SLP_TREE_NUMBER_OF_VEC_STMTS (child);
!  vectorized_defs = true;
  child_index++;
! }
  }
  
if (!vectorized_defs)
--- 2638,2659 
if (SLP_TREE_CHILDREN (slp_node).length () > child_index)
  {
child = (slp_tree) SLP_TREE_CHILDREN (slp_node)[child_index];
  
! /* We have to check both pattern and original def, if available.  */
! gimple first_def = SLP_TREE_SCALAR_STMTS (child)[0];
! gimple related = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (first_def));
  
! if (operand_equal_p (oprnd, gimple_get_lhs (first_def), 0)
! || (related
! && operand_equal_p (oprnd, gimple_get_lhs (related), 0)))
!   {
! /* The number of vector defs is determined by the number of
!vector statements in the node from which we get those
 statements.  */
! number_of_vects = SLP_TREE_NUMBER_OF_VEC_STMTS (child);
! vectorized_defs = true;
  child_index++;
!   }
  }
  
if (!vectorized_defs)
Index: gcc/testsuite/gcc.dg/torture/pr56157.c
===
*** gcc/testsuite/gcc.dg/torture/pr56157.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr56157.c  (working copy)
***
*** 0 
--- 1,25 
+ /* { dg-do compile } */
+ /* { dg-options "-ftree-vectorize" } */
+ 
+ struct Pixel {
+ unsigned short r;
+ unsigned short g;
+ unsigned short b;
+ unsigned short a;
+ };
+ 
+ void fn(unsigned char * __restrict dst, const unsigned char * __restrict src)
+ {
+   unsigned x;
+   for(x = 0; x < 1024; x += 1)
+ {
+   struct Pixel pixel;
+   pixel.r = (unsigned short)(((unsigned)src[0]) * 0x

libgo patch committed: Don't allocate during backtrace

2013-01-31 Thread Ian Lance Taylor
The libbacktrace interfaces says that the strings passed to the callback
routine may disappear.  As it happens, in the current implementation
they will not.  The Go backtrace routine needs to hang on to the
strings, and I've discovered that it can't reliably allocate memory,
since it might be called in the middle of an existing memory
allocation.  So at least for now just assume that the backtrace strings
will hang around.  I'll fix this if I can figure out a way to allocate
memory there.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
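The trade-off Ian describes — copying the callback's strings versus keeping a pointer into a buffer whose lifetime is only promised by the current libbacktrace implementation — can be sketched in isolation. Hypothetical names throughout; the real `runtime_gostring`/`runtime_gostringnocopy` internals are assumptions here:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// A Go-like string header: pointer plus length.
struct GoStr {
  const char *p;
  std::size_t len;
};

// Copying version: safe even if the source buffer is later reused, but it
// allocates -- which is exactly what a backtrace callback running in the
// middle of an existing allocation must not do.
GoStr str_copy(const char *s) {
  std::size_t n = std::strlen(s);
  char *buf = static_cast<char *>(std::malloc(n + 1));
  std::memcpy(buf, s, n + 1);
  return GoStr{buf, n};
}

// Aliasing version: no allocation, valid only while the source buffer
// stays live -- the assumption the patch makes about libbacktrace.
GoStr str_nocopy(const char *s) {
  return GoStr{s, std::strlen(s)};
}
```
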

Ian

diff -r 368d10817c19 libgo/runtime/go-callers.c
--- a/libgo/runtime/go-callers.c	Wed Jan 30 14:00:04 2013 -0800
+++ b/libgo/runtime/go-callers.c	Thu Jan 31 08:38:26 2013 -0800
@@ -43,8 +43,14 @@
 
   loc = &arg->locbuf[arg->index];
   loc->pc = pc;
-  loc->filename = runtime_gostring ((const byte *) filename);
-  loc->function = runtime_gostring ((const byte *) function);
+
+  /* The libbacktrace library says that these strings might disappear,
+ but with the current implementation they won't.  We can't easily
+ allocate memory here, so for now assume that we can save a
+ pointer to the strings.  */
+  loc->filename = runtime_gostringnocopy ((const byte *) filename);
+  loc->function = runtime_gostringnocopy ((const byte *) function);
+
   loc->lineno = lineno;
   ++arg->index;
   return arg->index >= arg->max;


Re: [PATCH] Fix PR56157

2013-01-31 Thread Jakub Jelinek
On Thu, Jan 31, 2013 at 05:31:12PM +0100, Richard Biener wrote:
> 2013-01-31  Richard Biener  
> 
>   PR tree-optimization/
>   * tree-vect-slp.c (vect_get_slp_defs): More thoroughly try to
>   match up operand with SLP child.
> 
>   * gcc.dg/torture/pr56157.c: New testcase.

Ok, thanks.

> *** gcc/testsuite/gcc.dg/torture/pr56157.c(revision 0)
> --- gcc/testsuite/gcc.dg/torture/pr56157.c(working copy)

I'd perhaps remove some more cruft from the testcase,

/* { dg-do compile } */
/* { dg-options "-ftree-vectorize" } */

struct Pixel { unsigned short r, g, b, a; };

void
fn (unsigned char * __restrict dst, const unsigned char * __restrict src)
{
  unsigned x;
  for (x = 0; x < 1024; x++)
{
  struct Pixel pixel;
  pixel.r = src[0] * 0xU / 0xff;
  pixel.g = src[1] * 0xU / 0xff;
  pixel.b = src[2] * 0xU / 0xff;
  pixel.a = src[3] * 0xU / 0xff;
  __builtin_memcpy (dst, &pixel, sizeof pixel);
  src += 4;
  dst += 8;
}
}

is less obfuscated and still does the same thing, but your call.

Jakub


RE: Fix for PR55561 race condition in libgomp

2013-01-31 Thread VandeVondele Joost
The updated changelog entry is below, but somebody with write access should do 
the commit, please.

2013-01-31  Dmitry Vyukov  
  Joost VandeVondele 

PR libgomp/55561
* config/linux/wait.h (do_spin): Use atomic load for addr.
* config/linux/ptrlock.c (gomp_ptrlock_get_slow): Use atomic
for intptr and ptrlock.
* config/linux/ptrlock.h (gomp_ptrlock_get): Use atomic load
for ptrlock.
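The gist of the wait.h/ptrlock change can be sketched with the GCC `__atomic` builtins: the spinning reader must use an atomic load so the compiler cannot hoist the read out of the loop and the cross-thread access is well-defined. This is a simplified stand-in, not libgomp's actual `do_spin`, which also honors a spin-count limit before falling back to a futex wait:

```cpp
#include <cassert>

// Spin until *addr no longer holds VAL, reading with an atomic load each
// iteration.  Returns the value observed (VAL itself if we gave up, in
// which case a real implementation would block instead).
static int spin_until_changed(int *addr, int val, long max_spins) {
  for (long i = 0; i < max_spins; ++i) {
    int cur = __atomic_load_n(addr, __ATOMIC_RELAXED);
    if (cur != val)
      return cur;                       // observed a new value
  }
  return val;                           // budget exhausted
}
```
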


Re: Cortex-A15 vfnma/vfnms test patch

2013-01-31 Thread Ramana Radhakrishnan

On 01/28/13 14:03, amol pise wrote:

Dear Ramana,

Thank You very much for the changelog and commit of my patch in gcc.
I will follow the steps mentioned by you.



There are no vector forms for the vfnma and vfnms instructions.

A co-worker (thanks Kyryll) just pointed out to me that I'd misread the 
ARM-ARM when I checked this; it looks like the test run I did had failed, 
but I had checked the wrong work area.


Looking at the output a bit more carefully now, compared to what I did 
the other day, I see only the scalar forms being generated.


Also the ARM-ARM specifies that these instructions only have the scalar 
forms (Section 8.8.318)


vfnm<>.f64 Dd, Dn, Dm
vfnm<>.f32 Sd, Sn, Sm

instructions.

I have now reverted this patch as obvious.

Sorry about the inconvenience caused.


regards
Ramana



Thank You,
Amol Pise


On Mon, Jan 28, 2013 at 4:18 PM, Ramana Radhakrishnan  wrote:



[Taking gcc-help off this thread.]

Amol,



I have tested these instruction with GCC and these instructions are
generated.
Please review and merge this test support patch into the GCC main trunk.



Thanks for this patch and sorry about the delay in getting around to this.

This is ok and I'll take this under the 10 line rule this time.

If you intend to continue to submit patches to GCC, can I ask that you start
the process for copyright assignments, or confirm that you have a copyright
assignment on file?

http://gcc.gnu.org/contribute.html#legal

If you don't, send an email to g...@gcc.gnu.org with a request for copyright
assignment papers and a maintainer will send you these.

http://gcc.gnu.org/contribute.html in general is a good summary of the
process related to contributing patches to GCC in general . Please do read
that and follow up on g...@gcc.gnu.org if you have any more questions.

And finally don't forget to add a changelog to your patches as documented in
links from the above mentioned page. Since this is your first time I've
added the following Changelog entry for your patch and applied it.

regards
Ramana



2013-01-27  Amol Pise  

 * gcc.target/arm/neon-vfnms-1.c: New test.
 * gcc.target/arm/neon-vfnma-1.c: New test.




libgo patch committed: Block signals when creating new thread

2013-01-31 Thread Ian Lance Taylor
This patch to libgo disables signals while creating a new thread.
Otherwise if a signal comes in between the time the thread starts and
the time the thread initializes its m and g TLS variables, the program
will crash with an error "signal received on thread not created by Go."
Signals are already enabled by the thread after it has been initialized.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.
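The pattern in the patch, reduced to a standalone sketch: block every signal in the creating thread, call `pthread_create` (the child inherits the blocked mask, so no signal can land before it initializes its per-thread state), then restore the old mask. Error handling is elided and `worker` is a stand-in for `runtime_mstart`; this uses `pthread_sigmask`, the thread-safe analogue of the `sigprocmask` call in the patch:

```cpp
#include <csignal>
#include <pthread.h>

static void *worker(void *) { return nullptr; }  // stand-in for runtime_mstart

// Create a thread that starts with all signals blocked; the creator's own
// mask is restored before returning.  The new thread would unblock signals
// itself once initialized (minit in libgo).
int create_thread_signals_blocked(pthread_t *tid) {
  sigset_t all, old;
  sigfillset(&all);
  sigemptyset(&old);
  pthread_sigmask(SIG_BLOCK, &all, &old);
  int ret = pthread_create(tid, nullptr, worker, nullptr);
  pthread_sigmask(SIG_SETMASK, &old, nullptr);   // restore creator's mask
  return ret;
}
```
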

Ian

diff -r 4716407240c2 libgo/runtime/proc.c
--- a/libgo/runtime/proc.c	Thu Jan 31 08:39:01 2013 -0800
+++ b/libgo/runtime/proc.c	Thu Jan 31 09:24:12 2013 -0800
@@ -3,6 +3,7 @@
 // license that can be found in the LICENSE file.
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1217,6 +1218,9 @@
 	pthread_attr_t attr;
 	pthread_t tid;
 	size_t stacksize;
+	sigset_t clear;
+	sigset_t old;
+	int ret;
 
 #if 0
 	static const Type *mtype;  // The Go type M
@@ -1249,7 +1253,15 @@
 	if(pthread_attr_setstacksize(&attr, stacksize) != 0)
 		runtime_throw("pthread_attr_setstacksize");
 
-	if(pthread_create(&tid, &attr, runtime_mstart, mp) != 0)
+	// Block signals during pthread_create so that the new thread
+	// starts with signals disabled.  It will enable them in minit.
+	sigfillset(&clear);
+	sigemptyset(&old);
+	sigprocmask(SIG_BLOCK, &clear, &old);
+	ret = pthread_create(&tid, &attr, runtime_mstart, mp);
+	sigprocmask(SIG_SETMASK, &old, nil);
+
+	if (ret != 0)
 		runtime_throw("pthread_create");
 
 	return mp;


libbacktrace patch committed: Fix threaded race

2013-01-31 Thread Ian Lance Taylor
Using libbacktrace more extensively from Go has revealed a race
condition in libbacktrace.  I cleverly save some memory by reusing a
vector in the DWARF data structure.  However, that is fairly hard to do
if multiple threads are simultaneously getting a backtrace.  This patch
changes the code to always use a new vector when threaded.  This will
introduce additional memory fragmentation, but that seems impossible to
avoid.  Bootstrapped and ran libbacktrace and Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
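The shape of the fix is a common one: accept an optional caller-owned scratch buffer and fall back to a local one when it is absent (here, when multiple threads may be inside the lookup at once). A minimal sketch, with `std::vector` standing in for libbacktrace's `function_vector`:

```cpp
#include <cstddef>
#include <vector>

// If the caller supplies a reusable scratch vector (single-threaded case),
// build into it and leave its storage allocated for the next call;
// otherwise (threaded case) build into a local vector freed on return.
std::size_t collect(std::vector<int> *fvec, int n) {
  std::vector<int> lvec;
  std::vector<int> *pfvec = (fvec != nullptr) ? fvec : &lvec;
  pfvec->clear();
  for (int i = 0; i < n; ++i)
    pfvec->push_back(i);               // stand-in for read_function_entry
  return pfvec->size();
}
```
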

Ian


2013-01-31  Ian Lance Taylor  

* dwarf.c (read_function_info): Permit fvec parameter to be NULL.
(dwarf_lookup_pc): Don't use ddata->fvec if threaded.


Index: dwarf.c
===
--- dwarf.c	(revision 195478)
+++ dwarf.c	(working copy)
@@ -2473,10 +2473,21 @@ read_function_info (struct backtrace_sta
 		struct function_addrs **ret_addrs,
 		size_t *ret_addrs_count)
 {
+  struct function_vector lvec;
+  struct function_vector *pfvec;
   struct dwarf_buf unit_buf;
   struct function_addrs *addrs;
   size_t addrs_count;
 
+  /* Use FVEC if it is not NULL.  Otherwise use our own vector.  */
+  if (fvec != NULL)
+pfvec = fvec;
+  else
+{
+  memset (&lvec, 0, sizeof lvec);
+  pfvec = &lvec;
+}
+
   unit_buf.name = ".debug_info";
   unit_buf.start = ddata->dwarf_info;
   unit_buf.buf = u->unit_data;
@@ -2489,20 +2500,28 @@ read_function_info (struct backtrace_sta
   while (unit_buf.left > 0)
 {
   if (!read_function_entry (state, ddata, u, 0, &unit_buf, lhdr,
-error_callback, data, fvec))
+error_callback, data, pfvec))
 	return;
 }
 
-  if (fvec->count == 0)
+  if (pfvec->count == 0)
 return;
 
-  addrs = (struct function_addrs *) fvec->vec.base;
-  addrs_count = fvec->count;
+  addrs = (struct function_addrs *) pfvec->vec.base;
+  addrs_count = pfvec->count;
 
-  /* Finish this list of addresses, but leave the remaining space in
- the vector available for the next function unit.  */
-  backtrace_vector_finish (state, &fvec->vec);
-  fvec->count = 0;
+  if (fvec == NULL)
+{
+  if (!backtrace_vector_release (state, &lvec.vec, error_callback, data))
+	return;
+}
+  else
+{
+  /* Finish this list of addresses, but leave the remaining space in
+	 the vector available for the next function unit.  */
+  backtrace_vector_finish (state, &fvec->vec);
+  fvec->count = 0;
+}
 
   qsort (addrs, addrs_count, sizeof (struct function_addrs),
 	 function_addrs_compare);
@@ -2663,8 +2682,16 @@ dwarf_lookup_pc (struct backtrace_state
   if (read_line_info (state, ddata, error_callback, data, entry->u, &lhdr,
 			  &lines, &count))
 	{
+	  struct function_vector *pfvec;
+
+	  /* If not threaded, reuse DDATA->FVEC for better memory
+	 consumption.  */
+	  if (state->threaded)
+	pfvec = NULL;
+	  else
+	pfvec = &ddata->fvec;
 	  read_function_info (state, ddata, &lhdr, error_callback, data,
-			  entry->u, &ddata->fvec, &function_addrs,
+			  entry->u, pfvec, &function_addrs,
 			  &function_addrs_count);
 	  free_line_header (state, &lhdr, error_callback, data);
 	  new_data = 1;


Commit: V850: Add support for the E3V5 architecture variant

2013-01-31 Thread Nick Clifton
Hi Guys,

  I am applying the patch below to add support for the E3V5 architecture
  variant to the V850 backend.  This patch was originally developed by
  Renesas, but it is only recently that their copyright assignment was
  completed so that it could be contributed.

  I realize that the current mainline sources are only accepting
  regression fixes at the moment, but I asked for and received
  permission to apply the patch, with the caveat that:

"if you break your arch we'll not wait for you to fix it ;)"

  The patch does actually fix a few problems with the other V850
  architecture variants, such that the gcc testsuite shows 31 fewer
  unexpected failures for each multilib.

  Support for the e3v5 architecture variant is already in the binutils
  and simulator sources.
  
  Tested with no regressions (and several improvements) on a v850e-elf
  toolchain.

Cheers
  Nick

gcc/ChangeLog
2013-01-31  Hiroyuki Ono  
Nick Clifton  

* config/v850/constraints.md (Q): Define as a memory constraint.
* config/v850/predicates.md (label_ref_operand): New predicate.
(e3v5_shift_operand): New predicate.
(ior_operator): New predicate.
* config/v850/t-v850: Add e3v5 multilib.
* config/v850/v850-protos.h (v850_adjust_insn_length): Prototype.
(v850_gen_movdi): Prototype.
* config/v850/v850.c: Add support for e3v5 architecture.
Rename all uses of TARGET_V850E || TARGET_V850E2_ALL to
TARGET_V850E_UP.
(construct_save_jarl): Add e3v5 long JARL support.
(v850_adjust_insn_length): New function.  Adjust length of call
insns when using e3v5 instructions.
(v850_gen_movdi): New function: Generate instructions to move a
DImode value.
* config/v850/v850.h (TARGET_CPU_v850e3v5): Define.
(CPP_SPEC): Define __v850e3v5__ as appropriate.
(TARGET_USE_FPU): Enable for e3v5.
(CONST_OK_FOR_W): New macro.
(ADJUST_INSN_LENGTH): Define.
* config/v850/v850.md (UNSPEC_LOOP): Define.
(attr cpu): Add v850e3v5.
Rename all uses of TARGET_V850E2 to TARGET_V850E2V3_UP.
(movdi): New pattern.
(movdi_internal): New pattern.
(cbranchsf4): Conditionalize on TARGET_USE_FPU.
(cbranchdf4): Conditionalize on TARGET_USE_FPU.
(cstoresf4): Likewise.
(cstoredf4): Likewise.
(insv): New pattern.
(rotlso3_a): New pattern.
(rotlsi3_b): New pattern
(rotlsi3_v850e3v5): New pattern.
(doloop_begin): New pattern.
(fix_loop_counter): New pattern.
(doloop_end): New pattern.
(branch_normal): Add e3v5 long branch support.
(branch_invert): Likewise.
(branch_z_normal): Likewise.
(branch_z_invert): Likewise.
(branch_nz_normal): Likewise.
(branch_nz_invert): Likewise.
(call_internal_short): Add e3v5 register-indirect JARL support.
(call_internal_long): Likewise.
(call_value_internal_short): Likewise.
(call_value_internal_long): Likewise.
* config/v850/v850.opt (mv850e3v5, mv850e2v4): New options.
(mloop): New option.
* config.gcc: Add support for configuring v850e3v5 target.
* doc/invoke.texi: Document new v850 specific command line
options.

libgcc/ChangeLog
2013-01-31  Nick Clifton  

* config/v850/lib1funcs.S: Add support for e3v5 architecture
variant.



v850e3v5.gcc.patch.xz
Description: application/xz


Merge from trunk to gccgo branch

2013-01-31 Thread Ian Lance Taylor
I've merged trunk revision 195620 to the gccgo branch.

Ian


[PATCH] GCC 4.9, powerpc, add more debugging to -mdebug=reg and -mdebug=addr

2013-01-31 Thread Michael Meissner
None of these changes affect the code, but they provide some more information
that I've found useful when using the -mdebug=reg and -mdebug=addr options.

When GCC 4.9 opens up, can I install these patches in the source tree?

2013-01-31  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print
MODES_TIEABLE_P for selected modes.  Print the numerical value of
the various virtual registers. Use GPR/FPR first/last values,
instead of hard coding the register numbers.  Print which modes
have reload functions registered.
(rs6000_option_override_internal): If -mdebug=reg, trace the
options settings before/after setting cpu, target and subtarget
settings.
(rs6000_secondary_reload_trace): Improve the RTL dump for
-mdebug=addr and for secondary reload failures in
rs6000_secondary_reload_inner.
(rs6000_secondary_reload_fail): Likewise.
(rs6000_secondary_reload_inner): Likewise.

* config/rs6000/rs6000.md (FIRST_GPR_REGNO): Add convenience
macros for first/last GPR and FPR registers.
(LAST_GPR_REGNO): Likewise.
(FIRST_FPR_REGNO): Likewise.
(LAST_FPR_REGNO): Likewise.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 195592)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1693,6 +1693,7 @@ rs6000_debug_reg_global (void)
   static const char *const tf[2] = { "false", "true" };
   const char *nl = (const char *)0;
   int m;
+  size_t m1, m2, v;
   char costly_num[20];
   char nop_num[20];
   char flags_buffer[40];
@@ -1713,10 +1714,66 @@ rs6000_debug_reg_global (void)
 "other"
   };
 
-  fprintf (stderr, "Register information: (last virtual reg = %d)\n",
-  LAST_VIRTUAL_REGISTER);
-  rs6000_debug_reg_print (0, 31, "gr");
-  rs6000_debug_reg_print (32, 63, "fp");
+  /* Modes we want tieable information on.  */
+  static const enum machine_mode print_tieable_modes[] = {
+QImode,
+HImode,
+SImode,
+DImode,
+TImode,
+SFmode,
+DFmode,
+TFmode,
+SDmode,
+DDmode,
+TDmode,
+V8QImode,
+V4HImode,
+V2SImode,
+V16QImode,
+V8HImode,
+V4SImode,
+V2DImode,
+V32QImode,
+V16HImode,
+V8SImode,
+V4DImode,
+V2SFmode,
+V4SFmode,
+V2DFmode,
+V8SFmode,
+V4DFmode,
+CCmode,
+CCUNSmode,
+CCEQmode,
+  };
+
+  /* Virtual regs we are interested in.  */
+  const static struct {
+int regno; /* register number.  */
+const char *name;  /* register name.  */
+  } virtual_regs[] = {
+{ STACK_POINTER_REGNUM,"stack pointer:" },
+{ TOC_REGNUM,  "toc:  " },
+{ STATIC_CHAIN_REGNUM, "static chain: " },
+{ RS6000_PIC_OFFSET_TABLE_REGNUM,  "pic offset:   " },
+{ HARD_FRAME_POINTER_REGNUM,   "hard frame:   " },
+{ ARG_POINTER_REGNUM,  "arg pointer:  " },
+{ FRAME_POINTER_REGNUM,"frame pointer:" },
+{ FIRST_PSEUDO_REGISTER,   "first pseudo: " },
+{ FIRST_VIRTUAL_REGISTER,  "first virtual:" },
+{ VIRTUAL_INCOMING_ARGS_REGNUM,"incoming_args:" },
+{ VIRTUAL_STACK_VARS_REGNUM,   "stack_vars:   " },
+{ VIRTUAL_STACK_DYNAMIC_REGNUM,"stack_dynamic:" },
+{ VIRTUAL_OUTGOING_ARGS_REGNUM,"outgoing_args:" },
+{ VIRTUAL_CFA_REGNUM,  "cfa (frame):  " },
+{ VIRTUAL_PREFERRED_STACK_BOUNDARY_REGNUM, "stack boundry:" },
+{ LAST_VIRTUAL_REGISTER,   "last virtual: " },
+  };
+
+  fputs ("\nHard register information:\n", stderr);
+  rs6000_debug_reg_print (FIRST_GPR_REGNO, LAST_GPR_REGNO, "gr");
+  rs6000_debug_reg_print (FIRST_FPR_REGNO, LAST_FPR_REGNO, "fp");
   rs6000_debug_reg_print (FIRST_ALTIVEC_REGNO,
  LAST_ALTIVEC_REGNO,
  "vs");
@@ -1729,6 +1786,10 @@ rs6000_debug_reg_global (void)
   rs6000_debug_reg_print (SPE_ACC_REGNO, SPE_ACC_REGNO, "spe_a");
   rs6000_debug_reg_print (SPEFSCR_REGNO, SPEFSCR_REGNO, "spe_f");
 
+  fputs ("\nVirtual/stack/frame registers:\n", stderr);
+  for (v = 0; v < ARRAY_SIZE (virtual_regs); v++)
+fprintf (stderr, "%s regno = %3d\n", virtual_regs[v].name, virtual_regs[v].regno);
+
   fprintf (stderr,
   "\n"
   "d  reg_class = %s\n"
@@ -1756,18 +1817,53 @@ rs6000_debug_reg_global (void)
   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
-if (rs6000_vector_unit[m] || rs6000_vector_mem[m])
+if (rs6000_vector_unit[m] || rs6000_vector_mem[m]
+ 

libgo patch committed: Remove confusion about split-stack backtrace

2013-01-31 Thread Ian Lance Taylor
In an earlier patch I tried to remove confusion about split-stack
functions in backtraces, but it still doesn't work in conjunction with
skipping functions.  Both runtime.Callers and backtrace_full can skip
stack frames.  backtrace_full will count split-stack frames in the skip
count, but runtime.Callers should not.  In order to keep this
consistent, this patch moves all the skip count handling to
runtime.Callers.  Bootstrapped and ran Go testsuite in
x86_64-unknown-linux-gnu.  Committed to mainline.
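The skip handling the patch moves into the callback can be sketched as follows, with plain ints standing in for PCs and `Location` entries — the key point being that a skipped frame is consumed without being counted against the caller's buffer:

```cpp
// Mirror of the patched callers_data: the callback itself consumes the
// skip budget, returning 0 ("keep walking") without recording anything
// until the budget is spent.
struct CallersData {
  int *buf;
  int skip;
  int index;
  int max;
};

// Returns nonzero to stop the walk, as backtrace_full's callback does.
int callback(CallersData *arg, int pc) {
  if (arg->skip > 0) {
    --arg->skip;                       // skipped frames are not counted
    return 0;
  }
  arg->buf[arg->index++] = pc;
  return arg->index >= arg->max;
}
```
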

Ian

diff -r c205459490e5 libgo/runtime/go-callers.c
--- a/libgo/runtime/go-callers.c	Thu Jan 31 09:24:55 2013 -0800
+++ b/libgo/runtime/go-callers.c	Thu Jan 31 11:42:48 2013 -0800
@@ -16,6 +16,7 @@
 struct callers_data
 {
   Location *locbuf;
+  int skip;
   int index;
   int max;
 };
@@ -41,6 +42,12 @@
 	return 0;
 }
 
+  if (arg->skip > 0)
+{
+  --arg->skip;
+  return 0;
+}
+
   loc = &arg->locbuf[arg->index];
   loc->pc = pc;
 
@@ -75,10 +82,11 @@
   struct callers_data data;
 
   data.locbuf = locbuf;
+  data.skip = skip + 1;
   data.index = 0;
   data.max = m;
-  backtrace_full (__go_get_backtrace_state (), skip + 1, callback,
-		  error_callback, &data);
+  backtrace_full (__go_get_backtrace_state (), 0, callback, error_callback,
+		  &data);
   return data.index;
 }
 


Merged trunk to gccgo branch

2013-01-31 Thread Ian Lance Taylor
Continuing my quest for some really stable gccgo sources, I've merged
trunk revision 195627 to the gccgo branch.

Ian


Re: Commit: V850: Add support for the E3V5 architecture variant

2013-01-31 Thread Gerald Pfeifer
Hi Nick,

the web page update required a tweak: the closing tag goes at the end of the
entry.  Committed.

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.91
diff -u -3 -p -r1.91 changes.html
--- changes.html31 Jan 2013 18:34:10 -  1.91
+++ changes.html31 Jan 2013 20:51:10 -
@@ -698,9 +698,9 @@ B b(42); // OK
 
   
 This target now supports the E3V5 architecture via the use
-of the new -mv850e3v5 command-line option.  It also has
-experimental support for the e3v5 LOOP instruction which can 
be
-enabled via the new -mloop command-line option.
+of the new -mv850e3v5 command-line option.  It also has
+experimental support for the e3v5 LOOP instruction which
+can be enabled via the new -mloop command-line option.
   
 
 XStormy16


Re: [PATCH] testsuite tcl portability fix - avoid lreverse

2013-01-31 Thread Mike Stump
On Jan 30, 2013, at 12:18 AM, Jakub Jelinek  wrote:
> I've noticed
> ERROR: (DejaGnu) proc "lreverse {{ASAN_OPTIONS 0}}" does not exist.   
>  
> errors when regtesting gcc 4.8 on RHEL 5, it seems lreverse has been added
> to tcl only in version 8.5.

> Fixed thusly, ok for trunk?

Ok.


libgo patch committed: Recognize morestack.S if no function name

2013-01-31 Thread Ian Lance Taylor
When using an old assembler it's possible to get debug info that says
that a PC is in morestack.S but does not have the function name.  This
patch changes the Go backtrace code to recognize this case.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 388d41792584 libgo/runtime/go-callers.c
--- a/libgo/runtime/go-callers.c	Thu Jan 31 11:43:44 2013 -0800
+++ b/libgo/runtime/go-callers.c	Thu Jan 31 15:11:01 2013 -0800
@@ -34,13 +34,24 @@
   /* Skip split stack functions.  */
   if (function != NULL)
 {
-  const char *p = function;
+  const char *p;
 
+  p = function;
   if (__builtin_strncmp (p, "___", 3) == 0)
 	++p;
   if (__builtin_strncmp (p, "__morestack_", 12) == 0)
 	return 0;
 }
+  else if (filename != NULL)
+{
+  const char *p;
+
+  p = strrchr (filename, '/');
+  if (p == NULL)
+	p = filename;
+  if (__builtin_strncmp (p, "morestack.S", 11) == 0)
+	return 0;
+}
 
   if (arg->skip > 0)
 {


Re: patch to fix PR56144

2013-01-31 Thread Steven Bosscher
On Wed, Jan 30, 2013 at 6:24 PM, Vladimir Makarov wrote:
> The following patch fixes
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56144
>
> The patch was successfully bootstrapped and tested on x86/x86-64.

Hello Vlad,

Can you please put this patch on the lra-branch too, so that the
auto-testers can pick it up?

Ciao!
Steven


Re: [PATCH] Vtable pointer verification (corruption/attach detection -- new feature

2013-01-31 Thread Caroline Tice
cmt...@google.com


On Wed, Jan 30, 2013 at 2:09 AM, Florian Weimer  wrote:
> On 11/01/2012 09:07 PM, Caroline Tice wrote:
>>
>> We have been developing a new security hardening feature for GCC that
>> is designed to detect and handle (during program execution) when a
>> vtable pointer that is about to be used for a virtual function call is
>> not a valid vtable pointer for that call (i.e. it has become
>> corrupted, possibly due to a  hacker attack).  We gave a presentation
>> on this work at the Gnu Tools Cauldron in Prague last July.  We now
>> have the implementation fully working and are submitting this patch
>> for review.  We would like to get this into the next release of GCC if
>> possible.
>
>
> Thanks for posting this collection of interesting patches.
>
> As far as I understand it, this changes ABI in the sense that every object
> file needs to contain the constructors that register the vtables, otherwise
> verification will later fail.

Yes.

> If this data could be emitted in a
> declarative fashion, it might be possible to emit it by default, in a
> separate ELF section.  This way, it is always there when needed, and it
> wouldn't have any performance impact if not used.

That might be possible; it would need to be carefully organized
though.  It is not enough to have a list of all vtable addresses; we
need to know, for each virtual class (where by virtual class I mean a
class for which a vtable might be generated), the set of vtables that
it is legal for an object of that class to point to (i.e. for a base
class, it can point to its own vtable or to the vtable of any of its
descendants in the inheritance hierarchy).
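One hypothetical shape for that per-class data: a map from class (keyed here by mangled name) to the set of vtable addresses an object of that static type may legitimately point to. The keying and types are assumptions for illustration; the real implementation uses vtbl_map_nodes, not this layout:

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// For each class: the addresses of its own vtable and those of every
// class derived from it.  A corrupted vtable pointer fails the lookup.
using VtableSets =
    std::unordered_map<std::string, std::unordered_set<const void *>>;

bool vtable_ok(const VtableSets &sets, const std::string &klass,
               const void *vptr) {
  auto it = sets.find(klass);
  return it != sets.end() && it->second.count(vptr) != 0;
}
```
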

>
> I didn't look at the actual permitted-vtable set lookup in detail.  How
> expensive is it?

The set up is expected to be somewhat expensive, although we are
trying to make it as efficient as possible.  We are in the process of
gathering performance data.

> C++ virtual method calls are efficient, but they have
> their problems.  The vtable needs relocation (contributing to startup cost),
> and we have the fragile base class problem (which severely constrains the
> evolution of separately-compiled base classes).  Assuming that we need the
> vtable check and it has non-trivial cost, it might make sense to revamp C++
> virtual method dispatch altogether, addressing both security and modularity
> issues.

That might be best, but we don't have the authority to unilaterally
make a change of this magnitude to the standard. ;-)

>
> (Yes, I understand these two paragraphs go off in entirely different
> directions. 8-)
>

Let me know if you have any more questions, or if you would like any
further clarifications.

-- Caroline Tice
cmt...@google.com

> --
> Florian Weimer / Red Hat Product Security Team


gccgo patch committed: Don't emit methods for identical unnamed structs

2013-01-31 Thread Ian Lance Taylor
When a source file has multiple identical unnamed structs, and those
structs have methods, gccgo would emit multiple copies of the methods.
That would be bad because each copy would have the same name, causing an
assembler error about multiply defined symbols.  This patch fixes the
problem.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 73fde5ee3282 go/types.cc
--- a/go/types.cc	Thu Jan 31 15:11:34 2013 -0800
+++ b/go/types.cc	Thu Jan 31 16:20:32 2013 -0800
@@ -4170,6 +4170,11 @@
 
 // Class Struct_type.
 
+// A hash table used to find identical unnamed structs so that they
+// share method tables.
+
+Struct_type::Identical_structs Struct_type::identical_structs;
+
 // Traversal.
 
 int
@@ -4596,6 +4601,21 @@
 {
   if (this->all_methods_ != NULL)
 return;
+
+  // It is possible to have multiple identical structs that have
+  // methods.  We want them to share method tables.  Otherwise we will
+  // emit identical methods more than once, which is bad since they
+  // will even have the same names.
+  std::pair ins =
+Struct_type::identical_structs.insert(std::make_pair(this, this));
+  if (!ins.second)
+{
+  // An identical struct was already entered into the hash table.
+  // Note that finalize_methods is, fortunately, not recursive.
+  this->all_methods_ = ins.first->second->all_methods_;
+  return;
+}
+
   Type::finalize_methods(gogo, this, this->location_, &this->all_methods_);
 }
 
diff -r 73fde5ee3282 go/types.h
--- a/go/types.h	Thu Jan 31 15:11:34 2013 -0800
+++ b/go/types.h	Thu Jan 31 16:20:32 2013 -0800
@@ -2184,6 +2184,12 @@
   do_export(Export*) const;
 
  private:
+  // Used to merge method sets of identical unnamed structs.
+  typedef Unordered_map_hash(Struct_type*, Struct_type*, Type_hash_identical,
+			 Type_identical) Identical_structs;
+
+  static Identical_structs identical_structs;
+
   // Used to avoid infinite loops in field_reference_depth.
   struct Saw_named_type
   {


Re: [PATCH] Vtable pointer verification, C++ front end changes (patch 1 of 3)

2013-01-31 Thread Caroline Tice
On Wed, Jan 30, 2013 at 9:26 AM, Jason Merrill  wrote:
> I'm also touching on the middle-end and library changes, but would
> appreciate a more thorough review from others in those areas.
>
> On 01/23/2013 05:25 PM, Caroline Tice wrote:
>>
>> Index: gcc/cp/g++spec.c
>
>
> Changes to g++spec.c only affect the g++ driver, not the gcc driver. Are you
> sure this is what you want?  Can't you handle this stuff directly in the
> specs like address sanitizer does?
>
>> @@ -17954,6 +17954,10 @@ mark_class_instantiated (tree t, int ext
>>if (! extern_p)
>>  {
>>CLASSTYPE_DEBUG_REQUESTED (t) = 1;
>> +
>> +  if (flag_vtable_verify)
>> +vtv_save_class_info (t);
>
> Why do you need this here as well as in finish_struct_1?
>

If we don't have this in both places, then we miss getting vtable
pointers for instantiated templates.


>> -static tree
>> +tree
>>  get_mangled_id (tree decl)
>
>
> It doesn't look like you call get_mangled_id anywhere, so I think you don't
> need this change anymore.

You are right; this is leftover from an earlier version where we were
calling this function directly; I'll remove the change.

>
>>finalize_compilation_unit ();
>>
>> +  if (flag_vtable_verify)
>> +{
>> +  /* Generate the special constructor initialization function that
>> + calls __VLTRegisterPairs, and give it a very high initialization
>> + priority.  */
>> +  vtv_generate_init_routine ();
>> +}
>
>
> Why do you want to do this after cgraph finalization?  Is it so that you
> know which vtables are really being emitted?  Please explain in the comment.

Exactly.  We need to know which vtables are really emitted before we
output __VLTRegisterPair calls; otherwise we got in trouble emitting
calls for vtables that weren't output, or causing vtables to be output
when otherwise they wouldn't be.

>
>> +  if (flag_vtable_verify
>> +  && strstr (IDENTIFIER_POINTER (DECL_NAME (fn)), ".vtable"))
>> +return fn;
>
>
> How would this function end up with .vtable at the end of its name?
>

Whoops.  This is leftover from older code when we modified
start_objects and added that to the constructor initialization
function. I will change this.


>> +  if (TREE_CHAIN (base_class))
>> +class_type_decl = TREE_CHAIN (base_class);
>> +  else
>> +class_type_decl = TYPE_NAME (base_class);
>> +
>> +  /* Temporarily remove any qualifiers on type.  */
>> +  type_decl_type = TREE_TYPE (class_type_decl);
>> +  save_quals = TYPE_QUALS (type_decl_type);
>> +  reset_type_qualifiers (null_quals, type_decl_type);
>> +
>> +  base_id = DECL_ASSEMBLER_NAME (TREE_CHAIN (base_class));
>
>
> I think you want TYPE_LINKAGE_IDENTIFIER here.
>

I don't know the difference between DECL_ASSEMBLER_NAME and
TYPE_LINKAGE_IDENTIFIER.  We are just trying to get the mangled name
for the class.

> I don't understand what the qualifier business is trying to accomplish,
> especially since you never use type_decl_type.  You do this in several
> places, but it should never be necessary; classes don't have any qualifiers.
>

We used to not have the "qualifier business", assuming that classes
did not have any type qualifiers.  This turned out not to be a true
assumption.  Occasionally we were getting a case where a class had a
"const" qualifier attached to it *sometimes*.  We used the mangled
name to look it up in our hashtable, and it didn't find the entry for
the class (which had been put in without the const qualifier).  So we
temporarily remove any type qualifiers before getting the mangled
name, then re-apply whatever qualifiers were there before.  The code
actually works because the type_decl_type we change the qualifiers on
is a subtree of the class_type_decl, which is a subtree of the
base_class that we get the mangled name from. (We have tested this
code; it works and fixes our problem).

>> +  if (vtbl_map_node_registration_find (base_vtable_map_node, vtable_decl,
>> +   offset))
>> +return true;
>> +
>> +  vtbl_map_node_registration_insert (base_vtable_map_node, vtable_decl,
>> +offset);
>
>
> Here you're doing two hash table lookups when one would be enough.
>

As written the insert function doesn't return anything to let you know
whether the item was already there or not, which we need to know (we
use the results here to avoid generating redundant calls to
__VLTRegisterPair.  I suppose we could modify the insert function to
return a boolean indicating if the item was already in the hashtable,
and then we could get by with just one call here...
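A sketch of the single-lookup variant Jason suggests, using the fact that `insert` already reports whether the key was present. Toy key/value types here; the real registration uses the vtbl_map_node tables:

```cpp
#include <unordered_map>

// insert() returns pair<iterator, bool>; the bool is false when the key
// was already registered, so one call replaces a find-then-insert pair.
// Returns true if the entry was newly registered.
bool register_once(std::unordered_map<int, int> &regs, int key, int value) {
  auto ins = regs.insert({key, value});
  return ins.second;
}
```
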



>
>> 2).  A linked list of all
>> +   vtbl_map_nodes ("vtbl_map_nodes") for easy iteration through all of
>> +   them; and 3). An array of vtbl_map_nodes, where the array index
>> +   corresponds to the unique id of the vtbl_map_node, which gives us
>> +   an easy way to use bitmaps to represent and find the vtable map
>> +   nodes.
>
>
> What's hard about iteration over

Merge trunk to gccgo branch

2013-01-31 Thread Ian Lance Taylor
It's true, I've merged trunk to gccgo branch again, in this case
revision 195638.

Ian


libgo patch committed: Correct test for morestack.S

2013-01-31 Thread Ian Lance Taylor
I am an idiot.  (Hope it is right this time.)

Ian

diff -r 686169e30cff libgo/runtime/go-callers.c
--- a/libgo/runtime/go-callers.c	Thu Jan 31 16:22:11 2013 -0800
+++ b/libgo/runtime/go-callers.c	Thu Jan 31 21:42:52 2013 -0800
@@ -49,7 +49,7 @@
   p = strrchr (filename, '/');
   if (p == NULL)
 	p = filename;
-  if (__builtin_strncmp (p, "morestack.S", 11) == 0)
+  if (__builtin_strncmp (p, "/morestack.S", 12) == 0)
 	return 0;
 }
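For reference, the reason the earlier comparison failed: `strrchr` returns a pointer to the `'/'` itself, so the basename match must either include the slash (as the fix does) or step past it first. A small sketch:

```cpp
#include <cstring>

// Return the trailing "/basename" of a path (or the whole string if it
// has no '/'), exactly as the go-callers.c code computes p.
const char *tail_with_slash(const char *filename) {
  const char *p = std::strrchr(filename, '/');
  return p != nullptr ? p : filename;
}
```
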
 


Merged trunk to gccgo branch

2013-01-31 Thread Ian Lance Taylor
I merged trunk revision 195640 to the gccgo branch.

Ian