Re: [SH] Add simple_return pattern

2012-09-11 Thread Christian Bruel

On 09/11/2012 03:05 AM, Kaz Kojima wrote:
> Christian Bruel  wrote:
>> This patch implements the simple_return pattern to enable -fshrink-wrap
>> on SH. It also cleans up some redundancies for expand_epilogue (called
>> twice from the "return" and "epilogue" patterns) and the
>> sh_expand_prologue parameter type.
>>
>> No regressions with sh-superh-elf and sh4-linux gcc testsuites.
> 
> With the patch + revision 191106, I've got a new failure:
> 
> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation,  -fprofile-use -D_PROFILE_USE 
> (internal compiler error)
> 
> for sh4-unknown-linux-gnu.  My testsuite/gcc/gcc.log says
> 
> /exp/ldroot/dodes/xsh-gcc/gcc/xgcc -B/exp/ldroot/dodes/xsh-gcc/gcc/ 
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c 
> -fno-diagnostics-show-caret -O2 -freorder-blocks-and-partition -fprofile-use 
> -D_PROFILE_USE -lm -o /exp/ldroot/dodes/xsh-gcc/gcc/testsuite/gcc/bb-reorg.x02
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c: In 
> function 'main':
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1: 
> error: EDGE_CROSSING missing across section boundary
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1: 
> internal compiler error: verify_flow_info failed
> Please submit a full bug report,
> 
> Regards,

Ugh, indeed, I forgot a SPEC file that sets the release mode on my
SH-Linux distribution, so verify_flow_info was not called :-(. I need to
test again.

thanks !

Christian

>   kaz
> 


Bootstrap fails (was: Remove unnecessary VEC function overloads.)

2012-09-11 Thread Tobias Burnus

On 09/11/2012 01:52 AM, Diego Novillo wrote:

Remove unnecessary VEC function overloads.

Several VEC member functions that accept an element 'T' used to have
two overloads: one taking 'T', the second taking 'T *'.


They might be unnecessary, but with your patch bootstrapping fails here
with the following error.


Did you test with or without Graphite?

Tobias


/home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c: In 
function ‘void move_sd_regions(vec_t**, vec_t**)’:
/home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: error: no matching 
function for call to 
‘vec_t::safe_push(vec_t**, sd_region*&, const 
char [61], int, const char [16])’

  (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
   ^
/home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c:146:5: 
note: in expansion of macro 'VEC_safe_push'

 VEC_safe_push (sd_region, heap, *target, s);
 ^
/home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: note: candidate is:
  (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))



Ping [SH] Define NO_IMPLICIT_EXTERN_C for newlib targets

2012-09-11 Thread Christian Bruel
Hi Kaz,

Any news for my sh-superh-elf --with-newlib patch ?

http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00137.html

Thanks

Christian


Re: [PATCH] Fix PR54492

2012-09-11 Thread Richard Guenther
On Mon, 10 Sep 2012, William J. Schmidt wrote:

> Here's the revised patch with a param.  Bootstrapped and tested in the
> same manner.  Ok for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> 2012-08-10  Bill Schmidt  
> 
>   * doc/invoke.texi (max-slsr-cand-scan): New description.
>   * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit
>   the time spent searching for a basis.
>   * params.def (PARAM_MAX_SLSR_CANDIDATE_SCAN): New param.
> 
> 
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi   (revision 191135)
> +++ gcc/doc/invoke.texi   (working copy)
> @@ -9407,6 +9407,11 @@ having a regular register file and accurate regist
>  See @file{haifa-sched.c} in the GCC sources for more details.
>  
>  The default choice depends on the target.
> +
> +@item max-slsr-cand-scan
> +Set the maximum number of existing candidates that will be considered when
> +seeking a basis for a new straight-line strength reduction candidate.
> +
>  @end table
>  @end table
>  
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c   (revision 191135)
> +++ gcc/gimple-ssa-strength-reduction.c   (working copy)
> @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "domwalk.h"
>  #include "pointer-set.h"
>  #include "expmed.h"
> +#include "params.h"
>  
>  /* Information about a strength reduction candidate.  Each statement
> in the candidate table represents an expression of one of the
> @@ -353,10 +354,14 @@ find_basis_for_candidate (slsr_cand_t c)
>cand_chain_t chain;
>slsr_cand_t basis = NULL;
>  
> +  // Limit potential of N^2 behavior for long candidate chains.
> +  int iters = 0;
> +  int max_iters = PARAM_VALUE (PARAM_MAX_SLSR_CANDIDATE_SCAN);
> +
>mapping_key.base_expr = c->base_expr;
>chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key);
>  
> -  for (; chain; chain = chain->next)
> +  for (; chain && iters < max_iters; chain = chain->next, ++iters)
>  {
>slsr_cand_t one_basis = chain->cand;
>  
> Index: gcc/params.def
> ===
> --- gcc/params.def(revision 191135)
> +++ gcc/params.def(working copy)
> @@ -973,6 +973,13 @@ DEFPARAM (PARAM_SCHED_PRESSURE_ALGORITHM,
> "Which -fsched-pressure algorithm to apply",
> 1, 1, 2)
>  
> +/* Maximum length of candidate scans in straight-line strength reduction.  */
> +DEFPARAM (PARAM_MAX_SLSR_CANDIDATE_SCAN,
> +   "max-slsr-cand-scan",
> +   "Maximum length of candidate scans for straight-line "
> +   "strength reduction",
> +   50, 1, 99)
> +
>  /*
>  Local variables:
>  mode:c
> 
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
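
For readers following the PR54492 patch above, the idea of capping the
chain walk can be sketched outside of GCC roughly as below.  This is
only an illustration: struct chain_node, find_basis_bounded and the
"highest id wins" selection rule are made up for the example and are not
the real cand_chain / find_basis_for_candidate code.  Only the loop
shape (stop at the end of the chain or after max_iters nodes) is taken
from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a candidate chain; not GCC's cand_chain.  */
struct chain_node {
    int cand_id;               /* identifies a candidate */
    int usable_as_basis;       /* nonzero if it may serve as a basis */
    struct chain_node *next;
};

/* Walk at most max_iters nodes of the chain and return the highest
   usable cand_id seen, or -1 if none was found within the limit.
   The selection rule is an assumption made for this sketch.  */
int find_basis_bounded(const struct chain_node *chain, int max_iters)
{
    int best = -1;
    int iters = 0;

    /* Same shape as the patched loop: stop at the end of the chain
       or once the iteration budget is used up.  */
    for (; chain && iters < max_iters; chain = chain->next, ++iters) {
        if (chain->usable_as_basis && chain->cand_id > best)
            best = chain->cand_id;
    }
    return best;
}
```

With a small limit the scan can miss a better basis further down the
chain, which is the trade-off the param makes: bounded time per
candidate in exchange for possibly fewer replacements.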


[PATCH] Fix PR54515

2012-09-11 Thread Richard Guenther

This is the trunk variant of the 54515 fix - we shouldn't really
return NULL_TREE from get_base_address apart from for invalid
inputs (and then it's just GIGO).  This makes us go half-way to
fix the PR, I'll followup with a patch to look through
WITH_SIZE_EXPR (after thinking about effects on alias analysis).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-09-11  Richard Guenther  

PR middle-end/54515
* gimple.c (get_base_address): Do not return NULL_TREE apart
from for WITH_SIZE_EXPR.
* gimple-fold.c (canonicalize_constructor_val): Do not call
get_base_address when not necessary.

* g++.dg/tree-ssa/pr54515.C: New testcase.

Index: gcc/gimple.c
===
--- gcc/gimple.c(revision 191143)
+++ gcc/gimple.c(working copy)
@@ -2878,16 +2878,12 @@ get_base_address (tree t)
   && TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR)
 t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
 
-  if (TREE_CODE (t) == SSA_NAME
-  || DECL_P (t)
-  || TREE_CODE (t) == STRING_CST
-  || TREE_CODE (t) == CONSTRUCTOR
-  || INDIRECT_REF_P (t)
-  || TREE_CODE (t) == MEM_REF
-  || TREE_CODE (t) == TARGET_MEM_REF)
-return t;
-  else
+  /* ???  Either the alias oracle or all callers need to properly deal
+ with WITH_SIZE_EXPRs before we can look through those.  */
+  if (TREE_CODE (t) == WITH_SIZE_EXPR)
 return NULL_TREE;
+
+  return t;
 }
 
 void
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 191143)
+++ gcc/gimple-fold.c   (working copy)
@@ -154,13 +154,15 @@ canonicalize_constructor_val (tree cval,
 }
   if (TREE_CODE (cval) == ADDR_EXPR)
 {
-  tree base = get_base_address (TREE_OPERAND (cval, 0));
-  if (!base && TREE_CODE (TREE_OPERAND (cval, 0)) == COMPOUND_LITERAL_EXPR)
+  tree base = NULL_TREE;
+  if (TREE_CODE (TREE_OPERAND (cval, 0)) == COMPOUND_LITERAL_EXPR)
{
  base = COMPOUND_LITERAL_EXPR_DECL (TREE_OPERAND (cval, 0));
  if (base)
TREE_OPERAND (cval, 0) = base;
}
+  else
+   base = get_base_address (TREE_OPERAND (cval, 0));
   if (!base)
return NULL_TREE;
 
Index: gcc/testsuite/g++.dg/tree-ssa/pr54515.C
===
--- gcc/testsuite/g++.dg/tree-ssa/pr54515.C (revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/pr54515.C (working copy)
@@ -0,0 +1,19 @@
+// { dg-do compile }
+// { dg-options "-O2" }
+
+template < typename T > T h2le (T)
+{
+T a;
+unsigned short &b = a;
+short c = 0;
+unsigned char (&d)[2] = reinterpret_cast < unsigned char (&)[2] > (c);
+unsigned char (&e)[2] = reinterpret_cast < unsigned char (&)[2] > (b);
+e[0] = d[0];
+return a;
+}
+
+void
+bar ()
+{
+h2le ((unsigned short) 0);
+}
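
The behavioural change to get_base_address in the patch above can be
modelled with a toy tree type, as a rough sketch.  Everything named
toy_* here is invented for illustration and is not GCC's tree API; the
MEM_REF/ADDR_EXPR stripping of the real function is left out.  The
point is only that, after the patch, any base is returned as-is and
NULL is reserved for WITH_SIZE_EXPR.

```c
#include <stddef.h>

/* Toy node kinds; hypothetical, not GCC tree codes.  */
enum toy_code {
    TOY_DECL,
    TOY_COMPONENT_REF,
    TOY_ARRAY_REF,
    TOY_WITH_SIZE_EXPR,
    TOY_OTHER
};

struct toy_tree {
    enum toy_code code;
    struct toy_tree *operand0;   /* the object being referenced into */
};

/* Component-like references that wrap another object.  */
static int toy_handled_component_p(const struct toy_tree *t)
{
    return t->code == TOY_COMPONENT_REF || t->code == TOY_ARRAY_REF;
}

/* Strip component references, then return whatever base remains.
   Only WITH_SIZE_EXPR yields NULL, mirroring the patched logic.  */
struct toy_tree *toy_get_base_address(struct toy_tree *t)
{
    while (toy_handled_component_p(t))
        t = t->operand0;

    if (t->code == TOY_WITH_SIZE_EXPR)
        return NULL;

    return t;
}
```

Before the patch, a base of an unlisted kind (TOY_OTHER in this model)
would have produced NULL; in the model above it is returned unchanged.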


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Richard Guenther
On Mon, Sep 10, 2012 at 6:30 PM, Richard Henderson  wrote:
> On 09/10/2012 09:11 AM, Iyer, Balaji V wrote:
>> Can you please help me get a start on how this can be done? From
>> what I understand (please correct me if I am wrong), this requires
>> rearranging and duplicating a lot of passes and can potentially open
>> up to a lot of bugs.
>
> Certainly not duplicating passes.  And probably not even rearranging them.
>
> The Important parts are:
>
>   (1) Having a bit in "struct loop" that indicates the special semantics
>   you have for #pragma simd.  I don't know if maybe all loops inside an
>   elemental function are so automatically marked?
>
>   (2) Have bits in "struct function" that summarize the contents of the
>   bit from "struct loop", for all loops in the function.  Note that
>   this bit would need to be updated during inlining.
>
>   (3) Change the "gate" predicates for the relevant function to also check
>   the bit from "struct function".  In some cases the pass might need
>   to run globally (perhaps if-conversion?) and in some cases the pass
>   might be able to restrict work to specific loops (e.g. the vectorizer),
>   skipping loops for which the optimization is not enabled.

Note that we do not preserve the loop tree before the gimple loop optimizer
passes.  Nor do we have a convenient way (currently) to transfer per-loop
information from GENERIC to the point where we can first create the loop
tree (after the CFG is built).  The former is because I didn't want to think
about the inlining case (I'm still chasing bugs for preserving the loop tree
from the start of gimple loop optimizer passes ...), the latter could be done
in a similar way we handle predications or OMP annotations - have
special instructions in the IL.

Richard.

>
> r~
>
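
The three points quoted above (a per-loop bit, a per-function summary
bit, and gate predicates that consult the summary) could look roughly
like the sketch below.  All names (toy_loop, toy_function, simd_marked,
has_marked_loops, and the helper functions) are hypothetical and do not
correspond to fields or functions in GCC at the time of this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* (1) A bit on each loop recording the special #pragma simd semantics.  */
struct toy_loop {
    bool simd_marked;
};

/* (2) A summary bit on the function covering all of its loops.  */
struct toy_function {
    struct toy_loop *loops;
    size_t n_loops;
    bool has_marked_loops;
};

/* Recompute the summary; this would also need to run after inlining,
   since inlined bodies can bring marked loops with them.  */
void toy_summarize_loops(struct toy_function *fn)
{
    fn->has_marked_loops = false;
    for (size_t i = 0; i < fn->n_loops; i++)
        if (fn->loops[i].simd_marked)
            fn->has_marked_loops = true;
}

/* (3) Gate: run the pass if it is enabled globally or if the function
   contains at least one marked loop.  */
bool toy_gate_vectorizer(const struct toy_function *fn, bool pass_enabled)
{
    return pass_enabled || fn->has_marked_loops;
}

/* Inside the pass, work can be restricted to the relevant loops.  */
bool toy_should_process_loop(const struct toy_loop *loop, bool pass_enabled)
{
    return pass_enabled || loop->simd_marked;
}
```

This does not address the point raised in the reply, namely how the
per-loop information would survive from GENERIC until the loop tree
exists; the sketch assumes the bits are already in place.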


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Richard Guenther
On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson  wrote:
> On 09/10/2012 09:09 AM, Iyer, Balaji V wrote:
>>> >If that's the case, what's the point in defining an external ABI and 
>>> >defining what
>>> >__attribute__((vector)) placed on a function declaration means?
>
>> When you have __attribute__((vector)) you are asking the compiler to
>> create a vector AND a scalar version of the function. The advantage
>> is that if the function is used, for example, in 2 loops where 1 can
>> be vectorized and another cannot, the vectorizable loop won't suffer
>> (i.e. suffer from being not-vectorized).
>
> You've totally mis-understood my point.
>
> Whether or not the compiler creates a clone COULD BE totally up to the
> compiler, based on whether or not vectorization is enabled, whether the
> loop has been analyzed such that vectorization may proceed, or indeed
> the phase of the moon.
>
> But in order for that to happen, the clone must be totally private to
> the module for which we are generating code (in the LTO sense, this is
> the entire program or dll; without LTO, this is just the object file).
> It means that we never attempt to generate clones for functions for
> which the body of the function is not visible.
>
> On the other hand, if you insist on assuming a clone exists merely
> because a declaration bears an attribute, then you must address ALL
> of the problems with respect to defining a stable ABI in the face of
> different cpu revisions, different ISAs, and different vector lengths.
>
> I've not seen you address ANY of these problems, despite having the
> problem pointed out multiple times.

Indeed, if the definition of an elemental function is always visible to the
vectorizer the vectorizer itself can instruct the creation of the clone
if it does not already exist (just make those clones managed by the
callgraph).  Then the clones are visible to the current TU only and no
ABI issues exist (though you could say that the vectorizer or the inliner
could as well force inlining of elemental functions into places it wants to
vectorize - one complication even with local clones is that the x86 ABI
has no callee-saved XMM registers which makes function calls inside
loops especially expensive).

Richard.

>
> r~


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 10:41 AM, Richard Guenther
 wrote:
> On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson  wrote:
>> On 09/10/2012 09:09 AM, Iyer, Balaji V wrote:
>>>> >If that's the case, what's the point in defining an external ABI and
>>>> >defining what
>>>> >__attribute__((vector)) placed on a function declaration means?
>>
>>> When you have __attribute__((vector)) you are asking the compiler to
>>> create a vector AND a scalar version of the function. The advantage
>>> is that if the function is used, for example, in 2 loops where 1 can
>>> be vectorized and another cannot, the vectorizable loop won't suffer
>>> (i.e. suffer from being not-vectorized).
>>
>> You've totally mis-understood my point.
>>
>> Whether or not the compiler creates a clone COULD BE totally up to the
>> compiler, based on whether or not vectorization is enabled, whether the
>> loop has been analyzed such that vectorization may proceed, or indeed
>> the phase of the moon.
>>
>> But in order for that to happen, the clone must be totally private to
>> the module for which we are generating code (in the LTO sense, this is
>> the entire program or dll; without LTO, this is just the object file).
>> It means that we never attempt to generate clones for functions for
>> which the body of the function is not visible.
>>
>> On the other hand, if you insist on assuming a clone exists merely
>> because a declaration bears an attribute, then you must address ALL
>> of the problems with respect to defining a stable ABI in the face of
>> different cpu revisions, different ISAs, and different vector lengths.
>>
>> I've not seen you address ANY of these problems, despite having the
>> problem pointed out multiple times.
>
> Indeed, if the definition of an elemental function is always visible to the
> vectorizer the vectorizer itself can instruct the creation of the clone
> if it does not already exist (just make those clones managed by the
> callgraph).  Then the clones are visible to the current TU only and no
> ABI issues exist (though you could say that the vectorizer or the inliner
> could as well force inlining of elemental functions into places it wants to
> vectorize - one complication even with local clones is that the x86 ABI
> has no callee-saved XMM registers which makes function calls inside
> loops especially expensive).

Btw, this then happily fits into my suggestion that the "elementalness"
can be autodetected by the compiler simply by means of a proper IPA
pass and thus be fully LTO / whole-program aware.  No need for an
attribute (where you'd need to handle the case that the attribute was placed
there by error).

Richard.

> Richard.
>
>>
>> r~


Re: Ping [SH] Define NO_IMPLICIT_EXTERN_C for newlib targets

2012-09-11 Thread Kaz Kojima
Christian Bruel  wrote:
> Any news for my sh-superh-elf --with-newlib patch ?
> 
> http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00137.html

The patch is OK for both 4.7 and 4.8.  Sorry for the delay.

Regards,
kaz



Re: [PATCH] Combine location with block using block_locations

2012-09-11 Thread Richard Guenther
On Mon, Sep 10, 2012 at 5:27 PM, Dehao Chen  wrote:
> On Mon, Sep 10, 2012 at 3:01 AM, Richard Guenther
>  wrote:
>> On Sun, Sep 9, 2012 at 12:26 AM, Dehao Chen  wrote:
>>> Hi, Diego,
>>>
>>> Thanks a lot for the review. I've updated the patch.
>>>
>>> This patch is large and may easily break builds because it reserves
>>> more complete information for TREE_BLOCK as well as gimple_block (which
>>> may trigger bugs that were hidden when this info was unavailable). I've
>>> done more rigorous testing to ensure that most bugs are caught before
>>> checking in.
>>>
>>> * Sync to the head and retest all gcc testsuite.
>>> * Port the patch to google-4_7 branch to retest all gcc testsuite, as
>>> well as build many large applications.
>>>
>>> Through these tests, I've found two additional bugs that were missed
>>> in the original implementation. A new patch is attached (patch.txt) to
>>> fix these problems. After this fix, all gcc testsuites pass for both
>>> trunk and google-4_7 branch. I've also copy-pasted the new fixes
>>> (lto.c and tree-cfg.c) below. Now I'd say this patch is in good shape.
>>> But it may not be perfect. I'll look into build failures as soon as
>>> they arise.
>>>
>>> Richard and Diego, could you help me take a look at the following two fixes?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> New fixes:
>>> --- gcc/lto/lto.c   (revision 191083)
>>> +++ gcc/lto/lto.c   (working copy)
>>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t)
>>>  {
>>>enum tree_code code = TREE_CODE (t);
>>>LTO_NO_PREVAIL (TREE_TYPE (t));
>>> -  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
>>> -LTO_NO_PREVAIL (TREE_CHAIN (t));
>>
>> That change is odd.  Can you show us how it breaks?
>
> This will break LTO build of gcc.c-torture/execute/pr38051.c
>
> There is data structure like:
>
>   union { long int l; char c[sizeof (long int)]; } u;
>
> Once the block info is reserved for this, it'll reserve this data
> structure. And inside this data structure, there is VAR_DECL. Thus
> LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t).

I see - the issue here is that this data structure is not reached at the time
we call free_lang_data (via find_decls_types_r).  But maybe I do not understand
"once the block info is reserved for this".

So the patch papers over an issue elsewhere, I believe.  Maybe Micha can
add some clarification here though, on how BLOCK_VARS should be visible
here.

Richard.

>>
>>>if (DECL_P (t))
>>>  {
>>>LTO_NO_PREVAIL (DECL_NAME (t));
>>>
>>> Index: gcc/tree-cfg.c
>>> ===
>>> --- gcc/tree-cfg.c  (revision 191083)
>>> +++ gcc/tree-cfg.c  (working copy)
>>> @@ -5980,9 +5974,21 @@ move_stmt_op (tree *tp, int *walk_subtrees, void *
>>>tree t = *tp;
>>>
>>>if (EXPR_P (t))
>>> -/* We should never have TREE_BLOCK set on non-statements.  */
>>> -gcc_assert (!TREE_BLOCK (t));
>>> -
>>> +{
>>> +  tree block = TREE_BLOCK (t);
>>> +  if (p->orig_block == NULL_TREE
>>> + || block == p->orig_block
>>> + || block == NULL_TREE)
>>> +   TREE_SET_BLOCK (t, p->new_block);
>>> +#ifdef ENABLE_CHECKING
>>> +  else if (block != p->new_block)
>>> +   {
>>> + while (block && block != p->orig_block)
>>> +   block = BLOCK_SUPERCONTEXT (block);
>>> + gcc_assert (block);
>>> +   }
>>> +#endif
>>
>> I think what this means is that TREE_BLOCK on non-stmts are meaningless
>> (thus only gimple_block is interesting on GIMPLE, not BLOCKs on trees).
>>
>> So instead of setting a BLOCK in some cases you should clear BLOCK
>> if it happens to be set, or alternatively, only re-set it if there was
>> a block associated
>> with it.
>
> Yeah, makes sense. New change:
>
> @@ -5980,9 +5974,10 @@
>tree t = *tp;
>
>if (EXPR_P (t))
> -/* We should never have TREE_BLOCK set on non-statements.  */
> -gcc_assert (!TREE_BLOCK (t));
> -
> +{
> +  if (TREE_BLOCK (t))
> +   TREE_SET_BLOCK (t, p->new_block);
> +}
>else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME)
>  {
>if (TREE_CODE (t) == SSA_NAME)
>
> Thanks,
> Dehao
>
>>
>> Richard.
>>
>>> +}
>>>else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME)
>>>  {
>>>if (TREE_CODE (t) == SSA_NAME)
>>>
>>> Whole patch:
>>> gcc/ChangeLog:
>>> 2012-09-08  Dehao Chen  
>>>
>>> * toplev.c (general_init): Init block_locations.
>>> * tree.c (tree_set_block): New.
>>> (tree_block): Change to use LOCATION_BLOCK.
>>> * tree.h (TREE_SET_BLOCK): New.
>>> * final.c (reemit_insn_block_notes): Change to use LOCATION_BLOCK.
>>> (final_start_function): Likewise.
>>> * input.c (expand_location_1): Likewise.
>>> * input.h (LOCATION_LOCUS): New.
>>> (LOCATION_BLOCK): New.
>>> (IS_UNKNOWN_LOCATION): New.
>>> * fold-const.c (expr_location_or): Change to use new location.
>>> * reorg.c (emit_d

Re: [patch] PR54149: fix data race in LIM pass

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 1:15 AM, Aldy Hernandez  wrote:
> In this failing testcase the LIM pass writes to g_13 regardless of the
> initial value of g_13, which is the test protecting the write.  This causes
> an incorrect store data race wrt both the C++ memory model and transactional
> memory (the latter if the store occurs inside of a transaction).
>
> The problem here is that the ``lsm_flag'' temporary should only be set to
> true on the code paths where we actually set the original global.  As it
> stands, we are setting lsm_flag to true for reads or writes.
>
> Fixed by only setting lsm_flag=1 when the original code path has a write.
>
> Tested on x86-64 Linux.
>
> OK for trunk?

+  /* Only set the flag for writes.  */
+  if (is_gimple_assign (loc->stmt)
+ && gimple_assign_lhs (loc->stmt) == *loc->ref)

ok with

  && gimple_assign_lhs_ptr (loc->stmt) == loc->ref

instead.  Let's hope we conservatively catch all writes to ref this way (which
is what we need, right)?

Thanks,
Richard.
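
As a rough illustration of the fix discussed above, here is a
hand-written model of what a loop might look like after store motion
with a flag.  It is not compiler output, and the names g, g_tmp and
loop_after_store_motion are invented for the sketch (only lsm_flag
comes from the description).  The flag is set only on the path that
actually writes, so when the guarding test is false on entry no store
to the global is introduced.

```c
int g;   /* stand-in for the global guarded by its own value */

/* Model of the transformed loop.  Returns 1 if the global was stored
   back after the loop, 0 if no store happened.  */
int loop_after_store_motion(int n)
{
    int g_tmp = g;        /* load hoisted out of the loop */
    int lsm_flag = 0;     /* becomes 1 only if a write was executed */

    for (int i = 0; i < n; i++) {
        if (g_tmp) {      /* read of the global: must NOT set the flag */
            g_tmp = 0;    /* write of the global: sets the flag */
            lsm_flag = 1;
        }
    }

    /* Store back only if the original code would have stored.  */
    if (lsm_flag)
        g = g_tmp;
    return lsm_flag;
}
```

Setting lsm_flag on the read as well would make the final store
unconditional, which is the introduced store the report describes.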


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Gabriel Dos Reis
On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther
 wrote:

> Btw, this then happily fits into my suggestion that the "elementalness"
> can be autodetected by the compiler simply by means of a proper IPA
> pass and thus be fully LTO / whole-program aware.  No need for an
> attribute (where you'd need to handle the case that the attribute was placed
> there by error).

We are in violent agreement.

-- Gaby


[PATCH,i386] Enable prefetchw in processor alias table for AMD targets

2012-09-11 Thread venkataramanan.kumar
Hi Maintainers,

This patch enables the "prefetchw" ISA in the processor alias table for
targets amdfam10, barcelona, bdver1, bdver2, btver1 and btver2.

GCC regression test passes with the patch.

Ok for trunk?

Change log:

2012-09-11  Venkataramanan Kumar  

* config/i386/i386.c (processor_alias_table): Enable PTA_PRFCHW
for targets amdfam10, barcelona, bdver1, bdver2, btver1 and btver2.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 190345)
+++ gcc/config/i386/i386.c  (working copy)
@@ -3151,31 +3151,33 @@
| PTA_SSE2 | PTA_NO_SAHF},
   {"amdfam10", PROCESSOR_AMDFAM10, CPU_AMDFAM10,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM},
+   | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM 
+   | PTA_PRFCHW},
   {"barcelona", PROCESSOR_AMDFAM10, CPU_AMDFAM10,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM},
+   | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM
+   | PTA_PRFCHW},
   {"bdver1", PROCESSOR_BDVER1, CPU_BDVER1,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
| PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
-   | PTA_XOP | PTA_LWP},
+   | PTA_XOP | PTA_LWP | PTA_PRFCHW},
   {"bdver2", PROCESSOR_BDVER2, CPU_BDVER2,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
| PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
| PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
-   | PTA_FMA},
+   | PTA_FMA | PTA_PRFCHW},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC64,
 PTA_64BIT | PTA_MMX |  PTA_SSE  | PTA_SSE2 | PTA_SSE3
-| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16},
+| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_PRFCHW},
   {"generic32", PROCESSOR_GENERIC32, CPU_PENTIUMPRO,
PTA_HLE /* flags are only used for -march switch.  */ },
   {"btver2", PROCESSOR_BTVER2, CPU_GENERIC64,
PTA_64BIT | PTA_MMX |  PTA_SSE  | PTA_SSE2 | PTA_SSE3
| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_SSE4_1
| PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
-   | PTA_BMI | PTA_F16C | PTA_MOVBE},
+   | PTA_BMI | PTA_F16C | PTA_MOVBE | PTA_PRFCHW},
   {"generic64", PROCESSOR_GENERIC64, CPU_GENERIC64,
PTA_64BIT
 | PTA_HLE /* flags are only used for -march switch.  */ },
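
For context, user code typically reaches write prefetches through
__builtin_prefetch with its second argument set to 1.  The sketch below
is only an example of such a call; the function name scale_in_place and
the prefetch distance of 16 elements are arbitrary choices, and whether
a given compiler version and -march setting actually emits a prefetchw
instruction for it is an assumption that should be checked in the
generated assembly.

```c
#include <stddef.h>

/* Scale an array in place, hinting that upcoming elements will be
   written.  The distance (16 elements) is an arbitrary example value.  */
void scale_in_place(float *a, size_t n, float k)
{
    for (size_t i = 0; i < n; i++) {
        /* rw = 1 requests a prefetch for write; locality = 3 asks to
           keep the data in all cache levels.  The bounds check keeps
           the address computation inside the array.  */
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], 1, 3);
        a[i] *= k;
    }
}
```

The builtin is a hint only, so the function computes the same result
whether or not any prefetch instruction is generated.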



Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Jakub Jelinek
On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote:
> On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther
>  wrote:
> 
> > Btw, this then happily fits into my suggestion that the "elementalness"
> > can be autodetected by the compiler simply by means of a proper IPA
> > pass and thus be fully LTO / whole-program aware.  No need for an
> > attribute (where you'd need to handle the case that the attribute was placed
> > there by error).
> 
> We are in violent agreement.

For locally defined functions sure, the question is if we want the attribute
to be something for external functions.  Something that would have ABI
implications (the external symbol would need to be provided in two forms (or
more?), one scalar with normal mangling, one vector with some other kind
of mangling/suffix/whatever), when compiling the definition of function with
such an attribute the compiler could verify its properties (i.e. autodetect
and if it is not autodetected elemental, complain?), and when using extern
function just rely on it being provided twice.  Even with LTO, the function
can be defined in some other shared library etc.

Nothing says the implementation of the vector version of the elemental
function necessarily has to be vectorized, just that the arguments would need
to be passed in the expected vector registers, similarly for return value.
Say if the elemental function is compiled with -O0, then there could just be
a loop executing the scalar body several times and creating vectors.

Jakub
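
The -O0 case described above, a vector entry point that simply loops
over the scalar body, can be sketched with GCC's generic vector
extension.  The names scale_add and scale_add_v4 and the 4 x float
width are made up for the example; no mangling scheme or ABI had been
settled in this thread, so the vector symbol name here is purely
illustrative.

```c
/* 4 x float vector type using the GCC/Clang vector extension.  */
typedef float v4sf __attribute__((vector_size(16)));

/* The scalar (elemental) body.  */
float scale_add(float x, float y)
{
    return 2.0f * x + y;
}

/* Hypothetical vector variant: arguments and result are vectors, but
   the body is not vectorized; it just runs the scalar function once
   per lane and assembles the result.  */
v4sf scale_add_v4(v4sf x, v4sf y)
{
    v4sf r = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < 4; i++)
        r[i] = scale_add(x[i], y[i]);
    return r;
}
```

A caller in a vectorized loop would only depend on the vector calling
convention of the second function, not on how its body is implemented.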


RE: [PATCH] Enable bbro for -Os

2012-09-11 Thread Zhenqiang Chen
Thank you for the detailed comments.

The updated patch is attached. Is it OK?

Thanks!
-Zhenqiang

> -----Original Message-----
> From: Eric Botcazou [mailto:ebotca...@adacore.com]
> Sent: Tuesday, September 11, 2012 1:01 AM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Enable bbro for -Os
> 
> > All other comments are accepted.
> >
> > The updated patch is attached. Is it OK?
> 
> As you probably gathered, I had missed that Steven and Richard had already
> commented on your patch before posting my message.  Sorry about that...
> 
> I think that the patch is interesting because, even if it doesn't exactly
> implement what the comment in gate_handle_reorder_blocks was talking
> about, it fixes code layout regressions without increasing the code size
> (and even decreasing it).  So, assuming that Steven and Richard don't
> strongly oppose, I think the patch is OK modulo the following nits:
> 
> +   The above description is for the full algorithm, which is used when the
> +   function is optimized for speed.  When the function is optimized for size,
> +   in order to reduce long jump and connect more fall through edges, the
> 
> long jumps... bb-reorder.c uses "fallthru edges" consistently.
> 
> +   algorithm is modified as follows:
> +   (1) Break long trace to short ones.  The trace is broken at a block, which
> +   has multi-predecessors/successors during finding traces.
> 
> long traces... A trace is broken at a block that has multiple predecessors/
> successors during trace discovery.
> 
> +   (2) Ignore the edge probability and frequency for fall through edges.
> 
> fallthru
> 
> +   (3) Keep its original order when there is no chance to fall through.  bbro
> 
> Keep the original order of blocks...  We rely on the results of cfg_cleanup
> 
> +   bases on the result of cfg_cleanup, which does lots of optimizations on cfg.
> +   So the order is expected to be kept if no fall through.
> +
> +   To implement the change for code size optimization, block's index is
> +   selected as the key and all traces are found in one round.
> 
> 
> +   /* If the best destination has multiple successors or predecessors,
> +  don't allow it to be added when optimizing for size.  This makes
> +  sure predecessors with smaller index handled before the best
> +  destination.  It breaks long trace and reduces long jumps.
> missing "are" before "handled"
> 
> 
> +  After removing the best edge, the final result will be ABCD/ACBD.
> +  It does not add jump compared with the previous order. But it
> +  reduce the possibility of long jump.  */
> 
> Double space before "But".
> 
> 
> +  if (optimize_function_for_size_p (cfun))
> +{
> +  e_index = src_index_p ? e->src->index : e->dest->index;
> +  b_index = src_index_p ? cur_best_edge->src->index
> +   : cur_best_edge->dest->index;
> +  /* The smaller one is better to keep the original order.  */
> +  return b_index > e_index;
> +}
> 
> Trailing space after the last parenthesis.
> 
> 
> +   /* If dest has multiple predecessors, skip it.  We expect
> +  that one predecessor with smaller index connect with it
> +  later.  */
> 
> connects
> 
> 
> +   /* Only connect Trace n with Trace n + 1.  It is conservative
> +  to keep the order as close as possible to the original order.
> +  It also helps to reduce long jump.  */
> 
> long jumps
> 
> 
> Thanks for working on this.
> 
> --
> Eric Botcazou
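
The size-optimization branch quoted in the review above (prefer the
edge whose block index is smaller, to stay close to the original block
order) can be isolated into a small stand-alone sketch.  The toy_block
and toy_edge types and the function name are invented for illustration;
only the comparison itself follows the quoted hunk.

```c
#include <stdbool.h>

/* Minimal stand-ins for a basic block and a CFG edge; hypothetical.  */
struct toy_block {
    int index;                   /* position in the original layout */
};

struct toy_edge {
    struct toy_block *src;
    struct toy_block *dest;
};

/* Return true if edge e is preferable to cur_best when optimizing for
   size.  src_index_p selects whether source or destination block
   indices are compared.  The smaller index wins, which keeps blocks
   closer to their original order.  */
bool toy_better_edge_for_size(const struct toy_edge *e,
                              const struct toy_edge *cur_best,
                              bool src_index_p)
{
    int e_index = src_index_p ? e->src->index : e->dest->index;
    int b_index = src_index_p ? cur_best->src->index
                              : cur_best->dest->index;

    return b_index > e_index;
}
```

Note that probability and frequency play no role in this comparison,
which matches item (2) of the quoted comment for fall-through edges.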


Enable-bbro-for-size-updated3.patch
Description: Binary data


Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 9:58 AM, Tobias Burnus  wrote:
> On 09/11/2012 01:52 AM, Diego Novillo wrote:
>>
>> Remove unnecessary VEC function overloads.
>>
>> Several VEC member functions that accept an element 'T' used to have
>> two overloads: one taking 'T', the second taking 'T *'.
>
>
> They might be unnecessary,  but with your patch bootstrapping fails here
> with the following failure.
>
> Did you test with or without Graphite?

Fixed with the attached.

Richard.

> Tobias
>
>
> /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c: In function
> ‘void move_sd_regions(vec_t**, vec_t**)’:
> /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: error: no matching function
> for call to ‘vec_t::safe_push(vec_t**,
> sd_region*&, const char [61], int, const char [16])’
>   (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
>^
> /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c:146:5: note: in
> expansion of macro 'VEC_safe_push'
>  VEC_safe_push (sd_region, heap, *target, s);
>  ^
> /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: note: candidate is:
>   (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
>


p
Description: Binary data


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 11:06 AM, Jakub Jelinek  wrote:
> On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote:
>> On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther
>>  wrote:
>>
>> > Btw, this then happily fits into my suggestion that the "elementalness"
>> > can be autodetected by the compiler simply by means of a proper IPA
>> > pass and thus be fully LTO / whole-program aware.  No need for an
>> > attribute (where you'd need to handle the case that the attribute was 
>> > placed
>> > there by error).
>>
>> We are in violent agreement.
>
> For locally defined functions sure, the question is if we want the attribute
> to be something for external functions.  Something that would have ABI
> implications (the external symbol would need to be provided in two forms (or
> more?), one scalar with normal mangling, one vector with some other kind
> of mangling/suffix/whatever), when compiling the definition of function with
> such an attribute the compiler could verify its properties (i.e. autodetect
> and if it is not autodetected elemental, complain?), and when using extern
> function just rely on it being provided twice.  Even with LTO, the function
> can be defined in some other shared library etc.
>
> Nothing says the implementation of the vector version of the elemental
> function necessary has to be vectorized, just that the arguments would need
> to be passed in the expected vector registers, similarly for return value.
> Say if the elemental function is compiled with -O0, then there could just be
> a loop executing the scalar body several times and creating vectors.

Sure.  And the "versioning" can happen from the C frontend then.  Of course
this one has the requirement of documenting the ABI.

Richard.

> Jakub


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Gabriel Dos Reis
On Tue, Sep 11, 2012 at 4:06 AM, Jakub Jelinek  wrote:
> On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote:
>> On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther
>>  wrote:
>>
>> > Btw, this then happily fits into my suggestion that the "elementalness"
>> > can be autodetected by the compiler simply by means of a proper IPA
>> > pass and thus be fully LTO / whole-program aware.  No need for an
>> > attribute (where you'd need to handle the case that the attribute was
>> > placed there by error).
>>
>> We are in violent agreement.
>
> For locally defined functions sure, the question is if we want the attribute
> to be something for external functions.  Something that would have ABI
> implications (the external symbol would need to be provided in two forms (or
> more?), one scalar with normal mangling, one vector with some other kind
> of mangling/suffix/whatever), when compiling the definition of function with
> such an attribute the compiler could verify its properties (i.e. autodetect
> and if it is not autodetected elemental, complain?), and when using extern
> function just rely on it being provided twice.  Even with LTO, the function
> can be defined in some other shared library etc.
>
> Nothing says the implementation of the vector version of the elemental
> function necessary has to be vectorized, just that the arguments would need
> to be passed in the expected vector registers, similarly for return value.
> Say if the elemental function is compiled with -O0, then there could just be
> a loop executing the scalar body several times and creating vectors.
>

As it was pointed out earlier (by Marc?), there is also an issue of overload
resolution if these automatically synthetized functions have to be something
visible, which of course entails the whole ABI issues.  This is really a
language design issue, not just compiler implementation.  If the synthetized
functions do not need to have the same status as real functions (hence no
need for attributes), then these issues evaporate.

-- Gaby


Re: [PATCH] PowerPC VLE port

2012-09-11 Thread Segher Boessenkool

2012-09-10  Maciej W. Rozycki  

gcc/
* config/rs6000/rs6000.c (print_operand) <'c'>: Remove.
* config/rs6000/spe.md: Remove a leftover comment.


Okay.


This patch wasn't sent to gcc-patches -- can we see it please?


Segher



Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)

2012-09-11 Thread Dominique Dhumieres
> Fixed with the attached.

Followed by the same failure on darwin. Fixed with

--- ../_clean/gcc/config/darwin.c   2012-07-09 22:06:21.0 +0200
+++ ../p_work/gcc/config/darwin.c   2012-09-11 11:53:02.0 +0200
@@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na
  the assumption of how this is done.  */
   if (lto_section_names == NULL)
 lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16);
-  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e);
+  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e);
}
   else if (strncmp (name, "__DWARF,", 8) == 0)
 darwin_asm_dwarf_section (name, flags, decl);
@@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na
   fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname);
   e.count = 1;
   e.name = xstrdup (sname);
-  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e);
+  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e);
 }
 }
 
(now at stage 2).
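
The two hunks above only drop the '&': after the overload cleanup the element
is handed over by value instead of by address. A minimal sketch of that
by-value convention follows. It uses a hypothetical growable array
(entry_vec and entry_vec_push are illustration names, not the vec.h API) and
assumes the element is simply copied into the container.

```c
#include <stdlib.h>

/* Hypothetical element type, loosely modelled on a (name, count) entry.  */
struct entry { const char *name; unsigned count; };

/* Hypothetical growable array; NOT the real GCC VEC implementation.  */
struct entry_vec { struct entry *data; size_t len, cap; };

/* Push E by value: callers write push (&v, e), not push (&v, &e).
   Returns 0 on success, -1 if memory could not be obtained.  */
static int
entry_vec_push (struct entry_vec *v, struct entry e)
{
  if (v->len == v->cap)
    {
      size_t ncap = v->cap ? 2 * v->cap : 4;
      struct entry *p = realloc (v->data, ncap * sizeof *p);
      if (p == NULL)
        return -1;
      v->data = p;
      v->cap = ncap;
    }
  v->data[v->len++] = e;   /* The element is copied into the array.  */
  return 0;
}
```

Because the element is copied, later changes to the caller's local 'e' do not
affect what was pushed.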

TIA

Dominique


Re: [Patch ARM] implement bswap16

2012-09-11 Thread Christophe Lyon
On 10 September 2012 19:30, Richard Earnshaw  wrote:
> On 10/09/12 16:40, Christophe Lyon wrote:
>> Why do we have to keep room for the predicate here? (%?) Doesn't this
>> pattern match only in unconditional cases?
>>
>
> Because the ARM back-end has a very late conditionalizer pass that can
> also generate conditional execution.  It very rarely kicks in these
> days, but if the predication rules are in there you could end up with an
> instruction that the compiler thought was conditionally executed being
> always run.  That would be bad^TM.
>

Thanks for the clarification.

>> BTW, I didn't manage to have GCC generate conditional revsh. I merely
>> added an "if (y)" guard before calling builtin_bswap16, but this
>> didn't turn into a conditional revsh.
>>
On this topic, could you suggest a way to generate conditional revsh?

I would like to augment the testsuite for this, and I tried:

int y;
short swaps16(short x) {
  if (y)
  return __builtin_bswap16(x);
}
but it generates:
swaps16:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        movw    r3, #:lower16:y     @ 50  *arm_movsi_vfp/4   [length = 4]
        movt    r3, #:upper16:y     @ 51  *arm_movt          [length = 4]
        ldr     r3, [r3]            @ 7   *arm_movsi_vfp/5   [length = 4]
        cmp     r3, #0              @ 8   *arm_cmpsi_insn/3  [length = 4]
        beq     .L3                 @ 9   arm_cond_branch    [length = 4]
        revsh   r0, r0              @ 13  *arm_revsh/3       [length = 4]
        bx      lr                  @ 56  *arm_return        [length = 12]
.L3:
        bx      lr                  @ 58  *arm_return        [length = 12]

ie unconditional revsh.


Another question regarding the *arm_revsh pattern you wrote: why is
the "arch" set to "t1,t2,32" ?  Shouldn't it be "t1,t2,a" ?
(IIUC, "32" matches both "a" and "t2" as per the definition of TARGET_32BIT)

Thanks

Christophe.


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Marc Glisse

On Tue, 11 Sep 2012, Richard Guenther wrote:


On Tue, Sep 11, 2012 at 10:41 AM, Richard Guenther
 wrote:

On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson  wrote:

Whether or not the compiler creates a clone COULD BE totally up to the
compiler, based on whether or not vectorization is enabled, whether the
loop has been analyzed such that vectorization may proceed, or indeed
the phase of the moon.

But in order for that to happen, the clone must be totally private to
the module for which we are generating code (in the LTO sense, this is
the entire program or dll; without LTO, this is just the object file).
It means that we never attempt to generate clones for functions for
which the body of the function is not visible.

On the other hand, if you insist on assuming a clone exists merely
because a declaration bears an attribute, then you must address ALL
of the problems with respect to defining a stable ABI in the face of
different cpu revisions, different ISAs, and different vector lengths.

I've not seen you address ANY of these problems, despite having the
problem pointed out multiple times.


Indeed, if the definition of an elemental function is always visible to the
vectorizer the vectorizer itself can instruct the creation of the clone
if it does not already exist (just make those clones managed by the
callgraph).  Then the clones are visible to the current TU only and no
ABI issues exist (though you could say that the vectorizer or the inliner
could as well force inlining of elemental functions into places it wants to
vectorize - one complication even with local clones is that the x86 ABI
has no callee-saved XMM registers which makes function calls inside
loops especially expensive).


I thought gcc wouldn't use the x86 ABI for those private calls. I guess 
what I remember were vague discussions and not a description of the 
current status...



Btw, this then happily fits into my suggestion that the "elementalness"
can be autodetected by the compiler simply by means of a proper IPA
pass and thus be fully LTO / whole-program aware.  No need for an
attribute (where you'd need to handle the case that the attribute was placed
there by error).


Note that, apart from preventing external calls, it removes this use case:

__attribute__((vector(4))) double mysqrt(double x){return sqrt(x);}

__m256d var;
mysqrt(var);

I am not sure it is the best way to achieve this, but it is one way. I am 
also planning a patch to turn {sqrt(a),sqrt(b)} into sqrt({a,b}) when the 
target likes it. And there is a PR asking for a __builtin_math_sqrt.


--
Marc Glisse


[PATCH] Fix PR54534

2012-09-11 Thread Richard Guenther

The backport of the patch for PR53572 caused us to remove unused
decls at -O0, a regression on the branch - fixed by the following.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-09-11  Richard Guenther  

PR debug/54534
* cgraph.h (varpool_can_remove_if_no_refs): Restore dependence
on flag_toplevel_reorder.

Index: gcc/cgraph.h
===
--- gcc/cgraph.h(revision 191174)
+++ gcc/cgraph.h(working copy)
@@ -951,7 +951,7 @@ varpool_can_remove_if_no_refs (struct va
   return (!node->force_output && !node->used_from_other_partition
  && ((DECL_COMDAT (node->decl)
   && !varpool_used_from_object_file_p (node))
- || !node->externally_visible
+ || (flag_toplevel_reorder && !node->externally_visible)
  || DECL_HAS_VALUE_EXPR_P (node->decl)));
 }
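
The effect of the restored condition can be modelled in isolation. The sketch
below is a hedged stand-in: struct var_info and its fields are hypothetical,
flattened versions of what the real predicate reads, and it assumes (as the
message implies) that flag_toplevel_reorder is off at -O0.

```c
#include <stdbool.h>

/* Hypothetical, flattened stand-in for the fields the predicate reads.  */
struct var_info
{
  bool force_output;
  bool used_from_other_partition;
  bool comdat_unused_by_object_files; /* DECL_COMDAT && !used from object file */
  bool externally_visible;
  bool has_value_expr;
};

/* Same shape as the patched condition: a variable that is not externally
   visible is removable only when top-level reordering is enabled.  */
static bool
can_remove_if_no_refs (const struct var_info *v, bool toplevel_reorder)
{
  return (!v->force_output && !v->used_from_other_partition
          && (v->comdat_unused_by_object_files
              || (toplevel_reorder && !v->externally_visible)
              || v->has_value_expr));
}
```

With toplevel_reorder false, an unused file-local variable is kept, which is
the behaviour the PR asks for at -O0.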
 


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Jakub Jelinek
On Tue, Sep 11, 2012 at 12:29:10PM +0200, Marc Glisse wrote:
> >Btw, this then happily fits into my suggestion that the "elementalness"
> >can be autodetected by the compiler simply by means of a proper IPA
> >pass and thus be fully LTO / whole-program aware.  No need for an
> >attribute (where you'd need to handle the case that the attribute was placed
> >there by error).
> 
> Note that, apart from preventing external calls, it removes this use case:
> 
> __attribute__((vector(4))) double mysqrt(double x){return sqrt(x);}
> 
> __m256d var;
> mysqrt(var);

I don't think those functions should be available for C++ overloading.
For one, it would be only for C++, not for C, and how would you handle
the case where the user already provides __m256d mysqrt(__m256d); overload
in addition to the one with vector attribute?
I'd say the compiler should when beneficial synthetize calls to those
in SLP or normal vectorizer instead, so you'd write:
  (__m256d){mysqrt(var[0]),mysqrt(var[1]),mysqrt(var[2]),mysqrt(var[3])};
instead of mysqrt(var); and the compiler would turn that into
  mysqrt.elem.V4DF(var)
(or whatever the mangling of the elemental functions would be).
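
As noted earlier in the thread, the vector variant of an elemental function
does not have to be vectorized itself; it can just run the scalar body once
per lane. A rough sketch under stated assumptions: struct v4df is a portable
stand-in for a real vector type such as __m256d, and halve / halve_v4df are
made-up names (the thread leaves the actual mangling open).

```c
/* Hypothetical 4-lane vector of doubles; a plain struct stands in for a
   real vector type so the sketch stays portable.  */
struct v4df { double lane[4]; };

/* Scalar elemental function (illustrative body only).  */
static double
halve (double x)
{
  return x * 0.5;
}

/* Unvectorized "vector variant": loop over the lanes, call the scalar
   body for each one, and build the result vector.  */
static struct v4df
halve_v4df (struct v4df v)
{
  struct v4df r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = halve (v.lane[i]);
  return r;
}
```

A caller written as {halve(v[0]), halve(v[1]), halve(v[2]), halve(v[3])}
computes the same values, which is why the compiler could replace that form
with one call to the vector variant when it is beneficial.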

Jakub


Re: [Patch ARM] implement bswap16

2012-09-11 Thread Richard Earnshaw
On 11/09/12 11:25, Christophe Lyon wrote:
> On 10 September 2012 19:30, Richard Earnshaw  wrote:
>> On 10/09/12 16:40, Christophe Lyon wrote:
>>> Why do we have to keep room for the predicate here? (%?) Doesn't this
>>> pattern match only in unconditional cases?
>>>
>>
>> Because the ARM back-end has a very late conditionalizer pass that can
>> also generate conditional execution.  It very rarely kicks in these
>> days, but if the predication rules are in there you could end up with an
>> instruction that the compiler thought was conditionally executed being
>> always run.  That would be bad^TM.
>>
> 
> Thanks for the clarification.
> 
>>> BTW, I didn't manage to have GCC generate conditional revsh. I merely
>>> added an "if (y)" guard before calling builtin_bswap16, but this
>>> didn't turn into a conditional revsh.
>>>
> On this topic, could you suggest a way to generate conditional revsh?
> 
> I would like to augment the testsuite for this, and I tried:
> 
> int y;
> short swaps16(short x) {
>   if (y)
>   return __builtin_bswap16(x);
> }
> but it generates:
> swaps16:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         movw    r3, #:lower16:y     @ 50  *arm_movsi_vfp/4   [length = 4]
>         movt    r3, #:upper16:y     @ 51  *arm_movt          [length = 4]
>         ldr     r3, [r3]            @ 7   *arm_movsi_vfp/5   [length = 4]
>         cmp     r3, #0              @ 8   *arm_cmpsi_insn/3  [length = 4]
>         beq     .L3                 @ 9   arm_cond_branch    [length = 4]
>         revsh   r0, r0              @ 13  *arm_revsh/3       [length = 4]
>         bx      lr                  @ 56  *arm_return        [length = 12]
> .L3:
>         bx      lr                  @ 58  *arm_return        [length = 12]
> 
> ie unconditional revsh.
> 
> 
> Another question regarding the *arm_revsh pattern you wrote: why is
> the "arch" set to "t1,t2,32" ?  Shouldn't it be "t1,t2,a" ?
> (IIUC, "32" matches both "a" and "t2" as per the definition of TARGET_32BIT)
> 
> Thanks
> 
> Christophe.
> 

Try something like:

short foo(int);

short swaps (short x, int y)
{
  int z = x;
  if (y)
z = __builtin_bswap16(x);
  return foo (z);
}

If that's not enough, try adding 1 to z before calling foo.
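
For checking the output of such a test, it may help to have a reference for
what the instruction should produce. The sketch below is a portable model,
not the builtin itself: revsh_model is a made-up name, and it assumes REVSH
swaps the two bytes of the low halfword and sign-extends the result to 32
bits.

```c
#include <stdint.h>

/* Portable model of a 16-bit byte swap followed by sign extension to a
   32-bit value.  */
static int32_t
revsh_model (int16_t x)
{
  uint16_t u = (uint16_t) x;
  uint16_t swapped = (uint16_t) ((u << 8) | (u >> 8));

  /* Sign-extend by hand to avoid an implementation-defined conversion.  */
  return (int32_t) swapped - ((swapped & 0x8000) ? 0x10000 : 0);
}
```

For instance, 0x1234 becomes 0x3412, while 0x00ff becomes 0xff00, which
reads as -256 once sign-extended.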

R.




Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Marc Glisse

On Tue, 11 Sep 2012, Jakub Jelinek wrote:


On Tue, Sep 11, 2012 at 12:29:10PM +0200, Marc Glisse wrote:

Note that, apart from preventing external calls, it removes this use case:

__attribute__((vector(4))) double mysqrt(double x){return sqrt(x);}

__m256d var;
mysqrt(var);


I don't think those functions should be available for C++ overloading.


The current patch does make them available, according to their author.


For one, it would be only for C++, not for C, and how would you handle
the case where the user already provides __m256d mysqrt(__m256d); overload
in addition to the one with vector attribute?


The same way you handle it when the user provides 2 identical overloads.


I'd say the compiler should when beneficial synthetize calls to those
in SLP or normal vectorizer instead, so you'd write:
 (__m256d){mysqrt(var[0]),mysqrt(var[1]),mysqrt(var[2]),mysqrt(var[3])};
instead of mysqrt(var); and the compiler would turn that into
 mysqrt.elem.V4DF(var)
(or whatever the mangling of the elemental functions would be).


Ok.

--
Marc Glisse


Recognize vec_perm_expr in a constructor of bit_field_ref

2012-09-11 Thread Marc Glisse

Hello,

here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if 
the target is ok with it.


I am attaching 2 versions of the patch. p-good is the one that passes 
testing. p-bad, where I rely on fold_stmt to detect identity permutations, 
ICEs towards the end of the pass while checking a bogus gimple stmt (one 
that gimple_debug_stmt crashes on if I call it in gdb). From a performance 
point of view, p-good makes sense, but I liked the simplicity of p-bad and 
I am confused as to why it fails.
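
To make the terms concrete: the constructor is described by a selector array,
and an identity selector means the constructor is just a copy of the source
vector. The following is only a scalar model of that idea, with made-up
helper names (permute, selector_is_identity); it is not the code from either
patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Single-input permutation: dst[i] = src[sel[i]].  With sel = {1, 0}
   this is the {v[1], v[0]} case.  */
static void
permute (const double *src, const unsigned char *sel, size_t nelts,
         double *dst)
{
  for (size_t i = 0; i < nelts; i++)
    dst[i] = src[sel[i]];
}

/* True when SEL is {0, 1, ..., nelts - 1}, i.e. the permutation is a
   plain copy and no permute operation is needed at all.  */
static bool
selector_is_identity (const unsigned char *sel, size_t nelts)
{
  for (size_t i = 0; i < nelts; i++)
    if (sel[i] != i)
      return false;
  return true;
}
```

Detecting the identity case while the selector is being built avoids creating
a permutation only to fold it away afterwards.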


2012-09-11  Marc Glisse  

gcc/
* tree-ssa-forwprop.c (simplify_vector_constructor): New function.
(ssa_forward_propagate_and_combine): Call it.

gcc/testsuite/
* gcc.dg/tree-ssa/forwprop-22.c: New testcase.

--
Marc Glisse

Index: Makefile.in
===
--- Makefile.in (revision 191173)
+++ Makefile.in (working copy)
@@ -2237,21 +2237,22 @@ tree-outof-ssa.o : tree-outof-ssa.c $(TR
$(TREE_H) $(DIAGNOSTIC_H) $(TM_H) coretypes.h dumpfile.h \
$(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \
$(EXPR_H) $(SSAEXPAND_H) $(GIMPLE_PRETTY_PRINT_H)
 tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h $(FLAGS_H) \
$(GIMPLE_PRETTY_PRINT_H) langhooks.h
 tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
-   langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H)
+   langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \
+   $(TREE_VECTORIZER_H)
 tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
langhooks.h $(FLAGS_H) $(GIMPLE_PRETTY_PRINT_H)
 tree-ssa-ifcombine.o : tree-ssa-ifcombine.c $(CONFIG_H) $(SYSTEM_H) \
coretypes.h $(TM_H) $(TREE_H) $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
$(TREE_PRETTY_PRINT_H)
 tree-ssa-phiopt.o : tree-ssa-phiopt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
Index: testsuite/gcc.dg/tree-ssa/forwprop-22.c
===
--- testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0)
+++ testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-require-effective-target vect_perm } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+typedef double vec __attribute__((vector_size (2 * sizeof (double))));
+void f (vec *px, vec *y, vec *z)
+{
+  vec x = *px;
+  vec t1 = { x[1], x[0] };
+  vec t2 = { x[0], x[1] };
+  *y = t1;
+  *z = t2;
+}
+
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/gcc.dg/tree-ssa/forwprop-22.c
___
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: tree-ssa-forwprop.c
===
--- tree-ssa-forwprop.c (revision 191173)
+++ tree-ssa-forwprop.c (working copy)
@@ -26,20 +26,21 @@ along with GCC; see the file COPYING3.
 #include "tm_p.h"
 #include "basic-block.h"
 #include "gimple-pretty-print.h"
 #include "tree-flow.h"
 #include "tree-pass.h"
 #include "langhooks.h"
 #include "flags.h"
 #include "gimple.h"
 #include "expr.h"
 #include "cfgloop.h"
+#include "tree-vectorizer.h"
 
 /* This pass propagates the RHS of assignment statements into use
sites of the LHS of the assignment.  It's basically a specialized
form of tree combination.   It is hoped all of this can disappear
when we have a generalized tree combiner.
 
One class of common cases we handle is forward propagating a single use
variable into a COND_EXPR.
 
  bb0:
@@ -2787,20 +2788,105 @@ simplify_permutation (gimple_stmt_iterat
   if (TREE_CODE (op0) == SSA_NAME)
ret = remove_prop_source_from_use (op0);
   if (op0 != op1 && TREE_CODE (op1) == SSA_NAME)
ret |= remove_prop_source_from_use (op1);
   return ret ? 2 : 1;
 }
 
   return 0;
 }
 
+/* Recognize a VEC_PERM_EXPR.  Returns true if there were any changes.  */
+
+static bool
+simplify_vector_constructor (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  gimple def_stmt;
+  tree op, op2, orig, type, elem_type;
+  unsigned elem_size, nelts, i;
+  enum tree_code code;
+  constructor_elt *elt;
+  unsigned char *sel;
+  bool maybe_ident;
+
+  gcc_checking_assert (gimple_assign_rhs_c

Re: [PATCH] PowerPC VLE port

2012-09-11 Thread Maciej W. Rozycki
On Tue, 11 Sep 2012, Segher Boessenkool wrote:

> > > 2012-09-10  Maciej W. Rozycki  
> > > 
> > > gcc/
> > > * config/rs6000/rs6000.c (print_operand) <'c'>: Remove.
> > > * config/rs6000/spe.md: Remove a leftover comment.
> > 
> > Okay.
> 
> This patch wasn't sent to gcc-patches -- can we see it please?

 Umm, I didn't notice a cc to gcc-patches was removed in the course of 
discussion, sorry about that.  Here's the change concerned.

  Maciej

gcc-powerpc-print-operand-c.patch
Index: gcc/config/rs6000/spe.md
===
--- gcc/config/rs6000/spe.md(revision 191161)
+++ gcc/config/rs6000/spe.md(working copy)
@@ -2945,8 +2945,6 @@
   "mfspefscr %0"
   [(set_attr "type" "vecsimple")])
 
-;; FP comparison stuff.
-
 ;; Flip the GT bit.
 (define_insn "e500_flip_gt_bit"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 191161)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -14659,14 +14659,6 @@ print_operand (FILE *file, rtx x, int co
   /* %c is output_addr_const if a CONSTANT_ADDRESS_P, otherwise
 output_operand.  */
 
-case 'c':
-  /* X is a CR register.  Print the number of the GT bit of the CR.  */
-  if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x)))
-   output_operand_lossage ("invalid %%c value");
-  else
-   fprintf (file, "%d", 4 * (REGNO (x) - CR0_REGNO) + 1);
-  return;
-
 case 'D':
   /* Like 'J' but get to the GT bit only.  */
   gcc_assert (REG_P (x));
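
For reference, the arithmetic in the removed 'c' case can be stated on its
own. Each condition-register field holds four bits in the order LT, GT, EQ,
SO, so the GT bit of field n is bit 4n + 1. The helper below is a
hypothetical illustration (cr_gt_bit is not a GCC function) that takes the
field number directly instead of an rtx.

```c
/* Bit number of the GT bit of condition-register field CR_FIELD (0..7),
   matching the removed expression 4 * (REGNO (x) - CR0_REGNO) + 1.  */
static int
cr_gt_bit (int cr_field)
{
  return 4 * cr_field + 1;
}
```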


Re: vector comparisons in C++

2012-09-11 Thread Marc Glisse

Any comment?
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02098.html

Maybe separately on the technical and political aspects?


On Sat, 1 Sep 2012, Marc Glisse wrote:


With the patch...

On Sat, 1 Sep 2012, Marc Glisse wrote:


Hello,

this patch copies some more vector extensions from the C front-end to the 
C++ front-end. There seemed to be some reluctance to add those, but I guess 
a patch is the best way to ask. Note that I only added the vector x vector 
operations, not the vector x scalar ones.


I have some issues with the vector-compare-2.c torture test. It passes a 
vector by value (argument and return type), which is likely to warn 
(although for some reason it doesn't for me, with today's compiler). And it 
takes -Wno-psabi through a .x file, but those are not read in c-c++-common, 
so I put it in dg-options. I would have changed the function to use 
pointers, but I don't know if it specifically wants to test passing by 
value...




2012-08-31  Marc Glisse  
PR c++/54427

cp/ChangeLog
* typeck.c (cp_build_binary_op) [LSHIFT_EXPR, RSHIFT_EXPR, EQ_EXPR,
NE_EXPR, LE_EXPR, GE_EXPR, LT_EXPR, GT_EXPR]: Handle VECTOR_TYPE.

testsuite/ChangeLog
* gcc.dg/vector-shift.c: Move ...
* c-c++-common/vector-shift.c: ... here.
* gcc.dg/vector-shift1.c: Move ...
* c-c++-common/vector-shift1.c: ... here.
* gcc.dg/vector-shift3.c: Move ...
* c-c++-common/vector-shift3.c: ... here.
* gcc.dg/vector-compare-1.c: Move ...
* c-c++-common/vector-compare-1.c: ... here.
* gcc.dg/vector-compare-2.c: Move ...
* c-c++-common/vector-compare-2.c: ... here.
* gcc.c-torture/execute/vector-compare-1.c: Move ...
* c-c++-common/torture/vector-compare-1.c: ... here.
* gcc.c-torture/execute/vector-compare-2.x: Delete.
* gcc.c-torture/execute/vector-compare-2.c: Move ...
* c-c++-common/torture/vector-compare-2.c: ... here.
* gcc.c-torture/execute/vector-shift.c: Move ...
* c-c++-common/torture/vector-shift.c: ... here.
* gcc.c-torture/execute/vector-shift2.c: Move ...
* c-c++-common/torture/vector-shift2.c: ... here.
* gcc.c-torture/execute/vector-subscript-1.c: Move ...
* c-c++-common/torture/vector-subscript-1.c: ... here.
* gcc.c-torture/execute/vector-subscript-2.c: Move ...
* c-c++-common/torture/vector-subscript-2.c: ... here.
* gcc.c-torture/execute/vector-subscript-3.c: Move ...
* c-c++-common/torture/vector-subscript-3.c: ... here.


--
Marc Glisse


Re: [i386] recognize haddpd

2012-09-11 Thread Marc Glisse

Hello,

any advice?
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00044.html


On Sun, 2 Sep 2012, Marc Glisse wrote:


Hello,

this patch passes bootstrap+testsuite. It is probably wrong in many ways, but 
I don't know enough to do more without some advice.


The goal is to recognize that v[0]+v[1] can be computed with haddpd. With the 
patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, thanks to 
it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd.
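
As a reminder of what the two instructions compute, here is a scalar model.
It is a sketch only: struct v2df is a portable stand-in for a two-lane vector
of doubles, and hadd_model / hsub_model are made-up names.

```c
/* Hypothetical two-lane vector of doubles.  */
struct v2df { double lane[2]; };

/* Model of haddpd: lane 0 is the horizontal sum of V, lane 1 the
   horizontal sum of W.  */
static struct v2df
hadd_model (struct v2df v, struct v2df w)
{
  struct v2df r = { { v.lane[0] + v.lane[1], w.lane[0] + w.lane[1] } };
  return r;
}

/* Model of hsubpd: lane 0 is v[0] - v[1], lane 1 is w[0] - w[1].  */
static struct v2df
hsub_model (struct v2df v, struct v2df w)
{
  struct v2df r = { { v.lane[0] - v.lane[1], w.lane[0] - w.lane[1] } };
  return r;
}
```

The scalar case v[0]+v[1] is then lane 0 of hadd_model (v, v), and
{v[0]-v[1], w[0]-w[1]} is hsub_model (v, w) as a whole.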


1) Is a define_insn the right tool?
2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because vec_duplicate 
doesn't match vec_concat. Do we really need to duplicate (no pun intended) 
the pattern?
3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing 
tries the reverse order. I can see 3 ways: canonicalize the order at some 
point, let combine try both orders for commutative operators or make the 
patterns more flexible (I don't know how many would need changing).
4) I don't understand the set_attr part. I copied it from the haddpd 
define_insn, and removed (set_attr "type" "sseadd") because it crashed the 
compiler. isa and prefix make sense and they match the alternatives, but I am 
not sure about "mode" (removing it still works IIRC).



2012-09-02  Marc Glisse  

gcc/
* config/i386/sse.md (*sse3_hv2df3_low): New.

gcc/testsuite/
* gcc.target/i386/pr54400.c: New testcase.


--
Marc Glisse


Re: Bootstrap fails

2012-09-11 Thread Diego Novillo

On 2012-09-11 03:58 , Tobias Burnus wrote:


Did you test with or without Graphite?


I tested with and without release checking, all languages and all 
targets that use VEC.  So many combinations... how is graphite enabled?



Diego.


Re: Bootstrap fails

2012-09-11 Thread Diego Novillo

On 2012-09-11 05:35 , Richard Guenther wrote:

On Tue, Sep 11, 2012 at 9:58 AM, Tobias Burnus  wrote:

On 09/11/2012 01:52 AM, Diego Novillo wrote:


Remove unnecessary VEC function overloads.

Several VEC member functions that accept an element 'T' used to have
two overloads: one taking 'T', the second taking 'T *'.



They might be unnecessary,  but with your patch bootstrapping fails here
with the following failure.

Did you test with or without Graphite?


Fixed with the attached.


Thanks!


Diego.


Re: Bootstrap fails

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 1:41 PM, Diego Novillo  wrote:
> On 2012-09-11 03:58 , Tobias Burnus wrote:
>
>> Did you test with or without Graphite?
>
>
> I tested with and without release checking, all languages and all targets
> that use VEC.  So many combinations... how is graphite enabled?

By having its prerequisites available (cloog and isl).

Richard.

>
> Diego.


Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)

2012-09-11 Thread Diego Novillo

On 2012-09-11 06:12 , Dominique Dhumieres wrote:

Fixed with the attached.


Followed by the same failure on darwin. Fixed with

--- ../_clean/gcc/config/darwin.c   2012-07-09 22:06:21.0 +0200
+++ ../p_work/gcc/config/darwin.c   2012-09-11 11:53:02.0 +0200
@@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na
   the assumption of how this is done.  */
if (lto_section_names == NULL)
  lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16);
-  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e);
+  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e);
 }
else if (strncmp (name, "__DWARF,", 8) == 0)
  darwin_asm_dwarf_section (name, flags, decl);
@@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na
fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname);
e.count = 1;
e.name = xstrdup (sname);
-  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e);
+  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e);
  }
  }


Gah, my grep did not include config/*.c.

This is ok, of course.


Diego.



Re: Bootstrap fails

2012-09-11 Thread Tobias Burnus

On 09/11/2012 01:41 PM, Diego Novillo wrote:

On 2012-09-11 03:58 , Tobias Burnus wrote:


Did you test with or without Graphite?


I tested with and without release checking, all languages and all 
targets that use VEC.  So many combinations...


There is unfortunately always an N+1 configuration which one hasn't 
tested ...




how is graphite enabled?


I think it is automatically enabled when the libraries are found; at 
least I didn't specify anything special and just see the following 
configure output:


checking for version 0.10 of ISL... yes
checking for version 0.17.0 of CLooG... yes

Consequently (cf. toplevel configure):
# Treat either --without-cloog or --without-isl as a request to disable
# GRAPHITE support and skip all following checks.

If you don't have them in the default tree: See 
http://gcc.gnu.org/install/prerequisites.html and 
http://gcc.gnu.org/install/configure.html  Both also build in tree.



Tobias

PS: Thanks for the clean up patch.


Re: Remove unnecessary VEC function overloads.

2012-09-11 Thread Diego Novillo

On 2012-09-11 01:01 , Ian Lance Taylor wrote:

On Mon, Sep 10, 2012 at 4:52 PM, Diego Novillo  wrote:


Ian, could you commit the changes in go/gofrontend?


Done.  Actually, it looks like you already committed them, but I
brought the master repo up to date.


Yes, sorry.  I'm not quite sure how to deal with Go patches, in general. 
 Had I not committed the patch, then Go would've been broken.


Is it OK if these patches get committed to GCC trunk?  I have at least 2 
or 3 more of this kind in the queue.  Or do you prefer to have the 
master repo update first? (in which case, trunk will be broken for a 
little while).



Diego.


Re: [patch] PR54149: fix data race in LIM pass

2012-09-11 Thread Aldy Hernandez



ok with

   && gimple_assign_lhs_ptr (loc->stmt) == loc->ref

instead.  Let's hope we conservatively catch all writes to ref this way (which
is what we need, right)?


Yes.

Thanks.  Committing the attached patch.


PR middle-end/54149
* tree-ssa-loop-im.c (execute_sm_if_changed_flag_set): Only set
flag for writes.

diff --git a/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c 
b/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c
new file mode 100644
index 000..59f81b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c
@@ -0,0 +1,54 @@
+/* { dg-do link } */
+/* { dg-options "--param allow-store-data-races=0" } */
+/* { dg-final { simulate-thread } } */
+
+#include 
+#include 
+
+#include "simulate-thread.h"
+
+/* PR 54139 */
+/* Test that speculative stores do not happen for --param
+   allow-store-data-races=0.  */
+
+int g_13=1, insns=1;
+
+__attribute__((noinline))
+void simulate_thread_main()
+{
+  int l_245;
+
+  /* Since g_13 is unilaterally set positive above, there should be
+ no store to g_13 below.  */
+  for (l_245 = 0; l_245 <= 1; l_245 += 1)
+for (; g_13 <= 0; g_13 = 1)
+  ;
+}
+
+int main()
+{
+  simulate_thread_main ();
+  simulate_thread_done ();
+  return 0;
+}
+
+void simulate_thread_other_threads ()
+{
+  ++g_13;
+  ++insns;
+}
+
+int simulate_thread_step_verify ()
+{
+  return 0;
+}
+
+int simulate_thread_final_verify ()
+{
+  if (g_13 != insns)
+{
+  printf("FAIL: g_13 was incorrectly cached\n");
+  return 1;
+}
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 0f61631..67cab3a 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2113,9 +2113,14 @@ execute_sm_if_changed_flag_set (struct loop *loop, 
mem_ref_p ref)
   gimple_stmt_iterator gsi;
   gimple stmt;
 
-  gsi = gsi_for_stmt (loc->stmt);
-  stmt = gimple_build_assign (flag, boolean_true_node);
-  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  /* Only set the flag for writes.  */
+  if (is_gimple_assign (loc->stmt)
+ && gimple_assign_lhs_ptr (loc->stmt) == loc->ref)
+   {
+ gsi = gsi_for_stmt (loc->stmt);
+ stmt = gimple_build_assign (flag, boolean_true_node);
+ gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+   }
 }
   VEC_free (mem_ref_loc_p, heap, locs);
   return flag;
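
In source terms, the intended result of store motion on the testcase loop
looks roughly like the sketch below. This is a hand-written illustration, not
compiler output: the function name, the 'stored' flag and the store_count
counter are made up, and the point is only that the flag is set where the
loop writes, never where it merely reads.

```c
static int g_13 = 1;
static int store_count;   /* Counts real stores to g_13, for illustration.  */

static void
loop_after_store_motion (void)
{
  int tmp = g_13;     /* Load hoisted out of the loop.  */
  int stored = 0;     /* Flag: set only where the loop writes.  */

  for (int i = 0; i <= 1; i++)
    while (tmp <= 0)  /* Reading the value must not set the flag.  */
      {
        tmp = 1;
        stored = 1;
      }

  /* Store back only if the loop actually wrote; otherwise g_13 is left
     alone and another thread's update cannot be overwritten.  */
  if (stored)
    {
      g_13 = tmp;
      store_count++;
    }
}
```

If the flag were also set at the read in the loop condition, the store after
the loop would always happen, which is the speculative store the testcase
guards against.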


shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Christian Bruel
Hello,

While testing the patch to enable shrink-wrapping on SH [PR54546], we
hit the "error: EDGE_CROSSING missing across section boundary".

Indeed, shrink-wrap duplicates a bb with successors (containing the
return sequence) into an unlikely section. I first thought about setting
the EDGE_CROSSING flag on those edges, but I feel that this block
duplication doesn't go in the direction of this optimization. Not
duplicating BBs between partitions solves the problem.

Does this restriction look right to you ? (regression tests are still
running on x86 and sh)

Thanks a lot for any comment.

Christian





Index: function.c
===
--- function.c	(revision 191177)
+++ function.c	(working copy)
@@ -6063,6 +6063,7 @@
 	  FOR_EACH_EDGE (e, ei, tmp_bb->preds)
 	if (single_succ_p (e->src)
 		&& !bitmap_bit_p (&bb_on_list, e->src->index)
+		&& (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
 		&& can_duplicate_block_p (e->src))
 	  {
 		edge pe;
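
The added condition can be read as a small predicate on an edge. Below is a minimal sketch, assuming simplified stand-ins for GCC's basic_block and edge types; struct block, struct cfg_edge and may_duplicate_pred are hypothetical names, and the bb_on_list and can_duplicate_block_p checks from the real code are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for GCC's CFG types.  */
enum partition { PART_HOT, PART_COLD };

struct block
{
  enum partition part;   /* models BB_PARTITION (bb) */
  int n_succs;           /* number of outgoing edges */
};

struct cfg_edge
{
  const struct block *src;
  const struct block *dest;
};

/* Return true if E->src may be considered for duplication: it must have
   a single successor and live in the same partition as E->dest, so that
   the copy never ends up on the other side of a section boundary.  */
bool may_duplicate_pred(const struct cfg_edge *e)
{
  return e->src->n_succs == 1
         && e->src->part == e->dest->part;
}
```

With this model, a cold predecessor of a hot block is simply skipped rather than copied, which is the behavior the one-line patch is after.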


Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)

2012-09-11 Thread Dominique Dhumieres
> This is ok, of course.

Then could you please commit it (I don't have write access)?

TIA

Dominique


Re: Remove unnecessary VEC function overloads.

2012-09-11 Thread Ian Lance Taylor
On Tue, Sep 11, 2012 at 5:03 AM, Diego Novillo  wrote:
> On 2012-09-11 01:01 , Ian Lance Taylor wrote:
>>
>> On Mon, Sep 10, 2012 at 4:52 PM, Diego Novillo 
>> wrote:
>>>
>>>
>>> Ian, could you commit the changes in go/gofrontend?
>>
>>
>> Done.  Actually, it looks like you already committed them, but I
>> brought the master repo up to date.
>
>
> Yes, sorry.  I'm not quite sure how to deal with Go patches, in general.
> Had I not committed the patch, then Go would've been broken.
>
> Is it OK if these patches get committed to GCC trunk?  I have at least 2 or
> 3 more of this kind in the queue.  Or do you prefer to have the master repo
> update first? (in which case, trunk will be broken for a little while).

I think the right thing to do is to let Go break for a little while,
so that the code in the GCC repository is always a copy of the
gofrontend repository.

I hope to get back to moving the remaining GCC-specific code out of
gofrontend soon.

Ian


Re: shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Steven Bosscher
> Does this restriction look right to you ? (regression tests are still
> running on x86 and sh)

Please generate your patches with diff -up (or svn diff -x -up).

> + && (BB_PARTITION (e->src) == BB_PARTITION (e->dest))

No need for parentheses around this check.

The shrink wrapping code appears to be dealing with partitioning, or
at least there are BB_COPY_PARTITIONs further down. So I can't tell
whether this fix is correct. Can you show in more detail what happens?
(A dotty graph is always helpful ;-)

Ciao!
Steven


[PATCH, TESTSUITE] Add -fno-short-enums to pr51712

2012-09-11 Thread Kyrylo Tkachov
Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712.
This removes the excess warning that caused the test to fail.
Tested in arm-none-eabi configuration. The test now passes.
Comment? Ok for trunk?

Thanks,
Kyrill

gcc/testsuite

2012-09-11  Kyrylo Tkachov  

* c-c++-common/pr51712.c: Add -fno-short-enums flag to test.

--- a/gcc/testsuite/c-c++-common/pr51712.c
+++ b/gcc/testsuite/c-c++-common/pr51712.c
@@ -1,6 +1,6 @@
 /* PR c/51712 */
 /* { dg-do compile } */
-/* { dg-options "-Wtype-limits" } */
+/* { dg-options "-Wtype-limits -fno-short-enums" {target short_enums} } */
 
 enum test_enum {
   FOO,


Re: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712

2012-09-11 Thread Jakub Jelinek
On Tue, Sep 11, 2012 at 01:46:37PM +0100, Kyrylo Tkachov wrote:
> Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712.
> This removes the excess warning that caused the test to fail.
> Tested in arm-none-eabi configuration. The test now passes.
> Comment? Ok for trunk?
> 
> Thanks,
> Kyrill
> 
> gcc/testsuite
> 
> 2012-09-11  Kyrylo Tkachov  
> 
>   * c-c++-common/pr51712.c: Add -fno-short-enums flag to test.

> --- a/gcc/testsuite/c-c++-common/pr51712.c
> +++ b/gcc/testsuite/c-c++-common/pr51712.c
> @@ -1,6 +1,6 @@
>  /* PR c/51712 */
>  /* { dg-do compile } */
> -/* { dg-options "-Wtype-limits" } */
> +/* { dg-options "-Wtype-limits -fno-short-enums" {target short_enums} } */

That is wrong, it means that on non-short_enums targets suddenly no
dg-options would be passed.
Instead you should keep the dg-options line as is and add
/* { dg-additional-options "-fno-short-enums" { target short_enums } } */
or just add the new dg-options line but keep the old one as well (though,
dg-additional-options is the new preferred way).
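
To make the difference concrete, here is a minimal sketch of a test header that follows this advice. The enum and function are hypothetical and are not the contents of pr51712.c; only the three directives matter.

```c
/* { dg-do compile } */
/* Unconditional options: every target gets -Wtype-limits.  */
/* { dg-options "-Wtype-limits" } */
/* Extra option appended only where the selector matches.  */
/* { dg-additional-options "-fno-short-enums" { target short_enums } } */

enum color { RED, GREEN, BLUE };   /* hypothetical enum for illustration */

int is_valid_color(enum color c)
{
  return c >= RED && c <= BLUE;
}
```

With a single dg-options line carrying the target selector, targets that do not match the selector would get no options from that line at all, so -Wtype-limits would be dropped there as well.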

Jakub


[Patch ARM] Allow auto-vectorizer to use vfma.

2012-09-11 Thread Ramana Radhakrishnan

Hi,

This allows the auto-vectorizer to use vfma under -Ofast or -ffast-math.
I have a follow-up patch which will add support for these from 
arm_neon.h as well before someone asks. It's being regression tested as 
we speak and that'll follow shortly.
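
As a rough illustration of what the two new patterns compute per vector lane, here is a scalar C model. It is a sketch only: lane_fma, lane_fms and axpy are hypothetical names, and plain C rounds the product and the sum separately (unless the compiler contracts them), whereas the fused instruction rounds once.

```c
/* One lane of the "fma4" pattern: operand 3 is tied to the output
   register, so the result is acc + a * b.  */
float lane_fma(float a, float b, float acc)
{
  return a * b + acc;
}

/* One lane of the "*fmsub4" pattern: the first multiplicand is negated,
   so the result is acc - a * b.  */
float lane_fms(float a, float b, float acc)
{
  return -a * b + acc;
}

/* The kind of loop the vectorizer is expected to turn into vfma.  */
void axpy(int n, float a, const float x[], float y[])
{
  for (int i = 0; i < n; ++i)
    y[i] = lane_fma(a, x[i], y[i]);
}
```

Because fusing changes the rounding of intermediate results, the patterns in the patch are only enabled when flag_unsafe_math_optimizations is set.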


Tested on A15 silicon native with no regressions.

Committed.


regards,
Ramana



2012-09-11  Ramana Radhakrishnan  
Matthew Gretton-Dann  

* config/arm/neon.md (fma4): New pattern.
(*fmsub4): Likewise.
* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw):  Document it.

2012-09-11  Ramana Radhakrishnan  
Matthew Gretton-Dann  

* gcc.target/arm/neon-vfma-1.c: New testcase.
* gcc.target/arm/neon-vfms-1.c: Likewise.
* gcc.target/arm/neon-vmla-1.c: Update test to use int instead
of float.
* gcc.target/arm/neon-vmls-1.c: Likewise.
* lib/target-supports.exp (add_options_for_arm_neonv2): New
function.
(check_effective_target_arm_neonv2_ok_nocache): Likewise.
(check_effective_target_arm_neonv2_ok): Likewise.
(check_effective_target_arm_neonv2_hw): Likewise.
(check_effective_target_arm_neonv2): Likewise.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index a929546..4821bb7 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -707,6 +707,33 @@
 (const_string "neon_mla_qqq_32_qqd_32_scalar")]
 )
 
+;; Fused multiply-accumulate
+(define_insn "fma4"
+  [(set (match_operand:VCVTF 0 "register_operand" "=w")
+(fma:VCVTF (match_operand:VCVTF 1 "register_operand" "w")
+		 (match_operand:VCVTF 2 "register_operand" "w")
+		 (match_operand:VCVTF 3 "register_operand" "0")))]
+  "TARGET_NEON && TARGET_FMA && flag_unsafe_math_optimizations"
+  "vfma%?.\\t%0, %1, %2"
+  [(set (attr "neon_type")
+	(if_then_else (match_test "")
+		  (const_string "neon_fp_vmla_ddd")
+		  (const_string "neon_fp_vmla_qqq")))]
+)
+
+(define_insn "*fmsub4"
+  [(set (match_operand:VCVTF 0 "register_operand" "=w")
+(fma:VCVTF (neg:VCVTF (match_operand:VCVTF 1 "register_operand" "w"))
+		   (match_operand:VCVTF 2 "register_operand" "w")
+		   (match_operand:VCVTF 3 "register_operand" "0")))]
+  "TARGET_NEON && TARGET_FMA && flag_unsafe_math_optimizations"
+  "vfms%?.\\t%0, %1, %2"
+  [(set (attr "neon_type")
+	(if_then_else (match_test "")
+		  (const_string "neon_fp_vmla_ddd")
+		  (const_string "neon_fp_vmla_qqq")))]
+)
+
 (define_insn "ior3"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w,w")
 	(ior:VDQ (match_operand:VDQ 1 "s_register_operand" "w,0")
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 7e9dbe3..3fe52ad 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1525,11 +1525,19 @@ ARM target supports generating NEON instructions.
 @item arm_neon_hw
 Test system supports executing NEON instructions.
 
+@item arm_neonv2_hw
+Test system supports executing NEON v2 instructions.
+
 @item arm_neon_ok
 @anchor{arm_neon_ok}
 ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
 options.  Some multilibs may be incompatible with these options.
 
+@item arm_neonv2_ok
+@anchor{arm_neon_ok}
+ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
+options.  Some multilibs may be incompatible with these options.
+
 @item arm_neon_fp16_ok
 @anchor{arm_neon_fp16_ok}
 ARM Target supports @code{-mfpu=neon-fp16 -mfloat-abi=softfp} or compatible
diff --git a/gcc/testsuite/gcc.target/arm/neon-vfma-1.c b/gcc/testsuite/gcc.target/arm/neon-vfma-1.c
new file mode 100644
index 000..a003a82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vfma-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neonv2_ok } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-add-options arm_neonv2 } */
+/* { dg-final { scan-assembler "vfma\\.f32\[	\]+\[dDqQ]" } } */
+
+/* Verify that VFMA is used.  */
+void f1(int n, float a, float x[], float y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+y[i] = a * x[i] + y[i];
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon-vfms-1.c b/gcc/testsuite/gcc.target/arm/neon-vfms-1.c
new file mode 100644
index 000..8cefd8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vfms-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neonv2_ok } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+/* { dg-add-options arm_neonv2 } */
+/* { dg-final { scan-assembler "vfms\\.f32\[	\]+\[dDqQ]" } } */
+
+/* Verify that VFMS is used.  */
+void f1(int n, float a, float x[], float y[]) {
+  int i;
+  for (i = 0; i < n; ++i)
+y[i] = a * -x[i] + y[i];
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon-vmla-1.c b/gcc/testsuite/gcc.target/arm/neon-vmla-1.c
index 9d239ed..c60c014 100644
--- a/gcc/testsuite/gcc.target/arm/neon-vmla-1.c
+++ b/gcc/testsuite/gcc.target/arm/neon-vmla-1.c
@@ -1,10 +1,10 @@
 /

Re: [Patch ARM] Allow auto-vectorizer to use vfma.

2012-09-11 Thread Tobias Burnus

Hi,

your patch broke bootstrapping here:

/home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node 
`arm_neon_ok' previously defined at line 1532.


(Sorry for only complaining about those issues today.)

Tobias

On 09/11/2012 02:54 PM, Ramana Radhakrishnan wrote:

Hi,

This allows the auto-vectorizer to use vfma under Ofast or ffast-math.
I have a follow-up patch which will add support for these from
arm_neon.h as well before someone asks. It's being regression tested as
we speak and that'll follow shortly.

Tested on A15 silicon native with no regressions.

Committed.


regards,
Ramana



2012-09-11  Ramana Radhakrishnan  
 Matthew Gretton-Dann  

 * config/arm/neon.md (fma4): New pattern.
 (*fmsub4): Likewise.
 * doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw):  Document it.

2012-09-11  Ramana Radhakrishnan  
 Matthew Gretton-Dann  

 * gcc.target/arm/neon-vfma-1.c: New testcase.
 * gcc.target/arm/neon-vfms-1.c: Likewise.
 * gcc.target/arm/neon-vmla-1.c: Update test to use int instead
 of float.
 * gcc.target/arm/neon-vmls-1.c: Likewise.
 * lib/target-supports.exp (add_options_for_arm_neonv2): New
 function.
 (check_effective_target_arm_neonv2_ok_nocache): Likewise.
 (check_effective_target_arm_neonv2_ok): Likewise.
 (check_effective_target_arm_neonv2_hw): Likewise.
 (check_effective_target_arm_neonv2): Likewise.




Re: [Patch ARM] Allow auto-vectorizer to use vfma.

2012-09-11 Thread Steven Bosscher
> your patch broke bootstrapping here:
>
> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node
> `arm_neon_ok' previously defined at line 1532.
>
> (Sorry for only complaining about those issues today.)

No need to feel sorry about that. It is Really Bad that people
apparently don't test their patches properly.

Ciao!
Steven


RE: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712

2012-09-11 Thread Kyrylo Tkachov
Fixed the format of the test options, as per Jakub's comment.

Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712.
This removes the excess warning that caused the test to fail.
Tested in arm-none-eabi configuration. The test now passes.
Comment? Ok for trunk?

Thanks,
Kyrill

gcc/testsuite

2012-09-11  Kyrylo Tkachov  

* c-c++-common/pr51712.c: Add -fno-short-enums flag to test.

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com] 
Sent: 11 September 2012 13:50
To: Kyrylo Tkachov
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712

On Tue, Sep 11, 2012 at 01:46:37PM +0100, Kyrylo Tkachov wrote:
> Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712.
> This removes the excess warning that caused the test to fail.
> Tested in arm-none-eabi configuration. The test now passes.
> Comment? Ok for trunk?
> 
> Thanks,
> Kyrill
> 
> gcc/testsuite
> 
> 2012-09-11  Kyrylo Tkachov  
> 
>   * c-c++-common/pr51712.c: Add -fno-short-enums flag to test.

> --- a/gcc/testsuite/c-c++-common/pr51712.c
> +++ b/gcc/testsuite/c-c++-common/pr51712.c
> @@ -1,6 +1,6 @@
>  /* PR c/51712 */
>  /* { dg-do compile } */
> -/* { dg-options "-Wtype-limits" } */
> +/* { dg-options "-Wtype-limits -fno-short-enums" {target short_enums} }
> */

That is wrong, it means that on non-short_enums targets suddenly no
dg-options would be passed.
Instead you should keep the dg-options line as is and add
/* { dg-additional-options "-fno-short-enums" { target short_enums } } */
or just add the new dg-options line but keep the old one as well (though,
dg-additional-options is the new preferred way).

Jakub
--- a/gcc/testsuite/c-c++-common/pr51712.c
+++ b/gcc/testsuite/c-c++-common/pr51712.c
@@ -1,6 +1,7 @@
 /* PR c/51712 */
 /* { dg-do compile } */
 /* { dg-options "-Wtype-limits" } */
+/* { dg-additional-options "-fno-short-enums" { target short_enums } } */
 
 enum test_enum {
   FOO,


Re: [Patch ARM] Allow auto-vectorizer to use vfma.

2012-09-11 Thread Tobias Burnus

On 09/11/2012 03:08 PM, Tobias Burnus wrote:

your patch broke bootstrapping here:
/home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node 
`arm_neon_ok' previously defined at line 1532.


I fixed it (Rev. 191181) with the attached patch. arm_neon_ok should 
have been arm_neon2_ok. (I also changed spaces into tabs in the ChangeLog.)


Tobias

PS: Fortunately, documentation changes do not require an all-language 
bootstrap.




On 09/11/2012 02:54 PM, Ramana Radhakrishnan wrote:

Hi,

This allows the auto-vectorizer to use vfma under Ofast or ffast-math.
I have a follow-up patch which will add support for these from
arm_neon.h as well before someone asks. It's being regression tested as
we speak and that'll follow shortly.

Tested on A15 silicon native with no regressions.

Committed.


regards,
Ramana



2012-09-11  Ramana Radhakrishnan 
 Matthew Gretton-Dann 

 * config/arm/neon.md (fma4): New pattern.
 (*fmsub4): Likewise.
 * doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.


2012-09-11  Ramana Radhakrishnan 
 Matthew Gretton-Dann 

 * gcc.target/arm/neon-vfma-1.c: New testcase.
 * gcc.target/arm/neon-vfms-1.c: Likewise.
 * gcc.target/arm/neon-vmla-1.c: Update test to use int instead
 of float.
 * gcc.target/arm/neon-vmls-1.c: Likewise.
 * lib/target-supports.exp (add_options_for_arm_neonv2): New
 function.
 (check_effective_target_arm_neonv2_ok_nocache): Likewise.
 (check_effective_target_arm_neonv2_ok): Likewise.
 (check_effective_target_arm_neonv2_hw): Likewise.
 (check_effective_target_arm_neonv2): Likewise.





Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 191180)
+++ gcc/ChangeLog	(working copy)
@@ -1,9 +1,13 @@
+2012-09-11  Tobias Burnus  
+
+	* doc/sourcebuild.texi (arm_neon_v2_ok): Fix @anchor.
+
 2012-09-11  Ramana Radhakrishnan  
-Matthew Gretton-Dann  
+	Matthew Gretton-Dann  
 
-   * config/arm/neon.md (fma4): New pattern.
-   (*fmsub4): Likewise.
-   * doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw):  Document it.
+	* config/arm/neon.md (fma4): New pattern.
+	(*fmsub4): Likewise.
+	* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw):  Document it.
 
 2012-09-11  Aldy Hernandez  
 
Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi	(revision 191180)
+++ gcc/doc/sourcebuild.texi	(working copy)
@@ -1534,7 +1534,7 @@ ARM Target supports @code{-mfpu=neon -mfloat-abi=s
 options.  Some multilibs may be incompatible with these options.
 
 @item arm_neonv2_ok
-@anchor{arm_neon_ok}
+@anchor{arm_neon2_ok}
 ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
 options.  Some multilibs may be incompatible with these options.
 


Re: [PATCH] Combine location with block using block_locations

2012-09-11 Thread Michael Matz
Hi,

On Tue, 11 Sep 2012, Richard Guenther wrote:

> >>> +++ gcc/lto/lto.c   (working copy)
> >>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t)
> >>>  {
> >>>enum tree_code code = TREE_CODE (t);
> >>>LTO_NO_PREVAIL (TREE_TYPE (t));
> >>> -  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
> >>> -LTO_NO_PREVAIL (TREE_CHAIN (t));
> >>
> >> That change is odd.  Can you show us how it breaks?
> >
> > This will break LTO build of gcc.c-torture/execute/pr38051.c
> >
> > There is data structure like:
> >
> >   union { long int l; char c[sizeof (long int)]; } u;
> >
> > Once the block info is reserved for this, it'll reserve this data
> > structure. And inside this data structure, there is VAR_DECL. Thus
> > LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t).
> 
> I see - the issue here is that this data structure is not reached at the 
> time we call free_lang_data (via find_decls_types_r).

It should be reached just fine.  The problem is that TREE_CHAIN of that 
union type contains random garbage (in this case the var_decl 'u').  This 
is not supposed to happen.  It's set as part of reading back a BLOCK_VARS 
chain, so the type_decl itself is in such a chain (and 'u' is part of it 
via the TREE_CHAIN pointer).

I have no idea why this is no problem without the patch.  Possibly because 
of the hunk in remove_unused_scope_block_p that makes more blocks stay.

> But maybe I do not understand "once the block info is reserved for 
> this".
> 
> So the patch papers over an issue elsewhere I believe.  Maybe Micha can 
> add some clarification here though, how BLOCK_VARS should be visible 
> here

Hmm.  Without the half-hearted tries to support debug info with LTO the 
block_vars list was no problem, it simply wouldn't be streamed.  Now I 
think it is a problem, and we need to fix it up with the prevailing decls 
if there are multiple ones.  I.e. instead of removing the two lines, 
replace LTO_NO_PREVAIL (TREE_CHAIN (t)) with LTO_SET_PREVAIL.

This is quite unfortunate as we really rather want to make sure that 
TREE_CHAIN isn't randomly set to something.  But as long as block_vars are 
implemented via TREE_CHAIN, and we want to preserve block_vars we don't 
have much choice :-(
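
A toy model may help to see why the chain field ends up set. The sketch below assumes a miniature node type whose "chain" member plays the role of TREE_CHAIN and a scope whose "vars" member plays the role of BLOCK_VARS; all names are hypothetical and nothing here is GCC code.

```c
#include <stddef.h>

/* Hypothetical miniature of a tree node; "chain" models TREE_CHAIN.  */
struct node
{
  const char *name;
  struct node *chain;
};

/* Models a BLOCK: it owns the head of a chain of declarations.  */
struct scope
{
  struct node *vars;     /* models BLOCK_VARS */
};

/* Push DECL on the front of SCOPE's variable list.  The list is threaded
   through the nodes themselves, so DECL's chain field is overwritten.  */
void scope_add_var(struct scope *s, struct node *decl)
{
  decl->chain = s->vars;
  s->vars = decl;
}

/* Count the declarations reachable from SCOPE.  */
size_t scope_count_vars(const struct scope *s)
{
  size_t n = 0;
  for (const struct node *d = s->vars; d != NULL; d = d->chain)
    n++;
  return n;
}
```

Once a type_decl sits in such a list, its chain pointer refers to its neighbour (here that would be 'u'), so a check that expects the field to be unused can no longer hold.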


Ciao,
Michael.


[PATCH][1/n] Improve LTO type merging

2012-09-11 Thread Richard Guenther

This removes the unused gtc_mode param and moves lifetime management
of the various tables to a central place, avoiding repeated checks.
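
In outline, the cache after this change behaves like the sketch below: one verdict per type pair instead of one per comparison mode, and allocation done once up front instead of being checked on every lookup. This is a simplified model with hypothetical names (pair_cache_init, pair_cache_free, lookup_pair) and a toy hash; it is not the code in lto.c.

```c
#include <stdlib.h>

#define PAIR_CACHE_SIZE 16381u   /* hypothetical size */

struct type_pair
{
  unsigned int uid1;
  unsigned int uid2;
  signed char same_p;   /* -2: just inserted, 0: different, 1: same */
};

static struct type_pair *pair_cache;

/* Allocate the table once, from a central place, before any lookup.
   Returns 0 on success and -1 on allocation failure.  */
int pair_cache_init(void)
{
  pair_cache = calloc(PAIR_CACHE_SIZE, sizeof *pair_cache);
  if (pair_cache == NULL)
    return -1;
  for (unsigned int i = 0; i < PAIR_CACHE_SIZE; i++)
    pair_cache[i].same_p = -2;
  return 0;
}

/* Release the table at the matching central place.  */
void pair_cache_free(void)
{
  free(pair_cache);
  pair_cache = NULL;
}

/* Direct-mapped lookup; no NULL check is needed here because the caller
   guarantees that pair_cache_init has already run.  */
struct type_pair *lookup_pair(unsigned int a, unsigned int b)
{
  unsigned int uid1 = a < b ? a : b;
  unsigned int uid2 = a < b ? b : a;
  unsigned int index = (uid1 * 31u + uid2) % PAIR_CACHE_SIZE;  /* toy hash */
  struct type_pair *p = &pair_cache[index];

  if (p->uid1 == uid1 && p->uid2 == uid2)
    return p;

  /* Miss: evict whatever was in the slot and start a fresh entry.  */
  p->uid1 = uid1;
  p->uid2 = uid2;
  p->same_p = -2;
  return p;
}
```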

LTO bootstrapped on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-09-11  Richard Guenther  

* lto.c (enum gtc_mode): Remove.
(struct type_pair_d): Adjust.
(lookup_type_pair): Likewise.
(gimple_type_leader): Do not mark as deletable.
(gimple_lookup_type_leader): Adjust.
(gtc_visit): Likewise.
(gimple_types_compatible_p_1): Likewise.
(gimple_types_compatible_p): Likewise.
(gimple_type_hash): Likewise.
(gimple_register_type): Likewise.
(read_cgraph_and_symbols): Manage lifetime of tables
here.

Index: gcc/lto/lto.c
===
--- gcc/lto/lto.c   (revision 191177)
+++ gcc/lto/lto.c   (working copy)
@@ -276,6 +276,8 @@ lto_read_in_decl_state (struct data_in *
   return data;
 }
 
+
+
 /* Global type table.  FIXME, it should be possible to re-use some
of the type hashing routines in tree.c (type_hash_canon, type_hash_lookup,
etc), but those assume that types were built with the various
@@ -285,8 +287,6 @@ static GTY((if_marked ("ggc_marked_p"),
 static GTY((if_marked ("tree_int_map_marked_p"), param_is (struct tree_int_map)))
   htab_t type_hash_cache;
 
-enum gtc_mode { GTC_MERGE = 0, GTC_DIAG = 1 };
-
 static hashval_t gimple_type_hash (const void *);
 
 /* Structure used to maintain a cache of some type pairs compared by
@@ -295,16 +295,13 @@ static hashval_t gimple_type_hash (const
 
-2: The pair (T1, T2) has just been inserted in the table.
 0: T1 and T2 are different types.
-1: T1 and T2 are the same type.
-
-   The two elements in the SAME_P array are indexed by the comparison
-   mode gtc_mode.  */
+1: T1 and T2 are the same type.  */
 
 struct type_pair_d
 {
   unsigned int uid1;
   unsigned int uid2;
-  signed char same_p[2];
+  signed char same_p;
 };
 typedef struct type_pair_d *type_pair_t;
 DEF_VEC_P(type_pair_t);
@@ -323,9 +320,6 @@ lookup_type_pair (tree t1, tree t2)
   unsigned int index;
   unsigned int uid1, uid2;
 
-  if (type_pair_cache == NULL)
-type_pair_cache = XCNEWVEC (struct type_pair_d, GIMPLE_TYPE_PAIR_SIZE);
-
   if (TYPE_UID (t1) < TYPE_UID (t2))
 {
   uid1 = TYPE_UID (t1);
@@ -348,8 +342,7 @@ lookup_type_pair (tree t1, tree t2)
 
   type_pair_cache [index].uid1 = uid1;
   type_pair_cache [index].uid2 = uid2;
-  type_pair_cache [index].same_p[0] = -2;
-  type_pair_cache [index].same_p[1] = -2;
+  type_pair_cache [index].same_p = -2;
 
   return &type_pair_cache[index];
 }
@@ -381,7 +374,7 @@ typedef struct GTY(()) gimple_type_leade
 } gimple_type_leader_entry;
 
 #define GIMPLE_TYPE_LEADER_SIZE 16381
-static GTY((deletable, length("GIMPLE_TYPE_LEADER_SIZE")))
+static GTY((length("GIMPLE_TYPE_LEADER_SIZE")))
   gimple_type_leader_entry *gimple_type_leader;
 
 /* Lookup an existing leader for T and return it or NULL_TREE, if
@@ -392,9 +385,6 @@ gimple_lookup_type_leader (tree t)
 {
   gimple_type_leader_entry *leader;
 
-  if (!gimple_type_leader)
-return NULL_TREE;
-
   leader = &gimple_type_leader[TYPE_UID (t) % GIMPLE_TYPE_LEADER_SIZE];
   if (leader->type != t)
 return NULL_TREE;
@@ -403,7 +393,6 @@ gimple_lookup_type_leader (tree t)
 }
 
 
-
 /* Return true if T1 and T2 have the same name.  If FOR_COMPLETION_P is
true then if any type has no name return false, otherwise return
true if both types have no names.  */
@@ -535,11 +524,11 @@ gtc_visit (tree t1, tree t2,
 
   /* Allocate a new cache entry for this comparison.  */
   p = lookup_type_pair (t1, t2);
-  if (p->same_p[GTC_MERGE] == 0 || p->same_p[GTC_MERGE] == 1)
+  if (p->same_p == 0 || p->same_p == 1)
 {
   /* We have already decided whether T1 and T2 are the
 same, return the cached result.  */
-  return p->same_p[GTC_MERGE] == 1;
+  return p->same_p == 1;
 }
 
   if ((slot = pointer_map_contains (sccstate, p)) != NULL)
@@ -574,7 +563,7 @@ gimple_types_compatible_p_1 (tree t1, tr
 {
   struct sccs *state;
 
-  gcc_assert (p->same_p[GTC_MERGE] == -2);
+  gcc_assert (p->same_p == -2);
 
   state = XOBNEW (sccstate_obstack, struct sccs);
   *pointer_map_insert (sccstate, p) = state;
@@ -861,7 +850,7 @@ pop:
  x = VEC_pop (type_pair_t, *sccstack);
  cstate = (struct sccs *)*pointer_map_contains (sccstate, x);
  cstate->on_sccstack = false;
- x->same_p[GTC_MERGE] = state->u.same_p;
+ x->same_p = state->u.same_p;
}
   while (x != p);
 }
@@ -958,11 +947,11 @@ gimple_types_compatible_p (tree t1, tree
   /* If we've visited this type pair before (in the case of aggregates
  with self-referential types), and we made a decision, return it.  */
   p = lookup_type_pair (t1, t2);
-  if (p->same_p[GTC_MERGE] == 0 || p->same_p[GTC_MERGE] == 1)
+  if (p->same_p == 0 || p->sa

Re: [Patch ARM] Allow auto-vectorizer to use vfma.

2012-09-11 Thread Ramana Radhakrishnan

On 09/11/12 14:17, Tobias Burnus wrote:
> On 09/11/2012 03:08 PM, Tobias Burnus wrote:
>> your patch broke bootstrapping here:
>> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node
>> `arm_neon_ok' previously defined at line 1532.
>
> I fixed it (Rev. 191181) with the attached patch. arm_neon_ok should
> have been arm_neon2_ok. (I also changed spaces into tabs in the 
> ChangeLog.)


Nearly: should be arm_neonv2_ok rather than arm_neon_ok. I've realized
another issue with the command line and committed this as obvious after
checking that the documentation built fine.

Thanks and apologies for the slip-up.

I've changed machines recently and something's not OK in this new setup.

regards,
Ramana

2012-09-11  Ramana Radhakrishnan  

* doc/sourcebuild.texi (arm_neon_v2_ok): Adjust command line.

Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi	(revision 191181)
+++ gcc/doc/sourcebuild.texi	(revision 191182)
@@ -1534,8 +1534,8 @@ ARM Target supports @code{-mfpu=neon -mf
 options.  Some multilibs may be incompatible with these options.
 
 @item arm_neonv2_ok
-@anchor{arm_neon2_ok}
-ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
+@anchor{arm_neonv2_ok}
+ARM Target supports @code{-mfpu=neon-vfpv4 -mfloat-abi=softfp} or compatible
 options.  Some multilibs may be incompatible with these options.
 
 @item arm_neon_fp16_ok

[PATCH, AARCH64] Added predefines for AArch64 code models

2012-09-11 Thread Chris Schlumberger-Socha

This patch adds predefines for AArch64 code models. These code models are
added as an effective target for the AArch64 platform.

Tests for these predefines have been added to `gcc.target/aarch64/'.
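
As an example of how user code could consume these predefines, here is a small sketch. The macro names are the ones added by the patch; the function aarch64_code_model_name is hypothetical, and on other targets (or without the patch) it falls through to "unknown".

```c
/* Return the name of the code model selected at compile time, based on
   the predefines added by this patch.  */
const char *aarch64_code_model_name(void)
{
#if defined(__AARCH64_CMODEL_TINY__)
  return "tiny";
#elif defined(__AARCH64_CMODEL_SMALL__)
  return "small";
#elif defined(__AARCH64_CMODEL_LARGE__)
  return "large";
#else
  return "unknown";   /* not AArch64, or a compiler without the predefines */
#endif
}
```

Note that in the patch the PIC variants of the tiny and small models map to the same macro as their non-PIC counterparts.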

Thanks,
Chris

ChangeLog:

[AArch64] Added predefines for AArch64 code models.

gcc/

* config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Add predefine for
AArch64 code models.

gcc/testsuite/

* gcc.target/aarch64/predefine_large.c: New test for large code model
predefine.
* gcc.target/aarch64/predefine_small.c: Likewise for small code model.
* gcc.target/aarch64/predefine_tiny.c: Likewise for tiny code model.
* lib/target-supports.exp
(check_effective_target_aarch64_tiny): Check effective target for tiny
code model.
(check_effective_target_aarch64_small): Likewise for small code model.
(check_effective_target_aarch64_large): Likewise for large code model.

>From c130393b5d8b888550e548b36dd34a71b8d94f88 Mon Sep 17 00:00:00 2001
From: Chris Schlumberger-Socha 
Date: Wed, 22 Aug 2012 18:22:26 +0100
Subject: [PATCH] Added predefines for AArch64 code models.

Added DejaGnu tests for new predefines.
---
 gcc/config/aarch64/aarch64.h   |   34 
 gcc/testsuite/gcc.target/aarch64/predefine_large.c |7 +++
 gcc/testsuite/gcc.target/aarch64/predefine_small.c |7 +++
 gcc/testsuite/gcc.target/aarch64/predefine_tiny.c  |7 +++
 gcc/testsuite/lib/target-supports.exp  |   42 
 5 files changed, 89 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/predefine_large.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/predefine_small.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/predefine_tiny.c

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 5d121fa..593c01a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -23,14 +23,32 @@
 #define GCC_AARCH64_H
 
 /* Target CPU builtins.  */
-#define TARGET_CPU_CPP_BUILTINS()		\
-  do		\
-{		\
-  builtin_define ("__aarch64__");		\
-  if (TARGET_BIG_END)			\
-	builtin_define ("__AARCH64EB__");	\
-  else	\
-	builtin_define ("__AARCH64EL__");	\
+#define TARGET_CPU_CPP_BUILTINS()			\
+  do			\
+{			\
+  builtin_define ("__aarch64__");			\
+  if (TARGET_BIG_END)\
+	builtin_define ("__AARCH64EB__");		\
+  else		\
+	builtin_define ("__AARCH64EL__");		\
+			\
+  switch (aarch64_cmodel)\
+	{		\
+	  case AARCH64_CMODEL_TINY:			\
+	  case AARCH64_CMODEL_TINY_PIC:			\
+	builtin_define ("__AARCH64_CMODEL_TINY__");	\
+	break;	\
+	  case AARCH64_CMODEL_SMALL:			\
+	  case AARCH64_CMODEL_SMALL_PIC:		\
+	builtin_define ("__AARCH64_CMODEL_SMALL__");\
+	break;	\
+	  case AARCH64_CMODEL_LARGE:			\
+	builtin_define ("__AARCH64_CMODEL_LARGE__");	\
+	break;	\
+	  default:	\
+	break;	\
+	}		\
+			\
 } while (0)
 
 
diff --git a/gcc/testsuite/gcc.target/aarch64/predefine_large.c b/gcc/testsuite/gcc.target/aarch64/predefine_large.c
new file mode 100644
index 000..0d7d4da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/predefine_large.c
@@ -0,0 +1,7 @@
+/* { dg-skip-if "Code model already defined" { aarch64_tiny || aarch64_small } } */
+
+#ifdef __AARCH64_CMODEL_LARGE__
+  int dummy;
+#else
+  #error
+#endif
diff --git a/gcc/testsuite/gcc.target/aarch64/predefine_small.c b/gcc/testsuite/gcc.target/aarch64/predefine_small.c
new file mode 100644
index 000..b136284
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/predefine_small.c
@@ -0,0 +1,7 @@
+/* { dg-skip-if "Code model already defined" { aarch64_tiny || aarch64_large } } */
+
+#ifdef __AARCH64_CMODEL_SMALL__
+  int dummy;
+#else
+  #error
+#endif
diff --git a/gcc/testsuite/gcc.target/aarch64/predefine_tiny.c b/gcc/testsuite/gcc.target/aarch64/predefine_tiny.c
new file mode 100644
index 000..d2c844b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/predefine_tiny.c
@@ -0,0 +1,7 @@
+/* { dg-skip-if "Code model already defined" { aarch64_small || aarch64_large } } */
+
+#ifdef __AARCH64_CMODEL_TINY__
+  int dummy;
+#else
+  #error
+#endif
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 51805ed..2252c83 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4654,3 +4654,45 @@ proc check_effective_target_ucontext_h { } {
 	#include 
 }]
 }
+
+proc check_effective_target_aarch64_tiny { } {
+if { [istarget aarch64*-*-*] } {
+	return [check_no_compiler_messages aarch64_tiny object {
+	#ifdef __AARCH64_CMODEL_TINY__
+	int dummy;
+	#else
+	#error target not AArch64 tiny code model
+	#endif
+	}]
+} else {
+	return 0
+}
+}
+
+proc check_effective_target_aarch64_small { } {
+if { [istarget aarch64

[PATCH] Improve debug info for partial inlining (PR debug/54519)

2012-09-11 Thread Jakub Jelinek
Hi!

As discussed in the PR, right now we do a very bad job for debug info
of partially inlined functions (both when they are kept only partially
inlined, or when partial inlining is performed, but doesn't seem to be
useful and foo.part.N is inlined back, either into the original function, or
into a function into which foo has been inlined first).

This patch improves that by doing something similar to what ipa-prop.c does,
in particular for arguments that aren't actually passed to foo.part.N
we add debug args and corresponding debug bind and debug source bind stmts
to provide better debug info (if foo.part.N isn't inlined, then
DW_OP_GNU_parameter_ref is going to be used together with corresponding call
site arguments).
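
For illustration, a function of roughly the following shape is the kind of candidate being discussed. This is hypothetical source, not one of the new guality tests, and whether GCC actually splits it depends on its heuristics.

```c
int slow_path_calls;   /* hypothetical counter, just to give the slow path a side effect */

int lookup(int key, int hint)
{
  /* Cheap early exit: the only place where 'hint' is used.  */
  if (key < 0)
    return hint;

  /* Heavier part: a candidate for being outlined as lookup.part.N.
     It does not use 'hint', so 'hint' need not be passed to the split
     function.  */
  slow_path_calls++;
  int acc = 0;
  for (int i = 0; i <= key; i++)
    acc += i * i;
  return acc;
}
```

If the heavy part is split off, 'hint' is no longer an actual parameter of the split function, so without extra debug binds a debugger stopped inside it has nothing to show for 'hint'. That is the gap the debug args are meant to fill.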

Bootstrapped/regtested on x86_64-linux and i686-linux, some of the tests
still fail with some option combinations, am going to file a DF VTA PR for
that momentarily.  Ok for trunk?

2012-09-11  Jakub Jelinek  

PR debug/54519
* ipa-split.c (split_function): Add debug args and
debug source and normal stmts for args_to_skip which are
gimple regs.
* tree-inline.c (copy_debug_stmt): When inlining, adjust
source debug bind stmts to debug binds of corresponding
DEBUG_EXPR_DECL.

* gcc.dg/guality/pr54519-1.c: New test.
* gcc.dg/guality/pr54519-2.c: New test.
* gcc.dg/guality/pr54519-3.c: New test.
* gcc.dg/guality/pr54519-4.c: New test.
* gcc.dg/guality/pr54519-5.c: New test.

--- gcc/ipa-split.c.jj  2012-08-20 11:09:45.0 +0200
+++ gcc/ipa-split.c 2012-09-10 16:04:39.499558177 +0200
@@ -1059,6 +1059,7 @@ split_function (struct split_point *spli
   gimple last_stmt = NULL;
   unsigned int i;
   tree arg, ddef;
+  VEC(tree, gc) **debug_args = NULL;
 
   if (dump_file)
 {
@@ -1232,6 +1233,65 @@ split_function (struct split_point *spli
   gimple_set_block (call, DECL_INITIAL (current_function_decl));
   VEC_free (tree, heap, args_to_pass);
 
+  if (args_to_skip)
+for (parm = DECL_ARGUMENTS (current_function_decl), num = 0;
+parm; parm = DECL_CHAIN (parm), num++)
+  if (bitmap_bit_p (args_to_skip, num)
+ && is_gimple_reg (parm))
+   {
+ tree ddecl;
+ gimple def_temp;
+
+ arg = get_or_create_ssa_default_def (cfun, parm);
+ if (!MAY_HAVE_DEBUG_STMTS)
+   continue;
+ if (debug_args == NULL)
+   debug_args = decl_debug_args_insert (node->symbol.decl);
+ ddecl = make_node (DEBUG_EXPR_DECL);
+ DECL_ARTIFICIAL (ddecl) = 1;
+ TREE_TYPE (ddecl) = TREE_TYPE (parm);
+ DECL_MODE (ddecl) = DECL_MODE (parm);
+ VEC_safe_push (tree, gc, *debug_args, parm);
+ VEC_safe_push (tree, gc, *debug_args, ddecl);
+ def_temp = gimple_build_debug_bind (ddecl, unshare_expr (arg),
+ call);
+ gsi_insert_after (&gsi, def_temp, GSI_NEW_STMT);
+   }
+  if (debug_args != NULL)
+{
+  unsigned int i;
+  tree var, vexpr;
+  gimple_stmt_iterator cgsi;
+  gimple def_temp;
+
+  push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl));
+  var = BLOCK_VARS (DECL_INITIAL (node->symbol.decl));
+  i = VEC_length (tree, *debug_args);
+  cgsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR));
+  do
+   {
+ i -= 2;
+ while (var != NULL_TREE
+&& DECL_ABSTRACT_ORIGIN (var)
+   != VEC_index (tree, *debug_args, i))
+   var = TREE_CHAIN (var);
+ if (var == NULL_TREE)
+   break;
+ vexpr = make_node (DEBUG_EXPR_DECL);
+ parm = VEC_index (tree, *debug_args, i);
+ DECL_ARTIFICIAL (vexpr) = 1;
+ TREE_TYPE (vexpr) = TREE_TYPE (parm);
+ DECL_MODE (vexpr) = DECL_MODE (parm);
+ def_temp = gimple_build_debug_source_bind (vexpr, parm,
+NULL);
+ gsi_insert_before (&cgsi, def_temp, GSI_SAME_STMT);
+ def_temp = gimple_build_debug_bind (var, vexpr, NULL);
+ gsi_insert_before (&cgsi, def_temp, GSI_SAME_STMT);
+   }
+  while (i);
+  pop_cfun ();
+}
+
   /* We avoid address being taken on any variable used by split part,
  so return slot optimization is always possible.  Moreover this is
  required to make DECL_BY_REFERENCE work.  */
--- gcc/tree-inline.c.jj  2012-08-22 11:18:56.0 +0200
+++ gcc/tree-inline.c   2012-09-11 09:13:49.509205799 +0200
@@ -2355,6 +2355,31 @@ copy_debug_stmt (gimple stmt, copy_body_
   gimple_debug_source_bind_set_var (stmt, t);
   walk_tree (gimple_debug_source_bind_get_value_ptr (stmt),
 remap_gimple_op_r, &wi, NULL);
+  /* When inlining and source bind refers to one of the optimized
+away parameters, change the source bind into normal debug bind
+referring to the corresponding DEBUG_EXPR_DECL that should have
+been boun

Re: [PATCH] Combine location with block using block_locations

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 3:30 PM, Michael Matz  wrote:
> Hi,
>
> On Tue, 11 Sep 2012, Richard Guenther wrote:
>
>> >>> +++ gcc/lto/lto.c   (working copy)
>> >>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t)
>> >>>  {
>> >>>enum tree_code code = TREE_CODE (t);
>> >>>LTO_NO_PREVAIL (TREE_TYPE (t));
>> >>> -  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
>> >>> -LTO_NO_PREVAIL (TREE_CHAIN (t));
>> >>
>> >> That change is odd.  Can you show us how it breaks?
>> >
>> > This will break LTO build of gcc.c-torture/execute/pr38051.c
>> >
>> > There is data structure like:
>> >
>> >   union { long int l; char c[sizeof (long int)]; } u;
>> >
>> > Once the block info is reserved for this, it'll reserve this data
>> > structure. And inside this data structure, there is VAR_DECL. Thus
>> > LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t).
>>
>> I see - the issue here is that this data structure is not reached at the
>> time we call free_lang_data (via find_decls_types_r).
>
> It should be reached just fine.  The problem is that TREE_CHAIN of that
> union type contains random garbage (in this case the var_decl 'u').  This
> is not supposed to happen.  It's set as part of reading back a BLOCK_VARS
> chain, so the type_decl itself is in such a chain (and 'u' is part of it
> via the TREE_CHAIN pointer).
>
> I have no idea why this is no problem without the patch.  Possibly because
> of the hunk in remove_unused_scope_block_p that makes more blocks stay.
>
>> But maybe I do not understand "once the block info is reserved for
>> this".
>>
>> So the patch papers over an issue elsewhere I believe.  Maybe Micha can
>> add some clarification here though, how BLOCK_VARS should be visible
>> here
>
> Hmm.  Without the half-hearted tries to support debug info with LTO the
> block_vars list was no problem, it simply wouldn't be streamed.  Now I
> think it is a problem, and we need to fix it up with the prevailing decls
> if there are multiple ones.  I.e. instead of removing the two lines,
> replace LTO_NO_PREVAIL (TREE_CHAIN (t)) with LTO_SET_PREVAIL.
>
> This is quite unfortunate as we really rather want to make sure that
> TREE_CHAIN isn't randomly set to something.  But as long as block_vars are
> implemented via TREE_CHAIN, and we want to preserve block_vars we don't
> have much choice :-(

I don't think we can fixup TREE_CHAIN - the things cannot be in multiple
lists after all.  Unifying/fixing up would need to happen at a BLOCK level.
But as you say - only TYPE_DECLs should be in BLOCK_VARS, but never
global ones, so there would be nothing to replace.  Which means we shouldn't
even try to merge those.  Hmm.

Richard.

>
> Ciao,
> Michael.
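To make the "cannot be in multiple lists" point concrete, here is a toy sketch. It is not GCC code: `Node`, `chain` and `push_front` are hypothetical stand-ins for a decl whose single TREE_CHAIN-like link threads it onto a BLOCK_VARS-like list. Pushing the same node onto a second list overwrites that one link, so walking the first list silently ends up in the second.

```cpp
#include <cassert>

// Toy stand-in for a decl with a single intrusive link, like TREE_CHAIN.
struct Node {
  int id;
  Node *chain;
};

// Push N on the front of the list headed by *HEAD, reusing N's only link.
inline void push_front(Node **head, Node *n) {
  assert(head != nullptr && n != nullptr);
  n->chain = *head;
  *head = n;
}

// Build A = shared -> a_tail, then also push `shared` onto B.
// Returns the id of the node that follows `shared` when walking A afterwards.
inline int successor_in_a_after_sharing() {
  Node a_tail = {1, nullptr};
  Node b_tail = {3, nullptr};
  Node shared = {2, nullptr};
  Node *list_a = nullptr;
  Node *list_b = nullptr;
  push_front(&list_a, &a_tail);
  push_front(&list_a, &shared);  // A: shared -> a_tail
  push_front(&list_b, &b_tail);
  push_front(&list_b, &shared);  // B: shared -> b_tail, A's link is overwritten
  return list_a->chain->id;      // walking A now reaches b_tail, not a_tail
}
```

In this toy, list A ends up continuing into B's tail, which is the kind of breakage that replacing a chained decl with a prevailing one already on another chain would risk.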


Re: Recognize vec_perm_expr in a constructor of bit_field_ref

2012-09-11 Thread Richard Guenther
On Tue, Sep 11, 2012 at 1:07 PM, Marc Glisse  wrote:
> Hello,
>
> here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the
> target is ok with it.
>
> I am attaching 2 versions of the patch. p-good is the one that passes
> testing. p-bad, where I rely on fold_stmt to detect identity permutations,
> ICEs towards the end of the pass while checking a bogus gimple stmt (one
> that gimple_debug_stmt crashes on if I call it in gdb). From a performance
> point of view, p-good makes sense, but I liked the simplicity of p-bad and I
> am confused as to why it fails.

Probably because you cannot simply increase num_ops ...

> 2012-09-11  Marc Glisse  
>
> gcc/
> * tree-ssa-forwprop.c (simplify_vector_constructor): New function.
> (ssa_forward_propagate_and_combine): Call it.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/forwprop-22.c: New testcase.
>
> --
> Marc Glisse
> Index: Makefile.in
> ===
> --- Makefile.in (revision 191173)
> +++ Makefile.in (working copy)
> @@ -2237,21 +2237,22 @@ tree-outof-ssa.o : tree-outof-ssa.c $(TR
> $(TREE_H) $(DIAGNOSTIC_H) $(TM_H) coretypes.h dumpfile.h \
> $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \
> $(EXPR_H) $(SSAEXPAND_H) $(GIMPLE_PRETTY_PRINT_H)
>  tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> $(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
> $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h $(FLAGS_H) \
> $(GIMPLE_PRETTY_PRINT_H) langhooks.h
>  tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H)
> coretypes.h \
> $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \
> $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
> -   langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H)
> +   langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \
> +   $(TREE_VECTORIZER_H)
>  tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h
> \
> $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
> $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
> langhooks.h $(FLAGS_H) $(GIMPLE_PRETTY_PRINT_H)
>  tree-ssa-ifcombine.o : tree-ssa-ifcombine.c $(CONFIG_H) $(SYSTEM_H) \
> coretypes.h $(TM_H) $(TREE_H) $(BASIC_BLOCK_H) \
> $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
> $(TREE_PRETTY_PRINT_H)
>  tree-ssa-phiopt.o : tree-ssa-phiopt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> $(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
> Index: testsuite/gcc.dg/tree-ssa/forwprop-22.c
> ===
> --- testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0)
> +++ testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0)
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_double } */
> +/* { dg-require-effective-target vect_perm } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +typedef double vec __attribute__((vector_size (2 * sizeof (double))));
> +void f (vec *px, vec *y, vec *z)
> +{
> +  vec x = *px;
> +  vec t1 = { x[1], x[0] };
> +  vec t2 = { x[0], x[1] };
> +  *y = t1;
> +  *z = t2;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
>
> Property changes on: testsuite/gcc.dg/tree-ssa/forwprop-22.c
> ___
> Added: svn:keywords
>+ Author Date Id Revision URL
> Added: svn:eol-style
>+ native
>
> Index: tree-ssa-forwprop.c
> ===
> --- tree-ssa-forwprop.c (revision 191173)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -26,20 +26,21 @@ along with GCC; see the file COPYING3.
>  #include "tm_p.h"
>  #include "basic-block.h"
>  #include "gimple-pretty-print.h"
>  #include "tree-flow.h"
>  #include "tree-pass.h"
>  #include "langhooks.h"
>  #include "flags.h"
>  #include "gimple.h"
>  #include "expr.h"
>  #include "cfgloop.h"
> +#include "tree-vectorizer.h"
>
>  /* This pass propagates the RHS of assignment statements into use
> sites of the LHS of the assignment.  It's basically a specialized
> form of tree combination.   It is hoped all of this can disappear
> when we have a generalized tree combiner.
>
> One class of common cases we handle is forward propagating a single use
> variable into a COND_EXPR.
>
>   bb0:
> @@ -2787,20 +2788,105 @@ simplify_permutation (gimple_stmt_iterat
>if (TREE_CODE (op0) == SSA_NAME)
> ret = remove_prop_source_from_use (op0);
>if (op0 != op1 && TREE_CODE (op1) == SSA_NAME)
> ret |= remove_prop_source_from_use (op1);
>return ret ? 2 : 1;
>  }
>
>return 0;
>  }
>
> +/* Recognize a VEC_PERM_EXPR.  Returns true if there were any changes.  */
> +
> +static

Re: Recognize vec_perm_expr in a constructor of bit_field_ref

2012-09-11 Thread Marc Glisse

On Tue, 11 Sep 2012, Richard Guenther wrote:


> On Tue, Sep 11, 2012 at 1:07 PM, Marc Glisse  wrote:
>> Hello,
>>
>> here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the
>> target is ok with it.
>>
>> I am attaching 2 versions of the patch. p-good is the one that passes
>> testing. p-bad, where I rely on fold_stmt to detect identity permutations,
>> ICEs towards the end of the pass while checking a bogus gimple stmt (one
>> that gimple_debug_stmt crashes on if I call it in gdb). From a performance
>> point of view, p-good makes sense, but I liked the simplicity of p-bad and I
>> am confused as to why it fails.
>
> Probably because you cannot simply increase num_ops ...

Ah... thanks, it makes sense now... For some reason I thought it was a
fixed size structure and num_ops just told it how many of the fields were
in use.

> [...]
>
> Ok with that change.

Just to be sure, that means you prefer the version where I manually detect
identity and don't call fold, right?


Thank you for all the quick reviews,

--
Marc Glisse
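For readers following along, the "manually detect identity" variant can be modelled outside GCC roughly as below. This is only a sketch under assumptions: `PermAnalysis` and `analyse` are made-up names, each constructor element is described just by the bit offset it extracts from one common source vector, and the alignment and range checks are extras of this toy rather than something taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of analysing a constructor whose elements are all element-sized
// extractions from one source vector.
struct PermAnalysis {
  bool ok;                         // every offset named a valid lane
  bool identity;                   // selector is {0, 1, ..., n-1}
  std::vector<unsigned char> sel;  // lane picked for each result element
};

// BIT_OFFSETS[i] is the bit position read for result element i;
// ELEM_SIZE_BITS is the width of one vector element.
inline PermAnalysis analyse(const std::vector<unsigned> &bit_offsets,
                            unsigned elem_size_bits) {
  assert(elem_size_bits != 0);
  PermAnalysis r = {true, true, std::vector<unsigned char>()};
  const std::size_t nelts = bit_offsets.size();
  for (std::size_t i = 0; i < nelts; ++i) {
    const unsigned off = bit_offsets[i];
    // Toy-only sanity checks: aligned to an element and inside the vector.
    if (off % elem_size_bits != 0 || off / elem_size_bits >= nelts) {
      r.ok = false;
      r.identity = false;
      r.sel.clear();
      return r;
    }
    const unsigned char lane =
        static_cast<unsigned char>(off / elem_size_bits);
    r.sel.push_back(lane);
    if (lane != i)
      r.identity = false;  // at least one element moves, so a permute is needed
  }
  return r;
}
```

With 64-bit elements, offsets {64, 0} give the selector {1, 0} (a real permutation), while {0, 64} is the identity and the constructor can simply be replaced by the source vector.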


Re: [PATCH] Improve debug info for partial inlining (PR debug/54519)

2012-09-11 Thread Steven Bosscher
> +  if (args_to_skip)
> +for (parm = DECL_ARGUMENTS (current_function_decl), num = 0;
> +parm; parm = DECL_CHAIN (parm), num++)
> +  if (bitmap_bit_p (args_to_skip, num)
> + && is_gimple_reg (parm))
> +   {
> + tree ddecl;
> + gimple def_temp;
> +
> + arg = get_or_create_ssa_default_def (cfun, parm);
> + if (!MAY_HAVE_DEBUG_STMTS)
> +   continue;

You can do this MAY_HAVE_DEBUG_STMTS check before the loop, e.g.

> +  if (args_to_skip && MAY_HAVE_DEBUG_STMTS)

Ciao!
Steven


Re: [PATCH] Improve debug info for partial inlining (PR debug/54519)

2012-09-11 Thread Jakub Jelinek
On Tue, Sep 11, 2012 at 04:41:24PM +0200, Steven Bosscher wrote:
> > +  if (args_to_skip)
> > +for (parm = DECL_ARGUMENTS (current_function_decl), num = 0;
> > +parm; parm = DECL_CHAIN (parm), num++)
> > +  if (bitmap_bit_p (args_to_skip, num)
> > + && is_gimple_reg (parm))
> > +   {
> > + tree ddecl;
> > + gimple def_temp;
> > +
> > + arg = get_or_create_ssa_default_def (cfun, parm);
> > + if (!MAY_HAVE_DEBUG_STMTS)
> > +   continue;
> 
> You can do this MAY_HAVE_DEBUG_STMTS check before the loop, e.g.
> 
> > +  if (args_to_skip && MAY_HAVE_DEBUG_STMTS)

No, that would result in -fcompare-debug failures if parm doesn't have a
default def yet.

Jakub
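A toy model of the concern, under loud assumptions: `ToyFn` and both helper functions are invented for illustration and only mimic the shape of the loop. The idea is that creating a default def is a side effect on the function's state (here, a version counter), so it has to happen identically whether or not debug statements are enabled; only the debug-only work may be skipped.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy function state: default definitions are created lazily and numbered.
struct ToyFn {
  std::map<std::string, int> default_defs;
  int next_version;

  ToyFn() : next_version(1) {}

  int get_or_create_default_def(const std::string &parm) {
    assert(!parm.empty());
    std::map<std::string, int>::iterator it = default_defs.find(parm);
    if (it != default_defs.end())
      return it->second;
    const int version = next_version++;
    default_defs[parm] = version;
    return version;
  }
};

// Hoisted check: with debug statements off, no default defs are created.
inline int next_version_hoisted(const std::vector<std::string> &parms,
                                bool debug) {
  ToyFn fn;
  if (debug)
    for (std::size_t i = 0; i < parms.size(); ++i)
      fn.get_or_create_default_def(parms[i]);
  return fn.next_version;
}

// Shape of the posted loop: the side effect is unconditional.
inline int next_version_as_posted(const std::vector<std::string> &parms,
                                  bool debug) {
  ToyFn fn;
  for (std::size_t i = 0; i < parms.size(); ++i) {
    fn.get_or_create_default_def(parms[i]);
    if (!debug)
      continue;
    // Debug-only work (building the debug binds) would go here.
  }
  return fn.next_version;
}
```

In the hoisted variant the counter differs between the debug and non-debug runs, which is the sort of divergence -fcompare-debug is meant to catch; in the posted shape both runs leave the same state.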


Remove def operands cache

2012-09-11 Thread Michael Matz
Hi,

the operands cache is ugly.  This patch removes it at least for the def
operands, saving three pointers for roughly each normal statement (the
pointer in gsbase, and two pointers from def_optype_d).  This is
relatively easy to do, because all statements except ASMs have at most one
def (and one vdef), which themselves aren't pointed to by anything else,
unlike the use operands, which have more structure for the SSA web.

Performance-wise the patch is a slight improvement (about 1% for some C++
testcases; the numbers are relatively noisy, but it is at least not
slower), and bootstrap time is unaffected.  As the iterator is a bit larger,
code size increases by about 0.1%.

The patch is regstrapped on x86_64-linux.  If it's approved I'll adjust 
the WORD count markers in gimple.h, I left it out in this submission as 
it's just verbose noise in comments.

Okay for trunk?


Ciao,
Michael.
* tree-ssa-operands.h (struct def_optype_d, def_optype_p): Remove.
(ssa_operands.free_defs): Remove.
(DEF_OP_PTR, DEF_OP): Remove.
(struct ssa_operand_iterator_d): Remove 'defs', add 'flags', 'def_i'
members, rename 'phi_stmt' to 'stmt'.
* gimple.h (gimple_statement_with_ops.def_ops): Remove.
(gimple_def_ops, gimple_set_def_ops): Remove.
(gimple_vdef_op): Don't take const gimple, adjust.
* tree-ssa-operands.c (build_defs): Remove.
(init_ssa_operands): Don't initialize it.
(fini_ssa_operands): Don't free it.
(cleanup_build_arrays): Don't truncate it.
(finalize_ssa_stmt_operands): Don't assert on it.
(alloc_def, add_def_op, append_def): Remove.
(finalize_ssa_defs): Remove building of def_ops list.
(finalize_ssa_uses): Don't mark for SSA renaming here, ...
(add_stmt_operand): ... but here, don't call append_def.
(get_indirect_ref_operands): Remove recurse_on_base argument.
(get_expr_operands): Adjust call to get_indirect_ref_operands.
(verify_ssa_operands): Don't check def operands.
(free_stmt_operands): Don't free def operands.
* gimple.c (gimple_copy): Don't clear def operands.
* tree-flow-inline.h (op_iter_next_use): Adjust to explicitly
handle def operand.
(op_iter_next_tree): Ditto.
(clear_and_done_ssa_iter): Clear new fields.
(op_iter_init): Adjust to setup new iterator structure.
(op_iter_init_phiuse): Adjust.

Index: tree-ssa-operands.h
===
--- tree-ssa-operands.h.orig  2012-09-06 16:14:30.0 +0200
+++ tree-ssa-operands.h 2012-09-06 16:18:33.0 +0200
@@ -34,14 +34,6 @@ typedef ssa_use_operand_t *use_operand_p
 #define NULL_USE_OPERAND_P ((use_operand_p)NULL)
 #define NULL_DEF_OPERAND_P ((def_operand_p)NULL)
 
-/* This represents the DEF operands of a stmt.  */
-struct def_optype_d
-{
-  struct def_optype_d *next;
-  tree *def_ptr;
-};
-typedef struct def_optype_d *def_optype_p;
-
 /* This represents the USE operands of a stmt.  */
 struct use_optype_d
 {
@@ -68,7 +60,6 @@ struct GTY(()) ssa_operands {
 
bool ops_active;
 
-   struct def_optype_d * GTY ((skip (""))) free_defs;
struct use_optype_d * GTY ((skip (""))) free_uses;
 };
 
@@ -82,9 +73,6 @@ struct GTY(()) ssa_operands {
 #define USE_OP_PTR(OP) (&((OP)->use_ptr))
 #define USE_OP(OP) (USE_FROM_PTR (USE_OP_PTR (OP)))
 
-#define DEF_OP_PTR(OP) ((OP)->def_ptr)
-#define DEF_OP(OP) (DEF_FROM_PTR (DEF_OP_PTR (OP)))
-
 #define PHI_RESULT_PTR(PHI)gimple_phi_result_ptr (PHI)
 #define PHI_RESULT(PHI)DEF_FROM_PTR (PHI_RESULT_PTR (PHI))
 #define SET_PHI_RESULT(PHI, V) SET_DEF (PHI_RESULT_PTR (PHI), (V))
@@ -135,11 +123,12 @@ typedef struct ssa_operand_iterator_d
 {
   bool done;
   enum ssa_op_iter_type iter_type;
-  def_optype_p defs;
+  int flags;
+  unsigned def_i;
   use_optype_p uses;
   int phi_i;
   int num_phi;
-  gimple phi_stmt;
+  gimple stmt;
 } ssa_op_iter;
 
 /* These flags are used to determine which operands are returned during
Index: gimple.h
===
--- gimple.h.orig   2012-09-06 16:14:30.0 +0200
+++ gimple.h  2012-09-07 16:01:27.0 +0200
@@ -224,12 +226,12 @@ struct GTY(()) gimple_statement_with_ops
   /* [ WORD 1-6 ]  */
   struct gimple_statement_base gsbase;
 
+  /* XXX adjust word count */
   /* [ WORD 7-8 ]
  SSA operand vectors.  NOTE: It should be possible to
  amalgamate these vectors with the operand vector OP.  However,
  the SSA operand vectors are organized differently and contain
  more information (like immediate use chaining).  */
-  struct def_optype_d GTY((skip (""))) *def_ops;
   struct use_optype_d GTY((skip (""))) *use_ops;
 };
 
@@ -1374,27 +1376,6 @@ gimple_has_mem_ops (const_gimple g)
 }
 
 
-/* Return the set of DEF operands for statement G.  */
-
-static inline struc

[PATCH] fix bootstrap on darwin to adapt to VEC changes

2012-09-11 Thread Jack Howarth
  The attached patch fixes the bootstrap on darwin to cope with the
VEC changes that removed unnecessary VEC function overloads. Tested on
x86_64-apple-darwin12. Okay for gcc trunk?
   Jack

2012-09-11  Dominique d'Humieres  
Jack Howarth  

* config/darwin.c (darwin_asm_named_section): Adjust for VEC
changes.
(darwin_asm_dwarf_section): Likewise.

Index: gcc/config/darwin.c
===
--- gcc/config/darwin.c (revision 191179)
+++ gcc/config/darwin.c (working copy)
@@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na
  the assumption of how this is done.  */
   if (lto_section_names == NULL)
 lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16);
-  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e);
+  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e);
}
   else if (strncmp (name, "__DWARF,", 8) == 0)
 darwin_asm_dwarf_section (name, flags, decl);
@@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na
   fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname);
   e.count = 1;
   e.name = xstrdup (sname);
-  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e);
+  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e);
 }
 }
 


Re: [patch] Expand SJLJ exceptions as tablejump/casesi

2012-09-11 Thread Richard Henderson
On 09/10/2012 04:26 PM, Steven Bosscher wrote:
> +  rtx index = force_reg (index_mode, dispatch_index);

You can't modify the result of force_reg.  Use copy_to_{mode_,}reg instead.

> +   rtx tmp = expand_simple_binop (index_mode, MINUS,
> +  index, CONST1_RTX (index_mode),
> +  index, 0, OPTAB_DIRECT);
> +   gcc_assert (REG_P (tmp));
> +   if (tmp != index)
> + emit_move_insn (index, tmp);

This pattern is force_expand_binop.

Of course, you don't really need to force index to be the same all
the way down the chain.  You could just as well use

  index = expand_simple_binop (index_mode, MINUS, index, one,
   index, 0, OPTAB_DIRECT);

and use any new pseudo in the next iteration.

Otherwise this looks good.


r~


Re: [Patch ARM] implement bswap16

2012-09-11 Thread Christophe Lyon
On 11 September 2012 12:52, Richard Earnshaw  wrote:
> Try something like:
>
> short foo(int);
>
> short swaps (short x, int y)
> {
>   int z = x;
>   if (y)
> z = __builtin_bswap16(x);
>   return foo (z);
> }
>
> If that's not enough, try adding 1 to z before calling foo.
>

Thanks, it works.
It's surprising however that 'return z' isn't enough.

Here is a new version of the patch, which also transforms the 32-bit
arm_rev/thumb1_rev into arm_rev/arm_rev_cond.

I have enhanced the testcase too.

Christophe.


bswap16.patch
Description: Binary data


Re: [PATCH, libstdc++] Improve slightly __cxa_guard_acquire

2012-09-11 Thread Jakub Jelinek
On Thu, Sep 06, 2012 at 11:10:37PM +0200, Jakub Jelinek wrote:
> > +   int expected(0);
> > if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
> > __ATOMIC_ACQ_REL,
> > __ATOMIC_RELAXED))
> 
> Shouldn't this __ATOMIC_RELAXED be also __ATOMIC_ACQUIRE?  If expected ends
> up being guard_bit, then the code will return 0; right away.

Here is a patch for that.  Ok for trunk/4.7?

2012-09-11  Jakub Jelinek  

PR libstdc++/54172
* libsupc++/guard.cc (__cxa_guard_acquire): Fix up the last
argument of the first __atomic_compare_exchange_n.

--- libstdc++-v3/libsupc++/guard.cc.jj  2012-09-11 16:55:16.0 +0200
+++ libstdc++-v3/libsupc++/guard.cc 2012-09-11 16:56:38.035848876 +0200
@@ -253,7 +253,7 @@ namespace __cxxabiv1
int expected(0);
if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
__ATOMIC_ACQ_REL,
-   __ATOMIC_RELAXED))
+   __ATOMIC_ACQUIRE))
  {
// This thread should do the initialization.
return 1;

Jakub
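As a small illustration of why the failure ordering matters, here is a sketch using the same builtin. The bit values and the `toy_guard_acquire` name are assumptions made for this example, and the real guard.cc logic (waiting, futexes, and so on) is left out. When the compare-exchange fails, `expected` is overwritten with the value that was loaded; if the caller then acts on that value, that load is the one that needs acquire semantics.

```cpp
#include <cassert>

// Hypothetical bit values for illustration; the real layout lives in guard.cc.
const int toy_guard_bit = 1;    // initialization finished
const int toy_pending_bit = 2;  // initialization in progress

// Returns 1 if the caller should run the initializer, 0 if initialization
// has already completed, -1 if another thread is in the middle of it (the
// real code would wait instead).
inline int toy_guard_acquire(int *gi) {
  assert(gi != nullptr);
  int expected = 0;
  if (__atomic_compare_exchange_n(gi, &expected, toy_pending_bit, false,
                                  __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE))
    return 1;  // we claimed the guard and must initialize
  // The CAS failed and `expected` now holds the value that was loaded.
  if (expected == toy_guard_bit)
    return 0;  // safe to use the object only because the load was an acquire
  return -1;
}
```

With a relaxed failure ordering, the `return 0` path could in principle observe the guard bit without the initialized object being visible, which is the case the one-line fix above addresses.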


Re: [Patch ARM testsuite] fix 3 tests for big-endian

2012-09-11 Thread Christophe Lyon
Ping?
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00068.html

Thanks
Christophe.


On 3 September 2012 11:01, Christophe Lyon  wrote:
> On 31 August 2012 18:14, Janis Johnson  wrote:
>>
>> do something like
>>
>> /* { dg-final { scan-assembler-times "fmrrd\[\\t \]+r0,\[\\t \]*r1,\[\\t \]*d0" 2 } { target arm_little_endian } } */
>> /* { dg-final { scan-assembler-times "fmrrd\[\\t \]+r1,\[\\t \]*r0,\[\\t \]*d0" 2  } {target { ! arm_little_endian } } } */
>>
>> That's untested, but you get the idea.
>>
>> Janis
>>
>>
>
> Thanks for your review. Here is an updated patch.
>
> Christophe.
>
> 2012-09-03  Christophe Lyon  
>
> gcc/testsuite/
> * gcc.target/arm/neon-vset_lanes8.c, gcc.target/arm/pr51835.c,
> gcc.target/arm/pr48252.c: Fix for big-endian support.


Re: [PATCH, libstdc++] Improve slightly __cxa_guard_acquire

2012-09-11 Thread Richard Henderson
On 09/11/2012 08:02 AM, Jakub Jelinek wrote:
> 2012-09-11  Jakub Jelinek  
> 
>   PR libstdc++/54172
>   * libsupc++/guard.cc (__cxa_guard_acquire): Fix up the last
>   argument of the first __atomic_compare_exchange_n.

Looks good.


r~


Re: Recognize vec_perm_expr in a constructor of bit_field_ref

2012-09-11 Thread Marc Glisse

On Tue, 11 Sep 2012, Richard Guenther wrote:


> On Tue, Sep 11, 2012 at 1:07 PM, Marc Glisse  wrote:
>> Hello,
>>
>> here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the
>> target is ok with it.
>>
>> I am attaching 2 versions of the patch. p-good is the one that passes
>> testing. p-bad, where I rely on fold_stmt to detect identity permutations,
>> ICEs towards the end of the pass while checking a bogus gimple stmt (one
>> that gimple_debug_stmt crashes on if I call it in gdb). From a performance
>> point of view, p-good makes sense, but I liked the simplicity of p-bad and I
>> am confused as to why it fails.
>
> Probably because you cannot simply increase num_ops ...
>
>> 2012-09-11  Marc Glisse
>>
>> gcc/
>> * tree-ssa-forwprop.c (simplify_vector_constructor): New function.
>> (ssa_forward_propagate_and_combine): Call it.
>>
>> gcc/testsuite/
>> * gcc.dg/tree-ssa/forwprop-22.c: New testcase.
>
> [...]
>
> Ok with that change.


Attached is what I am testing and will commit if it passes.

--
Marc Glisse

Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 191187)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -26,20 +26,21 @@ along with GCC; see the file COPYING3.
 #include "tm_p.h"
 #include "basic-block.h"
 #include "gimple-pretty-print.h"
 #include "tree-flow.h"
 #include "tree-pass.h"
 #include "langhooks.h"
 #include "flags.h"
 #include "gimple.h"
 #include "expr.h"
 #include "cfgloop.h"
+#include "tree-vectorizer.h"
 
 /* This pass propagates the RHS of assignment statements into use
sites of the LHS of the assignment.  It's basically a specialized
form of tree combination.   It is hoped all of this can disappear
when we have a generalized tree combiner.
 
One class of common cases we handle is forward propagating a single use
variable into a COND_EXPR.
 
  bb0:
@@ -2787,20 +2788,98 @@ simplify_permutation (gimple_stmt_iterat
   if (TREE_CODE (op0) == SSA_NAME)
ret = remove_prop_source_from_use (op0);
   if (op0 != op1 && TREE_CODE (op1) == SSA_NAME)
ret |= remove_prop_source_from_use (op1);
   return ret ? 2 : 1;
 }
 
   return 0;
 }
 
+/* Recognize a VEC_PERM_EXPR.  Returns true if there were any changes.  */
+
+static bool
+simplify_vector_constructor (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  gimple def_stmt;
+  tree op, op2, orig, type, elem_type;
+  unsigned elem_size, nelts, i;
+  enum tree_code code;
+  constructor_elt *elt;
+  unsigned char *sel;
+  bool maybe_ident;
+
+  gcc_checking_assert (gimple_assign_rhs_code (stmt) == CONSTRUCTOR);
+
+  op = gimple_assign_rhs1 (stmt);
+  type = TREE_TYPE (op);
+  gcc_checking_assert (TREE_CODE (type) == VECTOR_TYPE);
+
+  nelts = TYPE_VECTOR_SUBPARTS (type);
+  elem_type = TREE_TYPE (type);
+  elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
+
+  sel = XALLOCAVEC (unsigned char, nelts);
+  orig = NULL;
+  maybe_ident = true;
+  FOR_EACH_VEC_ELT (constructor_elt, CONSTRUCTOR_ELTS (op), i, elt)
+{
+  tree ref, op1;
+
+  if (i >= nelts)
+   return false;
+
+  if (TREE_CODE (elt->value) != SSA_NAME)
+   return false;
+  def_stmt = SSA_NAME_DEF_STMT (elt->value);
+  if (!def_stmt || !is_gimple_assign (def_stmt))
+   return false;
+  code = gimple_assign_rhs_code (def_stmt);
+  if (code != BIT_FIELD_REF)
+   return false;
+  op1 = gimple_assign_rhs1 (def_stmt);
+  ref = TREE_OPERAND (op1, 0);
+  if (orig)
+   {
+ if (ref != orig)
+   return false;
+   }
+  else
+   {
+ if (TREE_CODE (ref) != SSA_NAME)
+   return false;
+ orig = ref;
+   }
+  if (TREE_INT_CST_LOW (TREE_OPERAND (op1, 1)) != elem_size)
+   return false;
+  sel[i] = TREE_INT_CST_LOW (TREE_OPERAND (op1, 2)) / elem_size;
+  if (sel[i] != i) maybe_ident = false;
+}
+  if (i < nelts)
+return false;
+
+  if (maybe_ident)
+{
+  gimple_assign_set_rhs_from_tree (gsi, orig);
+}
+  else
+{
+  op2 = vect_gen_perm_mask (type, sel);
+  if (!op2)
+   return false;
+  gimple_assign_set_rhs_with_ops_1 (gsi, VEC_PERM_EXPR, orig, orig, op2);
+}
+  update_stmt (gsi_stmt (*gsi));
+  return true;
+}
+
 /* Main entry point for the forward propagation and statement combine
optimizer.  */
 
 static unsigned int
 ssa_forward_propagate_and_combine (void)
 {
   basic_block bb;
   unsigned int todoflags = 0;
 
   cfg_changed = false;
@@ -2958,20 +3037,23 @@ ssa_forward_propagate_and_combine (void)
  }
else if (code == VEC_PERM_EXPR)
  {
int did_something = simplify_permutation (&gsi);
if (did_something == 2)
  cfg_changed = true;
changed = did_something != 0;
  }
else if (code == BIT_FIELD_REF)
  changed 

Obsolete picochip-* in 4.7.2+

2012-09-11 Thread Jakub Jelinek
Hi!

As discussed on IRC, the picochip-* port doesn't have an active maintainer
anymore.  This patch adds it to the deprecated ports for 4.7.2+ so that it
can be removed in GCC 4.8 unless somebody steps up to maintain it.

Ok for trunk/4.7?

2012-09-11  Jakub Jelinek  

* config.gcc: Obsolete picochip-*.

--- gcc/config.gcc  2012-09-05 14:52:14.428548941 +0200
+++ gcc/config.gcc  2012-09-11 17:05:15.147522191 +0200
@@ -245,7 +245,8 @@ md_file=
 
 # Obsolete configurations.
 case ${target} in
-   score-* \
+   picochip-*  \
+ | score-* \
  )
 if test "x$enable_obsolete" != xyes; then
   echo "*** Configuration ${target} is obsolete." >&2

--- gcc-4.7/changes.html  10 Aug 2012 16:25:46 -  1.124
+++ gcc-4.7/changes.html  11 Sep 2012 15:15:38 -
@@ -29,7 +29,14 @@
 next release of GCC will have their sources permanently
 removed.
 
-The following ports for individual systems on
+All GCC ports for the following processor
+architectures have been declared obsolete:
+
+
+ picoChip (picochip-*)
+
+
+The following ports for individual systems on
 particular architectures have been obsoleted:
 
 


Jakub


[C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters

2012-09-11 Thread Paolo Carlini

Hi,

since we are now using C++, I think we can remove the attributes and 
just use unnamed parameters. For now I kept the names in comments for 
documentation purposes, but would be glad to remove those too, if you like.


Booted and tested x86_64-linux.

Thanks,
Paolo.

PS: slightly interesting, in a couple of cases - 
write_unnamed_type_name, wrap_cleanups_r - the parameters were actually 
used.
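A minimal sketch of the idea, with made-up names (`toy_location_t`, `toy_build_binary_op`, `toy_call_hook` are not GCC identifiers): the parameter has to stay in the signature because the function is called through a shared interface, but in C++ it can simply be left unnamed, with the old name kept as a comment.

```cpp
#include <cassert>

typedef unsigned int toy_location_t;  // hypothetical stand-in for location_t

// Shared hook signature: every implementation must accept a location,
// whether or not it uses one.
typedef int (*toy_binary_hook)(toy_location_t, int, int);

// C style would spell this "toy_location_t loc ATTRIBUTE_UNUSED".
// In C++ the parameter is left unnamed; the name survives only as a comment,
// and no unused-parameter warning is issued.
inline int toy_build_binary_op(toy_location_t /*loc*/, int op0, int op1) {
  return op0 + op1;
}

// Caller side: the location argument is still passed as usual.
inline int toy_call_hook(toy_binary_hook hook, int op0, int op1) {
  assert(hook != nullptr);
  return hook(0u, op0, op1);
}
```

A side effect of dropping the name is that any accidental use of the parameter becomes a compile error, which is presumably how the two "actually used" cases mentioned above were noticed.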


//
2012-09-11  Paolo Carlini  

* typeck.c (build_indirect_ref, build_function_call,
build_function_call_vec, build_binary_op, build_unary_op,
build_compound_expr, build_c_cast, build_modify_expr): Remove
uses of ATTRIBUTE_UNUSED on the parameters.
* class.c (set_linkage_according_to_type, resort_type_method_vec,
dfs_find_final_overrider_post, empty_base_at_nonzero_offset_p):
Likewise.
* decl.c (local_variable_p_walkfn): Likewise.
* except.c (wrap_cleanups_r, check_noexcept_r): Likewise.
* error.c (find_typenames_r): Likewise.
* tree.c (verify_stmt_tree_r, bot_replace,
handle_java_interface_attribute, handle_com_interface_attribute,
handle_init_priority_attribute, c_register_addr_space): Likewise.
* cp-gimplify.c (cxx_omp_clause_default_ctor): Likewise.
* cp-lang.c (objcp_tsubst_copy_and_build): Likewise.
* pt.c (unify_success, unify_invalid, instantiation_dependent_r):
Likewise.
* semantics.c (dfs_calculate_bases_pre): Likewise.
* decl2.c (fix_temporary_vars_context_r, clear_decl_external):
Likewise.
* parser.c (cp_lexer_token_at, cp_parser_omp_clause_mergeable,
cp_parser_omp_clause_nowait, cp_parser_omp_clause_ordered,
cp_parser_omp_clause_untied): Likewise.
* mangle.c (write_unnamed_type_name,
discriminator_for_string_literal): Likewise.
* search.c (dfs_accessible_post, dfs_debug_mark): Likewise.
* lex.c (handle_pragma_vtable, handle_pragma_unit,
handle_pragma_interface, handle_pragma_implementation,
handle_pragma_java_exceptions): Likewise.

Index: typeck.c
===
--- typeck.c  (revision 191177)
+++ typeck.c  (working copy)
@@ -2772,7 +2772,7 @@ build_x_indirect_ref (location_t loc, tree expr, r
 
 /* Helper function called from c-common.  */
 tree
-build_indirect_ref (location_t loc ATTRIBUTE_UNUSED,
+build_indirect_ref (location_t /*loc*/,
tree ptr, ref_operator errorstring)
 {
   return cp_build_indirect_ref (ptr, errorstring, tf_warning_or_error);
@@ -3207,7 +3207,7 @@ get_member_function_from_ptrfunc (tree *instance_p
 
 /* Used by the C-common bits.  */
 tree
-build_function_call (location_t loc ATTRIBUTE_UNUSED, 
+build_function_call (location_t /*loc*/, 
 tree function, tree params)
 {
   return cp_build_function_call (function, params, tf_warning_or_error);
@@ -3215,9 +3215,9 @@ tree
 
 /* Used by the C-common bits.  */
 tree
-build_function_call_vec (location_t loc ATTRIBUTE_UNUSED,
+build_function_call_vec (location_t /*loc*/,
 tree function, VEC(tree,gc) *params,
-VEC(tree,gc) *origtypes ATTRIBUTE_UNUSED)
+VEC(tree,gc) * /*origtypes*/)
 {
   VEC(tree,gc) *orig_params = params;
   tree ret = cp_build_function_call_vec (function, &params,
@@ -3693,7 +3693,7 @@ enum_cast_to_int (tree op)
 /* For the c-common bits.  */
 tree
 build_binary_op (location_t location, enum tree_code code, tree op0, tree op1,
-int convert_p ATTRIBUTE_UNUSED)
+int /*convert_p*/)
 {
   return cp_build_binary_op (location, code, op0, op1, tf_warning_or_error);
 }
@@ -5448,7 +5448,7 @@ cp_build_unary_op (enum tree_code code, tree xarg,
 
 /* Hook for the c-common bits that build a unary op.  */
 tree
-build_unary_op (location_t location ATTRIBUTE_UNUSED,
+build_unary_op (location_t /*location*/,
enum tree_code code, tree xarg, int noconvert)
 {
   return cp_build_unary_op (code, xarg, noconvert, tf_warning_or_error);
@@ -5784,7 +5784,7 @@ build_x_compound_expr (location_t loc, tree op1, t
 /* Like cp_build_compound_expr, but for the c-common bits.  */
 
 tree
-build_compound_expr (location_t loc ATTRIBUTE_UNUSED, tree lhs, tree rhs)
+build_compound_expr (location_t /*loc*/, tree lhs, tree rhs)
 {
   return cp_build_compound_expr (lhs, rhs, tf_warning_or_error);
 }
@@ -6652,7 +6652,7 @@ build_const_cast (tree type, tree expr, tsubst_fla
 /* Like cp_build_c_cast, but for the c-common bits.  */
 
 tree
-build_c_cast (location_t loc ATTRIBUTE_UNUSED, tree type, tree expr)
+build_c_cast (location_t /*loc*/, tree type, tree expr)
 {
   return cp_build_c_cast (type, expr, tf_warning_or_error);
 }
@@ -6782,11 +6782,11 @@ cp_build_c_cast (tree type, tree expr, tsubst_flag
 
 /* For use from the C common bits.  */
 tree
-build_modify_expr (location_t location ATTRIBUTE

Re: shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Christian Bruel
Actually, the edge is fairly simple. I have

BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT

BB10 has no other incoming edges, and we are duplicating it.

My hypothesis is that with a gcov-based profile we should never have
such partitioning on the edges; BB10 should be COLD as well. My
suggestion was to keep shrink-wrapping from failing on the block duplication
for this case, but that would hide the real cause. I now prefer to
understand why BB10 is HOT in the first place... if the assumption
that it should not be is correct.

Thanks

Christian


On 09/11/2012 02:46 PM, Steven Bosscher wrote:
>> Does this restriction look right to you ? (regression tests are still
>> running on x86 and sh)
> 
> Please generate your patches with diff -up (or svn diff -x -up).
> 
>> +&& (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
> 
> No need for parentheses around this check.
> 
> The shrink wrapping code appears to be dealing with partitioning, or
> at least there are BB_COPY_PARTITIONs further down. So I can't tell
> whether this fix is correct. Can you show in more detail what happens?
> (A dotty graph is always helpful ;-)
> 
> Ciao!
> Steven
> 


Re: [PATCH] Combine location with block using block_locations

2012-09-11 Thread Michael Matz
Hi,

On Tue, 11 Sep 2012, Dehao Chen wrote:

> Looks like we have two choices:
> 
> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t)

This will actually not work correctly in some cases.  The problem is, if 
the prevailing decl is already part of another chain (say in another 
block_var list) you would break the current chain.  Hence block vars need 
special handling in the lto streamer (another reason why tree_chain is not 
the most clever thing to use for this chain).  This problem area needs to 
be solved somehow if block info is to be preserved correctly.
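
The chain-breaking hazard described above can be reproduced outside GCC with any intrusive singly linked list. The following is only a sketch: `Node`, `chain_length` and `replace_in_chain` are hypothetical stand-ins, not the LTO streamer's actual types or functions, and the `chain` field merely plays the role of TREE_CHAIN.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a decl whose chain link lives inside the
// node itself (an intrusive singly linked list, like TREE_CHAIN).
struct Node
{
  int id;
  Node *chain;
};

// Count the nodes reachable from HEAD through the intrusive link.
std::size_t chain_length (const Node *head)
{
  std::size_t n = 0;
  for (; head; head = head->chain)
    ++n;
  return n;
}

// Naively make *SLOT point at REPLACEMENT, splicing it in by overwriting
// its chain field with the successor of the node it replaces.
void replace_in_chain (Node **slot, Node *replacement)
{
  assert (slot && *slot && replacement);
  Node *old = *slot;
  replacement->chain = old->chain;  // clobbers any list REPLACEMENT was in
  *slot = replacement;
}
```

If a node that already sits in the middle of one list is spliced into a second list this way, the second list looks correct afterwards, but the first list silently loses everything that followed the shared node.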

> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL
> (TREE_CHAIN (t)).

That's also a large hammer as it basically will mean no debug info after 
LTO :-/ Sigh, at this point I have no good solution that doesn't involve 
quite some work, perhaps your hack is good enough for the time being, 
though I hate it :)


Ciao,
Michael.


Re: Change double_int calls to new interface.

2012-09-11 Thread Mark Kettenis
> Index: gcc/ChangeLog
> 
> 2012-09-04  Lawrence Crowl  
> 
>   * double-int.h (double_int::operator &=): New.
>   (double_int::operator ^=): New.
>   (double_int::operator |=): New.
>   (double_int::mul_with_sign): Modify overflow parameter to bool*.
>   (double_int::add_with_sign): New.
>   (double_int::ule): New.
>   (double_int::sle): New.
>   (binary double_int::operator *): Remove parameter name.
>   (binary double_int::operator +): Likewise.
>   (binary double_int::operator -): Likewise.
>   (binary double_int::operator &): Likewise.
>   (double_int::operator |): Likewise.
>   (double_int::operator ^): Likewise.
>   (double_int::and_not): Likewise.
>   (double_int::from_shwi): Tidy formatting.
>   (double_int::from_uhwi): Likewise.
>   (double_int::from_uhwi): Likewise.
>   * double-int.c (double_int::mul_with_sign): Modify overflow 
> parameter
>   to bool*.
>   (double_int::add_with_sign): New.
>   (double_int::ule): New.
>   (double_int::sle): New.
>   * builtins.c: Modify to use the new double_int interface.
>   * cgraph.c: Likewise.
>   * combine.c: Likewise.
>   * dwarf2out.c: Likewise.
>   * emit-rtl.c: Likewise.
>   * expmed.c: Likewise.
>   * expr.c: Likewise.
>   * fixed-value.c: Likewise.
>   * fold-const.c: Likewise.
>   * gimple-fold.c: Likewise.
>   * gimple-ssa-strength-reduction.c: Likewise.
>   * gimplify-rtx.c: Likewise.
>   * ipa-prop.c: Likewise.
>   * loop-iv.c: Likewise.
>   * optabs.c: Likewise.
>   * stor-layout.c: Likewise.
>   * tree-affine.c: Likewise.
>   * tree-cfg.c: Likewise.
>   * tree-dfa.c: Likewise.
>   * tree-flow-inline.h: Likewise.
>   * tree-object-size.c: Likewise.
>   * tree-predcom.c: Likewise.
>   * tree-pretty-print.c: Likewise.
>   * tree-sra.c: Likewise.
>   * tree-ssa-address.c: Likewise.
>   * tree-ssa-alias.c: Likewise.
>   * tree-ssa-ccp.c: Likewise.
>   * tree-ssa-forwprop.c: Likewise.
>   * tree-ssa-loop-ivopts.c: Likewise.
>   * tree-ssa-loop-niter.c: Likewise.
>   * tree-ssa-phiopt.c: Likewise.
>   * tree-ssa-pre.c: Likewise.
>   * tree-ssa-sccvn: Likewise.
>   * tree-ssa-structalias.c: Likewise.
>   * tree-ssa.c: Likewise.
>   * tree-switch-conversion.c: Likewise.
>   * tree-vect-loop-manip.c: Likewise.
>   * tree-vrp.c: Likewise.
>   * tree.h: Likewise.
>   * tree.c: Likewise.
>   * varasm.c: Likewise.

I fear this has broken hppa.  Bootstrap on OpenBSD/hppa now fails with:


In file included from ../../../src/gcc/gcc/mcf.c:47:0:
../../../src/gcc/gcc/mcf.c: In function 'void dump_fixup_edge(FILE*, 
fixup_graph_type*, fixup_edge_p)':
../../../src/gcc/gcc/system.h:288:78: error: integer overflow in expression 
[-Werror=overflow]
  ? ~ (t) 0 << (sizeof(t) * CHAR_BIT - 1) : (t) 0))
  ^
../../../src/gcc/gcc/system.h:289:44: note: in expansion of macro 
'INTTYPE_MINIMUM'
 #define INTTYPE_MAXIMUM(t) ((t) (~ (t) 0 - INTTYPE_MINIMUM (t)))
^
../../../src/gcc/gcc/mcf.c:55:22: note: in expansion of macro 'INTTYPE_MAXIMUM'
 #define CAP_INFINITY INTTYPE_MAXIMUM (HOST_WIDEST_INT)
  ^
../../../src/gcc/gcc/mcf.c:211:34: note: in expansion of macro 'CAP_INFINITY'
   if (fedge->max_capacity == CAP_INFINITY)
  ^

Something must be wrong with the overflow detection logic in the new
double_int interfaces.  I suspect this is because for hppa
HOST_WIDE_INT is 32 bits wide, since on i386 and x86_64 I don't hit this.
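
Whatever the root cause turns out to be, the diagnostic points at the shift of a negative value in the INTTYPE_MINIMUM macro. As an illustration only, the same limits can be computed without ever shifting a signed value by doing the bit manipulation in the unsigned counterpart type. `inttype_maximum` and `inttype_minimum` below are hypothetical helpers, not a proposed change to system.h, and the sketch assumes C++11 and two's complement.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helper: largest value of the signed integer type T,
// computed in the corresponding unsigned type so that no negative value
// is ever shifted.
template <typename T>
constexpr T inttype_maximum ()
{
  typedef typename std::make_unsigned<T>::type U;
  return static_cast<T> (static_cast<U> (~static_cast<U> (0)) >> 1);
}

// Hypothetical helper: smallest value of T, derived from the maximum.
// Assumes two's complement, as the original macro does.
template <typename T>
constexpr T inttype_minimum ()
{
  return static_cast<T> (-inttype_maximum<T> () - 1);
}
```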


Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)

2012-09-11 Thread Diego Novillo

On 2012-09-11 08:42 , Dominique Dhumieres wrote:

This is ok, of course.


Then could you please commit it (I don't have write access)?


Done.  Rev 191192.


2012-09-11  Dominique Dhumieres  

* config/darwin.c (darwin_asm_named_section): Adjust for
VEC changes.
(darwin_asm_dwarf_section): Likewise.

diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index 33a831f..54c92d1 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *name,
  the assumption of how this is done.  */
   if (lto_section_names == NULL)
 lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16);
-  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e);
+  VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e);
}
   else if (strncmp (name, "__DWARF,", 8) == 0)
 darwin_asm_dwarf_section (name, flags, decl);
@@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *name, unsigned int flags,

   fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname);
   e.count = 1;
   e.name = xstrdup (sname);
-  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e);
+  VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e);
 }
 }



Re: [C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters

2012-09-11 Thread Jakub Jelinek
On Tue, Sep 11, 2012 at 05:29:12PM +0200, Paolo Carlini wrote:
> PS: slightly interesting, in a couple of cases -
> write_unnamed_type_name, wrap_cleanups_r - the parameters were
> actually used.

Just a general comment, often an argument is only conditionally used,
e.g. depending on some preprocessor macro (e.g. target hook).  In that
case unnamed parameter is not an option, but dropping ATTRIBUTE_UNUSED is
not desirable either.
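
The distinction can be sketched with two small functions. This is an illustration, not code from the patch: the function names and the `HYPOTHETICAL_TARGET_HOOK` macro are made up, and the attribute is spelled out in its GCC/Clang form rather than through the ATTRIBUTE_UNUSED macro.

```cpp
#include <cassert>

// Case 1: the parameter is never used, so it can simply be left unnamed
// (the name survives only as a comment).
int always_ignores (int value, int /*flags*/)
{
  return value + 1;
}

// Case 2: the parameter is used only when a (hypothetical) configuration
// macro is defined, so it needs a name.  The attribute keeps
// -Wunused-parameter quiet in the configuration that ignores it.
int sometimes_uses (int value, int flags __attribute__ ((unused)))
{
#ifdef HYPOTHETICAL_TARGET_HOOK
  return value + flags;
#else
  return value + 1;
#endif
}
```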

Jakub


Re: shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Steven Bosscher
On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel  wrote:
> Actually, the edge is fairly simple. I have
>
> BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT
>
> and BB10 has no other incoming edges. and we are duplicating it.

That is wrong, should never happen. Is there a test case to play with?
It'd be good to have a PR for this.

Ciao!
Steven


Re: Change double_int calls to new interface.

2012-09-11 Thread Andreas Schwab
Mark Kettenis  writes:

> In file included from ../../../src/gcc/gcc/mcf.c:47:0:
> ../../../src/gcc/gcc/mcf.c: In function 'void dump_fixup_edge(FILE*, 
> fixup_graph_type*, fixup_edge_p)':
> ../../../src/gcc/gcc/system.h:288:78: error: integer overflow in expression 
> [-Werror=overflow]

This is PR54528.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Fix var-tracking for window register targets

2012-09-11 Thread Diego Novillo
Caught on a sparc build.

Testing on sparc.  Will commit once it finishes.


Diego.

* var-tracking.c (vt_add_function_parameter): Adjust for VEC
changes.

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 8c9ec48..9f5bc12 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9356,13 +9356,13 @@ vt_add_function_parameter (tree parm)
   && HARD_REGISTER_P (incoming)
   && OUTGOING_REGNO (REGNO (incoming)) != REGNO (incoming))
 {
-  parm_reg_t *p
-   = VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, NULL);
-  p->incoming = incoming;
+  parm_reg_t p;
+  p.incoming = incoming;
   incoming
= gen_rtx_REG_offset (incoming, GET_MODE (incoming),
  OUTGOING_REGNO (REGNO (incoming)), 0);
-  p->outgoing = incoming;
+  p.outgoing = incoming;
+  VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, p);
 }
   else if (MEM_P (incoming)
   && REG_P (XEXP (incoming, 0))
@@ -9371,11 +9371,11 @@ vt_add_function_parameter (tree parm)
   rtx reg = XEXP (incoming, 0);
   if (OUTGOING_REGNO (REGNO (reg)) != REGNO (reg))
{
- parm_reg_t *p
-   = VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, NULL);
- p->incoming = reg;
+ parm_reg_t p;
+ p.incoming = reg;
  reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg)));
- p->outgoing = reg;
+ p.outgoing = reg;
+ VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, p);
  incoming = replace_equiv_address_nv (incoming, reg);
}
 }
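
The shape of the change above (fill in a local element, then push it by value, instead of pushing an empty slot and writing through the returned pointer) can be sketched as follows. `std::vector` and the `parm_reg` struct are used purely for illustration and are assumptions; they are not GCC's VEC or the real parm_reg_t, whose fields are rtx values.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for parm_reg_t.
struct parm_reg
{
  int outgoing;
  int incoming;
};

// Build the element completely in a local, then push a copy.
void record_parm (std::vector<parm_reg> &regs, int incoming, int outgoing)
{
  parm_reg p;
  p.incoming = incoming;
  p.outgoing = outgoing;
  regs.push_back (p);
}
```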


Re: shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Christian Bruel


On 09/11/2012 05:40 PM, Steven Bosscher wrote:
> On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel  
> wrote:
>> Actually, the edge is fairly simple. I have
>>
>> BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT
>>
>> and BB10 has no other incoming edges. and we are duplicating it.
> 
> That is wrong, should never happen. Is there a test case to play with?

Thanks for the confirmation. The case happens on SH only when applying
the simple_return patch [PR target/54546] on the bb-reorder test from
the testsuite.

> It'd be good to have a PR for this.

I'll update the PR above with what I find; let's see if this turns out to
be target independent.

thanks

Christian

> 
> Ciao!
> Steven
> 


Re: Obsolete picochip-* in 4.7.2+

2012-09-11 Thread Daniel Towner



Hi!

As discussed on IRC, the picochip-* port doesn't have an active maintainer
anymore, this patch adds it to deprecated ports for 4.7.2+ so that it can be 
removed in
GCC 4.8 unless somebody steps up to maintain it.

Ok for trunk/4.7?

2012-09-11  Jakub Jelinek

* config.gcc: Obsolete picochip-*.

--- gcc/config.gcc  2012-09-05 14:52:14.428548941 +0200
+++ gcc/config.gcc  2012-09-11 17:05:15.147522191 +0200
@@ -245,7 +245,8 @@ md_file=

  # Obsolete configurations.
  case ${target} in
-   score-* \
+   picochip-*  \
+ | score-* \
   )
  if test "x$enable_obsolete" != xyes; then
echo "*** Configuration ${target} is obsolete.">&2

--- gcc-4.7/changes.html10 Aug 2012 16:25:46 -  1.124
+++ gcc-4.7/changes.html11 Sep 2012 15:15:38 -
@@ -29,7 +29,14 @@
  next release of GCC will have their sources permanently
  removed.

-The following ports for individual systems on
+All GCC ports for the following processor
+architectures have been declared obsolete:
+
+
+   picoChip (picochip-*)
+
+
+The following ports for individual systems on
  particular architectures have been obsoleted:

  



As some of you will be aware, picoChip was acquired earlier this year by 
Mindspeed Technologies. Although the picoChip specific tool chain, which 
includes the port of GCC, is still being actively used by customers in 
existing products, further development of picoChip products is ceasing 
and customers are migrating to the equivalent Mindspeed products. No 
further development work will be undertaken for picoGcc, and no one 
within Mindspeed will be able to continue to support the port, so it is 
right that the picochip port should be obsoleted.


Thank you to everyone who has helped myself and the other maintainers of 
the picochip port over the years.


regards,

dan.

--
--
Daniel Towner, Mindspeed Technologies Inc.
Upper Borough Court, Upper Borough Walls, Bath BA1 1RG, UK
daniel.tow...@mindspeed.com
+44 7786 702589





Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector

2012-09-11 Thread Marc Glisse

On Sun, 9 Sep 2012, Marc Glisse wrote:


Hello,

this patch lets the compiler try to rewrite:

(vec_concat (vec_select x [a]) (vec_select x [b]))

as:

vec_select x [a b]

or even just "x" if appropriate.

In a first iteration I was restricting it to b-a==1, but it seemed better not 
to: it helps for {v[1],v[0]} and doesn't change anything for unknown 
patterns.
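
The identity behind the rewrite can be checked with a small software model. This is only a sketch: `vec_select` and `vec_concat` below are hypothetical C++ functions that mimic the RTL operations on plain integer vectors; they are not the simplify-rtx.c code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Vec;

// Model of (vec_select x [lanes...]): pick the listed lanes of X in order.
Vec vec_select (const Vec &x, const std::vector<std::size_t> &lanes)
{
  Vec out;
  for (std::size_t i = 0; i < lanes.size (); ++i)
    out.push_back (x.at (lanes[i]));
  return out;
}

// Model of (vec_concat a b): the elements of A followed by those of B.
Vec vec_concat (const Vec &a, const Vec &b)
{
  Vec out (a);
  out.insert (out.end (), b.begin (), b.end ());
  return out;
}
```

In this model, concatenating the selections of lanes 1 and 0 of a two-element vector gives the same result as a single selection of lanes [1 0], and selecting lanes 0 and 1 in order reproduces the vector itself, which corresponds to the "just x" case.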


Note that I am planning to do a similar optimization at tree level, but it 
shouldn't make this one useless because such patterns can be created during 
rtl passes. The testcase may need an additional -fno-tree-xxx to still be 
useful at that point though.


Since the tree-ssa patch was reviewed faster, assume there is a 
-fno-tree-forwprop in dg-options for the testcase.



bootstrap+testsuite on x86_64-linux-gnu.

2012-09-09  Marc Glisse  

gcc/
* simplify-rtx.c (simplify_binary_operation_1): Handle vec_concat
of vec_selects from the same vector.

gcc/testsuite/
* gcc.target/i386/vect-rebuild.c: New testcase.


http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00540.html

--
Marc Glisse


Re: [PATCH] Combine location with block using block_locations

2012-09-11 Thread Dehao Chen
I saw comments in tree-streamer-out.c:

  /* Do not stream BLOCK_SOURCE_LOCATION.  We cannot handle debug information
 for early inlining so drop it on the floor instead of ICEing in
 dwarf2out.c.  */
  streamer_write_chain (ob, BLOCK_VARS (expr), ref_p);

However, what the code is doing seems to contradict the comment.
Or am I missing something?



On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz  wrote:
> Hi,
>
> On Tue, 11 Sep 2012, Dehao Chen wrote:
>
>> Looks like we have two choices:
>>
>> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t)
>
> This will actually not work correctly in some cases.  The problem is, if
> the prevailing decl is already part of another chain (say in another
> block_var list) you would break the current chain.  Hence block vars need
> special handling in the lto streamer (another reason why tree_chain is not
> the most clever thing to use for this chain).  This problem area needs to
> be solved somehow if block info is to be preserved correctly.
>
>> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL
>> (TREE_CHAIN (t)).
>
> That's also a large hammer as it basically will mean no debug info after
> LTO :-/ Sigh, at this point I have no good solution that doesn't involve
> quite some work, perhaps your hack is good enough for the time being,
> though I hate it :)

I got it. Then I'll keep the patch as it is (remove the
LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and
then we should be good to check in?

Thanks,
Dehao

>
>
> Ciao,
> Michael.


Re: [C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters

2012-09-11 Thread Paolo Carlini

On 09/11/2012 05:37 PM, Jakub Jelinek wrote:

On Tue, Sep 11, 2012 at 05:29:12PM +0200, Paolo Carlini wrote:

PS: slightly interesting, in a couple of cases -
write_unnamed_type_name, wrap_cleanups_r - the parameters were
actually used.

Just a general comment, often an argument is only conditionally used,
e.g. depending on some preprocessor macro (e.g. target hook).  In that
case unnamed parameter is not an option, but dropping ATTRIBUTE_UNUSED is
not desirable either.
Of course. As far as I can see, that isn't the case for the C++ 
front-end uses, but hey, if you spot something which *may* be less than 
straightforward in my patch, please let me know asap!


Paolo.


Re: shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Christian Bruel
when running a cfg dump, I get many messages like:

Invalid sum of incoming frequencies 1667, should be 3334

So it looks like profile information was not correctly propagated
somewhere, which could lead to such a partitioning inconsistency. I have no
idea for the moment whether this is a local problem or not; I just want to
share this in case someone has input on it.

Cheers

Christian


On 09/11/2012 05:40 PM, Steven Bosscher wrote:
> On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel  
> wrote:
>> Actually, the edge is fairly simple. I have
>>
>> BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT
>>
>> and BB10 has no other incoming edges. and we are duplicating it.
> 
> That is wrong, should never happen. Is there a test case to play with?
> It'd be good to have a PR for this.
> 
> Ciao!
> Steven
> 


Re: shrink-wrapping duplicates BBs across partitions.

2012-09-11 Thread Jakub Jelinek
On Tue, Sep 11, 2012 at 05:40:30PM +0200, Steven Bosscher wrote:
> On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel  
> wrote:
> > Actually, the edge is fairly simple. I have
> >
> > BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT
> >
> > and BB10 has no other incoming edges. and we are duplicating it.
> 
> That is wrong, should never happen. Is there a test case to play with?
> It'd be good to have a PR for this.

Isn't that the standard case when !HAVE_return ?  Then you can have only a
single return through epilogue, and when the epilogue is in the hot
partition, even if cold code is returning, it needs to jump to the epilogue.

Jakub


Re: Scheduler: Allow breaking dependencies by modifying patterns

2012-09-11 Thread Vladimir Makarov

On 08/03/2012 08:05 AM, Bernd Schmidt wrote:

This patch allows us to change

rn++
rm=[rn]

into

rm=[rn + 4]
rn++
That is an interesting optimization.  I think an analogous optimization 
could be done for INC/DEC addressing (it might be beneficial 
for ppc, which has both such addressing and displacement addressing).  
It would complicate the haifa scheduler quite a lot, though, as a new 
insn is generated, and the real benefits may not be worth it (the 
additional insn could in many cases even result in worse code).
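
The equivalence of the two orderings can be modelled with ordinary pointer arithmetic. This is a sketch only: the function names are made up, and `rn` is a C++ pointer to int standing in for the address register, so the "+ 4" of the example corresponds to "+ 1" element when sizeof (int) is 4.

```cpp
#include <cassert>

// Original order: increment the address register, then load through it.
int load_after_increment (const int *&rn)
{
  rn = rn + 1;         // rn++  (advances by sizeof (int) bytes)
  return *rn;          // rm = [rn]
}

// Reordered: load with a displacement first, then increment.
int load_before_increment (const int *&rn)
{
  int rm = *(rn + 1);  // rm = [rn + 4] when sizeof (int) == 4
  rn = rn + 1;         // rn++
  return rm;
}
```

Both functions return the same value and leave the pointer in the same place; the second form no longer needs the increment to finish before the load can issue.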

Opportunities to do this are discovered by a mini-pass over the
instructions after generating dependencies and before scheduling a
block. At that point we have all the information required to ensure that
a candidate dep between two instructions is only used to show the
register dependence, and to ensure that every insn with a memory
reference is only subject to at most one dep causing a pattern change.

The dep_t structure is extended to hold an optional pointer to a
"replacement description", which holds information about what to change
when a dependency is broken. The time when this replacement is applied
differs depending on whether the changed insn is the DEP_CON (in which
case the pattern is changed whenever the broken dependency becomes the
last one), or the DEP_PRO, in which case we make the change when the
corresponding DEP_CON has been scheduled. This ensures that the ready
list always contains insns with the correct pattern.

A few additional bits are needed in the dep structure: one to hold
information about whether a dependency occurs multiple times, and one to
distinguish dependencies that are purely for register values from those
with other meanings (e.g. memory references).

Also, sched-rgn was changed to use a new bit, DEP_POSTPONED, rather than
HARD_DEP to indicate that we don't want to schedule an insn in the
current block.

A possible future extension would be to also allow autoinc addressing
modes as the increment insn.

Bootstrapped and tested on x86_64-linux, and also tested on c6x-elf
(quite a number of changes were necessary to make it work there). It was
originally written for a mips target and tested there in the context of
a 4.6 tree. I've also run spec2000 on x86_64, with no change that looked
like anything other than noise. Ok?


Ok, thanks.  The changes are pretty straightforward.  Just a few 
comments.


One is a missing ChangeLog entry for haifa_note_dep.

Second one is for

+  /* Cached cost of the dependency.  Make sure to update UNKNOWN_DEP_COST
+ when changing the size of this field.  */
+  int cost:20;
 };

+#define UNKNOWN_DEP_COST (-1<<19)
+

You could use a macro to define the bit width and derive UNKNOWN_DEP_COST 
from it, but that is probably a matter of taste.
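
One possible shape for that suggestion, as a sketch only: `DEP_COST_BITS` and `dep_sketch` are hypothetical names, not part of the patch. Writing the sentinel as -(1 << (n - 1)) also avoids left-shifting a negative constant.

```cpp
#include <cassert>

// Hypothetical names: the width is written once and the sentinel is
// derived from it, so the two cannot drift apart.
#define DEP_COST_BITS 20
#define UNKNOWN_DEP_COST (-(1 << (DEP_COST_BITS - 1)))

struct dep_sketch
{
  // The smallest value representable in the field doubles as
  // "cost not yet computed".
  signed int cost : DEP_COST_BITS;
};

// True once a real cost has been cached in D.
bool cost_known_p (const dep_sketch &d)
{
  return d.cost != UNKNOWN_DEP_COST;
}
```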

The third one is success_in_block in find_modifiable_mems.  It is calculated 
but never used.  It was probably used for debugging.  You should do something 
about it.

Thanks for the patch, Bernd.  Sorry for the delay with the review.  I thought 
that Maxim would write his comments first.



Re: [C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters

2012-09-11 Thread Gabriel Dos Reis
On Tue, Sep 11, 2012 at 10:37 AM, Jakub Jelinek  wrote:
> On Tue, Sep 11, 2012 at 05:29:12PM +0200, Paolo Carlini wrote:
>> PS: slightly interesting, in a couple of cases -
>> write_unnamed_type_name, wrap_cleanups_r - the parameters were
>> actually used.
>
> Just a general comment, often an argument is only conditionally used,
> e.g. depending on some preprocessor macro (e.g. target hook).  In that
> case unnamed parameter is not an option, but dropping ATTRIBUTE_UNUSED is
> not desirable either.

That a parameter is unused in a function body should be clear from the context.
And in those case, it is desirable that the parameter be unnamed, and
the attribute
be dropped.  That is what Paolo's patch is doing.  That should not be
controversial.

-- Gaby


RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Iyer, Balaji V
Please see my answers below

>-Original Message-
>From: Richard Henderson [mailto:r...@redhat.com]
>Sent: Monday, September 10, 2012 12:38 PM
>To: Iyer, Balaji V
>Cc: Richard Guenther; gcc-patches@gcc.gnu.org; Gabriel Dos Reis; Aldy
>Hernandez (al...@redhat.com); Jeff Law
>Subject: Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
>
>On 09/10/2012 09:09 AM, Iyer, Balaji V wrote:
>>> >If that's the case, what's the point in defining an external ABI and
>>> >defining what
>>> >__attribute__((vector)) placed on a function declaration means?
>
>> When you have __attribute__((vector)) you are asking the compiler to
>> create a vector AND a scalar version of the function. The advantage is
>> that if the function is used, for example, in 2 loops where 1 can be
>> vectorized and another cannot, the vectorizable loop won't suffer
>> (i.e. suffer from being not-vectorized).
>
>
>On the other hand, if you insist on assuming a clone exists merely because a
>declaration bears an attribute, then you must address ALL of the problems with
>respect to defining a stable ABI in the face of different cpu revisions, 
>different
>ISAs, and different vector lengths.

The function mangling handles several of the version inconsistencies you have 
mentioned. If the CPU revisions, vector lengths are not the same between the 
function declaration and the function, then the name of the function will be 
different and the linker should complain.


>
>I've not seen you address ANY of these problems, despite having the problem
>pointed out multiple times.
>
>
>r~


Re: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712

2012-09-11 Thread Mike Stump
On Sep 11, 2012, at 6:12 AM, Kyrylo Tkachov  wrote:
> Fixed the format of the test options, as per Jakub's comment.

> Ok for trunk?

Ok.


Re: [Patch ARM testsuite] fix 3 tests for big-endian

2012-09-11 Thread Mike Stump
On Sep 11, 2012, at 8:06 AM, Christophe Lyon  wrote:
> Ping?
> http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00068.html

Since the arm people haven't rejected it…  Ok.


Re: [patch] Expand SJLJ exceptions as tablejump/casesi

2012-09-11 Thread Steven Bosscher
Hello,

Thanks for the quick review!

On Tue, Sep 11, 2012 at 5:03 PM, Richard Henderson wrote:
> On 09/10/2012 04:26 PM, Steven Bosscher wrote:
>> +  rtx index = force_reg (index_mode, dispatch_index);
>
> You can't modify the result of force_reg.  Use copy_to_{mode_,}reg instead.

Done.

>> +   rtx tmp = expand_simple_binop (index_mode, MINUS,
>> +  index, CONST1_RTX (index_mode),
>> +  index, 0, OPTAB_DIRECT);
>> +   gcc_assert (REG_P (tmp));
>> +   if (tmp != index)
>> + emit_move_insn (index, tmp);
>
> This pattern is force_expand_binop.

Didn't know about this one :-)

> Of course, you don't really need to force index be the same all
> the way down the chain.  You could just as well use
>
>   index = expand_simple_binop (index_mode, MINUS, index, one,
>index, 0, OPTAB_DIRECT);
>
> and use any new pseudo in the next iteration.

Right, I've made the changes to do so.

> Otherwise this looks good.

I made the following changes:

$ interdiff sjlj_tablejump.diff.20120910 sjlj_tablejump.diff
diff -u stmt.c stmt.c
--- stmt.c  (working copy)
+++ stmt.c  (working copy)
@@ -2129,19 +2129,16 @@
 This is more efficient than a dispatch table on most machines.
 The last "index--" is redundant but the code is trivially dead
 and will be cleaned up by later passes.  */
-  rtx index = force_reg (index_mode, dispatch_index);
+  rtx index = copy_to_mode_reg (index_mode, dispatch_index);
   rtx zero = CONST0_RTX (index_mode);
   for (int i = 0; i < ncases; i++)
 {
  tree elt = VEC_index (tree, dispatch_table, i);
  rtx lab = label_rtx (CASE_LABEL (elt));
  do_jump_if_equal (index_mode, index, zero, lab, 0);
- rtx tmp = expand_simple_binop (index_mode, MINUS,
-index, CONST1_RTX (index_mode),
-index, 0, OPTAB_DIRECT);
- gcc_assert (REG_P (tmp));
- if (tmp != index)
-   emit_move_insn (index, tmp);
+ force_expand_binop (index_mode, code_to_optab (MINUS),
+ index, CONST1_RTX (index_mode),
+ index, 0, OPTAB_DIRECT);
}
 }
   else

and I'm re-testing the updated patch. OK for trunk if it passes?
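
For readers following the thread, the compare-and-decrement chain that this hunk emits can be modelled in plain C++ as below. This is a sketch under assumptions: `dispatch_by_decrement` is a made-up name, returning the case number stands in for jumping to its label, and the -1 result for "no case matched" is an invention of this example, since the fall-through behaviour is not shown in the hunk.

```cpp
#include <cassert>

// Software model of the emitted sequence: test the index against zero,
// "jump" to the matching case if equal, otherwise decrement and repeat.
// Returns the selected case number, or -1 if no case matched.
int dispatch_by_decrement (int dispatch_index, int ncases)
{
  int index = dispatch_index;   // work on a copy of the incoming value
  for (int i = 0; i < ncases; i++)
    {
      if (index == 0)
        return i;               // corresponds to do_jump_if_equal
      index = index - 1;        // the final decrement is dead code
    }
  return -1;
}
```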

Ciao!
Steven


Re: [patch] Expand SJLJ exceptions as tablejump/casesi

2012-09-11 Thread Richard Henderson
On 09/11/2012 10:53 AM, Steven Bosscher wrote:
> +   force_expand_binop (index_mode, code_to_optab (MINUS),

Use sub_optab directly, rather than code_to_optab.

Otherwise ok.


r~


Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)

2012-09-11 Thread Richard Henderson
On 09/11/2012 10:14 AM, Iyer, Balaji V wrote:
> The function mangling handles several of the version inconsistencies
> you have mentioned. If the CPU revisions, vector lengths are not the
> same between the function declaration and the function, then the name
> of the function will be different and the linker should complain.

Sure.  I get that.  And that works for code within a single project.

But that means that if you build a shared library containing one of
these elemental functions, its external ABI changes depending on what
compiler flags you build it with.

Can you not understand how totally unacceptable this is?


r~






Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-09-11 Thread Xinliang David Li
Can you resend your patch in text form (also need to resolve the
latest conflicts) so that it can be commented inline?

Please also provide, as a summary, a more up-to-date description of:
1) Command line option syntax and semantics
2) New dumping APIs and semantics
3) Conversion changes

Looking at the patch briefly, I am confused by the opt-info syntax.
I thought the following is desired:

-fopt-info=pass-flags

where pass is the pass name, and flags is one of [optimized, notes,
missed].  Both pass and flags can be omitted.
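
One possible reading of that syntax, sketched as a small parser. Everything here is an assumption made for illustration: `opt_info_request` and `parse_opt_info` are hypothetical names, the split is done at the last '-' only when what follows is one of the three flag names, and none of this is taken from the patch under review.

```cpp
#include <cassert>
#include <string>

// Hypothetical result of parsing the text after "-fopt-info=".
struct opt_info_request
{
  std::string pass;   // empty means "all passes"
  std::string flags;  // empty means "default flags"
};

// True if S is one of the flag names listed in the message.
static bool known_flag_p (const std::string &s)
{
  return s == "optimized" || s == "notes" || s == "missed";
}

// Split ARG into pass and flags; either part may be absent.
opt_info_request parse_opt_info (const std::string &arg)
{
  opt_info_request r;
  if (known_flag_p (arg))
    {
      r.flags = arg;            // flags only, pass omitted
      return r;
    }
  std::string::size_type dash = arg.rfind ('-');
  if (dash != std::string::npos && known_flag_p (arg.substr (dash + 1)))
    {
      r.pass = arg.substr (0, dash);
      r.flags = arg.substr (dash + 1);
    }
  else
    r.pass = arg;               // pass only (possibly empty)
  return r;
}
```

Splitting at the last '-' rather than the first keeps pass names that themselves contain a hyphen intact in this sketch.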

Is it implemented this way in your patch?

David




On Mon, Sep 10, 2012 at 11:20 AM, Sharad Singhai  wrote:
> Ping.
>
> Thanks,
> Sharad
> Sharad
>
>
> On Wed, Sep 5, 2012 at 10:34 AM, Sharad Singhai  wrote:
>> Ping.
>>
>> Thanks,
>> Sharad
>>
>> Sharad
>>
>>
>>
>>
>> On Fri, Aug 24, 2012 at 1:06 AM, Sharad Singhai  wrote:
>>>
>>> Sorry about the delay. Please see comments inline.
>>>
>>> On Wed, Jul 4, 2012 at 6:33 AM, Richard Guenther
>>>  wrote:
>>> > On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai 
>>> > wrote:
>>> >> Apologies for the spam. Attempting to resend the patch after shrinking
>>> >> it.
>>> >>
>>> >> I have updated the attached patch to use a new dump message
>>> >> classification system for the vectorizer. It currently uses four
>>> >> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION,
>>> >> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the
>>> >> vectorizer passes and have converted each call to fprintf (dump_file,
>>> >> ) to a message classification matching in spirit. Most often, it
>>> >> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well.
>>> >>
>>> >> For example, the following
>>> >>
>>> >> if (vect_print_dump_info (REPORT_DETAILS))
>>> >>   {
>>> >> fprintf (vect_dump, "niters for prolog loop: ");
>>> >> print_generic_expr (vect_dump, iters, TDF_SLIM);
>>> >>   }
>>> >>
>>> >> gets converted to
>>> >>
>>> >> if (dump_kind (MSG_OPTIMIZED_LOCATIONS))
>>> >>   {
>>> >>  dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
>>> >>   "niters for prolog loop: ");
>>> >>  dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters);
>>> >>   }
>>> >>
>>> >> The asymmetry between the first printf and the second is due to the
>>> >> fact that 'vect_print_dump_info (xxx)' prints the location as a
>>> >> "side-effect". To preserve the original intent somewhat, I have
>>> >> converted the first call within a dump sequence to a dump_printf_loc
>>> >> (xxx) which prints the location while the subsequence calls within the
>>> >> same conditional get converted to the corresponding plain variants.
>>> >
>>> > Ok, that looks reasonable.
>>> >
>>> >> I considered removing the support for alternate dump file, but ended
>>> >> up preserving it instead since it is needed for setting the alternate
>>> >> dump file to stderr for the case when -fopt-info is given but no dump
>>> >> file is available.
>>> >>
>>> >> The following invocation
>>> >> g++ ... -ftree-vectorize -fopt-info=4
>>> >>
>>> >> dumps *all* available information to stderr. Currently, the opt-info
>>> >> level is common to all passes, i.e., a pass can't specify if wants a
>>> >> different level of diagnostic info. This can be added as an
>>> >> enhancement with a suitable syntax for selecting passes.
>>> >>
>>> >> I haven't fixed up the documentation/tests but wanted to get some
>>> >> feedback about the current state of patch before doing that.
>>> >
>>> > Some comments / questions.
>>> >
>>> > +  if (dump_file && (dump_kind & opt_info_flags))
>>> > +{
>>> > +  dump_loc (dump_kind, dump_file, loc);
>>> > +  print_generic_expr (dump_file, t, dump_flags | extra_dump_flags);
>>> > +}
>>> > +
>>> > +  if (alt_dump_file && (dump_kind & opt_info_flags))
>>> > +{
>>> >
>>> > you always test dump_kind against the same opt_info_flags variable.
>>> > I would have thought that the alternate dump file has a different
>>> > opt_info_flags setting so I can have -fdump-tree-vect-details
>>> > -fopt-info=1.  Am I missing something?
>>>
>>> It was an oversight on my part. I have since fixed this. There are two
>>> separate flags corresponding to the two types of dump files,
>>>
>>> pflags ==> pass private dump file
>>> alt_flags ==> opt-info dump file
>>>
>>> > If I do
>>> >
>>> >> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo
>>> >
>>> > what will foo contain afterwards?  I think you need to document the
>>> > behavior when such redirection is used with the compiler-driver
>>> > feature of handling multiple translation units.  Especially the
>>> > difference (or lack of difference) to
>>> >
>>> >> gcc file1.c -O3 -fdump-tree-vectorize=foo
>>> >> gcc file2.c -O3 -fdump-tree-vectorize=foo
>>>
>>> Yes, the dump file gets overwritten during each invocation. I have
>>> noted this in the documentation.
>>>
>>> > I suppose we do not want to append to foo (but eventually support that
>>> > with some alternate syntax?)
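
[Editor's note: the two-flag scheme described above (pflags for the
pass-private dump file, alt_flags for the -fopt-info stream) can be
sketched roughly as follows.  This is not the code from the patch:
struct dump_ctx, MSG_OPT, MSG_NOTE and dump_msg are hypothetical names
introduced only for illustration, under the assumption that each stream
is gated by its own mask.]

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical message kinds; the real patch uses its own MSG_* values.  */
enum { MSG_OPT = 1 << 0, MSG_NOTE = 1 << 1 };

/* Hypothetical container for the two streams and their masks.  */
struct dump_ctx
{
  FILE *dump_file;      /* pass-private dump file, may be NULL */
  int pflags;           /* kinds enabled for dump_file */
  FILE *alt_dump_file;  /* -fopt-info stream, may be NULL */
  int alt_flags;        /* kinds enabled for alt_dump_file */
};

/* Write MSG to each stream whose own mask enables KIND.
   Returns the number of streams written to.  */
int
dump_msg (const struct dump_ctx *ctx, int kind, const char *msg)
{
  int written = 0;

  assert (ctx != NULL && msg != NULL);
  if (ctx->dump_file && (kind & ctx->pflags))
    {
      fputs (msg, ctx->dump_file);
      written++;
    }
  if (ctx->alt_dump_file && (kind & ctx->alt_flags))
    {
      fputs (msg, ctx->alt_dump_file);
      written++;
    }
  return written;
}
```

[With pflags = MSG_OPT | MSG_NOTE and alt_flags = MSG_OPT, a MSG_NOTE
message reaches only the pass dump file, while a MSG_OPT message reaches
both.  That is the -fdump-tree-vect-details -fopt-info=1 combination
asked about in the review.]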

Re: [PATCH] limited C++ parsing support for gengtype

2012-09-11 Thread Diego Novillo

On 2012-08-29 20:31 , Aaron Gray wrote:


2012-08-30 Aaron Gray 

 * gengtype-lex.l: Support for FILE
 Support for C++ single line Comments
 Support for classes
 Support for enums
 ignore 'static'
 ignore 'inline'
 ignore 'public:'
 ignore 'protected:'
 ignore 'private:'
 ignore 'friend'
 support for 'operator' token
 support for 'new'
 support for 'delete'
 added support for '+' as a token for summations in enum bodies

 * gengtype.h: added 'TYPE_ENUM' to 'enum typekind'
 added enum TYPE_ENUM to 'struct type' union


Write entries like these as:

* gengtype.h (enum type_kind): Add TYPE_ENUM.
(struct type): Add TYPE_ENUM.


 added OPERATOR_KEYWORD and OPERATOR keywords to Token Code enum


Likewise.



 * gengtype-parser.c: updated 'token_names[]'
 (direct_declarator): support for parsing limited operators
 support for parsing constructors with no parameters
 support for parsing enums

 * gengtype.c: added 'type_p enums'  to maintain list of enums
 (resolve_typedef): added support for structure types and enums
 added 'new_enum()'


diff --git a/gcc/gengtype-lex.l b/gcc/gengtype-lex.l
index 5788a6a..af9696a 100644
--- a/gcc/gengtype-lex.l
+++ b/gcc/gengtype-lex.l
@@ -53,11 +53,11 @@ update_lineno (const char *l, size_t len)
  ID[[:alpha:]_][[:alnum:]_]*
  WS[[:space:]]+
  HWS   [ \t\r\v\f]*
-IWORD  short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET
+IWORD  short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET|FILE
  ITYPE {IWORD}({WS}{IWORD})*
  EOID  [^[:alnum:]_]

-%x in_struct in_struct_comment in_comment
+%x in_struct in_struct_comment in_comment in_line_comment in_line_struct_comment
  %option warn noyywrap nounput nodefault perf-report
  %option 8bit never-interactive
  %%
@@ -83,6 +83,14 @@ EOID [^[:alnum:]_]
BEGIN(in_struct);
return UNION;
  }
+^{HWS}class/{EOID} {
+  BEGIN(in_struct);
+  return STRUCT;
+}
+^{HWS}enum/{EOID} {
+  BEGIN(in_struct);
+  return ENUM;
+}
  ^{HWS}extern/{EOID} {
BEGIN(in_struct);
return EXTERN;
@@ -101,10 +109,20 @@ EOID  [^[:alnum:]_]
  \\\n  { lexer_line.line++; }

  "const"/{EOID}  /* don't care */
+"static"/{EOID}  /* don't care */
+"inline"/{EOID}  /* don't care */
+"public:"/* don't care */
+"private:"   /* don't care */
+"protected:" /* don't care */
+"operator"/{EOID}   { return OPERATOR_KEYWORD; }
+"new"/{EOID}{ *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; }
+"delete"/{EOID} { *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; }
+"friend"/{EOID}
  "GTY"/{EOID}{ return GTY_TOKEN; }
  "VEC"/{EOID}{ return VEC_TOKEN; }
  "union"/{EOID}  { return UNION; }
  "struct"/{EOID} { return STRUCT; }
+"class"/{EOID}   { return CLASS; }


Why not just return STRUCT here?


@@ -3,7 +3,7 @@

 This file is part of GCC.

-   GCC is free software; you can redistribute it and/or modify it under
+   /GCC is free software; you can redistribute it and/or modify it under


This seems out of place.



@@ -778,6 +791,7 @@ type (options_p *optsp, bool nested)
return resolve_typedef (s, &lexer_line);

  case STRUCT:
+case CLASS:


I think that as far as gengtype is concerned, 'struct' and 'class' 
should be exactly the same thing.  So, all the handling for 'CLASS' you 
added should not be needed.




+/* enum definition: type() does all the work.  */
+static void
+parse_enum (void)
+{
+  options_p dummy;
+  type (&dummy, false);
+  /* There may be junk after the type: notably, we cannot currently
+ distinguish 'struct foo *function(prototype);' from 'struct foo;'
+ ...  we could call declarator(), but it's a waste of time at
+ present.  Instead, just eat whatever token is currently lookahead
+ and go back to lexical skipping mode. */
+  advance ();
+}
+


I'm not quite sure what this is trying to do.


@@ -601,16 +602,93 @@ type_p
  resolve_typedef (const char *s, struct fileloc *pos)
  {
pair_p p;
+  type_p t;
+  type_p e;
+
for (p = typedefs; p != NULL; p = p->next)
  if (strcmp (p->name, s) == 0)
return p->type;

+  for (t = structures; t != NULL; t = t->next)
+{
+  switch ( t->kind)
+{
+  case TYPE_NONE:
+   if (do_debug)
+  fprintf(stderr, "TYPE_NONE:\n");
+break;
+  case TYPE_SCALAR:
+if (do_debug)
+  fprintf(s

Re: Change double_int calls to new interface.

2012-09-11 Thread Lawrence Crowl
On 9/11/12, Andreas Schwab  wrote:
> Mark Kettenis  writes:
>> In file included from ../../../src/gcc/gcc/mcf.c:47:0:
>> ../../../src/gcc/gcc/mcf.c: In function 'void dump_fixup_edge(FILE*,
>> fixup_graph_type*, fixup_edge_p)':
>> ../../../src/gcc/gcc/system.h:288:78: error: integer overflow in
>> expression [-Werror=overflow]
>
> This is PR54528.

The expression itself looks correct.  I have not been able to
duplicate the problem on x86.  I am now waiting for compile farm
access to an hppa system.  Does anyone have more specific
information on the condition that generates the error?

-- 
Lawrence Crowl


Re: [PATCH] limited C++ parsing support for gengtype

2012-09-11 Thread Gabriel Dos Reis
On Tue, Sep 11, 2012 at 3:41 PM, Diego Novillo  wrote:

>> @@ -778,6 +791,7 @@ type (options_p *optsp, bool nested)
>> return resolve_typedef (s, &lexer_line);
>>
>>   case STRUCT:
>> +case CLASS:
>
>
> I think that as far as gengtype is concerned, 'struct' and 'class' should be
> exactly the same thing.  So, all the handling for 'CLASS' you added should
> not be needed.


100% agreed.

-- Gaby
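
[Editor's note: a sketch of the point agreed on above.  If the lexer
hands the parser the same token for both keywords, the parser needs no
separate CLASS case.  The keyword table and keyword_token below are
hypothetical and written in plain C rather than flex; only the token
names mirror those in the patch.]

```c
#include <assert.h>
#include <string.h>

/* Token codes; names mirror the patch, values are arbitrary here.  */
enum token { TOK_NONE = 0, STRUCT, UNION, ENUM };

/* Hypothetical keyword table: 'class' shares the STRUCT token.  */
static const struct { const char *word; enum token tok; } keywords[] = {
  { "struct", STRUCT },
  { "class", STRUCT },
  { "union", UNION },
  { "enum", ENUM },
};

/* Return the token for WORD, or TOK_NONE if it is not a keyword.  */
enum token
keyword_token (const char *word)
{
  size_t i;

  assert (word != NULL);
  for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
    if (strcmp (keywords[i].word, word) == 0)
      return keywords[i].tok;
  return TOK_NONE;
}
```

[Because "class" and "struct" yield the same token, an existing
'case STRUCT:' in the parser covers both without further changes.]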


Backtrace library [1/3]

2012-09-11 Thread Ian Lance Taylor
I have finished the initial implementation of the backtrace library I
proposed at http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html .  I've
separated the work into three patches.  These patches only implement the
backtrace library itself; actual use of the library will follow in
separate patches.

This initial implementation only supports ELF and DWARF.  The library is
designed to work correctly for other cases, in the sense that it will
report that it cannot find any backtrace information.  The library is
designed to make it straightforward to add support for other object file
formats and debugging formats.  My intent is to commit the library with
ELF/DWARF support and then support other people in extending it.  In
particular, adding support for Mach-O and PE with DWARF should be
simple.

This patch is the interface to and configury of libbacktrace.  I've
separated these out as the parts of libbacktrace that require the most
review.  The interface to libbacktrace is in the file backtrace.h.  This
is what callers will use.  The file backtrace-supported.h is also
available so that programs can see whether calling the backtrace library
will work at all.

The configury is fairly standard.  Note that libbacktrace is built as
both a host library (to link into the compilers) and as a target library
(to link into libgo and possibly other libraries).

Bootstrapped on x86_64-unknown-linux-gnu in conjunction with the other
two patches.  OK for mainline?

Ian


2012-09-11  Ian Lance Taylor  

* Initial implementation.


Index: libbacktrace/README
===
--- libbacktrace/README	(revision 0)
+++ libbacktrace/README	(revision 0)
@@ -0,0 +1,23 @@
+The libbacktrace library
+Initially written by Ian Lance Taylor 
+
+The libbacktrace library may be linked into a program or library and
+used to produce symbolic backtraces.  Sample uses would be to print a
+detailed backtrace when an error occurs or to gather detailed
+profiling information.
+
+The libbacktrace library is provided under a BSD license.  See the
+source files for the exact license text.
+
+The public functions are declared and documented in the header file
+backtrace.h, which should be #include'd by a user of the library.
+
+Building libbacktrace will generate a file backtrace-supported.h,
+which a user of the library may use to determine whether backtraces
+will work.  See the source file backtrace-supported.h.in for the
+macros that it defines.
+
+As of September 2012, libbacktrace only supports ELF executables with
+DWARF debugging information.  The library is written to make it
+straightforward to add support for other object file and debugging
+formats.
Index: libbacktrace/backtrace.h
===
--- libbacktrace/backtrace.h	(revision 0)
+++ libbacktrace/backtrace.h	(revision 0)
@@ -0,0 +1,165 @@
+/* backtrace.h -- Public header file for stack backtrace library.
+   Copyright (C) 2012 Free Software Foundation, Inc.
+   Written by Ian Lance Taylor, Google.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+(1) Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer. 
+
+(2) Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.  
+
+(3) The name of the author may not be used to
+endorse or promote products derived from this software without
+specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.  */
+
+#ifndef BACKTRACE_H
+#define BACKTRACE_H
+
+#include 
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* The backtrace code needs to open the executable file in order to
+   find the debug info.  On systems that do not support
+   /proc/self/exe, the program using the backtrace library needs to
+   tell the backtrace library the name of the executable to open.  It
+   does so by calling backtrace_set_executable_name.  The FILENAME
+   argument must point to a permanent buffer.  */

Backtrace library [2/3]

2012-09-11 Thread Ian Lance Taylor
I have finished the initial implementation of the backtrace library I
proposed at http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html .  I've
separated the work into three patches.  These patches only implement the
backtrace library itself; actual use of the library will follow in
separate patches.

This patch is the changes to the top-level directories for the backtrace
library.  This is straightforward.  Note that libbacktrace is built as
both a host library (to link into the compilers) and as a target library
(to link into libgo and possibly other libraries).

Bootstrapped on x86_64-unknown-linux-gnu in conjunction with the other
two patches.  OK for mainline?

Ian


2012-09-11  Ian Lance Taylor  

* MAINTAINERS (Various Maintainers): Add libbacktrace.
* configure.ac (host_libs): Add libbacktrace.
(target_libraries): Add libbacktrace.
* Makefile.def (host_modules): Add libbacktrace.
(target_modules): Likewise.
* configure, Makefile.in: Rebuild.


Index: configure.ac
===
--- configure.ac	(revision 191171)
+++ configure.ac	(working copy)
@@ -133,7 +133,7 @@ build_tools="build-texinfo build-flex bu
 
 # these libraries are used by various programs built for the host environment
 #
-host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libcpp libdecnumber gmp mpfr mpc isl cloog libelf libiconv"
+host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libbacktrace libcpp libdecnumber gmp mpfr mpc isl cloog libelf libiconv"
 
 # these tools are built for the host environment
 # Note, the powerpc-eabi build depends on sim occurring before gdb in order to
@@ -152,6 +152,7 @@ libgcj="target-libffi \
 # the host libraries and the host tools (which may be a cross compiler)
 # Note that libiberty is not a target library.
 target_libraries="target-libgcc \
+		target-libbacktrace \
 		target-libgloss \
 		target-newlib \
 		target-libgomp \
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 191171)
+++ MAINTAINERS	(working copy)
@@ -155,6 +155,7 @@ objective-c/c++		Stan Shebs		stanshebs@e
 
 			Various Maintainers
 
+libbacktrace		Ian Lance Taylor	i...@airs.com
 libcpp			Per Bothner		p...@bothner.com
 libcpp			All C and C++ front end maintainers
 fp-bit			Ian Lance Taylor	i...@airs.com
Index: Makefile.def
===
--- Makefile.def	(revision 191171)
+++ Makefile.def	(working copy)
@@ -80,6 +80,7 @@ host_modules= { module= tcl;
 missing=mostlyclean; };
 host_modules= { module= itcl; };
 host_modules= { module= ld; bootstrap=true; };
+host_modules= { module= libbacktrace; bootstrap=true; };
 host_modules= { module= libcpp; bootstrap=true; };
 host_modules= { module= libdecnumber; bootstrap=true; };
 host_modules= { module= libgui; };
@@ -121,6 +122,7 @@ target_modules = { module= libmudflap; l
 target_modules = { module= libssp; lib_path=.libs; };
 target_modules = { module= newlib; };
 target_modules = { module= libgcc; bootstrap=true; no_check=true; };
+target_modules = { module= libbacktrace; };
 target_modules = { module= libquadmath; };
 target_modules = { module= libgfortran; };
 target_modules = { module= libobjc; };


Re: Backtrace library [1/3]

2012-09-11 Thread Gabriel Dos Reis
On Tue, Sep 11, 2012 at 5:53 PM, Ian Lance Taylor  wrote:

> This patch is the interface to and configury of libbacktrace.  I've
> separated these out as the parts of libbacktrace that require the most
> review.  The interface to libbacktrace is in the file backtrace.h.  This
> is what callers will use.  The file backtrace-supported.h is also
> available so that programs can see whether calling the backtrace library
> will work at all.

So, you've settled on a C interface?  A C++ interface would have been
native for other open source projects that are C++ oriented...

-- Gaby


Re: Backtrace library [1/3]

2012-09-11 Thread Chris Lattner

On Sep 11, 2012, at 3:53 PM, Ian Lance Taylor  wrote:

> I have finished the initial implementation of the backtrace library I
> proposed at http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html .  I've
> separated the work into three patches.  These patches only implement the
> backtrace library itself; actual use of the library will follow in
> separate patches.

Hi Ian,

I have no specific comment on the implementation of this library, but:
> 
> +/* Get a full stack backtrace.  SKIP is the number of frames to skip;
> +   passing 0 will start the trace with the function calling backtrace.
> +   DATA is passed to the callback routine.  If any call to CALLBACK
> +   returns a non-zero value, the stack backtrace stops, and backtrace
> +   returns that value; this may be used to limit the number of stack
> +   frames desired.  If all calls to CALLBACK return 0, backtrace
> +   returns 0.  The backtrace function will make at least one call to
> +   either CALLBACK or ERROR_CALLBACK.  This function requires debug
> +   info for the executable.  */
> +
> +extern int backtrace (int skip, backtrace_callback callback,
> +   backtrace_error_callback error_callback, void *data);

FYI, "backtrace" is a well-known function provided by glibc (and other
libc's).  It might be best to pick another name.

-Chris
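
[Editor's note: a sketch of the callback contract in the quoted comment.
A non-zero return from the callback stops the walk and becomes the
return value, which lets a caller cap the number of frames.  Nothing
here is libbacktrace code: walk_frames, frame_callback, limit_frames and
the array of fake PCs are hypothetical stand-ins.  The entry point is
deliberately not called backtrace, since glibc already declares
int backtrace (void **, int) in <execinfo.h>.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: return non-zero to stop the walk.  */
typedef int (*frame_callback) (void *data, uintptr_t pc);

/* Walk a pre-collected array of PCs, skipping the first SKIP entries.
   Stands in for a real unwinder so that the contract can be shown.  */
int
walk_frames (const uintptr_t *pcs, size_t count, size_t skip,
             frame_callback callback, void *data)
{
  size_t i;

  assert (callback != NULL);
  for (i = skip; i < count; i++)
    {
      int ret = callback (data, pcs[i]);
      if (ret != 0)
        return ret;  /* Stop early and propagate the value.  */
    }
  return 0;
}

/* Caller state: how many frames were seen and how many are wanted.  */
struct frame_limit
{
  size_t seen;
  size_t max;
};

/* Count frames; ask the walker to stop once MAX have been seen.  */
int
limit_frames (void *data, uintptr_t pc)
{
  struct frame_limit *lim = (struct frame_limit *) data;

  (void) pc;
  lim->seen++;
  return lim->seen >= lim->max ? 1 : 0;
}
```

[Passing a struct frame_limit with max = 2 makes walk_frames return 1
after two callbacks; with a limit larger than the stack depth it visits
every frame and returns 0.]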



Re: Backtrace library [1/3]

2012-09-11 Thread Ian Lance Taylor
On Tue, Sep 11, 2012 at 4:01 PM, Gabriel Dos Reis
 wrote:
> On Tue, Sep 11, 2012 at 5:53 PM, Ian Lance Taylor  wrote:
>
>> This patch is the interface to and configury of libbacktrace.  I've
>> separated these out as the parts of libbacktrace that require the most
>> review.  The interface to libbacktrace is in the file backtrace.h.  This
>> is what callers will use.  The file backtrace-supported.h is also
>> available so that programs can see whether calling the backtrace library
>> will work at all.
>
> So, you've settled on a C interface?  A C++ interface would have been
> native for other open source projects that are C++ oriented...

Yes, a C interface is convenient for libgo, and of course is generally
usable.  We can certainly layer a C++ interface on top if it seems
useful.

The interface is somewhat constrained in that, on systems that support
anonymous mmap, it does not call malloc.  That makes it possible to do
a symbolic backtrace from a signal handler.

Ian
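
[Editor's note: a sketch of the constraint mentioned above.  Memory
taken directly from anonymous mmap sidesteps malloc, which is not
async-signal-safe because of its internal locks and heap state.  This
is not libbacktrace's allocator: page_alloc and page_free are
hypothetical helpers, and the code assumes a POSIX-like system that
defines MAP_ANONYMOUS or MAP_ANON.]

```c
/* Ask glibc to expose MAP_ANONYMOUS even in strict ISO C modes.  */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#if !defined (MAP_ANONYMOUS) && defined (MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Return SIZE bytes of zero-filled memory from the kernel, or NULL.  */
void *
page_alloc (size_t size)
{
  void *p;

  assert (size > 0);
  p = mmap (NULL, size, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}

/* Release memory obtained from page_alloc.  Returns 0 on success.  */
int
page_free (void *p, size_t size)
{
  return munmap (p, size);
}
```

[A caller pairs page_alloc (n) with page_free (p, n).  The kernel rounds
the mapping up to whole pages, so this suits a few large arenas better
than many small objects.]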

