[PATCH] Fix PR58553

2013-10-01 Thread Richard Biener

This fixes niter analysis for memset/memcpy pattern recognition in
loop distribution.  We now have to check more carefully whether
the store dominates or is dominated by the loop exit condition.
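
To illustrate why the position of the store relative to the exit test
matters, here is a minimal C sketch.  It is not the PR58553 testcase and
does not use any GCC internals; the struct and function names are made up
for illustration.  It simply counts how often a store would execute
compared with how often the back edge (latch) is taken.

```c
#include <stddef.h>

/* Hypothetical bookkeeping: how often the store ran and how often
   the loop's back edge (latch) was taken.  */
struct loop_counts
{
  size_t stores;
  size_t latch_executions;
};

/* Exit test first, store afterwards: the store is dominated by the
   exit condition and runs once per latch execution.  */
static struct loop_counts
store_after_exit_test (size_t n)
{
  struct loop_counts c = { 0, 0 };
  size_t i = 0;
  for (;;)
    {
      if (i >= n)
        break;
      c.stores++;            /* stands in for a[i] = 0 */
      i++;
      c.latch_executions++;  /* back edge taken */
    }
  return c;
}

/* Store first, exit test afterwards: the store dominates the exit
   condition and runs once more than the latch.  */
static struct loop_counts
store_before_exit_test (size_t n)
{
  struct loop_counts c = { 0, 0 };
  size_t i = 0;
  for (;;)
    {
      c.stores++;            /* stands in for a[i] = 0 */
      i++;
      if (i >= n)
        break;
      c.latch_executions++;  /* back edge taken */
    }
  return c;
}
```

A memset replacing the first loop has to cover as many elements as there
are latch executions, while one replacing the second loop has to cover one
element more.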

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-10-01  Richard Biener  

PR tree-optimization/58553
* tree-loop-distribution.c (struct partition_s): Add niter member.
(classify_partition): Populate niter member for the partition
and properly identify whether the relevant store happens before
or after the loop exit.
(generate_memset_builtin): Use niter member from the partition.
(generate_memcpy_builtin): Likewise.

* gcc.dg/torture/pr58553.c: New testcase.

Index: gcc/tree-loop-distribution.c
===
*** gcc/tree-loop-distribution.c(revision 203032)
--- gcc/tree-loop-distribution.c(working copy)
*** typedef struct partition_s
*** 569,574 
--- 569,575 
/* data-references a kind != PKIND_NORMAL partition is about.  */
data_reference_p main_dr;
data_reference_p secondary_dr;
+   tree niter;
  } *partition_t;
  
  
*** generate_memset_builtin (struct loop *lo
*** 848,868 
  {
gimple_stmt_iterator gsi;
gimple stmt, fn_call;
!   tree nb_iter, mem, fn, nb_bytes;
location_t loc;
tree val;
  
stmt = DR_STMT (partition->main_dr);
loc = gimple_location (stmt);
-   if (gimple_bb (stmt) == loop->latch)
- nb_iter = number_of_latch_executions (loop);
-   else
- nb_iter = number_of_exit_cond_executions (loop);
  
/* The new statements will be placed before LOOP.  */
gsi = gsi_last_bb (loop_preheader_edge (loop)->src);
  
!   nb_bytes = build_size_arg_loc (loc, partition->main_dr, nb_iter);
nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE,
   false, GSI_CONTINUE_LINKING);
mem = build_addr_arg_loc (loc, partition->main_dr, nb_bytes);
--- 849,865 
  {
gimple_stmt_iterator gsi;
gimple stmt, fn_call;
!   tree mem, fn, nb_bytes;
location_t loc;
tree val;
  
stmt = DR_STMT (partition->main_dr);
loc = gimple_location (stmt);
  
/* The new statements will be placed before LOOP.  */
gsi = gsi_last_bb (loop_preheader_edge (loop)->src);
  
!   nb_bytes = build_size_arg_loc (loc, partition->main_dr, partition->niter);
nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE,
   false, GSI_CONTINUE_LINKING);
mem = build_addr_arg_loc (loc, partition->main_dr, nb_bytes);
*** generate_memcpy_builtin (struct loop *lo
*** 908,928 
  {
gimple_stmt_iterator gsi;
gimple stmt, fn_call;
!   tree nb_iter, dest, src, fn, nb_bytes;
location_t loc;
enum built_in_function kind;
  
stmt = DR_STMT (partition->main_dr);
loc = gimple_location (stmt);
-   if (gimple_bb (stmt) == loop->latch)
- nb_iter = number_of_latch_executions (loop);
-   else
- nb_iter = number_of_exit_cond_executions (loop);
  
/* The new statements will be placed before LOOP.  */
gsi = gsi_last_bb (loop_preheader_edge (loop)->src);
  
!   nb_bytes = build_size_arg_loc (loc, partition->main_dr, nb_iter);
nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE,
   false, GSI_CONTINUE_LINKING);
dest = build_addr_arg_loc (loc, partition->main_dr, nb_bytes);
--- 905,921 
  {
gimple_stmt_iterator gsi;
gimple stmt, fn_call;
!   tree dest, src, fn, nb_bytes;
location_t loc;
enum built_in_function kind;
  
stmt = DR_STMT (partition->main_dr);
loc = gimple_location (stmt);
  
/* The new statements will be placed before LOOP.  */
gsi = gsi_last_bb (loop_preheader_edge (loop)->src);
  
!   nb_bytes = build_size_arg_loc (loc, partition->main_dr, partition->niter);
nb_bytes = force_gimple_operand_gsi (&gsi, nb_bytes, true, NULL_TREE,
   false, GSI_CONTINUE_LINKING);
dest = build_addr_arg_loc (loc, partition->main_dr, nb_bytes);
*** classify_partition (loop_p loop, struct
*** 1125,1130 
--- 1118,1124 
partition->kind = PKIND_NORMAL;
partition->main_dr = NULL;
partition->secondary_dr = NULL;
+   partition->niter = NULL_TREE;
  
EXECUTE_IF_SET_IN_BITMAP (partition->stmts, 0, i, bi)
  {
*** classify_partition (loop_p loop, struct
*** 1151,1160 
|| !flag_tree_loop_distribute_patterns)
  return;
  
-   nb_iter = number_of_exit_cond_executions (loop);
-   if (!nb_iter || nb_iter == chrec_dont_know)
- return;
- 
/* Detect memset and memcpy.  */
single_load = NULL;
single_store = NULL;
--- 1145,1150 
*** classify_partition (loop_p loop, struct
*** 1193,1198 
--- 1183,1199 
}
  }
  
+   if (!si

[ARM, PR58578] Split shift di patterns

2013-10-01 Thread Kugan
Hi,

I am attaching a patch that reverts Split shift di patterns (r197527) as
it introduced PR58578. I am also attaching a patch to add a testcase
based on this failure.

No regression on qemu for arm-none-eabi and new testcase now passes.

Is this OK?

Thanks,
Kugan
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7c9a6c5..abc545f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2013-10-01  Kugan Vivekanandarajah  
+
+   PR target/58578
+   Revert
+   2013-04-05  Greta Yorsh  
+   * config/arm/arm.md (arm_ashldi3_1bit):  define_insn into
+   define_insn_and_split.
+   (arm_ashrdi3_1bit,arm_lshrdi3_1bit): Likewise.
+   (shiftsi3_compare): New pattern.
+   (rrx): New pattern.
+   * config/arm/unspecs.md (UNSPEC_RRX): New.
+
 2013-09-30  Richard Sandiford  
 
* vec.h (vec_prefix, vec): Prefix member names with "m_".
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 3b22081..4b9f991 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2013-10-01  Kugan Vivekanandarajah  
+
+   PR Target/58578
+   * gcc.target/arm/pr58578.c: New test.
+
 2013-09-30  Jakub Jelinek  
 
PR middle-end/58564
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index b094cff..e8d5464 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -3867,26 +3867,13 @@
   "
 )
 
-(define_insn_and_split "arm_ashldi3_1bit"
+(define_insn "arm_ashldi3_1bit"
   [(set (match_operand:DI0 "s_register_operand" "=r,&r")
 (ashift:DI (match_operand:DI 1 "s_register_operand" "0,r")
(const_int 1)))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT"
-  "#"   ; "movs\\t%Q0, %Q1, asl #1\;adc\\t%R0, %R1, %R1"
-  "&& reload_completed"
-  [(parallel [(set (reg:CC CC_REGNUM)
-  (compare:CC (ashift:SI (match_dup 1) (const_int 1))
-   (const_int 0)))
- (set (match_dup 0) (ashift:SI (match_dup 1) (const_int 1)))])
-   (set (match_dup 2) (plus:SI (plus:SI (match_dup 3) (match_dup 3))
-  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0]
-  {
-operands[2] = gen_highpart (SImode, operands[0]);
-operands[0] = gen_lowpart (SImode, operands[0]);
-operands[3] = gen_highpart (SImode, operands[1]);
-operands[1] = gen_lowpart (SImode, operands[1]);
-  }
+  "movs\\t%Q0, %Q1, asl #1\;adc\\t%R0, %R1, %R1"
   [(set_attr "conds" "clob")
(set_attr "length" "8")
(set_attr "type" "multiple")]
@@ -3964,43 +3951,18 @@
   "
 )
 
-(define_insn_and_split "arm_ashrdi3_1bit"
+(define_insn "arm_ashrdi3_1bit"
   [(set (match_operand:DI  0 "s_register_operand" "=r,&r")
 (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")
  (const_int 1)))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT"
-  "#"   ; "movs\\t%R0, %R1, asr #1\;mov\\t%Q0, %Q1, rrx"
-  "&& reload_completed"
-  [(parallel [(set (reg:CC CC_REGNUM)
-   (compare:CC (ashiftrt:SI (match_dup 3) (const_int 1))
-   (const_int 0)))
-  (set (match_dup 2) (ashiftrt:SI (match_dup 3) (const_int 1)))])
-   (set (match_dup 0) (unspec:SI [(match_dup 1)
-  (reg:CC_C CC_REGNUM)]
- UNSPEC_RRX))]
-  {
-operands[2] = gen_highpart (SImode, operands[0]);
-operands[0] = gen_lowpart (SImode, operands[0]);
-operands[3] = gen_highpart (SImode, operands[1]);
-operands[1] = gen_lowpart (SImode, operands[1]);
-  }
+  "movs\\t%R0, %R1, asr #1\;mov\\t%Q0, %Q1, rrx"
   [(set_attr "conds" "clob")
(set_attr "length" "8")
(set_attr "type" "multiple")]
 )
 
-(define_insn "*rrx"
-  [(set (match_operand:SI 0 "s_register_operand" "=r")
-(unspec:SI [(match_operand:SI 1 "s_register_operand" "r")
-(reg:CC_C CC_REGNUM)]
-   UNSPEC_RRX))]
-  "TARGET_32BIT"
-  "mov\\t%0, %1, rrx"
-  [(set_attr "conds" "use")
-   (set_attr "type" "mov_shift")]
-)
-
 (define_expand "ashrsi3"
   [(set (match_operand:SI  0 "s_register_operand" "")
(ashiftrt:SI (match_operand:SI 1 "s_register_operand" "")
@@ -4070,27 +4032,13 @@
   "
 )
 
-(define_insn_and_split "arm_lshrdi3_1bit"
+(define_insn "arm_lshrdi3_1bit"
   [(set (match_operand:DI  0 "s_register_operand" "=r,&r")
 (lshiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")
  (const_int 1)))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT"
-  "#"   ;  "movs\\t%R0, %R1, lsr #1\;mov\\t%Q0, %Q1, rrx"
-  "&& reload_completed"
-  [(parallel [(set (reg:CC CC_REGNUM)
-   (compare:CC (lshiftrt:SI (match_dup 3) (const_int 1))
-   (const_int 0)))
-  (set (match_dup 2) (lshiftrt:SI (match_dup 3) (const_int 1)))])
-   (set (match_dup 0) (unspec:SI [(match_dup 1)
-  (reg:CC_C CC_REGNUM)]

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-10-01 Thread Ilya Enkovich
On 26 Sep 23:12, Uros Bizjak wrote:
> On Tue, Sep 17, 2013 at 10:41 AM, Ilya Enkovich  
> wrote:
> 
> >> >> The x86 part looks mostly OK (I have a couple of comments below), but
> >> >> please first get target-independent changes reviewed and committed.
> >> >
> >> > Do you mean I should move bound type and mode declaration into a 
> >> > separate patch?
> >>
> >> Yes, target-independent part (middle end) has to go through the
> >> separate review to check if this part is OK. The target-dependent part
> >> uses the infrastructure from the middle end, so it can go into the
> >> code base only after target-independent parts are committed.
> >
> > I sent a separate patch for bound type and mode class 
> > (http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01268.html). Here is target 
> > part of the patch with fixes you mentioned. Does it look OK?
> >
> > Bootstrapped and checked on linux-x86_64. Still shows incorrect length 
> > attribute computation (described here 
> > http://gcc.gnu.org/ml/gcc/2013-07/msg00311.html).
> 
> Please look at the attached patch that solves length computation
> problem. The patch also implements length calculation in a generic
> way, as proposed earlier.
> 
> The idea is to calculate total insn length via generic "length"
> attribute calculation from "length_nobnd" attribute, but iff
> length_attribute is non-null. This way, we are able to decorate
> bnd-prefixed instructions by "length_nobnd" attribute, and generic
> part will automatically call ix86_bnd_prefixed_insn_p predicate with
> current insn pattern. I also believe that this approach is most
> flexible to decorate future patterns.
> 
> The patch adds new attribute to a couple of patterns to illustrate its usage.
> 
> Please test this approach. Modulo length calculations, improved by the
> patch in this message, I have no further comments, but please repost
> complete (target part) of your patch.
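
The length rule described in the quoted message can be sketched in C
roughly as follows.  This is only an illustration under stated
assumptions: the struct, its fields, the function name and the one-byte
prefix size are invented here and are not taken from the patch.  Only the
idea that the total length is derived from length_nobnd plus the result
of a bnd-prefix predicate, and only when length_nobnd is set, comes from
the message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-insn attributes.  */
struct insn_attrs_sketch
{
  int length_nobnd;   /* 0 means "attribute not specified" */
  bool bnd_prefixed;  /* what a predicate such as
                         ix86_bnd_prefixed_insn_p would report */
};

/* Total length: use length_nobnd plus the prefix when length_nobnd is
   given, otherwise fall back to the ordinary length computation.
   Assumption: the bnd prefix occupies one byte.  */
static int
total_length_sketch (const struct insn_attrs_sketch *insn,
                     int ordinary_length)
{
  if (insn->length_nobnd == 0)
    return ordinary_length;
  return insn->length_nobnd + (insn->bnd_prefixed ? 1 : 0);
}
```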

Hi Uros,

Thanks for your reply! I applied the approach you proposed for the length
attribute. It works well. Make check is clean now.

I also adjusted the bound registers for the recently added mask registers.
Attached is a new patch.

Thanks,
Ilya

--

2013-09-30  Ilya Enkovich  

* config/i386/constraints.md (B): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c (isa_opts): Add mmpx.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(print_reg): Likewise.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(ix86_output_call_insn): Likewise.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(TARGET_MPX): New.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Likewise.
(prefix_0f): Likewise.
(memory): Likewise.
(prefix_rep): Check for bnd prefix.
(length_nobnd): New.
(length): Use length_nobnd if specified.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add bnd prefix and rename length attr to length_nobnd.
(*jcc_2): Likewise.
(jump): Likewise.
(simple_return_internal): Likewise.
(simple_return_pop_internal): Likewise.
(*indirect_jump): Add MPX b

Re: [patch] Add tree-ssa-coalesce.h

2013-10-01 Thread Richard Biener
On Mon, Sep 30, 2013 at 5:48 PM, Andrew MacLeod  wrote:
> Move the prototype for coalesce_ssa_name() out of tree-ssa-live.h and put it
> in a new tree-ssa-coalesce.h file.
> Include tree-ssa-coalesce.h from  tree-outof-ssa.h as it forms part of the
> out-of-ssa module.
>
> Also move gimple_can_coalesce_p from tree-ssa-coalesce.c to gimple.h as it
> operates on gimple structures and is also used a couple of other places.
> The prototype is already in gimple.h.
>
> Bootstraps on build/x86_64-unknown-linux-gnu with no new regressions.   OK?

Ok.

Thanks,
Richard.

> Andrew
>


Re: RFA: Use "m_foo" rather than "foo_" for member variables

2013-10-01 Thread Richard Biener
On Mon, Sep 30, 2013 at 10:39 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Sun, Sep 29, 2013 at 11:08 AM, Richard Sandiford
>>  wrote:
>>> Michael Matz  writes:
>>>> Trever Saunders  writes:
>>>> > Richard Biener  writes:
>>>> > > Btw, I've come around multiple coding-styles in the past and I
>>>> > > definitely would prefer m_mode / m_count to mark members vs. mode_ and
>>>> > > count_. (and s_XXX for static members IIRC).
>>>> >
>>>> > I'd prefer m_/s_foo for members / static things too fwiw.
>>>>
>>>> Me as well.  It's still ugly, but not so unsymmetric as the trailing
>>>> underscore.
>>>
>>> Well, I'm not sure how I came to be the one writing these patches,
>>> but I suppose I prefer m_foo too.  So how about the attached?
>>>
>>> The first patch has changes to the coding conventions.  I added
>>> some missing spaces while there.
>>>
>>> The second patch has the mechanical code changes.  The reason for
>>> yesterday's mass adding of spaces was because the second patch would
>>> have been pretty inconsistent otherwise.
>>>
>>> Tested on x86_64-linux-gnu.
>>
>> Ok.
>
> Applied, thanks.  I was only looking for private and protected members,
> so I ended up missing vec.  I installed the patch below as an obvious
> follow-up.  Tested on x86_64-linux-gnu.
>
> There are some other uses foo_, but many of them seem to date from before
> the C++ switchover.

My usual foo_ use is for stuff like

int qsort_fn (const void *ptr1_, const void *ptr2_)
{
  const T *ptr1 = (const T *)ptr1_;
...

to not need to invent another name for the void * typed parameter and
keep the name association obvious.
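
A complete version of that idiom might look like the following sketch.
The element type (int) and the function names are chosen here for
illustration only; the point is just that the trailing underscore marks
the untyped parameter and the plain name is the typed view of it.

```c
#include <stdlib.h>

/* Comparator in the style described above: ptr1_/ptr2_ are the void *
   parameters, ptr1/ptr2 the typed pointers derived from them.  */
static int
int_qsort_fn (const void *ptr1_, const void *ptr2_)
{
  const int *ptr1 = (const int *) ptr1_;
  const int *ptr2 = (const int *) ptr2_;
  /* Avoid subtraction so that large values cannot overflow.  */
  return (*ptr1 > *ptr2) - (*ptr1 < *ptr2);
}

/* Sort N ints in ascending order.  */
static void
sort_ints (int *v, size_t n)
{
  qsort (v, n, sizeof *v, int_qsort_fn);
}
```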

But grepping for '_ '  indeed shows a lot more, but mostly in macros.

Richard.

> Richard
>
>
> gcc/
> * vec.h (vec_prefix, vec): Prefix member names with "m_".
> * vec.c (vec_prefix::calculate_allocation): Update accordingly.
>
> Index: gcc/vec.c
> ===
> --- gcc/vec.c   2013-09-27 09:16:58.010299213 +0100
> +++ gcc/vec.c   2013-09-30 18:09:02.892316820 +0100
> @@ -183,8 +183,8 @@ vec_prefix::calculate_allocation (vec_pr
>
>if (pfx)
>  {
> -  alloc = pfx->alloc_;
> -  num = pfx->num_;
> +  alloc = pfx->m_alloc;
> +  num = pfx->m_num;
>  }
>else if (!reserve)
>  /* If there's no vector, and we've not requested anything, then we
> Index: gcc/vec.h
> ===
> --- gcc/vec.h   2013-09-30 18:06:22.236575959 +0100
> +++ gcc/vec.h   2013-09-30 18:06:22.305576705 +0100
> @@ -235,8 +235,8 @@ struct vec_prefix
>friend struct va_heap;
>friend struct va_stack;
>
> -  unsigned alloc_;
> -  unsigned num_;
> +  unsigned m_alloc;
> +  unsigned m_num;
>  };
>
>  template struct vec;
> @@ -285,7 +285,7 @@ va_heap::reserve (vec   MEM_STAT_DECL)
>  {
>unsigned alloc
> -= vec_prefix::calculate_allocation (v ? &v->vecpfx_ : 0, reserve, exact);
> += vec_prefix::calculate_allocation (v ? &v->m_vecpfx : 0, reserve, 
> exact);
>if (!alloc)
>  {
>release (v);
> @@ -293,7 +293,7 @@ va_heap::reserve (vec  }
>
>if (GATHER_STATISTICS && v)
> -v->vecpfx_.release_overhead ();
> +v->m_vecpfx.release_overhead ();
>
>size_t size = vec::embedded_size (alloc);
>unsigned nelem = v ? v->length () : 0;
> @@ -301,7 +301,7 @@ va_heap::reserve (vecv->embedded_init (alloc, nelem);
>
>if (GATHER_STATISTICS)
> -v->vecpfx_.register_overhead (size FINAL_PASS_MEM_STAT);
> +v->m_vecpfx.register_overhead (size FINAL_PASS_MEM_STAT);
>  }
>
>
> @@ -315,7 +315,7 @@ va_heap::release (vec  return;
>
>if (GATHER_STATISTICS)
> -v->vecpfx_.release_overhead ();
> +v->m_vecpfx.release_overhead ();
>::free (v);
>v = NULL;
>  }
> @@ -364,7 +364,7 @@ va_gc::reserve (vec *&v,
> MEM_STAT_DECL)
>  {
>unsigned alloc
> -= vec_prefix::calculate_allocation (v ? &v->vecpfx_ : 0, reserve, exact);
> += vec_prefix::calculate_allocation (v ? &v->m_vecpfx : 0, reserve, 
> exact);
>if (!alloc)
>  {
>::ggc_free (v);
> @@ -433,9 +433,9 @@ void unregister_stack_vec (unsigned);
>  va_stack::alloc (vec &v, unsigned nelems,
>  vec *space)
>  {
> -  v.vec_ = space;
> -  register_stack_vec (static_cast (v.vec_));
> -  v.vec_->embedded_init (nelems, 0);
> +  v.m_vec = space;
> +  register_stack_vec (static_cast (v.m_vec));
> +  v.m_vec->embedded_init (nelems, 0);
>  }
>
>
> @@ -462,16 +462,16 @@ va_stack::reserve (vec  }
>
>/* Move VEC_ to the heap.  */
> -  nelems += v->vecpfx_.num_;
> +  nelems += v->m_vecpfx.m_num;
>vec *oldvec = v;
>v = NULL;
>va_heap::reserve (reinterpret_cast *&>(v), 
> nelems,
> exact PASS_MEM_STAT);
>if (v && oldvec)
>  {
> -  v->vecpfx_.num_ = oldvec->length ();
> -  memcpy (v->vecdata_,
> - oldvec->vecdata_,
> +  v->m_vecpfx.m_num = oldvec->length ();
> +   

Re: cost model patch

2013-10-01 Thread Richard Biener
On Mon, Sep 30, 2013 at 5:26 PM, Xinliang David Li  wrote:
> Yes, that will do.  Can you do it for me? I can't  do testing easily
> on arm myself.

It also fails on x86_64 with -m32.  I always test on x86_64 with
multilibs enabled:

make -k -j12 check RUNTESTFLAGS="--target_board=unix/\{,-m32\}"

Richard.

> thanks,
>
> David
>
>
>
>
> On Mon, Sep 30, 2013 at 3:29 AM, Kyrill Tkachov  
> wrote:
>> Hi Richard, David,
>>
>>> In principle yes.  Note that it changes the behavior of -O2
>>> -ftree-vectorize
>>> as -ftree-vectorize does not imply changing the default cost model.  I am
>>> fine with that, but eventually this will have some testsuite fallout.
>>
>> Indeed I am observing a regression with this patch on arm-none-eabi in
>> gcc.dg/tree-ssa/gen-vect-26.c.
>>
>> Seems that the cheap vectoriser model doesn't do unaligned stores (as
>> expected I think?). Is adding -fvect-cost-model=dynamic to the test options
>> the correct approach?
>>
>>
>> Thanks,
>> Kyrill
>>
>>


Re: [Ping] [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Yufeng Zhang

Ping~

Thanks,
Yufeng

On 09/25/13 12:37, Yufeng Zhang wrote:

Hello,

Please find the updated version of the patch in the attachment.  It has
addressed the previous comments and also included some changes in order
to pass the bootstrapping on x86_64.

It's also passed the regtest on arm-none-eabi and aarch64-none-elf.

It will also fix the test failure as reported here:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html

OK for the trunk?

Thanks,
Yufeng


gcc/

  * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New
function.
  (backtrace_base_for_ref): Call get_unwidened, check 'base_in'
  again and set unwidend_p with true; call safe_to_multiply_p to
avoid
  unsafe unwidened cases.

gcc/testsuite/

  * gcc.dg/tree-ssa/slsr-40.c: New test.



On 09/11/13 13:39, Bill Schmidt wrote:

On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote:

On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang   wrote:

Hi,

Following Bin's patch in
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch tweaks
backtrace_base_for_ref () to strip of any widening conversion after the
first TREE_CODE check fails.  Without this patch, the test
(gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as
backtrace_base_for_ref () will stop if not seeing an ssa_name since the tree
code can be nop_expr instead.

Regtested on arm and aarch64; still bootstrapping x86_64.

OK for the trunk if the x86_64 bootstrap succeeds?


Please add a testcase.


Also, the comment "Strip of" should read "Strip off".  Otherwise I have
no comments.

Thanks,
Bill



Richard.





Re: [wwwdocs] Mention ubsan in 4.9 changes.html

2013-10-01 Thread Marek Polacek
Ping.

On Thu, Sep 19, 2013 at 05:12:34PM +0200, Marek Polacek wrote:
> Maybe it'd be worth noting in changes.html that GCC now has the
> ubsan...
> 
> Ok to apply?
> 
> --- www/htdocs/gcc-4.9/changes.html.mp 2013-09-19 16:54:32.113724993 +0200
> +++ www/htdocs/gcc-4.9/changes.html   2013-09-19 17:07:05.418030738 +0200
> @@ -38,6 +38,14 @@
>  AddressSanitizer, a fast memory error detector, is now available on 
> ARM.
>  
>
> +  
> +UndefinedBehaviorSanitizer, a fast undefined behavior detector,
> +has been added and can be enabled via 
> -fsanitize=undefined.
> +  Various computations will be instrumented to detect undefined behavior
> +  at runtime.  UndefinedBehaviorSanitizer is currently available for C
> +  and C++ languages.
> +
> +  
>  
>  New Languages and Language specific improvements

Marek


Re: [PATCH][AARCH64]Replace gen_rtx_PLUS with plus_constant

2013-10-01 Thread Kyrill Tkachov

On 30/09/13 14:20, Renlin Li wrote:

Hello all,

Sorry for my last patch, which caused some test regressions. I have corrected
it, and it has been tested for aarch64-none-elf on the model.

This patch will replace all explicit calls to gen_rtx_PLUS and GEN_INT
with plus_constant.

OK for trunk?

Kind regards,
Renlin Li

gcc/ChangeLog:

2013-09-30  Renlin Li 

  * config/aarch64/aarch64.c (aarch64_expand_prologue): Use
plus_constant.
  (aarch64_expand_epilogue): Likewise.

Looks ok to me, but I can't approve it. CC'ing the maintainers...

Kyrill



[PING] 3 patches waiting for approval/review

2013-10-01 Thread Andreas Krebbel
[RFC] Allow functions calling mcount before prologue to be leaf functions
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html

[PATCH] PR57377: Fix mnemonic attribute
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01364.html

[PATCH] Doc: Add documentation for the mnemonic attribute
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01436.html

Bye,

-Andreas-



Re: [PATCH][ARM]Replace gen_rtx_PLUS with plus_constant

2013-10-01 Thread Kyrill Tkachov

On 30/09/13 14:23, Renlin Li wrote:

Hello all,

Sorry for my last patch, which caused some test regressions. I have corrected
it, and it has been tested for arm-none-eabi on the model.

This patch will replace all explicit calls to gen_rtx_PLUS and GEN_INT
with plus_constant.

OK for trunk?

Kind regards,
Renlin Li

gcc/ChangeLog:

2013-09-30  Renlin Li  

  * config/arm/arm.c (arm_output_mi_thunk): Use plus_constant.

Looks ok to me but I can't approve it. CC'ing maintainers...

Kyrill



Re: RFA: GCC Testsuite: Annotate compile tests that need at least 32-bit pointers/integers

2013-10-01 Thread nick clifton

Hi Mike,


It may be reasonable to special case ptr32plus to say no on your platform, from 
check_effective_target_tls_native, we see code like:



you could do something like:

proc check_effective_target_ptr32plus { } {
 # msp430 never really has 32 or more bits in a pointer.
 if { [istarget msp430-*-*] } {
 return 0
 }
 return [check_no_compiler_messages ptr32plus object {
 int dummy[sizeof (void *) >= 4 ? 1 : -1];
 }]
}

Then, you don't have to worry about people adding tests with this predicate and 
those test cases failing.  I don't have a good handle on whether this is better 
or not, so I'll let you decide what you think is best.
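
The C snippet that the predicate feeds to the compiler relies on a
negative array size being a compile-time error.  A standalone sketch of
the same trick is shown below; the typedef and function names are made up
for illustration and are not part of the testsuite.

```c
#include <stddef.h>

/* Compiles only when pointers are at least 4 bytes wide; otherwise the
   array size is -1 and the compiler rejects the declaration.  */
typedef int ptr32plus_check_sketch[sizeof (void *) >= 4 ? 1 : -1];

/* Runtime view of the same quantity, for comparison.  */
static size_t
pointer_size_in_bytes (void)
{
  return sizeof (void *);
}
```

On a target where the typedef is rejected, check_no_compiler_messages
would see a diagnostic and the predicate would report 0.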


Thanks - that is a good idea.  (I am embarrassed that I did not think of 
it myself).  I have checked the patch in with this change added.


Cheers
  Nick




Re: [ARM, PR58578] Split shift di patterns

2013-10-01 Thread Ramana Radhakrishnan

On 10/01/13 08:42, Kugan wrote:

Hi,

I am attaching a patch that reverts Split shift di patterns (r197527) as
it introduced PR58578. I am also attaching a patch to add a testcase
based on this failure.

No regression on qemu for arm-none-eabi and new testcase now passes.

Is this OK?

Thanks,
Kugan




Ok if no regressions.

Thanks,
Ramana



Re: cost model patch

2013-10-01 Thread Kyrill Tkachov

On 01/10/13 09:28, Richard Biener wrote:

On Mon, Sep 30, 2013 at 5:26 PM, Xinliang David Li  wrote:

Yes, that will do.  Can you do it for me? I can't  do testing easily
on arm myself.

It also fails on x86_64 with -m32.  I always test on x86_64 with
multilibs enabled:

make -k -j12 check RUNTESTFLAGS="--target_board=unix/\{,-m32\}"


Appears on aarch64-none-elf as well...
This patch makes the tests pass for me.

I notice there's PR58556 that talks about these failures, so shall I link this 
patch to this PR?


Ok to apply?

Kyrill

P.S. Since we've changed the default cost model for the vectoriser, perhaps we 
should consider reorganising the vectoriser tests taking into consideration what 
tests apply to which cost model?


[gcc/testsuite/]
2013-10-01  Kyrylo Tkachov  

PR tree-optimization/58556
* gcc.dg/tree-ssa/gen-vect-26.c: Use dynamic vector cost model.
* gcc.dg/tree-ssa/gen-vect-28.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c
index f14bf83..dadeb07 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-26.c
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fvect-cost-model=dynamic" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fvect-cost-model=dynamic -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c
index d90520e..f314b28 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fvect-cost-model=dynamic" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -fvect-cost-model=dynamic -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
 
 #include 
 

Re: expand_expr tweaks to fix PR57134

2013-10-01 Thread Alan Modra
I'm committing this cleanup patch to my PR 57134,57586 changes as
obvious.  That it is obvious can be seen from an assert in
tree-ssa-operands.c get_asm_expr_operands().

  /* This should have been split in gimplify_asm_expr.  */
  gcc_assert (!allows_reg || !is_inout);

Bootstrapped, etc. powerpc64-linux.
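
Why the simplification is safe can be checked by brute force.  The sketch
below uses local stand-ins (the enum and function names are invented here
and are not GCC's own) and compares the old and new modifier choice for
every flag combination that the assert permits.

```c
#include <stdbool.h>

/* Local stand-ins for the expand modifiers named in the patch.  */
enum modifier_sketch { SK_NORMAL, SK_WRITE, SK_MEMORY };

/* Choice made before the cleanup.  */
static enum modifier_sketch
old_choice (bool allows_reg, bool is_inout)
{
  return !allows_reg ? SK_MEMORY : !is_inout ? SK_WRITE : SK_NORMAL;
}

/* Choice made after the cleanup.  */
static enum modifier_sketch
new_choice (bool allows_reg)
{
  return !allows_reg ? SK_MEMORY : SK_WRITE;
}

/* Count disagreements over all inputs with !allows_reg || !is_inout.  */
static int
count_mismatches (void)
{
  int mismatches = 0;
  for (int allows_reg = 0; allows_reg <= 1; allows_reg++)
    for (int is_inout = 0; is_inout <= 1; is_inout++)
      {
        if (allows_reg && is_inout)
          continue;  /* ruled out by the assert */
        if (old_choice (allows_reg, is_inout) != new_choice (allows_reg))
          mismatches++;
      }
  return mismatches;
}
```

The only input on which the two versions differ is allows_reg together
with is_inout, which is exactly the case the assert excludes.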

* stmt.c (expand_asm_operands): Revert part of 2013-09-24 special
casing inout operands.

Index: gcc/stmt.c
===
--- gcc/stmt.c  (revision 203053)
+++ gcc/stmt.c  (working copy)
@@ -807,9 +807,7 @@ expand_asm_operands (tree string, tree outputs, tr
  || is_inout)
{
  op = expand_expr (val, NULL_RTX, VOIDmode,
-   !allows_reg ? EXPAND_MEMORY
-   : !is_inout ? EXPAND_WRITE
-   : EXPAND_NORMAL);
+   !allows_reg ? EXPAND_MEMORY : EXPAND_WRITE);
  if (MEM_P (op))
op = validize_mem (op);
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH][ARM]Replace gen_rtx_PLUS with plus_constant

2013-10-01 Thread Marcus Shawcroft
On 30 September 2013 14:23, Renlin Li  wrote:

> OK for trunk?
>
> Kind regards,
> Renlin Li
>
> gcc/ChangeLog:
>
> 2013-09-30  Renlin Li  
>
> * config/arm/arm.c (arm_output_mi_thunk): Use plus_constant.

OK
/Marcus


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Richard Biener
On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  wrote:
> Hello,
>
> Please find the updated version of the patch in the attachment.  It has
> addressed the previous comments and also included some changes in order to
> pass the bootstrapping on x86_64.
>
> It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
>
> It will also fix the test failure as reported here:
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
>
> OK for the trunk?

+   where n is a 32-bit unsigned int and pointer are 64-bit long.  In this
+   case, the gimple for (n - 1) is:
+
+ _2 = n_1(D) + 4294967295; // 0xFFFFFFFF
+
+   and it is wrong to multiply the large constant by 4 in the 64-bit space.  */
+
+static bool
+safe_to_multiply_p (tree type, double_int cst)
+{
+  if (TYPE_UNSIGNED (type)
+  && ! double_int_fits_to_tree_p (signed_type_for (type), cst))
+return false;
+
+  return true;
+}

This looks wrong.  The only relevant check is whether the
multiplication overflows the original type as you miss the implicit
truncation that happens.  Which is something you don't know
unless you know the value.  It definitely isn't a property of a type
and a constant but the property of two constants and a type.
Or the predicate has a wrong name.
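
The truncation issue can be seen with plain fixed-width arithmetic.  The
sketch below is independent of the patch and of GCC's internal
representation; the function names are invented for illustration.  It
compares scaling (n - 1) by 4 after the 32-bit wraparound with scaling
the unsigned constant 4294967295 by 4 in 64 bits.

```c
#include <stdint.h>

/* (n - 1) computed as n + 0xFFFFFFFF in 32 bits, truncated there,
   then widened and scaled by the element size.  */
static uint64_t
offset_with_truncation (uint32_t n)
{
  uint32_t n_minus_1 = (uint32_t) (n + UINT32_C (4294967295));
  return (uint64_t) n_minus_1 * 4;
}

/* The same expression with the constant multiplied by 4 in the 64-bit
   space, i.e. without the implicit 32-bit truncation.  */
static uint64_t
offset_with_widened_constant (uint32_t n)
{
  return (uint64_t) n * 4 + UINT64_C (4294967295) * 4;
}
```

For n = 5 the first function yields 16 while the second yields
17179869200, so whether the transformation is valid depends on the values
involved and on where the truncation happens, not on the type and
constant alone.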

The use of get_unwidened in this core routine looks like this is
all happening in the wrong place and we should have picked up
another candidate for this instead?  I'm sure Bill will know more here.

Richard.



> Thanks,
> Yufeng
>
>
> gcc/
>
> * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New
> function.
> (backtrace_base_for_ref): Call get_unwidened, check 'base_in'
> again and set unwidend_p with true; call safe_to_multiply_p to avoid
> unsafe unwidened cases.
>
> gcc/testsuite/
>
> * gcc.dg/tree-ssa/slsr-40.c: New test.
>
>
>
>
> On 09/11/13 13:39, Bill Schmidt wrote:
>>
>> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote:
>>>
>>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang
>>> wrote:

>>>> Hi,
>>>>
>>>> Following Bin's patch in
>>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch
>>>> tweaks
>>>> backtrace_base_for_ref () to strip of any widening conversion after the
>>>> first TREE_CODE check fails.  Without this patch, the test
>>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as
>>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the
>>>> tree
>>>> code can be nop_expr instead.
>>>>
>>>> Regtested on arm and aarch64; still bootstrapping x86_64.
>>>>
>>>> OK for the trunk if the x86_64 bootstrap succeeds?
>>>
>>>
>>> Please add a testcase.
>>
>>
>> Also, the comment "Strip of" should read "Strip off".  Otherwise I have
>> no comments.
>>
>> Thanks,
>> Bill
>>
>>>
>>> Richard.


Re: [patch] Separate immediate uses and phi routines from tree-flow*.h

2013-10-01 Thread Richard Biener
On Fri, Sep 27, 2013 at 8:36 PM, Andrew MacLeod  wrote:
> On 09/27/2013 04:44 AM, Richard Biener wrote:
>>
>> On Thu, Sep 26, 2013 at 6:07 PM, Andrew MacLeod 
>> wrote:
>>
>>>
>>> Ugg. I incorporated what we talked about, and it was much messier than
>>> expected :-P.  I ended up with a chicken and egg problem between the
>>> gimple_v{use,def}_op routines in gimple-ssa.h  and the operand routines
>>> in
>>> tree-ssa-operands.h.   They both require each other, and I couldn't get
>>> things into a consistent state while they are in separate files.  It was
>>> actually the immediate use iterators which were requiring
>>> gimple_vuse_op()...  So I have created a new ssa-iterators.h file  to
>>> resolve this problem.  They build on the operand code and clearly have
>>> other
>>> prerequisites, so that seems reasonable to me...
>>>
>>> This in fact solves a couple of other little warts. It allows me to put
>>> both
>>> gimple_phi_arg_imm_use_ptr() and phi_arg_index_from_use() into
>>> tree-phinodes.h.
>>>
>>> It also exposes that gimple.c::walk_stmt_load_store_addr_ops() and
>>> friends
>>> actually depend on the existence of PHI nodes, meaning it really belongs
>>> on
>>> the gimple-ssa border as well. So I moved those into gimple-ssa.c
>>
>> It doesn't depend on PHI nodes but it also works for PHI nodes.  So
>> I'd rather have it in gimple.c.
>
> OK, well, the code depends on using PHI node routines to compile then :-)
>
> So, poking around and thinking about it a bit more, how does this work for you?
>
> I moved the phi_ routines which act as simple accessors to the PHI statement
> kind from tree-flow-inline.h into gimple.h.  I think this makes sense, they
> access elements of the statement.  The phi routines which actually do
> something other than that I put into tree-phinodes.[ch].   So this puts no
> SSA prerequisites into gimple.h.
>
> The part of the walk_* routines which was causing issues was the dependency
> on the PHI_ARG_DEF macro, which was using the immediate use operands,
>
> #define PHI_ARG_DEF_PTR(PHI, I) gimple_phi_arg_imm_use_ptr ((PHI), (I))
> #define PHI_ARG_DEF(PHI, I) USE_FROM_PTR (PHI_ARG_DEF_PTR ((PHI), (I)))
>
> which as I reflected upon it, this really makes very little sense...  I
> changed this to
>
> #define PHI_ARG_DEF(PHI, I) gimple_phi_arg_def ((PHI), (I))
>
> And that resolved the problem nicely...  another bit of leftover stupidity I
> guess.  I changed the code to call the routine directly instead of using
> the macro so it won't be dependent on tree-ssa-operands.h.  I'll eventually
> revisit tree-ssa-operands.h and try to fix up those macros.  The names are
> confusing with the gimple versions for some of these things.  The
> PHI_ARG_DEF_PTR macro there is not compatible with gimple_phi_arg_def since
> it returns a use_operand_p.
>
> With those changes this is the patch.  Bootstraps on
> x86_64-unknown-linux-gnu and running regressions, assuming there are no
> issues.. OK?

Yes.

Thanks,
Richard.

> Andrew
>
>
>
>


Re: [PATCH]Fix computation of offset in ivopt

2013-10-01 Thread Richard Biener
On Mon, Sep 30, 2013 at 7:39 AM, bin.cheng  wrote:
>
>
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: Friday, September 27, 2013 4:30 PM
>> To: Bin Cheng
>> Cc: GCC Patches
>> Subject: Re: [PATCH]Fix computation of offset in ivopt
>>
>> On Fri, Sep 27, 2013 at 7:07 AM, bin.cheng  wrote:
>> >
>> >
>> >   case INTEGER_CST:
>> >   //...
>> >   *offset = int_cst_value (expr);
>> > change to
>> >   case INTEGER_CST:
>> >   //...
>> >   *offset = sext_hwi (int_cst_value (expr), type);
>> >
>> > and
>> >   case MULT_EXPR:
>> >   //...
>> >   *offset = sext_hwi (int_cst_value (expr), type); to
>> >   case MULT_EXPR:
>> >   //...
>> >  HOST_WIDE_INT xxx = (HOST_WIDE_INT)off0 * int_cst_value (op1);
>> >   *offset = sext_hwi (xxx, type);
>> >
>> > Any comments?
>>
>> The issue is of course that we end up converting offsets to sizetype at
> some
>> point which makes them all appear unsigned.  The fix for this is to simply
>> interpret them as signed ... but it's really a mess ;)
>>
>
> Hi, this is the updated patch, which calculates a signed offset in
> strip_offset_1 and then sign-extends it in strip_offset.
>
> Bootstrapped and tested on x86_64/x86/arm. Is it OK?

I don't think you need

+  /* Sign extend off if expr is in type which has lower precision
+ than HOST_WIDE_INT.  */
+  if (TYPE_PRECISION (TREE_TYPE (expr)) <= HOST_BITS_PER_WIDE_INT)
+off = sext_hwi (off, TYPE_PRECISION (TREE_TYPE (expr)));

at least it would be suspicious if you did ...

The only case that I can think of points to a bug in strip_offset_1
again, namely if sizetype (the type of all offsets) is smaller than
a HOST_WIDE_INT in which case

+boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
+*offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;

is wrong as boffset / BITS_PER_UNIT does not do a signed division
then (for negative boffset which AFAIK does not happen - but it would
be technically allowed).  Thus, the predicates like

+&& cst_and_fits_in_hwi (tmp)

would need to be amended with a check that the MSB is not set.
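
To make the signedness issue concrete, here is a small standalone sketch
(not the ivopts code; sext64 is a made-up helper written in the spirit of
sext_hwi, and a 32-bit offset type held in a 64-bit host integer is assumed
purely for illustration).  A negative bit offset that arrives zero-extended
divides to a large positive byte offset unless it is sign-extended first:

```c
#include <stdint.h>

/* Sign-extend the low PREC bits of VAL to 64 bits.  PREC must be in
   the range 1..64.  Hypothetical helper, for illustration only.  */
static int64_t
sext64 (uint64_t val, unsigned prec)
{
  if (prec >= 64)
    return (int64_t) val;

  uint64_t sign = UINT64_C (1) << (prec - 1);
  uint64_t low = val & ((UINT64_C (1) << prec) - 1);

  /* Flip the sign bit and subtract it again: values with the sign bit
     set come out negative, the others are unchanged.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* Bit offset to byte offset, treating the 32-bit value as unsigned.  */
static int64_t
byte_offset_unsigned (uint64_t boffset)
{
  return (int64_t) (boffset / 8);
}

/* Bit offset to byte offset, interpreting the 32-bit value as signed.  */
static int64_t
byte_offset_signed (uint64_t boffset)
{
  return sext64 (boffset, 32) / 8;
}
```

With the 32-bit pattern 0xfffffff8 (that is, -8 bits) the unsigned variant
yields 536870911 bytes, while the signed variant yields -1.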

Btw, the cst_and_fits_in_hwi implementation is odd:

bool
cst_and_fits_in_hwi (const_tree x)
{
  if (TREE_CODE (x) != INTEGER_CST)
return false;

  if (TYPE_PRECISION (TREE_TYPE (x)) > HOST_BITS_PER_WIDE_INT)
return false;

  return (TREE_INT_CST_HIGH (x) == 0
  || TREE_INT_CST_HIGH (x) == -1);
}

the precision check seems totally pointless and I wonder what's
the point of this routine as there is host_integerp () already
and tree_low_cst instead of int_cst_value - oh, I see, the latter
forcefully sign-extends  that should make the extension
not necessary.

Btw, int_cst_value sounds like a very bad name for a value-changing
function.

Richard.


> Thanks.
> bin
>
> 2013-09-30  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (strip_offset_1): Change parameter type.
> Count DECL_FIELD_BIT_OFFSET when computing offset for COMPONENT_REF.
> (strip_offset): Sign extend before return.


Re: cost model patch

2013-10-01 Thread Richard Biener
On Tue, Oct 1, 2013 at 11:33 AM, Kyrill Tkachov  wrote:
> On 01/10/13 09:28, Richard Biener wrote:
>>
>> On Mon, Sep 30, 2013 at 5:26 PM, Xinliang David Li 
>> wrote:
>>>
>>> Yes, that will do.  Can you do it for me? I can't  do testing easily
>>> on arm myself.
>>
>> It also fails on x86_64 with -m32.  I always test on x86_64 with
>> multilibs enabled:
>>
>> make -k -j12 check RUNTESTFLAGS="--target_board=unix/\{,-m32\}"
>
>
> Appears on aarch64-none-elf as well...
> This patch makes the tests pass for me.
>
> I notice there's PR58556 that talks about these failures, so shall I link
> this patch to this PR?
>
> Ok to apply?

Ok.

Thanks,
Richard.

> Kyrill
>
> P.S. Since we've changed the default cost model for the vectoriser, perhaps
> we should consider reorganising the vectoriser tests taking into
> consideration what tests apply to which cost model?
>
> [gcc/testsuite/]
> 2013-10-01  Kyrylo Tkachov  
>
> PR tree-optimization/58556
> * gcc.dg/tree-ssa/gen-vect-26.c: Use dynamic vector cost model.
> * gcc.dg/tree-ssa/gen-vect-28.c: Likewise.


Re: [PATCH][AARCH64]Replace gen_rtx_PLUS with plus_constant

2013-10-01 Thread Marcus Shawcroft
On 30 September 2013 14:20, Renlin Li  wrote:

> gcc/ChangeLog:
>
> 2013-09-30  Renlin Li 
>
> * config/aarch64/aarch64.c (aarch64_expand_prologue): Use plus_constant.
> (aarch64_expand_epilogue): Likewise.

OK
/Marcus


[PATCH v2] Fix libgfortran cross compile configury w.r.t newlib

2013-10-01 Thread Marcus Shawcroft

On 30/09/13 13:40, Marcus Shawcroft wrote:


Well, I thought this patch would work for me, but it does not.  It looks
like gcc_no_link is set to 'no' on my target because, technically, I can
link even if I don't use a linker script.  I just can't find any
functions.




In which case gating on gcc_no_link could be replaced with a test that
looks to see if we can link with the library.  Perhaps looking for
exit() or some such that might reasonably be expected to be present.

For example:

AC_CHECK_FUNC(exit)
if test "x${with_newlib}" = "xyes" -a "x${ac_cv_func_exit}" = "xno"; then

/Marcus







Patch attached.

/Marcus

2013-10-01  Marcus Shawcroft  

* configure.ac (AC_CHECK_FUNCS_ONCE): Add for exit() then make
existing AC_CHECK_FUNCS_ONCE dependent on outcome.
diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index 4609eba..ac0c02f 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -261,7 +261,8 @@ GCC_HEADER_STDINT(gstdint.h)
 AC_CHECK_MEMBERS([struct stat.st_blksize, struct stat.st_blocks, struct 
stat.st_rdev])
 
 # Check for library functions.
-if test "x${with_newlib}" = "xyes"; then
+AC_CHECK_FUNC(exit)
+if test "x${with_newlib}" = "xyes" -a "x${ac_cv_func_exit}" = "xno"; then
# We are being configured with a cross compiler.  AC_REPLACE_FUNCS
# may not work correctly, because the compiler may not be able to
# link executables.

Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Diego Novillo
On Sat, Sep 28, 2013 at 9:54 AM, Joern Rennecke
 wrote:
> The main part of the port (everything but the testsuite) is still waiting
> for review:
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00323.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00324.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00325.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00328.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01870.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02070.html
>
> I've retested a i686-pc-linux-gnu native bootstrap as well as the obvious
> arc-elf32 / arc-linux-uclibc builds in trunk r202981.

I have been reviewing these patches (I've gone through 2), and so far
I find nothing surprising in them.  I should be able to finish them
today or tomorrow.  Joern, I assume that you'll be one of the
maintainers for the port?  Anyone else?

SC folks, could you appoint Joern (and any other volunteer that Joern
suggests) as maintainers?

Thanks.  Diego.


Re: [PATCH] Trivial cleanup

2013-10-01 Thread Michael Matz
Hi,

On Mon, 30 Sep 2013, Jeff Law wrote:

> >   - the compiler better do an awesome job of sharing stack  space for
> > user variables in a function... I wouldn't want to blow up the stack
> > with a bazillion unrelated temps each with their own location.
> If the objects have the same type and disjoint lifetimes, they can be easily
> shared.
> 
> Things are more difficult if the types are different

Not anymore.  We adjust the alias machinery when merging differently typed 
variables into the same stack slot, see update_alias_info_with_stack_vars.

> -- IIRC, the root 
> of the problem is the optimizers can interchange a load of one type with 
> a later store of the other -- the aliasing code says "hey, they're 
> different types, so they don't alias, feel free to move them around as 
> desired" and all hell breaks loose.

That was the problem once, yes.  Meanwhile we should have fairly decent 
stack slot reuse, especially with variables declared in different scopes 
(since the end-of-scope CLOBBERs), for non-SSA_NAME temps that is.  For 
the others it's the register allocator anyway that has to do a decent job 
(and it does).
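
A minimal example of the kind of code this is about (an illustrative
sketch; the function name is made up, and whether the two arrays really end
up in the same stack slot depends on the compiler version, target and
optimization level):

```c
/* Two arrays of different types, each confined to its own block scope,
   so their lifetimes are disjoint and the slot is a candidate for
   reuse.  */
static int
sum_in_disjoint_scopes (void)
{
  int total = 0;

  {
    int a[4] = { 1, 2, 3, 4 };
    for (int i = 0; i < 4; i++)
      total += a[i];
  } /* The lifetime of a ends here.  */

  {
    double b[4] = { 0.5, 1.5, 2.5, 3.5 };
    double s = 0.0;
    for (int i = 0; i < 4; i++)
      s += b[i];
    total += (int) s;
  } /* The lifetime of b ends here.  */

  return total;
}
```

One way to check what a given compiler does with it is to compare the
frame size it reports (for example with -fstack-usage) against the sum of
the two array sizes.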


Ciao,
Michael.


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Bill Schmidt
On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote:
> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  wrote:
> > Hello,
> >
> > Please find the updated version of the patch in the attachment.  It has
> > addressed the previous comments and also included some changes in order to
> > pass the bootstrapping on x86_64.
> >
> > It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
> >
> > It will also fix the test failure as reported here:
> > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
> >
> > OK for the trunk?
> 
> +   where n is a 32-bit unsigned int and pointer are 64-bit long.  In this
> +   case, the gimple for (n - 1) is:
> +
> + _2 = n_1(D) + 4294967295; // 0x
> +
> +   and it is wrong to multiply the large constant by 4 in the 64-bit space.  
> */
> +
> +static bool
> +safe_to_multiply_p (tree type, double_int cst)
> +{
> +  if (TYPE_UNSIGNED (type)
> +  && ! double_int_fits_to_tree_p (signed_type_for (type), cst))
> +return false;
> +
> +  return true;
> +}
> 
> This looks wrong.  The only relevant check is as whether the
> multiplication overflows the original type as you miss the implicit
> truncation that happens.  Which is something you don't know
> unless you know the value.  It definitely isn't a property of a type
> and a constant but the property of two constants and a type.
> Or the predicate has a wrong name.
> 
> The use of get_unwidened in this core routine looks like this is
> all happening in the wrong place and we should have picked up
> another candidate for this instead?  I'm sure Bill will know more here.

I'm not happy with how this patch is progressing.  Without having looked
too deeply, this might be better handled earlier when determining which
casts are safe to use in building candidates.  What you have here seems
more like closing the barn door after the horse got out.  Maybe that's
the only solution, but it doesn't seem likely.

Another problem is that your test case isn't testing anything except
that the compiler doesn't crash.  That isn't sufficient as a regression
test.

I'll spend some time looking at this to see if I can find a better
approach.  It might be a day or two before I can get to it.  In addition
to the included test case, are there any other cases you've found that I
should be concerned with?

Thanks,
Bill

> 
> Richard.
> 
> 
> 
> > Thanks,
> > Yufeng
> >
> >
> > gcc/
> >
> > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New
> > function.
> > (backtrace_base_for_ref): Call get_unwidened, check 'base_in'
> > again and set unwidend_p with true; call safe_to_multiply_p to avoid
> > unsafe unwidened cases.
> >
> > gcc/testsuite/
> >
> > * gcc.dg/tree-ssa/slsr-40.c: New test.
> >
> >
> >
> >
> > On 09/11/13 13:39, Bill Schmidt wrote:
> >>
> >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote:
> >>>
> >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang
> >>> wrote:
> 
> >>>> Hi,
> >>>>
> >>>> Following Bin's patch in
> >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch
> >>>> tweaks
> >>>> backtrace_base_for_ref () to strip of any widening conversion after the
> >>>> first TREE_CODE check fails.  Without this patch, the test
> >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as
> >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the
> >>>> tree
> >>>> code can be nop_expr instead.
> >>>>
> >>>> Regtested on arm and aarch64; still bootstrapping x86_64.
> >>>>
> >>>> OK for the trunk if the x86_64 bootstrap succeeds?
> >>>
> >>>
> >>> Please add a testcase.
> >>
> >>
> >> Also, the comment "Strip of" should read "Strip off".  Otherwise I have
> >> no comments.
> >>
> >> Thanks,
> >> Bill
> >>
> >>>
> >>> Richard.
> 



Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Chung-Ju Wu
2013/10/1 Diego Novillo :
> On Sat, Sep 28, 2013 at 9:54 AM, Joern Rennecke
>  wrote:
>> The main part of the port (everything but the testsuite) is still waiting
>> for review:
>
> I have been reviewing these patches (I've gone through 2), and so far
> I find nothing surprising in them.  I should be able to finish them
> today or tomorrow.  Joern, I assume that you'll be one of the
> maintainers for the port?  Anyone else?
>
> SC folks, could you appoint Joern (and any other volunteer that Joern
> suggests) as maintainers?
>
> Thanks.  Diego.

It seems that Joern has been appointed as port maintainer earlier:
  http://gcc.gnu.org/ml/gcc/2013-01/msg00094.html

And he also added himself in MAINTAINERS file already. :)


Best regards,
jasonwucj


[Committed] S/390: Fix PR 58574

2013-10-01 Thread Andreas Krebbel
Hi,

this fixes a bug in the literal pool splitting code in the s390 back
end.  Jakub debugged the problem and provided a fix.

I've tested the patch on s390 and s390x with the default options as
well as -march=z10/-mtune=zEC12.

No regressions.

Committed to mainline.

Jakub tested the 4.8 version and will commit it soon.

Bye,

-Andreas-


2013-10-01  Jakub Jelinek  
Andreas Krebbel  

PR target/58574
* config/s390/s390.c (s390_split_branches): Modify check for table
jump insns.
(s390_chunkify_start): Rearrange table jump insn check in order to
deal with compare and branch insns correctly.

2013-10-01  Jakub Jelinek  

PR target/58574
* gcc.c-torture/execute/pr58574.c: New testcase.

---
 gcc/config/s390/s390.c|   51 +-!!!
 gcc/testsuite/gcc.c-torture/execute/pr58574.c |  219 ++
 2 files changed, 230 insertions(+), 14 deletions(-), 26 modifications(!)

Index: gcc/config/s390/s390.c
===
*** gcc/config/s390/s390.c.orig
--- gcc/config/s390/s390.c
*** s390_split_branches (void)
*** 6025,6035 
  
for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
  {
!   if (! JUMP_P (insn))
continue;
  
pat = PATTERN (insn);
!   if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 2)
pat = XVECEXP (pat, 0, 0);
if (GET_CODE (pat) != SET || SET_DEST (pat) != pc_rtx)
continue;
--- 6025,6035 
  
for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
  {
!   if (! JUMP_P (insn) || tablejump_p (insn, NULL, NULL))
continue;
  
pat = PATTERN (insn);
!   if (GET_CODE (pat) == PARALLEL)
pat = XVECEXP (pat, 0, 0);
if (GET_CODE (pat) != SET || SET_DEST (pat) != pc_rtx)
continue;
*** s390_chunkify_start (void)
*** 7049,7054 
--- 7049,7056 
  
for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
  {
+   rtx table;
+ 
/* Labels marked with LABEL_PRESERVE_P can be target
 of non-local jumps, so we have to mark them.
 The same holds for named labels.
*** s390_chunkify_start (void)
*** 7063,7104 
  if (! vec_insn || ! JUMP_TABLE_DATA_P (vec_insn))
bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (insn));
}
  
!   /* If we have a direct jump (conditional or unconditional)
!or a casesi jump, check all potential targets.  */
else if (JUMP_P (insn))
{
!   rtx pat = PATTERN (insn);
!   rtx table;
  
! if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 2)
pat = XVECEXP (pat, 0, 0);
  
!   if (GET_CODE (pat) == SET)
! {
  rtx label = JUMP_LABEL (insn);
  if (label)
{
! if (s390_find_pool (pool_list, label)
  != s390_find_pool (pool_list, insn))
bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
}
- }
-  else if (tablejump_p (insn, NULL, &table))
-{
-  rtx vec_pat = PATTERN (table);
-  int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC;
- 
-  for (i = 0; i < XVECLEN (vec_pat, diff_p); i++)
-{
-  rtx label = XEXP (XVECEXP (vec_pat, diff_p, i), 0);
- 
-  if (s390_find_pool (pool_list, label)
-  != s390_find_pool (pool_list, insn))
-bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
-   }
}
! }
  }
  
/* Insert base register reload insns before every pool.  */
--- 7065,7105 
  if (! vec_insn || ! JUMP_TABLE_DATA_P (vec_insn))
bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (insn));
}
+   /* Check potential targets in a table jump (casesi_jump).  */
+   else if (tablejump_p (insn, NULL, &table))
+   {
+ rtx vec_pat = PATTERN (table);
+ int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC;
+ 
+ for (i = 0; i < XVECLEN (vec_pat, diff_p); i++)
+   {
+ rtx label = XEXP (XVECEXP (vec_pat, diff_p, i), 0);
  
! if (s390_find_pool (pool_list, label)
! != s390_find_pool (pool_list, insn))
!   bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
!   }
!   }
!   /* If we have a direct jump (conditional or unconditional),
!check all potential targets.  */
else if (JUMP_P (insn))
{
! rtx pat = PATTERN (insn);
  
! if (GET_CODE (pat) == PARALLEL)
pat = XVECEXP (pat, 0, 0);
  
! if (GET_CODE (pat) == SET)
!   {
  rtx label = JUMP_LABEL (insn);
  if (label)
{
! if (s390_find_pool (pool_list, label

Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Bill Schmidt


On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote:
> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote:
> > On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  wrote:
> > > Hello,
> > >
> > > Please find the updated version of the patch in the attachment.  It has
> > > addressed the previous comments and also included some changes in order to
> > > pass the bootstrapping on x86_64.
> > >
> > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
> > >
> > > It will also fix the test failure as reported here:
> > > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
> > >
> > > OK for the trunk?
> > 
> > +   where n is a 32-bit unsigned int and pointer are 64-bit long.  In this
> > +   case, the gimple for (n - 1) is:
> > +
> > + _2 = n_1(D) + 4294967295; // 0x
> > +
> > +   and it is wrong to multiply the large constant by 4 in the 64-bit 
> > space.  */
> > +
> > +static bool
> > +safe_to_multiply_p (tree type, double_int cst)
> > +{
> > +  if (TYPE_UNSIGNED (type)
> > +  && ! double_int_fits_to_tree_p (signed_type_for (type), cst))
> > +return false;
> > +
> > +  return true;
> > +}
> > 
> > This looks wrong.  The only relevant check is as whether the
> > multiplication overflows the original type as you miss the implicit
> > truncation that happens.  Which is something you don't know
> > unless you know the value.  It definitely isn't a property of a type
> > and a constant but the property of two constants and a type.
> > Or the predicate has a wrong name.
> > 
> > The use of get_unwidened in this core routine looks like this is
> > all happening in the wrong place and we should have picked up
> > another candidate for this instead?  I'm sure Bill will know more here.
> 
> I'm not happy with how this patch is progressing.  Without having looked
> too deeply, this might be better handled earlier when determining which
> casts are safe to use in building candidates.  What you have here seems
> more like closing the barn door after the horse got out.  Maybe that's
> the only solution, but it doesn't seem likely.
> 
> Another problem is that your test case isn't testing anything except
> that the compiler doesn't crash.  That isn't sufficient as a regression
> test.

Sorry, that was a pre-coffee comment.  I would like also to see a test
that verifies the expected gimple, though, not just that the test runs.

> 
> I'll spend some time looking at this to see if I can find a better
> approach.  It might be a day or two before I can get to it.  In addition
> to the included test case, are there any other cases you've found that I
> should be concerned with?
> 
> Thanks,
> Bill
> 
> > 
> > Richard.
> > 
> > 
> > 
> > > Thanks,
> > > Yufeng
> > >
> > >
> > > gcc/
> > >
> > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New
> > > function.
> > > (backtrace_base_for_ref): Call get_unwidened, check 'base_in'
> > > again and set unwidend_p with true; call safe_to_multiply_p to 
> > > avoid
> > > unsafe unwidened cases.
> > >
> > > gcc/testsuite/
> > >
> > > * gcc.dg/tree-ssa/slsr-40.c: New test.
> > >
> > >
> > >
> > >
> > > On 09/11/13 13:39, Bill Schmidt wrote:
> > >>
> > >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote:
> > >>>
> > >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang
> > >>> wrote:
> > 
> > >>>> Hi,
> > >>>>
> > >>>> Following Bin's patch in
> > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch
> > >>>> tweaks
> > >>>> backtrace_base_for_ref () to strip of any widening conversion after the
> > >>>> first TREE_CODE check fails.  Without this patch, the test
> > >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as
> > >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the
> > >>>> tree
> > >>>> code can be nop_expr instead.
> > >>>>
> > >>>> Regtested on arm and aarch64; still bootstrapping x86_64.
> > >>>>
> > >>>> OK for the trunk if the x86_64 bootstrap succeeds?
> > >>>
> > >>>
> > >>> Please add a testcase.
> > >>
> > >>
> > >> Also, the comment "Strip of" should read "Strip off".  Otherwise I have
> > >> no comments.
> > >>
> > >> Thanks,
> > >> Bill
> > >>
> > >>>
> > >>> Richard.
> > 



Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Joern Rennecke

Quoting Diego Novillo :


I have been reviewing these patches (I've gone through 2), and so far
I find nothing surprising in them.  I should be able to finish them
today or tomorrow.  Joern, I assume that you'll be one of the
maintainers for the port?  Anyone else?


Yes.  Claudiu Zissulescu at Synopsys would in principle be available as
co-maintainer, but I suppose it is customary to apply for write-after-
approval status first.


SC folks, could you appoint Joern (and any other volunteer that Joern
suggests) as maintainers?


I've already been appointed as maintainer back in January.


[PATCH] Attach jump thread path to edge->aux in tree-ssa-threadupdate.c

2013-10-01 Thread Jeff Law


The old code in tree-ssa-threadupdate.c used a 3-edge form to describe a 
jump threading path: the incoming edge E, plus two additional edges 
attached to E->aux.


The general form of the FSA opt really needs a full path to keep the SSA 
graph updating code (particularly the PHI node updates).  We're already 
passing the full path into register_jump_thread, but it just extracted 
the 3-edge form from the full path.


This patch preserves the full path for the duration of the updating code 
by attaching the path to e->aux and dropping the old 3-edge form completely.
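
For readers unfamiliar with the aux idiom, the idea can be sketched in
plain C as follows.  This is an illustration only: the real code uses a
vec of jump_thread_edge pointers, and every type and function name below
is made up for the example.

```c
#include <stdlib.h>

enum step_type { START_JUMP_THREAD, COPY_SRC_BLOCK, NO_COPY_SRC_BLOCK };

/* Simplified CFG edge with a scratch pointer, like edge->aux.  */
struct edge { int src, dest; void *aux; };

/* One element of a threading path: an edge plus what to do with it.  */
struct path_step { struct edge *e; enum step_type type; };

/* A growable array of steps, standing in for the path vector.  */
struct thread_path { struct path_step *steps; size_t len, cap; };

static struct thread_path *
path_new (void)
{
  return calloc (1, sizeof (struct thread_path));
}

/* Append a step; returns 0 on success, -1 if out of memory.  */
static int
path_push (struct thread_path *p, struct edge *e, enum step_type type)
{
  if (p->len == p->cap)
    {
      size_t ncap = p->cap ? p->cap * 2 : 4;
      struct path_step *n = realloc (p->steps, ncap * sizeof *n);
      if (!n)
        return -1;
      p->steps = n;
      p->cap = ncap;
    }
  p->steps[p->len].e = e;
  p->steps[p->len].type = type;
  p->len++;
  return 0;
}

/* Attach the whole path to its incoming edge, i.e. the first step.  */
static void
path_register (struct thread_path *p)
{
  if (p->len > 0)
    p->steps[0].e->aux = p;
}

/* Final outgoing edge of the path stored on INCOMING, or NULL.  */
static struct edge *
path_final_edge (const struct edge *incoming)
{
  const struct thread_path *p = incoming->aux;
  return (p && p->len > 0) ? p->steps[p->len - 1].e : NULL;
}

/* Free the path stored on INCOMING, e.g. when a thread is cancelled.  */
static void
path_release (struct edge *incoming)
{
  struct thread_path *p = incoming->aux;
  if (p)
    {
      free (p->steps);
      free (p);
      incoming->aux = NULL;
    }
}
```

The point of the design is that any consumer holding the incoming edge can
walk every step of the path, rather than only the two follow-on edges that
the 3-edge form exposed.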


Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed on trunk.



Jeff

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 32a4b1f..2b35553 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,39 @@
+2013-10-01  Jeff Law  
+
+   * tree-ssa-threadedge.c (thread_across_edge): Make path a pointer to
+   a vec.  Only delete the path if we create one without successfully
+   registering a jump thread.
+   * tree-ssa-threadupdate.h (register_jump_thread): Pass in path vector
+   as a pointer.
+   * tree-ssa-threadupdate.c (threaded_edges): Remove.  No longer used
+   (paths): New vector of jump threading paths.
+   (THREAD_TARGET, THREAD_TARGET2): Remove accessor macros.
+   (THREAD_PATH): New accessor macro for the entire thread path.
+   (lookup_redirection_data): Get intermediate and final outgoing edge
+   from the thread path.
+   (create_edge_and_update_destination_phis): Copy the threading path.
+   (ssa_fix_duplicate_block_edges): Get edges and block types from the
+   jump threading path.
+   (ssa_redirect_edges): Get edges and block types from the jump threading
+   path.  Free the path vector.
+   (thread_block): Get edges from the jump threading path.  Look at the
+   entire path to see if we thread to a loop exit.  If we cancel a jump
+   thread request, then free the path vector.
+   (thread_single_edge): Get edges and block types from the jump threading
+   path.  Free the path vector.
+   (thread_through_loop_header): Get edges and block types from the jump
+   threading path.  Free the path vector.
+   (mark_threaded_blocks): Iterate over the vector of paths and store
+   the path on the appropriate edge.  Get edges and block types from the
+   jump threading path.
+   (mark_threaded_blocks): Get edges and block types from the jump
+   threading path.  Free the path vector.
+   (thread_through_all_blocks): Use the vector of paths rather than
+   a vector of 3-edge sets.
+   (register_jump_thread): Accept pointer to a path vector rather
+   than the path vector itself.  Store the path vector for later use.
+   Simplify.
+
 2013-10-01  Kugan Vivekanandarajah  
 
PR target/58578
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index cf62785..39e921b 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -929,13 +929,13 @@ thread_across_edge (gimple dummy_cond,
  if (dest == NULL || dest == e->dest)
goto fail;
 
- vec path = vNULL;
+ vec *path = new vec ();
   jump_thread_edge *x
= new jump_thread_edge (e, EDGE_START_JUMP_THREAD);
- path.safe_push (x);
+ path->safe_push (x);
 
  x = new jump_thread_edge (taken_edge, EDGE_COPY_SRC_BLOCK);
- path.safe_push (x);
+ path->safe_push (x);
 
  /* See if we can thread through DEST as well, this helps capture
 secondary effects of threading without having to re-run DOM or
@@ -953,17 +953,14 @@ thread_across_edge (gimple dummy_cond,
  handle_dominating_asserts,
  simplify,
  visited,
- &path);
+ path);
  BITMAP_FREE (visited);
}
 
  remove_temporary_equivalences (stack);
- propagate_threaded_block_debug_into (path.last ()->e->dest,
+ propagate_threaded_block_debug_into (path->last ()->e->dest,
   e->dest);
  register_jump_thread (path);
- for (unsigned int i = 0; i < path.length (); i++)
-   delete path[i];
- path.release ();
  return;
}
 }
@@ -992,37 +989,39 @@ thread_across_edge (gimple dummy_cond,
bitmap_clear (visited);
bitmap_set_bit (visited, taken_edge->dest->index);
bitmap_set_bit (visited, e->dest->index);
-vec path = vNULL;
+vec *path = new vec ();
 
/* Record whether or not we were able to thread through a successor
   of E->dest.  */
 jump_thread_edge *x = new jump_thread_edge (e, EDGE_START_JUMP_THREAD);
-   path.safe_push (x);
+   path->safe_push (x)

Re: [PATCH] Fix PR58554

2013-10-01 Thread Bernhard Reutner-Fischer

On 30 September 2013 14:19:01 Richard Biener  wrote:


This fixes PR58554, pattern recognition in loop distribution now
needs to check whether all stmts are unconditionally executed.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-09-30  Richard Biener  

PR tree-optimization/58554
* tree-loop-distribution.c (classify_partition): Require unconditionally
executed stores for memcpy and memset recognition.
(tree_loop_distribution): Calculate dominance info.

* gcc.dg/torture/pr58554.c: New testcase.

Index: gcc/tree-loop-distribution.c



*** out:
*** 1719,1724 
--- 1723,1729 
{
  if (!cd)
{
+ calculate_dominance_info (CDI_DOMINATORS);
  calculate_dominance_info (CDI_POST_DOMINATORS);
  cd = new control_dependences (create_edge_list ());
  free_dominance_info (CDI_POST_DOMINATORS);


Don't you have to free CDI_DOMINATORS too now, somewhere?

Thanks,





Re: [Patch,AArch64] Support SADDL/SSUBL/UADDL/USUBL

2013-10-01 Thread Marcus Shawcroft
> 2013-09-30  Vidya Praveen  
>
> * aarch64-simd.md
> (aarch64_l2_internal): Rename to 
> ...
> (aarch64_l_hi_internal): ... this;
> Insert '\t' to output template.
> (aarch64_l_lo_internal): New.
> (aarch64_saddl2, aarch64_uaddl2): Modify to call
> gen_aarch64_l_hi_internal() 
> instead.
> (aarch64_ssubl2, aarch64_usubl2): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> 2013-09-30  Vidya Praveen  
>
> * gcc.target/aarch64/vect_saddl_1.c: New.

OK /Marcus


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Bill Schmidt


On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote:
> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote:
> > On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  wrote:
> > > Hello,
> > >
> > > Please find the updated version of the patch in the attachment.  It has
> > > addressed the previous comments and also included some changes in order to
> > > pass the bootstrapping on x86_64.
> > >
> > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
> > >
> > > It will also fix the test failure as reported here:
> > > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
> > >
> > > OK for the trunk?
> > 
> > +   where n is a 32-bit unsigned int and pointer are 64-bit long.  In this
> > +   case, the gimple for (n - 1) is:
> > +
> > + _2 = n_1(D) + 4294967295; // 0x
> > +
> > +   and it is wrong to multiply the large constant by 4 in the 64-bit 
> > space.  */
> > +
> > +static bool
> > +safe_to_multiply_p (tree type, double_int cst)
> > +{
> > +  if (TYPE_UNSIGNED (type)
> > +  && ! double_int_fits_to_tree_p (signed_type_for (type), cst))
> > +return false;
> > +
> > +  return true;
> > +}
> > 
> > This looks wrong.  The only relevant check is as whether the
> > multiplication overflows the original type as you miss the implicit
> > truncation that happens.  Which is something you don't know
> > unless you know the value.  It definitely isn't a property of a type
> > and a constant but the property of two constants and a type.
> > Or the predicate has a wrong name.
> > 
> > The use of get_unwidened in this core routine looks like this is
> > all happening in the wrong place and we should have picked up
> > another candidate for this instead?  I'm sure Bill will know more here.
> 
> I'm not happy with how this patch is progressing.  Without having looked
> too deeply, this might be better handled earlier when determining which
> casts are safe to use in building candidates.  What you have here seems
> more like closing the barn door after the horse got out.  Maybe that's
> the only solution, but it doesn't seem likely.
> 
> Another problem is that your test case isn't testing anything except
> that the compiler doesn't crash.  That isn't sufficient as a regression
> test.
> 
> I'll spend some time looking at this to see if I can find a better
> approach.  It might be a day or two before I can get to it.  In addition
> to the included test case, are there any other cases you've found that I
> should be concerned with?

To help me investigate this without having to build a cross compiler,
could you please compile your test case (without the patch applied)
using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the
generated dump files?

Thanks,
Bill

> 
> Thanks,
> Bill
> 
> > 
> > Richard.
> > 
> > 
> > 
> > > Thanks,
> > > Yufeng
> > >
> > >
> > > gcc/
> > >
> > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New
> > > function.
> > > (backtrace_base_for_ref): Call get_unwidened, check 'base_in'
> > > again and set unwidend_p with true; call safe_to_multiply_p to 
> > > avoid
> > > unsafe unwidened cases.
> > >
> > > gcc/testsuite/
> > >
> > > * gcc.dg/tree-ssa/slsr-40.c: New test.
> > >
> > >
> > >
> > >
> > > On 09/11/13 13:39, Bill Schmidt wrote:
> > >>
> > >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote:
> > >>>
> > >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang
> > >>> wrote:
> > > >>>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> Following Bin's patch in
> > > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch
> > > >>>> tweaks
> > > >>>> backtrace_base_for_ref () to strip of any widening conversion after the
> > > >>>> first TREE_CODE check fails.  Without this patch, the test
> > > >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as
> > > >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the
> > > >>>> tree
> > > >>>> code can be nop_expr instead.
> > > >>>>
> > > >>>> Regtested on arm and aarch64; still bootstrapping x86_64.
> > > >>>>
> > > >>>> OK for the trunk if the x86_64 bootstrap succeeds?
> > >>>
> > >>>
> > >>> Please add a testcase.
> > >>
> > >>
> > >> Also, the comment "Strip of" should read "Strip off".  Otherwise I have
> > >> no comments.
> > >>
> > >> Thanks,
> > >> Bill
> > >>
> > >>>
> > >>> Richard.
> > 



[gomp4] Support more than 1024 CPUs (PR libgomp/57298)

2013-10-01 Thread Jakub Jelinek
Hi!

This is just preparation for the OMP_PLACES work; I figured that before
changing the affinity stuff it might be better to fix this PR.
As gomp_init_num_threads is always called before gomp_init_affinity, there
is no point calling the same pthread_getaffinity_np twice, and I'll need
the initial affinity masks for OMP_PLACES anyway, so the patch just
remembers it.  The CPU_*_S and CPU_ALLOC_SIZE macros were apparently
introduced only in glibc 2.7, so the patch attempts to deal even with older
glibcs, by just using gomp_cpuset_size = 128 in that case (1024 bits).
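
As an illustration of the dynamically sized affinity masks described above, here is a minimal standalone sketch.  It is not code from the patch: count_usable_cpus is a hypothetical name, it uses sched_getaffinity rather than pthread_getaffinity_np, and the retry loop on EINVAL (with its arbitrary upper bound) is an addition of this sketch for kernels whose mask is larger than the configured CPU count suggests.  It assumes glibc 2.7 or later for the CPU_ALLOC family of macros.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of libgomp: count the CPUs the calling
   thread may run on, using a dynamically sized affinity mask so that
   machines with more than 1024 CPUs are handled.  Returns -1 on error.  */
static int
count_usable_cpus (void)
{
  long ncpus = sysconf (_SC_NPROCESSORS_CONF);
  if (ncpus <= 0)
    return -1;

  /* Upper bound for the retry loop; an arbitrary choice for this sketch.  */
  const long max_cpus = 1L << 20;

  for (; ncpus <= max_cpus; ncpus *= 2)
    {
      size_t size = CPU_ALLOC_SIZE (ncpus);
      cpu_set_t *set = CPU_ALLOC (ncpus);
      if (set == NULL)
        return -1;

      if (sched_getaffinity (0, size, set) == 0)
        {
          int count = CPU_COUNT_S (size, set);
          CPU_FREE (set);
          return count;
        }

      /* Save errno before freeing, since free may clobber it.  */
      int err = errno;
      CPU_FREE (set);
      if (err != EINVAL)
        return -1;
      /* EINVAL: the kernel's mask needs more room than SIZE; retry.  */
    }
  return -1;
}
```

Calling count_usable_cpus () once at startup and caching the result would mirror the idea of remembering the initial mask instead of querying it twice.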

2013-10-01  Jakub Jelinek  

PR libgomp/57298
* config/linux/proc.c (gomp_cpuset_size, gomp_cpusetp): New variables.
(gomp_cpuset_popcount): Use CPU_COUNT_S if available, or CPU_COUNT if
gomp_cpuset_size is sizeof (cpu_set_t).  Use gomp_cpuset_size instead
of sizeof (cpu_set_t) to determine number of iterations.
(gomp_init_num_threads): Initialize gomp_cpuset_size and gomp_cpusetp
here, use gomp_cpusetp instead of &cpuset and pass gomp_cpuset_size
instead of sizeof (cpu_set_t) to pthread_getaffinity_np.
(get_num_procs): Don't call pthread_getaffinity_np if gomp_cpusetp
is NULL.  Use gomp_cpusetp instead of &cpuset and pass gomp_cpuset_size
instead of sizeof (cpu_set_t) to pthread_getaffinity_np.
* config/linux/proc.h (gomp_cpuset_popcount): Add attribute_hidden.
(gomp_cpuset_size, gomp_cpusetp): Declare.
* config/linux/affinity.c (CPU_ISSET_S, CPU_ZERO_S, CPU_SET_S): Define
if CPU_ALLOC_SIZE isn't defined.
(gomp_init_affinity): Don't call pthread_getaffinity_np here, instead
use gomp_cpusetp computed by gomp_init_num_threads.  Use CPU_*_S
variants of macros with gomp_cpuset_size as set size, for cpusetnew
use alloca for it if CPU_ALLOC_SIZE is defined, otherwise local
fixed size variable.
(gomp_init_thread_affinity): Use CPU_*_S variants of macros with
gomp_cpuset_size as set size, for cpuset use alloca for it if
CPU_ALLOC_SIZE is defined, otherwise local fixed size variable.
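
For glibcs without CPU_COUNT_S, the gomp_cpuset_popcount entry above falls back to counting bits word by word over gomp_cpuset_size bytes.  A rough standalone sketch of that idea follows; popcount_words is a hypothetical name, and a plain unsigned long array stands in for the cpu_set_t bit words.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fallback loop: count the set bits in
   NBYTES bytes of BITS, one unsigned long word at a time.  */
static unsigned long
popcount_words (const unsigned long *bits, size_t nbytes)
{
  unsigned long ret = 0;

  for (size_t i = 0; i < nbytes / sizeof (bits[0]); i++)
    {
      unsigned long mask = bits[i];

      /* Clear the lowest set bit until the word is empty.  */
      while (mask != 0)
        {
          mask &= mask - 1;
          ret++;
        }
    }
  return ret;
}
```

The point of passing the byte count explicitly is the same as in the ChangeLog: the number of iterations follows the actual set size rather than sizeof (cpu_set_t).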

--- libgomp/config/linux/proc.c.jj  2013-03-20 10:02:06.0 +0100
+++ libgomp/config/linux/proc.c 2013-10-01 14:09:00.759638855 +0200
@@ -39,19 +39,27 @@
 #endif
 
 #ifdef HAVE_PTHREAD_AFFINITY_NP
+unsigned long gomp_cpuset_size;
+cpu_set_t *gomp_cpusetp;
+
 unsigned long
 gomp_cpuset_popcount (cpu_set_t *cpusetp)
 {
-#ifdef CPU_COUNT
-  /* glibc 2.6 and above provide a macro for this.  */
-  return CPU_COUNT (cpusetp);
+#ifdef CPU_COUNT_S
+  /* glibc 2.7 and above provide a macro for this.  */
+  return CPU_COUNT_S (gomp_cpuset_size, cpusetp);
 #else
+#ifdef CPU_COUNT
+  if (gomp_cpuset_size == sizeof (cpu_set_t))
+/* glibc 2.6 and above provide a macro for this.  */
+return CPU_COUNT (cpusetp);
+#endif
   size_t i;
   unsigned long ret = 0;
   extern int check[sizeof (cpusetp->__bits[0]) == sizeof (unsigned long int)];
 
   (void) check;
-  for (i = 0; i < sizeof (*cpusetp) / sizeof (cpusetp->__bits[0]); i++)
+  for (i = 0; i < gomp_cpuset_size / sizeof (cpusetp->__bits[0]); i++)
 {
   unsigned long int mask = cpusetp->__bits[i];
   if (mask == 0)
@@ -70,16 +78,28 @@ void
 gomp_init_num_threads (void)
 {
 #ifdef HAVE_PTHREAD_AFFINITY_NP
-  cpu_set_t cpuset;
+#if defined (_SC_NPROCESSORS_CONF) && defined (CPU_ALLOC_SIZE)
+  gomp_cpuset_size = sysconf (_SC_NPROCESSORS_CONF);
+  gomp_cpuset_size = CPU_ALLOC_SIZE (gomp_cpuset_size);
+#else
+  gomp_cpuset_size = sizeof (cpuset);
+#endif
 
-  if (pthread_getaffinity_np (pthread_self (), sizeof (cpuset), &cpuset) == 0)
+  gomp_cpusetp = (cpu_set_t *) gomp_malloc (gomp_cpuset_size);
+  if (pthread_getaffinity_np (pthread_self (), gomp_cpuset_size,
+ gomp_cpusetp) == 0)
 {
   /* Count only the CPUs this process can use.  */
-  gomp_global_icv.nthreads_var = gomp_cpuset_popcount (&cpuset);
+  gomp_global_icv.nthreads_var = gomp_cpuset_popcount (gomp_cpusetp);
   if (gomp_global_icv.nthreads_var == 0)
gomp_global_icv.nthreads_var = 1;
   return;
 }
+  else
+{
+  free (gomp_cpusetp);
+  gomp_cpusetp = NULL;
+}
 #endif
 #ifdef _SC_NPROCESSORS_ONLN
   gomp_global_icv.nthreads_var = sysconf (_SC_NPROCESSORS_ONLN);
@@ -90,15 +110,14 @@ static int
 get_num_procs (void)
 {
 #ifdef HAVE_PTHREAD_AFFINITY_NP
-  cpu_set_t cpuset;
-
   if (gomp_cpu_affinity == NULL)
 {
   /* Count only the CPUs this process can use.  */
-  if (pthread_getaffinity_np (pthread_self (), sizeof (cpuset),
- &cpuset) == 0)
+  if (gomp_cpusetp
+ && pthread_getaffinity_np (pthread_self (), gomp_cpuset_size,
+gomp_cpusetp) == 0)
{
- int ret = gomp_cpuset_popcount (&cpuset);
+ int ret = gomp_cpuset_popcount (gomp_cpusetp);
  return ret != 0 ? ret : 1;
   

Re: [PATCH, doc]: Fix "@anchor should not appear in @heading" warning

2013-10-01 Thread Uros Bizjak
On Sun, Sep 29, 2013 at 6:59 PM, Uros Bizjak  wrote:

> Rather trivial fix - put @anchor before @heading, as texi manual suggests.
>
> 2013-09-29  Uros Bizjak  
>
> * doc/install.texi (Host/target specific installation notes for GCC):
> Put @anchor before @heading.
>
> Tested by "make doc" with texinfo 5.1 on Fedora 19.

I have committed both patches to mainline SVN under the assumption
that there were no objections.

Uros.


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Yufeng Zhang

Hi Bill,

Thank you for the review and the offer to help.

On 10/01/13 15:36, Bill Schmidt wrote:
> On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote:
>> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote:
>>> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  wrote:
>>>> Hello,
>>>>
>>>> Please find the updated version of the patch in the attachment.  It has
>>>> addressed the previous comments and also included some changes in order to
>>>> pass the bootstrapping on x86_64.
>>>>
>>>> It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
>>>>
>>>> It will also fix the test failure as reported here:
>>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
>>>>
>>>> OK for the trunk?
>>>
>>> +   where n is a 32-bit unsigned int and pointer are 64-bit long.  In this
>>> +   case, the gimple for (n - 1) is:
>>> +
>>> + _2 = n_1(D) + 4294967295; // 0x
>>> +
>>> +   and it is wrong to multiply the large constant by 4 in the 64-bit space.  */
>>> +
>>> +static bool
>>> +safe_to_multiply_p (tree type, double_int cst)
>>> +{
>>> +  if (TYPE_UNSIGNED (type)
>>> +&&  ! double_int_fits_to_tree_p (signed_type_for (type), cst))
>>> +return false;
>>> +
>>> +  return true;
>>> +}
>>>
>>> This looks wrong.  The only relevant check is as whether the
>>> multiplication overflows the original type as you miss the implicit
>>> truncation that happens.  Which is something you don't know
>>> unless you know the value.  It definitely isn't a property of a type
>>> and a constant but the property of two constants and a type.
>>> Or the predicate has a wrong name.
>>>
>>> The use of get_unwidened in this core routine looks like this is
>>> all happening in the wrong place and we should have picked up
>>> another candidate for this instead?  I'm sure Bill will know more here.
>>
>> I'm not happy with how this patch is progressing.  Without having looked
>> too deeply, this might be better handled earlier when determining which
>> casts are safe to use in building candidates.  What you have here seems
>> more like closing the barn door after the horse got out.  Maybe that's
>> the only solution, but it doesn't seem likely.
>>
>> Another problem is that your test case isn't testing anything except
>> that the compiler doesn't crash.  That isn't sufficient as a regression
>> test.
>>
>> I'll spend some time looking at this to see if I can find a better
>> approach.  It might be a day or two before I can get to it.  In addition
>> to the included test case, are there any other cases you've found that I
>> should be concerned with?
>
> To help me investigate this without having to build a cross compiler,
> could you please compile your test case (without the patch applied)
> using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the
> generated dump files?


The issue is not specific to AArch64; please find the attached dumps 
generated from the x86-64 gcc by compiling gcc.dg/tree-ssa/slsr-39.c.


W.r.t your comment in the other email about adding a test to verify the 
expected gimple, I think the existing test gcc.dg/tree-ssa/slsr-39.c is 
sufficient.  The test currently fails on both AArch64 and x86-64, and 
presumably also fails on any other 64-bit target where pointer is 64-bit 
and int is 32-bit size.  The patch I proposed is to fix this issue and 
gcc.dg/tree-ssa/slsr-39.c itself shall be a good regression test (with 
specific verification on gimple ir).


The new test proposed in this patch is to regtest the issue my original 
patch has, which is a runtime failure due to incorrect optimization.


I'll address other comments in separate emails.

Thanks,
Yufeng
;; Function foo (foo, funcdef_no=0, decl_uid=1722, symbol_order=0)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;; 2 succs { 1 }
foo (int[50] * a2, int v1)
{
  int j;
  long unsigned int _3;
  long unsigned int _4;
  int[50] * _6;
  int _11;
  int _12;
  int _13;

  :
  j_2 = v1_1(D) + 5;
  _3 = (long unsigned int) j_2;
  _4 = _3 * 200;
  _6 = a2_5(D) + _4;
  j_7 = v1_1(D) + 6;
  *_6[j_2] = j_2;
  *_6[j_7] = j_2;
  _11 = v1_1(D) + 4;
  _12 = *_6[_11];
  _13 = _12 + 1;
  *_6[_11] = _13;
  return;

}
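
For readers following the dump above, this is a C function that would plausibly lower to the same GIMPLE.  It is a reconstruction from the dump only, not the text of gcc.dg/tree-ssa/slsr-39.c, and it assumes a 4-byte int so that one row of 50 ints matches the multiplication by 200.

```c
/* Reconstructed from the GIMPLE dump; not the testsuite source.  */
void
foo (int a2[][50], int v1)
{
  int j = v1 + 5;        /* j_2 = v1_1(D) + 5 */

  a2[j][j] = j;          /* *_6[j_2] = j_2 */
  a2[j][j + 1] = j;      /* *_6[j_7] = j_2, with j_7 = v1_1(D) + 6 */
  a2[j][j - 1] += 1;     /* *_6[_11] = *_6[_11] + 1, with _11 = v1_1(D) + 4 */
}
```

Here _6 corresponds to the address of row a2[j], which is where the widening of j to long unsigned int and the multiplication by 200 come from.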



;; Function foo (foo, funcdef_no=0, decl_uid=1722, symbol_order=0)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;; 2 succs { 1 }

Strength reduction candidate vector:

  1  [2] j_2 = v1_1(D) + 5;
 ADD  : v1_1(D) + (5 * 1) : int
 basis: 0  dependent: 5  sibling: 0
 next-interp: 0  dead-savings: 0

  2  [2] _3 = (long unsigned int) j_2;
 ADD  : v1_1(D) + (5 * 1) : long unsigned int
 basis: 0  dependent: 0  sibling: 0
 next-interp: 0  dead-savings: 0

  3  [2] _4 = _3 * 200;
 MULT : (v1_1(D) + 5) * 200 : long unsigned int
 basis: 0  dependent: 0  sibling: 0
 next-interp: 0  dead-savings: 1

  4  [2] _6 = a2_5(D) + _4;
 ADD  : a2_5(D) + (1 * _4) : int[50] *
 basis: 0  dependent: 0  sibling: 0
 next-interp: 0  dead-savings: 0

  5  [2] j_7 = v1_1(D) + 6;
 ADD  : v1_1(D) + (6 * 1) : int
 basis: 1  dependent: 8  sibling: 0
 next-interp: 0  dead-savings: 0

  6  [2] *_6[j_2] = j_2;
 REF  : _6 + 

Re: [PATCH, IRA] Fix ALLOCNO_MODE in the case of paradoxical subreg.

2013-10-01 Thread Wei Mi
> Please check whether it is ok. Bootstrap and regression ok. I am also
> verifying its performance effect on google applications (But most of
> them are 64 bits, so I cannot verify its performance effect on 32 bits
> apps).

I have verified that it has no performance impact on Google applications.

Thanks,
Wei Mi.


Re: [wwwdocs] Mention ubsan in 4.9 changes.html

2013-10-01 Thread Gerald Pfeifer
On Thu, 19 Sep 2013, Marek Polacek wrote:
> Maybe it'd be worth noting in changes.html that GCC now has the
> ubsan...

Not just maybe! :-)

> --- www/htdocs/gcc-4.9/changes.html.mp   2013-09-19 16:54:32.113724993 +0200
> +++ www/htdocs/gcc-4.9/changes.html   2013-09-19 17:07:05.418030738 +0200
> +UndefinedBehaviorSanitizer, a fast undefined behavior detector,
> +has been added and can be enabled via 
> -fsanitize=undefined.
> +  Various computations will be instrumented to detect undefined behavior
> +  at runtime.  UndefinedBehaviorSanitizer is currently available for C
> +  and C++ languages.

"for the C and C++ languages"

Is ubsan a common abbreviation?  If so, you may want to mention it,
for example in parentheses after the first word.  Otherwise, okay
as is (with the extra "the" above).

Thanks,
Gerald

PS: Actually, wouldn't this be worth a news item on the GCC main
page, too?


[patch] Move some prototypes out of tree-flow.h

2013-10-01 Thread Andrew MacLeod
This patch moves 5 sets of prototypes out of tree-flow.h and creates 4 
new header files.


I then #include the new header files from tree-flow.h as a temporary
measure until all prototypes have been cleared up.  Then, as previously
discussed, I will revisit all the #includes *of* and *within*
tree-flow.h to get it and the .c files down to the basics required.  I
suspect this will then expose more functions that should be shuffled.
This time around I've only shuffled what really needed shuffling for
compilation.


All the header files now list all the exports within the corresponding file.

tree-cfgcleanup.h: New file.  Pretty basic; just moved the prototypes.

tree-dfa.h: New file.  Added the prototypes, and moved
get_addr_base_and_unit_offset_1 from tree-flow-inline.h.  Its related
functions were in tree-dfa.c so I just left them all there for now.


tree-pretty-print.h: This file already existed, but some of the
prototypes were in tree-flow.h for some reason.  It also contained a
prototype for a C front end debug routine which isn't used anywhere, so
I deleted that.


tree-into-ssa.h: Another new file.  Moved the prototypes out of tree-flow.h,
and there were a bunch of debug prototypes in the .c file itself.  I moved
those to the header file for clarity.


gimple-low.h: The final new file. Moved the prototypes here.

gimple-low.c: I moved try_catch_may_fallthru() and block_may_fallthru()
to tree.c.  There are gimple versions in gimple-low.c already, and it
turns out that block_may_fallthru() is called from the C++ front end,
so it doesn't belong in gimple.c.  The prototype is already in tree.h anyway.


A few .c files required adding tree.h to their includes to pick up
bits that moved due to the reshuffling, mostly uses of enum tree_code in
tree-pretty-print.h.  Eventually, all the .c files ought to include
tree.h directly.  I'll take care of that to some degree when tree-flow.h
is finally sorted out.


Bootstraps on x86_64-unknown-linux-gnu and no new regressions.  OK?

Andrew

	* tree-flow.h: Include new .h files.  Move prototypes.
	* tree-cfgcleanup.h: New file.  Add prototypes from tree-flow.h.
	* tree-dfa.h: New File.  Add prototypes from tree-flow.h.
	(get_addr_base_and_unit_offset_1) Move from tree-flow-inline.h.
	* tree-pretty-print.h: Add prototypes from tree-flow.h.
	* tree-into-ssa.h: New File.  Add prototypes from tree-flow.h.
	({debug|dump}*): Move debugging prototypes out of tree-into-ssa.c.
	* tree-into-ssa.c ({debug|dump}*): Move prototypes to header file.
	* tree.h (get_ref_base_and_extent): Move prototype out.
	* tree-flow-inline.h (get_addr_base_and_unit_offset_1): Move to 
	tree-dfa.h.
	* gimple-low.h: New File.  Add prototypes from tree-flow.h.
	* gimple-low.c (try_catch_may_fallthru, block_may_fallthru): Move to...
	* tree.c (try_catch_may_fallthru, block_may_fallthru): Here.
	* tree-scalar-evolution.c: Include tree.h.
	* sese.c: Include tree.h.
	* dumpfile.c: Move gimple-pretty-print.h include after tree.h.
	* dwarf2out.c: Include tree-dfa.h.
	* tree-chrec.c: Include tree.h.
	* tree-data-ref.c: Include tree.h.

Index: tree-flow.h
===
*** tree-flow.h	(revision 203034)
--- tree-flow.h	(working copy)
*** along with GCC; see the file COPYING3.
*** 30,36 
  #include "cgraph.h"
  #include "ipa-reference.h"
  #include "tree-ssa-alias.h"
! 
  
  /* This structure is used to map a gimple statement to a label,
 or list of labels to represent transaction restart.  */
--- 30,40 
  #include "cgraph.h"
  #include "ipa-reference.h"
  #include "tree-ssa-alias.h"
! #include "tree-cfgcleanup.h"
! #include "tree-dfa.h"
! #include "tree-pretty-print.h"
! #include "gimple-low.h"
! #include "tree-into-ssa.h"
  
  /* This structure is used to map a gimple statement to a label,
 or list of labels to represent transaction restart.  */
--- 96,101 
*** extern basic_block move_sese_region_to_f
*** 379,408 
  void remove_edge_and_dominated_blocks (edge);
  bool tree_node_can_be_shared (tree);
  
- /* In tree-cfgcleanup.c  */
- extern bitmap cfgcleanup_altered_bbs;
- extern bool cleanup_tree_cfg (void);
- 
- /* In tree-pretty-print.c.  */
- extern void dump_generic_bb (FILE *, basic_block, int, int);
- extern int op_code_prio (enum tree_code);
- extern int op_prio (const_tree);
- extern const char *op_symbol_code (enum tree_code);
- 
- /* In tree-dfa.c  */
- extern void renumber_gimple_stmt_uids (void);
- extern void renumber_gimple_stmt_uids_in_blocks (basic_block *, int);
- extern void dump_dfa_stats (FILE *);
- extern void debug_dfa_stats (void);
- extern void dump_variable (FILE *, tree);
- extern void debug_variable (tree);
- extern void set_ssa_default_def (struct function *, tree, tree);
- extern tree ssa_default_def (struct function *, tree);
- extern tree get_or_create_ssa_default_def (struct function *, tree);
- extern bool stmt_references_abnormal_ssa_name (gimple);
- extern tree get_addr_base_and_unit_of

Re: [PATCH v2] Fix libgfortran cross compile configury w.r.t newlib

2013-10-01 Thread Steve Ellcey
On Tue, 2013-10-01 at 12:40 +0100, Marcus Shawcroft wrote:
> On 30/09/13 13:40, Marcus Shawcroft wrote:
> 
> >> Well, I thought this patch would work for me, but it does not.  It looks
> >> like gcc_no_link is set to 'no' on my target because, technically, I can
> >> link even if I don't use a linker script.  I just can't find any
> >> functions.
> >>
> 
> > In which case gating on gcc_no_link could be replaced with a test that
> > looks to see if we can link with the library.  Perhaps looking for
> > exit() or some such that might reasonably be expected to be present.
> >
> > For example:
> >
> > AC_CHECK_FUNC(exit)
> > if test "x${with_newlib}" = "xyes" -a "x${ac_cv_func_exit}" = "xno"; then
> >
> > /Marcus

This patch works on my mips-mti-elf target.

Steve Ellcey
sell...@mips.com




Re: [Patch] Fix interval quantifier in lookahead subexpr in regex

2013-10-01 Thread Tim Shen
On Tue, Oct 1, 2013 at 1:32 AM, Paolo Carlini  wrote:
> Ok, thanks.

Committed.

Thanks!


-- 
Tim Shen


Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Diego Novillo
On Tue, Oct 1, 2013 at 10:10 AM, Joern Rennecke
 wrote:

> Yes.  Claudiu Zissulescu at Synopsys would in principle be available as
> co-maintainer, but I suppose it is customary to apply for write-after-
> approval status first.

I'm not sure.  A question for the SC.

>> SC folks, could you appoint Joern (and any other volunteer that Joern
>> suggests) as maintainers?
>
>
> I've already been appointed as maintainer back in January.

OK, great.  Here's me paying attention :)


Re: [C++ Patch] PR 58563

2013-10-01 Thread Jason Merrill

OK.

Jason


Re: [PATCH]Fix computation of offset in ivopt

2013-10-01 Thread Bin.Cheng
On Tue, Oct 1, 2013 at 6:50 PM, Richard Biener
 wrote:
> On Mon, Sep 30, 2013 at 7:39 AM, bin.cheng  wrote:
>>
>>
>
> I don't think you need
>
> +  /* Sign extend off if expr is in type which has lower precision
> + than HOST_WIDE_INT.  */
> +  if (TYPE_PRECISION (TREE_TYPE (expr)) <= HOST_BITS_PER_WIDE_INT)
> +off = sext_hwi (off, TYPE_PRECISION (TREE_TYPE (expr)));
>
> at least it would be suspicious if you did ...
There is a problem with the example from the first message.  The iv base is like:
pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4
I am not sure why, but it seems (-4/0xFFFFFFFC) is represented by
(1073741823*4).  For each operand strip_offset_1 returns exactly the
positive number, and the result of the multiplication never gets a chance
to be sign extended.  That's why the sign extension is necessary to fix the
problem.  Or it should be fixed elsewhere by representing the iv base with:
"pretmp_184 + ((sizetype) KeyIndex_180 + 4294967295) * 4" in the first place.
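
To make the arithmetic concrete, here is a small standalone sketch of sign-extending a value from a narrower precision.  sign_extend is a hypothetical helper written for this note, not GCC's sext_hwi, and the 32-bit precision used in the checks is an assumption about sizetype on the target in question.

```c
#include <stdint.h>

/* Hypothetical helper: treat the low PREC bits of X as a two's
   complement number and return it as a 64-bit signed value.
   PREC must be between 1 and 64.  */
static int64_t
sign_extend (uint64_t x, unsigned prec)
{
  uint64_t sign = (uint64_t) 1 << (prec - 1);
  uint64_t mask = (sign << 1) - 1;   /* all ones when PREC is 64 */

  x &= mask;
  /* Flip the sign bit and subtract it back; the final conversion to
     int64_t relies on the usual two's complement behaviour.  */
  return (int64_t) ((x ^ sign) - sign);
}
```

With a 32-bit precision both 1073741823 * 4 and 4294967295 * 4 come out as -4, which is why the two representations of the iv base are equivalent once the offset is sign extended.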

>
> The only case that I can think of points to a bug in strip_offset_1
> again, namely if sizetype (the type of all offsets) is smaller than
> a HOST_WIDE_INT in which case
>
> +boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
> +*offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
>
> is wrong as boffset / BITS_PER_UNIT does not do a signed division
> then (for negative boffset which AFAIK does not happen - but it would
> be technically allowed).  Thus, the predicates like
>
> +&& cst_and_fits_in_hwi (tmp)
>
> would need to be amended with a check that the MSB is not set.
So I can handle it like:

+abs_boffset = abs_hwi (boffset);
+x = abs_boffset / BITS_PER_UNIT;
+if (boffset < 0)
+  x = -x;
+*offset = off0 + int_cst_value (tmp) + x;

Right?
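
A standalone sketch of the difference in question, with hypothetical function names and BITS_PER_UNIT assumed to be 8.  Note that in C99 and later a plain signed division already truncates toward zero, so the abs-based form gives the same result as boffset / BITS_PER_UNIT on a signed type; what goes wrong is a division carried out on an unsigned type.

```c
#include <stdint.h>

#define BITS_PER_UNIT 8   /* assumed value for this sketch */

/* Divide via the absolute value, as proposed above.  Does not handle
   INT64_MIN, whose negation overflows.  */
static int64_t
bit_to_byte_offset (int64_t boffset)
{
  int64_t abs_boffset = boffset < 0 ? -boffset : boffset;
  int64_t x = abs_boffset / BITS_PER_UNIT;

  return boffset < 0 ? -x : x;
}

/* The problematic form: the same division done on an unsigned type.  */
static uint64_t
bit_to_byte_offset_unsigned (int64_t boffset)
{
  return (uint64_t) boffset / BITS_PER_UNIT;
}
```

For a bit offset of -16 the first function yields -2, while the unsigned division yields a huge positive number.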

>
> Btw, the cst_and_fits_in_hwi implementation is odd:
>
> bool
> cst_and_fits_in_hwi (const_tree x)
> {
>   if (TREE_CODE (x) != INTEGER_CST)
> return false;
>
>   if (TYPE_PRECISION (TREE_TYPE (x)) > HOST_BITS_PER_WIDE_INT)
> return false;
>
>   return (TREE_INT_CST_HIGH (x) == 0
>   || TREE_INT_CST_HIGH (x) == -1);
> }
>
> the precision check seems totally pointless and I wonder what's
> the point of this routine as there is host_integerp () already
> and tree_low_cst instead of int_cst_value - oh, I see, the latter
> forcefully sign-extends  that should make the extension
> not necessary.
See above.

Thanks.
bin


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Bill Schmidt
OK, thanks.  The problem that you've encountered is that you are
attempting to do something illegal. ;)  (Bin's original patch is
actually to blame for that, as well as me for not catching it then.)

As your new test shows, it is unsafe to do the transformation in
backtrace_base_for_ref when widening from an unsigned type, because the
unsigned type has wrap semantics by default.  (The actual test must be
done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or
removed by compile option -- see the comments with legal_cast_p and
legal_cast_p_1 later in the module.)

You cannot in general prove that the transformation is allowable for a
specific constant, because you don't know that what you're adding it to
won't cause an overflow that's handled incorrectly.
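
A small standalone illustration of why the widening matters, using hypothetical function names and assuming 4-byte elements as in the patch comment.  The first function computes n - 1 in the 32-bit unsigned type and then widens; the second widens first and adds the same constant in 64 bits, which is what the unsafe transformation amounts to.

```c
#include <stdint.h>

/* Byte offset as the source program means it: n + 4294967295 wraps in
   32 bits (i.e. n - 1), then the result is widened and scaled.  */
static uint64_t
offset_narrow_then_widen (uint32_t n)
{
  uint32_t idx = n + 4294967295u;

  return (uint64_t) idx * 4;
}

/* Byte offset if the constant is added after widening: no wraparound
   happens, so the result differs whenever the 32-bit add would wrap.  */
static uint64_t
offset_widen_then_add (uint32_t n)
{
  return ((uint64_t) n + 4294967295u) * 4;
}
```

For n == 0 both functions agree, because the 32-bit addition does not wrap; for n == 1 the first gives 0 and the second gives 17179869184.  Whether the two forms agree depends on the value being added to, not on the constant alone.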

I believe the correct fix for the unsigned-overflow case is to fail
backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns
false, where in_type is the type of the new *PBASE, and out_type is the
widening type that you're looking through.  So you can't just
STRIP_NOPS, you have to check the cast for legitimacy for this
transformation.

This does not explain why backtrace_base_for_ref does not find all the
opportunities on slsr-39.c.  I don't immediately see what's preventing
that.  Note that the transformation is legal in that case because you
are widening from a signed int to an unsigned int, which won't cause
problems.  You guys need to dig deeper into why those opportunities are
missed when sizetype is larger than int.  Let me know if you need help
figuring it out.

Thanks,
Bill

On Tue, 2013-10-01 at 16:06 +0100, Yufeng Zhang wrote:
> Hi Bill,
> 
> Thank you for the review and the offer to help.
> 
> On 10/01/13 15:36, Bill Schmidt wrote:
> > On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote:
> >> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote:
> >>> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  
> >>> wrote:
> >>>> Hello,
> >>>>
> >>>> Please find the updated version of the patch in the attachment.  It has
> >>>> addressed the previous comments and also included some changes in order to
> >>>> pass the bootstrapping on x86_64.
> >>>>
> >>>> It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
> >>>>
> >>>> It will also fix the test failure as reported here:
> >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
> >>>>
> >>>> OK for the trunk?
> >>>
> >>> +   where n is a 32-bit unsigned int and pointer are 64-bit long.  In this
> >>> +   case, the gimple for (n - 1) is:
> >>> +
> >>> + _2 = n_1(D) + 4294967295; // 0x
> >>> +
> >>> +   and it is wrong to multiply the large constant by 4 in the 64-bit space.  */
> >>> +
> >>> +static bool
> >>> +safe_to_multiply_p (tree type, double_int cst)
> >>> +{
> >>> +  if (TYPE_UNSIGNED (type)
> >>> +&&  ! double_int_fits_to_tree_p (signed_type_for (type), cst))
> >>> +return false;
> >>> +
> >>> +  return true;
> >>> +}
> >>>
> >>> This looks wrong.  The only relevant check is as whether the
> >>> multiplication overflows the original type as you miss the implicit
> >>> truncation that happens.  Which is something you don't know
> >>> unless you know the value.  It definitely isn't a property of a type
> >>> and a constant but the property of two constants and a type.
> >>> Or the predicate has a wrong name.
> >>>
> >>> The use of get_unwidened in this core routine looks like this is
> >>> all happening in the wrong place and we should have picked up
> >>> another candidate for this instead?  I'm sure Bill will know more here.
> >>
> >> I'm not happy with how this patch is progressing.  Without having looked
> >> too deeply, this might be better handled earlier when determining which
> >> casts are safe to use in building candidates.  What you have here seems
> >> more like closing the barn door after the horse got out.  Maybe that's
> >> the only solution, but it doesn't seem likely.
> >>
> >> Another problem is that your test case isn't testing anything except
> >> that the compiler doesn't crash.  That isn't sufficient as a regression
> >> test.
> >>
> >> I'll spend some time looking at this to see if I can find a better
> >> approach.  It might be a day or two before I can get to it.  In addition
> >> to the included test case, are there any other cases you've found that I
> >> should be concerned with?
> >
> > To help me investigate this without having to build a cross compiler,
> > could you please compile your test case (without the patch applied)
> > using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the
> > generated dump files?
> 
> The issue is not specific to AArch64; please find the attached dumps 
> generated from the x86-64 gcc by compiling gcc.dg/tree-ssa/slsr-39.c.
> 
> W.r.t your comment in the other email about adding a test to verify the 
> expected gimple, I think the existing test gcc.dg/tree-ssa/slsr-39.c is 
> sufficient.  The test currently fails on both AArch64 and x86-64, and 
> p

Re: [ARM, AArch64] Make aarch64-common.c files more robust.

2013-10-01 Thread Richard Earnshaw
On 30/09/13 09:52, James Greenhalgh wrote:
> 
> Hi,
> 
> Recently I've found myself getting a number of segfaults from
> code calling in to the arm_early_load/alu_dep functions in
> aarch64-common.c. These functions expect a particular form
> for the RTX patterns they work over, but some of them do
> not validate this form.
> 
> This patch fixes that, removing segmentation faults I see
> when tuning for Cortex-A15 and Cortex-A7.
> 
> Tested on aarch64-none-elf and arm-none-eabi with no regressions.
> 
> OK?
> 
> Thanks,
> James
> 
> ---
> gcc/
> 
> 2013-09-30  James Greenhalgh  
> 
>   * config/arm/aarch-common.c
>   (arm_early_load_addr_dep): Add sanity checking.
>   (arm_no_early_alu_shift_dep): Likewise.
>   (arm_no_early_alu_shift_value_dep): Likewise.
>   (arm_no_early_mul_dep): Likewise.
>   (arm_no_early_store_addr_dep): Likewise.
> 
> 
> 0001-ARM-AArch64-Make-aarch64-common.c-files-more-robust.patch
> 
> 
> diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
> index 69366af..ea50848 100644
> --- a/gcc/config/arm/aarch-common.c
> +++ b/gcc/config/arm/aarch-common.c
> @@ -44,7 +44,12 @@ arm_early_load_addr_dep (rtx producer, rtx consumer)
>  value = COND_EXEC_CODE (value);
>if (GET_CODE (value) == PARALLEL)
>  value = XVECEXP (value, 0, 0);
> +
> +  if (GET_CODE (value) != SET)
> +return 0;
> +
>value = XEXP (value, 0);

Please change this to use SET_DEST.

> +
>if (GET_CODE (addr) == COND_EXEC)
>  addr = COND_EXEC_CODE (addr);
>if (GET_CODE (addr) == PARALLEL)
> @@ -54,6 +59,10 @@ arm_early_load_addr_dep (rtx producer, rtx consumer)
>else
>  addr = XVECEXP (addr, 0, 0);
>  }
> +
> +  if (GET_CODE (addr) != SET)
> +return 0;
> +
>addr = XEXP (addr, 1);
>  

And this to use SET_SRC.

>return reg_overlap_mentioned_p (value, addr);
> @@ -74,21 +83,41 @@ arm_no_early_alu_shift_dep (rtx producer, rtx consumer)
>  value = COND_EXEC_CODE (value);
>if (GET_CODE (value) == PARALLEL)
>  value = XVECEXP (value, 0, 0);
> +
> +  if (GET_CODE (value) != SET)
> +return 0;
> +
>value = XEXP (value, 0);

SET_DEST.

> +
>if (GET_CODE (op) == COND_EXEC)
>  op = COND_EXEC_CODE (op);
>if (GET_CODE (op) == PARALLEL)
>  op = XVECEXP (op, 0, 0);
> +
> +  if (GET_CODE (op) != SET)
> +return 0;
> +
>op = XEXP (op, 1);

SET_SRC.

>  
> +  if (!INSN_P (op))
> +return 0;

This looks wrong.  What are you really expecting here?  Surely not INSN,
since then ...
> +
>early_op = XEXP (op, 0);

... this would always fault, since element 0 is an int

> +
>/* This is either an actual independent shift, or a shift applied to
>   the first operand of another operation.  We want the whole shift
>   operation.  */
>if (REG_P (early_op))
>  early_op = op;
>  
> -  return !reg_overlap_mentioned_p (value, early_op);
> +  if (GET_CODE (op) == ASHIFT
> +  || GET_CODE (op) == ROTATE
> +  || GET_CODE (op) == ASHIFTRT
> +  || GET_CODE (op) == LSHIFTRT
> +  || GET_CODE (op) == ROTATERT)
> +return !reg_overlap_mentioned_p (value, early_op);
> +  else
> +return 0;
>  }
>  
>  /* Return nonzero if the CONSUMER instruction (an ALU op) does not
> @@ -106,13 +135,25 @@ arm_no_early_alu_shift_value_dep (rtx producer, rtx 
> consumer)
>  value = COND_EXEC_CODE (value);
>if (GET_CODE (value) == PARALLEL)
>  value = XVECEXP (value, 0, 0);
> +
> +  if (GET_CODE (value) != SET)
> +return 0;
> +
>value = XEXP (value, 0);

SET_DEST.

> +
>if (GET_CODE (op) == COND_EXEC)
>  op = COND_EXEC_CODE (op);
>if (GET_CODE (op) == PARALLEL)
>  op = XVECEXP (op, 0, 0);
> +
> +  if (GET_CODE (op) != SET)
> +return 0;
> +
>op = XEXP (op, 1);
>  
> +  if (!INSN_P (op))
> +return 0;
> +

Same issue as above.

>early_op = XEXP (op, 0);
>  
>/* This is either an actual independent shift, or a shift applied to
> @@ -121,7 +162,14 @@ arm_no_early_alu_shift_value_dep (rtx producer, rtx 
> consumer)
>if (!REG_P (early_op))
>  early_op = XEXP (early_op, 0);
>  
> -  return !reg_overlap_mentioned_p (value, early_op);
> +  if (GET_CODE (op) == ASHIFT
> +  || GET_CODE (op) == ROTATE
> +  || GET_CODE (op) == ASHIFTRT
> +  || GET_CODE (op) == LSHIFTRT
> +  || GET_CODE (op) == ROTATERT)
> +return !reg_overlap_mentioned_p (value, early_op);
> +  else
> +return 0;
>  }
>  
>  /* Return nonzero if the CONSUMER (a mul or mac op) does not
> @@ -138,11 +186,20 @@ arm_no_early_mul_dep (rtx producer, rtx consumer)
>  value = COND_EXEC_CODE (value);
>if (GET_CODE (value) == PARALLEL)
>  value = XVECEXP (value, 0, 0);
> +
> +  if (GET_CODE (value) != SET)
> +return 0;
> +
>value = XEXP (value, 0);

SET_DEST.

> +
>if (GET_CODE (op) == COND_EXEC)
>  op = COND_EXEC_CODE (op);
>if (GET_CODE (op) == PARALLEL)
>  op = XVECEXP (op, 0, 0);
> +
> +  if (GET_CODE (op) != SET)
> +re

Re: [PATCH] fix size_estimation for builtin_expect

2013-10-01 Thread Rong Xu
ping.

On Fri, Sep 27, 2013 at 3:56 PM, Rong Xu  wrote:
> Hi,
>
> builtin_expect should be a NOP in size_estimation. Indeed, the call
> stmt itself is 0 weight in size and time. But it may introduce
> an extra relation expr which has non-zero size/time. The end result
> is: for w/ and w/o builtin_expect, we have different size/time estimation
> for inlining.
>
> This patch fixes this problem.
>
> An earlier discussion of this patch is
>   https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315
>
> This new patch addresses Honza's comments.
> It passes the bootstrap and regression.
>
> Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
> that's overkill for this simple problem. Your code mostly deals with
> recursively walking the PHI nodes to find the real def stmts.
> Here the traversal is within one BB and I may need to continue through
> multiple real assignments. Calling walk_use_def_chains would probably only
> use the SSA_NAME_DEF_STMT() part of the code.
>
> Thanks,
>
> -Rong
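
To make the quoted problem concrete, here is a minimal sketch (the function names are hypothetical and not taken from the patch). Both functions compute the same result; per the description above, the hinted form may carry an extra relational expression feeding the builtin_expect call, and that extra statement is what skews the size/time estimate.

```c
/* Plain comparison: the condition is used directly by the branch. */
int check_plain(int x)
{
    if (x > 0)
        return 1;
    return 0;
}

/* Same test with a branch hint.  __builtin_expect is a GCC/Clang
   extension; the second argument is the value the first is expected
   to have.  The comparison result may first be computed as a separate
   value and then passed to the builtin.  */
int check_hinted(int x)
{
    if (__builtin_expect(x > 0, 1))
        return 1;
    return 0;
}
```

Since the two versions behave identically, one would expect the inliner to cost them the same.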


Re: [PATCH] alternative hirate for builtin_expert

2013-10-01 Thread Rong Xu
ping.

On Fri, Sep 27, 2013 at 3:53 PM, Rong Xu  wrote:
> Hi,
>
>
>
>   The current default probability for builtin_expect is 0.9996.
> This makes the freq of the unlikely bb very low (4), which
> suppresses the inlining of any calls within those bbs.
>
> We used FDO data to measure the branch probability for
> branches annotated with builtin_expect.
> For google internal benchmarks, the weighted average
> (with the profile count value as the weight) is 0.9081.
>
> The Linux kernel is another program that is heavily annotated
> with builtin_expect. We measured its weighted average as 0.8717,
> using google search as the workload.
>
> This patch sets the alternate hirate probability for builtin_expect
> to 90%. With the alternate hirate, we measured performance
> improvements for google benchmarks and the Linux kernel.
>
> An earlier discussion is
> https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b
>
> This new patch is for the trunk and addresses Honza's comments.
>
> Honza: this new probability is off by default. When we backport to google
> branch we will make it the default. Let me know if you want to do the same
> here.
>
> Thanks,
>
> -Rong
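
The arithmetic behind the quoted figures can be sketched as follows. This assumes probabilities are scaled to a base of 10000 (as with REG_BR_PROB_BASE in GCC); the helper name is hypothetical.

```c
#include <assert.h>

/* Assumption: probabilities are integers scaled to a base of 10000. */
enum { PROB_BASE = 10000 };

/* Scaled weight left for the unlikely edge, given the scaled
   probability of the likely edge.  */
int unlikely_weight(int likely_prob)
{
    assert(likely_prob >= 0 && likely_prob <= PROB_BASE);
    return PROB_BASE - likely_prob;
}
```

With the default of 0.9996 (9996 scaled) the unlikely side gets a weight of 4, matching the frequency mentioned above; with 90% (9000 scaled) it gets 1000.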


Re: [wwwdocs] Mention ubsan in 4.9 changes.html

2013-10-01 Thread Marek Polacek
On Tue, Oct 01, 2013 at 05:07:28PM +0200, Gerald Pfeifer wrote:
> > --- www/htdocs/gcc-4.9/changes.html.mp  2013-09-19 16:54:32.113724993 +0200
> > +++ www/htdocs/gcc-4.9/changes.html 2013-09-19 17:07:05.418030738 +0200
> > +UndefinedBehaviorSanitizer, a fast undefined behavior detector,
> > +has been added and can be enabled via -fsanitize=undefined.
> > +Various computations will be instrumented to detect undefined behavior
> > +at runtime.  UndefinedBehaviorSanitizer is currently available for C
> > +and C++ languages.
> 
> "for the C and C++ languages"
> 
> Is ubsan a common abbreviation?  If so, you may want to mention it,
> for example in parenthesis after the first word.  Otherwise, okay
> as is (with the extra "the" above).

Thanks, committed.

> PS: Actually, wouldn't this be worth a news item on the GCC main
> page, too?

Perhaps.  I'll post a patch for that in the near future.

Marek


Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Diego Novillo
On Sat, Sep 28, 2013 at 9:54 AM, Joern Rennecke
 wrote:
> The main part of the port (everything but the testsuite) is still waiting
> for review:
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00323.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00324.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00325.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00328.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01870.html
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02070.html

I have finished reading through these patches.  They are OK to commit.

The changes indicated below are minor. Ideally, you'd address them
before committing the patch, but if it's easier to do it post-commit,
that's OK too.

- The Copyright years should be 2013 in every new file.  Or has this
port been released before?

- In config/arc/arc-protos.h:
+/* insn-attrtab.c doesn't include reload.h, which declares
regno_clobbered_p. */
+extern int regno_clobbered_p (unsigned int, rtx, enum machine_mode, int);

Why not include reload.h here?  Interface changes (however rare) make
this a hassle.

- In config/arc/simdext.md
+;; Va, [Ib,u8] instructions
+;; (define_insn "vld32wh_insn"
+;;   [(set (match_operand:V8HI 0 "vector_register_operand"   "=v")
+;; (vec_concat:V8HI (unspec:V4HI [(match_operand:SI 1 "immediate_operand" "P")
+;;  (vec_select:HI (match_operand:V8HI 2 "vector_register_operand"  "v")
+;;  (parallel [(match_operand:SI 3 "immediate_operand" "L")]))]
UNSPEC_ARC_SIMD_VLD32WH)
+;; (vec_select:V4HI (match_dup 0)

Necessary?  If so, please add a comment stating why it's commented out.

- In doc/extend.texi:
+Permissible values for this parameter are: @w{@code{ilink1}} and
+@w{@code{ilink2}}.
+

Do ARC developers already know what ilink1 and ilink2 mean?

+@cindex indirect calls on Epiphany
+These attribute specifies how a particular function is called on
+ARC, ARM and Epiphany

s/specifies/specify/


+because __alignof__ sees only the type of the dereference, wheras
+__builtin_arc_align uses alignment information from the pointer

s/wheras/whereas/

- I have not fully cross-referenced the list of documented builtins vs
the list of implemented builtins. Please double check them.

- Ditto the list of -m options. It looks like they're all documented,
but I haven't diff'd the doc vs the options file.

- In libgcc/config/arc/gmon/mcount.c

The file has a different copyright/license notice at the top.  Is this
from a third-party source? Can it be changed to LGPL?

+#if 0
+ if (catomic_compare_and_exchange_bool_acq (&p->state, GMON_PROF_BUSY,
+   GMON_PROF_ON))
+  return;

Get rid of this?

+#elif defined (__ARC700__)
+/* ??? This could temporrarily loose the ERROR / OFF condition in a race,

s/temporrarily/temporarily/
s/loose/lose/

- Many files in libgcc/config/arc/... have #if 0 blocks. Are they
really necessary?

- In libgcc/config/arc/ieee-754/arc600-dsp/muldf3.S
+/* We have checked for infinitey / NaN input before, and transformed
+   denormalized inputs into normalized inputs.  Thus, the worst case

s/infinitey/infinity/

It also happens in another muldf3.S file in a sibling directory.

- libgcc/config/arc/t-arc-newlib does not contain the exception clause.

- In config/arc/arc.md there are several define patterns commented
out.  Toss them out?

- In config/arc/arc.c:

No need to include stdio.h

No need to mark struct arc_frame_info with GTY. It contains no pointers.

In arc_expand_epilogue():
Get rid of fp_restored_p

  if (1)
{
  unsigned int pretend_size = cfun->machine->frame_info.pretend_size;

Just move everything out of the if()?

In output_shift(): Get rid of the #if 1s?

In arc_encode_section_info():

/* Symbols in the text segment can be accessed without indirecting via the
   constant pool; it may take an extra binary operation, but this is still
   faster than indirecting via memory.  Don't do this when not optimizing,
   since we won't be calculating al of the offsets necessary to do this
   simplification.  */

But that seems out of sync with the code. It never checks whether
optimizations are enabled.


Thanks.  Diego.


Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #3

2013-10-01 Thread Michael Meissner
On Thu, Sep 26, 2013 at 06:56:37PM -0400, David Edelsohn wrote:
> On Thu, Sep 26, 2013 at 4:51 PM, Michael Meissner
>  wrote:
> > I discovered that I was setting the wv/wu constraints incorrectly to
> > ALTIVEC_REGS, which leads to reload failures in some cases.
> >
> > Is this patch ok to apply along with the previous patch assuming it 
> > bootstraps
> > and has no regressions with make check?  It builds the programs that had
> > failures with the previous patch.
> >
> > 2013-09-26  Michael Meissner  
> >
> > * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Don't
> > allow wv/wu constraints to be ALTIVEC_REGISTERS unless DF/SF can
> > occupy the Altivec registers.
> 
> Okay.
> 
> Can you add a testcase to catch this in the future?

I looked into this, and of the 5 spec benchmarks that did not compile without
the fix, 4 are in fortran, and the 5th in C++ did not show up the error if I
deleted any code, so there is no simple test case I know of.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Patch correcting possible bug in toplev.c regarding plugin & diagnostics finalization

2013-10-01 Thread Basile Starynkevitch
Hello All

I'm re-reading toplev_main in gcc/toplev.c and I find it strange that
invoke_plugin_callbacks for PLUGIN_FINISH is called after
diagnostic_finish (which I guess finalizes the whole diagnostic
infrastructure).

My guess is that some GCC plugins might occasionally emit a
diagnostic at the PLUGIN_FINISH stage. Imagine a plugin which fills some
database (e.g. with software metrics). It would probably close the
database at PLUGIN_FINISH and, in the rare event that closing the
database fails, emit a diagnostic (using e.g. error).


 proposed patch against trunk svn rev 203073
Index: gcc/toplev.c
===
--- gcc/toplev.c(revision 203073)
+++ gcc/toplev.c(working copy)
@@ -1970,11 +1970,13 @@ toplev_main (int argc, char **argv)
 
   if (warningcount || errorcount || werrorcount)
 print_ignored_options ();
-  diagnostic_finish (global_dc);
 
-  /* Invoke registered plugin callbacks if any.  */
+  /* Invoke registered plugin callbacks if any.  Some plugins could
+ emit some diagnostics here.  */
   invoke_plugin_callbacks (PLUGIN_FINISH, NULL);
 
+  diagnostic_finish (global_dc);
+
   finalize_plugins ();
   location_adhoc_data_fini (line_table);
   if (seen_error () || werrorcount)

 proposed gcc/ChangeLog entry

2013-10-01  Basile Starynkevitch  

* toplev.c (toplev_main): Move PLUGIN_FINISH invocation before 
  diagnostic_finish.

#

I am not entirely sure about this patch (because I don't fully
understand what diagnostic_finish is supposed to do, just partly
guessing it). Comments are welcome.
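
The ordering concern can be modelled with a small sketch. Everything here is a hypothetical stand-in, not GCC's real data structures, and the assumption that a diagnostic emitted after finalization is simply lost is only a model (as said above, what diagnostic_finish really does is partly a guess).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a diagnostic context. */
struct diag_context {
    bool finished;  /* set once the context has been finalized */
    int reported;   /* diagnostics accepted before finalization */
    int lost;       /* diagnostics emitted after finalization */
};

/* Model assumption: late diagnostics are dropped. */
void diag_emit(struct diag_context *dc)
{
    if (dc->finished)
        dc->lost++;
    else
        dc->reported++;
}

void diag_finish(struct diag_context *dc)
{
    dc->finished = true;
}

/* A plugin's finish callback that fails to close its database and
   reports an error.  */
void plugin_finish_callback(struct diag_context *dc)
{
    diag_emit(dc);
}

/* Order before the patch: finalize diagnostics, then run plugins. */
int shutdown_old_order(struct diag_context *dc)
{
    diag_finish(dc);
    plugin_finish_callback(dc);
    return dc->lost;
}

/* Order after the patch: run plugins, then finalize diagnostics. */
int shutdown_new_order(struct diag_context *dc)
{
    plugin_finish_callback(dc);
    diag_finish(dc);
    return dc->lost;
}
```

Under this model the old order loses the plugin's diagnostic and the new order reports it.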

Regards


-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***




[PATCH] Improving uniform_vector_p() function.

2013-10-01 Thread Cong Hou
The current uniform_vector_p() function only returns non-NULL when the
vector is directly a uniform vector. For example, for the following
gimple code:

vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};


The current implementation can only detect that {_9, _9, _9, _9, _9,
_9, _9, _9} is a uniform vector, but fails to recognize
vect_cst_.15_91 is also one. This simple patch searches through
assignment chains to find more uniform vectors.


thanks,
Cong



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 45c1667..b42f8a9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2013-10-01  Cong Hou  
+
+   * tree.c: Improve the function uniform_vector_p() so that a
+   vector assigned with a uniform vector is also treated as a
+   uniform vector.
+
diff --git a/gcc/tree.c b/gcc/tree.c
index 1c881e4..1d6d894 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10297,6 +10297,17 @@ uniform_vector_p (const_tree vec)
   return first;
 }

+  if (TREE_CODE (vec) == SSA_NAME)
+{
+  gimple def = SSA_NAME_DEF_STMT (vec);
+  if (gimple_code (def) == GIMPLE_ASSIGN)
+{
+  tree rhs = gimple_op (def, 1);
+  if (VECTOR_TYPE_P (TREE_TYPE (rhs)))
+return uniform_vector_p (rhs);
+}
+}
+
   return NULL_TREE;
 }
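
The idea of looking through assignment chains can be illustrated outside of GCC with a toy model. The types and names below are hypothetical and only mimic the shape of the patch: a name node points at the value assigned to it, and the check recurses through that link until it reaches a vector.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node kinds: a scalar, a vector constructor, or a name that was
   assigned some other node.  */
enum node_kind { NODE_SCALAR, NODE_VECTOR, NODE_NAME };

struct node {
    enum node_kind kind;
    const struct node *elts[8];  /* NODE_VECTOR: the elements */
    size_t nelts;                /* NODE_VECTOR: number of elements */
    const struct node *def;      /* NODE_NAME: assigned value, or NULL */
};

/* Return the repeated element if N is a uniform vector, or a name
   whose (acyclic) assignment chain ends in one; otherwise NULL.  */
const struct node *uniform_element(const struct node *n)
{
    if (n == NULL)
        return NULL;
    if (n->kind == NODE_VECTOR) {
        assert(n->nelts <= 8);
        if (n->nelts == 0)
            return NULL;
        for (size_t i = 1; i < n->nelts; i++)
            if (n->elts[i] != n->elts[0])
                return NULL;
        return n->elts[0];
    }
    if (n->kind == NODE_NAME)
        return uniform_element(n->def);  /* follow the assignment */
    return NULL;
}
```

A name assigned from another name assigned from a uniform vector is then recognized as uniform too, which is the case the patch is after.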


Re: [PATCH 2/6] Andes nds32: machine description of nds32 porting (2).

2013-10-01 Thread Richard Sandiford
Chung-Ju Wu  writes:
> +  /* Use $r15, if the value is NOT in the range of Is20,
> + we must output "sethi + ori" directly since
> + we may already passed the split stage.  */
> +  return "sethi\t%0, hi20(%1)\;ori\t%0, %0, lo12(%1)";
> +case 17:
> +  return "#";

I don't really understand the comment for case 16.  Returning "#"
(like for case 17) forces a split even at the output stage.

In this case it might not be worth forcing a split though, so I don't
see any need to change the code.  I think the comment should be changed
to give a different reason though.

> +   /* Note that (le:SI X INT_MAX) is not the same as (lt:SI X INT_MIN).
> +  We better have an assert here in case GCC does not properly
> +  optimize it away.  */
> +   gcc_assert (code != LE || INTVAL (operands[2]) != INT_MAX);

Sorry, I was being lazy when I said INT_MAX.  I really meant INT_MAX on
the target (assuming SImode == int), whereas INT_MAX here is a host thing.
0x7fffffff would be OK.
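
A small sketch of the host/target distinction, with hypothetical names: the check spells out the largest 32-bit signed value instead of relying on the host's INT_MAX, so it does not depend on how wide int happens to be on the machine running the compiler.

```c
#include <stdint.h>

/* Largest value of a 32-bit signed target integer, written out
   explicitly rather than taken from the host's <limits.h>.  */
#define TARGET_SI_MAX INT64_C(0x7fffffff)

/* Hypothetical helper: nonzero if "X <= value" may be rewritten as
   "X < value + 1" without value + 1 overflowing a 32-bit signed
   target integer.  */
int le_can_become_lt(int64_t value)
{
    return value != TARGET_SI_MAX;
}
```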

> +  /* Create RbRe_str string.
> + Note that we need to output ',' character if there exists En4 field.  */
> +  if (REGNO (operands[0]) != SP_REGNUM && REGNO (operands[1]) != SP_REGNUM)
> +  RbRe_str = INTVAL (operands[2]) != 0 ? "%0, %1, " : "%0, %1";
> +  else
> +  RbRe_str = "";

The "RbRe_str =" lines should only be indented by 2 extra spaces, not 4.
Same for pop.

Looks good otherwise, thanks.

Richard


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-10-01 Thread Teresa Johnson
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson  wrote:
> On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka  wrote:
>>>
>>> I looked at one that failed after 100 as well (20031204-1.c). In this
>>> case, it was due to expansion which was creating multiple branches/bbs
>>> from a logical OR and guessing incorrectly on how to assign the
>>> counts:
>>>
>>>  if (octets == 4 && (*cp == ':' || *cp == '\0')) {
>>>
>>> The (*cp == ':' || *cp == '\0') part looked like the following going
>>> into RTL expansion:
>>>
>>>   [20031204-1.c : 31:33] _29 = _28 == 58;
>>>   [20031204-1.c : 31:33] _30 = _28 == 0;
>>>   [20031204-1.c : 31:33] _31 = _29 | _30;
>>>   [20031204-1.c : 31:18] if (_31 != 0)
>>> goto ;
>>>   else
>>> goto ;
>>>
>>> where the result of the OR was always true, so bb 16 had a count of
>>> 100 and bb 19 a count of 0. When it was expanded, the expanded version
>>> of the above turned into 2 bbs with a branch in between. Both
>>> comparisons were done in the first bb, but the first bb checked
>>> whether the result of the *cp == '\0' compare was true, and if not
>>> branched to the check for whether the *cp == ':' compare was true. It
>>> gave the branch to the second check against ':' a count of 0, so that
>>> bb got a count of 0 and was split out, and put the count of 100 on the
>>> fall through assuming the compare with '\0' always evaluated to true.
>>> In reality, this OR condition was always true because *cp was ':', not
>>> '\0'. Therefore, the count of 0 on the second block with the check for
>>> ':' was incorrect, we ended up trying to execute it, and failed.
>>
>> I see, we produce:
>> ;; if (_26 != 0)
>>
>> (insn 94 93 95 (set (reg:CCZ 17 flags)
>> (compare:CCZ (reg:QI 107 [ D.2184 ])
>> (const_int 0 [0]))) a.c:31 -1
>>  (nil))
>>
>> (insn 95 94 96 (set (reg:QI 122 [ D.2186 ])
>> (eq:QI (reg:CCZ 17 flags)
>> (const_int 0 [0]))) a.c:31 -1
>>  (nil))
>>
>> (insn 96 95 97 (set (reg:CCZ 17 flags)
>> (compare:CCZ (reg:QI 122 [ D.2186 ])
>> (const_int 0 [0]))) a.c:31 -1
>>  (nil))
>>
>> (jump_insn 97 96 98 (set (pc)
>> (if_then_else (ne (reg:CCZ 17 flags)
>> (const_int 0 [0]))
>> (label_ref 100)
>> (pc))) a.c:31 -1
>>  (expr_list:REG_BR_PROB (const_int 6100 [0x17d4])
>> (nil)))
>>
>> (insn 98 97 99 (set (reg:CCZ 17 flags)
>> (compare:CCZ (reg:QI 108 [ D.2186 ])
>> (const_int 0 [0]))) a.c:31 -1
>>  (nil))
>>
>> (jump_insn 99 98 100 (set (pc)
>> (if_then_else (eq (reg:CCZ 17 flags)
>> (const_int 0 [0]))
>> (label_ref 0)
>> (pc))) a.c:31 -1
>>  (expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
>> (nil)))
>>
>> (code_label 100 99 0 14 "" [0 uses])
>>
>> That is because we TER together "_26 = _25 | _24" and "if (_26 != 0)"
>>
>> First I think the logic of do_jump should really be moved to trees.  It is not
>> doing things that can not be adequately represented by gimple.
>>
>> I am not that certain we want to move it before profiling though.
>>>
>>> Presumably we had the correct profile data for both blocks, but the
>>> accuracy was reduced when the OR was represented as a logical
>>> computation with a single branch. We could change the expansion code
>>> to do something different, e.g. treat as a 50-50 branch. But we would
>>> still end up with integer truncation issues when there was a single
>>> training run. But that could be dealt with conservatively in the
>>
>> Yep, but it is still better than what we have now - if the test above was
>> in hot part of program (i.e. not executed once), we will end up optimizing
>> the second conditional for size.
>>
>> So I think it is a do_jump bug not to distribute probabilities across the two
>> conditionals introduced.
>>> bbpart code as I suggested for the jump threading issue above. I.e. a
>>> cold block with incoming non-cold edges conservatively not marked cold
>>> for splitting.
>>
>> Yep, we can probably do that, but we ought to fix the individual cases
>> above at least for a reasonable number of runs.
>
> I made this change and it removed a few of the failures.
>
> I looked at another case that still failed with 1 train run but passed
> with 100. It turned out to be another truncation issue exposed by RTL
> expansion, where we created some control flow for a memset builtin
> which was in a block with an execution count of 1. Some of the blocks
> got frequencies less than half the original block, so the count was
> rounded down or truncated to 0. I noticed that in this case (as well
> as the jump threading case I fixed by looking for non-zero incoming
> edges in partitioning) that the bb frequency was non-zero.
>
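
The truncation described above can be reproduced with simple integer arithmetic. The helper is hypothetical and only assumes that a new block's count is derived by scaling the original count by the ratio of the two frequencies, rounding to nearest.

```c
#include <assert.h>

/* Hypothetical helper: scale COUNT by FREQ / ORIG_FREQ, rounding to
   the nearest integer.  */
long scale_count(long count, int freq, int orig_freq)
{
    assert(orig_freq > 0);
    return (count * freq + orig_freq / 2) / orig_freq;
}
```

With a single training run (count 1), a block at 40% of the original frequency gets scale_count(1, 400, 1000) == 0, while one at 60% keeps a count of 1. The frequency itself stays non-zero in both cases, which is the observation made above.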
> Why not just have probably_never_executed_bb_p simply return
> false when bb->frequency is non-zero (right now it does the opposite -
> returns true when bb->frequency is 0)? Making this change removed a
> bunch of other fail

Committed: Use ARC_{FIRST,LAST}_SIMD_VR_REG instead of decimal literals in arc_conditional_register_usage; fix reg_alloc_order for DMA config regs

2013-10-01 Thread Joern Rennecke


2013-09-06  Joern Rennecke  

* config/arc/arc.c (arc_conditional_register_usage):
Use ARC_FIRST_SIMD_VR_REG / ARC_LAST_SIMD_VR_REG.
Also set reg_alloc_order for DMA config regs.

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 51ad7d7..796c768 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -1196,7 +1196,10 @@ arc_conditional_register_usage (void)
   if (TARGET_SIMD_SET)
 {
   int i;
-  for (i=64; i<88; i++)
+  for (i = ARC_FIRST_SIMD_VR_REG; i <= ARC_LAST_SIMD_VR_REG; i++)
+   reg_alloc_order [i] = i;
+  for (i = ARC_FIRST_SIMD_DMA_CONFIG_REG;
+  i <= ARC_LAST_SIMD_DMA_CONFIG_REG; i++)
reg_alloc_order [i] = i;
 }
   /* For Arctangent-A5 / ARC600, lp_count may not be read in an instruction


Re: [PATCH 2/6] Andes nds32: machine description of nds32 porting (3).

2013-10-01 Thread Richard Sandiford
Chung-Ju Wu  writes:
>>> +mno-ctor-dtor
>>> +Target Report RejectNegative
>>> +Disable constructor/destructor feature.
>> 
>> How is this option used?
>> 
>
> It affects how we link crt stuff.  Refer to nds32.h:
>
> /* The option -mno-ctor-dtor can disable constructor/destructor feature
>by applying different crt stuff.  In the convention, crt0.o is the
>startup file without constructor/destructor;
>crt1.o, crti.o, crtbegin.o, crtend.o, and crtn.o are the
>startup files with constructor/destructor.
>Note that crt0.o, crt1.o, crti.o, and crtn.o are provided
>by newlib/mculib/glibc/ublic, while crtbegin.o and crtend.o are
>currently provided by GCC for nds32 target.
>
>For nds32 target so far:
>If -mno-ctor-dtor, we are going to link
>"crt0.o [user objects]".
>If general cases, we are going to link
>"crt1.o crtbegin1.o [user objects] crtend1.o".  */
> #define STARTFILE_SPEC \
>   " %{!mno-ctor-dtor:crt1.o%s;:crt0.o%s}" \
>   " %{!mno-ctor-dtor:crtbegin1.o%s}"
> #define ENDFILE_SPEC \
>   " %{!mno-ctor-dtor:crtend1.o%s}"

Oops, sorry, I forgot to check the specs.

> +mforce-fp-as-gp
> +Target Report Mask(FORCE_FP_AS_GP)
> +Prevent $fp being allocated during register allocation so that compiler is 
> able to force performing fp-as-gp optimization.

Maybe:

  Prevent $fp from being used by the register allocator, so that it is always 
available for the fp-as-gp optimization.

Looks good otherwise, thanks.

Richard


Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread H.J. Lu
On Sun, Sep 22, 2013 at 2:29 AM, Uros Bizjak  wrote:
> On Wed, Sep 18, 2013 at 3:45 PM, Zamyatin, Igor  
> wrote:
>> Ccing Uros. Changes in i386.md could be related to the fix for PR57954.
>
>> -Original Message-
>> From: Wei Mi [mailto:w...@google.com]
>> Sent: Thursday, September 12, 2013 2:51 AM
>> To: GCC Patches
>> Cc: David Li; Zamyatin, Igor
>> Subject: [PATCH] disable use_vector_fp_converts for m_CORE_ALL
>>
>> For the following testcase 1.c, on westmere and sandybridge, performance 
>> with the option -mtune=^use_vector_fp_converts is better (improves from 
>> 3.46s to 2.83s). It means cvtss2sd is often better than
>> unpcklps+cvtps2pd on recent x86 platforms.
>>
>> 1.c:
>> float total = 0.2;
>> int k = 5;
>>
>> int main() {
>>  int i;
>>
>>  for (i = 0; i < 10; i++) {
>>total += (0.5 + k);
>>  }
>>
>>  return total == 0.3;
>> }
>>
>> assembly generated by gcc-r201963 without -mtune=^use_vector_fp_converts
>> .L2:
>> unpcklps%xmm0, %xmm0
>> subl$1, %eax
>> cvtps2pd%xmm0, %xmm0
>> addsd   %xmm1, %xmm0
>> unpcklpd%xmm0, %xmm0
>> cvtpd2ps%xmm0, %xmm0
>> jne .L2
>>
>> assembly generated by gcc-r201963 with -mtune=^use_vector_fp_converts
>> .L2:
>> cvtss2sd%xmm0, %xmm0
>> subl$1, %eax
>> addsd   %xmm1, %xmm0
>> cvtsd2ss%xmm0, %xmm0
>> jne .L2
>>
>> But for testcase 2.c (Thanks to Igor Zamyatin for the testcase), performance 
>> with the option -mtune=^use_vector_fp_converts is worse.
>> Analysis of the assembly shows the performance degradation comes from a
>> partial reg stall caused by cvtsd2ss. Adding pxor %xmm0, %xmm0 before 
>> cvtsd2ss b(,%rdx,8), %xmm0 gets the performance back.
>>
>> 2.c:
>> double b[1024];
>>
>> float a[1024];
>>
>> int main()
>> {
>> int i;
>> for(i = 0 ; i < 1024 * 1024 * 256; i++)
>>   a[i & 1023] = a[i & 1023] * (float)b[i & 1023];
>> return (int)a[512];
>> }
>>
>> without -mtune-ctrl=^use_vector_fp_converts
>> .L2:
>> movl%eax, %edx
>> addl$1, %eax
>> andl$1023, %edx
>> cmpl$268435456, %eax
>> movsd   b(,%rdx,8), %xmm0
>> cvtpd2ps%xmm0, %xmm0==> without partial reg stall
>> because of movsd.
>> mulss   a(,%rdx,4), %xmm0
>> movss   %xmm0, a(,%rdx,4)
>> jne .L2
>>
>> with -mtune-ctrl=^use_vector_fp_converts
>> .L2:
>> movl%eax, %edx
>> addl$1, %eax
>> andl$1023, %edx
>> cmpl$268435456, %eax
>> cvtsd2ssb(,%rdx,8), %xmm0   ==> with partial reg
>> stall. Needs to insert "pxor %xmm0, %xmm0" before current insn.
>> mulss   a(,%rdx,4), %xmm0
>> movss   %xmm0, a(,%rdx,4)
>> jne .L2
>>
>> So the patch is to turn off use_vector_fp_converts for m_CORE_ALL to use 
>> cvtss2sd/cvtsd2ss directly, and add "pxor %xmmreg %xmmreg" before 
>> cvtss2sd/cvtsd2ss to break the partial reg stall (similar to what r201308 does 
>> for cvtsi2ss/cvtsi2sd). Bootstrap and regression pass. OK for trunk?
>>
>> Thanks,
>> Wei Mi.
>>
>> 2013-09-11  Wei Mi  
>>
>> * config/i386/x86-tune.def (DEF_TUNE): Remove
>> m_CORE_ALL.
>> * config/i386/i386.md: Add define_peephole2 to
>> break partial reg stall for cvtss2sd/cvtsd2ss.
>
> You don't need reload_completed in peephole2 patterns.
>
> Otherwise the patch is OK.
>
> Thanks,
> Uros.

Hi Wei Mi,

Have you checked in your patch?

-- 
H.J.


Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-10-01 Thread Uros Bizjak
On Tue, Oct 1, 2013 at 9:41 AM, Ilya Enkovich  wrote:

>> >> >> The x86 part looks mostly OK (I have a couple of comments below), but
>> >> >> please first get target-independent changes reviewed and committed.
>> >> >
>> >> > Do you mean I should move bound type and mode declaration into a 
>> >> > separate patch?
>> >>
>> >> Yes, target-independent part (middle end) has to go through the
>> >> separate review to check if this part is OK. The target-dependent part
>> >> uses the infrastructure from the middle end, so it can go into the
>> >> code base only after target-independent parts are committed.
>> >
>> > I sent a separate patch for bound type and mode class 
>> > (http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01268.html). Here is target 
>> > part of the patch with fixes you mentioned. Does it look OK?
>> >
>> > Bootstrapped and checked on linux-x86_64. Still shows incorrect length 
>> > attribute computation (described here 
>> > http://gcc.gnu.org/ml/gcc/2013-07/msg00311.html).
>>
>> Please look at the attached patch that solves length computation
>> problem. The patch also implements length calculation in a generic
>> way, as proposed earlier.
>>
>> The idea is to calculate total insn length via generic "length"
>> attribute calculation from "length_nobnd" attribute, but iff
>> length_attribute is non-null. This way, we are able to decorate
>> bnd-prefixed instructions by "length_nobnd" attribute, and generic
>> part will automatically call ix86_bnd_prefixed_insn_p predicate with
>> current insn pattern. I also believe that this approach is most
>> flexible to decorate future patterns.
>>
>> The patch adds new attribute to a couple of patterns to illustrate its usage.
>>
>> Please test this approach. Modulo length calculations, improved by the
>> patch in this message, I have no further comments, but please repost
>> complete (target part) of your patch.
>
> Hi Uros,
>
> Thanks for your reply! I applied the approach you proposed for the length
> attribute. It works well. Make check is clean now.
>
> I also adjusted bound registers to recently added mask registers. Attached is 
> a new patch.
>
> Thanks,
> Ilya
>
> --
>
> 2013-09-30  Ilya Enkovich  
>
> * config/i386/constraints.md (B): New.
> (Ti): New.
> (Tb): New.
> * config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
> * config/i386/i386-modes.def (BND32): New.
> (BND64): New.
> * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
> * config/i386/i386.c (isa_opts): Add mmpx.
> (regclass_map): Add bound registers.
> (dbx_register_map): Likewise.
> (dbx64_register_map): Likewise.
> (svr4_dbx_register_map): Likewise.
> (PTA_MPX): New.
> (ix86_option_override_internal): Support MPX ISA.
> (ix86_conditional_register_usage): Support bound registers.
> (print_reg): Likewise.
> (ix86_code_end): Add MPX bnd prefix.
> (output_set_got): Likewise.
> (ix86_output_call_insn): Likewise.
> (ix86_print_operand): Add '!' (MPX bnd) print prefix support.
> (ix86_print_operand_punct_valid_p): Likewise.
> (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
> UNSPEC_BNDLDX_ADDR.
> (ix86_class_likely_spilled_p): Add bound regs support.
> (ix86_hard_regno_mode_ok): Likewise.
> (x86_order_regs_for_local_alloc): Likewise.
> (ix86_bnd_prefixed_insn_p): New.
> * config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
> (FIXED_REGISTERS): Add bound registers.
> (CALL_USED_REGISTERS): Likewise.
> (REG_ALLOC_ORDER): Likewise.
> (HARD_REGNO_NREGS): Likewise.
> (TARGET_MPX): New.
> (VALID_BND_REG_MODE): New.
> (FIRST_BND_REG): New.
> (LAST_BND_REG): New.
> (reg_class): Add BND_REGS.
> (REG_CLASS_NAMES): Likewise.
> (REG_CLASS_CONTENTS): Likewise.
> (BND_REGNO_P): New.
> (ANY_BND_REG_P): New.
> (BNDmode): New.
> (HI_REGISTER_NAMES): Add bound registers.
> * config/i386/i386.md (UNSPEC_BNDMK): New.
> (UNSPEC_BNDMK_ADDR): New.
> (UNSPEC_BNDSTX): New.
> (UNSPEC_BNDLDX): New.
> (UNSPEC_BNDLDX_ADDR): New.
> (UNSPEC_BNDCL): New.
> (UNSPEC_BNDCU): New.
> (UNSPEC_BNDCN): New.
> (UNSPEC_MPX_FENCE): New.
> (BND0_REG): New.
> (BND1_REG): New.
> (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
> (length_immediate): Likewise.
> (prefix_0f): Likewise.
> (memory): Likewise.
> (prefix_rep): Check for bnd prefix.
> (length_nobnd): New.
> (length): Use length_nobnd if specified.
> (BND): New.
> (bnd_ptr): New.
> (BNDCHECK): New.
> (bndcheck): New.
> (*jcc_1): Add bnd prefix and rename length attr to length_nobnd.
> (*jcc_2): Likewise.
>  

[RFC] Change dependency-generating compiler from $(CC) to $(CXX) in gcc/

2013-10-01 Thread Jan-Benedict Glaw
Hi!

I'm trying to build GCC on an AIX ppc system (gcc111.fsffrance.org),
where IBM's XL C compiler is installed, but not XL C++. So my
configury uses CC=/usr/bin/xlC, and a default value (g++) for CXX.

  Because $CC is used for dependency checking (gcc/configure.ac::977
or gcc/configure:8765), `depcomp' detects "aix" style dependency
generation, but that'll fail because g++ is actually used for
compilation. (This will result in all the .o files containing
dependency information.)

  Since we're using $(CXX) instead of $(CC) these days, I suggest
using CXX instead of CC. A quick check (just changing CC to CXX in
gcc/configure, without regenerating all the configure and Makefile
files) makes it build properly. (A proper change would also rename
some variables, s/CC/CXX/, so a few more files would need to be
touched.)

  There are, however, some more directories (other than gcc/) that
might be equally affected: Searching for CCDEPMODE also reveals:

* boehm-gc
* libatomic
* libcpp
* libffi
* libgfortran
* libgo
* libgomp
* libitm
* libjava
* libmudflap
* libquadmath
* libsanitizer
* libssp
* libvtv
* lto-plugin
* zlib

I haven't checked these (whether they're built with CC or CXX), but those
which use CXX (if there are any) should probably also be changed.

  My basic question though is: Shall I cook up a patch to do the
proper transition from CC to CXX for dependency generation? Or better
keep it as is (which I consider broken, but probably few people will
ever experience that problem)?

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Everything will be fine! ...and today it's already getting
a little better.
the second  :




Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Bill Schmidt


On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote:
> OK, thanks.  The problem that you've encountered is that you are
> attempting to do something illegal. ;)  (Bin's original patch is
> actually to blame for that, as well as me for not catching it then.)
> 
> As your new test shows, it is unsafe to do the transformation in
> backtrace_base_for_ref when widening from an unsigned type, because the
> unsigned type has wrap semantics by default.  (The actual test must be
> done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or
> removed by compile option -- see the comments with legal_cast_p and
> legal_cast_p_1 later in the module.)
> 
> You cannot in general prove that the transformation is allowable for a
> specific constant, because you don't know that what you're adding it to
> won't cause an overflow that's handled incorrectly.
> 
> I believe the correct fix for the unsigned-overflow case is to fail
> backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns
> false, where in_type is the type of the new *PBASE, and out_type is the
> widening type that you're looking through.  So you can't just
> STRIP_NOPS, you have to check the cast for legitimacy for this
> transformation.
> 
> This does not explain why backtrace_base_for_ref does not find all the
> opportunities on slsr-39.c.  I don't immediately see what's preventing
> that.  Note that the transformation is legal in that case because you
> are widening from a signed int to an unsigned int, which won't cause
> problems.  You guys need to dig deeper into why those opportunities are
> missed when sizetype is larger than int.  Let me know if you need help
> figuring it out.

Sorry, I had to leave before and wanted to get this response back to you
in case I didn't get back soon.  I've looked at this some more, and your
general approach should work ok once you get the legal_cast_p check in
place where you do the get_unwidened call now.  Once you know you have a
legal widening, you don't have to worry about the safe_to_multiply_p
stuff.  I.e., you don't need the last two chunks in the patch to
backtrace_base_for_ref, and you don't need the unwidened_p variable.  It
should all fall out properly by just restricting your unwidening to
legal casts.
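
To make the wrap-semantics point concrete, here is a minimal standalone sketch (not taken from the patch; the function names are made up for illustration). It assumes a 32-bit unsigned index and 4-byte elements, as in the quoted comment, and compares the offset the program actually specifies with the offset obtained by looking through the widening cast and folding the constant in the 64-bit space:

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of element (n - 1) in an array of 4-byte items, computed as
   the source program specifies it: the index is formed in 32-bit unsigned
   arithmetic (which wraps) and only then widened to 64 bits.  */
static uint64_t
offset_as_written (uint32_t n)
{
  uint32_t idx = n + 4294967295u;	/* n - 1, modulo 2^32 */
  return (uint64_t) idx * 4;
}

/* The unsafe rewrite: widen N first and add the constant in 64 bits.
   Nothing wraps here, so the result differs whenever the 32-bit addition
   above would have wrapped.  */
static uint64_t
offset_unsafely_widened (uint32_t n)
{
  return ((uint64_t) n + 4294967295u) * 4;
}

/* Hypothetical usage: the two computations disagree for n = 5.  */
void
demo_widening (void)
{
  assert (offset_as_written (5) == 16);
  assert (offset_unsafely_widened (5) != 16);
}
```

In this sketch the two functions agree only for n == 0, which is why no check on the constant alone can make the rewrite safe; restricting the unwidening to legal casts avoids the question entirely.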

Thanks,
Bill

> 
> Thanks,
> Bill
> 
> On Tue, 2013-10-01 at 16:06 +0100, Yufeng Zhang wrote:
> > Hi Bill,
> > 
> > Thank you for the review and the offer to help.
> > 
> > On 10/01/13 15:36, Bill Schmidt wrote:
> > > On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote:
> > >> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote:
> > >>> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang  
> > >>> wrote:
> >  Hello,
> > 
> >  Please find the updated version of the patch in the attachment.  It has
> >  addressed the previous comments and also included some changes in 
> >  order to
> >  pass the bootstrapping on x86_64.
> > 
> >  It's also passed the regtest on arm-none-eabi and aarch64-none-elf.
> > 
> >  It will also fix the test failure as reported here:
> >  http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html
> > 
> >  OK for the trunk?
> > >>>
> > >>> +   where n is a 32-bit unsigned int and pointer are 64-bit long.  In 
> > >>> this
> > >>> +   case, the gimple for (n - 1) is:
> > >>> +
> > >>> + _2 = n_1(D) + 4294967295; // 0xffffffff
> > >>> +
> > >>> +   and it is wrong to multiply the large constant by 4 in the 64-bit 
> > >>> space.  */
> > >>> +
> > >>> +static bool
> > >>> +safe_to_multiply_p (tree type, double_int cst)
> > >>> +{
> > >>> +  if (TYPE_UNSIGNED (type)
> > >>> +&&  ! double_int_fits_to_tree_p (signed_type_for (type), cst))
> > >>> +return false;
> > >>> +
> > >>> +  return true;
> > >>> +}
> > >>>
> > >>> This looks wrong.  The only relevant check is as whether the
> > >>> multiplication overflows the original type as you miss the implicit
> > >>> truncation that happens.  Which is something you don't know
> > >>> unless you know the value.  It definitely isn't a property of a type
> > >>> and a constant but the property of two constants and a type.
> > >>> Or the predicate has a wrong name.
> > >>>
> > >>> The use of get_unwidened in this core routine looks like this is
> > >>> all happening in the wrong place and we should have picked up
> > >>> another candidate for this instead?  I'm sure Bill will know more here.
> > >>
> > >> I'm not happy with how this patch is progressing.  Without having looked
> > >> too deeply, this might be better handled earlier when determining which
> > >> casts are safe to use in building candidates.  What you have here seems
> > >> more like closing the barn door after the horse got out.  Maybe that's
> > >> the only solution, but it doesn't seem likely.
> > >>
> > >> Another problem is that your test case isn't testing anything except
> > >> that the compiler doesn't crash.  That isn't sufficient as a regression
> > >> test.
> > >>
> > >> I'll spend some time loo

[wwwdocs] Buildstat update for 4.8

2013-10-01 Thread Tom G. Christensen
Latest results for gcc 4.8.x.

-tgc

Testresults for 4.8.1
  sparc64-sun-solaris2.9

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/buildstat.html,v
retrieving revision 1.4
diff -u -r1.4 buildstat.html
--- buildstat.html  6 Aug 2013 10:55:27 -   1.4
+++ buildstat.html  1 Oct 2013 20:12:42 -
@@ -195,6 +195,14 @@
 
 
 
+sparc64-sun-solaris2.9
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-09/msg02155.html";>4.8.1
+
+
+
+
 sparc64-sun-solaris2.10
  
 Test results:


[wwwdocs] Buildstat update for 4.7

2013-10-01 Thread Tom G. Christensen
Latest results for gcc 4.7.x.

-tgc

Testresults for 4.7.3
  sparc64-sun-solaris2.8

Testresults for 4.7.2
  i386-apple-darwin9.8.0

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/buildstat.html,v
retrieving revision 1.10
diff -u -r1.10 buildstat.html
--- buildstat.html  4 Aug 2013 21:38:10 -   1.10
+++ buildstat.html  1 Oct 2013 20:14:07 -
@@ -94,6 +94,14 @@
 
 
 
+i386-apple-darwin9.8.0
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00755.html";>4.7.2
+
+
+
+
 i386-apple-darwin10.8.0
  
 Test results:
@@ -234,6 +242,14 @@
 
 
 
+sparc64-sun-solaris2.8
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-09/msg02154.html";>4.7.3
+
+
+
+
 sparc64-sun-solaris2.10
  
 Test results:


[wwwdocs] Buildstat update for 4.5

2013-10-01 Thread Tom G. Christensen
Latest results for 4.5.x

-tgc

Testresults for 4.5.4:
  i386-pc-solaris2.8 (2)
  i386-pc-solaris2.9
  sparc-sun-solaris2.7 (2)
  sparc-sun-solaris2.8
  sparc-sun-solaris2.9
  sparc64-sun-solaris2.7
  sparc64-sun-solaris2.8
  sparc64-sun-solaris2.9

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.5/buildstat.html,v
retrieving revision 1.15
diff -u -r1.15 buildstat.html
--- buildstat.html  3 Oct 2012 23:43:53 -   1.15
+++ buildstat.html  1 Oct 2013 20:17:08 -
@@ -143,6 +143,8 @@
 i386-pc-solaris2.8
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00251.html";>4.5.4,
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00250.html";>4.5.4,
 http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02309.html";>4.5.3,
 http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01359.html";>4.5.3,
 http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01215.html";>4.5.3,
@@ -159,6 +161,7 @@
 i386-pc-solaris2.9
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00223.html";>4.5.4,
 http://gcc.gnu.org/ml/gcc-testresults/2010-12/msg01902.html";>4.5.2,
 http://gcc.gnu.org/ml/gcc-testresults/2010-04/msg01571.html";>4.5.0
 
@@ -292,6 +295,8 @@
 sparc-sun-solaris2.7
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00677.html";>4.5.4,
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00676.html";>4.5.4,
 http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03485.html";>4.5.3,
 http://gcc.gnu.org/ml/gcc-testresults/2010-06/msg01449.html";>4.5.0
 
@@ -301,6 +306,7 @@
 sparc-sun-solaris2.8
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00234.html";>4.5.4,
 http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00833.html";>4.5.3,
 http://gcc.gnu.org/ml/gcc-testresults/2011-01/msg01684.html";>4.5.2,
 http://gcc.gnu.org/ml/gcc-testresults/2010-12/msg01975.html";>4.5.2,
@@ -318,6 +324,7 @@
 sparc-sun-solaris2.9
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00240.html";>4.5.4,
 http://gcc.gnu.org/ml/gcc-testresults/2010-12/msg01974.html";>4.5.2,
 http://gcc.gnu.org/ml/gcc-testresults/2010-10/msg02205.html";>4.5.1
 
@@ -336,6 +343,30 @@
 
 
 
+sparc64-sun-solaris2.7
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-09/msg02151.html";>4.5.4
+
+
+
+
+sparc64-sun-solaris2.8
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-09/msg02152.html";>4.5.4
+
+
+
+
+sparc64-sun-solaris2.9
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-09/msg02153.html";>4.5.4
+
+
+
+
 sparc-unknown-linux-gnu
  
 Test results:


Re: [patch] move htab_iterator

2013-10-01 Thread Tom Tromey
> "Andrew" == Andrew MacLeod  writes:

Andrew> Sure, how's this?

Thanks!

Andrew> And who has to approve the libiberty bits?

libiberty   DJ Delorie  d...@redhat.com
libiberty   Ian Lance Taylor  i...@airs.com

Tom


[wwwdocs] Buildstat update for 4.4

2013-10-01 Thread Tom G. Christensen
Latest results for 4.4.x

-tgc

Testresults for 4.4.7:
  i386-pc-solaris2.8 (2)
  i386-pc-solaris2.9
  sparc-sun-solaris2.7 (2)
  sparc-sun-solaris2.8
  sparc-sun-solaris2.9

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.4/buildstat.html,v
retrieving revision 1.27
diff -u -r1.27 buildstat.html
--- buildstat.html  3 Oct 2012 21:05:01 -   1.27
+++ buildstat.html  1 Oct 2013 20:24:10 -
@@ -187,6 +187,8 @@
 i386-pc-solaris2.8
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00249.html";>4.4.7,
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00248.html";>4.4.7,
 http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02185.html";>4.4.7,
 http://gcc.gnu.org/ml/gcc-testresults/2010-07/msg00900.html";>4.4.4,
 http://gcc.gnu.org/ml/gcc-testresults/2009-05/msg00105.html";>4.4.0
@@ -197,6 +199,7 @@
 i386-pc-solaris2.9
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00222.html";>4.4.7,
 http://gcc.gnu.org/ml/gcc-testresults/2010-05/msg00264.html";>4.4.4,
 http://gcc.gnu.org/ml/gcc-testresults/2009-11/msg01365.html";>4.4.2,
 http://gcc.gnu.org/ml/gcc-testresults/2009-08/msg00867.html";>4.4.1,
@@ -379,6 +382,8 @@
 sparc-sun-solaris2.7
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00675.html";>4.4.7,
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00674.html";>4.4.7,
 http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg00145.html";>4.4.6
 http://gcc.gnu.org/ml/gcc-testresults/2009-07/msg00420.html";>4.4.0
 
@@ -388,6 +393,7 @@
 sparc-sun-solaris2.8
  
 Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00233.html";>4.4.7,
 http://gcc.gnu.org/ml/gcc-testresults/2010-05/msg00428.html";>4.4.4,
 http://gcc.gnu.org/ml/gcc-testresults/2010-04/msg00431.html";>4.4.3,
 http://gcc.gnu.org/ml/gcc-testresults/2010-02/msg01276.html";>4.4.3,
@@ -397,6 +403,14 @@
 
 
 
+sparc-sun-solaris2.9
+ 
+Test results:
+http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg00239.html";>4.4.7
+
+
+
+
 sparc-sun-solaris2.10
  
 Test results:


[PATCH] Do not append " *INTERNAL* " to the decl name

2013-10-01 Thread Dehao Chen
Hi,

This patch stops the C++ front end from appending the " *INTERNAL* "
suffix to the mangled names of maybe-in-charge constructors and
destructors. This is needed because these functions can be emitted in
the debug info, and we want to be able to demangle their names.

Bootstrapped and passed all regression tests.

OK for trunk?

Thanks,
Dehao

gcc/ChangeLog:

2013-10-01  Dehao Chen  

* cp/mangle.c (write_special_name_constructor): Remove the
INTERNAL suffix.

Index: gcc/cp/mangle.c
===
--- gcc/cp/mangle.c (revision 202991)
+++ gcc/cp/mangle.c (working copy)
@@ -687,13 +687,6 @@ write_mangled_name (const tree decl, bool top_leve
 mangled_name:;
   write_string ("_Z");
   write_encoding (decl);
-  if (DECL_LANG_SPECIFIC (decl)
-  && (DECL_MAYBE_IN_CHARGE_DESTRUCTOR_P (decl)
-  || DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (decl)))
- /* We need a distinct mangled name for these entities, but
-   we should never actually output it.  So, we append some
-   characters the assembler won't like.  */
- write_string (" *INTERNAL* ");
 }
 }

@@ -1654,8 +1647,7 @@ write_identifier (const char *identifier)
Currently, allocating constructors are never used.

We also need to provide mangled names for the maybe-in-charge
-   constructor, so we treat it here too.  mangle_decl_string will
-   append *INTERNAL* to that, to make sure we never emit it.  */
+   constructor, so we treat it here too.  */

 static void
 write_special_name_constructor (const tree ctor)
@@ -1682,8 +1674,7 @@ write_special_name_constructor (const tree ctor)
 ::= D2 # base object (not-in-charge) destructor

We also need to provide mangled names for the maybe-incharge
-   destructor, so we treat it here too.  mangle_decl_string will
-   append *INTERNAL* to that, to make sure we never emit it.  */
+   destructor, so we treat it here too.  */

 static void
 write_special_name_destructor (const tree dtor)


Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread Wei Mi
> Hi Wei Mi,
>
> Have you checked in your patch?
>
> --
> H.J.

No, I haven't. Honza wants me to wait for his testing on AMD hardware.
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01603.html


Re: [RFC] Change dependency-generating compiler from $(CC) to $(CXX) in gcc/

2013-10-01 Thread Tom Tromey
> "Jan-Benedict" == Jan-Benedict Glaw  writes:

Jan-Benedict> Since we're using $(CXX) instead of $(CC) these days, I
Jan-Benedict> suggest using CXX instead of CC. A quick check (without
Jan-Benedict> regenerating all configure and Makefile files just
Jan-Benedict> changing CC to CXX in gcc/configure) makes it build
Jan-Benedict> properly.

I think that would be fine.

Jan-Benedict>   There are, however, some more directories (other than gcc/) that
Jan-Benedict> might be equally affected: Searching for CCDEPMODE also reveals:

I wouldn't worry about these.

Many of them are target libraries.  These will be built with the GCC
that was just built.

The others that I know of are C libraries, not C++.

Tom


[RFC] [PATCH] Change dependency-generating compiler from $(CC) to $(CXX) in gcc/

2013-10-01 Thread Jan-Benedict Glaw
On Tue, 2013-10-01 20:49:30 +0200, Jan-Benedict Glaw  wrote:
[...]
>   Since we're using $(CXX) instead of $(CC) these days, I suggest
> using CXX instead of CC. A quick check (without regenerating all
> configure and Makefile files just changing CC to CXX in gcc/configure)
> makes it build properly. (If done properly, some variables will change
> their name s/CC/CXX/, so that must be changed in some more files if
> done properly.)

For gcc/, I prepared this patch, which seems to work:


2013-10-01  Jan-Benedict Glaw  

gcc/
* configure.ac: Use CXX instead of CC for dependency generation.
* Makefile.in: Change CCDEPMODE to CXXDEPMODE.
* configure: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index f216962..105c534 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -974,7 +974,7 @@ AC_CONFIG_COMMANDS([gccdepdir],[
   ${CONFIG_SHELL-/bin/sh} $ac_aux_dir/mkinstalldirs $lang/$DEPDIR
   done], [subdirs="$subdirs" ac_aux_dir=$ac_aux_dir DEPDIR=$DEPDIR])
 
-ZW_PROG_COMPILER_DEPENDENCIES([CC])
+ZW_PROG_COMPILER_DEPENDENCIES([CXX])
 AC_LANG_POP(C++)
 
 # 
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f55f1d1..9a6369e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -309,7 +309,7 @@ write_entries_to_file = $(shell rm -f $(2) || :) $(shell touch $(2)) \
 # 
 
 # Dependency tracking stuff.
-CCDEPMODE = @CCDEPMODE@
+CXXDEPMODE = @CXXDEPMODE@
 DEPDIR = @DEPDIR@
 depcomp = $(SHELL) $(srcdir)/../depcomp
 
@@ -1029,7 +1029,7 @@ INCLUDES = -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@D) \
   $(CLOOGINC) $(ISLINC)
 
 COMPILE.base = $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) -o $@
-ifeq ($(CCDEPMODE),depmode=gcc3)
+ifeq ($(CXXDEPMODE),depmode=gcc3)
 # Note a subtlety here: we use $(@D) for the directory part, to make
 # things like the go/%.o rule work properly; but we use $(*F) for the
 # file part, as we just want the file part of the stem, not the entire
@@ -1038,7 +1038,7 @@ COMPILE = $(COMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(*F).TPo
 POSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(*F).TPo $(@D)/$(DEPDIR)/$(*F).Po
 else
 COMPILE = source='$<' object='$@' libtool=no \
-DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) $(COMPILE.base)
+DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) $(COMPILE.base)
 POSTCOMPILE =
 endif
 
diff --git a/gcc/configure b/gcc/configure
index 2ac0347..0ca79bd 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -735,7 +735,7 @@ LDEXP_LIB
 EXTRA_GCC_LIBS
 GNAT_LIBEXC
 COLLECT2_LIBS
-CCDEPMODE
+CXXDEPMODE
 DEPDIR
 am__leading_dot
 CXXCPP
@@ -8762,12 +8762,12 @@ ac_config_commands="$ac_config_commands depdir"
 ac_config_commands="$ac_config_commands gccdepdir"
 
 
-depcc="$CC"   am_compiler_list=
+depcc="$CXX"  am_compiler_list=
 
 am_depcomp=$ac_aux_dir/depcomp
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking dependency style of $depcc" >&5
 $as_echo_n "checking dependency style of $depcc... " >&6; }
-if test "${am_cv_CC_dependencies_compiler_type+set}" = set; then :
+if test "${am_cv_CXX_dependencies_compiler_type+set}" = set; then :
   $as_echo_n "(cached) " >&6
 else
   if test -f "$am_depcomp"; then
@@ -8789,7 +8789,7 @@ else
   # directory.
   mkdir sub
 
-  am_cv_CC_dependencies_compiler_type=none
+  am_cv_CXX_dependencies_compiler_type=none
   if test "$am_compiler_list" = ""; then
  am_compiler_list=`sed -n 's/^\([a-zA-Z0-9]*\))$/\1/p' < ./depcomp`
   fi
@@ -8834,7 +8834,7 @@ else
   #   icc: Command line remark: option '-MP' not supported
   if (grep 'ignoring option' conftest.err ||
   grep 'not supported' conftest.err) >/dev/null 2>&1; then :; else
-am_cv_CC_dependencies_compiler_type=$depmode
+am_cv_CXX_dependencies_compiler_type=$depmode
$as_echo "$as_me:$LINENO: success" >&5
 break
   fi
@@ -8846,15 +8846,15 @@ else
   cd ..
   rm -rf conftest.dir
 else
-  am_cv_CC_dependencies_compiler_type=none
+  am_cv_CXX_dependencies_compiler_type=none
 fi
 
 fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $am_cv_CC_dependencies_compiler_type" >&5
-$as_echo "$am_cv_CC_dependencies_compiler_type" >&6; }
-if test x${am_cv_CC_dependencies_compiler_type-none} = xnone
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $am_cv_CXX_dependencies_compiler_type" >&5
+$as_echo "$am_cv_CXX_dependencies_compiler_type" >&6; }
+if test x${am_cv_CXX_dependencies_compiler_type-none} = xnone
 then as_fn_error "no usable dependency style found" "$LINENO" 5
-else CCDEPMODE=depmode=$am_cv_CC_dependencies_compiler_type
+else CXXDEPMODE=depmode=$am_cv_CXX_dependencies_compiler_type
 
 fi
 


-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Give your best. Then surpass yourself!
the second  :




Re: [Patch, Fortran] PR55469 - fix I/O memory leaks in case of failure and iostat= being present

2013-10-01 Thread Tobias Burnus
I have committed the patch after Janne's approval on IRC  and fresh 
building/regtesting as Rev. 203086. See 
http://gcc.gnu.org/ml/fortran/2012-11/msg00092.html


As suggested by Janne, I will backport it to 4.8 in a few days.

Tobias

On November 29, 2012 12:38, Tobias Burnus wrote:

Tobias Burnus wrote:
l_push_char allocates memory which is freed with free_line. However, 
currently, the memory is not always freed when calling 
generate_error. If one aborts, that's fine. However, generate_error 
can also set the iostat variable.
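
As an illustration of the leak pattern described above, here is a small self-contained C sketch. It is not libgfortran code: the toy_* names and the reader structure are invented for this example and only loosely modelled on the l_push_char/free_line pair. The point is that an error path which records a status and returns, instead of aborting, must release the line buffer itself:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reader state: a line buffer that grows as characters are
   pushed, plus a status word playing the role of iostat.  */
typedef struct
{
  char *line;
  size_t used;
  int iostat;			/* nonzero once an error was recorded */
} toy_reader;

/* Append C to the line buffer; returns 0 on success, -1 if out of memory.  */
static int
toy_push_char (toy_reader *r, char c)
{
  char *grown = realloc (r->line, r->used + 1);
  if (grown == NULL)
    return -1;
  r->line = grown;
  r->line[r->used++] = c;
  return 0;
}

/* Release the line buffer and reset the reader to an empty line.  */
static void
toy_free_line (toy_reader *r)
{
  free (r->line);
  r->line = NULL;
  r->used = 0;
}

/* Record an error without aborting, as happens when iostat= is present.  */
static void
toy_generate_error (toy_reader *r, int code)
{
  r->iostat = code;
}

/* Collect decimal digits from INPUT.  On failure the buffer is freed
   before the error is recorded; omitting that call is the leak.  */
int
toy_read_digits (toy_reader *r, const char *input)
{
  assert (r != NULL && input != NULL);
  for (; *input != '\0'; input++)
    {
      int bad_char = *input < '0' || *input > '9';
      if (bad_char || toy_push_char (r, *input) != 0)
	{
	  toy_free_line (r);
	  toy_generate_error (r, bad_char ? 1 : 2);
	  return -1;
	}
    }
  return 0;
}
```

On success the caller still owns the buffer and is expected to call toy_free_line once it is done with the line.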


Updated version: Corrected PR number - and ensured that if 
convert_real fails, free_saved is called (cf. additional test case in 
the PR).


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias




Fwd: Fwd: Re: [patch] move htab_iterator

2013-10-01 Thread Andrew MacLeod

This bounced from gcc-patches; some HTML must have crept in.
Andrew

 Original Message 
Subject:Fwd: Re: [patch] move htab_iterator
Date:   Tue, 01 Oct 2013 16:54:52 -0400
From:   Andrew MacLeod 
To: 	gcc-patches , tromey , 
Ian Lance Taylor , DJ Delorie 





Anyone interested in approving the move of these hash table iterators 
into libiberty?  GCC is no longer using them, so I'm either deleting 
them, or moving them...


Thanks
Andrew

 Original Message 
Subject:Re: [patch] move htab_iterator
Date:   Mon, 30 Sep 2013 13:43:14 -0400
From:   Andrew MacLeod 
To: Tom Tromey 
CC: 	gcc-patches , Richard Biener 
, Jakub Jelinek 




On 09/30/2013 01:02 PM, Tom Tromey wrote:

Tom> How about putting it into libiberty?
Tom> That way other hashtab users, like gdb, can use it.

Andrew> I have no problem with that, but Jakub didn't seem to think it
Andrew> belonged there.

All I found was this:

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00721.html

Quoting from it: "It doesn't belong to hashtab.h, because that is a
libiberty API, this style of iterators is GCC specific."

I think that's an accurate assessment of the current code, but I don't
see why it has to continue to be that way.

My argument in favor of moving it to libiberty is that other programs
can then use it; and furthermore that since it is tightly tied to the
hashtab implementation, it ought to be maintained there in order to
preserve the module boundary.

So, please reconsider.


Sure, how's this?

And who has to approve the libiberty bits?

Bootstrapping now... but since it's unused I doubt that will be an issue
:-)...

Andrew









	gcc
	* tree-flow.h (htab_iterator, FOR_EACH_HTAB_ELEMENT): Move from here.
	* tree-flow-inline.h (first_htab_element, end_htab_p,
	next_htab_element): Also move from here.

	include
	* hashtab.h (htab_iterator, FOR_EACH_HTAB_ELEMENT,
	first_htab_element, end_htab_p, next_htab_element): Move to here.
	Change boolean to int and 0/1.
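
For readers who have not used these iterators, the following is a rough standalone sketch of the pattern being moved. It deliberately does not use libiberty: toy_table, toy_iterator and the TOY_* names are stand-ins invented here, with TOY_EMPTY and TOY_DELETED playing the role of HTAB_EMPTY_ENTRY and HTAB_DELETED_ENTRY. The iterator walks the slot array and skips empty and deleted slots:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot markers, mirroring the empty/deleted convention.  */
#define TOY_EMPTY   ((void *) 0)
#define TOY_DELETED ((void *) 1)

/* Toy table: just the slot array and its length.  */
typedef struct
{
  void **entries;
  size_t size;
} toy_table;

typedef struct
{
  void **slot;
  void **limit;
} toy_iterator;

/* Nonzero if X is a real element rather than a marker.  */
static int
toy_live_p (void *x)
{
  return x != TOY_EMPTY && x != TOY_DELETED;
}

/* Point ITER at the first live slot of TABLE and return its element,
   or NULL if the table has none.  */
static void *
toy_first (toy_iterator *iter, toy_table *table)
{
  assert (iter != NULL && table != NULL);
  iter->slot = table->entries;
  iter->limit = table->entries + table->size;
  while (iter->slot < iter->limit && !toy_live_p (*iter->slot))
    iter->slot++;
  return iter->slot < iter->limit ? *iter->slot : NULL;
}

/* Nonzero once ITER has run past the last slot.  */
static int
toy_end_p (const toy_iterator *iter)
{
  return iter->slot >= iter->limit;
}

/* Advance ITER to the next live slot and return its element, or NULL.  */
static void *
toy_next (toy_iterator *iter)
{
  while (++iter->slot < iter->limit)
    if (toy_live_p (*iter->slot))
      return *iter->slot;
  return NULL;
}

#define TOY_FOR_EACH(TABLE, RESULT, TYPE, ITER) \
  for (RESULT = (TYPE) toy_first (&(ITER), (TABLE)); \
       !toy_end_p (&(ITER)); \
       RESULT = (TYPE) toy_next (&(ITER)))

/* Usage: sum the ints in a table with markers mixed in.  */
int
toy_sum (void)
{
  int a = 1, b = 2, c = 4;
  void *slots[] = { TOY_EMPTY, &a, TOY_DELETED, &b, TOY_EMPTY, &c };
  toy_table table = { slots, sizeof slots / sizeof slots[0] };
  toy_iterator iter;
  int *p;
  int sum = 0;

  TOY_FOR_EACH (&table, p, int *, iter)
    sum += *p;
  return sum;
}
```

Unlike the code in the patch, the sketch checks the limit before reading the first slot, so it also tolerates a zero-sized slot array; that is a choice made for the example, not a claim about hashtab.
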

Index: gcc/tree-flow.h
===
*** gcc/tree-flow.h	(revision 203034)
--- gcc/tree-flow.h	(working copy)
*** struct GTY(()) gimple_df {
*** 92,112 
htab_t GTY ((param_is (struct tm_restart_node))) tm_restart;
  };
  
- 
- typedef struct
- {
-   htab_t htab;
-   PTR *slot;
-   PTR *limit;
- } htab_iterator;
- 
- /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER,
-storing each element in RESULT, which is of type TYPE.  */
- #define FOR_EACH_HTAB_ELEMENT(HTAB, RESULT, TYPE, ITER) \
-   for (RESULT = (TYPE) first_htab_element (&(ITER), (HTAB)); \
- 	!end_htab_p (&(ITER)); \
- 	RESULT = (TYPE) next_htab_element (&(ITER)))
- 
  /* It is advantageous to avoid things like life analysis for variables which
 do not need PHI nodes.  This enum describes whether or not a particular
 variable may need a PHI node.  */
--- 92,97 
Index: gcc/tree-flow-inline.h
===
*** gcc/tree-flow-inline.h	(revision 203034)
--- gcc/tree-flow-inline.h	(working copy)
*** gimple_vop (const struct function *fun)
*** 42,93 
return fun->gimple_df->vop;
  }
  
- /* Initialize the hashtable iterator HTI to point to hashtable TABLE */
- 
- static inline void *
- first_htab_element (htab_iterator *hti, htab_t table)
- {
-   hti->htab = table;
-   hti->slot = table->entries;
-   hti->limit = hti->slot + htab_size (table);
-   do
- {
-   PTR x = *(hti->slot);
-   if (x != HTAB_EMPTY_ENTRY && x != HTAB_DELETED_ENTRY)
- 	break;
- } while (++(hti->slot) < hti->limit);
- 
-   if (hti->slot < hti->limit)
- return *(hti->slot);
-   return NULL;
- }
- 
- /* Return current non-empty/deleted slot of the hashtable pointed to by HTI,
-or NULL if we have  reached the end.  */
- 
- static inline bool
- end_htab_p (const htab_iterator *hti)
- {
-   if (hti->slot >= hti->limit)
- return true;
-   return false;
- }
- 
- /* Advance the hashtable iterator pointed to by HTI to the next element of the
-hashtable.  */
- 
- static inline void *
- next_htab_element (htab_iterator *hti)
- {
-   while (++(hti->slot) < hti->limit)
- {
-   PTR x = *(hti->slot);
-   if (x != HTAB_EMPTY_ENTRY && x != HTAB_DELETED_ENTRY)
- 	return x;
- };
-   return NULL;
- }
- 
  /* Get the number of the next statement uid to be allocated.  */
  static inline unsigned int
  gimple_stmt_max_uid (struct function *fn)
--- 42,47 
Index: include/hashtab.h
===
*** include/hashtab.h	(revision 203034)
--- include/hashtab.h	(working copy)
*** extern hashval_t iterative_hash (const v
*** 202,207 
--- 202,270 
  /* Shorthand for hashing something with an intrinsic size.  */
  #define iterative_hash_object(OB,INIT) iterative_hash (&OB, sizeof (OB), INIT)
  
+ /* GCC style hash table iterator.  */
+ 
+

[Patch, Fortran, committed] PR58579 - fix allocation of string temporaries: Avoid overallocation

2013-10-01 Thread Tobias Burnus
In gfc_conv_string_tmp, gfortran allocates temporary strings. However, 
using "TYPE_SIZE (type)" didn't yield one byte as intended but 64 - 
which means that gfortran allocated 64 times as much memory as needed.
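
My reading of the patch is that the factor of 64 comes from taking the size, in bits, of the pointer type rather than the size, in bytes, of one character. A small standalone C sketch of that arithmetic follows; the function names are made up, and the value 64 assumes a target with 64-bit pointers:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* What the old computation amounted to under the assumption above: LEN
   multiplied by the size of a pointer in bits.  */
size_t
overallocated_bytes (size_t len)
{
  return len * (sizeof (char *) * CHAR_BIT);
}

/* What is actually needed: LEN multiplied by the byte size of a single
   character of the given kind (1 for kind=1, 4 for kind=4).  */
size_t
needed_bytes (size_t len, size_t char_bytes)
{
  assert (char_bytes == 1 || char_bytes == 4);
  return len * char_bytes;
}
```

For a kind=1 string of length 10 on such a target, that is 640 bytes requested where 10 would do.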


I wonder whether the same issue also occurs elsewhere.

Committed (Rev. ) after building and regtesting on x86-64-gnu-linux. I 
didn't see a simple way to generate a test case - but the dump of the 
PR's test case looks fine both for kind=1 and kind=4 strings.


Tobias
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(revision 203085)
+++ gcc/fortran/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2013-10-01  Tobias Burnus  
+
+	PR fortran/58579
+	* trans-expr.c (gfc_conv_string_tmp): Correctly obtain
+	the byte size of a single character.
+
 2013-09-27  Janne Blomqvist  
 
 	* intrinsic.texi (DATE_AND_TIME): Fix example.
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(revision 203085)
+++ gcc/fortran/trans-expr.c	(working copy)
@@ -2355,11 +2355,14 @@ gfc_conv_string_tmp (gfc_se * se, tree type, tree
 {
   /* Allocate a temporary to hold the result.  */
   var = gfc_create_var (type, "pstr");
-  tmp = gfc_call_malloc (&se->pre, type,
-			 fold_build2_loc (input_location, MULT_EXPR,
-	  TREE_TYPE (len), len,
-	  fold_convert (TREE_TYPE (len),
-			TYPE_SIZE (type))));
+  gcc_assert (POINTER_TYPE_P (type));
+  tmp = TREE_TYPE (type);
+  gcc_assert (TREE_CODE (tmp) == ARRAY_TYPE);
+  tmp = TYPE_SIZE_UNIT (TREE_TYPE (tmp));
+  tmp = fold_build2_loc (input_location, MULT_EXPR, size_type_node,
+			fold_convert (size_type_node, len),
+			fold_convert (size_type_node, tmp));
+  tmp = gfc_call_malloc (&se->pre, type, tmp);
   gfc_add_modify (&se->pre, var, tmp);
 
   /* Free the temporary afterwards.  */


[patch] More tree-flow.h prototypes.

2013-10-01 Thread Andrew MacLeod
This patch moves prototypes into gimple-fold.h (which already existed). 
There were a few in tree-flow.h and a bunch in gimple.h. The routines 
are used frequently enough that it makes sense to include gimple-fold.h 
from gimple.h instead of from within each .c file that needs it. 
(presumably why the prototypes were in gimple.h to begin with). I took 
gimple-fold.h out of whatever .c files it was included in.


tree-ssa-copy.h was also created for the prototypes in that file and 
included from tree-ssa.h.


Bootstraps and no new regressions.  OK?

Andrew

	* tree-flow.h: Remove some prototypes.
	* gimple-fold.h: Add prototypes from gimple.h and tree-flow.h.
	* tree-ssa-copy.h: New file.  Relocate prototypes from tree-flow.h.
	* gimple.h: Include gimple-fold.h, move prototypes into gimple-fold.h.
	* tree-ssa.h: Include tree-ssa-copy.h.
	* gimple-fold.c: Remove gimple-fold.h from include list.
	* tree-vrp.c: Remove gimple-fold.h from include list.
	* tree-ssa-sccvn.c: Remove gimple-fold.h from include list.
	* tree-ssa-ccp.c: Remove gimple-fold.h from include list.

Index: tree-flow.h
===
*** tree-flow.h	(revision 203075)
--- tree-flow.h	(working copy)
*** void mark_virtual_operands_for_renaming 
*** 297,306 
  tree get_current_def (tree);
  void set_current_def (tree, tree);
  
- /* In tree-ssa-ccp.c  */
- tree fold_const_aggregate_ref (tree);
- tree gimple_fold_stmt_to_constant (gimple, tree (*)(tree));
- 
  /* In tree-ssa-dom.c  */
  extern void dump_dominator_optimization_stats (FILE *);
  extern void debug_dominator_optimization_stats (void);
--- 297,302 
*** int loop_depth_of_name (tree);
*** 308,322 
  tree degenerate_phi_result (gimple);
  bool simple_iv_increment_p (gimple);
  
- /* In tree-ssa-copy.c  */
- extern void propagate_value (use_operand_p, tree);
- extern void propagate_tree_value (tree *, tree);
- extern void propagate_tree_value_into_stmt (gimple_stmt_iterator *, tree);
- extern void replace_exp (use_operand_p, tree);
- extern bool may_propagate_copy (tree, tree);
- extern bool may_propagate_copy_into_stmt (gimple, tree);
- extern bool may_propagate_copy_into_asm (tree);
- 
  /* In tree-ssa-loop-ch.c  */
  bool do_while_loop_p (struct loop *);
  
--- 304,309 
Index: gimple-fold.h
===
*** gimple-fold.h	(revision 203075)
--- gimple-fold.h	(working copy)
*** along with GCC; see the file COPYING3.  
*** 22,31 
  #ifndef GCC_GIMPLE_FOLD_H
  #define GCC_GIMPLE_FOLD_H
  
! tree fold_const_aggregate_ref_1 (tree, tree (*) (tree));
! tree fold_const_aggregate_ref (tree);
! 
! tree gimple_fold_stmt_to_constant_1 (gimple, tree (*) (tree));
! tree gimple_fold_stmt_to_constant (gimple, tree (*) (tree));
  
  #endif  /* GCC_GIMPLE_FOLD_H */
--- 22,43 
  #ifndef GCC_GIMPLE_FOLD_H
  #define GCC_GIMPLE_FOLD_H
  
! extern tree canonicalize_constructor_val (tree, tree);
! extern tree get_symbol_constant_value (tree);
! extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree);
! extern tree gimple_fold_builtin (gimple);
! extern tree gimple_extract_devirt_binfo_from_cst (tree, tree);
! extern bool fold_stmt (gimple_stmt_iterator *);
! extern bool fold_stmt_inplace (gimple_stmt_iterator *);
! extern tree maybe_fold_and_comparisons (enum tree_code, tree, tree, 
! 	enum tree_code, tree, tree);
! extern tree maybe_fold_or_comparisons (enum tree_code, tree, tree,
!    enum tree_code, tree, tree);
! extern tree gimple_fold_stmt_to_constant_1 (gimple, tree (*) (tree));
! extern tree gimple_fold_stmt_to_constant (gimple, tree (*) (tree));
! extern tree fold_const_aggregate_ref_1 (tree, tree (*) (tree));
! extern tree fold_const_aggregate_ref (tree);
! extern tree gimple_get_virt_method_for_binfo (HOST_WIDE_INT, tree);
! extern bool gimple_val_nonnegative_real_p (tree);
  
  #endif  /* GCC_GIMPLE_FOLD_H */
Index: tree-ssa-copy.h
===
*** tree-ssa-copy.h	(revision 0)
--- tree-ssa-copy.h	(revision 0)
***
*** 0 
--- 1,31 
+ /* Header file for copy propagation and SSA_NAME replacement.
+Copyright (C) 2013 Free Software Foundation, Inc.
+ 
+ This file is part of GCC.
+ 
+ GCC is free software; you can redistribute it and/or modify it under
+ the terms of the GNU General Public License as published by the Free
+ Software Foundation; either version 3, or (at your option) any later
+ version.
+ 
+ GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+  for more details.
+ 
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3.  If not see
+ <http://www.gnu.org/licenses/>.  */
+ 
+ #ifndef GCC_TREE_SSA_COPY_H
+ #d

Re: [patch] move htab_iterator

2013-10-01 Thread DJ Delorie

I'm typically against adding things to libiberty "because there's no
other place for them".  The purpose of libiberty is to provide a
portability layer, not a trash can.  However, htab is already in
there, and the argument for putting its accessors there is sound.

However, most of the other functions in hashtab.h are of the form
htab_*().  Could these be changed to match that pattern?  If these
functions are unused, it shouldn't matter to rename them.  (although,
if they're unused, it shouldn't matter to discard them, either)


Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Joern Rennecke

Quoting Diego Novillo :


On Sat, Sep 28, 2013 at 9:54 AM, Joern Rennecke
 wrote:

The main part of the port (everything but the testsuite) is still waiting
for review:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00323.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00324.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00325.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00328.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01870.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02070.html


I have finished reading through these patches.  They are OK to commit.

The changes indicated below are minor. Ideally, you'd address them
before committing the patch, but if it's easier to do it post-commit,
that's OK too.


Oops, I've already started my commit-spree after discussion on IRC.


- The Copyright years should be 2013 in every new file.  Or has this
port been released before?


The port has been available via git for quite a while:
https://github.com/foss-for-synopsys-dwc-arc-processors/gcc

I've also added earlier versions as branches in the FSF gcc svn repo
in 2008 and 2009.


- In config/arc/arc-protos.h:
+/* insn-attrtab.c doesn't include reload.h, which declares regno_clobbered_p. */
+extern int regno_clobbered_p (unsigned int, rtx, enum machine_mode, int);

Why not include reload.h here?  Interface changes (however rare) make
this a hassle.


There was also a general rule against including headers in header files,
although that has been weakened in the interim.  Not sure what the exact
position now is.
At any rate, things that didn't use to depend on reload.h would suddenly do.
Not sure if the automatic dependencies take care of adding all the
new dependencies, but at any rate, this constrains the build.
And all the requirements for reload.h would also have to be included.
Worst of all, a number of generator programs include tm_p.h, so if any
of the include files that have to be included for the sake of reload.h
is generated, there'll be a circular dependency.  AFAICT, insn-modes.h
is a current example.  And it is intrinsically needed for the prototype
I want.
Finally, tricks like the #ifdef RTX_CODE in function.h backfire when
the order of includes gets modified by includes from tm_p.h .

I'd rather have to copy a prototype in response to a clear error message
once in a blue moon than constantly fight with weird breakages of the
build system.

I suppose a better way would be for the *.md file that causes a header
file dependency to somehow request the inclusion of the header file by
the generated file(s).


- In config/arc/simdext.md
+;; Va, [Ib,u8] instructions
+;; (define_insn "vld32wh_insn"
+;;   [(set (match_operand:V8HI 0 "vector_register_operand"   "=v")
+;; (vec_concat:V8HI (unspec:V4HI [(match_operand:SI 1 "immediate_operand" "P")
+;;  (vec_select:HI (match_operand:V8HI 2 "vector_register_operand"  "v")
+;;  (parallel [(match_operand:SI 3 "immediate_operand" "L")]))] UNSPEC_ARC_SIMD_VLD32WH)
+;; (vec_select:V4HI (match_dup 0)

Necessary?  If so, please add a comment stating why it's commented out.


As you can see in svn, this was already commented out in arc-20081210-branch;
ISTR that this was just a pair of UNSPEC patterns that could be replaced with
then-new vector operations; but the exact history is lost with the ARC svn
repo.  At any rate, the replacement patterns are clearly below, so I deleted
the old commented out patterns along with their unspec constants.


- In doc/extend.texi:
+Permissible values for this parameter are: @w{@code{ilink1}} and
+@w{@code{ilink2}}.
+

ARC developers already know what ilink1 and ilink2 mean?


The ones that have to program interrupts should, or they can read about
them in the architecture manual.
These are two link registers (for interrupt return addresses) associated
with specific interrupt levels.


+@cindex indirect calls on Epiphany
+These attribute specifies how a particular function is called on
+ARC, ARM and Epiphany

s/specifies/specify/


+because __alignof__ sees only the type of the dereference, wheras
+__builtin_arc_align uses alignment information from the pointer

s/wheras/whereas/


Fixed.


- I have not fully cross-referenced the list of documented builtins vs
the list of implemented builtins. Please double check them.

- Ditto the list of -m options. It looks like they're all documented,
but I haven't diff'd the doc vs the options file.


I think we already did this, but these things have a way of having forgotten
items and/or growing new inconsistencies... I'll try to remember to check
again...



- In libgcc/config/arc/gmon/mcount.c

The file has a different copyright/license notice at the top.  Is this
from a third party source?


Yes, it is from one of the newer BSD releases.  Can't recall exactly which,
but as you can see, I took care to use a base that has the three-clause
license, so there should be no issue with license compatibility.


Can it be changed to lgpl?


Nonetheless, the license say

Re: [PATCH] Improving uniform_vector_p() function.

2013-10-01 Thread Xinliang David Li
On Tue, Oct 1, 2013 at 10:31 AM, Cong Hou  wrote:
> The current uniform_vector_p() function only returns non-NULL when the
> vector is directly a uniform vector. For example, for the following
> gimple code:
>
> vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};
>
>
> The current implementation can only detect that {_9, _9, _9, _9, _9,
> _9, _9, _9} is a uniform vector, but fails to recognize
> vect_cst_.15_91 is also one. This simple patch searches through
> assignment chains to find more uniform vectors.
>
>
> thanks,
> Cong
>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 45c1667..b42f8a9 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2013-10-01  Cong Hou  
> +
> +   * tree.c: Improve the function uniform_vector_p() so that a
> +   vector assigned with a uniform vector is also treated as a
> +   uniform vector.
> +
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 1c881e4..1d6d894 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -10297,6 +10297,17 @@ uniform_vector_p (const_tree vec)
>return first;
>  }
>
> +  if (TREE_CODE (vec) == SSA_NAME)
> +{
> +  gimple def = SSA_NAME_DEF_STMT (vec);
> +  if (gimple_code (def) == GIMPLE_ASSIGN)


do  this:

 if (is_gimple_assign (def) && gimple_assign_copy_p (def))

> +{
> +  tree rhs = gimple_op (def, 1);
> +  if (VECTOR_TYPE_P (TREE_TYPE (rhs)))
> +return uniform_vector_p (rhs);
> +}
> +}
> +
>return NULL_TREE;
>  }

Do you have a test case showing what missed optimization this fix can enable?

David


[PATCH] Use thread path in threadupdate's hash table

2013-10-01 Thread Jeff Law


tree-ssa-threadupdate.c has a hash table so that it can efficiently find 
cases where multiple incoming edges thread to the same destination. 
That hash table used the old 3-edge representation of jump thread paths. 
 This patch changes it to utilize the full path.
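
The hashing scheme described above can be sketched in isolation. This is a minimal illustration and not the code from tree-ssa-threadupdate.c: the toy_* types and functions are hypothetical stand-ins for GCC's edge and jump thread path structures. The comparison starts at index 1, as the patch does; the assumption here is that element 0 holds the incoming edge, which is expected to differ between entries that should share a duplicate block.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's edge and
   jump_thread_edge types.  */
struct toy_edge { int src_index; int dest_index; };
struct toy_path_elt { const struct toy_edge *e; int type; };
struct toy_path { const struct toy_path_elt *elts; size_t length; };

/* Hash on the destination block index of the final edge in the path.
   Assumes the path has at least one element.  */
static unsigned
toy_path_hash (const struct toy_path *p)
{
  return (unsigned) p->elts[p->length - 1].e->dest_index;
}

/* Two entries are equal when every path element after the first has
   the same type and the same edge.  Element 0 is skipped on the
   assumption that it is the incoming edge.  */
static bool
toy_path_equal (const struct toy_path *a, const struct toy_path *b)
{
  if (a->length != b->length)
    return false;

  for (size_t i = 1; i < a->length; i++)
    if (a->elts[i].type != b->elts[i].type
        || a->elts[i].e != b->elts[i].e)
      return false;

  return true;
}
```

Hashing only on the final destination keeps the hash cheap while still sending all paths that end in the same block to the same bucket; the full comparison then separates paths that reach that block through different intermediate edges.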


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  Installed 
on the trunk.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 58299d2..111b564 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2013-10-01  Jeff Law  
+
+   * tree-ssa-threadupdate.c (struct redirection_data): Delete
+   outgoing_edge and intermediate_edge fields.  Instead store the
+   path.
+   (redirection_data::hash): Hash on the last edge's destination
+   index.
+   (redirection_data::equal): Check the entire thread path.
+   (lookup_redirection_data): Corresponding changes.
+   (create_edge_and_update_destination_phis): Likewise.
+   (thread_single_edge): Likewise.
+
 2013-10-01  Joern Rennecke  
Diego Novillo 
 
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index ecf9baf..2adea1b 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -116,14 +116,11 @@ struct redirection_data : typed_free_remove
  targets a single successor of B.  */
   basic_block dup_block;
 
-  /* An outgoing edge from B.  DUP_BLOCK will have OUTGOING_EDGE->dest as
- its single successor.  */
-  edge outgoing_edge;
+  /* The jump threading path.  */
+  vec *path;
 
-  edge intermediate_edge;
-
-  /* A list of incoming edges which we want to thread to
- OUTGOING_EDGE->dest.  */
+  /* A list of incoming edges which we want to thread to the
+ same path.  */
   struct el *incoming_edges;
 
   /* hash_table support.  */
@@ -133,21 +130,36 @@ struct redirection_data : typed_free_remove
   static inline int equal (const value_type *, const compare_type *);
 };
 
+/* Simple hashing function.  For any given incoming edge E, we're going
+   to be most concerned with the final destination of its jump thread
+   path.  So hash on the block index of the final edge in the path.  */
+
 inline hashval_t
 redirection_data::hash (const value_type *p)
 {
-  edge e = p->outgoing_edge;
-  return e->dest->index;
+  vec *path = p->path;
+  return path->last ()->e->dest->index;
 }
 
+/* Given two hash table entries, return true if they have the same
+   jump threading path.  */
 inline int
 redirection_data::equal (const value_type *p1, const compare_type *p2)
 {
-  edge e1 = p1->outgoing_edge;
-  edge e2 = p2->outgoing_edge;
-  edge e3 = p1->intermediate_edge;
-  edge e4 = p2->intermediate_edge;
-  return e1 == e2 && e3 == e4;
+  vec *path1 = p1->path;
+  vec *path2 = p2->path;
+
+  if (path1->length () != path2->length ())
+return false;
+
+  for (unsigned int i = 1; i < path1->length (); i++)
+{
+  if ((*path1)[i]->type != (*path2)[i]->type
+ || (*path1)[i]->e != (*path2)[i]->e)
+   return false;
+}
+
+  return true;
 }
 
 /* Data structure of information to pass to hash table traversal routines.  */
@@ -259,10 +271,7 @@ lookup_redirection_data (edge e, enum insert_option insert)
  /* Build a hash table element so we can see if E is already
  in the table.  */
   elt = XNEW (struct redirection_data);
-  /* Right now, if we have a joiner, it is always index 1 into the vector.  */
-  elt->intermediate_edge
-= (*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK ? (*path)[1]->e : NULL;
-  elt->outgoing_edge = path->last ()->e;
+  elt->path = path;
   elt->dup_block = NULL;
   elt->incoming_edges = NULL;
 
@@ -356,7 +365,7 @@ static void
 create_edge_and_update_destination_phis (struct redirection_data *rd,
 basic_block bb)
 {
-  edge e = make_edge (bb, rd->outgoing_edge->dest, EDGE_FALLTHRU);
+  edge e = make_edge (bb, rd->path->last ()->e->dest, EDGE_FALLTHRU);
 
   rescan_loop_exit (e, true, false);
   e->probability = REG_BR_PROB_BASE;
@@ -364,9 +373,9 @@ create_edge_and_update_destination_phis (struct redirection_data *rd,
 
   /* We have to copy path -- which means creating a new vector as well
  as all the jump_thread_edge entries.  */
-  if (rd->outgoing_edge->aux)
+  if (rd->path->last ()->e->aux)
 {
-  vec *path = THREAD_PATH (rd->outgoing_edge);
+  vec *path = THREAD_PATH (rd->path->last ()->e);
   vec *copy = new vec ();
 
   /* Sadly, the elements of the vector are pointers and need to
@@ -388,7 +397,7 @@ create_edge_and_update_destination_phis (struct redirection_data *rd,
  from the duplicate block, then we will need to add a new argument
  to them.  The argument should have the same value as the argument
  associated with the outgoing edge stored in RD.  */
-  copy_phi_args (e->dest, rd->outgoing_edge, e);
+  copy_phi_args (e->dest, rd->path->last ()->e, e);
 }
 
 /* Wire up the outgoing edges from the duplicate block and
@@ -787,7 +796,13 @@ thread_single_edge (edge 

Re: [PATCH] Improving uniform_vector_p() function.

2013-10-01 Thread Cong Hou
Actually I will introduce optimizations in the next patch. Currently
the function uniform_vector_p () is rarely used in GCC, but there are
certainly some optimization opportunities with the help of this
function.

For example, when we widen a vector with 8 identical elements of short
type to two vectors of int type, GCC emits the following code:

  vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};
  vect__10.16_92 = [vec_unpack_lo_expr] vect_cst_.15_91;
  vect__10.16_93 = [vec_unpack_hi_expr] vect_cst_.15_91;

When vect_cst_.15_91 is a uniform vector, we know vect__10.16_92 and
vect__10.16_93 are identical so that we can remove the second
[vec_unpack_hi_expr] operation:

  vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};
  vect__10.16_92 = [vec_unpack_lo_expr] vect_cst_.15_91;
  vect__10.16_93 = vect__10.16_92;


thanks,
Cong
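
The idea of looking through assignment chains can be illustrated with a toy model. This is only a sketch under stated assumptions, not the tree.c change: struct node, its fields, and toy_uniform_vector_p are hypothetical stand-ins for GCC's tree and gimple types, and the single_rhs field models what a check such as gimple_assign_single_p would establish in the real code. The chain of definitions is assumed to be acyclic.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's tree nodes.  */
enum node_kind { NODE_CONSTRUCTOR, NODE_SSA_NAME };

struct node
{
  enum node_kind kind;
  /* NODE_CONSTRUCTOR: the element values of the vector.  */
  const int *elts;
  size_t nelts;
  /* NODE_SSA_NAME: the right-hand side of a single-operand defining
     assignment, or NULL if the name is defined in some other way.  */
  const struct node *single_rhs;
};

/* Return a pointer to the repeated element if VEC is a uniform
   vector, either directly or through a chain of plain assignments;
   return NULL otherwise.  */
static const int *
toy_uniform_vector_p (const struct node *vec)
{
  /* Walk back through plain assignments until a constructor (or an
     unanalyzable definition) is reached.  */
  while (vec != NULL && vec->kind == NODE_SSA_NAME)
    vec = vec->single_rhs;

  if (vec == NULL || vec->nelts == 0)
    return NULL;

  for (size_t i = 1; i < vec->nelts; i++)
    if (vec->elts[i] != vec->elts[0])
      return NULL;

  return &vec->elts[0];
}
```

In this model a name such as vect_cst_.15_91 would be a NODE_SSA_NAME whose single_rhs points at the constructor {_9, _9, ...}, so the query on the name gives the same answer as the query on the constructor.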


On Tue, Oct 1, 2013 at 2:37 PM, Xinliang David Li  wrote:
> On Tue, Oct 1, 2013 at 10:31 AM, Cong Hou  wrote:
>> The current uniform_vector_p() function only returns non-NULL when the
>> vector is directly a uniform vector. For example, for the following
>> gimple code:
>>
>> vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};
>>
>>
>> The current implementation can only detect that {_9, _9, _9, _9, _9,
>> _9, _9, _9} is a uniform vector, but fails to recognize
>> vect_cst_.15_91 is also one. This simple patch searches through
>> assignment chains to find more uniform vectors.
>>
>>
>> thanks,
>> Cong
>>
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 45c1667..b42f8a9 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,9 @@
>> +2013-10-01  Cong Hou  
>> +
>> +   * tree.c: Improve the function uniform_vector_p() so that a
>> +   vector assigned with a uniform vector is also treated as a
>> +   uniform vector.
>> +
>> diff --git a/gcc/tree.c b/gcc/tree.c
>> index 1c881e4..1d6d894 100644
>> --- a/gcc/tree.c
>> +++ b/gcc/tree.c
>> @@ -10297,6 +10297,17 @@ uniform_vector_p (const_tree vec)
>>return first;
>>  }
>>
>> +  if (TREE_CODE (vec) == SSA_NAME)
>> +{
>> +  gimple def = SSA_NAME_DEF_STMT (vec);
>> +  if (gimple_code (def) == GIMPLE_ASSIGN)
>
>
> do  this:
>
>  if (is_gimple_assign (def) && gimple_assign_copy_p (def))
>
>> +{
>> +  tree rhs = gimple_op (def, 1);
>> +  if (VECTOR_TYPE_P (TREE_TYPE (rhs)))
>> +return uniform_vector_p (rhs);
>> +}
>> +}
>> +
>>return NULL_TREE;
>>  }
>
> Do you have a test case showing what missed optimization this fix can enable ?
>
> David


GTY on simple struct (Was: Re: Ping^6: contribute Synopsys Designware ARC port)

2013-10-01 Thread Joern Rennecke

Quoting Diego Novillo :


No need to mark struct arc_frame_info with GTY. It contains no pointers.


That's not quite how it works.  machine_function needs GTY.  It uses
arc_frame_info, hence arc_frame_info also needs GTY.


Re: [PATCH] Improving uniform_vector_p() function.

2013-10-01 Thread Xinliang David Li
On Tue, Oct 1, 2013 at 2:37 PM, Xinliang David Li  wrote:
> On Tue, Oct 1, 2013 at 10:31 AM, Cong Hou  wrote:
>> The current uniform_vector_p() function only returns non-NULL when the
>> vector is directly a uniform vector. For example, for the following
>> gimple code:
>>
>> vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};
>>
>>
>> The current implementation can only detect that {_9, _9, _9, _9, _9,
>> _9, _9, _9} is a uniform vector, but fails to recognize
>> vect_cst_.15_91 is also one. This simple patch searches through
>> assignment chains to find more uniform vectors.
>>
>>
>> thanks,
>> Cong
>>
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 45c1667..b42f8a9 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,9 @@
>> +2013-10-01  Cong Hou  
>> +
>> +   * tree.c: Improve the function uniform_vector_p() so that a
>> +   vector assigned with a uniform vector is also treated as a
>> +   uniform vector.
>> +
>> diff --git a/gcc/tree.c b/gcc/tree.c
>> index 1c881e4..1d6d894 100644
>> --- a/gcc/tree.c
>> +++ b/gcc/tree.c
>> @@ -10297,6 +10297,17 @@ uniform_vector_p (const_tree vec)
>>return first;
>>  }
>>
>> +  if (TREE_CODE (vec) == SSA_NAME)
>> +{
>> +  gimple def = SSA_NAME_DEF_STMT (vec);
>> +  if (gimple_code (def) == GIMPLE_ASSIGN)
>
>
> do  this:
>
>  if (is_gimple_assign (def) && gimple_assign_copy_p (def))


Wrong comment from me. Should be

 if (gimple_assign_single_p (def))
  ..

David

>
>> +{
>> +  tree rhs = gimple_op (def, 1);
>> +  if (VECTOR_TYPE_P (TREE_TYPE (rhs)))
>> +return uniform_vector_p (rhs);
>> +}
>> +}
>> +
>>return NULL_TREE;
>>  }
>
> Do you have a test case showing what missed optimization this fix can enable ?
>
> David


Re: Announcing ?

2013-10-01 Thread Tim Shen
On Mon, Sep 30, 2013 at 2:25 PM, Paolo Carlini  wrote:
> 3- Tweak the implementation status page
> libstdc++/doc/xml/manual/status_cxx2011.xml (possibly clarify bits still
> missing too)

Here is the patch for this, along with some interface fixes.

Thank you!


-- 
Tim Shen


a.patch
Description: Binary data


Re: Announcing ?

2013-10-01 Thread Paolo Carlini
Hi,

> On 2 Oct 2013, at 00:03, Tim Shen wrote:
> 
>> On Mon, Sep 30, 2013 at 2:25 PM, Paolo Carlini  
>> wrote:
>> 3- Tweak the implementation status page
>> libstdc++/doc/xml/manual/status_cxx2011.xml (possibly clarify bits still
>> missing too)
> 
> Here is the patch for this, along with some interface fixes.

What do you mean by "not well defined"? Non-conforming? Why?

Paolo

Re: Announcing ?

2013-10-01 Thread Tim Shen
On Tue, Oct 1, 2013 at 6:08 PM, Paolo Carlini  wrote:
> What do you mean by "not well defined"? Non confoming? Why?

I don't know if the word 'define' is appropriate, maybe I should just
use 'implement'.

It's indeed the problem that needs extra support from ,
described by [28.7.7]. For more detail,
http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2003/n1429.htm
explains what the function means.


-- 
Tim Shen


Re: Ping^6: contribute Synopsys Designware ARC port

2013-10-01 Thread Jeff Law

On 10/01/13 15:26, Joern Rennecke wrote:


I have finished reading through these patches.  They are OK to commit.

The changes indicated below are minor. Ideally, you'd address them
before committing the patch, but if it's easier to do it post-commit,
that's OK too.


Oops, I've already started my commit-spree after discussion on IRC.

No worries.  Just address Diego's stuff now.




- The Copyright years should be 2013 in every new file.  Or has this
port been released before?


The port has been available via git for quite a while:
https://github.com/foss-for-synopsys-dwc-arc-processors/gcc

Right.  Was any of this code from Doug Evans's old ARC support?

It doesn't hurt to have 2013 in the dates, and I suspect most files will 
get touched as a result of addressing Diego's comments.



Jeff


Re: GTY on simple struct (Was: Re: Ping^6: contribute Synopsys Designware ARC port)

2013-10-01 Thread Diego Novillo
On Tue, Oct 1, 2013 at 5:50 PM, Joern Rennecke
 wrote:
> Quoting Diego Novillo :
>
>> No need to mark struct arc_frame_info with GTY. It contains no pointers.
>
>
> That's not quite how it works.  machine_function needs GTY.  It uses
> arc_frame_info, hence arc_frame_info also needs GTY.

Gah, you're right.  I missed that connection.  Silly GC.


Diego.


Re: Announcing ?

2013-10-01 Thread Paolo Carlini
Hi,

> On 2 Oct 2013, at 00:21, Tim Shen wrote:
> 
>> On Tue, Oct 1, 2013 at 6:08 PM, Paolo Carlini  
>> wrote:
>> What do you mean by "not well defined"? Non confoming? Why?
> 
> I don't know if the word 'define' is appropriate, maybe I should just
> use 'implement'.
> 
> It's indeed the problem that needs extra support from ,
> described by [28.7.7]. More detailed,
> http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2003/n1429.htm
> offered what the function means.

Ok, thus just say that transform_primary is unimplemented. Please also add a 
comment inline in the code explaining the issue, but don't word it in terms of 
, because  is just what it is per C++11, and doesn't directly 
provide what we need in the public interface (if I understand the issue): just 
say that implementing the function appears to require libc support + a link to 
the message you sent a couple of weeks ago to the libstdc++-v3 mailing list.

Ok with the above changes.

Thanks!
Paolo


[PATCH] Improve probability/profile distribution in ORIF expansion

2013-10-01 Thread Teresa Johnson
This patch fixes an issue where expansion of an ORIF expression arbitrarily
applied the probability that the entire condition was true to just the
first condition. When the ORIF true probability was 100%, this resulted
in the second condition's jump being given a count of zero (since the
first condition's jump got 100% of the count), leading to incorrect function
splitting when it had a non-zero probability in reality. Since there
currently isn't better information about which condition resulted
in the ORIF being true, apply a 50-50 probability that it is the first
vs. second condition that caused the entire expression to be true,
so that neither condition's true label ends up as a 0-count bb.
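
The arithmetic behind the 50-50 split can be worked through in a standalone sketch. The toy_* names below are hypothetical; TOY_PROB_BASE, toy_inv and toy_scale are local stand-ins assumed to behave like REG_BR_PROB_BASE, inv and GCOV_COMPUTE_SCALE in the patch. The sketch assumes a probability between 0 and the base and does not model any special "unknown" probability value.

```c
/* Stand-in for REG_BR_PROB_BASE: probabilities are fixed-point
   values in the range [0, TOY_PROB_BASE].  */
#define TOY_PROB_BASE 10000

/* Probability of the complementary event.  */
static int
toy_inv (int prob)
{
  return TOY_PROB_BASE - prob;
}

/* NUM / DEN expressed as a probability, rounded to nearest.  */
static int
toy_scale (int num, int den)
{
  if (den == 0)
    return TOY_PROB_BASE;
  return (int) (((long long) num * TOY_PROB_BASE + den / 2) / den);
}

struct toy_orif_probs
{
  int op0_prob;  /* Probability that the first condition is true.  */
  int op1_prob;  /* Probability that the second condition is true,
                    given that the first one was false.  */
};

/* Split PROB, the probability that "op0 || op1" is true, evenly
   between the two conditions.  */
static struct toy_orif_probs
toy_split_orif_prob (int prob)
{
  struct toy_orif_probs r;
  r.op0_prob = prob / 2;
  r.op1_prob = toy_scale (prob / 2, toy_inv (r.op0_prob));
  return r;
}
```

With prob at 100% (10000), the first jump gets 5000 and the second gets 10000: half of the executions reach the second condition, and all of those take the true branch, so neither true label ends up with a zero count. With prob at 60% (6000), the first jump gets 3000 and the second gets 3000 out of the remaining 7000, which rounds to 4286.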

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

2013-10-01  Teresa Johnson  

* dojump.c (do_jump_1): Divide probability between
both conditions of a TRUTH_ORIF_EXPR.

Index: dojump.c
===
--- dojump.c(revision 203077)
+++ dojump.c(working copy)
@@ -325,18 +325,27 @@ do_jump_1 (enum tree_code code, tree op0, tree op1
   break;

 case TRUTH_ORIF_EXPR:
-  if (if_true_label == NULL_RTX)
-   {
-  drop_through_label = gen_label_rtx ();
- do_jump (op0, NULL_RTX, drop_through_label, prob);
- do_jump (op1, if_false_label, NULL_RTX, prob);
-   }
-  else
-   {
- do_jump (op0, NULL_RTX, if_true_label, prob);
- do_jump (op1, if_false_label, if_true_label, prob);
-   }
-  break;
+  {
+/* Spread the probability evenly between the two conditions. So
+   the first condition has half the total probability of being true.
+   The second condition has the other half of the total probability,
+   so its jump has a probability of half the total, relative to
+   the probability we reached it (i.e. the first condition was false).  */
+int op0_prob = prob / 2;
+int op1_prob = GCOV_COMPUTE_SCALE ((prob / 2), inv (op0_prob));
+if (if_true_label == NULL_RTX)
+  {
+drop_through_label = gen_label_rtx ();
+do_jump (op0, NULL_RTX, drop_through_label, op0_prob);
+do_jump (op1, if_false_label, NULL_RTX, op1_prob);
+  }
+else
+  {
+do_jump (op0, NULL_RTX, if_true_label, op0_prob);
+do_jump (op1, if_false_label, if_true_label, op1_prob);
+  }
+break;
+  }

 default:
   gcc_unreachable ();

-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread Jan Hubicka
> > Hi Wei Mi,
> >
> > Have you checked in your patch?
> >
> > --
> > H.J.
> 
> No, I haven't. Honza wants me to wait for his testing on AMD hardware.
> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01603.html
I only wanted to separate it from the changes in generic so the regular testers
can pick it up separately.  So just go ahead and check it in.

Honza


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Yufeng Zhang

On 10/01/13 20:55, Bill Schmidt wrote:



On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote:

OK, thanks.  The problem that you've encountered is that you are
attempting to do something illegal. ;)  (Bin's original patch is
actually to blame for that, as well as me for not catching it then.)

As your new test shows, it is unsafe to do the transformation in
backtrace_base_for_ref when widening from an unsigned type, because the
unsigned type has wrap semantics by default.  (The actual test must be
done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or
removed by compile option -- see the comments with legal_cast_p and
legal_cast_p_1 later in the module.)

You cannot in general prove that the transformation is allowable for a
specific constant, because you don't know that what you're adding it to
won't cause an overflow that's handled incorrectly.

I believe the correct fix for the unsigned-overflow case is to fail
backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns
false, where in_type is the type of the new *PBASE, and out_type is the
widening type that you're looking through.  So you can't just
STRIP_NOPS, you have to check the cast for legitimacy for this
transformation.
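
The hazard with looking through a widening from an unsigned type can be shown with plain integer arithmetic. This is only an illustration of the wrap-around issue, not code from gimple-ssa-strength-reduction.c; the two function names are hypothetical, and 32-bit and 64-bit fixed-width types stand in for int-sized and pointer-sized values.

```c
#include <stdint.h>

/* The index as the source program computes it: n - 1 is evaluated in
   32-bit unsigned arithmetic, where it wraps, and only then widened.  */
static uint64_t
index_wrap_then_widen (uint32_t n)
{
  return (uint64_t) (uint32_t) (n - 1u);
}

/* The index an optimizer would get by stripping the widening
   conversion first and doing the subtraction in 64 bits.  */
static uint64_t
index_widen_then_sub (uint32_t n)
{
  return (uint64_t) n - 1u;
}
```

For n == 0 the first function yields 0xffffffff while the second yields 0xffffffffffffffff, so the two forms address different elements; for any n >= 1 they agree. When the narrow type is signed and overflow is undefined, the compiler may assume the wrapping case never happens, which is why the legitimacy of the cast has to be checked before unwidening.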

This does not explain why backtrace_base_for_ref does not find all the
opportunities on slsr-39.c.  I don't immediately see what's preventing
that.  Note that the transformation is legal in that case because you
are widening from a signed int to an unsigned int, which won't cause
problems.  You guys need to dig deeper into why those opportunities are
missed when sizetype is larger than int.  Let me know if you need help
figuring it out.


Sorry, I had to leave before and wanted to get this response back to you
in case I didn't get back soon.  I've looked at this some more, and your
general approach should work ok once you get the legal_cast_p check in
place where you do the get_unwidened call now.  Once you know you have a
legal widening, you don't have to worry about the safe_to_multiply_p
stuff.  I.e., you don't need the last two chunks in the patch to
backtrace_base_for_ref, and you don't need the unwidened_p variable.  It
should all fall out properly by just restricting your unwidening to
legal casts.


Many thanks for looking into the issue so promptly.  I've updated the 
patch; I have to use legal_cast_p_1 instead as the gimple node is no 
longer available by then.


Does the new patch look sane?

The regtest on aarch64 and bootstrapping on x86-64 are still running.

Thanks,
Yufeng


gcc/

* gimple-ssa-strength-reduction.c (legal_cast_p_1): Forward
declaration.
(backtrace_base_for_ref): Call get_unwidened with 'base_in' if
'base_in' represents a conversion and legal_cast_p_1 holds; set
'base_in' with the returned value from get_unwidened.

gcc/testsuite/

* gcc.dg/tree-ssa/slsr-40.c: New test.

diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c
index 139a4a1..a558f34 100644
--- a/gcc/gimple-ssa-strength-reduction.c
+++ b/gcc/gimple-ssa-strength-reduction.c
@@ -379,6 +379,7 @@ static bool address_arithmetic_p;
 /* Forward function declarations.  */
 static slsr_cand_t base_cand_from_table (tree);
 static tree introduce_cast_before_cand (slsr_cand_t, tree, tree);
+static bool legal_cast_p_1 (tree, tree);
 
 /* Produce a pointer to the IDX'th candidate in the candidate vector.  */
 
@@ -768,6 +769,14 @@ backtrace_base_for_ref (tree *pbase)
   slsr_cand_t base_cand;
 
   STRIP_NOPS (base_in);
+
+  /* Strip off widening conversion(s) to handle cases where
+ e.g. 'B' is widened from an 'int' in order to calculate
+ a 64-bit address.  */
+  if (CONVERT_EXPR_P (base_in)
+  && legal_cast_p_1 (base_in, TREE_OPERAND (base_in, 0)))
+base_in = get_unwidened (base_in, NULL_TREE);
+
   if (TREE_CODE (base_in) != SSA_NAME)
 return tree_to_double_int (integer_zero_node);
 
@@ -786,7 +795,7 @@ backtrace_base_for_ref (tree *pbase)
   else if (base_cand->kind == CAND_ADD
   && TREE_CODE (base_cand->stride) == INTEGER_CST
   && integer_onep (base_cand->stride))
-{
+   {
  /* X = B + (i * S), S is integer one.  */
  *pbase = base_cand->base_expr;
  return base_cand->index;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-40.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-40.c
new file mode 100644
index 000..72726a3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-40.c
@@ -0,0 +1,27 @@
+/* Verify straight-line strength reduction for array
+   subscripting.
+
+   elems[n-1] is reduced to elems + n * 4 + 0x * 4, only when
+   pointers are of the same size as that of int (assuming 4 bytes).  */
+
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+struct data
+{
+  unsigned long elms[1];
+} gData;
+
+void __attribute__((noinline))
+foo (struct data *dst, unsigned int n)
+{
+  dst->elms[n - 1] &= 1;
+}
+
+int
+main ()
+{
+  foo (&gData, 1);
+  return

Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4

2013-10-01 Thread David Edelsohn
On Tue, Oct 1, 2013 at 1:52 PM, Michael Meissner
 wrote:
> This patch moves most of the VSX DFmode operations from vsx.md to rs6000.md to
> use the traditional floating point instructions (f*) instead of the VSX scalar
> instructions (xs*) if all of the registers come from the traditional floating
> point register set.  The add, subtract, multiply, divide, reciprocal estimate,
> square root, absolute value, negate, round functions, and multiply/add
> instructions were changed.  Some of the converts have not been changed with
> these patches.  If the -mupper-regs-df switch is used, it will attempt to use
> the upper registers (those that overlay on the traditional Altivec register
> set).
>
> This patch also combines the scalar SFmode/DFmode support on non-SPE systems.
> It adds in ISA 2.07 (power8) single precision floating point support if the
> -mupper-regs-sf switch is used.
>
> At present, neither -mupper-regs-df nor -mupper-regs-sf is usable if reload
> has to do anything.  A future patch will address this.
>
> I did need to adjust a few tests that were specifically testing VSX scalar
> code generation.  In addition, I put in a simple test to make sure the initial
> -mupper-regs-df and -mupper-regs-sf work correctly.
>
> I tested this, and except for power7 and power8 I could not find any changes
> in the code generated for power4, power5, power6, power6x, G4, G5, cell,
> e5500, e6500, xilinx (sp_full, sp_lite, dp_full, dp_lite, none), 8548/8540
> (spe), and 750cl (paired floating point).
>
> For VSX systems there are code generation differences:
>
> 1)  The traditional fp instruction is generated instead of VSX;
>
> 2)  Because of #1, the code generator favors the 4-operand form of the
> multiply/add instructions, where the target register does not overlap
> with any of the inputs, over the VSX version that requires overlap.
>
> 3)  A few of the vectorized tests on power8 now generate more direct move
> instructions, instead of moving values through the stack as they did
> previously.  These tests are integer tests, where you are doing an
> operation between an integer vector and a scalar value.  Previously in
> some cases, the register allocator would store the value from a GPR and
> reload it to the vector registers.
>
> 4)  There is a slight scheduling difference in doing long double abs, that
> causes a different register to be used.  The code for long double abs
> needs to be improved in any case (the early splitting is causing spills
> to the stack).
>
> I had no differences in doing bootstrap and make check (with the testsuite
> fixes applied).
>
> In addition, I am running Spec 2006 floating point tests on a power7 box to
> compare the effects of going back to the traditional floating point tests.
> For most tests, there is less than 2% difference between the runs.  One
> benchmark (482.sphinx3) is slightly faster with these changes, and it is
> dominated by floating point multiply/add operations.
>
> Can I apply these patches?
>
> [gcc]
> 2013-09-30  Michael Meissner  
>
> * config/rs6000/rs6000-builtin.def (XSRDPIM): Use floatdf2,
> ceildf2, btruncdf2, instead of vsx_* name.
>
> * config/rs6000/vsx.md (vsx_add3): Change arithmetic
> iterators to only do V2DF and V4SF here.  Move the DF code to
> rs6000.md where it is combined with SF mode.  Replace  with
> just 'v' since only vector operations are handled with these insns
> after moving the DF support to rs6000.md.
> (vsx_sub3): Likewise.
> (vsx_mul3): Likewise.
> (vsx_div3): Likewise.
> (vsx_fre2): Likewise.
> (vsx_neg2): Likewise.
> (vsx_abs2): Likewise.
> (vsx_nabs2): Likewise.
> (vsx_smax3): Likewise.
> (vsx_smin3): Likewise.
> (vsx_sqrt2): Likewise.
> (vsx_rsqrte2): Likewise.
> (vsx_fms4): Likewise.
> (vsx_nfma4): Likewise.
> (vsx_copysign3): Likewise.
> (vsx_btrunc2): Likewise.
> (vsx_floor2): Likewise.
> (vsx_ceil2): Likewise.
> (vsx_smaxsf3): Delete scalar ops that were moved to rs6000.md.
> (vsx_sminsf3): Likewise.
> (vsx_fmadf4): Likewise.
> (vsx_fmsdf4): Likewise.
> (vsx_nfmadf4): Likewise.
> (vsx_nfmsdf4): Likewise.
> (vsx_cmpdf_internal1): Likewise.
>
> * config/rs6000/rs6000.h (TARGET_SF_SPE): Define macros to make it
> simpler to select whether a target has SPE or traditional floating
> point support in iterators.
> (TARGET_DF_SPE): Likewise.
> (TARGET_SF_FPR): Likewise.
> (TARGET_DF_FPR): Likewise.
> (TARGET_SF_INSN): Macros to say whether floating point support
> exists for a given operation for expanders.
> (TARGET_DF_INSN): Likewise.
>
> * config/rs6000/rs6000.c (Ftrad): New mode attributes to allow
> co

Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread Wei Mi
On Tue, Oct 1, 2013 at 3:50 PM, Jan Hubicka  wrote:
>> > Hi Wei Mi,
>> >
>> > Have you checked in your patch?
>> >
>> > --
>> > H.J.
>>
>> No, I haven't. Honza wants me to wait for his testing on AMD hardware.
>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01603.html
> I only wanted to separate it from the changes in generic so the regular
> testers can pick it up separately.  So just go ahead and check it in.
>
> Honza

Thanks, checked in as r203095.

Wei Mi.


Re: Announcing ?

2013-10-01 Thread Tim Shen
On Tue, Oct 1, 2013 at 6:41 PM, Paolo Carlini  wrote:
> Ok, thus just say that transform_primary is unimplemented. Please also add a 
> comment inline in the code explaining the issue, but don't word it in terms of 
> , because  is just what it is per C++11, and doesn't directly 
> provide what we need in the public interface (if I understand the issue): 
> just say that implementing the function appears to require libc support + a 
> link to the message you sent a couple of weeks ago to the libstdc++-v3 
> mailing list.
>
> Ok with the above changes.

Committed :)

Thanks!


-- 
Tim Shen



Re: Announcing ?

2013-10-01 Thread Paolo Carlini

On 10/02/2013 01:45 AM, Tim Shen wrote:

> On Tue, Oct 1, 2013 at 6:41 PM, Paolo Carlini  wrote:
>> Ok, thus just say that transform_primary is unimplemented. Please also add a
>> comment inline in the code explaining the issue, but don't word it in terms
>> of , because  is just what it is per C++11, and doesn't directly provide
>> what we need in the public interface (if I understand the issue): just say
>> that implementing the function appears to require libc support + a link to
>> the message you sent a couple of weeks ago to the libstdc++-v3 mailing list.
>>
>> Ok with the above changes.
>
> Committed :)

Thanks. But then the function actually *is* implemented, saying 'not 
implemented' is misleading. Please change the xml to something like 
'isn't correctly implemented', or more informally, 'needs work', up to 
you, but please adjust it.


Thanks,
Paolo.


Re: Announcing ?

2013-10-01 Thread Tim Shen
On Tue, Oct 1, 2013 at 7:49 PM, Paolo Carlini  wrote:
> Thanks. But then the function actually *is* implemented, saying 'not
> implemented' is misleading. Please change the xml to something like 'isn't
> correctly implemented', or more informally, 'needs work', up to you, but
> please adjust it.

Committed.

I learn that an implementation could be incorrect ;)


-- 
Tim Shen



Re: Announcing ?

2013-10-01 Thread Paolo Carlini

On 10/02/2013 02:00 AM, Tim Shen wrote:

> On Tue, Oct 1, 2013 at 7:49 PM, Paolo Carlini  wrote:
>> Thanks. But then the function actually *is* implemented, saying 'not
>> implemented' is misleading. Please change the xml to something like 'isn't
>> correctly implemented', or more informally, 'needs work', up to you, but
>> please adjust it.
>
> Committed.

Thanks.

> I learn that an implementation could be incorrect ;)

You see, unlike with natural languages, with an artificial language 
like C++ it's easy to distinguish practical instances of unimplemented 
and very badly implemented: in the former case the function doesn't have 
a definition, or, even more clearly, doesn't even have a declaration. 
transform_primary can't be unimplemented ;)


Paolo.


Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64

2013-10-01 Thread Bill Schmidt
On Tue, 2013-10-01 at 23:57 +0100, Yufeng Zhang wrote:
> On 10/01/13 20:55, Bill Schmidt wrote:
> >
> >
> > On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote:
> >> OK, thanks.  The problem that you've encountered is that you are
> >> attempting to do something illegal. ;)  (Bin's original patch is
> >> actually to blame for that, as well as me for not catching it then.)
> >>
> >> As your new test shows, it is unsafe to do the transformation in
> >> backtrace_base_for_ref when widening from an unsigned type, because the
> >> unsigned type has wrap semantics by default.  (The actual test must be
> >> done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or
> >> removed by compile option -- see the comments with legal_cast_p and
> >> legal_cast_p_1 later in the module.)
> >>
> >> You cannot in general prove that the transformation is allowable for a
> >> specific constant, because you don't know that what you're adding it to
> >> won't cause an overflow that's handled incorrectly.
> >>
> >> I believe the correct fix for the unsigned-overflow case is to fail
> >> backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns
> >> false, where in_type is the type of the new *PBASE, and out_type is the
> >> widening type that you're looking through.  So you can't just
> >> STRIP_NOPS, you have to check the cast for legitimacy for this
> >> transformation.
> >>
> >> This does not explain why backtrace_base_for_ref does not find all the
> >> opportunities on slsr-39.c.  I don't immediately see what's preventing
> >> that.  Note that the transformation is legal in that case because you
> >> are widening from a signed int to an unsigned int, which won't cause
> >> problems.  You guys need to dig deeper into why those opportunities are
> >> missed when sizetype is larger than int.  Let me know if you need help
> >> figuring it out.
> >
> > Sorry, I had to leave before and wanted to get this response back to you
> > in case I didn't get back soon.  I've looked at this some more, and your
> > general approach should work ok once you get the legal_cast_p check in
> > place where you do the get_unwidened call now.  Once you know you have a
> > legal widening, you don't have to worry about the safe_to_multiply_p
> > stuff.  I.e., you don't need the last two chunks in the patch to
> > backtrace_base_for_ref, and you don't need the unwidened_p variable.  It
> > should all fall out properly by just restricting your unwidening to
> > legal casts.
> 
> Many thanks for looking into the issue so promptly.  I've updated the 
> patch; I have to use legal_cast_p_1 instead as the gimple node is no 
> longer available by then.
> 
> Does the new patch look sane?

Yes, much better.  I'm happy with this approach.  However, please
restore the correct whitespace before the { at -786,7 +795,7.

Thanks for fixing this up!

Bill

> 
> The regtest on aarch64 and bootstrapping on x86-64 are still running.
> 
> Thanks,
> Yufeng
> 
> 
> gcc/
> 
>   * gimple-ssa-strength-reduction.c (legal_cast_p_1): Forward
>   declaration.
>   (backtrace_base_for_ref): Call get_unwidened with 'base_in' if
>   'base_in' represents a conversion and legal_cast_p_1 holds; set
>   'base_in' to the value returned by get_unwidened.
> 
> gcc/testsuite/
> 
>   * gcc.dg/tree-ssa/slsr-40.c: New test.
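
The wrap problem Bill describes for widening from an unsigned type can be
seen outside GCC as well.  The following is only a minimal C++ sketch using
fixed-width integer types; the function names are hypothetical and do not
come from the patch or from gimple-ssa-strength-reduction.c.

```cpp
#include <cassert>
#include <cstdint>

// Add in the narrow unsigned type first, then widen the result.
// The narrow addition wraps modulo 2^32 before the widening happens.
std::uint64_t add_then_widen(std::uint32_t x, std::uint32_t c) {
  std::uint32_t narrow = static_cast<std::uint32_t>(x + c);
  return narrow;
}

// Widen both operands first, then add in the wide type.
// No wrap occurs for any pair of 32-bit inputs.
std::uint64_t widen_then_add(std::uint32_t x, std::uint32_t c) {
  return static_cast<std::uint64_t>(x) + static_cast<std::uint64_t>(c);
}
```

The two functions agree for small inputs but differ once the narrow addition
wraps, for example with x = 0xFFFFFFFF and c = 1.  That is why moving the
addition across the widening cast cannot be assumed safe when the narrow
type has wrapping overflow.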



[PATCH] Reducing number of alias checks in vectorization.

2013-10-01 Thread Cong Hou
When data refs in a loop may alias, GCC vectorizes the loop by doing
loop versioning and adding runtime alias checks. Basically for each
pair of data refs with possible data dependence, two comparisons are
generated to make sure there is no aliasing between them in each
iteration of the vectorized loop. If there are many such data ref
pairs, the number of comparisons can be very large, which adds
significant overhead.

However, in some cases it is possible to reduce the number of those
comparisons. For example, for the following loop, we can detect that
b[0] and b[1] are two consecutive member accesses, so we can combine
the alias checks between a[0:100]&b[0] and a[0:100]&b[1] into a single
check of a[0:100]&b[0:2]:

void foo(int *a, int *b)
{
  for (int i = 0; i < 100; ++i)
    a[i] = b[0] + b[1];
}

Actually, the requirement of consecutive memory accesses is too
strict. For the following loop, we can still combine the alias checks
between a[0:100]&b[0] and a[0:100]&b[100]:

void foo(int *a, int *b)
{
  for (int i = 0; i < 100; ++i)
    a[i] = b[0] + b[100];
}

This is because if b[0] is not in a[0:100] and b[100] is not in
a[0:100], then a[0:100] cannot lie between b[0] and b[100]: only 99
elements fit strictly between them. We only need to check that
a[0:100] and b[0:101] don't overlap.
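
The equivalence claimed here can be checked by brute force.  The following
is a minimal C++ sketch, not code from the patch: it models addresses as
plain element indices, and the names Range, overlap, two_checks_alias and
merged_check_alias are hypothetical.

```cpp
#include <cassert>

// Half-open range of element indices [begin, end).
struct Range { long begin; long end; };

// True if the two ranges share at least one element.
bool overlap(Range x, Range y) {
  assert(x.begin <= x.end && y.begin <= y.end);
  return x.begin < y.end && y.begin < x.end;
}

// Two separate checks: a[0:100] against b[0] and against b[100].
// a_begin and b_begin stand for the element addresses of a[0] and b[0].
bool two_checks_alias(long a_begin, long b_begin) {
  Range a = {a_begin, a_begin + 100};
  Range b0 = {b_begin, b_begin + 1};
  Range b100 = {b_begin + 100, b_begin + 101};
  return overlap(a, b0) || overlap(a, b100);
}

// One merged check: a[0:100] against b[0:101].
bool merged_check_alias(long a_begin, long b_begin) {
  Range a = {a_begin, a_begin + 100};
  Range b = {b_begin, b_begin + 101};
  return overlap(a, b);
}

// Compare both forms for every placement of a around a fixed b.
bool checks_agree() {
  const long b_begin = 1000;
  for (long a_begin = 700; a_begin <= 1300; ++a_begin)
    if (two_checks_alias(a_begin, b_begin)
        != merged_check_alias(a_begin, b_begin))
      return false;
  return true;
}
```

Both forms give the same answer for every placement tried, because a spans
100 elements while the gap between b[0] and b[100] holds only 99.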

More generally, consider two pairs of data refs (a, b1) and (a, b2).
Suppose addr_b1 and addr_b2 are the basic addresses of data refs b1 and b2;
offset_b1 and offset_b2 (offset_b1 < offset_b2) are the offsets of b1 and
b2; and segment_length_a, segment_length_b1, and segment_length_b2 are
the segment lengths of a, b1, and b2. Then we can combine the two
comparisons into one if the following condition is satisfied:

offset_b2 - offset_b1 - segment_length_b1 < segment_length_a


This patch detects those combination opportunities to reduce the
number of alias checks. It is tested on an x86-64 machine.
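
The sort-then-scan idea can be sketched independently of GCC's data
structures.  The C++ below is only an illustration under simplifying
assumptions: offsets and segment lengths are plain integers in bytes
relative to one shared basic address, whereas the patch works on trees.
The names Seg, can_combine and combine_checks are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A b-side access: byte offset from the shared basic address, plus length.
struct Seg { long offset; long len; };

// The condition from the message, assuming b1.offset <= b2.offset:
// the gap between b1 and b2 is too small for segment a to fit inside.
bool can_combine(const Seg& b1, const Seg& b2, long seg_len_a) {
  return b2.offset - b1.offset - b1.len < seg_len_a;
}

// Sort the b-side segments by offset, then merge neighbours in one scan.
// Returns the segments that still need a runtime check against a.
std::vector<Seg> combine_checks(std::vector<Seg> bs, long seg_len_a) {
  std::sort(bs.begin(), bs.end(),
            [](const Seg& x, const Seg& y) { return x.offset < y.offset; });
  std::vector<Seg> out;
  for (const Seg& b : bs) {
    if (!out.empty() && can_combine(out.back(), b, seg_len_a)) {
      // Grow the previous segment so that it also covers b.
      Seg& last = out.back();
      long end = std::max(last.offset + last.len, b.offset + b.len);
      last.len = end - last.offset;
    } else {
      out.push_back(b);
    }
  }
  return out;
}
```

For the b[0]/b[100] loop above with 4-byte ints, the inputs are {0, 4} and
{400, 4} with a segment length of 400 for a, so the gap of 396 bytes is
smaller than 400 and the two checks collapse into one over 404 bytes.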


thanks,
Cong



Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c (revision 202662)
+++ gcc/tree-vect-loop-manip.c (working copy)
@@ -19,6 +19,10 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3.  If not see
 .  */

+#include 
+#include 
+#include 
+
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer
   return segment_length;
 }

+namespace
+{
+
+/* struct dr_addr_with_seg_len
+
+   A struct storing information of a data reference, including the data
+   ref itself, its basic address, the access offset and the segment length
+   for aliasing checks.  */
+
+struct dr_addr_with_seg_len
+{
+  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+  data_reference* dr;
+  tree basic_addr;
+  tree offset;
+  tree seg_len;
+};
+
+/* Operator == between two dr_addr_with_seg_len objects.
+
+   This equality operator is used to make sure two data refs
+   are the same one so that we will consider to combine the
+   aliasing checks of those two pairs of data dependent data
+   refs.  */
+
+bool operator == (const dr_addr_with_seg_len& d1,
+                  const dr_addr_with_seg_len& d2)
+{
+  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
+         && operand_equal_p (d1.offset, d2.offset, 0)
+         && operand_equal_p (d1.seg_len, d2.seg_len, 0);
+}
+
+typedef std::pair <dr_addr_with_seg_len, dr_addr_with_seg_len>
+ dr_addr_with_seg_len_pair_t;
+
+
+/* Operator < between two dr_addr_with_seg_len_pair_t objects.
+
+   This operator is used to sort objects of dr_addr_with_seg_len_pair_t
+   so that we can combine aliasing checks during one scan.  */
+
+bool operator < (const dr_addr_with_seg_len_pair_t& p1,
+                 const dr_addr_with_seg_len_pair_t& p2)
+{
+  const dr_addr_with_seg_len& p11 = p1.first;
+  const dr_addr_with_seg_len& p12 = p1.second;
+  const dr_addr_with_seg_len& p21 = p2.first;
+  const dr_addr_with_seg_len& p22 = p2.second;
+
+  if (p11.basic_addr != p21.basic_addr)
+    return p11.basic_addr < p21.basic_addr;
+  if (p12.basic_addr != p22.basic_addr)
+    return p12.basic_addr < p22.basic_addr;
+  if (TREE_CODE (p11.offset) != INTEGER_CST
+      || TREE_CODE (p21.offset) != INTEGER_CST)
+    return p11.offset < p21.offset;
+  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
+    return int_cst_value (p11.offset) < int_cst_value (p21.offset);
+  if (TREE_CODE (p12.offset) != INTEGER_CST
+      || TREE_CODE (p22.offset) != INTEGER_CST)
+    return p12.offset < p22.offset;
+  return int_cst_value (p12.offset) < int_cst_value (p22.offset);
+}
+
+}

 /* Function vect_create_cond_for_alias_checks.

@@ -2292,20 +2364,51 @@ vect_create_cond_for_alias_checks (loop_
   if (may_alias_ddrs.is_empty ())
 return;

+
+  /* Basically, for each pair of dependent data refs store_ptr_0
+ and load_ptr_0, we create an expression:
+
+ ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
+ || (load_ptr_0 + load_segment_length_0) <= stor

Re: [PATCH] Reducing number of alias checks in vectorization.

2013-10-01 Thread pinskia


> On Oct 1, 2013, at 7:12 PM, Cong Hou  wrote:
> 
> When alias exists between data refs in a loop, to vectorize it GCC
> does loop versioning and adds runtime alias checks. Basically for each
> pair of data refs with possible data dependence, there will be two
> comparisons generated to make sure there is no aliasing between them
> in each iteration of the vectorized loop. If there are many such data
> refs pairs, the number of comparisons can be very large, which is a
> big overhead.
> 
> However, in some cases it is possible to reduce the number of those
> comparisons. For example, for the following loop, we can detect that
> b[0] and b[1] are two consecutive member accesses so that we can
> combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into
> checking a[0:100]&b[0:2]:
> 
> void foo(int*a, int* b)
> {
>   for (int i = 0; i < 100; ++i)
>a[i] = b[0] + b[1];
> }
> 
> Actually, the requirement of consecutive memory accesses is too
> strict. For the following loop, we can still combine the alias checks
> between a[0:100]&b[0] and a[0:100]&b[100]:
> 
> void foo(int*a, int* b)
> {
>   for (int i = 0; i < 100; ++i)
>a[i] = b[0] + b[100];
> }
> 
> This is because if b[0] is not in a[0:100] and b[100] is not in
> a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need
> to check a[0:100] and b[0:101] don't overlap.
> 
> More generally, consider two pairs of data refs (a, b1) and (a, b2).
> Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2;
> offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and
> b2, and segment_length_a, segment_length_b1, and segment_length_b2 are
> segment length of a, b1, and b2. Then we can combine the two
> comparisons into one if the following condition is satisfied:
> 
> offset_b2- offset_b1 - segment_length_b1 < segment_length_a
> 
> 
> This patch detects those combination opportunities to reduce the
> number of alias checks. It is tested on an x86-64 machine.

I like the idea of this patch, but I am not really a fan of using the STL.
It seems to depend a little too much on C++ features for my liking.

Thanks,
Andrew

> 
> 
> thanks,
> Cong
> 
> 
> 
> Index: gcc/tree-vect-loop-manip.c
> ===
> --- gcc/tree-vect-loop-manip.c (revision 202662)
> +++ gcc/tree-vect-loop-manip.c (working copy)
> @@ -19,6 +19,10 @@ You should have received a copy of the G
> along with GCC; see the file COPYING3.  If not see
> .  */
> 
> +#include 
> +#include 
> +#include 
> +
> #include "config.h"
> #include "system.h"
> #include "coretypes.h"
> @@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer
>   return segment_length;
> }
> 
> +namespace
> +{
> +
> +/* struct dr_addr_with_seg_len
> +
> +   A struct storing information of a data reference, including the data
> +   ref itself, its basic address, the access offset and the segment length
> +   for aliasing checks.  */
> +
> +struct dr_addr_with_seg_len
> +{
> +  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
> +: dr (d), basic_addr (addr), offset (off), seg_len (len) {}
> +
> +  data_reference* dr;
> +  tree basic_addr;
> +  tree offset;
> +  tree seg_len;
> +};
> +
> +/* Operator == between two dr_addr_with_seg_len objects.
> +
> +   This equality operator is used to make sure two data refs
> +   are the same one so that we will consider to combine the
> +   aliasing checks of those two pairs of data dependent data
> +   refs.  */
> +
> +bool operator == (const dr_addr_with_seg_len& d1,
> +  const dr_addr_with_seg_len& d2)
> +{
> +  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
> + && operand_equal_p (d1.offset, d2.offset, 0)
> + && operand_equal_p (d1.seg_len, d2.seg_len, 0);
> +}
> +
> +typedef std::pair <dr_addr_with_seg_len, dr_addr_with_seg_len>
> + dr_addr_with_seg_len_pair_t;
> +
> +
> +/* Operator < between two dr_addr_with_seg_len_pair_t objects.
> +
> +   This operator is used to sort objects of dr_addr_with_seg_len_pair_t
> +   so that we can combine aliasing checks during one scan.  */
> +
> +bool operator < (const dr_addr_with_seg_len_pair_t& p1,
> + const dr_addr_with_seg_len_pair_t& p2)
> +{
> +  const dr_addr_with_seg_len& p11 = p1.first;
> +  const dr_addr_with_seg_len& p12 = p1.second;
> +  const dr_addr_with_seg_len& p21 = p2.first;
> +  const dr_addr_with_seg_len& p22 = p2.second;
> +
> +  if (p11.basic_addr != p21.basic_addr)
> +return p11.basic_addr < p21.basic_addr;
> +  if (p12.basic_addr != p22.basic_addr)
> +return p12.basic_addr < p22.basic_addr;
> +  if (TREE_CODE (p11.offset) != INTEGER_CST
> +  || TREE_CODE (p21.offset) != INTEGER_CST)
> +return p11.offset < p21.offset;
> +  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
> +return int_cst_value (p11.offset) < int_cst_value (p21.offset);
> +  if (TREE_CODE (p12.offset) != INTEGER_CST
> +  || TREE_CODE (p22.offset) != INTEGER_CST)
> +return p12.offset < p22.offset

[wwwdocs] Mention libstdc++-v3 in 4.9 changes.html

2013-10-01 Thread Tim Shen
Hi, libstdc++-v3  is ready for release.

Is it Ok to apply? By the way, do we need a News entry for this improvement?

Thanks!

Index: htdocs/gcc-4.9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.27
diff -r1.27 changes.html
136a137,140
>  href="http://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2011";>
>Improved experimental support for the new ISO C++ standard, C++11,
>including support for .
> 


-- 
Tim Shen

