date:20151005

Re: [PATCH] SH FDPIC backend support

2015-10-05 Thread Kaz Kojima

Oleg Endo  wrote:
>> So apparently the strange behavior I observed is intended. Presumably
>> there is some mechanism to ensure that these functions are always
>> static-linked? But I don't see it. The libgcc spec I see is:
>> 
>> *libgcc:
>> %{static|static-libgcc:-lgcc
>> -lgcc_eh}%{!static:%{!static-libgcc:%{!shared-libgcc:-lgcc --as-needed
>> -lgcc_s --no-as-needed}%{shared-libgcc:-lgcc_s%{!shared: -lgcc
>> 
>> This explicitly omits -lgcc when -shared-libgcc is used with -shared.
>> Thankfully __ashlsi3_r0 is not exported from libgcc.so.1 (as far as I
>> can tell), so this will just be a link error rather than horribly
>> wrong behavior, but it still seems like there's a bug here unless I'm
>> misunderstanding something. I think the final %{!shared: -lgcc} in the
>> spec is an error and should be replaced by simply -lgcc if there are
>> targets where libgcc.a contains necessary symbols that are not/cannot
>> be defined in libgcc_s.so.1.
> 
> Hm, maybe, but I don't know enough about this, sorry.  Kaz, maybe you
> have a comment on that?

Sorry for my late reply.  I was traveling.
I think that almost all Linux targets use a linker script libgcc_s.so
which includes -lgcc.  See trunk/libgcc/config/t-slibgcc-libgcc.
The target micro functions are statically linked with it.

Regards,
kaz

Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread Bernd Schmidt


On 10/02/2015 10:57 PM, Steve Ellcey wrote:

I have spent some time trying to do dynamic stack alignment on MIPS and had
considerable trouble.  The problems are mainly due to the dwarf based stack
unwinding and setjmp/longjmp usages where the code does not go through the
normal function prologue and epilogue code.

[...]

The main advantage to this approach over dynamically aligning the stack
is that by not changing the real stack (or frame) pointer there is
minimal chance of breaking the ABI and there are no changes needed to
the dwarf unwind code.  The main disadvantage is that I am padding each
individual spill so I am wasting more space than absolutely required.
It should be possible to address this by putting all the aligned spills
together and sharing the padding but I would like to leave that for a
future improvement.
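
As a rough illustration of the per-spill padding described above (a sketch
under assumptions, not code from the patch; the helper names
padded_spill_size and round_up are hypothetical): if the stack only
guarantees stack_align bytes and a spill wants want_align, reserving
size + (want_align - stack_align) bytes always leaves room for an aligned
object somewhere inside the slot.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes to reserve for a spill of SIZE bytes that wants
// WANT_ALIGN when the slot itself is only guaranteed STACK_ALIGN.
// Both alignments are assumed to be powers of two.
constexpr std::uint64_t
padded_spill_size (std::uint64_t size, std::uint64_t want_align,
                   std::uint64_t stack_align)
{
  return want_align > stack_align ? size + (want_align - stack_align) : size;
}

// Round ADDR up to the next multiple of ALIGN (a power of two).
constexpr std::uint64_t
round_up (std::uint64_t addr, std::uint64_t align)
{
  return (addr + align - 1) & ~(align - 1);
}
```

With a 16-byte value on an 8-byte-aligned stack, each spill reserves 24
bytes, and the value is stored at round_up (slot_address, 16).  The unused
8 bytes per spill are the waste mentioned above.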

In the meantime I would like to get some comments on this approach and
see what people think.  Does this seem like a reasonable approach to
allowing for aligned spills beyond what the stack supports?


Personally I'm not a fan. Your description of it makes it sound 
immensely wasteful, and I'm really not clear on why stack alignment 
wouldn't work for MIPS when it's been shown to work elsewhere. I think 
we'd want to see a clear demonstration of unfixable problems with stack 
alignment before allowing something like this in.


Vlad would have to comment on the LRA bits, probably.


Bernd

Re: [PATCH 0/4] gimple accessor const correctness fixes

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 02:25 AM, tbsaunde+...@tbsaunde.org wrote:

the first patch is just some cleanup I ran into along the way, but the rest of
this series fixes const correctness for all of the gimple_x_ptr () functions.
I was able to just remove a couple of them, but unfortunately most are needed
for either tree walking, or the use-def data structures which store pointers
into gimple statements.
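
For readers unfamiliar with the issue, here is a minimal sketch of the
pattern being fixed.  It uses a hypothetical struct stmt rather than GCC's
gimple type, and the function names are made up for illustration: a
read-only accessor may take a const object, while an accessor that hands
out a mutable pointer should require a non-const object so that no
const_cast is needed.

```cpp
#include <cassert>

// Hypothetical stand-in for a statement with operands; not GCC's gimple.
struct stmt
{
  int op[3];
};

// Read-only accessor: taking a const statement is fine.
inline int
stmt_op (const stmt *s, unsigned i)
{
  return s->op[i];
}

// Pointer accessor: the caller can write through the result, so require a
// non-const statement instead of casting constness away.
inline int *
stmt_op_ptr (stmt *s, unsigned i)
{
  return &s->op[i];
}

// Small demonstration: write through the pointer, read back the value.
inline int
demo ()
{
  stmt s = { { 1, 2, 3 } };
  *stmt_op_ptr (&s, 1) = 7;
  return stmt_op (&s, 1);
}
```

With this signature, calling stmt_op_ptr on a const stmt * is rejected at
compile time, which is the property the series is after.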

patches individually bootstrapped + regtested on x86_64-gnu-linux, ok?


It all looks reasonable. Ok.


Bernd

Re: [PATCH] Add verifier for leaked SSA names

2015-10-05 Thread Richard Biener

On Fri, Oct 2, 2015 at 5:24 PM, Jeff Law  wrote:
> On 10/02/2015 01:37 AM, Richard Biener wrote:
>>
>>
>> The following patch doesn't pass bootstrap & regtest.  It did at some
>> point though and its comment hints that fixing leaks after inlining
>> was too interesting a problem to solve ;)
>>
>> Thus patch is FYI.
>>
>> Richard.
>>
>> Index: tree-ssa.c
>> ===
>> --- tree-ssa.c  (revision 228320)
>> +++ tree-ssa.c  (working copy)
>> @@ -693,6 +693,16 @@ verify_def (basic_block bb, basic_block
>> goto err;
>>   }
>>
>> +  if (bb == NULL
>> +  /* ???  Too many latent cases in the main opt pipeline.  But it's
>> + worth to fix all cases before inlining as that reduces the
>> +amount of garbage kept live.  */
>> +  && !cfun->after_inlining)
>> +{
>> +  error ("removed STMT failed to release SSA name");
>> +  goto err;
>> +}
>> +
>
> I was building the verification step into the ssa name manager. Essentially
> at the point where we flush from the pending to the free list, we should
> have a consistent state.

Yeah, though when SSA verifiers run the state should also be consistent
and we'd get to pinpoint the offending pass easier.

> Thus we ought to be able to walk the IL marking everything we can see,
> combine that with the contents of the freelist and the result ought to be
> every SSA_NAME ever created.
>
> Reality is somewhat different, of course.
>
> Yours takes a slightly different approach.  Ultimately if we get the leaks
> plugged, we might even consider using both.

Sure.  Note that the above is from simply walking all SSA names.
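
The consistency check Jeff describes can be sketched with a toy model
(an illustration only; SSA names are modelled as plain integers and
leaked_names is a hypothetical helper, not part of GCC): every name ever
created should be either visible in the IL or on the free list, and
anything else has leaked.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model: names are the integers 0 .. num_created - 1.
// REACHABLE holds the names found by walking the IL, FREELIST the names
// that were released.  Whatever is in neither set has leaked.
std::vector<int>
leaked_names (int num_created, const std::set<int> &reachable,
              const std::set<int> &freelist)
{
  std::vector<int> leaked;
  for (int v = 0; v < num_created; ++v)
    if (!reachable.count (v) && !freelist.count (v))
      leaked.push_back (v);
  return leaked;
}
```

For example, with five names created, names 0, 1 and 3 reachable and
name 4 on the free list, the only leaked name is 2.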

Richard.

> jeff
>
>
>

Re: [PATCH] x86 interrupt attribute

2015-10-05 Thread Mike Stump

On Oct 4, 2015, at 11:15 AM, H.J. Lu  wrote:
> Current stack alignment implementation requires at least
> one, maybe two, scratch registers:

So, I have some cases where I need scratch registers as well.  I always save 2 
registers and they go first (and restore last), so I can always use them.

Re: RFC: PATCH for front end parts of C++ transactional memory TS

2015-10-05 Thread Andreas Schwab

Jason Merrill  writes:

> diff --git a/gcc/testsuite/g++.dg/tm/eh1.C b/gcc/testsuite/g++.dg/tm/eh1.C
> new file mode 100644
> index 000..1561211
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tm/eh1.C
> @@ -0,0 +1,10 @@
> +// A handler can involve a transaction-safety conversion.
> +// { dg-do run }
> +// { dg-options "-fgnu-tm" }
> +
> +void g() transaction_safe {}
> +int main()
> +{
> +  try { throw g; }
> +  catch (void (*p)()) { }
> +}

FAIL: g++.dg/tm/eh1.C  -std=gnu++98 (test for excess errors)
Excess errors:
xg++: error: libitm.spec: No such file or directory

There is no libitm support on ia64.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

[Patch, avr] Fix PR 67839 - bit addressable instructions generated for out of range addresses

2015-10-05 Thread Senthil Kumar Selvaraj

Hi,

  As part of support for io and io_low attributes, the upper bound of
  the range check for low IO and IO addresses was changed from hardcoded
  values to hardcoded_range_end + 1 - GET_MODE_SIZE(mode).

  GCC passes VOID as the mode from genrecog, and GET_MODE_SIZE returns
  0, resulting in the range getting incorrectly extended by a byte.

  Not sure why it was done, as the mode of the operand shouldn't really
  matter when computing the upper bound. In any case, the insns that use
  the predicate already have a mem:QI wrapping it, and all the bit
  addressable instructions operate on a single IO register only.
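
  The off-by-one can be reproduced outside the compiler with a small
  sketch (an illustration only: in_range mimics an inclusive range check,
  and low_io_old / low_io_new are hypothetical names standing in for the
  predicate's test before and after the patch, applied to an address from
  which the SFR offset has already been subtracted).

```cpp
#include <cassert>

// Inclusive range check: true when LOWER <= VALUE <= UPPER.
constexpr bool
in_range (long value, long lower, long upper)
{
  return value >= lower && value <= upper;
}

// Upper bound as computed before the patch: 0x20 - mode_size.
// With a mode size of 0 the bound becomes 0x20, one byte too far.
constexpr bool
low_io_old (long addr, int mode_size)
{
  return in_range (addr, 0, 0x20 - mode_size);
}

// Upper bound after the patch: hardcoded to the last valid address.
constexpr bool
low_io_new (long addr)
{
  return in_range (addr, 0, 0x1F);
}
```

  Here low_io_old (0x20, 0) accepts the out-of-range address 0x20, while
  low_io_old (0x20, 1) and low_io_new (0x20) both reject it.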

  This patch reverts the check back to a hardcoded value, and adds a
  test to prevent regressions.

  No new regression failures. If ok, could someone commit please? I
  don't have commit access.


Regards
Senthil

gcc/ChangeLog

2015-10-05  Senthil Kumar Selvaraj  

PR target/67839
* config/avr/predicates.md (low_io_address_operand): Don't
consider MODE when computing upper bound.
(io_address_operand): Likewise.


gcc/testsuite/ChangeLog

2015-10-05  Senthil Kumar Selvaraj  

PR target/67839
* gcc.target/avr/pr67839.c: New test.



diff --git gcc/config/avr/predicates.md gcc/config/avr/predicates.md
index 2d12bc6..622bc0b 100644
--- gcc/config/avr/predicates.md
+++ gcc/config/avr/predicates.md
@@ -46,7 +46,7 @@
 (define_special_predicate "low_io_address_operand"
   (ior (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op) - avr_arch->sfr_offset,
-  0, 0x20 - GET_MODE_SIZE (mode))"))
+  0, 0x1F)"))
(and (match_code "symbol_ref")
(match_test "SYMBOL_REF_FLAGS (op) & SYMBOL_FLAG_IO_LOW"
 
@@ -60,7 +60,7 @@
 (define_special_predicate "io_address_operand"
   (ior (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op) - avr_arch->sfr_offset,
-  0, 0x40 - GET_MODE_SIZE (mode))"))
+  0, 0x3F)"))
(and (match_code "symbol_ref")
(match_test "SYMBOL_REF_FLAGS (op) & SYMBOL_FLAG_IO"
 
diff --git gcc/testsuite/gcc.target/avr/pr67839.c 
gcc/testsuite/gcc.target/avr/pr67839.c
new file mode 100644
index 000..604ab4b
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr67839.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+/* { dg-final { scan-assembler "sbi 0x1f,0" } } */
+/* { dg-final { scan-assembler "cbi 0x1f,0" } } */
+/* { dg-final { scan-assembler-not "sbi 0x20,0" } } */
+/* { dg-final { scan-assembler-not "cbi 0x20,0" } } */
+/* { dg-final { scan-assembler "in r\\d+,__SREG__" } } */
+/* { dg-final { scan-assembler "out __SREG__,r\\d+" } } */
+/* { dg-final { scan-assembler-not "in r\\d+,0x40" } } */
+/* { dg-final { scan-assembler-not "out 0x40, r\\d+" } } */
+
+/* This testcase verifies that SBI/CBI/SBIS/SBIC
+   and IN/OUT instructions are not generated for
+   an IO addresses outside the valid range.
+*/
+#define IO_ADDR(x) (*((volatile char *)x + __AVR_SFR_OFFSET__))
+int main ()
+{
+  IO_ADDR(0x1f) |= 1;
+  IO_ADDR(0x1f) &= 0xFE;
+
+  IO_ADDR(0x20) |= 1;
+  IO_ADDR(0x20) &= 0xFE;
+
+  IO_ADDR(0x3f) = IO_ADDR(0x3f);
+
+  IO_ADDR(0x40) = IO_ADDR(0x40);
+  return 0;
+}

Re: [PATCH] Improve DOM's optimization of control statements

2015-10-05 Thread Richard Biener

On Fri, Oct 2, 2015 at 9:30 PM, Jeff Law  wrote:
> On 10/02/2015 05:15 AM, Renlin Li wrote:
>>
>> Hi Jeff,
>>
>> Your patch causes an ICE regression.
>> The test case is " gcc.c-torture/compile/pr27087.c", I observed it on
>> aarch64-none-elf target when compiling the test case with '-Os' flag.
>>
>> A quick check shows, the cfg has been changed, but the loop information
>> is not updated. Thus the information about the number of basic block in
>> a loop is not reliable.
>>
>> Could you please have a look?
>
> As I mentioned, when we collapse a conditional inside a loop, we may change
> the # of nodes in a loop, which edges are exit edges, and possibly other
> stuff.  So we need to mark loops as needing fixups.
>
> Verified this fixes the aarch64-elf regression and did a bootstrap &
> regression test on x86_64-linux-gnu.
>
> Installed on the trunk.
>
> jeff
>
> commit 992d281b2d1ba53a49198db44fee92a505e16f5d
> Author: Jeff Law 
> Date:   Fri Oct 2 15:22:04 2015 -0400
>
> Re: [PATCH] Improve DOM's optimization of control statements
>
> * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
> fixups.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3f7561a..e541df3 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-10-02  Jeff Law  
> +
> +   * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
> +   fixups.
> +
>  2015-10-02  Uros Bizjak  
>
> * system.h (ROUND_UP): New macro definition.
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index a8b7038..d940816 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -1843,6 +1843,12 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator
> si,
>   /* Delete threads that start at BB.  */
>   remove_jump_threads_starting_at (bb);
>
> + /* If BB is in a loop, then removing an outgoing edge from BB
> +may cause BB to move outside the loop, changes in the
> +loop exit edges, etc.  So note that loops need fixing.  */
> + if (bb_loop_depth (bb) > 0)
> +   loops_state_set (LOOPS_NEED_FIXUP);
> +

I would rather do this in remove_ctrl_stmt_and_useless_edges and only
if taken_edge is a loop exit.  loop fixup is a pretty big hammer which
we should avoid at all cost.

So please try to be more specific about the cases in which you invoke it.

Thanks,
Richard.

>   /* Now clean up the control statement at the end of
>  BB and remove unexecutable edges.  */
>   remove_ctrl_stmt_and_useless_edges (bb, taken_edge->dest);
>

Re: [PATCH 1/4] make build_uses store tree * instead of tree

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 2:25 AM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:
>
> 2015-10-04  Trevor Saunders  
>
> * tree-ssa-operands.c (build_uses): store tree * instead of
> tree.
> (finalize_ssa_uses): Adjust.
> (append_use): Likewise.
> (verify_ssa_operands): Likewise.

Ok (I remember doing this at some point - not sure why I didn't end up
committing it...)

Richard.

> ---
>  gcc/tree-ssa-operands.c | 30 --
>  1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
> index 85f9cca..544e9df 100644
> --- a/gcc/tree-ssa-operands.c
> +++ b/gcc/tree-ssa-operands.c
> @@ -108,7 +108,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define opf_address_taken (1 << 5)
>
>  /* Array for building all the use operands.  */
> -static vec build_uses;
> +static vec build_uses;
>
>  /* The built VDEF operand.  */
>  static tree build_vdef;
> @@ -359,8 +359,7 @@ finalize_ssa_defs (struct function *fn, gimple *stmt)
>  }
>
>
> -/* Takes elements from build_uses and turns them into use operands of STMT.
> -   TODO -- Make build_uses vec of tree *.  */
> +/* Takes elements from build_uses and turns them into use operands of STMT.  
> */
>
>  static inline void
>  finalize_ssa_uses (struct function *fn, gimple *stmt)
> @@ -379,7 +378,7 @@ finalize_ssa_uses (struct function *fn, gimple *stmt)
>if (oldvuse != (build_vuse != NULL_TREE
>   ? build_vuse : build_vdef))
> gimple_set_vuse (stmt, NULL_TREE);
> -  build_uses.safe_insert (0, (tree)gimple_vuse_ptr (stmt));
> +  build_uses.safe_insert (0, gimple_vuse_ptr (stmt));
>  }
>
>new_list.next = NULL;
> @@ -415,7 +414,7 @@ finalize_ssa_uses (struct function *fn, gimple *stmt)
>/* Now create nodes for all the new nodes.  */
>for (new_i = 0; new_i < build_uses.length (); new_i++)
>  {
> -  tree *op = (tree *) build_uses[new_i];
> +  tree *op = build_uses[new_i];
>last = add_use_op (fn, stmt, op, last);
>  }
>
> @@ -463,7 +462,7 @@ start_ssa_stmt_operands (void)
>  static inline void
>  append_use (tree *use_p)
>  {
> -  build_uses.safe_push ((tree) use_p);
> +  build_uses.safe_push (use_p);
>  }
>
>
> @@ -964,7 +963,7 @@ verify_ssa_operands (struct function *fn, gimple *stmt)
>def_operand_p def_p;
>ssa_op_iter iter;
>unsigned i;
> -  tree use, def;
> +  tree def;
>bool volatile_p = gimple_has_volatile_ops (stmt);
>
>/* build_ssa_operands w/o finalizing them.  */
> @@ -990,7 +989,7 @@ verify_ssa_operands (struct function *fn, gimple *stmt)
>return true;
>  }
>
> -  use = gimple_vuse (stmt);
> +  tree use = gimple_vuse (stmt);
>if (use
>&& TREE_CODE (use) == SSA_NAME)
>  use = SSA_NAME_VAR (use);
> @@ -1009,11 +1008,12 @@ verify_ssa_operands (struct function *fn, gimple 
> *stmt)
>
>FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
>  {
> -  FOR_EACH_VEC_ELT (build_uses, i, use)
> +  tree *op;
> +  FOR_EACH_VEC_ELT (build_uses, i, op)
> {
> - if (use_p->use == (tree *)use)
> + if (use_p->use == op)
> {
> - build_uses[i] = NULL_TREE;
> + build_uses[i] = NULL;
>   break;
> }
> }
> @@ -1024,11 +1024,13 @@ verify_ssa_operands (struct function *fn, gimple 
> *stmt)
>   return true;
> }
>  }
> -  FOR_EACH_VEC_ELT (build_uses, i, use)
> -if (use != NULL_TREE)
> +
> +  tree *op;
> +  FOR_EACH_VEC_ELT (build_uses, i, op)
> +if (op != NULL)
>{
> error ("use operand missing for stmt");
> -   debug_generic_expr (*(tree *)use);
> +   debug_generic_expr (*op);
> return true;
>}
>
> --
> 2.4.0
>

Re: [PATCH 3/4] remove unused gasm accessors

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 2:25 AM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:
>
> 2015-10-04  Trevor Saunders  
>
> * gimple.h (gimple_asm_input_op_ptr): Remove.
> (gimple_asm_output_op_ptr): Likewise.

Ok.

Thanks,
Richard.

> ---
>  gcc/gimple.h | 20 
>  1 file changed, 20 deletions(-)
>
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index cfd8d2c..9e7a911 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -3717,16 +3717,6 @@ gimple_asm_input_op (const gasm *asm_stmt, unsigned 
> index)
>return asm_stmt->op[index + asm_stmt->no];
>  }
>
> -/* Return a pointer to input operand INDEX of GIMPLE_ASM ASM_STMT.  */
> -
> -static inline tree *
> -gimple_asm_input_op_ptr (const gasm *asm_stmt, unsigned index)
> -{
> -  gcc_gimple_checking_assert (index < asm_stmt->ni);
> -  return const_cast (&asm_stmt->op[index + asm_stmt->no]);
> -}
> -
> -
>  /* Set IN_OP to be input operand INDEX in GIMPLE_ASM ASM_STMT.  */
>
>  static inline void
> @@ -3747,16 +3737,6 @@ gimple_asm_output_op (const gasm *asm_stmt, unsigned 
> index)
>return asm_stmt->op[index];
>  }
>
> -/* Return a pointer to output operand INDEX of GIMPLE_ASM ASM_STMT.  */
> -
> -static inline tree *
> -gimple_asm_output_op_ptr (const gasm *asm_stmt, unsigned index)
> -{
> -  gcc_gimple_checking_assert (index < asm_stmt->no);
> -  return const_cast (&asm_stmt->op[index]);
> -}
> -
> -
>  /* Set OUT_OP to be output operand INDEX in GIMPLE_ASM ASM_STMT.  */
>
>  static inline void
> --
> 2.4.0
>

Re: [PATCH 2/4] remove gimple_location_ptr ()

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 2:25 AM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:
>
> 2015-10-04  Trevor Saunders  
>
> * gimple.h (gimple_location_ptr): Remove.
> * tree-vrp.c (check_all_array_refs): Adjust.

Ok.

Thanks,
Richard.

> ---
>  gcc/gimple.h   | 9 -
>  gcc/tree-vrp.c | 5 +++--
>  2 files changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index 30b1041..cfd8d2c 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -1734,15 +1734,6 @@ gimple_location_safe (const gimple *g)
>return g ? gimple_location (g) : UNKNOWN_LOCATION;
>  }
>
> -/* Return pointer to location information for statement G.  */
> -
> -static inline const location_t *
> -gimple_location_ptr (const gimple *g)
> -{
> -  return &g->location;
> -}
> -
> -
>  /* Set location information for statement G.  */
>
>  static inline void
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 3bc3b03..ef5ef10 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -6717,8 +6717,9 @@ check_all_array_refs (void)
> continue;
>
>   memset (&wi, 0, sizeof (wi));
> - wi.info = CONST_CAST (void *, (const void *)
> -   gimple_location_ptr (stmt));
> +
> + location_t loc = gimple_location (stmt);
> + wi.info = &loc;
>
>   walk_gimple_op (gsi_stmt (si),
>   check_array_bounds,
> --
> 2.4.0
>

Re: [PATCH 4/4] make more gimple_x_ptr accessors const correct

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 2:25 AM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:
>
> 2015-10-04  Trevor Saunders  
>
> * gimple.h (gimple_op_ptr): Require a non const gimple *.
> (gimple_assign_lhs_ptr): Likewise.
> (gimple_assign_rhs1_ptr): Likewise.
> (gimple_assign_rhs2_ptr): Likewise.
> (gimple_assign_rhs3_ptr): Likewise.
> (gimple_call_lhs_ptr): Likewise.
> (gimple_call_fn_ptr): Likewise.
> (gimple_call_chain_ptr): Likewise.
> (gimple_call_arg_ptr): Likewise.
> (gimple_cond_lhs_ptr): Likewise.
> (gimple_cond_rhs_ptr): Likewise.
> (gimple_switch_index_ptr): Likewise.
> (gimple_return_retval_ptr): Likewise.

Ok.

Thanks,
Richard.

> ---
>  gcc/gimple.h | 78 
> ++--
>  1 file changed, 39 insertions(+), 39 deletions(-)
>
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index 9e7a911..a456f54 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -2346,12 +2346,12 @@ gimple_op (const gimple *gs, unsigned i)
>  /* Return a pointer to operand I for statement GS.  */
>
>  static inline tree *
> -gimple_op_ptr (const gimple *gs, unsigned i)
> +gimple_op_ptr (gimple *gs, unsigned i)
>  {
>if (gimple_has_ops (gs))
>  {
>gcc_gimple_checking_assert (i < gimple_num_ops (gs));
> -  return gimple_ops (CONST_CAST_GIMPLE (gs)) + i;
> +  return gimple_ops (gs) + i;
>  }
>else
>  return NULL;
> @@ -2407,15 +2407,15 @@ gimple_assign_lhs (const gimple *gs)
>  /* Return a pointer to the LHS of assignment statement GS.  */
>
>  static inline tree *
> -gimple_assign_lhs_ptr (const gassign *gs)
> +gimple_assign_lhs_ptr (gassign *gs)
>  {
> -  return const_cast (&gs->op[0]);
> +  return &gs->op[0];
>  }
>
>  static inline tree *
> -gimple_assign_lhs_ptr (const gimple *gs)
> +gimple_assign_lhs_ptr (gimple *gs)
>  {
> -  const gassign *ass = GIMPLE_CHECK2 (gs);
> +  gassign *ass = GIMPLE_CHECK2 (gs);
>return gimple_assign_lhs_ptr (ass);
>  }
>
> @@ -2459,15 +2459,15 @@ gimple_assign_rhs1 (const gimple *gs)
> statement GS.  */
>
>  static inline tree *
> -gimple_assign_rhs1_ptr (const gassign *gs)
> +gimple_assign_rhs1_ptr (gassign *gs)
>  {
> -  return const_cast (&gs->op[1]);
> +  return &gs->op[1];
>  }
>
>  static inline tree *
> -gimple_assign_rhs1_ptr (const gimple *gs)
> +gimple_assign_rhs1_ptr (gimple *gs)
>  {
> -  const gassign *ass = GIMPLE_CHECK2 (gs);
> +  gassign *ass = GIMPLE_CHECK2 (gs);
>return gimple_assign_rhs1_ptr (ass);
>  }
>
> @@ -2511,16 +2511,16 @@ gimple_assign_rhs2 (const gimple *gs)
> statement GS.  */
>
>  static inline tree *
> -gimple_assign_rhs2_ptr (const gassign *gs)
> +gimple_assign_rhs2_ptr (gassign *gs)
>  {
>gcc_gimple_checking_assert (gimple_num_ops (gs) >= 3);
> -  return const_cast (&gs->op[2]);
> +  return &gs->op[2];
>  }
>
>  static inline tree *
> -gimple_assign_rhs2_ptr (const gimple *gs)
> +gimple_assign_rhs2_ptr (gimple *gs)
>  {
> -  const gassign *ass = GIMPLE_CHECK2 (gs);
> +  gassign *ass = GIMPLE_CHECK2 (gs);
>return gimple_assign_rhs2_ptr (ass);
>  }
>
> @@ -2564,11 +2564,11 @@ gimple_assign_rhs3 (const gimple *gs)
> statement GS.  */
>
>  static inline tree *
> -gimple_assign_rhs3_ptr (const gimple *gs)
> +gimple_assign_rhs3_ptr (gimple *gs)
>  {
> -  const gassign *ass = GIMPLE_CHECK2 (gs);
> +  gassign *ass = GIMPLE_CHECK2 (gs);
>gcc_gimple_checking_assert (gimple_num_ops (gs) >= 4);
> -  return const_cast (&ass->op[3]);
> +  return &ass->op[3];
>  }
>
>
> @@ -2764,15 +2764,15 @@ gimple_call_lhs (const gimple *gs)
>  /* Return a pointer to the LHS of call statement GS.  */
>
>  static inline tree *
> -gimple_call_lhs_ptr (const gcall *gs)
> +gimple_call_lhs_ptr (gcall *gs)
>  {
> -  return const_cast (&gs->op[0]);
> +  return &gs->op[0];
>  }
>
>  static inline tree *
> -gimple_call_lhs_ptr (const gimple *gs)
> +gimple_call_lhs_ptr (gimple *gs)
>  {
> -  const gcall *gc = GIMPLE_CHECK2 (gs);
> +  gcall *gc = GIMPLE_CHECK2 (gs);
>return gimple_call_lhs_ptr (gc);
>  }
>
> @@ -2948,15 +2948,15 @@ gimple_call_fn (const gimple *gs)
> statement GS.  */
>
>  static inline tree *
> -gimple_call_fn_ptr (const gcall *gs)
> +gimple_call_fn_ptr (gcall *gs)
>  {
> -  return const_cast (&gs->op[1]);
> +  return &gs->op[1];
>  }
>
>  static inline tree *
> -gimple_call_fn_ptr (const gimple *gs)
> +gimple_call_fn_ptr (gimple *gs)
>  {
> -  const gcall *gc = GIMPLE_CHECK2 (gs);
> +  gcall *gc = GIMPLE_CHECK2 (gs);
>return gimple_call_fn_ptr (gc);
>  }
>
> @@ -3052,9 +3052,9 @@ gimple_call_chain (const gimple *gs)
>  /* Return a pointer to the static chain for call statement CALL_STMT.  */
>
>  static inline tree *
> -gimple_call_chain_ptr (const gcall *call_stmt)
> +gimple_call_chain_ptr (gcall *call_stmt)
>  {
> -  return const_cast (&call_stmt->op[2]);
> +  return &call_stmt->op[2];
>  }
>
>  /* Set CHAIN to be the static chain for call st

Re: [PATCH] x86 interrupt attribute

2015-10-05 Thread Uros Bizjak

On Mon, Oct 5, 2015 at 1:17 AM, H.J. Lu  wrote:

>> Looking a bit deeper into the code, it looks that we want to realign
>> the stack in the interrupt handler. Let's assume that interrupt
>> handler is calling some other function that saves SSE vector regs to
>> the stack. According to the x86 ABI, incoming stack of the called
>> function is assumed to be aligned to 16 bytes. But, interrupt handler
>> violates this assumption, since the stack could be aligned to only 4
>> bytes for 32bit and 8 bytes for 64bit targets. Entering the called
>> function with stack, aligned to less than 16 bytes will certainly
>> violate ABI.
>>
>> So, it looks to me that we need to realign the stack in the interrupt
>> handler unconditionally to 16bytes. In this case, we also won't need
>> the following changes:
>>
>
> Current stack alignment implementation requires at least
> one, maybe two, scratch registers:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67841
>
> Extending it to the interrupt handler, which doesn't have any scratch
> registers, may require significant changes in the backend as well as
> the register allocator.

 But without realignment, the handler is unusable for anything but
 simple functions. The handler will crash when called function will try
 to save vector reg to stack.

>>>
>>> We can use unaligned load and store to avoid crash.
>>
>> Oh, sorry, I meant "called function will crash", like:
>>
>> -> interrupt when %rsp = 0x...8 ->
>> -> interrupt handler ->
>> -> calls some function that tries to save xmm reg to stack
>> -> crash in the called function
>>
>
> It should be fixed by this patch.   But we need to fix stack
> alignment in interrupt handler to avoid scratch register.
>
>
> --
> H.J.
> ---
> commit 15f48be1dc7ff48207927d0b835e593d058f695b
> Author: H.J. Lu 
> Date:   Sun Oct 4 16:14:03 2015 -0700
>
> Correctly set incoming stack boundary for interrupt handler
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 7ebdcd9..0f0cc3c 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12037,8 +12037,11 @@ ix86_minimum_incoming_stack_boundary (bool sibcall)
>  {
>unsigned int incoming_stack_boundary;
>
> +  /* Stack of interrupt handler is always aligned to word_mode.  */
> +  if (cfun->machine->func_type != TYPE_NORMAL)
> +incoming_stack_boundary = TARGET_64BIT ? 64 : 32;

Just a heads up that in order to support stack realignment on x86_64,
MIN_STACK_BOUNDARY will soon be changed to BITS_PER_WORD, so you can
use it in the line above. Please see comments #5 and #6 of PR 66697
[1].
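
The alignment problem discussed in this thread (a handler entered with
%rsp ending in 8, calling a function that assumes 16-byte alignment) can
be sketched numerically.  This is an illustration only; is_aligned and
align_down are hypothetical helpers, and the stack is assumed to grow
downwards, so realigning means rounding the stack pointer down.

```cpp
#include <cassert>
#include <cstdint>

// True when SP is a multiple of ALIGN (a power of two).
constexpr bool
is_aligned (std::uint64_t sp, std::uint64_t align)
{
  return (sp & (align - 1)) == 0;
}

// Round SP down to a multiple of ALIGN, as a realigning prologue would.
constexpr std::uint64_t
align_down (std::uint64_t sp, std::uint64_t align)
{
  return sp & ~(align - 1);
}
```

A stack pointer such as 0x7ffc1238 is word aligned (8 bytes) but not
16-byte aligned; align_down (0x7ffc1238, 16) yields 0x7ffc1230, which
satisfies the alignment the called function expects.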

>/* Prefer the one specified at command line. */
> -  if (ix86_user_incoming_stack_boundary)
> +  else if (ix86_user_incoming_stack_boundary)
>  incoming_stack_boundary = ix86_user_incoming_stack_boundary;
>/* In 32bit, use MIN_STACK_BOUNDARY for incoming stack boundary
>   if -mstackrealign is used, it isn't used for sibcall check and

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66697

Uros.

Re: [v3 PATCH] PR 67844

2015-10-05 Thread Jonathan Wakely


On 05/10/15 06:59 +0300, Ville Voutilainen wrote:

   PR 67844.
   * include/std/tuple (_TC::_NonNestedTuple): Eagerly reject
   conversions from tuple types same as the target tuple.
   * testsuite/20_util/tuple/67844.cc: New.


OK for trunk with a copyright header on the testcase. Thanks for the
quick fix.

Re: C PATCH for c/65345 (file-scope _Atomic expansion with floats)

2015-10-05 Thread Christophe Lyon

On 2 October 2015 at 19:08, Ramana Radhakrishnan
 wrote:
>
>
> On 01/10/15 17:18, Marek Polacek wrote:
>> On Thu, Oct 01, 2015 at 11:02:09AM -0400, David Edelsohn wrote:
>>> On Thu, Oct 1, 2015 at 10:49 AM, Marek Polacek  wrote:
 Joseph reminded me that I had forgotten about this patch.  As mentioned
 here , I'm
 removing the XFAILs in the tests so people are likely to see new FAILs.

 I think the following targets will need similar fix as the one below:
 * MIPS
 * rs6000
 * alpha
 * sparc
 * s390
 * arm
 * sh
 * aarch64

 I'm CCing the respective maintainers.  You might want to XFAIL those tests.
>
> Thanks for the heads up. The use of the _Atomic feature is not really
> target specific but more a language standards issue across all the
> architectures that the GCC project supports, therefore XFAILing them is the
> wrong approach imho.
>
>
>>>
>>> Why aren't you testing the appropriate fix on all of the targets?
>>
>> It's very improbable that I could fix and properly test all of them;
>> I simply don't have the cycles and resources to fix e.g. sh/sparc/alpha/mips.
>
> I don't think anyone expects you to be testing the patch on every single port.
>
> Even though these changes sit in the target hooks into various backends, you
> may be best placed to advise how target maintainers adjust their backends.
> If at that point this appears to be mechanical, it's been good practice in
> the community for folks to send patches that the maintainers can fully test
> even if the testing has been light for the proposed patch.
>
> However, I am not aware of a "policy" for these things other than that these
> sort of changes are selectively enforced in the community. Maybe we should
> think about it
>
>
>>
>> You want me to revert my fix, but I don't really see the point here; the
>> patch doesn't introduce any regressions, it's just that the new tests are
>> likely to FAIL.  It sounds preferable to me to fix 2 targets than to leave
>> all of them broken (and I bet many maintainers were unaware of the issue).
>>
>
>
>> Would XFAILing the new tests work for you, if you don't want to see any
>> new FAILs?
>>
>> If you still insist on reverting the patch, ok, but I think this PR is
>> unlikely to be resolved any time soon then.
>>
>>   Marek
>>
>
>
> I've had a quick look on aarch64 - changing the interface to use
> create_tmp_var_raw is rather mechanical. What I'm struggling with is
> figuring out whether the change for TARGET_EXPR is applicable in the
> arm / aarch64 backends.
>
> It took me a couple of minutes to trial the interface changes (attached) on
> aarch64 as I had a cross-compiler build tree lying around and could see that
> the compiler did not ICE with the 2 testcases provided and pr65345-4.c
> appeared to pass on hardware.
>
>
I've just created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67848
which covers aarch64 and arm-linux-gnueabihf targets.

> regards
> Ramana
>

[PR other/65021] mkoffloads -save-temps handling, and cleanup cleanup (was: lto wrapper verboseness)

2015-10-05 Thread Thomas Schwinge

Hi!

In a similar vein to the earlier patch to "Pass on the verbose flag "-v"
to/in the mkoffloads", here is a patch to make the mkoffloads handle
"-save-temps", and this patch also happens to address
, "nvptx mkoffload doesn't clean up its
temporary files".  OK for trunk?

commit 693388578c8b71ed0dd7ee07a4153442e92e17de
Author: Thomas Schwinge 
Date:   Mon Oct 5 11:36:51 2015 +0200

[PR other/65021] mkoffloads -save-temps handling, and cleanup cleanup

gcc/
PR other/65021
* config/i386/intelmic-mkoffload.c (mkoffload_atexit): Rename
function to...
(mkoffload_cleanup): ... this.  Adjust all users.
(maybe_unlink): Look at save_temps and verbose flags instead of
debug flag.
(main): Parse "-save-temps" flag.
(generate_target_descr_file, generate_target_offloadend_file)
(generate_host_descr_file, prepare_target_image): Pass it on.
* config/nvptx/mkoffload.c (tool_cleanup): Implement.
(mkoffload_cleanup): New function.
(maybe_unlink): Look at save_temps and verbose flags instead of
debug flag.
(main): Instead of calling utils_cleanup, register atexit handler
for mkoffload_cleanup.
(main): Parse "-save-temps" flag.
(compile_native, main): Pass it on.
* lto-wrapper.c (compile_offload_image): Likewise.
---
 gcc/ChangeLog|  2 +-
 gcc/config/i386/intelmic-mkoffload.c | 30 ++---
 gcc/config/nvptx/mkoffload.c | 37 ++--
 gcc/lto-wrapper.c|  2 ++
 4 files changed, 53 insertions(+), 18 deletions(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index 0d740a2..708ac08 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -464,7 +464,7 @@
 
* config/i386/intelmic-mkoffload.c (main): Parse "-v" flag.
(generate_target_descr_file, generate_target_offloadend_file)
-   (generate_host_descr_file, prepare_target_image, main): Pass it on.
+   (generate_host_descr_file, prepare_target_image): Pass it on.
* config/nvptx/mkoffload.c (main): Parse "-v" flag.
(compile_native, main): Pass it on.
* lto-wrapper.c (compile_offload_image): Likewise.
diff --git gcc/config/i386/intelmic-mkoffload.c gcc/config/i386/intelmic-mkoffload.c
index 14f3fb3..828b415 100644
--- gcc/config/i386/intelmic-mkoffload.c
+++ gcc/config/i386/intelmic-mkoffload.c
@@ -45,6 +45,7 @@ const char *temp_files[MAX_NUM_TEMPS];
 enum offload_abi offload_abi = OFFLOAD_ABI_UNSET;
 
 /* Delete tempfiles and exit function.  */
+
 void
 tool_cleanup (bool from_signal ATTRIBUTE_UNUSED)
 {
@@ -53,19 +54,24 @@ tool_cleanup (bool from_signal ATTRIBUTE_UNUSED)
 }
 
 static void
-mkoffload_atexit (void)
+mkoffload_cleanup (void)
 {
   tool_cleanup (false);
 }
 
-/* Unlink FILE unless we are debugging.  */
+/* Unlink FILE unless requested otherwise.  */
+
 void
 maybe_unlink (const char *file)
 {
-  if (debug)
-notice ("[Leaving %s]\n", file);
-  else
-unlink_if_ordinary (file);
+  if (!save_temps)
+{
+  if (unlink_if_ordinary (file)
+ && errno != ENOENT)
+   fatal_error (input_location, "deleting file %s: %m", file);
+}
+  else if (verbose)
+fprintf (stderr, "[Leaving %s]\n", file);
 }
 
 /* Add or change the value of an environment variable, outputting the
@@ -281,6 +287,8 @@ generate_target_descr_file (const char *target_compiler)
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
   obstack_ptr_grow (&argv_obstack, target_compiler);
+  if (save_temps)
+obstack_ptr_grow (&argv_obstack, "-save-temps");
   if (verbose)
 obstack_ptr_grow (&argv_obstack, "-v");
   obstack_ptr_grow (&argv_obstack, "-c");
@@ -321,6 +329,8 @@ generate_target_offloadend_file (const char *target_compiler)
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
   obstack_ptr_grow (&argv_obstack, target_compiler);
+  if (save_temps)
+obstack_ptr_grow (&argv_obstack, "-save-temps");
   if (verbose)
 obstack_ptr_grow (&argv_obstack, "-v");
   obstack_ptr_grow (&argv_obstack, "-c");
@@ -386,6 +396,8 @@ generate_host_descr_file (const char *host_compiler)
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
   obstack_ptr_grow (&argv_obstack, host_compiler);
+  if (save_temps)
+obstack_ptr_grow (&argv_obstack, "-save-temps");
   if (verbose)
 obstack_ptr_grow (&argv_obstack, "-v");
   obstack_ptr_grow (&argv_obstack, "-c");
@@ -434,6 +446,8 @@ prepare_target_image (const char *target_compiler, int argc, char **argv)
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
   obstack_ptr_grow (&argv_obstack, target_compiler);
+  if (save_temps)
+obstack_ptr_grow (&argv_obstack, "-save-temps");
   if (verbose)
 obstack_ptr_grow (&argv_obstack, "-v");
   obstack_ptr_grow (&argv_obstack, "-xlto");
@@ -536,7 +550,7 @@ main (int argc, char **argv)
   gcc_init_libintl ();
   diagn

Re: [Patch 2/2 ARM/AArch64] Add a new Cortex-A53 scheduling model

2015-10-05 Thread Christophe Lyon

On 1 October 2015 at 11:41, James Greenhalgh  wrote:
> On Thu, Oct 01, 2015 at 09:33:07AM +0100, Marcus Shawcroft wrote:
>> On 25/09/15 08:59, James Greenhalgh wrote:
>> >
>> > Hi,
>> >
>> > This patch introduces a new scheduling model for Cortex-A53.
>> >
>> > Bootstrapped and tested on arm-none-linux-gnueabi and 
>> > aarch64-none-linux-gnu
>> > and checked with a variety of popular benchmarking and microbenchmarking
>> > suites to show a benefit.
>> >
>> > OK?
>> >
>> > Thanks,
>> > James
>> >
>> > ---
>> > 2015-09-25  James Greenhalgh  
>> >
>> > * config/arm/aarch-common-protos.h
>> > (aarch_accumulator_forwarding): New.
>> > (aarch_forward_to_shift_is_not_shifted_reg): Likewise.
>> > * config/arm/aarch-common.c (aarch_accumulator_forwarding): New.
>> > (aarch_forward_to_shift_is_not_shifted_reg): Liekwise.
>> > * config/arm/cortex-a53.md: Rewrite.
>> >
>>
>> OK aarch64 with Kyrill's comments fixed.
>> /M
>
> Thanks,
>
> I had to rebase this over Evandro's recent patch adding neon_ldp/neon_ldp_q
> types to the old scheduling model. The rebase was obvious to resolve, and
> while I was there I also added the neon_stp/neon_stp_q types which were
> missing.
>
> I've attached what I ultimately committed as revision 228324. I messed up
> fixing the ChangeLog typo before commit, so that is revision 228325.
>

Hi James,

Since this commit I can see
gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c fail at -O2
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c:
In function 'exec_vst1_lane':
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c:96:1:
internal compiler error: output_operand: invalid %-code
0x78f79e output_operand_lossage(char const*, ...)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:3417
0x7934f3 output_asm_insn(char const*, rtx_def**)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:3782
0x793d77 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:3029
0x794b3a final(rtx_insn*, _IO_FILE*, int)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:2058
0x7956fb rest_of_handle_final
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:4449
0x7956fb execute
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:4524
Please submit a full bug report,

on aarch64_be.

I haven't looked at it in more detail though.


> Thanks,
> James
>

Re: [AARCH64] Add missing entries in iterator vwcore

2015-10-05 Thread James Greenhalgh

On Thu, Oct 01, 2015 at 09:41:20PM +0100, Kugan wrote:
> Hi,
> 
> In "aarch64_get_lane" operand 0 is VEL, so  for %0,
> iterator vwcore should (?) support all the modes in VEL.
> 
> Ran into following error with a local patch for an existing test case.
> However it can also be reproduced with the attached test case.
> 
> In function 'fn1':
> t.c:25:1: internal compiler error: output_operand: invalid %-code
>  }
>  ^
> 0x8198fb output_operand_lossage(char const*, ...)
>   ../../base/gcc/final.c:3417
> 0x81a45b output_asm_insn(char const*, rtx_def**)
>   ../../base/gcc/final.c:3782
> 0x81b9d3 output_asm_insn(char const*, rtx_def**)
>   ../../base/gcc/final.c:2364
> 0x81b9d3 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
>   ../../base/gcc/final.c:3029
> 0x81be2b final(rtx_insn*, _IO_FILE*, int)
>   ../../base/gcc/final.c:2058
> 0x81c6e7 rest_of_handle_final
>   ../../base/gcc/final.c:4449
> 0x81c6e7 execute
>   ../../base/gcc/final.c:4524
> 
> 
> Attached patch fixes this. Bootstrapped and regression tested for
> aarch64-none-linux-gnu with no new regression. Is this OK for trunk?
>
> gcc/ChangeLog:
> 
> 2015-10-02  Kugan Vivekanandarajah  
> 
>   * config/aarch64/iterators.md: Add missing core element mode for
>mode.
> 
> gcc/testsuite/ChangeLog:
> 
> 2015-10-02  Kugan Vivekanandarajah  
> 
>   * gcc.target/aarch64/foo.c: New test.
> 

"foo.c" is not OK, please give this testcase a meaningful name.

> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 38c5a24..e49abd5 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -537,8 +537,11 @@
>  (V4HI "w") (V8HI "w")
>  (V2SI "w") (V4SI "w")
>  (DI   "x") (V2DI "x")
> +(V4HF "w") (V8HF "w")
>  (V2SF "w") (V4SF "w")
> -(V2DF "x")])
> +(V2DF "x") (SI   "x")
> +(HI   "x") (QI   "x")])

I don't understand the reasoning here.  Surely we want "w" for SI, HI and QI
modes? Though are you sure we need them to fix your bug? I'd have expected
the hunk for V4HF and V8HF to be enough.
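
For readers unfamiliar with mode attributes, the failure can be modelled roughly as follows.  This is a simplified sketch and not GCC's machine-description reader: mode_attr_sketch, substitute_sketch and the "mov" template are hypothetical, and the behaviour on a missing entry (leaving the text unsubstituted, so that a stray "%<" reaches the assembler output code) is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified model of a mode attribute:
// mode name -> core register prefix.
using mode_attr_sketch = std::map<std::string, std::string>;

// Replace every "<vwcore>" in TEXT with the attribute value for MODE.
// If MODE has no entry, TEXT is returned unchanged; in this model that
// leftover "%<vwcore>0" is what would later be rejected as an invalid
// %-code.
inline std::string
substitute_sketch (std::string text, const mode_attr_sketch &attr,
                   const std::string &mode)
{
  const std::string key = "<vwcore>";
  const auto it = attr.find (mode);
  if (it == attr.end ())
    return text;
  for (std::string::size_type pos = text.find (key);
       pos != std::string::npos;
       pos = text.find (key, pos + it->second.size ()))
    text.replace (pos, key.size (), it->second);
  return text;
}

// Self-check: a mode with an entry is substituted, a mode without one
// is left untouched.
inline void
check_substitute_sketch ()
{
  const mode_attr_sketch attr = { { "V4HI", "w" }, { "V2DI", "x" } };
  assert (substitute_sketch ("mov %<vwcore>0, %1", attr, "V4HI")
          == "mov %w0, %1");
  assert (substitute_sketch ("mov %<vwcore>0, %1", attr, "V4HF")
          == "mov %<vwcore>0, %1");
}
```

Under this model, adding the V4HF and V8HF entries is what makes the substitution succeed for the half-float vector modes.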

>  
>  ;; Double vector types for ALLX.
>  (define_mode_attr Vallxd [(QI "8b") (HI "4h") (SI "2s")])
> diff --git a/gcc/testsuite/gcc.target/aarch64/foo.c b/gcc/testsuite/gcc.target/aarch64/foo.c
> index e69de29..77f161e 100644
> --- a/gcc/testsuite/gcc.target/aarch64/foo.c
> +++ b/gcc/testsuite/gcc.target/aarch64/foo.c

Again, please give this test a meaningful name.

Thanks,
James

> @@ -0,0 +1,25 @@
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +void fn2 ();
> +
> +typedef __Float16x4_t float16x4_t;
> +__fp16 result_float16x4[1];
> +float16x4_t exec_vst1_lane_vector_float16x4, exec_vst1_lane___trans_tmp_1;
> +
> +void fn1 ()
> +{
> +  exec_vst1_lane_vector_float16x4 = exec_vst1_lane___trans_tmp_1;
> +  __fp16 *__a = result_float16x4;
> +  float16x4_t __b = exec_vst1_lane___trans_tmp_1;
> +  int __lane = 0;
> +  *__a = ({ __b[__lane]; });
> +  union {
> +  short i;
> +  __fp16 f;
> +  } tmp_res;
> +  tmp_res.f = result_float16x4[0];
> +  if (tmp_res.i)
> +fn2();
> +}

Re: [Patch 2/2 ARM/AArch64] Add a new Cortex-A53 scheduling model

2015-10-05 Thread James Greenhalgh

On Mon, Oct 05, 2015 at 11:07:45AM +0100, Christophe Lyon wrote:
> On 1 October 2015 at 11:41, James Greenhalgh  wrote:
> > On Thu, Oct 01, 2015 at 09:33:07AM +0100, Marcus Shawcroft wrote:
> >> On 25/09/15 08:59, James Greenhalgh wrote:
> >> >
> >> > Hi,
> >> >
> >> > This patch introduces a new scheduling model for Cortex-A53.
> >> >
> >> > Bootstrapped and tested on arm-none-linux-gnueabi and 
> >> > aarch64-none-linux-gnu
> >> > and checked with a variety of popular benchmarking and microbenchmarking
> >> > suites to show a benefit.
> >> >
> >> > OK?
> >> >
> >> > Thanks,
> >> > James
> >> >
> >> > ---
> >> > 2015-09-25  James Greenhalgh  
> >> >
> >> > * config/arm/aarch-common-protos.h
> >> > (aarch_accumulator_forwarding): New.
> >> > (aarch_forward_to_shift_is_not_shifted_reg): Likewise.
> >> > * config/arm/aarch-common.c (aarch_accumulator_forwarding): New.
> >> > (aarch_forward_to_shift_is_not_shifted_reg): Liekwise.
> >> > * config/arm/cortex-a53.md: Rewrite.
> >> >
> >>
> >> OK aarch64 with Kyrill's comments fixed.
> >> /M
> >
> > Thanks,
> >
> > I had to rebase this over Evandro's recent patch adding neon_ldp/neon_ldp_q
> > types to the old scheduling model. The rebase was obvious to resolve, and
> > while I was there I also added the neon_stp/neon_stp_q types which were
> > missing.
> >
> > I've attached what I ultimately committed as revision 228324. I messed up
> > fixing the ChangeLog typo before commit, so that is revision 228325.
> >
> 
> Hi James,
> 
> Since this commit I can see
> gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c fail at -O2
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c:
> In function 'exec_vst1_lane':
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c:96:1:
> internal compiler error: output_operand: invalid %-code
> 0x78f79e output_operand_lossage(char const*, ...)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:3417
> 0x7934f3 output_asm_insn(char const*, rtx_def**)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:3782
> 0x793d77 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:3029
> 0x794b3a final(rtx_insn*, _IO_FILE*, int)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:2058
> 0x7956fb rest_of_handle_final
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:4449
> 0x7956fb execute
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:4524
> Please submit a full bug report,
> 
> on aarch64_be.
> 
> I haven't looked at it in more detail though.

Hi Christophe,

Thanks for the report, I'd be surprised if that was to do with the
scheduling model. I can reproduce the failure, and expect that Kugan's
patch at https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00129.html ought
to do the job of fixing the ICE.

Thanks,
James

Re: [PATCH, ARM] Cleanup attr_thumb-static2.c test directives

2015-10-05 Thread Kyrill Tkachov


Hi Christian,

On 30/09/15 11:25, Christian Bruel wrote:


On 09/30/2015 10:48 AM, Christian Bruel wrote:

Change the test directive to avoid the awkward thumb1 useless skip and
relax the required platform

Passes for arm-eabi-none with default configure options.


sorry, resent clear :-)



This is ok.

Thanks,
Kyrill


Christian

Replace REAL_VALUES_EQUAL with real_equal

2015-10-05 Thread Richard Sandiford

Richard B suggested we should replace dconsthalf etc. with
dconst<1, 2> ().  When I tried that, the extra comma caused problems
with some lingering uses of the old target macros for handling reals
(e.g. REAL_ARITHMETIC instead of real_arithmetic), since the constant
was then treated as two macro parameters.  It would have been possible
to add an extra level of brackets to avoid this, but I thought I might
as well take the opportunity to remove the macros instead.  (Note that
I'm only removing macros that caused a problem directly, or are closely
related to ones that did.)
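
The comma problem described above can be reproduced outside GCC with a minimal sketch.  All names below (real_value_sketch, dconst_sketch, real_equal_sketch, REAL_VALUES_EQUAL_SKETCH) are hypothetical stand-ins, and a plain double replaces GCC's real_value; only the interaction between the template argument list and the macro is meant to be representative.

```cpp
#include <cassert>

// Hypothetical stand-in for a real value; only the comparison matters.
struct real_value_sketch { double v; };

// Hypothetical stand-in for a dconst<N, D> () style helper that returns
// a reference to a constant N/D.
template <int N, int D>
const real_value_sketch &
dconst_sketch ()
{
  static const real_value_sketch value = { static_cast<double> (N) / D };
  return value;
}

// Function form: the caller writes the address-of explicitly.
inline bool
real_equal_sketch (const real_value_sketch *a, const real_value_sketch *b)
{
  return a->v == b->v;
}

// Macro form: the preprocessor splits arguments at top-level commas and
// knows nothing about template angle brackets.
#define REAL_VALUES_EQUAL_SKETCH(x, y) real_equal_sketch (&(x), &(y))

// REAL_VALUES_EQUAL_SKETCH (c, dconst_sketch<1, 2> ()) would be seen as
// three macro arguments and rejected, so an extra level of brackets is
// needed around the second argument.
inline bool
is_half_via_macro (const real_value_sketch &c)
{
  return REAL_VALUES_EQUAL_SKETCH (c, (dconst_sketch<1, 2> ()));
}

// With a function there is no such problem.
inline bool
is_half_via_function (const real_value_sketch &c)
{
  return real_equal_sketch (&c, &dconst_sketch<1, 2> ());
}

// Self-check: both spellings agree.
inline void
check_half ()
{
  const real_value_sketch half = { 0.5 };
  const real_value_sketch one = { 1.0 };
  assert (is_half_via_macro (half) && is_half_via_function (half));
  assert (!is_half_via_macro (one) && !is_half_via_function (one));
}
```

The extra brackets work, but every macro call site would need them, which is the motivation for switching to functions.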

This first patch replaces REAL_VALUES_EQUAL with a real_equal function.
The prototype is the same as for real_identical, which has already
undergone a half-transition in this direction.

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were
no new warnings and no changes in testsuite output at -O2.
OK to install?

Thanks,
Richard

PS. It might be good to make real_value more "C++-like" in future,
but I think it should be done with care, especially with the potential
confusion between operator== being real_identical or real_equal.
This patch should at least be a strict improvement over the status quo.
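
The distinction between the two comparisons can be illustrated with host doubles.  This is only an analogy under stated assumptions: equal_sketch and identical_sketch are hypothetical names, and GCC's real_value is its own software representation rather than a host double.  The signed-zero case is the one called out in the m68k comment in the REAL_VALUES_IDENTICAL patch later in this series.

```cpp
#include <cassert>
#include <cstring>

// Value comparison, in the spirit of real_equal: +0.0 and -0.0 compare
// equal.
inline bool
equal_sketch (double a, double b)
{
  return a == b;
}

// Representation comparison, in the spirit of real_identical: the bit
// patterns must match, so +0.0 and -0.0 are different.
inline bool
identical_sketch (double a, double b)
{
  return std::memcmp (&a, &b, sizeof (double)) == 0;
}

// Self-check for the signed-zero case.
inline void
check_equal_vs_identical ()
{
  assert (equal_sketch (0.0, -0.0));
  assert (!identical_sketch (0.0, -0.0));
  assert (equal_sketch (1.0, 1.0) && identical_sketch (1.0, 1.0));
}
```

An operator== on such a type would have to pick one of these two meanings, which is the potential confusion mentioned above.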


gcc/c-family/
* c-lex.c (interpret_float): Use real_equal instead of
REAL_VALUES_EQUAL.

gcc/c/
* c-typeck.c (c_tree_equal): Use real_equal instead of
REAL_VALUES_EQUAL.

gcc/cp/
* tree.c (cp_tree_equal): Use real_equal instead of
REAL_VALUES_EQUAL.

gcc/
* real.h (real_equal): Declare.
(REAL_VALUES_EQUAL): Delete.
* real.c (real_equal): New function.
(real_compare): Use it.
* doc/tm.texi.in (REAL_VALUES_EQUAL): Delete.
* doc/tm.texi: Regenerate.
* builtins.c (fold_builtin_pow, fold_builtin_load_exponent): Use
real_equal instead of REAL_VALUES_EQUAL.
* config/aarch64/aarch64.c (aarch64_float_const_zero_rtx_p): Likewise.
* config/arm/arm.c (arm_const_double_rtx, neon_valid_immediate)
(fp_const_from_val): Likewise.
* config/fr30/fr30.c (fr30_const_double_is_zero): Likewise.
* config/m68k/m68k.c (standard_68881_constant_p): Likewise.
(floating_exact_log2): Likewise.
* config/sh/sh.c (fp_zero_operand, fp_one_operand): Likewise.
* config/vax/vax.c (vax_float_literal): Likewise.
* config/xtensa/predicates.md (const_float_1_operand): Likewise.
* cprop.c (implicit_set_cond_p): Likewise.
* expmed.c (expand_mult): Likewise.
* fold-const.c (const_binop): Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.
(simplify_const_binary_operation): Likewise.
(simplify_const_relational_operation): Likewise.
* tree-call-cdce.c (check_pow): Likewise.
(gen_conditions_for_pow_cst_base): Likewise.
* tree-inline.c (estimate_num_insns): Likewise.
* tree-ssa-dom.c (record_equality): Likewise.
* tree-ssa-math-opts.c (representable_as_half_series_p): Likewise.
(gimple_expand_builtin_pow): Likewise.
(pass_optimize_widening_mul::execute): Likewise.
* tree-ssa-uncprop.c (associate_equivalences_with_edges): Likewise.
* tree-vect-patterns.c (vect_recog_pow_pattern): Likewise.
* tree.c (real_zerop, real_onep, real_minus_onep): Likewise.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 956cf43..89bea60 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -8419,22 +8419,22 @@ fold_builtin_pow (location_t loc, tree fndecl, tree arg0, tree arg1, tree type)
   c = TREE_REAL_CST (arg1);
 
   /* Optimize pow(x,0.0) = 1.0.  */
-  if (REAL_VALUES_EQUAL (c, dconst0))
+  if (real_equal (&c, &dconst0))
return omit_one_operand_loc (loc, type, build_real (type, dconst1),
 arg0);
 
   /* Optimize pow(x,1.0) = x.  */
-  if (REAL_VALUES_EQUAL (c, dconst1))
+  if (real_equal (&c, &dconst1))
return arg0;
 
   /* Optimize pow(x,-1.0) = 1.0/x.  */
-  if (REAL_VALUES_EQUAL (c, dconstm1))
+  if (real_equal (&c, &dconstm1))
return fold_build2_loc (loc, RDIV_EXPR, type,
build_real (type, dconst1), arg0);
 
   /* Optimize pow(x,0.5) = sqrt(x).  */
   if (flag_unsafe_math_optimizations
- && REAL_VALUES_EQUAL (c, dconsthalf))
+ && real_equal (&c, &dconsthalf))
{
  tree sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
 
@@ -8448,7 +8448,7 @@ fold_builtin_pow (location_t loc, tree fndecl, tree arg0, tree arg1, tree type)
  const REAL_VALUE_TYPE dconstroot
= real_value_truncate (TYPE_MODE (type), dconst_third ());
 
- if (REAL_VALUES_EQUAL (c, dconstroot))
+ if (real_equal (&c, &dconstroot))
{
  tree cbrtfn = mathfn_built_in (type, BUILT_IN_CBRT);
  if (cbrtfn != NULL_

Remove remaining uses of REAL_VALUES_IDENTICAL

2015-10-05 Thread Richard Sandiford

This patch continues the removal of real-related macros.
We already had both the old-style REAL_VALUES_IDENTICAL and the
new-style real_identical, so this patch replaces all remaining
uses of the former with the latter.

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were
no new warnings and no changes in testsuite output at -O2.  OK to
install?

Thanks,
Richard


gcc/
* real.h (REAL_VALUES_IDENTICAL): Delete.
* config/m68k/m68k.c (standard_68881_constant_p): Use real_identical
instead of REAL_VALUES_IDENTICAL.
* fold-const.c (operand_equal_p): Likewise.
* ipa-icf.c (sem_variable::equals): Likewise.
* tree-complex.c (some_nonzerop): Likewise.
(expand_complex_multiplication): Likewise.
* tree.c (simple_cst_equal): Likewise.
* varasm.c (compare_constant): Likewise.

diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index b7d96a5..487cbf4 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -4336,11 +4336,10 @@ standard_68881_constant_p (rtx x)
 
   REAL_VALUE_FROM_CONST_DOUBLE (r, x);
 
-  /* Use REAL_VALUES_IDENTICAL instead of real_equal so that -0.0
- is rejected.  */
+  /* Use real_identical instead of real_equal so that -0.0 is rejected.  */
   for (i = 0; i < 6; i++)
 {
-  if (REAL_VALUES_IDENTICAL (r, values_68881[i]))
+  if (real_identical (&r, &values_68881[i]))
 return (codes_68881[i]);
 }
   
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 768b39b..1c72af6 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -2815,8 +2815,7 @@ operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
   TREE_FIXED_CST (arg1));
 
   case REAL_CST:
-   if (REAL_VALUES_IDENTICAL (TREE_REAL_CST (arg0),
-  TREE_REAL_CST (arg1)))
+   if (real_identical (&TREE_REAL_CST (arg0), &TREE_REAL_CST (arg1)))
  return 1;
 
 
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index d39a3c1..b076222 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -2030,8 +2030,8 @@ sem_variable::equals (tree t1, tree t2)
   /* Real constants are the same only if the same width of type.  */
   if (TYPE_PRECISION (TREE_TYPE (t1)) != TYPE_PRECISION (TREE_TYPE (t2)))
 return return_false_with_msg ("REAL_CST precision mismatch");
-  return return_with_debug (REAL_VALUES_IDENTICAL (TREE_REAL_CST (t1),
-  TREE_REAL_CST (t2)));
+  return return_with_debug (real_identical (&TREE_REAL_CST (t1),
+   &TREE_REAL_CST (t2)));
 case VECTOR_CST:
   {
unsigned i;
diff --git a/gcc/real.h b/gcc/real.h
index 2ffc0d2..1be4104 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -333,7 +333,6 @@ extern const struct real_format arm_half_format;
 #define REAL_ARITHMETIC(value, code, d1, d2) \
   real_arithmetic (&(value), code, &(d1), &(d2))
 
-#define REAL_VALUES_IDENTICAL(x, y)real_identical (&(x), &(y))
 #define REAL_VALUES_LESS(x, y) real_compare (LT_EXPR, &(x), &(y))
 
 /* Determine whether a floating-point value X is infinite.  */
diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
index b0ffc00..93c0a54 100644
--- a/gcc/tree-complex.c
+++ b/gcc/tree-complex.c
@@ -118,7 +118,7 @@ some_nonzerop (tree t)
  cannot be treated the same as operations with a real or imaginary
  operand if we care about the signs of zeros in the result.  */
   if (TREE_CODE (t) == REAL_CST && !flag_signed_zeros)
-zerop = REAL_VALUES_IDENTICAL (TREE_REAL_CST (t), dconst0);
+zerop = real_identical (&TREE_REAL_CST (t), &dconst0);
   else if (TREE_CODE (t) == FIXED_CST)
 zerop = fixed_zerop (t);
   else if (TREE_CODE (t) == INTEGER_CST)
@@ -1021,7 +1021,7 @@ expand_complex_multiplication (gimple_stmt_iterator *gsi, tree inner_type,
 case PAIR (ONLY_IMAG, ONLY_REAL):
   rr = ar;
   if (TREE_CODE (ai) == REAL_CST
- && REAL_VALUES_IDENTICAL (TREE_REAL_CST (ai), dconst1))
+ && real_identical (&TREE_REAL_CST (ai), &dconst1))
ri = br;
   else
ri = gimplify_build2 (gsi, MULT_EXPR, inner_type, ai, br);
diff --git a/gcc/tree.c b/gcc/tree.c
index b432997..f78a2c2 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7370,7 +7370,7 @@ simple_cst_equal (const_tree t1, const_tree t2)
   return wi::to_widest (t1) == wi::to_widest (t2);
 
 case REAL_CST:
-  return REAL_VALUES_IDENTICAL (TREE_REAL_CST (t1), TREE_REAL_CST (t2));
+  return real_identical (&TREE_REAL_CST (t1), &TREE_REAL_CST (t2));
 
 case FIXED_CST:
   return FIXED_VALUES_IDENTICAL (TREE_FIXED_CST (t1), TREE_FIXED_CST (t2));
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 706e652..a5bb2b5 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -3076,7 +3076,7 @@ compare_constant (const tree t1, const tree t2)
   if (TYPE

Replace REAL_VALUES_LESS with real_less

2015-10-05 Thread Richard Sandiford

This patch continues the removal of real-related macros by
replacing REAL_VALUES_LESS with real_less.
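
The shape of the change at each call site can be sketched as follows.  The names are hypothetical stand-ins (a double replaces GCC's real_value, and the _sketch suffix marks everything that is not from the patch); the old-style macro mirrors the wrapper form removed from real.h, and the range test mirrors the microblaze hunk.

```cpp
#include <cassert>

// Hypothetical stand-ins, for illustration only.
struct real_value_sketch { double v; };
enum compare_code_sketch { LT_SKETCH, EQ_SKETCH };

// New style: an ordinary function taking pointers.
inline bool
real_less_sketch (const real_value_sketch *a, const real_value_sketch *b)
{
  return a->v < b->v;
}

// Generic comparison that dispatches to the dedicated function for "<".
inline bool
real_compare_sketch (compare_code_sketch code, const real_value_sketch *a,
                     const real_value_sketch *b)
{
  return code == LT_SKETCH ? real_less_sketch (a, b) : a->v == b->v;
}

// Old style: a macro that takes objects and hides the address-of.
#define REAL_VALUES_LESS_SKETCH(x, y) \
  real_compare_sketch (LT_SKETCH, &(x), &(y))

// Open-interval test written with the new function.
inline bool
strictly_between_sketch (const real_value_sketch &d,
                         const real_value_sketch &low,
                         const real_value_sketch &high)
{
  return real_less_sketch (&d, &high) && real_less_sketch (&low, &d);
}

// Self-check: macro and function agree, and the interval is open.
inline void
check_real_less_sketch ()
{
  real_value_sketch lo = { -1.0 }, mid = { 0.0 }, hi = { 1.0 };
  assert (REAL_VALUES_LESS_SKETCH (lo, mid) == real_less_sketch (&lo, &mid));
  assert (strictly_between_sketch (mid, lo, hi));
  assert (!strictly_between_sketch (hi, lo, hi));
}
```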

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were
no new warnings and no changes in testsuite output at -O2.  OK to install?

Thanks,
Richard


gcc/ada/
* gcc-interface/trans.c (convert_with_check): Use real_less instead
of REAL_VALUES_LESS.

gcc/
* doc/tm.texi.in (REAL_VALUES_LESS): Delete.
* doc/tm.texi: Regenerate.
* real.h (real_less): Declare.
(REAL_VALUES_LESS): Delete.
* real.c (real_less): New function.
(real_compare): Use it.
* config/m68k/m68k.c (floating_exact_log2): Use real_less instead
of REAL_VALUES_LESS.
* config/microblaze/microblaze.c (microblaze_const_double_ok):
Likewise.
* fold-const.c (fold_convert_const_int_from_real): Likewise.
* simplify-rtx.c (simplify_const_unary_operation): Likewise.
(simplify_const_relational_operation): Likewise.
* tree-call-cdce.c (check_pow): Likewise.
(gen_conditions_for_pow_cst_base): Likewise.

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 3252ea2..9838da0 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -8999,8 +8999,8 @@ convert_with_check (Entity_Id gnat_type, tree gnu_expr, bool overflowp,
   if (INTEGRAL_TYPE_P (gnu_in_basetype)
  ? tree_int_cst_lt (gnu_in_lb, gnu_out_lb)
  : (FLOAT_TYPE_P (gnu_base_type)
-? REAL_VALUES_LESS (TREE_REAL_CST (gnu_in_lb),
-TREE_REAL_CST (gnu_out_lb))
+? real_less (&TREE_REAL_CST (gnu_in_lb),
+ &TREE_REAL_CST (gnu_out_lb))
 : 1))
gnu_cond
  = invert_truthvalue
@@ -9011,8 +9011,8 @@ convert_with_check (Entity_Id gnat_type, tree gnu_expr, bool overflowp,
   if (INTEGRAL_TYPE_P (gnu_in_basetype)
  ? tree_int_cst_lt (gnu_out_ub, gnu_in_ub)
  : (FLOAT_TYPE_P (gnu_base_type)
-? REAL_VALUES_LESS (TREE_REAL_CST (gnu_out_ub),
-TREE_REAL_CST (gnu_in_lb))
+? real_less (&TREE_REAL_CST (gnu_out_ub),
+ &TREE_REAL_CST (gnu_in_lb))
 : 1))
gnu_cond
  = build_binary_op (TRUTH_ORIF_EXPR, boolean_type_node, gnu_cond,
diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index 487cbf4..74de983 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -4365,7 +4365,7 @@ floating_exact_log2 (rtx x)
 
   REAL_VALUE_FROM_CONST_DOUBLE (r, x);
 
-  if (REAL_VALUES_LESS (r, dconst1))
+  if (real_less (&r, &dconst1))
 return 0;
 
   exp = real_exponent (&r);
diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c
index ebcf65a..9efa739 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -280,12 +280,12 @@ microblaze_const_double_ok (rtx op, machine_mode mode)
 
   if (mode == DFmode)
 {
-  if (REAL_VALUES_LESS (d, dfhigh) && REAL_VALUES_LESS (dflow, d))
+  if (real_less (&d, &dfhigh) && real_less (&dflow, &d))
return 1;
 }
   else
 {
-  if (REAL_VALUES_LESS (d, sfhigh) && REAL_VALUES_LESS (sflow, d))
+  if (real_less (&d, &sfhigh) && real_less (&sflow, &d))
return 1;
 }
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index fdd49d9..f6a0b5d 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9708,10 +9708,6 @@ array of @code{HOST_WIDE_INT}, but all code should treat it as an opaque
 quantity.
 @end defmac
 
-@deftypefn Macro int REAL_VALUES_LESS (REAL_VALUE_TYPE @var{x}, REAL_VALUE_TYPE @var{y})
-Tests whether @var{x} is less than @var{y}.
-@end deftypefn
-
 @deftypefn Macro HOST_WIDE_INT REAL_VALUE_FIX (REAL_VALUE_TYPE @var{x})
 Truncates @var{x} to a signed integer, rounding toward zero.
 @end deftypefn
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index ab75ad9..7fae1ca 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7131,10 +7131,6 @@ array of @code{HOST_WIDE_INT}, but all code should treat it as an opaque
 quantity.
 @end defmac
 
-@deftypefn Macro int REAL_VALUES_LESS (REAL_VALUE_TYPE @var{x}, REAL_VALUE_TYPE @var{y})
-Tests whether @var{x} is less than @var{y}.
-@end deftypefn
-
 @deftypefn Macro HOST_WIDE_INT REAL_VALUE_FIX (REAL_VALUE_TYPE @var{x})
 Truncates @var{x} to a signed integer, rounding toward zero.
 @end deftypefn
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 1c72af6..2851a29 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -1877,7 +1877,7 @@ fold_convert_const_int_from_real (enum tree_code code, tree type, const_tree arg
 {
   tree lt = TYPE_MIN_VALUE (type);
   REAL_VALUE_TYPE l = real_value_from_int_cst (NULL_TREE, lt);
-  if (REAL_VALUES_LESS (r, l))
+  if (real_less (&r, &l))
{

Remove remaining uses of REAL_ARITHMETIC

2015-10-05 Thread Richard Sandiford

This patch replaces all remaining uses of the old target macro
REAL_ARITHMETIC with calls to the (now generic) real_arithmetic
function.
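
Several of the call sites touched below use real_arithmetic to compute the value that their comments describe as nextafter (0.5, 0.0), as 0.5 minus 2 to the power -(p + 1).  The arithmetic can be checked with host doubles in a short sketch.  The function names are hypothetical, std::numeric_limits<double>::digits is assumed to play the role of fmt->p for a double-precision format, and round_sketch is only an illustration of the "add copysign of the adjusted half, then truncate" idea, not the i386 expander.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Largest double strictly below 0.5, computed as 0.5 - 2^(-p-1), where
// p is the significand precision in bits (53 for an IEEE double).
inline double
pred_half_sketch ()
{
  const int p = std::numeric_limits<double>::digits;
  const double half_minus_pred_half = std::ldexp (1.0, -p - 1);
  return 0.5 - half_minus_pred_half;
}

// Rounding built on the adjusted half: truncate x + copysign (pred_half, x).
// Using pred_half instead of 0.5 keeps inputs just below 0.5 from being
// pushed up to 1.0 by the addition.
inline double
round_sketch (double x)
{
  return std::trunc (x + std::copysign (pred_half_sketch (), x));
}

// Self-check of the identity and of the boundary case it protects.
inline void
check_pred_half_sketch ()
{
  assert (pred_half_sketch () == std::nextafter (0.5, 0.0));
  assert (round_sketch (std::nextafter (0.5, 0.0)) == 0.0);
  assert (round_sketch (-std::nextafter (0.5, 0.0)) == 0.0);
}
```

For p = 53 the subtracted term is 2 to the power -54, which is exactly the spacing of doubles in the interval just below 0.5, so the subtraction is exact.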

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were
no new warnings and no changes in testsuite output at -O2.  OK to install?

Thanks,
Richard


gcc/ada/
* gcc-interface/trans.c (convert_with_check): Use real_arithmetic
instead of REAL_ARITHMETIC.

gcc/
* doc/tm.texi.in (REAL_ARITHMETIC): Delete.
* doc/tm.texi: Regenerate.
* real.h (REAL_ARITHMETIC): Delete.
* config/i386/i386.c (ix86_expand_lround, ix86_expand_round)
(ix86_expand_round_sse4): Use real_arithmetic instead of
REAL_ARITHMETIC.
* config/i386/sse.md (round2): Likewise.
* rtl.h (rtx_to_tree_code): Likewise (in comment).
* explow.c (rtx_to_tree_code): Likewise (in comment).
* match.pd: Likewise.
* simplify-rtx.c (simplify_binary_operation_1): Likewise.
* tree-ssa-math-opts.c (representable_as_half_series_p): Likewise.
(expand_pow_as_sqrts): Likewise.
* tree-pretty-print.c (dump_generic_node): Remove code that
was conditional on REAL_ARITHMETIC being undefined.

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 9838da0..f1e2dcb 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -9048,8 +9048,8 @@ convert_with_check (Entity_Id gnat_type, tree gnu_expr, bool overflowp,
   /* Compute the exact value calc_type'Pred (0.5) at compile time.  */
   fmt = REAL_MODE_FORMAT (TYPE_MODE (calc_type));
   real_2expN (&half_minus_pred_half, -(fmt->p) - 1, TYPE_MODE (calc_type));
-  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf,
-  half_minus_pred_half);
+  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf,
+  &half_minus_pred_half);
   gnu_pred_half = build_real (calc_type, pred_half);
 
   /* If the input is strictly negative, subtract this value
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9c4cfbd..44847b4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -47538,7 +47538,7 @@ ix86_expand_lround (rtx op0, rtx op1)
   /* load nextafter (0.5, 0.0) */
   fmt = REAL_MODE_FORMAT (mode);
   real_2expN (&half_minus_pred_half, -(fmt->p) - 1, mode);
-  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
+  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, &half_minus_pred_half);
 
   /* adj = copysign (0.5, op1) */
   adj = force_reg (mode, const_double_from_real_value (pred_half, mode));
@@ -47952,7 +47952,7 @@ ix86_expand_round (rtx operand0, rtx operand1)
   /* load nextafter (0.5, 0.0) */
   fmt = REAL_MODE_FORMAT (mode);
   real_2expN (&half_minus_pred_half, -(fmt->p) - 1, mode);
-  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
+  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, &half_minus_pred_half);
 
   /* xa = xa + 0.5 */
   half = force_reg (mode, const_double_from_real_value (pred_half, mode));
@@ -48003,7 +48003,7 @@ ix86_expand_round_sse4 (rtx op0, rtx op1)
   /* load nextafter (0.5, 0.0) */
   fmt = REAL_MODE_FORMAT (mode);
   real_2expN (&half_minus_pred_half, -(fmt->p) - 1, mode);
-  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
+  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, &half_minus_pred_half);
   half = const_double_from_real_value (pred_half, mode);
 
   /* e1 = copysign (0.5, op1) */
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 013681c..9b7a338 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -14881,7 +14881,7 @@
   /* load nextafter (0.5, 0.0) */
   fmt = REAL_MODE_FORMAT (scalar_mode);
   real_2expN (&half_minus_pred_half, -(fmt->p) - 1, scalar_mode);
-  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
+  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, &half_minus_pred_half);
   half = const_double_from_real_value (pred_half, scalar_mode);
 
   vec_half = ix86_build_const_vector (mode, true, half);
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f6a0b5d..72366b9 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9736,21 +9736,6 @@ Determines whether @var{x} represents infinity (positive or negative).
 Determines whether @var{x} represents a ``NaN'' (not-a-number).
 @end deftypefn
 
-@deftypefn Macro void REAL_ARITHMETIC (REAL_VALUE_TYPE @var{output}, enum tree_code @var{code}, REAL_VALUE_TYPE @var{x}, REAL_VALUE_TYPE @var{y})
-Calculates an arithmetic operation on the two floating point values
-@var{x} and @var{y}, storing the result in @var{output} (which must be a
-variable).
-
-The operation to be performed is specified by @var{code}.  Only the
-following codes are supported: @code{PLUS_EXPR}, @code{MINUS_EXPR},
-@code{MULT_EXPR}, @c

Remove remaining uses of CONST_DOUBLE_FROM_REAL_VALUE

2015-10-05 Thread Richard Sandiford

This patch replaces all uses of CONST_DOUBLE_FROM_REAL_VALUE
with the already-existing const_double_from_real_value.

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were
no new warnings and no changes in testsuite output at -O2.  OK to install?

Thanks,
Richard


gcc/
* real.h (CONST_DOUBLE_ATOF): Use const_double_from_real_value
instead of CONST_DOUBLE_FROM_REAL_VALUE.
(CONST_DOUBLE_FROM_REAL_VALUE): Delete.
* config/c6x/c6x.md (divsf3, divdf3): Use const_double_from_real_value
instead of CONST_DOUBLE_FROM_REAL_VALUE.
* config/epiphany/epiphany.md (fixuns_truncsfsi2): Likewise.
* config/i386/i386.c (standard_80387_constant_rtx): Likewise.
(ix86_expand_builtin, ix86_emit_i387_log1p, ix86_emit_i387_round)
(ix86_emit_swsqrtsf): Likewise.
* config/ia64/ia64.c (ia64_expand_builtin): Likewise.
* config/mips/mips.md (fixuns_truncdfsi2, fixuns_truncdfdi2)
(fixuns_truncsfsi2, fixuns_truncsfdi2): Likewise.
* config/pa/pa.c (pa_expand_builtin): Likewise.
* config/rs6000/rs6000.c (rs6000_load_constant_and_splat): Likewise.
(rs6000_scale_v2df): Likewise.
* config/rs6000/rs6000.md (*cmptf_internal2): Likewise.
* config/s390/s390.md (fixuns_truncdddi2, fixuns_trunctddi2)
(fixuns_trunc2): Likewise.
* config/s390/vx-builtins.md (vec_ctd_s64, vec_ctd_u64, vec_ctsl)
(vec_ctul): Likewise.
* config/sparc/sparc.c (sparc_emit_fixunsdi): Likewise.
* config/spu/spu.c (hwint_to_const_double, spu_float_const): Likewise.
* config/spu/spu.md (floatunsdisf2, floatunstisf2): Likewise.
* cse.c (fold_rtx): Likewise.
* emit-rtl.c (immed_double_const): Likewise (in comments).
(init_emit_once): Likewise.
* expr.c (compress_float_constant, expand_expr_real_1)
(const_vector_from_tree): Likewise.
* optabs.c (expand_float, expand_fix): Likewise.
* reg-stack.c (reg_to_stack): Likewise.
* simplify-rtx.c (avoid_constant_pool_reference): Likewise.
(simplify_const_unary_operation, simplify_binary_operation_1)
(simplify_const_binary_operation, simplify_relational_operation)
(simplify_immed_subreg): Likewise.

diff --git a/gcc/config/c6x/c6x.md b/gcc/config/c6x/c6x.md
index 075968d..692d83f 100644
--- a/gcc/config/c6x/c6x.md
+++ b/gcc/config/c6x/c6x.md
@@ -2811,7 +2811,7 @@
   "TARGET_FP && flag_reciprocal_math"
 {
   operands[3] = force_reg (SFmode,
-  CONST_DOUBLE_FROM_REAL_VALUE (dconst2, SFmode));
+  const_double_from_real_value (dconst2, SFmode));
   operands[4] = gen_reg_rtx (SFmode);
   operands[5] = gen_reg_rtx (SFmode);
   operands[6] = gen_reg_rtx (SFmode);
@@ -2836,7 +2836,7 @@
   "TARGET_FP && flag_reciprocal_math"
 {
   operands[3] = force_reg (DFmode,
-  CONST_DOUBLE_FROM_REAL_VALUE (dconst2, DFmode));
+  const_double_from_real_value (dconst2, DFmode));
   operands[4] = gen_reg_rtx (DFmode);
   operands[5] = gen_reg_rtx (DFmode);
   operands[6] = gen_reg_rtx (DFmode);
diff --git a/gcc/config/epiphany/epiphany.md b/gcc/config/epiphany/epiphany.md
index 4280926..4c8b5d6 100644
--- a/gcc/config/epiphany/epiphany.md
+++ b/gcc/config/epiphany/epiphany.md
@@ -982,7 +982,7 @@
   rtx cmp = gen_rtx_LT (VOIDmode, cc1, CONST0_RTX (SFmode));
 
   real_2expN (&offset, 31, SFmode);
-  limit = CONST_DOUBLE_FROM_REAL_VALUE (offset, SFmode);
+  limit = const_double_from_real_value (offset, SFmode);
   limit = force_reg (SFmode, limit);
   emit_insn (gen_fix_truncsfsi2 (operands[0], operands[1]));
   emit_insn (gen_subsf3_f (tmp, operands[1], limit));
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 44847b4..ff52779 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10573,7 +10573,7 @@ standard_80387_constant_rtx (int idx)
   gcc_unreachable ();
 }
 
-  return CONST_DOUBLE_FROM_REAL_VALUE (ext_80387_constants_table[i],
+  return const_double_from_real_value (ext_80387_constants_table[i],
   XFmode);
 }
 
@@ -40143,7 +40143,7 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
rtx tmp;
 
real_inf (&inf);
-   tmp = CONST_DOUBLE_FROM_REAL_VALUE (inf, mode);
+   tmp = const_double_from_real_value (inf, mode);
 
tmp = validize_mem (force_const_mem (mode, tmp));
 
@@ -47021,7 +47021,7 @@ void ix86_emit_i387_log1p (rtx op0, rtx op1)
 
   emit_insn (gen_absxf2 (tmp, op1));
   test = gen_rtx_GE (VOIDmode, tmp,
-CONST_DOUBLE_FROM_REAL_VALUE (
+const_double_from_real_value (
REAL_VALUE_ATOF ("0.29289321881345247561810596348408353", XFmode),
XFmode));
   emit_jump_insn (gen_cbranchxf4 (test, XEXP (test, 0), XEXP (test, 1), label1));
@@ -47095,7 +47095

Remove REAL_VALUE_FROM_CONST_DOUBLE

2015-10-05 Thread Richard Sandiford

To maintain symmetry after the previous removal of
CONST_DOUBLE_FROM_REAL_VALUE, this patch also gets rid of
REAL_VALUE_FROM_CONST_DOUBLE.  All the macro did was copy the
contents of CONST_DOUBLE_REAL_VALUE into a temporary real_value
structure.  In many cases there was no need for this temporary
and we could simply use the CONST_DOUBLE_REAL_VALUE directly.
For that reason this patch is less automatic than the others.

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were
no new warnings and no changes in testsuite output at -O2.  OK to install?

Thanks,
Richard


gcc/
* real.h (REAL_VALUE_FROM_CONST_DOUBLE): Delete.
* config/aarch64/aarch64.c (aarch64_float_const_zero_rtx_p)
(aarch64_print_operand, aarch64_float_const_representable_p)
(aarch64_output_simd_mov_immediate): Use CONST_DOUBLE_REAL_VALUE
instead of REAL_VALUE_FROM_CONST_DOUBLE.
* config/arc/arc.c (arc_print_operand): Likewise.
* config/arm/arm.c (arm_const_double_rtx, vfp3_const_double_index)
(neon_valid_immediate, arm_print_operand, arm_emit_fp16_const)
(vfp3_const_double_for_fract_bits, vfp3_const_double_for_bits):
Likewise.
* config/arm/arm.md (*arm32_movhf, consttable_4, consttable_8)
(consttable_16): Likewise.
* config/arm/vfp.md (*movhf_vfp_neon, *movhf_vfp): Likewise.
* config/avr/avr.c (avr_print_operand): Likewise.
* config/bfin/bfin.md: Likewise (in a define_split).
* config/c6x/c6x.md: Likewise (in a define_split).
* config/cr16/cr16.c (cr16_const_double_ok): Likewise.
(cr16_print_operand): Likewise.
* config/cris/cris.c (cris_print_operand): Likewise.
* config/epiphany/epiphany.c (epiphany_print_operand): Likewise.
* config/fr30/fr30.c (fr30_print_operand): Likewise.
(fr30_const_double_is_zero): Likewise.
* config/frv/frv.c (frv_print_operand, output_move_single): Likewise.
* config/frv/frv.md: Likewise (in a define_split).
* config/frv/predicates.md (int_2word_operand): Likewise.
* config/h8300/h8300.c (h8300_print_operand): Likewise.
* config/i386/i386.c (standard_80387_constant_p): Likewise.
(ix86_print_operand, ix86_split_to_parts): Likewise.
* config/i386/i386.md: Likewise (in a define_split).
* config/ia64/ia64.c (ia64_split_tmode, ia64_print_operand): Likewise.
* config/iq2000/iq2000.md (movsf_lo_sum, movsf_high): Likewise.
* config/m32r/m32r.c (easy_df_const, m32r_print_operand): Likewise.
* config/m68k/m68k.c (handle_move_double, standard_68881_constant_p)
(print_operand): Likewise.
* config/m68k/m68k.md (movsf_cf_hard, movdf_cf_hard): Likewise.
* config/mep/mep.md: Likewise (in define_split).
* config/microblaze/microblaze.c (microblaze_const_double_ok)
(print_operand): Likewise.
* config/mips/mips.md (consttable_float): Likewise.
* config/mmix/mmix.c (mmix_intval): Likewise.
* config/mn10300/mn10300.c (mn10300_print_operand): Likewise.
* config/nvptx/nvptx.c (nvptx_print_operand): Likewise.
* config/pa/pa.c (pa_singlemove_string): Likewise.
* config/pdp11/pdp11.c (pdp11_expand_operands): Likewise.
(pdp11_asm_print_operand, legitimate_const_double_p): Likewise.
* config/rs6000/rs6000.c (num_insns_constant, rs6000_emit_cmove)
(output_toc): Likewise.
* config/rs6000/rs6000.md: Likewise (in define_splits).
* config/rx/rx.c (rx_print_operand): Likewise.
* config/s390/s390.c (s390_output_pool_entry): Likewise.
* config/sh/sh.c (fp_zero_operand, fp_one_operand): Likewise.
* config/sh/sh.md (consttable_sf, consttable_df): Likewise
(and also in define_splits).
* config/sparc/sparc.c (fp_sethi_p, fp_mov_p): Likewise.
(fp_high_losum_p): Likewise.
* config/sparc/sparc.md (*movsf_insn, *movsf_lo_sum): Likewise.
(*movsf_high): Likewise.
* config/spu/spu.c (const_double_to_hwint): Likewise.
* config/v850/v850.c (const_double_split): Likewise.
* config/vax/vax.c (vax_float_literal): Likewise.
* config/visium/visium.c (visium_expand_copysign): Likewise.
* config/visium/visium.md: Likewise (in define_split).
* config/xtensa/predicates.md (const_float_1_operand): Likewise.
* config/xtensa/xtensa.c (print_operand): Likewise.
(xtensa_output_literal): Likewise.
* cprop.c (implicit_set_cond_p): Likewise.
* dwarf2out.c (insert_float): Likewise.
* expmed.c (expand_mult, make_tree): Likewise.
* expr.c (compress_float_constant): Likewise.
* rtlanal.c (split_double): Likewise.
* simplify-rtx.c (avoid_constant_pool_reference): Likewise.
(simplify_const_unary_operation, simplify_binary_operation_1)

Re: [PATCH 2/4] [ARM] Add attribute/pragma target fpu=

2015-10-05 Thread Kyrill Tkachov


Hi Christian,

Sorry for the delay.

On 14/09/15 11:47, Christian Bruel wrote:

This patch defines and uses accessors for the current fpu type fields,
based on the switchable arm_fpu_index rather than the defunct arm_fpu_desc.

Christian


+  if (TARGET_SOFT_FLOAT)
+arm_fpu_attr = FPU_NONE;
+  else if (TARGET_FPU_MODEL == ARM_FP_MODEL_VFP)
+arm_fpu_attr = FPU_VFP;
+  else
+gcc_unreachable();
 
Instead of "TARGET_FPU_MODEL == ARM_FP_MODEL_VFP" you can just use the new TARGET_VFP definition, right?



@@ -25679,7 +25667,7 @@
   if (print_tune_info)
arm_print_tune_info ();
 
-  if (! TARGET_SOFT_FLOAT && arm_fpu_desc->model == ARM_FP_MODEL_VFP)
+  if (! TARGET_SOFT_FLOAT && TARGET_FPU_MODEL == ARM_FP_MODEL_VFP)
{
  if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);

Likewise.

This is ok with those changes, but please wait until all patches in the
series have been approved before committing.

Thanks,
Kyrill

Re: [PATCH 1/4] [ARM] Add attribute/pragma target fpu=

2015-10-05 Thread Kyrill Tkachov



On 16/09/15 10:29, Christian Bruel wrote:



Maybe there is just dirt on my screen or my graphics memory is broken, but I
see an odd character between "iWMMXt" and "and"?

Thanks Bernhard,

Resent to remove this malicious sneaky ` character in the string error
message.


I keep forgetting if it's a capital W or a lowercase one but you'll know.
Thanks,


I don't know. iWMMXt is also referenced with a capital W in the other
parts of the compiler.


Let's go with what's in the compiler already.

@@ -25678,23 +25679,14 @@
   if (print_tune_info)
arm_print_tune_info ();
 
-  if (TARGET_SOFT_FLOAT)
+  if (! TARGET_SOFT_FLOAT && arm_fpu_desc->model == ARM_FP_MODEL_VFP)
{

Can use TARGET_VFP here?

Ok with that change.
Kyrill

Re: [PATCH] Unswitching outer loops.

2015-10-05 Thread Richard Biener

On Wed, Sep 30, 2015 at 12:46 PM, Yuri Rumyantsev  wrote:
> Hi Richard,
>
> I re-designed outer loop unswitching using basic idea of 23855 patch -
> hoist invariant guard if loop is empty without guard. Note that this
> was added to loop unswitching pass with simple modifications - using
> another loop iterator etc.
>
> Bootstrap and regression testing did not show any new failures.
> What is your opinion?

Overall it looks good.  Some comments below - a few more testcases would
be nice as well.

+  /* Loop must not be infinite.  */
+  if (!finite_loop_p (loop))
+return false;

why's that?

+  body = get_loop_body_in_dom_order (loop);
+  for (i = 0; i < loop->num_nodes; i++)
+{
+  if (body[i]->loop_father != loop)
+   continue;
+  if (!empty_bb_without_guard_p (loop, body[i]))

I wonder if there is a better way to iterate over the interesting
blocks and PHIs we need to check for side-effects (and thus we maybe
can avoid gathering the loop in DOM order).

+  FOR_EACH_SSA_TREE_OPERAND (name, stmt, op_iter, SSA_OP_DEF)
+   {
+ if (may_be_used_outside

may_be_used_outside can be hoisted above the loop.  I wonder if we can take
advantage of loop-closed SSA form here (and the fact that we have a single
exit from the loop).  Iterating over the exit destination PHIs and checking
whether each exit-edge DEF is defined in the part of the loop where it must
not be should be enough.

+  gcc_assert (single_succ_p (pre_header));

that should always be true.

+  gsi_remove (&gsi, false);
+  bb = guard->dest;
+  remove_edge (guard);
+  /* Update dominance for destination of GUARD.  */
+  if (EDGE_COUNT (bb->preds) == 0)
+{
+  basic_block s_bb;
+  gcc_assert (single_succ_p (bb));
+  s_bb = single_succ (bb);
+  delete_basic_block (bb);
+  if (single_pred_p (s_bb))
+   set_immediate_dominator (CDI_DOMINATORS, s_bb, single_pred (s_bb));

all this massaging should be simplified by leaving it to CFG cleanup and
simply adjusting the COND's condition to always true/false.  There is
gimple_cond_make_{true,false} () for this (it would be nice to have a
variant taking a bool).

+  new_edge = make_edge (pre_header, exit->dest, flags);
+  if (fix_dom_of_exit)
+set_immediate_dominator (CDI_DOMINATORS, exit->dest, pre_header);
+  update_stmt (gsi_stmt (gsi));

the update_stmt should not be necessary, it's done by gsi_insert_after already.

+  /* Add NEW_ADGE argument for all phi in post-header block.  */
+  bb = exit->dest;
+  for (gphi_iterator gsi = gsi_start_phis (bb);
+   !gsi_end_p (gsi); gsi_next (&gsi))
+{
+  gphi *phi = gsi.phi ();
+  /* edge_iterator ei; */
+  tree arg;
+  if (virtual_operand_p (gimple_phi_result (phi)))
+   {
+ arg = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
+ add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
+   }
+  else
+   {
+ /* Use exit edge argument.  */
+ arg = PHI_ARG_DEF_FROM_EDGE (phi, exit);
+ add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);

Hum.  How is it ok to use the exit edge argument for the edge that skips
the loop?  Why can't you always use the pre-header edge value?
That is, if we have

 for(i=0;i<n;++i)
   {
     if (n > 0)
       {
         for (;;)
           {
           }
       }
   }
  ... = i;

then i is used after the loop and the correct value to use if
n > 0 is false is '0'.  Maybe this way we can also relax
what check_exit_phi does?  IMHO the only restriction is
if something defined inside the loop before the header check for
the inner loop is used after the loop.

Thanks,
Richard.

> Thanks.
>
> ChangeLog:
> 2015-09-30  Yuri Rumyantsev  
>
> * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and
> "cfghooks.h", add prototypes for introduced new functions.
> (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all
> checks on ability of loop unswitching to tree_unswitch_single_loop;
> invoke tree_unswitch_single_loop or tree_unswitch_outer_loop depending
> on innermost loop check.
> (tree_unswitch_single_loop): Add all required checks on ability of
> loop unswitching under zero recursive level guard.
> (tree_unswitch_outer_loop): New function.
> (find_loop_guard): Likewise.
> (empty_bb_without_guard_p): Likewise.
> (used_outside_loop_p): Likewise.
> (hoist_guard): Likewise.
> (check_exit_phi): Likewise.
>
>gcc/testsuite/ChangeLog:
> * gcc.dg/loop-unswitch-2.c: New test.
>
> 2015-09-16 11:26 GMT+03:00 Richard Biener :
>> Yeah, as said, the patch wasn't fully ready and it also felt odd to do
>> this hoisting in loop header copying.  Integrating it
>> with LIM would be a better fit eventually.
>>
>> Note that we did agree to go forward with your original patch just
>> making it more "generically" perform outer loop
>> unswitching.  Did you explore that idea further?
>>
>>
>>
>> On Tue, Sep 15, 2015 at 6:00 PM, Yuri Rumyantsev  wrote:
>>> Thanks Richard.
>>>
>>> I found one more issue that could not be fixed simply. In 23855 you
>>> consider the following test-case:
>>> void foo

Re: Replace REAL_VALUES_EQUAL with real_equal

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 12:41 PM, Richard Sandiford
 wrote:
> Richard B suggested we should replace dconsthalf etc. with
> dconst<1, 2> ().  When I tried that, the extra comma caused problems
> with some lingering uses of the old target macros for handling reals
> (e.g. REAL_ARITHMETIC instead of real_arithmetic), since the constant
> was then treated as two macro parameters.  It would have been possible
> to add an extra level of brackets to avoid this, but I thought I might
> as well take the opportunity to remove the macros instead.  (Note that
> I'm only removing macros that caused a problem directly, or are closely
> related to ones that did.)
>
> This first patch replaces REAL_VALUES_EQUAL with a real_equal function.
> The prototype is the same as for real_identical, which has already
> undergone a half-transition in this direction.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
> building one target per CPU directory and checking that there were
> no new warnings and no changes in testsuite output at -O2.
> OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
> PS. It might be good to make real_value more "C++-like" in future,
> but I think it should be done with care, especially with the potential
> confusion between operator== being real_identical or real_equal.
> This patch should at least be a strict improvement over the status quo.
>
>
> gcc/c-family/
> * c-lex.c (interpret_float): Use real_equal instead of
> REAL_VALUES_EQUAL.
>
> gcc/c/
> * c-typeck.c (c_tree_equal): Use real_equal instead of
> REAL_VALUES_EQUAL.
>
> gcc/cp/
> * tree.c (cp_tree_equal): Use real_equal instead of
> REAL_VALUES_EQUAL.
>
> gcc/
> * real.h (real_equal): Declare.
> (REAL_VALUES_EQUAL): Delete.
> * real.c (real_equal): New function.
> (real_compare): Use it.
> * doc/tm.texi.in (REAL_VALUES_EQUAL): Delete.
> * doc/tm.texi: Regenerate.
> * builtins.c (fold_builtin_pow, fold_builtin_load_exponent): Use
> real_equal instead of REAL_VALUES_EQUAL.
> * config/aarch64/aarch64.c (aarch64_float_const_zero_rtx_p): Likewise.
> * config/arm/arm.c (arm_const_double_rtx, neon_valid_immediate)
> (fp_const_from_val): Likewise.
> * config/fr30/fr30.c (fr30_const_double_is_zero): Likewise.
> * config/m68k/m68k.c (standard_68881_constant_p): Likewise.
> (floating_exact_log2): Likewise.
> * config/sh/sh.c (fp_zero_operand, fp_one_operand): Likewise.
> * config/vax/vax.c (vax_float_literal): Likewise.
> * config/xtensa/predicates.md (const_float_1_operand): Likewise.
> * cprop.c (implicit_set_cond_p): Likewise.
> * expmed.c (expand_mult): Likewise.
> * fold-const.c (const_binop): Likewise.
> * simplify-rtx.c (simplify_binary_operation_1): Likewise.
> (simplify_const_binary_operation): Likewise.
> (simplify_const_relational_operation): Likewise.
> * tree-call-cdce.c (check_pow): Likewise.
> (gen_conditions_for_pow_cst_base): Likewise.
> * tree-inline.c (estimate_num_insns): Likewise.
> * tree-ssa-dom.c (record_equality): Likewise.
> * tree-ssa-math-opts.c (representable_as_half_series_p): Likewise.
> (gimple_expand_builtin_pow): Likewise.
> (pass_optimize_widening_mul::execute): Likewise.
> * tree-ssa-uncprop.c (associate_equivalences_with_edges): Likewise.
> * tree-vect-patterns.c (vect_recog_pow_pattern): Likewise.
> * tree.c (real_zerop, real_onep, real_minus_onep): Likewise.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 956cf43..89bea60 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -8419,22 +8419,22 @@ fold_builtin_pow (location_t loc, tree fndecl, tree 
> arg0, tree arg1, tree type)
>c = TREE_REAL_CST (arg1);
>
>/* Optimize pow(x,0.0) = 1.0.  */
> -  if (REAL_VALUES_EQUAL (c, dconst0))
> +  if (real_equal (&c, &dconst0))
> return omit_one_operand_loc (loc, type, build_real (type, dconst1),
>  arg0);
>
>/* Optimize pow(x,1.0) = x.  */
> -  if (REAL_VALUES_EQUAL (c, dconst1))
> +  if (real_equal (&c, &dconst1))
> return arg0;
>
>/* Optimize pow(x,-1.0) = 1.0/x.  */
> -  if (REAL_VALUES_EQUAL (c, dconstm1))
> +  if (real_equal (&c, &dconstm1))
> return fold_build2_loc (loc, RDIV_EXPR, type,
> build_real (type, dconst1), arg0);
>
>/* Optimize pow(x,0.5) = sqrt(x).  */
>if (flag_unsafe_math_optimizations
> - && REAL_VALUES_EQUAL (c, dconsthalf))
> + && real_equal (&c, &dconsthalf))
> {
>   tree sqrtfn = mathfn_built_in (type, BUILT_IN_SQRT);
>
> @@ -8448,7 +8448,7 @@ fold_builtin_pow (location_t loc, tree fndecl, tree 
> arg0, tree arg1, tree type)
>   const REAL_VALUE_TYPE dconstroot

Re: Replace REAL_VALUES_LESS with real_less

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 12:43 PM, Richard Sandiford
 wrote:
> This patch continues the removal of real-related macros by
> replacing REAL_VALUES_LESS with real_less.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
> building one target per CPU directory and checking that there were
> no new warnings and no changes in testsuite output at -O2.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/ada/
> * gcc-interface/trans.c (convert_with_check): Use real_less instead
> of REAL_VALUES_LESS.
>
> gcc/
> * doc/tm.texi.in (REAL_VALUES_LESS): Delete.
> * doc/tm.texi: Regenerate.
> * real.h (real_less): Declare.
> (REAL_VALUES_LESS): Delete.
> * real.c (real_less): New function.
> (real_compare): Use it.
> * config/m68k/m68k.c (floating_exact_log2): Use real_less instead
> of REAL_VALUES_LESS.
> * config/microblaze/microblaze.c (microblaze_const_double_ok):
> Likewise.
> * fold-const.c (fold_convert_const_int_from_real): Likewise.
> * simplify-rtx.c (simplify_const_unary_operation): Likewise.
> (simplify_const_relational_operation): Likewise.
> * tree-call-cdce.c (check_pow): Likewise.
> (gen_conditions_for_pow_cst_base): Likewise.
>
> diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
> index 3252ea2..9838da0 100644
> --- a/gcc/ada/gcc-interface/trans.c
> +++ b/gcc/ada/gcc-interface/trans.c
> @@ -8999,8 +8999,8 @@ convert_with_check (Entity_Id gnat_type, tree gnu_expr, 
> bool overflowp,
>if (INTEGRAL_TYPE_P (gnu_in_basetype)
>   ? tree_int_cst_lt (gnu_in_lb, gnu_out_lb)
>   : (FLOAT_TYPE_P (gnu_base_type)
> -? REAL_VALUES_LESS (TREE_REAL_CST (gnu_in_lb),
> -TREE_REAL_CST (gnu_out_lb))
> +? real_less (&TREE_REAL_CST (gnu_in_lb),
> + &TREE_REAL_CST (gnu_out_lb))
>  : 1))
> gnu_cond
>   = invert_truthvalue
> @@ -9011,8 +9011,8 @@ convert_with_check (Entity_Id gnat_type, tree gnu_expr, 
> bool overflowp,
>if (INTEGRAL_TYPE_P (gnu_in_basetype)
>   ? tree_int_cst_lt (gnu_out_ub, gnu_in_ub)
>   : (FLOAT_TYPE_P (gnu_base_type)
> -? REAL_VALUES_LESS (TREE_REAL_CST (gnu_out_ub),
> -TREE_REAL_CST (gnu_in_lb))
> +? real_less (&TREE_REAL_CST (gnu_out_ub),
> + &TREE_REAL_CST (gnu_in_lb))
>  : 1))
> gnu_cond
>   = build_binary_op (TRUTH_ORIF_EXPR, boolean_type_node, gnu_cond,
> diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
> index 487cbf4..74de983 100644
> --- a/gcc/config/m68k/m68k.c
> +++ b/gcc/config/m68k/m68k.c
> @@ -4365,7 +4365,7 @@ floating_exact_log2 (rtx x)
>
>REAL_VALUE_FROM_CONST_DOUBLE (r, x);
>
> -  if (REAL_VALUES_LESS (r, dconst1))
> +  if (real_less (&r, &dconst1))
>  return 0;
>
>exp = real_exponent (&r);
> diff --git a/gcc/config/microblaze/microblaze.c 
> b/gcc/config/microblaze/microblaze.c
> index ebcf65a..9efa739 100644
> --- a/gcc/config/microblaze/microblaze.c
> +++ b/gcc/config/microblaze/microblaze.c
> @@ -280,12 +280,12 @@ microblaze_const_double_ok (rtx op, machine_mode mode)
>
>if (mode == DFmode)
>  {
> -  if (REAL_VALUES_LESS (d, dfhigh) && REAL_VALUES_LESS (dflow, d))
> +  if (real_less (&d, &dfhigh) && real_less (&dflow, &d))
> return 1;
>  }
>else
>  {
> -  if (REAL_VALUES_LESS (d, sfhigh) && REAL_VALUES_LESS (sflow, d))
> +  if (real_less (&d, &sfhigh) && real_less (&sflow, &d))
> return 1;
>  }
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index fdd49d9..f6a0b5d 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -9708,10 +9708,6 @@ array of @code{HOST_WIDE_INT}, but all code should 
> treat it as an opaque
>  quantity.
>  @end defmac
>
> -@deftypefn Macro int REAL_VALUES_LESS (REAL_VALUE_TYPE @var{x}, 
> REAL_VALUE_TYPE @var{y})
> -Tests whether @var{x} is less than @var{y}.
> -@end deftypefn
> -
>  @deftypefn Macro HOST_WIDE_INT REAL_VALUE_FIX (REAL_VALUE_TYPE @var{x})
>  Truncates @var{x} to a signed integer, rounding toward zero.
>  @end deftypefn
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index ab75ad9..7fae1ca 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -7131,10 +7131,6 @@ array of @code{HOST_WIDE_INT}, but all code should 
> treat it as an opaque
>  quantity.
>  @end defmac
>
> -@deftypefn Macro int REAL_VALUES_LESS (REAL_VALUE_TYPE @var{x}, 
> REAL_VALUE_TYPE @var{y})
> -Tests whether @var{x} is less than @var{y}.
> -@end deftypefn
> -
>  @deftypefn Macro HOST_WIDE_INT REAL_VALUE_FIX (REAL_VALUE_TYPE @var{x})
>  Truncates @var{x} to a signed integer, rounding toward zero.
>  @end deftypefn
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 1c72af6..2851a29 100644
> --- a/gcc/fold-const.

Re: Remove remaining uses of REAL_VALUES_IDENTICAL

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 12:43 PM, Richard Sandiford
 wrote:
> This patch continues the removal of real-related macros.
> We already had both the old-style REAL_VALUES_IDENTICAL and the
> new-style real_identical, so this patch replaces all remaining
> uses of the former with the latter.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
> building one target per CPU directory and checking that there were
> no new warnings and no changes in testsuite output at -O2.  OK to
> install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * real.h (REAL_VALUES_IDENTICAL): Delete.
> * config/m68k/m68k.c (standard_68881_constant_p): Use real_identical
> instead of REAL_VALUES_IDENTICAL.
> * fold-const.c (operand_equal_p): Likewise.
> * ipa-icf.c (sem_variable::equals): Likewise.
> * tree-complex.c (some_nonzerop): Likewise.
> (expand_complex_multiplication): Likewise.
> * tree.c (simple_cst_equal): Likewise.
> * varasm.c (compare_constant): Likewise.
>
> diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
> index b7d96a5..487cbf4 100644
> --- a/gcc/config/m68k/m68k.c
> +++ b/gcc/config/m68k/m68k.c
> @@ -4336,11 +4336,10 @@ standard_68881_constant_p (rtx x)
>
>REAL_VALUE_FROM_CONST_DOUBLE (r, x);
>
> -  /* Use REAL_VALUES_IDENTICAL instead of real_equal so that -0.0
> - is rejected.  */
> +  /* Use real_identical instead of real_equal so that -0.0 is rejected.  */
>for (i = 0; i < 6; i++)
>  {
> -  if (REAL_VALUES_IDENTICAL (r, values_68881[i]))
> +  if (real_identical (&r, &values_68881[i]))
>  return (codes_68881[i]);
>  }
>
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 768b39b..1c72af6 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -2815,8 +2815,7 @@ operand_equal_p (const_tree arg0, const_tree arg1, 
> unsigned int flags)
>TREE_FIXED_CST (arg1));
>
>case REAL_CST:
> -   if (REAL_VALUES_IDENTICAL (TREE_REAL_CST (arg0),
> -  TREE_REAL_CST (arg1)))
> +   if (real_identical (&TREE_REAL_CST (arg0), &TREE_REAL_CST (arg1)))
>   return 1;
>
>
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index d39a3c1..b076222 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -2030,8 +2030,8 @@ sem_variable::equals (tree t1, tree t2)
>/* Real constants are the same only if the same width of type.  */
>if (TYPE_PRECISION (TREE_TYPE (t1)) != TYPE_PRECISION (TREE_TYPE (t2)))
>  return return_false_with_msg ("REAL_CST precision mismatch");
> -  return return_with_debug (REAL_VALUES_IDENTICAL (TREE_REAL_CST (t1),
> -  TREE_REAL_CST (t2)));
> +  return return_with_debug (real_identical (&TREE_REAL_CST (t1),
> +   &TREE_REAL_CST (t2)));
>  case VECTOR_CST:
>{
> unsigned i;
> diff --git a/gcc/real.h b/gcc/real.h
> index 2ffc0d2..1be4104 100644
> --- a/gcc/real.h
> +++ b/gcc/real.h
> @@ -333,7 +333,6 @@ extern const struct real_format arm_half_format;
>  #define REAL_ARITHMETIC(value, code, d1, d2) \
>real_arithmetic (&(value), code, &(d1), &(d2))
>
> -#define REAL_VALUES_IDENTICAL(x, y)real_identical (&(x), &(y))
>  #define REAL_VALUES_LESS(x, y) real_compare (LT_EXPR, &(x), &(y))
>
>  /* Determine whether a floating-point value X is infinite.  */
> diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
> index b0ffc00..93c0a54 100644
> --- a/gcc/tree-complex.c
> +++ b/gcc/tree-complex.c
> @@ -118,7 +118,7 @@ some_nonzerop (tree t)
>   cannot be treated the same as operations with a real or imaginary
>   operand if we care about the signs of zeros in the result.  */
>if (TREE_CODE (t) == REAL_CST && !flag_signed_zeros)
> -zerop = REAL_VALUES_IDENTICAL (TREE_REAL_CST (t), dconst0);
> +zerop = real_identical (&TREE_REAL_CST (t), &dconst0);
>else if (TREE_CODE (t) == FIXED_CST)
>  zerop = fixed_zerop (t);
>else if (TREE_CODE (t) == INTEGER_CST)
> @@ -1021,7 +1021,7 @@ expand_complex_multiplication (gimple_stmt_iterator 
> *gsi, tree inner_type,
>  case PAIR (ONLY_IMAG, ONLY_REAL):
>rr = ar;
>if (TREE_CODE (ai) == REAL_CST
> - && REAL_VALUES_IDENTICAL (TREE_REAL_CST (ai), dconst1))
> + && real_identical (&TREE_REAL_CST (ai), &dconst1))
> ri = br;
>else
> ri = gimplify_build2 (gsi, MULT_EXPR, inner_type, ai, br);
> diff --git a/gcc/tree.c b/gcc/tree.c
> index b432997..f78a2c2 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -7370,7 +7370,7 @@ simple_cst_equal (const_tree t1, const_tree t2)
>return wi::to_widest (t1) == wi::to_widest (t2);
>
>  case REAL_CST:
> -  return REAL_VALUES_IDENTICAL (TREE_REAL_CST (t1), TREE_REAL_CST (t2));
> +  return real_identical (&TREE_REAL_CST (t1), &TREE_REAL_CST (t2));
>
>

Re: [PATCH 3/4] [ARM] Add attribute/pragma target fpu=

2015-10-05 Thread Kyrill Tkachov


Hi Christian,

On 14/09/15 12:39, Christian Bruel wrote:

This patch splits the neon_builtins initialization into two internal
functions, one for NEON and one for CRYPTO, each guarded by its own
predicate. arm_init_neon_builtins is now global so that it can be called
from arm_valid_target_attribute_tree if needed.



+void
+arm_init_crypto_builtins_internal (void)
+{
+  tree V16UQI_type_node
+= arm_simd_builtin_type (V16QImode, true, false);
 
Like Bernhard said, this should be static.


Otherwise this is ok (with the irrelevant arm-protos.h dropped as discussed
earlier).

Thanks,
Kyrill

Re: Remove remaining uses of REAL_ARITHMETIC

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 12:44 PM, Richard Sandiford
 wrote:
> This patch replaces all remaining uses of the old target macro
> REAL_ARITHMETIC with calls to the (now generic) real_arithmetic
> function.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
> building one target per CPU directory and checking that there were
> no new warnings and no changes in testsuite output at -O2.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/ada/
> * gcc-interface/trans.c (convert_with_check): Use real_arithmetic
> instead of REAL_ARITHMETIC.
>
> gcc/
> * doc/tm.texi.in (REAL_ARITHMETIC): Delete.
> * doc/tm.texi: Regenerate.
> * real.h (REAL_ARITHMETIC): Delete.
> * config/i386/i386.c (ix86_expand_lround, ix86_expand_round)
> (ix86_expand_round_sse4): Use real_arithmetic instead of
> REAL_ARITHMETIC.
> * config/i386/sse.md (round2): Likewise.
> * rtl.h (rtx_to_tree_code): Likewise (in comment).
> * explow.c (rtx_to_tree_code): Likewise (in comment).
> * match.pd: Likewise.
> * simplify-rtx.c (simplify_binary_operation_1): Likewise.
> * tree-ssa-math-opts.c (representable_as_half_series_p): Likewise.
> (expand_pow_as_sqrts): Likewise.
> * tree-pretty-print.c (dump_generic_node): Remove code that
> was conditional on REAL_ARITHMETIC being undefined.
>
> diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
> index 9838da0..f1e2dcb 100644
> --- a/gcc/ada/gcc-interface/trans.c
> +++ b/gcc/ada/gcc-interface/trans.c
> @@ -9048,8 +9048,8 @@ convert_with_check (Entity_Id gnat_type, tree gnu_expr, 
> bool overflowp,
>/* Compute the exact value calc_type'Pred (0.5) at compile time.  */
>fmt = REAL_MODE_FORMAT (TYPE_MODE (calc_type));
>real_2expN (&half_minus_pred_half, -(fmt->p) - 1, TYPE_MODE 
> (calc_type));
> -  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf,
> -  half_minus_pred_half);
> +  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf,
> +  &half_minus_pred_half);
>gnu_pred_half = build_real (calc_type, pred_half);
>
>/* If the input is strictly negative, subtract this value
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 9c4cfbd..44847b4 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -47538,7 +47538,7 @@ ix86_expand_lround (rtx op0, rtx op1)
>/* load nextafter (0.5, 0.0) */
>fmt = REAL_MODE_FORMAT (mode);
>real_2expN (&half_minus_pred_half, -(fmt->p) - 1, mode);
> -  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
> +  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, 
> &half_minus_pred_half);
>
>/* adj = copysign (0.5, op1) */
>adj = force_reg (mode, const_double_from_real_value (pred_half, mode));
> @@ -47952,7 +47952,7 @@ ix86_expand_round (rtx operand0, rtx operand1)
>/* load nextafter (0.5, 0.0) */
>fmt = REAL_MODE_FORMAT (mode);
>real_2expN (&half_minus_pred_half, -(fmt->p) - 1, mode);
> -  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
> +  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, 
> &half_minus_pred_half);
>
>/* xa = xa + 0.5 */
>half = force_reg (mode, const_double_from_real_value (pred_half, mode));
> @@ -48003,7 +48003,7 @@ ix86_expand_round_sse4 (rtx op0, rtx op1)
>/* load nextafter (0.5, 0.0) */
>fmt = REAL_MODE_FORMAT (mode);
>real_2expN (&half_minus_pred_half, -(fmt->p) - 1, mode);
> -  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
> +  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, 
> &half_minus_pred_half);
>half = const_double_from_real_value (pred_half, mode);
>
>/* e1 = copysign (0.5, op1) */
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 013681c..9b7a338 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -14881,7 +14881,7 @@
>/* load nextafter (0.5, 0.0) */
>fmt = REAL_MODE_FORMAT (scalar_mode);
>real_2expN (&half_minus_pred_half, -(fmt->p) - 1, scalar_mode);
> -  REAL_ARITHMETIC (pred_half, MINUS_EXPR, dconsthalf, half_minus_pred_half);
> +  real_arithmetic (&pred_half, MINUS_EXPR, &dconsthalf, 
> &half_minus_pred_half);
>half = const_double_from_real_value (pred_half, scalar_mode);
>
>vec_half = ix86_build_const_vector (mode, true, half);
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index f6a0b5d..72366b9 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -9736,21 +9736,6 @@ Determines whether @var{x} represents infinity 
> (positive or negative).
>  Determines whether @var{x} represents a ``NaN'' (not-a-number).
>  @end deftypefn
>
> -@deftypefn Macro void REAL_ARITHMETIC (REAL_VALUE_TYPE @var{output}, enum 
> tree_code @var{code}, REAL_VALUE_TYPE @var{x}, REAL_VALUE_TYPE @var{y})
> -Calculates an arithmetic op
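
For readers unfamiliar with the "load nextafter (0.5, 0.0)" idiom in the hunks above: the code computes 0.5 - 2^(-p-1), where p is the precision of the format, presumably so that adding the constant does not round values just below 0.5 up to 1. A minimal sketch in plain C, assuming IEEE binary64 doubles (p = 53); pred_half_double is a made-up name and not part of the patch:

```c
#include <float.h>

/* Illustration only, assuming IEEE binary64: DBL_EPSILON is 2^(1-p),
   so DBL_EPSILON / 4 is 2^(-p-1), the value real_2expN is asked to
   build in the quoted code.  */
static double
pred_half_double (void)
{
  double half_minus_pred_half = DBL_EPSILON / 4;

  /* Doubles in [0.25, 0.5) are spaced 2^(-p-1) apart, so this
     subtraction is exact and yields the largest double below 0.5.  */
  return 0.5 - half_minus_pred_half;
}
```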

Re: using scratchpads to enhance RTL-level if-conversion: the new patch now passes bootstrap with the default BUILD_CONFIG [i.e. no stage2-to-stage3 comparison errors even with debugging info off in s

2015-10-05 Thread Bernd Schmidt

This is currently not really reviewable due to broken indentation, 
possibly due to whitespace damage from your mailer or not following 
coding guidelines. Please ensure your code is formatted the same way as 
all other code in gcc. I'll point out some of the problems, but please 
investigate

  https://www.gnu.org/prep/standards/html_node/Writing-C.html
and fix everything before resubmission. Also read through one or two
whole gcc source files to get an impression of how things should
look.



 static bool
-noce_mem_write_may_trap_or_fault_p (const_rtx mem)
+noce_mem_write_may_trap_or_fault_considering_scratchpad_support_p
+  (const_rtx mem, bool scratchpads_enabled)


Just no. The presence of the argument and its documentation is enough to 
show that scratchpad support is considered. We don't want identifiers 
taking up a whole line.



+  if (optimize<2)
+  {
+return FALSE;
+  }


Lose braces around single statements.


+  if
(size_of_MEM_operand<=SCRATCHPADS_INCLUSIVE_MAX_SIZE_IN_BYTES)
+  {
+
+if (size_of_MEM_operand > size_of_biggest_scratchpad)
+{


Serious problems with the indentation. Check whether it's your mailer 
doing it or if this is a problem in your code. If it's the former, try 
using a text/plain attachment for sending patches. Also, identifiers too 
long.



+for
+  (insn_to_maybe_duplicate = BB_HEAD (then_bb);
+   insn_to_maybe_duplicate &&
+ (insn_to_maybe_duplicate != insn_a) &&
+ (insn_to_maybe_duplicate != BB_END (then_bb));


Use shorter identifiers (everywhere) to avoid having to use multiple 
lines. "cand", short for candidate, might be a good option here.
There is no need for parentheses around != comparisons. Logical ops like &&
should start a line, not end it.



+   insn_to_maybe_duplicate=NEXT_INSN (insn_to_maybe_duplicate))


Spaces around operators.


+if (MEM_ADDR_SPACE (orig_x) !=
+MEM_ADDR_SPACE (biggest_scratchpad_rtx))


Again, operator should start a line.
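
As a rough illustration of the formatting points made so far (short identifiers, spaces around operators, operators starting a continuation line, no braces around single statements), here is a small stand-alone sketch. The struct and function names are made up for the example and are not GCC types:

```c
#include <stddef.h>

/* Hypothetical stand-in for a list of insns; not a GCC type.  */
struct node
{
  int id;
  struct node *next;
};

/* Count the nodes from HEAD up to, but not including, STOP or LAST,
   whichever comes first.  */
static int
count_before (const struct node *head, const struct node *stop,
              const struct node *last)
{
  int n = 0;

  /* A split condition starts the continuation line with the operator.  */
  if (head == NULL
      || head == stop)
    return 0;

  /* The short name "cand" lets each loop clause fit on one line.  */
  for (const struct node *cand = head;
       cand != NULL && cand != stop && cand != last;
       cand = cand->next)
    n++;

  return n;
}
```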


+/* Abe`s note: do we need to do the following after getting
+   a new pseudo-reg., as shown elsewhere in this file?
+  if (max_regno < max_reg_num ())  max_regno = max_reg_num ();
+*/


Avoid referencing yourself in the comments with things like "Abe's 
note". (Also, odd use of the backquote character there).



+  } /* (size_of_MEM_operand <=
+ SCRATCHPADS_INCLUSIVE_MAX_SIZE_IN_BYTES) */


Lose such end comments.


basic_block merge_bb,
  static void
  if_convert (bool after_combine)
  {
+
basic_block bb;
int pass;


Spurious whitespace change.


+  biggest_scratchpad_rtx = 0; /* Reminder: an "rtx", therefore a
pointer.  */


Comments go before a line, and this particular comment should just be 
removed as useless.


I'll point out one possible problem with the logic now (I'll wait for a 
properly formatted version before going in depth):



+if (size_of_MEM_operand > size_of_biggest_scratchpad)
+{
+  size_of_biggest_scratchpad = size_of_MEM_operand;
+  biggest_scratchpad_rtx = assign_stack_local
+(GET_MODE (orig_x), size_of_MEM_operand, 0);
+/* 0: align acc. to the machine mode indicated by
+  "GET_MODE (orig_x)" */
+  gcc_assert (biggest_scratchpad_rtx);
+  scratchpads.safe_push (std::make_pair (biggest_scratchpad_rtx,
+ size_of_MEM_operand));
+}
+


It looks like you're allocating extra stack slots, and later code can 
decide not to use them. This seems like it could be problematic.



Bernd

Re: Remove remaining uses of CONST_DOUBLE_FROM_REAL_VALUE

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 12:45 PM, Richard Sandiford
 wrote:
> This patch replaces all uses of CONST_DOUBLE_FROM_REAL_VALUE
> with the already-existing const_double_from_real_value.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
> building one target per CPU directory and checking that there were
> no new warnings and no changes in testsuite output at -O2.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * real.h (CONST_DOUBLE_ATOF): Use const_double_from_real_value
> instead of CONST_DOUBLE_FROM_REAL_VALUE.
> (CONST_DOUBLE_FROM_REAL_VALUE): Delete.
> * config/c6x/c6x.md (divsf3, divdf3): Use const_double_from_real_value
> instead of CONST_DOUBLE_FROM_REAL_VALUE.
> * config/epiphany/epiphany.md (fixuns_truncsfsi2): Likewise.
> * config/i386/i386.c (standard_80387_constant_rtx): Likewise.
> (ix86_expand_builtin, ix86_emit_i387_log1p, ix86_emit_i387_round)
> (ix86_emit_swsqrtsf): Likewise.
> * config/ia64/ia64.c (ia64_expand_builtin): Likewise.
> * config/mips/mips.md (fixuns_truncdfsi2, fixuns_truncdfdi2)
> (fixuns_truncsfsi2, fixuns_truncsfdi2): Likewise.
> * config/pa/pa.c (pa_expand_builtin): Likewise.
> * config/rs6000/rs6000.c (rs6000_load_constant_and_splat): Likewise.
> (rs6000_scale_v2df): Likewise.
> * config/rs6000/rs6000.md (*cmptf_internal2): Likewise.
> * config/s390/s390.md (fixuns_truncdddi2, fixuns_trunctddi2)
> (fixuns_trunc2): Likewise.
> * config/s390/vx-builtins.md (vec_ctd_s64, vec_ctd_u64, vec_ctsl)
> (vec_ctul): Likewise.
> * config/sparc/sparc.c (sparc_emit_fixunsdi): Likewise.
> * config/spu/spu.c (hwint_to_const_double, spu_float_const): Likewise.
> * config/spu/spu.md (floatunsdisf2, floatunstisf2): Likewise.
> * cse.c (fold_rtx): Likewise.
> * emit-rtl.c (immed_double_const): Likewise (in comments).
> (init_emit_once): Likewise.
> * expr.c (compress_float_constant, expand_expr_real_1)
> (const_vector_from_tree): Likewise.
> * optabs.c (expand_float, expand_fix): Likewise.
> * reg-stack.c (reg_to_stack): Likewise.
> * simplify-rtx.c (avoid_constant_pool_reference): Likewise.
> (simplify_const_unary_operation, simplify_binary_operation_1)
> (simplify_const_binary_operation, simplify_relational_operation)
> (simplify_immed_subreg): Likewise.
>
> diff --git a/gcc/config/c6x/c6x.md b/gcc/config/c6x/c6x.md
> index 075968d..692d83f 100644
> --- a/gcc/config/c6x/c6x.md
> +++ b/gcc/config/c6x/c6x.md
> @@ -2811,7 +2811,7 @@
>"TARGET_FP && flag_reciprocal_math"
>  {
>operands[3] = force_reg (SFmode,
> -  CONST_DOUBLE_FROM_REAL_VALUE (dconst2, SFmode));
> +  const_double_from_real_value (dconst2, SFmode));
>operands[4] = gen_reg_rtx (SFmode);
>operands[5] = gen_reg_rtx (SFmode);
>operands[6] = gen_reg_rtx (SFmode);
> @@ -2836,7 +2836,7 @@
>"TARGET_FP && flag_reciprocal_math"
>  {
>operands[3] = force_reg (DFmode,
> -  CONST_DOUBLE_FROM_REAL_VALUE (dconst2, DFmode));
> +  const_double_from_real_value (dconst2, DFmode));
>operands[4] = gen_reg_rtx (DFmode);
>operands[5] = gen_reg_rtx (DFmode);
>operands[6] = gen_reg_rtx (DFmode);
> diff --git a/gcc/config/epiphany/epiphany.md b/gcc/config/epiphany/epiphany.md
> index 4280926..4c8b5d6 100644
> --- a/gcc/config/epiphany/epiphany.md
> +++ b/gcc/config/epiphany/epiphany.md
> @@ -982,7 +982,7 @@
>rtx cmp = gen_rtx_LT (VOIDmode, cc1, CONST0_RTX (SFmode));
>
>real_2expN (&offset, 31, SFmode);
> -  limit = CONST_DOUBLE_FROM_REAL_VALUE (offset, SFmode);
> +  limit = const_double_from_real_value (offset, SFmode);
>limit = force_reg (SFmode, limit);
>emit_insn (gen_fix_truncsfsi2 (operands[0], operands[1]));
>emit_insn (gen_subsf3_f (tmp, operands[1], limit));
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 44847b4..ff52779 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -10573,7 +10573,7 @@ standard_80387_constant_rtx (int idx)
>gcc_unreachable ();
>  }
>
> -  return CONST_DOUBLE_FROM_REAL_VALUE (ext_80387_constants_table[i],
> +  return const_double_from_real_value (ext_80387_constants_table[i],
>XFmode);
>  }
>
> @@ -40143,7 +40143,7 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
> rtx tmp;
>
> real_inf (&inf);
> -   tmp = CONST_DOUBLE_FROM_REAL_VALUE (inf, mode);
> +   tmp = const_double_from_real_value (inf, mode);
>
> tmp = validize_mem (force_const_mem (mode, tmp));
>
> @@ -47021,7 +47021,7 @@ void ix86_emit_i387_log1p (rtx op0, rtx op1)
>
>emit_insn (gen_absxf2 (tmp, op1));
>test = gen_rtx_GE (VO

Re: Remove REAL_VALUE_FROM_CONST_DOUBLE

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 12:47 PM, Richard Sandiford
 wrote:
> To maintain symmetry after the previous removal of
> CONST_DOUBLE_FROM_REAL_VALUE, this patch also gets rid of
> REAL_VALUE_FROM_CONST_DOUBLE.  All the macro did was copy the
> contents of CONST_DOUBLE_REAL_VALUE into a temporary real_value
> structure.  In many cases there was no need for this temporary
> and we could simply use the CONST_DOUBLE_REAL_VALUE directly.
> For that reason this patch is less automatic than the others.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
> building one target per CPU directory and checking that there were
> no new warnings and no changes in testsuite output at -O2.  OK to install?

Ok.

Thanks,
Richard.
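
A rough sketch of the difference the quoted description is talking about: copying the stored value into a temporary versus reading it in place. All names below are made-up stand-ins; the real GCC types and macros differ.

```c
/* Hypothetical stand-ins; not the GCC definitions.  */
struct real_value_sketch
{
  int sign;
  long sig;
};

struct const_double_sketch
{
  struct real_value_sketch value;
};

/* Accessor that yields a pointer to the stored value.  */
#define NODE_REAL_VALUE(node) (&(node)->value)

/* Old-style macro: copy the stored value into a temporary.  */
#define REAL_VALUE_FROM_NODE(dst, node) ((dst) = *NODE_REAL_VALUE (node))

/* With the copying macro, a temporary is needed even for a read.  */
static int
is_negative_with_copy (const struct const_double_sketch *node)
{
  struct real_value_sketch tmp;

  REAL_VALUE_FROM_NODE (tmp, node);
  return tmp.sign;
}

/* Reading through the accessor avoids the temporary.  */
static int
is_negative_direct (const struct const_double_sketch *node)
{
  return NODE_REAL_VALUE (node)->sign;
}
```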

> Thanks,
> Richard
>
>
> gcc/
> * real.h (REAL_VALUE_FROM_CONST_DOUBLE): Delete.
> * config/aarch64/aarch64.c (aarch64_float_const_zero_rtx_p)
> (aarch64_print_operand, aarch64_float_const_representable_p)
> (aarch64_output_simd_mov_immediate): Use CONST_DOUBLE_REAL_VALUE
> instead of REAL_VALUE_FROM_CONST_DOUBLE.
> * config/arc/arc.c (arc_print_operand): Likewise.
> * config/arm/arm.c (arm_const_double_rtx, vfp3_const_double_index)
> (neon_valid_immediate, arm_print_operand, arm_emit_fp16_const)
> (vfp3_const_double_for_fract_bits, vfp3_const_double_for_bits):
> Likewise.
> * config/arm/arm.md (*arm32_movhf, consttable_4, consttable_8)
> (consttable_16): Likewise.
> * config/arm/vfp.md (*movhf_vfp_neon, *movhf_vfp): Likewise.
> * config/avr/avr.c (avr_print_operand): Likewise.
> * config/bfin/bfin.md: Likewise (in a define_split).
> * config/c6x/c6x.md: Likewise (in a define_split).
> * config/cr16/cr16.c (cr16_const_double_ok): Likewise.
> (cr16_print_operand): Likewise.
> * config/cris/cris.c (cris_print_operand): Likewise.
> * config/epiphany/epiphany.c (epiphany_print_operand): Likewise.
> * config/fr30/fr30.c (fr30_print_operand): Likewise.
> (fr30_const_double_is_zero): Likewise.
> * config/frv/frv.c (frv_print_operand, output_move_single): Likewise.
> * config/frv/frv.md: Likewise (in a define_split).
> * config/frv/predicates.md (int_2word_operand): Likewise.
> * config/h8300/h8300.c (h8300_print_operand): Likewise.
> * config/i386/i386.c (standard_80387_constant_p): Likewise.
> (ix86_print_operand, ix86_split_to_parts): Likewise.
> * config/i386/i386.md: Likewise (in a define_split).
> * config/ia64/ia64.c (ia64_split_tmode, ia64_print_operand): Likewise.
> * config/iq2000/iq2000.md (movsf_lo_sum, movsf_high): Likewise.
> * config/m32r/m32r.c (easy_df_const, m32r_print_operand): Likewise.
> * config/m68k/m68k.c (handle_move_double, standard_68881_constant_p)
> (print_operand): Likewise.
> * config/m68k/m68k.md (movsf_cf_hard, movdf_cf_hard): Likewise.
> * config/mep/mep.md: Likewise (in define_split).
> * config/microblaze/microblaze.c (microblaze_const_double_ok)
> (print_operand): Likewise.
> * config/mips/mips.md (consttable_float): Likewise.
> * config/mmix/mmix.c (mmix_intval): Likewise.
> * config/mn10300/mn10300.c (mn10300_print_operand): Likewise.
> * config/nvptx/nvptx.c (nvptx_print_operand): Likewise.
> * config/pa/pa.c (pa_singlemove_string): Likewise.
> * config/pdp11/pdp11.c (pdp11_expand_operands): Likewise.
> (pdp11_asm_print_operand, legitimate_const_double_p): Likewise.
> * config/rs6000/rs6000.c (num_insns_constant, rs6000_emit_cmove)
> (output_toc): Likewise.
> * config/rs6000/rs6000.md: Likewise (in define_splits).
> * config/rx/rx.c (rx_print_operand): Likewise.
> * config/s390/s390.c (s390_output_pool_entry): Likewise.
> * config/sh/sh.c (fp_zero_operand, fp_one_operand): Likewise.
> * config/sh/sh.md (consttable_sf, consttable_df): Likewise
> (and also in define_splits).
> * config/sparc/sparc.c (fp_sethi_p, fp_mov_p): Likewise.
> (fp_high_losum_p): Likewise.
> * config/sparc/sparc.md (*movsf_insn, *movsf_lo_sum): Likewise.
> (*movsf_high): Likewise.
> * config/spu/spu.c (const_double_to_hwint): Likewise.
> * config/v850/v850.c (const_double_split): Likewise.
> * config/vax/vax.c (vax_float_literal): Likewise.
> * config/visium/visium.c (visium_expand_copysign): Likewise.
> * config/visium/visium.md: Likewise (in define_split).
> * config/xtensa/predicates.md (const_float_1_operand): Likewise.
> * config/xtensa/xtensa.c (print_operand): Likewise.
> (xtensa_output_literal): Likewise.
> * cprop.c (implicit_set_cond_p): Likewise.
> * dwarf2out.c (insert_float): Likewise.
> * expmed.c (expand_mult, make_tree): Likewis

[Patch ARM/ AArch64] Fix typo in vcvt_f16.c testcase .

2015-10-05 Thread Ramana Radhakrishnan

Hi,

This test worked by accident. While looking at why it was failing randomly in 
my builds for arm-none-eabi, I discovered a bug in the way the testcase was 
written. Tested on an arm-none-eabi cross; this fixes the issues that I was 
seeing with this test. 

Applied to trunk as obvious.

regards
Ramana
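
A minimal sketch of why the quotes matter, assuming the test name ends up as a string operand in a formatted message. The shared check macros are not shown in this message, so format_failure below is a made-up stand-in, not the testsuite's code:

```c
#include <stdio.h>
#include <string.h>

/* With the quotes, TEST_MSG expands to a string literal.  */
#define TEST_MSG "vcvt_f32_f16"

/* Hypothetical reporting helper: uses TEST_MSG as the %s operand.
   Without the quotes, TEST_MSG would expand to a bare identifier
   here instead of a string.  */
static int
format_failure (char *buf, size_t len, int index)
{
  return snprintf (buf, len, "ERROR in %s at index %d", TEST_MSG, index);
}
```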


* gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c (TEST_MSG): Fix typo.
(exec_vcvt): Add comments.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c
index 48e50e1..c3e4d4f 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c
@@ -21,7 +21,7 @@ exec_vcvt (void)
 {
   clean_results ();
 
-#define TEST_MSG vcvt_f32_f16
+#define TEST_MSG "vcvt_f32_f16"
   {
 VECT_VAR_DECL (buffer_src, float, 16, 4) [] = { 16.0, 15.0, 14.0, 13.0 };
 
@@ -39,7 +39,7 @@ exec_vcvt (void)
 
   clean_results ();
 
-#define TEST_MSG vcvt_f16_f32
+#define TEST_MSG "vcvt_f16_f32"
   {
 VECT_VAR_DECL (buffer_src, float, 32, 4) [] = { 1.5, 2.5, 3.5, 4.5 };
 DECL_VARIABLE (vector_src, float, 32, 4);
@@ -54,6 +54,8 @@ exec_vcvt (void)
   }
 #undef TEST_MSG
 
+  /* We run more tests for AArch64 as the relevant intrinsics
+ do not exist on AArch32.  */
 #if defined (__aarch64__)
   clean_results ();

Re: [PATCH] Remove gimplifier use from PRE

2015-10-05 Thread Richard Biener

On Thu, 1 Oct 2015, Richard Biener wrote:

> 
> The following patch from the match-and-simplify branch removes
> gimplifier use from PRE replacing it with use of the gimple_build API
> building GIMPLE directly.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

The 5th (or so) refactoring attempt succeeded - see below.  The good
thing is the code looks much easier to follow now.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-10-05  Richard Biener  

* tree-ssa-pre.c (create_component_ref_by_pieces_1): Move
call handling ...
(create_expression_by_pieces): ... here and build GIMPLE
calls directly.  Use gimple_build API and avoid force_gimple_operand.
(insert_into_preds_of_block): Simplify.
(do_regular_insertion): Add comment.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 228467)
--- gcc/tree-ssa-pre.c  (working copy)
*** create_component_ref_by_pieces_1 (basic_
*** 2474,2515 
switch (currop->opcode)
  {
  case CALL_EXPR:
!   {
!   tree folded, sc = NULL_TREE;
!   unsigned int nargs = 0;
!   tree fn, *args;
!   if (TREE_CODE (currop->op0) == FUNCTION_DECL)
! fn = currop->op0;
!   else
! fn = find_or_generate_expression (block, currop->op0, stmts);
!   if (!fn)
! return NULL_TREE;
!   if (currop->op1)
! {
!   sc = find_or_generate_expression (block, currop->op1, stmts);
!   if (!sc)
! return NULL_TREE;
! }
!   args = XNEWVEC (tree, ref->operands.length () - 1);
!   while (*operand < ref->operands.length ())
! {
!   args[nargs] = create_component_ref_by_pieces_1 (block, ref,
!   operand, stmts);
!   if (!args[nargs])
! return NULL_TREE;
!   nargs++;
! }
!   folded = build_call_array (currop->type,
!  (TREE_CODE (fn) == FUNCTION_DECL
!   ? build_fold_addr_expr (fn) : fn),
!  nargs, args);
!   if (currop->with_bounds)
! CALL_WITH_BOUNDS_P (folded) = true;
!   free (args);
!   if (sc)
! CALL_EXPR_STATIC_CHAIN (folded) = sc;
!   return folded;
!   }
  
  case MEM_REF:
{
--- 2474,2480 
switch (currop->opcode)
  {
  case CALL_EXPR:
!   gcc_unreachable ();
  
  case MEM_REF:
{
*** create_expression_by_pieces (basic_block
*** 2798,2818 
  
switch (expr->kind)
  {
!   /* We may hit the NAME/CONSTANT case if we have to convert types
!that value numbering saw through.  */
  case NAME:
folded = PRE_EXPR_NAME (expr);
break;
  case CONSTANT:
!   folded = PRE_EXPR_CONSTANT (expr);
!   break;
! case REFERENCE:
!   {
!   vn_reference_t ref = PRE_EXPR_REFERENCE (expr);
!   folded = create_component_ref_by_pieces (block, ref, stmts);
!   if (!folded)
! return NULL_TREE;
}
break;
  case NARY:
{
--- 2763,2837 
  
switch (expr->kind)
  {
! /* We may hit the NAME/CONSTANT case if we have to convert types
!that value numbering saw through.  */
  case NAME:
folded = PRE_EXPR_NAME (expr);
+   if (useless_type_conversion_p (exprtype, TREE_TYPE (folded)))
+   return folded;
break;
  case CONSTANT:
!   { 
!   folded = PRE_EXPR_CONSTANT (expr);
!   tree tem = fold_convert (exprtype, folded);
!   if (is_gimple_min_invariant (tem))
! return tem;
!   break;
}
+ case REFERENCE:
+   if (PRE_EXPR_REFERENCE (expr)->operands[0].opcode == CALL_EXPR)
+   {
+ vn_reference_t ref = PRE_EXPR_REFERENCE (expr);
+ unsigned int operand = 1;
+ vn_reference_op_t currop = &ref->operands[0];
+ tree sc = NULL_TREE;
+ tree fn;
+ if (TREE_CODE (currop->op0) == FUNCTION_DECL)
+   fn = currop->op0;
+ else
+   fn = find_or_generate_expression (block, currop->op0, stmts);
+ if (!fn)
+   return NULL_TREE;
+ if (currop->op1)
+   {
+ sc = find_or_generate_expression (block, currop->op1, stmts);
+ if (!sc)
+   return NULL_TREE;
+   }
+ auto_vec args (ref->operands.length () - 1);
+ while (operand < ref->operands.length ())
+   {
+ tree arg = create_component_ref_by_pieces_1 (block, ref,
+  &operand, stmts);
+ if (!arg)
+   return NULL_TREE;
+ args.quick_push (arg);
+   }
+ gcall *call
+   = gimple_build_call_vec ((TREE_CODE (fn) == FUNCTION_DECL
+

Re: [PATCH] Fix PR67783, quadraticness in IPA inline analysis

2015-10-05 Thread Richard Biener

On Thu, 1 Oct 2015, Richard Biener wrote:

> 
> The following avoids quadraticness in the loop depth by only considering
> loop header defs as IVs for the analysis of the loop_stride predicate.
> This will miss cases like
> 
> foo (int inv)
> {
>  for (i = inv; i < n; ++i)
>   {
> int derived_iv = i + i * inv;
> ...
>   }
> }
> 
> but I doubt that's important in practice.  Another way would be to
> just consider the containing loop when analyzing the IV, thus iterate
> over outermost loop bodies only, replacing the
> 
>   simple_iv (loop, loop_containing_stmt (stmt), use, &iv, true)
> 
> check with
> 
>   simple_iv (loop_containing_stmt (stmt), loop_containing_stmt (stmt), 
> use, &iv, true);
> 
> but doing all this analysis for each stmt is already quite expensive,
> esp. as we are doing it for all uses instead of all defs ...
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> Honza, is this ok or did you do the current way on purpose (rather
> than for completeness as it was easy to do?)

Applied as r228472.
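
To make the missed case from the quoted description concrete: in the foo example, i is the loop-header IV, while derived_iv = i + i * inv advances by 1 + inv per iteration, which is loop-invariant but not a compile-time constant unless inv is known. A small sketch; derived_iv_step is a made-up helper, not GCC code:

```c
/* Made-up helper: the amount by which derived_iv = i + i * inv
   changes when going from iteration I to iteration I + 1.  */
static int
derived_iv_step (int inv, int i)
{
  int cur = i + i * inv;
  int next = (i + 1) + (i + 1) * inv;

  return next - cur;
}
```

The result does not depend on i, only on inv.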

Richard.

> Thanks,
> Richard.
> 
> 2015-10-01  Richard Biener  
> 
>   PR ipa/67783
>   * ipa-inline-analysis.c (estimate_function_body_sizes): Only
>   consider loop header PHI defs as IVs.
> 
> Index: gcc/ipa-inline-analysis.c
> ===
> *** gcc/ipa-inline-analysis.c (revision 228319)
> --- gcc/ipa-inline-analysis.c (working copy)
> *** estimate_function_body_sizes (struct cgr
> *** 2760,2768 
>   {
> vec exits;
> edge ex;
> !   unsigned int j, i;
> struct tree_niter_desc niter_desc;
> -   basic_block *body = get_loop_body (loop);
> bb_predicate = *(struct predicate *) loop->header->aux;
>   
> exits = get_loop_exit_edges (loop);
> --- 2760,2767 
>   {
> vec exits;
> edge ex;
> !   unsigned int j;
> struct tree_niter_desc niter_desc;
> bb_predicate = *(struct predicate *) loop->header->aux;
>   
> exits = get_loop_exit_edges (loop);
> *** estimate_function_body_sizes (struct cgr
> *** 2788,2833 
>   }
> exits.release ();
>   
> !   for (i = 0; i < loop->num_nodes; i++)
>   {
> !   gimple_stmt_iterator gsi;
> !   bb_predicate = *(struct predicate *) body[i]->aux;
> !   for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi);
> !gsi_next (&gsi))
> ! {
> !   gimple *stmt = gsi_stmt (gsi);
> !   affine_iv iv;
> !   ssa_op_iter iter;
> !   tree use;
> ! 
> !   FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
> !   {
> ! predicate will_be_nonconstant;
> ! 
> ! if (!simple_iv
> ! (loop, loop_containing_stmt (stmt), use, &iv, true)
> ! || is_gimple_min_invariant (iv.step))
> !   continue;
> ! will_be_nonconstant
> !   = will_be_nonconstant_expr_predicate (fbi.info, info,
> ! iv.step,
> ! nonconstant_names);
> ! if (!true_predicate_p (&will_be_nonconstant))
> !   will_be_nonconstant
> !  = and_predicates (info->conds,
> !&bb_predicate,
> !&will_be_nonconstant);
> ! if (!true_predicate_p (&will_be_nonconstant)
> ! && !false_predicate_p (&will_be_nonconstant))
> !   /* This is slightly inprecise.  We may want to represent
> !  each loop with independent predicate.  */
> !   loop_stride =
> ! and_predicates (info->conds, &loop_stride,
> ! &will_be_nonconstant);
> !   }
> ! }
>   }
> -   free (body);
>   }
> set_hint_predicate (&inline_summaries->get (node)->loop_iterations,
> loop_iterations);
> --- 2787,2818 
>   }
> exits.release ();
>   
> !   for (gphi_iterator gsi = gsi_start_phis (loop->header);
> !!gsi_end_p (gsi); gsi_next (&gsi))
>   {
> !   gphi *phi = gsi.phi ();
> !   tree use = gimple_phi_result (phi);
> !   affine_iv iv;
> !   predicate will_be_nonconstant;
> !   if (!virtual_operand_p (use)
> !   || !simple_iv (loop, loop, use, &iv, true)
> !   || is_gimple_min_invariant (iv.step))
> ! continue;
> !   will_be_nonconstant
> ! = will_be_nonconstant_expr_predicate (fbi.info, info,
> !   iv.step,
> !   nonconst

Re: [AArch64] [TLSIE][2/2] Implement TLS IE for tiny model

2015-10-05 Thread James Greenhalgh

Hi Jiong,

I was looking at another bug and in the process of auditing our code
spotted an issue with this patch from back in June...

On Fri, Jun 19, 2015 at 10:15:38AM +0100, Jiong Wang wrote:
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 8b061ba..be9da5b 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> +(define_insn "tlsie_tiny_sidi"
> +  [(set (match_operand:DI 0 "register_operand" "=&r")
> + (zero_extend:DI
> +  (unspec:SI [(match_operand 1 "aarch64_tls_ie_symref" "S")
> +   (match_operand:DI 2 "register_operand" "r")
> +   ]
> +   UNSPEC_GOTTINYTLS)))]
> +  ""
> +  "ldr\\t%w0, %L1\;add\\t%<w>0, %<w>0, %<w>2"

Here, you have no iterators, so the <w> will never be replaced. Consequently,
you are likely to hit an ICE if this pattern is ever used.

I presume you intended this to say

  "ldr\\t%w0, %L1\;add\\t%w0, %w0, %w2"

If so, consider that change preapproved.

Thanks,
James

> +  [(set_attr "type" "multiple")
> +   (set_attr "length" "8")]
> +)
> +
>  (define_expand "tlsle"
>[(set (match_operand 0 "register_operand" "=r")
>  (unspec [(match_operand 1 "register_operand" "r")
> diff --git a/gcc/testsuite/gcc.target/aarch64/tlsie_tiny.c 
> b/gcc/testsuite/gcc.target/aarch64/tlsie_tiny.c
> new file mode 100644
> index 000..8ac01b2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/tlsie_tiny.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftls-model=initial-exec -mcmodel=tiny" } */
> +
> +#include "tls.c"
> +
> +/* { dg-final { scan-assembler-times ":gottprel:" 2 } } */

Re: [PATCH] SH FDPIC backend support

2015-10-05 Thread Oleg Endo

On Sun, 2015-10-04 at 22:16 -0400, Rich Felker wrote:
> This is FDPIC-specific. Because there is fundamentally no way for a
> function to find its own GOT (it has one GOT for each process using
> the code containing the function), its GOT address has to be a
> (hidden) argument to the function which arrives in r12.
> 
> For calls via the PLT, r12 contains the PLT entry's (i.e. the calling
> module's) GOT pointer at the time of the call, and the PLT thunk
> replaces it with the callee's GOT pointer (loaded from the function
> descriptor) before jumping to the callee code. There is fundamentally
> nowhere the PLT thunk could store the old value of r12 and arrange for
> it to be restored at return time, so using a PLT forces r12 to be
> call-clobbered.
> 
> (Note that in the special case where the PLT is bypassed because the
> callee is defined in the same module and bound at link-time, the GOT
> value loaded by the caller is the right GOT value for the callee
> automatically.)
> 
> If we didn't care about being able to do PLT calls, there's no
> fundamental reason r12 has to be call-clobbered, but it still makes a
> lot more sense. Getting back the value of r12 you passed when making a
> function call is rarely useful except in the case where the caller
> knows the function is defined in the same module (so it can keep using
> r12 as its own GOT pointer after the call).
> 
> BTW the reason I'm spending time explaining this now is that it's
> something we should optimize after the FDPIC patch goes in: I think
> the r12-related spills/reload could be made a lot more efficient.

This will be a separate point then, after the initial FDPIC stuff is in.
Maybe also related:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12306
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54019
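
For readers new to FDPIC, the calling scheme described in the quoted text can be modelled roughly in plain C. This is only a sketch under stated assumptions: the struct layouts and names are hypothetical, and the explicit got parameter stands in for the hidden r12 argument.

```c
/* Hypothetical model of a per-module GOT: here just one data slot.  */
struct got
{
  int module_data;
};

/* Hypothetical function descriptor: the code address plus the GOT
   that the code must run with.  */
struct func_desc
{
  int (*entry) (const struct got *got, int arg);
  const struct got *got;
};

/* The callee cannot discover its own GOT, so it receives it as a
   parameter (standing in for the hidden r12 argument).  */
static int
add_module_data (const struct got *got, int arg)
{
  return got->module_data + arg;
}

/* Models the PLT thunk: load the callee's GOT from the descriptor,
   pass it in place of the caller's GOT, and jump to the code.  */
static int
call_via_descriptor (const struct func_desc *fd, int arg)
{
  return fd->entry (fd->got, arg);
}
```

The same code address paired with two different GOTs gives two different results, which is why the GOT has to travel with the call.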

Cheers,
Oleg

Re: [PATCH] Remove restriction for remote testing

2015-10-05 Thread James Norris


Ping.

On 09/28/2015 07:35 AM, James Norris wrote:

Hi,

The attached patch fixes a problem when doing remote testing.
Specifically, testing of the atomic tests found in gcc/atomic.
The code in atomic_init precludes the setting of the variable
'link_flags' when doing remote testing. The conditional test
can be safely removed as get_multilibs will return "", and
atomic_link_flags will return the necessary '-latomic' that
will allow the atomic tests to successfully link.

OK for trunk?

Thanks,
Jim

[PATCH 2/2][ARC] Add support for ARCv2 CPUs

2015-10-05 Thread Claudiu Zissulescu

I just realized this patch did not go through to the mailing list, so I am reposting it.

This patch adds basic support (libgcc) for Synopsys' ARCv2 CPUs. 

Can this be committed?

Thanks,
Claudiu

ChangeLog:
2015-08-28  Claudiu Zissulescu  

* config/arc/dp-hack.h: Add support for ARCHS.
* config/arc/ieee-754/divdf3.S: Likewise.
* config/arc/ieee-754/divsf3-stdmul.S: Likewise.
* config/arc/ieee-754/muldf3.S: Likewise.
* config/arc/ieee-754/mulsf3.S: Likewise.
* config/arc/lib1funcs.S: Likewise.
* config/arc/gmon/dcache_linesz.S: Don't read the build register
for ARCv2 cores.
* config/arc/gmon/profil.S (__profil, __profil_irq): Don't profile
for ARCv2 cores.
* config/arc/ieee-754/arc-ieee-754.h (MPYHU, MPYH): Define.
* config/arc/t-arc700-uClibc: Remove hard selection for ARC 700
cores.


02-arcv2.patch
Description: 02-arcv2.patch

Re: [PR other/65021] mkoffloads -save-temps handling, and cleanup cleanup

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 12:04 PM, Thomas Schwinge wrote:

In a similar vein to the earlier patch to "Pass on the verbose flag "-v"
to/in the mkoffloads", here is a patch to make the mkoffloads handle
"-save-temps", and this patch also happens to address
, "nvptx mkoffload doesn't clean up its
temporary files".  OK for trunk?


The patch is ok.


  static void
-mkoffload_atexit (void)
+mkoffload_cleanup (void)
  {
tool_cleanup (false);
  }


Don't quite see the need for this change, but don't feel strongly enough 
about it to make you resubmit.



Bernd

Re: [PATCH] Remove restriction for remote testing

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 02:00 PM, James Norris wrote:

Ping.


As I said previously, I think appending "-latomic" unconditionally in 
atomic_init is probably a better solution because I'm not convinced that 
the things atomic_link_flags does are appropriate in a remote host 
situation.



Bernd

Re: Do not use TYPE_CANONICAL in useless_type_conversion

2015-10-05 Thread Bernd Schmidt


+  /* For aggregates compare only the size and mode.  Accesses to fields do have
+ a type information by themselves and thus we only care if we can i.e.
+ use the types in move operations.  */
else if (AGGREGATE_TYPE_P (inner_type)
   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
-return false;
+return (!TYPE_SIZE (outer_type)
+   || (TYPE_SIZE (inner_type)
+   && operand_equal_p (TYPE_SIZE (inner_type),
+   TYPE_SIZE (outer_type), 0)));
+
+  else if (TREE_CODE (inner_type) == OFFSET_TYPE
+  && TREE_CODE (inner_type) == TREE_CODE (outer_type))
+return useless_type_conversion_p (TREE_TYPE (outer_type),
+ TREE_TYPE (inner_type))
+  && useless_type_conversion_p
+   (TYPE_OFFSET_BASETYPE (outer_type),
+TYPE_OFFSET_BASETYPE (inner_type));



The comment says the mode is compared, but I don't see that in the code. 
Which is right?


Also, wouldn't the final condition be clearer written as

> +  else if (TREE_CODE (inner_type) == OFFSET_TYPE
> + && TREE_CODE (outer_type) == OFFSET_TYPE)


Bernd

Re: [PATCH] Unswitching outer loops.

2015-10-05 Thread Yuri Rumyantsev

Thanks Richard.
I'd like to respond to your last comment, about using the exit edge
argument for the edge that skips the loop.
Let's consider the following test case:

#include 
#define N 32
float *foo(int ustride, int size, float *src)
{
   float *buffer, *p;
   int i, k;

   if (!src)
return NULL;

   buffer = (float *) malloc(N * size * sizeof(float));

   if(buffer)
  for(i=0, p=buffer; i:
  # _6 = PHI <0B(8), buffer_20(16)>
  return _6;

It is clear that we must preserve the function's semantics and transform it to
_6 = PHI <0B(12), buffer_19(9), buffer_19(4)>


2015-10-05 13:57 GMT+03:00 Richard Biener :
> On Wed, Sep 30, 2015 at 12:46 PM, Yuri Rumyantsev  wrote:
>> Hi Richard,
>>
>> I re-designed outer loop unswitching using basic idea of 23855 patch -
>> hoist invariant guard if loop is empty without guard. Note that this
>> was added to loop unswitching pass with simple modifications - using
>> another loop iterator etc.
>>
>> Bootstrap and regression testing did not show any new failures.
>> What is your opinion?
>
> Overall it looks good.  Some comments below - a few more testcases would
> be nice as well.
>
> +  /* Loop must not be infinite.  */
> +  if (!finite_loop_p (loop))
> +return false;
>
> why's that?
>
> +  body = get_loop_body_in_dom_order (loop);
> +  for (i = 0; i < loop->num_nodes; i++)
> +{
> +  if (body[i]->loop_father != loop)
> +   continue;
> +  if (!empty_bb_without_guard_p (loop, body[i]))
>
> I wonder if there is a better way to iterate over the interesting
> blocks and PHIs
> we need to check for side-effects (and thus we maybe can avoid gathering
> the loop in DOM order).
>
> +  FOR_EACH_SSA_TREE_OPERAND (name, stmt, op_iter, SSA_OP_DEF)
> +   {
> + if (may_be_used_outside
>
> may_be_used_outside can be hoisted above the loop.  I wonder if we can take
> advantage of loop-closed SSA form here (and the fact we have a single exit
> from the loop).  Iterating over exit dest PHIs and determining whether the
> exit edge DEF is inside the loop part it may not be should be enough.
>
> +  gcc_assert (single_succ_p (pre_header));
>
> that should be always true.
>
> +  gsi_remove (&gsi, false);
> +  bb = guard->dest;
> +  remove_edge (guard);
> +  /* Update dominance for destination of GUARD.  */
> +  if (EDGE_COUNT (bb->preds) == 0)
> +{
> +  basic_block s_bb;
> +  gcc_assert (single_succ_p (bb));
> +  s_bb = single_succ (bb);
> +  delete_basic_block (bb);
> +  if (single_pred_p (s_bb))
> +   set_immediate_dominator (CDI_DOMINATORS, s_bb, single_pred (s_bb));
>
> all this massaging should be simplified by leaving it to CFG cleanup by
> simply adjusting the CONDs condition to always true/false.  There is
> gimple_cond_make_{true,false} () for this (would be nice to have a variant
> taking a bool).
>
> +  new_edge = make_edge (pre_header, exit->dest, flags);
> +  if (fix_dom_of_exit)
> +set_immediate_dominator (CDI_DOMINATORS, exit->dest, pre_header);
> +  update_stmt (gsi_stmt (gsi));
>
> the update_stmt should be not necessary, it's done by gsi_insert_after 
> already.
>
> +  /* Add NEW_ADGE argument for all phi in post-header block.  */
> +  bb = exit->dest;
> +  for (gphi_iterator gsi = gsi_start_phis (bb);
> +   !gsi_end_p (gsi); gsi_next (&gsi))
> +{
> +  gphi *phi = gsi.phi ();
> +  /* edge_iterator ei; */
> +  tree arg;
> +  if (virtual_operand_p (gimple_phi_result (phi)))
> +   {
> + arg = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
> + add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
> +   }
> +  else
> +   {
> + /* Use exit edge argument.  */
> + arg = PHI_ARG_DEF_FROM_EDGE (phi, exit);
> + add_phi_arg (phi, arg, new_edge, UNKNOWN_LOCATION);
>
> Hum.  How is it ok to use the exit edge argument for the edge that skips
> the loop?  Why can't you always use the pre-header edge value?
> That is, if we have
>
>  for(i=0;i{
>  if (n > 0)
> {
>  for (;;)
>{
>}
>  }
>}
>   ... = i;
>
> then i is used after the loop and the correct value to use if
> n > 0 is false is '0'.  Maybe this way we can also relax
> what check_exit_phi does?  IMHO the only restriction is
> if sth defined inside the loop before the header check for
> the inner loop is used after the loop.
>
> Thanks,
> Richard.
>
>> Thanks.
>>
>> ChangeLog:
>> 2015-09-30  Yuri Rumyantsev  
>>
>> * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and
>> "cfghooks.h", add prototypes for introduced new functions.
>> (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all
>> checks on ability of loop unswitching to tree_unswitch_single_loop;
>> invoke tree_unswitch_single_loop or tree_unswitch_outer_loop depending
>> on innermost loop check.
>> (tree_unswitch_single_loop): Add all required checks on ability of
>> loop unswitching under zero recursive level guard.
>> (tree_unswitch_outer_loop): New function.
>> (fin

Re: [AArch64] [TLSIE][2/2] Implement TLS IE for tiny model

2015-10-05 Thread Jiong Wang


James Greenhalgh writes:

> Hi Jiong,
>
> I was looking at another bug and in the process of auditing our code
> spotted an issue with this patch from back in June...
>
> On Fri, Jun 19, 2015 at 10:15:38AM +0100, Jiong Wang wrote:
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index 8b061ba..be9da5b 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> +(define_insn "tlsie_tiny_sidi"
>> +  [(set (match_operand:DI 0 "register_operand" "=&r")
>> +(zero_extend:DI
>> +  (unspec:SI [(match_operand 1 "aarch64_tls_ie_symref" "S")
>> +  (match_operand:DI 2 "register_operand" "r")
>> +  ]
>> +  UNSPEC_GOTTINYTLS)))]
>> +  ""
>> +  "ldr\\t%w0, %L1\;add\\t%<w>0, %<w>0, %<w>2"
>
> Here, you have no iterators, so the <w> will never be replaced. Consequently,
> you are likely to hit an ICE if this pattern is ever used.
>
> I presume you intended this to say
>
>   "ldr\\t%w0, %L1\;add\\t%w0, %w0, %w2"
>
> If so, consider that change preapproved.

Thanks, yes, it's a hidden bug which might be triggered under ILP32 mode
only.

Committed the patch below after bootstrap and the tls* regression testcases passed.


2015-10-05 James Greenhalgh 
   Jiong Wang  

gcc/
  * config/aarch64/aarch64.md (tlsie_tiny_sidi): Replace "<w>" with "w".

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 74522f8..208f58f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4738,7 +4738,7 @@
 		  ]
 		  UNSPEC_GOTTINYTLS)))]
   ""
-  "ldr\\t%w0, %L1\;add\\t%<w>0, %<w>0, %<w>2"
+  "ldr\\t%w0, %L1\;add\\t%w0, %w0, %w2"
   [(set_attr "type" "multiple")
(set_attr "length" "8")]
 )

Re: C PATCH for c/65345 (file-scope _Atomic expansion with floats)

2015-10-05 Thread Marek Polacek

On Sat, Oct 03, 2015 at 08:58:19PM -0400, David Edelsohn wrote:
> The bug was not x86-specific.  The fix happens to be in
> target-specific code, but that's the luck of the draw.  Numerous other
> GCC developers have fixed bugs or added features that required tweaks
> to all ports.  Not all targets are easily accessible and you certainly
> can ask port maintainers to test a patch.  But writing that you don't
> have the cycles to fix all of the targets is not a collegial answer.
> Why do you believe that target maintainers have more cycles than you?
> I didn't see you tell Uros to fix the bug on x86.  The approach may
> work for your one specific bug, but it does not scale if every GCC
> developer pursues the same process.

I now sort of regret that I ever took up this PR.

One would think that bringing a bug into the open and fixing at least
one target is preferable to keeping the bug unfixed, but apparently
you think otherwise.

> It's poor form to fix a bug only on x86 that is common to all targets
> and leave the problem "as an exercise for the reader" for all other
> targets -- essentially banishing the other targets to second-class
> status with respect to conformance -- especially when the change is
> mostly mechanical.  I don't expect developers to specifically enable
> and exploit every new feature on every architecture, but had expected
> bug fixes to be distributed to all targets.  "It's just not cricket."

The change is not really purely mechanical; if it was, I'd provide
patches for every target.  Yes, it's the same in principle for all targets,
but it needs a bit of debugging anyway -- and I can't debug e.g. MIPS.
Also, I'm not sure if TARGET_EXPRs can be used on every target.

> GCC has thrived for over 25 years -- supporting a huge number of
> targets and languages -- through a general sense of cooperation and
> collaboration to ensure the success of the entire project.  If this is
> going to degrade into a more parochial attitude, then maybe GCC will
> need an explicit policy to counteract that mindset.

Again, this wasn't a regression.  If I commit some middle end change that
breaks a target, sure, I should fix it, that is clear (and I've done so
in the past).

And note that I was following Joseph's suggestion (which makes sense to me).

> I am testing the attached patch for PPC.

I'd be surprised if this worked as-is, because you also need to change (some
of) COMPOUND_EXPRs to TARGET_EXPRs, but I don't know which one in particular,
because that's not clear to me by just looking at the code.

Marek

[PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Marek Polacek

Here, we were crashing on an assert in duplicate_ssa_name_range_info:

506   gcc_assert (!SSA_NAME_ANTI_RANGE_P (name));

The problem is that reset_flow_sensitive_info wasn't clearing the
SSA_NAME_ANTI_RANGE_P flag; I don't think NULL SSA_NAME_RANGE_INFO
can ever describe an anti-range...

Bootstrapped/regtested on x86_64-linux, ok for trunk/5?

2015-10-05  Marek Polacek  

PR tree-optimization/67821
* tree-ssanames.c (reset_flow_sensitive_info): Also clear
the SSA_NAME_ANTI_RANGE_P flag.

* gcc.dg/torture/pr67821-2.c: New test.
* gcc.dg/torture/pr67821.c: New test.

diff --git gcc/testsuite/gcc.dg/torture/pr67821-2.c 
gcc/testsuite/gcc.dg/torture/pr67821-2.c
index e69de29..38cfc84 100644
--- gcc/testsuite/gcc.dg/torture/pr67821-2.c
+++ gcc/testsuite/gcc.dg/torture/pr67821-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e, g;
+short f;
+
+void
+fn1 ()
+{
+  int i;
+  f = a - b;
+  e = (c && (i = d = (unsigned) f - 1)) || i;
+  g = (unsigned) f - 1;
+  c && (d = 0);
+}
diff --git gcc/testsuite/gcc.dg/torture/pr67821.c 
gcc/testsuite/gcc.dg/torture/pr67821.c
index e69de29..1c9e8b9 100644
--- gcc/testsuite/gcc.dg/torture/pr67821.c
+++ gcc/testsuite/gcc.dg/torture/pr67821.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+int isdigit (int);
+
+int
+foo (const char *s)
+{
+  int success = 1;
+  const char *p = s + 2;
+  if (!isdigit (*p))
+success = 0;
+  while (isdigit (*p))
+++p;
+  return success;
+}
diff --git gcc/tree-ssanames.c gcc/tree-ssanames.c
index 64e2379..c3484fe 100644
--- gcc/tree-ssanames.c
+++ gcc/tree-ssanames.c
@@ -561,7 +561,10 @@ reset_flow_sensitive_info (tree name)
mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (name));
 }
   else
-SSA_NAME_RANGE_INFO (name) = NULL;
+{
+  SSA_NAME_RANGE_INFO (name) = NULL;
+  SSA_NAME_ANTI_RANGE_P (name) = 0;
+}
 }
 
 /* Clear all flow sensitive data from all statements and PHI definitions

Marek

Re: [patch 0/3] Header file reduction.

2015-10-05 Thread Bernd Schmidt


On 10/02/2015 04:22 AM, Andrew MacLeod wrote:

> The patches are generated by a pair of tools.
> * gcc-order-includes goes through the headers and canonically reorders
> some of our more common/troublesome headers and removes any duplicates.
> This includes headers which are included by other headers. (ie, obstack.h
> can be removed as a duplicate if bitmap.h is included already.)
> * remove-includes is the tool which tries to remove each non-conditional
> header file and does the real work.


Is the bitmap/obstack example really one of a change that is desirable? 
I think if a file uses obstacks then an include of obstack.h is 
perfectly fine, giving us freedom to e.g. change bitmaps not to use 
obstacks. Given that multiple headers include obstack.h, and pretty much 
everything seems to indirectly include bitmap.h anyway, maybe a better 
change would be to just include it always in system.h.



> I'll have a patch shortly to add these and some other useful tools to a
> header-tools directory in contrib.


How soon? It's difficult to meaningfully comment on these patches 
without looking at how they were generated. Two points:

 * diff -c is somewhat unusual and I find diff -u much more readable.
 * Maybe the patches for reordering and removing should be split, also
   for readability and for easier future identification of problems.


Bernd

Re: [PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Richard Biener

On Mon, 5 Oct 2015, Marek Polacek wrote:

> Here, we were crashing on an assert in duplicate_ssa_name_range_info:
> 
> 506   gcc_assert (!SSA_NAME_ANTI_RANGE_P (name));
> 
> The problem is that reset_flow_sensitive_info wasn't clearing the
> SSA_NAME_ANTI_RANGE_P flag; I don't think NULL SSA_NAME_RANGE_INFO
> can ever describe an anti-range...

Yeah, also not a range.  Thus I think the assert is bogus (aka
superfluous) if !SSA_NAME_RANGE_INFO (name) holds.

Otherwise other setters of SSA_NAME_RANGE_INFO would need to make
sure SSA_NAME_ANTI_RANGE_P is cleared as well.
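
A toy model of the argument, as a sketch only. The struct and function names below are hypothetical stand-ins, not the real SSA_NAME accessors: the flag lives next to the range pointer, it only means something while the pointer is non-null, and the setter overwrites it whenever it installs new range info. Under those assumptions a stale flag after a reset is harmless, which is why asserting on it is superfluous.

```cpp
#include <cassert>

// Hypothetical stand-ins for the range info record and the SSA name.
struct range_info { long min, max; };

struct ssa_name
{
  range_info *info = nullptr;   // models SSA_NAME_RANGE_INFO
  bool anti_range_p = false;    // models SSA_NAME_ANTI_RANGE_P
};

// Reset that clears only the pointer and leaves the flag alone.
void reset_range(ssa_name &n)
{
  n.info = nullptr;
}

// Reader that consults the flag only when range info is present.
bool is_anti_range(const ssa_name &n)
{
  return n.info != nullptr && n.anti_range_p;
}

// Setter that installs range info and overwrites the flag, so whatever
// value the flag held before does not matter.
void set_range(ssa_name &n, range_info *info, bool anti)
{
  assert(n.info == nullptr);    // the only precondition that is needed
  n.info = info;
  n.anti_range_p = anti;
}
```

With this shape, a name that went through reset_range still has anti_range_p set, yet is_anti_range reports false and a later set_range replaces the flag.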

> Bootstrapped/regtested on x86_64-linux, ok for trunk/5?

So - can you instead remove the assert?

Thanks,
Richard.

> 2015-10-05  Marek Polacek  
> 
>   PR tree-optimization/67821
>   * tree-ssanames.c (reset_flow_sensitive_info): Also clear
>   the SSA_NAME_ANTI_RANGE_P flag.
> 
>   * gcc.dg/torture/pr67821-2.c: New test.
>   * gcc.dg/torture/pr67821.c: New test.
> 
> diff --git gcc/testsuite/gcc.dg/torture/pr67821-2.c 
> gcc/testsuite/gcc.dg/torture/pr67821-2.c
> index e69de29..38cfc84 100644
> --- gcc/testsuite/gcc.dg/torture/pr67821-2.c
> +++ gcc/testsuite/gcc.dg/torture/pr67821-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +
> +int a, b, c, d, e, g;
> +short f;
> +
> +void
> +fn1 ()
> +{
> +  int i;
> +  f = a - b;
> +  e = (c && (i = d = (unsigned) f - 1)) || i;
> +  g = (unsigned) f - 1;
> +  c && (d = 0);
> +}
> diff --git gcc/testsuite/gcc.dg/torture/pr67821.c 
> gcc/testsuite/gcc.dg/torture/pr67821.c
> index e69de29..1c9e8b9 100644
> --- gcc/testsuite/gcc.dg/torture/pr67821.c
> +++ gcc/testsuite/gcc.dg/torture/pr67821.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +
> +int isdigit (int);
> +
> +int
> +foo (const char *s)
> +{
> +  int success = 1;
> +  const char *p = s + 2;
> +  if (!isdigit (*p))
> +success = 0;
> +  while (isdigit (*p))
> +++p;
> +  return success;
> +}
> diff --git gcc/tree-ssanames.c gcc/tree-ssanames.c
> index 64e2379..c3484fe 100644
> --- gcc/tree-ssanames.c
> +++ gcc/tree-ssanames.c
> @@ -561,7 +561,10 @@ reset_flow_sensitive_info (tree name)
>   mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (name));
>  }
>else
> -SSA_NAME_RANGE_INFO (name) = NULL;
> +{
> +  SSA_NAME_RANGE_INFO (name) = NULL;
> +  SSA_NAME_ANTI_RANGE_P (name) = 0;
> +}
>  }
>  
>  /* Clear all flow sensitive data from all statements and PHI definitions
> 
>   Marek
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [patch 0/3] Header file reduction.

2015-10-05 Thread Richard Biener

On Mon, Oct 5, 2015 at 3:27 PM, Bernd Schmidt  wrote:
> On 10/02/2015 04:22 AM, Andrew MacLeod wrote:
>>
>> The patches are generated by a pair of tools.
>> * gcc-order-includes goes through the headers and canonically reorders
>> some of our more common/troublesome headers and removes any duplicates.
>> This includes headers which are included by other headers. (ie, obstack.h
>> can be removed as a duplicate if bitmap.h is included already.)
>> * remove-includes is the tool which tries to remove each non-conditional
>> header file and does the real work.
>
>
> Is the bitmap/obstack example really one of a change that is desirable? I
> think if a file uses obstacks then an include of obstack.h is perfectly
> fine, giving us freedom to e.g. change bitmaps not to use obstacks. Given
> that multiple headers include obstack.h, and pretty much everything seems to
> indirectly include bitmap.h anyway, maybe a better change would be to just
> include it always in system.h.

Not system.h please - use coretypes.h if really necessary.

Richard.

>> I'll have a patch shortly to add these and some other useful tools to a
>> header-tools directory in contrib.
>
>
> How soon? It's difficult to meaningfully comment on these patches without
> looking at how they were generated. Two points:
>  * diff -c is somewhat unusual and I find diff -u much more readable.
>  * Maybe the patches for reordering and removing should be split, also
>for readability and for easier future identification of problems.
>
>
> Bernd
>

Re: [PR other/65021] mkoffloads -save-temps handling, and cleanup cleanup

2015-10-05 Thread Thomas Schwinge

Hi Bernd!

On Mon, 5 Oct 2015 14:12:06 +0200, Bernd Schmidt  wrote:
> On 10/05/2015 12:04 PM, Thomas Schwinge wrote:
> > In a similar vein to the earlier patch to "Pass on the verbose flag "-v"
> > to/in the mkoffloads", here is a patch to make the mkoffloads handle
> > "-save-temps", and this patch also happens to address
> > , "nvptx mkoffload doesn't clean up its
> > temporary files".  OK for trunk?
> 
> The patch is ok.

Thanks for the prompt review!

> >   static void
> > -mkoffload_atexit (void)
> > +mkoffload_cleanup (void)
> >   {
> > tool_cleanup (false);
> >   }
> 
> Don't quite see the need for this change, but don't feel strongly enough 
> about it to make you resubmit.

It's for uniformity, to make it easy for the reader: that's what the
other users of gcc/collect-utils.c are doing.  Oh, actually only
gcc/lto-wrapper.c; but gcc/collect2.c doesn't...

Maybe some more refactoring could be done here, possibly also to remove
duplicated code amongst users of gcc/collect-utils.c as well as in the
mkoffloads.
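
For readers who have not looked at these tools, the cleanup pattern being discussed can be sketched roughly as follows. This is a simplified illustration and not the committed code: the bool return value of maybe_unlink, the std::vector of temp files and install_cleanup_handler are assumptions made so the sketch is self-contained, and error handling is left out.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

bool save_temps = false;              // set when "-save-temps" is parsed
bool verbose = false;                 // set when "-v" is parsed
std::vector<std::string> temp_files;  // files created by the tool

// Remove FILE unless requested otherwise.  Returns true if the file was
// removed (the return value is an addition for this sketch).
bool maybe_unlink(const char *file)
{
  if (save_temps)
    {
      if (verbose)
        std::fprintf(stderr, "[Leaving %s]\n", file);
      return false;
    }
  return std::remove(file) == 0;
}

// Delete temp files; shared by the normal-exit and signal paths.
void tool_cleanup(bool /* from_signal */)
{
  for (const std::string &f : temp_files)
    maybe_unlink(f.c_str());
}

// atexit handlers take no arguments, hence the thin wrapper.
static void mkoffload_cleanup(void)
{
  tool_cleanup(false);
}

// Register the wrapper once, early in main.
bool install_cleanup_handler()
{
  return std::atexit(mkoffload_cleanup) == 0;
}
```

The wrapper exists only because atexit cannot pass the from_signal argument, so the name is a matter of consistency between the tools.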

Anyway, I committed my patch without modifications to trunk in r228488.

commit 558e6810f0a18b67eb8474bd86db23ab7de4f2fe
Author: tschwinge 
Date:   Mon Oct 5 14:07:50 2015 +

[PR other/65021] mkoffloads -save-temps handling, and cleanup cleanup

gcc/
PR other/65021
* config/i386/intelmic-mkoffload.c (mkoffload_atexit): Rename
function to...
(mkoffload_cleanup): ... this.  Adjust all users.
(maybe_unlink): Look at save_temps and verbose flags instead of
debug flag.
(main): Parse "-save-temps" flag.
(generate_target_descr_file, generate_target_offloadend_file)
(generate_host_descr_file, prepare_target_image): Pass it on.
* config/nvptx/mkoffload.c (tool_cleanup): Implement.
(mkoffload_cleanup): New function.
(maybe_unlink): Look at save_temps and verbose flags instead of
debug flag.
(main): Instead of calling utils_cleanup, register atexit handler
for mkoffload_cleanup.
(main): Parse "-save-temps" flag.
(compile_native, main): Pass it on.
* lto-wrapper.c (compile_offload_image): Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@228488 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog| 23 +-
 gcc/config/i386/intelmic-mkoffload.c | 30 ++---
 gcc/config/nvptx/mkoffload.c | 37 ++--
 gcc/lto-wrapper.c|  2 ++
 4 files changed, 74 insertions(+), 18 deletions(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index e665b6b..5340f47 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,24 @@
+2015-10-05  Thomas Schwinge  
+
+   PR other/65021
+   * config/i386/intelmic-mkoffload.c (mkoffload_atexit): Rename
+   function to...
+   (mkoffload_cleanup): ... this.  Adjust all users.
+   (maybe_unlink): Look at save_temps and verbose flags instead of
+   debug flag.
+   (main): Parse "-save-temps" flag.
+   (generate_target_descr_file, generate_target_offloadend_file)
+   (generate_host_descr_file, prepare_target_image): Pass it on.
+   * config/nvptx/mkoffload.c (tool_cleanup): Implement.
+   (mkoffload_cleanup): New function.
+   (maybe_unlink): Look at save_temps and verbose flags instead of
+   debug flag.
+   (main): Instead of calling utils_cleanup, register atexit handler
+   for mkoffload_cleanup.
+   (main): Parse "-save-temps" flag.
+   (compile_native, main): Pass it on.
+   * lto-wrapper.c (compile_offload_image): Likewise.
+
 2015-10-05  Trevor Saunders  
 
* gimple.h (gimple_op_ptr): Require a non const gimple *.
@@ -763,7 +784,7 @@
 
* config/i386/intelmic-mkoffload.c (main): Parse "-v" flag.
(generate_target_descr_file, generate_target_offloadend_file)
-   (generate_host_descr_file, prepare_target_image, main): Pass it on.
+   (generate_host_descr_file, prepare_target_image): Pass it on.
* config/nvptx/mkoffload.c (main): Parse "-v" flag.
(compile_native, main): Pass it on.
* lto-wrapper.c (compile_offload_image): Likewise.
diff --git gcc/config/i386/intelmic-mkoffload.c 
gcc/config/i386/intelmic-mkoffload.c
index 14f3fb3..828b415 100644
--- gcc/config/i386/intelmic-mkoffload.c
+++ gcc/config/i386/intelmic-mkoffload.c
@@ -45,6 +45,7 @@ const char *temp_files[MAX_NUM_TEMPS];
 enum offload_abi offload_abi = OFFLOAD_ABI_UNSET;
 
 /* Delete tempfiles and exit function.  */
+
 void
 tool_cleanup (bool from_signal ATTRIBUTE_UNUSED)
 {
@@ -53,19 +54,24 @@ tool_cleanup (bool from_signal ATTRIBUTE_UNUSED)
 }
 
 static void
-mkoffload_atexit (void)
+mkoffload_cleanup (void)
 {
   tool_cleanup (false);
 }
 
-/* Unlink FILE unless we are debugging.  */
+/* Unlink FILE unless requested otherwise.  */
+
 void
 maybe_unlink (const char *file)
 {
-

Re: [PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Marek Polacek

On Mon, Oct 05, 2015 at 04:08:05PM +0200, Richard Biener wrote:
> On Mon, 5 Oct 2015, Marek Polacek wrote:
> 
> > Here, we were crashing on an assert in duplicate_ssa_name_range_info:
> > 
> > 506   gcc_assert (!SSA_NAME_ANTI_RANGE_P (name));
> > 
> > The problem is that reset_flow_sensitive_info wasn't clearing the
> > SSA_NAME_ANTI_RANGE_P flag; I don't think NULL SSA_NAME_RANGE_INFO
> > can ever describe an anti-range...
> 
> Yeah, also not a range.  Thus I think the assert is bogus (aka
> superfluous) if !SSA_NAME_RANGE_INFO (name) holds.

Hm, true, and duplicate_ssa_name_range_info unconditionally sets
SSA_NAME_ANTI_RANGE_P... 

> Otherwise other setters of SSA_NAME_RANGE_INFO would need to make
> sure SSA_NAME_ANTI_RANGE_P is cleared as well.
 
They mostly do, if I'm looking right, e.g.
tree-ssa-phiopt.c:1016
tree-ssa-loop-im.c:1224
tree-ssa-loop-im.c:1294
and I've also seen
.

> So - can you instead remove the assert?

Sure.  So is the following ok once it passes the usual testing?

2015-10-05  Marek Polacek  

PR tree-optimization/67821
* tree-ssanames.c (duplicate_ssa_name_range_info): Remove an assert.

* gcc.dg/torture/pr67821-2.c: New test.
* gcc.dg/torture/pr67821.c: New test.

diff --git gcc/testsuite/gcc.dg/torture/pr67821-2.c 
gcc/testsuite/gcc.dg/torture/pr67821-2.c
index e69de29..38cfc84 100644
--- gcc/testsuite/gcc.dg/torture/pr67821-2.c
+++ gcc/testsuite/gcc.dg/torture/pr67821-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e, g;
+short f;
+
+void
+fn1 ()
+{
+  int i;
+  f = a - b;
+  e = (c && (i = d = (unsigned) f - 1)) || i;
+  g = (unsigned) f - 1;
+  c && (d = 0);
+}
diff --git gcc/testsuite/gcc.dg/torture/pr67821.c 
gcc/testsuite/gcc.dg/torture/pr67821.c
index e69de29..1c9e8b9 100644
--- gcc/testsuite/gcc.dg/torture/pr67821.c
+++ gcc/testsuite/gcc.dg/torture/pr67821.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+int isdigit (int);
+
+int
+foo (const char *s)
+{
+  int success = 1;
+  const char *p = s + 2;
+  if (!isdigit (*p))
+success = 0;
+  while (isdigit (*p))
+++p;
+  return success;
+}
diff --git gcc/tree-ssanames.c gcc/tree-ssanames.c
index 64e2379..91f4ed8 100644
--- gcc/tree-ssanames.c
+++ gcc/tree-ssanames.c
@@ -503,7 +503,6 @@ duplicate_ssa_name_range_info (tree name, enum 
value_range_type range_type,
 
   gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
   gcc_assert (!SSA_NAME_RANGE_INFO (name));
-  gcc_assert (!SSA_NAME_ANTI_RANGE_P (name));
 
   if (!range_info)
 return;

Marek

Re: [PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Richard Biener

On Mon, 5 Oct 2015, Marek Polacek wrote:

> On Mon, Oct 05, 2015 at 04:08:05PM +0200, Richard Biener wrote:
> > On Mon, 5 Oct 2015, Marek Polacek wrote:
> > 
> > > Here, we were crashing on an assert in duplicate_ssa_name_range_info:
> > > 
> > > 506   gcc_assert (!SSA_NAME_ANTI_RANGE_P (name));
> > > 
> > > The problem is that reset_flow_sensitive_info wasn't clearing the
> > > SSA_NAME_ANTI_RANGE_P flag; I don't think NULL SSA_NAME_RANGE_INFO
> > > can ever describe an anti-range...
> > 
> > Yeah, also not a range.  Thus I think the assert is bogus (aka
> > superfluous) if !SSA_NAME_RANGE_INFO (name) holds.
> 
> Hm, true, and duplicate_ssa_name_range_info unconditionally sets
> SSA_NAME_ANTI_RANGE_P... 
> 
> > Otherwise other setters of SSA_NAME_RANGE_INFO would need to make
> > sure SSA_NAME_ANTI_RANGE_P is cleared as well.
>  
> They mostly do, if I'm looking right, e.g.
> tree-ssa-phiopt.c:1016
> tree-ssa-loop-im.c:1224
> tree-ssa-loop-im.c:1294
> and I've also seen
> .

Those would be redundant then.

> > So - can you instead remove the assert?
> 
> Sure.  So is the following ok once it passes the usual testing?

Yes.

Thanks,
Richard.

> 2015-10-05  Marek Polacek  
> 
>   PR tree-optimization/67821
>   * tree-ssanames.c (duplicate_ssa_name_range_info): Remove an assert.
> 
>   * gcc.dg/torture/pr67821-2.c: New test.
>   * gcc.dg/torture/pr67821.c: New test.
> 
> diff --git gcc/testsuite/gcc.dg/torture/pr67821-2.c 
> gcc/testsuite/gcc.dg/torture/pr67821-2.c
> index e69de29..38cfc84 100644
> --- gcc/testsuite/gcc.dg/torture/pr67821-2.c
> +++ gcc/testsuite/gcc.dg/torture/pr67821-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +
> +int a, b, c, d, e, g;
> +short f;
> +
> +void
> +fn1 ()
> +{
> +  int i;
> +  f = a - b;
> +  e = (c && (i = d = (unsigned) f - 1)) || i;
> +  g = (unsigned) f - 1;
> +  c && (d = 0);
> +}
> diff --git gcc/testsuite/gcc.dg/torture/pr67821.c 
> gcc/testsuite/gcc.dg/torture/pr67821.c
> index e69de29..1c9e8b9 100644
> --- gcc/testsuite/gcc.dg/torture/pr67821.c
> +++ gcc/testsuite/gcc.dg/torture/pr67821.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +
> +int isdigit (int);
> +
> +int
> +foo (const char *s)
> +{
> +  int success = 1;
> +  const char *p = s + 2;
> +  if (!isdigit (*p))
> +success = 0;
> +  while (isdigit (*p))
> +++p;
> +  return success;
> +}
> diff --git gcc/tree-ssanames.c gcc/tree-ssanames.c
> index 64e2379..91f4ed8 100644
> --- gcc/tree-ssanames.c
> +++ gcc/tree-ssanames.c
> @@ -503,7 +503,6 @@ duplicate_ssa_name_range_info (tree name, enum 
> value_range_type range_type,
>  
>gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
>gcc_assert (!SSA_NAME_RANGE_INFO (name));
> -  gcc_assert (!SSA_NAME_ANTI_RANGE_P (name));
>  
>if (!range_info)
>  return;
> 
>   Marek
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Richard Sandiford

Richard Biener  writes:
> On Thu, Oct 1, 2015 at 3:59 PM, Bernd Schmidt  wrote:
>> On 10/01/2015 03:51 PM, Richard Sandiford wrote:
>>>
>>> We have a global 1/2 and a cached 1/3, but recalculate 1/4, 1/6 and 1/9
>>> each time we need them.  That seems a bit arbitrary and makes the folding
>>> code more noisy (especially once it's moved to match.pd).
>>>
>>> This patch caches the other three constants too.  Bootstrapped &
>>> regression-tested on x86_64-linux-gnu.  OK to install?
>>
>>
>> Looks reasonable enough.
>
> Given
>
> /* Returns the special REAL_VALUE_TYPE corresponding to 1/3.  */
>
> const REAL_VALUE_TYPE *
> dconst_third_ptr (void)
> {
>   static REAL_VALUE_TYPE value;
>
>   /* Initialize mathematical constants for constant folding builtins.
>  These constants need to be given to at least 160 bits precision.  */
>   if (value.cl == rvc_zero)
> {
>   real_arithmetic (&value, RDIV_EXPR, &dconst1, real_digit (3));
> }
>   return &value;
> }
>
> I wonder if it makes sense to have
>
> template
> const REAL_VALUE_TYPE &
> dconst (void)
> {
>   static REAL_VALUE_TYPE value;
>   if (value.cl == rvc_zero)
> real_arithmetic (&value, RDIV_EXPR, real_digit (a), real_digit (b));
>   return value;
> }
>
> instead which allows us to use
>
>   dconst<1,2>()
>
> in place of dconst_half () and allows arbitrary extra cached constants to be
> added (well, double-check that, but I think the function static should be
> a .comdat).

You suggested on IRC that we do the same for the integral constants,
so e.g. dconst0 becomes dconst<0> ().  Here's the result.  Like I said,
I think this may be a case of "be careful what you wish for".
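
As a standalone illustration of the caching idea (not the code in the patch), the following sketch uses a function template with a function-local static, so each (numerator, denominator) pair gets its own lazily initialized object. real_value and cached_fraction are hypothetical names, long double stands in for REAL_VALUE_TYPE, and an explicit flag replaces the rvc_zero check.

```cpp
// Hypothetical stand-in for REAL_VALUE_TYPE.
struct real_value
{
  bool initialized;
  long double value;
};

// One cached object per instantiation: cached_fraction<1, 3> and
// cached_fraction<1, 4> each own a separate function-local static.
template <int A, int B>
const real_value &cached_fraction()
{
  static_assert(B != 0, "denominator must be nonzero");
  static real_value cache = {false, 0.0L};
  if (!cache.initialized)
    {
      cache.value = static_cast<long double>(A) / B;
      cache.initialized = true;
    }
  return cache;
}
```

Repeated calls with the same template arguments return a reference to the same object, so a new constant needs no extra declaration anywhere; the cost is that every use site spells out the fraction.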

Bootstrapped & regression-tested on x86_64-linux-gnu.  Also tested by
building one target per CPU directory and checking that there were no
new warnings and no changes in testsuite output at -O2.  OK to install?

Thanks,
Richard

gcc/ada/
* gcc-interface/trans.c (convert_with_check): Use dconst template
instead of static variables.

gcc/c-family/
* c-common.c (c_common_truthvalue_conversion): Use dconst template
instead of static variables.
* c-lex.c (interpret_float): Likewise.
* c-ubsan.c (ubsan_instrument_division): Likewise.

gcc/java/
* decl.c (java_init_decl_processing): Use dconst template instead
of static variables.

gcc/
* real.h (dconst0, dconst1, dconst2, dconstm1, dconsthalf): Delete.
(dconst_third, dconst_third_ptr): Delete.
(real_from_fraction): Declare.
(dconst): New function.
* real.c (real_from_fraction): New function.
(real_digit, dconst_third_ptr): Delete.
(exact_real_inverse, real_to_decimal_for_mode, decimal_integer_string)
(ten_to_mptwo, times_pten): Use dconst instead of real_digit.
(real_powi, real_floor, real_ceil, real_round): Use dconst
instead of static variables.
* emit-rtl.c (dconst0, dconst1, dconst2, dconstm1, dconsthalf): Delete.
(init_emit_once): Don't initialize them.
* builtins.c (fold_builtin_sqrt, fold_builtin_cbrt): Use dconst
instead of static variables.  Also use dconst<1, 6> and dconst<1, 9>
instead of deriving them from dconst_third.
(expand_builtin_cexpi, expand_builtin_signbit, fold_builtin_cabs)
(fold_builtin_pow, fold_builtin_powi, fold_builtin_signbit)
(fold_builtin_modf, fold_builtin_classify, fold_builtin_fpclassify)
(fold_builtin_1, fold_builtin_2): Use dconst instead of static
variables.
* doc/match-and-simplify.texi: Likewise (in examples).
* config/aarch64/aarch64.c (aarch64_float_const_zero_rtx_p): Likewise.
* config/c6x/c6x.md (divsf3, divdf3): Likewise.
* config/fr30/fr30.c (fr30_const_double_is_zero): Likewise.
* config/i386/i386.c (standard_80387_constant_p): Likewise.
(ix86_expand_convert_uns_didf_sse, ix86_expand_convert_uns_sidf_sse)
(ix86_expand_convert_sign_didf_sse, ix86_expand_convert_uns_sisf_sse)
(ix86_expand_vector_convert_uns_vsivsf): Likewise.
(ix86_expand_adjust_ufix_to_sfix_si, ix86_emit_i387_round): Likewise.
(ix86_emit_swsqrtsf, ix86_gen_TWO52, ix86_expand_lround): Likewise.
(ix86_expand_floorceildf_32, ix86_expand_floorceil): Likewise.
(ix86_expand_rounddf_32, ix86_expand_truncdf_32): Likewise.
(ix86_expand_round, ix86_expand_round_sse4): Likewise.
* config/i386/i386.md (fixuns_truncsi2): Likewise.
* config/i386/sse.md (vec_unpacku_float_hi_v4si): Likewise.
(vec_unpacku_float_lo_v4si, vec_unpacku_float_hi_v8si): Likewise.
(vec_unpacku_float_hi_v16si, vec_unpacku_float_lo_v8si): Likewise.
(vec_unpacku_float_lo_v16si, round2): Likewise.
* config/m68k/m68k.c (floating_exact_log2): Likewise.
* config/rs6000/rs6000.c (rs6000_emit_swdiv): Likewise.
(rs6000_scale_v2df): Likewise.
* config/rs6000/rs6000.md (*

Re: [PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Marek Polacek

On Mon, Oct 05, 2015 at 04:26:49PM +0200, Richard Biener wrote:
> > > Otherwise other setters of SSA_NAME_RANGE_INFO would need to make
> > > sure SSA_NAME_ANTI_RANGE_P is cleared as well.
> >  
> > They mostly do, if I'm looking right, e.g.
> > tree-ssa-phiopt.c:1016
> > tree-ssa-loop-im.c:1224
> > tree-ssa-loop-im.c:1294
> > and I've also seen
> > .
> 
> Those would be redundant then.

I can give this a whirl (as a follow-up)...

2015-10-05  Marek Polacek  

* tree-ssa-loop-im.c
(move_computations_dom_walker::before_dom_children): Don't set
SSA_NAME_ANTI_RANGE_P.
* tree-ssa-phiopt.c (value_replacement): Likewise.

diff --git gcc/tree-ssa-loop-im.c gcc/tree-ssa-loop-im.c
index f3389a0..9b2436f 100644
--- gcc/tree-ssa-loop-im.c
+++ gcc/tree-ssa-loop-im.c
@@ -1222,7 +1222,6 @@ move_computations_dom_walker::before_dom_children 
(basic_block bb)
{
  tree lhs = gimple_assign_lhs (new_stmt);
  SSA_NAME_RANGE_INFO (lhs) = NULL;
- SSA_NAME_ANTI_RANGE_P (lhs) = 0;
}
   gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
   remove_phi_node (&bsi, false);
@@ -1292,7 +1291,6 @@ move_computations_dom_walker::before_dom_children 
(basic_block bb)
{
  tree lhs = gimple_get_lhs (stmt);
  SSA_NAME_RANGE_INFO (lhs) = NULL;
- SSA_NAME_ANTI_RANGE_P (lhs) = 0;
}
   /* In case this is a stmt that is not unconditionally executed
  when the target loop header is executed and the stmt may
diff --git gcc/tree-ssa-phiopt.c gcc/tree-ssa-phiopt.c
index 697836a..f33ca5c 100644
--- gcc/tree-ssa-phiopt.c
+++ gcc/tree-ssa-phiopt.c
@@ -1014,7 +1014,6 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
 :
 # u_3 = PHI   */
  SSA_NAME_RANGE_INFO (lhs) = NULL;
- SSA_NAME_ANTI_RANGE_P (lhs) = 0;
  /* If available, we can use VR of phi result at least.  */
  tree phires = gimple_phi_result (phi);
  struct range_info_def *phires_range_info

Marek

Generalize gimple_val_nonnegative_real_p

2015-10-05 Thread Richard Sandiford

The upcoming patch to move sqrt and cbrt simplifications to match.pd
caused a regression because the (abs @0)->@0 simplification didn't
trigger for:

(abs (convert (abs X)))

The simplification is based on tree_expr_nonnegative_p, which is
pretty weak for gimple (it gives up if it sees an SSA_NAME).

We have the stronger gimple_val_nonnegative_real_p, but (a) as its
name implies, it's specific to reals and (b) in its current form it
doesn't handle converts.  This patch:

- generalises the routine to all types
- reuses tree_{unary,binary,call}_nonnegative_warnv_p for the leaf cases
- makes the routine handle CONVERT_EXPR
- allows a nesting depth of 1 for CONVERT_EXPR
- uses the routine instead of tree_expr_nonnegative_p for gimple.

Limiting the depth to 1 is a little arbitrary but adding a param seemed
over the top.
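
The depth-limited check described above can be sketched on a toy expression
tree.  This is only an illustration: Kind, Expr, known_nonnegative and
max_convert_depth are made-up names rather than GCC data structures, and the
sketch assumes that every conversion preserves sign and that abs never
overflows.

```cpp
#include <cassert>
#include <memory>

// Toy expression kinds; these stand in for GIMPLE codes and are not GCC types.
enum class Kind { Var, Abs, Convert, Mul };

struct Expr {
  Kind kind;
  std::shared_ptr<Expr> op0, op1;
};

using ExprP = std::shared_ptr<Expr>;

inline ExprP var () { return std::make_shared<Expr> (Expr{Kind::Var, nullptr, nullptr}); }
inline ExprP abs_of (ExprP x) { return std::make_shared<Expr> (Expr{Kind::Abs, x, nullptr}); }
inline ExprP convert (ExprP x) { return std::make_shared<Expr> (Expr{Kind::Convert, x, nullptr}); }
inline ExprP mul (ExprP a, ExprP b) { return std::make_shared<Expr> (Expr{Kind::Mul, a, b}); }

// Maximum number of conversions the check is willing to look through.
constexpr unsigned max_convert_depth = 1;

// Return true if E is known to be non-negative.  DEPTH counts the
// conversions already looked through on the way to E.
bool known_nonnegative (const ExprP &e, unsigned depth = 0)
{
  assert (e);
  switch (e->kind)
    {
    case Kind::Abs:
      return true;
    case Kind::Mul:
      // x * x: the same operand twice (pointer identity in this toy model).
      return e->op0 == e->op1;
    case Kind::Convert:
      // Assumption: the conversion preserves sign (e.g. float -> double).
      if (depth >= max_convert_depth)
        return false;
      return known_nonnegative (e->op0, depth + 1);
    case Kind::Var:
      return false;
    }
  return false;
}
```

With this model the operand of the outer abs in (abs (convert (abs X))) is
recognised as non-negative, while two nested conversions exceed the limit.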

Bootstrapped & regression-tested on x86_64-linux-gnu.  I didn't write
a specific test because this is already covered by the testsuite if
the follow-on patch is also applied.  OK to install?

Thanks,
Richard


gcc/
* gimple-fold.h (gimple_val_nonnegative_real_p): Replace with...
(gimple_val_nonnegative_p): ...this new function.
* gimple-fold.c (gimple_val_nonnegative_real_p): Replace with...
(gimple_val_nonnegative_p): ...this new function.  Add a nesting
depth.  Handle conversions and allow them to be nested to a depth
of 1.  Generalize to non-reals.  Use tree_binary_nonnegative_warnv_p,
tree_unary_nonnegative_warnv_p and tree_call_nonnegative_warnv_p.
* tree-ssa-math-opts.c (gimple_expand_builtin_pow): Update accordingly.
* match.pd (nonnegative_p): New predicate.  Use it instead of
tree_expr_nonnegative_p to detect redundant abs expressions.

Index: a/gcc/gimple-fold.c
===
*** a/gcc/gimple-fold.c
--- b/gcc/gimple-fold.c
*** gimple_get_virt_method_for_binfo (HOST_WIDE_INT token, tree known_binfo,
*** 5773,5787 
  }
  
  /* Return true iff VAL is a gimple expression that is known to be
!non-negative.  Restricted to floating-point inputs.  */
  
  bool
! gimple_val_nonnegative_real_p (tree val)
  {
gimple *def_stmt;
  
-   gcc_assert (val && SCALAR_FLOAT_TYPE_P (TREE_TYPE (val)));
- 
/* Use existing logic for non-gimple trees.  */
if (tree_expr_nonnegative_p (val))
  return true;
--- 5773,5785 
  }
  
  /* Return true iff VAL is a gimple expression that is known to be
!non-negative.  DEPTH is the nesting depth.  */
  
  bool
! gimple_val_nonnegative_p (tree val, unsigned int depth)
  {
gimple *def_stmt;
  
/* Use existing logic for non-gimple trees.  */
if (tree_expr_nonnegative_p (val))
  return true;
*** gimple_val_nonnegative_real_p (tree val)
*** 5789,5906 
if (TREE_CODE (val) != SSA_NAME)
  return false;
  
!   /* Currently we look only at the immediately defining statement
!  to make this determination, since recursion on defining 
!  statements of operands can lead to quadratic behavior in the
!  worst case.  This is expected to catch almost all occurrences
!  in practice.  It would be possible to implement limited-depth
!  recursion if important cases are lost.  Alternatively, passes
!  that need this information (such as the pow/powi lowering code
!  in the cse_sincos pass) could be revised to provide it through
   dataflow propagation.  */
  
def_stmt = SSA_NAME_DEF_STMT (val);
  
if (is_gimple_assign (def_stmt))
  {
!   tree op0, op1;
! 
!   /* See fold-const.c:tree_expr_nonnegative_p for additional
!cases that could be handled with recursion.  */
! 
!   switch (gimple_assign_rhs_code (def_stmt))
{
!   case ABS_EXPR:
! /* Always true for floating-point operands.  */
! return true;
! 
!   case MULT_EXPR:
! /* True if the two operands are identical (since we are
!restricted to floating-point inputs).  */
! op0 = gimple_assign_rhs1 (def_stmt);
! op1 = gimple_assign_rhs2 (def_stmt);
! 
! if (op0 == op1
! || operand_equal_p (op0, op1, 0))
!   return true;
  
default:
! return false;
}
  }
else if (is_gimple_call (def_stmt))
  {
tree fndecl = gimple_call_fndecl (def_stmt);
!   if (fndecl
! && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
{
! tree arg1;
! 
! switch (DECL_FUNCTION_CODE (fndecl))
!   {
!   CASE_FLT_FN (BUILT_IN_ACOS):
!   CASE_FLT_FN (BUILT_IN_ACOSH):
!   CASE_FLT_FN (BUILT_IN_CABS):
!   CASE_FLT_FN (BUILT_IN_COSH):
!   CASE_FLT_FN (BUILT_IN_ERFC):
!   CASE_FLT_FN (BUILT_IN_EXP):
!   CASE_FLT_FN (BUILT_IN_EXP10):
!   CASE_FLT_FN (BUILT_IN_EXP2):
!   CASE_FLT_FN (BUILT_IN_FABS):
!   CASE_FLT_FN (BUILT_IN_FDIM):
!   CA

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 04:47 PM, Richard Sandiford wrote:

@@ -9536,7 +9520,7 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
{
  r = TREE_REAL_CST (arg);
  if (real_isinf (&r))
-   return real_compare (GT_EXPR, &r, &dconst0)
+   return real_compare (GT_EXPR, &r, &dconst<0> ())
   ? integer_one_node : integer_minus_one_node;
  else
return integer_zero_node;


So... are the templates magic enough not to make us create a new 
temporary every time this is used?



Bernd

Re: C PATCH for c/65345 (file-scope _Atomic expansion with floats)

2015-10-05 Thread Joseph Myers

On Sat, 3 Oct 2015, David Edelsohn wrote:

> It's poor form to fix a bug only on x86 that is common to all targets
> and leave the problem "as an exercise for the reader" for all other
> targets -- essentially banishing the other targets to second-class
> status with respect to conformance -- especially when the change is
> mostly mechanical.  I don't expect developers to specifically enable
> and exploit every new feature on every architecture, but had expected
> bug fixes to be distributed to all targets.  "It's just not cricket."

I have no disagreement with that principle.  My disagreement is about how 
this particular patch fits in with such principles, where there is a 
subjective judgement involved about how separate issues with different 
targets' code are, and about how much fixing an issue for a target 
involves expertise in that architecture and back end versus expertise in 
the issue being fixed, and about the merits of checking in testcases early 
so it's as easy as possible to see whether a given target is actually 
affected.

If in such a case the judgement is that something can be more efficiently 
fixed with target expertise, it is of course important to be thorough 
about handing over the expertise in the fix (for example, in this case, 
explaining how to tell when TARGET_EXPR needs to be used).

Perhaps GCC should have an equivalent of glibc's 
 to list cases where it was 
judged for a particular change that updates could most effectively be made 
by target maintainers.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: using scratchpads to enhance RTL-level if-conversion: the new patch now passes bootstrap with the default BUILD_CONFIG [i.e. no stage2-to-stage3 comparison errors even with debugging info off in s

2015-10-05 Thread Bernd Schmidt

Oh, one other thing. To be able to include your code we need to have a 
copyright assignment to the FSF from you. I see one previous commit from 
you, but only a trivial one and with a corporate email address. Have you 
gone through the copyright assignment process?



Bernd

Move sqrt and cbrt simplifications to match.pd

2015-10-05 Thread Richard Sandiford

This patch moves the sqrt and cbrt simplification rules to match.pd.
builtins.c now only does the constant folding.

Bootstrapped & regression-tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* builtins.c (fold_builtin_sqrt, fold_builtin_cbrt): Delete.
(fold_builtin_1): Update accordingly.  Handle constant arguments here.
* match.pd: Add rules previously handled by fold_builtin_sqrt
and fold_builtin_cbrt.

gcc/testsuite/
* gcc.dg/builtins-47.c: Test the optimized dump instead.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 85ba6dd..3df60e8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -158,8 +158,6 @@ static bool integer_valued_real_p (tree);
 static tree fold_trunc_transparent_mathfn (location_t, tree, tree);
 static rtx expand_builtin_fabs (tree, rtx, rtx);
 static rtx expand_builtin_signbit (tree, rtx);
-static tree fold_builtin_sqrt (location_t, tree, tree);
-static tree fold_builtin_cbrt (location_t, tree, tree);
 static tree fold_builtin_pow (location_t, tree, tree, tree, tree);
 static tree fold_builtin_powi (location_t, tree, tree, tree, tree);
 static tree fold_builtin_cos (location_t, tree, tree, tree);
@@ -7706,145 +7704,6 @@ fold_builtin_cproj (location_t loc, tree arg, tree type)
   return NULL_TREE;
 }
 
-/* Fold a builtin function call to sqrt, sqrtf, or sqrtl with argument ARG.
-   Return NULL_TREE if no simplification can be made.  */
-
-static tree
-fold_builtin_sqrt (location_t loc, tree arg, tree type)
-{
-
-  enum built_in_function fcode;
-  tree res;
-
-  if (!validate_arg (arg, REAL_TYPE))
-return NULL_TREE;
-
-  /* Calculate the result when the argument is a constant.  */
-  if ((res = do_mpfr_arg1 (arg, type, mpfr_sqrt, &dconst<0> (), NULL, true)))
-return res;
-
-  /* Optimize sqrt(expN(x)) = expN(x*0.5).  */
-  fcode = builtin_mathfn_code (arg);
-  if (flag_unsafe_math_optimizations && BUILTIN_EXPONENT_P (fcode))
-{
-  tree expfn = TREE_OPERAND (CALL_EXPR_FN (arg), 0);
-  arg = fold_build2_loc (loc, MULT_EXPR, type,
-CALL_EXPR_ARG (arg, 0),
-build_real (type, dconst<1, 2> ()));
-  return build_call_expr_loc (loc, expfn, 1, arg);
-}
-
-  /* Optimize sqrt(Nroot(x)) -> pow(x,1/(2*N)).  */
-  if (flag_unsafe_math_optimizations && BUILTIN_ROOT_P (fcode))
-{
-  tree powfn = mathfn_built_in (type, BUILT_IN_POW);
-
-  if (powfn)
-   {
- tree arg0 = CALL_EXPR_ARG (arg, 0);
- tree arg1 = (BUILTIN_SQRT_P (fcode)
-  ? build_real (type, dconst<1, 4> ())
-  : build_real_truncate (type, dconst<1, 6> ()));
- return build_call_expr_loc (loc, powfn, 2, arg0, arg1);
-   }
-}
-
-  /* Optimize sqrt(pow(x,y)) = pow(|x|,y*0.5).  */
-  if (flag_unsafe_math_optimizations
-  && (fcode == BUILT_IN_POW
- || fcode == BUILT_IN_POWF
- || fcode == BUILT_IN_POWL))
-{
-  tree powfn = TREE_OPERAND (CALL_EXPR_FN (arg), 0);
-  tree arg0 = CALL_EXPR_ARG (arg, 0);
-  tree arg1 = CALL_EXPR_ARG (arg, 1);
-  tree narg1;
-  if (!tree_expr_nonnegative_p (arg0))
-   arg0 = build1 (ABS_EXPR, type, arg0);
-  narg1 = fold_build2_loc (loc, MULT_EXPR, type, arg1,
-  build_real (type, dconst<1, 2> ()));
-  return build_call_expr_loc (loc, powfn, 2, arg0, narg1);
-}
-
-  return NULL_TREE;
-}
-
-/* Fold a builtin function call to cbrt, cbrtf, or cbrtl with argument ARG.
-   Return NULL_TREE if no simplification can be made.  */
-
-static tree
-fold_builtin_cbrt (location_t loc, tree arg, tree type)
-{
-  const enum built_in_function fcode = builtin_mathfn_code (arg);
-  tree res;
-
-  if (!validate_arg (arg, REAL_TYPE))
-return NULL_TREE;
-
-  /* Calculate the result when the argument is a constant.  */
-  if ((res = do_mpfr_arg1 (arg, type, mpfr_cbrt, NULL, NULL, 0)))
-return res;
-
-  if (flag_unsafe_math_optimizations)
-{
-  /* Optimize cbrt(expN(x)) -> expN(x/3).  */
-  if (BUILTIN_EXPONENT_P (fcode))
-   {
- tree expfn = TREE_OPERAND (CALL_EXPR_FN (arg), 0);
- arg = fold_build2_loc (loc, MULT_EXPR, type,
-CALL_EXPR_ARG (arg, 0),
-build_real_truncate (type, dconst<1, 3> ()));
- return build_call_expr_loc (loc, expfn, 1, arg);
-   }
-
-  /* Optimize cbrt(sqrt(x)) -> pow(x,1/6).  */
-  if (BUILTIN_SQRT_P (fcode))
-   {
- tree powfn = mathfn_built_in (type, BUILT_IN_POW);
-
- if (powfn)
-   {
- tree arg0 = CALL_EXPR_ARG (arg, 0);
- tree tree_root = build_real_truncate (type, dconst<1, 6> ());
- return build_call_expr_loc (loc, powfn, 2, arg0, tree_root);
-   }
-   }
-
-  /* Optimize cbrt(cbrt(x)) -> pow(x,1/9) iff x is nonnegative.  */
-  if (BUILTIN_CBRT_P (fcode))
-   {
- tree arg0 = CALL_EXPR_ARG (ar

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Richard Sandiford

Bernd Schmidt  writes:
> On 10/05/2015 04:47 PM, Richard Sandiford wrote:
>> @@ -9536,7 +9520,7 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
>>  {
>>r = TREE_REAL_CST (arg);
>>if (real_isinf (&r))
>> -return real_compare (GT_EXPR, &r, &dconst0)
>> +return real_compare (GT_EXPR, &r, &dconst<0> ())
>> ? integer_one_node : integer_minus_one_node;
>>else
>>  return integer_zero_node;
>
> So... are the templates magic enough not to make us create a new 
> temporary every time this is used?

Yeah, the static variables become comdat objects keyed off the full name
(dconst<...>::value).  They're shared between calls and between TUs.

Thanks,
Richard

Re: RFC: PATCH for front end parts of C++ transactional memory TS

2015-10-05 Thread Jason Merrill


On 10/05/2015 05:00 AM, Andreas Schwab wrote:

Jason Merrill  writes:


diff --git a/gcc/testsuite/g++.dg/tm/eh1.C b/gcc/testsuite/g++.dg/tm/eh1.C
new file mode 100644
index 000..1561211
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tm/eh1.C
@@ -0,0 +1,10 @@
+// A handler can involve a transaction-safety conversion.
+// { dg-do run }
+// { dg-options "-fgnu-tm" }
+
+void g() transaction_safe {}
+int main()
+{
+  try { throw g; }
+  catch (void (*p)()) { }
+}


FAIL: g++.dg/tm/eh1.C  -std=gnu++98 (test for excess errors)
Excess errors:
xg++: error: libitm.spec: No such file or directory

There is no libitm support on ia64.


Thanks.  I moved the run tests to the libitm testsuite.

Jason

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 05:22 PM, Richard Sandiford wrote:

Bernd Schmidt  writes:

On 10/05/2015 04:47 PM, Richard Sandiford wrote:

@@ -9536,7 +9520,7 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
{
  r = TREE_REAL_CST (arg);
  if (real_isinf (&r))
-   return real_compare (GT_EXPR, &r, &dconst0)
+   return real_compare (GT_EXPR, &r, &dconst<0> ())
   ? integer_one_node : integer_minus_one_node;
  else
return integer_zero_node;


So... are the templates magic enough not to make us create a new
temporary every time this is used?


Yeah, the static variables become comdat objects keyed off the full name
(dconst<...>::value).  They're shared between calls and between TUs.


Hmm. And since you're returning a reference, taking the address works. 
The whole thing is subtle enough that it deserves a comment. Since this 
kind of thing is something I don't like about C++ (simple-looking code 
expanding into non-obvious behaviour) I'm not going to ack this patch, 
but if someone else wants to, that's fine.


I do believe you still have some code growth since the inline dconst 
function always expands code that will initialize the constant. IMO 
that's not desirable.



Bernd

Re: [PATCH] Fix PR67783, quadraticness in IPA inline analysis

2015-10-05 Thread Jan Hubicka

> On Thu, 1 Oct 2015, Richard Biener wrote:
> 
> > 
> > The following avoids quadraticness in the loop depth by only considering
> > loop header defs as IVs for the analysis of the loop_stride predicate.
> > This will miss cases like
> > 
> > foo (int inv)
> > {
> >  for (i = inv; i < n; ++i)
> >   {
> > int derived_iv = i + i * inv;
> > ...
> >   }
> > }
> > 
> > but I doubt that's important in practice.  Another way would be to
> > just consider the containing loop when analyzing the IV, thus iterate
> > over outermost loop bodies only, replacing the
> > 
> >   simple_iv (loop, loop_containing_stmt (stmt), use, &iv, true)
> > 
> > check with
> > 
> >   simple_iv (loop_containing_stmt (stmt), loop_containing_stmt (stmt), 
> > use, &iv, true);
> > 
> > but doing all this analysis for each stmt is already quite expensive,
> > esp. as we are doing it for all uses instead of all defs ...
> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > 
> > Honza, is this ok or did you do the current way on purpose (rather
> > than for completeness as it was easy to do?)
> 
> Applied as r228472.

Ah, sorry. I wrote you a reply but apparently did not send it.  Yes, the patch
looks reasonable - it is a heuristic after all.  Let's watch whether the change
makes any difference on polyhedron and other benchmarks.

Honza

Re: Do not use TYPE_CANONICAL in useless_type_conversion

2015-10-05 Thread Jan Hubicka

> >+  /* For aggregates compare only the size and mode.  Accesses to fields do have
> >+ a type information by themselves and thus we only care if we can i.e.
> >+ use the types in move operations.  */
> >else if (AGGREGATE_TYPE_P (inner_type)
> >&& TREE_CODE (inner_type) == TREE_CODE (outer_type))
> >-return false;
> >+return (!TYPE_SIZE (outer_type)
> >+|| (TYPE_SIZE (inner_type)
> >+&& operand_equal_p (TYPE_SIZE (inner_type),
> >+TYPE_SIZE (outer_type), 0)));
> >+
> >+  else if (TREE_CODE (inner_type) == OFFSET_TYPE
> >+   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
> >+return useless_type_conversion_p (TREE_TYPE (outer_type),
> >+  TREE_TYPE (inner_type))
> >+   && useless_type_conversion_p
> >+(TYPE_OFFSET_BASETYPE (outer_type),
> >+ TYPE_OFFSET_BASETYPE (inner_type));
> >
> 
> The comment says the mode is compared, but I don't see that in the
> code. Which is right?
> 
> Also, wouldn't the final condition be clearer written as
> 
> > +  else if (TREE_CODE (inner_type) == OFFSET_TYPE
> > +  && TREE_CODE (outer_type) == OFFSET_TYPE)

Updated in my local copy, thanks!

Honza
> 
> 
> Bernd

Re: [PATCH] Add verifier for leaked SSA names

2015-10-05 Thread Jeff Law


On 10/05/2015 02:56 AM, Richard Biener wrote:

I was building the verification step into the ssa name manager. Essentially
at the point where we flush from the pending to the free list, we should
have a consistent state.


Yeah, though when SSA verifiers run the state should also be consistent
and we'd get to pinpoint the offending pass easier.

Agreed.




Thus we ought to be able to walk the IL marking everything we can see,
combine that with the contents of the freelist and the result ought to be
every SSA_NAME ever created.

Reality is somewhat different, of course.

Yours takes a slightly different approach.  Ultimately if we get the leaks
plugged, we might even consider using both.


Sure.  Note that the above is from simply walking all SSA names.

Right.  That's precisely what I was referring to.  Yours walks the 
SSA_NAMEs and declares those with an empty block for the defining 
statement as leaks.


Mine walks the IL and declares any name that was allocated, but not 
found in the IL as a leak.


Mine is obviously more expensive, but possibly catches more leaks.

The one conclusion I did come to over the weekend is that without an 
immediate-recycling policy, there's no value in explicitly releasing the 
names.  I.e., we could just drop all the explicit management and rely on 
garbage collection at safe points.


If we added either a mode to allow immediate recycling or an adaptive 
behaviour in the manager to recycle immediately until it wasn't safe to 
do so, then there's obviously value in the explicit releases and 
plugging the leaks.


jeff
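
The IL-walking check described above amounts to a set difference.  A minimal
sketch, assuming names are identified by version numbers 0..N-1 and that the
walk and the free list have already been collected into sets;
find_leaked_names and its parameters are hypothetical, not part of the SSA
name manager.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Return the versions that were allocated but are neither reachable from
// the IL nor on the free list.  If nothing leaked, the union of the two
// sets covers every name ever created and the result is empty.
std::vector<unsigned>
find_leaked_names (unsigned num_allocated,
                   const std::set<unsigned> &seen_in_il,
                   const std::set<unsigned> &free_list)
{
  std::vector<unsigned> leaked;
  for (unsigned v = 0; v < num_allocated; ++v)
    if (!seen_in_il.count (v) && !free_list.count (v))
      leaked.push_back (v);
  return leaked;
}

// Usage: six names were created, three are live, one was released.
inline void example ()
{
  std::vector<unsigned> leaked = find_leaked_names (6, {0, 1, 4}, {2});
  assert (leaked.size () == 2);
}
```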

[PATCH, i386, AVX512] PR target/67849: Avoid upper-bank registers when splitting vec_extract_lo instruction.

2015-10-05 Thread Alexander Fomin

This patch addresses PR target/67849. Given a machine that does not
support AVX512VL, the following "else" branch for the vec_extract_lo insn
may result in a split using YMMs from the upper bank, hence invalid
assembly. Tuning the define_insn pattern and define_split constraints
eliminates this problem.

Please take a look at the ChangeLog entries - I am not sure how to refer
to the corresponding split in the md file.

Regards,
Alexander
---
gcc/

PR target/67849
* config/i386/sse.md (define_split): Restrict splitting for
upper-bank registers when target does not support AVX512VL.
(define_insn "vec_extract_lo_"): Restrict
instruction splitting when target does not support AVX512VL.
---
 gcc/config/i386/sse.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9b7a338..b413726 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6924,7 +6924,8 @@
  (parallel [(const_int 0) (const_int 1)
 (const_int 2) (const_int 3)])))]
   "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))
-  && reload_completed"
+  && reload_completed
+  && (TARGET_AVX512VL || (REG_P (operands[0]) && !(REG_P (operands[1]) && EXT_REX_SSE_REGNO_P (REGNO (operands[1])))))"
   [(const_int 0)]
 {
   rtx op1 = operands[1];
@@ -6962,7 +6963,7 @@
 (const_int 2) (const_int 3)])))]
   "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
-  if ()
+  if ( || !TARGET_AVX512VL)
 return "vextract64x4\t{$0x0, %1, 
%0|%0, %1, 0x0}";
   else
 return "#";
-- 
1.8.3.1

Re: [PATCH, i386, AVX512] PR target/67849: Avoid upper-bank registers when splitting vec_extract_lo instruction.

2015-10-05 Thread Uros Bizjak

On Mon, Oct 5, 2015 at 5:54 PM, Alexander Fomin
 wrote:
> This patch addresses PR target/67849. Given a machine that does not
> support AVX512VL, the following "else" branch for the vec_extract_lo insn
> may result in a split using YMMs from the upper bank, hence invalid
> assembly. Tuning the define_insn pattern and define_split constraints
> eliminates this problem.
>
> Please take a look at the ChangeLog entries - I am not sure how to refer
> to the corresponding split in the md file.

Just describe it as " splitter". There are some examples
in ChangeLog.

> Regards,
> Alexander
> ---
> gcc/
>
> PR target/67849
> * config/i386/sse.md (define_split): Restrict splitting for
> upper-bank registers when target does not support AVX512VL.
> (define_insn "vec_extract_lo_"): Restrict
> instruction splitting when target does not support AVX512VL.
> ---
>  gcc/config/i386/sse.md | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 9b7a338..b413726 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -6924,7 +6924,8 @@
>   (parallel [(const_int 0) (const_int 1)
>  (const_int 2) (const_int 3)])))]
>"TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))
> -  && reload_completed"
> +  && reload_completed
> +  && (TARGET_AVX512VL || (REG_P (operands[0]) && !(REG_P (operands[1]) && EXT_REX_SSE_REGNO_P (REGNO (operands[1])))))"

EXT_REX_SSE_REG_P (operands[1]) can be used here.

>[(const_int 0)]
>  {
>rtx op1 = operands[1];
> @@ -6962,7 +6963,7 @@
>  (const_int 2) (const_int 3)])))]
>"TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
>  {
> -  if ()
> +  if ( || !TARGET_AVX512VL)
>  return "vextract64x4\t{$0x0, %1, 
> %0|%0, %1, 0x0}";
>else
>  return "#";
> --
> 1.8.3.1
>
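
The extra split condition can be modelled outside the compiler.  This is a
sketch only: operand, upper_bank_p and can_split_extract_lo are hypothetical
names, and register numbers here are plain SSE register indices (16..31 for
the AVX-512 upper bank), not GCC hard register numbers.

```cpp
#include <cassert>

// Hypothetical operand model, not GCC's rtx.
struct operand {
  bool is_reg;      // true for a register, false for memory
  unsigned regno;   // SSE register index, meaningful only when is_reg
};

// Registers 16..31 form the upper bank that AVX-512 adds.
inline bool upper_bank_p (const operand &op)
{
  return op.is_reg && op.regno >= 16 && op.regno <= 31;
}

// Mirrors the condition added to the splitter: with AVX512VL any operands
// may be split; without it the destination must be a register and the
// source must not be an upper-bank register.
inline bool can_split_extract_lo (bool avx512vl, const operand &dest,
                                  const operand &src)
{
  return avx512vl || (dest.is_reg && !upper_bank_p (src));
}

// Usage: an upper-bank source blocks the split unless AVX512VL is present.
inline void example ()
{
  operand dest = { true, 0 };
  operand src = { true, 17 };
  assert (!can_split_extract_lo (false, dest, src));
  assert (can_split_extract_lo (true, dest, src));
}
```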

[gomp4] [nvptx] Don't explicitly pass "-lgomp" to the offload compiler (was: nvptx offloading linking)

2015-10-05 Thread Thomas Schwinge

Hi!

On Wed, 13 May 2015 22:11:36 +0200, I wrote:
> On Wed, 22 Apr 2015 17:08:26 +0200, Bernd Schmidt  
> wrote:
> > On 04/21/2015 05:58 PM, Jakub Jelinek wrote:
> > 
> > > suggests that while it is nice that when building nvptx accel compiler
> > > we build libgcc.a, libc.a, libm.a, libgfortran.a (and in the future 
> > > hopefully libgomp.a),
> > > nothing attempts to link those in :(.
> > 
> > I have that fixed; I expect I'll get around to posting this at some 
> > point now that stage1 is open.
> 
> I have committed the following to gomp-4_0-branch in r223176.  [...]

>   gcc/
>   * config/nvptx/mkoffload.c [...]
>   (main): [...] Add -lgomp.  [...]

> --- gcc/config/nvptx/mkoffload.c
> +++ gcc/config/nvptx/mkoffload.c

> @@ -983,47 +400,74 @@ main (int argc, char **argv)
>obstack_ptr_grow (&argv_obstack, driver);
>obstack_ptr_grow (&argv_obstack, "-xlto");
>obstack_ptr_grow (&argv_obstack, target_ilp32 ? "-m32" : "-m64");
> -[...]
> +  obstack_ptr_grow (&argv_obstack, "-lgomp");

As argued in

(-fopenacc/-fopenmp in combination with the libgomp spec file), and now
verified, we don't actually need that (and I had omitted it from the
earlier trunk commit); now reflected on gomp-4_0-branch in r228495:

commit aee77cda31ea36c95020ea12da3d379d859a851b
Author: tschwinge 
Date:   Mon Oct 5 16:04:23 2015 +

[nvptx] Don't explicitly pass "-lgomp" to the offload compiler

gcc/
* config/nvptx/mkoffload.c (main): Don't explicitly pass "-lgomp"
to the offload compiler.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@228495 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp   |5 +
 gcc/config/nvptx/mkoffload.c |1 -
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 8a32190..a65e652 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2015-10-05  Thomas Schwinge  
+
+   * config/nvptx/mkoffload.c (main): Don't explicitly pass "-lgomp"
+   to the offload compiler.
+
 2015-10-01  Nathan Sidwell  
 
* builtins.c: Don't include gomp-constants.h.
diff --git gcc/config/nvptx/mkoffload.c gcc/config/nvptx/mkoffload.c
index c8ea8b1..e398b44 100644
--- gcc/config/nvptx/mkoffload.c
+++ gcc/config/nvptx/mkoffload.c
@@ -488,7 +488,6 @@ main (int argc, char **argv)
 default:
   gcc_unreachable ();
 }
-  obstack_ptr_grow (&argv_obstack, "-lgomp");
   char *collect_mkoffload_opts = getenv ("COLLECT_MKOFFLOAD_OPTIONS");
   if (collect_mkoffload_opts)
 {


Grüße,
 Thomas



Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread Steve Ellcey

On Mon, 2015-10-05 at 10:41 +0200, Bernd Schmidt wrote:
> On 10/02/2015 10:57 PM, Steve Ellcey wrote:
> > I have spent some time trying to do dynamic stack alignment on MIPS and had
> > considerable trouble.  The problems are mainly due to the dwarf based stack
> > unwinding and setjmp/longjmp usages where the code does not go through the
> > normal function prologue and epilogue code.
> [...]
> > The main advantage to this approach over dynamically aligning the stack
> > is that by not changing the real stack (or frame) pointer there is
> > minimal chance of breaking the ABI and there are no changes needed to
> > the dwarf unwind code.  The main disadvantage is that I am padding each
> > individual spill so I am wasting more space than absolutely required.
> > It should be possible to address this by putting all the aligned spills
> > together and sharing the padding but I would like to leave that for a
> > future improvement.
> >
> > In the mean time I would like to get some comments on this approach and
> > see what people think.  Does this seem like a reasonable approach to
> > allowing for aligned spills beyond what the stack supports?
> 
> Personally I'm not a fan. Your description of it makes it sound 
> immensely wasteful, and I'm really not clear on why stack alignment 
> wouldn't work for MIPS when it's been shown to work elsewhere. I think 
> we'd want to see a clear demonstration of unfixable problems with stack 
> alignment before allowing something like this in.
> 
> Vlad would have to comment on the LRA bits, probably.
> 
> 
> Bernd

There probably is some way to get dynamic stack alignment to work on
MIPS, but I am not sure I can do it.  The only platform that I see that
uses dynamic stack alignment is x86.  I think the difficulties in
getting this to work correctly are why no other platform has implemented
it.  The most common response I have gotten when asking around for help
on dynamic stack alignment is "just break the ABI".

My approach does waste some space: on MIPS I would be allocating 32
bytes of stack space to spill a 16-byte MSA register, but the
hope/belief is that MSA registers would not get spilled very often.

Steve Ellcey
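
One way to read the 32-byte figure, offered as an assumption rather than a
description of the patch: reserve size + alignment bytes for the slot and
round the start of the slot up to the required alignment, so that the
aligned value always fits whatever the incoming stack alignment is.
align_up and padded_slot_size are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Round ADDR up to the next multiple of ALIGN, which must be a power of two.
inline std::uintptr_t align_up (std::uintptr_t addr, std::uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

// Bytes reserved for one padded slot under the size + alignment scheme.
inline std::uintptr_t padded_slot_size (std::uintptr_t size,
                                        std::uintptr_t align)
{
  return size + align;
}
```

For a 16-byte register needing 16-byte alignment this reserves 32 bytes, and
the aligned 16-byte region lies inside the slot for any slot start address.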

[PATCH] Fix ICE for SIMD clones usage in LTO

2015-10-05 Thread Ilya Enkovich

Hi,

When a SIMD clone is created, the original function may be defined in another
partition.  In this case the SIMD clone also has to have the in_other_partition
flag set.  Currently it does not, and we get an ICE.  This patch fixes it.
Bootstrapped and regtested for x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-10-05  Ilya Enkovich  

* omp-low.c (simd_clone_create): Set in_other_partition
for created clones.

gcc/testsuite/

2015-10-05  Ilya Enkovich  

* gcc.dg/lto/simd-function_0.c: New test.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index cdcf9d6..8d25784 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12948,6 +12948,8 @@ simd_clone_create (struct cgraph_node *old_node)
   DECL_STATIC_CONSTRUCTOR (new_decl) = 0;
   DECL_STATIC_DESTRUCTOR (new_decl) = 0;
   new_node = old_node->create_version_clone (new_decl, vNULL, NULL);
+  if (old_node->in_other_partition)
+   new_node->in_other_partition = 1;
   symtab->call_cgraph_insertion_hooks (new_node);
 }
   if (new_node == NULL)
diff --git a/gcc/testsuite/gcc.dg/lto/simd-function_0.c b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
new file mode 100755
index 000..cda31aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
@@ -0,0 +1,34 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-lto-options { { -fopenmp-simd -O3 -ffast-math -mavx2 -flto -flto-partition=max } } } */
+
+#define SIZE 4096
+float x[SIZE];
+
+
+#pragma omp declare simd
+float
+__attribute__ ((noinline))
+my_mul (float x, float y) {
+  return x * y;
+}
+
+__attribute__ ((noinline))
+int foo ()
+{
+  int i = 0;
+#pragma omp simd safelen (16)
+  for (i = 0; i < SIZE; i++)
+x[i] = my_mul ((float)i, 9932.3323);
+  return (int)x[0];
+}
+
+int main ()
+{
+  int i = 0;
+  for (i = 0; i < SIZE; i++)
+x[i] = my_mul ((float) i, 9932.3323);
+  foo ();
+  return (int)x[0];
+}
+

Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread H.J. Lu

On Mon, Oct 5, 2015 at 9:10 AM, Steve Ellcey  wrote:
> On Mon, 2015-10-05 at 10:41 +0200, Bernd Schmidt wrote:
>> On 10/02/2015 10:57 PM, Steve Ellcey wrote:
>> > I have spent some time trying to do dynamic stack alignment on MIPS and had
>> > considerable trouble.  The problems are mainly due to the dwarf based stack
>> > unwinding and setjmp/longjmp usages where the code does not go through the
>> > normal function prologue and epilogue code.
>> [...]
>> > The main advantage to this approach over dynamically aligning the stack
>> > is that by not changing the real stack (or frame) pointer there is
>> > minimal chance of breaking the ABI and there are no changes needed to
>> > the dwarf unwind code.  The main disadvantage is that I am padding each
>> > individual spill so I am wasting more space than absolutely required.
>> > It should be possible to address this by putting all the aligned spills
>> > together and sharing the padding but I would like to leave that for a
>> > future improvement.
>> >
>> > In the mean time I would like to get some comments on this approach and
>> > see what people think.  Does this seem like a reasonable approach to
>> > allowing for aligned spills beyond what the stack supports?
>>
>> Personally I'm not a fan. Your description of it makes it sound
>> immensely wasteful, and I'm really not clear on why stack alignment
>> wouldn't work for MIPS when it's been shown to work elsewhere. I think
>> we'd want to see a clear demonstration of unfixable problems with stack
>> alignment before allowing something like this in.
>>
>> Vlad would have to comment on the LRA bits, probably.
>>
>>
>> Bernd
>
> There probably is some way to get dynamic stack alignment to work on
> MIPS, but I am not sure I can do it.  The only platform that I see that
> uses dynamic stack alignment is x86.  I think the difficulties in
> getting this to work correctly is why no other platform has implemented
> it.  The most common response I have gotten when asking around for help
> on dynamic stack alignment is usually "just break the ABI".
>
> My approach does waste some space, on MIPS I would be allocating 32
> bytes of stack space to spill a 16 byte MSA register, but the
> hope/belief is that MSA registers would not get spilled very often.

We keep track of stack frame information precisely in the x86 backend,
including unwind info, and we fixed DWARF unwind info generation to
support it.  We used gcc_assert to verify that everything is in
sync.

I don't know what is missing to support MIPS dynamic stack alignment.
Unless DWARF unwind info isn't sufficient for MIPS, I don't see why
dynamic stack alignment can't be done for MIPS.  If you get the wrong
DWARF unwind info, you can add asserts to the GCC source to track down
its origin, then fix the code so that it generates the correct DWARF
unwind info.

-- 
H.J.

Re: [patch] Update template instantiation documentation

2015-10-05 Thread Jonathan Wakely


On 03/10/15 10:44 -0600, Sandra Loosemore wrote:

On 10/03/2015 06:47 AM, Jonathan Wakely wrote:

https://gcc.gnu.org/onlinedocs/gcc/Template-Instantiation.html
currently says that using -frepo "is your best option for application
code written for the Borland model, as it just works."

That was true at one point, but as can be seen from the mentions of
binutils 2.8 and Solaris 2, the information there is pretty old.

Since then -frepo has bitrotted occasionally, and it's much simpler to
rely on implicit instantiations in COMDAT sections, controlling
specific instantiations with explicit instantiations if needed (using
'extern template' which was standardised in C++11).
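
As a minimal sketch of that approach (the Box template and box_demo
function are hypothetical names, not taken from the GCC manual): an
explicit instantiation declaration with 'extern template' suppresses
implicit instantiation in the current translation unit, and a single
explicit instantiation definition provides the code.  Both appear in one
file here only to keep the example self-contained; normally the
declaration would live in a header and the definition in exactly one
source file.

```cpp
// Hypothetical template used only for illustration.
template <typename T>
struct Box {
  T value;
  T get() const { return value; }
};

// Explicit instantiation declaration: do not instantiate Box<int>
// implicitly in this translation unit.
extern template struct Box<int>;

// Explicit instantiation definition: emit Box<int> here.  When both
// appear in the same translation unit, the definition must follow the
// declaration.
template struct Box<int>;

// Uses the explicitly instantiated specialization.
int box_demo() {
  Box<int> b{42};
  return b.get();
}
```

Any specialization not named this way (Box<long>, say) is still
instantiated implicitly wherever it is used, which is the "do nothing"
default.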

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910#c2 for an
example of bitrot (now fixed) and people being persuaded by the docs
that -frepo is the best option.

So this revises the docs, to downplay the usefulness of -frepo,
and to endorse the "do nothing" model (with selective explicit
instantiations as needed).

It also changes another mention of -frepo to use a different C++-only
option, to further de-emphasize -frepo.

OK for trunk?


Thanks for tackling this.  I remember thinking that this section 
looked bit-rotted when I was reviewing the manual earlier this year.  
Your patch looks like a step in the right direction, but can I get you 
to fix a couple other things while you're at it?


First, I think the reference to ancient ld versions is confusing, and 
it would be better to rewrite that to emphasize that this is the 
default behavior on most targets.  (I'd guess that anybody trying to 
use a recent GCC release with an ld version from 1996 is going to run 
into more critical blocking issues than this one.)  Maybe something 
like:


"G++ implements the Borland model on targets where the linker supports 
it, including both ELF targets (such as GNU/Linux) and Microsoft 
Windows.  Otherwise G++ implements neither automatic model."


Second, if "Do nothing" is now the recommended way to handle this, 
let's move that option to the front of the itemized list instead of 
the end. Also, I'm confused by the "pretend" here; can we just delete 
that sentence?


Here's an updated patch with those changes.


commit 1e41b6aae69aff3126807dec2e3fd59ed85a0d0f
Author: Jonathan Wakely 
Date:   Mon Sep 1 15:50:24 2014 +0100

Update template instantiation documentation

	* doc/extend.texi (Template Instantiation): Reorder options and
	de-emphasize -frepo.
	* doc/invoke.texi (C++ Dialect Options): Use -fstrict-enums in
	example instead of -frepo.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8406945..2db7bb2 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -19574,8 +19574,8 @@ If any calls are not inlined, you will get linker errors.
 @section Where's the Template?
 @cindex template instantiation
 
-C++ templates are the first language feature to require more
-intelligence from the environment than one usually finds on a UNIX
+C++ templates were the first language feature to require more
+intelligence from the environment than was traditionally found on a UNIX
 system.  Somehow the compiler and linker have to make sure that each
 template instance occurs exactly once in the executable if it is needed,
 and not at all otherwise.  There are two basic approaches to this
@@ -19588,7 +19588,7 @@ equivalent of common blocks to their linker; the compiler emits template
 instances in each translation unit that uses them, and the linker
 collapses them together.  The advantage of this model is that the linker
 only has to consider the object files themselves; there is no external
-complexity to worry about.  This disadvantage is that compilation time
+complexity to worry about.  The disadvantage is that compilation time
 is increased because the template code is being compiled repeatedly.
 Code written for this model tends to include definitions of all
 templates in the header file, since they must be seen to be
@@ -19614,46 +19614,35 @@ of non-inline member templates into a separate file, which should be
 compiled separately.
 @end table
 
-When used with GNU ld version 2.8 or later on an ELF system such as
-GNU/Linux or Solaris 2, or on Microsoft Windows, G++ supports the
-Borland model.  On other systems, G++ implements neither automatic
-model.
+G++ implements the Borland model on targets where the linker supports it,
+including ELF targets (such as GNU/Linux), Mac OS X and Microsoft Windows.
+Otherwise G++ implements neither automatic model.
 
 You have the following options for dealing with template instantiations:
 
 @enumerate
 @item
-@opindex frepo
-Compile your template-using code with @option{-frepo}.  The compiler
-generates files with the extension @samp{.rpo} listing all of the
-template instantiations used in the corresponding object files that
-could be instantiated there; the link wrapper, @samp{collect2},
-then updates the @samp{.rpo} files to tell the compiler where to place
-those ins

Re: [PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Richard Biener

On October 5, 2015 4:46:44 PM GMT+02:00, Marek Polacek  
wrote:
>On Mon, Oct 05, 2015 at 04:26:49PM +0200, Richard Biener wrote:
>> > > Otherwise other setters of SSA_NAME_RANGE_INFO would need to make
>> > > sure SSA_NAME_ANTI_RANGE_P is cleared as well.
>> >  
>> > They mostly do, if I'm looking right, e.g.
>> > tree-ssa-phiopt.c:1016
>> > tree-ssa-loop-im.c:1224
>> > tree-ssa-loop-im.c:1294
>> > and I've also seen
>> > .
>> 
>> Those would be redundant then.
>
>I can give this a whirl (as a follow-up)...

OK for trunk.

Thanks,
Richard.

>2015-10-05  Marek Polacek  
>
>   * tree-ssa-loop-im.c
>   (move_computations_dom_walker::before_dom_children): Don't set
>   SSA_NAME_ANTI_RANGE_P.
>   * tree-ssa-phiopt.c (value_replacement): Likewise.
>
>diff --git gcc/tree-ssa-loop-im.c gcc/tree-ssa-loop-im.c
>index f3389a0..9b2436f 100644
>--- gcc/tree-ssa-loop-im.c
>+++ gcc/tree-ssa-loop-im.c
>@@ -1222,7 +1222,6 @@ move_computations_dom_walker::before_dom_children
>(basic_block bb)
>   {
> tree lhs = gimple_assign_lhs (new_stmt);
> SSA_NAME_RANGE_INFO (lhs) = NULL;
>-SSA_NAME_ANTI_RANGE_P (lhs) = 0;
>   }
>   gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
>   remove_phi_node (&bsi, false);
>@@ -1292,7 +1291,6 @@ move_computations_dom_walker::before_dom_children
>(basic_block bb)
>   {
> tree lhs = gimple_get_lhs (stmt);
> SSA_NAME_RANGE_INFO (lhs) = NULL;
>-SSA_NAME_ANTI_RANGE_P (lhs) = 0;
>   }
>   /* In case this is a stmt that is not unconditionally executed
>  when the target loop header is executed and the stmt may
>diff --git gcc/tree-ssa-phiopt.c gcc/tree-ssa-phiopt.c
>index 697836a..f33ca5c 100644
>--- gcc/tree-ssa-phiopt.c
>+++ gcc/tree-ssa-phiopt.c
>@@ -1014,7 +1014,6 @@ value_replacement (basic_block cond_bb,
>basic_block middle_bb,
>:
># u_3 = PHI   */
> SSA_NAME_RANGE_INFO (lhs) = NULL;
>-SSA_NAME_ANTI_RANGE_P (lhs) = 0;
> /* If available, we can use VR of phi result at least.  */
> tree phires = gimple_phi_result (phi);
> struct range_info_def *phires_range_info
>
>   Marek

Re: [PATCH] Clear SSA_NAME_ANTI_RANGE_P when appropriate (PR tree-optimization/67821)

2015-10-05 Thread Marek Polacek

On Mon, Oct 05, 2015 at 06:27:53PM +0200, Richard Biener wrote:
> On October 5, 2015 4:46:44 PM GMT+02:00, Marek Polacek  
> wrote:
> >On Mon, Oct 05, 2015 at 04:26:49PM +0200, Richard Biener wrote:
> >> > > Otherwise other setters of SSA_NAME_RANGE_INFO would need to make
> >> > > sure SSA_NAME_ANTI_RANGE_P is cleared as well.
> >> >  
> >> > They mostly do, if I'm looking right, e.g.
> >> > tree-ssa-phiopt.c:1016
> >> > tree-ssa-loop-im.c:1224
> >> > tree-ssa-loop-im.c:1294
> >> > and I've also seen
> >> > .
> >> 
> >> Those would be redundant then.
> >
> >I can give this a whirl (as a follow-up)...
> 
> OK for trunk.

Thanks.  Just for the record, I've successfully regtested/bootstrapped this
patch on x86_64-linux.  Applying to trunk.

> >2015-10-05  Marek Polacek  
> >
> > * tree-ssa-loop-im.c
> > (move_computations_dom_walker::before_dom_children): Don't set
> > SSA_NAME_ANTI_RANGE_P.
> > * tree-ssa-phiopt.c (value_replacement): Likewise.
> >
> >diff --git gcc/tree-ssa-loop-im.c gcc/tree-ssa-loop-im.c
> >index f3389a0..9b2436f 100644
> >--- gcc/tree-ssa-loop-im.c
> >+++ gcc/tree-ssa-loop-im.c
> >@@ -1222,7 +1222,6 @@ move_computations_dom_walker::before_dom_children
> >(basic_block bb)
> > {
> >   tree lhs = gimple_assign_lhs (new_stmt);
> >   SSA_NAME_RANGE_INFO (lhs) = NULL;
> >-  SSA_NAME_ANTI_RANGE_P (lhs) = 0;
> > }
> >   gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
> >   remove_phi_node (&bsi, false);
> >@@ -1292,7 +1291,6 @@ move_computations_dom_walker::before_dom_children
> >(basic_block bb)
> > {
> >   tree lhs = gimple_get_lhs (stmt);
> >   SSA_NAME_RANGE_INFO (lhs) = NULL;
> >-  SSA_NAME_ANTI_RANGE_P (lhs) = 0;
> > }
> >   /* In case this is a stmt that is not unconditionally executed
> >  when the target loop header is executed and the stmt may
> >diff --git gcc/tree-ssa-phiopt.c gcc/tree-ssa-phiopt.c
> >index 697836a..f33ca5c 100644
> >--- gcc/tree-ssa-phiopt.c
> >+++ gcc/tree-ssa-phiopt.c
> >@@ -1014,7 +1014,6 @@ value_replacement (basic_block cond_bb,
> >basic_block middle_bb,
> >  :
> >  # u_3 = PHI   */
> >   SSA_NAME_RANGE_INFO (lhs) = NULL;
> >-  SSA_NAME_ANTI_RANGE_P (lhs) = 0;
> >   /* If available, we can use VR of phi result at least.  */
> >   tree phires = gimple_phi_result (phi);
> >   struct range_info_def *phires_range_info

Marek

Re: [patch] Update template instantiation documentation

2015-10-05 Thread Jason Merrill


Looks good to me, thanks.

Jason

Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread Steve Ellcey

On Mon, 2015-10-05 at 09:21 -0700, H.J. Lu wrote:
> On Mon, Oct 5, 2015 at 9:10 AM, Steve Ellcey  wrote:

> > There probably is some way to get dynamic stack alignment to work on
> > MIPS, but I am not sure I can do it.  The only platform that I see that
> > uses dynamic stack alignment is x86.  I think the difficulties in
> > getting this to work correctly is why no other platform has implemented
> > it.  The most common response I have gotten when asking around for help
> > on dynamic stack alignment is usually "just break the ABI".
> >
> > My approach does waste some space, on MIPS I would be allocating 32
> > bytes of stack space to spill a 16 byte MSA register, but the
> > hope/belief is that MSA registers would not get spilled very often.
> 
> We keep track stack frame information precisely in x86 backend,
> including unwind info, and we fixed DWARF unwind info generation to
> support it.  We used gcc_assert to verify that everything is in
> sync.
> 
> I don't know what is missing to support MIPS dynamic stack alignment.
> Unless DWARF unwind info isn't sufficient for MIPS, I don't see why
> dynamic stack alignment can't be done for MIPS.  If you get the wrong
> DWARF unwind info, you can add assert to GCC source to track down
> its origin and fix assert to generate the correct DWARF unwind info.

The problem is that I don't know what is missing either.  I don't know
what the 'correct' stack frame information should look like so I don't
really know what I am trying to generate.  readelf cannot decode the
unwind section of MIPS objects so I can't look at things that way and I
have been trying to work based on what .cfi directives I think I should
be generating and that has not been going well.

One example of an issue I have run into is with the DWARF unwind
generation and 'Rule 16' in dwarf2cfi.c.  It assumes the AND instruction
has an integer constant argument but MIPS can't do an AND with a
constant like -16 so it has to put it in a register first and do the AND
with the register.  I just hacked that code as a temporary workaround
but it's the sort of x86 assumption that I have run into due to the fact
that no platform other than x86 currently does dynamic stack alignment.
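
For reference, the AND with -16 mentioned above is the usual way to round
an address down to a 16-byte boundary.  The following is only a sketch of
that arithmetic, plus the "pad each spill slot" idea from earlier in the
thread; align_down and align_up are hypothetical helper names and nothing
here is GCC code.

```cpp
#include <cstdint>

// Round addr down to a multiple of align, where align is a power of two.
// For unsigned values ~(align - 1) has the same bit pattern as -align,
// so this is the "AND with -16" operation when align is 16.
std::uintptr_t align_down(std::uintptr_t addr, std::uintptr_t align) {
  return addr & ~(align - 1);
}

// Round addr up to a multiple of align.  With a padded spill slot, this
// picks the aligned address inside the slot without touching the stack
// pointer itself.
std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t align) {
  return align_down(addr + align - 1, align);
}
```

With an 8-byte aligned slot at 0x1008 and 32 bytes reserved, align_up
gives 0x1010, and a 16-byte value stored there ends at 0x1020, inside the
slot.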

Steve Ellcey
sell...@imgtec.com

Re: [C++ Patch] PR 53856

2015-10-05 Thread Paolo Carlini


Hi,

On 09/24/2015 03:24 PM, Jason Merrill wrote:

On 09/22/2015 03:31 PM, Paolo Carlini wrote:

 msg = G_("default template arguments may not be used in "
  "partial specializations");
+  else if (current_class_type && !CLASSTYPE_IS_TEMPLATE 
(current_class_type))

+/* Per [temp.param]/9, "A default template-argument shall not be
+   specified in the template-parameter-lists of the definition of
+   a member of a class template that appears outside of the 
member's

+   class.", thus if we aren't handling a member of a class template
+   there is no need to examine the parameters.  */
+last_level_to_check = template_class_depth (current_class_type) 
+ 1;

   else
 msg = G_("default argument for template parameter for class 
enclosing %qD");


Why not handle this below, with the other code that sets 
last_level_to_check?

First, sorry for the late reply (I was away on vacation for a few days)...

In general, the rationale behind changing that earlier conditional was 
to restrict the change to the specific case at issue and avoid affecting 
other default-related diagnostics. That said, the patch was subtly wrong 
anyway, because it relied on msg remaining zero for the second call of 
check_default_tmpl_args for defarg20.C, thus the testcase was correctly 
accepted but with a zeroed default.


Today, it occurred to me that maybe we can even more directly avoid 
emitting a meaningless "default argument for template parameter for 
class enclosing" for an enclosing class which isn't a template. Thus the 
below, which passes the testsuite and also more complex variants 
(deeper nesting) of the new testcases. Does it make sense to you?


Thanks,
Paolo.

//
Index: cp/pt.c
===
--- cp/pt.c (revision 228467)
+++ cp/pt.c (working copy)
@@ -4940,8 +4940,15 @@ check_default_tmpl_args (tree decl, tree parms, bo
   else if (is_partial)
 msg = G_("default template arguments may not be used in "
 "partial specializations");
+  else if (!current_class_type || CLASSTYPE_IS_TEMPLATE (current_class_type))
+msg = G_("default argument for template parameter for class enclosing 
%qD");
   else
-msg = G_("default argument for template parameter for class enclosing 
%qD");
+/* Per [temp.param]/9, "A default template-argument shall not be
+   specified in the template-parameter-lists of the definition of
+   a member of a class template that appears outside of the member's
+   class.", thus if we aren't handling a member of a class template
+   there is no need to examine the parameters.  */
+return true;
 
   if (current_class_type && TYPE_BEING_DEFINED (current_class_type))
 /* If we're inside a class definition, there's no need to
Index: testsuite/g++.dg/template/defarg19.C
===
--- testsuite/g++.dg/template/defarg19.C(revision 0)
+++ testsuite/g++.dg/template/defarg19.C(working copy)
@@ -0,0 +1,15 @@
+// PR c++/53856
+
+template
+struct A
+{
+  struct B;
+};
+
+template
+struct A::B  // { dg-error "default argument" }
+{
+  int i;
+};
+
+A::B b = { };
Index: testsuite/g++.dg/template/defarg20.C
===
--- testsuite/g++.dg/template/defarg20.C(revision 0)
+++ testsuite/g++.dg/template/defarg20.C(working copy)
@@ -0,0 +1,15 @@
+// PR c++/53856
+
+struct A
+{
+  template
+  struct B;
+};
+
+template
+struct A::B
+{
+  int i;
+};
+
+A::B b = { };

Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread H.J. Lu

On Mon, Oct 5, 2015 at 9:46 AM, Steve Ellcey  wrote:
> On Mon, 2015-10-05 at 09:21 -0700, H.J. Lu wrote:
>> On Mon, Oct 5, 2015 at 9:10 AM, Steve Ellcey  wrote:
>
>> > There probably is some way to get dynamic stack alignment to work on
>> > MIPS, but I am not sure I can do it.  The only platform that I see that
>> > uses dynamic stack alignment is x86.  I think the difficulties in
>> > getting this to work correctly is why no other platform has implemented
>> > it.  The most common response I have gotten when asking around for help
>> > on dynamic stack alignment is usually "just break the ABI".
>> >
>> > My approach does waste some space, on MIPS I would be allocating 32
>> > bytes of stack space to spill a 16 byte MSA register, but the
>> > hope/belief is that MSA registers would not get spilled very often.
>>
>> We keep track stack frame information precisely in x86 backend,
>> including unwind info, and we fixed DWARF unwind info generation to
>> support it.  We used gcc_assert to verify that everything is in
>> sync.
>>
>> I don't know what is missing to support MIPS dynamic stack alignment.
>> Unless DWARF unwind info isn't sufficient for MIPS, I don't see why
>> dynamic stack alignment can't be done for MIPS.  If you get the wrong
>> DWARF unwind info, you can add assert to GCC source to track down
>> its origin and fix assert to generate the correct DWARF unwind info.
>
> The problem is that I don't know what is missing either.  I don't know
> what the 'correct' stack frame information should look like so I don't
> really know what I am trying to generate.  readelf cannot decode the
> unwind section of MIPS objects so I can't look at things that way and I
> have been trying to work based on what .cfi directives I think I should
> be generating and that has not been going well.

Does MIPS use DWARF unwind info? If yes, it should be easy
to fix readelf to dump MIPS unwind info.  If not, the current dynamic
stack realignment scheme won't work for you.

> One example of an issue I have run into is with the DWARF unwind
> generation and 'Rule 16' in dwarf2cfi.c.  It assumes the AND instruction
> has an integer constant argument but MIPS can't do an AND with a
> constant like -16 so it has to put it in a register first and do the AND
> with the register.  I just hacked that code as a temporary workaround
> but its the sort of x86 assumption that I have run into due to the fact
> that no platform other than x86 currently does dynamic stack alignment.

You need to update dwarf2cfi.c to generate proper unwind info for
whatever frame instructions MIPS generates, like what we did for
x86 dynamic stack realignment.

-- 
H.J.

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Richard Sandiford

Bernd Schmidt  writes:
> On 10/05/2015 05:22 PM, Richard Sandiford wrote:
>> Bernd Schmidt  writes:
>>> On 10/05/2015 04:47 PM, Richard Sandiford wrote:
>>>> @@ -9536,7 +9520,7 @@ fold_builtin_classify (location_t loc, tree fndecl, 
>>>> tree arg, int builtin_index)
>>>>{
>>>>  r = TREE_REAL_CST (arg);
>>>>  if (real_isinf (&r))
>>>> -  return real_compare (GT_EXPR, &r, &dconst0)
>>>> +  return real_compare (GT_EXPR, &r, &dconst<0> ())
>>>>   ? integer_one_node : integer_minus_one_node;
>>>>  else
>>>>return integer_zero_node;
>>>
>>> So... are the templates magic enough not to make us create a new
>>> temporary every time this is used?
>>
>> Yeah, the static variables become comdat objects keyed off the full name
>> (dconst<...>::value).  They're shared between calls and between TUs.
>
> Hmm. And since you're returning a reference, taking the address works. 
> The whole thing is subtle enough that it deserves a comment. Since this 
> kind of thing is something I don't like about C++ (simple-looking code 
> expanding into non-obvious behaviour) I'm not going to ack this patch, 
> but if someone else wants to, that's fine.
>
> I do believe you still have some code growth since the inline dconst 
> function always expands code that will initialize the constant. IMO 
> that's not desirable.

I don't disagree.  I find dconst0 much easier to read than dconst<0> ().
In some ways I was posting the patch to show how bad it could be :-)

If we're prepared to pay the cost of unconditional load-time
initialisation, we could have:

template 
struct dconst_value
{
  dconst_value () { ...set up value ...; }
  REAL_VALUE_TYPE value;
};

template 
const REAL_VALUE_TYPE &
dconst (void)
{
  static dconst_value x;
  return x.value;
}

But this feels like using templates just to prove we know how they work,
rather than because they're the right tool for the job.

If my original patch isn't acceptable, another old-school way of doing
it would be to have a .def file of all the constants that we want.
They could then all be global variables, rather than having some that
are and some that aren't.  They could be initialised at the same point
that dconst0 etc. are initialised now.

E.g. we could have dconst1_3 for 1/3, dconst1_9 for 1/9, etc.  There's
a slight problem that those names are already used for local, rounded,
versions of the constants in some places, but that's easy to fix.
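
A rough sketch of the .def idea, written as a single X-macro list so it
compiles on its own.  The REAL_CONSTANTS list, the macro names and the
use of double instead of REAL_VALUE_TYPE are assumptions; in GCC the list
would be a separate .def file that is included more than once.

```cpp
// Hypothetical list of constants: name, numerator, denominator.
#define REAL_CONSTANTS(X) \
  X(dconst1_3, 1, 3) \
  X(dconst1_4, 1, 4) \
  X(dconst1_9, 1, 9)

// First expansion: define one global variable per entry.
#define DEFINE_REAL_CONSTANT(NAME, NUM, DEN) double NAME;
REAL_CONSTANTS(DEFINE_REAL_CONSTANT)
#undef DEFINE_REAL_CONSTANT

// Second expansion: initialise them all at one well-defined point.
void init_real_constants() {
#define INIT_REAL_CONSTANT(NAME, NUM, DEN) \
  NAME = static_cast<double>(NUM) / (DEN);
  REAL_CONSTANTS(INIT_REAL_CONSTANT)
#undef INIT_REAL_CONSTANT
}
```

Adding a constant then means adding one line to the list; the
declaration and the initialisation stay in sync automatically.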

Thanks,
Richard

Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 06:46 PM, Steve Ellcey wrote:


One example of an issue I have run into is with the DWARF unwind
generation and 'Rule 16' in dwarf2cfi.c.  It assumes the AND instruction
has an integer constant argument but MIPS can't do an AND with a
constant like -16 so it has to put it in a register first and do the AND
with the register.  I just hacked that code as a temporary workaround
but its the sort of x86 assumption that I have run into due to the fact
that no platform other than x86 currently does dynamic stack alignment.


From what I recall of dwarf2cfi, that issue is already solved for 
things like additions by using the cfa_temp mechanism. Look at rule 6. I 
think you'd need to extend the rule 16 code to use that when it 
encounters a register instead of a constant.


I think time would be better spent pursuing this approach than on a 
register allocator hack.



Bernd

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Bernd Schmidt


On 10/05/2015 07:00 PM, Richard Sandiford wrote:


If my original patch isn't acceptable,


I thought I'd approved it.


another old-school way of doing
it would be to have a .def file of all the constants that we want.
They could then all be global variables, rather than having some that
are and some that aren't.  They could be initialised at the same point
that dconst0 etc. are initialised now.


The thought had crossed my mind, but how many such constants are there, 
really? All this seems like a bigger hammer than necessary.



Bernd

Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion

2015-10-05 Thread Michael Meissner

On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote:
> PR67808 exposes a problem with the constraints in the *extenddftf2_internal
> pattern, in that it allows TFmode operands to occupy Altivec registers
> which they are not allowed to do.  Reload was able to work around the
> problem, but LRA is more pedantic and it caused it to go into an infinite
> spill loop until it ICEd.  The following patch from Mike changes the TFmode
> output operand to use the "d" constraint instead of "ws".  It also allows
> using the "ws" constraint for the two input operands, since that is allowed
> for DFmode operands.
> 
> This passed bootstraps (with reload on by default and lra on by default)
> and shows no testsuite regressions.  Is this ok for trunk?
> 
> The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for
> that too, assuming my bootstrap/regtesting there are clean?
> 
> Peter
> 
> 
> gcc/
>   PR target/67808
>   * config/rs6000/rs6000.md (*extenddftf2_internal): Fix constraints.
> 
> gcc/testsuite/
> 
>   * gcc.target/powerpc/pr67808.c: New test.
> 

In looking at the constraints in more detail, after the patch we have the
following alternatives:

  #1:   op0 = m,  op1 = ws, op2 = ws
  #2:   op0 = Y,  op1 = r,  op2 = r
  #3:   op0 = d,  op1 = md, op2 = j
  #4:   op0 = d,  op1 = md, op2 = m
  #5:   op0 = &d, op1 = md, op2 = ws

I.e.

  #1:   Store result, input in VSX register, 0.0 in VSX register (VSX only)
  #2:   Store result, input in GPR register, 0.0 in GPR register
  #3:   Result in FPR register, input in FPR or memory, 0.0 direct (VSX only)
  #4:   Result in FPR register, input in FPR or memory, 0.0 in memory
  #5:   Result in FPR reg (no overlap), input in FPR/memory, 0.0 in VSX reg

So, the non-VSX case (where ws is NO_REGS) only deals with alternatives #2 and
#4.

I think (but I don't have a test case) that alternative #1 is potentially a
problem if the input register is ever allocated to an Altivec register and the
address mode is reg+offset (in which case we would not be able to form the
address after the insn is split post-reload).

I have attached a better version of the patch.

This gives the constraints:

  #1:   op0 = m,  op1 = d,  op2 = d
  #2:   op0 = Y,  op1 = r,  op2 = r
  #3:   op0 = d,  op1 = ws, op2 = j
  #4:   op0 = d,  op1 = md, op2 = m
  #5:   op0 = &d, op1 = md, op2 = md

I.e.

  #1:   Store result, input in FPR register, 0.0 in FPR register
  #2:   Store result, input in GPR register, 0.0 in GPR register
  #3:   Result in FPR reg, input in VSX reg, 0.0 direct (VSX only)
  #4:   Result in FPR reg, input in FPR/memory, 0.0 in memory
  #5:   Result in FPR reg, input in FPR/memory, 0.0 in FPR/memory (no overlap)

[gcc]
2015-10-05  Peter Bergner 
Michael Meissner  

PR target/67808
* config/rs6000/rs6000.md (extenddftf2_internal): Fix up
constraints.

[gcc/testsuite]
2015-10-05  Peter Bergner 

PR target/67808
* gcc.target/powerpc/pr67808.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 228495)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6505,9 +6505,9 @@ (define_expand "extenddftf2_fprs"
 })
 
 (define_insn_and_split "*extenddftf2_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d")
-   (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md")))
-   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d"))]
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,d,d,&d")
+   (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,ws,md,md")))
+   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,md"))]
   "!TARGET_IEEEQUAD
&& TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
&& TARGET_LONG_DOUBLE_128"
Index: gcc/testsuite/gcc.target/powerpc/pr67808.c
===
--- gcc/testsuite/gcc.target/powerpc/pr67808.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr67808.c  (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O1 -mvsx -mlra" } */
+
+/* PR 67808: LRA ICEs on simple double to long double conversion test case */
+
+void
+foo (long double *ldb1, double *db1)
+{
+  *ldb1 = *db1;
+}

RE: [PATCH, MIPS] Frame header optimization for MIPS O32 ABI

2015-10-05 Thread Steve Ellcey

On Mon, 2015-09-28 at 22:10 +, Moore, Catherine wrote:

> Hi Steve, I'm sorry for the delay in reviewing this patch. 
> Some changes have been committed upstream (see revision #227941) that will
> require updates to this patch.
> Please post the update for review.  Other comments are embedded.

OK, I have updated the comments based on your input and changed the code
to compile with the ToT GCC after revision #227941.  Here is the new
patch.


2015-10-05  Steve Ellcey  

* config.gcc (mips*-*-*): Add frame-header-opt.o to extra_objs.
* frame-header-opt.c: New file.
* config/mips/mips-proto.h (mips_register_frame_header_opt):
Add prototype.
* config/mips/mips.c (mips_compute_frame_info): Check
optimize_call_stack flag.
(mips_option_override): Register new frame_header_opt pass.
(mips_frame_info, mips_int_mask, mips_shadow_set,
machine_function): Move these types to...
* config/mips/mips.h: here.
(machine_function): Add does_not_use_frame_header and
optimize_call_stack fields.
* config/mips/t-mips (frame-header-opt.o): Add new make rule.
* doc/invoke.texi (-mframe-header-opt, -mno-frame-header-opt):
Document new flags.
* config/mips/mips.opt (mframe-header-opt): Add new option.


diff --git a/gcc/config.gcc b/gcc/config.gcc
index 56797bd..7bf66f8 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -420,6 +420,7 @@ microblaze*-*-*)
 mips*-*-*)
cpu_type=mips
extra_headers="loongson.h"
+   extra_objs="frame-header-opt.o"
extra_options="${extra_options} g.opt fused-madd.opt 
mips/mips-tables.opt"
;;
 nds32*)
diff --git a/gcc/config/mips/frame-header-opt.c 
b/gcc/config/mips/frame-header-opt.c
new file mode 100644
index 000..7c7b1f2
--- /dev/null
+++ b/gcc/config/mips/frame-header-opt.c
@@ -0,0 +1,216 @@
+/* Analyze functions to determine if callers need to allocate a frame header
+   on the stack.  The frame header is used by callees to save their arguments.
+   This optimization is specific to TARGET_OLDABI targets.  For TARGET_NEWABI
+   targets, if a frame header is required, it is allocated by the callee.
+
+
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+
+#include "config.h"
+#include "system.h"
+#include "context.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "tree-core.h"
+#include "tree-pass.h"
+#include "target.h"
+#include "target-globals.h"
+#include "cfg.h"
+#include "cgraph.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-walk.h"
+
+static unsigned int frame_header_opt (void);
+
+namespace {
+
+const pass_data pass_data_ipa_frame_header_opt =
+{
+  IPA_PASS, /* type */
+  "frame-header-opt", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_CGRAPHOPT, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_ipa_frame_header_opt : public ipa_opt_pass_d
+{
+public:
+  pass_ipa_frame_header_opt (gcc::context *ctxt)
+: ipa_opt_pass_d (pass_data_ipa_frame_header_opt, ctxt,
+  NULL, /* generate_summary */
+  NULL, /* write_summary */
+  NULL, /* read_summary */
+  NULL, /* write_optimization_summary */
+  NULL, /* read_optimization_summary */
+  NULL, /* stmt_fixup */
+  0, /* function_transform_todo_flags_start */
+  NULL, /* function_transform */
+  NULL) /* variable_transform */
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+{
+  /* This optimization has no effect if TARGET_NEWABI.   If optimize
+ is not at least 1 then the data needed for the optimization is
+ not available and nothing will be done anyway.  */
+  return TARGET_OLDABI && flag_frame_header_optimization;
+}
+
+  virtual unsigned int execute (function *) { return frame_header_opt (); }
+
+}; // class pass_ipa_frame_header_opt
+
+} // anon namespace
+
+static ipa_opt_pass_d *
+make_pass_ipa_frame_header_opt (gcc::context *ctxt)
+{
+  return new pass_ipa_frame_header_opt (ctxt);
+}
+
+void

Re: [C++ Patch] PR 53856

2015-10-05 Thread Jason Merrill


On 10/05/2015 12:50 PM, Paolo Carlini wrote:

+  else if (!current_class_type || CLASSTYPE_IS_TEMPLATE (current_class_type))
+msg = G_("default argument for template parameter for class enclosing %qD");


Why would this be right when !current_class_type?

Jason

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Marc Glisse


On Mon, 5 Oct 2015, Richard Sandiford wrote:


I do believe you still have some code growth since the inline dconst
function always expands code that will initialize the constant. IMO
that's not desirable.


I don't disagree.  I find dconst0 much easier to read than dconst<0> ().
In some ways I was posting the patch to show how bad it could be :-)

If we're prepared to pay the cost of unconditional load-time
initialisation, we could have:

template 
struct dconst_value
{
 dconst_value () { ...set up value ...; }
 REAL_VALUE_TYPE value;
};

template 
const REAL_VALUE_TYPE &
dconst (void)
{
 static dconst_value x;
 return x.value;
}


Why are you calling this load-time initialization? As far as I know, such 
a static object is still initialized lazily (with all the overhead this 
implies). Did you mean to make value a static member of dconst_value?


--
Marc Glisse
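
The point about lazy initialization can be illustrated with a small, self-contained sketch. It is not GCC code: cached_value, lazy_const, event_log and observe_lazy_init are hypothetical names, and an int stands in for REAL_VALUE_TYPE. It only shows that a function-local static is constructed on the first call rather than at load time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Records construction events so the time of initialization can be observed.
inline std::vector<std::string> &
event_log ()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for the constant holder; an int replaces
// REAL_VALUE_TYPE to keep the sketch self-contained.
template <int N>
struct cached_value
{
  cached_value () : value (N)
  {
    event_log ().push_back ("init " + std::to_string (N));
  }
  int value;
};

// Function-local static: constructed the first time control passes
// through the declaration, with a once-only guard checked on each call.
template <int N>
const int &
lazy_const ()
{
  static cached_value<N> x;
  return x.value;
}

// Returns the number of constructions seen before the first call,
// after the first call, and after a second call.
inline std::vector<std::size_t>
observe_lazy_init ()
{
  std::vector<std::size_t> counts;
  counts.push_back (event_log ().size ());  // nothing constructed yet
  lazy_const<4> ();
  counts.push_back (event_log ().size ());  // constructed on first use
  lazy_const<4> ();
  counts.push_back (event_log ().size ());  // not constructed again
  return counts;
}
```

Making the value a static data member of the template, as suggested above, would move the construction out of the accessor, at the cost of running it whether or not the constant is ever used.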

[PATCH v2, i386]: Enable -mstackrealign and 'force_align_arg_pointer' attribute for x86_64

2015-10-05 Thread Uros Bizjak

On Sun, Oct 4, 2015 at 5:26 PM, Uros Bizjak  wrote:

> As shown in PR 66697 [1] and WineHQ bug [2], an application can
> misalign incoming stack to less than ABI mandated 16 bytes. While it
> is possible to use -mincoming-stack-boundary=2  (= 4 bytes) for 32 bit
> targets to emit stack realignment code, this option is artificially
> limited to 4 (= 16  bytes) for 64bit targets.

Attached v2 patch goes all the way to enable -mstackrealign and
'force_align_arg_pointer' attribute for x86_64. In addition to
-mincoming-stack-boundary changes, the patch changes
MIN_STACK_BOUNDARY definition to 8bytes on 64bit targets, as this is
really the minimum supported stack boundary.

This patch is also needed to allow stack realignment in the interrupt handler.
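
As a reading aid, the option value is a power-of-two exponent, so 2 means 4 bytes, 3 means 8 bytes and 4 means 16 bytes. The sketch below restates that arithmetic and the range check as it appears in the patched ix86_option_override_internal hunk further down. The function names boundary_bytes and incoming_boundary_ok are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Convert an -mincoming-stack-boundary exponent to a byte count.
constexpr unsigned
boundary_bytes (unsigned exponent)
{
  return 1u << exponent;
}

// Mirrors the patched check: the minimum exponent is 3 on 64-bit
// targets and 2 on 32-bit targets; the maximum stays at 12.
constexpr bool
incoming_boundary_ok (bool is_64bit, int arg)
{
  return arg >= (is_64bit ? 3 : 2) && arg <= 12;
}

static_assert (boundary_bytes (2) == 4, "32-bit minimum is 4 bytes");
static_assert (boundary_bytes (3) == 8, "64-bit minimum is 8 bytes");
static_assert (boundary_bytes (4) == 16, "ABI-mandated 16 bytes");
```

Before the patch, the 64-bit minimum was 4 when SSE was enabled, which is why an 8-byte incoming boundary could not be requested there.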

2015-10-05  Uros Bizjak  

PR target/66697
* config/i386/i386.h (MIN_STACK_BOUNDARY): Redefine as BITS_PER_WORD.
* config/i386/i386.c (ix86_option_override_internal): Lower minimum
allowed incoming stack boundary to 3 for 64bit SSE targets.
(ix86_minimum_incoming_stack_boundary): Also initialize
incoming_stack_boundary to MIN_STACK_BOUNDARY for 64bit targets
when -mstackrealign or force_align_arg_pointer attribute is used.
(ix86_handle_force_align_arg_pointer_attribute): New function.
(ix86_attribute_table): Use it for force_align_arg_pointer attribute.

testsuite/ChangeLog:

2015-10-05  Uros Bizjak  

PR target/66697
* gcc.target/i386/20060512-1.c: Also run on 64bit targets.
(PUSH, POP): New defines.
(sse2_test): Use PUSH and POP to misalign runtime stack.
* gcc.target/i386/20060512-2.c: Also compile on 64bit targets.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32}. I plan to commit this patch to SVN tomorrow, and also to
backport the patch to gcc-5 branch, but no sooner than a couple of weeks
without problems in mainline, and after receiving confirmation from
Cygwin people that the patch cures their problems.

Uros.
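
For illustration only, this is roughly how the attribute is applied to a function that foreign code might enter with a misaligned stack. The macro FORCE_ALIGN_ARG_POINTER and the function callback_with_aligned_local are hypothetical names; the guard keeps the sketch compilable on non-x86 targets, where the attribute does not apply.

```cpp
#include <cassert>
#include <cstdint>

// Only x86 targets know this attribute, so hide it elsewhere.
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
# define FORCE_ALIGN_ARG_POINTER __attribute__ ((force_align_arg_pointer))
#else
# define FORCE_ALIGN_ARG_POINTER
#endif

// Hypothetical entry point.  With the attribute, the prologue realigns
// the stack, so a 16-byte aligned local is safe even if the caller
// did not keep the stack aligned.
FORCE_ALIGN_ARG_POINTER bool
callback_with_aligned_local ()
{
  alignas (16) volatile unsigned char buffer[16] = { 0 };
  return reinterpret_cast<std::uintptr_t> (&buffer[0]) % 16 == 0;
}
```

When called from normally compiled code the check passes with or without the attribute; the attribute matters for callers that violate the ABI, as in the Wine case mentioned above.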
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 228488)
+++ config/i386/i386.c  (working copy)
@@ -5102,8 +5102,7 @@ ix86_option_override_internal (bool main_args_p,
   ix86_incoming_stack_boundary = ix86_default_incoming_stack_boundary;
   if (opts_set->x_ix86_incoming_stack_boundary_arg)
 {
-  int min = (TARGET_64BIT_P (opts->x_ix86_isa_flags)
-? (TARGET_SSE_P (opts->x_ix86_isa_flags) ? 4 : 3) : 2);
+  int min = TARGET_64BIT_P (opts->x_ix86_isa_flags) ? 3 : 2;
 
   if (opts->x_ix86_incoming_stack_boundary_arg < min
  || opts->x_ix86_incoming_stack_boundary_arg > 12)
@@ -11779,6 +11778,25 @@ find_drap_reg (void)
 }
 }
 
+/* Handle a "force_align_arg_pointer" attribute.  */
+
+static tree
+ix86_handle_force_align_arg_pointer_attribute (tree *node, tree name,
+  tree, int, bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_TYPE
+  && TREE_CODE (*node) != METHOD_TYPE
+  && TREE_CODE (*node) != FIELD_DECL
+  && TREE_CODE (*node) != TYPE_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute only applies to functions",
+  name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Return minimum incoming stack alignment.  */
 
 static unsigned int
@@ -11789,11 +11807,10 @@ ix86_minimum_incoming_stack_boundary (bool sibcall
   /* Prefer the one specified at command line. */
   if (ix86_user_incoming_stack_boundary)
 incoming_stack_boundary = ix86_user_incoming_stack_boundary;
-  /* In 32bit, use MIN_STACK_BOUNDARY for incoming stack boundary
+  /* Use MIN_STACK_BOUNDARY for incoming stack boundary
  if -mstackrealign is used, it isn't used for sibcall check and
  estimated stack alignment is 128bit.  */
   else if (!sibcall
-  && !TARGET_64BIT
   && ix86_force_align_arg_pointer
   && crtl->stack_alignment_estimated == 128)
 incoming_stack_boundary = MIN_STACK_BOUNDARY;
@@ -48050,7 +48067,7 @@ static const struct attribute_spec ix86_attribute_
 true },
   /* force_align_arg_pointer says this function realigns the stack at entry.  */
   { (const char *)&ix86_force_align_arg_pointer_string, 0, 0,
-false, true,  true, ix86_handle_cconv_attribute, false },
+false, true,  true, ix86_handle_force_align_arg_pointer_attribute, false },
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
   { "dllimport", 0, 0, false, false, false, handle_dll_attribute, false },
   { "dllexport", 0, 0, false, false, false, handle_dll_attribute, false },
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 228488)
+++ config/i386/i386.h  (working copy)
@@ -752,7 +752,7 @@ extern const char *host_detect_local_cpu (int argc
 #define MAIN_STACK_BOUNDARY (TARGET_64BIT ? 128 : 32)
 
 /* Minimum stack boundary.  */
-#define MIN_STACK_BOUNDARY (TARGET_64BIT ? (TARGET_SSE ? 128 : 64

Re: [PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion

2015-10-05 Thread Peter Bergner

On Mon, 2015-10-05 at 13:12 -0400, Michael Meissner wrote:
> I have attached a better version of the patch.

I'll note that I have not committed the earlier patch and will hold
off while we sort out what is best here.

> This gives the constraints:
> 
>   #1: op0 = m,  op1 = d,  op2 = d
>   #2: op0 = Y,  op1 = r,  op2 = r
>   #3: op0 = d,  op1 = ws, op2 = j
>   #4: op0 = d,  op1 = md, op2 = m
>   #5: op0 = &d, op1 = m,  op2 = md
> 
> I.e.
> 
>   #1: Store result, input in FPR register, 0.0 in FPR register
>   #2: Store result, input in GPR register, 0.0 in GPR register
>   #3: Result in FPR reg, input in VSX reg, 0.0 direct (VSX only)
>   #4: Result in FPR reg, input in FPR/memory, 0.0 in memory
>   #5: Result in FPR reg, input in FPR/memory, 0.0 in FPR/memory (no overlap)

As we discussed on IRC, this alt #3 now does not accept memory as an
input operand and that is the only alternative that allows us to
generate a xxlxor to create a zero fp value.  Either we can change #3's
op1 from "ws" to "mws" or just create another alternative.  I'll let
you decide what works best.

Since this test is testing whether we ICE when -mlra -mvsx is enabled,
how about if we verify we're also getting the xxlxor too, with the
addition of:

/* { dg-final { scan-assembler-times "xxlxor" 1 } } */

to the test case?

Peter

Re: [C++ Patch] PR 53856

2015-10-05 Thread Paolo Carlini


Hi,

On 10/05/2015 07:10 PM, Jason Merrill wrote:

On 10/05/2015 12:50 PM, Paolo Carlini wrote:
+  else if (!current_class_type || CLASSTYPE_IS_TEMPLATE (current_class_type))
+msg = G_("default argument for template parameter for class enclosing %qD");


Why would this be right when !current_class_type?

Yes, it doesn't make much sense, but at least it's conservatively
correct ;) I'm finishing regtesting the below, everything seems fine so
far. Ok if it passes?


Thanks,
Paolo.

//
Index: cp/pt.c
===
--- cp/pt.c (revision 228467)
+++ cp/pt.c (working copy)
@@ -4940,8 +4940,15 @@ check_default_tmpl_args (tree decl, tree parms, bo
   else if (is_partial)
 msg = G_("default template arguments may not be used in "
 "partial specializations");
+  else if (current_class_type && CLASSTYPE_IS_TEMPLATE (current_class_type))
+msg = G_("default argument for template parameter for class enclosing %qD");
   else
-msg = G_("default argument for template parameter for class enclosing %qD");
+/* Per [temp.param]/9, "A default template-argument shall not be
+   specified in the template-parameter-lists of the definition of
+   a member of a class template that appears outside of the member's
+   class.", thus if we aren't handling a member of a class template
+   there is no need to examine the parameters.  */
+return true;
 
   if (current_class_type && TYPE_BEING_DEFINED (current_class_type))
 /* If we're inside a class definition, there's no need to
Index: testsuite/g++.dg/template/defarg19.C
===
--- testsuite/g++.dg/template/defarg19.C(revision 0)
+++ testsuite/g++.dg/template/defarg19.C(working copy)
@@ -0,0 +1,15 @@
+// PR c++/53856
+
+template
+struct A
+{
+  struct B;
+};
+
+template
+struct A::B  // { dg-error "default argument" }
+{
+  int i;
+};
+
+A::B b = { };
Index: testsuite/g++.dg/template/defarg20.C
===
--- testsuite/g++.dg/template/defarg20.C(revision 0)
+++ testsuite/g++.dg/template/defarg20.C(working copy)
@@ -0,0 +1,15 @@
+// PR c++/53856
+
+struct A
+{
+  template
+  struct B;
+};
+
+template
+struct A::B
+{
+  int i;
+};
+
+A::B b = { };

Re: [C++ Patch] PR 53856

2015-10-05 Thread Jason Merrill


OK.

Jason

Re: RFC: Patch to allow spill slot alignment greater than the stack alignment

2015-10-05 Thread Mike Stump

On Oct 5, 2015, at 9:46 AM, Steve Ellcey  wrote:
> One example of an issue I have run into is with the DWARF unwind
> generation and 'Rule 16' in dwarf2cfi.c.  It assumes the AND instruction
> has an integer constant argument but MIPS can't do an AND with a
> constant like -16 so it has to put it in a register first and do the AND
> with the register.

If you are using any detail about the architecture to imagine limitations with
dwarf generation, then I think you’re missing the fact that the validity of
the dwarf can be uncoupled from target considerations.  See dwarf_pattern on
frv for example, or more generally REG_FRAME_RELATED_EXPR on all the ports.
When _must_ one use this?  Whenever the generated dwarf is other than trivial.
If trivial, it will usually just work.

You didn’t include a lot of detail in your email, so I’m just extrapolating
what I think you did and what you saw.

Go patch committed: Update Unicode letters table to Unicode 8.0.0

2015-10-05 Thread Ian Lance Taylor

This patch by Chris Manghane updates the Go frontend unicode letters
table to Unicode 8.0.0.  This fixes https://golang.org/issue/12322 .
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
Index: gospec.c
===
--- gospec.c(revision 228311)
+++ gospec.c(working copy)
@@ -158,9 +158,11 @@ lang_specific_driver (struct cl_decoded_
library = (library == 0) ? 1 : library;
  break;
 
+#ifdef TARGET_CAN_SPLIT_STACK_64BIT
case OPT_m32:
  saw_opt_m32 = true;
  break;
+#endif
 
case OPT_pg:
case OPT_p:

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Mike Stump

On Oct 5, 2015, at 7:47 AM, Richard Sandiford  wrote:
> -  return real_equal (&TREE_REAL_CST (expr), &dconst0)
> +  return real_equal (&TREE_REAL_CST (expr), &dconst<0> ())

On behalf of mere mortals, hiding the syntax &dconst<0> () in a header file
as an implementation detail is entirely reasonable.

  return real_equal (&TREE_REAL_CST (expr), &double_const(0))

Also, in C++, we don’t like to mandate apis use & on their parameters, so the 
nicer form:

  return real_equal (TREE_REAL_CST (expr), double_const(0))

is preferred.  :-)  Once you add conversion operators:

  return real_equal (TREE_REAL_CST (expr), 0)

and then with operator support, (if == would not be too confusing given the 
actual semantic of real_equal):

  return TREE_REAL_CST (expr) == 0;

and possibly even just:

  return expr == 0

in the end.  :-)
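
The progression above can be sketched with a toy type. This is not the GCC API: real_value is a hypothetical stand-in for REAL_VALUE_TYPE, and the real_equal shown here takes references, which is an assumption made purely for this illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for REAL_VALUE_TYPE.
struct real_value
{
  real_value (int i) : v (i) {}              // implicit conversion from int
  explicit real_value (double d) : v (d) {}  // explicit construction from double
  double v;
};

// Reference-taking comparison, so callers need no '&'.
inline bool
real_equal (const real_value &a, const real_value &b)
{
  return a.v == b.v;
}

// Operator form layered on top of real_equal.
inline bool
operator== (const real_value &a, const real_value &b)
{
  return real_equal (a, b);
}

// Both spellings compare against a converted integer constant.
inline bool
is_zero (const real_value &x)
{
  return real_equal (x, 0) && x == 0;
}
```

The implicit int constructor is what lets a bare 0 appear as an argument; whether hiding the conversion is desirable is exactly the style question being debated in this thread.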

Re: [patch] Update template instantiation documentation

2015-10-05 Thread Sandra Loosemore


On 10/05/2015 10:40 AM, Jason Merrill wrote:

Looks good to me, thanks.

Jason


Looks good to me, too.

-Sandra

Re: Cache reals for 1/4, 1/6 and 1/9

2015-10-05 Thread Richard Biener

On October 5, 2015 8:04:59 PM GMT+02:00, Mike Stump  
wrote:
>On Oct 5, 2015, at 7:47 AM, Richard Sandiford
> wrote:
>> -  return real_equal (&TREE_REAL_CST (expr), &dconst0)
>> +  return real_equal (&TREE_REAL_CST (expr), &dconst<0> ())
>
>On behalf of mere mortals, hiding the syntax &dconst<0> () in a
>header file as an implementation detail is entirely reasonable.

That would mean using references as parameters.  I think we decided against
this to make it more explicit what is a pointer and what is not.

>
>  return real_equal (&TREE_REAL_CST (expr), &double_const(0))
>
>Also, in C++, we don’t like to mandate apis use & on their parameters,
>so the nicer form:
>
>  return real_equal (TREE_REAL_CST (expr), double_const(0))
>
>is preferred.  :-)  Once you add conversion operators:
>
>  return real_equal (TREE_REAL_CST (expr), 0)
>
>and then with operator support, (if == would not be too confusing given
>the actual semantic of real_equal):
>
>  return TREE_REAL_CST (expr) == 0;
>
>and possibly even just:
>
>  return expr == 0
>
>in the end.  :-)

Aww...

Re: [PATCH] Remove restriction for remote testing

2015-10-05 Thread Mike Stump

On Sep 28, 2015, at 5:35 AM, James Norris  wrote:
> The attached patch fixes a problem when doing remote testing.
> Specifically, testing of the atomic tests found in gcc/atomic.
> The code in atomic_init precludes the setting of the variable
> 'link_flags' when doing remote testing. The conditional test
> can be safely removed as get_multilibs will return "", and
> atomic_link_flags will return the necessary '-latomic' that
> will allow the atomic tests to successfully link.
> 
> OK for trunk?

I don't think this is appropriate.  The design is for remote host testing to
have the complete shape of an installed compiler, as I recall.  When it does, it
is indistinguishable from an installed compiler, and when the compiler is
installed, no -L or -B flag is necessary for it to work.  link_flags only exists
to add these flags, not the -l flag.  That is the thing that is wrong.  Remove
that, and add

 "libs=-latomic"

to someplace that will inject that option.  I stole that line from objc.exp:

  append options "libs=-lobjc"

or otherwise unconditionally put -latomic on the link line (some place that
isn’t protected by is_remote host).
