date:20141211

Re: [PATCH fortran/diagnostics] Move gfc_error (buffered) to common diagnostics (try 2)

2014-12-11 Thread Dodji Seketeli

Manuel López-Ibáñez  writes:

> New version using XNEW. Bootstrapped & tested on x86_64-linux-gnu.
>
> OK?

The diagnostics infrastructure changes are OK for me.  Thanks!

Cheers,

-- 
Dodji

Re: [RFC] diagnostics.c: For terminals, restrict messages to terminal width?

2014-12-11 Thread Dodji Seketeli

Tobias Burnus  writes:

> 2014-12-06  Tobias Burnus  
>   Manuel López-Ibáñez
>
> gcc/
>   * diagnostic.c (get_terminal_width): Renamed from getenv_columns,
>   removed static, and additionally use ioctl to get width.
>   (diagnostic_set_caret_max_width): Update call.
>   * diagnostic.h (get_terminal_width): Add prototype.
>   * opts.c (print_specific_help): Use it for x_help_columns.
>   * doc/invoke.texi (fdiagnostics-show-caret): Document how the
>   width is set.
>
> gcc/fortran/
>   * error.c (gfc_get_terminal_width): Renamed from
>   get_terminal_width and use same-named common function.
>   (gfc_error_init_1): Update call.

The diagnostics infrastructure changes are OK for me.  Thanks!

Cheers,

-- 
Dodji
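
For readers following the ChangeLog above, here is a minimal sketch of how
a terminal width might be obtained from the COLUMNS environment variable
and, failing that, from an ioctl on the terminal.  It is not the committed
code: the names parse_columns and terminal_width_sketch are hypothetical,
and both the lookup order and the choice of stderr as the queried
descriptor are assumptions made for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>
#include <sys/ioctl.h>
#include <unistd.h>

// Hypothetical helper: parse a COLUMNS-style string.  Returns 0 when the
// string is missing, malformed, or not a positive value that fits an int.
static int parse_columns (const char *s)
{
  if (s == nullptr || *s == '\0')
    return 0;
  char *end = nullptr;
  long n = std::strtol (s, &end, 10);
  assert (end != nullptr);  // strtol always sets the end pointer
  if (*end != '\0' || n <= 0 || n > INT_MAX)
    return 0;
  return static_cast<int> (n);
}

// Sketch of the lookup: COLUMNS first, then the tty, then "unlimited".
// The order is an assumption; the ChangeLog only says ioctl is used
// "additionally".
int terminal_width_sketch ()
{
  int width = parse_columns (std::getenv ("COLUMNS"));
  if (width > 0)
    return width;

#ifdef TIOCGWINSZ
  struct winsize ws;
  ws.ws_col = 0;
  // Assumption: diagnostics go to stderr, so that descriptor is queried.
  if (ioctl (STDERR_FILENO, TIOCGWINSZ, &ws) == 0 && ws.ws_col > 0)
    return ws.ws_col;
#endif

  return INT_MAX;  // No usable width: do not restrict the line length.
}
```

Returning INT_MAX when nothing is known keeps callers simple: they can
clamp against the result without a separate "unknown" case.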

Re: [PATCH] TYPE_OVERFLOW_* cleanup

2014-12-11 Thread Marek Polacek

On Wed, Dec 10, 2014 at 08:11:02PM +0100, Marc Glisse wrote:
> >+inline tree
> >+any_integral_type_check (tree __t, const char *__f, int __l, const char *__g)
> >+{
> >+  if (!(INTEGRAL_TYPE_P (__t)
> >+        || ((TREE_CODE (__t) == COMPLEX_TYPE
> >+             || VECTOR_TYPE_P (__t))
> >+            && INTEGRAL_TYPE_P (TREE_TYPE (__t)))))
> >+    tree_check_failed (__t, __f, __l, __g, BOOLEAN_TYPE, ENUMERAL_TYPE,
> >+                       INTEGER_TYPE, 0);
> >+  return __t;
> >+}
> 
> Is there a particular reason why you are avoiding ANY_INTEGRAL_TYPE_P in
> any_integral_type_check?

No, I'm just blind ;).  Changed in the following, thanks for looking
into this!

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-12-11  Marek Polacek  

* fold-const.c (fold_negate_expr): Add ANY_INTEGRAL_TYPE_P check.
(extract_muldiv_1): Likewise.
(maybe_canonicalize_comparison_1): Likewise.
(fold_comparison): Likewise.
(tree_binary_nonnegative_warnv_p): Likewise.
(tree_binary_nonzero_warnv_p): Likewise.
* gimple-ssa-strength-reduction.c (legal_cast_p_1): Likewise.
* tree-scalar-evolution.c (simple_iv): Likewise.
(scev_const_prop): Likewise.
* tree-ssa-loop-niter.c (expand_simple_operations): Likewise.
* tree-vect-generic.c (expand_vector_operation): Likewise.
* tree.h (ANY_INTEGRAL_TYPE_CHECK): Define.
(ANY_INTEGRAL_TYPE_P): Define.
(TYPE_OVERFLOW_WRAPS, TYPE_OVERFLOW_UNDEFINED, TYPE_OVERFLOW_TRAPS):
Add ANY_INTEGRAL_TYPE_CHECK.
(any_integral_type_check): New function.
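
The pattern this ChangeLog describes (a checked accessor plus a
caller-side guard) can be sketched outside GCC with toy types.  Everything
below is hypothetical and only models the idea: toy_type stands in for a
tree type node, toy_any_integral_p for ANY_INTEGRAL_TYPE_P, and
toy_overflow_wraps for a TYPE_OVERFLOW_* accessor that now carries a check.

```cpp
#include <cassert>

enum toy_code { TOY_INTEGER, TOY_REAL, TOY_COMPLEX, TOY_VECTOR };

struct toy_type
{
  toy_code code;
  const toy_type *elem;  // element type for TOY_COMPLEX/TOY_VECTOR, else null
  bool wraps;            // only meaningful for integral-based types
};

static bool toy_integral_p (const toy_type *t)
{
  return t->code == TOY_INTEGER;
}

// Integral type, or a complex/vector type whose element type is integral.
static bool toy_any_integral_p (const toy_type *t)
{
  return toy_integral_p (t)
         || ((t->code == TOY_COMPLEX || t->code == TOY_VECTOR)
             && t->elem != nullptr
             && toy_integral_p (t->elem));
}

// Checked accessor: asking about overflow on a non-integral type is a bug.
static bool toy_overflow_wraps (const toy_type *t)
{
  assert (toy_any_integral_p (t));
  return t->wraps;
}

// Caller-side guard, as added throughout fold-const.c in the patch:
// test the predicate first so the checked accessor is never misused.
static bool toy_wraps_or_false (const toy_type *t)
{
  return toy_any_integral_p (t) && toy_overflow_wraps (t);
}
```

With the guard in place, a floating-point type simply yields false instead
of tripping the check inside the accessor.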

diff --git gcc/fold-const.c gcc/fold-const.c
index 0d947ae..7b68bea 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -558,7 +558,8 @@ fold_negate_expr (location_t loc, tree t)
 case INTEGER_CST:
   tem = fold_negate_const (t, type);
   if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
- || (!TYPE_OVERFLOW_TRAPS (type)
+ || (ANY_INTEGRAL_TYPE_P (type)
+ && !TYPE_OVERFLOW_TRAPS (type)
  && TYPE_OVERFLOW_WRAPS (type))
  || (flag_sanitize & SANITIZE_SI_OVERFLOW) == 0)
return tem;
@@ -5951,7 +5952,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, tree wide_type,
   || EXPRESSION_CLASS_P (op0))
  /* ... and has wrapping overflow, and its type is smaller
 than ctype, then we cannot pass through as widening.  */
- && ((TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0))
+ && (((ANY_INTEGRAL_TYPE_P (TREE_TYPE (op0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0)))
   && (TYPE_PRECISION (ctype)
   > TYPE_PRECISION (TREE_TYPE (op0))))
  /* ... or this is a truncation (t is narrower than op0),
@@ -5966,7 +5968,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, tree wide_type,
  /* ... or has undefined overflow while the converted to
 type has not, we cannot do the operation in the inner type
 as that would introduce undefined overflow.  */
- || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))
+ || ((ANY_INTEGRAL_TYPE_P (TREE_TYPE (op0))
+  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0)))
  && !TYPE_OVERFLOW_UNDEFINED (type))))
break;
 
@@ -8497,7 +8500,8 @@ maybe_canonicalize_comparison_1 (location_t loc, enum tree_code code, tree type,
 
   /* Match A +- CST code arg1 and CST code arg1.  We can change the
  first form only if overflow is undefined.  */
-  if (!((TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
+  if (!(((ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
+ && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
 /* In principle pointers also have undefined overflow behavior,
but that causes problems elsewhere.  */
 && !POINTER_TYPE_P (TREE_TYPE (arg0))
@@ -8712,7 +8716,9 @@ fold_comparison (location_t loc, enum tree_code code, tree type,
 
   /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 -+ C1.  */
   if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
-  && (equality_code || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
+  && (equality_code
+ || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
+ && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))))
   && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
   && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
   && TREE_CODE (arg1) == INTEGER_CST
@@ -9031,7 +9037,8 @@ fold_comparison (location_t loc, enum tree_code code, tree type,
  X CMP Y +- C2 +- C1 for signed X, Y.  This is valid if
  the resulting offset is smaller in absolute value than the
  original one and has the same sign.  */
-  if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
+  if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
+  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
   && (TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
   && (TREE_CODE (TREE_O

[Patch, libstdc++/64239] Fix regex_iterator copying

2014-12-11 Thread Tim Shen

As discussed in Bugzilla.

Bootstrapped and tested.

Is it Ok to backport it to 4.9 branch, with _M_in_iterator kept unused?

Thanks! :)


-- 
Regards,
Tim Shen
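
A small, self-contained way to exercise the behaviour this patch is
about: read position() through copies of the iterator and of the match
object, rather than through the original.  As I read the diff, the old
copy constructor reset _M_in_iterator, so a copy could measure position()
from a different origin.  The function name positions_via_copies is
hypothetical, and this sketch is not the test case added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Collect position() for every match, deliberately reading it from a
// copied iterator and a copied match_results.
std::vector<std::ptrdiff_t>
positions_via_copies (const std::string &text, const std::regex &re)
{
  std::vector<std::ptrdiff_t> out;
  std::sregex_iterator it (text.begin (), text.end (), re);
  const std::sregex_iterator end;
  while (it != end)
    {
      std::sregex_iterator copy = it;  // copies the stored match_results
      std::smatch m = *copy;           // copies match_results once more
      out.push_back (m.position ());   // offset from the start of text
      ++it;
    }
  return out;
}

// Usage: three matches, at offsets 0, 4 and 8 of the searched string.
void check_positions ()
{
  const std::string text = "ab1 ab2 ab3";
  const std::regex re ("ab[0-9]");
  const std::vector<std::ptrdiff_t> expected = { 0, 4, 8 };
  assert (positions_via_copies (text, re) == expected);
}
```

On a conforming library every offset is relative to the beginning of the
searched sequence, whether or not the match object was copied.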
commit 18c4399589b414c79c6e85ab91f7a95f2fcad829
Author: timshen 
Date:   Wed Dec 10 21:30:13 2014 -0800

PR libstdc++/64239
* include/bits/regex.h (match_results<>::match_results,
match_results<>::operator=, match_results<>::position,
match_results<>::swap): Remove match_results::_M_in_iterator.
Fix ctor/assign/swap.
* include/bits/regex.tcc: (__regex_algo_impl<>,
regex_iterator<>::operator++): Set match_results::_M_begin as
"start position".
* testsuite/28_regex/iterators/regex_iterator/char/
string_position_01.cc: Test cases.

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index cb6bc93..3afec37 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -1563,42 +1563,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   explicit
   match_results(const _Alloc& __a = _Alloc())
-  : _Base_type(__a), _M_in_iterator(false)
+  : _Base_type(__a)
   { }
 
   /**
* @brief Copy constructs a %match_results.
*/
-  match_results(const match_results& __rhs)
-  : _Base_type(__rhs), _M_in_iterator(false)
-  { }
+  match_results(const match_results& __rhs) = default;
 
   /**
* @brief Move constructs a %match_results.
*/
-  match_results(match_results&& __rhs) noexcept
-  : _Base_type(std::move(__rhs)), _M_in_iterator(false)
-  { }
+  match_results(match_results&& __rhs) noexcept = default;
 
   /**
* @brief Assigns rhs to *this.
*/
   match_results&
-  operator=(const match_results& __rhs)
-  {
-   match_results(__rhs).swap(*this);
-   return *this;
-  }
+  operator=(const match_results& __rhs) = default;
 
   /**
* @brief Move-assigns rhs to *this.
*/
   match_results&
-  operator=(match_results&& __rhs)
-  {
-   match_results(std::move(__rhs)).swap(*this);
-   return *this;
-  }
+  operator=(match_results&& __rhs) = default;
 
   /**
* @brief Destroys a %match_results object.
@@ -1685,13 +1673,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   difference_type
   position(size_type __sub = 0) const
   {
-   // [28.12.1.4.5]
-   if (_M_in_iterator)
- return __sub < size() ? std::distance(_M_begin,
-   (*this)[__sub].first) : -1;
-   else
- return __sub < size() ? std::distance(this->prefix().first,
-   (*this)[__sub].first) : -1;
+   return __sub < size() ? std::distance(_M_begin,
+ (*this)[__sub].first) : -1;
   }
 
   /**
@@ -1876,7 +1859,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   void
   swap(match_results& __that)
-  { _Base_type::swap(__that); }
+  {
+   _Base_type::swap(__that);
+   swap(_M_begin, __that._M_begin);
+  }
   //@}
 
 private:
@@ -1894,7 +1880,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
regex_constants::match_flag_type);
 
   _Bi_iter _M_begin;
-  bool _M_in_iterator;
 };
 
   typedef match_results<const char*> cmatch;
diff --git a/libstdc++-v3/include/bits/regex.tcc b/libstdc++-v3/include/bits/regex.tcc
index 9692402..b676428 100644
--- a/libstdc++-v3/include/bits/regex.tcc
+++ b/libstdc++-v3/include/bits/regex.tcc
@@ -62,6 +62,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return false;
 
   typename match_results<_BiIter, _Alloc>::_Base_type& __res = __m;
+  __m._M_begin = __s;
   __res.resize(__re._M_automaton->_M_sub_count() + 2);
   for (auto& __it : __res)
__it.matched = false;
@@ -572,7 +573,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  auto& __prefix = _M_match.at(_M_match.size());
  __prefix.first = __prefix_first;
  __prefix.matched = __prefix.first != __prefix.second;
- _M_match._M_in_iterator = true;
+ // [28.12.1.4.5]
  _M_match._M_begin = _M_begin;
  return *this;
}
@@ -587,7 +588,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  auto& __prefix = _M_match.at(_M_match.size());
  __prefix.first = __prefix_first;
  __prefix.matched = __prefix.first != __prefix.second;
- _M_match._M_in_iterator = true;
+ // [28.12.1.4.5]
  _M_match._M_begin = _M_begin;
}
  else
diff --git a/libstdc++-v3/testsuite/28_regex/iterators/regex_iterator/char/string_position_01.cc b/libstdc++-v3/testsuite/28_regex/iterators/regex_iterator/char/string_position_01.cc
index 5fa4ea7..91aa061 100644
---

Re: [PATCH fortran/diagnostics] Move gfc_error (buffered) to common diagnostics (try 2)

2014-12-11 Thread Tobias Burnus


Dodji Seketeli wrote:
> Manuel López-Ibáñez writes:
>> New version using XNEW. Bootstrapped & tested on x86_64-linux-gnu.
>> OK?
>
> The diagnostics infrastructure changes are OK for me.  Thanks!


And the Fortran part was already approved before. (Otherwise, take this 
as another rubber stamp.)


Thanks also from my side!

BTW: The terminal-width patch was committed as Rev. 218619.

Tobias

Re: [PATCH 2/3] Extended if-conversion

2014-12-11 Thread Richard Biener

On Wed, Dec 10, 2014 at 4:22 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> Thanks for your reply!
>
> I didn't understand your point:
>
> Well, I don't mind splitting all critical edges unconditionally
>
> but you do it unconditionally in proposed patch.

I don't mind means I am fine with it.

> Also I assume that
> call of split_critical_edges() can break ssa. For example, we can
> split headers of loops, loop exit blocks etc.

How does that "break SSA"?  You mean loop-closed SSA?  I'd
be surprised if so but that may be possible.

> I prefer to do something
> more loop-specialized, e.g. call edge_split() for critical edges
> outgoing from bb ending with GIMPLE_COND stmt (assuming that edge
> destination bb belongs to loop).

That works for me as well but it is more complicated to implement.
Ideally you'd only split one edge if you find a block with only critical
predecessors (where we'd currently give up).  But note that this
requires re-computation of ifc_bbs in if_convertible_loop_p_1 and it
will change loop->num_nodes so we have to be more careful in
constructing the loop calling if_convertible_bb_p.

Richard.
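
For readers less familiar with the term: a critical edge is one whose
source block has more than one successor and whose destination block has
more than one predecessor.  The sketch below models splitting such edges
on a toy CFG.  It is hypothetical throughout (ToyCfg and its members are
invented for illustration) and does not use GCC's split_critical_edges or
any GCC data structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: succs[b] lists the successor blocks of block b.
struct ToyCfg
{
  std::vector<std::vector<int>> succs;

  // Number of incoming edges of a block.
  int pred_count (int block) const
  {
    int n = 0;
    for (const std::vector<int> &out : succs)
      n += static_cast<int> (std::count (out.begin (), out.end (), block));
    return n;
  }

  // Critical: source has several successors, destination several preds.
  bool is_critical (int from, int to) const
  {
    assert (from >= 0 && from < static_cast<int> (succs.size ()));
    return succs[from].size () > 1 && pred_count (to) > 1;
  }

  // Put a fresh empty block on every critical edge; return how many
  // edges were split.
  int split_critical_edges ()
  {
    int split = 0;
    const int original_blocks = static_cast<int> (succs.size ());
    for (int from = 0; from < original_blocks; ++from)
      for (std::size_t i = 0; i < succs[from].size (); ++i)
        {
          const int to = succs[from][i];
          if (!is_critical (from, to))
            continue;
          const int fresh = static_cast<int> (succs.size ());
          succs.push_back (std::vector<int> { to });  // fresh -> to
          succs[from][i] = fresh;                     // from -> fresh
          ++split;
        }
    return split;
  }
};
```

For example, with blocks 0 -> {1, 2}, 1 -> {2} and 2 -> {}, only the edge
0 -> 2 is critical.  Splitting it adds block 3 with 0 -> {1, 3} and
3 -> {2}, after which no critical edge remains.  The block count changes,
which is the same kind of side effect Richard mentions for
loop->num_nodes.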

>
> 2014-12-10 17:31 GMT+03:00 Richard Biener :
>> On Wed, Dec 10, 2014 at 11:54 AM, Yuri Rumyantsev  wrote:
>>> Richard,
>>>
>>> Sorry that I forgot to delete debug dump from my fix.
>>> I have few questions about your comments.
>>>
>>> 1. You wrote :
>>>> You also still have two functions for PHI predication.  And the
>>>> new extended variant doesn't commonize the 2-args and general
>>>> path
>>>  Did you mean that I must combine predicate_scalar_phi and
>>> predicate_extended_scalar_phi into one function?
>>> Please note that if the additional flag is not set (i.e.
>>> aggressive_if_conv is false), extended predication requires more
>>> compile time since it builds a hash_map.
>>
>> Its compile-time complexity is reasonable enough even for
>> non-aggressive if-conversion.
>>
>>> 2. About critical edge splitting.
>>>
>>> Did you mean that we should perform it (1) under aggressive_if_conv
>>> option only; (2) should we split all critical edges.
>>> Note that this leads to recomputing of topological order.
>>
>> Well, I don't mind splitting all critical edges unconditionally, thus
>> do something like
>>
>> Index: gcc/tree-if-conv.c
>> ===
>> --- gcc/tree-if-conv.c  (revision 218515)
>> +++ gcc/tree-if-conv.c  (working copy)
>> @@ -2235,12 +2235,21 @@ pass_if_conversion::execute (function *f
>>if (number_of_loops (fun) <= 1)
>>  return 0;
>>
>> +  bool critical_edges_split_p = false;
>>FOR_EACH_LOOP (loop, 0)
>>  if (flag_tree_loop_if_convert == 1
>> || flag_tree_loop_if_convert_stores == 1
>> || ((flag_tree_loop_vectorize || loop->force_vectorize)
>> && !loop->dont_vectorize))
>> -  todo |= tree_if_conversion (loop);
>> +  {
>> +   if (!critical_edges_split_p)
>> + {
>> +   split_critical_edges ();
>> +   critical_edges_split_p = true;
>> +   todo |= TODO_cleanup_cfg;
>> + }
>> +   todo |= tree_if_conversion (loop);
>> +  }
>>
>>  #ifdef ENABLE_CHECKING
>>{
>>
>>> It is worth noting that in current implementation bb's with 2
>>> predecessors and both are on critical edges are accepted without
>>> additional option.
>>
>> Yes, I know.
>>
>> tree-if-conv.c is a mess right now and if we can avoid adding more
>> to it and even fix the critical edge missed optimization with splitting
>> critical edges then I am all for that solution.
>>
>> Richard.
>>
>>> Thanks ahead.
>>> Yuri.
>>> 2014-12-09 18:20 GMT+03:00 Richard Biener :
 On Tue, Dec 9, 2014 at 2:11 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> Here is updated patch2 with the following changes:
> 1. Delete functions  phi_has_two_different_args and find_insertion_point.
> 2. Use only one function for extended predication -
> predicate_extended_scalar_phi.
> 3. Save gsi before insertion of predicate computations for basic
> blocks if it has 2 predecessors and
> both incoming edges are critical or it has more than 2 predecessors
> and at least one incoming edge
> is critical. This saved iterator can be used by extended phi predication.
>
> Here is motivated test-case which explains this point.
> Test-case is attached (t5.c) and it must be compiled with -O2
> -ftree-loop-vectorize -fopenmp options.
> The problem phi is in bb-7:
>
>   bb_5 (preds = {bb_4 }, succs = {bb_7 bb_9 })
>   {
> :
> xmax_edge_18 = xmax_edge_36 + 1;
> if (xmax_17 == xmax_27)
>   goto ;
> else
>   goto ;
>
>   }
>   bb_6 (preds = {bb_4 }, succs = {bb_7 bb_8 })
>   {
> :
> if (xmax_17 == xmax_27)
>   goto ;
> else
>   goto ;
>
>   }
>   bb_7 (preds = {bb_6 bb_5 }, succs = {bb_11 })
>   {
> :
> # xmax_ed

Re: [PATCH 00/13] Go closures, libffi, and the static chain

2014-12-11 Thread Dominik Vogt

On Fri, Oct 10, 2014 at 01:42:40PM -0700, Richard Henderson wrote:
> The background here is my thread from last week[1], and Ian's reply[2],
> wherein he rightly points out that not needing to play games with
> mmap in order to implement closures for Go is a strong reason to
> continue using custom code within libgo.
> 
> While that thread did have a go at implementing that custom code for
> aarch64, I still think that replicating libffi's calling convention
> knowledge for every interesting target is a mistake.
> 
> So instead I thought about how I'd add some support for Go directly
> into libffi.
...
> But the comment immediately before __go_set_closure itself says
> that it would be better to use the static chain register.
...
> Before I go too much farther down this road, I wanted to get some
> feedback.  FWIW, a complete tree can be found at [4].
...
> [4] git://github.com/rth7680/gcc.git rth/go-closure

1)

On s390x, the static chain register cannot be used for passing the
Go closure pointer to a function:  According to the ABI, the
dynamic linker is allowed to destroy the contents of r0 (static
chain register) eventually causing a crash if libgo is linked
dynamically.  The assumption that the static chain register can be
used to pass information to a function is wrong for s390x.

2)

With this branch, the reflection tests on amd64 crash:

  $ cd /build
  # build gcc
  $ cd /libgo
  $ make reflect/check

  -->

-- snip --
Aborted

reflect.call
../../../libgo/runtime/go-reflect-call.c:216
reflect.call.N13_reflect.Value

GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/value.go:579
reflect.Call.N13_reflect.Value

GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/value.go:412
reflect_test.TestCallWithStruct

GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/all_test.go:1490
testing.tRunner
../../../libgo/go/testing/testing.go:422

goroutine 16 [chan receive]:
testing.RunTests
../../../libgo/go/testing/testing.go:505
testing.Main
../../../libgo/go/testing/testing.go:435
main.main

GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/_testmain.go:124
created by main
../../../libgo/runtime/go-main.c:42

goroutine 18 [finalizer wait]:
created by runtime_createfing
../../../libgo/runtime/mgc0.c:2572

goroutine 53 [sleep]:
reflect_test.selectWatcher

GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/all_test.go:1377
created by reflect_test.$nested2

GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/all_test.go:1107
FAIL: reflect
make: *** [reflect/check] Error 1
-- snip --

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [PATCH 00/13] Go closures, libffi, and the static chain

2014-12-11 Thread Alan Modra

On Thu, Dec 11, 2014 at 10:06:23AM +0100, Dominik Vogt wrote:
> On s390x, the static chain register cannot be used for passing the
> Go closure pointer to a function:  According to the ABI, the
> dynamic linker is allowed to destroy the contents of r0 (static
> chain register) eventually causing a crash if libgo is linked
> dynamically.  The assumption that the static chain register can be
> used to pass information to a function is wrong for s390x.

I was worried about exactly the same "problem" on powerpc with r11
being used for the static chain and also destroyed in linkage stubs.
It turns out we don't traverse any linkage stubs.

See https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00446.html.  

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH][libstdc++][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model

2014-12-11 Thread Kyrill Tkachov



On 10/12/14 22:18, Mike Stump wrote:
> On Dec 10, 2014, at 10:05 AM, Kyrill Tkachov wrote:
>> Thanks for the guidance. I've moved the definitions into a separate file and
>> included that in the places that use it (more than 2 places in my count). This
>> is the patch attached.
>> The second patch (will send shortly after this) adds the logic to libstdc++.
>>
>> Ok?
>
> Ok.
>
> If anyone else wants to refactor annoying to maintain code into a single place…
> certainly the legacy of cut-n-paste programming is alive and well in the *.exp
> files.  It was never a design goal to replicate annoying to maintain code.  :-)


Thanks,
and the patch that adds the libstdc++.exp changes at 
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00952.html using the new 
target-utils.exp file is ok too then?


Cheers,
Kyrill

[PATCH][ARM][cleanup] Use R0_REGNUM and R1_REGNUM instead of 0 and 1 where appropriate

2014-12-11 Thread Kyrill Tkachov


Hi all,

While looking in this area on other business I noticed we could be using
the names R0_REGNUM and R1_REGNUM when creating those REG rtxs, since they
are a bit more descriptive than just 0 and 1.


Tested arm-none-eabi.

Ok for trunk?

Thanks,
Kyrill

2014-12-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm.c (arm_load_tp): Use R0_REGNUM instead of constant 0
in gen_rtx_REG.
(arm_tls_descseq_addr): Likewise.
(arm_gen_movmemqi): Likewise.
(arm_expand_epilogue_apcs_frame): Likewise.
(arm_expand_epilogue): Likewise.
(arm_expand_prologue): Likewise.  Use R1_REGNUM instead of constant 1
in gen_rtx_REG.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 64494e8..d17c81d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7431,7 +7431,7 @@ arm_load_tp (rtx target)
 
   emit_insn (gen_load_tp_soft ());
 
-  tmp = gen_rtx_REG (SImode, 0);
+  tmp = gen_rtx_REG (SImode, R0_REGNUM);
   emit_move_insn (target, tmp);
 }
   return target;
@@ -7495,13 +7495,13 @@ arm_tls_descseq_addr (rtx x, rtx reg)
    gen_rtx_CONST (VOIDmode, label),
    GEN_INT (!TARGET_ARM)),
 			UNSPEC_TLS);
-  rtx reg0 = load_tls_operand (sum, gen_rtx_REG (SImode, 0));
+  rtx reg0 = load_tls_operand (sum, gen_rtx_REG (SImode, R0_REGNUM));
 
   emit_insn (gen_tlscall (x, labelno));
   if (!reg)
 reg = gen_reg_rtx (SImode);
   else
-gcc_assert (REGNO (reg) != 0);
+gcc_assert (REGNO (reg) != R0_REGNUM);
 
   emit_move_insn (reg, reg0);
 
@@ -14659,7 +14659,7 @@ arm_gen_movmemqi (rtx *operands)
 	  else
 	{
 	  mem = adjust_automodify_address (dstbase, SImode, dst, dstoffset);
-	  emit_move_insn (mem, gen_rtx_REG (SImode, 0));
+	  emit_move_insn (mem, gen_rtx_REG (SImode, R0_REGNUM));
 	  if (last_bytes != 0)
 		{
 		  emit_insn (gen_addsi3 (dst, dst, GEN_INT (4)));
@@ -21092,8 +21092,8 @@ arm_expand_prologue (void)
 	 Just tell it we saved SP in r0.  */
   gcc_assert (TARGET_THUMB2 && !arm_arch_notm && args_to_push == 0);
 
-  r0 = gen_rtx_REG (SImode, 0);
-  r1 = gen_rtx_REG (SImode, 1);
+  r0 = gen_rtx_REG (SImode, R0_REGNUM);
+  r1 = gen_rtx_REG (SImode, R1_REGNUM);
 
   insn = emit_insn (gen_movsi (r0, stack_pointer_rtx));
   RTX_FRAME_RELATED_P (insn) = 1;
@@ -24866,7 +24866,7 @@ arm_expand_epilogue_apcs_frame (bool really_return)
 /* Restore the original stack pointer.  Before prologue, the stack was
realigned and the original stack pointer saved in r0.  For details,
see comment in arm_expand_prologue.  */
-emit_insn (gen_movsi (stack_pointer_rtx, gen_rtx_REG (SImode, 0)));
+emit_insn (gen_movsi (stack_pointer_rtx, gen_rtx_REG (SImode, R0_REGNUM)));
 
   emit_jump_insn (simple_return_rtx);
 }
@@ -25148,7 +25148,7 @@ arm_expand_epilogue (bool really_return)
 /* Restore the original stack pointer.  Before prologue, the stack was
realigned and the original stack pointer saved in r0.  For details,
see comment in arm_expand_prologue.  */
-emit_insn (gen_movsi (stack_pointer_rtx, gen_rtx_REG (SImode, 0)));
+emit_insn (gen_movsi (stack_pointer_rtx, gen_rtx_REG (SImode, R0_REGNUM)));
 
   emit_jump_insn (simple_return_rtx);
 }

Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.

2014-12-11 Thread Bin.Cheng

On Wed, Dec 10, 2014 at 9:47 PM, Richard Biener
 wrote:
> On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng  wrote:
>> Hi,
>> Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt
>> issue still exists.
>>
>> Current candidate selecting algorithm tends to select fewer candidates given
>> below reasons:
>>   1) to better handle loops with many induction uses but the best choice is
>> one generic basic induction variable;
>>   2) to keep compilation time low.
>>
>> One fundamental weakness of the strategy is the opposite situation can't be
>> handled properly sometimes.  For these cases the best choice is each
>> induction variable has its own candidate.
>> This patch fixes the problem by shuffling candidate set after fix-point is
>> reached by current implementation.  The reason why this strategy works is it
>> replaces candidate set by selecting local optimal candidate for some
>> induction uses, and the new candidate set (has lower cost) is exact what we
>> want in the mentioned case.  Instrumentation data shows this can find better
>> candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64.
>>
>> This patch actually is extension to the first version patch posted at
>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds
>> another selecting pass with special seed set (more or less like the shuffled
>> set in this patch).  Data also confirms this patch can find optimal sets for
>> most loops found by the first one, as well as optimal sets for many new
>> loops.
>>
>> Bootstrap and test on x86_64, no regression on benchmarks.  Bootstrap and
>> test on aarch64.
>> Since this patch only selects candidate set with lower cost, any regressions
>> revealed are latent bugs of other components in GCC.
>> I also collected GCC bootstrap time on x86_64, no regression either.
>> Is this OK?
>
> The algorithm seems to be quadratic in the number of IV candidates
> (at least):
Yes, I worried about that too; that's why I measured the bootstrap
time.  One option is to restrict this procedure to run once per loop.
I already tried that and it still captures 90%+ of the loops.  Does
this sound reasonable?

BTW, do we have some compilation time benchmarks for GCC?

Thanks,
bin
>
> + for (i = 0; i < n_iv_cands (data); i++)
> +   {
> ...
> + iv_ca_replace (data, ivs, cand, act_delta, &tmp_delta);
> ...
>
> and
>
> +static void
> +iv_ca_replace (struct ivopts_data *data, struct iv_ca *ivs,
> +  struct iv_cand *cand, struct iv_ca_delta *act_delta,
> +  struct iv_ca_delta **delta)
> +{
> ...
> +  for (i = 0; i < ivs->upto; i++)
> +{
> ...
> +  if (data->consider_all_candidates)
> +   {
> + for (j = 0; j < n_iv_cands (data); j++)
> +   {
>
> possibly cubic if ivs->upto is of similar value.
>
> I wonder if it is possible to restrict this to the single IV with
> the largest delta?  After all we are iterating try_improve_iv_set.
> Alternatively move the handling out of iteration completely,
> thus into the caller of try_improve_iv_set?
>
> Note that compile-time issues always arise in auto-generated code,
> not during GCC bootstrap.
>
> Richard.
>
>
>> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>>
>>   PR tree-optimization/62178
>>   * tree-ssa-loop-ivopts.c (iv_ca_replace): New function.
>>   (try_improve_iv_set): Shuffle candidates set in order to handle
>>   case in which candidate wrto each iv use should be selected.
>>
>> gcc/testsuite/ChangeLog
>> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>>
>>   PR tree-optimization/62178
>>   * gcc.target/aarch64/pr62178.c: New test.
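
To make the complexity concern quoted above concrete, the sketch below
counts innermost iterations for the loop nest shown in the excerpt: a
loop over candidates in try_improve_iv_set, calling iv_ca_replace, which
loops over the uses and, on the consider_all_candidates path, over all
candidates again.  The function names are hypothetical and only the path
visible in the excerpt is modelled; the other branch is not shown in the
quoted code, so it is left out.

```cpp
#include <cassert>
#include <cstdint>

// Count innermost iterations of the quoted loop nest shape.
std::uint64_t count_replace_steps (unsigned n_cands, unsigned n_uses)
{
  std::uint64_t steps = 0;
  for (unsigned i = 0; i < n_cands; ++i)      // loop in try_improve_iv_set
    for (unsigned u = 0; u < n_uses; ++u)     // loop up to ivs->upto
      for (unsigned j = 0; j < n_cands; ++j)  // consider_all_candidates scan
        ++steps;
  return steps;
}

// Doubling the candidates quadruples the work for a fixed number of
// uses; doubling both multiplies it by eight.
void check_growth ()
{
  assert (count_replace_steps (20, 10) == 4 * count_replace_steps (10, 10));
  assert (count_replace_steps (20, 20) == 8 * count_replace_steps (10, 10));
}
```

This is why a bootstrap-time measurement may not show the cost: it only
becomes visible on inputs with many candidates and uses in one loop, such
as the auto-generated code Richard mentions.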

Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.

2014-12-11 Thread Bin.Cheng

On Thu, Dec 11, 2014 at 5:56 PM, Bin.Cheng  wrote:
> On Wed, Dec 10, 2014 at 9:47 PM, Richard Biener
>  wrote:
>> On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng  wrote:
>>> Hi,
>>> Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt
>>> issue still exists.
>>>
>>> Current candidate selecting algorithm tends to select fewer candidates given
>>> below reasons:
>>>   1) to better handle loops with many induction uses but the best choice is
>>> one generic basic induction variable;
>>>   2) to keep compilation time low.
>>>
>>> One fundamental weakness of the strategy is the opposite situation can't be
>>> handled properly sometimes.  For these cases the best choice is each
>>> induction variable has its own candidate.
>>> This patch fixes the problem by shuffling candidate set after fix-point is
>>> reached by current implementation.  The reason why this strategy works is it
>>> replaces candidate set by selecting local optimal candidate for some
>>> induction uses, and the new candidate set (has lower cost) is exact what we
>>> want in the mentioned case.  Instrumentation data shows this can find better
>>> candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64.
>>>
>>> This patch actually is extension to the first version patch posted at
>>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds
>>> another selecting pass with special seed set (more or less like the shuffled
>>> set in this patch).  Data also confirms this patch can find optimal sets for
>>> most loops found by the first one, as well as optimal sets for many new
>>> loops.
>>>
>>> Bootstrap and test on x86_64, no regression on benchmarks.  Bootstrap and
>>> test on aarch64.
>>> Since this patch only selects candidate set with lower cost, any regressions
>>> revealed are latent bugs of other components in GCC.
>>> I also collected GCC bootstrap time on x86_64, no regression either.
>>> Is this OK?
>>
>> The algorithm seems to be quadratic in the number of IV candidates
>> (at least):
> Yes, I worried about that too; that's why I measured the bootstrap
> time.  One option is to restrict this procedure to run once per loop.
> I already tried that and it still captures 90%+ of the loops.  Does
> this sound reasonable?
By 90%+, I mean 90% of the ~6% of loops that improve, not of the total number of loops.
>
> BTW, do we have some compilation time benchmarks for GCC?
>
> Thanks,
> bin
>>
>> + for (i = 0; i < n_iv_cands (data); i++)
>> +   {
>> ...
>> + iv_ca_replace (data, ivs, cand, act_delta, &tmp_delta);
>> ...
>>
>> and
>>
>> +static void
>> +iv_ca_replace (struct ivopts_data *data, struct iv_ca *ivs,
>> +  struct iv_cand *cand, struct iv_ca_delta *act_delta,
>> +  struct iv_ca_delta **delta)
>> +{
>> ...
>> +  for (i = 0; i < ivs->upto; i++)
>> +{
>> ...
>> +  if (data->consider_all_candidates)
>> +   {
>> + for (j = 0; j < n_iv_cands (data); j++)
>> +   {
>>
>> possibly cubic if ivs->upto is of similar value.
>>
>> I wonder if it is possible to restrict this to the single IV with
>> the largest delta?  After all we are iterating try_improve_iv_set.
>> Alternatively move the handling out of iteration completely,
>> thus into the caller of try_improve_iv_set?
>>
>> Note that compile-time issues always arise in auto-generated code,
>> not during GCC bootstrap.
>>
>> Richard.
>>
>>
>>> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>>>
>>>   PR tree-optimization/62178
>>>   * tree-ssa-loop-ivopts.c (iv_ca_replace): New function.
>>>   (try_improve_iv_set): Shuffle candidates set in order to handle
>>>   case in which candidate wrto each iv use should be selected.
>>>
>>> gcc/testsuite/ChangeLog
>>> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>>>
>>>   PR tree-optimization/62178
>>>   * gcc.target/aarch64/pr62178.c: New test.

[PATCH PR62151]Fix REG_DEAD note distribution issue by using right ELIM_I0/ELIM_I1

2014-12-11 Thread Bin Cheng

Hi,
As described in both the PR and the patch comments, this patch fixes PR62151
by setting the right values for ELIM_I0/ELIM_I1 when distributing REG_DEAD
notes from i0/i1.  distribute_notes is said to have caused many bugs in the
past, and I think it still has a bug in it, as noted in the PR.  This patch
doesn't touch distribute_notes because we are in stage3 and I want to have
more discussion on it first.
Bootstrapped and tested on x86_64; aarch64 testing is ongoing.  Is it OK?

2014-12-11  Bin Cheng  

PR rtl-optimization/62151
* combine.c (try_combine): Reset elim_i0 and elim_i1 when
distributing notes from i0notes or i1notes, this time don't
check whether newi2pat sets i1dest or i0dest.

gcc/testsuite/ChangeLog
2014-12-11  Bin Cheng  

PR rtl-optimization/62151
* gcc.c-torture/execute/pr62151.c: New test.

Index: gcc/combine.c
===
--- gcc/combine.c   (revision 218200)
+++ gcc/combine.c   (working copy)
@@ -4183,11 +4183,42 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
   distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
 if (i1notes)
-  distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
-   elim_i2, elim_i1, elim_i0);
+  {
+   /* When distributing REG_DEAD note from i1, it doesn't matter
+  if newi2pat sets i1dest/i0dest or not.  Recompute and use
+  elim_i0/elim_i1 in temp variables.
+
+  See PR62151, if we have four insns combination:
+i0: r0 <- i0src
+i1: r1 <- i1src (using r0)
+  REG_DEAD (r0)
+i2: r0 <- i2src (using r1)
+i3: r3 <- i3src (using r0)
+ix: using r0
+  From i1's point of view, r0 is eliminated, no matter if it is
+  set by newi2pat or not.  In other words, REG_DEAD info for r0
+  in i1 should be discarded.
+
+  Note this only affects cases in which I2 is after I0/I1, like
+  "I1->I2->I3", "I0->I1->I2->I3" or "I0&I1->I2, I2->I3".  For
+  other cases like "I0->I1, I1&I2->I3" or "I1&I2->I3", newi2pat
+  will not set i1dest or i0dest.  */
+   rtx tmp_elim_i1 = (i1 == 0 || i1dest_in_i1src || i1dest_in_i0src
+  || !i1dest_killed
+  ? 0 : i1dest);
+   rtx tmp_elim_i0 = (i0 == 0 || i0dest_in_i0src || !i0dest_killed
+  ? 0 : i0dest);
+   distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
+ elim_i2, tmp_elim_i1, tmp_elim_i0);
+  }
 if (i0notes)
-  distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
-   elim_i2, elim_i1, elim_i0);
+  {
+   /* Same with distribution of i1notes.  */
+   rtx tmp_elim_i0 = (i0 == 0 || i0dest_in_i0src || !i0dest_killed
+  ? 0 : i0dest);
+   distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
+ elim_i2, elim_i1, tmp_elim_i0);
+  }
 if (midnotes)
   distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
Index: gcc/testsuite/gcc.c-torture/execute/pr62151.c
===
--- gcc/testsuite/gcc.c-torture/execute/pr62151.c   (revision 0)
+++ gcc/testsuite/gcc.c-torture/execute/pr62151.c   (revision 0)
@@ -0,0 +1,41 @@
+/* PR rtl-optimization/62151 */
+
+int a, c, d, e, f, g, h, i;
+short b;
+
+int
+fn1 ()
+{
+  b = 0;
+  for (;;)
+{
+  int j[2];
+  j[f] = 0;
+  if (h)
+   d = 0;
+  else
+   {
+ for (; f; f++)
+   ;
+ for (a = 0; a < 1; a++)
+   for (;;)
+ {
+   i = b & ((b ^ 1) & 83647) ? b : b - 1;
+   g = 1 ? i : 0;
+   e = j[0];
+   if (c)
+ break;
+   return 0;
+ }
+   }
+}
+}
+
+int
+main ()
+{
+  fn1 ();
+  if (g != -1)
+__builtin_abort ();
+  return 0;
+}

[PATCH AARCH64]Make ldp/stp case less vulnerable

2014-12-11 Thread Bin Cheng

Hi,
The test case gcc.target/aarch64/ldp_stp_3.c fails on aarch64-none-elf.
Instead of merging the loads into ldp, the compiler generates:

foo:
 adrp x1, .LANCHOR0
 add x1, x1, :lo12:.LANCHOR0
 ldr w0, [x1, 4]
 ldr w3, [x1, 20]
 ldr w2, [x1, 32]
 ldr w1, [x1, 16]
 add x2, x3, x2
 add x0, x0, x1
 add x0, x2, x0
 ret

Once register allocation decides to load [x1, 16] into x1(w1) like below:
14: x0:DI = zero_extend([x1:DI+0x4])
 7: x3:DI = zero_extend([x1:DI+0x14])
10: x2:DI = zero_extend([x1:DI+0x20])
17: x1:DI = zero_extend([x1:DI+0x10])

Instructions 14/7/10 are anti-dependent on insn 17, but sched_fusion orders
the ready list (14/7/10) in ascending order of address.  As a result insn 10
intervenes between 7 and 17.
This patch works around this by making the test cases less vulnerable.  One
possible fix is to move sched_fusion after regrename, which does help a lot.
I didn't do that because regrename is currently disabled.

Tested on aarch64-elf.  Is it OK?

Thanks,
bin

gcc/testsuite/ChangeLog
2014-12-11  Bin Cheng  

* gcc.target/aarch64/ldp_stp_2.c: Make test less vulnerable.
* gcc.target/aarch64/ldp_stp_3.c: Ditto.

Index: gcc/testsuite/gcc.target/aarch64/ldp_stp_2.c
===
--- gcc/testsuite/gcc.target/aarch64/ldp_stp_2.c(revision 218558)
+++ gcc/testsuite/gcc.target/aarch64/ldp_stp_2.c(working copy)
@@ -7,10 +7,8 @@ long long
 foo ()
 {
   long long ll = 0;
-  ll += arr[0][1];
   ll += arr[1][0];
   ll += arr[1][1];
-  ll += arr[2][0];
   return ll;
 }
 
Index: gcc/testsuite/gcc.target/aarch64/ldp_stp_3.c
===
--- gcc/testsuite/gcc.target/aarch64/ldp_stp_3.c(revision 218558)
+++ gcc/testsuite/gcc.target/aarch64/ldp_stp_3.c(working copy)
@@ -7,10 +7,8 @@ unsigned long long
 foo ()
 {
   unsigned long long ll = 0;
-  ll += arr[0][1];
   ll += arr[1][0];
   ll += arr[1][1];
-  ll += arr[2][0];
   return ll;
 }

[PATCH] Fix PR42108

2014-12-11 Thread Richard Biener


The following patch fixes the performance regression in PR42108
by translating do loops so that PRE and LIM see the division
(to - from) / step executed unconditionally.  They then no longer
have to care that step might be zero and thus that the division
might trap.

This makes the runtime of the testcase improve from 10.7s to
8s (same as gfortran 4.3).

The caveat is that if the loop is not executed (to < from
for positive step for example) then there will be an additional
executed division computing the unused countm1.
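
The reordering can be modelled roughly in C as follows.  This is only a
sketch: the function names do_prologue_before and do_prologue_after are
hypothetical, the unsigned arithmetic follows the comment in gfc_trans_do,
and step is assumed to be nonzero.

```c
#include <stdbool.h>

/* Hypothetical model of the old prologue: the trip count is computed
   only after the loop entry check has passed.  Returns false when the
   loop body is skipped.  Assumes step != 0.  */
bool
do_prologue_before (int from, int to, int step, unsigned *countm1)
{
  if (step > 0)
    {
      if (to < from)
        return false;
      *countm1 = ((unsigned) to - (unsigned) from) / (unsigned) step;
    }
  else
    {
      if (to > from)
        return false;
      *countm1 = ((unsigned) from - (unsigned) to) / -(unsigned) step;
    }
  return true;
}

/* Hypothetical model of the new prologue: the division runs first and
   unconditionally, so it no longer sits behind the entry check.  When
   the loop is skipped, *countm1 holds an unused value.  */
bool
do_prologue_after (int from, int to, int step, unsigned *countm1)
{
  if (step > 0)
    {
      *countm1 = ((unsigned) to - (unsigned) from) / (unsigned) step;
      if (to < from)
        return false;
    }
  else
    {
      *countm1 = ((unsigned) from - (unsigned) to) / -(unsigned) step;
      if (to > from)
        return false;
    }
  return true;
}
```

Both shapes agree whenever the loop is entered; they differ only in the
extra division performed for a skipped loop.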

Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok
for trunk?

Thanks,
Richard.

2014-12-11  Richard Biener  

PR tree-optimization/42108
* trans-stmt.c (gfc_trans_do): Execute the division computing
countm1 before the loop entry check.

* gfortran.dg/pr42108.f90: Amend.

Index: gcc/fortran/trans-stmt.c
===
--- gcc/fortran/trans-stmt.c(revision 218515)
+++ gcc/fortran/trans-stmt.c(working copy)
@@ -1645,15 +1645,15 @@ gfc_trans_do (gfc_code * code, tree exit
  This code is executed before we enter the loop body. We generate:
  if (step > 0)
{
+countm1 = (to - from) / step;
 if (to < from)
   goto exit_label;
-countm1 = (to - from) / step;
}
  else
{
+countm1 = (from - to) / -step;
 if (to > from)
   goto exit_label;
-countm1 = (from - to) / -step;
}
*/
 
@@ -1675,11 +1675,12 @@ gfc_trans_do (gfc_code * code, tree exit
  fold_build2_loc (loc, MINUS_EXPR, utype,
   tou, fromu),
  stepu);
-  pos = fold_build3_loc (loc, COND_EXPR, void_type_node, tmp,
-fold_build1_loc (loc, GOTO_EXPR, void_type_node,
- exit_label),
-fold_build2 (MODIFY_EXPR, void_type_node,
- countm1, tmp2));
+  pos = build2 (COMPOUND_EXPR, void_type_node,
+   fold_build2 (MODIFY_EXPR, void_type_node,
+countm1, tmp2),
+   build3_loc (loc, COND_EXPR, void_type_node, tmp,
+   build1_loc (loc, GOTO_EXPR, void_type_node,
+   exit_label), NULL_TREE));
 
   /* For a negative step, when to > from, exit, otherwise compute
  countm1 = ((unsigned)from - (unsigned)to) / -(unsigned)step  */
@@ -1688,11 +1689,12 @@ gfc_trans_do (gfc_code * code, tree exit
  fold_build2_loc (loc, MINUS_EXPR, utype,
   fromu, tou),
  fold_build1_loc (loc, NEGATE_EXPR, utype, stepu));
-  neg = fold_build3_loc (loc, COND_EXPR, void_type_node, tmp,
-fold_build1_loc (loc, GOTO_EXPR, void_type_node,
- exit_label),
-fold_build2 (MODIFY_EXPR, void_type_node,
- countm1, tmp2));
+  neg = build2 (COMPOUND_EXPR, void_type_node,
+   fold_build2 (MODIFY_EXPR, void_type_node,
+countm1, tmp2),
+   build3_loc (loc, COND_EXPR, void_type_node, tmp,
+   build1_loc (loc, GOTO_EXPR, void_type_node,
+   exit_label), NULL_TREE));
 
   tmp = fold_build2_loc (loc, LT_EXPR, boolean_type_node, step,
 build_int_cst (TREE_TYPE (step), 0));
Index: gcc/testsuite/gfortran.dg/pr42108.f90
===
--- gcc/testsuite/gfortran.dg/pr42108.f90   (revision 218584)
+++ gcc/testsuite/gfortran.dg/pr42108.f90   (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-O2 -fdump-tree-fre1" }
+! { dg-options "-O2 -fdump-tree-fre1 -fdump-tree-pre-details" }
 
 subroutine  eval(foo1,foo2,foo3,foo4,x,n,nnd)
   implicit real*8 (a-h,o-z)
@@ -21,7 +21,9 @@ subroutine  eval(foo1,foo2,foo3,foo4,x,n
   end do
 end subroutine eval
 
+! We should have hoisted the division
+! { dg-final { scan-tree-dump "in all uses of countm1\[^\n\]* / " "pre" } }
 ! There should be only one load from n left
-
 ! { dg-final { scan-tree-dump-times "\\*n_" 1 "fre1" } }
 ! { dg-final { cleanup-tree-dump "fre1" } }
+! { dg-final { cleanup-tree-dump "pre" } }

RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

2014-12-11 Thread David Sherwood

Hi Christophe,

Sorry to bother you again. After my clarification email below, are you now
happy for these patches to go in?

Kind Regards,
David Sherwood.

> -Original Message-
> From: David Sherwood [mailto:david.sherw...@arm.com]
> Sent: 27 November 2014 14:53
> To: 'Christophe Lyon'
> Cc: gcc-patches@gcc.gnu.org; Marcus Shawcroft; Alan Hayward; 'Tejas Belagod'; 
> Richard Sandiford
> Subject: RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes 
> endianness-safe.
> 
> > On 18 November 2014 10:14, David Sherwood  wrote:
> > > Hi Christophe,
> > >
> > > Ah sorry. My mistake - it fixes this in bugzilla:
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810
> >
> > I did look at that PR, but since it has no testcase attached, I was unsure.
> > And I am still not :-)
> > PR 59810 is "[AArch64] LDn/STn implementations are not ABI-conformant
> > for bigendian."
> > but the advsimd-intrinsics/vldX.c and vldX_lane.c now PASS with Alan's
> > patches on aarch64_be, so I thought Alan's patches solve PR59810.
> >
> > What am I missing?
> 
> Hi Christophe,
> 
> I think probably this is our fault for making our lives way too difficult and
> artificially splitting all these patches up. :)
> 
> Alan's patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00952.html
> 
> fixes some issues on aarch64_be, but also causes regressions. For example,
> 
> 
> Tests that now fail, but worked before:
> 
> aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects 
> execution test
> aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c execution test
> aarch64_be-elf-aem: gcc.dg/vect/vect-over-widen-1-big-array.c -flto 
> -ffat-lto-objects execution test
> ...
> 
> Tests that now work, but didn't before:
> 
> aarch64_be-elf-aem: gcc.dg/vect/fast-math-vect-complex-3.c execution test
> aarch64_be-elf-aem: gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c execution test
> aarch64_be-elf-aem: gcc.dg/vect/no-scevccp-outer-10a.c execution test
> ...
> 
> 
> His patch is only half of the story and must be applied at the same time as 
> the
> "[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe."
> patch. With both patches applied the result looks much healthier:
> 
> 
> # Comparing 1 common sum files
> ## /bin/sh ./src/gcc/contrib/compare_tests  /tmp/gxx-sum1.10051 
> /tmp/gxx-sum2.10051
> Tests that now work, but didn't before:
> 
> aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer  
> execution test
> aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer 
> -funroll-all-loops -finline-
> functions  execution test
> aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer 
> -funroll-loops  execution test
> ...
> 
> 
> with no new regressions. After applying both patches the aarch64_be gcc 
> testsuite is
> on a parity with the aarch64 testsuite. Furthermore, after applying both of 
> these patches:
> 
> "[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe"
> "[AArch64] [BE] Fix vector load/stores to not use ld1/st1"
> 
> it then becomes safe for us to remove the CCMC macro, which is the cause of
> unnecessary spills to the stack for certain auto-vectorised code. So really I
> suppose when I posted my second patch
> 
> "[AArch64] [BE] [2/2] Make large opaque integer modes endianness-safe"
> 
> I should have really just called this
> 
> "[AArch64] [BE] Remove CCMC for aarch64"
> 
> in order to make it clear exactly what the purpose of these patches is.
> 
> Kind Regards,
> David Sherwood.

[PATCHv2] New check and updates in check_GNU_style script

2014-12-11 Thread Yury Gribov


Hi all,

The attached patch adds a new check (all blocks of 8 spaces should be
replaced with tabs) to contrib/check_GNU_style.sh. It also changes the
script to allow reading patches from stdin and strengthens the "Dot, space,
space, new sentence." check.
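
For illustration only, the new 8-spaces check can be sketched in C.  The
helper name has_eight_space_block is hypothetical and is not part of the
script, which uses grep and egrep with the pattern ' {8}' instead.

```c
#include <stddef.h>

/* Hypothetical helper: return 1 when an added patch line (one that
   starts with '+') contains a run of at least 8 consecutive spaces,
   which the script would ask to be replaced with a tab.  */
int
has_eight_space_block (const char *line)
{
  int run = 0;

  if (line[0] != '+')
    return 0;

  for (const char *p = line + 1; *p != '\0'; p++)
    {
      run = (*p == ' ') ? run + 1 : 0;
      if (run == 8)
        return 1;
    }
  return 0;
}
```

Unlike the script, this sketch looks only at the text after the leading
'+' and ignores the file name and line number prefix that grep -nH adds.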

Is this ok to commit?

-Y

>From c099086a7325d5feca28630be5a569a7de027c93 Mon Sep 17 00:00:00 2001
From: Yury Gribov 
Date: Thu, 11 Dec 2014 13:19:59 +0300
Subject: [PATCH] 2014-12-11  Yury Gribov  

	check_GNU_style.sh: Support patches coming from stdin,
	check that spaces are converted to tabs and make
	double-space-after-dot check more precise.
---
 contrib/check_GNU_style.sh |   49 ++--
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/contrib/check_GNU_style.sh b/contrib/check_GNU_style.sh
index 5f90190..cf6081e 100755
--- a/contrib/check_GNU_style.sh
+++ b/contrib/check_GNU_style.sh
@@ -23,6 +23,8 @@ usage() {
 check_GNU_style.sh [patch]...
 
 Checks the patches for some of the GNU style formatting problems.
+When FILE is -, read standard input.
+
 Please note that these checks are not always accurate, and
 complete.  The reference documentation of the GNU Coding Standards
 can be found here: http://www.gnu.org/prep/standards_toc.html
@@ -35,19 +37,22 @@ EOF
 
 test $# -eq 0 && usage
 
+inp=check_GNU_style.inp
 tmp=check_GNU_style.tmp
 
 # Remove $tmp on exit and various signals.
-trap "rm -f $tmp" 0
-trap "rm -f $tmp ; exit 1" 1 2 3 5 9 13 15
+trap "rm -f $inp $tmp" 0
+trap "rm -f $inp $tmp ; exit 1" 1 2 3 5 9 13 15
+
+grep -nH '^+' $* \
+	| grep -v ':+++' \
+	> $inp
 
 # Grep
 g (){
 msg="$1"
 arg="$2"
-shift 2
-grep -nH '^+' $* \
-	| grep -v ':+++' \
+cat $inp \
 	| egrep --color=always -- "$arg" \
 	> $tmp && printf "\n$msg\n"
 cat $tmp
@@ -58,9 +63,7 @@ ag (){
 msg="$1"
 arg1="$2"
 arg2="$3"
-shift 3
-grep -nH '^+' $* \
-	| grep -v ':+++' \
+cat $inp \
 	| egrep --color=always -- "$arg1" \
 	| egrep --color=always -- "$arg2" \
 	> $tmp && printf "\n$msg\n"
@@ -72,9 +75,7 @@ vg (){
 msg="$1"
 varg="$2"
 arg="$3"
-shift 3
-grep -nH '^+' $* \
-	| grep -v ':+++' \
+cat $inp \
 	| egrep -v -- "$varg" \
 	| egrep --color=always -- "$arg" \
 	> $tmp && printf "\n$msg\n"
@@ -83,9 +84,7 @@ vg (){
 
 col (){
 msg="$1"
-shift 1
-grep -nH '^+' $* \
-	| grep -v ':+++' \
+cat $inp \
 	| cut -f 2 -d '+' \
 	| awk '{ if (length ($0) > 80) print $0 }' \
 	> $tmp
@@ -95,30 +94,32 @@ col (){
 fi
 }
 
-col 'Lines should not exceed 80 characters.' $*
+col 'Lines should not exceed 80 characters.'
+
+g 'Blocks of 8 spaces should be replaced with tabs.' \
+' {8}'
 
 g 'Trailing whitespace.' \
-'[[:space:]]$' $*
+'[[:space:]]$'
 
 g 'Space before dot.' \
-'[[:alnum:]][[:blank:]]+\.' $*
+'[[:alnum:]][[:blank:]]+\.'
 
 g 'Dot, space, space, new sentence.' \
-'[[:alnum:]]\.([[:blank:]]|[[:blank:]]{3,})[[:alnum:]]' $*
+'[[:alnum:]]\.([[:blank:]]|[[:blank:]]{3,})[A-Z0-9]'
 
 g 'Dot, space, space, end of comment.' \
-'[[:alnum:]]\.([[:blank:]]{0,1}|[[:blank:]]{3,})\*/' $*
+'[[:alnum:]]\.([[:blank:]]{0,1}|[[:blank:]]{3,})\*/'
 
 g 'Sentences should end with a dot.  Dot, space, space, end of the comment.' \
-'[[:alnum:]][[:blank:]]*\*/' $*
+'[[:alnum:]][[:blank:]]*\*/'
 
 vg 'There should be exactly one space between function name and parentheses.' \
-'\#define' '[[:alnum:]]([[:blank:]]{2,})?\(' $*
+'\#define' '[[:alnum:]]([[:blank:]]{2,})?\('
 
 g 'There should be no space before closing parentheses.' \
-'[[:graph:]][[:blank:]]+\)' $*
+'[[:graph:]][[:blank:]]+\)'
 
 ag 'Braces should be on a separate line.' \
-'\{' 'if[[:blank:]]\(|while[[:blank:]]\(|switch[[:blank:]]\(' $*
-
+'\{' 'if[[:blank:]]\(|while[[:blank:]]\(|switch[[:blank:]]\('
 
-- 
1.7.9.5

Re: r218609 - in /trunk/gcc: ChangeLog common.opt d...

2014-12-11 Thread Andreas Schwab

hubi...@gcc.gnu.org writes:

> Author: hubicka
> Date: Wed Dec 10 21:17:28 2014
> New Revision: 218609
>
> URL: https://gcc.gnu.org/viewcvs?rev=218609&root=gcc&view=rev
> Log:
>   * doc/invoke.texi: (-devirtualize-at-ltrans): Document.
>   * lto-cgraph.c (lto_output_varpool_node): Mark initializer as removed
>   when it is not streamed to the given ltrans.
>   (compute_ltrans_boundary): Make code adding all polymorphic
>   call targets conditional with !flag_wpa || flag_ltrans_devirtualize.
>   * common.opt (fdevirtualize-at-ltrans): New flag.

/usr/local/gcc/gcc-20141211/gcc/testsuite/g++.dg/ipa/pr64059.C:56:1: internal 
compiler error: Segmentation fault.
0x40df742f crash_signal.
../../gcc/toplev.c:358.
0x412f2c9f get_binfo_at_offset(tree_node*, long, tree_node*).
../../gcc/tree.c:11922.
0x40a0d75f possible_polymorphic_call_targets(tree_node*, long, 
ipa_polymorphic_call_context, bool*, void**, bool).
../../gcc/ipa-devirt.c:2404.
0x40b6f2ef possible_polymorphic_call_targets(cgraph_edge*, bool*, 
void**, bool).
../../gcc/ipa-utils.h:109.
0x40b6f2ef compute_ltrans_boundary(lto_symtab_encoder_d*).
../../gcc/lto-cgraph.c:952.
0x40c40f2f ipa_write_summaries(bool).
../../gcc/passes.c:2511.
0x406584ff ipa_passes.
../../gcc/cgraphunit.c:2091.
0x406584ff symbol_table::compile().
../../gcc/cgraphunit.c:2187.
0x4065be1f symbol_table::finalize_compilation_unit().
../../gcc/cgraphunit.c:2340.
0x4029bcef cp_write_global_declarations().
../../gcc/cp/decl2.c:4688.
Please submit a full bug report,.
with preprocessed source if appropriate..
Please include the complete backtrace with any bug report..
See <http://gcc.gnu.org/bugs.html> for instructions..

FAIL: g++.dg/ipa/pr64059.C  -std=gnu++11 (internal compiler error)

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain

2014-12-11 Thread Dominik Vogt

On Thu, Dec 11, 2014 at 07:51:44PM +1030, Alan Modra wrote:
> On Thu, Dec 11, 2014 at 10:06:23AM +0100, Dominik Vogt wrote:
> > On s390x, the static chain register cannot be used for passing the
> > Go closure pointer to a function:  According to the Abi, the
> > dynamic linker is allowed to destroy the contents of r0 (static
> > chain register) eventually causing a crash if libgo is linked
> > dynamically.  The assumption that the static chain register can be
> > used to pass information to a function is wrong for s390x.
> 
> I was worried about exactly the same "problem" on powerpc with r11
> being used for the static chain and also destroyed in linkage stubs.
> It turns out we don't traverse any linkage stubs.

Just to make this clear:  It's not something that *might* happen.
It *does* happen on s390[x] which does not use libffi but the hand
written code in makefunc_s390.S and makefuncgo_s390[x].go.

The same may not happen when calling functions through libffi
(which may be dynamically linked) because ffi_call_go() is passed
the closure pointer as an argument and not in the static chain
register.

> See https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00446.html.  

Thanks for the link.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [PATCH] TYPE_OVERFLOW_* cleanup

2014-12-11 Thread Richard Biener

On Thu, 11 Dec 2014, Marek Polacek wrote:

> On Wed, Dec 10, 2014 at 08:11:02PM +0100, Marc Glisse wrote:
> > >+inline tree
> > >+any_integral_type_check (tree __t, const char *__f, int __l, const char 
> > >*__g)
> > >+{
> > >+  if (!(INTEGRAL_TYPE_P (__t)
> > >+  || ((TREE_CODE (__t) == COMPLEX_TYPE
> > >+   || VECTOR_TYPE_P (__t))
> > >+  && INTEGRAL_TYPE_P (TREE_TYPE (__t)
> > >+tree_check_failed (__t, __f, __l, __g, BOOLEAN_TYPE, ENUMERAL_TYPE,
> > >+ INTEGER_TYPE, 0);
> > >+  return __t;
> > >+}
> > 
> > Is there a particular reason why you are avoiding ANY_INTEGRAL_TYPE_P in
> > any_integral_type_check?
> 
> No, I'm just blind ;).  Changed in the following, thanks for looking
> into this!
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-12-11  Marek Polacek  
> 
>   * fold-const.c (fold_negate_expr): Add ANY_INTEGRAL_TYPE_P check.
>   (extract_muldiv_1): Likewise.
>   (maybe_canonicalize_comparison_1): Likewise.
>   (fold_comparison): Likewise.
>   (tree_binary_nonnegative_warnv_p): Likewise.
>   (tree_binary_nonzero_warnv_p): Likewise.
>   * gimple-ssa-strength-reduction.c (legal_cast_p_1): Likewise.
>   * tree-scalar-evolution.c (simple_iv): Likewise.
>   (scev_const_prop): Likewise.
>   * tree-ssa-loop-niter.c (expand_simple_operations): Likewise.
>   * tree-vect-generic.c (expand_vector_operation): Likewise.
>   * tree.h (ANY_INTEGRAL_TYPE_CHECK): Define.
>   (ANY_INTEGRAL_TYPE_P): Define.
>   (TYPE_OVERFLOW_WRAPS, TYPE_OVERFLOW_UNDEFINED, TYPE_OVERFLOW_TRAPS):
>   Add ANY_INTEGRAL_TYPE_CHECK.
>   (any_integral_type_check): New function.
> 
> diff --git gcc/fold-const.c gcc/fold-const.c
> index 0d947ae..7b68bea 100644
> --- gcc/fold-const.c
> +++ gcc/fold-const.c
> @@ -558,7 +558,8 @@ fold_negate_expr (location_t loc, tree t)
>  case INTEGER_CST:
>tem = fold_negate_const (t, type);
>if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
> -   || (!TYPE_OVERFLOW_TRAPS (type)
> +   || (ANY_INTEGRAL_TYPE_P (type)
> +   && !TYPE_OVERFLOW_TRAPS (type)
> && TYPE_OVERFLOW_WRAPS (type))
> || (flag_sanitize & SANITIZE_SI_OVERFLOW) == 0)
>   return tem;
> @@ -5951,7 +5952,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
> tree wide_type,
>  || EXPRESSION_CLASS_P (op0))
> /* ... and has wrapping overflow, and its type is smaller
>than ctype, then we cannot pass through as widening.  */
> -   && ((TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0))
> +   && (((ANY_INTEGRAL_TYPE_P (TREE_TYPE (op0))
> + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0)))
>  && (TYPE_PRECISION (ctype)
>  > TYPE_PRECISION (TREE_TYPE (op0
> /* ... or this is a truncation (t is narrower than op0),
> @@ -5966,7 +5968,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
> tree wide_type,
> /* ... or has undefined overflow while the converted to
>type has not, we cannot do the operation in the inner type
>as that would introduce undefined overflow.  */
> -   || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))
> +   || ((ANY_INTEGRAL_TYPE_P (TREE_TYPE (op0))
> +&& TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0)))
> && !TYPE_OVERFLOW_UNDEFINED (type
>   break;
>  
> @@ -8497,7 +8500,8 @@ maybe_canonicalize_comparison_1 (location_t loc, enum 
> tree_code code, tree type,
>  
>/* Match A +- CST code arg1 and CST code arg1.  We can change the
>   first form only if overflow is undefined.  */
> -  if (!((TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
> +  if (!(((ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
> +   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
>/* In principle pointers also have undefined overflow behavior,
>   but that causes problems elsewhere.  */
>&& !POINTER_TYPE_P (TREE_TYPE (arg0))
> @@ -8712,7 +8716,9 @@ fold_comparison (location_t loc, enum tree_code code, 
> tree type,
>  
>/* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 -+ C1.  */
>if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
> -  && (equality_code || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
> +  && (equality_code
> +   || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
> +   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0
>&& TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
>&& !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
>&& TREE_CODE (arg1) == INTEGER_CST
> @@ -9031,7 +9037,8 @@ fold_comparison (location_t loc, enum tree_code code, 
> tree type,
>   X CMP Y +- C2 +- C1 for signed X, Y.  This is valid if
>   the resulting offset is smaller in absolute value than the
>   original one and has the same sign.  */
> -  if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
> +  if (ANY_INTEGRAL_TYP

[C++ Patch] Mini maybe_warn_about_useless_cast clean up?

2014-12-11 Thread Paolo Carlini


Hi,

yesterday, while working on c++/60955 I noticed this comment and 
wondered if for 5 we want to do the below. Certainly passes testing on 
x86_64-linux.


Thanks,
Paolo.

///
2014-12-11  Paolo Carlini  

* typeck.c (maybe_warn_about_useless_cast): Remove unnecessary
conditional.
Index: typeck.c
===
--- typeck.c(revision 218619)
+++ typeck.c(working copy)
@@ -6363,12 +6364,6 @@ maybe_warn_about_useless_cast (tree type, tree exp
   if (warn_useless_cast
   && complain & tf_warning)
 {
-  /* In C++14 mode, this interacts badly with force_paren_expr.  And it
-isn't necessary in any mode, because the code below handles
-glvalues properly.  For 4.9, just skip it in C++14 mode.  */
-  if (cxx_dialect < cxx14 && REFERENCE_REF_P (expr))
-   expr = TREE_OPERAND (expr, 0);
-
   if ((TREE_CODE (type) == REFERENCE_TYPE
   && (TYPE_REF_IS_RVALUE (type)
   ? xvalue_p (expr) : real_lvalue_p (expr))

RE: RFC: PATCH to genericize C++ loops to LOOP_EXPR instead of gotos

2014-12-11 Thread Bernd Edlinger


Hi Jason,

I have now managed to reproduce this fault and filed a bug report for it:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64265

Any ideas how this patch could move the __tsan_func_entry call into the loop?


Thanks
Bernd.

On Wed, 10 Dec 2014 00:10:07, Bernd Edlinger wrote:
>
> Hi Jason,
>
>> I ran the tramp3d benchmark over 500 iterations before and after the
> change and couldn't see any measurable difference in runtime. The
> binary with my
>> change was 0.4% smaller.
> I'm going to go ahead and check it in; if a performance hit shows up on
> the automated testing we can revisit the choice.
>
> Unfortunately, this checkin broke the thread sanitizer:
>
> r217669 | jason | 2014-11-17 20:08:02 +0100 (Mo, 17. Nov 2014) | 2 Zeilen
>
> * cp-gimplify.c (genericize_cp_loop): Use LOOP_EXPR.
> (genericize_for_stmt): Handle null statement-list.
>
>
> therefore I would kindly ask you to revert this again.
>
> After that patch some C++ functions moved the call to the __tsan_func_entry 
> into the loop.
> And we get crashes or major memory leaks from this, if the software is 
> compiled with -fsanitize=thread.
>
> This happens in a software package on which I currently work. It is called 
> Softing OPC UA Toolbox.
>
> I found this in the generated assembler code by bisection:
>
> 0092747d 
> <_ZNSt12_Destroy_auxILb0EE9__destroyIPN18SoftingOPCToolbox55ValueEEEvT_S5_>:
>   92747d:   55  push   %rbp
>   92747e:   48 89 e5mov%rsp,%rbp
>   927481:   53  push   %rbx
>   927482:   48 83 ec 18 sub$0x18,%rsp
>   927486:   48 89 7d e8 mov%rdi,-0x18(%rbp)
>   92748a:   48 89 75 e0 mov%rsi,-0x20(%rbp)
>   92748e:   48 8b 45 08 mov0x8(%rbp),%rax
>   927492:   48 89 c7mov%rax,%rdi
>   927495:   e8 26 33 fe ff  callq  90a7c0 <__tsan_func_entry@plt>
>   92749a:   48 8b 45 e8 mov-0x18(%rbp),%rax
>   92749e:   48 3b 45 e0 cmp-0x20(%rbp),%rax
>   9274a2:   74 3d   je 9274e1 
> <_ZNSt12_Destroy_auxILb0EE9__destroyIPN18SoftingOPCToolbox55ValueEEEvT_S5_+0x64>
>   9274a4:   48 8b 5d e8 mov-0x18(%rbp),%rbx
>   9274a8:   48 89 d8mov%rbx,%rax
>   9274ab:   48 85 dbtest   %rbx,%rbx
>   9274ae:   74 0b   je 9274bb 
> <_ZNSt12_Destroy_auxILb0EE9__destroyIPN18SoftingOPCToolbox55ValueEEEvT_S5_+0x3e>
>   9274b0:   48 89 c2mov%rax,%rdx
>   9274b3:   83 e2 07and$0x7,%edx
>   9274b6:   48 85 d2test   %rdx,%rdx
>   9274b9:   74 0f   je 9274ca 
> <_ZNSt12_Destroy_auxILb0EE9__destroyIPN18SoftingOPCToolbox55ValueEEEvT_S5_+0x4d>
>   9274bb:   48 89 c6mov%rax,%rsi
>   9274be:   48 8d 3d 1b a3 f8 00lea0xf8a31b(%rip),%rdi# 
> 18b17e0 
>   9274c5:   e8 06 36 fe ff  callq  90aad0 
> <__ubsan_handle_type_mismatch@plt>
>   9274ca:   48 89 dfmov%rbx,%rdi
>   9274cd:   e8 1b 09 00 00  callq  927ded 
> <_ZSt11__addressofIN18SoftingOPCToolbox55ValueEEPT_RS2_>
>   9274d2:   48 89 c7mov%rax,%rdi
>   9274d5:   e8 21 09 00 00  callq  927dfb 
> <_ZSt8_DestroyIN18SoftingOPCToolbox55ValueEEvPT_>
>   9274da:   48 83 45 e8 20  addq   $0x20,-0x18(%rbp)
>   9274df:   eb ad   jmp92748e 
> <_ZNSt12_Destroy_auxILb0EE9__destroyIPN18SoftingOPCToolbox55ValueEEEvT_S5_+0x11>
>   9274e1:   e8 da 2c fe ff  callq  90a1c0 <__tsan_func_exit@plt>
>   9274e6:   48 83 c4 18 add$0x18,%rsp
>   9274ea:   5b  pop%rbx
>   9274eb:   5d  pop%rbp
>   9274ec:   c3  retq
>   9274ed:   90  nop
>
>
> see the jmp at 9274df: it jumps to _before_ the tsan_func_entry.
> I am not sure how to locate the source code of the above assembler section.
> But I'd guess, it must be some kind of automatically generated default 
> destructor.
>
> All I can say in the moment, that it is was working perfectly before Nov 17.
>
>
> Thanks
> Bernd.
>

Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.

2014-12-11 Thread Richard Biener

On Thu, Dec 11, 2014 at 10:56 AM, Bin.Cheng  wrote:
> On Wed, Dec 10, 2014 at 9:47 PM, Richard Biener
>  wrote:
>> On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng  wrote:
>>> Hi,
>>> Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt
>>> issue still exists.
>>>
>>> Current candidate selecting algorithm tends to select fewer candidates given
>>> below reasons:
>>>   1) to better handle loops with many induction uses but the best choice is
>>> one generic basic induction variable;
>>>   2) to keep compilation time low.
>>>
>>> One fundamental weakness of the strategy is the opposite situation can't be
>>> handled properly sometimes.  For these cases the best choice is each
>>> induction variable has its own candidate.
>>> This patch fixes the problem by shuffling candidate set after fix-point is
>>> reached by current implementation.  The reason why this strategy works is it
>>> replaces candidate set by selecting local optimal candidate for some
>>> induction uses, and the new candidate set (has lower cost) is exact what we
>>> want in the mentioned case.  Instrumentation data shows this can find better
>>> candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64.
>>>
>>> This patch actually is extension to the first version patch posted at
>>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds
>>> another selecting pass with special seed set (more or less like the shuffled
>>> set in this patch).  Data also confirms this patch can find optimal sets for
>>> most loops found by the first one, as well as optimal sets for many new
>>> loops.
>>>
>>> Bootstrap and test on x86_64, no regression on benchmarks.  Bootstrap and
>>> test on aarch64.
>>> Since this patch only selects candidate set with lower cost, any regressions
>>> revealed are latent bugs of other components in GCC.
>>> I also collected GCC bootstrap time on x86_64, no regression either.
>>> Is this OK?
>>
>> The algorithm seems to be quadratic in the number of IV candidates
>> (at least):
> Yes, I worried about that too, that's why I measured the bootstrap
> time.  One way is restrict this procedure one time for each loop.  I
> already tried that and it can capture +90% loops.  Is this sounds
> reasonable?

Yes.  That's my suggestion to handle it in the caller of try_improve_iv_set?

> BTW, do we have some compilation time benchmarks for GCC?

There are various testcases linked from PR47344, I don't remember
any particular one putting load on IVOPTs (but I do remember seeing
IVOPTs in the ~25% area in -ftime-report for some testcases).

Thanks,
Richard.

> Thanks,
> bin
>>
>> + for (i = 0; i < n_iv_cands (data); i++)
>> +   {
>> ...
>> + iv_ca_replace (data, ivs, cand, act_delta, &tmp_delta);
>> ...
>>
>> and
>>
>> +static void
>> +iv_ca_replace (struct ivopts_data *data, struct iv_ca *ivs,
>> +  struct iv_cand *cand, struct iv_ca_delta *act_delta,
>> +  struct iv_ca_delta **delta)
>> +{
>> ...
>> +  for (i = 0; i < ivs->upto; i++)
>> +{
>> ...
>> +  if (data->consider_all_candidates)
>> +   {
>> + for (j = 0; j < n_iv_cands (data); j++)
>> +   {
>>
>> possibly cubic if ivs->upto is of similar value.
>>
>> I wonder if it is possible to restrict this to the single IV with
>> the largest delta?  After all we are iterating try_improve_iv_set.
>> Alternatively move the handling out of iteration completey,
>> thus into the caller of try_improve_iv_set?
>>
>> Note that compile-time issues always arise in auto-generated code,
>> not during GCC bootstrap.
>>
>> Richard.
>>
>>
>>> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>>>
>>>   PR tree-optimization/62178
>>>   * tree-ssa-loop-ivopts.c (iv_ca_replace): New function.
>>>   (try_improve_iv_set): Shuffle candidates set in order to handle
>>>   case in which candidate wrto each iv use should be selected.
>>>
>>> gcc/testsuite/ChangeLog
>>> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>>>
>>>   PR tree-optimization/62178
>>>   * gcc.target/aarch64/pr62178.c: New test.

Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain

2014-12-11 Thread Dominik Vogt

On Thu, Dec 11, 2014 at 11:31:06AM +0100, Dominik Vogt wrote:
> Just to make this clear:  It's not something that *might* happen.
> It *does* happen on s390[x] which does not use libffi but the hand
> written code in makefunc_s390.S and makefuncgo_s390[x].go.
> 
> The same may not happen when calling functions through libffi
> (which may be dynamically linked) because ffi_call_go() is passed
> the closure pointer as an argument and not in the static chain
> register.

Update:  If I disable the custom s390x code and switch to the
implementation just using libffi for reflection calls, the same
crash occurs with the testing/quick libgo test case.  The called
function sees a bogus value written by the dynamic linker as the
closure pointer, for example with this line in the test code:

  CheckEqual(fComplex64, fComplex64, nil)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

PR64182: Fix rounding division and modulus

2014-12-11 Thread Richard Sandiford

As pointed out in PR 64182, wide-int rounded division gets the
ties-away-from-zero case wrong for odd-numbered divisors, while
double_int gets the unsigned case wrong by unconditionally treating
a divisor or remainder with the top bit set as negative.  As Jakub
says, the test used in double_int might also have overflow problems.

This patch uses:

   abs (remainder) >= abs (divisor) - abs (remainder)

for both wide-int and double_int and fixes the unsigned case in double_int.
I didn't know how to test the double_int change using input code so
resorted to doing some double_int arithmetic at the start of main.
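
A minimal sketch of the rounding test above, assuming plain 64-bit host integers rather than GCC's wide-int or double_int types.  All function names here are hypothetical and are not part of the patch; div_round_old only models the shift-based test that is being replaced, for comparison.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; these are not GCC's wi::div_round.

// Unsigned division rounded to nearest, ties away from zero.
// rem < y always holds, so y - rem cannot wrap, whereas 2 * rem could.
static uint64_t div_round_unsigned (uint64_t x, uint64_t y)
{
  assert (y != 0);
  uint64_t quo = x / y;
  uint64_t rem = x % y;
  if (rem >= y - rem)
    return quo + 1;
  return quo;
}

// Model of the replaced test: rem >= y >> 1.  For odd y the shift
// drops a bit, so a remainder just below one half is rounded up.
static uint64_t div_round_old (uint64_t x, uint64_t y)
{
  assert (y != 0);
  uint64_t quo = x / y;
  uint64_t rem = x % y;
  if (rem >= (y >> 1))
    return quo + 1;
  return quo;
}

// Signed variant.  The caller must avoid y == 0 and INT64_MIN / -1.
static int64_t div_round_signed (int64_t x, int64_t y)
{
  assert (y != 0);
  int64_t quo = x / y;   // truncates toward zero
  int64_t rem = x % y;   // takes the sign of x
  // Absolute values are computed in unsigned arithmetic to avoid overflow.
  uint64_t abs_rem = rem < 0 ? 0 - (uint64_t) rem : (uint64_t) rem;
  uint64_t abs_y = y < 0 ? 0 - (uint64_t) y : (uint64_t) y;
  if (abs_rem >= abs_y - abs_rem)
    return ((x < 0) != (y < 0)) ? quo - 1 : quo + 1;
  return quo;
}
```

With these definitions 1 / 3 rounds to 0 under the new test, while the shift-based model rounds it to 1, which is the kind of failure the Ada testcase below checks for.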

Thanks to Joseph for the testcase.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
PR middle-end/64182
* wide-int.h (wi::div_round, wi::mod_round): Fix rounding of tied
cases.
* double-int.c (div_and_round_double): Fix handling of unsigned
cases.  Use same rounding approach as wide-int.h.

gcc/testsuite/
2014-xx-xx  Joseph Myers  

PR middle-end/64182
* gnat.dg/round_div.adb: New test.

Index: gcc/double-int.c
===
--- gcc/double-int.c 2014-12-11 10:45:44.430786435 +
+++ gcc/double-int.c 2014-12-11 10:46:10.570461030 +
@@ -569,24 +569,23 @@ div_and_round_double (unsigned code, int
   {
unsigned HOST_WIDE_INT labs_rem = *lrem;
HOST_WIDE_INT habs_rem = *hrem;
-   unsigned HOST_WIDE_INT labs_den = lden, ltwice;
-   HOST_WIDE_INT habs_den = hden, htwice;
+   unsigned HOST_WIDE_INT labs_den = lden, lnegabs_rem, ldiff;
+   HOST_WIDE_INT habs_den = hden, hnegabs_rem, hdiff;
 
/* Get absolute values.  */
-   if (*hrem < 0)
+   if (!uns && *hrem < 0)
  neg_double (*lrem, *hrem, &labs_rem, &habs_rem);
-   if (hden < 0)
+   if (!uns && hden < 0)
  neg_double (lden, hden, &labs_den, &habs_den);
 
-   /* If (2 * abs (lrem) >= abs (lden)), adjust the quotient.  */
-   mul_double ((HOST_WIDE_INT) 2, (HOST_WIDE_INT) 0,
-   labs_rem, habs_rem, &ltwice, &htwice);
+   /* If abs(rem) >= abs(den) - abs(rem), adjust the quotient.  */
+   neg_double (labs_rem, habs_rem, &lnegabs_rem, &hnegabs_rem);
+   add_double (labs_den, habs_den, lnegabs_rem, hnegabs_rem,
+   &ldiff, &hdiff);
 
-   if (((unsigned HOST_WIDE_INT) habs_den
-< (unsigned HOST_WIDE_INT) htwice)
-   || (((unsigned HOST_WIDE_INT) habs_den
-== (unsigned HOST_WIDE_INT) htwice)
-   && (labs_den <= ltwice)))
+   if (((unsigned HOST_WIDE_INT) habs_rem
+> (unsigned HOST_WIDE_INT) hdiff)
+   || (habs_rem == hdiff && labs_rem >= ldiff))
  {
if (quo_neg)
  /* quo = quo - 1;  */
Index: gcc/testsuite/gnat.dg/round_div.adb
===
--- /dev/null   2014-11-19 08:41:51.310561007 +
+++ gcc/testsuite/gnat.dg/round_div.adb 2014-12-11 10:46:10.570461030 +
@@ -0,0 +1,17 @@
+-- { dg-do run }
+-- { dg-options "-O3" }
+procedure Round_Div is
+   type Fixed is delta 1.0 range -2147483648.0 .. 2147483647.0;
+   A : Fixed := 1.0;
+   B : Fixed := 3.0;
+   C : Integer;
+   function Divide (X, Y : Fixed) return Integer is
+   begin
+  return Integer (X / Y);
+   end;
+begin
+   C := Divide (A, B);
+   if C /= 0 then
+  raise Program_Error;
+   end if;
+end Round_Div;
Index: gcc/wide-int.h
===
--- gcc/wide-int.h  2014-12-11 10:45:44.434786385 +
+++ gcc/wide-int.h  2014-12-11 10:46:10.570461030 +
@@ -2616,8 +2616,8 @@ wi::div_round (const T1 &x, const T2 &y,
 {
   if (sgn == SIGNED)
{
- if (wi::ges_p (wi::abs (remainder),
-wi::lrshift (wi::abs (y), 1)))
+ WI_BINARY_RESULT (T1, T2) abs_remainder = wi::abs (remainder);
+ if (wi::geu_p (abs_remainder, wi::abs (y) - abs_remainder))
{
  if (wi::neg_p (x, sgn) != wi::neg_p (y, sgn))
return quotient - 1;
@@ -2627,7 +2627,7 @@ wi::div_round (const T1 &x, const T2 &y,
}
   else
{
- if (wi::geu_p (remainder, wi::lrshift (y, 1)))
+ if (wi::geu_p (remainder, y - remainder))
return quotient + 1;
}
 }
@@ -2784,8 +2784,8 @@ wi::mod_round (const T1 &x, const T2 &y,
 {
   if (sgn == SIGNED)
{
- if (wi::ges_p (wi::abs (remainder),
-wi::lrshift (wi::abs (y), 1)))
+ WI_BINARY_RESULT (T1, T2) abs_remainder = wi::abs (remainder);
+ if (wi::geu_p (abs_remainder, wi::abs (y) - abs_remainder))
{
  if (wi::neg_p (x, sgn) != wi::neg_p (y, sgn))
return remainder + y;
@@ -2795,7 +2795,7 @@ wi::mod_round (const T1 &x, const T2 &y,
}
   else
{
-
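
The double-int.c hunk above works on (low, high) word pairs.  The following is a minimal sketch of the same comparison, assuming a hypothetical TwoWord struct with two unsigned 64-bit words; it is not the code from the patch, which computes the difference with neg_double followed by add_double rather than a direct subtraction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-word unsigned value (low word, high word), standing
// in for the (l, h) pairs that double-int.c passes around.
struct TwoWord
{
  uint64_t lo;
  uint64_t hi;
};

// a - b modulo 2^128, borrowing from the high word when needed.
static TwoWord sub_two_word (TwoWord a, TwoWord b)
{
  TwoWord r;
  r.lo = a.lo - b.lo;
  r.hi = a.hi - b.hi - (a.lo < b.lo ? 1 : 0);
  return r;
}

// Unsigned a >= b: compare the high words first, then the low words.
static bool geu_two_word (TwoWord a, TwoWord b)
{
  return a.hi > b.hi || (a.hi == b.hi && a.lo >= b.lo);
}

// Rounding test: abs_rem >= abs_den - abs_rem.  A remainder is always
// smaller than the divisor, so the subtraction cannot wrap.
static bool round_away_p (TwoWord abs_rem, TwoWord abs_den)
{
  assert (!geu_two_word (abs_rem, abs_den));
  return geu_two_word (abs_rem, sub_two_word (abs_den, abs_rem));
}
```

The last case in the checks below uses a remainder of 2^127 with a divisor of 2^127 + 1.  Doubling that remainder would wrap to zero in two words, which is the overflow concern mentioned in the message; the subtraction form does not have that problem.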

Re: [PATCH][ARM] Fix names of some rounding intrinsics, implement vrndx_f32 and vrndxq_f32

2014-12-11 Thread Ramana Radhakrishnan

On Tue, Sep 23, 2014 at 4:07 PM, Kyrill Tkachov  wrote:
> Hi all,
>
> Some intrinsics had the wrong name (inconsistent with the NEON intrinsics
> spec). This patch fixes that and adds the vrndx_f32 and vrndxq_f32
> intrinsics that were missing.
> These map down to vrintx.f32 NEON instructions (d and q forms). We already
> had builtins defined for them, just the intrinsics were not wired up to them
> properly.
>
> Tested arm-none-eabi
>
> Ok for trunk?

This is OK if no regressions.

Ramana
>
> 2014-09-23  Kyrylo Tkachov  
>
> * config/arm/arm_neon.h (vrndqn_f32): Rename to...
> (vrndnq_f32): ... this.
> (vrndqa_f32): Rename to...
> (vrndaq_f32): ... this.
> (vrndqp_f32): Rename to...
> (vrndpq_f32): ... this.
> (vrndqm_f32): Rename to...
> (vrndmq_f32): ... this.
> (vrndx_f32): New intrinsic.
> (vrndxq_f32): Likewise.
>
> 2014-09-23  Kyrylo Tkachov  
>
> * gcc.target/arm/simd/neon-vrndx_f32_1.c: New test.
> * gcc.target/arm/simd/neon-vrndxq_f32_1.c: Likewise.
> * gcc.target/arm/neon/vrndqaf32.c: Rename to...
> * gcc.target/arm/neon/vrndaqf32.c: ... This. Update intrinsic names.
> * gcc.target/arm/neon/vrndqmf32.c: Rename to...
> * gcc.target/arm/neon/vrndmqf32.c: ... This. Update intrinsic names.
> * gcc.target/arm/neon/vrndqnf32.c: Rename to...
> * gcc.target/arm/neon/vrndnqf32.c: ... This. Update intrinsic names.
> * gcc.target/arm/neon/vrndqpf32.c: Rename to...
> * gcc.target/arm/neon/vrndpqf32.c: ... This. Update intrinsic names.

Re: [PATCH] IPA ICF: refactoring + fix for PR ipa/63569

2014-12-11 Thread Richard Biener

On Wed, Dec 10, 2014 at 1:18 PM, Martin Liška  wrote:
> Hello.
>
> As suggested by Richard, I split compare_operand functions to various
> functions
> related to a specific comparison. Apart from that I added fast check for
> volatility flag that caused miscompilation mentioned in PR63569.
>
> Patch can bootstrap on x86_64-linux-pc without any regression seen and I was
> able to build Firefox with LTO.
>
> Ready for trunk?

Hmm, I don't think the dispatch to compare_memory_operand is at the
correct place.  It should be called from places where currently
compare_operand is called and it should recurse to compare_operand.
That is, it is more "high-level".

Can you please fix the volatile issue separately?  It's also not necessary
to do that check on every operand but just on memory operands.

Thanks,
Richard.

> Thanks,
> Martin

Re: PR64182: Fix rounding division and modulus

2014-12-11 Thread Richard Biener

On Thu, Dec 11, 2014 at 1:26 PM, Richard Sandiford
 wrote:
> As pointed out in PR 64182, wide-int rounded division gets the
> ties-away-from-zero case wrong for odd-numbered divisors, while
> double_int gets the unsigned case wrong by unconditionally treating
> a divisor or remainder with the top bit set as negative.  As Jakub
> says, the test used in double_int might also have overflow problems.
>
> This patch uses:
>
>    abs (remainder) >= abs (divisor) - abs (remainder)
>
> for both wide-int and double_int and fixes the unsigned case in double_int.
> I didn't know how to test the double_int change using input code so
> resorted to doing some double_int arithmetic at the start of main.
>
> Thanks to Joseph for the testcase.
>
> Tested on x86_64-linux-gnu.  OK to install?

Can you add a testcase?  You can follow the gcc.dg/plugin/sreal_plugin.c
example, maybe even make it a generic host_test_plugin.c with separate
files containing the actual tests.

Otherwise ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> PR middle-end/64182
> * wide-int.h (wi::div_round, wi::mod_round): Fix rounding of tied
> cases.
> * double-int.c (div_and_round_double): Fix handling of unsigned
> cases.  Use same rounding approach as wide-int.h.
>
> gcc/testsuite/
> 2014-xx-xx  Joseph Myers  
>
> PR middle-end/64182
> * gnat.dg/round_div.adb: New test.
>
> Index: gcc/double-int.c
> ===
> --- gcc/double-int.c 2014-12-11 10:45:44.430786435 +
> +++ gcc/double-int.c 2014-12-11 10:46:10.570461030 +
> @@ -569,24 +569,23 @@ div_and_round_double (unsigned code, int
>{
> unsigned HOST_WIDE_INT labs_rem = *lrem;
> HOST_WIDE_INT habs_rem = *hrem;
> -   unsigned HOST_WIDE_INT labs_den = lden, ltwice;
> -   HOST_WIDE_INT habs_den = hden, htwice;
> +   unsigned HOST_WIDE_INT labs_den = lden, lnegabs_rem, ldiff;
> +   HOST_WIDE_INT habs_den = hden, hnegabs_rem, hdiff;
>
> /* Get absolute values.  */
> -   if (*hrem < 0)
> +   if (!uns && *hrem < 0)
>   neg_double (*lrem, *hrem, &labs_rem, &habs_rem);
> -   if (hden < 0)
> +   if (!uns && hden < 0)
>   neg_double (lden, hden, &labs_den, &habs_den);
>
> -   /* If (2 * abs (lrem) >= abs (lden)), adjust the quotient.  */
> -   mul_double ((HOST_WIDE_INT) 2, (HOST_WIDE_INT) 0,
> -   labs_rem, habs_rem, &ltwice, &htwice);
> +   /* If abs(rem) >= abs(den) - abs(rem), adjust the quotient.  */
> +   neg_double (labs_rem, habs_rem, &lnegabs_rem, &hnegabs_rem);
> +   add_double (labs_den, habs_den, lnegabs_rem, hnegabs_rem,
> +   &ldiff, &hdiff);
>
> -   if (((unsigned HOST_WIDE_INT) habs_den
> -< (unsigned HOST_WIDE_INT) htwice)
> -   || (((unsigned HOST_WIDE_INT) habs_den
> -== (unsigned HOST_WIDE_INT) htwice)
> -   && (labs_den <= ltwice)))
> +   if (((unsigned HOST_WIDE_INT) habs_rem
> +> (unsigned HOST_WIDE_INT) hdiff)
> +   || (habs_rem == hdiff && labs_rem >= ldiff))
>   {
> if (quo_neg)
>   /* quo = quo - 1;  */
> Index: gcc/testsuite/gnat.dg/round_div.adb
> ===
> --- /dev/null   2014-11-19 08:41:51.310561007 +
> +++ gcc/testsuite/gnat.dg/round_div.adb 2014-12-11 10:46:10.570461030 +
> @@ -0,0 +1,17 @@
> +-- { dg-do run }
> +-- { dg-options "-O3" }
> +procedure Round_Div is
> +   type Fixed is delta 1.0 range -2147483648.0 .. 2147483647.0;
> +   A : Fixed := 1.0;
> +   B : Fixed := 3.0;
> +   C : Integer;
> +   function Divide (X, Y : Fixed) return Integer is
> +   begin
> +  return Integer (X / Y);
> +   end;
> +begin
> +   C := Divide (A, B);
> +   if C /= 0 then
> +  raise Program_Error;
> +   end if;
> +end Round_Div;
> Index: gcc/wide-int.h
> ===
> --- gcc/wide-int.h  2014-12-11 10:45:44.434786385 +
> +++ gcc/wide-int.h  2014-12-11 10:46:10.570461030 +
> @@ -2616,8 +2616,8 @@ wi::div_round (const T1 &x, const T2 &y,
>  {
>if (sgn == SIGNED)
> {
> - if (wi::ges_p (wi::abs (remainder),
> -wi::lrshift (wi::abs (y), 1)))
> + WI_BINARY_RESULT (T1, T2) abs_remainder = wi::abs (remainder);
> + if (wi::geu_p (abs_remainder, wi::abs (y) - abs_remainder))
> {
>   if (wi::neg_p (x, sgn) != wi::neg_p (y, sgn))
> return quotient - 1;
> @@ -2627,7 +2627,7 @@ wi::div_round (const T1 &x, const T2 &y,
> }
>else
> {
> - if (wi::geu_p (remainder, wi::lrshift (y, 1)))
> + if (wi::geu_p (remainder, y - remainder))
> return quotient + 1;
> }
>  }
> @@ -2784,8 +2784,8 @@ wi::mod_round (const T1 &x, const T2 &y,
>

[PATCH] [AArch64, NEON] Add vfms_n_f32, vfmsq_n_f32 and vfmsq_n_f64 specified by the ACLE

2014-12-11 Thread Yangfei (Felix)

Hi, 

  This patch adds three intrinsics that are required by the ACLE specification. 
  A new testcase is added which covers vfms_n_f32 and vfmsq_n_f32. Tested on 
both aarch64-linux-gnu and aarch64_be-linux-gnu. 
  OK? 


Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 218582)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2014-12-11  Felix Yang  
+
+   * config/aarch64/arm_neon.h (vfms_n_f32, vfmsq_n_f32, vfmsq_n_f64): New
+   intrinsics.
+
 2014-12-10  Felix Yang  
 
* config/aarch64/aarch64-protos.h (aarch64_function_profiler): Remove
Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfms_n.c
===
--- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfms_n.c (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vfms_n.c (revision 0)
@@ -0,0 +1,67 @@
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#ifdef __aarch64__
+/* Expected results.  */
+VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4438ca3d, 0x44390a3d };
+VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x44869eb8, 0x4486beb8, 0x4486deb8, 0x4486feb8 };
+
+#define VECT_VAR_ASSIGN(S,Q,T1,W) S##Q##_##T1##W
+#define ASSIGN(S, Q, T, W, V) T##W##_t S##Q##_##T##W = V
+#define TEST_MSG "VFMS_N/VFMSQ_N"
+
+void exec_vfms_n (void)
+{
+  /* Basic test: v4=vfms_n(v1,v2), then store the result.  */
+#define TEST_VFMS(Q, T1, T2, W, N) \
+  VECT_VAR(vector_res, T1, W, N) = \
+vfms##Q##_n_##T2##W(VECT_VAR(vector1, T1, W, N),   \
+   VECT_VAR(vector2, T1, W, N),\
+   VECT_VAR_ASSIGN(scalar, Q, T1, W)); \
+  vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vector_res, T1, W, N))
+
+#define CHECK_VFMS_RESULTS(test_name,comment)  \
+  {\
+CHECK_FP(test_name, float, 32, 2, PRIx32, expected, comment);  \
+CHECK_FP(test_name, float, 32, 4, PRIx32, expected, comment);  \
+  }
+
+#define DECL_VABD_VAR(VAR) \
+  DECL_VARIABLE(VAR, float, 32, 2);\
+  DECL_VARIABLE(VAR, float, 32, 4);\
+
+  DECL_VABD_VAR(vector1);
+  DECL_VABD_VAR(vector2);
+  DECL_VABD_VAR(vector3);
+  DECL_VABD_VAR(vector_res);
+
+  clean_results ();
+
+  /* Initialize input "vector1" from "buffer".  */
+  VLOAD(vector1, buffer, , float, f, 32, 2);
+  VLOAD(vector1, buffer, q, float, f, 32, 4);
+
+  /* Choose init value arbitrarily.  */
+  VDUP(vector2, , float, f, 32, 2, -9.3f);
+  VDUP(vector2, q, float, f, 32, 4, -29.7f);
+  
+  /* Choose init value arbitrarily.  */
+  ASSIGN(scalar, , float, 32, 81.2f);
+  ASSIGN(scalar, q, float, 32, 36.8f);
+
+  /* Execute the tests.  */
+  TEST_VFMS(, float, f, 32, 2);
+  TEST_VFMS(q, float, f, 32, 4);
+
+  CHECK_VFMS_RESULTS (TEST_MSG, "");
+}
+#endif
+
+int main (void)
+{
+#ifdef __aarch64__
+  exec_vfms_n ();
+#endif
+  return 0;
+}
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 218582)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2014-12-08  Felix Yang  
+
+   * gcc.target/aarch64/advsimd-intrinsics/vfms_n.c: New test.
+
 2014-12-10  Martin Liska  
 
* gcc.dg/ipa/pr63909.c: New test.
Index: gcc/config/aarch64/arm_neon.h
===
--- gcc/config/aarch64/arm_neon.h   (revision 218582)
+++ gcc/config/aarch64/arm_neon.h   (working copy)
@@ -15254,7 +15254,24 @@ vfmsq_f64 (float64x2_t __a, float64x2_t __b, float
   return __builtin_aarch64_fmav2df (-__b, __c, __a);
 }
 
+__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
+vfms_n_f32 (float32x2_t __a, float32x2_t __b, float32_t __c)
+{
+  return __builtin_aarch64_fmav2sf (-__b, vdup_n_f32 (__c), __a);
+}
 
+__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
+vfmsq_n_f32 (float32x4_t __a, float32x4_t __b, float32_t __c)
+{
+  return __builtin_aarch64_fmav4sf (-__b, vdupq_n_f32 (__c), __a);
+}
+
+__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
+vfmsq_n_f64 (float64x2_t __a, float64x2_t __b, float64_t __c)
+{
+  return __builtin_aarch64_fmav2df (-__b, vdupq_n_f64 (__c), __a);
+}
+
 /* vfms_lane  */
 
 __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))


add-vfms_n-v1.diff
Description: add-vfms_n-v1.diff
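
As an illustration of what the new intrinsics compute, here is a minimal portable sketch, assuming the lane-wise meaning that the patch's builtin calls suggest: each result lane is a[i] - b[i] * c with a single rounding.  fms_n_model is a hypothetical name, and std::array stands in for the NEON vector types; the real intrinsics are the ones defined in arm_neon.h above.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical portable model of vfms_n_f32 / vfmsq_n_f32 / vfmsq_n_f64.
// The scalar c is applied to every lane, mirroring the vdup of __c in
// the patch; std::fma gives the fused (singly rounded) multiply-add.
template <typename T, std::size_t N>
std::array<T, N> fms_n_model (const std::array<T, N> &a,
                              const std::array<T, N> &b, T c)
{
  std::array<T, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::fma (-b[i], c, a[i]);   // a[i] - b[i] * c
  return r;
}
```

For example, with a = {10, 1}, b = {2, 0.5} and c = 3 the model yields {4, -0.5}.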

Re: [PATCH] PR other/63613: Add fixincludes for dejagnu.h

2014-12-11 Thread Rainer Orth

David Malcolm  writes:

> On Mon, 2014-12-08 at 14:13 +0100, Rainer Orth wrote:
>> Jeff Law  writes:
>> 
>> > On 12/04/14 15:42, Rainer Orth wrote:
>> >> David Malcolm  writes:
>> >>
>> >>> dejagnu.h assumed -fgnu89-inline until a recent upstream fix;
>> >>> see http://lists.gnu.org/archive/html/dejagnu/2014-10/msg00011.html
>> >>>
>> >>> Remove the workaround from jit.exp that used -fgnu89-inline
>> >>> in favor of a fixincludes to dejagnu.h that applies the upstream fix
>> >>> to a local copy.
>> >>>
>> >>> This should make it easier to support C++ testcases from jit.exp.
>> >>
>> >> I wonder how this would work if dejagnu.h doesn't live in a system
>> >> include dir (e.g. a self-compiled version)?  fixincludes won't touch
>> >> those AFAIU.  The previous version with -fgnu89-inline would still work
>> >> in that case provided dejagnu.h is found at all.
>> > Presumably in that case the answer is upgrade dejagnu? :-)
>> 
>> I've two problems with this:
>> 
>> * There's not yet a DejaGnu release available with the fix and I've no
>>   idea if there are any planned any time soon.  Not everyone is
>>   comfortable with random git (or whatever) snapshots.
>
> FWIW I've asked on the DejaGnu mailing list, and Ben Elliston said:
>> Yes. I plan on releasing 1.6 over the holidays.
> http://lists.gnu.org/archive/html/dejagnu/2014-12/msg1.html

Thanks for checking this, but ...

>> * I don't consider this a critical issue that cannot work without
>>   current releases.  We're already working around several upstream
>>   DejaGnu issues in our codebase, and I don't consider this particular
>>   one important enough to require everyone to upgrade to a not-a-release
>>   version.

... a DejaGnu 1.6 release would only address one part of my concern: I
still don't believe this minor issue warrants us demanding that all gcc
testers upgrade to a newer DejaGnu release.  I'd like my fellow
testsuite maintainers to weigh in, though.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

2014-12-11 Thread Christophe Lyon

On 11 December 2014 at 11:16, David Sherwood  wrote:
> Hi Christophe,
>
> Sorry to bother you again. After my clarification email below are you now
> happy for these patches to go in?
>
> Kind Regards,
> David Sherwood.
>
>> -Original Message-
>> From: David Sherwood [mailto:david.sherw...@arm.com]
>> Sent: 27 November 2014 14:53
>> To: 'Christophe Lyon'
>> Cc: gcc-patches@gcc.gnu.org; Marcus Shawcroft; Alan Hayward; 'Tejas 
>> Belagod'; Richard Sandiford
>> Subject: RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes 
>> endianness-safe.
>>
>> > On 18 November 2014 10:14, David Sherwood  wrote:
>> > > Hi Christophe,
>> > >
>> > > Ah sorry. My mistake - it fixes this in bugzilla:
>> > >
>> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810
>> >
>> > I did look at that PR, but since it has no testcase attached, I was unsure.
>> > And I am still not :-)
>> > PR 59810 is "[AArch64] LDn/STn implementations are not ABI-conformant
>> > for bigendian."
>> > but the advsimd-intrinsics/vldX.c and vldX_lane.c now PASS with Alan's
>> > patches on aarch64_be, so I thought Alan's patches solve PR59810.
>> >
>> > What am I missing?
>>
>> Hi Christophe,
>>
>> I think probably this is our fault for making our lives way too difficult and
>> artificially splitting all these patches up. :)
>>
>> Alan's patch:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00952.html
>>
>> fixes some issues on aarch64_be, but also causes regressions. For example,
>>
>> 
>> Tests that now fail, but worked before:
>>
>> aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects 
>> execution test
>> aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c execution test
>> aarch64_be-elf-aem: gcc.dg/vect/vect-over-widen-1-big-array.c -flto 
>> -ffat-lto-objects execution test
>> ...
>>
>> Tests that now work, but didn't before:
>>
>> aarch64_be-elf-aem: gcc.dg/vect/fast-math-vect-complex-3.c execution test
>> aarch64_be-elf-aem: gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c execution test
>> aarch64_be-elf-aem: gcc.dg/vect/no-scevccp-outer-10a.c execution test
>> ...
>> 
I didn't notice that because I tested Alan's patch only against the
advsimd-intrinsics tests.
In this respect, I don't understand why your ChangeLog entry says
   * config/aarch64/aarch64-simd.md (vec_store_lanes(o/c/x)i,
vec_load_lanes(o/c/x)i): Fixed to work for Big Endian.
since the existing advsimd-intrinsics tests already pass with Alan's patch alone.
Or is vld1_lane still broken (for which I haven't posted a test yet)?

>> His patch is only half of the story and must be applied at the same time as 
>> the
>> "[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe."
>> patch. With both patches applied the result looks much healthier:
>>
>> 
>> # Comparing 1 common sum files
>> ## /bin/sh ./src/gcc/contrib/compare_tests  /tmp/gxx-sum1.10051 
>> /tmp/gxx-sum2.10051
>> Tests that now work, but didn't before:
>>
>> aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer  
>> execution test
>> aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer 
>> -funroll-all-loops -finline-
>> functions  execution test
>> aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer 
>> -funroll-loops  execution test
>> ...
>> 
>>
>> with no new regressions. After applying both patches the aarch64_be gcc 
>> testsuite is
>> on a parity with the aarch64 testsuite. Furthermore, after applying both of 
>> these patches:
>>
>> "[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe"
>> "[AArch64] [BE] Fix vector load/stores to not use ld1/st1"
>>
>> it then becomes safe for us to remove the CCMC macro, which is the cause of
>> unnecessary spills to the stack for certain auto-vectorised code. So really I
>> suppose when I posted my second patch
>>
>> "[AArch64] [BE] [2/2] Make large opaque integer modes endianness-safe"
>>
>> I should have really just called this
>>
>> "[AArch64] [BE] Remove CCMC for aarch64"
>>
>> in order to make it clear exactly what the purpose of these patches is.
Well, not yet, since this very patch does not remove it :-)

>>
>> Kind Regards,
>> David Sherwood.
>
>
>
>

[PATCH] Fix for PR ipa/64146

2014-12-11 Thread Martin Liška


Hello.

In PR64146, for position independent code IPA ICF should be more careful about 
thunk creation.
Patch can bootstrap on x86_64-linux-pc and no new regression was seen.

Ready for trunk?
Thank you,
Martin
From e57dbf95cf27c2d5da2322ee75dca6361ab59c8a Mon Sep 17 00:00:00 2001
From: mliska 
Date: Wed, 10 Dec 2014 14:46:28 +0100
Subject: [PATCH] IPA ICF: Fix for PR ipa/64146

gcc/ChangeLog:

2014-12-10  Martin Liska  

	PR ipa/64146
	* ipa-icf.c (sem_function::merge): Check for
	decl_binds_to_current_def_p is newly added to merge operation.

gcc/testsuite/ChangeLog:

2014-12-10  Martin Liska  

	* g++.dg/ipa/pr64146.C: New test.
---
 gcc/ipa-icf.c  |  8 
 gcc/testsuite/g++.dg/ipa/pr64146.C | 37 +
 2 files changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr64146.C

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index b193200..91878b2 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -101,6 +101,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include "ipa-icf-gimple.h"
 #include "ipa-icf.h"
+#include "varasm.h"
 
 using namespace ipa_icf_gimple;
 
@@ -624,6 +625,13 @@ sem_function::merge (sem_item *alias_item)
 	return false;
   }
 
+  if (!decl_binds_to_current_def_p (alias->decl))
+{
+  if (dump_file)
+	fprintf (dump_file, "Declaration does not bind to currect definition.\n\n");
+  return false;
+}
+
   if (redirect_callers)
 {
   /* If alias is non-overwritable then
diff --git a/gcc/testsuite/g++.dg/ipa/pr64146.C b/gcc/testsuite/g++.dg/ipa/pr64146.C
new file mode 100644
index 000..90c5093
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr64146.C
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-fpic -fdump-ipa-icf-details -fipa-icf"  } */
+
+extern "C" const char*
+foo()
+{
+  return "original";
+}
+
+const char*
+test_foo()
+{
+  return foo();
+}
+
+extern "C" const char*
+bar()
+{
+  return "original";
+}
+
+const char*
+test_bar()
+{
+  return bar();
+}
+
+int main (int argc, char **argv)
+{
+  test_foo ();
+  test_bar ();
+
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump-times "Declaration does not bind to currect definition." 2 "icf"  } } */
+/* { dg-final { scan-ipa-dump "Equal symbols: 2" "icf"  } } */
-- 
2.1.2

Re: [PATCH] Fix for PR ipa/64146

2014-12-11 Thread Richard Biener

On Thu, Dec 11, 2014 at 2:49 PM, Martin Liška  wrote:
> Hello.
>
> In PR64146, for position independent code IPA ICF should be more careful
> about thunk creation.
> Patch can bootstrap on x86_64-linux-pc and no new regression was seen.
>
> Ready for trunk?

Hmm, does that merge the functions but
keep a call to the original alias which can be overridden at runtime?

If so, ok.

Thanks,
Richard.

> Thank you,
> Martin

Re: [C++ Patch] Mini maybe_warn_about_useless_cast clean up?

2014-12-11 Thread Jason Merrill


OK.

Jason

Re: [patch] Fix tilepro includes

2014-12-11 Thread Andrew MacLeod


On 12/08/2014 11:23 AM, Jan-Benedict Glaw wrote:
> On Fri, 2014-11-21 08:45:11 -0500, Andrew MacLeod  wrote:
> > During the flattening of optabs.h, I updated all the config/* files
> > which were affected.   I've been getting spurious failures with
> > config-list.mk where my changes would "disappear" and tracked down
> > why.
> >
> > I was blissfully unaware that the tilepro port's mul-tables.c file is
> > actually generated from gen-mul-tables.cc.
> >
> > This patch fixes the include issue by adding "#include insn-codes.h"
> > to the generated files.  I also added a comment indicating these are
> > generated files, and to make changes in the generator.
> >
> > This allows all the tile* ports to compile properly again.
> >
> > OK for trunk?
>
> Seems this wasn't ever ACKed or applied up to now? I'm still seeing
> compilation errors for the tile targets, see eg.
> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=382169
>
> MfG, JBG

Now checked in, revision 218624

Andrew

Re: [PATCH/AARCH64] v2 Add aligning of functions/loops/jumps

2014-12-11 Thread Marcus Shawcroft

On 23 November 2014 at 00:09, Andrew Pinski  wrote:
> Hi,
>   This is just a rebase of
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01615.html as requested
> by https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01736.html.  Nothing
> has changed in it.
>
> OK?  Built and tested on aarch64-elf with no regressions.
>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
>
> * config/aarch64/aarch64-protos.h (tune_params): Add align field.
> * config/aarch64/aarch64.c (generic_tunings): Specify align.
> (cortexa53_tunings): Likewise.
> (cortexa57_tunings): Likewise.
> (thunderx_tunings): Likewise.
> (aarch64_override_options): Set align_loops, align_jumps,
> align_functions based on the tuning struct.

OK /Marcus
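
A minimal sketch of the idea in the ChangeLog above, assuming that an alignment left unset by the user is represented as a value <= 0 and is then taken from the selected tuning.  TuneParams, AlignOptions and with_tuned_alignment are hypothetical stand-ins, not the names or exact logic used in aarch64.c.

```cpp
// Hypothetical stand-ins for the tuning struct and the alignment options.
struct TuneParams
{
  int align;      // preferred alignment for this core, in bytes
};

struct AlignOptions
{
  int loops;      // models align_loops
  int jumps;      // models align_jumps
  int functions;  // models align_functions
};

// Return the options with every unset alignment (assumed here to be a
// value <= 0) filled in from the tuning struct.  Values the user set
// explicitly are kept as they are.
static AlignOptions with_tuned_alignment (AlignOptions opts,
                                          const TuneParams &tune)
{
  if (opts.loops <= 0)
    opts.loops = tune.align;
  if (opts.jumps <= 0)
    opts.jumps = tune.align;
  if (opts.functions <= 0)
    opts.functions = tune.align;
  return opts;
}
```

Under these assumptions, a core whose tuning asks for 8-byte alignment gets align_loops = 8 unless the command line already chose another value.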

Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

2014-12-11 Thread Kyrill Tkachov


Ping.
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00340.html

Thanks,
Kyrill

On 04/12/14 09:19, Kyrill Tkachov wrote:

On 02/12/14 22:58, Ramana Radhakrishnan wrote:

On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov  wrote:

Hi all,

This is the arm implementation of the macro fusion hook.
It tries to fuse movw+movt operations together. It also tries to take lo_sum
RTXs into account since those generate movt instructions as well.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?



   if (current_tune->fuseable_ops & ARM_FUSE_MOVW_MOVT)
+{
+  /* We are trying to fuse
+ movw imm / movt imm
+ instructions as a group that gets scheduled together.  */
+

A comment here about the insn structure would be useful.

Done. It's similar to the aarch64 adrp+add case. It does make it easier
to read, thanks.

2014-12-04  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
* config/arm/arm.c (arm_macro_fusion_p): New function.
(arm_macro_fusion_pair_p): Likewise.
(TARGET_SCHED_MACRO_FUSION_P): Define.
(TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
(ARM_FUSE_NOTHING): Likewise.
(ARM_FUSE_MOVW_MOVT): Likewise.
(arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune,
arm_cortex_a5_tune): Specify fuseable_ops value.


+  set_dest = SET_DEST (curr_set);
+  if (GET_CODE (set_dest) == ZERO_EXTRACT)
+{
+  if (CONST_INT_P (SET_SRC (curr_set))
+  && CONST_INT_P (SET_SRC (prev_set))
+  && REG_P (XEXP (set_dest, 0))
+  && REG_P (SET_DEST (prev_set))
+  && REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
+return true;
+}
+  else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
+   && REG_P (SET_DEST (curr_set))
+   && REG_P (SET_DEST (prev_set))
+   && GET_CODE (SET_SRC (prev_set)) == HIGH
+   && REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST (prev_set)))
+{
+  return true;
+}

Can we add a fast path exit to be

if (GET_MODE (set_dest) != SImode)
return false;

Done, but if/when we extend the function to handle more fusion cases it
will need to be refactored, since we will want to just bail out of this
MOVW+MOVT case rather than the whole function.


I did think whether we wanted to use reg_overlap_mentioned_p as that
may simplify the logic a bit but that's  overkill here as we still
want to restrict it to the cases above.

Otherwise OK.

Here's the updated patch. I've tested on arm-none-eabi and made sure
that the fusion still happens on the benchmarks I looked at.
Ok?

Thanks,
Kyrill


Ramana





+}
+  return false;
Thanks,
Kyrill

2014-11-11  Kyrylo Tkachov  

  * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
  * config/arm/arm.c (arm_macro_fusion_p): New function.
  (arm_macro_fusion_pair_p): Likewise.
  (TARGET_SCHED_MACRO_FUSION_P): Define.
  (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
  (ARM_FUSE_NOTHING): Likewise.
  (ARM_FUSE_MOVW_MOVT): Likewise.
  (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
  arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
  arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
  arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
  arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune,
  arm_cortex_a5_tune): Specify fuseable_ops value.
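
To illustrate the kind of pair the hook looks for, here is a toy model, assuming a simplified instruction record instead of RTL.  Op, Insn and the three functions are hypothetical; the real check in arm.c inspects SET patterns (ZERO_EXTRACT, HIGH/LO_SUM) as shown in the quoted hunks.

```cpp
#include <cstdint>

// Hypothetical, simplified instruction record; the real hook works on RTL.
enum class Op { MovW, MovT, Other };

struct Insn
{
  Op op;
  int dest_reg;
  uint16_t imm;
};

// A movw followed by a movt that writes the same register is a
// candidate for being kept together by the scheduler.
static bool movw_movt_pair_p (const Insn &prev, const Insn &curr)
{
  return prev.op == Op::MovW
         && curr.op == Op::MovT
         && prev.dest_reg == curr.dest_reg;
}

// movw writes a zero-extended 16-bit immediate to the register.
static uint32_t apply_movw (uint16_t imm)
{
  return imm;
}

// movt replaces the top 16 bits and keeps the bottom 16 bits.
static uint32_t apply_movt (uint32_t reg, uint16_t imm)
{
  return (reg & 0xffffu) | (static_cast<uint32_t> (imm) << 16);
}
```

Together the two instructions build one 32-bit constant, for example movw #0x5678 followed by movt #0x1234 leaves 0x12345678 in the register, which is why scheduling them back to back can pay off on cores that fuse the pair.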

Re: [PATCH][AArch64] Fix usage of +no in error message for aarch64_parse_extension

2014-12-11 Thread Marcus Shawcroft

On 10 December 2014 at 15:30, Kyrill Tkachov  wrote:

> 2014-12-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com
>
> * config/aarch64/aarch64.c (aarch64_parse_extension): Update error
> message to say +no only when removing extension.

OK /Marcus

Re: [PATCH][AARCH64][4.9]Backport "Use selected cpu's tuning when no tuning parameter is specified."

2014-12-11 Thread Marcus Shawcroft

On 10 December 2014 at 13:58, Renlin Li  wrote:

> This is a backport patch of
> https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00287.html
>
> aarch64-none-elf has been built and tested on the model, no issue.
> Okay for branch 4.9?
>
> Regards,
> Renlin Li
>
>
> gcc/ChangeLog:
>
> 2014-12-10 Renlin Li 
>
> * config/aarch64/aarch64.c (aarch64_parse_cpu): Remove selected_tune
> assignment as this will be done later.
> (aarch64_override_options): Use selected_cpu's tuning.

OK /Marcus

Re: [PATCH][AARCH64]Use AARCH64_FL_FPSIMD flags for all cores in aarch64-cores.def

2014-12-11 Thread Marcus Shawcroft

On 10 December 2014 at 16:34, Renlin Li  wrote:

> 2014-12-10  Renlin Li  
>
> * config/aarch64/aarch64-cores.def: Change all AARCH64_FL_FPSIMD to
> AARCH64_FL_FOR_ARCH8.
> * config/aarch64/aarch64.c (all_cores): Use FLAGS from aarch64-cores.def
> file only.

OK /Marcus

Re: [PATCH AARCH64]Make ldp/stp case less vulnerable

2014-12-11 Thread Marcus Shawcroft

On 11 December 2014 at 10:06, Bin Cheng  wrote:

> gcc/testsuite/ChangeLog
> 2014-12-11  Bin Cheng  
>
> * gcc.target/aarch64/ldp_stp_2.c: Make test less vulnerable.
> * gcc.target/aarch64/ldp_stp_3.c: Ditto.

OK /Marcus

[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports

2014-12-11 Thread Yvan Roux

Hi all

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 218412 as r218423.  We have also backported this set of revisions:

* r213382 as r218352 : [AArch64] arm_neon.h - add vpaddd_f64,
vpaddd_s64, vpaddd_u64 intrinsics
* r214008 as r218354 : [AArch64] Move some code around in
aarch64_expand_mov_immediate
* r214948 as r218355 : [PATCH AArch64 1/2] Improve codegen of vector
compares inc. tst instruction
* r214949 as r218355 : [PATCH AArch64 2/2] Remove vector compare/tst __builtins
* r214950 as r218356 : [PATCH AArch64 1/2] Add execution tests of
vget_low and vget_high
* r214952 as r218356 : [PATCH AArch64 2/2] Replace temporary inline
assembler for vget_high
* r215013 as r218357 : Remove no-longer-needed fp-bit target macros.
* r215046 as r218358 : [AArch64] PR 61749: Do not ICE in lane
intrinsics when passed non-constant lane number
* r215047 as r218359 : [AArch32] Disable xordi3-opt.c/iordi3-opt.c on
thumb1 target
* r215071 as r218377 : [AArch64 Testsuite]Fix scan-assembler test
false alarm on aarch64-linux-gnu
* r215072 as r218360 : [AArch64 Testsuite] Add test of vld[234]q? intrinsic
* r215077 as r218361 : [AArch64 Testsuite] Extend test of vld1+vst1
intrinsics to cover more variants
* r215078 as r218362 : [AArch64 Testsuite] Add a test of vldN_dup intrinsics
* r215126 as r218363 : [AArch64 Testsuite] Add a test of the vldN_lane intrinsic
* r215129 as r218364 : [AArch64 Testsuite] Add a test of the
vst[234](q?) intrinics
* r215177 as r218365 : [AArch64 Testsuite] Add execution test of
vset(q?)_lane intrinsics.
* r215206 as r218351 : [AArch64] Add cost handling of CALLER_SAVE_REGS
and POINTER_REGS
* r215207 as r218351 : [AArch64] Fix cost for Q register moves
* r215208 as r218351 : [AArch64] Add regmove_costs for Cortex-A57 and A53
* r215473 as r218366 : [testsuite] whole_vector_shift
* r215475 as r218367 : [testsuite] vect-reduc-or
* r215540 as r218368 : PR rtl-optimization/63210 IRA
* r215707 as r218370 : Fix IRA ICE tmpdir-gcc-.dg-struct-layout-1/t028
* r215711 as r218371 : Accept cortex-m7/fpv5-sp-16/fpv5-d16
* r215842 as r218370 : Fix IRA ICE tmpdir-gcc-.dg-struct-layout-1/t028 -addon
* r215865 as r218373 : Add aarch64 to list of targets that support gold
* r216253 as r218374 : Remove unused variable and marco
* r216336 as r218375 : Target Legitimze Address
* r216444 as r218350 : [testsuite] Fix race in libstdc++ testsuite
* r216517 as r218378 : [testsuite] update testcases for GNU11
* r216524 as r218379 : Add -mthunderx option
* r216543 as r218380 : [testsuite] fix gcc-dg-prune glitch when
filtering "relocation truncation" error
* r216544 as r218384 : [testsuite] Update testcases for GNU11
* r216630 as r218385 : PR 63173 fix vldX_dup
* r216638 as r218386 : [testsuite] fix wrap_compile_flags
* r216765 as r218387 : PR63442 libgcc_cmp_return_mode not always
return word_mode
* r216996 as r218390 : [Patch 1/7] Hookize *_BY_PIECES_P
* r216998 as r218390 : [Patch 2/7 s390] Deprecate *_BY_PIECES_P, move
to hookized version
* r216999 as r218390 : [Patch 3/7 arc] Deprecate *_BY_PIECES_P, move
to hookized version
* r217001 as r218390 : [Patch 4/7 sh] Deprecate *_BY_PIECES_P, move to
hookized version
* r217002 as r218390 : [Patch 5/7 mips] Deprecate *_BY_PIECES_P, move
to hookized version
* r217003 as r218390 : [Patch 6/7 AArch64] Deprecate *_BY_PIECES_P,
move to hookized version
* r217004 as r218390 : [Patch 7/7] Remove *_BY_PIECES_P
* r217014 as r218391 : Fix CLZ_DEFINED_VALUE_AT_ZERO for vector modes
* r217026 as r218393 : ifcvt: Allow CC mode if HAVE_cbranchcc4
* r217076 as r218394 : Fix predicate and constraint mismatch in
logical atomic operations
* r217079 as r218398 : Migrate to new reduc_plus_scal_optab
* r217080 as r218398 : Migrate to new reduc_[us](min|max)_scal_optab
* r217742 as r218390 : PR target/63937 fix 216996
* r217971 as r218383 : [PATCH x86] Increase
PARAM_MAX_COMPLETELY_PEELED_INSNS when branch is costly
* r210735 as r218351 : Change CORE_REGS in GENERAL_REGS

This will be part of our 2014.12 4.9 release.

Thanks
Yvan

Re: [PATCH] Fix PR42108

2014-12-11 Thread Steve Kargl

On Thu, Dec 11, 2014 at 11:05:13AM +0100, Richard Biener wrote:
> 
> The following patch fixes the performance regression in PR42108
> by allowing PRE and LIM to see the division (to - from) / step
> in translating do loops executed unconditionally.  This makes
> them not care for the fact that step might be zero and thus
> the division might trap.
> 

step cannot be zero in standard conforming code.  From F95 8.1.4.4,
"Execution of a DO construct":

When the DO statement is executed, the DO construct becomes active.
If loop-control is

  [ , ] do-variable = scalar-int-expr1, scalar-int-expr2 [, scalar-int-expr3]

the following steps are performed in sequence:

  (1) The initial parameter m1, the terminal parameter m2, and the
  incrementation parameter m3 are of type integer with the same kind
  type parameter as the do-variable.  Their values are established by
  evaluating scalar-int-expr1, scalar-int-expr2, and scalar-int-expr3,
  respectively, including, if necessary, conversion to the kind type
  parameter of the do-variable according to the rules for numeric
  conversion (Table 7.10).  If scalar-int-expr3 does not appear, m3
  has the value 1. The value m3 shall not be zero.

"The value m3 shall not be zero" is not an enumerated constraint,
so the compiler does not need to catch and report m3 being zero.
The prohibition is on the programmer.  So, if the division traps,
it is a bug in the program.
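
To make the rule concrete, here is a minimal sketch in C++ (not gfortran
code) of the iteration count that Fortran 95 defines for a DO loop,
MAX(INT((m2 - m1 + m3) / m3), 0).  The function name do_trip_count is
made up for illustration, and the assert merely stands in for the
standard's requirement that the programmer never passes a zero step.

```cpp
#include <algorithm>
#include <cassert>

// Iteration count of "DO i = m1, m2, m3" following the Fortran 95 formula
// MAX(INT((m2 - m1 + m3) / m3), 0).  C++ integer division truncates toward
// zero, which matches INT() here.  A zero step is the caller's bug.
long do_trip_count(long m1, long m2, long m3) {
  assert(m3 != 0 && "a zero step is an error in the calling program");
  return std::max((m2 - m1 + m3) / m3, 0L);
}
```

For example, do_trip_count(1, 10, 3) covers i = 1, 4, 7, 10, and a loop
whose bounds run against the sign of the step executes zero times.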

> This makes the runtime of the testcase improve from 10.7s to
> 8s (same as gfortran 4.3).
> 
> The caveat is that iff the loop is not executed (to < from
> for positive step for example) then there will be an additional
> executed division computing the unused countm1.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok
> for trunk?
> 

OK.

-- 
Steve

Ping: Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal

2014-12-11 Thread Alan Lawrence

So I'm afraid I'm not going to get involved in a discussion about 
CANNOT_CHANGE_MODE_CLASS on RS6000, and what you might want to do there - sorry, 
but I don't think I can really contribute anything there. However, I *am* trying 
to migrate all platforms off the old reduc_xxx optabs to the new version 
producing a scalar.


Hence, can I ping the attached patch (which is just a simple combination of the 
previously-posted patch + snippet)? No regressions on gcc112.fsffrance.org.


This works in exactly the same way as the old code path, with a second insn to 
pull the scalar result out of the reduction, just as the expander would have 
done (or the bitfieldref before that), and avoiding the v2df combine pattern 
(again, as previously).
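
As a rough illustration of the two-step shape described above (a vector
reduction followed by a separate extract), here is a toy C++ model.  It
is not RS6000 code: the V2DF alias, the function names, and the choice
of lane 0 are all assumptions made for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V2DF = std::array<double, 2>;  // toy stand-in for a two-lane vector

// First step: add the vector to a lane-swapped copy of itself, so every
// lane ends up holding the full sum.
V2DF reduce_plus_vector(const V2DF &v) {
  const V2DF swapped{v[1], v[0]};
  return {v[0] + swapped[0], v[1] + swapped[1]};
}

// Second step: pull one lane out as the scalar result.
double extract_lane(const V2DF &v, std::size_t lane) {
  assert(lane < v.size());
  return v[lane];
}

// The scalar-producing reduction is just the two steps back to back.
double reduc_plus_scal_model(const V2DF &v) {
  return extract_lane(reduce_plus_vector(v), 0);
}
```

Because both lanes hold the same sum after the first step, the lane
picked by the extract does not change the value, only which insn
pattern the backend has to match.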


gcc/ChangeLog:

* config/rs6000/altivec.md (reduc_splus_): Rename to...
(reduc_plus_scal_): ...this, add rs6000_expand_vector_extract.
(reduc_uplus_v16qi): Remove.

* config/rs6000/vector.md (VEC_reduc_name): change "splus" to "plus"
(reduc__v2df): Remove.
(reduc__scal_v2df): New.
(reduc__v4sf): Rename to...
(reduc__scal_v4sf): ...this, wrap VEC_reduc in a
vec_select of element 3, add scratch register.





Have run check-gcc on gcc110.fsffrance.org (powerpc64-unknown-linux-gnu) using 
this snippet on top of original patch; no regressions.


Alan Lawrence wrote:

So I'm no expert on RS6000 here, but following on from Segher's observation 
about the change in pattern...so the difference in 'expand' is exactly that, a 
vsx_reduc_splus_v2df followed by a vec_extract to DF, becomes a 
vsx_reduc_splus_v2df_scalar - as I expected the combiner to produce by 
combining the two previous insns.


However, inspecting the logs from -fdump-rtl-combine-all, *without* my 
patch, when the combiner tries to put those two together, I see:


Trying 30 -> 31:
Failed to match this instruction:
(set (reg:DF 179 [ stmp_s_5.7D.2196 ])
 (vec_select:DF (plus:V2DF (vec_select:V2DF (reg:V2DF 173 [ vect_s_5.6D.2195 ])
 (parallel [
 (const_int 1 [0x1])
 (const_int 0 [0])
 ]))
 (reg:V2DF 173 [ vect_s_5.6D.2195 ]))
 (parallel [
 (const_int 1 [0x1])
 ])))

That is, it looks like combine_simplify_rtx has transformed the (vec_concat 
(vec_select ... 1) (vec_select ... 0)) from the vsx_reduc_plus_v2df insn, into 
a single vec_select, which does not match the vsx_reduc_plus_v2df_scalar insn.


So despite the comment (in vsx.md):

;; Combiner patterns with the vector reduction patterns that knows we can get
;; to the top element of the V2DF array without doing an extract.

It looks like the code generation prior to my patch, considered better, was 
because the combiner didn't actually use the pattern?


In that case whilst you may want to dig into register allocation, 
cannot_change_mode_class, etc., for other reasons, I think the best fix for migrating to 
reduc_plus_scal... is simply to avoid using the "Combiner" patterns and just 
emit two insns, the old pattern followed by a vec_extract. The attached snippet does this 
(I won't call it a patch yet, and it applies on top of the previous patch - I went the 
route of calling the two gen functions rather than copying their RTL sequences, but could 
do the latter if that were preferable???), and restores code generation to the original 
form on your example above; it bootstraps OK but I'm still running check-gcc on the 
Compile Farm...


However, again on your example above, I note that if I *remove* the 
reduc_plus_scal_v2df pattern altogether, I get:


.sum:
 li 10,512# 52   *movdi_internal64/4 [length = 4]
 ld 9,.LC2@toc(2) # 20   *movdi_internal64/2 [length = 4]
 xxlxor 0,0,0 # 17   *vsx_movv2df/12 [length = 4]
 mtctr 10 # 48   *movdi_internal64/11[length = 4]
 .align 4
.L2:
 lxvd2x 12,0,9# 23   *vsx_movv2df/2  [length = 4]
 addi 9,9,16  # 25   *adddi3_internal1/2 [length = 4]
 xvadddp 0,0,12   # 24   *vsx_addv2df3/1 [length = 4]
 bdnz .L2 # 47   *ctrdi_internal1/1  [length = 4]
 xxsldwi 12,0,0,2 # 30   vsx_xxsldwi_v2df [length = 4]
 xvadddp 1,0,12   # 31   *vsx_addv2df3/1 [length = 4]
 nop  # 37   *vsx_extract_v2df_internal2/1   [length = 4]
 blr  # 55   return  [length = 4]

this is presumably using gcc's scalar reduction code, but (to my untrained 
eye on powerpc!) it looks even better than the first form above (the same in 
the loop, and in the reduction, an xxpermdi is replaced by a nop !)...


--Alan


Segher Boessenkool wrote:

On Mon, Nov 10, 2014 at 05:36:24PM -0500, Michael Meissner wrote:

However, the double patt

Remove unused arguments of bulitin_unreachable

2014-12-11 Thread Jan Hubicka

Hi,
in firefox .optimized dumps one can see few places where __builtin_unreachable
is called (as a result of devirtualization code proving the code path to be
undefined).  There is usually some argument setup for the parameters of
__builtin_unreachable that are dead.  This patch makes it somewhat better 
so now we get:
  :  
  # prephitmp_222 = PHI <_52(27), pretmp_245(29)>   
  _57 = prephitmp_222 + 2;  
  pool_40(D)->ptr = _57;
  __builtin_unreachable (); 

Why does DSE not eliminate the stores prior to a noreturn const function?

Bootstrapped/regtested x86_64-linux, OK?

Honza
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Remove dead parameters
of BUILT_IN_UNREACHABLE

Index: tree-ssa-dce.c
===
--- tree-ssa-dce.c  (revision 218610)
+++ tree-ssa-dce.c  (working copy)
@@ -250,6 +250,15 @@ mark_stmt_if_obviously_necessary (gimple
case BUILT_IN_ALLOCA:
case BUILT_IN_ALLOCA_WITH_ALIGN:
  return;
+   case BUILT_IN_UNREACHABLE:
+ /* All parameters of BUILT_IN_UNREACHABLE are dead.  Remove them
+from the stmt, so we can remove their definitions.  */
+ if (gimple_call_num_args (stmt))
+   {
+ gimple_set_num_ops (stmt, 3);
+ update_stmt (stmt);
+   }
+ break;
 
default:;
}

Re: Remove unused arguments of bulitin_unreachable

2014-12-11 Thread Jakub Jelinek

On Thu, Dec 11, 2014 at 06:06:55PM +0100, Jan Hubicka wrote:
> Hi,
> in firefox .optimized dumps one can see few places where __builtin_unreachable
> is called (as a result of devirtualization code proving the code path to be
> undefined).  There is usually some argument setup for the parameters of
> __builtin_unreachable that are dead.  This patch makes it somewhat better 
> so now we get:
>   :
>   # prephitmp_222 = PHI <_52(27), pretmp_245(29)>
>   _57 = prephitmp_222 + 2;
>   pool_40(D)->ptr = _57;
>   __builtin_unreachable ();
> 
> Why DSE does not eliminate the stores prior noreturn const function?
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
>   * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Remove dead parameters
>   of BUILT_IN_UNREACHABLE

Shouldn't this be done when you actually change the call to
__builtin_unreachable ()?  I mean, __builtin_unreachable () has no
arguments, so leaving any arguments there is broken IL, even if you clean it
up during the next DCE.
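
For readers less familiar with the builtin, a small sketch of how
__builtin_unreachable () looks at the source level with GCC or Clang.
The Shape enum and corner_count are invented for the example; the only
point is that the builtin is called with an empty argument list.

```cpp
#include <cassert>

enum class Shape { Circle, Square };

// Every enumerator is handled in the switch, so control never falls out of
// it.  The builtin takes no arguments; it only tells the compiler that this
// point cannot be reached.
int corner_count(Shape s) {
  switch (s) {
    case Shape::Circle: return 0;
    case Shape::Square: return 4;
  }
  __builtin_unreachable();
}
```

Reaching the builtin at run time is undefined behaviour, which is why
any argument setup left in front of it is dead by construction.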

> --- tree-ssa-dce.c(revision 218610)
> +++ tree-ssa-dce.c(working copy)
> @@ -250,6 +250,15 @@ mark_stmt_if_obviously_necessary (gimple
>   case BUILT_IN_ALLOCA:
>   case BUILT_IN_ALLOCA_WITH_ALIGN:
> return;
> + case BUILT_IN_UNREACHABLE:
> +   /* All parameters of BUILT_IN_UNREACHABLE are dead.  Remove them
> +  from the stmt, so we can remove their definitions.  */
> +   if (gimple_call_num_args (stmt))
> + {
> +   gimple_set_num_ops (stmt, 3);
> +   update_stmt (stmt);
> + }
> +   break;
>  
>   default:;
>   }

Jakub

Re: [Patch, Fortran] Convert gfc_notify_std to common diagnostics

2014-12-11 Thread Tobias Burnus

PING  - https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00731.html

Tobias Burnus wrote:
> This patch requires that the gfc_error patch has been applied,
> https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00607.html

That patch has now been committed - and my patch still applies, building
and regtesting still succeeds without new failures popping up.

> The patch does some missing '%s' to %qs and '...' to %<...%> for gfc_error,
> does likewise for gfc_notify_std and converts the latter into calls to
> gfc_error and gfc_warning

 * * *

Side note: The biggest remaining issue with regards to using the common
diagnostic is supporting two locations. For the current plans, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44054#c22

Other remaining items:

* Things supported by Fortran and not by the common diagnostic, e.g.
  "", but that requires some careful review
  what's missing and whether it matters.

* Option handling: PR54687.
  One could also think of adding "error (OPT_..." support, printing the
  option in brackets (like: "[-std=f95]"). On the other hand, as that's
  all what the feature would do (contrary to -W... which also enters in
  -Werror=...), one could simply leave that in the error calling part.
  Currently, the Fortran code doesn't print this; printing it with
  gfc_notify_std for -std= would be trivial. See also PR31601.

* libcpp-related features such as macro expansion tracking. Requires
  libcpp whitespace support, cf. PR64273 and links therein.
  And PR45179 (support unicode in 4_"..." strings) also depends on the
  libcpp work.

Tobias

RFC: handle cached local static DIEs

2014-12-11 Thread Aldy Hernandez


Hi Jason.

After my last set of dwarf changes for locals, I found some target 
library building failures which I am now fixing.


The problem at hand is that, by design, the caching code in 
gen_variable_die() refuses to use a previously cached DIE if the current 
context and the cached context are different:


   else if (old_die->die_parent != context_die)
{
  /* If the contexts differ, it means we're not talking about
 the same thing.  Clear things so we can get a new DIE.
 This can happen when creating an inlined instance, in
 which case we need to create a new DIE that will get
 annotated with DW_AT_abstract_origin.  */
  old_die = NULL;
  gcc_assert (!DECL_ABSTRACT_P (decl));
}

This is causing problems with local statics which are handled at 
dwarf2out_late_global_decl, and which originally have a context of the 
compilation unit (by virtue of the dwarf2out_decl call).  This context 
then gets changed here:


  /* For local statics lookup proper context die.  */
  if (TREE_STATIC (decl)
  && DECL_CONTEXT (decl)
  && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
context_die = lookup_decl_die (DECL_CONTEXT (decl));

This new context may be correct for front/middle-end purposes, but is 
not the DIE context I am expecting in gen_variable_die.  For example, in 
the following example, the DECL_CONTEXT for the static is funky's 
DW_TAG_subprogram, whereas the caching code is expecting the 
DW_TAG_lexical_block:


void funky()
{
  {
static const char *nested_static_const = "testing123";
  }
}

My proposed way of handling it (attached) is by tightening the check in 
gen_variable_die(), and special casing this scenario (assuming, there is 
no other way to get a differing context).  This works, and fixes all the 
failures, without introducing any regressions.
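
The decision being proposed can be sketched with a toy model.  This is
not dwarf2out.c code: Die and pick_cached_die are hypothetical names,
and the last branch (reusing the cached DIE in the non-inline case) is
an assumption read off the ChangeLog entry rather than the patch text.

```cpp
#include <cassert>

// Toy stand-in for a DWARF DIE; only the parent link matters here.
struct Die {
  const Die *parent;
};

// Hypothetical helper: decide whether a cached DIE may be reused when the
// requested context may differ from the context it was cached under.
const Die *pick_cached_die(const Die *old_die, const Die *context_die,
                           bool is_inlined_instance) {
  if (old_die == nullptr || old_die->parent == context_die)
    return old_die;   // nothing cached, or the contexts agree
  if (is_inlined_instance)
    return nullptr;   // an inlined instance needs a fresh DIE
  return old_die;     // assumed: a local static keeps its cached DIE
}
```

With a cached DIE parented on a lexical block and a requested context
of the enclosing subprogram, the sketch keeps the cached DIE unless an
inlined instance is being created.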


Another approach would be to use whatever context is already cached with 
just "context_die = lookup_decl_die (decl)", but that feels like cheating.


Are you OK with the attached approach, or do you have something else in 
mind?


Thanks.
Aldy
commit 515a20666d0ea73f2380bae6d9b8ec1d5bb2f001
Author: Aldy Hernandez 
Date:   Thu Dec 11 09:26:25 2014 -0800

* dwarf2out.c (gen_subprogram_die): Handle as cached die if
dumped_early bit is set.
(dwarf2out_decl): Abstract local static check...
(local_function_static): ...into here.
(gen_variable_die): Handle different contexts in a cached die
gracefully for the non inline case.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index e4a7973..5d55d1f 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18511,7 +18511,9 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 apply; we just use the old DIE.  */
   expanded_location s = expand_location (DECL_SOURCE_LOCATION (decl));
   struct dwarf_file_data * file_index = lookup_filename (s.file);
-  if (((is_cu_die (old_die->die_parent) || context_die == NULL)
+  if (((is_cu_die (old_die->die_parent)
+   || context_die == NULL
+   || dumped_early)
   && (DECL_ARTIFICIAL (decl)
   || (get_AT_file (old_die, DW_AT_decl_file) == file_index
   && (get_AT_unsigned (old_die, DW_AT_decl_line)
@@ -19132,6 +19134,17 @@ decl_will_get_specification_p (dw_die_ref old_die, tree decl, bool declaration)
  && get_AT_flag (old_die, DW_AT_declaration) == 1);
 }
 
+/* Return true if DECL is a local static.  */
+
+static inline bool
+local_function_static (tree decl)
+{
+  gcc_assert (TREE_CODE (decl) == VAR_DECL);
+  return TREE_STATIC (decl)
+&& DECL_CONTEXT (decl)
+&& TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL;
+}
+
 /* Generate a DIE to represent a declared data object.
Either DECL or ORIGIN must be non-null.  */
 
@@ -19283,13 +19296,35 @@ gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
}
   else if (old_die->die_parent != context_die)
{
- /* If the contexts differ, it means we're not talking about
-the same thing.  Clear things so we can get a new DIE.
-This can happen when creating an inlined instance, in
-which case we need to create a new DIE that will get
-annotated with DW_AT_abstract_origin.  */
- old_die = NULL;
- gcc_assert (!DECL_ABSTRACT_P (decl));
+ /* If the contexts differ, it means we _MAY_ not be talking
+about the same thing.  */
+ if (origin)
+   {
+ /* If we will be creating an inlined instance, we need a
+new DIE that will get annotated with
+DW_AT_abstract_origin.  Clear things so we can get a
+new DIE.  */
+ gcc_assert (!DECL_ABSTRACT_P (decl));
+ old_die = NULL;
+   }
+ else
+

[patch] Fix std::notify_all_at_thread_exit test for older glibc

2014-12-11 Thread Jonathan Wakely


I'm seeing this test time out on glibc 2.13, which I think is because
it doesn't provide __cxa_thread_atexit_impl and so the thread_local
destructor and the notify_all() happen in an unspecified order.

Tested x86_64-linux, committed to trunk.
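
For context, a minimal sketch of how std::notify_all_at_thread_exit is
typically used.  It is independent of the testsuite file; the names
worker, run_once and ready are made up for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>

std::mutex mx;
std::condition_variable cv;
bool ready = false;

void worker() {
  std::unique_lock<std::mutex> lock{mx};
  ready = true;
  // Ownership of the lock passes to the runtime: it is released, and
  // cv.notify_all() is called, only once this thread has exited and its
  // thread_local objects have been destroyed.
  std::notify_all_at_thread_exit(cv, std::move(lock));
}

bool run_once() {
  std::thread t{worker};
  bool seen = false;
  {
    std::unique_lock<std::mutex> lock{mx};
    cv.wait(lock, [] { return ready; });
    seen = ready;
  }
  t.join();
  return seen;
}
```

The waiter can only observe ready == true after the worker has released
the mutex, which happens at thread exit, so the ordering of thread_local
destructors relative to the notification is what the test depends on.
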
commit 299704c621dd7afaee7c5fb2a354b40ef41c2eba
Author: Jonathan Wakely 
Date:   Thu Dec 11 17:12:17 2014 +

	* testsuite/30_threads/condition_variable/members/3.cc: Only use
	a thread_local when __cxa_thread_atexit_impl is available.

diff --git a/libstdc++-v3/testsuite/30_threads/condition_variable/members/3.cc b/libstdc++-v3/testsuite/30_threads/condition_variable/members/3.cc
index 0da545d..1788bcf 100644
--- a/libstdc++-v3/testsuite/30_threads/condition_variable/members/3.cc
+++ b/libstdc++-v3/testsuite/30_threads/condition_variable/members/3.cc
@@ -41,7 +41,12 @@ void func()
 {
   std::unique_lock<std::mutex> lock{mx};
   std::notify_all_at_thread_exit(cv, std::move(lock));
+#if _GLIBCXX_HAVE___CXA_THREAD_ATEXIT_IMPL
+  // Correct order of thread_local destruction needs __cxa_thread_atexit_impl
   static thread_local Inc inc;
+#else
+  Inc inc;
+#endif
 }
 
 int main()

Re: [PATCH] libgccjit cleanups

2014-12-11 Thread David Malcolm

On Wed, 2014-12-10 at 23:32 -0500, Ulrich Drepper wrote:
> On Mon, Dec 8, 2014 at 11:36 AM, David Malcolm  wrote:
> > Thanks.  Overall this is good, a few nitpicks inline below:
> 
> I've made the changes and checked in the patch.

...as r218617.  Thanks.

The jit subdirectory has its own ChangeLog file.  I realize now that
your ChangeLog entries went in gcc/ChangeLog; they should have been in
gcc/jit/ChangeLog.

Sorry for not spotting that in review.  I've fixed it in r218637.

Does your editor do some kind of auto-reindent?  FWIW, I see various
whitespace-only changes in that commit.  I assume we try to avoid such
changes 

I've added documentation of libgccjit++.h to the .rst files as of
yesterday.  So I've added documentation of the new function as r218636:

gcc/jit/ChangeLog:
* docs/cp/topics/contexts.rst (gccjit::context::set_str_option):
Document new function.
* docs/_build/texinfo/libgccjit.texi: Regenerate.

---
 gcc/jit/docs/cp/topics/contexts.rst | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/gcc/jit/docs/cp/topics/contexts.rst b/gcc/jit/docs/cp/topics/contexts.rst
index 72815fb..4becd51 100644
--- a/gcc/jit/docs/cp/topics/contexts.rst
+++ b/gcc/jit/docs/cp/topics/contexts.rst
@@ -148,9 +148,18 @@ Debugging
 Options
 -------

-..
-  FIXME: gccjit::context::set_str_option doesn't seem to exist yet in the
-  C++ API
+String Options
+**************
+
+.. function:: void \
+  gccjit::context::set_str_option (enum gcc_jit_str_option, \
+   const char *value)
+
+   Set a string option of the context.
+
+   This is a thin wrapper around the C API
+   :c:func:`gcc_jit_context_set_str_option`; the options have the same
+   meaning.

 Boolean options
 ***************
-- 
1.8.5.3

Re: [Committed/AARCH64] Fix gcc.target/aarch64/test_frame_*.c testcases after ccmp patches

2014-12-11 Thread Tejas Belagod


On 22/11/14 23:41, Andrew Pinski wrote:

Hi,
   After the conditional compare patches, some of the
gcc.target/aarch64/test_frame_*.c testcases start to fail.  This was
due to no longer duplicating simple_return and causing the epilogue to
be duplicated.

This changes the testcases to expect the non duplicated epilogue.

Committed as obvious after a test of aarch64-elf.

Thanks,
Andrew Pinski

ChangeLog:
* gcc.target/aarch64/test_frame_1.c: Expect only two loads of x30 (in
the epilogue).
* gcc.target/aarch64/test_frame_6.c: Likewise.
* gcc.target/aarch64/test_frame_2.c: Expect only one pair load of x30
and x19 (in the epilogue).
* gcc.target/aarch64/test_frame_4.c: Likewise.
* gcc.target/aarch64/test_frame_7.c: Likewise.



Hi Andrew,

I'm still seeing the original number of ldr x30 and ldp x19, x30 insns 
for these tests. What am I missing?


FAIL: gcc.target/aarch64/test_frame_1.c scan-assembler-times ldr\tx30, \\[sp\\], [0-9]+ 2
FAIL: gcc.target/aarch64/test_frame_2.c scan-assembler-times ldp\tx19, x30, \\[sp\\], [0-9]+ 1
FAIL: gcc.target/aarch64/test_frame_4.c scan-assembler-times ldp\tx19, x30, \\[sp\\], [0-9]+ 1
FAIL: gcc.target/aarch64/test_frame_6.c scan-assembler-times ldr\tx30, \\[sp\\], [0-9]+ 2
FAIL: gcc.target/aarch64/test_frame_7.c scan-assembler-times ldp\tx19, x30, \\[sp\\], [0-9]+ 1


Thanks,
Tejas,

Re: [Committed/AARCH64] Fix gcc.target/aarch64/test_frame_*.c testcases after ccmp patches

2014-12-11 Thread pinskia





> On Dec 11, 2014, at 10:06 AM, Tejas Belagod  wrote:
> 
>> On 22/11/14 23:41, Andrew Pinski wrote:
>> Hi,
>>   After the conditional compare patches, the some of the
>> gcc.target/aarch64/test_frame_*.c testcases start to fail.  This was
>> due to no longer duplicating simple_return and causing the epilogue to
>> be duplicated.
>> 
>> This changes the testcases to expect the non duplicated epilogue.
>> 
>> Committed as obvious after a test of aarch64-elf.
>> 
>> Thanks,
>> Andrew Pinski
>> 
>> ChangeLog:
>> * gcc.target/aarch64/test_frame_1.c: Expect only two loads of x30 (in
>> the epilogue).
>> * gcc.target/aarch64/test_frame_6.c: Likewise.
>> * gcc.target/aarch64/test_frame_2.c: Expect only one pair load of x30
>> and x19 (in the epilogue).
>> * gcc.target/aarch64/test_frame_4.c: Likewise.
>> * gcc.target/aarch64/test_frame_7.c: Likewise.
> 
> Hi Andrew,
> 
> I'm still seeing the original number of ldr x30 and ldp x19, x30 insns for 
> these tests. What am I missing?

The ccmp patch had to be reverted. But this patch was forgotten when it was. 
Just revert the testcase patch. 


Thanks,
Andrew
> 
> FAIL: gcc.target/aarch64/test_frame_1.c scan-assembler-times ldr\tx30, 
> \\[sp\\], [0-9]+ 2
> FAIL: gcc.target/aarch64/test_frame_2.c scan-assembler-times ldp\tx19, x30, 
> \\[sp\\], [0-9]+ 1
> FAIL: gcc.target/aarch64/test_frame_4.c scan-assembler-times ldp\tx19, x30, 
> \\[sp\\], [0-9]+ 1
> FAIL: gcc.target/aarch64/test_frame_6.c scan-assembler-times ldr\tx30, 
> \\[sp\\], [0-9]+ 2
> FAIL: gcc.target/aarch64/test_frame_7.c scan-assembler-times ldp\tx19, x30, 
> \\[sp\\], [0-9]+ 1
> 
> Thanks,
> Tejas,

Re: Remove unused arguments of bulitin_unreachable

2014-12-11 Thread Jan Hubicka

> On Thu, Dec 11, 2014 at 06:06:55PM +0100, Jan Hubicka wrote:
> > Hi,
> > in firefox .optimized dumps one can see few places where 
> > __builtin_unreachable
> > is called (as a result of devirtualization code proving the code path to be
> > undefined).  There is usually some argument setup for the parameters of
> > __builtin_unreachable that are dead.  This patch makes it somewhat better 
> > so now we get:
> >   :
> >   # prephitmp_222 = PHI <_52(27), pretmp_245(29)>
> >   _57 = prephitmp_222 + 2;
> >   pool_40(D)->ptr = _57;
> >   __builtin_unreachable ();
> > 
> > Why DSE does not eliminate the stores prior noreturn const function?
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> > 
> > Honza
> > * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Remove dead 
> > parameters
> > of BUILT_IN_UNREACHABLE
> 
> Shouldn't this be done when you actually change the call to
> __builtin_unreachable ()?  I mean, __builtin_unreachable () has no
> arguments, so leaving any arguments there is broken IL, even if you clean it
> up during the next DCE.

Hmm, I thought there was some reason not to do so because of in-place folding
and memory-SSA.
I can try to update all the places where we can put builtin_unreachable into
the IL.
(I wonder if that also includes standard constant propagation.)

Honza
> 
> > --- tree-ssa-dce.c  (revision 218610)
> > +++ tree-ssa-dce.c  (working copy)
> > @@ -250,6 +250,15 @@ mark_stmt_if_obviously_necessary (gimple
> > case BUILT_IN_ALLOCA:
> > case BUILT_IN_ALLOCA_WITH_ALIGN:
> >   return;
> > +   case BUILT_IN_UNREACHABLE:
> > + /* All parameters of BUILT_IN_UNREACHABLE are dead.  Remove them
> > +from the stmt, so we can remove their definitions.  */
> > + if (gimple_call_num_args (stmt))
> > +   {
> > + gimple_set_num_ops (stmt, 3);
> > + update_stmt (stmt);
> > +   }
> > + break;
> >  
> > default:;
> > }
> 
>   Jakub

patches for libstdc++ in #64271 (bootstrap on NetBSD)

2014-12-11 Thread Kai-Uwe Eckhardt

Here are the three patches as requested for #64271.

--- libstdc++-v3/config/os/bsd/netbsd/ctype_inline.h.orig   2014-12-10 22:19:05.0 +0100
+++ libstdc++-v3/config/os/bsd/netbsd/ctype_inline.h    2014-12-10 22:20:46.0 +0100
@@ -48,7 +48,7 @@
   is(const char* __low, const char* __high, mask* __vec) const
   {
 while (__low < __high)
-  *__vec++ = _M_table[*__low++];
+  *__vec++ = _M_table[(unsigned char)*__low++];
 return __high;
   }
 


--- libstdc++-v3/config/os/bsd/netbsd/ctype_configure_char.cc.orig  2014-12-10 22:19:26.0 +0100
+++ libstdc++-v3/config/os/bsd/netbsd/ctype_configure_char.cc   2014-12-10 22:21:15.0 +0100
@@ -38,11 +38,17 @@
 
 // Information as gleaned from /usr/include/ctype.h
 
+#ifndef _CTYPE_BL
   extern "C" const u_int8_t _C_ctype_[];
+#endif
 
   const ctype_base::mask*
   ctype<char>::classic_table() throw()
-  { return _C_ctype_ + 1; }
+#ifdef _CTYPE_BL
+  { return _C_ctype_tab_ + 1; }
+#else
+   { return _C_ctype_ + 1; }
+#endif
 
   ctype<char>::ctype(__c_locale, const mask* __table, bool __del, 
 size_t __refs) 
@@ -69,14 +75,14 @@
 
   char
   ctype<char>::do_toupper(char __c) const
-  { return ::toupper((int) __c); }
+  { return ::toupper((int)(unsigned char) __c); }
 
   const char*
   ctype<char>::do_toupper(char* __low, const char* __high) const
   {
 while (__low < __high)
   {
-   *__low = ::toupper((int) *__low);
+   *__low = ::toupper((int)(unsigned char) *__low);
++__low;
   }
 return __high;
@@ -84,14 +90,14 @@
 
   char
   ctype<char>::do_tolower(char __c) const
-  { return ::tolower((int) __c); }
+  { return ::tolower((int)(unsigned char) __c); }
 
   const char* 
   ctype<char>::do_tolower(char* __low, const char* __high) const
   {
 while (__low < __high)
   {
-   *__low = ::tolower((int) *__low);
+   *__low = ::tolower((int)(unsigned char) *__low);
++__low;
   }
 return __high;


--- libstdc++-v3/config/os/bsd/netbsd/ctype_base.h.orig 2014-12-10 22:18:50.0 +0100
+++ libstdc++-v3/config/os/bsd/netbsd/ctype_base.h  2014-12-10 22:20:31.0 +0100
@@ -43,9 +43,22 @@
 
 // NB: Offsets into ctype<char>::_M_table force a particular size
 // on the mask type. Because of this, we don't use an enum.
-typedef unsigned char  mask;
 
-#ifndef _CTYPE_U
+#if defined(_CTYPE_BL)
+typedef unsigned short  mask;
+static const mask upper = _CTYPE_U;
+static const mask lower = _CTYPE_L;
+static const mask alpha = _CTYPE_A;
+static const mask digit = _CTYPE_D;
+static const mask xdigit= _CTYPE_X;
+static const mask space = _CTYPE_S;
+static const mask print = _CTYPE_R;
+static const mask graph = _CTYPE_G;
+static const mask cntrl = _CTYPE_C;
+static const mask punct = _CTYPE_P;
+static const mask alnum = _CTYPE_A | _CTYPE_D;
+#elif !defined(_CTYPE_U)
+typedef unsigned char  mask;
 static const mask upper= _U;
 static const mask lower= _L;
 static const mask alpha= _U | _L;
@@ -58,6 +71,7 @@
 static const mask punct= _P;
 static const mask alnum= _U | _L | _N;
 #else
+typedef unsigned char  mask;
 static const mask upper= _CTYPE_U;
 static const mask lower= _CTYPE_L;
 static const mask alpha= _CTYPE_U | _CTYPE_L;
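
The (unsigned char) casts in the patches above follow the usual C rule
that the <ctype.h> functions, and any table indexed by character value,
must be given a value in the unsigned char range (or EOF).  A minimal
sketch of the pattern, with made-up helper names:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Upper-case one char, routing it through unsigned char first so that a
// char with the high bit set never reaches toupper() as a negative int.
char to_upper_safe(char c) {
  return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Index for a 256-entry classification table: always in [0, 255], even
// where plain char is signed.
std::size_t table_index(char c) {
  return static_cast<unsigned char>(c);
}
```

Where plain char is signed, a byte such as 0xE9 would otherwise be
passed as a negative value, which is undefined for toupper() and an
out-of-bounds index for a table like _M_table.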

second part of patches for #64271 (bootstrap on NetBSD)

2014-12-11 Thread Kai-Uwe Eckhardt

here are the non-libstdc++ patches:

--- libgfortran/configure.orig  2014-12-10 22:34:06.0 +0100
+++ libgfortran/configure   2014-12-10 22:33:57.0 +0100
@@ -26447,7 +26447,7 @@
 
   fi
   case "$host" in
-*-*-darwin* | *-*-hpux* | *-*-cygwin* | *-*-mingw* )
+*-*-darwin* | *-*-hpux* | *-*-cygwin* | *-*-mingw* | *-*-netbsd* )
 
 $as_echo "#define GTHREAD_USE_WEAK 0" >>confdefs.h


 
--- libcilkrts/configure.orig   2014-12-10 22:28:55.0 +0100
+++ libcilkrts/configure2014-12-10 22:28:38.0 +0100
@@ -14519,7 +14519,7 @@
 CFLAGS="$save_CFLAGS"
 
 if test $enable_shared = yes; then
-  link_cilkrts="-lcilkrts %{static: $LIBS}"
+  link_cilkrts="-rpath ${prefix}/lib --as-needed -lgcc_s -lcilkrts %{static $LIBS}"
 else
   link_cilkrts="-lcilkrts $LIBS"
 fi


--- libcilkrts/runtime/os-unix.c.orig   2014-12-10 22:29:28.0 +0100
+++ libcilkrts/runtime/os-unix.c2014-12-10 22:29:40.0 +0100
@@ -56,7 +56,9 @@
 // Uses sysconf(_SC_NPROCESSORS_ONLN) in verbose output
 #elif defined  __DragonFly__
 // No additional include files
-#elif defined  __FreeBSD__
+#elif defined  __FreeBSD__ 
+// No additional include files
+#elif defined  __NetBSD__ 
 // No additional include files
 #elif defined __CYGWIN__
 // Cygwin on Windows - no additional include files
@@ -376,7 +378,7 @@
 assert((unsigned)count == count);
 
 return count;
-#elif defined  __FreeBSD__ || defined __CYGWIN__ || defined __DragonFly__
+#elif defined  __FreeBSD__ || defined __CYGWIN__ || defined __DragonFly__ || defined __NetBSD__
 int ncores = sysconf(_SC_NPROCESSORS_ONLN);
 
 return ncores;
@@ -400,7 +402,7 @@
 
 COMMON_SYSDEP void __cilkrts_yield(void)
 {
-#if __APPLE__ || __FreeBSD__ || __VXWORKS__
+#if __APPLE__ || __FreeBSD__ || __NetBSD__ || __VXWORKS__
 // On MacOS, call sched_yield to yield quantum.  I'm not sure why we
 // don't do this on Linux also.
 sched_yield();

Re: Ping: Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal

2014-12-11 Thread Alan Lawrence

Sorry - it works exactly as the current optab/expander *in the v2df case*, but 
is the same as the previous version of the patch in the other cases.


--Alan

Alan Lawrence wrote:
So I'm afraid I'm not going to get involved in a discussion about 
CANNOT_CHANGE_MODE_CLASS on RS6000, and what you might want to do there - sorry, 
but I don't think I can really contribute anything there. However, I *am* trying 
to migrate all platforms off the old reduc_xxx optabs to the new version 
producing a scalar.


Hence, can I ping the attached patch (which is just a simple combination of the 
previously-posted patch + snippet)? No regressions on gcc112.fsffrance.org.


This works in exactly the same way as the old code path, with a second insn to 
pull the scalar result out of the reduction, just as the expander would have 
done (or the bitfieldref before that), and avoiding the v2df combine pattern 
(again, as previously).


gcc/ChangeLog:

 * config/rs6000/altivec.md (reduc_splus_): Rename to...
 (reduc_plus_scal_): ...this, add rs6000_expand_vector_extract.
 (reduc_uplus_v16qi): Remove.

 * config/rs6000/vector.md (VEC_reduc_name): change "splus" to "plus"
 (reduc__v2df): Remove.
 (reduc__scal_v2df): New.
 (reduc__v4sf): Rename to...
 (reduc__scal_v4sf): ...this, wrap VEC_reduc in a
 vec_select of element 3, add scratch register.





Have run check-gcc on gcc110.fsffrance.org (powerpc64-unknown-linux-gnu) using 
this snippet on top of original patch; no regressions.


Alan Lawrence wrote:

So I'm no expert on RS6000 here, but following on from Segher's observation 
about the change in pattern...so the difference in 'expand' is exactly that, a 
vsx_reduc_splus_v2df followed by a vec_extract to DF, becomes a 
vsx_reduc_splus_v2df_scalar - as I expected the combiner to produce by 
combining the two previous insns.


However, inspecting the logs from -fdump-rtl-combine-all, *without* my 
patch, when the combiner tries to put those two together, I see:


Trying 30 -> 31:
Failed to match this instruction:
(set (reg:DF 179 [ stmp_s_5.7D.2196 ])
 (vec_select:DF (plus:V2DF (vec_select:V2DF (reg:V2DF 173 [ vect_s_5.6D.2195 ])
 (parallel [
 (const_int 1 [0x1])
 (const_int 0 [0])
 ]))
 (reg:V2DF 173 [ vect_s_5.6D.2195 ]))
 (parallel [
 (const_int 1 [0x1])
 ])))

That is, it looks like combine_simplify_rtx has transformed the (vec_concat 
(vec_select ... 1) (vec_select ... 0)) from the vsx_reduc_plus_v2df insn, into 
a single vec_select, which does not match the vsx_reduc_plus_v2df_scalar insn.
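
For readers less used to RTL, the expression the combiner failed to match can be modelled with ordinary scalar code. This is only a sketch in C++: the `v2df` alias and the three function names are made up for illustration and say nothing about how the rs6000 backend represents these values.

```cpp
#include <array>

// Hypothetical model of a V2DF value as two doubles.
using v2df = std::array<double, 2>;

// Models (vec_select:V2DF x (parallel [1 0])): swap the two lanes.
v2df swap_lanes(const v2df &x) { return {x[1], x[0]}; }

// Models (plus:V2DF a b): lane-wise addition.
v2df add_lanes(const v2df &a, const v2df &b) {
  return {a[0] + b[0], a[1] + b[1]};
}

// Models (vec_select:DF (plus:V2DF (vec_select:V2DF x [1 0]) x) [1]):
// after the lane-wise add both lanes hold x[0] + x[1], so selecting
// element 1 yields the scalar sum.
double reduc_plus_scal(const v2df &x) {
  return add_lanes(swap_lanes(x), x)[1];
}
```

Calling `reduc_plus_scal` on a two-element value returns the sum of its lanes, which is the scalar result the final vec_select extracts.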


So despite the comment (in vsx.md):

;; Combiner patterns with the vector reduction patterns that knows we can get
;; to the top element of the V2DF array without doing an extract.

It looks like the code generation prior to my patch, considered better, was 
because the combiner didn't actually use the pattern?


In that case whilst you may want to dig into register allocation, 
cannot_change_mode_class, etc., for other reasons, I think the best fix for migrating to 
reduc_plus_scal... is simply to avoid using the "Combiner" patterns and just 
emit two insns, the old pattern followed by a vec_extract. The attached snippet does this 
(I won't call it a patch yet, and it applies on top of the previous patch - I went the 
route of calling the two gen functions rather than copying their RTL sequences, but could 
do the latter if that were preferable???), and restores code generation to the original 
form on your example above; it bootstraps OK but I'm still running check-gcc on the 
Compile Farm...


However, again on your example above, I note that if I *remove* the 
reduc_plus_scal_v2df pattern altogether, I get:


.sum:
 li 10,512# 52   *movdi_internal64/4 [length = 4]
 ld 9,.LC2@toc(2) # 20   *movdi_internal64/2 [length = 4]
 xxlxor 0,0,0 # 17   *vsx_movv2df/12 [length = 4]
 mtctr 10 # 48   *movdi_internal64/11[length = 4]
 .align 4
.L2:
 lxvd2x 12,0,9# 23   *vsx_movv2df/2  [length = 4]
 addi 9,9,16  # 25   *adddi3_internal1/2 [length = 4]
 xvadddp 0,0,12   # 24   *vsx_addv2df3/1 [length = 4]
 bdnz .L2 # 47   *ctrdi_internal1/1  [length = 4]
 xxsldwi 12,0,0,2 # 30   vsx_xxsldwi_v2df[length = 4]
 xvadddp 1,0,12   # 31   *vsx_addv2df3/1 [length = 4]
 nop  # 37   *vsx_extract_v2df_internal2/1   [length = 4]
 blr  # 55   return  [length = 4]

this is presumably using gcc's scalar reduction code, but (to my untrained 
eye on powerpc!) it looks even better than the first form above (the same in 
the loop, and in the reduction, an xxpe

Re: [patch] remove unused `depth' argument from dwarf2out.c

2014-12-11 Thread Jason Merrill


OK.

Jason

Re: RFC: handle cached local static DIEs

2014-12-11 Thread Jason Merrill


On 12/11/2014 12:44 PM, Aldy Hernandez wrote:

This context then gets changed here:

   /* For local statics lookup proper context die.  */
   if (TREE_STATIC (decl)
   && DECL_CONTEXT (decl)
   && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
 context_die = lookup_decl_die (DECL_CONTEXT (decl));


Can we remove this and just leave the context as NULL until it gets 
fixed up?


Jason

Re: RFC: handle cached local static DIEs

2014-12-11 Thread Aldy Hernandez


On 12/11/14 11:23, Jason Merrill wrote:

On 12/11/2014 02:19 PM, Jason Merrill wrote:

On 12/11/2014 12:44 PM, Aldy Hernandez wrote:

This context then gets changed here:

   /* For local statics lookup proper context die.  */
   if (TREE_STATIC (decl)
   && DECL_CONTEXT (decl)
   && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
 context_die = lookup_decl_die (DECL_CONTEXT (decl));


Can we remove this and just leave the context as NULL until it gets
fixed up?


Never mind, it looks like that'll require more work in gen_variable_die.
  Your patch looks fine.


Hah, I was just going to say that :).

I will push my patch to the branch.

Thanks for your input.
Aldy

[patch] remove unused `depth' argument from dwarf2out.c

2014-12-11 Thread Aldy Hernandez


Looks like `depth' is passed around and never used.

OK for mainline?
commit d1603304423bcb25c69d0f4bf51b142e07274275
Author: Aldy Hernandez 
Date:   Thu Dec 11 10:51:04 2014 -0800

* dwarf2out.c (gen_lexical_block_die): Remove unused `depth'
parameter.
(gen_inlined_subroutine_die): Same.
(gen_block_die): Same.
(decls_for_scope): Same.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 34b327e..4c2ff8d 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3263,8 +3263,8 @@ static void gen_subprogram_die (tree, dw_die_ref);
 static void gen_variable_die (tree, tree, dw_die_ref);
 static void gen_const_die (tree, dw_die_ref);
 static void gen_label_die (tree, dw_die_ref);
-static void gen_lexical_block_die (tree, dw_die_ref, int);
-static void gen_inlined_subroutine_die (tree, dw_die_ref, int);
+static void gen_lexical_block_die (tree, dw_die_ref);
+static void gen_inlined_subroutine_die (tree, dw_die_ref);
 static void gen_field_die (tree, dw_die_ref);
 static void gen_ptr_to_mbr_type_die (tree, dw_die_ref);
 static dw_die_ref gen_compile_unit_die (const char *);
@@ -3275,8 +3275,8 @@ static void gen_struct_or_union_type_die (tree, dw_die_ref,
 static void gen_subroutine_type_die (tree, dw_die_ref);
 static void gen_typedef_die (tree, dw_die_ref);
 static void gen_type_die (tree, dw_die_ref);
-static void gen_block_die (tree, dw_die_ref, int);
-static void decls_for_scope (tree, dw_die_ref, int);
+static void gen_block_die (tree, dw_die_ref);
+static void decls_for_scope (tree, dw_die_ref);
 static inline int is_redundant_typedef (const_tree);
 static bool is_naming_typedef_decl (const_tree);
 static inline dw_die_ref get_context_die (tree);
@@ -18696,7 +18696,7 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
   if (DECL_NAME (DECL_RESULT (decl)))
gen_decl_die (DECL_RESULT (decl), NULL, subr_die);
 
-  decls_for_scope (outer_scope, subr_die, 0);
+  decls_for_scope (outer_scope, subr_die);
 
   if (call_arg_locations && !dwarf_strict)
{
@@ -19294,7 +19294,7 @@ add_high_low_attributes (tree stmt, dw_die_ref die)
 /* Generate a DIE for a lexical block.  */
 
 static void
-gen_lexical_block_die (tree stmt, dw_die_ref context_die, int depth)
+gen_lexical_block_die (tree stmt, dw_die_ref context_die)
 {
   dw_die_ref stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt);
 
@@ -19308,13 +19308,13 @@ gen_lexical_block_die (tree stmt, dw_die_ref context_die, int depth)
   if (! BLOCK_ABSTRACT (stmt) && TREE_ASM_WRITTEN (stmt))
 add_high_low_attributes (stmt, stmt_die);
 
-  decls_for_scope (stmt, stmt_die, depth);
+  decls_for_scope (stmt, stmt_die);
 }
 
 /* Generate a DIE for an inlined subprogram.  */
 
 static void
-gen_inlined_subroutine_die (tree stmt, dw_die_ref context_die, int depth)
+gen_inlined_subroutine_die (tree stmt, dw_die_ref context_die)
 {
   tree decl;
 
@@ -19346,7 +19346,7 @@ gen_inlined_subroutine_die (tree stmt, dw_die_ref context_die, int depth)
 add_high_low_attributes (stmt, subr_die);
   add_call_src_coords_attributes (stmt, subr_die);
 
-  decls_for_scope (stmt, subr_die, depth);
+  decls_for_scope (stmt, subr_die);
 }
 }
 
@@ -20240,7 +20240,7 @@ gen_type_die (tree type, dw_die_ref context_die)
things which are local to the given block.  */
 
 static void
-gen_block_die (tree stmt, dw_die_ref context_die, int depth)
+gen_block_die (tree stmt, dw_die_ref context_die)
 {
   int must_output_die = 0;
   bool inlined_func;
@@ -20259,7 +20259,7 @@ gen_block_die (tree stmt, dw_die_ref context_die, int depth)
   tree sub;
 
   for (sub = BLOCK_SUBBLOCKS (stmt); sub; sub = BLOCK_CHAIN (sub))
-   gen_block_die (sub, context_die, depth + 1);
+   gen_block_die (sub, context_die);
 
   return;
 }
@@ -20314,13 +20314,13 @@ gen_block_die (tree stmt, dw_die_ref context_die, int depth)
 the concrete instance of STMT got inlined, the later will lead
 to the generation of a DW_TAG_inlined_subroutine DIE.  */
  if (! BLOCK_ABSTRACT (stmt))
-   gen_inlined_subroutine_die (stmt, context_die, depth);
+   gen_inlined_subroutine_die (stmt, context_die);
}
   else
-   gen_lexical_block_die (stmt, context_die, depth);
+   gen_lexical_block_die (stmt, context_die);
 }
   else
-decls_for_scope (stmt, context_die, depth);
+decls_for_scope (stmt, context_die);
 }
 
 /* Process variable DECL (or variable with origin ORIGIN) within
@@ -20352,7 +20352,7 @@ process_scope_var (tree stmt, tree decl, tree origin, dw_die_ref context_die)
all of its sub-blocks.  */
 
 static void
-decls_for_scope (tree stmt, dw_die_ref context_die, int depth)
+decls_for_scope (tree stmt, dw_die_ref context_die)
 {
   tree decl;
   unsigned int i;
@@ -20384,7 +20384,7 @@ decls_for_scope (tree stmt, dw_die_ref context_die, int depth)
   for (subblocks = BLOCK_SUBBLOCKS (stmt);
subblocks != NULL;

Re: [PATCH 00/13] Go closures, libffi, and the static chain

2014-12-11 Thread Richard Henderson

On 12/11/2014 01:06 AM, Dominik Vogt wrote:
> reflect.call
>   ../../../libgo/runtime/go-reflect-call.c:216
> reflect.call.N13_reflect.Value
>   
> GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/value.go:579
> reflect.Call.N13_reflect.Value
>   
> GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/value.go:412
> reflect_test.TestCallWithStruct
>   
> GCCDIR/build-go-closure/x86_64-unknown-linux-gnu/libgo/gotest30365/test/all_test.go:1490
> testing.tRunner
>   ../../../libgo/go/testing/testing.go:422

Indeed.  libgo uses ffi_type_void to represent empty structures,
and libffi would crash for x86_64 when passing such parameters.

This does go back to an open bug report about how libffi handles
empty structures in general.

I've fixed this on the branch, and I'll push this through the
proper channels later.

r~

Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain

2014-12-11 Thread Richard Henderson

On 12/11/2014 04:25 AM, Dominik Vogt wrote:
> Update:  If I disable the custom s390x code and switch to the
> implementation just using libffi for reflection calls, the same
> crash occurs with the testing/quick libgo test case.  The called
> function sees a bogus value written by the dynamic linker as the
> closure pointer, for example with this line in the test code:
> 
>   CheckEqual(fComplex64, fComplex64, nil)

The compiler should be generating a static structure for these.
On x86_64, I see

Relocation section '.rela.rodata.testing_quick.fComplex64$descriptor' at offset
0x5d4c0 contains 1 entries:
  Offset  Info   Type   Sym. Value    Sym. Name + Addend
  00020001 R_X86_64_64    .text + c0

00c0 t quick.fComplex64

so that is in fact a direct relocation, and will not go via the dynamic linker.
 Is the s390 port somehow putting the address of a plt entry here?


r~

Re: RFC: handle cached local static DIEs

2014-12-11 Thread Jason Merrill


On 12/11/2014 02:19 PM, Jason Merrill wrote:

On 12/11/2014 12:44 PM, Aldy Hernandez wrote:

This context then gets changed here:

   /* For local statics lookup proper context die.  */
   if (TREE_STATIC (decl)
   && DECL_CONTEXT (decl)
   && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
 context_die = lookup_decl_die (DECL_CONTEXT (decl));


Can we remove this and just leave the context as NULL until it gets
fixed up?


Never mind, it looks like that'll require more work in gen_variable_die. 
 Your patch looks fine.


Jason

Overload HONOR_INFINITIES, etc macros

2014-12-11 Thread Marc Glisse


Hello,

after HONOR_NANS, I am turning the other HONOR_* macros into functions. As 
a reminder, the goal is both to make uses shorter and to fix the answer 
for non-native vector types.
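
To illustrate the idea (not the patch itself), here is a toy sketch of three overloads that all funnel into the element mode. Every name in it (`toy_mode`, `toy_type`, `toy_expr`, `toy_finite_math_only`, `toy_honor_infinities`) is hypothetical and merely stands in for GCC's modes, types, expressions and flags. The overload set of mode, type and expression is an assumption on my part, guessed from the call sites in the diff.

```cpp
// Toy stand-ins; none of these are GCC's real types.
enum toy_mode { TOY_SF, TOY_DF, TOY_BLK };

struct toy_type {
  toy_mode mode;          // mode of the whole type; TOY_BLK models a vector
                          // type for which the target has no vector mode
  toy_mode element_mode;  // mode of one element (equals mode for scalars)
};

struct toy_expr {
  const toy_type *type;   // type of the expression
};

// Stand-in for a -ffinite-math-only style flag.
bool toy_finite_math_only = false;

// Overload 1: the answer for a mode.
bool toy_honor_infinities(toy_mode m) {
  bool is_float = (m == TOY_SF || m == TOY_DF);
  return is_float && !toy_finite_math_only;
}

// Overload 2: a type is answered through its element mode, so a vector
// type without a native vector mode still gets the element's answer.
bool toy_honor_infinities(const toy_type &t) {
  return toy_honor_infinities(t.element_mode);
}

// Overload 3: an expression is answered through its type.
bool toy_honor_infinities(const toy_expr &e) {
  return toy_honor_infinities(*e.type);
}
```

With a `toy_type` whose `mode` is `TOY_BLK` and whose `element_mode` is `TOY_SF`, asking about the whole-type mode says no, while asking about the type or an expression of that type says yes. The call sites also get shorter because they no longer spell out the mode lookup.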


Bootstrap+testsuite on x86_64-linux-gnu.

2014-12-12  Marc Glisse  

* real.h (HONOR_SNANS, HONOR_INFINITIES, HONOR_SIGNED_ZEROS,
HONOR_SIGN_DEPENDENT_ROUNDING): Replace macros with 3 overloaded
declarations.
* real.c (HONOR_NANS): Fix indentation.
(HONOR_SNANS, HONOR_INFINITIES, HONOR_SIGNED_ZEROS,
HONOR_SIGN_DEPENDENT_ROUNDING): Define three overloads.
* builtins.c (fold_builtin_cproj, fold_builtin_signbit,
fold_builtin_fmin_fmax, fold_builtin_classify): Simplify argument
of HONOR_*.
* fold-const.c (operand_equal_p, fold_comparison, fold_binary_loc):
Likewise.
* gimple-fold.c (gimple_val_nonnegative_real_p): Likewise.
* ifcvt.c (noce_try_move, noce_try_minmax, noce_try_abs): Likewise.
* omp-low.c (omp_reduction_init): Likewise.
* rtlanal.c (may_trap_p_1): Likewise.
* simplify-rtx.c (simplify_const_relational_operation): Likewise.
* tree-ssa-dom.c (record_equality, record_edge_info): Likewise.
* tree-ssa-phiopt.c (value_replacement, abs_replacement): Likewise.
* tree-ssa-reassoc.c (eliminate_using_constants): Likewise.
* tree-ssa-uncprop.c (associate_equivalences_with_edges): Likewise.


--
Marc Glisse
Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 218639)
+++ gcc/builtins.c  (working copy)
@@ -7671,21 +7671,21 @@ build_complex_cproj (tree type, bool neg
return type.  Return NULL_TREE if no simplification can be made.  */
 
 static tree
 fold_builtin_cproj (location_t loc, tree arg, tree type)
 {
   if (!validate_arg (arg, COMPLEX_TYPE)
   || TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) != REAL_TYPE)
 return NULL_TREE;
 
   /* If there are no infinities, return arg.  */
-  if (! HONOR_INFINITIES (TYPE_MODE (TREE_TYPE (type))))
+  if (! HONOR_INFINITIES (type))
 return non_lvalue_loc (loc, arg);
 
   /* Calculate the result when the argument is a constant.  */
   if (TREE_CODE (arg) == COMPLEX_CST)
 {
   const REAL_VALUE_TYPE *real = TREE_REAL_CST_PTR (TREE_REALPART (arg));
   const REAL_VALUE_TYPE *imag = TREE_REAL_CST_PTR (TREE_IMAGPART (arg));
   
   if (real_isinf (real) || real_isinf (imag))
return build_complex_cproj (type, imag->sign);
@@ -8942,21 +8942,21 @@ fold_builtin_signbit (location_t loc, tr
   return (REAL_VALUE_NEGATIVE (c)
  ? build_one_cst (type)
  : build_zero_cst (type));
 }
 
   /* If ARG is non-negative, the result is always zero.  */
   if (tree_expr_nonnegative_p (arg))
 return omit_one_operand_loc (loc, type, integer_zero_node, arg);
 
   /* If ARG's format doesn't have signed zeros, return "arg < 0.0".  */
-  if (!HONOR_SIGNED_ZEROS (TYPE_MODE (TREE_TYPE (arg))))
+  if (!HONOR_SIGNED_ZEROS (arg))
 return fold_convert (type,
 fold_build2_loc (loc, LT_EXPR, boolean_type_node, arg,
build_real (TREE_TYPE (arg), dconst0)));
 
   return NULL_TREE;
 }
 
 /* Fold function call to builtin copysign, copysignf or copysignl with
arguments ARG1 and ARG2.  Return NULL_TREE if no simplification can
be made.  */
@@ -9136,26 +9136,26 @@ fold_builtin_fmin_fmax (location_t loc,
   tree res = do_mpfr_arg2 (arg0, arg1, type, (max ? mpfr_max : mpfr_min));
 
   if (res)
return res;
 
   /* If either argument is NaN, return the other one.  Avoid the
 transformation if we get (and honor) a signalling NaN.  Using
 omit_one_operand() ensures we create a non-lvalue.  */
   if (TREE_CODE (arg0) == REAL_CST
  && real_isnan (&TREE_REAL_CST (arg0))
- && (! HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
+ && (! HONOR_SNANS (arg0)
  || ! TREE_REAL_CST (arg0).signalling))
return omit_one_operand_loc (loc, type, arg1, arg0);
   if (TREE_CODE (arg1) == REAL_CST
  && real_isnan (&TREE_REAL_CST (arg1))
- && (! HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg1)))
+ && (! HONOR_SNANS (arg1)
  || ! TREE_REAL_CST (arg1).signalling))
return omit_one_operand_loc (loc, type, arg0, arg1);
 
   /* Transform fmin/fmax(x,x) -> x.  */
   if (operand_equal_p (arg0, arg1, OEP_PURE_SAME))
return omit_one_operand_loc (loc, type, arg0, arg1);
 
   /* Convert fmin/fmax to MIN_EXPR/MAX_EXPR.  C99 requires these
 functions to return the numeric arg if the other one is NaN.
 These tree codes don't honor that, so only transform if
@@ -9552,21 +9552,21 @@ fold_builtin_classify (location_t loc, t
 {
   tree type = TREE_TYPE (TREE_TYPE (fndecl));
   REAL_VALUE_TYPE r;
 
   if (!validate_arg (arg, REAL_TYPE))
 return NULL_TREE;
 
   sw

[PATCH] backport libgo patch to add ioctl consts

2014-12-11 Thread Lynn A. Boger


Hi all,

Please backport the following to gcc 4.9 
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02980.html.


There has been a request to get the fixes that went into gcc trunk for 
gccgo ppc64 & ppc64le backported into gcc 4.9.


2014-12-11 Lynn Boger 

* libgo/mksysinfo.sh:  Add ioctl const values

Index: libgo/mksysinfo.sh
===
--- libgo/mksysinfo.sh  (revision 218396)
+++ libgo/mksysinfo.sh  (working copy)
@@ -174,6 +174,9 @@ enum {
 #ifdef TIOCGWINSZ
   TIOCGWINSZ_val = TIOCGWINSZ,
 #endif
+#ifdef TIOCSWINSZ
+  TIOCSWINSZ_val = TIOCSWINSZ,
+#endif
 #ifdef TIOCNOTTY
   TIOCNOTTY_val = TIOCNOTTY,
 #endif
@@ -192,6 +195,12 @@ enum {
 #ifdef TIOCSIG
   TIOCSIG_val = TIOCSIG,
 #endif
+#ifdef TCGETS
+  TCGETS_val = TCGETS,
+#endif
+#ifdef TCSETS
+  TCSETS_val = TCSETS,
+#endif
 };
 EOF

@@ -780,6 +789,11 @@ if ! grep '^const TIOCGWINSZ' ${OUT} >/dev/null 2>
 echo 'const TIOCGWINSZ = _TIOCGWINSZ_val' >> ${OUT}
   fi
 fi
+if ! grep '^const TIOCSWINSZ' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TIOCSWINSZ_val' ${OUT} >/dev/null 2>&1; then
+echo 'const TIOCSWINSZ = _TIOCSWINSZ_val' >> ${OUT}
+  fi
+fi
 if ! grep '^const TIOCNOTTY' ${OUT} >/dev/null 2>&1; then
   if grep '^const _TIOCNOTTY_val' ${OUT} >/dev/null 2>&1; then
 echo 'const TIOCNOTTY = _TIOCNOTTY_val' >> ${OUT}
@@ -812,8 +826,18 @@ if ! grep '^const TIOCSIG' ${OUT} >/dev/null 2>&1;
 fi

 # The ioctl flags for terminal control
-grep '^const _TC[GS]ET' gen-sysinfo.go | \
+grep '^const _TC[GS]ET' gen-sysinfo.go | grep -v _val | \
 sed -e 's/^\(const \)_\(TC[GS]ET[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+if ! grep '^const TCGETS' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TCGETS_val' ${OUT} >/dev/null 2>&1; then
+echo 'const TCGETS = _TCGETS_val' >> ${OUT}
+  fi
+fi
+if ! grep '^const TCSETS' ${OUT} >/dev/null 2>&1; then
+  if grep '^const _TCSETS_val' ${OUT} >/dev/null 2>&1; then
+echo 'const TCSETS = _TCSETS_val' >> ${OUT}
+  fi
+fi

 # ioctl constants.  Might fall back to 0 if TIOCNXCL is missing, too, but
 # needs handling in syscalls.exec.go.

Re: Remove unused arguments of bulitin_unreachable

2014-12-11 Thread Martin Jambor

Hi,

On Thu, Dec 11, 2014 at 07:16:43PM +0100, Jan Hubicka wrote:
> > On Thu, Dec 11, 2014 at 06:06:55PM +0100, Jan Hubicka wrote:
> > > Hi,
> > > in firefox .optimized dumps one can see a few places where
> > > __builtin_unreachable is called (as a result of devirtualization code
> > > proving the code path to be undefined).  There is usually some argument
> > > setup for the parameters of __builtin_unreachable that are dead.  This
> > > patch makes it somewhat better so now we get:
> > >   :
> > >   # prephitmp_222 = PHI <_52(27), pretmp_245(29)>
> > >   _57 = prephitmp_222 + 2;
> > >   pool_40(D)->ptr = _57;
> > >   __builtin_unreachable ();
> > > 
> > > Why DSE does not eliminate the stores prior noreturn const function?
> > > 
> > > Bootstrapped/regtested x86_64-linux, OK?
> > > 
> > > Honza
> > >   * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Remove dead
> > >   parameters of BUILT_IN_UNREACHABLE
> > 
> > Shouldn't this be done when you actually change the call to
> > __builtin_unreachable ()?  I mean, __builtin_unreachable () has no
> > arguments, so leaving any arguments there is broken IL, even if you clean it
> > up during the next DCE.
> 
> Hmm, I thought there was some reason to not do so because of inplace folding
> and memory-SSA.  I can try to update all the places where we can put
> builtin_unreachable into IL.
> (I wonder if that also includes standard constant propagation)

I think that's what we ought to do, see also PR 61591.

Martin

Re: [patch] Fix ICE on unaligned record field

2014-12-11 Thread Eric Botcazou

> Note that I think the place of the check is unfortunate as you for example
> will not remove the argument if it is unused.  In fact I'm not yet sure
> what transform exactly we are disabling.  I am guessing we are
> passing an aggregate by value that resides at a bit-aligned offset
> of some outer object:
> 
>   foo (x.aggr);
> 
> and the function then does
> 
> foo (Aggr a)
> {
>   int i = a.foo;
> ...
> }
> 
> thus use only a part of the aggregate.  Then IPA SRA would like to
> pass x.aggr.foo instead of x.aggr and thus tries to materialize a
> load from x.aggr.foo at all callers but fails to do that in a valid way.

Right, it's the usual MEM_EXPR business creating ADDR_EXPRs out of nowhere and 
miserably failing on something not addressable.

> Eric's fix did, at all callers
> 
>   Aggr tem = x.aggr;
>   foo (tem.foo);
> 
> ?

Yes, because the code wants to take &tem afterwards.

> While we should be able to simply do
> 
>   foo (BIT_FIELD_REF )
> 
> with the appropriate bit offset and size?  (if that's of register type
> you need to do the load in a separate stmt of course).
> 
> Thus similar to Eric's fix but avoiding the aggregate copy.

Yes, that should be doable, but I'm not sure it's worth the hassle.

-- 
Eric Botcazou

Fix ipa-comdats crashes

2014-12-11 Thread Jan Hubicka

Hi,
IPA comdats performs a dataflow identifying the section where every symbol is
used.  It sanity checks that everything is reachable.  This sanity check
exposes a latent issue with unreachable function removal.
symbol_table::remove_unreachable_nodes has a parameter before_inlining_p that
says whether extern inline and virtual functions should be eliminated if they
are not inlined.  This parameter is correctly used only in the call within the
inliner itself and in cgraphunit.  All other cleanups happen with
before_inlining_p true, which may leave some unreachable inlines at
ipa-comdats time.

I fixed this by adding an explicit state of the symbol table for IPA passes
run after the inliner.

Another issue found by Trevor is that ipa-pure-const may render a function
unreachable in case a static cdtor is found to be pure/const.  Fixed thus.

I also updated ipa.c to be more aggressive on removing functions that may be
inlined at -O0.  This should improve compile times since the functions do not
need to bubble down in the queue.

In fact there is the same issue in reachability computed at callgraph
construction time as well as within the C++ frontend.  I will send separate
patches for this: it seems that those may account for quite a noticeable
percentage of memory and compile time.

Bootstrapped/regtested x86_64-linux, committed.
I am grateful to Trevor for analysis and initial patch.

Honza

PR ipa/61324
* testsuite/g++.dg/pr61324.C: New testcase by Trevor Saunders.
* testsuite/g++.dg/tm/pr51411-2.C: Update so the extern function is
not eliminated early.
* testsuite/gcc.target/i386/pr57756.c: Turn extern inline into static
inline.

* passes.c (execute_todo): Update call of remove_unreachable_nodes.
* ipa-chkp.c (chkp_produce_thunks): Use TODO_remove_functions.
* cgraphunit.c (symbol_table::process_new_functions): Add
IPA_SSA_AFTER_INLINING.
(ipa_passes): Update call of remove_unreachable_nodes.
(symbol_table::compile): Remove call of remove_unreachable_nodes.
* ipa-inline.c (inline_small_functions): Do not ICE with
-flto-partition=none
(ipa_inline): Update symtab->state; fix formatting
update call of remove_unreachable_nodes.
* cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
* cgraph.h (enum symtab_state): Add IPA_SSA_AFTER_INLINING.
(remove_unreachable_nodes): Update.
* ipa.c (process_references): Keep external references only
when optimizing.
(walk_polymorphic_call_targets): Keep possible polymorphic call
target only when devirtualizing.
(symbol_table::remove_unreachable_nodes): Remove BEFORE_INLINING_P
parameter.
(ipa_single_use): Update comment.
* ipa-pure-const.c (cdtor_p): New function.
(propagate_pure_const): Track if some cdtor was turned pure/const.
(execute): Return TODO_remove_functions if needed.
* ipa-comdats.c (ipa_comdats): Update comment.

* lto.c (read_cgraph_and_symbols): Update call of
remove_unreachable_nodes.
(do_whole_program_analysis): Remove call of
symtab->remove_unreachable_nodes
Index: testsuite/g++.dg/pr61324.C
===
--- testsuite/g++.dg/pr61324.C  (revision 0)
+++ testsuite/g++.dg/pr61324.C  (revision 0)
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-options "-O -fkeep-inline-functions -fno-use-cxa-atexit" }
+void foo ();
+
+struct S
+{
+  ~S ()
+  {
+foo ();
+  }
+};
+
+S s;
Index: testsuite/g++.dg/tm/pr51411-2.C
===
--- testsuite/g++.dg/tm/pr51411-2.C (revision 218610)
+++ testsuite/g++.dg/tm/pr51411-2.C (working copy)
@@ -26,6 +26,7 @@ public:
 bool compare(const basic_string& __str) const {
 return 0;
 }
+void key ();
 };
 
 typedef basic_string string;
@@ -35,7 +36,7 @@ inline bool operator<(const basic_string
 return __lhs.compare(__rhs);
 }
 
-extern template class basic_string;
+template class basic_string;
 
 }
 
Index: testsuite/gcc.target/i386/pr57756.c
===
--- testsuite/gcc.target/i386/pr57756.c (revision 218610)
+++ testsuite/gcc.target/i386/pr57756.c (working copy)
@@ -9,7 +9,7 @@ __inline int callee () /* { dg-error "in
 }
 
 __attribute__((target("sse")))
-__inline int caller ()
+static __inline int caller ()
 {
   return callee(); /* { dg-error "called from here" }  */
 }
Index: cgraph.h
===
--- cgraph.h(revision 218610)
+++ cgraph.h(working copy)
@@ -1801,12 +1801,15 @@ enum symtab_state
   PARSING,
   /* Callgraph is being constructed.  It is safe to add new functions.  */
   CONSTRUCTION,
-  /* Callgraph is being at LTO time.  */
+  /* Callgraph is being streamed-in at LTO time.  */
   LTO_STREAMING,
-  /* Callgraph is b

Fix builtin-arith-overflow-1.c with unsigned char

2014-12-11 Thread Eric Botcazou

The char's in gcc.dg/builtin-arith-overflow-1.c are almost all explicitly 
signed or unsigned, except for 2 of them, but that's enough to make it fail 
for targets whose char is unsigned.
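
For anyone who wants to see the effect in isolation, here is a small sketch. It is not part of the testcase: the helper names are invented for illustration, and it relies on the GCC/Clang `__builtin_add_overflow` builtin. It shows that the byte 0xff reads back as a different value through plain `char` depending on the target, and that the explicitly signed variant behaves the same everywhere.

```cpp
#include <climits>

// Same shape as fn3 in the test, with the first parameter explicitly
// signed.  Returns the sum and stores the overflow flag in *ovf.
int add_signed_char(signed char x, unsigned short y, int *ovf) {
  int res;
  *ovf = __builtin_add_overflow(x, y, &res);
  return res;
}

// The byte 0xff seen through each character type.
int byte_ff_as_signed_char() { return static_cast<signed char>(-1); }
int byte_ff_as_unsigned_char() { return static_cast<unsigned char>(0xff); }

// Target-dependent: -1 where plain char is signed, 255 where it is not.
int byte_ff_as_plain_char() { return static_cast<char>('\xff'); }

// True on targets whose plain char is signed.
bool plain_char_is_signed() { return CHAR_MIN < 0; }
```

A test that passes a negative value through a plain `char` parameter therefore sees a large positive value on unsigned-char targets, which is enough to change the expected results.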

Tested on x86-64 and a private port, applied on mainline as obvious.


2014-12-11  Eric Botcazou  

* gcc.dg/builtin-arith-overflow-1.c (fn2): Take signed char.
(fn3): Likewise.


-- 
Eric Botcazou
Index: gcc.dg/builtin-arith-overflow-1.c
===
--- gcc.dg/builtin-arith-overflow-1.c	(revision 218617)
+++ gcc.dg/builtin-arith-overflow-1.c	(working copy)
@@ -17,7 +17,7 @@ fn1 (int x, unsigned int y)
 /* MUL_OVERFLOW should be folded into unsigned multiplication,
because ovf is never used.  */
 __attribute__((noinline, noclone)) int
-fn2 (char x, long int y)
+fn2 (signed char x, long int y)
 {
   short int res;
   int ovf = __builtin_mul_overflow (x, y, &res);
@@ -31,7 +31,7 @@ fn2 (char x, long int y)
 /* ADD_OVERFLOW should be folded into unsigned addition,
because it never overflows.  */
 __attribute__((noinline, noclone)) int
-fn3 (char x, unsigned short y, int *ovf)
+fn3 (signed char x, unsigned short y, int *ovf)
 {
   int res;
   *ovf = __builtin_add_overflow (x, y, &res);

Fix doc about meaning of (pc) in length calculation

2014-12-11 Thread Eric Botcazou

The doc reads:

`(pc)'
 This refers to the address of the _current_ insn.  It might have
 been more consistent with other usage to make this the address of
 the _next_ insn but this would be confusing because the length of
 the current insn is to be computed.

That's incorrect for forward branches, (pc) points to the next insn for them, 
this is what final.c:insn_current_reference_address implements.
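
As a rough model of the behaviour described above, the rule can be written in a few lines of C++. This is a simplification under stated assumptions: the function names are hypothetical, addresses and lengths are plain byte counts, and alignment padding, which the real function in final.c has to deal with, is ignored entirely.

```cpp
// Simplified model of the address that (pc) stands for when the length
// of a branch at INSN_ADDR with length INSN_LEN targeting TARGET_ADDR
// is computed.  Alignment padding is deliberately ignored.
long reference_address(long insn_addr, long insn_len, long target_addr) {
  bool forward = target_addr > insn_addr;
  // Forward branch: (pc) is the address of the next insn.
  // Backward branch (or branch to self): (pc) is the current insn.
  return forward ? insn_addr + insn_len : insn_addr;
}

// Distance that a length attribute comparing the target against (pc)
// would see under this model.
long branch_distance(long insn_addr, long insn_len, long target_addr) {
  return target_addr - reference_address(insn_addr, insn_len, target_addr);
}
```

So a 4-byte branch at address 100 that jumps forward to 140 sees a distance of 36, not 40, whereas the same branch jumping back to 60 sees -40.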

I presume that's a quirk known in circles of seasoned GCC hackers for long, 
but I just ran into it and that's a little surprising...

Tested on x86_64-suse-linux, applied on all active branches.


2014-12-11  Eric Botcazou  

* doc/md.texi (Insn Lengths): Fix description of (pc).


-- 
Eric Botcazou
Index: doc/md.texi
===
--- doc/md.texi	(revision 218617)
+++ doc/md.texi	(working copy)
@@ -8330,9 +8330,9 @@ must be a @code{label_ref}.
 
 @cindex @code{pc} and attributes
 @item (pc)
-This refers to the address of the @emph{current} insn.  It might have
-been more consistent with other usage to make this the address of the
-@emph{next} insn but this would be confusing because the length of the
+For non-branch instructions and backward branch instructions, this refers
+to the address of the current insn.  But for forward branch instructions,
+this refers to the address of the next insn, because the length of the
 current insn is to be computed.
 @end table

Re: [PATCH][libstdc++][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model

2014-12-11 Thread Mike Stump

On Dec 11, 2014, at 1:32 AM, Kyrill Tkachov  wrote:
>  the patch that adds the libstdc++.exp changes at 
> https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00952.html using the new 
> target-utils.exp file is ok too then?

Ok.

[PATCH 0/4] GCC port for the Visium

2014-12-11 Thread Eric Botcazou

Hi,

on behalf of Controls and Data Services, AdaCore would like to contribute a
port of the GCC to the Visium.  This is a 32-bit RISC architecture with an 
Extended Arithmetic Module implementing some 64-bit operations and an FPU
designed for embedded systems.  The binutils port has already been contributed 
and the ultimate goal is to contribute a port of the entire toolchain with 
simulator, debugger and embedded libc.

The original port had been written by employees of CDS or companies that
are now part of CDS, and AdaCore contributed enhancements and modifications
on top of it.  Both companies have a copyright assignment on file with the
FSF for the various components of the toolchain.

The Visium is a classic 32-bit RISC architecture whose branches have a delay 
slot and whose arithmetic and logical instructions all set the flags; these
include the moves between GP registers (which are inclusive ORs under
the hood in the traditional RISC fashion).  The port is nevertheless MODE_CC
and it generates code that is as good as the original cc0 implementation with 
the help of the post-reload compare elimination pass (modulo a small patch for 
the reorg pass that I'll submit separately).

The GCC port is split into 4 patches (toplevel, libgcc, gcc, gcc/testsuite) 
and is C-only for now, and 'make -k check-c' reports the following results:

Target is visium-unknown-elf
Host   is x86_64-suse-linux-gnu

=== gcc tests ===


Running target visium-sim
FAIL: gcc.dg/torture/builtin-explog-1.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -Os  (test for excess errors)

=== gcc Summary for visium-sim ===

# of expected passes            81007
# of unexpected failures        6
# of expected failures          94
# of unsupported tests          1796

Running target visium-sim/-mcpu=gr6
FAIL: gcc.dg/torture/builtin-explog-1.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O3 -fomit-frame-pointer  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/builtin-explog-1.c   -Os  (test for excess errors)

=== gcc Summary for visium-sim/-mcpu=gr6 ===

# of expected passes            81007
# of unexpected failures        6
# of expected failures          94
# of unsupported tests          1796

=== gcc Summary ===

# of expected passes            162014
# of unexpected failures        12
# of expected failures          188
# of unsupported tests          3592
/home/eric/build/gcc/visium-elf/gcc/xgcc  version 5.0.0 20141211 (experimental) [trunk revision 218617] (GCC)

after they are applied (on an x86_64-linux host).  I think that the failures
are common to all newlib targets and very likely related to:
  https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00357.html

OK for the mainline?

-- 
Eric Botcazou

[PATCH 1/4] Add Visium support to toplevel

2014-12-11 Thread Eric Botcazou

ChangeLog

2014-12-11  Eric Botcazou  

* config.sub: Update from upstream config repo.
* configure.ac: Add Visium support.
* configure: Regenerate.


-- 
Eric Botcazou

Index: config.sub
===
--- config.sub	(revision 218617)
+++ config.sub	(working copy)
@@ -2,7 +2,7 @@
 # Configuration validation subroutine script.
 #   Copyright 1992-2014 Free Software Foundation, Inc.
 
-timestamp='2014-09-26'
+timestamp='2014-12-03'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -313,6 +313,7 @@ case $basic_machine in
 	| tahoe | tic4x | tic54x | tic55x | tic6x | tic80 | tron \
 	| ubicom32 \
 	| v850 | v850e | v850e1 | v850e2 | v850es | v850e2v3 \
+	| visium \
 	| we32k \
 	| x86 | xc16x | xstormy16 | xtensa \
 	| z8k | z80)
@@ -440,6 +441,7 @@ case $basic_machine in
 	| ubicom32-* \
 	| v850-* | v850e-* | v850e1-* | v850es-* | v850e2-* | v850e2v3-* \
 	| vax-* \
+	| visium-* \
 	| we32k-* \
 	| x86-* | x86_64-* | xc16x-* | xps100-* \
 	| xstormy16-* | xtensa*-* \
Index: configure.ac
===
--- configure.ac	(revision 218617)
+++ configure.ac	(working copy)
@@ -669,6 +669,10 @@ case "${target}" in
 # for explicit misaligned loads.
 noconfigdirs="$noconfigdirs target-libssp"
 ;;
+  visium-*-*)
+# No hosted I/O support.
+noconfigdirs="$noconfigdirs target-libssp"
+;;
 esac
 
 # Disable libstdc++-v3 for some systems.

[PATCH 2/4] Add Visium support to libgcc

2014-12-11 Thread Eric Botcazou

libgcc/ChangeLog

2014-12-11  Eric Botcazou  

* config.host: Add Visium support.
* config/visium: New directory.


-- 
Eric Botcazou

Index: config.host
===
--- config.host	(revision 218617)
+++ config.host	(working copy)
@@ -1233,6 +1233,10 @@ vax-*-netbsdelf*)
 	;;
 vax-*-openbsd*)
 	;;
+visium-*-elf*)
+extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
+tmake_file="visium/t-visium t-fdpbit"
+;;
 xstormy16-*-elf)
 	tmake_file="stormy16/t-stormy16 t-fdpbit"
 	;;


libgcc_visium.tar.gz
Description: application/compressed-tar

[PATCH 3/4] Add Visium support to gcc

2014-12-11 Thread Eric Botcazou

gcc/ChangeLog

2014-12-11  Eric Botcazou  

* config.gcc: Add Visium support.
* configure.ac: Likewise.
* configure: Regenerate.
* doc/invoke.texi: Document Visium options.
* doc/md.texi: Document Visium constraints.
* common/config/visium: New directory.
* config/visium: Likewise.


-- 
Eric Botcazou

Index: config.gcc
===
--- config.gcc	(revision 218617)
+++ config.gcc	(working copy)
@@ -2853,6 +2853,10 @@ vax-*-openbsd*)
 	extra_options="${extra_options} openbsd.opt"
 	use_collect2=yes
 	;;
+visium-*-elf*)
+	tm_file="dbxelf.h elfos.h ${tm_file} visium/elf.h newlib-stdint.h"
+	tmake_file="visium/t-visium visium/t-crtstuff"
+	;;
 xstormy16-*-elf)
 	# For historical reasons, the target files omit the 'x'.
 	tm_file="dbxelf.h elfos.h newlib-stdint.h stormy16/stormy16.h"
Index: configure.ac
===
--- configure.ac	(revision 218617)
+++ configure.ac	(working copy)
@@ -4442,7 +4442,7 @@ esac
 case "$cpu_type" in
   aarch64 | alpha | arm | avr | bfin | cris | i386 | m32c | m68k | microblaze \
   | mips | nios2 | pa | rs6000 | score | sparc | spu | tilegx | tilepro \
-  | xstormy16 | xtensa)
+  | visium | xstormy16 | xtensa)
 insn="nop"
 ;;
   ia64 | s390)
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 218617)
+++ doc/invoke.texi	(working copy)
@@ -1062,6 +1062,10 @@ See RS/6000 and PowerPC Options.
 @emph{VAX Options}
 @gccoptlist{-mg  -mgnu  -munix}
 
+@emph{Visium Options}
+@gccoptlist{-mdebug -msim -mfpu -mno-fpu -mhard-float -msoft-float @gol
+-mcpu=@var{cpu-type} -mtune=@var{cpu-type} -msv-mode -muser-mode}
+
 @emph{VMS Options}
 @gccoptlist{-mvms-return-codes -mdebug-main=@var{prefix} -mmalloc64 @gol
 -mpointer-size=@var{size}}
@@ -11845,6 +11849,7 @@ platform.
 * TILEPro Options::
 * V850 Options::
 * VAX Options::
+* Visium Options::
 * VMS Options::
 * VxWorks Options::
 * x86-64 Options::
@@ -22456,6 +22461,77 @@ GNU assembler is being used.
 Output code for G-format floating-point numbers instead of D-format.
 @end table
 
+@node Visium Options
+@subsection Visium Options
+@cindex Visium options
+
+@table @gcctabopt
+
+@item -mdebug
+@opindex mdebug
+A program which performs file I/O and is destined to run on an MCM target
+should be linked with this option.  It causes the libraries libc.a and
+libdebug.a to be linked.  The program should be run on the target under
+the control of the GDB remote debugging stub.
+
+@item -msim
+@opindex msim
+A program which performs file I/O and is destined to run on the simulator
+should be linked with this option.  This causes libraries libc.a and libsim.a to
+be linked.
+
+@item -mfpu
+@itemx -mhard-float
+@opindex mfpu
+@opindex mhard-float
+Generate code containing floating-point instructions.  This is the
+default.
+
+@item -mno-fpu
+@itemx -msoft-float
+@opindex mno-fpu
+@opindex msoft-float
+Generate code containing library calls for floating-point.
+
+@option{-msoft-float} changes the calling convention in the output file;
+therefore, it is only useful if you compile @emph{all} of a program with
+this option.  In particular, you need to compile @file{libgcc.a}, the
+library that comes with GCC, with @option{-msoft-float} in order for
+this to work.
+
+@item -mcpu=@var{cpu_type}
+@opindex mcpu
+Set the instruction set, register set, and instruction scheduling parameters
+for machine type @var{cpu_type}.  Supported values for @var{cpu_type} are
+@samp{mcm}, @samp{gr5} and @samp{gr6}.
+
+@samp{mcm} is a synonym of @samp{gr5} present for backward compatibility.
+
+By default (unless configured otherwise), GCC generates code for the GR5
+variant of the Visium architecture.  
+
+With @option{-mcpu=gr6}, GCC generates code for the GR6 variant of the Visium
+architecture.  The only difference from GR5 code is that the compiler will
+generate block move instructions.
+
+@item -mtune=@var{cpu_type}
+@opindex mtune
+Set the instruction scheduling parameters for machine type @var{cpu_type},
+but do not set the instruction set or register set that the option
+@option{-mcpu=@var{cpu_type}} would.
+
+@item -msv-mode
+@opindex msv-mode
+Generate code for the supervisor mode, where there are no restrictions on
+the access to general registers.  This is the default.
+
+@item -muser-mode
+@opindex muser-mode
+Generate code for the user mode, where the access to some general registers
+is forbidden: on the GR5, registers r24 to r31 cannot be accessed in this
+mode; on the GR6, only registers r29 to r31 are affected.
+@end table
+
 @node VMS Options
 @subsection VMS Options
 
Index: doc/md.texi
===
--- doc/md.texi	(revision 218642)
+++ doc/md.texi	(working copy)
@@ -3974,6 +3974,56 @@ A 2-element vector constant with identic
 
 @end table
 
+@item Visium---@file{co

[PATCH 4/4] Add Visium support to gcc/testsuite

2014-12-11 Thread Eric Botcazou

gcc/testsuite/ChangeLog:

2014-12-11  Eric Botcazou  

* lib/target-supports.exp (check_profiling_available): Return 0 for
Visium.
(check_effective_target_tls_runtime): Likewise.
(check_effective_target_logical_op_short_circuit): Return 1 for Visium.
* gcc.dg/20020312-2.c: Adjust for Visium.
* gcc.dg/tls/thr-cse-1.c: Likewise.
* gcc.dg/tree-ssa/20040204-1.c: Likewise.
* gcc.dg/tree-ssa/loop-1.c: Likewise.
* gcc.dg/weak/typeof-2.c: Likewise.


-- 
Eric Botcazou

Index: lib/target-supports.exp
===
--- lib/target-supports.exp	(revision 218617)
+++ lib/target-supports.exp	(working copy)
@@ -538,6 +540,7 @@ proc check_profiling_available { test_wh
 	 || [istarget powerpc-*-elf]
 	 || [istarget rx-*-*]	
 	 || [istarget tic6x-*-elf]
+	 || [istarget visium-*-*]
 	 || [istarget xstormy16-*]
 	 || [istarget xtensa*-*-elf]
 	 || [istarget *-*-rtems*]
@@ -707,9 +710,9 @@ proc check_effective_target_tls_emulated
 # Return 1 if TLS executables can run correctly, 0 otherwise.
 
 proc check_effective_target_tls_runtime {} {
-# MSP430 runtime does not have TLS support, but just
+# The runtime does not have TLS support, but just
 # running the test below is insufficient to show this.
-if { [istarget msp430-*-*] } {
+if { [istarget msp430-*-*] || [istarget visium-*-*] } {
 	return 0
 }
 return [check_runtime tls_runtime {
@@ -6085,6 +6088,7 @@ proc check_effective_target_logical_op_s
 	 || [istarget s390*-*-*]
 	 || [istarget powerpc*-*-*]
 	 || [istarget nios2*-*-*]
+	 || [istarget visium-*-*]
 	 || [check_effective_target_arm_cortex_m] } {
 	return 1
 }
Index: gcc.dg/weak/typeof-2.c
===
--- gcc.dg/weak/typeof-2.c	(revision 218617)
+++ gcc.dg/weak/typeof-2.c	(working copy)
@@ -48,4 +48,6 @@ int bar3 (int x)
 // { dg-final { if [string match m68k-*-* $target_triplet ] {return} } }
 // Likewise for moxie targets.
 // { dg-final { if [string match moxie-*-* $target_triplet ] {return} } }
+// Likewise for Visium targets.
+// { dg-final { if [string match visium-*-* $target_triplet ] {return} } }
 // { dg-final { scan-assembler "baz3.*baz3.*baz3.*baz3.*baz3.*baz3" } }
Index: gcc.dg/tree-ssa/loop-1.c
===
--- gcc.dg/tree-ssa/loop-1.c	(revision 218617)
+++ gcc.dg/tree-ssa/loop-1.c	(working copy)
@@ -49,7 +49,7 @@ int xxx(void)
 /* CRIS keeps the address in a register.  */
 /* m68k sometimes puts the address in a register, depending on CPU and PIC.  */
 
-/* { dg-final { scan-assembler-times "foo" 5 { xfail hppa*-*-* ia64*-*-* sh*-*-* cris-*-* crisv32-*-* fido-*-* m68k-*-* i?86-*-mingw* i?86-*-cygwin* x86_64-*-mingw* } } } */
+/* { dg-final { scan-assembler-times "foo" 5 { xfail hppa*-*-* ia64*-*-* sh*-*-* cris-*-* crisv32-*-* fido-*-* m68k-*-* i?86-*-mingw* i?86-*-cygwin* x86_64-*-mingw* visium-*-* } } } */
 /* { dg-final { scan-assembler-times "foo,%r" 5 { target hppa*-*-* } } } */
 /* { dg-final { scan-assembler-times "= foo"  5 { target ia64*-*-* } } } */
 /* { dg-final { scan-assembler-times "call\[ \t\]*_foo" 5 { target i?86-*-mingw* i?86-*-cygwin* } } } */
@@ -57,3 +57,4 @@ int xxx(void)
 /* { dg-final { scan-assembler-times "jsr|bsrf|blink\ttr?,r18"  5 { target sh*-*-* } } } */
 /* { dg-final { scan-assembler-times "Jsr \\\$r" 5 { target cris-*-* } } } */
 /* { dg-final { scan-assembler-times "\[jb\]sr" 5 { target fido-*-* m68k-*-* } } } */
+/* { dg-final { scan-assembler-times "bra *tr,r\[1-9\]*,r21" 5 { target visium-*-* } } } */
Index: gcc.dg/tree-ssa/20040204-1.c
===
--- gcc.dg/tree-ssa/20040204-1.c	(revision 218617)
+++ gcc.dg/tree-ssa/20040204-1.c	(working copy)
@@ -33,5 +33,5 @@ void test55 (int x, int y)
that the && should be emitted (based on BRANCH_COST).  Fix this
by teaching dom to look through && and register all components
as true.  */
-/* { dg-final { scan-tree-dump-times "link_error" 0 "optimized" { xfail { ! "alpha*-*-* arm*-*-* aarch64*-*-* powerpc*-*-* cris-*-* crisv32-*-* hppa*-*-* i?86-*-* mmix-*-* mips*-*-* m68k*-*-* moxie-*-* nds32*-*-* sparc*-*-* spu-*-* x86_64-*-*" } } } } */
+/* { dg-final { scan-tree-dump-times "link_error" 0 "optimized" { xfail { ! "alpha*-*-* arm*-*-* aarch64*-*-* powerpc*-*-* cris-*-* crisv32-*-* hppa*-*-* i?86-*-* mmix-*-* mips*-*-* m68k*-*-* moxie-*-* nds32*-*-* sparc*-*-* spu-*-* visium-*-* x86_64-*-*" } } } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc.dg/tls/thr-cse-1.c
===
--- gcc.dg/tls/thr-cse-1.c	(revision 218617)
+++ gcc.dg/tls/thr-cse-1.c	(working copy)
@@ -18,11 +18,11 @@ int foo (int b, int c, int d)
   return a;
 }
 
-/* { dg-final { scan-assembler-not "emutls_get_address.*emutls_get_addres

Re: [PATCH 2/4] Add Visium support to libgcc

2014-12-11 Thread Joseph Myers

Do you have a reason for using fp-bit instead of soft-fp?

libgcc files are generally GPL+exception, not LGPL without exception with 
a very old FSF address (config/visium/div64.c, mod64.c, 
set_trampoline_parity.c, udiv64.c, udivmod64.c, umod64.c).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 3/4] Add Visium support to gcc

2014-12-11 Thread Joseph Myers

Use of `%s' in diagnostics is long obsoleted by %qs (in this case, using 
%qE with the identifier directly, rather than using IDENTIFIER_POINTER, is 
preferred).

INTVAL / UINTVAL return HOST_WIDE_INT / unsigned HOST_WIDE_INT, not long / 
unsigned long.  You have lots of uses of fprintf that presume they return 
long / unsigned long.
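
For illustration, a small host-side sketch of the portable way to print
such a value.  The names host_wide_int and HWI_PRINT_DEC below are
stand-ins invented for this example and assume a 64-bit type; in GCC
itself the macro intended for this purpose is HOST_WIDE_INT_PRINT_DEC.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for GCC's HOST_WIDE_INT (assumption: 64 bits wide).
typedef std::int64_t host_wide_int;

// Plays the role of HOST_WIDE_INT_PRINT_DEC: the format follows the
// type, so it stays correct on hosts where 'long' is only 32 bits.
#define HWI_PRINT_DEC "%" PRId64

std::string format_hwi(host_wide_int value)
{
  char buf[32];
  // Passing 'value' to "%ld" would be undefined where long is 32-bit.
  const int n = std::snprintf(buf, sizeof buf, HWI_PRINT_DEC, value);
  assert(n > 0 && n < static_cast<int>(sizeof buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```

The same idea applies to fprintf in the port: pair each INTVAL or UINTVAL
argument with the matching HOST_WIDE_INT format macro rather than with
%ld or %lu.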

As you have the interrupt attribute, you need to add this port to the list 
in extend.texi of ports with this attribute.  (Generally, check the 
checklist of pieces in sourcebuild.texi to update for a new port.)

At least one target for this port should be added to 
contrib/config-list.mk (and you should verify that the port builds cleanly 
with --enable-werror-always, for both 32-bit and 64-bit hosts, when 
building using current trunk GCC).

-- 
Joseph S. Myers
jos...@codesourcery.com

RE: [Ping] [PATCH, ARM, libgcc] New aeabi_idiv function for armv6-m

2014-12-11 Thread Hale Wang

Ping? Already applied to arm/embedded-4_9-branch, is it OK for trunk?

-Hale

> -Original Message-
> From: Joey Ye [mailto:joey.ye...@gmail.com]
> Sent: Thursday, November 27, 2014 10:01 AM
> To: Hale Wang
> Cc: gcc-patches
> Subject: Re: [PATCH, ARM, libgcc] New aeabi_idiv function for armv6-m
> 
> OK applying to arm/embedded-4_9-branch, though you still need maintainer
> approval into trunk.
> 
> - Joey
> 
> On Wed, Nov 26, 2014 at 11:43 AM, Hale Wang  wrote:
> > Hi,
> >
> > This patch ports the aeabi_idiv routine from Linaro Cortex-Strings
> > (https://git.linaro.org/toolchain/cortex-strings.git), which was
> > contributed by ARM under Free BSD license.
> >
> > The new aeabi_idiv routine is used to replace the one in
> > libgcc/config/arm/lib1funcs.S. This replacement happens within the
> > Thumb1 wrapper. The new routine is under LGPLv3 license.
> >
> > The main advantage of this version is that it can improve the
> > performance of the aeabi_idiv function for Thumb1. This solution will
> > also increase the code size. So it will only be used if __OPTIMIZE_SIZE__ is
> > not defined.
> >
> > Make check passed for armv6-m.
> >
> > OK for trunk?
> >
> > Thanks,
> > Hale Wang
> >
> > libgcc/ChangeLog:
> >
> > 2014-11-26  Hale Wang  
> >
> > * config/arm/lib1funcs.S: Add new wrapper.
> >
> > ===
> > diff --git a/libgcc/config/arm/lib1funcs.S
> > b/libgcc/config/arm/lib1funcs.S index b617137..de66c81 100644
> > --- a/libgcc/config/arm/lib1funcs.S
> > +++ b/libgcc/config/arm/lib1funcs.S
> > @@ -306,34 +306,12 @@ LSYM(Lend_fde):
> >  #ifdef __ARM_EABI__
> >  .macro THUMB_LDIV0 name signed
> >  #if defined(__ARM_ARCH_6M__)
> > -   .ifc \signed, unsigned
> > -   cmp r0, #0
> > -   beq 1f
> > -   mov r0, #0
> > -   mvn r0, r0  @ 0x
> > -1:
> > -   .else
> > -   cmp r0, #0
> > -   beq 2f
> > -   blt 3f
> > +
> > +   push{r0, lr}
> > mov r0, #0
> > -   mvn r0, r0
> > -   lsr r0, r0, #1  @ 0x7fff
> > -   b   2f
> > -3: mov r0, #0x80
> > -   lsl r0, r0, #24 @ 0x8000
> > -2:
> > -   .endif
> > -   push{r0, r1, r2}
> > -   ldr r0, 4f
> > -   adr r1, 4f
> > -   add r0, r1
> > -   str r0, [sp, #8]
> > -   @ We know we are not on armv4t, so pop pc is safe.
> > -   pop {r0, r1, pc}
> > -   .align  2
> > -4:
> > -   .word   __aeabi_idiv0 - 4b
> > +   bl  SYM(__aeabi_idiv0)
> > +   pop {r1, pc}
> > +
> >  #elif defined(__thumb2__)
> > .syntax unified
> > .ifc \signed, unsigned
> > @@ -927,7 +905,158 @@ LSYM(Lover7):
> > add dividend, work
> >.endif
> >  LSYM(Lgot_result):
> > -.endm
> > +.endm
> > +
> > +#if defined(__prefer_thumb__) && !defined(__OPTIMIZE_SIZE__)
> > +.macro BranchToDiv n, label
> > +   lsr curbit, dividend, \n
> > +   cmp curbit, divisor
> > +   blo \label
> > +.endm
> > +
> > +.macro DoDiv n
> > +   lsr curbit, dividend, \n
> > +   cmp curbit, divisor
> > +   bcc 1f
> > +   lsl curbit, divisor, \n
> > +   sub dividend, dividend, curbit
> > +
> > +1: adc result, result
> > +.endm
> > +
> > +.macro THUMB1_Div_Positive
> > +   mov result, #0
> > +   BranchToDiv #1, LSYM(Lthumb1_div1)
> > +   BranchToDiv #4, LSYM(Lthumb1_div4)
> > +   BranchToDiv #8, LSYM(Lthumb1_div8)
> > +   BranchToDiv #12, LSYM(Lthumb1_div12)
> > +   BranchToDiv #16, LSYM(Lthumb1_div16)
> > +LSYM(Lthumb1_div_large_positive):
> > +   mov result, #0xff
> > +   lsl divisor, divisor, #8
> > +   rev result, result
> > +   lsr curbit, dividend, #16
> > +   cmp curbit, divisor
> > +   blo 1f
> > +   asr result, #8
> > +   lsl divisor, divisor, #8
> > +   beq LSYM(Ldivbyzero_waypoint)
> > +
> > +1: lsr curbit, dividend, #12
> > +   cmp curbit, divisor
> > +   blo LSYM(Lthumb1_div12)
> > +   b   LSYM(Lthumb1_div16)
> > +LSYM(Lthumb1_div_loop):
> > +   lsr divisor, divisor, #8
> > +LSYM(Lthumb1_div16):
> > +   Dodiv   #15
> > +   Dodiv   #14
> > +   Dodiv   #13
> > +   Dodiv   #12
> > +LSYM(Lthumb1_div12):
> > +   Dodiv   #11
> > +   Dodiv   #10
> > +   Dodiv   #9
> > +   Dodiv   #8
> > +   bcs LSYM(Lthumb1_div_loop)
> > +LSYM(Lthumb1_div8):
> > +   Dodiv   #7
> > +   Dodiv   #6
> > +   Dodiv   #5
> > +LSYM(Lthumb1_div5):
> > +   Dodiv   #4
> > +LSYM(Lthumb1_div4):
> > +   Dodiv   #3
> > +LSYM(Lthumb1_div3):
> > +   Dodiv   #2
> > +LSYM(Lthumb1_div2):
> > +   Dodiv   #1
> > +LSYM(Lthumb1_div1):
> > +   sub divisor, dividend, divisor
> > +   bcs 1f
> > +   cpy divisor, dividend
> > +
> > +1: adc result, result
> > +
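
As a reading aid for the quoted patch: the DoDiv macro appears to perform
one step of shift-and-subtract division, comparing (dividend >> n) with
the divisor, subtracting (divisor << n) when it fits, and shifting the
resulting carry into the quotient.  Below is a minimal C++ model of that
step, assuming unsigned 32-bit operands.  The function names are
hypothetical, and the loop simply runs all 32 steps; it does not model the
BranchToDiv dispatch that lets the assembly skip leading steps.

```cpp
#include <cassert>
#include <cstdint>

struct DivResult {
  std::uint32_t quotient;
  std::uint32_t remainder;
};

// One step modeled on the DoDiv macro: 'carry' mirrors the flag that
// "adc result, result" shifts into the quotient.
inline void do_div_step(std::uint32_t &dividend, std::uint32_t divisor,
                        std::uint32_t &result, unsigned n)
{
  const bool carry = (dividend >> n) >= divisor;
  if (carry)
    dividend -= divisor << n;  // cannot overflow: divisor << n <= dividend
  result = (result << 1) | (carry ? 1u : 0u);
}

// Full unsigned division built from 32 such steps, highest bit first.
DivResult shift_subtract_divide(std::uint32_t dividend, std::uint32_t divisor)
{
  assert(divisor != 0);
  std::uint32_t result = 0;
  for (int n = 31; n >= 0; --n)
    do_div_step(dividend, divisor, result, static_cast<unsigned>(n));
  return DivResult{result, dividend};
}
```

What is left in the dividend after the last step is the remainder, which
matches how the assembly keeps reducing the dividend register in place.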

Re: [PATCH 3/4] Add Visium support to gcc

2014-12-11 Thread Hans-Peter Nilsson

On Fri, 12 Dec 2014, Joseph Myers wrote:
> At least one target for this port should be added to
> contrib/config-list.mk (and you should verify that the port builds cleanly
> with --enable-werror-always, for both 32-bit and 64-bit hosts, when
> building using current trunk GCC).

While doing that, beware that gcc has bugs causing some ports (I
forgot which ones) to get at least one spurious warning
apparently not attributable to the quality of the port.

PR(s) duly entered, but I can't quote the numbers (finding PRs
is not practical for me after the https change, but IIRC Joern
was the author).

brgds, H-P
PS. of course no excuse to not get the low-hanging fruit.

C++ PATCH for c++/57510 (memory leak with initializer_list)

2014-12-11 Thread Jason Merrill

We want to deal with initialization of an array reference/init-list 
temporary the same way that we handle initialization of an array variable.
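
A minimal sketch of the behavior this is meant to guarantee, modeled on the
testcase in the patch (the type and function names here are made up for
illustration): if constructing the second element of the init-list throws,
the already constructed first element must be destroyed, so no Counter
stays alive.

```cpp
#include <cassert>
#include <initializer_list>

// Tracks how many Counter objects are currently alive.
struct Counter {
  static int live;
  Counter() { ++live; }
  Counter(const Counter &) { ++live; }
  ~Counter() { --live; }
};

int Counter::live = 0;

// Construction fails as soon as a second Element is being built.
struct Element {
  Counter c;
  Element() { if (Counter::live > 1) throw 1; }
};

// Returns the number of Counter objects still alive after the failed
// initialization; a leak-free implementation returns 0.
int live_after_failed_init()
{
  assert(Counter::live == 0);
  try {
    auto list = { Element{}, Element{} };
    (void) list;
  } catch (int) {
    // Swallow the exception; only the live count matters here.
  }
  return Counter::live;
}
```

With the leak described in the PR, the first element would never be
destroyed and the function would return 1 instead.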


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ad88fa39b2c68d806b58563dfe1e19ecf8d143ba
Author: Jason Merrill 
Date:   Wed Dec 10 14:14:05 2014 -0500

	PR c++/57510
	* typeck2.c (split_nonconstant_init_1): Handle arrays here.
	(store_init_value): Not here.
	(split_nonconstant_init): Look through TARGET_EXPR.  No longer static.
	* cp-tree.h: Declare split_nonconstant_init.
	* call.c (set_up_extended_ref_temp): Use split_nonconstant_init.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index d8075bd..312dfdf 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -9574,7 +9574,7 @@ set_up_extended_ref_temp (tree decl, tree expr, vec **cleanups,
   else
 /* Create the INIT_EXPR that will initialize the temporary
variable.  */
-init = build2 (INIT_EXPR, type, var, expr);
+init = split_nonconstant_init (var, expr);
   if (at_function_scope_p ())
 {
   add_decl_expr (var);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index d41a834..ad1cc71 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6291,6 +6291,7 @@ extern int abstract_virtuals_error_sfinae	(tree, tree, tsubst_flags_t);
 extern int abstract_virtuals_error_sfinae	(abstract_class_use, tree, tsubst_flags_t);
 
 extern tree store_init_value			(tree, tree, vec**, int);
+extern tree split_nonconstant_init		(tree, tree);
 extern bool check_narrowing			(tree, tree, tsubst_flags_t);
 extern tree digest_init(tree, tree, tsubst_flags_t);
 extern tree digest_init_flags			(tree, tree, int);
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 92c0417..c53a9b5 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -604,6 +604,17 @@ split_nonconstant_init_1 (tree dest, tree init)
 case ARRAY_TYPE:
   inner_type = TREE_TYPE (type);
   array_type_p = true;
+  if ((TREE_SIDE_EFFECTS (init)
+	   && TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type))
+	  || array_of_runtime_bound_p (type))
+	{
+	  /* For an array, we only need/want a single cleanup region rather
+	 than one per element.  */
+	  tree code = build_vec_init (dest, NULL_TREE, init, false, 1,
+  tf_warning_or_error);
+	  add_stmt (code);
+	  return true;
+	}
   /* FALLTHRU */
 
 case RECORD_TYPE:
@@ -721,11 +732,13 @@ split_nonconstant_init_1 (tree dest, tree init)
perform the non-constant part of the initialization to DEST.
Returns the code for the runtime init.  */
 
-static tree
+tree
 split_nonconstant_init (tree dest, tree init)
 {
   tree code;
 
+  if (TREE_CODE (init) == TARGET_EXPR)
+init = TARGET_EXPR_INITIAL (init);
   if (TREE_CODE (init) == CONSTRUCTOR)
 {
   code = push_stmt_list ();
@@ -830,17 +843,7 @@ store_init_value (tree decl, tree init, vec** cleanups, int flags)
   && (TREE_SIDE_EFFECTS (value)
 	  || array_of_runtime_bound_p (type)
 	  || ! reduced_constant_expression_p (value)))
-{
-  if (TREE_CODE (type) == ARRAY_TYPE
-	  && (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (type))
-	  || array_of_runtime_bound_p (type)))
-	/* For an array, we only need/want a single cleanup region rather
-	   than one per element.  */
-	return build_vec_init (decl, NULL_TREE, value, false, 1,
-			   tf_warning_or_error);
-  else
-	return split_nonconstant_init (decl, value);
-}
+return split_nonconstant_init (decl, value);
   /* If the value is a constant, just put it in DECL_INITIAL.  If DECL
  is an automatic variable, the middle end will turn this into a
  dynamic initialization later.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist90.C b/gcc/testsuite/g++.dg/cpp0x/initlist90.C
new file mode 100644
index 000..330517a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist90.C
@@ -0,0 +1,35 @@
+// PR c++/57510
+// { dg-do run { target c++11 } }
+
+#include 
+
+struct counter
+{
+  static int n;
+
+  counter() { ++n; }
+  counter(const counter&) { ++n; }
+  ~counter() { --n; }
+};
+
+int counter::n = 0;
+
+struct X
+{
+X () { if (counter::n > 1) throw 1; }
+
+counter c;
+};
+
+int main ()
+{
+  try
+  {
+auto x = { X{}, X{} };
+  }
+  catch (...)
+  {
+if ( counter::n != 0 )
+  throw;
+  }
+}

C++ PATCH for c++/64248 (FUNCTION error)

2014-12-11 Thread Jason Merrill

It seems that my strict reading of the standard conflicts with existing 
practice in a way that is not useful.  So I'm reverting my earlier patch.
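
A small sketch of the kind of existing usage this keeps working, modeled on
the new testcase (the class and function names are hypothetical).  Note
that the contents of __func__ are implementation-defined; the comment below
assumes the usual behavior of yielding the unqualified function name.

```cpp
#include <cassert>
#include <string>

struct Label {
  Label(const char *s) : text(s) {}
  std::string text;
};

struct Holder {
  Holder(Label l) : label(l) {}
  Label label;
};

std::string name_seen_by_holder()
{
  // Must be parsed as an object definition initialized from an
  // expression, not as a function declaration with a parameter
  // called __func__.
  Holder h(Label(__func__));
  assert(!h.label.text.empty());
  return h.label.text;  // typically "name_seen_by_holder"
}
```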


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 9dfef51ff302e644acf0111685a7451867049959
Author: Jason Merrill 
Date:   Wed Dec 10 17:33:19 2014 -0500

	PR c++/64248
	Revert:
	* parser.c (cp_parser_unqualified_id): Handle __func__ here.
	(cp_parser_primary_expression): Not here.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 48dd64a..76725ef 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -4503,9 +4503,39 @@ cp_parser_primary_expression (cp_parser *parser,
 	case RID_FUNCTION_NAME:
 	case RID_PRETTY_FUNCTION_NAME:
 	case RID_C99_FUNCTION_NAME:
+	  {
+	non_integral_constant name;
+
 	/* The symbols __FUNCTION__, __PRETTY_FUNCTION__, and
-	   __func__ are the names of variables.  */
-	  goto id_expression;
+	   __func__ are the names of variables -- but they are
+	   treated specially.  Therefore, they are handled here,
+	   rather than relying on the generic id-expression logic
+	   below.  Grammatically, these names are id-expressions.
+
+	   Consume the token.  */
+	token = cp_lexer_consume_token (parser->lexer);
+
+	switch (token->keyword)
+	  {
+	  case RID_FUNCTION_NAME:
+		name = NIC_FUNC_NAME;
+		break;
+	  case RID_PRETTY_FUNCTION_NAME:
+		name = NIC_PRETTY_FUNC;
+		break;
+	  case RID_C99_FUNCTION_NAME:
+		name = NIC_C99_FUNC;
+		break;
+	  default:
+		gcc_unreachable ();
+	  }
+
+	if (cp_parser_non_integral_constant_expression (parser, name))
+	  return error_mark_node;
+
+	/* Look up the name.  */
+	return finish_fname (token->u.value);
+	  }
 
 	case RID_VA_ARG:
 	  {
@@ -4926,7 +4956,6 @@ cp_parser_unqualified_id (cp_parser* parser,
 			  bool optional_p)
 {
   cp_token *token;
-  tree id;
 
   /* Peek at the next token.  */
   token = cp_lexer_peek_token (parser->lexer);
@@ -4935,6 +4964,8 @@ cp_parser_unqualified_id (cp_parser* parser,
 {
 case CPP_NAME:
   {
+	tree id;
+
 	/* We don't know yet whether or not this will be a
 	   template-id.  */
 	cp_parser_parse_tentatively (parser);
@@ -5171,9 +5202,10 @@ cp_parser_unqualified_id (cp_parser* parser,
   }
 
 case CPP_KEYWORD:
-  switch (token->keyword)
+  if (token->keyword == RID_OPERATOR)
 	{
-	case RID_OPERATOR:
+	  tree id;
+
 	  /* This could be a template-id, so we try that first.  */
 	  cp_parser_parse_tentatively (parser);
 	  /* Try a template-id.  */
@@ -5203,19 +5235,6 @@ cp_parser_unqualified_id (cp_parser* parser,
 	}
 
 	  return id;
-
-	case RID_FUNCTION_NAME:
-	case RID_PRETTY_FUNCTION_NAME:
-	case RID_C99_FUNCTION_NAME:
-	  cp_lexer_consume_token (parser->lexer);
-	  /* Don't try to declare this while tentatively parsing a function
-	 declarator, as cp_make_fname_decl will fail.  */
-	  if (current_binding_level->kind != sk_function_parms)
-	finish_fname (token->u.value);
-	  return token->u.value;
-
-	default:
-	  break;
 	}
   /* Fall through.  */
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype-func.C b/gcc/testsuite/g++.dg/cpp0x/decltype-func.C
deleted file mode 100644
index 65dd27a..000
--- a/gcc/testsuite/g++.dg/cpp0x/decltype-func.C
+++ /dev/null
@@ -1,6 +0,0 @@
-// { dg-do compile { target c++11 } }
-
-void f() {
-  typedef decltype(__func__) T;
-  T x = __func__;		// { dg-error "array" }
-}
diff --git a/gcc/testsuite/g++.dg/other/error34.C b/gcc/testsuite/g++.dg/other/error34.C
index cb8fdae..d6f3eb5 100644
--- a/gcc/testsuite/g++.dg/other/error34.C
+++ b/gcc/testsuite/g++.dg/other/error34.C
@@ -4,4 +4,3 @@
 
 S () : str(__PRETTY_FUNCTION__) {}	// { dg-error "forbids declaration" "decl" }
 // { dg-error "only constructors" "constructor" { target *-*-* } 5 }
-// { dg-prune-output "__PRETTY_FUNCTION__" }
diff --git a/gcc/testsuite/g++.dg/parse/fnname2.C b/gcc/testsuite/g++.dg/parse/fnname2.C
new file mode 100644
index 000..7fc0f82
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/fnname2.C
@@ -0,0 +1,19 @@
+// PR c++/64248
+
+class A
+{
+public:
+A(const char* str) {};
+};
+
+class B
+{
+public:
+B(A a) {};
+};
+
+int main()
+{
+   B b(A(__func__));
+   return 0;
+}

Re: [PATCH] Do not download packages for graphite loop optimizations by default when using ./contrib/download_prerequisites

2014-12-11 Thread Chung-Ju Wu

2014-12-10 21:37 GMT+08:00 Richard Biener :
> On Wed, Dec 10, 2014 at 6:16 AM, Chung-Ju Wu  wrote:
>>
>> Thanks for the suggestion.
>> The following is the proposed patch to adjust the comment:
>>
>> Index: contrib/ChangeLog
>> ===
>> --- contrib/ChangeLog   (revision 218558)
>> +++ contrib/ChangeLog   (working copy)
>> @@ -1,3 +1,7 @@
>> +2014-12-10  Chung-Ju Wu  
>> +
>> +   * download_prerequisites: Modify the comment for GRAPHITE_LOOP_OPT.
>> +
>>  2014-12-09  Laurynas Biveinis  
>> Yury Gribov  
>>
>>
>> Index: contrib/download_prerequisites
>> ===
>> --- contrib/download_prerequisites  (revision 218558)
>> +++ contrib/download_prerequisites  (working copy)
>> @@ -19,9 +19,9 @@
>>  # You should have received a copy of the GNU General Public License
>>  # along with this program. If not, see http://www.gnu.org/licenses/.
>>
>> -# If you want to build GCC with the Graphite loop optimizations,
>> -# set GRAPHITE_LOOP_OPT=yes to download optional prerequisties
>> -# ISL Library and CLooG.
>> +# If you want to disable Graphite loop optimizations while building GCC,
>> +# DO NOT set GRAPHITE_LOOP_OPT as yes so that the ISL package will not
>> +# be downloaded.
>>  GRAPHITE_LOOP_OPT=yes
>>
>>
>> Is this OK for trunk?
>
> Ok.
>
> Thanks,
> Richard.
>

Thanks for approval.  Committed as Rev.218652:
  https://gcc.gnu.org/r218652


Best regards,
jasonwucj

C++ PATCH to remove "array of runtime bound" from -std=c++14

2014-12-11 Thread Jason Merrill

The C++ VLA paper, N3639, was voted into and then back out of the C++14 
standard, and currently seems likely not to ever be part of a published 
standard.  So I'm backing out most of my changes for that paper, such 
that -std=c++14 has the same VLA support as other standard modes.  We no 
longer throw bad_array_length.
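
For code that wants to stay portable, here is a hedged sketch of how the
__cpp_runtime_arrays macro defined by this patch could be used to choose
between a GNU VLA and a standard container.  The function name is made up
for this example, and the fallback branch is simply what a compiler
without the macro would take.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the squares 0*0 + 1*1 + ... + (n-1)*(n-1) using scratch storage
// whose size is only known at run time.
long sum_of_squares(int n)
{
  assert(n > 0);
#ifdef __cpp_runtime_arrays
  long squares[n];  // GNU VLA extension, no bad_array_length check
#else
  std::vector<long> squares(static_cast<std::size_t>(n));
#endif
  for (int i = 0; i < n; ++i)
    squares[i] = static_cast<long>(i) * i;

  long sum = 0;
  for (int i = 0; i < n; ++i)
    sum += squares[i];
  return sum;
}
```

Either branch gives the same result; the macro only tells the code whether
the compiler will accept the VLA without complaint.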


Tested x86_64-pc-linux-gnu, applying to trunk.
commit aa0d87578a3264820105a29099870dcf85e4ef98
Author: Jason Merrill 
Date:   Thu Dec 11 15:20:11 2014 -0500

	Remove N3639 "array of runtime length" from -std=c++14.
gcc/cp/
	* decl.c (compute_array_index_type): VLAs are not part of C++14.
	(create_array_type_for_decl, grokdeclarator): Likewise.
	* lambda.c (add_capture): Likewise.
	* pt.c (tsubst): Likewise.
	* rtti.c (get_tinfo_decl): Likewise.
	* semantics.c (finish_decltype_type): Likewise.
	* typeck.c (cxx_sizeof_or_alignof_type): Likewise.
	(cp_build_addr_expr_1): Likewise.
	* init.c (build_vec_init): Don't throw bad_array_length.
gcc/c-family/
	* c-cppbuiltin.c (c_cpp_builtins): Define __cpp_runtime_arrays if
	we aren't complaining about VLAs.
libstdc++-v3/
	* libsupc++/new (bad_array_length): Move...
	* bad_array_length.cc: ...here.
	* cxxabi.h, eh_aux_runtime.cc (__cxa_throw_bad_array_new_length): Also
	move to bad_array_length.cc.

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index c571d1b..54d3acd 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -828,6 +828,15 @@ c_cpp_builtins (cpp_reader *pfile)
 	 and were standardized for C++14.  */
   if (!pedantic || cxx_dialect > cxx11)
 	cpp_define (pfile, "__cpp_binary_literals=201304");
+
+  /* Arrays of runtime bound were removed from C++14, but we still
+	 support GNU VLAs.  Let's define this macro to a low number
+	 (corresponding to the initial test release of GNU C++) if we won't
+	 complain about use of VLAs.  */
+  if (c_dialect_cxx ()
+	  && (pedantic ? warn_vla == 0 : warn_vla <= 0))
+	cpp_define (pfile, "__cpp_runtime_arrays=198712");
+
   if (cxx_dialect >= cxx11)
 	{
 	  /* Set feature test macros for C++11  */
@@ -863,9 +872,6 @@ c_cpp_builtins (cpp_reader *pfile)
 	  cpp_define (pfile, "__cpp_variable_templates=201304");
 	  cpp_define (pfile, "__cpp_digit_separators=201309");
 	  //cpp_define (pfile, "__cpp_sized_deallocation=201309");
-	  /* We'll have to see where runtime arrays wind up.
-	 Let's put it in C++14 for now.  */
-	  cpp_define (pfile, "__cpp_runtime_arrays=201304");
 	}
 }
   /* Note that we define this for C as well, so that we know if
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 9659336..efc2001 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -8515,7 +8515,7 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
 	   /* We don't allow VLAs at non-function scopes, or during
 	  tentative template substitution.  */
 	   || !at_function_scope_p ()
-	   || (cxx_dialect < cxx14 && !(complain & tf_error)))
+	   || !(complain & tf_error))
 {
   if (!(complain & tf_error))
 	return error_mark_node;
@@ -8527,7 +8527,7 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
 	error ("size of array is not an integral constant-expression");
   size = integer_one_node;
 }
-  else if (cxx_dialect < cxx14 && pedantic && warn_vla != 0)
+  else if (pedantic && warn_vla != 0)
 {
   if (name)
 	pedwarn (input_location, OPT_Wvla, "ISO C++ forbids variable length array %qD", name);
@@ -8585,25 +8585,12 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
 
 	  stabilize_vla_size (itype);
 
-	  if (cxx_dialect >= cxx14 && flag_exceptions)
+	  if (flag_sanitize & SANITIZE_VLA
+	  && current_function_decl != NULL_TREE
+	  && !lookup_attribute ("no_sanitize_undefined",
+DECL_ATTRIBUTES
+(current_function_decl)))
 	{
-	  /* If the VLA bound is larger than half the address space,
-	 or less than zero, throw std::bad_array_length.  */
-	  tree comp = build2 (LT_EXPR, boolean_type_node, itype,
-  ssize_int (-1));
-	  comp = build3 (COND_EXPR, void_type_node, comp,
-			 throw_bad_array_length (), void_node);
-	  finish_expr_stmt (comp);
-	}
-	  else if (flag_sanitize & SANITIZE_VLA
-		   && current_function_decl != NULL_TREE
-		   && !lookup_attribute ("no_sanitize_undefined",
-	 DECL_ATTRIBUTES
-	   (current_function_decl)))
-	{
-	  /* From C++14 onwards, we throw an exception on a negative
-		 length size of an array; see above.  */
-
 	  /* We have to add 1 -- in the ubsan routine we generate
 		 LE_EXPR rather than LT_EXPR.  */
 	  tree t = fold_build2 (PLUS_EXPR, TREE_TYPE (itype), itype,
@@ -8730,10 +8717,6 @@ create_array_type_for_decl (tree name, tree type, tree size)
   retu

c-family PATCH to update __cpp_constexpr macro for C++14 constexpr support

2014-12-11 Thread Jason Merrill


A bit I forgot in the earlier C++14 constexpr work.

Tested x86_64-pc-linux-gnu, applying to trunk.
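
For readers who test this macro in their own code, here is a small sketch of how the two values in the patch (200704 for C++11, 201304 for C++14) might be used. It is an illustration only and assumes at least -std=c++11; the function name factorial is hypothetical.

```cpp
#include <cassert>

#if defined(__cpp_constexpr) && __cpp_constexpr >= 201304
// C++14 relaxed constexpr: loops and local mutation are allowed.
constexpr unsigned long factorial(unsigned n)
{
    unsigned long result = 1;
    for (unsigned i = 2; i <= n; ++i)
        result *= i;
    return result;
}
#else
// C++11 constexpr: a single return statement, so recurse instead.
constexpr unsigned long factorial(unsigned n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}
#endif

// Both variants can be evaluated at compile time.
static_assert(factorial(5) == 120, "factorial(5) should be 120");
```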
commit 0b9dbcebc4d3bf5e9281889f34d189fb7c42dde3
Author: Jason Merrill 
Date:   Thu Dec 11 22:19:36 2014 -0500

	* c-cppbuiltin.c (c_cpp_builtins): Enable C++14 __cpp_constexpr.

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 54d3acd..2dfecb6 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -845,7 +845,8 @@ c_cpp_builtins (cpp_reader *pfile)
 	  cpp_define (pfile, "__cpp_unicode_literals=200710");
 	  cpp_define (pfile, "__cpp_user_defined_literals=200809");
 	  cpp_define (pfile, "__cpp_lambdas=200907");
-	  cpp_define (pfile, "__cpp_constexpr=200704");
+	  if (cxx_dialect == cxx11)
+	cpp_define (pfile, "__cpp_constexpr=200704");
 	  cpp_define (pfile, "__cpp_range_based_for=200907");
 	  cpp_define (pfile, "__cpp_static_assert=200410");
 	  cpp_define (pfile, "__cpp_decltype=200707");
@@ -865,8 +866,7 @@ c_cpp_builtins (cpp_reader *pfile)
 	  cpp_define (pfile, "__cpp_return_type_deduction=201304");
 	  cpp_define (pfile, "__cpp_init_captures=201304");
 	  cpp_define (pfile, "__cpp_generic_lambdas=201304");
-	  //cpp_undef (pfile, "__cpp_constexpr");
-	  //cpp_define (pfile, "__cpp_constexpr=201304");
+	  cpp_define (pfile, "__cpp_constexpr=201304");
 	  cpp_define (pfile, "__cpp_decltype_auto=201304");
 	  cpp_define (pfile, "__cpp_aggregate_nsdmi=201304");
 	  cpp_define (pfile, "__cpp_variable_templates=201304");
diff --git a/gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C b/gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C
index d271752..36e1135 100644
--- a/gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C
+++ b/gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C
@@ -47,12 +47,6 @@
 #  error "__cpp_lambdas != 200907"
 #endif
 
-#ifndef __cpp_constexpr
-#  error "__cpp_constexpr"
-#elif __cpp_constexpr != 200704
-#  error "__cpp_constexpr != 200704"
-#endif
-
 #ifndef __cpp_range_based_for
 #  error "__cpp_range_based_for"
 #elif __cpp_range_based_for != 200907
@@ -145,11 +139,10 @@
 #  error "__cpp_generic_lambdas != 201304"
 #endif
 
-//  TODO: Change 200704 to 201304 when C++14 constexpr goes in.
 #ifndef __cpp_constexpr
 #  error "__cpp_constexpr"
-#elif __cpp_constexpr != 200704
-#  error "__cpp_constexpr != 200704"
+#elif __cpp_constexpr != 201304
+#  error "__cpp_constexpr != 201304"
 #endif
 
 #ifndef __cpp_decltype_auto

[Committed] [PATCH, ifcvt] Fix PR63917

2014-12-11 Thread Zhenqiang Chen



> -----Original Message-----
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Wednesday, December 10, 2014 8:55 AM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Ping] [PATCH, ifcvt] Fix PR63917
> 
> On 12/04/2014 05:16 PM, Zhenqiang Chen wrote:
> > +static rtx
> > +cc_in_cond (rtx cond)
> > +{
> > +  if ((HAVE_cbranchcc4) && cond
> 
> Silly parens around the HAVE_cbranchcc4.

Removed.

> > +  && (GET_MODE_CLASS (GET_MODE (XEXP (cond, 0))) == MODE_CC))
> 
> More silly parens around the ==.

Removed. 

> > +  /* Skip it if the instruction to be moved might clobber CC.  */
> > +  cc = cc_in_cond (cond);
> > +  if (cc)
> > +if (set_of (cc, insn_a)
> > +   || (insn_b && set_of (XEXP (cond, 0), insn_b)))
> > +  return FALSE;
> 
> Don't nest if's when an && will do; if the && won't do, always use braces.
> 
> It looks like the insn_b test can be simpler, since the non-null return from
> cc_in_cond is always XEXP (cond, 0).
> 
> So:
> 
>   if (cc
>   && (set_of (cc, insn_a)
>   || (insn_b && set_of (cc, insn_b))))
> return FALSE;
> 
> Ok with those changes.

Updated and committed @r218658.
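
The shape of the reviewed condition can be sketched with a toy model. This is not GCC code: ToyInsn, toy_sets and toy_may_move are hypothetical stand-ins for an RTL insn, set_of, and the new check. An empty register name plays the role of cc_in_cond returning NULL_RTX.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an RTL insn: just the names of registers it writes.
struct ToyInsn {
    std::vector<std::string> sets;
};

// Toy analogue of set_of: does INSN write REG?
bool toy_sets(const std::string &reg, const ToyInsn &insn)
{
    for (const std::string &r : insn.sets)
        if (r == reg)
            return true;
    return false;
}

// Same shape as the reviewed condition: if the condition uses a CC
// register, refuse when insn_a, or insn_b when present, writes it.
bool toy_may_move(const std::string &cc, const ToyInsn &insn_a,
                  const ToyInsn *insn_b)
{
    if (!cc.empty()
        && (toy_sets(cc, insn_a)
            || (insn_b && toy_sets(cc, *insn_b))))
        return false;
    return true;
}
```

A single combined condition keeps the early return in one place, which is the point of the review comment about nested ifs.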

Here is the final patch.

Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 218657)
+++ gcc/ifcvt.c (working copy)
@@ -1016,6 +1016,18 @@
   0, 0, outmode, y);
 }
 
+/* Return the CC reg if it is used in COND.  */
+
+static rtx
+cc_in_cond (rtx cond)
+{
+  if (HAVE_cbranchcc4 && cond
+  && GET_MODE_CLASS (GET_MODE (XEXP (cond, 0))) == MODE_CC)
+return XEXP (cond, 0);
+
+  return NULL_RTX;
+}
+
 /* Return sequence of instructions generated by if conversion.  This
function calls end_sequence() to end the current stream, ensures
that are instructions are unshared, recognizable non-jump insns.
@@ -1026,6 +1038,7 @@
 {
   rtx_insn *insn;
   rtx_insn *seq = get_insns ();
+  rtx cc = cc_in_cond (if_info->cond);
 
   set_used_flags (if_info->x);
   set_used_flags (if_info->cond);
@@ -1040,7 +1053,9 @@
  allows proper placement of required clobbers.  */
   for (insn = seq; insn; insn = NEXT_INSN (insn))
 if (JUMP_P (insn)
-   || recog_memoized (insn) == -1)
+   || recog_memoized (insn) == -1
+  /* Make sure new generated code does not clobber CC.  */
+   || (cc && set_of (cc, insn)))
   return NULL;
 
   return seq;
@@ -2544,6 +2559,7 @@
   rtx_insn *insn_a, *insn_b;
   rtx set_a, set_b;
   rtx orig_x, x, a, b;
+  rtx cc;
 
   /* We're looking for patterns of the form
 
@@ -2655,6 +2671,13 @@
   if_info->a = a;
   if_info->b = b;
 
+  /* Skip it if the instruction to be moved might clobber CC.  */
+  cc = cc_in_cond (cond);
+  if (cc
+  && (set_of (cc, insn_a)
+ || (insn_b && set_of (cc, insn_b))))
+return FALSE;
+
   /* Try optimizations in some approximation of a useful order.  */
   /* ??? Should first look to see if X is live incoming at all.  If it
  isn't, we don't need anything but an unconditional set.  */
@@ -2811,6 +2834,7 @@
   rtx cond)
 {
   rtx_insn *insn;
+  rtx cc = cc_in_cond (cond);
 
/* We can only handle simple jumps at the end of the basic block.
   It is almost impossible to update the CFG otherwise.  */
@@ -2868,6 +2892,10 @@
  && modified_between_p (src, insn, NEXT_INSN (BB_END (bb))))
return FALSE;
 
+  /* Skip it if the instruction to be moved might clobber CC.  */
+  if (cc && set_of (cc, insn))
+   return FALSE;
+
   vals->put (dest, src);
 
   regs->safe_push (dest);
Index: gcc/testsuite/gcc.dg/pr64007.c
===
--- gcc/testsuite/gcc.dg/pr64007.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr64007.c  (revision 0)
@@ -0,0 +1,50 @@
+/* { dg-options " -O3 " } */
+/* { dg-do run } */
+
+#include <assert.h>
+
+int d, i;
+
+struct S
+{
+  int f0;
+} *b, c, e, h, **g = &b;
+
+static struct S *f = &e;
+
+int
+fn1 (int p)
+{
+  int a = 0;
+  return a || p < 0 || p >= 2 || 1 >> p;
+}
+
+int
+main ()
+{
+  int k = 1, l, *m = &c.f0;
+
+  for (;;)
+{
+  l = fn1 (i);
+  *m = k && i;
+  if (l)
+   {
+ int n[1] = {0};
+   }
+  break;
+}
+
+  *g = &h;
+
+  assert (b);
+
+  if (d)
+(*m)--;
+  d = (f != 0) | (i >= 0);
+
+  if (c.f0 != 0)
+__builtin_abort ();
+
+  return 0;
+}

RE: [PATCH] Fix PR 61225

2014-12-11 Thread Zhenqiang Chen



> -----Original Message-----
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Wednesday, December 10, 2014 3:16 AM
> To: Segher Boessenkool; Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Fix PR 61225
> 
> On 12/09/14 12:07, Segher Boessenkool wrote:
> > On Tue, Dec 09, 2014 at 05:49:18PM +0800, Zhenqiang Chen wrote:
> >>> Do you need to verify SETA and SETB satisfy single_set?  Or has that
> >>> already been done elsewhere?
> >>
> >> A is NEXT_INSN (insn)
> >> B is prev_nonnote_nondebug_insn (insn),
> >>
> >> For I1 -> I2 -> B; I2 -> A;
> >> LOG_LINK can make sure I1 and I2 are single_set,
> >
> > It cannot, not anymore anyway.  LOG_LINKs can point to an insn with
> > multiple SETs; multiple LOG_LINKs can point to such an insn.
> So let's go ahead and put a single_set test in this function.
> 
> >>> Is this fragment really needed?  Does it ever trigger?  I'd think
> >>> that for > 2 uses punting would be fine.  Do we really commonly have
> >>> cases with > 2 uses, but where they're all in SETA and SETB?
> >
> > Can't you just check for a death note on the second insn?  Together
> > with reg_used_between_p?
> Yea, that'd accomplish the same thing I think Zhenqiang is trying to catch
> and is simpler than walking the lists.

Updated.  Checking for a death note is enough, since b is
prev_nonnote_nondebug_insn (a).
 
> >
>  +  /* Try to combine a compare insn that sets CC
>  + with a preceding insn that can set CC, and maybe with its
>  + logical predecessor as well.
>  + We need this special code because data flow connections
>  + do not get entered in LOG_LINKS.  */
> >
> > I think you mean "not _all_ data flow connections"?
> I almost said something about this comment, but figured I was nitpicking
> too much :-)

Updated. 

> >>> So you've got two new combine cases here, but I think the testcase
> >>> only tests one of them.  Can you include a testcase for both of the
> >>> major paths above (I1->I2->I3; I2->insn and I2->I3; I2->INSN)?
> >>
> >> pr61225.c is the case to cover I1->I2->I3; I2->insn.
> >>
> >> For I2 -> I3; I2 -> insn, I tried my test cases and found peephole2
> >> can also handle them. So I removed the code from the patch.
> >
> > Why?  The simpler case has much better chances of being used.
> The question is: does it actually catch anything not already handled?  I
> guess you could argue that doing it in combine is better than peep2 and
> I'd agree with that.
> 
> >
> > In fact, there are many more cases you could handle:
> >
> > You handle
> >
> > I1 -> I2 -> I3; I2 -> insn
> >I2 -> I3; I2 -> insn
> >
> > but there are also
> >
> > I1,I2 -> I3; I2 -> insn
> >
> > and the many 4-insn combos, too.
> Yes, but I wonder how much of this is really necessary in practice.  We
> could do exhaustive testing here, but I suspect the payoff isn't all
> that great.  Thus I'm comfortable with faulting in the cases we actually
> find are useful in practice.
> 
> >
> >> +/* A is a compare (reg1, 0) and B is SINGLE_SET which SET_SRC is reg2.
> >> +   It returns TRUE, if reg1 == reg2, and no other refer of reg1
> >> +   except A and B.  */
> >
> > That sounds like the only correct inputs are such a compare etc., but the
> > routine tests whether that is true.
> Correct, the RTL has to have a specific form and that is tested for.
> Comment updates can't hurt.
 
Updated.
 
> >
> >> +static bool
> >> +can_reuse_cc_set_p (rtx_insn *a, rtx_insn *b)
> >> +{
> >> +  rtx seta = single_set (a);
> >> +  rtx setb = single_set (b);
> >> +
> >> +  if (BLOCK_FOR_INSN (a) != BLOCK_FOR_INSN (b)
> >
> > Neither the comment nor the function name mention this.  This test is
> > better placed in the caller of this function, anyway.
> Didn't consider it terribly important.  Moving it to the caller doesn't
> change anything significantly, though I would agree it's marginally
> cleaner.

Updated.
 
> >
> >> @@ -3323,7 +3396,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
> >>  rtx old = newpat;
> >>  total_sets = 1 + extra_sets;
> >>  newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (total_sets));
> >> -XVECEXP (newpat, 0, 0) = old;
> >> +
> >> +if (to_combined_insn)
> >> +  XVECEXP (newpat, 0, --total_sets) = old;
> >> +else
> >> +  XVECEXP (newpat, 0, 0) = old;
> >>}
> >
> > Is this correct?  If so, it needs a big fat comment, because it is
> > not exactly obvious :-)
> >
> > Also, it doesn't handle at all the case where the new pattern already is
> > a PARALLEL; can that never happen?
> I'd convinced myself it was.  But yes, a comment here would be good.

The following comments are added.

+   /* This is a hack to match i386 instruction pattern, which
+   is like
+   (parallel [
+   (set (reg:CCZ 17 flags)
+   ...)
+   (set ...)})
+   we h

[Patch, gcc/flag-types.h + Fortran] PR54687 - Fortran options cleanup

2014-12-11 Thread Tobias Burnus

This patch cleans up Fortran's option handling and moves it closer to 
the common way of handling options. Besides being a nice cleanup, there 
are, as Manuel points out in the PR, a couple of further reasons why 
this makes sense. I have not yet touched all options, but one has to 
start somewhere.
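
To give a feel for the enum-backed option values, here is a standalone sketch. The enum is copied from the flag-types.h hunk of this patch; parse_fcoarray and its lookup table are hypothetical and only imitate what an Enum() table in lang.opt provides, they are not the generated option code.

```cpp
#include <cassert>
#include <cstring>

// Enum values as they appear in the flag-types.h hunk of the patch.
enum gfc_fcoarray {
    GFC_FCOARRAY_NONE = 0,
    GFC_FCOARRAY_SINGLE,
    GFC_FCOARRAY_LIB
};

// Hypothetical parser for the argument of -fcoarray=; returns false
// for an unknown value and leaves *out untouched.
bool parse_fcoarray(const char *arg, gfc_fcoarray *out)
{
    static const struct { const char *name; gfc_fcoarray value; } table[] = {
        { "none",   GFC_FCOARRAY_NONE },
        { "single", GFC_FCOARRAY_SINGLE },
        { "lib",    GFC_FCOARRAY_LIB },
    };
    for (const auto &entry : table)
        if (std::strcmp(arg, entry.name) == 0) {
            *out = entry.value;
            return true;
        }
    return false;
}
```

With the value stored in a plain variable, a test such as `flag_coarray == GFC_FCOARRAY_LIB` no longer needs to go through gfc_option.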


Built and currently regtesting on x86-64-gnu-linux.
OK for the trunk?

Tobias
2014-12-12  Tobias Burnus  

	PR fortran/54687
gcc/
	* flag-types.h (gfc_init_local_real, gfc_fcoarray,
	gfc_convert): New enums; moved from fortran/.

gcc/fortran/
	* gfortran.h (gfc_option_t): Remove flags which now
	have a Var().
	(init_local_real, gfc_fcoarray): Moved to ../flag-types.h.
	* libgfortran.h (unit_convert): Add comment.
	* lang.opt (flag-aggressive_function_elimination,
	flag-align_commons, flag-all_intrinsics,
	flag-allow_leading_underscore, flag-automatic, flag-backslash,
	flag-backtrace, flag-blas_matmul_limit, flag-convert,
	flag-cray_pointer, flag-dollar_ok, flag-dump_fortran_original,
	flag-dump_fortran_optimized, flag-external_blas, flag-f2c,
	flag-implicit_none, flag-init_real, flag-max_array_constructor,
	flag-module_private, flag-pack_derived, flag-range_check,
	flag-recursive, flag-repack_arrays, flag-coarray, flag-sign_zero,
	flag-underscoring): Add Var() and, where applicable, Enum().
	* options.c (gfc_handle_coarray_option): Remove.
	(gfc_init_options, gfc_post_options, gfc_handle_fpe_option,
	gfc_handle_option): Update for *.opt changes.
	* arith.c: Update for flag-variable name changes.
	* array.c: Ditto.
	* check.c: Ditto.
	* cpp.c: Ditto.
	* decl.c: Ditto.
	* expr.c: Ditto.
	* f95-lang.c: Ditto.
	* frontend-passes.c: Ditto.
	* intrinsic.c: Ditto.
	* io.c: Ditto.
	* match.c: Ditto.
	* module.c: Ditto.
	* parse.c: Ditto.
	* primary.c: Ditto.
	* resolve.c: Ditto.
	* scanner.c: Ditto.
	* simplify.c: Ditto.
	* symbol.c: Ditto.
	* trans-array.c: Ditto.
	* trans-common.c: Ditto.
	* trans-decl.c: Ditto.
	* trans-expr.c: Ditto.
	* trans-intrinsic.c: Ditto.
	* trans-openmp.c: Ditto.
	* trans-stmt.c: Ditto.
	* trans-types.c: Ditto.
	* trans.c: Ditto.

diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 52ff7ee..81e8fb8 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -263,4 +263,38 @@ enum lto_partition_model {
   LTO_PARTITION_MAX = 4
 };
 
+
+/* gfortran -finit-real= values.  */
+
+enum gfc_init_local_real
+{
+  GFC_INIT_REAL_OFF = 0,
+  GFC_INIT_REAL_ZERO,
+  GFC_INIT_REAL_NAN,
+  GFC_INIT_REAL_SNAN,
+  GFC_INIT_REAL_INF,
+  GFC_INIT_REAL_NEG_INF
+};
+
+/* gfortran -fcoarray= values.  */
+
+enum gfc_fcoarray
+{
+  GFC_FCOARRAY_NONE = 0,
+  GFC_FCOARRAY_SINGLE,
+  GFC_FCOARRAY_LIB
+};
+
+
+/* gfortran -fconvert= values; used for unformatted I/O.
+   Keep in sync with GFC_CONVERT_* in gcc/fortran/libgfortran.h.   */
+enum gfc_convert
+{
+  GFC_FLAG_CONVERT_NATIVE = 0,
+  GFC_FLAG_CONVERT_SWAP,
+  GFC_FLAG_CONVERT_BIG,
+  GFC_FLAG_CONVERT_LITTLE
+};
+
+
 #endif /* ! GCC_FLAG_TYPES_H */
diff --git a/gcc/fortran/arith.c b/gcc/fortran/arith.c
index 6394547..e8a5efe 100644
--- a/gcc/fortran/arith.c
+++ b/gcc/fortran/arith.c
@@ -301,7 +301,7 @@ gfc_check_integer_range (mpz_t p, int kind)
 }
 
 
-  if (gfc_option.flag_range_check == 0)
+  if (flag_range_check == 0)
 return result;
 
   if (mpz_cmp (p, gfc_integer_kinds[i].min_int) < 0
@@ -333,12 +333,12 @@ gfc_check_real_range (mpfr_t p, int kind)
 
   if (mpfr_inf_p (p))
 {
-  if (gfc_option.flag_range_check != 0)
+  if (flag_range_check != 0)
 	retval = ARITH_OVERFLOW;
 }
   else if (mpfr_nan_p (p))
 {
-  if (gfc_option.flag_range_check != 0)
+  if (flag_range_check != 0)
 	retval = ARITH_NAN;
 }
   else if (mpfr_sgn (q) == 0)
@@ -348,14 +348,14 @@ gfc_check_real_range (mpfr_t p, int kind)
 }
   else if (mpfr_cmp (q, gfc_real_kinds[i].huge) > 0)
 {
-  if (gfc_option.flag_range_check == 0)
+  if (flag_range_check == 0)
 	mpfr_set_inf (p, mpfr_sgn (p));
   else
 	retval = ARITH_OVERFLOW;
 }
   else if (mpfr_cmp (q, gfc_real_kinds[i].subnormal) < 0)
 {
-  if (gfc_option.flag_range_check == 0)
+  if (flag_range_check == 0)
 	{
 	  if (mpfr_sgn (p) < 0)
 	{
@@ -736,7 +736,7 @@ gfc_arith_divide (gfc_expr *op1, gfc_expr *op2, gfc_expr **resultp)
   break;
 
 case BT_REAL:
-  if (mpfr_sgn (op2->value.real) == 0 && gfc_option.flag_range_check == 1)
+  if (mpfr_sgn (op2->value.real) == 0 && flag_range_check == 1)
 	{
 	  rc = ARITH_DIV0;
 	  break;
@@ -748,7 +748,7 @@ gfc_arith_divide (gfc_expr *op1, gfc_expr *op2, gfc_expr **resultp)
 
 case BT_COMPLEX:
   if (mpc_cmp_si_si (op2->value.complex, 0, 0) == 0
-	  && gfc_option.flag_range_check == 1)
+	  && flag_range_check == 1)
 	{
 	  rc = ARITH_DIV0;
 	  break;
@@ -863,7 +863,7 @@ arith_power (gfc_expr *op1, gfc_expr *op2, gfc_expr **resultp)
 		int i;
 		i = gfc_validate_kind (BT_INTEGER, result->ts.kind, false);
 
-		if (gfc_option.flag_range_check)
+		if (flag_range_check)

RE: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)

2014-12-11 Thread Zhenqiang Chen



> -----Original Message-----
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Tuesday, November 25, 2014 5:25 PM
> To: Zhenqiang Chen
> Cc: Marcus Shawcroft; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)
> 
> On 11/25/2014 09:41 AM, Zhenqiang Chen wrote:
> > I want to confirm with you two things before I rework it.
> > (1) expand_insn needs an optab_handler as input. Do I need to define a
> ccmp_optab with different mode support in optabs.def?
> 
> No, look again: expand_insn needs an enum insn_code as input.  Since this is
> the backend, you can use any icode name you like, which means that you can
> use CODE_FOR_ccmp_and etc directly.
> 
> > (2) To make sure later operands not clobber CC, all operands are expanded
> before ccmp-first in current implementation. If taking tree/gimple as input,
> what's your preferred logic to guarantee CC not clobbered?
> 
> Hmm.  Perhaps the target hook will need to output two sequences, each of
> which will be concatenated while looping around the calls to gen_ccmp_next.
> The first sequence will be operand preparation and the second sequence will
> be ccmp generation.
> 
> Something like
> 
> bool
> aarch64_gen_ccmp_start(rtx *prep_seq, rtx *gen_seq,
>int cmp_code, int bit_code,
>tree op0, tree op1) {
>   bool success;
> 
>   start_sequence ();
>   // Widen and expand operands
>   *prep_seq = get_insns ();
>   end_sequence ();
> 
>   start_sequence ();
>   // Generate the first compare
>   *gen_seq = get_insns ();
>   end_sequence ();
> 
>   return success;
> }
> 
> bool
> aarch64_gen_ccmp_next(rtx *prep_seq, rtx *gen_seq,
>   rtx prev, int cmp_code, int bit_code,
>   tree op0, tree op1) {
>   bool success;
> 
>   push_to_sequence (*prep_seq);
>   // Widen and expand operands
>   *prep_seq = get_insns ();
>   end_sequence ();
> 
>   push_to_sequence (*gen_seq);
>   // Generate the next ccmp
>   *gen_seq = get_insns ();
>   end_sequence ();
> 
>   return success;
> }
> 
> If there are ever any failures, the middle-end can simply discard the
> sequences.  If everything succeeds, it simply calls emit_insn on both
> sequences.
> 
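
The two-sequence idea suggested above can be modelled in a few lines. This is a toy sketch, not the AArch64 hook: ToySeq, toy_ccmp_step and toy_expand are hypothetical names, and strings stand in for insns. Each step appends to a preparation sequence and a compare sequence; on any failure both are discarded, and on success all preparation is emitted before any compare, so operand setup cannot clobber CC between compares.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using ToySeq = std::vector<std::string>;

// Hypothetical stand-in for one gen_ccmp_first/gen_ccmp_next step:
// appends operand preparation and the compare to separate sequences.
bool toy_ccmp_step(ToySeq &prep_seq, ToySeq &gen_seq,
                   const std::string &operand, bool supported)
{
    if (!supported)
        return false;
    prep_seq.push_back("prepare " + operand);
    gen_seq.push_back("ccmp " + operand);
    return true;
}

// Middle-end side: emit prep then gen on success, nothing on failure.
ToySeq toy_expand(const std::vector<std::pair<std::string, bool>> &ops)
{
    ToySeq prep_seq, gen_seq;
    for (const auto &op : ops)
        if (!toy_ccmp_step(prep_seq, gen_seq, op.first, op.second))
            return ToySeq();  // discard both sequences
    ToySeq out(prep_seq);
    out.insert(out.end(), gen_seq.begin(), gen_seq.end());
    return out;
}
```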
 
Thanks for the comments. The updated patch is attached.

Note: the function "aarch64_code_to_ccmode" is the same as it was before the revert.

ChangeLog:
2014-12-12  Zhenqiang Chen  

* ccmp.c (expand_ccmp_next): New function.
(expand_ccmp_expr_1, expand_ccmp_expr): Handle operand insn sequence
and compare insn sequence.
* config/aarch64/aarch64.c (aarch64_code_to_ccmode,
aarch64_gen_ccmp_first, aarch64_gen_ccmp_next): New functions.
(TARGET_GEN_CCMP_FIRST, TARGET_GEN_CCMP_NEXT): New macros.
* config/aarch64/aarch64.md (*ccmp_and): Changed to ccmp_and.
(*ccmp_ior): Changed to ccmp_ior.
(cmp): New pattern.
* doc/tm.texi (TARGET_GEN_CCMP_FIRST, TARGET_GEN_CCMP_NEXT): Update
parameters.
* target.def (gen_ccmp_first, gen_ccmp_next): Update parameters.

testsuite/ChangeLog:
2014-12-12  Zhenqiang Chen  

* gcc.dg/pr64015.c: New test.


gen-ccmp.patch
Description: Binary data
