[PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
Description:
* This patch removes -mvsx and -maltivec for go aix/ppc. These options
  don't seem compatible with Go stack layout.

Tests:
* AIX 7.2: Configure/Build: SUCCESS

Changelog:
* go/gospec.c (lang_specific_driver): disable -mvsx and -maltivec
  for aix/ppc.

Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France

Index: gospec.c
===
--- gospec.c	(revision 269987)
+++ gospec.c	(working copy)
@@ -427,6 +427,14 @@ lang_specific_driver (struct cl_decoded_option **i
       j++;
 #endif
 
+#if defined(TARGET_AIX) && defined(TARGET_32BIT)
+  /* aix/ppc doesn't support -mvsx and -maltivec with Go */
+  generate_option (OPT_mvsx, NULL, 0, CL_DRIVER, &new_decoded_options[j]);
+  j++;
+  generate_option (OPT_maltivec, NULL, 0, CL_DRIVER, &new_decoded_options[j]);
+  j++;
+#endif
+
   *in_decoded_options_count = j;
   *in_decoded_options = new_decoded_options;
   *in_added_libraries = added_libraries;
[PATCH,AIX] disables -flto on Go tests for AIX
Description:
* This patch disables Go tests with -flto on AIX.

Tests:
* AIX 7.2: Configure/Build: SUCCESS

Changelog:
* lib/go-torture.exp: add check for lto option

Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France

Index: go-torture.exp
===
--- go-torture.exp	(revision 269987)
+++ go-torture.exp	(working copy)
@@ -34,10 +34,13 @@ if ![info exists TORTURE_OPTIONS] {
 	{ -O2 -fomit-frame-pointer -finline-functions -funroll-loops } \
 	{ -O2 -fbounds-check } \
 	{ -O3 -g } \
-	{ -Os } \
-	{ -flto }]
+	{ -Os }]
 }
 
+if [check_effective_target_lto] {
+    set TORTURE_OPTIONS \
+	[concat $TORTURE_OPTIONS [list {-flto}]]
+}
 
 #
 # go-torture-compile -- compile a go.go-torture testcase.
Re: C++ PATCH for c++/89214 - ICE when initializing aggregates with bases
On Mar 28 2019, Marek Polacek wrote:

> Andreas, could you please find out why we're not hitting this code in
> digest_init_r:
>
> 1210	      tree elt = CONSTRUCTOR_ELT (stripped_init, 0)->value;
> 1211	      if (reference_related_p (type, TREE_TYPE (elt)))

This is never executed if flag_checking is false, of course.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
Hi

On 30/03/19 19:00, Ville Voutilainen wrote:

-      template
+      template
+	decltype(auto)
+	__visitor_result_type(_Visitor&& __visitor, _Variants&&... __variants)
+	{
+	  if constexpr(__use_index)
+	    return __detail::__variant::__variant_idx_cookie{};
+	  else
+	    return std::forward<_Visitor>(__visitor)(
+	      std::get<0>(std::forward<_Variants>(__variants))...);
+	}

If I'm not misreading something, the new function will usually be
compiled/optimized to something very small and isn't constexpr, thus
normally it should be explicitly marked inline. Or is the problem the
missing constexpr? (sorry, I didn't really study the new code)

Paolo.
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:30, Paolo Carlini wrote: > > Hi > > On 30/03/19 19:00, Ville Voutilainen wrote: > > - template > > + template > > +decltype(auto) > > +__visitor_result_type(_Visitor&& __visitor, _Variants&&... __variants) > > +{ > > + if constexpr(__use_index) > > +return __detail::__variant::__variant_idx_cookie{}; > > + else > > + return std::forward<_Visitor>(__visitor)( > > + std::get<0>(std::forward<_Variants>(__variants))...); > > +} > > If I'm not misreading something, the new function will be usually > compiled/optimized to something very small and isn't constexpr thus > normally should be explicitly marked inline. Or the problem is the > missing constexpr? (sorry, I didn't really study the new code) The only use of this function is to compute its return type. It's never called. I should probably comment on that...
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
Hi On 01/04/19 10:43, Ville Voutilainen wrote: On Mon, 1 Apr 2019 at 11:30, Paolo Carlini wrote: Hi On 30/03/19 19:00, Ville Voutilainen wrote: - template + template +decltype(auto) +__visitor_result_type(_Visitor&& __visitor, _Variants&&... __variants) +{ + if constexpr(__use_index) +return __detail::__variant::__variant_idx_cookie{}; + else + return std::forward<_Visitor>(__visitor)( + std::get<0>(std::forward<_Variants>(__variants))...); +} If I'm not misreading something, the new function will be usually compiled/optimized to something very small and isn't constexpr thus normally should be explicitly marked inline. Or the problem is the missing constexpr? (sorry, I didn't really study the new code) The only use of this function is to compute its return type. It's never called. I should probably comment on that... Oh, yes, now I see, a few lines below. That's clear enough, I think. Paolo.
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Sat, 30 Mar 2019 at 20:00, Ville Voutilainen wrote:
>
> This patch makes assignments correct, because they need to
> match indices instead of types. In addition, we cut down the
> codegen size. The symbols are longer than before, but the amount
> of them is roughly the same, so there's no longer an explosion
> in their amount.
>
> Relops are the last bit in these fixes, I'll fix them during the weekend.

Here. This does produce 17 symbols instead of 9 for a test with a
single operator== comparison. But it's not 47 like it is with the code
before this patch.

2019-04-01  Ville Voutilainen

	Use single-visitation in variant assignment and swap.
	Also use indices instead of types when checking whether
	variants hold the same thing.
	* include/std/variant (__do_visit): Add a template parameter
	for index visitation, invoke with indices if index visitation
	is used.
	(__variant_idx_cookie): New.
	(__visit_with_index): Likewise.
	(_Copy_assign_base::operator=): Do single-visitation with
	an index visitor.
	(_Move_assign_base::operator=): Likewise.
	(_Extra_visit_slot_needed): Adjust.
	(__visit_invoke): Call with indices if it's an index visitor.
	(relops): Do single-visitation with an index visitor.
	(swap): Likewise.
	(__visitor_result_type): New.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 7fd6947..3a11e1c 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -138,7 +138,7 @@ namespace __variant
     constexpr variant_alternative_t<_Np, variant<_Types...>> const&&
     get(const variant<_Types...>&&);
 
-  template
+  template
     constexpr decltype(auto)
     __do_visit(_Visitor&& __visitor, _Variants&&...
__variants); @@ -175,6 +175,10 @@ namespace __variant // used for raw visitation struct __variant_cookie {}; + // used for raw visitation with indices passed in + struct __variant_idx_cookie {}; + // a more explanatory name than 'true' + inline constexpr auto __visit_with_index = bool_constant{}; // _Uninitialized is guaranteed to be a literal type, even if T is not. // We have to do this, because [basic.types]p10.5.3 (n4606) is not implemented @@ -570,45 +574,44 @@ namespace __variant operator=(const _Copy_assign_base& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_copy_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) - __this_mem = __rhs_mem; - } - else - { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) + if (this->_M_index == __rhs_index) { - using __rhs_type = - remove_reference_t; - if constexpr (is_nothrow_copy_constructible_v<__rhs_type> - || !is_nothrow_move_constructible_v<__rhs_type>) + if constexpr (__rhs_index != variant_npos) { - this->_M_destructive_copy(__rhs._M_index, __rhs_mem); + auto& __this_mem = + __get<__rhs_index>(*this); + if constexpr (is_same_v< + remove_reference_t, + remove_reference_t>) + __this_mem = __rhs_mem; } + } + else + { + using __rhs_type = + remove_reference_t; + if constexpr + (is_nothrow_copy_constructible_v<__rhs_type> + || !is_nothrow_move_constructible_v<__rhs_type>) + this->_M_destructive_copy(__rhs_index, __rhs_mem); else { auto __tmp(__rhs_mem); - this->_M_destructive_move(__rhs._M_index, + this->_M_destructive_move(__rhs_index, std::move(__tmp)); } } - else - { - this->_M_reset(); - } } - return {}; 
- }, __variant_cast<_Types...>(*this), __variant_cast<_Types...>(__rhs)); + else + this->_M_reset(); + return {}; + }, __variant_cast<_Types...>(__rhs)); __glibcxx_assert(this->_M_index == __rhs._M_index); return *this; } @@ -641,25 +644,32 @@ namespace __variant operator=(_Move_assign_base&& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_move_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, -
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:45, Paolo Carlini wrote: > >>> +__visitor_result_type(_Visitor&& __visitor, _Variants&&... > >>> __variants) > >>> +{ > >>> + if constexpr(__use_index) > >>> +return __detail::__variant::__variant_idx_cookie{}; > >>> + else > >>> + return std::forward<_Visitor>(__visitor)( > >> If I'm not misreading something, the new function will be usually > >> compiled/optimized to something very small and isn't constexpr thus > >> normally should be explicitly marked inline. Or the problem is the > >> missing constexpr? (sorry, I didn't really study the new code) > > The only use of this function is to compute its return type. It's > > never called. I should probably comment on that... > > Oh, yes, now I see, a few lines below. That's clear enough, I think. It's a bit novel, but happens to be much easier to write than something using std::conditional or some such. Traits like that may end up evaluating their template arguments much sooner than I'd like, but shoving it into a helper function with an if-constexpr in it makes it trivial and I don't need to worry about the validity of the branch taken, since it's not instantiated. This is a new use case for if-constexpr even for me, but it's a very nice bonus ability in that already shining-fabulous language facility. :)
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:50, Ville Voutilainen wrote: > in it makes it trivial and I don't > need to worry about the validity of the branch taken, since it's not > instantiated. This is a new I mean the branch *not* taken. :)
[PATCH] Improve PR46590 compile-time, fix RPO VN regression
This fixes quadratic behavior in region RPO VN which happens because
dom_walker computes whole-function RPO order in its constructor even
though in the end we do not use it. That's an artifact of C++ (we
derive from dom_walker). The fix is to lazily do costly initialization
in the ::walk method.

I'm not sure we can re-use previous edge flag state for multiple
::walk invocations when the user said REACHABLE_BLOCKS instead of
REACHABLE_BLOCKS_PRESERVING_FLAGS so I've gone for the safe side (but
I'm quite sure we do not have a consumer walking multiple times using
the same class instance anyways).

This cuts down GIMPLE copy-header time from 11% to 0% on the second
testcase from PR46590, leaving RTL invariant motion as the worst
offender at 50% of the compile-time (through its use of DF).

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-01  Richard Biener

	PR tree-optimization/46590
	* domwalk.h (dom_walker::dom_walker): Consolidate constructors.
	(dom_walker::m_reachability): Add in place of...
	(dom_walker::m_skip_unreachable_blocks): ...this.
	* domwalk.c (dom_walker::dom_walker): Consolidate constructors.
	Move complex initialization ...
	(dom_walker::walk): Here.  Especially compute m_bb_to_rpo
	lazily and initialize edge flags on each invocation.
	(dom_walker::bb_reachable): Use m_reachability.

Index: gcc/domwalk.c
===
--- gcc/domwalk.c	(revision 270053)
+++ gcc/domwalk.c	(working copy)
@@ -190,69 +190,11 @@ dom_walker::dom_walker (cdi_direction di
 			enum reachability reachability,
 			int *bb_index_to_rpo)
   : m_dom_direction (direction),
-    m_skip_unreachable_blocks (reachability != ALL_BLOCKS),
-    m_user_bb_to_rpo (true),
+    m_reachability (reachability),
+    m_user_bb_to_rpo (bb_index_to_rpo != NULL),
     m_unreachable_dom (NULL),
     m_bb_to_rpo (bb_index_to_rpo)
 {
-  /* Set up edge flags if need be.  */
-  switch (reachability)
-    {
-    default:
-      gcc_unreachable ();
-    case ALL_BLOCKS:
-      /* No need to touch edge flags.
*/ - break; - -case REACHABLE_BLOCKS: - set_all_edges_as_executable (cfun); - break; - -case REACHABLE_BLOCKS_PRESERVING_FLAGS: - /* Preserve the edge flags. */ - break; -} -} - -/* Constructor for a dom walker. */ - -dom_walker::dom_walker (cdi_direction direction, - enum reachability reachability) - : m_dom_direction (direction), -m_skip_unreachable_blocks (reachability != ALL_BLOCKS), -m_user_bb_to_rpo (false), -m_unreachable_dom (NULL), -m_bb_to_rpo (NULL) -{ - /* Compute the basic-block index to RPO mapping. */ - if (direction == CDI_DOMINATORS) -{ - int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); - int postorder_num = pre_and_rev_post_order_compute (NULL, postorder, - true); - m_bb_to_rpo = XNEWVEC (int, last_basic_block_for_fn (cfun)); - for (int i = 0; i < postorder_num; ++i) - m_bb_to_rpo[postorder[i]] = i; - free (postorder); -} - - /* Set up edge flags if need be. */ - switch (reachability) -{ -default: - gcc_unreachable (); -case ALL_BLOCKS: - /* No need to touch edge flags. */ - break; - -case REACHABLE_BLOCKS: - set_all_edges_as_executable (cfun); - break; - -case REACHABLE_BLOCKS_PRESERVING_FLAGS: - /* Preserve the edge flags. */ - break; -} } /* Destructor. */ @@ -270,7 +212,7 @@ dom_walker::bb_reachable (struct functio { /* If we're not skipping unreachable blocks, then assume everything is reachable. */ - if (!m_skip_unreachable_blocks) + if (m_reachability == ALL_BLOCKS) return true; /* If any of the predecessor edges that do not come from blocks dominated @@ -331,6 +273,23 @@ const edge dom_walker::STOP = (edge)-1; void dom_walker::walk (basic_block bb) { + /* Compute the basic-block index to RPO mapping lazily. 
*/ + if (!m_bb_to_rpo + && m_dom_direction == CDI_DOMINATORS) +{ + int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); + int postorder_num = pre_and_rev_post_order_compute (NULL, postorder, + true); + m_bb_to_rpo = XNEWVEC (int, last_basic_block_for_fn (cfun)); + for (int i = 0; i < postorder_num; ++i) + m_bb_to_rpo[postorder[i]] = i; + free (postorder); +} + + /* Set up edge flags if need be. */ + if (m_reachability == REACHABLE_BLOCKS) +set_all_edges_as_executable (cfun); + basic_block dest; basic_block *worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) * 2); Index: gcc/domwalk.h === --- gcc/domwalk.h
[C++ Patch] PR 62207 ("[7/8/9 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'overload' in tsubst_copy, at cp/pt.c")
Hi,

A rather long-standing issue. The error-recovery problem is that, in
various circumstances, we flag as erroneous a local variable
declaration and then, when we need to come back to it later on,
lookup_name doesn't find it, keeps looking and may instead find a
completely unrelated non-variable declaration with the same name. In
principle, we could probably set up things in such a way that during
error-recovery the erroneous entity is found anyway, marked as
erroneous of course, but I don't think we want to attempt that now...
Thus, I'm proposing to just check for VAR_P in the relevant case of
tsubst_copy and bail out. Setting instead r = NULL_TREE and trying to
continue works rather well but leads to duplicate diagnostics in some
cases (eg, for template/crash131.C).

By the way, something else I actually tried is returning immediately
from tsubst_expr when the tsubst call in case DECL_EXPR returns
error_mark_node and then bailing out immediately from the for loop
over the STATEMENT_LIST. That also passes the testsuite, but, if you
want, you can certainly notice that the loop over the statements ends
early; for example we would not provide any diagnostics for an
additional Y y2; in template/crash131.C. Which interestingly is true
for current clang too ;)

Tested x86_64-linux.

Thanks, Paolo.

/cp
2019-04-01  Paolo Carlini

	PR c++/62207
	* pt.c (tsubst_copy): Deal with lookup_name not returning a variable.

/testsuite
2019-04-01  Paolo Carlini

	PR c++/62207
	* g++.dg/template/crash130.C: New.
	* g++.dg/template/crash131.C: Likewise.

Index: cp/pt.c
===
--- cp/pt.c	(revision 270012)
+++ cp/pt.c	(working copy)
@@ -15579,12 +15579,23 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 	{
 	  /* First try name lookup to find the instantiation.  */
 	  r = lookup_name (DECL_NAME (t));
-	  if (r && !is_capture_proxy (r))
+	  if (r)
 	    {
-	      /* Make sure that the one we found is the one we want.
*/ - tree ctx = enclosing_instantiation_of (DECL_CONTEXT (t)); - if (ctx != DECL_CONTEXT (r)) - r = NULL_TREE; + if (!VAR_P (r)) + { + /* During error-recovery we may find a non-variable, +even an OVERLOAD: just bail out and avoid ICEs and +duplicate diagnostics (c++/62207). */ + gcc_assert (seen_error ()); + return error_mark_node; + } + if (!is_capture_proxy (r)) + { + /* Make sure the one we found is the one we want. */ + tree ctx = enclosing_instantiation_of (DECL_CONTEXT (t)); + if (ctx != DECL_CONTEXT (r)) + r = NULL_TREE; + } } if (r) @@ -15620,7 +15631,7 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com } gcc_assert (cp_unevaluated_operand || TREE_STATIC (r) || decl_constant_var_p (r) - || errorcount || sorrycount); + || seen_error ()); if (!processing_template_decl && !TREE_STATIC (r)) r = process_outer_var_ref (r, complain); Index: testsuite/g++.dg/template/crash130.C === --- testsuite/g++.dg/template/crash130.C(nonexistent) +++ testsuite/g++.dg/template/crash130.C(working copy) @@ -0,0 +1,15 @@ +// PR c++/62207 + +template void foo(T) +{ + X; // { dg-error "not declared" } + X; +} + +void X(); +void X(int); + +void bar() +{ + foo(0); +} Index: testsuite/g++.dg/template/crash131.C === --- testsuite/g++.dg/template/crash131.C(nonexistent) +++ testsuite/g++.dg/template/crash131.C(working copy) @@ -0,0 +1,16 @@ +// PR c++/62207 + +template +class X { +public: + template class Y {}; + template void y() {} + X(F f) + { +Y y; // { dg-error "not a constant" } + +y.value(); + } +}; + +int main() { X x(1); }
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:45, Ville Voutilainen wrote:
>
> On Sat, 30 Mar 2019 at 20:00, Ville Voutilainen wrote:
> >
> > This patch makes assignments correct, because they need to
> > match indices instead of types. In addition, we cut down the
> > codegen size. The symbols are longer than before, but the amount
> > of them is roughly the same, so there's no longer an explosion
> > in their amount.
> >
> > Relops are the last bit in these fixes, I'll fix them during the weekend.
>
> Here. This does produce 17 symbols instead of 9 for a test with a
> single operator== comparison.
> But it's not 47 like it is with the code before this patch.

Jonathan: I consider this good enough to ship; it doesn't explode the
symbols too much and nicely gets rid of half of the runtime ifs for
valueless state and index. That's really the perf bonus that
visitation buys, and for swap and relops there is less need to do
other inquiries, although there is that check for same-index, which is
now a run-time check between a completely-runtime value and a
compile-time constant, as opposed to a run-time check between two
completely-runtime values.

Later, as a possible optimization, we can see whether swap and relops
can be made to use a different generated jump table with just maybe
twelve entries, and whether we can dispatch into it without other
additional branches, which I'm not entirely sure we can.
[PATCH] Improve DF use of loop-invariant (PR46590)
After the fix to RPO VN in loop header copying, DF via RTL invariant
motion takes 50% of the compile-time for the second testcase in
PR46590. This is caused by the DF live problem iterating over all
dirty blocks for the local problem, which is all blocks of the
function, because while loop-invariant uses df_analyze_loop it always
marks all blocks of the function as dirty via df_live_set_all_dirty.

One of the possible fixes is to add a df_live_set_loop_dirty ()
function which causes the problem to only iterate over one loop's
blocks. It turns out the LIVE problem is always present so we cannot
remove it, which in turn means we have to mark blocks possibly not
visited by invariant motion as dirty (thus the new
df_live_set_all_dirty at the end of move_loop_invariants).

Anyhow - I'm not really into how DF should ideally work here but
thought this issue was "easy" enough to fix...

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

This brings down DF live time from 50% to 0% for the testcase (IIRC
I've seen loop-invariant as a time-hog several times, surprising me).

Richard.

2019-04-01  Richard Biener

	PR rtl-optimization/46590
	* df.h (df_live_set_loop_dirty): Declare.
	* df-problems.c: Include cfgloop.h.
	(df_live_set_loop_dirty): New function.
	* loop-invariant.c (find_defs): Use df_live_set_loop_dirty
	instead of df_live_set_all_dirty.
	(move_loop_invariants): Mark all of live dirty at the end.
Index: gcc/df.h === --- gcc/df.h(revision 270054) +++ gcc/df.h(working copy) @@ -1023,6 +1023,7 @@ extern void df_lr_verify_transfer_functi extern void df_live_verify_transfer_functions (void); extern void df_live_add_problem (void); extern void df_live_set_all_dirty (void); +extern void df_live_set_loop_dirty (struct loop *); extern void df_chain_add_problem (unsigned int); extern void df_word_lr_add_problem (void); extern bool df_word_lr_mark_ref (df_ref, bool, bitmap); Index: gcc/df-problems.c === --- gcc/df-problems.c (revision 270054) +++ gcc/df-problems.c (working copy) @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. #include "valtrack.h" #include "dumpfile.h" #include "rtl-iter.h" +#include "cfgloop.h" /* Note that turning REG_DEAD_DEBUGGING on will cause gcc.c-torture/unsorted/dump-noaddr.c to fail because it prints @@ -1774,6 +1775,19 @@ df_live_set_all_dirty (void) } +/* Set all of the blocks of LOOP as dirty. This needs to be done if this + problem is added after all of the insns have been scanned. */ + +void +df_live_set_loop_dirty (struct loop *loop) +{ + basic_block *bbs = get_loop_body (loop); + for (unsigned i = 0; i < loop->num_nodes; ++i) +bitmap_set_bit (df_live->out_of_date_transfer_functions, bbs[i]->index); + free (bbs); +} + + /* Verify that all of the lr related info is consistent and correct. */ Index: gcc/loop-invariant.c === --- gcc/loop-invariant.c(revision 270054) +++ gcc/loop-invariant.c(working copy) @@ -685,7 +685,7 @@ find_defs (struct loop *loop) df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); df_live_add_problem (); - df_live_set_all_dirty (); + df_live_set_loop_dirty (loop); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); @@ -2286,5 +2286,9 @@ move_loop_invariants (void) invariant_table = NULL; invariant_table_size = 0; + /* Mark all of DF_LIVE dirty, we've likely not covered all blocks + sofar. 
*/ + df_live_set_all_dirty (); + checking_verify_flow_info (); }
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On 01/04/19 11:45 +0300, Ville Voutilainen wrote: @@ -570,45 +574,44 @@ namespace __variant operator=(const _Copy_assign_base& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_copy_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) - __this_mem = __rhs_mem; - } - else - { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) + if (this->_M_index == __rhs_index) { - using __rhs_type = - remove_reference_t; - if constexpr (is_nothrow_copy_constructible_v<__rhs_type> - || !is_nothrow_move_constructible_v<__rhs_type>) + if constexpr (__rhs_index != variant_npos) { - this->_M_destructive_copy(__rhs._M_index, __rhs_mem); + auto& __this_mem = + __get<__rhs_index>(*this); Qualify this as __variant::__get please. 
+ if constexpr (is_same_v< + remove_reference_t, + remove_reference_t>) + __this_mem = __rhs_mem; } + } + else + { + using __rhs_type = + remove_reference_t; + if constexpr + (is_nothrow_copy_constructible_v<__rhs_type> + || !is_nothrow_move_constructible_v<__rhs_type>) + this->_M_destructive_copy(__rhs_index, __rhs_mem); else { auto __tmp(__rhs_mem); - this->_M_destructive_move(__rhs._M_index, + this->_M_destructive_move(__rhs_index, std::move(__tmp)); } } - else - { - this->_M_reset(); - } } - return {}; - }, __variant_cast<_Types...>(*this), __variant_cast<_Types...>(__rhs)); + else + this->_M_reset(); + return {}; + }, __variant_cast<_Types...>(__rhs)); __glibcxx_assert(this->_M_index == __rhs._M_index); return *this; } @@ -641,25 +644,32 @@ namespace __variant operator=(_Move_assign_base&& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_move_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) - __this_mem = std::move(__rhs_mem); + if (this->_M_index == __rhs_index) + { + if constexpr (__rhs_index != variant_npos) + { + auto& __this_mem = + __get<__rhs_index>(*this); And here. + if constexpr (is_same_v< + remove_reference_t, + remove_reference_t>) + __this_mem = std::move(__rhs_mem); + } + } + else + this->_M_destructive_move(__rhs_index, + std::move(__rhs_mem)); } else - { - auto __tmp(std::move(__rhs_mem)); - this->_M_destructive_move(__rhs._M_index, std::move(__tmp)); - } - return {}; - }, __variant_cast<_Types...>(*this), __variant_c
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On 01/04/19 12:52 +0300, Ville Voutilainen wrote: On Mon, 1 Apr 2019 at 11:45, Ville Voutilainen wrote: On Sat, 30 Mar 2019 at 20:00, Ville Voutilainen wrote: > > This patch makes assignments correct, because they need to > match indices instead of types. In addition, we cut down the > codegen size. The symbols are longer than before, the the amount > of them is roughly the same, so there's no longer an explosion > in their amount. > > Relops are the last bit in these fixes, I'll fix them during the weekend. Here. This does produce 17 symbols instead of 9 for a test with a single operator== comparison. But it's not 47 like it is with the code before this patch. Jonathan: I consider this good enough to ship; it doesn't explode the symbols too much and nicely gets rid of half of the runtime ifs for valueless state and index. That's really the perf bonus that visitation buys, and for swap and relops there is less need to do other inquiries, although there is that check for same-index, which is now a run-time check between a completely-runtime value and a compile-time constant as opposed to a run-time check between two completely-runtime values. Later, as a possible optimization, we can see whether swap and relops can be made to use a different generated jump table with just maybe twelve entries, and if we can dispatch into it without other additional branches, which I'm not entirely sure we can. Yes, this is a very satisfactory improvement in codegen.
[RFC][PATCH] Postpone print of --help=* option.
Hi. Last week I was curious which warnings are disabled by default on top of -Wall and -Wextra. Thus I used --help=warning and noticed some discrepancy in between documentation and output of the --help option. I created PR89885 where I explained that OPT__help_ option handling happens early. That's why LangEnabledBy are not reflected and similarly target overrides don't take place. I'm attaching diff for --help=warning for C++ and -Ofast. Thoughts? gcc/ChangeLog: 2019-04-01 Martin Liska * gcc.c (process_command): Add dummy file only if n_infiles == 0. * opts-global.c (decode_options): Pass lang_mask. * opts.c (print_help): New function. (finish_options): Print --help if help_option_argument is set. (common_handle_option): Factor out content of OPT__help_ into print_help. * opts.h (finish_options): Add new argument. --- gcc/gcc.c | 3 +- gcc/opts-global.c | 2 +- gcc/opts.c| 267 -- gcc/opts.h| 3 +- 4 files changed, 146 insertions(+), 129 deletions(-) diff --git a/gcc/gcc.c b/gcc/gcc.c index 4f57765b012..7ce1cae28a7 100644 --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -4751,7 +4751,8 @@ process_command (unsigned int decoded_options_count, } /* Ensure we only invoke each subprocess once. */ - if (print_subprocess_help || print_help_list || print_version) + if (n_infiles == 0 + && (print_subprocess_help || print_help_list || print_version)) { n_infiles = 0; diff --git a/gcc/opts-global.c b/gcc/opts-global.c index a5e9ef0237a..f110fe1026f 100644 --- a/gcc/opts-global.c +++ b/gcc/opts-global.c @@ -314,7 +314,7 @@ decode_options (struct gcc_options *opts, struct gcc_options *opts_set, loc, lang_mask, &handlers, dc); - finish_options (opts, opts_set, loc); + finish_options (opts, opts_set, loc, lang_mask); } /* Hold command-line options associated with stack limitation. 
*/ diff --git a/gcc/opts.c b/gcc/opts.c index 02f6b4656e1..707e6754294 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -854,12 +854,18 @@ control_options_for_live_patching (struct gcc_options *opts, } } +/* --help option argument if set. */ +const char *help_option_argument = NULL; + +static void print_help (struct gcc_options *opts, unsigned int lang_mask); + + /* After all options at LOC have been read into OPTS and OPTS_SET, finalize settings of those options and diagnose incompatible combinations. */ void finish_options (struct gcc_options *opts, struct gcc_options *opts_set, - location_t loc) + location_t loc, unsigned int lang_mask) { enum unwind_info_type ui_except; @@ -1223,6 +1229,10 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set, opts->x_flag_live_patching, loc); } + + /* Print --help=* if used. */ + if (help_option_argument != NULL) +print_help (opts, lang_mask); } #define LEFT_COLUMN 27 @@ -2052,6 +2062,135 @@ check_alignment_argument (location_t loc, const char *flag, const char *name) parse_and_check_align_values (flag, name, align_result, true, loc); } +/* Print help when OPT__help_ is set. */ + +static void +print_help (struct gcc_options *opts, unsigned int lang_mask) +{ + const char *a = help_option_argument; + unsigned int include_flags = 0; + /* Note - by default we include undocumented options when listing + specific classes. If you only want to see documented options + then add ",^undocumented" to the --help= option. E.g.: + + --help=target,^undocumented */ + unsigned int exclude_flags = 0; + + if (lang_mask == CL_DRIVER) +return; + + /* Walk along the argument string, parsing each word in turn. 
+ The format is: + arg = [^]{word}[,{arg}] + word = {optimizers|target|warnings|undocumented| + params|common|} */ + while (*a != 0) +{ + static const struct + { + const char *string; + unsigned int flag; + } + specifics[] = + { + { "optimizers", CL_OPTIMIZATION }, + { "target", CL_TARGET }, + { "warnings", CL_WARNING }, + { "undocumented", CL_UNDOCUMENTED }, + { "params", CL_PARAMS }, + { "joined", CL_JOINED }, + { "separate", CL_SEPARATE }, + { "common", CL_COMMON }, + { NULL, 0 } + }; + unsigned int *pflags; + const char *comma; + unsigned int lang_flag, specific_flag; + unsigned int len; + unsigned int i; + + if (*a == '^') + { + ++a; + if (*a == '\0') + { + error ("missing argument to %qs", "--help=^"); + break; + } + pflags = &exclude_flags; + } + else + pflags = &include_flags; + + comma = strchr (a, ','); + if (comma == NULL) + len = strlen (a); + else + len = comma - a; + if (len == 0) + { + a = comma + 1; + continue; + } + + /* Check to see if the string matches an option class name. */ + for (i = 0, specific_flag = 0; specifics[i].string != NULL; i++) +
[PATCH] Fix PR89896, bogus Makefile generated by lto-wrapper
The Makefile generated by lto-wrapper is currently confused by a file
named all or all.c in the local directory, since make's implicit rules
can then treat the all target as buildable from that file.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-01  Richard Biener

	PR lto/89896
	* lto-wrapper.c (run_gcc): Avoid implicit rules making
	the all target phony.

Index: gcc/lto-wrapper.c
===
--- gcc/lto-wrapper.c	(revision 270053)
+++ gcc/lto-wrapper.c	(working copy)
@@ -1665,7 +1665,9 @@ cont:
       struct pex_obj *pex;
       char jobs[32];
 
-      fprintf (mstream, "all:");
+      fprintf (mstream,
+	       ".PHONY: all\n"
+	       "all:");
       for (i = 0; i < nr; ++i)
 	{
 	  int j = ltrans_priorities[i*2 + 1];
Re: [PATCH] Add PSTL internal namespace qualifications
On 29/03/19 12:12 -0700, Thomas Rodgers wrote:

Prevent ADL related weirdness.

	* include/pstl/algorithm_impl.h: Add namespace qualification.
	* include/pstl/execution_defs.h: Add namespace qualification.
	* include/pstl/execution_impl.h: Add namespace qualification.
	* include/pstl/numeric_impl.h: Add namespace qualification.
	* include/pstl/parallel_backend_tbb.h: Add namespace qualification.
	* include/pstl/unseq_backend_simd.h: Add namespace qualification.
	* include/pstl/parallel_backend_utils.h: Include <cassert>.

We shouldn't include <cassert> in the std::lib; the uses of assert
should be changed to __glibcxx_assert instead (which is enabled by
defining _GLIBCXX_ASSERTIONS).

@@ -285,7 +286,7 @@
 _ForwardIterator2
 __pattern_walk2_n(_ExecutionPolicy&&, _ForwardIterator1 __first1, _Size n,
                   _ForwardIterator2 __first2, _Function f,
                   _IsVector is_vector, /*parallel=*/std::false_type) noexcept

I missed these before, but we have non-uglified 'n' and 'f' and
'is_vector' here.  Almost all uses of "is_vector" are inside a comment
like /*is_vector=*/ but here it's a real parameter name.

In practice if users do something stupid like

  #define n 123
  #define f 456

then they won't be able to include these headers anyway, because TBB
uses those as parameter names (so users that are using these headers
with TBB can't define such dumb macros).  But when we get a non-TBB
backend that won't be the case, and we should be uglifying all names
declared in our own headers on principle.

 {
-    return __brick_walk2_n(__first1, n, __first2, f, is_vector);
+    return __internal::__brick_walk2_n(__first1, n, __first2, f, is_vector);
 }

 template

And 'n' and 'f' here.
 {
-    return __pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, __first1 + n, __first2, f, is_vector,
+    return __internal::__pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, __first1 + n, __first2, f, is_vector,
                            std::true_type());
 }

We can fix the 'n' and 'f' and 'is_vector' names in another patch; for
this one please just remove <cassert> again and change assert(...) to
__glibcxx_assert(...).
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 3:43 AM CHIGOT, CLEMENT wrote: > > Description: > * This patch removes -mvsx and -maltivec for go aix/ppc. > These options don't seem compatible with Go stack layout. > > Tests: > * AIX 7.2: Configure/Build: SUCCESS > > Changelog: > * go/gospec.c (lang_specific_driver): disable -mvsx and -maltivec for > aix/ppc. Is this really the correct place to disable VSX and Altivec for GCC Go? It seems that this should be disabled in the target-specific code based on language. Thanks, David
Re: [PATCH PR89725]Handle DR's access functions of loops not in DDR's loop_nest
On Mon, Apr 1, 2019 at 5:10 AM bin.cheng wrote:
>
> Hi,
>
> As described in the comments of PR89725, this patch fixes an
> out-of-bound access during computation of the classic dist/dir vector
> for a DDR.  Basically it does two things: A) Handle the relevant chrec
> of an outer loop in a multivariate access function as an invariant
> symbol during DDR analysis; B) Bypass relevant univariate access
> functions.
>
> Bootstrap and test on x86_64, any comments?

LGTM.  As Jakub said elsewhere, index_in_loop_nest is better written as

static inline int
index_in_loop_nest (int var, vec<loop_p> loop_nest)
{
  struct loop *loopi;
  int var_index;

  for (var_index = 0; loop_nest.iterate (var_index, &loopi); var_index++)
    if (loopi->num == var)
      return var_index;

  gcc_unreachable ();
}

OK with that change.

Richard.

> Thanks,
> bin
>
> 2019-04-01  Bin Cheng
>
>	PR tree-optimization/89725
>	* tree-chrec.c (chrec_contains_symbols): New parameter.  Handle outer
>	loop's chrec as invariant symbol.
>	* tree-chrec.h (chrec_contains_symbols): New parameter.
>	* tree-data-ref.c (analyze_miv_subscript): Pass new argument.
>	(build_classic_dist_vector_1, add_other_self_distances): Bypass access
>	function of loops not in DDR's loop_nest.
>	* tree-data-ref.h (index_in_loop_nest): New assertion.
>
> 2019-04-01  Bin Cheng
>
>	PR tree-optimization/89725
>	* gcc/testsuite/gcc.dg/tree-ssa/pr89725.c: New test.
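The suggested shape of the loop can be exercised standalone. In this sketch, `struct loop` and the plain pointer array are simplified stand-ins for GCC's `struct loop` and `vec<loop_p>`, and `abort ()` plays the role of `gcc_unreachable ()`:

```c
#include <assert.h>
#include <stdlib.h>

struct loop { int num; };

/* Return the index of the loop whose number is VAR in the nest of N
   loops; not finding it is a hard error, like the gcc_unreachable ()
   in the rewrite above.  */
static int
index_in_loop_nest (int var, const struct loop **loop_nest, int n)
{
  int var_index;

  for (var_index = 0; var_index < n; var_index++)
    if (loop_nest[var_index]->num == var)
      return var_index;
  abort ();
}
```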
Re: [C++ Patch] PR 62207 ("[7/8/9 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'overload' in tsubst_copy, at cp/pt.c")
OK. Jason
Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.
Hi Richard Thanks for the comments and pointing out the much cleaner existing asm output functions! On 29/03/2019 17:51, Richard Henderson wrote: >> +#define ASM_LONG "\t.long\t" > > Do not replicate targetm.asm_out.aligned_op.si, or integer_asm_op, really. > >> +aarch64_file_end_indicate_exec_stack () >> +{ >> + file_end_indicate_exec_stack (); >> + >> + if (!aarch64_bti_enabled () >> + && aarch64_ra_sign_scope == AARCH64_FUNCTION_NONE) >> +{ >> + return; >> +} > > This is redundant with... > >> + >> + unsigned feature_1_and = 0; >> + if (aarch64_bti_enabled ()) >> +feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_BTI; >> + >> + if (aarch64_ra_sign_scope != AARCH64_FUNCTION_NONE) >> +feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_PAC; >> + >> + if (feature_1_and) > > ... this. I prefer the second, as it's obvious. > >> + ASM_OUTPUT_ALIGN (asm_out_file, p2align); >> + /* name length. */ >> + fprintf (asm_out_file, ASM_LONG " 1f - 0f\n"); >> + /* data length. */ >> + fprintf (asm_out_file, ASM_LONG " 4f - 1f\n"); >> + /* note type: NT_GNU_PROPERTY_TYPE_0. */ >> + fprintf (asm_out_file, ASM_LONG " 5\n"); >> + fprintf (asm_out_file, "0:\n"); >> + /* vendor name: "GNU". */ >> + fprintf (asm_out_file, STRING_ASM_OP " \"GNU\"\n"); >> + fprintf (asm_out_file, "1:\n"); >> + ASM_OUTPUT_ALIGN (asm_out_file, p2align); >> + /* pr_type: GNU_PROPERTY_AARCH64_FEATURE_1_AND. */ >> + fprintf (asm_out_file, ASM_LONG " 0x%x\n", >> + GNU_PROPERTY_AARCH64_FEATURE_1_AND); >> + /* pr_datasz. */\ >> + fprintf (asm_out_file, ASM_LONG " 3f - 2f\n"); >> + fprintf (asm_out_file, "2:\n"); >> + /* GNU_PROPERTY_AARCH64_FEATURE_1_XXX. */ >> + fprintf (asm_out_file, ASM_LONG " 0x%x\n", feature_1_and); >> + fprintf (asm_out_file, "3:\n"); >> + ASM_OUTPUT_ALIGN (asm_out_file, p2align); >> + fprintf (asm_out_file, "4:\n"); > > This could stand to use a comment, a moment's thinking about the sizes, and to > use the existing asm output functions. > > /* PT_NOTE header: namesz, descsz, type. 
> namesz = 4 ("GNU\0")
> descsz = 12 (see below)

I was trying out these changes but a descsz of 12 gets rejected by
readelf.  It hits the following check:

  unsigned int size = is_32bit_elf ? 4 : 8;

  printf (_("      Properties: "));

  if (pnote->descsz < 8 || (pnote->descsz % size) != 0)
    {
      printf (_("<corrupt descsz: %#lx>\n"), pnote->descsz);
      return;
    }

Thanks
Sudi

> type = 5 (NT_GNU_PROPERTY_TYPE_0).  */
> assemble_align (POINTER_SIZE);
> assemble_integer (GEN_INT (4), 4, 32, 1);
> assemble_integer (GEN_INT (12), 4, 32, 1);
> assemble_integer (GEN_INT (5), 4, 32, 1);
>
> /* PT_NOTE name */
> assemble_string ("GNU", 4);
>
> /* PT_NOTE contents for NT_GNU_PROPERTY_TYPE_0:
>    type = 0xc0000000 (GNU_PROPERTY_AARCH64_FEATURE_1_AND),
>    datasz = 4
>    data = feature_1_and
>    Note that the current section offset is 16,
>    and there has been no padding so far.  */
> assemble_integer (GEN_INT (0xc0000000), 4, 32, 1);
> assemble_integer (GEN_INT (4), 4, 32, 1);
> assemble_integer (GEN_INT (feature_1_and), 4, 32, 1);
>
> /* Pad the size of the note to the required alignment.  */
> assemble_align (POINTER_SIZE);
>
>
> r~
>
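The size arithmetic behind the disagreement is easy to check in isolation. The two helpers below are illustrative only (neither readelf's nor GCC's code): a property entry is pr_type (4 bytes) plus pr_datasz (4 bytes) plus the data padded to the ELF word size, and readelf's quoted check requires descsz to be at least 8 and a multiple of that word size, which is 8 on 64-bit ELF. So the unpadded 12 is rejected while the padded 16 passes.

```c
#include <assert.h>

/* descsz of one NT_GNU_PROPERTY_TYPE_0 entry: pr_type + pr_datasz,
   plus the data rounded up to the ELF word size (4 or 8 bytes).  */
static unsigned
property_descsz (unsigned pr_datasz, unsigned elf_word)
{
  unsigned padded = (pr_datasz + elf_word - 1) & ~(elf_word - 1);
  return 4 + 4 + padded;
}

/* The acceptance condition from the readelf excerpt above.  */
static int
readelf_accepts (unsigned descsz, unsigned elf_word)
{
  return descsz >= 8 && descsz % elf_word == 0;
}
```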
Re: [PATCH] Improve DF use of loop-invariant (PR46590)
Richard Biener writes: > After the fix to RPO VN in loop header copying DF via RTL invariant > motion takes 50% of the compile-time for the second testcase in > PR46590. This is caused by the DF live problem iterating over > all dirty blocks for the local problem which is all blocks of > the function because while loop-invariant uses df_analyze_loop > it always marks all blocks of the function as dirty via > df_live_set_all_dirty. One of the possible fixes is to add > a df_live_set_loop_dirty () function which causes the problem > to only iterate over one loop blocks. > > It turns out the LIVE problem is always present so we cannot > remove it which in turn means we have to mark blocks possibly > not visited by invaraint motion as dirty (thus the new > df_live_set_all_dirty at the end of move_loop_invariants). > > Anyhow - I'm not really into how DF should ideally work here Me neither, but as discussed on IRC, here's alternative that seems to give a similar speed-up. - df_live is already present at -O2, so we only need to add it and mark all blocks dirty for -O - df_process_deferred_rescans should be enough to force a rescan of blocks affected by moving invariants, but calling it in find_defs means that we don't do any rescans for the final loop Still testing, but posting for comments. Thanks, Richard 2019-04-01 Richard Sandiford gcc/ PR rtl-optimization/46950 * loop-invariant.c (find_defs): Move df_remove_problem and df_process_deferred_rescans to move_invariants. Move df_live_add_problem and df_live_set_all_dirty calls to move_invariants. (move_invariants): Likewise. (move_loop_invariants): Likewise, making the df_live calls conditional on -O. Remove the problem again if we added it locally. 
Index: gcc/loop-invariant.c === --- gcc/loop-invariant.c2019-03-08 18:15:33.704751739 + +++ gcc/loop-invariant.c2019-04-01 14:09:13.553206345 + @@ -681,11 +681,7 @@ find_defs (struct loop *loop) loop->num); } - df_remove_problem (df_chain); - df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); - df_live_add_problem (); - df_live_set_all_dirty (); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); @@ -1891,6 +1887,10 @@ move_invariants (struct loop *loop) GENERAL_REGS, NO_REGS, GENERAL_REGS); } } + /* Remove the DF_UD_CHAIN problem added in find_defs before rescanning, + to save a bit of compile time. */ + df_remove_problem (df_chain); + df_process_deferred_rescans (); } /* Initializes invariant motion data. */ @@ -2254,6 +2254,11 @@ move_loop_invariants (void) { struct loop *loop; + if (optimize == 1) +{ + df_live_add_problem (); + df_live_set_all_dirty (); +} if (flag_ira_loop_pressure) { df_analyze (); @@ -2286,5 +2291,8 @@ move_loop_invariants (void) invariant_table = NULL; invariant_table_size = 0; + if (optimize == 1) +df_remove_problem (df_live); + checking_verify_flow_info (); }
Re: [PATCH] Improve DF use of loop-invariant (PR46590)
On April 1, 2019 4:13:28 PM GMT+02:00, Richard Sandiford wrote: >Richard Biener writes: >> After the fix to RPO VN in loop header copying DF via RTL invariant >> motion takes 50% of the compile-time for the second testcase in >> PR46590. This is caused by the DF live problem iterating over >> all dirty blocks for the local problem which is all blocks of >> the function because while loop-invariant uses df_analyze_loop >> it always marks all blocks of the function as dirty via >> df_live_set_all_dirty. One of the possible fixes is to add >> a df_live_set_loop_dirty () function which causes the problem >> to only iterate over one loop blocks. >> >> It turns out the LIVE problem is always present so we cannot >> remove it which in turn means we have to mark blocks possibly >> not visited by invaraint motion as dirty (thus the new >> df_live_set_all_dirty at the end of move_loop_invariants). >> >> Anyhow - I'm not really into how DF should ideally work here > >Me neither, but as discussed on IRC, here's alternative that seems >to give a similar speed-up. > >- df_live is already present at -O2, so we only need to add it and > mark all blocks dirty for -O > >- df_process_deferred_rescans should be enough to force a rescan of > blocks affected by moving invariants, but calling it in find_defs > means that we don't do any rescans for the final loop > >Still testing, but posting for comments. Looks like a good cleanup. It will still process all blocks at the first df_analyze_loop call but at least not all over again for each loop. So if it works out OK since it's conceptually simpler. Richard. >Thanks, >Richard > > >2019-04-01 Richard Sandiford > >gcc/ > PR rtl-optimization/46950 > * loop-invariant.c (find_defs): Move df_remove_problem and > df_process_deferred_rescans to move_invariants. > Move df_live_add_problem and df_live_set_all_dirty calls > to move_invariants. > (move_invariants): Likewise. 
> (move_loop_invariants): Likewise, making the df_live calls > conditional on -O. Remove the problem again if we added it > locally. > >Index: gcc/loop-invariant.c >=== >--- gcc/loop-invariant.c 2019-03-08 18:15:33.704751739 + >+++ gcc/loop-invariant.c 2019-04-01 14:09:13.553206345 + >@@ -681,11 +681,7 @@ find_defs (struct loop *loop) > loop->num); > } > >- df_remove_problem (df_chain); >- df_process_deferred_rescans (); > df_chain_add_problem (DF_UD_CHAIN); >- df_live_add_problem (); >- df_live_set_all_dirty (); > df_set_flags (DF_RD_PRUNE_DEAD_DEFS); > df_analyze_loop (loop); > check_invariant_table_size (); >@@ -1891,6 +1887,10 @@ move_invariants (struct loop *loop) >GENERAL_REGS, NO_REGS, GENERAL_REGS); > } > } >+ /* Remove the DF_UD_CHAIN problem added in find_defs before >rescanning, >+ to save a bit of compile time. */ >+ df_remove_problem (df_chain); >+ df_process_deferred_rescans (); > } > > /* Initializes invariant motion data. */ >@@ -2254,6 +2254,11 @@ move_loop_invariants (void) > { > struct loop *loop; > >+ if (optimize == 1) >+{ >+ df_live_add_problem (); >+ df_live_set_all_dirty (); >+} > if (flag_ira_loop_pressure) > { > df_analyze (); >@@ -2286,5 +2291,8 @@ move_loop_invariants (void) > invariant_table = NULL; > invariant_table_size = 0; > >+ if (optimize == 1) >+df_remove_problem (df_live); >+ > checking_verify_flow_info (); > }
[PATCH PR d/88462] Committed fix for abort in pthread_mutex_init on Solaris
Hi, This patch merges the libphobos druntime library with druntime upstream d57fa1ff. Fixes alignment of internal core.thread locks that were found to be the cause of abort() being called inside pthread_mutex_init(), fixing the second part of PR d/88462. Bootstrapped and regression tested on x86_64-linux-gnu. Committed to trunk as r270057. -- Iain --- diff --git a/libphobos/libdruntime/MERGE b/libphobos/libdruntime/MERGE index ed756fa6c18..15a55ab612a 100644 --- a/libphobos/libdruntime/MERGE +++ b/libphobos/libdruntime/MERGE @@ -1,4 +1,4 @@ -b9564bef1147c797842e6c1a804f2c3565c64ac1 +d57fa1ffaecc858229ed7a730e8486b59197dee5 The first line of this file holds the git revision number of the last merge done from the dlang/druntime repository. diff --git a/libphobos/libdruntime/core/internal/traits.d b/libphobos/libdruntime/core/internal/traits.d index d5786808054..e56f016c355 100644 --- a/libphobos/libdruntime/core/internal/traits.d +++ b/libphobos/libdruntime/core/internal/traits.d @@ -170,6 +170,29 @@ template anySatisfy(alias F, T...) } } +// simplified from std.traits.maxAlignment +template maxAlignment(U...) +{ +static if (U.length == 0) +static assert(0); +else static if (U.length == 1) +enum maxAlignment = U[0].alignof; +else static if (U.length == 2) +enum maxAlignment = U[0].alignof > U[1].alignof ? U[0].alignof : U[1].alignof; +else +{ +enum a = maxAlignment!(U[0 .. ($+1)/2]); +enum b = maxAlignment!(U[($+1)/2 .. $]); +enum maxAlignment = a > b ? a : b; +} +} + +template classInstanceAlignment(T) +if (is(T == class)) +{ +alias classInstanceAlignment = maxAlignment!(void*, typeof(T.tupleof)); +} + // Somehow fails for non-static nested structs without support for aliases template hasElaborateDestructor(T...) 
{ diff --git a/libphobos/libdruntime/core/thread.d b/libphobos/libdruntime/core/thread.d index e502072be7a..1cf26641e05 100644 --- a/libphobos/libdruntime/core/thread.d +++ b/libphobos/libdruntime/core/thread.d @@ -114,6 +114,13 @@ private { import core.atomic, core.memory, core.sync.mutex; +// Handling unaligned mutexes are not supported on all platforms, so we must +// ensure that the address of all shared data are appropriately aligned. +import core.internal.traits : classInstanceAlignment; + +enum mutexAlign = classInstanceAlignment!Mutex; +enum mutexClassInstanceSize = __traits(classInstanceSize, Mutex); + // // exposed by compiler runtime // @@ -1708,29 +1715,30 @@ private: // lock order inversion. @property static Mutex slock() nothrow @nogc { -return cast(Mutex)_locks[0].ptr; +return cast(Mutex)_slock.ptr; } @property static Mutex criticalRegionLock() nothrow @nogc { -return cast(Mutex)_locks[1].ptr; +return cast(Mutex)_criticalRegionLock.ptr; } -__gshared align(Mutex.alignof) void[__traits(classInstanceSize, Mutex)][2] _locks; +__gshared align(mutexAlign) void[mutexClassInstanceSize] _slock; +__gshared align(mutexAlign) void[mutexClassInstanceSize] _criticalRegionLock; static void initLocks() { -foreach (ref lock; _locks) -{ -lock[] = typeid(Mutex).initializer[]; -(cast(Mutex)lock.ptr).__ctor(); -} +_slock[] = typeid(Mutex).initializer[]; +(cast(Mutex)_slock.ptr).__ctor(); + +_criticalRegionLock[] = typeid(Mutex).initializer[]; +(cast(Mutex)_criticalRegionLock.ptr).__ctor(); } static void termLocks() { -foreach (ref lock; _locks) -(cast(Mutex)lock.ptr).__dtor(); +(cast(Mutex)_slock.ptr).__dtor(); +(cast(Mutex)_criticalRegionLock.ptr).__dtor(); } __gshared Context* sm_cbeg;
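The idea behind the new maxAlignment/classInstanceAlignment templates — a raw buffer standing in for a class instance must use the strictest alignment among the members — can be illustrated in C. The member set below is a hypothetical example for illustration, not druntime's actual Mutex layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

static size_t
max_align2 (size_t a, size_t b)
{
  return a > b ? a : b;
}

/* Fold the alignments of the assumed members, exactly as the D
   maxAlignment template folds typeof(T.tupleof): the buffer alignment
   is the maximum over all member alignments.  */
static size_t
mutex_buffer_align (void)
{
  size_t a = alignof (void *);             /* vtable/monitor pointers */
  a = max_align2 (a, alignof (long long)); /* e.g. a 64-bit lock word */
  return a;
}
```

An under-aligned buffer is exactly what made pthread_mutex_init abort on Solaris, hence the explicit align(mutexAlign) on the two lock buffers in the patch.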
Re: [PATCH, wwwdocs] Mention -march=armv8.5-a and other new command line options for AArch64 and Arm for GCC 9
Hi James On 29/03/2019 13:41, Sudakshina Das wrote: > Hi James > > On 22/03/2019 16:25, James Greenhalgh wrote: >> On Wed, Mar 20, 2019 at 10:17:41AM +, Sudakshina Das wrote: >>> Hi Kyrill >>> >>> On 12/03/2019 12:03, Kyrill Tkachov wrote: Hi Sudi, On 2/22/19 10:45 AM, Sudakshina Das wrote: > Hi > > This patch documents the addition of the new Armv8.5-A and > corresponding > extensions in the gcc-9/changes.html. > As per https://gcc.gnu.org/about.html, I have used W3 Validator. > Is this ok for cvs? > > Thanks > Sudi Index: htdocs/gcc-9/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v retrieving revision 1.43 diff -u -r1.43 changes.html --- htdocs/gcc-9/changes.html 21 Feb 2019 10:32:55 - 1.43 +++ htdocs/gcc-9/changes.html 21 Feb 2019 18:25:09 - @@ -283,6 +283,19 @@ The intrinsics are defined by the ACLE specification. + + The Armv8.5-A architecture is now supported. This can be used by specifying the + -march=armv8.5-a option. I tend to prefer the wording "... is now supported through the -march=armv8.5-a option". Otherwise it reads as the compiler "using" the architecture, whereas we usually talk about "targeting" an architecture. + + The Armv8.5-A architecture also adds some security features that are optional to all older + architecture versions. These are also supported now and only effect the assembler. + + Speculation Barrier instruction using -march=armv8-a+sb. + Execution and Data Prediction Restriction instructions using -march=armv8-a+predres. + Speculative Store Bypass Safe instruction using -march=armv8-a+ssbs. This does not + require a compiler option for Arm and thus -march=armv8-a+ssbs is a AArch64 specific option. "AArch64-specific" LGTM otherwise. Thanks, Kyrill >>> >>> Thanks for the review and sorry for the delay in response. I had edited >>> the language for adding new options in a few other places as well. 
>>> >>> + The Armv8.5-A architecture also adds some security features >>> that are >>> + optional to all older architecture versions. These are also >>> supported now >> >> s/also supported now/now supported/ >> >>> + and only effect the assembler. >> >> s/effect/affect/ >> >>> + >>> + Speculation Barrier instruction through the >>> + -march=armv8-a+sb option. >>> + Execution and Data Prediction Restriction instructions through >>> + the -march=armv8-a+predres option. >>> + Speculative Store Bypass Safe instruction through the >>> + -march=armv8-a+ssbs option. This does not >>> require a >>> + compiler option for Arm and thus >>> -march=armv8-a+ssbs >>> + is an AArch64-specific option. >>> + >>> + >>> >>> AArch64 specific >>> @@ -362,6 +380,23 @@ >>> The default value is 16 (64Kb) and can be changed at configure >>> time using the flag >>> --with-stack-clash-protection-guard-size=12|16. >>> >>> + >>> + The option -msign-return-address= has been >>> deprecated. This >>> + has been replaced by the new -mbranch-protection= >>> option. This >>> + new option can now be used to enable the return address signing >>> as well as >>> + the new Branch Target Identification feature of Armv8.5-A >>> architecture. For >>> + more information on the arguments accepted by this option, >>> please refer to >>> + >> href="https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options";>AArch64-Options. >>> >>> >>> + >>> + The following optional extensions to Armv8.5-A architecture >>> are also >>> + supported now and only effect the assembler. >> >> s/effect/affect/ >> >>> + >>> + Random Number Generation instructions through the >>> + -march=armv8.5-a+rng option. >>> + Memory Tagging Extension through the >>> + -march=armv8.5-a+memtag option. >>> + >>> + >>> >>> Arm specific >> >> Otherwise, OK by me but feel free to wait for people with gooder >> grammar than me to have their say. >> > > Thanks for spotting those. 
So far no one else with gooder grammar has > pointed out anything else. I will commit the patch with the changes you > suggested on Monday if no one else has any other objections. > Committed as 1.56 Thanks Sudi > Thanks > Sudi > >> Thanks, >> James >> >
[committed] sel-sched: remove assert in merge_fences (PR 87273)
Hi, I'm applying this patch on Andrey's behalf to resolve PR 87273 where an overly strict assert fires; Andrey explanation in PR comment #5 provides more detail. Alexander 2019-04-01 Andrey Belevantsev PR rtl-optimization/87273 * sel-sched-ir.c (merge_fences): Remove assert. * gcc.dg/pr87273.c: New test. diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c index 2f76e56ca8e..6dec1beaa04 100644 --- a/gcc/sel-sched-ir.c +++ b/gcc/sel-sched-ir.c @@ -703,11 +703,6 @@ merge_fences (fence_t f, insn_t insn, else if (candidate->src == BLOCK_FOR_INSN (last_scheduled_insn)) { -/* Would be weird if same insn is successor of several fallthrough - edges. */ -gcc_assert (BLOCK_FOR_INSN (insn)->prev_bb -!= BLOCK_FOR_INSN (last_scheduled_insn_old)); - state_free (FENCE_STATE (f)); FENCE_STATE (f) = state; diff --git a/gcc/testsuite/gcc.dg/pr87273.c b/gcc/testsuite/gcc.dg/pr87273.c new file mode 100644 index 000..43662f0241c --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr87273.c @@ -0,0 +1,20 @@ +/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */ +/* { dg-options "-Os -fschedule-insns -fsel-sched-pipelining -fselective-scheduling -fno-ssa-phiopt -fno-tree-loop-im" } */ +/* { dg-additional-options "-march=core2" { target i?86-*-* x86_64-*-* } } */ + +int sd; + +void +w5 (int n4) +{ + long int *vq = (long int *) &n4; + + while (n4 < 1) +{ + int ks; + + ks = !!(n4 + 1) ? ((++sd) == *vq) : 0; + if (ks == 1 / *vq) +*vq *= sd; +} +}
Re: C++ PATCH for c++/89214 - ICE when initializing aggregates with bases
On Mon, Apr 01, 2019 at 10:15:11AM +0200, Andreas Schwab wrote: > On Mär 28 2019, Marek Polacek wrote: > > > Andreas, could you please find out why we're not hitting this code in > > digest_init_r: > > > > 1210 tree elt = CONSTRUCTOR_ELT (stripped_init, 0)->value; > > 1211 if (reference_related_p (type, TREE_TYPE (elt))) > > This is never executed if flag_checking is false, of course. It's certainly wrong for a warning to depend on flag_checking, so this patch corrects it, and I hope will fix the ia64 problem as well. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2019-04-01 Marek Polacek * typeck2.c (digest_init_r): Don't condition the object slicing warning on flag_checking. diff --git gcc/cp/typeck2.c gcc/cp/typeck2.c index fa98b1cb8b5..55b84f043f4 100644 --- gcc/cp/typeck2.c +++ gcc/cp/typeck2.c @@ -1200,8 +1200,7 @@ digest_init_r (tree type, tree init, int nested, int flags, /* "If T is a class type and the initializer list has a single element of type cv U, where U is T or a class derived from T, the object is initialized from that element." */ - if (flag_checking - && cxx_dialect >= cxx11 + if (cxx_dialect >= cxx11 && BRACE_ENCLOSED_INITIALIZER_P (stripped_init) && CONSTRUCTOR_NELTS (stripped_init) == 1 && ((CLASS_TYPE_P (type) && !CLASSTYPE_NON_AGGREGATE (type)) @@ -1228,7 +1227,7 @@ digest_init_r (tree type, tree init, int nested, int flags, "results in object slicing", TREE_TYPE (field))) inform (loc, "remove %<{ }%> around initializer"); } - else + else if (flag_checking) /* We should have fixed this in reshape_init. */ gcc_unreachable (); }
[PATCH, OpenACC, og8, committed] Handle Compute Capability 7.0 in libgomp
I've committed this single-liner (and added a comment) to OG8. This allows detection of Compute Capability 7.0 (Volta) and lets libgomp allocate a more reasonable default gang number for Volta GPUs. Tested without regressions on a powerpc64le-linux system. Note that mainline has different code for doing this task, so this patch doesn't apply there (not needed there). Chung-Lin [og] Handle Compute Capability 7.0 (Volta) libgomp/ * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Handle up to Compute Capability 7.0. diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index c2d3b71..706a36f 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -1273,8 +1273,9 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, dev->register_allocation_unit_size = 256; dev->register_allocation_granularity = 2; } - else if (dev->binary_version <= 62) + else if (dev->binary_version <= 70) { + /* Compute Capability 6.1, 6.2, 7.0 share same parameters. */ dev->register_allocation_unit_size = 256; dev->register_allocation_granularity = 4; }
Re: [PATCH] elf.c: initialize st_mode member
On 4/1/19 12:03 AM, Yu, Mingli wrote:
> Ping.
>
> Thanks,
>
> On 2019年03月19日 16:37, mingli...@windriver.com wrote:
>> From: Mingli Yu
>>
>> Initialize the st_mode member to fix the below
>> build failure when -Og is included in the compiler flags.
>> | ./../../../../../../../../work-shared/gcc-8.3.0-r0/gcc-8.3.0/libsanitizer/libbacktrace/../../libbacktrace/elf.c: In function 'elf_is_symlink':
>> | ../../../../../../../../../work-shared/gcc-8.3.0-r0/gcc-8.3.0/libsanitizer/libbacktrace/../../libbacktrace/elf.c:772:21: error: 'st.st_mode' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>> return S_ISLNK (st.st_mode);

We already went through this with someone else.  Use -O2 rather than
-Og or -O1 when compiling that file.  The warning is a false positive
and the higher optimization levels

jeff
[PATCH] Fix a missed case in predicate analysis of the late uninit pass
Hi! This is a fairly trivial change fixing a false negative in -Wmaybe-uninitialized. I am pretty sure this is simply an overlooked case (is_value_included_in() is not meant to deal with the case where both compare codes are NE_EXPRs, neither does it need to, since their handling is trivial). In a nutshell, what happens is values of v restricted by (v != 2) are incorrectly considered a subset of values of v restricted by (v != 1). As if "v != 2, therefore v != 1". This is by no means a gcc-9 regression; I'll ping the patch once stage4 is over, if needed. This came up when I was experimenting with moving the uninit passes around. On mainline, the late uninit pass runs very late, so reliably triggering the affected path is a bit tricky. So I created a GIMPLE test (it reproduces the behavior precisely, but might be fragile w.r.t. future versions of the textual representation) and then with a hint from Alexander managed to produce a simple C test. [By the way, the first take was to insert an asm with a lot of newlines to prevent the dom pass from rewriting the CFG due to high cost of duplicating instructions. This didn't work out; I think the dom pass does not respect sizes of inline asms. I plan to create a testcase and file a bug later.] I ran regression testing on x86_64-pc-linux-gnu and saw no new regressions modulo a handful of flaky UNRESOLVED tests under gcc.dg/tree-prof. `BOOT_CFLAGS="-O -Wno-error=maybe-uninitialized -Wmaybe-uninitialized" bootstrap` also succeeded producing no new warnings. OK for stage1? gcc/ChangeLog: * tree-ssa-uninit.c (is_pred_expr_subset_of): Correctly handle a case where the operand of both simple predicates is the same, both compare codes are NE_EXPR, and the integer values are different (e.g. neither of the predicate expressions 'v != 1' and 'v != 2' should be considered a subset of the other). gcc/testsuite/ChangeLog: * gcc.dg/uninit-25-gimple.c: New test. * gcc.dg/uninit-25.c: New test. 
--- gcc/testsuite/gcc.dg/uninit-25-gimple.c | 41 + gcc/testsuite/gcc.dg/uninit-25.c| 23 ++ gcc/tree-ssa-uninit.c | 3 ++ 3 files changed, 67 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/uninit-25-gimple.c create mode 100644 gcc/testsuite/gcc.dg/uninit-25.c diff --git a/gcc/testsuite/gcc.dg/uninit-25-gimple.c b/gcc/testsuite/gcc.dg/uninit-25-gimple.c new file mode 100644 index 000..0c0acd6b83e --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-25-gimple.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-fgimple -O -Wmaybe-uninitialized" } */ + +unsigned int __GIMPLE (ssa,startwith("uninit1")) +foo (unsigned int v) +{ + unsigned int undef;/* { dg-warning "may be used uninitialized" } */ + unsigned int _2; + unsigned int _9; + unsigned int _10; + + __BB(2): + if (v_4(D) != 1u) +goto __BB3; + else +goto __BB4; + + __BB(3): + undef_8 = 8u; // 'undef' is defined conditionally (under 'v != 1' predicate) + goto __BB4; + + __BB(4): + // An undef value flows into a phi. + undef_1 = __PHI (__BB2: undef_5(D), __BB3: undef_8); + if (v_4(D) != 2u) // This condition is neither a superset nor a subset of 'v != 1'. +goto __BB5; + else +goto __BB6; + + __BB(5): + _9 = undef_1; // The phi value is used here (under 'v != 2' predicate). + goto __BB7; + + __BB(6): + _10 = v_4(D); + goto __BB7; + + __BB(7): + _2 = __PHI (__BB5: _9, __BB6: _10); + return _2; +} diff --git a/gcc/testsuite/gcc.dg/uninit-25.c b/gcc/testsuite/gcc.dg/uninit-25.c new file mode 100644 index 000..ce3f91c --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-25.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O -Wmaybe-uninitialized" } */ + +extern unsigned bar (void); +extern void quux (void); + +unsigned foo (unsigned v) +{ + unsigned u; + if (v != 1) +u = bar (); + + // Prevent the "dom" pass from changing the CFG layout based on the inference + // 'if (v != 1) is false then (v != 2) is true'. (Now it would have to + // duplicate the loop in order to do so, which is deemed expensive.) 
+ for (int i = 0; i < 10; i++) +quux (); + + if (v != 2) +return u; /* { dg-warning "may be used uninitialized" } */ + + return 0; +} diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c index 55a55a05c96..1de7d21d2eb 100644 --- a/gcc/tree-ssa-uninit.c +++ b/gcc/tree-ssa-uninit.c @@ -1488,6 +1488,9 @@ is_pred_expr_subset_of (pred_info expr1, pred_info expr2) if (expr2.invert) code2 = invert_tree_comparison (code2, false); + if (code1 == NE_EXPR && code2 == NE_EXPR) +return false; + if ((code1 == EQ_EXPR || code1 == BIT_AND_EXPR) && code2 == BIT_AND_EXPR) return (wi::to_wide (expr1.pred_rhs) == (wi::to_wide (expr1.pred_rhs) & wi::to_wide (expr2.pred_rhs))); -- 2.20.1 -- Vlad
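The set relation the patch encodes can be double-checked by brute force over a small domain. The helper below is a throwaway illustration, not GCC's is_value_included_in: the values allowed by `v != a` form a subset of those allowed by `v != b` only when a == b, since for distinct constants (say a = 2, b = 1) the value v = 1 satisfies the first predicate but not the second.

```c
#include <assert.h>
#include <stdbool.h>

/* Is every v in [-8, 8] satisfying (v != a) also satisfying (v != b)?
   For NE predicates over distinct constants this is always false,
   which is exactly the case the patch makes is_pred_expr_subset_of
   reject.  */
static bool
ne_pred_subset_p (int a, int b)
{
  for (int v = -8; v <= 8; v++)
    if (v != a && !(v != b))
      return false;
  return true;
}
```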
Re: [PATCH] Add PSTL internal namespace qualifications
Jonathan Wakely writes: > On 29/03/19 12:12 -0700, Thomas Rodgers wrote: >>Prevent ADL related weirdness. >> >> * include/pstl/algorithm_impl.h: Add namespace qualification. >> * include/pstl/execution_defs.h: Add namespace qualification. >> * include/pstl/execution_impl.h: Add namespace qualification. >> * include/pstl/numeric_impl.h: Add namespace qualification. >> * include/pstl/parallel_backend_tbb.h: Add namespace qualification. >> * include/pstl/unseq_backend_simd.h: Add namespace qualification. >> * include/pstl/parallel_backend_utils.h: Include . > > We shouldn't include in the std::lib, the uses of assert > should be changed to __glibcxx_assert instead (which is enabled by > defining _GLIBCXX_ASSERTIONS). > This has to come through one of the PSTL library configuration macros because the "right assert" upstream won't be __glibcxx_assert. >>@@ -285,7 +286,7 @@ _ForwardIterator2 >> __pattern_walk2_n(_ExecutionPolicy&&, _ForwardIterator1 __first1, _Size n, >> _ForwardIterator2 __first2, _Function f, >> _IsVector is_vector, /*parallel=*/std::false_type) noexcept > > I missed these before, but we have non-uglified 'n' and 'f' and > 'is_vector' here. Almost all uses of "is_vector" are inside a comment > like /*is_vector=*/ but here it's a real parameter name. > > In practice if users do something stupid like > #define n 123 > #define f 456 > then they won't be able to include these headers anyway, because > TBB uses those as parameter names (so users that are using these > headers with TBB can't defines such dumb macros). But when we get a > non-TBB backend that won't be the case, and we should be uglifying all > names declared in our own headers on principle. 
> >> { >>-return __brick_walk2_n(__first1, n, __first2, f, is_vector); >>+return __internal::__brick_walk2_n(__first1, n, __first2, f, is_vector); >> } >> >> template > class _RandomAccessIterator2, >>@@ -294,7 +295,7 @@ _RandomAccessIterator2 >> __pattern_walk2_n(_ExecutionPolicy&& __exec, _RandomAccessIterator1 >> __first1, _Size n, _RandomAccessIterator2 __first2, >> _Function f, _IsVector is_vector, >> /*parallel=*/std::true_type) > > And 'n' and 'f' here. > >> { >>-return __pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, >>__first1 + n, __first2, f, is_vector, >>+return >>__internal::__pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, >>__first1 + n, __first2, f, is_vector, >>std::true_type()); >> } >> > > We can fix the 'n' and 'f' and 'is_vector' names in another patch, for > this one please just remove again and change assert(...) to > __glibcxx_assert(...).
patch for PR89865
The following patch is for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89865 The patch was successfully tested on x86-64, aarch64, and ppc64. The patch decreases scan matches from 82 to 8 on x86-64. I don't know what test pr49095.c should expect. Therefore I don't change the test itself. Committed as rev. 270060 Index: ChangeLog === --- ChangeLog (revision 270059) +++ ChangeLog (working copy) @@ -1,3 +1,9 @@ +2019-04-01 Vladimir Makarov + + PR rtl-optimization/89865 + * ira-costs.c (process_bb_node_for_hard_reg_moves): Skip hard + register if it is a part of small class. + 2019-04-01 Andrey Belevantsev PR rtl-optimization/87273 Index: ira-costs.c === --- ira-costs.c (revision 270004) +++ ira-costs.c (working copy) @@ -2107,6 +2107,13 @@ process_bb_node_for_hard_reg_moves (ira_ } else continue; + if (reg_class_size[(int) REGNO_REG_CLASS (hard_regno)] + == (ira_reg_class_max_nregs + [REGNO_REG_CLASS (hard_regno)][(int) ALLOCNO_MODE(a)])) + /* If the class can provide only one hard reg to the allocno, + we processed the insn record_operand_costs already and we + actually updated the hard reg cost there. */ + continue; rclass = ALLOCNO_CLASS (a); if (! TEST_HARD_REG_BIT (reg_class_contents[rclass], hard_regno)) continue;
[committed] sel-sched: update liveness in redirect_edge_and_branch hooks (PR 86928)
Hi, This patch by Andrey fixes PR 86928 by avoiding leaving basic blocks without computed liveness info. 2019-04-01 Andrey Belevantsev PR rtl-optimization/86928 * sel-sched-ir.c (sel_redirect_edge_and_branch_force): Invoke compute_live if necessary. (sel_redirect_edge_and_branch): Likewise. * gcc.dg/pr86928.c: New test. diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c index e8e508ef692..2f76e56ca8e 100644 --- a/gcc/sel-sched-ir.c +++ b/gcc/sel-sched-ir.c @@ -5642,6 +5642,8 @@ sel_redirect_edge_and_branch_force (edge e, basic_block to) recompute_dominator (CDI_DOMINATORS, to)); set_immediate_dominator (CDI_DOMINATORS, orig_dest, recompute_dominator (CDI_DOMINATORS, orig_dest)); + if (jump && sel_bb_head_p (jump)) +compute_live (jump); } /* A wrapper for redirect_edge_and_branch. Return TRUE if blocks connected by @@ -5702,6 +5704,8 @@ sel_redirect_edge_and_branch (edge e, basic_block to) set_immediate_dominator (CDI_DOMINATORS, orig_dest, recompute_dominator (CDI_DOMINATORS, orig_dest)); } + if (jump && sel_bb_head_p (jump)) +compute_live (jump); return recompute_toporder_p; } diff --git a/gcc/testsuite/gcc.dg/pr86928.c b/gcc/testsuite/gcc.dg/pr86928.c new file mode 100644 index 000..1586a36f945 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr86928.c @@ -0,0 +1,25 @@ +/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -fnon-call-exceptions -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops -fselective-scheduling -fno-if-conversion -fno-tree-dce -w" } */ +/* { dg-additional-options "-m32" { target powerpc*-*-* } } */ + +int kn; + +void +gd (short int sk) +{ + char *as; + + while (sk < 1) +{ + char *ci; + + if (*ci == 0) +*as += ci; + + for (kn = 0; kn < 18; ++kn) +{ +} + + ++sk; +} +}
Re: [PATCH] Updated patch for PR84101
Richard Biener writes: > This is an update from last years attempt to tame down vectorization > when it runs in to ABI inefficiencies at function return. I've > updated the patch to look for multi-reg returns instead of > !vector ones (due to word_mode vectorization) and handle a bit > more returns, esp. the common pattern of > > ret = vector; > D.1234 = ret; > ret = {CLOBBER}; > return D.1234; > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > I know this isn't the ultimate solution but we keep getting > duplicates of the PR. > > Richard. > > 2019-03-28 Richard Biener > > PR tree-optimization/84101 > * tree-vect-stmts.c: Include explow.h for hard_function_value, > regs.h for hard_regno_nregs. > (cfun_returns): New helper. > (vect_model_store_cost): When vectorizing a store to a decl > we return and the function ABI returns in a multi-reg location > account for the possible spilling that will happen. > > * gcc.target/i386/pr84101.c: New testcase. > > Index: gcc/tree-vect-stmts.c > === > --- gcc/tree-vect-stmts.c (revision 269987) > +++ gcc/tree-vect-stmts.c (working copy) > @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. > #include "tree-cfg.h" > #include "tree-ssa-loop-manip.h" > #include "cfgloop.h" > +#include "explow.h" > #include "tree-ssa-loop.h" > #include "tree-scalar-evolution.h" > #include "tree-vectorizer.h" > @@ -52,6 +53,7 @@ along with GCC; see the file COPYING3. > #include "vec-perm-indices.h" > #include "tree-ssa-loop-niter.h" > #include "gimple-fold.h" > +#include "regs.h" > > /* For lang_hooks.types.type_for_mode. */ > #include "langhooks.h" > @@ -948,6 +950,37 @@ vect_model_promotion_demotion_cost (stmt > "prologue_cost = %d .\n", inside_cost, prologue_cost); > } > > +/* Returns true if the current function returns DECL. 
*/ > + > +static bool > +cfun_returns (tree decl) > +{ > + edge_iterator ei; > + edge e; > + FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) > +{ > + greturn *ret = safe_dyn_cast (last_stmt (e->src)); > + if (!ret) > + continue; > + if (gimple_return_retval (ret) == decl) > + return true; > + /* We often end up with an aggregate copy to the result decl, > + handle that case as well. First skip intermediate clobbers > + though. */ > + gimple *def = ret; > + do > + { > + def = SSA_NAME_DEF_STMT (gimple_vuse (def)); > + } > + while (gimple_clobber_p (def)); > + if (is_a (def) > + && gimple_assign_lhs (def) == gimple_return_retval (ret) > + && gimple_assign_rhs1 (def) == decl) > + return true; > +} > + return false; > +} > + > /* Function vect_model_store_cost > > Models cost for stores. In the case of grouped accesses, one access > @@ -1032,6 +1065,37 @@ vect_model_store_cost (stmt_vec_info stm > vec_to_scalar, stmt_info, 0, vect_body); > } > > + /* When vectorizing a store into the function result assign > + a penalty if the function returns in a multi-register location. > + In this case we assume we'll end up with having to spill the > + vector result and do piecewise loads. */ > + tree base = get_base_address (STMT_VINFO_DATA_REF (stmt_info)->ref); > + if (base > + && (TREE_CODE (base) == RESULT_DECL > + || (DECL_P (base) && cfun_returns (base))) > + && !aggregate_value_p (base, cfun->decl)) > +{ > + rtx reg = hard_function_value (TREE_TYPE (base), cfun->decl, 0, 1); > + /* ??? Handle PARALLEL in some way. */ > + if (REG_P (reg)) > + { > + int nregs = hard_regno_nregs (REGNO (reg), GET_MODE (reg)); > + /* Assume that a reg-reg move is possible and cheap, > + do not account for vector to gp register move cost. 
*/ > + if (nregs > 1) Looks OK to me FWIW, but maybe it would be worth having a check like: targetm.secondary_memory_needed (TYPE_MODE (vectype), ALL_REGS, REGNO_REG_CLASS (REGNO (reg))) as well as the above, so that we don't accidentally penalise values that are returned in multiple vector registers. Looks like the i386.c definition will return true in the cases that matter. Thanks, Richard > + { > + /* Spill. */ > + prologue_cost += record_stmt_cost (cost_vec, ncopies, > + vector_store, > + stmt_info, 0, vect_epilogue); > + /* Loads. */ > + prologue_cost += record_stmt_cost (cost_vec, ncopies * nregs, > + scalar_load, > + stmt_info, 0, vect_epilogue); > + } > + } > +} > + >if (dump_enabled_p ()) > dump_printf_loc (MSG_NOTE, v
Re: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64
This is a ping**3 for a patch to fix one of the test failures in PR 87763.
It fixes the gcc.target/aarch64/combine_bfi_1.c failure, but not the other
ones.  Could one of the AArch64 maintainers take a look at it?

This version of the patch was originally submitted on February 11 after
incorporating some changes suggested by Wilco Dijkstra, but I have not
gotten any comments since then despite pinging it.  I updated it to apply
cleanly on ToT but did not make any other changes since the February 11th
version.

If we want to encourage people to fix bugs in the run up to a release it
would help to get feedback on bugfix patches.

Steve Ellcey
sell...@marvell.com

2018-04-01  Steve Ellcey

	PR rtl-optimization/87763
	* config/aarch64/aarch64-protos.h (aarch64_masks_and_shift_for_bfi_p):
	New prototype.
	* config/aarch64/aarch64.c (aarch64_masks_and_shift_for_bfi_p):
	New function.
	* config/aarch64/aarch64.md (*aarch64_bfi5_shift): New instruction.
	(*aarch64_bfi4_noand): Ditto.
	(*aarch64_bfi4_noshift): Ditto.
	(*aarch64_bfi4_noshift_alt): Ditto.

2018-04-01  Steve Ellcey

	PR rtl-optimization/87763
	* gcc.target/aarch64/combine_bfxil.c: Change some bfxil checks
	to bfi.
	* gcc.target/aarch64/combine_bfi_2.c: New test.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index b035e35..b6c0d0a 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -429,6 +429,9 @@ bool aarch64_label_mentioned_p (rtx); void aarch64_declare_function_name (FILE *, const char*, tree); bool aarch64_legitimate_pic_operand_p (rtx); bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx); +bool aarch64_masks_and_shift_for_bfi_p (scalar_int_mode, unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT); bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx); bool aarch64_move_imm (HOST_WIDE_INT, machine_mode); opt_machine_mode aarch64_sve_pred_mode (unsigned int); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index b38505b..3017e99 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9336,6 +9336,35 @@ aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode mode, rtx mask, & ((HOST_WIDE_INT_1U << INTVAL (shft_amnt)) - 1)) == 0; } +/* Return true if the masks and a shift amount from an RTX of the form + ((x & MASK1) | ((y << SHIFT_AMNT) & MASK2)) are valid to combine into + a BFI instruction of mode MODE. See *arch64_bfi patterns. */ + +bool +aarch64_masks_and_shift_for_bfi_p (scalar_int_mode mode, + unsigned HOST_WIDE_INT mask1, + unsigned HOST_WIDE_INT shft_amnt, + unsigned HOST_WIDE_INT mask2) +{ + unsigned HOST_WIDE_INT t; + + /* Verify that there is no overlap in what bits are set in the two masks. */ + if (mask1 != ~mask2) +return false; + + /* Verify that mask2 is not all zeros or ones. */ + if (mask2 == 0 || mask2 == HOST_WIDE_INT_M1U) +return false; + + /* The shift amount should always be less than the mode size. */ + gcc_assert (shft_amnt < GET_MODE_BITSIZE (mode)); + + /* Verify that the mask being shifted is contiguous and would be in the + least significant bits after shifting by shft_amnt. 
*/ + t = mask2 + (HOST_WIDE_INT_1U << shft_amnt); + return (t == (t & -t)); +} + /* Calculate the cost of calculating X, storing it in *COST. Result is true if the total cost of the operation has now been calculated. */ static bool diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 70f0418..baa8983 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -5490,6 +5490,76 @@ [(set_attr "type" "bfm")] ) +;; Match a bfi instruction where the shift of OP3 means that we are +;; actually copying the least significant bits of OP3 into OP0 by way +;; of the AND masks and the IOR instruction. A similar instruction +;; with the two parts of the IOR swapped around was never triggered +;; in a bootstrap build and test of GCC so it was not included. + +(define_insn "*aarch64_bfi5_shift" + [(set (match_operand:GPI 0 "register_operand" "=r") +(ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0") + (match_operand:GPI 2 "const_int_operand" "n")) + (and:GPI (ashift:GPI + (match_operand:GPI 3 "register_operand" "r") + (match_operand:GPI 4 "aarch64_simd_shift_imm_" "n")) + (match_operand:GPI 5 "const_int_operand" "n"] + "aarch64_masks_and_shift_for_bfi_p (mode, UINTVAL (operands[2]), + UINTVAL (operands[4]), + UINTVAL(operands[5]))" + "bfi\t%0, %3, %4, %P5" + [(set_attr "type" "bfm")] +) + +;; Like *aarch64_bfi5_shift but with no and of the ashift because +;; the shift is large
Re: [PATCH PR81740]Enforce dependence check for outer loop vectorization
Richard Biener writes: > On Tue, Mar 26, 2019 at 1:56 AM Richard Sandiford > wrote: >> Based on the "complete unrolling" view, if we number statements as >> (i, n), where i is the outer loop iteration and n is a statement number >> in the (completely unrolled) loop body, then the original scalar code >> executes in lexicographical order while for the vector loop: >> >> (1) (i,n) executes before (i+ix,n+nx) for all ix>=0, nx>=1, regardless of VF >> (2) (i,n) executes before (i+ix,n-nx) for all ix>=VF, nx>=0 >> (well, nx unrestricted, but only nx>=0 is useful given (1)) >> >> So for any kind of dependence between (i,n) and (i+ix,n-nx), ix>=1, nx>=0 >> we need to restrict VF to ix so that (2) ensures the right order. >> This means that the unnormalised distances of interest are: >> >> - (ix, -nx), ix>=1, nx>=0 >> - (-ix, nx), ix>=1, nx>=0 >> >> But the second gets normalised to the first, which is actually useful >> in this case :-). >> >> In terms of the existing code, I think that means we want to change >> the handling of nested statements (only) to: >> >> - ignore DDR_REVERSED_P (ddr) >> - restrict the main dist > 0 case to when the inner distance is <= 0. >> >> This should have the side effect of allowing outer-loop vectorisation for: >> >> void __attribute__ ((noipa)) >> f (int a[][N], int b[restrict]) >> { >> for (int i = N - 1; i-- > 0; ) >> for (int j = 0; j < N - 1; ++j) >> a[j + 1][i] = a[j][i + 1] + b[i]; >> } >> >> At the moment we reject this, but AFAICT it should be OK. >> (We do allow it for s/i + 1/i/, since then the outer distance is 0.) > > Can you file an enhancement request so we don't forget? OK, for the record it's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89908
Re: [PATCH] handle strings as template arguments (PR 47488, 89833, 89876)
On 3/31/19 10:17 PM, Martin Sebor wrote: To fix PR 89833, a P1 regression, the attached patch tries to handle string literals as C++ 2a non-type template arguments by treating them the same as brace enclosed initializer lists (where the latter are handled correctly). The solution makes sure equivalent forms of array initializers such as for char[5]: "\0\1\2" "\0\1\2\0" { 0, 1, 2 } { 0, 1, 2, 0 } { 0, 1, 2, 0, 0 } are treated as the same, both for overloading and mangling. Ditto for the following equivalent forms: "" "\0" "\0\0\0\0" { } { 0 } { 0, 0, 0, 0, 0 } and for these of struct { char a[5], b[5], c[5]; }: { {0,0,0,0,0}, {0,0,0,0}, {0,0,0} } { { 0 }, { } } { "" } Since this is not handled correctly by the current code (see PR 89876 for a test case) the patch also fixes that. I'm not at all confident the handling of classes with user-defined constexpr ctors is 100% correct. (I use triviality to constrain the solution for strings but that was more of an off-the-cuff guess than a carefully considered decision). You could use TYPE_HAS_TRIVIAL_DFLT since we don't care about the triviality of other operations. I wouldn't worry about trying to omit user-defined constexpr ctors. The g++.dg/abi/mangle71.C test is all I've got in terms of verifying it works correctly. I'm quite sure the C++ 2a testing could stand to be beefed up. The patch passes x86_64-linux bootstrap and regression tests. There are a few failures in check-c++-all tests that don't look related to the changes (I see them with an unpatched GCC as well): g++.dg/spellcheck-macro-ordering-2.C g++.dg/cpp0x/auto52.C g++.dg/cpp1y/auto-neg1.C g++.dg/other/crash-9.C You probably need to check zero_init_p to properly handle pointers to data members, where a null value is integer -1; given struct A { int i; }; constexpr A::* pm = &A::i; int A::* pma[] = { pm, pm }; we don't want to discard the initializers because they look like zeros, as then digest_init will add back -1s. 
+ unsigned n = TREE_STRING_LENGTH (value); + const char *str = TREE_STRING_POINTER (value); + /* Count the number of trailing nuls and subtract them from +STRSIZE because they don't need to be mangled. */ + tree strsizenode = TYPE_SIZE_UNIT (TREE_TYPE (value)); + unsigned strsize = tree_to_uhwi (strsizenode); + if (strsize > n) + strsize = n; Why not just use TREE_STRING_LENGTH? Jason
[PATCH] Check avx2_available in check_avx2_available
check_avx2_available should check avx2_available, instead of
avx_available.  Otherwise, check_avx2_available may use the result
cached from check_avx_available.

	PR testsuite/89907
	* lib/target-supports.exp (check_avx2_available): Replace
	avx_available with avx2_available.
---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ac60c1e1567..90efaea804d 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7403,7 +7403,7 @@ proc check_avx_available { } {
 # Return true if we are compiling for AVX2 target.

 proc check_avx2_available { } {
-  if { [check_no_compiler_messages avx_available assembly {
+  if { [check_no_compiler_messages avx2_available assembly {
	#ifndef __AVX2__
	#error unsupported
	#endif
--
2.20.1
[committed] sel-sched: correct reset of reset_sched_cycles_p (PR 85412)
Hi, this patch by Andrey moves an assignment to reset_sched_cycles_p after rev. 259228 placed it incorrectly. 2019-04-01 Andrey Belevantsev PR rtl-optimization/85412 * sel-sched.c (sel_sched_region): Assign reset_sched_cycles_p before sel_sched_region_1, not after. * gcc.dg/pr85412.c: New test. diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c index 338d7c097df..552dd0b9263 100644 --- a/gcc/sel-sched.c +++ b/gcc/sel-sched.c @@ -7650,11 +7650,11 @@ sel_sched_region (int rgn) /* Schedule always selecting the next insn to make the correct data for bundling or other later passes. */ pipelining_p = false; + reset_sched_cycles_p = false; force_next_insn = 1; sel_sched_region_1 (); force_next_insn = 0; } - reset_sched_cycles_p = pipelining_p; sel_region_finish (reset_sched_cycles_p); } diff --git a/gcc/testsuite/gcc.dg/pr85412.c b/gcc/testsuite/gcc.dg/pr85412.c new file mode 100644 index 000..11b8ceccd1e --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr85412.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */ +/* { dg-require-effective-target int128 } */ +/* { dg-options "-O1 -fpeephole2 -fschedule-insns2 -fsel-sched-pipelining -fselective-scheduling2 -ftree-loop-if-convert -fno-if-conversion -fno-move-loop-invariants -fno-split-wide-types -fno-tree-dominator-opts" } */ +/* { dg-additional-options "-march=bonnell" { target x86_64-*-* } } */ + +__int128 jv; + +void +zm (__int128 g9, unsigned short int sm, short int hk) +{ + while (hk < 1) +{ + if (jv == 0) +sm *= g9; + + if (sm < jv) +hk = sm; + + g9 |= sm == hk; +} +}
Re: [PATCH] Check avx2_available in check_avx2_available
On Mon, Apr 1, 2019 at 7:39 PM H.J. Lu wrote:
>
> check_avx2_available should check avx2_available, instead of avx_available.
> Otherwise, check_avx2_available may use result from check_avx_available.
>
> 	PR testsuite/89907
> 	* lib/target-supports.exp (check_avx2_available): Replace
> 	avx_available with avx2_available.

OK.

Thanks,
Uros.

> ---
>  gcc/testsuite/lib/target-supports.exp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index ac60c1e1567..90efaea804d 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7403,7 +7403,7 @@ proc check_avx_available { } {
>  # Return true if we are compiling for AVX2 target.
>
>  proc check_avx2_available { } {
> -  if { [check_no_compiler_messages avx_available assembly {
> +  if { [check_no_compiler_messages avx2_available assembly {
> 	#ifndef __AVX2__
> 	#error unsupported
> 	#endif
> --
> 2.20.1
>
[C++ PATCH] PR c++/86946 - ICE with function call in template argument.
DR 1321 clarified that two dependent names are equivalent if the names are the same, even if the result of name lookup is different. So template argument hashing should treat a lookup set like a plain identifier. Mangling already does. Tested x86_64-pc-linux-gnu, applying to trunk. * pt.c (iterative_hash_template_arg): Hash all overload sets like an identifier. --- gcc/cp/pt.c | 10 +- gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C | 23 +++ gcc/cp/ChangeLog | 7 +++ 3 files changed, 35 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f3faa89f671..ea1976bd0f5 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -1757,6 +1757,11 @@ iterative_hash_template_arg (tree arg, hashval_t val) code = TREE_CODE (arg); tclass = TREE_CODE_CLASS (code); + if (code == FUNCTION_DECL || code == OVERLOAD) +/* DR 1321: Hash a dependent name as the name, not the lookup result. */ +return iterative_hash_template_arg (DECL_NAME (get_first_fn (arg)), + val); + val = iterative_hash_object (code, val); switch (code) @@ -1789,11 +1794,6 @@ iterative_hash_template_arg (tree arg, hashval_t val) val = iterative_hash_template_arg (TREE_VALUE (arg), val); return val; -case OVERLOAD: - for (lkp_iterator iter (arg); iter; ++iter) - val = iterative_hash_template_arg (*iter, val); - return val; - case CONSTRUCTOR: { tree field, value; diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C new file mode 100644 index 000..833ae6fc85c --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C @@ -0,0 +1,23 @@ +// PR c++/86946, DR 1321 +// { dg-do compile { target c++11 } } + +int d(int, int); +template class e {}; +template e d(); +template e d(); + +template constexpr T d2(T, U) { return 42; } +template e d2(); +template e d2(); + +template a d3(a, c); +template e d3(); +template e d3(); + + +int main() +{ + d<1,2,int>(); + d2<1,2,int>(); + d3<1,2,int>(); +} diff --git 
a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog index 8c6e9931db1..805237cb17b 100644 --- a/gcc/cp/ChangeLog +++ b/gcc/cp/ChangeLog @@ -1,3 +1,10 @@ +2019-04-01 Jason Merrill + + PR c++/86946 - ICE with function call in template argument. + DR 1321 + * pt.c (iterative_hash_template_arg): Hash all overload sets like an + identifier. + 2019-03-31 Marek Polacek PR c++/89852 - ICE with C++11 functional cast with { }. base-commit: 9c2fddaf0c9fded1e3788fdf4bc15f2435b84df5 -- 2.20.1
Re: [C++ PATCH] PR c++/86946 - ICE with function call in template argument.
On Mon, Apr 1, 2019 at 3:04 PM Jason Merrill wrote:
>
> DR 1321 clarified that two dependent names are equivalent if the names are
> the same, even if the result of name lookup is different.  So template
> argument hashing should treat a lookup set like a plain identifier.
> Mangling already does.

Actually, let's handle this more like cp_tree_equal does, i.e. only in
the context of a CALL_EXPR.

Attachment: 86946-patch2 (Description: Binary data)
Re: [PATCH, PR d/89823] Update EXCLUDES for updated/removed dmd frontend sources
On Sat, 30 Mar 2019, Iain Buclaw wrote:

> Patch updates the EXCLUDES list for message strings that are now
> picked up from d/dmd after additions/removals that have happened since
> the sources were added.  Regenerated gcc.pot as per the documentation
> at https://gcc.gnu.org/translation.html.
>
> OK for trunk?

OK.
Re: [PATCH] Add PSTL internal namespace qualifications
>> We shouldn't include in the std::lib, the uses of assert >> should be changed to __glibcxx_assert instead (which is enabled by >> defining _GLIBCXX_ASSERTIONS). >> > > This has to come through one of the PSTL library configuration > macros because the "right assert" upstream won't be __glibcxx_assert. There are a number of locations in the upstream PSTL sources where is included and assert() is used. Additionally, the TBB backend uses __TBB_ASSERT(). I'm going to follow up with a separate patch that introduces __PSTL_ASSERT() and makes everything consistent, with __PSTL_ASSERT expanding to __glibcxx_assert in libstdc++. Tom. Thomas Rodgers writes: > Jonathan Wakely writes: > >> On 29/03/19 12:12 -0700, Thomas Rodgers wrote: >>>Prevent ADL related weirdness. >>> >>> * include/pstl/algorithm_impl.h: Add namespace qualification. >>> * include/pstl/execution_defs.h: Add namespace qualification. >>> * include/pstl/execution_impl.h: Add namespace qualification. >>> * include/pstl/numeric_impl.h: Add namespace qualification. >>> * include/pstl/parallel_backend_tbb.h: Add namespace qualification. >>> * include/pstl/unseq_backend_simd.h: Add namespace qualification. >>> * include/pstl/parallel_backend_utils.h: Include . >> >> We shouldn't include in the std::lib, the uses of assert >> should be changed to __glibcxx_assert instead (which is enabled by >> defining _GLIBCXX_ASSERTIONS). >> > > This has to come through one of the PSTL library configuration > macros because the "right assert" upstream won't be __glibcxx_assert. > >>>@@ -285,7 +286,7 @@ _ForwardIterator2 >>> __pattern_walk2_n(_ExecutionPolicy&&, _ForwardIterator1 __first1, _Size n, >>> _ForwardIterator2 __first2, _Function f, >>> _IsVector is_vector, /*parallel=*/std::false_type) >>> noexcept >> >> I missed these before, but we have non-uglified 'n' and 'f' and >> 'is_vector' here. Almost all uses of "is_vector" are inside a comment >> like /*is_vector=*/ but here it's a real parameter name. 
>> >> In practice if users do something stupid like >> #define n 123 >> #define f 456 >> then they won't be able to include these headers anyway, because >> TBB uses those as parameter names (so users that are using these >> headers with TBB can't defines such dumb macros). But when we get a >> non-TBB backend that won't be the case, and we should be uglifying all >> names declared in our own headers on principle. >> >>> { >>>-return __brick_walk2_n(__first1, n, __first2, f, is_vector); >>>+return __internal::__brick_walk2_n(__first1, n, __first2, f, is_vector); >>> } >>> >>> template >> _Size, class _RandomAccessIterator2, >>>@@ -294,7 +295,7 @@ _RandomAccessIterator2 >>> __pattern_walk2_n(_ExecutionPolicy&& __exec, _RandomAccessIterator1 >>> __first1, _Size n, _RandomAccessIterator2 __first2, >>> _Function f, _IsVector is_vector, >>> /*parallel=*/std::true_type) >> >> And 'n' and 'f' here. >> >>> { >>>-return __pattern_walk2(std::forward<_ExecutionPolicy>(__exec), >>>__first1, __first1 + n, __first2, f, is_vector, >>>+return >>>__internal::__pattern_walk2(std::forward<_ExecutionPolicy>(__exec), >>>__first1, __first1 + n, __first2, f, is_vector, >>>std::true_type()); >>> } >>> >> >> We can fix the 'n' and 'f' and 'is_vector' names in another patch, for >> this one please just remove again and change assert(...) to >> __glibcxx_assert(...).
[PATCH] Fix PR 81721: ICE with PCH and Pragma warning and C++ operator
From: Andrew Pinski Hi, The problem here is the token->val.node is not saved over a precompiled header for C++ operator. This can cause an internal compiler error as we tried to print out the spelling of the token as we assumed it was valid. The fix is to have cpp_token_val_index return CPP_TOKEN_FLD_NODE for operator tokens that have NAMED_OP set. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. Thanks, Andrew Pinski libcpp/ChangeLog: * lex.c (cpp_token_val_index ): If tok->flags has NAMED_OP set, then return CPP_TOKEN_FLD_NODE. gcc/testsuite/ChangeLog: * g++.dg/pch/operator-1.C: New testcase. * g++.dg/pch/operator-1.Hs: New file. --- gcc/testsuite/g++.dg/pch/operator-1.C | 2 ++ gcc/testsuite/g++.dg/pch/operator-1.Hs | 9 + libcpp/lex.c | 6 +- 3 files changed, 16 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/pch/operator-1.C create mode 100644 gcc/testsuite/g++.dg/pch/operator-1.Hs diff --git a/gcc/testsuite/g++.dg/pch/operator-1.C b/gcc/testsuite/g++.dg/pch/operator-1.C new file mode 100644 index 000..290b5f7 --- /dev/null +++ b/gcc/testsuite/g++.dg/pch/operator-1.C @@ -0,0 +1,2 @@ +#include "operator-1.H" +int main(void){ major(0);} /* { dg-warning "Did not Work" } */ diff --git a/gcc/testsuite/g++.dg/pch/operator-1.Hs b/gcc/testsuite/g++.dg/pch/operator-1.Hs new file mode 100644 index 000..657dae1 --- /dev/null +++ b/gcc/testsuite/g++.dg/pch/operator-1.Hs @@ -0,0 +1,9 @@ +# define __glibc_macro_warning1(message) _Pragma (#message) +# define __glibc_macro_warning(message) \ + __glibc_macro_warning1 (GCC warning message) +#define __SYSMACROS_DM1(...) 
__glibc_macro_warning (#__VA_ARGS__) + +#define __SYSMACROS_DM(symbol) __SYSMACROS_DM1 \ + (Did not Work) + +# define major(dev) __SYSMACROS_DM (major) (dev+0) diff --git a/libcpp/lex.c b/libcpp/lex.c index eedfcbb..16ded6e 100644 --- a/libcpp/lex.c +++ b/libcpp/lex.c @@ -3756,7 +3756,11 @@ cpp_token_val_index (const cpp_token *tok) case SPELL_LITERAL: return CPP_TOKEN_FLD_STR; case SPELL_OPERATOR: - if (tok->type == CPP_PASTE) + /* Operands which were originally spelled as ident keep around + the node for the exact spelling. */ + if (tok->flags & NAMED_OP) + return CPP_TOKEN_FLD_NODE; + else if (tok->type == CPP_PASTE) return CPP_TOKEN_FLD_TOKEN_NO; else return CPP_TOKEN_FLD_NONE; -- 1.8.3.1
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
>
> Description:
>  * This patch removes -mvsx and -maltivec for go aix/ppc.
>    These options don't seem compatible with Go stack layout.

Can you give some more details on the problem?  gccgo should just be
using the regular stack layout; I can't think of any reason why these
options would not be compatible.

Ian
Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.
On 4/1/19 8:53 PM, Sudakshina Das wrote:
>> This could stand to use a comment, a moment's thinking about the sizes,
>> and to use the existing asm output functions.
>>
>> /* PT_NOTE header: namesz, descsz, type.
>>    namesz = 4 ("GNU\0")
>>    descsz = 12 (see below)
> I was trying out these changes but the descsz of 12 gets rejected by
> readelf.  It hits the following
>
>   unsigned int size = is_32bit_elf ? 4 : 8;
>
>   printf (_(" Properties: "));
>
>   if (pnote->descsz < 8 || (pnote->descsz % size) != 0)
>     {
>       printf (_("\n"), pnote->descsz);
>       return;
>     }

Hmm, interesting.  The docs say that padding is not to be included in
descsz (gabi4.1, page 82).  To my eye this is a bug in binutils, but
perhaps we will have to live with it.

Nick, thoughts?

r~
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
>
> On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> >
> > Description:
> > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > These options don't seem compatible with Go stack layout.
>
> Can you give some more details on the problem?  gccgo should just be
> using the regular stack layout, I can't think of any reason why these
> options would not be compatible.

Some Altivec instructions require stricter alignment and 32 bit AIX
does not impose sufficient alignment in the stack, so GCC Go silently
references the wrong address.

GCC Go should be able to align the stack properly on AIX.  In the
interim, we need to disable generation of Altivec/VSX.

Thanks, David
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 7:06 PM David Edelsohn wrote:
>
> On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
> >
> > On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> > >
> > > Description:
> > > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > > These options don't seem compatible with Go stack layout.
> >
> > Can you give some more details on the problem?  gccgo should just be
> > using the regular stack layout, I can't think of any reason why these
> > options would not be compatible.
>
> Some Altivec instructions require stricter alignment and 32 bit AIX
> does not impose sufficient alignment in the stack, so GCC Go silently
> references the wrong address.
>
> GCC Go should be able to align the stack properly on AIX.  In the
> interim, we need to disable generation of Altivec/VSX.

Thanks.  That makes it seem like more of a general GCC problem than a
gccgo problem, though.  Or does GCC arrange to align the stack in the
main function?

Ian
Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.
On Tue, Apr 2, 2019 at 10:05 AM Richard Henderson wrote:
>
> On 4/1/19 8:53 PM, Sudakshina Das wrote:
> >> This could stand to use a comment, a moment's thinking about the sizes,
> >> and to use the existing asm output functions.
> >>
> >> /* PT_NOTE header: namesz, descsz, type.
> >>    namesz = 4 ("GNU\0")
> >>    descsz = 12 (see below)
> > I was trying out these changes but the descsz of 12 gets rejected by
> > readelf.  It hits the following
> >
> >   unsigned int size = is_32bit_elf ? 4 : 8;
> >
> >   printf (_(" Properties: "));
> >
> >   if (pnote->descsz < 8 || (pnote->descsz % size) != 0)
> >     {
> >       printf (_("\n"), pnote->descsz);
> >       return;
> >     }
>
> Hmm, interesting.  The docs say that padding is not to be included in
> descsz (gabi4.1, page 82).  To my eye this is a bug in binutils, but
> perhaps we will have to live with it.
>
> Nick, thoughts?

descsz is wrong.  From:

https://github.com/hjl-tools/linux-abi/wiki/Linux-Extensions-to-gABI

n_desc  The note descriptor.  The first n_descsz bytes in n_desc is
        the program property array.

The program property array: each array element represents one program
property with type, data size and data.  In 64-bit objects, each
element is an array of 8-byte integers in the format of the target
processor.  In 32-bit objects, each element is an array of 4-byte
integers in the format of the target processor.

-- H.J.
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 10:12 PM Ian Lance Taylor wrote:
>
> On Mon, Apr 1, 2019 at 7:06 PM David Edelsohn wrote:
> >
> > On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
> > >
> > > On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> > > >
> > > > Description:
> > > > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > > > These options don't seem compatible with Go stack layout.
> > >
> > > Can you give some more details on the problem?  gccgo should just be
> > > using the regular stack layout, I can't think of any reason why these
> > > options would not be compatible.
> >
> > Some Altivec instructions require stricter alignment and 32 bit AIX
> > does not impose sufficient alignment in the stack, so GCC Go silently
> > references the wrong address.
> >
> > GCC Go should be able to align the stack properly on AIX.  In the
> > interim, we need to disable generation of Altivec/VSX.
>
> Thanks.  That makes it seem like more of a general GCC problem than a
> gccgo problem, though.  Or does GCC arrange to align the stack in the
> main function?

32 bit AIX doesn't require stack alignment as strict as Altivec
assumes.  GCC believes that the stack alignment is stricter than it
is.

- David
[wwwdocs] gcc-9/changes.html - Mention new OpenRISC backend
Hello,

I was reading through some things and found this was missing.  As
before, I don't seem to have CVS access; if it's OK and someone can
commit it, that would be helpful.

-Stafford

Index: htdocs/gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.56
diff -u -r1.56 changes.html
--- htdocs/gcc-9/changes.html	1 Apr 2019 14:55:53 -	1.56
+++ htdocs/gcc-9/changes.html	2 Apr 2019 03:59:42 -
@@ -827,6 +827,13 @@
 
+OpenRISC
+
+
+A new back end targeting OpenRISC processors has been contributed to GCC.
+
+
+
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 7:28 PM David Edelsohn wrote:
>
> On Mon, Apr 1, 2019 at 10:12 PM Ian Lance Taylor wrote:
> >
> > On Mon, Apr 1, 2019 at 7:06 PM David Edelsohn wrote:
> > >
> > > On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
> > > >
> > > > On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> > > > >
> > > > > Description:
> > > > > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > > > > These options don't seem compatible with Go stack layout.
> > > >
> > > > Can you give some more details on the problem?  gccgo should just be
> > > > using the regular stack layout, I can't think of any reason why these
> > > > options would not be compatible.
> > >
> > > Some Altivec instructions require stricter alignment and 32 bit AIX
> > > does not impose sufficient alignment in the stack, so GCC Go silently
> > > references the wrong address.
> > >
> > > GCC Go should be able to align the stack properly on AIX.  In the
> > > interim, we need to disable generation of Altivec/VSX.
> >
> > Thanks.  That makes it seem like more of a general GCC problem than a
> > gccgo problem, though.  Or does GCC arrange to align the stack in the
> > main function?
>
> 32 bit AIX doesn't require stack alignment as strict as Altivec
> assumes.  GCC believes that the stack alignment is stricter than it
> is.

OK, so that seems like something to address in
gcc/config/rs6000/rs6000.c, not in gcc/go/gospec.c.

Ian