Free more of type decls

2018-10-26 Thread Jan Hubicka
Hi,
this patch frees TREE_TYPE and alignment of TYPE_DECLs and also preserves
only those TYPE_DECL pointers that are actually used to build the ODR type tree.

It reduces the number of TYPE_DECLs streamed from WPA to ltrans to about 20%
and is important for the patch turning types into incomplete types.  Without
this change the TREE_TYPE of a TYPE_DECL would still point back to the complete
type, and duplicating TYPE_DECLs as well is somewhat laborious.
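
As a rough illustration of what consumers have to cope with once the type
pointer of a TYPE_DECL can be cleared, here is a minimal sketch of the guard
pattern.  The `Node` struct and `type_named_by_other_decl` are hypothetical
stand-ins for illustration only, not GCC's tree API:

```cpp
#include <cassert>

// Hypothetical stand-in for a tree node; not GCC's real representation.
struct Node
{
  Node *type = nullptr;   // plays the role of TREE_TYPE; may have been freed
  Node *name = nullptr;   // plays the role of TYPE_NAME on a type node
};

// Same shape as the guarded tests in the patch: only look through the
// type pointer when it is still present.
bool type_named_by_other_decl (const Node *decl)
{
  return decl->type != nullptr && decl->type->name != decl;
}
```

Without the null check, the second operand would dereference a cleared
pointer for every TYPE_DECL whose type was freed.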

Bootstrapped/regtested x86_64-linux, OK?

Honza

* tree-inline.c (remap_decl): Be ready that TREE_TYPE of TYPE_DECL
may be NULL.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree.c (free_lang_data_in_type): Free more type decls.
(free_lang_data_in_decl): Free type and alignment of TYPE_DECL.
Index: tree-inline.c
===
--- tree-inline.c   (revision 265492)
+++ tree-inline.c   (working copy)
@@ -382,7 +382,7 @@ remap_decl (tree decl, copy_body_data *i
  /* Preserve the invariant that DECL_ORIGINAL_TYPE != TREE_TYPE,
 which is enforced in gen_typedef_die when DECL_ABSTRACT_ORIGIN
 is not set on the TYPE_DECL, for example in LTO mode.  */
- if (DECL_ORIGINAL_TYPE (t) == TREE_TYPE (t))
+ if (TREE_TYPE (t) && DECL_ORIGINAL_TYPE (t) == TREE_TYPE (t))
{
  tree x = build_variant_type_copy (TREE_TYPE (t));
  TYPE_STUB_DECL (x) = TYPE_STUB_DECL (TREE_TYPE (t));
Index: tree-pretty-print.c
===
--- tree-pretty-print.c (revision 265492)
+++ tree-pretty-print.c (working copy)
@@ -1896,7 +1896,7 @@ dump_generic_node (pretty_printer *pp, t
}
   if (DECL_NAME (node))
dump_decl_name (pp, node, flags);
-  else if (TYPE_NAME (TREE_TYPE (node)) != node)
+  else if (TREE_TYPE (node) && TYPE_NAME (TREE_TYPE (node)) != node)
{
  pp_string (pp, (TREE_CODE (TREE_TYPE (node)) == UNION_TYPE
  ? "union" : "struct "));
Index: tree.c
===
--- tree.c  (revision 265492)
+++ tree.c  (working copy)
@@ -5174,7 +5174,7 @@ free_lang_data_in_type (tree type)
 
   /* Drop TYPE_DECLs in TYPE_NAME in favor of the identifier in the
  TYPE_DECL if the type doesn't have linkage.  */
-  if (! type_with_linkage_p (type))
+  if (type != TYPE_MAIN_VARIANT (type) || ! type_with_linkage_p (type))
 {
   TYPE_NAME (type) = TYPE_IDENTIFIER (type);
   TYPE_STUB_DECL (type) = NULL;
@@ -5354,6 +5354,8 @@ free_lang_data_in_decl (tree decl)
   DECL_VISIBILITY_SPECIFIED (decl) = 0;
   DECL_INITIAL (decl) = NULL_TREE;
   DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
+  TREE_TYPE (decl) = NULL_TREE;
+  SET_DECL_ALIGN (decl, 0);
 }
   else if (TREE_CODE (decl) == FIELD_DECL)
 DECL_INITIAL (decl) = NULL_TREE;


Re: [PATCH] Make __PRETTY_FUNCTION__-like functions mergeable string csts (PR c++/64266).

2018-10-26 Thread Martin Liška
On 10/24/18 7:24 PM, Jason Merrill wrote:
> On Tue, Oct 23, 2018 at 4:59 AM Martin Liška  wrote:
>> However, I still see some minor ICEs, it's probably related to 
>> decay_conversion in cp_fname_init:
>>
>> 1) ./xg++ -B. 
>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-__func__2.C
>>
>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-__func__2.C:6:17:
>>  internal compiler error: Segmentation fault
>> 6 | [] { return __func__; }();
>>   | ^~~~
>> 0x1344568 crash_signal
>> /home/marxin/Programming/gcc/gcc/toplev.c:325
>> 0x76bc310f ???
>> 
>> /usr/src/debug/glibc-2.27-6.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
>> 0x9db134 is_capture_proxy(tree_node*)

Hi.

> 
> The problem in both tests is that is_capture_proxy thinks your
> __func__ VAR_DECL with DECL_VALUE_EXPR is a capture proxy, since it is
> neither an anonymous union proxy nor a structured binding.

I see; however, I'm a rookie in the area of the C++ FE. Would this problem
with lambdas be solvable?

> 
> The standard says,
> 
> The function-local predefined variable __func__ is defined as if a
> definition of the form
>static const char __func__[] = "function-name ";
> had been provided, where function-name is an implementation-defined
> string. It is unspecified whether such a variable has an address
> distinct from that of any other object in the program.
> 
> So changing the type of __func__ (from array to pointer) still breaks
> conformance.  And we need to keep the type checks from pretty4.C, even
> though the checks for strings being distinct need to go.

I added the following patch, which puts the type back to const char[] (instead
of char *), and I made the variable static. Now I see the pretty4.C testcase
passing again. To be honest, I'm not convinced about the FE changes, so help
would be appreciated.
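
For reference, the array-versus-pointer distinction can be observed from user
code.  This is only a small sketch, not the pretty4.C test itself, and the
function name is made up for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// If __func__ has array type, sizeof yields the string length plus the
// terminating NUL.  If it had decayed to a pointer, sizeof would yield the
// pointer size instead.  The function name is deliberately much longer than
// a pointer so the two cases cannot coincide.
bool func_is_array_sized ()
{
  return sizeof (__func__) == std::strlen (__func__) + 1;
}
```

With a conforming compiler this returns true; a front end that turned
`__func__` into a `const char *` would make it return false.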

Thanks,
Martin

> 
> Jason
> 

From fd2e13b23e07a7b02025432843782e1ab579 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 25 Oct 2018 18:12:10 +0200
Subject: [PATCH] Fix back.

---
 gcc/cp/decl.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 9624df081e4..74ad871b3f4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -4445,15 +4445,13 @@ cp_fname_init (const char* name, tree *type_p)
   type = cp_build_qualified_type (char_type_node, TYPE_QUAL_CONST);
   type = build_cplus_array_type (type, domain);
 
-  *type_p = type_decays_to (type);
+  *type_p = type;
 
   if (init)
 TREE_TYPE (init) = type;
   else
 init = error_mark_node;
 
-  init = decay_conversion (init, tf_warning_or_error);
-
   return init;
 }
 
@@ -4482,6 +4480,7 @@ cp_make_fname_decl (location_t loc, tree id, int type_dep)
   TREE_READONLY (decl) = 1;
   DECL_ARTIFICIAL (decl) = 1;
   DECL_DECLARED_CONSTEXPR_P (decl) = 1;
+  TREE_STATIC (decl) = 1;
 
   TREE_USED (decl) = 1;
 
-- 
2.19.0



Cleanup handling of variants in ipa-devirt

2018-10-26 Thread Jan Hubicka
Hi,
with this patch ipa-devirt no longer needs TYPE_DECLs on type variants.
The basic idea is that anything working with ODR types should work on main
variants only.  For ODR type checking we then have type_variants_equivalent_p,
which double-checks that we have the same qualifiers, alignment and attributes,
provided that the main variants are known to match.
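
The split between the two checks can be pictured with a toy model.  This is a
sketch under simplifying assumptions: the `Type` struct and both helper names
are hypothetical, and attributes are left out entirely:

```cpp
#include <cassert>

// Toy model of a type and its variants; field names are illustrative only.
struct Type
{
  const Type *main_variant;  // points to itself for a main variant
  unsigned quals;            // qualifier bits (const, volatile, ...)
  unsigned align;            // alignment in bits
};

// ODR identity is decided on main variants only.
bool same_for_odr (const Type &a, const Type &b)
{
  return a.main_variant == b.main_variant;
}

// Given matching main variants, double-check what may still differ
// between two variants of the same type.
bool variants_equivalent (const Type &a, const Type &b)
{
  return a.quals == b.quals && a.align == b.align;
}
```

A const-qualified variant therefore counts as the same ODR type as its main
variant while still failing the variant check.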

Bootstrapped/regtested x86_64-linux and also
tested with lto bootstrap.  Will commit it shortly.

Honza

* ipa-devirt.c (warn_odr): Make static.
(types_same_for_odr): Drop strict variant.
(types_odr_comparable): Likewise.
(odr_or_derived_type_p): Look for main variants.
(odr_name_hasher::equal): Cleanup comment.
(odr_subtypes_equivalent): Add warn and warned arguments; check main
variants.
(type_variants_equivalent_p): break out from ...
(odr_types_equivalent): ... here; go for main variants where needed.
(warn_odr): ... here; turn static.
(warn_types_mismatch): Compare mangled names of main variants.
* ipa-utils.h (types_odr_comparable): Drop strict parameter.
(type_with_linkage_p): Sanity check that we look at main variant.
* lto.c (lto_read_decls): Only consider main variant to be ODR type.
* tree.h (types_same_for_odr): Drop strict argument.
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 265492)
+++ ipa-devirt.c(working copy)
@@ -175,6 +175,8 @@ struct default_hash_traits 
 static bool odr_types_equivalent_p (tree, tree, bool, bool *,
hash_set *,
location_t, location_t);
+static void warn_odr (tree t1, tree t2, tree st1, tree st2,
+ bool warn, bool *warned, const char *reason);
 
 static bool odr_violation_reported = false;
 
@@ -381,22 +383,15 @@ odr_vtable_hasher::hash (const odr_type_
 
Until we start streaming mangled type names, this function works
only for polymorphic types.
-
-   When STRICT is true, we compare types by their names for purposes of
-   ODR violation warnings.  When strict is false, we consider variants
-   equivalent, because it is all that matters for devirtualization machinery.
 */
 
 bool
-types_same_for_odr (const_tree type1, const_tree type2, bool strict)
+types_same_for_odr (const_tree type1, const_tree type2)
 {
   gcc_checking_assert (TYPE_P (type1) && TYPE_P (type2));
 
-  if (!strict)
-{
-  type1 = TYPE_MAIN_VARIANT (type1);
-  type2 = TYPE_MAIN_VARIANT (type2);
-}
+  type1 = TYPE_MAIN_VARIANT (type1);
+  type2 = TYPE_MAIN_VARIANT (type2);
 
   if (type1 == type2)
 return true;
@@ -470,17 +465,15 @@ types_same_for_odr (const_tree type1, co
 /* Return true if we can decide on ODR equivalency.
 
In non-LTO it is always decide, in LTO however it depends in the type has
-   ODR info attached.
-
-   When STRICT is false, compare main variants.  */
+   ODR info attached. */
 
 bool
-types_odr_comparable (tree t1, tree t2, bool strict)
+types_odr_comparable (tree t1, tree t2)
 {
   return (!in_lto_p
- || t1 == t2
- || (!strict && TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2))
- || (odr_type_p (t1) && odr_type_p (t2))
+ || TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2)
+ || (odr_type_p (TYPE_MAIN_VARIANT (t1))
+ && odr_type_p (TYPE_MAIN_VARIANT (t2)))
  || (TREE_CODE (t1) == RECORD_TYPE && TREE_CODE (t2) == RECORD_TYPE
  && TYPE_BINFO (t1) && TYPE_BINFO (t2)
  && polymorphic_type_binfo_p (TYPE_BINFO (t1))
@@ -525,7 +518,7 @@ odr_or_derived_type_p (const_tree t)
 {
   do
 {
-  if (odr_type_p (t))
+  if (odr_type_p (TYPE_MAIN_VARIANT (t)))
return true;
   /* Function type is a tricky one. Basically we can consider it
 ODR derived if return type or any of the parameters is.
@@ -540,7 +533,7 @@ odr_or_derived_type_p (const_tree t)
 if (TREE_TYPE (t) && odr_or_derived_type_p (TREE_TYPE (t)))
   return true;
 for (t = TYPE_ARG_TYPES (t); t; t = TREE_CHAIN (t))
-  if (odr_or_derived_type_p (TREE_VALUE (t)))
+  if (odr_or_derived_type_p (TYPE_MAIN_VARIANT (TREE_VALUE (t))))
 return true;
 return false;
   }
@@ -566,8 +559,7 @@ odr_name_hasher::equal (const odr_type_d
 return true;
   if (!in_lto_p)
 return false;
-  /* Check for anonymous namespaces. Those have !TREE_PUBLIC
- on the corresponding TYPE_STUB_DECL.  */
+  /* Check for anonymous namespaces.  */
   if ((type_with_linkage_p (t1) && type_in_anonymous_namespace_p (t1))
   || (type_with_linkage_p (t2) && type_in_anonymous_namespace_p (t2)))
 return false;
@@ -639,10 +631,45 @@ set_type_binfo (tree type, tree binfo)
   gcc_assert (!TYPE_BINFO (type));
 }
 
+/* Return true if type variants match.
+   This assumes that we a

V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

2018-10-26 Thread H.J. Lu
On 10/25/18, Uros Bizjak  wrote:
> On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu  wrote:
>>
>> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
>> a wrong way.  For example:
>>
>> (define_insn "sse4_1_v8qiv8hi2"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> (any_extend:V8HI
>>   (vec_select:V8QI
>> (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
>> (parallel [(const_int 0) (const_int 1)
>>(const_int 2) (const_int 3)
>>(const_int 4) (const_int 5)
>>(const_int 6) (const_int 7)]))))]
>>
>> should be defined for memory operands as:
>>
>> (define_insn "sse4_1_v8qiv8hi2"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> (any_extend:V8HI
>>   (match_operand:V8QI "memory_operand" "m,m,m")))]
>>
>> This set of patches updates them to
>>
>> (define_insn "sse4_1_v8qiv8hi2"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> (any_extend:V8HI
>>   (vec_select:V8QI
>> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
>> (parallel [(const_int 0) (const_int 1)
>>(const_int 2) (const_int 3)
>>(const_int 4) (const_int 5)
>>(const_int 6) (const_int 7)]))))]
>>
>> (define_insn "*sse4_1_v8qiv8hi2_1"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> (any_extend:V8HI
>>   (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
>>
>> with a splitter:
>>
>> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>
> No constraints needed for pre-reload splitter.
>
>> (any_extend:V8HI
>>   (vec_select:V8QI
>> (subreg:V16QI
>>   (vec_concat:V2DI
>> (match_operand:DI 1 "memory_operand" "m,*m,m")
>> (const_int 0)) 0)
>> (parallel [(const_int 0) (const_int 1)
>>(const_int 2) (const_int 3)
>>(const_int 4) (const_int 5)
>>(const_int 6) (const_int 7)]))))]
>>   "TARGET_SSE4_1 &&  &&
>> "
>>   "#"
>>   "&& can_create_pseudo_p ()"
>>   [(set (match_dup 0) (match_dup 1))]
>
>  [(set (match_dup 0)
>   (any_extend:V8HI (match_dup 1)))]
>
>> {
>>   operands[1] = gen_rtx_ (V8HImode,
>> gen_rtx_SUBREG (V8QImode,
>> operands[1], 0));
>> })
>
> Don't create subregs of memory. Use adjust_address_nv.

Here is the updated patch.

-- 
H.J.
From c9d11468bc5e9b71905d17c73d12677097d94e3c Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sat, 15 Sep 2018 20:54:42 -0700
Subject: [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
a wrong way.  For example:

(define_insn "sse4_1_v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
  (vec_select:V8QI
(match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]))))]

should be defined for memory operands as:

(define_insn "sse4_1_v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
  (match_operand:V8QI "memory_operand" "m,m,m")))]

This patch updates them to

(define_insn "sse4_1_v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
  (vec_select:V8QI
(match_operand:V16QI 1 "register_operand" "Yr,*x,v")
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]))))]

(define_insn "*sse4_1_v8qiv8hi2_1"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
  (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]

with a splitter:

(define_insn_and_split "*sse4_1_v8qiv8hi2_2"
  [(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
  (vec_select:V8QI
(subreg:V16QI
  (vec_concat:V2DI
(match_operand:DI 1 "memory_operand")
(const_int 0)) 0)
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]))))]
  "TARGET_SSE4_1 &&  && "
  "#"
  "&& can_create_pseudo_p ()"
  [(set (match_dup 0)
(any_extend:V8HI (match_dup 1)))]
{
  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
})

This patch requires updating apply_subst_iterator to handle
define_insn_and_split.

gcc/

	PR target/87317
	* config/i386/sse.md (sse4_1_v8qiv8hi2): Replace
	nonimmediate_operand with register_operand.
	(avx2_v8q

[PATCH] Fix testsuite issue throwing off TCL

2018-10-26 Thread Richard Biener


Committed.

Richard.

2018-10-26  Richard Biener  

PR testsuite/87754
* g++.dg/lto/odr-1_0.C: Fix pattern.
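
The fix boils down to escaping the regex metacharacter `+` in the expected
message.  The testsuite patterns are Tcl regular expressions; the sketch below
uses `std::regex` purely as a stand-in to show the escaped form matching the
literal text, and `matches_odr_warning` is a made-up helper name:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if LINE contains the ODR warning text.  Each '+' is escaped
// so that it is matched literally instead of acting as a quantifier.
bool matches_odr_warning (const std::string &line)
{
  static const std::regex pattern ("violates the C\\+\\+ One Definition Rule");
  return std::regex_search (line, pattern);
}
```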

Index: gcc/testsuite/g++.dg/lto/odr-1_0.C
===
--- gcc/testsuite/g++.dg/lto/odr-1_0.C  (revision 265516)
+++ gcc/testsuite/g++.dg/lto/odr-1_0.C  (working copy)
@@ -3,6 +3,6 @@
 struct a { // { dg-lto-warning "8: type 'struct a' violates the C\\+\\+ One Definition Rule" }
   struct b *ptr; // { dg-lto-message "13: the first difference of corresponding definitions is field 'ptr'" }
 };
-void test(struct a *) // { dg-lto-warning "6: warning: 'test' violates the C++ One Definition Rule" }
+void test(struct a *) // { dg-lto-warning "6: warning: 'test' violates the C\\+\\+ One Definition Rule" }
 {
 }


[PATCH] Relax hash function to match equals function behavior (PR testsuite/86158).

2018-10-26 Thread Martin Liška
Hi.

The patch aligns the ipa_vr_ggc_hash_traits::hash function with what the actual
ipa_vr_ggc_hash_traits::equals operator does. Currently, the hash function is
pointer-based, while the equals operator internally uses operand_equal_p, which
also returns true for equal constants that live at different addresses.

It's tested on ppcl64-linux-gnu and it's pre-approved by Honza.

Alexander:
Note that I'm planning to come up with an equivalent of qsort_chk for hash
tables. Correct me if I'm wrong, but whenever the equals function returns true,
the two hash values are expected to be equal as well? If so, such a checker
would catch the case I'm patching.
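
To make the invariant concrete, here is a minimal sketch of such a check on a
toy range type.  Everything in it (`Range`, the two hash functions, the
checker) is hypothetical and only mimics the situation; it is not the
ipa-prop.c code nor a proposed qsort_chk equivalent:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy value range whose bounds live behind pointers.
struct Range { const int *min; const int *max; };

// Equality compares the pointed-to values, not the addresses.
bool range_equal (const Range &a, const Range &b)
{
  return *a.min == *b.min && *a.max == *b.max;
}

// Pointer-based hash: equal ranges at different addresses hash differently.
std::size_t hash_by_pointer (const Range &r)
{
  return reinterpret_cast<std::uintptr_t> (r.min) * 31
	 + reinterpret_cast<std::uintptr_t> (r.max);
}

// Value-based hash: consistent with range_equal.
std::size_t hash_by_value (const Range &r)
{
  std::hash<int> h;
  return h (*r.min) * 31 + h (*r.max);
}

// Checks the invariant: equal elements must have equal hash values.
template <typename Hash>
bool hash_consistent_with_equal (const std::vector<Range> &items, Hash hash)
{
  for (std::size_t i = 0; i < items.size (); ++i)
    for (std::size_t j = i + 1; j < items.size (); ++j)
      if (range_equal (items[i], items[j])
	  && hash (items[i]) != hash (items[j]))
	return false;
  return true;
}
```

Running the checker over two equal ranges stored at different addresses flags
the pointer-based hash and accepts the value-based one.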

Thanks,
Martin

gcc/ChangeLog:

2018-10-25  Martin Liska  

PR testsuite/86158
* ipa-prop.c (struct ipa_vr_ggc_hash_traits): Hash with
add_expr and not with pointers.
---
 gcc/ipa-prop.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 1e40997c92c..4bd0b4b4541 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -115,8 +115,8 @@ struct ipa_vr_ggc_hash_traits : public ggc_cache_remove 
 {
   gcc_checking_assert (!p->equiv ());
   inchash::hash hstate (p->kind ());
-  hstate.add_ptr (p->min ());
-  hstate.add_ptr (p->max ());
+  inchash::add_expr (p->min (), hstate);
+  inchash::add_expr (p->max (), hstate);
   return hstate.end ();
 }
   static bool



Re: V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

2018-10-26 Thread Uros Bizjak
On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu  wrote:
>
> On 10/25/18, Uros Bizjak  wrote:
> > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu  wrote:
> >>
> >> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> >> a wrong way.  For example:
> >>
> >> (define_insn "sse4_1_v8qiv8hi2"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >>   (vec_select:V8QI
> >> (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> >> (parallel [(const_int 0) (const_int 1)
> >>(const_int 2) (const_int 3)
> >>(const_int 4) (const_int 5)
> > >>(const_int 6) (const_int 7)]))))]
> > >>
> > >> should be defined for memory operands as:
> >>
> >> (define_insn "sse4_1_v8qiv8hi2"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >>   (match_operand:V8QI "memory_operand" "m,m,m")))]
> >>
> >> This set of patches updates them to
> >>
> >> (define_insn "sse4_1_v8qiv8hi2"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >>   (vec_select:V8QI
> >> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> >> (parallel [(const_int 0) (const_int 1)
> >>(const_int 2) (const_int 3)
> >>(const_int 4) (const_int 5)
> > >>(const_int 6) (const_int 7)]))))]
> >>
> >> (define_insn "*sse4_1_v8qiv8hi2_1"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >> (any_extend:V8HI
> >>   (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
> >>
> >> with a splitter:
> >>
> >> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >
> > No constraints needed for pre-reload splitter.
> >
> >> (any_extend:V8HI
> >>   (vec_select:V8QI
> >> (subreg:V16QI
> >>   (vec_concat:V2DI
> >> (match_operand:DI 1 "memory_operand" "m,*m,m")
> >> (const_int 0)) 0)
> >> (parallel [(const_int 0) (const_int 1)
> >>(const_int 2) (const_int 3)
> >>(const_int 4) (const_int 5)
> > >>(const_int 6) (const_int 7)]))))]
> >>   "TARGET_SSE4_1 &&  &&
> >> "
> >>   "#"
> >>   "&& can_create_pseudo_p ()"
> >>   [(set (match_dup 0) (match_dup 1))]
> >
> >  [(set (match_dup 0)
> >   (any_extend:V8HI (match_dup 1)))]
> >
> >> {
> >>   operands[1] = gen_rtx_ (V8HImode,
> >> gen_rtx_SUBREG (V8QImode,
> >> operands[1], 0));
> >> })
> >
> > Don't create subregs of memory. Use adjust_address_nv.
>
> Here is the updated patch.

> with a splitter:
>
> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>  [(set (match_operand:V8HI 0 "register_operand")
>(any_extend:V8HI
>  (vec_select:V8QI
>(subreg:V16QI
>  (vec_concat:V2DI
>(match_operand:DI 1 "memory_operand")
>(const_int 0)) 0)
>(parallel [(const_int 0) (const_int 1)
>   (const_int 2) (const_int 3)
>   (const_int 4) (const_int 5)
>   (const_int 6) (const_int 7)]))))]
>  "TARGET_SSE4_1 &&  && "
>  "#"
>  "&& can_create_pseudo_p ()"

"can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
should be used for split constraint. Both, insn and splitter are valid
only before reload.

>  [(set (match_dup 0)
>(any_extend:V8HI (match_dup 1)))]
> {
>  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> })

Please use double quotes for one-line preparation statement.

> (any_extend:V4SI
>   (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]

Please remove star in front of memory constraint.

OK with the above changes.

Thanks,
Uros.


Re: [PATCH] Change vectorizer SLP tree to be a graph

2018-10-26 Thread Richard Biener
On Wed, 24 Oct 2018, Richard Biener wrote:

> 
> This does the last step (I've already changed costing, analysis and
> code generation to process nodes as if it were) in making the SLP
> tree a graph.  This means adjusting SLP analysis to lookup already
> identified SLP nodes for a set of scalar stmts and referring to a
> slp_tree from multiple parents.
> 
> This avoids blowing up during analysis and lets us vectorize
> the testcase from PR87105 as well as clang does.
> 
> I'm still fighting with the necessary refcounting, but maybe this
> version did the trick.
> 
> Re-bootstrap & regtest running on x86_64-unknown-linux-gnu, SPEC
> CPU 2006 build is also on the way (though it looks like sth else
> broke stuff there as well).
> 
> Most changes are moving and re-indenting misindented stuff (looks like the
> moving part isn't necessary so I'll edit it out).
> 
> Note while costing and code generation share things cross SLP
> instance the analysis part does not try to do that (I just thought
> that might not be a good idea without re-vamping the whole
> data structure to get rid of the idea of "separate" instances)

The following is what I have applied.

Bootstrapped and tested on x86_64-unknown-linux-gnu.
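
For readers following along, the node sharing described in the quoted text can
be pictured roughly as below.  This is a simplified sketch, not the code from
tree-vect-slp.c: `SlpNode`, `BstMap`, `build_node` and `release_node` are
hypothetical names, scalar stmts are reduced to plain ints, and here a node
removes its own map entry when it dies:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy SLP node: a set of scalar stmts (ints here), children, and a
// reference count because a node may have several parents.
struct SlpNode
{
  std::vector<int> stmts;
  std::vector<SlpNode *> children;
  unsigned refcnt;
};

// Map from a stmt set to the node already built for it.
typedef std::map<std::vector<int>, SlpNode *> BstMap;

// Return the existing node for STMTS if there is one (taking another
// reference), otherwise create it with a reference count of one.
SlpNode *build_node (const std::vector<int> &stmts, BstMap &map)
{
  BstMap::iterator it = map.find (stmts);
  if (it != map.end ())
    {
      ++it->second->refcnt;
      return it->second;
    }
  SlpNode *node = new SlpNode ();
  node->stmts = stmts;
  node->refcnt = 1;
  map[stmts] = node;
  return node;
}

// Drop one reference; free the node and its subgraph once unreferenced.
void release_node (SlpNode *node, BstMap &map)
{
  if (--node->refcnt != 0)
    return;
  map.erase (node->stmts);
  for (SlpNode *child : node->children)
    release_node (child, map);
  delete node;
}
```

Two requests for the same stmt set then yield the same node, which is what
turns the tree into a graph and keeps analysis from blowing up.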

Richard.

2018-10-26  Richard Biener  

PR tree-optimization/87105
* tree-vectorizer.h (_slp_tree::refcnt): New member.
* tree-vect-slp.c (vect_free_slp_tree): Decrement and honor
refcnt.
(vect_create_new_slp_node): Initialize refcnt to one.
(bst_traits): Move.
(scalar_stmts_set_t, bst_fail): Remove.
(vect_build_slp_tree_2): Add bst_map argument and adjust calls.
(vect_build_slp_tree): Add bst_map argument and lookup
already created SLP nodes.
(vect_print_slp_tree): Handle a SLP graph, print SLP node
addresses.
(vect_slp_rearrange_stmts): Handle a SLP graph.
(vect_analyze_slp_instance): Adjust and free SLP nodes from
the CSE map.  Fix indenting.
(vect_schedule_slp_instance): Add short-cut.

* g++.dg/vect/slp-pr87105.cc: Adjust.
* gcc.dg/torture/20181024-1.c: New testcase.
* g++.dg/opt/20181025-1.C: Likewise.

diff --git a/gcc/testsuite/g++.dg/opt/20181025-1.C b/gcc/testsuite/g++.dg/opt/20181025-1.C
new file mode 100644
index 000..43d1614f023
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/20181025-1.C
@@ -0,0 +1,31 @@
+// { dg-do compile }
+// { dg-options "-Ofast" }
+
+template 
+class Vector {
+typedef Number value_type;
+typedef const value_type *const_iterator;
+Number norm_sqr () const;
+const_iterator begin () const;
+unsigned int dim;
+};
+template 
+static inline Number
+local_sqr (const Number x)
+{
+  return x*x;
+}
+template 
+Number
+Vector::norm_sqr () const
+{
+  Number sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
+  const_iterator ptr = begin(), eptr = ptr + (dim/4)*4;
+  while (ptr!=eptr) 
+{
+  sum0 += ::local_sqr(*ptr++);
+  sum1 += ::local_sqr(*ptr++);
+}
+  return sum0+sum1+sum2+sum3;
+}
+template class Vector;
diff --git a/gcc/testsuite/g++.dg/vect/slp-pr87105.cc b/gcc/testsuite/g++.dg/vect/slp-pr87105.cc
index 1023d915201..949b16c848f 100644
--- a/gcc/testsuite/g++.dg/vect/slp-pr87105.cc
+++ b/gcc/testsuite/g++.dg/vect/slp-pr87105.cc
@@ -2,7 +2,7 @@
 // { dg-require-effective-target c++11 }
 // { dg-require-effective-target vect_double }
 // For MIN/MAX recognition
-// { dg-additional-options "-ffast-math -fvect-cost-model" }
+// { dg-additional-options "-ffast-math" }
 
 #include 
 #include 
@@ -99,6 +99,7 @@ void quadBoundingBoxA(const Point bez[3], Box& bBox) noexcept 
{
 
 // We should have if-converted everything down to straight-line code
 // { dg-final { scan-tree-dump-times "" 1 "slp2" } }
-// We fail to elide an earlier store which makes us not handle a later
-// duplicate one for vectorization.
-// { dg-final { scan-tree-dump-times "basic block part vectorized" 1 "slp2" { xfail *-*-* } } }
+// { dg-final { scan-tree-dump-times "basic block part vectorized" 1 "slp2" } }
+// It's a bit awkward to detect that all stores were vectorized but the
+// following more or less does the trick
+// { dg-final { scan-tree-dump "vect_iftmp\[^\r\m\]* = MIN" "slp2" } }
diff --git a/gcc/testsuite/gcc.dg/torture/20181024-1.c b/gcc/testsuite/gcc.dg/torture/20181024-1.c
new file mode 100644
index 000..f2cfe7f6d67
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/20181024-1.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=core-avx2" { target { x86_64-*-* i?86-*-* } } } */
+
+typedef enum {
+ C = 0,   N, S, E, W, T, B,   NE, NW, SE, SW,   NT, NB, ST, SB,   ET, EB, WT, WB,   FLAGS, N_CELL_ENTRIES} CELL_ENTRIES;
+typedef double LBM_Grid[(130)*100*100*N_CELL_ENTRIES];
+void foo( LBM_Grid srcGrid )
+{
+  double ux , uy , uz , rho , ux1, uy1, uz1, rho1, ux2, uy2, uz2, rho2, u2, px, py;
+  in

Re: V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

2018-10-26 Thread Uros Bizjak
On Fri, Oct 26, 2018 at 9:35 AM Uros Bizjak  wrote:
>
> On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu  wrote:
> >
> > On 10/25/18, Uros Bizjak  wrote:
> > > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu  wrote:
> > >>
> > >> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> > >> a wrong way.  For example:
> > >>
> > >> (define_insn "sse4_1_v8qiv8hi2"
> > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > >> (any_extend:V8HI
> > >>   (vec_select:V8QI
> > >> (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> > >> (parallel [(const_int 0) (const_int 1)
> > >>(const_int 2) (const_int 3)
> > >>(const_int 4) (const_int 5)
> > > >>(const_int 6) (const_int 7)]))))]
> > > >>
> > > >> should be defined for memory operands as:
> > >>
> > >> (define_insn "sse4_1_v8qiv8hi2"
> > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > >> (any_extend:V8HI
> > >>   (match_operand:V8QI "memory_operand" "m,m,m")))]
> > >>
> > >> This set of patches updates them to
> > >>
> > >> (define_insn "sse4_1_v8qiv8hi2"
> > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > >> (any_extend:V8HI
> > >>   (vec_select:V8QI
> > >> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> > >> (parallel [(const_int 0) (const_int 1)
> > >>(const_int 2) (const_int 3)
> > >>(const_int 4) (const_int 5)
> > > >>(const_int 6) (const_int 7)]))))]
> > >>
> > >> (define_insn "*sse4_1_v8qiv8hi2_1"
> > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > >> (any_extend:V8HI
> > >>   (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
> > >>
> > >> with a splitter:
> > >>
> > >> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
> > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > >
> > > No constraints needed for pre-reload splitter.
> > >
> > >> (any_extend:V8HI
> > >>   (vec_select:V8QI
> > >> (subreg:V16QI
> > >>   (vec_concat:V2DI
> > >> (match_operand:DI 1 "memory_operand" "m,*m,m")
> > >> (const_int 0)) 0)
> > >> (parallel [(const_int 0) (const_int 1)
> > >>(const_int 2) (const_int 3)
> > >>(const_int 4) (const_int 5)
> > > >>(const_int 6) (const_int 7)]))))]
> > >>   "TARGET_SSE4_1 &&  &&
> > >> "
> > >>   "#"
> > >>   "&& can_create_pseudo_p ()"
> > >>   [(set (match_dup 0) (match_dup 1))]
> > >
> > >  [(set (match_dup 0)
> > >   (any_extend:V8HI (match_dup 1)))]
> > >
> > >> {
> > >>   operands[1] = gen_rtx_ (V8HImode,
> > >> gen_rtx_SUBREG (V8QImode,
> > >> operands[1], 0));
> > >> })
> > >
> > > Don't create subregs of memory. Use adjust_address_nv.
> >
> > Here is the updated patch.
>
> > with a splitter:
> >
> > (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
> >  [(set (match_operand:V8HI 0 "register_operand")
> >(any_extend:V8HI
> >  (vec_select:V8QI
> >(subreg:V16QI
> >  (vec_concat:V2DI
> >(match_operand:DI 1 "memory_operand")
> >(const_int 0)) 0)
> >(parallel [(const_int 0) (const_int 1)
> >   (const_int 2) (const_int 3)
> >   (const_int 4) (const_int 5)
> >   (const_int 6) (const_int 7)]))))]
> >  "TARGET_SSE4_1 &&  && "
> >  "#"
> >  "&& can_create_pseudo_p ()"
>
> "can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
> should be used for split constraint. Both, insn and splitter are valid
> only before reload.
>
> >  [(set (match_dup 0)
> >(any_extend:V8HI (match_dup 1)))]
> > {
> >  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> > })
>
> Please use double quotes for one-line preparation statement.
>
> > (any_extend:V4SI
> >   (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]
>
> Please remove star in front of memory constraint.
>
> OK with the above changes.

Oh, and you should remove "q" and "k" operand modifiers in all old patterns.

Uros.


Re: V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

2018-10-26 Thread Uros Bizjak
On Fri, Oct 26, 2018 at 9:37 AM Uros Bizjak  wrote:
>
> On Fri, Oct 26, 2018 at 9:35 AM Uros Bizjak  wrote:
> >
> > On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu  wrote:
> > >
> > > On 10/25/18, Uros Bizjak  wrote:
> > > > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu  wrote:
> > > >>
> > > >> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> > > >> a wrong way.  For example:
> > > >>
> > > >> (define_insn "sse4_1_v8qiv8hi2"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >> (any_extend:V8HI
> > > >>   (vec_select:V8QI
> > > >> (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> > > >> (parallel [(const_int 0) (const_int 1)
> > > >>(const_int 2) (const_int 3)
> > > >>(const_int 4) (const_int 5)
> > > > >>(const_int 6) (const_int 7)]))))]
> > > > >>
> > > > >> should be defined for memory operands as:
> > > >>
> > > >> (define_insn "sse4_1_v8qiv8hi2"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >> (any_extend:V8HI
> > > >>   (match_operand:V8QI "memory_operand" "m,m,m")))]
> > > >>
> > > >> This set of patches updates them to
> > > >>
> > > >> (define_insn "sse4_1_v8qiv8hi2"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >> (any_extend:V8HI
> > > >>   (vec_select:V8QI
> > > >> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> > > >> (parallel [(const_int 0) (const_int 1)
> > > >>(const_int 2) (const_int 3)
> > > >>(const_int 4) (const_int 5)
> > > > >>(const_int 6) (const_int 7)]))))]
> > > >>
> > > >> (define_insn "*sse4_1_v8qiv8hi2_1"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >> (any_extend:V8HI
> > > >>   (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
> > > >>
> > > >> with a splitter:
> > > >>
> > > >> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >
> > > > No constraints needed for pre-reload splitter.
> > > >
> > > >> (any_extend:V8HI
> > > >>   (vec_select:V8QI
> > > >> (subreg:V16QI
> > > >>   (vec_concat:V2DI
> > > >> (match_operand:DI 1 "memory_operand" "m,*m,m")
> > > >> (const_int 0)) 0)
> > > >> (parallel [(const_int 0) (const_int 1)
> > > >>(const_int 2) (const_int 3)
> > > >>(const_int 4) (const_int 5)
> > > > >>(const_int 6) (const_int 7)]))))]
> > > >>   "TARGET_SSE4_1 &&  &&
> > > >> "
> > > >>   "#"
> > > >>   "&& can_create_pseudo_p ()"
> > > >>   [(set (match_dup 0) (match_dup 1))]
> > > >
> > > >  [(set (match_dup 0)
> > > >   (any_extend:V8HI (match_dup 1)))]
> > > >
> > > >> {
> > > >>   operands[1] = gen_rtx_ (V8HImode,
> > > >> gen_rtx_SUBREG (V8QImode,
> > > >> operands[1], 0));
> > > >> })
> > > >
> > > > Don't create subregs of memory. Use adjust_address_nv.
> > >
> > > Here is the updated patch.
> >
> > > with a splitter:
> > >
> > > (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
> > >  [(set (match_operand:V8HI 0 "register_operand")
> > >(any_extend:V8HI
> > >  (vec_select:V8QI
> > >(subreg:V16QI
> > >  (vec_concat:V2DI
> > >(match_operand:DI 1 "memory_operand")
> > >(const_int 0)) 0)
> > >(parallel [(const_int 0) (const_int 1)
> > >   (const_int 2) (const_int 3)
> > >   (const_int 4) (const_int 5)
> > >   (const_int 6) (const_int 7)]]
> > >  "TARGET_SSE4_1 &&  && "
> > >  "#"
> > >  "&& can_create_pseudo_p ()"
> >
> > "can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
> > should be used for split constraint. Both, insn and splitter are valid
> > only before reload.
> >
> > >  [(set (match_dup 0)
> > >(any_extend:V8HI (match_dup 1)))]
> > > {
> > >  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> > > })
> >
> > Please use double quotes for one-line preparation statement.
> >
> > > (any_extend:V4SI
> > >   (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]
> >
> > Please remove star in front of memory constraint.
> >
> > OK with the above changes.
>
> Oh, and you should remove "q" and "k" operand modifiers in all old patterns.

Well, the new ones, obviously.

Uros.


[PATCH] apply_subst_iterator: Handle define_split/define_insn_and_split

2018-10-26 Thread H.J. Lu
On 10/25/18, Uros Bizjak  wrote:
> On Fri, Oct 26, 2018 at 8:48 AM H.J. Lu  wrote:
>>
>> On 10/25/18, Uros Bizjak  wrote:
>> > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu  wrote:
>> >>
>> >> * read-rtl.c (apply_subst_iterator): Handle
>> >> define_insn_and_split.
>> >> ---
>> >>  gcc/read-rtl.c | 6 --
>> >>  1 file changed, 4 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
>> >> index d698dd4af4d..5957c29671a 100644
>> >> --- a/gcc/read-rtl.c
>> >> +++ b/gcc/read-rtl.c
>> >> @@ -275,9 +275,11 @@ apply_subst_iterator (rtx rt, unsigned int, int
>> >> value)
>> >>if (value == 1)
>> >>  return;
>> >>gcc_assert (GET_CODE (rt) == DEFINE_INSN
>> >> + || GET_CODE (rt) == DEFINE_INSN_AND_SPLIT
>> >>   || GET_CODE (rt) == DEFINE_EXPAND);
>> >
>> > Can we also handle DEFINE_SPLIT here?
>> >
>>
>> Yes, we could if there were a usage for it.  I am reluctant to add
>> something
>> I have no use nor test for.
>
> Just split one define_insn_and_split to define_insn and corresponding
> define_split.
>
> define_insn_and_split is a contraction for the define_insn and
> corresponding define_split, so it looks weird to only handle
> define_insn_and_split without handling define_split.
>

Here is the updated patch to handle define_split.  Tested with

(define_insn "*sse4_1_v8qiv8hi2_2"
  [(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
  (vec_select:V8QI
(subreg:V16QI
  (vec_concat:V2DI
(match_operand:DI 1 "memory_operand")
(const_int 0)) 0)
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]]
  "TARGET_SSE4_1 &&  && "
  "#")

(define_split
  [(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
  (vec_select:V8QI
(subreg:V16QI
  (vec_concat:V2DI
(match_operand:DI 1 "memory_operand")
(const_int 0)) 0)
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]]
  "TARGET_SSE4_1 &&  && 
   && can_create_pseudo_p ()"
  [(set (match_dup 0)
(any_extend:V8HI (match_dup 1)))]
{
  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
})

-- 
H.J.
From b2f58ad0121520619fb342fff93bc55ca88b8c0a Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 25 Oct 2018 15:16:49 -0700
Subject: [PATCH] apply_subst_iterator: Handle
 define_split/define_insn_and_split

	* read-rtl.c (apply_subst_iterator): Handle define_split and
	define_insn_and_split.
---
 gcc/read-rtl.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
index d698dd4af4d..dfe22db 100644
--- a/gcc/read-rtl.c
+++ b/gcc/read-rtl.c
@@ -272,12 +272,15 @@ apply_subst_iterator (rtx rt, unsigned int, int value)
   rtx new_attr;
   rtvec attrs_vec, new_attrs_vec;
   int i;
-  if (value == 1)
+  /* define_split has no attributes.  */
+  if (value == 1 || GET_CODE (rt) == DEFINE_SPLIT)
 return;
   gcc_assert (GET_CODE (rt) == DEFINE_INSN
+	  || GET_CODE (rt) == DEFINE_INSN_AND_SPLIT
 	  || GET_CODE (rt) == DEFINE_EXPAND);
 
-  attrs_vec = XVEC (rt, 4);
+  int attrs = GET_CODE (rt) == DEFINE_INSN_AND_SPLIT ? 7 : 4;
+  attrs_vec = XVEC (rt, attrs);
 
   /* If we've already added attribute 'current_iterator_name', then we
  have nothing to do now.  */
@@ -309,7 +312,7 @@ apply_subst_iterator (rtx rt, unsigned int, int value)
 	  GET_NUM_ELEM (attrs_vec) * sizeof (rtx));
   new_attrs_vec->elem[GET_NUM_ELEM (attrs_vec)] = new_attr;
 }
-  XVEC (rt, 4) = new_attrs_vec;
+  XVEC (rt, attrs) = new_attrs_vec;
 }
 
 /* Map subst-attribute ATTR to subst iterator ITER.  */
-- 
2.17.2
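
The operand-index choice made by the patch above can be sketched in
isolation.  This is only a toy model: the DefKind enum and the
attrs_operand_index function are hypothetical stand-ins, not read-rtl.c
code.  Only the indices 4 and 7 and the "define_split has no attributes"
rule are taken from the patch.

```cpp
#include <optional>

// Hypothetical stand-in for the RTX codes apply_subst_iterator may see.
enum class DefKind { Insn, Expand, InsnAndSplit, Split };

// Operand index of the attribute vector, or nullopt when there is none.
// The values 4 and 7 mirror the patch; the rest is illustrative.
std::optional<int> attrs_operand_index (DefKind kind)
{
  switch (kind)
    {
    case DefKind::Split:
      return std::nullopt;   // define_split has no attributes.
    case DefKind::InsnAndSplit:
      return 7;              // Attributes follow the split operands.
    case DefKind::Insn:
    case DefKind::Expand:
      return 4;
    }
  return std::nullopt;       // Not reached for valid enumerators.
}
```

A caller would return early when the result is empty and otherwise use the
value for both the read and the write of the attribute vector, which is
what the single `attrs` variable in the patch achieves.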



Fix failure with the odr-1.C test

2018-10-26 Thread Jan Hubicka
Hi,
this patch fixes issues with the odr test I made yesterday.  One problem is
that I got the template wrong, which made dg crash, and I did not notice
that the test fails.  The other problem is that it still trips a sanity
check in ipa-devirt.  This is fixed now.  It is a bit surprising that one
can make a C++ non-anonymous type that uses an anonymous type inside and get
to the warning at all.

Bootstrapped/regtested x86_64-linux, re-running with lto-bootstrap.

Honza

* ipa-devirt.c (odr_subtypes_equivalent_p): Fix recursion.
(warn_types_mismatch): Fix walk of DECL_NAME.
(odr_types_equivalent_p): Fix overactive assert.
* lto/lto-symtab.c (lto_symtab_merge_decls_2): Fix extra space.

* g++.dg/lto/odr-1_0.C: Fix template.
* g++.dg/lto/odr-1_1.C: Fix template.
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 265519)
+++ ipa-devirt.c(working copy)
@@ -719,9 +719,10 @@ odr_subtypes_equivalent_p (tree t1, tree
 }
   if (visited->add (pair))
 return true;
-  if (odr_types_equivalent_p (TYPE_MAIN_VARIANT (t1), TYPE_MAIN_VARIANT (t2),
- false, NULL, visited, loc1, loc2)
-  && !type_variants_equivalent_p (t1, t2, warn, warned))
+  if (!odr_types_equivalent_p (TYPE_MAIN_VARIANT (t1), TYPE_MAIN_VARIANT (t2),
+ false, NULL, visited, loc1, loc2))
+return false;
+  if (!type_variants_equivalent_p (t1, t2, warn, warned))
 return false;
   return true;
 }
@@ -1138,7 +1139,7 @@ warn_types_mismatch (tree t1, tree t2, l
   if (TREE_CODE (n1) == TYPE_DECL)
n1 = DECL_NAME (n1);
   if (TREE_CODE (n2) == TYPE_DECL)
-   n1 = DECL_NAME (n2);
+   n2 = DECL_NAME (n2);
   /* Most of the time, the type names will match, do not be unnecesarily
  verbose.  */
   if (IDENTIFIER_POINTER (n1) != IDENTIFIER_POINTER (n2))
@@ -1292,10 +1293,6 @@ odr_types_equivalent_p (tree t1, tree t2
   /* Check first for the obvious case of pointer identity.  */
   if (t1 == t2)
 return true;
-  gcc_assert (!type_with_linkage_p (TYPE_MAIN_VARIANT (t1))
- || !type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t1)));
-  gcc_assert (!type_with_linkage_p (TYPE_MAIN_VARIANT (t2))
- || !type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t2)));
 
   /* Can't be the same type if the types don't have the same code.  */
   if (TREE_CODE (t1) != TREE_CODE (t2))
Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 265517)
+++ lto/lto-symtab.c(working copy)
@@ -698,7 +698,7 @@ lto_symtab_merge_decls_2 (symtab_node *f
  if (level & 2)
diag = warning_at (DECL_SOURCE_LOCATION (decl),
   OPT_Wodr,
-  "%qD violates the C++ One Definition Rule ",
+  "%qD violates the C++ One Definition Rule",
   decl);
  if (!diag && (level & 1))
diag = warning_at (DECL_SOURCE_LOCATION (decl),
Index: testsuite/g++.dg/lto/odr-1_0.C
===
--- testsuite/g++.dg/lto/odr-1_0.C  (revision 265517)
+++ testsuite/g++.dg/lto/odr-1_0.C  (working copy)
@@ -3,6 +3,6 @@
 struct a { // { dg-lto-warning "8: type 'struct a' violates the C\\+\\+ One Definition Rule" }
   struct b *ptr; // { dg-lto-message "13: the first difference of corresponding definitions is field 'ptr'" }
 };
-void test(struct a *) // { dg-lto-warning "6: warning: 'test' violates the C++ One Definition Rule" }
+void test(struct a *)
 {
 }
Index: testsuite/g++.dg/lto/odr-1_1.C
===
--- testsuite/g++.dg/lto/odr-1_1.C  (revision 265517)
+++ testsuite/g++.dg/lto/odr-1_1.C  (working copy)
@@ -4,7 +4,7 @@ namespace {
 struct a {
   struct b *ptr;
 };
-void test(struct a *);
+void test(struct a *); // { dg-lto-warning "6: 'test' violates the C\\+\\+ One Definition Rule" }
 int
 main(void)
 {
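
The effect of the odr_subtypes_equivalent_p hunk above can be seen on plain
booleans.  In this sketch `a` stands for the result of
odr_types_equivalent_p on the main variants and `b` for the result of
type_variants_equivalent_p; the two function names are made up for
illustration.

```cpp
// Old condition: a mismatch is reported only when the main variants are
// equivalent (a) but the variants are not (!b).
bool subtypes_equivalent_old (bool a, bool b)
{
  if (a && !b)
    return false;
  return true;
}

// Fixed condition: either check failing makes the subtypes differ.
bool subtypes_equivalent_new (bool a, bool b)
{
  if (!a)
    return false;
  if (!b)
    return false;
  return true;
}
```

The two versions disagree exactly when `a` is false: the old code then
returned true, i.e. it treated non-equivalent main variants as equivalent.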


Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-26 Thread Martin Liška
On 10/26/18 12:59 AM, Michael Ploujnikov wrote:
> I've taken the advice from a discussion on IRC and re-wrote the patch
> with more uniform function names and using overloading.
> 
> I think this function accomplished the following goals:
>  - remove clone numbering where it's not needed:
>final.c:final_scan_insn_1 and
>symtab.c:simd_symtab_node::noninterposable_alias.
>  - name and document the clone naming API such that future users won't
>accidentally use the numbering when it's not necessary; if you need
>numbering then you need to explicitly ask for it with the right
>function
>  - provide a new function that allows users to specify a clone number
>explicitly as an argument

Hello.

Thanks for reworking that.

> 
> My thoughts for future improvements:
>  - It's a bit unfortunate that lto-partition.c:privatize_symbol_name_1
>has to break the decl abstraction and pass in a string that it
>created into what I would consider the implementation-detail
>function. The best way I can think of to make it uniform with the
>rest of the users is to have it create a new empty decl with
>DECL_ASSEMBLER_NAME set to the new string

It's not nice to create an artificial declaration. Having the string variant
is fine with me.

>  - It's unfortunate that I have to duplicate the separator
>concatenation in the numberless clone_function_name, but I think it
>has to be like that unless ASM_FORMAT_PRIVATE_NAME making the
>number optional.
> 

That's also fine with me. I'm attaching some small nits that I found.
And please reformat the following chunk in the ChangeLog entry:

* cgraph.h (clone_function_name_1): Replaced by new
  clone_function_name_numbered that takes name as string; for
  privatize_symbol_name_1 use only.  (clone_function_name):
  Renamed to clone_function_name_numbered to be explicit about
  numbering.  (clone_function_name): New two-argument function
  that does not number its output.  (clone_function_name): New
  three-argument function that takes a number to append to its
  output.

into:

* cgraph.h (clone_function_name_1): Replaced by new
  clone_function_name_numbered that takes name as string; for
  privatize_symbol_name_1 use only.
  (clone_function_name): Renamed to clone_function_name_numbered
  to be explicit about...

I'm adding Honza to CC, hope he can review it quickly.

Thanks,
Martin

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index c896a5f60cb..9cba4c2c3a9 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -521,6 +521,7 @@ static GTY(()) unsigned int clone_fn_id_num;
then the two argument clone_function_name should be used instead.
Should not be called directly except for by
lto-partition.c:privatize_symbol_name_1.  */
+
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
@@ -532,6 +533,7 @@ clone_function_name_numbered (const char *name, const char *suffix)
assembler name) unspecified number.  If clone numbering is not
needed then the two argument clone_function_name should be used
instead.  */
+
 tree
 clone_function_name_numbered (tree decl, const char *suffix)
 {
@@ -542,8 +544,9 @@ clone_function_name_numbered (tree decl, const char *suffix)
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
from the string SUFFIX, the new name will end with the specified
-   number.  If clone numbering is not needed then the two argument
+   NUMBER.  If clone numbering is not needed then the two argument
clone_function_name should be used instead.  */
+
 tree
 clone_function_name (const char *name, const char *suffix,
 		 unsigned long number)
@@ -559,9 +562,9 @@ clone_function_name (const char *name, const char *suffix,
   return get_identifier (tmp_name);
 }
 
-
 /* Return a new assembler name ending with the string SUFFIX for a
clone of DECL.  */
+
 tree
 clone_function_name (tree decl, const char *suffix)
 {
@@ -581,7 +584,7 @@ clone_function_name (tree decl, const char *suffix)
 			   IDENTIFIER_POINTER (identifier),
 			   separator,
 			   suffix,
-			   (char*)0));
+			   NULL));
   return get_identifier (result);
 }
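
To make the three naming variants discussed in this thread concrete, here
is a toy model using std::string.  It is not the cgraphclones.c
implementation: the function names clone_name and clone_name_numbered are
hypothetical, and the '.' separator is an assumption (the real separator is
target dependent).

```cpp
#include <string>

// Counter for the numbered variant; stands in for clone_fn_id_num.
static unsigned int clone_counter;

// Two-argument form: name plus suffix, no number appended.
std::string clone_name (const std::string &name, const std::string &suffix)
{
  return name + "." + suffix;
}

// Three-argument form: the caller chooses the number explicitly.
std::string clone_name (const std::string &name, const std::string &suffix,
                        unsigned long number)
{
  return clone_name (name, suffix) + "." + std::to_string (number);
}

// Numbered form: appends the next value of the global counter, so two
// calls with the same arguments never produce the same name.
std::string clone_name_numbered (const std::string &name,
                                 const std::string &suffix)
{
  return clone_name (name, suffix, clone_counter++);
}
```

With these definitions a caller that does not need uniqueness gets a stable
name such as "foo.cold", and only callers that explicitly ask for numbering
consume a counter value.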
 


[AArch64] Add Saphira pipeline description.

2018-10-26 Thread Sameera Deshpande
Hi!

Please find attached the patch to add a pipeline description for the
Qualcomm Saphira core.  It is tested with a bootstrap and make check,
with no regressions.

Ok for trunk?

gcc/
Changelog:

2018-10-26 Sameera Deshpande 

* config/aarch64/aarch64-cores.def (saphira): Use saphira pipeline.
* config/aarch64/aarch64.md: Include saphira.md
* config/aarch64/saphira.md: New file for pipeline description.

-- 
- Thanks and regards,
  Sameera D.
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 3d876b8..8e4c646 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -90,7 +90,7 @@ AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2
 /* ARMv8.4-A Architecture Processors.  */
 
 /* Qualcomm ('Q') cores. */
-AARCH64_CORE("saphira", saphira,falkor,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
+AARCH64_CORE("saphira", saphira,saphira,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
 
 /* ARMv8-A big.LITTLE implementations.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a014a01..f951354 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -298,6 +298,7 @@
 (include "../arm/cortex-a57.md")
 (include "../arm/exynos-m1.md")
 (include "falkor.md")
+(include "saphira.md")
 (include "thunderx.md")
 (include "../arm/xgene1.md")
 (include "thunderx2t99.md")
diff --git a/gcc/config/aarch64/saphira.md b/gcc/config/aarch64/saphira.md
new file mode 100644
index 000..bbf1c5c
--- /dev/null
+++ b/gcc/config/aarch64/saphira.md
@@ -0,0 +1,583 @@
+;; Saphira pipeline description
+;; Copyright (C) 2017-2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+(define_automaton "saphira")
+
+;; Complex int instructions (e.g. multiply and divide) execute in the X
+;; pipeline.  Simple int instructions execute in the X, Y, Z and B pipelines.
+
+(define_cpu_unit "saphira_x" "saphira")
+(define_cpu_unit "saphira_y" "saphira")
+
+;; Branches execute in the Z or B pipeline or in one of the int pipelines depending
+;; on how complex it is.  Simple int insns (like movz) can also execute here.
+
+(define_cpu_unit "saphira_z" "saphira")
+(define_cpu_unit "saphira_b" "saphira")
+
+;; Vector and FP insns execute in the VX and VY pipelines.
+
+(define_automaton "saphira_vfp")
+
+(define_cpu_unit "saphira_vx" "saphira_vfp")
+(define_cpu_unit "saphira_vy" "saphira_vfp")
+
+;; Loads execute in the LD pipeline.
+;; Stores execute in the ST pipeline, for address, data, and
+;; vector data.
+
+(define_automaton "saphira_mem")
+
+(define_cpu_unit "saphira_ld" "saphira_mem")
+(define_cpu_unit "saphira_st" "saphira_mem")
+
+;; The GTOV and VTOG pipelines are for general to vector reg moves, and vice
+;; versa.
+
+(define_cpu_unit "saphira_gtov" "saphira")
+(define_cpu_unit "saphira_vtog" "saphira")
+
+;; Common reservation combinations.
+
+(define_reservation "saphira_vxvy" "saphira_vx|saphira_vy")
+(define_reservation "saphira_zb"   "saphira_z|saphira_b")
+(define_reservation "saphira_xyzb" "saphira_x|saphira_y|saphira_z|saphira_b")
+
+;; SIMD Floating-Point Instructions
+
+(define_insn_reservation "saphira_afp_1_vxvy" 1
+  (and (eq_attr "tune" "saphira")
+   (eq_attr "type" "neon_fp_neg_s,neon_fp_neg_d,neon_fp_abs_s,neon_fp_abs_d,neon_fp_neg_s_q,neon_fp_neg_d_q,neon_fp_abs_s_q,neon_fp_abs_d_q"))
+  "saphira_vxvy")
+
+(define_insn_reservation "saphira_afp_2_vxvy" 2
+  (and (eq_attr "tune" "saphira")
+   (eq_attr "type" "neon_fp_minmax_s,neon_fp_minmax_d,neon_fp_reduc_minmax_s,neon_fp_reduc_minmax_d,neon_fp_compare_s,neon_fp_compare_d,neon_fp_round_s,neon_fp_round_d,neon_fp_minmax_s_q,neon_fp_minmax_d_q,neon_fp_compare_s_q,neon_fp_compare_d_q,neon_fp_round_s_q,neon_fp_round_d_q"))
+  "saphira_vxvy")
+
+(define_insn_reservation "saphira_afp_3_vxvy" 3
+  (and (eq_attr "tune" "saphira")
+   (eq_attr "type" "neon_fp_reduc_minmax_s_q,neon_fp_reduc_minmax_d_q,neon_fp_abd_s,neon_fp_abd_d,neon_fp_addsub_s,neon_fp_addsub_d,neon_fp_reduc_add_s,neon_fp_reduc_add_d,neon_fp_abd_s_q,neon_fp_abd_d_q,neon_fp_addsub_s_q,neon_fp_addsub_d_q,neon_fp_reduc_add_s_q,neon_fp_reduc_add_d_q"))
+  "saphira_vxvy")
+
+(

Re: V4 [PATCH] x86: Add pmovzx/pmovsx patterns with memory operands

2018-10-26 Thread H.J. Lu
On 10/26/18, Uros Bizjak  wrote:
> On Fri, Oct 26, 2018 at 9:37 AM Uros Bizjak  wrote:
>>
>> On Fri, Oct 26, 2018 at 9:35 AM Uros Bizjak  wrote:
>> >
>> > On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu  wrote:
>> > >
>> > > On 10/25/18, Uros Bizjak  wrote:
>> > > > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu 
>> > > > wrote:
>> > > >>
>> > > >> Many x86 pmovzx/pmovsx instructions with memory operands are
>> > > >> modeled in
>> > > >> a wrong way.  For example:
>> > > >>
>> > > >> (define_insn "sse4_1_v8qiv8hi2"
>> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> > > >> (any_extend:V8HI
>> > > >>   (vec_select:V8QI
>> > > >> (match_operand:V16QI 1 "nonimmediate_operand"
>> > > >> "Yrm,*xm,vm")
>> > > >> (parallel [(const_int 0) (const_int 1)
>> > > >>(const_int 2) (const_int 3)
>> > > >>(const_int 4) (const_int 5)
>> > > >>(const_int 6) (const_int 7)]]
>> > > >>
>> > > >> should be defined for memory operands as:
>> > > >>
>> > > >> (define_insn "sse4_1_v8qiv8hi2"
>> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> > > >> (any_extend:V8HI
>> > > >>   (match_operand:V8QI "memory_operand" "m,m,m")))]
>> > > >>
>> > > >> This set of patches updates them to
>> > > >>
>> > > >> (define_insn "sse4_1_v8qiv8hi2"
>> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> > > >> (any_extend:V8HI
>> > > >>   (vec_select:V8QI
>> > > >> (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
>> > > >> (parallel [(const_int 0) (const_int 1)
>> > > >>(const_int 2) (const_int 3)
>> > > >>(const_int 4) (const_int 5)
>> > > >>(const_int 6) (const_int 7)]]
>> > > >>
>> > > >> (define_insn "*sse4_1_v8qiv8hi2_1"
>> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> > > >> (any_extend:V8HI
>> > > >>   (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
>> > > >>
>> > > >> with a splitter:
>> > > >>
>> > > >> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
>> > > >
>> > > > No constraints needed for pre-reload splitter.
>> > > >
>> > > >> (any_extend:V8HI
>> > > >>   (vec_select:V8QI
>> > > >> (subreg:V16QI
>> > > >>   (vec_concat:V2DI
>> > > >> (match_operand:DI 1 "memory_operand" "m,*m,m")
>> > > >> (const_int 0)) 0)
>> > > >> (parallel [(const_int 0) (const_int 1)
>> > > >>(const_int 2) (const_int 3)
>> > > >>(const_int 4) (const_int 5)
>> > > >>(const_int 6) (const_int 7)]]
>> > > >>   "TARGET_SSE4_1 &&  &&
>> > > >> "
>> > > >>   "#"
>> > > >>   "&& can_create_pseudo_p ()"
>> > > >>   [(set (match_dup 0) (match_dup 1))]
>> > > >
>> > > >  [(set (match_dup 0)
>> > > >   (any_extend:V8HI (match_dup 1)))]
>> > > >
>> > > >> {
>> > > >>   operands[1] = gen_rtx_ (V8HImode,
>> > > >> gen_rtx_SUBREG (V8QImode,
>> > > >> operands[1], 0));
>> > > >> })
>> > > >
>> > > > Don't create subregs of memory. Use adjust_address_nv.
>> > >
>> > > Here is the updated patch.
>> >
>> > > with a splitter:
>> > >
>> > > (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>> > >  [(set (match_operand:V8HI 0 "register_operand")
>> > >(any_extend:V8HI
>> > >  (vec_select:V8QI
>> > >(subreg:V16QI
>> > >  (vec_concat:V2DI
>> > >(match_operand:DI 1 "memory_operand")
>> > >(const_int 0)) 0)
>> > >(parallel [(const_int 0) (const_int 1)
>> > >   (const_int 2) (const_int 3)
>> > >   (const_int 4) (const_int 5)
>> > >   (const_int 6) (const_int 7)]]
>> > >  "TARGET_SSE4_1 &&  &&
>> > > "
>> > >  "#"
>> > >  "&& can_create_pseudo_p ()"
>> >
>> > "can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
>> > should be used for split constraint. Both, insn and splitter are valid
>> > only before reload.
>> >
>> > >  [(set (match_dup 0)
>> > >(any_extend:V8HI (match_dup 1)))]
>> > > {
>> > >  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
>> > > })
>> >
>> > Please use double quotes for one-line preparation statement.
>> >
>> > > (any_extend:V4SI
>> > >   (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]
>> >
>> > Please remove star in front of memory constraint.
>> >
>> > OK with the above changes.
>>
>> Oh, and you should remove "q" and "k" operand modifiers in all old
>> patterns.
>
> Well, the new ones, obviously.

This is the patch I am going to check in after the apply_subst_iterator
fix is approved.

Thanks.

-- 
H.J.
From 585a4e65822b07c85b83096905d8c6130bf0381a Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sat, 15 Sep 2018 20:54:42 -0700

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-10-26 Thread Richard Biener
On Fri, Oct 26, 2018 at 4:55 AM Jeff Law  wrote:
>
> On 10/25/18 4:33 PM, Kugan Vivekanandarajah wrote:
> > Hi,
> >
> > PR87528 showed a case where libgcc generated popcount is causing
> > regression for Skylake.
> > We also have PR86677 where kernel build is failing because the kernel
> > does not use the libgcc (when backend is not defining popcount
> > pattern).  While I agree that the kernel should implement its own
> > functionality when it is not using the libgcc, I am afraid that the
> > implementation can have the same performance issues reported for
> > Skylake in PR87528.
> >
> > Therefore, I would like to propose that we disable popcount detection
> > when we don't have a pattern for that. The attached patch (based on
> > previous discussions) does this.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu with no new
> > regressions. We need to disable the popcount* testcases. I will have
> > to define a effective_target_with_popcount in
> > gcc/testsuite/lib/target-supports.exp if this patch is OK?
> > Thanks,
> > Kugan
> >
> >
> > gcc/ChangeLog:
> >
> > 2018-10-25  Kugan Vivekanandarajah  
> >
> > * tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN 
> > POPCOUNT
> > as expensive when backend does not define it.
> >
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2018-10-25  Kugan Vivekanandarajah  
> >
> > * gcc.target/aarch64/popcount4.c: New test.
> >
> FWIW, I've been disabling by checking direct_optab_handler elsewhere
> (number_of_iterations_popcount) in my tester.  It may in fact be an old
> patch from you.
>
> Richi argued that it's the kernel team's responsibility to provide a
> popcount since they don't link with libgcc.  And I'm generally in
> agreement with that position, though it does tend to generate some
> friction with the kernel developers.  We also run the real risk of GCC 9
> not being able to build the kernel which, IMHO, would be a disaster from
> a PR standpoint.
>
> I'd like to hear from others here.  I fully realize we're beyond the
> realm of what is strictly technically correct here from a review standpoint.

As said, final value replacement with a library call is probably not wanted
for optimization purposes, so adjusting expression_expensive_p is OK with
me.  It might not fully solve the (non-)issue in case another optimization
pass chooses to materialize the niter computation result.

A few comments on the patch:

+  tree fndecl = get_callee_fndecl (expr);
+
+  if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
+   {
+ combined_fn cfn = as_combined_fn (DECL_FUNCTION_CODE (fndecl));

  combined_fn cfn = gimple_call_combined_fn (expr);
  switch (cfn)
{
...

cfn will be CFN_LAST for a non-builtin/internal call.  I know Richard is mostly
offline but eventually he knows whether there is a better way to query

+   CASE_CFN_POPCOUNT:
+ /* Check if opcode for popcount is available.  */
+ if (optab_handler (popcount_optab,
+TYPE_MODE (TREE_TYPE (CALL_EXPR_ARG (expr, 0))))
+ == CODE_FOR_nothing)
+   return true;

Note that we currently generate builtin calls rather than IFN calls (when a
direct optab is supported).

Another comment on the patch is that you probably have to adjust existing
popcount testcases to add architecture-specific flags enabling support for
the instructions, otherwise you won't see loop replacement.
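
For readers unfamiliar with the transformation, the following is a sketch
of the kind of loop in question, next to the builtin call that final value
replacement would use instead.  The function names are made up for
illustration, and the exact loop shapes that the niter analysis recognizes
are not spelled out in this thread, so treat the loop as an assumed example.

```cpp
// Bit-clearing loop: the number of iterations equals the number of set
// bits in x.
int popcount_loop (unsigned long long x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;   // Clear the lowest set bit.
      ++n;
    }
  return n;
}

// Same result computed with the GCC builtin, which becomes a libgcc call
// when the target has no popcount instruction.
int popcount_builtin (unsigned long long x)
{
  return __builtin_popcountll (x);
}
```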

Also I think that the expression is only expensive (for final value
replacement!) if you consider optimizing for speed.  When optimizing for
size, getting rid of the loop is probably beneficial unconditionally.  That
would leave the possibility to switch said testcases to -Os.  It would
require adding a bool size_p flag to expression_expensive_p and passing down
optimize_loop_for_size_p ().
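
The suggested decision can be written down as a small truth function.  This
is a sketch of the proposal only, not the tree-scalar-evolution.c code; the
function name and both parameter names are hypothetical.

```cpp
// Returns true when replacing the loop by a popcount call should be
// treated as expensive.
//   target_has_popcount_insn: the backend provides a popcount pattern.
//   size_p: the loop is being optimized for size.
bool popcount_call_expensive_p (bool target_has_popcount_insn, bool size_p)
{
  if (target_has_popcount_insn)
    return false;    // A single instruction is never expensive here.
  return !size_p;    // A library call is acceptable only for size.
}
```

Under this rule the existing testcases could keep exercising the
replacement on targets without a popcount instruction by compiling with
-Os.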

_NOTE_ that expression_expensive_p is also used by IVOPTs, and there
replacing something with an expression based on the niter analysis result
doesn't mean we get rid of the loop (but only of an IV), so maybe that
reasoning doesn't apply there.

Richard.

> Jeff
>


Re: [PATCH][GCC][mingw-w64][Ada] Fix Ada native bootstrap (PR81878).

2018-10-26 Thread Arnaud Charlet
> Bootstrapped on x86_64-pc-linux-gnu and mingw-w64-x86_64.
> 
> Ok for trunk?

OK, thanks.


Re: [C++ Patch] PR 84644 ("internal compiler error: in warn_misplaced_attr_for_class_type, at cp/decl.c:4718")

2018-10-26 Thread Paolo Carlini

Hi,

On 24/10/18 22:41, Jason Merrill wrote:

> On 10/15/18 12:45 PM, Paolo Carlini wrote:
>>     && ((TREE_CODE (declspecs->type) != TYPENAME_TYPE
>> +   && TREE_CODE (declspecs->type) != DECLTYPE_TYPE
>>  && MAYBE_CLASS_TYPE_P (declspecs->type))
>
> I would think that the MAYBE_CLASS_TYPE_P here should be CLASS_TYPE_P,
> and then we can remove the TYPENAME_TYPE check.  Or do we want to
> allow template type parameters for some reason?


Indeed, it would be nice to just use OVERLOAD_TYPE_P. However it seems 
we at least want to let through TEMPLATE_TYPE_PARMs representing 'auto' 
- otherwise Dodji's check a few lines below which fixed c++/51473 
doesn't work anymore - and also BOUND_TEMPLATE_TEMPLATE_PARM, otherwise 
we regress on template/spec32.C and template/ttp22.C because we don't 
diagnose the shadowing anymore. Thus, I would say either we keep on 
using MAYBE_CLASS_TYPE_P or we pick what we need, possibly we add a comment?


Thanks, Paolo.



Re: Free more of type decls

2018-10-26 Thread Richard Biener
On Fri, Oct 26, 2018 at 9:12 AM Jan Hubicka  wrote:
>
> Hi,
> this patch frees TYPE_DECL and alignment from TYPE_DECL and also preserves
> only those TYPE_DECL pointers that are actually used to build ODR type tree.
>
> It reduces number of TYPE_DECLs streamed from WPA to ltrans to about 20%
> and is important for the patch turning types to incomplete types.  Without
> this change the TREE_TYPE of TYPE_DECL would still point back to complete type
> and duplicating TYPE_DECLs as well is somewhat laborious.

So the following is the really important hunk, correct?

> @@ -5174,7 +5174,7 @@ free_lang_data_in_type (tree type)
>
>/* Drop TYPE_DECLs in TYPE_NAME in favor of the identifier in the
>   TYPE_DECL if the type doesn't have linkage.  */
> -  if (! type_with_linkage_p (type))
> +  if (type != TYPE_MAIN_VARIANT (type) || ! type_with_linkage_p (type))
>  {
>TYPE_NAME (type) = TYPE_IDENTIFIER (type);
>TYPE_STUB_DECL (type) = NULL;

Can you explain why you "free" alignment of TYPE_DECLs?  It's just some
bits...  does the FE somehow re-use those for sth else?  I wouldn't have
expected those to be set to anything meaningful.

I'm not too comfortable with setting TREE_TYPE of a TYPE_DECL to NULL.
Can we instead use void_type_node?  The tree-inline.c hunk should
probably test whether DECL_ORIGINAL_TYPE is non-null instead.
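
To restate what the free_lang_data_in_type hunk quoted above decides, here
is a toy model of the new condition.  The Type struct and the function name
are invented for this sketch; only the shape of the condition comes from
the patch.

```cpp
// Toy stand-in for a type node: a pointer to its main variant and a flag
// standing in for type_with_linkage_p.
struct Type
{
  const Type *main_variant;
  bool has_linkage;
};

// True when TYPE_NAME would be replaced by the bare identifier, i.e. the
// TYPE_DECL is dropped.  Only main variants with linkage keep it.
bool drops_type_decl (const Type &type)
{
  return &type != type.main_variant || !type.has_linkage;
}
```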

Richard.

> Bootstrapped/regtested x86_64-linux, OK?
>
> Honza
>
> * tree-inline.c (remap_decl): Be ready that TREE_TYPE of TYPE_DECL
> may be NULL.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * tree.c (free_lang_data_in_type): Free more type decls.
> (free_lang_data_in_decl): Free type and alignment of TYPE_DECL.
> Index: tree-inline.c
> ===
> --- tree-inline.c   (revision 265492)
> +++ tree-inline.c   (working copy)
> @@ -382,7 +382,7 @@ remap_decl (tree decl, copy_body_data *i
>   /* Preserve the invariant that DECL_ORIGINAL_TYPE != TREE_TYPE,
>  which is enforced in gen_typedef_die when DECL_ABSTRACT_ORIGIN
>  is not set on the TYPE_DECL, for example in LTO mode.  */
> - if (DECL_ORIGINAL_TYPE (t) == TREE_TYPE (t))
> + if (TREE_TYPE (t) && DECL_ORIGINAL_TYPE (t) == TREE_TYPE (t))
> {
>   tree x = build_variant_type_copy (TREE_TYPE (t));
>   TYPE_STUB_DECL (x) = TYPE_STUB_DECL (TREE_TYPE (t));
> Index: tree-pretty-print.c
> ===
> --- tree-pretty-print.c (revision 265492)
> +++ tree-pretty-print.c (working copy)
> @@ -1896,7 +1896,7 @@ dump_generic_node (pretty_printer *pp, t
> }
>if (DECL_NAME (node))
> dump_decl_name (pp, node, flags);
> -  else if (TYPE_NAME (TREE_TYPE (node)) != node)
> +  else if (TREE_TYPE (node) && TYPE_NAME (TREE_TYPE (node)) != node)
> {
>   pp_string (pp, (TREE_CODE (TREE_TYPE (node)) == UNION_TYPE
>   ? "union" : "struct "));
> Index: tree.c
> ===
> --- tree.c  (revision 265492)
> +++ tree.c  (working copy)
> @@ -5174,7 +5174,7 @@ free_lang_data_in_type (tree type)
>
>/* Drop TYPE_DECLs in TYPE_NAME in favor of the identifier in the
>   TYPE_DECL if the type doesn't have linkage.  */
> -  if (! type_with_linkage_p (type))
> +  if (type != TYPE_MAIN_VARIANT (type) || ! type_with_linkage_p (type))
>  {
>TYPE_NAME (type) = TYPE_IDENTIFIER (type);
>TYPE_STUB_DECL (type) = NULL;
> @@ -5354,6 +5354,8 @@ free_lang_data_in_decl (tree decl)
>DECL_VISIBILITY_SPECIFIED (decl) = 0;
>DECL_INITIAL (decl) = NULL_TREE;
>DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
> +  TREE_TYPE (decl) = NULL_TREE;
> +  SET_DECL_ALIGN (decl, 0);
>  }
>else if (TREE_CODE (decl) == FIELD_DECL)
>  DECL_INITIAL (decl) = NULL_TREE;


Re: [PATCH] Relax hash function to match equals function behavior (PR testsuite/86158).

2018-10-26 Thread Richard Biener
On Fri, Oct 26, 2018 at 9:20 AM Martin Liška  wrote:
>
> Hi.
>
> The patch aligns the ipa_vr_ggc_hash_traits::hash function with what the
> actual ipa_vr_ggc_hash_traits::equals operator does.  Currently, the hash
> function is pointer based, while the real equals operator internally uses
> operand_equal_p, which works fine for equal constants (with different
> addresses).
>
> It's tested on ppcl64-linux-gnu and it's pre-approved by Honza.
>
> Alexander:
> Note that I'm planning to come up with an equivalent of qsort_chk for hash 
> tables.
> Correct me if I'm wrong but it's expected that when equals function returns 
> true
> to have equal hash values as well? If so, that would catch this case I'm 
> patching.

Yes.  See for example operand_equal_p () where we do such a check.
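
A minimal sketch of the invariant, using a toy value range whose bounds are
pointers to constants.  The Range struct and all three functions are
invented for illustration and are not the ipa-prop.c code.

```cpp
#include <cstddef>
#include <functional>

// Toy value range: the bounds are pointers to constants, so two ranges can
// hold equal values at different addresses.
struct Range
{
  const int *min;
  const int *max;
};

// Value-based equality, in the spirit of comparing with operand_equal_p.
bool range_equal (const Range &a, const Range &b)
{
  return *a.min == *b.min && *a.max == *b.max;
}

// Pointer-based hash: equal ranges may hash differently, which breaks the
// hash table invariant.
std::size_t hash_by_pointer (const Range &r)
{
  std::hash<const int *> h;
  return h (r.min) * 31 + h (r.max);
}

// Value-based hash: whenever range_equal returns true, the hashes match.
std::size_t hash_by_value (const Range &r)
{
  std::hash<int> h;
  return h (*r.min) * 31 + h (*r.max);
}
```

A checker of the kind Martin describes would compare the hashes of any two
entries that the equality function reports as equal and flag a mismatch.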

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-10-25  Martin Liska  
>
> PR testsuite/86158
> * ipa-prop.c (struct ipa_vr_ggc_hash_traits): Hash with
> addr_expr and not with pointers.
> ---
>  gcc/ipa-prop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
>


Re: Is the D frontend good to go? (was Re: [PATCH 02/14] Add D frontend (GDC) implementation.)

2018-10-26 Thread Richard Biener
On Thu, Oct 25, 2018 at 4:13 PM Iain Buclaw  wrote:
>
> On Thu, 25 Oct 2018 at 15:06, David Malcolm  wrote:
> >
> > On Tue, 2018-10-23 at 19:21 +0200, Iain Buclaw wrote:
> > > On Tue, 23 Oct 2018 at 15:48, Richard Sandiford
> > >  wrote:
> > > >
> > > > Iain Buclaw  writes:
> > > > > I'm just going to post the diff since the original here, just to
> > > > > show
> > > > > what's been done since review comments.
> > > > >
> > > > > I think I've covered all that's been addressed, except for the
> > > > > couple
> > > > > of notes about the quadratic parts (though I think one of them is
> > > > > actually O(N^2)).  I've raised bug reports on improving them
> > > > > later.
> > > > >
> > > > > I've also rebased them against trunk, so there's a couple new
> > > > > things
> > > > > present that are just to support build.
> > > >
> > > > Thanks, this is OK when the frontend is accepted in principle
> > > > (can't remember where things stand with that).
> > > >
> > >
> > > As discussed, the front-end has already been approved by the SC.
> > >
> > > I'm not sure if there's anything else further required, or if any
> > > final review needs to be done.
> > >
> > > Thanks.
> >
> > I'm wondering what the state of this is [1]
> >
> > Iain: are all of the patches individually approved, with the necessary
> > issues fixed?
> >
>
> I've posted diffs a few days back that cover all requested changes.
>
> > IIRC, the front-end as a whole was approved, pending approval of all of
> > the individual patches (URL?).  If that's done, then presumably this is
> > good to go in - unless there was still some license discussion pending?
> >
>
> I have the responses to each patch open in tabs; from what I see, they
> have all been OK'd.
>
> 02: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01432.html
> 03: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00734.html
> 04: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00928.html
> 05: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00592.html
> 06: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00609.html
> 07: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00955.html
> 08: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01270.html
> 09: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01264.html
> 10: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01269.html
> 12: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00735.html
> 14: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00970.html
>
> 1, 11, and 13 are DMD, Druntime, and Phobos, which are mirrored from
> upstream dlang repositories.  I spoke with Richard Stallman a couple
> days after this year's GNU Cauldron, and he said there is no problem
> with regards to their license.
>
> > I take it that you've already got your contributor paperwork in place,
> > right?  I see from your maintainers commit that you presumably have svn
> > access.
> >
> > I'm not a global reviewer or steering committee member though; would be
> > nice to get a "go for it" from one of those.  Richard is a global
> > reviewer.
> >
>
> Yes, I was going to wait a couple days to make sure that there's no
> objection, before pressing on with it.
>
> Having a "go for it" from one of the reviewers would be nice though.
>
> > I'm not sure if it should be one big mega-commit, or split out the same
> > way you split things out for review.
> >
>
> I think splitting makes sense, though not necessarily in 14 pieces,
> there are only a few distinct parts.
>
> - D language front-end.
> - D standard and runtime libraries.
> - D language testsuite
> - D language support in GCC proper
> - D language support in GCC targets
> - Toplevel configure/makefile patches that add front-end and library
> to the build
>
> The first three can be squashed into one commit, as it's only adding new 
> files.

If you make sure each individual commit still builds, splitting is OK, though
technically I see no need for splitting.

Go for it!

Richard.

> > Thanks for all your work on this
> > Dave
> >
> > [1] I've been checking the git mirror every few hours to look for a
> > massive commit from you, if I'm honest :)
>
> Oops, I didn't realise there were some who are so eager. :-)
>
> --
> Iain


Re: [PATCH 0/7] libsanitizer: merge from trunk

2018-10-26 Thread Jakub Jelinek
On Thu, Oct 25, 2018 at 12:49:42PM +0200, Jakub Jelinek wrote:
> On Thu, Oct 25, 2018 at 12:15:46PM +0200, marxin wrote:
> > I've just finished my first merge from libsanitizer mainline. Overall it
> > looks fine, apparently ABI hasn't changed and so that SONAME bump is not
> > needed.
> 
> Given the 6/7 patch, I think you need to bump libasan soname (it would be
> weird to bump it on powerpc64* only).

BTW, how can shadow offset be 1UL<<44 on powerpc64?  That seems like they
don't want to support anything but very recent kernels.
E.g. looking at Linux 3.4 arch/powerpc/include/asm/processor.h
I see
/* 64-bit user address space is 44-bits (16TB user VM) */
#define TASK_SIZE_USER64 (0x0000100000000000UL)
so, the new choice must be incompatible with lots of kernels out there.
More recent kernels have:
#define TASK_SIZE_64TB  (0x0000400000000000UL)
#define TASK_SIZE_128TB (0x0000800000000000UL)
#define TASK_SIZE_512TB (0x0002000000000000UL)
#define TASK_SIZE_1PB   (0x0004000000000000UL)
#define TASK_SIZE_2PB   (0x0008000000000000UL)
#define TASK_SIZE_4PB   (0x0010000000000000UL)
but 4.15 still tops at 512TB, 4.10 has just 64TB as the only choice, 3.8 as
well.
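
The arithmetic behind this concern can be written down as a small sketch.
The constant and function names are hypothetical, and the limits are
computed from the sizes named above (16TB, 64TB, 512TB) rather than copied
from any kernel header.  It only checks a necessary condition: the shadow
offset has to lie strictly below the top of the user address space.

```cpp
#include <cassert>
#include <cstdint>

// One terabyte, used to express the address-space limits named in the mail.
constexpr std::uint64_t kTB = std::uint64_t(1) << 40;

// User address-space sizes (hypothetical names).
constexpr std::uint64_t kTaskSize16TB = 16 * kTB;    // 44-bit user VM
constexpr std::uint64_t kTaskSize64TB = 64 * kTB;    // 46-bit user VM
constexpr std::uint64_t kTaskSize512TB = 512 * kTB;  // 49-bit user VM

// The shadow offset under discussion.
constexpr std::uint64_t kShadowOffset = std::uint64_t(1) << 44;

// Necessary (not sufficient) condition for mapping anything at the shadow
// offset: the offset must be a valid user address.
constexpr bool shadow_offset_fits(std::uint64_t task_size) {
  return kShadowOffset < task_size;
}

// 1UL<<44 is exactly the 16TB limit, so it is the first address that does
// not exist in a 44-bit user address space.
static_assert(kShadowOffset == kTaskSize16TB,
              "1UL<<44 equals the 16TB user address-space limit");
```

So with a 16TB limit the offset cannot be mapped at all, while it fits
below the 64TB and 512TB limits.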

CCing Bill as he made this change.

Jakub


Re: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Ville Voutilainen
On Fri, 26 Oct 2018 at 01:42, Marc Glisse  wrote:
>
> On Fri, 26 Oct 2018, Ville Voutilainen wrote:
>
> > I would rather not introduce a behavioral difference between us and
> > libc++.
>
> Why not? There are already several, and it helps find bugs. Maybe you
> could convince libc++ to change as well if you want to keep the behavior
> the same?

What bugs?

> > It does slightly concern me that some users might
> > actually semantically expect a moved-from string to be empty, even
> > though that's not guaranteed, although for non-SSO cases
> > it *is* guaranteed.
>
> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
> 5, short string so there is no allocation).

Sigh. Apparently it isn't, because the standard doesn't bother placing
complexity requirements on string constructors. Even so, I'd prefer string
acting like vector, so that it will leave the source of a move in an empty
state, rather than an unspecified state. Despite the standard not requiring
that, it's more useful programmatically to have the empty state than the
unspecified state, especially when the state is empty in some cases anyway.


Re: Free more of type decls

2018-10-26 Thread Jan Hubicka
> On Fri, Oct 26, 2018 at 9:12 AM Jan Hubicka  wrote:
> >
> > Hi,
> > this patch frees TYPE_DECL and alignment from TYPE_DECL and also preserves
> > only those TYPE_DECL pointers that are actually used to build ODR type tree.
> >
> > It reduces number of TYPE_DECLs streamed from WPA to ltrans to about 20%
> > and is important for the patch turning types to incomplete types.  Without
> > this change the TREE_TYPE of TYPE_DECL would still point back to complete 
> > type
> > and duplicating TYPE_DECLs as well is somewhat laborious.
> 
> So the following is the really important hunk, correct?
> 
> > @@ -5174,7 +5174,7 @@ free_lang_data_in_type (tree type)
> >
> >/* Drop TYPE_DECLs in TYPE_NAME in favor of the identifier in the
> >   TYPE_DECL if the type doesn't have linkage.  */
> > -  if (! type_with_linkage_p (type))
> > +  if (type != TYPE_MAIN_VARIANT (type) || ! type_with_linkage_p (type))
> >  {
> >TYPE_NAME (type) = TYPE_IDENTIFIER (type);
> >TYPE_STUB_DECL (type) = NULL;
> 
> Can you explain why you "free" alignment of TYPE_DECLs?  It's just some
> bits...  does the FE somehow re-use those for sth else?  I wouldn't have
> expected those to be set to anything meaningful.

It is set to 1 for forward declarations and 8 for fully defined types.
Once I start to turn complete types into incomplete ones, they would not
match because of this difference.
> 
> I'm not too comfortable with setting TREE_TYPE of a TYPE_DECL to NULL.
> Can we instead use void_type_node?  The tree-inline.c hunk should
> probably test whether DECL_ORIGINAL_TYPE is non-null instead.

void_type_node works too.  I wanted to put in something that will make any
code that relies on it crash (so I am sure there is none).

Does the following variant look OK?
I am re-testing it.

* tree.c (free_lang_data_in_decl): Clear alignment and TREE_TYPE
of TYPE_DECL.
Index: tree.c
===
--- tree.c  (revision 265522)
+++ tree.c  (working copy)
@@ -5354,6 +5354,10 @@ free_lang_data_in_decl (tree decl)
   DECL_VISIBILITY_SPECIFIED (decl) = 0;
   DECL_INITIAL (decl) = NULL_TREE;
   DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
+  /* Make sure that complete and incomplete types have same TYPE_DECL.
+C++ produces different DECL_ALIGN for them.  */
+  SET_DECL_ALIGN (decl, 0);
+  TREE_TYPE (decl) = void_type_node;
 }
   else if (TREE_CODE (decl) == FIELD_DECL)
 DECL_INITIAL (decl) = NULL_TREE;


Ping^3 Re: [PATCH v3 0/6] [MIPS] Reorganize the loongson march and extensions instructions set

2018-10-26 Thread Paul Hua
Ping ?

On Tue, Oct 23, 2018 at 9:16 AM Paul Hua  wrote:
>
> Ping ?
>
> On Fri, Oct 19, 2018 at 2:19 PM Paul Hua  wrote:
> >
> > Ping?
> >
> > I'd like to check in those patches before stage3.
> >
> > Thanks,
> >
> > On Tue, Oct 16, 2018 at 10:49 AM Paul Hua  wrote:
> > >
> > > Hi:
> > >
> > > The original version of patches were here:
> > > https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00099.html
> > >
> > > > This is an updated version.  Please review, thanks.
> > > >
> > > > This patch series reorganizes the Loongson -march=xxx options and the
> > > > Loongson extension instruction sets.  For a long time, the Loongson
> > > > extension instructions have all been enabled by the -march=loongson3a
> > > > option, so none of them could be disabled individually when needed.
> > > >
> > > > Patch (1) splits the Loongson MultiMedia extensions Instructions (MMI)
> > > > from loongson3a and adds the -mloongson-mmi/-mno-loongson-mmi options
> > > > to enable/disable them.
> > > >
> > > > Patch (2) splits the Loongson EXTensions (EXT) instructions from
> > > > loongson3a and adds the -mloongson-ext/-mno-loongson-ext options to
> > > > enable/disable them.
> > > >
> > > > Patch (3) adds support for the Loongson EXTensions R2 (EXT2)
> > > > instructions and adds the -mloongson-ext2/-mno-loongson-ext2 options
> > > > to enable/disable them.
> > > >
> > > > Patch (4) adds Loongson 3A1000 processor support.  gs464 is the
> > > > codename of the 3A1000 microarchitecture.  The patch renames
> > > > -march=loongson3a to -march=gs464 and keeps -march=loongson3a as an
> > > > alias of -march=gs464 for compatibility.
> > > >
> > > > Patch (5) adds Loongson 3A2000/3A3000 processor support, including the
> > > > Loongson MMI, EXT and EXT2 instruction sets.
> > > >
> > > > Patch (6) adds Loongson 2K1000 processor support, including the
> > > > Loongson MMI, EXT, EXT2 and MSA instruction sets.
> > >
> > > The binutils patch has been upstreamed.
> > >
> > > There are six patches in this set, as follows.
> > > 1) 0001-MIPS-Add-support-for-loongson-mmi-instructions.patch
> > > 2) 0002-MIPS-Add-support-for-Loongson-EXT-istructions.patch
> > > 3) 0003-MIPS-Add-support-for-Loongson-EXT2-istructions.patch
> > > 4) 0004-MIPS-Add-support-for-Loongson-3A1000-proccessor.patch
> > > 5) 0005-MIPS-Add-support-for-Loongson-3A2000-3A3000-proccess.patch
> > > 6) 0006-MIPS-Add-support-for-Loongson-2K1000-proccessor.patch
> > >
> > > > All patches were tested on mips64el-linux-gnu with no new regressions.
> > >
> > > Ok for commit ?
> > >
> > > Thanks,
> > > Paul Hua


Re: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Jonathan Wakely

On 26/10/18 12:16 +0300, Ville Voutilainen wrote:
>On Fri, 26 Oct 2018 at 01:42, Marc Glisse  wrote:
>>
>> On Fri, 26 Oct 2018, Ville Voutilainen wrote:
>>
>> > I would rather not introduce a behavioral difference between us and
>> > libc++.
>>
>> Why not? There are already several, and it helps find bugs. Maybe you
>> could convince libc++ to change as well if you want to keep the behavior
>> the same?
>
>What bugs?

Assuming the string is empty after a move and appending to it without
calling clear() first.
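
A minimal sketch of that pattern, assuming a string that is reused as a
scratch buffer after being moved from.  The function name collect and the
"item:" prefix are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Builds one string per word in a reusable buffer and moves each result
// into the output vector.
inline std::vector<std::string> collect(const std::vector<std::string> &words) {
  std::vector<std::string> out;
  std::string line;
  for (const std::string &w : words) {
    // After the std::move below, 'line' is valid but its contents are
    // unspecified, so reset it explicitly instead of assuming it is empty.
    line.clear();
    line += "item:";
    line += w;
    out.push_back(std::move(line));
  }
  return out;
}
```

Without the clear() call the result would depend on what the implementation
leaves in a moved-from string; with it, the output is the same regardless.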



>> > It does slightly concern me that some users might
>> > actually semantically expect a moved-from string to be empty, even
>> > though that's not guaranteed, although for non-SSO cases
>> > it *is* guaranteed.
>>
>> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
>> 5, short string so there is no allocation).
>
>Sigh. Apparently it isn't, because the standard doesn't bother placing
>complexity requirements on string constructors.

Writing 5 bytes into the SSO buffer would be constant complexity :-P

>Even so, I'd prefer string acting like vector, so that it will leave the
>source of a move in an empty state, rather than an unspecified state.

It's not guaranteed for vector either. An allocator with POCMA==false
doesn't propagate on move assignment and if the source object's
allocator isn't equal to the target's it will copy, and there's no
guarantee the source will be empty.

This would be a conforming change to our vector:

--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1793,7 +1793,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   // so we need to individually move each element.
   this->assign(std::__make_move_if_noexcept_iterator(__x.begin()),
std::__make_move_if_noexcept_iterator(__x.end()));
-   __x.clear();
 }
  }
#endif

That might even have a bigger performance benefit, because clearing a
vector runs destructors, it doesn't just set the length and write a
null terminator.

If you're using a vector as a buffer of objects and the first thing
you do after v1=std::move(v2) is v2.resize(n) then it's a
pessimization to have cleared it. 
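
The non-propagating allocator case can be reproduced with the C++17
std::pmr containers, whose allocator does not propagate on move assignment.
This is only a sketch: MoveOutcome and the function name are hypothetical,
and nothing is asserted about the source vector afterwards, because its
state is exactly what is not guaranteed.

```cpp
#include <cassert>
#include <memory_resource>
#include <utility>
#include <vector>

// What the sketch observes about the target after the move assignment.
struct MoveOutcome {
  bool target_kept_its_resource;  // the allocator was not propagated
  bool elements_arrived;          // the target now holds 1, 2, 3
};

inline MoveOutcome move_assign_between_unequal_allocators() {
  // Two distinct resources, so the two allocators compare unequal.
  std::pmr::monotonic_buffer_resource r1;
  std::pmr::monotonic_buffer_resource r2;

  std::pmr::vector<int> target(&r1);
  std::pmr::vector<int> source(&r2);
  source.push_back(1);
  source.push_back(2);
  source.push_back(3);

  // The storage cannot be stolen here: the target must keep allocating
  // from r1, so the elements are moved over one by one.
  target = std::move(source);

  MoveOutcome out;
  out.target_kept_its_resource = target.get_allocator().resource() == &r1;
  out.elements_arrived = target.size() == 3 && target[0] == 1 &&
                         target[1] == 2 && target[2] == 3;
  return out;
}
```

Whether source is empty afterwards depends on whether the implementation
calls clear() on it, which is the line the diff above removes.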


>Despite the standard not requiring that, it's more useful programmatically
>to have the empty state than the unspecified state, especially when the
>state is empty in some cases anyway.


Re: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Jonathan Wakely

On 26/10/18 08:25 +0200, Marc Glisse wrote:
>On Fri, 26 Oct 2018, Jonathan Wakely wrote:
>
>> For the libc++ string zeroing the length of a small string happens to
>> be faster.
>
>Ah, yes, of course.
>
>>> Is it? In debug mode, I'd be tempted to leave the string as
>>> "moved" (size 5, short string so there is no allocation).
>>
>> That's not a bad idea.
>>
>> Although we can't do it for std::wstring and std::u32string, as their
>> small string buffer is *very* small.
>
>"N/A"? The proposition was only semi-serious.
>
>By the way, I was surprised by the formula for the size of the buffer.
>It often has size 16, but for a _CharT of size 3 and alignment 1
>(unlikely I guess), it has size 18.

Oops, that's not intentional.



Re: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Ville Voutilainen
On Fri, 26 Oct 2018 at 12:55, Jonathan Wakely  wrote:
> >> Why not? There are already several, and it helps find bugs. Maybe you
> >> could convince libc++ to change as well if you want to keep the behavior
> >> the same?
> >
> >What bugs?
>
> Assuming the string is empty after a move and appending to it without
> calling clear() first.

I find it bloody unfortunate that the standard considers that a bug. It
seems quite convenient to me that it's possible to have a bag of strings,
move some of them out to be processed, and be able to tell the processed
ones from the unprocessed ones without having to separately clear them
when moving.
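
One portable way to get that "move out and leave empty" behaviour today is
std::exchange, which assigns a fresh empty string to the source as part of
the same expression.  A sketch, with hypothetical helper names take,
take_tagged and bag_demo_ok:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Moves the string out of 'slot' and leaves 'slot' holding an empty string,
// independent of what a plain move would leave behind.
inline std::string take(std::string &slot) {
  return std::exchange(slot, std::string());
}

// Moves out every non-empty string starting with 'tag'.  The slots left
// behind are guaranteed empty, so they can be told apart from the
// unprocessed ones.
inline std::vector<std::string> take_tagged(std::vector<std::string> &bag,
                                            char tag) {
  std::vector<std::string> taken;
  for (std::string &s : bag)
    if (!s.empty() && s.front() == tag)
      taken.push_back(take(s));
  return taken;
}

// Small self-check: two of three strings are taken, one stays in the bag.
inline bool bag_demo_ok() {
  std::vector<std::string> bag = {"xa", "yb", "xc"};
  std::vector<std::string> taken = take_tagged(bag, 'x');
  return taken.size() == 2 && taken[0] == "xa" && taken[1] == "xc" &&
         bag[0].empty() && bag[1] == "yb" && bag[2].empty();
}
```

This costs an extra assignment per string, which is the trade-off under
discussion: the guarantee is paid for explicitly by the caller who wants it.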

> >> > It does slightly concern me that some users might
> >> > actually semantically expect a moved-from string to be empty, even
> >> > though that's not guaranteed, although for non-SSO cases
> >> > it *is* guaranteed.
> >>
> >> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
> >> 5, short string so there is no allocation).
> >
> >Sigh. Apparently it isn't, because the standard doesn't bother placing
> >complexity
> >requirements on string constructors.
>
> Writing 5 bytes into the SSO buffer would be constant complexity :-P

Indeed it would, but the standard doesn't seem to have complexity
requirements on string constructors at all. If it did, moving a non-sso
string would *have to* steal from the source, and the sso case would not
have to do that, but could.

> >Even so, I'd prefer string acting
> >like vector,
> >so that it will leave the source of a move in an empty state, rather
> >than an unspecified
> >state.
>
> It's not guaranteed for vector either. An allocator with POCMA==false
> doesn't propagate on move assignment and if the source object's
> allocator isn't equal to the target's it will copy, and there's no
> guarantee the source will be empty.

Right, I was talking about homogeneous vectors; it is well-known that
non-propagating allocators don't move. Except when they do (on move
construction).


Re: [PATCH][GCC][mingw-w64][Ada] Fix Ada native bootstrap (PR81878).

2018-10-26 Thread Eric Botcazou
> Due to the changes in PR81878 builds of GCC8 and trunk are impossible
> with Ada enabled[1][2].
> 
> The reason the patch breaks the bootstrap is due to how gnatlink receives
> its arguments.

Thanks for working on this!

> Bootstrapped on x86_64-pc-linux-gnu and mingw-w64-x86_64.
> 
> Ok for trunk?

Please put it on the 8 branch too.

-- 
Eric Botcazou


maintainer-scripts closing of 6.x

2018-10-26 Thread Jakub Jelinek
Hi!

I've committed this change to close the 6.x branch.

2018-10-26  Jakub Jelinek  

* update_version_svn (IGNORE_BRANCHES): Add gcc-6-branch.
* crontab: Remove gcc-6-branch entry.

--- maintainer-scripts/update_version_svn.jj   2017-10-10 15:08:45.837996075 +0200
+++ maintainer-scripts/update_version_svn   2018-10-26 11:58:57.572822830 +0200
@@ -6,7 +6,7 @@
 # in the space separated list in $ADD_BRANCHES.
 
 SVNROOT=${SVNROOT:-"file:///svn/gcc"}
-IGNORE_BRANCHES='gcc-(2_95|3_0|3_1|3_2|3_3|3_4|4_0|4_1|4_2|4_3|4_4|4_5|4_6|4_7|4_8|4_9|5)-branch'
+IGNORE_BRANCHES='gcc-(2_95|3_0|3_1|3_2|3_3|3_4|4_0|4_1|4_2|4_3|4_4|4_5|4_6|4_7|4_8|4_9|5|6)-branch'
 ADD_BRANCHES='HEAD'
 
 # Run this from /tmp.
--- maintainer-scripts/crontab.jj   2018-04-25 09:45:12.096785185 +0200
+++ maintainer-scripts/crontab  2018-10-26 11:59:20.147451625 +0200
@@ -1,7 +1,6 @@
 16  0 * * * sh /home/gccadmin/scripts/update_version_svn
 50  0 * * * sh /home/gccadmin/scripts/update_web_docs_svn
 55  0 * * * sh /home/gccadmin/scripts/update_web_docs_libstdcxx_svn
-32 22 * * 3 sh /home/gccadmin/scripts/gcc_release -s 6:branches/gcc-6-branch -l -d /sourceware/snapshot-tmp/gcc all
 32 22 * * 4 sh /home/gccadmin/scripts/gcc_release -s 7:branches/gcc-7-branch -l -d /sourceware/snapshot-tmp/gcc all
 32 22 * * 5 sh /home/gccadmin/scripts/gcc_release -s 8:branches/gcc-8-branch -l -d /sourceware/snapshot-tmp/gcc all
 32 22 * * 7 sh /home/gccadmin/scripts/gcc_release -s 9:trunk -l -d /sourceware/snapshot-tmp/gcc all

Jakub


Fix up gcc_release script

2018-10-26 Thread Jakub Jelinek
Hi!

I got a failure when trying to do 6.5 release, because
"^GCC 6.5" didn't match in the expected portion of NEWS, there was only
   GCC 6.5
This patch accepts whitespace before it and makes the checks consistent,
also it doesn't print just the first argument on error/inform, but all of
them (needed because some messages were too long and split across lines).

Committed to trunk.

2018-10-26  Jakub Jelinek  

* gcc_release (error, inform): Use $@ instead of $1.
(build_sources): Check for ^[[:blank:]]*GCC in both index.html
and changes.html, rather than for GCC in one and ^GCC in another one.

--- maintainer-scripts/gcc_release.jj   2018-05-03 11:28:30.199330419 +0200
+++ maintainer-scripts/gcc_release  2018-10-26 12:24:04.263072882 +0200
@@ -45,18 +45,18 @@
 # Functions
 
 
-# Issue the error message given by $1 and exit with a non-zero
+# Issue the error message given by $@ and exit with a non-zero
 # exit code.
 
 error() {
-echo "gcc_release: error: $1"
+echo "gcc_release: error: $@"
 exit 1
 }
 
-# Issue the informational message given by $1.
+# Issue the informational message given by $@.
 
 inform() {
-echo "gcc_release: $1"
+echo "gcc_release: $@"
 }
 
 # Issue a usage message explaining how to use this script.
@@ -128,12 +128,12 @@ build_sources() {
 previndex="http:\/\/gcc.gnu.org\/gcc-`expr ${RELEASE_MAJOR} - 1`\/index.html"
 sed -n -e "/^${thisindex}/,/^${thischanges}/p" NEWS |\
   sed -n -e "/Release History/,/References and Acknowledgments/p" |\
-  grep -q "GCC ${RELEASE_MAJOR}.${RELEASE_MINOR}" ||\
+  grep -q "^[[:blank:]]*GCC ${RELEASE_MAJOR}.${RELEASE_MINOR}" ||\
   error "GCC ${RELEASE_MAJOR}.${RELEASE_MINOR} not mentioned "\
 "in gcc-${RELEASE_MAJOR}/index.html"
 
 sed -n -e "/^${thischanges}/,/^${previndex}/p" NEWS |\
-  grep -q "^GCC ${RELEASE_MAJOR}.${RELEASE_MINOR}" ||\
+  grep -q "^[[:blank:]]*GCC ${RELEASE_MAJOR}.${RELEASE_MINOR}" ||\
   error "GCC ${RELEASE_MAJOR}.${RELEASE_MINOR} not mentioned "\
 "in gcc-${RELEASE_MAJOR}/changes.html"
 

Jakub


Re: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Marc Glisse

On Fri, 26 Oct 2018, Ville Voutilainen wrote:

> On Fri, 26 Oct 2018 at 12:55, Jonathan Wakely  wrote:
>>>> Why not? There are already several, and it helps find bugs. Maybe you
>>>> could convince libc++ to change as well if you want to keep the behavior
>>>> the same?
>>>
>>> What bugs?
>>
>> Assuming the string is empty after a move and appending to it without
>> calling clear() first.
>
> I find it bloody unfortunate that the standard considers that a bug. It
> seems quite convenient to me that it's possible to have a bag of
> strings, move some of them out to be processed, and be able to tell the
> processed ones from the unprocessed ones without having to separately
> clear them when moving.

We all seem to want different things from move:

1) copy, possibly with a speed up because I don't care what the source
looks like afterwards. That's what was standardized, and for a small
string that means that moving it should just copy.

2) move and clear

3) move and destroy

(most likely many others)

The convenience does not seem that great to me, especially if it has a
performance cost.

>>>>> It does slightly concern me that some users might
>>>>> actually semantically expect a moved-from string to be empty, even
>>>>> though that's not guaranteed, although for non-SSO cases
>>>>> it *is* guaranteed.
>>>>
>>>> Is it? In debug mode, I'd be tempted to leave the string as "moved" (size
>>>> 5, short string so there is no allocation).
>>>
>>> Sigh. Apparently it isn't, because the standard doesn't bother placing
>>> complexity requirements on string constructors.
>>
>> Writing 5 bytes into the SSO buffer would be constant complexity :-P
>
> Indeed it would, but the standard doesn't seem to have complexity
> requirements on string constructors at all. If it did, moving a non-sso
> string would *have to* steal from the source, and the sso case would not
> have to do that, but could.

I am not sure what you are saying exactly, but I think "noexcept" is (kind
of) playing the role of a complexity requirement here.


--
Marc Glisse


[PATCH] Fix compile-time issue with last SLP vectorizer patch

2018-10-26 Thread Richard Biener


While I fixed up all the places that matter for correctness, I totally
forgot about compile time by not adding visited hash_set<>s to all SLP
tree walks.

This rectifies things and fixes 538.imagick_r build time
(currently reducing a testcase which I will add once finished).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I wish there were a nicer C++ way of allocating the visited set
than adding wrapping overloads.  I played with

void foo (hash_set<slp_tree> &visited = hash_set<slp_tree> ())

but that doesn't work because visited isn't const.  Making it
const also doesn't work of course (unless casting the const away
in the function).  Ideas?
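
A standalone sketch of the two shapes under discussion, using
std::unordered_set and a made-up Node type as stand-ins for hash_set and
slp_tree.  Variant 1 is the wrapping-overload pattern the patch uses.
Variant 2 is one possible answer, offered only as an assumption about what
might be acceptable: an rvalue-reference parameter can bind to a default
temporary, and the recursive call passes the same set on with std::move,
which here is only a cast and moves nothing.

```cpp
#include <cassert>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SLP node; not GCC's slp_tree.
struct Node {
  std::vector<Node *> children;
  int visits = 0;
};

// Variant 1: the worker takes the visited set by reference ...
static void mark(Node *node, std::unordered_set<Node *> &visited) {
  if (!visited.insert(node).second)
    return;  // already walked through another parent
  ++node->visits;
  for (Node *child : node->children)
    mark(child, visited);
}

// ... and a wrapping overload allocates the set for the outermost call.
static void mark(Node *node) {
  std::unordered_set<Node *> visited;
  mark(node, visited);
}

// Variant 2: a single function whose rvalue-reference parameter defaults
// to a temporary that lives until the outermost call returns.
static void mark2(Node *node, std::unordered_set<Node *> &&visited =
                                  std::unordered_set<Node *>()) {
  if (!visited.insert(node).second)
    return;
  ++node->visits;
  for (Node *child : node->children)
    mark2(child, std::move(visited));  // rebinds the same set, no move occurs
}

// Diamond-shaped graph: root -> a, b and a, b -> leaf.  Without the visited
// set the leaf would be walked twice.
static int leaf_visits_in_diamond(bool use_variant2) {
  Node leaf, a, b, root;
  a.children.push_back(&leaf);
  b.children.push_back(&leaf);
  root.children.push_back(&a);
  root.children.push_back(&b);
  if (use_variant2)
    mark2(&root);
  else
    mark(&root);
  return leaf.visits;
}
```

Variant 2 avoids the extra overload, at the price of a std::move on a set
that is still used afterwards, which may read as suspicious to reviewers.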

Thanks,
Richard.

2018-10-26  Richard Biener  

* tree-vect-slp.c (vect_mark_slp_stmts): Add visited hash_set
and wrapper.
(vect_mark_slp_stmts_relevant): Likewise.
(vect_detect_hybrid_slp_stmts): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_remove_slp_scalar_calls): Likewise.

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ab8504a10bd..5b925be80f4 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1483,7 +1483,8 @@ vect_print_slp_tree (dump_flags_t dump_kind, dump_location_t loc,
stmts in NODE are to be marked.  */
 
 static void
-vect_mark_slp_stmts (slp_tree node, enum slp_vect_type mark, int j)
+vect_mark_slp_stmts (slp_tree node, enum slp_vect_type mark, int j,
+hash_set<slp_tree> &visited)
 {
   int i;
   stmt_vec_info stmt_info;
@@ -1492,19 +1493,28 @@ vect_mark_slp_stmts (slp_tree node, enum slp_vect_type mark, int j)
   if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
 return;
 
+  if (visited.add (node))
+return;
+
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
 if (j < 0 || i == j)
   STMT_SLP_TYPE (stmt_info) = mark;
 
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-vect_mark_slp_stmts (child, mark, j);
+vect_mark_slp_stmts (child, mark, j, visited);
 }
 
+static void
+vect_mark_slp_stmts (slp_tree node, enum slp_vect_type mark, int j)
+{
+  hash_set<slp_tree> visited;
+  vect_mark_slp_stmts (node, mark, j, visited);
+}
 
 /* Mark the statements of the tree rooted at NODE as relevant (vect_used).  */
 
 static void
-vect_mark_slp_stmts_relevant (slp_tree node)
+vect_mark_slp_stmts_relevant (slp_tree node, hash_set<slp_tree> &visited)
 {
   int i;
   stmt_vec_info stmt_info;
@@ -1513,6 +1523,9 @@ vect_mark_slp_stmts_relevant (slp_tree node)
   if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
 return;
 
+  if (visited.add (node))
+return;
+
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
 {
   gcc_assert (!STMT_VINFO_RELEVANT (stmt_info)
@@ -1521,7 +1534,14 @@ vect_mark_slp_stmts_relevant (slp_tree node)
 }
 
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-vect_mark_slp_stmts_relevant (child);
+vect_mark_slp_stmts_relevant (child, visited);
+}
+
+static void
+vect_mark_slp_stmts_relevant (slp_tree node)
+{
+  hash_set<slp_tree> visited;
+  vect_mark_slp_stmts_relevant (node, visited);
 }
 
 
@@ -2200,7 +2220,8 @@ vect_make_slp_decision (loop_vec_info loop_vinfo)
can't be SLPed) in the tree rooted at NODE.  Mark such stmts as HYBRID.  */
 
 static void
-vect_detect_hybrid_slp_stmts (slp_tree node, unsigned i, slp_vect_type stype)
+vect_detect_hybrid_slp_stmts (slp_tree node, unsigned i, slp_vect_type stype,
+ hash_set<slp_tree> &visited)
 {
   stmt_vec_info stmt_vinfo = SLP_TREE_SCALAR_STMTS (node)[i];
   imm_use_iterator imm_iter;
@@ -2210,6 +2231,9 @@ vect_detect_hybrid_slp_stmts (slp_tree node, unsigned i, slp_vect_type stype)
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
   int j;
 
+  if (visited.add (node))
+return;
+
   /* Propagate hybrid down the SLP tree.  */
   if (stype == hybrid)
 ;
@@ -2259,7 +2283,14 @@ vect_detect_hybrid_slp_stmts (slp_tree node, unsigned i, slp_vect_type stype)
 
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
 if (SLP_TREE_DEF_TYPE (child) != vect_external_def)
-  vect_detect_hybrid_slp_stmts (child, i, stype);
+  vect_detect_hybrid_slp_stmts (child, i, stype, visited);
+}
+
+static void
+vect_detect_hybrid_slp_stmts (slp_tree node, unsigned i, slp_vect_type stype)
+{
+  hash_set<slp_tree> visited;
+  vect_detect_hybrid_slp_stmts (node, i, stype, visited);
 }
 
 /* Helpers for vect_detect_hybrid_slp walking pattern stmt uses.  */
@@ -2571,12 +2602,16 @@ vect_slp_analyze_operations (vec_info *vinfo)
 static void 
 vect_bb_slp_scalar_cost (basic_block bb,
 slp_tree node, vec *life,
-stmt_vector_for_cost *cost_vec)
+stmt_vector_for_cost *cost_vec,
+hash_set<slp_tree> &visited)
 {
   unsigned i;
   stmt_vec_info stmt_info;
   slp_tree child;
 
+  if (visited.add (node))
+return; 
+
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
 {
   gimple *stmt = stmt_info->stmt;
@@ -2636,12 +2671,22 @@ vect

Re: [PATCH] Add option to control warnings added through attribure "warning"

2018-10-26 Thread Nikolai Merinov
Hi,

What are the next steps I should take to get these changes merged into GCC?

Regards,
Nikolai

- Original Message -
From: "Nikolai Merinov" 
To: "Martin Sebor" , gcc-patches@gcc.gnu.org
Sent: Monday, October 15, 2018 3:21:15 PM
Subject: Re: [PATCH] Add option to control warnings added through attribure 
"warning"

Hi Martin,

On 10/15/18 6:20 PM, Martin Sebor wrote:
> On 10/15/2018 01:55 AM, Nikolai Merinov wrote:
>> Hi Martin,
>>
>> On 10/12/18 9:58 PM, Martin Sebor wrote:
>>> On 10/12/2018 04:14 AM, Nikolai Merinov wrote:
>>>> Hello,
>>>>
>>>> In the mail https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01795.html I
>>>> suggested a patch that makes it possible to control the behavior of
>>>> "__attribute__((warning))" when the "-Werror" option is enabled. Usage
>>>> example:
>>>>
>>>>> #include 
>>>>> int a() __attribute__((warning("Warning: `a' was used")));
>>>>> int a() { return 1; }
>>>>> int main () { return a(); }
>>>>
>>>>> $ gcc -Werror test.c
>>>>> test.c: In function ‘main’:
>>>>> test.c:4:22: error: call to ‘a’ declared with attribute warning:
>>>>> Warning: `a' was used [-Werror]
>>>>>  int main () { return a(); }
>>>>>   ^
>>>>> cc1: all warnings being treated as errors
>>>>> $ gcc -Werror -Wno-error=warning-attribute test.c
>>>>> test.c: In function ‘main’:
>>>>> test.c:4:22: warning: call to ‘a’ declared with attribute warning:
>>>>> Warning: `a' was used
>>>>>  int main () { return a(); }
>>>>>   ^
>>>> Can you provide any feedback on the suggested changes?
>>>
>>> It seems like a useful feature and in line with the philosophy
>>> that distinct warnings should be controlled by their own options.
>>>
>>> I would only suggest to consider changing the name to
>>> -Wattribute-warning, because it applies specifically to that
>>> attribute (as opposed to warnings about attributes in general).
>>>
>>> There are many attributes in GCC and diagnosing problems that
>>> are unique to each, under the same -Wattributes option, is
>>> becoming too coarse and overly limiting.  To make it more
>>> flexible, I expect new options will need to be introduced,
>>> such as -Wattribute-alias (to control aspects of the alias
>>> attribute and others related to it), or -Wattribute-const
>>> (to control diagnostics about functions declared with
>>> attribute const that violate the attribute's constraints).
>>>
>>> An alternative might be to introduce a single -Wattribute=
>>>  option where the  gives
>>> the names of all the distinct attributes whose unique
>>> diagnostics one might need to control.
>>>
>>> Martin
>>
>> Currently there are several styles already in use:
>>
>> -Wattribute-alias, where the word "attribute" is used as a prefix for the
>> name of the attribute,
>> -Wsuggest-attribute=[pure|const|noreturn|format|malloc], where the name of
>> the attribute is passed as a possible argument,
>> -Wmissing-format-attribute, where the word "attribute" is used as a suffix,
>> -Wdeprecated-declarations, where the word "attribute" is not used at all,
>> even though this warning option was created specifically for the
>> "deprecated" attribute.
>>
>> I changed the name to "-Wattribute-warning" as you suggested, but unifying
>> the style of all attribute-related warnings looks like a separate activity.
>> Please check the new patch in the attachments.
>>
> 
> Thanks for the survey!  I agree that making the existing options
> consistent (if that's what we want) should be done separately.
> 
> Martin
> 
> PS It doesn't look like your latest attachments made it to
> the list.
> 
Thank you for mentioning it. That was my mistake. It is attached now.
> 
>> Updated changelog:
>>
>> gcc/Changelog
>>
>> 2018-10-14  Nikolai Merinov 
>>
>>  * gcc/common.opt: Add -Wattribute-warning.
>>  * gcc/doc/invoke.texi: Add documentation for -Wno-attribute-warning.
>>  * gcc/testsuite/gcc.dg/Wno-attribute-warning.c: New test.
>>  * gcc/expr.c (expand_expr_real_1): Add new attribute to warning_at
>>  call to allow user configure behavior of "warning" attribute


Re: Free more of type decls

2018-10-26 Thread Richard Biener
On Fri, 26 Oct 2018, Jan Hubicka wrote:

> > On Fri, Oct 26, 2018 at 9:12 AM Jan Hubicka  wrote:
> > >
> > > Hi,
> > > this patch frees TYPE_DECL and alignment from TYPE_DECL and also preserves
> > > only those TYPE_DECL pointers that are actually used to build ODR type 
> > > tree.
> > >
> > > It reduces number of TYPE_DECLs streamed from WPA to ltrans to about 20%
> > > and is important for the patch turning types to incomplete types.  Without
> > > this change the TREE_TYPE of TYPE_DECL would still point back to complete 
> > > type
> > > and duplicating TYPE_DECLs as well is somewhat laborious.
> > 
> > So the following is the really important hunk, correct?
> > 
> > > @@ -5174,7 +5174,7 @@ free_lang_data_in_type (tree type)
> > >
> > >/* Drop TYPE_DECLs in TYPE_NAME in favor of the identifier in the
> > >   TYPE_DECL if the type doesn't have linkage.  */
> > > -  if (! type_with_linkage_p (type))
> > > +  if (type != TYPE_MAIN_VARIANT (type) || ! type_with_linkage_p (type))
> > >  {
> > >TYPE_NAME (type) = TYPE_IDENTIFIER (type);
> > >TYPE_STUB_DECL (type) = NULL;
> > 
> > Can you explain why you "free" alignment of TYPE_DECLs?  It's just some
> > bits...  does the FE somehow re-use those for sth else?  I wouldn't have
> > expected those to be set to anything meaningful.
> 
> It is set to 1 for forward declarations and 8 for fully defined types.
> Once I start to turn complete types into incomplete they would not match
> because of this difference.

Bah ;)

> > 
> > I'm not too comfortable with setting TREE_TYPE of a TYPE_DECL to NULL.
> > Can we instead use void_type_node?  The tree-inline.c hunk should
> > probably test whether DECL_ORIGINAL_TYPE is non-null instead.
> 
> void_type_node works too.  I wanted to put in something that will make any
> code that relies on it to crash (so I am sure there is none).
>
> Does the following variant look OK?
> I am re-testing it.

Yes.
 
Thanks,
Richard.

>   * tree.c (free_lang_data_in_decl): Clear alignment and TREE_TYPE
>   of TYPE_DECL.
> Index: tree.c
> ===
> --- tree.c(revision 265522)
> +++ tree.c(working copy)
> @@ -5354,6 +5354,10 @@ free_lang_data_in_decl (tree decl)
>DECL_VISIBILITY_SPECIFIED (decl) = 0;
>DECL_INITIAL (decl) = NULL_TREE;
>DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
> +  /* Make sure that complete and incomplete types have same TYPE_DECL.
> +  C++ produces different DECL_ALIGN for them.  */
> +  SET_DECL_ALIGN (decl, 0);
> +  TREE_TYPE (decl) = void_type_node;
>  }
>else if (TREE_CODE (decl) == FIELD_DECL)
>  DECL_INITIAL (decl) = NULL_TREE;
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Turn complete to incomplete types in free_lang_data

2018-10-26 Thread Jan Hubicka
Hi,
this is a minimal variant of the patch turning complete to incomplete pointers in
fields.  We can do more - in particular it would be very useful to do the same
for function types and decls (because they often end up being streamed to
symtab) and we should also turn pointers to arrays and enums to incomplete
variants.

I do that in my local tree but I would like to get it into mainline one by
one and check the benefits of each change independently.

Patch bootstrapped&regtested x86-64 and I am now re-testing it on firefox.  I
checked on small testcases that types indeed do get merged.

OK if it survives more testing on firefox and lto bootstrap?
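
For readers following the thread, here is a minimal sketch of the caching idea behind incomplete_type_of, written against a toy type model rather than GCC's tree nodes. ToyType and IncompleteTypeCache are hypothetical names used only for illustration. The sketch assumes that each complete type maps to exactly one stripped twin, which is what lets all pointers to it end up sharing one pointee.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a type node; this is NOT GCC's tree representation.
struct ToyType {
    std::string name;
    std::vector<std::string> fields;  // emptied when the type is made incomplete
    bool complete = true;
};

// Hypothetical cache: each complete type gets exactly one incomplete twin.
class IncompleteTypeCache {
public:
    const ToyType* incomplete_of(const ToyType* t) {
        if (!t->complete)
            return t;  // already incomplete, nothing to strip
        auto [it, inserted] = cache_.try_emplace(t, nullptr);
        if (inserted) {
            // First request for this type: copy it and drop the layout details.
            auto copy = std::make_unique<ToyType>(*t);
            copy->fields.clear();
            copy->complete = false;
            it->second = std::move(copy);
        }
        return it->second.get();
    }

private:
    std::unordered_map<const ToyType*, std::unique_ptr<ToyType>> cache_;
};
```

Because repeated requests return the same twin, two fields that point to the same record also point to the same incomplete node after the rewrite.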

Honza

* tree.c (free_lang_data_in_type): Declare.
(types_equal_p): New function.
(free_lang_data_type_variant): New function.
(incomplete_type_of): New function.
(simplified_type): New function.
Index: tree.c
===
--- tree.c  (revision 265522)
+++ tree.c  (working copy)
@@ -265,6 +265,8 @@ static void print_type_hash_statistics (
 static void print_debug_expr_statistics (void);
 static void print_value_expr_statistics (void);
 
+static void free_lang_data_in_type (tree type);
+
 tree global_trees[TI_MAX];
 tree integer_types[itk_none];
 
@@ -5038,6 +5041,140 @@ protected_set_expr_location (tree t, loc
 SET_EXPR_LOCATION (t, loc);
 }
 
+/* Do same comparison as check_qualified_type skipping lang part of type
+   and be more permissive about type names: we only care that names are
+   same (for diagnostics) and that ODR names are the same.  */
+
+static bool
+types_equal_p (tree t, tree v)
+{
+  if (t==v)
+return true;
+
+  if (TYPE_QUALS (t) != TYPE_QUALS (v))
+return false;
+
+  if (TYPE_NAME (t) != TYPE_NAME (v)
+  && (!TYPE_NAME (t) || !TYPE_NAME (v)
+ || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL
+ || TREE_CODE (TYPE_NAME (v)) != TYPE_DECL
+ || DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (t))
+!= DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (v))
+ || DECL_NAME (TYPE_NAME (t)) != DECL_NAME (TYPE_NAME (v))))
+ return false;
+
+  if (TYPE_ALIGN (t) != TYPE_ALIGN (v))
+return false;
+
+  if (!attribute_list_equal (TYPE_ATTRIBUTES (t),
+TYPE_ATTRIBUTES (v)))
+ return false;
+
+  /* Do not replace complete type by incomplete.  */
+  if ((TREE_CODE (t) == QUAL_UNION_TYPE
+   || TREE_CODE (t) == UNION_TYPE || TREE_CODE (t) == RECORD_TYPE)
+  && COMPLETE_TYPE_P (t) != COMPLETE_TYPE_P (v))
+return false;
+
+  gcc_assert (TREE_CODE (t) == TREE_CODE (v));
+
+  /* For pointer types and array types we also care about the type they
+ refer to.  */
+  if (TREE_TYPE (t))
+return types_equal_p (TREE_TYPE (t), TREE_TYPE (v));
+
+  return true;
+}
+
+/* Find variant of FIRST that match T and create new one if necessary.  */
+
+static tree
+free_lang_data_type_variant (tree first, tree t)
+{
+  if (first == TYPE_MAIN_VARIANT (t))
+return t;
+  for (tree v = first; v; v = TYPE_NEXT_VARIANT (v))
+if (types_equal_p (t, v))
+  return v;
+  tree v = build_variant_type_copy (first);
+  TYPE_READONLY (v) = TYPE_READONLY (t);
+  TYPE_VOLATILE (v) = TYPE_VOLATILE (t);
+  TYPE_ATOMIC (v) = TYPE_ATOMIC (t);
+  TYPE_RESTRICT (v) = TYPE_RESTRICT (t);
+  TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t);
+  TYPE_NAME (v) = TYPE_NAME (t);
+  TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t);
+  return v;
+}
+
+/* Map complete types to incomplete types.  */
+static hash_map *incomplete_types;
+
+/* See if type T can be turned into incomplete variant.  */
+
+static tree
+incomplete_type_of (tree t)
+{
+  if (!RECORD_OR_UNION_TYPE_P (t))
+return t;
+  if (!COMPLETE_TYPE_P (t))
+return t;
+  if (TYPE_MAIN_VARIANT (t) == t)
+{
+  bool existed;
+  tree &copy
+= incomplete_types->get_or_insert (TYPE_MAIN_VARIANT (t), &existed);
+
+  if (!existed)
+   {
+ copy = build_distinct_type_copy (t);
+
+ /* It is possible type was not seen by free_lang_data yet.  */
+ free_lang_data_in_type (copy);
+ TYPE_SIZE (copy) = NULL;
+ SET_TYPE_MODE (copy, VOIDmode);
+ SET_TYPE_ALIGN (copy, BITS_PER_UNIT);
+ TYPE_SIZE_UNIT (copy) = NULL;
+ if (AGGREGATE_TYPE_P (t))
+   {
+ TYPE_FIELDS (copy) = NULL;
+ TYPE_BINFO (copy) = NULL;
+   }
+ else
+   TYPE_VALUES (copy) = NULL;
+   }
+  return copy;
+   }
+  return (free_lang_data_type_variant
+   (incomplete_type_of (TYPE_MAIN_VARIANT (t)), t));
+}
+
+/* Simplify type T for scenarios where we do not need complete pointer
+   types.  */
+
+static tree
+simplified_type (tree t)
+{
+  if (POINTER_TYPE_P (t))
+{
+  tree t2 = POINTER_TYPE_P (TREE_TYPE (t))
+   ? simplified_type (TREE_TYPE (t))
+   : incomplete_type_of (TREE_TYPE (t));
+  if (t2 != TREE_TYPE (t))
+   {
+ tree 

Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Rainer Orth
Hi Ian,

> This patch by Than McIntosh improves the mangling of package paths in
> the Go frontend.
>
> The current implementation of Gogo::pkgpath_for_symbol was written in
> a way that allowed two distinct package paths to map to the same
> symbol, which could cause collisions at link-time or compile-time.
>
> This patch switches to a better mangling scheme to ensure that we get
> a unique packagepath symbol for each package.  In the new scheme
> instead of having separate mangling schemes for identifiers and
> package paths, the main identifier mangler ("go_encode_id") now
> handles mangling of both packagepath characters and identifier
> characters.
>
> The new mangling scheme is more intrusive: "foo/bar.Baz" is mangled as
> "foo..z2fbar.Baz" instead of "foo_bar.Baz".  To mitigate this, this
> patch also adds a demangling capability so that function names
> returned from runtime.CallersFrames are converted back to their
> original unmangled form.
>
> Changing the pkgpath_for_symbol scheme requires updating a number of
> //go:linkname directives and C "__asm__" directives to match the new
> scheme, as well as updating the 'gotest' driver (which makes
> assumptions about the correct mapping from pkgpath symbol to package
> name).

it seems you missed a case here: both i386-pc-solaris2.* and
sparc-sun-solaris2.* bootstraps broke linking the gotools:

Undefined   first referenced
 symbol in file
log..z2fsyslog.syslog_c 
../sparc-sun-solaris2.11/libgo/.libs/libgo.so
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:751: go] Error 1

The following patch fixes this allowing the links to succeed, though
I've not run the testsuite yet.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgo/go/log/syslog/syslog_c.c b/libgo/go/log/syslog/syslog_c.c
--- a/libgo/go/log/syslog/syslog_c.c
+++ b/libgo/go/log/syslog/syslog_c.c
@@ -12,7 +12,7 @@
can't represent a C varargs function in Go.  */
 
 void syslog_c(intgo, const char*)
-  __asm__ (GOSYM_PREFIX "log_syslog.syslog_c");
+  __asm__ (GOSYM_PREFIX "log..z2fsyslog.syslog_c");
 
 void
 syslog_c (intgo priority, const char *msg)
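
To make the symbol names in this report easier to read, here is a small sketch of the escaping visible in the quoted examples. It assumes that every byte other than an ASCII letter, digit, '_' or '.' is written as "..z" followed by two lowercase hex digits. That rule reproduces the "/" to "..z2f" mapping quoted above, but encode_pkgpath is a hypothetical helper, and the real go_encode_id may handle more cases than this.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical encoder, NOT the gofrontend implementation.
// Assumption: bytes outside [A-Za-z0-9_.] become "..z" plus two hex digits.
std::string encode_pkgpath(const std::string& pkgpath) {
    std::string out;
    for (unsigned char c : pkgpath) {
        if (std::isalnum(c) || c == '_' || c == '.') {
            out += static_cast<char>(c);
        } else {
            char buf[8];
            std::snprintf(buf, sizeof buf, "..z%02x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Under that assumption, encode_pkgpath("log/syslog") yields "log..z2fsyslog", the prefix of the undefined symbol in the linker output above.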


Improve relocation

2018-10-26 Thread Marc Glisse

Hello,

here are some tweaks so that I can usefully mark deque as trivially 
relocatable. It includes more noexcept(auto) madness. For __relocate_a_1, 
I should also test if copying, ++ and != are noexcept, but I wanted to ask 
first because there might be restrictions on what iterators are allowed to 
do, even if I didn't see them. Also, the current code already ignores 
those, so it may as well be fixed in another patch.


Allocators are complicated. I specialized only for the default allocator, 
because that's by far the one that is used the most, and I have much less 
risk of getting it wrong. Some allocator expert is welcome to make a 
better test. I do not know in detail how deque is implemented. A quick 
look seemed to show that trivial relocation should be fine, but I would 
appreciate a confirmation.


The extra parameter for __is_trivially_relocatable is not used, but I 
expect it will be as soon as the specializations of 
__is_trivially_relocatable become more advanced.
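
The shape of such a trait, with a spare parameter kept for later meta-programming, can be sketched outside libstdc++ as follows. toy_is_trivially_relocatable is a hypothetical name, and falling back to std::is_trivially_copyable in the primary template is an assumption made for the sketch. It shows only the specialization mechanics and says nothing about whether a given deque implementation is actually safe to relocate bytewise.

```cpp
#include <deque>
#include <string>
#include <type_traits>

// Hypothetical trait. The second parameter is unused here; it is reserved so
// that later partial specializations can be constrained with void_t/enable_if.
template <typename T, typename = void>
struct toy_is_trivially_relocatable : std::is_trivially_copyable<T> {};

// Opt in std::deque with the default allocator only. std::deque<T> names
// std::deque<T, std::allocator<T>>, so other allocators do not match.
template <typename T>
struct toy_is_trivially_relocatable<std::deque<T>> : std::true_type {};
```

A container such as vector could then test the trait for its element type before choosing a bulk-copy relocation path.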


If I use or specialize __is_trivially_relocatable in many places, this 
forces me to #include bits/stl_uninitialized.h in many places. I wonder if I 
should move some of that stuff. Since I may use it in std::swap, 
bits/move.h looks like a sensible place for the core pieces 
(__is_trivially_relocatable, and __relocate_object if I ever create that). 
That or type_traits.


Regtested on gcc112. I manually checked that there was a speed-up for 
operations on vector>, although doing any kind of benchmarking 
on gcc112 is hard, I'll test locally next time.


2018-10-26  Marc Glisse  

PR libstdc++/87106
* include/bits/stl_algobase.h: Include .
(__niter_base): Add noexcept specification.
* include/bits/stl_deque.h: Include .
(__is_trivially_relocatable): Specialize for deque.
* include/bits/stl_iterator.h: Include .
(__niter_base): Add noexcept specification.
* include/bits/stl_uninitialized.h (__is_trivially_relocatable):
Add parameter for meta-programming.
(__relocate_a_1, __relocate_a): Add noexcept specification.
* include/bits/stl_vector.h (__use_relocate): Test __relocate_a.

--
Marc Glisse
Index: libstdc++-v3/include/bits/stl_algobase.h
===
--- libstdc++-v3/include/bits/stl_algobase.h	(revision 265522)
+++ libstdc++-v3/include/bits/stl_algobase.h	(working copy)
@@ -62,20 +62,23 @@
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include  // For std::swap
 #include 
+#if __cplusplus >= 201103L
+# include 
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus < 201103L
   // See http://gcc.gnu.org/ml/libstdc++/2004-08/msg00167.html: in a
   // nutshell, we are partially implementing the resolution of DR 187,
   // when it's safe, i.e., the value_types are equal.
   template
@@ -268,20 +271,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   if (__comp(__a, __b))
 	return __b;
   return __a;
 }
 
   // Fallback implementation of the function in bits/stl_iterator.h used to
   // remove the __normal_iterator wrapper. See copy, fill, ...
   template
 inline _Iterator
 __niter_base(_Iterator __it)
+_GLIBCXX_NOEXCEPT_IF(std::is_nothrow_copy_constructible<_Iterator>::value)
 { return __it; }
 
   // Reverse the __niter_base transformation to get a
   // __normal_iterator back again (this assumes that __normal_iterator
   // is only used to wrap random access iterators, like pointers).
   template
 inline _From
 __niter_wrap(_From __from, _To __res)
 { return __from + (__res - std::__niter_base(__from)); }
 
Index: libstdc++-v3/include/bits/stl_deque.h
===
--- libstdc++-v3/include/bits/stl_deque.h	(revision 265522)
+++ libstdc++-v3/include/bits/stl_deque.h	(working copy)
@@ -54,20 +54,21 @@
  */
 
 #ifndef _STL_DEQUE_H
 #define _STL_DEQUE_H 1
 
 #include 
 #include 
 #include 
 #if __cplusplus >= 201103L
 #include 
+#include  // for __is_trivially_relocatable
 #endif
 
 #include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   /**
@@ -2359,14 +2360,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /// See std::deque::swap().
   template
 inline void
 swap(deque<_Tp,_Alloc>& __x, deque<_Tp,_Alloc>& __y)
 _GLIBCXX_NOEXCEPT_IF(noexcept(__x.swap(__y)))
 { __x.swap(__y); }
 
 #undef _GLIBCXX_DEQUE_BUF_SIZE
 
 _GLIBCXX_END_NAMESPACE_CONTAINER
+
+#if __cplusplus >= 201103L
+  // std::allocator is safe, but it is not the only allocator
+  // for which this is valid.
+  template
+struct __is_trivially_relocatable<_GLIBCXX_STD_C::deque<_Tp>>
+: true_type { };
+#endif
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
 #endif /* _STL_DEQUE_H */
Index: libstdc++-v3/include/bits/stl_iterator.h
===

Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Than McIntosh via gcc-patches
Thanks for reporting this.

Sent https://go-review.googlesource.com/c/gofrontend/+/145017 with a
tentative fix.

Than

On Fri, Oct 26, 2018 at 7:55 AM Rainer Orth  
wrote:
>
> Hi Ian,
>
> > This patch by Than McIntosh improves the mangling of package paths in
> > the Go frontend.
> >
> > The current implementation of Gogo::pkgpath_for_symbol was written in
> > a way that allowed two distinct package paths to map to the same
> > symbol, which could cause collisions at link-time or compile-time.
> >
> > This patch switches to a better mangling scheme to ensure that we get
> > a unique packagepath symbol for each package.  In the new scheme
> > instead of having separate mangling schemes for identifiers and
> > package paths, the main identifier mangler ("go_encode_id") now
> > handles mangling of both packagepath characters and identifier
> > characters.
> >
> > The new mangling scheme is more intrusive: "foo/bar.Baz" is mangled as
> > "foo..z2fbar.Baz" instead of "foo_bar.Baz".  To mitigate this, this
> > patch also adds a demangling capability so that function names
> > returned from runtime.CallersFrames are converted back to their
> > original unmangled form.
> >
> > Changing the pkgpath_for_symbol scheme requires updating a number of
> > //go:linkname directives and C "__asm__" directives to match the new
> > scheme, as well as updating the 'gotest' driver (which makes
> > assumptions about the correct mapping from pkgpath symbol to package
> > name).
>
> it seems you missed a case here: both i386-pc-solaris2.* and
> sparc-sun-solaris2.* bootstraps broke linking the gotools:
>
> Undefined   first referenced
>  symbol in file
> log..z2fsyslog.syslog_c 
> ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
> ld: fatal: symbol referencing errors
> collect2: error: ld returned 1 exit status
> make[2]: *** [Makefile:751: go] Error 1
>
> The following patch fixes this allowing the links to succeed, though
> I've not run the testsuite yet.
>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>


Re: Turn complete to incomplete types in free_lang_data

2018-10-26 Thread Richard Biener
On Fri, Oct 26, 2018 at 1:29 PM Jan Hubicka  wrote:
>
> Hi,
> this is a minimal variant of the patch turning complete to incomplete pointers
> in fields.  We can do more - in particular it would be very useful to do the
> same for function types and decls (because they often end up being streamed to
> symtab) and we should also turn pointers to arrays and enums to incomplete
> variants.
>
> I do that in my local tree but I would like to get it into mainline one by
> one and check the benefits of each change independently.
>
> Patch bootstrapped&regtested x86-64 and I am now re-testing it on firefox.  I
> checked on small testcases that types indeed do get merged.
>
> OK if it survives more testing on firefox and lto bootstrap?

It looks like a hack to do free_lang_data_in_type from free_lang_data_in_decl
walk - I remember you wanted to unify find_* and free_*?  If not doing that
why would first doing the type walk and only then the decl walk not work
to avoid this ugliness?

That we need to have variants of the incomplete types at all for the place
you substitute them (FIELD_DECLs) has what reason?  See also comments below...

We are getting more and more "interesting" in the things we free.  _Please_
work on enabling free-lang-data (portions) for all compilations (with
-fchecking?).  It's disturbing to see so many differences creep into the
supposedly "shared" part of regular and LTO compilation.

> Honza
>
> * tree.c (free_lang_data_in_type): Declare.
> (types_equal_p): New function.
> (free_lang_data_type_variant): New function.
> (incomplete_type_of): New function.
> (simplified_type): New function.
> Index: tree.c
> ===
> --- tree.c  (revision 265522)
> +++ tree.c  (working copy)
> @@ -265,6 +265,8 @@ static void print_type_hash_statistics (
>  static void print_debug_expr_statistics (void);
>  static void print_value_expr_statistics (void);
>
> +static void free_lang_data_in_type (tree type);
> +
>  tree global_trees[TI_MAX];
>  tree integer_types[itk_none];
>
> @@ -5038,6 +5041,140 @@ protected_set_expr_location (tree t, loc
>  SET_EXPR_LOCATION (t, loc);
>  }
>
> +/* Do same comparison as check_qualified_type skipping lang part of type
> +   and be more permissive about type names: we only care that names are
> +   same (for diagnostics) and that ODR names are the same.  */
> +
> +static bool
> +types_equal_p (tree t, tree v)

The function name is of course totally misleading.  Please use sth like
fld_type_variants_equal_p.

Note we already split check_qualified_type - can't you somehow re-use
check_base_type (only)?

> +{
> +  if (t==v)
> +return true;
> +
> +  if (TYPE_QUALS (t) != TYPE_QUALS (v))
> +return false;
> +
> +  if (TYPE_NAME (t) != TYPE_NAME (v)
> +  && (!TYPE_NAME (t) || !TYPE_NAME (v)
> + || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL
> + || TREE_CODE (TYPE_NAME (v)) != TYPE_DECL
> + || DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (t))
> +!= DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (v))

I wonder what this is about...

> + || DECL_NAME (TYPE_NAME (t)) != DECL_NAME (TYPE_NAME (v))))

...or this, given we may end up turning DECL_NAMEs to IDENTIFIER_NODEs so
this will crash.

> + return false;
> +
> +  if (TYPE_ALIGN (t) != TYPE_ALIGN (v))
> +return false;
> +
> +  if (!attribute_list_equal (TYPE_ATTRIBUTES (t),
> +TYPE_ATTRIBUTES (v)))
> + return false;
> +
> +  /* Do not replace complete type by incomplete.  */
> +  if ((TREE_CODE (t) == QUAL_UNION_TYPE
> +   || TREE_CODE (t) == UNION_TYPE || TREE_CODE (t) == RECORD_TYPE)
> +  && COMPLETE_TYPE_P (t) != COMPLETE_TYPE_P (v))
> +return false;

?  It looks like this function is a left-over from another patch and the
incomplete type building could use sth leaner and clearer?


> +
> +  gcc_assert (TREE_CODE (t) == TREE_CODE (v));
> +
> +  /* For pointer types and array types we also care about the type they
> + refer to.  */
> +  if (TREE_TYPE (t))
> +return types_equal_p (TREE_TYPE (t), TREE_TYPE (v));
> +
> +  return true;
> +}
> +
> +/* Find variant of FIRST that match T and create new one if necessary.  */
> +
> +static tree
> +free_lang_data_type_variant (tree first, tree t)
> +{
> +  if (first == TYPE_MAIN_VARIANT (t))
> +return t;
> +  for (tree v = first; v; v = TYPE_NEXT_VARIANT (v))
> +if (types_equal_p (t, v))
> +  return v;
> +  tree v = build_variant_type_copy (first);
> +  TYPE_READONLY (v) = TYPE_READONLY (t);
> +  TYPE_VOLATILE (v) = TYPE_VOLATILE (t);
> +  TYPE_ATOMIC (v) = TYPE_ATOMIC (t);
> +  TYPE_RESTRICT (v) = TYPE_RESTRICT (t);
> +  TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t);
> +  TYPE_NAME (v) = TYPE_NAME (t);
> +  TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t);
> +  return v;
> +}
> +
> +/* Map complete types to incomplete types.  */
> +static hash_map *incomplete_types;
> +
> +/* See if type T can be turn

Re: [PATCH, rs6000] Fix _mm_extract_pi16 for big-endian

2018-10-26 Thread Paul Clarke
On 10/25/2018 05:08 PM, Jakub Jelinek wrote:
> On Thu, Oct 25, 2018 at 05:07:03PM -0500, Segher Boessenkool wrote:
>> On Thu, Oct 25, 2018 at 01:41:15PM -0500, Paul Clarke wrote:
>>> For compatibility implementation of x86 vector intrinsic, _mm_extract_pi16,
>>> adjust shift value for big-endian mode.
>>>
>>> Bootstrapped and tested on Linux POWER8 LE, POWER8 BE (64 & 32), and POWER7.
>>
>> Does it fix existing testcases?

No. I found it with my own testing. I have a "to-do" to enhance the testing in 
this area, not only for endian issues, but I think corner/edge cases are not 
well tested.

>> Okay for trunk in either case.  Thanks!  Also fine to backport to 8.

Thanks!

>>> 2018-10-25  Paul A. Clarke  
>>>
>>> * config/rs6000/xmmintrin.h: Fix _mm_extract_pi16 for big-endian.
> 
> The ChangeLog entry is incorrect, should be:
>   * config/rs6000/xmmintrin.h (_mm_extract_pi16): Fix for big-endian.
> or so.

Will fix before committing. Thanks!
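
For context, the kind of endian adjustment discussed here can be sketched in portable C++. The sketch assumes the goal is to return the 16-bit element at memory position n of a 64-bit value, so the shift count has to be mirrored on big-endian hosts. host_is_little_endian and extract_lane16 are hypothetical names, and this is not the code from xmmintrin.h.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the least significant byte is stored first.
static bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Return the 16-bit element at memory position n (0..3) of a 64-bit value.
// On little-endian hosts element 0 sits in the low bits; on big-endian hosts
// it sits in the high bits, so the lane index is mirrored before shifting.
static std::uint16_t extract_lane16(std::uint64_t packed, unsigned n) {
    unsigned lane = n & 3;
    if (!host_is_little_endian())
        lane = 3 - lane;
    return static_cast<std::uint16_t>((packed >> (lane * 16)) & 0xffff);
}
```

With the mirrored index, the same element number selects the same array slot on both byte orders.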

PC



GCC 6 branch is now closed

2018-10-26 Thread Jakub Jelinek
After the GCC 6.5 release the GCC 6 branch is now closed.  Please
refrain from committing to it from now on.

Thanks
Jakub


Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-26 Thread Michael Ploujnikov
Hi Martin,

Thanks for the review.

On 2018-10-26 03:51 AM, Martin Liška wrote:
> On 10/26/18 12:59 AM, Michael Ploujnikov wrote:
>> I've taken the advice from a discussion on IRC and re-wrote the patch
>> with more uniform function names and using overloading.
>>
>> I think this function accomplished the following goals:
>>  - remove clone numbering where it's not needed:
>>final.c:final_scan_insn_1 and
>>symtab.c:simd_symtab_node::noninterposable_alias.
>>  - name and document the clone naming API such that future users won't
>>accidentally use the numbering when it's not necessary; if you need
>>numbering then you need to explicitly ask for it with the right
>>function
>>  - provide a new function that allows users to specify a clone number
>>explicitly as an argument
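
The three naming flavours listed above can be illustrated with plain strings. This is a sketch under stated assumptions: the separator is taken to be '.', although real targets choose it through ASM_FORMAT_PRIVATE_NAME, and the toy_ functions are hypothetical stand-ins that operate on std::string instead of decls and identifiers.

```cpp
#include <string>

// Hypothetical stand-ins for the clone-naming API; '.' separator is assumed.

// Unnumbered form: name.suffix
std::string toy_clone_name(const std::string& name, const std::string& suffix) {
    return name + "." + suffix;
}

// Explicitly numbered form: name.suffix.N, with N supplied by the caller.
std::string toy_clone_name(const std::string& name, const std::string& suffix,
                           unsigned number) {
    return toy_clone_name(name, suffix) + "." + std::to_string(number);
}

// Automatically numbered form: draws N from a global counter on every call.
std::string toy_clone_name_numbered(const std::string& name,
                                    const std::string& suffix) {
    static unsigned counter = 0;
    return toy_clone_name(name, suffix, counter++);
}
```

Keeping the counter behind a distinctly named function means a caller only gets numbering by asking for it explicitly.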
> 
> Hello.
> 
> Thanks for reworking that.
> 
>>
>> My thoughts for future improvements:
>>  - It's a bit unfortunate that lto-partition.c:privatize_symbol_name_1
>>has to break the decl abstraction and pass in a string that it
>>created into what I would consider the implementation-detail
>>function. The best way I can think of to make it uniform with the
>>rest of the users is to have it create a new empty decl with
>>DECL_ASSEMBLER_NAME set to the new string
> 
> It's not nice to create an artificial declaration. Having a string variant
> is fine with me.

Ok.

> 
>>  - It's unfortunate that I have to duplicate the separator
>>concatenation in the numberless clone_function_name, but I think it
>>has to be like that unless ASM_FORMAT_PRIVATE_NAME makes the
>>number optional.
>>
> 
> That's also fine for me. I'm attaching small nits that I found.
> And please reformat following chunk in ChangeLog entry:
> 
>   * cgraph.h (clone_function_name_1): Replaced by new
>   clone_function_name_numbered that takes name as string; for
>   privatize_symbol_name_1 use only.  (clone_function_name):
>   Renamed to clone_function_name_numbered to be explicit about
>   numbering.  (clone_function_name): New two-argument function
>   that does not number its output.  (clone_function_name): New
>   three-argument function that takes a number to append to its
>   output.
> 
> into:
> 
>   * cgraph.h (clone_function_name_1): Replaced by new
>   clone_function_name_numbered that takes name as string; for
>   privatize_symbol_name_1 use only.
> (clone_function_name): Renamed to clone_function_name_numbered
>   to be explicit about...

Fixed, assuming you wanted me to start each function on a new line.

>  suffix,
> -(char*)0));
> +NULL));
>return get_identifier (result);
>  }

I've actually been told that NULL isn't always the same on some
targets and that I should use (char*)0 instead. Note that
libiberty/concat.c itself uses (char*)0.
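
The concern with a bare NULL here is that arguments passed through "..." get no conversion to the expected parameter type, so the sentinel has to arrive as a pointer of the type the callee reads. A minimal sketch with a hypothetical concat_strings helper, which is not libiberty's concat, follows. It is written in C++, so the strings are read as const char* and the terminator is spelled as a typed null pointer.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical sentinel-terminated concatenation. The argument list must end
// with a null pointer of type const char*, since variadic arguments are read
// back with exactly the type named in va_arg.
std::string concat_strings(const char* first, ...) {
    std::string out;
    va_list args;
    va_start(args, first);
    for (const char* s = first; s != nullptr; s = va_arg(args, const char*))
        out += s;
    va_end(args);
    return out;
}
```

A call then looks like concat_strings("foo", ".", "cold", static_cast<const char*>(nullptr)), the C++ counterpart of writing (char*)0 in C.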

> 
> I'm adding Honza to CC, hope he can review it quickly.
> 
> Thanks,
> Martin
> 

Thanks again,
Michael
From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 25 Oct 2018 13:16:36 -0400
Subject: [PATCH] Avoid unnecessarily numbering cloned symbols.

gcc/ChangeLog:

2018-10-26  Michael Ploujnikov  

	* cgraph.h (clone_function_name_1): Replaced by new
	  clone_function_name_numbered that takes name as string; for
	  privatize_symbol_name_1 use only.
	  (clone_function_name): Renamed to
	  clone_function_name_numbered to be explicit about numbering.
	  (clone_function_name): New two-argument function that does
	  not number its output.
	  (clone_function_name): New three-argument function that
	  takes a number to append to its output.
	* cgraphclones.c (duplicate_thunk_for_node):
	  (clone_function_name_1): Renamed.
	  (clone_function_name_numbered): Two new functions.
	  (clone_function_name): Improved documentation.
	  (cgraph_node::create_virtual_clone): Use clone_function_name_numbered.
	* config/rs6000/rs6000.c (make_resolver_func): Ditto.
	* final.c (final_scan_insn_1): Use the new clone_function_name
	  without numbering.
	* multiple_target.c (create_dispatcher_calls): Ditto.
	  (create_target_clone): Ditto.
	* omp-expand.c (grid_expand_target_grid_body): Ditto.
	* omp-low.c (create_omp_child_function_name): Ditto.
	* omp-simd-clone.c (simd_clone_create): Ditto.
	* symtab.c (simd_symtab_node::noninterposable_alias): Use the
	  new clone_function_name without numbering.

gcc/lto/ChangeLog:

2018-10-26  Michael Ploujnikov  

	* lto-partition.c (privatize_symbol_name_1): Use
	  clone_function_name_numbered.

gcc/testsuite/ChangeLog:

2018-10-26  Michael Ploujnikov  

	* gcc.dg/tree-prof/cold_partition_label.c: Update for cold
	  section names without numbers.
	* gcc.dg/tree-prof/section-attr-1.c: Ditto.
	* gcc.dg/tree-prof/section-attr-2.c: Ditto.
	* gcc.dg/tree-prof/section-attr-3.c: Ditto.
---
 gcc/cgraph.h   

Re: [PATCH v2, rs6000 1/4] Fixes for x86 intrinsics on POWER 32bit

2018-10-26 Thread Paul Clarke
On 10/25/2018 05:17 PM, Segher Boessenkool wrote:
> On Thu, Oct 25, 2018 at 02:07:33PM -0500, Paul Clarke wrote:
>> Various clean-ups for 32bit support.
>>
>> Implement various corrections in the compatibility implementations of the
>> x86 vector intrinsics found after enabling 32bit mode for the associated
>> test cases.  (Actual enablement coming in a subsequent patch.)
> 
> So what happened on 32-bit before?  (After you get rid of the #ifdef of
> course).  It isn't clear to me.

Most of the changes are to remove dependency on int128 support, because with 
'-m32', errors were reported:
/opt/at12.0/lib/gcc/powerpc64-linux-gnu/8.2.1/include/xmmintrin.h:992:61: 
error: ‘__int128’ is not supported on this target
   return ((__m64) __builtin_unpack_vector_int128 ((__vector __int128)result, 
0));

Prompted the many changes like:
> -  vm1 = (__vector signed short)__builtin_pack_vector_int128 (__m2, __m1);
> +  vm1 = (__vector signed short) (__vector unsigned long long) { __m2, __m1 };

./xmmintrin.h:1620:21: warning: conversion from ‘long long unsigned int’ to 
‘long unsigned int’ changes value from ‘2269495618449464’ to ‘539504696’ 
[-Woverflow]
   unsigned long p = 0x0008101820283038UL; // permute control for sign bits
^~~~
prompting:
> -  unsigned long p = 0x0008101820283038UL; // permute control for sign bits
> +  unsigned long long p = 0x0008101820283038UL; // permute control for sign 
> bits
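
The -Woverflow numbers above can be reproduced without a 32-bit target. The sketch below uses std::uint32_t to stand in for a 32-bit 'unsigned long'. The constant is the one from the quoted patch, while kPermuteControl and truncated_to_32_bits are hypothetical names.

```cpp
#include <cstdint>

// Permute-control constant from the quoted patch.
constexpr std::uint64_t kPermuteControl = 0x0008101820283038ULL;

// Models assigning the constant to a 32-bit 'unsigned long': only the low
// 32 bits (0x20283038) survive the conversion.
constexpr std::uint32_t truncated_to_32_bits(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}
```

The truncated value is 539504696, matching the warning text, which is why the variable was widened to 'unsigned long long'.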

But if you are asking what happened with the GCC testsuite, then since all of 
the tests were marked "lp64", they were just ignored as UNSUPPORTED with -m32.

PC



Re: Add a loop versioning pass

2018-10-26 Thread Richard Biener
On Thu, Oct 25, 2018 at 3:52 PM Richard Biener
 wrote:
>
> On Wed, Oct 24, 2018 at 3:05 PM Richard Sandiford
>  wrote:
> >
> > This patch adds a pass that versions loops with variable index strides
> > for the case in which the stride is 1.  E.g.:
> >
> > for (int i = 0; i < n; ++i)
> >   x[i * stride] = ...;
> >
> > becomes:
> >
> > > if (stride == 1)
> >   for (int i = 0; i < n; ++i)
> > x[i] = ...;
> > else
> >   for (int i = 0; i < n; ++i)
> > x[i * stride] = ...;
> >
> > This is useful for both vector code and scalar code, and in some cases
> > can enable further optimisations like loop interchange or pattern
> > recognition.
> >
> > The pass gives a 7.6% improvement on Cortex-A72 for 554.roms_r at -O3
> > and a 2.4% improvement for 465.tonto.  I haven't found any SPEC tests
> > that regress.
> >
> > Sizewise, there's a 10% increase in .text for both 554.roms_r and
> > 465.tonto.  That's obviously a lot, but in tonto's case it's because
> > the whole program is written using assumed-shape arrays and pointers,
> > so a large number of functions really do benefit from versioning.
> > roms likewise makes heavy use of assumed-shape arrays, and that
> > improvement in performance IMO justifies the code growth.
> >
> > The next biggest .text increase is 4.5% for 548.exchange2_r.  I did see
> > a small (0.4%) speed improvement there, but although both 3-iteration runs
> > produced stable results, that might still be noise.  There was a slightly
> > larger (non-noise) improvement for a 256-bit SVE model.
> >
> > 481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but
> > without any noticeable improvement in performance.  No other test grew
> > by more than 2%.
> >
> > Although the main SPEC beneficiaries are all Fortran tests, the
> > benchmarks we use for SVE also include some C and C++ tests that
> > benefit.
> >
> > Using -frepack-arrays gives the same benefits in many Fortran cases.
> > The problem is that using that option inappropriately can force a full
> > array copy for arguments that the function only reads once, and so it
> > isn't really something we can turn on by default.  The new pass is
> > supposed to give most of the benefits of -frepack-arrays without
> > the risk of unnecessary repacking.
> >
> > The patch therefore enables the pass by default at -O3.
> >
> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> I will give this a thorough review tomorrow (sorry for the delay),

So I didn't really finish but one thing I noticed is

> +  if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> +   fprintf (dump_file, ";; Want to version loop %d (depth %d)"
> +" for when ", loop->num, loop_depth (loop));
> +   print_generic_expr (dump_file, name, TDF_SLIM);
> +   fprintf (dump_file, " == 1");

Since you are writing a new pass you want to use the new dump interface.

  if (dump_enabled_p ())
    dump_printf (MSG_NOTE, ";; Want to version loop %d (depth %d)"
                 " for when %E == 1", loop->num, loop_depth (loop), name);
...

it's much nicer to be able to use %E/%G than separate calls for the
tree parts.

I'm also cut&pasting my overall comment part but not the incomplete
set of individual comments/questions yet (hope to finish that on Monday):

> Sizewise, there's a 10% increase in .text for both 554.roms_r and
> 465.tonto.  That's obviously a lot, but in tonto's case it's because
> the whole program is written using assumed-shape arrays and pointers,
> so a large number of functions really do benefit from versioning.
> roms likewise makes heavy use of assumed-shape arrays, and that
> improvement in performance IMO justifies the code growth.

Ouch.  I know that at least with LTO IPA-CP can do "quite" some
propagation of constant strides.  Not sure if we're aggressive
enough in actually doing the cloning for all cases we figure out
strides though.  But my question is how we can avoid doing the
versioning for loops in the copy that did not have the IPA-CPed
stride of one?  Ideally we'd be able to mark individual references
as {definitely,likely,unlikely,not}-unit-stride?

> The next biggest .text increase is 4.5% for 548.exchange2_r.  I did see
> a small (0.4%) speed improvement there, but although both 3-iteration runs
> produced stable results, that might still be noise.  There was a slightly
> larger (non-noise) improvement for a 256-bit SVE model.
>
> 481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but
> without any noticeable improvement in performance.  No other test grew
> by more than 2%.
>
> Although the main SPEC beneficiaries are all Fortran tests, the
> benchmarks we use for SVE also include some C and C++ tests that
> benefit.

Did you see any slowdown, for example because versioning was forced
to be on an innermost loop?  I'm thinking of the testcase in
PR87561 where we do have strided accesses in the innermost loop.

Since you cite performance numbers how

Re: Turn complete to incomplete types in free_lang_data

2018-10-26 Thread Jan Hubicka
> > OK if it survives more testing on firefox and lto bootstrap?
> 
> It looks like a hack to do free_lang_data_in_type from free_lang_data_in_decl
> walk - I remember you wanted to unify find_* and free_*?  If not doing that

I did try it :) There is a catch - free lang data calls langhooks to produce
mangled assembler names. For that, the trees must not have been freed yet.
So you can't do the freeing as you discover which trees to follow.

We can save one walk by computing assembler names during the discovery, but we
need to know all trees we want to do langhooks on before we start putting NULL
pointers around.

> why would first doing the type walk and only then the decl walk not work
> to avoid this ugliness?

Hmm, currently we first walk decls and then types, so swapping them would work.
But since I also want to simplify types in function types, it would break next.

> 
> That we need to have variants of the incomplete types at all for the place
> you substitute them (FIELD_DECLs) has what reason?  See also comments below...
> 
> We are getting more and more "interesting" in things we free.  _Please_ work
> on enabling free-lang-data (portions) for all compilations (with
> -fchecking?).  It's disturbing to see so many differences creep in in the
> supposedly "shared" part of regular and LTO compilation.

I wonder what the plan is to make late warnings work reliably in this case?

> > +/* Do same comparison as check_qualified_type skipping lang part of type
> > +   and be more permissive about type names: we only care that names are
> > +   same (for diagnostics) and that ODR names are the same.  */
> > +
> > +static bool
> > +types_equal_p (tree t, tree v)
> 
> The function name is of course totally misleading.  Please use sth like
> fld_type_variants_equal_p.
> 
> Note we already split check_qualified_type - can't you somehow re-use
> check_base_type (only)?

Hmm, you are right. I can re-unify those since this function basically cared
about ...
> 
> > +{
> > +  if (t==v)
> > +return true;
> > +
> > +  if (TYPE_QUALS (t) != TYPE_QUALS (v))
> > +return false;
> > +
> > +  if (TYPE_NAME (t) != TYPE_NAME (v)
> > +  && (!TYPE_NAME (t) || !TYPE_NAME (v)
> > + || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL
> > + || TREE_CODE (TYPE_NAME (v)) != TYPE_DECL
> > + || DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (t))
> > +!= DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (v))
> 
> I wonder what this is about...

... unmerged TYPE_NAMEs, which happen only in the WPA stage. It is a leftover
from my experiment with merging during streaming.

I will clean this up and send an updated patch.  I was in a bit of a hurry
leaving today and wanted to send at least an initial patch for discussion.

Honza


Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Rainer Orth
Hi Than,

> Thanks for reporting this.
>
> Sent https://go-review.googlesource.com/c/gofrontend/+/145017 with a
> tentative fix.

fine, thanks.

While actually running the libgo testsuite, another issue came up: all
tests FAIL like this:

/vol/gcc/src/hg/trunk/local/libgo/testsuite/gotest[516]: local: not found [No such file or directory]
FAIL: crypto/sha512

The Solaris 11 /bin/sh is ksh93, and Solaris 10 even comes with the
original Bourne shell, neither of which support the unportable local
command.  AFAICS, there's no need at all to use it, and indeed with the
following patch libgo testsuite results are as before the patch.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgo/testsuite/gotest b/libgo/testsuite/gotest
--- a/libgo/testsuite/gotest
+++ b/libgo/testsuite/gotest
@@ -513,9 +513,9 @@ localname() {
 #Returned:  leaf.Mumble
 #
 symtogo() {
-  local s=""
-  local result=""
-  local ndots=""
+  s=""
+  result=""
+  ndots=""
   for tp in $*
   do
 s=$(echo "$tp" | sed -e 's/\.\.z2f/%/g' | sed -e 's/.*%//')


Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-26 Thread Jan Hubicka
> From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001
> From: Michael Ploujnikov 
> Date: Thu, 25 Oct 2018 13:16:36 -0400
> Subject: [PATCH] Avoid unnecessarily numbering cloned symbols.
> 
> gcc/ChangeLog:
> 
> 2018-10-26  Michael Ploujnikov  
> 
>   * cgraph.h (clone_function_name_1): Replaced by new
> clone_function_name_numbered that takes name as string; for
> privatize_symbol_name_1 use only.
> (clone_function_name): Renamed to
> clone_function_name_numbered to be explicit about numbering.
> (clone_function_name): New two-argument function that does
> not number its output.
> (clone_function_name): New three-argument function that
> takes a number to append to its output.
>   * cgraphclones.c (duplicate_thunk_for_node):
> (clone_function_name_1): Renamed.
> (clone_function_name_numbered): Two new functions.
> (clone_function_name): Improved documentation.
> (cgraph_node::create_virtual_clone): Use clone_function_name_numbered.
>   * config/rs6000/rs6000.c (make_resolver_func): Ditto.
>   * final.c (final_scan_insn_1): Use the new clone_function_name
> without numbering.
>   * multiple_target.c (create_dispatcher_calls): Ditto.
> (create_target_clone): Ditto.
>   * omp-expand.c (grid_expand_target_grid_body): Ditto.
>   * omp-low.c (create_omp_child_function_name): Ditto.
>   * omp-simd-clone.c (simd_clone_create): Ditto.
>   * symtab.c (simd_symtab_node::noninterposable_alias): Use the
> new clone_function_name without numbering.
> 
> gcc/lto/ChangeLog:
> 
> 2018-10-26  Michael Ploujnikov  
> 
>   * lto-partition.c (privatize_symbol_name_1): Use
> clone_function_name_numbered.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-10-26  Michael Ploujnikov  
> 
>   * gcc.dg/tree-prof/cold_partition_label.c: Update for cold
> section names without numbers.
>   * gcc.dg/tree-prof/section-attr-1.c: Ditto.
>   * gcc.dg/tree-prof/section-attr-2.c: Ditto.
>   * gcc.dg/tree-prof/section-attr-3.c: Ditto.

OK,
thanks!
Honza


Re: gOlogy: skip dbranch at -Og

2018-10-26 Thread Jeff Law
On 10/25/18 11:12 PM, Alexandre Oliva wrote:
> Delayed slot filling moves insns without any regard to variable
> location notes, causing the location information in them to become
> incorrect.
> 
> Fixing that appears to be quite difficult, but filling delay slots is
> hardly an essential optimization to run at -Og, so if the user wants
> to privilege debuggability, skip delay slot filling.
> 
> Regstrapped on sparc64-, x86- and x86_64-linux-gnu.  Also bootstrapped
> on sparc64-linux-gnu with BOOT_CFLAGS='-Og -g' (needed
> -Wno-error={{maybe-,}uninitialized,format-overflow}))
> 
> Ok to install?
> 
> for  gcc/ChangeLog
> 
>   * opts.c (default_options_table): Do not enable
>   OPT_fdelayed_branch at -Og.
>   * doc/invoke.texi (-fdelayed-branch): Document it.
OK.
jeff


Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Than McIntosh via gcc-patches
OK, thanks again. Another fix sent:

 https://go-review.googlesource.com/c/gofrontend/+/145021

Cheers, Than

On Fri, Oct 26, 2018 at 10:20 AM Rainer Orth
 wrote:
>
> Hi Than,
>
> > Thanks for reporting this.
> >
> > Sent https://go-review.googlesource.com/c/gofrontend/+/145017 with a
> > tentative fix.
>
> fine, thanks.
>
> While actually running the libgo testsuite, another issue came up: all
> tests FAIL like this:
>
> /vol/gcc/src/hg/trunk/local/libgo/testsuite/gotest[516]: local: not found [No such file or directory]
> FAIL: crypto/sha512
>
> The Solaris 11 /bin/sh is ksh93, and Solaris 10 even comes with the
> original Bourne shell, neither of which support the unportable local
> command.  AFAICS, there's no need at all to use it, and indeed with the
> following patch libgo testsuite results are as before the patch.
>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>


Re: [PATCH 0/7] libsanitizer: merge from trunk

2018-10-26 Thread Bill Seurer

On 10/26/18 03:57, Jakub Jelinek wrote:
> On Thu, Oct 25, 2018 at 12:49:42PM +0200, Jakub Jelinek wrote:
>> On Thu, Oct 25, 2018 at 12:15:46PM +0200, marxin wrote:
>>> I've just finished my first merge from libsanitizer mainline. Overall it
>>> looks fine, apparently ABI hasn't changed and so that SONAME bump is not
>>> needed.
>>
>> Given the 6/7 patch, I think you need to bump libasan soname (it would be
>> weird to bump it on powerpc64* only).
>
> BTW, how can shadow offset be 1UL<<44 on powerpc64?  That seems like they
> don't want to support anything but very recent kernels.
> E.g. looking at Linux 3.4 arch/powerpc/include/asm/processor.h
> I see
> /* 64-bit user address space is 44-bits (16TB user VM) */
> #define TASK_SIZE_USER64 (0x0000100000000000UL)
> so, the new choice must be incompatible with lots of kernels out there.
> More recent kernels have:
> #define TASK_SIZE_64TB  (0x0000400000000000UL)
> #define TASK_SIZE_128TB (0x0000800000000000UL)
> #define TASK_SIZE_512TB (0x0002000000000000UL)
> #define TASK_SIZE_1PB   (0x0004000000000000UL)
> #define TASK_SIZE_2PB   (0x0008000000000000UL)
> #define TASK_SIZE_4PB   (0x0010000000000000UL)
> but 4.15 still tops at 512TB, 4.10 has just 64TB as the only choice, 3.8 as
> well.
>
> CCing Bill as he made this change.
>
> Jakub



At the time for llvm the concern was to get it to work on newer kernels 
and not worry (much) about the older ones.  I did spend some time trying 
to get it to work for both.


--

-Bill Seurer



Re: [PATCH 0/7] libsanitizer: merge from trunk

2018-10-26 Thread Jakub Jelinek
On Fri, Oct 26, 2018 at 09:48:54AM -0500, Bill Seurer wrote:
> On 10/26/18 03:57, Jakub Jelinek wrote:
> > On Thu, Oct 25, 2018 at 12:49:42PM +0200, Jakub Jelinek wrote:
> > > On Thu, Oct 25, 2018 at 12:15:46PM +0200, marxin wrote:
> > > > I've just finished my first merge from libsanitizer mainline. Overall it
> > > > looks fine, apparently ABI hasn't changed and so that SONAME bump is not
> > > > needed.
> > > 
> > > Given the 6/7 patch, I think you need to bump libasan soname (it would be
> > > weird to bump it on powerpc64* only).
> > 
> > BTW, how can shadow offset be 1UL<<44 on powerpc64?  That seems like they
> > don't want to support anything but very recent kernels.
> > E.g. looking at Linux 3.4 arch/powerpc/include/asm/processor.h
> > I see
> > /* 64-bit user address space is 44-bits (16TB user VM) */
> > > #define TASK_SIZE_USER64 (0x0000100000000000UL)
> > > so, the new choice must be incompatible with lots of kernels out there.
> > > More recent kernels have:
> > > #define TASK_SIZE_64TB  (0x0000400000000000UL)
> > > #define TASK_SIZE_128TB (0x0000800000000000UL)
> > > #define TASK_SIZE_512TB (0x0002000000000000UL)
> > > #define TASK_SIZE_1PB   (0x0004000000000000UL)
> > > #define TASK_SIZE_2PB   (0x0008000000000000UL)
> > > #define TASK_SIZE_4PB   (0x0010000000000000UL)
> > but 4.15 still tops at 512TB, 4.10 has just 64TB as the only choice, 3.8 as
> > well.
> > 
> > CCing Bill as he made this change.
> > 
> > Jakub
> > 
> 
> At the time for llvm the concern was to get it to work on newer kernels and
> not worry (much) about the older ones.  I did spend some time trying to get
> it to work for both.

Which exact task size doesn't work if shadow offset is 2TB and why?

Jakub


Re: [ARM/FDPIC v3 03/21] [ARM] FDPIC: Force FDPIC related options unless -mno-fdpic is provided

2018-10-26 Thread Christophe Lyon
On Tue, 23 Oct 2018 at 17:14, Segher Boessenkool
 wrote:
>
> On Tue, Oct 23, 2018 at 02:58:21PM +0100, Richard Earnshaw (lists) wrote:
> > On 15/10/2018 11:10, Christophe Lyon wrote:
> > > Do you mean to also make -mfdpic non-existent/rejected when GCC is not
> > > configured
> > > for arm-uclinuxfdpiceabi?
> >
> > Ideally doesn't exist, so that it doesn't show up in things like --help
> > when it doesn't work.
> >
> > > How to achieve that?
> >
> > Good question, I'm not sure, off hand.  It might be possible to make the
> > config machinery add additional opt files, but it's not something I've
> > tried.  You might want to try adding an additional opt file to
> > extra_options for fdpic targets.
>
> That should work yes.  You could look at how 476.opt is added for powerpc,
> it is a comparable situation.
>

Thanks, I got it to work.

Now back to Richard's original question:
> I think this needs to be resolved.  Either -mfdpic works everywhere, or
> the option should only be available when configured for -mfdpic.
It's not that -mfdpic does not work everywhere, rather it is not sufficient
to use it alone: it should be used along with fpic/fPIC/fpie/fPIE depending
on the use case.

In practice I don't know if we want to be able to use -mfdpic with an
arm-linux-gnueabi toolchain, or if we are OK with having to use two
different toolchains when we want to run tests or compare code generation
in both cases.

The 1st option means I should improve the documentation patch. For the 2nd one,
I have patches in progress (which also imply reworking the doc, since the option
would not always be available).

Christophe


>
> Segher


Re: [PATCH] Add option to control warnings added through attribure "warning"

2018-10-26 Thread Martin Sebor

On 10/26/2018 05:01 AM, Nikolai Merinov wrote:

Hi,

What next steps should I perform in order to get these changes merged to GCC?


Keep pinging it once a week until a maintainer approves it.
I'm not empowered to do that.

Martin



Regards,
Nikolai

----- Original Message -----
From: "Nikolai Merinov"
To: "Martin Sebor" , gcc-patches@gcc.gnu.org
Sent: Monday, October 15, 2018 3:21:15 PM
Subject: Re: [PATCH] Add option to control warnings added through attribure "warning"

Hi Martin,

On 10/15/18 6:20 PM, Martin Sebor wrote:

On 10/15/2018 01:55 AM, Nikolai Merinov wrote:

Hi Martin,

On 10/12/18 9:58 PM, Martin Sebor wrote:

On 10/12/2018 04:14 AM, Nikolai Merinov wrote:

Hello,

In https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01795.html I suggested
a patch that adds the ability to control the behavior of
"__attribute__((warning))" when the option "-Werror" is enabled. Usage
example:


#include
int a() __attribute__((warning("Warning: `a' was used")));
int a() { return 1; }
int main () { return a(); }



$ gcc -Werror test.c
test.c: In function ‘main’:
test.c:4:22: error: call to ‘a’ declared with attribute warning:
Warning: `a' was used [-Werror]
 int main () { return a(); }
  ^
cc1: all warnings being treated as errors
$ gcc -Werror -Wno-error=warning-attribute test.c
test.c: In function ‘main’:
test.c:4:22: warning: call to ‘a’ declared with attribute warning:
Warning: `a' was used
 int main () { return a(); }
  ^

Can you provide any feedback on suggested changes?


It seems like a useful feature and in line with the philosophy
that distinct warnings should be controlled by their own options.

I would only suggest to consider changing the name to
-Wattribute-warning, because it applies specifically to that
attribute (as opposed to warnings about attributes in general).

There are many attributes in GCC and diagnosing problems that
are unique to each, under the same -Wattributes option, is
becoming too coarse and overly limiting.  To make it more
flexible, I expect new options will need to be introduced,
such as -Wattribute-alias (to control aspects of the alias
attribute and others related to it), or -Wattribute-const
(to control diagnostics about functions declared with
attribute const that violate the attribute's constraints).

An alternative might be to introduce a single -Wattribute= option
whose argument gives the names of all the distinct attributes
whose unique diagnostics one might need to control.

Martin


Currently there are several styles already in use:

-Wattribute-alias, where the word "attribute" is used as a prefix for the
name of the attribute,
-Wsuggest-attribute=[pure|const|noreturn|format|malloc], where the name of
the attribute is passed as an argument,
-Wmissing-format-attribute, where the word "attribute" is used as a suffix,
-Wdeprecated-declarations, where the word "attribute" is not used at all,
even though this warning option was created specifically for the
"deprecated" attribute.

I changed the name to "-Wattribute-warning" as you suggested, but unifying
the style of all attribute-related warnings looks like a separate activity.
Please check the new patch in the attachments.



Thanks for the survey!  I agree that making the existing options
consistent (if that's what we want) should be done separately.

Martin

PS It doesn't look like your latest attachments made it to
the list.


Thank you for mentioning it. That was my mistake; it's attached now.



Updated changelog:

gcc/ChangeLog

2018-10-14  Nikolai Merinov

 * gcc/common.opt: Add -Wattribute-warning.
 * gcc/doc/invoke.texi: Add documentation for -Wno-attribute-warning.
 * gcc/testsuite/gcc.dg/Wno-attribute-warning.c: New test.
 * gcc/expr.c (expand_expr_real_1): Add new attribute to warning_at
 call to allow the user to configure the behavior of the "warning" attribute.




Re: [PATCH] Add option to control warnings added through attribure "warning"

2018-10-26 Thread Jeff Law
On 10/26/18 9:11 AM, Martin Sebor wrote:
> On 10/26/2018 05:01 AM, Nikolai Merinov wrote:
>> Hi,
>>
>> What next steps should I perform in order to get these changes merged
>> to GCC?
> 
> Keep pinging it once a week until a maintainer approves it.
> I'm not empowered to do that.
Nikolai -- your patch isn't lost.  It's definitely in the queue.

jeff


Re: [C++ Patch] PR 84644 ("internal compiler error: in warn_misplaced_attr_for_class_type, at cp/decl.c:4718")

2018-10-26 Thread Jason Merrill
On Fri, Oct 26, 2018 at 4:52 AM Paolo Carlini  wrote:
> On 24/10/18 22:41, Jason Merrill wrote:
> > On 10/15/18 12:45 PM, Paolo Carlini wrote:
> >> && ((TREE_CODE (declspecs->type) != TYPENAME_TYPE
> >> +   && TREE_CODE (declspecs->type) != DECLTYPE_TYPE
> >>  && MAYBE_CLASS_TYPE_P (declspecs->type))
> >
> > I would think that the MAYBE_CLASS_TYPE_P here should be CLASS_TYPE_P,
> > and then we can remove the TYPENAME_TYPE check.  Or do we want to
> > allow template type parameters for some reason?
>
> Indeed, it would be nice to just use OVERLOAD_TYPE_P. However it seems
> we at least want to let through TEMPLATE_TYPE_PARMs representing 'auto'
> - otherwise Dodji's check a few lines below which fixed c++/51473
> doesn't work anymore - and also BOUND_TEMPLATE_TEMPLATE_PARM, otherwise
> we regress on template/spec32.C and template/ttp22.C because we don't
> diagnose the shadowing anymore. Thus, I would say either we keep on
> using MAYBE_CLASS_TYPE_P or we pick what we need, possibly we add a comment?

Aha.  I guess the answer is not to restrict that test any more, but
instead to fix the code further down so it gives a proper diagnostic
rather than call warn_misplaced_attr_for_class_type.

Jason


Re: [ARM/FDPIC v3 04/21] [ARM] FDPIC: Add support for FDPIC for arm architecture

2018-10-26 Thread Christophe Lyon
On Tue, 23 Oct 2018 at 16:07, Richard Earnshaw (lists)
 wrote:
>
> On 19/10/2018 14:40, Christophe Lyon wrote:
> > On 12/10/2018 12:45, Richard Earnshaw (lists) wrote:
> >> On 11/10/18 14:34, Christophe Lyon wrote:
> >>> The FDPIC register is hard-coded to r9, as defined in the ABI.
> >>>
> >>> We have to disable tailcall optimizations if we don't know if the
> >>> target function is in the same module. If not, we have to set r9 to
> >>> the value associated with the target module.
> >>>
> >>> When generating a symbol address, we have to take into account whether
> >>> it is a pointer to data or to a function, because different
> >>> relocations are needed.
> >>>
> >>> 2018-XX-XX  Christophe Lyon  
> >>> Mickaël Guêné 
> >>>
> >>> * config/arm/arm-c.c (__FDPIC__): Define new pre-processor macro
> >>> in FDPIC mode.
> >>> * config/arm/arm-protos.h (arm_load_function_descriptor): Declare
> >>> new function.
> >>> * config/arm/arm.c (arm_option_override): Define pic register to
> >>> FDPIC_REGNUM.
> >>> (arm_function_ok_for_sibcall) Disable sibcall optimization if we
> >>
> >> Missing colon.
> >>
> >>> have no decl or go through PLT.
> >>> (arm_load_pic_register): Handle TARGET_FDPIC.
> >>> (arm_is_segment_info_known): New function.
> >>> (arm_pic_static_addr): Add support for FDPIC.
> >>> (arm_load_function_descriptor): New function.
> >>> (arm_assemble_integer): Add support for FDPIC.
> >>> * config/arm/arm.h (PIC_OFFSET_TABLE_REG_CALL_CLOBBERED):
> >>> Define. (FDPIC_REGNUM): New define.
> >>> * config/arm/arm.md (call): Add support for FDPIC.
> >>> (call_value): Likewise.
> >>> (*restore_pic_register_after_call): New pattern.
> >>> (untyped_call): Disable if FDPIC.
> >>> (untyped_return): Likewise.
> >>> * config/arm/unspecs.md (UNSPEC_PIC_RESTORE): New.
> >>>
> >>
> >> Other comments inline.
> >>
> >>> diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
> >>> index 4471f79..90733cc 100644
> >>> --- a/gcc/config/arm/arm-c.c
> >>> +++ b/gcc/config/arm/arm-c.c
> >>> @@ -202,6 +202,8 @@ arm_cpu_builtins (struct cpp_reader* pfile)
> >>> builtin_define ("__ARM_EABI__");
> >>>   }
> >>>   +  def_or_undef_macro (pfile, "__FDPIC__", TARGET_FDPIC);
> >>> +
> >>> def_or_undef_macro (pfile, "__ARM_ARCH_EXT_IDIV__", TARGET_IDIV);
> >>> def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);
> >>>   diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> >>> index 0dfb3ac..28cafa8 100644
> >>> --- a/gcc/config/arm/arm-protos.h
> >>> +++ b/gcc/config/arm/arm-protos.h
> >>> @@ -136,6 +136,7 @@ extern int arm_max_const_double_inline_cost (void);
> >>>   extern int arm_const_double_inline_cost (rtx);
> >>>   extern bool arm_const_double_by_parts (rtx);
> >>>   extern bool arm_const_double_by_immediates (rtx);
> >>> +extern rtx arm_load_function_descriptor (rtx funcdesc);
> >>>   extern void arm_emit_call_insn (rtx, rtx, bool);
> >>>   bool detect_cmse_nonsecure_call (tree);
> >>>   extern const char *output_call (rtx *);
> >>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> >>> index 8810df5..92ae24b 100644
> >>> --- a/gcc/config/arm/arm.c
> >>> +++ b/gcc/config/arm/arm.c
> >>> @@ -3470,6 +3470,14 @@ arm_option_override (void)
> >>> if (flag_pic && TARGET_VXWORKS_RTP)
> >>>   arm_pic_register = 9;
> >>>   +  /* If in FDPIC mode then force arm_pic_register to be r9.  */
> >>> +  if (TARGET_FDPIC)
> >>> +{
> >>> +  arm_pic_register = FDPIC_REGNUM;
> >>> +  if (TARGET_ARM_ARCH < 7)
> >>> +error ("FDPIC mode is not supported on architectures older than
> >>> Armv7");
> >>
> >> What properties of FDPIC impose this requirement?  Does it also apply to
> >> Armv8-m.baseline?
> >>
> > In fact, there was miscommunication on my side, resulting in a
> > misunderstanding between Kyrill and myself, which I badly translated
> > into this condition.
> >
> > My initial plan was to submit a patch series tested on v7, and send the
> > patches needed to support older architectures as a follow-up. The proper
> > restriction is actually "CPUs that do not support ARM or Thumb2". As you
> > may have noticed during the iterations of this patch series, I had
> > failed to remove partial Thumb1 support hunks.
> >
> > So really this should be rephrased, and rewritten as "FDPIC mode is
> > supported on architecture versions that support ARM or Thumb-2", if that
> > suits you. And the condition should thus be:
> > if (! TARGET_ARM && ! TARGET_THUMB2)
> >   error ("...")
> >
> > This would also exclude Armv8-m.baseline, since it doesn't support Thumb2.
>
> When we get to v8-m.baseline the thumb1/2 distinction starts to become a
> lot more blurred.  A lot of thumb2 features needed for stand-alone
> systems are then available.  So what feature is it that you require in
> order to make fdpic work in (traditional) thumb2 that isn't in
> (traditional) thumb1?
>
At the moment I'm not s

Re: [PATCH] S/390: Allow immediates in loc expander

2018-10-26 Thread Robin Dapp
Hi,

this is v2 of the patch.  The Z13 check has been moved from the
predicate to the expander.  In addition, it changes a test case to
always run with -march=zEC12 because from z13 on the load immediate on
condition will prevent loop hoisting that the test requires.

Regards
 Robin

--

gcc/ChangeLog:

2018-10-26  Robin Dapp  

* config/s390/predicates.md: Fix typo.
* config/s390/s390.md: Allow immediates for load on condition.

gcc/testsuite/ChangeLog:

2018-10-26  Robin Dapp  

* gcc.dg/loop-8.c: On s390, always run the test with -march=zEC12.
diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 98a824e77b7..97f717c558d 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -212,7 +212,7 @@
 (INTVAL (op), false, GET_MODE_BITSIZE (mode), NULL, NULL);
 })
 
-;; Return true if OP is ligitimate for any LOC instruction.
+;; Return true if OP is legitimate for any LOC instruction.
 
 (define_predicate "loc_operand"
   (ior (match_operand 0 "nonimmediate_operand")
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 3bd18acb456..ba1fa0c6ff3 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -6582,10 +6582,16 @@
 (define_expand "mov<mode>cc"
   [(set (match_operand:GPR 0 "nonimmediate_operand" "")
 	(if_then_else:GPR (match_operand 1 "comparison_operator" "")
-			  (match_operand:GPR 2 "nonimmediate_operand" "")
-			  (match_operand:GPR 3 "nonimmediate_operand" "")))]
+			  (match_operand:GPR 2 "loc_operand" "")
+			  (match_operand:GPR 3 "loc_operand" "")))]
   "TARGET_Z196"
 {
+  if (!TARGET_Z13 && CONSTANT_P (operands[2]))
+    operands[2] = force_reg (<MODE>mode, operands[2]);
+
+  if (!TARGET_Z13 && CONSTANT_P (operands[3]))
+    operands[3] = force_reg (<MODE>mode, operands[3]);
+
   /* Emit the comparison insn in case we do not already have a comparison result.  */
   if (!s390_comparison (operands[1], VOIDmode))
 operands[1] = s390_emit_compare (GET_CODE (operands[1]),
diff --git a/gcc/testsuite/gcc.dg/loop-8.c b/gcc/testsuite/gcc.dg/loop-8.c
index 842c0e773b2..1eefccc1a3b 100644
--- a/gcc/testsuite/gcc.dg/loop-8.c
+++ b/gcc/testsuite/gcc.dg/loop-8.c
@@ -1,6 +1,10 @@
 /* { dg-do compile } */
 /* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
 /* { dg-skip-if "unexpected IV" { "hppa*-*-* mips*-*-* visium-*-* powerpc*-*-* riscv*-*-*" } } */
+/* Load immediate on condition is available from z13 on and prevents moving
+   the load out of the loop, so always run this test with -march=zEC12 that
+   does not have load immediate on condition.  */
+/* { dg-additional-options "-march=zEC12" { target { s390*-*-* } } } */
 
 void
 f (int *a, int *b)


Re: [PATCH] S/390: Add loc patterns for QImode and HImode

2018-10-26 Thread Robin Dapp
Hi,

this is v2 of the patch with less quirky pattern syntax and two tests.

Regards
 Robin

--

gcc/ChangeLog:

2018-10-26  Robin Dapp  

* config/s390/s390.md: QImode and HImode for load on condition.

gcc/testsuite/ChangeLog:

2018-10-26  Robin Dapp  

* gcc.target/s390/ifcvt-one-insn-bool.c: New test.
* gcc.target/s390/ifcvt-one-insn-char.c: New test.
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index ba1fa0c6ff3..d62da92384d 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -6599,6 +6599,44 @@
  XEXP (operands[1], 1));
 })
 
+;;
+;; - We do not have instructions for QImode or HImode but still
+;;   enable load on condition/if conversion for them.
+(define_expand "mov<mode>cc"
+ [(set (match_operand:HQI 0 "nonimmediate_operand" "")
+	(if_then_else:HQI (match_operand 1 "comparison_operator" "")
+		(match_operand:HQI 2 "loc_operand" "")
+		(match_operand:HQI 3 "loc_operand" "")))]
+ "TARGET_Z196"
+{
+  /* Emit the comparison insn in case we do not already have a comparison
+ result. */
+  if (!s390_comparison (operands[1], VOIDmode))
+operands[1] = s390_emit_compare (GET_CODE (operands[1]),
+			  XEXP (operands[1], 0),
+			  XEXP (operands[1], 1));
+
+  rtx then = operands[2];
+  rtx els = operands[3];
+
+  if ((!TARGET_Z13 && CONSTANT_P (then)) || MEM_P (then))
+	then = force_reg (<MODE>mode, then);
+  if ((!TARGET_Z13 && CONSTANT_P (els)) || MEM_P (els))
+	els = force_reg (<MODE>mode, els);
+
+  if (!CONSTANT_P (then))
+    then = simplify_gen_subreg (E_SImode, then, <MODE>mode, 0);
+  if (!CONSTANT_P (els))
+    els = simplify_gen_subreg (E_SImode, els, <MODE>mode, 0);
+
+  rtx tmp_target = gen_reg_rtx (E_SImode);
+  emit_insn (gen_movsicc (tmp_target, operands[1], then, els));
+  emit_move_insn (operands[0], gen_lowpart (<MODE>mode, tmp_target));
+  DONE;
+})
+
+
+
 ; locr, loc, stoc, locgr, locg, stocg, lochi, locghi
 (define_insn "*mov<mode>cc"
   [(set (match_operand:GPR 0 "nonimmediate_operand"   "=d,d,d,d,d,d,S,S")
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c b/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c
new file mode 100644
index 00000000000..0a96b71f458
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-bool.c
@@ -0,0 +1,24 @@
+/* Check load on condition for bool.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O2 -march=z13" } */
+
+/* { dg-final { scan-assembler "lochinh\t%r.?,1" } } */
+#include <stdbool.h>
+
+int foo (int *a, unsigned int n)
+{
+  int min = 99;
+  bool bla = false;
+  for (int i = 0; i < n; i++)
+{
+  if (a[i] < min)
+	{
+	  bla = true;
+	}
+}
+
+  if (bla)
+min += 1;
+  return min;
+}
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-char.c b/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-char.c
new file mode 100644
index 00000000000..9c3d041618b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-one-insn-char.c
@@ -0,0 +1,26 @@
+/* Check load on condition for global char.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O2 -march=z13" } */
+
+/* { dg-final { scan-assembler "locrnh\t%r.?,%r.?" } } */
+#include 
+
+char g = 42;
+
+int foo (int *a, unsigned int n)
+{
+  int min = 99;
+  char bla = 3;
+  for (int i = 0; i < n; i++)
+{
+  if (a[i] < min)
+	{
+	  bla = g;
+	}
+}
+
+  if (bla == 42)
+min += 1;
+  return min;
+}


Re: [ARM/FDPIC v3 04/21] [ARM] FDPIC: Add support for FDPIC for arm architecture

2018-10-26 Thread Richard Earnshaw (lists)
On 26/10/2018 16:25, Christophe Lyon wrote:
> On Tue, 23 Oct 2018 at 16:07, Richard Earnshaw (lists)
>  wrote:
>>
>> On 19/10/2018 14:40, Christophe Lyon wrote:
>>> On 12/10/2018 12:45, Richard Earnshaw (lists) wrote:
 On 11/10/18 14:34, Christophe Lyon wrote:
> The FDPIC register is hard-coded to r9, as defined in the ABI.
>
> We have to disable tailcall optimizations if we don't know if the
> target function is in the same module. If not, we have to set r9 to
> the value associated with the target module.
>
> When generating a symbol address, we have to take into account whether
> it is a pointer to data or to a function, because different
> relocations are needed.
>
> 2018-XX-XX  Christophe Lyon  
> Mickaël Guêné 
>
> * config/arm/arm-c.c (__FDPIC__): Define new pre-processor macro
> in FDPIC mode.
> * config/arm/arm-protos.h (arm_load_function_descriptor): Declare
> new function.
> * config/arm/arm.c (arm_option_override): Define pic register to
> FDPIC_REGNUM.
> (arm_function_ok_for_sibcall) Disable sibcall optimization if we

 Missing colon.

> have no decl or go through PLT.
> (arm_load_pic_register): Handle TARGET_FDPIC.
> (arm_is_segment_info_known): New function.
> (arm_pic_static_addr): Add support for FDPIC.
> (arm_load_function_descriptor): New function.
> (arm_assemble_integer): Add support for FDPIC.
> * config/arm/arm.h (PIC_OFFSET_TABLE_REG_CALL_CLOBBERED):
> Define. (FDPIC_REGNUM): New define.
> * config/arm/arm.md (call): Add support for FDPIC.
> (call_value): Likewise.
> (*restore_pic_register_after_call): New pattern.
> (untyped_call): Disable if FDPIC.
> (untyped_return): Likewise.
> * config/arm/unspecs.md (UNSPEC_PIC_RESTORE): New.
>

 Other comments inline.

> diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
> index 4471f79..90733cc 100644
> --- a/gcc/config/arm/arm-c.c
> +++ b/gcc/config/arm/arm-c.c
> @@ -202,6 +202,8 @@ arm_cpu_builtins (struct cpp_reader* pfile)
> builtin_define ("__ARM_EABI__");
>   }
>   +  def_or_undef_macro (pfile, "__FDPIC__", TARGET_FDPIC);
> +
> def_or_undef_macro (pfile, "__ARM_ARCH_EXT_IDIV__", TARGET_IDIV);
> def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);
>   diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 0dfb3ac..28cafa8 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -136,6 +136,7 @@ extern int arm_max_const_double_inline_cost (void);
>   extern int arm_const_double_inline_cost (rtx);
>   extern bool arm_const_double_by_parts (rtx);
>   extern bool arm_const_double_by_immediates (rtx);
> +extern rtx arm_load_function_descriptor (rtx funcdesc);
>   extern void arm_emit_call_insn (rtx, rtx, bool);
>   bool detect_cmse_nonsecure_call (tree);
>   extern const char *output_call (rtx *);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 8810df5..92ae24b 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3470,6 +3470,14 @@ arm_option_override (void)
> if (flag_pic && TARGET_VXWORKS_RTP)
>   arm_pic_register = 9;
>   +  /* If in FDPIC mode then force arm_pic_register to be r9.  */
> +  if (TARGET_FDPIC)
> +{
> +  arm_pic_register = FDPIC_REGNUM;
> +  if (TARGET_ARM_ARCH < 7)
> +error ("FDPIC mode is not supported on architectures older than Armv7");

 What properties of FDPIC impose this requirement?  Does it also apply to
 Armv8-m.baseline?

>>> In fact, there was miscommunication on my side, resulting in a
>>> misunderstanding between Kyrill and myself, which I badly translated
>>> into this condition.
>>>
>>> My initial plan was to submit a patch series tested on v7, and send the
>>> patches needed to support older architectures as a follow-up. The proper
>>> restriction is actually "CPUs that do not support ARM or Thumb2". As you
>>> may have noticed during the iterations of this patch series, I had
>>> failed to remove partial Thumb1 support hunks.
>>>
>>> So really this should be rephrased, and rewritten as "FDPIC mode is
>>> supported on architecture versions that support ARM or Thumb-2", if that
>>> suits you. And the condition should thus be:
>>> if (! TARGET_ARM && ! TARGET_THUMB2)
>>>   error ("...")
>>>
>>> This would also exclude Armv8-m.baseline, since it doesn't support Thumb2.
>>
>> When we get to v8-m.baseline the thumb1/2 distinction starts to become a
>> lot more blurred.  A lot of thumb2 features needed for stand-alone
>> systems are then available.  So what feature is it that you require in
>> order to make fdpic work in (traditional) thumb2 that isn

Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Rainer Orth
Hi Than,

> OK, thanks again. Another fix sent:
>
>  https://go-review.googlesource.com/c/gofrontend/+/145021

great, thanks again.  While the two previous patches were enough to get
decent Solaris 11 results, on Solaris 10 all libgo tests still FAIL like
this:

_testmain.go:9:25: error: reference to undefined identifier 'tar.TestReaderntar'
9 |  {"TestFileWriter", 
tar.TestReaderntar.TestPartialReadntar.TestUninitializedReadntar.TestReadTruncationntar.TestReadHeaderOnlyntar.TestMergePAXntar.TestParsePAXntar.TestReadOldGNUSparseMapntar.TestReadGNUSparsePAXHeadersntar.TestFileReaderntar.TestFitsInBase256ntar.TestParseNumericntar.TestFormatNumericntar.TestFitsInOctalntar.TestParsePAXTimentar.TestFormatPAXTimentar.TestParsePAXRecordntar.TestFormatPAXRecordntar.TestSparseEntriesntar.TestFileInfoHeaderntar.TestFileInfoHeaderDirntar.TestFileInfoHeaderSymlinkntar.TestRoundTripntar.TestHeaderRoundTripntar.TestHeaderAllowedFormatsntar.TestWriterntar.TestPaxntar.TestPaxSymlinkntar.TestPaxNonAsciintar.TestPaxXattrsntar.TestPaxHeadersSortedntar.TestUSTARLongNamentar.TestValidTypeflagWithPAXHeaderntar.TestWriterErrorsntar.TestSplitUSTARPathntar.TestIssue12594ntar.TestFileWriter},
  | ^
FAIL: archive/tar

This is due to gotest (symtogo):

  echo "$result" | sed -e 's/ /\n/g'

which even Solaris 11.5 Beta /bin/sed treats like this:

$ echo 'a b' | /bin/sed -e 's/ /\n/g'
anb

I still got decent results because GNU sed is earlier in my PATH on
Solaris 11, but Solaris 10 lacks it.  However, this seems to work (both
with Solaris sed and GNU sed):

echo "$result" | /bin/sed -e 's/ /\
/g'

It allows the Solaris 10 libgo testing to work, but I can't tell if that's
the most portable way to do this.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgo/testsuite/gotest b/libgo/testsuite/gotest
--- a/libgo/testsuite/gotest
+++ b/libgo/testsuite/gotest
@@ -529,7 +529,8 @@ symtogo() {
   result="${result} ${s}"
 fi
   done
-  echo "$result" | sed -e 's/ /\n/g'
+  echo "$result" | sed -e 's/ /\
+/g'
 }
 
 {


Re: Cleanup handling of variants in ipa-devirt

2018-10-26 Thread Bernhard Reutner-Fischer
On 26 October 2018 09:18:39 CEST, Jan Hubicka  wrote:

@@ -1094,10 +1133,15 @@ warn_types_mismatch (tree t1, tree t2, l
gcc_assert (TYPE_NAME (t1) && TYPE_NAME (t2)
 && TREE_CODE (TYPE_NAME (t1)) == TYPE_DECL
 && TREE_CODE (TYPE_NAME (t2)) == TYPE_DECL);
+ tree n1 = TYPE_NAME (t1);
+ tree n2 = TYPE_NAME (t2);
+ if (TREE_CODE (n1) == TYPE_DECL)
+   n1 = DECL_NAME (n1);
+ if (TREE_CODE (n2) == TYPE_DECL)
+   n1 = DECL_NAME (n2);
/* Most of the time, the type names will match, do not be unnecesarily
verbose. */

Typo?
Please explain why you don't set n2 but overwrite n1?

thanks,


Re: Cleanup handling of variants in ipa-devirt

2018-10-26 Thread Jan Hubicka
> On 26 October 2018 09:18:39 CEST, Jan Hubicka  wrote:
> 
> @@ -1094,10 +1133,15 @@ warn_types_mismatch (tree t1, tree t2, l
> gcc_assert (TYPE_NAME (t1) && TYPE_NAME (t2)
>&& TREE_CODE (TYPE_NAME (t1)) == TYPE_DECL
>&& TREE_CODE (TYPE_NAME (t2)) == TYPE_DECL);
> + tree n1 = TYPE_NAME (t1);
> + tree n2 = TYPE_NAME (t2);
> + if (TREE_CODE (n1) == TYPE_DECL)
> + n1 = DECL_NAME (n1);
> + if (TREE_CODE (n2) == TYPE_DECL)
> + n1 = DECL_NAME (n2);
> /* Most of the time, the type names will match, do not be unnecesarily
> verbose. */
> 
> Typo?
> Please explain why you don't set n2 but overwrite n1?

Yep, it was a typo. I already fixed it in a followup patch...

honza

> 
> thanks,


Re: [PATCH][rs6000] improve gpr inline expansion of str[n]cmp

2018-10-26 Thread Segher Boessenkool
Hi Aaron,

On Thu, Oct 25, 2018 at 09:11:56AM -0500, Aaron Sawdey wrote:
> This patch changes the sequence that gcc generates for inline expansion of
> strcmp/strncmp using scalar (gpr) instructions. The new sequence is one
> instruction shorter and uses cmpb/cmpb/orc./bne which I also have been
> told that valgrind should be able to understand as the defined/undefined
> info can be propagated and should show that the branch is not based on
> any undefined data past the end of the string.

> 2018-10-25  Aaron Sawdey  
> 
>   * config/rs6000/rs6000-string.c (expand_strncmp_gpr_sequence): Change to
>   a shorter sequence with fewer branches.
>   (emit_final_str_compare_gpr): Ditto.

> Index: gcc/config/rs6000/rs6000-string.c
> ===
> --- gcc/config/rs6000/rs6000-string.c (revision 265393)
> +++ gcc/config/rs6000/rs6000-string.c (working copy)
> @@ -259,7 +259,7 @@
>gcc_assert (mode == E_QImode);
>emit_move_insn (reg, mem);
>break;
> -
> +

What does this change?  Both those lines are equal (completely empty),
maybe your mailer ate some trailing whitespace (that the patch either
added or removed, I can't tell :-) )

The patch is fine however.  Thank you!  Okay for backport to 8 as well
(after some burn-in), it helps solve the valgrind problem.
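
For readers unfamiliar with the instructions named in the quoted description, here is a rough C model of a cmpb/cmpb/orc./bne style check on one 8-byte chunk. It is only a sketch under stated assumptions: the function names emu_cmpb and chunk_needs_exit are made up for illustration, and the way the two cmpb results are combined (NUL mask OR-ed with the complement of the equality mask) is an assumption about the sequence, not something taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the Power ISA cmpb instruction: byte i of the result is 0xff
   when byte i of A equals byte i of B, and 0x00 otherwise.  */
uint64_t
emu_cmpb (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      uint64_t mask = (uint64_t) 0xff << (8 * i);
      if ((a & mask) == (b & mask))
        r |= mask;
    }
  return r;
}

/* Hypothetical helper: nonzero when the two chunks differ in some byte
   or when S1 contains a NUL byte, i.e. when a word-at-a-time loop would
   have to leave its fast path.  The final OR-with-complement models
   orc; a following bne would branch on a nonzero result.  */
uint64_t
chunk_needs_exit (uint64_t s1, uint64_t s2)
{
  uint64_t eq = emu_cmpb (s1, s2);   /* 0xff where the bytes match */
  uint64_t nul = emu_cmpb (s1, 0);   /* 0xff where s1 has a NUL */
  return nul | ~eq;
}
```

In this model the branch decision depends only on per-byte equality masks, which is consistent with the remark that a tool tracking defined/undefined bytes can follow it.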


Segher


Re: [PATCH] combine: Do not combine moves from hard registers

2018-10-26 Thread Steve Ellcey
What is the status of this patch?  I see PR 87708, which is for the
regression to ira-shrinkwrap-prep-[12].c but what about all the
other regressions?  I see 27 of them on my aarch64 build and when
I looked at one of them (gcc.target/aarch64/cvtf_1.c) the code looks
worse than before, generating an extra instruction in each of the
routines.  Here is an example from one function where there is an
extra fmov that was not there before.  The test runs at -O1 but
the extra instruction appears at all optimization levels.  Should
I submit a new PR for this?

Steve Ellcey


void cvt_int32_t_to_float (int a, float b)
{ float c; c = (float) a;
  if ( (c - b) > 0.1) abort();
}


Which used to generate:

cvt_int32_t_to_float:
.LFB0:
        .cfi_startproc
        scvtf   s1, w0
        fsub    s0, s1, s0
        fcvt    d0, s0
        adrp    x0, .LC0
        ldr     d1, [x0, #:lo12:.LC0]
        fcmpe   d0, d1
        bgt     .L9
        ret
.L9:
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        mov     x29, sp
        bl      abort
        .cfi_endproc

Now generates:

cvt_int32_t_to_float:
.LFB0:
        .cfi_startproc
        fmov    s1, w0
        scvtf   s1, s1
        fsub    s1, s1, s0
        fcvt    d1, s1
        adrp    x0, .LC0
        ldr     d0, [x0, #:lo12:.LC0]
        fcmpe   d1, d0
        bgt     .L9
        ret
.L9:
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        mov     x29, sp
        bl      abort
        .cfi_endproc



Re: Fix failure with the odr-1.C test

2018-10-26 Thread Bernhard Reutner-Fischer
On 26 October 2018 09:46:52 CEST, Jan Hubicka  wrote:
>Hi,

>@@ -1138,7 +1139,7 @@ warn_types_mismatch (tree t1, tree t2, l
>   if (TREE_CODE (n1) == TYPE_DECL)
>   n1 = DECL_NAME (n1);
>   if (TREE_CODE (n2) == TYPE_DECL)
>-  n1 = DECL_NAME (n2);
>+  n2 = DECL_NAME (n2);
> /* Most of the time, the type names will match, do not be unnecesarily
>  verbose.  */


You fixed it here, so never mind, sorry..



Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Ian Lance Taylor
On Fri, Oct 26, 2018 at 5:04 AM, Than McIntosh  wrote:
>
> Thanks for reporting this.
>
> Sent https://go-review.googlesource.com/c/gofrontend/+/145017 with a
> tentative fix.

Thanks, committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 265515)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-0a58bd7d820dac8931e8da5b291f19c3c7e6bee3
+ad50884d2a4b653f7f20edc8b441fe6ad6570d55
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/log/syslog/syslog_c.c
===
--- libgo/go/log/syslog/syslog_c.c  (revision 265460)
+++ libgo/go/log/syslog/syslog_c.c  (working copy)
@@ -12,7 +12,7 @@
can't represent a C varargs function in Go.  */
 
 void syslog_c(intgo, const char*)
-  __asm__ (GOSYM_PREFIX "log_syslog.syslog_c");
+  __asm__ (GOSYM_PREFIX "log..z2fsyslog.syslog_c");
 
 void
 syslog_c (intgo priority, const char *msg)


[gomp5] Use gomp_aligned_alloc for workshare allocation

2018-10-26 Thread Jakub Jelinek
Hi!

The gomp_work_share struct is designed so that its first half is mostly
read-only (set once) and its second half is meant for writes, with the
boundary between the two halves 64-byte aligned.  As an optimization, the
following patch allocates it with the new gomp_aligned_alloc whenever that
function is not using its fallback implementation.
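
To illustrate why a fallback aligned allocator needs its own free routine, here is a minimal sketch of the over-allocate-and-stash technique that the alloc.c hunk below hints at with ((void **) ap)[-1] = p.  The names sketch_aligned_alloc and sketch_aligned_free are hypothetical, this is not libgomp's code, and it ignores size overflow for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fallback: return SIZE bytes aligned to AL (a power of
   two) using only malloc.  The pointer malloc returned is stored in
   the slot just before the aligned block so it can be freed later.  */
void *
sketch_aligned_alloc (size_t al, size_t size)
{
  assert (al != 0 && (al & (al - 1)) == 0);
  if (al < sizeof (void *))
    al = sizeof (void *);
  /* Room for the payload, the alignment slack and one pointer slot.  */
  void *p = malloc (size + al - 1 + sizeof (void *));
  if (p == NULL)
    return NULL;
  uintptr_t start = (uintptr_t) p + sizeof (void *);
  void *ap = (void *) ((start + al - 1) & ~(uintptr_t) (al - 1));
  ((void **) ap)[-1] = p;
  return ap;
}

/* Matching free: recover the original malloc pointer from the slot
   in front of the aligned block.  Plain free (ptr) would be wrong
   here, which is why the fallback cannot share free with malloc.  */
void
sketch_aligned_free (void *ptr)
{
  if (ptr)
    free (((void **) ptr)[-1]);
}
```

With a native aligned allocator such as posix_memalign, the returned pointer can go straight to free, which is the distinction a macro like GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC captures.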

Tested on x86_64-linux, committed to trunk.

2018-10-26  Jakub Jelinek  

* libgomp.h (GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC): Define unless
gomp_aligned_alloc uses fallback implementation.
* alloc.c (NEED_SPECIAL_GOMP_ALIGNED_FREE): Don't define.
(gomp_aligned_free): Use !defined(GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC)
instead of defined(NEED_SPECIAL_GOMP_ALIGNED_FREE).
* work.c (alloc_work_share): Use gomp_aligned_alloc instead of
gomp_malloc if GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined.

--- libgomp/libgomp.h.jj  2018-10-25 12:01:49.673340585 +0200
+++ libgomp/libgomp.h   2018-10-26 18:09:34.626281156 +0200
@@ -86,6 +86,15 @@ enum memmodel
 
 /* alloc.c */
 
+#if defined(HAVE_ALIGNED_ALLOC) \
+|| defined(HAVE__ALIGNED_MALLOC) \
+|| defined(HAVE_POSIX_MEMALIGN) \
+|| defined(HAVE_MEMALIGN)
+/* Defined if gomp_aligned_alloc doesn't use fallback version
+   and free can be used instead of gomp_aligned_free.  */
+#define GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC 1
+#endif
+
 extern void *gomp_malloc (size_t) __attribute__((malloc));
 extern void *gomp_malloc_cleared (size_t) __attribute__((malloc));
 extern void *gomp_realloc (void *, size_t);
--- libgomp/alloc.c.jj  2018-09-27 15:53:01.635671568 +0200
+++ libgomp/alloc.c 2018-10-26 18:10:36.745266239 +0200
@@ -87,7 +87,6 @@ gomp_aligned_alloc (size_t al, size_t si
  ((void **) ap)[-1] = p;
  ret = ap;
}
-#define NEED_SPECIAL_GOMP_ALIGNED_FREE
 }
 #endif
   if (ret == NULL)
@@ -98,10 +97,10 @@ gomp_aligned_alloc (size_t al, size_t si
 void
 gomp_aligned_free (void *ptr)
 {
-#ifdef NEED_SPECIAL_GOMP_ALIGNED_FREE
+#ifdef GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC
+  free (ptr);
+#else
   if (ptr)
 free (((void **) ptr)[-1]);
-#else
-  free (ptr);
 #endif
 }
--- libgomp/work.c.jj   2018-04-30 13:21:06.574866351 +0200
+++ libgomp/work.c  2018-10-26 18:12:02.324868021 +0200
@@ -76,7 +76,15 @@ alloc_work_share (struct gomp_team *team
 #endif
 
   team->work_share_chunk *= 2;
+  /* Allocating gomp_work_share structures aligned is just an
+ optimization, don't do it when using the fallback method.  */
+#ifdef GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC
+  ws = gomp_aligned_alloc (__alignof (struct gomp_work_share),
+  team->work_share_chunk
+  * sizeof (struct gomp_work_share));
+#else
   ws = gomp_malloc (team->work_share_chunk * sizeof (struct gomp_work_share));
+#endif
   ws->next_alloc = team->work_shares[0].next_alloc;
   team->work_shares[0].next_alloc = ws;
   team->work_share_list_alloc = &ws[1];

Jakub


Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Ian Lance Taylor
On Fri, Oct 26, 2018 at 7:44 AM, Than McIntosh  wrote:
> OK, thanks again. Another fix sent:
>
>  https://go-review.googlesource.com/c/gofrontend/+/145021

Thanks.  Sorry for missing that in code review.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 265533)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ad50884d2a4b653f7f20edc8b441fe6ad6570d55
+9785e5c4e868ba55efdb33fc51872b4821770167
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/testsuite/gotest
===
--- libgo/testsuite/gotest  (revision 265515)
+++ libgo/testsuite/gotest  (working copy)
@@ -513,9 +513,7 @@ localname() {
 #Returned:  leaf.Mumble
 #
 symtogo() {
-  local s=""
-  local result=""
-  local ndots=""
+  result=""
   for tp in $*
   do
 s=$(echo "$tp" | sed -e 's/\.\.z2f/%/g' | sed -e 's/.*%//')


Re: [PATCH v2, rs6000 1/4] Fixes for x86 intrinsics on POWER 32bit

2018-10-26 Thread Segher Boessenkool
On Fri, Oct 26, 2018 at 08:06:28AM -0500, Paul Clarke wrote:
> On 10/25/2018 05:17 PM, Segher Boessenkool wrote:
> > On Thu, Oct 25, 2018 at 02:07:33PM -0500, Paul Clarke wrote:
> >> Various clean-ups for 32bit support.
> >>
> >> Implement various corrections in the compatibility implementations of the
> >> x86 vector intrinsics found after enabling 32bit mode for the associated
> >> test cases.  (Actual enablement coming in a subsequent patch.)
> > 
> > So what happened on 32-bit before?  (After you get rid of the #ifdef of
> > course).  It isn't clear to me.
> 
> Most of the changes are to remove dependency on int128 support, because with 
> '-m32', errors were reported:
> /opt/at12.0/lib/gcc/powerpc64-linux-gnu/8.2.1/include/xmmintrin.h:992:61: error: ‘__int128’ is not supported on this target
>    return ((__m64) __builtin_unpack_vector_int128 ((__vector __int128)result, 0));
> 
> That prompted the many changes like:
> > -  vm1 = (__vector signed short)__builtin_pack_vector_int128 (__m2, __m1);
> > +  vm1 = (__vector signed short) (__vector unsigned long long) { __m2, __m1 };

Ah, okay.  And you have tested this works correctly both on BE and LE, right?
Okay for trunk then.  Thanks!


Segher


[PATCH] powerpc: Fix typos in the manual

2018-10-26 Thread Tulio Magno Quites Machado Filho
[gcc]
2018-10-26  Tulio Magno Quites Machado Filho  

* doc/extend.texi (PowerPC builtins): Fix __builtin_unpack_ibm128
return type and other typos.
---
 gcc/doc/extend.texi | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index edf87118147..75d704317e7 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -15811,7 +15811,7 @@ processors:
 @smallexample
 uint64_t __builtin_ppc_get_timebase ();
 unsigned long __builtin_ppc_mftb ();
-__ibm128 __builtin_unpack_ibm128 (__ibm128, int);
+double __builtin_unpack_ibm128 (__ibm128, int);
 __ibm128 __builtin_pack_ibm128 (double, double);
 double __builtin_mffs (void);
 void __builtin_mtfsb0 (const int);
@@ -15927,13 +15927,13 @@ The @code{__builtin_unpack_longdouble} function takes a
 the constant is 0, the first @code{double} within the
 @code{long double} is returned, otherwise the second @code{double}
 is returned.  The @code{__builtin_unpack_longdouble} function is only
-availble if @code{long double} uses the IBM extended double
+available if @code{long double} uses the IBM extended long double
 representation.
 
 The @code{__builtin_pack_longdouble} function takes two @code{double}
 arguments and returns a @code{long double} value that combines the two
 arguments.  The @code{__builtin_pack_longdouble} function is only
-availble if @code{long double} uses the IBM extended double
+available if @code{long double} uses the IBM extended long double
 representation.
 
 The @code{__builtin_unpack_ibm128} function takes a @code{__ibm128}
-- 
2.14.5



Re: [PATCH] combine: Do not combine moves from hard registers

2018-10-26 Thread Segher Boessenkool
On Fri, Oct 26, 2018 at 04:39:06PM +, Steve Ellcey wrote:
> What is the status of this patch?  I see PR 87708, which is for the
> regression to ira-shrinkwrap-prep-[12].c but what about all the
> other regressions?  I see 27 of them on my aarch64 build and when
> I looked at one of them (gcc.target/aarch64/cvtf_1.c) the code looks
> worse than before, generating an extra instruction in each of the
> routines.  Here is an example from one function where there is an
> extra fmov that was not there before.  The test runs at -O1 but
> the extra instruction appears at all optimization levels.  Should
> I submit a new PR for this?

Yes, please open a PR.  It's hard to keep track of things without one.

Status: I have figured out what I am doing wrong, and I hope to have a patch
soon.  This will not fix all the register allocation problems of course.
It's a pity no PRs are opened for the code improvements either though ;-)


Segher


Re: [PATCH v2, rs6000 3/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Segher Boessenkool
On Thu, Oct 25, 2018 at 02:07:54PM -0500, Paul Clarke wrote:
> This is a follow-on to earlier commits for adding compatibility
> implementations of x86 intrinsics for PPC64LE.  This is the first of
> two patches for SSSE3.  This patch adds the 32 x86 intrinsics from
> <tmmintrin.h> ("SSSE3").  (Patch 2/2 adds tests for these intrinsics,
> and briefly describes the tests performed.)
> 
> Bootstrapped and tested on Linux POWER8 LE, POWER8 BE (64 & 32), and POWER7.
> 
> OK for trunk?

I have acked this before; it is still okay.  One thing:

> +   In the specific case of X86 SSE2 (__m128i, __m128d) intrinsics,
> +   the PowerPC VMX/VSX ISA is a good match for vector double SIMD
> +   operations.  However scalar double operations in vector (XMM)
> +   registers require the POWER8 VSX ISA (2.07) level. Also there are
> +   important differences for data format and placement of double
> +   scalars in the vector register.
> +
> +   For PowerISA Scalar double is in FPRs (left most 64-bits of the
> +   low 32 VSRs), while X86_64 SSE2 uses the right most 64-bits of
> +   the XMM. These differences require extra steps on POWER to match
> +   the SSE2 scalar double semantics.
> +
> +   Most SSE2 scalar double intrinsic operations can be performed more
> +   efficiently as C language double scalar operations or optimized to
> +   use vector SIMD operations.  We recommend this for new applications.
> +
> +   Another difference is the format and details of the X86_64 MXSCR vs
> +   the PowerISA FPSCR / VSCR registers. We recommend applications
> +   replace direct access to the MXSCR with the more portable 
> +   Posix APIs.  */

I don't know how relevant and/or correct this comment is to this file
(it looks like you copied it from previous headers?)


Segher


Re: [PATCH v2, rs6000 4/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Segher Boessenkool
On Thu, Oct 25, 2018 at 02:08:03PM -0500, Paul Clarke wrote:
> This is part 2/2 for contributing PPC64LE support for X86 SSE3
> intrinsics. This patch includes testsuite/gcc.target tests for the
> intrinsics defined in pmmintrin.h, copied from gcc.target/i386.
> 
> Bootstrapped and tested on Linux POWER8 LE, POWER8 BE (64 & 32), and POWER7.
> 
> OK for trunk?

Yes please.  Thanks!


Segher


Re: [C++ Patch] PR 84644 ("internal compiler error: in warn_misplaced_attr_for_class_type, at cp/decl.c:4718")

2018-10-26 Thread Paolo Carlini

Hi,

On 26/10/18 17:18, Jason Merrill wrote:
> On Fri, Oct 26, 2018 at 4:52 AM Paolo Carlini wrote:
>> On 24/10/18 22:41, Jason Merrill wrote:
>>> On 10/15/18 12:45 PM, Paolo Carlini wrote:
>>>>  && ((TREE_CODE (declspecs->type) != TYPENAME_TYPE
>>>> +   && TREE_CODE (declspecs->type) != DECLTYPE_TYPE
>>>>   && MAYBE_CLASS_TYPE_P (declspecs->type))
>>>
>>> I would think that the MAYBE_CLASS_TYPE_P here should be CLASS_TYPE_P,
>>> and then we can remove the TYPENAME_TYPE check.  Or do we want to
>>> allow template type parameters for some reason?
>>
>> Indeed, it would be nice to just use OVERLOAD_TYPE_P. However it seems
>> we at least want to let through TEMPLATE_TYPE_PARMs representing 'auto'
>> - otherwise Dodji's check a few lines below which fixed c++/51473
>> doesn't work anymore - and also BOUND_TEMPLATE_TEMPLATE_PARM, otherwise
>> we regress on template/spec32.C and template/ttp22.C because we don't
>> diagnose the shadowing anymore. Thus, I would say either we keep on
>> using MAYBE_CLASS_TYPE_P or we pick what we need, possibly we add a comment?
>
> Aha.  I guess the answer is not to restrict that test any more, but
> instead to fix the code further down so it gives a proper diagnostic
> rather than call warn_misplaced_attr_for_class_type.


I see. Thus something like the below? It passes testing on x86_64-linux.

Thanks! Paolo.

/

Index: cp/decl.c
===
--- cp/decl.c   (revision 265510)
+++ cp/decl.c   (working copy)
@@ -4798,9 +4798,10 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
 declared_type = declspecs->type;
   else if (declspecs->type == error_mark_node)
 error_p = true;
-  if (declared_type == NULL_TREE && ! saw_friend && !error_p)
+  if ((!declared_type || TREE_CODE (declared_type) == DECLTYPE_TYPE)
+  && ! saw_friend && !error_p)
 permerror (input_location, "declaration does not declare anything");
-  else if (declared_type != NULL_TREE && type_uses_auto (declared_type))
+  else if (declared_type && type_uses_auto (declared_type))
 {
   error_at (declspecs->locations[ds_type_spec],
"% can only be specified for variables "
@@ -4884,7 +4885,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
  "% cannot be used for type declarations");
 }
 
-  if (declspecs->attributes && warn_attributes && declared_type)
+  if (declspecs->attributes && warn_attributes && declared_type
+  && TREE_CODE (declared_type) != DECLTYPE_TYPE)
 {
   location_t loc;
   if (!CLASS_TYPE_P (declared_type)
Index: testsuite/g++.dg/cpp0x/decltype-33838.C
===
--- testsuite/g++.dg/cpp0x/decltype-33838.C (revision 265510)
+++ testsuite/g++.dg/cpp0x/decltype-33838.C (working copy)
@@ -2,5 +2,5 @@
 // PR c++/33838
 template struct A
 {
-  __decltype (T* foo()); // { dg-error "expected|no arguments|accept" }
+  __decltype (T* foo()); // { dg-error "expected|no arguments|declaration" }
 };
Index: testsuite/g++.dg/cpp0x/decltype68.C
===
--- testsuite/g++.dg/cpp0x/decltype68.C (nonexistent)
+++ testsuite/g++.dg/cpp0x/decltype68.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/84644
+// { dg-do compile { target c++11 } }
+
+template
+struct b {
+  decltype(a) __attribute__((break));  // { dg-error "declaration does not declare anything" }
+};


Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Bernhard Reutner-Fischer
On 26 October 2018 18:13:32 CEST, Rainer Orth wrote:
>Hi Than,
>
>> OK, thanks again. Another fix sent:
>>
>>  https://go-review.googlesource.com/c/gofrontend/+/145021
>
>great, thanks again.  While the two previous patches were enough to get
>decent Solaris 11 results, on Solaris 10 all libgo tests still FAIL
>like
>this:
>
>_testmain.go:9:25: error: reference to undefined identifier
>'tar.TestReaderntar'
>9 |  {"TestFileWriter",
>tar.TestReaderntar.TestPartialReadntar.TestUninitializedReadntar.TestReadTruncationntar.TestReadHeaderOnlyntar.TestMergePAXntar.TestParsePAXntar.TestReadOldGNUSparseMapntar.TestReadGNUSparsePAXHeadersntar.TestFileReaderntar.TestFitsInBase256ntar.TestParseNumericntar.TestFormatNumericntar.TestFitsInOctalntar.TestParsePAXTimentar.TestFormatPAXTimentar.TestParsePAXRecordntar.TestFormatPAXRecordntar.TestSparseEntriesntar.TestFileInfoHeaderntar.TestFileInfoHeaderDirntar.TestFileInfoHeaderSymlinkntar.TestRoundTripntar.TestHeaderRoundTripntar.TestHeaderAllowedFormatsntar.TestWriterntar.TestPaxntar.TestPaxSymlinkntar.TestPaxNonAsciintar.TestPaxXattrsntar.TestPaxHeadersSortedntar.TestUSTARLongNamentar.TestValidTypeflagWithPAXHeaderntar.TestWriterErrorsntar.TestSplitUSTARPathntar.TestIssue12594ntar.TestFileWriter},
>  | ^
>FAIL: archive/tar
>
>This is due to gotest (symtogo):
>
>  echo "$result" | sed -e 's/ /\n/g'
>
>which even Solaris 11.5 Beta /bin/sed treats like this:
>
>$ echo 'a b' | /bin/sed -e 's/ /\n/g'
>anb

Just curious if either 
   | sed -e "s/ /\n/g"
or
  |  tr " " "\n"

would work.

thanks,


Re: [PATCH v2, rs6000 3/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Paul Clarke
On 10/26/2018 01:00 PM, Segher Boessenkool wrote:
> On Thu, Oct 25, 2018 at 02:07:54PM -0500, Paul Clarke wrote:
>> This is a follow-on to earlier commits for adding compatibility
>> implementations of x86 intrinsics for PPC64LE.  This is the first of
>> two patches for SSSE3.  This patch adds the 32 x86 intrinsics from
>> <tmmintrin.h> ("SSSE3").  (Patch 2/2 adds tests for these intrinsics,
>> and briefly describes the tests performed.)

>> OK for trunk?
> 
> I have acked this before; it is still okay.  One thing:

OK, there were substantive changes, so I wanted to make sure.

>> +   In the specific case of X86 SSE2 (__m128i, __m128d) intrinsics,
>> +   the PowerPC VMX/VSX ISA is a good match for vector double SIMD
>> +   operations.  However scalar double operations in vector (XMM)
>> +   registers require the POWER8 VSX ISA (2.07) level. Also there are
>> +   important differences for data format and placement of double
>> +   scalars in the vector register.
>> +
>> +   For PowerISA Scalar double is in FPRs (left most 64-bits of the
>> +   low 32 VSRs), while X86_64 SSE2 uses the right most 64-bits of
>> +   the XMM. These differences require extra steps on POWER to match
>> +   the SSE2 scalar double semantics.
>> +
>> +   Most SSE2 scalar double intrinsic operations can be performed more
>> +   efficiently as C language double scalar operations or optimized to
>> +   use vector SIMD operations.  We recommend this for new applications.
>> +
>> +   Another difference is the format and details of the X86_64 MXSCR vs
>> +   the PowerISA FPSCR / VSCR registers. We recommend applications
>> +   replace direct access to the MXSCR with the more portable 
>> +   Posix APIs.  */
> 
> I don't know how relevant and/or correct this comment is to this file
> (it looks like you copied it from previous headers?)

Yep, copied without thinking.  I'll just remove the whole block you referenced.

Thanks, again!

PC



Re: [gofrontend-dev] Re: Go patch committed: Improve name mangling for package paths

2018-10-26 Thread Ian Lance Taylor
On Fri, Oct 26, 2018 at 9:13 AM, Rainer Orth wrote:
> Hi Than,
>
>> OK, thanks again. Another fix sent:
>>
>>  https://go-review.googlesource.com/c/gofrontend/+/145021
>
> great, thanks again.  While the two previous patches were enough to get
> decent Solaris 11 results, on Solaris 10 all libgo tests still FAIL like
> this:
>
> _testmain.go:9:25: error: reference to undefined identifier 
> 'tar.TestReaderntar'
> 9 |  {"TestFileWriter", 
> tar.TestReaderntar.TestPartialReadntar.TestUninitializedReadntar.TestReadTruncationntar.TestReadHeaderOnlyntar.TestMergePAXntar.TestParsePAXntar.TestReadOldGNUSparseMapntar.TestReadGNUSparsePAXHeadersntar.TestFileReaderntar.TestFitsInBase256ntar.TestParseNumericntar.TestFormatNumericntar.TestFitsInOctalntar.TestParsePAXTimentar.TestFormatPAXTimentar.TestParsePAXRecordntar.TestFormatPAXRecordntar.TestSparseEntriesntar.TestFileInfoHeaderntar.TestFileInfoHeaderDirntar.TestFileInfoHeaderSymlinkntar.TestRoundTripntar.TestHeaderRoundTripntar.TestHeaderAllowedFormatsntar.TestWriterntar.TestPaxntar.TestPaxSymlinkntar.TestPaxNonAsciintar.TestPaxXattrsntar.TestPaxHeadersSortedntar.TestUSTARLongNamentar.TestValidTypeflagWithPAXHeaderntar.TestWriterErrorsntar.TestSplitUSTARPathntar.TestIssue12594ntar.TestFileWriter},
>   | ^
> FAIL: archive/tar
>
> This is due to gotest (symtogo):
>
>   echo "$result" | sed -e 's/ /\n/g'
>
> which even Solaris 11.5 Beta /bin/sed treats like this:
>
> $ echo 'a b' | /bin/sed -e 's/ /\n/g'
> anb
>
> I still got decent results because GNU sed is earlier in my PATH on
> Solaris 11, but Solaris 10 lacks it.  However, this seems to work (both
> with Solaris sed and GNU sed):
>
> echo "$result" | /bin/sed -e 's/ /\
> /g'
>
> It allows the Solaris 10 libgo testing to work, but I can't tell if that's
> the most portable way to do this.

Thanks for the report.  Fixed by simplifying the code, like so.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 265534)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-9785e5c4e868ba55efdb33fc51872b4821770167
+8902fb43c569e4d3ec5bd143bfa8cb6bf2836780
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/testsuite/gotest
===
--- libgo/testsuite/gotest  (revision 265534)
+++ libgo/testsuite/gotest  (working copy)
@@ -521,13 +521,8 @@ symtogo() {
 if ! expr "$s" : '^[^.]*\.[^.]*$' >/dev/null 2>&1; then
   continue
 fi
-if [ -z "${result}" ]; then
-  result="${s}"
-else
-  result="${result} ${s}"
-fi
+echo "$s"
   done
-  echo "$result" | sed -e 's/ /\n/g'
 }
 
 {


[PATCH, committed] Backport PR87473 fix to GCC 8

2018-10-26 Thread Bill Schmidt
Committed the backport as follows:

[gcc]

2018-10-26  Bill Schmidt  

Backport from mainline
2018-10-19  Bill Schmidt  

PR tree-optimization/87473
* gimple-ssa-strength-reduction.c (record_phi_increments_1): For
phi arguments identical to the base expression of the phi
candidate, record a phi-adjust increment of zero minus the index
expression of the hidden basis.
(phi_incr_cost_1): For phi arguments identical to the base
expression of the phi candidate, the difference to compare against
the increment is zero minus the index expression of the hidden
basis, and there is no potential savings from replacing the (phi)
statement.
(ncd_with_phi): For phi arguments identical to the base expression
of the phi candidate, the difference to compare against the
increment is zero minus the index expression of the hidden basis.
(all_phi_incrs_profitable_1): For phi arguments identical to the
base expression of the phi candidate, the increment to be checked
for profitability is zero minus the index expression of the hidden
basis.

[gcc/testsuite]

2018-10-26  Bill Schmidt  

Backport from mainline
2018-10-19  Bill Schmidt  

PR tree-optimization/87473
* gcc.c-torture/compile/pr87473.c: New file.


Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 265534)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -2779,17 +2779,23 @@ record_phi_increments_1 (slsr_cand_t basis, gimple
   for (i = 0; i < gimple_phi_num_args (phi); i++)
 {
   tree arg = gimple_phi_arg_def (phi, i);
+  gimple *arg_def = SSA_NAME_DEF_STMT (arg);
 
-  if (!operand_equal_p (arg, phi_cand->base_expr, 0))
+  if (gimple_code (arg_def) == GIMPLE_PHI)
+   record_phi_increments_1 (basis, arg_def);
+  else
{
- gimple *arg_def = SSA_NAME_DEF_STMT (arg);
+ widest_int diff;
 
- if (gimple_code (arg_def) == GIMPLE_PHI)
-   record_phi_increments_1 (basis, arg_def);
+ if (operand_equal_p (arg, phi_cand->base_expr, 0))
+   {
+ diff = -basis->index;
+ record_increment (phi_cand, diff, PHI_ADJUST);
+   }
  else
{
  slsr_cand_t arg_cand = base_cand_from_table (arg);
- widest_int diff = arg_cand->index - basis->index;
+ diff = arg_cand->index - basis->index;
  record_increment (arg_cand, diff, PHI_ADJUST);
}
}
@@ -2864,29 +2870,43 @@ phi_incr_cost_1 (slsr_cand_t c, const widest_int &
   for (i = 0; i < gimple_phi_num_args (phi); i++)
 {
   tree arg = gimple_phi_arg_def (phi, i);
+  gimple *arg_def = SSA_NAME_DEF_STMT (arg);
 
-  if (!operand_equal_p (arg, phi_cand->base_expr, 0))
+  if (gimple_code (arg_def) == GIMPLE_PHI)
{
- gimple *arg_def = SSA_NAME_DEF_STMT (arg);
-  
- if (gimple_code (arg_def) == GIMPLE_PHI)
+ int feeding_savings = 0;
+ tree feeding_var = gimple_phi_result (arg_def);
+ cost += phi_incr_cost_1 (c, incr, arg_def, &feeding_savings);
+ if (uses_consumed_by_stmt (feeding_var, phi))
+   *savings += feeding_savings;
+   }
+  else
+   {
+ widest_int diff;
+ slsr_cand_t arg_cand;
+
+ /* When the PHI argument is just a pass-through to the base
+expression of the hidden basis, the difference is zero minus
+the index of the basis.  There is no potential savings by
+eliminating a statement in this case.  */
+ if (operand_equal_p (arg, phi_cand->base_expr, 0))
{
- int feeding_savings = 0;
- tree feeding_var = gimple_phi_result (arg_def);
- cost += phi_incr_cost_1 (c, incr, arg_def, &feeding_savings);
- if (uses_consumed_by_stmt (feeding_var, phi))
-   *savings += feeding_savings;
+ arg_cand = (slsr_cand_t)NULL;
+ diff = -basis->index;
}
  else
{
- slsr_cand_t arg_cand = base_cand_from_table (arg);
- widest_int diff = arg_cand->index - basis->index;
-
- if (incr == diff)
+ arg_cand = base_cand_from_table (arg);
+ diff = arg_cand->index - basis->index;
+   }
+ 
+ if (incr == diff)
+   {
+ tree basis_lhs = gimple_assign_lhs (basis->cand_stmt);
+ cost += add_cost (true, TYPE_MODE (TREE_TYPE (basis_lhs)));
+ if (arg_cand)
{
- tree basis_lhs = gimple_assign_lhs (basis->cand_stmt);
  tree lhs = gimple_assign_lhs (arg_cand->cand_stmt);
- cost += add_cost (true, TYPE_MODE (TREE_TYPE (basis_lhs

Re: [PATCH v2, rs6000 4/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Bill Schmidt
On 10/25/18 2:08 PM, Paul Clarke wrote:
> This is part 2/2 for contributing PPC64LE support for X86 SSE3
> intrinsics. This patch includes testsuite/gcc.target tests for the
> intrinsics defined in pmmintrin.h, copied from gcc.target/i386.
>
> Bootstrapped and tested on Linux POWER8 LE, POWER8 BE (64 & 32), and POWER7.
>
> OK for trunk?
>
> [gcc/testsuite]
>
> 2018-10-25  Paul A. Clarke  
>
>   * gcc.target/powerpc/sse3-check.h: New file.
>   * gcc.target/powerpc/ssse3-vals.h: New file.
>   * gcc.target/powerpc/ssse3-pabsb.c: New file.
>   * gcc.target/powerpc/ssse3-pabsd.c: New file.
>   * gcc.target/powerpc/ssse3-pabsw.c: New file.
>   * gcc.target/powerpc/ssse3-palignr.c: New file.
>   * gcc.target/powerpc/ssse3-phaddd.c: New file.
>   * gcc.target/powerpc/ssse3-phaddsw.c: New file.
>   * gcc.target/powerpc/ssse3-phaddw.c: New file.
>   * gcc.target/powerpc/ssse3-phsubd.c: New file.
>   * gcc.target/powerpc/ssse3-phsubsw.c: New file.
>   * gcc.target/powerpc/ssse3-phsubw.c: New file.
>   * gcc.target/powerpc/ssse3-pmaddubsw.c: New file.
>   * gcc.target/powerpc/ssse3-pmulhrsw.c: New file.
>   * gcc.target/powerpc/ssse3-pshufb.c: New file.
>   * gcc.target/powerpc/ssse3-psignb.c: New file.
>   * gcc.target/powerpc/ssse3-psignd.c: New file.
>   * gcc.target/powerpc/ssse3-psignw.c: New file.
>
> Index: gcc/testsuite/gcc.target/powerpc/ssse3-check.h
> ===
> diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h 
> b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h
> new file mode 10644
> --- /dev/null (revision 0)
> +++ b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h(working copy)
> @@ -0,0 +1,43 @@
> +#include 
> +#include 
> +
> +#include "m128-check.h"
> +
> +/* define DEBUG replace abort with printf on error.  */

One nit -- this comment appears to be incorrect: in the only place DEBUG is
used, you don't have abort() anywhere.

(I have a patch under review that questions why we would replace abort() rather
than supplement it with printf, anyway...)

Thanks,
Bill

> +//#define DEBUG 1
> +
> +#define TEST ssse3_test
> +
> +static void ssse3_test (void);
> +
> +static void
> +__attribute__ ((noinline))
> +do_test (void)
> +{
> +  ssse3_test ();
> +}
> +
> +int
> +main ()
> +{
> +#ifdef __BUILTIN_CPU_SUPPORTS__
> +  /* Most SSE intrinsic operations can be implemented via VMX
> + instructions, but some operations may be faster / simpler
> + using the POWER8 VSX instructions.  This is especially true
> + when we are transferring / converting to / from __m64 types.
> + The direct register transfer instructions from POWER8 are
> + especially important.  So we test for arch_2_07.  */
> +  if (__builtin_cpu_supports ("arch_2_07"))
> +{
> +  do_test ();
> +#ifdef DEBUG
> +  printf ("PASSED\n");
> +#endif
> +}
> +#ifdef DEBUG
> +  else
> +printf ("SKIPPED\n");
> +#endif
> +#endif /* __BUILTIN_CPU_SUPPORTS__ */
> +  return 0;
> +}
> Index: gcc/testsuite/gcc.target/powerpc/ssse3-pabsb.c
> ===
> diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-pabsb.c 
> b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-pabsb.c
> new file mode 10644
> --- /dev/null (revision 0)
> +++ b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-pabsb.c(working copy)
> @@ -0,0 +1,80 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
> +/* { dg-require-effective-target p8vector_hw } */
> +
> +#ifndef CHECK_H
> +#define CHECK_H "ssse3-check.h"
> +#endif
> +
> +#ifndef TEST
> +#define TEST ssse3_test
> +#endif
> +
> +#include CHECK_H
> +
> +#include "ssse3-vals.h"
> +#include 
> +
> +#ifndef __AVX__
> +/* Test the 64-bit form */
> +static void
> +ssse3_test_pabsb (int *i1, int *r)
> +{
> +  __m64 t1 = *(__m64 *) i1;
> +  *(__m64 *) r = _mm_abs_pi8 (t1);
> +  _mm_empty ();
> +}
> +#endif
> +
> +/* Test the 128-bit form */
> +static void
> +ssse3_test_pabsb128 (int *i1, int *r)
> +{
> +  /* Assumes incoming pointers are 16-byte aligned */
> +  __m128i t1 = *(__m128i *) i1;
> +  *(__m128i *) r = _mm_abs_epi8 (t1);
> +}
> +
> +/* Routine to manually compute the results */
> +static void
> +compute_correct_result (int *i1, int *r)
> +{
> +  char *b1 = (char *) i1;
> +  char *bout = (char *) r;
> +  int i;
> +
> +  for (i = 0; i < 16; i++)
> +if (b1[i] < 0)
> +  bout[i] = -b1[i];
> +else
> +  bout[i] = b1[i];
> +}
> +
> +static void
> +TEST (void)
> +{
> +  int i;
> +  int r [4] __attribute__ ((aligned(16)));
> +  int ck [4];
> +  int fail = 0;
> +
> +  for (i = 0; i < 256; i += 4)
> +{
> +  /* Manually compute the result */
> +  compute_correct_result(&vals[i + 0], ck);
> +
> +#ifndef __AVX__
> +  /* Run the 64-bit tests */
> +  ssse3_test_pabsb (&vals[i + 0], &r[0]);
> +  ssse3_test_pabsb (&vals[i + 2], &r[2]);
> +

Re: [PATCH, rs6000] Intrinsic compatibility tests should not pass just because DEBUG is set

2018-10-26 Thread Segher Boessenkool
Hi Bill,

On Thu, Oct 25, 2018 at 02:58:59PM -0500, Bill Schmidt wrote:
> A number of the test cases for the intrinsic compatibility headers are set up
> to dump more information when a test case fails and the DEBUG macro has been
> set.  Unfortunately, many of them also then pass the test case since they no
> longer call abort().  This patch fixes that oversight (or intentional oddity).
> 
> A lot of these tests are also miserably formatted, so I took the liberty of
> cleaning that up while I was in here.
> 
> Tested on powerpc64le-linux-gnu with no failures.  Is this ok for trunk, and
> potential backport to 8?

Sure, that's fine.  Thanks!


Segher
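
For illustration only, the pattern under discussion (print extra detail when
DEBUG is defined, but still fail the test) might look roughly like the sketch
below.  The names count_mismatches and check_or_abort are hypothetical and do
not come from the patch; the actual tests use their own helpers.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper (not from the patch): compare computed results with
// the expected values and return the number of mismatches.  With DEBUG
// defined, each mismatch is also printed.
int count_mismatches(const int *got, const int *want, int n) {
    int fail = 0;
    for (int i = 0; i < n; i++) {
        if (got[i] != want[i]) {
#ifdef DEBUG
            std::printf("mismatch at %d: got %d, want %d\n", i, got[i], want[i]);
#endif
            fail++;
        }
    }
    return fail;
}

// Hypothetical helper: abort on any failure, whether or not DEBUG is set,
// so that enabling DEBUG output can never turn a failing test into a pass.
void check_or_abort(int fail) {
    if (fail != 0)
        std::abort();
}
```

The point of the split is that DEBUG only controls the printf, never the
abort.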


Re: GCC 6 branch is now closed

2018-10-26 Thread Eric Gallager
On 10/26/18, Jakub Jelinek  wrote:
> After the GCC 6.5 release the GCC 6 branch is now closed.  Please
> refrain from committing to it from now on.
>
> Thanks
>   Jakub
>

So, since it's the last branch with java in it, can we go and finally
remove the last vestiges of java from the gcc website? Specifically:

- On https://gcc.gnu.org/lists.html move "java" and "java-patches"
from "Open lists" to "Historical lists" (and likewise with "java-prs"
and "java-cvs" in "Read only lists")
- In Bugzilla, prevent any new bugs with java-related components from
being opened for the GCC product
- Also in Bugzilla, does it really make sense to have classpath
continue to share a bugzilla with gcc now that the last gcc branch
with java has been closed? Seems like, if possible, it'd be better to
separate classpath out into its own bug database. That way the bug
creation process could be made one step shorter, since the "First, you
must pick a product on which to enter a bug" page would then no longer
be necessary, since classpath would no longer be a choice, so there'd
only be one product left (gcc).

Eric


[PATCH, committed] Backport PR87473 fix to GCC 7

2018-10-26 Thread Bill Schmidt
Committed the backport as follows (some slight changes between versions were
required):

[gcc]

2018-10-26  Bill Schmidt  

Backport from mainline
2018-10-19  Bill Schmidt  

PR tree-optimization/87473
* gimple-ssa-strength-reduction.c (record_phi_increments): For
phi arguments identical to the base expression of the phi
candidate, record a phi-adjust increment of zero minus the index
expression of the hidden basis.
(phi_incr_cost): For phi arguments identical to the base
expression of the phi candidate, the difference to compare against
the increment is zero minus the index expression of the hidden
basis, and there is no potential savings from replacing the (phi)
statement.
(ncd_with_phi): For phi arguments identical to the base expression
of the phi candidate, the difference to compare against the
increment is zero minus the index expression of the hidden basis.
(all_phi_incrs_profitable): For phi arguments identical to the
base expression of the phi candidate, the increment to be checked
for profitability is zero minus the index expression of the hidden
basis.

[gcc/testsuite]

2018-10-26  Bill Schmidt  

Backport from mainline
2018-10-19  Bill Schmidt  

PR tree-optimization/87473
* gcc.c-torture/compile/pr87473.c: New file.


Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 265546)
+++ gcc/ChangeLog   (revision 265547)
@@ -1,3 +1,26 @@
+2018-10-26  Bill Schmidt  
+
+   Backport from mainline
+   2018-10-19  Bill Schmidt  
+
+   PR tree-optimization/87473
+   * gimple-ssa-strength-reduction.c (record_phi_increments): For
+   phi arguments identical to the base expression of the phi
+   candidate, record a phi-adjust increment of zero minus the index
+   expression of the hidden basis.
+   (phi_incr_cost): For phi arguments identical to the base
+   expression of the phi candidate, the difference to compare against
+   the increment is zero minus the index expression of the hidden
+   basis, and there is no potential savings from replacing the (phi)
+   statement.
+   (ncd_with_phi): For phi arguments identical to the base expression
+   of the phi candidate, the difference to compare against the
+   increment is zero minus the index expression of the hidden basis.
+   (all_phi_incrs_profitable): For phi arguments identical to the
+   base expression of the phi candidate, the increment to be checked
+   for profitability is zero minus the index expression of the hidden
+   basis.
+
 2018-10-19  Andreas Krebbel  
 
Backport from mainline
Index: gcc/testsuite/gcc.c-torture/compile/pr87473.c
===
--- gcc/testsuite/gcc.c-torture/compile/pr87473.c   (nonexistent)
+++ gcc/testsuite/gcc.c-torture/compile/pr87473.c   (revision 265547)
@@ -0,0 +1,19 @@
+/* PR87473: SLSR ICE on hidden basis with |increment| > 1.  */
+/* { dg-additional-options "-fno-tree-ch" } */
+
+void
+t6 (int qz, int wh)
+{
+  int jl = wh;
+
+  while (1.0 / 0 < 1)
+{
+  qz = wh * (wh + 2);
+
+  while (wh < 1)
+jl = 0;
+}
+
+  while (qz < 1)
+qz = jl * wh;
+}
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 265546)
+++ gcc/testsuite/ChangeLog (revision 265547)
@@ -1,3 +1,11 @@
+2018-10-26  Bill Schmidt  
+
+   Backport from mainline
+   2018-10-19  Bill Schmidt  
+
+   PR tree-optimization/87473
+   * gcc.c-torture/compile/pr87473.c: New file.
+
 2018-10-23  Tom de Vries  
 
backport from trunk:
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 265546)
+++ gcc/gimple-ssa-strength-reduction.c (revision 265547)
@@ -2666,17 +2666,23 @@ record_phi_increments (slsr_cand_t basis, gimple *
   for (i = 0; i < gimple_phi_num_args (phi); i++)
 {
   tree arg = gimple_phi_arg_def (phi, i);
+  gimple *arg_def = SSA_NAME_DEF_STMT (arg);
 
-  if (!operand_equal_p (arg, phi_cand->base_expr, 0))
+  if (gimple_code (arg_def) == GIMPLE_PHI)
+   record_phi_increments (basis, arg_def);
+  else
{
- gimple *arg_def = SSA_NAME_DEF_STMT (arg);
+ widest_int diff;
 
- if (gimple_code (arg_def) == GIMPLE_PHI)
-   record_phi_increments (basis, arg_def);
+ if (operand_equal_p (arg, phi_cand->base_expr, 0))
+   {
+ diff = -basis->index;
+ record_increment (phi_cand, diff, PHI_ADJUST);
+   }
  else
{
  slsr_cand_t arg_cand = base_cand_from_table (arg);
-

Re: [PATCH v2, rs6000 4/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Paul Clarke
On 10/26/2018 02:02 PM, Bill Schmidt wrote:
> On 10/25/18 2:08 PM, Paul Clarke wrote:

>> diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h 
>> b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h
>> new file mode 10644
>> --- /dev/null(revision 0)
>> +++ b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h   (working copy)
>> @@ -0,0 +1,43 @@
>> +#include 
>> +#include 
>> +
>> +#include "m128-check.h"
>> +
>> +/* define DEBUG replace abort with printf on error.  */
> 
> One nit -- this comment appears to be incorrect: in the only place DEBUG is
> used, you don't have abort() anywhere.
> 
> (I have a patch under review that questions why we would replace abort()
> rather than supplement it with printf, anyway...)

You are correct.  That comment was copied without consideration, is incorrect,
and should just be deleted.  It looks like the abort() issue you are fixing in
your patch is not present in the new(-ish) test cases in this patch.

I'll commit a new (trivial/obvious) change.

PC



Re: [PATCH] powerpc: Fix typos in the manual

2018-10-26 Thread Segher Boessenkool
On Fri, Oct 26, 2018 at 02:40:46PM -0300, Tulio Magno Quites Machado Filho wrote:
> [gcc]
> 2018-10-26  Tulio Magno Quites Machado Filho  
> 
>   * doc/extend.texi (PowerPC builtins): Fix __builtin_unpack_ibm128
>   return type and other typos.

Thanks!  I've committed this to trunk for you.  One thing:

>  is returned.  The @code{__builtin_unpack_longdouble} function is only
> -availble if @code{long double} uses the IBM extended double
> +available if @code{long double} uses the IBM extended long double

It's really called "IBM extended double", so I didn't commit that part.


Segher


Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-10-26 Thread Thomas Preudhomme
Hi,

Please find an updated patch to fix PR85434: spilling of the stack protector
guard's address on ARM. Quite a few changes have been made to the ARM
part since the last round of review, so I think it makes more sense to
review it anew. I ran bootstrap + regression testsuite + glibc build +
glibc regression testsuite for Arm and Thumb-2, and bootstrap +
regression testsuite for Thumb-1. GCC's regression testsuite was run
in 3 configurations in all those cases:

- default configuration (no RUNTESTFLAGS)
- with -fstack-protector-all
- with -fPIC -fstack-protector-all (to exercise both code paths in the stack
protector's split code)

None of these shows any regression beyond some new scan failures with
-fstack-protector-all or -fPIC, due to unexpected code sequences for the
testcases concerned, and some guality swings due to less optimization
with the new stack protector on.

Patch description and ChangeLog below.

In case of high register pressure in PIC mode, the address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect_set and stack_protect_test insn
patterns, but defining them would not help because the address is expanded
regularly and the patterns only deal with the copy and test of the
guard with the canary.

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. AArch64 is exempt too
because its PIC access insn patterns are movs of an UNSPEC, which prevents
the second access in the epilogue from being CSEd in the cse_local pass
with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
to hide the individual instructions being generated from the compiler and
to split the pattern into generic load, compare and branch instructions
after register allocation, therefore avoiding any spilling. This is
implemented here for the ARM targets. For targets not implementing these new
standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.

To be able to split PIC access after register allocation, the functions
had to be augmented to force a new PIC register load and to control
which register it loads into. This is because sharing the PIC register
between prologue and epilogue could lead to spilling due to CSE again
which an attacker could use to control what the canary gets compared
against.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-10-26  Thomas Preud'homme  

* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.  Insert in the stream of insns if
possible.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.  Use pic_reg if non null instead of
cached one.
(arm_load_pic_register): Add pic_reg parameter and use it if non null.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to require_pic_register prototype change.
(arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
(thumb1_expand_prologue): Likewise.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
(arm_load_pic_register): Likewise.
* config/arm/predicates.md (guard_addr_operand): New predicate.
(guard_operand): New predicate.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
prototype change.
(stack_protect_combined_set): New expander.
(stack_protect_combined_set_insn): New insn_and_split pattern.
(stack_protect_set_insn): New insn pattern.
(stack_protect_combined_test): New expander.
(stack_protect_combined_test_insn): New insn_and_split pattern.
(arm_stack_protect_test_insn): New insn pattern.
* config/arm/thumb1.md (thumb1_stack_protect_test_insn): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

* gcc.target/arm/pr85434.c: New test.

Is this o

RE: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Joe Buck
The reason move constructors were introduced was to speed up code in cases
where an object is copied and the copy is no longer needed.  It is
unfortunate that there may now be code out there that relies on accidental
properties of library implementations.  It would be best if the
implementation is not constrained.  Unless the standard mandates that, after
a string is moved, the string is empty, the user should only be able to
assume that it is in some consistent but unspecified state.  Otherwise we pay
a performance penalty forever.

If the standard explicitly states that the argument to the move constructor
is defined to be empty after the call, we're stuck.



Re: RFC: Allow moved-from strings to be non-empty

2018-10-26 Thread Ville Voutilainen
On Sat, 27 Oct 2018 at 01:27, Joe Buck  wrote:
>
> The reason move constructors were introduced was to speed up code in cases
> where an object is copied and the copy is no longer needed.  It is
> unfortunate that there may now be code out there that relies on accidental
> properties of library implementations.  It would be best if the
> implementation is not constrained.  Unless the standard mandates that, after
> a string is moved, the string is empty, the user should only be able to
> assume that it is in some consistent but unspecified state.  Otherwise we pay
> a performance penalty forever.
>
> If the standard explicitly states that the argument to the move constructor
> is defined to be empty after the call, we're stuck.

It certainly doesn't specify so for an SSO string, so we're not stuck.
On the other hand, we already get a speed-up; it's just not as much of a
speed-up as it can be. What I really loathe is the potential implementation
divergence; it's all good for the priesthood to refer to the standard
and say "you shouldn't have done that", but that's, as a good friend of
mine phrased it on a different matter, spectacularly unhelpful.
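
To make the portability concern concrete, here is a minimal sketch of code
that does not depend on what a moved-from std::string happens to contain.
The helper name take_and_reset is hypothetical and used only for
illustration; the sketch assumes nothing beyond the standard guarantee that
a moved-from string is left in a valid but unspecified state.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: move the contents of `src` into a new string and
// leave `src` in a known state without relying on the implementation.
std::string take_and_reset(std::string &src) {
    // After this line `src` is valid but unspecified: it may be empty, or
    // (e.g. for a short string) it may still hold its old characters.
    std::string dst = std::move(src);

    // Establish the state explicitly instead of assuming src.empty().
    src.clear();
    return dst;
}
```

Code written this way behaves the same whether or not the library empties
the source string on move, so it would be unaffected by the change being
discussed.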


[PATCH v3 0/3] OpenRISC port

2018-10-26 Thread Stafford Horne
Hello,

Changes Since v2:
 - Add RTEMS patches from Joel Sherrill
 - Disable t-softfp-excl as suggested by Joseph Myers
 - Add new architecture flags needed to run on real FPGAs found in testing
   * -mror - enable l.ror (rotate right)
   * -mshftimm - enable shift/rotate by immediate instructions
 - Binutils requirements are now in upstream git

Changes Since v1:
 - Document options in invoke.texi suggested by Joseph Myers
 - Remove obsolete/incorrect macros suggested by Joseph Myers
 - Documented or1k.c functions as requested by Jeff Law
 - Add epilogue barriers suggested by Jeff Law
 - Define SPECULATION_SAFE_VALUE suggested by Jeff Law
 - Switch to init/fini array suggested by Richard Henderson
 - Define and document multilib flags to enable or disable instructions only
   available on some CPU cores as requested on the OpenRISC mailing list.

Since February this year I have been working on an OpenRISC clean room rewrite.

  
http://stffrdhrn.github.io/software/embedded/openrisc/2018/02/03/openrisc_gcc_rewrite.html

As per the article, the old port had issues with some of the owners signing over
FSF copyright.  To get around this I discussed options with the group and in the
end I opted for a clean room rewrite.

The new code base has been written by me with lots of help from Richard
Henderson.  I trust that both of us have our FSF GCC copyright assignments
in place.

# Testing

We have been running the GCC testsuite with newlib and musl libc.  The results
are good.  See results published in a test build/release here:

 - https://github.com/stffrdhrn/gcc/releases/tag/or1k-9.0.0-20181027

# Building

Building this requires the latest binutils upstream master, i.e. 2.31.52.

-Stafford

Stafford Horne (3):
  or1k: libgcc: initial support for openrisc
  or1k: testsuite: initial support for openrisc
  or1k: gcc: initial support for openrisc

 gcc/common/config/or1k/or1k-common.c  |   41 +
 gcc/config.gcc|   45 +
 gcc/config/or1k/constraints.md|   55 +
 gcc/config/or1k/elf.h |   42 +
 gcc/config/or1k/elf.opt   |   33 +
 gcc/config/or1k/linux.h   |   44 +
 gcc/config/or1k/or1k-protos.h |   38 +
 gcc/config/or1k/or1k.c| 2186 +
 gcc/config/or1k/or1k.h|  392 +++
 gcc/config/or1k/or1k.md   |  907 +++
 gcc/config/or1k/or1k.opt  |   67 +
 gcc/config/or1k/predicates.md |   84 +
 gcc/config/or1k/rtems.h   |   30 +
 gcc/config/or1k/t-or1k|   22 +
 gcc/config/or1k/t-rtems   |3 +
 gcc/configure |   12 +
 gcc/configure.ac  |   12 +
 gcc/doc/install.texi  |   19 +
 gcc/doc/invoke.texi   |   68 +
 gcc/doc/md.texi   |   25 +
 .../gcc.c-torture/execute/20101011-1.c|3 +
 gcc/testsuite/gcc.dg/20020312-2.c |2 +
 gcc/testsuite/gcc.dg/attr-alloc_size-11.c |4 +-
 gcc/testsuite/gcc.dg/builtin-apply2.c |2 +-
 gcc/testsuite/gcc.dg/nop.h|2 +
 .../torture/stackalign/builtin-apply-2.c  |2 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c|2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c|2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-34.c|2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-35.c|2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-36.c|2 +-
 gcc/testsuite/gcc.target/or1k/args-1.c|   19 +
 gcc/testsuite/gcc.target/or1k/args-2.c|   15 +
 gcc/testsuite/gcc.target/or1k/cmov-1.c|8 +
 gcc/testsuite/gcc.target/or1k/cmov-2.c|9 +
 gcc/testsuite/gcc.target/or1k/div-mul-1.c |9 +
 gcc/testsuite/gcc.target/or1k/div-mul-2.c |9 +
 gcc/testsuite/gcc.target/or1k/or1k.exp|   41 +
 gcc/testsuite/gcc.target/or1k/return-1.c  |   10 +
 gcc/testsuite/gcc.target/or1k/return-2.c  |   19 +
 gcc/testsuite/gcc.target/or1k/return-3.c  |   19 +
 gcc/testsuite/gcc.target/or1k/return-4.c  |   19 +
 gcc/testsuite/gcc.target/or1k/ror-1.c |8 +
 gcc/testsuite/gcc.target/or1k/ror-2.c |9 +
 gcc/testsuite/gcc.target/or1k/ror-3.c |8 +
 gcc/testsuite/gcc.target/or1k/shftimm-1.c |8 +
 gcc/testsuite/gcc.target/or1k/shftimm-2.c |8 +
 gcc/testsuite/gcc.target/or1k/sibcall-1.c |   18 +
 gcc/testsuite/lib/target-supports.exp |1 +
 libgcc/config.host|   13 +
 libgcc/config/or1k/crti.S |   33 +
 libgcc/config/or1k/crtn.S |1 +
 libgcc/config/or1k/lib1funcs.S|  223 ++
 libgcc/config/or1k/linux-unwind.h |   87 +
 libgcc/config/or1k/sfp-machine.h  |   54 +
 libgcc/config/or1k/t-or1k 

[PATCH v3 1/3] or1k: libgcc: initial support for openrisc

2018-10-26 Thread Stafford Horne
-mm-dd  Stafford Horne  
Richard Henderson  

libgcc/ChangeLog:

* config.host: Add OpenRISC support.
* config/or1k/*: New.
---
 libgcc/config.host|  13 ++
 libgcc/config/or1k/crti.S |  33 +
 libgcc/config/or1k/crtn.S |   1 +
 libgcc/config/or1k/lib1funcs.S| 223 ++
 libgcc/config/or1k/linux-unwind.h |  87 
 libgcc/config/or1k/sfp-machine.h  |  54 
 libgcc/config/or1k/t-or1k |  22 +++
 7 files changed, 433 insertions(+)
 create mode 100644 libgcc/config/or1k/crti.S
 create mode 100644 libgcc/config/or1k/crtn.S
 create mode 100644 libgcc/config/or1k/lib1funcs.S
 create mode 100644 libgcc/config/or1k/linux-unwind.h
 create mode 100644 libgcc/config/or1k/sfp-machine.h
 create mode 100644 libgcc/config/or1k/t-or1k

diff --git a/libgcc/config.host b/libgcc/config.host
index 029f6569caf..e32b2541ea1 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -165,6 +165,9 @@ nds32*-*)
 nios2*-*-*)
cpu_type=nios2
;;
+or1k*-*-*)
+   cpu_type=or1k
+   ;;
 powerpc*-*-*)
cpu_type=rs6000
;;
@@ -1039,6 +1042,16 @@ nios2-*-*)
tmake_file="$tmake_file nios2/t-nios2 t-softfp-sfdf t-softfp-excl 
t-softfp"
extra_parts="$extra_parts crti.o crtn.o"
;;
+or1k-*-linux*)
+   tmake_file="$tmake_file or1k/t-or1k"
+   tmake_file="$tmake_file t-softfp-sfdf t-softfp"
+   md_unwind_header=or1k/linux-unwind.h
+   ;;
+or1k-*-*)
+   tmake_file="$tmake_file or1k/t-or1k"
+   tmake_file="$tmake_file t-softfp-sfdf t-softfp"
+   extra_parts="$extra_parts crti.o crtn.o"
+   ;;
 pdp11-*-*)
tmake_file="pdp11/t-pdp11 t-fdpbit"
;;
diff --git a/libgcc/config/or1k/crti.S b/libgcc/config/or1k/crti.S
new file mode 100644
index 000..9fcf6ae5995
--- /dev/null
+++ b/libgcc/config/or1k/crti.S
@@ -0,0 +1,33 @@
+/* Copyright (C) 2012-2018 Free Software Foundation, Inc.
+
+This file is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+This file is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/* Here _init and _fini are empty because .init_array/.fini_array are used
+   exclusively.  However, the functions are still needed as required when
+   linking.  */
+   .align 4
+   .global _init
+   .type   _init,@function
+_init:
+   .global _fini
+   .type   _fini,@function
+_fini:
+   l.jrr9
+l.nop
diff --git a/libgcc/config/or1k/crtn.S b/libgcc/config/or1k/crtn.S
new file mode 100644
index 000..ca6ee7b6fba
--- /dev/null
+++ b/libgcc/config/or1k/crtn.S
@@ -0,0 +1 @@
+/* crtn.S is empty because .init_array/.fini_array are used exclusively. */
diff --git a/libgcc/config/or1k/lib1funcs.S b/libgcc/config/or1k/lib1funcs.S
new file mode 100644
index 000..354aadae8c4
--- /dev/null
+++ b/libgcc/config/or1k/lib1funcs.S
@@ -0,0 +1,223 @@
+/* Copyright (C) 2018 Free Software Foundation, Inc.
+
+This file is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+This file is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+
+#ifdef L__mulsi3
+   .balign 4
+   .globl  __mulsi3
+   .type   __mulsi3, @function
+__mulsi3:
+   l.movhi r11, 0  /* initial r */
+
+   /* Given R = X * Y ... */
+1: l.sfeq  r4, r0  /* while (y != 0) */
+   l.bf    2f
+    l.andi r5, r4, 1
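
The hunk above is cut off, but its visible comments (initial r, "Given R = X * Y", while (y != 0), and a test of the low bit of y) suggest a shift-and-add multiply. What follows is only a sketch of that technique in C under that assumption, not a transcription of the assembly; the name mulsi3_sketch is hypothetical.

```c
#include <stdint.h>

/* Shift-and-add multiplication modulo 2^32.
   Assumed reading of the loop outline in the truncated hunk above;
   mulsi3_sketch is an illustrative name, not a libgcc symbol.  */
static uint32_t
mulsi3_sketch (uint32_t x, uint32_t y)
{
  uint32_t r = 0;               /* initial r */

  while (y != 0)                /* while (y != 0) */
    {
      if (y & 1)                /* low bit of y set: add current x */
        r += x;
      x <<= 1;                  /* x doubles each round */
      y >>= 1;                  /* consume one bit of y */
    }
  return r;
}
```

For instance, mulsi3_sketch (6, 7) walks the three set bits of 7 and accumulates 6 + 12 + 24.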

[PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-26 Thread Stafford Horne
yyyy-mm-dd  Stafford Horne  
Richard Henderson  
Joel Sherrill  

gcc/ChangeLog:

* common/config/or1k/or1k-common.c: New file.
* config/or1k/*: New.
* config.gcc (or1k*-*-*): New.
* configure.ac (or1k*-*-*): New test for openrisc tls.
* configure: Regenerated.
* doc/install.texi: Document OpenRISC triplets.
* doc/invoke.texi: Document OpenRISC arguments.
* doc/md.texi: Document OpenRISC.
---
 gcc/common/config/or1k/or1k-common.c |   41 +
 gcc/config.gcc   |   45 +
 gcc/config/or1k/constraints.md   |   55 +
 gcc/config/or1k/elf.h|   42 +
 gcc/config/or1k/elf.opt  |   33 +
 gcc/config/or1k/linux.h  |   44 +
 gcc/config/or1k/or1k-protos.h|   38 +
 gcc/config/or1k/or1k.c   | 2186 ++
 gcc/config/or1k/or1k.h   |  392 +
 gcc/config/or1k/or1k.md  |  907 +++
 gcc/config/or1k/or1k.opt |   67 +
 gcc/config/or1k/predicates.md|   84 +
 gcc/config/or1k/rtems.h  |   30 +
 gcc/config/or1k/t-or1k   |   22 +
 gcc/config/or1k/t-rtems  |3 +
 gcc/configure|   12 +
 gcc/configure.ac |   12 +
 gcc/doc/install.texi |   19 +
 gcc/doc/invoke.texi  |   68 +
 gcc/doc/md.texi  |   25 +
 20 files changed, 4125 insertions(+)
 create mode 100644 gcc/common/config/or1k/or1k-common.c
 create mode 100644 gcc/config/or1k/constraints.md
 create mode 100644 gcc/config/or1k/elf.h
 create mode 100644 gcc/config/or1k/elf.opt
 create mode 100644 gcc/config/or1k/linux.h
 create mode 100644 gcc/config/or1k/or1k-protos.h
 create mode 100644 gcc/config/or1k/or1k.c
 create mode 100644 gcc/config/or1k/or1k.h
 create mode 100644 gcc/config/or1k/or1k.md
 create mode 100644 gcc/config/or1k/or1k.opt
 create mode 100644 gcc/config/or1k/predicates.md
 create mode 100644 gcc/config/or1k/rtems.h
 create mode 100644 gcc/config/or1k/t-or1k
 create mode 100644 gcc/config/or1k/t-rtems

diff --git a/gcc/common/config/or1k/or1k-common.c b/gcc/common/config/or1k/or1k-common.c
new file mode 100644
index 000..044e843fd19
--- /dev/null
+++ b/gcc/common/config/or1k/or1k-common.c
@@ -0,0 +1,41 @@
+/* Common hooks for OpenRISC
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic-core.h"
+#include "tm.h"
+#include "common/common-target.h"
+#include "common/common-target-def.h"
+#include "opts.h"
+#include "flags.h"
+
+/* Implement TARGET_OPTION_OPTIMIZATION_TABLE.  */
+static const struct default_options or1k_option_optimization_table[] =
+  {
+/* Enable section anchors by default at -O1 or higher.  */
+{ OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
+{ OPT_LEVELS_NONE, 0, NULL, 0 }
+  };
+
+#undef TARGET_OPTION_OPTIMIZATION_TABLE
+#define TARGET_OPTION_OPTIMIZATION_TABLE or1k_option_optimization_table
+
+struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 71f62a2aba2..0dc2ac4b879 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -468,6 +468,9 @@ nios2-*-*)
 nvptx-*-*)
cpu_type=nvptx
;;
+or1k*-*-*)
+   cpu_type=or1k
+   ;;
 powerpc*-*-*spe*)
cpu_type=powerpcspe
extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h"
@@ -2464,6 +2467,48 @@ nvptx-*)
tm_file="${tm_file} nvptx/offload.h"
fi
;;
+or1k*-*-*)
+   tm_file="elfos.h ${tm_file}"
+   tmake_file="${tmake_file} or1k/t-or1k"
+   # Force .init_array support.  The configure script cannot always
+   # automatically detect that GAS supports it, yet we require it.
+   gcc_cv_initfini_array=yes
+
+   # Handle --with-multilib-list=...
+   or1k_multilibs="${with_multilib_list}"
+   if test "$or1k_multilibs" = "default"; then
+   or1k_multilibs="mcmov,msoft-mul,msoft-div"
+   fi
+   or1k_multilibs=`echo $or1k_multilibs | sed -e 's/,/ /g'`
+   for or1k_multilib in ${or1k_multilibs}; do
+   case ${

[PATCH v3 2/3] or1k: testsuite: initial support for openrisc

2018-10-26 Thread Stafford Horne
yyyy-mm-dd  Stafford Horne  
Richard Henderson  

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/20101011-1.c: Adjust for OpenRISC.
* gcc.dg/20020312-2.c: Likewise.
* gcc.dg/attr-alloc_size-11.c: Likewise.
* gcc.dg/builtin-apply2.c: Likewise.
* gcc.dg/nop.h: Likewise.
* gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
* gcc.dg/tree-ssa/20040204-1.c: Likewise.
* gcc.dg/tree-ssa/reassoc-33.c: Likewise.
* gcc.dg/tree-ssa/reassoc-34.c: Likewise.
* gcc.dg/tree-ssa/reassoc-35.c: Likewise.
* gcc.dg/tree-ssa/reassoc-36.c: Likewise.
* lib/target-supports.exp
(check_effective_target_logical_op_short_circuit): Add or1k*-*-*.
* gcc.target/or1k/*: New.
---
 .../gcc.c-torture/execute/20101011-1.c|  3 ++
 gcc/testsuite/gcc.dg/20020312-2.c |  2 +
 gcc/testsuite/gcc.dg/attr-alloc_size-11.c |  4 +-
 gcc/testsuite/gcc.dg/builtin-apply2.c |  2 +-
 gcc/testsuite/gcc.dg/nop.h|  2 +
 .../torture/stackalign/builtin-apply-2.c  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-34.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-35.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-36.c|  2 +-
 gcc/testsuite/gcc.target/or1k/args-1.c| 19 +
 gcc/testsuite/gcc.target/or1k/args-2.c| 15 +++
 gcc/testsuite/gcc.target/or1k/cmov-1.c|  8 
 gcc/testsuite/gcc.target/or1k/cmov-2.c|  9 
 gcc/testsuite/gcc.target/or1k/div-mul-1.c |  9 
 gcc/testsuite/gcc.target/or1k/div-mul-2.c |  9 
 gcc/testsuite/gcc.target/or1k/or1k.exp| 41 +++
 gcc/testsuite/gcc.target/or1k/return-1.c  | 10 +
 gcc/testsuite/gcc.target/or1k/return-2.c  | 19 +
 gcc/testsuite/gcc.target/or1k/return-3.c  | 19 +
 gcc/testsuite/gcc.target/or1k/return-4.c  | 19 +
 gcc/testsuite/gcc.target/or1k/ror-1.c |  8 
 gcc/testsuite/gcc.target/or1k/ror-2.c |  9 
 gcc/testsuite/gcc.target/or1k/ror-3.c |  8 
 gcc/testsuite/gcc.target/or1k/shftimm-1.c |  8 
 gcc/testsuite/gcc.target/or1k/shftimm-2.c |  8 
 gcc/testsuite/gcc.target/or1k/sibcall-1.c | 18 
 gcc/testsuite/lib/target-supports.exp |  1 +
 29 files changed, 253 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/or1k/args-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/args-2.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/cmov-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/cmov-2.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/div-mul-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/div-mul-2.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/or1k.exp
 create mode 100644 gcc/testsuite/gcc.target/or1k/return-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/return-2.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/return-3.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/return-4.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-2.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/ror-3.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/shftimm-1.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/shftimm-2.c
 create mode 100644 gcc/testsuite/gcc.target/or1k/sibcall-1.c

diff --git a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
index 8261b796a47..d2beeb52a0e 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
@@ -100,6 +100,9 @@ __aeabi_idiv0 (int return_value)
 #elif defined (__moxie__)
   /* Not all moxie configurations may raise exceptions.  */
 # define DO_TEST 0
+#elif defined (__or1k__)
+  /* On OpenRISC division by zero does not trap.  */
+# define DO_TEST 0
 #else
 # define DO_TEST 1
 #endif
diff --git a/gcc/testsuite/gcc.dg/20020312-2.c b/gcc/testsuite/gcc.dg/20020312-2.c
index 1a8afd81506..e72a5b261ae 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -117,6 +117,8 @@ extern void abort (void);
 # if defined (__CK807__) || defined (__CK810__)
 #   define PIC_REG  "r28"
 # endif
+#elif defined (__or1k__)
+/* No pic register.  */
 #else
 # error "Modify the test for your target."
 #endif
diff --git a/gcc/testsuite/gcc.dg/attr-alloc_size-11.c b/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
index 3ec44dc1463..6bb904f4794 100644
--- a/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
+++ b/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
@@ -47,8 +47,8 @@ typedef __SIZE_TYPE__ size_t;
 
 /* The following tests fail because of missing range information.  The xfail
exclusions are PR79356.  */
-TEST (signed char, SCHAR_MIN + 2, ALLOC_MAX);   /* { dg-warning "argu

Re: [PATCH] S/390: Allow immediates in loc expander

2018-10-26 Thread Andreas Krebbel
On 10/26/18 5:31 PM, Robin Dapp wrote:
> Hi,
> 
> this is v2 of the patch.  The Z13 check has been moved from the
> predicate to the expander.  In addition, it changes a test case to
> always run with -march=zEC12 because from z13 on the load immediate on
> condition will prevent loop hoisting that the test requires.
> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
> 2018-10-26  Robin Dapp  
> 
>   * config/s390/predicates.md: Fix typo.
>   * config/s390/s390.md: Allow immediates for load on condition.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-10-26  Robin Dapp  
> 
>   * gcc.dg/loop-8.c: On s390, always run the test with -march=zEC12.
> 

Ok. Thanks!

Andreas



Re: [PATCH] S/390: Add loc patterns for QImode and HImode

2018-10-26 Thread Andreas Krebbel
On 10/26/18 5:33 PM, Robin Dapp wrote:
> Hi,
> 
> this is v2 of the patch with less quirky pattern syntax and two tests.
> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
> 2018-10-26  Robin Dapp  
> 
>   * config/s390/s390.md: QImode and HImode for load on condition.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-10-26  Robin Dapp  
> 
>   * gcc.target/s390/ifcvt-one-insn-bool.c: New test.
>   * gcc.target/s390/ifcvt-one-insn-char.c: New test.
> 

Ok. Thanks!

Andreas



[PATCH] i386: Use scalar operand in SF/DF/SI/DI vec_dup patterns

2018-10-26 Thread H.J. Lu
Use a scalar operand in the SF/DF/SI/DI vec_dup patterns, which enables the
combiner to generate

(set (reg:V8SF 84)
 (vec_duplicate:V8SF (mem/c:SF (symbol_ref:DI ("y")

const_vector_duplicate_operand is added for constant vector broadcast.
We split

(set (reg:V16SF 86)
 (const_vector:V16SF
   [(const_double:SF 2.0e+0 [0x0.8p+2]) repeated x16])

to

(set (reg:V16SF 86)
 (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1")

before IRA so that IRA can turn

(set (reg:V16SF 86)
 (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1")
(set (reg:V16SF 90)
 (plus:V16SF (reg/v:V16SF 85 [ x ])
 (reg:V16SF 86)))

into

(set (reg:V16SF 90)
 (plus:V16SF
   (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1"
   (reg/v:V16SF 85 [ x ])))

For AVX512 broadcast instructions from an integer register operand, we only
need to broadcast an integer to integer vectors.

pic_reg_initialized is added to machine_function to indicate that IRA
has started, since *_const_vec_dup is valid only before IRA.
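
To make the RTL above easier to follow, here is a plain C model of what the two forms compute. It is only a sketch of the semantics, not GCC code: LANES, vec_duplicate_sf, add_two_step and add_folded are hypothetical names, and the arrays stand in for V16SF registers.

```c
#include <stddef.h>

#define LANES 16        /* V16SF: sixteen SFmode (float) lanes */

/* Model of (vec_duplicate:V16SF (mem:SF ...)): copy one scalar loaded
   from memory into every lane.  */
static void
vec_duplicate_sf (float dst[LANES], const float *scalar)
{
  for (size_t i = 0; i < LANES; i++)
    dst[i] = *scalar;
}

/* Two-set form: broadcast into a temporary (reg 86 above), then add.  */
static void
add_two_step (float out[LANES], const float x[LANES], const float *scalar)
{
  float tmp[LANES];

  vec_duplicate_sf (tmp, scalar);
  for (size_t i = 0; i < LANES; i++)
    out[i] = x[i] + tmp[i];
}

/* Single-set form: the broadcast is an operand of the plus, so no
   separate temporary vector is materialized.  */
static void
add_folded (float out[LANES], const float x[LANES], const float *scalar)
{
  for (size_t i = 0; i < LANES; i++)
    out[i] = *scalar + x[i];
}
```

Both functions yield the same lanes; the difference the patch cares about is only whether the broadcast lives in its own set or is folded into the arithmetic.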

gcc/

PR target/87537
PR target/87767
* config/i386/i386-builtin.def: Replace
CODE_FOR_avx2_vec_dupv4sf, CODE_FOR_avx2_vec_dupv8sf and
CODE_FOR_avx2_vec_dupv4df with CODE_FOR_vec_dupv4sf,
CODE_FOR_vec_dupv8sf and CODE_FOR_vec_dupv4df, respectively.
* config/i386/i386.c (ix86_init_pic_reg): Set pic_reg_initialized.
(expand_vec_perm_1): Replace gen_avx512f_vec_dupv16sf_1,
gen_avx2_vec_dupv8sf_1 and gen_avx512f_vec_dupv8df_1 with
gen_avx512f_vec_dupv16sf, gen_vec_dupv8sf and
gen_avx512f_vec_dupv8df, respectively.  Duplicate them from
scalar operand.
* config/i386/i386.h (machine_function): Add pic_reg_initialized.
* config/i386/i386.md (SF to DF splitter): Replace
gen_avx512f_vec_dupv16sf_1 with gen_avx512f_vec_dupv16sf.
* config/i386/predicates.md (const_vector_duplicate_operand): New.
* config/i386/sse.md (VF48_AVX512VL): New.
(avx2_vec_dup): Removed.
(avx2_vec_dupv8sf_1): Likewise.
(avx512f_vec_dup_1): Likewise.
(avx2_vec_dupv4df): Likewise.
(_vec_dup:V48_AVX512VL): Likewise.
(_vec_dup:VF48_AVX512VL): New.
(*_const_vec_dup): Likewise.
(_vec_dup:VI48_AVX512VL): Likewise.
(_vec_dup_gpr): Replace
V48_AVX512VL with VI48_AVX512VL.
(*avx_vperm_broadcast_): Replace gen_avx2_vec_dupv8sf with
gen_vec_dupv8sf.

gcc/testsuite/

PR target/87537
PR target/87767
* gcc.target/i386/avx2-vbroadcastss_ps256-1.c: Updated.
* gcc.target/i386/avx512vl-vbroadcast-3.c: Likewise.
* gcc.target/i386/avx512-binop-7.h: New file.
* gcc.target/i386/avx512f-add-sf-zmm-7.c: Likewise.
* gcc.target/i386/avx512f-add-si-zmm-7.c: Likewise.
* gcc.target/i386/avx512vl-add-di-xmm-7.c: Likewise.
* gcc.target/i386/avx512vl-add-sf-xmm-7.c: Likewise.
* gcc.target/i386/avx512vl-add-sf-ymm-7.c: Likewise.
* gcc.target/i386/avx512vl-add-si-xmm-7.c: Likewise.
* gcc.target/i386/avx512vl-add-si-ymm-7.c: Likewise.
* gcc.target/i386/pr87537-2.c: Likewise.
* gcc.target/i386/pr87537-3.c: Likewise.
* gcc.target/i386/pr87537-4.c: Likewise.
* gcc.target/i386/pr87537-5.c: Likewise.
* gcc.target/i386/pr87537-6.c: Likewise.
* gcc.target/i386/pr87537-7.c: Likewise.
* gcc.target/i386/pr87537-8.c: Likewise.
* gcc.target/i386/pr87537-9.c: Likewise.
---
 gcc/config/i386/i386-builtin.def  |  6 +-
 gcc/config/i386/i386.c| 30 +-
 gcc/config/i386/i386.h|  3 +
 gcc/config/i386/i386.md   |  2 +-
 gcc/config/i386/predicates.md | 13 +++
 gcc/config/i386/sse.md| 98 ---
 .../i386/avx2-vbroadcastss_ps256-1.c  |  3 +-
 .../gcc.target/i386/avx512-binop-7.h  | 12 +++
 .../gcc.target/i386/avx512f-add-sf-zmm-7.c| 14 +++
 .../gcc.target/i386/avx512f-add-si-zmm-7.c| 12 +++
 .../gcc.target/i386/avx512vl-add-di-xmm-7.c   | 13 +++
 .../gcc.target/i386/avx512vl-add-sf-xmm-7.c   | 13 +++
 .../gcc.target/i386/avx512vl-add-sf-ymm-7.c   | 13 +++
 .../gcc.target/i386/avx512vl-add-si-xmm-7.c   | 13 +++
 .../gcc.target/i386/avx512vl-add-si-ymm-7.c   | 13 +++
 .../gcc.target/i386/avx512vl-vbroadcast-3.c   |  5 +-
 gcc/testsuite/gcc.target/i386/pr87537-2.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-3.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-4.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-5.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-6.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-7.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-8.c | 12 +++
 gcc/testsuite/gcc.target/i386/pr87537-9.c | 12 +++
 24 files changed, 289 insertions(+), 70 deletions(-)
 crea