Re: gfortran.dg/PR82376.f90: Avoid matching a file-path.

2021-08-12 Thread Bernhard Reutner-Fischer via Gcc-patches
On Thu, 12 Aug 2021 00:09:21 +0200
Hans-Peter Nilsson via Fortran wrote:

> I had a file-path to sources with the substring "new" in it,
> and (only) this test regressed compared to results from
> another build without "new" in the name.
> 
> The test does
>  ! { dg-final { scan-tree-dump-times "new" 4 "original" } }
> i.e. the contents of the tree-dump-file .original needs to match
> the undelimited string "new" exactly four times.  Very brittle.
> 
> In the dump-file, there are three lines with calls to new:
>  D.908 = new ((integer(kind=4) *) data);
>  integer(kind=4) * new (integer(kind=4) & data)
>static integer(kind=4) * new (integer(kind=4) &);
> 
> But, there's also a line, which for me and cris-elf looked like:
>  _gfortran_runtime_error_at (&"At line 46 of file
>   /X/xyzzynewfrob/gcc/testsuite/gfortran.dg/PR82376.f90"[1]{lb: 1 sz: 1},
>   &"Pointer actual argument \'new\' is not associated"[1]{lb: 1 sz: 1});
> The fourth match is obviously intended to match this line, but only
> with *one* match, whereas the path can as above yield another hit.
> 
> With Tcl, the regexp for matching the " " *and* the "'"
> *and* the "\" gets a bit unsightly, so I suggest just
> matching the "new" calls, which according to the comment in
> the test is the key point.  You can't have a file-path with
> spaces and parentheses in a gcc build.  I'm also making use
> of {} rather than "" needing one level of quoting; the "\("
> is needed because the matched string is a regexp.
> 
> Ok to commit?

A word match would be \mnew\M, but I agree that counting calls with
{\mnew (} is fine too.

I'd call it obvious, so I dare to approve it.
OK.
thanks!
> 
> testsuite:
>   * gfortran.dg/PR82376.f90: Robustify match.
> ---
>  gcc/testsuite/gfortran.dg/PR82376.f90 | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gfortran.dg/PR82376.f90 b/gcc/testsuite/gfortran.dg/PR82376.f90
> index 07143ab7e82e..b99779ce9d8a 100644
> --- a/gcc/testsuite/gfortran.dg/PR82376.f90
> +++ b/gcc/testsuite/gfortran.dg/PR82376.f90
> @@ -2,7 +2,8 @@
>  ! { dg-options "-fdump-tree-original -fcheck=pointer" }
>  !
>  ! Test the fix for PR82376. The pointer check was doubling up the call
> -! to new. The fix reduces the count of 'new' from 5 to 4.
> +! to new. The fix reduces the count of 'new' from 5 to 4, or to 3, when
> +! counting only calls.
>  !
>  ! Contributed by José Rui Faustino de Sousa  
>  !
> @@ -56,4 +57,4 @@ contains
>end subroutine set
>  
>  end program main_p
> -! { dg-final { scan-tree-dump-times "new" 4 "original" } }
> +! { dg-final { scan-tree-dump-times { new \(} 3 "original" } }
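
To make the counting argument concrete, here is a small standalone sketch that counts matches in the dump excerpt quoted above. It is not part of the patch: it uses C++ std::regex (ECMAScript syntax, so \b stands in for Tcl's \m and \M), and the helper names count_matches, sample_dump and check_counts are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count non-overlapping matches of `pattern` (ECMAScript regex) in `text`.
std::size_t count_matches(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    return static_cast<std::size_t>(std::distance(
        std::sregex_iterator(text.begin(), text.end(), re),
        std::sregex_iterator()));
}

// The four dump lines quoted in the message, including a source path that
// happens to contain the substring "new".
std::string sample_dump() {
    return
        "D.908 = new ((integer(kind=4) *) data);\n"
        "integer(kind=4) * new (integer(kind=4) & data)\n"
        "static integer(kind=4) * new (integer(kind=4) &);\n"
        "_gfortran_runtime_error_at (&\"At line 46 of file "
        "/X/xyzzynewfrob/gcc/testsuite/gfortran.dg/PR82376.f90\"[1]{lb: 1 sz: 1}, "
        "&\"Pointer actual argument \\'new\\' is not associated\"[1]{lb: 1 sz: 1});\n";
}

// Compare the three matching strategies discussed in the thread.
void check_counts() {
    const std::string dump = sample_dump();
    assert(count_matches(dump, "new") == 5);         // bare substring: path adds a hit
    assert(count_matches(dump, "\\bnew\\b") == 4);   // whole word: path no longer matches
    assert(count_matches(dump, " new \\(") == 3);    // calls only, as in the patch
}
```

With the path containing "new", the bare substring yields five hits instead of the expected four, while both the word match and the call-only pattern are independent of the build directory.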



Re: [patch] Make -no-pie option work for native Windows

2021-08-12 Thread Eric Botcazou
> Looks good to me. Do you have push permissions?

Thanks.  Yes, see the preceding message on gcc-patches@, so applied.

-- 
Eric Botcazou




openmp: Diagnose omp::directive/sequence on using-directive

2021-08-12 Thread Jakub Jelinek via Gcc-patches
Hi!

With the using-directive parsing changes, we now emit only a warning
for [[omp::directive (...)]] on using-directive.  While that is right
without -fopenmp/-fopenmp-simd, when OpenMP is enabled, that should
be an error as OpenMP (is going to) disallow such attributes there
as they do not appertain to a statement.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-08-12  Jakub Jelinek  

* name-lookup.c (finish_using_directive): Diagnose omp::directive
or omp::sequence attributes on using-directive.

* g++.dg/gomp/attrs-11.C: Adjust expected diagnostics.

--- gcc/cp/name-lookup.c.jj 2021-06-10 19:56:20.606335462 +0200
+++ gcc/cp/name-lookup.c 2021-08-11 22:33:51.367221663 +0200
@@ -8560,6 +8560,7 @@ finish_using_directive (tree target, tree attribs)
   add_using_namespace (current_binding_level->using_directives,
   ORIGINAL_NAMESPACE (target));
 
+  bool diagnosed = false;
   if (attribs != error_mark_node)
 for (tree a = attribs; a; a = TREE_CHAIN (a))
   {
@@ -8572,6 +8573,16 @@ finish_using_directive (tree target, tree attribs)
  inform (DECL_SOURCE_LOCATION (target),
  "you can use an inline namespace instead");
  }
+   else if ((flag_openmp || flag_openmp_simd)
+&& get_attribute_namespace (a) == omp_identifier
+&& (is_attribute_p ("directive", name)
+|| is_attribute_p ("sequence", name)))
+ {
+   if (!diagnosed)
+ error ("%<omp::%E%> not allowed to be specified in this "
+"context", name);
+   diagnosed = true;
+ }
else
  warning (OPT_Wattributes, "%qD attribute directive ignored", name);
   }
--- gcc/testsuite/g++.dg/gomp/attrs-11.C.jj 2021-08-11 16:56:49.262489927 +0200
+++ gcc/testsuite/g++.dg/gomp/attrs-11.C 2021-08-11 22:34:59.427281071 +0200
@@ -11,7 +11,7 @@ foo ()
   [[omp::directive (parallel)]] __extension__ asm (""); // { dg-error "expected" }
   __extension__ [[omp::directive (parallel)]] asm (""); // { dg-error "expected" }
   [[omp::directive (parallel)]] namespace M = ::N; // { dg-error "expected" }
-  [[omp::directive (parallel)]] using namespace N; // { dg-bogus "expected" "" { xfail *-*-* } }
+  [[omp::directive (parallel)]] using namespace N; // { dg-error "not allowed to be specified in this context" }
   [[omp::directive (parallel)]] using O::T; // { dg-error "expected" }
   [[omp::directive (parallel)]] __label__ foo; // { dg-error "expected" }
   [[omp::directive (parallel)]] static_assert (true, "");  // { dg-error "expected" }


Jakub



[committed] openmp: Diagnose another case of mixing parameter and attribute syntax

2021-08-12 Thread Jakub Jelinek via Gcc-patches
Hi!

This patch diagnoses cases like:
  #pragma omp parallel
  [[omp::directive (declare simd)]] int foo ();
or
  #pragma omp taskgroup
  int bar [[omp::directive (declare simd)]] (int);
where the pragma is on the same declaration statement as the declare simd
attribute.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-08-12  Jakub Jelinek  

* parser.c (cp_parser_lambda_body): Add temp overrides
for parser->{omp_declare_simd,oacc_routine,omp_attrs_forbidden_p}.
(cp_parser_statement): Restore parser->omp_attrs_forbidden_p for
cp_parser_declaration_statement.
(cp_parser_default_argument): Add temp override for
parser->omp_attrs_forbidden_p.
(cp_parser_late_parsing_omp_declare_simd): Diagnose declare simd
or declare variant in attribute syntax on a declaration immediately
following an OpenMP construct in pragma syntax.

* g++.dg/gomp/attrs-11.C: Add new tests.

--- gcc/cp/parser.c.jj  2021-08-11 14:17:53.980173449 +0200
+++ gcc/cp/parser.c 2021-08-11 16:51:58.057481477 +0200
@@ -11625,6 +11625,9 @@ cp_parser_lambda_body (cp_parser* parser
middle of an expression.  */
 ++function_depth;
 
+  auto odsd = make_temp_override (parser->omp_declare_simd, NULL);
+  auto ord = make_temp_override (parser->oacc_routine, NULL);
+  auto oafp = make_temp_override (parser->omp_attrs_forbidden_p, false);
   vec omp_privatization_save;
   save_omp_privatization_clauses (omp_privatization_save);
   /* Clear this in case we're in the middle of a default argument.  */
@@ -12268,9 +12271,11 @@ cp_parser_statement (cp_parser* parser,
   so let's un-parse them.  */
saved_tokens.rollback();
 
+ parser->omp_attrs_forbidden_p = omp_attrs_forbidden_p;
  cp_parser_parse_tentatively (parser);
  /* Try to parse the declaration-statement.  */
  cp_parser_declaration_statement (parser);
+ parser->omp_attrs_forbidden_p = false;
  /* If that worked, we're done.  */
  if (cp_parser_parse_definitely (parser))
return;
@@ -24716,6 +24721,8 @@ cp_parser_default_argument (cp_parser *p
   parser->greater_than_is_operator_p = !template_parm_p;
   auto odsd = make_temp_override (parser->omp_declare_simd, NULL);
   auto ord = make_temp_override (parser->oacc_routine, NULL);
+  auto oafp = make_temp_override (parser->omp_attrs_forbidden_p, false);
+
   /* Local variable names (and the `this' keyword) may not
  appear in a default argument.  */
   saved_local_variables_forbidden_p = parser->local_variables_forbidden_p;
@@ -44445,6 +44452,14 @@ cp_parser_late_parsing_omp_declare_simd
continue;
  }
 
+   if (parser->omp_attrs_forbidden_p)
+ {
+   error_at (first->location,
+ "mixing OpenMP directives with attribute and "
+ "pragma syntax on the same statement");
+   parser->omp_attrs_forbidden_p = false;
+ }
+
if (!flag_openmp && strcmp (directive[1], "simd") != 0)
  continue;
if (lexer == NULL)
--- gcc/testsuite/g++.dg/gomp/attrs-11.C.jj 2021-08-10 11:22:14.441607489 +0200
+++ gcc/testsuite/g++.dg/gomp/attrs-11.C 2021-08-11 16:56:49.262489927 +0200
@@ -72,3 +72,15 @@ int f28 [[omp::directive (declare simd),
 int f29 [[omp::directive (foobar), omp::directive (declare simd)]] (int);  // { dg-error "unknown OpenMP directive name" }
 int f30 [[omp::directive (threadprivate (t7)), omp::directive (declare simd)]] (int);  // { dg-error "OpenMP directive other than 'declare simd' or 'declare variant' appertains to a declaration" }
 int f31 [[omp::directive (declare simd), omp::directive (threadprivate (t8))]] (int);  // { dg-error "OpenMP directive other than 'declare simd' or 'declare variant' appertains to a declaration" }
+
+void
+baz ()
+{
+  #pragma omp parallel
+  [[omp::directive (declare simd)]] extern int f32 (int);  // { dg-error "mixing OpenMP directives with attribute and pragma syntax on the same statement" }
+  #pragma omp parallel
+  extern int f33 [[omp::directive (declare simd)]] (int);  // { dg-error "mixing OpenMP directives with attribute and pragma syntax on the same statement" }
+  [[omp::directive (parallel)]]
+  #pragma omp declare simd // { dg-error "mixing OpenMP directives with attribute and pragma syntax on the same statement" }
+  extern int f34 (int);
+}


Jakub



[committed] openmp: Diagnose syntax mismatches between declare target and end declare target

2021-08-12 Thread Jakub Jelinek via Gcc-patches
Hi!

OpenMP 5.1 says:
For any directive that has a paired end directive, including those with a begin
and end pair, both directives must use either the attribute syntax or the
pragma syntax.

The following patch enforces it with the only pair so far recognized in C++
(Fortran has many, but on the other side doesn't have attribute syntax).

While I initially wanted to use vec *member; in there, that
unfortunately doesn't work, one gets linker errors and I guess it is fixable,
but for begin declare target we'll need a struct anyway to store device_type
etc.
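
The change from a counter to a stack of small structs can be sketched outside the compiler as follows. This is only an illustration under stated assumptions: DeclareTargetStack, its methods and the diagnostic strings are hypothetical, and std::vector stands in for GCC's GC-allocated vec.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Plays the role of omp_declare_target_attr: remembers which syntax
// opened the region.
struct DeclareTargetEntry {
    bool attr_syntax;
};

// Hypothetical stand-in for scope_chain->omp_declare_target_attribute.
class DeclareTargetStack {
public:
    // "declare target" seen: push instead of incrementing a counter.
    void begin(bool attr_syntax) { entries_.push_back({attr_syntax}); }

    // "end declare target" seen: pop and compare syntaxes.  Returns an empty
    // string on success, otherwise an illustrative diagnostic (not GCC's wording).
    std::string end(bool attr_syntax) {
        if (entries_.empty())
            return "end declare target without matching declare target";
        const DeclareTargetEntry opened = entries_.back();
        entries_.pop_back();
        if (opened.attr_syntax != attr_syntax)
            return opened.attr_syntax
                       ? "attribute-syntax begin terminated by pragma-syntax end"
                       : "pragma-syntax begin terminated by attribute-syntax end";
        return "";
    }

    // Non-empty at end of translation unit means a missing "end declare target".
    bool unterminated() const { return !entries_.empty(); }

private:
    std::vector<DeclareTargetEntry> entries_;
};

// Small self-check: a mismatched pair is diagnosed, a matched pair is not.
void check_declare_target_stack() {
    DeclareTargetStack s;
    s.begin(true);
    assert(!s.end(false).empty());
    s.begin(false);
    assert(s.end(false).empty());
    assert(!s.unterminated());
}
```

A plain counter cannot express the mismatch check because it forgets how each region was opened; a per-entry struct also leaves room for further fields such as device_type later.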

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-08-12  Jakub Jelinek  

* cp-tree.h (omp_declare_target_attr): New type.
(struct saved_scope): Change type of omp_declare_target_attribute
from int to vec * and move it.
* parser.c (cp_parser_omp_declare_target): Instead of
incrementing scope_chain->omp_declare_target_attribute, push
a struct containing parser->lexer->in_omp_attribute_pragma to
the vector.
(cp_parser_omp_end_declare_target): Instead of decrementing
scope_chain->omp_declare_target_attribute, pop a structure
from it.  Diagnose mismatching declare target vs.
end declare target syntax.
* semantics.c (finish_translation_unit): Use vec_safe_length
and vec_safe_truncate on scope_chain->omp_declare_target_attribute.
* decl2.c (cplus_decl_attributes): Use vec_safe_length
on scope_chain->omp_declare_target_attribute.

* g++.dg/gomp/attrs-12.C: New test.

--- gcc/cp/cp-tree.h.jj 2021-08-11 23:43:59.167894114 +0200
+++ gcc/cp/cp-tree.h 2021-08-11 23:46:28.447823665 +0200
@@ -1789,6 +1789,10 @@ union GTY((desc ("cp_tree_node_structure
 };
 
 
+struct GTY(()) omp_declare_target_attr {
+  bool attr_syntax;
+};
+
 /* Global state.  */
 
 struct GTY(()) saved_scope {
@@ -1826,9 +1830,6 @@ struct GTY(()) saved_scope {
   int unevaluated_operand;
   int inhibit_evaluation_warnings;
   int noexcept_operand;
-  /* If non-zero, implicit "omp declare target" attribute is added into the
- attribute lists.  */
-  int omp_declare_target_attribute;
   int ref_temp_count;
 
   struct stmt_tree_s x_stmt_tree;
@@ -1837,6 +1838,7 @@ struct GTY(()) saved_scope {
   cp_binding_level *bindings;
 
   hash_map *GTY((skip)) x_local_specializations;
+  vec *omp_declare_target_attribute;
 
   struct saved_scope *prev;
 };
--- gcc/cp/parser.c.jj  2021-08-11 23:43:59.192893768 +0200
+++ gcc/cp/parser.c 2021-08-11 23:51:20.622755996 +0200
@@ -44605,8 +44605,10 @@ cp_parser_omp_declare_target (cp_parser
 }
   else
 {
+  struct omp_declare_target_attr a
+   = { parser->lexer->in_omp_attribute_pragma };
+  vec_safe_push (scope_chain->omp_declare_target_attribute, a);
   cp_parser_require_pragma_eol (parser, pragma_tok);
-  scope_chain->omp_declare_target_attribute++;
   return;
 }
   for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
@@ -44687,6 +44689,7 @@ static void
 cp_parser_omp_end_declare_target (cp_parser *parser, cp_token *pragma_tok)
 {
   const char *p = "";
+  bool in_omp_attribute_pragma = parser->lexer->in_omp_attribute_pragma;
   if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
 {
   tree id = cp_lexer_peek_token (parser->lexer)->u.value;
@@ -44717,12 +44720,26 @@ cp_parser_omp_end_declare_target (cp_par
   return;
 }
   cp_parser_require_pragma_eol (parser, pragma_tok);
-  if (!scope_chain->omp_declare_target_attribute)
+  if (!vec_safe_length (scope_chain->omp_declare_target_attribute))
 error_at (pragma_tok->location,
  "%<#pragma omp end declare target%> without corresponding "
  "%<#pragma omp declare target%>");
   else
-scope_chain->omp_declare_target_attribute--;
+{
+  omp_declare_target_attr
+   a = scope_chain->omp_declare_target_attribute->pop ();
+  if (a.attr_syntax != in_omp_attribute_pragma)
+   {
+ if (a.attr_syntax)
+   error_at (pragma_tok->location,
+ "% in attribute syntax terminated "
+ "with % in pragma syntax");
+ else
+   error_at (pragma_tok->location,
+ "% in pragma syntax terminated "
+ "with % in attribute syntax");
+   }
+}
 }
 
 /* Helper function of cp_parser_omp_declare_reduction.  Parse the combiner
--- gcc/cp/semantics.c.jj   2021-08-11 23:43:45.892077375 +0200
+++ gcc/cp/semantics.c  2021-08-11 23:50:59.420051181 +0200
@@ -3271,12 +3271,12 @@ finish_translation_unit (void)
   /* Do file scope __FUNCTION__ et al.  */
   finish_fname_decls ();
 
-  if (scope_chain->omp_declare_target_attribute)
+  if (vec_safe_length (scope_chain->omp_declare_target_attribute))
 {
   if (!errorcount)
error ("%<#pragma omp declare target%> without corresponding "
   "%<#pragma omp end declare target%>");
-  scope_chain->om

[PATCH] i386: Fix up V32HImode permutations with -mno-avx512bw [PR101860]

2021-08-12 Thread Jakub Jelinek via Gcc-patches
Hi!

My patch from yesterday apparently broke some V32HImode permutations
as the testcase shows.
The first function assumed it would never be called in d->testing_p mode
and so went right away into emitting the code.
And the second one assumed V32HImode would never reach it, which now
can for the !TARGET_AVX512BW case.  We don't have an instruction
in that case though.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2021-08-12  Jakub Jelinek  

PR target/101860
* config/i386/i386-expand.c (ix86_expand_vec_one_operand_perm_avx512):
If d->testing_p, return true after performing checks instead of
actually expanding the insn.
(expand_vec_perm_broadcast_1): Handle V32HImode - assert
!TARGET_AVX512BW and return false.

* gcc.target/i386/avx512f-pr101860.c: New test.

--- gcc/config/i386/i386-expand.c.jj 2021-08-10 12:37:53.867159317 +0200
+++ gcc/config/i386/i386-expand.c   2021-08-11 11:02:03.994908828 +0200
@@ -18116,6 +18116,9 @@ ix86_expand_vec_one_operand_perm_avx512
   return false;
 }
 
+  if (d->testing_p)
+return true;
+
   target = d->target;
   op0 = d->op0;
   for (int i = 0; i < d->nelt; ++i)
@@ -20481,6 +20484,10 @@ expand_vec_perm_broadcast_1 (struct expa
   gcc_assert (!TARGET_AVX2 || d->perm[0]);
   return false;
 
+case E_V32HImode:
+  gcc_assert (!TARGET_AVX512BW);
+  return false;
+
 default:
   gcc_unreachable ();
 }
--- gcc/testsuite/gcc.target/i386/avx512f-pr101860.c.jj 2021-08-11 11:05:28.090072461 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr101860.c 2021-08-11 11:05:17.157224399 +0200
@@ -0,0 +1,5 @@
+/* PR target/101860 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mno-avx512bw" } */
+
+#include "../../gcc.dg/torture/vshuf-v32hi.c"

Jakub



[PATCH] libcpp: Fix ICE with -Wtraditional preprocessing [PR101638]

2021-08-12 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs in cpp_sys_macro_p, because cpp_sys_macro_p
is called for a builtin macro which doesn't use node->value.macro union
member but a different one and so dereferencing it ICEs.
As the testcase is distilled from contemporary glibc headers, it means
basically -Wtraditional now ICEs on almost everything.
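
The hazard described above is the usual one with a discriminated union: the accessor must check the discriminator before reading a member. A minimal sketch, assuming simplified and entirely hypothetical types (NodeKind, Macro, Node and sys_macro_p are not libcpp's definitions):

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for libcpp's node types.
enum class NodeKind { UserMacro, BuiltinMacro };

struct Macro {
    bool syshdr;  // true if the macro was defined in a system header
};

struct Node {
    NodeKind kind;
    union {
        Macro* macro;      // valid only when kind == UserMacro
        int builtin_code;  // valid only when kind == BuiltinMacro
    } value;
};

// Check the discriminator before touching value.macro.  Builtin macros
// yield false, which is the behaviour the proposed patch picks.
bool sys_macro_p(const Node* node) {
    return node
           && node->kind == NodeKind::UserMacro
           && node->value.macro
           && node->value.macro->syshdr;
}

// Small self-check covering a user macro, a builtin macro and no macro.
void check_sys_macro_p() {
    Macro m{true};
    Node user;
    user.kind = NodeKind::UserMacro;
    user.value.macro = &m;
    Node builtin;
    builtin.kind = NodeKind::BuiltinMacro;
    builtin.value.builtin_code = 7;
    assert(sys_macro_p(&user));
    assert(!sys_macro_p(&builtin));
    assert(!sys_macro_p(nullptr));
}
```

Without the kind check, the builtin node's integer payload would be reinterpreted as a pointer and dereferenced, which corresponds to the crash reported here.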

The fix can be either the patch below, return false for builtin macros,
or we could instead return true for builtin macros and adjust the
function comment, or the fix could be also (untested):
--- libcpp/expr.c   2021-05-07 10:34:46.345122608 +0200
+++ libcpp/expr.c   2021-08-12 09:54:01.837556365 +0200
@@ -783,13 +783,13 @@ cpp_classify_number (cpp_reader *pfile,
 
   /* Traditional C only accepted the 'L' suffix.
  Suppress warning about 'LL' with -Wno-long-long.  */
-  if (CPP_WTRADITIONAL (pfile) && ! cpp_sys_macro_p (pfile))
+  if (CPP_WTRADITIONAL (pfile))
{
  int u_or_i = (result & (CPP_N_UNSIGNED|CPP_N_IMAGINARY));
  int large = (result & CPP_N_WIDTH) == CPP_N_LARGE
   && CPP_OPTION (pfile, cpp_warn_long_long);
 
- if (u_or_i || large)
+ if ((u_or_i || large) && ! cpp_sys_macro_p (pfile))
 cpp_warning_with_line (pfile, large ? CPP_W_LONG_LONG : CPP_W_TRADITIONAL,
   virtual_location, 0,
   "traditional C rejects the \"%.*s\" suffix",
The builtin macros, at least currently, don't add any suffixes
or numbers that -Wtraditional would warn about.  For floating
point suffixes, -Wtraditional calls cpp_sys_macro_p only right
before emitting the warning, but in the above case the ICE
happens because cpp_sys_macro_p is called even if the number doesn't
have any suffix (which I think is always the case for builtin macros
right now).

Bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk, or do you prefer to return true for builtin
macros from cpp_sys_macro_p instead, and/or do you want the
above cpp_classify_number change (which can be done either
alone or in addition to some cpp_sys_macro_p change)?

2021-08-12  Jakub Jelinek  

PR preprocessor/101638
* macro.c (cpp_sys_macro_p): Return false instead of
crashing on builtin macros.

* gcc.dg/cpp/pr101638.c: New test.

--- libcpp/macro.c.jj   2021-07-20 17:03:08.036449892 +0200
+++ libcpp/macro.c  2021-08-11 20:50:11.212662720 +0200
@@ -3127,7 +3127,10 @@ cpp_sys_macro_p (cpp_reader *pfile)
   else
 node = pfile->context->c.macro;
 
-  return node && node->value.macro && node->value.macro->syshdr;
+  return (node
+ && cpp_user_macro_p (node)
+ && node->value.macro
+ && node->value.macro->syshdr);
 }
 
 /* Read each token in, until end of the current file.  Directives are
--- gcc/testsuite/gcc.dg/cpp/pr101638.c.jj 2021-08-11 20:53:04.785289640 +0200
+++ gcc/testsuite/gcc.dg/cpp/pr101638.c 2021-08-11 20:52:36.086682006 +0200
@@ -0,0 +1,7 @@
+/* PR preprocessor/101638 */
+/* { dg-do preprocess } */
+/* { dg-options "-Wtraditional" } */
+
+#define foo(attr) __has_attribute(attr)
+#if foo(__deprecated__)
+#endif

Jakub



Re: [PATCH] i386: Fix up V32HImode permutations with -mno-avx512bw [PR101860]

2021-08-12 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 12, 2021 at 3:49 PM Jakub Jelinek  wrote:
>
> Hi!
>
> My patch from yesterday apparently broke some V32HImode permutations
> as the testcase shows.
> The first function assumed it would never be called in d->testing_p mode
> and so went right away into emitting the code.
> And the second one assumed V32HImode would never reach it, which now
> can for the !TARGET_AVX512BW case.  We don't have an instruction
> in that case though.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
LGTM.


-- 
BR,
Hongtao


Re: [wwwdocs] gcc-12/changes.html: OpenMP - mention C++11 attributes support

2021-08-12 Thread Tobias Burnus

Hi all,

On 19.07.21 17:28, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Jul 19, 2021 at 05:17:10PM +0200, Tobias Burnus wrote:
>> Update the OpenMP feature list.
>> Comments? Remarks?
>
> I'd defer mentioning it until I actually finish it

With today's commits by Jakub, the implementation is rather complete.

Hence, as discussed on IRC, I have committed the attached version – with
the 'initial' and other caveats dropped.

See also: https://gcc.gnu.org/gcc-12/changes.html

Tobias



commit b16ae164c47aaf65d70480b90f55fd9acdba5418
Author: Tobias Burnus 
Date:   Thu Aug 12 10:43:03 2021 +0200

gcc-12/changes.html: OpenMP - mention C++11 attributes support

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 9c2799cf..a8859882 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -56,7 +56,9 @@ a work-in-progress.
iterator can now also be used with the depend
   clause, defaultmap has been updated for OpenMP 5.0, and the
   loop directive and combined directives
-  involving master directive have been added.
+  involving master directive have been added. Additionally,
+  support for expressing OpenMP directives as C++ 11 attributes has been
+  added, which is an OpenMP 5.1 feature.
   
   The new warning flag -Wopenacc-parallelism was added for
   OpenACC. It warns about potentially suboptimal choices related to


Re: [PATCH] [i386] Optimize vec_perm_expr to match vpmov{dw,qd,wb}.

2021-08-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 12, 2021 at 01:43:23PM +0800, liuhongt wrote:
> Hi:
>   This is another patch to optimize vec_perm_expr to match vpmov{dw,qd,wb}
> under AVX512.
>   For scenarios (like pr101846-2.c) where the upper half is not used, this patch
> generates better code with only one vpmov{wb,dw,qd} instruction. For
> scenarios (like pr101846-3.c) where the upper half is actually used, if the src
> vector length is 256/512 bits, the patch can still generate better code, but for
> 128 bits, the code generation is worse.
> 
> 128 bits upper half not used.
> 
> -   vpshufb .LC2(%rip), %xmm0, %xmm0
> +   vpmovdw %xmm0, %xmm0
> 
> 128 bits upper half used.
> -   vpshufb .LC2(%rip), %xmm0, %xmm0
> +   vpmovdw %xmm0, %xmm1
> +   vmovq   %xmm1, %rax
> +   vpinsrq $0, %rax, %xmm0, %xmm0
> 
>   Maybe expand_vec_perm_trunc_vinsert should only deal with 256/512bits of
> vectors, but considering the real use of scenarios like pr101846-3.c
> foo_*_128 possibility is relatively low, I still keep this part of the code.

I actually am not sure if even
 foo_dw_512:
 .LFB0:
.cfi_startproc
-   vmovdqa64   %zmm0, %zmm1
-   vmovdqa64   .LC0(%rip), %zmm0
-   vpermi2w    %zmm1, %zmm1, %zmm0
+   vpmovdw %zmm0, %ymm1
+   vinserti64x4    $0x0, %ymm1, %zmm0, %zmm0
ret
is always a win, the permutations we should care most about are in loops
and the constant load as well as the first move in that case likely go
away and it is one permutation insn vs. two.
Different case is e.g.
-   vmovdqa64   .LC5(%rip), %zmm2
-   vmovdqa64   %zmm0, %zmm1
-   vmovdqa64   .LC0(%rip), %zmm0
-   vpermi2w    %zmm1, %zmm1, %zmm2
-   vpermi2w    %zmm1, %zmm1, %zmm0
-   vpshufb .LC6(%rip), %zmm0, %zmm0
-   vpshufb .LC7(%rip), %zmm2, %zmm1
-   vporq   %zmm1, %zmm0, %zmm0
+   vpmovwb %zmm0, %ymm1
+   vinserti64x4    $0x0, %ymm1, %zmm0, %zmm0
So, I wonder if your new routine shouldn't instead be done
in ix86_expand_vec_perm_const_1 after vec_perm_1, among the other 2 insn cases,
and the other vpmovdw etc. cases handled in combine splitters (see that we
only use the low half or quarter of the result and transform whatever
permutation we've used into what we want).

And perhaps eventually make the routine more general: don't handle
just the identity permutation in the upper half, but allow other
permutations there too (ones where that half can be represented by a
single-insn permutation).
> 
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?
> 
> gcc/ChangeLog:
> 
>   PR target/101846
>   * config/i386/i386-expand.c (expand_vec_perm_trunc_vinsert):
>   New function.
>   (ix86_vectorize_vec_perm_const): Call
>   expand_vec_perm_trunc_vinsert.
>   * config/i386/sse.md (vec_set_lo_v32hi): New define_insn.
>   (vec_set_lo_v64qi): Ditto.
>   (vec_set_lo_): Extend to no-avx512dq.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/101846
>   * gcc.target/i386/pr101846-2.c: New test.
>   * gcc.target/i386/pr101846-3.c: New test.
> ---
>  gcc/config/i386/i386-expand.c  | 125 +
>  gcc/config/i386/sse.md |  60 +-
>  gcc/testsuite/gcc.target/i386/pr101846-2.c |  81 +
>  gcc/testsuite/gcc.target/i386/pr101846-3.c |  95 
>  4 files changed, 359 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101846-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101846-3.c
> 
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index bd21efa9530..519caac2e15 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -18317,6 +18317,126 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
>return false;
>  }
>  
> +/* A subroutine of ix86_expand_vec_perm_const_1.  Try to implement D
> +   in terms of a pair of vpmovdw + vinserti128 instructions.  */
> +static bool
> +expand_vec_perm_trunc_vinsert (struct expand_vec_perm_d *d)
> +{
> +  unsigned i, nelt = d->nelt, mask = d->nelt - 1;
> +  unsigned half = nelt / 2;
> +  machine_mode half_mode, trunc_mode;
> +
> +  /* vpmov{wb,dw,qd} only available under AVX512.  */
> +  if (!d->one_operand_p || !TARGET_AVX512F
> +  || (!TARGET_AVX512VL  && GET_MODE_SIZE (d->vmode) < 64)

Too many spaces.
> +  || GET_MODE_SIZE (GET_MODE_INNER (d->vmode)) > 4)
> +return false;
> +
> +  /* TARGET_AVX512BW is needed for vpmovwb.  */
> +  if (GET_MODE_INNER (d->vmode) == E_QImode && !TARGET_AVX512BW)
> +return false;
> +
> +  for (i = 0; i < nelt; i++)
> +{
> +  unsigned idx = d->perm[i] & mask;
> +  if (idx != i * 2 && i < half)
> + return false;
> +  if (idx != i && i >= half)
> + return false;
> +}
> +
> +  rtx (*gen_trunc) (rtx, rtx) = NULL;
> +  rtx (*gen_vec_set_lo) (rtx, rtx, rtx) = NULL;
> +  switch (d->vmode)
> +{
> +case E_V16QImode:
>

Re: [PATCH] [i386] Optimize vec_perm_expr to match vpmov{dw,qd,wb}.

2021-08-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 12, 2021 at 11:22:48AM +0200, Jakub Jelinek via Gcc-patches wrote:
> So, I wonder if your new routine shouldn't be instead done after
> in ix86_expand_vec_perm_const_1 after vec_perm_1 among other 2 insn cases
> and handle the other vpmovdw etc. cases in combine splitters (see that we
> only use low half or quarter of the result and transform whatever
> permutation we've used into what we want).

E.g. in the first function, combine tries:
(set (reg:V16HI 85)
(vec_select:V16HI (unspec:V32HI [
(mem/u/c:V32HI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S64 A512])
(reg:V32HI 88) repeated x2
] UNSPEC_VPERMT2)
(parallel [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 4 [0x4])
(const_int 5 [0x5])
(const_int 6 [0x6])
(const_int 7 [0x7])
(const_int 8 [0x8])
(const_int 9 [0x9])
(const_int 10 [0xa])
(const_int 11 [0xb])
(const_int 12 [0xc])
(const_int 13 [0xd])
(const_int 14 [0xe])
(const_int 15 [0xf])
])))
A combine splitter could run avoid_constant_pool_reference on the
first UNSPEC_VPERMT2 argument and check whether the permutation can be
optimized, ideally using some function call so that we wouldn't need too
many splitters.

Jakub



[PATCH] Improved handling of MINUS_EXPR in bit CCP.

2021-08-12 Thread Roger Sayle

This patch improves the bit bounds for MINUS_EXPR during tree-ssa's
conditional constant propagation (CCP) pass (and as an added bonus
adds support for POINTER_DIFF_EXPR).

The pessimistic assumptions made by the current algorithm are
demonstrated by considering 1 - (x&1).  Intuitively this should
have possible values 0 and 1, and therefore an unknown mask of 1.
Alas by treating subtraction as a negation followed by addition,
the second operand first becomes 0 or -1, with an unknown mask
of all ones, which results in the addition containing no known bits.

Improved bounds are achieved by using the same approach used for
PLUS_EXPR, determining the result with the minimum number of borrows,
the result from the maximum number of borrows, and examining the bits
they have in common.  One additional benefit of this approach
is that it is applicable to POINTER_DIFF_EXPR, where previously the
negation of a pointer didn't/doesn't make sense.

A more convincing example, where a transformation missed by .032t.ccp
isn't caught a few passes later by .038t.evrp, is the expression
(7 - (x&5)) & 2, which (in the new test case) currently survives the
tree-level optimizers but with this patch is now simplified to the
constant value 2.
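
Both examples can be worked through with a small standalone model of the value/mask lattice. This is a sketch only: it uses 32-bit unsigned integers instead of widest_int, omits the wi::ext sign handling, and the names BitValue, bit_minus and bit_and are illustrative. The subtraction follows the lo/hi formulation in the patch below; the AND rule is the usual "unknown only where both operands may be 1" rule.

```cpp
#include <cassert>
#include <cstdint>

// Lattice value: bits set in `mask` are unknown; all other bits equal `val`.
struct BitValue {
    std::uint32_t val;
    std::uint32_t mask;
};

// Subtraction bounds: `lo` takes the smallest minuend and largest subtrahend,
// `hi` the opposite.  Bits where lo and hi differ, or that were unknown in
// either input, are unknown in the result.
BitValue bit_minus(BitValue a, BitValue b) {
    const std::uint32_t lo = (a.val & ~a.mask) - (b.val | b.mask);
    const std::uint32_t hi = (a.val | a.mask) - (b.val & ~b.mask);
    const std::uint32_t mask = a.mask | b.mask | (lo ^ hi);
    return {lo & ~mask, mask};
}

// Bitwise AND: a bit is unknown only if some input bit is unknown and
// neither input bit is a known zero.
BitValue bit_and(BitValue a, BitValue b) {
    const std::uint32_t mask =
        (a.mask | b.mask) & (a.val | a.mask) & (b.val | b.mask);
    return {a.val & b.val & ~mask, mask};
}

// Work through the two expressions from the message.
void check_minus_examples() {
    // 1 - (x & 1): x & 1 is {val 0, mask 1}.
    const BitValue d = bit_minus({1u, 0u}, {0u, 1u});
    assert(d.val == 0u && d.mask == 1u);  // only bit 0 is unknown

    // (7 - (x & 5)) & 2: x & 5 is {val 0, mask 5}.
    const BitValue q = bit_minus({7u, 0u}, {0u, 5u});
    assert(q.val == 2u && q.mask == 5u);  // bit 1 is known to be set
    const BitValue r = bit_and(q, {2u, 0u});
    assert(r.val == 2u && r.mask == 0u);  // fully known: the constant 2
}
```

For 7 - (x&5), lo is 7 - 5 = 2 and hi is 7 - 0 = 7, so only bits 0 and 2 differ and bit 1 stays known, which is what lets the final AND fold to 2.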

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.

Ok for mainline?

2021-08-12  Roger Sayle  

gcc/ChangeLog
* tree-ssa-ccp.c (bit_value_binop) [MINUS_EXPR]: Use same
algorithm as PLUS_EXPR to improve subtraction bit bounds.
[POINTER_DIFF_EXPR]: Treat as synonymous with MINUS_EXPR.

gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/ssa-ccp-40.c: New test case.


Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 003c9c2..1223370 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -1398,7 +1398,7 @@ bit_value_binop (enum tree_code code, signop sgn, int width,
 widest_int *val, widest_int *mask,
 signop r1type_sgn, int r1type_precision,
 const widest_int &r1val, const widest_int &r1mask,
-signop r2type_sgn, int r2type_precision,
+signop r2type_sgn, int r2type_precision ATTRIBUTE_UNUSED,
 const widest_int &r2val, const widest_int &r2mask)
 {
   bool swap_p = false;
@@ -1445,7 +1445,7 @@ bit_value_binop (enum tree_code code, signop sgn, int width,
}
  else
{
- if (wi::neg_p (shift))
+ if (wi::neg_p (shift, r2type_sgn))
{
  shift = -shift;
  if (code == RROTATE_EXPR)
@@ -1482,7 +1482,7 @@ bit_value_binop (enum tree_code code, signop sgn, int width,
}
  else
{
- if (wi::neg_p (shift))
+ if (wi::neg_p (shift, r2type_sgn))
break;
  if (code == RSHIFT_EXPR)
{
@@ -1522,13 +1522,16 @@ bit_value_binop (enum tree_code code, signop sgn, int 
width,
   }
 
 case MINUS_EXPR:
+case POINTER_DIFF_EXPR:
   {
-   widest_int temv, temm;
-   bit_value_unop (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm,
- r2type_sgn, r2type_precision, r2val, r2mask);
-   bit_value_binop (PLUS_EXPR, sgn, width, val, mask,
-r1type_sgn, r1type_precision, r1val, r1mask,
-r2type_sgn, r2type_precision, temv, temm);
+   /* Subtraction is derived from the addition algorithm above.  */
+   widest_int lo = wi::bit_and_not (r1val, r1mask) - (r2val | r2mask);
+   lo = wi::ext (lo, width, sgn);
+   widest_int hi = (r1val | r1mask) - wi::bit_and_not (r2val, r2mask);
+   hi = wi::ext (hi, width, sgn);
+   *mask = r1mask | r2mask | (lo ^ hi);
+   *mask = wi::ext (*mask, width, sgn);
+   *val = lo;
break;
   }
 
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

int foo(int x)
{
  int p = 7;
  int q = p - (x & 5);
  return q & 2;
}

/* { dg-final { scan-tree-dump "return 2;" "optimized" } } */


[PATCH v4] gcov: Add TARGET_GCOV_TYPE_SIZE target hook

2021-08-12 Thread Sebastian Huber
If -fprofile-update=atomic is used, then the target must provide atomic
operations for the counters of the type returned by get_gcov_type().
This is a 64-bit type for targets which have a 64-bit long long type.
On 32-bit targets this could be an issue since they may not provide
64-bit atomic operations.  Allow targets to override the default type
size with the new TARGET_GCOV_TYPE_SIZE target hook.

If a 32-bit gcov type size is used, then there is currently a warning in
libgcov-driver.c in a dead code block due to
sizeof (counter) == sizeof (gcov_unsigned_t):

libgcc/libgcov-driver.c: In function 'dump_counter':
libgcc/libgcov-driver.c:401:46: warning: right shift count >= width of type 
[-Wshift-count-overflow]
  401 | dump_unsigned ((gcov_unsigned_t)(counter >> 32), dump_fn, arg);
  |  ^~
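
As an illustration only (this is not the libgcov code), the following sketch shows one way a counter type can be selected from a size macro while keeping the 32-bit shift out of builds where the counter is only 32 bits wide.  DEMO_GCOV_TYPE_SIZE, demo_gcov_type and demo_dump_counter are hypothetical names standing in for the real ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size macro provided by the compiler;
   32 is assumed here to show the narrow case.  */
#ifndef DEMO_GCOV_TYPE_SIZE
#define DEMO_GCOV_TYPE_SIZE 32
#endif

#if DEMO_GCOV_TYPE_SIZE > 32
typedef int64_t demo_gcov_type;
#else
typedef int32_t demo_gcov_type;
#endif

/* Split a counter into a low and a high 32-bit word.  The shift is
   only compiled when the counter is wider than 32 bits, so the
   compiler never sees a shift count >= the width of the type.  */
void
demo_dump_counter (demo_gcov_type counter, uint32_t out[2])
{
  out[0] = (uint32_t) counter;
#if DEMO_GCOV_TYPE_SIZE > 32
  out[1] = (uint32_t) ((uint64_t) counter >> 32);
#else
  out[1] = 0;
#endif
}
```

A preprocessor guard is used instead of a sizeof comparison because the dead branch of an ordinary if is still compiled and can therefore still be diagnosed.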

gcc/

* c-family/c-cppbuiltin.c (c_cpp_builtins): Define
__LIBGCC_GCOV_TYPE_SIZE if flag_building_libgcc is true.
* config/sparc/rtemself.h (SPARC_GCOV_TYPE_SIZE): Define.
* config/sparc/sparc.c (sparc_gcov_type_size): New.
(TARGET_GCOV_TYPE_SIZE): Redefine if SPARC_GCOV_TYPE_SIZE is defined.
* coverage.c (get_gcov_type): Use targetm.gcov_type_size().
* doc/tm.texi (TARGET_GCOV_TYPE_SIZE): Add hook under "Misc".
* doc/tm.texi.in: Regenerate.
* target.def (gcov_type_size): New target hook.
* targhooks.c (default_gcov_type_size): New.
* targhooks.h (default_gcov_type_size): Declare.
* tree-profile.c (gimple_gen_edge_profiler): Use precision of
gcov_type_node.
(gimple_gen_time_profiler): Likewise.

libgcc/

* libgcov.h (gcov_type): Define using __LIBGCC_GCOV_TYPE_SIZE.
(gcov_type_unsigned): Likewise.
---
 gcc/c-family/c-cppbuiltin.c |  2 ++
 gcc/config/sparc/rtemself.h |  2 ++
 gcc/config/sparc/sparc.c| 11 +++
 gcc/coverage.c  |  2 +-
 gcc/doc/tm.texi | 11 +++
 gcc/doc/tm.texi.in  |  2 ++
 gcc/target.def  | 12 
 gcc/targhooks.c |  7 +++
 gcc/targhooks.h |  2 ++
 gcc/tree-profile.c  |  4 ++--
 libgcc/libgcov.h|  6 +++---
 11 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10f..3fa62bc4fe76 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1450,6 +1450,8 @@ c_cpp_builtins (cpp_reader *pfile)
   /* For libgcov.  */
   builtin_define_with_int_value ("__LIBGCC_VTABLE_USES_DESCRIPTORS__",
 TARGET_VTABLE_USES_DESCRIPTORS);
+  builtin_define_with_int_value ("__LIBGCC_GCOV_TYPE_SIZE",
+targetm.gcov_type_size());
 }
 
   /* For use in assembly language.  */
diff --git a/gcc/config/sparc/rtemself.h b/gcc/config/sparc/rtemself.h
index fa972af640cc..d64ce9012daf 100644
--- a/gcc/config/sparc/rtemself.h
+++ b/gcc/config/sparc/rtemself.h
@@ -40,3 +40,5 @@
 
 /* Use the default */
 #undef LINK_GCC_C_SEQUENCE_SPEC
+
+#define SPARC_GCOV_TYPE_SIZE 32
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 04fc80f0ee62..06f41d7bb53f 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -944,6 +944,17 @@ char sparc_hard_reg_printed[8];
 #undef TARGET_ZERO_CALL_USED_REGS
 #define TARGET_ZERO_CALL_USED_REGS sparc_zero_call_used_regs
 
+#ifdef SPARC_GCOV_TYPE_SIZE
+static HOST_WIDE_INT
+sparc_gcov_type_size (void)
+{
+  return SPARC_GCOV_TYPE_SIZE;
+}
+
+#undef TARGET_GCOV_TYPE_SIZE
+#define TARGET_GCOV_TYPE_SIZE sparc_gcov_type_size
+#endif
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Return the memory reference contained in X if any, zero otherwise.  */
diff --git a/gcc/coverage.c b/gcc/coverage.c
index ac9a9fdad228..10d7f8366cb5 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -146,7 +146,7 @@ tree
 get_gcov_type (void)
 {
   scalar_int_mode mode
-= smallest_int_mode_for_size (LONG_LONG_TYPE_SIZE > 32 ? 64 : 32);
+= smallest_int_mode_for_size (targetm.gcov_type_size ());
   return lang_hooks.types.type_for_mode (mode, false);
 }
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a30fdcbbf3d6..f68f42638a11 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12588,3 +12588,14 @@ Return an RTX representing @var{tagged_pointer} with 
its tag set to zero.
 Store the result in @var{target} if convenient.
 The default clears the top byte of the original pointer.
 @end deftypefn
+
+@deftypefn {Target Hook} HOST_WIDE_INT TARGET_GCOV_TYPE_SIZE (void)
+Returns the gcov type size in bits.  This type is used for example for
+counters incremented by profiling and code-coverage events.  The default
+value is 64, if the type size of long long is greater than 32, otherwise the
+default value is 32.  A 64-bit type is recommended to avoid overflows of the
+counters.  If the @option{-fprofile-update=a

[Patch] OpenMP 5.1: Add proc-bind 'primary' support

2021-08-12 Thread Tobias Burnus

The attached patch adds another (very) low-hanging fruit of OpenMP 5.1
to GCC, given that we already have one OpenMP 5.1 feature and another
one also related to 'master'/'masked' construct might be added soon.

OK?

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP 5.1: Add proc-bind 'primary' support

In OpenMP 5.1 "master thread" was changed to "primary thread" and
the proc_bind clause and the OMP_PROC_BIND environment variable
now take 'primary' as argument as alias for 'master', while the
latter is deprecated.
This commit accepts 'primary' and adds the named constant
omp_proc_bind_primary and changes 'master thread' in the
documentation; however, given that not even OpenMP 5.0 is
fully supported, omp_display_env and the dumps currently
still output 'master' and there is no deprecation warning
when using the 'master' in the proc_bind clause.

gcc/c/ChangeLog:

	* c-parser.c (c_parser_omp_clause_proc_bind): Accept
	'primary' as alias for 'master'.

gcc/cp/ChangeLog:

	* parser.c (cp_parser_omp_clause_proc_bind): Accept
	'primary' as alias for 'master'.

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.
	* intrinsic.texi (OMP_LIB): Mention OpenMP 5.1; add
	omp_proc_bind_primary.
	* openmp.c (gfc_match_omp_clauses): Accept
	'primary' as alias for 'master'.

gcc/ChangeLog:

	* tree-pretty-print.c (dump_omp_clause): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.

libgomp/ChangeLog:

	* env.c (parse_bind_var): Accept 'primary' as alias for
	'master'.
	(omp_display_env): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.
	* libgomp.texi: Change 'master thread' to 'primary thread'
	in line with OpenMP 5.1.
	(omp_get_proc_bind): Add omp_proc_bind_primary and note that
	omp_proc_bind_master is an alias of it.
	(OMP_PROC_BIND): Mention 'PRIMARY'.
	* omp.h.in (__GOMP_DEPRECATED_5_1): Define.
	(omp_proc_bind_primary): Add.
	(omp_proc_bind_master): Deprecate for OpenMP 5.1.
	* omp_lib.f90.in (omp_proc_bind_primary): Add.
	(omp_proc_bind_master): Deprecate for OpenMP 5.1.
	* omp_lib.h.in (omp_proc_bind_primary): Add.
	* testsuite/libgomp.c/affinity-1.c: Check that
	'primary' works and is identical to 'master'.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/pr61486-2.c: Duplicate one proc_bind(master)
	testcase and test proc_bind(primary) instead.
	* gfortran.dg/gomp/affinity-1.f90: Likewise.

 gcc/c/c-parser.c  |  7 --
 gcc/cp/parser.c   |  7 --
 gcc/fortran/dump-parse-tree.c |  1 +
 gcc/fortran/intrinsic.texi|  6 +++--
 gcc/fortran/openmp.c  |  5 -
 gcc/testsuite/c-c++-common/gomp/pr61486-2.c   | 13 +++
 gcc/testsuite/gfortran.dg/gomp/affinity-1.f90 |  9 
 gcc/tree-pretty-print.c   |  1 +
 libgomp/env.c | 13 ++-
 libgomp/libgomp.texi  | 32 ++-
 libgomp/omp.h.in  | 10 -
 libgomp/omp_lib.f90.in|  6 +
 libgomp/omp_lib.h.in  |  1 +
 libgomp/testsuite/libgomp.c/affinity-1.c  | 14 
 14 files changed, 92 insertions(+), 33 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index d24bfdb6719..53f8617ddaa 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -15959,7 +15959,8 @@ c_parser_omp_clause_dist_schedule (c_parser *parser, tree list)
proc_bind ( proc-bind-kind )
 
proc-bind-kind:
- master | close | spread  */
+ primary | master | close | spread
+   where OpenMP 5.1 added 'primary' and deprecated the alias 'master'.  */
 
 static tree
 c_parser_omp_clause_proc_bind (c_parser *parser, tree list)
@@ -15975,7 +15976,9 @@ c_parser_omp_clause_proc_bind (c_parser *parser, tree list)
   if (c_parser_next_token_is (parser, CPP_NAME))
 {
   const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
-  if (strcmp ("master", p) == 0)
+  if (strcmp ("primary", p) == 0)
+	kind = OMP_CLAUSE_PROC_BIND_MASTER;
+  else if (strcmp ("master", p) == 0)
 	kind = OMP_CLAUSE_PROC_BIND_MASTER;
   else if (strcmp ("close", p) == 0)
 	kind = OMP_CLAUSE_PROC_BIND_CLOSE;
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 74de52992bc..0a5ed95d37f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -39020,7 +39020,8 @@ cp_parser_omp_clause_dist_schedule (cp_parser *parser, tree list,
proc_bind ( proc-bind-kind )
 
proc-bind-kind:
- master | close | spread  */
+ primary | master | close | spread
+   where OpenMP 5.1 added 'primary' and d

Re: [PATCH v4] gcov: Add TARGET_GCOV_TYPE_SIZE target hook

2021-08-12 Thread Martin Liška

Hello.

One small nit: when you send a new patch version, can you please describe
what has changed since the previous one?

Cheers,
Martin


Re: [Patch] OpenMP 5.1: Add proc-bind 'primary' support

2021-08-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 12, 2021 at 12:52:17PM +0200, Tobias Burnus wrote:
> gcc/c/ChangeLog:
> 
>   * c-parser.c (c_parser_omp_clause_proc_bind): Accept
>   'primary' as alias for 'master'.
> 
> gcc/cp/ChangeLog:
> 
>   * parser.c (cp_parser_omp_clause_proc_bind): Accept
>   'primary' as alias for 'master'.
> 
> gcc/fortran/ChangeLog:
> 
>   * dump-parse-tree.c (show_omp_clauses): Add TODO comment to
>   change 'master' to 'primary' in proc_bind for OpenMP 5.1.
>   * intrinsic.texi (OMP_LIB): Mention OpenMP 5.1; add
>   omp_proc_bind_primary.
>   * openmp.c (gfc_match_omp_clauses): Accept
>   'primary' as alias for 'master'.
> 
> gcc/ChangeLog:
> 
>   * tree-pretty-print.c (dump_omp_clause): Add TODO comment to
>   change 'master' to 'primary' in proc_bind for OpenMP 5.1.
> 
> libgomp/ChangeLog:
> 
>   * env.c (parse_bind_var): Accept 'primary' as alias for
>   'master'.
>   (omp_display_env): Add TODO comment to
>   change 'master' to 'primary' in proc_bind for OpenMP 5.1.
>   * libgomp.texi: Change 'master thread' to 'primary thread'
>   in line with OpenMP 5.1.
>   (omp_get_proc_bind): Add omp_proc_bind_primary and note that
>   omp_proc_bind_master is an alias of it.
>   (OMP_PROC_BIND): Mention 'PRIMARY'.
>   * omp.h.in (__GOMP_DEPRECATED_5_1): Define.
>   (omp_proc_bind_primary): Add.
>   (omp_proc_bind_master): Deprecate for OpenMP 5.1.
>   * omp_lib.f90.in (omp_proc_bind_primary): Add.
>   (omp_proc_bind_master): Deprecate for OpenMP 5.1.
>   * omp_lib.h.in (omp_proc_bind_primary): Add.
>   * testsuite/libgomp.c/affinity-1.c: Check that
>   'primary' works and is identical to 'master'.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/gomp/pr61486-2.c: Duplicate one proc_bind(master)
>   testcase and test proc_bind(primary) instead.
>   * gfortran.dg/gomp/affinity-1.f90: Likewise.

LGTM, some nits below.

> @@ -15975,7 +15976,9 @@ c_parser_omp_clause_proc_bind (c_parser *parser, tree 
> list)
>if (c_parser_next_token_is (parser, CPP_NAME))
>  {
>const char *p = IDENTIFIER_POINTER (c_parser_peek_token 
> (parser)->value);
> -  if (strcmp ("master", p) == 0)
> +  if (strcmp ("primary", p) == 0)
> + kind = OMP_CLAUSE_PROC_BIND_MASTER;
> +  else if (strcmp ("master", p) == 0)

Maybe in tree-core.h do:
-  OMP_CLAUSE_PROC_BIND_MASTER = 2,
+  OMP_CLAUSE_PROC_BIND_PRIMARY = 2,
+  OMP_CLAUSE_PROC_BIND_MASTER = OMP_CLAUSE_PROC_BIND_PRIMARY,
and use OMP_CLAUSE_PROC_BIND_PRIMARY for the "primary" cases?
I'd keep tree-pretty-print.c as is though for now (so print master).
And in omp-expand we actually just count on those enumerators
matching the omp.h omp_proc_bind_* ones.
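
A minimal sketch of that idea, with hypothetical DEMO_* names rather than the real tree-core.h enumerators.  The values 0, 1 and 2 are the ones visible in the omp_lib.f90.in hunk; 3 and 4 for close and spread are my assumption about omp.h.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the proc_bind kinds.  'master' is kept as an
   alias so that both spellings share one numeric value.  */
enum demo_proc_bind_kind
{
  DEMO_PROC_BIND_INVALID = -1,
  DEMO_PROC_BIND_FALSE = 0,
  DEMO_PROC_BIND_TRUE = 1,
  DEMO_PROC_BIND_PRIMARY = 2,
  DEMO_PROC_BIND_MASTER = DEMO_PROC_BIND_PRIMARY, /* deprecated alias */
  DEMO_PROC_BIND_CLOSE = 3,   /* assumed value */
  DEMO_PROC_BIND_SPREAD = 4   /* assumed value */
};

/* Map the identifier inside proc_bind ( ... ) to its kind.  */
enum demo_proc_bind_kind
demo_parse_proc_bind (const char *p)
{
  if (strcmp ("primary", p) == 0)
    return DEMO_PROC_BIND_PRIMARY;
  else if (strcmp ("master", p) == 0)
    return DEMO_PROC_BIND_MASTER;
  else if (strcmp ("close", p) == 0)
    return DEMO_PROC_BIND_CLOSE;
  else if (strcmp ("spread", p) == 0)
    return DEMO_PROC_BIND_SPREAD;
  return DEMO_PROC_BIND_INVALID;
}
```

Because the alias has the same value, code that only compares against the numeric kind does not need to know which spelling the user wrote.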

> @@ -39037,7 +39038,9 @@ cp_parser_omp_clause_proc_bind (cp_parser *parser, 
> tree list,
>tree id = cp_lexer_peek_token (parser->lexer)->u.value;
>const char *p = IDENTIFIER_POINTER (id);
>  
> -  if (strcmp ("master", p) == 0)
> +  if (strcmp ("primary", p) == 0)
> + kind = OMP_CLAUSE_PROC_BIND_MASTER;
> +  else if (strcmp ("master", p) == 0)

Ditto.

> -   if (gfc_match ("proc_bind ( master )") == MATCH_YES)
> +   /* Primary is new and master is deprecated in OpenMP 5.1.  */
> +   if (gfc_match ("proc_bind ( primary )") == MATCH_YES)
> + c->proc_bind = OMP_PROC_BIND_MASTER;
> +   else if (gfc_match ("proc_bind ( master )") == MATCH_YES)

Maybe here too with gfortran.h change?

> --- a/libgomp/omp_lib.f90.in
> +++ b/libgomp/omp_lib.f90.in
> @@ -48,6 +48,8 @@
>   parameter :: omp_proc_bind_false = 0
>  integer (omp_proc_bind_kind), &
>   parameter :: omp_proc_bind_true = 1
> +integer (omp_proc_bind_kind), &
> + parameter :: omp_proc_bind_primary = 2
>  integer (omp_proc_bind_kind), &
>   parameter :: omp_proc_bind_master = 2
>  integer (omp_proc_bind_kind), &
> @@ -670,6 +672,10 @@
>  
>  #if _OPENMP >= 201811
>  !GCC$ ATTRIBUTES DEPRECATED :: omp_get_nested, omp_set_nested
> +#endif
> +
> +#if _OPENMP >= 202011
> +!GCC$ ATTRIBUTES DEPRECATED :: omp_proc_bind_master
>  #endif

I must say I have no idea how this will work, but for omp_*_nested
it is like that already.  I think the file is *.f90, not *.F90 and
so it isn't preprocessed (and is it compiled with -fopenmp at all)?
But let's deal with it incrementally.

Jakub



Re: [ARM] PR66791: Replace builtins for vdup_n and vmov_n intrinsics

2021-08-12 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 11 Aug 2021 at 22:23, Christophe Lyon
 wrote:
>
>
>
> On Thu, Jun 24, 2021 at 6:29 PM Kyrylo Tkachov via Gcc-patches 
>  wrote:
>>
>>
>>
>> > -Original Message-
>> > From: Prathamesh Kulkarni 
>> > Sent: 24 June 2021 12:11
>> > To: gcc Patches ; Kyrylo Tkachov
>> > 
>> > Subject: [ARM] PR66791: Replace builtins for vdup_n and vmov_n intrinsics
>> >
>> > Hi,
>> > This patch replaces builtins for vdup_n and vmov_n.
>> > The patch results in regression for pr51534.c.
>> > Consider following function:
>> >
>> > uint8x8_t f1 (uint8x8_t a) {
>> >   return vcgt_u8(a, vdup_n_u8(0));
>> > }
>> >
>> > code-gen before patch:
>> > f1:
>> > vmov.i32  d16, #0  @ v8qi
>> > vcgt.u8 d0, d0, d16
>> > bx lr
>> >
>> > code-gen after patch:
>> > f1:
>> > vceq.i8 d0, d0, #0
>> > vmvn    d0, d0
>> > bx lr
>> >
>> > I am not sure which one is better tho ?
>>
>
> Hi Prathamesh,
>
> This patch introduces a regression on non-hardfp configs (eg 
> arm-linux-gnueabi or arm-eabi):
> FAIL:  gcc:gcc.target/arm/arm.exp=gcc.target/arm/pr51534.c 
> scan-assembler-times vmov.i32[ \t]+[dD][0-9]+, #0x 3
> FAIL:  gcc:gcc.target/arm/arm.exp=gcc.target/arm/pr51534.c 
> scan-assembler-times vmov.i32[ \t]+[qQ][0-9]+, #4294967295 3
>
> Can you fix this?
The issue is, for following test:

#include 

uint8x8_t f1 (uint8x8_t a) {
  return vcge_u8(a, vdup_n_u8(0));
}

armhf code-gen:
f1:
vmov.i32  d0, #0x  @ v8qi
bx  lr

arm softfp code-gen:
f1:
mov r0, #-1
mov r1, #-1
bx  lr

The code-gen for both is the same up to the split2 pass:

(insn 10 6 11 2 (set (reg/i:V8QI 16 s0)
(const_vector:V8QI [
(const_int -1 [0x]) repeated x8
])) "foo.c":5:1 1052 {*neon_movv8qi}
 (expr_list:REG_EQUAL (const_vector:V8QI [
(const_int -1 [0x]) repeated x8
])
(nil)))
(insn 11 10 13 2 (use (reg/i:V8QI 16 s0)) "foo.c":5:1 -1
 (nil))

and for softfp target, split2 pass splits the assignment to r0 and r1:

(insn 15 6 16 2 (set (reg:SI 0 r0)
(const_int -1 [0x])) "foo.c":5:1 740 {*thumb2_movsi_vfp}
 (nil))
(insn 16 15 11 2 (set (reg:SI 1 r1 [+4 ])
(const_int -1 [0x])) "foo.c":5:1 740 {*thumb2_movsi_vfp}
 (nil))
(insn 11 16 13 2 (use (reg/i:V8QI 0 r0)) "foo.c":5:1 -1
 (nil))

I suppose we could use a dg-scan for r[0-9]+, #-1 for softfp targets ?

Thanks,
Prathamesh
>
> Thanks
>
> Christophe
>
>
>>
>> I think they're equivalent in practice, in any case the patch itself is good 
>> (move away from RTL builtins).
>> Ok.
>> Thanks,
>> Kyrill
>>
>> >
>> > Also, this patch regressed bf16_dup.c on arm-linux-gnueabi,
>> > which is due to a missed opt in lowering. I had filed it as
>> > PR98435, and posted a fix for it here:
>> > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572648.html
>> >
>> > Thanks,
>> > Prathamesh


Re: gfortran.dg/PR82376.f90: Avoid matching a file-path.

2021-08-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: Bernhard Reutner-Fischer 
> Date: Thu, 12 Aug 2021 09:03:50 +0200

> On Thu, 12 Aug 2021 00:09:21 +0200
> Hans-Peter Nilsson via Fortran  wrote:
> 
> > I had a file-path to sources with the substring "new" in it,
> > and (only) this test regressed compared to results from
> > another build without "new" in the name.
> > 
> > The test does
> >  ! { dg-final { scan-tree-dump-times "new" 4 "original" } }
> > i.e. the contents of the tree-dump-file .original needs to match
> > the undelimited string "new" exactly four times.  Very brittle.
> > 
> > In the dump-file, there are three lines with calls to new:
> >  D.908 = new ((integer(kind=4) *) data);
> >  integer(kind=4) * new (integer(kind=4) & data)
> >static integer(kind=4) * new (integer(kind=4) &);
> > 
> > But, there's also a line, which for me and cris-elf looked like:
> >  _gfortran_runtime_error_at (&"At line 46 of file
> >   /X/xyzzynewfrob/gcc/testsuite/gfortran.dg/PR82376.f90"[1]{lb: 1 sz: 1},
> >   &"Pointer actual argument \'new\' is not associated"[1]{lb: 1 sz: 1});
> > The fourth match is obviously intended to match this line, but only
> > with *one* match, whereas the path can as above yield another hit.
> > 
> > With Tcl, the regexp for matching the " " *and* the "'"
> > *and* the "\" gets a bit unsightly, so I suggest just
> > matching the "new" calls, which according to the comment in
> > the test is the key point.  You can't have a file-path with
> > spaces and parentheses in a gcc build.  I'm also making use
> > of {} rather than "" needing one level of quoting; the "\("
> > is needed because the matched string is a regexp.
> > 
> > Ok to commit?
> 
> A wordmatch would be \mnew\M but i agree that counting calls by
> {\mnew (} is fine too.

Not really; I guess I should have mentioned that I briefly
considered word-delimiters, but they'd also match a subdirectory
named "new", which is not unexpected in gcc builds.  Matching
something that can't be in a file-path (of a gcc build) is
the only way to be sure (that I can think of).
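
To make the counts concrete, here is a small sketch that counts plain substring hits in the dump lines quoted above.  It is a simplification: scan-tree-dump-times matches a Tcl regexp, whereas this uses strstr, and demo_count and demo_dump are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in text.  */
size_t
demo_count (const char *text, const char *needle)
{
  size_t n = 0;
  size_t len = strlen (needle);
  if (len == 0)
    return 0;
  for (const char *p = strstr (text, needle); p != NULL;
       p = strstr (p + len, needle))
    n++;
  return n;
}

/* The four dump lines from the report, with the path that contains
   "new" as part of a directory name.  */
const char demo_dump[] =
  "D.908 = new ((integer(kind=4) *) data);\n"
  "integer(kind=4) * new (integer(kind=4) & data)\n"
  "static integer(kind=4) * new (integer(kind=4) &);\n"
  "_gfortran_runtime_error_at (&\"At line 46 of file "
  "/X/xyzzynewfrob/gcc/testsuite/gfortran.dg/PR82376.f90\"[1]{lb: 1 sz: 1}, "
  "&\"Pointer actual argument \\'new\\' is not associated\"[1]{lb: 1 sz: 1});\n";
```

On this text the bare string "new" is found five times (three calls, the path, and the error message), while " new (" is found only three times, since neither the path nor the quoted argument name is followed by a space and a parenthesis.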

> I'd call it obvious, so i dare to approve it.
> OK.
> thanks!

Thanks, but not coming from a testsuite or fortran
maintainer I'm not sure I can actually rely on that.

OTOH, damn the torpedoes.  Committed.

> > 
> > testsuite:
> > * gfortran.dg/PR82376.f90: Robustify match.
> > ---
> >  gcc/testsuite/gfortran.dg/PR82376.f90 | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/gcc/testsuite/gfortran.dg/PR82376.f90 
> > b/gcc/testsuite/gfortran.dg/PR82376.f90
> > index 07143ab7e82e..b99779ce9d8a 100644
> > --- a/gcc/testsuite/gfortran.dg/PR82376.f90
> > +++ b/gcc/testsuite/gfortran.dg/PR82376.f90
> > @@ -2,7 +2,8 @@
> >  ! { dg-options "-fdump-tree-original -fcheck=pointer" }
> >  !
> >  ! Test the fix for PR82376. The pointer check was doubling up the call
> > -! to new. The fix reduces the count of 'new' from 5 to 4.
> > +! to new. The fix reduces the count of 'new' from 5 to 4, or to 3, when
> > +! counting only calls.
> >  !
> >  ! Contributed by José Rui Faustino de Sousa  
> >  !
> > @@ -56,4 +57,4 @@ contains
> >end subroutine set
> >  
> >  end program main_p
> > -! { dg-final { scan-tree-dump-times "new" 4 "original" } }
> > +! { dg-final { scan-tree-dump-times { new \(} 3 "original" } }
> 


Re: [PATCH v4] gcov: Add TARGET_GCOV_TYPE_SIZE target hook

2021-08-12 Thread Sebastian Huber

On 12/08/2021 13:19, Martin Liška wrote:
> One small nit: when you send a new patch version, can you please describe
> what has changed since the previous one?


Yes, sorry.

In v4 I changed the target macro to a target hook. Also the 
TARGET_GCOV_TYPE_SIZE is now redefined in sparc.c using an 
architecture-specific define (SPARC_GCOV_TYPE_SIZE).


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/


[committed] arc: Small data doesn't need fcommon option

2021-08-12 Thread Claudiu Zissulescu via Gcc-patches
The ARC backend is defaulting to -fcommon.  This is no longer needed; remove it.

gcc/
2021-08-12  Claudiu Zissulescu  

* common/config/arc/arc-common.c (arc_option_init_struct): Remove
fno-common reference.
* config/arc/arc.c (arc_override_options): Remove overriding of
flag_no_common.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/common/config/arc/arc-common.c | 4 +---
 gcc/config/arc/arc.c   | 3 ---
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/gcc/common/config/arc/arc-common.c 
b/gcc/common/config/arc/arc-common.c
index 6a119029616..3b36d09997c 100644
--- a/gcc/common/config/arc/arc-common.c
+++ b/gcc/common/config/arc/arc-common.c
@@ -30,10 +30,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "flags.h"
 
 static void
-arc_option_init_struct (struct gcc_options *opts)
+arc_option_init_struct (struct gcc_options *opts ATTRIBUTE_UNUSED)
 {
-  opts->x_flag_no_common = 255; /* Mark as not user-initialized.  */
-
   /* Which cpu we're compiling for (ARC600, ARC601, ARC700, ARCv2).  */
   arc_cpu = PROCESSOR_NONE;
 }
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 69f6ae464e1..92797db96b7 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -1440,9 +1440,6 @@ arc_override_options (void)
   if (flag_pic)
 target_flags |= MASK_NO_SDATA_SET;
 
-  if (flag_no_common == 255)
-flag_no_common = !TARGET_NO_SDATA_SET;
-
   /* Check for small data option */
   if (!global_options_set.x_g_switch_value && !TARGET_NO_SDATA_SET)
 g_switch_value = TARGET_LL64 ? 8 : 4;
-- 
2.31.1



[C PATCH] Evaluate argument of sizeof that are structs of variable size.

2021-08-12 Thread Martin Uecker



Joseph,

here is the patch for c_expr_sizeof_type you had suggested.


Best,
Martin



Evaluate type arguments of sizeof that are structs of variable size [PR101838]

Evaluate type arguments of sizeof for all types of variable size
and not just for VLAs. This fixes PR101838 and some issues related 
to PR29970 where statement expressions need to be evaluated so that
the size is well defined.
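
As background, a minimal sketch of the existing behaviour for plain VLA types, which the patch extends to structs of variable size.  It uses only a standard C99 VLA type name, not the GNU extensions the new test exercises, and demo_sizeof_vla_type is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* The operand of sizeof is a variable length array type whose size
   depends on the size expression, so the size expression (including
   its side effect) has to be evaluated at run time.  */
size_t
demo_sizeof_vla_type (int *n)
{
  return sizeof (int[(*n)++]);
}
```

Called with *n == 3, I would expect this to return 3 * sizeof (int) and to leave *n == 4; with a constant array size the operand would not be evaluated at all.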

2021-08-12  Martin Uecker  

gcc/c/
PR c/101838
PR c/29970
* c-typeck.c (c_expr_sizeof_type): Evaluate
size expressions for structs of variable size.

gcc/testsuite/
PR c/101838
* gcc.dg/vla-stexp-2.c: New test.
   




diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index c5bf3372321..eb5c87dc57a 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -3022,8 +3022,14 @@ c_expr_sizeof_type (location_t loc, struct c_type_name 
*t)
   c_last_sizeof_loc = loc;
   ret.original_code = SIZEOF_EXPR;
   ret.original_type = NULL;
+  if (type == error_mark_node)
+{
+  ret.value = error_mark_node;
+  ret.original_code = ERROR_MARK;
+}
+  else
   if ((type_expr || TREE_CODE (ret.value) == INTEGER_CST)
-  && c_vla_type_p (type))
+  && C_TYPE_VARIABLE_SIZE (type))
 {
   /* If the type is a [*] array, it is a VLA but is represented as
 having a size of zero.  In such a case we must ensure that
diff --git a/gcc/testsuite/gcc.dg/vla-stexp-2.c 
b/gcc/testsuite/gcc.dg/vla-stexp-2.c
new file mode 100644
index 000..4d4a15d34a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vla-stexp-2.c
@@ -0,0 +1,33 @@
+/* PR101838-*/
+/* { dg-do run } */
+/* { dg-options "-Wpedantic -O0" } */
+
+
+int bar0(
+   int (*a)[*],
+   int (*b)[sizeof(*a)]
+);
+
+
+int bar(
+   struct f {  /* { dg-warning "will not be visible outside of 
this definition" } */
+   int a[*]; } v,  /* { dg-warning "variably modified type" } */
+   int (*b)[sizeof(struct f)]  // should not warn about zero size
+);
+
+int foo(void)
+{
+   int n = 0;
+   return sizeof(typeof(*({ n = 10; struct foo {   /* { dg-warning 
"braced-groups" } */
+   int x[n];   /* { dg-warning 
"variably modified type" } */
+   } x; &x; })));
+}
+
+
+int main()
+{
+   if (sizeof(struct foo { int x[10]; }) != foo())
+   __builtin_abort();
+
+   return 0;
+}



Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Bill Schmidt via Gcc-patches

On 8/11/21 9:10 PM, Kewen.Lin wrote:

Hi Bill,

Thanks for your prompt review!

on 2021/8/12 上午12:34, Bill Schmidt wrote:

Hi Kewen,

FWIW, it's easier on reviewers if you include the patch inline instead of as an 
attachment.

On 8/11/21 1:56 AM, Kewen.Lin wrote:

Hi,

This patch adds support for letting the vectorizer replace the
scalar versions of some built-in functions with their
corresponding vector versions when Power10 support is available.

Bootstrapped & regtested on powerpc64le-linux-gnu {P9,P10}
and powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add
support for some built-in functions vectorized on Power10.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/dive-vectorize-1.c: New test.
* gcc.target/powerpc/dive-vectorize-1.h: New test.
* gcc.target/powerpc/dive-vectorize-2.c: New test.
* gcc.target/powerpc/dive-vectorize-2.h: New test.
* gcc.target/powerpc/dive-vectorize-run-1.c: New test.
* gcc.target/powerpc/dive-vectorize-run-2.c: New test.
* gcc.target/powerpc/p10-bifs-vectorize-1.c: New test.
* gcc.target/powerpc/p10-bifs-vectorize-1.h: New test.
* gcc.target/powerpc/p10-bifs-vectorize-run-1.c: New test.

---
  gcc/config/rs6000/rs6000.c| 55 +++
  .../gcc.target/powerpc/dive-vectorize-1.c | 11 
  .../gcc.target/powerpc/dive-vectorize-1.h | 22 
  .../gcc.target/powerpc/dive-vectorize-2.c | 12 
  .../gcc.target/powerpc/dive-vectorize-2.h | 22 
  .../gcc.target/powerpc/dive-vectorize-run-1.c | 52 ++
  .../gcc.target/powerpc/dive-vectorize-run-2.c | 53 ++
  .../gcc.target/powerpc/p10-bifs-vectorize-1.c | 15 +
  .../gcc.target/powerpc/p10-bifs-vectorize-1.h | 40 ++
  .../powerpc/p10-bifs-vectorize-run-1.c| 45 +++
  10 files changed, 327 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/dive-vectorize-1.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/dive-vectorize-1.h
  create mode 100644 gcc/testsuite/gcc.target/powerpc/dive-vectorize-2.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/dive-vectorize-2.h
  create mode 100644 gcc/testsuite/gcc.target/powerpc/dive-vectorize-run-1.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/dive-vectorize-run-2.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-bifs-vectorize-1.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-bifs-vectorize-1.h
  create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-bifs-vectorize-run-1.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 279f00cc648..3eac1d05101 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5785,6 +5785,61 @@ rs6000_builtin_md_vectorized_function (tree fndecl, tree 
type_out,
  default:
break;
  }
+
+  machine_mode in_vmode = TYPE_MODE (type_in);
+  machine_mode out_vmode = TYPE_MODE (type_out);
+
+  /* Power10 supported vectorized built-in functions.  */
+  if (TARGET_POWER10
+  && in_vmode == out_vmode
+  && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode))
+{
+  machine_mode exp_mode = DImode;
+  machine_mode exp_vmode = V2DImode;
+  enum rs6000_builtins vname = RS6000_BUILTIN_COUNT;

Using this as a flag value looks unnecessary.  Is this just being done to 
silence a warning?


Good question!  I didn't check whether there is a warning; I'm just used to
initializing a variable with a suitable value if possible.  If you don't mind,
may I still keep it?  If some future code uses vname on a path where it's not
assigned, one explicitly wrong enum (bif) seems better than a random one.  Or
can this never happen because the current uninitialized-variable detection and
warning scheme is robust, so that we need not worry about it at all?



I don't feel strongly about this either way; up to you and the 
maintainers. :)


Bill




+  switch (fn)
+   {
+   case MISC_BUILTIN_DIVWE:
+   case MISC_BUILTIN_DIVWEU:
+ exp_mode = SImode;
+ exp_vmode = V4SImode;
+ if (fn == MISC_BUILTIN_DIVWE)
+   vname = P10V_BUILTIN_DIVES_V4SI;
+ else
+   vname = P10V_BUILTIN_DIVEU_V4SI;
+ break;
+   case MISC_BUILTIN_DIVDE:
+   case MISC_BUILTIN_DIVDEU:
+ if (fn == MISC_BUILTIN_DIVDE)
+   vname = P10V_BUILTIN_DIVES_V2DI;
+ else
+   vname = P10V_BUILTIN_DIVEU_V2DI;
+ break;
+   case P10_BUILTIN_CFUGED:
+ vname = P10V_BUILTIN_VCFUGED;
+ break;
+   case P10_BUILTIN_CNTLZDM:
+ vname = P10V_BUILTIN_VCLZDM;
+ break;
+   case P10_BUILTIN_CNTTZDM:
+ vname = P10V_BUILTIN_VCTZDM;
+ break;
+   case P10_BUILTIN_PDEPD:
+ vname = P10V_BUILTIN_VPDEPD;
+ b

Re: gfortran.dg/PR82376.f90: Avoid matching a file-path.

2021-08-12 Thread Tobias Burnus

On 12.08.21 14:13, Hans-Peter Nilsson via Fortran wrote:

>> I'd call it obvious, so i dare to approve it.
>> OK.
>> thanks!
>
> Thanks, but not coming from a testsuite or fortran
> maintainer I'm not sure I can actually rely on that.
>
> OTOH, damn the torpedoes.  Committed.


If it helps: A post-commit LGTM from my side.

I think it can also be regarded as obvious.*

Tobias

*Albeit that reminds me of a professor in mathematics who told us that
he tends to very carefully check those places where the (course,
bachelor, master, PhD, ...) student writes: 'one then trivially obtains ...'



Re: [Patch] OpenMP 5.1: Add proc-bind 'primary' support

2021-08-12 Thread Tobias Burnus

I will commit the attached patch in a while – unless last-minute
comments or other issues show up.

On 12.08.21 13:31, Jakub Jelinek wrote:

> LGTM, some nits below.
> Maybe in tree-core.h do:
> -  OMP_CLAUSE_PROC_BIND_MASTER = 2,
> +  OMP_CLAUSE_PROC_BIND_PRIMARY = 2,
> +  OMP_CLAUSE_PROC_BIND_MASTER = OMP_CLAUSE_PROC_BIND_PRIMARY,
> and use OMP_CLAUSE_PROC_BIND_PRIMARY for the "primary" cases?
> I'd keep tree-pretty-print.c as is though for now (so print master).
> And in omp-expand we actually just count on those enumerators
> matching the omp.h omp_proc_bind_* ones.

Did so (same enum value for primary and master) but ...

> -  if (gfc_match ("proc_bind ( master )") == MATCH_YES)
> +  /* Primary is new and master is deprecated in OpenMP 5.1.  */
> +  if (gfc_match ("proc_bind ( primary )") == MATCH_YES)
> +c->proc_bind = OMP_PROC_BIND_MASTER;
> +  else if (gfc_match ("proc_bind ( master )") == MATCH_YES)
>
> Maybe here too with gfortran.h change?

... used a distinct enum value for the gfortran-internal enum.

> --- a/libgomp/omp_lib.f90.in
> ...
> +#if _OPENMP >= 202011
> +!GCC$ ATTRIBUTES DEPRECATED :: omp_proc_bind_master
>   #endif
>
> ... I think the file is *.f90, not *.F90 and so it isn't preprocessed ...


Turned out that the file is compiled with '-cpp -fopenmp -fsyntax-only'.

Tobias

OpenMP 5.1: Add proc-bind 'primary' support

In OpenMP 5.1 "master thread" was changed to "primary thread" and
the proc_bind clause and the OMP_PROC_BIND environment variable
now take 'primary' as argument as alias for 'master', while the
latter is deprecated.
This commit accepts 'primary' and adds the named constant
omp_proc_bind_primary and changes 'master thread' in the
documentation; however, given that not even OpenMP 5.0 is
fully supported, omp_display_env and the dumps currently
still output 'master' and there is no deprecation warning
when using the 'master' in the proc_bind clause.
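
A minimal sketch of the aliasing described above, assuming only what the
discussion states: 'primary' and 'master' share the enumerator value 2, and
both spellings are accepted on input. The names proc_bind_kind and
parse_bind_token are hypothetical and are not the libgomp or front-end code;
unlike the real OMP_PROC_BIND parsing, this sketch is case-sensitive and
handles a single token only.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the omp.h enumerators.  The values follow
// omp_proc_bind_*; 'master' is kept as an alias of 'primary'.
enum proc_bind_kind
{
  bind_false = 0,
  bind_true = 1,
  bind_primary = 2,
  bind_master = bind_primary, // deprecated spelling in OpenMP 5.1
  bind_close = 3,
  bind_spread = 4
};

// Hypothetical parser for one binding token (assumption: exact, lower-case
// spelling).  Stores the kind in *out and returns true on success; returns
// false and leaves *out untouched for an unknown token.
bool parse_bind_token (const char *tok, proc_bind_kind *out)
{
  static const struct { const char *name; proc_bind_kind kind; } table[] = {
    { "false", bind_false },
    { "true", bind_true },
    { "primary", bind_primary },
    { "master", bind_master },
    { "close", bind_close },
    { "spread", bind_spread },
  };
  for (const auto &entry : table)
    if (std::strcmp (tok, entry.name) == 0)
      {
        *out = entry.kind;
        return true;
      }
  return false;
}
```

Because both names map to the same value, code that only compares the
resulting kind cannot tell which spelling was used, which is consistent with
the dumps still printing 'master' for now.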

gcc/c/ChangeLog:

	* c-parser.c (c_parser_omp_clause_proc_bind): Accept
	'primary' as alias for 'master'.

gcc/cp/ChangeLog:

	* parser.c (cp_parser_omp_clause_proc_bind): Accept
	'primary' as alias for 'master'.

gcc/fortran/ChangeLog:

	* gfortran.h (gfc_omp_proc_bind_kind): Add OMP_PROC_BIND_PRIMARY.
	* dump-parse-tree.c (show_omp_clauses): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.
	* intrinsic.texi (OMP_LIB): Mention OpenMP 5.1; add
	omp_proc_bind_primary.
	* openmp.c (gfc_match_omp_clauses): Accept
	'primary' as alias for 'master'.

gcc/ChangeLog:

	* tree-core.h (omp_clause_proc_bind_kind): Add
	OMP_CLAUSE_PROC_BIND_PRIMARY.
	* tree-pretty-print.c (dump_omp_clause): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.

libgomp/ChangeLog:

	* env.c (parse_bind_var): Accept 'primary' as alias for
	'master'.
	(omp_display_env): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.
	* libgomp.texi: Change 'master thread' to 'primary thread'
	in line with OpenMP 5.1.
	(omp_get_proc_bind): Add omp_proc_bind_primary and note that
	omp_proc_bind_master is an alias of it.
	(OMP_PROC_BIND): Mention 'PRIMARY'.
	* omp.h.in (__GOMP_DEPRECATED_5_1): Define.
	(omp_proc_bind_primary): Add.
	(omp_proc_bind_master): Deprecate for OpenMP 5.1.
	* omp_lib.f90.in (omp_proc_bind_primary): Add.
	(omp_proc_bind_master): Deprecate for OpenMP 5.1.
	* omp_lib.h.in (omp_proc_bind_primary): Add.
	* testsuite/libgomp.c/affinity-1.c: Check that
	'primary' works and is identical to 'master'.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/pr61486-2.c: Duplicate one proc_bind(master)
	testcase and test proc_bind(primary) instead.
	* gfortran.dg/gomp/affinity-1.f90: Likewise.

 gcc/c/c-parser.c                              |  7 --
 gcc/cp/parser.c                               |  7 --
 gcc/fortran/dump-parse-tree.c                 |  1 +
 gcc/fortran/gfortran.h                        |  1 +
 gcc/fortran/intrinsic.texi                    |  6 +++--
 gcc/fortran/openmp.c                          |  5 -
 gcc/fortran/trans-openmp.c                    |  3 +++
 gcc/testsuite/c-c++-common/gomp/pr61486-2.c   | 13 +++
 gcc/testsuite/gfortran.dg/gomp/affinity-1.f90 |  9
 gcc/tree-core.h                               |  1 +
 gcc/tree-pretty-print.c                       |  2 ++
 libgomp/env.c                                 | 13 ++-
 libgomp/libgomp.texi                          | 32 ++-
 libgomp/omp.h.in                              | 10 -
 libgomp/omp_lib.f90.in                        |  6 +
 libgomp/omp_lib.h.in                          |  2 ++
 libgomp/testsuite/libgomp.c/affinity-1.c      | 14
 17 files changed, 99 insertions(+)

Re: PING: [RS6000] rotate and mask constants [PR94393]

2021-08-12 Thread Bill Schmidt via Gcc-patches



On 8/10/21 11:02 AM, Bill Schmidt wrote:

Hm, is this code ever executed?  How does this not result in assignment
through null pointers?

It would be good to figure out whether this code is reachable at all,
and just yank it out if not.  Otherwise we need a test case that hits it.

I seem to have been completely blind when I was looking at this. Please 
ignore this comment.


Thanks,
Bill



Re: [ARM] PR66791: Replace builtins for vdup_n and vmov_n intrinsics

2021-08-12 Thread Christophe Lyon via Gcc-patches
On Thu, Aug 12, 2021 at 1:54 PM Prathamesh Kulkarni <
prathamesh.kulka...@linaro.org> wrote:

> On Wed, 11 Aug 2021 at 22:23, Christophe Lyon
>  wrote:
> >
> >
> >
> > On Thu, Jun 24, 2021 at 6:29 PM Kyrylo Tkachov via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> >>
> >>
> >>
> >> > -Original Message-
> >> > From: Prathamesh Kulkarni 
> >> > Sent: 24 June 2021 12:11
> >> > To: gcc Patches ; Kyrylo Tkachov
> >> > 
> >> > Subject: [ARM] PR66791: Replace builtins for vdup_n and vmov_n
> intrinsics
> >> >
> >> > Hi,
> >> > This patch replaces builtins for vdup_n and vmov_n.
> >> > The patch results in regression for pr51534.c.
> >> > Consider following function:
> >> >
> >> > uint8x8_t f1 (uint8x8_t a) {
> >> >   return vcgt_u8(a, vdup_n_u8(0));
> >> > }
> >> >
> >> > code-gen before patch:
> >> > f1:
> >> > vmov.i32  d16, #0  @ v8qi
> >> > vcgt.u8 d0, d0, d16
> >> > bx lr
> >> >
> >> > code-gen after patch:
> >> > f1:
> >> > vceq.i8 d0, d0, #0
> >> > vmvn d0, d0
> >> > bx lr
> >> >
> >> > I am not sure which one is better tho ?
> >>
> >
> > Hi Prathamesh,
> >
> > This patch introduces a regression on non-hardfp configs (eg
> arm-linux-gnueabi or arm-eabi):
> > FAIL:  gcc:gcc.target/arm/arm.exp=gcc.target/arm/pr51534.c
> scan-assembler-times vmov.i32[ \t]+[dD][0-9]+, #0x 3
> > FAIL:  gcc:gcc.target/arm/arm.exp=gcc.target/arm/pr51534.c
> scan-assembler-times vmov.i32[ \t]+[qQ][0-9]+, #4294967295 3
> >
> > Can you fix this?
> The issue is, for following test:
>
> #include 
>
> uint8x8_t f1 (uint8x8_t a) {
>   return vcge_u8(a, vdup_n_u8(0));
> }
>
> armhf code-gen:
> f1:
> vmov.i32  d0, #0x  @ v8qi
> bx  lr
>
> arm softfp code-gen:
> f1:
> mov r0, #-1
> mov r1, #-1
> bx  lr
>
> The code-gen for both is the same up to the split2 pass:
>
> (insn 10 6 11 2 (set (reg/i:V8QI 16 s0)
> (const_vector:V8QI [
> (const_int -1 [0x]) repeated x8
> ])) "foo.c":5:1 1052 {*neon_movv8qi}
>  (expr_list:REG_EQUAL (const_vector:V8QI [
> (const_int -1 [0x]) repeated x8
> ])
> (nil)))
> (insn 11 10 13 2 (use (reg/i:V8QI 16 s0)) "foo.c":5:1 -1
>  (nil))
>
> and for softfp target, split2 pass splits the assignment to r0 and r1:
>
> (insn 15 6 16 2 (set (reg:SI 0 r0)
> (const_int -1 [0x])) "foo.c":5:1 740
> {*thumb2_movsi_vfp}
>  (nil))
> (insn 16 15 11 2 (set (reg:SI 1 r1 [+4 ])
> (const_int -1 [0x])) "foo.c":5:1 740
> {*thumb2_movsi_vfp}
>  (nil))
> (insn 11 16 13 2 (use (reg/i:V8QI 0 r0)) "foo.c":5:1 -1
>  (nil))
>
> I suppose we could use a dg-scan for r[0-9]+, #-1 for softfp targets ?
>

Yes, probably, or try with check-function-bodies.

Christophe

> Thanks,
> Prathamesh
> >
> > Thanks
> >
> > Christophe
> >
> >
> >>
> >> I think they're equivalent in practice, in any case the patch itself is
> good (move away from RTL builtins).
> >> Ok.
> >> Thanks,
> >> Kyrill
> >>
> >> >
> >> > Also, this patch regressed bf16_dup.c on arm-linux-gnueabi,
> >> > which is due to a missed opt in lowering. I had filed it as
> >> > PR98435, and posted a fix for it here:
> >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572648.html
> >> >
> >> > Thanks,
> >> > Prathamesh
>
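
On the armhf code-gen quoted earlier in this thread: an unsigned lane is
always >= 0, so vcge_u8 (a, vdup_n_u8 (0)) folds to an all-ones vector, which
is why both targets only materialize the constant -1. A portable sketch of
that reasoning follows. The *_emul helpers are hypothetical scalar stand-ins
written for illustration and are not the arm_neon.h intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight unsigned 8-bit lanes, standing in for uint8x8_t.
using u8x8 = std::array<std::uint8_t, 8>;

// Hypothetical emulation of vdup_n_u8: broadcast one value to all lanes.
u8x8 vdup_n_u8_emul (std::uint8_t value)
{
  u8x8 result;
  result.fill (value);
  return result;
}

// Hypothetical emulation of vcge_u8: per-lane unsigned a >= b, producing
// 0xFF for true and 0x00 for false.
u8x8 vcge_u8_emul (const u8x8 &a, const u8x8 &b)
{
  u8x8 result{};
  for (std::size_t i = 0; i < result.size (); ++i)
    result[i] = a[i] >= b[i] ? 0xFF : 0x00;
  return result;
}
```

With b being all zeros, every lane of the result is 0xFF whatever a holds,
which matches the const_vector of -1 repeated x8 in the RTL dump.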


[committed] libstdc++: Add #error to some files that depend on a specific standard mode

2021-08-12 Thread Jonathan Wakely via Gcc-patches
Give more explicit errors if these files are not built with the correct
-std options.

libstdc++-v3/ChangeLog:

* src/c++98/locale_init.cc: Require C++11.
* src/c++98/localename.cc: Likewise.
* src/c++98/misc-inst.cc: Require C++98.

Tested powerpc64le-linux. Committed to trunk.

commit d3a7fbcb7c7a1016ceac2ceaf79b28c17ce9fcd7
Author: Jonathan Wakely 
Date:   Thu Aug 12 13:33:43 2021

libstdc++: Add #error to some files that depend on a specific standard mode

Give more explicit errors if these files are not built with the correct
-std options.

libstdc++-v3/ChangeLog:

* src/c++98/locale_init.cc: Require C++11.
* src/c++98/localename.cc: Likewise.
* src/c++98/misc-inst.cc: Require C++98.

diff --git a/libstdc++-v3/src/c++98/locale_init.cc 
b/libstdc++-v3/src/c++98/locale_init.cc
index 4bec50bf595..e96b1a336aa 100644
--- a/libstdc++-v3/src/c++98/locale_init.cc
+++ b/libstdc++-v3/src/c++98/locale_init.cc
@@ -20,6 +20,10 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
+#if __cplusplus != 201103L
+# error This file must be compiled as C++11
+#endif
+
 #define _GLIBCXX_USE_CXX11_ABI 1
 #include 
 #include 
diff --git a/libstdc++-v3/src/c++98/localename.cc 
b/libstdc++-v3/src/c++98/localename.cc
index 350dcf5ad0f..9c707b2327c 100644
--- a/libstdc++-v3/src/c++98/localename.cc
+++ b/libstdc++-v3/src/c++98/localename.cc
@@ -20,6 +20,10 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
+#if __cplusplus != 201103L
+# error This file must be compiled as C++11
+#endif
+
 #define _GLIBCXX_USE_CXX11_ABI 1
 #include 
 #include 
diff --git a/libstdc++-v3/src/c++98/misc-inst.cc 
b/libstdc++-v3/src/c++98/misc-inst.cc
index 09851903600..85a4287e113 100644
--- a/libstdc++-v3/src/c++98/misc-inst.cc
+++ b/libstdc++-v3/src/c++98/misc-inst.cc
@@ -26,6 +26,10 @@
 // ISO C++ 14882:
 //
 
+#if __cplusplus != 199711L
+# error This file must be compiled as C++98
+#endif
+
 #define _GLIBCXX_USE_CXX11_ABI 1
 #define _GLIBCXX_DISAMBIGUATE_REPLACE_INST 1
 #include 
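
For readers less familiar with the __cplusplus values used in these guards,
here is a small sketch. The helper standard_name is hypothetical and not part
of libstdc++. The sketch itself uses a "C++11 or later" guard (with <), whereas
the patch deliberately tests with != because each of those files must be built
in exactly one mode.

```cpp
#include <cassert>
#include <cstring>

// Give a readable error instead of a cascade of syntax errors if this
// sketch is built in a pre-C++11 mode (constexpr is used below).
#if __cplusplus < 201103L
# error This sketch must be compiled as C++11 or later
#endif

// Hypothetical helper: name the standard mode a __cplusplus value stands for.
// Values not listed here (including GNU work-in-progress values) map to "other".
constexpr const char *standard_name (long cplusplus)
{
  return cplusplus == 199711L ? "C++98"
       : cplusplus == 201103L ? "C++11"
       : cplusplus == 201402L ? "C++14"
       : cplusplus == 201703L ? "C++17"
       : cplusplus == 202002L ? "C++20"
       : "other";
}
```

An exact comparison is stricter than a minimum-version check: building
locale_init.cc with -std=gnu++14 would pass a "< 201103L" test but is rejected
by the "!= 201103L" guard in the patch.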


[PATCH] i386: support micro-levels in target{,_clone} attrs [PR101696]

2021-08-12 Thread Martin Liška

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
I modified the H.J. patch draft.
@H.J. Can you please verify the newly added 'feature_priority'?

As mentioned in the PR, we lack support for the x86-64 micro-architecture
levels in the target and target_clones attributes. While the levels
x86-64, x86-64-v2, x86-64-v3 and x86-64-v4 are accepted values for the -march
option, they are actually only aliases for the k8 CPU. That said, they are
closer to the __builtin_cpu_supports function, so we decided to implement
them there.
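
As a rough illustration of how the levels nest, here is a standalone sketch
of the checks this patch adds to cpu_indicator_init. The struct cpu_features
and the function x86_64_level are hypothetical; the real code tests FEATURE_*
bits via has_cpu_feature and records the result with set_cpu_feature. Only
the feature checks that appear in the cpuinfo.h hunk of this patch are
mirrored here.

```cpp
#include <cassert>

// Hypothetical feature record; member names follow the FEATURE_* names
// used in the cpuinfo.h hunk.
struct cpu_features
{
  bool lm, sse2;                                  // baseline
  bool cmpxchg16b, popcnt, lahf_lm, sse4_2;       // x86-64-v2
  bool avx2, bmi, bmi2, f16c, fma, lzcnt, movbe;  // x86-64-v3
  bool avx512bw, avx512cd, avx512dq, avx512vl;    // x86-64-v4
};

// Returns 0 if not even the baseline is met, otherwise the highest level
// (1 = baseline, 2..4 = v2..v4).  A level is only considered once all
// lower levels are satisfied, as in the nested ifs of the patch.
int x86_64_level (const cpu_features &f)
{
  if (!(f.lm && f.sse2))
    return 0;
  if (!(f.cmpxchg16b && f.popcnt && f.lahf_lm && f.sse4_2))
    return 1;
  if (!(f.avx2 && f.bmi && f.bmi2 && f.f16c && f.fma && f.lzcnt && f.movbe))
    return 2;
  if (!(f.avx512bw && f.avx512cd && f.avx512dq && f.avx512vl))
    return 3;
  return 4;
}
```

Note that a CPU reporting the AVX-512 bits without the full v3 set stays at
v2 in this scheme, since the v4 check is never reached.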

PR target/101696

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (cpu_indicator_init): Add support
for x86-64 micro levels for __builtin_cpu_supports.
* common/config/i386/i386-cpuinfo.h (enum feature_priority):
Add priorities for the micro-arch levels.
(enum processor_features): Add new features.
* common/config/i386/i386-isas.h: Add micro-arch features.
* config/i386/i386-builtins.c (get_builtin_code_for_version):
        Support the micro-arch levels by calling
        __builtin_cpu_supports.
        * doc/extend.texi: Document that the levels are supported by
        __builtin_cpu_supports.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv30.C: New test.
* gcc.target/i386/mvc16.c: New test.

Co-Authored-By: H.J. Lu 
---
 gcc/common/config/i386/cpuinfo.h      | 38
 gcc/common/config/i386/i386-cpuinfo.h |  8 +
 gcc/common/config/i386/i386-isas.h    |  4 +++
 gcc/config/i386/i386-builtins.c       | 22 ++--
 gcc/doc/extend.texi                   | 12 +++
 gcc/testsuite/g++.target/i386/mv30.C  | 50 +++
 gcc/testsuite/gcc.target/i386/mvc16.c | 15
 7 files changed, 146 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/mv30.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc16.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 458f41de776..f3b60920c81 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -931,6 +931,44 @@ cpu_indicator_init (struct __processor_model *cpu_model,
   else
 cpu_model->__cpu_vendor = VENDOR_OTHER;
 
+  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_LM)
+  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE2))
+{
+  set_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_X86_64_BASELINE);
+  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_CMPXCHG16B)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_POPCNT)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_LAHF_LM)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE4_2))
+   {
+ set_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_X86_64_V2);
+ if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX2)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI2)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_F16C)
+ && has_cpu_feature (cpu_model, cpu_features2, FEATURE_FMA)
+ && has_cpu_feature (cpu_model, cpu_features2,
+ FEATURE_LZCNT)
+ && has_cpu_feature (cpu_model, cpu_features2,
+ FEATURE_MOVBE))
+   {
+ set_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_X86_64_V3);
+ if (has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512BW)
+ && has_cpu_feature (cpu_model, cpu_features2,
+ FEATURE_AVX512CD)
+ && has_cpu_feature (cpu_model, cpu_features2,
+ FEATURE_AVX512DQ)
+ && has_cpu_feature (cpu_model, cpu_features2,
+ FEATURE_AVX512VL))
+   set_cpu_feature (cpu_model, cpu_features2,
+FEATURE_X86_64_V4);
+   }
+   }
+}
+
   gcc_assert (cpu_model->__cpu_vendor < VENDOR_MAX);
   gcc_assert (cpu_model->__cpu_type < CPU_TYPE_MAX);
   gcc_assert (cpu_model->__cpu_subtype < CPU_SUBTYPE_MAX);
diff --git a/gcc/common/config/i386/i386-cpuinfo.h 
b/gcc/common/config/i386/i386-cpuinfo.h
index e68dd656046..1b1846d59b8 100644
--- a/gcc/common/config/i386/i386-cpuinfo.h
+++ b/gcc/common/config/i386/i386-cpuinfo.h
@@ -102,6 +102,7 @@ enum feature_priority
   P_MMX,
   P_SSE,
   P_SSE2,
+  P_X86_64_BASELINE,
   P_SSE3,
   P_SSSE3,
   P_PROC_SSSE3,
@@ -111,6 +112,7 @@ enum feature_priority
   P_SSE4_2,
   P_PROC_SSE4_2,
   P_POPCNT,
+  P_X86_64_V2,
   P_AES,
   P_PCLMUL,
   P_AVX,
@@ -125,8 +127,10 @@ enum feature_priority
   P_BMI2,
   P_AVX2,
   P_PROC_AVX2,
+  P_X86_64_V3,
   P_AVX512F,
   P_PROC_AVX512F,
+  P_X86_64_V4,
   

Re: [PATCH] i386: support micro-levels in target{, _clone} attrs [PR101696]

2021-08-12 Thread H.J. Lu via Gcc-patches
On Thu, Aug 12, 2021 at 7:12 AM Martin Liška  wrote:
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> I modified the H.J. patch draft.

Please send out the v2 patch with the enclosed patch.  I added some tests.

> @H.J. Can you please verify the newly added 'feature_priority'?
>

I will take a look at the v2 patch.

> As mentioned in the PR, we lack support for the x86-64 micro-architecture
> levels in the target and target_clones attributes. While the levels
> x86-64, x86-64-v2, x86-64-v3 and x86-64-v4 are accepted values for the -march
> option, they are actually only aliases for the k8 CPU. That said, they are
> closer to the __builtin_cpu_supports function, so we decided to implement
> them there.
>
> PR target/101696
>
> gcc/ChangeLog:
>
> * common/config/i386/cpuinfo.h (cpu_indicator_init): Add support
> for x86-64 micro levels for __builtin_cpu_supports.
> * common/config/i386/i386-cpuinfo.h (enum feature_priority):
> Add priorities for the micro-arch levels.
> (enum processor_features): Add new features.
> * common/config/i386/i386-isas.h: Add micro-arch features.
> * config/i386/i386-builtins.c (get_builtin_code_for_version):
> Support the micro-arch levels by calling
> __builtin_cpu_supports.
> * doc/extend.texi: Document that the levels are supported by
>   __builtin_cpu_supports.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/i386/mv30.C: New test.
> * gcc.target/i386/mvc16.c: New test.
>
> Co-Authored-By: H.J. Lu 
> ---
>   gcc/common/config/i386/cpuinfo.h      | 38
>   gcc/common/config/i386/i386-cpuinfo.h |  8 +
>   gcc/common/config/i386/i386-isas.h    |  4 +++
>   gcc/config/i386/i386-builtins.c       | 22 ++--
>   gcc/doc/extend.texi                   | 12 +++
>   gcc/testsuite/g++.target/i386/mv30.C  | 50 +++
>   gcc/testsuite/gcc.target/i386/mvc16.c | 15
>   7 files changed, 146 insertions(+), 3 deletions(-)
>   create mode 100644 gcc/testsuite/g++.target/i386/mv30.C
>   create mode 100644 gcc/testsuite/gcc.target/i386/mvc16.c
>
> diff --git a/gcc/common/config/i386/cpuinfo.h 
> b/gcc/common/config/i386/cpuinfo.h
> index 458f41de776..f3b60920c81 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -931,6 +931,44 @@ cpu_indicator_init (struct __processor_model *cpu_model,
> else
>   cpu_model->__cpu_vendor = VENDOR_OTHER;
>
> +  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_LM)
> +  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE2))
> +{
> +  set_cpu_feature (cpu_model, cpu_features2,
> +  FEATURE_X86_64_BASELINE);
> +  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_CMPXCHG16B)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_POPCNT)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_LAHF_LM)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE4_2))
> +   {
> + set_cpu_feature (cpu_model, cpu_features2,
> +  FEATURE_X86_64_V2);
> + if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX2)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI2)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_F16C)
> + && has_cpu_feature (cpu_model, cpu_features2, FEATURE_FMA)
> + && has_cpu_feature (cpu_model, cpu_features2,
> + FEATURE_LZCNT)
> + && has_cpu_feature (cpu_model, cpu_features2,
> + FEATURE_MOVBE))
> +   {
> + set_cpu_feature (cpu_model, cpu_features2,
> +  FEATURE_X86_64_V3);
> + if (has_cpu_feature (cpu_model, cpu_features2,
> +  FEATURE_AVX512BW)
> + && has_cpu_feature (cpu_model, cpu_features2,
> + FEATURE_AVX512CD)
> + && has_cpu_feature (cpu_model, cpu_features2,
> + FEATURE_AVX512DQ)
> + && has_cpu_feature (cpu_model, cpu_features2,
> + FEATURE_AVX512VL))
> +   set_cpu_feature (cpu_model, cpu_features2,
> +FEATURE_X86_64_V4);
> +   }
> +   }
> +}
> +
> gcc_assert (cpu_model->__cpu_vendor < VENDOR_MAX);
> gcc_assert (cpu_model->__cpu_type < CPU_TYPE_MAX);
> gcc_assert (cpu_model->__cpu_subtype < CPU_SUBTYPE_MAX);
> diff --git a/gcc/common/config/i386/i386-cpuinfo.h 
> b/gcc/common/config/i386/i386-cpuinfo.h
> index e68dd656046..1b1846d59b8 100644
> --- a/gcc/common/config/i386/i386-cpuinfo.h
> +++ b/gcc/common/config/i386/i386-cpuinfo.h
> @@ -102,6 +102,7 @@ enum feature_priority
>

Re: [PATCH] c++: Implement P0466R5 __cpp_lib_is_layout_compatible compiler helpers [PR101539]

2021-08-12 Thread Jason Merrill via Gcc-patches

On 8/3/21 3:05 AM, Jakub Jelinek wrote:

Hi!

The following patch implements __is_layout_compatible trait and
__builtin_is_corresponding_member helper function for the
std::is_corresponding_member template function.
For now it implements the IMHO buggy but
standard definition of layout-compatible and std::is_layout_compatible
requirements (that Jonathan was discussing to change),
including ignoring of alignment differences, mishandling of bitfields in unions
and [[no_unique_address]] issues with empty classes.
Until we know what exactly is decided in CWG, that seems better than trying
to guess what the standard will say, but of course if you have different
ideas, the patch can change.


I think it's clear that if corresponding fields have different offsets 
or sizes, their containing types can't plausibly be layout-compatible. 
And if two types have different sizes or alignments, they can't be 
layout-compatible.


That leaves open the question of whether the presence or absence of 
no-op alignment specifiers makes a difference; Richard Smith's proposal 
would make that incompatible, I lean the other way, but don't feel 
strongly about it.
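
To make the offset/size argument concrete, here is a small sketch. The types
A, B, C and the SAME_SLOT macro are hypothetical and are not taken from the
patch or its tests. The check is only the necessary condition described
above (matching offsets and sizes); it is not the standard's definition of
layout-compatible, and it says nothing about the alignment-specifier question.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical example types.
struct A { int i; char c; double d; };
struct B { int x; char y; double z; };   // same member types, same order as A
struct C { int x; double z; char y; };   // same member types, different order

static_assert (std::is_standard_layout<A>::value
               && std::is_standard_layout<B>::value
               && std::is_standard_layout<C>::value,
               "offsetof is used here on standard-layout types only");

// Hypothetical helper: true if the two named members sit at the same
// offset and have the same size.
#define SAME_SLOT(T1, m1, T2, m2) \
  (offsetof (T1, m1) == offsetof (T2, m2) && sizeof (T1::m1) == sizeof (T2::m2))
```

For A and B every corresponding pair passes the check, while the char member
of C lands at a different offset than the one in A, so A and C cannot
plausibly be treated as layout-compatible.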



For __builtin_is_corresponding_member, it will sorry if corresponding members
could have different offsets (doesn't do so during constant evaluation but
unless one uses the builtin directly, even using std::is_corresponding_member
in constant expressions only will result in instantiation of the template and
the code in the template doesn't have constant arguments and so can emit sorry).


As above, this case should be false without a sorry.  If they're at 
different offsets, they can't reasonably be part of the common initial 
sequence.



For anonymous structs (GCC extension) it will recurse into the anonymous
structs.  For anonymous unions it will emit another sorry if it can't prove such
member types can't appear in the anonymous unions or anonymous aggregates in
that union, because corresponding member is defined only using common initial
sequence which is only defined for std-layout non-union class types and so I
have no idea what to do otherwise in that case.


That makes sense for now.


Bootstrapped/regtested on x86_64-linux and i686-linux.

2021-08-03  Jakub Jelinek  

PR c++/101539
gcc/c-family/
* c-common.h (enum rid): Add RID_IS_LAYOUT_COMPATIBLE.
* c-common.c (c_common_reswords): Add __is_layout_compatible.
gcc/cp/
* cp-tree.h (enum cp_trait_kind): Add CPTK_IS_LAYOUT_COMPATIBLE.
(enum cp_built_in_function): Add CP_BUILT_IN_IS_CORRESPONDING_MEMBER.
(fold_builtin_is_corresponding_member, layout_compatible_type_p):
Declare.
* parser.c (cp_parser_primary_expression): Handle
RID_IS_LAYOUT_COMPATIBLE.
(cp_parser_trait_expr): Likewise.
* cp-objcp-common.c (names_builtin_p): Likewise.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_LAYOUT_COMPATIBLE.
* decl.c (cxx_init_decl_processing): Register
__builtin_is_corresponding_member builtin.
* constexpr.c (cxx_eval_builtin_function_call): Handle
CP_BUILT_IN_IS_CORRESPONDING_MEMBER builtin.
* semantics.c (is_corresponding_member_union,
is_corresponding_member_aggr, fold_builtin_is_corresponding_member):
New functions.
(trait_expr_value): Handle CPTK_IS_LAYOUT_COMPATIBLE.
(finish_trait_expr): Likewise.
* typeck.c (layout_compatible_type_p): New function.
* cp-gimplify.c (cp_gimplify_expr): Fold
CP_BUILT_IN_IS_CORRESPONDING_MEMBER.
(cp_fold): Likewise.
* tree.c (builtin_valid_in_constant_expr_p): Handle
CP_BUILT_IN_IS_CORRESPONDING_MEMBER.
* cxx-pretty-print.c (pp_cxx_trait_expression): Handle
CPTK_IS_LAYOUT_COMPATIBLE.
* class.c (remove_zero_width_bit_fields): Remove.
(layout_class_type): Don't call it.
gcc/testsuite/
* g++.dg/cpp2a/is-corresponding-member1.C: New test.
* g++.dg/cpp2a/is-corresponding-member2.C: New test.
* g++.dg/cpp2a/is-corresponding-member3.C: New test.
* g++.dg/cpp2a/is-corresponding-member4.C: New test.
* g++.dg/cpp2a/is-corresponding-member5.C: New test.
* g++.dg/cpp2a/is-corresponding-member6.C: New test.
* g++.dg/cpp2a/is-corresponding-member7.C: New test.
* g++.dg/cpp2a/is-corresponding-member8.C: New test.
* g++.dg/cpp2a/is-layout-compatible1.C: New test.
* g++.dg/cpp2a/is-layout-compatible2.C: New test.
* g++.dg/cpp2a/is-layout-compatible3.C: New test.

--- gcc/c-family/c-common.h.jj  2021-07-31 18:35:23.879983218 +0200
+++ gcc/c-family/c-common.h 2021-07-31 18:37:07.038600605 +0200
@@ -173,7 +173,8 @@ enum rid
RID_IS_ABSTRACT, RID_IS_AGGREGATE,
RID_IS_BASE_OF,  RID_IS_CLASS,
RID_IS_EMPTY,RID_IS_ENUM,
-  RID_IS_FINAL,RID_IS_LITERAL_TYPE,
+  RID_IS_FINAL,RID_IS_

Re: [PATCH] Remove legacy back threader.

2021-08-12 Thread Aldy Hernandez via Gcc-patches
PING

On Thu, Aug 5, 2021 at 11:48 AM Aldy Hernandez  wrote:
>
> At this point I don't see any use for the legacy mode, which I had
> originally left in place during the transition.
>
> This patch removes the legacy back threader, and cleans up the code a
> bit.  There are no functional changes to the non-legacy code.
>
> Tested on x86-64 Linux.
>
> OK?
>
> gcc/ChangeLog:
>
> * doc/invoke.texi: Remove docs for threader-mode param.
> * flag-types.h (enum threader_mode): Remove.
> * params.opt: Remove threader-mode param.
> * tree-ssa-threadbackward.c (class back_threader): Remove
> path_is_unreachable_p.
> Make find_paths private.
> Add maybe_thread and thread_through_all_blocks.
> Remove reference marker for m_registry.
> Remove reference marker for m_profit.
> (back_threader::back_threader): Adjust for registry and profit not
> being references.
> (dump_path): Move down.
> (debug): Move down.
> (class thread_jumps): Remove.
> (class back_threader_registry): Remove m_all_paths.
> Remove destructor.
> (thread_jumps::thread_through_all_blocks): Move to back_threader
> class.
> (fsm_find_thread_path): Remove
> (back_threader::maybe_thread): New.
> (back_threader::thread_through_all_blocks): Move from
> thread_jumps.
> (back_threader_registry::back_threader_registry): Remove
> m_all_paths.
> (back_threader_registry::~back_threader_registry): Remove.
> (thread_jumps::find_taken_edge): Remove.
> (thread_jumps::check_subpath_and_update_thread_path): Remove.
> (thread_jumps::maybe_register_path): Remove.
> (thread_jumps::handle_phi): Remove.
> (handle_assignment_p): Remove.
> (thread_jumps::handle_assignment): Remove.
> (thread_jumps::fsm_find_control_statement_thread_paths): Remove.
> (thread_jumps::find_jump_threads_backwards): Remove.
> (thread_jumps::find_jump_threads_backwards_with_ranger): Remove.
> (try_thread_blocks): Rename find_jump_threads_backwards to
> maybe_thread.
> (pass_early_thread_jumps::execute): Same.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Remove call into the legacy
> code and adjust for ranger threader.
> ---
>  gcc/doc/invoke.texi   |   3 -
>  gcc/flag-types.h  |   7 -
>  gcc/params.opt|  13 -
>  .../gcc.dg/tree-ssa/ssa-dom-thread-7.c|   3 +-
>  gcc/tree-ssa-threadbackward.c | 539 ++
>  5 files changed, 61 insertions(+), 504 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 4efc8b757ec..65bb9981f02 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13421,9 +13421,6 @@ Setting to 0 disables the analysis completely.
>  @item modref-max-escape-points
>  Specifies the maximum number of escape points tracked by modref per SSA-name.
>
> -@item threader-mode
> -Specifies the mode the backwards threader should run in.
> -
>  @item profile-func-internal-id
>  A parameter to control whether to use function internal id in profile
>  database lookup. If the value is 0, the compiler uses an id that
> diff --git a/gcc/flag-types.h b/gcc/flag-types.h
> index e39673f6716..e43d1de490d 100644
> --- a/gcc/flag-types.h
> +++ b/gcc/flag-types.h
> @@ -454,13 +454,6 @@ enum evrp_mode
>EVRP_MODE_RVRP_DEBUG = EVRP_MODE_RVRP_ONLY | EVRP_MODE_DEBUG
>  };
>
> -/* Backwards threader mode.  */
> -enum threader_mode
> -{
> -  THREADER_MODE_LEGACY = 0,
> -  THREADER_MODE_RANGER = 1
> -};
> -
>  /* Modes of OpenACC 'kernels' constructs handling.  */
>  enum openacc_kernels
>  {
> diff --git a/gcc/params.opt b/gcc/params.opt
> index aa2fb4047b6..92b003e38cb 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -1010,19 +1010,6 @@ Maximum depth of DFS walk used by modref escape 
> analysis.
>  Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param 
> Optimization
>  Maximum number of escape points tracked by modref per SSA-name.
>
> --param=threader-mode=
> -Common Joined Var(param_threader_mode) Enum(threader_mode) 
> Init(THREADER_MODE_RANGER) Param Optimization
> ---param=threader-mode=[legacy|ranger] Specifies the mode the backwards 
> threader should run in.
> -
> -Enum
> -Name(threader_mode) Type(enum threader_mode) UnknownError(unknown threader 
> mode %qs)
> -
> -EnumValue
> -Enum(threader_mode) String(legacy) Value(THREADER_MODE_LEGACY)
> -
> -EnumValue
> -Enum(threader_mode) String(ranger) Value(THREADER_MODE_RANGER)
> -
>  -param=tm-max-aggregate-size=
>  Common Joined UInteger Var(param_tm_max_aggregate_size) Init(9) Param 
> Optimization
>  Size in bytes after which thread-local aggregates should be instrumented 
> with the logging functions instead of save/restore pairs.
> diff

Re: [PATCH] c++: suppress all warnings on member pointers to work around ICE [PR101219]

2021-08-12 Thread Jason Merrill via Gcc-patches

On 8/11/21 6:36 PM, Sergei Trofimovich wrote:

On Wed, 11 Aug 2021 15:19:58 -0400
Jason Merrill  wrote:


On 8/6/21 11:34 AM, Sergei Trofimovich wrote:

On Thu, 29 Jul 2021 11:41:39 -0400
Jason Merrill  wrote:
   

On 7/22/21 7:15 PM, Sergei Trofimovich wrote:

From: Sergei Trofimovich 

r12-1804 ("cp: add support for per-location warning groups.") among other
things removed warning suppression from a few places including ptrmemfuncs.

Currently ptrmemfuncs don't have valid BINFO attached which causes ICEs
in access checks:

   crash_signal
   gcc/toplev.c:328
   perform_or_defer_access_check(tree_node*, tree_node*, tree_node*, int, 
access_failure_info*)
   gcc/cp/semantics.c:490
   finish_non_static_data_member(tree_node*, tree_node*, tree_node*)
   gcc/cp/semantics.c:2208
   ...

The change suppresses warnings again until we provide BINFOs for ptrmemfuncs.


We don't need BINFOs for PMFs, we need to avoid paths that expect them.

It looks like the problem is with tsubst_copy_and_build calling
finish_non_static_data_member instead of build_ptrmemfunc_access_expr.


Sounds good. I'm not sure what would be the best way to match it. Here is
my attempt, which seems to survive all regtests:

--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -20530,7 +20530,13 @@ tsubst_copy_and_build (tree t,
  if (member == error_mark_node)
RETURN (error_mark_node);

-   if (TREE_CODE (member) == FIELD_DECL)
+   if (object_type && TYPE_PTRMEMFUNC_P(object_type)
+   && TREE_CODE (member) == FIELD_DECL)
+ {
+   r = build_ptrmemfunc_access_expr (object, DECL_NAME(member));
+   RETURN (r);
+ }
+   else if (TREE_CODE (member) == FIELD_DECL)
{
  r = finish_non_static_data_member (member, object, NULL_TREE);
  if (TREE_CODE (r) == COMPONENT_REF)
   

PR c++/101219

gcc/cp/ChangeLog:

* typeck.c (build_ptrmemfunc_access_expr): Suppress all warnings
to avoid ICE.

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr101219.C: New test.


This doesn't need to be in torture; it has nothing to do with optimization.


Aha, moved to gcc/testsuite/g++.dg/warn/pr101219.C.

--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/pr101219.C
@@ -0,0 +1,11 @@
+/* PR c++/101219 - ICE on use of uninitialized memfun pointer
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+struct S { void m(); };
+
+template  bool f() {
+  void (S::*mp)();
+
+  return &S::m == mp; // no warning emitted here (no instantiation)
+}

Another question: Is it expected that gcc generates no warnings here?
It's an uninstantiated function (-1 for warn), but from what I
understand it's guaranteed to generate comparison with uninitialized
data if it ever gets instantiated. Given that we used to ICE in
warning code gcc could possibly flag it? (+1 for warn)


Generally it's desirable to diagnose templates for which no valid
instantiation is possible.  It seems reasonable in most cases to also
warn about templates for which all instantiations would warn.

But uninitialized warnings rely on flow analysis that we only do on
instantiated functions, and in any case the ICE doesn't depend on mp
being uninitialized; I get the same crash if I add = 0 to the declaration.
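
For reference, a sketch of the same construct with an initialized pointer:
a pointer-to-member-function compared against &T::member inside a template.
The names S, inc, dec and is_inc are hypothetical, and the sketch is not the
testcase from the patch; it only shows what the comparison means once the
template is instantiated.

```cpp
#include <cassert>

// Hypothetical class with two distinct member functions.
struct S
{
  int n = 0;
  void inc () { ++n; }
  void dec () { --n; }
};

// Compares a pointer-to-member-function against &T::inc.  Inside a
// template this goes through the template substitution machinery, which
// is where the thread locates the ICE.
template <typename T>
bool is_inc (void (T::*mp) ())
{
  return &T::inc == mp;
}
```

The comparison is well defined for any initialized value of mp, including a
null member pointer, so any diagnostic about an uninitialized operand would
be a separate, flow-dependent matter.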


Aha. That makes sense. Let's just fix ICE then.


+   if (object_type && TYPE_PTRMEMFUNC_P(object_type)


Missing space before (.


+   && TREE_CODE (member) == FIELD_DECL)
+ {
+   r = build_ptrmemfunc_access_expr (object, DECL_NAME(member));


And here.


Added both. Attached as v3.


OK, thanks.

Jason



Re: [PATCH] i386: support micro-levels in target{,_clone} attrs [PR101696]

2021-08-12 Thread Martin Liška

On 8/12/21 4:25 PM, H.J. Lu wrote:

Please send out the v2 patch with the enclosed patch.  I added some tests.


Thanks, here's the patch, which includes your changes.

Martin
From 22ab27eab643d99addd00c11f62eca891e11ac08 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 12 Aug 2021 15:20:43 +0200
Subject: [PATCH] i386: support micro-levels in target{,_clone} attrs
 [PR101696]

As mentioned in the PR, we lack support for the x86-64 micro-architecture
levels in the target and target_clone attributes. While the levels
x86-64 x86-64-v2 x86-64-v3 x86-64-v4 are supported values of the -march
option, they are actually only aliases for the k8 CPU. That said, they are
closer to the __builtin_cpu_supports function and we decided to implement
them there.

	PR target/101696

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (cpu_indicator_init): Add support
	for x86-64 micro levels for __builtin_cpu_supports.
	* common/config/i386/i386-cpuinfo.h (enum feature_priority):
	Add priorities for the micro-arch levels.
	(enum processor_features): Add new features.
	* common/config/i386/i386-isas.h: Add micro-arch features.
	* config/i386/i386-builtins.c (get_builtin_code_for_version):
	Support the micro-arch levels by calling
	__builtin_cpu_supports.
	* doc/extend.texi: Document that the levels are supported by
	  __builtin_cpu_supports.

gcc/testsuite/ChangeLog:

	* g++.target/i386/mv30.C: New test.
	* gcc.target/i386/mvc16.c: New test.
	* gcc.target/i386/builtin_target.c (CHECK___builtin_cpu_supports):
	New.

Co-Authored-By: H.J. Lu 
---
 gcc/common/config/i386/cpuinfo.h  | 48 ++
 gcc/common/config/i386/i386-cpuinfo.h |  8 +++
 gcc/common/config/i386/i386-isas.h|  4 ++
 gcc/config/i386/i386-builtins.c   | 22 ++--
 gcc/doc/extend.texi   | 12 +
 gcc/testsuite/g++.target/i386/mv30.C  | 50 +++
 .../gcc.target/i386/builtin_target.c  |  2 +
 gcc/testsuite/gcc.target/i386/mvc16.c | 15 ++
 8 files changed, 158 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/mv30.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc16.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 458f41de776..89158597c1f 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -46,6 +46,10 @@ struct __processor_model2
 # define CHECK___builtin_cpu_is(cpu)
 #endif
 
+#ifndef CHECK___builtin_cpu_supports
+# define CHECK___builtin_cpu_supports(isa)
+#endif
+
 /* Return non-zero if the processor has feature F.  */
 
 static inline int
@@ -931,6 +935,50 @@ cpu_indicator_init (struct __processor_model *cpu_model,
   else
 cpu_model->__cpu_vendor = VENDOR_OTHER;
 
+  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_LM)
+  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE2))
+{
+  CHECK___builtin_cpu_supports ("x86-64");
+  set_cpu_feature (cpu_model, cpu_features2,
+		   FEATURE_X86_64_BASELINE);
+  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_CMPXCHG16B)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_POPCNT)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_LAHF_LM)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE4_2))
+	{
+	  CHECK___builtin_cpu_supports ("x86-64-v2");
+	  set_cpu_feature (cpu_model, cpu_features2,
+			   FEATURE_X86_64_V2);
+	  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX2)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI2)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_F16C)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_FMA)
+	  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_LZCNT)
+	  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_MOVBE))
+	{
+	  CHECK___builtin_cpu_supports ("x86-64-v3");
+	  set_cpu_feature (cpu_model, cpu_features2,
+			   FEATURE_X86_64_V3);
+	  if (has_cpu_feature (cpu_model, cpu_features2,
+   FEATURE_AVX512BW)
+		  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512CD)
+		  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512DQ)
+		  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512VL))
+		{
+		  CHECK___builtin_cpu_supports ("x86-64-v4");
+		  set_cpu_feature (cpu_model, cpu_features2,
+   FEATURE_X86_64_V4);
+		}
+	}
+	}
+}
+
   gcc_assert (cpu_model->__cpu_vendor < VENDOR_MAX);
   gcc_assert (cpu_model->__cpu_type < CPU_TYPE_MAX);
   gcc_assert (cpu_model->__cpu_subtype < CPU_SUBTYPE_MAX);
diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
index e68dd656046..1b1846d59b8 100644
--- a/gcc/common/config/i386/i386-cpuinfo.h
+++ b/gcc/common/config/i386/i386-cpuinfo.h
@@ -102,6 +102,7 @@ enum feature_priority
   P_MMX,
   P_SSE,
  

Re: [PATCH] libcpp: Fix ICE with -Wtraditional preprocessing [PR101638]

2021-08-12 Thread Jason Merrill via Gcc-patches

On 8/12/21 3:57 AM, Jakub Jelinek wrote:

Hi!

The following testcase ICEs in cpp_sys_macro_p, because cpp_sys_macro_p
is called for a builtin macro which doesn't use node->value.macro union
member but a different one and so dereferencing it ICEs.
As the testcase is distilled from contemporary glibc headers, it means
basically -Wtraditional now ICEs on almost everything.

The fix can be either the patch below, return false for builtin macros,
or we could instead return true for builtin macros and adjust the
function comment, or the fix could be also (untested):
--- libcpp/expr.c   2021-05-07 10:34:46.345122608 +0200
+++ libcpp/expr.c   2021-08-12 09:54:01.837556365 +0200
@@ -783,13 +783,13 @@ cpp_classify_number (cpp_reader *pfile,
  
/* Traditional C only accepted the 'L' suffix.

   Suppress warning about 'LL' with -Wno-long-long.  */
-  if (CPP_WTRADITIONAL (pfile) && ! cpp_sys_macro_p (pfile))
+  if (CPP_WTRADITIONAL (pfile))
{
  int u_or_i = (result & (CPP_N_UNSIGNED|CPP_N_IMAGINARY));
  int large = (result & CPP_N_WIDTH) == CPP_N_LARGE
   && CPP_OPTION (pfile, cpp_warn_long_long);
  
-	  if (u_or_i || large)
+ if ((u_or_i || large) && ! cpp_sys_macro_p (pfile))
cpp_warning_with_line (pfile, large ? CPP_W_LONG_LONG : CPP_W_TRADITIONAL,
   virtual_location, 0,
   "traditional C rejects the \"%.*s\" suffix",
The builtin macros at least currently don't add any suffixes
or numbers -Wtraditional would like to warn about.


So whether cpp_sys_macro_p returns true or false has no practical effect.


For floating
point suffixes, -Wtraditional calls cpp_sys_macro_p only right
away before emitting the warning, but in the above case the ICE
is because cpp_sys_macro_p is called even if the number doesn't
have any suffixes (that is I think always for builtin macros
right now).

Bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk, or do you prefer to return true for builtin
macros from cpp_sys_macro_p instead


I think I'd prefer to return true, since builtin macros are also in the 
broader category of macros provided by the implementation.  OK with that 
change.


, and/or do you want the

above cpp_classify_number change (which can be done either
alone or in addition to some cpp_sys_macro_p change)?

2021-08-12  Jakub Jelinek  

PR preprocessor/101638
* macro.c (cpp_sys_macro_p): Return false instead of
crashing on builtin macros.

* gcc.dg/cpp/pr101638.c: New test.

--- libcpp/macro.c.jj   2021-07-20 17:03:08.036449892 +0200
+++ libcpp/macro.c  2021-08-11 20:50:11.212662720 +0200
@@ -3127,7 +3127,10 @@ cpp_sys_macro_p (cpp_reader *pfile)
else
  node = pfile->context->c.macro;
  
-  return node && node->value.macro && node->value.macro->syshdr;
+  return (node
+ && cpp_user_macro_p (node)
+ && node->value.macro
+ && node->value.macro->syshdr);
  }
  
  /* Read each token in, until end of the current file.  Directives are

--- gcc/testsuite/gcc.dg/cpp/pr101638.c.jj  2021-08-11 20:53:04.785289640 +0200
+++ gcc/testsuite/gcc.dg/cpp/pr101638.c 2021-08-11 20:52:36.086682006 +0200
@@ -0,0 +1,7 @@
+/* PR preprocessor/101638 */
+/* { dg-do preprocess } */
+/* { dg-options "-Wtraditional" } */
+
+#define foo(attr) __has_attribute(attr)
+#if foo(__deprecated__)
+#endif

Jakub





Re: [PATCH] i386: support micro-levels in target{, _clone} attrs [PR101696]

2021-08-12 Thread H.J. Lu via Gcc-patches
On Thu, Aug 12, 2021 at 7:39 AM Martin Liška  wrote:
>
> On 8/12/21 4:25 PM, H.J. Lu wrote:
> > Please send out the v2 patch with the enclosed patch.  I added some tests.
>
> Thanks, there's patch which includes your changes.
>
> Martin

diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
index 898c18f3dda..cd9523b8fbc 100644
--- a/gcc/common/config/i386/i386-isas.h
+++ b/gcc/common/config/i386/i386-isas.h
@@ -169,4 +169,8 @@ ISA_NAMES_TABLE_START
   ISA_NAMES_TABLE_ENTRY("aeskle", FEATURE_AESKLE, P_NONE, NULL)
   ISA_NAMES_TABLE_ENTRY("widekl", FEATURE_WIDEKL, P_NONE, "-mwidekl")
   ISA_NAMES_TABLE_ENTRY("avxvnni", FEATURE_AVXVNNI, P_NONE, "-mavxvnni")
+  ISA_NAMES_TABLE_ENTRY("x86-64", FEATURE_X86_64_BASELINE, P_NONE, NULL)
+  ISA_NAMES_TABLE_ENTRY("x86-64-v2", FEATURE_X86_64_V2, P_NONE, NULL)
+  ISA_NAMES_TABLE_ENTRY("x86-64-v3", FEATURE_X86_64_V3, P_NONE, NULL)
+  ISA_NAMES_TABLE_ENTRY("x86-64-v4", FEATURE_X86_64_V4, P_NONE, NULL)

If they have proper feature_priority, can you avoid

diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 204e2903126..492873bb076 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -1904,8 +1904,24 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
  return 0;
   new_target = TREE_TARGET_OPTION (target_node);
   gcc_assert (new_target);
-
-  if (new_target->arch_specified && new_target->arch > 0)
+  enum ix86_builtins builtin_fn = IX86_BUILTIN_CPU_IS;
+
+  /* Special case x86-64 micro-level architectures.  */
+  const char *arch_name = attrs_str + strlen ("arch=");
+  if (startswith (arch_name, "x86-64"))
+ {
+   arg_str = arch_name;
+   builtin_fn = IX86_BUILTIN_CPU_SUPPORTS;
+   if (strcmp (arch_name, "x86-64") == 0)
+ priority = P_X86_64_BASELINE;
+   else if (strcmp (arch_name, "x86-64-v2") == 0)
+ priority = P_X86_64_V2;
+   else if (strcmp (arch_name, "x86-64-v3") == 0)
+ priority = P_X86_64_V3;
+   else if (strcmp (arch_name, "x86-64-v4") == 0)
+ priority = P_X86_64_V4;
+ }

   if (predicate_list)
  {
-  predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_IS];
+   predicate_decl = ix86_builtins [(int) builtin_fn];

Is this required?

-- 
H.J.
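
For readers following the review, the name-to-priority mapping in the quoted hunk can be sketched in isolation. This is only a minimal sketch: the enum values and the function name micro_level_priority are hypothetical stand-ins, not GCC's definitions, and the table lookup replaces the chain of strcmp calls purely for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the priorities named in the patch.  */
enum level_priority
{
  SKETCH_P_NONE = 0,
  SKETCH_P_X86_64_BASELINE,
  SKETCH_P_X86_64_V2,
  SKETCH_P_X86_64_V3,
  SKETCH_P_X86_64_V4
};

/* Map the text that follows "arch=" to a priority.  Only exact
   micro-level names match; everything else yields SKETCH_P_NONE.  */
static enum level_priority
micro_level_priority (const char *arch_name)
{
  static const struct
  {
    const char *name;
    enum level_priority priority;
  } table[] = {
    { "x86-64", SKETCH_P_X86_64_BASELINE },
    { "x86-64-v2", SKETCH_P_X86_64_V2 },
    { "x86-64-v3", SKETCH_P_X86_64_V3 },
    { "x86-64-v4", SKETCH_P_X86_64_V4 },
  };

  assert (arch_name != NULL);
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (arch_name, table[i].name) == 0)
      return table[i].priority;
  return SKETCH_P_NONE;
}
```

In this sketch a name such as "x86-64-v9" shares the "x86-64" prefix but matches no exact entry, so it falls through to SKETCH_P_NONE.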


Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Segher Boessenkool
Hi!

On Wed, Aug 11, 2021 at 02:56:11PM +0800, Kewen.Lin wrote:
>   * config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add
>   support for some built-in functions vectorized on Power10.

Say which, not "some" please?

> +  machine_mode in_vmode = TYPE_MODE (type_in);
> +  machine_mode out_vmode = TYPE_MODE (type_out);
> +
> +  /* Power10 supported vectorized built-in functions.  */
> +  if (TARGET_POWER10
> +  && in_vmode == out_vmode
> +  && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode))
> +{
> +  machine_mode exp_mode = DImode;
> +  machine_mode exp_vmode = V2DImode;
> +  enum rs6000_builtins vname = RS6000_BUILTIN_COUNT;

"name"?  This should be "bif" or similar?

> +  switch (fn)
> + {
> + case MISC_BUILTIN_DIVWE:
> + case MISC_BUILTIN_DIVWEU:
> +   exp_mode = SImode;
> +   exp_vmode = V4SImode;
> +   if (fn == MISC_BUILTIN_DIVWE)
> + vname = P10V_BUILTIN_DIVES_V4SI;
> +   else
> + vname = P10V_BUILTIN_DIVEU_V4SI;
> +   break;
> + case MISC_BUILTIN_DIVDE:
> + case MISC_BUILTIN_DIVDEU:
> +   if (fn == MISC_BUILTIN_DIVDE)
> + vname = P10V_BUILTIN_DIVES_V2DI;
> +   else
> + vname = P10V_BUILTIN_DIVEU_V2DI;
> +   break;

All of the above should not be builtin functions really, they are all
simple arithmetic :-(  They should not be UNSPECs either, on RTL level.
They can and should be optimised in real code as well.  Oh well.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/dive-vectorize-2.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target lp64 } */

Please add a comment what this is needed for?  "We scan for dive*d" is
enough, but without anything, it takes time to figure this out.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/dive-vectorize-run-2.c
> @@ -0,0 +1,53 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target lp64 } */

Same here.  I suppose this uses builtins that do not exist on 32-bit?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/p10-bifs-vectorize-run-1.c
> @@ -0,0 +1,45 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target lp64 } */

And another.

> +#define CHECK(name)  \
> +  __attribute__ ((optimize (1))) void check_##name ()  \

What is the attribute for, btw?  It seems fragile, but perhaps I do not
understand the intention.


Okay for trunk with those lp64 things improved.  Thanks!


Segher


Re: [PATCH] c++: Implement P0466R5 __cpp_lib_is_layout_compatible compiler helpers [PR101539]

2021-08-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 12, 2021 at 10:33:20AM -0400, Jason Merrill wrote:
> > The following patch implements __is_layout_compatible trait and
> > __builtin_is_corresponding_member helper function for the
> > std::is_corresponding_member template function.
> > For now it implements the IMHO buggy but
> > standard definition of layout-compatible and std::is_layout_compatible
> > requirements (that Jonathan was discussing to change),
> > including ignoring of alignment differences, mishandling of bitfields in
> > unions and [[no_unique_address]] issues with empty classes.
> > Until we know what exactly is decided in a CWG that seems better than trying
> > to guess what the standard will say, but of course if you have different
> > ideas, the patch can change.
> 
> I think it's clear that if corresponding fields have different offsets or
> sizes, their containing types can't plausibly be layout-compatible. And if
> two types have different sizes or alignments, they can't be
> layout-compatible.
> 
> That leaves open the question of whether the presence or absence of no-op
> alignment specifiers makes a difference; Richard Smith's proposal would make
> that incompatible, I lean the other way, but don't feel strongly about it.

Ok, so you prefer to change layout_compatible_type_p in anticipation of the
future DR.
Given the g++.dg/cpp2a/is-layout-compatible3.C cases, shall that include:
  if (TYPE_ALIGN (type1) != TYPE_ALIGN (type2))
    return false;  /* Types with different alignment aren't layout-compatible.  */
  if (!tree_int_cst_equal (TYPE_SIZE_UNIT (type1), TYPE_SIZE_UNIT (type2)))
    return false;  /* Types with different sizes aren't layout-compatible.  */
cases inside both the ENUMERAL_TYPE and CLASS_TYPE_P ifs (as e.g. the
enumeral types can have the same underlying type including alignment and
size, but the enumeral type itself could have different alignas)?
I think I can't compare TYPE_SIZE_UNIT for the fallthrough same_type_p case
because the type could be array with unspecified bounds and I expect
same_type_p fails if the sizes or alignments are different.

And then also compare field offsets in struct and say members with different
offsets aren't part of the common initial sequence (that would cover the
struct S {};
struct T {};
struct U { [[no_unique_address]] S a1; [[no_unique_address]] S a2; [[no_unique_address]] S a3; };
struct V { [[no_unique_address]] S b1; [[no_unique_address]] T b2; [[no_unique_address]] S b3; };
case or alignas on the members as opposed to types)?
What about DECL_ALIGN?  Shall that be relevant even when it doesn't change
anything further (TYPE_ALIGN of the whole struct is already the same and
field with higher DECL_ALIGN has the same field offset as in another struct
where it has that offset because of the previous fields or is at start)?

And finally, what about the union case?  Shall it check also bitfield vs.
non-bitfield, bitfield size if bitfields?  What about [[no_unique_address]]
on the union members, shall that be relevant or not?  And DECL_ALIGN?
E.g. the whole union can have the alignment,
union alignas (16) A { short a; alignas (8) int b; };
vs.
union alignas (16) B { int a; short b; };
All union fields have the same field offset of course (0), but above A::b
has different alignment requirement than B::a.

Jakub
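
The point about member offsets can be illustrated outside the compiler with a small C11 sketch. It is not the proposed layout_compatible_type_p change; the struct names are hypothetical, and it assumes a typical ABI where int is aligned to fewer than 8 bytes. Two structs with the same member types end up with different offsets once one member carries an alignment specifier.

```c
#include <assert.h>
#include <stddef.h>

/* Same member types in both structs; only the alignment request differs.  */
struct plain_pair   { char tag; int value; };
struct aligned_pair { char tag; _Alignas (8) int value; };

/* The over-aligned member must land on an 8-byte boundary.  */
static_assert (offsetof (struct aligned_pair, value) % 8 == 0,
	       "value is 8-byte aligned");

/* Return nonzero if every corresponding member sits at the same offset.  */
static int
same_member_offsets (void)
{
  return (offsetof (struct plain_pair, tag)
	  == offsetof (struct aligned_pair, tag))
	 && (offsetof (struct plain_pair, value)
	     == offsetof (struct aligned_pair, value));
}
```

Under that ABI assumption same_member_offsets returns 0, which is the kind of mismatch an offset comparison over the common initial sequence would catch.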



Re: [PATCH] libcpp: Fix ICE with -Wtraditional preprocessing [PR101638]

2021-08-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 12, 2021 at 10:46:23AM -0400, Jason Merrill wrote:
> > --- libcpp/expr.c   2021-05-07 10:34:46.345122608 +0200
> > +++ libcpp/expr.c   2021-08-12 09:54:01.837556365 +0200
> > @@ -783,13 +783,13 @@ cpp_classify_number (cpp_reader *pfile,
> > /* Traditional C only accepted the 'L' suffix.
> >Suppress warning about 'LL' with -Wno-long-long.  */
> > -  if (CPP_WTRADITIONAL (pfile) && ! cpp_sys_macro_p (pfile))
> > +  if (CPP_WTRADITIONAL (pfile))
> > {
> >   int u_or_i = (result & (CPP_N_UNSIGNED|CPP_N_IMAGINARY));
> >   int large = (result & CPP_N_WIDTH) == CPP_N_LARGE
> >&& CPP_OPTION (pfile, cpp_warn_long_long);
> > - if (u_or_i || large)
> > + if ((u_or_i || large) && ! cpp_sys_macro_p (pfile))
> > cpp_warning_with_line (pfile, large ? CPP_W_LONG_LONG : 
> > CPP_W_TRADITIONAL,
> >virtual_location, 0,
> >"traditional C rejects the \"%.*s\" suffix",
> > The builtin macros at least currently don't add any suffixes
> > or numbers -Wtraditional would like to warn about.
> 
> So whether cpp_sys_macro_p returns true or false has no practical effect.

Yes.

> > For floating
> > point suffixes, -Wtraditional calls cpp_sys_macro_p only right
> > away before emitting the warning, but in the above case the ICE
> > is because cpp_sys_macro_p is called even if the number doesn't
> > have any suffixes (that is I think always for builtin macros
> > right now).
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux.
> > Ok for trunk, or do you prefer to return true for builtin
> > macros from cpp_sys_macro_p instead
> 
> I think I'd prefer to return true, since builtin macros are also in the
> broader category of macros provided by the implementation.  OK with that
> change.

Ok, will test that.  Thanks.

Jakub
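
The failure mode and the agreed direction can be sketched with a stand-alone tagged union. This is a minimal sketch, not libcpp code: the types node_kind, macro_def and node and the function sys_macro_p are hypothetical simplifications. The idea is to check which union member is active before dereferencing, and to treat builtin macros as implementation-provided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-ins for the libcpp structures.  */
enum node_kind { NODE_USER_MACRO, NODE_BUILTIN_MACRO };

struct macro_def { bool syshdr; };

struct node
{
  enum node_kind kind;
  union
  {
    struct macro_def *macro;	/* Valid only for NODE_USER_MACRO.  */
    int builtin_code;		/* Valid only for NODE_BUILTIN_MACRO.  */
  } value;
};

/* Return true if NODE is a macro provided by the implementation.
   Builtin macros never touch value.macro, so the pointer member is
   only read when the kind says it is the active one.  */
static bool
sys_macro_p (const struct node *node)
{
  if (node == NULL)
    return false;
  if (node->kind == NODE_BUILTIN_MACRO)
    return true;
  return node->value.macro != NULL && node->value.macro->syshdr;
}
```

Returning true for the builtin case follows the reviewer's preference quoted above; returning false there would be the variant from the originally posted patch.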



Re: [PATCH] i386: support micro-levels in target{,_clone} attrs [PR101696]

2021-08-12 Thread Martin Liška

On 8/12/21 4:51 PM, H.J. Lu wrote:

On Thu, Aug 12, 2021 at 7:39 AM Martin Liška  wrote:


On 8/12/21 4:25 PM, H.J. Lu wrote:

Please send out the v2 patch with the enclosed patch.  I added some tests.


Thanks, there's patch which includes your changes.

Martin


diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
index 898c18f3dda..cd9523b8fbc 100644
--- a/gcc/common/config/i386/i386-isas.h
+++ b/gcc/common/config/i386/i386-isas.h
@@ -169,4 +169,8 @@ ISA_NAMES_TABLE_START
ISA_NAMES_TABLE_ENTRY("aeskle", FEATURE_AESKLE, P_NONE, NULL)
ISA_NAMES_TABLE_ENTRY("widekl", FEATURE_WIDEKL, P_NONE, "-mwidekl")
ISA_NAMES_TABLE_ENTRY("avxvnni", FEATURE_AVXVNNI, P_NONE, "-mavxvnni")
+  ISA_NAMES_TABLE_ENTRY("x86-64", FEATURE_X86_64_BASELINE, P_NONE, NULL)
+  ISA_NAMES_TABLE_ENTRY("x86-64-v2", FEATURE_X86_64_V2, P_NONE, NULL)
+  ISA_NAMES_TABLE_ENTRY("x86-64-v3", FEATURE_X86_64_V3, P_NONE, NULL)
+  ISA_NAMES_TABLE_ENTRY("x86-64-v4", FEATURE_X86_64_V4, P_NONE, NULL)

If they have proper feature_priority, can you avoid


I don't think so. First, we likely want to support "arch=x86-64-v3" rather than
"x86-64-v3" in e.g. the 'target' attribute. That means special handling by the
code I added.

The following fails as there's no corresponding -m$option.

pr101696.c:5:45: error: attribute ‘x86-64-v4’ argument ‘target’ is unknown
    5 | __attribute__ ((target ("x86-64-v4"))) void foo () {  __builtin_printf ("arch=x86-64-v4\n"); }
      | ^~~


Or do I miss something and we can do it in a simpler way?

Cheers,
Martin



diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 204e2903126..492873bb076 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -1904,8 +1904,24 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
   return 0;
new_target = TREE_TARGET_OPTION (target_node);
gcc_assert (new_target);
-
-  if (new_target->arch_specified && new_target->arch > 0)
+  enum ix86_builtins builtin_fn = IX86_BUILTIN_CPU_IS;
+
+  /* Special case x86-64 micro-level architectures.  */
+  const char *arch_name = attrs_str + strlen ("arch=");
+  if (startswith (arch_name, "x86-64"))
+ {
+   arg_str = arch_name;
+   builtin_fn = IX86_BUILTIN_CPU_SUPPORTS;
+   if (strcmp (arch_name, "x86-64") == 0)
+ priority = P_X86_64_BASELINE;
+   else if (strcmp (arch_name, "x86-64-v2") == 0)
+ priority = P_X86_64_V2;
+   else if (strcmp (arch_name, "x86-64-v3") == 0)
+ priority = P_X86_64_V3;
+   else if (strcmp (arch_name, "x86-64-v4") == 0)
+ priority = P_X86_64_V4;
+ }

if (predicate_list)
   {
-  predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_IS];
+   predicate_decl = ix86_builtins [(int) builtin_fn];

Is this required?





[PATCH] ipa: make target_clone default decl local [PR101726]

2021-08-12 Thread Martin Liška

When we have a target_clone *declaration*, it does not make sense to make
the default version local. The use-case in the PR is that the reporter
wants to implement the function in assembly.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR ipa/101726

gcc/ChangeLog:

* multiple_target.c (create_dispatcher_calls): Make default
  function local only if it is a definition.
---
 gcc/multiple_target.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index e4192657cef..6c0565880c5 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -170,17 +170,20 @@ create_dispatcher_calls (struct cgraph_node *node)
  clone_function_name_numbered (
  node->decl, "default"));
 
-  /* FIXME: copy of cgraph_node::make_local that should be cleaned up
-   in next stage1.  */
-  node->make_decl_local ();
-  node->set_section (NULL);
-  node->set_comdat_group (NULL);
-  node->externally_visible = false;
-  node->forced_by_abi = false;
-  node->set_section (NULL);
-
-  DECL_ARTIFICIAL (node->decl) = 1;
-  node->force_output = true;
+  if (node->definition)
+{
+  /* FIXME: copy of cgraph_node::make_local that should be cleaned up
+   in next stage1.  */
+  node->make_decl_local ();
+  node->set_section (NULL);
+  node->set_comdat_group (NULL);
+  node->externally_visible = false;
+  node->forced_by_abi = false;
+  node->set_section (NULL);
+
+  DECL_ARTIFICIAL (node->decl) = 1;
+  node->force_output = true;
+}
 }
 
 /* Return length of attribute names string,

--
2.32.0



Re: [PATCH] i386: support micro-levels in target{, _clone} attrs [PR101696]

2021-08-12 Thread H.J. Lu via Gcc-patches
On Thu, Aug 12, 2021 at 8:22 AM Martin Liška  wrote:
>
> On 8/12/21 4:51 PM, H.J. Lu wrote:
> > On Thu, Aug 12, 2021 at 7:39 AM Martin Liška  wrote:
> >>
> >> On 8/12/21 4:25 PM, H.J. Lu wrote:
> >>> Please send out the v2 patch with the enclosed patch.  I added some tests.
> >>
> >> Thanks, there's patch which includes your changes.
> >>
> >> Martin
> >
> > diff --git a/gcc/common/config/i386/i386-isas.h
> > b/gcc/common/config/i386/i386-isas.h
> > index 898c18f3dda..cd9523b8fbc 100644
> > --- a/gcc/common/config/i386/i386-isas.h
> > +++ b/gcc/common/config/i386/i386-isas.h
> > @@ -169,4 +169,8 @@ ISA_NAMES_TABLE_START
> > ISA_NAMES_TABLE_ENTRY("aeskle", FEATURE_AESKLE, P_NONE, NULL)
> > ISA_NAMES_TABLE_ENTRY("widekl", FEATURE_WIDEKL, P_NONE, "-mwidekl")
> > ISA_NAMES_TABLE_ENTRY("avxvnni", FEATURE_AVXVNNI, P_NONE, "-mavxvnni")
> > +  ISA_NAMES_TABLE_ENTRY("x86-64", FEATURE_X86_64_BASELINE, P_NONE, NULL)
> > +  ISA_NAMES_TABLE_ENTRY("x86-64-v2", FEATURE_X86_64_V2, P_NONE, NULL)
> > +  ISA_NAMES_TABLE_ENTRY("x86-64-v3", FEATURE_X86_64_V3, P_NONE, NULL)
> > +  ISA_NAMES_TABLE_ENTRY("x86-64-v4", FEATURE_X86_64_V4, P_NONE, NULL)
> >
> > If they have proper feature_priority, can you avoid
>
> I don't think so. First we likely want supporting "arch=x86-64-v3" rather than
> "x86-64-v3" in e.g. 'target' attribute. That means a special handling by the 
> code
> I added.

Will it hurt if they have proper feature_priorities you added?

> The following fails as there's no corresponding -m$option.
>
> pr101696.c:5:45: error: attribute ‘x86-64-v4’ argument ‘target’ is unknown
>
>  5 | __attribute__ ((target ("x86-64-v4"))) void foo () {  
> __builtin_printf ("arch=x86-64-v4\n"); }
>
>| ^~~
>
>
> Or do I miss something and we can do it in a simpler way?
>
> Cheers,
> Martin
>
> >
> > iff --git a/gcc/config/i386/i386-builtins.c 
> > b/gcc/config/i386/i386-builtins.c
> > index 204e2903126..492873bb076 100644
> > --- a/gcc/config/i386/i386-builtins.c
> > +++ b/gcc/config/i386/i386-builtins.c
> > @@ -1904,8 +1904,24 @@ get_builtin_code_for_version (tree decl, tree
> > *predicate_list)
> >return 0;
> > new_target = TREE_TARGET_OPTION (target_node);
> > gcc_assert (new_target);
> > -
> > -  if (new_target->arch_specified && new_target->arch > 0)
> > +  enum ix86_builtins builtin_fn = IX86_BUILTIN_CPU_IS;
> > +
> > +  /* Special case x86-64 micro-level architectures.  */
> > +  const char *arch_name = attrs_str + strlen ("arch=");
> > +  if (startswith (arch_name, "x86-64"))
> > + {
> > +   arg_str = arch_name;
> > +   builtin_fn = IX86_BUILTIN_CPU_SUPPORTS;
> > +   if (strcmp (arch_name, "x86-64") == 0)
> > + priority = P_X86_64_BASELINE;
> > +   else if (strcmp (arch_name, "x86-64-v2") == 0)
> > + priority = P_X86_64_V2;
> > +   else if (strcmp (arch_name, "x86-64-v3") == 0)
> > + priority = P_X86_64_V3;
> > +   else if (strcmp (arch_name, "x86-64-v4") == 0)
> > + priority = P_X86_64_V4;
> > + }
> >
> > if (predicate_list)
> >{
> > -  predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_IS];
> > +   predicate_decl = ix86_builtins [(int) builtin_fn];
> >
> > Is this required?
> >
>


-- 
H.J.
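
The nested checks that the patch adds to cpu_indicator_init can be sketched on a plain bitmask, which makes the "each level requires the previous one" structure easy to see. This is a hedged sketch: the feature indices, the LEVEL_* masks and micro_level are hypothetical, and only the feature groupings are taken from the diff quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature indices; the names echo the FEATURE_* checks in
   the quoted diff, but the numbering is made up for this sketch.  */
enum feature
{
  F_LM, F_SSE2,
  F_CMPXCHG16B, F_POPCNT, F_LAHF_LM, F_SSE4_2,
  F_AVX2, F_BMI, F_BMI2, F_F16C, F_FMA, F_LZCNT, F_MOVBE,
  F_AVX512BW, F_AVX512CD, F_AVX512DQ, F_AVX512VL
};

#define BIT(f) (UINT32_C (1) << (f))

#define LEVEL_BASE (BIT (F_LM) | BIT (F_SSE2))
#define LEVEL_V2   (BIT (F_CMPXCHG16B) | BIT (F_POPCNT) \
		    | BIT (F_LAHF_LM) | BIT (F_SSE4_2))
#define LEVEL_V3   (BIT (F_AVX2) | BIT (F_BMI) | BIT (F_BMI2) | BIT (F_F16C) \
		    | BIT (F_FMA) | BIT (F_LZCNT) | BIT (F_MOVBE))
#define LEVEL_V4   (BIT (F_AVX512BW) | BIT (F_AVX512CD) \
		    | BIT (F_AVX512DQ) | BIT (F_AVX512VL))

/* Return 0 if not even the baseline is met, 1 for x86-64, and 2..4 for
   x86-64-v2..v4.  A level counts only if all lower levels are met.  */
static int
micro_level (uint32_t have)
{
  static const uint32_t need[] = { LEVEL_BASE, LEVEL_V2, LEVEL_V3, LEVEL_V4 };
  int level = 0;

  for (size_t i = 0; i < sizeof need / sizeof need[0]; i++)
    {
      if ((have & need[i]) != need[i])
	break;
      level = (int) i + 1;
    }
  return level;
}
```

A CPU mask with the AVX-512 bits but without the v2 features still reports level 1 here, mirroring the nesting of the if statements in the patch.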


[PATCH] Do not enable DT_INIT_ARRAY/DT_FINI_ARRAY on uclinuxfdpiceabi

2021-08-12 Thread Christophe Lyon via Gcc-patches
Commit r12-1328 enabled DT_INIT_ARRAY/DT_FINI_ARRAY for all Linux
targets, but this does not work for arm-none-uclinuxfdpiceabi: it
makes all the execution tests fail.

This patch restores the original behavior for uclinuxfdpiceabi.

2021-08-12  Christophe Lyon  

gcc/
PR target/100896
* config.gcc (gcc_cv_initfini_array): Leave undefined for
uclinuxfdpiceabi targets.
---
 gcc/config.gcc | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 93e2b3219b9..8c8d30ca934 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -851,8 +851,14 @@ case ${target} in
   tmake_file="${tmake_file} t-glibc"
   target_has_targetcm=yes
   target_has_targetdm=yes
-  # Linux targets always support .init_array.
-  gcc_cv_initfini_array=yes
+  case $target in
+*-*-uclinuxfdpiceabi)
+  ;;
+*)
+  # Linux targets always support .init_array.
+  gcc_cv_initfini_array=yes
+  ;;
+  esac
   ;;
 *-*-netbsd*)
   tm_p_file="${tm_p_file} netbsd-protos.h"
-- 
2.25.1



Re: [PATCH] i386: support micro-levels in target{,_clone} attrs [PR101696]

2021-08-12 Thread Martin Liška

On 8/12/21 5:26 PM, H.J. Lu wrote:

Will it hurt if they have proper feature_priorities you added?


No. They are unused, but we should use the proper priorities.

Martin
From 5a2f40394390f8bfca0724d6e371b5105d01c027 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 12 Aug 2021 15:20:43 +0200
Subject: [PATCH] i386: support micro-levels in target{,_clone} attrs
 [PR101696]

As mentioned in the PR, we lack support for the x86-64 micro-architecture
levels in the target and target_clone attributes. While the levels
x86-64 x86-64-v2 x86-64-v3 x86-64-v4 are supported values of the -march
option, they are actually only aliases for the k8 CPU. That said, they are
closer to the __builtin_cpu_supports function and we decided to implement
them there.

	PR target/101696

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (cpu_indicator_init): Add support
	for x86-64 micro levels for __builtin_cpu_supports.
	* common/config/i386/i386-cpuinfo.h (enum feature_priority):
	Add priorities for the micro-arch levels.
	(enum processor_features): Add new features.
	* common/config/i386/i386-isas.h: Add micro-arch features.
	* config/i386/i386-builtins.c (get_builtin_code_for_version):
	Support the micro-arch levels by calling
	__builtin_cpu_supports.
	* doc/extend.texi: Document that the levels are supported by
	  __builtin_cpu_supports.

gcc/testsuite/ChangeLog:

	* g++.target/i386/mv30.C: New test.
	* gcc.target/i386/mvc16.c: New test.
	* gcc.target/i386/builtin_target.c (CHECK___builtin_cpu_supports):
	New.

Co-Authored-By: H.J. Lu 
---
 gcc/common/config/i386/cpuinfo.h  | 48 ++
 gcc/common/config/i386/i386-cpuinfo.h |  8 +++
 gcc/common/config/i386/i386-isas.h|  5 ++
 gcc/config/i386/i386-builtins.c   | 22 ++--
 gcc/doc/extend.texi   | 12 +
 gcc/testsuite/g++.target/i386/mv30.C  | 50 +++
 .../gcc.target/i386/builtin_target.c  |  2 +
 gcc/testsuite/gcc.target/i386/mvc16.c | 15 ++
 8 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/mv30.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc16.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 458f41de776..89158597c1f 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -46,6 +46,10 @@ struct __processor_model2
 # define CHECK___builtin_cpu_is(cpu)
 #endif
 
+#ifndef CHECK___builtin_cpu_supports
+# define CHECK___builtin_cpu_supports(isa)
+#endif
+
 /* Return non-zero if the processor has feature F.  */
 
 static inline int
@@ -931,6 +935,50 @@ cpu_indicator_init (struct __processor_model *cpu_model,
   else
 cpu_model->__cpu_vendor = VENDOR_OTHER;
 
+  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_LM)
+  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE2))
+{
+  CHECK___builtin_cpu_supports ("x86-64");
+  set_cpu_feature (cpu_model, cpu_features2,
+		   FEATURE_X86_64_BASELINE);
+  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_CMPXCHG16B)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_POPCNT)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_LAHF_LM)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_SSE4_2))
+	{
+	  CHECK___builtin_cpu_supports ("x86-64-v2");
+	  set_cpu_feature (cpu_model, cpu_features2,
+			   FEATURE_X86_64_V2);
+	  if (has_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX2)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_BMI2)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_F16C)
+	  && has_cpu_feature (cpu_model, cpu_features2, FEATURE_FMA)
+	  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_LZCNT)
+	  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_MOVBE))
+	{
+	  CHECK___builtin_cpu_supports ("x86-64-v3");
+	  set_cpu_feature (cpu_model, cpu_features2,
+			   FEATURE_X86_64_V3);
+	  if (has_cpu_feature (cpu_model, cpu_features2,
+   FEATURE_AVX512BW)
+		  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512CD)
+		  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512DQ)
+		  && has_cpu_feature (cpu_model, cpu_features2,
+  FEATURE_AVX512VL))
+		{
+		  CHECK___builtin_cpu_supports ("x86-64-v4");
+		  set_cpu_feature (cpu_model, cpu_features2,
+   FEATURE_X86_64_V4);
+		}
+	}
+	}
+}
+
   gcc_assert (cpu_model->__cpu_vendor < VENDOR_MAX);
   gcc_assert (cpu_model->__cpu_type < CPU_TYPE_MAX);
   gcc_assert (cpu_model->__cpu_subtype < CPU_SUBTYPE_MAX);
diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
index e68dd656046..1b1846d59b8 100644
--- a/gcc/common/config/i386/i386-cpuinfo.h
+++ b/gcc/common/config/i386/i386-cpuinfo.h
@@ -102,6 +102,7 @@ enum feature_priority
   P_MMX,
   P_SSE,
   P

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Segher Boessenkool
On Thu, Aug 12, 2021 at 10:10:10AM +0800, Kewen.Lin wrote:
> > +  enum rs6000_builtins vname = RS6000_BUILTIN_COUNT;
> > 
> > Using this as a flag value looks unnecessary.  Is this just being done to 
> > silence a warning?
> 
> Good question!  I didn't notice there is a warning or not, just get used to
> initializing variable with one suitable value if possible.  If you don't
> mind, may I still keep it?  Since if some future codes use vname in a path
> where it's not assigned, one explicitly wrong enum (bif) seems better than a
> random one.  Or will this mentioned possibility definitely never happen since
> the current uninitialized variables detection and warning scheme is robust
> and should not worry about that completely?

It is a bad idea to initialise things unnecessarily: it hinders many
optimisations, but much more importantly, it silences warnings without
fixing the problem.

> > +  if (vname != RS6000_BUILTIN_COUNT
> > 
> > Check is not necessary, as you will have returned by now in that case.
> 
> Thanks for catching, I put break for "default" initially, didn't notice the
> following condition needs an adjustment after updating it to early return.
> Will fix it.

Thanks :-)


Segher
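
The point about unnecessary initialisers can be illustrated with a minimal
sketch.  The enum and function names below are hypothetical and are not taken
from the rs6000 patch; the sketch only shows the general shape of a switch
with an early return on the default case, where the variable needs no dummy
initial value.

```c
#include <assert.h>

/* Hypothetical example types; not from the rs6000 patch.  */
enum shape { SHAPE_SQUARE, SHAPE_TRIANGLE, SHAPE_OTHER };

/* Return the number of sides of S, or -1 for shapes not handled here.  */
int
side_count (enum shape s)
{
  int sides;  /* Deliberately left without a dummy initialiser.  */

  switch (s)
    {
    case SHAPE_SQUARE:
      sides = 4;
      break;
    case SHAPE_TRIANGLE:
      sides = 3;
      break;
    default:
      /* Early return: SIDES is never read on this path.  */
      return -1;
    }

  /* Every path that reaches this point has assigned SIDES.  If a later
     change added a case that forgot to assign it, the compiler would have
     a chance to warn; a dummy initialiser would remove that chance.  */
  return sides;
}
```

With the early return in place, a later check against a sentinel value is
redundant as well, which matches the second review comment above.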


[PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Patrick McGehearty via Gcc-patches
This patch resolves the failure of powerpc64 long double complex divide
in native ibm long double format after the patch "Practical improvement
to libgcc complex divide".

The new code uses the following macros which are intended to be mapped
to appropriate values according to the underlying hardware representation.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101104

RBIG      a value near the maximum representation
RMIN      a value near the minimum representation
          (but not in the subnormal range)
RMIN2     a value moderately less than 1
RMINSCAL  the inverse of RMIN2
RMAX2     RBIG * RMIN2 - a value to limit scaling to not overflow

When "long double" values were not using the IEEE 128-bit format but
the traditional IBM 128-bit, the previous code used the LDBL values
which caused overflow for RMINSCAL. The new code uses the DBL values.

RBIG  LDBL_MAX = 0x1.f800p+1022
  DBL_MAX  = 0x1.f000p+1022

RMIN  LDBL_MIN = 0x1.p-969
RMIN  DBL_MIN  = 0x1.p-1022

RMIN2 LDBL_EPSILON = 0x0.1000p-1022 = 0x1.0p-1074
RMIN2 DBL_EPSILON  = 0x1.p-52

RMINSCAL 1/LDBL_EPSILON = inf (1.0p+1074 does not fit in IBM 128-bit).
 1/DBL_EPSILON  = 0x1.p+52

RMAX2 = RBIG * RMIN2 = 0x1.f800p-52
RBIG * RMIN2 = 0x1.f000p+970

The MAX and MIN values have only modest changes since the maximum and
minimum values are about the same as for double precision.  The
EPSILON field is considerably different. Due to how very small values
can be represented in the lower 64 bits of the IBM 128-bit floating
point, EPSILON is extremely small, so far beyond the desired value
that inversion of the value overflows and even without the overflow,
the RMAX2 is so small as to eliminate most usage of the test.

Instead of just replacing the use of KF_EPSILON with DF_EPSILON, we
replace all uses of KF_* with DF_*. Since the exponent fields are
essentially the same, we gain the positive benefits from the new
formula while avoiding all under/overflow issues in the #defines.

The change has been tested on gcc135.fsffrance.org and gains the
expected improvements in accuracy for long double complex divide.

libgcc/
PR target/101104
* config/rs6000/_divkc3.c (RBIG, RMIN, RMIN2, RMINSCAL, RMAX2):
Use more correct values for native IBM 128-bit.
---
 libgcc/config/rs6000/_divkc3.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libgcc/config/rs6000/_divkc3.c b/libgcc/config/rs6000/_divkc3.c
index a1d29d2..2b229c8 100644
--- a/libgcc/config/rs6000/_divkc3.c
+++ b/libgcc/config/rs6000/_divkc3.c
@@ -38,10 +38,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #endif
 
 #ifndef __LONG_DOUBLE_IEEE128__
-#define RBIG   (__LIBGCC_KF_MAX__ / 2)
-#define RMIN   (__LIBGCC_KF_MIN__)
-#define RMIN2  (__LIBGCC_KF_EPSILON__)
-#define RMINSCAL (1 / __LIBGCC_KF_EPSILON__)
+#define RBIG   (__LIBGCC_DF_MAX__ / 2)
+#define RMIN   (__LIBGCC_DF_MIN__)
+#define RMIN2  (__LIBGCC_DF_EPSILON__)
+#define RMINSCAL (1 / __LIBGCC_DF_EPSILON__)
 #define RMAX2  (RBIG * RMIN2)
 #else
 #define RBIG   (__LIBGCC_TF_MAX__ / 2)
-- 
1.8.3.1
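
The overflow of RMINSCAL described in the message above can be reproduced in
miniature.  This is only a sketch under an assumption: plain double is used
as a stand-in, because the message says the exponent range of IBM 128-bit
long double is about the same as that of double.  The helper names are
hypothetical and do not appear in _divkc3.c.

```c
#include <assert.h>
#include <float.h>

/* Return 1 if 1/EPS overflows the range of double, 0 otherwise.
   The volatile store forces the quotient to be rounded to double.  */
int
inverse_overflows (double eps)
{
  volatile double inv = 1.0 / eps;
  return inv > DBL_MAX;
}

/* The scale factor obtained from DBL_EPSILON: exactly 2**52.  */
double
scale_from_dbl_epsilon (void)
{
  return 1.0 / DBL_EPSILON;
}
```

Passing 0x1p-1074 (the value the message gives for LDBL_EPSILON) to
inverse_overflows reports an overflow, since 2**1074 is beyond the largest
finite double, while DBL_EPSILON inverts cleanly to 0x1p+52.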



Re: [PATCH] c++: Implement P0466R5 __cpp_lib_is_layout_compatible compiler helpers [PR101539]

2021-08-12 Thread Jason Merrill via Gcc-patches

On 8/12/21 11:16 AM, Jakub Jelinek wrote:

On Thu, Aug 12, 2021 at 10:33:20AM -0400, Jason Merrill wrote:

The following patch implements __is_layout_compatible trait and
__builtin_is_corresponding_member helper function for the
std::is_corresponding_member template function.
For now it implements the IMHO buggy but
standard definition of layout-compatible and std::is_layout_compatible
requirements (that Jonathan was discussing to change),
including ignoring of alignment differences, mishandling of bitfields in unions
and [[no_unique_address]] issues with empty classes.
Until we know what exactly is decided in a CWG that seems better than trying
to guess what the standard will say, but of course if you have different
ideas, the patch can change.


I think it's clear that if corresponding fields have different offsets or
sizes, their containing types can't plausibly be layout-compatible. And if
two types have different sizes or alignments, they can't be
layout-compatible.

That leaves open the question of whether the presence or absence of no-op
alignment specifiers makes a difference; Richard Smith's proposal would make
that incompatible, I lean the other way, but don't feel strongly about it.


Ok, so you prefer to change layout_compatible_type_p in anticipation of the
future DR.


Yes; if the standard says something nonsensical, I prefer to figure out 
something more sensible to propose as a change.



Given the g++.dg/cpp2a/is-layout-compatible3.C cases, shall that include:
   if (TYPE_ALIGN (type1) != TYPE_ALIGN (type2))
     return false;  /* Types with different alignment aren't layout-compatible.  */
   if (!tree_int_cst_equal (TYPE_SIZE_UNIT (type1), TYPE_SIZE_UNIT (type2)))
     return false;  /* Types with different sizes aren't layout-compatible.  */
cases inside both the ENUMERAL_TYPE and CLASS_TYPE_P ifs (as e.g. the
enumeral types can have the same underlying type including alignment and
size, but the enumeral type itself could have different alignas)?
I think I can't compare TYPE_SIZE_UNIT for the fallthrough same_type_p case
because the type could be array with unspecified bounds and I expect
same_type_p fails if the sizes or alignments are different.


Sounds good.


And then also compare field offsets in struct and say members with different
offsets aren't part of the common initial sequence (that would cover the
struct S {};
struct T {};
struct U { [[no_unique_address]] S a1; [[no_unique_address]] S a2; [[no_unique_address]] S a3; };
struct V { [[no_unique_address]] S b1; [[no_unique_address]] T b2; [[no_unique_address]] S b3; };
case or alignas on the members as opposed to types)?


Yes.


What about DECL_ALIGN?  Shall that be relevant even when it doesn't change
anything further (TYPE_ALIGN of the whole struct is already the same and
field with higher DECL_ALIGN has the same field offset as in another struct
where it has that offset because of the previous fields or is at start)?


I'm inclined to accept that.


And finally, what about the union case?  Shall it check also bitfield vs.
non-bitfield, bitfield size if bitfields?


Yes.


What about [[no_unique_address]]
on the union members, shall that be relevant or not?  And DECL_ALIGN?
E.g. the whole union can have the alignment,
union alignas (16) A { short a; alignas (8) int b; };
vs.
union alignas (16) B { int a; short b; };
All union fields have the same field offset of course (0), but above A::b
has different alignment requirement than B::a.


I'd allow these differences.

Jason
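
The necessary conditions agreed on in this exchange (equal size, equal
alignment, equal offsets and sizes of corresponding fields) can be sketched
in plain C.  This is not the layout_compatible_type_p implementation, and C
has no notion of layout-compatible types; the structs and macro names are
hypothetical and only illustrate the checks being discussed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example types.  C differs from A and B only in the
   alignment requested for its second member.  */
struct A { int x; char y; };
struct B { int p; char q; };
struct C { int p; _Alignas (16) char q; };

/* Whole-type check: size and alignment must agree.  */
#define SAME_SIZE_AND_ALIGN(T1, T2) \
  (sizeof (T1) == sizeof (T2) && _Alignof (T1) == _Alignof (T2))

/* Per-field check: corresponding members must have the same offset
   and the same size.  The null pointer is never dereferenced because
   sizeof does not evaluate its operand here.  */
#define SAME_FIELD_PLACEMENT(T1, F1, T2, F2) \
  (offsetof (T1, F1) == offsetof (T2, F2) \
   && sizeof (((T1 *) 0)->F1) == sizeof (((T2 *) 0)->F2))

/* A and B agree on every check.  */
int
a_b_plausible (void)
{
  return SAME_SIZE_AND_ALIGN (struct A, struct B)
	 && SAME_FIELD_PLACEMENT (struct A, x, struct B, p)
	 && SAME_FIELD_PLACEMENT (struct A, y, struct B, q);
}

/* A and C do not: the member alignment moves q and grows the struct.  */
int
a_c_plausible (void)
{
  return SAME_SIZE_AND_ALIGN (struct A, struct C)
	 && SAME_FIELD_PLACEMENT (struct A, x, struct C, p)
	 && SAME_FIELD_PLACEMENT (struct A, y, struct C, q);
}
```

These checks are only necessary conditions; passing them says nothing about
whether the member types themselves correspond.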



Re: [PATCH v4] gcov: Add TARGET_GCOV_TYPE_SIZE target hook

2021-08-12 Thread Joseph Myers
On Thu, 12 Aug 2021, Sebastian Huber wrote:

> If -fprofile-update=atomic is used, then the target must provide atomic
> operations for the counters of the type returned by get_gcov_type().
> This is a 64-bit type for targets which have a 64-bit long long type.
> On 32-bit targets this could be an issue since they may not provide
> 64-bit atomic operations.  Allow targets to override the default type
> size with the new TARGET_GCOV_TYPE_SIZE target hook.

This is not a review of the patch, but I think this version addresses all 
the issues I had with previous versions regarding target macro/hook 
handling.

-- 
Joseph S. Myers
jos...@codesourcery.com
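
The concern behind the hook can be sketched in user-level C11.  This is not
how TARGET_GCOV_TYPE_SIZE is implemented; the typedef, the fallback to
atomic_long and the function name are assumptions made for illustration.
The sketch picks a 64-bit counter only when 64-bit atomics are always
lock-free on the target.

```c
#include <assert.h>
#include <stdatomic.h>

/* ATOMIC_LLONG_LOCK_FREE is 2 when atomic long long operations are
   always lock-free.  Otherwise fall back to a narrower counter
   (hypothetical policy, chosen only for this sketch).  */
#if ATOMIC_LLONG_LOCK_FREE == 2
typedef atomic_llong counter_t;
#else
typedef atomic_long counter_t;
#endif

/* Zero-initialised because it has static storage duration.  */
static counter_t hits;

/* Atomically bump the counter and return its new value.  */
long long
count_hit (void)
{
  return atomic_fetch_add (&hits, 1) + 1;
}
```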


Re: [C PATCH] Evaluate argument of sizeof that are structs of variable size.

2021-08-12 Thread Joseph Myers
On Thu, 12 Aug 2021, Martin Uecker wrote:

> Evaluate type arguments of sizeof that are structs of variable size [PR101838]
> 
> Evaluate type arguments of sizeof for all types of variable size
> and not just for VLAs. This fixes PR101838 and some issues related 
> to PR29970 where statement expressions need to be evaluated so that
> the size is well defined.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com
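
For reference, the existing VLA behaviour that the patch extends to other
variably sized types can be shown with a small sketch.  It uses a plain VLA
type rather than a struct of variable size (a GNU extension), and the
function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Apply sizeof to a VLA type whose bound has a side effect.  Because
   the operand has variable size, it is evaluated: the size is computed
   from the old value of N, and N is incremented.  Returns the new N.  */
int
bound_after_sizeof (int n, size_t *size_out)
{
  *size_out = sizeof (int[n++]);
  return n;
}
```

Calling bound_after_sizeof (3, &s) leaves s equal to 3 * sizeof (int) and
returns 4, which shows that the operand was evaluated.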


Re: [PATCH 47/55] rs6000: Builtin expansion, part 4

2021-08-12 Thread Bill Schmidt via Gcc-patches

Hi Segher,

On 8/3/21 7:34 PM, Segher Boessenkool wrote:

Whoops, I forgot some stuff:

On Tue, Jul 27, 2021 at 04:06:49PM -0500, will schmidt wrote:

On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:

  static rtx
  ldv_expand_builtin (rtx target, insn_code icode, rtx *op, machine_mode tmode)
  {
+  rtx pat, addr;
+  bool blk = (icode == CODE_FOR_altivec_lvlx
+ || icode == CODE_FOR_altivec_lvlxl
+ || icode == CODE_FOR_altivec_lvrx
+ || icode == CODE_FOR_altivec_lvrxl);
+
+  if (target == 0
+  || GET_MODE (target) != tmode
+  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))

No space after "!" ?  (here and later on 'pat'.).

It can be written as just
   || !insn_data[icode].operand[0].predicate (target, tmode))
even.  The * is completely optional, and you don't need the extra parens
without it.
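
A minimal sketch of why the two spellings are equivalent.  The table and
predicate below are hypothetical stand-ins, not the real insn_data
structures.

```c
#include <assert.h>

/* Hypothetical stand-ins for an operand table with predicate hooks.  */
typedef int (*predicate_fn) (int);
struct operand_data { predicate_fn predicate; };

static int
is_positive (int v)
{
  return v > 0;
}

static const struct operand_data table[] = { { is_positive } };

/* Old idiom: explicit dereference, which needs the extra parentheses.  */
int
call_explicit (int v)
{
  return (*table[0].predicate) (v);
}

/* Same call without the dereference: a function pointer can be called
   directly.  */
int
call_plain (int v)
{
  return table[0].predicate (v);
}

/* The call binds tighter than "!", so no parentheses are needed.  */
int
rejects (int v)
{
  return !table[0].predicate (v);
}
```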



Agreed.  This is copied from an idiom that exists throughout the file, 
so I plan to handle this by adding a style patch to clean it up everywhere.


Bill




Segher


Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Joseph Myers
On Thu, 12 Aug 2021, Patrick McGehearty via Gcc-patches wrote:

> This patch resolves the failure of powerpc64 long double complex divide
> in native ibm long double format after the patch "Practical improvement
> to libgcc complex divide".

This description is not consistent with the patch.

__divkc3 should always be using IEEE binary128 format, not IBM long 
double.  If this code is being built for IBM long double, something is 
wrong somewhere else.

Using the DFmode values probably makes sense for IBM long double, but not 
for IEEE binary128.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Patrick McGehearty via Gcc-patches

The key code in _divkc3.c is:

#ifndef __LONG_DOUBLE_IEEE128__
#define RBIG   (__LIBGCC_DF_MAX__ / 2)
#define RMIN   (__LIBGCC_DF_MIN__)
#define RMIN2  (__LIBGCC_DF_EPSILON__)
#define RMINSCAL (1 / __LIBGCC_DF_EPSILON__)
#define RMAX2  (RBIG * RMIN2)
#else
#define RBIG   (__LIBGCC_TF_MAX__ / 2)
#define RMIN   (__LIBGCC_TF_MIN__)
#define RMIN2  (__LIBGCC_TF_EPSILON__)
#define RMINSCAL (1 / __LIBGCC_TF_EPSILON__)
#define RMAX2  (RBIG * RMIN2)
#endif

I added this code based on your comment of 4/20/2021:

-
> This file includes quad-float128.h, which does some remapping from TF to
> KF depending on __LONG_DOUBLE_IEEE128__.
>
> I think you probably need to have a similar __LONG_DOUBLE_IEEE128__
> conditional here.  If __LONG_DOUBLE_IEEE128__ is not defined, use
> __LIBGCC_KF_* macros instead of __LIBGCC_TF_*; if __LONG_DOUBLE_IEEE128__
> is defined, use __LIBGCC_TF_* as above.  (Unless the powerpc maintainers
> say otherwise.)
-
The KF version fails when in IBM128 mode while the DF version works
for that mode.

My understanding of ibm FP mode build procedure is minimal,
but it seems that the _divkc3.c routine is built for both IEEE128
and IBM128 modes.

- patrick



On 8/12/2021 11:19 AM, Joseph Myers wrote:

On Thu, 12 Aug 2021, Patrick McGehearty via Gcc-patches wrote:


This patch resolves the failure of powerpc64 long double complex divide
in native ibm long double format after the patch "Practical improvement
to libgcc complex divide".

This description is not consistent with the patch.

__divkc3 should always be using IEEE binary128 format, not IBM long
double.  If this code is being built for IBM long double, something is
wrong somewhere else.

Using the DFmode values probably makes sense for IBM long double, but not
for IEEE binary128.





Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Joseph Myers
On Thu, 12 Aug 2021, Patrick McGehearty via Gcc-patches wrote:

> > This file includes quad-float128.h, which does some remapping from TF to
> > KF depending on __LONG_DOUBLE_IEEE128__.
> >
> > I think you probably need to have a similar __LONG_DOUBLE_IEEE128__
> > conditional here.  If __LONG_DOUBLE_IEEE128__ is not defined, use
> > __LIBGCC_KF_* macros instead of __LIBGCC_TF_*; if __LONG_DOUBLE_IEEE128__
> > is defined, use __LIBGCC_TF_* as above.  (Unless the powerpc maintainers
> > say otherwise.)
> -
> The KF version fails when in IBM128 mode while the DF version works
> for that mode.

KFmode should always be IEEE binary128.  IFmode should always be IBM long 
double.  TFmode may be one or the other depending on command-line options.

"in IBM128 mode" should mean that the compiler defaults to long double 
being IBM long double and TFmode being IBM long double.  But in that mode, 
KFmode should still be IEEE binary128 and it should still be correct to 
use the KF constants in this file.

> My understanding of ibm FP mode build procedure is minimal,
> but it seems that the _divkc3.c routine is built for both IEEE128
> and IBM128 modes.

If built for IBM128 mode (i.e., compiler defaults to TFmode = IBM long 
double), it should still build a function __divkc3 which takes IEEE 
binary128 arguments and uses IEEE binary128 (KFmode) constants.

If you were changing the L_divtc3 case in libgcc2.c to use different 
constants in the case where TFmode is IBM long double, that would make 
sense to me.  It's changing an IEEE-only file for an IBM long double issue 
that doesn't make sense.  If this change causes some test using IBM long 
double to pass where it failed before, that indicates a build system 
problem somewhere.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C PATCH] qualifiers of pointers to arrays in C2X [PR 98397]

2021-08-12 Thread Joseph Myers
On Mon, 24 May 2021, Uecker, Martin wrote:

> -  else if (VOID_TYPE_P (TREE_TYPE (type1))
> -&& !TYPE_ATOMIC (TREE_TYPE (type1)))
> - {
> -   if ((TREE_CODE (TREE_TYPE (type2)) == ARRAY_TYPE)
> -   && (TYPE_QUALS (strip_array_types (TREE_TYPE (type2)))
> -   & ~TYPE_QUALS (TREE_TYPE (type1))))
> - warning_at (colon_loc, OPT_Wdiscarded_array_qualifiers,
> - "pointer to array loses qualifier "
> - "in conditional expression");
> -
> -   if (TREE_CODE (TREE_TYPE (type2)) == FUNCTION_TYPE)
> +  else if ((VOID_TYPE_P (TREE_TYPE (type1))
> + && !TYPE_ATOMIC (TREE_TYPE (type1)))
> +|| (VOID_TYPE_P (TREE_TYPE (type2))
> +&& !TYPE_ATOMIC (TREE_TYPE (type2))))

Here you're unifying the two cases where one argument is (not a null 
pointer constant and) a pointer to qualified or unqualified void (and the 
other argument is not a pointer to qualified or unqualified void).  The 
!TYPE_ATOMIC checks are because of the general rule that _Atomic is a type 
qualifier only syntactically, so _Atomic void doesn't count as qualified 
void for this purpose.

> + {
> +   tree t1 = TREE_TYPE (type1);
> +   tree t2 = TREE_TYPE (type2);
> +   if (!VOID_TYPE_P (t1))
> +{
> +  /* roles are swapped */
> +  t1 = t2;
> +  t2 = TREE_TYPE (type1);
> +}

But here you don't have a TYPE_ATOMIC check before swapping.  So if t1 is 
_Atomic void and t2 is void, the types don't get swapped.

> +   /* for array, use qualifiers of element type */
> +   if (flag_isoc2x)
> + t2 = t2_stripped;
> +   result_type = build_pointer_type (qualify_type (t1, t2));

And then it looks to me like this will end up with _Atomic void * as the 
result type, when a conditional expression between _Atomic void * and 
void * should actually have type void *.

If that's indeed the case, I think the swapping needs to occur whenever t1 
is not *non-atomic* void, so that the condition for swapping matches the 
condition checked in the outer if.  (And of course there should be a 
testcase for that.)

I didn't see any other issues in this version of the patch.

-- 
Joseph S. Myers
jos...@codesourcery.com
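
The rule the review relies on, that the result type must not depend on which
operand is the pointer to void, can be checked from user code with _Generic.
This sketch covers only the non-atomic cases; it does not exercise the
_Atomic void case raised above, and the macro and function names are
hypothetical.

```c
#include <assert.h>

/* Classify the type of an expression.  */
#define IS_VOID_PTR(e)       _Generic ((e), void *: 1, default: 0)
#define IS_CONST_VOID_PTR(e) _Generic ((e), const void *: 1, default: 0)

/* void * versus int *: the result is void *.  */
int
plain_case (int c, void *vp, int *ip)
{
  return IS_VOID_PTR (c ? vp : ip);
}

/* void * versus const int *: the result picks up the qualifier.  */
int
qualified_case (int c, void *vp, const int *cip)
{
  return IS_CONST_VOID_PTR (c ? vp : cip);
}

/* Same operands in the other order: the result type is unchanged.  */
int
swapped_case (int c, const int *cip, void *vp)
{
  return IS_CONST_VOID_PTR (c ? cip : vp);
}
```

The variables are function parameters rather than literal zeros so that the
null pointer constant rule does not apply.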


[PATCH] c++, v2: Implement P0466R5 __cpp_lib_is_layout_compatible compiler helpers [PR101539]

2021-08-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 12, 2021 at 12:06:33PM -0400, Jason Merrill wrote:
> Yes; if the standard says something nonsensical, I prefer to figure out
> something more sensible to propose as a change.

Ok, so here it is implemented, so far tested only on the new testcases
(but nothing else really uses the code that has changed since the last
patch).  Also attached incremental diff (so that it is also clear
what test behaviors changed).
I'll of course bootstrap/regtest it full tonight.

2021-08-12  Jakub Jelinek  

PR c++/101539
gcc/c-family/
* c-common.h (enum rid): Add RID_IS_LAYOUT_COMPATIBLE.
* c-common.c (c_common_reswords): Add __is_layout_compatible.
gcc/cp/
* cp-tree.h (enum cp_trait_kind): Add CPTK_IS_LAYOUT_COMPATIBLE.
(enum cp_built_in_function): Add CP_BUILT_IN_IS_CORRESPONDING_MEMBER.
(fold_builtin_is_corresponding_member, layout_compatible_type_p):
Declare.
* parser.c (cp_parser_primary_expression): Handle
RID_IS_LAYOUT_COMPATIBLE.
(cp_parser_trait_expr): Likewise.
* cp-objcp-common.c (names_builtin_p): Likewise.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_LAYOUT_COMPATIBLE.
* decl.c (cxx_init_decl_processing): Register
__builtin_is_corresponding_member builtin.
* constexpr.c (cxx_eval_builtin_function_call): Handle
CP_BUILT_IN_IS_CORRESPONDING_MEMBER builtin.
* semantics.c (is_corresponding_member_union,
is_corresponding_member_aggr, fold_builtin_is_corresponding_member):
New functions.
(trait_expr_value): Handle CPTK_IS_LAYOUT_COMPATIBLE.
(finish_trait_expr): Likewise.
* typeck.c (layout_compatible_type_p): New function.
* cp-gimplify.c (cp_gimplify_expr): Fold
CP_BUILT_IN_IS_CORRESPONDING_MEMBER.
(cp_fold): Likewise.
* tree.c (builtin_valid_in_constant_expr_p): Handle
CP_BUILT_IN_IS_CORRESPONDING_MEMBER.
* cxx-pretty-print.c (pp_cxx_trait_expression): Handle
CPTK_IS_LAYOUT_COMPATIBLE.
* class.c (remove_zero_width_bit_fields): Remove.
(layout_class_type): Don't call it.
gcc/testsuite/
* g++.dg/cpp2a/is-corresponding-member1.C: New test.
* g++.dg/cpp2a/is-corresponding-member2.C: New test.
* g++.dg/cpp2a/is-corresponding-member3.C: New test.
* g++.dg/cpp2a/is-corresponding-member4.C: New test.
* g++.dg/cpp2a/is-corresponding-member5.C: New test.
* g++.dg/cpp2a/is-corresponding-member6.C: New test.
* g++.dg/cpp2a/is-corresponding-member7.C: New test.
* g++.dg/cpp2a/is-corresponding-member8.C: New test.
* g++.dg/cpp2a/is-layout-compatible1.C: New test.
* g++.dg/cpp2a/is-layout-compatible2.C: New test.
* g++.dg/cpp2a/is-layout-compatible3.C: New test.

--- gcc/c-family/c-common.h.jj  2021-08-12 18:14:29.235853657 +0200
+++ gcc/c-family/c-common.h 2021-08-12 18:21:01.141484689 +0200
@@ -173,7 +173,8 @@ enum rid
   RID_IS_ABSTRACT, RID_IS_AGGREGATE,
   RID_IS_BASE_OF,  RID_IS_CLASS,
   RID_IS_EMPTY,    RID_IS_ENUM,
-  RID_IS_FINAL,    RID_IS_LITERAL_TYPE,
+  RID_IS_FINAL,    RID_IS_LAYOUT_COMPATIBLE,
+  RID_IS_LITERAL_TYPE,
   RID_IS_POINTER_INTERCONVERTIBLE_BASE_OF,
   RID_IS_POD,  RID_IS_POLYMORPHIC,
   RID_IS_SAME_AS,
--- gcc/c-family/c-common.c.jj  2021-08-03 00:44:32.762494219 +0200
+++ gcc/c-family/c-common.c 2021-08-12 18:21:01.143484661 +0200
@@ -420,6 +420,7 @@ const struct c_common_resword c_common_r
   { "__is_empty",  RID_IS_EMPTY,   D_CXXONLY },
   { "__is_enum",   RID_IS_ENUM,    D_CXXONLY },
   { "__is_final",  RID_IS_FINAL,   D_CXXONLY },
+  { "__is_layout_compatible", RID_IS_LAYOUT_COMPATIBLE, D_CXXONLY },
   { "__is_literal_type", RID_IS_LITERAL_TYPE, D_CXXONLY },
   { "__is_pointer_interconvertible_base_of",
RID_IS_POINTER_INTERCONVERTIBLE_BASE_OF, D_CXXONLY },
--- gcc/cp/cp-tree.h.jj 2021-08-12 09:34:16.817236456 +0200
+++ gcc/cp/cp-tree.h	2021-08-12 18:21:01.144484647 +0200
@@ -1365,6 +1365,7 @@ enum cp_trait_kind
   CPTK_IS_EMPTY,
   CPTK_IS_ENUM,
   CPTK_IS_FINAL,
+  CPTK_IS_LAYOUT_COMPATIBLE,
   CPTK_IS_LITERAL_TYPE,
   CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF,
   CPTK_IS_POD,
@@ -6358,6 +6359,7 @@ struct GTY((chain_next ("%h.next"))) tin
 enum cp_built_in_function {
   CP_BUILT_IN_IS_CONSTANT_EVALUATED,
   CP_BUILT_IN_INTEGER_PACK,
+  CP_BUILT_IN_IS_CORRESPONDING_MEMBER,
   CP_BUILT_IN_IS_POINTER_INTERCONVERTIBLE_WITH_CLASS,
   CP_BUILT_IN_SOURCE_LOCATION,
   CP_BUILT_IN_LAST
@@ -7574,6 +7576,7 @@ extern tree baselink_for_fns
 extern void finish_static_assert(tree, tree, location_t,
 bool, bool);
 extern tree finish_decltype_type(tree, bool, tsubst_flags_t);
+extern tree fold_builtin_is_corresponding_member (location_t, int, tree *);
 exter

Re: [PATCH] i386: support micro-levels in target{, _clone} attrs [PR101696]

2021-08-12 Thread H.J. Lu via Gcc-patches
On Thu, Aug 12, 2021 at 8:31 AM Martin Liška  wrote:
>
> On 8/12/21 5:26 PM, H.J. Lu wrote:
> > Will it hurt if they have proper feature_priorities you added?
>
> No. They are unused, but we should use the proper priorities.
>
> Martin

+  const char *arch_name = attrs_str + strlen ("arch=");
+  if (startswith (arch_name, "x86-64"))
+ {
+   arg_str = arch_name;
+   builtin_fn = IX86_BUILTIN_CPU_SUPPORTS;
+   if (strcmp (arch_name, "x86-64") == 0)
+ priority = P_X86_64_BASELINE;
+   else if (strcmp (arch_name, "x86-64-v2") == 0)
+ priority = P_X86_64_V2;
+   else if (strcmp (arch_name, "x86-64-v3") == 0)
+ priority = P_X86_64_V3;
+   else if (strcmp (arch_name, "x86-64-v4") == 0)
+ priority = P_X86_64_V4;

What happens for arch=x86-64-v5?

+ }

-- 
H.J.
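
One way to make the unknown-level case explicit is a table lookup with a
distinct "not recognised" result.  This is a sketch, not the i386 back-end
code: the enum, its values and the function name are hypothetical, and how
the actual patch treats arch=x86-64-v5 is not stated in this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical priorities; P_NONE marks an unrecognised name.  */
enum level_priority { P_NONE = 0, P_BASELINE, P_V2, P_V3, P_V4 };

/* Map an x86-64 micro-architecture level name to a priority.  Names
   that are not in the table, such as "x86-64-v5", yield P_NONE so the
   caller can diagnose them instead of using an unset priority.  */
enum level_priority
micro_level_priority (const char *arch_name)
{
  static const struct
  {
    const char *name;
    enum level_priority prio;
  } levels[] = {
    { "x86-64", P_BASELINE },
    { "x86-64-v2", P_V2 },
    { "x86-64-v3", P_V3 },
    { "x86-64-v4", P_V4 },
  };

  for (size_t i = 0; i < sizeof levels / sizeof levels[0]; i++)
    if (strcmp (arch_name, levels[i].name) == 0)
      return levels[i].prio;

  return P_NONE;
}
```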


Re: [PATCH] Remove legacy back threader.

2021-08-12 Thread Jeff Law via Gcc-patches




On 8/12/2021 8:34 AM, Aldy Hernandez wrote:

PING

On Thu, Aug 5, 2021 at 11:48 AM Aldy Hernandez  wrote:

At this point I don't see any use for the legacy mode, which I had
originally left in place during the transition.

This patch removes the legacy back threader, and cleans up the code a
bit.  There are no functional changes to the non-legacy code.

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

 * doc/invoke.texi: Remove docs for threader-mode param.
 * flag-types.h (enum threader_mode): Remove.
 * params.opt: Remove threader-mode param.
 * tree-ssa-threadbackward.c (class back_threader): Remove
 path_is_unreachable_p.
 Make find_paths private.
 Add maybe_thread and thread_through_all_blocks.
 Remove reference marker for m_registry.
 Remove reference marker for m_profit.
 (back_threader::back_threader): Adjust for registry and profit not
 being references.
 (dump_path): Move down.
 (debug): Move down.
 (class thread_jumps): Remove.
 (class back_threader_registry): Remove m_all_paths.
 Remove destructor.
 (thread_jumps::thread_through_all_blocks): Move to back_threader
 class.
 (fsm_find_thread_path): Remove
 (back_threader::maybe_thread): New.
 (back_threader::thread_through_all_blocks): Move from
 thread_jumps.
 (back_threader_registry::back_threader_registry): Remove
 m_all_paths.
 (back_threader_registry::~back_threader_registry): Remove.
 (thread_jumps::find_taken_edge): Remove.
 (thread_jumps::check_subpath_and_update_thread_path): Remove.
 (thread_jumps::maybe_register_path): Remove.
 (thread_jumps::handle_phi): Remove.
 (handle_assignment_p): Remove.
 (thread_jumps::handle_assignment): Remove.
 (thread_jumps::fsm_find_control_statement_thread_paths): Remove.
 (thread_jumps::find_jump_threads_backwards): Remove.
 (thread_jumps::find_jump_threads_backwards_with_ranger): Remove.
 (try_thread_blocks): Rename find_jump_threads_backwards to
 maybe_thread.
 (pass_early_thread_jumps::execute): Same.

gcc/testsuite/ChangeLog:

 * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Remove call into the legacy
 code and adjust for ranger threader.

Sorry, I thought I'd already pre-approved this :-)

OK
jeff



Re: [PATCH] Fix typo in fold-vec-load-builtin_vec_xl-* tests.

2021-08-12 Thread Michael Meissner via Gcc-patches
On Sun, Aug 08, 2021 at 03:21:02PM -0500, Segher Boessenkool wrote:
> On Thu, Aug 05, 2021 at 10:44:36PM -0400, Michael Meissner wrote:
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c: Fix
> > typo in regular expression.
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c:
> > Likewise.
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c:
> > Likewise.
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c:
> > Likewise.
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c:
> > Likewise.
> > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c:
> > Likewise.
> 
> Please don't break changelog lines unnecessarily.

Unfortunately with the length of the filenames, and the 79 character limit per
line, there isn't much option.

> 
> This fixes typos, so the tests failed before, and now pass?
> 
> > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c
> > @@ -28,4 +28,4 @@ BUILD_VAR_TEST( test4, vector float, signed long long, vector float);
> >  BUILD_VAR_TEST( test5, vector float, signed int, vector float);
> >  BUILD_CST_TEST( test6, vector float, 12, vector float);
> >  
> > -/* { dg-final { scan-assembler-times {\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mp?lvx\M} 6 } } */
> > +/* { dg-final { scan-assembler-times {\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mp?lxv\M} 6 } } */
> 
> You can write
>   {\mlxvw4x\M|\mlxvd2x\M|\mp?lxvx?\M}
> instead, or even
>   {\mp?lxv}
> This would be a useful future cleanup: it makes these tests both more
> readable and lower maintenance.  What you test here is how many vector
> loads there are, and the specific kind of vector load is immaterial in
> this test.
> 
> This also makes it clear you are now disallowing "lvx" here.  Is that on
> purpose?  Is there any reason we would not allow it here?  We do not
> *expect* it of course, but depending on that means we will have more
> patches to this testcase later.  So maybe something like
>   {\mp?lxv|\mlvx\M}

Ok, though the pattern probably should be:

{\mp?lxvx?|\mlvx\M}

> Okay for trunk, thanks!  Please think about making these tests more
> robust :-)
> 
> 
> Segher

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] rs6000: Fix ICE expanding lxvp and stxvp gimple built-ins [PR101849]

2021-08-12 Thread David Edelsohn via Gcc-patches
On Tue, Aug 10, 2021 at 7:37 PM Peter Bergner  wrote:
>
> PR101849 shows we ICE on a test case when we pass a non __vector_pair *
> pointer to the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins
> that is cast to __vector_pair *.  The problem is that when we expand
> the built-in, the cast has already been removed from gimple and we are
> only given the base pointer.  The solution used here (which fixes the ICE)
> is to catch this case and convert the pointer to a __vector_pair * pointer
> when expanding the built-in.
>
> This passed bootstrap and regression testing on powerpc64le-linux with
> no regressions.  Ok for mainline?  This also affects GCC 11 and 10, so
> ok there too after it has baked on trunk for a few days?
>
> Peter
>
>
> gcc/
> PR target/101849
> * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Cast
> pointer to __vector_pair *.
>
> gcc/testsuite/
> PR target/101849
> * gcc.target/powerpc/pr101849.c: New test.

Okay.

Thanks, David


[committed] libstdc++: Add [[nodiscard]] to experimental::randint

2021-08-12 Thread Jonathan Wakely via Gcc-patches
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/experimental/random (experimental::randint): Add
nodiscard attribute.

Tested powerpc64le-linux. Committed to trunk.

commit 20ce14c7991fbb498e32a0f5e3b01ae88c9f5e9a
Author: Jonathan Wakely 
Date:   Thu Aug 12 18:05:24 2021

libstdc++: Add [[nodiscard]] to experimental::randint

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/experimental/random (experimental::randint): Add
nodiscard attribute.

diff --git a/libstdc++-v3/include/experimental/random b/libstdc++-v3/include/experimental/random
index 2c2b359ff41..d7431e33e98 100644
--- a/libstdc++-v3/include/experimental/random
+++ b/libstdc++-v3/include/experimental/random
@@ -50,6 +50,7 @@ inline namespace fundamentals_v2 {
 
   // 13.2.2.1, Function template randint
   template<typename _IntType>
+[[__nodiscard__]]
 inline _IntType
 randint(_IntType __a, _IntType __b)
 {


[committed] libstdc++: Make some #error strings consistent with other tests

2021-08-12 Thread Jonathan Wakely via Gcc-patches
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/lerp.cc: Add header name to #error.
* testsuite/26_numerics/midpoint/integral.cc: Likewise.
* testsuite/26_numerics/midpoint/version.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

commit b1c0e8599aa6ff5550dc748679e13c1eb492ee2c
Author: Jonathan Wakely 
Date:   Thu Aug 12 18:02:40 2021

libstdc++: Make some #error strings consistent with other tests

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/lerp.cc: Add header name to #error.
* testsuite/26_numerics/midpoint/integral.cc: Likewise.
* testsuite/26_numerics/midpoint/version.cc: New test.

diff --git a/libstdc++-v3/testsuite/26_numerics/lerp.cc b/libstdc++-v3/testsuite/26_numerics/lerp.cc
index e456b8203a5..d74b745abb9 100644
--- a/libstdc++-v3/testsuite/26_numerics/lerp.cc
+++ b/libstdc++-v3/testsuite/26_numerics/lerp.cc
@@ -21,9 +21,9 @@
 #include 
 
 #ifndef __cpp_lib_interpolate
-# error "Feature-test macro for midpoint and lerp missing"
+# error "Feature-test macro for midpoint and lerp missing in "
 #elif __cpp_lib_interpolate != 201902L
-# error "Feature-test macro for midpoint and lerp has wrong value"
+# error "Feature-test macro for midpoint and lerp has wrong value in "
 #endif
 
 #include 
diff --git a/libstdc++-v3/testsuite/26_numerics/midpoint/integral.cc b/libstdc++-v3/testsuite/26_numerics/midpoint/integral.cc
index 1094b668144..d74279ea4b3 100644
--- a/libstdc++-v3/testsuite/26_numerics/midpoint/integral.cc
+++ b/libstdc++-v3/testsuite/26_numerics/midpoint/integral.cc
@@ -21,9 +21,9 @@
 #include 
 
 #ifndef __cpp_lib_interpolate
-# error "Feature-test macro for midpoint and lerp missing"
+# error "Feature-test macro for midpoint and lerp missing in "
 #elif __cpp_lib_interpolate != 201902L
-# error "Feature-test macro for midpoint and lerp has wrong value"
+# error "Feature-test macro for midpoint and lerp has wrong value in "
 #endif
 
 #include 
diff --git a/libstdc++-v3/testsuite/26_numerics/midpoint/version.cc b/libstdc++-v3/testsuite/26_numerics/midpoint/version.cc
new file mode 100644
index 000..3ccb032bc67
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/midpoint/version.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++2a" }
+// { dg-do preprocess { target c++2a } }
+
+#include 
+
+#ifndef __cpp_lib_interpolate
+# error "Feature-test macro for midpoint and lerp missing in "
+#elif __cpp_lib_interpolate != 201902L
+# error "Feature-test macro for midpoint and lerp has wrong value in "
+#endif


[committed] libstdc++: Add additional overload of std::lerp [PR101870]

2021-08-12 Thread Jonathan Wakely via Gcc-patches
The [cmath.syn] p1 wording about additional overloads sufficient to
handle any arithmetic types also applies to std::lerp. This adds a new
overload of std::lerp that does the required promotions to support
arguments of arbitrary arithmetic types.

A new __promoted_t alias template is added, which the C++17 function
templates std::hypot and std::lerp can use to avoid instantiating the
__promote_3 class template.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/101870
* include/c_global/cmath (hypot): Use __promoted_t.
(lerp): Add new overload accepting any arithmetic types.
* include/ext/type_traits.h (__promoted_t): New alias template.
* testsuite/26_numerics/lerp.cc: Moved to...
* testsuite/26_numerics/lerp/1.cc: ...here.
* testsuite/26_numerics/lerp/constexpr.cc: New test.
* testsuite/26_numerics/lerp/version.cc: New test.

Tested powerpc64le-linux. Committed to trunk.
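
To illustrate the promotion idea described above, here is a minimal sketch rather than the libstdc++ code. The names promote_one_t, promoted_t and lerp_any are hypothetical stand-ins, and the interpolation uses the naive formula a + t * (b - a) instead of the careful std::__lerp implementation, so that it does not depend on C++20 library support.

```cpp
#include <type_traits>

// Hypothetical stand-in for the internal __promote trait: integral types
// promote to double, floating-point types keep their own type.
template<typename T>
  using promote_one_t = std::conditional_t<std::is_integral_v<T>, double, T>;

// Adding one value of each promoted type lets the usual arithmetic
// conversions pick the common type, as the fold expression in the patch does.
template<typename... Ts>
  using promoted_t = decltype((promote_one_t<Ts>(0) + ...));

// Sketch of an overload accepting any mix of arithmetic types.
// Assumption: the naive formula is good enough for illustration.
template<typename T, typename U, typename V>
  constexpr promoted_t<T, U, V>
  lerp_any(T a, U b, V t) noexcept
  {
    static_assert(std::is_arithmetic_v<T> && std::is_arithmetic_v<U>
                  && std::is_arithmetic_v<V>);
    using type = promoted_t<T, U, V>;
    return type(a) + type(t) * (type(b) - type(a));
  }
```

With this sketch, lerp_any(0, 10, 0.5) is evaluated in double, while three float arguments stay in float.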

commit 9017326e19fe278d5f62898cca4682b17f8e8e07
Author: Jonathan Wakely 
Date:   Thu Aug 12 17:35:25 2021

libstdc++: Add additional overload of std::lerp [PR101870]

The [cmath.syn] p1 wording about additional overloads sufficient to
handle any arithmetic types also applies to std::lerp. This adds a new
overload of std::lerp that does the required promotions to support
arguments of arbitrary arithmetic types.

A new __promoted_t alias template is added, which the C++17 function
templates std::hypot and std::lerp can use to avoid instantiating the
__promote_3 class template.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/101870
* include/c_global/cmath (hypot): Use __promoted_t.
(lerp): Add new overload accepting any arithmetic types.
* include/ext/type_traits.h (__promoted_t): New alias template.
* testsuite/26_numerics/lerp.cc: Moved to...
* testsuite/26_numerics/lerp/1.cc: ...here.
* testsuite/26_numerics/lerp/constexpr.cc: New test.
* testsuite/26_numerics/lerp/version.cc: New test.

diff --git a/libstdc++-v3/include/c_global/cmath 
b/libstdc++-v3/include/c_global/cmath
index 39a6b036b8c..3233228fdee 100644
--- a/libstdc++-v3/include/c_global/cmath
+++ b/libstdc++-v3/include/c_global/cmath
@@ -1844,7 +1844,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif // _GLIBCXX_USE_C99_MATH_TR1
 #endif // C++11
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 
   // [c.math.hypot3], three-dimensional hypotenuse
 #define __cpp_lib_hypot 201603
@@ -1877,15 +1877,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return std::__hypot3(__x, __y, __z); }
 
   template
-typename __gnu_cxx::__promote_3<_Tp, _Up, _Vp>::__type
+__gnu_cxx::__promoted_t<_Tp, _Up, _Vp>
 hypot(_Tp __x, _Up __y, _Vp __z)
 {
-  using __type = typename __gnu_cxx::__promote_3<_Tp, _Up, _Vp>::__type;
+  using __type = __gnu_cxx::__promoted_t<_Tp, _Up, _Vp>;
   return std::__hypot3<__type>(__x, __y, __z);
 }
 #endif // C++17
 
-#if __cplusplus > 201703L
+#if __cplusplus >= 202002L
   // linear interpolation
 # define __cpp_lib_interpolate 201902L
 
@@ -1918,6 +1918,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr long double
   lerp(long double __a, long double __b, long double __t) noexcept
   { return std::__lerp(__a, __b, __t); }
+
+  template
+constexpr __gnu_cxx::__promoted_t<_Tp, _Up, _Vp>
+lerp(_Tp __x, _Up __y, _Vp __z) noexcept
+{
+  using __type = __gnu_cxx::__promoted_t<_Tp, _Up, _Vp>;
+  return std::__lerp<__type>(__x, __y, __z);
+}
 #endif // C++20
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/include/ext/type_traits.h 
b/libstdc++-v3/include/ext/type_traits.h
index ff067c83f62..065edb4e9a5 100644
--- a/libstdc++-v3/include/ext/type_traits.h
+++ b/libstdc++-v3/include/ext/type_traits.h
@@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return true; }
 #endif
 
-  // For complex and cmath
+  // For arithmetic promotions in  and 
+
   template::__value>
 struct __promote
 { typedef double __type; };
@@ -187,6 +188,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __promote
 { typedef float __type; };
 
+#if __cpp_fold_expressions
+  template
+using __promoted_t = decltype((typename __promote<_Tp>::__type(0) + ...));
+#endif
+
   template::__type,
typename _Up2 = typename __promote<_Up>::__type>
diff --git a/libstdc++-v3/testsuite/26_numerics/lerp.cc 
b/libstdc++-v3/testsuite/26_numerics/lerp/1.cc
similarity index 100%
rename from libstdc++-v3/testsuite/26_numerics/lerp.cc
rename to libstdc++-v3/testsuite/26_numerics/lerp/1.cc
diff --git a/libstdc++-v3/testsuite/26_numerics/lerp/constexpr.cc 
b/libstdc++-v3/testsuite/26_numerics/lerp/constexpr.cc
new file mode 100644
index 000

Fix condition testing void functions in ipa-split

2021-08-12 Thread Jan Hubicka
Hi,
while looking into the code I noticed the following thinko.
VOID_TYPE_P (TREE_TYPE (current_function_decl)) is always false since
TREE_TYPE (current_function_decl) is either function_type or
method_type.  One extra TREE_TYPE is needed to get to type of return
value.

Bootstrapped/regtested x86_64-linux. Committed.
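
The thinko can be shown with a toy model. This is only a sketch: Node, Code and the helper functions below are hypothetical and are not GCC's tree representation or macros. The point is that the type of a function declaration is a function type, and the return type sits one level further down.

```cpp
#include <cassert>

// Toy node kinds; hypothetical, not GCC's tree codes.
enum class Code { VoidType, IntegerType, FunctionType, FunctionDecl };

struct Node {
  Code code;
  const Node *type;  // decl: its type; function type: its return type
};

inline const Node *tree_type(const Node *n)
{
  assert(n != nullptr);
  return n->type;
}

inline bool void_type_p(const Node *n) { return n->code == Code::VoidType; }

// Old check: looks at the function type itself, which is never void.
inline bool returns_void_buggy(const Node *fndecl)
{ return void_type_p(tree_type(fndecl)); }

// Fixed check: one extra step reaches the return type.
inline bool returns_void_fixed(const Node *fndecl)
{ return void_type_p(tree_type(tree_type(fndecl))); }
```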

gcc/ChangeLog:

2021-08-12  Jan Hubicka  

* ipa-split.c (consider_split): Fix condition testing void functions.

diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index 5e918ee3fbf..c68577d04a9 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -546,8 +546,9 @@ consider_split (class split_point *current, bitmap 
non_ssa_vars,
}
}
 }
-  if (!VOID_TYPE_P (TREE_TYPE (current_function_decl)))
-call_overhead += estimate_move_cost (TREE_TYPE (current_function_decl),
+  if (!VOID_TYPE_P (TREE_TYPE (TREE_TYPE (current_function_decl
+call_overhead += estimate_move_cost (TREE_TYPE (TREE_TYPE
+(current_function_decl)),
 false);
 
   if (current->split_size <= call_overhead)


Introduce EAF_NOREAD and cleanup EAF_UNUSED + ipa-modref

2021-08-12 Thread Jan Hubicka
Hi,
this patch adds EAF_NOREAD (as discussed on IRC earlier) and fixes the meaning
of EAF_UNUSED to be really unused and not "does not escape, is not clobbered,
read or returned", since we have separate flags for each of the properties
now.

The number of flags has grown, so I refactored the code a bit to avoid
repeated uses of complex flag combinations and also simplified the logic of
merging.

Merging is a bit tricky since we have flags that imply other flags (like
NOESCAPE implies NODIRECTESCAPE) but also code that sets only NOESCAPE and
not NODIRECTESCAPE, and therefore a simple AND does not do the right thing.
I added code that deals with the implications.  Perhaps it would make
sense to update fnspecs to always set a flag along with all of its
implications, but for now I am handling it in merge.
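
A small sketch of why the implications matter when merging, assuming that merging means intersecting the flags known on two paths. The flag values and function names are hypothetical and do not match GCC's EAF_* encodings or the ipa-modref code.

```cpp
// Hypothetical flag bits, for illustration only.
constexpr unsigned NOESCAPE = 1u << 0;
constexpr unsigned NODIRECTESCAPE = 1u << 1;

// Add every flag implied by the ones already set.
constexpr unsigned close_implications(unsigned flags)
{
  if (flags & NOESCAPE)
    flags |= NODIRECTESCAPE;
  return flags;
}

// Plain intersection: loses NODIRECTESCAPE if one side set only NOESCAPE.
constexpr unsigned merge_naive(unsigned a, unsigned b)
{ return a & b; }

// Intersection after closing both sides under the implication.
constexpr unsigned merge_closed(unsigned a, unsigned b)
{ return close_implications(a) & close_implications(b); }
```

If one side carries only NOESCAPE and the other only NODIRECTESCAPE, the naive merge yields no flags, while the closed merge keeps NODIRECTESCAPE.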

I made only a trivial update to tree-ssa-structalias and will send the changes
needed for EAF_NOREAD incrementally, so they can be discussed.  I think the
logical step is to track whether a function reads/stores global memory and
rewrite the constraint generation so we can handle normal, pure and
const in a unified manner.

Bootstrapped/regtested x86_64-linux, plan to commit it after further testing.

The patch improves alias oracle stats for cc1plus somewhat.

From:

Alias oracle query stats:
  refs_may_alias_p: 72380497 disambiguations, 82649832 queries
  ref_maybe_used_by_call_p: 495184 disambiguations, 73366950 queries
  call_may_clobber_ref_p: 259312 disambiguations, 263253 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 38006 queries
  nonoverlapping_refs_since_match_p: 21157 disambiguations, 65698 must 
overlaps, 87756 queries
  aliasing_component_refs_p: 63141 disambiguations, 2164695 queries
  TBAA oracle: 25975753 disambiguations 61449632 queries
   12138220 are in alias set 0
   11316663 queries asked about the same object
   144 queries asked about the same alias set
   0 access volatile
   10472885 are dependent in the DAG
   1545967 are aritificially in conflict with void *

Modref stats:
  modref use: 23857 disambiguations, 754515 queries
  modref clobber: 1392162 disambiguations, 17753512 queries
  3450241 tbaa queries (0.194341 per modref query)
  534816 base compares (0.030125 per modref query)

PTA query stats:
  pt_solution_includes: 12394915 disambiguations, 20235925 queries
  pt_solutions_intersect: 1365299 disambiguations, 14638068 queries

To:

Alias oracle query stats:
  refs_may_alias_p: 72629640 disambiguations, 8290 queries
  ref_maybe_used_by_call_p: 502474 disambiguations, 73612186 queries
  call_may_clobber_ref_p: 261806 disambiguations, 265659 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 38007 queries
  nonoverlapping_refs_since_match_p: 21139 disambiguations, 65772 must 
overlaps, 87816 queries
  aliasing_component_refs_p: 63144 disambiguations, 2164330 queries
  TBAA oracle: 26059018 disambiguations 61571714 queries
   12158033 are in alias set 0
   11326115 queries asked about the same object
   144 queries asked about the same alias set
   0 access volatile
   10484493 are dependent in the DAG
   1543911 are aritificially in conflict with void *

Modref stats:
  modref use: 24008 disambiguations, 712712 queries
  modref clobber: 1395917 disambiguations, 17163694 queries
  3465657 tbaa queries (0.201918 per modref query)
  537591 base compares (0.031321 per modref query)

PTA query stats:
  pt_solution_includes: 12468934 disambiguations, 20295402 queries
  pt_solutions_intersect: 1391917 disambiguations, 14665265 queries

I think it is mostly due to better handling of EAF_NODIRECTESCAPE.

Honza

gcc/ChangeLog:

2021-08-12  Jan Hubicka  

* ipa-modref.c (dump_eaf_flags): Dump EAF_NOREAD.
(implicit_const_eaf_flags, implicit_pure_eaf_flags,
 ignore_stores_eaf_flags): New constants.
(remove_useless_eaf_flags): New function.
(eaf_flags_useful_p): Use it.
(deref_flags): Add EAF_NOT_RETURNED if flag is unused;
handle EAF_NOREAD.
(modref_lattice::init): Add EAF_NOREAD.
(modref_lattice::add_escape_point): Do not record escape point if
result is unused.
(modref_lattice::merge): EAF_NOESCAPE implies EAF_NODIRECTESCAPE;
use remove_useless_eaf_flags.
(modref_lattice::merge_deref): Use ignore_stores_eaf_flags.
(modref_lattice::merge_direct_load): Add EAF_NOREAD
(analyze_ssa_name_flags): Fix handling EAF_NOT_RETURNED
(analyze_parms): Use remove_useless_eaf_flags.
(ipa_merge_modref_summary_after_inlining): Use ignore_stores_eaf_flags.
(modref_merge_call_site_flags): Add caller and ecf_flags parameter;
use remove_useless_eaf_flags.
(modref_propagate_flags_in_scc): Update.
* ipa-modref.h: Turn eaf_flags_t back to char.
* tree-core.h (EAF_NOT_RETURNED): Fix.
(EAF_NOREAD): 

Re: [PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.

2021-08-12 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 12, 2021 at 6:40 AM Hongtao Liu  wrote:

> > > > Hi:
> > > >   AVX512F supported vscalefs{s,d} which is the same as ldexp except the 
> > > > second operand should be floating point.
> > > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR target/98309
> > > > * config/i386/i386.md (ldexp3): Extend to vscalefs[sd]
> > > > when TARGET_AVX512F and TARGET_SSE_MATH.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR target/98309
> > > > * gcc.target/i386/pr98309-1.c: New test.
> > > > * gcc.target/i386/pr98309-2.c: New test.
> > >
> > > OK.
> >
> > Actually, we should introduce a scalar version of avx512f_vmscalef, so
> > we can avoid all subreg conversions with the vector-merge (VM)
> > version, and will also allow memory in operand 2.
> >
> > Please test the attached incremental patch.
> >
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,} on CLX.
> tests is fine.

Thanks, committed with the following ChangeLog:

[i386] Introduce scalar version of avx512f_vmscalef.

2021-08-12  Uroš Bizjak  

gcc/
PR target/98309
* config/i386/i386.md (avx512f_scalef2): New insn pattern.
(ldexp3): Use avx512f_scalef2.
(UNSPEC_SCALEF): Move from ...
* config/i386/sse.md (UNSPEC_SCALEF): ... here.

Uros.


Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-08-12 Thread Qing Zhao via Gcc-patches
Hi, Richard,

For the RTL expansion of the call to .DEFERRED_INIT, I changed my code per
your suggestions as follows:

==
#define INIT_PATTERN_VALUE  0xFE
static void
expand_DEFERRED_INIT (internal_fn, gcall *stmt)
{
  tree lhs = gimple_call_lhs (stmt);
  tree var_size = gimple_call_arg (stmt, 0);
  enum auto_init_type init_type
    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
  bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));

  tree var_type = TREE_TYPE (lhs);
  gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);

  if (is_vla || (!can_native_interpret_type_p (var_type)))
    {
      /* If this is a VLA or the type of the variable cannot be natively
         interpreted, expand to a memset to initialize it.  */
      if (TREE_CODE (lhs) == SSA_NAME)
        lhs = SSA_NAME_VAR (lhs);
      tree var_addr = NULL_TREE;
      if (is_vla)
        var_addr = TREE_OPERAND (lhs, 0);
      else
        {
          TREE_ADDRESSABLE (lhs) = 1;
          var_addr = build_fold_addr_expr (lhs);
        }
      tree value = (init_type == AUTO_INIT_PATTERN) ?
        build_int_cst (unsigned_char_type_node,
                       INIT_PATTERN_VALUE) :
        build_zero_cst (unsigned_char_type_node);
      tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET),
                                     3, var_addr, value, var_size);
      /* Expand this memset call.  */
      expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type));
    }
  else
    {
      /* If this is not a VLA and the type of the variable can be natively
         interpreted, expand to assignment to generate better code.  */
      tree pattern = NULL_TREE;
      unsigned HOST_WIDE_INT total_bytes
        = tree_to_uhwi (TYPE_SIZE_UNIT (var_type));

      if (init_type == AUTO_INIT_PATTERN)
        {
          unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
          memset (buf, INIT_PATTERN_VALUE, total_bytes);
          pattern = native_interpret_expr (var_type, buf, total_bytes);
          gcc_assert (pattern);
        }

      tree init = (init_type == AUTO_INIT_PATTERN) ?
        pattern :
        build_zero_cst (var_type);
      expand_assignment (lhs, init, false);
    }
}
===

Now, I used “can_native_interpret_type_p (var_type)” instead of 
“use_register_for_decl (lhs)” to decide 
whether to use “memset” or “assign” to expand this function.
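
The “assign” route builds the pattern by filling a byte buffer and reinterpreting it as a value of the variable's type. A rough user-level analogue, only as a sketch: pattern_value and kInitPattern are hypothetical names, and std::memcpy stands in for the compiler-internal native_interpret_expr.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Same byte as INIT_PATTERN_VALUE in the code above.
constexpr unsigned char kInitPattern = 0xFE;

// Build a value of T whose object representation is the pattern byte
// repeated sizeof(T) times.
template<typename T>
T pattern_value()
{
  static_assert(std::is_trivially_copyable_v<T>,
                "only types that can be rebuilt from raw bytes");
  unsigned char buf[sizeof(T)];
  std::memset(buf, kInitPattern, sizeof buf);
  T value;
  std::memcpy(&value, buf, sizeof value);
  return value;
}
```

Because every byte is identical, the result does not depend on endianness: a 32-bit unsigned integer comes out as 0xFEFEFEFE.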

However, this exposed a bug that is very hard to address:

***For the test case testsuite/gcc.dg/uninit-I.c:

/* { dg-do compile } */
/* { dg-options "-O2 -Wuninitialized" } */

int sys_msgctl (void)
{
  struct { int mode; } setbuf;
  return setbuf.mode;  /* { dg-warning "'setbuf\.mode' is used" } */
}
==

**The above auto variable “setbuf” has “struct” type, for which 
“can_native_interpret_type_p (var_type)” is false; therefore, 
expanding this .DEFERRED_INIT call went down the “memset” expansion route. 

However, this structure type fits into a register, and therefore its address 
cannot be taken anymore at this stage. Even though I tried:

 TREE_ADDRESSABLE (lhs) = 1;
 var_addr = build_fold_addr_expr (lhs);

To create an address variable for it, the expansion still failed at expr.c: 
line 8412:
during RTL pass: expand
/home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-I.c:6:24: 
internal compiler error: in expand_expr_addr_expr_1, at expr.c:8412
0xd04104 expand_expr_addr_expr_1
../../latest-gcc/gcc/expr.c:8412
0xd04a95 expand_expr_addr_expr
../../latest-gcc/gcc/expr.c:8525
0xd13592 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../latest-gcc/gcc/expr.c:11741
0xd05142 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier, 
rtx_def**, bool)
../../latest-gcc/gcc/expr.c:8713
0xaed1d3 expand_expr
../../latest-gcc/gcc/expr.h:301
0xaf0d89 get_memory_rtx
../../latest-gcc/gcc/builtins.c:1370
0xafb4fb expand_builtin_memset_args
../../latest-gcc/gcc/builtins.c:4102
0xafacde expand_builtin_memset(tree_node*, rtx_def*, machine_mode)
../../latest-gcc/gcc/builtins.c:3886
0xe97fb3 expand_DEFERRED_INIT

**That’s the major reason why I chose “use_register_for_decl (lhs)” to 
decide between “memset” expansion and “assign” expansion: “memset” expansion
needs to take the address of the variable, and if the variable has been decided 
to fit into a register, its address cannot be taken anymore at this stage.

**Using “can_native_interpret_type_p” did make the “pattern” generation 
part much cleaner and simpler; however, it looks like it didn’t work correctly.

Based on this, I’d like to keep my previous implementation by using 
“use_register_for_decl” to decide whether to take “memset” expansion or 
“assign” expansion.
Therefore, I might still need to keep the “UGLY” implementation of generating 
“pattern” 

[PATCH] rs6000: Avoid buffer overruns

2021-08-12 Thread Bill Schmidt via Gcc-patches
Although safe_inc_pos avoids buffer overruns in rs6000-gen-builtins.c,
there are some other routines where we fail to detect the possibility.
Clean those up!

Regstrap in progress on powerpc64le-linux-gnu.  OK for trunk if that
passes?
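
The pattern used in each routine is to bound the scan by the buffer length and then treat reaching the end as an error. A standalone sketch of that idea, not the rs6000-gen-builtins.c code: identifier_length is a hypothetical helper, and it reports the overrun to the caller instead of calling diag and exit.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Length of the identifier starting at POS, or std::nullopt if the scan
// reached the end of the buffer without finding a terminating character.
std::optional<std::size_t>
identifier_length(std::string_view buf, std::size_t pos)
{
  assert(pos <= buf.size());
  std::size_t end = pos;
  // Check the bound before touching buf[end].
  while (end < buf.size()
         && (std::isalnum(static_cast<unsigned char>(buf[end]))
             || buf[end] == '_'))
    ++end;
  if (end >= buf.size())
    return std::nullopt;  // ran off the buffer; the caller should diagnose
  return end - pos;
}
```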

Thanks,
Bill

2021-08-12  Bill Schmidt  

gcc/
* config/rs6000/rs6000-gen-builtins.c (consume_whitespace):
Diagnose buffer overrun.
(match_identifier): Likewise.
(match_integer): Likewise.
(match_to_right_bracket): Likewise.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 32 ++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index 22902c37d55..ff8872c59e4 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -638,6 +638,13 @@ consume_whitespace (void)
 {
   while (pos < LINELEN && isspace(linebuf[pos]) && linebuf[pos] != '\n')
 pos++;
+
+  if (pos >= LINELEN)
+{
+  diag (pos, "line length overrun.\n");
+  exit (1);
+}
+
   return;
 }
 
@@ -697,9 +704,16 @@ static char *
 match_identifier (void)
 {
   int lastpos = pos - 1;
-  while (isalnum (linebuf[lastpos + 1]) || linebuf[lastpos + 1] == '_')
+  while (lastpos < LINELEN - 1
+&& (isalnum (linebuf[lastpos + 1]) || linebuf[lastpos + 1] == '_'))
 ++lastpos;
 
+  if (lastpos >= LINELEN - 1)
+{
+  diag (lastpos, "line length overrun.\n");
+  exit (1);
+}
+
   if (lastpos < pos)
 return 0;
 
@@ -721,9 +735,15 @@ match_integer (void)
 safe_inc_pos ();
 
   int lastpos = pos - 1;
-  while (isdigit (linebuf[lastpos + 1]))
+  while (lastpos < LINELEN - 1 && isdigit (linebuf[lastpos + 1]))
 ++lastpos;
 
+  if (lastpos >= LINELEN - 1)
+{
+  diag (lastpos, "line length overrun.\n");
+  exit (1);
+}
+
   if (lastpos < pos)
 return NULL;
 
@@ -741,13 +761,19 @@ static const char *
 match_to_right_bracket (void)
 {
   int lastpos = pos - 1;
-  while (linebuf[lastpos + 1] != ']')
+  while (lastpos < LINELEN - 1 && linebuf[lastpos + 1] != ']')
 {
   if (linebuf[lastpos + 1] == '\n')
fatal ("no ']' found before end of line.\n");
   ++lastpos;
 }
 
+  if (lastpos >= LINELEN - 1)
+{
+  diag (lastpos, "line length overrun.\n");
+  exit (1);
+}
+
   if (lastpos < pos)
 return 0;
 
-- 
2.27.0



Re: [PATCH] rs6000: Avoid buffer overruns

2021-08-12 Thread Bill Schmidt via Gcc-patches
Per discussion with Martin, I'm also changing the post-increment to 
pre-increment in safe_inc_pos.  That's what I'm regstrapping at the moment.


Thanks,
Bill

On 8/12/21 3:28 PM, Bill Schmidt via Gcc-patches wrote:

Although safe_inc_pos avoids buffer overruns in rs6000-gen-builtins.c,
there are some other routines where we fail to detect the possibility.
Clean those up!

Regstrap in progress on powerpc64le-linux-gnu.  OK for trunk if that
passes?

Thanks,
Bill

2021-08-12  Bill Schmidt  

gcc/
* config/rs6000/rs6000-gen-builtins.c (consume_whitespace):
Diagnose buffer overrun.
(match_identifier): Likewise.
(match_integer): Likewise.
(match_to_right_bracket): Likewise.
---
  gcc/config/rs6000/rs6000-gen-builtins.c | 32 ++---
  1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index 22902c37d55..ff8872c59e4 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -638,6 +638,13 @@ consume_whitespace (void)
  {
while (pos < LINELEN && isspace(linebuf[pos]) && linebuf[pos] != '\n')
  pos++;
+
+  if (pos >= LINELEN)
+{
+  diag (pos, "line length overrun.\n");
+  exit (1);
+}
+
return;
  }

@@ -697,9 +704,16 @@ static char *
  match_identifier (void)
  {
int lastpos = pos - 1;
-  while (isalnum (linebuf[lastpos + 1]) || linebuf[lastpos + 1] == '_')
+  while (lastpos < LINELEN - 1
+&& (isalnum (linebuf[lastpos + 1]) || linebuf[lastpos + 1] == '_'))
  ++lastpos;

+  if (lastpos >= LINELEN - 1)
+{
+  diag (lastpos, "line length overrun.\n");
+  exit (1);
+}
+
if (lastpos < pos)
  return 0;

@@ -721,9 +735,15 @@ match_integer (void)
  safe_inc_pos ();

int lastpos = pos - 1;
-  while (isdigit (linebuf[lastpos + 1]))
+  while (lastpos < LINELEN - 1 && isdigit (linebuf[lastpos + 1]))
  ++lastpos;

+  if (lastpos >= LINELEN - 1)
+{
+  diag (lastpos, "line length overrun.\n");
+  exit (1);
+}
+
if (lastpos < pos)
  return NULL;

@@ -741,13 +761,19 @@ static const char *
  match_to_right_bracket (void)
  {
int lastpos = pos - 1;
-  while (linebuf[lastpos + 1] != ']')
+  while (lastpos < LINELEN - 1 && linebuf[lastpos + 1] != ']')
  {
if (linebuf[lastpos + 1] == '\n')
fatal ("no ']' found before end of line.\n");
++lastpos;
  }

+  if (lastpos >= LINELEN - 1)
+{
+  diag (lastpos, "line length overrun.\n");
+  exit (1);
+}
+
if (lastpos < pos)
  return 0;



[committed] openmp: Add support for OpenMP 5.1 masked construct

2021-08-12 Thread Jakub Jelinek via Gcc-patches
Hi!

This construct has been introduced as a replacement for the master
construct, but unlike that construct it is slightly more general: it
has an optional clause which allows choosing which thread
will be the one running the region; it can be some other thread
than the master (primary) thread with number 0, or it could be no
threads or multiple threads (then of course one needs to be careful
about data races).
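
A minimal usage sketch of the construct with its filter clause. The function name count_masked_executions is hypothetical. It assumes a compiler with OpenMP 5.1 masked support when OpenMP is enabled; without OpenMP the pragmas are ignored and the block simply runs once.

```cpp
#include <cassert>

// Count how many threads of a parallel region execute the masked block.
// Only the thread whose number equals FILTER_THREAD runs it.
int count_masked_executions(int filter_thread)
{
  assert(filter_thread >= 0);
  int executions = 0;
  #pragma omp parallel num_threads(4) shared(executions)
  {
    #pragma omp masked filter(filter_thread)
    {
      // At most one thread gets here, so the increment does not race.
      ++executions;
    }
  }
  return executions;
}
```

With filter(0) the behaviour matches the old master construct: thread 0 always exists, so the block runs exactly once.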

It is way too early to deprecate the master construct though, we don't
even have OpenMP 5.0 fully implemented, it has been deprecated in 5.1,
will be also in 5.2 and removed in 6.0.  But even then it will likely
be a good idea to just -Wdeprecated warn about it and still accept it.

The patch also contains something I should have done much earlier,
for clauses that accept some integral expression where we only care
about the value, forces during gimplification that value into
either a min invariant (as before), SSA_NAME or a fresh temporary,
but never e.g. a user VAR_DECL, so that for those clauses we don't
need to worry about adjusting it.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to
trunk.

2021-08-12  Jakub Jelinek  

gcc/
* tree.def (OMP_MASKED): New tree code.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_FILTER.
* tree.h (OMP_MASKED_BODY, OMP_MASKED_CLAUSES, OMP_MASKED_COMBINED,
OMP_CLAUSE_FILTER_EXPR): Define.
* tree.c (omp_clause_num_ops): Add OMP_CLAUSE_FILTER entry.
(omp_clause_code_name): Likewise.
(walk_tree_1): Handle OMP_CLAUSE_FILTER.
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_FILTER.
(convert_nonlocal_reference_stmt, convert_local_reference_stmt,
convert_gimple_call): Handle GIMPLE_OMP_MASTER.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_FILTER.
(dump_generic_node): Handle OMP_MASTER.
* gimple.def (GIMPLE_OMP_MASKED): New gimple code.
* gimple.c (gimple_build_omp_masked): New function.
(gimple_copy): Handle GIMPLE_OMP_MASKED.
* gimple.h (gimple_build_omp_masked): Declare.
(gimple_has_substatements): Handle GIMPLE_OMP_MASKED.
(gimple_omp_masked_clauses, gimple_omp_masked_clauses_ptr,
gimple_omp_masked_set_clauses): New inline functions.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_MASKED.
* gimple-pretty-print.c (dump_gimple_omp_masked): New function.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_MASKED.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple-low.c (lower_stmt): Likewise.
* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE_FILTER.  For clauses
that take one expression rather than decl or constant, force
gimplification of that into a SSA_NAME or temporary unless min
invariant.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_FILTER.
(gimplify_expr): Handle OMP_MASKED.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_MASKED.
(estimate_num_insns): Likewise.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_FILTER.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_MASKED.  Adjust
diagnostics for existence of masked construct.
(scan_omp_1_stmt, lower_omp_master, lower_omp_1, diagnose_sb_1,
diagnose_sb_2): Handle GIMPLE_OMP_MASKED.
* omp-expand.c (expand_omp_synch, expand_omp, omp_make_gimple_edges):
Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_MASKED.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FILTER.
* c-pragma.c (omp_pragmas_simd): Add masked construct.
* c-common.h (enum c_omp_clause_split): Add C_OMP_CLAUSE_SPLIT_MASKED
enumerator.
(c_finish_omp_masked): Declare.
* c-omp.c (c_finish_omp_masked): New function.
(c_omp_split_clauses): Handle combined masked constructs.
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Parse filter clause name.
(c_parser_omp_clause_filter): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
(OMP_MASKED_CLAUSE_MASK): Define.
(c_parser_omp_masked): New function.
(c_parser_omp_parallel): Handle parallel masked.
(c_parser_omp_construct): Handle PRAGMA_OMP_MASKED.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Parse filter clause name.
(cp_parser_omp_clause_filter): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
(OMP_MASKED_CLAUSE_MASK): Define.
(cp_parser_omp_masked): New function.
(cp_parser_omp_parallel): Handle parallel masked.
(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_MASKED.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
 

Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Segher Boessenkool
On Thu, Aug 12, 2021 at 04:47:42PM +, Joseph Myers wrote:
> On Thu, 12 Aug 2021, Patrick McGehearty via Gcc-patches wrote:
> > My understanding of ibm FP mode build procedure is minimal,
> > but it seems that the _divkc3.c routine is built for both IEEE128
> > and IBM128 modes.
> 
> If built for IBM128 mode (i.e., compiler defaults to TFmode = IBM long 
> double), it should still build a function __divkc3 which takes IEEE 
> binary128 arguments and uses IEEE binary128 (KFmode) constants.

Apparently that is broken then :-(

> If you were changing the L_divtc3 case in libgcc2.c to use different 
> constants in the case where TFmode is IBM long double, that would make 
> sense to me.

This whole thing is there to *not* change generic code.


Segher


Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Segher Boessenkool
On Thu, Aug 12, 2021 at 04:03:13PM +, Patrick McGehearty wrote:
> This patch resolves the failure of powerpc64 long double complex divide
> in native ibm long double format after the patch "Practical improvement
> to libgcc complex divide".

[ etc. ]

Nothing in here says what has changed in v3.  Please do that in future
patches?


Segher


Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Patrick McGehearty via Gcc-patches

I see I have more to learn about gcc's interactions with IEEE-128 format
vs IBM-128 format.

As we discovered here, using the IBM-128 version of LDBL_EPSILON will
not yield correct answers as currently coded.

If _divkc3.c is not intended to provide a version of complex divide
that handles IBM-128 format, then where should that option be handled?

Do I need to add a special case for
#ifndef __LONG_DOUBLE_IEEE128__
in the complex divide code in libgcc/libgcc2.c?

And, for completeness, does gcc support LDBL for non-IEEE on
any platform besides IBM?

- patrick


On 8/12/2021 11:47 AM, Joseph Myers wrote:

On Thu, 12 Aug 2021, Patrick McGehearty via Gcc-patches wrote:


This file includes quad-float128.h, which does some remapping from TF to
KF depending on __LONG_DOUBLE_IEEE128__.

I think you probably need to have a similar __LONG_DOUBLE_IEEE128__
conditional here.  If __LONG_DOUBLE_IEEE128__ is not defined, use
__LIBGCC_KF_* macros instead of __LIBGCC_TF_*; if __LONG_DOUBLE_IEEE128__
is defined, use __LIBGCC_TF_* as above.  (Unless the powerpc maintainers
say otherwise.)

-
The KF version fails when in IBM128 mode while the DF version works
for that mode.

KFmode should always be IEEE binary128.  IFmode should always be IBM long
double.  TFmode may be one or the other depending on command-line options.

"in IBM128 mode" should mean that the compiler defaults to long double
being IBM long double and TFmode being IBM long double.  But in that mode,
KFmode should still be IEEE binary128 and it should still be correct to
use the KF constants in this file.


My understanding of ibm FP mode build procedure is minimal,
but it seems that the _divkc3.c routine is built for both IEEE128
and IBM128 modes.

If built for IBM128 mode (i.e., compiler defaults to TFmode = IBM long
double), it should still build a function __divkc3 which takes IEEE
binary128 arguments and uses IEEE binary128 (KFmode) constants.

If you were changing the L_divtc3 case in libgcc2.c to use different
constants in the case where TFmode is IBM long double, that would make
sense to me.  It's changing an IEEE-only file for an IBM long double issue
that doesn't make sense.  If this change causes some test using IBM long
double to pass where it failed before, that indicates a build system
problem somewhere.





[PATCH] libbacktrace: fix fd leak tests on systems with extra descriptors

2021-08-12 Thread Sergei Trofimovich via Gcc-patches
From: Sergei Trofimovich 

I noticed test failures when running the gcc test suite from under the mc
shell. mc opens fd=9 and exposes it to child processes. As a result, a few
tests fail:
FAIL: b2test_buildid
FAIL: btest_gnudebuglink
FAIL: btest
FAIL: btest_lto
FAIL: btest_alloc
FAIL: ctestg
FAIL: ctesta
FAIL: ctestg_alloc
FAIL: ctesta_alloc
FAIL: dwarf5
FAIL: dwarf5_alloc

Instead of trying to close file descriptors in a range, the test now probes
for the first available file descriptor by creating it via dup(1).
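
The probing idea in isolation, as a sketch under the assumption that standard output is open and the process is single-threaded. The names probe_first_free_fd and leaked_since are hypothetical and this is not the btest.c code.

```cpp
#include <cassert>
#include <unistd.h>

// dup returns the lowest unused descriptor number, so duplicating stdout
// and closing the copy right away reveals the first free slot.
int probe_first_free_fd()
{
  int fd = dup(STDOUT_FILENO);
  if (fd >= 0)
    close(fd);
  return fd;
}

// True if the first free descriptor moved since BASELINE was taken,
// which suggests a descriptor opened in between was never closed.
bool leaked_since(int baseline)
{
  assert(baseline >= 0);
  return probe_first_free_fd() != baseline;
}
```

Taking the baseline before the tests and comparing afterwards makes the check independent of descriptors inherited from the parent process, such as the fd=9 opened by mc.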

libbacktrace/

* btest.c (check_open_files): Use last free file descriptor as a
signal for file descriptor leak.
---
 libbacktrace/btest.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/libbacktrace/btest.c b/libbacktrace/btest.c
index 9f9c03babf3..d5cf321640c 100644
--- a/libbacktrace/btest.c
+++ b/libbacktrace/btest.c
@@ -458,22 +458,32 @@ test5 (void)
   return failures;
 }
 
+/* Peek at first free flie descriptior.  */
+
+static int probe_first_dree_fd (void) {
+  int fd;
+
+  fd = dup(1);
+  close(fd);
+
+  return fd;
+}
+
 /* Check that are no files left open.  */
 
 static void
-check_open_files (void)
+check_open_files (int last_free_fd)
 {
-  int i;
+  int fd;
+
+  fd = probe_first_dree_fd();
 
-  for (i = 3; i < 10; i++)
+  if (fd != last_free_fd)
 {
-  if (close (i) == 0)
-   {
- fprintf (stderr,
-  "ERROR: descriptor %d still open after tests complete\n",
-  i);
- ++failures;
-   }
+  fprintf (stderr,
+  "ERROR: descriptor %d still open after tests complete\n",
+  last_free_fd);
+  ++failures;
 }
 }
 
@@ -482,8 +492,11 @@ check_open_files (void)
 int
 main (int argc ATTRIBUTE_UNUSED, char **argv)
 {
+  int first_free_fd;
+
   state = backtrace_create_state (argv[0], BACKTRACE_SUPPORTS_THREADS,
  error_callback_create, NULL);
+  first_free_fd = probe_first_dree_fd ();
 
 #if BACKTRACE_SUPPORTED
   test1 ();
@@ -495,7 +508,7 @@ main (int argc ATTRIBUTE_UNUSED, char **argv)
 #endif
 #endif
 
-  check_open_files ();
+  check_open_files (first_free_fd);
 
   exit (failures ? EXIT_FAILURE : EXIT_SUCCESS);
 }
-- 
2.32.0



[PATCH] libbacktrace: fix b2test_buildid test on non-english locales

2021-08-12 Thread Sergei Trofimovich via Gcc-patches
From: Sergei Trofimovich 

With LANG=ru_RU.UTF-8 the 'b2test_buildid' test fails due to localized readelf
output:

$ LANG=ru_RU.UTF-8 readelf -n b2test | fgrep 4e37e8f
ID сборки: 4e37e8fead8d6e8b0a9dc95ea25cd784dff3a393
$ LANG=C readelf -n b2test | fgrep 4e37e8f
Build ID: 4e37e8fead8d6e8b0a9dc95ea25cd784dff3a393

libbacktrace/

* install-debuginfo-for-buildid.sh.in: Force non-localized readelf
output with LANG=C.
---
 libbacktrace/install-debuginfo-for-buildid.sh.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libbacktrace/install-debuginfo-for-buildid.sh.in b/libbacktrace/install-debuginfo-for-buildid.sh.in
index 1364779d703..91dfdfe89a4 100644
--- a/libbacktrace/install-debuginfo-for-buildid.sh.in
+++ b/libbacktrace/install-debuginfo-for-buildid.sh.in
@@ -47,7 +47,7 @@ mkdir_p="@MKDIR_P@"
 build_id_dir="$1"
 src="$2"
 
-buildid=$($readelf -n $src \
+buildid=$(LANG=C $readelf -n $src \
  | $grep "Build ID" \
  | $awk '{print $3}')
 
-- 
2.32.0



Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-12 Thread Andreas Schwab
On Aug 12 2021, Patrick McGehearty via Gcc-patches wrote:

> diff --git a/libgcc/config/rs6000/_divkc3.c b/libgcc/config/rs6000/_divkc3.c
> index a1d29d2..2b229c8 100644
> --- a/libgcc/config/rs6000/_divkc3.c
> +++ b/libgcc/config/rs6000/_divkc3.c
> @@ -38,10 +38,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #endif
>  
>  #ifndef __LONG_DOUBLE_IEEE128__
> -#define RBIG   (__LIBGCC_KF_MAX__ / 2)
> -#define RMIN   (__LIBGCC_KF_MIN__)
> -#define RMIN2  (__LIBGCC_KF_EPSILON__)
> -#define RMINSCAL (1 / __LIBGCC_KF_EPSILON__)
> +#define RBIG   (__LIBGCC_DF_MAX__ / 2)
> +#define RMIN   (__LIBGCC_DF_MIN__)
> +#define RMIN2  (__LIBGCC_DF_EPSILON__)
> +#define RMINSCAL (1 / __LIBGCC_DF_EPSILON__)

How can it happen that __LONG_DOUBLE_IEEE128__ is not defined?  This
file is always compiled with -mfloat128 and this looks like dead code.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-08-12 Thread Qing Zhao via Gcc-patches
Hi,
I kept my previous “use_register_for_decl (lhs)” check to decide between
“memset” expansion and “assign” expansion when expanding .DEFERRED_INIT.

However, when generating the “pattern” for the “assign” expansion, I found that
“can_native_interpret_type_p (var_type)” combined with “native_interpret_expr”
makes the implementation cleaner and simpler, as follows:

  if (init_type == AUTO_INIT_PATTERN)
    {
      if (can_native_interpret_type_p (var_type))
        {
          unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
          memset (buf, INIT_PATTERN_VALUE, total_bytes);
          pattern = native_interpret_expr (var_type, buf, total_bytes);
          gcc_assert (pattern);
        }
      else
        {
          tree index_type = build_index_type (size_int (total_bytes - 1));
          tree array_type = build_array_type (unsigned_char_type_node,
                                              index_type);
          tree element = build_int_cst (unsigned_char_type_node,
                                        INIT_PATTERN_VALUE);
          vec<constructor_elt, va_gc> *elts = NULL;
          for (unsigned int i = 0; i < total_bytes; i++)
            CONSTRUCTOR_APPEND_ELT (elts, NULL_TREE, element);
          pattern = build_constructor (array_type, elts);
          pattern = build1 (VIEW_CONVERT_EXPR, var_type, pattern);
        }
    }

Thanks.

Qing
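
As a host-side illustration of the “pattern” idea above (this is not GCC
internals code): filling a buffer with the pattern byte and then
reinterpreting those bytes as the variable's type yields the repeated-byte
constant. This is a minimal sketch; the helper name pattern_as_u32 is
hypothetical, and only the 0xFE value is taken from INIT_PATTERN_VALUE in the
code quoted below.

```c
#include <stdint.h>
#include <string.h>

#define INIT_PATTERN_VALUE 0xFE

/* Hypothetical helper: build the value a 32-bit variable would hold
   if every one of its bytes were set to INIT_PATTERN_VALUE.  */
static uint32_t
pattern_as_u32 (void)
{
  unsigned char buf[sizeof (uint32_t)];
  uint32_t value;

  memset (buf, INIT_PATTERN_VALUE, sizeof buf);
  /* Reinterpret the bytes as the target type; memcpy sidesteps
     aliasing problems.  */
  memcpy (&value, buf, sizeof value);
  return value;
}
```

Because every byte is identical, the result does not depend on the host's
byte order.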

On Aug 12, 2021, at 2:24 PM, Qing Zhao via Gcc-patches wrote:
> 
> 
> Hi, Richard,
> 
> For the RTL expansion of the call to .DEFERRED_INIT, I changed my code per
> your suggestions as follows:
> 
> ==
> #define INIT_PATTERN_VALUE  0xFE
> static void
> expand_DEFERRED_INIT (internal_fn, gcall *stmt)
> {
>  tree lhs = gimple_call_lhs (stmt);
>  tree var_size = gimple_call_arg (stmt, 0);
>  enum auto_init_type init_type
>= (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
>  bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2));
> 
>  tree var_type = TREE_TYPE (lhs);
>  gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
> 
>  if (is_vla || (!can_native_interpret_type_p (var_type)))
>{
>/* If this is a VLA or the type of the variable cannot be natively
>   interpreted, expand to a memset to initialize it.  */
>  if (TREE_CODE (lhs) == SSA_NAME)
>lhs = SSA_NAME_VAR (lhs);
>  tree var_addr = NULL_TREE;
>  if (is_vla)
>var_addr = TREE_OPERAND (lhs, 0);
>  else
>{
> TREE_ADDRESSABLE (lhs) = 1;
> var_addr = build_fold_addr_expr (lhs);
>}
>  tree value = (init_type == AUTO_INIT_PATTERN) ?
>build_int_cst (unsigned_char_type_node,
>   INIT_PATTERN_VALUE) :
>build_zero_cst (unsigned_char_type_node);
>  tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET),
> 3, var_addr, value, var_size);
>  /* Expand this memset call.  */
>  expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type));
>}
>  else
>{
>/* If this is not a VLA and the type of the variable can be natively 
>   interpreted, expand to assignment to generate better code.  */
>  tree pattern = NULL_TREE;
>  unsigned HOST_WIDE_INT total_bytes
>= tree_to_uhwi (TYPE_SIZE_UNIT (var_type));
> 
>  if (init_type == AUTO_INIT_PATTERN)
>{
>  unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
>  memset (buf, INIT_PATTERN_VALUE, total_bytes);
>  pattern = native_interpret_expr (var_type, buf, total_bytes);
>  gcc_assert (pattern);
>}
> 
>  tree init = (init_type == AUTO_INIT_PATTERN) ?
>   pattern :
>   build_zero_cst (var_type);
>  expand_assignment (lhs, init, false);
>}
> }
> ===
> 
> Now, I used “can_native_interpret_type_p (var_type)” instead of
> “use_register_for_decl (lhs)” to decide
> whether to use “memset” or “assign” to expand this function.
> 
> However, this exposed a bug that is very hard to address:
> 
> ***For the test case testsuite/gcc.dg/uninit-I.c:
> 
> /* { dg-do compile } */
> /* { dg-options "-O2 -Wuninitialized" } */
> 
> int sys_msgctl (void)
> {
>  struct { int mode; } setbuf;
>  return setbuf.mode;  /* { dg-warning "'setbuf\.mode' is used" } */
> ==
> 
> **The above auto variable “setbuf” has “struct” type, for which
> “can_native_interpret_type_p (var_type)” is false; therefore,
> expanding this .DEFERRED_INIT call went down the “memset” expansion route.
> 
> However, this structure type fits into a register, so its address can no
> longer be taken at this stage. Even though I tried:
> 
> TREE_ADDRESSABLE (lhs) = 1;
> var_addr = build_fold_addr_expr (lhs);
> 
> to create an address variable for it, the expansion still failed at expr.c:
> line 8412:
> 

Re: [PATCH] libbacktrace: fix b2test_buildid test on non-english locales

2021-08-12 Thread Ian Lance Taylor via Gcc-patches
On Thu, Aug 12, 2021 at 3:35 PM Sergei Trofimovich via Gcc-patches
 wrote:
>
> From: Sergei Trofimovich 
>
> On LANG=ru_RU.UTF-8 'b2test_buildid' test fails due to localized readelf
> output:
>
> $ LANG=ru_RU.UTF-8 readelf -n b2test | fgrep 4e37e8f
> ID сборки: 4e37e8fead8d6e8b0a9dc95ea25cd784dff3a393
> $ LANG=C readelf -n b2test | fgrep 4e37e8f
> Build ID: 4e37e8fead8d6e8b0a9dc95ea25cd784dff3a393
>
> libbacktrace/
>
> * install-debuginfo-for-buildid.sh.in: Force non-localized readelf
> output with LANG=C.
> ---
>  libbacktrace/install-debuginfo-for-buildid.sh.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libbacktrace/install-debuginfo-for-buildid.sh.in 
> b/libbacktrace/install-debuginfo-for-buildid.sh.in
> index 1364779d703..91dfdfe89a4 100644
> --- a/libbacktrace/install-debuginfo-for-buildid.sh.in
> +++ b/libbacktrace/install-debuginfo-for-buildid.sh.in
> @@ -47,7 +47,7 @@ mkdir_p="@MKDIR_P@"
>  build_id_dir="$1"
>  src="$2"
>
> -buildid=$($readelf -n $src \
> +buildid=$(LANG=C $readelf -n $src \
>   | $grep "Build ID" \
>   | $awk '{print $3}')
>


This is OK.

Thanks.

Ian


Re: [PATCH] libbacktrace: fix fd leak tests on systems with extra descriptors

2021-08-12 Thread Ian Lance Taylor via Gcc-patches
On Thu, Aug 12, 2021 at 3:34 PM Sergei Trofimovich via Gcc-patches
 wrote:
>
> From: Sergei Trofimovich 
>
> I noticed test failures when running the gcc test suite from under the mc
> shell. mc opens fd=9 and exposes it to child processes. As a result a few
> tests fail:
> FAIL: b2test_buildid
> FAIL: btest_gnudebuglink
> FAIL: btest
> FAIL: btest_lto
> FAIL: btest_alloc
> FAIL: ctestg
> FAIL: ctesta
> FAIL: ctestg_alloc
> FAIL: ctesta_alloc
> FAIL: dwarf5
> FAIL: dwarf5_alloc
>
> Instead of trying to close file descriptors in a fixed range, the test now
> probes for the first available file descriptor by creating one via dup(1).
>
> libbacktrace/
>
> * btest.c (check_open_files): Use last free file descriptor as a
> signal for file descriptor leak.

This isn't a useful replacement, as this will pass as long as
libbacktrace closes the first file descriptor that it opens.  It won't
check whether libbacktrace left any other file descriptors open.

Perhaps at program startup we could fstat descriptors up to 10 and
record whether they are valid, and then skip those files in
check_open_files.

Ian
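
A minimal sketch of the fstat-based approach suggested above, assuming the
same 3..9 descriptor range that the existing check_open_files loop uses. The
names record_open_files, count_leaked_files and preopened are hypothetical and
are not taken from btest.c.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define FIRST_FD 3
#define LAST_FD 10

/* Nonzero for descriptors that were already open at startup.  */
static int preopened[LAST_FD];

/* Record which descriptors in [FIRST_FD, LAST_FD) are valid right now.
   Call this before the code under test opens anything.  */
static void
record_open_files (void)
{
  struct stat st;
  int i;

  for (i = FIRST_FD; i < LAST_FD; i++)
    preopened[i] = fstat (i, &st) == 0;
}

/* Return the number of descriptors that are open now but were not
   open when record_open_files ran.  */
static int
count_leaked_files (void)
{
  struct stat st;
  int leaks = 0;
  int i;

  for (i = FIRST_FD; i < LAST_FD; i++)
    {
      /* Inherited from the parent (e.g. mc's fd 9): not our leak.  */
      if (preopened[i])
        continue;
      if (fstat (i, &st) == 0)
        {
          fprintf (stderr,
                   "ERROR: descriptor %d still open after tests complete\n",
                   i);
          ++leaks;
        }
    }
  return leaks;
}
```

Unlike the close()-based loop, fstat does not disturb the descriptors it
inspects, and every descriptor in the range is still checked rather than only
the first free one.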


Go patch committed: Store pointers to go:notinheap types indirectly

2021-08-12 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend and libgo stores pointers to
go:notinheap types indirectly.  This provides better support for using
cgo with incomplete types.  This is the gofrontend version of
https://golang.org/cl/264480.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
493219b84da47449c53883d0702dbc4457912f5e
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index be092de568b..539d886b08f 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-7e092d2cc5af7648036496485b639f2c9db2f2d8
+5edbb624b2595d644eb6842c952a292c41f7d6fa
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 67917dac95d..8d4d168f4e3 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -408,7 +408,14 @@ Expression::convert_type_to_interface(Type* lhs_type, 
Expression* rhs,
 {
   // We are assigning a non-pointer value to the interface; the
   // interface gets a copy of the value in the heap if it escapes.
-  if (rhs->is_constant())
+
+  // An exception is &global if global is notinheap, which is a
+  // pointer value but not a direct-iface type and we can't simply
+  // take its address.
+  bool is_address = (rhs->unary_expression() != NULL
+ && rhs->unary_expression()->op() == OPERATOR_AND);
+
+  if (rhs->is_constant() && !is_address)
 obj = Expression::make_unary(OPERATOR_AND, rhs, location);
   else
 {
@@ -11331,6 +11338,7 @@ Call_expression::do_lower(Gogo* gogo, Named_object* 
function,
   // We always pass a pointer when calling a method, except for
   // direct interface types when calling a value method.
   if (!first_arg->type()->is_error()
+  && first_arg->type()->points_to() == NULL
   && !first_arg->type()->is_direct_iface_type())
{
  first_arg = Expression::make_unary(OPERATOR_AND, first_arg, loc);
@@ -18630,12 +18638,20 @@ 
Interface_mtable_expression::do_get_backend(Translate_context* context)
   else
m = st->method_function(p->name(), &is_ambiguous);
   go_assert(m != NULL);
-  Named_object* no =
-(this->is_pointer_
- && this->type_->is_direct_iface_type()
- && m->is_value_method()
- ? m->iface_stub_object()
- : m->named_object());
+
+  // See the comment in Type::method_constructor.
+  bool use_direct_iface_stub = false;
+  if (m->is_value_method()
+ && this->is_pointer_
+ && this->type_->is_direct_iface_type())
+   use_direct_iface_stub = true;
+  if (!m->is_value_method()
+ && this->is_pointer_
+ && !this->type_->in_heap())
+   use_direct_iface_stub = true;
+  Named_object* no = (use_direct_iface_stub
+ ? m->iface_stub_object()
+ : m->named_object());
 
   go_assert(no->is_function() || no->is_function_declaration());
 
diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc
index 0c44186f507..e76600daab9 100644
--- a/gcc/go/gofrontend/types.cc
+++ b/gcc/go/gofrontend/types.cc
@@ -2464,8 +2464,16 @@ Type::is_direct_iface_type() const
 bool
 Type::is_direct_iface_type_helper(Unordered_set(const Type*)* visited) const
 {
-  if (this->points_to() != NULL
-  || this->channel_type() != NULL
+  if (this->points_to() != NULL)
+{
+  // Pointers to notinheap types must be stored indirectly.  See
+  // https://golang.org/issue/42076.
+  if (!this->points_to()->in_heap())
+   return false;
+  return true;
+}
+
+  if (this->channel_type() != NULL
   || this->function_type() != NULL
   || this->map_type() != NULL)
 return true;
@@ -3597,10 +3605,36 @@ Type::method_constructor(Gogo*, Type* method_type,
   vals->push_back(Expression::make_unary(OPERATOR_AND, s, bloc));
 }
 
-  bool use_direct_iface_stub =
-this->points_to() != NULL
-&& this->points_to()->is_direct_iface_type()
-&& m->is_value_method();
+  // The direct_iface_stub dereferences the value stored in the
+  // interface when calling the method.
+  //
+  // We need this for a value method if this type is a pointer to a
+  // direct-iface type.  For example, if we have "type C chan int" and M
+  // is a value method on C, then since a channel is a direct-iface type
+  // M expects a value of type C.  We are generating the method table
+  // for *C, so the value stored in the interface is *C.  We have to
+  // call the direct-iface stub to dereference *C to get C to pass to M.
+  //
+  // We also need this for a pointer method if the pointer itself is not
+  // a direct-iface type, as arises for notinheap types.  In this case
+  // we have "type NIH ..." where NIH is go:notinheap.  Since NIH is
+  // notinheap, *NIH is a pointer type that is not a

[PATCH] i386: Add peephole for lea and zero extend [PR 101716]

2021-08-12 Thread Hongyu Wang via Gcc-patches
Hi,

For lea + zero_extendsidi insn pairs, if the dest of the lea and the src of the
zext are the same, combine them into a single leal on 64-bit targets, since
writing a 32-bit register automatically zero-extends into the full register.

Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for master?

gcc/ChangeLog:

PR target/101716
* config/i386/i386.md (*lea_zext): New define_insn.
(define_peephole2): New peephole2 to combine zero_extend
with lea.

gcc/testsuite/ChangeLog:

PR target/101716
* gcc.target/i386/pr101716.c: New test.
---
 gcc/config/i386/i386.md  | 20 
 gcc/testsuite/gcc.target/i386/pr101716.c | 11 +++
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101716.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4a8e8fea290..6739dbd799b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -5187,6 +5187,26 @@
(const_string "SI")
(const_string "")))])
 
+;; combine zero_extendsidi with lea to use leal.
+(define_insn "*lea_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+   (match_operand:SWI48 1 "address_no_seg_operand" "Ts")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%E1, %k0|%k0,%E1}")
+
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+   (match_operand:SWI48 1 "address_no_seg_operand"))
+   (set (match_operand:DI 2 "general_reg_operand")
+   (zero_extend:DI (match_operand:SI 3 "general_reg_operand")))]
+  "TARGET_64BIT && ix86_hardreg_mov_ok (operands[2], operands[1])
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && (REGNO (operands[2]) == REGNO (operands[3])
+  || peep2_reg_dead_p (2, operands[3]))"
+  [(set (match_dup 2)
+   (zero_extend:DI (match_dup 1)))])
+
 (define_peephole2
   [(set (match_operand:SWI48 0 "register_operand")
(match_operand:SWI48 1 "address_no_seg_operand"))]
diff --git a/gcc/testsuite/gcc.target/i386/pr101716.c 
b/gcc/testsuite/gcc.target/i386/pr101716.c
new file mode 100644
index 000..0b684755c2f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101716.c
@@ -0,0 +1,11 @@
+/* PR target/101716 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+
+/* { dg-final { scan-assembler "leal\[\\t \]\*eax" } } */
+/* { dg-final { scan-assembler-not "movl\[\\t \]\*eax" } } */
+
+unsigned long long sample1(unsigned long long m) {
+unsigned int t = -1;
+return (m << 1) & t;
+}
-- 
2.18.1
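
The equivalence the peephole relies on can be checked at the source level. The
sketch below is only an illustration: shift_then_mask mirrors sample1 from the
new test, while narrow_then_zero_extend is a hypothetical name for the same
value computed as a 32-bit result that is then zero-extended.

```c
#include <stdint.h>

/* The form used in the new test: shift in 64 bits, then mask to 32.  */
static uint64_t
shift_then_mask (uint64_t m)
{
  uint32_t t = -1;
  return (m << 1) & t;
}

/* The same value computed as a 32-bit result that is zero-extended,
   which is what a single 32-bit lea followed by implicit
   zero-extension provides.  */
static uint64_t
narrow_then_zero_extend (uint64_t m)
{
  uint32_t low = (uint32_t) m;
  return (uint64_t) (uint32_t) (low + low);
}
```

Both functions return the low 32 bits of 2*m with the upper 32 bits cleared,
so no separate zero-extending move is needed after the 32-bit computation.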



Re: [PATCH] i386: Add peephole for lea and zero extend [PR 101716]

2021-08-12 Thread Hongyu Wang via Gcc-patches
Sorry for the typo, scan-assembler should be

+/* { dg-final { scan-assembler "leal\[\\t \]\[^\\n\]*eax" } } */
+/* { dg-final { scan-assembler-not "movl\[\\t \]\[^\\n\]*eax" } } */

On Fri, Aug 13, 2021 at 8:49 AM Hongyu Wang via Gcc-patches wrote:
>
> Hi,
>
> For lea + zero_extendsidi insns, if dest of lea and src of zext are the
> same, combine them with single leal under 64bit target since 32bit
> register will be automatically zero-extended.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> Ok for master?
>
> gcc/ChangeLog:
>
> PR target/101716
> * config/i386/i386.md (*lea_zext): New define_insn.
> (define_peephole2): New peephole2 to combine zero_extend
> with lea.
>
> gcc/testsuite/ChangeLog:
>
> PR target/101716
> * gcc.target/i386/pr101716.c: New test.
> ---
>  gcc/config/i386/i386.md  | 20 
>  gcc/testsuite/gcc.target/i386/pr101716.c | 11 +++
>  2 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101716.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 4a8e8fea290..6739dbd799b 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -5187,6 +5187,26 @@
> (const_string "SI")
> (const_string "")))])
>
> +;; combine zero_extendsidi with lea to use leal.
> +(define_insn "*lea_zext"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (zero_extend:DI
> +   (match_operand:SWI48 1 "address_no_seg_operand" "Ts")))]
> +  "TARGET_64BIT"
> +  "lea{l}\t{%E1, %k0|%k0,%E1}")
> +
> +(define_peephole2
> +  [(set (match_operand:SWI48 0 "general_reg_operand")
> +   (match_operand:SWI48 1 "address_no_seg_operand"))
> +   (set (match_operand:DI 2 "general_reg_operand")
> +   (zero_extend:DI (match_operand:SI 3 "general_reg_operand")))]
> +  "TARGET_64BIT && ix86_hardreg_mov_ok (operands[2], operands[1])
> +   && REGNO (operands[0]) == REGNO (operands[3])
> +   && (REGNO (operands[2]) == REGNO (operands[3])
> +  || peep2_reg_dead_p (2, operands[3]))"
> +  [(set (match_dup 2)
> +   (zero_extend:DI (match_dup 1)))])
> +
>  (define_peephole2
>[(set (match_operand:SWI48 0 "register_operand")
> (match_operand:SWI48 1 "address_no_seg_operand"))]
> diff --git a/gcc/testsuite/gcc.target/i386/pr101716.c 
> b/gcc/testsuite/gcc.target/i386/pr101716.c
> new file mode 100644
> index 000..0b684755c2f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr101716.c
> @@ -0,0 +1,11 @@
> +/* PR target/101716 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2" } */
> +
> +/* { dg-final { scan-assembler "leal\[\\t \]\*eax" } } */
> +/* { dg-final { scan-assembler-not "movl\[\\t \]\*eax" } } */
> +
> +unsigned long long sample1(unsigned long long m) {
> +unsigned int t = -1;
> +return (m << 1) & t;
> +}
> --
> 2.18.1
>


Re: [PATCH] [i386] Optimize vec_perm_expr to match vpmov{dw,qd,wb}.

2021-08-12 Thread Hongtao Liu via Gcc-patches
On Thu, Aug 12, 2021 at 5:23 PM Jakub Jelinek  wrote:
>
> On Thu, Aug 12, 2021 at 01:43:23PM +0800, liuhongt wrote:
> > Hi:
> >   This is another patch to optimize vec_perm_expr to match vpmov{dw,qd,wb}
> > under AVX512.
> >   For scenarios (like pr101846-2.c) where the upper half is not used, this
> > patch generates better code with only one vpmov{wb,dw,qd} instruction. For
> > scenarios (like pr101846-3.c) where the upper half is actually used, if the
> > src vector length is 256/512 bits, the patch can still generate better code,
> > but for 128 bits, the code generation is worse.
> >
> > 128 bits upper half not used.
> >
> > -   vpshufb .LC2(%rip), %xmm0, %xmm0
> > +   vpmovdw %xmm0, %xmm0
> >
> > 128 bits upper half used.
> > -   vpshufb .LC2(%rip), %xmm0, %xmm0
> > +   vpmovdw %xmm0, %xmm1
> > +   vmovq   %xmm1, %rax
> > +   vpinsrq $0, %rax, %xmm0, %xmm0
> >
> >   Maybe expand_vec_perm_trunc_vinsert should only deal with 256/512-bit
> > vectors, but considering that scenarios like pr101846-3.c foo_*_128 are
> > relatively unlikely in real use, I still keep this part of the code.
>
> I actually am not sure if even
>  foo_dw_512:
>  .LFB0:
> .cfi_startproc
> -   vmovdqa64   %zmm0, %zmm1
> -   vmovdqa64   .LC0(%rip), %zmm0
> -   vpermi2w    %zmm1, %zmm1, %zmm0
> +   vpmovdw     %zmm0, %ymm1
> +   vinserti64x4    $0x0, %ymm1, %zmm0, %zmm0
> ret
> is always a win, the permutations we should care most about are in loops
Yes, and vpmov{qd,dw} and vpermi{w,d} are both available under
avx512f, so expand_vec_perm_trunc_vinsert will never be matched when
it's placed among the other 2-insn cases.
The only part we need to handle is vpmovwb, which is under avx512bw, whereas
vpermib requires avx512vbmi.
> and the constant load as well as the first move in that case likely go
> away and it is one permutation insn vs. two.
> Different case is e.g.
> -   vmovdqa64   .LC5(%rip), %zmm2
> -   vmovdqa64   %zmm0, %zmm1
> -   vmovdqa64   .LC0(%rip), %zmm0
> -   vpermi2w    %zmm1, %zmm1, %zmm2
> -   vpermi2w    %zmm1, %zmm1, %zmm0
> -   vpshufb     .LC6(%rip), %zmm0, %zmm0
> -   vpshufb     .LC7(%rip), %zmm2, %zmm1
> -   vporq       %zmm1, %zmm0, %zmm0
> +   vpmovwb     %zmm0, %ymm1
> +   vinserti64x4    $0x0, %ymm1, %zmm0, %zmm0
> So, I wonder if your new routine shouldn't instead be done in
> ix86_expand_vec_perm_const_1 after vec_perm_1, among the other 2-insn cases,
> and handle the other vpmovdw etc. cases in combine splitters (see that we
> only use the low half or quarter of the result and transform whatever
> permutation we've used into what we want).
>
Got it, I'll try that way.
> And perhaps make the routine eventually more general, don't handle
> just identity permutation in the upper half, but allow there other
> permutations too (ones where that half can be represented by a single insn
> permutation).
> >
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >   Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >   PR target/101846
> >   * config/i386/i386-expand.c (expand_vec_perm_trunc_vinsert):
> >   New function.
> >   (ix86_vectorize_vec_perm_const): Call
> >   expand_vec_perm_trunc_vinsert.
> >   * config/i386/sse.md (vec_set_lo_v32hi): New define_insn.
> >   (vec_set_lo_v64qi): Ditto.
> >   (vec_set_lo_): Extend to no-avx512dq.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR target/101846
> >   * gcc.target/i386/pr101846-2.c: New test.
> >   * gcc.target/i386/pr101846-3.c: New test.
> > ---
> >  gcc/config/i386/i386-expand.c  | 125 +
> >  gcc/config/i386/sse.md |  60 +-
> >  gcc/testsuite/gcc.target/i386/pr101846-2.c |  81 +
> >  gcc/testsuite/gcc.target/i386/pr101846-3.c |  95 
> >  4 files changed, 359 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101846-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101846-3.c
> >
> > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> > index bd21efa9530..519caac2e15 100644
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -18317,6 +18317,126 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
> >return false;
> >  }
> >
> > +/* A subroutine of ix86_expand_vec_perm_const_1.  Try to implement D
> > +   in terms of a pair of vpmovdw + vinserti128 instructions.  */
> > +static bool
> > +expand_vec_perm_trunc_vinsert (struct expand_vec_perm_d *d)
> > +{
> > +  unsigned i, nelt = d->nelt, mask = d->nelt - 1;
> > +  unsigned half = nelt / 2;
> > +  machine_mode half_mode, trunc_mode;
> > +
> > +  /* vpmov{wb,dw,qd} only available under AVX512.  */
> > +  if (!d->one_operand_p || !TARGET_AVX512F
> > +  || (!TARGET_AVX512VL  && GET_MODE_SIZE (d->vmode) < 64)
>
> Too many spaces.
> > +  || GET_MODE_SIZE (GET_M

Re: [PATCH] rs6000: Add missing unsigned info for some P10 bifs

2021-08-12 Thread Kewen.Lin via Gcc-patches
Hi Bill,

on 2021/8/12 12:24 AM, Bill Schmidt wrote:
> Hi Kewen,
> 
> On 8/11/21 12:44 AM, Kewen.Lin wrote:
>> Hi,
>>
>> This patch is to make prototypes of some Power10 built-in
>> functions consistent with what's in the documentation, as
>> well as the vector version.  Otherwise, useless conversions
>> can be generated in gimple IR, and the vectorized versions
>> will have inconsistent types.
>>
>> Bootstrapped & regtested on powerpc64le-linux-gnu P9 and
>> powerpc64-linux-gnu P8.
>>
>> Is it ok for trunk?
> 
> LGTM.  Maintainers, this is necessary in the short term for the old builtins 
> support, but this fragile thing that people always forget will go away with 
> the new support.  What Kewen is proposing here is correct for now.
> 

Thanks for your review, and good to know we won't have this kind of issue with
your new support, nice!!

FWIW, for now the bif vectorization still requires this type consistency to
keep the type checking happy.


BR,
Kewen

> Thanks,
> Bill
> 
>>
>> BR,
>> Kewen
>> -
>> gcc/ChangeLog:
>>
>> * config/rs6000/rs6000-call.c (builtin_function_type): Add unsigned
>> signedness for some Power10 bifs.


Re: [PATCH] Adding target hook allows to reject initialization of register

2021-08-12 Thread Jojo R via Gcc-patches


— Jojo
On Aug 11, 2021 at 6:44 PM +0800, Richard Biener wrote:
> On Wed, Aug 11, 2021 at 11:28 AM Richard Sandiford
>  wrote:
> >
> > Richard Biener  writes:
> > > On Tue, Aug 10, 2021 at 10:33 AM Jojo R via Gcc-patches
> > >  wrote:
> > > >
> > > > Some targets like RISC-V allow grouping vector registers as a whole
> > > > while in fact operating on only part of the group, but the 'init-regs'
> > > > pass will add initialization for uninitialized registers. Add this hook
> > > > to reject this action and reduce the instruction count.
> > >
> > > Are these groups "visible"? That is, are the pseudos multi-reg
> > > pseudos? I wonder if there's a more generic way to tame down initregs
> > > w/o introducing a new target hook.
> > >
> > > Btw, initregs is a red herring - it ideally should go away. See PR61810.
> > >
> > > So instead of adding to it can you see whether disabling the pass for
> > > RISC-V works w/o fallout (and add a comment to the PR)? Maybe some more
> > > RTL literate (in particular DF literate) can look at the remaining issue.
> > > Richard, did you ever have a look into the "issue" that initregs covers
> > > up (whatever that exactly is)?
> >
> > No, sorry. I don't really understand what it would be from the comment
> > in the code:
> >
> > [...] papers over some problems on the arm and other
> > processors where certain isa constraints cannot be handled by gcc.
> > These are of the form where two operands to an insn my not be the
> > same. The ra will only make them the same if they do not
> > interfere, and this can only happen if one is not initialized.
> >
> > That would definitely be an RA bug if true, since the constraints need
> > to be applied independently of dataflow information. But the comment
> > and code predate LRA and maybe no-one fancied poking around in reload
> > (hard to believe).
> >
> > I'd be very surprised if LRA gets this wrong.
>
> OK, we've been wondering for quite some time - how about changing the
> gate of initregs to optimize > 0 && !targetm.lra_p ()? We'll hopefully
> figure out the "real" issue the pass is papering over. At the same time
> we're leaving old reload (and likely unmaintained) targets unaffected.
>
Richard,

So this patch is not necessary?

Do I only need to disable this pass in my situation?
I am afraid of side effects in my projects without this init-regs pass ...
> Richard.
>
> > Thanks,
> > Richard


Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Kewen.Lin via Gcc-patches
Hi Segher,

Thanks for the review!

on 2021/8/12 11:10 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Aug 11, 2021 at 02:56:11PM +0800, Kewen.Lin wrote:
>>  * config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add
>>  support for some built-in functions vectorized on Power10.
> 
> Say which, not "some" please?
> 

Done.

>> +  machine_mode in_vmode = TYPE_MODE (type_in);
>> +  machine_mode out_vmode = TYPE_MODE (type_out);
>> +
>> +  /* Power10 supported vectorized built-in functions.  */
>> +  if (TARGET_POWER10
>> +  && in_vmode == out_vmode
>> +  && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode))
>> +{
>> +  machine_mode exp_mode = DImode;
>> +  machine_mode exp_vmode = V2DImode;
>> +  enum rs6000_builtins vname = RS6000_BUILTIN_COUNT;
> 
> "name"?  This should be "bif" or similar?
> 

Updated with name.

>> +  switch (fn)
>> +{
>> +case MISC_BUILTIN_DIVWE:
>> +case MISC_BUILTIN_DIVWEU:
>> +  exp_mode = SImode;
>> +  exp_vmode = V4SImode;
>> +  if (fn == MISC_BUILTIN_DIVWE)
>> +vname = P10V_BUILTIN_DIVES_V4SI;
>> +  else
>> +vname = P10V_BUILTIN_DIVEU_V4SI;
>> +  break;
>> +case MISC_BUILTIN_DIVDE:
>> +case MISC_BUILTIN_DIVDEU:
>> +  if (fn == MISC_BUILTIN_DIVDE)
>> +vname = P10V_BUILTIN_DIVES_V2DI;
>> +  else
>> +vname = P10V_BUILTIN_DIVEU_V2DI;
>> +  break;
> 
> All of the above should not be builtin functions really, they are all
> simple arithmetic :-(  They should not be UNSPECs either, on RTL level.
> They can and should be optimised in real code as well.  Oh well.
> 
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/dive-vectorize-2.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target lp64 } */
> 
> Please add a comment what this is needed for?  "We scan for dive*d" is
> enough, but without anything, it takes time to figure this out.
> 

Done, same for below requests on lp64 commentary.

>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/dive-vectorize-run-2.c
>> @@ -0,0 +1,53 @@
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target lp64 } */
> 
> Same here.  I suppose this uses builtins that do not exist on 32-bit?
> 

Yeah, the bifs whose test cases are guarded with lp64 are supported only in
64-bit environments.

>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/p10-bifs-vectorize-run-1.c
>> @@ -0,0 +1,45 @@
>> +/* { dg-do run } */
>> +/* { dg-require-effective-target lp64 } */
> 
> And another.
> 
>> +#define CHECK(name)                                                         \
>> +  __attribute__ ((optimize (1))) void check_##name ()                       \
> What is the attribute for, btw?  It seems fragile, but perhaps I do not
> understand the intention.
> 
> 

It's to stop the compiler from optimizing the check functions with
vectorization, since the point of the test is to compare the results of the
scalar and vectorized versions.

> Okay for trunk with those lp64 things improved.  Thanks!
> 

Thanks, v2 has been attached by addressing Bill's and your comments.  :)


BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add
support for built-in functions MISC_BUILTIN_DIVWE, MISC_BUILTIN_DIVWEU,
MISC_BUILTIN_DIVDE, MISC_BUILTIN_DIVDEU, P10_BUILTIN_CFUGED,
P10_BUILTIN_CNTLZDM, P10_BUILTIN_CNTTZDM, P10_BUILTIN_PDEPD and
P10_BUILTIN_PEXTD on Power10.
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 279f00cc648..a8b3175ed50 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5785,6 +5785,59 @@ rs6000_builtin_md_vectorized_function (tree fndecl, tree 
type_out,
 default:
   break;
 }
+
+  machine_mode in_vmode = TYPE_MODE (type_in);
+  machine_mode out_vmode = TYPE_MODE (type_out);
+
+  /* Power10 supported vectorized built-in functions.  */
+  if (TARGET_POWER10
+  && in_vmode == out_vmode
+  && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode))
+{
+  machine_mode exp_mode = DImode;
+  machine_mode exp_vmode = V2DImode;
+  enum rs6000_builtins name;
+  switch (fn)
+   {
+   case MISC_BUILTIN_DIVWE:
+   case MISC_BUILTIN_DIVWEU:
+ exp_mode = SImode;
+ exp_vmode = V4SImode;
+ if (fn == MISC_BUILTIN_DIVWE)
+   name = P10V_BUILTIN_DIVES_V4SI;
+ else
+   name = P10V_BUILTIN_DIVEU_V4SI;
+ break;
+   case MISC_BUILTIN_DIVDE:
+   case MISC_BUILTIN_DIVDEU:
+ if (fn == MISC_BUILTIN_DIVDE)
+   name = P10V_BUILTIN_DIVES_V2DI;
+ else
+   name = P10V_BUILTIN_DIVEU_V2DI;
+ break;
+   case P10_BUILTIN_CFUGED:
+ name = P10V_BUILTIN_VCFUGED;
+ break;
+   case P10_BUILTIN_CNTLZDM:
+ name = P10V_BUILTIN_VCLZDM;
+ break;
+   case P10_BUILTIN_CNTTZDM:
+ name = P10V_BUILTIN_VCTZDM;
+ 

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Kewen.Lin via Gcc-patches
on 2021/8/12 11:51 PM, Segher Boessenkool wrote:
> On Thu, Aug 12, 2021 at 10:10:10AM +0800, Kewen.Lin wrote:
>>> +  enum rs6000_builtins vname = RS6000_BUILTIN_COUNT;
>>>
>>> Using this as a flag value looks unnecessary.  Is this just being done to
>>> silence a warning?
>>
>> Good question!  I didn't notice whether there is a warning or not; I'm just
>> used to initializing a variable with a suitable value if possible.  If you
>> don't mind, may I still keep it?  If some future code uses vname on a path
>> where it's not assigned, one explicitly wrong enum (bif) seems better than a
>> random one.  Or will that never happen, since the current uninitialized
>> variable detection and warning scheme is robust and we should not worry
>> about it at all?
> 
> It is a bad idea to initialise things unnecessarily: it hinders many
> optimisations, but much more importantly, it silences warnings without
> fixing the problem.
> 

OK, I've made it uninitialized in v2. :-)  I believe the context here is simple
and the uninitialized-variable detector can easily catch and warn about any
misuse in the future.

Sorry for chasing dead ends, but I don't follow how it can hinder optimizations
here.  IIUC it would be optimized away as a dead store?  As to the warning,
although there is no warning, I'd expect it to cause an ICE since the
initialized bif name isn't valid for code generation.  Wouldn't that be better
than a warning?  Sometimes we don't have a proper value for initialization, and
I agree it is better to just leave the variable alone then, but IMHO that isn't
the case here.  :)
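A minimal sketch of the pattern under discussion, using hypothetical stand-in
enums rather than the real rs6000 built-in names: with an early return in the
default case, every path that reaches the use of vname has assigned it, so
leaving it uninitialized gives GCC's uninitialized-variable warnings a chance
to flag a future case that forgets the assignment.  Whether a warning is
actually issued depends on the optimization level and warning flags in use.

```c
#include <assert.h>

/* Hypothetical stand-ins for the scalar and vector built-in enums.  */
enum scalar_bif { SCALAR_DIVWE, SCALAR_DIVWEU, SCALAR_OTHER };
enum vector_bif { VECTOR_DIVES, VECTOR_DIVEU };

/* Stores the vector counterpart of FN in *OUT and returns 1, or returns
   0 when FN has no vector counterpart.  */
int
map_to_vector (enum scalar_bif fn, enum vector_bif *out)
{
  enum vector_bif vname;  /* deliberately left uninitialized */

  assert (out);

  switch (fn)
    {
    case SCALAR_DIVWE:
      vname = VECTOR_DIVES;
      break;
    case SCALAR_DIVWEU:
      vname = VECTOR_DIVEU;
      break;
    default:
      /* Early return: no path reaches the use below unassigned, so no
	 sentinel value and no later "is it set?" check are needed.  */
      return 0;
    }

  *out = vname;
  return 1;
}
```

Initializing vname to a sentinel would instead hide a forgotten assignment
from the compiler and move the failure to run time.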


BR,
Kewen

>>> +  if (vname != RS6000_BUILTIN_COUNT
>>>
>>> Check is not necessary, as you will have returned by now in that case.
>>
>> Thanks for catching this.  I put a break for "default" initially and didn't
>> notice that the following condition needed an adjustment after changing it
>> to an early return.  Will fix it.
> 
> Thanks :-)
> 
> 
> Segher
> 


libgo patch committed: Update to Go1.17rc2 release

2021-08-12 Thread Ian Lance Taylor via Gcc-patches
This patch updates libgo from the Go 1.16.5 release to the Go 1.17rc2
release.  As usual with these version updates, the patch itself is too
large to attach to this e-mail message.  I've attached the changes to
files that are specific to gccgo.  Bootstrapped and ran the Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
5fe441d33024fe33b9835c3e8d6b9f6cf24715f1
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 539d886b08f..bcbe1d93018 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-5edbb624b2595d644eb6842c952a292c41f7d6fa
+33f65dce43bd01c1fa38cd90a78c9aea6ca6dd59
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/MERGE b/libgo/MERGE
index ac842716022..4286d5c5433 100644
--- a/libgo/MERGE
+++ b/libgo/MERGE
@@ -1,4 +1,4 @@
-7677616a263e8ded606cc8297cb67ddc667a876e
+72ab3ff68b1ec894fe5599ec82b8849f3baa9d94
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index dec98756673..92fedcf6eb8 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -366,6 +366,7 @@ toolexeclibgoregexp_DATA = \
 toolexeclibgoruntimedir = $(toolexeclibgodir)/runtime
 
 toolexeclibgoruntime_DATA = \
+   runtime/cgo.gox \
runtime/debug.gox \
runtime/metrics.gox \
runtime/pprof.gox \
@@ -428,7 +429,9 @@ noinst_DATA = \
internal/testenv.gox \
internal/trace.gox \
net/internal/socktest.gox \
-   os/signal/internal/pty.gox
+   os/signal/internal/pty.gox \
+   reflect/internal/example1.gox \
+   reflect/internal/example2.gox
 
 if LIBGO_IS_RTEMS
 rtems_task_variable_add_file = runtime/rtems-task-variable-add.c
@@ -480,14 +483,10 @@ version.go: s-version; @true
 s-version: Makefile
rm -f version.go.tmp
echo "package sys" > version.go.tmp
-   echo 'func init() { DefaultGoroot = "$(prefix)" }' >> version.go.tmp
-   echo 'const TheVersion = "'`cat $(srcdir)/VERSION | sed 1q`' '`$(GOC) --version | sed 1q`'"' >> version.go.tmp
-   echo 'const Goexperiment = ``' >> version.go.tmp
echo 'const GOARCH = "'$(GOARCH)'"' >> version.go.tmp
echo 'const GOOS = "'$(GOOS)'"' >> version.go.tmp
echo 'const GccgoToolDir = "$(libexecsubdir)"' >> version.go.tmp
-   echo >> version.go.tmp
-   echo "type ArchFamilyType int" >> version.go.tmp
+   echo 'const StackGuardMultiplierDefault = 1' >> version.go.tmp
echo >> version.go.tmp
echo "const (" >> version.go.tmp
echo "  UNKNOWN ArchFamilyType = iota" >> version.go.tmp
@@ -507,13 +506,13 @@ s-version: Makefile
done
echo >> version.go.tmp
echo "const (" >> version.go.tmp
-   echo "  ArchFamily = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) family`" >> version.go.tmp
-   echo "  BigEndian = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) bigendian`" >> version.go.tmp
-   echo "  CacheLineSize = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) cachelinesize`" >> version.go.tmp
-   echo "  DefaultPhysPageSize = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) defaultphyspagesize`" >> version.go.tmp
-   echo "  Int64Align = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) int64align`" >> version.go.tmp
-   echo "  MinFrameSize = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) minframesize`" >> version.go.tmp
-   echo "  PCQuantum = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) pcquantum`" >> version.go.tmp
+   echo "  _ArchFamily = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) family`" >> version.go.tmp
+   echo "  _BigEndian = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) bigendian`" >> version.go.tmp
+   echo "  _DefaultPhysPageSize = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) defaultphyspagesize`" >> version.go.tmp
+   echo "  _Int64Align = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) int64align`" >> version.go.tmp
+   echo "  _MinFrameSize = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) minframesize`" >> version.go.tmp
+   echo "  _PCQuantum = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) pcquantum`" >> version.go.tmp
+   echo "  _StackAlign = `$(SHELL) $(srcdir)/goarch.sh $(GOARCH) stackalign`" >> version.go.tmp
echo ")" >> version.go.tmp
echo >> version.go.tmp
for a in $(ALLGOOS); do \
@@ -526,7 +525,6 @@ s-version: Makefile
  fi; \
done
echo >> version.go.tmp
-   echo "type Uintreg uintptr" >> version.go.tmp
$(SHELL) $(srcdir)/mvifdiff.sh version.go.tmp version.go
$(STAMP) $@
 
@@ -547,24 +545,31 @@ s-gcpu: Makefile
$(SHELL) $(srcdir)/mvifdiff.sh gcpugen.go.tmp gcpugen.go
$(STAMP) $@
 
+buildcfg.go: s-buildcfg; @true
+s-buildcfg: Makefile
+   rm -f buildcfg.go.tmp
+   echo "package buildcfg" > buildcfg.go.tmp
+   echo "import \"runtime\"" >> buildcfg.go.tmp
+   echo 'func defaultGOROOTValue()

[PATCH] Fix tests that require IBM 128-bit long double

2021-08-12 Thread Michael Meissner via Gcc-patches
Fix tests that require IBM 128-bit long double

I posted an earlier version of this patch on July 7th, and Segher had some
comments about it on July 14th.  This is a revised version of the patch.

 * My patch: Message-ID: <20210707195837.ga28...@ibm-toto.the-meissners.org>
 * Segher's reply: Message-ID: <20210714223208.gf1...@gate.crashing.org>

This patch adds 3 more selections to target-supports.exp.  They check whether
we can select a particular long double format (IEEE 128-bit, IBM extended
double, or 64-bit) and whether the library support follows the selected
format.  This is needed because two of the tests in the test suite use long
double when they are actually testing IBM extended double.
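For illustration only (this is not the Tcl added to target-supports.exp), the
three formats can be told apart from plain C through the <float.h> constants.
The helper names below are hypothetical; the mantissa widths are the standard
ones for each format (106 bits for IBM double-double, 113 for IEEE 128-bit,
53 for a 64-bit long double).

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: names the long double format selected at
   compile time, based on the mantissa width.  */
const char *
long_double_format (void)
{
#if LDBL_MANT_DIG == 106
  return "IBM extended double (double-double)";
#elif LDBL_MANT_DIG == 113
  return "IEEE 128-bit";
#elif LDBL_MANT_DIG == 53
  return "64-bit (same as double)";
#elif LDBL_MANT_DIG == 64
  return "x87 80-bit extended";
#else
  return "unknown";
#endif
}

/* Returns 1 if doubling DBL_MAX overflows to infinity in long double.
   That happens when long double has the same exponent range as double
   (IBM double-double, 64-bit), but not with IEEE 128-bit.  */
int
doubling_dbl_max_overflows (void)
{
  volatile long double doubled = DBL_MAX;
  doubled *= 2;
  return doubled > LDBL_MAX;
}

/* Consistency check between the run-time and compile-time views.  */
void
self_check (void)
{
  assert (doubling_dbl_max_overflows () == (LDBL_MAX_EXP == DBL_MAX_EXP));
}
```

The second helper shows why the format matters to these tests: a value that
becomes infinity with IBM extended double can still be a normal number with
IEEE 128-bit.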

This patch also makes the two tests that require long double to use the IBM
double-double encoding select that format explicitly.  Doing the switch
requires GLIBC 2.32 or greater.

I have run tests on a little endian power9 system with 3 compilers.  There were
no regressions with these patches, and the two tests in the following patches
now work if the default long double is not IBM 128-bit:

 * One compiler used the default IBM 128-bit format;
 * One compiler used the IEEE 128-bit format; and
 * One compiler used 64-bit long doubles.

I just reverified that this patch still works on a little endian power9 system
running Linux.

I have also tested compilers on a big endian power8 system with a compiler
defaulting to power8 code generation and another with the default cpu
set.  There were no regressions.

Compared to the last version that I posted, I simplified the function names
(eliminating the 'ppc_' and 'override_' parts), changed the length parameter
from using sizeof to just using 16, removed the #if code in the tests, and
changed the comments.

I did not remove the void * casts in calling memcmp, because not having those
casts will cause the test to fail.  This is because the variables are declared
volatile, and GCC now complains that you are discarding the volatile in doing
the call.  Having this warning makes the test fail.
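A small sketch of the diagnostic in question, under stated assumptions: struct
bits128 and same_bits are hypothetical names, and the sketch takes
pointers to volatile rather than comparing objects that are themselves declared
volatile as the test does.  memcmp takes const void * arguments, so passing a
pointer to volatile without a cast discards the qualifier and GCC complains.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 16-byte payload standing in for a 128-bit long double.  */
struct bits128
{
  unsigned char b[16];
};

/* Returns 1 if the two 16-byte objects hold identical bytes.  */
int
same_bits (volatile struct bits128 *x, volatile struct bits128 *y)
{
  assert (x && y);

  /* Without the (void *) casts GCC reports that the call discards the
     volatile qualifier, and in a testsuite run that extra diagnostic
     is enough to make the test fail.  */
  return memcmp ((void *) x, (void *) y, sizeof *x) == 0;
}
```

For example, same_bits (&p, &q) yields 1 for two struct bits128 objects with
equal contents and 0 otherwise.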

Can I check this patch into the master branch?

2021-08-12  Michael Meissner  

gcc/testsuite/
PR target/94630
* gcc.target/powerpc/pr70117.c: Specify that we need the long double
type to be IBM 128-bit.  Remove the code to use __ibm128.
* c-c++-common/dfp/convert-bfp-11.c: Specify that we need the long
double type to be IBM 128-bit.  Run the test at -O2 optimization.
* lib/target-supports.exp (add_options_for_long_double_ibm128): New
function.
(check_effective_target_long_double_ibm128): New function.
(add_options_for_long_double_ieee128): New function.
(check_effective_target_long_double_ieee128): New function.
(add_options_for_long_double_64bit): New function.
(check_effective_target_long_double_64bit): New function.
---
 .../c-c++-common/dfp/convert-bfp-11.c |  20 ++--
 gcc/testsuite/gcc.target/powerpc/pr70117.c|  24 ++--
 gcc/testsuite/lib/target-supports.exp | 110 ++
 3 files changed, 130 insertions(+), 24 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
index 95c433d2c24..c09c8342bbb 100644
--- a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
+++ b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
@@ -1,9 +1,16 @@
-/* { dg-skip-if "" { ! "powerpc*-*-linux*" } } */
+/* { dg-require-effective-target dfp } */
 
-/* Test decimal float conversions to and from IBM 128-bit long double. 
-   Checks are skipped at runtime if long double is not 128 bits.
-   Don't force 128-bit long doubles because runtime support depends
-   on glibc.  */
+/* We need the long double type to be IBM 128-bit because the CONVERT_TO_PINF
+   tests will fail if we use IEEE 128-bit floating point.  This is due to IEEE
+   128-bit having a larger exponent range than IBM 128-bit extended double.  So
+   tests that would generate an infinity with IBM 128-bit will generate a
+   normal number with IEEE 128-bit.  */
+
+/* { dg-require-effective-target long_double_ibm128 } */
+/* { dg-options "-O2" } */
+/* { dg-add-options long_double_ibm128 } */
+
+/* Test decimal float conversions to and from IBM 128-bit long double.   */
 
 #include "convert.h"
 
@@ -36,9 +43,6 @@ CONVERT_TO_PINF (312, tf, sd, 1.6e+308L, d32)
 int
 main ()
 {
-  if (sizeof (long double) != 16)
-return 0;
-
   convert_101 ();
   convert_102 ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
index 3bbd2c595e0..4a51f583157 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
@@ -1,26 +1,18 @@
-/* { dg-do run { target { powerpc*-*-linux* powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } } } */
-/* { dg-options "-std=c99 -mlong-double-128 -O2" } */
+/* { dg-do run } */
+/* { dg-require-effective-targe

[PATCH] Fix xxeval predicates (PR 99921).

2021-08-12 Thread Michael Meissner via Gcc-patches
Fix xxeval predicates (PR 99921).

I originally posted this patch in May and in June.  I'm reposting it now.

I noticed that the xxeval built-in function used the altivec_register_operand
predicate.  Since it takes VSX registers, this might force the register
allocator to issue a move when it could use a traditional floating point
register.  This patch fixes that.

I have done builds on little endian power9, little endian power10, and big
endian power8 systems, and this patch caused no regressions.  Can I check it
into the master branch?

2021-08-12  Michael Meissner  

gcc/
PR target/99921
* config/rs6000/altivec.md (xxeval): Use register_operand
instead of altivec_register_operand.
---
 gcc/config/rs6000/altivec.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index d70c17e6bc2..fd86c300981 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -3875,9 +3875,9 @@ (define_insn "vperm_v16qiv8hi"
 
 (define_insn "xxeval"
   [(set (match_operand:V2DI 0 "register_operand" "=wa")
-   (unspec:V2DI [(match_operand:V2DI 1 "altivec_register_operand" "wa")
- (match_operand:V2DI 2 "altivec_register_operand" "wa")
- (match_operand:V2DI 3 "altivec_register_operand" "wa")
+   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "wa")
+ (match_operand:V2DI 2 "register_operand" "wa")
+ (match_operand:V2DI 3 "register_operand" "wa")
  (match_operand:QI 4 "u8bit_cint_operand" "n")]
 UNSPEC_XXEVAL))]
"TARGET_POWER10"
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] Move xx* builtins to vsx.md.

2021-08-12 Thread Michael Meissner via Gcc-patches
Move xx* builtins to vsx.md.

I originally posted this patch in May.  It needed a slight tune up as the
sources have changed, so I'm reposting it now.

I noticed that the xx built-in functions (xxspltiw, xxspltidp, xxsplti32dx,
xxeval, xxblend, and xxpermx) were all defined in altivec.md.  However, since
the XX instructions can take both traditional floating point and Altivec
registers, these built-in functions should be in vsx.md.

This patch just moves the insns from altivec.md to vsx.md.

I also moved the VM3 mode iterator and VM3_char mode attribute from altivec.md
to vsx.md, since the only uses of these were for the XXBLEND insns.

I have built little endian power9 compilers, little endian power10 compilers,
and big endian power8 compilers with this patch applied, and there were no
regressions.  Can I apply this patch to the master branch?

Note this patch assumes the previous patch:

Fix xxeval predicates (PR 99921).

has been applied.

2021-08-12  Michael Meissner  

gcc/
* config/rs6000/altivec.md (UNSPEC_XXEVAL): Move to vsx.md.
(UNSPEC_XXSPLTIW): Move to vsx.md.
(UNSPEC_XXSPLTID): Move to vsx.md.
(UNSPEC_XXSPLTI32DX): Move to vsx.md.
(UNSPEC_XXBLEND): Move to vsx.md.
(UNSPEC_XXPERMX): Move to vsx.md.
(VM3): Move to vsx.md.
(VM3_char): Move to vsx.md.
(xxspltiw_v4si): Move to vsx.md.
(xxspltiw_v4sf): Move to vsx.md.
(xxspltiw_v4sf_inst): Move to vsx.md.
(xxspltidp_v2df): Move to vsx.md.
(xxspltidp_v2df_inst): Move to vsx.md.
(xxsplti32dx_v4si_inst): Move to vsx.md.
(xxsplti32dx_v4sf): Move to vsx.md.
(xxsplti32dx_v4sf_inst): Move to vsx.md.
(xxblend_): Move to vsx.md.
(xxpermx): Move to vsx.md.
(xxpermx_inst): Move to vsx.md.
* config/rs6000/vsx.md (UNSPEC_XXEVAL): Move from altivec.md.
(UNSPEC_XXSPLTIW): Move from altivec.md.
(UNSPEC_XXSPLTID): Move from altivec.md.
(UNSPEC_XXSPLTI32DX): Move from altivec.md.
(UNSPEC_XXBLEND): Move from altivec.md.
(UNSPEC_XXPERMX): Move from altivec.md.
(VM3): Move from altivec.md.
(VM3_char): Move from altivec.md.
(xxspltiw_v4si): Move from altivec.md.
(xxspltiw_v4sf): Move from altivec.md.
(xxspltiw_v4sf_inst): Move from altivec.md.
(xxspltidp_v2df): Move from altivec.md.
(xxspltidp_v2df_inst): Move from altivec.md.
(xxsplti32dx_v4si_inst): Move from altivec.md.
(xxsplti32dx_v4sf): Move from altivec.md.
(xxsplti32dx_v4sf_inst): Move from altivec.md.
(xxblend_): Move from altivec.md.
(xxpermx): Move from altivec.md.
(xxpermx_inst): Move from altivec.md.
---
 gcc/config/rs6000/altivec.md | 197 -
 gcc/config/rs6000/vsx.md | 206 +++
 2 files changed, 206 insertions(+), 197 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index fd86c300981..2c73ddea823 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -175,16 +175,10 @@ (define_c_enum "unspec"
UNSPEC_VPEXTD
UNSPEC_VCLRLB
UNSPEC_VCLRRB
-   UNSPEC_XXEVAL
UNSPEC_VSTRIR
UNSPEC_VSTRIL
UNSPEC_SLDB
UNSPEC_SRDB
-   UNSPEC_XXSPLTIW
-   UNSPEC_XXSPLTID
-   UNSPEC_XXSPLTI32DX
-   UNSPEC_XXBLEND
-   UNSPEC_XXPERMX
 ])

 (define_c_enum "unspecv"
@@ -225,21 +219,6 @@ (define_mode_iterator VM2 [V4SI
   (KF "FLOAT128_VECTOR_P (KFmode)")
   (TF "FLOAT128_VECTOR_P (TFmode)")])

-;; Like VM2, just do char, short, int, long, float and double
-(define_mode_iterator VM3 [V4SI
-  V8HI
-  V16QI
-  V4SF
-  V2DF
-  V2DI])
-
-(define_mode_attr VM3_char [(V2DI "d")
-  (V4SI "w")
-  (V8HI "h")
-  (V16QI "b")
-  (V2DF  "d")
-  (V4SF  "w")])
-
 ;; Map the Vector convert single precision to double precision for integer
 ;; versus floating point
 (define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")])
@@ -859,170 +838,6 @@ (define_insn "vsdb_"
   "vsdbi %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])

-(define_insn "xxspltiw_v4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=wa")
-   (unspec:V4SI [(match_operand:SI 1 "s32bit_cint_operand" "n")]
-UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
- "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")
-  (set_attr "prefixed" "yes")])
-
-(define_expand "xxspltiw_v4sf"
-  [(set (match_operand:V4SF 0 "register_operand" "=wa")
-   (unspec:V4SF [(match_operand:SF 1 "const_double_operand" "n")]
-UNSPEC_XXSPLTIW))]
- "TARGET_POWER10"
-{
-  long value = rs6000_const_f32_to_i32 (operands[1]);
-  emit_insn (gen_xxspltiw_v4

[PATCH] FreeBSD: Stop linking _p libs for -pg as of FreeBSD 14

2021-08-12 Thread Andreas Tobler via Gcc-patches

Hi,

I would like to commit the attached patch to trunk and after a settling 
period also to all open branches.

Is this ok?

TIA,
Andreas

From a8a602620917efad2579c0dc24b178c0f2b18ff3 Mon Sep 17 00:00:00 2001
From: Andreas Tobler 
Date: Thu, 12 Aug 2021 22:35:52 +0200
Subject: [PATCH] FreeBSD: Stop linking _p libs for -pg as of FreeBSD 14

As of FreeBSD version 14, FreeBSD no longer provides profiled system
libraries like libc_p and libpthread_p. Stop linking against them if
the FreeBSD major version is 14 or more.
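The selection this describes can be sketched in C as follows.  This is only an
illustration of the logic, not the GCC spec machinery: fbsd_libs and its
parameters are hypothetical, and the strings mirror the non-shared,
threads-enabled FBSD_LINK_PG_THREADS macro from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the libraries linked for a non-shared
   link, given the FreeBSD major version and whether -pg and -pthread
   were passed.  */
const char *
fbsd_libs (int major, int pg, int pthread)
{
  assert (major > 0);

  /* Before FreeBSD 14, -pg selects the profiled C library.  */
  if (major < 14 && pg)
    return pthread ? "-lpthread -lc_p" : "-lc_p";

  /* From FreeBSD 14 on there are no _p libraries, so -pg no longer
     changes which libraries are linked.  */
  return pthread ? "-lpthread -lc" : "-lc";
}
```

For instance, fbsd_libs (13, 1, 0) gives "-lc_p", while fbsd_libs (14, 1, 0)
gives "-lc"; in the real patch the second case also emits a note that FreeBSD
no longer provides profiled system libraries.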

gcc/ChangeLog

* config/freebsd-spec.h: Change the fbsd-lib-spec for FreeBSD > 13,
do not link against profiled system libraries if -pg is invoked.
Add a define to note about this change.
* config/aarch64/aarch64-freebsd.h: Use the note to inform if -pg
is invoked on FreeBSD > 13 that FreeBSD no longer provides profiled
system libraries.
* config/arm/freebsd.h: Likewise.
* config/i386/freebsd.h: Likewise.
* config/i386/freebsd64.h: Likewise.
* config/riscv/freebsd.h: Likewise.
* config/rs6000/freebsd64.h: Likewise.
* config/rs6000/sysv4.h: Likewise.
---
 gcc/config/aarch64/aarch64-freebsd.h |  1 +
 gcc/config/arm/freebsd.h |  1 +
 gcc/config/freebsd-spec.h| 18 ++
 gcc/config/i386/freebsd.h|  1 +
 gcc/config/i386/freebsd64.h  |  1 +
 gcc/config/riscv/freebsd.h   |  1 +
 gcc/config/rs6000/freebsd64.h|  1 +
 gcc/config/rs6000/sysv4.h|  1 +
 8 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-freebsd.h b/gcc/config/aarch64/aarch64-freebsd.h
index e2dfe784030..cde74fa7aa5 100644
--- a/gcc/config/aarch64/aarch64-freebsd.h
+++ b/gcc/config/aarch64/aarch64-freebsd.h
@@ -35,6 +35,7 @@
 #undef  FBSD_TARGET_LINK_SPEC
 #define FBSD_TARGET_LINK_SPEC " \
 %{p:%nconsider using `-pg' instead of `-p' with gprof (1)}  \
+" FBSD_LINK_PG_NOTE "  \
 %{v:-V} \
 %{assert*} %{R*} %{rpath*} %{defsym*}   \
 %{shared:-Bshareable %{h*} %{soname*}}  \
diff --git a/gcc/config/arm/freebsd.h b/gcc/config/arm/freebsd.h
index 8a970be0144..d82be28e263 100644
--- a/gcc/config/arm/freebsd.h
+++ b/gcc/config/arm/freebsd.h
@@ -47,6 +47,7 @@
 #undef LINK_SPEC
 #define LINK_SPEC "\
   %{p:%nconsider using `-pg' instead of `-p' with gprof (1)}   \
+  " FBSD_LINK_PG_NOTE "                                        \
   %{v:-V}  \
   %{assert*} %{R*} %{rpath*} %{defsym*}                        \
   %{shared:-Bshareable %{h*} %{soname*}}   \
diff --git a/gcc/config/freebsd-spec.h b/gcc/config/freebsd-spec.h
index 2b492617a0d..3a66e9a0a15 100644
--- a/gcc/config/freebsd-spec.h
+++ b/gcc/config/freebsd-spec.h
@@ -92,19 +92,29 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
libc, depending on whether we're doing profiling or need threads support.
(similar to the default, except no -lg, and no -p).  */
 
+#if FBSD_MAJOR < 14
+#define FBSD_LINK_PG_NOTHREADS "%{!pg: -lc}  %{pg: -lc_p}"
+#define FBSD_LINK_PG_THREADS   "%{!pg: %{pthread:-lpthread} -lc} " \
+   "%{pg: %{pthread:-lpthread} -lc_p}"
+#define FBSD_LINK_PG_NOTE ""
+#else
+#define FBSD_LINK_PG_NOTHREADS "%{-lc} "
+#define FBSD_LINK_PG_THREADS   "%{pthread:-lpthread} -lc "
+#define FBSD_LINK_PG_NOTE "%{pg:%nFreeBSD no longer provides profiled "\
+ "system libraries}"
+#endif
+
 #ifdef FBSD_NO_THREADS
 #define FBSD_LIB_SPEC "                                        \
   %{pthread: %eThe -pthread option is only supported on FreeBSD when gcc \
 is built with the --enable-threads configure-time option.} \
   %{!shared:   \
-%{!pg: -lc}                                                    \
-%{pg:  -lc_p}  \
+" FBSD_LINK_PG_NOTHREADS " \
   }"
 #else
 #define FBSD_LIB_SPEC "                                        \
   %{!shared:   \
-%{!pg: %{pthread:-lpthread} -lc}   \
-%{pg:  %{pthread:-lpthread_p} -lc_p}   \
+" FBSD_LINK_PG_THREADS "   \
   }\
   %{shared:\
 %{pthread:-lpthread} -lc