date:20191024

ping [Patch][Fortran/OpenMP] Don't create "alloc:" for 'target exit data'

2019-10-24 Thread Tobias Burnus




On 10/18/19 11:27 AM, Tobias Burnus wrote:

Currently, one has for
  !$omp target exit data map(delete:x)
in the original dump:
  #pragma omp target exit data map(delete:*x) map(alloc:x [pointer assign, bias: 0])


The "alloc:" not only does not make sense but also gives run-time 
messages like:

libgomp: GOMP_target_enter_exit_data unhandled kind 0x04

[Depending on the data type, the OMP_LIST_MAP handling in 
gfc_trans_omp_clauses adds map clauses of type GOMP_MAP_POINTER and/or 
GOMP_MAP_TO_PSET.]


That's for release:/delete:. However, for 'target exit data' 
(GOMP_target_enter_exit_data) the same issue occurs for 
"from:"/"always, from:", because "from:" implies "alloc:". While 
"alloc:" does not make sense for "target exit data" or "update", it 
surely matters for "target" or "target data". Hence, for "from:" I only 
omit the "alloc:" for exit data and update.


See attached patch. I have additionally Fortran-fied 
libgomp.c/target-20.c to have at least one 'enter/exit target data' 
test case for Fortran.


Built and regtested on x86_64-gnu-linux w/o offloading. I have also 
tested the new test case with nvptx.


Tobias

Fix reductions for fully-masked loops

2019-10-24 Thread Richard Sandiford

Now that vectorizable_operation vectorises most loop stmts involved
in a reduction, it needs to be aware of reductions in fully-masked loops.
The LOOP_VINFO_CAN_FULLY_MASK_P parts of vectorizable_reduction now only
apply to cases that use vect_transform_reduction.

This new way of doing things is definitely an improvement for SVE though,
since it means we can lift the old restriction of not using fully-masked
loops for reduction chains.
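
As a rough illustration of why a reduction in a fully-masked loop needs a conditional operation, here is a scalar C++ emulation. It is only a sketch: VL, vec, mask, cond_add and masked_sum are names invented for this example and are not part of the vectoriser.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical vector length for this sketch; on SVE the real length is a
// property of the hardware.
constexpr std::size_t VL = 4;

using vec = std::array<int, VL>;
using mask = std::array<bool, VL>;

// Emulates a conditional add: active lanes compute acc + x, inactive lanes
// pass ACC through unchanged so they cannot disturb the reduction.
vec cond_add (const mask &m, const vec &acc, const vec &x)
{
  vec res = acc;
  for (std::size_t i = 0; i < VL; ++i)
    if (m[i])
      res[i] = acc[i] + x[i];
  return res;
}

// Sums DATA with a fully-masked loop: there is no scalar tail, the final
// iteration simply runs with some lanes switched off.
int masked_sum (const std::vector<int> &data)
{
  vec acc = {};
  for (std::size_t base = 0; base < data.size (); base += VL)
    {
      mask m;
      vec x = {};
      for (std::size_t i = 0; i < VL; ++i)
        {
          m[i] = base + i < data.size ();
          if (m[i])
            x[i] = data[base + i];
        }
      acc = cond_add (m, acc, x);
    }
  // Final reduction across the lanes of the accumulator.
  int total = 0;
  for (int lane : acc)
    total += lane;
  return total;
}
```

If no conditional form of the operation were available, the inactive lanes of the last iteration would feed junk into the accumulator, which is the case the "can't use a fully-masked loop" dump message covers.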

Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
OK to install?

Richard


2019-10-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (vectorizable_reduction): Restrict the
LOOP_VINFO_CAN_FULLY_MASK_P handling to cases that will be
handled by vect_transform_reduction.  Allow fully-masked loops
to be used with reduction chains.
* tree-vect-stmts.c (vectorizable_operation): Handle reduction
operations in fully-masked loops.
(vectorizable_condition): Reject EXTRACT_LAST_REDUCTION
operations in fully-masked loops.

gcc/testsuite/
* gcc.dg/vect/pr65947-1.c: No longer expect doubled dump lines
for FOLD_EXTRACT_LAST reductions.
* gcc.dg/vect/pr65947-2.c: Likewise.
* gcc.dg/vect/pr65947-3.c: Likewise.
* gcc.dg/vect/pr65947-4.c: Likewise.
* gcc.dg/vect/pr65947-5.c: Likewise.
* gcc.dg/vect/pr65947-6.c: Likewise.
* gcc.dg/vect/pr65947-9.c: Likewise.
* gcc.dg/vect/pr65947-10.c: Likewise.
* gcc.dg/vect/pr65947-12.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Likewise.
* gcc.dg/vect/pr65947-14.c: Likewise.
* gcc.dg/vect/pr80631-1.c: Likewise.
* gcc.dg/vect/pr80631-2.c: Likewise.
* gcc.dg/vect/vect-cond-reduc-3.c: Likewise.
* gcc.dg/vect/vect-cond-reduc-4.c: Likewise.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c	2019-10-24 08:28:45.0 +0100
+++ gcc/tree-vect-loop.c	2019-10-24 08:29:09.177742864 +0100
@@ -6313,38 +6313,8 @@ vectorizable_reduction (stmt_vec_info st
   else
 vec_num = 1;
 
-  internal_fn cond_fn = get_conditional_internal_fn (code);
-  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
-  bool mask_by_cond_expr = use_mask_by_cond_expr_p (code, cond_fn, vectype_in);
-
   vect_model_reduction_cost (stmt_info, reduc_fn, reduction_type, ncopies,
 cost_vec);
-  if (loop_vinfo && LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo))
-{
-  if (reduction_type != FOLD_LEFT_REDUCTION
- && !mask_by_cond_expr
- && (cond_fn == IFN_LAST
- || !direct_internal_fn_supported_p (cond_fn, vectype_in,
- OPTIMIZE_FOR_SPEED)))
-   {
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"can't use a fully-masked loop because no"
-" conditional operation is available.\n");
- LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo) = false;
-   }
-  else if (reduc_index == -1)
-   {
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"can't use a fully-masked loop for chained"
-" reductions.\n");
- LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo) = false;
-   }
-  else
-   vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
-  vectype_in, NULL);
-}
   if (dump_enabled_p ()
   && reduction_type == FOLD_LEFT_REDUCTION)
 dump_printf_loc (MSG_NOTE, vect_location,
@@ -6361,6 +6331,27 @@ vectorizable_reduction (stmt_vec_info st
   STMT_VINFO_DEF_TYPE (stmt_info) = vect_internal_def;
   STMT_VINFO_DEF_TYPE (vect_orig_stmt (stmt_info)) = vect_internal_def;
 }
+  else if (loop_vinfo && LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo))
+{
+  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
+  internal_fn cond_fn = get_conditional_internal_fn (code);
+
+  if (reduction_type != FOLD_LEFT_REDUCTION
+ && !use_mask_by_cond_expr_p (code, cond_fn, vectype_in)
+ && (cond_fn == IFN_LAST
+ || !direct_internal_fn_supported_p (cond_fn, vectype_in,
+ OPTIMIZE_FOR_SPEED)))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"can't use a fully-masked loop because no"
+" conditional operation is available.\n");
+ LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo) = false;
+   }
+  else
+   vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
+  vectype_in, NULL);
+}
   return true;
 }
 
Index: gcc/tree-vect-stmts.c
==

Re: Pass the data vector mode to get_mask_mode

2019-10-24 Thread Richard Sandiford

Bernhard Reutner-Fischer  writes:
> On 23 October 2019 13:16:19 CEST, Richard Sandiford 
>  wrote:
>
>>+++ gcc/config/gcn/gcn.c  2019-10-23 12:13:54.091122156 +0100
>>@@ -3786,8 +3786,7 @@ gcn_expand_builtin (tree exp, rtx target
>>a vector.  */
>> 
>> opt_machine_mode
>>-gcn_vectorize_get_mask_mode (poly_uint64 ARG_UNUSED (nunits),
>>-  poly_uint64 ARG_UNUSED (length))
>>+gcn_vectorize_get_mask_mode (nachine_mode)
>
> nachine?
>
> If that really compiles someone should fix that preexisting typo, I suppose. 
> Didn't look though.

Gah, had a nasty feeling there was some extra testing I'd forgotten to do.

Thanks for spotting that.  Consider it fixed in the obvious way.

Richard

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-24 Thread Richard Sandiford

"H.J. Lu"  writes:
> On Wed, Oct 23, 2019 at 4:51 AM Richard Sandiford
>  wrote:
>>
>> Richard Biener  writes:
>> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> This patch is the first of a series that tries to remove two
>> >> assumptions:
>> >>
>> >> (1) that all vectors involved in vectorisation must be the same size
>> >>
>> >> (2) that there is only one vector mode for a given element mode and
>> >> number of elements
>> >>
>> >> Relaxing (1) helps with targets that support multiple vector sizes or
>> >> that require the number of elements to stay the same.  E.g. if we're
>> >> vectorising code that operates on narrow and wide elements, and the
>> >> narrow elements use 64-bit vectors, then on AArch64 it would normally
>> >> be better to use 128-bit vectors rather than pairs of 64-bit vectors
>> >> for the wide elements.
>> >>
>> >> Relaxing (2) makes it possible for -msve-vector-bits=128 to produce
>> >> fixed-length code for SVE.  It also allows unpacked/half-size SVE
>> >> vectors to work with -msve-vector-bits=256.
>> >>
>> >> The patch adds a new hook that targets can use to control how we
>> >> move from one vector mode to another.  The hook takes a starting vector
>> >> mode, a new element mode, and (optionally) a new number of elements.
>> >> The flexibility needed for (1) comes in when the number of elements
>> >> isn't specified.
>> >>
>> >> All callers in this patch specify the number of elements, but a later
>> >> vectoriser patch doesn't.  I won't be posting the vectoriser patch
>> >> for a few days, hence the RFC/A tag.
>> >>
>> >> Tested individually on aarch64-linux-gnu and as a series on
>> >> x86_64-linux-gnu.  OK to install?  Or if not yet, does the idea
>> >> look OK?
>> >
>> > In isolation the idea looks good but maybe a bit limited?  I see
>> > how it works for the same-size case but if you consider x86
>> > where we have SSE, AVX256 and AVX512 what would it return
>> > for related_vector_mode (V4SImode, SImode, 0)?  Or is this
>> > kind of query not intended (where the component modes match
>> > but nunits is zero)?
>>
>> In that case we'd normally get V4SImode back.  It's an allowed
>> combination, but not very useful.
>>
>> > How do you get from SVE fixed 128bit to NEON fixed 128bit then?  Or is
>> > it just used to stay in the same register set for different component
>> > modes?
>>
>> Yeah, the idea is to use the original vector mode as essentially
>> a base architecture.
>>
>> The follow-on patches replace vec_info::vector_size with
>> vec_info::vector_mode and targetm.vectorize.autovectorize_vector_sizes
>> with targetm.vectorize.autovectorize_vector_modes.  These are the
>> starting modes that would be passed to the hook in the nunits==0 case.
>>
>
> For a target with different vector sizes,
> targetm.vectorize.autovectorize_vector_sizes doesn't return the optimal
> vector sizes for known trip count and unknown trip count.  For a target
> with 128-bit and 256-bit vectors, 256-bit followed by 128-bit works well
> for known trip count since vectorizer knows the maximum usable vector
> size.  But for unknown trip count, we may want to use 128-bit vector when
> 256-bit code path won't be used at run-time, but 128-bit vector will.  At
> the moment, we can only use one set of vector sizes for both known trip
> count and unknown trip count.

Yeah, we're hit by this for AArch64 too.  Andre's recent patches:

https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01564.html
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00205.html

should help.

> Can vectorizer support 2 sets of vector sizes, one for known trip count
> and the other for unknown trip count?

The approach Andre's taking is to continue to use the wider vector size
for unknown trip counts, and instead ensure that the epilogue loop is
vectorised at the narrower vector size if possible.  The patches then
use this vectorised epilogue as a fallback "main" loop if the runtime
trip count is too low for the wide vectors.
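
A scalar C++ sketch of the resulting loop structure, to make the control flow concrete.  The lane counts WIDE and NARROW and the helpers sum_block and sum are invented for this illustration; they only stand in for 256-bit and 128-bit vector code and are not taken from Andre's patches.

```cpp
#include <cstddef>

// Hypothetical lane counts standing in for 256-bit and 128-bit vectors of
// 32-bit elements.
constexpr std::size_t WIDE = 8;
constexpr std::size_t NARROW = 4;

// Adds N consecutive elements starting at P; stands in for one vector
// operation of N lanes.
template <std::size_t N>
int sum_block (const int *p)
{
  int s = 0;
  for (std::size_t i = 0; i < N; ++i)
    s += p[i];
  return s;
}

int sum (const int *a, std::size_t n)
{
  int total = 0;
  std::size_t i = 0;
  // Main loop at the wide size; skipped entirely when n < WIDE.
  for (; i + WIDE <= n; i += WIDE)
    total += sum_block<WIDE> (a + i);
  // Epilogue at the narrow size.  When the wide loop never ran, this
  // effectively acts as the "main" loop for short trip counts.
  for (; i + NARROW <= n; i += NARROW)
    total += sum_block<NARROW> (a + i);
  // Scalar tail for whatever is left.
  for (; i < n; ++i)
    total += a[i];
  return total;
}
```

With n = 6, for example, the wide loop is skipped and the narrow loop handles the first four elements before the scalar tail finishes the rest.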

Thanks,
Richard

Re: Order symbols before section copying in the lto streamer

2019-10-24 Thread Martin Liška

On 10/23/19 10:02 PM, Jan Hubicka wrote:
>> Hi,
>> this patch orders symbols where we copy sections to match the order
>> of files in the command line.  This optimizes streaming process since we
>> are not opening and closing files randomly and also we read them more
>> sequentially.  This saves some kernel time though I think more can be
>> done if we avoid doing pair of mmap/unmap for every file section we
>> read.
>>
>> We also read files in random order in ipa-cp and during devirt.
>> I guess also summary streaming can be refactored to stream all summaries
>> for a given file instead of reading one summary from all files.
>>
>> Bootstrapped/regtested x86_64-linux, plan to commit it this afternoon if
>> there are no complaints.
>>
>> Honza
>>
>>  * lto-common.c (lto_file_finalize): Add order attribute.
>>  (lto_create_files_from_ids): Pass order.
>>  (lto_file_read): Update call of lto_create_files_from_ids.
>>  * lto-streamer-out.c (output_constructor): Push CTORS_OUT timevar.
>>  (cmp_symbol_files): New.
>>  (lto_output): Copy sections in file order.
>>  * lto-streamer.h (lto_file_decl_data): Add field order.
> Hi,
> I have committed the patch but messed up testing so it broke builds with
> static libraries and checking enabled. This is fixed by this patch
> 
>   * lto-streamer-out.c (cmp_symbol_files): Watch for overflow.
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c  (revision 277346)
> +++ lto-streamer-out.c  (working copy)
> @@ -2447,7 +2447,12 @@ cmp_symbol_files (const void *pn1, const
>  
>/* Order within static library.  */
>if (n1->lto_file_data && n1->lto_file_data->id != n2->lto_file_data->id)
> -return n1->lto_file_data->id - n2->lto_file_data->id;
> +{
> +  if (n1->lto_file_data->id > n2->lto_file_data->id)
> + return 1;
> +  if (n1->lto_file_data->id < n2->lto_file_data->id)
> + return -1;
> +}

Hi.

It's unclear to me why you need the patch. Isn't that equivalent?
Why do you need only 1 and -1 as return values?

Martin

>  
>/* And finaly order by the definition order.  */
>return n1->order - n2->order;
>

[PATCH] Fix another UBSAN in Fortran coarray.

2019-10-24 Thread Martin Liška

Hello.

There are 2 more places that need to be handled similarly
to not do out of bounds access.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From b6698f326fa4625aca6b2fa65824f5aed8b331df Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 24 Oct 2019 09:53:43 +0200
Subject: [PATCH] Fix another UBSAN in Fortran coarray.

gcc/fortran/ChangeLog:

2019-10-24  Martin Liska  

	PR fortran/92174
	* array.c (gfc_resolve_array_spec): Break the loop
	for out of bounds index.
	* resolve.c (is_non_constant_shape_array): Likewise.
---
 gcc/fortran/array.c   | 3 +++
 gcc/fortran/resolve.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index f0980dd9cef..36223d2233d 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -410,6 +410,9 @@ gfc_resolve_array_spec (gfc_array_spec *as, int check_constant)
 
   for (i = 0; i < as->rank + as->corank; i++)
 {
+  if (i == GFC_MAX_DIMENSIONS)
+	return false;
+
   e = as->lower[i];
   if (!resolve_array_bound (e, check_constant))
 	return false;
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 93f2d0aa761..5deeb4fc87b 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -12266,6 +12266,9 @@ is_non_constant_shape_array (gfc_symbol *sym)
 	 simplification now.  */
   for (i = 0; i < sym->as->rank + sym->as->corank; i++)
 	{
+	  if (i == GFC_MAX_DIMENSIONS)
+	break;
+
 	  e = sym->as->lower[i];
 	  if (e && (!resolve_index_expr(e)
 		|| !gfc_is_constant_expr (e)))
-- 
2.23.0
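
The guard added in both hunks can be illustrated outside the compiler.  This is only a sketch under stated assumptions: MAX_DIMENSIONS, array_spec and count_nonzero_lower are hypothetical stand-ins for GFC_MAX_DIMENSIONS, gfc_array_spec and the resolution loops, and the value 15 is picked for the example.

```cpp
// Hypothetical stand-in for GFC_MAX_DIMENSIONS; the value is an assumption
// made for this sketch.
constexpr int MAX_DIMENSIONS = 15;

// Simplified stand-in for gfc_array_spec: the bounds array has a fixed
// size, but rank + corank of erroneous input may exceed it.
struct array_spec
{
  int rank;
  int corank;
  int lower[MAX_DIMENSIONS];
};

// Counts nonzero lower bounds, leaving the loop before the index would
// step past the end of LOWER.
int count_nonzero_lower (const array_spec &as)
{
  int count = 0;
  for (int i = 0; i < as.rank + as.corank; i++)
    {
      if (i == MAX_DIMENSIONS)
        break;
      if (as.lower[i] != 0)
        ++count;
    }
  return count;
}
```

Without the early exit, an input with rank + corank above the array size would read past the end of lower, which is the kind of access UBSAN reports.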

Re: [PATCH V6 05/11] bpf: new GCC port

2019-10-24 Thread Jose E. Marchesi



Hi Gerald.

I noticed that https://gcc.gnu.org does not have a news item related
to this contribution.  Would you mind adding one?  (Our web pages are
now in GIT, cf. https://gcc.gnu.org/about.html - let me know if you need
help.)

Done! :)

Re: [PATCH] Fix another UBSAN in Fortran coarray.

2019-10-24 Thread Tobias Burnus


OK. (I assume that for the culprit input, an error is shown at some point.)

Thanks for ubsan testing!

Tobias

On 10/24/19 10:38 AM, Martin Liška wrote:

Hello.

There are 2 more places that need to be handled similarly
to not do out of bounds access.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

Re: [PATCH] Fix parser to recognize operator?:

2019-10-24 Thread Dr. Matthias Kretz

ping

On Montag, 14. Oktober 2019 12:27:11 CEST Matthias Kretz wrote:
> This time with testcase. Is the subdir for the test ok?
> 
> gcc/ChangeLog:
> 
> 2019-10-11  Matthias Kretz  
> 
>   * gcc/cp/parser.c (cp_parser_operator): Parse operator?: as an
>   attempt to overload the conditional operator. Then
>   grok_op_properties can print its useful "ISO C++ prohibits
>   overloading operator ?:" message instead of the cryptic error
>   message about a missing type-specifier before '?' token.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-10-14  Matthias Kretz  
>   * testsuite/g++.dg/parse/operator9.C: New test verifying the
>   correct error message is printed.
> 
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 3ee8da7db94..73385cb3dcb 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -15502,6 +15502,15 @@ cp_parser_operator (cp_parser* parser, location_t
> start_loc)
>op = COMPONENT_REF;
>break;
> 
> +case CPP_QUERY:
> +  op = COND_EXPR;
> +  /* Consume the `?'.  */
> +  cp_lexer_consume_token (parser->lexer);
> +  /* Look for the matching `:'.  */
> +  cp_parser_require (parser, CPP_COLON, RT_COLON);
> +  consumed = true;
> +  break;
> +
>  case CPP_OPEN_PAREN:
>{
>  /* Consume the `('.  */
> diff --git a/gcc/testsuite/g++.dg/parse/operator9.C b/gcc/testsuite/g++.dg/
> parse/operator9.C
> new file mode 100644
> index 000..d66355afab5
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/parse/operator9.C
> @@ -0,0 +1,5 @@
> +// { dg-do compile }
> +
> +struct A {};
> +struct B {};
> +int operator?:(bool, A, B);  // { dg-error "prohibits overloading" }
> 
> On Freitag, 11. Oktober 2019 16:17:09 CEST you wrote:
> > On Fri, Oct 11, 2019 at 04:06:43PM +0200, Matthias Kretz wrote:
> > > This is a minor bugfix for improved error reporting. Overloading ?: is
> > > just as disallowed as it is without this change.
> > 
> > Thanks.  Can you provide a testcase that shows why this change makes
> > sense?
> > That testcase then should be part of the patch submission.


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtzzentrum für Schwerionenforschung https://gsi.de
 SIMD easy and portable https://github.com/VcDevel/Vc
──

[C++ PATCH] P0784R7 constexpr new fixes (PR c++/91369)

2019-10-24 Thread Jakub Jelinek

Hi!

Jonathan has shown me a testcase with std::allocator::{,de}allocate
and std::construct_at which FAILs with the current constexpr new
implementation.

There are two problems that make the testcase rejected, and further issues
(not addressed by this patch) where supposedly invalid C++20 code is
accepted.

The first problem was that cxx_replaceable_global_alloc_fn was actually
treating placement new as replaceable global allocation function, when it is
not.
The second problem is that std::construct_at uses placement new under the
hood.  From what I understood, placement new is not allowed in C++20 in
constexpr context and it is unspecified what magic std::construct_at
uses to get a similar effect.

The fix for the first problem is easy, just add the
DECL_IS_REPLACEABLE_OPERATOR_NEW_P || DECL_IS_OPERATOR_DELETE_P
checks to cxx_replaceable_global_alloc_fn; neither placement new nor some
random user-added ::operator new (size_t, float, double) etc. should have
that set.
For the second one, the patch is allowing placement new in constexpr
evaluation only in std::construct_at and not in other functions.

With this, Jonathan's testcase works fine.  Ok for trunk if it passes
bootstrap/regtest?

Now, for the accepts invalid issues.
From what I understood, in constant evaluation
::operator new{,[]} or ::operator delete{,[]}
can't be called from anywhere, only from new/delete expressions or
std::allocator::{,de}allocate, is that correct?
If so, perhaps we need some CALL_EXPR flag that we'd set on the call
coming from new/delete expressions and disallow calls to
replaceable global allocator functions in constexpr evaluation unless
that flag is set or it is in std::allocator::{,de}allocate.

Another thing is that even with that change,
  std::allocator a;
  auto p = a.allocate (1);
  *p = 1;
  a.deallocate (p, 1);
would be accepted during constexpr evaluation, because allocate already
has the cast which turns "heap uninit" variable into "heap " and assigns
it a type, so there is nothing that will prevent the store from succeeding.
Any thoughts on what to do with that?  Even if the magic cast (perhaps
with some flag on it) is moved from whatever the first cast to non-void*
is to the placement new or start of corresponding constructor, would we
need to unmark it somehow if we say std::destroy_at it but allow
next std::construct_at to construct it again?

2019-10-24  Jakub Jelinek  

PR c++/91369 - Implement P0784R7: constexpr new
* constexpr.c (cxx_replaceable_global_alloc_fn): Don't return true
for placement new.
(cxx_placement_new_fn, is_std_construct_at): New functions.
(cxx_eval_call_expression): Allow placement new in std::construct_at.
(potential_constant_expression_1): Likewise.

* g++.dg/cpp2a/constexpr-new5.C: New test.

--- gcc/cp/constexpr.c.jj   2019-10-23 20:37:59.981872274 +0200
+++ gcc/cp/constexpr.c  2019-10-24 10:20:57.127193138 +0200
@@ -1601,7 +1601,41 @@ cxx_replaceable_global_alloc_fn (tree fn
 {
   return (cxx_dialect >= cxx2a
  && IDENTIFIER_NEWDEL_OP_P (DECL_NAME (fndecl))
- && CP_DECL_CONTEXT (fndecl) == global_namespace);
+ && CP_DECL_CONTEXT (fndecl) == global_namespace
+ && (DECL_IS_REPLACEABLE_OPERATOR_NEW_P (fndecl)
+ || DECL_IS_OPERATOR_DELETE_P (fndecl)));
+}
+
+/* Return true if FNDECL is a placement new function that should be
+   useable during constant expression evaluation of std::construct_at.  */
+
+static inline bool
+cxx_placement_new_fn (tree fndecl)
+{
+  if (cxx_dialect >= cxx2a
+  && IDENTIFIER_NEW_OP_P (DECL_NAME (fndecl))
+  && CP_DECL_CONTEXT (fndecl) == global_namespace
+  && !DECL_IS_REPLACEABLE_OPERATOR_NEW_P (fndecl)
+  && TREE_CODE (TREE_TYPE (fndecl)) == FUNCTION_TYPE)
+{
+  tree first_arg = TREE_CHAIN (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
+  if (TREE_VALUE (first_arg) == ptr_type_node
+ && TREE_CHAIN (first_arg) == void_list_node)
+   return true;
+}
+  return false;
+}
+
+/* Return true if FNDECL is std::construct_at.  */
+
+static inline bool
+is_std_construct_at (tree fndecl)
+{
+  if (!decl_in_std_namespace_p (fndecl))
+return false;
+
+  tree name = DECL_NAME (fndecl);
+  return name && id_equal (name, "construct_at");
 }
 
 /* Subroutine of cxx_eval_constant_expression.
@@ -1738,6 +1772,27 @@ cxx_eval_call_expression (const constexp
  return t;
}
}
+  /* Allow placement new in std::construct_at, just return the second
+argument.  */
+  if (cxx_placement_new_fn (fun)
+ && ctx->call
+ && ctx->call->fundef
+ && is_std_construct_at (ctx->call->fundef->decl))
+   {
+ const int nargs = call_expr_nargs (t);
+ tree arg1 = NULL_TREE;
+ for (int i = 0; i < nargs; ++i)
+   {
+ tree arg = CALL_EXPR_ARG (t, i);
+ arg = cxx_eval_constant_expression (ctx, arg, false,
+

Re: GCC wwwdocs move to git done

2019-10-24 Thread Jose E. Marchesi



Hi Joseph.

I've done the move of GCC wwwdocs to git (using the previously posted and 
discussed scripts), including setting up the post-receive hook to do the 
same things previously covered by the old CVS hooks, and minimal updates 
to the web pages dealing with the CVS setup for wwwdocs.

Belated thanks! :)

[PATCH] Define std::uniform_random_bit_generator concept for C++20

2019-10-24 Thread Jonathan Wakely


* include/bits/random.h (uniform_random_bit_generator): Define for
C++20.
* testsuite/26_numerics/random/concept.cc: New test.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line.

This is the last piece of P0898R3, "C++20 concepts library", although
other proposals added more concepty things to  and other
headers (patch incoming for that).

Tested powerpc64le-linux, committed to trunk.
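
For readers without a C++20 compiler, the requirements the concept checks can be approximated with C++17 type traits.  This is a hedged sketch, not the libstdc++ implementation: is_urbg, good and signed_result are names invented for the example, and the trait only mirrors the three conditions (callable on an lvalue, unsigned integral result, min() and max() returning that same type).

```cpp
#include <type_traits>

// Primary template: anything that fails the checks below is not treated
// as a generator.
template <typename G, typename = void>
struct is_urbg : std::false_type {};

// Selected only when G& is callable with no arguments and G::min() and
// G::max() are valid expressions; the remaining checks compare types.
template <typename G>
struct is_urbg<G, std::void_t<std::invoke_result_t<G &>,
                              decltype (G::min ()),
                              decltype (G::max ())>>
  : std::bool_constant<
      std::is_integral_v<std::invoke_result_t<G &>>
      && std::is_unsigned_v<std::invoke_result_t<G &>>
      && std::is_same_v<decltype (G::min ()), std::invoke_result_t<G &>>
      && std::is_same_v<decltype (G::max ()), std::invoke_result_t<G &>>>
{};

// Satisfies all three conditions.
struct good
{
  unsigned operator() ();
  static constexpr unsigned min () { return 0; }
  static constexpr unsigned max () { return 10; }
};

// Rejected: the result type is a signed integer.
struct signed_result
{
  int operator() ();
  static constexpr int min () { return 0; }
  static constexpr int max () { return 10; }
};
```

Here is_urbg<good>::value is true, while is_urbg<signed_result>::value and is_urbg<int>::value are false.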

commit d8fc3b4d03aeceef9dab968395dfc346db754b27
Author: Jonathan Wakely 
Date:   Thu Oct 24 09:10:25 2019 +0100

Define std::uniform_random_bit_generator concept for C++20

* include/bits/random.h (uniform_random_bit_generator): Define for
C++20.
* testsuite/26_numerics/random/concept.cc: New test.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line.

diff --git a/libstdc++-v3/include/bits/random.h 
b/libstdc++-v3/include/bits/random.h
index e63dbcf5a25..270097e07e6 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -33,6 +33,9 @@
 
 #include 
 #include 
+#if __cplusplus > 201703L
+# include <concepts>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -48,6 +51,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
+#ifdef __cpp_lib_concepts
+  /// Requirements for a uniform random bit generator.
+  template<typename _Gen>
+concept uniform_random_bit_generator
+  = invocable<_Gen&> && unsigned_integral<invoke_result_t<_Gen&>>
+  && requires
+  {
+   { _Gen::min() } -> same_as<invoke_result_t<_Gen&>>;
+   { _Gen::max() } -> same_as<invoke_result_t<_Gen&>>;
+  };
+#endif
+
   /**
* @brief A function template for converting the output of a (integral)
* uniform random number generator to a floatng point result in the range
diff --git a/libstdc++-v3/testsuite/26_numerics/random/concept.cc 
b/libstdc++-v3/testsuite/26_numerics/random/concept.cc
new file mode 100644
index 000..1794ad05419
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/random/concept.cc
@@ -0,0 +1,221 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+// { dg-require-cstdint "" }
+
+#include 
+
+static_assert( std::uniform_random_bit_generator );
+static_assert( std::uniform_random_bit_generator );
+static_assert( std::uniform_random_bit_generator );
+
+struct G1
+{
+  unsigned char operator()();
+  static constexpr unsigned char min() { return 0; }
+  static constexpr unsigned char max() { return 10; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G2
+{
+  unsigned operator()();
+  static constexpr unsigned min() { return 0; }
+  static constexpr unsigned max() { return -1U; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G3
+{
+  unsigned long long operator()();
+  static constexpr unsigned long long min() { return 0; }
+  static constexpr unsigned long long max() { return -1ULL; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G4
+{
+  unsigned operator()(int = 0, int = 0); // extra params, with default args
+  static constexpr unsigned min(long = 0) { return 0; }
+  static constexpr unsigned max(void* = nullptr) { return -1U; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G5
+{
+  unsigned operator()() &; // ref-qualifier
+  static constexpr unsigned min() { return 0; }
+  static constexpr unsigned max() { return 10; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G6
+{
+  unsigned operator()() const; // cv-qualifier
+  static constexpr unsigned min() { return 0; }
+  static constexpr unsigned max() { return 10; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G7
+{
+  unsigned operator()() volatile; // cv-qualifier
+  static constexpr unsigned min() { return 0; }
+  static constexpr unsigned max() { return 10; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G8
+{
+  unsigned operator()() const volatile; // cv-qualifiers
+  static constexpr unsigned min() { return 0; }
+  static constexpr unsigned max() { return 10; }
+};
+
+static_assert( std::uniform_random_bit_generator );
+
+struct G9
+{
+  unsigned operator()() const volatile; // cv-qualifiers
+  static constexpr unsigned min() { re

Re: [PATCH] Define std::uniform_random_bit_generator concept for C++20

2019-10-24 Thread Jonathan Wakely


On 24/10/19 10:34 +0100, Jonathan Wakely wrote:

* include/bits/random.h (uniform_random_bit_generator): Define for
C++20.
* testsuite/26_numerics/random/concept.cc: New test.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line.

This is the last piece of P0898R3, "C++20 concepts library",


... so this updates the docs.

Tested powerpc64le-linux, committed to trunk.


commit bbcbcd50c07b8c735c6d19cc7487bbc736004ab7
Author: Jonathan Wakely 
Date:   Thu Oct 24 10:45:10 2019 +0100

PR libstdc++/88338 Implement P0898R3, C++20 concepts library

The implementation is already complete but this updates the docs and
adds tests for the feature test macro.

* doc/xml/manual/status_cxx2020.xml: Update status.
* doc/html/*: Regenerate.
* testsuite/std/concepts/1.cc: New test.
* testsuite/std/concepts/2.cc: New test.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
index 72c38ef985c..73949a34ad9 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
@@ -535,15 +535,14 @@ Feature-testing recommendations for C++.
 
 
 
-  
 Standard Library Concepts 
   
 http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0898r3.pdf";>
 	P0898R3
 	
   
-   
-  
+   10.1 
+   __cpp_lib_concepts >= 201806L 
 
 
 
diff --git a/libstdc++-v3/testsuite/std/concepts/1.cc b/libstdc++-v3/testsuite/std/concepts/1.cc
new file mode 100644
index 000..f7f86e7b405
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/concepts/1.cc
@@ -0,0 +1,27 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do preprocess { target c++2a } }
+
+#include 
+
+#ifndef __cpp_lib_concepts
+# error "Feature test macro for concepts is missing in "
+#elif __cpp_lib_concepts < 201806L
+# error "Feature test macro for concepts has wrong value in "
+#endif
diff --git a/libstdc++-v3/testsuite/std/concepts/2.cc b/libstdc++-v3/testsuite/std/concepts/2.cc
new file mode 100644
index 000..1b71b3110a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/concepts/2.cc
@@ -0,0 +1,27 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do preprocess { target c++2a } }
+
+#include 
+
+#ifndef __cpp_lib_concepts
+# error "Feature test macro for concepts is missing in "
+#elif __cpp_lib_concepts < 201806L
+# error "Feature test macro for concepts has wrong value in "
+#endif

Re: Order symbols before section copying in the lto streamer

2019-10-24 Thread Jan Hubicka

> On 10/23/19 10:02 PM, Jan Hubicka wrote:
> >> Hi,
> >> this patch orders symbols where we copy sections to match the order
> >> of files in the command line.  This optimizes streaming process since we
> >> are not opening and closing files randomly and also we read them more
> >> sequentially.  This saves some kernel time though I think more can be
> >> done if we avoid doing pair of mmap/unmap for every file section we
> >> read.
> >>
> >> We also read files in random order in ipa-cp and during devirt.
> >> I guess also summary streaming can be refactored to stream all summaries
> >> for a given file instead of reading one summary from all files.
> >>
> >> Bootstrapped/regtested x86_64-linux, plan to commit it this afternoon if
> >> there are no complaints.
> >>
> >> Honza
> >>
> >>* lto-common.c (lto_file_finalize): Add order attribute.
> >>(lto_create_files_from_ids): Pass order.
> >>(lto_file_read): Update call of lto_create_files_from_ids.
> >>* lto-streamer-out.c (output_constructor): Push CTORS_OUT timevar.
> >>(cmp_symbol_files): New.
> >>(lto_output): Copy sections in file order.
> >>* lto-streamer.h (lto_file_decl_data): Add field order.
> > Hi,
> > I have committed the patch but messed up testing so it broke builds with
> > static libraries and checking enabled. This is fixed by this patch
> > 
> > * lto-streamer-out.c (cmp_symbol_files): Watch for overflow.
> > Index: lto-streamer-out.c
> > ===
> > --- lto-streamer-out.c  (revision 277346)
> > +++ lto-streamer-out.c  (working copy)
> > @@ -2447,7 +2447,12 @@ cmp_symbol_files (const void *pn1, const
> >  
> >/* Order within static library.  */
> >if (n1->lto_file_data && n1->lto_file_data->id != n2->lto_file_data->id)
> > -return n1->lto_file_data->id - n2->lto_file_data->id;
> > +{
> > +  if (n1->lto_file_data->id > n2->lto_file_data->id)
> > +   return 1;
> > +  if (n1->lto_file_data->id < n2->lto_file_data->id)
> > +   return -1;
> > +}
> 
> Hi.
> 
> It's unclear to me why you need the patch. Isn't that equivalent?
> Why do you need only 1 and -1 as return values?

The ids are 64-bit unsigned values and the subtraction was running into
overflows.  (To be honest, I am not sure why they are done this way,
but they come from the linker.)

Honza
> 
> Martin
> 
> >  
> >/* And finaly order by the definition order.  */
> >return n1->order - n2->order;
> > 
>
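
To illustrate the overflow discussed in this thread, here is a minimal
sketch.  It assumes the ids are plain 64-bit unsigned integers
(std::uint64_t is an assumption; the real field type is not shown in the
thread), and cmp_by_subtraction and cmp_three_way are hypothetical names,
not functions from lto-streamer-out.c.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the old code: the 64-bit unsigned difference
// is truncated to the int that a qsort-style comparator has to return.
int
cmp_by_subtraction (std::uint64_t id1, std::uint64_t id2)
{
  return static_cast<int> (static_cast<std::uint32_t> (id1 - id2));
}

// Explicit three-way comparison, in the spirit of the fix: the result
// depends only on the ordering of the ids, never on their difference.
int
cmp_three_way (std::uint64_t id1, std::uint64_t id2)
{
  if (id1 > id2)
    return 1;
  if (id1 < id2)
    return -1;
  return 0;
}
```

With ids 0 and 1ULL << 32 the low 32 bits of the difference are all zero,
so the subtraction-based comparator reports two distinct ids as equal,
while the three-way version orders them correctly.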

[PATCH] Fix testsuite fallout from partial PR65930 fix

2019-10-24 Thread Richard Biener



The testcases that are no longer xfailed lacked appropriate target restrictions.
The following should fix the sparc fallout.

Applied to trunk.

Richard.

2019-10-24  Richard Biener  

PR tree-optimization/65930
* gcc.dg/vect/vect-reduc-2char-big-array.c: Adjust again.
* gcc.dg/vect/vect-reduc-2char.c: Likewise.
* gcc.dg/vect/vect-reduc-2short.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-s8b.c: Likewise.
* gcc.dg/vect/vect-reduc-pattern-2c.c: Likewise.

Index: gcc/testsuite/gcc.dg/vect/vect-reduc-2char-big-array.c
===
--- gcc/testsuite/gcc.dg/vect/vect-reduc-2char-big-array.c  (revision 277365)
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-2char-big-array.c  (working copy)
@@ -62,4 +62,4 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-2char.c
===
--- gcc/testsuite/gcc.dg/vect/vect-reduc-2char.c(revision 277365)
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-2char.c(working copy)
@@ -46,4 +46,4 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-2short.c
===
--- gcc/testsuite/gcc.dg/vect/vect-reduc-2short.c   (revision 277365)
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-2short.c   (working copy)
@@ -45,4 +45,4 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8b.c
===
--- gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8b.c  (revision 277365)
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8b.c  (working copy)
@@ -54,5 +54,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" { xfail *-*-* } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" } } */
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_widen_mult_qi_to_hi } } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-2c.c
===
--- gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-2c.c   (revision 277365)
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-2c.c   (working copy)
@@ -45,4 +45,4 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: detected" 1 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_widen_sum_qi_to_hi } } } } */

Re: [Patch 0/X] [WIP][RFC][libsanitizer] Introduce HWASAN to GCC

2019-10-24 Thread Martin Liška

On 10/23/19 1:01 PM, Matthew Malcomson wrote:
> Hi Martin,

Hello.

> 
> I'm getting close to putting up a patch series that I believe could go 
> in before stage1 close.
> 
> I currently have to do testing on sanitizing the kernel, and track down 
> a bootstrap comparison diff in the code handling shadow-stack cleanup 
> during exception unwinding.
> 
> I just thought I'd answer these questions below to see if there's 
> anything extra I could do to make reviewing easier.

I welcome that approach.

> 
> On 23/09/19 09:02, Martin Liška wrote:
>> Hi.
>>
>> As mentioned in the next email thread, there are main objectives
>> that will help me to make a proper patch review:
>>
>> 1) Make first libsanitizer merge from trunk, it will remove the need
>> of the backports that you made. Plus I will be able to apply the
>> patchset on the current master.
> Done
>> 2) I would exclude the setjmp/longjmp - these should be upstreamed first
>> in libsanitizer.
> 
> Will exclude in the patch series, upstreaming is in progress 
> (https://reviews.llvm.org/D69045)
> 
>> 3) I would like to see two HWASAN options that will clearly separate the
>> 2 supported modes: TBI without MTE and MTE. Here I would appreciate having
>> a compiler farm machine with TBI which we can use for testing.
> 
> I went back and looked at clang to see that it uses 
> `-fsanitize=hwaddress` and `-fsanitize=memtag`, which are completely 
> different options.
> 
> I'm now doing the same, with the two sanitizers just using similar code 
> paths.
> 
> In fact, I'm not going to have the MTE instrumentation ready by the end 
> of stage1, so my aim is to just put the `-fsanitize=hwaddress` sanitizer 
> in, but send some outline code to the mailing list to demonstrate how 
> `-fsanitize=memtag` would fit in.

Same here. That will make it easier to merge -fsanitize=hwaddress first.

> 
> 
> ## w.r.t. a compiler farm machine with TBI
> 
> Any AArch64 machine has this feature.  However in order to use the 
> sanitizer the kernel needs to allow "tagged pointers" in syscalls.

If so, then it will be very easy to grab a machine and run a 5.4 kernel on it.
So I'll be able to test the patches.

> 
> The kernel has allowed these tagged pointers in syscalls (once it's been 
> turned on with a relevant prctl) in mainline since 5.4-rc1 (i.e. the 
> start of this month).
> 
> My testing has been on a virtual machine with a mainline kernel built 
> from source.
> 
> Given that, I'm not sure how you want to proceed.
> Could we set up a virtual machine on the compiler farm?
> 
> 
>> 4) About the BUILTIN expansion: you provided a patch for a couple of them. My
>> question is whether the list is complete?
> 
> The list of BUILTINs was nowhere near complete at the time I posted the 
> RFC patches.
> 
> Since then I've added features and correspondingly added BUILTINs.
> 
> Now I believe I've added to sanitizer.def all the BUILTINs this 
> sanitizer will need.
> 
>> 5) I would appreciate the patch set to be split into less logical parts, e.g.
>> libsanitizer changes; option introduction; stack variable handling 
>> (colour/uncolour/alignment);
>> hwasan pass and other GIMPLE-related changes; RTL hooks, new RTL 
>> instructions and expansion changes.
>>
> 
> Will do!

Great.

Thanks,
Martin

> 
>> Thank you,
>> Martin
>> 
>>
>
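
For readers unfamiliar with the "tagged pointers" discussed in this thread,
here is a rough sketch of the address arithmetic.  It assumes the AArch64
top-byte-ignore layout, in which bits 63:56 of a 64-bit address carry the
tag.  with_tag, tag_of and strip_tag are hypothetical helpers, not
libsanitizer or kernel APIs; the code only manipulates integers and never
dereferences a tagged address, so it does not need an AArch64 host.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: the tag lives in the top byte (bits 63:56).
constexpr unsigned tag_shift = 56;
constexpr std::uint64_t tag_mask = 0xffULL << tag_shift;

// Replace whatever is in the top byte of ADDR with TAG.
std::uint64_t
with_tag (std::uint64_t addr, std::uint8_t tag)
{
  return (addr & ~tag_mask) | (static_cast<std::uint64_t> (tag) << tag_shift);
}

// Extract the tag from the top byte of ADDR.
std::uint8_t
tag_of (std::uint64_t addr)
{
  return static_cast<std::uint8_t> (addr >> tag_shift);
}

// Clear the top byte, giving the address the hardware actually uses.
std::uint64_t
strip_tag (std::uint64_t addr)
{
  return addr & ~tag_mask;
}
```

Two addresses that differ only in the top byte refer to the same memory
under TBI, which is why the kernel has to be told to accept such values in
syscalls.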

[PATCH v2] Move jump threading before reload

2019-10-24 Thread Ilya Leoshkevich

Bootstrapped and regtested on x86_64-redhat-linux, s390x-redhat-linux
and ppc64le-redhat-linux.

v1 -> v2: Improved commit message.


r266734 has introduced a new instance of jump threading pass in order to
take advantage of opportunities that combine opens up.  It was perceived
back then that it was beneficial to delay it after reload, since that
might produce even more such opportunities.

Unfortunately jump threading interferes with hot/cold partitioning.  In
the code from PR92007, it converts the following

  +-------------------------- 2/HOT ------------------------+
  |                                                         |
  v                                                         v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
            |                               ^
            |                               |
            +-------------------------------+

into the following:

  +---------------------- 2/HOT ------------------+
  |                                               |
  v                                               v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this
fixup_partitions makes bb 6 cold as well, which in turn makes EH edge
6->16 a crossing one.  Not only can't we have crossing EH edges, we are
also not allowed to introduce new crossing edges after reload in
general, since it might require extra registers on some targets.
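
The effect described above can be reproduced on a toy CFG.  This is only a
sketch under assumptions: mark_blocks_without_hot_path_cold is a
hypothetical, simplified stand-in for the partition fixup (a block that can
only be reached through cold blocks is itself made cold), crossing_p is a
hypothetical helper, and the edge list used with them is an illustrative
reading of the threaded CFG, not data taken from the PR.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Simplified stand-in for the fixup: every block that cannot be reached
// from ENTRY along a path consisting of hot blocks only is added to COLD.
std::set<int>
mark_blocks_without_hot_path_cold (int entry, const std::vector<Edge> &edges,
                                   std::set<int> cold)
{
  std::set<int> blocks;
  std::map<int, std::vector<int>> succs;
  blocks.insert (entry);
  for (const Edge &e : edges)
    {
      blocks.insert (e.first);
      blocks.insert (e.second);
      succs[e.first].push_back (e.second);
    }

  // Depth-first walk that never enters a cold block.
  std::set<int> hot_reachable;
  std::vector<int> worklist;
  if (!cold.count (entry))
    worklist.push_back (entry);
  while (!worklist.empty ())
    {
      int bb = worklist.back ();
      worklist.pop_back ();
      if (!hot_reachable.insert (bb).second)
        continue;
      for (int succ : succs[bb])
        if (!cold.count (succ))
          worklist.push_back (succ);
    }

  for (int bb : blocks)
    if (!hot_reachable.count (bb))
      cold.insert (bb);
  return cold;
}

// An edge is crossing if its endpoints are in different partitions.
bool
crossing_p (const Edge &e, const std::set<int> &cold)
{
  return cold.count (e.first) != cold.count (e.second);
}
```

Calling it with entry block 2, the edges 2->3, 2->16, 3->8, 8->11, 11->6
and 6->16, and bb 11 as the only cold block makes bb 6 cold too, after
which crossing_p reports the 6->16 edge as crossing.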

Therefore, move the jump threading pass between combine and hot/cold
partitioning.  Building SPEC 2006 and SPEC 2017 with the old and the new
code indicates that:

* When doing jump threading right after reload, 3889 edges are threaded.
* When doing jump threading right after combine, 3918 edges are
  threaded.

This means this change will not introduce performance regressions.

gcc/ChangeLog:

2019-10-17  Ilya Leoshkevich  

PR rtl-optimization/92007
* cfgcleanup.c (thread_jump): Add an assertion that we don't
call it after reload if hot/cold partitioning has been done.
(class pass_postreload_jump): Rename to pass_jump_after_combine.
(make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.
* passes.def (pass_postreload_jump): Move before reload, rename
to pass_jump_after_combine.
* tree-pass.h (make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.

gcc/testsuite/ChangeLog:

2019-10-17  Ilya Leoshkevich  

PR rtl-optimization/92007
* g++.dg/opt/pr92007.C: New test (from Arseny Solokha).
---
 gcc/cfgcleanup.c   | 22 +++-
 gcc/passes.def |  2 +-
 gcc/testsuite/g++.dg/opt/pr92007.C | 32 ++
 gcc/tree-pass.h|  2 +-
 4 files changed, 47 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr92007.C

diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index ced7e0a4283..835f7d79ea4 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -259,6 +259,10 @@ thread_jump (edge e, basic_block b)
   bool failed = false;
   reg_set_iterator rsi;
 
+  /* Jump threading may cause fixup_partitions to introduce new crossing edges,
+ which is not allowed after reload.  */
+  gcc_checking_assert (!reload_completed || !crtl->has_bb_partition);
+
   if (b->flags & BB_NONTHREADABLE_BLOCK)
 return NULL;
 
@@ -3280,10 +3284,10 @@ make_pass_jump (gcc::context *ctxt)
 
 namespace {
 
-const pass_data pass_data_postreload_jump =
+const pass_data pass_data_jump_after_combine =
 {
   RTL_PASS, /* type */
-  "postreload_jump", /* name */
+  "jump_after_combine", /* name */
   OPTGROUP_NONE, /* optinfo_flags */
   TV_JUMP, /* tv_id */
   0, /* properties_required */
@@ -3293,20 +3297,20 @@ const pass_data pass_data_postreload_jump =
   0, /* todo_flags_finish */
 };
 
-class pass_postreload_jump : public rtl_opt_pass
+class pass_jump_after_combine : public rtl_opt_pass
 {
 public:
-  pass_postreload_jump (gcc::context *ctxt)
-: rtl_opt_pass (pass_data_postreload_jump, ctxt)
+  pass_jump_after_combine (gcc::context *ctxt)
+: rtl_opt_pass (pass_data_jump_after_combine, ctxt)
   {}
 
   /* opt_pass methods: */
   virtual unsigned int execute (function *);
 
-}; // class pass_postreload_jump
+}; // class pass_jump_after_combine
 
 unsigned int
-pass_postreload_jump::execute (function *)
+pass_jump_after_combine::execute (function *)
 {
   cleanup_cfg (flag_thread_jumps ? CLEANUP_THREADING : 0);
   return 0;
@@ -3315,9 +3319,9 @@ pass_postreload_jump::execute (function *)
 } // anon namespace
 
 rtl_opt_pass *
-make_pass_postreload_jump (gcc::context *ctxt)
+make_pass_jump_after_combine (gcc::context *ctxt)
 {
-  return new pass_postreload_jump (ctxt);
+  return new pass_jump_after_combine (ctxt);
 }
 
 namespace {
diff --git a/gcc/passes.def b/gcc/passes.def
index 8999ceec636..798a391bd35 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -4

Re: [PATCH 00/29] [arm] Rewrite DImode arithmetic support

2019-10-24 Thread Christophe Lyon


On 23/10/2019 15:21, Richard Earnshaw (lists) wrote:

On 23/10/2019 09:28, Christophe Lyon wrote:

On 21/10/2019 14:24, Richard Earnshaw (lists) wrote:

On 21/10/2019 12:51, Christophe Lyon wrote:

On 18/10/2019 21:48, Richard Earnshaw wrote:

Each patch should produce a working compiler (it did when it was
originally written), though since the patch set has been re-ordered
slightly there is a possibility that some of the intermediate steps
may have missing test updates that are only cleaned up later.
However, only the end of the series should be considered complete.
I've kept the patch as a series to permit easier regression hunting
should that prove necessary.


Thanks for this information: my validation system was designed in such a way 
that it will run the GCC testsuite after each of your patches, so I'll keep in 
mind not to report regressions (I've noticed several already).


I can perform a manual validation taking your 29 patches as a single one and 
compare the results with those of the revision preceding the one where you 
committed patch #1. Do you think it would be useful?


Christophe




I think if you can filter out any that are removed by later patches and then 
report against the patch that caused the regression itself then that would be 
the best.  But I realise that would be more work for you, so a round-up against 
the combined set would be OK.

BTW, I'm aware of an issue with the compiler now generating

  reg, reg, shift 

in Thumb2; no need to report that again.

Thanks,
R.




Hi Richard,

The validation of the whole set shows 1 regression, which was also reported by 
the validation of r277179 (early split most DImode comparison operations)

When GCC is configured as:
--target arm-none-eabi
--with-mode default
--with-cpu default
--with-fpu default
(that is, no --with-mode, --with-cpu, --with-fpu option)
I'm using binutils-2.28 and newlib-3.1.0

I can see:
FAIL: g++.dg/opt/pr36449.C  -std=gnu++14 execution test
(whatever -std=gnu++XX option)


That's strange.  The assembler code generated for that test is unchanged from 
before the patch series, so I can't see how it can't be a problem in the test 
itself.  What's more, I can't seem to reproduce this myself.


As you have noticed, I have created PR92207 to help understand this.



Similarly, in my build the code for _Znwj, malloc, malloc_r and free_r are also 
unchanged, while the malloc_[un]lock functions are empty stubs (not surprising 
as we aren't multi-threaded).

So the only thing that looks to have really changed are the linker offsets 
(some of the library code has changed, but I don't think it's really reached in 
practice, so shouldn't be relevant).



I'm executing the tests using qemu-4.1.0 -cpu arm926
The qemu traces shows that code enters main, then _Znwj (operator new), then 
_malloc_r
The qemu traces end with:


What do you mean by 'end with'?  What's the failure mode of the test?  A crash, 
or the test exiting with a failure code?


qemu complains with:
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

'end with' because my automated validation builds do not keep the full 
execution traces (that would need too much disk space)


IN: _malloc_r
0x00019224:  e3a00ffe  mov  r0, #0x3f8
0x00019228:  e3a0c07f  mov  ip, #0x7f
0x0001922c:  e3a0e07e  mov  lr, #0x7e
0x00019230:  eafffe41  b    #0x18b3c

R00=00049418 R01= R02=0554 R03=0004
R04= R05=0808 R06=00049418 R07=
R08= R09= R10=000492d8 R11=fffeb4b4
R12=0060 R13=fffeb460 R14=00018b14 R15=00019224
PSR=2010 --C- A usr32

IN: _malloc_r
0x00018b3c:  e59f76f8  ldr  r7, [pc, #0x6f8]
0x00018b40:  e087  add  r0, r7, r0
0x00018b44:  e5903004  ldr  r3, [r0, #4]
0x00018b48:  e248  sub  r0, r0, #8
0x00018b4c:  e153  cmp  r0, r3
0x00018b50:  1a05  bne  #0x18b6c


But this block neither jumps to, nor falls through to 

R00=03f8 R01= R02=0554 R03=0004
R04= R05=0808 R06=00049418 R07=
R08= R09= R10=000492d8 R11=fffeb4b4
R12=007f R13=fffeb460 R14=007e R15=00018b3c
PSR=2010 --C- A usr32
R00=00049c30 R01= R02=0554 R03=00049c30
R04= R05=0808 R06=00049418 R07=00049840
R08= R09= R10=000492d8 R11=fffeb4b4
R12=007f R13=fffeb460 R14=007e R15=00018b54
PSR=6010 -ZC- A usr32

IN: _malloc_r


...here.  So there's some trace missing by the looks of it; or some other 
problem.


0x00019120:  e1a02a0b  lsl  r2, fp, #0x14
0x00019124:  e1a02a22  lsr  r2, r2, #0x14
0x00019128:  e352  cmp  r2, #0
0x0001912c:  1afffee7  bne  #0x18cd0


and the same here.

yes, qemu traces are 'incomplete'.




R00=0004b000 R01=08002108 R02=00049e40 R03=0004b000
R04=0004a8e0 R05=0808 R06=00049418 R07=00

[PATCH] Fix typo

2019-10-24 Thread Richard Biener



Committed.

Richard.

2019-10-24  Richard Biener  

PR tree-optimization/65930
* gcc.dg/vect/vect-reduc-2short.c: Fix typo.

Index: gcc/testsuite/gcc.dg/vect/vect-reduc-2short.c
===
--- gcc/testsuite/gcc.dg/vect/vect-reduc-2short.c   (revision 277372)
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-2short.c   (working copy)
@@ -45,4 +45,4 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } */

[PATCH] Fix PR92203

2019-10-24 Thread Richard Biener



Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2019-10-24  Richard Biener  

PR tree-optimization/92203
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Skip eliminating conversion stmts inserted by insertion.

* gcc.dg/torture/pr88240.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 277365)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -5459,8 +5459,13 @@ eliminate_dom_walker::eliminate_stmt (ba
 
  /* If this is an assignment from our leader (which
 happens in the case the value-number is a constant)
-then there is nothing to do.  */
- if (gimple_assign_single_p (stmt)
+then there is nothing to do.  Likewise if we run into
+inserted code that needed a conversion because of
+our type-agnostic value-numbering of loads.  */
+ if ((gimple_assign_single_p (stmt)
+  || (is_gimple_assign (stmt)
+  && (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
+  || gimple_assign_rhs_code (stmt) == VIEW_CONVERT_EXPR)))
  && sprime == gimple_assign_rhs1 (stmt))
return;
 
Index: gcc/testsuite/gcc.dg/torture/pr88240.c
===
--- gcc/testsuite/gcc.dg/torture/pr88240.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr88240.c  (working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wno-div-by-zero" } */
+
+unsigned long int rr;
+
+void
+cw (int z9)
+{
+  int m5;
+  unsigned long int vz = 0;
+  long int *na;
+
+  if (z9 == 0)
+rr = 0;
+  else
+{
+  na = (long int *) &m5;
+  for (*na = 0; *na < 1; ++*na)
+   {
+ na = (long int *) &vz;
+ rr /= 0;
+   }
+}
+
+  m5 = rr / 5;
+  ++vz;
+  if (vz != 0)
+while (z9 < 1)
+  {
+   if (m5 >= 0)
+ rr += m5;
+
+   na = (long int *) &rr;
+   if (*na >= 0)
+ rr = 0;
+  }
+}

Re: [PR testsuite/91842] Skip gcc.dg/ipa/ipa-sra-19.c on power

2019-10-24 Thread Andreas Krebbel

On 02.10.19 17:06, Martin Jambor wrote:
> Hi,
> 
> I seem to remember I minimized gcc.dg/ipa/ipa-sra-19.c on power but
> perhaps I am wrong because the testcase fails there with a
> power-specific error:
> 
> gcc.dg/ipa/ipa-sra-19.c:19:3: error: AltiVec argument passed to unprototyped function
> 
> I am going to simply skip it there with the following patch, which I
> hope is obvious.  Tested by running ipa.exp on both ppc64le-linux and
> x86_64-linux.
> 
> Thanks,
> 
> Martin
> 
> 
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
> index adebaa5f5e1..d219411d8ba 100644
> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2"  } */
> +/* { dg-skip-if "" { powerpc*-*-* } } */
>  
>  typedef int __attribute__((__vector_size__(16))) vectype;
> 
> 

I ran into the same problem on IBM Z. Is it important for the testcase to leave
the argument list of k unspecified, or would it be ok to add it?

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
index d219411d8ba..d9dcd33cb76 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
@@ -5,7 +5,7 @@
 typedef int __attribute__((__vector_size__(16))) vectype;

 vectype dk();
-vectype k();
+vectype k(vectype);

 int b;
 vectype *j;

Re: [C++ PATCH] Implement P1073R3: Immediate functions (PR c++/88335)

2019-10-24 Thread Jakub Jelinek

On Tue, Oct 22, 2019 at 10:57:42AM -0400, Jason Merrill wrote:
> > So, do you prefer to do it the other way during build_cxx_call?
> 
> It seems more straightforward.

I've tried this approach, but am running into issues:
1) the normal constructors aren't an issue, all I was missing is passing
   the object argument down to cxx_constant_value.  The only problem
   (I'm aware of) is then the case where build_over_call is called on
   a constructor on is_dummy_object, because that obviously isn't
   usable in constant expression as something to store into.  This is
   used e.g. in the value initialization in consteval8.C test, but probably
   for vec init too and other cases.  I've tried to create a temporary in
   that case, but consteval8.C still ICEs
/usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval8.C: In function 'int foo()':
/usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval8.C:13:13: internal compiler error: tree check: expected aggr_init_expr, have target_expr in build_value_init, at cp/init.c:372
0x190aa02 tree_check_failed(tree_node const*, char const*, int, char const*, ...)
        ../../gcc/tree.c:9925
0x8c37d8 tree_check(tree_node*, char const*, int, char const*, tree_code)
        ../../gcc/tree.h:3267
0xa254b5 build_value_init(tree_node*, int)
        ../../gcc/cp/init.c:372
2) all other testcases in the testsuite pass, but I'm worried about
   default arguments in consteval lambdas.
consteval int bar () { return 42; }
consteval int baz () { return 1; }
typedef int (*fnptr) ();
consteval fnptr quux () { return bar; }

void
foo ()
{
  auto qux = [] (fnptr a = quux ()) consteval { return a (); };
  constexpr auto c = qux (baz);
  constexpr auto d = qux (bar);
  constexpr auto e = qux ();
  static_assert (c == 1);
  static_assert (d == 42);
  static_assert (e == 42);
}
  I believe qux (baz) and qux (bar) are invalid and the patch rejects
  them (I think the innermost non-block scope for the baz in qux (baz) is
  not a function parameter scope of an immediate function and so taking
  the address there is invalid).  But isn't the qux () call ok?
  I mean it is similar to the non-lambda calls in the example in the
  standard.  Unfortunately, when parsing the default arguments of a
  lambda, we haven't seen the consteval keyword yet.  I think we could
  tentatively set the consteval argument scope when parsing any lambda
  and if it is not consteval, call a cxx_eval_consteval like function
  to evaluate it at that point.  Thoughts on that?

3) compared to the May version of the patch, I also found that
   build_over_call has a completely separate path if
   processing_template_decl and does something much simpler in that
   case, but I believe we still need to evaluate consteval calls
   even if processing_template_decl if they aren't dependent.

2019-10-24  Jakub Jelinek  

PR c++/88335 - Implement P1073R3: Immediate functions
c-family/
* c-common.h (enum rid): Add RID_CONSTEVAL.
* c-common.c (c_common_reswords): Add consteval.
cp/
* cp-tree.h (struct lang_decl_fn): Add immediate_fn_p bit.
(DECL_IMMEDIATE_FUNCTION_P, SET_DECL_IMMEDIATE_FUNCTION_P): Define.
(enum cp_decl_spec): Add ds_consteval.
(build_local_temp): Declare.
(fold_non_dependent_expr): Add another tree argument defaulted to
NULL_TREE.
* name-lookup.h (struct cp_binding_level): Add immediate_fn_ctx_p
member.
* tree.c (build_local_temp): Remove forward declaration, no longer
static.
* parser.c (cp_keyword_starts_decl_specifier_p): Adjust comments
for C++11 and C++20 specifiers.  Handle RID_CONSTEVAL.
(CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR): Adjust comment.
(CP_PARSER_FLAGS_CONSTEVAL): New.
(cp_parser_lambda_declarator_opt): Handle ds_consteval.
(cp_parser_decl_specifier_seq): Handle RID_CONSTEVAL.
(cp_parser_explicit_instantiation): Diagnose explicit instantiation
with consteval specifier.
(cp_parser_init_declarator): For consteval or into flags
CP_PARSER_FLAGS_CONSTEVAL.
(cp_parser_direct_declarator): If CP_PARSER_FLAGS_CONSTEVAL, set
current_binding_level->immediate_fn_ctx_p in the sk_function_parms
scope.
(set_and_check_decl_spec_loc): Add consteval entry, formatting fix.
* call.c (build_addr_func): For direct calls to immediate functions
use build_address rather than decay_conversion.
(build_over_call): Evaluate immediate function invocations.
* error.c (dump_function_decl): Handle DECL_IMMEDIATE_FUNCTION_P.
* semantics.c (expand_or_defer_fn_1): Use tentative linkage and don't
call mark_needed for immediate functions.
* typeck.c (cxx_sizeof_or_alignof_expr): Likewise.  Formatting fix.
(cp_build_addr_expr_1): Reject taking address of immediate function
outside of immediate function.
* decl.c (validate_constexpr_redeclaration): Diagnose consteva

[PATCH] Fix PR92205

2019-10-24 Thread Richard Biener



Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2019-10-24  Richard Biener  

PR tree-optimization/92205
* tree-vect-loop.c (vectorizable_reduction): Restrict
search for alternate vectype_in to lane-reducing patterns
we support.

* gcc.dg/vect/pr92205.c: New testcase.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 277365)
+++ gcc/tree-vect-loop.c(working copy)
@@ -5697,6 +5697,8 @@ vectorizable_reduction (stmt_vec_info st
 
   gassign *stmt = as_a <gassign *> (stmt_info->stmt);
   enum tree_code code = gimple_assign_rhs_code (stmt);
+  bool lane_reduc_code_p
+= (code == DOT_PROD_EXPR || code == WIDEN_SUM_EXPR || code == SAD_EXPR);
   int op_type = TREE_CODE_LENGTH (code);
 
   scalar_dest = gimple_assign_lhs (stmt);
@@ -5749,8 +5751,10 @@ vectorizable_reduction (stmt_vec_info st
return false;
 
   /* To properly compute ncopies we are interested in the widest
-input type in case we're looking at a widening accumulation.  */
-  if (tem
+non-reduction input type in case we're looking at a widening
+accumulation that we later handle in vect_transform_reduction.  */
+  if (lane_reduc_code_p
+ && tem
  && (!vectype_in
  || (GET_MODE_SIZE (SCALAR_TYPE_MODE (TREE_TYPE (vectype_in)))
  < GET_MODE_SIZE (SCALAR_TYPE_MODE (TREE_TYPE (tem))
@@ -6233,8 +6237,6 @@ vectorizable_reduction (stmt_vec_info st
   && vect_stmt_to_vectorize (use_stmt_info) == stmt_info)
 single_defuse_cycle = true;
 
-  bool lane_reduc_code_p
-= (code == DOT_PROD_EXPR || code == WIDEN_SUM_EXPR || code == SAD_EXPR);
   if (single_defuse_cycle || lane_reduc_code_p)
 {
   gcc_assert (code != COND_EXPR);
Index: gcc/testsuite/gcc.dg/vect/pr92205.c
===
--- gcc/testsuite/gcc.dg/vect/pr92205.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/pr92205.c (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+int b(int n, unsigned char *a)
+{
+  int d = 0;
+  a = __builtin_assume_aligned (a, __BIGGEST_ALIGNMENT__);
+  for (int c = 0; c < n; ++c)
+d |= a[c];
+  return d;
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { vect_unpack && { ! vect_no_bitwise } } } } } */

[Patch] Add OpenACC 2.6's no_create

2019-10-24 Thread Tobias Burnus

The clause (new in OpenACC 2.6) makes any device code use the local 
memory address for each of the variables specified unless the given 
variable is already present on the current device. Or, in the words of 
OpenACC 2.7 (Sect. 2.7.9, no_create clause):


"The no_create clause may appear on structured data and compute 
constructs." / "For each var in varlist, if var is in shared memory, no 
action is taken; if var is not in shared memory, the no_create clause 
behaves as follows:" [digest: if present, update the present count and, if a 
pointer, attach/detach; if not present, device-local memory is used.]
"The restrictions regarding subarrays in the present clause apply to 
this clause."


Note: "no_create" maps to the (new) GOMP_MAP_NO_ALLOC in the middle 
end, and all the action is in libgomp/target.c but applies only to 
GOMP_MAP_NO_ALLOC; hence, the code should only affect OpenACC.
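
As a rough usage sketch of the clause (not taken from the patch or its test
cases): sum_doubled is a hypothetical function, and the enclosing data
region is an assumption added so that the array is certainly present when
the no_create clause is reached.  A compiler without OpenACC support
ignores the pragmas and runs the loop on the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: double every element and return the new sum.
int
sum_doubled (std::vector<int> &v)
{
  int *a = v.data ();
  int n = static_cast<int> (v.size ());
  int sum = 0;

  // Make the array present on the device for the whole region.
#pragma acc data copy(a[0:n])
  {
    // no_create: a[0:n] is already present here, so the existing device
    // copy is used and nothing new is allocated.  Had it not been
    // present, the local address would have been used instead.
#pragma acc parallel loop no_create(a[0:n]) reduction(+:sum)
    for (int i = 0; i < n; ++i)
      {
        a[i] *= 2;
        sum += a[i];
      }
  }
  return sum;
}
```

Calling sum_doubled on a vector holding 1, 2, 3 doubles the elements in
place and returns their new sum.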


OK for the trunk?

Cheers,

Tobias

PS: This patch is a re-diffed version of the OG9/OG8 version; as some 
other features are not yet on trunk, it misses a test case for 
"no_create(s.y…)" (i.e. the struct component-ref; 
libgomp/testsuite/libgomp.oacc-c-c++-common/nocreate-{3,4}.c); trunk 
also lacks 'acc serial' and, hence, the attached patch lacks the 
OACC_SERIAL_CLAUSE_MASK updates, and gfc_match_omp_map_clause needs 
to be updated later for the allow_derived and allow_common arguments. 
Furthermore, some 'do_detach = false' are missing in libgomp/target.c as 
they do not yet exist on trunk, either.


The openacc-gcc-9 /…-8 branch patch is commit 
8e74c2ec2b90819c995444370e742864a685209f of Dec 20, 2018. It has been 
posted as https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01418.html


Add OpenACC 2.6 `no_create' clause support

The clause makes any device code use the local memory address for each
of the variables specified unless the given variable is already present
on the current device.

2019-10-24  Julian Brown  
	Maciej W. Rozycki  
	Tobias Burnus  

	gcc/
	* omp-low.c (lower_omp_target): Support GOMP_MAP_NO_ALLOC.
	* tree-pretty-print.c (dump_omp_clause): Likewise.

	gcc/c-family/
	* c-pragma.h (pragma_omp_clause): Add
	PRAGMA_OACC_CLAUSE_NO_CREATE.

	gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Support no_create.
	(c_parser_oacc_data_clause): Likewise.
	(c_parser_oacc_all_clauses): Likewise.
	(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
	(OACC_PARALLEL_CLAUSE_MASK, OACC_SERIAL_CLAUSE_MASK): Add
	PRAGMA_OACC_CLAUSE_NO_CREATE.
	* c-typeck.c (handle_omp_array_sections): Support
	GOMP_MAP_NO_ALLOC.

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Support no_create.
	(cp_parser_oacc_data_clause): Likewise.
	(cp_parser_oacc_all_clauses): Likewise.
	(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
	(OACC_PARALLEL_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_NO_CREATE.
	* semantics.c (handle_omp_array_sections): Support no_create.

	gcc/fortran/
	* gfortran.h (gfc_omp_map_op): Add OMP_MAP_NO_ALLOC.
	* openmp.c (omp_mask2): Add OMP_CLAUSE_NO_CREATE.
	(gfc_match_omp_clauses): Support no_create.
	(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES)
	(OACC_DATA_CLAUSES): Add OMP_CLAUSE_NO_CREATE.
	* trans-openmp.c (gfc_trans_omp_clauses_1): Support
	OMP_MAP_NO_ALLOC.

	include/
	* gomp-constants.h (gomp_map_kind): Support GOMP_MAP_NO_ALLOC.

	libgomp/
	* target.c (gomp_map_vars_async): Support GOMP_MAP_NO_ALLOC.

	* testsuite/libgomp.oacc-c-c++-common/nocreate-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/nocreate-2.c: New test.
	* testsuite/libgomp.oacc-fortran/nocreate-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/nocreate-2.f90: New test.

diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index e0aa774555a..da6cfdb8b98 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -153,6 +153,7 @@ enum pragma_omp_clause {
   PRAGMA_OACC_CLAUSE_GANG,
   PRAGMA_OACC_CLAUSE_HOST,
   PRAGMA_OACC_CLAUSE_INDEPENDENT,
+  PRAGMA_OACC_CLAUSE_NO_CREATE,
   PRAGMA_OACC_CLAUSE_NUM_GANGS,
   PRAGMA_OACC_CLAUSE_NUM_WORKERS,
   PRAGMA_OACC_CLAUSE_PRESENT,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7618a46c8bc..1004a2e5579 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11833,7 +11833,9 @@ c_parser_omp_clause_name (c_parser *parser)
 	result = PRAGMA_OMP_CLAUSE_MERGEABLE;
 	  break;
 	case 'n':
-	  if (!strcmp ("nogroup", p))
+	  if (!strcmp ("no_create", p))
+	result = PRAGMA_OACC_CLAUSE_NO_CREATE;
+	  else if (!strcmp ("nogroup", p))
 	result = PRAGMA_OMP_CLAUSE_NOGROUP;
 	  else if (!strcmp ("nontemporal", p))
 	result = PRAGMA_OMP_CLAUSE_NONTEMPORAL;
@@ -12296,7 +12298,10 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
copyout ( variable-list )
create ( variable-list )
delete ( variable-list )
-   present ( variable-list ) */
+   present ( variable-list )
+
+   OpenACC 2.6:
+   no_create ( variable-list ) */
 
 static tree
 c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
@@ -12332,6 +12337,9 @@ c_parser_oa

Re: [PR testsuite/91842] Skip gcc.dg/ipa/ipa-sra-19.c on power

2019-10-24 Thread Martin Jambor

Hi,

On Thu, Oct 24 2019, Andreas Krebbel wrote:
> On 02.10.19 17:06, Martin Jambor wrote:
>> Hi,
>> 
>> I seem to remember I minimized gcc.dg/ipa/ipa-sra-19.c on power but
>> perhaps I am wrong because the testcase fails there with a
>> power-specific error:
>> 
>> gcc.dg/ipa/ipa-sra-19.c:19:3: error: AltiVec argument passed to unprototyped 
>> function
>> 
>> I am going to simply skip it there with the following patch, which I
>> hope is obvious.  Tested by running ipa.exp on both ppc64le-linux and
>> x86_64-linux.
>> 
>> Thanks,
>> 
>> Martin
>> 
>> 
>> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c 
>> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>> index adebaa5f5e1..d219411d8ba 100644
>> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>> @@ -1,5 +1,6 @@
>>  /* { dg-do compile } */
>>  /* { dg-options "-O2"  } */
>> +/* { dg-skip-if "" { powerpc*-*-* } } */
>>  
>>  typedef int __attribute__((__vector_size__(16))) vectype;
>> 
>> 
>
> I ran into the same problem on IBM Z. Is it important for the testcase to 
> leave the argument list of
> k unspecified or would it be ok to add it?

I wanted to write to you that the un-prototypedness is on purpose and
essential to testing what the bug was in the past, but this time I actually
managed to find the associated fix in my ipa-sra branch and found out
that I misremembered: that is not the case.  Sorry for not doing
that before.  I believe the patch is OK then and we can even remove the
dg-skip-if I added.  And by that I mean that although I'm not a
reviewer, I would consider it obvious.  Will you do it or should I take
care of it?

Thanks,

Martin


>
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c 
> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
> index d219411d8ba..d9dcd33cb76 100644
> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
> @@ -5,7 +5,7 @@
>  typedef int __attribute__((__vector_size__(16))) vectype;
>
>  vectype dk();
> -vectype k();
> +vectype k(vectype);
>
>  int b;
>  vectype *j;

More ipa-reference memory use improvements

2019-10-24 Thread Jan Hubicka

Hi,
this patch removes the code that inverted the read/written bitmaps into
not_read/not_written bitmaps and makes sure we compute/stream only useful
summaries.  For cc1, ipa-reference now consumes about 17MB, while previously
we needed about 64MB after condensing the bitmaps and over 200MB before.
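
To make the change of representation concrete, here is a small sketch
using a 64-bit mask in place of GCC's bitmap type.  The names
ref_summary, call_may_read_static and not_read_from are hypothetical and
exist only for this illustration; the real code lives in
ipa-reference.c and tree-ssa-alias.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a per-function summary: bit I is set when
   the function may read / write module static number I (I < 64).  */
struct ref_summary
{
  uint64_t statics_read;
  uint64_t statics_written;
};

/* May a call to the summarised function read static VAR?  A null
   summary means no information, so answer conservatively.  */
bool
call_may_read_static (const struct ref_summary *s, unsigned var)
{
  if (s == NULL)
    return true;
  return (s->statics_read >> var) & 1;
}

/* The old representation stored the complement with respect to the
   set of all module statics.  */
uint64_t
not_read_from (uint64_t all_statics, uint64_t statics_read)
{
  return all_statics & ~statics_read;
}
```

With the positive sets, an empty set means "touches no statics", which
matches the new no_module_statics bitmap returned for unavailable leaf
functions in the patch.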

It also reduces ltrans->wpa streaming and makes it possible to give up when
the list of static vars is large (by introducing a --param), which I will do
next.

I plan to backport these patches to the release tree once they get tested
in trunk.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* ipa-reference.c (ipa_reference_optimization_summary_d): Rename
statics_not_read and statics_not_written to statics_read and
statics_written respectively.
(no_module_statics): New static var.
(ipa_reference_get_not_read_global): Rename to ...
(ipa_reference_get_read_global): ... this.
(ipa_reference_get_not_written_global): Rename to ...
(ipa_reference_get_written_global): ... this.
(dump_static_vars_set_to_file): Dump no_module_statics.
(copy_static_var_set): Add for propagation parameter.
(ipa_init): Initialize no_module_statics.
(ipa_ref_opt_summary_t::duplicate): Update.
(ipa_ref_opt_summary_t::remove): Update.
(propagate): Update.
(write_node_summary_p): Look correctly for bitmap differences.
(ipa_reference_write_optimization_summary): Update.
(ipa_reference_read_optimization_summary): Update.
* ipa-reference.h
(ipa_reference_get_not_read_global): Rename to ...
(ipa_reference_get_read_global): ... this.
(ipa_reference_get_not_written_global): Rename to ...
(ipa_reference_get_written_global): ... this.
* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Update.
(call_may_clobber_ref_p_1): Update.
Index: ipa-reference.c
===
--- ipa-reference.c (revision 277366)
+++ ipa-reference.c (working copy)
@@ -74,8 +74,8 @@ struct ipa_reference_global_vars_info_d
 
 struct ipa_reference_optimization_summary_d
 {
-  bitmap statics_not_read;
-  bitmap statics_not_written;
+  bitmap statics_read;
+  bitmap statics_written;
 };
 
 typedef ipa_reference_local_vars_info_d *ipa_reference_local_vars_info_t;
@@ -103,6 +103,8 @@ varpool_node_hook_list *varpool_node_hoo
static we are considering.  This is added to the local info when asm
code is found that clobbers all memory.  */
 static bitmap all_module_statics;
+/* Zero bitmap.  */
+static bitmap no_module_statics;
 /* Set of all statics that should be ignored because they are touched by
-fno-ipa-reference code.  */
 static bitmap ignore_module_statics;
@@ -193,7 +195,7 @@ get_reference_optimization_summary (stru
NULL if no data is available.  */
 
 bitmap
-ipa_reference_get_not_read_global (struct cgraph_node *fn)
+ipa_reference_get_read_global (struct cgraph_node *fn)
 {
   if (!opt_for_fn (current_function_decl, flag_ipa_reference))
 return NULL;
@@ -208,10 +210,10 @@ ipa_reference_get_not_read_global (struc
  || (avail == AVAIL_INTERPOSABLE
  && flags_from_decl_or_type (fn->decl) & ECF_LEAF))
   && opt_for_fn (fn2->decl, flag_ipa_reference))
-return info->statics_not_read;
+return info->statics_read;
   else if (avail == AVAIL_NOT_AVAILABLE
   && flags_from_decl_or_type (fn->decl) & ECF_LEAF)
-return all_module_statics;
+return no_module_statics;
   else
 return NULL;
 }
@@ -222,7 +224,7 @@ ipa_reference_get_not_read_global (struc
call.  Returns NULL if no data is available.  */
 
 bitmap
-ipa_reference_get_not_written_global (struct cgraph_node *fn)
+ipa_reference_get_written_global (struct cgraph_node *fn)
 {
   if (!opt_for_fn (current_function_decl, flag_ipa_reference))
 return NULL;
@@ -237,10 +239,10 @@ ipa_reference_get_not_written_global (st
  || (avail == AVAIL_INTERPOSABLE
  && flags_from_decl_or_type (fn->decl) & ECF_LEAF))
   && opt_for_fn (fn2->decl, flag_ipa_reference))
-return info->statics_not_written;
+return info->statics_written;
   else if (avail == AVAIL_NOT_AVAILABLE
   && flags_from_decl_or_type (fn->decl) & ECF_LEAF)
-return all_module_statics;
+return no_module_statics;
   else
 return NULL;
 }
@@ -315,6 +317,8 @@ dump_static_vars_set_to_file (FILE *f, b
 return;
   else if (set == all_module_statics)
 fprintf (f, "ALL");
+  else if (set == no_module_statics)
+fprintf (f, "NO");
   else
 EXECUTE_IF_SET_IN_BITMAP (set, 0, index, bi)
   {
@@ -358,10 +362,12 @@ union_static_var_sets (bitmap &x, bitmap
But if SET is NULL or the maximum set, return that instead.  */
 
 static bitmap
-copy_static_var_set (bitmap set)
+copy_static_var_set (bitmap set, bool for_propagation)
 {
   if (set == NULL || set == all_module_statics)
 return set;
+  if (!for_propagation && set == n

[COMMITTED][MSP430] Tweaks to generation of 430X instructions

2019-10-24 Thread Jozef Lawrynowicz

This patch makes a couple of simple tweaks to improve code generation for the
MSP430 430X ISA.

"Address-word instructions" support 20-bit operands without the extension word
required by regular 430X instructions. Using them where possible reduces code
size and improves performance.
We use the "Ya" constraint to indicate special cases where the address-word
"MOVA" instruction can be used. The indirect auto-increment addressing mode
can be used with the source operand of a MOVA instruction, so this patch
allows the Ya constraint to match the (mem (post_inc)) RTX.

Similarly, the RRAM and RLAM rotate instructions do not use the extension word
that their RRAX and RLAX counterparts require. However, their use is limited to
shifting a register by a constant between 1 and 4 bits. This patch ensures they
get used when possible by extending the 430x_shift_left and
430x_arithmetic_shift_right insn patterns.
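
The selection rule described above can be sketched in plain C.  The enum
and function names below are hypothetical and only mirror the conditions
in the insn patterns; they are not part of the back end.

```c
/* Which instruction form the pattern picks for a constant shift.  */
enum shift_form
{
  SHIFT_NONE,      /* amount outside 1..15: pattern emits a nop comment */
  SHIFT_RLAM,      /* single RLAM.W, no extension word */
  SHIFT_RPT_RLAX   /* RPT prefix followed by RLAX.W */
};

/* Mirror of the updated 430x_shift_left conditions: amounts 1..4 use
   RLAM, amounts 5..15 fall back to RPT + RLAX.  */
enum shift_form
select_430x_left_shift (int amount)
{
  if (amount > 0 && amount < 5)
    return SHIFT_RLAM;
  if (amount >= 5 && amount < 16)
    return SHIFT_RPT_RLAX;
  return SHIFT_NONE;
}
```

The arithmetic right shift pattern follows the same split, with RRAM and
RRAX in place of RLAM and RLAX.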

Successfully regtested for msp430-elf on trunk in the small and large memory
models.

Committed to trunk as obvious.
From 47e4f7397a43c86a7d483da1aa914018d52d9e5d Mon Sep 17 00:00:00 2001
From: jozefl 
Date: Thu, 24 Oct 2019 13:34:54 +
Subject: [PATCH] MSP430: Tweaks to generation of 430X instructions

gcc/ChangeLog:

2019-10-24  Jozef Lawrynowicz  

	* config/msp430/constraints.md: Allow post_inc for "Ya" constraint.
	* config/msp430/msp430.md (430x_shift_left): Use RLAM when the constant
	shift amount is between 1 and 4.
	(430x_arithmetic_shift_right): Use RRAM when the constant shift amount
	is between 1 and 4.

gcc/testsuite/ChangeLog:

2019-10-24  Jozef Lawrynowicz  

	* gcc.target/msp430/emulate-slli.c: Skip for -mcpu=msp430.
	Add shift by a constant 5 bits.
	Update scan-assembler directives.
	* gcc.target/msp430/emulate-srai.c: Likewise.
	* gcc.target/msp430/emulate-srli.c: Skip for -mcpu=msp430.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@277394 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |  8 
 gcc/config/msp430/constraints.md   |  1 +
 gcc/config/msp430/msp430.md| 12 
 gcc/testsuite/ChangeLog|  8 
 gcc/testsuite/gcc.target/msp430/emulate-slli.c |  6 +-
 gcc/testsuite/gcc.target/msp430/emulate-srai.c |  6 +-
 gcc/testsuite/gcc.target/msp430/emulate-srli.c |  1 +
 7 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d09b72d2b16..eb0a2f9b510 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2019-10-24  Jozef Lawrynowicz  
+
+	* config/msp430/constraints.md: Allow post_inc for "Ya" constraint.
+	* config/msp430/msp430.md (430x_shift_left): Use RLAM when the constant
+	shift amount is between 1 and 4.
+	(430x_arithmetic_shift_right): Use RRAM when the constant shift amount
+	is between 1 and 4.
+
 2019-10-24  Richard Biener  
 
 	PR tree-optimization/92205
diff --git a/gcc/config/msp430/constraints.md b/gcc/config/msp430/constraints.md
index d01bcf9a242..49fc769ec74 100644
--- a/gcc/config/msp430/constraints.md
+++ b/gcc/config/msp430/constraints.md
@@ -82,6 +82,7 @@
 		  (match_test ("CONST_INT_P (XEXP (XEXP (op, 0), 1))"))
 		  (match_test ("IN_RANGE (INTVAL (XEXP (XEXP (op, 0), 1)), HOST_WIDE_INT_M1U << 15, (1 << 15)-1)"
 	(match_code "reg" "0")
+	(match_code "post_inc" "0")
 	)))
 
 (define_constraint "Yc"
diff --git a/gcc/config/msp430/msp430.md b/gcc/config/msp430/msp430.md
index e5ba445c60d..ed4c370261a 100644
--- a/gcc/config/msp430/msp430.md
+++ b/gcc/config/msp430/msp430.md
@@ -875,8 +875,10 @@
 		   (match_operand2 "immediate_operand" "n")))]
   "msp430x"
   "*
-  if (INTVAL (operands[2]) > 0 && INTVAL (operands[2]) < 16)
-return \"rpt\t%2 { rlax.w\t%0\";
+  if (INTVAL (operands[2]) > 0 && INTVAL (operands[2]) < 5)
+return \"RLAM.W\t%2, %0\";
+  else if (INTVAL (operands[2]) >= 5 && INTVAL (operands[2]) < 16)
+return \"RPT\t%2 { RLAX.W\t%0\";
   return \"# nop left shift\";
   "
 )
@@ -960,8 +962,10 @@
 		 (match_operand2 "immediate_operand" "n")))]
   "msp430x"
   "*
-  if (INTVAL (operands[2]) > 0 && INTVAL (operands[2]) < 16)
-return \"rpt\t%2 { rrax.w\t%0\";
+  if (INTVAL (operands[2]) > 0 && INTVAL (operands[2]) < 5)
+return \"RRAM.W\t%2, %0\";
+  else if (INTVAL (operands[2]) >= 5 && INTVAL (operands[2]) < 16)
+return \"RPT\t%2 { RRAX.W\t%0\";
   return \"# nop arith right shift\";
   "
 )
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2742e10bb6f..ee43703ea54 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2019-10-24  Jozef Lawrynowicz  
+
+	* gcc.target/msp430/emulate-slli.c: Skip for -mcpu=msp430.
+	Add shift by a constant 5 bits.
+	Update scan-assembler directives.
+	* gcc.target/msp430/emulate-srai.c: Likewise.
+	* gcc.target/msp430/emulate-srli.c: Skip for -mcpu=msp430.
+
 2019-10-24  Richard Biener  
 
 	PR tree-optimization/92205
diff --git a/gcc/testsuite/gcc.target/msp430/emulate-s

[COMMITTED][MSP430] Remove unused msp430_hard_regno_nregs_*_padding functions

2019-10-24 Thread Jozef Lawrynowicz

There exist implementations of the HARD_REGNO_NREGS_HAS_PADDING and
HARD_REGNO_NREGS_WITH_PADDING functions in the msp430 back end, but they have
never been tied to their respective target macros.

Defining the target macros so these functions are used has no effect on GCC
test results or on code size. So it seems that subreg_get_info is handling
msp430 PSImode registers properly.

This patch removes these unused functions.

Successfully regtested for msp430-elf on trunk in the small and large memory
models.

Committed to trunk as obvious.
From daf0305adc486dcdecf1d94efd564e0d9f187ecf Mon Sep 17 00:00:00 2001
From: jozefl 
Date: Thu, 24 Oct 2019 13:36:52 +
Subject: [PATCH] MSP430: Remove unused msp430_hard_regno_nregs_*_padding
 functions

2019-10-24  Jozef Lawrynowicz  

	* config/msp430/msp430.c (msp430_hard_regno_nregs_has_padding): Remove
	and add comment.
	(msp430_hard_regno_nregs_with_padding): Remove.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@277395 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |  6 ++
 gcc/config/msp430/msp430.c | 25 +++--
 2 files changed, 9 insertions(+), 22 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index eb0a2f9b510..7b433bf59a1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-10-24  Jozef Lawrynowicz  
+
+	* config/msp430/msp430.c (msp430_hard_regno_nregs_has_padding): Remove
+	and add comment.
+	(msp430_hard_regno_nregs_with_padding): Remove.
+
 2019-10-24  Jozef Lawrynowicz  
 
 	* config/msp430/constraints.md: Allow post_inc for "Ya" constraint.
diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index cd394333983..fe1fcc0db43 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -332,28 +332,9 @@ msp430_hard_regno_nregs (unsigned int, machine_mode mode)
 	  / UNITS_PER_WORD);
 }
 
-/* Implements HARD_REGNO_NREGS_HAS_PADDING.  */
-int
-msp430_hard_regno_nregs_has_padding (int regno ATTRIBUTE_UNUSED,
- machine_mode mode)
-{
-  if (mode == PSImode && msp430x)
-return 1;
-  return ((GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1)
-	  / UNITS_PER_WORD);
-}
-
-/* Implements HARD_REGNO_NREGS_WITH_PADDING.  */
-int
-msp430_hard_regno_nregs_with_padding (int regno ATTRIBUTE_UNUSED,
-  machine_mode mode)
-{
-  if (mode == PSImode)
-return 2;
-  if (mode == CPSImode)
-return 4;
-  return msp430_hard_regno_nregs (regno, mode);
-}
+/* subreg_get_info correctly handles PSImode registers, so defining
+   HARD_REGNO_NREGS_HAS_PADDING and HARD_REGNO_NREGS_WITH_PADDING
+   has no effect.  */
 
 #undef TARGET_HARD_REGNO_MODE_OK
 #define TARGET_HARD_REGNO_MODE_OK msp430_hard_regno_mode_ok
-- 
2.17.1

[C++ Patch] Prefer error + inform in four typeck.c places

2019-10-24 Thread Paolo Carlini


Hi,

some additional straightforward bits in typeck.c, which I noticed when I 
went through the cp_build_binary_op callers. Tested x86_64-linux.


Thanks, Paolo.

//

/cp
2019-10-24  Paolo Carlini  

* typeck.c (cp_build_modify_expr): Prefer error + inform to
error + error in one place.
(get_delta_difference_1): Likewise.
(get_delta_difference): Likewise, in two places.

/testsuite
2019-10-24  Paolo Carlini  

* g++.dg/conversion/ptrmem2.C: Adjust for error + inform.
* g++.dg/gomp/tpl-atomic-2.C: Likewise.
Index: cp/typeck.c
===
--- cp/typeck.c (revision 277366)
+++ cp/typeck.c (working copy)
@@ -8358,8 +8358,8 @@ cp_build_modify_expr (location_t loc, tree lhs, en
  if (newrhs == error_mark_node)
{
  if (complain & tf_error)
-   error ("  in evaluation of %<%Q(%#T, %#T)%>", modifycode,
-  TREE_TYPE (lhs), TREE_TYPE (rhs));
+   inform (loc, "  in evaluation of %<%Q(%#T, %#T)%>",
+   modifycode, TREE_TYPE (lhs), TREE_TYPE (rhs));
  return error_mark_node;
}
 
@@ -8594,7 +8594,7 @@ get_delta_difference_1 (tree from, tree to, bool c
   if (!(complain & tf_error))
return error_mark_node;
 
-  error ("   in pointer to member function conversion");
+  inform (input_location, "   in pointer to member function conversion");
   return size_zero_node;
 }
   else if (binfo)
@@ -8655,7 +8655,7 @@ get_delta_difference (tree from, tree to,
  return error_mark_node;
 
error_not_base_type (from, to);
-   error ("   in pointer to member conversion");
+   inform (input_location, "   in pointer to member conversion");
result = size_zero_node;
   }
 else
@@ -8674,7 +8674,7 @@ get_delta_difference (tree from, tree to,
  return error_mark_node;
 
error_not_base_type (from, to);
-   error ("   in pointer to member conversion");
+   inform (input_location, "   in pointer to member conversion");
result = size_zero_node;
  }
   }
Index: testsuite/g++.dg/conversion/ptrmem2.C
===
--- testsuite/g++.dg/conversion/ptrmem2.C   (revision 277366)
+++ testsuite/g++.dg/conversion/ptrmem2.C   (working copy)
@@ -15,16 +15,20 @@ int B::*p1 = static_cast(&D::x);
 int D::*p2 = static_cast(&B::x);
 
 // Virtual base class.
-int V::*p3 = static_cast(&D::x);  // { dg-error "" }
-int D::*p4 = static_cast(&V::x);  // { dg-error "" }
+int V::*p3 = static_cast(&D::x);  // { dg-error "virtual base" }
+int D::*p4 = static_cast(&V::x);  // { dg-error "virtual base" }
 
 // Inaccessible base class.
-int P::*p5 = static_cast(&D::x);  // { dg-error "" }
-int D::*p6 = static_cast(&P::x);  // { dg-error "" }
+int P::*p5 = static_cast(&D::x);  // { dg-error "inaccessible base" }
+// { dg-message "pointer to member function" "" { target *-*-* } .-1 }
+int D::*p6 = static_cast(&P::x);  // { dg-error "inaccessible base" }
+// { dg-message "pointer to member function" "" { target *-*-* } .-1 }
 
 // Ambiguous base class.
-int A::*p7 = static_cast(&D::x);  // { dg-error "" }
-int D::*p8 = static_cast(&A::x);  // { dg-error "" }
+int A::*p7 = static_cast(&D::x);  // { dg-error "ambiguous base" }
+// { dg-message "pointer to member function" "" { target *-*-* } .-1 }
+int D::*p8 = static_cast(&A::x);  // { dg-error "ambiguous base" }
+// { dg-message "pointer to member function" "" { target *-*-* } .-1 }
 
 // Valid conversions which increase cv-qualification.
 const int B::*p9 = static_cast(&D::x);
@@ -35,5 +39,5 @@ int B::*p11 = static_cast(p10); // { dg-
 int D::*p12 = static_cast(p9);  // { dg-error "casts away 
qualifiers" }
 
 // Attempts to change member type.
-float B::*p13 = static_cast(&D::x); // { dg-error "" }
-float D::*p14 = static_cast(&B::x); // { dg-error "" }
+float B::*p13 = static_cast(&D::x); // { dg-error "invalid 
.static_cast." }
+float D::*p14 = static_cast(&B::x); // { dg-error "invalid 
.static_cast." }
Index: testsuite/g++.dg/gomp/tpl-atomic-2.C
===
--- testsuite/g++.dg/gomp/tpl-atomic-2.C(revision 277374)
+++ testsuite/g++.dg/gomp/tpl-atomic-2.C(working copy)
@@ -13,7 +13,7 @@ template void f1()
 template void f2(float *f)
 {
   #pragma omp atomic   // { dg-error "invalid" }
-  *f |= 1; // { dg-error "evaluation" }
+  *f |= 1; // { dg-message "evaluation" "" { target *-*-* } .-1 }
 }
 
 // Here the rhs is dependent, but not type dependent.
@@ -20,7 +20,7 @@ template void f2(float *f)
 template void f3(float *f)
 {
   #pragma omp atomic   // { dg-error "invalid" }
-  *f |= sizeof (T);// { dg-error "evaluation" }
+  *f |= sizeof (T);// { dg-message "evaluation" "" { target *-*-* } .-1

Re: [gomp4] Update error messages for c and c++ reductions

2019-10-24 Thread Thomas Schwinge

Hi!

On 2017-04-26T13:08:11-0700, Cesar Philippidis  wrote:
> This patch updates the c and c++ FEs to report consistent error
> messages for invalid reductions involving array elements, struct
> members, and class members. Most of those variables were already
> rejected by the generic OpenMP code, but this patch makes the error
> messages more precise for OpenACC. It also fixes an ICE involving
> invalid struct member reductions in c.
>
> I've committed this patch to gomp-4_0-branch.

It then got into openacc-gcc-8-branch in commit
16ead336bc86fade855e0ec2edfc257286f429b6 "[OpenACC] Update error messages
for c and c++ reductions", was proposed for GCC trunk in

"various OpenACC reduction enhancements" (together with other, unrelated
changes...), and got into openacc-gcc-9-branch in commit
533beb2ec19f8486e4b1b645a153746f96b41f04 "Various OpenACC reduction
enhancements - FE changes" (together with other, unrelated changes...).


At least in their og8 incarnation (have not yet verified other
branches/patches), I find these changes cause quite a serious regression
in the C front end.  Given the buggy:

#pragma acc routine vector
int f(int n)
{
  int plus = 0;
  int minus = 0;
#pragma acc loop reduction(+:plus, -:minus)
  for (int i = 0; i < n; ++i)
++plus, --minus;

  return plus * minus;
}

..., where we used to (and still do for C++) diagnose:

[...]: In function 'f':
[...]:6:35: error: expected ')' before '-' token
 #pragma acc loop reduction(+:plus, -:minus)
   ~   ^~
   )

..., with the patch applied, this doesn't get any diagnostic anymore, so
wrong-code gets generated.
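
For contrast, here is a sketch of how such a loop might be spelled
validly, since each reduction clause takes a single operator followed by
a variable list.  The name f_fixed and the choice of '+' for both
variables are my assumptions about the intent (decrementing is just
adding -1); this is not taken from the patch or its test cases.  Without
-fopenacc the pragmas are ignored and the loop simply runs sequentially.

```c
#pragma acc routine vector
int
f_fixed (int n)
{
  int plus = 0;
  int minus = 0;
  /* One clause per operator; both variables are sum reductions.  */
#pragma acc loop reduction(+:plus) reduction(+:minus)
  for (int i = 0; i < n; ++i)
    ++plus, --minus;

  return plus * minus;
}
```

A compiler that silently accepts the original 'reduction(+:plus, -:minus)'
spelling gives the user no hint that a rewrite like this is needed.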


Grüße
 Thomas


>   gcc/c/
>   * c-parser.c (c_parser_omp_variable_list): New c_omp_region_type
>   argument.  Use it to specialize handling of OMP_CLAUSE_REDUCTION for
>   OpenACC.
>   (c_parser_omp_clause_reduction): Update call to
>   c_parser_omp_variable_list.  Propagate OpenACC errors as necessary.
>   (c_parser_oacc_all_clauses): Update call to
>   c_parser_omp_clause_reduction.
>   (c_parser_omp_all_clauses): Likewise.
>   (c_parser_cilk_all_clauses): Likewise.
>
>   gcc/cp/
>   * parser.c (cp_parser_omp_var_list_no_open): New c_omp_region_type
>   argument.  Use it to specialize handling of OMP_CLAUSE_REDUCTION for
>   OpenACC.
>   (cp_parser_omp_clause_reduction): Update call to
>   cp_parser_omp_variable_list.  Propagate OpenACC errors as necessary.
>   (cp_parser_oacc_all_clauses): Update call to
>   cp_parser_omp_clause_reduction.
>   (cp_parser_omp_all_clauses): Likewise.
>   (cp_parser_cilk_simd_all_clauses): Likewise.
>
>   gcc/testsuite/
>   * c-c++-common/goacc/reduction-7.c: New test.
>   * g++.dg/goacc/reductions-1.C: New test.

> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 05b9774..b1af31f 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -10618,7 +10618,8 @@ c_parser_oacc_wait_list (c_parser *parser, location_t 
> clause_loc, tree list)
>  static tree
>  c_parser_omp_variable_list (c_parser *parser,
>   location_t clause_loc,
> - enum omp_clause_code kind, tree list)
> + enum omp_clause_code kind, tree list,
> + enum c_omp_region_type ort = C_ORT_OMP)
>  {
>if (c_parser_next_token_is_not (parser, CPP_NAME)
>|| c_parser_peek_token (parser)->id_kind != C_ID_ID)
> @@ -10674,6 +10675,22 @@ c_parser_omp_variable_list (c_parser *parser,
> /* FALLTHROUGH  */
>   case OMP_CLAUSE_DEPEND:
>   case OMP_CLAUSE_REDUCTION:
> +   if (kind == OMP_CLAUSE_REDUCTION && ort == C_ORT_ACC)
> + {
> +   switch (c_parser_peek_token (parser)->type)
> + {
> + case CPP_OPEN_PAREN:
> + case CPP_OPEN_SQUARE:
> + case CPP_DOT:
> + case CPP_DEREF:
> +   error ("invalid reduction variable");
> +   t = error_mark_node;
> + default:;
> +   break;
> + }
> +   if (t == error_mark_node)
> + break;
> + }
> while (c_parser_next_token_is (parser, CPP_OPEN_SQUARE))
>   {
> tree low_bound = NULL_TREE, length = NULL_TREE;
> @@ -12039,9 +12056,12 @@ c_parser_omp_clause_private (c_parser *parser, tree 
> list)
>   identifier  */
>  
>  static tree
> -c_parser_omp_clause_reduction (c_parser *parser, tree list)
> +c_parser_omp_clause_reduction (c_parser *parser, tree list,
> +enum c_omp_region_type ort)
>  {
>location_t clause_loc = c_parser_peek_token (parser)->location;
> +  bool seen_error = false;
> +

Report heap memory use into -Q output

2019-10-24 Thread Jan Hubicka

Hi,
this patch adds a heap memory use report, which with -Q is now output
during WPA stream-in and after IPA passes.  This is useful for catching
inordinately large memory use by a given pass.
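
A standalone sketch of the idea, outside GCC's build machinery, might
look as follows.  The function names are hypothetical; the glibc version
check and the use of mallinfo2 on newer glibc are my additions (the
patch itself calls mallinfo behind a configure check), and on non-glibc
systems the sketch just reports 0.

```c
#include <stdio.h>
#include <stdlib.h>

#if defined(__GLIBC__)
#include <malloc.h>
#endif

/* Size of the malloc arena in kB, or 0 if the information is not
   available on this C library.  */
unsigned long
heap_arena_kb (void)
{
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ (2, 33)
  return (unsigned long) (mallinfo2 ().arena / 1024);
# else
  return (unsigned long) mallinfo ().arena / 1024;
# endif
#else
  return 0;
#endif
}

/* Format the report in the same " {heap NNNk}" style as the patch.
   Returns the number of characters the full report needs.  */
int
format_heap_report (char *buf, size_t size)
{
  return snprintf (buf, size, " {heap %luk}", heap_arena_kb ());
}
```

Note that the arena figure covers only memory obtained by malloc through
the main heap, so it complements rather than replaces the GGC numbers.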

Bootstrapped/regtested x86_64-linux, OK?
* config.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check for mallinfo.
* ggc-common.c: Include malloc.h if available;
include options.h
(report_heap_memory_use): New function.
* ggc-page.c (ggc_grow): Do not print "start".
* ggc.h (report_heap_memory_use): Declare.
* passes.c (execute_one_pass): Report memory after IPA passes.
(ipa_read_summaries_1): Likewise.
(ipa_read_optimization_summaries_1): Likewise.

* lto/lto-common.c (read_cgraph_and_symbols): Improve -Q reporting.
* lto.c (lto_wpa_write_files): Likewise.

Index: configure.ac
===
--- configure.ac(revision 277366)
+++ configure.ac(working copy)
@@ -1359,7 +1359,7 @@ define(gcc_UNLOCKED_FUNCS, clearerr_unlo
 AC_CHECK_FUNCS(times clock kill getrlimit setrlimit atoq \
popen sysconf strsignal getrusage nl_langinfo \
gettimeofday mbstowcs wcswidth mmap setlocale \
-   gcc_UNLOCKED_FUNCS madvise)
+   gcc_UNLOCKED_FUNCS madvise mallinfo)
 
 if test x$ac_cv_func_mbstowcs = xyes; then
   AC_CACHE_CHECK(whether mbstowcs works, gcc_cv_func_mbstowcs_works,
@@ -1439,6 +1439,14 @@ gcc_AC_CHECK_DECLS(getrlimit setrlimit g
 #endif
 ])
 
+gcc_AC_CHECK_DECLS(mallinfo, , ,[
+#include "ansidecl.h"
+#include "system.h"
+#ifdef HAVE_MALLOC_H
+#include <malloc.h>
+#endif
+])
+
 AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
 #include "ansidecl.h"
 #include "system.h"
Index: ggc-common.c
===
--- ggc-common.c(revision 277366)
+++ ggc-common.c(working copy)
@@ -21,6 +21,9 @@ along with GCC; see the file COPYING3.
any particular GC implementation.  */
 
 #include "config.h"
+#ifdef HAVE_MALLINFO
+#include <malloc.h>
+#endif
 #include "system.h"
 #include "coretypes.h"
 #include "timevar.h"
@@ -29,6 +32,7 @@ along with GCC; see the file COPYING3.
 #include "params.h"
 #include "hosthooks.h"
 #include "plugin.h"
+#include "options.h"
 
 /* When set, ggc_collect will do collection.  */
 bool ggc_force_collect;
@@ -1017,3 +1021,14 @@ ggc_prune_overhead_list (void)
   delete ggc_mem_desc.m_reverse_object_map;
   ggc_mem_desc.m_reverse_object_map = new map_t (13, false, false, false);
 }
+
+/* Return memory used by heap in kb, 0 if this info is not available.  */
+
+void
+report_heap_memory_use ()
+{
+#ifdef HAVE_MALLINFO
+  if (!quiet_flag)
+fprintf (stderr," {heap %luk}", (unsigned long)(mallinfo().arena / 1024));
+#endif
+}
Index: ggc-page.c
===
--- ggc-page.c  (revision 277366)
+++ ggc-page.c  (working copy)
@@ -2267,7 +2267,7 @@ ggc_grow (void)
   else
 ggc_collect ();
   if (!quiet_flag)
-fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
+fprintf (stderr, " {GC %luk} ", (unsigned long) G.allocated / 1024);
 }
 
 void
Index: ggc.h
===
--- ggc.h   (revision 277366)
+++ ggc.h   (working copy)
@@ -266,6 +266,9 @@ extern void stringpool_statistics (void)
 /* Heuristics.  */
 extern void init_ggc_heuristics (void);
 
+/* Report current heap memory use to stderr.  */
+extern void report_heap_memory_use (void);
+
 #define ggc_alloc_rtvec_sized(NELT)\
   (rtvec_def *) ggc_internal_alloc (sizeof (struct rtvec_def)  \
   + ((NELT) - 1) * sizeof (rtx))   \
Index: lto/lto-common.c
===
--- lto/lto-common.c(revision 277366)
+++ lto/lto-common.c(working copy)
@@ -2784,6 +2784,7 @@ read_cgraph_and_symbols (unsigned nfiles
   /* At this stage we know that majority of GGC memory is reachable.
  Growing the limits prevents unnecesary invocation of GGC.  */
   ggc_grow ();
+  report_heap_memory_use ();
 
   /* Set the hooks so that all of the ipa passes can read in their data.  */
   lto_set_in_hooks (all_file_decl_data, get_section_data, free_section_data);
@@ -2791,7 +2792,7 @@ read_cgraph_and_symbols (unsigned nfiles
   timevar_pop (TV_IPA_LTO_DECL_IN);
 
   if (!quiet_flag)
-fprintf (stderr, "\nReading the callgraph\n");
+fprintf (stderr, "\nReading the symbol table:");
 
   timevar_push (TV_IPA_LTO_CGRAPH_IO);
   /* Read the symtab.  */
@@ -2831,7 +2832,7 @@ read_cgraph_and_symbols (unsigned nfiles
   timevar_pop (TV_IPA_LTO_CGRAPH_IO);
 
   if (!quiet_flag)
-fprintf (stderr, "Merging declarations\n");
+fprintf (stderr, "\nMerging declarations:");
 
   timevar_push (TV_IPA_LTO_DECL_MERGE);
   /* Merge global decls.  In ltrans mode we read merged cgraph, we do not

Re: [PATCH] Report errors on inconsistent OpenACC nested reduction clauses

2019-10-24 Thread Thomas Schwinge

Hi Frederik and Jakub!

On 2019-10-21T09:08:28+0200, "Harwath, Frederik"  
wrote:
> OpenACC requires that, if a variable is used in reduction clauses on two 
> nested loops, then there
> must be reduction clauses for that variable on all loops that are nested in 
> between the two loops
> and all these reduction clauses must use the same operator; this has been 
> first clarified by
> OpenACC 2.6. This commit introduces a check for that property which reports 
> errors if the property
> is violated.

So I previously (internally, 2018-11-29) noted:

| I wonder if these should really be diagnosed as hard errors, or rather as
| warnings?
| 
| The specification describes what the user is expected to do, and the
| compiler should assist to achieve that goal, but I wonder if there might
| be any reasonable cases where a compiler error diagnostic might be
| considered "too strong" here?  (Just a quick thought.  On the other hand,
| of course, "fail loudly for stupid things" is desirable.  Will have to
| think about that further.)

In line with the discussion in
, I would now
suggest that indeed we here demote error to warning diagnostics: there
isn't a problem for the compiler to generate code in presence of
non-sensical/missing 'reduction' clauses, so no reason for a hard error.
Does that make sense to you, too?

Obviously, then also adjust all mentions in the commit log etc. from
"error" to "warning".
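
To pin down the property being diagnosed, here is a simplified model in
C that looks at one variable across a single loop nest.  It is not the
omp-low.c implementation: the enum, the function name, and the choice to
compare every loop against the outermost reduction are assumptions made
for this sketch.

```c
#include <stddef.h>

/* Reduction operator a loop applies to the variable of interest;
   RED_NONE means the loop has no reduction clause for it.  */
enum red_op { RED_NONE, RED_PLUS, RED_TIMES, RED_MAX };

/* LOOPS[0] is the outermost loop, LOOPS[N-1] the innermost.  Count the
   loops between the outermost and innermost reduction on the variable
   (inclusive of the innermost) that either lack a reduction clause or
   use a different operator than the outermost one.  */
int
count_reduction_inconsistencies (const enum red_op *loops, size_t n)
{
  size_t first = n, last = 0;
  for (size_t i = 0; i < n; i++)
    if (loops[i] != RED_NONE)
      {
        if (first == n)
          first = i;
        last = i;
      }
  if (first == n)
    return 0;   /* No reduction on this variable at all.  */

  int violations = 0;
  for (size_t i = first + 1; i <= last; i++)
    if (loops[i] != loops[first])
      violations++;
  return violations;
}
```

Because the model only counts and never rewrites the clauses, each
inconsistent loop would yield its own diagnostic, which fits the
suggestion to diagnose without patching up the user's code.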

> I have tested the patch by comparing "make check" results and I am not aware 
> of any regressions.

(For avoidance of doubt, I have not yet tested the patch.)

> Gergö has implemented the check and it works, but I was wondering if the way 
> in which the patch
> avoids issuing errors about operator switches more than once by modifying the 
> clauses (cf. the
> corresponding comment in omp-low.c) could lead to problems - the processing 
> might still continue
> after the error on the modified tree, right?

Yes, processing continues (in order to report more than just the first
error), but per my understanding a single 'error_at' call makes sure that
compilation will terminate at some later point, with an error exit code.

"Patching up" erroneous state or even completely removing OMP clauses is
-- as far as I understand -- acceptable to avoid "issuing errors about
operator switches more than once".  This doesn't affect code generation,
because no code will be generated at all.

(Does that answer your question?)

Regarding my suggestions to "demote error to warning diagnostics", I'd
suggest that at this point we do *not* try to fix for the user any
presumed wrong/missing 'reduction' clauses (difficult/impossible to do
correctly in the general case), but really only diagnose them.  Thus, no
more "modifying the clauses"; that code should disappear.  This may
result in more warning diagnostics being emitted, but that seems
reasonable, given that the user code is presumed buggy.  (So, unless it's
straight-forward, please don't spend much time on trying to minimize the
number of warning diagnostics emitted.)

> I was also wondering about the best place for such
> checks. Should this be a part of "pass_lower_omp" (as in the patch)

(..., and which, for example, is also where
'check_omp_nesting_restrictions' is called from 'scan_omp', running as
part of 'pass_lower_omp'...)

> or should it run earlier
> like, for instance, "pass_diagnose_omp_blocks".

(..., running as one of the first middle end passes, before
'pass_lower_omp'.)

Jakub, do you have an opinion on that?  (Full-quote of the patch is
below, for your easy review.)

I think the issue is balancing whether to have it in its own pass (for
clean separation, which generally certainly is preferable) vs. embedded
into existing code paths that already walk over all the GIMPLE statements
(to avoid introducing more compile-time processing overhead).

> Can the patch be included in trunk?

Normally I might say "OK to commit with the following requests
addressed", but as you're still new, it's maybe a good idea that you post
another revision (as a reply to this email, simply).

A few additional comments/requests:

> From 99796969c1bf91048c6383dfb1b8576bdd9efd7d Mon Sep 17 00:00:00 2001
> From: Frederik Harwath 
> Date: Mon, 21 Oct 2019 08:27:58 +0200
> Subject: [PATCH] Report errors on inconsistent OpenACC nested reduction
>  clauses
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> OpenACC (cf. OpenACC 2.7, section 2.9.11. "reduction clause";
> this was first clarified by OpenACC 2.6) requires that, if a
> variable is used in reduction clauses on two nested loops, then
> there must be reduction clauses for that variable on all loops
> that are nested in between the two loops and all these reduction
> clauses must use the same operator.
> This commit introduces a check for that property which reports
> errors if it is violated.
>
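To make the rule in the commit message concrete, here is a minimal sketch of a loop nest that satisfies it. It is not taken from the patch or its testsuite; the function name and the data clause are assumptions made for illustration. The variable 'sum' appears in a reduction clause on the outermost and innermost loops, so the loop in between carries one too, and all three use the same operator.

```cpp
#include <cstddef>

// Sums an n1 x n2 x n3 array stored contiguously.  Built without OpenACC
// support the pragmas are ignored and the loops simply run serially.
long nested_sum(const int *a, std::size_t n1, std::size_t n2, std::size_t n3)
{
  long sum = 0;
#pragma acc parallel loop reduction(+:sum) copyin(a[0:n1*n2*n3])
  for (std::size_t i = 0; i < n1; i++)
    {
      // This loop sits between two loops that reduce on 'sum', so it
      // needs its own reduction clause with the same '+' operator.
#pragma acc loop reduction(+:sum)
      for (std::size_t j = 0; j < n2; j++)
        {
#pragma acc loop reduction(+:sum)
          for (std::size_t k = 0; k < n3; k++)
            sum += a[(i * n2 + j) * n3 + k];
        }
    }
  return sum;
}
```

Dropping the clause on the middle loop, or switching it to a different operator such as '*', gives the kind of inconsistency the proposed check is meant to diagnose.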

[PATCH] Make gt_pch_nx unreachable in symbol-summary classes.

2019-10-24 Thread Martin Liška

Hello.

For the symbol/call summaries we don't expect them to be streamed
for PCH purposes.  Therefore, I would like to mark all gt_pch_nx
functions with gcc_unreachable.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-10-24  Martin Liska  

* symbol-summary.h (gt_pch_nx): Mark all functions
with gcc_unreachable as we do not expect to be called.
---
 gcc/symbol-summary.h | 43 +--
 1 file changed, 17 insertions(+), 26 deletions(-)


diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
index e90d4481a10..7f2e7218460 100644
--- a/gcc/symbol-summary.h
+++ b/gcc/symbol-summary.h
@@ -295,19 +295,16 @@ gt_ggc_mx(function_summary* const &summary)
 
 template 
 void
-gt_pch_nx(function_summary* const &summary)
+gt_pch_nx (function_summary *const &)
 {
-  gcc_checking_assert (summary->m_ggc);
-  gt_pch_nx (&summary->m_map);
+  gcc_unreachable ();
 }
 
 template 
 void
-gt_pch_nx(function_summary* const& summary, gt_pointer_operator op,
-	  void *cookie)
+gt_pch_nx (function_summary *const &, gt_pointer_operator, void *)
 {
-  gcc_checking_assert (summary->m_ggc);
-  gt_pch_nx (&summary->m_map, op, cookie);
+  gcc_unreachable ();
 }
 
 /* Help template from std c++11.  */
@@ -538,18 +535,17 @@ gt_ggc_mx (fast_function_summary* const &summary)
 
 template 
 void
-gt_pch_nx (fast_function_summary* const &summary)
+gt_pch_nx (fast_function_summary *const &)
 {
-  gt_pch_nx (summary->m_vector);
+  gcc_unreachable ();
 }
 
 template 
 void
-gt_pch_nx (fast_function_summary* const& summary,
-	   gt_pointer_operator op,
-	   void *cookie)
+gt_pch_nx (fast_function_summary *const &, gt_pointer_operator,
+	   void *)
 {
-  gt_pch_nx (summary->m_vector, op, cookie);
+  gcc_unreachable ();
 }
 
 /* Base class for call_summary and fast_call_summary classes.  */
@@ -784,19 +780,16 @@ gt_ggc_mx(call_summary* const &summary)
 
 template 
 void
-gt_pch_nx(call_summary* const &summary)
+gt_pch_nx (call_summary *const &)
 {
-  gcc_checking_assert (summary->m_ggc);
-  gt_pch_nx (&summary->m_map);
+  gcc_unreachable ();
 }
 
 template 
 void
-gt_pch_nx(call_summary* const& summary, gt_pointer_operator op,
-	  void *cookie)
+gt_pch_nx (call_summary *const &, gt_pointer_operator, void *)
 {
-  gcc_checking_assert (summary->m_ggc);
-  gt_pch_nx (&summary->m_map, op, cookie);
+  gcc_unreachable ();
 }
 
 /* We want to pass just pointer types as argument for fast_call_summary
@@ -994,18 +987,16 @@ gt_ggc_mx (fast_call_summary* const &summary)
 
 template 
 void
-gt_pch_nx (fast_call_summary* const &summary)
+gt_pch_nx (fast_call_summary *const &)
 {
-  gt_pch_nx (&summary->m_vector);
+  gcc_unreachable ();
 }
 
 template 
 void
-gt_pch_nx (fast_call_summary* const& summary,
-	   gt_pointer_operator op,
-	   void *cookie)
+gt_pch_nx (fast_call_summary *const &, gt_pointer_operator, void *)
 {
-  gt_pch_nx (&summary->m_vector, op, cookie);
+  gcc_unreachable ();
 }
 
 #endif  /* GCC_SYMBOL_SUMMARY_H  */

[PING 2] [WIP PATCH] add object access attributes (PR 83859)

2019-10-24 Thread Martin Sebor


Ping: https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01690.html

On 10/17/2019 10:28 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01690.html

Other than the suggestions I got for optimization (for GCC 11)
and additional buffer overflow detection for [static] arrays,
is there any feedback on the patch itself?  Jeff?

Martin

On 9/29/19 1:51 PM, Martin Sebor wrote:

-Wstringop-overflow detects a subset of past-the-end read and write
accesses by built-in functions such as memcpy and strcpy.  It relies
on knowledge of the functions' effects, which is hardwired into
GCC.  Although it's possible for users to create wrappers for their
own functions to detect similar problems, it's quite cumbersome and
so only lightly used outside system libraries like Glibc.  Even Glibc
only checks for buffer overflow and not for reading past the end.

PR 83859 asks to expose the same checking that GCC does natively for
built-in calls via a function attribute that associates a pointer
argument with the size argument, such as:

   __attribute__((buffer_size (1, 2))) void
   f (char* dst, size_t dstsize);

The attached patch is my initial stab at providing this feature by
introducing three new attributes:

   * read_only (ptr-argno, size-argno)
   * write_only (ptr-argno, size-argno)
   * read_write (ptr-argno, size-argno)

As requested, the attributes associate a pointer parameter to
a function with a size parameter.  In addition, they also specify
how the function accesses the object the pointer points to: either
it only reads from it, or it only writes to it, or it does both.
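
For illustration only, the sketch below shows the three access modes on small functions. It does not use the spelling proposed in this patch; it assumes the form documented for GCC 10 and later, `access (mode, ptr-argno, size-argno)`. The ACCESS macro and the function names are hypothetical, and the macro expands to nothing on compilers that lack the attribute.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper macro: use the attribute only where it is available.
#if defined(__has_attribute)
# if __has_attribute(access)
#  define ACCESS(mode, ptr, size) __attribute__((access(mode, ptr, size)))
# endif
#endif
#ifndef ACCESS
# define ACCESS(mode, ptr, size)
#endif

// Only reads up to n bytes from src.
ACCESS(read_only, 1, 2)
std::size_t count_zeros(const unsigned char *src, std::size_t n)
{
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < n; i++)
    zeros += (src[i] == 0);
  return zeros;
}

// Only writes up to n bytes to dst.
ACCESS(write_only, 1, 2)
void fill_bytes(unsigned char *dst, std::size_t n, unsigned char value)
{
  std::memset(dst, value, n);
}

// Both reads and writes up to n bytes of buf.
ACCESS(read_write, 1, 2)
void invert_bytes(unsigned char *buf, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    buf[i] = static_cast<unsigned char>(~buf[i]);
}
```

With annotations like these a caller that passes a 4-byte buffer together with a constant size of 8 gives the compiler enough information to diagnose the mismatch, which is the association between pointer and size that PR 83859 asks for.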

Besides enabling the same buffer overflow detection as for built-in
string functions they also let GCC issue -Wuninitialized warnings
for uninitialized objects passed to read-only functions by reference,
and -Wunused-but-set warnings for objects passed to write-only
functions that are otherwise unused (PR 80806).  The -Wuninitialized
part is done. The -Wunused-but-set detection is implemented only in
the C FE and not yet in C++.

Besides the diagnostic improvements above the attributes also open
up optimization opportunities such as DCE.  I'm still working on this
and so it's not yet part of the initial patch.

I plan to finish the patch for GCC 10 but I don't expect to have
the time to start taking advantage of the attributes for optimization
until GCC 11.

Besides regression testing on x86_64-linux, I also tested the patch
by compiling Binutils/GDB, Glibc, and the Linux kernel with it.  It
found no new problems but caused a handful of
-Wunused-but-set-variable false positives due to an outstanding bug
in the C front-end introduced by the patch that I still need to fix.

Martin

[PING][POC v2 PATCH] __builtin_warning

2019-10-24 Thread Martin Sebor


Other than the comments from Joseph, is there any feedback on the
patch itself and my questions?

Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01015.html

On 10/14/2019 02:41 PM, Martin Sebor wrote:

Attached is a more fleshed out version of the built-in implemented
to run in a pass of its own.  I did this in anticipation of looking
at the CFG to help eliminate false positives due to ASAN
instrumentation (e.g., PR 91707).

The built-in now handles a decent number of C and GCC formatting
directives.

The patch introduces a convenience API to create calls to the built-in
(gimple_build_warning).  It also avoids duplicate warnings emitted as
a result of redundant calls to the built-in for the same code (e.g.,
by different passes detecting the same out-of-bounds access).

To show how to use the built-in and the APIs within GCC the patch
modifies path isolation and CCP to inject calls to it into the CFG.
A couple of new tests exercise using the built-in from user code.

The patch triggers a number of -Wnonnull instances during bootstrap
and failures in tests that exercise the warnings modified by using
the built-in.  The GCC warnings are mostly potential bugs that
I will submit patches for, but they're in general unrelated to
the built-in itself.

At this point I want to know if there is support for a) including
the built-in in GCC 10, b) the path isolation changes to make use
of it, and c) the CCP -Wnonnull changes.  If (a), I will submit
a final patch in the next few weeks.  If also (b) and/or (c)
I will also work on cleaning up the GCC warnings.

Martin

PS The patch introduces a general mechanism for processing vararg
formatting functions.  It's only used to handle the built-in but
once it's accepted I expect to replace the gimple-ssa-printf.c
parser with it.

On 8/9/19 3:26 PM, Martin Sebor wrote:

Attached is a very rough and only superficially tested
prototype of the __builtin_warning intrinsic we talked about
the other day.  The built-in is declared like so:

   int __builtin_warning (int loc,
                          const char *option,
                          const char *txt, ...);

If it's still present during RTL expansion the built-in calls

   bool ret = warning_at (LOC, find_opt (option), txt, ...);

and expands to the constant value of RET (which could be used
to issue __builtin_note but there may be better ways to deal
with those than that).

LOC is the location of the warning or zero for the location of
the built-in itself (when called by user code), OPTION is either
the name of the warning option (e.g., "-Wnonnull", when called
by user code) or the index of the option (e.g., OPT_Wnonnull,
when emitted by GCC into the IL), and TXT is the format string
to format the warning text.  The rest of the arguments are not
used but I expect to extract and pass them to the diagnostic
pretty printer to format the text of the warning.

Using the built-in in user code should be obvious.  To show
how it might be put to use within GCC, I added code to
gimple-ssa-isolate-paths.c to emit -Wnonnull in response to
invalid null pointer accesses.  For this demo, when compiled
with the patch applied and with -Wnull-dereference (which is
not in -Wall or -Wextra), GCC issues three warnings: two
instances of -Wnull-dereference one of which is a false positive,
and one -Wnonnull (the one I added, which is included in -Wall),
which is a true positive:

   int f (void)
   {
     char a[4] = "12";
     char *p = __builtin_strlen (a) < 3 ? a : 0;
     return *p;
   }

   int g (void)
   {
     char a[4] = "12";
     char *p = __builtin_strlen (a) > 3 ? a : 0;
     return *p;
   }

   In function ‘f’:
   warning: potential null pointer dereference [-Wnull-dereference]
       7 |   return *p;
         |          ^~
   In function ‘g’:
   warning: null pointer dereference [-Wnull-dereference]
      14 |   return *p;
         |          ^~
   warning: invalid use of a null pointer [-Wnonnull]

The -Wnull-dereference instance in f is a false positive because
the strlen result is clearly less than three.  The strlen pass folds
the strlen result to a constant but it runs after path isolation
which will have already issued the bogus warning.

Martin

PS I tried compiling GCC with the patch.  It fails in libgomp
with:

   libgomp/oacc-mem.c: In function ‘gomp_acc_remove_pointer’:
   cc1: warning: invalid use of a null pointer [-Wnonnull]

so clearly it's missing location information.  With
-Wnull-dereference enabled we get more detail:

   libgomp/oacc-mem.c: In function ‘gomp_acc_remove_pointer’:
   libgomp/oacc-mem.c:1013:31: warning: potential null pointer dereference [-Wnull-dereference]
    1013 |   for (size_t i = 0; i < t->list_count; i++)
         |  ~^~~~
   libgomp/oacc-mem.c:1012:19: warning: potential null pointer dereference [-Wnull-dereference]
    1012 |   t->refcount = minrefs;
         |   ^
   libgomp/oacc-mem.c:1013:31: warning: potential null po

[PING][PATCH] bring -Warray-bounds closer to -Wstringop-overflow (PR91647, 91463, 91679)

2019-10-24 Thread Martin Sebor


Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00860.html

Should I add something like the -Wzero-length-array-bounds option
to allow some of the questionable idioms seen in the kernel?

On 10/11/2019 10:34 AM, Martin Sebor wrote:

On 9/10/19 4:35 PM, Jeff Law wrote:

On 9/6/19 1:27 PM, Martin Sebor wrote:

Recent enhancements to -Wstringop-overflow improved the warning
to the point that it detects a superset of the problems
-Warray-bounds is intended to detect in character accesses.  Because
both warnings detect overlapping sets of problems, and because the
IL they work with tends to change in subtle ways from target to
target, tests designed to verify one or the other sometimes fail
with a target where the warning isn't robust enough to detect
the problem given the IL representation.

To reduce these test suite failures the attached patch extends
-Warray-bounds to handle some of the same problems
-Wstringop-overflow does, specifically, out-of-bounds accesses to
array members of structs, including zero-length arrays and flexible
array members of defined objects.

In the process of testing the enhancement I realized that
the recently added component_size() function doesn't work as
intended for non-character array members (see below).  The patch
corrects this by reverting back to the original implementation
of the function until the better/simpler solution can be put in
place as mentioned below.

Tested on x86_64-linux.

Martin


[*] component_size() happens to work for char arrays because those
are transformed to STRING_CSTs, but not for arrays that are not.
E.g., in

   struct S { int64_t i; int16_t j; int16_t a[]; }
 s = { 0, 0, { 1, 0 } };

unless called with type set to int16_t[2], fold_ctor_reference
will return s.a[0] rather than all of s.a.  But to set type to
int16_t[2] we would need to know that s.a's initializer has two
elements, and that's just what we're using fold_ctor_reference
to find out.

I think this could probably be made to work by extending
useless_type_conversion_p to handle this case specially,
but it doesn't seem worth the effort given that there should be
an easier way to do it as you noted below.

Given the above, the long term solution should be to rely on
DECL_SIZE_UNIT(decl) - TYPE_SIZE_UNIT(decl_type) as Richard
suggested in the review of its initial implementation.
Unfortunately, because of bugs in both the C and C++ front ends
(I just opened PR 65403 with the details) the simple formula
doesn't give the right answers either.  So until the bugs are
fixed, the patch reverts back to the original loopy solution.
It's no more costly than the current fold_ctor_reference
approach.

...


So no concerns with the patch itself, just the fallout you mentioned in
a follow-up message.  Ideally we'd have glibc and the kernel fixed
before this goes in, but I'd settle for just getting glibc fixed since
we have more influence there.


Half of the issues there were due to a bug in the warning.  The rest
are caused by Glibc's use of interior zero-length arrays to access
subsequent members.  It works in simple cases but it's very brittle
because GCC assumes that even such members don't alias. If it's meant
to be a supported feature then aliasing would have to be changed to
take it into account.  But I'd rather encourage projects to move away
from these dangerous hacks and towards cleaner, safer code.
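
As a sketch of what "cleaner, safer code" could look like for the common reset-the-tail idiom: instead of writing through an interior zero-length array, compute the start of the region with offsetof and clear the bytes of the enclosing object. The struct and function names are hypothetical and not taken from Glibc or the kernel.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical state block: 'id' survives a reset, everything from
// 'error_count' onwards is cleared.
struct conn_state
{
  int id;
  int error_count;   // first member of the region that is reset
  int retries;
  char peer[8];
};

// Clears every member from error_count to the end of the object by
// addressing the bytes of the whole object rather than indexing past
// the end of a member array.
void reset_from_error_count(conn_state *s)
{
  const std::size_t start = offsetof(conn_state, error_count);
  std::memset(reinterpret_cast<unsigned char *>(s) + start, 0,
              sizeof *s - start);
}
```

Because the destination is derived from the address of the complete object, no member is accessed out of bounds, so this form should not depend on how the compiler treats accesses through zero-length arrays.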

I've fixed the bug in the attached patch.  The rest can be suppressed
by replacing the zero-length arrays with flexible array members but
that's just trading one misuse for another.  If the code can't be
changed to avoid this (likely not an option since the arrays are
visible in the public API) I think the best way to deal with them is
to suppress them by #pragma GCC diagnostic ignored.  I opened BZ 25097
in Glibc Bugzilla to track this.


Out of curiosity are the kernel issues you raised due to flexible arrays
or just cases where we're doing a better job on normal objects?  I'd be
a bit surprised to find flexible arrays in the kernel.


I don't think I've come across any flexible arrays in the kernel.

The patch triggers 94 instances of -Warray-bounds (60 of which
are for distinct code) in 21 .c files.  I haven't looked at all
of them but some of the patterns I noticed are:

1) Intentionally using an interior zero-length array to access
    (e.g., memset) one or more subsequent members. E.g.,
    _dbgp_external_startup in drivers/usb/early/ehci-dbgp.c and
    quite a few others.  This is pretty pervasive but seems easily
    avoidable.

2) Overwriting a member array with more data (e.g., function
    cxio_rdev_open in
    drivers/infiniband/hw/cxgb3/cxio_hal.c or in function
    pk_probe in drivers/hid/hid-prodikeys.c).  At first glance
    some of these look like bugs but with stuff obscured by macros
    and no comments it's hard to tell.

3) Uses of the container_of() macro to access one member given
    the address of another.  This is undefined (and again breaks

[PING 2][PATCH] implement -Wrestrict for sprintf (PR 83688)

2019-10-24 Thread Martin Sebor


Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00570.html

On 10/14/2019 08:34 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00570.html

On 10/8/19 5:51 PM, Martin Sebor wrote:

Attached is a resubmission of the -Wrestrict implementation for
the sprintf family of functions.  The original patch was posted
in 2017 but never approved.  This revision makes only a few minor
changes to the original code, mostly necessitated by rebasing on
the top of trunk.

The description from the original posting still applies today:

   The enhancement works by first determining the base object (or
   pointer) from the destination of the sprintf call, the constant
   offset into the object (and subobject for struct members), and
   its size.  For each %s argument, it then computes the same data.
   If it determines that overlap between the two is possible it
   stores the data for the directive argument (including the size
   of the argument) for later processing.  After the whole call and
   format string have been processed, the code then iterates over
   the stored directives and their arguments and compares the size
   and length of the argument against the function's overall output.
   If they overlap it issues a warning.
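
For readers who have not run into the problem, here is a minimal sketch of the kind of call the warning is aimed at, together with one way to avoid the overlap. The helper name is hypothetical; the undefined call is shown only in a comment so the example stays well-defined.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Undefined behavior: the destination is also the source of a %s
// directive, so the output overlaps an argument.
//
//   std::sprintf (buf, "%s%s", buf, suffix);
//
// Appends suffix to the string in buf (capacity 'size') without ever
// reading from the region being written.  Returns false if the result
// would not fit.  Assumes suffix does not point into buf.
bool append_suffix(char *buf, std::size_t size, const char *suffix)
{
  const std::size_t len = std::strlen(buf);
  if (len >= size)
    return false;
  const std::size_t room = size - len;
  const int n = std::snprintf(buf + len, room, "%s", suffix);
  return n >= 0 && static_cast<std::size_t>(n) < room;
}
```

Here the destination starts at the existing terminating nul, so the write and the read of suffix do not overlap as long as suffix is a separate object.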

The solution is pretty simple.  The only details that might be
worth calling out are the addition of a few utility functions some
of which at first glance look like they could be replaced by calls
to existing utilities:

  *  array_elt_at_offset
  *  field_at_offset
  *  get_origin_and_offset
  *  alias_offset

Specifically, get_origin_and_offset looks like a dead ringer for
get_addr_base_and_unit_offset, except since the former is only
used for warnings it is less conservative.  It also works with
SSA_NAMEs.  This is also the function I expect to need to make
changes to (and fix bugs in).  The rest of the functions are
general utilities that could perhaps be moved to tree.c at some
point when there is a use for them elsewhere (I have some work
in progress that might need at least one of them).

Another likely question worth addressing is why the sprintf
overlap detection isn't handled in gimple-ssa-warn-restrict.c.
There is an opportunity for code sharing between the two "passes"
but it will require some fairly intrusive changes to the latter.
Those feel out of scope for the initial solution.

Finally, because of new dependencies between existing classes in
the file, some code had to be moved around within it a bit.  That
contributed to the size of the patch making the changes seem more
extensive than they really are.

Tested on x86_64-linux with Binutils/GDB and Glibc.

Martin

The original submission:
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00036.html

Re: [PATCH] Make gt_pch_nx unreachable in symbol-summary classes.

2019-10-24 Thread Jan Hubicka

> Hello.
> 
> For the symbol/call summaries we don't expect them to be streamed
> for PCH purposes.  Therefore, I would like to mark all gt_pch_nx
> functions with gcc_unreachable.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
OK,
thanks!
Honza

Re: [PATCH] PR c++/91369 Implement P0784R7 changes to allocation and construction

2019-10-24 Thread Jonathan Wakely


On 23/10/19 20:27 +0100, Jonathan Wakely wrote:

--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -154,13 +154,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_GLIBCXX20_CONSTEXPR
allocator(const allocator<_Tp1>&) _GLIBCXX_NOTHROW { }

+#if __cplusplus <= 201703L
  ~allocator() _GLIBCXX_NOTHROW { }
+#endif


This changes the value of is_trivially_destructible_v> so
maybe it would be better to keep the user-provided destructor but make
it constexpr:

--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -154,9 +154,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _GLIBCXX20_CONSTEXPR
   allocator(const allocator<_Tp1>&) _GLIBCXX_NOTHROW { }

-#if __cplusplus <= 201703L
+  _GLIBCXX20_CONSTEXPR
  ~allocator() _GLIBCXX_NOTHROW { }
-#endif

#if __cplusplus > 201703L
  [[nodiscard,__gnu__::__always_inline__]]

With the earlier commit r277300 I've still changed the result of
is_trivially_destructible_v> and
is_trivially_*_constructible_v> because the
allocator explicit specialization no longer exists for C++20.
That might be a bigger problem.
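
The effect of a user-provided destructor on the trait can be seen in isolation with a few stand-in types. These are hypothetical structs, not std::allocator, and the sketch uses the C++11 spelling of the trait so that it builds in older modes too.

```cpp
#include <type_traits>

// No destructor declared: trivially destructible.
struct implicit_dtor { };

// User-provided destructor, even with an empty body: not trivial.
struct user_dtor { ~user_dtor() { } };

// Defaulted on its first declaration: still trivial.
struct defaulted_dtor { ~defaulted_dtor() = default; };

static_assert(std::is_trivially_destructible<implicit_dtor>::value,
              "implicit destructor is trivial");
static_assert(!std::is_trivially_destructible<user_dtor>::value,
              "user-provided destructor is not trivial");
static_assert(std::is_trivially_destructible<defaulted_dtor>::value,
              "defaulted destructor is trivial");
```

Marking a user-provided destructor constexpr (possible from C++20) does not change this: it stays user-provided, so the trait keeps reporting false, which is why keeping the destructor preserves the old value.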

Re: Report heap memory use into -Q output

2019-10-24 Thread Richard Biener

On Thu, 24 Oct 2019, Jan Hubicka wrote:

> Hi,
> this patch adds heap memory use report which with -Q is now output
> during WPA stream in and after IPA passes.  This is useful to catch
> inordinately large memory use of a given pass.

Looks reasonable but please give others some time to comment on
portability issues (you only check whether the symbol is defined,
not whether the API you use is provided).

Richard.
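
To illustrate the portability concern, here is a standalone sketch that only calls the API where it is known to be declared. It does not reproduce the patch: it keys off glibc's version macros rather than a configure check, which is an assumption made so that the example is self-contained, and the function names are hypothetical.

```cpp
#include <cstdlib>   // on glibc this also makes the feature macros visible
#include <string>
#if defined(__GLIBC__)
# include <malloc.h>
#endif

// Returns the size of the main heap arena in kB, or 0 where the
// information is not available.
unsigned long heap_arena_kb()
{
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 33)
  // mallinfo2 reports sizes as size_t and supersedes mallinfo.
  return static_cast<unsigned long>(mallinfo2().arena / 1024);
# else
  return static_cast<unsigned long>(mallinfo().arena) / 1024;
# endif
#else
  return 0;
#endif
}

// Formats the value in the same style as the patch's -Q output.
std::string heap_report(unsigned long kb)
{
  return " {heap " + std::to_string(kb) + "k}";
}
```

On hosts without glibc the function degrades to returning 0, which matches the "0 if this info is not available" wording in the patch's comment.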

> Bootstrapped/regtested x86_64-linux, OK?
>   * config.in: Regenerate.
>   * configure: Regenerate.
>   * configure.ac: Check for mallinfo.
>   * ggc-common.c: Include malloc.h if available;
>   include options.h
>   (report_heap_memory_use): New function.
>   * ggc-page.c (ggc_grow): Do not print "start".
>   * ggc.h (report_heap_memory_use): Declare.
>   * passes.c (execute_one_pass): Report memory after IPA passes.
>   (ipa_read_summaries_1): Likewise.
>   (ipa_read_optimization_summaries_1): Likewise.
> 
>   * lto/lto-common.c (read_cgraph_and_symbols): Improve -Q reporting.
>   * lto.c (lto_wpa_write_files): Likewise.
>   
> Index: configure.ac
> ===
> --- configure.ac  (revision 277366)
> +++ configure.ac  (working copy)
> @@ -1359,7 +1359,7 @@ define(gcc_UNLOCKED_FUNCS, clearerr_unlo
>  AC_CHECK_FUNCS(times clock kill getrlimit setrlimit atoq \
>   popen sysconf strsignal getrusage nl_langinfo \
>   gettimeofday mbstowcs wcswidth mmap setlocale \
> - gcc_UNLOCKED_FUNCS madvise)
> + gcc_UNLOCKED_FUNCS madvise mallinfo)
>  
>  if test x$ac_cv_func_mbstowcs = xyes; then
>AC_CACHE_CHECK(whether mbstowcs works, gcc_cv_func_mbstowcs_works,
> @@ -1439,6 +1439,14 @@ gcc_AC_CHECK_DECLS(getrlimit setrlimit g
>  #endif
>  ])
>  
> +gcc_AC_CHECK_DECLS(mallinfo, , ,[
> +#include "ansidecl.h"
> +#include "system.h"
> +#ifdef HAVE_MALLOC_H
> +#include <malloc.h>
> +#endif
> +])
> +
>  AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
>  #include "ansidecl.h"
>  #include "system.h"
> Index: ggc-common.c
> ===
> --- ggc-common.c  (revision 277366)
> +++ ggc-common.c  (working copy)
> @@ -21,6 +21,9 @@ along with GCC; see the file COPYING3.
> any particular GC implementation.  */
>  
>  #include "config.h"
> +#ifdef HAVE_MALLINFO
> +#include <malloc.h>
> +#endif
>  #include "system.h"
>  #include "coretypes.h"
>  #include "timevar.h"
> @@ -29,6 +32,7 @@ along with GCC; see the file COPYING3.
>  #include "params.h"
>  #include "hosthooks.h"
>  #include "plugin.h"
> +#include "options.h"
>  
>  /* When set, ggc_collect will do collection.  */
>  bool ggc_force_collect;
> @@ -1017,3 +1021,14 @@ ggc_prune_overhead_list (void)
>delete ggc_mem_desc.m_reverse_object_map;
>ggc_mem_desc.m_reverse_object_map = new map_t (13, false, false, false);
>  }
> +
> +/* Return memory used by heap in kb, 0 if this info is not available.  */
> +
> +void
> +report_heap_memory_use ()
> +{
> +#ifdef HAVE_MALLINFO
> +  if (!quiet_flag)
> +fprintf (stderr," {heap %luk}", (unsigned long)(mallinfo().arena / 1024));
> +#endif
> +}
> Index: ggc-page.c
> ===
> --- ggc-page.c(revision 277366)
> +++ ggc-page.c(working copy)
> @@ -2267,7 +2267,7 @@ ggc_grow (void)
>else
>  ggc_collect ();
>if (!quiet_flag)
> -fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
> +fprintf (stderr, " {GC %luk} ", (unsigned long) G.allocated / 1024);
>  }
>  
>  void
> Index: ggc.h
> ===
> --- ggc.h (revision 277366)
> +++ ggc.h (working copy)
> @@ -266,6 +266,9 @@ extern void stringpool_statistics (void)
>  /* Heuristics.  */
>  extern void init_ggc_heuristics (void);
>  
> +/* Report current heap memory use to stderr.  */
> +extern void report_heap_memory_use (void);
> +
>  #define ggc_alloc_rtvec_sized(NELT)  \
>(rtvec_def *) ggc_internal_alloc (sizeof (struct rtvec_def)
> \
>  + ((NELT) - 1) * sizeof (rtx))   \
> Index: lto/lto-common.c
> ===
> --- lto/lto-common.c  (revision 277366)
> +++ lto/lto-common.c  (working copy)
> @@ -2784,6 +2784,7 @@ read_cgraph_and_symbols (unsigned nfiles
>/* At this stage we know that majority of GGC memory is reachable.
>   Growing the limits prevents unnecesary invocation of GGC.  */
>ggc_grow ();
> +  report_heap_memory_use ();
>  
>/* Set the hooks so that all of the ipa passes can read in their data.  */
>lto_set_in_hooks (all_file_decl_data, get_section_data, free_section_data);
> @@ -2791,7 +2792,7 @@ read_cgraph_and_symbols (unsigned nfiles
>timevar_pop (TV_IPA_LTO_DECL_IN);
>  
>if (!quiet_flag)
> -fprintf (stderr, "\nReading the callgraph\n");
> +fpri

[PATCH] Simplify common case of use_future_t that uses std::allocator

2019-10-24 Thread Jonathan Wakely


There is no need to store and pass around the allocator object when it's
an instance of std::allocator. Define a partial specialization of
std::use_future_t and the corresponding completion token so that no
allocator is stored. Overload the completion handler constructor to not
expect an allocator to be stored.

* include/experimental/executor (__use_future_ct, use_future_t):
Define partial specializations for std::allocator.
(__use_future_ch): Overload constructor for completion tokens using
std::allocator.

Tested x86_64-linux, committed to trunk.
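
The idea can be sketched outside the library with a stand-in token type. Everything here is hypothetical (the names token, tagged_alloc and forty_two are not from libstdc++); the sketch only shows the shape of the technique: a primary template that stores the function and the allocator, and a partial specialization for std::allocator that stores the function alone and recreates the allocator on demand.

```cpp
#include <memory>
#include <tuple>

// Primary template: a stateful allocator has to be carried along.
template<typename Func, typename Alloc>
struct token
{
  std::tuple<Func, Alloc> t;

  Func& func() { return std::get<0>(t); }
  Alloc get_allocator() const { return std::get<1>(t); }
};

// std::allocator is stateless, so nothing needs to be stored for it;
// a default-constructed one is as good as the original.
template<typename Func, typename T>
struct token<Func, std::allocator<T>>
{
  Func f;

  Func& func() { return f; }
  std::allocator<T> get_allocator() const { return {}; }
};

// A stateful allocator-like type, used only to exercise the primary
// template.
struct tagged_alloc { int tag; };

inline int forty_two() { return 42; }
```

In the specialization the token is exactly as large as the function object it wraps, and code that consumes it no longer has to unpack a tuple to reach the function.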


commit ff1859de72d39f46118df4f568ee90d78c6be937
Author: Jonathan Wakely 
Date:   Thu Oct 24 15:34:17 2019 +0100

Simplify common case of use_future_t that uses std::allocator

diff --git a/libstdc++-v3/include/experimental/executor b/libstdc++-v3/include/experimental/executor
index ef141e92cd3..ed18730951c 100644
--- a/libstdc++-v3/include/experimental/executor
+++ b/libstdc++-v3/include/experimental/executor
@@ -1501,12 +1501,18 @@ inline namespace v1
   std::tuple<_Func, _Alloc> _M_t;
 };
 
+  template
+struct __use_future_ct<_Func, std::allocator<_Tp>>
+{
+  _Func _M_f;
+};
+
   template>
 class use_future_t
 {
 public:
   // use_future_t types:
-  typedef _ProtoAllocator allocator_type;
+  using allocator_type = _ProtoAllocator;
 
   // use_future_t members:
   constexpr use_future_t() noexcept : _M_alloc() { }
@@ -1514,7 +1520,7 @@ inline namespace v1
   explicit
   use_future_t(const _ProtoAllocator& __a) noexcept : _M_alloc(__a) { }
 
-  template
+  template
use_future_t<_OtherAllocator>
rebind(const _OtherAllocator& __a) const noexcept
{ return use_future_t<_OtherAllocator>(__a); }
@@ -1533,6 +1539,35 @@ inline namespace v1
   _ProtoAllocator _M_alloc;
 };
 
+  template
+class use_future_t>
+{
+public:
+  // use_future_t types:
+  using allocator_type = std::allocator<_Tp>;
+
+  // use_future_t members:
+  constexpr use_future_t() noexcept = default;
+
+  explicit
+  use_future_t(const allocator_type& __a) noexcept { }
+
+  template
+   use_future_t>
+   rebind(const std::allocator<_Up>& __a) const noexcept
+   { return use_future_t>(__a); }
+
+  allocator_type get_allocator() const noexcept { return {}; }
+
+  template
+   auto
+   operator()(_Func&& __f) const
+   {
+ using _Token = __use_future_ct, allocator_type>;
+ return _Token{std::forward<_Func>(__f)};
+   }
+};
+
   constexpr use_future_t<> use_future = use_future_t<>();
 
   template
@@ -1552,6 +1587,12 @@ inline namespace v1
  _M_promise{ std::get<1>(__token._M_t) }
{ }
 
+  template
+   explicit
+   __use_future_ch(__use_future_ct<_Func, std::allocator<_Tp>>&& __token)
+   : _M_f{ std::move(__token._M_f) }
+   { }
+
   void
   operator()(_Args&&... __args)
   {

[PATCH] Change SLP representation of reduction chains

2019-10-24 Thread Richard Biener



Instead of

t.c:4:3: note:   node 0x3751bf0 (max_nunits=1)
t.c:4:3: note:  stmt 0 sum_24 = _5 + sum_30;
t.c:4:3: note:  stmt 1 sum_25 = _10 + sum_24;
t.c:4:3: note:  stmt 2 sum_26 = _14 + sum_25;
t.c:4:3: note:  stmt 3 sum_27 = _18 + sum_26;
t.c:4:3: note:  children 0x38eb4d0 0x374acb0
t.c:4:3: note:   node 0x38eb4d0 (max_nunits=1)
t.c:4:3: note:  stmt 0 _5 = *_4;
t.c:4:3: note:  stmt 1 _10 = *_9;
t.c:4:3: note:  stmt 2 _14 = *_13;
t.c:4:3: note:  stmt 3 _18 = *_17;
t.c:4:3: note:   node 0x374acb0 (max_nunits=1)
t.c:4:3: note:  stmt 0 sum_30 = PHI <0(5), sum_27(6)>
t.c:4:3: note:  stmt 1 sum_24 = _5 + sum_30;
t.c:4:3: note:  stmt 2 sum_25 = _10 + sum_24;
t.c:4:3: note:  stmt 3 sum_26 = _14 + sum_25;

we want

t.c:4:3: note:   node 0x3d9d110 (max_nunits=1)
t.c:4:3: note:  stmt 0 sum_24 = _5 + sum_30;
t.c:4:3: note:  stmt 1 sum_25 = _10 + sum_24;
t.c:4:3: note:  stmt 2 sum_26 = _14 + sum_25;
t.c:4:3: note:  stmt 3 sum_27 = _18 + sum_26;
t.c:4:3: note:  children 0x3d9d070 0x3d9d0c0
t.c:4:3: note:   node 0x3d9d070 (max_nunits=1)
t.c:4:3: note:  stmt 0 _5 = *_4;
t.c:4:3: note:  stmt 1 _10 = *_9;
t.c:4:3: note:  stmt 2 _14 = *_13;
t.c:4:3: note:  stmt 3 _18 = *_17;
t.c:4:3: note:   node 0x3d9d0c0 (max_nunits=1)
t.c:4:3: note:  stmt 0 sum_30 = PHI <0(5), sum_27(6)>
t.c:4:3: note:  stmt 1 sum_30 = PHI <0(5), sum_27(6)>
t.c:4:3: note:  stmt 2 sum_30 = PHI <0(5), sum_27(6)>
t.c:4:3: note:  stmt 3 sum_30 = PHI <0(5), sum_27(6)>

where we correctly represent the reduction chain as re-associated.
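
One way to read the two dumps at the source level is sketched below. This is an illustration of re-association in general, not of what the vectorizer emits, and the function names are made up. The first function is the serial chain from the dump, where each statement consumes the previous one; the second treats the four statements as independent lanes that all start from the same accumulator and are combined once at the end.

```cpp
#include <cstddef>

// Serial reduction chain: sum_24 feeds sum_25, which feeds sum_26, ...
// Elements beyond the last full group of four are ignored.
long chain_sum(const int *a, std::size_t n)
{
  long sum = 0;
  for (std::size_t i = 0; i + 4 <= n; i += 4)
    {
      sum = a[i] + sum;
      sum = a[i + 1] + sum;
      sum = a[i + 2] + sum;
      sum = a[i + 3] + sum;
    }
  return sum;
}

// Re-associated form: four independent partial sums, one per lane,
// reduced to a single value after the loop.
long lane_sum(const int *a, std::size_t n)
{
  long lane[4] = {0, 0, 0, 0};
  for (std::size_t i = 0; i + 4 <= n; i += 4)
    for (std::size_t k = 0; k < 4; k++)
      lane[k] += a[i + k];
  return lane[0] + lane[1] + lane[2] + lane[3];
}
```

For integer addition both forms give the same result, which is what makes treating the chain as re-associated legitimate here; for floating point the same rewrite would change rounding.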

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2019-10-24  Richard Biener  

* tree-vect-slp.c (vect_get_and_check_slp_defs): For reduction
chains try harder with operand swapping and instead of
putting a shifted chain into the reduction operands put
a repetition of the final reduction op there as if we'd
reassociate the expression.

* gcc.dg/vect/slp-reduc-10a.c: New testcase.
* gcc.dg/vect/slp-reduc-10b.c: Likewise.
* gcc.dg/vect/slp-reduc-10c.c: Likewise.
* gcc.dg/vect/slp-reduc-10d.c: Likewise.
* gcc.dg/vect/slp-reduc-10e.c: Likewise.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 277372)
+++ gcc/tree-vect-slp.c (working copy)
@@ -433,20 +433,35 @@ again:
 the def-stmt/s of the first stmt.  Allow different definition
 types for reduction chains: the first stmt must be a
 vect_reduction_def (a phi node), and the rest
-vect_internal_def.  */
+end in the reduction chain.  */
  tree type = TREE_TYPE (oprnd);
  if ((oprnd_info->first_dt != dt
   && !(oprnd_info->first_dt == vect_reduction_def
-   && dt == vect_internal_def)
+   && !STMT_VINFO_DATA_REF (stmt_info)
+   && REDUC_GROUP_FIRST_ELEMENT (stmt_info)
+   && def_stmt_info
+   && !STMT_VINFO_DATA_REF (def_stmt_info)
+   && (REDUC_GROUP_FIRST_ELEMENT (def_stmt_info)
+   == REDUC_GROUP_FIRST_ELEMENT (stmt_info)))
   && !((oprnd_info->first_dt == vect_external_def
 || oprnd_info->first_dt == vect_constant_def)
&& (dt == vect_external_def
|| dt == vect_constant_def)))
- || !types_compatible_p (oprnd_info->first_op_type, type))
+ || !types_compatible_p (oprnd_info->first_op_type, type)
+ || (!STMT_VINFO_DATA_REF (stmt_info)
+ && REDUC_GROUP_FIRST_ELEMENT (stmt_info)
+ && ((!def_stmt_info
+  || STMT_VINFO_DATA_REF (def_stmt_info)
+  || (REDUC_GROUP_FIRST_ELEMENT (def_stmt_info)
+  != REDUC_GROUP_FIRST_ELEMENT (stmt_info)))
+ != (oprnd_info->first_dt != vect_reduction_def
{
  /* Try swapping operands if we got a mismatch.  */
  if (i == commutative_op && !swapped)
{
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"trying swapped operands\n");
  swapped = true;
  goto again;
}
@@ -484,9 +499,26 @@ again:
  oprnd_info->ops.quick_push (oprnd);
  break;
 
+   case vect_internal_def:
case vect_reduction_def:
+ if (oprnd_info->first_dt == vect_reduction_def
+ && !STMT_VINFO_DATA_REF (stmt_info)
+ && REDUC_GROUP_FIRST_ELEMENT (stmt_info)
+ && !STMT_VINFO_DATA_REF (def_stmt_info)
+ && (REDUC_GROUP

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-24 Thread H.J. Lu

On Thu, Oct 24, 2019 at 12:56 AM Richard Sandiford
 wrote:
>
> "H.J. Lu"  writes:
> > On Wed, Oct 23, 2019 at 4:51 AM Richard Sandiford
> >  wrote:
> >>
> >> Richard Biener  writes:
> >> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford
> >> >  wrote:
> >> >>
> >> >> This patch is the first of a series that tries to remove two
> >> >> assumptions:
> >> >>
> >> >> (1) that all vectors involved in vectorisation must be the same size
> >> >>
> >> >> (2) that there is only one vector mode for a given element mode and
> >> >> number of elements
> >> >>
> >> >> Relaxing (1) helps with targets that support multiple vector sizes or
> >> >> that require the number of elements to stay the same.  E.g. if we're
> >> >> vectorising code that operates on narrow and wide elements, and the
> >> >> narrow elements use 64-bit vectors, then on AArch64 it would normally
> >> >> be better to use 128-bit vectors rather than pairs of 64-bit vectors
> >> >> for the wide elements.
> >> >>
> > >> >> Relaxing (2) makes it possible for -msve-vector-bits=128 to produce
> >> >> fixed-length code for SVE.  It also allows unpacked/half-size SVE
> >> >> vectors to work with -msve-vector-bits=256.
> >> >>
> >> >> The patch adds a new hook that targets can use to control how we
> >> >> move from one vector mode to another.  The hook takes a starting vector
> >> >> mode, a new element mode, and (optionally) a new number of elements.
> >> >> The flexibility needed for (1) comes in when the number of elements
> >> >> isn't specified.
> >> >>
> >> >> All callers in this patch specify the number of elements, but a later
> >> >> vectoriser patch doesn't.  I won't be posting the vectoriser patch
> >> >> for a few days, hence the RFC/A tag.
> >> >>
> >> >> Tested individually on aarch64-linux-gnu and as a series on
> >> >> x86_64-linux-gnu.  OK to install?  Or if not yet, does the idea
> >> >> look OK?
> >> >
> >> > In isolation the idea looks good but maybe a bit limited?  I see
> >> > how it works for the same-size case but if you consider x86
> >> > where we have SSE, AVX256 and AVX512 what would it return
> >> > for related_vector_mode (V4SImode, SImode, 0)?  Or is this
> >> > kind of query not intended (where the component modes match
> >> > but nunits is zero)?
> >>
> >> In that case we'd normally get V4SImode back.  It's an allowed
> >> combination, but not very useful.
> >>
> >> > How do you get from SVE fixed 128bit to NEON fixed 128bit then?  Or is
> >> > it just used to stay in the same register set for different component
> >> > modes?
> >>
> >> Yeah, the idea is to use the original vector mode as essentially
> >> a base architecture.
> >>
> >> The follow-on patches replace vec_info::vector_size with
> >> vec_info::vector_mode and targetm.vectorize.autovectorize_vector_sizes
> >> with targetm.vectorize.autovectorize_vector_modes.  These are the
> >> starting modes that would be passed to the hook in the nunits==0 case.
> >>
> >
> > For a target with different vector sizes,
> > targetm.vectorize.autovectorize_vector_sizes
> > doesn't return the optimal vector sizes for known trip count and
> > unknown trip count.
> > For a target with 128-bit and 256-bit vectors, 256-bit followed by
> > 128-bit works well for
> > known trip count since vectorizer knows the maximum usable vector size.  
> > But for
> > unknown trip count, we may want to use 128-bit vector when 256-bit
> > code path won't
> > be used at run-time, but 128-bit vector will.  At the moment, we can
> > only use one
> > set of vector sizes for both known trip count and unknown trip count.
>
> Yeah, we're hit by this for AArch64 too.  Andre's recent patches:
>
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01564.html
> https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00205.html
>
> should help.
>
> >   Can vectorizer
> > support 2 sets of vector sizes, one for known trip count and the other
> > for unknown
> > trip count?
>
> The approach Andre's taking is to continue to use the wider vector size
> for unknown trip counts, and instead ensure that the epilogue loop is
> vectorised at the narrower vector size if possible.  The patches then
> use this vectorised epilogue as a fallback "main" loop if the runtime
> trip count is too low for the wide vectors.

I tried it on 548.exchange2_r in SPEC CPU 2017.  There is a shortcut
to the vectorized epilogue for low trip counts.
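
A scalar model of the loop structure described in the quoted text, as a sketch only: the lane counts `kWide` and `kNarrow` and the names `LoopStats` and `add_one` are hypothetical stand-ins, and the inner loops merely imitate what a single vector instruction would do. It is not what the vectorizer emits.

```cpp
#include <cstddef>

// Hypothetical lane counts standing in for 256-bit and 128-bit vectors of int.
constexpr std::size_t kWide = 8;
constexpr std::size_t kNarrow = 4;

// Counts how many iterations each loop flavour executed.
struct LoopStats
{
  std::size_t wide_iters;
  std::size_t narrow_iters;
  std::size_t scalar_iters;
};

// Adds 1 to every element of data[0..n).
LoopStats add_one(int* data, std::size_t n)
{
  LoopStats stats{0, 0, 0};
  std::size_t i = 0;

  // Main loop at the wide vector size; skipped entirely when n < kWide.
  for (; i + kWide <= n; i += kWide, ++stats.wide_iters)
    for (std::size_t k = 0; k < kWide; ++k)
      data[i + k] += 1;

  // Epilogue at the narrow vector size; for low trip counts this is
  // effectively the "main" loop.
  for (; i + kNarrow <= n; i += kNarrow, ++stats.narrow_iters)
    for (std::size_t k = 0; k < kNarrow; ++k)
      data[i + k] += 1;

  // Scalar tail for whatever is left.
  for (; i < n; ++i, ++stats.scalar_iters)
    data[i] += 1;

  return stats;
}
```

With a trip count of 6 the wide loop never runs and the narrow loop does the vector work, which is the low-trip-count case under discussion.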

-- 
H.J.

Re: [PR testsuite/91842] Skip gcc.dg/ipa/ipa-sra-19.c on power

2019-10-24 Thread Andreas Krebbel

On 24.10.19 15:26, Martin Jambor wrote:
> Hi,
> 
> On Thu, Oct 24 2019, Andreas Krebbel wrote:
>> On 02.10.19 17:06, Martin Jambor wrote:
>>> Hi,
>>>
>>> I seem to remember I minimized gcc.dg/ipa/ipa-sra-19.c on power but
>>> perhaps I am wrong because the testcase fails there with a
>>> power-specific error:
>>>
>>> gcc.dg/ipa/ipa-sra-19.c:19:3: error: AltiVec argument passed to 
>>> unprototyped function
>>>
>>> I am going to simply skip it there with the following patch, which I
>>> hope is obvious.  Tested by running ipa.exp on both ppc64le-linux and
>>> x86_64-linux.
>>>
>>> Thanks,
>>>
>>> Martin
>>>
>>>
>>> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c 
>>> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>>> index adebaa5f5e1..d219411d8ba 100644
>>> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>>> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>>> @@ -1,5 +1,6 @@
>>>  /* { dg-do compile } */
>>>  /* { dg-options "-O2"  } */
>>> +/* { dg-skip-if "" { powerpc*-*-* } } */
>>>  
>>>  typedef int __attribute__((__vector_size__(16))) vectype;
>>>
>>>
>>
>> I ran into the same problem on IBM Z. Is it important for the testcase to 
>> leave the argument list of
>> k unspecified or would it be ok to add it?
> 
> I wanted to write to you that the un-prototypedness is on purpose and
> essential to test what the bug was in the past but this time I actually
> managed to find the associated fix in my ipa-sra branch and found out
> that I mis-remembered, that it is not the case.  Sorry for not doing
> that before.  I believe the patch is OK then and we can even remove the
> dg-skip-if I added.  And by that I mean that although I'm not a
> reviewer, I would consider it obvious.  Will you do it or should I take
> care of it?

I will do it. Thanks!

Andreas

> 
> Thanks,
> 
> Martin
> 
> 
>>
>> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c 
>> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>> index d219411d8ba..d9dcd33cb76 100644
>> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
>> @@ -5,7 +5,7 @@
>>  typedef int __attribute__((__vector_size__(16))) vectype;
>>
>>  vectype dk();
>> -vectype k();
>> +vectype k(vectype);
>>
>>  int b;
>>  vectype *j;

[Committed] ipa-sra-19.c: Avoid unprototyped function

2019-10-24 Thread Andreas Krebbel

Power and IBM Z require a function prototype if a vector argument is
passed.  Complete the prototype of k to prevent errors from being
triggered on these platforms.

Committed after the discussion here:
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01737.html

gcc/testsuite/ChangeLog:

2019-10-24  Andreas Krebbel  

* gcc.dg/ipa/ipa-sra-19.c: Remove dg-skip-if. Add argument type to
prototype of k.
---
 gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c 
b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
index d219411d8ba..6186d891a29 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c
@@ -1,11 +1,10 @@
 /* { dg-do compile } */
 /* { dg-options "-O2"  } */
-/* { dg-skip-if "" { powerpc*-*-* } } */
 
 typedef int __attribute__((__vector_size__(16))) vectype;
 
 vectype dk();
-vectype k();
+vectype k(vectype);
 
 int b;
 vectype *j;
-- 
2.23.0

Re: [PATCH] PR c++/91369 Implement P0784R7 changes to allocation and construction

2019-10-24 Thread Jonathan Wakely


On 24/10/19 15:31 +0100, Jonathan Wakely wrote:

On 23/10/19 20:27 +0100, Jonathan Wakely wrote:

--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -154,13 +154,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_GLIBCXX20_CONSTEXPR
allocator(const allocator<_Tp1>&) _GLIBCXX_NOTHROW { }

+#if __cplusplus <= 201703L
 ~allocator() _GLIBCXX_NOTHROW { }
+#endif


This changes the value of is_trivially_destructible_v<allocator<T>> so
maybe it would be better to keep the user-provided destructor but make
it constexpr:

--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -154,9 +154,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _GLIBCXX20_CONSTEXPR
  allocator(const allocator<_Tp1>&) _GLIBCXX_NOTHROW { }

-#if __cplusplus <= 201703L
+  _GLIBCXX20_CONSTEXPR
 ~allocator() _GLIBCXX_NOTHROW { }
-#endif

#if __cplusplus > 201703L
 [[nodiscard,__gnu__::__always_inline__]]

With the earlier commit r277300 I've still changed the result of
is_trivially_destructible_v<allocator<void>> and
is_trivially_*_constructible_v<allocator<void>> because the
allocator<void> explicit specialization no longer exists for C++20.
That might be a bigger problem.
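
To make the trait change concrete, here is a minimal sketch using hypothetical stand-in types (`ImplicitDtor`, `UserDtor`, `DefaultedDtor`), not the libstdc++ classes. It shows only the general language rule: a user-provided destructor, even an empty one, makes a type non-trivially destructible, so adding or removing one flips the trait.

```cpp
#include <type_traits>

// Hypothetical stand-ins; none of these is std::allocator.
struct ImplicitDtor { };                               // implicitly declared destructor
struct UserDtor { ~UserDtor() { } };                   // user-provided, although empty
struct DefaultedDtor { ~DefaultedDtor() = default; };  // defaulted on first declaration

static_assert(std::is_trivially_destructible<ImplicitDtor>::value,
              "implicit destructor is trivial");
static_assert(!std::is_trivially_destructible<UserDtor>::value,
              "user-provided destructor is never trivial");
static_assert(std::is_trivially_destructible<DefaultedDtor>::value,
              "defaulted destructor stays trivial");
```

Code that stores or dispatches on such a trait therefore sees a different answer depending on which form of the class it was compiled against.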


This restores the previous properties of std::allocator.

Tested powerpc64le-linux, committed to trunk.


commit 29d60b4d998ba2678da4ef8417ef58f5373cad8b
Author: Jonathan Wakely 
Date:   Thu Oct 24 15:52:57 2019 +0100

Revert ABI changes to std::allocator in C++20

The recent C++20 changes to remove the std::allocator<void> explicit
specialization and the destructor in the std::allocator primary template
change the result of some is_trivially_xxx type traits. To avoid those
changes, this patch restores the explicit specialization and the
destructor.

In order to meet the C++20 requirements the std::allocator<void>
explicit specialization must provide the same interface as the primary
template (except for the unusable allocate and deallocate member
functions) and the destructor in the primary template must be constexpr.

* include/bits/allocator.h (allocator<void>): Restore the explicit
specialization for C++20, but make its API consistent with the primary
template.
(allocator::~allocator()): Restore the destructor for C++20, but make
it constexpr.
* testsuite/20_util/allocator/rebind_c++20.cc: Check allocator<void>.
* testsuite/20_util/allocator/requirements/typedefs_c++20.cc: Likewise.
* testsuite/20_util/allocator/void.cc: Check that constructors and
destructors are trivial. Check for converting constructor in C++20.
* testsuite/ext/malloc_allocator/variadic_construct.cc: Simplify
dejagnu target selector.
* testsuite/ext/new_allocator/variadic_construct.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/allocator.h b/libstdc++-v3/include/bits/allocator.h
index 1a3eb88eded..2559c57b12e 100644
--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -63,23 +63,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @{
*/
 
-#if __cplusplus <= 201703L
   /// allocator<void> specialization.
   template<>
 class allocator<void>
 {
 public:
+  typedef void    value_type;
   typedef size_t  size_type;
   typedef ptrdiff_t   difference_type;
+#if __cplusplus <= 201703L
   typedef void*   pointer;
   typedef const void* const_pointer;
-  typedef void    value_type;
 
   template<typename _Tp1>
 	struct rebind
 	{ typedef allocator<_Tp1> other; };
+#else
+  allocator() = default;
 
-#if __cplusplus >= 201103L
+  template<typename _Up>
+	constexpr
+	allocator(const allocator<_Up>&) { }
+#endif // ! C++20
+
+#if __cplusplus >= 201103L && __cplusplus <= 201703L
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2103. std::allocator propagate_on_container_move_assignment
   typedef true_type propagate_on_container_move_assignment;
@@ -98,9 +105,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	destroy(_Up* __p)
 	noexcept(noexcept(__p->~_Up()))
 	{ __p->~_Up(); }
-#endif // C++11
+#endif // C++11 to C++17
 };
-#endif // ! C++20
 
   /**
* @brief  The @a standard allocator, as per [20.4].
@@ -154,9 +160,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	_GLIBCXX20_CONSTEXPR
 	allocator(const allocator<_Tp1>&) _GLIBCXX_NOTHROW { }
 
-#if __cplusplus <= 201703L
+  _GLIBCXX20_CONSTEXPR
   ~allocator() _GLIBCXX_NOTHROW { }
-#endif
 
 #if __cplusplus > 201703L
   [[nodiscard,__gnu__::__always_inline__]]
diff --git a/libstdc++-v3/testsuite/20_util/allocator/rebind_c++20.cc b/libstdc++-v3/testsuite/20_util/allocator/rebind_c++20.cc
index 968e1de931b..dd7cd67f943 100644
--- a/libstdc++-v3/testsuite/20_util/allocator/rebind_c++20.cc
+++ b/libstdc++-v3/testsuite/20_util/allocator/rebind_c++20.cc
@@ -23,6 +23,9 @@
 template struct Alloc : std::allocator { };
 
 using T = std::allocator_traits>;
-
 // Prior to C++20 this finds std

Re: [PATCH] Add support for C++2a stop_token

2019-10-24 Thread Jonathan Wakely


On 23/10/19 12:50 -0700, Thomas Rodgers wrote:


Thomas Rodgers writes:

Let's try this again.


That's better, thanks :-)


+   * include/Makefile.am: Add <stop_token> header.
+* include/Makefile.in: Regenerate.
+   * include/std/stop_token: New file.
+   * include/std/version (__cpp_lib_jthread): New value.
+   * testsuite/30_threads/stop_token/1.cc: New test.
+   * testsuite/30_threads/stop_token/2.cc: New test.
+   * testsuite/30_threads/stop_token/stop_token.cc: New test.


ChangeLog entries should be provided separately from the patch (e.g.
inline in the email, or as a separate attachment) because they never
apply cleanly.

Also it looks like you have a mixture of tabs and spaces there, it
should be only tabs.


diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index cd1e9df5482..9b4ab670315 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -416,6 +416,7 @@ std_headers = \
${std_srcdir}/sstream \
${std_srcdir}/stack \
${std_srcdir}/stdexcept \
+   ${std_srcdir}/stop_token \
${std_srcdir}/streambuf \
${std_srcdir}/string \
${std_srcdir}/string_view \


Generated files like Makefile.in don't need to be in the patch. I use
a git alias called 'filter' to filter them out:

!f(){ git $@ | filterdiff -x '*/ChangeLog' -x '*/Makefile.in' -x '*/configure' 
-x '*/config.h.in' -x '*/doc/html/*' | sed -e '/^diff.*ChangeLog/{N;d}' ; }; f

And then I create a patch with this alias:

!f(){ git filter show --diff-algorithm=histogram ${*:-HEAD} > 
/dev/shm/patch.txt; }; f



diff --git a/libstdc++-v3/include/std/stop_token 
b/libstdc++-v3/include/std/stop_token
new file mode 100644
index 000..b3655b85eae
--- /dev/null
+++ b/libstdc++-v3/include/std/stop_token
@@ -0,0 +1,338 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2008-2019 Free Software Foundation, Inc.


The copyright date should be just 2019.


+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file include/stop_token
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_STOP_TOKEN
+#define _GLIBCXX_STOP_TOKEN
+


Please add:

#if __cplusplus > 201703L

here (and the corresponding #endif before the final #endif) so that
this file is empty when included pre-C++20.


+#include 
+#include 
+#include 
+#include 
+
+#define __cpp_lib_jthread 201907L
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+  _GLIBCXX_BEGIN_NAMESPACE_VERSION


These BEGIN/END macros should not be indented (which has to be done
manually as editors always want to indent the content following the
opening brace of the namespace).


+
+  class stop_source;
+  template


s/class/typename/


+  class stop_callback;
+
+  struct nostopstate_t { explicit nostopstate_t() = default; };
+  inline constexpr nostopstate_t nostopstate();
+
+  class stop_token {


The opening brace should be on the next line please, in the same
column as "class".
(Same comment for all classes and structs in the patch).



+  public:
+stop_token() noexcept = default;
+
+stop_token(const stop_token& __other) noexcept
+  : _M_state(__other._M_state)
+{ }
+
+stop_token(stop_token&& __other) noexcept
+  : _M_state(std::move(__other._M_state))
+{ }
+
+~stop_token() = default;
+
+stop_token&
+operator=(const stop_token& __rhs) noexcept {
+  _M_state = __rhs._M_state;
+  return *this;
+}
+
+stop_token&
+operator=(stop_token&& __rhs) noexcept {
+  std::swap(_M_state, __rhs._M_state);
+  return *this;
+}


This doesn't leave the RHS empty as required.

I think the copy/move constructors and copy/move assignment operators
can all be defined as = default and that will do the right thing.
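
A minimal sketch of the difference, assuming (as the quoted patch suggests) that the state is held in something like a `std::shared_ptr`. The types `State`, `SwapToken` and `DefaultToken` are hypothetical stand-ins, not the patch's classes.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the shared stop state.
struct State { };

// Move assignment via swap, as in the quoted patch: the source ends up
// holding whatever state the target owned before the assignment.
struct SwapToken
{
  std::shared_ptr<State> state;

  explicit SwapToken(std::shared_ptr<State> s) : state(std::move(s)) { }

  SwapToken& operator=(SwapToken&& rhs) noexcept
  {
    std::swap(state, rhs.state);
    return *this;
  }

  bool stop_possible() const noexcept { return static_cast<bool>(state); }
};

// All special members implicitly defaulted: moving from the shared_ptr
// member leaves it empty.
struct DefaultToken
{
  std::shared_ptr<State> state;

  bool stop_possible() const noexcept { return static_cast<bool>(state); }
};
```

After `a = std::move(b)` on two non-empty `SwapToken` objects, `b.stop_possible()` is still true; with `DefaultToken` the moved-from object reports false.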


+
+[[nodiscard]]
+bool
+stop_possible() const noexcept
+{
+  return static_cast<bool>(_M_state);
+}
+
+[[nodiscard]]
+bool
+stop_requested() const noexcept
+{
+  return stop_possible() && _M_state->_M_stop_requested();
+}
+
+  private:
+f

[dump] small source cleanup

2019-10-24 Thread Nathan Sidwell

Due to my own stupidity I found myself wandering into dump_begin, but 
then got confused due to an uninitialized pointer not pointing to 
somewhere totally insane.


Applying this cleanup patch to initialize the decls at creation time.

nathan
--
Nathan Sidwell
2019-10-24  Nathan Sidwell  

	* dumpfile.c (dump_begin): Reorder decls to use RAII.

Index: gcc/dumpfile.c
===
--- gcc/dumpfile.c	(revision 277405)
+++ gcc/dumpfile.c	(working copy)
@@ -1533,19 +1533,15 @@ gcc::dump_manager::
 dump_begin (int phase, dump_flags_t *flag_ptr, int part)
 {
-  char *name;
-  struct dump_file_info *dfi;
-  FILE *stream;
-
   if (phase == TDI_none || !dump_phase_enabled_p (phase))
 return NULL;
 
-  name = get_dump_file_name (phase, part);
+  char *name = get_dump_file_name (phase, part);
   if (!name)
 return NULL;
-  dfi = get_dump_file_info (phase);
+  struct dump_file_info *dfi = get_dump_file_info (phase);
 
   /* We do not support re-opening of dump files with parts.  This would require
  tracking pstate per part of the dump file.  */
-  stream = dump_open (name, part != -1 || dfi->pstate < 0);
+  FILE *stream = dump_open (name, part != -1 || dfi->pstate < 0);
   if (stream)
 dfi->pstate = 1;

[C++ PATCH] Template parm index fix

2019-10-24 Thread Nathan Sidwell

I discovered that while regular TPIs are pointed to by their owning 
decl, reduced TPIs were not.  This caused me problems on the modules 
branch and seems like a needless inconsistency.  While there I did a 
small amount of cleanup.


I also noticed that convert_generic_types_to_packs was passing the 
original type into reduce_template_parm_level, rather than the copied 
type.  This just seems wrong, and with the above change would become 
(more?) broken.  Every other call to RTPL passes in the copied type.


Applying to trunk.

nathan

--
Nathan Sidwell
2019-10-24  Nathan Sidwell  

	* pt.c (reduce_template_parm_level): Attach the new TPI to the new
	DECL.
	(convert_generic_types_to_packs): Pass the copied type to
	reduce_template_parm_level.

Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 277405)
+++ gcc/cp/pt.c	(working copy)
@@ -4430,8 +4430,8 @@ reduce_template_parm_level (tree index,
 {
   tree orig_decl = TEMPLATE_PARM_DECL (index);
-  tree decl, t;
 
-  decl = build_decl (DECL_SOURCE_LOCATION (orig_decl),
-			 TREE_CODE (orig_decl), DECL_NAME (orig_decl), type);
+  tree decl = build_decl (DECL_SOURCE_LOCATION (orig_decl),
+			  TREE_CODE (orig_decl), DECL_NAME (orig_decl),
+			  type);
   TREE_CONSTANT (decl) = TREE_CONSTANT (orig_decl);
   TREE_READONLY (decl) = TREE_READONLY (orig_decl);
@@ -4439,22 +4439,29 @@ reduce_template_parm_level (tree index,
   SET_DECL_TEMPLATE_PARM_P (decl);
 
-  t = build_template_parm_index (TEMPLATE_PARM_IDX (index),
- TEMPLATE_PARM_LEVEL (index) - levels,
- TEMPLATE_PARM_ORIG_LEVEL (index),
- decl, type);
-  TEMPLATE_PARM_DESCENDANTS (index) = t;
-  TEMPLATE_PARM_PARAMETER_PACK (t)
+  tree tpi = build_template_parm_index (TEMPLATE_PARM_IDX (index),
+	TEMPLATE_PARM_LEVEL (index) - levels,
+	TEMPLATE_PARM_ORIG_LEVEL (index),
+	decl, type);
+  TEMPLATE_PARM_DESCENDANTS (index) = tpi;
+  TEMPLATE_PARM_PARAMETER_PACK (tpi)
 	= TEMPLATE_PARM_PARAMETER_PACK (index);
 
 	/* Template template parameters need this.  */
+  tree inner = decl;
   if (TREE_CODE (decl) == TEMPLATE_DECL)
 	{
-	  DECL_TEMPLATE_RESULT (decl)
-	= build_decl (DECL_SOURCE_LOCATION (decl),
-			  TYPE_DECL, DECL_NAME (decl), type);
-	  DECL_ARTIFICIAL (DECL_TEMPLATE_RESULT (decl)) = true;
+	  inner = build_decl (DECL_SOURCE_LOCATION (decl),
+			  TYPE_DECL, DECL_NAME (decl), type);
+	  DECL_TEMPLATE_RESULT (decl) = inner;
+	  DECL_ARTIFICIAL (inner) = true;
 	  DECL_TEMPLATE_PARMS (decl) = tsubst_template_parms
 	(DECL_TEMPLATE_PARMS (orig_decl), args, complain);
 	}
+
+  /* Attach the TPI to the decl.  */
+  if (TREE_CODE (inner) == TYPE_DECL)
+	TEMPLATE_TYPE_PARM_INDEX (type) = tpi;
+  else
+	DECL_INITIAL (decl) = tpi;
 }
 
@@ -28441,5 +28448,5 @@ convert_generic_types_to_packs (tree par
   TEMPLATE_TYPE_PARM_INDEX (t)
 	= reduce_template_parm_level (TEMPLATE_TYPE_PARM_INDEX (o),
-  o, 0, 0, tf_none);
+  t, 0, 0, tf_none);
   TREE_TYPE (TEMPLATE_TYPE_DECL (t)) = t;
   TYPE_STUB_DECL (t) = TYPE_NAME (t) = TEMPLATE_TYPE_DECL (t);

Re: [PATCH 00/29] [arm] Rewrite DImode arithmetic support

2019-10-24 Thread Richard Earnshaw (lists)


On 24/10/2019 11:16, Christophe Lyon wrote:

On 23/10/2019 15:21, Richard Earnshaw (lists) wrote:

On 23/10/2019 09:28, Christophe Lyon wrote:

On 21/10/2019 14:24, Richard Earnshaw (lists) wrote:

On 21/10/2019 12:51, Christophe Lyon wrote:

On 18/10/2019 21:48, Richard Earnshaw wrote:

Each patch should produce a working compiler (it did when it was
originally written), though since the patch set has been re-ordered
slightly there is a possibility that some of the intermediate steps
may have missing test updates that are only cleaned up later.
However, only the end of the series should be considered complete.
I've kept the patch as a series to permit easier regression hunting
should that prove necessary.


Thanks for this information: my validation system was designed in 
such a way that it will run the GCC testsuite after each of your 
patches, so I'll keep in mind not to report regressions (I've 
noticed several already).



I can perform a manual validation taking your 29 patches as a 
single one and compare the results with those of the revision 
preceding the one where you committed patch #1. Do you think it 
would be useful?



Christophe




I think if you can filter out any that are removed by later patches 
and then report against the patch that caused the regression itself 
then that would be the best.  But I realise that would be more work 
for you, so a round-up against the combined set would be OK.


BTW, I'm aware of an issue with the compiler now generating

  reg, reg, shift 

in Thumb2; no need to report that again.

Thanks,
R.
.




Hi Richard,

The validation of the whole set shows 1 regression, which was also 
reported by the validation of r277179 (early split most DImode 
comparison operations)


When GCC is configured as:
--target arm-none-eabi
--with-mode default
--with-cpu default
--with-fpu default
(that is, no --with-mode, --with-cpu, --with-fpu option)
I'm using binutils-2.28 and newlib-3.1.0

I can see:
FAIL: g++.dg/opt/pr36449.C  -std=gnu++14 execution test
(whatever -std=gnu++XX option)


That's strange.  The assembler code generated for that test is 
unchanged from before the patch series, so I can't see how it can't be 
a problem in the test itself.  What's more, I can't seem to reproduce 
this myself.


As you have noticed, I have created PR92207 to help understand this.



Similarly, in my build the code for _Znwj, malloc, malloc_r and free_r 
are also unchanged, while the malloc_[un]lock functions are empty 
stubs (not surprising as we aren't multi-threaded).


So the only thing that looks to have really changed are the linker 
offsets (some of the library code has changed, but I don't think it's 
really reached in practice, so shouldn't be relevant).




I'm executing the tests using qemu-4.1.0 -cpu arm926
The qemu traces shows that code enters main, then _Znwj (operator 
new), then _malloc_r

The qemu traces end with:


What do you mean by 'end with'?  What's the failure mode of the test?  
A crash, or the test exiting with a failure code?



qemu complains with:
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

'end with' because my automated validation builds do not keep the full 
execution traces (that would need too much disk space)




As I've said in the PR, this looks like a bug in the qemu+newlib code. 
We call sbrk() which says, OK, but then the page isn't mapped by qemu 
into the process and it then faults.


So I think these changes are off the hook, it's just bad luck that they 
expose the issue at this point in time.


R.

[committed, arm] Backport -- Fix multilibs for Armv7-R (was Re: [PATCH, arm] Backport -- Fix multilibs for Armv7-R)

2019-10-24 Thread Andre Vieira (lists)


Hi,

We had a chat offline with Kyrill in which he approved committing the 
patch now.  Unfortunately he isn't able to access his email until 
tomorrow to confirm his approval, but given a time sensitive deadline 
and the fact that this patch was previously reviewed and accepted on 
trunk I am committing the attached patch on behalf of Mihail Ionescu 
with Kyrylo Tkachov's offline approval.


Committed to gcc-9-branch in revision r277417.

Cheers,
Andre


On 22/10/2019 17:21, Mihail Ionescu wrote:

Hi,

I previously did not properly attach the diff.


Regards,
Mihail

On 10/22/2019 05:06 PM, Mihail Ionescu wrote:

Hi,

This is a backport from trunk for GCC9.

SVN revision: r277156.

Built and tested on arm-none-eabi (compared 
-march=armv7e-m+fp/-mfloat-abi=hard

to -march=armv7-r+fp.sp/-mfloat-abi=hard).


gcc/ChangeLog:

2019-10-21  Mihail Ionescu  

 Backport from mainline
 2019-10-18  Andre Vieira  

 * config/arm/t-multilib: Add new multilib variants and new
 mappings.

gcc/testsuite/ChangeLog:

2019-10-21  Mihail Ionescu  

 Backport from mainline
 2019-10-18  Andre Vieira  

 * gcc.target/arm/multilib.exp: Add extra tests.


Is it ok for backport to GCC9?


Regards,
Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
index 
08526302283eea03e4a8f22a2a049e85bd7bb6af..dc97c8f09fb0b7f53520432e1a174adfce1bf6af 
100644

--- a/gcc/config/arm/t-multilib
+++ b/gcc/config/arm/t-multilib
@@ -24,6 +24,8 @@
  # values during the configure step.  We enforce this during the
  # top-level configury.
  +s-mlib: $(srcdir)/config/arm/t-multilib 
$(srcdir)/config/arm/t-aprofile $(srcdir)/config/arm/t-rmprofile

+
  MULTILIB_OPTIONS =
  MULTILIB_DIRNAMES    =
  MULTILIB_EXCEPTIONS  =
@@ -63,6 +65,8 @@ all_early_arch    := armv5tej armv6 armv6j 
armv6k armv6z armv6kz \

  v7_a_arch_variants    := $(call all_feat_combs, mp sec)
  v7_a_nosimd_variants    := +fp +vfpv3 +vfpv3-d16-fp16 +vfpv3-fp16 
+vfpv4-d16 +vfpv4

  v7_a_simd_variants    := +simd +neon-fp16 +neon-vfpv4
+v7_r_sp_variants    := +fp.sp +fp.sp+idiv +vfpv3xd-fp16 
+vfpv3xd-fp16+idiv

+v7_r_dp_variants    := +fp +fp+idiv +vfpv3-d16-fp16 +vfpv3-d16-fp16+idiv
  v7ve_nosimd_variants    := +vfpv3-d16 +vfpv3 +vfpv3-d16-fp16 
+vfpv3-fp16 +fp +vfpv4

  v7ve_vfpv3_simd_variants := +neon +neon-fp16
  v7ve_vfpv4_simd_variants := +simd
@@ -86,8 +90,8 @@ SEP := $(and $(HAS_APROFILE),$(HAS_RMPROFILE),/)
  MULTILIB_OPTIONS    += marm/mthumb
  MULTILIB_DIRNAMES    += arm thumb
  -MULTILIB_OPTIONS    += 
march=armv5te+fp/march=armv7/march=armv7+fp/$(MULTI_ARCH_OPTS_A)$(SEP)$(MULTI_ARCH_OPTS_RM) 

-MULTILIB_DIRNAMES    += v5te v7 v7+fp $(MULTI_ARCH_DIRS_A) 
$(MULTI_ARCH_DIRS_RM)
+MULTILIB_OPTIONS    += 
march=armv5te+fp/march=armv7/march=armv7+fp/march=armv7-r+fp.sp/$(MULTI_ARCH_OPTS_A)$(SEP)$(MULTI_ARCH_OPTS_RM) 

+MULTILIB_DIRNAMES    += v5te v7 v7+fp v7-r+fp.sp $(MULTI_ARCH_DIRS_A) 
$(MULTI_ARCH_DIRS_RM)
   MULTILIB_OPTIONS    += 
mfloat-abi=soft/mfloat-abi=softfp/mfloat-abi=hard

  MULTILIB_DIRNAMES    += nofp softfp hard
@@ -100,22 +104,31 @@ MULTILIB_REQUIRED    += 
mthumb/march=armv7/mfloat-abi=soft

  MULTILIB_REQUIRED    += mthumb/march=armv7+fp/mfloat-abi=softfp
  MULTILIB_REQUIRED    += mthumb/march=armv7+fp/mfloat-abi=hard
  -# Map v7-r down onto common v7 code.
+MULTILIB_REQUIRED    += mthumb/march=armv7-r+fp.sp/mfloat-abi=softfp
+MULTILIB_REQUIRED    += mthumb/march=armv7-r+fp.sp/mfloat-abi=hard
+
+# Map v7-r with double precision down onto common v7 code.
  MULTILIB_MATCHES    += march?armv7=march?armv7-r
  MULTILIB_MATCHES    += march?armv7=march?armv7-r+idiv
-MULTILIB_MATCHES    += march?armv7+fp=march?armv7-r+fp
-MULTILIB_MATCHES    += march?armv7+fp=march?armv7-r+fp+idiv
+MULTILIB_MATCHES    += $(foreach ARCH, $(v7_r_dp_variants), \
+ march?armv7+fp=march?armv7-r$(ARCH))
+
+# Map v7-r single precision variants to v7-r with single precision.
+MULTILIB_MATCHES    += $(foreach ARCH, \
+ $(filter-out +fp.sp, $(v7_r_sp_variants)), \
+ march?armv7-r+fp.sp=march?armv7-r$(ARCH))
   MULTILIB_MATCHES    += $(foreach ARCH, $(all_early_arch), \
   march?armv5te+fp=march?$(ARCH)+fp)
-# Map v8-r down onto common v7 code.
+# Map v8-r down onto common v7 code or v7-r sp.
  MULTILIB_MATCHES    += march?armv7=march?armv8-r
  MULTILIB_MATCHES    += $(foreach ARCH, $(v8_r_nosimd_variants), \
   march?armv7=march?armv8-r$(ARCH))
  MULTILIB_MATCHES    += $(foreach ARCH,+simd +crypto, \
   march?armv7+fp=march?armv8-r$(ARCH) \
   march?armv7+fp=march?armv8-r+crc$(ARCH))
-
+MULTILIB_MATCHES    += march?armv7-r+fp.sp=march?armv8-r+fp.sp
+MULTILIB_MATCHES    += march?armv7-r+fp.sp=march?armv8-r+crc+fp.sp
   ifeq (,$(HAS_APROFILE))
  # Map all v7-a
@@ -177,7 +190,7 @@ MULTILIB_MATCHES    += $(foreach ARCH, 
$(v8_5_a_simd_variants), \

C++ PATCH to add missing space in diagnostic

2019-10-24 Thread Marek Polacek

Since r269045 we're missing a space in a diagnostic.

Bootstrapped/regtested on x86_64-linux, applying to trunk and 9.

2019-10-24  Marek Polacek  

* decl.c (reshape_init_r): Add missing space.

--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -6239,7 +6239,7 @@ reshape_init_r (tree type, reshape_iter *d, bool 
first_initializer_p,
   (CONSTRUCTOR_ELT (stripped_init,0)->value
{
  if (complain & tf_error)
-   error ("too many braces around scalar initializer"
+   error ("too many braces around scalar initializer "
   "for type %qT", type);
  init = error_mark_node;
}

[PATCH] Add missing space in various string literals

2019-10-24 Thread Jakub Jelinek

Hi!

On Thu, Oct 24, 2019 at 01:20:26PM -0400, Marek Polacek wrote:
> Since r269045 we're missing a space in a diagnostic.
> 
> Bootstrapped/regtested on x86_64-linux, applying to trunk and 9.
> 
> 2019-10-24  Marek Polacek  
> 
>   * decl.c (reshape_init_r): Add missing space.

This change reminded me that it is time to run my
https://gcc.gnu.org/ml/gcc-patches/2017-02/msg00844.html
script again.  While it has lots of false positives, it discovered quite a
few issues besides the bug you've fixed.
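
For readers unfamiliar with this class of bug, a minimal sketch of what goes wrong (the array and function names below are made up for illustration and are not taken from the script or from GCC): adjacent string literals are concatenated with nothing in between, so a message split across two lines needs its own space at the join.

```c
#include <string.h>

/* Adjacent string literals are pasted together with nothing in between,
   so the first literal has to carry the separating space itself.  */
const char without_space[] = "too many braces around scalar initializer"
                             "for type";
const char with_space[]    = "too many braces around scalar initializer "
                             "for type";

/* Illustrative helper: return 1 if TEXT contains WORD, 0 otherwise.  */
int
contains_word (const char *text, const char *word)
{
  return strstr (text, word) != NULL;
}
```

The first message reads "...initializerfor type" when printed, which is the kind of thing such a script looks for.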

I'll commit this as obvious to trunk if it passes bootstrap/regtest.

2019-10-24  Jakub Jelinek  

* config/arc/arc.c (hwloop_optimize): Add missing space in string
literal.
* config/rx/rx.c (rx_print_operand): Likewise.
* tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Likewise.
* ipa-sra.c (create_parameter_descriptors, process_scan_results):
Likewise.
* genemit.c (emit_c_code): Likewise.
* plugin.c (try_init_one_plugin): Likewise.  Formatting fix.
cp/
* call.c (convert_arg_to_ellipsis): Add missing space in string
literal.

--- gcc/config/arc/arc.c.jj 2019-09-11 10:27:44.612703959 +0200
+++ gcc/config/arc/arc.c2019-10-24 19:38:21.796846873 +0200
@@ -8001,7 +8001,7 @@ hwloop_optimize (hwloop_info loop)
  return false;
}
   if (dump_file)
-   fprintf (dump_file, ";; loop %d has a control like last insn;"
+   fprintf (dump_file, ";; loop %d has a control like last insn; "
 "add a nop\n",
 loop->loop_no);
 
@@ -8011,7 +8011,7 @@ hwloop_optimize (hwloop_info loop)
   if (LABEL_P (last_insn))
 {
   if (dump_file)
-   fprintf (dump_file, ";; loop %d has a label as last insn;"
+   fprintf (dump_file, ";; loop %d has a label as last insn; "
 "add a nop\n",
 loop->loop_no);
   last_insn = emit_insn_after (gen_nopv (), last_insn);
@@ -8038,7 +8038,7 @@ hwloop_optimize (hwloop_info loop)
   if (entry_edge == NULL)
 {
   if (dump_file)
-   fprintf (dump_file, ";; loop %d has no fallthru edge jumping"
+   fprintf (dump_file, ";; loop %d has no fallthru edge jumping "
 "into the loop\n",
 loop->loop_no);
   return false;
--- gcc/config/rx/rx.c.jj   2019-09-11 10:27:41.266755350 +0200
+++ gcc/config/rx/rx.c  2019-10-24 19:39:53.224447237 +0200
@@ -649,7 +649,7 @@ rx_print_operand (FILE * file, rtx op, i
case CTRLREG_INTB:  fprintf (file, "intb"); break;
default:
  warning (0, "unrecognized control register number: %d"
-  "- using %", (int) INTVAL (op));
+  " - using %", (int) INTVAL (op));
  fprintf (file, "psw");
  break;
}
--- gcc/cp/call.c.jj2019-10-24 14:46:34.976751156 +0200
+++ gcc/cp/call.c   2019-10-24 19:43:52.416785521 +0200
@@ -7590,7 +7590,7 @@ convert_arg_to_ellipsis (tree arg, tsubs
  && TYPE_MODE (TREE_TYPE (prom)) != TYPE_MODE (arg_type)
  && (complain & tf_warning))
warning_at (loc, OPT_Wabi, "scoped enum %qT passed through %<...%>"
-   "as %qT before %<-fabi-version=6%>, %qT after",
+   " as %qT before %<-fabi-version=6%>, %qT after",
arg_type,
TREE_TYPE (prom), ENUM_UNDERLYING_TYPE (arg_type));
  if (!abi_version_at_least (6))
--- gcc/plugin.c.jj 2019-05-20 11:39:35.305796134 +0200
+++ gcc/plugin.c2019-10-24 19:46:59.839916339 +0200
@@ -712,10 +712,10 @@ try_init_one_plugin (struct plugin_name_
   if (dlsym (dl_handle, str_license) == NULL)
 fatal_error (input_location,
 "plugin %s is not licensed under a GPL-compatible license"
-"%s", plugin->full_name, dlerror ());
+" %s", plugin->full_name, dlerror ());
 
-  PTR_UNION_AS_VOID_PTR (plugin_init_union) =
-  dlsym (dl_handle, str_plugin_init_func_name);
+  PTR_UNION_AS_VOID_PTR (plugin_init_union)
+= dlsym (dl_handle, str_plugin_init_func_name);
   plugin_init = PTR_UNION_AS_CAST_PTR (plugin_init_union);
 
   if ((err = dlerror ()) != NULL)
--- gcc/tree-vect-data-refs.c.jj2019-10-21 13:06:29.220299826 +0200
+++ gcc/tree-vect-data-refs.c   2019-10-24 19:51:53.689419065 +0200
@@ -4282,7 +4282,7 @@ vect_analyze_data_refs (vec_info *vinfo,
{
  if (nested_in_vect_loop_p (loop, stmt_info))
return opt_result::failure_at (stmt_info->stmt,
-  "not vectorized:"
+  "not vectorized: "
   "not suitable for strided load %G",
   stmt_info->stmt);
  STMT_VINFO_STRIDED_P (stmt_info) = true;
--- gcc/tree-ssa-loop-ch.c.jj   2019-07-10 15:52:27.851038998 +0200

[PATCH] naming GCC's profile data section

2019-10-24 Thread David Taylor

Our application is embedded.  And in addition to cold boot (reload
everything; start over from scratch), we support warm boot.  As part of
supporting warm boot, read-write data that needs to be initialized, is
initialized by code.  And we ensure at link time that the traditional
initialized read-write data sections (.data, .ldata, .sdata) are empty.

This presents a problem when attempting to use GCC based profiling as it
creates read-write data in the aforementioned data sections.

This patch adds a new command line option that allows you to specify the
name of the section where GCC puts the instrumentation data.

If the new option (-fprofile-data-section) is not specified, GCC behaves
as before.

What's missing?  Testsuite changes.  I haven't yet figured out how to do
automated testing of this.  To test it, I built our software, several
thousand files, and then did an 'objdump --headers', verified that
sections .data / .ldata / .sdata were either absent of empty, and that
the instrumentation section had the name that I specified.

We have a copyright assignment on file from before EMC was acquired by
Dell.  Our company lawyers assure me that it survived the acquisition
and is still valid.

I'm sending this from GNU/Linux rather than from Windows (to avoid
having the patch mangled), so I'm not sure what the headers will show
for my return address.  If you wish to email me, I can be reached at
dtaylor at emc dot com or David dot Taylor at dell dot com.  Or... you
can just send to the gcc-patches list as I'll be reading it.

Enough verbiage, here's the ChangeLog entry and the patch...

2019-10-23  David Taylor  

* common.opt (fprofile-data-section): New command line switch.
* coverage.c (build_var): Add support for -fprofile-data-section.
(coverage_obj_finish): Ditto.
* toplev.c (process_options): Issue warning if
-fprofile-data-section is specified when it is not supported.
* doc/invoke.texi (Option Summary): List -fprofile-data-section.
(Instrumentation Options): Document -fprofile-data-section.

Index: gcc/common.opt
===
--- gcc/common.opt  (revision 277133)
+++ gcc/common.opt  (working copy)
@@ -2124,6 +2124,10 @@
 Common Joined RejectNegative Var(profile_note_location)
 Select the name for storing the profile note file.
 
+fprofile-data-section=
+Common Joined RejectNegative Var(profile_data_section_name)
+Specify the section name for initialized profile data.
+
 fprofile-correction
 Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input.
Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 277133)
+++ gcc/coverage.c  (working copy)
@@ -749,6 +749,9 @@
   fn_name_len = strlen (fn_name);
   buf = XALLOCAVEC (char, fn_name_len + 8 + sizeof (int) * 3);
 
+  TREE_STATIC (var) = 1;
+  if (profile_data_section_name)
+set_decl_section_name (var, profile_data_section_name);
   if (counter < 0)
 strcpy (buf, "__gcov__");
   else
@@ -757,7 +760,6 @@
   buf[len - 1] = symbol_table::symbol_suffix_separator ();
   memcpy (buf + len, fn_name, fn_name_len + 1);
   DECL_NAME (var) = get_identifier (buf);
-  TREE_STATIC (var) = 1;
   TREE_ADDRESSABLE (var) = 1;
   DECL_NONALIASED (var) = 1;
   SET_DECL_ALIGN (var, TYPE_ALIGN (type));
@@ -1188,10 +1190,14 @@
   ASM_GENERATE_INTERNAL_LABEL (name_buf, "LPBX", 1);
   DECL_NAME (fn_info_ary) = get_identifier (name_buf);
   DECL_INITIAL (fn_info_ary) = build_constructor (fn_info_ary_type, ctor);
+  if (profile_data_section_name)
+set_decl_section_name (fn_info_ary, profile_data_section_name);
   varpool_node::finalize_decl (fn_info_ary);
   
   DECL_INITIAL (gcov_info_var)
 = build_info (TREE_TYPE (gcov_info_var), fn_info_ary);
+  if (profile_data_section_name)
+set_decl_section_name (gcov_info_var, profile_data_section_name);
   varpool_node::finalize_decl (gcov_info_var);
 }
 
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 277133)
+++ gcc/doc/invoke.texi (working copy)
@@ -496,7 +496,7 @@
 @item Program Instrumentation Options
 @xref{Instrumentation Options,,Program Instrumentation Options}.
 @gccoptlist{-p  -pg  -fprofile-arcs  --coverage  -ftest-coverage @gol
--fprofile-abs-path @gol
+-fprofile-abs-path -fprofile-data-section=@var{name} @gol
 -fprofile-dir=@var{path}  -fprofile-generate  -fprofile-generate=@var{path} 
@gol
 -fprofile-note=@var{path}  -fprofile-update=@var{method} @gol
 -fprofile-filter-files=@var{regex}  -fprofile-exclude-files=@var{regex} @gol
@@ -12483,6 +12483,14 @@
 generate test coverage data.  Coverage data matches the source files
 more closely if you do not optimize.
 
+@item -fprofile-data-section
+@opindex fprofile-data-section
+When profiling, sets the name of the section where GCC places the
+pro

[ADA PATCH] Fix locales.c iso_3166 bug

2019-10-24 Thread Jakub Jelinek

Hi!

My script to check for PR79475-like issues discovered what looks like a bug
in the Ada FE.  I actually have no idea how it can work properly for any
entries after the last US entry or "United-States", because it will only
match United-StatesUY to US or UZ to Uruguay etc., rather than
United-States to US, Uruguay to UY, Uzbekistan to UZ etc.
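
A small standalone sketch of the effect, assuming the table is searched as alternating (code, name) pairs; the lookup function here is hypothetical and is not the actual code in locales.c. Without the comma the two neighbouring literals are concatenated, the array has an odd number of entries, and every later pair is shifted by one.

```c
#include <stddef.h>
#include <string.h>

/* Table with the comma missing after "United-States".  */
const char *broken_table[] = {
  "US", "United-States"      /* no comma: pasted onto the next literal */
  "UY", "Uruguay",
  "UZ", "Uzbekistan",
};

/* Table with the comma in place.  */
const char *fixed_table[] = {
  "US", "United-States",
  "UY", "Uruguay",
  "UZ", "Uzbekistan",
};

#define N_ENTRIES(t) (sizeof (t) / sizeof ((t)[0]))

/* Hypothetical lookup: entries alternate code, name, code, name, ...
   Return the code paired with NAME, or NULL if NAME is not found.  */
const char *
lookup_code (const char **table, size_t n_entries, const char *name)
{
  for (size_t i = 0; i + 1 < n_entries; i += 2)
    if (strcmp (table[i + 1], name) == 0)
      return table[i];
  return NULL;
}
```

With the broken table, looking up "Uruguay" finds nothing, while looking up "UZ" returns "Uruguay", which is the shift described above.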

Ok for trunk if it passes bootstrap/regtest on x86_64-linux?

2019-10-24  Jakub Jelinek  

* locales.c (iso_3166): Add missing comma after "United-States".

--- gcc/ada/locales.c.jj2019-01-08 11:55:16.792206321 +0100
+++ gcc/ada/locales.c   2019-10-24 19:50:36.781596119 +0200
@@ -529,7 +529,7 @@ static char* iso_3166[] =
   "UM", "United States Minor Outlying Islands",
   "US", "United States",
   "US", "United States of America",
-  "US", "United-States"
+  "US", "United-States",
   "UY", "Uruguay",
   "UZ", "Uzbekistan",
 

Jakub

[PATCH] rs6000: Implement [u]avg3_ceil

2019-10-24 Thread Segher Boessenkool

We already had those in fact, just under other names.  Use the standard
names so that the vectorizer can use them.
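
For reference, a scalar sketch of what the "ceil" averaging patterns compute per element, namely (a + b + 1) >> 1 evaluated in a wider type so the sum cannot overflow. The function names are made up for illustration, and the signed variant assumes that right-shifting a negative int is an arithmetic shift, as it is with GCC.

```c
#include <stdint.h>

/* Unsigned rounding-up average of two bytes, computed in a wider type.  */
uint8_t
uavg_ceil_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((unsigned int) a + (unsigned int) b + 1u) >> 1);
}

/* Signed rounding-up average of two bytes.  Assumes >> on a negative
   int is an arithmetic shift (implementation-defined in ISO C).  */
int8_t
avg_ceil_s8 (int8_t a, int8_t b)
{
  return (int8_t) (((int) a + (int) b + 1) >> 1);
}
```

A loop over arrays that applies this expression element by element is the sort of code the vectorizer can map onto the named patterns.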

Committing to trunk; will backport to 9 and 8 later.


Segher


2019-10-24  Segher Boessenkool  

* config/rs6000/altivec.md (altivec_vavgu): Rename to...
(uavg3_ceil): ... This.
(altivec_vavgs): Rename to...
(avg3_ceil): ... This.
* rs6000-builtin.def (VAVGUB, VAVGSB, VAVGUH, VAVGSH, VAVGUW, VAVGSW):
Adjust.

---
 gcc/config/rs6000/altivec.md |  4 ++--
 gcc/config/rs6000/rs6000-builtin.def | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index dc34528..daa91a4 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -560,7 +560,7 @@ (define_insn "altivec_vsubss"
   [(set_attr "type" "vecsimple")])
 
 ;;
-(define_insn "altivec_vavgu"
+(define_insn "uavg3_ceil"
   [(set (match_operand:VI 0 "register_operand" "=v")
 (unspec:VI [(match_operand:VI 1 "register_operand" "v")
 (match_operand:VI 2 "register_operand" "v")]
@@ -569,7 +569,7 @@ (define_insn "altivec_vavgu"
   "vavgu %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "altivec_vavgs"
+(define_insn "avg3_ceil"
   [(set (match_operand:VI 0 "register_operand" "=v")
 (unspec:VI [(match_operand:VI 1 "register_operand" "v")
 (match_operand:VI 2 "register_operand" "v")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 4d4f3b3..0feee7c 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1002,12 +1002,12 @@ BU_ALTIVEC_2 (VADDUWS,"vadduws",CONST,  
altivec_vadduws)
 BU_ALTIVEC_2 (VADDSWS,   "vaddsws",CONST,  altivec_vaddsws)
 BU_ALTIVEC_2 (VAND,  "vand",   CONST,  andv4si3)
 BU_ALTIVEC_2 (VANDC, "vandc",  CONST,  andcv4si3)
-BU_ALTIVEC_2 (VAVGUB,"vavgub", CONST,  altivec_vavgub)
-BU_ALTIVEC_2 (VAVGSB,"vavgsb", CONST,  altivec_vavgsb)
-BU_ALTIVEC_2 (VAVGUH,"vavguh", CONST,  altivec_vavguh)
-BU_ALTIVEC_2 (VAVGSH,"vavgsh", CONST,  altivec_vavgsh)
-BU_ALTIVEC_2 (VAVGUW,"vavguw", CONST,  altivec_vavguw)
-BU_ALTIVEC_2 (VAVGSW,"vavgsw", CONST,  altivec_vavgsw)
+BU_ALTIVEC_2 (VAVGUB,"vavgub", CONST,  uavgv16qi3_ceil)
+BU_ALTIVEC_2 (VAVGSB,"vavgsb", CONST,  avgv16qi3_ceil)
+BU_ALTIVEC_2 (VAVGUH,"vavguh", CONST,  uavgv8hi3_ceil)
+BU_ALTIVEC_2 (VAVGSH,"vavgsh", CONST,  avgv8hi3_ceil)
+BU_ALTIVEC_2 (VAVGUW,"vavguw", CONST,  uavgv4si3_ceil)
+BU_ALTIVEC_2 (VAVGSW,"vavgsw", CONST,  avgv4si3_ceil)
 BU_ALTIVEC_2 (VCFUX, "vcfux",  CONST,  altivec_vcfux)
 BU_ALTIVEC_2 (VCFSX, "vcfsx",  CONST,  altivec_vcfsx)
 BU_ALTIVEC_2 (VCMPBFP,   "vcmpbfp",CONST,  altivec_vcmpbfp)
-- 
1.8.3.1

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-24 Thread Uros Bizjak

On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu  wrote:
>
> Update patch:
> Add m constraint to define_insn (sse_1_round *sse_1_round when under sse4 but not avx512f.

It looks to me that the original insn is incompletely defined. It
should use nonimmediate_operand, "m" constraint and  pointer
size modifier. Something like:

(define_insn "sse4_1_round"
  [(set (match_operand:VF_128 0 "register_operand" "=Yr,*x,x,v")
(vec_merge:VF_128
  (unspec:VF_128
[(match_operand:VF_128 2 "nonimmediate_operand" "Yrm,*xm,xm,vm")
 (match_operand:SI 3 "const_0_to_15_operand" "n,n,n,n")]
UNSPEC_ROUND)
  (match_operand:VF_128 1 "register_operand" "0,0,x,v")
  (const_int 1)))]
  "TARGET_SSE4_1"
  "@
   round\t{%3, %2, %0|%0, %2, %3}
   round\t{%3, %2, %0|%0, %2, %3}
   vround\t{%3, %2, %1, %0|%0, %1, %2, %3}
   vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"

>
> Changelog:
> gcc/
> * config/i386/sse.md:  (sse4_1_round):
> Change constraint x to xm
> since vround support memory operand.
> * (*sse4_1_round): Ditto.
>
> Bootstrap and regression test ok.
>
> On Wed, Oct 23, 2019 at 9:56 AM Hongtao Liu  wrote:
> >
> > Hi uros:
> >   This patch fixes false dependence of scalar operations
> > vrcp/vsqrt/vrsqrt/vrndscale.
> >   Bootstrap ok, regression test on i386/x86 ok.
> >
> >   It does something like this:
> > -
> > For scalar instructions with both xmm operands:
> >
> > op %xmmN,%xmmQ,%xmmQ > op %xmmN, %xmmN, %xmmQ
> >
> > for scalar instructions with one mem  or gpr operand:
> >
> > op mem/gpr, %xmmQ, %xmmQ
> >
> > --->  using pass rpad >
> >
> > xorps %xmmN, %xmmN, %xxN
> > op mem/gpr, %xmmN, %xmmQ
> >
> > Performance influence of SPEC2017 fprate which is tested on SKX
> >
> > 503.bwaves_r -0.03%
> > 507.cactuBSSN_r -0.22%
> > 508.namd_r -0.02%
> > 510.parest_r 0.37%
> > 511.povray_r 0.74%
> > 519.lbm_r 0.24%
> > 521.wrf_r 2.35%
> > 526.blender_r 0.71%
> > 527.cam4_r 0.65%
> > 538.imagick_r 0.95%
> > 544.nab_r -0.37
> > 549.fotonik3d_r 0.24%
> > 554.roms_r 0.90%
> > fprate geomean 0.50%
> > -
> >
> > Changelog
> > gcc/
> > * config/i386/i386.md (*rcpsf2_sse): Add
> > avx_partial_xmm_update, prefer m constraint for TARGET_AVX.
> > (*rsqrtsf2_sse): Ditto.
> > (*sqrt2_sse): Ditto.
> > (sse4_1_round2): separate constraint vm, add
> > avx_partail_xmm_update, prefer m constraint for TARGET_AVX.
> > * config/i386/sse.md (*sse_vmrcpv4sf2"): New define_insn used
> > by pass rpad.
> > (*_vmsqrt2*):
> > Ditto.
> > (*sse_vmrsqrtv4sf2): Ditto.
> > (*avx512f_rndscale): Ditto.
> > (*sse4_1_round): Ditto.
> >
> > gcc/testsuite
> > * gcc.target/i386/pr87007-4.c: New test.
> > * gcc.target/i386/pr87007-5.c: Ditto.
> >
> >
> > --
> > BR,
> > Hongtao

(set (attr "preferred_for_speed")
  (cond [(eq_attr "alternative" "1")
   (symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
(eq_attr "alternative" "2")
-  (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+  (symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
]
(symbol_ref "true")))])

This can be written as:

(set (attr "preferred_for_speed")
  (cond [(match_test "TARGET_AVX")
   (symbol_ref "true")
(eq_attr "alternative" "1,2")
  (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
]
(symbol_ref "true")))])
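
A quick way to convince oneself that the two forms agree is to model both as plain C predicates and compare them exhaustively. This is only a sketch: the function names are invented, and the md attributes are modelled as simple integer flags.

```c
/* Form from the patch under review: alternatives 1 and 2 are preferred
   when TARGET_AVX is set or there is no partial register dependency;
   every other alternative is always preferred.  */
int
preferred_patch_form (int alternative, int target_avx, int partial_reg_dep)
{
  if (alternative == 1)
    return target_avx || !partial_reg_dep;
  if (alternative == 2)
    return target_avx || !partial_reg_dep;
  return 1;
}

/* Suggested form: test TARGET_AVX first, then handle alternatives
   1 and 2 together.  */
int
preferred_suggested_form (int alternative, int target_avx, int partial_reg_dep)
{
  if (target_avx)
    return 1;
  if (alternative == 1 || alternative == 2)
    return !partial_reg_dep;
  return 1;
}

/* Return 1 if both forms give the same answer for every combination of
   alternative 0..3 and the two flags.  */
int
forms_agree (void)
{
  for (int alt = 0; alt <= 3; alt++)
    for (int avx = 0; avx <= 1; avx++)
      for (int dep = 0; dep <= 1; dep++)
        if (preferred_patch_form (alt, avx, dep)
            != preferred_suggested_form (alt, avx, dep))
          return 0;
  return 1;
}
```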

Uros.

[PATCH, libstdc++ docs] Add lines to C++20 status.

2019-10-24 Thread Smith-Rowland, Edward M

This patch to the libstdc++ docs should get the remaining status entries for 
C++20 lib.
Ed
Index: doc/xml/manual/status_cxx2020.xml
===
--- doc/xml/manual/status_cxx2020.xml	(revision 277405)
+++ doc/xml/manual/status_cxx2020.xml	(working copy)
@@ -1127,6 +1127,160 @@
__cpp_lib_bounded_array_traits >= 201902L 
 
 
+
+   std::to_array 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0325r4.html";>
+	P0325R4
+	
+  
+   10.1 
+   __cpp_lib_to_array >= 201907L 
+
+
+
+   Bit operations 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0553r4.html";>
+	P0553R4
+	
+  
+   10.1 
+   __cpp_lib_bitops >= 201907L 
+
+
+
+   Mathematical constants 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0631r8.pdf";>
+	P0631R8
+	
+  
+   10.1 
+   __cpp_lib_math_constants >= 201907L 
+
+
+
+  
+   Layout-compatibility and pointer-interconvertibility traits 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0466r5.pdf";>
+	P0466R5
+	
+  
+   
+  
+__cpp_lib_is_layout_compatible >= 201907L,
+__cpp_lib_is_layout_interconvertible >= 201907L,
+  
+
+
+
+   std::stop_token and std::jthread 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0660r10.pdf";>
+	P0660R10
+	
+  
+   10.1 
+   __cpp_lib_jthread >= 201907L 
+
+
+
+  
+   Text formatting 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0645r10.html";>
+	P0645R10
+	
+  
+   
+  
+__cpp_lib_format >= 201907L,
+  
+
+
+
+   constexpr std::invoke and related utilities 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1065r2.html";>
+	P1065R2
+	
+  
+   10.1 
+   __cpp_lib_constexpr_invoke >= 201907L 
+
+
+
+   constexpr std::allocator and related utilities 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0784r7.html";>
+	P0784R7
+	
+  
+   10.1 
+   __cpp_constexpr_dynamic_alloc >= 201907L 
+
+
+
+  
+   Atomic waiting and notifying, std::semaphore, std::latch and std::barrier 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1135r6.html";>
+	P1135R6
+	
+  
+   
+  
+__cpp_lib_atomic_lock_free_type_aliases >= 201907L in ,
+__cpp_lib_atomic_flag_test >= 201907L in ,
+__cpp_lib_atomic_wait >= 201907L in ,
+__cpp_lib_semaphore >= 201907L in ,
+__cpp_lib_latch >= 201907L in ,
+__cpp_lib_barrier >= 201907L in 
+  
+
+
+
+  
+   std::source_location 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1208r6.pdf";>
+	P1208R6
+	
+  
+   
+  
+__cpp_lib_source_location >= 201907L,
+  
+
+
+
+  
+   Adding <=> to the standard library 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1614r2.html";>
+	P1614R2
+	
+  
+   
+  
+__cpp_lib_spaceship >= 201907L,
+  
+
+
+
+  
+   Efficient access to std::basic_stringbuf's Buffer 
+  
+http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0408r7.pdf";>
+	P0408R7
+	
+  
+   
+  
+
+

C++ PATCH for c++/91962 - ICE with reference binding and qualification conversion

2019-10-24 Thread Marek Polacek

When fixing c++/91889 (r276251) I was assuming that we couldn't have a ck_qual
under a ck_ref_bind, and I was introducing it in the patch and so this
+   if (next_conversion (convs)->kind == ck_qual)
+ {
+   gcc_assert (same_type_p (TREE_TYPE (expr),
+next_conversion (convs)->type));
+   /* Strip the cast created by the ck_qual; cp_build_addr_expr
+  below expects an lvalue.  */
+   STRIP_NOPS (expr);
+ }
in convert_like_real was supposed to handle it.  But that assumption was wrong
as this test shows; here we have "(int *)f" where f is of type long int, and
we're converting it to "const int *const &", so we have both ck_ref_bind and
ck_qual.  That means that the new STRIP_NOPS strips an expression it shouldn't
have, and that then breaks when creating a TARGET_EXPR.  So we want to limit
the stripping to the new case only.  This I do by checking need_temporary_p,
which will be 0 in the new case.  Yes, we can set need_temporary_p when
binding a reference directly, but then we won't have a qualification
conversion.  It is possible to have a bit-field, convert it to a pointer,
and then convert that pointer to a more-qualified pointer, but in that case
we're not dealing with an lvalue, so gl_kind is 0, so we won't enter this
block in reference_binding:
 1747   if ((related_p || compatible_p) && gl_kind)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-10-24  Marek Polacek  

PR c++/91962 - ICE with reference binding and qualification conversion.
* call.c (convert_like_real) : Check need_temporary_p.

* g++.dg/cpp0x/ref-bind7.C: New test.

diff --git gcc/cp/call.c gcc/cp/call.c
index 55d2abaaddd..d4674a77078 100644
--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -7386,7 +7386,8 @@ convert_like_real (conversion *convs, tree expr, tree fn, 
int argnum,
/* direct_reference_binding might have inserted a ck_qual under
   this ck_ref_bind for the benefit of conversion sequence ranking.
   Ignore the conversion; we'll create our own below.  */
-   if (next_conversion (convs)->kind == ck_qual)
+   if (next_conversion (convs)->kind == ck_qual
+   && !convs->need_temporary_p)
  {
gcc_assert (same_type_p (TREE_TYPE (expr),
 next_conversion (convs)->type));
diff --git gcc/testsuite/g++.dg/cpp0x/ref-bind7.C 
gcc/testsuite/g++.dg/cpp0x/ref-bind7.C
new file mode 100644
index 000..e3675bc560d
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/ref-bind7.C
@@ -0,0 +1,13 @@
+// PR c++/91962 - ICE with reference binding and qualification conversion.
+// { dg-do compile { target c++11 } }
+
+template  class b {
+public:
+  void c(const a &);
+};
+class B {
+  void d();
+  b e;
+};
+long f;
+void B::d() { e.c((const int *)f); }

Re: [PATCH, libstdc++ docs] Add lines to C++20 status.

2019-10-24 Thread Jonathan Wakely


On 24/10/19 19:12 +, Smith-Rowland, Edward M wrote:

This patch to the libstdc++ docs should get the remaining status entries for 
C++20 lib.


I need to add some missing macros, like __cpp_lib_bitops.

jthread isn't committed yet, but it will be, so OK for trunk, thanks.

[PATCH,Fortran] Taking a BYTE out of type-spec

2019-10-24 Thread Steve Kargl

The patch moves the matching of the nonstandard type-spec
BYTE to its own matching function.  During this move, a missing
check for invalid matching in free-form source code was
detected (see byte_4.f90).  OK to commit?
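
The free-form rule can be sketched outside the compiler roughly as follows. This is a standalone illustration with a made-up function name; it assumes lower-cased input and does not model gfc_match, the standard-conformance check, or the kind validation done in the patch.

```c
#include <string.h>

/* Return 1 if STMT starts with the BYTE keyword, 0 otherwise.
   FREE_FORM is nonzero for free-form source.  In fixed form blanks are
   not significant, so "bytea" declares A; in free form the keyword must
   be followed by whitespace or a comma.  */
int
starts_with_byte_typespec (const char *stmt, int free_form)
{
  /* Skip leading blanks.  */
  while (*stmt == ' ' || *stmt == '\t')
    stmt++;

  if (strncmp (stmt, "byte", 4) != 0)
    return 0;

  if (free_form)
    {
      char c = stmt[4];
      if (c != ' ' && c != '\t' && c != ',')
        return 0;
    }
  return 1;
}
```

This mirrors the two new tests: "bytea" is accepted as a declaration in fixed form (byte_3.f) and rejected in free form (byte_4.f90).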

2019-10-24  Steven G. Kargl  

* decl.c (match_byte_typespec): New function.  Match BYTE type-spec.
(gfc_match_decl_type_spec): Use it.

2019-10-24  Steven G. Kargl  

* gfortran.dg/byte_3.f: New test.
* gfortran.dg/byte_4.f90: Ditto.

-- 
Steve
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c	(revision 277420)
+++ gcc/fortran/decl.c	(working copy)
@@ -3980,6 +3980,38 @@ error_return:
 }
 
 
+/* Match a legacy nonstandard BYTE type-spec.  */
+
+static match
+match_byte_typespec (gfc_typespec *ts)
+{
+  if (gfc_match (" byte") == MATCH_YES)
+{
+  if (!gfc_notify_std (GFC_STD_GNU, "BYTE type at %C"))
+	return MATCH_ERROR;
+
+  if (gfc_current_form == FORM_FREE)
+	{
+	  char c = gfc_peek_ascii_char ();
+	  if (!gfc_is_whitespace (c) && c != ',')
+	return MATCH_NO;
+	}
+
+  if (gfc_validate_kind (BT_INTEGER, 1, true) < 0)
+	{
+	  gfc_error ("BYTE type used at %C "
+		 "is not available on the target machine");
+	  return MATCH_ERROR;
+	}
+
+  ts->type = BT_INTEGER;
+  ts->kind = 1;
+  return MATCH_YES;
+}
+  return MATCH_NO;
+}
+
+
 /* Matches a declaration-type-spec (F03:R502).  If successful, sets the ts
structure to the matched specification.  This is necessary for FUNCTION and
IMPLICIT statements.
@@ -4012,22 +4044,10 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int implic
   /* Clear the current binding label, in case one is given.  */
   curr_binding_label = NULL;
 
-  if (gfc_match (" byte") == MATCH_YES)
-{
-  if (!gfc_notify_std (GFC_STD_GNU, "BYTE type at %C"))
-	return MATCH_ERROR;
-
-  if (gfc_validate_kind (BT_INTEGER, 1, true) < 0)
-	{
-	  gfc_error ("BYTE type used at %C "
-		 "is not available on the target machine");
-	  return MATCH_ERROR;
-	}
-
-  ts->type = BT_INTEGER;
-  ts->kind = 1;
-  return MATCH_YES;
-}
+  /* Match BYTE type-spec.  */
+  m = match_byte_typespec (ts);
+  if (m != MATCH_NO)
+return m;
 
   m = gfc_match (" type (");
   matched_type = (m == MATCH_YES);
Index: gcc/testsuite/gfortran.dg/byte_3.f
===
--- gcc/testsuite/gfortran.dg/byte_3.f	(nonexistent)
+++ gcc/testsuite/gfortran.dg/byte_3.f	(working copy)
@@ -0,0 +1,6 @@
+c { dg-do run }
+c { dg-options "-std=legacy" }
+  bytea
+  a = 1
+  if (a /= 1 .and. kind(a) /= a) stop 1
+  end
Index: gcc/testsuite/gfortran.dg/byte_4.f90
===
--- gcc/testsuite/gfortran.dg/byte_4.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/byte_4.f90	(working copy)
@@ -0,0 +1,5 @@
+! { dg-do compile }
+  bytea  ! { dg-error "Unclassifiable statement" }
+  a = 1
+  print '(I0)', a
+  end

[PATCH][MSP430] Add -mtiny-printf option to support reduced code size printf and puts

2019-10-24 Thread Jozef Lawrynowicz

I added support for reduced code size printf and puts functions to Newlib for
MSP430 a while ago [1]. By removing support for reentrancy, streams and
buffering we can greatly reduce code size, which is often a limitation
when using printf on microcontrollers.

This patch adds an interface to enable these reduced code size implementations
from GCC by using the -mtiny-printf option. The tiny printf and puts
implementations require GCC to be configured with
--enable-newlib-nano-formatted-io, so there are some modifications to configure
scripts to support the checking of that.

-mtiny-printf is merely an alias for passing "--wrap printf --wrap puts" to the
linker.
This will replace references to "printf" and "puts" in user
code with "__wrap_printf" and "__wrap_puts" respectively.
If there is no implementation of these __wrap* functions in user code,
these "tiny" printf and puts implementations will be linked into the
final executable.

The wrapping mechanism is supposed to be invisible to the user since even if
they are unaware of the "tiny" implementation, and implement their own 
__wrap_printf and __wrap_puts, their own implementation will be automatically
chosen over the "tiny" printf and puts from the library.
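
To illustrate the user-side override, here is a minimal sketch of a user-provided __wrap_puts. The call counter is hypothetical and only there to make the effect observable; this is not the Newlib "tiny" implementation.

```c
#include <stdio.h>

/* Hypothetical counter, only here to make the override observable.  */
int wrap_puts_calls = 0;

/* When the linker is given "--wrap puts", references to puts in the
   program are resolved to this function instead of the library one.  */
int
__wrap_puts (const char *s)
{
  wrap_puts_calls++;
  if (fputs (s, stdout) == EOF)
    return EOF;
  return fputc ('\n', stdout) == EOF ? EOF : 0;
}
```

With GNU ld, --wrap also resolves references to __real_puts to the original puts, in case a wrapper wants to forward to it; the sketch above does not use that.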

Successfully regtested on trunk by comparing results with -mtiny-printf with a
set of test results without the option.
The new test "gcc.target/msp430/tiny-printf.c" verifies the option behaves as
expected when GCC is configured with and without
--enable-newlib-nano-formatted-io.

Ok to apply?

[1]
https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=commit;h=1e6c561d48f
>From 4d4e2b6bb92317b2b4db1d99c3f43a167a1e3288 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 24 Oct 2019 13:30:21 +0100
Subject: [PATCH] MSP430: Add -mtiny-printf option

gcc/ChangeLog:

2019-10-24  Jozef Lawrynowicz  

	* config.in: Regenerate.
	* config/msp430/msp430.c (msp430_option_override): Emit an error if
	-mtiny-printf is used without GCC being configured with
	--enable-newlib-nano-formatted-io.
	* config/msp430/msp430.h (LINK_SPEC): Pass 
	"--wrap puts --wrap printf" when -mtiny-printf is used.
	* config/msp430/msp430.opt: Document -mtiny-printf.
	* configure: Regenerate.
	* configure.ac: Enable --enable-newlib-nano-formatted-io flag.
	Define HAVE_NEWLIB_NANO_FORMATTED_IO if
	--enable-newlib-nano-formatted-io is passed.
	* doc/invoke.texi: Document -mtiny-printf.

gcc/testsuite/ChangeLog:

2019-10-24  Jozef Lawrynowicz  

	* gcc.target/msp430/tiny-printf.c: New test.

---
 gcc/config.in |  7 ++
 gcc/config/msp430/msp430.c|  6 +
 gcc/config/msp430/msp430.h|  1 +
 gcc/config/msp430/msp430.opt  |  4 +++
 gcc/configure | 25 +--
 gcc/configure.ac  | 16 
 gcc/doc/invoke.texi   | 15 ++-
 gcc/testsuite/gcc.target/msp430/tiny-printf.c |  3 +++
 8 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/tiny-printf.c

diff --git a/gcc/config.in b/gcc/config.in
index 9b54a4715db..7925d892cce 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1675,6 +1675,13 @@
 #endif
 
 
+/* Define if GCC has been configured with --enable-newlib-nano-formatted-io.
+   */
+#ifndef USED_FOR_TARGET
+#undef HAVE_NEWLIB_NANO_FORMATTED_IO
+#endif
+
+
 /* Define to 1 if you have the `nl_langinfo' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_NL_LANGINFO
diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index 31029395c3d..dbbff4a6863 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -284,6 +284,12 @@ msp430_option_override (void)
  possible to build newlib with -Os enabled.  Until now...  */
   if (TARGET_OPT_SPACE && optimize < 3)
 optimize_size = 1;
+
+#ifndef HAVE_NEWLIB_NANO_FORMATTED_IO
+  if (TARGET_TINY_PRINTF)
+error ("GCC must be configured with %<--enable-newlib-nano-formatted-io%> "
+	   "to use %<-mtiny-printf%>");
+#endif
 }
 
 #undef  TARGET_SCALAR_MODE_SUPPORTED_P
diff --git a/gcc/config/msp430/msp430.h b/gcc/config/msp430/msp430.h
index 73afe2e2d16..4a89f03a35e 100644
--- a/gcc/config/msp430/msp430.h
+++ b/gcc/config/msp430/msp430.h
@@ -75,6 +75,7 @@ extern bool msp430x;
 "msp430_propagate_region_opt(%* %{muse-lower-region-prefix})} " \
   "%{mdata-region=*:--data-region=%:" \
 "msp430_propagate_region_opt(%* %{muse-lower-region-prefix})} " \
+  "%{mtiny-printf:--wrap puts --wrap printf} "
 
 #define DRIVER_SELF_SPECS \
   " %{!mlarge:%{mcode-region=*:%{mdata-region=*:%e-mcode-region and "	\
diff --git a/gcc/config/msp430/msp430.opt b/gcc/config/msp430/msp430.opt
index 2db2906ca11..b451174c3d1 100644
--- a/gcc/config/msp430/msp430.opt
+++ b/gcc/config/msp430/msp430.opt
@@ -2,6 +2,10 @@ msim
 Target
 Use simulator runtime.
 
+mtiny-printf
+Target Report Mask(TINY_PRINTF)
+Use a lightweight co

C++ PATCH for c++/92215 - flawed diagnostic for bit-field with non-integral type

2019-10-24 Thread Marek Polacek

I noticed that for code like 

  struct S {
int *foo : 3;
  };

we generate nonsensical

  r.C:2:8: error: function definition does not declare parameters
  2 |   int *foo : 3;

It talks about a function because after parsing the declspecs of 'foo' we don't
see either ':' or "name :", so we think it's not a bit-field decl.  So we parse
the declarator and since a ctor-initializer begins with a ':', we try to parse
it as a function body, generating the awful diagnostic.  With this patch, we
issue:

  r.C:2:8: error: bit-field ‘foo’ has non-integral type ‘int*’
  2 |   int *foo : 3;
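
For illustration only, here is a minimal sketch of the rule the new diagnostic enforces: a bit-field must have integral or enumeration type, so pointers, references, and arrays are rejected. The names `bitfield_type_ok`, `Flags`, and `store_level` are hypothetical and are not part of GCC or of the patch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper, not a GCC API: true when T is acceptable as the
// declared type of a bit-field (integral or enumeration type).
template <typename T>
constexpr bool bitfield_type_ok ()
{
  return std::is_integral<T>::value || std::is_enum<T>::value;
}

// Well-formed bit-fields, for comparison with the rejected 'int *foo : 3'.
struct Flags
{
  unsigned ready : 1;
  int level : 3;
};

// Stores LEVEL in the 3-bit field and reads it back.  Only values that fit
// regardless of the field's signedness (0..3) are accepted here.
int store_level (Flags &f, int level)
{
  assert (level >= 0 && level <= 3);
  f.level = level;
  return f.level;
}

static_assert (bitfield_type_ok<int> (), "int is integral");
static_assert (!bitfield_type_ok<int *> (), "pointers are not integral");
static_assert (!bitfield_type_ok<int &> (), "references are not integral");
static_assert (!bitfield_type_ok<int[1]> (), "arrays are not integral");
```

The `static_assert` lines mirror the f1, f2, and f4 cases of the new test: each of those types fails the check, which is what the improved error message reports.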

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-10-24  Marek Polacek  

PR c++/92215 - flawed diagnostic for bit-field with non-integral type.
* parser.c (cp_parser_member_declaration): Add a diagnostic for
bit-fields with non-integral types.

* g++.dg/diagnostic/bitfld4.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index 3857fe47d67..84d2121cae2 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -24971,6 +24971,29 @@ cp_parser_member_declaration (cp_parser* parser)
  else
initializer = cp_parser_initializer (parser, &x, &x);
}
+ /* Detect invalid bit-field cases such as
+
+  int *p : 4;
+  int &&r : 3;
+
+and similar.  */
+ else if (cp_lexer_next_token_is (parser->lexer, CPP_COLON)
+  && decl_specifiers.any_type_specifiers_p)
+   {
+ /* This is called for a decent diagnostic only.  */
+ tree d = grokdeclarator (declarator, &decl_specifiers,
+  BITFIELD, /*initialized=*/false,
+  &attributes);
+ error_at (DECL_SOURCE_LOCATION (d),
+   "bit-field %qD has non-integral type %qT",
+   d, TREE_TYPE (d));
+ cp_parser_skip_to_end_of_statement (parser);
+ /* Avoid "extra ;" pedwarns.  */
+ if (cp_lexer_next_token_is (parser->lexer,
+ CPP_SEMICOLON))
+   cp_lexer_consume_token (parser->lexer);
+ goto out;
+   }
  /* Otherwise, there is no initializer.  */
  else
initializer = NULL_TREE;
diff --git gcc/testsuite/g++.dg/diagnostic/bitfld4.C gcc/testsuite/g++.dg/diagnostic/bitfld4.C
new file mode 100644
index 000..d6aa9a5513c
--- /dev/null
+++ gcc/testsuite/g++.dg/diagnostic/bitfld4.C
@@ -0,0 +1,16 @@
+// PR c++/92215 - flawed diagnostic for bit-field with non-integral type.
+// { dg-do compile { target c++11 } }
+
+struct S {
+  int *f1 : 3; // { dg-error "bit-field .f1. has non-integral type .int\\*." }
+  int &f2 : 3; // { dg-error "bit-field .f2. has non-integral type .int&." }
+  int &&f3 : 3; // { dg-error "bit-field .f3. has non-integral type .int&&." }
+  int f4[1] : 3; // { dg-error "bit-field .f4. has non-integral type .int \\\[1\\\]." }
+  int *f5 __attribute__((deprecated)) : 3; // { dg-error "bit-field .f5. has non-integral type .int\\*." }
+  int f6[1] __attribute__((deprecated)) : 3; // { dg-error "bit-field .f6. has non-integral type .int \\\[1\\\]." }
+  int &f7 __attribute__((deprecated)): 3; // { dg-error "bit-field .f7. has non-integral type .int&." }
+  int : 3; // { dg-error "expected" }
+  int *f9[1] : 3; // { dg-error "bit-field .f9. has non-integral type .int\\* \\\[1\\\]." }
+  int (*f10)() : 3; // { dg-error "bit-field .f10. has non-integral type .int \\(\\*\\)\\(\\)." }
+  int [][2] : 3; // { dg-error "expected" }
+};

Free inline summaries for inline clones

2019-10-24 Thread Jan Hubicka

Hi,
Most of the IPA summaries we maintain actually need to be kept only for
offline functions.  This patch releases the fnsummary and call summary
for inline clones.  This needs a bit of refactoring, since we need to keep
size info for clones and for LTO partitioning, so I split it out into a
separate size summary.
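
A minimal sketch of the idea, under loud assumptions: the types and member names below (`fn_summary`, `size_summary`, `summaries`, `release_inline_clone`, `total_size`) are hypothetical stand-ins keyed by a plain node id, not GCC's `ipa_fn_summaries` / `ipa_size_summaries` classes. It only shows why a separate size table lets the larger per-function summary be dropped for inline clones.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins; the field sets are illustrative only.
struct fn_summary { int time; bool inlinable; };
struct size_summary { int self_size; int size; };

struct summaries
{
  std::unordered_map<int, fn_summary> fn;     // keyed by node id
  std::unordered_map<int, size_summary> size; // kept for every node

  // NODE became an inline clone: its function summary is no longer needed,
  // but its size record must survive.
  void release_inline_clone (int node)
  {
    assert (size.count (node) != 0);
    fn.erase (node);
  }

  // Something like partitioning still wants sizes of all nodes, clones
  // included, which is why the size table is separate.
  int total_size () const
  {
    int sum = 0;
    for (const auto &entry : size)
      sum += entry.second.size;
    return sum;
  }
};
```

After `release_inline_clone`, lookups in `fn` for that node fail while `total_size` is unchanged, which is the property the split is meant to preserve.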

Bootstrapped/regtested x86_64-linux, committed.

Honza

* cgraphunit.c (symbol_table::process_new_functions): Call
ipa_free_size_summary.
* ipa-cp.c (ipcp_cloning_candidate_p): Update.
(devirtualization_time_bonus): Update.
(ipcp_propagate_stage): Update.
* ipa-fnsummary.c (ipa_size_summaries): New.
(ipa_fn_summary_alloc): Alloc size summary.
(dump_ipa_call_summary): Update.
(ipa_dump_fn_summary): Update.
(analyze_function_body): Update.
(compute_fn_summary): Likewise.
(ipa_get_stack_frame_offset): New function.
(inline_update_callee_summaries): Do not update frame offsets.
(ipa_merge_fn_summary_after_inlining): Update frame offsets here;
remove call and function summary.
(ipa_update_overall_fn_summary): Update.
(inline_read_section): Update.
(ipa_fn_summary_write): Update.
(ipa_free_fn_summary): Do not remove summaries.
(ipa_free_size_summary): New.
(release summary pass): Also run at WPA.
* ipa-fnsummary.h (ipa_size_summary): Declare.
(ipa_fn_summary): Remove size, self_size, stack_frame_offset,
estimated_self_stack_size.
(ipa_size_summary_t): New type.
(ipa_size_summaries): Declare.
(ipa_free_size_summary): Declare.
(ipa_get_stack_frame_offset): Declare.
* ipa-icf.c (sem_function::merge): Update.
* ipa-inline-analysis.c (estimate_size_after_inlining): Update.
(estimate_growth): Update.
(growth_likely_positive): Update.
(clone_inlined_nodes): Update.
(inline_call): Update.
* ipa-inline.c (caller_growth_limits): Update.
(edge_badness): Update.
(recursive_inlining): Update.
(inline_small_functions): Update.
(inline_to_all_callers_1): Update.
* ipa-prop.h (ipa_edge_args_sum_t): Update comment.
* lto-partition.c (add_symbol_to_partition_1): Update.
(undo_partition): Update.
Index: cgraphunit.c
===
--- cgraphunit.c(revision 277423)
+++ cgraphunit.c(working copy)
@@ -340,7 +340,10 @@ symbol_table::process_new_functions (voi
 and splitting.  This is redundant for functions added late.
 Just throw away whatever it did.  */
  if (!summaried_computed)
-   ipa_free_fn_summary ();
+   {
+ ipa_free_fn_summary ();
+ ipa_free_size_summary ();
+   }
}
  else if (ipa_fn_summaries != NULL)
compute_fn_summary (node, true);
Index: ipa-cp.c
===
--- ipa-cp.c(revision 277423)
+++ ipa-cp.c(working copy)
@@ -731,7 +731,7 @@ ipcp_cloning_candidate_p (struct cgraph_
   init_caller_stats (&stats);
   node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats, false);
 
-  if (ipa_fn_summaries->get (node)->self_size < stats.n_calls)
+  if (ipa_size_summaries->get (node)->self_size < stats.n_calls)
 {
   if (dump_file)
fprintf (dump_file, "Considering %s for cloning; code might shrink.\n",
@@ -2629,13 +2629,14 @@ devirtualization_time_bonus (struct cgra
   if (!isummary->inlinable)
continue;
 
+  int size = ipa_size_summaries->get (callee)->size;
   /* FIXME: The values below need re-considering and perhaps also
 integrating into the cost metrics, at lest in some very basic way.  */
-  if (isummary->size <= MAX_INLINE_INSNS_AUTO / 4)
+  if (size <= MAX_INLINE_INSNS_AUTO / 4)
res += 31 / ((int)speculative + 1);
-  else if (isummary->size <= MAX_INLINE_INSNS_AUTO / 2)
+  else if (size <= MAX_INLINE_INSNS_AUTO / 2)
res += 15 / ((int)speculative + 1);
-  else if (isummary->size <= MAX_INLINE_INSNS_AUTO
+  else if (size <= MAX_INLINE_INSNS_AUTO
   || DECL_DECLARED_INLINE_P (callee->decl))
res += 7 / ((int)speculative + 1);
 }
@@ -3334,7 +3335,7 @@ ipcp_propagate_stage (class ipa_topo_inf
   ipa_get_param_count (info));
initialize_node_lattices (node);
   }
-ipa_fn_summary *s = ipa_fn_summaries->get (node);
+ipa_size_summary *s = ipa_size_summaries->get (node);
 if (node->definition && !node->alias && s != NULL)
   overall_size += s->self_size;
 max_count = max_count.max (node->count.ipa ());
Index: ipa-fnsummary.c
===
--- ipa-fnsummary.c (revision 277423)
+++ ipa-fnsummary.c

C++ PATCH for c++/92134 - constinit malfunction in static data member

2019-10-24 Thread Marek Polacek

I wasn't properly setting LOOKUP_CONSTINIT in grokfield and so we didn't
detect a non-const initializer.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-10-24  Marek Polacek  

PR c++/92134 - constinit malfunction in static data member.
* decl2.c (grokfield): Set LOOKUP_CONSTINIT.

* g++.dg/cpp2a/constinit14.C: New test.

diff --git gcc/cp/decl2.c gcc/cp/decl2.c
index 6d5e973b487..a630ee31397 100644
--- gcc/cp/decl2.c
+++ gcc/cp/decl2.c
@@ -990,6 +990,9 @@ grokfield (const cp_declarator *declarator,
   else
 flags = LOOKUP_IMPLICIT;
 
+  if (decl_spec_seq_has_spec_p (declspecs, ds_constinit))
+flags |= LOOKUP_CONSTINIT;
+
   switch (TREE_CODE (value))
 {
 case VAR_DECL:
diff --git gcc/testsuite/g++.dg/cpp2a/constinit14.C gcc/testsuite/g++.dg/cpp2a/constinit14.C
new file mode 100644
index 000..72bfab667b8
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/constinit14.C
@@ -0,0 +1,13 @@
+// PR c++/92134 - constinit malfunction in static data member.
+// { dg-do compile { target c++2a } }
+
+struct Value {
+  Value() : v{new int{42}} {}
+  int* v;
+};
+
+struct S {
+  static constinit inline Value v{}; // { dg-error "variable .S::v. does not have a constant initializer|call to non-.constexpr. function" }
+};
+
+int main() { return *S::v.v; }

[committed] Further omp declare variant progress

2019-10-24 Thread Jakub Jelinek

Hi!

The following patch extends the omp declare variant handling to perform
another check during gimplification, where it already can redirect various
calls to base functions to their corresponding variants.
The scoring is still unimplemented, so right now it does the redirection
only if there is a single matching variant.
As the C++ omp declare variant patch has not been committed yet,
it is for now limited to C.
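
As a rough sketch of the current behaviour described above (redirect only when exactly one variant matches, since scoring is not implemented), assuming a hypothetical `variant` record and `resolve_variant` helper that are not the real `omp_resolve_declare_variant` interface:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a variant name plus whether its context selector
// matched at the call site.
struct variant
{
  std::string name;
  bool matches;
};

// Returns the name of the function to call: the single matching variant if
// there is exactly one, otherwise the base function.  No scoring is done.
std::string
resolve_variant (const std::string &base, const std::vector<variant> &variants)
{
  assert (!base.empty ());
  const variant *found = nullptr;
  for (const variant &v : variants)
    if (v.matches)
      {
        if (found)
          return base;   // more than one candidate: would need scoring
        found = &v;
      }
  return found ? found->name : base;
}
```

With two matching variants the sketch falls back to the base function, which corresponds to the "scoring is still unimplemented" limitation.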

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-10-24  Jakub Jelinek  

* gimplify.h (omp_construct_selector_matches): Declare.
* gimplify.c (struct gimplify_omp_ctx): Add code member.
(gimplify_call_expr): Call omp_resolve_declare_variant and remap
called function if needed for flag_openmp.
(gimplify_scan_omp_clauses): Set ctx->code.
(omp_construct_selector_matches): New function.
* omp-general.h (omp_constructor_traits_to_codes,
omp_context_selector_matches, omp_resolve_declare_variant): Declare.
* omp-general.c (omp_constructor_traits_to_codes,
omp_context_selector_matches, omp_resolve_declare_variant): New
functions.
c-family/
* c-common.h (c_omp_context_selector_matches): Remove.
* c-omp.c (c_omp_context_selector_matches): Remove.
* c-attribs.c (c_common_attribute_table): Add
"omp declare target {host,nohost,block}" attributes.
c/
* c-parser.c (c_finish_omp_declare_variant): Use
omp_context_selector_matches instead of
c_omp_context_selector_matches.
* c-decl.c (c_decl_attributes): Add "omp declare target block"
attribute in between declare target and end declare target
pragmas.
cp/
* decl2.c (cplus_decl_attributes): Add "omp declare target block"
attribute in between declare target and end declare target
pragmas.
testsuite/
* c-c++-common/gomp/declare-variant-8.c: New test.

--- gcc/gimplify.h.jj   2019-10-18 00:16:11.196526233 +0200
+++ gcc/gimplify.h  2019-10-24 18:52:28.897994585 +0200
@@ -75,6 +75,8 @@ extern void omp_firstprivatize_variable
 extern enum gimplify_status gimplify_expr (tree *, gimple_seq *, gimple_seq *,
   bool (*) (tree), fallback_t);
 
+HOST_WIDE_INT omp_construct_selector_matches (enum tree_code *, int);
+
 extern void gimplify_type_sizes (tree, gimple_seq *);
 extern void gimplify_one_sizepos (tree *, gimple_seq *);
 extern gbind *gimplify_body (tree, bool);
--- gcc/gimplify.c.jj   2019-10-18 00:16:11.149526930 +0200
+++ gcc/gimplify.c  2019-10-24 20:43:05.097408585 +0200
@@ -219,6 +219,7 @@ struct gimplify_omp_ctx
   location_t location;
   enum omp_clause_default_kind default_kind;
   enum omp_region_type region_type;
+  enum tree_code code;
   bool combined_loop;
   bool distribute;
   bool target_firstprivatize_array_bases;
@@ -3385,6 +3386,13 @@ gimplify_call_expr (tree *expr_p, gimple
   /* Remember the original function pointer type.  */
   fnptrtype = TREE_TYPE (CALL_EXPR_FN (*expr_p));
 
+  if (flag_openmp && fndecl)
+{
+  tree variant = omp_resolve_declare_variant (fndecl);
+  if (variant != fndecl)
+   CALL_EXPR_FN (*expr_p) = build1 (ADDR_EXPR, fnptrtype, variant);
+}
+
   /* There is a sequence point before the call, so any side effects in
  the calling expression must occur before the actual call.  Force
  gimplify_expr to use an internal post queue.  */
@@ -8137,6 +8145,7 @@ gimplify_scan_omp_clauses (tree *list_p,
   int nowait = -1;
 
   ctx = new_omp_context (region_type);
+  ctx->code = code;
   outer_ctx = ctx->outer_context;
   if (code == OMP_TARGET)
 {
@@ -10324,6 +10333,99 @@ gimplify_adjust_omp_clauses (gimple_seq
   delete_omp_context (ctx);
 }
 
+/* Return 0 if CONSTRUCTS selectors don't match the OpenMP context,
+   -1 if unknown yet (simd is involved, won't be known until vectorization)
+   and positive number if they do, the number is then the number of constructs
+   in the OpenMP context.  */
+
+HOST_WIDE_INT
+omp_construct_selector_matches (enum tree_code *constructs, int nconstructs)
+{
+  int matched = 0, cnt = 0;
+  bool simd_seen = false;
+  for (struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp; ctx;)
+{
+  if (((ctx->region_type & ORT_PARALLEL) && ctx->code == OMP_PARALLEL)
+ || ((ctx->region_type & (ORT_TARGET | ORT_IMPLICIT_TARGET | ORT_ACC))
+ == ORT_TARGET && ctx->code == OMP_TARGET)
+ || ((ctx->region_type & ORT_TEAMS) && ctx->code == OMP_TEAMS)
+ || (ctx->region_type == ORT_WORKSHARE && ctx->code == OMP_FOR)
+ || (ctx->region_type == ORT_SIMD
+ && ctx->code == OMP_SIMD
+ && !omp_find_clause (ctx->clauses, OMP_CLAUSE_BIND)))
+   {
+ ++cnt;
+ if (matched < nconstructs && ctx->code == constructs[matched])
+   {
+ if (ctx->code == OMP_SIMD)
+   {
+ if (matched)
+

[C++ PATCH] Fix up decl_in_std_namespace_p handling of --enable-symvers=gnu-versioned-namespace

2019-10-24 Thread Jakub Jelinek

Hi!

When looking into the constexpr new issues and adding is_std_construct_at
function, I've noticed that with --enable-symvers=gnu-versioned-namespace
all of that fails, because construct_at (but for other things
forward or move etc.) aren't directly in std namespace, but in inline
namespace inside of it (std::_8::{construct_at,forward,move,...}).

The following patch changes the function all of those calls use to look
through inline namespaces.
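
The walk can be sketched outside the compiler as follows.  This is only a model: `scope` and `in_std_namespace` are hypothetical names, and plain structs stand in for the tree nodes and the DECL_NAMESPACE_STD_P / DECL_NAMESPACE_INLINE_P predicates used by the real function.

```cpp
#include <string>

// Hypothetical model of a namespace; the global namespace has no parent.
struct scope
{
  std::string name;
  bool is_inline;
  const scope *parent;
};

// True if NS is std itself, or is reachable from std purely through inline
// namespaces (e.g. std::_8 with a versioned-namespace libstdc++).
bool
in_std_namespace (const scope *ns)
{
  while (ns)
    {
      // "std" declared directly in the global namespace.
      if (ns->name == "std" && ns->parent && ns->parent->parent == nullptr)
        return true;
      // A non-inline namespace ends the search, as in the patch.
      if (!ns->is_inline)
        return false;
      ns = ns->parent;
    }
  return false;
}
```

A non-inline nested namespace such as a hypothetical `std::detail` is deliberately not treated as std here, matching the early `return false` in the patched loop.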

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-10-24  Jakub Jelinek  

* typeck.c (decl_in_std_namespace_p): Return true also for decls
in inline namespaces inside of std namespace.

* g++.dg/cpp0x/Wpessimizing-move6.C: New test.

--- gcc/cp/typeck.c.jj  2019-10-23 20:38:00.022871653 +0200
+++ gcc/cp/typeck.c 2019-10-24 11:36:14.982981481 +0200
@@ -9395,8 +9395,16 @@ maybe_warn_about_returning_address_of_lo
 bool
 decl_in_std_namespace_p (tree decl)
 {
-  return (decl != NULL_TREE
- && DECL_NAMESPACE_STD_P (decl_namespace_context (decl)));
+  while (decl)
+{
+  decl = decl_namespace_context (decl);
+  if (DECL_NAMESPACE_STD_P (decl))
+   return true;
+  if (!DECL_NAMESPACE_INLINE_P (decl))
+   return false;
+  decl = CP_DECL_CONTEXT (decl);
+}
+  return false;
 }
 
 /* Returns true if FN, a CALL_EXPR, is a call to std::forward.  */
--- gcc/testsuite/g++.dg/cpp0x/Wpessimizing-move6.C.jj  2019-10-24 11:26:17.535148996 +0200
+++ gcc/testsuite/g++.dg/cpp0x/Wpessimizing-move6.C 2019-10-24 11:27:17.359232014 +0200
@@ -0,0 +1,135 @@
+// PR c++/86981
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wpessimizing-move" }
+
+// Define std::move.
+namespace std {
+  inline namespace _8 { }
+  namespace _8 {
+template<typename _Tp>
+  struct remove_reference
+  { typedef _Tp   type; };
+
+template<typename _Tp>
+  struct remove_reference<_Tp&>
+  { typedef _Tp   type; };
+
+template<typename _Tp>
+  struct remove_reference<_Tp&&>
+  { typedef _Tp   type; };
+
+template<typename _Tp>
+  constexpr typename std::remove_reference<_Tp>::type&&
+  move(_Tp&& __t) noexcept
+  { return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }
+  }
+}
+
+struct T {
+  T() { }
+  T(const T&) { }
+  T(T&&) { }
+};
+struct U {
+  U() { }
+  U(const U&) { }
+  U(U&&) { }
+  U(T) { }
+};
+
+T g;
+
+T
+fn1 ()
+{
+  T t;
+  return std::move (t); // { dg-warning "moving a local object in a return statement prevents copy elision" }
+}
+
+T
+fn2 ()
+{
+  // Not a local variable.
+  return std::move (g);
+}
+
+int
+fn3 ()
+{
+  int i = 42;
+  // Not a class type.
+  return std::move (i);
+}
+
+T
+fn4 (bool b)
+{
+  T t;
+  if (b)
+throw std::move (t);
+  return std::move (t); // { dg-warning "moving a local object in a return statement prevents copy elision" }
+}
+
+T
+fn5 (T t)
+{
+  // Function parameter; std::move is redundant but not pessimizing.
+  return std::move (t);
+}
+
+U
+fn6 (T t, U u, bool b)
+{
+  if (b)
+return std::move (t);
+  else
+// Function parameter; std::move is redundant but not pessimizing.
+return std::move (u);
+}
+
+U
+fn6 (bool b)
+{
+  T t;
+  U u;
+  if (b)
+return std::move (t);
+  else
+return std::move (u); // { dg-warning "moving a local object in a return statement prevents copy elision" }
+}
+
+T
+fn7 ()
+{
+  static T t;
+  // Non-local; don't warn.
+  return std::move (t);
+}
+
+T
+fn8 ()
+{
+  return T();
+}
+
+T
+fn9 (int i)
+{
+  T t;
+
+  switch (i)
+{
+case 1:
+  return std::move ((t)); // { dg-warning "moving a local object in a return statement prevents copy elision" }
+case 2:
+  return (std::move (t)); // { dg-warning "moving a local object in a return statement prevents copy elision" }
+default:
+  return (std::move ((t))); // { dg-warning "moving a local object in a return statement prevents copy elision" }
+}
+}
+
+int
+fn10 ()
+{
+  return std::move (42);
+}


Jakub

Free m_vector of symbol and call summaries

2019-10-24 Thread Jan Hubicka

Hi,
we never free m_vector in summaries.  Fixed thus.

Honza

* symbol-summary.h (fast_function_summary::release,
fast_call_summary::release): Free m_vector.
Index: symbol-summary.h
===
--- symbol-summary.h(revision 277424)
+++ symbol-summary.h(working copy)
@@ -458,6 +458,8 @@ fast_function_summary::release (
 if ((*m_vector)[i] != NULL)
   this->release ((*m_vector)[i]);
 
+  vec_free (m_vector);
+ 
   this->m_released = true;
 }
 
@@ -919,6 +921,8 @@ fast_call_summary::release ()
 if ((*m_vector)[i] != NULL)
   this->release ((*m_vector)[i]);
 
+  vec_free (m_vector);
+
   this->m_released = true;
 }

Re: [PATCH] Make std::invoke usable in constant expressions

2019-10-24 Thread Jonathan Wakely


On 23/10/19 20:28 +0100, Jonathan Wakely wrote:

* include/std/functional (invoke): Add constexpr for C++20.
* include/std/version (__cpp_lib_constexpr_invoke): Define.
* testsuite/20_util/function_objects/invoke/constexpr.cc: New test.

This is an easy one, because I already made std::__invoke constexpr,
so all that's needed for C++20 is to add _GLIBCXX20_CONSTEXPR to the
public std::invoke function that calls std::__invoke.


For some reason I thought this change only affected std::invoke, but
P1065R2 affects other functions too. I'll fix those tomorrow.

Re: Type representation in CTF and DWARF

2019-10-24 Thread Indu Bhagat





On 10/11/2019 04:41 AM, Jakub Jelinek wrote:

On Fri, Oct 11, 2019 at 01:23:12PM +0200, Richard Biener wrote:

(coreutils-0.22)
binary | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
ls     |           30616 |              1136 |          21098 |               26240 | 0.62
pwd    |           10734 |               788 |          10433 |               13929 | 0.83
groups |           10706 |               811 |          10249 |               13378 | 0.80

(emacs-26.3)
binary       | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
emacs-26.3.1 |          674657 |              6402 |         273963 |              273910 | 0.33

I chose to account for 50% of .debug_str because at this point, it would be
unfair not to account for them. Actually, one could even argue that up to 70%
of the .debug_str are names of entities. CTF section sizes do include the CTF
string tables.

Across coreutils, I see a geomean of 0.73 (ratio of
.ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the
"-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger
footprint than CTF (with 50% of .debug_str accounted for).

I'm not convinced this "improvement" in size is worth maintaining another
debug-info format much less since it lacks desirable features right now
and thus evaluation is tricky.

At least you can improve dwarf size considerably with a low amount of work.

I suspect another factor where dwarf is bigger compared to CTF is that dwarf
is recording typedef names as well as qualified type variants.  But maybe
CTF just has a more compact representation for the bits it actually implements.

Does CTF record automatic variables in functions, or just global variables?
If only the latter, it would be fair to also disable addition of local
variable DIEs, lexical blocks.  Does CTF record inline functions?  Again, if
not, it would be fair to not emit that either in .debug_info.
-gno-record-gcc-switches so that the compiler command line is not encoded in
the debug info (unless it is in CTF).


CTF includes file-scope and global-scope entities. So, CTF for a function
defined/declared at these scopes is available in .ctf section, even if it is
inlined.

To not generate DWARF for function-local entities, I made a tweak in the
gen_decl_die API to have an early exit when TREE_CODE (DECL_CONTEXT (decl))
is FUNCTION_DECL.

@@ -26374,6 +26374,12 @@ gen_decl_die (tree decl, tree origin, struct vlr_context *ctx,
   if (DECL_P (decl_or_origin) && DECL_IGNORED_P (decl_or_origin))
 return NULL;
 
+  /* Do not generate info for function local decl when -gdwarf-like-ctf is
+     enabled.  */
+  if (debug_dwarf_like_ctf && DECL_CONTEXT (decl)
+  && (TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL))
+return NULL;
+
   switch (TREE_CODE (decl_or_origin))
 {
 case ERROR_MARK:


For the numbers in the email today:
1. CFLAGS="-g -gdwarf-like-ctf -gno-record-gcc-switches -O2". dwz is used on
   generated binaries.
2. At this time, I wanted to account for .debug_str entities appropriately (not
   50% as done previously). Using a small script to count chars for
   accounting the "path-like" strings, specifically those strings that start
   with a ".", I gathered the data in column named D5.

(coreutils-0.22)
binary | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | path strings (D5) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D4-D5))
ls     |           14100 |               994 |          16945 |              1328 |               26240 | 0.85
pwd    |            6341 |               632 |           9311 |               596 |               13929 | 0.88
groups |            6410 |               714 |           9218 |               667 |               13378 | 0.85
Average geomean across coreutils = 0.84

(emacs-26.3)
binary       | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | path strings (D5) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D4-D5))
emacs-26.3.1 |          373678 |              3794 |         219048 |              3842 |              273910 | 0.46
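
For anyone who wants to redo the arithmetic, here is a small sketch of how the ratio column and a geometric mean can be computed from the section sizes.  The struct and function names (`section_sizes`, `ctf_ratio`, `geomean`) are made up for this example; the formula is the one in the column header, .ctf/(D1+D2+D4-D5).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Section sizes in bytes for one binary (names are illustrative).
struct section_sizes
{
  double debug_info;    // D1
  double debug_abbrev;  // D2
  double debug_str;     // D4
  double path_strings;  // D5, the "path-like" part of .debug_str
  double ctf;           // uncompressed .ctf
};

// ratio = .ctf / (D1 + D2 + D4 - D5)
double
ctf_ratio (const section_sizes &s)
{
  double dwarf = s.debug_info + s.debug_abbrev + s.debug_str - s.path_strings;
  assert (dwarf > 0);
  return s.ctf / dwarf;
}

// Geometric mean of positive values, computed via logarithms.
double
geomean (const std::vector<double> &values)
{
  assert (!values.empty ());
  double log_sum = 0;
  for (double v : values)
    log_sum += std::log (v);
  return std::exp (log_sum / values.size ());
}
```

Feeding in the ls row (14100, 994, 16945, 1328, 26240) gives a ratio of about 0.854, in line with the 0.85 listed.  Note that the 0.84 geomean quoted above is taken across all of coreutils, so it cannot be reproduced from the three rows shown here alone.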


DWARF is highly extensible format, what exactly is and is not emitted is
something that consumers can choose.
Yes, DWARF can be large, but mainly because it provides a lot of
information, the actual representation has been designed with size concerns
in mind and newer versions of the standard keep improving that too.

Jakub


Yes.

I started out to provide some numbers around the size impact of CTF vs DWARF
as it was a legitimate curiosity many of us have had. Comparing compactness or
feature matrices is only one dimension of evaluating the utility of supporting
CTF in the toolchain (including GCC; Binutils and GDB have already accepted
initial CTF support). The other dimension is a user-friendly workflow which
supports current users and eases further adoption and growth.

Indu

Re: [PATCH V3] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-10-24 Thread Feng Xue OS

Richard,

Thanks for your comments. 

>+  /* For PHI node that is not in loop header, its source operands should
>+be defined inside the loop, which are seen as loop variant.  */
>+  if (def_bb != loop->header || !skip_head)
>+   return false;

> so if we have
>
> for (;;)
>  {
> if (x)
>   a = ..;
> else
>   a = ...;
> if (cond-to-split-on dependent on a)
> ...
>  }
>
> the above is too restrictive in case 'x' is semi-invariant as well, correct?
In the above case, cond-on-a will not be identified as semi-invariant, because
a is defined by a PHI with genuinely multiple sources. To handle it, besides
checking each source value, we would need an extra check on each source's
control dependence node (x in this case), which might take a fair amount of
extra code. Anyway, I'll give it a try.


>+ /* A new value comes from outside of loop.  */
>+ if (!bb || !flow_bb_inside_loop_p (loop, bb))
>+   return false;

> but that means starting from the second iteration the value is invariant.
No. The traversal direction is the reverse of loop execution. In the following,
we start from "x_1 =", extract the latch value x_3, get the definition of x_3,
and finally reach "x_1 =" again.

Loop:
  x_1 = PHI (x_0, x_3)
  ... 
  x_3 = 
  ...
  goto Loop;


>+ /* Don't consider redefinitions in excluded basic blocks.  */
>+ if (!dominated_by_p (CDI_DOMINATORS, e->src, skip_head))
>+   {
>+ /* There are more than one source operands that can
>+provide value to the SSA name, it is variant.  */
>+ if (from)
>+   return false;
>
> they might be the same though, for PHIs with > 2 arguments.
OK. Will add value equivalence check.


> In the cycle handling you are not recursing via stmt_semi_invariant_p
> but only handle SSA name copies - any particular reason for that?
The cycle handling is specific to SSA names that cross iterations. Such a name
is semi-invariant if it remains unchanged after a certain iteration, which
means its value in the previous iteration (coming from the latch edge) is just
a copy of itself, nothing else. So, recursion via stmt_semi_invariant_p
is unnecessary.

Loop:
  x_1 = PHI (x_0, x_3);
  x_2 = PHI(x_1, value defined in excluded branch);
  x_3 = x_2;
  goto Loop;
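
To make the term concrete, here is a source-level sketch (not what the pass literally emits) of a loop with a semi-invariant condition and a hand-split equivalent.  The function names and the `budget` variable are invented for this example.  The condition `budget > 0` only has its operand modified inside the true branch, so once it evaluates to false it stays false, and the remaining iterations can run in a second loop without the test.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Original loop: 'budget' is only written inside the true branch.
std::pair<int, int>
run_original (const std::vector<int> &v, int budget)
{
  int count = 0, total = 0;
  for (std::size_t i = 0; i < v.size (); ++i)
    {
      if (budget > 0)      // semi-invariant condition
        {
          budget -= v[i];
          ++count;
        }
      total += v[i];
    }
  return std::make_pair (count, total);
}

// Hand-split version: the first loop runs while the condition holds, the
// second loop handles the remaining iterations with the branch removed.
std::pair<int, int>
run_split (const std::vector<int> &v, int budget)
{
  int count = 0, total = 0;
  std::size_t i = 0;
  for (; i < v.size () && budget > 0; ++i)
    {
      budget -= v[i];
      ++count;
      total += v[i];
    }
  for (; i < v.size (); ++i)
    total += v[i];
  return std::make_pair (count, total);
}
```

Both functions return the same (count, total) pair for any input, since nothing outside the true branch can make `budget > 0` true again.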


>+static bool
>+branch_removable_p (basic_block branch_bb)
>+{
>+  if (single_pred_p (branch_bb))
>+return true;
>
> I'm not sure what this function tests - at least the single_pred_p check
> looks odd to me given the dominator checks later.  The single predecessor
> could simply be a forwarder.  I wonder if you are looking for branches forming
> an irreducible loop?  I think you can then check EDGE_IRREDUCIBLE_LOOP
> or BB_IRREDUCIBLE_LOOP on the condition block (btw, I don't see
> testcases covering the apparent special-cases in the patch - referring to
> existing ones via a comment often helps understanding the code).

This function tests whether a branch is reachable from anywhere other than its
conditional statement. This ensures that when the branch is not selected
upon condition evaluation, the trace path led by the branch will never
be executed, so that it can be excluded during semi-invariantness analysis.

If single_pred_p holds, only the condition statement can reach the branch.

If not, consider a half-diamond control flow graph, with a back edge to the
true branch.

condition
   |  \
   |   \
   |  false branch
   .--->.  |   /
   ||  |  /
 othertrue branch
   ||
   '---<'

If there is an edge from the false branch, the true branch cannot be excluded
even if it is not selected.  And the back edge from "other" (dominated by the
true branch) does not have any impact.


>+
>+  return EDGE_SUCC (cond_bb, (unsigned) invar[1]);
>+}
>
> magic ensures that invar[1] is always the invariant edge?  Oh, it's a bool.
> Ick.  I wonder if logic with int invariant_edge = -1; and the loop setting
> it to either 0 or 1 would be easier to follow...
OK.


> Note your stmt_semi_invariant_p check is exponential for a condition
> like
>
>   _1 = 1;
>   _2 = _1 + _1;
>   _3 = _2 + _2;
>   if (_3 != param_4(D))
>
> because you don't track ops you already proved semi-invariant.  We've
> run into such situation repeatedly in SCEV analysis so I doubt it can be
> disregarded as irrelevant in practice.  A worklist approach could then
> also get rid of the recursion.  You are already computing the stmts
> forming the condition in compute_added_num_insns so another option
> is to re-use that.
OK.
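
A sketch of the "track what was already proved" part, using plain memoization over a toy expression DAG.  Everything here (`node`, `checker`, `invariant_p`, the `visits` counter) is hypothetical and unrelated to the real stmt_semi_invariant_p signature; it keeps the recursion, whereas a worklist would remove that as well.

```cpp
#include <unordered_map>
#include <vector>

// Toy DAG: a node either is a leaf with a known answer or combines operands.
struct node
{
  std::vector<int> ops;   // operand node ids; empty for a leaf
  bool leaf_invariant;    // only meaningful for leaves
};

struct checker
{
  const std::vector<node> &nodes;
  std::unordered_map<int, bool> cache;  // results already computed
  long visits;                          // nodes evaluated without a cache hit

  explicit checker (const std::vector<node> &n) : nodes (n), visits (0) {}

  // A node is invariant if it is an invariant leaf or all operands are.
  bool invariant_p (int id)
  {
    auto it = cache.find (id);
    if (it != cache.end ())
      return it->second;
    ++visits;
    const node &n = nodes[id];
    bool result = n.ops.empty () ? n.leaf_invariant : true;
    for (int op : n.ops)
      if (!invariant_p (op))
        {
          result = false;
          break;
        }
    cache[id] = result;
    return result;
  }
};
```

For a chain like the `_1`, `_2 = _1 + _1`, `_3 = _2 + _2` example extended to N levels, each node is evaluated once, so `visits` grows linearly with N instead of as 2^N.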


> Btw, I wonder if we can simply re-use PARAM_MAX_PEELED_INSNS
> instead of adding yet another param (it also happens to have the same
> size).  Because we are "peeling" the loop.
I'll check that.

>+  edge invar_branch = get_cond_invariant_branch (loop, cond);
>+
>+  if (!invar_branch)
>+return NULL;
>
> extr

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-24 Thread Hongtao Liu

On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak  wrote:
>
> On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu  wrote:
> >
> > Update patch:
> > Add m constraint to define_insn (sse_1_round > *sse_1_round > when under sse4 but not avx512f.
>
> It looks to me that the original insn is incompletely defined. It
> should use nonimmediate_operand, "m" constraint and  pointer
> size modifier. Something like:
>
> (define_insn "sse4_1_round"
>   [(set (match_operand:VF_128 0 "register_operand" "=Yr,*x,x,v")
> (vec_merge:VF_128
>   (unspec:VF_128
> [(match_operand:VF_128 2 "nonimmediate_operand" "Yrm,*xm,xm,vm")
>  (match_operand:SI 3 "const_0_to_15_operand" "n,n,n,n")]
> UNSPEC_ROUND)
>   (match_operand:VF_128 1 "register_operand" "0,0,x,v")
>   (const_int 1)))]
>   "TARGET_SSE4_1"
>   "@
>round\t{%3, %2, %0|%0, %2, %3}
>round\t{%3, %2, %0|%0, %2, %3}
>vround\t{%3, %2, %1, %0|%0, %1, %2, %3}
>vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"
>
> >
> > Changelog:
> > gcc/
> > * config/i386/sse.md:  (sse4_1_round):
> > Change constraint x to xm
> > since vround support memory operand.
> > * (*sse4_1_round): Ditto.
> >
> > Bootstrap and regression test ok.
> >
> > On Wed, Oct 23, 2019 at 9:56 AM Hongtao Liu  wrote:
> > >
> > > Hi uros:
> > >   This patch fixes false dependence of scalar operations
> > > vrcp/vsqrt/vrsqrt/vrndscale.
> > >   Bootstrap ok, regression test on i386/x86 ok.
> > >
> > >   It does something like this:
> > > -
> > > For scalar instructions with both xmm operands:
> > >
> > > op %xmmN,%xmmQ,%xmmQ > op %xmmN, %xmmN, %xmmQ
> > >
> > > for scalar instructions with one mem  or gpr operand:
> > >
> > > op mem/gpr, %xmmQ, %xmmQ
> > >
> > > --->  using pass rpad >
> > >
> > > xorps %xmmN, %xmmN, %xmmN
> > > op mem/gpr, %xmmN, %xmmQ
> > >
> > > Performance influence of SPEC2017 fprate which is tested on SKX
> > >
> > > 503.bwaves_r -0.03%
> > > 507.cactuBSSN_r -0.22%
> > > 508.namd_r -0.02%
> > > 510.parest_r 0.37%
> > > 511.povray_r 0.74%
> > > 519.lbm_r 0.24%
> > > 521.wrf_r 2.35%
> > > 526.blender_r 0.71%
> > > 527.cam4_r 0.65%
> > > 538.imagick_r 0.95%
> > > 544.nab_r -0.37%
> > > 549.fotonik3d_r 0.24%
> > > 554.roms_r 0.90%
> > > fprate geomean 0.50%
> > > -
> > >
> > > Changelog
> > > gcc/
> > > * config/i386/i386.md (*rcpsf2_sse): Add
> > > avx_partial_xmm_update, prefer m constraint for TARGET_AVX.
> > > (*rsqrtsf2_sse): Ditto.
> > > (*sqrt2_sse): Ditto.
> > > (sse4_1_round2): separate constraint vm, add
> > > avx_partial_xmm_update, prefer m constraint for TARGET_AVX.
> > > * config/i386/sse.md (*sse_vmrcpv4sf2): New define_insn used
> > > by pass rpad.
> > > (*_vmsqrt2*):
> > > Ditto.
> > > (*sse_vmrsqrtv4sf2): Ditto.
> > > (*avx512f_rndscale): Ditto.
> > > (*sse4_1_round): Ditto.
> > >
> > > gcc/testsuite
> > > * gcc.target/i386/pr87007-4.c: New test.
> > > * gcc.target/i386/pr87007-5.c: Ditto.
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
> (set (attr "preferred_for_speed")
>   (cond [(eq_attr "alternative" "1")
>(symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> (eq_attr "alternative" "2")
> -  (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> +  (symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> ]
> (symbol_ref "true")))])
>
> This can be written as:
>
> (set (attr "preferred_for_speed")
>   (cond [(match_test "TARGET_AVX")
>(symbol_ref "true")
> (eq_attr "alternative" "1,2")
>   (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> ]
> (symbol_ref "true")))])
>
> Uros.
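
As a sanity check on the suggested rewrite, the two `cond` forms can be modelled as plain boolean functions and compared exhaustively.  This is only a model under assumptions: the function names are invented, `alt` stands for the alternative number (0..3 is assumed here), and `partial_dep` stands for TARGET_SSE_PARTIAL_REG_DEPENDENCY.

```cpp
// Form in the patch: alternatives 1 and 2 both test
// TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY; everything else is true.
bool
preferred_patch (int alt, bool avx, bool partial_dep)
{
  if (alt == 1)
    return avx || !partial_dep;
  if (alt == 2)
    return avx || !partial_dep;
  return true;
}

// Suggested form: test TARGET_AVX first, then handle alternatives 1,2.
bool
preferred_suggested (int alt, bool avx, bool partial_dep)
{
  if (avx)
    return true;
  if (alt == 1 || alt == 2)
    return !partial_dep;
  return true;
}

// Compares both forms over every combination of inputs.
bool
forms_agree ()
{
  for (int alt = 0; alt < 4; ++alt)
    for (int avx = 0; avx < 2; ++avx)
      for (int dep = 0; dep < 2; ++dep)
        if (preferred_patch (alt, avx != 0, dep != 0)
            != preferred_suggested (alt, avx != 0, dep != 0))
          return false;
  return true;
}
```

In this model `forms_agree` returns true, i.e. the shorter form selects the same alternatives as the patched one.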

Yes. After these are fixed, I'll commit to trunk, OK?
-- 
BR,
Hongtao

[libstdc++,doc] doc/xml/gnu/gpl-3.0.xml: Switch www.gnu.org to https.

2019-10-24 Thread Gerald Pfeifer

Committed.

Gerald

2019-10-25  Gerald Pfeifer  

* doc/xml/gnu/gpl-3.0.xml: Switch www.gnu.org to https.

Index: libstdc++-v3/doc/xml/gnu/gpl-3.0.xml
===
--- libstdc++-v3/doc/xml/gnu/gpl-3.0.xml(revision 277213)
+++ libstdc++-v3/doc/xml/gnu/gpl-3.0.xml(working copy)
@@ -829,6 +829,6 @@ under certain conditions; type ‘show c<
 subroutine library, you may consider it more useful to permit linking
 proprietary applications with the library.  If this is what you want to do,
 use the GNU Lesser General Public License instead of 
this
-License.  But first, please read http://www.w3.org/1999/xlink"; 
xlink:href="http://www.gnu.org/philosophy/why-not-lgpl.html";>http://www.gnu.org/philosophy/why-not-lgpl.html.
+License.  But first, please read http://www.w3.org/1999/xlink"; 
xlink:href="https://www.gnu.org/philosophy/why-not-lgpl.html";>https://www.gnu.org/philosophy/why-not-lgpl.html.

Re: r272976 - in /trunk/gcc/ada: ChangeLog ali.adb ...

2019-10-24 Thread Gerald Pfeifer

On Tue, 10 Sep 2019, Arnaud Charlet wrote:
> All right, there are already similar kludges elsewhere, so I've applied the
> following patch which fixes it:
> 
> 2019-09-10  Arnaud Charlet  
> 
>   * doc/install.texi: Fix syntax for html generation.
> 
> Index: doc/install.texi
> ===
> --- doc/install.texi(revision 275400)
> +++ doc/install.texi(working copy)
> @@ -2727,7 +2727,12 @@
> 
>  @section Building the Ada compiler
> 
> -See @ref{GNAT-prerequisite}.
> +@ifnothtml
> +@ref{GNAT-prerequisite}.
> +@end ifnothtml
> +@ifhtml
> +@uref{GNAT-prerequisite}.
> +@end ifhtml

Hmm, I'm afraid this does not work as intended.

https://gcc.gnu.org/install/build.html now links to
https://gcc.gnu.org/install/GNAT-prerequisite which simply does not
exist.

Mind looking into this?

Thanks,
Gerald

[libstdc++,doc] doc/xml/manual/policy_data_structures_biblio.xml - pubs.opengroup.org goes https

2019-10-24 Thread Gerald Pfeifer

Committed.

2019-10-25  Gerald Pfeifer  
 
* doc/xml/manual/policy_data_structures_biblio.xml: Switch
pubs.opengroup.org to https.

Index: doc/xml/manual/policy_data_structures_biblio.xml
===
--- doc/xml/manual/policy_data_structures_biblio.xml(revision 277435)
+++ doc/xml/manual/policy_data_structures_biblio.xml(working copy)
@@ -1232,7 +1232,7 @@
 
   
http://www.w3.org/1999/xlink";
- 
xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/select.html";>
+ 
xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/select.html";>
  select

Re: [PATCH target/89071] Fix false dependence of scalar operations vrcp/vsqrt/vrsqrt/vrndscale

2019-10-24 Thread Hongtao Liu

On Fri, Oct 25, 2019 at 1:23 PM Hongtao Liu  wrote:
>
> On Fri, Oct 25, 2019 at 2:39 AM Uros Bizjak  wrote:
> >
> > On Wed, Oct 23, 2019 at 7:48 AM Hongtao Liu  wrote:
> > >
> > > Update patch:
> > > Add m constraint to define_insn (sse_1_round
> > > *sse_1_round
> > > when under sse4 but not avx512f.
> >
> > It looks to me that the original insn is incompletely defined. It
> > should use nonimmediate_operand, "m" constraint and  pointer
> > size modifier. Something like:
> >
> > (define_insn "sse4_1_round"
> >   [(set (match_operand:VF_128 0 "register_operand" "=Yr,*x,x,v")
> > (vec_merge:VF_128
> >   (unspec:VF_128
> > [(match_operand:VF_128 2 "nonimmediate_operand" "Yrm,*xm,xm,vm")
> >  (match_operand:SI 3 "const_0_to_15_operand" "n,n,n,n")]
> > UNSPEC_ROUND)
> >   (match_operand:VF_128 1 "register_operand" "0,0,x,v")
> >   (const_int 1)))]
> >   "TARGET_SSE4_1"
> >   "@
> >round\t{%3, %2, %0|%0, %2, %3}
> >round\t{%3, %2, %0|%0, %2, %3}
> >vround\t{%3, %2, %1, %0|%0, %1, %2, %3}
> >vrndscale\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> >
> > >
> > > Changelog:
> > > gcc/
> > > * config/i386/sse.md (sse4_1_round):
> > > Change constraint x to xm
> > > since vround supports a memory operand.
> > > (*sse4_1_round): Ditto.
> > >
> > > Bootstrap and regression test ok.
> > >
> > > On Wed, Oct 23, 2019 at 9:56 AM Hongtao Liu  wrote:
> > > >
> > > > Hi Uros:
> > > >   This patch fixes false dependence of scalar operations
> > > > vrcp/vsqrt/vrsqrt/vrndscale.
> > > >   Bootstrap ok, regression test on i386/x86 ok.
> > > >
> > > >   It does something like this:
> > > > -
> > > > For scalar instructions with both xmm operands:
> > > >
> > > > op %xmmN,%xmmQ,%xmmQ ---> op %xmmN, %xmmN, %xmmQ
> > > >
> > > > For scalar instructions with one mem or gpr operand:
> > > >
> > > > op mem/gpr, %xmmQ, %xmmQ
> > > >
> > > > ---> using pass rpad --->
> > > >
> > > > xorps %xmmN, %xmmN, %xmmN
> > > > op mem/gpr, %xmmN, %xmmQ
> > > >
> > > > Performance influence of SPEC2017 fprate which is tested on SKX
> > > >
> > > > 503.bwaves_r -0.03%
> > > > 507.cactuBSSN_r -0.22%
> > > > 508.namd_r -0.02%
> > > > 510.parest_r 0.37%
> > > > 511.povray_r 0.74%
> > > > 519.lbm_r 0.24%
> > > > 521.wrf_r 2.35%
> > > > 526.blender_r 0.71%
> > > > 527.cam4_r 0.65%
> > > > 538.imagick_r 0.95%
> > > > 544.nab_r -0.37%
> > > > 549.fotonik3d_r 0.24%
> > > > 554.roms_r 0.90%
> > > > fprate geomean 0.50%
> > > > -
> > > >
> > > > Changelog
> > > > gcc/
> > > > * config/i386/i386.md (*rcpsf2_sse): Add
> > > > avx_partial_xmm_update, prefer m constraint for TARGET_AVX.
> > > > (*rsqrtsf2_sse): Ditto.
> > > > (*sqrt2_sse): Ditto.
> > > > (sse4_1_round2): Separate constraint vm, add
> > > > avx_partial_xmm_update, prefer m constraint for TARGET_AVX.
> > > > * config/i386/sse.md (*sse_vmrcpv4sf2): New define_insn used
> > > > by pass rpad.
> > > > (*_vmsqrt2*):
> > > > Ditto.
> > > > (*sse_vmrsqrtv4sf2): Ditto.
> > > > (*avx512f_rndscale): Ditto.
> > > > (*sse4_1_round): Ditto.
> > > >
> > > > gcc/testsuite
> > > > * gcc.target/i386/pr87007-4.c: New test.
> > > > * gcc.target/i386/pr87007-5.c: Ditto.
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> >
> > (set (attr "preferred_for_speed")
> >   (cond [(eq_attr "alternative" "1")
> >(symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> > (eq_attr "alternative" "2")
> > -  (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> > +  (symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> > ]
> > (symbol_ref "true")))])
> >
> > This can be written as:
> >
> > (set (attr "preferred_for_speed")
> >   (cond [(match_test "TARGET_AVX")
> >(symbol_ref "true")
> > (eq_attr "alternative" "1,2")
> >   (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
> > ]
> > (symbol_ref "true")))])
> >
> > Uros.
>
> Yes. After these are fixed, I'll upstream to trunk, OK?
Updated patch.
> --
> BR,
> Hongtao



-- 
BR,
Hongtao
From 1892f7b52ea0c5b59d3d0c9e50330f70712fc9cc Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Wed, 9 Oct 2019 11:21:25 +0800
Subject: [PATCH] Fix false dependence of scalar operation
 vrcp/vsqrt/vrsqrt/vrndscale

For instructions with xmm operands:

op %xmmN,%xmmQ,%xmmQ ---> op %xmmN, %xmmN, %xmmQ

For instructions with a mem operand or gpr operand:

op mem/gpr, %xmmQ, %xmmQ

---> using pass rpad --->

xorps %xmmN, %xmmN, %xmmN
op mem/gpr, %xmmN, %xmmQ

Performance influence of SPEC2017 fprate which is tested on SKX

503.bwaves_r	-0.03%
507.cactuBSSN_r -0.22%
508.namd_r	-0.02%
510.parest_r	0.37%
511.povray_r	0.74%
519.lbm_r	0.24%
521.wrf_r	2.35%
526.blender_r	0.71%
527.cam4_r	0.65%
538.imagick_r	0.95%
544.nab_r	-0.37%
549.fotonik3d_r 0.24%
554.roms_r	0.90%
fprate geomean	0.50%
-
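
The commit message above relies on the merge behaviour of three-operand
scalar instructions: only the low element is computed from the last source,
while the upper elements are copied from the other source register, so the
result depends on whatever that register last held. The sketch below is a
plain C model of that behaviour, not compiler or intrinsic code. The type
xmm_model and both function names are hypothetical, and an exact reciprocal
stands in for the operation (the real rcp instructions return an
approximation).

```c
/* A 128-bit register modelled as four floats; lane 0 is the scalar lane.  */
typedef struct
{
  float lane[4];
} xmm_model;

/* Model of "op src2, src1, dst" in the operand order used above:
   lane 0 of the result is computed from src2, lanes 1..3 are copied
   from src1.  When src1 is the old destination register, the result
   carries a dependence on its previous contents.  */
xmm_model
scalar_rcp_merge (xmm_model src1, xmm_model src2)
{
  xmm_model dst = src1;
  dst.lane[0] = 1.0f / src2.lane[0];
  return dst;
}

/* Model of the memory-operand case after the transformation: src1 is a
   freshly zeroed register (the xorps), so the result no longer depends
   on the old destination value.  */
xmm_model
scalar_rcp_from_mem (float mem)
{
  xmm_model zero = { { 0.0f, 0.0f, 0.0f, 0.0f } };
  xmm_model src2 = { { mem, 0.0f, 0.0f, 0.0f } };
  return scalar_rcp_merge (zero, src2);
}
```

In this model, passing the old destination as src1 reproduces the "op
mem/gpr, %xmmQ, %xmmQ" form, while scalar_rcp_from_mem corresponds to the
xorps sequence, where the upper lanes come from a register with no earlier
producer.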

[wwwdocs] readings.html - http://www.idris.fr/data/publications/F95/test_F95_english.html is gone

2019-10-24 Thread Gerald Pfeifer

I looked for a replacement, and there does not appear to be one, so I
removed the link.

Committed.

Gerald

- Log -
commit 61592c09663a83809c5115cb7dfddeb3bd606418
Author: Gerald Pfeifer 
Date:   Fri Oct 25 07:55:49 2019 +0200

http://www.idris.fr/data/publications/F95/test_F95_english.html is gone.

diff --git a/htdocs/readings.html b/htdocs/readings.html
index 203b590..5c30391 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -435,12 +435,6 @@ names.
 contains legal and operational Fortran 77 code.
   
   
-IDRIS
-http://www.idris.fr/data/publications/F95/test_F95_english.html";>
-Low level bench tests of Fortran 95. It tests some Fortran 95
-intrinsics.
-  
-  
 The g77 testsuite (which is part of GCC).
