Re: [PATCH] [RFC] Higher-level reporting of vectorization problems

2018-07-03 Thread Richard Biener
On Mon, 2 Jul 2018, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Fri, 22 Jun 2018, David Malcolm wrote:
> >
> >> NightStrike and I were chatting on IRC last week about
> >> issues with trying to vectorize the following code:
> >> 
> >> #include <vector>
> >> std::size_t f(std::vector> const & v) {
> >>   std::size_t ret = 0;
> >>   for (auto const & w: v)
> >>     ret += w.size();
> >>   return ret;
> >> }
> >> 
> >> icc could vectorize it, but gcc couldn't, but neither of us could
> >> immediately figure out what the problem was.
> >> 
> >> Using -fopt-info leads to a wall of text.
> >> 
> >> I tried using my patch here:
> >> 
> >>  "[PATCH] v3 of optinfo, remarks and optimization records"
> >>   https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01267.html
> >> 
> >> It improved things somewhat, by showing:
> >> (a) the nesting structure via indentation, and
> >> (b) the GCC line at which each message is emitted (by using the
> >> "remark" output)
> >> 
> >> but it's still a wall of text:
> >> 
> >>   https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.remarks.html
> >>   
> >> https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.d/..%7C..%7Csrc%7Ctest.cc.html#line-4
> >> 
> >> It doesn't yet provide a simple high-level message to a
> >> tech-savvy user on what they need to do to get GCC to
> >> vectorize their loop.
> >
> > Yeah, in particular the vectorizer is way too noisy in its low-level
> > functions.  IIRC -fopt-info-vec-missed is "somewhat" better:
> >
> > t.C:4:26: note: step unknown.
> > t.C:4:26: note: vector alignment may not be reachable
> > t.C:4:26: note: not ssa-name.
> > t.C:4:26: note: use not simple.
> > t.C:4:26: note: not ssa-name.
> > t.C:4:26: note: use not simple.
> > t.C:4:26: note: no array mode for V2DI[3]
> > t.C:4:26: note: Data access with gaps requires scalar epilogue loop
> > t.C:4:26: note: can't use a fully-masked loop because the target doesn't 
> > have the appropriate masked load or store.
> > t.C:4:26: note: not ssa-name.
> > t.C:4:26: note: use not simple.
> > t.C:4:26: note: not ssa-name.
> > t.C:4:26: note: use not simple.
> > t.C:4:26: note: no array mode for V2DI[3]
> > t.C:4:26: note: Data access with gaps requires scalar epilogue loop
> > t.C:4:26: note: op not supported by target.
> > t.C:4:26: note: not vectorized: relevant stmt not supported: _15 = _14 
> > /[ex] 4;
> > t.C:4:26: note: bad operation or unsupported loop bound.
> > t.C:4:26: note: not vectorized: no grouped stores in basic block.
> > t.C:4:26: note: not vectorized: no grouped stores in basic block.
> > t.C:6:12: note: not vectorized: not enough data-refs in basic block.
> >
> >
> >> The pertinent dump messages are:
> >> 
> >> test.cc:4:23: remark: === try_vectorize_loop_1 === 
> >> [../../src/gcc/tree-vectorizer.c:674:try_vectorize_loop_1]
> >> cc1plus: remark:
> >> Analyzing loop at test.cc:4 
> >> [../../src/gcc/dumpfile.c:735:ensure_pending_optinfo]
> >> test.cc:4:23: remark:  === analyze_loop_nest === 
> >> [../../src/gcc/tree-vect-loop.c:2299:vect_analyze_loop]
> >> [...snip...]
> >> test.cc:4:23: remark:   === vect_analyze_loop_operations === 
> >> [../../src/gcc/tree-vect-loop.c:1520:vect_analyze_loop_operations]
> >> [...snip...]
> >> test.cc:4:23: remark:==> examining statement: ‘_15 = _14 /[ex] 4;’ 
> >> [../../src/gcc/tree-vect-stmts.c:9382:vect_analyze_stmt]
> >> test.cc:4:23: remark:vect_is_simple_use: operand ‘_14’ 
> >> [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
> >> test.cc:4:23: remark:def_stmt: ‘_14 = _8 - _7;’ 
> >> [../../src/gcc/tree-vect-stmts.c:10098:vect_is_simple_use]
> >> test.cc:4:23: remark:type of def: internal 
> >> [../../src/gcc/tree-vect-stmts.c:10112:vect_is_simple_use]
> >> test.cc:4:23: remark:vect_is_simple_use: operand ‘4’ 
> >> [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
> >> test.cc:4:23: remark:op not supported by target. 
> >> [../../src/gcc/tree-vect-stmts.c:5932:vectorizable_operation]
> >> test.cc:4:23: remark:not vectorized: relevant stmt not supported: ‘_15 
> >> = _14 /[ex] 4;’ [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
> >> test.cc:4:23: remark:   bad operation or unsupported loop bound. 
> >> [../../src/gcc/tree-vect-loop.c:2043:vect_analyze_loop_2]
> >> cc1plus: remark: vectorized 0 loops in function. 
> >> [../../src/gcc/tree-vectorizer.c:904:vectorize_loops]
> >> 
> >> In particular, that complaint from
> >>   [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
> >> is coming from:
> >> 
> >>   if (!ok)
> >>     {
> >>       if (dump_enabled_p ())
> >>         {
> >>           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> >>                            "not vectorized: relevant stmt not ");
> >>           dump_printf (MSG_MISSED_OPTIMIZATION, "supported: ");
> >>           dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
> >>         }
> >> 
> >>       return false;
> >>     }
> >> 
> >> This got me thinking: the u

[C++ PATCH] Fix extern_decl_map handling (PR c++/3698, PR c++/86208)

2018-07-03 Thread Jakub Jelinek
Hi!

This testcase got fixed in G++ 3.2, where the decision whether to keep an inline
function body was based on TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (...)),
but we now use cgraph and the testcase fails again.  The particular problem
is that we set TREE_USED only on the local extern decl and don't propagate
it to the decl that extern_decl_map maps it to.

Fixed by the following patch, which propagates it at genericization time.
Bootstrapped/regtested on x86_64-linux and i686-linux.
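
The propagation described above can be sketched outside of GCC roughly as
follows.  This is a toy model only: Decl, ExternDeclMap and remap_reference
are hypothetical names invented for illustration, with "used" standing in
for TREE_USED and the map standing in for extern_decl_map.

```cpp
#include <cassert>
#include <unordered_map>

// Toy declaration; "used" plays the role of TREE_USED.
struct Decl {
  unsigned uid;
  bool used;
};

// Hypothetical analogue of extern_decl_map: uid of a local extern
// declaration -> the declaration it actually refers to.
using ExternDeclMap = std::unordered_map<unsigned, Decl *>;

// Replace a reference to a local extern with its mapped declaration,
// OR-ing the "used" flag into the target so the information is not lost.
Decl *remap_reference(const ExternDeclMap &map, Decl *ref) {
  auto it = map.find(ref->uid);
  if (it == map.end())
    return ref;                    // Not a local extern: leave it alone.
  it->second->used |= ref->used;   // Propagate the flag to the real decl.
  return it->second;
}
```

Without the OR step the mapped declaration would keep used == false even
though the local extern was referenced, which is the situation the
testcase runs into.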

Another option is:
+  /* If decl is local extern, recurse into corresponding decl.  */
+  if (cfun
+  && cp_function_chain
+  && cp_function_chain->extern_decl_map
+  && VAR_OR_FUNCTION_DECL_P (decl)
+  && DECL_EXTERNAL (decl))
+{
+  struct cxx_int_tree_map *h, in;
+  in.uid = DECL_UID (decl);
+  h = cp_function_chain->extern_decl_map->find_with_hash (&in, in.uid);
+  if (h)
+   TREE_USED (h->to) = 1;
+}
+
in mark_used, another one:
+  /* If decl is local extern, recurse into corresponding decl.  */
+  if (cfun
+  && cp_function_chain
+  && cp_function_chain->extern_decl_map
+  && VAR_OR_FUNCTION_DECL_P (decl)
+  && DECL_EXTERNAL (decl))
+{
+  struct cxx_int_tree_map *h, in;
+  in.uid = DECL_UID (decl);
+  h = cp_function_chain->extern_decl_map->find_with_hash (&in, in.uid);
+  if (h && !mark_used (h->to))
+   return false;
+}
+
in the same spot.
None of these fixes PR82204, though.

2018-07-02  Jakub Jelinek  

PR c++/3698
PR c++/86208
* cp-gimplify.c (cp_genericize_r): When using extern_decl_map, or
in TREE_USED flag from stmt to h->to.

* g++.dg/opt/pr3698.C: New test.

--- gcc/cp/cp-gimplify.c.jj 2018-06-20 08:15:28.980857357 +0200
+++ gcc/cp/cp-gimplify.c2018-07-02 18:03:00.714313555 +0200
@@ -1085,6 +1085,7 @@ cp_genericize_r (tree *stmt_p, int *walk
   if (h)
{
  *stmt_p = h->to;
+ TREE_USED (h->to) |= TREE_USED (stmt);
  *walk_subtrees = 0;
  return NULL;
}
--- gcc/testsuite/g++.dg/opt/pr3698.C.jj2018-07-02 18:05:52.535479087 
+0200
+++ gcc/testsuite/g++.dg/opt/pr3698.C   2018-07-02 18:05:44.507471531 +0200
@@ -0,0 +1,21 @@
+// PR c++/3698
+// { dg-do link }
+// { dg-options "-O0" }
+
+struct X {
+  int i;
+};
+
+inline const int&
+OHashKey (const X& x)
+{
+  return x.i;
+}
+
+int
+main ()
+{
+ extern const int& OHashKey (const X& x);
+ X x;
+ return OHashKey (x);
+}

Jakub


Re: [PATCH] -fopt-info: add indentation via DUMP_VECT_SCOPE

2018-07-03 Thread Richard Biener
On Mon, Jul 2, 2018 at 7:00 PM David Malcolm  wrote:
>
> On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > >
> > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm 
> > > wrote:
> > > >
> > > > This patch adds a concept of nested "scopes" to dumpfile.c's
> > > > dump_*_loc
> > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in tree-
> > > > vectorizer.h,
> > > > so that the nested structure is shown in -fopt-info by
> > > > indentation.
> > > >
> > > > > For example, this converts -fopt-info-all output from:
> > > >
> > > > test.c:8:3: note: === analyzing loop ===
> > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > test.c:8:3: note: === get_loop_niters ===
> > > > test.c:8:3: note: symbolic number of iterations is (unsigned int)
> > > > n_9(D)
> > > > test.c:8:3: note: not vectorized: loop contains function calls or
> > > > data references that cannot be analyzed
> > > > test.c:8:3: note: vectorized 0 loops in function
> > > >
> > > > to:
> > > >
> > > > test.c:8:3: note: === analyzing loop ===
> > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > test.c:8:3: note:   symbolic number of iterations is (unsigned
> > > > int) n_9(D)
> > > > test.c:8:3: note:   not vectorized: loop contains function calls
> > > > or data references that cannot be analyzed
> > > > test.c:8:3: note: vectorized 0 loops in function
> > > >
> > > > showing that the "symbolic number of iterations" message is
> > > > within
> > > > the "=== analyze_loop_nest ===" (and not within the
> > > > "=== vect_analyze_loop_form ===").
> > > >
> > > > This is also enabling work for followups involving optimization
> > > > records
> > > > (allowing the records to directly capture the nested structure of
> > > > the
> > > > dump messages).
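
A minimal sketch of the nested-scope idea, assuming nothing about the real
dumpfile.c interface: Dumper, Dumper::Scope and demo are hypothetical names,
and the only point is that an RAII scope object can raise and lower an
indentation depth that every message honours.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the dump machinery; not GCC's dumpfile.c API.
class Dumper {
public:
  // Record one message, indented by one space per open scope.
  void note(const std::string &msg) {
    lines_.push_back("note: " + std::string(depth_, ' ') + msg);
  }
  const std::vector<std::string> &lines() const { return lines_; }

  // RAII scope: emits a "=== name ===" header, then indents everything
  // emitted until the scope object is destroyed.
  class Scope {
  public:
    Scope(Dumper &d, const std::string &name) : d_(d) {
      d_.note("=== " + name + " ===");
      ++d_.depth_;
    }
    ~Scope() { --d_.depth_; }
    Scope(const Scope &) = delete;
    Scope &operator=(const Scope &) = delete;
  private:
    Dumper &d_;
  };

private:
  unsigned depth_ = 0;
  std::vector<std::string> lines_;
};

// Mimic the shape of the example above: nested scopes, then a message
// emitted after the two innermost scopes have closed.
std::vector<std::string> demo() {
  Dumper d;
  {
    Dumper::Scope s1(d, "analyzing loop");
    Dumper::Scope s2(d, "analyze_loop_nest");
    {
      Dumper::Scope s3(d, "vect_analyze_loop_form");
      Dumper::Scope s4(d, "get_loop_niters");
    }
    d.note("symbolic number of iterations is n");
  }
  d.note("vectorized 0 loops in function");
  return d.lines();
}
```

In this sketch the "symbolic number of iterations" message ends up at the
same depth as the "=== vect_analyze_loop_form ===" header, i.e. inside
analyze_loop_nest but outside vect_analyze_loop_form.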
> > > >
> > > > > Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
> > > >
> > > > OK for trunk?
> >
> > Hi,
> >
> > I've noticed that this patch (r262246) caused regressions on aarch64:
> > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "note: Built SLP cancelled: can use load/store-lanes"
> > gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
> > cancelled: can use load/store-lanes"
> >
> > The problem is that there are now more spaces between "note:" and
> > "Built"; the attached small patch adjusts slp-perm-1.c accordingly.
>
> Sorry about the breakage.
>
> > Is it the right way of fixing it or do we want to accept any amount
> > of
> > spaces for instance?
>
> I don't think we want to hardcode the amount of space in the dumpfile.
> The idea of my patch was to make the dump more human-readable (I hope)
> by visualizing the nesting structure of the dump messages, but I think
> we shouldn't "bake" that into the expected strings, as someone might
> want to add an intermediate nesting level.
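
To illustrate why an expected string that includes the "note: " prefix is
fragile once indentation is added, here is a small sketch.  The helper
dump_line_matches is hypothetical, and plain substring search stands in for
the regular-expression matching that scan-tree-dump performs.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if LINE contains PATTERN as a plain substring.
// (scan-tree-dump matches regular expressions; substring search is enough
// to show the effect of the extra indentation.)
bool dump_line_matches(const std::string &line, const std::string &pattern) {
  return line.find(pattern) != std::string::npos;
}
```

With a line such as "t.c:1:1: note:   Built SLP cancelled: ...", a pattern
starting with "note: Built" no longer matches, whereas a pattern starting
at "Built" matches regardless of how many nesting levels precede it.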
>
> Do we really need to look for the "note:" in the scan-tree-dump?
> Should that directive be rewritten to:
>
> -/* { dg-final { scan-tree-dump "note: Built SLP cancelled: can use 
> load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } } } } 
> */
> +/* { dg-final { scan-tree-dump "Built SLP cancelled: can use 
> load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } } } } 
> */
>
> which I believe would match any amount of spa

[C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Jakub Jelinek
Hi!

While working on OpenMP range-for support, I ran into the weird thing that
the C++ FE accepts
  for (auto &i : a)
if (i != *__for_begin || &i == __for_end || &__for_range[0] != &a[0])
  __builtin_abort ();
outside of templates, but doesn't inside of templates.
I think we shouldn't let people do this at any time; it will just lead to
non-portable code, and when it works only outside of templates, it isn't a
well-defined extension.

Now, we could just use create_tmp_var_name ("__for_range") instead of
get_identifier ("__for_range") etc., and the symbols would usually be
inaccessible to users (if the assembler supports neither dots nor dollars in
labels, the symbol still consists of just alphanumeric chars, but it at least
carries a hard-to-predict number suffix), or just use a NULL DECL_NAME.
But my understanding is that the intent was that in the debugger one could
use __for_range, __for_begin and __for_end to make it easier to debug
range fors.  In that case, the following patch solves it by using a name
not accessible for the user when parsing the body (with space in it) and
correcting the names when the var gets out of scope.
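
The renaming relies on a packed string literal indexed in 12-byte strides,
as in the finish_for_stmt hunk of the patch.  A standalone sketch of that
indexing follows; nth_name and visible_name_for are hypothetical helpers
written for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <string>

// The first two names are each 11 characters plus a NUL terminator, so
// consecutive names start 12 bytes apart in the packed literal.
static const char hidden_names[] = "__for range\0__for begin\0__for end";
static const char visible_names[] = "__for_range\0__for_begin\0__for_end";

// Return the I-th name (0 = range, 1 = begin, 2 = end) from a packed literal.
const char *nth_name(const char *packed, int i) {
  assert(i >= 0 && i < 3);
  return packed + 12 * i;
}

// Hypothetical helper mirroring the rename step: map a hidden name (with a
// space) to the debugger-visible one (with an underscore), or return the
// input unchanged if it is not one of the three range-for temporaries.
std::string visible_name_for(const std::string &name) {
  for (int i = 0; i < 3; ++i)
    if (name == nth_name(hidden_names, i))
      return nth_name(visible_names, i);
  return name;
}
```

Because an identifier written in source code can never contain a space,
the hidden spellings cannot be named by code in the loop body, while the
underscore spellings remain available afterwards for the debugger.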

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-07-02  Jakub Jelinek  

PR c++/85515
* parser.c (build_range_temp): Use "__for range" instead of
"__for_range".
(cp_convert_range_for): Use "__for begin" and "__for end" instead of
"__for_begin" and "__for_end".
* semantics.c (finish_for_stmt): Rename "__for {range,begin,end}"
local symbols to "__for_{range,begin,end}".

* g++.dg/pr85515-2.C: Add expected dg-error.
* g++.dg/cpp0x/range-for36.C: New test.

--- gcc/cp/parser.c.jj  2018-07-02 23:04:00.869370436 +0200
+++ gcc/cp/parser.c 2018-07-02 23:08:44.572618486 +0200
@@ -11953,7 +11953,7 @@ build_range_temp (tree range_expr)
 
   /* Create the __range variable.  */
   range_temp = build_decl (input_location, VAR_DECL,
-  get_identifier ("__for_range"), range_type);
+  get_identifier ("__for range"), range_type);
   TREE_USED (range_temp) = 1;
   DECL_ARTIFICIAL (range_temp) = 1;
 
@@ -12061,7 +12061,7 @@ cp_convert_range_for (tree statement, tr
 
   /* The new for initialization statement.  */
   begin = build_decl (input_location, VAR_DECL,
- get_identifier ("__for_begin"), iter_type);
+ get_identifier ("__for begin"), iter_type);
   TREE_USED (begin) = 1;
   DECL_ARTIFICIAL (begin) = 1;
   pushdecl (begin);
@@ -12072,7 +12072,7 @@ cp_convert_range_for (tree statement, tr
   if (cxx_dialect >= cxx17)
 iter_type = cv_unqualified (TREE_TYPE (end_expr));
   end = build_decl (input_location, VAR_DECL,
-   get_identifier ("__for_end"), iter_type);
+   get_identifier ("__for end"), iter_type);
   TREE_USED (end) = 1;
   DECL_ARTIFICIAL (end) = 1;
   pushdecl (end);
--- gcc/cp/semantics.c.jj   2018-06-25 14:51:23.096989196 +0200
+++ gcc/cp/semantics.c  2018-07-02 23:23:40.784400542 +0200
@@ -1060,7 +1060,31 @@ finish_for_stmt (tree for_stmt)
 : &FOR_SCOPE (for_stmt));
   tree scope = *scope_ptr;
   *scope_ptr = NULL;
+
+  /* During parsing of the body, range for uses "__for {range,begin,end}"
+ decl names to make those unaccessible by code in the body.
+ Change it to ones with underscore instead of space, so that it can
+ be inspected in the debugger.  */
+  tree range_for_decl[3] = { NULL_TREE, NULL_TREE, NULL_TREE };
+  for (int i = 0; i < 3; i++)
+{
+  tree id
+   = get_identifier ("__for range\0__for begin\0__for end" + 12 * i);
+  if (IDENTIFIER_BINDING (id)
+ && IDENTIFIER_BINDING (id)->scope == current_binding_level)
+   {
+ range_for_decl[i] = IDENTIFIER_BINDING (id)->value;
+ gcc_assert (VAR_P (range_for_decl[i])
+ && DECL_ARTIFICIAL (range_for_decl[i]));
+   }
+}
+
   add_stmt (do_poplevel (scope));
+
+  for (int i = 0; i < 3; i++)
+if (range_for_decl[i])
+  DECL_NAME (range_for_decl[i])
+   = get_identifier ("__for_range\0__for_begin\0__for_end" + 12 * i);
 }
 
 /* Begin a range-for-statement.  Returns a new RANGE_FOR_STMT.
--- gcc/testsuite/g++.dg/pr85515-2.C.jj 2018-04-27 22:28:02.889462532 +0200
+++ gcc/testsuite/g++.dg/pr85515-2.C2018-07-02 23:33:50.638930364 +0200
@@ -15,8 +15,7 @@ int test_2 ()
   int sum = 0;
   for (const auto v: arr) {
 sum += v;
-// TODO: should we issue an error for the following assignment?
-__for_begin = __for_end;
+__for_begin = __for_end;   // { dg-error "was not declared in this scope" }
   }
   return sum;
 }
--- gcc/testsuite/g++.dg/cpp0x/range-for36.C.jj 2018-07-02 23:35:12.796000382 
+0200
+++ gcc/testsuite/g++.dg/cpp0x/range-for36.C2018-07-02 23:35:04.018992900 
+0200
@@ -0,0 +1,32 @@
+// PR c++/85515
+// { dg-do compile { target c++11 } }
+
+int a[10];
+
+void
+foo ()
+{

Avoid matching the same pattern statement twice

2018-07-03 Thread Richard Sandiford
r262275 allowed pattern matching on pattern statements.  Testing for
SVE on more benchmarks showed a case where this interacted badly
with 14/n.

The new over-widening detection could narrow a COND_EXPR A to another
COND_EXPR B, which mixed_size_cond could then match.  This was working
as expected.  However, we left B (now dead) in the pattern definition
sequence with a non-null PATTERN_DEF_SEQ.  mask_conversion also
matched B, and unlike most recognisers, didn't clear PATTERN_DEF_SEQ
before adding statements to it.  This meant that the statements
created by mixed_size_cond appeared in two supposedly separate
sequences, causing much confusion.

This patch removes pattern statements that are replaced by further
pattern statements.  As a belt-and-braces fix, it also nullifies
PATTERN_DEF_SEQ on failure, in the same way Richard B. did recently
for RELATED_STMT.
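
The belt-and-braces half of the fix can be illustrated with a toy model.
Everything here is hypothetical (StmtInfo, recognise and try_recognisers
are not GCC names); it only shows why clearing a half-formed sequence on
failure keeps a later recogniser from inheriting stale statements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for per-statement vectorizer info; not GCC's stmt_vec_info.
struct StmtInfo {
  std::vector<std::string> pattern_def_seq;
};

// A hypothetical recogniser that appends a helper statement before it
// knows whether the match will succeed, and does not clear on entry.
bool recognise(StmtInfo &info, const std::string &helper, bool matches) {
  info.pattern_def_seq.push_back(helper);
  return matches;
}

// Driver: run a failing recogniser followed by a successful one.  With
// CLEAR_ON_FAILURE set, whatever the failed attempt left behind is
// dropped before the next recogniser runs.
bool try_recognisers(StmtInfo &info, bool clear_on_failure) {
  if (recognise(info, "helper_from_first", /*matches=*/false))
    return true;
  if (clear_on_failure)
    info.pattern_def_seq.clear();
  return recognise(info, "helper_from_second", /*matches=*/true);
}
```

Without the clearing step the second recogniser's sequence also contains
the first one's leftover helper; with it, the sequence holds only the
statements that belong to the successful match.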

I have patches to clean up the PATTERN_DEF_SEQ handling, but they
only apply after the complete PR85694 sequence, whereas this needs
to go in before 14/n.

Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?

Richard


2018-07-03  Richard Sandiford  

gcc/
* tree-vect-patterns.c (vect_mark_pattern_stmts): Remove pattern
statements that have been replaced by further pattern statements.
(vect_pattern_recog_1): Clear STMT_VINFO_PATTERN_DEF_SEQ on failure.

gcc/testsuite/
* gcc.dg/vect/vect-mixed-size-cond-1.c: New test.

Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-02 14:34:45.857732632 +0100
+++ gcc/tree-vect-patterns.c2018-07-03 08:56:56.610251460 +0100
@@ -4295,6 +4295,9 @@ vect_mark_pattern_stmts (gimple *orig_st
   gimple_stmt_iterator gsi = gsi_for_stmt (orig_stmt, orig_def_seq);
   gsi_insert_seq_before_without_update (&gsi, def_seq, GSI_SAME_STMT);
   gsi_insert_before_without_update (&gsi, pattern_stmt, GSI_SAME_STMT);
+
+  /* Remove the pattern statement that this new pattern replaces.  */
+  gsi_remove (&gsi, false);
 }
   else
 vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
@@ -4358,6 +4361,8 @@ vect_pattern_recog_1 (vect_recog_func *r
  if (!is_pattern_stmt_p (stmt_info))
STMT_VINFO_RELATED_STMT (stmt_info) = NULL;
}
+  /* Clear any half-formed pattern definition sequence.  */
+  STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
   return;
 }
 
Index: gcc/testsuite/gcc.dg/vect/vect-mixed-size-cond-1.c
===
--- /dev/null   2018-06-13 14:36:57.192460992 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-mixed-size-cond-1.c  2018-07-03 
08:56:56.610251460 +0100
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+int
+f (unsigned char *restrict x, short *restrict y)
+{
+  for (int i = 0; i < 100; ++i)
+{
+  unsigned short a = (x[i] + 11) >> 1;
+  unsigned short b = (x[i] + 42) >> 2;
+  unsigned short cmp = y[i] == 0 ? a : b;
+  int res = cmp + 1;
+  x[i] = res;
+}
+}


Clean up interface to vector pattern recognisers

2018-07-03 Thread Richard Sandiford
The PR85694 series removed the only cases in which a pattern recogniser
could attach patterns to more than one statement.  I think it would be
better to avoid adding any new instances of that, since it interferes
with the normal matching order.

This patch therefore switches the interface back to passing a single
statement instead of a vector.  It also gets rid of the clearing of
STMT_VINFO_RELATED_STMT on failure, since no recognisers use it now.

Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?

Richard


2018-07-03  Richard Sandiford  

gcc/
* tree-vect-patterns.c (vect_recog_dot_prod_pattern):
(vect_recog_sad_pattern, vect_recog_widen_op_pattern)
(vect_recog_widen_mult_pattern, vect_recog_pow_pattern):
(vect_recog_widen_sum_pattern, vect_recog_over_widening_pattern)
(vect_recog_average_pattern, vect_recog_cast_forwprop_pattern)
(vect_recog_widen_shift_pattern, vect_recog_rotate_pattern)
(vect_recog_vector_vector_shift_pattern, vect_synth_mult_by_constant)
(vect_recog_mult_pattern, vect_recog_divmod_pattern)
(vect_recog_mixed_size_cond_pattern, vect_recog_bool_pattern)
(vect_recog_mask_conversion_pattern): Replace vec<gimple *>
parameter with a single stmt_vec_info.
(vect_recog_func_ptr): Likewise.
(vect_recog_gather_scatter_pattern): Likewise, folding in...
(vect_try_gather_scatter_pattern): ...this.
(vect_pattern_recog_1): Remove stmts_to_replace and just pass
the stmt_vec_info of the statement to be matched.  Don't clear
STMT_VINFO_RELATED_STMT.
(vect_pattern_recog): Update call accordingly.

Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-03 09:03:39.0 +0100
+++ gcc/tree-vect-patterns.c2018-07-03 09:03:39.834882009 +0100
@@ -888,7 +888,7 @@ vect_reassociating_reduction_p (stmt_vec
 
Input:
 
-   * STMTS: Contains a stmt from which the pattern search begins.  In the
+   * STMT_VINFO: The stmt from which the pattern search begins.  In the
example, when this function is called with S7, the pattern {S3,S4,S5,S6,S7}
will be detected.
 
@@ -909,11 +909,10 @@ vect_reassociating_reduction_p (stmt_vec
  inner-loop nested in an outer-loop that us being vectorized).  */
 
 static gimple *
-vect_recog_dot_prod_pattern (vec<gimple *> *stmts, tree *type_out)
+vect_recog_dot_prod_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
 {
-  gimple *last_stmt = (*stmts)[0];
   tree oprnd0, oprnd1;
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  gimple *last_stmt = stmt_vinfo->stmt;
   vec_info *vinfo = stmt_vinfo->vinfo;
   tree type, half_type;
   gimple *pattern_stmt;
@@ -1021,7 +1020,7 @@ vect_recog_dot_prod_pattern (vec *stmts, tree *type_out)
+vect_recog_sad_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
 {
-  gimple *last_stmt = (*stmts)[0];
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  gimple *last_stmt = stmt_vinfo->stmt;
   vec_info *vinfo = stmt_vinfo->vinfo;
   tree half_type;
 
@@ -1182,12 +1180,11 @@ vect_recog_sad_pattern (vec *s
name of the pattern being matched, for dump purposes.  */
 
 static gimple *
-vect_recog_widen_op_pattern (vec<gimple *> *stmts, tree *type_out,
+vect_recog_widen_op_pattern (stmt_vec_info last_stmt_info, tree *type_out,
 tree_code orig_code, tree_code wide_code,
 bool shift_p, const char *name)
 {
-  gimple *last_stmt = stmts->pop ();
-  stmt_vec_info last_stmt_info = vinfo_for_stmt (last_stmt);
+  gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
@@ -1231,7 +1228,6 @@ vect_recog_widen_op_pattern (vecsafe_push (last_stmt);
   return vect_convert_output (last_stmt_info, type, pattern_stmt, vecitype);
 }
 
@@ -1239,9 +1235,9 @@ vect_recog_widen_op_pattern (vec *stmts, tree *type_out)
+vect_recog_widen_mult_pattern (stmt_vec_info last_stmt_info, tree *type_out)
 {
-  return vect_recog_widen_op_pattern (stmts, type_out, MULT_EXPR,
+  return vect_recog_widen_op_pattern (last_stmt_info, type_out, MULT_EXPR,
  WIDEN_MULT_EXPR, false,
  "vect_recog_widen_mult_pattern");
 }
@@ -1257,7 +1253,7 @@ vect_recog_widen_mult_pattern (vec *stmts, tree *type_out)
+vect_recog_pow_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
 {
-  gimple *last_stmt = (*stmts)[0];
+  gimple *last_stmt = stmt_vinfo->stmt;
   tree base, exp;
   gimple *stmt;
   tree var;
@@ -1344,7 +1340,6 @@ vect_recog_pow_pattern (vec *s
  *type_out = get_vectype_for_scalar_type (TREE_TYPE (base));
  if (!*type_out)
return NULL;
- stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
  tree def = vect_recog_temp_ssa_var (TREE_TYPE (base), NULL);
  gimple *g = gimple_build_as

Ensure PATTERN_DEF_SEQ is empty before recognising patterns

2018-07-03 Thread Richard Sandiford
Various recognisers set PATTERN_DEF_SEQ to null before adding
statements to it, but it should always be null at that point anyway.
This patch asserts for that in vect_pattern_recog_1 and removes
the redundant code.

Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?

Richard


2018-07-03  Richard Sandiford  

gcc/
* tree-vect-patterns.c (new_pattern_def_seq): Delete.
(vect_recog_dot_prod_pattern, vect_recog_sad_pattern)
(vect_recog_widen_op_pattern, vect_recog_over_widening_pattern)
(vect_recog_rotate_pattern, vect_synth_mult_by_constant): Don't set
STMT_VINFO_PATTERN_DEF_SEQ to null here.
(vect_recog_pow_pattern, vect_recog_vector_vector_shift_pattern)
(vect_recog_mixed_size_cond_pattern, vect_recog_bool_pattern): Use
append_pattern_def_seq instead of new_pattern_def_seq.
(vect_recog_divmod_pattern): Do both of the above.
(vect_pattern_recog_1): Assert that STMT_VINFO_PATTERN_DEF_SEQ
is null.

Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-03 09:03:39.834882009 +0100
+++ gcc/tree-vect-patterns.c2018-07-03 09:06:43.861330261 +0100
@@ -150,13 +150,6 @@ append_pattern_def_seq (stmt_vec_info st
  new_stmt);
 }
 
-static inline void
-new_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt)
-{
-  STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
-  append_pattern_def_seq (stmt_info, stmt);
-}
-
 /* The caller wants to perform new operations on vect_external variable
VAR, so that the result of the operations would also be vect_external.
Return the edge on which the operations can be performed, if one exists.
@@ -983,7 +976,6 @@ vect_recog_dot_prod_pattern (stmt_vec_in
 return NULL;
 
   /* Get the inputs in the appropriate types.  */
-  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
   tree mult_oprnd[2];
   vect_convert_inputs (stmt_vinfo, 2, mult_oprnd, half_type,
   unprom0, half_vectype);
@@ -1142,7 +1134,6 @@ vect_recog_sad_pattern (stmt_vec_info st
 return NULL;
 
   /* Get the inputs to the SAD_EXPR in the appropriate types.  */
-  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
   tree sad_oprnd[2];
   vect_convert_inputs (stmt_vinfo, 2, sad_oprnd, half_type,
   unprom, half_vectype);
@@ -1220,7 +1211,6 @@ vect_recog_widen_op_pattern (stmt_vec_in
   if (!*type_out)
 return NULL;
 
-  STMT_VINFO_PATTERN_DEF_SEQ (last_stmt_info) = NULL;
   tree oprnd[2];
   vect_convert_inputs (last_stmt_info, 2, oprnd, half_type, unprom, vectype);
 
@@ -1342,7 +1332,7 @@ vect_recog_pow_pattern (stmt_vec_info st
return NULL;
  tree def = vect_recog_temp_ssa_var (TREE_TYPE (base), NULL);
  gimple *g = gimple_build_assign (def, MULT_EXPR, exp, logc);
- new_pattern_def_seq (stmt_vinfo, g);
+ append_pattern_def_seq (stmt_vinfo, g);
  tree res = vect_recog_temp_ssa_var (TREE_TYPE (base), NULL);
  g = gimple_build_call (exp_decl, 1, def);
  gimple_call_set_lhs (g, res);
@@ -1687,7 +1677,6 @@ vect_recog_over_widening_pattern (stmt_v
 }
 
   /* Calculate the rhs operands for an operation on NEW_TYPE.  */
-  STMT_VINFO_PATTERN_DEF_SEQ (last_stmt_info) = NULL;
   tree ops[3] = {};
   for (unsigned int i = 1; i < first_op; ++i)
 ops[i - 1] = gimple_op (last_stmt, i);
@@ -2073,7 +2062,6 @@ vect_recog_rotate_pattern (stmt_vec_info
def = rhs1;
 }
 
-  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
   if (def == NULL_TREE)
 {
   def = vect_recog_temp_ssa_var (type, NULL);
@@ -2269,7 +2257,7 @@ vect_recog_vector_vector_shift_pattern (
  set_vinfo_for_stmt (def_stmt, new_stmt_info);
  STMT_VINFO_VECTYPE (new_stmt_info)
= get_vectype_for_scalar_type (TREE_TYPE (rhs1));
- new_pattern_def_seq (stmt_vinfo, def_stmt);
+ append_pattern_def_seq (stmt_vinfo, def_stmt);
}
}
 }
@@ -2278,7 +2266,7 @@ vect_recog_vector_vector_shift_pattern (
 {
   def = vect_recog_temp_ssa_var (TREE_TYPE (oprnd0), NULL);
   def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd1);
-  new_pattern_def_seq (stmt_vinfo, def_stmt);
+  append_pattern_def_seq (stmt_vinfo, def_stmt);
 }
 
   /* Pattern detected.  */
@@ -2472,7 +2460,6 @@ vect_synth_mult_by_constant (tree op, tr
   tree accumulator;
 
   /* Clear out the sequence of statements so we can populate it below.  */
-  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
   gimple *stmt = NULL;
 
   if (cast_to_unsigned_p)
@@ -2769,7 +2756,7 @@ vect_recog_divmod_pattern (stmt_vec_info
   fold_build2 (MINUS_EXPR, itype, oprnd1,
build_int_cst (itype, 1)),
   build_int_c

Pass more vector types to append_pattern_def_seq

2018-07-03 Thread Richard Sandiford
The PR85694 series added a vectype argument to append_pattern_def_seq.
This patch makes more callers use it.

Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?

Richard


2018-07-03  Richard Sandiford  

gcc/
* tree-vect-patterns.c (vect_recog_rotate_pattern)
(vect_recog_vector_vector_shift_pattern, vect_recog_divmod_pattern)
(vect_recog_mixed_size_cond_pattern, adjust_bool_pattern_cast)
(adjust_bool_pattern, vect_recog_bool_pattern): Pass the vector
type to append_pattern_def_seq instead of creating a stmt_vec_info
directly.
(build_mask_conversion): Likewise.  Remove vinfo argument.
(vect_add_conversion_to_patterm): Likewise, renaming to...
(vect_add_conversion_to_pattern): ...this.
(vect_recog_mask_conversion_pattern): Update call to
build_mask_conversion.  Pass the vector type to
append_pattern_def_seq here too.
(vect_recog_gather_scatter_pattern): Update call to
vect_add_conversion_to_pattern.

Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-03 09:06:43.861330261 +0100
+++ gcc/tree-vect-patterns.c2018-07-03 09:09:41.627853962 +0100
@@ -2090,7 +2090,6 @@ vect_recog_rotate_pattern (stmt_vec_info
   else
 {
   tree vecstype = get_vectype_for_scalar_type (stype);
-  stmt_vec_info def_stmt_vinfo;
 
   if (vecstype == NULL_TREE)
return NULL;
@@ -2103,12 +2102,7 @@ vect_recog_rotate_pattern (stmt_vec_info
  gcc_assert (!new_bb);
}
   else
-   {
- def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
- set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
- STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecstype;
- append_pattern_def_seq (stmt_vinfo, def_stmt);
-   }
+   append_pattern_def_seq (stmt_vinfo, def_stmt, vecstype);
 
   def2 = vect_recog_temp_ssa_var (stype, NULL);
   tree mask = build_int_cst (stype, GET_MODE_PRECISION (smode) - 1);
@@ -2121,12 +2115,7 @@ vect_recog_rotate_pattern (stmt_vec_info
  gcc_assert (!new_bb);
}
   else
-   {
- def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
- set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
- STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecstype;
- append_pattern_def_seq (stmt_vinfo, def_stmt);
-   }
+   append_pattern_def_seq (stmt_vinfo, def_stmt, vecstype);
 }
 
   var1 = vect_recog_temp_ssa_var (type, NULL);
@@ -2252,12 +2241,8 @@ vect_recog_vector_vector_shift_pattern (
   TYPE_PRECISION (TREE_TYPE (oprnd1)));
  def = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL);
  def_stmt = gimple_build_assign (def, BIT_AND_EXPR, rhs1, mask);
- stmt_vec_info new_stmt_info
-   = new_stmt_vec_info (def_stmt, vinfo);
- set_vinfo_for_stmt (def_stmt, new_stmt_info);
- STMT_VINFO_VECTYPE (new_stmt_info)
-   = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
- append_pattern_def_seq (stmt_vinfo, def_stmt);
+ tree vecstype = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
+ append_pattern_def_seq (stmt_vinfo, def_stmt, vecstype);
}
}
 }
@@ -2688,11 +2673,9 @@ vect_recog_divmod_pattern (stmt_vec_info
   tree oprnd0, oprnd1, vectype, itype, cond;
   gimple *pattern_stmt, *def_stmt;
   enum tree_code rhs_code;
-  vec_info *vinfo = stmt_vinfo->vinfo;
   optab optab;
   tree q;
   int dummy_int, prec;
-  stmt_vec_info def_stmt_vinfo;
 
   if (!is_gimple_assign (last_stmt))
 return NULL;
@@ -2792,18 +2775,12 @@ vect_recog_divmod_pattern (stmt_vec_info
  def_stmt = gimple_build_assign (var, COND_EXPR, cond,
  build_int_cst (utype, -1),
  build_int_cst (utype, 0));
- def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
- set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
- STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecutype;
- append_pattern_def_seq (stmt_vinfo, def_stmt);
+ append_pattern_def_seq (stmt_vinfo, def_stmt, vecutype);
  var = vect_recog_temp_ssa_var (utype, NULL);
  def_stmt = gimple_build_assign (var, RSHIFT_EXPR,
  gimple_assign_lhs (def_stmt),
  shift);
- def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
- set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
- STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecutype;
- append_pattern_def_seq (stmt_vinfo, def_stmt);
+ append_pattern_def_seq (stmt_vinfo, def_stmt, vecutype);
  signmask = vect_recog_temp_ssa_var (itype, NULL);
   

Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Martin Liška
On 06/29/2018 09:04 PM, Jeff Law wrote:
> I think this is fine for the trunk.
> 
> jeff

Thank you Jeff.

I found some issues when building all targets (contrib/config-list.mk).
I'll update the patch and check that the affected cross-compilers still
produce the same output.

However I noticed one ppc64 issue:

$ cat -n gcc/config/powerpcspe/powerpcspe.c

  5401/* Set branch target alignment, if not optimizing for size.  */
  5402if (!optimize_size)
  5403  {
  5404/* Cell wants to be aligned 8byte for dual issue.  Titan wants to be
  5405   aligned 8byte to avoid misprediction by the branch predictor.  */
  5406if (rs6000_cpu == PROCESSOR_TITAN
  5407|| rs6000_cpu == PROCESSOR_CELL)
  5408  {
  5409if (align_functions <= 0)
  5410  align_functions = 8;
  5411if (align_jumps <= 0)
  5412  align_jumps = 8;
  5413if (align_loops <= 0)
  5414  align_loops = 8;
  5415  }
  5416if (rs6000_align_branch_targets)
  5417  {
  5418if (align_functions <= 0)
  5419  align_functions = 16;
  5420if (align_jumps <= 0)
  5421  align_jumps = 16;
  5422if (align_loops <= 0)
  5423  {
  5424can_override_loop_align = 1;
  5425align_loops = 16;
  5426  }
  5427  }
  5428if (align_jumps_max_skip <= 0)
  5429  align_jumps_max_skip = 15;
  5430if (align_loops_max_skip <= 0)
  5431  align_loops_max_skip = 15;

Note that line 5429 sets align_jumps_max_skip to 15 if it has not been set
already.
However, line 5412 sets align_jumps to 8, and align_jumps_max_skip should equal
align_jumps - 1 (that is, 7).
That's a discrepancy. Segher, can you please take a look?
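
To make the discrepancy concrete, here is a minimal sketch in C++. The struct
and both functions are hypothetical stand-ins written for this note, not GCC
code. The second function shows one possible way to keep the skip limit
consistent with the alignment; it is an assumption, not a proposed or
committed fix.

```cpp
#include <cassert>

// Hypothetical stand-ins for the option variables in the excerpt above.
// A value <= 0 means "not set by the user".
struct AlignOpts {
  int align_jumps = 0;
  int align_jumps_max_skip = 0;
};

// Mirrors the quoted logic for jumps only: Cell/Titan pick 8, the generic
// branch-target case picks 16, and max_skip then defaults to 15 regardless.
AlignOpts quoted_defaults(bool cell_or_titan, bool align_branch_targets) {
  AlignOpts o;
  if (cell_or_titan && o.align_jumps <= 0)
    o.align_jumps = 8;
  if (align_branch_targets && o.align_jumps <= 0)
    o.align_jumps = 16;
  if (o.align_jumps_max_skip <= 0)
    o.align_jumps_max_skip = 15;
  return o;
}

// Assumed alternative: derive the default skip limit from whatever
// alignment was chosen, so that it is always alignment - 1.
AlignOpts derived_defaults(bool cell_or_titan, bool align_branch_targets) {
  AlignOpts o;
  if (cell_or_titan && o.align_jumps <= 0)
    o.align_jumps = 8;
  if (align_branch_targets && o.align_jumps <= 0)
    o.align_jumps = 16;
  if (o.align_jumps_max_skip <= 0 && o.align_jumps > 0)
    o.align_jumps_max_skip = o.align_jumps - 1;
  return o;
}
```

With cell_or_titan set, quoted_defaults yields an alignment of 8 with a skip
limit of 15, whereas derived_defaults yields 8 and 7.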

Thanks,
Martin


Re: Limit Debug mode impact: overload __niter_base

2018-07-03 Thread Jonathan Wakely

On 03/07/18 07:47 +0200, François Dumont wrote:

Here is the updated patch.

    * include/bits/stl_algobase.h (__niter_wrap): New.
    (__copy_move_a2(_II, _II, _OI)): Use latter.
    (__copy_move_backward_a2(_BI1, _BI1, _BI2)): Likewise.
    (fill_n(_OI, _Size, const _Tp&)): Likewise.
    (equal(_II1, _II1, _II2)): Use __glibcxx_requires_can_increment.
    * include/debug/stl_iterator.h
    (std::__niter_base(const __gnu_cxx::_Safe_iterator<
    __gnu_cxx::__normal_iterator<>, _Sequence>&)): New declaration.
    * include/debug/vector (__niter_base(const __gnu_cxx::_Safe_iterator<
    __gnu_cxx::__normal_iterator<>, _Sequence>&)): New.

Ok to commit ?


On 02/07/2018 13:57, Jonathan Wakely wrote:

On 01/07/18 21:20 +0200, François Dumont wrote:

    Here is a new proposal between yours and mine.

    It is still adding a function to wrap what __niter_base
unwraps; I called it __nwrap_iter for this reason. But it takes
advantage of


Since "niter" refers to __normal_iterator I think a name based on
"niter" would be better than nsomething_iter.

__niter_wrap
__niter_rewrap
__niter_lift (misuse of functional programming term?)
__niter_raise (misuse of linear algebra term?)
__make_niter
__remake_niter


knowing that __niter_base will only unwrap random access iterators,
to use an expression that will do the right thing, no matter
the original iterator type.


OK, since __niter_base only transforms types based on
__normal_iterator that seems safe to assume (in theory we could use
__normal_iterator with non-random access iterators, but we don't).

Could you please add a comment to the __nwrap_iter saying something
like:

 // Reverse the __niter_base transformation to get a
 // __normal_iterator back again (this assumes that __normal_iterator
 // is only used to wrap random access iterators, like pointers).


    I eventually found no issue in the testsuite, despite the 
std::equal change. I might have had a local invalid test.


Yes, I *did* test it already :-)




diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index d429943..003ae8d 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -277,6 +277,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    __niter_base(_Iterator __it)
    { return __it; }

+  // Convert an iterator of type _From to an iterator of type _To.
+  // e.g. from int* to __normal_iterator.
+  template<typename _Iterator>
+    inline _Iterator
+    __nwrap_iter(_Iterator, _Iterator, _Iterator __res)
+    { return __res; }
+
+  template<typename _From, typename _To>
+    inline _From
+    __nwrap_iter(_From __from, _To __to, _To __res)
+    { return __from + (__res - __to); }


Every time you call this function you pass it:

 __nwrap_iter(x, __niter_base(x), y)

So can the __niter_base(x) call happen inside __nwrap_iter?

i.e.

 template<typename _From, typename _To>
   inline _From
   __nwrap_iter(_From __from, _To __res)
   { return __from + (__res - __niter_base(__from)); }








diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index d429943..e5e7d15 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -277,6 +277,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__niter_base(_Iterator __it)
{ return __it; }

+  // Reverse the __niter_base transformation to get a
+  // __normal_iterator back again (this assumes that __normal_iterator
+  // is only used to wrap random access iterators, like pointers).


Please move the comment onto the other overload, since that's the one
that actually does the wrapping.

Maybe we should re-order the overloads, so the __niter_wrap<_From, _To>
overload comes first, then on the second one add a comment:

   // No need to wrap, iterator already has the right type


+  template<typename _Iterator>
+inline _Iterator
+__niter_wrap(_Iterator, _Iterator __res)
+{ return __res; }
+
+  template<typename _From, typename _To>
+inline _From
+__niter_wrap(_From __from, _To __res)
+{ return __from + (__res - __niter_base(__from)); }


Please qualify this call as std::__niter_base

OK with those changes, thanks.
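
For readers following the thread, the unwrap/re-wrap round trip discussed
above can be sketched outside libstdc++ as follows. This is only an
illustration under stated assumptions: wrapped_ptr, niter_base and niter_wrap
are hypothetical names standing in for __normal_iterator, __niter_base and
__niter_wrap, and the real library code differs in detail.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for __gnu_cxx::__normal_iterator: a thin wrapper around a
// raw pointer that supports just enough arithmetic for this sketch.
template<typename T>
struct wrapped_ptr {
  T* p;
  wrapped_ptr operator+(std::ptrdiff_t n) const { return wrapped_ptr{p + n}; }
  T& operator*() const { return *p; }
};

// Unwrap: ordinary iterators pass through unchanged...
template<typename It>
It niter_base(It it) { return it; }

// ...while the wrapper yields the raw pointer it holds.
template<typename T>
T* niter_base(wrapped_ptr<T> it) { return it.p; }

// Reverse the niter_base transformation: advance the original (wrapped)
// iterator by the distance the unwrapped result moved.  This assumes the
// wrapper only ever wraps random access iterators, like pointers.
template<typename From, typename To>
From niter_wrap(From from, To res)
{ return from + (res - niter_base(from)); }

// No need to wrap, the iterator already has the right type.
template<typename It>
It niter_wrap(It, It res) { return res; }
```

An algorithm would call niter_base on its arguments, run on the raw pointers,
and then pass the original iterator plus the raw result to niter_wrap to hand
the caller back an iterator of the type it started with.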



Re: [testsuite/guality, committed] Prevent optimization of local in vla-1.c

2018-07-03 Thread Tom de Vries
On 07/02/2018 10:16 AM, Jakub Jelinek wrote:
> On Mon, Jul 02, 2018 at 09:44:04AM +0200, Richard Biener wrote:
> >> Given the array has size i + 1 its upper bound should be 'i' and 'i'
>> should be available via DW_OP_[GNU_]entry_value.
>>
>> I see it is
>>
>> <175>   DW_AT_upper_bound : 10 byte block: 75 1 8 20 24 8 20 26 31
>> 1c   (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
>> DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus)
>>
> >> and %rdi is 1.  Not sure why gdb fails to print its length.  Yes, the
>> storage itself doesn't have a location but the
>> type specifies the size.
>>
>> (gdb) ptype a
>> type = char [variable length]
>> (gdb) p sizeof(a)
>> $3 = 0
>>
>> this looks like a gdb bug to me?
>>

With gdb patch:
...
diff --git a/gdb/findvar.c b/gdb/findvar.c
index 8ad5e25cb2..ebaff923a1 100644
--- a/gdb/findvar.c
+++ b/gdb/findvar.c
@@ -789,6 +789,8 @@ default_read_var_value
   break;

 case LOC_OPTIMIZED_OUT:
+  if (is_dynamic_type (type))
+   type = resolve_dynamic_type (type, NULL,
+/* Unused address.  */ 0);
   return allocate_optimized_out_value (type);

 default:
...

I get:
...
$ ./gdb -batch -ex "b f1" -ex "r" -ex "p sizeof (a)" vla-1.exe
Breakpoint 1 at 0x4004a8: file vla-1.c, line 17.

Breakpoint 1, f1 (i=i@entry=5) at vla-1.c:17
17        return a[0];
$1 = 6
...

So I'd say it's a gdb bug indeed.

Thanks,
- Tom

> >> Btw, the location expression looks odd, if I decipher it correctly
>> we end up with ((%rdi << 32) >> 32) - 1 which computes to 4
>> but the upper bound should be 5.  The GIMPLE debug stmts compute
>> the upper bound as (sizetype)((long)(i_1(D) + 1) - 1)
> 
> The << 32 >> 32 is sign extension.  And yes, for f1 I don't see why
> DW_OP_GNU_entry_value shouldn't work, i in main is needed for the call to
> f2, so needs to live in some register or memory in that function until the
> second call.  For f2 i is needed after the bar call for the a[i + 4] read,
> worst case in form of precomputed i + 4, but that is reversible op.
> 
>   Jakub
> 
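
The quoted DW_AT_upper_bound expression can be walked through by hand. The
sketch below is a hypothetical helper, not gdb or GCC code; it assumes that
DW_OP_breg5 with offset 1 pushes the value of %rdi plus 1, and it emulates the
shl/shra pair as a sign extension of the low 32 bits, as Jakub describes.

```cpp
#include <cassert>
#include <cstdint>

// Evaluate the quoted location expression for a given %rdi value.
std::int64_t vla_upper_bound(std::uint64_t rdi) {
  std::uint64_t v = rdi + 1;  // DW_OP_breg5 (rdi): 1
  v <<= 32;                   // DW_OP_const1u: 32; DW_OP_shl
  // DW_OP_const1u: 32; DW_OP_shra is an arithmetic right shift, so together
  // with the shl above it sign-extends the low 32 bits.  The unsigned to
  // signed 32-bit conversion is modular on mainstream compilers (and
  // guaranteed to be since C++20).
  std::int64_t s =
      static_cast<std::int32_t>(static_cast<std::uint32_t>(v >> 32));
  return s - 1;               // DW_OP_lit1; DW_OP_minus
}
```

For i = 5 this gives an upper bound of 5, so the array has 6 elements, which
agrees with the "$1 = 6" printed by the patched gdb above.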


Re: [patch] jump threading multiple paths that start from the same BB

2018-07-03 Thread Aldy Hernandez

On 07/02/2018 07:08 AM, Christophe Lyon wrote:


On 11/07/2017 10:33 AM, Aldy Hernandez wrote:

While poking around in the backwards threader I noticed that we bail if
we have already seen a starting BB.

/* Do not jump-thread twice from the same block.  */
if (bitmap_bit_p (threaded_blocks, entry->src->index)

This limitation discards paths that are sub-paths of paths that have
already been threaded.

The following patch scans the remaining to-be-threaded paths to identify
if any of them start from the same point, and are thus sub-paths of the
just-threaded path.  By removing the common prefix of blocks in upcoming
threadable paths, and then rewiring the first non-common block
appropriately, we expose new threading opportunities, since we are no
longer starting from the same BB.  We also simplify the would-be
threaded paths, because we don't duplicate already duplicated paths.
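
The prefix-stripping idea can be illustrated in isolation. The sketch below is
not the threader's implementation: Path and strip_common_prefix are
hypothetical names, a path is modelled as a plain list of basic-block indices,
and the rewiring of the first non-common block is left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A path is modelled as the sequence of basic-block indices it visits.
using Path = std::vector<int>;

// Return PENDING with the leading blocks it shares with the just-threaded
// path removed.  If the two paths start from different blocks, nothing is
// shared and PENDING comes back unchanged.
Path strip_common_prefix(const Path& threaded, const Path& pending) {
  // The four-iterator form of std::mismatch (C++14) stops at the first
  // differing block or at the end of the shorter path.
  auto diverge = std::mismatch(threaded.begin(), threaded.end(),
                               pending.begin(), pending.end());
  return Path(diverge.second, pending.end());
}
```

For example, with a threaded path {2, 3, 4, 5} and a pending path
{2, 3, 4, 7, 8} (block numbers made up for illustration), the remaining work
is {7, 8}.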

[snip]

Hi,

I've noticed a regression on aarch64:
FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
threaded: 3"
very likely caused by this patch (appeared between 262282 and 262294)

Christophe


The test needs to be adjusted here.

The long story is that the aarch64 IL is different at thread3 time in 
that it has 2 profitable sub-paths that can now be threaded with my 
patch.  This is causing the threaded count to be 5 for aarch64, versus 3 
for x86-64.  Previously we couldn't thread these in aarch64, so the 
backwards threader would bail.


One can see the different threading opportunities by sticking 
debug_all_paths() at the top of thread_through_all_blocks().  You will 
notice that aarch64 has far more candidates to begin with.  The IL on 
the x86 backend has no paths that start on the same BB.  The aarch64 IL, 
on the other hand, has many to choose from:


path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,

Some of these prove unprofitable, but 2 more than before are profitable now.

BTW, I see another threading related failure on aarch64 which is 
unrelated to my patch, and was previously there:


FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps 
threaded"


This is probably another IL incompatibility between architectures.

Anyways... the attached patch fixes the regression.  I have added a note 
to the test explaining the IL differences.  We really should rewrite all 
the threading tests (I am NOT volunteering ;-)).
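
As a quick check of the adjusted scan pattern, the character class [35]
accepts either count. The sketch below uses std::regex rather than the Tcl
regular expressions DejaGnu actually runs, and thread3_count_ok is a
hypothetical helper; as far as I understand, the backslashes in the dg-final
line are only there to stop Tcl from treating the brackets as command
substitution.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return true if a dump line reports one of the two accepted counts.
bool thread3_count_ok(const std::string& dump_line) {
  static const std::regex pattern("Jumps threaded: [35]");
  return std::regex_search(dump_line, pattern);
}
```

A line reporting 3 or 5 threaded jumps passes, while one reporting 4 does not.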


OK for trunk?
Aldy
gcc/testsuite/

	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
	has a slightly different IL that provides more threading
	opportunities.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index 9ee8d12010b..e395de26ec0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -2,11 +2,16 @@
 /* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
 /* { dg-final { scan-tree-dump "Jumps threaded: 16"  "thread1" } } */
 /* { dg-final { scan-tree-dump "Jumps threaded: 9" "thread2" } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 3" "thread3" } } */
 /* { dg-final { scan-tree-dump "Jumps threaded: 1"  "dom2" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom3" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "vrp2" } } */
 
+/* Most architectures get 3 threadable paths here, whereas aarch64 and
+   possibly others get 5.  We really should rewrite threading tests to
+   test a specific IL sequence, not gobs of code whose IL can vary
+   from architecture to architecture.  */
+/* { dg-final { scan-tree-dump "Jumps threaded: \[35\]" "thread3" } } */
+
 enum STATE {
   S0=0,
   SI,


Re: [C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 9:49 AM Jakub Jelinek  wrote:
>
> Hi!
>
> While working on OpenMP range-for support, I ran into the weird thing that
> the C++ FE accepts
>   for (auto &i : a)
> if (i != *__for_begin || &i == __for_end || &__for_range[0] != &a[0])
>   __builtin_abort ();
> outside of templates, but doesn't inside of templates.
> I think we shouldn't let people do this at any time, it will just lead to
> non-portable code, and when it works only outside of templates, it isn't a
> well defined extension.
>
> Now, we could just use create_tmp_var_name ("__for_range") instead of
> get_identifier ("__for_range") etc. and the symbols would be non-accessible
> to users (usually; if assembler doesn't support dots nor dollars in labels,
> it is still a symbol with just alphanumeric chars in it, but there is a hard
> to predict number suffix at least), or just NULL DECL_NAME.
> But my understanding is that the intent was that in the debugger one could
> use __for_range, __for_begin and __for_end to make it easier to debug
> range fors.  In that case, the following patch solves it by using a name
> not accessible for the user when parsing the body (with space in it) and
> correcting the names when the var gets out of scope.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Can we make them DECL_ARTIFICIAL and/or make name-lookup never
lookup DECL_ARTIFICIAL vars instead?

Richard.
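
As an aside on the quoted finish_for_stmt hunk below: it selects a name by
indexing into one string literal with embedded NULs, relying on each name
being 11 characters plus a terminator, i.e. 12 bytes apart. A minimal
standalone sketch of that trick, with for_decl_name as a hypothetical helper
that is not part of the patch:

```cpp
#include <cassert>
#include <cstring>

// Return the i-th name (0 = range, 1 = begin, 2 = end).
const char* for_decl_name(int i) {
  // "__for range" and "__for begin" are both 11 characters, so with the
  // embedded NUL each entry starts 12 bytes after the previous one.
  static const char packed[] = "__for range\0__for begin\0__for end";
  return packed + 12 * i;
}
```

The indexing only works while the first two names keep the same length;
changing either one would silently break the 12-byte stride.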

> 2018-07-02  Jakub Jelinek  
>
> PR c++/85515
> * parser.c (build_range_temp): Use "__for range" instead of
> "__for_range".
> (cp_convert_range_for): Use "__for begin" and "__for end" instead of
> "__for_begin" and "__for_end".
> * semantics.c (finish_for_stmt): Rename "__for {range,begin,end}"
> local symbols to "__for_{range,begin,end}".
>
> * g++.dg/pr85515-2.C: Add expected dg-error.
> * g++.dg/cpp0x/range-for36.C: New test.
>
> --- gcc/cp/parser.c.jj  2018-07-02 23:04:00.869370436 +0200
> +++ gcc/cp/parser.c 2018-07-02 23:08:44.572618486 +0200
> @@ -11953,7 +11953,7 @@ build_range_temp (tree range_expr)
>
>/* Create the __range variable.  */
>range_temp = build_decl (input_location, VAR_DECL,
> -  get_identifier ("__for_range"), range_type);
> +  get_identifier ("__for range"), range_type);
>TREE_USED (range_temp) = 1;
>DECL_ARTIFICIAL (range_temp) = 1;
>
> @@ -12061,7 +12061,7 @@ cp_convert_range_for (tree statement, tr
>
>/* The new for initialization statement.  */
>begin = build_decl (input_location, VAR_DECL,
> - get_identifier ("__for_begin"), iter_type);
> + get_identifier ("__for begin"), iter_type);
>TREE_USED (begin) = 1;
>DECL_ARTIFICIAL (begin) = 1;
>pushdecl (begin);
> @@ -12072,7 +12072,7 @@ cp_convert_range_for (tree statement, tr
>if (cxx_dialect >= cxx17)
>  iter_type = cv_unqualified (TREE_TYPE (end_expr));
>end = build_decl (input_location, VAR_DECL,
> -   get_identifier ("__for_end"), iter_type);
> +   get_identifier ("__for end"), iter_type);
>TREE_USED (end) = 1;
>DECL_ARTIFICIAL (end) = 1;
>pushdecl (end);
> --- gcc/cp/semantics.c.jj   2018-06-25 14:51:23.096989196 +0200
> +++ gcc/cp/semantics.c  2018-07-02 23:23:40.784400542 +0200
> @@ -1060,7 +1060,31 @@ finish_for_stmt (tree for_stmt)
>  : &FOR_SCOPE (for_stmt));
>tree scope = *scope_ptr;
>*scope_ptr = NULL;
> +
> +  /* During parsing of the body, range for uses "__for {range,begin,end}"
> + decl names to make those unaccessible by code in the body.
> + Change it to ones with underscore instead of space, so that it can
> + be inspected in the debugger.  */
> +  tree range_for_decl[3] = { NULL_TREE, NULL_TREE, NULL_TREE };
> +  for (int i = 0; i < 3; i++)
> +{
> +  tree id
> +   = get_identifier ("__for range\0__for begin\0__for end" + 12 * i);
> +  if (IDENTIFIER_BINDING (id)
> + && IDENTIFIER_BINDING (id)->scope == current_binding_level)
> +   {
> + range_for_decl[i] = IDENTIFIER_BINDING (id)->value;
> + gcc_assert (VAR_P (range_for_decl[i])
> + && DECL_ARTIFICIAL (range_for_decl[i]));
> +   }
> +}
> +
>add_stmt (do_poplevel (scope));
> +
> +  for (int i = 0; i < 3; i++)
> +if (range_for_decl[i])
> +  DECL_NAME (range_for_decl[i])
> +   = get_identifier ("__for_range\0__for_begin\0__for_end" + 12 * i);
>  }
>
>  /* Begin a range-for-statement.  Returns a new RANGE_FOR_STMT.
> --- gcc/testsuite/g++.dg/pr85515-2.C.jj 2018-04-27 22:28:02.889462532 +0200
> +++ gcc/testsuite/g++.dg/pr85515-2.C  2018-07-02 23:33:50.638930364 +0200
> @@ -15,8 +15,7 @@ int test_2 ()
>int sum = 0;
>for (const auto v: arr) {
>  sum += v;
> -// TODO: should we issue an error for the following assignment?
> -__for_begin = __for_end;
> +__fo

Re: Avoid matching the same pattern statement twice

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 10:02 AM Richard Sandiford
 wrote:
>
> r262275 allowed pattern matching on pattern statements.  Testing for
> SVE on more benchmarks showed a case where this interacted badly
> with 14/n.
>
> The new over-widening detection could narrow a COND_EXPR A to another
> COND_EXPR B, which mixed_size_cond could then match.  This was working
> as expected.  However, we left B (now dead) in the pattern definition
> sequence with a non-null PATTERN_DEF_SEQ.  mask_conversion also
> matched B, and unlike most recognisers, didn't clear PATTERN_DEF_SEQ
> before adding statements to it.  This meant that the statements
> created by mixed_size_cond appeared in two supposedly separate
> sequences, causing much confusion.
>
> This patch removes pattern statements that are replaced by further
> pattern statements.  As a belt-and-braces fix, it also nullifies
> PATTERN_DEF_SEQ on failure, in the same way Richard B. did recently
> for RELATED_STMT.
>
> I have patches to clean up the PATTERN_DEF_SEQ handling, but they
> only apply after the complete PR85694 sequence, whereas this needs
> to go in before 14/n.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
> OK to install?

OK.

Richard.

> Richard
>
>
> 2018-07-03  Richard Sandiford  
>
> gcc/
> * tree-vect-patterns.c (vect_mark_pattern_stmts): Remove pattern
> statements that have been replaced by further pattern statements.
> (vect_pattern_recog_1): Clear STMT_VINFO_PATTERN_DEF_SEQ on failure.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-mixed-size-cond-1.c: New test.
>
> Index: gcc/tree-vect-patterns.c
> ===
> --- gcc/tree-vect-patterns.c  2018-07-02 14:34:45.857732632 +0100
> +++ gcc/tree-vect-patterns.c  2018-07-03 08:56:56.610251460 +0100
> @@ -4295,6 +4295,9 @@ vect_mark_pattern_stmts (gimple *orig_st
>gimple_stmt_iterator gsi = gsi_for_stmt (orig_stmt, orig_def_seq);
>gsi_insert_seq_before_without_update (&gsi, def_seq, GSI_SAME_STMT);
>gsi_insert_before_without_update (&gsi, pattern_stmt, GSI_SAME_STMT);
> +
> +  /* Remove the pattern statement that this new pattern replaces.  */
> +  gsi_remove (&gsi, false);
>  }
>else
>  vect_set_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
> @@ -4358,6 +4361,8 @@ vect_pattern_recog_1 (vect_recog_func *r
>   if (!is_pattern_stmt_p (stmt_info))
> STMT_VINFO_RELATED_STMT (stmt_info) = NULL;
> }
> +  /* Clear any half-formed pattern definition sequence.  */
> +  STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
>return;
>  }
>
> Index: gcc/testsuite/gcc.dg/vect/vect-mixed-size-cond-1.c
> ===
> --- /dev/null   2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-mixed-size-cond-1.c  2018-07-03 08:56:56.610251460 +0100
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +
> +int
> +f (unsigned char *restrict x, short *restrict y)
> +{
> +  for (int i = 0; i < 100; ++i)
> +{
> +  unsigned short a = (x[i] + 11) >> 1;
> +  unsigned short b = (x[i] + 42) >> 2;
> +  unsigned short cmp = y[i] == 0 ? a : b;
> +  int res = cmp + 1;
> +  x[i] = res;
> +}
> +}


Re: Clean up interface to vector pattern recognisers

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 10:06 AM Richard Sandiford
 wrote:
>
> The PR85694 series removed the only cases in which a pattern recogniser
> could attach patterns to more than one statement.  I think it would be
> better to avoid adding any new instances of that, since it interferes
> with the normal matching order.
>
> This patch therefore switches the interface back to passing a single
> statement instead of a vector.  It also gets rid of the clearing of
> STMT_VINFO_RELATED_STMT on failure, since no recognisers use it now.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
> OK to install?

Very good.

OK.
Thanks,
Richard.

> Richard
>
>
> 2018-07-03  Richard Sandiford  
>
> gcc/
> * tree-vect-patterns.c (vect_recog_dot_prod_pattern):
> (vect_recog_sad_pattern, vect_recog_widen_op_pattern)
> (vect_recog_widen_mult_pattern, vect_recog_pow_pattern):
> (vect_recog_widen_sum_pattern, vect_recog_over_widening_pattern)
> (vect_recog_average_pattern, vect_recog_cast_forwprop_pattern)
> (vect_recog_widen_shift_pattern, vect_recog_rotate_pattern)
> (vect_recog_vector_vector_shift_pattern, vect_synth_mult_by_constant)
> (vect_recog_mult_pattern, vect_recog_divmod_pattern)
> (vect_recog_mixed_size_cond_pattern, vect_recog_bool_pattern)
> (vect_recog_mask_conversion_pattern): Replace vec<gimple *> *
> parameter with a single stmt_vec_info.
> (vect_recog_func_ptr): Likewise.
> (vect_recog_gather_scatter_pattern): Likewise, folding in...
> (vect_try_gather_scatter_pattern): ...this.
> (vect_pattern_recog_1): Remove stmts_to_replace and just pass
> the stmt_vec_info of the statement to be matched.  Don't clear
> STMT_VINFO_RELATED_STMT.
> (vect_pattern_recog): Update call accordingly.
>
> Index: gcc/tree-vect-patterns.c
> ===
> --- gcc/tree-vect-patterns.c  2018-07-03 09:03:39.0 +0100
> +++ gcc/tree-vect-patterns.c  2018-07-03 09:03:39.834882009 +0100
> @@ -888,7 +888,7 @@ vect_reassociating_reduction_p (stmt_vec
>
> Input:
>
> -   * STMTS: Contains a stmt from which the pattern search begins.  In the
> +   * STMT_VINFO: The stmt from which the pattern search begins.  In the
> example, when this function is called with S7, the pattern {S3,S4,S5,S6,S7}
> will be detected.
>
> @@ -909,11 +909,10 @@ vect_reassociating_reduction_p (stmt_vec
>   inner-loop nested in an outer-loop that us being vectorized).  */
>
>  static gimple *
> -vect_recog_dot_prod_pattern (vec<gimple *> *stmts, tree *type_out)
> +vect_recog_dot_prod_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
>  {
> -  gimple *last_stmt = (*stmts)[0];
>tree oprnd0, oprnd1;
> -  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
> +  gimple *last_stmt = stmt_vinfo->stmt;
>vec_info *vinfo = stmt_vinfo->vinfo;
>tree type, half_type;
>gimple *pattern_stmt;
> @@ -1021,7 +1020,7 @@ vect_recog_dot_prod_pattern (vec<gimple
>
> Input:
>
> -   * STMTS: Contains a stmt from which the pattern search begins.  In the
> +   * STMT_VINFO: The stmt from which the pattern search begins.  In the
> example, when this function is called with S8, the pattern
> {S3,S4,S5,S6,S7,S8} will be detected.
>
> @@ -1035,10 +1034,9 @@ vect_recog_dot_prod_pattern (vec<gimple
>   */
>
>  static gimple *
> -vect_recog_sad_pattern (vec<gimple *> *stmts, tree *type_out)
> +vect_recog_sad_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
>  {
> -  gimple *last_stmt = (*stmts)[0];
> -  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
> +  gimple *last_stmt = stmt_vinfo->stmt;
>vec_info *vinfo = stmt_vinfo->vinfo;
>tree half_type;
>
> @@ -1182,12 +1180,11 @@ vect_recog_sad_pattern (vec<gimple *> *s
> name of the pattern being matched, for dump purposes.  */
>
>  static gimple *
> -vect_recog_widen_op_pattern (vec<gimple *> *stmts, tree *type_out,
> +vect_recog_widen_op_pattern (stmt_vec_info last_stmt_info, tree *type_out,
>  tree_code orig_code, tree_code wide_code,
>  bool shift_p, const char *name)
>  {
> -  gimple *last_stmt = stmts->pop ();
> -  stmt_vec_info last_stmt_info = vinfo_for_stmt (last_stmt);
> +  gimple *last_stmt = last_stmt_info->stmt;
>
>vect_unpromoted_value unprom[2];
>tree half_type;
> @@ -1231,7 +1228,6 @@ vect_recog_widen_op_pattern (vec<gimple
>   gimple *pattern_stmt = gimple_build_assign (var, wide_code,
>   oprnd[0], oprnd[1]);
>
> -  stmts->safe_push (last_stmt);
>return vect_convert_output (last_stmt_info, type, pattern_stmt, vecitype);
>  }
>
> @@ -1239,9 +1235,9 @@ vect_recog_widen_op_pattern (vec<gimple
>    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
>
>  static gimple *
> -vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out)
> +vect_recog_widen_mult_pattern (stmt_vec_info last_stmt_info, tree *type_o

Re: Ensure PATTERN_DEF_SEQ is empty before recognising patterns

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 10:10 AM Richard Sandiford
 wrote:
>
> Various recognisers set PATTERN_DEF_SEQ to null before adding
> statements to it, but it should always be null at that point anyway.
> This patch asserts for that in vect_pattern_recog_1 and removes
> the redundant code.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
> OK to install?

OK.

Thanks,
Richard.

> Richard
>
>
> 2018-07-03  Richard Sandiford  
>
> gcc/
> * tree-vect-patterns.c (new_pattern_def_seq): Delete.
> (vect_recog_dot_prod_pattern, vect_recog_sad_pattern)
> (vect_recog_widen_op_pattern, vect_recog_over_widening_pattern)
> (vect_recog_rotate_pattern, vect_synth_mult_by_constant): Don't set
> STMT_VINFO_PATTERN_DEF_SEQ to null here.
> (vect_recog_pow_pattern, vect_recog_vector_vector_shift_pattern)
> (vect_recog_mixed_size_cond_pattern, vect_recog_bool_pattern): Use
> append_pattern_def_seq instead of new_pattern_def_seq.
> (vect_recog_divmod_pattern): Do both of the above.
> (vect_pattern_recog_1): Assert that STMT_VINFO_PATTERN_DEF_SEQ
> is null.
>
> Index: gcc/tree-vect-patterns.c
> ===
> --- gcc/tree-vect-patterns.c  2018-07-03 09:03:39.834882009 +0100
> +++ gcc/tree-vect-patterns.c  2018-07-03 09:06:43.861330261 +0100
> @@ -150,13 +150,6 @@ append_pattern_def_seq (stmt_vec_info st
>   new_stmt);
>  }
>
> -static inline void
> -new_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt)
> -{
> -  STMT_VINFO_PATTERN_DEF_SEQ (stmt_info) = NULL;
> -  append_pattern_def_seq (stmt_info, stmt);
> -}
> -
>  /* The caller wants to perform new operations on vect_external variable
> VAR, so that the result of the operations would also be vect_external.
> Return the edge on which the operations can be performed, if one exists.
> @@ -983,7 +976,6 @@ vect_recog_dot_prod_pattern (stmt_vec_in
>  return NULL;
>
>/* Get the inputs in the appropriate types.  */
> -  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
>tree mult_oprnd[2];
>vect_convert_inputs (stmt_vinfo, 2, mult_oprnd, half_type,
>unprom0, half_vectype);
> @@ -1142,7 +1134,6 @@ vect_recog_sad_pattern (stmt_vec_info st
>  return NULL;
>
>/* Get the inputs to the SAD_EXPR in the appropriate types.  */
> -  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
>tree sad_oprnd[2];
>vect_convert_inputs (stmt_vinfo, 2, sad_oprnd, half_type,
>unprom, half_vectype);
> @@ -1220,7 +1211,6 @@ vect_recog_widen_op_pattern (stmt_vec_in
>if (!*type_out)
>  return NULL;
>
> -  STMT_VINFO_PATTERN_DEF_SEQ (last_stmt_info) = NULL;
>tree oprnd[2];
>vect_convert_inputs (last_stmt_info, 2, oprnd, half_type, unprom, vectype);
>
> @@ -1342,7 +1332,7 @@ vect_recog_pow_pattern (stmt_vec_info st
> return NULL;
>   tree def = vect_recog_temp_ssa_var (TREE_TYPE (base), NULL);
>   gimple *g = gimple_build_assign (def, MULT_EXPR, exp, logc);
> - new_pattern_def_seq (stmt_vinfo, g);
> + append_pattern_def_seq (stmt_vinfo, g);
>   tree res = vect_recog_temp_ssa_var (TREE_TYPE (base), NULL);
>   g = gimple_build_call (exp_decl, 1, def);
>   gimple_call_set_lhs (g, res);
> @@ -1687,7 +1677,6 @@ vect_recog_over_widening_pattern (stmt_v
>  }
>
>/* Calculate the rhs operands for an operation on NEW_TYPE.  */
> -  STMT_VINFO_PATTERN_DEF_SEQ (last_stmt_info) = NULL;
>tree ops[3] = {};
>for (unsigned int i = 1; i < first_op; ++i)
>  ops[i - 1] = gimple_op (last_stmt, i);
> @@ -2073,7 +2062,6 @@ vect_recog_rotate_pattern (stmt_vec_info
> def = rhs1;
>  }
>
> -  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
>if (def == NULL_TREE)
>  {
>def = vect_recog_temp_ssa_var (type, NULL);
> @@ -2269,7 +2257,7 @@ vect_recog_vector_vector_shift_pattern (
>   set_vinfo_for_stmt (def_stmt, new_stmt_info);
>   STMT_VINFO_VECTYPE (new_stmt_info)
> = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
> - new_pattern_def_seq (stmt_vinfo, def_stmt);
> + append_pattern_def_seq (stmt_vinfo, def_stmt);
> }
> }
>  }
> @@ -2278,7 +2266,7 @@ vect_recog_vector_vector_shift_pattern (
>  {
>def = vect_recog_temp_ssa_var (TREE_TYPE (oprnd0), NULL);
>def_stmt = gimple_build_assign (def, NOP_EXPR, oprnd1);
> -  new_pattern_def_seq (stmt_vinfo, def_stmt);
> +  append_pattern_def_seq (stmt_vinfo, def_stmt);
>  }
>
>/* Pattern detected.  */
> @@ -2472,7 +2460,6 @@ vect_synth_mult_by_constant (tree op, tr
>tree accumulator;
>
>/* Clear out the sequence of statements so we can populate it below.  */
> -  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
>gimple *stmt = NU

Re: Pass more vector types to append_pattern_def_seq

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 10:11 AM Richard Sandiford
 wrote:
>
> The PR85694 series added a vectype argument to append_pattern_def_seq.
> This patch makes more callers use it.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
> OK to install?

OK.

Richard.

> Richard
>
>
> 2018-07-03  Richard Sandiford  
>
> gcc/
> * tree-vect-patterns.c (vect_recog_rotate_pattern)
> (vect_recog_vector_vector_shift_pattern, vect_recog_divmod_pattern)
> (vect_recog_mixed_size_cond_pattern, adjust_bool_pattern_cast)
> (adjust_bool_pattern, vect_recog_bool_pattern): Pass the vector
> type to append_pattern_def_seq instead of creating a stmt_vec_info
> directly.
> (build_mask_conversion): Likewise.  Remove vinfo argument.
> (vect_add_conversion_to_patterm): Likewise, renaming to...
> (vect_add_conversion_to_pattern): ...this.
> (vect_recog_mask_conversion_pattern): Update call to
> build_mask_conversion.  Pass the vector type to
> append_pattern_def_seq here too.
> (vect_recog_gather_scatter_pattern): Update call to
> vect_add_conversion_to_pattern.
>
> Index: gcc/tree-vect-patterns.c
> ===
> --- gcc/tree-vect-patterns.c  2018-07-03 09:06:43.861330261 +0100
> +++ gcc/tree-vect-patterns.c  2018-07-03 09:09:41.627853962 +0100
> @@ -2090,7 +2090,6 @@ vect_recog_rotate_pattern (stmt_vec_info
>else
>  {
>tree vecstype = get_vectype_for_scalar_type (stype);
> -  stmt_vec_info def_stmt_vinfo;
>
>if (vecstype == NULL_TREE)
> return NULL;
> @@ -2103,12 +2102,7 @@ vect_recog_rotate_pattern (stmt_vec_info
>   gcc_assert (!new_bb);
> }
>else
> -   {
> - def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
> - set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
> - STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecstype;
> - append_pattern_def_seq (stmt_vinfo, def_stmt);
> -   }
> +   append_pattern_def_seq (stmt_vinfo, def_stmt, vecstype);
>
>def2 = vect_recog_temp_ssa_var (stype, NULL);
>tree mask = build_int_cst (stype, GET_MODE_PRECISION (smode) - 1);
> @@ -2121,12 +2115,7 @@ vect_recog_rotate_pattern (stmt_vec_info
>   gcc_assert (!new_bb);
> }
>else
> -   {
> - def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
> - set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
> - STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecstype;
> - append_pattern_def_seq (stmt_vinfo, def_stmt);
> -   }
> +   append_pattern_def_seq (stmt_vinfo, def_stmt, vecstype);
>  }
>
>var1 = vect_recog_temp_ssa_var (type, NULL);
> @@ -2252,12 +2241,8 @@ vect_recog_vector_vector_shift_pattern (
>TYPE_PRECISION (TREE_TYPE (oprnd1)));
>   def = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL);
>   def_stmt = gimple_build_assign (def, BIT_AND_EXPR, rhs1, mask);
> - stmt_vec_info new_stmt_info
> -   = new_stmt_vec_info (def_stmt, vinfo);
> - set_vinfo_for_stmt (def_stmt, new_stmt_info);
> - STMT_VINFO_VECTYPE (new_stmt_info)
> -   = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
> - append_pattern_def_seq (stmt_vinfo, def_stmt);
> + tree vecstype = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
> + append_pattern_def_seq (stmt_vinfo, def_stmt, vecstype);
> }
> }
>  }
> @@ -2688,11 +2673,9 @@ vect_recog_divmod_pattern (stmt_vec_info
>tree oprnd0, oprnd1, vectype, itype, cond;
>gimple *pattern_stmt, *def_stmt;
>enum tree_code rhs_code;
> -  vec_info *vinfo = stmt_vinfo->vinfo;
>optab optab;
>tree q;
>int dummy_int, prec;
> -  stmt_vec_info def_stmt_vinfo;
>
>if (!is_gimple_assign (last_stmt))
>  return NULL;
> @@ -2792,18 +2775,12 @@ vect_recog_divmod_pattern (stmt_vec_info
>   def_stmt = gimple_build_assign (var, COND_EXPR, cond,
>   build_int_cst (utype, -1),
>   build_int_cst (utype, 0));
> - def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
> - set_vinfo_for_stmt (def_stmt, def_stmt_vinfo);
> - STMT_VINFO_VECTYPE (def_stmt_vinfo) = vecutype;
> - append_pattern_def_seq (stmt_vinfo, def_stmt);
> + append_pattern_def_seq (stmt_vinfo, def_stmt, vecutype);
>   var = vect_recog_temp_ssa_var (utype, NULL);
>   def_stmt = gimple_build_assign (var, RSHIFT_EXPR,
>   gimple_assign_lhs (def_stmt),
>   shift);
> - def_stmt_vinfo = new_stmt_vec_info (def_stmt, vinfo);
> - set_vinfo_for_stmt (def_stmt, def_

Re: [PATCH] When using -fprofile-generate=/some/path mangle absolute path of file (PR lto/85759).

2018-07-03 Thread Jonathan Wakely

On 16/05/18 13:53 +0200, Martin Liška wrote:

On 12/21/2017 10:13 AM, Martin Liška wrote:

On 12/20/2017 06:45 PM, Jakub Jelinek wrote:

Another thing is that the "/" in there is wrong, so
  const char dir_separator_str[] = { DIR_SEPARATOR, '\0' };
  char *b = concat (profile_data_prefix, dir_separator_str, pwd, NULL);
needs to be used instead.
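
A minimal standalone sketch of the suggestion above, for readers who want to try it outside the GCC tree. It assumes a local stand-in for DIR_SEPARATOR (a POSIX '/'), and join_with_separator is a hypothetical helper; GCC itself would call concat () from libiberty as shown in the snippet.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for GCC's DIR_SEPARATOR macro (assumption: POSIX host).  */
#ifndef DIR_SEPARATOR
#define DIR_SEPARATOR '/'
#endif

/* Hypothetical helper: return a malloc'ed "PREFIX<sep>REST", or NULL on
   allocation failure.  The caller frees the result.  */
char *
join_with_separator (const char *prefix, const char *rest)
{
  /* A one-character string built from the separator macro, so no
     literal "/" is hard-coded.  */
  const char dir_separator_str[] = { DIR_SEPARATOR, '\0' };
  size_t len = strlen (prefix) + 1 + strlen (rest) + 1;
  char *result = malloc (len);
  if (result == NULL)
    return NULL;
  snprintf (result, len, "%s%s%s", prefix, dir_separator_str, rest);
  return result;
}
```

Calling join_with_separator ("/tmp/myfolder", "home") on a host where the separator is '/' yields "/tmp/myfolder/home".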


This looks much nicer, I forgot about DIR_SEPARATOR.


Does profile_data_prefix have any dir separators stripped from the end?


That's easy to achieve..


Is pwd guaranteed to be relative in this case?


.. however this is an absolute path, which would be problematic on a DOS based FS.
Maybe we should do the same path mangling as we do for purpose of gcov:

https://github.com/gcc-mirror/gcc/blob/master/gcc/gcov.c#L2424


Hi.

I decided to implement that. Which means for:

$ gcc -fprofile-generate=/tmp/myfolder empty.c -O2 && ./a.out

we get following file:
/tmp/myfolder/#home#marxin#Programming#testcases#tmp#empty.gcda

That guarantees we have a unique file path. As seen in the PR it
can produce a funny ICE.
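
The file name above can be reproduced with a small sketch. This is not the mangle_path from the patch: mangle_path_sketch is a hypothetical, simplified stand-in that only replaces directory separators with '#', whereas the real gcov mangling handles more cases.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the mangle_path () added by the
   patch: every '/' or '\\' in PATH becomes '#'.  Other special cases that
   the real gcov code handles are ignored here.  Returns a malloc'ed
   string, or NULL on allocation failure.  The caller frees the result.  */
char *
mangle_path_sketch (const char *path)
{
  size_t len = strlen (path);
  char *result = malloc (len + 1);
  if (result == NULL)
    return NULL;
  for (size_t i = 0; i < len; i++)
    result[i] = (path[i] == '/' || path[i] == '\\') ? '#' : path[i];
  result[len] = '\0';
  return result;
}
```

Passing "/home/marxin/Programming/testcases/tmp/empty.gcda" gives the "#home#marxin#Programming#testcases#tmp#empty.gcda" component seen in the example path.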

I've been testing the patch.
Ready after it finishes tests?

Martin



What do you think about it?
Regarding the string manipulation: I'm not an expert, but work with string in C
is for me always a pain :)

Martin






From 386a4561a4d1501e8959871791289e95f6a89af5 Mon Sep 17 00:00:00 2001

From: marxin 
Date: Wed, 16 Aug 2017 10:22:57 +0200
Subject: [PATCH] When using -fprofile-generate=/some/path mangle absolute path
of file (PR lto/85759).

gcc/ChangeLog:

2018-05-16  Martin Liska  

PR lto/85759
* coverage.c (coverage_init): Mangle full path name.
* doc/invoke.texi: Document the change.
* gcov-io.c (mangle_path): New.
* gcov-io.h (mangle_path): Likewise.
* gcov.c (mangle_name): Use mangle_path for path mangling.
---
gcc/coverage.c  | 20 ++--
gcc/doc/invoke.texi |  3 +++
gcc/gcov-io.c   | 49 +
gcc/gcov-io.h   |  1 +
gcc/gcov.c  | 37 +
5 files changed, 72 insertions(+), 38 deletions(-)

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 32ef298a11f..6e621c3ff96 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1227,8 +1227,24 @@ coverage_init (const char *filename)
g->get_passes ()->get_pass_profile ()->static_pass_number;
  g->get_dumps ()->dump_start (profile_pass_num, NULL);

-  if (!profile_data_prefix && !IS_ABSOLUTE_PATH (filename))
-profile_data_prefix = getpwd ();
+  if (!IS_ABSOLUTE_PATH (filename))
+{
+  /* When a profile_data_prefix is provided, then mangle full path
+of filename in order to prevent file path clashing.  */
+  if (profile_data_prefix)
+   {
+#if HAVE_DOS_BASED_FILE_SYSTEM
+ const char separator = "\\";


As mentioned on IRC, this is ill-formed due to the missing *


+#else
+ const char *separator = "/";
+#endif
+ filename = concat (getpwd (), separator, filename, NULL);
+ filename = mangle_path (filename);
+ len = strlen (filename);
+   }
+  else
+   profile_data_prefix = getpwd ();
+}

  if (profile_data_prefix)
prefix_len = strlen (profile_data_prefix);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ca3772bbebf..4859cec0ab5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11253,6 +11253,9 @@ and used by @option{-fprofile-use} and 
@option{-fbranch-probabilities}
and its related options.  Both absolute and relative paths can be used.
By default, GCC uses the current directory as @var{path}, thus the
profile data file appears in the same directory as the object file.
+In order to prevent filename clashing, if object file name is not an absolute
+path, we mangle absolute path of @file{@var{sourcename}.gcda} file and
+use it as file name of a @file{.gcda} file.


This is missing several definite articles, and is inconsistent about
using "filename" and "file name", i.e.

In order to prevent the file name clashing, if the object file name is
not an absolute path, we mangle the absolute path of the
@file{@var{sourcename}.gcda} file and use it as the file name of a
@file{.gcda} file. 



Re: [C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Jakub Jelinek
On Tue, Jul 03, 2018 at 11:34:51AM +0200, Richard Biener wrote:
> Can we make them DECL_ARTIFICIAL and/or make name-lookup never

They are DECL_ARTIFICIAL already.

> lookup DECL_ARTIFICIAL vars instead?

Not sure about that, will try to gather some statistics on how often we
rely on name-lookup of DECL_ARTIFICIALs.

Jakub


Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Segher Boessenkool
On Tue, Jul 03, 2018 at 10:53:20AM +0200, Martin Liška wrote:
> On 06/29/2018 09:04 PM, Jeff Law wrote:
> > I think this is fine for the trunk.
> > 
> > jeff
> 
> Thank you Jeff.
> 
> I found some issues when doing build of all targets (contrib/config-list.mk).
> I'll update patch and test that affected cross-compilers still produce same 
> output.
> 
> However I noticed one ppc64 issue:
> 
> $ cat -n gcc/config/powerpcspe/powerpcspe.c
> 
>   5401  /* Set branch target alignment, if not optimizing for size.  */
>   5402  if (!optimize_size)
>   5403    {
>   5404      /* Cell wants to be aligned 8byte for dual issue.  Titan wants to be
>   5405         aligned 8byte to avoid misprediction by the branch predictor.  */
>   5406      if (rs6000_cpu == PROCESSOR_TITAN
>   5407          || rs6000_cpu == PROCESSOR_CELL)
>   5408        {
>   5409          if (align_functions <= 0)
>   5410            align_functions = 8;
>   5411          if (align_jumps <= 0)
>   5412            align_jumps = 8;
>   5413          if (align_loops <= 0)
>   5414            align_loops = 8;
>   5415        }
>   5416      if (rs6000_align_branch_targets)
>   5417        {
>   5418          if (align_functions <= 0)
>   5419            align_functions = 16;
>   5420          if (align_jumps <= 0)
>   5421            align_jumps = 16;
>   5422          if (align_loops <= 0)
>   5423            {
>   5424              can_override_loop_align = 1;
>   5425              align_loops = 16;
>   5426            }
>   5427        }
>   5428      if (align_jumps_max_skip <= 0)
>   5429        align_jumps_max_skip = 15;
>   5430      if (align_loops_max_skip <= 0)
>   5431        align_loops_max_skip = 15;
> 
> Note that at line 5429 there's set of align_jumps_max_skip to 15 if not set 
> by default.
> At line 5412 align_jumps is set to 8, and align_jumps_max_skip should be 
> equal align_jumps - 1.
> That's a discrepancy. Segher can you please take a look?

This is powerpcspe, that's not mine.

But rs6000 has the same code, sure.  Why do you say "align_jumps_max_skip
should be equal align_jumps - 1"?  If that were true, why does it exist
at all?

toplev.c already has (in init_alignments):

  if (align_jumps_max_skip > align_jumps)
align_jumps_max_skip = align_jumps - 1;

so why would targets duplicate that logic?  (The target override is called
before init_alignments).
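
To make the effect of that clamp concrete, here is a small sketch of it as a standalone function. The function name is hypothetical; only the comparison and assignment are taken from the snippet quoted above.

```c
/* Sketch of the clamp quoted from init_alignments: if the skip limit
   exceeds the alignment, reduce it to ALIGN - 1.  The function name is
   hypothetical.  */
int
clamp_max_skip (int align, int max_skip)
{
  if (max_skip > align)
    max_skip = align - 1;
  return max_skip;
}
```

With the values from the listing, clamp_max_skip (8, 15) returns 7, while clamp_max_skip (16, 15) leaves the limit at 15.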


Segher


Re: [C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 11:43 AM Jakub Jelinek  wrote:
>
> On Tue, Jul 03, 2018 at 11:34:51AM +0200, Richard Biener wrote:
> > Can we make them DECL_ARTIFICIAL and/or make name-lookup never
>
> They are DECL_ARTIFICIAL already.
>
> > lookup DECL_ARTIFICIAL vars instead?
>
> Not sure about that, will try to gather some statistics on how often we
> rely on name-lookup of DECL_ARTIFICIALs.

Hmm, we might indeed.  At least we should make sure those
cases never have valid identifiers?  Or is the implementation
allowed to "hide" user identifiers in the implementation namespace?
Like the __for_begin one might hide a user defined variable.

Richard.

> Jakub


Re: Add support for dumping multiple dump files under one name

2018-07-03 Thread Andre Vieira (lists)
On 29/06/18 11:13, David Malcolm wrote:
> On Fri, 2018-06-29 at 10:15 +0200, Richard Biener wrote:
>> On Fri, 22 Jun 2018, Jan Hubicka wrote:
>>
>>> Hi,
>>> this patch adds dumpfile support for dumps that come in multiple
>>> parts.  This
>>> is needed for WPA stream-out dump since we stream partitions in
>>> parallel and
>>> the dumps would come up in random order.  Parts are added by new
>>> parameter that
>>> is initialized by default to -1 (no parts). 
>>>
>>> One thing I skipped is any support for duplicated opening of file
>>> with parts since I do not need it.
>>>
>>> Bootstrapped/regtested x86_64-linux, OK?
>>
>> Looks reasonable - David, anything you want to add / have changed?
> 
> No worries from my side; I don't think it interacts with the
> optimization records stuff I'm working on - presumably this is just for
> dumping the WPA stream-out, rather than for dumping specific
> optimizations.
> 
> [...snip...]
> 
> Dave
> 

Hi David,

I believe r262245 is causing the following failures on aarch64 and arm:

FAIL: gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-dump
vect "note: Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
cancelled: can use load/store-lanes"

Could you please have a look?

Cheers,
Andre



Re: [14/n] PR85694: Rework overwidening detection

2018-07-03 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Jun 29, 2018 at 1:36 PM Richard Sandiford
>  wrote:
>>
>> Richard Sandiford  writes:
>> > This patch is the main part of PR85694.  The aim is to recognise at least:
>> >
>> >   signed char *a, *b, *c;
>> >   ...
>> >   for (int i = 0; i < 2048; i++)
>> > c[i] = (a[i] + b[i]) >> 1;
>> >
>> > as an over-widening pattern, since the addition and shift can be done
>> > on shorts rather than ints.  However, it ended up being a lot more
>> > general than that.
>> >
>> > The current over-widening pattern detection is limited to a few simple
>> > cases: logical ops with immediate second operands, and shifts by a
>> > constant.  These cases are enough for common pixel-format conversion
>> > and can be detected in a peephole way.
>> >
>> > The loop above requires two generalisations of the current code: support
>> > for addition as well as logical ops, and support for non-constant second
>> > operands.  These are harder to detect in the same peephole way, so the
>> > patch tries to take a more global approach.
>> >
>> > The idea is to get information about the minimum operation width
>> > in two ways:
>> >
>> > (1) by using the range information attached to the SSA_NAMEs
>> > (effectively a forward walk, since the range info is
>> > context-independent).
>> >
>> > (2) by back-propagating the number of output bits required by
>> > users of the result.
>> >
>> > As explained in the comments, there's a balance to be struck between
>> > narrowing an individual operation and fitting in with the surrounding
>> > code.  The approach is pretty conservative: if we could narrow an
>> > operation to N bits without changing its semantics, it's OK to do that if:
>> >
>> > - no operations later in the chain require more than N bits; or
>> >
>> > - all internally-defined inputs are extended from N bits or fewer,
>> >   and at least one of them is single-use.
>> >
>> > See the comments for the rationale.
>> >
>> > I didn't bother adding STMT_VINFO_* wrappers for the new fields
>> > since the code seemed more readable without.
>> >
>> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Here's a version rebased on top of current trunk.  Changes from last time:
>>
>> - reintroduce dump_generic_expr_loc, with the obvious change to the
>>   prototype
>>
>> - fix a typo in a comment
>>
>> - use vect_element_precision from the new version of 12/n.
>>
>> Tested as before.  OK to install?
>
> OK.
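
The claim in the quoted description, that the addition and shift in "c[i] = (a[i] + b[i]) >> 1" can be done on shorts rather than ints, can be checked exhaustively in plain C. The sketch below is only an illustration with hypothetical function names, not code from the patch. It assumes 8-bit signed char and relies on the implementation-defined right shift of negative values behaving the same way in both versions (GCC uses an arithmetic shift).

```c
/* Compute (a + b) >> 1 with the addition and shift done in int, as C's
   integer promotions specify.  */
signed char
avg_in_int (signed char a, signed char b)
{
  int sum = a + b;
  return (signed char) (sum >> 1);
}

/* The same computation with the intermediate values held in a short.
   The sum of two signed chars lies in [-256, 254], so it fits in
   16 bits.  */
signed char
avg_in_short (signed char a, signed char b)
{
  short sum = (short) (a + b);
  short shifted = (short) (sum >> 1);
  return (signed char) shifted;
}

/* Return 1 if both versions agree for every pair of signed char
   inputs, 0 otherwise.  */
int
narrowing_is_safe (void)
{
  for (int a = -128; a <= 127; a++)
    for (int b = -128; b <= 127; b++)
      if (avg_in_int ((signed char) a, (signed char) b)
          != avg_in_short ((signed char) a, (signed char) b))
        return 0;
  return 1;
}
```

narrowing_is_safe () covers all 65536 input pairs, which is the property the vectorizer needs before it can use the narrower type.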

Thanks.  For the record, here's what I installed (updated on top of
Dave's recent patch, and with an obvious fix to vect-widen-mult-u8-u32.c).

Richard


2018-07-03  Richard Sandiford  

gcc/
* poly-int.h (print_hex): New function.
* dumpfile.h (dump_dec, dump_hex): Declare.
* dumpfile.c (dump_dec, dump_hex): New poly_wide_int functions.
* tree-vectorizer.h (_stmt_vec_info): Add min_output_precision,
min_input_precision, operation_precision and operation_sign.
* tree-vect-patterns.c (vect_get_range_info): New function.
(vect_same_loop_or_bb_p, vect_single_imm_use)
(vect_operation_fits_smaller_type): Delete.
(vect_look_through_possible_promotion): Add an optional
single_use_p parameter.
(vect_recog_over_widening_pattern): Rewrite to use new
stmt_vec_info information.  Handle one operation at a time.
(vect_recog_cast_forwprop_pattern, vect_narrowable_type_p)
(vect_truncatable_operation_p, vect_set_operation_type)
(vect_set_min_input_precision): New functions.
(vect_determine_min_output_precision_1): Likewise.
(vect_determine_min_output_precision): Likewise.
(vect_determine_precisions_from_range): Likewise.
(vect_determine_precisions_from_users): Likewise.
(vect_determine_stmt_precisions, vect_determine_precisions): Likewise.
(vect_vect_recog_func_ptrs): Put over_widening first.
Add cast_forwprop.
(vect_pattern_recog): Call vect_determine_precisions.

gcc/testsuite/
* gcc.dg/vect/vect-widen-mult-u8-u32.c: Check specifically for a
widen_mult pattern.
* gcc.dg/vect/vect-over-widen-1.c: Update the scan tests for new
over-widening messages.
* gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-2.c: Likewise.
* gcc.dg/vect/vect-over-widen-2-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-3.c: Likewise.
* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
* gcc.dg/vect/vect-over-widen-4.c: Likewise.
* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
* gcc.dg/vect/bb-slp-over-widen-1.c: New test.
* gcc.dg/vect/bb-slp-over-widen-2.c: Likewise.
* gcc.dg/vect/vect-over-widen-5.c: Likewise.
* gcc.dg/vect/vect-over-widen-6.c: Likewise.
* gcc.dg/vect/vect-over-widen-7.c: Likewise.
* gcc.dg/vect/vect-over-widen-8.c: Likewise.
* gcc

Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Martin Liška
On 07/03/2018 11:55 AM, Segher Boessenkool wrote:
> On Tue, Jul 03, 2018 at 10:53:20AM +0200, Martin Liška wrote:
>> On 06/29/2018 09:04 PM, Jeff Law wrote:
>>> I think this is fine for the trunk.
>>>
>>> jeff
>>
>> Thank you Jeff.
>>
>> I found some issues when doing build of all targets (contrib/config-list.mk).
>> I'll update patch and test that affected cross-compilers still produce same 
>> output.
>>
>> However I noticed one ppc64 issue:
>>
>> $ cat -n gcc/config/powerpcspe/powerpcspe.c
>>
>>   5401  /* Set branch target alignment, if not optimizing for size.  */
>>   5402  if (!optimize_size)
>>   5403    {
>>   5404      /* Cell wants to be aligned 8byte for dual issue.  Titan wants to be
>>   5405         aligned 8byte to avoid misprediction by the branch predictor.  */
>>   5406      if (rs6000_cpu == PROCESSOR_TITAN
>>   5407          || rs6000_cpu == PROCESSOR_CELL)
>>   5408        {
>>   5409          if (align_functions <= 0)
>>   5410            align_functions = 8;
>>   5411          if (align_jumps <= 0)
>>   5412            align_jumps = 8;
>>   5413          if (align_loops <= 0)
>>   5414            align_loops = 8;
>>   5415        }
>>   5416      if (rs6000_align_branch_targets)
>>   5417        {
>>   5418          if (align_functions <= 0)
>>   5419            align_functions = 16;
>>   5420          if (align_jumps <= 0)
>>   5421            align_jumps = 16;
>>   5422          if (align_loops <= 0)
>>   5423            {
>>   5424              can_override_loop_align = 1;
>>   5425              align_loops = 16;
>>   5426            }
>>   5427        }
>>   5428      if (align_jumps_max_skip <= 0)
>>   5429        align_jumps_max_skip = 15;
>>   5430      if (align_loops_max_skip <= 0)
>>   5431        align_loops_max_skip = 15;
>>
>> Note that at line 5429 there's set of align_jumps_max_skip to 15 if not set 
>> by default.
>> At line 5412 align_jumps is set to 8, and align_jumps_max_skip should be 
>> equal align_jumps - 1.
>> That's a discrepancy. Segher can you please take a look?
> 
> This is powerpcspe, that's not mine.
> 
> But rs6000 has the same code, sure.

Right, that's why I wrote to you.

> Why do you say "align_jumps_max_skip
> should be equal align_jumps - 1"?  If that were true, why does it exist
> at all?
> 
> toplev.c already has (in init_alignments):
> 
>   if (align_jumps_max_skip > align_jumps)
> align_jumps_max_skip = align_jumps - 1;

I'm rewriting this logic in the patch set. The issue is that the
check of align_jumps_max_skip is done in
rs6000_option_override_internal, which runs before
align_jumps_max_skip is parsed.

That said, 'align_jumps_max_skip <= 0' is always true.

Martin

> 
> so why would targets duplicate that logic?  (The target override is called
> before init_alignments).
> 
> 
> Segher
> 



Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Segher Boessenkool
On Tue, Jul 03, 2018 at 12:15:48PM +0200, Martin Liška wrote:
> > toplev.c already has (in init_alignments):
> > 
> >   if (align_jumps_max_skip > align_jumps)
> > align_jumps_max_skip = align_jumps - 1;
> 
> I'm rewriting this logic in the patch set. Issue is that 
> checking for value of align_jumps_max_skip is done
> in rs6000_option_override_internal, which is place before
> align_jumps_max_skip is parsed.
> 
> That said, 'align_jumps_max_skip <= 0' is always true.

It's not clear to me what you want me to do.

You should write your patch so that the end result behaves the same as
before, on all targets.  If that requires changing (or at least checking)
all targets, then you have a lot of work to do.

If you think the rs6000 backend is doing something wrong, please say
what exactly?  I don't see it.

Still confused,


Segher


[PATCH] Remove powerpc-linux_paired from config-list.mk

2018-07-03 Thread Segher Boessenkool
The target has been removed, so we shouldn't try to build it.


Segher


2018-07-03  Segher Boessenkool  

* contrib/config-list.mk: Remove powerpc-linux_paired.

---
 contrib/config-list.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index d04aca2..c3537d2 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -75,7 +75,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   powerpc-eabispe powerpc-eabisimaltivec powerpc-eabisim ppc-elf \
   powerpc-eabialtivec powerpc-xilinx-eabi powerpc-eabi \
   powerpc-rtems powerpc-linux_spe \
-  powerpc-linux_paired powerpc64-linux_altivec \
+  powerpc64-linux_altivec \
   powerpc-wrs-vxworks powerpc-wrs-vxworksae powerpc-wrs-vxworksmils \
   powerpc-lynxos powerpcle-elf \
   powerpcle-eabisim powerpcle-eabi \
-- 
1.8.3.1



Re: [C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Jakub Jelinek
On Tue, Jul 03, 2018 at 11:58:31AM +0200, Richard Biener wrote:
> On Tue, Jul 3, 2018 at 11:43 AM Jakub Jelinek  wrote:
> >
> > On Tue, Jul 03, 2018 at 11:34:51AM +0200, Richard Biener wrote:
> > > Can we make them DECL_ARTIFICIAL and/or make name-lookup never
> >
> > They are DECL_ARTIFICIAL already.
> >
> > > lookup DECL_ARTIFICIAL vars instead?
> >
> > Not sure about that, will try to gather some statistics on how often we
> > rely on name-lookup of DECL_ARTIFICIALs.
> 
> Hmm, we might indeed.  At least we should make sure those
> cases never have valid identifiers?  Or is the implementation

At least __FUNCTION__, __PRETTY_FUNCTION__, __func__ are all local
VAR_DECLs with DECL_ARTIFICIAL that need to be found by name lookup
(so far gathered stats just show __FUNCTION__ in lookup_name_real_1).

Jakub


Re: [C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Jakub Jelinek
On Tue, Jul 03, 2018 at 01:24:28PM +0200, Jakub Jelinek wrote:
> On Tue, Jul 03, 2018 at 11:58:31AM +0200, Richard Biener wrote:
> > On Tue, Jul 3, 2018 at 11:43 AM Jakub Jelinek  wrote:
> > >
> > > On Tue, Jul 03, 2018 at 11:34:51AM +0200, Richard Biener wrote:
> > > > Can we make them DECL_ARTIFICIAL and/or make name-lookup never
> > >
> > > They are DECL_ARTIFICIAL already.
> > >
> > > > lookup DECL_ARTIFICIAL vars instead?
> > >
> > > Not sure about that, will try to gather some statistics on how often we
> > > rely on name-lookup of DECL_ARTIFICIALs.
> > 
> > Hmm, we might indeed.  At least we should make sure those
> > cases never have valid identifiers?  Or is the implementation
> 
> At least __FUNCTION__, __PRETTY_FUNCTION__, __func__ are all local
> VAR_DECLs with DECL_ARTIFICIAL that need to be found by name lookup
> (so far gathered stats just show __FUNCTION__ in lookup_name_real_1).

And it isn't limited to that, omp_priv/omp_orig/omp_in/omp_out too (the
OpenMP UDR artificial vars), also variables captured in lambdas,
anon union fields, ...
So I'm afraid it is not possible to ignore DECL_ARTIFICIAL VAR_DECLs in name
lookup and generally it is ok that they have user accessible names.
Just the range-for vars are a special case.

Jakub


Re: extract_range_from_binary* cleanups for VRP

2018-07-03 Thread Martin Liška
Hi.

It caused UBSAN errors:

$ cat ubsan.i
int a;
void d() { int c, b = 8 - a; }

$ /home/marxin/Programming/gcc2/objdir/./gcc/xgcc 
-B/home/marxin/Programming/gcc2/objdir/./gcc/ ubsan.i -c -O2
../../gcc/tree-vrp.c:1715:26: runtime error: load of value 255, which is not a 
valid value for type 'bool'
#0 0x3246ca2 in extract_range_from_binary_expr_1(value_range*, tree_code, 
tree_node*, value_range*, value_range*) ../../gcc/tree-vrp.c:1715
#1 0x34aa8b6 in vr_values::extract_range_from_binary_expr(value_range*, 
tree_code, tree_node*, tree_node*, tree_node*) ../../gcc/vr-values.c:794
#2 0x34b45fa in vr_values::extract_range_from_assignment(value_range*, 
gassign*) ../../gcc/vr-values.c:1455
#3 0x494cfd5 in evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool) 
../../gcc/gimple-ssa-evrp-analyze.c:293
#4 0x4942548 in evrp_dom_walker::before_dom_children(basic_block_def*) 
../../gcc/gimple-ssa-evrp.c:139
#5 0x487652b in dom_walker::walk(basic_block_def*) ../../gcc/domwalk.c:353
#6 0x49470f9 in execute_early_vrp ../../gcc/gimple-ssa-evrp.c:310
#7 0x49470f9 in execute ../../gcc/gimple-ssa-evrp.c:347
#8 0x1fc4a0e in execute_one_pass(opt_pass*) ../../gcc/passes.c:2446
#9 0x1fc8b47 in execute_pass_list_1 ../../gcc/passes.c:2535
#10 0x1fc8b8e in execute_pass_list_1 ../../gcc/passes.c:2536
#11 0x1fc8c68 in execute_pass_list(function*, opt_pass*) 
../../gcc/passes.c:2546
#12 0x2004c85 in do_per_function_toporder(void (*)(function*, void*), 
void*) ../../gcc/passes.c:1688
#13 0x2005e9a in execute_ipa_pass_list(opt_pass*) ../../gcc/passes.c:2894
#14 0xfcfa79 in ipa_passes ../../gcc/cgraphunit.c:2400
#15 0xfcfa79 in symbol_table::compile() ../../gcc/cgraphunit.c:2536
#16 0xfdc52a in symbol_table::finalize_compilation_unit() 
../../gcc/cgraphunit.c:2696
#17 0x25115e4 in compile_file ../../gcc/toplev.c:479
#18 0x9278af in do_compile ../../gcc/toplev.c:2086
#19 0x9278af in toplev::main(int, char**) ../../gcc/toplev.c:2221
#20 0x92a79a in main ../../gcc/main.c:39
#21 0x7659c11a in __libc_start_main ../csu/libc-start.c:308
#22 0x92a8c9 in _start 
(/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x92a8c9)

It's because neg_min_op0, or any other from:
  bool neg_min_op0, neg_min_op1, neg_max_op0, neg_max_op1;

Martin


Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Martin Liška
On 07/03/2018 12:58 PM, Segher Boessenkool wrote:
> On Tue, Jul 03, 2018 at 12:15:48PM +0200, Martin Liška wrote:
>>> toplev.c already has (in init_alignments):
>>>
>>>   if (align_jumps_max_skip > align_jumps)
>>> align_jumps_max_skip = align_jumps - 1;
>>
>> I'm rewriting this logic in the patch set. Issue is that 
>> checking for value of align_jumps_max_skip is done
>> in rs6000_option_override_internal, which is place before
>> align_jumps_max_skip is parsed.
>>
>> That said, 'align_jumps_max_skip <= 0' is always true.
> 
> It's not clear to me what you want me to do.
> 
> You should write your patch so that the end result behaves the same as
> before, on all targets.  If that requires changing (or at least checking)
> all targets, then you have a lot of work to do.
> 
> If you think the rs6000 backend is doing something wrong, please say
> what exactly?  I don't see it.

Uf, it's quite complicated I would say.
So first, I believe that for all of -falign-{labels,loops,jumps} we don't
handle the value of the argument properly. More precisely, for a value of N
that is not a power of 2, we don't respect max_skip and we generate alignment
to M, where M is the next bigger power of 2. Example:

$ gcc /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c 
-O2 -falign-labels=1025 -c -S  -o /dev/stdout | grep align | sort | uniq -c
  1 .align 32
132 .p2align 11
  7 .p2align 4,,15

2^11 == 2048, but I would expect '.p2align 11,,1024' to be generated. That's 
what you get for function alignment:

$ gcc /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c 
-O2 -falign-functions=1025 -c -S  -o /dev/stdout | grep align | sort | uniq -c
  1 .align 32
  7 .p2align 11,,1024
 55 .p2align 3
 48 .p2align 4,,10

Do I understand correctly that it's broken?

On powerpc, because align_jumps_max_skip is set to 15, we see an
inconsistency like:

./xgcc -B. 
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c -O2 
-falign-jumps=14 -c -S  -o /dev/stdout | grep align | sort | uniq -c
...
 27 .p2align 4,,13
...

which is correct.

but:

./xgcc -B. 
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c -O2 
-falign-jumps=1025 -c -S  -o /dev/stdout | grep align | sort | uniq -c
...
 27 .p2align 11,,15
...

Here 11,,15 is a completely broken value.
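
The operands expected in the examples above follow a simple rule: align to the next power of two at or above N, and skip at most N - 1 bytes. A minimal sketch of that rule follows; the function names are hypothetical and this is not the code GCC uses to emit the directive.

```c
#include <stdio.h>

/* Smallest LOG such that (1 << LOG) >= N, for 1 <= N <= 2^31.  */
int
ceil_log2_sketch (unsigned n)
{
  int log = 0;
  while (log < 31 && (1u << log) < n)
    log++;
  return log;
}

/* Write into BUF the directive expected for an alignment request of N
   bytes (N >= 1): align to the next power of two, skipping at most
   N - 1 bytes.  Returns the snprintf result.  */
int
format_p2align (char *buf, size_t size, unsigned n)
{
  return snprintf (buf, size, ".p2align %d,,%u",
                   ceil_log2_sketch (n), n - 1);
}
```

For N = 14 this produces ".p2align 4,,13", and for N = 1025 it produces ".p2align 11,,1024", matching the values called correct above.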

Martin


> 
> Still confused,
> 
> 
> Segher
> 



Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Segher Boessenkool
On Tue, Jul 03, 2018 at 02:51:27PM +0200, Martin Liška wrote:
> On 07/03/2018 12:58 PM, Segher Boessenkool wrote:
> > On Tue, Jul 03, 2018 at 12:15:48PM +0200, Martin Liška wrote:
> >>> toplev.c already has (in init_alignments):
> >>>
> >>>   if (align_jumps_max_skip > align_jumps)
> >>> align_jumps_max_skip = align_jumps - 1;
> >>
> >> I'm rewriting this logic in the patch set. Issue is that 
> >> checking for value of align_jumps_max_skip is done
> >> in rs6000_option_override_internal, which is place before
> >> align_jumps_max_skip is parsed.
> >>
> >> That said, 'align_jumps_max_skip <= 0' is always true.
> > 
> > It's not clear to me what you want me to do.
> > 
> > You should write your patch so that the end result behaves the same as
> > before, on all targets.  If that requires changing (or at least checking)
> > all targets, then you have a lot of work to do.
> > 
> > If you think the rs6000 backend is doing something wrong, please say
> > what exactly?  I don't see it.
> 
> Uf, it's quite complicated I would say.
> So first I believe for all -falign-{labels,loops,jumps} we don't handle 
> properly
> value of the argument. More precisely for a value of N (not power of 2),
> we don't respect max_skip and we generate alignment to M, where M is first 
> bigger
> power of 2 number. Example:
> 
> $ gcc 
> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c -O2 
> -falign-labels=1025 -c -S  -o /dev/stdout | grep align | sort | uniq -c
>   1   .align 32
> 132   .p2align 11
>   7   .p2align 4,,15
> 
> 2^11 == 2048, but I would expect '.p2align 11,,1024' to be generated. That's 
> what you get for function alignment:
> 
> $ gcc 
> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c -O2 
> -falign-functions=1025 -c -S  -o /dev/stdout | grep align | sort | uniq -c
>   1   .align 32
>   7   .p2align 11,,1024
>  55   .p2align 3
>  48   .p2align 4,,10
> 
> Do I understand that correctly that it's broken?

Yes, this behaviour contradicts our documentation:

'-falign-labels=N'
 Align all branch targets to a power-of-two boundary, skipping up to
 N bytes like '-falign-functions'.

'-falign-functions=N'
 Align the start of functions to the next power-of-two greater than
 N, skipping up to N bytes.  For instance, '-falign-functions=32'
 aligns functions to the next 32-byte boundary, but
 '-falign-functions=24' aligns to the next 32-byte boundary only if
 this can be done by skipping 23 bytes or less.
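
To make the documented rule concrete, here is a minimal sketch of the
operands one would expect in the emitted .p2align directive for a given N.
It is not GCC code: the struct and both functions are hypothetical names
made up for illustration, and it assumes N is small enough that the shift
cannot overflow (GCC caps these options well below that).

```cpp
#include <cstdio>
#include <string>

// Hypothetical model of the documented -falign-functions=N rule:
// align to the next power of two >= N, skipping at most N - 1 bytes.
struct p2align_operands
{
  unsigned log;       // log2 of the alignment boundary
  unsigned max_skip;  // maximum number of padding bytes allowed
};

inline p2align_operands
expected_p2align (unsigned n)
{
  unsigned log = 0;
  // Find the smallest power of two that is not below N.
  while ((1u << log) < n)
    ++log;
  // Per the documentation, at most N - 1 bytes may be skipped.
  unsigned max_skip = n ? n - 1 : 0;
  return { log, max_skip };
}

// Render the operands the way they appear in the assembler output.
inline std::string
format_p2align (unsigned n)
{
  p2align_operands ops = expected_p2align (n);
  char buf[64];
  std::snprintf (buf, sizeof buf, ".p2align %u,,%u", ops.log, ops.max_skip);
  return buf;
}
```

With this model, N = 1025 gives ".p2align 11,,1024", which is what the
-falign-functions run above prints, and N = 14 gives ".p2align 4,,13".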

> On powerpc, because align_jumps_max_skip is set to 15, we see an
> inconsistency like:
> 
> ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c \
>     -O2 -falign-jumps=14 -c -S -o /dev/stdout | grep align | sort | uniq -c
> ...
>  27   .p2align 4,,13
> ...
> 
> which is correct.
> 
> but:
> 
> ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c \
>     -O2 -falign-jumps=1025 -c -S -o /dev/stdout | grep align | sort | uniq -c
> ...
>  27   .p2align 11,,15
> ...
> 
> Here 11,,15 is a completely broken value.

Yup.

This is specific to align-jumps...  Not many people ever change that :-)


Segher


Re: [C++ PATCH] Hide __for_{range,begin,end} symbols (PR c++/85515)

2018-07-03 Thread Nathan Sidwell

On 07/03/2018 08:15 AM, Jakub Jelinek wrote:


And it isn't limited to that, omp_priv/omp_orig/omp_in/omp_out too (the
OpenMP UDR artificial vars), also variables captured in lambdas,
anon union fields, ...
So I'm afraid it is not possible to ignore DECL_ARTIFICIAL VAR_DECLs in name
lookup and generally it is ok that they have user accessible names.
Only the range-for variables are a special case.


Yeah, I'd rather not have DECL_ARTIFICIAL be significant in name lookup.
An unspellable name would be better, which is what we do for, e.g., ctors.


nathan

--
Nathan Sidwell


C++ PATCH for c++/86190, bogus -Wsign-conversion warning

2018-07-03 Thread Marek Polacek
This PR complains about a bogus -Wsign-conversion warning even with an
explicit static_cast.  It started with this hunk from the delayed folding
merge:

@@ -5028,20 +5022,12 @@ cp_build_binary_op (location_t location,
 
   if (short_compare)
{
- /* Don't write &op0, etc., because that would prevent op0
-from being kept in a register.
-Instead, make copies of the our local variables and
-pass the copies by reference, then copy them back afterward.  */
- tree xop0 = op0, xop1 = op1, xresult_type = result_type;
+ /* We call shorten_compare only for diagnostic-reason.  */
+ tree xop0 = fold_simple (op0), xop1 = fold_simple (op1),
+  xresult_type = result_type;
  enum tree_code xresultcode = resultcode;
- tree val
-   = shorten_compare (location, &xop0, &xop1, &xresult_type,
+ shorten_compare (location, &xop0, &xop1, &xresult_type,
   &xresultcode);
- if (val != 0)
-   return cp_convert (boolean_type_node, val, complain);
- op0 = xop0, op1 = xop1;
- converted = 1;
- resultcode = xresultcode;
}
 
   if ((short_compare || code == MIN_EXPR || code == MAX_EXPR)

which means that converted is now unset so we go to

  if (! converted)
    {
      if (TREE_TYPE (op0) != result_type)
        op0 = cp_convert_and_check (result_type, op0, complain);
      if (TREE_TYPE (op1) != result_type)
        op1 = cp_convert_and_check (result_type, op1, complain);

and cp_convert_and_check gives those warnings.  The direct comparison
of types instead of same_type_p means we can try to convert same types,
but changing that still wouldn't fix this PR.  What we should probably do
is simply disable -Wsign-conversion for comparisons, because
-Wsign-compare will warn for those.  With this patch, the C++ FE will
follow what the C FE and clang++ do.
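
For readers unfamiliar with the idiom, the patch relies on an RAII
sentinel to switch the warning off for the duration of a scope.  The
sketch below is only a simplified model of that idea, not the GCC class:
flag_sentinel and flag_active_inside are made-up names, and the flag is a
plain int standing in for warn_sign_conversion.

```cpp
// Simplified, hypothetical model of an RAII warning sentinel.
// It remembers the current value of a warning flag, optionally clears
// it, and restores the remembered value when the scope ends.
struct flag_sentinel
{
  int &flag;   // the warning flag being guarded
  int saved;   // value of the flag on entry

  flag_sentinel (int &f, bool suppress = true)
    : flag (f), saved (f)
  {
    if (suppress)
      flag = 0;
  }

  ~flag_sentinel () { flag = saved; }

  flag_sentinel (const flag_sentinel &) = delete;
  flag_sentinel &operator= (const flag_sentinel &) = delete;
};

// Report whether the flag is still active while a sentinel is alive.
inline bool
flag_active_inside (int &flag, bool suppress)
{
  flag_sentinel s (flag, suppress);
  return flag != 0;
}
```

Passing a condition as the second argument, as the patch does with
short_compare, means the flag is only cleared for comparisons and is left
alone for every other binary operation.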

Also fix some formatting that's been bothering me, while at it.

Bootstrapped/regtested on x86_64-linux, ok for trunk/8?

2018-07-03  Marek Polacek  

PR c++/86190 - bogus -Wsign-conversion warning
* typeck.c (cp_build_binary_op): Fix formatting.  Add a warning
sentinel.

* g++.dg/warn/Wsign-conversion-3.C: New test.
* g++.dg/warn/Wsign-conversion-4.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 3a4f1cdf479..cfd1dd8b150 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -5311,12 +5311,13 @@ cp_build_binary_op (location_t location,
 
   if (short_compare)
{
- /* We call shorten_compare only for diagnostic-reason.  */
- tree xop0 = fold_simple (op0), xop1 = fold_simple (op1),
-  xresult_type = result_type;
+ /* We call shorten_compare only for diagnostics.  */
+ tree xop0 = fold_simple (op0);
+ tree xop1 = fold_simple (op1);
+ tree xresult_type = result_type;
  enum tree_code xresultcode = resultcode;
  shorten_compare (location, &xop0, &xop1, &xresult_type,
-  &xresultcode);
+  &xresultcode);
}
 
   if ((short_compare || code == MIN_EXPR || code == MAX_EXPR)
@@ -5349,6 +5350,7 @@ cp_build_binary_op (location_t location,
  otherwise, it will be given type RESULT_TYPE.  */
   if (! converted)
 {
+  warning_sentinel w (warn_sign_conversion, short_compare);
   if (TREE_TYPE (op0) != result_type)
op0 = cp_convert_and_check (result_type, op0, complain);
   if (TREE_TYPE (op1) != result_type)
diff --git gcc/testsuite/g++.dg/warn/Wsign-conversion-3.C gcc/testsuite/g++.dg/warn/Wsign-conversion-3.C
index e69de29bb2d..2c3fef31475 100644
--- gcc/testsuite/g++.dg/warn/Wsign-conversion-3.C
+++ gcc/testsuite/g++.dg/warn/Wsign-conversion-3.C
@@ -0,0 +1,13 @@
+// PR c++/86190
+// { dg-options "-Wsign-conversion -Wsign-compare" }
+
+typedef unsigned long sz_t;
+sz_t s();
+bool f(int i) { return s() < (unsigned long) i; }
+bool f2(int i) { return s() < static_cast<unsigned long>(i); }
+bool f3(int i) { return s() < i; } // { dg-warning "comparison of integer expressions of different signedness" }
+bool f4(int i) { return s() < (long) i; } // { dg-warning "comparison of integer expressions of different signedness" }
+bool f5(short int i) { return s() < (int) i; } // { dg-warning "comparison of integer expressions of different signedness" }
+bool f6(signed char i) { return s() < (int) i; } // { dg-warning "comparison of integer expressions of different signedness" }
+bool f7(unsigned char i) { return s() < i; }
+bool f8(signed char i) { return s() < i; } // { dg-warning "comparison of integer expressions of different signedness" }
diff --git gcc/testsuite/g++.dg/warn/Wsign-conversion-4.C gcc/testsuite/g++.dg/warn/Wsign-conversion-4.C
index e69de29bb2d..40814b95587 100644
--- gcc/testsuite/g++.dg/warn/Wsign-conversion-4.C
+++ gcc/testsuite/g++.dg/warn/Wsign-conversion-4.C
@@ -0,0 +1,14 @@
+// PR c++/86190
+// { dg-options "-Wsign-conversion -Ws

Re: Invert sense of NO_IMPLICIT_EXTERN_C

2018-07-03 Thread Nathan Sidwell
Could a global reviewer comment?  This touches a lot of target-specific
config files.  David has kindly checked that AIX, the one target known to
need the functionality, is OK.


https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01568.html

nathan

On 06/25/2018 12:48 PM, Nathan Sidwell wrote:
NO_IMPLICIT_EXTERN_C was introduced to tell the compiler that it didn't 
need to fake up 'extern "C" { ... }' around system header files.  Over 
the years more and more system headers have become C++-aware, leading to 
more targets defining this macro.


Unfortunately because of the sense of this macro, and that the 
requirement is based on the target-OS, whereas we partition the config 
directory by target-ARCH, it's become hard to know which targets still 
require the older functionality.


There have been a few attempts over the past 2 decades to figure this
out, but they didn't progress.


This patch replaces the negative NO_IMPLICIT_EXTERN_C with the positive
SYSTEM_IMPLICIT_EXTERN_C.  Targets that previously did not define
NO_IMPLICIT_EXTERN_C now need to define SYSTEM_IMPLICIT_EXTERN_C.  I
know of one such target -- AIX, and I'd be grateful if this patch could be
tried there.
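
As a rough illustration of what flipping the sense means for a target
that defines neither macro, consider the sketch below.  The two macro
names are the ones from this patch; the two functions are hypothetical
and only model how a missing definition is interpreted, not how the
compiler actually tests the macros.

```cpp
// Old, negative sense: the implicit 'extern "C"' wrapping is on by
// default, and a target has to opt out by defining the macro.
inline bool
implicit_extern_c_old_sense ()
{
#ifndef NO_IMPLICIT_EXTERN_C
  return true;
#else
  return false;
#endif
}

// New, positive sense: the wrapping is off by default, and only a
// target that still needs it defines the macro.
inline bool
implicit_extern_c_new_sense ()
{
#ifdef SYSTEM_IMPLICIT_EXTERN_C
  return true;
#else
  return false;
#endif
}
```

With neither macro defined, the first function reports the wrapping as
enabled and the second as disabled, so under the new scheme a grep for
SYSTEM_IMPLICIT_EXTERN_C lists exactly the targets that still depend on
the old behaviour.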


Going through the config files was tricky, and I may well have missed 
something.  One suspicious file is config/sparc/openbsd64.h which did 
explicitly undef the macro, with the comment:


   /* Inherited from sp64-elf.  */

sp64-elf.h does define the macro, but the other bsd's also define it, 
which leaves me wondering if openbsd.h has bit rotted here.  Which leads 
me to another observation:


It's quite possible the extern "C" functionality is enabled on targets 
that no longer need it, because their observed behaviour would not be 
broken.  On the other hand, the failure mode of not defining its 
replacement (or alternatively mistakenly defining NO_IMPLICIT_EXTERN_C), 
would be immediate and obvious.  And the fix is also simple.


So, if you have a target that you think has C++-unaware system headers, 
please give this patch a spin and report.  Blessing from a GM after a 
few days out there would be nice :)


The lesson here is that when one has a transition, choose an enablement
mechanism that makes it easy to tell when the transition is complete.


nathan




--
Nathan Sidwell


Re: [PATCH] -fopt-info: add indentation via DUMP_VECT_SCOPE

2018-07-03 Thread David Malcolm
On Tue, 2018-07-03 at 09:37 +0200, Richard Biener wrote:
> On Mon, Jul 2, 2018 at 7:00 PM David Malcolm 
> wrote:
> > 
> > On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > > > 
> > > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm wrote:
> > > > > 
> > > > > This patch adds a concept of nested "scopes" to dumpfile.c's
> > > > > dump_*_loc
> > > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in tree-
> > > > > vectorizer.h,
> > > > > so that the nested structure is shown in -fopt-info by
> > > > > indentation.
> > > > > 
> > > > > For example, this converts -fopt-info-all e.g. from:
> > > > > 
> > > > > test.c:8:3: note: === analyzing loop ===
> > > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > > test.c:8:3: note: === get_loop_niters ===
> > > > > test.c:8:3: note: symbolic number of iterations is (unsigned
> > > > > int)
> > > > > n_9(D)
> > > > > test.c:8:3: note: not vectorized: loop contains function
> > > > > calls or
> > > > > data references that cannot be analyzed
> > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > 
> > > > > to:
> > > > > 
> > > > > test.c:8:3: note: === analyzing loop ===
> > > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > > test.c:8:3: note:   symbolic number of iterations is
> > > > > (unsigned
> > > > > int) n_9(D)
> > > > > test.c:8:3: note:   not vectorized: loop contains function
> > > > > calls
> > > > > or data references that cannot be analyzed
> > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > 
> > > > > showing that the "symbolic number of iterations" message is
> > > > > within
> > > > > the "=== analyze_loop_nest ===" (and not within the
> > > > > "=== vect_analyze_loop_form ===").
> > > > > 
> > > > > This is also enabling work for followups involving
> > > > > optimization
> > > > > records
> > > > > (allowing the records to directly capture the nested
> > > > > structure of
> > > > > the
> > > > > dump messages).
> > > > > 
> > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux-
> > > > > gnu.
> > > > > 
> > > > > OK for trunk?
> > > 
> > > Hi,
> > > 
> > > I've noticed that this patch (r262246) caused regressions on
> > > aarch64:
> > > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-
> > > dump
> > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
> > > cancelled: can use load/store-lanes"
> > > 
> > > The problem is that now there are more spaces between "note:" and
> > > "Built", the attached small patch does that for slp-perm-1.c.
> > 
> > Sorry about the breakage.
> > 
> > > Is it the right way of fixing it or do we want to accept any
> > > amount
> > > of
> > > spaces for instance?
> > 
> > I don't think we want to hardcode the amount of space in the
> > dumpfile.
> > The idea of my patch was to make the dump more human-readable (I
> > hope)
> > by visualizing the nesting structure of the dump messages, but I
> > think
> > we shouldn't "bake" that into the expected strings, as someone
> > might
> > want to add an intermediate nesting level.
> > 
> > Do we really

Re: [PATCH] -fopt-info: add indentation via DUMP_VECT_SCOPE

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 3:52 PM David Malcolm  wrote:
>
> On Tue, 2018-07-03 at 09:37 +0200, Richard Biener wrote:
> > On Mon, Jul 2, 2018 at 7:00 PM David Malcolm 
> > wrote:
> > >
> > > On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > > > >
> > > > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm wrote:
> > > > > >
> > > > > > This patch adds a concept of nested "scopes" to dumpfile.c's
> > > > > > dump_*_loc
> > > > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in tree-
> > > > > > vectorizer.h,
> > > > > > so that the nested structure is shown in -fopt-info by
> > > > > > indentation.
> > > > > >
> > > > > > For example, this converts -fopt-info-all e.g. from:
> > > > > >
> > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > > > test.c:8:3: note: === get_loop_niters ===
> > > > > > test.c:8:3: note: symbolic number of iterations is (unsigned
> > > > > > int)
> > > > > > n_9(D)
> > > > > > test.c:8:3: note: not vectorized: loop contains function
> > > > > > calls or
> > > > > > data references that cannot be analyzed
> > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > >
> > > > > > to:
> > > > > >
> > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > > > test.c:8:3: note:   symbolic number of iterations is
> > > > > > (unsigned
> > > > > > int) n_9(D)
> > > > > > test.c:8:3: note:   not vectorized: loop contains function
> > > > > > calls
> > > > > > or data references that cannot be analyzed
> > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > >
> > > > > > showing that the "symbolic number of iterations" message is
> > > > > > within
> > > > > > the "=== analyze_loop_nest ===" (and not within the
> > > > > > "=== vect_analyze_loop_form ===").
> > > > > >
> > > > > > This is also enabling work for followups involving
> > > > > > optimization
> > > > > > records
> > > > > > (allowing the records to directly capture the nested
> > > > > > structure of
> > > > > > the
> > > > > > dump messages).
> > > > > >
> > > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux-
> > > > > > gnu.
> > > > > >
> > > > > > OK for trunk?
> > > >
> > > > Hi,
> > > >
> > > > I've noticed that this patch (r262246) caused regressions on
> > > > aarch64:
> > > > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-
> > > > dump
> > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
> > > > cancelled: can use load/store-lanes"
> > > >
> > > > The problem is that now there are more spaces between "note:" and
> > > > "Built", the attached small patch does that for slp-perm-1.c.
> > >
> > > Sorry about the breakage.
> > >
> > > > Is it the right way of fixing it or do we want to accept any
> > > > amount
> > > > of
> > > > spaces for instance?
> > >
> > > I don't think we want to hardcode the amount of space in the
> > > dumpfile.
> > > The idea of my patch wa

[PATCH] Fix part of PR86389

2018-07-03 Thread Richard Biener


Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

From 52aad98947e5cfcb5624ff24f0c557d0029c34fe Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Tue, 3 Jul 2018 14:04:01 +0200
Subject: [PATCH] fix-pr86389

2018-07-03  Richard Biener  

PR ipa/86389
* tree-ssa-structalias.c (find_func_clobbers): Properly
handle indirect calls.

* gcc.dg/torture/pr86389.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/torture/pr86389.c b/gcc/testsuite/gcc.dg/torture/pr86389.c
new file mode 100644
index 000..cc29635c6d0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr86389.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-additional-options "-fipa-pta" } */
+
+void callme (void (*callback) (void));
+
+int
+main (void)
+{
+  int ok = 0;
+  void callback (void) { ok = 1; }
+
+  callme (&callback);
+
+  if (!ok)
+__builtin_abort ();
+  return 0;
+}
+
+__attribute__((noinline, noclone))
+void
+callme (void (*callback) (void))
+{
+  (*callback) ();
+}
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index ac5d4bc93fe..fd24f84fb14 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -5353,6 +5353,7 @@ find_func_clobbers (struct function *fn, gimple *origt)
   /* For callees without function info (that's external functions),
 ESCAPED is clobbered and used.  */
   if (cfi->decl
+ && TREE_CODE (cfi->decl) == FUNCTION_DECL
  && !cfi->is_fn_info)
{
  varinfo_t vi;


[committed] More H8 cleanups, mostly expander consolidation

2018-07-03 Thread Jeff Law

Another batch of cleanups/consolidations.

First, the move patterns.  This consolidates the movXX expanders -- this
can slightly change the generated code on the H8/SX due to an
inconsistency in the old movhi expander that is fixed by this change.

The expanders for logicals and basic arithmetic were trivial to
consolidate.  That in turn is showing several define_insns that ought to
be consolidated -- I've done a trivial amount of that here with the
logical3_sn and logical3 patterns.  But there's more to go.

The zero/sign_extend, rotate and shift expanders were also trivial.

Cleaning up the expanders isn't that big of a deal since it doesn't
directly affect the transition away from cc0.  However, cleaning them up
does tend to bring the matching patterns for different modes and
variants of the H8 closer together on the screen.  That in turn is
making the redundancies in the matching patterns more obvious and
cleaning *those* up is definitely useful since every define_insn is
going to need tweaking for the transition away from cc0.

Committing to the trunk,

Jeff
* config/h8300/h8300.md (HSI, QHSI, QHSIF): New mode iterators.
(shifts): New code iterator.
(movqi, movhi, movsi, movsf expanders): Consolidate into a single
expander.  Fix HImode handling on H8/SX.
(addqi3, addhi3, addsi3 expanders): Consolidate into a single expander.
(subqi3, subhi3, subsi3 expanders): Likewise.
(andqi3, andhi3, andsi3 expanders): Likewise.
(iorqi3, iorhi3, iorsi3 expanders): Likewise.
(xorqi3, xorhi3, xorsi3 expanders): Likewise.
(negqi2, neghi2, negsi2, negsf2 expanders): Likewise.
(one_cmplqi2, one_cmplhi2, one_cmplsi2): Likewise.
(zero_extendqihi2, zero_extendqisi2): Likewise.
(extendqihi2, extendqisi2): Likewise.
(rotlqi3, rotlhi3, rotlsi3): Likewise.
(neghi2_h8300, negsi2_h8300): Likewise for these patterns.
(rotlqi3_1, rotlhi3_1): Likewise.
(logicalhi3_sn, logicalsi3_sn): Likewise.
(logicalhi3, logicalsi3): Likewise.

diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 74b2233..aac405a 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -185,6 +185,14 @@
 (define_mode_iterator P [(HI "Pmode == HImode") (SI "Pmode == SImode")])
 
 (define_mode_iterator QHI [QI HI])
+
+(define_mode_iterator HSI [HI SI])
+
+(define_mode_iterator QHSI [QI HI SI])
+
+(define_mode_iterator QHSIF [QI HI SI SF])
+
+(define_code_iterator shifts [ashift ashiftrt lshiftrt])
 
 ;; --
 ;; MOVE INSTRUCTIONS
@@ -218,14 +226,28 @@
   [(set_attr "length_table" "mov_imm4,movb")
(set_attr "cc" "set_znv")])
 
-(define_expand "movqi"
-  [(set (match_operand:QI 0 "general_operand_dst" "")
-   (match_operand:QI 1 "general_operand_src" ""))]
+(define_expand "mov<mode>"
+  [(set (match_operand:QHSIF 0 "general_operand_dst" "")
+   (match_operand:QHSIF 1 "general_operand_src" ""))]
   ""
   {
-/* One of the ops has to be in a register.  */
-if (!TARGET_H8300SX && !h8300_move_ok (operands[0], operands[1]))
-  operands[1] = copy_to_mode_reg (QImode, operands[1]);
+enum machine_mode mode = <MODE>mode;
+if (TARGET_H8300 && (mode == SImode || mode == SFmode))
+  {
+   /* The original H8/300 needs to split up 32 bit moves.  */
+   if (h8300_expand_movsi (operands))
+ DONE;
+  }
+else if (!TARGET_H8300SX)
+  {
+   /* Other H8 chips, except the H8/SX family can only handle a
+  single memory operand, which is checked by h8300_move_ok.
+
+  We could perhaps have h8300_move_ok handle the H8/SX better
+  and just remove the !TARGET_H8300SX conditional.  */
+   if (!h8300_move_ok (operands[0], operands[1]))
+ operands[1] = copy_to_mode_reg (mode, operand1);
+  }
   })
 
 (define_insn "movstrictqi"
@@ -271,16 +293,6 @@
(set_attr "length" "2,2,*,*,*")
(set_attr "cc" "set_zn,set_znv,set_znv,set_znv,set_znv")])
 
-(define_expand "movhi"
-  [(set (match_operand:HI 0 "general_operand_dst" "")
-   (match_operand:HI 1 "general_operand_src" ""))]
-  ""
-  {
-/* One of the ops has to be in a register.  */
-if (!h8300_move_ok (operands[0], operands[1]))
-  operands[1] = copy_to_mode_reg (HImode, operand1);
-  })
-
 (define_insn "movstricthi"
   [(set (strict_low_part (match_operand:HI 0 "general_operand_dst" "+r,r,r"))
 (match_operand:HI 1 "general_operand_src" 
"I,P3>X,rmi"))]
@@ -295,24 +307,6 @@
 
 ;; movsi
 
-(define_expand "movsi"
-  [(set (match_operand:SI 0 "general_operand_dst" "")
-   (match_operand:SI 1 "general_operand_src" ""))]
-  ""
-  {
-if (TARGET_H8300)
-  {
-   if (h8300_expand_movsi (operands))
- DONE;
-  }
-else if (!TARGET_H8300SX)
-  {
-   /* One of the ops has to be in a register.  */
-   if (!h8300_move_ok (o

Re: [17/n] PR85694: AArch64 support for AVG_FLOOR/CEIL

2018-07-03 Thread James Greenhalgh
On Fri, Jun 29, 2018 at 04:24:58AM -0500, Richard Sandiford wrote:
> This patch adds AArch64 patterns for the new AVG_FLOOR/CEIL operations.
> AVG_FLOOR is [SU]HADD and AVG_CEIL is [SU]RHADD.
> 
> Tested on aarch64-linux-gnu (with and without SVE).  OK to install?


OK.

Thanks,
James

> 2018-06-29  Richard Sandiford  
> 
> gcc/
>   PR tree-optimization/85694
>   * config/aarch64/iterators.md (HADD, RHADD): New int iterators.
>   (u): Handle UNSPEC_SHADD, UNSPEC_UHADD, UNSPEC_SRHADD and
>   UNSPEC_URHADD.
>   * config/aarch64/aarch64-simd.md (avg3_floor)
>   (avg3_ceil): New patterns.
> 
> gcc/testsuite/
>   PR tree-optimization/85694
>   * lib/target-supports.exp (check_effective_target_vect_avg_qi):
>   Return true for AArch64 without SVE.
>   * gcc.target/aarch64/vect_hadd_1.h: New file.
>   * gcc.target/aarch64/vect_shadd_1.c: New test.
>   * gcc.target/aarch64/vect_srhadd_1.c: Likewise.
>   * gcc.target/aarch64/vect_uhadd_1.c: Likewise.
>   * gcc.target/aarch64/vect_urhadd_1.c: Likewise.
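
For reference, the arithmetic behind the two operations can be sketched
as below.  This is only an illustration under stated assumptions, not
code from the patch: the function names are made up, 8-bit elements are
used as an example, and the signed variants assume that >> on a negative
int is an arithmetic shift (true for GCC, and guaranteed since C++20).

```cpp
#include <cstdint>

// AVG_FLOOR, unsigned: (a + b) >> 1, with the sum formed in a wider
// type so that it cannot overflow.
inline std::uint8_t
avg_floor_u8 (std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t> ((static_cast<unsigned> (a) + b) >> 1);
}

// AVG_CEIL, unsigned: (a + b + 1) >> 1, i.e. the average rounded up.
inline std::uint8_t
avg_ceil_u8 (std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t> ((static_cast<unsigned> (a) + b + 1) >> 1);
}

// Signed counterparts; the shift rounds towards negative infinity.
inline std::int8_t
avg_floor_s8 (std::int8_t a, std::int8_t b)
{
  return static_cast<std::int8_t> ((static_cast<int> (a) + b) >> 1);
}

inline std::int8_t
avg_ceil_s8 (std::int8_t a, std::int8_t b)
{
  return static_cast<std::int8_t> ((static_cast<int> (a) + b + 1) >> 1);
}
```

The point of a dedicated operation is that the widening is implicit, so
a vector of 8-bit elements can be averaged without first unpacking it to
16 bits.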


[PATCH] Remove "note: " prefix from some scan-tree-dump directives

2018-07-03 Thread David Malcolm
On Tue, 2018-07-03 at 15:53 +0200, Richard Biener wrote:
> On Tue, Jul 3, 2018 at 3:52 PM David Malcolm 
> wrote:
> >
> > On Tue, 2018-07-03 at 09:37 +0200, Richard Biener wrote:
> > > On Mon, Jul 2, 2018 at 7:00 PM David Malcolm wrote:
> > > >
> > > > On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > > > > >
> > > > > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm wrote:
> > > > > > > This patch adds a concept of nested "scopes" to
> > > > > > > dumpfile.c's
> > > > > > > dump_*_loc
> > > > > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in
> > > > > > > tree-
> > > > > > > vectorizer.h,
> > > > > > > so that the nested structure is shown in -fopt-info by
> > > > > > > indentation.
> > > > > > >
> > > > > > > For example, this converts -fopt-info-all e.g. from:
> > > > > > >
> > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > > > > test.c:8:3: note: === get_loop_niters ===
> > > > > > > test.c:8:3: note: symbolic number of iterations is
> > > > > > > (unsigned
> > > > > > > int)
> > > > > > > n_9(D)
> > > > > > > test.c:8:3: note: not vectorized: loop contains function
> > > > > > > calls or
> > > > > > > data references that cannot be analyzed
> > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > >
> > > > > > > to:
> > > > > > >
> > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > > > > test.c:8:3: note:   symbolic number of iterations is
> > > > > > > (unsigned
> > > > > > > int) n_9(D)
> > > > > > > test.c:8:3: note:   not vectorized: loop contains
> > > > > > > function
> > > > > > > calls
> > > > > > > or data references that cannot be analyzed
> > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > >
> > > > > > > showing that the "symbolic number of iterations" message
> > > > > > > is
> > > > > > > within
> > > > > > > the "=== analyze_loop_nest ===" (and not within the
> > > > > > > "=== vect_analyze_loop_form ===").
> > > > > > >
> > > > > > > This is also enabling work for followups involving
> > > > > > > optimization
> > > > > > > records
> > > > > > > (allowing the records to directly capture the nested
> > > > > > > structure of
> > > > > > > the
> > > > > > > dump messages).
> > > > > > >
> > > > > > > Successfully bootstrapped & regrtested on x86_64-pc-
> > > > > > > linux-
> > > > > > > gnu.
> > > > > > >
> > > > > > > OK for trunk?
> > > > >
> > > > > Hi,
> > > > >
> > > > > I've noticed that this patch (r262246) caused regressions on
> > > > > aarch64:
> > > > > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built
> > > > > SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built
> > > > > SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built
> > > > > SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built
> > > > > SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built
> > > > > SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built
> > > > > SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-
> > > > > tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/sl

Re: [PATCH] Remove "note: " prefix from some scan-tree-dump directives

2018-07-03 Thread Richard Biener
On Tue, Jul 3, 2018 at 4:10 PM David Malcolm  wrote:
>
> On Tue, 2018-07-03 at 15:53 +0200, Richard Biener wrote:
> > On Tue, Jul 3, 2018 at 3:52 PM David Malcolm 
> > wrote:
> > >
> > > On Tue, 2018-07-03 at 09:37 +0200, Richard Biener wrote:
> > > > On Mon, Jul 2, 2018 at 7:00 PM David Malcolm wrote:
> > > > >
> > > > > On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > > > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > > > > > >
> > > > > > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm wrote:
> > > > > > > > This patch adds a concept of nested "scopes" to
> > > > > > > > dumpfile.c's
> > > > > > > > dump_*_loc
> > > > > > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in
> > > > > > > > tree-
> > > > > > > > vectorizer.h,
> > > > > > > > so that the nested structure is shown in -fopt-info by
> > > > > > > > indentation.
> > > > > > > >
> > > > > > > > For example, this converts -fopt-info-all e.g. from:
> > > > > > > >
> > > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > > > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > > > > > test.c:8:3: note: === get_loop_niters ===
> > > > > > > > test.c:8:3: note: symbolic number of iterations is
> > > > > > > > (unsigned
> > > > > > > > int)
> > > > > > > > n_9(D)
> > > > > > > > test.c:8:3: note: not vectorized: loop contains function
> > > > > > > > calls or
> > > > > > > > data references that cannot be analyzed
> > > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > > >
> > > > > > > > to:
> > > > > > > >
> > > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > > > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > > > > > test.c:8:3: note:   symbolic number of iterations is
> > > > > > > > (unsigned
> > > > > > > > int) n_9(D)
> > > > > > > > test.c:8:3: note:   not vectorized: loop contains
> > > > > > > > function
> > > > > > > > calls
> > > > > > > > or data references that cannot be analyzed
> > > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > > >
> > > > > > > > showing that the "symbolic number of iterations" message
> > > > > > > > is
> > > > > > > > within
> > > > > > > > the "=== analyze_loop_nest ===" (and not within the
> > > > > > > > "=== vect_analyze_loop_form ===").
> > > > > > > >
> > > > > > > > This is also enabling work for followups involving
> > > > > > > > optimization
> > > > > > > > records
> > > > > > > > (allowing the records to directly capture the nested
> > > > > > > > structure of
> > > > > > > > the
> > > > > > > > dump messages).
> > > > > > > >
> > > > > > > > Successfully bootstrapped & regrtested on x86_64-pc-
> > > > > > > > linux-
> > > > > > > > gnu.
> > > > > > > >
> > > > > > > > OK for trunk?
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I've noticed that this patch (r262246) caused regressions on
> > > > > > aarch64:
> > > > > > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-
> > > > > > tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built
> > > > > > SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-
> > > > > > tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built
> > > > > > SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-
> > > > > > tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built
> > > > > > SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-
> > > > > > tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built
> > > > > > SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-
> > > > > > tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built
> > > > > > SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-
> > > > > > tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-

C++ PATCH to add a test for c++/84306

2018-07-03 Thread Marek Polacek
This patch merely adds a test for an already fixed issue.

Tested on x86_64-linux, ok for trunk?

2018-07-03  Marek Polacek  

PR c++/84306
* g++.dg/overload/conv-op3.C: New test.

diff --git gcc/testsuite/g++.dg/overload/conv-op3.C 
gcc/testsuite/g++.dg/overload/conv-op3.C
index e69de29bb2d..9d04a37fe5e 100644
--- gcc/testsuite/g++.dg/overload/conv-op3.C
+++ gcc/testsuite/g++.dg/overload/conv-op3.C
@@ -0,0 +1,18 @@
+// c++/84306
+// { dg-do link { target c++11 } }
+
+struct foo {
+  foo() = default;
+
+  foo(foo const&);
+
+  template<typename T>
+  explicit foo(T&&) { }
+};
+
+int
+main()
+{
+  foo f1;
+  foo f2{f1};
+}


Re: Add support for dumping multiple dump files under one name

2018-07-03 Thread David Malcolm
On Tue, 2018-07-03 at 11:00 +0100, Andre Vieira (lists) wrote:
> On 29/06/18 11:13, David Malcolm wrote:
> > On Fri, 2018-06-29 at 10:15 +0200, Richard Biener wrote:
> > > On Fri, 22 Jun 2018, Jan Hubicka wrote:
> > > 
> > > > Hi,
> > > > this patch adds dumpfile support for dumps that come in
> > > > multiple
> > > > parts.  This
> > > > is needed for WPA stream-out dump since we stream partitions in
> > > > parallel and
> > > > the dumps would come up in random order.  Parts are added by
> > > > new
> > > > parameter that
> > > > > is initialized by default to -1 (no parts).
> > > > 
> > > > One thing I skipped is any support for duplicated opening of
> > > > file
> > > > with parts since I do not need it.
> > > > 
> > > > Bootstrapped/regtested x86_64-linux, OK?
> > > 
> > > Looks reasonable - David, anything you want to add / have
> > > changed?
> > 
> > No worries from my side; I don't think it interacts with the
> > optimization records stuff I'm working on - presumably this is just
> > for
> > dumping the WPA stream-out, rather than for dumping specific
> > optimizations.
> > 
> > [...snip...]
> > 
> > Dave
> > 
> 
> Hi David,
> 
> I believe r262245 is causing the following failures on aarch64 and
> arm:
> 
> FAIL: gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-
> dump
> vect "note: Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
> cancelled: can use load/store-lanes"
> 
> Could you please have a look?

Sorry about this; I think my r262246 ("dumpfile.c: add indentation via
DUMP_VECT_SCOPE") caused this.
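
To illustrate why indentation after "note:" can break directives that
hard-code the "note: " prefix, here is a minimal sketch.  It uses
std::regex as a stand-in for the testsuite's Tcl matching, and
format_note and scan_dump are hypothetical names, not anything in
dumpfile.c or the dg harness:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: builds a dump line with `depth` spaces of scope
// indentation inserted after the "note: " prefix.
std::string format_note(const std::string &loc, unsigned depth,
                        const std::string &msg)
{
  return loc + ": note: " + std::string(depth, ' ') + msg;
}

// Hypothetical stand-in for scan-tree-dump: true if `pattern` (treated
// as a regular expression) occurs anywhere in `line`.
bool scan_dump(const std::string &line, const std::string &pattern)
{
  return std::regex_search(line, std::regex(pattern));
}
```

With depth 0 the pattern "note: Built SLP cancelled: ..." still
matches; with any nesting the extra spaces sit between "note:" and the
message, so only a pattern without the prefix keeps matching.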

Does https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00122.html help?

Dave


Re: C++ PATCH to add a test for c++/84306

2018-07-03 Thread Jason Merrill
OK.

On Tue, Jul 3, 2018 at 10:14 AM, Marek Polacek  wrote:
> This patch merely adds a test for an already fixed issue.
>
> Tested on x86_64-linux, ok for trunk?
>
> 2018-07-03  Marek Polacek  
>
> PR c++/84306
> * g++.dg/overload/conv-op3.C: New test.
>
> diff --git gcc/testsuite/g++.dg/overload/conv-op3.C 
> gcc/testsuite/g++.dg/overload/conv-op3.C
> index e69de29bb2d..9d04a37fe5e 100644
> --- gcc/testsuite/g++.dg/overload/conv-op3.C
> +++ gcc/testsuite/g++.dg/overload/conv-op3.C
> @@ -0,0 +1,18 @@
> +// c++/84306
> +// { dg-do link { target c++11 } }
> +
> +struct foo {
> +  foo() = default;
> +
> +  foo(foo const&);
> +
> +  template<typename T>
> +  explicit foo(T&&) { }
> +};
> +
> +int
> +main()
> +{
> +  foo f1;
> +  foo f2{f1};
> +}


Re: [patch, fortran] Asynchronous I/O, take 3

2018-07-03 Thread Rainer Orth
Hi Thomas,

> the attached patch is the third take on Nicolas' and my patch
> for implementing asynchronous I/O.  Some parts have been reworked, and
> several bugs which caused either incorrect I/O or hangs have been
> fixed in the process.
>
> I have to say that getting out these bugs has been much harder
> than Nicolas and I originally thought, and that this has cost more
> working hours than any other patch I have been involved in.
>
> This has been regression-tested on x86_64-pc-linux-gnu. The new test
> cases have also been tested in a tight loop with
>
> n=1; while ./a.out; do echo -n $n " " ; n=$((n+1)); done
>
> or (for async_io_3.f90, which is supposed to fail)
>
> while true ; do ./a.out > /dev/null 2>&1 ;  echo -n $n " " ; n=$((n+1));
> done
>
> and the test cases also come up clean with valgrind --tool=drd
> (which is a _very_ strict tool which, after this experience, I
> wholeheartedly recommend for doing pthreads debugging).
>
> The interface remains as before - link in pthread to get asynchronous
> I/O, which matches what ifort does.

another test run on i386-pc-solaris2.11 is underway.  However, many
(all?) gfortran tests now SEGV.  One example is

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
Segmentation Fault

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0xfe1b1f03 in pthread_mutex_unlock () from /lib/libc.so.1
(gdb) where
#0  0xfe1b1f03 in pthread_mutex_unlock () from /lib/libc.so.1
#1  0xfe5d1b7c in __gthread_mutex_unlock (__mutex=0x18)
at ../libgcc/gthr-default.h:778
#2  _gfortran_st_rewind (fpp=0xfeffda9c)
at /vol/gcc/src/hg/trunk/solaris/libgfortran/io/file_pos.c:486
#3  0x0805110f in MAIN__ ()
at 
/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gfortran.dg/backslash_2.f90:6  

Obviously __mutex above hasn't been properly initialized.
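
One plausible reading of __mutex=0x18 (an assumption on my part, not
something verified against the libgfortran sources) is that the mutex
is a member of a structure reached through a null pointer, so its
address is just the member offset.  A sketch with a made-up layout
(fake_unit and member_address are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Made-up layout for illustration only; NOT libgfortran's real structure.
struct fake_unit {
  char header[0x18];  // 24 bytes of other fields
  long lock;          // stand-in for the mutex member
};

// Address that `&u->lock` would evaluate to if `u` were located at `base`.
std::uintptr_t member_address(std::uintptr_t base)
{
  return base + offsetof(fake_unit, lock);
}
```

Here member_address(0) is 0x18, which is the shape of value seen in
frame #1 above.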

> 2018-07-02  Nicolas Koenig  
> Thomas Koenig 
>
> PR fortran/25829
> * testsuite/libgfomp.fortran/async_io_1.f90: New test.
> * testsuite/libgfomp.fortran/async_io_2.f90: New test.
> * testsuite/libgfomp.fortran/async_io_3.f90: New test.

You seem to have a special fondness for libgfomp ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch, fortran] Asynchronous I/O, take 3

2018-07-03 Thread Rainer Orth
Hi Thomas,

> another test run on i386-pc-solaris2.11 is underway.  However, many
> (all?) gfortran tests now SEGV.  One example is

the good news is: all three libgomp.fortran/async_io_?.f90 tests now
PASS for both 32 and 64-bit, as do gfortran.dg/f2003_inquire_1.f03 and
gfortran.dg/f2003_io_1.f03.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] i386; Add indirect_return function attribute

2018-07-03 Thread H.J. Lu
On Fri, Jun 8, 2018 at 3:27 AM, H.J. Lu  wrote:
> On x86, swapcontext may return via indirect branch when shadow stack
> is enabled.  To support code instrumentation of control-flow transfers
> with -fcf-protection, add indirect_return function attribute to inform
> compiler that a function may return via indirect branch.
>
> Note: Unlike setjmp, swapcontext only returns once.  Marking it as
> returning twice will unnecessarily disable compiler optimization.
>
> OK for trunk?
>
> H.J.
> 
> gcc/
>
> PR target/85620
> * config/i386/i386.c (rest_of_insert_endbranch): Also generate
> ENDBRANCH for non-tail call which may return via indirect branch.
> * doc/extend.texi: Document indirect_return attribute.
>
> gcc/testsuite/
>
> PR target/85620
> * gcc.target/i386/pr85620-1.c: New test.
> * gcc.target/i386/pr85620-2.c: Likewise.
>

Here is the updated patch with a testcase to show the impact of
returns_twice attribute.
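
For readers who have not seen the attribute in use, a minimal sketch
of a declaration carrying it.  The attribute name is the one proposed
in this patch; the SKETCH_INDIRECT_RETURN guard macro and the
sketch_switch_context function are hypothetical, and the guard is
there because compilers without the patch will not know the attribute:

```cpp
#include <cassert>

// Use the attribute only where the compiler reports support for it.
#if defined(__has_attribute)
#  if __has_attribute(indirect_return)
#    define SKETCH_INDIRECT_RETURN __attribute__((indirect_return))
#  endif
#endif
#ifndef SKETCH_INDIRECT_RETURN
#  define SKETCH_INDIRECT_RETURN
#endif

// Hypothetical swapcontext-like function.  The attribute goes on the
// declaration; the definition below repeats the prototype without it.
extern "C" int sketch_switch_context(int *from, const int *to)
  SKETCH_INDIRECT_RETURN;

extern "C" int sketch_switch_context(int *from, const int *to)
{
  *from = *to;  // trivial body; a real implementation would switch stacks
  return 0;
}

// Ordinary (non-tail) call site: with -fcf-protection, this is where an
// ENDBR would be placed after the call if the attribute is honoured.
int call_once(void)
{
  int a = 0, b = 42;
  int rc = sketch_switch_context(&a, &b);
  return rc == 0 ? a : -1;
}
```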

Jan, Uros, can you take a look?

Thanks.

-- 
H.J.
From 6115541e03073b93bd81f5eb81fdedd4e5b47b28 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 7 Jun 2018 20:05:15 -0700
Subject: [PATCH] i386; Add indirect_return function attribute

On x86, swapcontext may return via indirect branch when shadow stack
is enabled.  To support code instrumentation of control-flow transfers
with -fcf-protection, add indirect_return function attribute to inform
compiler that a function may return via indirect branch.

Note: Unlike setjmp, swapcontext only returns once.  Marking it as
returning twice will unnecessarily disable compiler optimization as shown in
the testcase here.

gcc/

	PR target/85620
	* config/i386/i386.c (rest_of_insert_endbranch): Also generate
	ENDBRANCH for non-tail call which may return via indirect branch.
	* doc/extend.texi: Document indirect_return attribute.

gcc/testsuite/

	PR target/85620
	* gcc.target/i386/pr85620-1.c: New test.
	* gcc.target/i386/pr85620-2.c: Likewise.
	* gcc.target/i386/pr85620-3.c: Likewise.
	* gcc.target/i386/pr85620-4.c: Likewise.
---
 gcc/config/i386/i386.c| 23 ++-
 gcc/doc/extend.texi   |  6 ++
 gcc/testsuite/gcc.target/i386/pr85620-1.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr85620-2.c | 13 +
 gcc/testsuite/gcc.target/i386/pr85620-3.c | 18 ++
 gcc/testsuite/gcc.target/i386/pr85620-4.c | 18 ++
 6 files changed, 92 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e6d17632142..41461d582a4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2621,7 +2621,26 @@ rest_of_insert_endbranch (void)
 	{
 	  if (CALL_P (insn))
 	{
-	  if (find_reg_note (insn, REG_SETJMP, NULL) == NULL)
+	  bool need_endbr;
+	  need_endbr = find_reg_note (insn, REG_SETJMP, NULL) != NULL;
+	  if (!need_endbr && !SIBLING_CALL_P (insn))
+		{
+		  rtx call = get_call_rtx_from (insn);
+		  rtx fnaddr = XEXP (call, 0);
+
+		  /* Also generate ENDBRANCH for non-tail call which
+		 may return via indirect branch.  */
+		  if (MEM_P (fnaddr)
+		  && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF)
+		{
+		  tree fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
+		  if (fndecl
+			  && lookup_attribute ("indirect_return",
+	   DECL_ATTRIBUTES (fndecl)))
+			need_endbr = true;
+		}
+		}
+	  if (!need_endbr)
 		continue;
 	  /* Generate ENDBRANCH after CALL, which can return more than
 		 twice, setjmp-like functions.  */
@@ -45897,6 +45916,8 @@ static const struct attribute_spec ix86_attribute_table[] =
 ix86_handle_fndecl_attribute, NULL },
   { "function_return", 1, 1, true, false, false, false,
 ix86_handle_fndecl_attribute, NULL },
+  { "indirect_return", 0, 0, true, false, false, false,
+ix86_handle_fndecl_attribute, NULL },
 
   /* End element.  */
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 19c2da2e5db..97b1f78cade 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5886,6 +5886,12 @@ foo (void)
 @}
 @end smallexample
 
+@item indirect_return
+@cindex @code{indirect_return} function attribute, x86
+
+The @code{indirect_return} attribute on a function is used to inform
+the compiler that the function may return via indiret branch.
+
 @end table
 
 On the x86, the inliner does not inline a
diff --git a/gcc/testsuite/gcc.target/i386/pr85620-1.c b/gcc/testsuite/gcc.target/i386/pr85620-1.c
new file mode 100644
index 000..32efb08e59e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr85620-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection" } */

C++ PATCH for c++/86201, diagnostic routines re-entered crash

2018-07-03 Thread Marek Polacek
This is an ICE of kind "Error reporting routines re-entered".  From the PR:

The problem here is that we report the missing return value:
 9224 permerror (input_location, "return-statement with no value, in "
 9225"function returning %qT", valtype);
but permerror will end up calling print_instantiation_full_context, which ends
up calling dump_template_bindings and then tsubst -> tsubst_copy_and_build ->
build_functional_cast -> ... -> ocp_convert which has (complain is tf_none)
 829   if (complain & tf_warning)
 830 return cp_truthvalue_conversion (e);
 831   else
 832 {
 833   /* Prevent bogus -Wint-in-bool-context warnings coming
 834  from c_common_truthvalue_conversion down the line.  */
 835   warning_sentinel w (warn_int_in_bool_context);
 836   return cp_truthvalue_conversion (e);
 837 }
So we call cp_truthvalue_conversion -> c_common_truthvalue_conversion ->
build_binary_op which only calls cp_build_binary_op but with
tf_warning_or_error.  So even though the warning
 4736   if ((complain & tf_warning)
 4737   && (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1)))
 4738 warning (OPT_Wfloat_equal,
 4739  "comparing floating point with == or != is unsafe");
is properly guarded, we still re-enter the diagnostic routines.

But since we're parsing decltype we can check c_inhibit_evaluation_warnings
which means we won't call warning().
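
As a plain C++ illustration of the "unevaluated operand" point
(independent of GCC internals; counted_equal and the counter are
hypothetical names): the operand of decltype is only type-checked, so
a comparison inside it never runs.

```cpp
#include <cassert>
#include <type_traits>

int evaluations = 0;

// Comparison helper that records each time it actually executes.
bool counted_equal(double a, double b)
{
  ++evaluations;
  return a == b;
}

// The operand of decltype is unevaluated: its type is computed, but the
// call is never executed, so `evaluations` is left untouched.
using compare_type = decltype(counted_equal(1.0, 2.0));
static_assert(std::is_same<compare_type, bool>::value,
              "the comparison has type bool");

// Reads the counter so callers can check how many calls really ran.
int evaluations_so_far()
{
  return evaluations;
}
```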

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-07-03  Marek Polacek  

PR c++/86201
* typeck.c (cp_build_binary_op): Check c_inhibit_evaluation_warnings.

* g++.dg/diagnostic/pr86201.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 3a4f1cdf479..ea4ce9649cd 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4734,6 +4734,7 @@ cp_build_binary_op (location_t location,
   if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE)
goto vector_compare;
   if ((complain & tf_warning)
+ && c_inhibit_evaluation_warnings == 0
  && (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1)))
warning (OPT_Wfloat_equal,
 "comparing floating point with == or != is unsafe");
diff --git gcc/testsuite/g++.dg/diagnostic/pr86201.C 
gcc/testsuite/g++.dg/diagnostic/pr86201.C
index e69de29bb2d..e7019c22d95 100644
--- gcc/testsuite/g++.dg/diagnostic/pr86201.C
+++ gcc/testsuite/g++.dg/diagnostic/pr86201.C
@@ -0,0 +1,12 @@
+// PR c++/86201
+// { dg-do compile { target c++11 } }
+
+template <typename U, typename V>
+auto fn1 (V&& v) -> decltype(U(v))
+{
+  return; // { dg-error "return-statement with no value" }
+}
+void fn2 ()
+{
+  fn1<bool>(1.0);
+}


Re: [PATCH] i386; Add indirect_return function attribute

2018-07-03 Thread Uros Bizjak
On Tue, Jul 3, 2018 at 5:32 PM, H.J. Lu  wrote:
> On Fri, Jun 8, 2018 at 3:27 AM, H.J. Lu  wrote:
>> On x86, swapcontext may return via indirect branch when shadow stack
>> is enabled.  To support code instrumentation of control-flow transfers
>> with -fcf-protection, add indirect_return function attribute to inform
>> compiler that a function may return via indirect branch.
>>
>> Note: Unlike setjmp, swapcontext only returns once.  Marking it as
>> returning twice will unnecessarily disable compiler optimization.
>>
>> OK for trunk?
>>
>> H.J.
>> 
>> gcc/
>>
>> PR target/85620
>> * config/i386/i386.c (rest_of_insert_endbranch): Also generate
>> ENDBRANCH for non-tail call which may return via indirect branch.
>> * doc/extend.texi: Document indirect_return attribute.
>>
>> gcc/testsuite/
>>
>> PR target/85620
>> * gcc.target/i386/pr85620-1.c: New test.
>> * gcc.target/i386/pr85620-2.c: Likewise.
>>
>
> Here is the updated patch with a testcase to show the impact of
> returns_twice attribute.
>
> Jan, Uros, can you take a look?

LGTM for the implementation, can't say if attribute is really needed or not.

+@item indirect_return
+@cindex @code{indirect_return} function attribute, x86
+
+The @code{indirect_return} attribute on a function is used to inform
+the compiler that the function may return via indiret branch.

s/indiret/indirect/

Uros.


Re: Add support for dumping multiple dump files under one name

2018-07-03 Thread Andre Vieira (lists)
On 03/07/18 15:15, David Malcolm wrote:
> On Tue, 2018-07-03 at 11:00 +0100, Andre Vieira (lists) wrote:
>> On 29/06/18 11:13, David Malcolm wrote:
>>> On Fri, 2018-06-29 at 10:15 +0200, Richard Biener wrote:
>>>> On Fri, 22 Jun 2018, Jan Hubicka wrote:
>>>>
>>>>> Hi,
>>>>> this patch adds dumpfile support for dumps that come in
>>>>> multiple
>>>>> parts.  This
>>>>> is needed for WPA stream-out dump since we stream partitions in
>>>>> parallel and
>>>>> the dumps would come up in random order.  Parts are added by
>>>>> new
>>>>> parameter that
>>>>> is initialized by default to -1 (no parts).
>>>>>
>>>>> One thing I skipped is any support for duplicated opening of
>>>>> file
>>>>> with parts since I do not need it.
>>>>>
>>>>> Bootstrapped/regtested x86_64-linux, OK?
>>>>
>>>> Looks reasonable - David, anything you want to add / have
>>>> changed?
>>>
>>> No worries from my side; I don't think it interacts with the
>>> optimization records stuff I'm working on - presumably this is just
>>> for
>>> dumping the WPA stream-out, rather than for dumping specific
>>> optimizations.
>>>
>>> [...snip...]
>>>
>>> Dave
>>>
>>
>> Hi David,
>>
>> I believe r262245 is causing the following failures on aarch64 and
>> arm:
>>
>> FAIL: gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-
>> dump
>> vect "note: Built SLP cancelled: can use load/store-lanes"
>> FAIL: gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
>> cancelled: can use load/store-lanes"
>>
>> Could you please have a look?
> 
> Sorry about this; I think my r262246 ("dumpfile.c: add indentation via
> DUMP_VECT_SCOPE") caused this.
> 
> Does https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00122.html help?

Yes it does. Thank you!

Andre
> 
> Dave
> 



C++ PATCH for c++/86378, functional cast in template noexcept-specifier

2018-07-03 Thread Jason Merrill
This was a simple typo in strip_typedefs_expr, where we were iterating
but then building a new list consisting entirely of the first element.
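
In miniature, the bug has this shape.  This is a sketch with a
hypothetical node type standing in for TREE_LIST; it is not the GCC
code itself:

```cpp
#include <cassert>
#include <vector>

// Hypothetical singly linked list node (value/chain, like a TREE_LIST).
struct node {
  int value;
  const node *chain;
};

// Buggy shape: the loop advances `it`, but the body reads the head `t`,
// so every collected element is a copy of the first one.
std::vector<int> collect_buggy(const node *t)
{
  std::vector<int> out;
  for (const node *it = t; it; it = it->chain)
    out.push_back(t->value);
  return out;
}

// Fixed shape: the body reads the element the iterator points at.
std::vector<int> collect_fixed(const node *t)
{
  std::vector<int> out;
  for (const node *it = t; it; it = it->chain)
    out.push_back(it->value);
  return out;
}
```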

Tested x86_64-pc-linux-gnu, applying to trunk, 8, 7.
commit 1cab1ce37320aac67d8fbf88d10930f5c769cfb1
Author: Jason Merrill 
Date:   Mon Jul 2 15:09:58 2018 -0400

PR c++/86378 - functional cast in noexcept-specifier.

* tree.c (strip_typedefs_expr) [TREE_LIST]: Fix iteration.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 361248d4b52..b1333f55e39 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1735,9 +1735,9 @@ strip_typedefs_expr (tree t, bool *remove_attributes)
 	tree it;
 	for (it = t; it; it = TREE_CHAIN (it))
 	  {
-	tree val = strip_typedefs_expr (TREE_VALUE (t), remove_attributes);
+	tree val = strip_typedefs_expr (TREE_VALUE (it), remove_attributes);
 	vec_safe_push (vec, val);
-	if (val != TREE_VALUE (t))
+	if (val != TREE_VALUE (it))
 	  changed = true;
 	gcc_assert (TREE_PURPOSE (it) == NULL_TREE);
 	  }
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept33.C b/gcc/testsuite/g++.dg/cpp0x/noexcept33.C
new file mode 100644
index 000..c5a03de38dd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept33.C
@@ -0,0 +1,28 @@
+// PR c++/86378
+// { dg-do compile { target c++11 } }
+
+struct Pepper {};
+struct Apple { Apple(int) {} };
+
+struct Combination : Apple, Pepper
+{
+  Combination(Pepper p, Apple a)
+: Apple(a), Pepper(p)
+  {}
+};
+
+struct MyCombination
+{
+  using Spice = Pepper;
+  using Fruit = Apple;
+
+  Combination combination;
+
+  template<typename T>
+  constexpr MyCombination(T&& t)
+  noexcept(noexcept(Combination(Spice(), Fruit(t))))
+: combination(Spice(), Fruit(t))
+  {}
+};
+
+MyCombination obj(Apple(4));


Re: C++ PATCH for c++/86201, diagnostic routines re-entered crash

2018-07-03 Thread Jason Merrill
OK.

On Tue, Jul 3, 2018 at 12:08 PM, Marek Polacek  wrote:
> This is an ICE of kind "Error reporting routines re-entered".  From the PR:
>
> The problem here is that we report the missing return value:
>  9224 permerror (input_location, "return-statement with no value, in "
>  9225"function returning %qT", valtype);
> but permerror will end up calling print_instantiation_full_context, which ends
> up calling dump_template_bindings and then tsubst -> tsubst_copy_and_build ->
> build_functional_cast -> ... -> ocp_convert which has (complain is tf_none)
>  829   if (complain & tf_warning)
>  830 return cp_truthvalue_conversion (e);
>  831   else
>  832 {
>  833   /* Prevent bogus -Wint-in-bool-context warnings coming
>  834  from c_common_truthvalue_conversion down the line.  */
>  835   warning_sentinel w (warn_int_in_bool_context);
>  836   return cp_truthvalue_conversion (e);
>  837 }
> So we call cp_truthvalue_conversion -> c_common_truthvalue_conversion ->
> build_binary_op which only calls cp_build_binary_op but with
> tf_warning_or_error.  So even though the warning
>  4736   if ((complain & tf_warning)
>  4737   && (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1)))
>  4738 warning (OPT_Wfloat_equal,
>  4739  "comparing floating point with == or != is unsafe");
> is properly guarded, we still re-enter the diagnostic routines.
>
> But since we're parsing decltype we can check c_inhibit_evaluation_warnings
> which means we won't call warning().
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2018-07-03  Marek Polacek  
>
> PR c++/86201
> * typeck.c (cp_build_binary_op): Check c_inhibit_evaluation_warnings.
>
> * g++.dg/diagnostic/pr86201.C: New test.
>
> diff --git gcc/cp/typeck.c gcc/cp/typeck.c
> index 3a4f1cdf479..ea4ce9649cd 100644
> --- gcc/cp/typeck.c
> +++ gcc/cp/typeck.c
> @@ -4734,6 +4734,7 @@ cp_build_binary_op (location_t location,
>if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE)
> goto vector_compare;
>if ((complain & tf_warning)
> + && c_inhibit_evaluation_warnings == 0
>   && (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1)))
> warning (OPT_Wfloat_equal,
>  "comparing floating point with == or != is unsafe");
> diff --git gcc/testsuite/g++.dg/diagnostic/pr86201.C 
> gcc/testsuite/g++.dg/diagnostic/pr86201.C
> index e69de29bb2d..e7019c22d95 100644
> --- gcc/testsuite/g++.dg/diagnostic/pr86201.C
> +++ gcc/testsuite/g++.dg/diagnostic/pr86201.C
> @@ -0,0 +1,12 @@
> +// PR c++/86201
> +// { dg-do compile { target c++11 } }
> +
> +template <typename U, typename V>
> +auto fn1 (V&& v) -> decltype(U(v))
> +{
> +  return; // { dg-error "return-statement with no value" }
> +}
> +void fn2 ()
> +{
> +  fn1<bool>(1.0);
> +}


Re: [C++ Patch] More location fixes to grokdeclarator

2018-07-03 Thread Jason Merrill
On Thu, Jun 28, 2018 at 5:39 AM, Paolo Carlini  wrote:
> Hi,
>
> On 28/06/2018 03:22, David Malcolm wrote:
>>
>> [snip]
>>>
>>> If I'm following you right, the idea is that gcc should complain
>>> because two different things in the user's source code contradict
>>> each
>>> other.
>>>
>>> In such circumstances, I think we ought to try to print *both*
>>> locations, so that we're showing, rather than just telling.
>>
>> Or to put in another way, if two things in the user's source contradict
>> each other, we should show the user both.  The user is going to have to
>> decide to delete one (or both) of them, and we don't know which one,
>> but at least by showing both it helps him/her take their next action.
>
> Sure, makes sense. Thus the below uses rich_location the way you explained.
> I also added 2 specific testcases and extended a bit another one to exercise
> a bit more min_location. Of course the patch doesn't add max_location
> anymore, I suspect we are not going to find uses for a max anytime soon,
> because we really want rich_location with multiple ranges in all those
> cases...

>if ((type_quals & TYPE_QUAL_VOLATILE)
> -  && (loc == UNKNOWN_LOCATION || locations[ds_volatile] < loc))
> +  && (loc == UNKNOWN_LOCATION
> +  || linemap_location_before_p (line_table,
> +locations[ds_volatile], loc)))
>  loc = locations[ds_volatile];

>if ((type_quals & TYPE_QUAL_RESTRICT)
> -  && (loc == UNKNOWN_LOCATION || locations[ds_restrict] < loc))
> +  && (loc == UNKNOWN_LOCATION
> +  || linemap_location_before_p (line_table,
> +locations[ds_restrict], loc)))
>  loc = locations[ds_restrict];

Why not use min_location here?
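
To spell out what I mean, the repeated pattern above could be folded
into one helper along these lines.  This is only a sketch with
stand-in types: loc_t, UNKNOWN_LOC, before_p and earlier_location are
hypothetical, and I am not claiming this is how min_location is
actually defined in the patch:

```cpp
#include <cassert>

typedef unsigned int loc_t;   // stand-in for location_t
const loc_t UNKNOWN_LOC = 0;  // stand-in for UNKNOWN_LOCATION

// Stand-in for linemap_location_before_p: here a smaller value simply
// means an earlier position.
bool before_p(loc_t a, loc_t b)
{
  return a < b;
}

// Returns `candidate` if no location has been chosen yet or if it comes
// before the current choice; otherwise keeps `loc`.  The caller is
// assumed to pass a valid `candidate`, as the qualifier checks above do.
loc_t earlier_location(loc_t loc, loc_t candidate)
{
  if (loc == UNKNOWN_LOC || before_p(candidate, loc))
    return candidate;
  return loc;
}
```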

Jason


Re: C++ PATCH for c++/57891, narrowing conversions in non-type template arguments

2018-07-03 Thread Jason Merrill
On Fri, Jun 29, 2018 at 3:58 PM, Marek Polacek  wrote:
> On Wed, Jun 27, 2018 at 07:35:15PM -0400, Jason Merrill wrote:
>> On Wed, Jun 27, 2018 at 12:53 PM, Marek Polacek  wrote:
>> > This PR complains about us accepting invalid code like
>> >
>> >   template struct A {};
>> >   A<-1> a;
>> >
>> > Where we should detect the narrowing: [temp.arg.nontype] says
>> > "A template-argument for a non-type template-parameter shall be a converted
>> > constant expression ([expr.const]) of the type of the template-parameter."
>> > and a converted constant expression can contain only
>> > - integral conversions other than narrowing conversions,
>> > - [...]."
>> > It spurred e.g.
>> > 
>> > and has >=3 dups so it has some visibility.
>> >
>> > I think build_converted_constant_expr needs to set check_narrowing.
>> > check_narrowing also always mentions that it's in { } but that is no longer
> >> > true; in the future it will also apply to <=>.  We'd probably have to add a new
> >> > flag to struct conversion if we wanted to distinguish between these.
>> >
>> > This does not yet fix detecting narrowing in function templates (78244).
>> >
>> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
>> >
>> > 2018-06-27  Marek Polacek  
>> >
>> > PR c++/57891
>> > * call.c (build_converted_constant_expr): Set check_narrowing.
>> > * decl.c (compute_array_index_type): Add warning sentinel.  Use
>> > input_location.
>> > * pt.c (convert_nontype_argument): Return NULL_TREE if any errors
>> > were reported.
>> > * typeck2.c (check_narrowing): Don't mention { } in diagnostic.
>> >
>> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
>> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
>> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
>> > * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
>> > * g++.dg/init/new43.C: Adjust dg-error.
>> > * g++.dg/other/fold1.C: Likewise.
>> > * g++.dg/parse/array-size2.C: Likewise.
>> > * g++.dg/other/vrp1.C: Add dg-error.
>> > * g++.dg/template/char1.C: Likewise.
>> > * g++.dg/ext/builtin12.C: Likewise.
>> > * g++.dg/template/dependent-name3.C: Adjust dg-error.
>> >
>> > diff --git gcc/cp/call.c gcc/cp/call.c
>> > index 209c1fd2f0e..956c7b149dc 100644
>> > --- gcc/cp/call.c
>> > +++ gcc/cp/call.c
>> > @@ -4152,7 +4152,10 @@ build_converted_constant_expr (tree type, tree 
>> > expr, tsubst_flags_t complain)
>> >  }
>> >
>> >if (conv)
>> > -expr = convert_like (conv, expr, complain);
>> > +{
>> > +  conv->check_narrowing = !processing_template_decl;
>>
>> Why !processing_template_decl?  This needs a comment.
>
> Otherwise we'd warn for e.g.
>
> template<int N> struct S { char a[N]; };
> S<1> s;
>
> where compute_array_index_type will try to convert the size of the array
> (which is a template_parm_index of type int when parsing the template) to
> size_type.
> So I guess I can say that we need to wait for instantiation?

We certainly shouldn't give a narrowing diagnostic about a
value-dependent expression.  It probably makes sense to check that at
the top of check_narrowing, with all the other early exit conditions.
But if we do know the constant value in the template, it's good to
complain then rather than wait for instantiation.

Jason
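
The distinction drawn above (diagnose a known constant right away, say nothing about a value-dependent expression) can be sketched in plain C++.  This is only an illustration: converts_without_narrowing is a hypothetical helper, not a GCC function, and the unsigned int parameter type of A is an assumption, because the archive dropped the template parameter list.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helper (not part of GCC): true when the constant V can be
// converted to the integral type To without changing its value, i.e. the
// conversion is not narrowing for this particular constant.
template <typename To>
constexpr bool converts_without_narrowing (long long v)
{
  static_assert (std::is_integral<To>::value, "integral targets only");
  return v < 0
    ? (std::is_signed<To>::value
       && v >= static_cast<long long> (std::numeric_limits<To>::min ()))
    : (static_cast<unsigned long long> (v)
       <= static_cast<unsigned long long> (std::numeric_limits<To>::max ()));
}

// Assumed parameter type: unsigned int.
template <unsigned int N> struct A {};
A<1> ok;          // 1 converts to unsigned int without narrowing
// A<-1> bad;     // ill-formed: -1 -> unsigned int is a narrowing conversion

// Inside the template N is value-dependent, so there is nothing to diagnose
// until an instantiation such as S<1> supplies the value.
template <int N> struct S { char a[N]; };
S<1> s;

static_assert (converts_without_narrowing<unsigned int> (1), "fits");
static_assert (!converts_without_narrowing<unsigned int> (-1), "narrows");
```

The commented-out A<-1> line is the case the PR is about; it is left as a comment so the sketch still compiles.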


Re: [PATCH] i386; Add indirect_return function attribute

2018-07-03 Thread H.J. Lu
On Tue, Jul 3, 2018 at 9:12 AM, Uros Bizjak  wrote:
> On Tue, Jul 3, 2018 at 5:32 PM, H.J. Lu  wrote:
>> On Fri, Jun 8, 2018 at 3:27 AM, H.J. Lu  wrote:
>>> On x86, swapcontext may return via indirect branch when shadow stack
>>> is enabled.  To support code instrumentation of control-flow transfers
>>> with -fcf-protection, add indirect_return function attribute to inform
>>> compiler that a function may return via indirect branch.
>>>
>>> Note: Unlike setjmp, swapcontext only returns once.  Marking it
>>> returns_twice would unnecessarily disable compiler optimizations.
>>>
>>> OK for trunk?
>>>
>>> H.J.
>>> 
>>> gcc/
>>>
>>> PR target/85620
>>> * config/i386/i386.c (rest_of_insert_endbranch): Also generate
>>> ENDBRANCH for non-tail call which may return via indirect branch.
>>> * doc/extend.texi: Document indirect_return attribute.
>>>
>>> gcc/testsuite/
>>>
>>> PR target/85620
>>> * gcc.target/i386/pr85620-1.c: New test.
>>> * gcc.target/i386/pr85620-2.c: Likewise.
>>>
>>
>> Here is the updated patch with a testcase to show the impact of
>> returns_twice attribute.
>>
>> Jan, Uros, can you take a look?
>
> LGTM for the implementation, can't say if attribute is really needed or not.

This gives programmers more flexibility.
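
As a rough sketch of how a declaration might carry the attribute: switch_context_like and call_site are hypothetical names standing in for a swapcontext-like routine and its caller, and the __has_attribute guard is an assumption so that the snippet also builds on compilers or targets that do not know indirect_return.

```cpp
#include <cassert>

// Use the attribute only where the compiler reports support for it;
// elsewhere the macro expands to nothing so the sketch still builds.
#if defined (__has_attribute)
#  if __has_attribute (indirect_return)
#    define INDIRECT_RETURN __attribute__ ((indirect_return))
#  endif
#endif
#ifndef INDIRECT_RETURN
#  define INDIRECT_RETURN
#endif

// Hypothetical stand-in for a swapcontext-like routine.  It returns once,
// but with shadow stack enabled the return may arrive via an indirect
// branch, so call sites need an ENDBRANCH after the call.
int switch_context_like (int value) INDIRECT_RETURN;

int switch_context_like (int value)
{
  return value + 1;
}

// An ordinary (non-tail) call site; with -fcf-protection this is where the
// patch would place the extra ENDBRANCH.
int call_site (int value)
{
  return switch_context_like (value) * 2;
}
```

Unlike returns_twice, the attribute says nothing about the function returning more than once, so the caller can still be optimized normally.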

> +@item indirect_return
> +@cindex @code{indirect_return} function attribute, x86
> +
> +The @code{indirect_return} attribute on a function is used to inform
> +the compiler that the function may return via indiret branch.
>
> s/indiret/indirect/

Fixed.  Here is the updated patch.

Thanks.

-- 
H.J.
From bb98f6a31801659ae3c6689d6d31af33a3c28bb2 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 7 Jun 2018 20:05:15 -0700
Subject: [PATCH] i386; Add indirect_return function attribute

On x86, swapcontext may return via indirect branch when shadow stack
is enabled.  To support code instrumentation of control-flow transfers
with -fcf-protection, add indirect_return function attribute to inform
compiler that a function may return via indirect branch.

Note: Unlike setjmp, swapcontext only returns once.  Marking it
returns_twice would unnecessarily disable compiler optimizations, as
shown in the testcase here.

gcc/

	PR target/85620
	* config/i386/i386.c (rest_of_insert_endbranch): Also generate
	ENDBRANCH for non-tail call which may return via indirect branch.
	* doc/extend.texi: Document indirect_return attribute.

gcc/testsuite/

	PR target/85620
	* gcc.target/i386/pr85620-1.c: New test.
	* gcc.target/i386/pr85620-2.c: Likewise.
	* gcc.target/i386/pr85620-3.c: Likewise.
	* gcc.target/i386/pr85620-4.c: Likewise.
---
 gcc/config/i386/i386.c| 23 ++-
 gcc/doc/extend.texi   |  6 ++
 gcc/testsuite/gcc.target/i386/pr85620-1.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr85620-2.c | 13 +
 gcc/testsuite/gcc.target/i386/pr85620-3.c | 18 ++
 gcc/testsuite/gcc.target/i386/pr85620-4.c | 18 ++
 6 files changed, 92 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e6d17632142..41461d582a4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2621,7 +2621,26 @@ rest_of_insert_endbranch (void)
 	{
 	  if (CALL_P (insn))
 	{
-	  if (find_reg_note (insn, REG_SETJMP, NULL) == NULL)
+	  bool need_endbr;
+	  need_endbr = find_reg_note (insn, REG_SETJMP, NULL) != NULL;
+	  if (!need_endbr && !SIBLING_CALL_P (insn))
+		{
+		  rtx call = get_call_rtx_from (insn);
+		  rtx fnaddr = XEXP (call, 0);
+
+		  /* Also generate ENDBRANCH for non-tail call which
+		 may return via indirect branch.  */
+		  if (MEM_P (fnaddr)
+		  && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF)
+		{
+		  tree fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
+		  if (fndecl
+			  && lookup_attribute ("indirect_return",
+	   DECL_ATTRIBUTES (fndecl)))
+			need_endbr = true;
+		}
+		}
+	  if (!need_endbr)
 		continue;
 	  /* Generate ENDBRANCH after CALL, which can return more than
 		 twice, setjmp-like functions.  */
@@ -45897,6 +45916,8 @@ static const struct attribute_spec ix86_attribute_table[] =
 ix86_handle_fndecl_attribute, NULL },
   { "function_return", 1, 1, true, false, false, false,
 ix86_handle_fndecl_attribute, NULL },
+  { "indirect_return", 0, 0, true, false, false, false,
+ix86_handle_fndecl_attribute, NULL },
 
   /* End element.  */
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 19c2da2e5db..071d0ffc414 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5886,6 +5886,12 @@ foo (void)
 @}
 @end smallexample

Re: [PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-07-03 Thread James Greenhalgh
On Tue, Jun 19, 2018 at 09:09:27AM -0500, Tamar Christina wrote:
> Hi All,



OK.

Thanks,
James

> Thanks,
> Tamar
> 
> gcc/
> 2018-06-19  Tamar Christina  
> 
>   * config/aarch64/aarch64.c (aarch64_expand_movmem): Fix mode size.
> 
> gcc/testsuite/
> 2018-06-19  Tamar Christina  
> 
>   * gcc.target/aarch64/struct_cpy.c: New.
> 
> -- 


[PATCH] Fix bootstrap on ia64 with old GCC version.

2018-07-03 Thread Martin Liška

Hi.

In order to make GCC 4.1 happy and able to build the current tip, we need to
define static constants outside of the class definition.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2018-07-03  Martin Liska  

* tree-switch-conversion.h (struct jump_table_cluster): Define
constant values outside of class declaration.
---
 gcc/tree-switch-conversion.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)


diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
index 4beac785f05..8efb125aff1 100644
--- a/gcc/tree-switch-conversion.h
+++ b/gcc/tree-switch-conversion.h
@@ -259,12 +259,17 @@ struct jump_table_cluster: public group_cluster
   static bool is_enabled (void);
 
   /* Max growth ratio for code that is optimized for size.  */
-  static const unsigned HOST_WIDE_INT max_ratio_for_size = 3;
+  static const unsigned HOST_WIDE_INT max_ratio_for_size;
 
   /* Max growth ratio for code that is optimized for speed.  */
-  static const unsigned HOST_WIDE_INT max_ratio_for_speed = 8;
+  static const unsigned HOST_WIDE_INT max_ratio_for_speed;
 };
 
+const unsigned HOST_WIDE_INT jump_table_cluster::max_ratio_for_size = 3;
+
+const unsigned HOST_WIDE_INT jump_table_cluster::max_ratio_for_speed = 8;
+
+
 /* A GIMPLE switch statement can be expanded to a short sequence of bit-wise
 comparisons.  "switch(x)" is converted into "if ((1 << (x-MINVAL)) & CST)"
 where CST and MINVAL are integer constants.  This is better than a series



[PATCH] Fix DOS-based system build and fix documentation.

2018-07-03 Thread Martin Liška

Hi.

I'm sending a fix for DOS-based systems; it's a compilation error that
I introduced some time ago.  I'm also adding Jonathan's correction of a
documentation entry.

Ready for trunk?
Thanks,
Martin
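
For readers skimming the coverage.c hunk below, here is a small sketch of the type problem: a string literal cannot initialize a single char, so the separator has to be a pointer.  join_profile_path and its dos_based flag are hypothetical and replace the HAVE_DOS_BASED_FILE_SYSTEM preprocessor check purely for illustration.

```cpp
#include <cassert>
#include <string>

// const char separator = "\\";   // does not compile: a string literal
//                                // cannot initialize a single char

// Hypothetical helper: joins a profile prefix and a file name with the
// separator that matches the host file system.
static std::string
join_profile_path (const std::string &prefix, const std::string &name,
                   bool dos_based)
{
  const char *separator = dos_based ? "\\" : "/";
  return prefix + separator + name;
}
```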

gcc/ChangeLog:

2018-07-03  Martin Liska  
Jonathan Wakely  

* coverage.c: Use correct type.
* doc/invoke.texi: Language correction.
---
 gcc/coverage.c  | 2 +-
 gcc/doc/invoke.texi | 7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)


diff --git a/gcc/coverage.c b/gcc/coverage.c
index 9c9d3dbd39e..da171c84d3c 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1227,7 +1227,7 @@ coverage_init (const char *filename)
   if (profile_data_prefix)
 	{
 #if HAVE_DOS_BASED_FILE_SYSTEM
-	  const char separator = "\\";
+	  const char *separator = "\\";
 #else
 	  const char *separator = "/";
 #endif
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56cd122b0d7..31d4f1047ba 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11352,9 +11352,10 @@ and used by @option{-fprofile-use} and @option{-fbranch-probabilities}
 and its related options.  Both absolute and relative paths can be used.
 By default, GCC uses the current directory as @var{path}, thus the
 profile data file appears in the same directory as the object file.
-In order to prevent filename clashing, if object file name is not an absolute
-path, we mangle absolute path of @file{@var{sourcename}.gcda} file and
-use it as file name of a @file{.gcda} file.
+In order to prevent the file name clashing, if the object file name is
+not an absolute path, we mangle the absolute path of the
+@file{@var{sourcename}.gcda} file and use it as the file name of a
+@file{.gcda} file.
 
 When an executable is run in a massive parallel environment, it is recommended
 to save profile to different folders.  That can be done with variables



[PATCH] Remove legacy testcase for -fprofile-generate=./

2018-07-03 Thread Martin Liška

Hi.

As the new option mangles the absolute path of a compiled file, it's hard
to come up with a scan-file pattern.  Note that a similar test is done in
gcc.dg/profile-dir-*.c, which I adjusted a few days ago.

Ready for trunk?
Thanks,
Martin

gcc/testsuite/ChangeLog:

2018-07-03  Martin Liska  

* gcc.dg/pr47793.c: Remove.
---
 gcc/testsuite/gcc.dg/pr47793.c | 13 -
 1 file changed, 13 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/pr47793.c


diff --git a/gcc/testsuite/gcc.dg/pr47793.c b/gcc/testsuite/gcc.dg/pr47793.c
deleted file mode 100644
index 0ee1aaee421..000
--- a/gcc/testsuite/gcc.dg/pr47793.c
+++ /dev/null
@@ -1,13 +0,0 @@
-/* Bug pr47793: Allow relative paths in profile-generate.  */
-/* { dg-do run } */
-/* { dg-options "-O -fprofile-generate=./" } */
-/* { dg-require-profiling "-fprofile-generate" } */
-/* { dg-final { scan-file pr47793.gcda "."} } */
-
-int
-main(void)
-{
-  return 0;
-}
-
-/* { dg-final { cleanup-coverage-files } } */



Re: [PATCH] Fix bootstrap on ia64 with old GCC version.

2018-07-03 Thread Jakub Jelinek
On Tue, Jul 03, 2018 at 07:22:19PM +0200, Martin Liška wrote:
> In order to make GCC 4.1 happy and build current tip, we need to define
> static constants out of a class definition.
> 
> Ready for trunk?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-07-03  Martin Liska  
> 
>   * tree-switch-conversion.h (struct jump_table_cluster): Define
> constant values outside of class declaration.

That looks incorrect.  I don't see why 4.1 wouldn't allow the const static
data member initializers inside of the class.

You just need to define those vars, and the definition (without the
initializers) shouldn't go into the header, but to a single .c file instead
(I know right now there is just one .c file that includes this header, but
if we ever want to include it in more than one, it would be a problem;
if we never want to include in more than one, the question is why we have
the header file at all).

So IMHO keep tree-switch-conversion.h unmodified and add:

const unsigned HOST_WIDE_INT jump_table_cluster::max_ratio_for_size;
const unsigned HOST_WIDE_INT jump_table_cluster::max_ratio_for_speed;

to tree-switch-conversion.c somewhere.

> diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
> index 4beac785f05..8efb125aff1 100644
> --- a/gcc/tree-switch-conversion.h
> +++ b/gcc/tree-switch-conversion.h
> @@ -259,12 +259,17 @@ struct jump_table_cluster: public group_cluster
>static bool is_enabled (void);
>  
>/* Max growth ratio for code that is optimized for size.  */
> -  static const unsigned HOST_WIDE_INT max_ratio_for_size = 3;
> +  static const unsigned HOST_WIDE_INT max_ratio_for_size;
>  
>/* Max growth ratio for code that is optimized for speed.  */
> -  static const unsigned HOST_WIDE_INT max_ratio_for_speed = 8;
> +  static const unsigned HOST_WIDE_INT max_ratio_for_speed;
>  };
>  
> +const unsigned HOST_WIDE_INT jump_table_cluster::max_ratio_for_size = 3;
> +
> +const unsigned HOST_WIDE_INT jump_table_cluster::max_ratio_for_speed = 8;
> +
> +
>  /* A GIMPLE switch statement can be expanded to a short sequence of bit-wise
>  comparisons.  "switch(x)" is converted into "if ((1 << (x-MINVAL)) & CST)"
>  where CST and MINVAL are integer constants.  This is better than a series
> 


Jakub
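
A minimal sketch of the arrangement suggested above, with cluster_limits as a hypothetical stand-in for jump_table_cluster and unsigned long in place of unsigned HOST_WIDE_INT: the initializers stay in the class, and the initializer-free definitions go into exactly one source file.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for jump_table_cluster.
struct cluster_limits
{
  // In-class initializers: these are declarations, usable in constant
  // expressions, but they do not define storage for the members.
  static const unsigned long max_ratio_for_size = 3;
  static const unsigned long max_ratio_for_speed = 8;
};

// Definitions without initializers.  In a real project these belong in
// exactly one source file, never in the header.
const unsigned long cluster_limits::max_ratio_for_size;
const unsigned long cluster_limits::max_ratio_for_speed;

unsigned long
largest_ratio ()
{
  // std::max takes its arguments by reference, which odr-uses both
  // members and therefore requires the definitions above.
  return std::max (cluster_limits::max_ratio_for_size,
                   cluster_limits::max_ratio_for_speed);
}
```

Without the two out-of-class definitions, a use like the std::max call can fail at link time on some compilers and optimization levels, which is presumably what the old host compiler ran into.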


[PATCH 18/n, 386]: Fix PR85694, Generation of vectorized AVG (Average) instruction

2018-07-03 Thread Uros Bizjak
Hello!

Attached patch implements unsigned HImode and QImode vector average
instructions. This is all x86 has to offer...
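
For reference, the rounding average that PAVGB and PAVGW compute per element is (a + b + 1) >> 1 evaluated without overflow.  The sketch below (function names are hypothetical) compares the widened form used in the new test with an equivalent form that never leaves 8 bits, and checks them exhaustively.

```cpp
#include <cassert>
#include <cstdint>

// Widened form: do the addition in a wider type, then shift.
static std::uint8_t
avg_ceil_widened (std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t> ((static_cast<unsigned> (a) + b + 1u) >> 1);
}

// Overflow-free form: a + b == 2 * (a | b) - (a ^ b), so the rounded-up
// half is (a | b) - ((a ^ b) >> 1).
static std::uint8_t
avg_ceil_narrow (std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t> ((a | b) - ((a ^ b) >> 1));
}

// Exhaustive comparison over all pairs of 8-bit inputs.
static bool
forms_agree ()
{
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (avg_ceil_widened (static_cast<std::uint8_t> (a),
                            static_cast<std::uint8_t> (b))
          != avg_ceil_narrow (static_cast<std::uint8_t> (a),
                              static_cast<std::uint8_t> (b)))
        return false;
  return true;
}
```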

2018-07-03  Uros Bizjak  

PR target/85694
* config/i386/sse.md (uavg3_ceil): New expander.
(_uavg3): Simplify expander.

testsuite/ChangeLog:

2018-07-03  Uros Bizjak  

PR target/85694
* gcc.target/i386/pr85694.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 262347)
+++ config/i386/sse.md  (working copy)
@@ -10764,6 +10764,24 @@
   DONE;
 })
 
+(define_expand "uavg3_ceil"
+  [(set (match_operand:VI12_AVX2 0 "register_operand")
+   (truncate:VI12_AVX2
+ (lshiftrt:
+   (plus:
+ (plus:
+   (zero_extend:
+ (match_operand:VI12_AVX2 1 "vector_operand"))
+   (zero_extend:
+ (match_operand:VI12_AVX2 2 "vector_operand")))
+ (match_dup 3))
+   (const_int 1]
+  "TARGET_SSE2"
+{
+  operands[3] = CONST1_RTX(mode);
+  ix86_fixup_binary_operands_no_copy (PLUS, mode, operands);
+})
+
 (define_expand "usadv16qi"
   [(match_operand:V4SI 0 "register_operand")
(match_operand:V16QI 1 "register_operand")
@@ -14234,17 +14252,8 @@
(const_int 1]
   "TARGET_SSE2 &&  && "
 {
-  rtx tmp;
-  if ()
-tmp = operands[3];
-  operands[3] = CONST1_RTX(mode);
+  operands[] = CONST1_RTX(mode);
   ix86_fixup_binary_operands_no_copy (PLUS, mode, operands);
-
-  if ()
-{
-  operands[5] = operands[3];
-  operands[3] = tmp;
-}
 })
 
 (define_insn "*_uavg3"
Index: testsuite/gcc.target/i386/pr85694.c
===
--- testsuite/gcc.target/i386/pr85694.c (nonexistent)
+++ testsuite/gcc.target/i386/pr85694.c (working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-msse2 -O2 -ftree-vectorize" } */
+/* { dg-final { scan-assembler "pavgb" } } */
+/* { dg-final { scan-assembler "pavgw" } } */
+
+#define N 1024
+
+#define TEST(TYPE) \
+  unsigned TYPE a_##TYPE[N], b_##TYPE[N], c_##TYPE[N]; \
+  void f_##TYPE (void) \
+  {\
+int i; \
+for (i = 0; i < N; i++)\
+  a_##TYPE[i] = (b_##TYPE[i] + c_##TYPE[i] + 1) >> 1;  \
+  }
+
+TEST(char);
+TEST(short);


Re: [patch, fortran] Asynchronous I/O, take 3

2018-07-03 Thread Thomas König

Hi Rainer,


However, many
(all?) gfortran tests now SEGV.  One example is

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
Segmentation Fault

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0xfe1b1f03 in pthread_mutex_unlock () from /lib/libc.so.1
(gdb) where
#0  0xfe1b1f03 in pthread_mutex_unlock () from /lib/libc.so.1
#1  0xfe5d1b7c in __gthread_mutex_unlock (__mutex=0x18)
 at ../libgcc/gthr-default.h:778
#2  _gfortran_st_rewind (fpp=0xfeffda9c)
 at /vol/gcc/src/hg/trunk/solaris/libgfortran/io/file_pos.c:486
#3  0x0805110f in MAIN__ ()
 at 
/vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gfortran.dg/backslash_2.f90:6


Ah, I see what was wrong.

The attached patch should fix this.

I have also attached a new test case which detects this error
even on Linux systems, plus a ChangeLog which fixes the typo :-)

Again regression-tested.

So, OK for trunk?

Regards

Thomas

2018-07-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* gfortran.texi: Add description of asynchronous I/O.
* trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
as volatile.
* trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
st_wait_async and change argument spec from ".X" to ".w".
(gfc_trans_wait): Pass ID argument via reference.

2018-07-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* gfortran.dg/f2003_inquire_1.f03: Add write statement.
* gfortran.dg/f2003_io_1.f03: Add wait statement.

2018-01-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* Makefile.am: Add async.c to gfor_io_src.
Add async.h to gfor_io_headers.
* Makefile.in: Regenerated.
* gfortran.map: Add _gfortran_st_wait_async.
* io/async.c: New file.
* io/async.h: New file.
* io/close.c: Include async.h.
(st_close): Call async_wait for an asynchronous unit.
* io/file_pos.c (st_backspace): Likewise.
(st_endfile): Likewise.
(st_rewind): Likewise.
(st_flush): Likewise.
* io/inquire.c: Add handling for asynchronous PENDING
and ID arguments.
* io/io.h (st_parameter_dt): Add async bit.
(st_parameter_wait): Correct.
(gfc_unit): Add au pointer.
(st_wait_async): Add prototype.
(transfer_array_inner): Likewise.
(st_write_done_worker): Likewise.
* io/open.c: Include async.h.
(new_unit): Initialize asynchronous unit.
* io/transfer.c (async_opt): New struct.
(wrap_scalar_transfer): New function.
(transfer_integer): Call wrap_scalar_transfer to do the work.
(transfer_real): Likewise.
(transfer_real_write): Likewise.
(transfer_character): Likewise.
(transfer_character_wide): Likewise.
(transfer_complex): Likewise.
(transfer_array_inner): New function.
(transfer_array): Call transfer_array_inner.
(transfer_derived): Call wrap_scalar_transfer.
(data_transfer_init): Check for asynchronous I/O.
Perform a wait operation on any pending asynchronous I/O
if the data transfer is synchronous. Copy PDT and enqueue
thread for data transfer.
(st_read_done_worker): New function.
(st_read_done): Enqueue transfer or call st_read_done_worker.
(st_write_done_worker): New function.
(st_write_done): Enqueue transfer or call st_read_done_worker.
(st_wait): Document as no-op for compatibility reasons.
(st_wait_async): New function.
* io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
add NOTE where necessary.
(get_gfc_unit): Likewise.
(init_units): Likewise.
(close_unit_1): Likewise. Call async_close if asynchronous.
(close_unit): Use macros LOCK and UNLOCK.
(finish_last_advance_record): Likewise.
(newunit_alloc): Likewise.
* io/unix.c (find_file): Likewise.
(flush_all_units_1): Likewise.
(flush_all_units): Likewise.
* libgfortran.h (generate_error_common): Add prototype.
* runtime/error.c: Include io.h and async.h.
(generate_error_common): New function.

2018-07-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* testsuite/libgomp.fortran/async_io_1.f90: New test.
* testsuite/libgomp.fortran/async_io_2.f90: New test.
* testsuite/libgomp.fortran/async_io_3.f90: New test.
* testsuite/libgomp.fortran/async_io_4.f90: New test.



Obviously __mutex above hasn't been properly initialized.


2018-07-02  Nicolas Koenig  
 Thomas Koenig 

 PR fortran/25829
 * testsuite/libgfomp.fortran/async_io_1.f90: New test.
 * testsuite/libgfomp.fortran/async_io_2.f90: New test.
 * testsuite/libgfomp.fortran/async_io

Re: extract_range_from_binary* cleanups for VRP

2018-07-03 Thread Aldy Hernandez



On 07/03/2018 08:16 AM, Martin Liška wrote:

Hi.

It caused UBSAN errors:

$ cat ubsan.i
int a;
void d() { int c, b = 8 - a; }

$ /home/marxin/Programming/gcc2/objdir/./gcc/xgcc 
-B/home/marxin/Programming/gcc2/objdir/./gcc/ ubsan.i -c -O2
../../gcc/tree-vrp.c:1715:26: runtime error: load of value 255, which is not a 
valid value for type 'bool'
 #0 0x3246ca2 in extract_range_from_binary_expr_1(value_range*, tree_code, 
tree_node*, value_range*, value_range*) ../../gcc/tree-vrp.c:1715
 #1 0x34aa8b6 in vr_values::extract_range_from_binary_expr(value_range*, 
tree_code, tree_node*, tree_node*, tree_node*) ../../gcc/vr-values.c:794
 #2 0x34b45fa in vr_values::extract_range_from_assignment(value_range*, 
gassign*) ../../gcc/vr-values.c:1455
 #3 0x494cfd5 in evrp_range_analyzer::record_ranges_from_stmt(gimple*, 
bool) ../../gcc/gimple-ssa-evrp-analyze.c:293
 #4 0x4942548 in evrp_dom_walker::before_dom_children(basic_block_def*) 
../../gcc/gimple-ssa-evrp.c:139
 #5 0x487652b in dom_walker::walk(basic_block_def*) ../../gcc/domwalk.c:353
 #6 0x49470f9 in execute_early_vrp ../../gcc/gimple-ssa-evrp.c:310
 #7 0x49470f9 in execute ../../gcc/gimple-ssa-evrp.c:347
 #8 0x1fc4a0e in execute_one_pass(opt_pass*) ../../gcc/passes.c:2446
 #9 0x1fc8b47 in execute_pass_list_1 ../../gcc/passes.c:2535
 #10 0x1fc8b8e in execute_pass_list_1 ../../gcc/passes.c:2536
 #11 0x1fc8c68 in execute_pass_list(function*, opt_pass*) 
../../gcc/passes.c:2546
 #12 0x2004c85 in do_per_function_toporder(void (*)(function*, void*), 
void*) ../../gcc/passes.c:1688
 #13 0x2005e9a in execute_ipa_pass_list(opt_pass*) ../../gcc/passes.c:2894
 #14 0xfcfa79 in ipa_passes ../../gcc/cgraphunit.c:2400
 #15 0xfcfa79 in symbol_table::compile() ../../gcc/cgraphunit.c:2536
 #16 0xfdc52a in symbol_table::finalize_compilation_unit() 
../../gcc/cgraphunit.c:2696
 #17 0x25115e4 in compile_file ../../gcc/toplev.c:479
 #18 0x9278af in do_compile ../../gcc/toplev.c:2086
 #19 0x9278af in toplev::main(int, char**) ../../gcc/toplev.c:2221
 #20 0x92a79a in main ../../gcc/main.c:39
 #21 0x7659c11a in __libc_start_main ../csu/libc-start.c:308
 #22 0x92a8c9 in _start 
(/home/marxin/Programming/gcc2/objdir/gcc/cc1+0x92a8c9)

It's because neg_min_op0, or any other from:
   bool neg_min_op0, neg_min_op1, neg_max_op0, neg_max_op1;


I see.

After this spaghetti...

 if (vr0.type == VR_RANGE && vr1.type == VR_RANGE
  && (TREE_CODE (min_op0) == INTEGER_CST
  || (sym_min_op0
  = get_single_symbol (min_op0, &neg_min_op0, &min_op0)))
  && (TREE_CODE (min_op1) == INTEGER_CST
  || (sym_min_op1
  = get_single_symbol (min_op1, &neg_min_op1, &min_op1)))
  && (!(sym_min_op0 && sym_min_op1)
  || (sym_min_op0 == sym_min_op1
  && neg_min_op0 == (minus_p ? neg_min_op1 : !neg_min_op1)))
  && (TREE_CODE (max_op0) == INTEGER_CST
  || (sym_max_op0
  = get_single_symbol (max_op0, &neg_max_op0, &max_op0)))
  && (TREE_CODE (max_op1) == INTEGER_CST
  || (sym_max_op1
  = get_single_symbol (max_op1, &neg_max_op1, &max_op1)))
  && (!(sym_max_op0 && sym_max_op1)
  || (sym_max_op0 == sym_max_op1
  && neg_max_op0 == (minus_p ? neg_max_op1 : !neg_max_op1

...we would never actually use the neg*op* variables inside the 
adjust_symbolic_bound code.


Does this patch fix the problem on your end?

If so, OK for trunk?
gcc/

	* tree-vrp.c (extract_range_from_binary_expr_1): Initialize
	neg_*_op* variables.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index c966334acbc..65865a7f5b6 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1661,6 +1661,8 @@ extract_range_from_binary_expr_1 (value_range *vr,
   tree sym_max_op1 = NULL_TREE;
   bool neg_min_op0, neg_min_op1, neg_max_op0, neg_max_op1;
 
+  neg_min_op0 = neg_min_op1 = neg_max_op0 = neg_max_op1 = false;
+
   /* If we have a PLUS or MINUS with two VR_RANGEs, either constant or
 	 single-symbolic ranges, try to compute the precise resulting range,
 	 but only if we know that this resulting range will also be constant


Re: [C++ Patch] More location fixes to grokdeclarator

2018-07-03 Thread Paolo Carlini

Hi,

On 03/07/2018 18:36, Jason Merrill wrote:



if ((type_quals & TYPE_QUAL_VOLATILE)
-  && (loc == UNKNOWN_LOCATION || locations[ds_volatile] < loc))
+  && (loc == UNKNOWN_LOCATION
+  || linemap_location_before_p (line_table,
+locations[ds_volatile], loc)))
  loc = locations[ds_volatile];
if ((type_quals & TYPE_QUAL_RESTRICT)
-  && (loc == UNKNOWN_LOCATION || locations[ds_restrict] < loc))
+  && (loc == UNKNOWN_LOCATION
+  || linemap_location_before_p (line_table,
+locations[ds_restrict], loc)))
  loc = locations[ds_restrict];

Why not use min_location here?

Indeed. Thus I successfully tested the below.

Thanks,
Paolo.
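
The semantics of the min_location helper in the patch below can be tried out in isolation.  This is a sketch under stated assumptions: location_t is modelled as a plain unsigned integer with 0 as UNKNOWN_LOCATION, and location_before_p is a hypothetical stand-in for linemap_location_before_p that simply compares the integers.

```cpp
#include <cassert>

// Assumed stand-ins for the GCC types and constants.
typedef unsigned int location_t;
static const location_t UNKNOWN_LOCATION = 0;

// Hypothetical replacement for linemap_location_before_p; the real
// function consults the line table.
static bool
location_before_p (location_t a, location_t b)
{
  return a < b;
}

// Returns the smallest location that is not UNKNOWN_LOCATION, or
// UNKNOWN_LOCATION if both arguments are unknown.
static location_t
min_location (location_t loca, location_t locb)
{
  if (loca == UNKNOWN_LOCATION
      || (locb != UNKNOWN_LOCATION && location_before_p (locb, loca)))
    return locb;
  return loca;
}
```

Because an unknown argument never wins over a known one, callers such as smallest_type_quals_location can fold qualifiers in one at a time without special-casing the first.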

//
Index: cp/decl.c
===
--- cp/decl.c   (revision 262329)
+++ cp/decl.c   (working copy)
@@ -8545,15 +8545,18 @@ check_concept_fn (tree fn)
 {
   // A constraint is nullary.
   if (DECL_ARGUMENTS (fn))
-error ("concept %q#D declared with function parameters", fn);
+error_at (DECL_SOURCE_LOCATION (fn),
+ "concept %q#D declared with function parameters", fn);
 
   // The declared return type of the concept shall be bool, and
   // it shall not be deduced from it definition.
   tree type = TREE_TYPE (TREE_TYPE (fn));
   if (is_auto (type))
-error ("concept %q#D declared with a deduced return type", fn);
+error_at (DECL_SOURCE_LOCATION (fn),
+ "concept %q#D declared with a deduced return type", fn);
   else if (type != boolean_type_node)
-error ("concept %q#D with non-% return type %qT", fn, type);
+error_at (DECL_SOURCE_LOCATION (fn),
+ "concept %q#D with non-% return type %qT", fn, type);
 }
 
 /* Helper function.  Replace the temporary this parameter injected
@@ -9795,6 +9798,18 @@ create_array_type_for_decl (tree name, tree type,
   return build_cplus_array_type (type, itype);
 }
 
+/* Returns the smallest location that is not UNKNOWN_LOCATION.  */
+
+static location_t
+min_location (location_t loca, location_t locb)
+{
+  if (loca == UNKNOWN_LOCATION
+  || (locb != UNKNOWN_LOCATION
+ && linemap_location_before_p (line_table, locb, loca)))
+return locb;
+  return loca;
+}
+
 /* Returns the smallest location != UNKNOWN_LOCATION among the
three stored in LOCATIONS[ds_const], LOCATIONS[ds_volatile],
and LOCATIONS[ds_restrict].  */
@@ -9807,13 +9822,11 @@ smallest_type_quals_location (int type_quals, cons
   if (type_quals & TYPE_QUAL_CONST)
 loc = locations[ds_const];
 
-  if ((type_quals & TYPE_QUAL_VOLATILE)
-  && (loc == UNKNOWN_LOCATION || locations[ds_volatile] < loc))
-loc = locations[ds_volatile];
+  if (type_quals & TYPE_QUAL_VOLATILE)
+loc = min_location (loc, locations[ds_volatile]);
 
-  if ((type_quals & TYPE_QUAL_RESTRICT)
-  && (loc == UNKNOWN_LOCATION || locations[ds_restrict] < loc))
-loc = locations[ds_restrict];
+  if (type_quals & TYPE_QUAL_RESTRICT)
+loc = min_location (loc, locations[ds_restrict]);
 
   return loc;
 }
@@ -10710,14 +10723,20 @@ grokdeclarator (const cp_declarator *declarator,
 {
   if (staticp == 2)
{
- error ("member %qD cannot be declared both % "
-"and %", dname);
+ rich_location richloc (line_table, declspecs->locations[ds_virtual]);
+ richloc.add_range (declspecs->locations[ds_storage_class], false);
+ error_at (&richloc, "member %qD cannot be declared both % "
+   "and %", dname);
  storage_class = sc_none;
  staticp = 0;
}
   if (constexpr_p)
-   error ("member %qD cannot be declared both % "
-  "and %", dname);
+   {
+ rich_location richloc (line_table, declspecs->locations[ds_virtual]);
+ richloc.add_range (declspecs->locations[ds_constexpr], false);
+ error_at (&richloc, "member %qD cannot be declared both % "
+   "and %", dname);
+   }
 }
   friendp = decl_spec_seq_has_spec_p (declspecs, ds_friend);
 
@@ -10726,18 +10745,27 @@ grokdeclarator (const cp_declarator *declarator,
 {
   if (typedef_p)
{
- error ("typedef declaration invalid in parameter declaration");
+ error_at (declspecs->locations[ds_typedef],
+   "typedef declaration invalid in parameter declaration");
  return error_mark_node;
}
   else if (template_parm_flag && storage_class != sc_none)
{
- error ("storage class specified for template parameter %qs", name);
+ error_at (min_location (declspecs->locations[ds_thread],
+ declspecs->locations[ds_storage_class]),
+   "storage class specified for template parameter %qs",
+   name);
  return error_mark_node;
}
   else if (storage_class == sc_static
   || sto

Re: C++ PATCH for c++/57891, narrowing conversions in non-type template arguments

2018-07-03 Thread Marek Polacek
On Tue, Jul 03, 2018 at 12:40:51PM -0400, Jason Merrill wrote:
> On Fri, Jun 29, 2018 at 3:58 PM, Marek Polacek  wrote:
> > On Wed, Jun 27, 2018 at 07:35:15PM -0400, Jason Merrill wrote:
> >> On Wed, Jun 27, 2018 at 12:53 PM, Marek Polacek  wrote:
> >> > This PR complains about us accepting invalid code like
> >> >
> >> >   template struct A {};
> >> >   A<-1> a;
> >> >
> >> > Where we should detect the narrowing: [temp.arg.nontype] says
> >> > "A template-argument for a non-type template-parameter shall be a
> >> > converted constant expression ([expr.const]) of the type of the
> >> > template-parameter."
> >> > and a converted constant expression can contain only
> >> > - integral conversions other than narrowing conversions,
> >> > - [...]."
> >> > It spurred e.g.
> >> > 
> >> > and has >=3 dups so it has some visibility.
> >> >
> >> > I think build_converted_constant_expr needs to set check_narrowing.
> >> > check_narrowing also always mentions that it's in { } but that is no
> >> > longer true; in the future it will also apply to <=>.  We'd probably have
> >> > to add a new flag to struct conversion if we wanted to distinguish
> >> > between these.
> >> >
> >> > This does not yet fix detecting narrowing in function templates (78244).
> >> >
> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> >> >
> >> > 2018-06-27  Marek Polacek  
> >> >
> >> > PR c++/57891
> >> > * call.c (build_converted_constant_expr): Set check_narrowing.
> >> > * decl.c (compute_array_index_type): Add warning sentinel.  Use
> >> > input_location.
> >> > * pt.c (convert_nontype_argument): Return NULL_TREE if any errors
> >> > were reported.
> >> > * typeck2.c (check_narrowing): Don't mention { } in diagnostic.
> >> >
> >> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
> >> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
> >> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
> >> > * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
> >> > * g++.dg/init/new43.C: Adjust dg-error.
> >> > * g++.dg/other/fold1.C: Likewise.
> >> > * g++.dg/parse/array-size2.C: Likewise.
> >> > * g++.dg/other/vrp1.C: Add dg-error.
> >> > * g++.dg/template/char1.C: Likewise.
> >> > * g++.dg/ext/builtin12.C: Likewise.
> >> > * g++.dg/template/dependent-name3.C: Adjust dg-error.
> >> >
> >> > diff --git gcc/cp/call.c gcc/cp/call.c
> >> > index 209c1fd2f0e..956c7b149dc 100644
> >> > --- gcc/cp/call.c
> >> > +++ gcc/cp/call.c
> >> > @@ -4152,7 +4152,10 @@ build_converted_constant_expr (tree type, tree 
> >> > expr, tsubst_flags_t complain)
> >> >  }
> >> >
> >> >if (conv)
> >> > -expr = convert_like (conv, expr, complain);
> >> > +{
> >> > +  conv->check_narrowing = !processing_template_decl;
> >>
> >> Why !processing_template_decl?  This needs a comment.
> >
> > Otherwise we'd warn for e.g.
> >
> > template<int N> struct S { char a[N]; };
> > S<1> s;
> >
> > where compute_array_index_type will try to convert the size of the array
> > (which is a template_parm_index of type int when parsing the template) to
> > size_type.
> > So I guess I can say that we need to wait for instantiation?
> 
> We certainly shouldn't give a narrowing diagnostic about a
> value-dependent expression.  It probably makes sense to check that at
> the top of check_narrowing, with all the other early exit conditions.
> But if we do know the constant value in the template, it's good to
> complain then rather than wait for instantiation.

Makes sense; how about this then?  (Regtest/bootstrap running.)

2018-07-03  Marek Polacek  

PR c++/57891
* call.c (build_converted_constant_expr): Set check_narrowing.
* decl.c (compute_array_index_type): Add warning sentinel.  Use
input_location.
* pt.c (convert_nontype_argument): Return NULL_TREE if any errors
were reported.
* typeck2.c (check_narrowing): Don't warn for instantiation-dependent
expressions or non-constants in a template.  Don't mention { } in
diagnostic.

* g++.dg/cpp0x/Wnarrowing6.C: New test.
* g++.dg/cpp0x/Wnarrowing7.C: New test.
* g++.dg/cpp0x/Wnarrowing8.C: New test.
* g++.dg/cpp0x/Wnarrowing9.C: New test.
* g++.dg/cpp0x/Wnarrowing10.C: New test.
* g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
* g++.dg/init/new43.C: Adjust dg-error.
* g++.dg/other/fold1.C: Likewise.
* g++.dg/parse/array-size2.C: Likewise.
* g++.dg/other/vrp1.C: Add dg-error.
* g++.dg/template/char1.C: Likewise.
* g++.dg/ext/builtin12.C: Likewise.
* g++.dg/template/dependent-name3.C: Adjust dg-error.

diff --git gcc/cp/call.c gcc/cp/call.c
index 209c1fd2f0e..4fb0fa8774b 100644
--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -4152
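
For readers following the thread, the rule under discussion can be
illustrated outside the compiler.  The following is only a sketch under
stated assumptions: narrows and array_size_narrows are hypothetical helper
names, not GCC functions, and the check covers integral types only.  It
shows that a known constant can be diagnosed immediately, while a value
that depends on a template parameter can only be judged once the template
is instantiated.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not part of GCC): true if converting the known
// integral constant V to type To would change its value, which is what
// makes an integral conversion of a constant a narrowing one.
template <typename To, typename From>
constexpr bool narrows(From v)
{
  static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                "this sketch handles integral types only");
  return static_cast<From>(static_cast<To>(v)) != v          // value lost
         || ((v < From(0)) != (static_cast<To>(v) < To(0))); // sign flipped
}

// Inside a template, N is value-dependent: nothing can be decided until
// an instantiation supplies the actual constant.
template <int N>
constexpr bool array_size_narrows()
{
  return narrows<std::size_t>(N);
}

void demo_narrowing()
{
  assert(narrows<unsigned>(-1));        // -1 does not fit in unsigned
  assert(!narrows<unsigned>(1));        // 1 fits
  assert(!array_size_narrows<1>());     // S<1>-style array size is fine
}
```

Calling demo_narrowing() exercises the three cases; the first corresponds
to the A<-1> example from the PR, the last to the S<1> example above.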

Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Martin Liška

On 07/03/2018 10:53 AM, Martin Liška wrote:

Thank you Jeff.

I found some issues when doing a build of all targets (contrib/config-list.mk).
I'll update the patch and test that the affected cross-compilers still produce
the same output.


Hello.

I'm done with testing: I bootstrapped and regtested the patch on x86_64-linux
and ppc64-linux-gnu.
I also built all the cross compilers we have in contrib/config-list.mk and
verified that the results for the
gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c source file are identical
for all cross compilers that I touched in the patch. I tested these options:

-O2
-O2 -falign-loops=256
-O2 -falign-loops=256 -falign-functions=512 -falign-labels=1024 
-falign-jumps=2048
-O2 -falign-loops=1024 -falign-functions=512 -falign-jumps=2048
-O2 -falign-loops=256 -falign-jumps=2048
-O2 -falign-loops=100 -falign-functions=200 -falign-labels=300 -falign-jumps=400
-O2 -falign-loops= -falign-functions=1112 -falign-labels=1113 
-falign-jumps=1114

There are no issues except ones that are already present on current trunk:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86394
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86390

Is the patchset still ready for approval?
Thanks,
Martin
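
As a reading aid for the N[:M[:N2[:M2]]] syntax in the subject, here is a
minimal sketch of splitting such a value into its fields.  It is not the
parser from the patch: parse_align_spec is a hypothetical name, and the
choice to reject an empty value (as in "-falign-loops=" above) is an
assumption of this sketch, not a statement about what GCC does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not GCC's option parser: splits a value of the
// form N[:M[:N2[:M2]]] into one to four decimal fields.  Returns false
// for an empty field, a non-digit character, an over-long field, or
// more than four fields.
bool parse_align_spec(const std::string &spec, std::vector<unsigned> &fields)
{
  fields.clear();
  std::size_t pos = 0;
  for (;;)
    {
      std::size_t colon = spec.find(':', pos);
      std::size_t len = (colon == std::string::npos
                         ? std::string::npos : colon - pos);
      std::string field = spec.substr(pos, len);
      if (field.empty() || field.size() > 9)
        return false;
      for (char c : field)
        if (c < '0' || c > '9')
          return false;
      if (fields.size() == 4)
        return false;                 // a fifth field is not allowed
      fields.push_back(static_cast<unsigned>(
          std::strtoul(field.c_str(), nullptr, 10)));
      if (colon == std::string::npos)
        return true;                  // last field consumed
      pos = colon + 1;
    }
}

void demo_parse_align_spec()
{
  std::vector<unsigned> f;
  assert(parse_align_spec("256", f) && f.size() == 1 && f[0] == 256);
  assert(parse_align_spec("32:25:16", f) && f.size() == 3 && f[2] == 16);
  assert(!parse_align_spec("", f));          // empty value rejected here
  assert(!parse_align_spec("1:2:3:4:5", f)); // too many fields
}
```

The field values in the demo are arbitrary illustrations, not recommended
alignments.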
From cd071ae635d24bfaf41afbe4531de578833dce8c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 21 May 2018 20:58:02 +0200
Subject: [PATCH] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

gcc/ChangeLog:

2018-05-25  Denys Vlasenko  
	Martin Liska  

	PR middle-end/66240
	PR target/45996
	PR c/84100
	* common.opt: Rename align options with 'str_' prefix.
	* common/config/i386/i386-common.c (set_malign_value): New
	function.
	(ix86_handle_option): Use it to set -falign-* options.
	* config/aarch64/aarch64-protos.h (struct tune_params): Change
	type from int to string.
	* config/aarch64/aarch64.c: Update default values from int
	to string.
	* config/alpha/alpha.c (alpha_override_options_after_change):
	Likewise.
	* config/arm/arm.c (arm_override_options_after_change_1): Likewise.
	* config/i386/dragonfly.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Print
	max skip conditionally.
	* config/i386/freebsd.h (SUBALIGN_LOG): New.
	(ASM_OUTPUT_MAX_SKIP_ALIGN): Print
	max skip conditionally.
	* config/i386/gas.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Print
	max skip conditionally.
	* config/i386/gnu-user.h (SUBALIGN_LOG): New.
	(ASM_OUTPUT_MAX_SKIP_ALIGN): Print
	max skip conditionally.
	* config/i386/i386.c (struct ptt): Change type from int to
	string.
	(ix86_default_align): Set default values.
	* config/i386/i386.h (ASM_OUTPUT_MAX_SKIP_PAD): Print
	max skip conditionally.
	* config/i386/iamcu.h (SUBALIGN_LOG): New.
	(ASM_OUTPUT_MAX_SKIP_ALIGN):
	* config/i386/lynx.h (ASM_OUTPUT_MAX_SKIP_ALIGN):
	* config/i386/netbsd-elf.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Print
	max skip conditionally.
	* config/i386/openbsdelf.h (SUBALIGN_LOG): New.
	(ASM_OUTPUT_MAX_SKIP_ALIGN): Print max skip conditionally.
	* config/i386/x86-64.h (SUBALIGN_LOG): New.
	(ASM_OUTPUT_MAX_SKIP_ALIGN): Print
	max skip conditionally.
	(ASM_OUTPUT_MAX_SKIP_PAD): Likewise.
	* config/ia64/ia64.c (ia64_option_override): Set default values
	for alignment options.
	* config/m68k/m68k.c: Handle new str_align_* options.
	* config/mips/mips.c (mips_set_compression_mode): Change
	type of constants.
	(mips_option_override): Set default values for options.
	* config/powerpcspe/powerpcspe.c (rs6000_option_override_internal):
	Likewise.
	* config/rs6000/rs6000.c (rs6000_option_override_internal):
	Likewise.
	* config/rx/rx.c (rx_option_override): Likewise.
	* config/rx/rx.h (JUMP_ALIGN): Use align_jumps_log.
	(LABEL_ALIGN): Use align_labels_log.
	(LOOP_ALIGN): Use align_loops_align.
	* config/s390/s390.c (s390_asm_output_function_label): Use new
	macros.
	* config/sh/sh.c (sh_override_options_after_change):
	Change type of constants.
	* config/spu/spu.c (spu_sched_init): Likewise.
	* config/sparc/sparc.c (sparc_option_override): Set default
	values for options.
	* config/visium/visium.c (visium_option_override): Likewise.
	* config/visium/visium.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Do not
	emit p2align format with last argument if it's not needed.
	* doc/invoke.texi: Document extended format of -falign-*.
	* final.c: Use align_labels alignment.
	* flags.h (struct target_flag_state): Change type to use
	align_flags.
	(struct align_flags_tuple): New.
	(struct align_flags): Likewise.
	(align_loops_log): Redefine macro to use new types.
	(align_loops_max_skip): Redefine macro to use new types.
	(align_jumps_log): Redefine macro to use new types.
	(align_jumps_max_skip): Redefine macro to use new types.
	(align_labels_log): Redefine macro to use new types.
	(align_labels_max_skip): Redefine macro to use new types.
	(align_functions_log): Redefine macro to use new types.
	(align_loops): Redefine macro to use new types.
	(align_jumps): Redefine macro to use new types.
	(align_labels): Redefine macro to use new types.
	(align_functions): Redefine macro to use new types.
	(align_functions_max_skip): Redef

Re: [PATCH 2/3] Temporary remove "at least 8 byte alignment" code from x86

2018-07-03 Thread Martin Liška

On 06/13/2018 04:02 AM, Jeff Law wrote:

On 05/21/2018 07:55 AM, marxin wrote:


gcc/ChangeLog:

2017-04-18  Denys Vlasenko  

 * config/i386/i386-common.c (ix86_handle_option): Remove support
 for obsolete -malign-loops, -malign-jumps and -malign-functions
 options.
 * config/i386/i386.opt: Likewise.

The ChangeLog doesn't seem to match the actual changes.  I think this patch
is more about using the additional info that's potentially provided by
the -falign options to drive backend decisions and dropping the
secondary alignment request.


Hi.

Exactly.



The change seems to consistently do:

(1<<(LOG)))-1)

Which needs some horizontal whitespace.


Fixed in updated version of the patch.

Martin



The meat of the changes seems fairly reasonable.

jeff
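
The directive selection that the patch below applies to
ASM_OUTPUT_MAX_SKIP_ALIGN (as seen in the dragonfly.h hunk) can be
sketched as an ordinary function.  p2align_directive is a hypothetical
name used only for illustration; the real code is a macro that prints
straight to the assembler output file.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Sketch of the macro's logic: emit nothing for LOG == 0, the short
// form when MAX_SKIP is 0 or cannot restrict the alignment anyway
// (MAX_SKIP >= (1 << LOG) - 1), and the three-operand form otherwise.
std::string p2align_directive(int log, int max_skip)
{
  if (log == 0)
    return "";
  char buf[64];
  if (max_skip == 0 || max_skip >= (1 << log) - 1)
    std::snprintf(buf, sizeof buf, "\t.p2align %d\n", log);
  else
    std::snprintf(buf, sizeof buf, "\t.p2align %d,,%d\n", log, max_skip);
  return buf;
}

void demo_p2align_directive()
{
  assert(p2align_directive(4, 15) == "\t.p2align 4\n");      // skip >= 15
  assert(p2align_directive(4, 10) == "\t.p2align 4,,10\n");  // real limit
  assert(p2align_directive(0, 7).empty());                   // no alignment
}
```

With LOG = 4 the alignment is 16 bytes, so at most 15 bytes can ever be
skipped; a MAX_SKIP of 15 or more therefore adds no information and the
short form is enough.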



From d4e3c9f6438dd665a5e1802c625dbbd7742cb52f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 21 May 2018 15:55:09 +0200
Subject: [PATCH] Temporary remove "at least 8 byte alignment" code from x86

gcc/ChangeLog:

2017-04-18  Denys Vlasenko  

	* config/i386/dragonfly.h: (ASM_OUTPUT_MAX_SKIP_ALIGN):
	Use a simpler align directive also if MAXSKIP = ALIGN-1.
	* config/i386/gas.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
	* config/i386/lynx.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
	* config/i386/netbsd-elf.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
	* config/i386/i386.h (ASM_OUTPUT_MAX_SKIP_PAD): Likewise.
	* config/i386/freebsd.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Remove "If N
	is large, do at least 8 byte alignment" code. Add SUBALIGN_LOG
	define. Use a simpler align directive also if MAXSKIP = ALIGN-1.
	* config/i386/gnu-user.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
	* config/i386/iamcu.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
	* config/i386/openbsdelf.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
	* config/i386/x86-64.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
---
 gcc/config/i386/dragonfly.h  | 10 ++
 gcc/config/i386/freebsd.h| 16 +---
 gcc/config/i386/gas.h| 12 +++-
 gcc/config/i386/gnu-user.h   | 16 +---
 gcc/config/i386/i386.h   |  2 +-
 gcc/config/i386/iamcu.h  | 16 +---
 gcc/config/i386/lynx.h   |  6 --
 gcc/config/i386/netbsd-elf.h |  6 --
 gcc/config/i386/openbsdelf.h | 16 +---
 gcc/config/i386/x86-64.h | 16 ++--
 10 files changed, 48 insertions(+), 68 deletions(-)

diff --git a/gcc/config/i386/dragonfly.h b/gcc/config/i386/dragonfly.h
index a05b36435de..40774c0cf7a 100644
--- a/gcc/config/i386/dragonfly.h
+++ b/gcc/config/i386/dragonfly.h
@@ -69,10 +69,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #undef  ASM_OUTPUT_MAX_SKIP_ALIGN
-#define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE, LOG, MAX_SKIP)	\
-  if ((LOG) != 0) {		\
-if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-else fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
+#define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE, LOG, MAX_SKIP)			\
+  if ((LOG) != 0) {			\
+if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1 << (LOG)) - 1)		\
+  fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+else\
+  fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
   }
 #endif
 
diff --git a/gcc/config/i386/freebsd.h b/gcc/config/i386/freebsd.h
index 0f38e6d859a..caac6b38575 100644
--- a/gcc/config/i386/freebsd.h
+++ b/gcc/config/i386/freebsd.h
@@ -92,25 +92,19 @@ along with GCC; see the file COPYING3.  If not see
 
 /* A C statement to output to the stdio stream FILE an assembler
command to advance the location counter to a multiple of 1<= (1 << (LOG)) - 1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+  else\
 	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
-	/* Make sure that we have at least 8 byte alignment if > 8 byte \
-	   alignment is preferred.  */	\
-	if ((LOG) > 3			\
-	&& (1 << (LOG)) > ((MAX_SKIP) + 1)\
-	&& (MAX_SKIP) >= 7)		\
-	  fputs ("\t.p2align 3\n", (FILE));\
-  }	\
 }	\
   } while (0)
 #endif
diff --git a/gcc/config/i386/gas.h b/gcc/config/i386/gas.h
index 25d5f7809b5..e149ab1360c 100644
--- a/gcc/config/i386/gas.h
+++ b/gcc/config/i386/gas.h
@@ -57,7 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #ifdef HAVE_GAS_BALIGN_AND_P2ALIGN 
 #undef ASM_OUTPUT_ALIGN
 #define ASM_OUTPUT_ALIGN(FILE,LOG) \
-  if ((LOG)!=0) fprintf ((FILE), "\t.balign %d\n", 1<<(LOG))
+  if ((LOG)!=0) fprintf ((FILE), "\t.balign %d\n", 1 << (LOG))
 #endif
 
 /* A C statement to output to the stdio stream FILE an assembler
@@ -68,10 +68,12 @@ along with GCC; see the file COPYING3.  If not see
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #  define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP) \
- if ((LOG) != 0) {\
-   if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG)); \
-   else fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP)); \
- }
+if ((LOG) != 0

[committed] Consolidating various H8 patterns

2018-07-03 Thread Jeff Law

Another round.

This merges the movmd/movsd related patterns.  Their primary difference
was normal vs advanced mode, so we can just use the P mode iterator and
significantly reduce duplication here.

This also merges two of the btst patterns and fixes the pattern's
condition.  This also allows the btst to be used more aggressively on
the H8/SX where we see minor improvements in the generated code.

The tstXX patterns get merged.  Sadly the way we use output modifiers to
print different registers is painful -- the more natural way would be to
define the modifier to print a register in the size of the operand.
Instead we have a distinct modifier for each size.  The port exploits
the sub-word addressability in various places.  As a result we can't
combine the output templates as much as I'd like.  I'm pondering
cleaning that up as well.

There's a bit of merging of arithmetic patterns.  What probably gets in
the way most of the time is differing constraints or pattern conditions.
 The former are always a blocker.  The latter we can sometimes work
around (or as in the case of btst point to a minor goof).


Some of the length computations are table driven and were getting in the
way of merging patterns.  So rather than have separate attributes for
addb, addw, addl length computation, we have a single attribute and
select the right table based on the size of the operand.
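
The table selection described above might be pictured like this.  It is a
standalone sketch, not the h8300.c code: Mode, LengthTable and
select_add_length_table are hypothetical stand-ins, and the tables carry
only a name because their real contents are not part of this message.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins for the operand modes and length tables.
enum class Mode { QI, HI, SI };
struct LengthTable { const char *name; };

static const LengthTable addb_length_table = { "addb" };
static const LengthTable addw_length_table = { "addw" };
static const LengthTable addl_length_table = { "addl" };

// One "add" attribute; the operand size picks the table.
const LengthTable &select_add_length_table(Mode mode)
{
  switch (mode)
    {
    case Mode::QI: return addb_length_table;
    case Mode::HI: return addw_length_table;
    case Mode::SI: return addl_length_table;
    }
  std::abort();  // no other mode is expected for an add
}

void demo_select_add_length_table()
{
  assert(std::strcmp(select_add_length_table(Mode::QI).name, "addb") == 0);
  assert(std::strcmp(select_add_length_table(Mode::SI).name, "addl") == 0);
}
```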


That's it until tonight...


Jeff
* config/h8300/h8300.c (h8300_insn_length_from_table): Consolidate
ADDB, ADDW and ADDL into a single ADD attribute which selects the
right table based on the size of the operand.
* config/h8300/h8300.md (length_table): Corresponding changes. All
references to "addb", "addw" and "addl" changed to "add".
(btst patterns): Merge two variants into a single pattern.
(tstqi, tsthi): Likewise.
(addhi3_incdec, addsi3_incdec): Likewise.
(subhi3_h8300hs, subsi3_h8300hs): Likewise.
(mulhi3, mulsi3): Likewise.
(udivhi3, udivsi3): Likewise.
(divhi3, divsi3): Likewise.
(andorqi3, andorhi3, andorsi3): Likewise.

diff --git a/gcc/config/h8300/h8300.c b/gcc/config/h8300/h8300.c
index 697041e..01c765d 100644
--- a/gcc/config/h8300/h8300.c
+++ b/gcc/config/h8300/h8300.c
@@ -2551,14 +2551,14 @@ h8300_insn_length_from_table (rtx_insn *insn, rtx * 
operands)
 case LENGTH_TABLE_NONE:
   gcc_unreachable ();
 
-case LENGTH_TABLE_ADDB:
-  return h8300_binary_length (insn, &addb_length_table);
-
-case LENGTH_TABLE_ADDW:
-  return h8300_binary_length (insn, &addw_length_table);
-
-case LENGTH_TABLE_ADDL:
-  return h8300_binary_length (insn, &addl_length_table);
+case LENGTH_TABLE_ADD:
+  if (GET_MODE (operands[0]) == QImode)
+return h8300_binary_length (insn, &addb_length_table);
+  else if (GET_MODE (operands[0]) == HImode)
+return h8300_binary_length (insn, &addw_length_table);
+  else if (GET_MODE (operands[0]) == SImode)
+return h8300_binary_length (insn, &addl_length_table);
+  gcc_unreachable ();
 
 case LENGTH_TABLE_LOGICB:
   return h8300_binary_length (insn, &logicb_length_table);
diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 6084240..e654784 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -77,7 +77,7 @@
 (define_attr "type" "branch,arith,bitbranch,call"
   (const_string "arith"))
 
-(define_attr "length_table" 
"none,addb,addw,addl,logicb,movb,movw,movl,mova_zero,mova,unary,mov_imm4,short_immediate,bitfield,bitbranch"
+(define_attr "length_table" 
"none,add,logicb,movb,movw,movl,mova_zero,mova,unary,mov_imm4,short_immediate,bitfield,bitbranch"
   (const_string "none"))
 
 ;; The size of instructions in bytes.
@@ -752,17 +752,6 @@
   [(set_attr "length" "2,4")
(set_attr "cc" "set_zn,set_zn")])
 
-(define_insn ""
-  [(set (cc0)
-   (compare (zero_extract:HI (match_operand:HI 0 "register_operand" "r")
- (const_int 1)
- (match_operand 1 "const_int_operand" "n"))
-(const_int 0)))]
-  "TARGET_H8300"
-  "btst%Z1,%Y0"
-  [(set_attr "length" "2")
-   (set_attr "cc" "set_zn")])
-
 (define_insn_and_split "*tst_extzv_1_n"
   [(set (cc0)
(compare (zero_extract:SI (match_operand:QI 0 "general_operand_src" 
"r,U,mn>")
@@ -790,11 +779,11 @@
 
 (define_insn ""
   [(set (cc0)
-   (compare (zero_extract:SI (match_operand:SI 0 "register_operand" "r")
- (const_int 1)
- (match_operand 1 "const_int_operand" "n"))
+   (compare (zero_extract:HSI (match_operand:HSI 0 "register_operand" "r")
+  (const_int 1)
+  (match_operand 1 "const_int_operand" "n"))
 (const_int 0)))]
-  "(TARGET_H8300H || TARGET_H8300S)
+  "(TARGET_H8300 || TARGET_H8300H || TARGET_H8300S)
 

Re: C++ PATCH for c++/57891, narrowing conversions in non-type template arguments

2018-07-03 Thread Jason Merrill
On Tue, Jul 3, 2018 at 2:58 PM, Marek Polacek  wrote:
> On Tue, Jul 03, 2018 at 12:40:51PM -0400, Jason Merrill wrote:
>> On Fri, Jun 29, 2018 at 3:58 PM, Marek Polacek  wrote:
>> > On Wed, Jun 27, 2018 at 07:35:15PM -0400, Jason Merrill wrote:
>> >> On Wed, Jun 27, 2018 at 12:53 PM, Marek Polacek  
>> >> wrote:
>> >> > This PR complains about us accepting invalid code like
>> >> >
>> >> >   template struct A {};
>> >> >   A<-1> a;
>> >> >
>> >> > Where we should detect the narrowing: [temp.arg.nontype] says
>> >> > "A template-argument for a non-type template-parameter shall be a 
>> >> > converted
>> >> > constant expression ([expr.const]) of the type of the 
>> >> > template-parameter."
>> >> > and a converted constant expression can contain only
>> >> > - integral conversions other than narrowing conversions,
>> >> > - [...]."
>> >> > It spurred e.g.
>> >> > 
>> >> > and has >=3 dups so it has some visibility.
>> >> >
>> >> > I think build_converted_constant_expr needs to set check_narrowing.
>> >> > check_narrowing also always mentions that it's in { } but that is no 
>> >> > longer
>> >> > true; in the future it will also apply to <=>.  We'd probably have to 
>> >> > add a new
>> >> > flag to struct conversion if we wanted to distinguish between these.
>> >> >
>> >> > This does not yet fix detecting narrowing in function templates (78244).
>> >> >
>> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
>> >> >
>> >> > 2018-06-27  Marek Polacek  
>> >> >
>> >> > PR c++/57891
>> >> > * call.c (build_converted_constant_expr): Set check_narrowing.
>> >> > * decl.c (compute_array_index_type): Add warning sentinel.  Use
>> >> > input_location.
>> >> > * pt.c (convert_nontype_argument): Return NULL_TREE if any 
>> >> > errors
>> >> > were reported.
>> >> > * typeck2.c (check_narrowing): Don't mention { } in diagnostic.
>> >> >
>> >> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
>> >> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
>> >> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
>> >> > * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
>> >> > * g++.dg/init/new43.C: Adjust dg-error.
>> >> > * g++.dg/other/fold1.C: Likewise.
>> >> > * g++.dg/parse/array-size2.C: Likewise.
>> >> > * g++.dg/other/vrp1.C: Add dg-error.
>> >> > * g++.dg/template/char1.C: Likewise.
>> >> > * g++.dg/ext/builtin12.C: Likewise.
>> >> > * g++.dg/template/dependent-name3.C: Adjust dg-error.
>> >> >
>> >> > diff --git gcc/cp/call.c gcc/cp/call.c
>> >> > index 209c1fd2f0e..956c7b149dc 100644
>> >> > --- gcc/cp/call.c
>> >> > +++ gcc/cp/call.c
>> >> > @@ -4152,7 +4152,10 @@ build_converted_constant_expr (tree type, tree 
>> >> > expr, tsubst_flags_t complain)
>> >> >  }
>> >> >
>> >> >if (conv)
>> >> > -expr = convert_like (conv, expr, complain);
>> >> > +{
>> >> > +  conv->check_narrowing = !processing_template_decl;
>> >>
>> >> Why !processing_template_decl?  This needs a comment.
>> >
>> > Otherwise we'd warn for e.g.
>> >
>> > template<int N> struct S { char a[N]; };
>> > S<1> s;
>> >
>> > where compute_array_index_type will try to convert the size of the array 
>> > (which
>> > is a template_parm_index of type int when parsing the template) to 
>> > size_type.
>> > So I guess I can say that we need to wait for instantiation?
>>
>> We certainly shouldn't give a narrowing diagnostic about a
>> value-dependent expression.  It probably makes sense to check that at
>> the top of check_narrowing, with all the other early exit conditions.
>> But if we do know the constant value in the template, it's good to
>> complain then rather than wait for instantiation.
>
> Makes sense; how about this then?  (Regtest/bootstrap running.)
>
> 2018-07-03  Marek Polacek  
>
> PR c++/57891
> * call.c (build_converted_constant_expr): Set check_narrowing.
> * decl.c (compute_array_index_type): Add warning sentinel.  Use
> input_location.
> * pt.c (convert_nontype_argument): Return NULL_TREE if any errors
> were reported.
> * typeck2.c (check_narrowing): Don't warn for instantiation-dependent
> expressions or non-constants in a template.  Don't mention { } in
> diagnostic.
>
> * g++.dg/cpp0x/Wnarrowing6.C: New test.
> * g++.dg/cpp0x/Wnarrowing7.C: New test.
> * g++.dg/cpp0x/Wnarrowing8.C: New test.
> * g++.dg/cpp0x/Wnarrowing9.C: New test.
> * g++.dg/cpp0x/Wnarrowing10.C: New test.
> * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
> * g++.dg/init/new43.C: Adjust dg-error.
> * g++.dg/other/fold1.C: Likewise.
> * g++.dg/parse/array-size2.C: Likewise.
> * g++.dg/other/vrp1.C: Add dg-error.
> * g++.dg/template/char1.C: Likewise.
> * g++.dg/ext/bu

Re: [14/n] PR85694: Rework overwidening detection

2018-07-03 Thread Christophe Lyon
On Tue, 3 Jul 2018 at 12:02, Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Fri, Jun 29, 2018 at 1:36 PM Richard Sandiford
> >  wrote:
> >>
> >> Richard Sandiford  writes:
> >> > This patch is the main part of PR85694.  The aim is to recognise at 
> >> > least:
> >> >
> >> >   signed char *a, *b, *c;
> >> >   ...
> >> >   for (int i = 0; i < 2048; i++)
> >> > c[i] = (a[i] + b[i]) >> 1;
> >> >
> >> > as an over-widening pattern, since the addition and shift can be done
> >> > on shorts rather than ints.  However, it ended up being a lot more
> >> > general than that.
> >> >
> >> > The current over-widening pattern detection is limited to a few simple
> >> > cases: logical ops with immediate second operands, and shifts by a
> >> > constant.  These cases are enough for common pixel-format conversion
> >> > and can be detected in a peephole way.
> >> >
> >> > The loop above requires two generalisations of the current code: support
> >> > for addition as well as logical ops, and support for non-constant second
> >> > operands.  These are harder to detect in the same peephole way, so the
> >> > patch tries to take a more global approach.
> >> >
> >> > The idea is to get information about the minimum operation width
> >> > in two ways:
> >> >
> >> > (1) by using the range information attached to the SSA_NAMEs
> >> > (effectively a forward walk, since the range info is
> >> > context-independent).
> >> >
> >> > (2) by back-propagating the number of output bits required by
> >> > users of the result.
> >> >
> >> > As explained in the comments, there's a balance to be struck between
> >> > narrowing an individual operation and fitting in with the surrounding
> >> > code.  The approach is pretty conservative: if we could narrow an
> >> > operation to N bits without changing its semantics, it's OK to do that 
> >> > if:
> >> >
> >> > - no operations later in the chain require more than N bits; or
> >> >
> >> > - all internally-defined inputs are extended from N bits or fewer,
> >> >   and at least one of them is single-use.
> >> >
> >> > See the comments for the rationale.
> >> >
> >> > I didn't bother adding STMT_VINFO_* wrappers for the new fields
> >> > since the code seemed more readable without.
> >> >
> >> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >>
> >> Here's a version rebased on top of current trunk.  Changes from last time:
> >>
> >> - reintroduce dump_generic_expr_loc, with the obvious change to the
> >>   prototype
> >>
> >> - fix a typo in a comment
> >>
> >> - use vect_element_precision from the new version of 12/n.
> >>
> >> Tested as before.  OK to install?
> >
> > OK.
>
> Thanks.  For the record, here's what I installed (updated on top of
> Dave's recent patch, and with an obvious fix to vect-widen-mult-u8-u32.c).
>
> Richard
>
Hi,

It seems the new bb-slp-over-widen tests lack a -fdump option:
gcc.dg/vect/bb-slp-over-widen-2.c -flto -ffat-lto-objects : dump file
does not exist
UNRESOLVED: gcc.dg/vect/bb-slp-over-widen-2.c -flto -ffat-lto-objects
scan-tree-dump-times vect "basic block vectorized" 2

Christophe
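
As an aside on the quoted loop, the claim that the addition and shift can
be done on shorts rather than ints is easy to check exhaustively for
signed char inputs.  The sketch below is independent of the vectorizer;
narrow_add_shift_matches is a hypothetical name, and int16_t stands in
for "short".

```cpp
#include <cassert>
#include <cstdint>

// For every pair of signed char values, compare (a + b) >> 1 computed
// in int with the same computation where the sum is first truncated to
// 16 bits.  The sum lies in [-256, 254], so 16 bits lose nothing.
bool narrow_add_shift_matches()
{
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      {
        signed char wide = static_cast<signed char>((a + b) >> 1);
        std::int16_t sum16 = static_cast<std::int16_t>(a + b);
        signed char narrow = static_cast<signed char>(sum16 >> 1);
        if (sum16 != a + b || wide != narrow)
          return false;
      }
  return true;
}

void demo_narrow_add_shift()
{
  assert(narrow_add_shift_matches());
}
```

Note that C++ promotes sum16 back to int for the shift, so this checks
the value range argument rather than the generated code.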

>
> 2018-07-03  Richard Sandiford  
>
> gcc/
> * poly-int.h (print_hex): New function.
> * dumpfile.h (dump_dec, dump_hex): Declare.
> * dumpfile.c (dump_dec, dump_hex): New poly_wide_int functions.
> * tree-vectorizer.h (_stmt_vec_info): Add min_output_precision,
> min_input_precision, operation_precision and operation_sign.
> * tree-vect-patterns.c (vect_get_range_info): New function.
> (vect_same_loop_or_bb_p, vect_single_imm_use)
> (vect_operation_fits_smaller_type): Delete.
> (vect_look_through_possible_promotion): Add an optional
> single_use_p parameter.
> (vect_recog_over_widening_pattern): Rewrite to use new
> stmt_vec_info information.  Handle one operation at a time.
> (vect_recog_cast_forwprop_pattern, vect_narrowable_type_p)
> (vect_truncatable_operation_p, vect_set_operation_type)
> (vect_set_min_input_precision): New functions.
> (vect_determine_min_output_precision_1): Likewise.
> (vect_determine_min_output_precision): Likewise.
> (vect_determine_precisions_from_range): Likewise.
> (vect_determine_precisions_from_users): Likewise.
> (vect_determine_stmt_precisions, vect_determine_precisions): Likewise.
> (vect_vect_recog_func_ptrs): Put over_widening first.
> Add cast_forwprop.
> (vect_pattern_recog): Call vect_determine_precisions.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-widen-mult-u8-u32.c: Check specifically for a
> widen_mult pattern.
> * gcc.dg/vect/vect-over-widen-1.c: Update the scan tests for new
> over-widening messages.
> * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
> * gcc.dg/vect/vect-over-widen-2.c: Likewise.
> * gcc.dg/vect/vect-over-widen-2-big-array.c: Likewis

PR libstdc++/86272 Fix undefined __glibcxx_check_insert_range2

2018-07-03 Thread François Dumont

Hi

    I plan to commit the attached patch to fix the undefined macro in
__gnu_debug::string.


    Is the way I used some tests to now also validate
__gnu_debug::string under check-debug OK?


François

diff --git a/libstdc++-v3/include/debug/string b/libstdc++-v3/include/debug/string
index 4aadf4c..963b84f 100644
--- a/libstdc++-v3/include/debug/string
+++ b/libstdc++-v3/include/debug/string
@@ -126,7 +126,7 @@ template,
 : _Base(__str, __pos, __n, __a) { }
 
 basic_string(const _CharT* __s, size_type __n,
-		   const _Allocator& __a = _Allocator())
+		 const _Allocator& __a = _Allocator())
 : _Base(__gnu_debug::__check_string(__s, __n), __n, __a) { }
 
 basic_string(const _CharT* __s, const _Allocator& __a = _Allocator())
@@ -568,7 +568,7 @@ template,
   insert(const_iterator __p, _InputIterator __first, _InputIterator __last)
   {
 	typename __gnu_debug::_Distance_traits<_InputIterator>::__type __dist;
-	__glibcxx_check_insert_range2(__p, __first, __last, __dist);
+	__glibcxx_check_insert_range(__p, __first, __last, __dist);
 
 	typename _Base::iterator __res;
 	if (__dist.second >= __dp_sign)
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/1.cc b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/1.cc
index 391528a..7ebbf60 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/1.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/1.cc
@@ -20,25 +20,32 @@
 // 21.3.1 basic_string constructors.
 
 #include 
-#include <string>
 #include 
 #include 
 
+#ifdef _GLIBCXX_DEBUG
+# include <debug/string>
+using namespace __gnu_debug;
+#else
+# include <string>
+using namespace std;
+#endif
+
 void test01(void)
 {
-  typedef std::string::size_type csize_type;
-  typedef std::string::iterator citerator;
-  csize_type npos = std::string::npos;
+  typedef string::size_type csize_type;
+  typedef string::iterator citerator;
+  csize_type npos = string::npos;
   csize_type csz01;
 
   const char str_lit01[] = "rodeo beach, marin";
-  const std::string str01(str_lit01);
-  const std::string str02("baker beach, san francisco");
+  const string str01(str_lit01);
+  const string str02("baker beach, san francisco");
 
   // basic_string(const string&, size_type pos = 0, siz_type n = npos, alloc)
   csz01 = str01.size();
   try {
-std::string str03(str01, csz01 + 1);
+string str03(str01, csz01 + 1);
 VERIFY( false );
   }		 
   catch(std::out_of_range& fail) {
@@ -49,7 +56,7 @@ void test01(void)
   }
 
   try {
-std::string str03(str01, csz01);
+string str03(str01, csz01);
 VERIFY( str03.size() == 0 );
 VERIFY( str03.size() <= str03.capacity() );
   }		 
@@ -62,7 +69,7 @@ void test01(void)
   // NB: As strlen(str_lit01) != csz01, this test is undefined. It
   // should not crash, but what gets constructed is a bit arbitrary.
   try {
-std::string str03(str_lit01, csz01 + 1);
+string str03(str_lit01, csz01 + 1);
 VERIFY( true );
   }		 
   catch(std::length_error& fail) {
@@ -76,7 +83,7 @@ void test01(void)
   // should not crash, but what gets constructed is a bit arbitrary.
   // The "maverick's" of all string objects.
   try {
-std::string str04(str_lit01, npos); 
+string str04(str_lit01, npos);
 VERIFY( true );
   }		 
   catch(std::length_error& fail) {
@@ -88,7 +95,7 @@ void test01(void)
 
   // Build a maxsize - 1 lengthed string consisting of all A's
   try {
-std::string str03(csz01 - 1, 'A');
+string str03(csz01 - 1, 'A');
 VERIFY( str03.size() == csz01 - 1 );
 VERIFY( str03.size() <= str03.capacity() );
   }		 
@@ -102,14 +109,14 @@ void test01(void)
   }
 
   // basic_string(const char* s, const allocator& a = allocator())
-  std::string str04(str_lit01);
+  string str04(str_lit01);
   VERIFY( str01 == str04 );
 
 
   // basic_string(size_type n, char c, const allocator& a = allocator())
   csz01 = str01.max_size();
   try {
-std::string str03(csz01 + 1, 'z');
+string str03(csz01 + 1, 'z');
 VERIFY( false );
   }		 
   catch(std::length_error& fail) {
@@ -120,7 +127,7 @@ void test01(void)
   }
 
   try {
-std::string str04(npos, 'b'); // the "maverick's" of all string objects.
+string str04(npos, 'b'); // the "maverick's" of all string objects.
 VERIFY( false );
   }		 
   catch(std::length_error& fail) {
@@ -131,7 +138,7 @@ void test01(void)
   }
 
   try {
-std::string str03(csz01 - 1, 'z');
+string str03(csz01 - 1, 'z');
 VERIFY( str03.size() != 0 );
 VERIFY( str03.size() <= str03.capacity() );
   }		 
@@ -144,10 +151,9 @@ void test01(void)
 VERIFY( false );
   }
 
-
   // template
   //   basic_string(_InputIter begin, _InputIter end, const allocator& a)
-  std::string str06(str01.begin(), str01.end());
+  string str06(str01.begin(), str01.end());
   VERIFY( str06 == str01 );
 }
 
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/init-list.cc b/libstdc++-v3/testsuite/21_strings/basic_string/init-list.cc
index 2cc9cff..aa77548 100644
--- a/libst

Re: PR libstdc++/86272 Fix undefined __glibcxx_check_insert_range2

2018-07-03 Thread Jonathan Wakely

On 03/07/18 22:11 +0200, François Dumont wrote:

Hi

    I plan to commit the attached patch to fix the undefined macro in
__gnu_debug::string.


    Is the way I used some tests to now also validate
__gnu_debug::string under check-debug OK?


Yes, good idea. We don't want to do that for *every* basic_string
test, but doing it for a few of them means we cover the
__gnu_debug::string.

OK for trunk and the affected branches. Thanks.
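
For reference, the test-side pattern approved here boils down to the
following standalone sketch.  The conditional include mirrors the patch;
count_spaces is a hypothetical function added only to have something to
call, and is not part of the testsuite.

```cpp
#include <cassert>

// Under -D_GLIBCXX_DEBUG the unqualified name `string` resolves to
// __gnu_debug::string, otherwise to std::string, so the same test body
// covers both types.
#ifdef _GLIBCXX_DEBUG
# include <debug/string>
using namespace __gnu_debug;
#else
# include <string>
using namespace std;
#endif

// Hypothetical example function written against the unqualified type.
string::size_type count_spaces(const string &s)
{
  string::size_type n = 0;
  for (string::size_type i = 0; i < s.size(); ++i)
    if (s[i] == ' ')
      ++n;
  return n;
}

void demo_count_spaces()
{
  const string str01("rodeo beach, marin");
  assert(count_spaces(str01) == 2);
}
```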




Re: C++ PATCH for c++/57891, narrowing conversions in non-type template arguments

2018-07-03 Thread Jason Merrill
On Tue, Jul 3, 2018 at 3:41 PM, Jason Merrill  wrote:
> On Tue, Jul 3, 2018 at 2:58 PM, Marek Polacek  wrote:
>> On Tue, Jul 03, 2018 at 12:40:51PM -0400, Jason Merrill wrote:
>>> On Fri, Jun 29, 2018 at 3:58 PM, Marek Polacek  wrote:
>>> > On Wed, Jun 27, 2018 at 07:35:15PM -0400, Jason Merrill wrote:
>>> >> On Wed, Jun 27, 2018 at 12:53 PM, Marek Polacek  
>>> >> wrote:
>>> >> > This PR complains about us accepting invalid code like
>>> >> >
>>> >> >   template struct A {};
>>> >> >   A<-1> a;
>>> >> >
>>> >> > Where we should detect the narrowing: [temp.arg.nontype] says
>>> >> > "A template-argument for a non-type template-parameter shall be a 
>>> >> > converted
>>> >> > constant expression ([expr.const]) of the type of the 
>>> >> > template-parameter."
>>> >> > and a converted constant expression can contain only
>>> >> > - integral conversions other than narrowing conversions,
>>> >> > - [...]."
>>> >> > It spurred e.g.
>>> >> > 
>>> >> > and has >=3 dups so it has some visibility.
>>> >> >
>>> >> > I think build_converted_constant_expr needs to set check_narrowing.
>>> >> > check_narrowing also always mentions that it's in { } but that is no 
>>> >> > longer
>>> >> > true; in the future it will also apply to <=>.  We'd probably have to 
>>> >> > add a new
>>> >> > flag to struct conversion if wanted to distinguish between these.
>>> >> >
>>> >> > This does not yet fix detecting narrowing in function templates 
>>> >> > (78244).
>>> >> >
>>> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
>>> >> >
>>> >> > 2018-06-27  Marek Polacek  
>>> >> >
>>> >> > PR c++/57891
>>> >> > * call.c (build_converted_constant_expr): Set check_narrowing.
>>> >> > * decl.c (compute_array_index_type): Add warning sentinel.  Use
>>> >> > input_location.
>>> >> > * pt.c (convert_nontype_argument): Return NULL_TREE if any 
>>> >> > errors
>>> >> > were reported.
>>> >> > * typeck2.c (check_narrowing): Don't mention { } in diagnostic.
>>> >> >
>>> >> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
>>> >> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
>>> >> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
>>> >> > * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
>>> >> > * g++.dg/init/new43.C: Adjust dg-error.
>>> >> > * g++.dg/other/fold1.C: Likewise.
>>> >> > * g++.dg/parse/array-size2.C: Likewise.
>>> >> > * g++.dg/other/vrp1.C: Add dg-error.
>>> >> > * g++.dg/template/char1.C: Likewise.
>>> >> > * g++.dg/ext/builtin12.C: Likewise.
>>> >> > * g++.dg/template/dependent-name3.C: Adjust dg-error.
>>> >> >
>>> >> > diff --git gcc/cp/call.c gcc/cp/call.c
>>> >> > index 209c1fd2f0e..956c7b149dc 100644
>>> >> > --- gcc/cp/call.c
>>> >> > +++ gcc/cp/call.c
>>> >> > @@ -4152,7 +4152,10 @@ build_converted_constant_expr (tree type, tree 
>>> >> > expr, tsubst_flags_t complain)
>>> >> >  }
>>> >> >
>>> >> >if (conv)
>>> >> > -expr = convert_like (conv, expr, complain);
>>> >> > +{
>>> >> > +  conv->check_narrowing = !processing_template_decl;
>>> >>
>>> >> Why !processing_template_decl?  This needs a comment.
>>> >
>>> > Otherwise we'd warn for e.g.
>>> >
>>> > template<int N> struct S { char a[N]; };
>>> > S<1> s;
>>> >
>>> > where compute_array_index_type will try to convert the size of the array 
>>> > (which
>>> > is a template_parm_index of type int when parsing the template) to 
>>> > size_type.
>>> > So I guess I can say that we need to wait for instantiation?
>>>
>>> We certainly shouldn't give a narrowing diagnostic about a
>>> value-dependent expression.  It probably makes sense to check that at
>>> the top of check_narrowing, with all the other early exit conditions.
>>> But if we do know the constant value in the template, it's good to
>>> complain then rather than wait for instantiation.
>>
>> Makes sense; how about this then?  (Regtest/bootstrap running.)
>>
>> 2018-07-03  Marek Polacek  
>>
>> PR c++/57891
>> * call.c (build_converted_constant_expr): Set check_narrowing.
>> * decl.c (compute_array_index_type): Add warning sentinel.  Use
>> input_location.
>> * pt.c (convert_nontype_argument): Return NULL_TREE if any errors
>> were reported.
>> * typeck2.c (check_narrowing): Don't warn for instantiation-dependent
>> expressions or non-constants in a template.  Don't mention { } in
>> diagnostic.
>>
>> * g++.dg/cpp0x/Wnarrowing6.C: New test.
>> * g++.dg/cpp0x/Wnarrowing7.C: New test.
>> * g++.dg/cpp0x/Wnarrowing8.C: New test.
>> * g++.dg/cpp0x/Wnarrowing9.C: New test.
>> * g++.dg/cpp0x/Wnarrowing10.C: New test.
>> * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
>> * g++.dg/init/new43.C: Adjust dg-error.
>> * g++.dg/other/fold1.C: Lik
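
A minimal sketch of the rule under discussion, not taken from the patch or its tests. The archive has dropped the template parameter list from the PR's example, so the `unsigned int N` parameter below is an assumption, as are the `get` helper and the `narrowing_examples` function. The ill-formed use is left commented out so the snippet still compiles.

```cpp
#include <cassert>

// Assumption: the PR's template takes an unsigned non-type parameter; the
// archived text lost the parameter list. get() is an illustrative helper.
template<unsigned int N>
struct A
{
  static constexpr unsigned int get() noexcept { return N; }
};

// Fine: 1 converts to unsigned int without narrowing.
A<1> ok;

// Ill-formed per [temp.arg.nontype]: converting -1 to unsigned int is a
// narrowing conversion, which a converted constant expression may not
// contain.
// A<-1> bad;

// Fine: the argument already has type unsigned int, so nothing narrows.
A<static_cast<unsigned int>(-1)> wrapped;

inline void narrowing_examples()
{
  assert(decltype(ok)::get() == 1u);
  assert(decltype(wrapped)::get() == static_cast<unsigned int>(-1));
}
```

Uncommenting the `A<-1>` line is the quickest way to see whether a given compiler diagnoses the narrowing.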

Re: [C++ Patch] More location fixes to grokdeclarator

2018-07-03 Thread Jason Merrill
OK.

On Tue, Jul 3, 2018 at 2:37 PM, Paolo Carlini  wrote:
> Hi,
>
> On 03/07/2018 18:36, Jason Merrill wrote:
>>
>>
>>> if ((type_quals & TYPE_QUAL_VOLATILE)
>>> -  && (loc == UNKNOWN_LOCATION || locations[ds_volatile] < loc))
>>> +  && (loc == UNKNOWN_LOCATION
>>> +  || linemap_location_before_p (line_table,
>>> +locations[ds_volatile], loc)))
>>>   loc = locations[ds_volatile];
>>> if ((type_quals & TYPE_QUAL_RESTRICT)
>>> -  && (loc == UNKNOWN_LOCATION || locations[ds_restrict] < loc))
>>> +  && (loc == UNKNOWN_LOCATION
>>> +  || linemap_location_before_p (line_table,
>>> +locations[ds_restrict], loc)))
>>>   loc = locations[ds_restrict];
>>
>> Why not use min_location here?
>
> Indeed. Thus I successfully tested the below.
>
> Thanks,
> Paolo.
>
> //


Re: C++ PATCH for c++/57891, narrowing conversions in non-type template arguments

2018-07-03 Thread Marek Polacek
On Tue, Jul 03, 2018 at 03:41:43PM -0400, Jason Merrill wrote:
> On Tue, Jul 3, 2018 at 2:58 PM, Marek Polacek  wrote:
> > On Tue, Jul 03, 2018 at 12:40:51PM -0400, Jason Merrill wrote:
> >> On Fri, Jun 29, 2018 at 3:58 PM, Marek Polacek  wrote:
> >> > On Wed, Jun 27, 2018 at 07:35:15PM -0400, Jason Merrill wrote:
> >> >> On Wed, Jun 27, 2018 at 12:53 PM, Marek Polacek  
> >> >> wrote:
> >> >> > This PR complains about us accepting invalid code like
> >> >> >
> >> >> >   template struct A {};
> >> >> >   A<-1> a;
> >> >> >
> >> >> > Where we should detect the narrowing: [temp.arg.nontype] says
> >> >> > "A template-argument for a non-type template-parameter shall be a 
> >> >> > converted
> >> >> > constant expression ([expr.const]) of the type of the 
> >> >> > template-parameter."
> >> >> > and a converted constant expression can contain only
> >> >> > - integral conversions other than narrowing conversions,
> >> >> > - [...]."
> >> >> > It spurred e.g.
> >> >> > 
> >> >> > and has >=3 dups so it has some visibility.
> >> >> >
> >> >> > I think build_converted_constant_expr needs to set check_narrowing.
> >> >> > check_narrowing also always mentions that it's in { } but that is no 
> >> >> > longer
> >> >> > true; in the future it will also apply to <=>.  We'd probably have to 
> >> >> > add a new
> >> >> > flag to struct conversion if wanted to distinguish between these.
> >> >> >
> >> >> > This does not yet fix detecting narrowing in function templates 
> >> >> > (78244).
> >> >> >
> >> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> >> >> >
> >> >> > 2018-06-27  Marek Polacek  
> >> >> >
> >> >> > PR c++/57891
> >> >> > * call.c (build_converted_constant_expr): Set check_narrowing.
> >> >> > * decl.c (compute_array_index_type): Add warning sentinel.  
> >> >> > Use
> >> >> > input_location.
> >> >> > * pt.c (convert_nontype_argument): Return NULL_TREE if any 
> >> >> > errors
> >> >> > were reported.
> >> >> > * typeck2.c (check_narrowing): Don't mention { } in 
> >> >> > diagnostic.
> >> >> >
> >> >> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
> >> >> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
> >> >> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
> >> >> > * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
> >> >> > * g++.dg/init/new43.C: Adjust dg-error.
> >> >> > * g++.dg/other/fold1.C: Likewise.
> >> >> > * g++.dg/parse/array-size2.C: Likewise.
> >> >> > * g++.dg/other/vrp1.C: Add dg-error.
> >> >> > * g++.dg/template/char1.C: Likewise.
> >> >> > * g++.dg/ext/builtin12.C: Likewise.
> >> >> > * g++.dg/template/dependent-name3.C: Adjust dg-error.
> >> >> >
> >> >> > diff --git gcc/cp/call.c gcc/cp/call.c
> >> >> > index 209c1fd2f0e..956c7b149dc 100644
> >> >> > --- gcc/cp/call.c
> >> >> > +++ gcc/cp/call.c
> >> >> > @@ -4152,7 +4152,10 @@ build_converted_constant_expr (tree type, tree 
> >> >> > expr, tsubst_flags_t complain)
> >> >> >  }
> >> >> >
> >> >> >if (conv)
> >> >> > -expr = convert_like (conv, expr, complain);
> >> >> > +{
> >> >> > +  conv->check_narrowing = !processing_template_decl;
> >> >>
> >> >> Why !processing_template_decl?  This needs a comment.
> >> >
> >> > Otherwise we'd warn for e.g.
> >> >
> >> > template<int N> struct S { char a[N]; };
> >> > S<1> s;
> >> >
> >> > where compute_array_index_type will try to convert the size of the array 
> >> > (which
> >> > is a template_parm_index of type int when parsing the template) to 
> >> > size_type.
> >> > So I guess I can say that we need to wait for instantiation?
> >>
> >> We certainly shouldn't give a narrowing diagnostic about a
> >> value-dependent expression.  It probably makes sense to check that at
> >> the top of check_narrowing, with all the other early exit conditions.
> >> But if we do know the constant value in the template, it's good to
> >> complain then rather than wait for instantiation.
> >
> > Makes sense; how about this then?  (Regtest/bootstrap running.)
> >
> > 2018-07-03  Marek Polacek  
> >
> > PR c++/57891
> > * call.c (build_converted_constant_expr): Set check_narrowing.
> > * decl.c (compute_array_index_type): Add warning sentinel.  Use
> > input_location.
> > * pt.c (convert_nontype_argument): Return NULL_TREE if any errors
> > were reported.
> > * typeck2.c (check_narrowing): Don't warn for 
> > instantiation-dependent
> > expressions or non-constants in a template.  Don't mention { } in
> > diagnostic.
> >
> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
> > * g++.dg/cpp0x/Wnarrowing9.C: New test.
> > * g++.dg/cpp0x/Wnarrowing10.C: New test.
> > 

Re: [14/n] PR85694: Rework overwidening detection

2018-07-03 Thread Rainer Orth
Hi Christophe,

> It seems the new bb-slp-over-widen tests lack a -fdump option:
> gcc.dg/vect/bb-slp-over-widen-2.c -flto -ffat-lto-objects : dump file
> does not exist
> UNRESOLVED: gcc.dg/vect/bb-slp-over-widen-2.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "basic block vectorized" 2

indeed, but that's not enough: adding

/* { dg-additional-options "-fdump-tree-vect-details" } */

to both affected tests (gcc.dg/vect/bb-slp-over-widen-[12].c) yields

FAIL: gcc.dg/vect/bb-slp-over-widen-1.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "basic block vectorized" 2
FAIL: gcc.dg/vect/bb-slp-over-widen-2.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "basic block vectorized" 2

on both 32 and 64-bit x86, and the dump contains:

/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c:60:3:
 note:   not vectorized: control flow in loop.
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c:60:3:
 note:  not vectorized: loop contains function calls or data references that 
cannot be analyzed
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c:59:3:
 note:   not vectorized: control flow in loop.
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c:59:3:
 note:  not vectorized: loop contains function calls or data references that 
cannot be analyzed
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c:55:1:
 note: vectorized 0 loops in function.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: C++ PATCH for c++/57891, narrowing conversions in non-type template arguments

2018-07-03 Thread Jason Merrill
On Tue, Jul 3, 2018 at 4:35 PM, Marek Polacek  wrote:
> On Tue, Jul 03, 2018 at 03:41:43PM -0400, Jason Merrill wrote:
>> On Tue, Jul 3, 2018 at 2:58 PM, Marek Polacek  wrote:
>> > On Tue, Jul 03, 2018 at 12:40:51PM -0400, Jason Merrill wrote:
>> >> On Fri, Jun 29, 2018 at 3:58 PM, Marek Polacek  wrote:
>> >> > On Wed, Jun 27, 2018 at 07:35:15PM -0400, Jason Merrill wrote:
>> >> >> On Wed, Jun 27, 2018 at 12:53 PM, Marek Polacek  
>> >> >> wrote:
>> >> >> > This PR complains about us accepting invalid code like
>> >> >> >
>> >> >> >   template struct A {};
>> >> >> >   A<-1> a;
>> >> >> >
>> >> >> > Where we should detect the narrowing: [temp.arg.nontype] says
>> >> >> > "A template-argument for a non-type template-parameter shall be a 
>> >> >> > converted
>> >> >> > constant expression ([expr.const]) of the type of the 
>> >> >> > template-parameter."
>> >> >> > and a converted constant expression can contain only
>> >> >> > - integral conversions other than narrowing conversions,
>> >> >> > - [...]."
>> >> >> > It spurred e.g.
>> >> >> > 
>> >> >> > and has >=3 dups so it has some visibility.
>> >> >> >
>> >> >> > I think build_converted_constant_expr needs to set check_narrowing.
>> >> >> > check_narrowing also always mentions that it's in { } but that is no 
>> >> >> > longer
>> >> >> > true; in the future it will also apply to <=>.  We'd probably have 
>> >> >> > to add a new
>> >> >> > flag to struct conversion if wanted to distinguish between these.
>> >> >> >
>> >> >> > This does not yet fix detecting narrowing in function templates 
>> >> >> > (78244).
>> >> >> >
>> >> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
>> >> >> >
>> >> >> > 2018-06-27  Marek Polacek  
>> >> >> >
>> >> >> > PR c++/57891
>> >> >> > * call.c (build_converted_constant_expr): Set 
>> >> >> > check_narrowing.
>> >> >> > * decl.c (compute_array_index_type): Add warning sentinel.  
>> >> >> > Use
>> >> >> > input_location.
>> >> >> > * pt.c (convert_nontype_argument): Return NULL_TREE if any 
>> >> >> > errors
>> >> >> > were reported.
>> >> >> > * typeck2.c (check_narrowing): Don't mention { } in 
>> >> >> > diagnostic.
>> >> >> >
>> >> >> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
>> >> >> > * g++.dg/cpp0x/Wnarrowing7.C: New test.
>> >> >> > * g++.dg/cpp0x/Wnarrowing8.C: New test.
>> >> >> > * g++.dg/cpp0x/constexpr-data2.C: Add dg-error.
>> >> >> > * g++.dg/init/new43.C: Adjust dg-error.
>> >> >> > * g++.dg/other/fold1.C: Likewise.
>> >> >> > * g++.dg/parse/array-size2.C: Likewise.
>> >> >> > * g++.dg/other/vrp1.C: Add dg-error.
>> >> >> > * g++.dg/template/char1.C: Likewise.
>> >> >> > * g++.dg/ext/builtin12.C: Likewise.
>> >> >> > * g++.dg/template/dependent-name3.C: Adjust dg-error.
>> >> >> >
>> >> >> > diff --git gcc/cp/call.c gcc/cp/call.c
>> >> >> > index 209c1fd2f0e..956c7b149dc 100644
>> >> >> > --- gcc/cp/call.c
>> >> >> > +++ gcc/cp/call.c
>> >> >> > @@ -4152,7 +4152,10 @@ build_converted_constant_expr (tree type, 
>> >> >> > tree expr, tsubst_flags_t complain)
>> >> >> >  }
>> >> >> >
>> >> >> >if (conv)
>> >> >> > -expr = convert_like (conv, expr, complain);
>> >> >> > +{
>> >> >> > +  conv->check_narrowing = !processing_template_decl;
>> >> >>
>> >> >> Why !processing_template_decl?  This needs a comment.
>> >> >
>> >> > Otherwise we'd warn for e.g.
>> >> >
>> >> > template<int N> struct S { char a[N]; };
>> >> > S<1> s;
>> >> >
>> >> > where compute_array_index_type will try to convert the size of the 
>> >> > array (which
>> >> > is a template_parm_index of type int when parsing the template) to 
>> >> > size_type.
>> >> > So I guess I can say that we need to wait for instantiation?
>> >>
>> >> We certainly shouldn't give a narrowing diagnostic about a
>> >> value-dependent expression.  It probably makes sense to check that at
>> >> the top of check_narrowing, with all the other early exit conditions.
>> >> But if we do know the constant value in the template, it's good to
>> >> complain then rather than wait for instantiation.
>> >
>> > Makes sense; how about this then?  (Regtest/bootstrap running.)
>> >
>> > 2018-07-03  Marek Polacek  
>> >
>> > PR c++/57891
>> > * call.c (build_converted_constant_expr): Set check_narrowing.
>> > * decl.c (compute_array_index_type): Add warning sentinel.  Use
>> > input_location.
>> > * pt.c (convert_nontype_argument): Return NULL_TREE if any errors
>> > were reported.
>> > * typeck2.c (check_narrowing): Don't warn for 
>> > instantiation-dependent
>> > expressions or non-constants in a template.  Don't mention { } in
>> > diagnostic.
>> >
>> > * g++.dg/cpp0x/Wnarrowing6.C: New test.
>> > * g++.dg/cpp0x/Wnarrowin

[PATCH] P0556R3 Integral power-of-2 operations, P0553R2 Bit operations

2018-07-03 Thread Jonathan Wakely

P0553R2 is not in the C++2a working draft yet, but is likely to be
approved soon. Neither proposal supports std::byte but this adds
overloads of each function for std::byte, assuming that will also get
added.

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include new header.
* include/std/bit: New header.
(__rotl, __rotr, __countl_zero, __countl_one, __countr_zero)
(__countr_one, __popcount, __ispow2, __ceil2, __floor2, __log2p1):
Define for C++14.
[!__STRICT_ANSI__] (rotl, rotr, countl_zero, countl_one, countr_zero)
(countr_one, popcount): Define for C++2a. Also overload for std::byte.
(ispow2, ceil2, floor2, log2p1): Define for C++2a.
[!__STRICT_ANSI__] (ispow2, ceil2, floor2, log2p1): Overload for
std::byte.
* testsuite/26_numerics/bit/bit.pow.two/ceil2.cc: New.
* testsuite/26_numerics/bit/bit.pow.two/floor2.cc: New.
* testsuite/26_numerics/bit/bit.pow.two/ispow2.cc: New.
* testsuite/26_numerics/bit/bit.pow.two/log2p1.cc: New.
* testsuite/26_numerics/bit/bitops.rot/rotl.cc: New.
* testsuite/26_numerics/bit/bitops.rot/rotr.cc: New.
* testsuite/26_numerics/bit/bitops.count/countl_one.cc: New.
* testsuite/26_numerics/bit/bitops.count/countl_zero.cc: New.
* testsuite/26_numerics/bit/bitops.count/countr_one.cc: New.
* testsuite/26_numerics/bit/bitops.count/countr_zero.cc: New.

Tested powerpc64le-linux, committed to trunk.
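
For readers unfamiliar with the P0556R3 names, here is a rough sketch of what the four power-of-2 operations compute. It assumes C++14 and a fixed 32-bit unsigned type. The `_sketch` names are hypothetical and this is not the code in the new header, which is templated and may well use compiler builtins.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative names only; these are not the libstdc++ implementations.

// True when exactly one bit is set.
constexpr bool ispow2_sketch(std::uint32_t x) noexcept
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Bits needed to represent x: 0 for 0, otherwise 1 + floor(log2(x)).
constexpr unsigned log2p1_sketch(std::uint32_t x) noexcept
{
  unsigned bits = 0;
  for (; x != 0; x >>= 1)
    ++bits;
  return bits;
}

// Largest power of two not greater than x, or 0 when x is 0.
constexpr std::uint32_t floor2_sketch(std::uint32_t x) noexcept
{
  return x == 0 ? 0 : std::uint32_t{1} << (log2p1_sketch(x) - 1);
}

// Smallest power of two not less than x. Precondition: x <= 2^31;
// beyond that the result is not representable and the shift is undefined.
constexpr std::uint32_t ceil2_sketch(std::uint32_t x) noexcept
{
  return x <= 1 ? 1 : std::uint32_t{1} << log2p1_sketch(x - 1);
}

inline void pow2_examples()
{
  assert(ispow2_sketch(64) && !ispow2_sketch(0) && !ispow2_sketch(12));
  assert(log2p1_sketch(0) == 0 && log2p1_sketch(8) == 4);
  assert(floor2_sketch(5) == 4 && floor2_sketch(8) == 8);
  assert(ceil2_sketch(5) == 8 && ceil2_sketch(8) == 8 && ceil2_sketch(0) == 1);
}
```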


commit 8e5c24b33c987bc72d429755ed90958f7931711b
Author: Jonathan Wakely 
Date:   Tue Jul 3 21:33:20 2018 +0100

P0556R3 Integral power-of-2 operations, P0553R2 Bit operations

P0553R2 is not in the C++2a working draft yet, but is likely to be
approved soon. Neither proposal supports std::byte but this adds
overloads of each function for std::byte, assuming that will also get
added.

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include new header.
* include/std/bit: New header.
(__rotl, __rotr, __countl_zero, __countl_one, __countr_zero)
(__countr_one, __popcount, __ispow2, __ceil2, __floor2, __log2p1):
Define for C++14.
[!__STRICT_ANSI__] (rotl, rotr, countl_zero, countl_one, 
countr_zero)
(countr_one, popcount): Define for C++2a. Also overload for 
std::byte.
(ispow2, ceil2, floor2, log2p1): Define for C++2a.
[!__STRICT_ANSI__] (ispow2, ceil2, floor2, log2p1): Overload for
std::byte.
* testsuite/26_numerics/bit/bit.pow.two/ceil2.cc: New.
* testsuite/26_numerics/bit/bit.pow.two/floor2.cc: New.
* testsuite/26_numerics/bit/bit.pow.two/ispow2.cc: New.
* testsuite/26_numerics/bit/bit.pow.two/log2p1.cc: New.
* testsuite/26_numerics/bit/bitops.rot/rotl.cc: New.
* testsuite/26_numerics/bit/bitops.rot/rotr.cc: New.
* testsuite/26_numerics/bit/bitops.count/countl_one.cc: New.
* testsuite/26_numerics/bit/bitops.count/countl_zero.cc: New.
* testsuite/26_numerics/bit/bitops.count/countr_one.cc: New.
* testsuite/26_numerics/bit/bitops.count/countr_zero.cc: New.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index f91907df325..d1453a5abce 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -30,6 +30,7 @@ std_headers = \
${std_srcdir}/any \
${std_srcdir}/array \
${std_srcdir}/atomic \
+   ${std_srcdir}/bit \
${std_srcdir}/bitset \
${std_srcdir}/charconv \
${std_srcdir}/chrono \
diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
b/libstdc++-v3/include/precompiled/stdc++.h
index 9b056bb3467..80769233eb3 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -134,6 +134,7 @@
 #endif
 
 #if __cplusplus > 201703L
+#include <bit>
 // #include 
 // #include 
 // #include 
diff --git a/libstdc++-v3/include/std/bit b/libstdc++-v3/include/std/bit
new file mode 100644
index 000..76aa0957b56
--- /dev/null
+++ b/libstdc++-v3/include/std/bit
@@ -0,0 +1,359 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more

[PATCH] Remove redundant #if conditional

2018-07-03 Thread Jonathan Wakely

The whole file is guarded by the same check already.

* include/bits/alloc_traits.h: Remove redundant preprocessor
condition.

Tested powerpc64le-linux, committed to trunk.

commit 76fd352d2bccd2a7399f6693cb5c7a8e9c65274b
Author: Jonathan Wakely 
Date:   Tue Jul 3 00:26:07 2018 +0100

Remove redundant #if conditional

The whole file is guarded by the same check already.

* include/bits/alloc_traits.h: Remove redundant preprocessor
condition.

diff --git a/libstdc++-v3/include/bits/alloc_traits.h 
b/libstdc++-v3/include/bits/alloc_traits.h
index eee9d8502e4..742fdd0447d 100644
--- a/libstdc++-v3/include/bits/alloc_traits.h
+++ b/libstdc++-v3/include/bits/alloc_traits.h
@@ -598,7 +598,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : is_copy_constructible<_Tp>
 { };
 
-#if __cplusplus >= 201103L
   // Trait to detect Allocator-like types.
   template
 struct __is_allocator : false_type { };
@@ -612,10 +611,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 using _RequireAllocator
   = typename enable_if<__is_allocator<_Alloc>::value, _Alloc>::type;
-#endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
-
-#endif
-#endif
+#endif // C++11
+#endif // _ALLOC_TRAITS_H


Re: [PATCH,rs6000] Fix implementation of vec_unpackh, vec_unpackl builtins

2018-07-03 Thread Carl Love
Segher:

On Mon, 2018-07-02 at 11:53 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jun 29, 2018 at 07:38:39AM -0700, Carl Love wrote:
> > +;; Unpack high elements of float vector to vector of doubles
> > +(define_expand "altivec_unpackh_v4sf"
> > +  [(set (match_operand:V2DF 0 "register_operand" "=v")
> > +(match_operand:V4SF 1 "register_operand" "v"))]
> > +  "TARGET_VSX"
> > +{
> > +  emit_insn (gen_doublehv4sf2 (operands[0], operands[1]));
> > +  DONE;
> > +}
> > +  [(set_attr "type" "veccomplex")])
> 
> I wondered if these actually work for all VSX registers, not just
> the VMX
> registers (i.e. "wa" or similar instead of "v").  But constraints in
> define_expand are meaningless anyway; just leave them out please.
> 
> Does it help to define these altivec_unpackh_v4sf, when all it does
> is
> expand as doublehv4sf2?  Can't you more easily put the latter in the
> tables?

Yes, my bad. It is way cleaner to just do it directly.  My first
attempt needed the define_expand but then I realized I had made things
way more complicated than needed and rewrote the patch.

> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
> > @@ -0,0 +1,257 @@
> > +/* { dg-do compile { target powerpc*-*-* } } */
> > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > +/* { dg-options "-mpower8-vector -maltivec" } */
> 
> This needs p8vector_ok then.  Is that correct?  What requires p8?
> Is VSX (p7) enough for everything here?
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c
> > @@ -0,0 +1,94 @@
> > +/* { dg-do compile { target powerpc*-*-* } } */
> > +/* { dg-require-effective-target powerpc_altivec_ok } */
> > +/* { dg-options "-mpower8-vector -mvsx" } */
> 
> Same here: required target does not match options.
> 
My bad again, I can't follow my own comments. altivec-1-runnable.c does
not need power 8.  But altivec-2-runnable.c does, per the comments in
the file.

Fixed the various issues and retested on

    powerpc64le-unknown-linux-gnu (Power 8 LE)
    powerpc64-unknown-linux-gnu (Power 8 BE)
    powerpc64le-unknown-linux-gnu (Power 9 LE)

Please let me know if the patch looks OK for GCC mainline. The patch
also needs to be backported to GCC 8.

 Carl Love
-


gcc/ChangeLog:

2018-07-03  Carl Love  

* config/rs6000/rs6000-c.c: Map ALTIVEC_BUILTIN_VEC_UNPACKH for
float argument to VSX_BUILTIN_DOUBLEH_V4SF.
Map ALTIVEC_BUILTIN_VEC_UNPACKL for float argument to
VSX_BUILTIN_DOUBLEL_V4SF.

gcc/testsuite/ChangeLog:

2018-07-03  Carl Love  
* gcc.target/powerpc/altivec-1-runnable.c: New test file.
* gcc.target/powerpc/altivec-2-runnable.c: New test file.
* gcc.target/powerpc/vsx-7.c (main2): Change expected instruction
for tests.
---
 gcc/config/rs6000/rs6000-c.c   |   4 +-
 .../gcc.target/powerpc/altivec-1-runnable.c| 257 +
 .../gcc.target/powerpc/altivec-2-runnable.c|  94 
 gcc/testsuite/gcc.target/powerpc/vsx-7.c   |   7 +-
 4 files changed, 356 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/altivec-2-runnable.c

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index f4b1bf7..f37f0b1 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -865,7 +865,7 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_UNPACKH, ALTIVEC_BUILTIN_VUPKHPX,
 RS6000_BTI_unsigned_V4SI, RS6000_BTI_pixel_V8HI, 0, 0 },
-  { ALTIVEC_BUILTIN_VEC_UNPACKH, ALTIVEC_BUILTIN_VUPKHPX,
+  { ALTIVEC_BUILTIN_VEC_UNPACKH, VSX_BUILTIN_DOUBLEH_V4SF,
 RS6000_BTI_V2DF, RS6000_BTI_V4SF, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, ALTIVEC_BUILTIN_VUPKHSH,
 RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
@@ -897,7 +897,7 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_UNPACKL, P8V_BUILTIN_VUPKLSW,
 RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V4SI, 0, 0 },
-  { ALTIVEC_BUILTIN_VEC_UNPACKL, ALTIVEC_BUILTIN_VUPKLPX,
+  { ALTIVEC_BUILTIN_VEC_UNPACKL, VSX_BUILTIN_DOUBLEL_V4SF,
 RS6000_BTI_V2DF, RS6000_BTI_V4SF, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKLPX, ALTIVEC_BUILTIN_VUPKLPX,
 RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, 0, 0 },
diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
new file mode 100644
index 000..bb913d2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-1-runnable.c
@@ -0,0 +1,257 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-m

Re: [PATCH] P0556R3 Integral power-of-2 operations, P0553R2 Bit operations

2018-07-03 Thread Jakub Jelinek
On Tue, Jul 03, 2018 at 10:02:47PM +0100, Jonathan Wakely wrote:
> +#ifndef _GLIBCXX_BIT
> +#define _GLIBCXX_BIT 1
> +
> +#pragma GCC system_header
> +
> +#if __cplusplus >= 201402L
> +
> +#include 
> +#include 
> +
> +namespace std _GLIBCXX_VISIBILITY(default)
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +
> +  template<typename _Tp>
> +constexpr _Tp
> +__rotl(_Tp __x, unsigned int __s) noexcept
> +{
> +  constexpr auto _Nd = numeric_limits<_Tp>::digits;
> +  const unsigned __sN = __s % _Nd;
> +  if (__sN)
> +return (__x << __sN) | (__x >> (_Nd - __sN));

Wouldn't it be better to use some branchless pattern that
GCC can also optimize well, like:
  return (__x << __sN) | (__x >> ((-__sN) & (_Nd - 1)));
(iff _Nd is always power of two), or perhaps
  return (__x << __sN) | (__x >> ((-__sN) % _Nd));
which is going to be folded into the above one for power of two constants?
E.g. ia32intrin.h also uses:
/* 64bit rol */
extern __inline unsigned long long
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
__rolq (unsigned long long __X, int __C)
{
  __C &= 63;
  return (__X << __C) | (__X >> (-__C & 63));
}
etc.

Jakub
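
A small sketch of the branchless rotate pattern suggested above, assuming C++14 and a fixed 32-bit width (a power of two, so the masking form applies). The function names are hypothetical. The branching variant only mirrors the shape of the quoted hunk, on the assumption that a zero count returns the value unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative name; a fixed-width stand-in for the templated __rotl.
// Assumes std::uint32_t is not narrower than int, so the shifts are done
// in an unsigned type.
constexpr std::uint32_t rotl32_branchless(std::uint32_t x, unsigned s) noexcept
{
  const unsigned n = s & 31u;  // reduce the count modulo the width
  // -n & 31 equals (32 - n) % 32, so both shift counts stay in [0, 31].
  return (x << n) | (x >> (-n & 31u));
}

// Branching form for comparison (assumed to return x when the count is 0).
constexpr std::uint32_t rotl32_branching(std::uint32_t x, unsigned s) noexcept
{
  const unsigned n = s % 32u;
  return n ? (x << n) | (x >> (32u - n)) : x;
}

inline void rotl_examples()
{
  assert(rotl32_branchless(0x80000001u, 1) == 0x00000003u);
  assert(rotl32_branchless(0x12345678u, 0) == 0x12345678u);
  // Both forms agree for every count, including counts past the width.
  for (unsigned s = 0; s < 64; ++s)
    assert(rotl32_branchless(0x12345678u, s) == rotl32_branching(0x12345678u, s));
}
```

The masking form relies on the width being a power of two; for a 20-bit type the unsigned negation would not reduce to `(width - n) % width`, so such types would need separate handling.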


Re: [PATCH] P0556R3 Integral power-of-2 operations, P0553R2 Bit operations

2018-07-03 Thread Jonathan Wakely

On 03/07/18 23:40 +0200, Jakub Jelinek wrote:

> On Tue, Jul 03, 2018 at 10:02:47PM +0100, Jonathan Wakely wrote:
> > +#ifndef _GLIBCXX_BIT
> > +#define _GLIBCXX_BIT 1
> > +
> > +#pragma GCC system_header
> > +
> > +#if __cplusplus >= 201402L
> > +
> > +#include 
> > +#include 
> > +
> > +namespace std _GLIBCXX_VISIBILITY(default)
> > +{
> > +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> > +
> > +  template<typename _Tp>
> > +constexpr _Tp
> > +__rotl(_Tp __x, unsigned int __s) noexcept
> > +{
> > +  constexpr auto _Nd = numeric_limits<_Tp>::digits;
> > +  const unsigned __sN = __s % _Nd;
> > +  if (__sN)
> > +return (__x << __sN) | (__x >> (_Nd - __sN));
>
> Wouldn't it be better to use some branchless pattern that
> GCC can also optimize well, like:
>   return (__x << __sN) | (__x >> ((-__sN) & (_Nd - 1)));
> (iff _Nd is always power of two),


_Nd is 20 for one of the INT_N types on msp430, but we could have a
special case for the rare integer types with unusual sizes.


> or perhaps
>   return (__x << __sN) | (__x >> ((-__sN) % _Nd));
> which is going to be folded into the above one for power of two constants?


That looks good.


> E.g. ia32intrin.h also uses:
> /* 64bit rol */
> extern __inline unsigned long long
> __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> __rolq (unsigned long long __X, int __C)
> {
>   __C &= 63;
>   return (__X << __C) | (__X >> (-__C & 63));
> }
> etc.


Should we delegate to those intrinsics for x86, so that
__builtin_ia32_rolqi and __builtin_ia32_rolhi can be used when
relevant?




[PATCH] relax lower bound for infinite arguments in gimple-ssa-sprintf.c (PR 86274)

2018-07-03 Thread Martin Sebor

In computing the size of expected output for non-constant floating
arguments, the sprintf pass doesn't consider the possibility that
the argument value may not be finite (i.e., it can be infinity or
NaN).  Infinities and NaNs are formatted as "inf" or "infinity"
and "nan".  As a result, any floating directive can produce as
few as three bytes of output for a non-finite argument, whereas
the least that directives such as %f produce for finite
arguments is 8.

The attached patch adjusts the floating point code to correctly
reflect the lower bound.

Martin
PR tree-optimization/86274 - SEGFAULT when logging std::to_string(NAN)

gcc/ChangeLog:

	PR tree-optimization/86274
	* gimple-ssa-sprintf.c (fmtresult::type_max_digits): Verify
	precondition.
	(format_floating): Correct handling of infinities and NaNs.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86274
	* gcc.dg/tree-ssa/builtin-sprintf-9.c: New test.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-10.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-15.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf.c: Same.
	* gcc.dg/tree-ssa/pr83198.c: Same.
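
As a quick check of the decimal approximation used in the `type_max_digits` hunk of this patch, the sketch below compares the formula with `std::numeric_limits`. The function name is hypothetical and only the base-10 case is modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Illustrative stand-in for the base-10 case: 301/1000 approximates
// log10(2), and the +1 rounds up to a whole digit.
constexpr unsigned max_decimal_digits(unsigned precision_bits) noexcept
{
  return precision_bits * 301 / 1000 + 1;
}

inline void digit_examples()
{
  // digits10 counts the decimal digits that always fit; the largest
  // value of the type needs one more.
  assert(max_decimal_digits(8)  == std::numeric_limits<std::uint8_t>::digits10 + 1u);
  assert(max_decimal_digits(16) == std::numeric_limits<std::uint16_t>::digits10 + 1u);
  assert(max_decimal_digits(32) == std::numeric_limits<std::uint32_t>::digits10 + 1u);
  assert(max_decimal_digits(64) == std::numeric_limits<std::uint64_t>::digits10 + 1u);
}
```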

Index: gcc/gimple-ssa-sprintf.c
===
--- gcc/gimple-ssa-sprintf.c	(revision 262312)
+++ gcc/gimple-ssa-sprintf.c	(working copy)
@@ -781,15 +781,19 @@ unsigned
 fmtresult::type_max_digits (tree type, int base)
 {
   unsigned prec = TYPE_PRECISION (type);
-  if (base == 8)
-return (prec + 2) / 3;
+  switch (base)
+{
+case 8:
+  return (prec + 2) / 3;
+case 10:
+  /* Decimal approximation: yields 3, 5, 10, and 20 for precision
+	 of 8, 16, 32, and 64 bits.  */
+  return prec * 301 / 1000 + 1;
+case 16:
+  return prec / 4;
+}
 
-  if (base == 16)
-return prec / 4;
-
-  /* Decimal approximation: yields 3, 5, 10, and 20 for precision
- of 8, 16, 32, and 64 bits.  */
-  return prec * 301 / 1000 + 1;
+  gcc_unreachable ();
 }
 
 static bool
@@ -1759,6 +1763,11 @@ format_floating (const directive &dir, const HOST_
   unsigned flagmin = (1 /* for the first digit */
 		  + (dir.get_flag ('+') | dir.get_flag (' ')));
 
+  /* The minimum is 3 for "inf" and "nan" for all specifiers, plus 1
+ for the plus sign/space with the '+' and ' ' flags, respectively,
+ unless reduced below.  */
+  res.range.min = 2 + flagmin;
+
   /* When the pound flag is set the decimal point is included in output
  regardless of precision.  Whether or not a decimal point is included
  otherwise depends on the specification and precision.  */
@@ -1775,14 +1784,13 @@ format_floating (const directive &dir, const HOST_
 	else if (dir.prec[0] > 0)
 	  minprec = dir.prec[0] + !radix /* decimal point */;
 
-	res.range.min = (2 /* 0x */
-			 + flagmin
-			 + radix
-			 + minprec
-			 + 3 /* p+0 */);
+	res.range.likely = (2 /* 0x */
+			+ flagmin
+			+ radix
+			+ minprec
+			+ 3 /* p+0 */);
 
 	res.range.max = format_floating_max (type, 'a', prec[1]);
-	res.range.likely = res.range.min;
 
 	/* The unlikely maximum accounts for the longest multibyte
 	   decimal point character.  */
@@ -1800,15 +1808,14 @@ format_floating (const directive &dir, const HOST_
 	   non-zero, decimal point.  */
 	HOST_WIDE_INT minprec = prec[0] ? prec[0] + !radix : 0;
 
-	/* The minimum output is "[-+]1.234567e+00" regardless
+	/* The likely minimum output is "[-+]1.234567e+00" regardless
 	   of the value of the actual argument.  */
-	res.range.min = (flagmin
-			 + radix
-			 + minprec
-			 + 2 /* e+ */ + 2);
+	res.range.likely = (flagmin
+			+ radix
+			+ minprec
+			+ 2 /* e+ */ + 2);
 
 	res.range.max = format_floating_max (type, 'e', prec[1]);
-	res.range.likely = res.range.min;
 
 	/* The unlikely maximum accounts for the longest multibyte
 	   decimal point character.  */
@@ -1827,12 +1834,15 @@ format_floating (const directive &dir, const HOST_
 	   decimal point.  */
 	HOST_WIDE_INT minprec = prec[0] ? prec[0] + !radix : 0;
 
-	/* The lower bound when precision isn't specified is 8 bytes
-	   ("1.23456" since precision is taken to be 6).  When precision
-	   is zero, the lower bound is 1 byte (e.g., "1").  Otherwise,
-	   when precision is greater than zero, then the lower bound
-	   is 2 plus precision (plus flags).  */
-	res.range.min = flagmin + radix + minprec;
+	/* For finite numbers (i.e., not infinity or NaN) the lower bound
+	   when precision isn't specified is 8 bytes ("1.23456" since
+	   precision is taken to be 6).  When precision is zero, the lower
+	   bound is 1 byte (e.g., "1").  Otherwise, when precision is greater
+	   than zero, then the lower bound is 2 plus precision (plus flags).
+	   But in all cases, the lower bound is no greater than 3.  */
+	unsigned HOST_WIDE_INT min = flagmin + radix + minprec;
+	if (min < res.range.min)
+	  res.range.min = min;
 
 	/* Compute the upper bound for -TYPE_MAX.  */
 	

[PATCH] have pretty printer include NaN representation

2018-07-03 Thread Martin Sebor

The pretty-printer formats NaNs simply as Nan, even though
there is much more to a NaN than that.  At the very
least, one might like to know if the NaN is signaling or
quiet, negative or positive.  If it's not in a canonical
form, one might also be interested in the significand
and exponent parts.  The attached patch enhances
the pretty printer to include all these details in
its detailed output.
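
To illustrate the kind of detail described above, here is a minimal sketch in plain C.  It is not the patch's print_real_cst and does not use REAL_VALUE_TYPE; format_nan_bits is a hypothetical helper that decodes an IEEE 754 binary64 bit pattern, and it assumes the common convention that a set top significand bit marks a quiet NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not GCC's print_real_cst): format the NaN encoded
   by the IEEE 754 binary64 bit pattern BITS into BUF as
   [-](Q|S)NaN(0xPAYLOAD).  Return the snprintf result, or -1 when BITS
   does not encode a NaN.  Assumes the top significand bit is set for
   quiet NaNs, which holds on most but not all targets.  */
static int
format_nan_bits (uint64_t bits, char *buf, size_t size)
{
  assert (buf != NULL && size > 0);

  const uint64_t exp_mask = UINT64_C (0x7ff0000000000000);
  const uint64_t sig_mask = UINT64_C (0x000fffffffffffff);
  const uint64_t quiet_bit = UINT64_C (0x0008000000000000);

  uint64_t sig = bits & sig_mask;
  if ((bits & exp_mask) != exp_mask || sig == 0)
    return -1;			/* Finite value or infinity.  */

  int negative = (int) (bits >> 63);
  int quiet = (sig & quiet_bit) != 0;
  uint64_t payload = sig & ~quiet_bit;

  return snprintf (buf, size, "%s%cNaN(0x%llx)",
		   negative ? "-" : "", quiet ? 'Q' : 'S',
		   (unsigned long long) payload);
}
```

Under these assumptions, the pattern 0x7ff8000000000000 formats as QNaN(0x0) and 0xfff0000000000001 as -SNaN(0x1).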

Tested by bootstrapping & regtesting on x86_64-linux.

Martin
gcc/ChangeLog:

	* print-tree.c (print_real_cst): New function.
	(print_node_brief): Call it.
	(print_node): Ditto.

Index: gcc/print-tree.c
===
--- gcc/print-tree.c	(revision 262312)
+++ gcc/print-tree.c	(working copy)
@@ -52,6 +52,71 @@ dump_addr (FILE *file, const char *prefix, const v
 fprintf (file, "%s" HOST_PTR_PRINTF, prefix, addr);
 }
 
+/* Print to FILE a NODE representing a REAL_CST constant, including
+   Infinity and NaN.  Be verbose when BRIEF is false.  */
+
+static void
+print_real_cst (FILE *file, const_tree node, bool brief)
+{
+  if (TREE_OVERFLOW (node))
+fprintf (file, " overflow");
+
+  REAL_VALUE_TYPE d = TREE_REAL_CST (node);
+  if (REAL_VALUE_ISINF (d))
+fprintf (file,  REAL_VALUE_NEGATIVE (d) ? " -Inf" : " Inf");
+  else if (REAL_VALUE_ISNAN (d))
+{
+  /* Print a NaN in the format [-][Q|S]NaN[(significand[exponent])]
+	 where significand is a hexadecimal string that starts with
+	 the 0x prefix followed by 0 if the number is not canonical
+	 and a non-zero digit if it is, and exponent is decimal.  */
+  unsigned start = 0;
+  const char *psig = (const char *) d.sig;
+  for (unsigned i = 0; i != sizeof d.sig; ++i)
+	if (psig[i])
+	  {
+	start = i;
+	break;
+	  }
+
+  fprintf (file, " %s%sNaN", d.sign ? "-" : "",
+	   d.signalling ? "S" : "Q");
+
+  if (brief)
+	return;
+
+  if (start)
+	fprintf (file, "(0x%s", d.canonical ? "" : "0");
+  else if (d.uexp)
+	fprintf (file, "(%s", d.canonical ? "" : "0");
+  else if (!d.canonical)
+	{
+	  fprintf (file, "(0)");
+	  return;
+	}
+
+  if (psig[start])
+	{
+	  for (unsigned i = start; i != sizeof d.sig; ++i)
+	if (i == start)
+	  fprintf (file, "%x", psig[i]);
+	else
+	  fprintf (file, "%02x", psig[i]);
+	}
+
+  if (d.uexp)
+	fprintf (file, "%se%u)", psig[start] ? "," : "", d.uexp);
+  else if (psig[start])
+	fputc (')', file);
+}
+  else
+{
+  char string[64];
+  real_to_decimal (string, &d, sizeof (string), 0, 1);
+  fprintf (file, " %s", string);
+}
+}
+
 /* Print a node in brief fashion, with just the code, address and name.  */
 
 void
@@ -121,24 +186,7 @@ print_node_brief (FILE *file, const char *prefix,
   print_dec (wi::to_wide (node), file, TYPE_SIGN (TREE_TYPE (node)));
 }
   if (TREE_CODE (node) == REAL_CST)
-{
-  REAL_VALUE_TYPE d;
-
-  if (TREE_OVERFLOW (node))
-	fprintf (file, " overflow");
-
-  d = TREE_REAL_CST (node);
-  if (REAL_VALUE_ISINF (d))
-	fprintf (file,  REAL_VALUE_NEGATIVE (d) ? " -Inf" : " Inf");
-  else if (REAL_VALUE_ISNAN (d))
-	fprintf (file, " Nan");
-  else
-	{
-	  char string[60];
-	  real_to_decimal (string, &d, sizeof (string), 0, 1);
-	  fprintf (file, " %s", string);
-	}
-}
+print_real_cst (file, node, true);
   if (TREE_CODE (node) == FIXED_CST)
 {
   FIXED_VALUE_TYPE f;
@@ -730,24 +778,7 @@ print_node (FILE *file, const char *prefix, tree n
 	  break;
 
 	case REAL_CST:
-	  {
-	REAL_VALUE_TYPE d;
-
-	if (TREE_OVERFLOW (node))
-	  fprintf (file, " overflow");
-
-	d = TREE_REAL_CST (node);
-	if (REAL_VALUE_ISINF (d))
-	  fprintf (file,  REAL_VALUE_NEGATIVE (d) ? " -Inf" : " Inf");
-	else if (REAL_VALUE_ISNAN (d))
-	  fprintf (file, " Nan");
-	else
-	  {
-		char string[64];
-		real_to_decimal (string, &d, sizeof (string), 0, 1);
-		fprintf (file, " %s", string);
-	  }
-	  }
+	  print_real_cst (file, node, false);
 	  break;
 
 	case FIXED_CST:


Re: [PATCH] have pretty printer include NaN representation

2018-07-03 Thread Jeff Law
On 07/03/2018 04:59 PM, Martin Sebor wrote:
> The pretty-printer formats NaNs simply as Nan, even though
> there is much more to a NaN than that.  At the very
> least, one might like to know if the NaN is signaling or
> quiet, negative or positive.  If it's not in a canonical
> form, one might also be interested in the significand
> and exponent parts.  The attached patch enhances
> the pretty printer to include all these details in
> its detailed output.
> 
> Tested by bootstrapping & regtesting on x86_64-linux.
> 
> Martin
> 
> gcc-print-real-cst.diff
> 
> 
> gcc/ChangeLog:
> 
>   * print-tree.c (print_real_cst): New function.
>   (print_node_brief): Call it.
>   (print_node): Ditto.
OK.
jeff


Re: [patch] jump threading multiple paths that start from the same BB

2018-07-03 Thread Jeff Law
On 07/03/2018 03:31 AM, Aldy Hernandez wrote:
> On 07/02/2018 07:08 AM, Christophe Lyon wrote:
> 
 On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
> While poking around in the backwards threader I noticed that we bail if
>
> we have already seen a starting BB.
>
> /* Do not jump-thread twice from the same block.  */
> if (bitmap_bit_p (threaded_blocks, entry->src->index)
>
> This limitation discards paths that are sub-paths of paths that have
> already been threaded.
>
> The following patch scans the remaining to-be-threaded paths to identify
>
> if any of them start from the same point, and are thus sub-paths of the
>
> just-threaded path.  By removing the common prefix of blocks in upcoming
>
> threadable paths, and then rewiring first non-common block
> appropriately, we expose new threading opportunities, since we are no
> longer starting from the same BB.  We also simplify the would-be
> threaded paths, because we don't duplicate already duplicated paths.
> [snip]
>> Hi,
>>
>> I've noticed a regression on aarch64:
>> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
>> threaded: 3"
>> very likely caused by this patch (appeared between 262282 and 262294)
>>
>> Christophe
> 
> The test needs to be adjusted here.
> 
> The long story is that the aarch64 IL is different at thread3 time in
> that it has 2 profitable sub-paths that can now be threaded with my
> patch.  This is causing the threaded count to be 5 for aarch64, versus 3
> for x86_64.  Previously we couldn't thread these in aarch64, so the
> backwards threader would bail.
> 
> One can see the different threading opportunities by sticking
> debug_all_paths() at the top of thread_through_all_blocks().  You will
> notice that aarch64 has far more candidates to begin with.  The IL on
> the x86 backend has no paths that start on the same BB.  The aarch64 IL,
> on the other hand, has many to choose from:
> 
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,
> 
> Some of these prove unprofitable, but 2 more than before are profitable now.
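
As a rough illustration of the shared-prefix idea discussed above, the following sketch compares two candidate paths edge by edge.  It is not the threader's implementation; struct edge and common_prefix_length are hypothetical names, and a path is assumed to be a plain array of (src, dest) block indices as printed by debug_all_paths().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical representation of one CFG edge by basic block index.  */
struct edge
{
  int src;
  int dest;
};

/* Return the number of leading edges that paths A (NA edges) and
   B (NB edges) have in common.  A result of zero means the two paths
   do not start from the same edge.  */
static size_t
common_prefix_length (const struct edge *a, size_t na,
		      const struct edge *b, size_t nb)
{
  assert (a != NULL && b != NULL);

  size_t n = 0;
  while (n < na && n < nb
	 && a[n].src == b[n].src && a[n].dest == b[n].dest)
    n++;
  return n;
}
```

For the listing above, the first path (52 -> 56 ... 10 -> 11) shares all five of its edges with the fourth path, four edges with the seventh, and none with the paths that start at block 51 or 53.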
> 
> 
> BTW, I see another threading related failure on aarch64 which is
> unrelated to my patch, and was previously there:
> 
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps
> threaded"
> 
> This is probably another IL incompatibility between architectures.
> 
> Anyways... the attached patch fixes the regression.  I have added a note
> to the test explaining the IL differences.  We really should rewrite all
> the threading tests (I am NOT volunteering ;-)).
> 
> OK for trunk?
> Aldy
> 
> curr.patch
> 
> 
> gcc/testsuite/
> 
>   * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
>   has a slightly different IL that provides more threading
>   opportunities.
OK.

WRT rewriting the tests.  I'd certainly agree that we don't have the
right set of knobs to allow us to characterize the target nor do we have
the right dumping/scanning facilities to describe and query the CFG changes.

The fact that the IL changes so much across targets is a sign that
target dependency (probably BRANCH_COST) is twiddling the gimple we
generate.  I strongly suspect we'd be a lot better off if we tackled the
BRANCH_COST problem first.

jeff

ps.  That particular test is the test which led to the creation of the
backwards jump threader :-)


Re: [PATCH 3/3] Extend -falign-FOO=N to N[:M[:N2[:M2]]]

2018-07-03 Thread Jeff Law
On 07/03/2018 01:11 PM, Martin Liška wrote:
> On 07/03/2018 10:53 AM, Martin Liška wrote:
>> Thank you Jeff.
>>
>> I found some issues when building all targets
>> (contrib/config-list.mk).
>> I'll update patch and test that affected cross-compilers still produce
>> same output.
> 
> Hello.
> 
> I'm done with testing, I bootstrapped and regtested the patch on
> x86_64-linux and ppc64-linux-gnu.
> I also built all cross compilers we have in contrib/config-list.mk and I
> verified that
> results for gcc/gcc/testsuite/gcc.dg/params/blocksort-part.c source file
> is equal for all cross compilers
> that I touched in the patch. I tested these options:
> 
> -O2
> -O2 -falign-loops=256
> -O2 -falign-loops=256 -falign-functions=512 -falign-labels=1024
> -falign-jumps=2048
> -O2 -falign-loops=1024 -falign-functions=512 -falign-jumps=2048
> -O2 -falign-loops=256 -falign-jumps=2048
> -O2 -falign-loops=100 -falign-functions=200 -falign-labels=300
> -falign-jumps=400
> -O2 -falign-loops= -falign-functions=1112 -falign-labels=1113
> -falign-jumps=1114
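
For readers unfamiliar with the N[:M[:N2[:M2]]] form named in the subject, here is a minimal parsing sketch.  It is not the code from the patch; parse_align_arg is a hypothetical helper, and rejecting empty fields is an assumption of this sketch rather than documented GCC behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser for an argument of the form N[:M[:N2[:M2]]].
   Store the fields in VALUES and return how many were present (1 to 4),
   or -1 if ARG is malformed.  Empty fields are rejected, which is an
   assumption of this sketch.  */
static int
parse_align_arg (const char *arg, unsigned long values[4])
{
  assert (arg != NULL && values != NULL);

  for (int i = 0; i < 4; i++)
    {
      if (!isdigit ((unsigned char) *arg))
	return -1;

      char *end;
      errno = 0;
      values[i] = strtoul (arg, &end, 10);
      if (errno == ERANGE)
	return -1;

      if (*end == '\0')
	return i + 1;		/* Last field consumed.  */
      if (*end != ':')
	return -1;
      arg = end + 1;		/* Skip the separator.  */
    }

  return -1;			/* More than four fields.  */
}
```

With this sketch, "256" yields one field and "32:25:16:7" yields four, while "16:" and "1:2:3:4:5" are rejected.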
> 
> There are no issues except ones that are already present on current trunk:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86394
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86390
> 
> Is the patchset still ready for approval?
Yes.

jeff



[committed] Consolidate movsd and movmd patterns/splitters for H8

2018-07-03 Thread Jeff Law

So the H8 port has movmd and movsd instructions which are used to
implement block moves and stpcpy.  Those expanders and patterns are
necessarily fairly ugly, but not enough to warrant trying to simplify.

What is worth simplifying is the fact that we have two copies of each
pattern and splitter because Pmode varies.  This patch uses the P mode
iterator to consolidate the patterns & splitters that only differed in
the modes they used for the pointer operands.

The resulting code for libgcc & newlib is unchanged.

Installed on the trunk.

Jeff
* config/h8300/h8300.md (movmd_internal_normal): Consolidated with
(movmd_internal) into a single pattern using the P mode iterator.
(movmd splitters): Similarly.
(stpcpy_internal_normal, stpcpy_internal): Similarly for these patterns.
(movsd splitters): Similarly.


diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 5014fd5..e654784 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -512,33 +512,16 @@
 ;; This is a difficult instruction to reload since operand 0 must be the
 ;; frame pointer.  See h8300_reg_class_from_letter for an explanation.
 
-(define_insn "movmd_internal_normal"
-  [(set (mem:BLK (match_operand:HI 3 "register_operand" "0,r"))
-   (mem:BLK (match_operand:HI 4 "register_operand" "1,1")))
+(define_insn "movmd_internal_<mode>"
+  [(set (mem:BLK (match_operand:P 3 "register_operand" "0,r"))
+   (mem:BLK (match_operand:P 4 "register_operand" "1,1")))
(unspec [(match_operand:HI 5 "register_operand" "2,2")
(match_operand:HI 6 "const_int_operand" "n,n")] UNSPEC_MOVMD)
-   (clobber (match_operand:HI 0 "register_operand" "=d,??D"))
-   (clobber (match_operand:HI 1 "register_operand" "=f,f"))
+   (clobber (match_operand:P 0 "register_operand" "=d,??D"))
+   (clobber (match_operand:P 1 "register_operand" "=f,f"))
(set (match_operand:HI 2 "register_operand" "=c,c")
(const_int 0))]
-  "TARGET_H8300SX && TARGET_NORMAL_MODE"
-  "@
-movmd%m6
-#"
-  [(set_attr "length" "2,14")
-   (set_attr "can_delay" "no")
-   (set_attr "cc" "none,clobber")])
-
-(define_insn "movmd_internal"
-  [(set (mem:BLK (match_operand:SI 3 "register_operand" "0,r"))
-   (mem:BLK (match_operand:SI 4 "register_operand" "1,1")))
-   (unspec [(match_operand:HI 5 "register_operand" "2,2")
-   (match_operand:HI 6 "const_int_operand" "n,n")] UNSPEC_MOVMD)
-   (clobber (match_operand:SI 0 "register_operand" "=d,??D"))
-   (clobber (match_operand:SI 1 "register_operand" "=f,f"))
-   (set (match_operand:HI 2 "register_operand" "=c,c")
-   (const_int 0))]
-  "TARGET_H8300SX && !TARGET_NORMAL_MODE"
+  "TARGET_H8300SX"
   "@
 movmd%m6
 #"
@@ -563,33 +546,11 @@
(match_operand:BLK 1 "memory_operand" ""))
(unspec [(match_operand:HI 2 "register_operand" "")
(match_operand:HI 3 "const_int_operand" "")] UNSPEC_MOVMD)
-   (clobber (match_operand:HI 4 "register_operand" ""))
-   (clobber (match_operand:HI 5 "register_operand" ""))
-   (set (match_dup 2)
-   (const_int 0))]
-  "TARGET_H8300SX && TARGET_NORMAL_MODE && reload_completed
-   && REGNO (operands[4]) != DESTINATION_REG"
-  [(const_int 0)]
-  {
-rtx dest;
-
-h8300_swap_into_er6 (XEXP (operands[0], 0));
-dest = replace_equiv_address (operands[0], hard_frame_pointer_rtx);
-emit_insn (gen_movmd (dest, operands[1], operands[2], operands[3]));
-h8300_swap_out_of_er6 (operands[4]);
-DONE;
-  })
-
-(define_split
-  [(set (match_operand:BLK 0 "memory_operand" "")
-   (match_operand:BLK 1 "memory_operand" ""))
-   (unspec [(match_operand:HI 2 "register_operand" "")
-   (match_operand:HI 3 "const_int_operand" "")] UNSPEC_MOVMD)
-   (clobber (match_operand:SI 4 "register_operand" ""))
-   (clobber (match_operand:SI 5 "register_operand" ""))
+   (clobber (match_operand:P 4 "register_operand" ""))
+   (clobber (match_operand:P 5 "register_operand" ""))
(set (match_dup 2)
(const_int 0))]
-  "TARGET_H8300SX && !TARGET_NORMAL_MODE && reload_completed
+  "TARGET_H8300SX && reload_completed
&& REGNO (operands[4]) != DESTINATION_REG"
   [(const_int 0)]
   {
@@ -641,28 +602,14 @@
 
 ;; See comments above memcpy_internal().
 
-(define_insn "stpcpy_internal_normal"
-  [(set (mem:BLK (match_operand:HI 3 "register_operand" "0,r"))
-   (unspec:BLK [(mem:BLK (match_operand:HI 4 "register_operand" "1,1"))]
-   UNSPEC_STPCPY))
-   (clobber (match_operand:HI 0 "register_operand" "=d,??D"))
-   (clobber (match_operand:HI 1 "register_operand" "=f,f"))
-   (clobber (match_operand:HI 2 "register_operand" "=c,c"))]
-  "TARGET_H8300SX && TARGET_NORMAL_MODE"
-  "@
-\n1:\tmovsd\t2f\;bra\t1b\n2:
-#"
-  [(set_attr "length" "6,18")
-   (set_attr "cc" "none,clobber")])
-
-(define_insn "stpcpy_internal"
-  [(set (mem:BLK (match_operand:SI 3 "register_operand" "0,r"))
-   (unspec:BLK [(mem:BLK (match_operand:SI 4 "register_operand" "1,1"))]
+(de

Re: [PATCH] relax lower bound for infinite arguments in gimple-ssa-sprinf.c (PR 86274)

2018-07-03 Thread Jeff Law
On 07/03/2018 04:50 PM, Martin Sebor wrote:
> In computing the size of expected output for non-constant floating
> arguments the sprintf pass doesn't consider the possibility that
> the argument value may be not finite (i.e., it can be infinity or
> NaN).  Infinities and NaNs are formatted as "inf" or "infinity"
> and "nan".  As a result, any floating directive can produce as
> few bytes on output as three for a non-finite argument, whereas
> the fewest bytes that directives such as %f produce for finite
> arguments is 8.
> 
> The attached patch adjusts the floating point code to correctly
> reflect the lower bound.
> 
> Martin
> 
> gcc-86274.diff
> 
> 
> PR tree-optimization/86274 - SEGFAULT when logging std::to_string(NAN)
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/86274
>   * gimple-ssa-sprintf.c (fmtresult::type_max_digits): Verify
>   precondition.
>   (format_floating): Correct handling of infinities and NaNs.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/86274
>   * gcc.dg/tree-ssa/builtin-sprintf-9.c: New test.
>   * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust.
>   * gcc.dg/tree-ssa/builtin-sprintf-warn-10.c: Same.
>   * gcc.dg/tree-ssa/builtin-sprintf-warn-15.c: Same.
>   * gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Same.
>   * gcc.dg/tree-ssa/builtin-sprintf.c: Same.
>   * gcc.dg/tree-ssa/pr83198.c: Same.
OK
jeff
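
For reference, the digit-count arithmetic that the patch's fmtresult::type_max_digits change relies on can be tried in isolation.  The sketch below is a standalone restatement, not the GCC function; max_digits is a hypothetical name, it takes the bit precision directly instead of a tree type, and it uses assert where the patch uses gcc_unreachable.

```c
#include <assert.h>

/* Standalone restatement of the per-base digit bounds: the maximum
   number of digits needed to print an unsigned value of PREC bits in
   BASE, which must be 8, 10 or 16.  */
static unsigned
max_digits (unsigned prec, int base)
{
  switch (base)
    {
    case 8:
      return (prec + 2) / 3;	/* Three bits per octal digit, rounded up.  */
    case 10:
      /* log10 (2) is roughly 0.301, so this approximates
	 prec * log10 (2) + 1.  */
      return prec * 301 / 1000 + 1;
    case 16:
      return prec / 4;		/* Four bits per hexadecimal digit.  */
    }

  assert (!"unsupported base");
  return 0;
}
```

For base 10 this gives 3, 5, 10, and 20 digits for precisions of 8, 16, 32, and 64 bits, matching the comment in the patch.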


[committed] Consolidate xor/ior patterns and splitters for H8

2018-07-03 Thread Jeff Law

This patch consolidates the ior/xor expanders, patterns and splitters
using a code iterator.

This does make us a bit more lenient on what we accept for xor operands
than we were before.  However, AFAICT xor should accept the same
operands as ior on all the supported H8 variants.

The additional leniency in accepted operands for xor does result in
improved code generation in some circumstances.  For example, instead of
something like this:

!  22c: 6e 7a 00 11 mov.b   @(0x11:16,er7),r2l
!  230: 71 0a   bnot#0x0,r2l
!  232: 6e fa 00 11 mov.b   r2l,@(0x11:16,er7)

Instead we get:

!  22c: 01 74 6e 78 xor.b   #0x1,@(0x11:16,er7)
!  230: 00 11 d0 01


Shorter and saves a register.  More importantly there's simply going to
be fewer patterns & splitters to adjust for the transition away from cc0.

Installed on the trunk.

Jeff
* config/h8300/h8300.md (ors code_iterator): New.
(bsetqi_msx, bnotqi_msx patterns and splitters): Consolidate into
a single pattern and single splitter.
(bsethi_msx, bnothi_msx patterns): Consolidate into a single pattern.
(iorqi3_1, xorqi3_1): Likewise.
(iorqi3, xorqi3 expanders): Similarly.

diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index e654784..f3cf421 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -193,6 +193,8 @@
 (define_mode_iterator QHSIF [QI HI SI SF])
 
 (define_code_iterator shifts [ashift ashiftrt lshiftrt])
+
+(define_code_iterator ors [ior xor])
 
 ;; --
 ;; MOVE INSTRUCTIONS
@@ -1597,126 +1599,44 @@
   [(set_attr "length" "2")])
 
 ;; --
-;; OR INSTRUCTIONS
+;; OR/XOR INSTRUCTIONS
 ;; --
 
-(define_insn "bsetqi_msx"
+(define_insn "b<code>qi_msx"
   [(set (match_operand:QI 0 "bit_register_indirect_operand" "=WU")
-   (ior:QI (match_operand:QI 1 "bit_register_indirect_operand" "%0")
+   (ors:QI (match_operand:QI 1 "bit_register_indirect_operand" "%0")
(match_operand:QI 2 "single_one_operand" "Y2")))]
   "TARGET_H8300SX && rtx_equal_p (operands[0], operands[1])"
-  "bset\\t%V2,%0"
+  { return <CODE> == IOR ? "bset\\t%V2,%0" : "bnot\\t%V2,%0"; }
   [(set_attr "length" "8")])
 
-(define_split
-  [(set (match_operand:HI 0 "bit_register_indirect_operand")
-   (ior:HI (match_operand:HI 1 "bit_register_indirect_operand")
-   (match_operand:HI 2 "single_one_operand")))]
-  "TARGET_H8300SX"
-  [(set (match_dup 0)
-   (ior:QI (match_dup 1)
-   (match_dup 2)))]
-  {
-if (abs (INTVAL (operands[2])) > 0xFF)
-  {
-   operands[0] = adjust_address (operands[0], QImode, 0);
-   operands[1] = adjust_address (operands[1], QImode, 0);
-   operands[2] = GEN_INT ((INTVAL (operands[2])) >> 8);
-  }
-else
-  {
-   operands[0] = adjust_address (operands[0], QImode, 1);
-   operands[1] = adjust_address (operands[1], QImode, 1);
-  }
-  })
-
-(define_insn "bsethi_msx"
+(define_insn "b<code>hi_msx"
   [(set (match_operand:HI 0 "bit_register_indirect_operand" "=m")
-   (ior:HI (match_operand:HI 1 "bit_register_indirect_operand" "%0")
+   (ors:HI (match_operand:HI 1 "bit_register_indirect_operand" "%0")
(match_operand:HI 2 "single_one_operand" "Y2")))]
   "TARGET_H8300SX"
-  "bset\\t%V2,%0"
+  { return <CODE> == IOR ? "bset\\t%V2,%0" : "bnot\\t%V2,%0"; }
   [(set_attr "length" "8")])
 
-(define_insn "iorqi3_1"
+(define_insn "<code>qi3_1"
   [(set (match_operand:QI 0 "bit_operand" "=U,rQ")
-   (ior:QI (match_operand:QI 1 "bit_operand" "%0,0")
+   (ors:QI (match_operand:QI 1 "bit_operand" "%0,0")
(match_operand:QI 2 "h8300_src_operand" "Y2,rQi")))]
   "TARGET_H8300SX || register_operand (operands[0], QImode)
|| single_one_operand (operands[2], QImode)"
-  "@
-   bset\\t%V2,%R0
-   or\\t%X2,%X0"
-  [(set_attr "length" "8,*")
-   (set_attr "length_table" "*,logicb")
-   (set_attr "cc" "none_0hit,set_znv")])
-
-(define_expand "ior3"
-  [(set (match_operand:QHSI 0 "register_operand" "")
-   (ior:QHSI (match_operand:QHSI 1 "register_operand" "")
- (match_operand:QHSI 2 "h8300_src_operand" "")))]
-  ""
-  "")
-
-;; --
-;; XOR INSTRUCTIONS
-;; --
-
-(define_insn "bnotqi_msx"
-  [(set (match_operand:QI 0 "bit_register_indirect_operand" "=WU")
-   (xor:QI (match_operand:QI 1 "bit_register_indirect_operand" "%0")
-   (match_operand:QI 2 "single_one_operand" "Y2")))]
-  "TARGET_H8300SX
-   && rtx_equal_p (operands[0], operands[1])"
-  "bnot\\t%V2,%0"
-  [(set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:HI 0 "bit_register_indirect_operand")
-   (xor:HI (match_operand:HI 1 "bit_regist

PING [PATCH] specify large command line option arguments (PR 82063)

2018-07-03 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01509.html

On 06/24/2018 03:05 PM, Martin Sebor wrote:

Storing integer command line option arguments in type int
limits options such as -Wlarger-than= or -Walloca-larger-than
to at most INT_MAX (see bug 71905).  Larger values wrap around
zero.  The value zero is considered to disable the option,
making it impossible to specify a zero limit.

To get around these limitations, the -Walloc-size-larger-than=
option accepts a string argument that it then parses itself
and interprets as HOST_WIDE_INT.  The option also accepts byte
size suffixes like KB, MB, GiB, etc. to make it convenient to
specify very large limits.
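
A minimal sketch of such suffix handling follows.  It is not the option-processing code from the patch; parse_byte_size is a hypothetical helper, only a few suffixes are covered, and treating KB/MB/GB as powers of 1000 and KiB/MiB/GiB as powers of 1024 is an assumption made here for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser for a byte-size argument such as "4096", "10KB"
   or "1GiB".  Store the size in *RESULT and return 0, or return -1 on
   a malformed argument or overflow.  The multiplier chosen for each
   suffix is an assumption of this sketch.  */
static int
parse_byte_size (const char *arg, unsigned long long *result)
{
  static const struct
  {
    const char *suffix;
    unsigned long long unit;
  } units[] = {
    { "", 1 }, { "B", 1 },
    { "kB", 1000ULL }, { "KB", 1000ULL }, { "KiB", 1ULL << 10 },
    { "MB", 1000000ULL }, { "MiB", 1ULL << 20 },
    { "GB", 1000000000ULL }, { "GiB", 1ULL << 30 },
  };

  assert (arg != NULL && result != NULL);

  if (!isdigit ((unsigned char) *arg))
    return -1;

  char *end;
  errno = 0;
  unsigned long long n = strtoull (arg, &end, 10);
  if (errno == ERANGE)
    return -1;

  for (size_t i = 0; i < sizeof units / sizeof units[0]; i++)
    if (strcmp (end, units[i].suffix) == 0)
      {
	if (n > ULLONG_MAX / units[i].unit)
	  return -1;		/* The product would overflow.  */
	*result = n * units[i].unit;
	return 0;
      }

  return -1;			/* Unknown suffix.  */
}
```

Note that "0" parses successfully to a size of zero here, so zero is an ordinary value rather than an "option disabled" marker.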

The int limitation is obviously less than ideal in a 64-bit
world.  The treatment of zero as a toggle is just a minor wart.
The special treatment to make it work for just a single option
makes option handling inconsistent.  It should be possible for
any option that takes an integer argument to use the same logic.

The attached patch enhances GCC option processing to do that.
It changes the storage type of option arguments from int to
HOST_WIDE_INT and extends the existing (although undocumented)
option property Host_Wide_Int to specify wide option arguments.
It also introduces the ByteSize property for options for which
specifying the byte-size suffix makes sense.

To make it possible to consider zero as a meaningful argument
value rather than a flag indicating that an option is disabled
the patch also adds a CLVC_SIZE enumerator to the cl_var_type
enumeration, and modifies how options of the kind are handled.

Warning options that take large byte-size arguments can be
disabled by specifying a value equal to or greater than
HOST_WIDE_INT_M1U.  For convenience, aliases in the form of
-Wno-xxx-larger-than have been provided for all the affected
options.

In the patch all the existing -larger-than options are set
to PTRDIFF_MAX.  This makes them effectively enabled, but
because the setting is exceedingly permissive, and because
some of the existing warnings are already set to the same
value and some other checks detect and reject such exceedingly
large values with errors, this change shouldn't noticeably
affect what constructs are diagnosed.

Although all the options are set to PTRDIFF_MAX, I think it
would make sense to consider setting some of them lower, say
to PTRDIFF_MAX / 2.  I'd like to propose that in a followup
patch.

To minimize observable changes the -Walloca-larger-than and
-Wvla-larger-than warnings required more extensive work to
make use of the new mechanism because of the "unbounded" argument
handling (the warnings trigger for arguments that are not
visibly constrained), and because of the zero handling
(the warnings also trigger


Martin





[committed] Remove duplicate logical3_sn pattern

2018-07-03 Thread Jeff Law

After consolidating HI and SI mode patterns for logical bit operators,
the remaining patterns "logical3_sn" and "logical3" are in
effect identical.

This patch deletes logical3_sn.

This (not surprisingly) makes no difference in the generated code :-)

Installed on the trunk

Jeff
* config/h8300/h8300.md (logical3_sn, logical3): Merge
into a single pattern.

diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 5014fd5..faa9c78 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -1797,34 +1640,13 @@
 ;; {AND,IOR,XOR}{HI3,SI3} PATTERNS
 ;; --
 
-;; We need a separate pattern here because machines other than the
-;; original H8300 don't have to split the 16-bit operand into a pair
-;; of high/low instructions, so we can accept literal addresses, that
-;; have to be loaded into a register on H8300.
-
-(define_insn "*logical3_sn"
-  [(set (match_operand:HSI 0 "h8300_dst_operand" "=rQ")
-   (match_operator:HSI 3 "bit_operator"
-[(match_operand:HSI 1 "h8300_dst_operand" "%0")
- (match_operand:HSI 2 "h8300_src_operand" "rQi")]))]
-  "(TARGET_H8300S || TARGET_H8300H) && h8300_operands_match_p (operands)"
-{
-  return output_logical_op (mode, operands);
-}
-  [(set (attr "length")
-   (symbol_ref "compute_logical_op_length (mode, operands)"))
-   (set (attr "cc")
-   (symbol_ref "compute_logical_op_cc (mode, operands)"))])
-
 (define_insn "*logical3"
   [(set (match_operand:HSI 0 "h8300_dst_operand" "=rQ")
(match_operator:HSI 3 "bit_operator"
  [(match_operand:HSI 1 "h8300_dst_operand" "%0")
   (match_operand:HSI 2 "h8300_src_operand" "rQi")]))]
   "h8300_operands_match_p (operands)"
-{
-  return output_logical_op (mode, operands);
-}
+  { return output_logical_op (mode, operands); }
   [(set (attr "length")
(symbol_ref "compute_logical_op_length (mode, operands)"))
(set (attr "cc")


Re: [PATCH] relax lower bound for infinite arguments in gimple-ssa-sprinf.c (PR 86274)

2018-07-03 Thread Martin Sebor

Committed to trunk in r86274.  Jakub/Richard, can you please
also review and approve the corresponding fix for the release
branches?

Martin

On 07/03/2018 06:32 PM, Jeff Law wrote:

On 07/03/2018 04:50 PM, Martin Sebor wrote:

In computing the size of expected output for non-constant floating
arguments the sprintf pass doesn't consider the possibility that
the argument value may be not finite (i.e., it can be infinity or
NaN).  Infinities and NaNs are formatted as "inf" or "infinity"
and "nan".  As a result, any floating directive can produce as
few bytes on output as three for a non-finite argument, whereas
the fewest bytes that directives such as %f produce for finite
arguments is 8.

The attached patch adjusts the floating point code to correctly
reflect the lower bound.
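
The difference in output lengths described above can be observed directly with snprintf.  This is only an illustrative sketch; formatted_length is a hypothetical helper, and the three-byte result for infinity assumes a C library such as glibc that prints "inf" (the C standard also permits "infinity").

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: return the number of bytes the floating
   directive FMT produces for X, excluding the terminating NUL.  */
static int
formatted_length (const char *fmt, double x)
{
  assert (fmt != NULL);
  return snprintf (NULL, 0, fmt, x);
}
```

With glibc, formatted_length ("%f", 1.0) is 8 ("1.000000"), while formatted_length ("%f", INFINITY) is only 3 ("inf"), and adding the '+' flag raises the latter to 4.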

Martin

gcc-86274.diff


PR tree-optimization/86274 - SEGFAULT when logging std::to_string(NAN)

gcc/ChangeLog:

PR tree-optimization/86274
* gimple-ssa-sprintf.c (fmtresult::type_max_digits): Verify
precondition.
(format_floating): Correct handling of infinities and NaNs.

gcc/testsuite/ChangeLog:

PR tree-optimization/86274
* gcc.dg/tree-ssa/builtin-sprintf-9.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust.
* gcc.dg/tree-ssa/builtin-sprintf-warn-10.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-15.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf.c: Same.
* gcc.dg/tree-ssa/pr83198.c: Same.

OK
jeff





Re: [PATCH] Remove legacy testcase for -fprofile-generate=./

2018-07-03 Thread Jeff Law
On 07/03/2018 11:24 AM, Martin Liška wrote:
> Hi.
> 
> As the new option mangles the absolute path of a compiled file, it's hard
> to come up with a scan-file pattern. Note that a similar test is done in
> gcc.dg/profile-dir-*.c, which I adjusted a few days ago.
> 
> Ready for trunk?
> Thanks,
> Martin
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-07-03  Martin Liska  
> 
> * gcc.dg/pr47793.c: Remove.
OK.

Note that I'd consider tests which test profiling to fall under the gcov
maintainership as well :-)

jeff


Re: [PATCH] Fix DOS-based system build and fix documentation.

2018-07-03 Thread Jeff Law
On 07/03/2018 11:23 AM, Martin Liška wrote:
> Hi.
> 
> I'm sending a fix for DOS-based systems; it's a compilation error that
> I introduced some time ago. I also add Jonathan's correction of a
> documentation entry.
> 
> Ready for trunk?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-07-03  Martin Liska  
>     Jonathan Wakely  
> 
> * coverage.c: Use correct type.
> * doc/invoke.texi: Language correction.
OK
jeff


Re: [RFC PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-07-03 Thread Jeff Law
On 06/29/2018 09:54 AM, Martin Sebor wrote:
> 
> All the warnings I have seen are because of declarations like
> the one in the test below that checks for the presence of symbol
> sin in the library:
> 
>   char sin ();
>   int main () { return sin (); }
> 
> GCC has warned for this code by default since at least 4.1 so if
> it is, in fact, a problem it has been with us for over a decade.
> 
> There's a comment in the test that explains that the char return
> type is deliberate to prevent GCC from treating the function as
> a built-in.  (That, of course, relies on undefined behavior
> because names of extern library functions are reserved and
> conforming compilers could simply avoid emitting the call or
> replace it with a trap.)
As you noted, by doing this the test can verify if there's a sin()
function in the library or not, regardless of whether or not the
compiler has a builtin for sin().


> 
>> I wonder if stepping forward to a more modern version of autoconf is
>> going to help here and if we should be feeding them updates to make this
>> kind of stuff less pervasive, at least in standard autoconf tests.
> 
> That would make sense to me.  The tests should not rely on
> undefined behavior.  They should declare standard functions with
> the right prototypes.  IMO, for GCC and compatible compilers they
> should disable built-in expansion instead via -fno-builtin.  For
> all other compilers, they could store the address of each function
> in a (perhaps volatile) pointer and use it to make the call instead.
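
The pointer-based approach suggested above might look like the following sketch.  It substitutes strlen for sin so that the example links without the math library; call_strlen_indirectly is a hypothetical name, and this is not what autoconf currently generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Call strlen through a volatile function pointer.  Because the
   pointer is volatile, the compiler must load it and emit a real call
   rather than expanding the built-in, while the function keeps its
   standard prototype from <string.h>.  */
static size_t
call_strlen_indirectly (const char *s)
{
  assert (s != NULL);

  size_t (*volatile fn) (const char *) = strlen;
  return fn (s);
}
```

A link test built this way fails at link time if the symbol is missing from the library, without redeclaring a reserved name with a mismatched type.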
I think the problem is they can't rely on the compiler having the
-fno-builtin flag.  Of course they're relying on other compilers having
the same kind of behavior as GCC WRT which is possibly worse.

I guess there's a reason why they didn't extract the set of symbols from
the library. :-)

> 
> But since the number of warnings here hasn't changed, the ones
> in GCC logs predate my changes.  So updating the tests seems
> like an improvement to consider independently of the patch.
Agreed.  I'm still wary of proceeding given the general concerns about
configure tests.  It's good that GCC's configury bits aren't affected,
but I'm not sure we can generalize a whole lot from that.

jeff


Re: [PATCH] i386; Add indirect_return function attribute

2018-07-03 Thread Jeff Law
On 07/03/2018 10:53 AM, H.J. Lu wrote:
> On Tue, Jul 3, 2018 at 9:12 AM, Uros Bizjak  wrote:
>> On Tue, Jul 3, 2018 at 5:32 PM, H.J. Lu  wrote:
>>> On Fri, Jun 8, 2018 at 3:27 AM, H.J. Lu  wrote:
>>>> On x86, swapcontext may return via indirect branch when shadow stack
>>>> is enabled.  To support code instrumentation of control-flow transfers
>>>> with -fcf-protection, add indirect_return function attribute to inform
>>>> compiler that a function may return via indirect branch.
>>>>
>>>> Note: Unlike setjmp, swapcontext only returns once.  Marking it
>>>> returns_twice would unnecessarily disable compiler optimization.
>>>>
>>>> OK for trunk?
>>>>
>>>> H.J.
>>>>
>>>> gcc/
>>>>
>>>>   PR target/85620
>>>>   * config/i386/i386.c (rest_of_insert_endbranch): Also generate
>>>>   ENDBRANCH for non-tail call which may return via indirect branch.
>>>>   * doc/extend.texi: Document indirect_return attribute.
>>>>
>>>> gcc/testsuite/
>>>>
>>>>   PR target/85620
>>>>   * gcc.target/i386/pr85620-1.c: New test.
>>>>   * gcc.target/i386/pr85620-2.c: Likewise.
>>>>

>>> Here is the updated patch with a testcase to show the impact of
>>> returns_twice attribute.
>>>
>>> Jan, Uros, can you take a look?
>> LGTM for the implementation, can't say if attribute is really needed or not.
> This gives programmers more flexibility.
> 
>> +@item indirect_return
>> +@cindex @code{indirect_return} function attribute, x86
>> +
>> +The @code{indirect_return} attribute on a function is used to inform
>> +the compiler that the function may return via indiret branch.
>>
>> s/indiret/indirect/
> Fixed.  Here is the updated patch.
> 
> Thanks.
> 
> -- H.J.
> 
> 
> 0001-i386-Add-indirect_return-function-attribute.patch
> 
> 
> From bb98f6a31801659ae3c6689d6d31af33a3c28bb2 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" 
> Date: Thu, 7 Jun 2018 20:05:15 -0700
> Subject: [PATCH] i386; Add indirect_return function attribute
> 
> On x86, swapcontext may return via indirect branch when shadow stack
> is enabled.  To support code instrumentation of control-flow transfers
> with -fcf-protection, add indirect_return function attribute to inform
> compiler that a function may return via indirect branch.
> 
> Note: Unlike setjmp, swapcontext only returns once.  Marking it
> returns_twice would unnecessarily disable compiler optimization, as shown
> in the testcase here.
> 
> gcc/
> 
>   PR target/85620
>   * config/i386/i386.c (rest_of_insert_endbranch): Also generate
>   ENDBRANCH for non-tail call which may return via indirect branch.
>   * doc/extend.texi: Document indirect_return attribute.
OK
jeff
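
As a rough illustration of the point in the commit message (swapcontext returns exactly once, unlike setjmp), here is a minimal sketch.  The names MAY_RETURN_INDIRECTLY, my_context_switch and run_once are hypothetical and not part of the patch.  The attribute is guarded with __has_attribute on the assumption that only x86 GCC builds containing this patch accept it; elsewhere the macro expands to nothing.

```c
#include <ucontext.h>

/* Use the attribute only where the compiler reports support for it
   (assumption: x86 GCC with the indirect_return patch applied).  */
#if defined(__has_attribute)
# if __has_attribute(indirect_return)
#  define MAY_RETURN_INDIRECTLY __attribute__((indirect_return))
# endif
#endif
#ifndef MAY_RETURN_INDIRECTLY
# define MAY_RETURN_INDIRECTLY
#endif

/* Hypothetical user-written context switch that, like swapcontext,
   may return via an indirect branch.  Declaration only, never called.  */
MAY_RETURN_INDIRECTLY int my_context_switch(ucontext_t *from,
                                            const ucontext_t *to);

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];
static int steps;

static void coroutine(void)
{
    steps++;   /* runs once on co_stack; returning resumes uc_link */
}

/* Counts how many times control passes the two increments.  */
int run_once(void)
{
    steps = 0;
    if (getcontext(&co_ctx) != 0)
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;       /* resume here when coroutine ends */
    makecontext(&co_ctx, coroutine, 0);

    /* Saves main_ctx and runs coroutine; the call returns a single
       time, after coroutine has finished.  */
    if (swapcontext(&main_ctx, &co_ctx) != 0)
        return -1;
    steps++;
    return steps;
}
```

Since the code after swapcontext executes only once per call, the compiler does not have to treat the call conservatively the way it must for a returns_twice function.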


Re: Invert sense of NO_IMPLICIT_EXTERN_C

2018-07-03 Thread Jeff Law
On 07/03/2018 07:50 AM, Nathan Sidwell wrote:
> could a global reviewer comment?  This touches a lot of target-specific
> config files.  David has kindly checked AIX is ok, the known target
> needing the functionality.
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01568.html
So it's almost certain OpenBSD has bitrotted.

It's also likely extern C functionality is enabled on targets that don't
actually need it.  As you note, failure to have it would be immediate,
obvious and easy to fix.

I think we should go forward.  We can fault in any necessary fixes.

jeff



Re: [PATCH][PR sanitizer/84250] Avoid global symbols collision when using both ASan and UBSan

2018-07-03 Thread Jeff Law
On 05/23/2018 11:15 AM, Maxim Ostapenko wrote:
> Hi,
> 
> 
> as described in the PR, when using both ASan and UBSan 
> (-fsanitize=address,undefined), we have symbol collisions for global 
> functions, like __sanitizer_set_report_path. This leads to inconsistent 
> results when printing reports into files, e.g. for this test case:
> 
> #include 
> int main(int argc, char **argv) {
>    __sanitizer_set_report_path("/tmp/sanitizer.txt");
>    int i = 23;
>    i <<= 32;
>    int *array = new int[100];
>    delete [] array;
>    return array[argc];
> }
> 
> only ASan's report gets written to file; UBSan output goes to stderr.
> 
> To resolve this issue we could use two approaches:
> 
> 1) Use the same approach that is implemented in Clang (UBSan embedded 
> into ASan). The only caveat here is that we need to link (unused) C++ part 
> of UBSan even in C programs when linking static ASan runtime. This 
> happens because GCC, as opposed to Clang, doesn't split C and C++ 
> runtimes for sanitizers.
> 
> 2) Just add SANITIZER_INTERFACE_ATTRIBUTE to report_file global 
> variable. In this case all __sanitizer_set_report_path calls will set 
> the same report_file variable. IMHO this is a hacky way to fix the 
> issue; it's better to use the first option if possible.
> 
> 
> The attached patch fixes the symbols collision by embedding UBSan into 
> ASan (variant 1), just like we do for LSan.
> 
> 
> Regtested/bootstrapped on x86_64-unknown-linux-gnu, looks reasonable 
> enough for trunk?
> 
> 
> -Maxim
> 
> 
> pr84250-2.diff
> 
> 
> gcc/ChangeLog:
> 
> 2018-05-23  Maxim Ostapenko  
> 
>   * config/gnu-user.h (LIBASAN_EARLY_SPEC): Pass -lstdc++ for static
>   libasan.
>   * gcc.c: Do not pass LIBUBSAN_SPEC if ASan is enabled with UBSan.
> 
> libsanitizer/ChangeLog:
> 
> 2018-05-23  Maxim Ostapenko  
> 
>   * Makefile.am: Reorder libs.
>   * Makefile.in: Regenerate.
>   * asan/Makefile.am: Define DCAN_SANITIZE_UB=1, add dependency from
>   libsanitizer_ubsan.la.
>   * asan/Makefile.in: Regenerate.
>   * ubsan/Makefile.am: Define new libsanitizer_ubsan.la library.
>   * ubsan/Makefile.in: Regenerate.
You know this code better than anyone else working on GCC.  My only
concern would be the kernel builds with asan, but I suspect they're
providing their own runtime anyway, so the libstdc++ caveat shouldn't apply.

OK for the trunk.
jeff