date:20170324

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Jakub Jelinek

On Thu, Mar 23, 2017 at 05:24:31PM -0400, Jason Merrill wrote:
> On Thu, Mar 23, 2017 at 4:44 PM, Jakub Jelinek  wrote:
> > The following C testcase shows how profiledbootstrap fails with checking
> > compiler.  We have a (nested) FUNCTION_DECL inside of BLOCK_VARS of an
> > inline function, when it gets inlined, it is moved into
> > BLOCK_NONLOCALIZED_VARS.  And, decls_for_scope calls process_scope_var
> > with NULL decl and non-NULL origin for all BLOCK_NONLOCALIZED_VARS.
> > That is fine for variables, but for FUNCTION_DECLs it can actually
> > try to dwarf2out_abstract_function that FUNCTION_DECL, which should be
> > really done only when it is inlined (i.e. BLOCK_ABSTRACT_ORIGIN of
> > some BLOCK).
> 
> And when it's cloned.
> 
> But does it make sense for gen_decl_die to call
> dwarf2out_abstract_function when decl is null?  That seems wrong.

Before r144529 we had just:
  if (DECL_ORIGIN (decl) != decl)
dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
and that is indeed to handle clones etc.

r144529 changed that to:
  if (origin || DECL_ORIGIN (decl) != decl)
dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
All of the decl is NULL introduced in r144529 implies decl_or_origin
is abstract.

Removing that origin || wouldn't really work, we'd have to rewrite most of
gen_decl_die FUNCTION_DECL handling to use decl_or_origin instead of
decl etc.

But it doesn't look right to me, conceptually a FUNCTION_DECL in
BLOCK_NONLOCALIZED_VARS is nothing that represents a clone or something
abstract in itself.  We can't keep it in BLOCK_VARS just because those
are chained through DECL_CHAIN and a single FUNCTION_DECL can't be
put into multiple BLOCK_VARS chains.  It is still the same FUNCTION_DECL,
not a sign that we want to have in each inline function a separate
function declaration with abstract origin of the original FUNCTION_DECL.

Jakub

Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-24 Thread Richard Biener

On Thu, Mar 23, 2017 at 4:45 PM, Segher Boessenkool
 wrote:
> On Thu, Mar 23, 2017 at 04:16:56PM +0100, Richard Biener wrote:
>> On Thu, Mar 23, 2017 at 3:58 PM, Segher Boessenkool
>>  wrote:
>> > The algorithm fwprop uses never reconsiders a possible propagation,
>> > although it could succeed if the def (in the def-use to propagate)
>> > has changed.  This causes fwprop to do infinite propagations, like
>> > in the scenario in the PR, where we end up with effectively
>> >   B = A
>> >   A = B
>> >   D = A
>> > where only propagations into the last statement are still tried, and
>> > that loops (it becomes D = B, then back to D = A, etc.)
>> >
>> > Fixing this properly isn't easy; this patch instead limits the number
>> > of propagations performed to the number of uses we originally had,
>> > which is the maximum number of propagations that can be done if there
>> > are no such infinite loops.
>> >
>> > Bootstrapped and regression checked on powerpc64-linux {-m64,-m32};
>> > is this okay for trunk?
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01485.html
>>
>> ?
>
> What about it?  That suggestion would make fwprop do *less* useful work,
> while in principle the problem is that it *already* does not enough!

Ok, not knowing a lot about fwprop I take your word for it (but GIMPLE level
forwprop works that way and certainly you can't substitute a def into a use
that is before the def).

> If fwprop actually tried all propagations (and never tried substituting
> X with X, which it currently does), there would be no problem.

To me it looked like fwprop tries to iterate over all propagations but
it iterates over a changing data structure which is why it "oscillates".
But I didn't actually verify that theory (in fact, I just very much disliked
Berns patch give its compile-time cost).

> This patch is a workaround; it makes no difference on pretty much any
> code, except this single testcase (which has undefined behaviour,
> uninitialised variables).

Yeah, your fix looks similar to the other hack I suggested.

Richard.

>
> Segher

Re: [PATCH] Fix PR80158

2017-03-24 Thread Richard Biener

On Thu, Mar 23, 2017 at 7:20 PM, Bill Schmidt
 wrote:
> Hi,
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80158 reports an ICE in
> SLSR while building 416.gamess on x86_64.  This is a latent (but
> previously harmless) bug that was exposed by the fix for PR80054.
> When replacing any statement with a strength reduction, SLSR updates
> the candidate statement S associated with the associated candidate
> record in the candidate table.  However, S may actually be associated
> with two candidate records when an expression may be treated as
> either a CAND_ADD or a CAND_MULT.  In this case, the second one can
> be reached via the next_interp field.

Hmm, the following code in SLSR suggests there may be more than
one such alternate candidate:

  while (arg_cand->kind != CAND_ADD && arg_cand->kind != CAND_PHI)
{
  if (!arg_cand->next_interp)
return;

  arg_cand = lookup_cand (arg_cand->next_interp);
}

> In the excerpt from 416.gamess, the first candidate record was
> updated when its statement was replaced, but the next-interp record
> was not.  Later, that record was used as a basis for another tree
> of strength-reduction candidates, but since it now pointed to an
> orphaned statement without a basic block, the ICE occurred when
> checking dominance against the statement's block.
>
> The fix is simply to ensure that the next_interp record is always
> updated when present.
>
> I've bootstrapped and tested this on powerpc64le-unknown-linux-gnu
> with no regressions.  I tested that the standalone test works on
> x86_64 with this fix, but I have been unable to perform an x86_64
> regstrap because the machine I have access to has insufficient
> disk space. :/  I would appreciate it if you could do this for me.
>
> Assuming no problems with x86_64 regstrap, is this ok for trunk?
>
> Note: I will be unreachable from now until next Tuesday (i.e.,
> returning the 28th).  If needed, please feel free to commit the
> patch on my behalf.  Otherwise I will attend to it when I return.

I'll do a regstrap and gamess build and will commit if that succeeds.

The above can be addressed as followup.

Thanks,
Richard.

> Thanks!
> Bill
>
>
> [gcc]
>
> 2017-03-23  Bill Schmidt  
>
> PR tree-optimization/80158
> * gimple-ssa-strength-reduction.c (replace_mult_candidate): When
> replacing a candidate statement, also replace it for the
> candidate's alternate interpretation.
> (replace_rhs_if_not_dup): Likewise.
> (replace_one_candidate): Likewise.
>
> [gcc/testsuite]
>
> 2017-03-23  Bill Schmidt  
>
> PR tree-optimization/80158
> * gfortran.fortran-torture/compile/pr80158.f: New file.
>
>
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c (revision 246420)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -2089,6 +2089,8 @@ replace_mult_candidate (slsr_cand_t c, tree basis_
>   gimple_set_location (copy_stmt, gimple_location (c->cand_stmt));
>   gsi_replace (&gsi, copy_stmt, false);
>   c->cand_stmt = copy_stmt;
> + if (c->next_interp)
> +   lookup_cand (c->next_interp)->cand_stmt = copy_stmt;
>   if (dump_file && (dump_flags & TDF_DETAILS))
> stmt_to_print = copy_stmt;
> }
> @@ -2118,6 +2120,8 @@ replace_mult_candidate (slsr_cand_t c, tree basis_
>   basis_name, bump_tree);
>   update_stmt (gsi_stmt (gsi));
>c->cand_stmt = gsi_stmt (gsi);
> + if (c->next_interp)
> +   lookup_cand (c->next_interp)->cand_stmt = gsi_stmt (gsi);
>   if (dump_file && (dump_flags & TDF_DETAILS))
> stmt_to_print = gsi_stmt (gsi);
> }
> @@ -3405,6 +3409,8 @@ replace_rhs_if_not_dup (enum tree_code new_code, t
>gimple_assign_set_rhs_with_ops (&gsi, new_code, new_rhs1, new_rhs2);
>update_stmt (gsi_stmt (gsi));
>c->cand_stmt = gsi_stmt (gsi);
> +  if (c->next_interp)
> +   lookup_cand (c->next_interp)->cand_stmt = gsi_stmt (gsi);
>
>if (dump_file && (dump_flags & TDF_DETAILS))
> return gsi_stmt (gsi);
> @@ -3511,6 +3517,8 @@ replace_one_candidate (slsr_cand_t c, unsigned i,
>   gimple_assign_set_rhs_with_ops (&gsi, MINUS_EXPR, basis_name, rhs2);
>   update_stmt (gsi_stmt (gsi));
>c->cand_stmt = gsi_stmt (gsi);
> + if (c->next_interp)
> +   lookup_cand (c->next_interp)->cand_stmt = gsi_stmt (gsi);
>
>   if (dump_file && (dump_flags & TDF_DETAILS))
> stmt_to_print = gsi_stmt (gsi);
> @@ -3532,6 +3540,8 @@ replace_one_candidate (slsr_cand_t c, unsigned i,
>   gimple_set_location (copy_stmt, gimple_location (c->cand_stmt));
>   gsi_replace (&gsi, copy_stmt, false);
>   c->cand_s

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Richard Biener

On Fri, Mar 24, 2017 at 8:46 AM, Jakub Jelinek  wrote:
> On Thu, Mar 23, 2017 at 05:24:31PM -0400, Jason Merrill wrote:
>> On Thu, Mar 23, 2017 at 4:44 PM, Jakub Jelinek  wrote:
>> > The following C testcase shows how profiledbootstrap fails with checking
>> > compiler.  We have a (nested) FUNCTION_DECL inside of BLOCK_VARS of an
>> > inline function, when it gets inlined, it is moved into
>> > BLOCK_NONLOCALIZED_VARS.  And, decls_for_scope calls process_scope_var
>> > with NULL decl and non-NULL origin for all BLOCK_NONLOCALIZED_VARS.
>> > That is fine for variables, but for FUNCTION_DECLs it can actually
>> > try to dwarf2out_abstract_function that FUNCTION_DECL, which should be
>> > really done only when it is inlined (i.e. BLOCK_ABSTRACT_ORIGIN of
>> > some BLOCK).
>>
>> And when it's cloned.
>>
>> But does it make sense for gen_decl_die to call
>> dwarf2out_abstract_function when decl is null?  That seems wrong.
>
> Before r144529 we had just:
>   if (DECL_ORIGIN (decl) != decl)
> dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
> and that is indeed to handle clones etc.
>
> r144529 changed that to:
>   if (origin || DECL_ORIGIN (decl) != decl)
> dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
> All of the decl is NULL introduced in r144529 implies decl_or_origin
> is abstract.
>
> Removing that origin || wouldn't really work, we'd have to rewrite most of
> gen_decl_die FUNCTION_DECL handling to use decl_or_origin instead of
> decl etc.
>
> But it doesn't look right to me, conceptually a FUNCTION_DECL in
> BLOCK_NONLOCALIZED_VARS is nothing that represents a clone or something
> abstract in itself.  We can't keep it in BLOCK_VARS just because those
> are chained through DECL_CHAIN and a single FUNCTION_DECL can't be
> put into multiple BLOCK_VARS chains.  It is still the same FUNCTION_DECL,
> not a sign that we want to have in each inline function a separate
> function declaration with abstract origin of the original FUNCTION_DECL.

Yeah, the thing BLOCK_NONLOCALIZED_VARS wants to do is optimize generated
dwarf by adding a DW_AT_abstract_origin (just to refer to the
subprogram DIE) but
it doesn't actually care about refering to an abstract instance (a
concrete one would
work just fine).

So I think in the context of scope vars calling
dwarf2out_abstract_function is _always_
wrong.  But it would require quite some refactoring to do this in a clean way.

Richard.

> Jakub

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 09:29:00AM +0100, Richard Biener wrote:
> Yeah, the thing BLOCK_NONLOCALIZED_VARS wants to do is optimize generated
> dwarf by adding a DW_AT_abstract_origin (just to refer to the
> subprogram DIE) but

Well, for FUNCTION_DECLs in BLOCK_VARS/BLOCK_NONLOCALIZED_VARS we actually don't
emit any further DIE and so there is no DW_AT_abstract_origin.
E.g. gen_subprogram_die has:
  /* Detect and ignore this case, where we are trying to output
 something we have already output.  */
  if (get_AT (old_die, DW_AT_low_pc)
  || get_AT (old_die, DW_AT_ranges))
return;
That is why the posted testcase doesn't ICE without -fno-toplevel-reorder,
normally the body is emitted earlier and so we don't do anything at all.
Otherwise we just want to make sure we have a DIE and, if it is
inline/clone, have also DW_AT_inline set, and if the DIE is without parent
that we put it into proper place in the DIE tree.  And when we actually
see the body of the function we fill locations and all other details.

Jakub

[PATCH] Fix calls.c for a _complex type (PR ipa/80104).

2017-03-24 Thread Martin Liška

Hello.

Briefly described in the PR, running ICF (without any optimization level) can 
create a thunk call
that does not use an SSA_NAME which is a default def of an argument of the 
caller:

c (complex float b)
{
  complex float arg.1;
  float retval.0;

   [100.00%]:
  arg.1_2 = b;
  retval.0_4 = a (arg.1_2); [tail call]
  return retval.0_4;

}

The "arg" variable creation was introduced by Jason in r207301.
As complex type is passed as invisible reference, we need to find address of 
the argument.
Thus I'm suggesting to find it alternatively via SSA_NAME_DEF_STMT.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. And 
fixes the ICE
on s390x cross compiler.

Ready to be installed?
Martin
>From 6624f7b292d174b5c0c1a276c1ed919c9930dca4 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 23 Mar 2017 16:03:04 +0100
Subject: [PATCH] Fix calls.c for a _complex type (PR ipa/80104).

gcc/ChangeLog:

2017-03-23  Martin Liska  

	PR ipa/80104
	* calls.c (initialize_argument_information): Find real argument
	of a gimple call (of thunk) in order to find address of the
	argument.

gcc/testsuite/ChangeLog:

2017-03-23  Martin Liska  

	* gcc.dg/ipa/pr80104.c: New test.
---
 gcc/calls.c| 11 +--
 gcc/testsuite/gcc.dg/ipa/pr80104.c | 15 +++
 2 files changed, 24 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr80104.c

diff --git a/gcc/calls.c b/gcc/calls.c
index 61caf4ca752..0ee21d9e951 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1745,8 +1745,15 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
 		 address.  */
 	  if (TREE_CODE (args[i].tree_value) == SSA_NAME)
 		{
-		  gcc_assert (SSA_NAME_IS_DEFAULT_DEF (args[i].tree_value));
-		  args[i].tree_value = SSA_NAME_VAR (args[i].tree_value);
+		  if (SSA_NAME_IS_DEFAULT_DEF (args[i].tree_value))
+		args[i].tree_value = SSA_NAME_VAR (args[i].tree_value);
+		  else
+		  {
+		gimple *g = SSA_NAME_DEF_STMT (args[i].tree_value);
+		if (gimple_assign_single_p (g))
+		  args[i].tree_value = gimple_assign_rhs1 (g);
+		  }
+
 		  gcc_assert (TREE_CODE (args[i].tree_value) == PARM_DECL);
 		}
 	  /* Argument setup code may have copied the value to register.  We
diff --git a/gcc/testsuite/gcc.dg/ipa/pr80104.c b/gcc/testsuite/gcc.dg/ipa/pr80104.c
new file mode 100644
index 000..7e75c9907e7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr80104.c
@@ -0,0 +1,15 @@
+/* PR ipa/80104 */
+/* { dg-do compile } */
+/* { dg-options "-fipa-icf" } */
+
+float
+a (_Complex float b)
+{
+  return *&b;
+}
+
+float
+c (_Complex float b)
+{
+  return (&b)[0];
+}
-- 
2.12.0

Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-03-24 Thread Martin Liška

I would like to ping that. I'm not sure what's agreement after I read
discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html

Martin Sebor may know, CC'ing him.

Thanks,
Martin

Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-24 Thread Segher Boessenkool

On Fri, Mar 24, 2017 at 09:13:59AM +0100, Richard Biener wrote:
> >> https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01485.html
> >>
> >> ?
> >
> > What about it?  That suggestion would make fwprop do *less* useful work,
> > while in principle the problem is that it *already* does not enough!
> 
> Ok, not knowing a lot about fwprop I take your word for it (but GIMPLE level
> forwprop works that way and certainly you can't substitute a def into a use
> that is before the def).

When fwprop has done a propagation it makes fresh new uses for all the
sources of the resulting insn, which it adds at the end of the table.
It doesn't reuse the original uses.

> > If fwprop actually tried all propagations (and never tried substituting
> > X with X, which it currently does), there would be no problem.
> 
> To me it looked like fwprop tries to iterate over all propagations but
> it iterates over a changing data structure which is why it "oscillates".
> But I didn't actually verify that theory (in fact, I just very much disliked
> Berns patch give its compile-time cost).

We have, in order, in one bb:

  B = A|Z
  A = B
  D = A

where each of those is the only set of its dest.  Now the first thing
tried is propagating the first into the second.  This fails.  Then,
Z is found to be zero, so we get

  B = A
  A = B
  D = A

If the first was now propagated into the second again, all is fine
(all three vars are set to A).  But this is not tried again.  The
second is propagated into the third, giving

  B = A
  A = B
  D = B

and then the first is propagated into the third, and we repeat
forever.

> > This patch is a workaround; it makes no difference on pretty much any
> > code, except this single testcase (which has undefined behaviour,
> > uninitialised variables).
> 
> Yeah, your fix looks similar to the other hack I suggested.

I have implemented the "retry things" as well fwiw, but a) it is too
big and invasive for stage 4, and b) it kind of sucks, needs more
work, even more invasive.  The workaround is cheap and solves the
immediate problem.

Segher

[PATCH][ARM] PR target/79871: Clean up lane/constant bounds checking errors for translation

2017-03-24 Thread Kyrill Tkachov


Hi all,

This PR complains that the bounds checking error strings contain a string placeholder for 
"constant" or "lane"
which makes it hard for translators (who may want to move words around in a 
different language for syntactical reasons).

This patch cleans that up. The bunching up of common functionality between 
neon_lane_bounds and arm_const_bounds was a bit
dubious in any case as arm_const_bounds never passed down a non-NULL tree 
anyway, so one of the paths of bonds_check was
always used in only the neon_lane_bounds case anyway.

I also add some function comments and use IN_RANGE for range checking.
Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk? (I believe such translation improvements fall under the 
documentation rule at this stage).

Thanks,
Kyrill

2017-03-24  Kyrylo Tkachov  

PR target/79871
* config/arm/arm.c (bounds_check): Delete.
(neon_lane_bounds): Adjust.  Make sure error string template
doesn't use string placeholder.
(arm_const_bounds): Likewise.
commit 102b86a782297c725c4796c4dd36d33fdf024ee7
Author: Kyrylo Tkachov 
Date:   Thu Mar 23 10:50:32 2017 +

PR target/79871: Clean up lane/constant bounds checking errors for translation

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7b2d3d5..98059da 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12186,13 +12186,16 @@ neon_expand_vector_init (rtx target, rtx vals)
   emit_move_insn (target, mem);
 }
 
-/* Ensure OPERAND lies between LOW (inclusive) and HIGH (exclusive).  Raise
-   ERR if it doesn't.  EXP indicates the source location, which includes the
-   inlining history for intrinsics.  */
+/* Bounds-check lanes.
+   Ensure OPERAND lies between LOW (inclusive) and HIGH (exclusive).  Emit an
+   an error if it doesn't.  EXP indicates the source location, which includes
+   the inlining history for intrinsics.
+   We don't unify this and arm_const_bounds because the error string needs to
+   explicity contain "constant" or "lane" for translation purposes.  */
 
-static void
-bounds_check (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
-	  const_tree exp, const char *desc)
+void
+neon_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
+		  const_tree exp)
 {
   HOST_WIDE_INT lane;
 
@@ -12200,31 +12203,34 @@ bounds_check (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
 
   lane = INTVAL (operand);
 
-  if (lane < low || lane >= high)
+  if (!IN_RANGE (lane, low, high - 1))
 {
   if (exp)
-	error ("%K%s %wd out of range %wd - %wd",
-	   exp, desc, lane, low, high - 1);
+	error ("%Klane %wd out of range %wd - %wd",
+		exp, lane, low, high - 1);
   else
-	error ("%s %wd out of range %wd - %wd", desc, lane, low, high - 1);
+	error ("lane %wd out of range %wd - %wd",
+		lane, low, high - 1);
 }
 }
 
-/* Bounds-check lanes.  */
+/* Bounds-check constants.
+   Ensure OPERAND lies between LOW (inclusive) and HIGH (exclusive).  Emit an
+   an error if it doesn't.  See neon_lane_bounds comment for
+   translation comment.  */
 
 void
-neon_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
-		  const_tree exp)
+arm_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
 {
-  bounds_check (operand, low, high, exp, "lane");
-}
+  HOST_WIDE_INT constant;
 
-/* Bounds-check constants.  */
+  gcc_assert (CONST_INT_P (operand));
 
-void
-arm_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
-{
-  bounds_check (operand, low, high, NULL_TREE, "constant");
+  constant = INTVAL (operand);
+
+  if (!IN_RANGE (constant, low, high - 1))
+error ("constant %wd out of range %wd - %wd",
+	constant, low, high - 1);
 }
 
 HOST_WIDE_INT

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Richard Biener

On Fri, Mar 24, 2017 at 9:43 AM, Jakub Jelinek  wrote:
> On Fri, Mar 24, 2017 at 09:29:00AM +0100, Richard Biener wrote:
>> Yeah, the thing BLOCK_NONLOCALIZED_VARS wants to do is optimize generated
>> dwarf by adding a DW_AT_abstract_origin (just to refer to the
>> subprogram DIE) but
>
> Well, for FUNCTION_DECLs in BLOCK_VARS/BLOCK_NONLOCALIZED_VARS we actually 
> don't
> emit any further DIE and so there is no DW_AT_abstract_origin.
> E.g. gen_subprogram_die has:
>   /* Detect and ignore this case, where we are trying to output
>  something we have already output.  */
>   if (get_AT (old_die, DW_AT_low_pc)
>   || get_AT (old_die, DW_AT_ranges))
> return;

Hmm, but we do want to put the function in scope?  THus

void foo () {}
void bar ()
{
  int foo;
  {
 void foo();
 foo();
  }
}

should have a DIE for foo in bar (possibly refering to the concrete instance
for optimization).

Richard.

> That is why the posted testcase doesn't ICE without -fno-toplevel-reorder,
> normally the body is emitted earlier and so we don't do anything at all.
> Otherwise we just want to make sure we have a DIE and, if it is
> inline/clone, have also DW_AT_inline set, and if the DIE is without parent
> that we put it into proper place in the DIE tree.  And when we actually
> see the body of the function we fill locations and all other details.
>
> Jakub

Re: [PATCH] Fix calls.c for a _complex type (PR ipa/80104).

2017-03-24 Thread Richard Biener

On Fri, Mar 24, 2017 at 10:25 AM, Martin Liška  wrote:
> Hello.
>
> Briefly described in the PR, running ICF (without any optimization level) can 
> create a thunk call
> that does not use an SSA_NAME which is a default def of an argument of the 
> caller:
>
> c (complex float b)
> {
>   complex float arg.1;
>   float retval.0;
>
>[100.00%]:
>   arg.1_2 = b;
>   retval.0_4 = a (arg.1_2); [tail call]
>
>   return retval.0_4;
>
> }
>
> The "arg" variable creation was introduced by Jason in r207301.
> As complex type is passed as invisible reference, we need to find address of 
> the argument.
> Thus I'm suggesting to find it alternatively via SSA_NAME_DEF_STMT.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. 
> And fixes the ICE
> on s390x cross compiler.
>
> Ready to be installed?

Ick, that looks like fragile code ...  looks like the
call_from_thunk_p case expects
the default def or the param-decl.  So why not create it that way?  Or
simply not do
this "optimization"?

Richard.

> Martin

Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-24 Thread Richard Biener

On Fri, Mar 24, 2017 at 10:39 AM, Segher Boessenkool
 wrote:
> On Fri, Mar 24, 2017 at 09:13:59AM +0100, Richard Biener wrote:
>> >> https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01485.html
>> >>
>> >> ?
>> >
>> > What about it?  That suggestion would make fwprop do *less* useful work,
>> > while in principle the problem is that it *already* does not enough!
>>
>> Ok, not knowing a lot about fwprop I take your word for it (but GIMPLE level
>> forwprop works that way and certainly you can't substitute a def into a use
>> that is before the def).
>
> When fwprop has done a propagation it makes fresh new uses for all the
> sources of the resulting insn, which it adds at the end of the table.
> It doesn't reuse the original uses.

Yes, that's what I suspected.

>> > If fwprop actually tried all propagations (and never tried substituting
>> > X with X, which it currently does), there would be no problem.
>>
>> To me it looked like fwprop tries to iterate over all propagations but
>> it iterates over a changing data structure which is why it "oscillates".
>> But I didn't actually verify that theory (in fact, I just very much disliked
>> Berns patch give its compile-time cost).
>
> We have, in order, in one bb:
>
>   B = A|Z
>   A = B
>   D = A
>
> where each of those is the only set of its dest.  Now the first thing
> tried is propagating the first into the second.  This fails.  Then,
> Z is found to be zero, so we get
>
>   B = A
>   A = B
>   D = A
>
> If the first was now propagated into the second again, all is fine
> (all three vars are set to A).  But this is not tried again.  The
> second is propagated into the third, giving
>
>   B = A
>   A = B
>   D = B
>
> and then the first is propagated into the third, and we repeat
> forever.
>
>> > This patch is a workaround; it makes no difference on pretty much any
>> > code, except this single testcase (which has undefined behaviour,
>> > uninitialised variables).
>>
>> Yeah, your fix looks similar to the other hack I suggested.
>
> I have implemented the "retry things" as well fwiw, but a) it is too
> big and invasive for stage 4, and b) it kind of sucks, needs more
> work, even more invasive.  The workaround is cheap and solves the
> immediate problem.

Agreed, still iterating over the DF uses in the first place looks like
the bug (given this "all uses" data structure changes during propagation!).

I'd have done

  for (BBs in RPO order)
for (insn in BB)
repeat:
  for (use in insn)
 if (propagate_into (use))
   goto repeat;

Richard.

>
> Segher

[PATCH] Fix PR80167

2017-03-24 Thread Richard Biener


Seems we have simply forgotten about the case of replacing with a 
default-def.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-03-24  Richard Biener  

PR tree-optimization/80167
* graphite-isl-ast-to-gimple.c
(translate_isl_ast_to_gimple::is_valid_rename): Handle default-defs
properly.
(translate_isl_ast_to_gimple::get_rename): Likewise.

* gcc.dg/graphite/pr80167.c: New testcase.

Index: gcc/graphite-isl-ast-to-gimple.c
===
*** gcc/graphite-isl-ast-to-gimple.c(revision 246437)
--- gcc/graphite-isl-ast-to-gimple.c(working copy)
*** bool translate_isl_ast_to_gimple::
*** 1123,1128 
--- 1123,1131 
  is_valid_rename (tree rename, basic_block def_bb, basic_block use_bb,
 phi_node_kind phi_kind, tree old_name, basic_block old_bb) 
const
  {
+   if (SSA_NAME_IS_DEFAULT_DEF (rename))
+ return true;
+ 
/* The def of the rename must either dominate the uses or come from a
   back-edge.  Also the def must respect the loop closed ssa form.  */
if (!is_loop_closed_ssa_use (use_bb, rename))
*** get_rename (basic_block new_bb, tree old
*** 1178,1183 
--- 1181,1187 
  basic_block bb = gimple_bb (SSA_NAME_DEF_STMT (rename));
  if (is_valid_rename (rename, bb, new_bb, phi_kind, old_name, old_bb)
  && (phi_kind == close_phi
+ || ! bb
  || flow_bb_inside_loop_p (bb->loop_father, new_bb)))
return rename;
  return NULL_TREE;
Index: gcc/testsuite/gcc.dg/graphite/pr80167.c
===
*** gcc/testsuite/gcc.dg/graphite/pr80167.c (nonexistent)
--- gcc/testsuite/gcc.dg/graphite/pr80167.c (working copy)
***
*** 0 
--- 1,24 
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -floop-nest-optimize" } */
+ 
+ typedef struct
+ {
+   short a;
+   short b;
+   short c;
+ } d;
+ extern d e[];
+ int f[8];
+ void
+ g (d *i)
+ {
+   int h = 0;
+   for (; h < 28; h++)
+ e[h].a = e[h].b = i[h].a;
+   h = 0;
+   for (; h < 8; h++)
+ f[h] = i[h].b + i[h].c;
+   h = 0;
+   for (; h < 8; h++)
+ f[h] = i[h].b;
+ }

Re: [PATCH] Fix doloop ICE (PR rtl-optimization/80112)

2017-03-24 Thread Jeff Law


On 03/21/2017 09:00 AM, Jakub Jelinek wrote:

Hi!

doloop_condition_get computes cmp in several places, and in one of them
wants to fail if the condition inside of it isn't NE against const0_rtx.
The problem with that is that nothing checked what cmp is yet,
  /* Check for (set (pc) (if_then_else (condition)
   (label_ref (label))
   (pc))).  */
  if (GET_CODE (cmp) != SET
  || SET_DEST (cmp) != pc_rtx
  || GET_CODE (SET_SRC (cmp)) != IF_THEN_ELSE
  || GET_CODE (XEXP (SET_SRC (cmp), 1)) != LABEL_REF
  || XEXP (SET_SRC (cmp), 2) != pc_rtx)
return 0;
is checked only a couple of lines later.

To fix this, either we can use the following patch which guards the
early check with the necessary subset of the later check, or we could
for
  if (GET_CODE (cmp) != SET || GET_CODE (SET_SRC (cmp)) != IF_THEN_ELSE)
return 0;
before cond = XEXP (SET_SRC (cmp), 0);,
or set some bool flag and only verify this requirement if the bool flag
is true after the later check.

Any preferences?  The following has been successfully bootstrapped/regtested
on powerpc64le-linux.

2017-03-21  Jakub Jelinek  

PR rtl-optimization/80112
* loop-doloop.c (doloop_condition_get): Don't check condition
if cmp isn't SET with IF_THEN_ELSE src.

* gcc.dg/pr80112.c: New test.
This code is a bit of a mess.  I could argue for at least 3 approaches 
to fix the bug, none of which I particularly like.


Yours isn't any worse than the other approaches, so let's just go with it.

OK.
jeff

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jeff Law


On 03/21/2017 02:21 PM, Jakub Jelinek wrote:

On Tue, Mar 21, 2017 at 06:53:34PM +0100, Jakub Jelinek wrote:

On Tue, Mar 21, 2017 at 08:41:43AM +0100, Jakub Jelinek wrote:

On Tue, Mar 21, 2017 at 08:38:20AM +1000, Richard Henderson wrote:

On 03/21/2017 07:15 AM, Jakub Jelinek wrote:

Not really sure what we should do if both i1 and i2 are frame related, shall
we check for each of the CFA reg notes if they are available and equal?
Or punt if either of the insns is frame related?


I would punt if either is frame related.


Ok, I'll test then the following patch and gather some statistic on how
often we trigger this.


The statistics I've gathered unfortunately shows that at least on
powerpc64le-linux it is very important to not give up if both i1 and i2
are frame related and have rtx_equal_p notes.
I've set on unpatched old_insns_match_p flags when returning non-dir_none
and checked those flags in the various callers of these when about to
successfully perform cross-jumping, head-merging etc.
With /f vs. non-/f, the only 3 hits were on the new pr80102.C testcase
during powerpc64le-linux bootstrap/regtest, but /f vs. /f there were
167601 hits.


Here is updated patch which allows cross-jumping of RTX_FRAME_RELATED_P
insns, as long as they have the same CFA related notes on them (same order
if more than one).

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

2017-03-21  Jakub Jelinek  

PR target/80102
* cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
and non-/f instructions.  If both i1 and i2 are frame related,
verify all CFA notes, their order and content.

* g++.dg/opt/pr80102.C: New test.
Presumably this didn't ICE at some point in the past, so it's a 
regression?  (it's not marked as such in the BZ).





--- gcc/cfgcleanup.c.jj 2017-03-21 07:56:55.711233924 +0100
+++ gcc/cfgcleanup.c2017-03-21 19:20:40.517746664 +0100
@@ -1149,6 +1149,11 @@ old_insns_match_p (int mode ATTRIBUTE_UN
   else if (p1 || p2)
 return dir_none;

+  /* Do not allow cross-jumping between frame related insns and other
+ insns.  */
+  if (RTX_FRAME_RELATED_P (i1) != RTX_FRAME_RELATED_P (i2))
+return dir_none;
+
   p1 = PATTERN (i1);
   p2 = PATTERN (i2);

@@ -1207,6 +1212,58 @@ old_insns_match_p (int mode ATTRIBUTE_UN
}
 }

+  /* If both i1 and i2 are frame related, verify all the CFA notes
+ in the same order and with the same content.  */
+  if (RTX_FRAME_RELATED_P (i1))
+{
+  static enum reg_note cfa_note_kinds[] = {
+   REG_FRAME_RELATED_EXPR, REG_CFA_DEF_CFA, REG_CFA_ADJUST_CFA,
+   REG_CFA_OFFSET, REG_CFA_REGISTER, REG_CFA_EXPRESSION,
+   REG_CFA_VAL_EXPRESSION, REG_CFA_RESTORE, REG_CFA_SET_VDRAP,
+   REG_CFA_TOGGLE_RA_MANGLE, REG_CFA_WINDOW_SAVE, REG_CFA_FLUSH_QUEUE
+  };
ISTM this could get out of date very easily.  Is there a clean way to 
generate the array of cfa notes as we build up the notes from 
reg-notes.def?



Jeff

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-24 Thread Rainer Orth

Hi Tom,

> On 23/03/17 18:25, Mike Stump wrote:
>> On Mar 23, 2017, at 8:46 AM, Tom de Vries  wrote:
>>>
>>> I've run the gcc testsuite for target nvptx-none and ran into "test for
>>> excess errors" FAILs due to:
>>> ...
>>> sorry, unimplemented: target cannot support alloca.
>>
>> We'd encourage ports to support alloca.  :-)
>>
>>> OK for trunk for stage1?
>>
>> Ok.  Ok for release branches and trunk as well, if you want.  I'd
>> recommend trunk, if your port is meant to work and test out nicely in gcc
>> 7.
>>
>
> Committed to trunk.

seems you didn't properly test this patchset.  It caused

+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++11  (test for warnings,
 line 24)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++11  (test for warnings,
 line 34)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++11 (test for excess err
ors)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++14  (test for warnings,
 line 24)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++14  (test for warnings,
 line 34)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++14 (test for excess err
ors)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++98  (test for warnings,
 line 24)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++98  (test for warnings,
 line 34)
+FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++98 (test for excess err
ors)

everywhere.  Adding that dg-require-effective-target line requires
adjusting dg-warning etc. line numbers.

Fixed as follows, installed on mainline after checking with the
appropriate runtest invocations (for c and c++) on x86_64-pc-linux-gnu.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-03-24  Rainer Orth  

* c-c++-common/Wimplicit-fallthrough-7.c: Adjust dg-warning line
numbers.

# HG changeset patch
# Parent  288df338f490e514591c89d11734d7de56f62460
Adjust c-c++-common/Wimplicit-fallthrough-7.c line numbers

diff --git a/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c b/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c
--- a/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c
+++ b/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c
@@ -21,8 +21,8 @@ f (int i)
   switch (i)
 {
 case 1:
-  { /* { dg-warning "statement may fall through" "" { target c } 23 } */
-	int a[i]; /* { dg-warning "statement may fall through" "" { target c++ } 24 } */
+  { /* { dg-warning "statement may fall through" "" { target c } 24 } */
+	int a[i]; /* { dg-warning "statement may fall through" "" { target c++ } 25 } */
   }
 case 2:
   bar (99);
@@ -31,8 +31,8 @@ f (int i)
   switch (i)
 {
 case 1:
-  for (int j = 0; j < 10; j++) /* { dg-warning "statement may fall through" "" { target c } 33 } */
-	map[j] = j; /* { dg-warning "statement may fall through" "" { target c++ } 34 } */
+  for (int j = 0; j < 10; j++) /* { dg-warning "statement may fall through" "" { target c } 34 } */
+	map[j] = j; /* { dg-warning "statement may fall through" "" { target c++ } 35 } */
 case 2:
   bar (99);
 }

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-24 Thread Richard Biener

On Fri, Mar 24, 2017 at 1:38 PM, Rainer Orth
 wrote:
> Hi Tom,
>
>> On 23/03/17 18:25, Mike Stump wrote:
>>> On Mar 23, 2017, at 8:46 AM, Tom de Vries  wrote:

 I've run the gcc testsuite for target nvptx-none and ran into "test for
 excess errors" FAILs due to:
 ...
 sorry, unimplemented: target cannot support alloca.
>>>
>>> We'd encourage ports to support alloca.  :-)
>>>
 OK for trunk for stage1?
>>>
>>> Ok.  Ok for release branches and trunk as well, if you want.  I'd
>>> recommend trunk, if your port is meant to work and test out nicely in gcc
>>> 7.
>>>
>>
>> Committed to trunk.
>
> seems you didn't properly test this patchset.  It caused
>
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++11  (test for 
> warnings,
>  line 24)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++11  (test for 
> warnings,
>  line 34)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++11 (test for excess 
> err
> ors)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++14  (test for 
> warnings,
>  line 24)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++14  (test for 
> warnings,
>  line 34)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++14 (test for excess 
> err
> ors)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++98  (test for 
> warnings,
>  line 24)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++98  (test for 
> warnings,
>  line 34)
> +FAIL: c-c++-common/Wimplicit-fallthrough-7.c  -std=gnu++98 (test for excess 
> err
> ors)
>
> everywhere.  Adding that dg-require-effective-target line requires
> adjusting dg-warning etc. line numbers.
>
> Fixed as follows, installed on mainline after checking with the
> appropriate runtest invocations (for c and c++) on x86_64-pc-linux-gnu.

Similar

-m64
FAIL: gcc.dg/Walloca-2.c note (test for warnings, line 38)
FAIL: gcc.dg/Wvla-larger-than-2.c note (test for warnings, line 25)

-m32
FAIL: gcc.dg/Walloca-1.c  (test for warnings, line 26)
FAIL: gcc.dg/Walloca-1.c (test for excess errors)
FAIL: gcc.dg/Walloca-2.c  (test for warnings, line 38)
FAIL: gcc.dg/Walloca-2.c (test for excess errors)
FAIL: gcc.dg/Wvla-larger-than-2.c note (test for warnings, line 25)


> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>
> 2017-03-24  Rainer Orth  
>
> * c-c++-common/Wimplicit-fallthrough-7.c: Adjust dg-warning line
> numbers.
>

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-24 Thread Rainer Orth

Hi Richard,

> Similar
>
> -m64
> FAIL: gcc.dg/Walloca-2.c note (test for warnings, line 38)
> FAIL: gcc.dg/Wvla-larger-than-2.c note (test for warnings, line 25)
>
> -m32
> FAIL: gcc.dg/Walloca-1.c  (test for warnings, line 26)
> FAIL: gcc.dg/Walloca-1.c (test for excess errors)
> FAIL: gcc.dg/Walloca-2.c  (test for warnings, line 38)
> FAIL: gcc.dg/Walloca-2.c (test for excess errors)
> FAIL: gcc.dg/Wvla-larger-than-2.c note (test for warnings, line 25)

right, just noticed that myself.  Fix in progress...

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 12:45:28PM +0100, Richard Biener wrote:
> On Fri, Mar 24, 2017 at 9:43 AM, Jakub Jelinek  wrote:
> > On Fri, Mar 24, 2017 at 09:29:00AM +0100, Richard Biener wrote:
> >> Yeah, the thing BLOCK_NONLOCALIZED_VARS wants to do is optimize generated
> >> dwarf by adding a DW_AT_abstract_origin (just to refer to the
> >> subprogram DIE) but
> >
> > Well, for FUNCTION_DECLs in BLOCK_VARS/BLOCK_NONLOCALIZED_VARS we actually 
> > don't
> > emit any further DIE and so there is no DW_AT_abstract_origin.
> > E.g. gen_subprogram_die has:
> >   /* Detect and ignore this case, where we are trying to output
> >  something we have already output.  */
> >   if (get_AT (old_die, DW_AT_low_pc)
> >   || get_AT (old_die, DW_AT_ranges))
> > return;
> 
> Hmm, but we do want to put the function in scope?  THus
> 
> void foo () {}
> void bar ()
> {
>   int foo;
>   {
>  void foo();
>  foo();
>   }
> }
> 
> should have a DIE for foo in bar (possibly refering to the concrete instance
> for optimization).

We actually do that.  If I change your testcase so that it actually triggers
the changed part of the code (so there is inlining etc.), so:
volatile int v;
__attribute__((noinline)) void foo () { v++; }
static inline void bar ()
{
  int foo;
  {
 void foo();
 foo();
  }
}
void
baz (void)
{
  bar ();
  bar ();
}
then at -gdwarf-3 -dA -O2 the difference between vanilla GCC and my patched GCC 
is
following (used -gdwarf-3 so that there are no DW_FORM_flag_present that
result in DIE offset changes):
.uleb128 0xd# (DIE (0x117) DW_TAG_subprogram)
.ascii "bar\0"  # DW_AT_name
.byte   0x1 # DW_AT_decl_file (prQQ.c)
.byte   0x3 # DW_AT_decl_line
.byte   0x3 # DW_AT_inline
.long   0x13c   # DW_AT_sibling
.uleb128 0xe# (DIE (0x123) DW_TAG_variable)
.ascii "foo\0"  # DW_AT_name
.byte   0x1 # DW_AT_decl_file (prQQ.c)
.byte   0x5 # DW_AT_decl_line
.long   0x41# DW_AT_type
.uleb128 0xf# (DIE (0x12e) DW_TAG_lexical_block)
.uleb128 0x10   # (DIE (0x12f) DW_TAG_subprogram)
.byte   0x1 # DW_AT_external
.ascii "foo\0"  # DW_AT_name
.byte   0x1 # DW_AT_decl_file (prQQ.c)
.byte   0x7 # DW_AT_decl_line
-   .byte   0   # DW_AT_inline
+   .byte   0x1 # DW_AT_declaration
.uleb128 0x11   # (DIE (0x138) DW_TAG_unspecified_parameters)
.byte   0   # end of children of DIE 0x12f
.byte   0   # end of children of DIE 0x12e
.byte   0   # end of children of DIE 0x117
.uleb128 0x12   # (DIE (0x13c) DW_TAG_subprogram)
.byte   0x1 # DW_AT_external
.ascii "foo\0"  # DW_AT_name
.byte   0x1 # DW_AT_decl_file (prQQ.c)
.byte   0x2 # DW_AT_decl_line
.quad   .LFB0   # DW_AT_low_pc
.quad   .LFE0   # DW_AT_high_pc
.byte   0x1 # DW_AT_frame_base
.byte   0x9c# DW_OP_call_frame_cfa
.byte   0x1 # DW_AT_GNU_all_call_sites
.byte   0   # end of children of DIE 0xb
and corresponding change in .debug_abbrev.
That seems to be the right change to me, inside of bar (the DW_AT_inline
one) we have a declaration of foo, and we properly use DW_AT_declaration for
it, then at the toplevel we actually have the full definition die for foo,
and then we have in two spots inside baz
.uleb128 0x6# (DIE (0x6d) DW_TAG_inlined_subroutine)
.long   0x117   # DW_AT_abstract_origin
without the need to duplicate the foo declaration in there,
that is inherited through DW_AT_abstract_origin.

Jakub

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-24 Thread Rainer Orth

Hi Richard,

>> Similar
>>
>> -m64
>> FAIL: gcc.dg/Walloca-2.c note (test for warnings, line 38)
>> FAIL: gcc.dg/Wvla-larger-than-2.c note (test for warnings, line 25)
>>
>> -m32
>> FAIL: gcc.dg/Walloca-1.c  (test for warnings, line 26)
>> FAIL: gcc.dg/Walloca-1.c (test for excess errors)
>> FAIL: gcc.dg/Walloca-2.c  (test for warnings, line 38)
>> FAIL: gcc.dg/Walloca-2.c (test for excess errors)
>> FAIL: gcc.dg/Wvla-larger-than-2.c note (test for warnings, line 25)
>
> right, just noticed that myself.  Fix in progress...

here's what I committed after similar testing.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-03-24  Rainer Orth  

* c-c++-common/Wimplicit-fallthrough-7.c: Adjust dg-warning
etc. line numbers.
* gcc.dg/Walloca-1.c: Likewise.
* gcc.dg/Walloca-2.c: Likewise.
* gcc.dg/Wvla-larger-than-2.c: Likewise.

# HG changeset patch
# Parent  288df338f490e514591c89d11734d7de56f62460
Adjust c-c++-common/Wimplicit-fallthrough-7.c line numbers

diff --git a/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c b/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c
--- a/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c
+++ b/gcc/testsuite/c-c++-common/Wimplicit-fallthrough-7.c
@@ -21,8 +21,8 @@ f (int i)
   switch (i)
 {
 case 1:
-  { /* { dg-warning "statement may fall through" "" { target c } 23 } */
-	int a[i]; /* { dg-warning "statement may fall through" "" { target c++ } 24 } */
+  { /* { dg-warning "statement may fall through" "" { target c } 24 } */
+	int a[i]; /* { dg-warning "statement may fall through" "" { target c++ } 25 } */
   }
 case 2:
   bar (99);
@@ -31,8 +31,8 @@ f (int i)
   switch (i)
 {
 case 1:
-  for (int j = 0; j < 10; j++) /* { dg-warning "statement may fall through" "" { target c } 33 } */
-	map[j] = j; /* { dg-warning "statement may fall through" "" { target c++ } 34 } */
+  for (int j = 0; j < 10; j++) /* { dg-warning "statement may fall through" "" { target c } 34 } */
+	map[j] = j; /* { dg-warning "statement may fall through" "" { target c++ } 35 } */
 case 2:
   bar (99);
 }
diff --git a/gcc/testsuite/gcc.dg/Walloca-1.c b/gcc/testsuite/gcc.dg/Walloca-1.c
--- a/gcc/testsuite/gcc.dg/Walloca-1.c
+++ b/gcc/testsuite/gcc.dg/Walloca-1.c
@@ -25,7 +25,7 @@ void foo1 (size_t len, size_t len2, size
   useit (s);			// OK, constant argument to alloca
 
   s = alloca (num);		// { dg-warning "large due to conversion" "" { target lp64 } }
-  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } 26 }
+  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } 27 }
   useit (s);
 
   s = alloca (3);		/* { dg-warning "is too large" } */
diff --git a/gcc/testsuite/gcc.dg/Walloca-2.c b/gcc/testsuite/gcc.dg/Walloca-2.c
--- a/gcc/testsuite/gcc.dg/Walloca-2.c
+++ b/gcc/testsuite/gcc.dg/Walloca-2.c
@@ -37,8 +37,8 @@ g3 (int n)
   if (n > 0 && n < 3000)
 {
   p = __builtin_alloca (n); // { dg-warning "'alloca' may be too large" "" { target lp64} }
-  // { dg-message "note:.*argument may be as large as 2999" "note" { target lp64 } 38 }
-  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } 38 }
+  // { dg-message "note:.*argument may be as large as 2999" "note" { target lp64 } 39 }
+  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } 39 }
 }
   else
 p = __builtin_malloc (n);
diff --git a/gcc/testsuite/gcc.dg/Wvla-larger-than-2.c b/gcc/testsuite/gcc.dg/Wvla-larger-than-2.c
--- a/gcc/testsuite/gcc.dg/Wvla-larger-than-2.c
+++ b/gcc/testsuite/gcc.dg/Wvla-larger-than-2.c
@@ -24,7 +24,7 @@ f2 (__SIZE_TYPE__ a)
 {
   // 11 * 4 bytes = 44: Not OK.
   uint32_t x[a]; // { dg-warning "array may be too large" }
-  // { dg-message "note:.*argument may be as large as 44" "note" { target *-*-* } 25 }
+  // { dg-message "note:.*argument may be as large as 44" "note" { target *-*-* } 26 }
   f0 (x);
 }
 }

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 06:37:10AM -0600, Jeff Law wrote:
> > 2017-03-21  Jakub Jelinek  
> > 
> > PR target/80102
> > * cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
> > and non-/f instructions.  If both i1 and i2 are frame related,
> > verify all CFA notes, their order and content.
> > 
> > * g++.dg/opt/pr80102.C: New test.
> Presumably this didn't ICE at some point in the past, so it's a regression?
> (it's not marked as such in the BZ).

It doesn't ICE for me with r238210 and ICEs with current trunk, I don't have
too many ppc64le compilers around though.

> > +  /* If both i1 and i2 are frame related, verify all the CFA notes
> > + in the same order and with the same content.  */
> > +  if (RTX_FRAME_RELATED_P (i1))
> > +{
> > +  static enum reg_note cfa_note_kinds[] = {
> > +   REG_FRAME_RELATED_EXPR, REG_CFA_DEF_CFA, REG_CFA_ADJUST_CFA,
> > +   REG_CFA_OFFSET, REG_CFA_REGISTER, REG_CFA_EXPRESSION,
> > +   REG_CFA_VAL_EXPRESSION, REG_CFA_RESTORE, REG_CFA_SET_VDRAP,
> > +   REG_CFA_TOGGLE_RA_MANGLE, REG_CFA_WINDOW_SAVE, REG_CFA_FLUSH_QUEUE
> > +  };
> ISTM this could get out of date very easily.  Is there a clean way to
> generate the array of cfa notes as we build up the notes from reg-notes.def?

We could e.g.
#ifndef REG_CFA_NOTE
# define REG_CFA_NOTE(NAME) REG_NOTE(NAME)
#endif
and then
REG_CFA_NOTE (FRAME_RELATED_EXPR)
etc. in reg-notes.def (and document that REG_CFA_NOTE should be used for
notes related to CFA).
Then in cfgcleanups.c we could just
#undef REG_CFA_NOTE
#define DEF_REG_NOTE(NAME)
#define REG_CFA_NOTE(NAME) REG_##NAME,
#include "reg-notes.def"
#undef DEF_REG_NOTE
#undef REG_CFA_NOTE
to populate the cfa_note_kinds array.

Jakub

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Jason Merrill

On Fri, Mar 24, 2017 at 3:46 AM, Jakub Jelinek  wrote:
> On Thu, Mar 23, 2017 at 05:24:31PM -0400, Jason Merrill wrote:
>> On Thu, Mar 23, 2017 at 4:44 PM, Jakub Jelinek  wrote:
>> > The following C testcase shows how profiledbootstrap fails with checking
>> > compiler.  We have a (nested) FUNCTION_DECL inside of BLOCK_VARS of an
>> > inline function, when it gets inlined, it is moved into
>> > BLOCK_NONLOCALIZED_VARS.  And, decls_for_scope calls process_scope_var
>> > with NULL decl and non-NULL origin for all BLOCK_NONLOCALIZED_VARS.
>> > That is fine for variables, but for FUNCTION_DECLs it can actually
>> > try to dwarf2out_abstract_function that FUNCTION_DECL, which should be
>> > really done only when it is inlined (i.e. BLOCK_ABSTRACT_ORIGIN of
>> > some BLOCK).
>>
>> And when it's cloned.
>>
>> But does it make sense for gen_decl_die to call
>> dwarf2out_abstract_function when decl is null?  That seems wrong.
>
> Before r144529 we had just:
>   if (DECL_ORIGIN (decl) != decl)
> dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
> and that is indeed to handle clones etc.
>
> r144529 changed that to:
>   if (origin || DECL_ORIGIN (decl) != decl)
> dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
> All of the decl is NULL introduced in r144529 implies decl_or_origin
> is abstract.
>
> Removing that origin || wouldn't really work, we'd have to rewrite most of
> gen_decl_die FUNCTION_DECL handling to use decl_or_origin instead of
> decl etc.

I was thinking to change it to

  if (decl && (origin || DECL_ORIGIN (decl) != decl))

Jason

Re: [PATCH] Fix profiledbootstrap ada checking failure (PR debug/79255)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 09:07:54AM -0400, Jason Merrill wrote:
> >> And when it's cloned.
> >>
> >> But does it make sense for gen_decl_die to call
> >> dwarf2out_abstract_function when decl is null?  That seems wrong.
> >
> > Before r144529 we had just:
> >   if (DECL_ORIGIN (decl) != decl)
> > dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
> > and that is indeed to handle clones etc.
> >
> > r144529 changed that to:
> >   if (origin || DECL_ORIGIN (decl) != decl)
> > dwarf2out_abstract_function (DECL_ABSTRACT_ORIGIN (decl));
> > All of the decl is NULL introduced in r144529 implies decl_or_origin
> > is abstract.
> >
> > Removing that origin || wouldn't really work, we'd have to rewrite most of
> > gen_decl_die FUNCTION_DECL handling to use decl_or_origin instead of
> > decl etc.
> 
> I was thinking to change it to
> 
>   if (decl && (origin || DECL_ORIGIN (decl) != decl))

But then you segfault immediately on the next:
  else if (cgraph_function_possibly_inlined_p (decl)
   && ! DECL_ABSTRACT_P (decl)
because decl is NULL.  So, it would be far easier to do:
  if (decl == NULL_TREE)
{
  decl = origin;
  origin = NULL_TREE;
}
before the
  if (origin || DECL_ORIGIN (decl) != decl)
with a comment, rather than to rewrite a couple of dozen decl_or_origin,
and change assumptions that non-NULL origin actually means anything for
FUNCTION_DECL.  That is IMO still uglier than the decls_for_scope change
though, but if you prefer that, I can test that.

Jakub

Re: [PATCH] Fix calls.c for a _complex type (PR ipa/80104).

2017-03-24 Thread Martin Liška

On 03/24/2017 12:49 PM, Richard Biener wrote:
> On Fri, Mar 24, 2017 at 10:25 AM, Martin Liška  wrote:
>> Hello.
>>
>> Briefly described in the PR, running ICF (without any optimization level) 
>> can create a thunk call
>> that does not use an SSA_NAME which is a default def of an argument of the 
>> caller:
>>
>> c (complex float b)
>> {
>>   complex float arg.1;
>>   float retval.0;
>>
>>[100.00%]:
>>   arg.1_2 = b;
>>   retval.0_4 = a (arg.1_2); [tail call]
>>
>>   return retval.0_4;
>>
>> }
>>
>> The "arg" variable creation was introduced by Jason in r207301.
>> As complex type is passed as invisible reference, we need to find address of 
>> the argument.
>> Thus I'm suggesting to find it alternatively via SSA_NAME_DEF_STMT.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. 
>> And fixes the ICE
>> on s390x cross compiler.
>>
>> Ready to be installed?
> 
> Ick, that looks like fragile code ...  looks like the
> call_from_thunk_p case expects
> the default def or the param-decl.  So why not create it that way?  Or

Not creating the "arg" and directly passing the PARM_DECL will break verifier:

/home/marxin/Programming/testcases/pr80104.c:10:1: error: invalid argument to 
gimple call
 }
 ^
b
# .MEM_2 = VDEF <.MEM_1(D)>
retval.0_3 = a (b); [tail call]
/home/marxin/Programming/testcases/pr80104.c:10:1: internal compiler error: 
verify_gimple failed

> simply not do
> this "optimization"?

You mean understanding when an argument of a wrapper (thunk) will need 
invisible reference and
if so, then ICF should not do the optimization?

Martin

> 
> Richard.
> 
>> Martin

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 02:04:42PM +0100, Jakub Jelinek wrote:
> On Fri, Mar 24, 2017 at 06:37:10AM -0600, Jeff Law wrote:
> > > 2017-03-21  Jakub Jelinek  
> > > 
> > >   PR target/80102
> > >   * cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
> > >   and non-/f instructions.  If both i1 and i2 are frame related,
> > >   verify all CFA notes, their order and content.
> > > 
> > >   * g++.dg/opt/pr80102.C: New test.
> > Presumably this didn't ICE at some point in the past, so it's a regression?
> > (it's not marked as such in the BZ).
> 
> It doesn't ICE for me with r238210 and ICEs with current trunk, I don't have
> too many ppc64le compilers around though.

GCC 4.8 doesn't ICE either.

Jakub

Re: [RFC] VEC_SELECT sanity checking in genrecog

2017-03-24 Thread Jakub Jelinek

On Mon, Mar 06, 2017 at 12:53:38PM +0100, Richard Biener wrote:
> I think these are all bugs and should be fixed and thus this checking
> is good.
> 
> Of course we'd better not break (too many) targets at this point...

I've tested it today and it passed on all targets I've tried make s-recog
on, i.e. i386, arm, aarch64, alpha, avr, cris, hppa, ia64, m32c, m68k,
mips, nvptx, rs6000, sparc, sh, s390x.
Ok for trunk then?

> > 2017-03-03  Jakub Jelinek  
> > 
> > * genrecog.c (validate_pattern): Add VEC_SELECT validation.
> > * genmodes.c (emit_min_insn_modes_c): Call emit_mode_nunits
> > and emit_mode_inner.

Jakub

[Committed] S/390: PR79893: Add diagnostics vec_load_bndry builtin.

2017-03-24 Thread Andreas Krebbel

The boundary argument of the vec_load_bndry builtin needs to be
rewritten.  At that point it must be constant already.  The current
diagnostics in s390_expand_builtins is too late for this.  The patch
adds an additional check for that builtin which will be triggered
already during preprocessing.

Regression tested on s390x.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

PR target/79893
* gcc.target/s390/zvector/pr79893.c: New test.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

PR target/79893
* config/s390/s390-c.c (s390_adjust_builtin_arglist): Issue an
error if the boundary argument is not constant.
---
 gcc/ChangeLog   |  6 ++
 gcc/config/s390/s390-c.c| 12 ++--
 gcc/testsuite/ChangeLog |  5 +
 gcc/testsuite/gcc.target/s390/zvector/pr79893.c |  9 +
 4 files changed, 30 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/pr79893.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8d4adfb..9cb2271 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-03-24  Andreas Krebbel  
+
+   PR target/79893
+   * config/s390/s390-c.c (s390_adjust_builtin_arglist): Issue an
+   error if the boundary argument is not constant.
+
 2017-03-24  Bill Schmidt  
 
PR tree-optimization/80158
diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 8cdac7e..0521e1e 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -673,10 +673,18 @@ s390_adjust_builtin_arglist (unsigned int ob_fcode, tree 
decl,
case S390_OVERLOADED_BUILTIN_s390_vec_load_bndry:
  {
int code;
-
if (dest_arg_index == 1)
  {
-   switch (tree_to_uhwi ((**arglist)[src_arg_index]))
+   tree arg = (**arglist)[src_arg_index];
+
+   if (TREE_CODE (arg) != INTEGER_CST)
+ {
+   error ("constant value required for builtin %qF argument 
%d",
+  decl, src_arg_index + 1);
+   return;
+ }
+
+   switch (tree_to_uhwi (arg))
  {
  case 64: code = 0; break;
  case 128: code = 1; break;
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 759d2b3..f3908d9 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2017-03-24  Andreas Krebbel  
+
+   PR target/79893
+   * gcc.target/s390/zvector/pr79893.c: New test.
+
 2017-03-24  Bill Schmidt  
 
PR tree-optimization/80158
diff --git a/gcc/testsuite/gcc.target/s390/zvector/pr79893.c 
b/gcc/testsuite/gcc.target/s390/zvector/pr79893.c
new file mode 100644
index 000..ad6ca30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/pr79893.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-march=z13 -mzarch -mzvector" } */
+
+#include 
+
+void
+foo(signed char *p, int i) {
+  vec_load_bndry(p, i); /* { dg-error "constant value required for builtin.*2" 
} */
+}
-- 
2.9.1

[Committed] S/390: PR79904: Disallow reg + sym_ref literal pool addresses.

2017-03-24 Thread Andreas Krebbel

We accept reg + sym_ref as valid address if sym_ref is a literal pool
reference knowing that it will be rewritten as r13 + reg + offset.
However, annotate_constant_pool_refs was never able to handle that.

With the patch only single sym_refs are accepted.

Regression tested on s390x.

2017-03-24  Andreas Krebbel  

PR target/79904
* config/s390/s390.c (s390_decompose_address): Reject reg +
sym_ref literal pool references.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.dg/ubsan/pr79904-2.c: New test.
---
 gcc/config/s390/s390.c | 11 ---
 gcc/testsuite/ChangeLog|  4 
 gcc/testsuite/gcc.dg/ubsan/pr79904-2.c | 11 +++
 3 files changed, 19 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pr79904-2.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index e7ab128..27640ad 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2842,13 +2842,10 @@ s390_decompose_address (rtx addr, struct s390_address 
*out)
  displacements by basing them off the base register.  */
   if (disp && GET_CODE (disp) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (disp))
 {
-  /* Either base or index must be free to hold the base register.  */
-  if (!base)
-base = fake_pool_base, literal_pool = true;
-  else if (!indx)
-indx = fake_pool_base, literal_pool = true;
-  else
-return false;
+  if (base || indx)
+   return false;
+
+  base = fake_pool_base, literal_pool = true;
 
   /* Mark up the displacement.  */
   disp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, disp),
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f3908d9..29318bb 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  
 
+   * gcc.dg/ubsan/pr79904-2.c: New test.
+
+2017-03-24  Andreas Krebbel  
+
PR target/79893
* gcc.target/s390/zvector/pr79893.c: New test.
 
diff --git a/gcc/testsuite/gcc.dg/ubsan/pr79904-2.c 
b/gcc/testsuite/gcc.dg/ubsan/pr79904-2.c
new file mode 100644
index 000..c99c43d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/pr79904-2.c
@@ -0,0 +1,11 @@
+/* PR sanitizer/79904 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=signed-integer-overflow" } */
+
+typedef signed char V __attribute__((vector_size (8))); 
+
+void
+foo (V *a) 
+{ 
+  *a = *a * (V) { 3, 4, 5, 6, 7, 8, 9, 10 }; 
+}
-- 
2.9.1

Re: [RFC] VEC_SELECT sanity checking in genrecog

2017-03-24 Thread Richard Biener

On Fri, 24 Mar 2017, Jakub Jelinek wrote:

> On Mon, Mar 06, 2017 at 12:53:38PM +0100, Richard Biener wrote:
> > I think these are all bugs and should be fixed and thus this checking
> > is good.
> > 
> > Of course we'd better not break (too many) targets at this point...
> 
> I've tested it today and it passed on all targets I've tried make s-recog
> on, i.e. i386, arm, aarch64, alpha, avr, cris, hppa, ia64, m32c, m68k,
> mips, nvptx, rs6000, sparc, sh, s390x.
> Ok for trunk then?

Yes.

Richard.

[Committed] S/390: Add arch12 support

2017-03-24 Thread Andreas Krebbel

This patch series adds support for a new architecture level of S/390.

The most important feature of the new instruction set is the support
of single and extended precision floating point vector operations.

Binutils support is part of the 2.28 release:
https://sourceware.org/ml/binutils/2017-02/msg00301.html

Note: arch12 is NOT the official name of the new CPU.  It just
continues the series of archXX options supported as alternate names.
The archXX terminology refers to the edition number of the Principle
of Operations manual.  The official CPU name will be added later while
keeping support of the arch12 for backwards compatibility.

Andreas Krebbel (16):
  S/390: Rename cpu facility vec to vx.
  S/390: Improve support of 128 bit vectors in GPRs
  S/390: vec_init improvements
  S/390: movsf/sd pattern fixes.
  S/390: movdf improvements
  S/390: Move and rename vector check.
  S/390: Use wfc for scalar vector compares
  S/390: Rearrange fixuns_trunc pattern definitions.
  S/390: arch12: Add arch12 option.
  S/390: arch12: Add support for new vector bit operations.
  S/390: arch12: New vector popcount variants
  S/390: arch12: Add vllezlf instruction.
  S/390: arch12: Add indirect branch pattern
  S/390: arch12: Support the mul/add/subtract instructions.
  S/390: arch12: Support new vector floating point modes.
  S/390: arch12: New builtins.

 gcc/ChangeLog  |  223 ++
 gcc/common/config/s390/s390-common.c   |5 +-
 gcc/config.gcc |2 +-
 gcc/config/s390/2964.md|8 +-
 gcc/config/s390/constraints.md |   10 +-
 gcc/config/s390/driver-native.c|3 +
 gcc/config/s390/s390-builtin-types.def |  129 +-
 gcc/config/s390/s390-builtins.def  | 3504 +++-
 gcc/config/s390/s390-builtins.h|2 +
 gcc/config/s390/s390-c.c   |   41 +-
 gcc/config/s390/s390-opts.h|1 +
 gcc/config/s390/s390.c |  206 +-
 gcc/config/s390/s390.h |   25 +-
 gcc/config/s390/s390.md|  663 ++--
 gcc/config/s390/s390.opt   |3 +
 gcc/config/s390/vecintrin.h|  125 +-
 gcc/config/s390/vector.md  |  522 ++-
 gcc/config/s390/vx-builtins.md |  547 +--
 gcc/testsuite/ChangeLog|   62 +
 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c |   23 +
 gcc/testsuite/gcc.target/s390/arch12/mul-1.c   |   30 +
 gcc/testsuite/gcc.target/s390/arch12/mul-2.c   |   16 +
 gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c |2 +-
 gcc/testsuite/gcc.target/s390/s390.exp |   22 +-
 .../gcc.target/s390/target-attribute/tattr-3.c |3 +-
 .../gcc.target/s390/target-attribute/tattr-4.c |6 +-
 .../s390/target-attribute/tpragma-struct-vx-1.c|2 +-
 .../s390/target-attribute/tpragma-struct-vx-2.c|2 +-
 gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c|2 +-
 .../gcc.target/s390/vector/vec-abi-vararg-1.c  |2 +-
 .../gcc.target/s390/vector/vec-clobber-1.c |2 +-
 .../gcc.target/s390/vector/vec-genbytemask-1.c |2 +-
 .../gcc.target/s390/vector/vec-genmask-1.c |2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c  |   48 +
 .../gcc.target/s390/vector/vec-nopeel-1.c  |2 +-
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c  |   31 +-
 gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c |2 +-
 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c   |   52 +
 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c   |   49 +
 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c |   88 +
 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c  |   30 +
 gcc/testsuite/lib/target-supports.exp  |   35 +
 42 files changed, 4083 insertions(+), 2451 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c

-- 
2.9.1

[PATCH 02/16] S/390: Improve support of 128 bit vectors in GPRs

2017-03-24 Thread Andreas Krebbel

This patch improves the handling of 128 bit vectors residing in GPRs
by adding more alternatives to the move pattern.

Regression tested on s390x.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/constraints.md: Add comments.
(jKK): Reject element sizes > 8 bytes.
* config/s390/s390.c (s390_split_ok_p): Enable splitting also for
s_operands.
* config/s390/s390.md: Add the s_operand checks formerly in
s390_split_ok_p to various splitters where they are still
required.
* config/s390/vector.md ("mov" V_128): Add GPR alternatives
for 128 bit vectors.  Plus two splitters.
---
 gcc/ChangeLog  | 12 ++
 gcc/config/s390/constraints.md | 10 +++--
 gcc/config/s390/s390.c |  4 
 gcc/config/s390/s390.md| 16 +
 gcc/config/s390/vector.md  | 51 ++
 5 files changed, 83 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a83d1b..292e946 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,17 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/constraints.md: Add comments.
+   (jKK): Reject element sizes > 8 bytes.
+   * config/s390/s390.c (s390_split_ok_p): Enable splitting also for
+   s_operands.
+   * config/s390/s390.md: Add the s_operand checks formerly in
+   s390_split_ok_p to various splitters where they are still
+   required.
+   * config/s390/vector.md ("mov" V_128): Add GPR alternatives
+   for 128 bit vectors.  Plus two splitters.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.md: Rename the cpu facilty vec to vx throughout
the file.
 
diff --git a/gcc/config/s390/constraints.md b/gcc/config/s390/constraints.md
index 536f485..95c6a8f 100644
--- a/gcc/config/s390/constraints.md
+++ b/gcc/config/s390/constraints.md
@@ -410,20 +410,26 @@
   "All one bit scalar or vector constant"
   (match_test "op == CONSTM1_RTX (GET_MODE (op))"))
 
+; vector generate mask operand - support for up to 64 bit elements
 (define_constraint "jxx"
   "@internal"
   (and (match_code "const_vector")
(match_test "s390_contiguous_bitmask_vector_p (op, NULL, NULL)")))
 
+; vector generate byte mask operand - this is only supposed to deal
+; with real vectors 128 bit values of being either 0 or -1 are handled
+; with j00 and jm1
 (define_constraint "jyy"
   "@internal"
   (and (match_code "const_vector")
(match_test "s390_bytemask_vector_p (op, NULL)")))
 
+; vector replicate immediate operand - support for up to 64 bit elements
 (define_constraint "jKK"
   "@internal"
-  (and (and (match_code "const_vector")
-   (match_test "const_vec_duplicate_p (op)"))
+  (and (and (and (match_code "const_vector")
+(match_test "const_vec_duplicate_p (op)"))
+   (match_test "GET_MODE_UNIT_SIZE (GET_MODE (op)) <= 8"))
(match_test "satisfies_constraint_K (XVECEXP (op, 0, 0))")))
 
 (define_constraint "jm6"
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 27640ad..f3cebd6 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2494,10 +2494,6 @@ s390_split_ok_p (rtx dst, rtx src, machine_mode mode, 
int first_subword)
   if (FP_REG_P (src) || FP_REG_P (dst) || VECTOR_REG_P (src) || VECTOR_REG_P 
(dst))
 return false;
 
-  /* We don't need to split if operands are directly accessible.  */
-  if (s_operand (src, mode) || s_operand (dst, mode))
-return false;
-
   /* Non-offsettable memory references cannot be split.  */
   if ((GET_CODE (src) == MEM && !offsettable_memref_p (src))
   || (GET_CODE (dst) == MEM && !offsettable_memref_p (dst)))
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 660b5f9..555a779 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1490,6 +1490,8 @@
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
 (match_operand:TI 1 "general_operand" ""))]
   "TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], TImode)
+   && !s_operand (operands[1], TImode)
&& s390_split_ok_p (operands[0], operands[1], TImode, 0)"
   [(set (match_dup 2) (match_dup 4))
(set (match_dup 3) (match_dup 5))]
@@ -1504,6 +1506,8 @@
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
 (match_operand:TI 1 "general_operand" ""))]
   "TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], TImode)
+   && !s_operand (operands[1], TImode)
&& s390_split_ok_p (operands[0], operands[1], TImode, 1)"
   [(set (match_dup 2) (match_dup 4))
(set (match_dup 3) (match_dup 5))]
@@ -1824,6 +1828,8 @@
   [(set (match_operand:DI 0 "nonimmediate_operand" "")
 (match_operand:DI 1 "general_operand" ""))]
   "!TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], DImode)
+   && !s_operand (operands[1], DImode)
&& s390_split_ok_p (operands[0], operands[1], DImode, 0)"
   [(set (match_dup 2) (match_

[PATCH 01/16] S/390: Rename cpu facility vec to vx.

2017-03-24 Thread Andreas Krebbel

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.md: Rename the cpu facilty vec to vx throughout
the file.
---
 gcc/ChangeLog   |  5 +
 gcc/config/s390/s390.md | 46 +++---
 2 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9cb2271..7a83d1b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.md: Rename the cpu facilty vec to vx throughout
+   the file.
+
+2017-03-24  Andreas Krebbel  
+
PR target/79893
* config/s390/s390-c.c (s390_adjust_builtin_arglist): Issue an
error if the boundary argument is not constant.
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 19daf31..660b5f9 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -482,7 +482,7 @@
   (const (symbol_ref "s390_tune_attr")))
 
 (define_attr "cpu_facility"
-  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vec,z13"
+  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vx,z13"
   (const_string "standard"))
 
 (define_attr "enabled" ""
@@ -525,7 +525,7 @@
   (match_test "TARGET_ZEC12"))
 (const_int 1)
 
- (and (eq_attr "cpu_facility" "vec")
+ (and (eq_attr "cpu_facility" "vx")
   (match_test "TARGET_VX"))
 (const_int 1)
 
@@ -1484,7 +1484,7 @@
#"
   [(set_attr "op_type" "RSY,RSY,VRR,VRI,VRI,VRR,*,VRX,VRX,*,*")
(set_attr "type" "lm,stm,*,*,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" "*,*,vec,vec,vec,vec,vec,vec,vec,*,*")])
+   (set_attr "cpu_facility" "*,*,vx,vx,vx,vx,vx,vx,vx,*,*")])
 
 (define_split
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
@@ -1720,7 +1720,7 @@
  *,*,*,*,*,*,*")
(set_attr "cpu_facility" "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,longdisp,
  z10,*,*,*,*,*,longdisp,*,longdisp,
- z10,z10,*,*,*,*,vec,vec,vec,vec,vec,vec")
+ z10,z10,*,*,*,*,vx,vx,vx,vx,vx,vx")
(set_attr "z10prop" "z10_fwd_A1,
 z10_fwd_E1,
 z10_fwd_E1,
@@ -2001,7 +2001,7 @@
  *,
  *,*,*,*,*,*,*")
(set_attr "cpu_facility" "*,*,*,extimm,longdisp,z10,*,*,longdisp,*,longdisp,
- 
vec,*,vec,*,longdisp,*,longdisp,*,*,*,z10,z10,*,vec,vec,vec,vec,vec,vec")
+ 
vx,*,vx,*,longdisp,*,longdisp,*,*,*,z10,z10,*,vx,vx,vx,vx,vx,vx")
(set_attr "z10prop" "z10_fwd_A1,
 z10_fwd_E1,
 z10_fwd_E1,
@@ -2049,7 +2049,7 @@
(set_attr "type" 
"*,lr,load,store,floadsf,floadsf,floadsf,floadsf,fstoresf,*,*,*,*")
(set_attr "z10prop" 
"z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_rec,*,*,*,*,*,z10_super_E1,
 z10_super,*,*")
-   (set_attr "cpu_facility" "*,*,*,*,vec,*,vec,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "*,*,*,*,vx,*,vx,*,*,*,*,*,*")
 ])
 
 (define_peephole2
@@ -2183,7 +2183,7 @@
vsteh\t%v1,%0,0"
   [(set_attr "op_type"  
"RR,RI,RX,RXY,RIL,RX,RXY,RIL,SIL,VRI,VRR,VRS,VRS,VRX,VRX")
(set_attr "type" "lr,*,*,*,larl,store,store,store,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" 
"*,*,*,longdisp,z10,*,longdisp,z10,z10,vec,vec,vec,vec,vec,vec")
+   (set_attr "cpu_facility" 
"*,*,*,longdisp,z10,*,longdisp,z10,z10,vx,vx,vx,vx,vx,vx")
(set_attr "z10prop" "z10_fr_E1,
z10_fwd_A1,
z10_super_E1,
@@ -2248,7 +2248,7 @@
vsteb\t%v1,%0,0"
   [(set_attr "op_type" "RR,RI,RX,RXY,RX,RXY,SI,SIY,SS,VRI,VRR,VRS,VRS,VRX,VRX")
(set_attr "type" "lr,*,*,*,store,store,store,store,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" 
"*,*,*,longdisp,*,longdisp,*,longdisp,*,vec,vec,vec,vec,vec,vec")
+   (set_attr "cpu_facility" 
"*,*,*,longdisp,*,longdisp,*,longdisp,*,vx,vx,vx,vx,vx,vx")
(set_attr "z10prop" "z10_fr_E1,
 z10_fwd_A1,
 z10_super_E1,
@@ -2476,7 +2476,7 @@
(set_attr "type" "fsimpdf,floaddf,floaddf,floaddf,floaddf,floaddf,
  
fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,load,store")
(set_attr "z10prop" 
"*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*")
-   (set_attr "cpu_facility" 
"z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vec,vec,vec,vec,vec")])
+   (set_attr "cpu_facility" 
"z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx,vx,vx")])
 
 (define_insn "*mov_64"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand" 
"=f,f,f,f,R,T,d,d,d,d,b,T,v,v,R")
@@ -2502,7 +2502,7 @@
(set_attr "type""fsimpdf,fload,fload,fload,
 
fstore,fstore,*,lr,load,load,store,store,*,load,store")
(set_attr "z10prop" 
"*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*")

[PATCH 05/16] S/390: movdf improvements

2017-03-24 Thread Andreas Krebbel

This patch add the vector load element from immediate instruction to the
movdf/dd pattern for loading a FP zero and it removes the vector
instructions from the mov_64 pattern. These were pointless in
there because z13 support implies DFP support so these instructions will
always be matched in the mov_64dfp pattern instead.

Regression tested on s390x

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.md ("mov_64dfp" DD_DF): Use vleig for loading a
FP zero.
("*mov_64" DD_DF): Remove the vector instructions. These
will anyway by matched by mov_64dfp.
---
 gcc/ChangeLog   |  7 +++
 gcc/config/s390/s390.md | 30 ++
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d29c06b..4dd2be6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.md ("mov_64dfp" DD_DF): Use vleig for loading a
+   FP zero.
+   ("*mov_64" DD_DF): Remove the vector instructions. These
+   will anyway by matched by mov_64dfp.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.c (s390_expand_vec_init): Enable vector load
pair for all vector types with 64 bit elements.
* config/s390/vx-builtins.md (V_HW_64): Move mode iterator to ...
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 75b15df..554fb37 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2460,9 +2460,9 @@
 
 (define_insn "*mov_64dfp"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
-  "=f,f,f,d,f,f,R,T,d,d,d,d,b,T,v,v,d,v,R")
+  "=f,f,f,d,f,f,R,T,d,d,d,d,b,T,v,v,v,d,v,R")
 (match_operand:DD_DF 1 "general_operand"
-  " G,f,d,f,R,T,f,f,G,d,b,T,d,d,v,d,v,R,v"))]
+  " G,f,d,f,R,T,f,f,G,d,b,T,d,d,v,G,d,v,R,v"))]
   "TARGET_DFP"
   "@
lzdr\t%0
@@ -2480,19 +2480,20 @@
stgrl\t%1,%0
stg\t%1,%0
vlr\t%v0,%v1
+   vleig\t%v0,0,0
vlvgg\t%v0,%1,0
vlgvg\t%0,%v1,0
vleg\t%0,%1,0
vsteg\t%1,%0,0"
-  [(set_attr "op_type" 
"RRE,RR,RRE,RRE,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRS,VRS,VRX,VRX")
+  [(set_attr "op_type" 
"RRE,RR,RRE,RRE,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRI,VRS,VRS,VRX,VRX")
(set_attr "type" "fsimpdf,floaddf,floaddf,floaddf,floaddf,floaddf,
- 
fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,load,store")
-   (set_attr "z10prop" 
"*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*")
-   (set_attr "cpu_facility" 
"z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx,vx,vx")])
+ 
fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,*,load,store")
+   (set_attr "z10prop" 
"*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*,*")
+   (set_attr "cpu_facility" 
"z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx,vx,vx,vx")])
 
 (define_insn "*mov_64"
-  [(set (match_operand:DD_DF 0 "nonimmediate_operand" 
"=f,f,f,f,R,T,d,d,d,d,b,T,v,v,R")
-(match_operand:DD_DF 1 "general_operand"  " 
G,f,R,T,f,f,G,d,b,T,d,d,v,R,v"))]
+  [(set (match_operand:DD_DF 0 "nonimmediate_operand" 
"=f,f,f,f,R,T,d,d,d,d,b,T")
+(match_operand:DD_DF 1 "general_operand"  " 
G,f,R,T,f,f,G,d,b,T,d,d"))]
   "TARGET_ZARCH"
   "@
lzdr\t%0
@@ -2506,15 +2507,12 @@
lgrl\t%0,%1
lg\t%0,%1
stgrl\t%1,%0
-   stg\t%1,%0
-   vlr\t%v0,%v1
-   vleg\t%v0,%1,0
-   vsteg\t%v1,%0,0"
-  [(set_attr "op_type" 
"RRE,RR,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRX,VRX")
+   stg\t%1,%0"
+  [(set_attr "op_type" "RRE,RR,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY")
(set_attr "type""fsimpdf,fload,fload,fload,
-
fstore,fstore,*,lr,load,load,store,store,*,load,store")
-   (set_attr "z10prop" 
"*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*")
-   (set_attr "cpu_facility" 
"z196,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx")])
+fstore,fstore,*,lr,load,load,store,store")
+   (set_attr "z10prop" 
"*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec")
+   (set_attr "cpu_facility" "z196,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*")])
 
 (define_insn "*mov_31"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
-- 
2.9.1

[PATCH 04/16] S/390: movsf/sd pattern fixes.

2017-03-24 Thread Andreas Krebbel

The SD/SFmode move pattern used a wrong mnemonic for vector load
element.
On the vector load element instruction was an operand missing.

Regression tested on s390x.

2017-03-24  Andreas Krebbel  

* config/s390/s390.md ("mov" SD_SF): Change vleg/vsteg to
vlef/vstef.  Add missing operand to vleif.
---
 gcc/config/s390/s390.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 555a779..75b15df 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2613,11 +2613,11 @@
st\t%1,%0
sty\t%1,%0
vlr\t%v0,%v1
-   vleif\t%v0,0
+   vleif\t%v0,0,0
vlvgf\t%v0,%1,0
vlgvf\t%0,%v1,0
-   vleg\t%0,%1,0
-   vsteg\t%1,%0,0"
+   vlef\t%0,%1,0
+   vstef\t%1,%0,0"
   [(set_attr "op_type" 
"RRE,RR,RR,RXE,RX,RXY,RX,RXY,RI,RR,RIL,RX,RXY,RIL,RX,RXY,VRR,VRI,VRS,VRS,VRX,VRX")
(set_attr "type"
"fsimpsf,fsimpsf,fload,fload,fload,fload,
 
fstore,fstore,*,lr,load,load,load,store,store,store,*,*,*,*,load,store")
-- 
2.9.1

[PATCH 06/16] S/390: Move and rename vector check.

2017-03-24 Thread Andreas Krebbel

Move the target support routine for the vector facility to the common
code file.  This is required to enable the generic vectorization tests
on S/390.  While doing this the too generic name for the check (vector)
is changed to s390_vx.  The renaming required to modify all the
testcases currently using that check.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/s390.exp (check_effective_target_vector):
Include target-supports.exp and move target_vector check routine
...
* lib/target-supports.exp (check_effective_target_s390_vx): ... to
here and rename it.
* gcc.target/s390/htm-builtins-z13-1.c: Rename effective target
check from vector to s390_vx.
* gcc.target/s390/target-attribute/tpragma-struct-vx-1.c: Likewise.
* gcc.target/s390/target-attribute/tpragma-struct-vx-2.c: Likewise.
* gcc.target/s390/vector/stpcpy-1.c: Likewise.
* gcc.target/s390/vector/vec-abi-vararg-1.c: Likewise.
* gcc.target/s390/vector/vec-clobber-1.c: Likewise.
* gcc.target/s390/vector/vec-genbytemask-1.c: Likewise.
* gcc.target/s390/vector/vec-genmask-1.c: Likewise.
* gcc.target/s390/vector/vec-nopeel-1.c: Likewise.
* gcc.target/s390/vector/vec-vrepi-1.c: Likewise.
---
 gcc/testsuite/ChangeLog   | 19 +++
 gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c|  2 +-
 gcc/testsuite/gcc.target/s390/s390.exp| 16 +---
 .../s390/target-attribute/tpragma-struct-vx-1.c   |  2 +-
 .../s390/target-attribute/tpragma-struct-vx-2.c   |  2 +-
 gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c   |  2 +-
 .../gcc.target/s390/vector/vec-abi-vararg-1.c |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c  |  2 +-
 .../gcc.target/s390/vector/vec-genbytemask-1.c|  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c  |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c   |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c|  2 +-
 gcc/testsuite/lib/target-supports.exp | 18 ++
 13 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 727bb45..0f0877c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,24 @@
 2017-03-24  Andreas Krebbel  
 
+   * gcc.target/s390/s390.exp (check_effective_target_vector):
+   Include target-supports.exp and move target_vector check routine
+   ...
+   * lib/target-supports.exp (check_effective_target_s390_vx): ... to
+   here and rename it.
+   * gcc.target/s390/htm-builtins-z13-1.c: Rename effective target
+   check from vector to s390_vx.
+   * gcc.target/s390/target-attribute/tpragma-struct-vx-1.c: Likewise.
+   * gcc.target/s390/target-attribute/tpragma-struct-vx-2.c: Likewise.
+   * gcc.target/s390/vector/stpcpy-1.c: Likewise.
+   * gcc.target/s390/vector/vec-abi-vararg-1.c: Likewise.
+   * gcc.target/s390/vector/vec-clobber-1.c: Likewise.
+   * gcc.target/s390/vector/vec-genbytemask-1.c: Likewise.
+   * gcc.target/s390/vector/vec-genmask-1.c: Likewise.
+   * gcc.target/s390/vector/vec-nopeel-1.c: Likewise.
+   * gcc.target/s390/vector/vec-vrepi-1.c: Likewise.
+
+2017-03-24  Andreas Krebbel  
+
* gcc.target/s390/vector/vec-init-2.c: New test.
 
 2017-03-24  Andreas Krebbel  
diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c 
b/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c
index 7879c36..aaca1f4 100644
--- a/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c
+++ b/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c
@@ -1,7 +1,7 @@
 /* Verify if VRs are saved and restored.  */
 
 /* { dg-do run } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-O3 -march=z13 -mzarch" } */
 
 typedef int __attribute__((vector_size(16))) v4si;
diff --git a/gcc/testsuite/gcc.target/s390/s390.exp 
b/gcc/testsuite/gcc.target/s390/s390.exp
index cab68e8..d7a61f4 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -26,6 +26,7 @@ if ![istarget s390*-*-*] then {
 
 # Load support procs.
 load_lib gcc-dg.exp
+load_lib target-supports.exp
 
 # Return 1 if the the assembler understands .machine and .machinemode.  The
 # target attribute needs that feature to work.
@@ -55,21 +56,6 @@ proc check_effective_target_htm { } {
 }] "-march=zEC12 -mzarch" ] } { return 0 } else { return 1 }
 }
 
-# Return 1 if vector (va - vector add) instructions are understood by
-# the assembler and can be executed.  This also covers checking for
-# the VX kernel feature.  A kernel without that feature does not
-# enable the vector facility and the following check will die with a
-# signal.
-proc check_effective_target_vector { } {
-if { ![check_runtime s390_chec

[PATCH 07/16] S/390: Use wfc for scalar vector compares

2017-03-24 Thread Andreas Krebbel

The z13 vector support used the vector style comparison instructions
also for the scalar compares in vector registers.  However, it is much
more convenient to just use the compare scalar instruction for that
purpose.  The advantage is that this instruction generates a CC result
as our compares usually do.  So this results in quite some code to be
removed from the backend.

Regression tested on s390x.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/2964.md: Remove the single element vector compare
instructions which are no longer used.
* config/s390/s390.c (s390_select_ccmode): Remove handling of
vector CCmodes.
(s390_canonicalize_comparison): Remove handling of DFmode
compares.
(s390_expand_vec_compare_scalar): Remove function.
(s390_emit_compare): Don't call s390_expand_vec_compare_scalar.
* config/s390/s390.md ("*vec_cmpdf_cconly"): Remove
pattern.
("*cmp_ccs"): Add wfcdb instruction.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust for the
comparison instructions used from now on.
---
 gcc/ChangeLog  |  14 +++
 gcc/config/s390/2964.md|   8 +-
 gcc/config/s390/s390.c | 102 +
 gcc/config/s390/s390.md|  26 ++
 gcc/testsuite/ChangeLog|   5 +
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c  |  31 +--
 6 files changed, 57 insertions(+), 129 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4dd2be6..fef571c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,19 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/2964.md: Remove the single element vector compare
+   instructions which are no longer used.
+   * config/s390/s390.c (s390_select_ccmode): Remove handling of
+   vector CCmodes.
+   (s390_canonicalize_comparison): Remove handling of DFmode
+   compares.
+   (s390_expand_vec_compare_scalar): Remove function.
+   (s390_emit_compare): Don't call s390_expand_vec_compare_scalar.
+   * config/s390/s390.md ("*vec_cmpdf_cconly"): Remove
+   pattern.
+   ("*cmp_ccs"): Add wfcdb instruction.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.md ("mov_64dfp" DD_DF): Use vleig for loading a
FP zero.
("*mov_64" DD_DF): Remove the vector instructions. These
diff --git a/gcc/config/s390/2964.md b/gcc/config/s390/2964.md
index 374e2e3..d9b6729 100644
--- a/gcc/config/s390/2964.md
+++ b/gcc/config/s390/2964.md
@@ -88,7 +88,7 @@ 
vsh,vsl,vsq,lxebr,cdtr,fiebr,vupllb,vupllf,vupllh,vmrhb,madbr,vtm,vmrhf,\
 vmrhg,vmrhh,axtr,fiebra,vleb,cxtr,vlef,vleg,vleh,vpkf,vpkg,vpkh,vmlob,vmlof,\
 vmloh,lxdb,ldeb,mdtr,vceqfs,adb,wflndb,lxeb,vn,vo,vchlb,vx,mxtr,vchlf,vchlg,\
 vchlh,vfcedbs,vfcedb,vceqgs,cxbr,msdbr,vcdgb,debr,vceqhs,meeb,lcxbr,vavglb,\
-vavglf,vavglg,vavglh,wfcedbs,vmrlb,vmrlf,vmrlg,vmrlh,wfchedbs,vmxb,tcdb,\
+vavglf,vavglg,vavglh,vmrlb,vmrlf,vmrlg,vmrlh,vmxb,tcdb,\
 vmahh,vsrlb,wcgdb,lcdbr,vistrbs,vrepb,wfmdb,vrepf,vrepg,vreph,ler,wcdlgb,\
 ley,vistrb,vistrf,vistrh,tceb,wfsqdb,sqeb,vsumqf,vsumqg,vesrlb,vfeezbs,\
 maebr,vesrlf,vesrlg,vesrlh,vmeb,vmef,vmeh,meebr,vflcdb,wfmadb,vperm,sxtr,\
@@ -96,7 +96,7 @@ 
vclzf,vgm,vgmb,vgmf,vgmg,vgmh,tdcxt,vzero,msebr,veslb,veslf,veslg,vfenezb,\
 vfenezf,vfenezh,vistrfs,vchf,vchg,vchh,vmhb,vmhf,vmhh,cdb,veslvb,ledbr,\
 veslvf,veslvg,veslvh,wclgdb,vfmdb,vmnlb,vmnlf,vmnlg,vmnlh,vclzb,vfeezfs,\
 vclzg,vclzh,mdb,vmxlb,vmxlf,vmxlg,vmxlh,ltdtr,vsbcbiq,ceb,wfddb,sebr,vistrhs,\
-lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,vlpf,vlpg,vsegb,vaq,vsegf,vsegh,wfchdbs,\
+lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,vlpf,vlpg,vsegb,vaq,vsegf,vsegh,\
 sdtr,cdbr,vfeezhs,le,wldeb,vfmadb,vchlbs,vacccq,vmaleb,vsel,vmalef,vmaleh,\
 vflndb,mdbr,vmlb,wflpdb,ldetr,vpksfs,vpksf,vpksg,vpksh,sqdb,mxbr,sqdbr,\
 vmaeb,veslh,vmaef,vpklsf,vpklsg,vpklsh,verllb,vchb,ddtr,verllf,verllg,verllh,\
@@ -164,7 +164,7 @@ 
vsl,vsq,lxebr,cdtr,fiebr,vupllb,vupllf,vupllh,vmrhb,madbr,vtm,vmrhf,vmrhg,\
 vmrhh,axtr,fiebra,vleb,cxtr,vlef,vleg,vleh,vpkf,vpkg,vpkh,vmlob,vmlof,vmloh,\
 lxdb,ldeb,vceqfs,adb,wflndb,lxeb,vn,vo,vchlb,vx,vchlf,vchlg,vchlh,vfcedbs,\
 vfcedb,vceqgs,cxbr,msdbr,vcdgb,vceqhs,meeb,lcxbr,vavglb,vavglf,vavglg,vavglh,\
-wfcedbs,vmrlb,vmrlf,vmrlg,vmrlh,wfchedbs,vmxb,tcdb,vmahh,vsrlb,wcgdb,lcdbr,\
+vmrlb,vmrlf,vmrlg,vmrlh,vmxb,tcdb,vmahh,vsrlb,wcgdb,lcdbr,\
 vistrbs,vrepb,wfmdb,vrepf,vrepg,vreph,ler,wcdlgb,ley,vistrb,vistrf,vistrh,\
 tceb,vsumqf,vsumqg,vesrlb,vfeezbs,maebr,vesrlf,vesrlg,vesrlh,vmeb,vmef,\
 vmeh,meebr,vflcdb,wfmadb,vperm,sxtr,vclzf,vgm,vgmb,vgmf,vgmg,vgmh,tdcxt,\
@@ -172,7 +172,7 @@ 
vzero,msebr,veslb,veslf,veslg,vfenezb,vfenezf,vfenezh,vistrfs,vchf,vchg,\
 vchh,vmhb,vmhf,vmhh,cdb,veslvb,ledbr,veslvf,veslvg,veslvh,wclgdb,vfmdb,\
 vmnlb,vmnlf,vmnlg,vmnlh,vclzb,vfeezfs,vclzg,vclzh,mdb,

[PATCH 03/16] S/390: vec_init improvements

2017-03-24 Thread Andreas Krebbel

This enables the vec_init pattern also for V4SF, V1TI, and V1TF.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/vector/vec-init-2.c: New test.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.c (s390_expand_vec_init): Enable vector load
pair for all vector types with 64 bit elements.
* config/s390/vx-builtins.md (V_HW_64): Move mode iterator to ...
* config/s390/vector.md (V_HW_64): ... here.
(V_128_NOSINGLE): New mode iterator.
("vec_init"): Use V_128 as mode iterator.
("*vec_splat"): Use V_128_NOSINGLE mode iterator.
("*vec_tf_to_v1tf", "*vec_ti_to_v1ti"): New pattern definitions.
("*vec_load_pairv2di"): Change to ...
("*vec_load_pair"): ... this one.
---
 gcc/ChangeLog | 13 
 gcc/config/s390/s390.c|  5 +-
 gcc/config/s390/vector.md | 72 +--
 gcc/config/s390/vx-builtins.md|  1 -
 gcc/testsuite/ChangeLog   |  4 ++
 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c | 48 +++
 6 files changed, 122 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 292e946..d29c06b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,18 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.c (s390_expand_vec_init): Enable vector load
+   pair for all vector types with 64 bit elements.
+   * config/s390/vx-builtins.md (V_HW_64): Move mode iterator to ...
+   * config/s390/vector.md (V_HW_64): ... here.
+   (V_128_NOSINGLE): New mode iterator.
+   ("vec_init"): Use V_128 as mode iterator.
+   ("*vec_splat"): Use V_128_NOSINGLE mode iterator.
+   ("*vec_tf_to_v1tf", "*vec_ti_to_v1ti"): New pattern definitions.
+   ("*vec_load_pairv2di"): Change to ...
+   ("*vec_load_pair"): ... this one.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/constraints.md: Add comments.
(jKK): Reject element sizes > 8 bytes.
* config/s390/s390.c (s390_split_ok_p): Enable splitting also for
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index f3cebd6..65a7546 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6617,7 +6617,10 @@ s390_expand_vec_init (rtx target, rtx vals)
   return;
 }
 
-  if (all_regs && REG_P (target) && n_elts == 2 && inner_mode == DImode)
+  if (all_regs
+  && REG_P (target)
+  && n_elts == 2
+  && GET_MODE_SIZE (inner_mode) == 8)
 {
   /* Use vector load pair.  */
   emit_insn (gen_rtx_SET (target,
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 38905e8..7ddeb9a 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -31,6 +31,9 @@
 ; independently e.g. vcond
 (define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI V2DF])
 (define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF])
+
+(define_mode_iterator V_HW_64 [V2DI V2DF])
+
 ; Including TI for instructions that support it (va, vn, ...)
 (define_mode_iterator VT_HW [V16QI V8HI V4SI V2DI V2DF V1TI TI])
 
@@ -53,6 +56,8 @@
 (define_mode_iterator V_64  [V8QI  V4HI V2SI V2SF V1DI V1DF])
 (define_mode_iterator V_128 [V16QI V8HI V4SI V4SF V2DI V2DF V1TI V1TF])
 
+(define_mode_iterator V_128_NOSINGLE [V16QI V8HI V4SI V4SF V2DI V2DF])
+
 ; A blank for vector modes and a * for TImode.  This is used to hide
 ; the TImode expander name in case it is defined already.  See addti3
 ; for an example.
@@ -437,9 +442,9 @@
   "vlgv\t%0,%v1,%Y3(%2)"
   [(set_attr "op_type" "VRS")])
 
-(define_expand "vec_init"
-  [(match_operand:V_HW 0 "register_operand" "")
-   (match_operand:V_HW 1 "nonmemory_operand" "")]
+(define_expand "vec_init"
+  [(match_operand:V_128 0 "register_operand" "")
+   (match_operand:V_128 1 "nonmemory_operand" "")]
   "TARGET_VX"
 {
   s390_expand_vec_init (operands[0], operands[1]);
@@ -449,20 +454,20 @@
 ; Replicate from vector element
 ; vrepb, vreph, vrepf, vrepg
 (define_insn "*vec_splat"
-  [(set (match_operand:V_HW   0 "register_operand" "=v")
-   (vec_duplicate:V_HW
+  [(set (match_operand:V_128_NOSINGLE   0 "register_operand" "=v")
+   (vec_duplicate:V_128_NOSINGLE
 (vec_select:
- (match_operand:V_HW 1 "register_operand"  "v")
+ (match_operand:V_128_NOSINGLE 1 "register_operand"  "v")
  (parallel
   [(match_operand:QI 2 "const_mask_operand" "C")]]
-  "TARGET_VX && UINTVAL (operands[2]) < GET_MODE_NUNITS (mode)"
+  "TARGET_VX && UINTVAL (operands[2]) < GET_MODE_NUNITS (mode)"
   "vrep\t%v0,%v1,%2"
   [(set_attr "op_type" "VRI")])
 
 ; vlrepb, vlreph, vlrepf, vlrepg, vrepib, vrepih, vrepif, vrepig, vrepb, 
vreph, vrepf, vrepg
 (define_insn "*vec_splats"
-  [(set (match_operand:V_HW  0 "register_operand" 
"=v,v,v,v")
-   (vec

[PATCH 13/16] S/390: arch12: Add indirect branch pattern

2017-03-24 Thread Andreas Krebbel

This adds support for the branch indirect instruction.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.md ("indirect_jump"): Turn insn definition into
expander.
("*indirect_jump", "*indirect2_jump"): New pattern definitions.
---
 gcc/ChangeLog   |  6 ++
 gcc/config/s390/s390.md | 48 +---
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a48b743..b3f3d95 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.md ("indirect_jump"): Turn insn definition into
+   expander.
+   ("*indirect_jump", "*indirect2_jump"): New pattern definitions.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.c (s390_expand_vec_init): Use vllezl
instruction if possible.
* config/s390/vector.md (vec_halfnumelts): New mode
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 53c8fed..32753ef 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -9509,20 +9509,46 @@
 ; indirect-jump instruction pattern(s).
 ;
 
-(define_insn "indirect_jump"
- [(set (pc) (match_operand 0 "address_operand" "ZR"))]
+(define_expand "indirect_jump"
+  [(set (pc) (match_operand 0 "nonimmediate_operand" ""))]
   ""
 {
-  if (get_attr_op_type (insn) == OP_TYPE_RR)
-return "br\t%0";
+  if (address_operand (operands[0], GET_MODE (operands[0])))
+;
+  else if (TARGET_ARCH12
+  && GET_MODE (operands[0]) == Pmode
+  && memory_operand (operands[0], Pmode))
+;
   else
-return "b\t%a0";
-}
-  [(set (attr "op_type")
-(if_then_else (match_operand 0 "register_operand" "")
-  (const_string "RR") (const_string "RX")))
-   (set_attr "type"  "branch")
-   (set_attr "atype" "agen")])
+operands[0] = force_reg (Pmode, operands[0]);
+})
+
+(define_insn "*indirect_jump"
+  [(set (pc)
+   (match_operand 0 "address_operand" "a,ZR"))]
+ ""
+ "@
+  br\t%0
+  b\t%a0"
+ [(set_attr "op_type" "RR,RX")
+  (set_attr "type"  "branch")
+  (set_attr "atype" "agen")
+  (set_attr "cpu_facility" "*")])
+
+; FIXME: LRA does not appear to be able to deal with MEMs being
+; checked against address constraints like ZR above.  So make this a
+; separate pattern for now.
+(define_insn "*indirect2_jump"
+  [(set (pc)
+   (match_operand 0 "nonimmediate_operand" "a,T"))]
+ ""
+ "@
+  br\t%0
+  bi\t%0"
+ [(set_attr "op_type" "RR,RXY")
+  (set_attr "type"  "branch")
+  (set_attr "atype" "agen")
+  (set_attr "cpu_facility" "*,arch12")])
 
 ;
 ; casesi instruction pattern(s).
-- 
2.9.1

[PATCH 10/16] S/390: arch12: Add support for new vector bit operations.

2017-03-24 Thread Andreas Krebbel

This patch adds support for the new bit operations introduced with
arch12.

The patch also renames the one complement pattern to the proper RTL
standard name.

2017-03-24  Andreas Krebbel  

* config/s390/s390.c (s390_rtx_costs): Return low costs for the
canonical form of ~AND to make sure the new instruction will be
used.
* config/s390/vector.md ("notand3", "ior_not3")
("notxor3"): Add new pattern definitions.
("*not"): Rename to ...
("one_cmpl2"): ... this.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/vxe/bitops-1.c: New test.
---
 gcc/config/s390/s390.c   | 15 
 gcc/config/s390/vector.md| 31 +++--
 gcc/testsuite/ChangeLog  |  4 +++
 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c | 52 
 4 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index c94edcc..416a15e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3373,6 +3373,21 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
  *total = COSTS_N_INSNS (2);
  return true;
}
+
+  /* ~AND on a 128 bit mode.  This can be done using a vector
+instruction.  */
+  if (TARGET_VXE
+ && GET_CODE (XEXP (x, 0)) == NOT
+ && GET_CODE (XEXP (x, 1)) == NOT
+ && REG_P (XEXP (XEXP (x, 0), 0))
+ && REG_P (XEXP (XEXP (x, 1), 0))
+ && GET_MODE_SIZE (GET_MODE (XEXP (XEXP (x, 0), 0))) == 16
+ && s390_hard_regno_mode_ok (VR0_REGNUM,
+ GET_MODE (XEXP (XEXP (x, 0), 0
+   {
+ *total = COSTS_N_INSNS (1);
+ return true;
+   }
   /* fallthrough */
 case ASHIFT:
 case ASHIFTRT:
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 7ddeb9a..68a8ed0 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -655,6 +655,15 @@
   "vn\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
+; Vector not and
+
+(define_insn "notand3"
+  [(set (match_operand:VT 0 "register_operand" "=v")
+   (ior:VT (not:VT (match_operand:VT 1 "register_operand" "%v"))
+   (not:VT (match_operand:VT 2 "register_operand"  "v"]
+  "TARGET_VXE"
+  "vnn\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
 
 ; Vector or
 
@@ -666,6 +675,15 @@
   "vo\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
+; Vector or with complement
+
+(define_insn "ior_not3"
+  [(set (match_operand:VT 0 "register_operand" "=v")
+   (ior:VT (not:VT (match_operand:VT 2 "register_operand"  "v"))
+   (match_operand:VT 1 "register_operand" "%v")))]
+  "TARGET_VXE"
+  "voc\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
 
 ; Vector xor
 
@@ -677,9 +695,18 @@
   "vx\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
+; Vector not xor
+
+(define_insn "notxor3"
+  [(set (match_operand:VT 0 "register_operand" "=v")
+   (not:VT (xor:VT (match_operand:VT 1 "register_operand" "%v")
+   (match_operand:VT 2 "register_operand"  "v"]
+  "TARGET_VXE"
+  "vnx\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
 
-; Bitwise inversion of a vector - used for vec_cmpne
-(define_insn "*not"
+; Bitwise inversion of a vector
+(define_insn "one_cmpl2"
   [(set (match_operand:VT 0 "register_operand" "=v")
(not:VT (match_operand:VT 1 "register_operand"  "v")))]
   "TARGET_VX"
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9ca13ab..bbdd3c8 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  
 
+   * gcc.target/s390/vxe/bitops-1.c: New test.
+
+2017-03-24  Andreas Krebbel  
+
* gcc.target/s390/s390.exp: Run tests in arch12 and vxe dirs.
* lib/target-supports.exp: Add effective target check s390_vxe.
 
diff --git a/gcc/testsuite/gcc.target/s390/vxe/bitops-1.c 
b/gcc/testsuite/gcc.target/s390/vxe/bitops-1.c
new file mode 100644
index 000..bdf7457
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vxe/bitops-1.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=arch12 --save-temps" } */
+/* { dg-require-effective-target s390_vxe } */
+
+typedef unsigned int   uv4si __attribute__((vector_size(16)));
+
+uv4si __attribute__((noinline))
+not_xor (uv4si a, uv4si b)
+{
+  return ~(a ^ b);
+}
+/* { dg-final { scan-assembler-times "vnx\t%v24,%v24,%v26" 1 } } */
+
+uv4si __attribute__((noinline))
+not_and (uv4si a, uv4si b)
+{
+  return ~(a & b);
+}
+/* { dg-final { scan-assembler-times "vnn\t%v24,%v24,%v26" 1 } } */
+
+uv4si __attribute__((noinline))
+or_not (uv4si a, uv4si b)
+{
+  return a | ~b;
+}
+/* { dg-final { scan-assembler-times "voc\t%v24,%v24,%v26" 1 } } */
+
+
+int
+main ()
+{
+

[PATCH 11/16] S/390: arch12: New vector popcount variants

2017-03-24 Thread Andreas Krebbel

arch12 provides pop count vector instructions for bigger elements than
just chars.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/vxe/popcount-1.c: New test.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/vector.md ("popcountv16qi2", "popcountv8hi2")
("popcountv4si2", "popcountv2di2"): Rename to ...
("popcount2", "popcountv8hi2_vx", "popcountv4si2_vx")
("popcountv2di2_vx"): ... these and add !TARGET_VXE to the
condition.
("popcount2_vxe"): New pattern.
---
 gcc/ChangeLog  |  9 +++
 gcc/config/s390/vector.md  | 38 ---
 gcc/testsuite/ChangeLog|  4 ++
 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c | 88 ++
 4 files changed, 131 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 89e7906..d516b4d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,14 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/vector.md ("popcountv16qi2", "popcountv8hi2")
+   ("popcountv4si2", "popcountv2di2"): Rename to ...
+   ("popcount2", "popcountv8hi2_vx", "popcountv4si2_vx")
+   ("popcountv2di2_vx"): ... these and add !TARGET_VXE to the
+   condition.
+   ("popcount2_vxe"): New pattern.
+
+2017-03-24  Andreas Krebbel  
+
* common/config/s390/s390-common.c (processor_flags_table): Add
arch12.
* config.gcc: Add arch12.
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 68a8ed0..d4c0e95 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -715,11 +715,33 @@
 
 ; Vector population count
 
-(define_insn "popcountv16qi2"
+(define_expand "popcount2"
+  [(set (match_operand:VI_HW0 "register_operand" "=v")
+   (unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")]
+ UNSPEC_POPCNT))]
+  "TARGET_VX"
+{
+  if (TARGET_VXE)
+emit_insn (gen_popcount2_vxe (operands[0], operands[1]));
+  else
+emit_insn (gen_popcount2_vx (operands[0], operands[1]));
+  DONE;
+})
+
+; vpopctb, vpopcth, vpopctf, vpopctg
+(define_insn "popcount2_vxe"
+  [(set (match_operand:VI_HW0 "register_operand" "=v")
+   (unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")]
+ UNSPEC_POPCNT))]
+  "TARGET_VXE"
+  "vpopct\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "popcountv16qi2_vx"
   [(set (match_operand:V16QI0 "register_operand" "=v")
(unspec:V16QI [(match_operand:V16QI 1 "register_operand"  "v")]
  UNSPEC_POPCNT))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
   "vpopct\t%v0,%v1,0"
   [(set_attr "op_type" "VRR")])
 
@@ -729,7 +751,7 @@
 ; of the result, add it to the result and extend it to halfword
 ; element size (unpack).
 
-(define_expand "popcountv8hi2"
+(define_expand "popcountv8hi2_vx"
   [(set (match_dup 2)
(unspec:V16QI [(subreg:V16QI (match_operand:V8HI 1 "register_operand" 
"v") 0)]
  UNSPEC_POPCNT))
@@ -761,7 +783,7 @@
(and:V8HI (subreg:V8HI (match_dup 2) 0)
  (subreg:V8HI (match_dup 3) 0)))
 ]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
 {
   operands[2] = gen_reg_rtx (V16QImode);
   operands[3] = gen_reg_rtx (V16QImode);
@@ -769,20 +791,20 @@
   operands[5] = CONST0_RTX (V16QImode);
 })
 
-(define_expand "popcountv4si2"
+(define_expand "popcountv4si2_vx"
   [(set (match_dup 2)
(unspec:V16QI [(subreg:V16QI (match_operand:V4SI 1 "register_operand" 
"v") 0)]
  UNSPEC_POPCNT))
(set (match_operand:V4SI 0 "register_operand" "=v")
(unspec:V4SI [(match_dup 2) (match_dup 3)]
 UNSPEC_VEC_VSUM))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
 {
   operands[2] = gen_reg_rtx (V16QImode);
   operands[3] = force_reg (V16QImode, CONST0_RTX (V16QImode));
 })
 
-(define_expand "popcountv2di2"
+(define_expand "popcountv2di2_vx"
   [(set (match_dup 2)
(unspec:V16QI [(subreg:V16QI (match_operand:V2DI 1 "register_operand" 
"v") 0)]
  UNSPEC_POPCNT))
@@ -792,7 +814,7 @@
(set (match_operand:V2DI 0 "register_operand" "=v")
(unspec:V2DI [(match_dup 3) (match_dup 5)]
 UNSPEC_VEC_VSUMG))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
 {
   operands[2] = gen_reg_rtx (V16QImode);
   operands[3] = gen_reg_rtx (V4SImode);
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index bbdd3c8..6d178c5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  
 
+   * gcc.target/s390/vxe/popcount-1.c: New test.
+
+2017-03-24  Andreas Krebbel  
+
* gcc.target/s390/vxe/bitops-1.c: New test.
 
 2017-03-24  Andreas Krebbel  
diff --git a/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c 
b/gcc/testsuite/

[PATCH 08/16] S/390: Rearrange fixuns_trunc pattern definitions.

2017-03-24 Thread Andreas Krebbel

This reworks the fixuns_trunc* patterns a bit which got quite confusing
after adding z13 support.  Now we just have a single RTL standard name
expander definition ("fixuns_trunc2") which then
multiplexes to either the emulation variants *_emu or the hardware
implementations.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.md
("fixuns_truncdddi2", "fixuns_trunctddi2")
("fixuns_trunc2"): Merge into ...
("fixuns_trunc2"): New expander.

("fixuns_trunc2", "fixuns_truncsi2"):
Rename expanders to ...

("fixuns_trunc2_emu")
("fixuns_truncdddi2_emu"): ... these.

("fixuns_truncsi2_emu"): New expander.

("*fixuns_truncdfdi2_z13"): Rename to ...
("*fixuns_truncdfdi2_vx"): ... this.
---
 gcc/ChangeLog   |  18 
 gcc/config/s390/s390.md | 253 +++-
 2 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fef571c..779101e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,23 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.md
+   ("fixuns_truncdddi2", "fixuns_trunctddi2")
+   ("fixuns_trunc2"): Merge into ...
+   ("fixuns_trunc2"): New expander.
+
+   ("fixuns_trunc2", "fixuns_truncsi2"):
+   Rename expanders to ...
+
+   ("fixuns_trunc2_emu")
+   ("fixuns_truncdddi2_emu"): ... these.
+
+   ("fixuns_truncsi2_emu"): New expander.
+
+   ("*fixuns_truncdfdi2_z13"): Rename to ...
+   ("*fixuns_truncdfdi2_vx"): ... this.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/2964.md: Remove the single element vector compare
instructions which are no longer used.
* config/s390/s390.c (s390_select_ccmode): Remove handling of
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index e72d5be..d4d3781 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -4735,152 +4735,185 @@
   "operands[2] = gen_lowpart (QImode, operands[0]);")
 
 ;
-; fixuns_trunc(dd|td)di2 instruction pattern(s).
+; fixuns_trunc(dd|td|sf|df|tf)(si|di)2 expander
 ;
 
-(define_expand "fixuns_truncdddi2"
+; This is the only entry point for fixuns_trunc.  It multiplexes the
+; expansion to either the *_emu expanders below for pre z196 machines
+; or emits the default pattern otherwise.
+(define_expand "fixuns_trunc2"
   [(parallel
-[(set (match_operand:DI 0 "register_operand" "")
- (unsigned_fix:DI (match_operand:DD 1 "register_operand" "")))
- (unspec:DI [(const_int DFP_RND_TOWARD_0)] UNSPEC_ROUND)
+[(set (match_operand:GPR 0 "register_operand" "")
+ (unsigned_fix:GPR (match_operand:FP 1 "register_operand" "")))
+ (unspec:GPR [(match_dup 2)] UNSPEC_ROUND)
  (clobber (reg:CC CC_REGNUM))])]
-
-  "TARGET_HARD_DFP"
+  "TARGET_HARD_FLOAT"
 {
   if (!TARGET_Z196)
 {
-  rtx_code_label *label1 = gen_label_rtx ();
-  rtx_code_label *label2 = gen_label_rtx ();
-  rtx temp = gen_reg_rtx (TDmode);
-  REAL_VALUE_TYPE cmp, sub;
-
-  decimal_real_from_string (&cmp, "9223372036854775808.0");  /* 2^63 */
-  decimal_real_from_string (&sub, "18446744073709551616.0"); /* 2^64 */
-
-  /* 2^63 can't be represented as 64bit DFP number with full precision.  
The
- solution is doing the check and the subtraction in TD mode and using a
- TD -> DI convert afterwards.  */
-  emit_insn (gen_extendddtd2 (temp, operands[1]));
-  temp = force_reg (TDmode, temp);
-  emit_cmp_and_jump_insns (temp,
-   const_double_from_real_value (cmp, TDmode),
-   LT, NULL_RTX, VOIDmode, 0, label1);
-  emit_insn (gen_subtd3 (temp, temp,
-   const_double_from_real_value (sub, TDmode)));
-  emit_insn (gen_fix_trunctddi2_dfp (operands[0], temp,
-GEN_INT (DFP_RND_TOWARD_MINF)));
-  emit_jump (label2);
-
-  emit_label (label1);
-  emit_insn (gen_fix_truncdddi2_dfp (operands[0], operands[1],
-GEN_INT (DFP_RND_TOWARD_0)));
-  emit_label (label2);
+  /* We don't provide emulation for TD|DD->SI.  */
+  if (GET_MODE_CLASS (mode) == MODE_DECIMAL_FLOAT
+ && mode == SImode)
+   FAIL;
+  emit_insn (gen_fixuns_trunc2_emu (operands[0],
+  operands[1]));
   DONE;
 }
+
+  if (GET_MODE_CLASS (mode) == MODE_DECIMAL_FLOAT)
+operands[2] = GEN_INT (DFP_RND_TOWARD_0);
+  else
+operands[2] = GEN_INT (BFP_RND_TOWARD_0);
+})
+
+; (sf|df|tf)->unsigned (si|di)
+
+; Emulate the unsigned conversion with the signed version for pre z196
+; machines.
+(define_expand "fixuns_trunc2_emu"
+  [(parallel
+[(set (match_operand:GPR 0 "register_operand" "")
+ (unsigned_fix:GPR (match_operand:BFP 1 "register_operand" "")))
+ (unspec:GPR [(const_int BFP_RND_TOWARD_0)] UNSPEC_ROUND)
+ (clobber (reg:CC CC_REG

[PATCH 14/16] S/390: arch12: Support the mul/add/subtract instructions.

2017-03-24 Thread Andreas Krebbel

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.md ("*adddi3_sign", "*subdi3_sign", "mulditi3")
("mulditi3_2", "*muldi3_sign"): New patterns.
("muldi3", "*muldi3", "mulsi3", "*mulsi3"): Add an expander and
rename the pattern definition.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/arch12/aghsghmgh-1.c: New test.
* gcc.target/s390/arch12/mul-1.c: New test.
* gcc.target/s390/arch12/mul-2.c: New test.
---
 gcc/ChangeLog  |  7 ++
 gcc/config/s390/s390.md| 98 +++---
 gcc/testsuite/ChangeLog|  6 ++
 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c | 23 +
 gcc/testsuite/gcc.target/s390/arch12/mul-1.c   | 30 +++
 gcc/testsuite/gcc.target/s390/arch12/mul-2.c   | 16 
 6 files changed, 167 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b3f3d95..3753ad6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.md ("*adddi3_sign", "*subdi3_sign", "mulditi3")
+   ("mulditi3_2", "*muldi3_sign"): New patterns.
+   ("muldi3", "*muldi3", "mulsi3", "*mulsi3"): Add an expander and
+   rename the pattern definition.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.md ("indirect_jump"): Turn insn definition into
expander.
("*indirect_jump", "*indirect2_jump"): New pattern definitions.
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 32753ef..93a0bc6 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -5795,6 +5795,15 @@
(set_attr "cpu_facility" "*,z196,extimm,z10")
(set_attr "z10prop" "z10_super_E1,*,z10_super_E1,z10_super_E1")])
 
+(define_insn "*adddi3_sign"
+  [(set (match_operand:DI  0 "register_operand" "=d")
+(plus:DI (sign_extend:DI (match_operand:HI 2 "memory_operand""T"))
+(match_operand:DI 1 "register_operand"  "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_ARCH12"
+  "agh\t%0,%2"
+  [(set_attr "op_type"  "RXY")])
+
 ;
 ; add(tf|df|sf|td|dd)3 instruction pattern(s).
 ;
@@ -6226,6 +6235,15 @@
(set_attr "cpu_facility" "*,z196,*,longdisp")
(set_attr "z10prop" "z10_super_c_E1,*,z10_super_E1,z10_super_E1")])
 
+(define_insn "*subdi3_sign"
+  [(set (match_operand:DI   0 "register_operand" "=d")
+(minus:DI (match_operand:DI 1 "register_operand"  "0")
+  (sign_extend:DI (match_operand:HI 2 "memory_operand"
"T"
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_ARCH12"
+  "sgh\t%0,%2"
+  [(set_attr "op_type"  "RXY")])
+
 
 ;
 ; sub(tf|df|sf|td|dd)3 instruction pattern(s).
@@ -6565,6 +6583,14 @@
 ; muldi3 instruction pattern(s).
 ;
 
+(define_expand "muldi3"
+  [(parallel
+[(set (match_operand:DI  0 "register_operand")
+ (mult:DI (match_operand:DI 1 "nonimmediate_operand")
+  (match_operand:DI 2 "general_operand")))
+ (clobber (reg:CC CC_REGNUM))])]
+  "TARGET_ZARCH")
+
 (define_insn "*muldi3_sign"
   [(set (match_operand:DI 0 "register_operand" "=d,d")
 (mult:DI (sign_extend:DI (match_operand:SI 2 "general_operand" "d,T"))
@@ -6576,24 +6602,68 @@
   [(set_attr "op_type"  "RRE,RXY")
(set_attr "type" "imuldi")])
 
-(define_insn "muldi3"
-  [(set (match_operand:DI 0 "register_operand" "=d,d,d,d")
-(mult:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,0")
- (match_operand:DI 2 "general_operand" "d,K,T,Os")))]
+(define_insn "*muldi3"
+  [(set (match_operand:DI  0 "register_operand" "=d,d,d,d,d")
+   (mult:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d,0,0,0")
+(match_operand:DI 2 "general_operand"   "d,d,K,T,Os")))
+   (clobber (match_scratch:CC  3"=X,c,X,X,X"))]
   "TARGET_ZARCH"
   "@
msgr\t%0,%2
+   msgrkc\t%0,%1,%2
mghi\t%0,%h2
msg\t%0,%2
msgfi\t%0,%2"
-  [(set_attr "op_type"  "RRE,RI,RXY,RIL")
+  [(set_attr "op_type"  "RRE,RRF,RI,RXY,RIL")
(set_attr "type" "imuldi")
-   (set_attr "cpu_facility" "*,*,*,z10")])
+   (set_attr "cpu_facility" "*,arch12,*,*,z10")])
+
+(define_insn "mulditi3"
+  [(set (match_operand:TI 0 "register_operand"   "=d,d")
+(mult:TI (sign_extend:TI
+ (match_operand:DI 1 "register_operand" "%d,0"))
+(sign_extend:TI
+ (match_operand:DI 2 "nonimmediate_operand" " d,T"]
+  "TARGET_ARCH12"
+  "@
+   mgrk\t%0,%1,%2
+   mg\t%0,%2"
+  [(set_attr "op_type"  "RRF,RXY")])
+
+; Combine likes op

[PATCH 09/16] S/390: arch12: Add arch12 option.

2017-03-24 Thread Andreas Krebbel

This patch covers the mechanical work of making the new architecture
option arch12 available wherever it will be needed later.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/s390.exp: Run tests in arch12 and vxe dirs.
* lib/target-supports.exp: Add effective target check s390_vxe.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* common/config/s390/s390-common.c (processor_flags_table): Add
arch12.
* config.gcc: Add arch12.
* config/s390/driver-native.c (s390_host_detect_local_cpu):
Default to arch12 for unknown CPU model numbers.
* config/s390/s390-builtins.def: Add B_VXE builtin flag.
* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Adjust
PROCESSOR_max sanity check.
* config/s390/s390-opts.h (enum processor_type): Add
PROCESSOR_ARCH12.
* config/s390/s390.c (processor_table): Add arch12.
(s390_expand_builtin): Add check for B_VXE flag.
(s390_issue_rate): Add PROCESSOR_ARCH12.
(s390_get_sched_attrmask): Likewise.
(s390_get_unit_mask): Likewise.
(s390_sched_score): Enable z13 scheduling for arch12.
(s390_sched_reorder): Likewise.
(s390_sched_variable_issue): Likewise.
* config/s390/s390.h (enum processor_flags): Add PF_ARCH12 and
PF_VXE.
(s390_tune_attr): Use z13 scheduling also for arch12.
(TARGET_CPU_ARCH12, TARGET_CPU_ARCH12_P, TARGET_CPU_VXE)
(TARGET_CPU_VXE_P, TARGET_ARCH12, TARGET_ARCH12_P, TARGET_VXE)
(TARGET_VXE_P): New macros.
* config/s390/s390.md: Add arch12 to cpu attribute.  Add arch12
and vxe to cpu_facility.  Add arch12 and vxe to enabled attribute.
* config/s390/s390.opt: Add arch12 as processor_type.
---
 gcc/ChangeLog  | 30 ++
 gcc/common/config/s390/s390-common.c   |  5 -
 gcc/config.gcc |  2 +-
 gcc/config/s390/driver-native.c|  3 +++
 gcc/config/s390/s390-builtins.def  |  3 ++-
 gcc/config/s390/s390-c.c   |  2 +-
 gcc/config/s390/s390-opts.h|  1 +
 gcc/config/s390/s390.c | 22 --
 gcc/config/s390/s390.h | 25 +
 gcc/config/s390/s390.md| 12 ++--
 gcc/config/s390/s390.opt   |  3 +++
 gcc/testsuite/ChangeLog|  5 +
 gcc/testsuite/gcc.target/s390/s390.exp |  6 ++
 gcc/testsuite/lib/target-supports.exp  | 17 +
 14 files changed, 120 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 779101e..89e7906 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,35 @@
 2017-03-24  Andreas Krebbel  
 
+   * common/config/s390/s390-common.c (processor_flags_table): Add
+   arch12.
+   * config.gcc: Add arch12.
+   * config/s390/driver-native.c (s390_host_detect_local_cpu):
+   Default to arch12 for unknown CPU model numbers.
+   * config/s390/s390-builtins.def: Add B_VXE builtin flag.
+   * config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Adjust
+   PROCESSOR_max sanity check.
+   * config/s390/s390-opts.h (enum processor_type): Add
+   PROCESSOR_ARCH12.
+   * config/s390/s390.c (processor_table): Add arch12.
+   (s390_expand_builtin): Add check for B_VXE flag.
+   (s390_issue_rate): Add PROCESSOR_ARCH12.
+   (s390_get_sched_attrmask): Likewise.
+   (s390_get_unit_mask): Likewise.
+   (s390_sched_score): Enable z13 scheduling for arch12.
+   (s390_sched_reorder): Likewise.
+   (s390_sched_variable_issue): Likewise.
+   * config/s390/s390.h (enum processor_flags): Add PF_ARCH12 and
+   PF_VXE.
+   (s390_tune_attr): Use z13 scheduling also for arch12.
+   (TARGET_CPU_ARCH12, TARGET_CPU_ARCH12_P, TARGET_CPU_VXE)
+   (TARGET_CPU_VXE_P, TARGET_ARCH12, TARGET_ARCH12_P, TARGET_VXE)
+   (TARGET_VXE_P): New macros.
+   * config/s390/s390.md: Add arch12 to cpu attribute.  Add arch12
+   and vxe to cpu_facility.  Add arch12 and vxe to enabled attribute.
+   * config/s390/s390.opt: Add arch12 as processor_type.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.md
("fixuns_truncdddi2", "fixuns_trunctddi2")
("fixuns_trunc2"): Merge into ...
diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 47f13e1..10418a3 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -45,7 +45,10 @@ EXPORTED_CONST int processor_flags_table[] =
  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX,
 /* z13 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
- | PF_Z13 | PF_VX
+ | PF_Z13 | PF_VX,
+/* arch12 */ PF_IEEE_FL

[PATCH 12/16] S/390: arch12: Add vllezlf instruction.

2017-03-24 Thread Andreas Krebbel

This adds support for the vector load element and zero instruction and
makes sure it is used when initializing vectors with elements while
setting the rest to 0.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.c (s390_expand_vec_init): Use vllezl
instruction if possible.
* config/s390/vector.md (vec_halfnumelts): New mode
attribute.
("*vec_vllezlf"): New pattern.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/vxe/vllezlf-1.c: New test.
---
 gcc/ChangeLog |  8 +++
 gcc/config/s390/s390.c| 28 +
 gcc/config/s390/vector.md | 17 +++
 gcc/testsuite/ChangeLog   |  4 
 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c | 30 +++
 5 files changed, 87 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d516b4d..a48b743 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,13 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.c (s390_expand_vec_init): Use vllezl
+   instruction if possible.
+   * config/s390/vector.md (vec_halfnumelts): New mode
+   attribute.
+   ("*vec_vllezlf"): New pattern.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/vector.md ("popcountv16qi2", "popcountv8hi2")
("popcountv4si2", "popcountv2di2"): Rename to ...
("popcount2", "popcountv8hi2_vx", "popcountv4si2_vx")
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 416a15e..e800323 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6552,6 +6552,34 @@ s390_expand_vec_init (rtx target, rtx vals)
   return;
 }
 
+  /* Use vector load logical element and zero.  */
+  if (TARGET_VXE && (mode == V4SImode || mode == V4SFmode))
+{
+  bool found = true;
+
+  x = XVECEXP (vals, 0, 0);
+  if (memory_operand (x, inner_mode))
+   {
+ for (i = 1; i < n_elts; ++i)
+   found = found && XVECEXP (vals, 0, i) == const0_rtx;
+
+ if (found)
+   {
+ machine_mode half_mode = (inner_mode == SFmode
+   ? V2SFmode : V2SImode);
+ emit_insn (gen_rtx_SET (target,
+ gen_rtx_VEC_CONCAT (mode,
+ gen_rtx_VEC_CONCAT (half_mode,
+ x,
+ 
const0_rtx),
+ gen_rtx_VEC_CONCAT (half_mode,
+ 
const0_rtx,
+ 
const0_rtx;
+ return;
+   }
+   }
+}
+
   /* We are about to set the vector elements one by one.  Zero out the
  full register first in order to help the data flow framework to
  detect it as full VR set.  */
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index d4c0e95..6a726a3 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -44,6 +44,7 @@
 (define_mode_iterator VI_HW_HSD [V8HI  V4SI V2DI])
 (define_mode_iterator VI_HW_HS  [V8HI  V4SI])
 (define_mode_iterator VI_HW_QH  [V16QI V8HI])
+(define_mode_iterator VI_HW_4   [V4SI V4SF])
 
 ; All integer vector modes supported in a vector register + TImode
 (define_mode_iterator VIT [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI 
V2SI V4SI V1DI V2DI V1TI TI])
@@ -127,6 +128,9 @@
(V2DI "V2SI")
(V2DF "V2SF")])
 
+(define_mode_attr vec_halfnumelts
+  [(V4SF "V2SF") (V4SI "V2SI")])
+
 ; The comparisons not setting CC iterate over the rtx code.
 (define_code_iterator VFCMP_HW_OP [eq gt ge])
 (define_code_attr asm_fcmp_op [(eq "e") (gt "h") (ge "he")])
@@ -451,6 +455,19 @@
   DONE;
 })
 
+(define_insn "*vec_vllezlf"
+  [(set (match_operand:VI_HW_4  0 "register_operand" "=v")
+   (vec_concat:VI_HW_4
+(vec_concat:
+ (match_operand: 1 "memory_operand""R")
+ (const_int 0))
+(vec_concat:
+ (const_int 0)
+ (const_int 0]
+  "TARGET_VXE"
+  "vllezlf\t%v0,%1"
+  [(set_attr "op_type" "VRX")])
+
 ; Replicate from vector element
 ; vrepb, vreph, vrepf, vrepg
 (define_insn "*vec_splat"
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 6d178c5..4efc391 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  
 
+   * gcc.target/s390/vxe/vllezlf-1.c: New test.
+
+2017-03-24  Andreas Krebbel  
+
* gcc.target/s390/vxe/popcount-1.c: New test.
 
 2017-03-24  Andreas Krebbel  
diff --git a/gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c 
b/gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c
new file mo

[PATCH 15/16] S/390: arch12: Support new vector floating point modes.

2017-03-24 Thread Andreas Krebbel

This patch adds support for the new floating point vector elements (SF
and TF) introduced with arch12.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  

* config/s390/s390.c (s390_expand_vec_compare): Support other
vector floating point modes than just V2DF.
(s390_expand_vcond): Likewise.
(s390_hard_regno_mode_ok): Allow SFmode values in VRs.
(s390_cannot_change_mode_class): Prevent mode changes between TF
and V1TF in vector registers.
* config/s390/s390.md (DF, SF): New mode attributes.
("*cmp_ccs", "add3", "sub3", "mul3")
("fma4", "fms4", "div3", "*neg2"): Add
SFmode support for VRs.
* config/s390/vector.md (V_HW, V_HW2, VT_HW, ti*, nonvec): Add new
vector fp modes.
(VFT, VF_HW): New mode iterators.
(vw, sdx): New mode attributes.
("addv2df3", "subv2df3", "mulv2df3", "divv2df3", "sqrtv2df2")
("fmav2df4","fmsv2df4", "negv2df2", "absv2df2", "*negabsv2df2")
("smaxv2df3", "sminv2df3", "*vec_cmpv2df_nocc")
("vec_cmpuneqv2df", "vec_cmpltgtv2df", "vec_orderedv2df")
("vec_unorderedv2df"): Adjust the v2df only patterns to support
also the new vector floating point modes.  Renaming to ...

("add3", "sub3", "mul3", "div3")
("sqrt2", "fma4", "fms4", "neg2")
("abs2", "negabs2", "smax3")
("smin3", "*vec_cmp_nocc")
("vec_cmpuneq", "vec_cmpltgt", "vec_ordered")
("vec_unordered"): ... these.

("neg_fma4", "neg_fms4", "*smax3_vxe")
("*smin3_vxe", "*sminv2df3_vx", "*vec_extendv4sf")
("*vec_extendv2df"): New insn definitions.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  

* gcc.target/s390/vxe/negfma-1.c: New test.
---
 gcc/ChangeLog|  34 +++
 gcc/config/s390/s390-builtins.def|   2 +-
 gcc/config/s390/s390.c   |  13 +-
 gcc/config/s390/s390.md  | 144 ++--
 gcc/config/s390/vector.md| 313 ++-
 gcc/testsuite/ChangeLog  |   4 +
 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c |  49 +
 7 files changed, 388 insertions(+), 171 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3753ad6..bd60982 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,39 @@
 2017-03-24  Andreas Krebbel  
 
+   * config/s390/s390.c (s390_expand_vec_compare): Support other
+   vector floating point modes than just V2DF.
+   (s390_expand_vcond): Likewise.
+   (s390_hard_regno_mode_ok): Allow SFmode values in VRs.
+   (s390_cannot_change_mode_class): Prevent mode changes between TF
+   and V1TF in vector registers.
+   * config/s390/s390.md (DF, SF): New mode attributes.
+   ("*cmp_ccs", "add3", "sub3", "mul3")
+   ("fma4", "fms4", "div3", "*neg2"): Add
+   SFmode support for VRs.
+   * config/s390/vector.md (V_HW, V_HW2, VT_HW, ti*, nonvec): Add new
+   vector fp modes.
+   (VFT, VF_HW): New mode iterators.
+   (vw, sdx): New mode attributes.
+   ("addv2df3", "subv2df3", "mulv2df3", "divv2df3", "sqrtv2df2")
+   ("fmav2df4","fmsv2df4", "negv2df2", "absv2df2", "*negabsv2df2")
+   ("smaxv2df3", "sminv2df3", "*vec_cmpv2df_nocc")
+   ("vec_cmpuneqv2df", "vec_cmpltgtv2df", "vec_orderedv2df")
+   ("vec_unorderedv2df"): Adjust the v2df only patterns to support
+   also the new vector floating point modes.  Renaming to ...
+
+   ("add3", "sub3", "mul3", "div3")
+   ("sqrt2", "fma4", "fms4", "neg2")
+   ("abs2", "negabs2", "smax3")
+   ("smin3", "*vec_cmp_nocc")
+   ("vec_cmpuneq", "vec_cmpltgt", "vec_ordered")
+   ("vec_unordered"): ... these.
+
+   ("neg_fma4", "neg_fms4", "*smax3_vxe")
+   ("*smin3_vxe", "*sminv2df3_vx", "*vec_extendv4sf")
+   ("*vec_extendv2df"): New insn definitions.
+
+2017-03-24  Andreas Krebbel  
+
* config/s390/s390.md ("*adddi3_sign", "*subdi3_sign", "mulditi3")
("mulditi3_2", "*muldi3_sign"): New patterns.
("muldi3", "*muldi3", "mulsi3", "*mulsi3"): Add an expander and
diff --git a/gcc/config/s390/s390-builtins.def 
b/gcc/config/s390/s390-builtins.def
index 27cb6a8..bb2d743 100644
--- a/gcc/config/s390/s390-builtins.def
+++ b/gcc/config/s390/s390-builtins.def
@@ -2496,7 +2496,7 @@ B_DEF  (s390_vec_ctsl,  vec_ctsl, 
  0,
 B_DEF  (s390_vec_ctul,  vec_ctul,   0, 
 B_VX,   O2_U3,  BT_FN_UV2DI_V2DF_INT)  
  /* vclgdb */
 B_DEF  (s390_vcgdb, vec_df_to_di_s64,   0, 
 B_VX,   O2_U3,  BT_FN_V2DI_V2DF_INT)   
  /* vcgdb */
 B_DEF  (s390_vclgdb,vec_df_to_di_u64,   0, 
 B_VX,   O2_U3,  BT_FN_UV2DI_V

C++ PATCH for c++/77339, ICE with invalid use of alias template

2017-03-24 Thread Jason Merrill

Here, we were seeing that A:: is a template specialization, so we
tried to treat it as entering the scope of A.  But since A is an
alias template, that doesn't make sense.  Fixed by not considering
alias templates for this treatment.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 7fde9bb020c7be786c791e485261e28e7b0d3ccf
Author: Jason Merrill 
Date:   Thu Mar 23 16:21:34 2017 -0400

PR c++/77339 - ICE with invalid use of alias template.

* pt.c (lookup_template_class_1): Don't try to enter the scope of an
alias template.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 5259dad..cbe8082 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8682,9 +8682,9 @@ lookup_template_class_1 (tree d1, tree arglist, tree 
in_decl, tree context,
  || !PRIMARY_TEMPLATE_P (gen_tmpl)
  || currently_open_class (template_type))
{
- tree tinfo = TYPE_TEMPLATE_INFO_MAYBE_ALIAS (template_type);
+ tree tinfo = TYPE_TEMPLATE_INFO (template_type);
 
- if (comp_template_args (TI_ARGS (tinfo), arglist))
+ if (tinfo && comp_template_args (TI_ARGS (tinfo), arglist))
return template_type;
}
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-58.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-58.C
new file mode 100644
index 000..0ae1c49
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-58.C
@@ -0,0 +1,7 @@
+// PR c++/77339
+// { dg-do compile { target c++11 } }
+
+template < typename > using A = int;
+
+//OK: template < typename X > A < X > a; 
+template < typename X > A < X >::a; // { dg-error "" }

Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-03-24 Thread Martin Sebor


On 03/24/2017 03:29 AM, Martin Liška wrote:

I would like to ping that. I'm not sure what's agreement after I read
discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html

Martin Sebor may know, CC'ing him.


I haven't seen any followup posts after the one above.  The other
(and I think more important point, one with a bearing on the whole
patch series) to consider is the one made by translator Frederic
Marchal in comment #2 in bug 80055.

Martin

Re: C++ PATCH to fix ICE in replace_placeholders_r (PR c++/79937)

2017-03-24 Thread Marek Polacek

On Thu, Mar 23, 2017 at 05:09:58PM -0400, Jason Merrill wrote:
> On Thu, Mar 23, 2017 at 4:34 PM, Marek Polacek  wrote:
> > On Tue, Mar 14, 2017 at 02:34:30PM -0400, Jason Merrill wrote:
> >> On Tue, Mar 14, 2017 at 2:33 PM, Jason Merrill  wrote:
> >> > On Tue, Mar 7, 2017 at 12:10 PM, Marek Polacek  
> >> > wrote:
> >> >> In this testcase we have
> >> >> C c = bar (X{1});
> >> >> which store_init_value sees as
> >> >> c = TARGET_EXPR  >> >> .n=(&)->i}>)>
> >> >> i.e. we're initializing "c" with a TARGET_EXPR.  We call 
> >> >> replace_placeholders
> >> >> that walks the whole tree to substitute the placeholders.  Eventually 
> >> >> we find
> >> >> the nested  but that's for another object, 
> >> >> so we
> >> >> crash.  Seems that we shouldn't have stepped into the second 
> >> >> TARGET_EXPR at
> >> >> all; it has nothing to with "c", it's bar's argument.
> >> >>
> >> >> It occurred to me that we shouldn't step into CALL_EXPRs and leave the
> >> >> placeholders in function arguments to cp_gimplify_init_expr which calls
> >> >> replace_placeholders for constructors.  Not sure if it's enough to 
> >> >> handle
> >> >> CALL_EXPRs like this, anything else?
> >> >
> >> > Hmm, we might have a DMI containing a call with an argument referring
> >> > to *this, i.e.
> >> >
> >> > struct A
> >> > {
> >> >   int i;
> >> >   int j = frob (this->i);
> >> > };
> >> >
> >> > The TARGET_EXPR seems like a more likely barrier, but even there we
> >> > could have something like
> >> >
> >> > struct A { int i; };
> >> > struct B
> >> > {
> >> >   int i;
> >> >   A a = A{this->i};
> >> > };
> >> >
> >> > I think we need replace_placeholders to keep a stack of objects, so
> >> > that when we see a TARGET_EXPR we add it to the stack and therefore
> >> > can properly replace a PLACEHOLDER_EXPR of its type.
> >>
> >> Or actually, avoid replacing such a PLACEHOLDER_EXPR, but rather leave
> >> it for later when we lower the TARGET_EXPR.
> >
> > Sorry, I don't really follow.  I have a patch that puts TARGET_EXPRs on
> > a stack, but I don't know how that helps.  E.g. with nsdmi-aggr3.C
> > we have
> > B b = TARGET_EXPR  > &>}>
> > so when we get to that PLACEHOLDER_EXPR, on the stack there's
> > TARGET_EXPR with type struct A
> > TARGET_EXPR with type struct B
> > so the type of the PLACEHOLDER_EXPR doesn't match the type of the current
> > TARGET_EXPR, but we still want to replace it in this case.
> >
> > So -- could you expand a bit on what you had in mind, please?
> 
> So then when we see a placeholder, we walk the stack to find the
> object of the matching type.
> 
> But if the object we find was collected from walking through a
> TARGET_EXPR, we should leave the PLACEHOLDER_EXPR alone, so that it
> can be replaced later with the actual target of the initialization.

Unfortunately, I still don't understand; guess I'll have to drop this PR.

With this we put TARGET_EXPRs on a stack, and then when we find a
PLACEHOLDER_EXPR we walk the stack to find a TARGET_EXPR of the same type as
the PLACEHOLDER_EXPR.  There are three simplified examples I've been playing
with:

  B b = T_E >}>

  - here we should replace the P_E; on the stack there are two
TARGET_EXPRs of types B and A

  C c = T_E >)>

  - here we shouldn't replace the P_E; on the stack there are two
TARGET_EXPRs of types X and C

  B b = T_E }}>

  - here we should replace the P_E; on the stack there's one TARGET_EXPR
of type B

In each case we find a TARGET_EXPR of the type of the PLACEHOLDER_EXPR, but I
don't see how to decide which PLACEHOLDER_EXPR we should let slide.  Sorry for
being dense...

diff --git gcc/cp/tree.c gcc/cp/tree.c
index 2757af6..2439a00 100644
--- gcc/cp/tree.c
+++ gcc/cp/tree.c
@@ -2741,8 +2741,12 @@ build_ctor_subob_ref (tree index, tree type, tree obj)
 
 struct replace_placeholders_t
 {
-  tree obj;/* The object to be substituted for a PLACEHOLDER_EXPR.  */
-  bool seen;   /* Whether we've encountered a PLACEHOLDER_EXPR.  */
+  /* The object to be substituted for a PLACEHOLDER_EXPR.  */
+  tree obj;
+  /* Whether we've encountered a PLACEHOLDER_EXPR.  */
+  bool seen;
+  /* A stack of TARGET_EXPRs we've found ourselves in.  */
+  vec target_expr_stack;
 };
 
 /* Like substitute_placeholder_in_expr, but handle C++ tree codes and
@@ -2762,14 +2766,35 @@ replace_placeholders_r (tree* t, int* walk_subtrees, 
void* data_)
 
   switch (TREE_CODE (*t))
 {
+case TARGET_EXPR:
+  d->target_expr_stack.safe_push (*t);
+  cp_walk_tree (&TARGET_EXPR_INITIAL (*t), replace_placeholders_r, data_,
+   NULL);
+  d->target_expr_stack.pop ();
+  *walk_subtrees = 0;
+  break;
+
 case PLACEHOLDER_EXPR:
   {
-   tree x = obj;
-   for (; !(same_type_ignoring_top_level_qualifiers_p
-(TREE_TYPE (*t), TREE_TYPE (x)));
-x = TREE_OPERAND (x, 0))
- gcc_assert (TREE_CODE (x) == COMPONENT_REF);
-   *t = x;
+   bool skip_it = false;
+   unsigned ix;
+

Re: [testsuite] Add missing dg-require-effective-target alloca to gcc testsuite

2017-03-24 Thread Mike Stump

On Mar 24, 2017, at 5:58 AM, Rainer Orth  wrote:
> -  { /* { dg-warning "statement may fall through" "" { target c } 23 } */
> - int a[i]; /* { dg-warning "statement may fall through" "" { target c++ 
> } 24 } */
> +  { /* { dg-warning "statement may fall through" "" { target c } 24 } */
> + int a[i]; /* { dg-warning "statement may fall through" "" { target c++ 
> } 25 } */

Any reason to not use relative line numbers?  See testsuite/gcc.dg/dg-test-1.c 
for a template for how to do it.  I think they should work and be better.  
Indeed, all line numbers should be relative, generally speaking.

Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-24 Thread Jeff Law


On 03/24/2017 05:54 AM, Richard Biener wrote:

On Fri, Mar 24, 2017 at 10:39 AM, Segher Boessenkool
 wrote:

On Fri, Mar 24, 2017 at 09:13:59AM +0100, Richard Biener wrote:

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01485.html

?


What about it?  That suggestion would make fwprop do *less* useful work,
while in principle the problem is that it *already* does not enough!


Ok, not knowing a lot about fwprop I take your word for it (but GIMPLE level
forwprop works that way and certainly you can't substitute a def into a use
that is before the def).


When fwprop has done a propagation it makes fresh new uses for all the
sources of the resulting insn, which it adds at the end of the table.
It doesn't reuse the original uses.


Yes, that's what I suspected.


If fwprop actually tried all propagations (and never tried substituting
X with X, which it currently does), there would be no problem.


To me it looked like fwprop tries to iterate over all propagations but
it iterates over a changing data structure which is why it "oscillates".
But I didn't actually verify that theory (in fact, I just very much disliked
Berns patch give its compile-time cost).


We have, in order, in one bb:

  B = A|Z
  A = B
  D = A

where each of those is the only set of its dest.  Now the first thing
tried is propagating the first into the second.  This fails.  Then,
Z is found to be zero, so we get

  B = A
  A = B
  D = A

If the first was now propagated into the second again, all is fine
(all three vars are set to A).  But this is not tried again.  The
second is propagated into the third, giving

  B = A
  A = B
  D = B

and then the first is propagated into the third, and we repeat
forever.


This patch is a workaround; it makes no difference on pretty much any
code, except this single testcase (which has undefined behaviour,
uninitialised variables).


Yeah, your fix looks similar to the other hack I suggested.


I have implemented the "retry things" as well fwiw, but a) it is too
big and invasive for stage 4, and b) it kind of sucks, needs more
work, even more invasive.  The workaround is cheap and solves the
immediate problem.


Agreed, still iterating over the DF uses in the first place looks like
the bug (given this "all uses" data structure changes during propagation!).

I'd have done

  for (BBs in RPO order)
for (insn in BB)
repeat:
  for (use in insn)
 if (propagate_into (use))
   goto repeat;

I wonder why it wasn't implemented like this in the first place.

ISTM that with the exception of loops that it ought to give the same 
final result.  For loops, I wouldn't expect opportunities along 
backedges to be all that common.


It also seems to me that we ought to be able to verify if the different 
order changes things.


Just to be clear, I can live with either approach.  I would even be 
option to Segher's as a stopgap and yours for gcc-8.


jeff

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jeff Law


On 03/24/2017 07:35 AM, Jakub Jelinek wrote:

On Fri, Mar 24, 2017 at 02:04:42PM +0100, Jakub Jelinek wrote:

On Fri, Mar 24, 2017 at 06:37:10AM -0600, Jeff Law wrote:

2017-03-21  Jakub Jelinek  

PR target/80102
* cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
and non-/f instructions.  If both i1 and i2 are frame related,
verify all CFA notes, their order and content.

* g++.dg/opt/pr80102.C: New test.

Presumably this didn't ICE at some point in the past, so it's a regression?
(it's not marked as such in the BZ).


It doesn't ICE for me with r238210 and ICEs with current trunk, I don't have
too many ppc64le compilers around though.


GCC 4.8 doesn't ICE either.
ISTM this isn't a regression (without doing deeper analysis).  As a 
release manager, ISTM you can grant an exception if you want to push 
this forward for gcc-7.


Jeff

Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-03-24 Thread Jeff Law


On 03/24/2017 09:37 AM, Martin Sebor wrote:

On 03/24/2017 03:29 AM, Martin Liška wrote:

I would like to ping that. I'm not sure what's agreement after I read
discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html

Martin Sebor may know, CC'ing him.


I haven't seen any followup posts after the one above.  The other
(and I think more important point, one with a bearing on the whole
patch series) to consider is the one made by translator Frederic
Marchal in comment #2 in bug 80055.
It's on the list of things to evaluate.  The non-gcc stuff on my plate 
is just about wrapped up, at which point I can give more attention to 
gcc stuff.



Jeff

Re: [PATCH 1/5] [ARC] Save/restore blink when in ISR.

2017-03-24 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-03-20 12:43:26 
+0100]:

> BLIBK register needs to be saved/restored in a interrupt. Fix this issue.
> 
> gcc/
> 2016-09-21  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (arc_epilogue_uses): BLINK should be also
>   restored when in interrupt.
>   * config/arc/arc.md (simple_return): ARCv2 rtie instruction
>   doesn't have delay slot.

Looks good thanks,
Andrew



> 
> gcc/testsuite/
> 2016-09-21  Claudiu Zissulescu  
> 
>   * gcc.target/arc/interrupt-4.c: New file.
> ---
>  gcc/config/arc/arc.c   | 10 ++
>  gcc/config/arc/arc.md  |  7 ++-
>  gcc/testsuite/gcc.target/arc/interrupt-4.c | 15 +++
>  3 files changed, 27 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-4.c
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 556b587..50bfa11 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -9513,9 +9513,10 @@ arc_can_follow_jump (const rtx_insn *follower, const 
> rtx_insn *followee)
> Return true if REGNO should be added to the deemed uses of the epilogue.
>  
> We use the return address
> -   arc_return_address_regs[arc_compute_function_type (cfun)] .
> -   But also, we have to make sure all the register restore instructions
> -   are known to be live in interrupt functions.  */
> +   arc_return_address_regs[arc_compute_function_type (cfun)].  But
> +   also, we have to make sure all the register restore instructions
> +   are known to be live in interrupt functions, plus the blink
> +   register if it is clobbered by the isr.  */
>  
>  bool
>  arc_epilogue_uses (int regno)
> @@ -9528,7 +9529,8 @@ arc_epilogue_uses (int regno)
>   {
> if (!fixed_regs[regno])
>   return true;
> -   return regno == arc_return_address_regs[cfun->machine->fn_type];
> +   return ((regno == arc_return_address_regs[cfun->machine->fn_type])
> +   || (regno == RETURN_ADDR_REGNUM));
>   }
>else
>   return regno == RETURN_ADDR_REGNUM;
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index a06c2ed..b912bd4 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -4748,7 +4748,12 @@
>output_asm_insn (\"j%!%* [%0]%&\", ®);
>return \"\";
>  }
> -  [(set_attr "type" "return")
> +  [(set (attr "type")
> + (cond [(and (eq (symbol_ref "arc_compute_function_type (cfun)")
> + (symbol_ref "ARC_FUNCTION_ILINK1"))
> + (match_test "TARGET_V2"))
> +(const_string "brcc_no_delay_slot")]
> +   (const_string "return")))
> ; predicable won't help here since the canonical rtl looks different
> ; for branches.
> (set (attr "cond")
> diff --git a/gcc/testsuite/gcc.target/arc/interrupt-4.c 
> b/gcc/testsuite/gcc.target/arc/interrupt-4.c
> new file mode 100644
> index 000..ea6596e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/interrupt-4.c
> @@ -0,0 +1,15 @@
> +#if defined (__ARCHS__) || defined (__ARCEM__)
> +#define RILINK "ilink"
> +#else
> +#define RILINK "ilink1"
> +#endif
> +
> +extern int gpio_int;
> +extern int read_reg (int);
> +
> +void __attribute__ ((interrupt(RILINK)))
> +isr_handler (void)
> +{
> +  gpio_int = read_reg (1);
> +}
> +/* { dg-final { scan-assembler-times "blink" 2 } } */
> -- 
> 1.9.1
>

Re: [PATCH 2/5] [ARC] Fix detection of long immediate for load/store operands.

2017-03-24 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-03-20 12:43:27 
+0100]:

> ARC can use scaled offsets when loading (i.e. ld.as rA,[base,
> offset]).  Where base and offset can be a register or an immediate
> operand.  The scaling only applies on the offset part of the
> instruction.  The compiler can accept an address like this:
> 
> (plus:SI (mult:SI (reg:SI 2 r2 [orig:596 _2129 ] [596])
> (const_int 4 [0x4]))
>(const_int 60 [0x3c]))
> 
> Hence, to emit this instruction we place the (const_int 60) into base
> and the register into offset to take advantage of the scaled offset
> facility of the load instruction.  As a result the length of the load
> instruction is 8 bytes.  However, the long_immediate_loadstore_operand
> predicate used for calculating the length attribute doesn't recognize
> this address and returns a wrong decision leading to a wrong length
> computation for a load instruction using the above address.
> 
> gcc/
> 2016-09-21  Claudiu Zissulescu  
> 
>   * config/arc/predicates.md (long_immediate_loadstore_operand):
>   Consider scaled addresses cases.

Looks good thanks,
Andrew



> ---
>  gcc/config/arc/predicates.md | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
> index 0dec736..8dd8d55 100644
> --- a/gcc/config/arc/predicates.md
> +++ b/gcc/config/arc/predicates.md
> @@ -148,6 +148,11 @@
>{
>   rtx x = XEXP (op, 1);
>  
> + if ((GET_CODE (XEXP (op, 0)) == MULT)
> + && REG_P (XEXP (XEXP (op, 0), 0))
> + && CONSTANT_P (x))
> +   return 1;
> +
>   if (GET_CODE (x) == CONST)
> {
>   x = XEXP (x, 0);
> -- 
> 1.9.1
>

Re: [PATCH 3/5] [ARC] Disable TP register when building for bare metal.

2017-03-24 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-03-20 12:43:28 
+0100]:

> No need for thread pointer in bare metal toolchain. Use TP register normally.
> 
> gcc/
> 2016-09-29  Claudiu Zissulescu  
> 
>   * config/arc/elf.h (ARGET_ARC_TP_REGNO_DEFAULT): Define.
>   * config/arc/linux.h (ARGET_ARC_TP_REGNO_DEFAULT): Likewise.
>   * config/arc/arc.opt (mtp-regno): Use
>   * ARGET_ARC_TP_REGNO_DEFAULT.

Looks fine.  I've suggested some alternative phrasing for the comments
to make them read a little easier.

Thanks,
Andrew



> ---
>  gcc/config/arc/arc.opt | 2 +-
>  gcc/config/arc/elf.h   | 5 +
>  gcc/config/arc/linux.h | 4 
>  3 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
> index 17af736..6060ded 100644
> --- a/gcc/config/arc/arc.opt
> +++ b/gcc/config/arc/arc.opt
> @@ -469,7 +469,7 @@ EnumValue
>  Enum(arc_fpu) String(fpud_all) Value(FPU_FPUD_ALL)
>  
>  mtp-regno=
> -Target RejectNegative Joined UInteger Var(arc_tp_regno) Init(25)
> +Target RejectNegative Joined UInteger Var(arc_tp_regno) 
> Init(TARGET_ARC_TP_REGNO_DEFAULT)
>  Specify thread pointer register number.
>  
>  mtp-regno=none
> diff --git a/gcc/config/arc/elf.h b/gcc/config/arc/elf.h
> index d2106c5..a6d6c05 100644
> --- a/gcc/config/arc/elf.h
> +++ b/gcc/config/arc/elf.h
> @@ -53,3 +53,8 @@ along with GCC; see the file COPYING3.  If not see
>  #  define MULTILIB_DEFAULTS { "mcpu=" ARC_MULTILIB_CPU_DEFAULT }
>  # endif
>  #endif
> +
> +/* Bare-metal toolchains are not having a thread pointer register
> +   set.  */

/* Bare-metal toolchains do not need a thread pointer register.  */

> +#undef TARGET_ARC_TP_REGNO_DEFAULT
> +#define TARGET_ARC_TP_REGNO_DEFAULT -1
> diff --git a/gcc/config/arc/linux.h b/gcc/config/arc/linux.h
> index 10c291c..1d10357 100644
> --- a/gcc/config/arc/linux.h
> +++ b/gcc/config/arc/linux.h
> @@ -74,3 +74,7 @@ along with GCC; see the file COPYING3.  If not see
>  /* We do not have any MULTILIB_OPTIONS specified, so there are no
> MULTILIB_DEFAULTS.  */
>  #undef  MULTILIB_DEFAULTS
> +
> +/* Linux toolchains are using r25 as TLS register.  */

/* Linux toolchains use r25 as the thread pointer register.  */

> +#undef TARGET_ARC_TP_REGNO_DEFAULT
> +#define TARGET_ARC_TP_REGNO_DEFAULT 25
> -- 
> 1.9.1
>

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jeff Law


On 03/24/2017 07:04 AM, Jakub Jelinek wrote:

On Fri, Mar 24, 2017 at 06:37:10AM -0600, Jeff Law wrote:

2017-03-21  Jakub Jelinek  

PR target/80102
* cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
and non-/f instructions.  If both i1 and i2 are frame related,
verify all CFA notes, their order and content.

* g++.dg/opt/pr80102.C: New test.

Presumably this didn't ICE at some point in the past, so it's a regression?
(it's not marked as such in the BZ).


It doesn't ICE for me with r238210 and ICEs with current trunk, I don't have
too many ppc64le compilers around though.


+  /* If both i1 and i2 are frame related, verify all the CFA notes
+ in the same order and with the same content.  */
+  if (RTX_FRAME_RELATED_P (i1))
+{
+  static enum reg_note cfa_note_kinds[] = {
+   REG_FRAME_RELATED_EXPR, REG_CFA_DEF_CFA, REG_CFA_ADJUST_CFA,
+   REG_CFA_OFFSET, REG_CFA_REGISTER, REG_CFA_EXPRESSION,
+   REG_CFA_VAL_EXPRESSION, REG_CFA_RESTORE, REG_CFA_SET_VDRAP,
+   REG_CFA_TOGGLE_RA_MANGLE, REG_CFA_WINDOW_SAVE, REG_CFA_FLUSH_QUEUE
+  };

ISTM this could get out of date very easily.  Is there a clean way to
generate the array of cfa notes as we build up the notes from reg-notes.def?


We could e.g.
#ifndef REG_CFA_NOTE
# define REG_CFA_NOTE(NAME) REG_NOTE(NAME)
#endif
and then
REG_CFA_NOTE (FRAME_RELATED_EXPR)
etc. in reg-notes.def (and document that REG_CFA_NOTE should be used for
notes related to CFA).
Then in cfgcleanups.c we could just
#undef REG_CFA_NOTE
#define DEF_REG_NOTE(NAME)
#define REG_CFA_NOTE(NAME) REG_##NAME,
#include "reg-notes.def"
#undef DEF_REG_NOTE
#undef REG_CFA_NOTE
to populate the cfa_note_kinds array.
Something like that seems preferable -- I think we're a lot more likely 
to catch the need to use REG_CFA_NOTE when defining the notes in 
reg-notes.def than we are to remember to update an array in a different 
file.


jeff

[New file] Add testcase to ensure that #pragma GCC diagnostic push/pop works with -Wtraditional.

2017-03-24 Thread Eric Gallager

The attached test case failed with gcc 4.9 and older, but started
compiling successfully with only the 1 expected warning with gcc 5.
Adding it to the test suite would ensure that this behavior doesn't
regress. Note that I have only tested it by compiling it manually, and
not by actually running it as part of the entire test suite, so please
let me know if I got any of the dejagnu directives wrong.

Thanks,
Eric Gallager

gcc/testsuite/ChangeLog:

2017-03-24  Eric Gallager  

* gcc.dg/pragma-diag-7.c: New test.
/* { dg-do compile } */

unsigned long ok = 0UL;
#pragma GCC diagnostic push
#pragma GCC diagnostic warning "-Wtraditional"
unsigned long bad = 1UL; /* { dg-warning "suffix" } */
/* Note the extra space before the pragma on this next line: */
 #pragma GCC diagnostic pop
unsigned long ok_again = 2UL; /* { dg-bogus "suffix" } */

Re: [PATCH 4/5] [ARC] Fix divdf3 emulation for arcem.

2017-03-24 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-03-20 12:43:29 
+0100]:

> Missing case for ARCEM cpus. Add it.
> 
> libgcc/
> 2016-09-29  Claudiu Zissulescu  
> 
>   * config/arc/ieee-754/divdf3.S (__divdf3): Use __ARCEM__.

Looks good, thanks,
Andrew


> ---
>  libgcc/config/arc/ieee-754/divdf3.S | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/libgcc/config/arc/ieee-754/divdf3.S 
> b/libgcc/config/arc/ieee-754/divdf3.S
> index 4d6aae2..b8085a6 100644
> --- a/libgcc/config/arc/ieee-754/divdf3.S
> +++ b/libgcc/config/arc/ieee-754/divdf3.S
> @@ -189,13 +189,13 @@ __divdf3:
>   asl r8,DBL1H,12
>   lsr r12,DBL1L,20
>   lsr r4,r8,26
> -#ifdef __HS__
> +#if defined (__ARCHS__) || defined (__ARCEM__)
>   add3 r10,pcl,60 ; (.Ldivtab-.) >> 3
>  #else
>   add3 r10,pcl,59 ; (.Ldivtab-.) >> 3
>  #endif
>   ld.as r4,[r10,r4]
> -#ifdef __HS__
> +#if defined (__ARCHS__) || defined (__ARCEM__)
>   ld.as r9,[pcl,182]; [pcl,(-((.-.L7ff0) >> 2))] ; 0x7ff0
>  #else
>   ld.as r9,[pcl,180]; [pcl,(-((.-.L7ff0) >> 2))] ; 0x7ff0
> @@ -299,14 +299,14 @@ __divdf3:
>   rsub r7,r6,5
>   asr r10,r12,28
>   bmsk r4,r12,27
> -#ifdef __HS__
> +#if defined (__ARCHS__) || defined (__ARCEM__)
>   min  r7, r7, 31
>   asr  DBL0L, r4, r7
>  #else
>   asrs DBL0L,r4,r7
>  #endif
>   add DBL1H,r11,r10
> -#ifdef __HS__
> +#if defined (__ARCHS__) || defined (__ARCEM__)
>   abs.f r10, r4
>   sub.mi r10, r10, 1
>  #endif
> -- 
> 1.9.1
>

Re: [PATCH 5/5] [ARC] Fix move_double_src_operand predicate.

2017-03-24 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-03-20 12:43:30 
+0100]:

> Durring compilation process, (subreg (mem ...) ...) can occur. Hence,
> we need to check if the address of mem is a valid one. This patch is
> fixing this check by directly calling the address_operand, instead of
> calling move_double_src_operand, as the latter is always checking
> against the original mode, thus, returning false when the inner and
> outer modes are different.
> 
> gcc/
> 2016-10-07  Claudiu Zissulescu  
> 
>   * config/arc/predicates.md (move_double_src_operand): Replace the
>   call to move_double_src_operand with a call to
>   address_operand.

Sounds good, thanks,
Andrew


> ---
>  gcc/config/arc/predicates.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
> index 8dd8d55..9e60cb7 100644
> --- a/gcc/config/arc/predicates.md
> +++ b/gcc/config/arc/predicates.md
> @@ -318,7 +318,7 @@
>/* (subreg (mem ...) ...) can occur here if the inner part was once a
>pseudo-reg and is now a stack slot.  */
>if (GET_CODE (SUBREG_REG (op)) == MEM)
> - return move_double_src_operand (SUBREG_REG (op), mode);
> + return address_operand (XEXP (SUBREG_REG (op), 0), mode);
>else
>   return register_operand (op, mode);
>  case MEM :
> -- 
> 1.9.1
>

patch to fix PR80148

2017-03-24 Thread Vladimir Makarov


  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80148

  The test file is too big so there is no a testcase.

  The patch was successfully bootstrapped and tested on x86-64.

  Committed as rev. 246467.

Index: ChangeLog
===
--- ChangeLog	(revision 246466)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2017-03-24  Vladimir Makarov  
+
+	PR target/80148
+	* lra-assigns.c (assign_by_spills): Add spilled non-reload pseudos
+	to consider in curr_insn_transform.
+
 2017-03-24  Jakub Jelinek  
 
 	* genrecog.c (validate_pattern): Add VEC_SELECT validation.
Index: lra-assigns.c
===
--- lra-assigns.c	(revision 246466)
+++ lra-assigns.c	(working copy)
@@ -1507,6 +1507,14 @@ assign_by_spills (void)
 	  sorted_pseudos[nfails++] = conflict_regno;
 	  former_reload_pseudo_spill_p = true;
 	}
+	  else
+	/* It is better to do reloads before spilling as after the
+	   spill-subpass we will reload memory instead of pseudos
+	   and this will make reusing reload pseudos more
+	   complicated.  Going directly to the spill pass in such
+	   case might result in worse code performance or even LRA
+	   cycling if we have few registers.  */
+	bitmap_set_bit (&all_spilled_pseudos, conflict_regno);
 	  if (lra_dump_file != NULL)
 	fprintf (lra_dump_file, "	  Spill %s r%d(hr=%d, freq=%d)\n",
 		 pseudo_prefix_title (conflict_regno), conflict_regno,

[C/C++ PATCH] Fix ICEs with vector subscripting of register variables (PR middle-end/80162)

2017-03-24 Thread Jakub Jelinek

Hi!

c*_mark_addressable doesn't look through VIEW_CONVERT_EXPRs that
vector subscripts are turned into, which means we don't diagnose
taking address of e.g. a vector element in a hard register.

On the other side, I think we want to support just normal vector
subscripting of vectors in hard registers, that can be expanded
into e.g. vector shift etc.

So, this patch handles specially c*_mark_addressable calls from
array_ref building, otherwise diagnoses taking addresses of
vector elements in hard registers.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-24  Jakub Jelinek  

PR middle-end/80162
c-common/
* c-common.c (c_common_mark_addressable_vec): Don't set
TREE_ADDRESSABLE on DECL_HARD_REGISTER.
c/
* c-tree.h (c_mark_addressable): Add array_ref_p argument.
* c-typeck.c (c_mark_addressable): Likewise.  Look through
VIEW_CONVERT_EXPR unless array_ref_p and VCE is from VECTOR_TYPE
to ARRAY_TYPE.
(build_array_ref): Pass true as array_ref_p to c_mark_addressable.
cp/
* cp-tree.h (cxx_mark_addressable): Add array_ref_p argument.
* typeck.c (cxx_mark_addressable): Likewise.  Look through
VIEW_CONVERT_EXPR unless array_ref_p and VCE is from VECTOR_TYPE
to ARRAY_TYPE.
(cp_build_array_ref): Pass true as array_ref_p to cxx_mark_addressable.
testsuite/
* c-c++-common/pr80162-1.c: New test.
* c-c++-common/pr80162-2.c: New test.
* c-c++-common/pr80162-3.c: New test.

--- gcc/c-family/c-common.c.jj  2017-03-22 19:31:40.0 +0100
+++ gcc/c-family/c-common.c 2017-03-24 10:20:13.340088654 +0100
@@ -6542,7 +6542,8 @@ c_common_mark_addressable_vec (tree t)
   && TREE_CODE (t) != PARM_DECL
   && TREE_CODE (t) != COMPOUND_LITERAL_EXPR)
 return;
-  TREE_ADDRESSABLE (t) = 1;
+  if (!VAR_P (t) || !DECL_HARD_REGISTER (t))
+TREE_ADDRESSABLE (t) = 1;
 }
 
 
--- gcc/c/c-tree.h.jj   2017-03-23 15:49:55.0 +0100
+++ gcc/c/c-tree.h  2017-03-24 11:12:19.877403779 +0100
@@ -615,7 +615,7 @@ extern int same_translation_unit_p (cons
 extern int comptypes (tree, tree);
 extern int comptypes_check_different_types (tree, tree, bool *);
 extern bool c_vla_type_p (const_tree);
-extern bool c_mark_addressable (tree);
+extern bool c_mark_addressable (tree, bool = false);
 extern void c_incomplete_type_error (location_t, const_tree, const_tree);
 extern tree c_type_promotes_to (tree);
 extern struct c_expr default_function_array_conversion (location_t,
--- gcc/c/c-typeck.c.jj 2017-03-21 07:55:51.0 +0100
+++ gcc/c/c-typeck.c2017-03-24 11:16:16.725391687 +0100
@@ -2654,7 +2654,7 @@ build_array_ref (location_t loc, tree ar
  || (COMPLETE_TYPE_P (TREE_TYPE (TREE_TYPE (array)))
  && TREE_CODE (TYPE_SIZE (TREE_TYPE (TREE_TYPE (array != 
INTEGER_CST))
{
- if (!c_mark_addressable (array))
+ if (!c_mark_addressable (array, true))
return error_mark_node;
}
   /* An array that is indexed by a constant value which is not within
@@ -4755,16 +4755,26 @@ lvalue_or_else (location_t loc, const_tr
 
 /* Mark EXP saying that we need to be able to take the
address of it; it should not be allocated in a register.
-   Returns true if successful.  */
+   Returns true if successful.  ARRAY_REF_P is true if this
+   is for ARRAY_REF construction - in that case we don't want
+   to look through VIEW_CONVERT_EXPR from VECTOR_TYPE to ARRAY_TYPE,
+   it is fine to use ARRAY_REFs for vector subscripts on vector
+   register variables.  */
 
 bool
-c_mark_addressable (tree exp)
+c_mark_addressable (tree exp, bool array_ref_p)
 {
   tree x = exp;
 
   while (1)
 switch (TREE_CODE (x))
   {
+  case VIEW_CONVERT_EXPR:
+   if (array_ref_p
+   && TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE
+   && TREE_CODE (TREE_TYPE (TREE_OPERAND (x, 0))) == VECTOR_TYPE)
+ return true;
+   /* FALLTHRU */
   case COMPONENT_REF:
   case ADDR_EXPR:
   case ARRAY_REF:
--- gcc/cp/cp-tree.h.jj 2017-03-23 15:49:56.0 +0100
+++ gcc/cp/cp-tree.h2017-03-24 11:23:23.828878855 +0100
@@ -6715,7 +6715,7 @@ extern void cxx_print_error_function  (d
 struct diagnostic_info *);
 
 /* in typeck.c */
-extern bool cxx_mark_addressable   (tree);
+extern bool cxx_mark_addressable   (tree, bool = false);
 extern int string_conv_p   (const_tree, const_tree, int);
 extern tree cp_truthvalue_conversion   (tree);
 extern tree condition_conversion   (tree);
--- gcc/cp/typeck.c.jj  2017-03-19 11:57:14.0 +0100
+++ gcc/cp/typeck.c 2017-03-24 11:25:51.997956658 +0100
@@ -3217,7 +3217,7 @@ cp_build_array_ref (location_t loc, tree
  && (TREE_CODE (TYPE_SIZE (TREE_TYPE (TREE_TYPE (array
  != INTEGER_CST)))
{
- i

[PATCH] Fix expansion of initializer extensions (PR c/80163)

2017-03-24 Thread Jakub Jelinek

Hi!

As can be seen on the following testcase, when expanding an extension
in EXPAND_INITIALIZER context, we emit wrong extension operation
(one depending on the signedness of the result type rather than
on the signedness of the argument type, so e.g. extension of
unsigned int to long long int is done using SIGN_EXTEND instead of
ZERO_EXTEND, and extension of int to unsigned long long int using
ZERO_EXTEND instead of SIGN_EXTEND.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2017-03-24  Jakub Jelinek  

PR c/80163
* expr.c : For EXPAND_INITIALIZER determine SIGN_EXTEND
vs. ZERO_EXTEND based on signedness of treeop0's type rather than
signedness of the result type.

* gcc.dg/torture/pr80163.c: New test.

--- gcc/expr.c.jj   2017-03-07 09:04:04.0 +0100
+++ gcc/expr.c  2017-03-24 12:09:57.729854079 +0100
@@ -8333,7 +8333,8 @@ expand_expr_real_2 (sepops ops, rtx targ
}
 
   else if (modifier == EXPAND_INITIALIZER)
-   op0 = gen_rtx_fmt_e (unsignedp ? ZERO_EXTEND : SIGN_EXTEND, mode, op0);
+   op0 = gen_rtx_fmt_e (TYPE_UNSIGNED (TREE_TYPE (treeop0))
+? ZERO_EXTEND : SIGN_EXTEND, mode, op0);
 
   else if (target == 0)
op0 = convert_to_mode (mode, op0,
--- gcc/testsuite/gcc.dg/torture/pr80163.c.jj   2017-03-24 12:19:21.363603066 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr80163.c  2017-03-24 12:19:09.0 
+0100
@@ -0,0 +1,35 @@
+/* PR c/80163 */
+/* { dg-do compile { target int128 } } */
+
+volatile int v;
+
+__attribute__((noinline, noclone)) void
+bar (void)
+{
+  v++;
+  asm volatile ("" : : : "memory");
+}
+
+__attribute__((noinline, noclone)) __int128_t *
+foo (unsigned long **p)
+{
+a:
+  bar ();
+b:
+  bar ();
+  static __int128_t d = (unsigned long) &&a - (unsigned long) &&b;
+  static unsigned long e = (unsigned long) &&a - (unsigned long) &&b;
+  *p = &e;
+  return &d;
+}
+
+int
+main ()
+{
+  __int128_t *p;
+  unsigned long *q;
+  p = foo (&q);
+  if (*p != *q)
+__builtin_abort ();
+  return 0;
+}

Jakub

[patch] have "would be stringified in traditional C" warning in libcpp/macro.c be controlled by -Wtraditional

2017-03-24 Thread Eric Gallager

It seemed odd to me that gcc was issuing a warning about compatibility
with traditional C that I couldn't turn off by pushing/popping
-Wtraditional over the problem area, so I made the attached (minor)
patch to fix it. Survives bootstrap, but the only testing I've done
with it has been compiling the one file that was giving me issues
previously, which I'd need to reduce further to turn it into a proper
test case.

Thanks,
Eric Gallager

libcpp/ChangeLog:

2017-03-24  Eric Gallager  

* macro.c (check_trad_stringification): Have warning be controlled by
-Wtraditional.
 libcpp/macro.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcpp/macro.c b/libcpp/macro.c
index de18c2210cf..5c0de2e4bab 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -3316,7 +3316,7 @@ check_trad_stringification (cpp_reader *pfile, const 
cpp_macro *macro,
  if (NODE_LEN (node) == len
  && !memcmp (p, NODE_NAME (node), len))
{
- cpp_error (pfile, CPP_DL_WARNING,
+ cpp_warning (pfile, CPP_W_TRADITIONAL,
   "macro argument \"%s\" would be stringified in traditional C",
   NODE_NAME (node));
  break;

Re: [New file] Add testcase to ensure that #pragma GCC diagnostic push/pop works with -Wtraditional.

2017-03-24 Thread David Malcolm

On Fri, 2017-03-24 at 14:10 -0400, Eric Gallager wrote:
> The attached test case failed with gcc 4.9 and older, but started
> compiling successfully with only the 1 expected warning with gcc 5.
> Adding it to the test suite would ensure that this behavior doesn't
> regress. 

Thanks for posting this.

What's the significance of the leading space in the:
 #pragma GCC diagnostic pop
line?  Is *that* the bug?  (did we have a bug # for this, I wonder?)

> Note that I have only tested it by compiling it manually, and
> not by actually running it as part of the entire test suite, so
> please
> let me know if I got any of the dejagnu directives wrong.

When I started contributing to gcc, it took me a while to figure out
how to run just one case in the testsuite, so in case it's helpful I'll
post the recipe here:

1) Find the pertinent Tcl script that runs the test: a .exp script in
the same directory, or one of the ancestors directories.  For this case
it's gcc.dg/dg.exp.  The significant part is the filename: dg.exp

2) Figure out the appropriate "make" target, normally based on the
source language for the test.  For this case it's "check-gcc"

3) Run make in your BUILDDIR/gcc, passing in a suitable value for
RUNTESTFLAGS based on the filename found in step 1 above.
For this case, giving it a couple of "-v" flags for verbosity (so that
we can see the command-line of the compiler invocation) it would be:

$ make -jN && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pragma-diag
-7.c"

(for some N; I like the "make && make check-FOO" construction to ensure
that the compiler is rebuilt before running the tests).

...which leads to a summary of:

# of expected passes3

which looks good.

You can also use wildcards e.g.:

make -j64 && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pragma-diag-*.c"

(and can use -jN on the "make check-FOO" invocation if there are a lot of 
tests; I tend not to use it for a small number of tests, to avoid interleaving 
of output in the logs).

Thanks,
> Eric Gallager
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-03-24  Eric Gallager  
> 
>   * gcc.dg/pragma-diag-7.c: New test.

I tested your new test case via the above approach and it looks good to
me.

Although we're meant to only be accepting regression fixes and
documentation fixes right now (stage 4 of gcc 7 development) I feel
that extra test coverage like this also ought to be acceptable.

I don't know if the test case is sufficiently small to be exempt from
the FSF's paperwork requirements here:
  https://gcc.gnu.org/contribute.html
(do you have that paperwork in place?)

Thanks
Dave

>

[PATCH] Reject > word sign extensions in initializers (PR middle-end/80163)

2017-03-24 Thread Jakub Jelinek

Hi!

I'm not aware of any target that would support sign extension of
something that can't be folded in the compiler into some type
larger than word/pointer.  Zero extension is doable and assemble_variable
is able to deal with it by emitting low/high subregs of it (where the
low one will contain some expression that needs to be computed by assembler
or even have relocation and upper part will be all zeros), but for
sign extension we'd need assembler support that would for some
asm expression shift it arithmetically right.

So, this patch just rejects it in initializers (in C++ handles through
dynamic initialization) in that case.  clang also rejects it, ICC silently
miscompiles (performs zero extension).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-24  Jakub Jelinek  

PR middle-end/80163
* varasm.c (initializer_constant_valid_p_1): Disallow sign-extending
conversions to integer types wider than word and pointer.

* gcc.dg/pr80163.c: New test.

--- gcc/varasm.c.jj 2017-01-01 12:45:37.0 +0100
+++ gcc/varasm.c2017-03-24 13:03:37.489529361 +0100
@@ -4472,8 +4472,15 @@ initializer_constant_valid_p_1 (tree val
  return initializer_constant_valid_p_1 (src, endtype, cache);
 
/* Allow conversions between other integer types only if
-  explicit value.  */
-   if (INTEGRAL_TYPE_P (dest_type) && INTEGRAL_TYPE_P (src_type))
+  explicit value.  Don't allow sign-extension to a type larger
+  than word and pointer, there aren't relocations that would
+  allow to sign extend it to a wider type.  */
+   if (INTEGRAL_TYPE_P (dest_type)
+   && INTEGRAL_TYPE_P (src_type)
+   && (TYPE_UNSIGNED (src_type)
+   || TYPE_PRECISION (dest_type) <= TYPE_PRECISION (src_type)
+   || TYPE_PRECISION (dest_type) <= BITS_PER_WORD
+   || TYPE_PRECISION (dest_type) <= POINTER_SIZE))
  {
tree inner = initializer_constant_valid_p_1 (src, endtype, cache);
if (inner == null_pointer_node)
--- gcc/testsuite/gcc.dg/pr80163.c.jj   2017-03-24 13:02:37.827295587 +0100
+++ gcc/testsuite/gcc.dg/pr80163.c  2017-03-24 13:05:12.465309614 +0100
@@ -0,0 +1,22 @@
+/* PR middle-end/80163 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O0" } */
+
+void bar (void);
+
+__int128_t *
+foo (void)
+{
+a:
+  bar ();
+b:;
+  static __int128_t d = (long) &&a - (long) &&b;   /* { dg-error 
"initializer element is not computable at load time" } */
+  return &d;
+}
+
+__int128_t *
+baz (void)
+{
+  static __int128_t d = (long) (3 * 4);
+  return &d;
+}

Jakub

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 11:47:35AM -0600, Jeff Law wrote:
> On 03/24/2017 07:35 AM, Jakub Jelinek wrote:
> > On Fri, Mar 24, 2017 at 02:04:42PM +0100, Jakub Jelinek wrote:
> > > On Fri, Mar 24, 2017 at 06:37:10AM -0600, Jeff Law wrote:
> > > > > 2017-03-21  Jakub Jelinek  
> > > > > 
> > > > >   PR target/80102
> > > > >   * cfgcleanup.c (old_insns_match_p): Don't cross-jump in between 
> > > > > /f
> > > > >   and non-/f instructions.  If both i1 and i2 are frame related,
> > > > >   verify all CFA notes, their order and content.
> > > > > 
> > > > >   * g++.dg/opt/pr80102.C: New test.
> > > > Presumably this didn't ICE at some point in the past, so it's a 
> > > > regression?
> > > > (it's not marked as such in the BZ).
> > > 
> > > It doesn't ICE for me with r238210 and ICEs with current trunk, I don't 
> > > have
> > > too many ppc64le compilers around though.
> > 
> > GCC 4.8 doesn't ICE either.
> ISTM this isn't a regression (without doing deeper analysis).  As a release
> manager, ISTM you can grant an exception if you want to push this forward
> for gcc-7.

It is a [7 Regression], started with r239866, so it is a P1.

Jakub

[PATCH] Fix asan/ubsan bitfield handling in VL structures (PR sanitizer/80168)

2017-03-24 Thread Jakub Jelinek

Hi!

We ICE on the following testcase, because we attempt to use
DECL_BIT_FIELD_REPRESENTATIVE instead of original FIELD_DECL
in a COMPONENT_REF in a VL structure, but DECL_BIT_FIELD_REPRESENTATIVE's
DECL_FIELD_OFFSET is not really gimplified and even if it was,
it wouldn't be current.  From the expr.c and stor-layout.c comments,
seems DECL_BIT_FIELD_REPRESENTATIVE's DECL_FIELD_OFFSET is guaranteed
to be the same as the corresponding field's by construction if it is not
constant, all the differences if any are in DECL_FIELD_BIT_OFFSET.

Therefore, it should be safe to reuse 3rd COMPONENT_REF operand.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-24  Jakub Jelinek  

PR sanitizer/80168
* asan.c (instrument_derefs): Copy over last operand from
original COMPONENT_REF to the new COMPONENT_REF with
DECL_BIT_FIELD_REPRESENTATIVE.
* ubsan.c (instrument_object_size): Likewise.

* gcc.dg/asan/pr80168.c: New test.

--- gcc/asan.c.jj   2017-03-21 07:57:00.0 +0100
+++ gcc/asan.c  2017-03-24 17:02:35.451865004 +0100
@@ -1868,7 +1868,8 @@ instrument_derefs (gimple_stmt_iterator
   tree repr = DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1));
   instrument_derefs (iter, build3 (COMPONENT_REF, TREE_TYPE (repr),
   TREE_OPERAND (t, 0), repr,
-  NULL_TREE), location, is_store);
+  TREE_OPERAND (t, 2)),
+location, is_store);
   return;
 }
 
--- gcc/ubsan.c.jj  2017-03-07 07:10:00.0 +0100
+++ gcc/ubsan.c 2017-03-24 17:02:58.439568314 +0100
@@ -1772,7 +1772,7 @@ instrument_object_size (gimple_stmt_iter
{
  tree repr = DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1));
  t = build3 (COMPONENT_REF, TREE_TYPE (repr), TREE_OPERAND (t, 0),
- repr, NULL_TREE);
+ repr, TREE_OPERAND (t, 2));
}
   break;
 case ARRAY_REF:
--- gcc/testsuite/gcc.dg/asan/pr80168.c.jj  2017-03-24 17:08:14.440489868 
+0100
+++ gcc/testsuite/gcc.dg/asan/pr80168.c 2017-03-24 17:09:08.567791277 +0100
@@ -0,0 +1,12 @@
+/* PR sanitizer/80168 */
+/* { dg-do compile } */
+
+int a;
+
+int
+foo (void)
+{
+  struct S { int c[a]; int q : 8; int e : 4; } f;
+  f.e = 4;
+  return f.e;
+}

Jakub

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 11:56:05AM -0600, Jeff Law wrote:
> > We could e.g.
> > #ifndef REG_CFA_NOTE
> > # define REG_CFA_NOTE(NAME) REG_NOTE(NAME)
> > #endif
> > and then
> > REG_CFA_NOTE (FRAME_RELATED_EXPR)
> > etc. in reg-notes.def (and document that REG_CFA_NOTE should be used for
> > notes related to CFA).
> > Then in cfgcleanups.c we could just
> > #undef REG_CFA_NOTE
> > #define DEF_REG_NOTE(NAME)
> > #define REG_CFA_NOTE(NAME) REG_##NAME,
> > #include "reg-notes.def"
> > #undef DEF_REG_NOTE
> > #undef REG_CFA_NOTE
> > to populate the cfa_note_kinds array.
> Something like that seems preferable -- I think we're a lot more likely to
> catch the need to use REG_CFA_NOTE when defining the notes in reg-notes.def
> than we are to remember to update an array in a different file.

So like this (if it passes bootstrap/regtest on x86_64, i686 and
powerpc64le)?

2017-03-24  Jakub Jelinek  

PR target/80102
* reg-notes.def (REG_CFA_NOTE): Define.  Use it for CFA related
notes.
* cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
and non-/f instructions.  If both i1 and i2 are frame related,
verify all CFA notes, their order and content.

* g++.dg/opt/pr80102.C: New test.

--- gcc/reg-notes.def.jj2017-01-21 02:26:33.0 +0100
+++ gcc/reg-notes.def   2017-03-24 19:14:23.413483807 +0100
@@ -20,10 +20,16 @@ along with GCC; see the file COPYING3.
 /* This file defines all the codes that may appear on individual
EXPR_LIST, INSN_LIST and INT_LIST rtxes in the REG_NOTES chain of an insn.
The codes are stored in the mode field of the rtx.  Source files
-   define DEF_REG_NOTE appropriately before including this file.  */
+   define DEF_REG_NOTE appropriately before including this file.
+
+   CFA related notes meant for RTX_FRAME_RELATED_P instructions
+   should be declared with REG_CFA_NOTE macro instead of REG_NOTE.  */
 
 /* Shorthand.  */
 #define REG_NOTE(NAME) DEF_REG_NOTE (REG_##NAME)
+#ifndef REG_CFA_NOTE
+# define REG_CFA_NOTE(NAME) REG_NOTE (NAME)
+#endif
 
 /* REG_DEP_TRUE is used in scheduler dependencies lists to represent a
read-after-write dependency (i.e. a true data dependency).  This is
@@ -112,7 +118,7 @@ REG_NOTE (BR_PRED)
 /* Attached to insns that are RTX_FRAME_RELATED_P, but are too complex
for DWARF to interpret what they imply.  The attached rtx is used
instead of intuition.  */
-REG_NOTE (FRAME_RELATED_EXPR)
+REG_CFA_NOTE (FRAME_RELATED_EXPR)
 
 /* Attached to insns that are RTX_FRAME_RELATED_P, but are too complex
for FRAME_RELATED_EXPR intuition.  The insn's first pattern must be
@@ -122,7 +128,7 @@ REG_NOTE (FRAME_RELATED_EXPR)
with a base register and a constant offset.  In the most complicated
cases, this will result in a DW_CFA_def_cfa_expression with the rtx
expression rendered in a dwarf location expression.  */
-REG_NOTE (CFA_DEF_CFA)
+REG_CFA_NOTE (CFA_DEF_CFA)
 
 /* Attached to insns that are RTX_FRAME_RELATED_P, but are too complex
for FRAME_RELATED_EXPR intuition.  This note adjusts the expression
@@ -130,57 +136,57 @@ REG_NOTE (CFA_DEF_CFA)
expression, relative to the old CFA expression.  This rtx must be of
the form (SET new-cfa-reg (PLUS old-cfa-reg const_int)).  If the note
rtx is NULL, we use the first SET of the insn.  */
-REG_NOTE (CFA_ADJUST_CFA)
+REG_CFA_NOTE (CFA_ADJUST_CFA)
 
 /* Similar to FRAME_RELATED_EXPR, with the additional information that
this is a save to memory, i.e. will result in DW_CFA_offset or the
like.  The pattern or the insn should be a simple store relative to
the CFA.  */
-REG_NOTE (CFA_OFFSET)
+REG_CFA_NOTE (CFA_OFFSET)
 
 /* Similar to FRAME_RELATED_EXPR, with the additional information that this
is a save to a register, i.e. will result in DW_CFA_register.  The insn
or the pattern should be simple reg-reg move.  */
-REG_NOTE (CFA_REGISTER)
+REG_CFA_NOTE (CFA_REGISTER)
 
 /* Attached to insns that are RTX_FRAME_RELATED_P, but are too complex
for FRAME_RELATED_EXPR intuition.  This is a save to memory, i.e. will
result in a DW_CFA_expression.  The pattern or the insn should be a
store of a register to an arbitrary (non-validated) memory address.  */
-REG_NOTE (CFA_EXPRESSION)
+REG_CFA_NOTE (CFA_EXPRESSION)
 
 /* Attached to insns that are RTX_FRAME_RELATED_P, but are too complex
for FRAME_RELATED_EXPR intuition.  The DWARF expression computes the value 
of
the given register.  */
-REG_NOTE (CFA_VAL_EXPRESSION)
+REG_CFA_NOTE (CFA_VAL_EXPRESSION)
 
 /* Attached to insns that are RTX_FRAME_RELATED_P, with the information
that this is a restore operation, i.e. will result in DW_CFA_restore
or the like.  Either the attached rtx, or the destination of the insn's
first pattern is the register to be restored.  */
-REG_NOTE (CFA_RESTORE)
+REG_CFA_NOTE (CFA_RESTORE)
 
 /* Attached to insns that are RTX_FRAME_RELATED_P, marks insn that sets
vDRAP from DRAP.  If vDRAP is

Re: [New file] Add testcase to ensure that #pragma GCC diagnostic push/pop works with -Wtraditional.

2017-03-24 Thread Eric Gallager

On 3/24/17, David Malcolm  wrote:
> On Fri, 2017-03-24 at 14:10 -0400, Eric Gallager wrote:
>> The attached test case failed with gcc 4.9 and older, but started
>> compiling successfully with only the 1 expected warning with gcc 5.
>> Adding it to the test suite would ensure that this behavior doesn't
>> regress.
>
> Thanks for posting this.
>
> What's the significance of the leading space in the:
>  #pragma GCC diagnostic pop
> line?  Is *that* the bug?  (did we have a bug # for this, I wonder?)
>

It prints a warning without it, which would be entirely correct of it to do:

/Users/ericgallager/gcc-git/gcc/testsuite/gcc.dg/pragma-diag-7.c:8:2:
warning: suggest hiding #pragma from traditional C with an indented #
[-Wtraditional]
 #pragma GCC diagnostic pop
  ^

I only wanted the test case to be testing for the warnings about
suffixes; another warning about the pragma would just be noise, albeit
correct noise.

>
>> Note that I have only tested it by compiling it manually, and
>> not by actually running it as part of the entire test suite, so
>> please
>> let me know if I got any of the dejagnu directives wrong.
>
> When I started contributing to gcc, it took me a while to figure out
> how to run just one case in the testsuite, so in case it's helpful I'll
> post the recipe here:
>
> 1) Find the pertinent Tcl script that runs the test: a .exp script in
> the same directory, or one of the ancestors directories.  For this case
> it's gcc.dg/dg.exp.  The significant part is the filename: dg.exp
>
> 2) Figure out the appropriate "make" target, normally based on the
> source language for the test.  For this case it's "check-gcc"
>
> 3) Run make in your BUILDDIR/gcc, passing in a suitable value for
> RUNTESTFLAGS based on the filename found in step 1 above.
> For this case, giving it a couple of "-v" flags for verbosity (so that
> we can see the command-line of the compiler invocation) it would be:
>
> $ make -jN && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pragma-diag
> -7.c"
>
> (for some N; I like the "make && make check-FOO" construction to ensure
> that the compiler is rebuilt before running the tests).
>
> ...which leads to a summary of:
>
> # of expected passes  3
>
> which looks good.

Okay, I tried this, and I also got:

# of expected passes3

too, so that's good.

>
> You can also use wildcards e.g.:
>
> make -j64 && make check-gcc RUNTESTFLAGS="-v -v dg.exp=pragma-diag-*.c"
>
> (and can use -jN on the "make check-FOO" invocation if there are a lot of
> tests; I tend not to use it for a small number of tests, to avoid
> interleaving of output in the logs).
>
> Thanks,
>> Eric Gallager
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-03-24  Eric Gallager  
>>
>>  * gcc.dg/pragma-diag-7.c: New test.
>
> I tested your new test case via the above approach and it looks good to
> me.
>
> Although we're meant to only be accepting regression fixes and
> documentation fixes right now (stage 4 of gcc 7 development) I feel
> that extra test coverage like this also ought to be acceptable.

It's okay to save it for next stage 1, I'm already submitting it later
than I intended to, so extra waiting won't hurt.

>
> I don't know if the test case is sufficiently small to be exempt from
> the FSF's paperwork requirements here:
>   https://gcc.gnu.org/contribute.html
> (do you have that paperwork in place?)
>
> Thanks
> Dave

Yes, I dropped off my copyright assignment at the FSF in December, but
I don't have commit access yet though.
Thanks,
Eric

Re: [PATCH] fwprop: Prevent infinite looping (PR79405)

2017-03-24 Thread Segher Boessenkool

On Fri, Mar 24, 2017 at 12:54:35PM +0100, Richard Biener wrote:
> > I have implemented the "retry things" as well fwiw, but a) it is too
> > big and invasive for stage 4, and b) it kind of sucks, needs more
> > work, even more invasive.  The workaround is cheap and solves the
> > immediate problem.
> 
> Agreed, still iterating over the DF uses in the first place looks like
> the bug (given this "all uses" data structure changes during propagation!).
> 
> I'd have done
> 
>   for (BBs in RPO order)
> for (insn in BB)
> repeat:
>   for (use in insn)
>  if (propagate_into (use))
>goto repeat;

That is more or less how the def-use links for fwprop are built now,
but not the order that propagations are tried in.  I think originally
it used all single definitions, not just those dominating their uses,
in which case something like your proposed algorithm does not work
(there can be def-use loops, even irreducible loops).

I agree it would be good to try something like you suggest in next
stage 1.

Segher

Re: [Committed] S/390: PR79904: Disallow reg + sym_ref literal pool addresses.

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 03:05:21PM +0100, Andreas Krebbel wrote:
> 2017-03-24  Andreas Krebbel  
> 
>   PR target/79904
>   * config/s390/s390.c (s390_decompose_address): Reject reg +
>   sym_ref literal pool references.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-03-24  Andreas Krebbel  
> 
>   * gcc.dg/ubsan/pr79904-2.c: New test.

Similarly to the other tests, we need -Wno-psabi, otherwise it fails
on i686-linux.  Fixed thusly, committed as obvious:

2017-03-24  Jakub Jelinek  

PR sanitizer/79904
* gcc.dg/ubsan/pr79904-2.c: Add -Wno-psabi to dg-options.

--- gcc/testsuite/gcc.dg/ubsan/pr79904-2.c.jj   2017-03-24 15:08:52.0 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/pr79904-2.c  2017-03-24 19:43:52.283599715 
+0100
@@ -1,6 +1,6 @@
 /* PR sanitizer/79904 */
 /* { dg-do compile } */
-/* { dg-options "-fsanitize=signed-integer-overflow" } */
+/* { dg-options "-fsanitize=signed-integer-overflow -Wno-psabi" } */
 
 typedef signed char V __attribute__((vector_size (8))); 
 


Jakub

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 08:36:16PM +0100, Jakub Jelinek wrote:
> On Fri, Mar 24, 2017 at 11:56:05AM -0600, Jeff Law wrote:
> > > We could e.g.
> > > #ifndef REG_CFA_NOTE
> > > # define REG_CFA_NOTE(NAME) REG_NOTE(NAME)
> > > #endif
> > > and then
> > > REG_CFA_NOTE (FRAME_RELATED_EXPR)
> > > etc. in reg-notes.def (and document that REG_CFA_NOTE should be used for
> > > notes related to CFA).
> > > Then in cfgcleanups.c we could just
> > > #undef REG_CFA_NOTE
> > > #define DEF_REG_NOTE(NAME)
> > > #define REG_CFA_NOTE(NAME) REG_##NAME,
> > > #include "reg-notes.def"
> > > #undef DEF_REG_NOTE
> > > #undef REG_CFA_NOTE
> > > to populate the cfa_note_kinds array.
> > Something like that seems preferable -- I think we're a lot more likely to
> > catch the need to use REG_CFA_NOTE when defining the notes in reg-notes.def
> > than we are to remember to update an array in a different file.
> 
> So like this (if it passes bootstrap/regtest on x86_64, i686 and
> powerpc64le)?
> 
> 2017-03-24  Jakub Jelinek  
> 
>   PR target/80102
>   * reg-notes.def (REG_CFA_NOTE): Define.  Use it for CFA related
>   notes.
>   * cfgcleanup.c (old_insns_match_p): Don't cross-jump in between /f
>   and non-/f instructions.  If both i1 and i2 are frame related,
>   verify all CFA notes, their order and content.
> 
>   * g++.dg/opt/pr80102.C: New test.

Successfully bootstrapped/regtested on {x86_64,i686,powerpc64le}-linux, ok
for trunk?

Jakub

[PATCH,rs6000] PR80103: Fix ICE with cross compiler

2017-03-24 Thread Kelvin Nilsen


PR 80103 provides a test case which results in an internal
compiler error when invoked with -mno-direct-move -mpower9-dform-
vector target options.  The internal compiler error results because
these two target options are incompatible with each other.

The enclosed patch simply disables this particular combination of
target options, terminating gcc with an error message instead of
producing an internal compiler error.  Additionally, this patch
includes new comments to address omissions from a patch committed
on 2017/03/23 which deals with conflicts between the 
-mno-power9-vector and -mcpu=power9 target options.

This patch has been bootstrapped and tested with no regressions on
both powerpc64-unknown-linux-gnu and powerpc64le-unknown-linux-gnu.
Is this ok for the trunk?

gcc/testsuite/ChangeLog:

2017-03-24  Kelvin Nilsen  

PR target/80103
* gcc.target/powerpc/pr80103-1.c: New test.


gcc/ChangeLog:

2017-03-24  Kelvin Nilsen  

PR target/80103
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Edit and
add comments.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
special handling for target option conflicts between dform
options (-mpower9-dform, -mpower9-dform-vector,
-mpower9-dform-scalar) and -mno-direct-move.

Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 246406)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -429,6 +429,12 @@ rs6000_target_modify_macros (bool define_p, HOST_W
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
   /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
+ turned on in the following condition:
+ 1. TARGET_P9_DFORM_SCALAR or TARGET_P9_DFORM_VECTOR are enabled
+and OPTION_MASK_DIRECT_MOVE is not explicitly disabled.
+Hereafter, the OPTION_MASK_DIRECT_MOVE flag is considered to
+have been turned on explicitly.
+ Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
  turned off in any of the following conditions:
  1. TARGET_HARD_FLOAT, TARGET_ALTIVEC, or TARGET_VSX is explicitly
disabled and OPTION_MASK_DIRECT_MOVE was not explicitly
@@ -473,8 +479,13 @@ rs6000_target_modify_macros (bool define_p, HOST_W
   if (!flag_iso)
rs6000_define_or_undefine_macro (define_p, "__APPLE_ALTIVEC__");
 }
-  /* Note that the OPTION_MASK_VSX flag is automatically turned off in
+  /* Note that the OPTION_MASK_VSX flag is automatically turned on in
  the following conditions:
+ 1. TARGET_P8_VECTOR is explicitly turned on and the OPTION_MASK_VSX
+was not explicitly turned off.  Hereafter, the OPTION_MASK_VSX
+flag is considered to have been explicitly turned on.
+ Note that the OPTION_MASK_VSX flag is automatically turned off in
+ the following conditions:
  1. The operating system does not support saving of AltiVec
registers (OS_MISSING_ALTIVEC).
  2. If any of the options TARGET_HARD_FLOAT, TARGET_FPRS,
@@ -507,6 +518,12 @@ rs6000_target_modify_macros (bool define_p, HOST_W
   rs6000_define_or_undefine_macro (define_p, "__TM_FENCE__");
 }
   /* Note that the OPTION_MASK_P8_VECTOR flag is automatically turned
+ on in the following conditions:
+ 1. TARGET_P9_VECTOR is explicitly turned on and
+OPTION_MASK_P8_VECTOR is not explicitly turned off.
+Hereafter, the OPTION_MASK_P8_VECTOR flag is considered to
+have been turned off explicitly.
+ Note that the OPTION_MASK_P8_VECTOR flag is automatically turned
  off in the following conditions:
  1. If any of TARGET_HARD_FLOAT, TARGET_ALTIVEC, or TARGET_VSX
were turned off explicitly and OPTION_MASK_P8_VECTOR flag was
@@ -514,15 +531,24 @@ rs6000_target_modify_macros (bool define_p, HOST_W
  2. If TARGET_ALTIVEC is turned off.  Hereafter, the
OPTION_MASK_P8_VECTOR flag is considered to have been turned off
explicitly.
- 3. If TARGET_VSX is turned off.  Hereafter, the OPTION_MASK_P8_VECTOR
-   flag is considered to have been turned off explicitly.  */
+ 3. If TARGET_VSX is turned off and OPTION_MASK_P8_VECTOR was not
+explicitly enabled.  If TARGET_VSX is explicitly enabled, the
+OPTION_MASK_P8_VECTOR flag is hereafter also considered to
+   have been turned off explicitly.  */
   if ((flags & OPTION_MASK_P8_VECTOR) != 0)
 rs6000_define_or_undefine_macro (define_p, "__POWER8_VECTOR__");
   /* Note that the OPTION_MASK_P9_VECTOR flag is automatically turned
  off in the following conditions:
- 1. If TARGET_P8_VECTOR is turned off. Hereafter, the
-   OPTION_MASK_P9_VECTOR flag is considered to have been turned off
-   explicitly.  */
+ 1. If TARGET_P8_VECTOR is turned off and OPTION_MASK_P9_VECTOR is
+not turned on expli

[wwwdocs] Update five citeseer.ist.psu.edu references in news/profiledriven.html

2017-03-24 Thread Gerald Pfeifer

Update four citeseer.ist.psu.edu references with doi.org and one
with an updated one.

Applied.

Gerald

Index: news/profiledriven.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/news/profiledriven.html,v
retrieving revision 1.10
diff -u -r1.10 profiledriven.html
--- news/profiledriven.html 28 Jun 2014 10:34:16 -  1.10
+++ news/profiledriven.html 24 Mar 2017 22:02:33 -
@@ -260,11 +260,11 @@
 
 
 [1]
-http://citeseer.ist.psu.edu/ball93branch.html";>Branch Prediction 
for Free;
+https://doi.org/10.1145/155090.155119";>Branch Prediction for Free;
Ball and Larus; PLDI '93.
 [2]
 
-http://citeseer.ist.psu.edu/wu94static.html";>Static Branch
+https://doi.org/10.1145/192724.192725";>Static Branch
 Frequency and Program Profile Analysis; Wu and Larus; MICRO-27.
 
 [3]
@@ -276,7 +276,8 @@
 
 [5]
 
-http://citeseer.ist.psu.edu/young97nearoptimal.html";>Near-optimal 
Intraprocedural Branch Alignment;
+https://doi.org/10.1145/258916.258932";>Near-optimal
+   Intraprocedural Branch Alignment;
Cliff Young, David S. Johnson, David R. Karger, Michael D. Smith, ACM 
1997
 
 [6]
@@ -284,11 +285,13 @@
International Conference on Supercomputing, 1999
 
 [7]
-http://citeseer.ist.psu.edu/chang91using.html";>Using Profile 
Information to Assist Classic Code Optimizations; Pohua P. Chang, Scott A. 
Mahlke, and Wen-mei W. Hwu, 1991
+https://doi.org/10.1002/spe.4380211204";>Using Profile
+   Information to Assist Classic Code Optimizations;
+   Pohua P. Chang, Scott A. Mahlke, and Wen-mei W. Hwu, 1991
 
 [8]
 http://citeseer.ist.psu.edu/august96hyperblock.html";>Hyperblock
+href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.1922";>Hyperblock
 Performance Optimizations For ILP Processors;  David Isaac August, 1996
 
 [9]

Re: [PATCH,rs6000] PR80103: Fix ICE with cross compiler

2017-03-24 Thread Segher Boessenkool

On Fri, Mar 24, 2017 at 04:04:33PM -0600, Kelvin Nilsen wrote:
> PR 80103 provides a test case which results in an internal
> compiler error when invoked with -mno-direct-move -mpower9-dform-
> vector target options.  The internal compiler error results because
> these two target options are incompatible with each other.
> 
> The enclosed patch simply disables this particular combination of
> target options, terminating gcc with an error message instead of
> producing an internal compiler error.  Additionally, this patch
> includes new comments to address omissions from a patch committed
> on 2017/03/23 which deals with conflicts between the 
> -mno-power9-vector and -mcpu=power9 target options.
> 
> This patch has been bootstrapped and tested with no regressions on
> both powerpc64-unknown-linux-gnu and powerpc64le-unknown-linux-gnu.
> Is this ok for the trunk?

This looks good, please apply.  Thanks,


Segher

[PATCH #2, rs6000][GCC6] Fix PR78543, ICE in push_reload on powerpc64le-linux-gnu

2017-03-24 Thread Michael Meissner

This patch reworks the original patch I attached to the bug report, to try and
make it less hacky.  It separates the bswap insns where there is hardware
support into separate read, write, and register swap instructions. This is
because the register allocators will try to push the bswap value in a register
to the stack and do the load based swap with reverse bytes.

Reload fumbles in certain conditions.  LRA generates working code, but the
store and the load with byte reverse from the same location, can slow things
down compared to the operation on registers.

I only did this optimization where we had the hardware support (i.e. bswap for
HImode all of the time, bswap for SImode all of the time, and bswap for DImode
if we are executing 64-bit instructions and the machine has LDBRX/STDBRX --
power7 and newer/cell ppc).

I have done bootstrap builds on a little endian power8 system, on a big endian
power8 system, and a big endian power7 system (both 32/64-bit support on this
last system).  There were no regressions.

I applied the patches to GCC 6, and the compiler builds and has no regressions
on a little endian power8 system.

I built spec 2006 benchmarks with the GCC 7 compiler.  There are 12 benchmarks
that generate one or more load/store with byte swap instructions (perlbench,
gcc, gamess, milc, zeusmp, calculix, h264ref, tonto, omnetpp, wrf, sphinx3,
xalancbmk).

I compared the instructions generated.  10 of the benchmarks generated the same
instructions.

Milc generated 1 less load with byte swap instruction and 1 more store with
byte swap instruction.

Sphinx3 generated 6 less load with byte swap instructions and 6 more store with
byte swap instructions.

So I count this as the same level of byte swapping is being generated.

Can I apply the patch to the GCC 7 trunk?  Since the patch shows up as an abort
in GCC 6 and not in GCC 7 (unless you disable LRA), I would like to apply the
patch as soon as possible to the GCC 6 branch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 246413)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -370,6 +370,11 @@ (define_mode_iterator P [(SI "TARGET_32B
 ; PTImode is GPR only)
 (define_mode_iterator TI2 [TI PTI])
 
+; Types that have hardware load/store with byte reverse
+(define_mode_iterator BSWAP [HI
+SI
+(DI "TARGET_POWERPC64 && TARGET_LDBRX")])
+  
 ; Any hardware-supported floating-point mode
 (define_mode_iterator FP [
   (SF "TARGET_HARD_FLOAT 
@@ -2350,12 +2355,12 @@ (define_insn "cmpb3"
 
 ;; Since the hardware zeros the upper part of the register, save generating the
 ;; AND immediate if we are converting to unsigned
-(define_insn "*bswaphi2_extenddi"
+(define_insn "*bswap2_extenddi"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
(zero_extend:DI
-(bswap:HI (match_operand:HI 1 "memory_operand" "Z"]
+(bswap:HSI (match_operand:HSI 1 "memory_operand" "Z"]
   "TARGET_POWERPC64"
-  "lhbrx %0,%y1"
+  "lbrx %0,%y1"
   [(set_attr "length" "4")
(set_attr "type" "load")])
 
@@ -2368,34 +2373,55 @@ (define_insn "*bswaphi2_extendsi"
   [(set_attr "length" "4")
(set_attr "type" "load")])
 
-(define_expand "bswaphi2"
-  [(parallel [(set (match_operand:HI 0 "reg_or_mem_operand" "")
-  (bswap:HI
-   (match_operand:HI 1 "reg_or_mem_operand" "")))
- (clobber (match_scratch:SI 2 ""))])]
+;; Separate the bswap patterns into load, store, and gpr<-gpr.  This prevents
+;; the register allocator from converting a gpr<-gpr swap into a store and then
+;; load with byte swap, which can be slower than doing it in the registers.  It
+;; also prevents certain failures with the RELOAD register allocator.
+
+(define_expand "bswap2"
+  [(use (match_operand:HSI 0 "reg_or_mem_operand"))
+   (use (match_operand:HSI 1 "reg_or_mem_operand"))]
   ""
 {
-  if (!REG_P (operands[0]) && !REG_P (operands[1]))
-operands[1] = force_reg (HImode, operands[1]);
+  rtx dest = operands[0];
+  rtx src = operands[1];
+
+  if (!REG_P (dest) && !REG_P (src))
+src = force_reg (mode, src);
+
+  if (MEM_P (src))
+emit_insn (gen_bswap2_load (dest, src));
+  else
+{
+  if (MEM_P (dest))
+   emit_insn (gen_bswap2_store (dest, src));
+  else
+   emit_insn (gen_bswap2_reg (dest, src));
+}
+  DONE;
 })
 
-(define_insn "bswaphi2_internal"
-  [(set (match_operand:HI 0 "reg_or_mem_operand" "=r,Z,&r")
-   (bswap:HI
-(match_operand:HI 1 "reg_or_mem_operand" "Z,r,r")))
-   (clobber (match_scratch:SI 2 "=X,X,&r"))]
+(define_insn "bswap2_load"
+  [(set (match_operand:BSWAP 0 "gpc_reg_operand" "=r")
+   (bswap:BSWAP (match_operand:BSWAP 1 "memory_operand" "Z")))]
+  ""
+  "lbrx %0,%y1"
+  [(

Re: [PATCH #2, rs6000][GCC6] Fix PR78543, ICE in push_reload on powerpc64le-linux-gnu

2017-03-24 Thread Michael Meissner

Well instead of attaching the ChangeLog, I attached the patch without
ChangeLog.

Here is the ChangeLog entry for the patch:

[gcc]
2017-03-24  Michael Meissner  

PR target/78543
* config/rs6000/rs6000.md (BSWAP): New mode iterator for modes
with hardware byte swap load/store intstructions.
(bswaphi2_extenddi): Combine bswap HImode and SImode with zero
extend to DImode to one insn.
(bswap2_extenddi): Likewise.
(bswapsi2_extenddi): Likewise.
(bswaphi2_extendsi): Likewise.
(bswaphi2): Combine bswap HImode and SImode into one insn.
Separate memory insns from swapping register.
(bswapsi2): Likewise.
(bswap2): Likewise.
(bswaphi2_internal): Delete, no longer used.
(bswapsi2_internal): Likewise.
(bswap2_load): Split bswap HImode/SImode into separate load,
store, and gpr<-gpr swap insns.
(bswap2_store): Likewise.
(bswaphi2_reg): Register only splitter, combine with the splitter.
(bswaphi2 splitter): Likewise.
(bswapsi2_reg): Likewise.
(bswapsi2 splitter): Likewise.
(bswapdi2): If we have the LDBRX and STDBRX instructions, split
the insns into load, store, and register/register insns.
(bswapdi2_ldbrx): Likewise.
(bswapdi2_load): Likewise.
(bswapdi2_store): Likewise.
(bswapdi2_reg): Likewise.

[gcc/testsuite]
2017-03-24  Michael Meissner  

PR target/78543
* gcc.target/powerpc/pr78543.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Segher Boessenkool

Hi!

On Fri, Mar 24, 2017 at 08:36:16PM +0100, Jakub Jelinek wrote:
> +   /* Skip over reg notes not related to CFI information.  */
> +   while (n1)
> + {
> +   for (i = 0; i < ARRAY_SIZE (cfa_note_kinds) - 1; i++)
> + if (REG_NOTE_KIND (n1) == cfa_note_kinds[i])
> +   break;
> +   if (i != ARRAY_SIZE (cfa_note_kinds))
> + break;
> +   n1 = XEXP (n1, 1);
> + }

Maybe factor out reg_note_is_cfa_note and/or insns_have_identical_cfa_notes
functions?

The patch looks fine to me btw.  Thanks for working on this!


Segher

Re: [PATCH #2, rs6000][GCC6] Fix PR78543, ICE in push_reload on powerpc64le-linux-gnu

2017-03-24 Thread Segher Boessenkool

Hi Mike,

On Fri, Mar 24, 2017 at 07:23:02PM -0400, Michael Meissner wrote:
> Reload fumbles in certain conditions.

Yeah.  And it does not need bswap to get totally lost with this, so this
patch is a workaround, not a fix.

It does make things nicer though :-)

> LRA generates working code, but the
> store and the load with byte reverse from the same location, can slow things
> down compared to the operation on registers.

How so?  You mean (say) lwbrx after doing a stw to that same address?
We have that problem in general :-/

Thanks for the extensive testing.

> Can I apply the patch to the GCC 7 trunk?  Since the patch shows up as an 
> abort
> in GCC 6 and not in GCC 7 (unless you disable LRA), I would like to apply the
> patch as soon as possible to the GCC 6 branch.

A few remarks:

> +; Types that have hardware load/store with byte reverse
> +(define_mode_iterator BSWAP [HI
> +  SI
> +  (DI "TARGET_POWERPC64 && TARGET_LDBRX")])

The patch would be much easier to read if you had done this step (with HSI
as well) separately.  Oh well.

> +(define_expand "bswap2"

> +  if (MEM_P (src))
> +emit_insn (gen_bswap2_load (dest, src));
> +  else
> +{
> +  if (MEM_P (dest))
> + emit_insn (gen_bswap2_store (dest, src));
> +  else
> + emit_insn (gen_bswap2_reg (dest, src));
> +}
> +  DONE;

Please avoid the extra nesting, i.e. do

+  if (MEM_P (src))
+emit_insn (gen_bswap2_load (dest, src));
+  else if (MEM_P (dest))
+emit_insn (gen_bswap2_store (dest, src));
+  else
+emit_insn (gen_bswap2_reg (dest, src));
+  DONE;

(also for bswapdi2).

> +  [(set_attr "length" "12")
> +   (set_attr "type" "*")])

Lose that last line?  You don't need to explicitly set things to their
default value ;-)
OTOH, you might want to make it type "three" instead?

> +  [(set_attr "length" "36")
> +   (set_attr "type" "*")])

Same.

This patch is okay for trunk.  Also for 6, but what is the hurry there?

Thanks!

Segher

LRA fix for 80160

2017-03-24 Thread Bernd Schmidt

This fixes two PRs; we shouldn't try to avoid spilling a reg if it has 
an alternate class it can use.


Bootstrapped and tested on x86_64-linux, approved by Vlad, committed.


Bernd
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 246472)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2017-03-25  Bernd Schmidt  
+
+	PR rtl-optimization/80160
+	PR rtl-optimization/80159
+	* lra-assigns.c (must_not_spill_p): Tighten new test to also take
+	reg_alternate_class into account.
+
 2017-03-24  Vladimir Makarov  
 
 	PR target/80148
Index: gcc/lra-assigns.c
===
--- gcc/lra-assigns.c	(revision 246472)
+++ gcc/lra-assigns.c	(working copy)
@@ -908,7 +908,8 @@ must_not_spill_p (unsigned spill_regno)
  does not solve the general case where existing reloads fully
  cover a limited register class.  */
   if (!bitmap_bit_p (&non_reload_pseudos, spill_regno)
-  && reg_class_size [reg_preferred_class (spill_regno)] == 1)
+  && reg_class_size [reg_preferred_class (spill_regno)] == 1
+  && reg_alternate_class (spill_regno) == NO_REGS)
 return true;
   return false;
 }
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 246472)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2017-03-25  Bernd Schmidt  
+
+	PR rtl-optimization/80160
+	PR rtl-optimization/80159
+
+	* gcc.target/i386/pr80160.c: New test.
+
 2017-03-24  Jakub Jelinek  
 
 	PR sanitizer/79904
Index: gcc/testsuite/gcc.target/i386/pr80160.c
===
--- gcc/testsuite/gcc.target/i386/pr80160.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr80160.c	(working copy)
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -w" } */
+/* { dg-additional-options "-march=pentium-mmx" { target ia32 } } */
+
+typedef struct { long long a; } a_t;
+int *a, b;
+a_t *e, c;
+long long f;
+void fn (int);
+void fn2 (void);
+int fn3 (a_t);
+void fn4 (a_t);
+a_t foo (long long val) { return (a_t){val}; }
+static void
+bar (int ka)
+{
+  unsigned i;
+  for (i = 0; i < 512; i++) {
+long d;
+c = (a_t){d};
+fn2 ();
+  }
+  fn (ka);
+}
+void
+test (void)
+{
+  a_t g;
+  a_t *h, j;
+  h = e;
+  j = *h;
+  if (e == (a_t *) 1) {
+a_t k = {fn3 (j)};
+fn4 (j);
+long l;
+g = foo((long long)b << 2 | l);
+k = g;
+if (j.a != k.a) {
+  a_t m = g;
+  int n = m.a, o = m.a >> 32;
+  asm ("# %0 %1 %2 %3" : "=m"(*a), "+A"(f) : "b"(n), "c"(o));
+}
+  }
+  bar ((int) h);
+}

Re: [PATCH] Don't cross-jump in between frame related and non-frame related insns (PR target/80102, take 3)

2017-03-24 Thread Jakub Jelinek

On Fri, Mar 24, 2017 at 06:37:46PM -0500, Segher Boessenkool wrote:
> On Fri, Mar 24, 2017 at 08:36:16PM +0100, Jakub Jelinek wrote:
> > + /* Skip over reg notes not related to CFI information.  */
> > + while (n1)
> > +   {
> > + for (i = 0; i < ARRAY_SIZE (cfa_note_kinds) - 1; i++)
> > +   if (REG_NOTE_KIND (n1) == cfa_note_kinds[i])
> > + break;
> > + if (i != ARRAY_SIZE (cfa_note_kinds))
> > +   break;
> > + n1 = XEXP (n1, 1);
> > +   }
> 
> Maybe factor out reg_note_is_cfa_note and/or insns_have_identical_cfa_notes
> functions?

Well, for the reg_note_is_cfa_note, actually it might be better to just
have a const bool array indexed by the note enum instead of the loop, I'll
implement it later.  And yes, I can outline insns_have_identical_cfa_notes.

Jakub

81 matches

Mail list logo