Re: [PATCH] gimplify: Don't optimize register const vars to static [PR93949]

2020-02-29 Thread Uecker, Martin

One could also simply remove the error in varasm.c. This
would preserve the optimization. As a side effect, this
would allow register without __asm__ at file scope, but
there do not seem to be any disadvantages. (register
at file scope is already diagnosed by the C FE when
using --pedantic).

Best,
Martin

On Thursday, 27.02.2020, at 10:31 +0100, Richard Biener wrote:
> On Thu, 27 Feb 2020, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > The following testcase is rejected, while it was accepted in 3.4 and earlier
> > (before tree-ssa merge).
> > The problem is that we decide to promote the const variable to TREE_STATIC,
> > but TREE_STATIC DECL_REGISTER VAR_DECLs may only be the global register vars
> > and so assemble_variable/make_decl_rtl diagnoses it.
> > 
> > Either we do what the following patch does, where we could consider
> > register as a hint the user doesn't want such optimization, because if
> > something is forced static, it is not "register" anymore and register static
> > is not valid in C either, or we could clear DECL_REGISTER instead, but would
> > still need to punt at least on DECL_HARD_REGISTER cases.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK.
> 
> Thanks,
> Richard.
> 
> > 2020-02-27  Jakub Jelinek  
> > 
> > PR c/93949
> > * gimplify.c (gimplify_init_constructor): Don't promote readonly
> > DECL_REGISTER variables to TREE_STATIC.
> > 
> > * gcc.c-torture/compile/pr93949.c: New test.
> > 
> > --- gcc/gimplify.c.jj   2020-02-25 13:54:02.087091120 +0100
> > +++ gcc/gimplify.c  2020-02-26 19:30:57.466490166 +0100
> > @@ -4923,6 +4923,7 @@ gimplify_init_constructor (tree *expr_p,
> >     && num_nonzero_elements > 1
> >     && TREE_READONLY (object)
> >     && VAR_P (object)
> > +   && !DECL_REGISTER (object)
> >     && (flag_merge_constants >= 2 || !TREE_ADDRESSABLE (object))
> >     /* For ctors that have many repeated nonzero elements
> >        represented through RANGE_EXPRs, prefer initializing
> > --- gcc/testsuite/gcc.c-torture/compile/pr93949.c.jj   2020-02-26 19:42:15.754530691 +0100
> > +++ gcc/testsuite/gcc.c-torture/compile/pr93949.c   2020-02-26 19:42:08.153642329 +0100
> > @@ -0,0 +1,7 @@
> > +/* PR c/93949 */
> > +
> > +void
> > +foo (void)
> > +{
> > +  register const double d[3] = { 0., 1., 2. };
> > +}
> > 
> > Jakub
> > 
> > 
> 
> 
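
The effect of the one-line change to gimplify_init_constructor quoted above can be sketched as a standalone predicate. This is only an illustration under assumptions: `struct var_flags` and `may_promote_to_static` are hypothetical names, each field merely stands in for the GCC macro named in its comment, and the real condition has further terms that the diff context does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the properties tested in the quoted hunk. */
struct var_flags {
    bool is_var;              /* stands in for VAR_P (object) */
    bool readonly;            /* stands in for TREE_READONLY (object) */
    bool is_register;         /* stands in for DECL_REGISTER (object) */
    bool addressable;         /* stands in for TREE_ADDRESSABLE (object) */
    int num_nonzero_elements; /* count of nonzero initializer elements */
};

/* Returns true when the visible part of the condition would still allow
   promoting the initialized object to static storage.  */
static bool may_promote_to_static(const struct var_flags *v,
                                  int flag_merge_constants)
{
    assert(v != NULL);
    return v->num_nonzero_elements > 1
           && v->readonly
           && v->is_var
           && !v->is_register /* the term added by the patch */
           && (flag_merge_constants >= 2 || !v->addressable);
}
```

With the flags of the PR93949 testcase (a readonly register array with three initializer elements, two of them nonzero) the predicate is false, so the variable stays automatic and the varasm.c error is never reached.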

Re: [PATCH] gimplify: Don't optimize register const vars to static [PR93949]

2020-02-29 Thread Jakub Jelinek
On Sat, Feb 29, 2020 at 09:50:00AM +, Uecker, Martin wrote:
> One could also simply remove the error in varasm.c. This
> would preserve the optimization. As a side effect, this
> would allow register without __asm__ at file scope, but
> there do not seem to be any disadvantages. (register
> at file scope is already diagnosed by the C FE when
> using --pedantic).

First of all, such a change wouldn't be appropriate for backports, and IMHO
makes only sense if we actually start optimizing based on that (assuming
the address of the var can't be taken in alias analysis etc.).

Jakub



Re: [PATCH] gimplify: Don't optimize register const vars to static [PR93949]

2020-02-29 Thread Uecker, Martin
On Saturday, 29.02.2020, at 10:57 +0100, Jakub Jelinek wrote:
> On Sat, Feb 29, 2020 at 09:50:00AM +, Uecker, Martin wrote:
> > One could also simply remove the error in varasm.c. This
> > would preserve the optimization. As a side effect, this
> > would allow register without __asm__ at file scope, but
> > there do not seem to be any disadvantages. (register
> > at file scope is already diagnosed by the C FE when
> > using --pedantic).
> 
> First of all, such a change wouldn't be appropriate for backports,

Ok.

> and IMHO
> makes only sense if we actually start optimizing based on that (assuming
> the address of the var can't be taken in alias analysis etc.).

FYI: There is a proposal for C2X from a developer of a compiler
for small embedded systems to allow this for optimization.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2486.htm

Best,
Martin

Re: [PATCH 01/10] i386: Properly encode vector registers in vector move

2020-02-29 Thread H.J. Lu
On Fri, Feb 28, 2020 at 6:15 PM H.J. Lu  wrote:
>
> On Fri, Feb 28, 2020 at 4:16 PM Jeff Law  wrote:
> >
> > On Thu, 2020-02-27 at 06:50 -0800, H.J. Lu wrote:
> > >
> > > How about this?  If it looks OK, I will post the whole patch set.
> > It's better.  I'm guessing the two cases that were previously handled with
> > vextract/vbroadcast aren't supposed to happen?  They're caught here IIUC:
> >
> > > +  /* NB: To move xmm16-xmm31/ymm16-ymm31 registers without AVX512VL,
> > > + we can only use zmm register move without memory operand.  */
> > > +   if (evex_reg_p
> > > +   && !TARGET_AVX512VL
> > > +   && GET_MODE_SIZE (mode) < 64)
> > > + {
> > > +   if (memory_operand (operands[0], mode)
> > > +|| memory_operand (operands[1], mode))
> > > + gcc_unreachable ();
> > >
> >
> > If they truly can't happen, that's fine.  My worry is I don't see changes to
> > the operand predicates or constraints which would avoid this case.   Is it
> > > prevented by the mode iterator on the operands?  Again, just want to make sure
> > > I understand why the vextract/vbroadcast stuff isn't in the new code.
>
> There are no GCC testcases to show that they are actually ever used.   That is
> why I removed them and added gcc_unreachable ().

This is covered by the testcases I added:

[hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
#include 

extern __m128 d;

void
foo1 (__m128 x)
{
  register __m128 xmm16 __asm ("xmm16") = x;
  asm volatile ("" : "+v" (xmm16));
  d = xmm16;
}
[hjl@gnu-cfl-2 gcc]$ gcc -O2 -march=skylake-avx512  /tmp/x.c -S
[hjl@gnu-cfl-2 gcc]$ gcc -O2 -march=skylake-avx512 -mno-avx512vl  /tmp/x.c -S
/tmp/x.c: In function ‘foo1’:
/tmp/x.c:8:19: error: register specified for ‘xmm16’ isn’t suitable for data type
8 |   register __m128 xmm16 __asm ("xmm16") = x;
  |   ^
[hjl@gnu-cfl-2 gcc]$

GCC doesn't allow xmm16-xmm31/ymm16-ymm31 without AVX512VL since
ix86_hard_regno_mode_ok has

 /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
  if (TARGET_AVX512VL
  && (mode == OImode
  || mode == TImode
  || VALID_AVX256_REG_MODE (mode)
  || VALID_AVX512VL_128_REG_MODE (mode)))
return true;

  /* xmm16-xmm31 are only available for AVX-512.  */
  if (EXT_REX_SSE_REGNO_P (regno))
return false;

The vextract/vbroadcast stuff is dead code.
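
A standalone model of just the two checks quoted above may make the argument easier to follow. It is a sketch under assumptions: `quoted_checks` and `is_ext_sse_index` are hypothetical names, the mode test is reduced to a single boolean covering the 128-bit and 256-bit modes listed in the snippet, and REGNO_UNDECIDED stands for "handled by parts of ix86_hard_regno_mode_ok that are not shown here".

```c
#include <assert.h>
#include <stdbool.h>

enum regno_check { REGNO_OK, REGNO_REJECTED, REGNO_UNDECIDED };

/* xmm16-xmm31, the registers named in the quoted comment.  */
static bool is_ext_sse_index(int xmm_index)
{
    return xmm_index >= 16 && xmm_index <= 31;
}

/* Mirrors only the two quoted tests, in the same order.  */
static enum regno_check quoted_checks(int xmm_index,
                                      bool mode_is_128_or_256_bit,
                                      bool avx512vl)
{
    assert(xmm_index >= 0 && xmm_index <= 31);

    /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
    if (avx512vl && mode_is_128_or_256_bit)
        return REGNO_OK;

    /* Otherwise xmm16-xmm31 are rejected at this point.  */
    if (is_ext_sse_index(xmm_index))
        return REGNO_REJECTED;

    return REGNO_UNDECIDED;
}
```

In this model `quoted_checks (16, true, false)` is rejected, which matches the -mno-avx512vl error shown for /tmp/x.c.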

> > I'm doing a little assuming that the  bits in the old code are
> > mapped correctly to the 32/64 suffixes on the opcodes in the new version.
> >
> > I'm also assuming that mapping of "size" in the argument to ix86_get_ssemov to
> > the operand modifiers g, t, and x are right.  I'm guessing the operand
> > modifiers weren't needed in the original because we had the actual operand and
> > could look at it to get the right modifier.  In the evex, but not avx512vl case
> > those are forced to a g modifier which seems to match the original.
> >
> > Are we going to need further refinements to ix86_output_ssemov/ix86_get_ssemov?
> > If so, then I'd suggest the next patch be those patterns which don't require
> > further refinements to ix86_output_ssemov.
>
> 4 patches don't require changes in ix86_output_ssemov/ix86_get_ssemov:
>
> https://gitlab.com/x86-gcc/gcc/-/commit/426f2464abb80b97b8533f9efa15bbe72e6aa888
> https://gitlab.com/x86-gcc/gcc/-/commit/ec5b40d77f7a4424935275f1a7ccedbce83b6f54
> https://gitlab.com/x86-gcc/gcc/-/commit/92fdd98234984f86b66fb5403dd828661cd7999f
> https://gitlab.com/x86-gcc/gcc/-/commit/f8fa5e571caf6740b36d042d631b4ace11683cd7
>
> I can combine them into a single patch.
>
> Other 5 patches contain a small change to  ix86_output_ssemov:
>
> https://gitlab.com/x86-gcc/gcc/-/commit/b1746392e1d350d689a80fb71b2c72f909c20f30
> https://gitlab.com/x86-gcc/gcc/-/commit/14c3cbdbdcc36fa1edea4572b89a039726a4e2bc
> https://gitlab.com/x86-gcc/gcc/-/commit/69c8c928b26242116cc261a9d2f6b1265218f1d3
> https://gitlab.com/x86-gcc/gcc/-/commit/04335f582f0b281d5f357185d154087997fd7cfd
> https://gitlab.com/x86-gcc/gcc/-/commit/64f6a5d6d3405331d9c02aaae0faccf449d6647a
>
> Should I make the change and submit them for review?

I am preparing the new patch set.

> > If no further refinements to ix86_output_ssemov/ix86_get_ssemov are required,
> > then I think you can just send the rest of the pattern changes in a single
> > unit.
> >
> > jeff
> >

-- 
H.J.


[Patch, fortran] PR92959 - ICE in gfc_conv_associated, at fortran/trans-intrinsic.c:8634

2020-02-29 Thread Paul Richard Thomas
This is another case of the gotchas that come from trying to use
ts.u.cl->backend_decl directly where deferred-length and even, in
this case, fixed-length characters are concerned. The fix is to make
use of the string length obtained from evaluation of the expression.
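
The rule that the relocated chunk enforces (with a target present, a zero-length character pointer is never associated) can be modelled outside the compiler. The following is a hedged sketch in C rather than Fortran: `struct char_ptr` and `associated_with` are hypothetical, and the model only mirrors the final AND with the length test, not gfortran's descriptor handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a character pointer: storage plus length.  */
struct char_ptr {
    const char *data; /* NULL models a disassociated pointer */
    size_t len;       /* string length taken from the evaluated expression */
};

/* Models ASSOCIATED (p, target) for scalar character pointers.  */
static bool associated_with(struct char_ptr p, struct char_ptr target)
{
    /* A disassociated pointer carries no length in this model.  */
    assert(p.data != NULL || p.len == 0);

    bool same_storage = p.data != NULL && p.data == target.data;

    /* Zero character length pointers cannot be associated.  */
    return same_storage && p.len != 0;
}
```

The three states exercised by the new testcase (not associated, associated, zero-sized) map onto a NULL target, two pointers sharing storage with nonzero length, and two pointers sharing storage with length zero.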

Regtested on FC31/x86_64 - OK for trunk?

Paul

2020-02-29  Paul Thomas  

PR fortran/92959
* trans-intrinsic.c (gfc_conv_associated): Eliminate
'nonzero_charlen' and move the chunk to evaluate zero character
length until after the argument evaluation so that the string
length can be used.

2020-02-29  Paul Thomas  

PR fortran/92959
* gfortran.dg/associated_8.f90 : New test.
Index: gcc/fortran/trans-intrinsic.c
===
*** gcc/fortran/trans-intrinsic.c	(revision 279842)
--- gcc/fortran/trans-intrinsic.c	(working copy)
*** gfc_conv_associated (gfc_se *se, gfc_exp
*** 8573,8579 
gfc_se arg2se;
tree tmp2;
tree tmp;
-   tree nonzero_charlen;
tree nonzero_arraylen;
gfc_ss *ss;
bool scalar;
--- 8573,8578 
*** gfc_conv_associated (gfc_se *se, gfc_exp
*** 8629,8641 
if (arg2->expr->ts.type == BT_CLASS)
  	gfc_add_data_component (arg2->expr);

-   nonzero_charlen = NULL_TREE;
-   if (arg1->expr->ts.type == BT_CHARACTER)
- 	nonzero_charlen = fold_build2_loc (input_location, NE_EXPR,
- 	   logical_type_node,
- 	   arg1->expr->ts.u.cl->backend_decl,
- 	   build_zero_cst
- 	   (TREE_TYPE (arg1->expr->ts.u.cl->backend_decl)));
if (scalar)
  {
  	  /* A pointer to a scalar.  */
--- 8628,8633 
*** gfc_conv_associated (gfc_se *se, gfc_exp
*** 8705,8714 

/* If target is present zero character length pointers cannot
  	 be associated.  */
!   if (nonzero_charlen != NULL_TREE)
! 	se->expr = fold_build2_loc (input_location, TRUTH_AND_EXPR,
! logical_type_node,
! se->expr, nonzero_charlen);
  }

se->expr = convert (gfc_typenode_for_spec (&expr->ts), se->expr);
--- 8697,8711 

/* If target is present zero character length pointers cannot
  	 be associated.  */
!   if (arg1->expr->ts.type == BT_CHARACTER)
! 	{
! 	  tmp = arg1se.string_length;
! 	  tmp = fold_build2_loc (input_location, NE_EXPR,
!  logical_type_node, tmp,
!  build_zero_cst (TREE_TYPE (tmp)));
! 	  se->expr = fold_build2_loc (input_location, TRUTH_AND_EXPR,
!   logical_type_node, se->expr, tmp);
! 	}
  }

se->expr = convert (gfc_typenode_for_spec (&expr->ts), se->expr);
Index: gcc/testsuite/gfortran.dg/associated_8.f90
===
*** gcc/testsuite/gfortran.dg/associated_8.f90	(nonexistent)
--- gcc/testsuite/gfortran.dg/associated_8.f90	(working copy)
***
*** 0 
--- 1,37 
+ ! { dg-do run }
+ !
+ ! Test the fix for PR92959, where compilation of ASSOCIATED segfaulted in 's1' and 's2'.
+ !
+ ! Contributed by Gerhard Steinmetz  
+ !
+ program p
+character(:), pointer :: x, y => NULL()
+character, pointer :: u, v => NULL ()
+character(4), target :: tgt = "abcd"
+
+ ! Manifestly not associated
+x => tgt
+u => tgt(1:1)
+call s1 (.false., 1)
+call s2 (.false., 2)
+ ! Manifestly associated
+y => x
+v => u
+call s1 (.true., 3)
+call s2 (.true., 4)
+ ! Zero sized storage sequences must give a false.
+y => tgt(1:0)
+x => y
+call s1 (.false., 5)
+ contains
+subroutine s1 (state, err_no)
+   logical :: state
+   integer :: err_no
+   if (associated(x, y) .neqv. state) stop err_no
+end
+subroutine s2 (state, err_no)
+   logical :: state
+   integer :: err_no
+   if (associated(u, v) .neqv. state) stop err_no
+ end
+ end


[PATCH] [9/10 Regression] lto: Also copy .note.gnu.property section

2020-02-29 Thread H.J. Lu
On Fri, Feb 28, 2020 at 7:38 AM H.J. Lu  wrote:
>
> On Fri, Feb 28, 2020 at 6:30 AM H.J. Lu  wrote:
> >
> > When generating the separate file with LTO debug sections, we should
> > also copy .note.gnu.property section.
> >
> > OK for master if there is no regression?
> >
> > Thanks.
> >
> > H.J.
> > ---
> > libiberty/
> >
> > PR lto/93966
> > * simple-object.c (handle_lto_debug_sections): Also copy
> > .note.gnu.property section.
> >
>
> The test will fail on a non-CET-enabled OS.  Here is the updated patch without
> a testcase.  OK for master and backport to GCC 8/9 branches?
>

This is a GCC 9/10 regression introduced by early LTO debug patches.  Is my
patch:

https://gcc.gnu.org/ml/gcc-patches/2020-02/msg01626.html

OK for master and backport for GCC 9 branch?

Thanks.

-- 
H.J.


V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2020-02-29 Thread H.J. Lu
This patch set was originally submitted in Feb 2019:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01841.html

I broke it into 6 smaller patches for easy review.

On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):

   0:   c5 f9 6f d1 vmovdqa %xmm1,%xmm2
   4:   62 f1 fd 08 6f d1   vmovdqa64 %xmm1,%xmm2

We prefer VEX encoding over EVEX since VEX is shorter.  Also AVX512F
only supports 512-bit vector moves.  AVX512F + AVX512VL supports 128-bit
and 256-bit vector moves.  xmm16-xmm31 and ymm16-ymm31 are disallowed in
128-bit and 256-bit modes when AVX512VL is disabled.  Mode attributes on
x86 vector move patterns indicate target preferences of vector move
encoding.  For scalar register to register move, we can use 512-bit
vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
available.  With AVX512F and AVX512VL, we should use VEX encoding for
128-bit/256-bit vector moves if upper 16 vector registers aren't used.
This patch adds a function, ix86_output_ssemov, to generate vector moves:

1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
   a. With AVX512VL, AVX512VL vector moves will be generated.
   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
  move will be done with zmm register move.

There is no need to set mode attribute to XImode explicitly since
ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
with and without AVX512VL.
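
The three rules listed above can be sketched as a small decision function. This is not the ix86_output_ssemov implementation, only a model under assumptions: `choose_move_encoding` and the enum names are hypothetical, registers are reduced to the highest xmm/ymm index involved, and ENC_NOT_COVERED marks the combination (upper registers, no AVX512VL, memory operand) that the list does not describe.

```c
#include <assert.h>
#include <stdbool.h>

enum move_encoding {
    ENC_SSE_OR_VEX,    /* rule 2 */
    ENC_EVEX_AVX512VL, /* rule 3a */
    ENC_EVEX_512,      /* rules 1 and 3b: zmm register move */
    ENC_NOT_COVERED    /* combination not described by the list */
};

static enum move_encoding choose_move_encoding(bool uses_zmm,
                                               int highest_reg_index,
                                               bool avx512vl,
                                               bool has_memory_operand)
{
    assert(highest_reg_index >= 0 && highest_reg_index <= 31);

    /* 1. zmm registers always use EVEX encoding.  */
    if (uses_zmm)
        return ENC_EVEX_512;

    /* 2. Only xmm0-xmm15/ymm0-ymm15 involved: SSE or VEX encoding.  */
    if (highest_reg_index < 16)
        return ENC_SSE_OR_VEX;

    /* 3a. Upper registers with AVX512VL: AVX512VL vector moves.  */
    if (avx512vl)
        return ENC_EVEX_AVX512VL;

    /* 3b. Upper registers without AVX512VL: register to register
       moves are done as zmm register moves.  */
    if (!has_memory_operand)
        return ENC_EVEX_512;

    return ENC_NOT_COVERED;
}
```

For example, a register to register move between xmm16 and xmm17 without AVX512VL yields ENC_EVEX_512 in this model, while the same move between xmm1 and xmm2 yields ENC_SSE_OR_VEX regardless of AVX512VL.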

Tested on AVX2 and AVX512 with and without --with-arch=native.

H.J. Lu (6):
  i386: Properly encode vector registers in vector move
  i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV

 gcc/config/i386/i386-protos.h |   2 +
 gcc/config/i386/i386.c| 242 ++
 gcc/config/i386/i386.md   | 212 +--
 gcc/config/i386/mmx.md|  29 +--
 gcc/config/i386/predicates.md |   5 -
 gcc/config/i386/sse.md|  98 +--
 .../gcc.target/i386/avx512vl-vmovdqa64-1.c|   7 +-
 gcc/testsuite/gcc.target/i386/pr89229-2a.c|  15 ++
 gcc/testsuite/gcc.target/i386/pr89229-2b.c|  13 +
 gcc/testsuite/gcc.target/i386/pr89229-2c.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-3a.c|  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-3b.c|  12 +
 gcc/testsuite/gcc.target/i386/pr89229-3c.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-4a.c|  17 ++
 gcc/testsuite/gcc.target/i386/pr89229-4b.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-4c.c|   7 +
 gcc/testsuite/gcc.target/i386/pr89229-5a.c|  17 ++
 gcc/testsuite/gcc.target/i386/pr89229-5b.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-5c.c|   7 +
 gcc/testsuite/gcc.target/i386/pr89229-6a.c|  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-6b.c|   7 +
 gcc/testsuite/gcc.target/i386/pr89229-6c.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-7a.c|  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-7b.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-7c.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89346.c   |  15 ++
 26 files changed, 465 insertions(+), 330 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c

-- 
2.24.1



[PATCH 4/6] i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV

2020-02-29 Thread H.J. Lu
There is no need to set mode attribute to XImode or V8DFmode since
ix86_output_ssemov can properly encode xmm16-xmm31 registers with and
without AVX512VL.
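
How the MODE_DF case in the patch below picks its output template can be shown in isolation. This is a sketch, not GCC code: `df_move_template` and `resolve_v_prefix` are hypothetical helpers, the operand tests are reduced to booleans, and the `%v` handling reflects my understanding that a leading `%v` in an i386 output template is emitted as `v` only when AVX is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the MODE_DF case: choose the template.  */
static const char *df_move_template(bool target_avx,
                                    bool dst_is_reg, bool src_is_reg)
{
    /* AVX register to register move: explicit three-operand vmovsd.  */
    if (target_avx && dst_is_reg && src_is_reg)
        return "vmovsd\t{%d1, %0|%0, %d1}";

    /* Everything else: movsd, or vmovsd when AVX is enabled.  */
    return "%vmovsd\t{%1, %0|%0, %1}";
}

/* Resolve a leading "%v": keep the 'v' with AVX, drop it without.  */
static const char *resolve_v_prefix(const char *tmpl, bool target_avx)
{
    assert(tmpl != NULL);
    if (tmpl[0] == '%' && tmpl[1] == 'v')
        return tmpl + (target_avx ? 1 : 2);
    return tmpl;
}
```

Without AVX, `resolve_v_prefix (df_move_template (false, true, true), false)` starts with `movsd`; with AVX and a memory operand the same call chain starts with `vmovsd`.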

gcc/

PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_DF.
* config/i386/i386.md (*movdf_internal): Call ix86_output_ssemov
for TYPE_SSEMOV.  Remove TARGET_AVX512F, TARGET_PREFER_AVX256,
TARGET_AVX512VL and ext_sse_reg_operand check.

gcc/testsuite/

PR target/89229
* gcc.target/i386/pr89229-6a.c: New test.
* gcc.target/i386/pr89229-6b.c: Likewise.
* gcc.target/i386/pr89229-6c.c: Likewise.
---
 gcc/config/i386/i386.c |  6 +++
 gcc/config/i386/i386.md| 44 ++
 gcc/testsuite/gcc.target/i386/pr89229-6a.c | 16 
 gcc/testsuite/gcc.target/i386/pr89229-6b.c |  7 
 gcc/testsuite/gcc.target/i386/pr89229-6c.c |  6 +++
 5 files changed, 38 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c28c162282a..a6fe9894ab8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5130,6 +5130,12 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
 case MODE_SI:
   return "%vmovd\t{%1, %0|%0, %1}";
 
+case MODE_DF:
+  if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+   return "vmovsd\t{%d1, %0|%0, %d1}";
+  else
+   return "%vmovsd\t{%1, %0|%0, %1}";
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e9537fadfe8..060a34c4bd4 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3307,37 +3307,7 @@ (define_insn "*movdf_internal"
   return standard_sse_constant_opcode (insn, operands);
 
 case TYPE_SSEMOV:
-  switch (get_attr_mode (insn))
-   {
-   case MODE_DF:
- if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
-   return "vmovsd\t{%d1, %0|%0, %d1}";
- return "%vmovsd\t{%1, %0|%0, %1}";
-
-   case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-   case MODE_V8DF:
- return "vmovapd\t{%g1, %g0|%g0, %g1}";
-   case MODE_V2DF:
- return "%vmovapd\t{%1, %0|%0, %1}";
-
-   case MODE_V2SF:
- gcc_assert (!TARGET_AVX);
- return "movlps\t{%1, %0|%0, %1}";
-   case MODE_V1DF:
- gcc_assert (!TARGET_AVX);
- return "movlpd\t{%1, %0|%0, %1}";
-
-   case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq.  */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
-   return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-
-   default:
- gcc_unreachable ();
-   }
+  return ix86_output_ssemov (insn, operands);
 
 default:
   gcc_unreachable ();
@@ -3391,10 +3361,7 @@ (define_insn "*movdf_internal"
 
   /* xorps is one byte shorter for non-AVX targets.  */
   (eq_attr "alternative" "12,16")
-(cond [(and (match_test "TARGET_AVX512F")
-(not (match_test "TARGET_PREFER_AVX256")))
- (const_string "XI")
-   (match_test "TARGET_AVX")
+(cond [(match_test "TARGET_AVX")
  (const_string "V2DF")
(ior (not (match_test "TARGET_SSE2"))
 (match_test "optimize_function_for_size_p (cfun)"))
@@ -3410,12 +3377,7 @@ (define_insn "*movdf_internal"
 
   /* movaps is one byte shorter for non-AVX targets.  */
   (eq_attr "alternative" "13,17")
-(cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
- (not (match_test "TARGET_AVX512VL")))
-(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand")))
- (const_string "V8DF")
-   (match_test "TARGET_AVX")
+(cond [(match_test "TARGET_AVX")
  (const_string "DF")
(ior (not (match_test "TARGET_SSE2"))
 (match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6a.c b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
new file mode 100644
index 000..5bc10d25619
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern double d;
+
+void
+foo1 (double x)
+{
+  register double xmm16 __asm ("xmm

[PATCH 1/6] i386: Properly encode vector registers in vector move

2020-02-29 Thread H.J. Lu
On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):

   0:   c5 f9 6f d1 vmovdqa %xmm1,%xmm2
   4:   62 f1 fd 08 6f d1   vmovdqa64 %xmm1,%xmm2
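
The two encodings in the listing above can be told apart by their first byte. The sketch below assumes 64-bit mode and a buffer that starts directly at the VEX or EVEX prefix (no legacy prefixes in front); `classify_prefix` and `prefix_length` are hypothetical helper names, not part of GCC or binutils.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum vec_prefix { PREFIX_VEX2, PREFIX_VEX3, PREFIX_EVEX, PREFIX_OTHER };

/* Classify the encoding from the first instruction byte.  */
static enum vec_prefix classify_prefix(const uint8_t *insn, size_t len)
{
    assert(insn != NULL || len == 0);
    if (len == 0)
        return PREFIX_OTHER;

    switch (insn[0]) {
    case 0xC5: return PREFIX_VEX2; /* 2-byte VEX */
    case 0xC4: return PREFIX_VEX3; /* 3-byte VEX */
    case 0x62: return PREFIX_EVEX; /* 4-byte EVEX */
    default:   return PREFIX_OTHER;
    }
}

/* Number of prefix bytes for each encoding; 0 for anything else.  */
static size_t prefix_length(enum vec_prefix p)
{
    switch (p) {
    case PREFIX_VEX2: return 2;
    case PREFIX_VEX3: return 3;
    case PREFIX_EVEX: return 4;
    default:          return 0;
    }
}
```

Applied to the listing, `c5 f9 6f d1` classifies as 2-byte VEX and `62 f1 fd 08 6f d1` as EVEX; each is followed by one opcode byte and one ModRM byte, which accounts for the 4-byte and 6-byte totals.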

We prefer VEX encoding over EVEX since VEX is shorter.  Also AVX512F
only supports 512-bit vector moves.  AVX512F + AVX512VL supports 128-bit
and 256-bit vector moves.  xmm16-xmm31 and ymm16-ymm31 are disallowed in
128-bit and 256-bit modes when AVX512VL is disabled.  Mode attributes on
x86 vector move patterns indicate target preferences of vector move
encoding.  For scalar register to register move, we can use 512-bit
vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
available.  With AVX512F and AVX512VL, we should use VEX encoding for
128-bit/256-bit vector moves if upper 16 vector registers aren't used.
This patch adds a function, ix86_output_ssemov, to generate vector moves:

1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
   a. With AVX512VL, AVX512VL vector moves will be generated.
   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
  move will be done with zmm register move.

There is no need to set mode attribute to XImode explicitly since
ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
with and without AVX512VL.

Tested on AVX2 and AVX512 with and without --with-arch=native.

gcc/

PR target/89229
PR target/89346
* config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
* config/i386/i386.c (ix86_get_ssemov): New function.
(ix86_output_ssemov): Likewise.
* config/i386/sse.md (VMOVE:mov_internal): Call
ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
check.
(*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
(*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
Remove ext_sse_reg_operand and TARGET_AVX512VL check.
(*movti_internal): Likewise.
(*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.

gcc/testsuite/

PR target/89229
PR target/89346
* gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
* gcc.target/i386/pr89346.c: New test.

gcc/testsuite/

PR target/89229
* gcc.target/i386/pr89229-2a.c: New test.
* gcc.target/i386/pr89229-2b.c: Likewise.
* gcc.target/i386/pr89229-2c.c: Likewise.
* gcc.target/i386/pr89229-3a.c: Likewise.
* gcc.target/i386/pr89229-3b.c: Likewise.
* gcc.target/i386/pr89229-3c.c: Likewise.
---
 gcc/config/i386/i386-protos.h |   2 +
 gcc/config/i386/i386.c| 208 ++
 gcc/config/i386/i386.md   |  86 +---
 gcc/config/i386/sse.md|  98 +
 .../gcc.target/i386/avx512vl-vmovdqa64-1.c|   7 +-
 gcc/testsuite/gcc.target/i386/pr89229-2a.c|  15 ++
 gcc/testsuite/gcc.target/i386/pr89229-2b.c|  13 ++
 gcc/testsuite/gcc.target/i386/pr89229-2c.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89229-3a.c|  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-3b.c|  12 +
 gcc/testsuite/gcc.target/i386/pr89229-3c.c|   6 +
 gcc/testsuite/gcc.target/i386/pr89346.c   |  15 ++
 12 files changed, 303 insertions(+), 181 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 266381ca5a6..39fcaa0ad5f 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -38,6 +38,8 @@ extern void ix86_expand_split_stack_prologue (void);
 extern void ix86_output_addr_vec_elt (FILE *, int);
 extern void ix86_output_addr_diff_elt (FILE *, int, int);
 
+extern const char *ix86_output_ssemov (rtx_insn *, rtx *);
+
 extern enum calling_abi ix86_cfun_abi (void);
 extern enum calling_abi ix86_function_type_abi (const_tree);
 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index dac7a3fc5fd..7bbfbb4c5a7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4915,6 +4915,214 @@ ix86_pre_reload_split (void)
  && !(cfun->curr_properties & PROP_rtl_split_insns));
 }
 
+/* Return the opcode of the TYPE_SSEMOV instruction.  To move from
+   or to xmm16-xmm31/ymm16-ymm31 registers, we either require
+   TARGET_AVX512VL or it is a register to register move which can
+   be done with zmm regis

[PATCH 6/6] i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV

2020-02-29 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.

Remove ext_sse_reg_operand since it is no longer needed.

PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_V1DF and
MODE_V2SF.
* config/i386/mmx.md (MMXMODE:*mov_internal): Call
ix86_output_ssemov for TYPE_SSEMOV.  Remove ext_sse_reg_operand
check.
* config/i386/predicates.md (ext_sse_reg_operand): Removed.
---
 gcc/config/i386/i386.c| 10 ++
 gcc/config/i386/mmx.md| 29 ++---
 gcc/config/i386/predicates.md |  5 -
 3 files changed, 12 insertions(+), 32 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1d3b784532b..f34a708cdc3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5142,6 +5142,16 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
   else
return "%vmovss\t{%1, %0|%0, %1}";
 
+case MODE_V1DF:
+  gcc_assert (!TARGET_AVX);
+   return "movlpd\t{%1, %0|%0, %1}";
+
+case MODE_V2SF:
+  if (TARGET_AVX && REG_P (operands[0]))
+   return "vmovlps\t{%1, %d0|%d0, %1}";
+  else
+   return "%vmovlps\t{%1, %0|%0, %1}";
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index e1c8b0af4c7..c3f195bb34a 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -118,29 +118,7 @@ (define_insn "*mov_internal"
   return standard_sse_constant_opcode (insn, operands);
 
 case TYPE_SSEMOV:
-  switch (get_attr_mode (insn))
-   {
-   case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq.  */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
-   return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-   case MODE_TI:
- return "%vmovdqa\t{%1, %0|%0, %1}";
-   case MODE_XI:
- return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
-
-   case MODE_V2SF:
- if (TARGET_AVX && REG_P (operands[0]))
-   return "vmovlps\t{%1, %0, %0|%0, %0, %1}";
- return "%vmovlps\t{%1, %0|%0, %1}";
-   case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
-   default:
- gcc_unreachable ();
-   }
+  return ix86_output_ssemov (insn, operands);
 
 default:
   gcc_unreachable ();
@@ -189,10 +167,7 @@ (define_insn "*mov_internal"
  (cond [(eq_attr "alternative" "2")
  (const_string "SI")
(eq_attr "alternative" "11,12")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
-   (const_string "XI")
-(match_test "mode == V2SFmode")
+ (cond [(match_test "mode == V2SFmode")
   (const_string "V4SF")
 (ior (not (match_test "TARGET_SSE2"))
  (match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 1119366d54e..71f4cb1193c 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -61,11 +61,6 @@ (define_predicate "sse_reg_operand"
   (and (match_code "reg")
(match_test "SSE_REGNO_P (REGNO (op))")))
 
-;; True if the operand is an AVX-512 new register.
-(define_predicate "ext_sse_reg_operand"
-  (and (match_code "reg")
-   (match_test "EXT_REX_SSE_REGNO_P (REGNO (op))")))
-
 ;; Return true if op is a QImode register.
 (define_predicate "any_QIreg_operand"
   (and (match_code "reg")
-- 
2.24.1



[PATCH 5/6] i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV

2020-02-29 Thread H.J. Lu
There is no need to set mode attribute to V16SFmode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.

gcc/

PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_SF.
* config/i386/i386.md (*movsf_internal): Call ix86_output_ssemov
for TYPE_SSEMOV.  Remove TARGET_PREFER_AVX256, TARGET_AVX512VL
and ext_sse_reg_operand check.

gcc/testsuite/

PR target/89229
* gcc.target/i386/pr89229-7a.c: New test.
* gcc.target/i386/pr89229-7b.c: Likewise.
* gcc.target/i386/pr89229-7c.c: Likewise.
---
 gcc/config/i386/i386.c |  6 +
 gcc/config/i386/i386.md| 26 ++
 gcc/testsuite/gcc.target/i386/pr89229-7a.c | 16 +
 gcc/testsuite/gcc.target/i386/pr89229-7b.c |  6 +
 gcc/testsuite/gcc.target/i386/pr89229-7c.c |  6 +
 5 files changed, 36 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a6fe9894ab8..1d3b784532b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5136,6 +5136,12 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
   else
return "%vmovsd\t{%1, %0|%0, %1}";
 
+case MODE_SF:
+  if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
+   return "vmovss\t{%d1, %0|%0, %d1}";
+  else
+   return "%vmovss\t{%1, %0|%0, %1}";
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 060a34c4bd4..b837c345f4e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3469,24 +3469,7 @@ (define_insn "*movsf_internal"
   return standard_sse_constant_opcode (insn, operands);
 
 case TYPE_SSEMOV:
-  switch (get_attr_mode (insn))
-   {
-   case MODE_SF:
- if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
-   return "vmovss\t{%d1, %0|%0, %d1}";
- return "%vmovss\t{%1, %0|%0, %1}";
-
-   case MODE_V16SF:
- return "vmovaps\t{%g1, %g0|%g0, %g1}";
-   case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
-   case MODE_SI:
- return "%vmovd\t{%1, %0|%0, %1}";
-
-   default:
- gcc_unreachable ();
-   }
+  return ix86_output_ssemov (insn, operands);
 
 case TYPE_MMXMOV:
   switch (get_attr_mode (insn))
@@ -3558,12 +3541,7 @@ (define_insn "*movsf_internal"
  better to maintain the whole registers in single format
  to avoid problems on using packed logical operations.  */
   (eq_attr "alternative" "6")
-(cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
- (not (match_test "TARGET_AVX512VL")))
-(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand")))
- (const_string "V16SF")
-   (ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+(cond [(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
 (match_test "TARGET_SSE_SPLIT_REGS"))
  (const_string "V4SF")
   ]
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7a.c 
b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
new file mode 100644
index 000..856115b2f5a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern float d;
+
+void
+foo1 (float x)
+{
+  register float xmm16 __asm ("xmm16") = x;
+  asm volatile ("" : "+v" (xmm16));
+  register float xmm17 __asm ("xmm17") = xmm16;
+  asm volatile ("" : "+v" (xmm17));
+  d = xmm17;
+}
+
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7b.c 
b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
new file mode 100644
index 000..93d1e43770c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-7a.c"
+
+/* { dg-final { scan-assembler-times 
"vmovaps\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7c.c 
b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
new file mode 100644
index 000..e37ff2bf5bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-7a.c"
+
+/* { dg-final { scan-ass

[PATCH 2/6] i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV

2020-02-29 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.

gcc/

PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_DI.
* config/i386/i386.md (*movdi_internal): Call ix86_output_ssemov
for TYPE_SSEMOV.  Remove ext_sse_reg_operand and TARGET_AVX512VL
check.

gcc/testsuite/

PR target/89229
* gcc.target/i386/pr89229-4a.c: New test.
* gcc.target/i386/pr89229-4b.c: Likewise.
* gcc.target/i386/pr89229-4c.c: Likewise.
---
 gcc/config/i386/i386.c |  9 +++
 gcc/config/i386/i386.md| 31 ++
 gcc/testsuite/gcc.target/i386/pr89229-4a.c | 17 
 gcc/testsuite/gcc.target/i386/pr89229-4b.c |  6 +
 gcc/testsuite/gcc.target/i386/pr89229-4c.c |  7 +
 5 files changed, 41 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7bbfbb4c5a7..baf70a64193 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5118,6 +5118,15 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
 case MODE_V4SF:
   return ix86_get_ssemov (operands, 16, insn_mode, mode);
 
+case MODE_DI:
+  /* Handle broken assemblers that require movd instead of movq. */
+  if (!HAVE_AS_IX86_INTERUNIT_MOVQ
+ && (GENERAL_REG_P (operands[0])
+ || GENERAL_REG_P (operands[1])))
+   return "%vmovd\t{%1, %0|%0, %1}";
+  else
+   return "%vmovq\t{%1, %0|%0, %1}";
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index cea831b6086..d8462b3de37 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2054,31 +2054,7 @@ (define_insn "*movdi_internal"
   return standard_sse_constant_opcode (insn, operands);
 
 case TYPE_SSEMOV:
-  switch (get_attr_mode (insn))
-   {
-   case MODE_DI:
- /* Handle broken assemblers that require movd instead of movq.  */
- if (!HAVE_AS_IX86_INTERUNIT_MOVQ
- && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
-   return "%vmovd\t{%1, %0|%0, %1}";
- return "%vmovq\t{%1, %0|%0, %1}";
-
-   case MODE_TI:
- /* Handle AVX512 registers set.  */
- if (EXT_REX_SSE_REG_P (operands[0])
- || EXT_REX_SSE_REG_P (operands[1]))
-   return "vmovdqa64\t{%1, %0|%0, %1}";
- return "%vmovdqa\t{%1, %0|%0, %1}";
-
-   case MODE_V2SF:
- gcc_assert (!TARGET_AVX);
- return "movlps\t{%1, %0|%0, %1}";
-   case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
-   default:
- gcc_unreachable ();
-   }
+  return ix86_output_ssemov (insn, operands);
 
 case TYPE_SSECVT:
   if (SSE_REG_P (operands[0]))
@@ -2164,10 +2140,7 @@ (define_insn "*movdi_internal"
  (cond [(eq_attr "alternative" "2")
  (const_string "SI")
(eq_attr "alternative" "12,13")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
-  (const_string "TI")
-(match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
   (const_string "TI")
 (ior (not (match_test "TARGET_SSE2"))
  (match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4a.c 
b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
new file mode 100644
index 000..cb9b071e873
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+extern long long i;
+
+long long
+foo1 (void)
+{
+  register long long xmm16 __asm ("xmm16") = i;
+  asm volatile ("" : "+v" (xmm16));
+  register long long xmm17 __asm ("xmm17") = xmm16;
+  asm volatile ("" : "+v" (xmm17));
+  return xmm17;
+}
+
+/* { dg-final { scan-assembler-times 
"vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4b.c 
b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
new file mode 100644
index 000..023e81253a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-4a.c"
+
+/* { dg-final { scan-assembler-times 
"vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4c.c

[PATCH 3/6] i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV

2020-02-29 Thread H.J. Lu
There is no need to set mode attribute to XImode since ix86_output_ssemov
can properly encode xmm16-xmm31 registers with and without AVX512VL.
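
As a rough illustration of what the new tests check (this is not the code in
ix86_output_ssemov; the struct and function below are hypothetical and assume
AVX is enabled), the register-to-register SImode case can be summarized as:

```c
#include <stdbool.h>

/* Hypothetical summary of a register-to-register SImode SSE move.
   Not GCC code; it only mirrors the outcomes that pr89229-5a.c and
   pr89229-5b.c scan for, assuming AVX is enabled.  */
struct move_choice
{
  const char *mnemonic;  /* instruction name */
  int reg_bits;          /* width of the register names in the operands */
};

static struct move_choice
choose_si_reg_move (int dest_regno, int src_regno, bool have_avx512vl)
{
  /* xmm16-xmm31 can only be reached with an EVEX-encoded instruction.  */
  bool ext = dest_regno >= 16 || src_regno >= 16;

  if (!ext)
    return (struct move_choice) { "vmovdqa", 128 };

  /* With AVX512VL the EVEX form may name xmm registers directly.  */
  if (have_avx512vl)
    return (struct move_choice) { "vmovdqa32", 128 };

  /* Without AVX512VL only the 512-bit form is available, so the move
     is emitted on the containing zmm registers.  */
  return (struct move_choice) { "vmovdqa32", 512 };
}
```

With xmm16/xmm17 and AVX512VL this picks vmovdqa32 on xmm registers (what
pr89229-5a.c expects); with -mno-avx512vl it picks vmovdqa32 on zmm registers
(what pr89229-5b.c expects).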

gcc/

PR target/89229
* config/i386/i386.c (ix86_output_ssemov): Handle MODE_SI.
* config/i386/i386.md (*movsi_internal): Call ix86_output_ssemov
for TYPE_SSEMOV.  Remove ext_sse_reg_operand and TARGET_AVX512VL
check.

gcc/testsuite/

PR target/89229
* gcc.target/i386/pr89229-5a.c: New test.
* gcc.target/i386/pr89229-5b.c: Likewise.
* gcc.target/i386/pr89229-5c.c: Likewise.
---
 gcc/config/i386/i386.c |  3 +++
 gcc/config/i386/i386.md| 25 ++
 gcc/testsuite/gcc.target/i386/pr89229-5a.c | 17 +++
 gcc/testsuite/gcc.target/i386/pr89229-5b.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr89229-5c.c |  7 ++
 5 files changed, 35 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index baf70a64193..c28c162282a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5127,6 +5127,9 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands)
   else
return "%vmovq\t{%1, %0|%0, %1}";
 
+case MODE_SI:
+  return "%vmovd\t{%1, %0|%0, %1}";
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d8462b3de37..e9537fadfe8 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2261,25 +2261,7 @@ (define_insn "*movsi_internal"
   gcc_unreachable ();
 
 case TYPE_SSEMOV:
-  switch (get_attr_mode (insn))
-   {
-   case MODE_SI:
-  return "%vmovd\t{%1, %0|%0, %1}";
-   case MODE_TI:
- return "%vmovdqa\t{%1, %0|%0, %1}";
-   case MODE_XI:
- return "vmovdqa32\t{%g1, %g0|%g0, %g1}";
-
-   case MODE_V4SF:
- return "%vmovaps\t{%1, %0|%0, %1}";
-
-   case MODE_SF:
- gcc_assert (!TARGET_AVX);
-  return "movss\t{%1, %0|%0, %1}";
-
-   default:
- gcc_unreachable ();
-   }
+  return ix86_output_ssemov (insn, operands);
 
 case TYPE_MMX:
   return "pxor\t%0, %0";
@@ -2345,10 +2327,7 @@ (define_insn "*movsi_internal"
  (cond [(eq_attr "alternative" "2,3")
  (const_string "DI")
(eq_attr "alternative" "8,9")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
-  (const_string "XI")
-(match_test "TARGET_AVX")
+ (cond [(match_test "TARGET_AVX")
   (const_string "TI")
 (ior (not (match_test "TARGET_SSE2"))
  (match_test "optimize_function_for_size_p (cfun)"))
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5a.c 
b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
new file mode 100644
index 000..fd56f447016
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512" } */
+
+extern int i;
+
+int
+foo1 (void)
+{
+  register int xmm16 __asm ("xmm16") = i;
+  asm volatile ("" : "+v" (xmm16));
+  register int xmm17 __asm ("xmm17") = xmm16;
+  asm volatile ("" : "+v" (xmm17));
+  return xmm17;
+}
+
+/* { dg-final { scan-assembler-times 
"vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5b.c 
b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
new file mode 100644
index 000..261f2e12e8d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-times 
"vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5c.c 
b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
new file mode 100644
index 000..16fad809385
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+#include "pr89229-5a.c"
+
+/* { dg-final { scan-assembler-times 
"vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
-- 
2.24.1



[committed] Fix trivial testsuite fallout from Vlad's recent IRA changes

2020-02-29 Thread Jeff Law
Vlad's recent IRA changes twiddled register allocation slightly causing some
tests to regress.  See 

http://gcc.gnu.org/jenkins

And look at the failures in the last 24hrs.

Anyway, I'm working through them right now.  This is the first issue. 
xstormy16 fails one test because of the register allocation difference.  I've
looked at the before/after and they appear equivalent in terms of code size and
likely performance.  So I'm just updating the test.

Given these tests are scanning for specific instruction sequences with
particular register numbers, I'm surprised how stable they've been over time.

Anyway, I'm committing this to the trunk momentarily and will start a fresh
xstormy16 build.

Jeff
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index ff1d1da3300..0ea4ffcc5f9 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-02-29  Jeff Law  
+
+	* gcc.target/xstormy16/sfr/06_sfrw_to_var.c: Update expected output.
+
 2020-02-28  Iain Sandoe  
 
 	* g++.dg/coroutines/torture/func-params-08.C: Update
diff --git a/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c b/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c
index 39cbab5c3e9..54c9baf8746 100644
--- a/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c
+++ b/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c
@@ -1,5 +1,5 @@
 /* { dg-options { -nostartfiles below100.o -Tbelow100.ld -O2 } } */
-/* { dg-final { scan-assembler "mov.w r6,32532" } } */
+/* { dg-final { scan-assembler "mov.w r1,32532" } } */
 
 #define SFR (*((volatile unsigned short*)0x7f14))
 unsigned short *p = (unsigned short *) 0x7f14;


Re: Minor regression due to recent IRA changes

2020-02-29 Thread Oleg Endo
On Fri, 2020-02-28 at 13:24 -0700, Jeff Law wrote:
> This change:
> 
> > commit 3133bed5d0327e8a9cd0a601b7ecdb9de4fc825d
> > Author: Vladimir N. Makarov 
> > Date:   Sun Feb 23 16:20:05 2020 -0500
> > 
> > Changing cost propagation and ordering colorable bucket
> > heuristics for
> > PR93564.
> > 
> > 2020-02-23  Vladimir Makarov  
> > 
> > PR rtl-optimization/93564
> > * ira-color.c (struct update_cost_queue_elem): New
> > member start.
> > (queue_update_cost, get_next_update_cost): Add new arg
> > start.
> > (allocnos_conflict_p): New function.
> > (update_costs_from_allocno): Add new arg
> > conflict_cost_update_p.
> > Add checking conflicts with allocnos_conflict_p.
> > (update_costs_from_prefs, restore_costs_from_copies):
> > Adjust
> > update_costs_from_allocno calls.
> > (update_conflict_hard_regno_costs): Add checking
> > conflicts with
> > allocnos_conflict_p.  Adjust calls of queue_update_cost
> > and
> > get_next_update_cost.
> > (assign_hard_reg): Adjust calls of
> > queue_update_cost.  Add
> > debugging print.
> > (bucket_allocno_compare_func): Restore previous
> > version.
> > 
> 
> Is causing c-torture/compile/sync-1 to fail with an ICE on sh4eb
> (search for
> "Tests that now fail, but worked before":
> 
> 
> http://3.14.90.209:8080/job/sh4eb-linux-gnu/lastFailedBuild/console
> 
> 
> In the .log we have:
> 
> > /home/gcc/gcc/gcc/testsuite/gcc.c-torture/compile/sync-1.c:253:1:
> > error:
> > unable to find a register to spill in class 'R0_REGS'
> > /home/gcc/gcc/gcc/testsuite/gcc.c-torture/compile/sync-1.c:253:1:
> > error: this
> > is the insn:
> > (insn 209 207 212 2 (parallel [
> > (set (subreg:SI (reg:HI 431) 0)
> > (unspec_volatile:SI [
> > (mem/v:HI (reg/f:SI 299) [-1  S2 A16])
> > (subreg:HI (reg:SI 6 r6 [orig:425 uc+-3 ]
> > [425]) 2)
> > (reg:HI 5 r5 [orig:428 sc+-1 ] [428])
> > ] UNSPECV_CMPXCHG_1))
> > (set (mem/v:HI (reg/f:SI 299) [-1  S2 A16])
> > (unspec_volatile:HI [
> > (const_int 0 [0])
> > ] UNSPECV_CMPXCHG_2))
> > (set (reg:SI 147 t)
> > (unspec_volatile:SI [
> > (const_int 0 [0])
> > ] UNSPECV_CMPXCHG_3))
> > (clobber (scratch:SI))
> > (clobber (reg:SI 0 r0))
> > (clobber (reg:SI 1 r1))
> > ]) "/home/gcc/gcc/gcc/testsuite/gcc.c-torture/compile/sync-
> > 1.c":245:8 
> > 406 {atomic_compare_and_swaphi_soft_gusa}
> >  (expr_list:REG_DEAD (reg:HI 5 r5 [orig:428 sc+-1 ] [428])
> > (expr_list:REG_DEAD (reg:SI 6 r6 [orig:425 uc+-3 ] [425])
> > (expr_list:REG_DEAD (reg/f:SI 299)
> > (expr_list:REG_UNUSED (reg:HI 431)
> > (expr_list:REG_UNUSED (reg:SI 1 r1)
> > (expr_list:REG_UNUSED (reg:SI 0 r0)
> > (nil
> > 
> 
> You should be able to trigger it with a cross compiler at -O2 with
> the attached
> testcase.
> 
> This could well be a target issue.  I haven't tried to debug it.  If
> it's a
> target issue, I'm fully comfortable punting it to the SH folks for
> resolving.

The R0_REGS spill failure is a general problem, in particular with old
reload.  The atomic patterns tend to trigger it in one circumstance or
the other.  The IRA change probably just stresses it more.  Perhaps it
will go away with -mlra.

However, LRA on SH still has its own issues, so it can't be generally
enabled by default yet, unfortunately.  See also some of the recent
posts in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93877

Cheers,
Oleg



Re: Minor regression due to recent IRA changes

2020-02-29 Thread Jeff Law
On Sun, 2020-03-01 at 00:43 +0900, Oleg Endo wrote:
> 
> > This could well be a target issue.  I haven't tried to debug it.  If
> > it's a
> > target issue, I'm fully comfortable punting it to the SH folks for
> > resolving.
> 
> The R0_REGS spill failure is a general problem, in particular with old
> reload.  The atomic patterns tend to trigger it in one circumstance or
> the other.  The IRA change probably just stresses it more.  Perhaps it
> will go away with -mlra.
> 
> However, LRA on SH still has its own issues, so it can't be generally
> enabled by default yet, unfortunately.  See also some of the recent
> posts in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93877
It's almost certainly the case that the recent IRA changes are going to stress
R0 more.  If I'm reading what Vlad did correctly, one of the tie-breakers it's
using now is to choose the lowest numbered register when all else is equal.  So
R0 on SH is likely going to be more problematical.
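
To make the tie-breaker concrete, here is a minimal sketch.  It is not the
ira-color.c code; pick_hard_reg and the flat cost array are hypothetical and
only show why "lowest number wins on equal cost" keeps landing on register 0:

```c
#include <stddef.h>

/* Hypothetical stand-in for the final choice among candidate hard
   registers: return the cheapest one, preferring the lowest-numbered
   register when costs are equal.  Returns -1 if there are none.  */
static int
pick_hard_reg (const int *cost, size_t nregs)
{
  int best = -1;

  for (size_t r = 0; r < nregs; r++)
    /* The strict '<' means a later register only replaces the current
       best when it is strictly cheaper, so ties go to the lower
       register number.  */
    if (best < 0 || cost[r] < cost[best])
      best = (int) r;

  return best;
}
```

With all costs equal this always returns 0, which on SH would be R0.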

I wonder if just reordering the regs on the SH (and adjusting the debug output
to keep that working) would be enough to mitigate some of the R0 problems.

And yes, I saw 93877 fly by too :(

Jeff




Re: Minor regression due to recent IRA changes

2020-02-29 Thread Oleg Endo
On Sat, 2020-02-29 at 08:47 -0700, Jeff Law wrote:
> 
> It's almost certainly the case that the recent IRA changes are going to stress
> R0 more.  If I'm reading what Vlad did correctly, one of the tie-breakers its
> using now is to choose the lowest numbered register when all else is equal.  
> So
> R0 on SH is likely going to be more problematical.
> 
> I wonder if just reordering the regs on the SH (and adjusting the debug output
> to keep that working) would be enough to mitigate some of the R0 problems.

It could open a can of worms.  Off the top of my head, R0 is used to
hold the function return value, and R0:R1 to return structs with sizeof
> 4 bytes.  So if DImode is allocated to R0, it implicitly uses R0:R1,
AFAIR, doesn't it?  Would that kind of thing cause troubles?

Cheers,
Oleg



Re: Minor regression due to recent IRA changes

2020-02-29 Thread Jeff Law
On Sun, 2020-03-01 at 00:55 +0900, Oleg Endo wrote:
> On Sat, 2020-02-29 at 08:47 -0700, Jeff Law wrote:
> > It's almost certainly the case that the recent IRA changes are going to
> > stress
> > R0 more.  If I'm reading what Vlad did correctly, one of the tie-breakers
> > its
> > using now is to choose the lowest numbered register when all else is
> > equal.  So
> > R0 on SH is likely going to be more problematical.
> > 
> > I wonder if just reordering the regs on the SH (and adjusting the debug
> > output
> > to keep that working) would be enough to mitigate some of the R0 problems.
> 
> It could open a can of worms.  Off the top of my head, R0 is used to
> hold the function return value, and R0:R1 to return structs with sizeof
> > 4 bytes.  So if DImode is allocated to R0, it implicitly uses R0:R1,
> AFAIR, doesn't it?  Would that kind of thing cause troubles?
It might.  We might have to move a pair or even a quad if you have modes that
cover r0-r3. It may not be feasible in practice.  I was just thinking off the
top of my head.

jeff



[committed] Trivial or1k fallout from recent IRA changes

2020-02-29 Thread Jeff Law


The IRA changes twiddled the register allocations slightly.  Again I verified
that the code should be the same from a runtime performance and codesize
perspective.

Committing momentarily.

Jeff
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 0ea4ffcc5f9..9b2df5596d7 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,7 @@
 2020-02-29  Jeff Law  
 
+	* gcc.target/or1k/return-2.c: Update expected output.
+
 	* gcc.target/xstormy16/sfr/06_sfrw_to_var.c: Update expected output.
 
 2020-02-28  Iain Sandoe  
diff --git a/gcc/testsuite/gcc.target/or1k/return-2.c b/gcc/testsuite/gcc.target/or1k/return-2.c
index c072ae23142..add3720c88e 100644
--- a/gcc/testsuite/gcc.target/or1k/return-2.c
+++ b/gcc/testsuite/gcc.target/or1k/return-2.c
@@ -16,4 +16,4 @@ struct a getstruct (long aa) {
 /* Ensure our return value is returned on stack.  */
 /* { dg-final { scan-assembler-not "r12," } } */
 /* { dg-final { scan-assembler "l.or\\s+r11, r3, r3" } } */
-/* { dg-final { scan-assembler-times "l.sw\\s+\\d+.r3.," 3 } } */
+/* { dg-final { scan-assembler-times "l.sw\\s+\\d+.r11.," 3 } } */


Re: Minor regression due to recent IRA changes

2020-02-29 Thread Oleg Endo
On Sat, 2020-02-29 at 08:57 -0700, Jeff Law wrote:
> 
> > It could open a can of worms.  Off the top of my head, R0 is used to
> > hold the function return value, and R0:R1 to return structs with sizeof
> > > 4 bytes.  So if DImode is allocated to R0, it implicitly uses R0:R1,
> > 
> > AFAIR, doesn't it?  Would that kind of thing cause troubles?
> 
> It might.  We might have to move a pair or even a quad if you have modes that
> cover r0-r3. It may not be feasible in practice.  I was just thinking off the
> top of my head.
> 

Yeah, for instance 'double _Complex' will be returned in R0-R3 when
compiling for 'without FPU'.  How about adding a target hook or look-up 
table (default 1:1 mapping for other targets)?  Would that be an
option?
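
Roughly what I have in mind, as a sketch only (NREGS, the init functions and
pick_hard_reg_ordered are made-up names, not existing GCC hooks): the
allocator walks an order table instead of raw register numbers, and targets
that don't care keep the 1:1 table.

```c
#include <stddef.h>

#define NREGS 8  /* hypothetical number of allocatable registers */

/* Default table: position i maps to hard register i (1:1 mapping).  */
static void
init_identity_order (int order[NREGS])
{
  for (int i = 0; i < NREGS; i++)
    order[i] = i;
}

/* Hypothetical SH-like table: same registers, but R0 is tried last.  */
static void
init_r0_last_order (int order[NREGS])
{
  for (int i = 0; i < NREGS - 1; i++)
    order[i] = i + 1;
  order[NREGS - 1] = 0;
}

/* Pick the cheapest register, breaking ties by position in ORDER
   rather than by register number.  */
static int
pick_hard_reg_ordered (const int cost[NREGS], const int order[NREGS])
{
  int best = order[0];

  for (size_t i = 1; i < NREGS; i++)
    if (cost[order[i]] < cost[best])
      best = order[i];

  return best;
}
```

This only changes which register wins a tie; R0 is still chosen when it is
strictly cheaper, and the sketch does nothing for values that span several
consecutive registers.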

Cheers,
Oleg



[committed] Update baseline symbols for hppa-linux

2020-02-29 Thread John David Anglin
The attached change updates the baseline symbols for hppa-linux-gnu.  Tested on 
hppa-unknown-linux-gnu.

Dave

2020-02-29  John David Anglin  

PR libstdc++/92906
* config/abi/post/hppa-linux-gnu/baseline_symbols.txt: Update.

diff --git a/libstdc++-v3/config/abi/post/hppa-linux-gnu/baseline_symbols.txt 
b/libstdc++-v3/config/abi/post/hppa-linux-gnu/baseline_symbols.txt
index d870c05b9dc..fdb71818007 100644
--- a/libstdc++-v3/config/abi/post/hppa-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/hppa-linux-gnu/baseline_symbols.txt
@@ -3005,12 +3005,18 @@ FUNC:_ZNSt3_V214error_categoryD1Ev@@GLIBCXX_3.4.21
 FUNC:_ZNSt3_V214error_categoryD2Ev@@GLIBCXX_3.4.21
 FUNC:_ZNSt3_V215system_categoryEv@@GLIBCXX_3.4.21
 FUNC:_ZNSt3_V216generic_categoryEv@@GLIBCXX_3.4.21
+FUNC:_ZNSt3pmr15memory_resourceD0Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr15memory_resourceD1Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr15memory_resourceD2Ev@@GLIBCXX_3.4.28
 FUNC:_ZNSt3pmr19new_delete_resourceEv@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr20get_default_resourceEv@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr20null_memory_resourceEv@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr20set_default_resourceEPNS_15memory_resourceE@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr25monotonic_buffer_resource13_M_new_bufferEjj@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr25monotonic_buffer_resource18_M_release_buffersEv@@GLIBCXX_3.4.26
+FUNC:_ZNSt3pmr25monotonic_buffer_resourceD0Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr25monotonic_buffer_resourceD1Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr25monotonic_buffer_resourceD2Ev@@GLIBCXX_3.4.28
 FUNC:_ZNSt3pmr26synchronized_pool_resource11do_allocateEjj@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr26synchronized_pool_resource13do_deallocateEPvjj@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr26synchronized_pool_resource7releaseEv@@GLIBCXX_3.4.26
@@ -4461,6 +4467,7 @@ 
OBJECT:12:_ZTIN9__gnu_cxx18stdio_sync_filebufIwSt11char_traitsIwEEE@@GLIBCXX_3.4
 OBJECT:12:_ZTINSt10filesystem16filesystem_errorE@@GLIBCXX_3.4.26
 OBJECT:12:_ZTINSt10filesystem7__cxx1116filesystem_errorE@@GLIBCXX_3.4.26
 OBJECT:12:_ZTINSt13__future_base19_Async_state_commonE@@GLIBCXX_3.4.17
+OBJECT:12:_ZTINSt3pmr25monotonic_buffer_resourceE@@GLIBCXX_3.4.28
 OBJECT:12:_ZTINSt3pmr26synchronized_pool_resourceE@@GLIBCXX_3.4.26
 OBJECT:12:_ZTINSt3pmr28unsynchronized_pool_resourceE@@GLIBCXX_3.4.26
 OBJECT:12:_ZTINSt7__cxx1114collate_bynameIcEE@@GLIBCXX_3.4.21
@@ -4617,10 +4624,16 @@ OBJECT:15:_ZTSSt8numpunctIcE@@GLIBCXX_3.4
 OBJECT:15:_ZTSSt8numpunctIwE@@GLIBCXX_3.4
 
OBJECT:16:_ZNSbIwSt11char_traitsIwESaIwEE4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
 OBJECT:16:_ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
+OBJECT:16:_ZTIPDd@@CXXABI_1.3.4
+OBJECT:16:_ZTIPDe@@CXXABI_1.3.4
+OBJECT:16:_ZTIPDf@@CXXABI_1.3.4
 OBJECT:16:_ZTIPDi@@CXXABI_1.3.3
 OBJECT:16:_ZTIPDn@@CXXABI_1.3.5
 OBJECT:16:_ZTIPDs@@CXXABI_1.3.3
 OBJECT:16:_ZTIPDu@@CXXABI_1.3.12
+OBJECT:16:_ZTIPKDd@@CXXABI_1.3.4
+OBJECT:16:_ZTIPKDe@@CXXABI_1.3.4
+OBJECT:16:_ZTIPKDf@@CXXABI_1.3.4
 OBJECT:16:_ZTIPKDi@@CXXABI_1.3.3
 OBJECT:16:_ZTIPKDn@@CXXABI_1.3.5
 OBJECT:16:_ZTIPKDs@@CXXABI_1.3.3
@@ -5080,6 +5093,7 @@ OBJECT:25:_ZTSNSt7__cxx118messagesIwEE@@GLIBCXX_3.4.21
 OBJECT:25:_ZTSNSt7__cxx118numpunctIcEE@@GLIBCXX_3.4.21
 OBJECT:25:_ZTSNSt7__cxx118numpunctIwEE@@GLIBCXX_3.4.21
 OBJECT:25:_ZTSSt20bad_array_new_length@@CXXABI_1.3.8
+OBJECT:26:_ZTSNSt3pmr15memory_resourceE@@GLIBCXX_3.4.28
 OBJECT:27:_ZTSSt19__codecvt_utf8_baseIwE@@GLIBCXX_3.4.21
 OBJECT:28:_ZTSSt19__codecvt_utf8_baseIDiE@@GLIBCXX_3.4.21
 OBJECT:28:_ZTSSt19__codecvt_utf8_baseIDsE@@GLIBCXX_3.4.21
@@ -5088,6 +5102,8 @@ OBJECT:28:_ZTSSt7codecvtIcc11__mbstate_tE@@GLIBCXX_3.4
 OBJECT:28:_ZTSSt7codecvtIwc11__mbstate_tE@@GLIBCXX_3.4
 OBJECT:28:_ZTTSd@@GLIBCXX_3.4
 OBJECT:28:_ZTTSt14basic_iostreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
+OBJECT:28:_ZTVNSt3pmr15memory_resourceE@@GLIBCXX_3.4.28
+OBJECT:28:_ZTVNSt3pmr25monotonic_buffer_resourceE@@GLIBCXX_3.4.28
 OBJECT:28:_ZTVNSt7__cxx1114collate_bynameIcEE@@GLIBCXX_3.4.21
 OBJECT:28:_ZTVNSt7__cxx1114collate_bynameIwEE@@GLIBCXX_3.4.21
 OBJECT:28:_ZTVNSt7__cxx1115messages_bynameIcEE@@GLIBCXX_3.4.21
@@ -5194,6 +5210,7 @@ 
OBJECT:34:_ZTSSt25__codecvt_utf8_utf16_baseIDsE@@GLIBCXX_3.4.21
 OBJECT:34:_ZTSSt9basic_iosIcSt11char_traitsIcEE@@GLIBCXX_3.4
 OBJECT:34:_ZTSSt9basic_iosIwSt11char_traitsIwEE@@GLIBCXX_3.4
 OBJECT:36:_ZTSN10__cxxabiv119__pointer_type_infoE@@CXXABI_1.3
+OBJECT:36:_ZTSNSt3pmr25monotonic_buffer_resourceE@@GLIBCXX_3.4.28
 OBJECT:36:_ZTSSt14codecvt_bynameIcc11__mbstate_tE@@GLIBCXX_3.4
 OBJECT:36:_ZTSSt14codecvt_bynameIwc11__mbstate_tE@@GLIBCXX_3.4
 OBJECT:36:_ZTVN10__cxxabiv117__pbase_type_infoE@@CXXABI_1.3
@@ -5757,6 +5774,9 @@ 
OBJECT:8:_ZGVNSt9money_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE2idE@@G
 
OBJECT:8:_ZGVNSt9money_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE2idE@@GLIBCXX_3.4
 
OBJECT:8:_ZGVNSt9money_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE2idE@@GLIBCXX_3.4
 
OBJECT:8:_ZGVNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEE2idE@@GLIBCXX_3.4
+OBJECT:8:_ZTIDd

Re: Minor regression due to recent IRA changes

2020-02-29 Thread Jeff Law
On Sun, 2020-03-01 at 01:06 +0900, Oleg Endo wrote:
> On Sat, 2020-02-29 at 08:57 -0700, Jeff Law wrote:
> > > It could open a can of worms.  Off the top of my head, R0 is used to
> > > hold the function return value, and R0:R1 to return structs with sizeof
> > > > 4 bytes.  So if DImode is allocated to R0, it implicitly uses R0:R1,
> > > 
> > > AFAIR, doesn't it?  Would that kind of thing cause troubles?
> > 
> > It might.  We might have to move a pair or even a quad if you have modes
> > that
> > cover r0-r3. It may not be feasible in practice.  I was just thinking off
> > the
> > top of my head.
> > 
> 
> Yeah, for instance 'double _Complex' will be returned in R0-R3 when
> compiling for 'without FPU'.  How about adding a target hook or look-up 
> table (default 1:1 mapping for other targets)?  Would that be an
> option?
I think it's pretty deeply baked that we can iterate from the first register in
a group to the last.  Given we'd have to move quads, I suspect this isn't
feasible in practice.

It really would have just been a workaround for some of the R0 issues anyway. 
I think at its core R0 on the SH probably needs to be treated more like a
temporary rather than a general register.  But that's probably a huge change,
both in terms of just getting it working right and in terms of addressing the
code quality regressions that would introduce.

jeff



Re: Minor regression due to recent IRA changes

2020-02-29 Thread Oleg Endo
On Sat, 2020-02-29 at 09:38 -0700, Jeff Law wrote:
> 
> It really would have just been a workaround for some of the R0 issues anyway. 
> I think at its core R0 on the SH probably needs to be treated more like a
> temporary rather than a general register.  But that's probably a huge change,
> both in terms of just getting it working right and in terms of addressing the
> code quality regressions that would introduce.
> 

I think one of the major issues is that R0 is a constraint in several
addressing modes for memory accesses.  I believe I once had the idea of
hiding R0 from RA ... then insert reg-reg copies (to load R0) after
RA/reload ... and then somehow do back propagation to get rid of the
reg-reg copies again.  Another idea was to run a pre-RA pass to pre-
allocate all R0 things.  But I think it's all just running in sqrt(1)
circles after all.

Cheers,
Oleg



[committed] Add dg-require-visibility to some tests

2020-02-29 Thread John David Anglin
This fixes the failure of g++.dg/ext/visibility/ref-temp1.C, 
gfortran.dg/pr90988_4.f and
gfortran.dg/pr91372.f90 on hppa2.0w-hp-hpux11.11.

Dave

2020-02-29  John David Anglin  

* g++.dg/ext/visibility/ref-temp1.C: Require visibility.
* gfortran.dg/pr90988_4.f: Likewise.
* gfortran.dg/pr91372.f90: Likewise.

diff --git a/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C 
b/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C
index ecb62326e1b..5d3e99ddb76 100644
--- a/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C
+++ b/gcc/testsuite/g++.dg/ext/visibility/ref-temp1.C
@@ -1,5 +1,6 @@
 // PR c++/91476
 // Test that hidden and internal visibility propagates to reference temps.
+// { dg-require-visibility "" }

 #define HIDDEN __attribute((visibility("hidden")))

diff --git a/gcc/testsuite/gfortran.dg/pr90988_4.f 
b/gcc/testsuite/gfortran.dg/pr90988_4.f
index 3379b2e128d..0a4e3f6aabf 100644
--- a/gcc/testsuite/gfortran.dg/pr90988_4.f
+++ b/gcc/testsuite/gfortran.dg/pr90988_4.f
@@ -1,4 +1,5 @@
 c { dg-do compile }
+c { dg-require-visibility "" }
module foo
   implicit none
   real a,b,c
diff --git a/gcc/testsuite/gfortran.dg/pr91372.f90 
b/gcc/testsuite/gfortran.dg/pr91372.f90
index b9483141eb6..8c200f683a6 100644
--- a/gcc/testsuite/gfortran.dg/pr91372.f90
+++ b/gcc/testsuite/gfortran.dg/pr91372.f90
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-require-visibility "" }
 ! PR fortran/91372
 module module_sf_lake
 implicit none


[Patch, fortran] PR92976 - [8/9/10 Regression][OOP] ICE in trans_associate_var, at fortran/trans-stmt.c:1963

2020-02-29 Thread Paul Richard Thomas
I am a tiny bit skeptical that this is a regression but I will check.
However, it has clearly been there from the early days of OOP without
being picked up.

The fix is to ensure that the temporary has the correct type of array spec.

Regtested on x86_64/FC31 - OK for trunk and 8-/9- branches ?

Cheers

Paul

2020-02-29  Paul Thomas  

PR fortran/92976
* match.c (select_type_set_tmp): If the selector array spec has
explicit bounds, make the temporary's bounds deferred.

2020-02-29  Paul Thomas  

PR fortran/92976
* gfortran.dg/select_type_48.f90 : New test.
Index: gcc/fortran/match.c
===
*** gcc/fortran/match.c	(revision 279842)
--- gcc/fortran/match.c	(working copy)
*** select_type_set_tmp (gfc_typespec *ts)
*** 6294,6301 
  		= CLASS_DATA (selector)->attr.dimension;
  	  sym->attr.codimension
  		= CLASS_DATA (selector)->attr.codimension;
! 	  sym->as
! 		= gfc_copy_array_spec (CLASS_DATA (selector)->as);
  	}
  	}
  
--- 6294,6307 
  		= CLASS_DATA (selector)->attr.dimension;
  	  sym->attr.codimension
  		= CLASS_DATA (selector)->attr.codimension;
! 	  if (CLASS_DATA (selector)->as->type != AS_EXPLICIT)
! 		sym->as = gfc_copy_array_spec (CLASS_DATA (selector)->as);
! 	  else
! 		{
! 		  sym->as = gfc_get_array_spec();
! 		  sym->as->rank = CLASS_DATA (selector)->as->rank;
! 		  sym->as->type = AS_DEFERRED;
! 		}
  	}
  	}
  
Index: gcc/testsuite/gfortran.dg/select_type_48.f90
===
*** gcc/testsuite/gfortran.dg/select_type_48.f90	(nonexistent)
--- gcc/testsuite/gfortran.dg/select_type_48.f90	(working copy)
***
*** 0 
--- 1,31 
+ ! { dg-do run }
+ !
+ ! Test the fix for PR92976, in which the TYPE IS statement caused an ICE
+ ! because of the explicit bounds of 'x'.
+ !
+ ! Contributed by Gerhard Steinmetz  
+ !
+ program p
+type t
+   integer :: i
+end type
+class(t), allocatable :: c(:)
+allocate (c, source = [t(),t(),t()])
+call s(c)
+if (sum (c%i) .ne. ) stop 1
+ contains
+subroutine s(x)
+   class(t) :: x(2)
+   select type (x)
+ ! ICE as compiler attempted to assign descriptor to an array
+  type is (t)
+ x%i = 0
+ ! Make sure that bounds are correctly translated.
+ call counter (x)
+   end select
+end
+subroutine counter (arg)
+  type(t) :: arg(:)
+  if (size (arg, 1) .ne. 2) stop 2
+end
+ end


[committed] Explicitly link against libatomic in various libstdc++ tests

2020-02-29 Thread John David Anglin
We need to explicitly link against libatomic on hppa.  The attached changes add 
"dg-add-options libatomic"
to the test setup where needed.

Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.  Committed to trunk.

Dave
-- 
John David Anglin  dave.ang...@bell.net
2020-02-29  John David Anglin  

* testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc:
Add libatomic option.
* testsuite/30_threads/jthread/jthread.cc: Likewise.

diff --git 
a/libstdc++-v3/testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc
 
b/libstdc++-v3/testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc
index cb1637c306d..0efda12708f 100644
--- 
a/libstdc++-v3/testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc
+++ 
b/libstdc++-v3/testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++2a -pthread" }
+// { dg-add-options libatomic }
 // { dg-do run }
 // { dg-require-effective-target c++2a }
 // { dg-require-effective-target pthread }
diff --git a/libstdc++-v3/testsuite/30_threads/jthread/jthread.cc 
b/libstdc++-v3/testsuite/30_threads/jthread/jthread.cc
index c34958c25d9..746ff437c1d 100644
--- a/libstdc++-v3/testsuite/30_threads/jthread/jthread.cc
+++ b/libstdc++-v3/testsuite/30_threads/jthread/jthread.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++2a -pthread" }
+// { dg-add-options libatomic }
 // { dg-do run { target c++2a } }
 // { dg-require-effective-target pthread }
 // { dg-require-gthreads "" }
2020-02-29  John David Anglin  

* testsuite/30_threads/stop_token/stop_callback.cc: Add libatomic
option.
* testsuite/30_threads/stop_token/stop_callback/deadlock-mt.cc:
Likewise.
* testsuite/30_threads/stop_token/stop_callback/deadlock.cc: Likewise.
* testsuite/30_threads/stop_token/stop_callback/destroy.cc: Likewise.
* testsuite/30_threads/stop_token/stop_callback/invoke.cc: Likewise.
* testsuite/30_threads/stop_token/stop_source.cc: Likewise.
* testsuite/30_threads/stop_token/stop_source/assign.cc: Likewise.
* testsuite/30_threads/stop_token/stop_token.cc: Likewise.
* testsuite/30_threads/stop_token/stop_token/stop_possible.cc:
Likewise.

diff --git a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback.cc 
b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback.cc
index da44f8ad8ed..b84d3af4f9b 100644
--- a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback.cc
+++ b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++2a" }
+// { dg-add-options libatomic }
 // { dg-do run { target c++2a } }
 
 #include 
diff --git 
a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock-mt.cc 
b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock-mt.cc
index 12c54db554f..96f7197c3da 100644
--- a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock-mt.cc
+++ b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock-mt.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++2a -pthread"  }
+// { dg-add-options libatomic }
 // { dg-require-effective-target c++2a }
 // { dg-require-effective-target pthread }
 // { dg-require-gthreads "" }
diff --git 
a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock.cc 
b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock.cc
index f9de6e02562..c59446cf1b0 100644
--- a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock.cc
+++ b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/deadlock.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++2a" }
+// { dg-add-options libatomic }
 // { dg-do run { target c++2a } }
 
 #include 
diff --git 
a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc 
b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc
index 3fa4d21c55c..b94743a884c 100644
--- a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc
+++ b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++2a -pthread"  }
+// { dg-add-options libatomic }
 // { dg-require-effective-target c++2a }
 // { dg-require-effective-target pthread }
 // { dg-require-gthreads "" }
diff --git 
a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/invoke.cc 
b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/invoke.cc
index 9b8137cc46d..dc121121a59 100644
--- a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/invoke.cc
+++ b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/invoke.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=g

[committed] Skip libstdc++ charset.cc tests on *-*-hpux*

2020-02-29 Thread John David Anglin
Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.  Committed to trunk.

Dave

2020-02-29  John David Anglin  

* testsuite/17_intro/headers/c++1998/charset.cc: Skip on *-*-hpux*.
* testsuite/17_intro/headers/c++2011/charset.cc: Likewise.
* testsuite/17_intro/headers/c++2014/charset.cc: Likewise.
* testsuite/17_intro/headers/c++2017/charset.cc: Likewise.
* testsuite/17_intro/headers/c++2020/charset.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++1998/charset.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++1998/charset.cc
index 4425e1cf63e..e76edea1559 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++1998/charset.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++1998/charset.cc
@@ -1,5 +1,5 @@
 // { dg-options "-finput-charset=ascii" }
 // { dg-do compile }
-// { dg-skip-if "non-ascii in system headers" { *-*-darwin10*  *-*-darwin[89]* 
} }
+// { dg-skip-if "non-ascii in system headers" { *-*-hpux* *-*-darwin10*  
*-*-darwin[89]* } }

 #include 
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2011/charset.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++2011/charset.cc
index 4425e1cf63e..e76edea1559 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++2011/charset.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2011/charset.cc
@@ -1,5 +1,5 @@
 // { dg-options "-finput-charset=ascii" }
 // { dg-do compile }
-// { dg-skip-if "non-ascii in system headers" { *-*-darwin10*  *-*-darwin[89]* 
} }
+// { dg-skip-if "non-ascii in system headers" { *-*-hpux* *-*-darwin10*  
*-*-darwin[89]* } }

 #include 
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2014/charset.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++2014/charset.cc
index 4425e1cf63e..e76edea1559 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++2014/charset.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2014/charset.cc
@@ -1,5 +1,5 @@
 // { dg-options "-finput-charset=ascii" }
 // { dg-do compile }
-// { dg-skip-if "non-ascii in system headers" { *-*-darwin10*  *-*-darwin[89]* 
} }
+// { dg-skip-if "non-ascii in system headers" { *-*-hpux* *-*-darwin10*  
*-*-darwin[89]* } }

 #include 
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2017/charset.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++2017/charset.cc
index 4425e1cf63e..e76edea1559 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++2017/charset.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2017/charset.cc
@@ -1,5 +1,5 @@
 // { dg-options "-finput-charset=ascii" }
 // { dg-do compile }
-// { dg-skip-if "non-ascii in system headers" { *-*-darwin10*  *-*-darwin[89]* 
} }
+// { dg-skip-if "non-ascii in system headers" { *-*-hpux* *-*-darwin10*  
*-*-darwin[89]* } }

 #include 
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2020/charset.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++2020/charset.cc
index 4425e1cf63e..e76edea1559 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++2020/charset.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2020/charset.cc
@@ -1,5 +1,5 @@
 // { dg-options "-finput-charset=ascii" }
 // { dg-do compile }
-// { dg-skip-if "non-ascii in system headers" { *-*-darwin10*  *-*-darwin[89]* 
} }
+// { dg-skip-if "non-ascii in system headers" { *-*-hpux* *-*-darwin10*  
*-*-darwin[89]* } }

 #include 


[PATCH] c++: Fix convert_like in template [PR91465, PR93870, PR92031]

2020-02-29 Thread Marek Polacek
The point of this patch is to fix the recurring problem of trees
generated by convert_like while processing a template that break when
substituting.  For instance, when convert_like creates a CALL_EXPR
while in a template, substituting such a call breaks in finish_call_expr
because we have two 'this' arguments.  Another problem is that we
can create &TARGET_EXPR<> and then fail when substituting because we're
taking the address of an rvalue.  I've analyzed some of the already fixed
PRs and also some of the currently open ones:

In c++/93870 we create EnumWrapper::operator E(&operator~(E)).
In c++/87145 we create S::operator int (&{N}).
In c++/92031 we create &TARGET_EXPR <0>.

And so on.  I'd like to fix it once and for all.  I wanted something
that fixes all the existing cases, removes the ugly check in
convert_nontype_argument, and something suitable for stage4.  I.e.,
I didn't implement any cleanups suggested in
 regarding
the pattern in e.g. build_explicit_specifier.

The gist of the problem is when convert_like_real creates a call for
a ck_user or wraps a TARGET_EXPR in & in a template.  So in these cases
use IMPLICIT_CONV_EXPR.  In a template we shouldn't need to perform the
actual conversion, we only need its result type.  Is that something
that convert_like_real shouldn't do?
perform_direct_initialization_if_possible and perform_implicit_conversion_flags
can also create an IMPLICIT_CONV_EXPR.

Given the change above, build_converted_constant_expr can return an
IMPLICIT_CONV_EXPR so call fold_non_dependent_expr rather than
maybe_constant_value to deal with that.  A problem with that is that now
we may instantiate something twice in a row (?).  Handling all of it in
build_converted_constant_expr won't be that straightforward because we
sometimes call cxx_constant_value to give errors, or use manifestly_const_eval
which should be honored.
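
For illustration, here is a minimal sketch of the ck_user situation described
above.  It is not one of the conv-tmpl testcases from this patch; Wrapper,
Holder and from_template are made-up names.  A user-defined conversion is
needed for a non-type template argument while the enclosing function is
itself a template, which is roughly the shape of code where convert_like
used to build a call too early:

```cpp
// Hypothetical example, not taken from the GCC testsuite.
struct Wrapper {
  int v;
  // User-defined conversion: the kind of conversion that the ck_user
  // case in convert_like_real has to handle.
  constexpr operator int () const { return v; }
};

template <int N>
struct Holder {
  static constexpr int value = N;
};

template <typename T>
int
from_template ()
{
  // The Wrapper -> int conversion is requested while the compiler is
  // still processing a template.
  constexpr Wrapper w{3};
  return Holder<w>::value;
}
```

Calling from_template<int> () should yield 3.  Inside the template only the
type of the converted argument matters; the conversion call itself can be
built at instantiation time.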

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2020-02-29  Marek Polacek  

PR c++/92031 - bogus taking address of rvalue error.
PR c++/91465 - ICE with template codes in check_narrowing.
PR c++/93870 - wrong error when converting template non-type arg.
* call.c (convert_like_real) : Return IMPLICIT_CONV_EXPR
in a template.
(convert_like_real) : Likewise.
* decl.c (compute_array_index_type_loc): Call fold_non_dependent_expr
instead of maybe_constant_value.
* pt.c (convert_nontype_argument): Don't build IMPLICIT_CONV_EXPR.
Set IMPLICIT_CONV_EXPR_NONTYPE_ARG if that's what
build_converted_constant_expr returned.
* typeck2.c (check_narrowing): Call fold_non_dependent_expr instead
of maybe_constant_value.

* g++.dg/cpp0x/conv-tmpl2.C: New test.
* g++.dg/cpp0x/conv-tmpl3.C: New test.
* g++.dg/cpp0x/conv-tmpl4.C: New test.
* g++.dg/cpp1z/conv-tmpl1.C: New test.
---
 gcc/cp/call.c   | 12 +
 gcc/cp/decl.c   |  4 +--
 gcc/cp/pt.c | 25 ---
 gcc/cp/typeck2.c|  6 -
 gcc/testsuite/g++.dg/cpp0x/conv-tmpl2.C | 21 
 gcc/testsuite/g++.dg/cpp0x/conv-tmpl3.C | 16 
 gcc/testsuite/g++.dg/cpp0x/conv-tmpl4.C | 33 +
 gcc/testsuite/g++.dg/cpp1z/conv-tmpl1.C | 10 
 8 files changed, 104 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/conv-tmpl2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/conv-tmpl3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/conv-tmpl4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/conv-tmpl1.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 85bbd043a1d..4cb07b61695 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7383,6 +7383,12 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
   {
struct z_candidate *cand = convs->cand;
 
+   /* Creating &TARGET_EXPR<> in a template breaks when substituting,
+  and creating a CALL_EXPR in a template breaks in finish_call_expr
+  so use an IMPLICIT_CONV_EXPR for this conversion.  */
+   if (processing_template_decl)
+ return build1 (IMPLICIT_CONV_EXPR, totype, expr);
+
if (cand == NULL)
  /* We chose the surrogate function from add_conv_candidate, now we
 actually need to build the conversion.  */
@@ -7760,6 +7766,12 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
expr = convert_bitfield_to_declared_type (expr);
expr = fold_convert (type, expr);
  }
+
+   /* Creating &TARGET_EXPR<> in a template would break when
+  tsubsting the expression, so use an IMPLICIT_CONV_EXPR
+  instead.  */
+   if (processing_template_decl)
+ return build1 (IMPLICIT_CONV_EXPR, totype, expr);
expr = build_targ

[committed] XFAIL some IPA tests that are not supported on 32-bit hppa*-*-hpux*

2020-02-29 Thread John David Anglin
IPA-SRA does not handle structures passed by invisible reference when the 
callee makes copies.
This patch xfails tests that depend on this feature on 32-bit hppa*-*-hpux*.
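
For context, a hedged sketch of the general kind of code IPA-SRA targets.
This is not copied from ipa-sra-12.c or ipa-sra-14.c; Big, sum_two and caller
are made-up names.  A local function receives an aggregate but only reads a
couple of its fields, so the pass may split the parameter into the
components that are actually used.  Whether that happens depends on how the
target ABI passes the aggregate, which is the issue on 32-bit hppa*-*-hpux*:

```cpp
// Hypothetical example, not taken from the GCC testsuite.
struct Big {
  int a;
  int b;
  long pad[8];   // Padding so the struct is too large for registers.
};

// Only 'a' and 'b' are read, so the parameter is a candidate for being
// replaced by two scalar parameters.
static int __attribute__ ((noinline))
sum_two (Big s)
{
  return s.a + s.b;
}

int
caller (int x)
{
  Big s = {};
  s.a = x;
  s.b = 2 * x;
  return sum_two (s);
}
```

The result is the same with or without the transformation; only the
"Will split parameter" lines in the IPA dump, which the tests scan for,
differ.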

Dave

2020-02-29  John David Anglin  

PR ipa/92548
* gcc.dg/ipa/ipa-sra-12.c: xfail parameter split test on 32-bit
hppa*-*-hpux*.
* gcc.dg/ipa/ipa-sra-14.c: Likewise.
* gcc.dg/ipa/ipcp-agg-12.c: xfail adding extra caller test.

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c 
b/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
index 689071e566c..4d9057e6353 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
@@ -46,5 +46,5 @@ main (int argc, char *argv[])
   return 0;
 }

-/* { dg-final { scan-ipa-dump-times "Will split parameter" 2 "sra" } } */
-/* { dg-final { scan-ipa-dump-times "component at byte offset" 4 "sra" } } */
+/* { dg-final { scan-ipa-dump-times "Will split parameter" 2 "sra" { xfail { 
hppa*-*-hpux* && { ! lp64 } } } } } */
+/* { dg-final { scan-ipa-dump-times "component at byte offset" 4 "sra" { xfail 
{ hppa*-*-hpux* && { ! lp64 } } } } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c 
b/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
index 01881249d90..3ca302c77e2 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
@@ -56,5 +56,7 @@ main (int argc, char *argv[])
 }


-/* { dg-final { scan-ipa-dump-times "Will split parameter" 2 "sra" } } */
-/* { dg-final { scan-ipa-dump-times "component at byte offset" 4 "sra" } } */
+/* { dg-final { scan-ipa-dump-times "Will split parameter" 2 "sra" { xfail { 
hpp
+a*-*-hpux* && { ! lp64 } } } } } */
+/* { dg-final { scan-ipa-dump-times "component at byte offset" 4 "sra" { xfail 
{ hpp
+a*-*-hpux* && { ! lp64 } } } } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c 
b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
index 5c57913803e..57a94aca049 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
@@ -50,4 +50,4 @@ void entry2 (void)
 }


-/* { dg-final { scan-ipa-dump-times "adding an extra caller" 2 "cp" } } */
+/* { dg-final { scan-ipa-dump-times "adding an extra caller" 2 "cp" { xfail { 
hppa*-*-hpux* && { ! lp64 } } } } } */


Re: Minor regression due to recent IRA changes

2020-02-29 Thread Jeff Law
On Sun, 2020-03-01 at 01:47 +0900, Oleg Endo wrote:
> On Sat, 2020-02-29 at 09:38 -0700, Jeff Law wrote:
> > It really would have just been a workaround for some of the R0 issues
> > anyway. 
> > I think at its core R0 on the SH probably needs to be treated more like a
> > temporary rather than a general register.  But that's probably a huge
> > change,
> > both in terms of just getting it working right and in terms of addressing
> > the
> > code quality regressions that would introduce.
> > 
> 
> I think one of the major issues is that R0 is a constraint in several
> addressing modes for memory accesses.  I believe I once had the idea of
> hiding R0 from RA ... then insert reg-reg copies (to load R0) after
> RA/reload ... and then somehow do back propagation to get rid of the
> reg-reg copies again.  Another idea was to run a pre-RA pass to pre-
> allocate all R0 things.  But I think it's all just running in sqrt(1)
> circles after all.
Yup.  That was roughly what I was thinking and roughly the worry I had with
trying to squash out the quality regressions.  But it may ultimately be the
only way to really resolve these issues.

DJ's work on the m32c IIRC might be useful if you do try to chase this stuff
down.  Essentially there weren't really enough registers.  So he had the port
pretend to have more than it really did, then had a post-reload pass to do the
final allocation into the target's actual register file.

jeff



[PATCH] c++: Add -std=gnu++20 option [PR93958]

2020-02-29 Thread Marek Polacek
One missing bit from r10-6656.  The docs and target-supports.exp
already handle -std=gnu++20.

Ok?

2020-02-29  Marek Polacek  

PR c++/93958 - add missing -std=gnu++20.
* c.opt: Add -std=gnu++20.
---
 gcc/c-family/c.opt | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index b7e4fe146b2..1cd585fa71d 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -2149,7 +2149,11 @@ Conform to the ISO 2017 C++ standard with GNU extensions.
 
 std=gnu++2a
 C++ ObjC++
-Conform to the ISO 2020(?) C++ draft standard with GNU extensions 
(experimental and incomplete support).
+Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental 
and incomplete support).
+
+std=gnu++20
+C++ ObjC++ Alias(std=gnu++2a)
+Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental 
and incomplete support).
 
 std=gnu11
 C ObjC

base-commit: 38b1722d5d44c52e06a8694b8fa36793735e27d1
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[committed] Skip/xfail various tests for gcc-10 on hppa*-*-hpux*

2020-02-29 Thread John David Anglin
This change addresses a bunch of miscellaneous testsuite issues for 
hppa*-*-hpux*.

Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.

Dave

2020-02-29  John David Anglin  

* g++.dg/pr90981.C: Skip on hppa*-*-hpux*.
* gcc.dg/gnu2x-attrs-1.c: Add dg-require-alias.
* gcc.dg/pr90756.c: Add -fno-common option on hppa*-*-hpux*.
* gcc.dg/torture/20190327-1.c: Likewise.
* gcc.dg/spellcheck-options-21.c: Skip on 32-bit hppa*-*-hpux*.
* gcc.dg/strlenopt-68.c: Skip on hppa*-*-hpux*.
* gcc.dg/torture/pr90020.c: Likewise.
* gcc.dg/ucnid-16-utf8.c: Add dg-require-iconv "latin1".

diff --git a/gcc/testsuite/g++.dg/pr90981.C b/gcc/testsuite/g++.dg/pr90981.C
index 5a273027908..b88d6e88b68 100644
--- a/gcc/testsuite/g++.dg/pr90981.C
+++ b/gcc/testsuite/g++.dg/pr90981.C
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "No dwarf debug support" { hppa*-*-hpux* } } */
 /* { dg-options "-O2 -g -gdwarf-5 -gsplit-dwarf" } */

 /* No addresses in the DWARF, so no .debug_addr section,
diff --git a/gcc/testsuite/gcc.dg/gnu2x-attrs-1.c 
b/gcc/testsuite/gcc.dg/gnu2x-attrs-1.c
index 87bdaec0807..2007911e720 100644
--- a/gcc/testsuite/gcc.dg/gnu2x-attrs-1.c
+++ b/gcc/testsuite/gcc.dg/gnu2x-attrs-1.c
@@ -1,6 +1,7 @@
 /* Test C2x attribute syntax.  Test GNU attributes appertain to
appropriate constructs.  */
 /* { dg-do compile } */
+/* { dg-require-alias "" } */
 /* { dg-options "-std=gnu2x" } */

 void f (void) {};
diff --git a/gcc/testsuite/gcc.dg/pr90756.c b/gcc/testsuite/gcc.dg/pr90756.c
index 3507aa29e70..a4ba64acd72 100644
--- a/gcc/testsuite/gcc.dg/pr90756.c
+++ b/gcc/testsuite/gcc.dg/pr90756.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -Wno-psabi" } */
 /* { dg-additional-options "-mno-sse" { target ia32 } } */
+/* { dg-additional-options "-fno-common" { target hppa*-*-hpux* } } */

 typedef float B __attribute__((vector_size(4 * sizeof (float))));
 typedef unsigned long long C __attribute__((vector_size(4 * sizeof (long 
long))));
diff --git a/gcc/testsuite/gcc.dg/spellcheck-options-21.c 
b/gcc/testsuite/gcc.dg/spellcheck-options-21.c
index 3e0e8a8ebaf..92fcb020e12 100644
--- a/gcc/testsuite/gcc.dg/spellcheck-options-21.c
+++ b/gcc/testsuite/gcc.dg/spellcheck-options-21.c
@@ -1,3 +1,4 @@
 /* { dg-do compile } */
+/* { dg-skip-if "-flto not supported" { { hppa*-*-hpux* } && { ! lp64 } } } */
 /* { dg-options "-flto=sparta" } */
 /* { dg-error "unrecognized argument to '-flto=' option: 'sparta'" "" { target 
*-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-68.c 
b/gcc/testsuite/gcc.dg/strlenopt-68.c
index 56d314e5d45..f77162f2c9d 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-68.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-68.c
@@ -3,6 +3,7 @@
the expected result regardless of the order of the expression
operands.
{ dg-do run }
+   { dg-skip-if "UNIX 2003 return behavior not supported" { hppa*-*-hpux* } }
{ dg-options "-O2 -Wall" } */

 #include "strlenopt.h"
diff --git a/gcc/testsuite/gcc.dg/torture/20190327-1.c 
b/gcc/testsuite/gcc.dg/torture/20190327-1.c
index bb20e7fba99..45093da768c 100644
--- a/gcc/testsuite/gcc.dg/torture/20190327-1.c
+++ b/gcc/testsuite/gcc.dg/torture/20190327-1.c
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-additional-options "-fno-common" { target hppa*-*-hpux* } } */

 typedef long v2di __attribute__((vector_size(16)));
 v2di v;
diff --git a/gcc/testsuite/gcc.dg/torture/pr90020.c 
b/gcc/testsuite/gcc.dg/torture/pr90020.c
index 1748243852a..27d1ea41ddd 100644
--- a/gcc/testsuite/gcc.dg/torture/pr90020.c
+++ b/gcc/testsuite/gcc.dg/torture/pr90020.c
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-skip-if "No undefined weak" { hppa*-*-hpux* } } */
 /* { dg-require-weak "" } */
 /* { dg-additional-options "-Wl,-undefined,dynamic_lookup" { target 
*-*-darwin* } } */
 /* { dg-additional-options "-Wl,-flat_namespace" { target *-*-darwin[89]* } } 
*/
diff --git a/gcc/testsuite/gcc.dg/ucnid-16-utf8.c 
b/gcc/testsuite/gcc.dg/ucnid-16-utf8.c
index 5d000a0758a..0b9befa4ad8 100644
--- a/gcc/testsuite/gcc.dg/ucnid-16-utf8.c
+++ b/gcc/testsuite/gcc.dg/ucnid-16-utf8.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-iconv "latin1" } */
 /* { dg-options "-std=c99 -g -finput-charset=latin1" } */
 /* { dg-final { scan-file ucnid-16-utf8.s "²" } } */



[committed] Fix STATIC_CHAIN_REGNUM for v850 port

2020-02-29 Thread Jeff Law

Wow, I think I wrote the v850 port back in circa 1997 and this bug has been
latent all this time.  Vlad's IRA changes twiddled register allocation in just
the right way to expose this bug.

I'm not sure what I was thinking, but apparently I made a spectacularly bad
choice for the STATIC_CHAIN_REGNUM in choosing a call-saved register (r20).

It's simply wrong to use a call-saved register for the static chain.  Think
about the case where we take the address of a nested function.  We actually
get the address of the trampoline.  Then assume we call through that function
pointer at some point deeper in the call stack.  At the call site we have to
express that the static chain register was changed, but there's no way to know
at the call site -- that's the whole point of using the trampoline, it looks
just like a normal indirect call.

The only way I can see to fix this is to fix the static chain register to be a
call clobbered register which is an ABI change.  Thankfully the combination of
v850 and nested functions probably isn't used terribly much.

I've verified this fixed the recent v850 regressions.  Committing to the trunk.

Jeff
commit c7dbc54958321d296ca4e283f26f279f6a5342a7
Author: Jeff Law 
Date:   Sat Feb 29 13:45:37 2020 -0700

Make STATIC_CHAIN_REGNUM a call used register.

* config/v850/v850.h (STATIC_CHAIN_REGNUM): Change to r19.
* config/v850/v850.c (v850_asm_trampoline_template): Update
accordingly.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7d95db8623f..2a69c680a9b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2020-02-28  Jeff Law  
+
+   * config/v850/v850.h (STATIC_CHAIN_REGNUM): Change to r19.
+   * config/v850/v850.c (v850_asm_trampoline_template): Update
+   accordingly.
+
 2020-02-28  Michael Meissner  
 
PR target/93937
diff --git a/gcc/config/v850/v850.c b/gcc/config/v850/v850.c
index 074adf87687..4b0e28c1786 100644
--- a/gcc/config/v850/v850.c
+++ b/gcc/config/v850/v850.c
@@ -2961,7 +2961,7 @@ static void
 v850_asm_trampoline_template (FILE *f)
 {
   fprintf (f, "\tjarl .+4,r12\n");
-  fprintf (f, "\tld.w 12[r12],r20\n");
+  fprintf (f, "\tld.w 12[r12],r19\n");
   fprintf (f, "\tld.w 16[r12],r12\n");
   fprintf (f, "\tjmp [r12]\n");
   fprintf (f, "\tnop\n");
diff --git a/gcc/config/v850/v850.h b/gcc/config/v850/v850.h
index 823bc5e17e3..7ae583c7df2 100644
--- a/gcc/config/v850/v850.h
+++ b/gcc/config/v850/v850.h
@@ -438,8 +438,9 @@ enum reg_class
 /* Base register for access to arguments of the function.  */
 #define ARG_POINTER_REGNUM 35
 
-/* Register in which static-chain is passed to a function.  */
-#define STATIC_CHAIN_REGNUM 20
+/* Register in which static-chain is passed to a function.
+   This must be a call used register.  */
+#define STATIC_CHAIN_REGNUM 19
 
 /* If defined, this macro specifies a table of register pairs used to
eliminate unneeded registers that point into the stack frame.  If


coroutines: Add a test for non-trivial await_resume return type (NFC).

2020-02-29 Thread Iain Sandoe
Hi

Just an improvement to test coverage.

Tested on x86_64 darwin and linux,
applied to master
thanks
Iain

gcc/testsuite/ChangeLog:

2020-02-29  Iain Sandoe  

* g++.dg/coroutines/coro1-ret-int-yield-int.h: Add templated
awaitable.
* g++.dg/coroutines/torture/co-await-15-return-non-triv.C: New test.

diff --git a/gcc/testsuite/g++.dg/coroutines/coro1-ret-int-yield-int.h 
b/gcc/testsuite/g++.dg/coroutines/coro1-ret-int-yield-int.h
index abf625869fa..67ac197fee4 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro1-ret-int-yield-int.h
+++ b/gcc/testsuite/g++.dg/coroutines/coro1-ret-int-yield-int.h
@@ -78,6 +78,16 @@ struct coro1 {
 int& await_resume() const noexcept { PRINT ("susp-always-resume-intprt"); 
return x;}
   };
 
+  template 
+  struct suspend_always_tmpl_awaiter {
+_AwaitType x;
+suspend_always_tmpl_awaiter(_AwaitType __x) : x(__x) {}
+~suspend_always_tmpl_awaiter() {}
+bool await_ready() const noexcept { return false; }
+void await_suspend(coro::coroutine_handle<>) const noexcept { PRINT 
("suspend_always_tmpl_awaiter");}
+_AwaitType await_resume() const noexcept { PRINT 
("suspend_always_tmpl_awaiter"); return x;}
+  };
+
   struct promise_type {
 
   promise_type() : vv(-1) {  PRINT ("Created Promise"); }
diff --git 
a/gcc/testsuite/g++.dg/coroutines/torture/co-await-15-return-non-triv.C 
b/gcc/testsuite/g++.dg/coroutines/torture/co-await-15-return-non-triv.C
new file mode 100644
index 000..70c974bc56a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/co-await-15-return-non-triv.C
@@ -0,0 +1,51 @@
+//  { dg-do run }
+
+/* Check that we handle await_resume for a non-trivial type.  */
+
+#include "../coro.h"
+
+// boiler-plate for tests of codegen
+#include "../coro1-ret-int-yield-int.h"
+
+coro1
+f ()
+{
+  struct test {
+int a;
+~test () {}
+  };
+  test input{5};
+  test res = co_await coro1::suspend_always_tmpl_awaiter(input);
+  co_return res.a + 10;
+}
+
+int main ()
+{
+  PRINT ("main: create coro1");
+  struct coro1 f_coro = f ();
+
+  if (f_coro.handle.done())
+{
+  PRINT ("main: we should not be 'done' [1]");
+  abort ();
+}
+  PRINT ("main: resuming [1] initial suspend");
+  f_coro.handle.resume();
+  PRINT ("main: resuming [2] co_await suspend_always_tmpl_awaiter");
+  f_coro.handle.resume();
+
+  /* we should now have returned with the co_return (15) */
+  if (!f_coro.handle.done())
+{
+  PRINT ("main: we should be 'done' ");
+  abort ();
+}
+  int y = f_coro.handle.promise().get_value();
+  if (y != 15)
+{
+  PRINTF ("main: y is wrong : %d, should be 15\n", y);
+  abort ();
+}
+  PRINT ("main: done");
+  return 0;
+}



[committed] Disable gnat.dg/socket1.adb on hppa*-*-hpux*

2020-02-29 Thread John David Anglin
Committed to trunk and gcc-9 branch.

Dave

2020-02-29  John David Anglin  

PR ada/91100
* gnat.dg/socket1.adb: Disable on hppa*-*-hpux*.

diff --git a/gcc/testsuite/gnat.dg/socket1.adb 
b/gcc/testsuite/gnat.dg/socket1.adb
index a6bdade304b..154a7aff190 100644
--- a/gcc/testsuite/gnat.dg/socket1.adb
+++ b/gcc/testsuite/gnat.dg/socket1.adb
@@ -1,4 +1,4 @@
--- { dg-do run { target { ! "*-*-solaris2*" } } }
+-- { dg-do run { target { ! { hppa*-*-hpux* *-*-solaris2* } } } }

 with GNAT.Sockets; use GNAT.Sockets;
 procedure socket1 is



Re: Minor regression due to recent IRA changes

2020-02-29 Thread Oleg Endo
On Sat, 2020-02-29 at 12:35 -0700, Jeff Law wrote:
> 
> Yup.  That was roughly what I was thinking and roughly the worry I had with
> trying to squash out the quality regressions.  But it may ultimately be the
> only way to really resolve these issues.

Another idea would be to let RA see R0, but ignore all the R0
constraints.  Then try fixing up everything afterwards.  If R0 is
removed from the allocatable reg list, there will be one register less
for it to work with and I'd expect some code quality regressions.  But
in order to fix up all the R0 cases after the regular RA/reload, I
believe it will have to re-do a lot of (similar) work that has been
done by the regular RA already.  One thing that comes instantly to mind
are loops and the use of R0 as index/base register in memory addressing
... it just sounds like a lot of duplicate work in general.

> 
> DJ's work on the m32c IIRC might be useful if you do try to chase this stuff
> down.  Essentially there weren't really enough registers.  So he had the port
> pretend to have more than it really did, then had a post-reload pass to do the
> final allocation into the target's actual register file.
> 

AFAIK DJ did the same (or similar) thing for RL78.  IMHO that just
shows that one type of RA/reload does not fit all.  Perhaps it'd be
better to have the option of different RA/reload implementations, which
implement different strategies for different needs and priorities.

Anyway, on SH the R0 problem seems to go away with LRA for the most
part.  I don't know whether anything has been put into LRA specifically to
address such cases, whether it works as a consequence of the design, or
whether it's a mere coincidence.  If it's the latter, I'm not sure
what to expect in the future.  Perhaps it will start breaking again if
changes for other targets are made to LRA.

Cheers,
Oleg