Re: [PATCH] fix ICEs in c-attribs.c (PR 88383, 89288, 89798, 89797)

2019-04-17 Thread Jakub Jelinek
On Tue, Apr 16, 2019 at 08:40:29PM -0600, Martin Sebor wrote:
> --- gcc/tree.h(revision 270402)
> +++ gcc/tree.h(working copy)
> @@ -3735,9 +3735,9 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
>if (NUM_POLY_INT_COEFFS == 2)
>  {
>poly_uint64 res = 0;
> -  res.coeffs[0] = 1 << (precision & 0xff);
> +  res.coeffs[0] = (unsigned HOST_WIDE_INT)1 << (precision & 0xff);
>if (precision & 0x100)
> - res.coeffs[1] = 1 << (precision & 0xff);
> + res.coeffs[1] = (unsigned HOST_WIDE_INT)1 << (precision & 0xff);

Instead of (unsigned HOST_WIDE_INT)1, one should use the HOST_WIDE_INT_1U
macro.

Jakub
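
The reason the constant needs a wide unsigned type can be sketched in plain C. This is only an illustration under assumptions: WIDE_1U and subparts_from_log2 are hypothetical names, uint64_t stands in for unsigned HOST_WIDE_INT, and the "log2 in the low 8 bits" encoding is inferred from the hunk above rather than taken from tree.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's HOST_WIDE_INT_1U: the constant 1 with
   an unsigned 64-bit type, so the shift is carried out in 64 bits.  */
#define WIDE_1U ((uint64_t) 1)

/* Decode a count that is stored as its base-2 logarithm in the low
   8 bits of PRECISION (an assumed encoding for this sketch).  */
static uint64_t
subparts_from_log2 (unsigned int precision)
{
  unsigned int shift = precision & 0xff;

  /* Shifting by the width of the type or more is undefined.  */
  assert (shift < 64);

  /* Writing "1 << shift" would shift a plain int, which is undefined
     once SHIFT reaches 31 on targets with a 32-bit int.  */
  return WIDE_1U << shift;
}
```

With a 64-bit operand, subparts_from_log2 (40) is well defined, whereas the plain-int form would not be.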


Re: [PATCH] backport r257541, r259936, r260294, r260623, r261098, r261333, r268585.

2019-04-17 Thread Xiong Hu Luo
Hi Segher,

On 2019/4/16 PM6:54, Segher Boessenkool wrote:
> Hi Xiong,
> 
> Sorry I took so long to review this.
> 
> On Thu, Apr 04, 2019 at 02:49:29AM -0500, luo...@linux.ibm.com wrote:
>> These patches are followed changes for r25 on testcases
>> vsx-vector-6*.c.  backport them to update file names and fix regressions
>> for GCC7 on power9.
> 
> (See e.g. https://gcc.gnu.org/ml/gcc-testresults/2019-04/msg01868.html for
> the failures this patch fixes; the patch is for GCC 7).
> 
>> gcc/ChangeLog:
>>
>> 2019-04-03  Xiong Hu Luo 
>>
>>  backport from trunk r260623.
>>
>>  2018-05-23  Segher Boessenkool  
>>
>>  * doc/sourcebuild.texi (Endianness): New subsubsection.
> 
> We write the changelog like
> 
> 2019-04-16  Xiong Hu Luo 
> 
>   Backport from trunk
>   2018-05-23  Segher Boessenkool  
> 
>   * doc/sourcebuild.texi (Endianness): New subsubsection.
> 
> (no revision number, capital on Backport, no empty line after it).
> 
>> 2019-04-03  Xiong Hu Luo 
>>
>>  backport from trunk r257541.
>>
>>  2018-02-07  Will Schmidt  
>>
>>  * gcc.target/powerpc/vsx-vector-6-le.c:  Update CPU target.
>>  * gcc.target/powerpc/vsx-vector-6-le.p9.c:  New.
> 
> Only one space after : please.
> 
>>  2018-05-04 Carl Love  
> 
> Two spaces between date and name.
> 
>>  * gcc.target/powerpc/vsx-vector-6-le.c: Add le qualifiers as needed for
>>  the various instruction counts.  Rename file to vsx-vector-6.p8.c.
> 
> There's a tab after "to" here, should be a space.
> 
> 
> Other than those nits, okay for the GCC 7 branch, thanks!

I will modify all the ChangeLog nits by copy-paste, thanks.

> 
> ("be" and "le" are essentially PowerPC-specific selectors on the 7 branch,
> otherwise you'd need a release manager's approval as well).

Do you mean moving the "be" and "le" code from
gcc/testsuite/lib/target-supports.exp to
gcc/testsuite/gcc.target/powerpc/powerpc.exp?  I tried this, and it
works.
Would this require a ChangeLog update as below?  Or should I rewrite all
the ChangeLog entries with my own sign-off?

2018-05-23  Segher Boessenkool  

* lib/target-supports.exp (check_effective_target_be): New.
(check_effective_target_le): New.
=>

2018-05-23  Segher Boessenkool  

* gcc.target/powerpc/powerpc.exp
(check_effective_target_be): New.
(check_effective_target_le): New.

I would also need to update "doc/sourcebuild.texi", changing
"+@subsubsection Endianness" to "+@subsubsection Endianness For powerpc".



Thanks
Xiong Hu

> 
> 
> Segher
> 



Re: [PATCH PR90078]Capping comp_cost computation in ivopts

2019-04-17 Thread Jakub Jelinek
On Wed, Apr 17, 2019 at 02:13:12PM +0800, bin.cheng wrote:
> Hi,
> As discussed in PR90078, this patch checks possible infinite_cost overflow
> in ivopts.
> Also as discussed, overflow happens mostly because of cost scaling w.r.t.
> bb_freq/loop_freq.
> For the moment, we only implement capping in comp_cost operators, while in
> next stage1, we may instead implement capping in
> get_scaled_computation_cost_at with more supporting benchmark data.
> 
> BTW, I think switching costs around comparison between infinite_cost is
> unnecessary since there will be no overflow in integer after capping with
> infinite_cost.
> 
> Bootstrap and test on x86_64, is it OK?
> 
> Thanks,
> bin
> 
> 2019-04-17  Bin Cheng
> 
> PR tree-optimization/90078
> * tree-ssa-loop-ivopts.c (comp_cost::operator +,-,+=,-=,/=,*=): Add
> checks for infinite_cost overflow.
> 
> 2019-04-17  Bin Cheng
> 
> PR tree-optimization/90078
> * gcc/testsuite/g++.dg/tree-ssa/pr90078.C: New test.

--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -243,6 +243,9 @@ operator+ (comp_cost cost1, comp_cost cost2)
   if (cost1.infinite_cost_p () || cost2.infinite_cost_p ())
 return infinite_cost;
 
+  if (cost1.cost + cost2.cost >= infinite_cost.cost)
+return infinite_cost;

As
#define INFTY 1000
what is the reason to keep the previous condition as well?
I mean, if cost1.cost == INFTY or cost2.cost == INFTY,
cost1.cost + cost2.cost >= INFTY too.
Unless costs can go negative.

@@ -256,6 +259,8 @@ operator- (comp_cost cost1, comp_cost cost2)
 return infinite_cost;
 
   gcc_assert (!cost2.infinite_cost_p ());
+  if (cost1.cost - cost2.cost >= infinite_cost.cost)
+return infinite_cost;

Unless costs can be negative, when you first bail out
for cost1.cost == INFTY, then cost1.cost - cost2.cost won't
be INFTY (but could get negative).  So shouldn't there be a guard against
that instead?  Or, if costs can be negative, shouldn't there be also
guards that it doesn't grow too negative (say smaller than -INFTY)?

Jakub
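
The two questions above can be made concrete with a small saturating-arithmetic sketch. It rests on assumptions: costs are modelled as int64_t values kept within [-SKETCH_INFTY, SKETCH_INFTY], SKETCH_INFTY is an arbitrary illustrative cap, and the function names are hypothetical; none of this is the ivopts code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap for this sketch; not necessarily the value GCC uses.  */
#define SKETCH_INFTY ((int64_t) 1000000)

/* Clamp C into [-SKETCH_INFTY, SKETCH_INFTY].  */
static int64_t
clamp_cost (int64_t c)
{
  if (c >= SKETCH_INFTY)
    return SKETCH_INFTY;
  if (c <= -SKETCH_INFTY)
    return -SKETCH_INFTY;
  return c;
}

/* Saturating addition.  The explicit "infinite operand" test is needed
   only because this sketch allows negative costs; with non-negative
   costs the clamp alone would already return SKETCH_INFTY.  */
static int64_t
cost_add (int64_t a, int64_t b)
{
  if (a == SKETCH_INFTY || b == SKETCH_INFTY)
    return SKETCH_INFTY;
  return clamp_cost (a + b);
}

/* Saturating subtraction.  Once A is known to be finite, A - B cannot
   reach SKETCH_INFTY for non-negative B, so the useful guard is the
   lower bound, which clamp_cost supplies.  */
static int64_t
cost_sub (int64_t a, int64_t b)
{
  if (a == SKETCH_INFTY)
    return SKETCH_INFTY;
  assert (b != SKETCH_INFTY);
  return clamp_cost (a - b);
}
```

Clamping on both sides keeps every result inside the assumed range, so later additions of two in-range costs cannot overflow int64_t.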


Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Thomas König
Hi,

thanks a lot for the extensive discussion :-)

How should we now proceed, first for gcc 9, and then for backporting?
Use Richard's patch with the corresponding Fortran FE change?

Regards

Thomas

Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2)

2019-04-17 Thread Richard Biener
On Tue, 16 Apr 2019, Jakub Jelinek wrote:

> On Tue, Apr 16, 2019 at 06:21:25PM +0200, Eric Botcazou wrote:
> > > The runtime check assures that at runtime, the upper 32 bits of pseudo 104
> > > must be always 0 (in this case, in some other case could be sign bit
> > > copies).
> > 
> > OK, as Richard pointed out, that's not sufficient if we allow...
> > 
> > > The question is if it would be valid say for forward propagation to first
> > > propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104)
> > > 0), then hoisting it before the jump_insn 16, have the subreg optimized
> > > away and miscompile later on.
> > 
> > ...this to happen.  So we could clear SUBREG_PROMOTED_VAR_P as soon as the 
> > SUBREG is rewritten, but this looks quite fragile.  The safest route is 
> > probably not to use SUBREG_PROMOTED_VAR_P in this conditional context.
> > 
> > > That means either that the hoisting pass is buggy, or that
> > > SUBREG_PROMOTED_* is only safe at the function boundary (function
> > > arguments and return value) and not elsewhere.
> > 
> > I think that Richard's characterization is correct:
> > 
> > "Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate
> > zero-extend info (can't you use a REG_EQUIV note somehow?) but it has
> > to be information that is valid everywhere in the function unless
> > data dependences force its motion (thus a conditional doesn't do)."
> > 
> > i.e. this also works for a local variable that is always accessed with the 
> > SUBREG_PROMOTED_VAR_P semantics.
> 
> Ok, here is a patch that just removes all of that SUBREG_PROMOTED_SET then,
> as even for the opN_small_p we can't actually guarantee that for the whole
> function, only for where the pseudo with the SSA_NAME for which we get the
> range appears.  On the bright side, the generated code at least for the
> particular testcase has somewhat different RA decisions, but isn't
> significantly worse.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2019-04-16  Jakub Jelinek  
> 
>   PR middle-end/90095
>   * internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P
>   on lowpart SUBREGs.
> 
>   * gcc.dg/pr90095-1.c: New test.
>   * gcc.dg/pr90095-2.c: New test.
> 
> --- gcc/internal-fn.c.jj  2019-04-15 19:45:22.38646 +0200
> +++ gcc/internal-fn.c 2019-04-16 15:18:56.614708804 +0200
> @@ -1753,22 +1753,9 @@ expand_mul_overflow (location_t loc, tre
> /* If both op0 and op1 are sign (!uns) or zero (uns) extended from
>hmode to mode, the multiplication will never overflow.  We can
>do just one hmode x hmode => mode widening multiplication.  */
> -   rtx lopart0s = lopart0, lopart1s = lopart1;
> -   if (GET_CODE (lopart0) == SUBREG)
> - {
> -   lopart0s = shallow_copy_rtx (lopart0);
> -   SUBREG_PROMOTED_VAR_P (lopart0s) = 1;
> -   SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED);
> - }
> -   if (GET_CODE (lopart1) == SUBREG)
> - {
> -   lopart1s = shallow_copy_rtx (lopart1);
> -   SUBREG_PROMOTED_VAR_P (lopart1s) = 1;
> -   SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED);
> - }
> tree halfstype = build_nonstandard_integer_type (hprec, uns);
> -   ops.op0 = make_tree (halfstype, lopart0s);
> -   ops.op1 = make_tree (halfstype, lopart1s);
> +   ops.op0 = make_tree (halfstype, lopart0);
> +   ops.op1 = make_tree (halfstype, lopart1);
> ops.code = WIDEN_MULT_EXPR;
> ops.type = type;
> rtx thisres
> --- gcc/testsuite/gcc.dg/pr90095-1.c.jj   2019-04-16 13:45:22.614772955 +0200
> +++ gcc/testsuite/gcc.dg/pr90095-1.c  2019-04-16 13:45:22.614772955 +0200
> @@ -0,0 +1,18 @@
> +/* PR middle-end/90095 */
> +/* { dg-do run } */
> +/* { dg-options "-Os -fno-tree-bit-ccp" } */
> +
> +unsigned long long a;
> +unsigned int b;
> +
> +int
> +main ()
> +{
> +  unsigned int c = 255, d = c |= b;
> +  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8)
> +return 0;
> +  d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, &a);
> +  if (d != 0)
> +__builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.dg/pr90095-2.c.jj   2019-04-16 15:20:14.728414325 +0200
> +++ gcc/testsuite/gcc.dg/pr90095-2.c  2019-04-16 15:20:29.597167928 +0200
> @@ -0,0 +1,5 @@
> +/* PR middle-end/90095 */
> +/* { dg-do run } */
> +/* { dg-options "-Os -fno-tree-bit-ccp -fno-split-wide-types" } */
> +
> +#include "pr90095-1.c"
> 
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
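
For readers less familiar with the builtin whose expansion is being fixed, here is a minimal sketch of its source-level behaviour. It assumes a compiler that provides __builtin_mul_overflow (GCC and Clang do); the wrapper names are hypothetical and say nothing about the RTL expansion discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if X * Y does not fit in unsigned long long.  The wrapped
   product is stored in *RES either way.  */
static bool
mul_overflows (unsigned long long x, unsigned long long y,
	       unsigned long long *res)
{
  assert (res != NULL);
  return __builtin_mul_overflow (x, y, res);
}

/* Hypothetical convenience wrapper that discards the product.  */
static bool
mul_would_overflow (unsigned long long x, unsigned long long y)
{
  unsigned long long ignored;
  return mul_overflows (x, y, &ignored);
}
```

In pr90095-1.c above, the operands work out to 2^64 - 255 and 1, so the builtin is expected to report no overflow; this is why the test aborts when d is nonzero.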

Re: [PATCH] Don't ignore leading whitespace in ARM target attribute/pragma (PR target/89093)

2019-04-17 Thread Kyrill Tkachov



On 4/16/19 6:50 PM, Jakub Jelinek wrote:

On Fri, Apr 12, 2019 at 05:10:48PM +0100, Ramana Radhakrishnan wrote:
> No, that's not right. we should get rid of this.

Here is a patch for that.

Bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk?



Ok. I don't think anyone relies on this behaviour.

Thanks,

Kyrill



2019-04-16  Jakub Jelinek  

    PR target/89093
    * config/arm/arm.c (arm_valid_target_attribute_rec): Don't skip
    whitespace at the start of target attribute string.

    * gcc.target/arm/pr89093-2.c: New test.

--- gcc/config/arm/arm.c.jj 2019-04-13 17:20:07.353977370 +0200
+++ gcc/config/arm/arm.c    2019-04-15 19:50:31.386414421 +0200
@@ -30871,8 +30871,6 @@ arm_valid_target_attribute_rec (tree arg

   while ((q = strtok (argstr, ",")) != NULL)
 {
-  while (ISSPACE (*q)) ++q;
-
   argstr = NULL;
   if (!strcmp (q, "thumb"))
 opts->x_target_flags |= MASK_THUMB;
--- gcc/testsuite/gcc.target/arm/pr89093-2.c.jj 2019-04-15 19:53:23.740608673 +0200
+++ gcc/testsuite/gcc.target/arm/pr89093-2.c    2019-04-15 19:52:29.841486100 +0200
@@ -0,0 +1,9 @@
+/* PR target/89093 */
+/* { dg-do compile } */
+
+__attribute__((target (" arm"))) void f1 (void) {} /* { dg-error "unknown target attribute or pragma ' arm'" } */
+__attribute__((target ("   thumb"))) void f2 (void) {} /* { dg-error "unknown target attribute or pragma '   thumb'" } */
+__attribute__((target ("arm,  thumb"))) void f3 (void) {} /* { dg-error "unknown target attribute or pragma '  thumb'" } */
+__attribute__((target ("thumb,  arm"))) void f4 (void) {} /* { dg-error "unknown target attribute or pragma '  arm'" } */
+#pragma GCC target ("    arm") /* { dg-error "unknown target attribute or pragma '    arm'" } */
+void f5 (void) {}


    Jakub


Re: [PATCH] Don't ignore leading whitespace in AArch64 target attribute/pragma (PR target/89093)

2019-04-17 Thread Kyrill Tkachov

Hi Jakub,

On 4/16/19 7:32 PM, Jakub Jelinek wrote:

On Tue, Apr 16, 2019 at 07:50:35PM +0200, Jakub Jelinek wrote:
> On Fri, Apr 12, 2019 at 05:10:48PM +0100, Ramana Radhakrishnan wrote:
> > No, that's not right. we should get rid of this.
>
> Here is a patch for that.
>
> Bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk?

And here is the same thing for aarch64. Bootstrapped/regtested on
aarch64-linux, ok for trunk?



FWIW this looks ok to me implementation-wise (since I wrote that code a 
few years ago).




I think it is better not to accept any spaces in there, than accepting it
only at the beginning and after , but not e.g. at the end or before ,
like the trunk currently does, furthermore, e.g. x86 or ppc don't allow
spaces there.


Thinking about it a bit more, I think it's a good idea to disallow
leading and trailing whitespaces.

But there could be a case for allowing whitespaces between separate
target attributes.

Personally, I would find it more readable to have a space after a comma.

Similarly, spaces are allowed in the general attribute syntax, for
example in our intrinsics header we have:

__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))

That said, distinguishing between the two classes of whitespace is
probably more complexity than it's worth, and if other targets don't
allow it then I won't let it block this patch.

Thanks,

Kyrill




2019-04-16  Jakub Jelinek  

    PR target/89093
    * config/aarch64/aarch64.c (aarch64_process_one_target_attr): Don't
    skip whitespace at the start of target attribute string.

    * gcc.target/aarch64/pr89093.c: New test.
    * gcc.target/aarch64/pr63304_1.c: Remove space from target string.

--- gcc/config/aarch64/aarch64.c.jj 2019-04-11 10:26:22.907293129 +0200
+++ gcc/config/aarch64/aarch64.c    2019-04-15 19:59:55.784226278 +0200
@@ -12536,10 +12536,6 @@ aarch64_process_one_target_attr (char *a
   char *str_to_check = (char *) alloca (len + 1);
   strcpy (str_to_check, arg_str);

-  /* Skip leading whitespace.  */
-  while (*str_to_check == ' ' || *str_to_check == '\t')
-    str_to_check++;
-
   /* We have something like __attribute__ ((target ("+fp+nosimd"))).
  It is easier to detect and handle it explicitly here rather than going
  through the machinery for the rest of the target attributes in this
--- gcc/testsuite/gcc.target/aarch64/pr89093.c.jj 2019-04-15 20:02:25.456788897 +0200
+++ gcc/testsuite/gcc.target/aarch64/pr89093.c  2019-04-15 20:02:04.433131260 +0200
@@ -0,0 +1,7 @@
+/* PR target/89093 */
+/* { dg-do compile } */
+
+__attribute__((target ("  no-strict-align"))) void f1 (void) {} /* { dg-error "is not valid" } */
+__attribute__((target ("   general-regs-only"))) void f2 (void) {} /* { dg-error "is not valid" } */
+#pragma GCC target ("    general-regs-only")   /* { dg-error "is not valid" } */
+void f3 (void) {}
--- gcc/testsuite/gcc.target/aarch64/pr63304_1.c.jj 2017-09-13 16:22:19.795513580 +0200
+++ gcc/testsuite/gcc.target/aarch64/pr63304_1.c 2019-04-15 20:27:17.724847578 +0200
@@ -1,7 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-options "-O1 --save-temps" } */
 #pragma GCC push_options
-#pragma GCC target ("+nothing+simd, cmodel=small")
+#pragma GCC target ("+nothing+simd,cmodel=small")

 int
 cal (double a)


    Jakub


Re: [PATCH] Don't ignore leading whitespace in AArch64 target attribute/pragma (PR target/89093)

2019-04-17 Thread Jakub Jelinek
On Wed, Apr 17, 2019 at 08:59:08AM +0100, Kyrill Tkachov wrote:
> Similarly, spaces are allowed in the general attribute syntax, for example
> in our intrinsics header we have:
> 
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))

Well, that is how the C/C++ lexing works.  We also allow
__attribute__ ((__always_inline__   , __gnu_inline__

,

__artificial__

))
etc.

The whitespace skipping in the target string handling allowed
target (" abc,  def")
but didn't allow
target ("abc , def")
or
target ("abc ,def")
etc.
IMHO either we shouldn't allow any whitespace anywhere, or allow it
everywhere (leading, trailing, before or after comma), but then for
consistency all targets should do that.
If one wants to do some whitespace, there is always an option to do
target ("abc,"
"def")
or
target ("abc,"  "def")
or
#define C(x) #x
target (C(abc)  "," C(def))
or whatever else one wants to use.

Jakub
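
The "no whitespace anywhere" behaviour, together with the string-literal workarounds listed above, can be sketched as follows. This is an illustration under assumptions: the two-entry option table and the names known_option, valid_target_string and STR are hypothetical, and the loop mirrors only the strtok/strcmp shape of the ARM hunk, not the real back-end code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stringify X, like the C(x) macro in the message above.  */
#define STR(x) #x

/* Hypothetical option table for this sketch.  */
static bool
known_option (const char *tok)
{
  return strcmp (tok, "arm") == 0 || strcmp (tok, "thumb") == 0;
}

/* Return true if every comma-separated token of STR_IN is a known
   option.  Tokens are compared exactly, so a leading or trailing space
   makes a token invalid.  */
static bool
valid_target_string (const char *str_in)
{
  char buf[128];

  assert (str_in != NULL);
  if (strlen (str_in) >= sizeof buf)
    return false;
  strcpy (buf, str_in);

  for (char *tok = strtok (buf, ","); tok != NULL; tok = strtok (NULL, ","))
    if (!known_option (tok))
      return false;
  return true;
}
```

Adjacent string literals are concatenated by the compiler before the string is ever parsed, so "arm," "thumb" and STR (arm) "," STR (thumb) both reach the parser as "arm,thumb" and are accepted, while "arm, thumb" and "arm ,thumb" are rejected.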


Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).

2019-04-17 Thread Hongtao Liu
On Tue, Apr 16, 2019 at 11:41 PM H.J. Lu  wrote:
>
> On Tue, Apr 16, 2019 at 8:36 AM Martin Liška  wrote:
> >
> > On 4/16/19 4:50 PM, H.J. Lu wrote:
> > > On Tue, Apr 16, 2019 at 1:28 AM Martin Liška  wrote:
> > >>
> > >> On 4/15/19 5:09 PM, H.J. Lu wrote:
> > >>> On Mon, Apr 15, 2019 at 12:26 AM Martin Liška  wrote:
> > 
> >  On 4/12/19 4:12 PM, H.J. Lu wrote:
> > > On Fri, Apr 12, 2019 at 4:41 AM Martin Liška  wrote:
> > >>
> > >> On 4/11/19 6:30 PM, H.J. Lu wrote:
> > >>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška  wrote:
> > 
> >  Hi.
> > 
> >  The patch is adding missing AVX512 ISAs for target and target_clone
> >  attributes.
> > 
> >  Patch can bootstrap on x86_64-linux-gnu and survives regression 
> >  tests.
> > 
> >  Ready to be installed?
> >  Thanks,
> >  Martin
> > 
> >  gcc/ChangeLog:
> > 
> >  2019-04-10  Martin Liska  
> > 
> >  PR target/89929
> >  * config/i386/i386.c (get_builtin_code_for_version): Add
> >  support for missing AVX512 ISAs.
> > 
> >  gcc/testsuite/ChangeLog:
> > 
> >  2019-04-10  Martin Liska  
> > 
> >  PR target/89929
> >  * g++.target/i386/mv28.C: New test.
> >  * gcc.target/i386/mvc14.c: New test.
> >  ---
> >   gcc/config/i386/i386.c| 34 
> >  ++-
> >   gcc/testsuite/g++.target/i386/mv28.C  | 30 +++
> >   gcc/testsuite/gcc.target/i386/mvc14.c | 16 +
> >   3 files changed, 79 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
> >   create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c
> > 
> > 
> > >>>
> > >>
> > >> Hi.
> > >>
> > >>> Since any ISAs beyond AVX512F may be enabled individually, we
> > >>> can't simply assign priorities to them.   For GFNI, we can have
> > >>>
> > >>> 1. GFNI
> > >>> 2.  GFNI + AVX
> > >>> 3.  GFNI + AVX512F
> > >>> 4. GFNI + AVX512F + AVX512VL
> > >>
> > >> Makes sense to me! I'm considering syntax extension where one would 
> > >> be
> > >> able to come up with a priority. Eg.
> > >>
> > >> __attribute__((target("gfni,avx512bw", priority((3)
> > >>
> > >> Without that the ISA combinations are probably not comparable in a 
> > >> reasonable way.
> > >>
> > >>>
> > >>> For this code,  GFNI + AVX512BW is ignored:
> > >>>
> > >>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii
> > >>> __attribute__((target("gfni")))
> > >>> int foo(int i) {
> > >>> return 1;
> > >>> }
> > >>> __attribute__((target("gfni,avx512bw")))
> > >>> int foo(int i) {
> > >>> return 4;
> > >>> }
> > >>> __attribute__((target("default")))
> > >>> int foo(int i) {
> > >>> return 3;
> > >>> }
> > >>> int bar ()
> > >>> {
> > >>> return foo(2);
> > >>> }
> > >>
> > >> For 'target' attribute it works for me:
> > >>
> > >> 1) $ cat z.c && ./xg++ -B. z.c -c
> > >> #include <x86intrin.h>
> > >> volatile __m512i x1, x2;
> > >> volatile __mmask64 m64;
> > >>
> > >> __attribute__((target("gfni")))
> > >> int foo(int i) {
> > >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
> > >> return 1;
> > >> }
> > >> __attribute__((target("gfni,avx512bw")))
> > >> int foo(int i) {
> > >> return 4;
> > >> }
> > >> __attribute__((target("default")))
> > >> int foo(int i) {
> > >>   return 3;
> > >> }
> > >> int bar ()
> > >> {
> > >> return foo(2);
> > >> }
> > >> In file included from ./include/immintrin.h:117,
> > >>  from ./include/x86intrin.h:32,
> > >>  from z.c:1:
> > >> z.c: In function ‘int foo(int)’:
> > >> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa 
> > >> option -m32 -mgfni -mavx512f
> > >> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
> > >>   |  ^~~~
> > >> z.c:7:10: note: the ABI for passing parameters with 64-byte 
> > >> alignment has changed in GCC 4.6
> > >>
> > >> 2) $ cat z.c && ./xg++ -B. z.c -c
> > >> #include <x86intrin.h>
> > >> volatile __m512i x1, x2;
> > >> volatile __mmask64 m64;
> > >>
> > >> __attribute__((target("gfni")))
> > >> int foo(int i) {
> > >> return 1;
> > >> }
> > >> __attribute__((target("gfni,avx512bw")))
> > >> int foo(int i) {
> > >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
> > >> return 4;
> > >> }
> > >> __attribute__((target("default")))
> > >>

[PR90048] Fortran OpenACC 'private' clause rejected for predetermined private loop iteration variable (was: [patch,gomp4] make fortran loop variables implicitly private in openacc)

2019-04-17 Thread Thomas Schwinge
Hi!

On Mon, 11 Aug 2014 16:55:28 -0700, Cesar Philippidis wrote:
> According to section 2.6.1 in the openacc spec, fortran loop variables
> should be implicitly private like in openmp.

More correctly, they are "predetermined private" (which cannot be
overridden), not "implicit private" (which could be overridden with a
different explicit clause).

> This patch does just so.

But it also introduced PR90048 "Fortran OpenACC 'private' clause rejected
for predetermined private loop iteration variable".

Instead of the patch "don't error on implicitly private induction
variables in gfortran" proposed by Cesar, and then challenged by Jakub, I
have now committed a different patch (more similar to the existing
handling for OpenMP) to trunk in r270406 "[PR90048] Fortran OpenACC
'private' clause rejected for predetermined private loop iteration
variable", see attached.


I have a cleanup patch (for next GCC development stage 1), which will
simply merge the special-case 'gfc_resolve_oacc_blocks' into the generic
'gfc_resolve_omp_parallel_blocks', see attached.


> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/private-1.f95
> @@ -0,0 +1,39 @@
> +! { dg-do compile } 
> +! { dg-additional-options "-fdump-tree-omplower" } 
> +
> +! test for implicit private clauses in do loops
> +
> +program test
> +  implicit none
> +  integer :: i, j, k
> +  logical :: l
> +
> +  !$acc parallel
> +  !$acc loop
> +  do i = 1, 100
> +  end do
> +  !$acc end parallel
> +
> +  !$acc parallel
> +  !$acc loop
> +  do i = 1, 100
> + do j = 1, 100
> + end do
> +  end do
> +  !$acc end parallel
> +
> +  !$acc parallel
> +  !$acc loop
> +  do i = 1, 100
> + do j = 1, 100
> +do k = 1, 100
> +end do
> + end do
> +  end do
> +  !$acc end parallel
> +end program test
> +! { dg-prune-output "unimplemented" }
> +! { dg-final { scan-tree-dump-times "pragma acc parallel" 3 "omplower" } } 
> +! { dg-final { scan-tree-dump-times "private\\(i\\)" 3 "omplower" } } 
> +! { dg-final { scan-tree-dump-times "private\\(j\\)" 2 "omplower" } } 
> +! { dg-final { scan-tree-dump-times "private\\(k\\)" 1 "omplower" } } 

I turned that one and 'gfortran.dg/goacc/private-2.f95' into more
elaborate testcases, committed to trunk in r270405 "[PR90067, PR90114]
Document Fortran OpenACC predetermined private status quo", see attached.


Grüße
 Thomas


From b8d03885017763f914a48b19b6cb383239430b97 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 17 Apr 2019 08:34:20 +
Subject: [PATCH] [PR90048] Fortran OpenACC 'private' clause rejected for
 predetermined private loop iteration variable

	gcc/fortran/
	PR fortran/90048
	* openmp.c (gfc_resolve_do_iterator): Handle sharing_clauses for
	OpenACC, too.
	(gfc_resolve_oacc_blocks): Populate sharing_clauses with private
	clauses.
	gcc/testsuite/
	PR fortran/90048
	* gfortran.dg/goacc/private-explicit-kernels-1.f95: New file.
	* gfortran.dg/goacc/private-explicit-parallel-1.f95: Likewise.
	* gfortran.dg/goacc/private-explicit-routine-1.f95: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@270406 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog |   8 +
 gcc/fortran/openmp.c  |  20 +-
 gcc/testsuite/ChangeLog   |   5 +
 .../goacc/private-explicit-kernels-1.f95  | 248 ++
 .../goacc/private-explicit-parallel-1.f95 | 247 +
 .../goacc/private-explicit-routine-1.f95  | 146 +++
 6 files changed, 671 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/private-explicit-parallel-1.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/private-explicit-routine-1.f95

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index e27743cac280..1ff03e1e85b5 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,3 +1,11 @@
+2019-04-17  Thomas Schwinge  
+
+	PR fortran/90048
+	* openmp.c (gfc_resolve_do_iterator): Handle sharing_clauses for
+	OpenACC, too.
+	(gfc_resolve_oacc_blocks): Populate sharing_clauses with private
+	clauses.
+
 2019-04-14  Paul Thomas  
 
 	PR fortran/89843
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 9fc236760a1c..1c7bce6c3000 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5510,8 +5510,7 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause)
   if (!omp_current_ctx->is_openmp && !oacc_is_loop (omp_current_ctx->code))
 return;
 
-  if (omp_current_ctx->is_openmp
-  && omp_current_ctx->sharing_clauses->contains (sym))
+  if (omp_current_ctx->sharing_clauses->contains (sym))
 return;
 
   if (! omp_current_ctx->private_iterators->add (sym) && add_clause)
@@ -5971,19 +5970,34 @@ void
 gfc_resolve_oacc_blocks (gfc_code *code, gfc_namespace *ns)
 {
   fortran_omp_context ctx;
+  gfc_omp_clauses *omp_clauses = code->ext.omp_clauses;
+  gf

Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Richard Biener
On Wed, Apr 17, 2019 at 9:19 AM Thomas König  wrote:
>
> Hi,
>
> thanks a lot for the extensive discussion :-)
>
> How should we now proceed, first for gcc 9, and then for backporting?
> Use Richard's patch with the corresponding Fortran FE change?

Btw, for the testcase the fortran FE could also simply opt to not
make def_init TREE_READONLY.  Or even better, for all-zero
initialization omit the explicit initialization data and instead
mark it specially in the vtable (just use a NULL initializer
denoting zero-initialization?).  Even .bss costs (runtime) memory.

But yes, my patch would be a way to solve the middle-end issue
of promoting a variable TREE_READONLY, preventing .bss use.
And the FE could then "abuse" this feature.  Note the middle-end
already special-cases variables with an explicit section so the
Fortran FE can already use that feature to put the initializer into
.bss explicitly (set_decl_section_name (decl, ".bss")),
conditional on availability (not 100% sure how to test that...).
Your testcase probably will fail on targets w/o .bss section support.

Richard.

> Regards
>
> Thomas


Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).

2019-04-17 Thread Martin Liška
On 4/17/19 10:14 AM, Hongtao Liu wrote:
> Any other comments, I'll merge this to trunk?

Hi.

I don't understand you.  The patch in its original version will not be
installed to trunk, and I'll rework it to not support AVX512* (except
AVX512F) in the target_clone attribute.

Martin


Re: [PR 85762, 87008, 85459] Relax MEM_REF check in contains_vce_or_bfcref_p

2019-04-17 Thread Martin Jambor
Hello,

On Sun, Mar 10 2019, Martin Jambor wrote:
> Hi,
>
> after we have accidentally dropped the mailing list from our discussion
> (my apologies for not spotting that in time), Richi has approved the
> following patch which I have bootstrapped and tested on x86_64-linux
> (all languages) and on i686-linux, aarch64-linux and ppc64-linux (C, C++
> and Fortran) and so I am about to commit it to trunk.
>
> It XFAILS three guality tests which pass at -O0, which means there are
> three additional XPASSes - there already are 5 pre-existing XPASSes in
> that testcase and 29 outright failures.  I will come back to this next
> in April and see whether I can make the tests pass by decoupling the
> roles now played by cannot_scalarize_away_bitmap (or at least massage
> the testcase to go make the XPASSes go away).  But I won't have time to
> do it next two weeks and this patch is important enough to have it in
> trunk now.  I intend to backport it to gcc 8 in April too.
>
> Thanks,
>
> Martin
>
>
> 2019-03-08  Martin Jambor  
>
>   PR tree-optimization/85762
>   PR tree-optimization/87008
>   PR tree-optimization/85459
>   * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool
>   it points to if there is a type changing MEM_REF.  Adjust all callers.
>   (build_accesses_from_assign): Disable total scalarization if
>   contains_vce_or_bfcref_p returns true through the new parameter, for
>   both rhs and lhs.
>
>   testsuite/
>   * g++.dg/tree-ssa/pr87008.C: New test.
>   * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere.

this patch has been on trunk for over a month and at least so far nobody
complained.  I have applied it to gcc-8-branch and did a bootstrap and
testing on an x86_64-linux machine and there were no problems either.

Therefore I would propose to backport it - the other option being leaving
the gcc 8 regression(s) unfixed.  What do you think?

Martin


2019-04-16  Martin Jambor  

Backport from mainline
2019-03-10  Martin Jambor  

PR tree-optimization/85762
PR tree-optimization/87008
PR tree-optimization/85459
* tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool
it points to if there is a type changing MEM_REF.  Adjust all callers.
(build_accesses_from_assign): Disable total scalarization if
contains_vce_or_bfcref_p returns true through the new parameter, for
both rhs and lhs.

testsuite/
* g++.dg/tree-ssa/pr87008.C: New test.
* gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere.
---
 gcc/testsuite/g++.dg/tree-ssa/pr87008.C | 17 
 gcc/testsuite/gcc.dg/guality/pr54970.c  |  6 ++---
 gcc/tree-sra.c  | 36 ++---
 3 files changed, 47 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr87008.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr87008.C b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
new file mode 100644
index 000..eef521f9ad5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+extern void dontcallthis();
+
+struct A { long a, b; };
+struct B : A {};
+template<class T>void cp(T&a,T const&b){a=b;}
+long f(B x){
+  B y; cp(y,x);
+  B z; cp(z,x);
+  if (y.a - z.a)
+dontcallthis ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "dontcallthis" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/guality/pr54970.c b/gcc/testsuite/gcc.dg/guality/pr54970.c
index 1819d023e21..f12a9aac1d2 100644
--- a/gcc/testsuite/gcc.dg/guality/pr54970.c
+++ b/gcc/testsuite/gcc.dg/guality/pr54970.c
@@ -8,17 +8,17 @@
 int
 main ()
 {
-  int a[] = { 1, 2, 3 };   /* { dg-final { gdb-test 15 "a\[0\]" "1" } } */
+  int a[] = { 1, 2, 3 };   /* { dg-final { gdb-test 15 "a\[0\]" "1" { xfail { *-*-* } } } } */
   int *p = a + 2;  /* { dg-final { gdb-test 15 "a\[1\]" "2" } } */
   int *q = a + 1;  /* { dg-final { gdb-test 15 "a\[2\]" "3" } } */
/* { dg-final { gdb-test 15 "*p" "3" } } */
   asm volatile (NOP);  /* { dg-final { gdb-test 15 "*q" "2" } } */
-  *p += 10;/* { dg-final { gdb-test 20 "a\[0\]" "1" } } */
+  *p += 10;/* { dg-final { gdb-test 20 "a\[0\]" "1" { xfail { *-*-* } } } } */
/* { dg-final { gdb-test 20 "a\[1\]" "2" } } */
/* { dg-final { gdb-test 20 "a\[2\]" "13" } } */
/* { dg-final { gdb-test 20 "*p" "13" } } */
   asm volatile (NOP);  /* { dg-final { gdb-test 20 "*q" "2" } } */
-  *q += 10;/* { dg-final { gdb-test 25 "a\[0\]" "1" } } */
+  *q += 10;/* { dg-final { gdb-test 25 "a\[0\]" "1" { xfail { *-*-* } } } } */
/* { dg-final { 

[PATCH] rs6000: Improve the load/store-with-update patterns (PR17108)

2019-04-17 Thread Segher Boessenkool
Many of these patterns only worked in 32-bit mode, and some only worked
in 64-bit mode.  This patch makes these use Pmode, fixing the PR.  On
the other hand, the stack updates have to use the same mode for the
stack pointer as for the value stored, so let's simplify that a bit.

Many of these patterns pass the wrong mode to
avoiding_indexed_address_p (it should be the mode of the datum
accessed, not the mode of the pointer).

Finally, I merge some patterns into one (using iterators).

Tested on powerpc64-linux {-m32,-m64}.  Committing.


Segher


2019-04-17  Segher Boessenkool  


* config/rs6000/rs6000.c (rs6000_split_multireg_move): Adjust pattern
name.
(rs6000_emit_allocate_stack_1): Simplify condition.  Adjust pattern
name.
* config/rs6000/rs6000.md (bits): Add entries for SF and DF.
(*movdi_update1): Use Pmode.
(movdi_<mode>_update): Fix argument to avoiding_indexed_address_p.
(movdi_<mode>_update_stack): Rename to ...
(movdi_update_stack): ... this.  Fix comment.  Change condition. Don't
use Pmode.
(*movsi_update1): Use Pmode.
(*movsi_update2): Use Pmode.
(movsi_update): Rename to ...
(movsi_<mode>_update): ... this.  Use Pmode.
(movsi_update_stack): Fix condition.
(*movhi_update1): Use Pmode.  Fix argument to
avoiding_indexed_address_p.
(*movhi_update2): Ditto.
(*movhi_update3): Ditto.
(*movhi_update4): Ditto.
(*movqi_update1): Ditto.
(*movqi_update2): Ditto.
(*movqi_update3): Ditto.
(*movsf_update1, *movdf_update1): Merge, rename to...
(*mov_update1): This.  Use Pmode.  Fix argument to
avoiding_indexed_address_p.  Add "size" attribute.
(*movsf_update2, *movdf_update2): Merge, rename to...
(*mov_update2): This.  Ditto.
(*movsf_update3): Use Pmode.  Fix argument to
avoiding_indexed_address_p.
(*movsf_update4): Ditto.
(allocate_stack): Simplify condition.  Adjust pattern names.

---
 gcc/config/rs6000/rs6000.c  |  12 +-
 gcc/config/rs6000/rs6000.md | 274 
 2 files changed, 128 insertions(+), 158 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9105253..ae2249b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24010,7 +24010,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
  emit_insn (TARGET_32BIT
 ? (TARGET_POWERPC64
? gen_movdi_si_update (breg, breg, delta_rtx, nsrc)
-   : gen_movsi_update (breg, breg, delta_rtx, nsrc))
+   : gen_movsi_si_update (breg, breg, delta_rtx, nsrc))
 : gen_movdi_di_update (breg, breg, delta_rtx, nsrc));
  used_update = true;
}
@@ -25486,16 +25486,16 @@ rs6000_emit_allocate_stack_1 (HOST_WIDE_INT size_int, rtx orig_sp)
   size_rtx = tmp_reg;
 }
   
-  if (Pmode == SImode)
+  if (TARGET_32BIT)
 insn = emit_insn (gen_movsi_update_stack (stack_pointer_rtx,
  stack_pointer_rtx,
  size_rtx,
  orig_sp));
   else
-insn = emit_insn (gen_movdi_di_update_stack (stack_pointer_rtx,
-stack_pointer_rtx,
-size_rtx,
-orig_sp));
+insn = emit_insn (gen_movdi_update_stack (stack_pointer_rtx,
+ stack_pointer_rtx,
+ size_rtx,
+ orig_sp));
   rtx par = PATTERN (insn);
   gcc_assert (GET_CODE (par) == PARALLEL);
   rtx set = XVECEXP (par, 0, 0);
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b8dd859..6feaa10 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -563,7 +563,8 @@ (define_mode_attr wd [(QI"b")
  (TI"q")])
 
 ;; How many bits in this mode?
-(define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64")])
+(define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64")
+  (SF "32") (DF "64")])
 
 ; DImode bits
 (define_mode_attr dbits [(QI "56") (HI "48") (SI "32")])
@@ -9083,13 +9084,13 @@ (define_expand "movmemsi"
 
 (define_insn "*movdi_update1"
   [(set (match_operand:DI 3 "gpc_reg_operand" "=r,r")
-   (mem:DI (plus:DI (match_operand:DI 1 "gpc_reg_operand" "0,0")
-(match_operand:DI 2 "reg_or_aligned_short_operand" 
"r,I"
-   (set (match_operand:DI 0 "gpc_reg_operand" "=b,b")
-   (plus:DI (match_dup 1) (match_dup 2)))]
+   (mem:DI (plus:P (match_operand:P 1 "gpc_reg_

Re: [PR 85762, 87008, 85459] Relax MEM_REF check in contains_vce_or_bfcref_p

2019-04-17 Thread Richard Biener
On Wed, 17 Apr 2019, Martin Jambor wrote:

> Hello,
> 
> On Sun, Mar 10 2019, Martin Jambor wrote:
> > Hi,
> >
> > after we have accidentally dropped the mailing list from our discussion
> > (my apologies for not spotting that in time), Richi has approved the
> > following patch which I have bootstrapped and tested on x86_64-linux
> > (all languages) and on i686-linux, aarch64-linux and ppc64-linux (C, C++
> > and Fortran) and so I am about to commit it to trunk.
> >
> > It XFAILS three guality tests which pass at -O0, which means there are
> > three additional XPASSes - there already are 5 pre-existing XPASSes in
> > that testcase and 29 outright failures.  I will come back to this next
> > in April and see whether I can make the tests pass by decoupling the
> > roles now played by cannot_scalarize_away_bitmap (or at least massage
> > the testcase to make the XPASSes go away).  But I won't have time to
> > do it next two weeks and this patch is important enough to have it in
> > trunk now.  I intend to backport it to gcc 8 in April too.
> >
> > Thanks,
> >
> > Martin
> >
> >
> > 2019-03-08  Martin Jambor  
> >
> > PR tree-optimization/85762
> > PR tree-optimization/87008
> > PR tree-optimization/85459
> > * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool
> > it points to if there is a type changing MEM_REF.  Adjust all callers.
> > (build_accesses_from_assign): Disable total scalarization if
> > contains_vce_or_bfcref_p returns true through the new parameter, for
> > both rhs and lhs.
> >
> > testsuite/
> > * g++.dg/tree-ssa/pr87008.C: New test.
> > * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere.
> 
> this patch has been on trunk for over a month and at least so far nobody
> complained.  I have applied it to gcc-8-branch and did a bootstrap and
> testing on an x86_64-linux machine and there were no problems either.
> 
> Therefore I would propose to backport it - the other option being leaving
> the gcc 8 regression(s) unfixed.  What do you think?

Let's go for the backport.

Richard.

> Martin
> 
> 
> 2019-04-16  Martin Jambor  
> 
>   Backport from mainline
>   2019-03-10  Martin Jambor  
> 
> PR tree-optimization/85762
> PR tree-optimization/87008
> PR tree-optimization/85459
> * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool
> it points to if there is a type changing MEM_REF.  Adjust all callers.
> (build_accesses_from_assign): Disable total scalarization if
> contains_vce_or_bfcref_p returns true through the new parameter, for
> both rhs and lhs.
> 
> testsuite/
> * g++.dg/tree-ssa/pr87008.C: New test.
> * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere.
> ---
>  gcc/testsuite/g++.dg/tree-ssa/pr87008.C | 17 
>  gcc/testsuite/gcc.dg/guality/pr54970.c  |  6 ++---
>  gcc/tree-sra.c  | 36 ++---
>  3 files changed, 47 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr87008.C
> 
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr87008.C b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
> new file mode 100644
> index 000..eef521f9ad5
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +extern void dontcallthis();
> +
> +struct A { long a, b; };
> +struct B : A {};
> +template<class T>void cp(T&a,T const&b){a=b;}
> +long f(B x){
> +  B y; cp(y,x);
> +  B z; cp(z,x);
> +  if (y.a - z.a)
> +dontcallthis ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "dontcallthis" "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54970.c b/gcc/testsuite/gcc.dg/guality/pr54970.c
> index 1819d023e21..f12a9aac1d2 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54970.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54970.c
> @@ -8,17 +8,17 @@
>  int
>  main ()
>  {
> -  int a[] = { 1, 2, 3 }; /* { dg-final { gdb-test 15 "a\[0\]" "1" } } */
> +  int a[] = { 1, 2, 3 }; /* { dg-final { gdb-test 15 "a\[0\]" "1" { xfail { *-*-* } } } } */
>int *p = a + 2;/* { dg-final { gdb-test 15 "a\[1\]" "2" } } */
>int *q = a + 1;/* { dg-final { gdb-test 15 "a\[2\]" "3" } } */
>   /* { dg-final { gdb-test 15 "*p" "3" } } */
>asm volatile (NOP);/* { dg-final { gdb-test 15 "*q" "2" } } */
> -  *p += 10;  /* { dg-final { gdb-test 20 "a\[0\]" "1" } } */
> +  *p += 10;  /* { dg-final { gdb-test 20 "a\[0\]" "1" { xfail { *-*-* } } } } */
>   /* { dg-final { gdb-test 20 "a\[1\]" "2" } } */
>   /* { dg-final { gdb-test 20 "a\[2\]" "13" } } */
>   /* { dg-final { gdb-test 20 "*p" "13" } } */
>asm volatile (NOP)

Re: [PATCH] (RFA tree-tailcall) PR c++/82081 - tail call optimization breaks noexcept

2019-04-17 Thread Jason Merrill
On Tue, Apr 16, 2019 at 1:24 AM Richard Biener wrote:
> On Mon, Apr 15, 2019 at 7:09 PM Andrew Pinski  wrote:
> > On Sun, Apr 14, 2019 at 11:50 PM Richard Biener wrote:
> > >
> > > On Sat, Apr 13, 2019 at 12:34 AM Jeff Law  wrote:
> > > >
> > > > On 4/12/19 3:24 PM, Jason Merrill wrote:
> > > > > If a noexcept function calls a function that might throw, doing the
> > > > > tail call optimization means that an exception thrown in the called
> > > > > function will propagate out, breaking the noexcept specification.  So
> > > > > we need to prevent the optimization in that case.
> > > > >
> > > > > Tested x86_64-pc-linux-gnu.  OK for trunk or hold for GCC 10?  This
> > > > > isn't a regression, but it is a straightforward fix for a wrong-code
> > > > > bug.
> > > > >
> > > > >   * tree-tailcall.c (find_tail_calls): Don't turn a call from a
> > > > >   nothrow function to a might-throw function into a tail call.
> > > > I'd go on the trunk.  It's a wrong-code issue, what we're doing is just
> > > > plain wrong.  One could even make a case for backporting to the 
> > > > branches.
> > >
> > > Hmm, how's this different from adding another indirection?  That is,
> > > I don't understand why the tailcall is the issue here, shouldn't unwind
> > > still stop at the noexcept caller?  Thus, isn't this wrong CFI instead?
> >
> > noexcept caller is no longer on the stack so the unwinder does not see it.
> > It is not the tail call from a normal function to a noexcept that is
> > an issue but rather inside a noexcept caller to a normal function.
>
> Hmm, OK, so essentially a tail-call cannot be represented in the CFI
> program.

Right.  Because the "caller" frame no longer exists.

> > > Of course I know too little about this.
> > >
> > > Btw, doesn't your check also prevent tail/sibling calls when
> > > the caller wraps it into a try { } catch (...) {}?  Or does unwind
> > > not work in that case either?
> > >
> > > Btw, I'd like to see a runtime testcase that fails.
> >
> > There is one in the bug report.  Though it would not work for the
> > testsuite.  It should not be hard to change it to be one that works
> > for the testsuite.
>
> With dg-additional-sources and registering a custom std::terminate
> it should work I guess (or by catching SIGABRT).
>
> The patch and the bug also suggests that an internally
> throwing function cannot be tail-called either (can't find a testcase
> we'd mark as tail-call here)

If you mean a call wrapped in try/catch, that is correct.  The
tail-call optimization breaks all exception handlers, so the patch
prevents it if the call can throw and is in an active exception
region.
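
To make the two cases concrete, here is a minimal C++ sketch.  It is not
the testcase from the bug report and not part of the patch; the names
may_throw, wrapper and guarded are made up for illustration.  In both
wrapper and guarded the call sits in tail position, yet the caller's frame
has to stay live while the callee runs.

```cpp
#include <stdexcept>

// Hypothetical callee that may throw.
int may_throw(int x)
{
  if (x < 0)
    throw std::runtime_error("negative");
  return x;
}

// The call is in tail position, but the enclosing function is noexcept:
// an exception escaping may_throw has to end in std::terminate here, so
// the frame of wrapper must still exist while may_throw runs.
int wrapper(int x) noexcept
{
  return may_throw(x);
}

// Same situation with an explicit handler: if the call were turned into
// a tail call, the catch clause would no longer be active during it.
int guarded(int x)
{
  try
    {
      return may_throw(x);
    }
  catch (...)
    {
      return -1;
    }
}

static_assert(noexcept(wrapper(0)), "wrapper is declared noexcept");
static_assert(!noexcept(may_throw(0)), "may_throw is allowed to throw");
```

With correct code generation guarded (-5) returns -1, and wrapper (-5)
ends in std::terminate instead of letting the exception reach its caller.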

Jason


[PATCH] auto-inc-dec: Set alignment properly

2019-04-17 Thread Segher Boessenkool
When auto-inc-dec creates a new mem to compute the cost of doing some
transform, it forgets to copy over the alignment of the original mem.
This gives wrong costs, for example, for rs6000 a floating point load
or store is hugely expensive if unaligned.  This patch fixes it.

This doesn't fix any test case I'm aware of, but it is a very simple
patch.  Is it okay for trunk?


Segher


2019-04-17  Segher Boessenkool  

* auto-inc-dec.c (attempt_change): Set the alignment of the
temporary memory to that of the original.

---
 gcc/auto-inc-dec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index 43400cc..bdb6efa 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -471,6 +471,7 @@ attempt_change (rtx new_addr, rtx inc_reg)
   int regno;
   rtx mem = *mem_insn.mem_loc;
   machine_mode mode = GET_MODE (mem);
+  int align = MEM_ALIGN (mem);
   rtx new_mem;
   int old_cost = 0;
   int new_cost = 0;
@@ -478,6 +479,7 @@ attempt_change (rtx new_addr, rtx inc_reg)
 
   PUT_MODE (mem_tmp, mode);
   XEXP (mem_tmp, 0) = new_addr;
+  set_mem_align (mem_tmp, align);
 
   old_cost = (set_src_cost (mem, mode, speed)
  + set_rtx_cost (PATTERN (inc_insn.insn), speed));
-- 
1.8.3.1



Re: [Patch] [Aarch64] PR rtl-optimization/87763 - this patch fixes gcc.target/aarch64/lsl_asr_sbfiz.c

2019-04-17 Thread Richard Earnshaw (lists)
On 10/04/2019 23:03, Steve Ellcey wrote:
> 
> Here is another patch to fix one of the failures
> listed in PR rtl-optimization/87763. This change
> fixes gcc.target/aarch64/lsl_asr_sbfiz.c by adding
> an alternative version of *ashiftsi_extv_bfiz that
> has a subreg in it.
> 
> Tested with bootstrap and regression test run.
> 
> OK for checkin?
> 
> Steve Ellcey
> 
> 
> 2018-04-10  Steve Ellcey  
> 
>   PR rtl-optimization/87763
>   * config/aarch64/aarch64.md (*ashiftsi_extv_bfiz_alt):
>   New Instruction.
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index e0df975..04dc06f 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -5634,6 +5634,22 @@
>[(set_attr "type" "bfx")]
>  )
>  
> +(define_insn "*ashiftsi_extv_bfiz_alt"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> + (ashift:SI
> +   (subreg:SI
> + (sign_extract:DI
> +   (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
> +   (match_operand 2 "aarch64_simd_shift_imm_offset_si" "n")
> +   (const_int 0))
> + 0)
> +   (match_operand 3 "aarch64_simd_shift_imm_si" "n")))]
> +  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
> +  1, GET_MODE_BITSIZE (SImode) - 1)"
> +  "sbfiz\\t%w0, %w1, %3, %2"
> +  [(set_attr "type" "bfx")]
> +)
> +
>  ;; When the bit position and width of the equivalent extraction add up to 32
>  ;; we can use a W-reg LSL instruction taking advantage of the implicit
>  ;; zero-extension of the X-reg.
> 

I don't think this is right for big-endian, where the subreg offset is
not zero.  Perhaps you should look at using subreg_lowpart_operator.

Due to that, I think this also needs some test cases.
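
As a rough illustration of the big-endian concern, the sketch below models
where the low part of a wider value sits, counted in bytes.  The helper
lowpart_byte_offset is hypothetical and much simpler than what GCC does
internally (it ignores targets where word and byte endianness differ); it
only shows why a hard-coded offset of 0 matches the low part on
little-endian but not on big-endian.

```cpp
#include <cstddef>

// Simplified model: byte offset of the low OUTER_BYTES of a value that is
// INNER_BYTES wide.  Hypothetical helper, not a GCC function.
constexpr std::size_t lowpart_byte_offset(std::size_t outer_bytes,
                                          std::size_t inner_bytes,
                                          bool big_endian)
{
  return (big_endian && outer_bytes < inner_bytes)
         ? inner_bytes - outer_bytes
         : 0;
}

// A 4-byte (SImode-sized) low part of an 8-byte (DImode-sized) value.
static_assert(lowpart_byte_offset(4, 8, false) == 0, "little-endian");
static_assert(lowpart_byte_offset(4, 8, true) == 4, "big-endian");
```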

R.


Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).

2019-04-17 Thread Hongtao Liu
On Wed, Apr 17, 2019 at 4:48 PM Martin Liška  wrote:
>
> On 4/17/19 10:14 AM, Hongtao Liu wrote:
> > Any other comments, I'll merge this to trunk?
>
> Hi.
>
> I don't understand you. The patch in its original version will not be
> installed to trunk and I'll rework it to not support AVX512* (except
> AVX512F) in target_clone attribute.
>
> Martin

Sorry, I've sent the mail to the wrong address, please ignore it.

-- 
BR,
Hongtao


Re: Enable BF16 support (Please ignore my former email)

2019-04-17 Thread Hongtao Liu
On Fri, Apr 12, 2019 at 11:18 PM H.J. Lu  wrote:
>
> On Fri, Apr 12, 2019 at 3:19 AM Uros Bizjak  wrote:
> >
> > On Fri, Apr 12, 2019 at 11:03 AM Hongtao Liu  wrote:
> > >
> > > On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak  wrote:
> > > >
> > > > > On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao wrote:
> > > > >
> > > > > Hi :
> > > > > This patch is about to enable support for bfloat16 which will be 
> > > > > in Future Cooper Lake, Please refer to 
> > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
> > > > > for more details about BF16.
> > > > >
> > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, 
> > > > > VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural 
> > > > > Network Instructions supporting:
> > > > >
> > > > > -   VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed 
> > > > > BF16 Data.
> > > > > -   VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data.
> > > > > -   VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed 
> > > > > Single Precision.
> > > > >
> > > > > Since only BF16 intrinsics are supported, we treat it as HI for 
> > > > > simplicity.
> > > >
> > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using HImode
> > > > instead of HFmode. Is there a compelling reason not to introduce
> > > > corresponding bf16_format supporting infrastructure and declare these
> > > > intrinsics using half-binary (HBmode ?) mode instead?
> > > >
> > > > Uros.
> > >
> > > Bfloat16 isn't IEEE standard which we want to reserve HFmode for.
> >
> > True.
> >
> > > The IEEE 754 standard specifies a binary16 as having the following format:
> > > Sign bit: 1 bit
> > > Exponent width: 5 bits
> > > Significand precision: 11 bits (10 explicitly stored)
> > >
> > > Bfloat16 has the following format:
> > > Sign bit: 1 bit
> > > Exponent width: 8 bits
> > > Significand precision: 8 bits (7 explicitly stored), as opposed to 24
> > > bits in a classical single-precision floating-point format
> >
> > This is why I proposed to introduce HBmode (and corresponding
> > > bfloat16_format) to distinguish between ieee HFmode and BFmode.
> >
>
> Unless there is BF16 language level support,  HBmode has no advantage
> over HImode.   We can add HBmode when we gain BF16 language support.
>
> --
> H.J.
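
For readers unfamiliar with the layout quoted above: a bfloat16 value is
the upper 16 bits of an IEEE single, so conversion is a shift plus
rounding.  The following C++ sketch is only an illustration under stated
assumptions, not the patch's implementation and not a specification of the
AVX512BF16 instructions.  The function names are made up, float is assumed
to be IEEE 754 binary32, and round-to-nearest-even is chosen for the
narrowing direction; the exact treatment of denormals and NaNs by the
hardware should be taken from the Intel reference.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes IEEE 754 binary32 float");

// float -> bfloat16 bit pattern, rounding to nearest with ties to even.
std::uint16_t float_to_bf16(float f)
{
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  if ((bits & 0x7fffffffu) > 0x7f800000u)   // NaN: keep it a (quiet) NaN
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  bits += 0x7fffu + ((bits >> 16) & 1u);    // round to nearest even
  return static_cast<std::uint16_t>(bits >> 16);
}

// bfloat16 bit pattern -> float; exact, the low 16 bits become zero.
float bf16_to_float(std::uint16_t h)
{
  std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}
```

Because the exponent field is the same 8 bits as in single precision, the
widening direction never overflows or underflows; only the narrowing
direction loses significand bits.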

Any other comments, I'll merge this to trunk?

-- 
BR,
Hongtao


Re: [PATCH][RFC] Improve get_qualified_type linear list walk

2019-04-17 Thread Richard Biener
On Tue, 16 Apr 2019, Michael Matz wrote:

> Hi,
> 
> On Tue, 16 Apr 2019, Richard Biener wrote:
> 
> > Comments?
> 
> I was quickly testing also with some early-outs but didn't get conclusive 
> performance results (but really only superficial testing) so I'm not 
> proposing it, like so:
> 
> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> index 7045284..33f56f9 100644
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -1508,6 +1508,10 @@ same_type_ignoring_top_level_qualifiers_p (tree 
>if (type1 == error_mark_node || type2 == error_mark_node)
>  return false;
>  
> +  if (type1 == type2)
> +return true;

This one reduces the number of get_qualified_type calls
by about 10%.  Probably worth doing.

Another smallish improvement is using strip_top_quals which
does nothing for ARRAY_TYPE.

Btw, the new get_qualified_type shows (with the above patch applied)

   if (TYPE_QUALS (type) == type_quals)
 return type; // 0.3% hit

   tree mv = TYPE_MAIN_VARIANT (type);
   if (check_qualified_type (mv, type, type_quals))
 return mv; // 43.8% hit

for the C++ FE the LRU cache effectively moves the unqualified
variants first in the variant list.  Since we always first
build the unqualified variants before the qualified ones
the unqualified ones tend to be at the end of the list.  That's
clearly bad for the C++ pattern of repeatedly looking up the
unqualified type variant from a type.  Of course a direct
shortcut would be much cheaper here (but it obviously isn't
the main variant due to TYPE_NAME differences).
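
The LRU idea can be sketched outside of GCC as a move-to-front lookup on a
singly linked list.  The C++ below is an illustration only: Variant,
quals and find_and_promote are hypothetical names, the head of the list
stands in for the main variant and always stays first, and a hit is moved
to the slot right behind it.

```cpp
// Hypothetical stand-in for a type variant: qualifiers plus a next link.
struct Variant
{
  int quals;
  Variant *next;
};

// Look for a variant with the given qualifiers in the list headed by MAIN.
// On a hit further down the list, unlink the node and re-insert it right
// after MAIN so that the next lookup for it is cheap.
Variant *find_and_promote(Variant *main, int quals)
{
  if (main->quals == quals)
    return main;

  for (Variant **link = &main->next; *link; link = &(*link)->next)
    {
      Variant *v = *link;
      if (v->quals == quals)
        {
          if (link != &main->next)
            {
              *link = v->next;       // unlink from the old position
              v->next = main->next;  // splice in behind the head
              main->next = v;
            }
          return v;
        }
    }
  return nullptr;
}
```

The trade-off raised in the mail is visible here: a pure lookup function
now mutates the list order as a side effect.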

So do you think the change to get_qualified_type is OK?  Or
do we absolutely want to avoid changing the variant list from
a function like this?

Thanks,
Richard.


Re: Enable BF16 support (Please ignore my former email)

2019-04-17 Thread Uros Bizjak
On Wed, Apr 17, 2019 at 12:29 PM Hongtao Liu  wrote:
>
> On Fri, Apr 12, 2019 at 11:18 PM H.J. Lu  wrote:
> >
> > On Fri, Apr 12, 2019 at 3:19 AM Uros Bizjak  wrote:
> > >
> > > On Fri, Apr 12, 2019 at 11:03 AM Hongtao Liu  wrote:
> > > >
> > > > On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak  wrote:
> > > > >
> > > > > > On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao wrote:
> > > > > >
> > > > > > Hi :
> > > > > > This patch is about to enable support for bfloat16 which will 
> > > > > > be in Future Cooper Lake, Please refer to 
> > > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
> > > > > > for more details about BF16.
> > > > > >
> > > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, 
> > > > > > VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural 
> > > > > > Network Instructions supporting:
> > > > > >
> > > > > > -   VCVTNE2PS2BF16: Convert Two Packed Single Data to One 
> > > > > > Packed BF16 Data.
> > > > > > -   VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 
> > > > > > Data.
> > > > > > -   VDPBF16PS: Dot Product of BF16 Pairs Accumulated into 
> > > > > > Packed Single Precision.
> > > > > >
> > > > > > Since only BF16 intrinsics are supported, we treat it as HI for 
> > > > > > simplicity.
> > > > >
> > > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using HImode
> > > > > instead of HFmode. Is there a compelling reason not to introduce
> > > > > corresponding bf16_format supporting infrastructure and declare these
> > > > > intrinsics using half-binary (HBmode ?) mode instead?
> > > > >
> > > > > Uros.
> > > >
> > > > Bfloat16 isn't IEEE standard which we want to reserve HFmode for.
> > >
> > > True.
> > >
> > > > The IEEE 754 standard specifies a binary16 as having the following 
> > > > format:
> > > > Sign bit: 1 bit
> > > > Exponent width: 5 bits
> > > > Significand precision: 11 bits (10 explicitly stored)
> > > >
> > > > Bfloat16 has the following format:
> > > > Sign bit: 1 bit
> > > > Exponent width: 8 bits
> > > > Significand precision: 8 bits (7 explicitly stored), as opposed to 24
> > > > bits in a classical single-precision floating-point format
> > >
> > > This is why I proposed to introduce HBmode (and corresponding
> > > > bfloat16_format) to distinguish between ieee HFmode and BFmode.
> > >
> >
> > Unless there is BF16 language level support,  HBmode has no advantage
> > over HImode.   We can add HBmode when we gain BF16 language support.
> >
> > --
> > H.J.
>
> Any other comments, I'll merge this to trunk?

It is not a regression, so please no.

Uros.


Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Florian Weimer
* Richard Biener:

> On Wed, Apr 17, 2019 at 9:19 AM Thomas König  wrote:
>>
>> Hi,
>>
>> thanks a lot for the extensive discussion :-)
>>
>> How should we now proceed, first for gcc 9, and then for backporting?
>> Use Richard's patch with the corresponding Fortran FE change?
>
> Btw, for the testcase the fortran FE could also simply opt to not
> make def_init TREE_READONLY.  Or even better, for all-zero
> initialization omit the explicit initialization data and instead
> mark it specially in the vtable (just use a NULL initializer
> denoting zero-initialization?).  Even .bss costs (runtime) memory.

Not just that, .bss adds to the commit charge, while .rodata would not.
So it's not clear that using .bss for zero constants is always a win.

Thanks,
Florian


[PATCH] [ARC][COMMITTED] Fix diagnostic messages.

2019-04-17 Thread Claudiu Zissulescu
Apply upper/dot rule on diagnostic messages.

gcc/
-xx-xx  Claudiu Zissulescu  

* config/arc/arc.c (arc_init): Format diagnostic string.
(arc_override_options): Likewise.
(check_if_valid_regno_const): Likewise.
(arc_reorg): Likewise.
---
 gcc/ChangeLog|  7 +++
 gcc/config/arc/arc.c | 22 --
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9480e693c08..3820fae8ee7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2019-04-17  Claudiu Zissulescu  
+
+   * config/arc/arc.c (arc_init): Format diagnostic string.
+   (arc_override_options): Likewise.
+   (check_if_valid_regno_const): Likewise.
+   (arc_reorg): Likewise.
+
 2019-04-17  Segher Boessenkool  
 
PR target/17108
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 65eef30747a..1a04f9ef793 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -950,13 +950,13 @@ arc_init (void)
   /* FPX-4.  No FPX extensions mixed with FPU extensions.  */
   if ((TARGET_DPFP_FAST_SET || TARGET_DPFP_COMPACT_SET || TARGET_SPFP)
   && TARGET_HARD_FLOAT)
-error ("No FPX/FPU mixing allowed");
+error ("no FPX/FPU mixing allowed");
 
   /* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic.  */
   if (flag_pic && TARGET_ARC600_FAMILY)
 {
   warning (0,
-  "PIC is not supported for %s. Generating non-PIC code only..",
+  "PIC is not supported for %s.  Generating non-PIC code only",
   arc_cpu_string);
   flag_pic = 0;
 }
@@ -1222,26 +1222,26 @@ arc_override_options (void)
   do { \
 if ((!(arc_selected_cpu->arch_info->flags & CODE)) \
&& (VAR == VAL))\
-  error ("Option %s=%s is not available for %s CPU.",  \
+  error ("option %s=%s is not available for %s CPU",   \
 DOC0, DOC1, arc_selected_cpu->name);   \
 if ((arc_selected_cpu->arch_info->dflags & CODE)   \
&& (VAR != DEFAULT_##VAR)   \
&& (VAR != VAL))\
-  warning (0, "Option %s is ignored, the default value %s" \
-  " is considered for %s CPU.", DOC0, DOC1,\
+  warning (0, "option %s is ignored, the default value %s" \
+  " is considered for %s CPU", DOC0, DOC1, \
   arc_selected_cpu->name); \
  } while (0);
 #define ARC_OPT(NAME, CODE, MASK, DOC) \
   do { \
 if ((!(arc_selected_cpu->arch_info->flags & CODE)) \
&& (target_flags & MASK))   \
-  error ("Option %s is not available for %s CPU",  \
+  error ("option %s is not available for %s CPU",  \
 DOC, arc_selected_cpu->name);  \
 if ((arc_selected_cpu->arch_info->dflags & CODE)   \
&& (target_flags_explicit & MASK)   \
&& (!(target_flags & MASK)))\
-  warning (0, "Unset option %s is ignored, it is always"   \
-  " enabled for %s CPU.", DOC, \
+  warning (0, "unset option %s is ignored, it is always"   \
+  " enabled for %s CPU", DOC,  \
   arc_selected_cpu->name); \
   } while (0);
 
@@ -7268,7 +7268,8 @@ check_if_valid_regno_const (rtx *operands, int opno)
 case CONST_INT :
   return true;
 default:
-   error ("register number must be a compile-time constant. Try giving higher optimization levels");
+   error ("register number must be a compile-time constant.  "
+  "Try giving higher optimization levels");
break;
 }
   return false;
@@ -8261,7 +8262,8 @@ arc_reorg (void)
   cfun->machine->ccfsm_current_insn = NULL_RTX;
 
   if (!INSN_ADDRESSES_SET_P())
- fatal_error (input_location, "Insn addresses not set after shorten_branches");
+ fatal_error (input_location,
+  "insn addresses not set after shorten_branches");
 
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
{
-- 
2.20.1



Re: [PATCH PR90078]Capping comp_cost computation in ivopts

2019-04-17 Thread Bin.Cheng
On Wed, Apr 17, 2019 at 3:10 PM Jakub Jelinek  wrote:
>
> On Wed, Apr 17, 2019 at 02:13:12PM +0800, bin.cheng wrote:
> > Hi,
> > As discussed in PR90078, this patch checks possible infinite_cost
> > overflow in ivopts.  Also as discussed, overflow happens mostly because
> > of cost scaling w.r.t. bb_freq/loop_freq.  For the moment, we only
> > implement capping in comp_cost operators, while in next stage1, we may
> > instead implement capping in get_scaled_computation_cost_at with more
> > supporting benchmark data.
> >
> > BTW, I think switching costs around comparison between infinite_cost is
> > unnecessary since there will be no overflow in integer after capping
> > with infinite_cost.
> >
> > Bootstrap and test on x86_64, is it OK?
> >
> > Thanks,
> > bin
> >
> > 2019-04-17  Bin Cheng
> >
> > PR tree-optimization/90078
> > * tree-ssa-loop-ivopts.c (comp_cost::operator +,-,+=,-=,/=,*=): Add
> > checks for infinite_cost overflow.
> >
> > 2019-04-17  Bin Cheng
> >
> > PR tree-optimization/90078
> > * gcc/testsuite/g++.dg/tree-ssa/pr90078.C: New test.
>
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -243,6 +243,9 @@ operator+ (comp_cost cost1, comp_cost cost2)
>if (cost1.infinite_cost_p () || cost2.infinite_cost_p ())
>  return infinite_cost;
>
> +  if (cost1.cost + cost2.cost >= infinite_cost.cost)
> +return infinite_cost;
>
> As
> #define INFTY 1000
> what is the reason to keep the previous condition as well?
> I mean, if cost1.cost == INFTY or cost2.cost == INFTY,
> cost1.cost + cost2.cost >= INFTY too.
> Unless costs can go negative.
It's a bit complicated, but in general, costs can go negative.

>
> @@ -256,6 +259,8 @@ operator- (comp_cost cost1, comp_cost cost2)
>  return infinite_cost;
>
>gcc_assert (!cost2.infinite_cost_p ());
> +  if (cost1.cost - cost2.cost >= infinite_cost.cost)
> +return infinite_cost;
>
> Unless costs can be negative, when you first bail out
> for cost1.cost == INFTY, then cost1.cost - cost2.cost won't
> be INFTY (but could get negative).  So shouldn't there be a guard against
> that instead?  Or, if costs can be negative, shouldn't there be also
> guards that it doesn't grow too negative (say smaller than -INFTY)?
Negative cost is kind of a result of booking cost cancellation at
different places.  For example, it mostly comes from modeling auto
increment/decrement addressing modes.  To be specific, when an IV's
increment instruction can be merged into the addressing mode, we cancel
the cost of the IV increment operation in the cand-use cost.  Very likely
4 will be subtracted.  In general, we wouldn't expect negative costs to
grow too large, so there is no -INFTY logic in ivopts at all.  So this is
the least invasive fix for the moment; I would consider capping
bb_freq/loop_freq in the future, which should rule out the overflow
possibility in the first place.
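
A small stand-alone C++ sketch of the capping discussed here, to show why
both checks are kept once costs may be negative.  This is not the ivopts
code: Cost, kInfty and add_capped are hypothetical names, and the value of
kInfty is picked only for the example.

```cpp
#include <cstdint>

// Hypothetical cap standing in for INFTY; the value is illustrative.
constexpr std::int64_t kInfty = 1000000000;

struct Cost
{
  std::int64_t cost;
  bool infinite() const { return cost >= kInfty; }
};

// Saturating addition.  The first check is still needed when one operand
// is negative: kInfty + (-4) is below the cap, yet must stay infinite.
// The second check caps a sum of two finite costs that reaches the limit.
Cost add_capped(Cost a, Cost b)
{
  if (a.infinite() || b.infinite())
    return {kInfty};
  std::int64_t sum = a.cost + b.cost;
  return {sum >= kInfty ? kInfty : sum};
}
```

A small negative operand such as the -4 mentioned above passes through
unchanged when the other operand is finite, which matches the absence of
any -INFTY handling.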

Thanks,
bin
>
> Jakub


Re: Enable BF16 support (Please ignore my former email)

2019-04-17 Thread Uros Bizjak
On Wed, Apr 17, 2019 at 1:03 PM Uros Bizjak  wrote:
>
> On Wed, Apr 17, 2019 at 12:29 PM Hongtao Liu  wrote:
> >
> > On Fri, Apr 12, 2019 at 11:18 PM H.J. Lu  wrote:
> > >
> > > On Fri, Apr 12, 2019 at 3:19 AM Uros Bizjak  wrote:
> > > >
> > > > On Fri, Apr 12, 2019 at 11:03 AM Hongtao Liu  wrote:
> > > > >
> > > > > On Fri, Apr 12, 2019 at 3:30 PM Uros Bizjak  wrote:
> > > > > >
> > > > > > > On Fri, Apr 12, 2019 at 9:09 AM Liu, Hongtao wrote:
> > > > > > >
> > > > > > > Hi :
> > > > > > > This patch is about to enable support for bfloat16 which will 
> > > > > > > be in Future Cooper Lake, Please refer to 
> > > > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
> > > > > > > for more details about BF16.
> > > > > > >
> > > > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, 
> > > > > > > VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural 
> > > > > > > Network Instructions supporting:
> > > > > > >
> > > > > > > -   VCVTNE2PS2BF16: Convert Two Packed Single Data to One 
> > > > > > > Packed BF16 Data.
> > > > > > > -   VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 
> > > > > > > Data.
> > > > > > > -   VDPBF16PS: Dot Product of BF16 Pairs Accumulated into 
> > > > > > > Packed Single Precision.
> > > > > > >
> > > > > > > Since only BF16 intrinsics are supported, we treat it as HI for 
> > > > > > > simplicity.
> > > > > >
> > > > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using 
> > > > > > HImode
> > > > > > instead of HFmode. Is there a compelling reason not to introduce
> > > > > > corresponding bf16_format supporting infrastructure and declare 
> > > > > > these
> > > > > > intrinsics using half-binary (HBmode ?) mode instead?
> > > > > >
> > > > > > Uros.
> > > > >
> > > > > Bfloat16 isn't IEEE standard which we want to reserve HFmode for.
> > > >
> > > > True.
> > > >
> > > > > The IEEE 754 standard specifies a binary16 as having the following 
> > > > > format:
> > > > > Sign bit: 1 bit
> > > > > Exponent width: 5 bits
> > > > > Significand precision: 11 bits (10 explicitly stored)
> > > > >
> > > > > Bfloat16 has the following format:
> > > > > Sign bit: 1 bit
> > > > > Exponent width: 8 bits
> > > > > Significand precision: 8 bits (7 explicitly stored), as opposed to 24
> > > > > bits in a classical single-precision floating-point format
> > > >
> > > > This is why I proposed to introduce HBmode (and corresponding
> > > > bfloat16_format) to distinguish between ieee HFmode and BFmode.
> > > >
> > >
> > > Unless there is BF16 language level support,  HBmode has no advantage
> > > over HImode.   We can add HBmode when we gain BF16 language support.
> > >
> > > --
> > > H.J.
> >
> > Any other comments?  If not, I'll merge this to trunk.
>
> It is not a regression, so please no.

Ehm, "regression fix" ...

Uros.
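
To make the two 16-bit layouts quoted above concrete, here is a minimal sketch of turning a single-precision value into a bfloat16 bit pattern by keeping the upper 16 bits of the IEEE 754 binary32 encoding, with round-to-nearest-even.  The function names are hypothetical and this is not the GCC implementation nor an exact model of VCVTNEPS2BF16 (for instance, denormal handling is not modelled); it only illustrates that bfloat16 shares float's 8-bit exponent and keeps 7 stored significand bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: view a float as its 32-bit IEEE 754 encoding.
inline std::uint32_t float_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Hypothetical helper: float -> bfloat16 bit pattern (1 sign bit,
// 8 exponent bits, 7 stored significand bits), round-to-nearest-even.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t u = float_bits(f);
  if ((u & 0x7fffffffu) > 0x7f800000u)  // NaN: truncate and force a quiet NaN
    return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
  // Add 0x7fff, plus 1 more if the bit that survives as the LSB is set,
  // so that exact ties round to the even result.
  std::uint32_t rounding = 0x7fffu + ((u >> 16) & 1u);
  return static_cast<std::uint16_t>((u + rounding) >> 16);
}

// Hypothetical helper: widening back is exact, the low 16 bits are zero.
inline float bf16_bits_to_float(std::uint16_t h) {
  std::uint32_t u = static_cast<std::uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}
```

Because the exponent field is unchanged, the widening direction is a plain 16-bit shift, which is one reason the format can be carried around in a 16-bit integer container when only intrinsics are exposed.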


Re: [PATCH] [ARC][COMMITTED] Fix diagnostic messages.

2019-04-17 Thread Jakub Jelinek
On Wed, Apr 17, 2019 at 02:09:33PM +0300, Claudiu Zissulescu wrote:
>/* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic.  */
>if (flag_pic && TARGET_ARC600_FAMILY)
>  {
>warning (0,
> -"PIC is not supported for %s. Generating non-PIC code only..",
> +"PIC is not supported for %s.  Generating non-PIC code only",
>  arc_cpu_string);

I believe this is undesirable too.  Either use something like
"PIC is not supported for %s; generating non-PIC code only"
or split that into two messages
if (warning (0, "PIC is not supported for %s", arc_cpu_string))
  inform (input_location, "generating non-PIC code only");

> @@ -1222,26 +1222,26 @@ arc_override_options (void)
>do {   \
>  if ((!(arc_selected_cpu->arch_info->flags & CODE))   \
>   && (VAR == VAL))\
> -  error ("Option %s=%s is not available for %s CPU.",\
> +  error ("option %s=%s is not available for %s CPU", \
>DOC0, DOC1, arc_selected_cpu->name);   \

I think another complaint in the PR was that it is unclear what
those DOC0/DOC1/DOC strings stand for, if they are keywords on what
one writes on the command line or similar (then it should be quoted,
%qs or %<%s=%s%>), if it is something different, then maybe it is
not the right thing to construct a translatable sentence from that
error/warning gmsgid string and one or more words that are inserted
somewhere into the sentence.  At least for the ARC_OPT the latter seems to
be the case, given e.g.:
ARC_OPT (FL_LL64, (1ULL << 5), MASK_LL64,  "double load/store")
ARC_OPT (FL_BS,   (1ULL << 6), MASK_BARREL_SHIFTER,"barrel shifter")
Is barrel shifter a keyword, or just random words added into the sentence?
If the latter, then the translators might want to translate that too, but in
that case together with the surroundings too.
ARC_OPT (FL_SPFP, (1ULL << 12), MASK_SPFP_COMPACT_SET, "single precission FPX")
ARC_OPT (FL_DPFP, (1ULL << 13), MASK_DPFP_COMPACT_SET, "double precission FPX")
has spelling errors,
s/precission/precision/g

>  if ((arc_selected_cpu->arch_info->dflags & CODE) \
>   && (VAR != DEFAULT_##VAR)   \
>   && (VAR != VAL))\
> -  warning (0, "Option %s is ignored, the default value %s"   \
> -" is considered for %s CPU.", DOC0, DOC1,\
> +  warning (0, "option %s is ignored, the default value %s"   \
> +" is considered for %s CPU", DOC0, DOC1, \
>  arc_selected_cpu->name); \
>   } while (0);
>  #define ARC_OPT(NAME, CODE, MASK, DOC)   \
>do {   \
>  if ((!(arc_selected_cpu->arch_info->flags & CODE))   \
>   && (target_flags & MASK))   \
> -  error ("Option %s is not available for %s CPU",\
> +  error ("option %s is not available for %s CPU",\
>DOC, arc_selected_cpu->name);  \
>  if ((arc_selected_cpu->arch_info->dflags & CODE) \
>   && (target_flags_explicit & MASK)   \
>   && (!(target_flags & MASK)))\
> -  warning (0, "Unset option %s is ignored, it is always" \
> -" enabled for %s CPU.", DOC, \
> +  warning (0, "unset option %s is ignored, it is always" \
> +" enabled for %s CPU", DOC,  \
>  arc_selected_cpu->name); \
>} while (0);
>  
> @@ -7268,7 +7268,8 @@ check_if_valid_regno_const (rtx *operands, int opno)
>  case CONST_INT :
>return true;
>  default:
> - error ("register number must be a compile-time constant. Try giving higher optimization levels");
> + error ("register number must be a compile-time constant.  "
> +"Try giving higher optimization levels");

Similarly to the above case.

Jakub


Re: [PATCH PR90078]Capping comp_cost computation in ivopts

2019-04-17 Thread Jakub Jelinek
On Wed, Apr 17, 2019 at 07:14:05PM +0800, Bin.Cheng wrote:
> > As
> > #define INFTY 1000
> > what is the reason to keep the previous condition as well?
> > I mean, if cost1.cost == INFTY or cost2.cost == INFTY,
> > cost1.cost + cost2.cost >= INFTY too.
> > Unless costs can go negative.
> It's a bit complicated, but in general, costs can go negative.

Ok, no objections from me then (but as I don't know anything about it,
not an ack either; you are ivopts maintainer, so you don't need one).

Jakub
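
The point about negative costs can be illustrated with a small sketch of saturating cost addition.  This is not the ivopts code: add_costs and kInfty are hypothetical names, and the value 1000 is simply the one shown in the quoted "#define INFTY" line.  It shows why the explicit test on the operands is still needed once a cost may be negative.

```cpp
#include <cassert>

// Value as it appears in the quoted "#define INFTY" line; an assumption here.
constexpr long kInfty = 1000;

// Hypothetical helper: add two costs, treating kInfty as "infinite".
constexpr long add_costs(long a, long b) {
  // Test the operands first: with a negative cost on the other side,
  // kInfty + b would drop below kInfty and the infinity would be lost.
  if (a == kInfty || b == kInfty)
    return kInfty;
  long sum = a + b;
  // Cap sums of finite costs that reach or exceed the limit.
  return sum >= kInfty ? kInfty : sum;
}
```

With only non-negative costs the first test would be redundant, which is the observation made in the quoted question.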


Re: [PATCH] Fix up RTL DCE find_call_stack_args (PR rtl-optimization/89965)

2019-04-17 Thread Michael Matz
Hi,

On Tue, 16 Apr 2019, Jeff Law wrote:

> So going back to Jakub's patch...  I think the discussion points to 
> avoiding the REG_EQUIV notes for outgoing argument slots.

In the long run definitely, but maybe his current solution is more 
amenable to stage 4, no idea.


Ciao,
Michael.


Re: [PATCH][RFC] Improve get_qualified_type linear list walk

2019-04-17 Thread Michael Matz
Hi,

On Wed, 17 Apr 2019, Richard Biener wrote:

> for the C++ FE the LRU cache effectively moves the unqualified
> variants first in the variant list.  Since we always first
> build the unqualified variants before the qualified ones
> the unqualified ones tend to be at the end of the list.  That's
> clearly bad for the C++ pattern of repeatedly looking up the
> unqualified type variant from a type.  Of course a direct
> shortcut would be much cheaper here (but it obviously isn't
> the main variant due to TYPE_NAME differences).
> 
> So do you think the change to get_qualified_type is OK?  Or
> do we absolutely want to avoid changing the variant list from
> a function like this?

I think changing the variant list in this accessor should be okay.  For it 
not to be okay some callers would have to remember a particular subset of 
that list and also care about the order of that subset.  That would be 
fragile no matter what.

I had the additional idea to only move the non-qualified variant to the 
front, i.e. not really LRU.  By that we would slowly establish the 
invariant that unqualified variants are early in the list; or 
alternatively add a combination of build_variant_type_copy+set_type_quals 
which would establish that invariant directly.  But unlike a real LRU 
cache it's harder to see if this brings similar benefits as the scheme is 
then lopsided towards the specific case of looking up unqualified 
variants.


Ciao,
Michael.
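
As a rough illustration of the list manipulation under discussion, the sketch below moves a found entry directly behind the head of a singly linked list.  Node and find_and_move_to_front are hypothetical stand-ins, not GCC's tree or TYPE_NEXT_VARIANT machinery; the head plays the role of the main variant and never moves.

```cpp
#include <cassert>

// Hypothetical stand-in for one type variant in a variant list.
struct Node {
  int quals;   // 0 stands for "unqualified"
  Node* next;
};

// Return the first node with the requested qualifiers.  If it is found
// further down the list, relink it directly behind the head so that the
// next lookup for the same qualifiers terminates early.
inline Node* find_and_move_to_front(Node* head, int quals) {
  if (head->quals == quals)
    return head;
  for (Node* prev = head; prev->next; prev = prev->next) {
    Node* cur = prev->next;
    if (cur->quals != quals)
      continue;
    if (prev != head) {
      prev->next = cur->next;  // unlink from the current position
      cur->next = head->next;  // relink right behind the head
      head->next = cur;
    }
    return cur;
  }
  return nullptr;              // no such variant in the list
}
```

Calling the helper for every lookup gives the LRU-like behaviour; calling it only when quals is 0 would correspond to the narrower idea of moving just the unqualified variant forward.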


Re: [PATCH][RFC] Improve get_qualified_type linear list walk

2019-04-17 Thread Michael Matz
Hi,

On Tue, 16 Apr 2019, Jakub Jelinek wrote:

> > +  if (type1 == type2)
> > +return true;
> > +  if (TYPE_MAIN_VARIANT (type1) != TYPE_MAIN_VARIANT (type2))
> > +return false;
> 
> Is this second one correct though?  Doesn't comptypes return for various
> cases true even if the TYPE_MAIN_VARIANT is different?

Right, that was a thinko.  As I said, I rushed this somewhat :)


Ciao,
Michael.


[PATCH] S/390: Fix PR89952 incorrect CFI

2019-04-17 Thread Andreas Krebbel
This patch fixes a case where inconsistent CFI is generated.

After restoring the hard frame pointer (r11) from an FPR we have to
set the CFA register.  In order to be able to set it back to the stack
pointer (r15) we have to make sure that r15 has been restored already.

The patch also adds a scheduler dependency to prevent the instruction
scheduler from swapping the r11 and r15 restore again.

gcc/ChangeLog:

2019-04-17  Andreas Krebbel  

PR target/89952
* config/s390/s390.c (s390_restore_gprs_from_fprs): Restore GPRs
from FPRs in reverse order.  Generate REG_CFA_DEF_CFA note also
for restored hard frame pointer.
(s390_sched_dependencies_evaluation): Implement new target hook.
(TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK): New macro definition.

gcc/testsuite/ChangeLog:

2019-04-17  Andreas Krebbel  

PR target/89952
* gcc.target/s390/pr89952.c: New test.
---
 gcc/config/s390/s390.c  | 62 +++--
 gcc/testsuite/gcc.target/s390/pr89952.c | 12 +++
 2 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr89952.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index ad8eacd..fc4571d 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10685,7 +10685,11 @@ s390_restore_gprs_from_fprs (void)
   if (!TARGET_Z10 || !TARGET_HARD_FLOAT || !crtl->is_leaf)
 return;
 
-  for (i = 6; i < 16; i++)
+  /* Restore the GPRs starting with the stack pointer.  That way the
+ stack pointer already has its original value when it comes to
+ restoring the hard frame pointer.  So we can set the cfa reg back
+ to the stack pointer.  */
+  for (i = STACK_POINTER_REGNUM; i >= 6; i--)
 {
   rtx_insn *insn;
 
@@ -10701,7 +10705,13 @@ s390_restore_gprs_from_fprs (void)
 
   df_set_regs_ever_live (i, true);
   add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, i));
-  if (i == STACK_POINTER_REGNUM)
+
+  /* If either the stack pointer or the frame pointer get restored
+set the CFA value to its value at function start.  Doing this
+for the frame pointer results in .cfi_def_cfa_register 15
+what is ok since if the stack pointer got modified it has
+been restored already.  */
+  if (i == STACK_POINTER_REGNUM || i == HARD_FRAME_POINTER_REGNUM)
add_reg_note (insn, REG_CFA_DEF_CFA,
  plus_constant (Pmode, stack_pointer_rtx,
 STACK_POINTER_OFFSET));
@@ -16294,6 +16304,49 @@ s390_case_values_threshold (void)
   return default_case_values_threshold ();
 }
 
+/* Evaluate the insns between HEAD and TAIL and do back-end to install
+   back-end specific dependencies.
+
+   Establish an ANTI dependency between r11 and r15 restores from FPRs
+   to prevent the instructions scheduler from reordering them since
+   this would break CFI.  No further handling in the sched_reorder
+   hook is required since the r11 and r15 restore will never appear in
+   the same ready list with that change.  */
+void
+s390_sched_dependencies_evaluation (rtx_insn *head, rtx_insn *tail)
+{
+  if (!frame_pointer_needed || !epilogue_completed)
+return;
+
+  while (head != tail && DEBUG_INSN_P (head))
+head = NEXT_INSN (head);
+
+  rtx_insn *r15_restore = NULL, *r11_restore = NULL;
+
+  for (rtx_insn *insn = tail; insn != head; insn = PREV_INSN (insn))
+{
+  rtx set = single_set (insn);
+  if (!INSN_P (insn)
+ || !RTX_FRAME_RELATED_P (insn)
+ || set == NULL_RTX
+ || !REG_P (SET_DEST (set))
+ || !FP_REG_P (SET_SRC (set)))
+   continue;
+
+  if (REGNO (SET_DEST (set)) == HARD_FRAME_POINTER_REGNUM)
+   r11_restore = insn;
+
+  if (REGNO (SET_DEST (set)) == STACK_POINTER_REGNUM)
+   r15_restore = insn;
+}
+
+  if (r11_restore == NULL || r15_restore == NULL)
+return;
+  add_dependence (r11_restore, r15_restore, REG_DEP_ANTI);
+}
+
+
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -16585,6 +16638,11 @@ s390_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD s390_case_values_threshold
 
+#undef TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK
+#define TARGET_SCHED_DEPENDENCIES_EVALUATION_HOOK \
+  s390_sched_dependencies_evaluation
+
+
 /* Use only short displacement, since long displacement is not available for
the floating point instructions.  */
 #undef TARGET_MAX_ANCHOR_OFFSET
diff --git a/gcc/testsuite/gcc.target/s390/pr89952.c b/gcc/testsuite/gcc.target/s390/pr89952.c
new file mode 100644
index 000..9f48e08
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr89952.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=zEC12 -fno-omit-frame-pointer -Os" } */
+
+
+extern void j(int);
+
+void
+d(int e, long f, int g, int h, int i) {
+  if (h == 5 && i >= 4 && i <= 7)
+h =

Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Andreas Schwab
On Apr 17 2019, Florian Weimer  wrote:

> Not just that, .bss adds to the commit charge,

Only one page at most.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Florian Weimer
* Andreas Schwab:

> On Apr 17 2019, Florian Weimer  wrote:
>
>> Not just that, .bss adds to the commit charge,
>
> Only one page at most.

That would be a bug.  All of it is anonymous memory which needs backing
from RAM or swap, in case the process writes to it.

Thanks,
Florian


collect2 patch to https in URL

2019-04-17 Thread Jonny Grant

Hello

Change the "collect2 -help" output to have https URL:

Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html

2019-04-14  Jonny Grant  
* collect2.c: Change gcc.gnu.org URL to HTTPS


Thank you
Jonny
Index: gcc/collect2.c
===
--- gcc/collect2.c	(revision 270408)
+++ gcc/collect2.c	(working copy)
@@ -1640,7 +1640,7 @@
   printf ("  --help  Display this information\n");
   printf ("  -v, --version   Display this program's version number\n");
   printf ("\n");
-  printf ("Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+  printf ("Overview: https://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
   printf ("Report bugs: %s\n", bug_report_url);
   printf ("\n");
 }


Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Andreas Schwab
On Apr 17 2019, Florian Weimer  wrote:

> * Andreas Schwab:
>
>> On Apr 17 2019, Florian Weimer  wrote:
>>
>>> Not just that, .bss adds to the commit charge,
>>
>> Only one page at most.
>
> That would be a bug.

You cannot avoid it for the page shared with .data, unless you force
.bss to be page aligned.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-17 Thread Florian Weimer
* Andreas Schwab:

> On Apr 17 2019, Florian Weimer  wrote:
>
>> * Andreas Schwab:
>>
>>> On Apr 17 2019, Florian Weimer  wrote:
>>>
>>>> Not just that, .bss adds to the commit charge,
>>>
>>> Only one page at most.
>>
>> That would be a bug.
>
> You cannot avoid it for the page shared with .data, unless you force
> .bss to be page aligned.

Would you please elaborate?

With “commit charge”, I mean address space accounted towards the commit
limit, when Linux is running in vm.overcommit_memory=2 mode.

Thanks,
Florian


Add constexpr to std::optional::value_or(U&&)&&

2019-04-17 Thread Jonathan Wakely

In C++1z drafts up to N4606 the constexpr keyword was missing from the
detailed description of this function, despite being shown in the class
synopsis.  That was fixed editorially for N4618, but our implementation
was not corrected to match.

* include/std/optional (optional::value_or(U&&) &&): Add missing
constexpr specifier.
* testsuite/20_util/optional/constexpr/observers/4.cc: Check value_or
for disengaged optionals and rvalue optionals.
* testsuite/20_util/optional/observers/4.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.

I will backport this to gcc-8-branch too.

commit ce471593e4ce944807efad1d0fa7ed5d0a53da1e
Author: Jonathan Wakely 
Date:   Wed Apr 17 13:57:14 2019 +0100

Add constexpr to std::optional::value_or(U&&)&&

In C++1z drafts up to N4606 the constexpr keyword was missing from the
detailed description of this function, despite being shown in the class
synopsis.  That was fixed editorially for N4618, but our implementation
was not corrected to match.

* include/std/optional (optional::value_or(U&&) &&): Add missing
constexpr specifier.
* testsuite/20_util/optional/constexpr/observers/4.cc: Check value_or
for disengaged optionals and rvalue optionals.
* testsuite/20_util/optional/observers/4.cc: Likewise.

diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index d243930fed4..503d859bee6 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -959,7 +959,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
   template<typename _Up>
-   _Tp
+   constexpr _Tp
value_or(_Up&& __u) &&
{
  static_assert(is_move_constructible_v<_Tp>);
diff --git a/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc b/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc
index 1f7f0e8b6a2..a085f53f8fa 100644
--- a/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc
+++ b/libstdc++-v3/testsuite/20_util/optional/constexpr/observers/4.cc
@@ -25,10 +25,42 @@ struct value_type
   int i;
 };
 
-int main()
+void test01()
 {
   constexpr std::optional<value_type> o { value_type { 51 } };
   constexpr value_type fallback { 3 };
-  static_assert( o.value_or(fallback).i == 51, "" );
-  static_assert( o.value_or(fallback).i == (*o).i, "" );
+  static_assert( o.value_or(fallback).i == 51 );
+  static_assert( o.value_or(fallback).i == (*o).i );
+}
+
+void test02()
+{
+  constexpr std::optional<value_type> o;
+  constexpr value_type fallback { 3 };
+  static_assert( o.value_or(fallback).i == 3 );
+}
+
+template<typename T>
+  constexpr std::optional<value_type>
+  make_rvalue(T t)
+  { return std::optional<value_type>{t}; }
+
+void test03()
+{
+  constexpr value_type fallback { 3 };
+  static_assert( make_rvalue(value_type{51}).value_or(fallback).i == 51 );
+}
+
+void test04()
+{
+  constexpr value_type fallback { 3 };
+  static_assert( make_rvalue(std::nullopt).value_or(fallback).i == 3 );
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
+  test04();
 }
diff --git a/libstdc++-v3/testsuite/20_util/optional/observers/4.cc b/libstdc++-v3/testsuite/20_util/optional/observers/4.cc
index c24e4e6856e..5d608cdeaf7 100644
--- a/libstdc++-v3/testsuite/20_util/optional/observers/4.cc
+++ b/libstdc++-v3/testsuite/20_util/optional/observers/4.cc
@@ -26,10 +26,42 @@ struct value_type
   int i;
 };
 
-int main()
+void test01()
 {
   std::optional<value_type> o { value_type { 51 } };
   value_type fallback { 3 };
   VERIFY( o.value_or(fallback).i == 51 );
   VERIFY( o.value_or(fallback).i == (*o).i );
 }
+
+void test02()
+{
+  std::optional<value_type> o;
+  value_type fallback { 3 };
+  VERIFY( o.value_or(fallback).i == 3 );
+}
+
+void test03()
+{
+  std::optional<value_type> o { value_type { 51 } };
+  value_type fallback { 3 };
+  VERIFY( std::move(o).value_or(fallback).i == 51 );
+  VERIFY( o.has_value() );
+  VERIFY( std::move(o).value_or(fallback).i == (*o).i );
+}
+
+void test04()
+{
+  std::optional<value_type> o;
+  value_type fallback { 3 };
+  VERIFY( std::move(o).value_or(fallback).i == 3 );
+  VERIFY( !o.has_value() );
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
+  test04();
+}
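
The effect of the missing specifier can be seen in a short, self-contained sketch (C++17 assumed).  value_type mirrors the struct in the tests; this make_rvalue is a simplified hypothetical helper rather than the one in the testsuite.  The static_asserts only compile when the rvalue-qualified value_or overload is constexpr, i.e. with a standard library that contains this fix.

```cpp
#include <cassert>
#include <optional>

struct value_type { int i; };

// Hypothetical helper: returns a prvalue optional, so a call to value_or
// on the result selects the rvalue-qualified overload.
constexpr std::optional<value_type> make_rvalue(int i) {
  return std::optional<value_type>{value_type{i}};
}

constexpr value_type fallback{3};

// Engaged rvalue optional: the contained value wins.
static_assert(make_rvalue(51).value_or(fallback).i == 51);

// Disengaged rvalue optional: the fallback is returned.
static_assert(std::optional<value_type>{}.value_or(fallback).i == 3);
```

Calling value_or on a constexpr variable, as the original test did, only exercises the const lvalue overload, which is why the gap was not noticed earlier.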


Re: [Patch] [Aarch64] PR rtl-optimization/87763 - this patch fixes gcc.target/aarch64/lsl_asr_sbfiz.c

2019-04-17 Thread Jeff Law
On 4/17/19 4:19 AM, Richard Earnshaw (lists) wrote:
> On 10/04/2019 23:03, Steve Ellcey wrote:
>>
>> Here is another patch to fix one of the failures
>> listed in PR rtl-optimization/87763. This change
>> fixes gcc.target/aarch64/lsl_asr_sbfiz.c by adding
>> an alternative version of *ashiftsi_extv_bfiz that
>> has a subreg in it.
>>
>> Tested with bootstrap and regression test run.
>>
>> OK for checkin?
>>
>> Steve Ellcey
>>
>>
>> 2018-04-10  Steve Ellcey  
>>
>>  PR rtl-optimization/87763
>>  * config/aarch64/aarch64.md (*ashiftsi_extv_bfiz_alt):
>>  New Instruction.
>>
>>
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index e0df975..04dc06f 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -5634,6 +5634,22 @@
>>[(set_attr "type" "bfx")]
>>  )
>>  
>> +(define_insn "*ashiftsi_extv_bfiz_alt"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +(ashift:SI
>> +  (subreg:SI
>> +(sign_extract:DI
>> +  (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
>> +  (match_operand 2 "aarch64_simd_shift_imm_offset_si" "n")
>> +  (const_int 0))
>> +0)
>> +  (match_operand 3 "aarch64_simd_shift_imm_si" "n")))]
>> +  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
>> + 1, GET_MODE_BITSIZE (SImode) - 1)"
>> +  "sbfiz\\t%w0, %w1, %3, %2"
>> +  [(set_attr "type" "bfx")]
>> +)
>> +
>>  ;; When the bit position and width of the equivalent extraction add up to 32
>>  ;; we can use a W-reg LSL instruction taking advantage of the implicit
>>  ;; zero-extension of the X-reg.
>>
> 
> I don't think this is right for big-endian, where the subreg offset is
> not zero.  Perhaps you should look at using subreg_lowpart_operator.
As general guidance, anytime I see a subreg in the .md files I suspect
we've likely gone in the wrong direction at some point.

That doesn't mean we can't use subregs, nor does it mean it's wrong in
this instance, but it certainly makes me look at the changes more
carefully to see if we can do something earlier or later so that we're
not matching subreg expressions in the md files.

I agree that in this specific case, it's likely incorrect.
subreg_lowpart_* would likely help, either as a predicate or as an operator.

jeff
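
For readers less familiar with the instruction, the value the quoted pattern is meant to compute can be modelled in plain C++.  This is a sketch based on my reading of the pattern (sign-extract the low "width" bits, then shift left by "lsb"), with a hypothetical function name; it models only the arithmetic and says nothing about how the subreg offsets should be written for big-endian, which is the actual review concern.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of "sbfiz Wd, Wn, #lsb, #width" on 32-bit registers.
// Precondition (assumed, not checked): width >= 1 and lsb + width <= 32.
inline std::uint32_t sbfiz32(std::uint32_t src, unsigned lsb, unsigned width) {
  std::uint32_t mask = width == 32 ? 0xffffffffu : (1u << width) - 1u;
  std::uint32_t field = src & mask;                // low `width` bits of the source
  std::uint32_t sign = 1u << (width - 1);          // sign bit of that field
  std::uint32_t extended = (field ^ sign) - sign;  // sign-extend, modulo 2^32
  return extended << lsb;                          // place the field at `lsb`, zeros below
}
```

For example, an 8-bit field holding 0xFF (that is, -1) placed at bit 4 yields 0xFFFFFFF0, while 0x7F yields 0x7F0.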


Re: [PATCH][C++] Improve compile-time by ordering expensive checks last

2019-04-17 Thread Richard Biener
On Tue, 16 Apr 2019, Richard Biener wrote:

> 
> Two cases from a -fsyntax-only tramp3d callgrind profile.

Amended by two others.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2019-04-17  Richard Biener  

cp/
* call.c (null_ptr_cst_p): Order checks according to expensiveness.
(conversion_null_warnings): Likewise.
* typeck.c (same_type_ignoring_top_level_qualifiers_p): Return
early if type1 == type2.

Index: gcc/cp/call.c
===
--- gcc/cp/call.c   (revision 270407)
+++ gcc/cp/call.c   (working copy)
@@ -541,11 +541,11 @@ null_ptr_cst_p (tree t)
   STRIP_ANY_LOCATION_WRAPPER (t);
 
   /* Core issue 903 says only literal 0 is a null pointer constant.  */
-  if (TREE_CODE (type) == INTEGER_TYPE
- && !char_type_p (type)
- && TREE_CODE (t) == INTEGER_CST
+  if (TREE_CODE (t) == INTEGER_CST
+ && !TREE_OVERFLOW (t)
+ && TREE_CODE (type) == INTEGER_TYPE
  && integer_zerop (t)
- && !TREE_OVERFLOW (t))
+ && !char_type_p (type))
return true;
 }
   else if (CP_INTEGRAL_TYPE_P (type))
@@ -6844,8 +6844,9 @@ static void
 conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
 {
   /* Issue warnings about peculiar, but valid, uses of NULL.  */
-  if (null_node_p (expr) && TREE_CODE (totype) != BOOLEAN_TYPE
-  && ARITHMETIC_TYPE_P (totype))
+  if (TREE_CODE (totype) != BOOLEAN_TYPE
+  && ARITHMETIC_TYPE_P (totype)
+  && null_node_p (expr))
 {
       location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
   if (fn)
@@ -6882,8 +6883,8 @@ conversion_null_warnings (tree totype, t
 }
   /* Handle zero as null pointer warnings for cases other
  than EQ_EXPR and NE_EXPR */
-  else if (null_ptr_cst_p (expr) &&
-  (TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype)))
+  else if ((TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype))
+  && null_ptr_cst_p (expr))
 {
       location_t loc = get_location_for_expr_unwinding_for_system_header (expr);
   maybe_warn_zero_as_null_pointer_constant (expr, loc);
Index: gcc/cp/typeck.c
===
--- gcc/cp/typeck.c (revision 270407)
+++ gcc/cp/typeck.c (working copy)
@@ -1508,6 +1508,8 @@ same_type_ignoring_top_level_qualifiers_
 {
   if (type1 == error_mark_node || type2 == error_mark_node)
 return false;
+  if (type1 == type2)
+return true;
 
   type1 = cp_build_qualified_type (type1, TYPE_UNQUALIFIED);
   type2 = cp_build_qualified_type (type2, TYPE_UNQUALIFIED);
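
The reordering relies on nothing more than short-circuit evaluation of &&.  A minimal sketch with hypothetical stand-in predicates (not the GCC functions) shows that the result is unchanged while the expensive predicate runs less often when the cheap one is tested first.

```cpp
#include <cassert>

// Hypothetical counter to observe how often the expensive check runs.
struct Probe { int expensive_calls = 0; };

// Cheap stand-in predicate: a single comparison.
inline bool cheap_check(int x) { return x % 2 == 0; }

// Expensive stand-in predicate: does some work and counts its invocations.
inline bool expensive_check(Probe& p, int x) {
  ++p.expensive_calls;
  long sum = 0;
  for (int i = 0; i < 1000; ++i)
    sum += (x ^ i);
  return sum >= 0;
}

// Same condition, two evaluation orders.
inline bool expensive_first(Probe& p, int x) {
  return expensive_check(p, x) && cheap_check(x);
}
inline bool cheap_first(Probe& p, int x) {
  return cheap_check(x) && expensive_check(p, x);
}
```

The transformation is only safe when the predicates have no side effects that the result depends on, which holds for the pure tests being reordered in the patch.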


Re: [PATCH] Fixup IRA debug dump output

2019-04-17 Thread Peter Bergner
On 4/16/19 12:47 PM, Peter Bergner wrote:
> The patch below fixes the issue by not continuing if the allocno's conflict
> array is null and instead guarding the current conflict prints by that
> test.  If the conflict array is null, we instead now print out simple
> empty conflict info.  This now gives us what we'd expect to see:
> 
> ;; a5(r116,l0) conflicts:
> ;; total conflict hard regs:
> ;; conflict hard regs:
> 
> 
>   cp0:a0(r111)<->a4(r117)@330:move


Actually, if we keep the continue, it makes the patch smaller and more
readable.  How about this instead which gives the same output as the
previous patch?

Peter

* ira-conflicts.c (print_allocno_conflicts): Always print something,
even for allocnos with no conflicts.
(print_conflicts): Print an extra newline.

Index: gcc/ira-conflicts.c
===
--- gcc/ira-conflicts.c (revision 270331)
+++ gcc/ira-conflicts.c (working copy)
@@ -633,7 +631,12 @@ print_allocno_conflicts (FILE * file, bo
   ira_object_conflict_iterator oci;
 
   if (OBJECT_CONFLICT_ARRAY (obj) == NULL)
-   continue;
+   {
+ fprintf (file, "\n;; total conflict hard regs:\n");
+ fprintf (file, ";; conflict hard regs:\n\n");
+ continue;
+   }
+
   if (n > 1)
fprintf (file, "\n;;   subobject %d:", i);
   FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
@@ -683,6 +686,7 @@ print_conflicts (FILE *file, bool reg_p)
 
   FOR_EACH_ALLOCNO (a, ai)
 print_allocno_conflicts (file, reg_p, a);
+  putc ('\n', file);
 }
 
 /* Print information about allocno or only regno (if REG_P) conflicts



[PATCH] Fix up dg-extract-results.sh

2019-04-17 Thread Jakub Jelinek
Hi!

On Tue, Apr 16, 2019 at 09:26:46PM +0200, Christophe Lyon wrote:
> > Actually, I managed to reproduce in a Fedora 31 chroot, in which I don't
> > have /usr/bin/python installed (I think in Fedora 30+ there is
> > /usr/bin/python2 and /usr/bin/python3 but not /usr/bin/python, at least not
> > in the default buildroot).

So, I've grabbed 11 *.log and 11 *.sum files from testsuite/gcc*/, injected
a couple of
-PASS: gcc.c-torture/execute/20001009-2.c   -O1  execution test
+WARNING: program timed out.
+FAIL: gcc.c-torture/execute/20001009-2.c   -O1  execution test
changes into them (both to *.log and *.sum) and tested, each time
with dg-extract-results.sh -L *.log > logN and dg-extract-results.sh *.sum > sumN
The tested versions were:
1) gcc-8
2) current trunk
3) current trunk with this patch
(all of them with /usr/bin/python available, so effectively using the python
version) and
4) gcc-8
5) current trunk
6) current trunk with this patch
contrib/dg-extract-results.sh copied to /tmp/dg-extract-results.sh so that
it doesn't find the python version and thus uses awk.  I found a couple of
issues in the patch I've sent earlier, so this contains fixes.
973bf331c5f223a08a4724289635fe43  log1
973bf331c5f223a08a4724289635fe43  log2
973bf331c5f223a08a4724289635fe43  log3
b7fde321188f9d60265c2801fb8e81e9  log4
26e7dc514ab063b99d2929759826814b  log5
b7fde321188f9d60265c2801fb8e81e9  log6
d6a24e581653e284d9db118cca48f72c  sum1
ca25461808ea1f9b061409fe096f286f  sum2
ca25461808ea1f9b061409fe096f286f  sum3
33e194e093632290a5d5bd16cb15ca10  sum4
f82f4a60a095655d7359700b7bf688e1  sum5
f82f4a60a095655d7359700b7bf688e1  sum6

Thus, there is no change in -L log mode generation between any of those 3 python
versions (note, the patch just changes a comment typo in the *.py, so 2 and
3 are expected to be identical with -L), and apart from the broken trunk
handling of -L, gcc 8 as well as trunk with this patch generate the same
logfile too (though, not identical to python).
As for the sum mode, gcc 8 generated with both python and sh/awk different
results from trunk and trunk with patch, with the WARNING: program timed out.
lines sorted together at one spot, while trunk and trunk with patch emit
identical result (though, again, python generates different ordering from
sh/awk).  So, I believe with this patch the results are exactly as I expect
them, the *.sum WARNING: thing is improved as Christophe wanted, while
*.log files which are broken on current trunk totally when not using python
are fixed again.

The incremental fixes from previous patch are using correct operator for
$0 matching and also, as we use timeout_cnt value of 0 with the meaning that
no timeout message needs to be handled, but in theory
WARNING: program timed out. could appear also with cnt == 0, I've made
that var contain otherwise cnt + 1 and subtract 1 again when printing it.
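
The "store cnt + 1 so that 0 can mean nothing pending" trick from the paragraph above is language independent; here is a minimal C++ sketch of it.  PendingTimeout and its members are hypothetical names chosen for illustration, not part of dg-extract-results.

```cpp
#include <cassert>

// Hypothetical holder for a pending counter value.  The counter itself may
// legitimately be 0, so it is stored biased by one and 0 is kept free to
// mean "nothing pending".
struct PendingTimeout {
  unsigned stored = 0;  // 0 = nothing pending, otherwise counter + 1

  void set(unsigned cnt) { stored = cnt + 1; }
  bool pending() const { return stored != 0; }

  // Return the original counter and clear the pending state.
  unsigned take() {
    unsigned cnt = stored - 1;
    stored = 0;
    return cnt;
  }
};
```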

Ok for trunk?

2019-04-17  Jakub Jelinek  

* dg-extract-results.sh: Only handle WARNING: program timed out
lines specially in "$MODE" == "sum".  Restore previous behavior
for "$MODE" != "sum".  Clear has_timeout and timeout_cnt if in
a different variant or curfile is empty.
* dg-extract-results.py: Fix a typo.

--- contrib/dg-extract-results.sh.jj	2019-03-05 21:49:34.471573434 +0100
+++ contrib/dg-extract-results.sh   2019-04-17 17:35:53.718285283 +0200
@@ -331,13 +331,15 @@ BEGIN {
   # Ugly hack for gfortran.dg/dg.exp
   if ("$TOOL" == "gfortran" && testname ~ /^gfortran.dg\/g77\//)
 testname="h"testname
-  if (\$1 == "WARNING:" && \$2 == "program" && \$3 == "timed" && (\$4 == "out" || \$4 == "out.")) {
-has_timeout=1
-timeout_cnt=cnt
-  } else {
-  # Prepare timeout replacement message in case it's needed
-timeout_msg=\$0
-sub(\$1, "WARNING:", timeout_msg)
+  if ("$MODE" == "sum") {
+if (\$0 ~ /^WARNING: program timed out/) {
+  has_timeout=1
+  timeout_cnt=cnt+1
+} else {
+  # Prepare timeout replacement message in case it's needed
+  timeout_msg=\$0
+  sub(\$1, "WARNING:", timeout_msg)
+}
   }
 }
 /^$/ { if ("$MODE" == "sum") next }
@@ -345,25 +347,30 @@ BEGIN {
 if ("$MODE" == "sum") {
   # Do not print anything if the current line is a timeout
   if (has_timeout == 0) {
-# If the previous line was a timeout,
-# insert the full current message without keyword
-if (timeout_cnt != 0) {
-  printf "%s %08d|%s program timed out.\n", testname, timeout_cnt, timeout_msg >> curfile
-  timeout_cnt = 0
-  cnt = cnt + 1
-}
-printf "%s %08d|", testname, cnt >> curfile
-cnt = cnt + 1
-filewritten[curfile]=1
-need_close=1
-if (timeout_cnt == 0)
-  print >> curfile
+   # If the previous line was a timeout,
+   # insert the full current message without keyword
+   if (timeout_cnt != 0) {
+ printf "%s %08d|%s program timed out.\n", testname, timeout_cnt-1,

[PATCH 1/3] Fix tests for std::variant to match original intention

2019-04-17 Thread Jonathan Wakely

* testsuite/20_util/variant/compile.cc (MoveCtorOnly): Fix type to
actually match its name.
(MoveCtorAndSwapOnly): Define new type that adds swap to MoveCtorOnly.
(test_swap()): Fix result for MoveCtorOnly and check
MoveCtorAndSwapOnly.

Tested powerpc64le-linux.

commit 855e2fb029adf77f6189f01b1a8d86dc2cca2464
Author: Jonathan Wakely 
Date:   Wed Apr 17 14:55:39 2019 +0100

Fix tests for std::variant to match original intention

* testsuite/20_util/variant/compile.cc (MoveCtorOnly): Fix type to
actually match its name.
(MoveCtorAndSwapOnly): Define new type that adds swap to MoveCtorOnly.
(test_swap()): Fix result for MoveCtorOnly and check
MoveCtorAndSwapOnly.

diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc
index 04fef0be13f..5a2d91709a0 100644
--- a/libstdc++-v3/testsuite/20_util/variant/compile.cc
+++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc
@@ -54,12 +54,15 @@ struct DefaultNoexcept
 struct MoveCtorOnly
 {
   MoveCtorOnly() noexcept = delete;
-  MoveCtorOnly(const DefaultNoexcept&) noexcept = delete;
-  MoveCtorOnly(DefaultNoexcept&&) noexcept { }
-  MoveCtorOnly& operator=(const DefaultNoexcept&) noexcept = delete;
-  MoveCtorOnly& operator=(DefaultNoexcept&&) noexcept = delete;
+  MoveCtorOnly(const MoveCtorOnly&) noexcept = delete;
+  MoveCtorOnly(MoveCtorOnly&&) noexcept { }
+  MoveCtorOnly& operator=(const MoveCtorOnly&) noexcept = delete;
+  MoveCtorOnly& operator=(MoveCtorOnly&&) noexcept = delete;
 };
 
+struct MoveCtorAndSwapOnly : MoveCtorOnly { };
+void swap(MoveCtorAndSwapOnly&, MoveCtorAndSwapOnly&) { }
+
 struct nonliteral
 {
   nonliteral() { }
@@ -259,7 +262,8 @@ static_assert( !std::is_swappable_v> );
 void test_swap()
 {
   static_assert(is_swappable_v>, "");
-  static_assert(is_swappable_v>, "");
+  static_assert(!is_swappable_v>, "");
+  static_assert(is_swappable_v>, "");
   static_assert(!is_swappable_v>, "");
 }
 


[PATCH 2/3] Remove unnecessary string literals from static_assert in C++17 tests

2019-04-17 Thread Jonathan Wakely

Remove unnecessary string literals from static_assert in C++17 tests
   
The string literal is optional in C++17 and all these are empty so add
no value.
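
As a small sketch of the two spellings (the asserted trait and the helper name `message_is_optional` are arbitrary choices for illustration, not lines from compile.cc):

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

// C++11 and C++14 require a message operand, even if it is empty.
static_assert(std::is_default_constructible_v<std::variant<int>>, "");

// Since C++17 the message may be omitted entirely.
static_assert(std::is_default_constructible_v<std::variant<int>>);

// Hypothetical helper: the same single-operand form inside a function body.
constexpr bool message_is_optional()
{
  static_assert(sizeof(int) >= 2);
  return true;
}
```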


Tested powerpc64le-linux.

commit 028676a32fa51c0116e3c117a36550dd04cd39fe
Author: Jonathan Wakely 
Date:   Wed Apr 17 14:57:41 2019 +0100

Remove unnecessary string literals from static_assert in C++17 tests

The string literal is optional in C++17 and all these are empty so add
no value.

* testsuite/20_util/variant/compile.cc: Remove empty string literals
from static_assert declarations.

diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc
index 5a2d91709a0..b67c98adf4a 100644
--- a/libstdc++-v3/testsuite/20_util/variant/compile.cc
+++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc
@@ -77,59 +77,59 @@ struct nonliteral
 
 void default_ctor()
 {
-  static_assert(is_default_constructible_v>, "");
-  static_assert(is_default_constructible_v>, "");
-  static_assert(!is_default_constructible_v>, "");
-  static_assert(is_default_constructible_v>, "");
+  static_assert(is_default_constructible_v>);
+  static_assert(is_default_constructible_v>);
+  static_assert(!is_default_constructible_v>);
+  static_assert(is_default_constructible_v>);
 
-  static_assert(noexcept(variant()), "");
-  static_assert(!noexcept(variant()), "");
-  static_assert(noexcept(variant()), "");
+  static_assert(noexcept(variant()));
+  static_assert(!noexcept(variant()));
+  static_assert(noexcept(variant()));
 }
 
 void copy_ctor()
 {
-  static_assert(is_copy_constructible_v>, "");
-  static_assert(!is_copy_constructible_v>, "");
-  static_assert(is_trivially_copy_constructible_v>, "");
-  static_assert(!is_trivially_copy_constructible_v>, "");
+  static_assert(is_copy_constructible_v>);
+  static_assert(!is_copy_constructible_v>);
+  static_assert(is_trivially_copy_constructible_v>);
+  static_assert(!is_trivially_copy_constructible_v>);
 
   {
 variant a;
-static_assert(noexcept(variant(a)), "");
+static_assert(noexcept(variant(a)));
   }
   {
 variant a;
-static_assert(!noexcept(variant(a)), "");
+static_assert(!noexcept(variant(a)));
   }
   {
 variant a;
-static_assert(!noexcept(variant(a)), "");
+static_assert(!noexcept(variant(a)));
   }
   {
 variant a;
-static_assert(noexcept(variant(a)), "");
+static_assert(noexcept(variant(a)));
   }
 }
 
 void move_ctor()
 {
-  static_assert(is_move_constructible_v>, "");
-  static_assert(!is_move_constructible_v>, "");
-  static_assert(is_trivially_move_constructible_v>, "");
-  static_assert(!is_trivially_move_constructible_v>, "");
-  static_assert(!noexcept(variant(declval>())), "");
-  static_assert(noexcept(variant(declval>())), "");
+  static_assert(is_move_constructible_v>);
+  static_assert(!is_move_constructible_v>);
+  static_assert(is_trivially_move_constructible_v>);
+  static_assert(!is_trivially_move_constructible_v>);
+  static_assert(!noexcept(variant(declval>())));
+  static_assert(noexcept(variant(declval>())));
 }
 
 void arbitrary_ctor()
 {
-  static_assert(!is_constructible_v, const char*>, "");
-  static_assert(is_constructible_v, const char*>, "");
-  static_assert(noexcept(variant(int{})), "");
-  static_assert(noexcept(variant(int{})), "");
-  static_assert(!noexcept(variant(Empty{})), "");
-  static_assert(noexcept(variant(DefaultNoexcept{})), "");
+  static_assert(!is_constructible_v, const char*>);
+  static_assert(is_constructible_v, const char*>);
+  static_assert(noexcept(variant(int{})));
+  static_assert(noexcept(variant(int{})));
+  static_assert(!noexcept(variant(Empty{})));
+  static_assert(noexcept(variant(DefaultNoexcept{})));
 }
 
 void in_place_index_ctor()
@@ -142,105 +142,105 @@ void in_place_type_ctor()
 {
   variant a(in_place_type, "a");
   variant b(in_place_type, {'a'});
-  static_assert(!is_constructible_v, in_place_type_t, const char*>, "");
+  static_assert(!is_constructible_v, in_place_type_t, const char*>);
 }
 
 void dtor()
 {
-  static_assert(is_destructible_v>, "");
-  static_assert(is_destructible_v>, "");
+  static_assert(is_destructible_v>);
+  static_assert(is_destructible_v>);
 }
 
 void copy_assign()
 {
-  static_assert(is_copy_assignable_v>, "");
-  static_assert(!is_copy_assignable_v>, "");
-  static_assert(is_trivially_copy_assignable_v>, "");
-  static_assert(!is_trivially_copy_assignable_v>, "");
+  static_assert(is_copy_assignable_v>);
+  static_assert(!is_copy_assignable_v>);
+  static_assert(is_trivially_copy_assignable_v>);
+  static_assert(!is_trivially_copy_assignable_v>);
   {
 variant a;
-static_assert(!noexcept(a = a), "");
+static_assert(!noexcept(a = a));
   }
   {
 variant a;
-static_assert(noexcept(a = a), "");
+static_assert(noexcept(a = a));
   }
 }
 
 void move_assign()
 {
-  static_assert(is_move_assignable_v>, "");
-  static_assert(!is_move_assignable_v>, "");
-  static_as

[PATCH 3/3] Fix condition for std::variant to be copy constructible

2019-04-17 Thread Jonathan Wakely

The standard says the std::variant copy constructor is defined as
deleted unless all alternative types are copy constructible, but we were
making it also depend on move constructible. Fix the condition and
enhance the tests to check the semantics with pathological copy-only
types (i.e. supporting copying but having deleted moves).
   
The enhanced tests revealed a regression in copy assignment for
non-trivial alternative types, where the assignment would not be
performed because the condition in the _Copy_assign_base visitor is
false: is_same_v, remove_reference_t>.
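
A rough sketch of the semantics being checked follows. `DeletedMoves` has the same shape as the type added to the tests; `CountedCopyOnly`, its `value` member and the two helper functions are hypothetical names introduced only for this illustration, and the variant instantiations are assumptions.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <variant>

// Copy-only type: copying is supported, both move operations are deleted.
struct DeletedMoves
{
  DeletedMoves() = default;
  DeletedMoves(const DeletedMoves&) = default;
  DeletedMoves(DeletedMoves&&) = delete;
  DeletedMoves& operator=(const DeletedMoves&) = default;
  DeletedMoves& operator=(DeletedMoves&&) = delete;
};

// Copying the variant must not depend on the alternatives being movable.
static_assert(std::is_copy_constructible_v<std::variant<int, DeletedMoves>>);
static_assert(std::is_copy_assignable_v<std::variant<int, DeletedMoves>>);
// "Moving" such a variant is still possible; it selects the copy constructor.
static_assert(std::is_move_constructible_v<std::variant<int, DeletedMoves>>);

// Hypothetical non-trivial copy-only type that counts its copies.
struct CountedCopyOnly
{
  static inline int copies = 0;
  int value = 0;

  CountedCopyOnly() = default;
  CountedCopyOnly(const CountedCopyOnly& other) : value(other.value) { ++copies; }
  CountedCopyOnly(CountedCopyOnly&&) = delete;
  CountedCopyOnly& operator=(const CountedCopyOnly& other)
  { value = other.value; ++copies; return *this; }
  CountedCopyOnly& operator=(CountedCopyOnly&&) = delete;
};

// Constructing from an rvalue variant should fall back to copying.
int copies_made_by_move_construction()
{
  CountedCopyOnly::copies = 0;
  std::variant<CountedCopyOnly> source;
  std::variant<CountedCopyOnly> target(std::move(source));
  (void) target;
  return CountedCopyOnly::copies;
}

// Copy assignment between variants holding the same alternative must
// actually transfer the value.
bool copy_assignment_is_performed()
{
  std::variant<CountedCopyOnly> lhs, rhs;
  std::get<0>(rhs).value = 42;
  lhs = rhs;
  return std::get<0>(lhs).value == 42;
}
```

The second helper corresponds to the regression described above: if the visitor skipped the assignment, `lhs` would keep its old value.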


Tested powerpc64le-linux.

I plan to commit all three of these patches later today, unless
somebody sees a problem with them.


commit a5a517df4933ffd0e6a08c42280c7d2ee0699904
Author: Jonathan Wakely 
Date:   Wed Apr 17 16:17:25 2019 +0100

Fix condition for std::variant to be copy constructible

The standard says the std::variant copy constructor is defined as
deleted unless all alternative types are copy constructible, but we were
making it also depend on move constructible. Fix the condition and
enhance the tests to check the semantics with pathological copy-only
types (i.e. supporting copying but having deleted moves).

The enhanced tests revealed a regression in copy assignment for
non-trivial alternative types, where the assignment would not be
performed because the condition in the _Copy_assign_base visitor is
false: is_same_v, remove_reference_t>.

* include/std/variant (__detail::__variant::_Traits::_S_copy_assign):
Do not depend on whether all alternative types are move constructible.
(__detail::__variant::_Copy_assign_base::operator=): Remove cv-quals
from the operand when deciding whether to perform the assignment.
* testsuite/20_util/variant/compile.cc (DeletedMoves): Define type
with deleted move constructor and deleted move assignment operator.
(default_ctor, copy_ctor, move_ctor, copy_assign, move_assign): Check
behaviour of variants with DeletedMoves as an alternative.
* testsuite/20_util/variant/run.cc (DeletedMoves): Define same type.
(move_ctor, move_assign): Check that moving a variant with a
DeletedMoves alternative falls back to copying instead of moving.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 22b0c3d5c22..e153363bbf3 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -279,7 +279,7 @@ namespace __variant
   static constexpr bool _S_move_ctor =
 	  (is_move_constructible_v<_Types> && ...);
   static constexpr bool _S_copy_assign =
-	  _S_copy_ctor && _S_move_ctor
+	  _S_copy_ctor
 	  && (is_copy_assignable_v<_Types> && ...);
   static constexpr bool _S_move_assign =
 	  _S_move_ctor
@@ -613,7 +613,7 @@ namespace __variant
 			  __variant::__get<__rhs_index>(*this);
 			if constexpr (is_same_v<
   remove_reference_t,
-  remove_reference_t>)
+  __remove_cvref_t>)
 			  __this_mem = __rhs_mem;
 		  }
 		  }
diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc
index b67c98adf4a..5cc2a9460a9 100644
--- a/libstdc++-v3/testsuite/20_util/variant/compile.cc
+++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc
@@ -63,6 +63,15 @@ struct MoveCtorOnly
 struct MoveCtorAndSwapOnly : MoveCtorOnly { };
 void swap(MoveCtorAndSwapOnly&, MoveCtorAndSwapOnly&) { }
 
+struct DeletedMoves
+{
+  DeletedMoves() = default;
+  DeletedMoves(const DeletedMoves&) = default;
+  DeletedMoves(DeletedMoves&&) = delete;
+  DeletedMoves& operator=(const DeletedMoves&) = default;
+  DeletedMoves& operator=(DeletedMoves&&) = delete;
+};
+
 struct nonliteral
 {
   nonliteral() { }
@@ -81,6 +90,7 @@ void default_ctor()
   static_assert(is_default_constructible_v>);
   static_assert(!is_default_constructible_v>);
   static_assert(is_default_constructible_v>);
+  static_assert(is_default_constructible_v>);
 
   static_assert(noexcept(variant()));
   static_assert(!noexcept(variant()));
@@ -93,6 +103,7 @@ void copy_ctor()
   static_assert(!is_copy_constructible_v>);
   static_assert(is_trivially_copy_constructible_v>);
   static_assert(!is_trivially_copy_constructible_v>);
+  static_assert(is_trivially_copy_constructible_v>);
 
   {
 variant a;
@@ -116,6 +127,7 @@ void move_ctor()
 {
   static_assert(is_move_constructible_v>);
   static_assert(!is_move_constructible_v>);
+  static_assert(is_move_constructible_v>); // uses copy ctor
   static_assert(is_trivially_move_constructible_v>);
   static_assert(!is_trivially_move_constructible_v>);
   static_assert(!noexcept(variant(declval>())));
@@ -157,6 +169,7 @@ void copy_assign()
   static_assert(!is_copy_assignable_v>);
   static_assert(is_trivially_copy_assignable_v>);
   static_assert(!is_trivially_copy_assignable_v>);
+  static_assert(is_t

Re: [PATCH 1/3] Fix tests for std::variant to match original intention

2019-04-17 Thread Ville Voutilainen
On Wed, 17 Apr 2019 at 19:07, Jonathan Wakely  wrote:
>
> * testsuite/20_util/variant/compile.cc (MoveCtorOnly): Fix type to
> actually match its name.
> (MoveCtorAndSwapOnly): Define new type that adds swap to MoveCtorOnly.
> (test_swap()): Fix result for MoveCtorOnly and check
> MoveCtorAndSwapOnly.
>
> Tested powerpc64le-linux.

Looks good to me.


Re: [PATCH 2/3] Remove unnecessary string literals from static_assert in C++17 tests

2019-04-17 Thread Ville Voutilainen
On Wed, 17 Apr 2019 at 19:09, Jonathan Wakely  wrote:
>
> Remove unnecessary string literals from static_assert in C++17 tests
>
> The string literal is optional in C++17 and all these are empty so add
> no value.
>
>
> Tested powerpc64le-linux.

Looks good to me.


Re: [PATCH 3/3] Fix condition for std::variant to be copy constructible

2019-04-17 Thread Ville Voutilainen
On Wed, 17 Apr 2019 at 19:12, Jonathan Wakely  wrote:
>
> The standard says the std::variant copy constructor is defined as
> deleted unless all alternative types are copy constructible, but we were
> making it also depend on move constructible. Fix the condition and
> enhance the tests to check the semantics with pathological copy-only
> types (i.e. supporting copying but having deleted moves).
>
> The enhanced tests revealed a regression in copy assignment for
> non-trivial alternative types, where the assignment would not be
> performed because the condition in the _Copy_assign_base visitor is
> false: is_same_v, remove_reference_t>.
>
>
> Tested powerpc64le-linux.
>
> I plan to commit all three of these patches later today, unless
> somebody sees a problem with them.

Looks good to me.


Re: [PATCH] Fix up dg-extract-results.sh

2019-04-17 Thread Mike Stump
On Apr 17, 2019, at 8:59 AM, Jakub Jelinek  wrote:
> Ok for trunk?

Ok.


Re: [PATCH] backport r257541, r259936, r260294, r260623, r261098, r261333, r268585.

2019-04-17 Thread Segher Boessenkool
Hi!

On Wed, Apr 17, 2019 at 03:05:06PM +0800, Xiong Hu Luo wrote:
> On 2019/4/16 PM6:54, Segher Boessenkool wrote:
> > ("be" and "le" are essentially PowerPC-specific selectors on the 7 branch,
> > otherwise you'd need a release manager's approval as well).
> 
> Do you mean move the "be" and "le" code from
> gcc/testsuite/lib/target-supports.exp to
> gcc/testsuite/gcc.target/powerpc/powerpc.exp here?

I mean it is okay as you posted it, and I can approve it even though it
is in generic code.

:-)


Segher


C++ PATCH for c++/90124 - bogus error with incomplete type in decltype

2019-04-17 Thread Marek Polacek
This fixes a recent P1.  Here we were giving the "invalid use of incomplete
type" error, but "the operand of the decltype specifier is an unevaluated
operand" and so the objects it names are not required to have a definition.
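
For illustration, a usable variant of the situation (the names `Base`, `Derived` and `value` are made up for this sketch; the actual reduced testcase is in the patch below):

```cpp
#include <cassert>
#include <type_traits>

struct Base
{
  int b = 0;
};

struct Derived : Base
{
  // Derived is still incomplete at this point, but `b` is only named inside
  // decltype, i.e. in an unevaluated operand, so no complete type is needed.
  auto value() -> decltype(b) { return b; }
};

// The trailing return type resolves to the declared type of Base::b.
static_assert(std::is_same_v<decltype(Derived().value()), int>);
```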

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-04-17  Marek Polacek  

PR c++/90124 - bogus error with incomplete type in decltype.
* typeck.c (build_class_member_access_expr): Check
cp_unevaluated_operand.

* g++.dg/cpp0x/decltype70.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 03b14024738..7224d9bf9ed 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -2477,7 +2477,8 @@ build_class_member_access_expr (cp_expr object, tree member,
  /* We didn't complain above about a currently open class, but now we
 must: we don't know how to refer to a base member before layout is
 complete.  But still don't complain in a template.  */
- if (!dependent_type_p (object_type)
+ if (!cp_unevaluated_operand
+ && !dependent_type_p (object_type)
  && !complete_type_or_maybe_complain (object_type, object,
   complain))
return error_mark_node;
diff --git gcc/testsuite/g++.dg/cpp0x/decltype70.C gcc/testsuite/g++.dg/cpp0x/decltype70.C
new file mode 100644
index 000..b26aca90651
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/decltype70.C
@@ -0,0 +1,10 @@
+// PR c++/90124
+// { dg-do compile { target c++11 } }
+
+class a {
+public:
+  int b;
+};
+class c : a {
+  auto m_fn1() -> decltype(b);
+};


Re: C++ PATCH for c++/90124 - bogus error with incomplete type in decltype

2019-04-17 Thread Jason Merrill
On Wed, Apr 17, 2019 at 10:45 AM Marek Polacek  wrote:
>
> This fixes a recent P1.  Here we were giving the "invalid use of incomplete
> type" error, but "the operand of the decltype specifier is an unevaluated
> operand" and so the objects it names are not required to have a definition.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2019-04-17  Marek Polacek  
>
> PR c++/90124 - bogus error with incomplete type in decltype.
> * typeck.c (build_class_member_access_expr): Check
> cp_unevaluated_operand.

OK, thanks.

Jason


Re: [PATCH] Fixup IRA debug dump output

2019-04-17 Thread Jeff Law
On 4/17/19 9:35 AM, Peter Bergner wrote:
> On 4/16/19 12:47 PM, Peter Bergner wrote:
>> The patch below fixes the issue not continuing if the allocno's conflict
>> array is null and instead guarding the current conflict prints by that
>> test.  If the conflict array is null, we instead now print out simple
>> empty conflict info.  This now gives us what we'd expect to see:
>>
>> ;; a5(r116,l0) conflicts:
>> ;; total conflict hard regs:
>> ;; conflict hard regs:
>>
>>
>>   cp0:a0(r111)<->a4(r117)@330:move
> 
> 
> Actually, if we keep the continue, it makes the patch smaller and more
> readable.  How about this instead which gives the same output as the
> previous patch?
> 
> Peter
> 
>   * ira-conflicts.c (print_allocno_conflicts): Always print something,
>   even for allocno's with no conflicts.
>   (print_conflicts): Print an extra newline.
OK.  And while it's technically not a regression fix, I think this can
safely go in now :-)

jeff


Re: [PATCH] auto-inc-dec: Set alignment properly

2019-04-17 Thread Jeff Law
On 4/17/19 4:13 AM, Segher Boessenkool wrote:
> When auto-inc-dec creates a new mem to compute the cost of doing some
> transform, it forgets to copy over the alignment of the original mem.
> This gives wrong costs, for example, for rs6000 a floating point load
> or store is hugely expensive if unaligned.  This patch fixes it.
> 
> This doesn't fix any test case I'm aware of, but it is a very simple
> patch.  Is it okay for trunk?
> 
> 
> Segher
> 
> 
> 2019-04-17  Segher Boessenkool  
> 
>   * auto-inc-dec.c (attempt_change): Set the alignment of the
>   temporary memory to that of the original.
Given this is only changing the RTL passed into the costing
calculations, I think it can go in now.

OK for the trunk.

jeff


Re: collect2 patch to https in URL

2019-04-17 Thread Jeff Law
On 4/17/19 6:45 AM, Jonny Grant wrote:
> Hello
> 
> Change the "collect2 -help" output to have https URL:
> 
> Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html
> 
> 2019-04-14  Jonny Grant  
>     * collect2.c: Change gcc.gnu.org URL to HTTPS
> 
> 
> Thank you
> Jonny
Thanks.  I've installed this on the trunk.

jeff


[PATCH] Fix up _mm_maskz_f{,n}m{add,sub}_round_s{s,d} at -O0 (PR target/90125)

2019-04-17 Thread Jakub Jelinek
Hi!

The following patch fixes a bunch of pastos in the -O0 macros in the
PR89784 implementation plus testcase coverage that FAILs without the header
change and succeeds with that (the tests were previously run at -O2 only
where they test the inline functions and not the macros).
Because at -O0 the C x * y + z isn't contracted into FMA, there is a small
precision difference in two of the tests with the chosen constants, so I've
changed them to ones where a precision difference isn't really possible.
I think the constants weren't chosen very well, because either we just want
some basic testing, for which even the adjusted ones are ok, or we want
to specifically check for FMA, in that case we should check some FMA
cornercases where without FMA the result is completely different from one
with FMA.
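
To make the precision point concrete, here is a small sketch, independent of the intrinsics and of the testsuite, assuming IEEE double arithmetic with round-to-nearest. The helper names are hypothetical; the `volatile` temporary is only there to keep the compiler from contracting the expression into an FMA.

```cpp
#include <cassert>
#include <cmath>

// Separate multiply and add: the product is rounded to double first.
double multiply_then_add(double x, double y, double z)
{
  volatile double product = x * y;  // forces the intermediate rounding
  return product + z;
}

// Fused multiply-add: x * y + z is rounded only once, at the end.
double fused_multiply_add(double x, double y, double z)
{
  return std::fma(x, y, z);
}

// A corner case where the two differ: with x = 1 + 2^-27 the exact square is
// 1 + 2^-26 + 2^-54, and the last term is lost when the product is rounded.
inline double corner_x() { return 1.0 + std::ldexp(1.0, -27); }
inline double corner_z() { return -(1.0 + std::ldexp(1.0, -26)); }
```

With `corner_x()` and `corner_z()` the unfused result is exactly 0 while the fused one is 2^-54; with small integer operands such as 2, 3 and 4 both forms give 10, which is the kind of constant where no precision difference can show up.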

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

And sorry for screwing it up.

2019-04-17  Hongtao Liu  

PR target/90125
* config/i386/avx512fintrin.h (_mm_maskz_fmadd_round_sd,
_mm_maskz_fmadd_round_ss, _mm_maskz_fmsub_round_sd,
_mm_maskz_fmsub_round_ss, _mm_maskz_fnmadd_round_sd,
_mm_maskz_fnmadd_round_ss, _mm_maskz_fnmsub_round_sd,
_mm_maskz_fnmsub_round_ss): Use _maskz builtin instead of _mask3.

2019-04-17  Jakub Jelinek  

PR target/90125
* gcc.target/i386/avx512f-vfmsubXXXss-2.c (avx512f_test): Adjust
constants to ensure precise result even when not using fma.
* gcc.target/i386/avx512f-vfnmaddXXXss-2.c (avx512f_test): Likewise.
* gcc.target/i386/avx512f-vfmaddXXXsd-3.c: New test.
* gcc.target/i386/avx512f-vfmaddXXXss-3.c: New test.
* gcc.target/i386/avx512f-vfmsubXXXsd-3.c: New test.
* gcc.target/i386/avx512f-vfmsubXXXss-3.c: New test.
* gcc.target/i386/avx512f-vfnmaddXXXsd-3.c: New test.
* gcc.target/i386/avx512f-vfnmaddXXXss-3.c: New test.
* gcc.target/i386/avx512f-vfnmsubXXXsd-3.c: New test.
* gcc.target/i386/avx512f-vfnmsubXXXss-3.c: New test.

--- gcc/config/i386/avx512fintrin.h.jj  2019-03-22 11:07:00.699948784 +0100
+++ gcc/config/i386/avx512fintrin.h 2019-04-17 11:24:53.683695473 +0200
@@ -12104,10 +12104,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
 (__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R)
 
 #define _mm_maskz_fmadd_round_sd(U, A, B, C, R)\
-(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, C, U, R)
+(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, C, U, R)
 
 #define _mm_maskz_fmadd_round_ss(U, A, B, C, R)\
-(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R)
+(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, C, U, R)
 
 #define _mm_mask_fmsub_round_sd(A, U, B, C, R)\
 (__m128d) __builtin_ia32_vfmaddsd3_mask (A, B, -(C), U, R)
@@ -12122,10 +12122,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
 (__m128) __builtin_ia32_vfmsubss3_mask3 (A, B, C, U, R)
 
 #define _mm_maskz_fmsub_round_sd(U, A, B, C, R)\
-(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, -(C), U, R)
+(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, -(C), U, R)
 
 #define _mm_maskz_fmsub_round_ss(U, A, B, C, R)\
-(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, -(C), U, R)
+(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, -(C), U, R)
 
 #define _mm_mask_fnmadd_round_sd(A, U, B, C, R)\
 (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), C, U, R)
@@ -12140,10 +12140,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
 (__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R)
 
 #define _mm_maskz_fnmadd_round_sd(U, A, B, C, R)\
-(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), C, U, R)
+(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), C, U, R)
 
 #define _mm_maskz_fnmadd_round_ss(U, A, B, C, R)\
-(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R)
+(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), C, U, R)
 
 #define _mm_mask_fnmsub_round_sd(A, U, B, C, R)\
 (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), -(C), U, R)
@@ -12158,10 +12158,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
 (__m128) __builtin_ia32_vfmsubss3_mask3 (A, -(B), C, U, R)
 
 #define _mm_maskz_fnmsub_round_sd(U, A, B, C, R)\
-(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), -(C), U, R)
+(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), -(C), U, R)
 
 #define _mm_maskz_fnmsub_round_ss(U, A, B, C, R)\
-(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), -(C), U, R)
+(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), -(C), U, R)
 #endif
 
 #ifdef __OPTIMIZE__
--- gcc/testsuite/gcc.target/i386/avx512f-vfmsubXXXss-2.c.jj  2019-03-22 11:07:00.701948752 +0100
+++ gcc/testsuite/gcc.target/i386/avx512f-vfmsubXXXss-2.c   2019-04-17 11:35:57.314481901 +0200
@@ -41,8 +41,8 @@ avx512f_test (void)
   for (i = 0; i < SIZE; i++)
 {
   src1.a[i] = DEFAULT_VALUE;
-  

Re: [PATCH][RFC] Improve get_qualified_type linear list walk

2019-04-17 Thread Jeff Law
On 4/16/19 6:55 AM, Richard Biener wrote:
> 
> The following makes the C++ FEs heavy use of build_qualified_type
> cheaper.  When looking at a tramp3d -fsyntax-only compile you can
> see that for 470.000 build_qualified_type calls we end up
> with 9.492.205 calls to check_qualified_type (thus we visit around
> 20 variant type candidates) ending up finding it in all but
> 15.300 cases that end up in build_variant_type_copy.
> 
> That's of course because the FE uses this machinery to do things like
> 
> bool
> same_type_ignoring_top_level_qualifiers_p (tree type1, tree type2)
> {
>   if (type1 == error_mark_node || type2 == error_mark_node)
> return false;
> 
>   type1 = cp_build_qualified_type (type1, TYPE_UNQUALIFIED);
>   type2 = cp_build_qualified_type (type2, TYPE_UNQUALIFIED);
>   return same_type_p (type1, type2);
> 
> but so it be.  The improvement is to re-organize get_qualified_type
> to put found type variants on the head of the variant list.  This
> improves the number of calls to check_qualified_type to 1.215.030
> thus around 2.5 candidates.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> Comments?  OK?
> 
> Richard.
> 
> 2019-04-16  Richard Biener  
> 
>   * tree.c (get_qualified_type): Put found type variants at the
>   head of the variant list.
Seems quite reasonable to me.   I just hope we don't find a case where
this is the exact worst case behavior ;-)
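
For readers unfamiliar with the heuristic, a generic move-to-front sketch on a singly linked list follows. It is not the tree.c code: `find_move_to_front` and the `visited` counter are hypothetical, and `std::forward_list` merely stands in for the variant chain.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>

// Search `items` for `wanted`, counting the nodes examined in `visited`.
// On a hit the node is relinked at the head of the list, so a repeated
// lookup of the same element finds it immediately.
template <typename T>
bool find_move_to_front(std::forward_list<T>& items, const T& wanted,
                        std::size_t& visited)
{
  auto prev = items.before_begin();
  for (auto it = items.begin(); it != items.end(); prev = it, ++it)
    {
      ++visited;
      if (*it == wanted)
        {
          // Moves the node after `prev` to the front; nothing is copied.
          // This is a no-op when the element is already first.
          items.splice_after(items.before_begin(), items, prev);
          return true;
        }
    }
  return false;
}
```

The worst case mentioned above would be an access pattern that cycles through all list members in order, where every lookup still walks the whole list.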

jeff


Re: [PATCH v2] Fix __patchable_function_entries section flags

2019-04-17 Thread Jeff Law
On 4/15/19 10:31 AM, Joao Moreira wrote:
> 
> 
> On 4/12/19 1:19 PM, Jeff Law wrote:
>> On 4/11/19 11:18 AM, Joao Moreira wrote:
>>> When -fpatchable-relocation-entry is used, gcc places nops on the
>>> prologue of each compiled function and creates a section named
>>> __patchable_function_entries which holds relocation entries for the
>>> positions in which the nops were placed. As is, gcc creates this
>>> section without the proper section flags, causing crashes in the
>>> compiled program during its load.
>>>
>>> Given the above, fix the problem by creating the section with the
>>> SECTION_WRITE and SECTION_RELRO flags.
>>>
>>> The problem was noticed while compiling glibc with
>>> -fpatchable-function-entry compiler flag. After applying the patch,
>>> this issue was solved.
>>>
>>> This was also tested on x86-64 arch without visible problems under
>>> the gcc standard tests.
>>>
>>> 2019-04-10  Joao Moreira  
>>>
>>> * targhooks.c (default_print_patchable_function_entry): Emit
>>> __patchable_function_entries section with writable flags to allow
>>> relocation resolution.
>> OK.  Do you have write access to the GCC repo?
>>
> No.
I went ahead and installed on the trunk for you.  Are you going to be
working on GCC regularly?  If so it might make sense to go ahead and get
that access setup.

Jeff


Re: [PATCH] Fix up _mm_maskz_f{,n}m{add,sub}_round_s{s,d} at -O0 (PR target/90125)

2019-04-17 Thread Uros Bizjak
On Wed, Apr 17, 2019 at 8:13 PM Jakub Jelinek  wrote:
>
> Hi!
>
> The following patch fixes a bunch of pastos in the -O0 macros in the
> PR89784 implementation plus testcase coverage that FAILs without the header
> change and succeeds with that (the tests were previously run at -O2 only
> where they test the inline functions and not the macros).
> Because at -O0 the C x * y + z isn't contracted into FMA, there is a small
> precision difference in two of the tests with the chosen constants, so I've
> changed them to ones where a precision difference isn't really possible.
> I think the constants weren't chosen very well, because either we just want
> some basic testing, for which even the adjusted ones are ok, or we want
> to specifically check for FMA, in that case we should check some FMA
> cornercases where without FMA the result is completely different from one
> with FMA.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> And sorry for screwing it up.
>
> 2019-04-17  Hongtao Liu  
>
> PR target/90125
> * config/i386/avx512fintrin.h (_mm_maskz_fmadd_round_sd,
> _mm_maskz_fmadd_round_ss, _mm_maskz_fmsub_round_sd,
> _mm_maskz_fmsub_round_ss, _mm_maskz_fnmadd_round_sd,
> _mm_maskz_fnmadd_round_ss, _mm_maskz_fnmsub_round_sd,
> _mm_maskz_fnmsub_round_ss): Use _maskz builtin instead of _mask3.
>
> 2019-04-17  Jakub Jelinek  
>
> PR target/90125
> * gcc.target/i386/avx512f-vfmsubXXXss-2.c (avx512f_test): Adjust
> constants to ensure precise result even when not using fma.
> * gcc.target/i386/avx512f-vfnmaddXXXss-2.c (avx512f_test): Likewise.
> * gcc.target/i386/avx512f-vfmaddXXXsd-3.c: New test.
> * gcc.target/i386/avx512f-vfmaddXXXss-3.c: New test.
> * gcc.target/i386/avx512f-vfmsubXXXsd-3.c: New test.
> * gcc.target/i386/avx512f-vfmsubXXXss-3.c: New test.
> * gcc.target/i386/avx512f-vfnmaddXXXsd-3.c: New test.
> * gcc.target/i386/avx512f-vfnmaddXXXss-3.c: New test.
> * gcc.target/i386/avx512f-vfnmsubXXXsd-3.c: New test.
> * gcc.target/i386/avx512f-vfnmsubXXXss-3.c: New test.

The patch can be committed under obvious rule.

Thanks,
Uros.

> --- gcc/config/i386/avx512fintrin.h.jj  2019-03-22 11:07:00.699948784 +0100
> +++ gcc/config/i386/avx512fintrin.h 2019-04-17 11:24:53.683695473 +0200
> @@ -12104,10 +12104,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
>  (__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R)
>
>  #define _mm_maskz_fmadd_round_sd(U, A, B, C, R)\
> -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, C, U, R)
> +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, C, U, R)
>
>  #define _mm_maskz_fmadd_round_ss(U, A, B, C, R)\
> -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, C, U, R)
> +(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, C, U, R)
>
>  #define _mm_mask_fmsub_round_sd(A, U, B, C, R)\
>  (__m128d) __builtin_ia32_vfmaddsd3_mask (A, B, -(C), U, R)
> @@ -12122,10 +12122,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
>  (__m128) __builtin_ia32_vfmsubss3_mask3 (A, B, C, U, R)
>
>  #define _mm_maskz_fmsub_round_sd(U, A, B, C, R)\
> -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, B, -(C), U, R)
> +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, B, -(C), U, R)
>
>  #define _mm_maskz_fmsub_round_ss(U, A, B, C, R)\
> -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, B, -(C), U, R)
> +(__m128) __builtin_ia32_vfmaddss3_maskz (A, B, -(C), U, R)
>
>  #define _mm_mask_fnmadd_round_sd(A, U, B, C, R)\
>  (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), C, U, R)
> @@ -12140,10 +12140,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
>  (__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R)
>
>  #define _mm_maskz_fnmadd_round_sd(U, A, B, C, R)\
> -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), C, U, R)
> +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), C, U, R)
>
>  #define _mm_maskz_fnmadd_round_ss(U, A, B, C, R)\
> -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), C, U, R)
> +(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), C, U, R)
>
>  #define _mm_mask_fnmsub_round_sd(A, U, B, C, R)\
>  (__m128d) __builtin_ia32_vfmaddsd3_mask (A, -(B), -(C), U, R)
> @@ -12158,10 +12158,10 @@ _mm_maskz_fnmsub_round_ss (__mmask8 __U,
>  (__m128) __builtin_ia32_vfmsubss3_mask3 (A, -(B), C, U, R)
>
>  #define _mm_maskz_fnmsub_round_sd(U, A, B, C, R)\
> -(__m128d) __builtin_ia32_vfmaddsd3_mask3 (A, -(B), -(C), U, R)
> +(__m128d) __builtin_ia32_vfmaddsd3_maskz (A, -(B), -(C), U, R)
>
>  #define _mm_maskz_fnmsub_round_ss(U, A, B, C, R)\
> -(__m128) __builtin_ia32_vfmaddss3_mask3 (A, -(B), -(C), U, R)
> +(__m128) __builtin_ia32_vfmaddss3_maskz (A, -(B), -(C), U, R)
>  #endif
>
>  #ifdef __OPTIMIZE__
> --- gcc/testsuite/gcc.target/i386/av

Re: [PATCH wwwdocs] Mention GNU Tools Cauldron in the News section

2019-04-17 Thread Jeff Law
On 4/15/19 11:39 AM, Simon Marchi wrote:
> On 2019-04-15 12:42 p.m., Simon Marchi wrote:
>> Hi,
>>
>> Here is a patch that adds a mention of the 2019 Cauldron, similar to the
>> entries for the previous editions.
>> for the previous editions.
>>
>> Thanks,
>>
>> Simon
>>
>>
>> Index: index.html
>> ===
>> RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
>> retrieving revision 1.1125
>> diff -u -r1.1125 index.html
>> --- index.html   29 Mar 2019 12:28:15 -  1.1125
>> +++ index.html   15 Apr 2019 16:39:00 -
>> @@ -54,6 +54,10 @@
>>  News
>>  
>>
>> +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools 
>> Cauldron 2019
>> +[2019-04-15]
>> +Held in Montréal, Canada, September 13-15 2019
>> +
>>  GCC 8.3 released
>>  [2019-02-22]
>>  
>>
> Actually, it would be better to use the same dates as are written on the wiki
> (12-15), so please consider the patch below instead.
>
> Also, please note that I don't have push access on GCC, so if somebody could
> push the patch for me, once it's approved, I would appreciate it.  Thanks!
Thanks.  Committed.
jeff


Re: [PATCH] [ARC][COMMITTED] Fix diagnostic messages.

2019-04-17 Thread Marek Polacek
On Wed, Apr 17, 2019 at 01:25:05PM +0200, Jakub Jelinek wrote:
> On Wed, Apr 17, 2019 at 02:09:33PM +0300, Claudiu Zissulescu wrote:
> >/* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic. 
> >  */
> >if (flag_pic && TARGET_ARC600_FAMILY)
> >  {
> >warning (0,
> > -  "PIC is not supported for %s. Generating non-PIC code only..",
> > +  "PIC is not supported for %s.  Generating non-PIC code only",
> >arc_cpu_string);
> 
> I believe this is undesirable too.  Either use something like
> "PIC is not supported for %s; generating non-PIC code only"
> or split that into two messages
> if (warning (0, "PIC is not supported for %s", arc_cpu_string))
>   inform (input_location, "generating non-PIC code only");

And I suppose we should avoid pleonasm like "PIC code" ;).

Marek


Re: [PATCH 3/3] Fix condition for std::variant to be copy constructible

2019-04-17 Thread Jonathan Wakely

On 17/04/19 19:20 +0300, Ville Voutilainen wrote:

On Wed, 17 Apr 2019 at 19:12, Jonathan Wakely  wrote:


The standard says the std::variant copy constructor is defined as
deleted unless all alternative types are copy constructible, but we were
making it also depend on move constructible. Fix the condition and
enhance the tests to check the semantics with pathological copy-only
types (i.e. supporting copying but having deleted moves).

The enhanced tests revealed a regression in copy assignment for
non-trivial alternative types, where the assignment would not be
performed because the condition in the _Copy_assign_base visitor is
false: is_same_v, remove_reference_t>.
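
A minimal sketch of the copy-constructibility rule described above, assuming a standard library that follows the standard's wording. `CopyOnly` and `copied_value` are hypothetical names invented for illustration; they are not taken from the patch or its tests.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <variant>

// Hypothetical "pathological" alternative: copying works, moves are deleted.
struct CopyOnly
{
  explicit CopyOnly(int v) : value(v) { }
  CopyOnly(const CopyOnly&) = default;
  CopyOnly(CopyOnly&&) = delete;
  CopyOnly& operator=(const CopyOnly&) = default;
  CopyOnly& operator=(CopyOnly&&) = delete;
  int value;
};

static_assert(std::is_copy_constructible_v<CopyOnly>);
static_assert(!std::is_move_constructible_v<CopyOnly>);

// Every alternative is copy constructible, so the variant should be as well,
// even though one alternative cannot be moved.
static_assert(std::is_copy_constructible_v<std::variant<int, CopyOnly>>);

// One alternative is not copy constructible, so the copy constructor
// of the variant is deleted.
static_assert(
  !std::is_copy_constructible_v<std::variant<int, std::unique_ptr<int>>>);

// Copy a variant holding the copy-only alternative and read the copy back.
int copied_value()
{
  std::variant<int, CopyOnly> a(std::in_place_type<CopyOnly>, 7);
  std::variant<int, CopyOnly> b(a);
  assert(std::holds_alternative<CopyOnly>(b));
  return std::get<CopyOnly>(b).value;
}
```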


Tested powerpc64le-linux.

I plan to commit all three of these patches later today, unless
somebody sees a problem with them.


Looks good to me.


Thanks. All three patches committed to trunk.



[PATCH] Use builtin sort instead of shell sort

2019-04-17 Thread Émeric Dupont
Some build environments and configuration options may lead to the make
variable PLUGIN_HEADERS being too long to be passed as parameters to the
shell `echo` command, leading to a "write error" message when making the
target install-plugin.

The following patch fixes this issue by using the [Make $(sort list)][1]
function instead to remove duplicates from the list of headers. There is
no functional change: the value assigned to the shell variable is the
same.
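
For readers unfamiliar with the Make function, here is a rough C++ sketch of what `$(sort list)` is documented to do: sort the whitespace-separated words and drop duplicates. The function name `sort_unique_words` is made up for this illustration, and the byte-wise ordering of `std::sort` is an assumption that may differ from a locale-aware `sort -u`.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model of $(sort list): split on whitespace, sort the words,
// remove duplicates, and join the result with single spaces.
std::string sort_unique_words(const std::string& list)
{
  std::istringstream in(list);
  std::vector<std::string> words;
  for (std::string w; in >> w; )
    words.push_back(w);

  std::sort(words.begin(), words.end());
  words.erase(std::unique(words.begin(), words.end()), words.end());
  assert(std::is_sorted(words.begin(), words.end()));

  std::string out;
  for (const std::string& w : words)
    {
      if (!out.empty())
        out += ' ';
      out += w;
    }
  return out;
}
```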

Tested in production on x86 and armv7 cross-compilation toolchains.
- The length of the headers variable goes from 8+ chars to 7500+

Tested with make bootstrap and make check on host-x86_64-pc-linux-gnu
- make bootstrap successful
- make check fails even before the patch is applied

WARNING: program timed out.
FAIL: libgomp.c/../libgomp.c-c++-common/cancel-parallel-1.c execution test
...
make[4]: *** [Makefile:479: check-DEJAGNU] Error 1

2019-04-15  Emeric Dupont  

* Makefile.in: Use builtin sort instead of shell sort

Signed-off-by: Emeric Dupont 

[1]: 
https://www.gnu.org/software/make/manual/html_node/Text-Functions.html#index-sorting-words

Signed-off-by: Emeric Dupont 
---
 gcc/Makefile.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d186d71c91e..3196e774a26 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3538,7 +3538,7 @@ install-plugin: installdirs lang.install-plugin s-header-vars install-gengtype
 # We keep the directory structure for files in config or c-family and .def
 # files. All other files are flattened to a single directory.
 $(mkinstalldirs) $(DESTDIR)$(plugin_includedir)
-headers=`echo $(PLUGIN_HEADERS) $$(cd $(srcdir); echo *.h *.def) | tr ' ' '\012' | sort -u`; \
+headers=$(sort $(PLUGIN_HEADERS) $$(cd $(srcdir); echo *.h *.def)); \
 srcdirstrip=`echo "$(srcdir)" | sed 's/[].[^$$\\*|]/&/g'`; \
 for file in $$headers; do \
   if [ -f $$file ] ; then \
--
2.21.0




TriaGnoSys GmbH, Registergericht: München HRB 141647, Vat.: DE 813396184 
Geschäftsführer: Núria Riera Díaz, Peter Lewalter



This email and any files transmitted with it are confidential & proprietary to 
Zodiac Inflight Innovations. This information is intended solely for the use of 
the individual or entity to which it is addressed. Access or transmittal of the 
information contained in this e-mail, in full or in part, to any other 
organization or persons is not authorized.


Re: collect2 patch to https in URL

2019-04-17 Thread Jonny Grant




On 17/04/2019 19:11, Jeff Law wrote:

On 4/17/19 6:45 AM, Jonny Grant wrote:

Hello

Change the "collect2 -help" output to have https URL:

Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html

2019-04-14  Jonny Grant  
     * collect2.c: Change gcc.gnu.org URL to HTTPS


Thank you
Jonny

Thanks.  I've installed this on the trunk.

jeff



Excellent



Re: [PATCH] Use builtin sort instead of shell sort

2019-04-17 Thread Émeric Dupont
The 17.04.2019 21:36, Emeric Dupont wrote:
<... Unwanted legalese ...>

Sorry, please disregard the unwanted footer added against my will. I am
actively trying to have our admins get rid of it where it is not
applicable.

--
Emeric Dupont
Zodiac Inflight Innovations
P +49815388678207
Argelsrieder Feld, 22 - 82234 Wessling
www.safran-aerosystems.com





Re: [PATCH][RFC] Improve get_qualified_type linear list walk

2019-04-17 Thread Marc Glisse

On Wed, 17 Apr 2019, Jeff Law wrote:


* tree.c (get_qualified_type): Put found type variants at the
head of the variant list.

Seems quite reasonable to me.   I just hope we don't find a case where
this is the exact worst case behavior ;-)


That seems unlikely. Competitive analysis of the list update problem shows 
that the move-to-front strategy is 2-competitive. Here we also have 
insertions so the problem is different, but still close.
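
As a rough illustration of the move-to-front strategy mentioned above, here is a sketch on a plain singly linked list. It is not the GCC variant-list code; `find_move_to_front` is a hypothetical helper and `std::forward_list<int>` merely stands in for the list of type variants.

```cpp
#include <cassert>
#include <forward_list>

// Hypothetical helper: linear search that moves a found element to the
// head of the list, so repeated lookups of the same value get cheaper.
bool find_move_to_front(std::forward_list<int>& items, int wanted)
{
  auto prev = items.before_begin();
  for (auto it = items.begin(); it != items.end(); prev = it, ++it)
    if (*it == wanted)
      {
        // Relink the node that follows prev right after before_begin().
        // This is a no-op when the element is already at the front.
        items.splice_after(items.before_begin(), items, prev);
        assert(items.front() == wanted);
        return true;
      }
  return false;
}
```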


--
Marc Glisse


Re: [PATCH] Fix up dg-extract-results.sh

2019-04-17 Thread Christophe Lyon
On Wed, 17 Apr 2019 at 18:44, Mike Stump  wrote:
>
> On Apr 17, 2019, at 8:59 AM, Jakub Jelinek  wrote:
> > Ok for trunk?
>
> Ok.

Thanks!


[PATCH] rs6000: Remove a comma in a debug string

2019-04-17 Thread Segher Boessenkool
It is a bit confusing: it looks as if the compiler tried to print
something there.

Committing.


Segher


2019-04-17  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_register_move_cost): Fix typo.

---
 gcc/config/rs6000/rs6000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3c9b557..1b94e16 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -34893,7 +34893,7 @@ rs6000_register_move_cost (machine_mode mode,
 {
   if (dbg_cost_ctrl == 1)
fprintf (stderr,
-"rs6000_register_move_cost:, ret=%d, mode=%s, from=%s, to=%s\n",
+"rs6000_register_move_cost: ret=%d, mode=%s, from=%s, to=%s\n",
 ret, GET_MODE_NAME (mode), reg_class_names[from],
 reg_class_names[to]);
   dbg_cost_ctrl--;
-- 
1.8.3.1



Re: [PATCH] Fixup IRA debug dump output

2019-04-17 Thread Peter Bergner
On 4/17/19 12:57 PM, Jeff Law wrote:
> On 4/17/19 9:35 AM, Peter Bergner wrote:
>>  * ira-conflicts.c (print_allocno_conflicts): Always print something,
>>  even for allocno's with no conflicts.
>>  (print_conflicts): Print an extra newline.
> OK.  And while it's technically not a regression fix, I think this can
> safely go in now :-)


Hi Jeff,

Ok, I committed the patch which is an improvement over the old code.  Thanks!


However, debugging PR87871 some more, I still didn't see p116 conflict with
r0 like Vlad said it did with the new debug output.  Not surprising, since
the patch only affected adding missing \n's to the output.  So I dumped the
OBJECT_TOTAL_CONFLICT_HARD_REGS() output for p116 and sure enough, it does
mention r0.  I then called print_allocno_conflicts() by hand and it still
didn't output r0 as a conflicting hard reg.  Stepping through the debugger,
I see that the:

  if (OBJECT_CONFLICT_ARRAY (obj) == NULL)
{
  fprintf (file, "\n;; total conflict hard regs:\n");
  fprintf (file, ";; conflict hard regs:\n\n");
  continue;
}

...is actually incorrect.  The "if" test only says we don't have any
conflicts with any other allocnos/pseudos.  It doesn't tell us whether we
have any hard register conflicts or not, so we really shouldn't do a continue
here.  Instead, we should guard the code that outputs the allocno conflicts
and then fall down to the hard reg conflict prints, that should also be
suitably guarded.  With the patch below, we now see the missing r0 conflict
Vlad said was there.

  ;; a5(r116,l0) conflicts:
  ;; total conflict hard regs: 0
  ;; conflict hard regs:


cp0:a0(r111)<->a4(r117)@330:move
cp1:a2(r114)<->a3(r112)@41:shuffle
...

Note, I still don't understand why p116 conflicts with r0, but that is
orthogonal to actually printing out the conflict sets as they exist.
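
The control-flow change described above (guard the allocno part, then always fall through to the hard register part) can be sketched as follows. This is a simplified model, not the IRA code: `Object` and `dump_conflicts` are hypothetical, and an empty vector plays the role of a NULL conflict array.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an IRA object with its two kinds of conflicts.
struct Object
{
  std::vector<int> allocno_conflicts;   // empty models a NULL conflict array
  std::vector<int> hard_reg_conflicts;  // hard register numbers
};

std::string dump_conflicts(const std::vector<Object>& objs)
{
  std::string out;
  for (const Object& obj : objs)
    {
      out += ";; conflicts:";
      // Guard only the allocno part.  Using `continue` here instead would
      // also skip the hard register output below.
      if (!obj.allocno_conflicts.empty())
        for (int a : obj.allocno_conflicts)
          out += " a" + std::to_string(a);

      // Always reached, even when there are no allocno conflicts.
      out += "\n;; conflict hard regs:";
      for (int r : obj.hard_reg_conflicts)
        out += " " + std::to_string(r);
      out += "\n";
    }
  return out;
}
```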

Is this ok as well?  ...and I'm sorry for not noticing this issue before.

Peter

* ira-conflicts.c (print_allocno_conflicts): Print the hard register
conflicts, even if there are no allocno conflicts.

Index: gcc/ira-conflicts.c
===
--- gcc/ira-conflicts.c (revision 270420)
+++ gcc/ira-conflicts.c (working copy)
@@ -632,47 +632,58 @@ print_allocno_conflicts (FILE * file, bo
   ira_object_t conflict_obj;
   ira_object_conflict_iterator oci;
 
-  if (OBJECT_CONFLICT_ARRAY (obj) == NULL)
+  if (OBJECT_CONFLICT_ARRAY (obj) != NULL)
{
- fprintf (file, "\n;; total conflict hard regs:\n");
- fprintf (file, ";; conflict hard regs:\n\n");
- continue;
-   }
-
-  if (n > 1)
-   fprintf (file, "\n;;   subobject %d:", i);
-  FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
-   {
- ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
- if (reg_p)
-   fprintf (file, " r%d,", ALLOCNO_REGNO (conflict_a));
- else
+ if (n > 1)
+   fprintf (file, "\n;;   subobject %d:", i);
+ FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
{
- fprintf (file, " a%d(r%d", ALLOCNO_NUM (conflict_a),
-  ALLOCNO_REGNO (conflict_a));
- if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1)
-   fprintf (file, ",w%d", OBJECT_SUBWORD (conflict_obj));
- if ((bb = ALLOCNO_LOOP_TREE_NODE (conflict_a)->bb) != NULL)
-   fprintf (file, ",b%d", bb->index);
+ ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
+ if (reg_p)
+   fprintf (file, " r%d,", ALLOCNO_REGNO (conflict_a));
  else
-   fprintf (file, ",l%d",
-ALLOCNO_LOOP_TREE_NODE (conflict_a)->loop_num);
- putc (')', file);
+   {
+ fprintf (file, " a%d(r%d", ALLOCNO_NUM (conflict_a),
+  ALLOCNO_REGNO (conflict_a));
+ if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1)
+   fprintf (file, ",w%d", OBJECT_SUBWORD (conflict_obj));
+ if ((bb = ALLOCNO_LOOP_TREE_NODE (conflict_a)->bb) != NULL)
+   fprintf (file, ",b%d", bb->index);
+ else
+   fprintf (file, ",l%d",
+ALLOCNO_LOOP_TREE_NODE (conflict_a)->loop_num);
+ putc (')', file);
+   }
}
}
-  COPY_HARD_REG_SET (conflicting_hard_regs, OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
-  AND_COMPL_HARD_REG_SET (conflicting_hard_regs, ira_no_alloc_regs);
-  AND_HARD_REG_SET (conflicting_hard_regs,
-   reg_class_contents[ALLOCNO_CLASS (a)]);
-  print_hard_reg_set (file, "\n;; total conflict hard regs:",
- conflicting_hard_regs);
-
-  COPY_HARD_REG_SET (conflicting_hard_regs, OBJECT_CONFLICT_HARD_REGS (obj));
-  AND_COMPL_HARD_REG_SET (conflic

Re: [PATCH] PR libstdc++/90105 make forward_list::sort stable

2019-04-17 Thread Jonathan Wakely

On 16/04/19 23:16 +0100, Jonathan Wakely wrote:

While testing the fix I also discovered that operator== assumes the
elements are comparable with operator!= which is not required.

PR libstdc++/90105
* include/bits/forward_list.h (operator==): Do not use operator!= to
compare elements.
(forward_list::sort(Comp)): When elements are equal take the one
earlier in the list, so that sort is stable.
* testsuite/23_containers/forward_list/operations/90105.cc: New test.
* testsuite/23_containers/forward_list/comparable.cc: Test with
types that meet the minimum EqualityComparable and LessThanComparable
requirements. Remove irrelevant comment.
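
To illustrate the stability rule in the ChangeLog ("when elements are equal take the one earlier in the list"), here is a small merge sketch. It is not the libstdc++ implementation; `Item` and `stable_merge` are hypothetical, and only `<` on the key is used for comparison.

```cpp
#include <cassert>
#include <forward_list>
#include <utility>

using Item = std::pair<int, char>;  // (sort key, tag recording the origin)

// Merge two lists that are already sorted by key.  Elements of `a` are
// considered to come earlier than elements of `b`.
std::forward_list<Item>
stable_merge(std::forward_list<Item> a, std::forward_list<Item> b)
{
  std::forward_list<Item> out;
  auto tail = out.before_begin();
  while (!a.empty() && !b.empty())
    {
      // Take from b only if its key is strictly smaller; on a tie the
      // element from a (the earlier one) goes first.
      std::forward_list<Item>& src
        = (b.front().first < a.front().first) ? b : a;
      out.splice_after(tail, src, src.before_begin());
      ++tail;
    }
  // At most one of the two lists still has elements; append them.
  out.splice_after(tail, a);
  out.splice_after(tail, b);
  return out;
}
```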

Tested powerpc64le-linux.

I'm surprised nobody has noticed either of these bugs before! I think
this is safe for stage 4, and for backporting to active branches. Any
objections?


Committed to trunk.



[PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-17 Thread Martin Sebor

The fix for pr89797 committed in r270326 was limited to targets
with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
The tests for the fix have been failing with an ICE on aarch64
because it suffers from more or less the same problem but in
its own target-specific code.  Attached is the patch I posted
yesterday that fixes the ICE, successfully bootstrapped and
regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
aarch64.exp tests with an aarch64-linux-elf cross-compiler.
There are no ICEs but there are tons of errors in the latter
tests because many (most?) either expect to be able to find
libc headers or link executables (I have not built libc for
aarch64).
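
The underlying shift issue can be shown in isolation. The sketch below is only an analogy for using HOST_WIDE_INT_1U as the shifted operand: `pow2_u64` is a hypothetical helper, and the 64-bit type is an assumption about the host wide integer width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: compute 2^log2 in a 64-bit unsigned type.
std::uint64_t pow2_u64(unsigned log2)
{
  assert(log2 < 64);  // shifting by the full width or more is undefined
  // Widen the 1 before shifting.  A plain `1 << log2` shifts an int, which
  // is undefined once log2 reaches the width of int (for example 33 with
  // a 32-bit int).
  return std::uint64_t{1} << log2;
}
```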

I'm around tomorrow but then traveling the next two weeks (with
no connectivity the first week) so I unfortunately won't be able
to fix whatever this change might break until the week of May 6.

Jeff, if you have an aarch64 tester that could verify this patch
tomorrow that would help give us some confidence.  Otherwise,
another option to consider for the time being is to xfail
the tests on aarch64.

Thanks
Martin
PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable

gcc/ChangeLog:

	PR middle-end/89797
	* tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
	NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
	* config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
	assuming type size fits in SHWI.

Index: gcc/tree.h
===
--- gcc/tree.h	(revision 270418)
+++ gcc/tree.h	(working copy)
@@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
   if (NUM_POLY_INT_COEFFS == 2)
 {
   poly_uint64 res = 0;
-  res.coeffs[0] = 1 << (precision & 0xff);
+  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
   if (precision & 0x100)
-	res.coeffs[1] = 1 << (precision & 0xff);
+	res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);
   return res;
 }
   else
-return (unsigned HOST_WIDE_INT)1 << precision;
+return HOST_WIDE_INT_1U << precision;
 }
 
 /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c	(revision 270418)
+++ gcc/config/aarch64/aarch64.c	(working copy)
@@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type)
be set for non-predicate vectors of booleans.  Modes are the most
direct way we have of identifying real SVE predicate types.  */
 return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 128;
-  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
+  tree size = TYPE_SIZE (type);
+  unsigned HOST_WIDE_INT align = 128;
+  if (tree_fits_uhwi_p (size))
+align = tree_to_uhwi (TYPE_SIZE (type));
   return MIN (align, 128);
 }
 


[C++ PATCH] PR c++/90047 - ICE with enable_if alias template.

2019-04-17 Thread Jason Merrill
In order to make alias templates useful for SFINAE we instantiate them under
the prevailing 'complain' argument, so an error encountered while
instantiating during SFINAE context is silent.  The problem in this PR comes
when we later look up the erroneous instantiation and don't give an error at
that point.  Fixed by not adding an erroneous instantiation to the hash
table, so we instantiate it again when needed and get the error.  This
required changes to a number of tests, which previously said "substitution
failed:" with no explanation of what the failure was; now we properly
explain.
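
For context, this is the kind of SFINAE use of an alias template that the description refers to. It is a generic sketch, not the testcase from the PR; `enable_if_alias` and `classify` are names made up here.

```cpp
#include <cassert>
#include <type_traits>

// Local alias template in the spirit of std::enable_if_t.  When B is false
// the nested ::type does not exist, so substitution fails.
template<bool B, typename T = void>
using enable_if_alias = typename std::enable_if<B, T>::type;

// Selected for integral arguments; silently discarded otherwise.
template<typename T>
enable_if_alias<std::is_integral<T>::value, int>
classify(T)
{ return 1; }

// Selected for everything else.
template<typename T>
enable_if_alias<!std::is_integral<T>::value, int>
classify(T)
{ return 2; }
```

In each call exactly one overload survives substitution; the failed instantiation of the alias in the other overload must not produce a diagnostic at that point.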

Tested x86_64-pc-linux-gnu, applying to trunk.

* pt.c (tsubst_decl) [TYPE_DECL]: Don't put an erroneous decl in the
hash table when we're in SFINAE context.
---
 gcc/cp/pt.c   |  3 +-
 .../20_util/duration/arithmetic/dr3050.cc |  2 ++
 .../20_util/from_chars/1_c++20_neg.cc |  2 ++
 .../testsuite/20_util/from_chars/1_neg.cc |  2 ++
 .../20_util/shared_ptr/assign/auto_ptr_neg.cc |  1 +
 .../shared_ptr/assign/shared_ptr_neg.cc   |  2 ++
 .../20_util/shared_ptr/cons/unique_ptr_neg.cc |  1 +
 .../testsuite/20_util/to_chars/1_neg.cc   |  2 ++
 .../20_util/tuple/element_access/get_neg.cc   |  2 ++
 .../unique_ptr/cons/ptr_deleter_neg.cc|  2 ++
 .../20_util/unique_ptr/modifiers/reset_neg.cc |  2 ++
 .../deque/requirements/dr438/assign_neg.cc|  2 ++
 .../requirements/dr438/constructor_1_neg.cc   |  2 ++
 .../requirements/dr438/constructor_2_neg.cc   |  2 ++
 .../deque/requirements/dr438/insert_neg.cc|  2 ++
 .../requirements/dr438/assign_neg.cc  |  2 ++
 .../requirements/dr438/constructor_1_neg.cc   |  2 ++
 .../requirements/dr438/constructor_2_neg.cc   |  2 ++
 .../requirements/dr438/insert_neg.cc  |  2 ++
 .../list/requirements/dr438/assign_neg.cc |  2 ++
 .../requirements/dr438/constructor_1_neg.cc   |  2 ++
 .../requirements/dr438/constructor_2_neg.cc   |  2 ++
 .../list/requirements/dr438/insert_neg.cc |  2 ++
 .../vector/requirements/dr438/assign_neg.cc   |  2 ++
 .../requirements/dr438/constructor_1_neg.cc   |  2 ++
 .../requirements/dr438/constructor_2_neg.cc   |  2 ++
 .../vector/requirements/dr438/insert_neg.cc   |  2 ++
 .../memory/shared_ptr/cons/copy_ctor_neg.cc   |  2 ++
 .../shared_ptr/cons/pointer_ctor_neg.cc   |  2 ++
 .../memory/shared_ptr/modifiers/reset_neg.cc  |  2 ++
 gcc/testsuite/g++.dg/cpp0x/alias-decl-67.C| 30 +++
 gcc/testsuite/g++.old-deja/g++.robertl/eb43.C |  2 ++
 gcc/cp/ChangeLog  |  6 
 33 files changed, 96 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-67.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f8001317bda..3a11eaa7630 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13948,7 +13948,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 
DECL_TEMPLATE_INFO (r) = build_template_info (tmpl, argvec);
SET_DECL_IMPLICIT_INSTANTIATION (r);
-   register_specialization (r, gen_tmpl, argvec, false, hash);
+   if (!error_operand_p (r) || (complain & tf_error))
+ register_specialization (r, gen_tmpl, argvec, false, hash);
  }
else
  {
diff --git a/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc 
b/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc
index 5854195dce5..fc64e5a4e61 100644
--- a/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc
+++ b/libstdc++-v3/testsuite/20_util/duration/arithmetic/dr3050.cc
@@ -28,3 +28,5 @@ void test01(std::chrono::seconds s, X x)
   s / x; // { dg-error "no match" }
   s % x; // { dg-error "no match" }
 }
+
+// { dg-prune-output "enable_if" }
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
index 83d297676bf..821cc17413d 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
@@ -36,3 +36,5 @@ test01(const char* first, const char* last)
   std::from_chars(first, last, c32); // { dg-error "no matching" }
   std::from_chars(first, last, c32, 10); // { dg-error "no matching" }
 }
+
+// { dg-prune-output "enable_if" }
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
index 2e3c34c9145..bc52628218a 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
@@ -36,3 +36,5 @@ test01(const char* first, const char* last)
   std::from_chars(first, last, c32); // { dg-error "no matching" }
   std::from_chars(first, last, c32, 10); // { dg-error "no matching" }
 }
+
+// { dg-prune-output "enable_if" }
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/assign/auto_ptr_neg.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/assign/auto_ptr_neg.cc
index 19a73a1d8f2..9c80c77c96e 1006

Go patch committed: Use temporary to avoid early destruction

2019-04-17 Thread Ian Lance Taylor
This patch to the Go frontend fixes a bug in which the code referred
to a temporary value after it was destroyed.  It also fixes an
incorrect test of the string index rather than the value parsed using
strtol.  This should fix PR 90110.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.
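
Both fixes can be illustrated outside the Go frontend. In this sketch, `parse_index` is a hypothetical function and `max_index` simply reuses the limit as it appears in the diff; the point is that the substring must outlive `end`, and that the range check applies to the parsed value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse the decimal number in body[start, stop).
// Returns -1 on malformed or out-of-range input.
long parse_index(const std::string& body, std::size_t start, std::size_t stop)
{
  assert(start <= stop && stop <= body.size());
  const long max_index = 0x7fff;  // limit as shown in the diff

  // Keep the substring in a named variable.  strtol stores a pointer into
  // this buffer in `end`, so the buffer must still exist when *end is read.
  std::string num = body.substr(start, stop - start);
  char* end;
  long val = std::strtol(num.c_str(), &end, 10);

  // Test the parsed value, not the position in the string.
  if (num.empty() || *end != '\0' || val < 0 || val > max_index)
    return -1;
  return val;
}
```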

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 270373)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-20010e494f46d8fd58cfd372093b059578d3379a
+ecbd6562aff604b9559f63d714e922a0c9c2a77f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/import.cc
===
--- gcc/go/gofrontend/import.cc (revision 270373)
+++ gcc/go/gofrontend/import.cc (working copy)
@@ -1478,8 +1478,9 @@ Import_function_body::read_type()
   this->off_ = i + 1;
 
   char *end;
-  long val = strtol(this->body_.substr(start, i - start).c_str(), &end, 10);
-  if (*end != '\0' || i > 0x7fff)
+  std::string num = this->body_.substr(start, i - start);
+  long val = strtol(num.c_str(), &end, 10);
+  if (*end != '\0' || val > 0x7fff)
 {
   if (!this->saw_error_)
go_error_at(this->location(),


[C/C++ PATCH] Further typedef duplicate decl fixes (PR c++/90108)

2019-04-17 Thread Jakub Jelinek
Hi!

As reported, the newly added testcase ICEs with --param ggc-min-heapsize=0.
The problem is that while the remove type is not referenced by anything
else (it is a distinct type created to hold the attributes), there is another
type with TYPE_NAME equal to the newdecl we want to free.  That one is
created in common_handle_aligned_attribute:
{
  if ((flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
/* OK, modify the type in place.  */;
  /* If we have a TYPE_DECL, then copy the type, so that we
 don't accidentally modify a builtin type.  See pushdecl.  */
  else if (decl && TREE_TYPE (decl) != error_mark_node
   && DECL_ORIGINAL_TYPE (decl) == NULL_TREE)
{
  tree tt = TREE_TYPE (decl);
  *type = build_variant_type_copy (*type);
  DECL_ORIGINAL_TYPE (decl) = tt;
  TYPE_NAME (*type) = decl;
  TREE_USED (*type) = TREE_USED (decl);
  TREE_TYPE (decl) = *type;
}
  else
*type = build_variant_type_copy (*type);
where we create a variant type and set the TYPE_NAME too.  I've tried to
remove that else if ... and just do *type = build_variant_type_copy (*type);
but that regressed some DWARF DW_AT_alignment tests.

So, the following patch instead removes the remove type from the variants
list if it is not a main variant (as before), otherwise tries to find in
TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (newdecl)) variant list a type
with TYPE_NAME equal to newdecl and remove that one.
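
The list operation described above, unlinking the variant whose name is the decl being freed, looks roughly like this on a simplified singly linked list. `TypeNode` and `unlink_named_variant` are hypothetical stand-ins, not GCC's tree accessors.

```cpp
#include <cassert>

// Hypothetical, much simplified stand-in for a type node.
struct TypeNode
{
  const void* name;        // plays the role of TYPE_NAME
  TypeNode* next_variant;  // plays the role of TYPE_NEXT_VARIANT
};

// Unlink the first variant after main_variant whose name is `name`.
// Returns true if a variant was removed.
bool unlink_named_variant(TypeNode* main_variant, const void* name)
{
  // Walk the chain through next_variant, looking one node ahead so the
  // predecessor is available for relinking.
  for (TypeNode* t = main_variant; t && t->next_variant; t = t->next_variant)
    if (t->next_variant->name == name)
      {
        t->next_variant = t->next_variant->next_variant;
        return true;
      }
  return false;
}
```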

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-04-18  Jakub Jelinek  

PR c++/90108
* c-decl.c (merge_decls): If remove is main variant and
DECL_ORIGINAL_TYPE is some other type, remove a DECL_ORIGINAL_TYPE
variant that has newdecl as TYPE_NAME if any.

* decl.c (duplicate_decls): If remove is main variant and
DECL_ORIGINAL_TYPE is some other type, remove a DECL_ORIGINAL_TYPE
variant that has newdecl as TYPE_NAME if any.

* c-c++-common/pr90108.c: New test.

--- gcc/c/c-decl.c.jj   2019-04-17 21:21:39.936133112 +0200
+++ gcc/c/c-decl.c  2019-04-17 23:25:08.098936888 +0200
@@ -2513,7 +2513,24 @@ merge_decls (tree newdecl, tree olddecl,
{
  tree remove = TREE_TYPE (newdecl);
  if (TYPE_MAIN_VARIANT (remove) == remove)
-   gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
+   {
+ gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
+ /* If remove is the main variant, no need to remove that
+from the list.  One of the DECL_ORIGINAL_TYPE
+variants, e.g. created for aligned attribute, might still
+refer to the newdecl TYPE_DECL though, so remove that one
+in that case.  */
+ if (DECL_ORIGINAL_TYPE (newdecl)
+ && DECL_ORIGINAL_TYPE (newdecl) != remove)
+   for (tree t = TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (newdecl));
+t; t = TYPE_MAIN_VARIANT (t))
+ if (TYPE_NAME (TYPE_NEXT_VARIANT (t)) == newdecl)
+   {
+ TYPE_NEXT_VARIANT (t)
+   = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t));
+ break;
+   }
+   }   
  else
for (tree t = TYPE_MAIN_VARIANT (remove); ;
 t = TYPE_NEXT_VARIANT (t))
--- gcc/cp/decl.c.jj2019-04-17 21:21:39.753136091 +0200
+++ gcc/cp/decl.c   2019-04-17 23:27:13.995875527 +0200
@@ -2133,7 +2133,24 @@ next_arg:;
{
  tree remove = TREE_TYPE (newdecl);
  if (TYPE_MAIN_VARIANT (remove) == remove)
-   gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
+   {
+ gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
+ /* If remove is the main variant, no need to remove that
+from the list.  One of the DECL_ORIGINAL_TYPE
+variants, e.g. created for aligned attribute, might still
+refer to the newdecl TYPE_DECL though, so remove that one
+in that case.  */
+ if (tree orig = DECL_ORIGINAL_TYPE (newdecl))
+   if (orig != remove)
+ for (tree t = TYPE_MAIN_VARIANT (orig); t;
+  t = TYPE_MAIN_VARIANT (t))
+   if (TYPE_NAME (TYPE_NEXT_VARIANT (t)) == newdecl)
+ {
+   TYPE_NEXT_VARIANT (t)
+ = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t));
+   break;
+ }
+   }   
  else
for (tree t = TYPE_MAIN_VARIANT (remove); ;
 t = TYPE_NEXT_VARIANT (t))
--- gcc/testsuite/c-c++-common/pr90108.c.jj 2019-04-17 23:18:23.466566296 
+0200
+++ gcc/testsuite/c-c++-com

[PATCH] i18n fix for gimple-ssa-sprintf.c (PR translation/79183)

2019-04-17 Thread Jakub Jelinek
Hi!

This patch fixes the following messages, so that they are translatable even
to languages that don't use the English
Plural-Forms: nplurals=2; plural=n != 1;
See 
https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html#Plural-forms
for more details.
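
To show why a hard-coded `min == 1` test is not enough, here is a sketch comparing the English rule with the Polish rule listed in the gettext manual, which has three forms. The function names are invented for this example; a real translation catalog supplies the rule through its Plural-Forms header.

```cpp
#include <cassert>
#include <cstddef>

// English: nplurals=2; plural=n != 1;
std::size_t plural_index_english(unsigned long n)
{
  return n != 1;
}

// Polish, per the gettext manual: nplurals=3;
// plural=n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
std::size_t plural_index_polish(unsigned long n)
{
  if (n == 1)
    return 0;
  if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
    return 1;
  return 2;
}
```

Because the index depends on the language, the caller has to hand over the count together with the message forms, which is what the `inform_n` calls in the patch do.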

Bootstrapped/regtested on x86_64-linux and i686-linux, plus generated
gcc.pot and eyeballed the changes.  Ok for trunk?

2019-04-18  Jakub Jelinek  

PR translation/79183
* gimple-ssa-sprintf.c (format_directive): Use inform_n instead of
inform where appropriate.

--- gcc/gimple-ssa-sprintf.c.jj 2019-04-10 09:26:49.476692760 +0200
+++ gcc/gimple-ssa-sprintf.c2019-04-17 21:37:51.535294586 +0200
@@ -3016,12 +3016,10 @@ format_directive (const sprintf_dom_walk
 help the user figure out how big a buffer they need.  */
 
  if (min == max)
-   inform (callloc,
-   (min == 1
-? G_("%qE output %wu byte into a destination of size %wu")
-: G_("%qE output %wu bytes into a destination of size "
- "%wu")),
-   info.func, min, info.objsize);
+   inform_n (callloc, min,
+ "%qE output %wu byte into a destination of size %wu",
+ "%qE output %wu bytes into a destination of size %wu",
+ info.func, min, info.objsize);
  else if (max < HOST_WIDE_INT_MAX)
inform (callloc,
"%qE output between %wu and %wu bytes into "
@@ -3044,11 +3042,9 @@ format_directive (const sprintf_dom_walk
 of printf with no destination size just print the computed
 result.  */
  if (min == max)
-   inform (callloc,
-   (min == 1
-? G_("%qE output %wu byte")
-: G_("%qE output %wu bytes")),
-   info.func, min);
+   inform_n (callloc, min,
+ "%qE output %wu byte", "%qE output %wu bytes",
+ info.func, min);
  else if (max < HOST_WIDE_INT_MAX)
inform (callloc,
"%qE output between %wu and %wu bytes",

Jakub