[PATCH][PR 67328] Improve bitfield testing

2017-01-25 Thread Yuri Gribov
Hi all,

This fixes inefficient bitfield code reported in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67328

Bootstrapped and regtested on x86_64.

Ok for trunk?

-I


pr67328-2.patch


Re: [PATCH] Add --with-gcc-major-version-only support to libhsail-rt

2017-01-25 Thread Jakub Jelinek
On Tue, Jan 24, 2017 at 11:59:36PM +0100, Jakub Jelinek wrote:
> Though, I wonder why configure.ac/Makefile.am have been based on one of the
> only 2 that aren't GPL licensed, there are over dozen other libraries that
> have very simple GPL configure.ac and Makefile.am, can't we just rewrite
> those based on those other files?

Please ignore this part, the whole library is BSD-ish licensed, so having
the Makefile.am/configure.ac also BSD-ish makes sense.

Jakub


[PATCH] Fix PR78363

2017-01-25 Thread Richard Biener

The following patch fixes PR78363, debug confused by early debug emitted
from inconsistent IL which happens after OMP outlining wrecks parts of
the BLOCK tree (outlined TYPE_DECLs have wrong context).

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?

Thanks,
Richard.

2017-01-25  Richard Biener  

PR debug/78363
* omp-expand.c: Include debug.h.
(expand_omp_taskreg): Make sure to generate early debug before
outlining anything from a function.
(expand_omp_target): Likewise.
(grid_expand_target_grid_body): Likewise.

* g++.dg/gomp/pr78363-1.C: New testcase.
* g++.dg/gomp/pr78363-2.C: Likewise.
* g++.dg/gomp/pr78363-3.C: Likewise.

Index: gcc/omp-expand.c
===
--- gcc/omp-expand.c(revision 244890)
+++ gcc/omp-expand.c(working copy)
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.
 #include "gomp-constants.h"
 #include "gimple-pretty-print.h"
 #include "hsa-common.h"
+#include "debug.h"
 
 
 /* OMP region information.  Every parallel and workshare
@@ -1305,6 +1306,11 @@ expand_omp_taskreg (struct omp_region *r
   else
block = gimple_block (entry_stmt);
 
+  /* Make sure to generate early debug for the function before
+ outlining anything.  */
+  if (! gimple_in_ssa_p (cfun))
+   (*debug_hooks->early_global_decl) (cfun->decl);
+
   new_bb = move_sese_region_to_fn (child_cfun, entry_bb, exit_bb, block);
   if (exit_bb)
single_succ_edge (new_bb)->flags = EDGE_FALLTHRU;
@@ -7016,6 +7022,11 @@ expand_omp_target (struct omp_region *re
  gsi_remove (&gsi, true);
}
 
+  /* Make sure to generate early debug for the function before
+ outlining anything.  */
+  if (! gimple_in_ssa_p (cfun))
+   (*debug_hooks->early_global_decl) (cfun->decl);
+
   /* Move the offloading region into CHILD_CFUN.  */
 
   block = gimple_block (entry_stmt);
@@ -7589,6 +7600,11 @@ grid_expand_target_grid_body (struct omp
   init_tree_ssa (cfun);
   pop_cfun ();
 
+  /* Make sure to generate early debug for the function before
+ outlining anything.  */
+  if (! gimple_in_ssa_p (cfun))
+(*debug_hooks->early_global_decl) (cfun->decl);
+
   tree old_parm_decl = DECL_ARGUMENTS (kern_fndecl);
   gcc_assert (!DECL_CHAIN (old_parm_decl));
   tree new_parm_decl = copy_node (DECL_ARGUMENTS (kern_fndecl));
Index: gcc/testsuite/g++.dg/gomp/pr78363-1.C
===
--- gcc/testsuite/g++.dg/gomp/pr78363-1.C   (nonexistent)
+++ gcc/testsuite/g++.dg/gomp/pr78363-1.C   (working copy)
@@ -0,0 +1,14 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-options "-g -fopenmp" }
+
+int main()
+{
+  int n = 0;
+
+#pragma omp parallel for reduction (+: n)
+  for (int i = [](){ return 3; }(); i < 10; ++i)
+n++;
+
+  return n;
+}
Index: gcc/testsuite/g++.dg/gomp/pr78363-2.C
===
--- gcc/testsuite/g++.dg/gomp/pr78363-2.C   (nonexistent)
+++ gcc/testsuite/g++.dg/gomp/pr78363-2.C   (working copy)
@@ -0,0 +1,15 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-options "-g -fopenmp" }
+
+int main()
+{
+  int n = 0;
+#pragma omp target map(tofrom:n)
+#pragma omp for reduction (+: n)
+  for (int i = [](){ return 3; }(); i < 10; ++i)
+n++;
+  if (n != 7)
+__builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/g++.dg/gomp/pr78363-3.C
===
--- gcc/testsuite/g++.dg/gomp/pr78363-3.C   (nonexistent)
+++ gcc/testsuite/g++.dg/gomp/pr78363-3.C   (working copy)
@@ -0,0 +1,14 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-options "-g -fopenmp" }
+
+int main()
+{
+  int n = 0;
+#pragma omp task shared (n)
+  for (int i = [](){ return 3; }(); i < 10; ++i)
+n = i;
+#pragma omp taskwait
+  if (n != 7)
+__builtin_abort ();
+}


Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Christophe Lyon
On 24 January 2017 at 18:15, Bernd Schmidt  wrote:
> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>
>> Ha... the regression occurred between r 244818  and r 244816,
>> and I read r 244816 ChangeLog too quickly and did not notice
>> it was modifying ifcvt.c in addition to x86-only files.
>>
>> So it's likely that it's your other patch for pr78634
>> that caused the regression I mentioned. Does it make
>> more sense?
>
>
> That's possible. That added a missing cost check, so the question becomes -
> is the change in generated assembly sensible, given the selected CPU type?
>

I can now confirm that the change is indeed caused by r244816 (pr78634).
The difference in the generated asm is:
-   vmov        d17, r0, r1
-   vmov        d16, r2, r3
-   vcmp.f64    d17, d16
+   vmov        d16, r0, r1
+   vmov        d17, r2, r3
+   vcmp.f64    d16, d17
    vmrs        APSR_nzcv, FPSCR
-   vselvs.f64  d16, d16, d17
+   vmovvs.f64  d16, d17

which, besides swapping d16 and d17, summarizes to
-   vselvs.f64  d16, d16, d17
+   vmovvs.f64  d16, d17

I'm not sure if there is a "best" one ?

>
> Bernd


Re: [PATCH] Fix PR78363

2017-01-25 Thread Jakub Jelinek
On Wed, Jan 25, 2017 at 09:52:41AM +0100, Richard Biener wrote:
> 2017-01-25  Richard Biener  
> 
>   PR debug/78363
>   * omp-expand.c: Include debug.h.
>   (expand_omp_taskreg): Make sure to generate early debug before
>   outlining anything from a function.
>   (expand_omp_target): Likewise.
>   (grid_expand_target_grid_body): Likewise.
> 
>   * g++.dg/gomp/pr78363-1.C: New testcase.
>   * g++.dg/gomp/pr78363-2.C: Likewise.
>   * g++.dg/gomp/pr78363-3.C: Likewise.

Ok, with minor nit:

> --- gcc/testsuite/g++.dg/gomp/pr78363-1.C (nonexistent)
> +++ gcc/testsuite/g++.dg/gomp/pr78363-1.C (working copy)
> @@ -0,0 +1,14 @@
> +// { dg-do compile }
> +// { dg-require-effective-target c++11 }
> +// { dg-options "-g -fopenmp" }
> +
> +int main()
> +{
> +  int n = 0;
> +
> +#pragma omp parallel for reduction (+: n)
> +  for (int i = [](){ return 3; }(); i < 10; ++i)
> +n++;
> +
> +  return n;
> +}
> Index: gcc/testsuite/g++.dg/gomp/pr78363-2.C
> ===
> --- gcc/testsuite/g++.dg/gomp/pr78363-2.C (nonexistent)
> +++ gcc/testsuite/g++.dg/gomp/pr78363-2.C (working copy)
> @@ -0,0 +1,15 @@
> +// { dg-do compile }
> +// { dg-require-effective-target c++11 }
> +// { dg-options "-g -fopenmp" }

Please replace dg-options with:
// { dg-additional-options "-g" }
-fopenmp -Wno-hsa is the default, while dg-options of -g -fopenmp
overrides that and -Wno-hsa would be lost.  While it doesn't matter
in the first and last testcase (no offloading in those), on this one
I bet -Whsa (on by default) will warn if gcc is configured with hsa
offloading, because it is not gridifiable.

> +
> +int main()
> +{
> +  int n = 0;
> +#pragma omp target map(tofrom:n)
> +#pragma omp for reduction (+: n)
> +  for (int i = [](){ return 3; }(); i < 10; ++i)
> +n++;
> +  if (n != 7)
> +__builtin_abort ();
> +  return 0;
> +}
> Index: gcc/testsuite/g++.dg/gomp/pr78363-3.C
> ===
> --- gcc/testsuite/g++.dg/gomp/pr78363-3.C (nonexistent)
> +++ gcc/testsuite/g++.dg/gomp/pr78363-3.C (working copy)
> @@ -0,0 +1,14 @@
> +// { dg-do compile }
> +// { dg-require-effective-target c++11 }
> +// { dg-options "-g -fopenmp" }
> +
> +int main()
> +{
> +  int n = 0;
> +#pragma omp task shared (n)
> +  for (int i = [](){ return 3; }(); i < 10; ++i)
> +n = i;
> +#pragma omp taskwait
> +  if (n != 7)
> +__builtin_abort ();
> +}

Jakub


Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Kyrill Tkachov


On 25/01/17 08:53, Christophe Lyon wrote:
> On 24 January 2017 at 18:15, Bernd Schmidt  wrote:
>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>>
>>> Ha... the regression occurred between r 244818  and r 244816,
>>> and I read r 244816 ChangeLog too quickly and did not notice
>>> it was modifying ifcvt.c in addition to x86-only files.
>>>
>>> So it's likely that it's your other patch for pr78634
>>> that caused the regression I mentioned. Does it make
>>> more sense?
>>
>>
>> That's possible. That added a missing cost check, so the question becomes -
>> is the change in generated assembly sensible, given the selected CPU type?
>>
>
> I can now confirm that the change is indeed caused by r244816 (pr78634).
> The difference in the generated asm is:
> -   vmov        d17, r0, r1
> -   vmov        d16, r2, r3
> -   vcmp.f64    d17, d16
> +   vmov        d16, r0, r1
> +   vmov        d17, r2, r3
> +   vcmp.f64    d16, d17
>     vmrs        APSR_nzcv, FPSCR
> -   vselvs.f64  d16, d16, d17
> +   vmovvs.f64  d16, d17
>
> which, besides swapping d16 and d17, summarizes to
> -   vselvs.f64  d16, d16, d17
> +   vmovvs.f64  d16, d17
>
> I'm not sure if there is a "best" one ?
>

The test is supposed to test the generation of the vsel instruction.
I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
VSEL isn't actually available on Cortex-A5, it's just enabled by the
-mfpu=fp-armv8 option.
A more realistic configuration would target an ARMv8-A CPU like the Cortex-A57.

Thanks,
Kyrill

>> Bernd




Re: [PATCH] [AArch64] Enable AES and cmp_branch fusion for Thunderx2t99

2017-01-25 Thread Kyrill Tkachov

Hi Naveen,

On 25/01/17 06:16, Hurugalawadi, Naveen wrote:
> Hi,
>
> Please find attached the patch that adds AES and CMP_BRANCH
> fusion for Thunderx2t99.
>
> Bootstrapped and Regression tested on aarch64-thunderx2t99.
> Please review the patch and let us know if its okay?

Code looks ok (it's quite simple), but I can't approve.
There are also a couple of issues with the ChangeLog:

> 2017-1-25  Naveen H.S 

The date should be written 2017-01-25.
Also, use two spaces between the name and the email.

> gcc
>  * config/aarch64/aarch64.c (thunderx2t99_tunings):
> Improve vector initialization code gen.

This doesn't fit the code in the patch.

Cheers,
Kyrill


Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Christophe Lyon
On 25 January 2017 at 10:18, Kyrill Tkachov  wrote:
>
> On 25/01/17 08:53, Christophe Lyon wrote:
>>
>> On 24 January 2017 at 18:15, Bernd Schmidt  wrote:
>>>
>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:

>>>> Ha... the regression occurred between r 244818  and r 244816,
>>>> and I read r 244816 ChangeLog too quickly and did not notice
>>>> it was modifying ifcvt.c in addition to x86-only files.
>>>>
>>>> So it's likely that it's your other patch for pr78634
>>>> that caused the regression I mentioned. Does it make
>>>> more sense?
>>>
>>>
>>> That's possible. That added a missing cost check, so the question becomes
>>> -
>>> is the change in generated assembly sensible, given the selected CPU
>>> type?
>>>
>> I can now confirm that the change is indeed caused by r244816 (pr78634).
>> The difference in the generated asm is:
>> -   vmov        d17, r0, r1
>> -   vmov        d16, r2, r3
>> -   vcmp.f64    d17, d16
>> +   vmov        d16, r0, r1
>> +   vmov        d17, r2, r3
>> +   vcmp.f64    d16, d17
>>     vmrs        APSR_nzcv, FPSCR
>> -   vselvs.f64  d16, d16, d17
>> +   vmovvs.f64  d16, d17
>>
>> which, besides swapping d16 and d17, summarizes to
>> -   vselvs.f64  d16, d16, d17
>> +   vmovvs.f64  d16, d17
>>
>> I'm not sure if there is a "best" one ?
>>
>
> The test is supposed to test the generation of the vsel instruction.
> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
> VSEL isn't actually available on Cortex-A5, it's just enabled by the
> -mfpu=fp-armv8 option.
> A more realistic configuration would target an ARMv8-A CPU like the
> Cortex-A57.
>

Yes indeed, it's always confusing to be able to provide "incompatible"
-mcpu and -mfpu flags (as in: no such combination actually exists).


> Thanks,
> Kyrill
>
>>> Bernd
>
>


Re: [PATCH] BRIG frontend: request for a global review

2017-01-25 Thread Thomas Schwinge
Hi!

On Tue, 24 Jan 2017 13:52:10 +0100, Martin Jambor  wrote:
> [BRIG front end]

"contrib/gcc_update" needs to be updated for "libhsail-rt".


Here is a patch to fix some Autotools issues in libhsail-rt (currently
testing); OK for trunk?

commit 00d64708323f74191ce5a39b223bca92295fc606
Author: Thomas Schwinge 
Date:   Wed Jan 25 10:33:56 2017 +0100

libhsail-rt: Fix some Autotools issues

* Makefile.am (ACLOCAL_AMFLAGS): Set to "-I .. -I ../config".
* configure.ac: Don't instantiate AC_CONFIG_MACRO_DIR.
* config.h.in: Remove stale file.
* Makefile.in: Regenerate.
* aclocal.m4: Regenerate.
* configure: Regenerate.
---
 libhsail-rt/Makefile.am  |   4 +-
 libhsail-rt/Makefile.in  |  18 ++--
 libhsail-rt/aclocal.m4   |  72 ++--
 libhsail-rt/config.h.in  | 217 ---
 libhsail-rt/configure|  25 +++---
 libhsail-rt/configure.ac |   2 -
 6 files changed, 71 insertions(+), 267 deletions(-)

diff --git libhsail-rt/Makefile.am libhsail-rt/Makefile.am
index ef12df8..3f8806a 100644
--- libhsail-rt/Makefile.am
+++ libhsail-rt/Makefile.am
@@ -44,14 +44,14 @@
 
 AUTOMAKE_OPTIONS = foreign subdir-objects
 
+ACLOCAL_AMFLAGS = -I .. -I ../config
+
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
 MAINT_CHARSET = latin1
 
 mkinstalldirs = $(SHELL) $(toplevel_srcdir)/mkinstalldirs
 
-ACLOCAL_AMFLAGS = -I m4
-
 WARN_CFLAGS = $(WARN_FLAGS) $(WERROR)
 
 # -I/-D flags to pass when compiling.
diff --git libhsail-rt/Makefile.in libhsail-rt/Makefile.in
index 250cfbc..2e5c8df 100644
--- libhsail-rt/Makefile.in
+++ libhsail-rt/Makefile.in
@@ -97,17 +97,17 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 subdir = .
-DIST_COMMON = README $(srcdir)/Makefile.in $(srcdir)/Makefile.am \
-   $(top_srcdir)/configure $(am__configure_deps) \
-   $(srcdir)/target-config.h.in $(srcdir)/../mkinstalldirs \
-   $(srcdir)/../depcomp
+DIST_COMMON = README ChangeLog $(srcdir)/Makefile.in \
+   $(srcdir)/Makefile.am $(top_srcdir)/configure \
+   $(am__configure_deps) $(srcdir)/target-config.h.in \
+   $(srcdir)/../mkinstalldirs $(srcdir)/../depcomp
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/depstand.m4 \
$(top_srcdir)/../config/lead-dot.m4 \
-   $(top_srcdir)/../config/multi.m4 $(top_srcdir)/../libtool.m4 \
-   $(top_srcdir)/../ltoptions.m4 $(top_srcdir)/../ltsugar.m4 \
-   $(top_srcdir)/../ltversion.m4 $(top_srcdir)/../lt~obsolete.m4 \
-   $(top_srcdir)/configure.ac
+   $(top_srcdir)/../config/override.m4 \
+   $(top_srcdir)/../libtool.m4 $(top_srcdir)/../ltoptions.m4 \
+   $(top_srcdir)/../ltsugar.m4 $(top_srcdir)/../ltversion.m4 \
+   $(top_srcdir)/../lt~obsolete.m4 $(top_srcdir)/configure.ac
 am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
 am__CONFIG_DISTCLEAN_FILES = config.status config.cache config.log \
@@ -300,10 +300,10 @@ top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
 AUTOMAKE_OPTIONS = foreign subdir-objects
+ACLOCAL_AMFLAGS = -I .. -I ../config
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 MAINT_CHARSET = latin1
 mkinstalldirs = $(SHELL) $(toplevel_srcdir)/mkinstalldirs
-ACLOCAL_AMFLAGS = -I m4
 WARN_CFLAGS = $(WARN_FLAGS) $(WERROR)
 
 # -I/-D flags to pass when compiling.
diff --git libhsail-rt/aclocal.m4 libhsail-rt/aclocal.m4
index f77a2da..7a56c88 100644
--- libhsail-rt/aclocal.m4
+++ libhsail-rt/aclocal.m4
@@ -1,7 +1,8 @@
-# generated automatically by aclocal 1.11.1 -*- Autoconf -*-
+# generated automatically by aclocal 1.11.6 -*- Autoconf -*-
 
 # Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
-# 2005, 2006, 2007, 2008, 2009  Free Software Foundation, Inc.
+# 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation,
+# Inc.
 # This file is free software; the Free Software Foundation
 # gives unlimited permission to copy and/or distribute it,
 # with or without modifications, as long as this notice is preserved.
@@ -19,12 +20,15 @@ You have another version of autoconf.  It may work, but is not guaranteed to.
 If you have problems, you may need to regenerate the build system entirely.
 To do so, use the procedure documented by the package, typically `autoreconf'.])])
 
-# Copyright (C) 2002, 2003, 2005, 2006, 2007, 2008  Free Software Foundation, Inc.
+# Copyright (C) 2002, 2003, 2005, 2006, 2007, 2008, 2011 Free Software
+# Foundation, Inc.
 #
 # This file is free software; the Free Software Foundation
 # gives unlimited permission to copy and/or distribute it,
 # with or without modifications, as long as this notice is preserved.
 
+# serial 1
+
 # AM_AUTOMAKE_VERSION(VERSION)
 # 
 # Automake X.Y traces this macro to ensure aclocal.m4 has been
@@ -57,12 +61,14 @@ _AM_AUTOCONF_VERSION(

Re: [PATCH][wwwdocs] Mention new store merging pass for GCC 7

2017-01-25 Thread Kyrill Tkachov


On 24/01/17 13:44, Richard Earnshaw (lists) wrote:

On 23/01/17 16:45, Gerald Pfeifer wrote:

Hi Kyrill,

On Mon, 23 Jan 2017, Kyrill Tkachov wrote:

This patch adds a short entry for the store merging pass in GCC 7 to the
"General Optimizer Improvements" section.

+  A new store merging pass has been added.  It will attempt to merge
+  constant stores to adjacent memory locations into fewer wider stores.
+  It can be enabled by using the -fstore-merging option and is
+  enabled by default at the -O2 optimization level or
+  higher.

I also think you should either use 'fewer, wider, stores' (with commas)
or, if you don't like the commas: 'a smaller number of wider stores'.

R.

Here I'd say "it attempts to merge" or, better yet, let's just say
"it merges".

Let's not be too shy. :-)  (This still does not claim that it always
succeeds or anything like that, mind.)

Okay, with that note taken into consideration.


Thanks, Gerald, Richard. I've done that, and also added that it's enabled at 
-Os as well.
Committing this to the repo.

Kyrill


Thanks,
Gerald


Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.39
diff -U 3 -r1.39 changes.html
--- htdocs/gcc-7/changes.html	17 Jan 2017 21:26:31 -	1.39
+++ htdocs/gcc-7/changes.html	24 Jan 2017 13:45:11 -
@@ -40,11 +40,14 @@
 
 
 General Optimizer Improvements
-
 
 
 New Languages and Language specific improvements


Re: [PATCH][doc] Correct optimisation levels documentation for -fstore-merging

2017-01-25 Thread Kyrill Tkachov


On 23/01/17 23:39, Jeff Law wrote:

On 01/23/2017 10:28 AM, Kyrill Tkachov wrote:

Hi all,

I had forgotten to update the -fstore-merging documentation from a
previous iteration of the pass
and it says that it's enabled at -O and higher. The option is in fact
enabled at -O2 and higher, as well as -Os.
This patch clarifies that.

Is this ok? Or is there a more preferred style of listing optimisation
levels?

Thanks,
Kyrill

2016-01-23  Kyrylo Tkachov  

* doc/invoke.texi (-fstore-merging): Correct default optimization
levels at which it is enabled.

I think you also need to remove -fstore-merging from list of options turned on 
by -O:

@option{-O} turns on the following optimization flags:
[ ... ]
-fstore-merging @gol


And instead add it to the list of options enabled at -O2 and higher which 
immediately follows.

OK with those changes.


Thanks Jeff, Sandra.
I've done that. Committing this version to trunk.

Kyrill

2016-01-25  Kyrylo Tkachov  

* doc/invoke.texi (-fstore-merging): Correct default optimization
levels at which it is enabled.
(-O): Move -fstore-merging from list to...
(-O2): ... Here.



jeff


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 45af80c..59ab394 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7012,7 +7012,6 @@ compilation time.
 -fsplit-wide-types @gol
 -fssa-backprop @gol
 -fssa-phiopt @gol
--fstore-merging @gol
 -ftree-bit-ccp @gol
 -ftree-ccp @gol
 -ftree-ch @gol
@@ -7072,6 +7071,7 @@ also turns on the following optimization flags:
 -frerun-cse-after-loop  @gol
 -fsched-interblock  -fsched-spec @gol
 -fschedule-insns  -fschedule-insns2 @gol
+-fstore-merging @gol
 -fstrict-aliasing -fstrict-overflow @gol
 -ftree-builtin-call-dce @gol
 -ftree-switch-conversion -ftree-tail-merge @gol
@@ -8342,7 +8342,7 @@ early.  This flag is enabled by default at @option{-O} and higher.
 Perform merging of narrow stores to consecutive memory addresses.  This pass
 merges contiguous stores of immediate values narrower than a word into fewer
 wider stores to reduce the number of instructions.  This is enabled by default
-at @option{-O} and higher.
+at @option{-O2} and higher as well as @option{-Os}.
 
 @item -ftree-ter
 @opindex ftree-ter
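
To make the documented behaviour concrete, here is a minimal sketch of the kind of code the store merging pass targets. The struct and function names (Flags, init_narrow, init_wide) are made up for illustration and do not come from the patch; init_wide is a hand-written equivalent of the merged form, not the output of the pass itself.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical example type: four adjacent byte-sized fields.
struct Flags {
  std::uint8_t a, b, c, d;
};
static_assert(sizeof(Flags) == 4, "four bytes, no padding expected");

// Four narrow constant stores to consecutive addresses.  This is the
// shape of code that store merging looks for.
void init_narrow(Flags *f) {
  f->a = 1;
  f->b = 2;
  f->c = 3;
  f->d = 4;
}

// Hand-written equivalent of the merged form: assemble the same four
// bytes into one 32-bit value and write it with a single wide store.
// Going through memcpy keeps the result independent of endianness.
void init_wide(Flags *f) {
  const std::uint8_t bytes[4] = {1, 2, 3, 4};
  std::uint32_t word;
  std::memcpy(&word, bytes, sizeof word);
  std::memcpy(f, &word, sizeof word);
}
```

Both functions leave the object in the same state; whether the compiler actually merges the stores in init_narrow depends on the target and on the option being enabled (-O2 and higher, or -Os, per the text above).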


[PATCH] Fix PR69264

2017-01-25 Thread Richard Biener

This fixes PR69264, reverting an earlier change that was trying to
bypass the broken/unclear vector_alignment_reachable hook.  x86
doesn't define this hook and thus inherits the default implementation.

Now - the args we feed to the hook changed over time, esp. the
is_packed arg now (correctly) tells the hook about the alignment
of the access and whether it is naturally aligned (according to its size).

The hook was originally added for power where alignment of double
is 32bits but vector double requires 128bit alignment and thus peeling
might never reach proper aligned vector accesses.  Nowadays the
default implementation of the hook already ensures proper behavior
here but it also contains weird code from the times is_packed was
just TYPE_PACKED of the access.
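
As a rough illustration of why peeling cannot always reach vector alignment, here is a small standalone sketch. The helper peel_iterations is hypothetical and is not GCC code; it only models the arithmetic, assuming each peeled scalar iteration advances the address by the element size.

```cpp
#include <cstdint>

// Returns how many scalar iterations must be peeled so that an access
// starting at byte address ADDR becomes VEC_ALIGN-aligned when each
// iteration advances by ELEM_SIZE bytes, or -1 if no number of peeled
// iterations can get there.  The residues of ADDR + n * ELEM_SIZE modulo
// VEC_ALIGN repeat with a period of at most VEC_ALIGN, so trying
// n < VEC_ALIGN is enough.
int peel_iterations(std::uintptr_t addr, unsigned elem_size,
                    unsigned vec_align) {
  for (unsigned n = 0; n < vec_align; ++n) {
    if ((addr + static_cast<std::uintptr_t>(n) * elem_size) % vec_align == 0)
      return static_cast<int>(n);
  }
  return -1;
}
```

With 8-byte doubles and 16-byte vectors, a naturally aligned start such as address 8 needs one peeled iteration, while a start that is only 4-byte aligned, such as address 4, cycles through residues 4 and 12 and never reaches 0. That is the power case described above, and it is what the is_packed argument lets the hook reject.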

So the following simplifies it up to the point that I know no target
that isn't happy with the default implementation which all targets
can use to get conservative correct behavior.

But at this point I didn't want to remove the hook (targets can
individually remove theirs please -- comments in most of those
hooks tell me they didn't really understand the purpose of the
hook, thus I clarified its docs).

Similar misconceptions may be in the related support_vector_misalignment
hook (for the is_packed handling for targets requiring element aligned
vectors only for example).

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-01-25  Richard Biener  

PR tree-optimization/69264
* target.def (vector_alignment_reachable): Improve documentation.
* targhooks.c (default_builtin_vector_alignment_reachable): Simplify
and add a comment.
* tree-vect-data-refs.c (vect_supportable_dr_alignment): Revert
earlier changes with respect to TYPE_USER_ALIGN.
(vector_alignment_reachable_p): Likewise.  Improve dumping.

* g++.dg/torture/pr69264.C: New testcase.

Index: gcc/target.def
===
--- gcc/target.def  (revision 244890)
+++ gcc/target.def  (working copy)
@@ -1801,10 +1801,10 @@ misalignment value (@var{misalign}).",
  default_builtin_vectorization_cost)
 
 /* Return true if vector alignment is reachable (by peeling N
-   iterations) for the given type.  */
+   iterations) for the given scalar type.  */
 DEFHOOK
 (vector_alignment_reachable,
- "Return true if vector alignment is reachable (by peeling N iterations) for the given type.",
+ "Return true if vector alignment is reachable (by peeling N iterations) for the given scalar type @var{type}.  @var{is_packed} is false if the scalar access using @var{type} is known to be naturally aligned.",
  bool, (const_tree type, bool is_packed),
  default_builtin_vector_alignment_reachable)
 
Index: gcc/targhooks.c
===
--- gcc/targhooks.c (revision 244890)
+++ gcc/targhooks.c (working copy)
@@ -1127,20 +1127,12 @@ default_vector_alignment (const_tree typ
   return align;
 }
 
+/* By default assume vectors of element TYPE require a multiple of the natural
+   alignment of TYPE.  TYPE is naturally aligned if IS_PACKED is false.  */
 bool
-default_builtin_vector_alignment_reachable (const_tree type, bool is_packed)
+default_builtin_vector_alignment_reachable (const_tree /*type*/, bool is_packed)
 {
-  if (is_packed)
-return false;
-
-  /* Assuming that types whose size is > pointer-size are not guaranteed to be
- naturally aligned.  */
-  if (tree_int_cst_compare (TYPE_SIZE (type), bitsize_int (POINTER_SIZE)) > 0)
-return false;
-
-  /* Assuming that types whose size is <= pointer-size
- are naturally aligned.  */
-  return true;
+  return ! is_packed;
 }
 
 /* By default, assume that a target supports any factor of misalignment
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 244890)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -1098,12 +1098,9 @@ vector_alignment_reachable_p (struct dat
   bool is_packed = not_size_aligned (DR_REF (dr));
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"Unknown misalignment, is_packed = %d\n",is_packed);
-  if ((TYPE_USER_ALIGN (type) && !is_packed)
- || targetm.vectorize.vector_alignment_reachable (type, is_packed))
-   return true;
-  else
-   return false;
+"Unknown misalignment, %snaturally aligned\n",
+is_packed ? "not " : "");
+  return targetm.vectorize.vector_alignment_reachable (type, is_packed);
 }
 
   return true;
@@ -6153,10 +6150,8 @@ vect_supportable_dr_alignment (struct da
   if (!known_alignment_for_access_p (dr))
is_packed = not_size_aligned (DR_REF (dr));
 
-  if ((TYPE_USER_ALIGN (type) && !is_packed)
- || targetm.vectorize.
-  support_vector_mi

Re: [PATCH] Fix PR78189

2017-01-25 Thread Kyrill Tkachov


On 23/01/17 19:26, Christophe Lyon wrote:

Hi Nick,

On 23 January 2017 at 10:04, Richard Biener  wrote:

On Fri, 20 Jan 2017, Nick Clifton wrote:


Hi Guys,

   [I have been asked to look at this PR in the hopes that it can be
   fixed soon and so no longer act as a blocker for the gcc 7 branch].

   It seems to me that Richard's proposed patch does work:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00909.html

   The only problem is that the check_effective_target_vect_hw_misalign
   proc is always returning 0 (or false) for ARM, even when unaligned
   vectors are supported.  This is why Richard's patch introduces a new
   failure for the arm-* targets.

   So what I would like to suggest is an extended patch (attached) which
   also updates the check_effective_target_vect_hw_misalign proc to use
   the check_effective_target_arm_vect_no_misalign proc.  With this patch
   applied not only does the gcc.dg/vect/vect-strided-a-u8-i2-gap.c test
   pass for both big-endian and little-endian arm targets, but there is also a
   significant reduction in the number of failures in the gcc.dg/vect
   tests overall:

   Little Endian ARM:
< # of expected passes            3275
< # of unexpected failures        63
< # of unexpected successes       125
< # of expected failures          123
< # of unsupported tests          153
---
> # of expected passes            3448
> # of unexpected failures        2
> # of unexpected successes       14
> # of expected failures          131
> # of unsupported tests          151

   Big Endian ARM:
< # of expected passes            2995
< # of unexpected failures        269
< # of unexpected successes       21
< # of expected failures          128
---
> # of expected passes            3037
> # of unexpected failures        127
> # of unexpected successes       24
> # of expected failures          228

   Which looks like a win to me.  So - any objections to my applying this
   patch and then closing the PR ?

Ok.


I must be missing something, but I see many regressions since you
committed this patch (r244796).
See
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/244796/report-build-info.html
for more details.

In short, on arm-*,
   gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorized 1 loops" 1
   gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
"vectorized 1 loops" 1
now FAIL instead of PASS.


I also see this when testing 
-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard.

Thanks,
Kyrill


on armeb, there are many more differences.

Christophe



Thanks,
Richard.


Cheers
   Nick

gcc/ChangeLog
2017-01-20  Richard Biener  
   Nick Clifton  

   PR testsuite/78421
   * lib/target-supports.exp (check_effective_target_vect_hw_misalign):
   If the target is ARM return the result of the
   check_effective_target_arm_vect_no_misalign proc.
   * gcc.dg/vect/vect-strided-a-u8-i2-gap.c: If the target does not
   support unaligned vectors then only expect one of the loops to be
   unrolled.



--
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)




RFA: Patch for ARM PR77770

2017-01-25 Thread Nick Clifton
Hi Richard, Hi Ramana,

  The patch below is a simple fix for PR77770.  I am not really
  expecting you to agree with it, but I thought that it was worth
  posting so that this PR could be looked at again and maybe a better
  patch found.  (Plus I am trying to close PRs so that the gcc 7 branch
  will happen...)

  The patch is simple - it disparages the SF-mode store alternative in
  the thumb1_movsf_insn just enough to break the infinite load/store
  cycle being triggered in reload.  The patch does not introduce any
  regressions, although I suspect it might affect code quality for
  thumb1.  (I have not checked this).  Nor have I gone deep into why
  reload is generating an infinite cycle of SF load/store insns.  That
  was a bit beyond me.  But the patch works and maybe that is enough.

  Obviously, if you think that this patch is OK, I would also create a
  new ARM specific gcc testsuite entry based upon the simplified test in
  the PR.

  So - OK to apply ?
  
Cheers
  Nick

Index: gcc/config/arm/thumb1.md
===
--- gcc/config/arm/thumb1.md(revision 244853)
+++ gcc/config/arm/thumb1.md(working copy)
@@ -865,7 +865,7 @@
(set_attr "conds" "clob,nocond,nocond,nocond,nocond")])
 ;;; ??? This should have alternatives for constants.
 (define_insn "*thumb1_movsf_insn"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=l,l,>,l, m,*r,*h")
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=l,l,>,l, ?m,*r,*h")
(match_operand:SF 1 "general_operand"  "l, >,l,mF,l,*h,*r"))]
   "TARGET_THUMB1
&& (   register_operand (operands[0], SFmode)


Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Richard Earnshaw (lists)
On 25/01/17 09:29, Christophe Lyon wrote:
> On 25 January 2017 at 10:18, Kyrill Tkachov  
> wrote:
>>
>> On 25/01/17 08:53, Christophe Lyon wrote:
>>>
>>> On 24 January 2017 at 18:15, Bernd Schmidt  wrote:

>>>> On 01/24/2017 06:03 PM, Christophe Lyon wrote:
>>>>>
>>>>> Ha... the regression occurred between r 244818  and r 244816,
>>>>> and I read r 244816 ChangeLog too quickly and did not notice
>>>>> it was modifying ifcvt.c in addition to x86-only files.
>>>>>
>>>>> So it's likely that it's your other patch for pr78634
>>>>> that caused the regression I mentioned. Does it make
>>>>> more sense?
>>>>
>>>>
>>>> That's possible. That added a missing cost check, so the question becomes
>>>> -
>>>> is the change in generated assembly sensible, given the selected CPU
>>>> type?

>>> I can now confirm that the change is indeed caused by r244816 (pr78634).
>>> The difference in the generated asm is:
>>> -   vmov        d17, r0, r1
>>> -   vmov        d16, r2, r3
>>> -   vcmp.f64    d17, d16
>>> +   vmov        d16, r0, r1
>>> +   vmov        d17, r2, r3
>>> +   vcmp.f64    d16, d17
>>>     vmrs        APSR_nzcv, FPSCR
>>> -   vselvs.f64  d16, d16, d17
>>> +   vmovvs.f64  d16, d17
>>>
>>> which, besides swapping d16 and d17, summarizes to
>>> -   vselvs.f64  d16, d16, d17
>>> +   vmovvs.f64  d16, d17
>>>
>>> I'm not sure if there is a "best" one ?
>>>
>>
>> The test is supposed to test the generation of the vsel instruction.
>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>> -mfpu=fp-armv8 option.
>> A more realistic configuration would target an ARMv8-A CPU like the
>> Cortex-A57.
>>
> 
> Yes indeed, it's always confusing to be able to provide "incompatible"
> -mcpu and -mfpu flags (as in: no such combination actually exists).
> 

As discussed at the Cauldron, I'm working on fixing that, but it won't
be until GCC-8 now.

R.

> 
>> Thanks,
>> Kyrill
>>
 Bernd
>>
>>



Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]

2017-01-25 Thread Richard Biener
On Tue, Jan 24, 2017 at 4:05 PM, Jeff Law  wrote:
> On 01/24/2017 07:29 AM, Marc Glisse wrote:
>>
>> On Tue, 24 Jan 2017, Richard Biener wrote:
>>
 That was my thought as well, but AFAICT we only call into match.pd
 from VRP if we changed the insn.
>>>
>>>
>>> Yes - there was thoughts to change that (but it comes at an expense).
>>> Basically we'd like to re-fold stmts that indirectly use stmts we
>>> changed.  We certainly don't want to re-fold everything all the time.
>>
>>
>> VRP is kind of a special case, every variable for which it finds a
>> new/improved range could be considered changed, since it may trigger
>> some extra transformation in match.pd (same for CCP and the nonzero
>> mask).
>
> But that would assume that match.pd is relying on range information and
> could thus use the improved range information.  *If* match.pd is using the
> range information generated by VRP, it's not terribly pervasive.
>
> But waiting until forwprop3 means we're leaving a ton of useless blocks and
> statements in the IL for this testcase, and likely other code using
> std::vec.
>
> Perhaps rather than open-coding a fix in VRP I could have VRP call into
> match.pd slightly more aggressively (say just for gimple_cond).  That may be
> enough to capture the effects much earlier in the pipeline without trying to
> fold *everything*.

Sure, the only disadvantage of doing it in VRP (in vrp_fold_stmt) is that you
may end up doing it twice.

Richard.

> Jeff
>
>


Re: [PATCH TEST]Remove xfail for gcc.dg/vect/vect-24.c on ARM targets

2017-01-25 Thread Richard Earnshaw (lists)
On 24/01/17 17:22, Bin Cheng wrote:
> Hi,
> Test gcc.dg/vect/vect-24.c starts passing after my vectorizer changes, but 
> not on all targets.  For x86_64, looks like other passes still mess up the IR 
> and prevent it from being vectorized.  This patch removes xfail for ARM 
> targets.
> Test result checked.  Is it OK?
> 
> Thanks,
> bin
> 
> gcc/testsuite/ChangeLog
> 2017-01-23  Bin Cheng  
> 
>   * gcc.dg/vect/vect-24.c: Remove xfail on ARM targets.
> 
> 
> xfail-2.txt
> 
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-24.c 
> b/gcc/testsuite/gcc.dg/vect/vect-24.c
> index 09a6d7e..0511f7b 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-24.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-24.c
> @@ -122,6 +122,5 @@ int main (void)
>  
>return main1 ();
>  }
> -
> -/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail 
> *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail { 
> { ! aarch64*-*-* } && { ! arm-*-* } } } } } */
>  /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 
> "vect" } } */
> 

OK.

R.


Re: [PATCH][PR 67328] Improve bitfield testing

2017-01-25 Thread Richard Biener
On Wed, 25 Jan 2017, Yuri Gribov wrote:

> Hi all,
> 
> This fixes inefficient bitfield code reported in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67328
> 
> Bootstrapped and regtested on x86_64.
> 
> Ok for trunk?

This isn't a regression fix and thus not appropriate at this stage.

Some comments on the patch:

+/* A & (2**N - 1) <= 2**K - 1 -> ~(A & (2**N - 2**K)
+   A & (2**N - 1) <  2**K -> ~(A & (2**N - 2**K)
+   A & (2**N - 1) >= 2**K -> A & (2**N - 2**K)
+   A & (2**N - 1) >  2**K - 1 -> A & (2**N - 2**K)
+ */

you miss the != 0/== 0 in the result (and the ~ is redundant then).

Note that A & (2**N - 1) >= 2**K should already have been simplified
to A & (2**N - 1) >  2**K - 1 (we canonicalize to smaller constants).

+  (if (TYPE_UNSIGNED (TREE_TYPE (@0)) && tree_fits_uhwi_p (@2) && 
tree_fits_uhwi_p (@3))
+   (with
+{

I think you should restrict this to INTEGRAL_TYPE_P types.

Please use wide-ints so you do not restrict yourself to fits_uhwi_p
values.

Thanks,
Richard.
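
For reference, a minimal sketch of the corrected form of the comment above,
with the == 0 / != 0 spelled out and no ~.  It is not part of the patch: the
helper names are made up for illustration, and it assumes unsigned operands
and 0 <= K <= N.  It brute-forces the equivalence between
"A & (2**N - 1) < 2**K" and "(A & (2**N - 2**K)) == 0" over small widths.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: compare the low N bits of A against 2**K.  */
static int
low_bits_below (uint32_t a, unsigned n, unsigned k)
{
  uint32_t low_mask = (UINT32_C (1) << n) - 1;
  return (a & low_mask) < (UINT32_C (1) << k);
}

/* Rewritten form: test bits K..N-1 of A against zero.  */
static int
high_part_is_zero (uint32_t a, unsigned n, unsigned k)
{
  uint32_t mask = (UINT32_C (1) << n) - (UINT32_C (1) << k);
  return (a & mask) == 0;
}

/* Return 1 if both forms agree for every 16-bit A and all
   0 <= K <= N <= 16, and 0 otherwise.  */
static int
forms_agree (void)
{
  for (unsigned n = 0; n <= 16; n++)
    for (unsigned k = 0; k <= n; k++)
      for (uint32_t a = 0; a <= 0xffff; a++)
        if (low_bits_below (a, n, k) != high_part_is_zero (a, n, k))
          return 0;
  return 1;
}
```

The >= and > variants are simply the negation, i.e. the same mask tested
with != 0.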


[PATCH][ARM] PR target/79145 Fix xordi3 expander for immediate operands in iWMMXt

2017-01-25 Thread Kyrill Tkachov

Hi all,

We're hitting an ICE when expanding a DImode xor with an immediate on 
TARGET_IWMMXT:
(insn 6 5 7 2 (set (reg:DI 111 [ t1.1_3 ])
(xor:DI (reg:DI 110 [ t1.0_2 ])
(const_int 85 [0x55]))) ./z32.c:13 -1
 (nil))

The problem is that the general xordi3 expander accepts some immediates in 
operand 2 but the iwmmxt_xordi3
define_insn only accepts register operands, and nothing forces the operand into 
a register in between.
This doesn't affect the iordi3 or anddi3 expanders because their predicates are 
designed to accept immediates
valid for the VORR and VBIC NEON instructions and thus check for TARGET_NEON as 
well, so they don't accept any
immediates during expand time for TARGET_IWMMXT.

A fix could be to modify arm_xordi_operand to allow only register operands for 
TARGET_IWMMXT.
Another approach, used in this patch, is to force the constants into registers 
in the expander itself.

Bootstrapped and tested on arm-none-linux-gnueabihf (I don't have access to 
iWMMXt hardware).

Ok for trunk and the branches after some time?
This patch should only affect TARGET_IWMMXT and therefore is pretty safe at any 
stage.

Thanks,
Kyrill

2016-01-25  Kyrylo Tkachov  

PR target/79145
* config/arm/arm.md (xordi3): Force constant operand into a register
for TARGET_IWMMXT.

2016-01-25  Kyrylo Tkachov  

PR target/79145
* gcc.target/arm/pr79145.c: New test.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 2eee8bc5701297f52e5ed991f074f1069bde1b6e..48bf07e5b6c121944b38ab8d0d14b029d2b34560 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -3328,7 +3328,14 @@ (define_expand "xordi3"
 	(xor:DI (match_operand:DI 1 "s_register_operand" "")
 		(match_operand:DI 2 "arm_xordi_operand" "")))]
   "TARGET_32BIT"
-  ""
+  {
+/* The iWMMXt pattern for xordi3 accepts only register operands but we want
+   to reuse this expander for all TARGET_32BIT targets so just force the
+   constants into a register.  Unlike for the anddi3 and iordi3 there are
+   no NEON instructions that take an immediate.  */
+if (TARGET_IWMMXT && !REG_P (operands[2]))
+  operands[2] = force_reg (DImode, operands[2]);
+  }
 )
 
 (define_insn_and_split "*xordi3_insn"
diff --git a/gcc/testsuite/gcc.target/arm/pr79145.c b/gcc/testsuite/gcc.target/arm/pr79145.c
new file mode 100644
index ..667824400390d6fe72d05a85769d210791b8c378
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr79145.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mcpu=*" } { "-mcpu=iwmmxt" } } */
+/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mabi=*" } { "-mabi=iwmmxt" } } */
+/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-march=*" } { "-march=iwmmxt" } } */
+/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" } } */
+/* { dg-require-effective-target arm32 } */
+/* { dg-require-effective-target arm_iwmmxt_ok } */
+/* { dg-options "-mcpu=iwmmxt" } */
+
+int
+main (void)
+{
+  volatile long long t1;
+  t1 ^= 0x55;
+  return 0;
+}


Re: RFA: Patch for ARM PR77770

2017-01-25 Thread Richard Earnshaw (lists)
On 25/01/17 10:28, Nick Clifton wrote:
> Hi Richard, Hi Ramana,
> 
>   The patch below is a simple fix for PR0.  I am not really
>   expecting you to agree with it, but I thought that it was worth
>   posting so that this PR could be looked at again and maybe a better
>   patch found.  (Plus I am trying to close PRs so that the gcc 7 branch
>   will happen...)
> 
>   The patch is simple - it disparages the SF-mode store alternative in
>   the thumb1_movsf_insn just enough to break the infinite load/store
>   cycle being triggered in reload.  The patch does not introduce any
>   regressions, although I suspect it might affect code quality for
>   thumb1.  (I have not checked this).  Nor have I gone deep into why
>   reload is generating an infinite cycle of SF load/store insns.  That
>   was a bit beyond me.  But the patch works and maybe that is enough.
> 
>   Obviously, if you think that this patch is OK, I would also create a
>   new ARM specific gcc testsuite entry based upon the simplified test in
>   the PR.
> 
>   So - OK to apply ?

No, I don't think so.  At least not on trunk.

If, come the time of the release, we have no better solution it might be
ok to put something like this on the release branch as a palliative, but
I don't think we should just paper over the underlying problem.  It
could be that another testcase will still fail with this change.

R.

>   
> Cheers
>   Nick
> 
> Index: gcc/config/arm/thumb1.md
> ===
> --- gcc/config/arm/thumb1.md  (revision 244853)
> +++ gcc/config/arm/thumb1.md  (working copy)
> @@ -865,7 +865,7 @@
> (set_attr "conds" "clob,nocond,nocond,nocond,nocond")])
>  ;;; ??? This should have alternatives for constants.
>  (define_insn "*thumb1_movsf_insn"
> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=l,l,>,l, m,*r,*h")
> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=l,l,>,l, ?m,*r,*h")
>   (match_operand:SF 1 "general_operand"  "l, >,l,mF,l,*h,*r"))]
>"TARGET_THUMB1
> && (   register_operand (operands[0], SFmode)
> 



Re: [PATCH][ARM] PR target/79145 Fix xordi3 expander for immediate operands in iWMMXt

2017-01-25 Thread Richard Earnshaw (lists)
On 25/01/17 10:58, Kyrill Tkachov wrote:
> Hi all,
> 
> We're hitting an ICE when expanding a DImode xor with an immediate on
> TARGET_IWMMXT:
> (insn 6 5 7 2 (set (reg:DI 111 [ t1.1_3 ])
> (xor:DI (reg:DI 110 [ t1.0_2 ])
> (const_int 85 [0x55]))) ./z32.c:13 -1
>  (nil))
> 
> The problem is that the general xordi3 expander accepts some immediates
> in operand 2 but the iwmmxt_xordi3
> define_insn only accepts register operands, and nothing forces the
> operand into a register in between.
> This doesn't affect the iordi3 or anddi3 expanders because their
> predicates are designed to accept immediates
> valid for the VORR and VBIC NEON instructions and thus check for
> TARGET_NEON as well, so they don't accept any
> immediates during expand time for TARGET_IWMMXT.
> 
> A fix could be to modify arm_xordi_operand to allow only register
> operands for TARGET_IWMMXT.
> Another approach, used in this patch, is to force the constants into
> registers in the expander itself.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf (I don't have access
> to iWMMXt hardware).
> 
> Ok for trunk and the branches after some time?
> This patch should only affect TARGET_IWMMXT and therefore is pretty safe
> at any stage.
> 
> Thanks,
> Kyrill
> 
> 2016-01-25  Kyrylo Tkachov  
> 
> PR target/79145
> * config/arm/arm.md (xordi3): Force constant operand into a register
> for TARGET_IWMMXT.
> 
> 2016-01-25  Kyrylo Tkachov  
> 
> PR target/79145
> * gcc.target/arm/pr79145.c: New test.
> 

OK.

R.

> iwmmxt-xor.patch
> 
> 
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 
> 2eee8bc5701297f52e5ed991f074f1069bde1b6e..48bf07e5b6c121944b38ab8d0d14b029d2b34560
>  100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -3328,7 +3328,14 @@ (define_expand "xordi3"
>   (xor:DI (match_operand:DI 1 "s_register_operand" "")
>   (match_operand:DI 2 "arm_xordi_operand" "")))]
>"TARGET_32BIT"
> -  ""
> +  {
> +/* The iWMMXt pattern for xordi3 accepts only register operands but we 
> want
> +   to reuse this expander for all TARGET_32BIT targets so just force the
> +   constants into a register.  Unlike for the anddi3 and iordi3 there are
> +   no NEON instructions that take an immediate.  */
> +if (TARGET_IWMMXT && !REG_P (operands[2]))
> +  operands[2] = force_reg (DImode, operands[2]);
> +  }
>  )
>  
>  (define_insn_and_split "*xordi3_insn"
> diff --git a/gcc/testsuite/gcc.target/arm/pr79145.c 
> b/gcc/testsuite/gcc.target/arm/pr79145.c
> new file mode 100644
> index 
> ..667824400390d6fe72d05a85769d210791b8c378
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr79145.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mcpu=*" } 
> { "-mcpu=iwmmxt" } } */
> +/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mabi=*" } 
> { "-mabi=iwmmxt" } } */
> +/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-march=*" } 
> { "-march=iwmmxt" } } */
> +/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { 
> "" } } */
> +/* { dg-require-effective-target arm32 } */
> +/* { dg-require-effective-target arm_iwmmxt_ok } */
> +/* { dg-options "-mcpu=iwmmxt" } */
> +
> +int
> +main (void)
> +{
> +  volatile long long t1;
> +  t1 ^= 0x55;
> +  return 0;
> +}
> 



Re: [PATCH][PR 67328] Improve bitfield testing

2017-01-25 Thread Yuri Gribov
On Wed, Jan 25, 2017 at 10:49 AM, Richard Biener  wrote:
> On Wed, 25 Jan 2017, Yuri Gribov wrote:
>
>> Hi all,
>>
>> This fixes inefficient bitfield code reported in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67328
>>
>> Bootstrapped and regtested on x86_64.
>>
>> Ok for trunk?
>
> This isn't a regression fix and thus not appropriate at this stage.

Definitely, just wanted to get initial round of comments. Thanks for
review, will get back with updated patch at stage 1.

> Some comments on the patch:
>
> +/* A & (2**N - 1) <= 2**K - 1 -> ~(A & (2**N - 2**K)
> +   A & (2**N - 1) <  2**K -> ~(A & (2**N - 2**K)
> +   A & (2**N - 1) >= 2**K -> A & (2**N - 2**K)
> +   A & (2**N - 1) >  2**K - 1 -> A & (2**N - 2**K)
> + */
>
> you miss the != 0/== 0 in the result (and the ~ is redundant then).
>
> Note that A & (2**N - 1) >= 2**K should already have been simplified
> to A & (2**N - 1) >  2**K - 1 (we canonicalize to smaller constants).
>
> +  (if (TYPE_UNSIGNED (TREE_TYPE (@0)) && tree_fits_uhwi_p (@2) &&
> tree_fits_uhwi_p (@3))
> +   (with
> +{
>
> I think you should restrict this to INTEGRAL_TYPE_P types.
>
> Please use wide-ints so you do not restrict yourself to fits_uhwi_p
> values.
>
> Thanks,
> Richard.


Re: [PATCH] Add --with-gcc-major-version-only support to libhsail-rt

2017-01-25 Thread Richard Biener
On Tue, Jan 24, 2017 at 11:59 PM, Jakub Jelinek  wrote:
> Hi!
>
> Apparently the configury of this library has been copied over before the
> PR79046 changes were done, the following patch updates it.  Ok for trunk?

Ok.

Richard.

> Though, I wonder why configure.ac/Makefile.am have been based on one of the
> only 2 that aren't GPL licensed, there are over dozen other libraries that
> have very simple GPL configure.ac and Makefile.am, can't we just rewrite
> those based on those other files?
>
> 2017-01-24  Jakub Jelinek  
>
> PR other/79046
> * configure.ac: Add GCC_BASE_VER.
> * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
> get version from BASE-VER file.
> (ACLOCAL_AMFLAGS): Set to -I .. -I ../config .
> * aclocal.m4: Regenerated.
> * configure: Regenerated.
> * Makefile.in: Regenerated.
>
> --- libhsail-rt/configure.ac.jj 2017-01-24 23:29:11.0 +0100
> +++ libhsail-rt/configure.ac2017-01-24 23:48:13.743605310 +0100
> @@ -147,5 +147,8 @@ AC_CONFIG_HEADER(target-config.h)
>  AC_CHECK_SIZEOF([int])
>  AC_CHECK_SIZEOF([void*])
>
> +# Determine what GCC version number to use in filesystem paths.
> +GCC_BASE_VER
> +
>  # Must be last
>  AC_OUTPUT
> --- libhsail-rt/Makefile.am.jj  2017-01-24 23:29:12.0 +0100
> +++ libhsail-rt/Makefile.am 2017-01-24 23:49:42.93518 +0100
> @@ -44,13 +44,13 @@
>
>  AUTOMAKE_OPTIONS = foreign subdir-objects
>
> -gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
> +gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
>
>  MAINT_CHARSET = latin1
>
>  mkinstalldirs = $(SHELL) $(toplevel_srcdir)/mkinstalldirs
>
> -ACLOCAL_AMFLAGS = -I m4
> +ACLOCAL_AMFLAGS = -I .. -I ../config
>
>  WARN_CFLAGS = $(WARN_FLAGS) $(WERROR)
>
> @@ -120,5 +120,3 @@ AM_MAKEFLAGS = \
> "DESTDIR=$(DESTDIR)"
>
>  MAKEOVERRIDES=
> -
> -
> --- libhsail-rt/aclocal.m4.jj   2017-01-24 23:29:12.0 +0100
> +++ libhsail-rt/aclocal.m4  2017-01-24 23:49:56.352268889 +0100
> @@ -1,7 +1,8 @@
> -# generated automatically by aclocal 1.11.1 -*- Autoconf -*-
> +# generated automatically by aclocal 1.11.6 -*- Autoconf -*-
>
>  # Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
> -# 2005, 2006, 2007, 2008, 2009  Free Software Foundation, Inc.
> +# 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation,
> +# Inc.
>  # This file is free software; the Free Software Foundation
>  # gives unlimited permission to copy and/or distribute it,
>  # with or without modifications, as long as this notice is preserved.
> @@ -19,12 +20,15 @@ You have another version of autoconf.  I
>  If you have problems, you may need to regenerate the build system entirely.
>  To do so, use the procedure documented by the package, typically 
> `autoreconf'.])])
>
> -# Copyright (C) 2002, 2003, 2005, 2006, 2007, 2008  Free Software 
> Foundation, Inc.
> +# Copyright (C) 2002, 2003, 2005, 2006, 2007, 2008, 2011 Free Software
> +# Foundation, Inc.
>  #
>  # This file is free software; the Free Software Foundation
>  # gives unlimited permission to copy and/or distribute it,
>  # with or without modifications, as long as this notice is preserved.
>
> +# serial 1
> +
>  # AM_AUTOMAKE_VERSION(VERSION)
>  # 
>  # Automake X.Y traces this macro to ensure aclocal.m4 has been
> @@ -57,12 +61,14 @@ _AM_AUTOCONF_VERSION(m4_defn([AC_AUTOCON
>
>  # AM_AUX_DIR_EXPAND -*- Autoconf -*-
>
> -# Copyright (C) 2001, 2003, 2005  Free Software Foundation, Inc.
> +# Copyright (C) 2001, 2003, 2005, 2011 Free Software Foundation, Inc.
>  #
>  # This file is free software; the Free Software Foundation
>  # gives unlimited permission to copy and/or distribute it,
>  # with or without modifications, as long as this notice is preserved.
>
> +# serial 1
> +
>  # For projects using AC_CONFIG_AUX_DIR([foo]), Autoconf sets
>  # $ac_aux_dir to `$srcdir/foo'.  In other projects, it is set to
>  # `$srcdir', `$srcdir/..', or `$srcdir/../..'.
> @@ -144,14 +150,14 @@ AC_CONFIG_COMMANDS_PRE(
>  Usually this means the macro was only invoked conditionally.]])
>  fi])])
>
> -# Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2009
> -# Free Software Foundation, Inc.
> +# Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2009,
> +# 2010, 2011 Free Software Foundation, Inc.
>  #
>  # This file is free software; the Free Software Foundation
>  # gives unlimited permission to copy and/or distribute it,
>  # with or without modifications, as long as this notice is preserved.
>
> -# serial 10
> +# serial 12
>
>  # There are a few dirty hacks below to avoid letting `AC_PROG_CC' be
>  # written in clear, in which case automake, when reading aclocal.m4,
> @@ -191,6 +197,7 @@ AC_CACHE_CHECK([dependency style of $dep
># instance it was reported that on HP-UX the gcc test will end up
># making a dummy file named `D' -- becau

Re: [PATCH v2] aarch64: Add split-stack initial support

2017-01-25 Thread Jiong Wang

On 24/01/17 18:05, Adhemerval Zanella wrote:


On 03/01/2017 13:13, Wilco Dijkstra wrote:


+  /* If function uses stacked arguments save the old stack value so morestack
+ can return it.  */
+  reg11 = gen_rtx_REG (Pmode, R11_REGNUM);
+  if (cfun->machine->frame.saved_regs_size
+  || cfun->machine->frame.saved_varargs_size)
+emit_move_insn (reg11, stack_pointer_rtx);

This doesn't look right - we could have many arguments even without varargs or
saved regs.  This would need to check varargs as well as ctrl->args.size (I 
believe
that is the size of the arguments on the stack). It's fine to omit this 
optimization
in the first version - we already emit 2-3 extra instructions for the check 
anyway.

I will check for a better solution.


Hi Adhemerval

  My only concern with this patch is the initialization of R11 (the internal arg
pointer).  The current implementation appears to generate wrong code for a
testcase that simply returns the sum of ten int parameters: the function body
uses R11, but the split prologue never initializes it, so if the execution flow
does *not* go through __morestack, R11 is left uninitialized.

As Wilco suggested, I feel that using crtl->args.size instead of
cfun->machine->frame.saved_regs_size might be the correct approach, after
checking assign_parms in function.c.
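
A testcase of the kind described above might look like the following.  This
is only a sketch, not the exact file that was used: the function name is
hypothetical, and it is meant to be built with -fsplit-stack.  Under the
AArch64 procedure call standard the first eight integer arguments are passed
in registers, so the last two are passed on the stack and have to be reached
through the incoming argument pointer.

```c
#include <assert.h>

/* Ten int parameters: a..h arrive in w0-w7 on AArch64, while i and j
   are passed on the stack.  noinline keeps the call (and therefore the
   callee's prologue) from being optimized away.  */
__attribute__ ((noinline)) int
sum10 (int a, int b, int c, int d, int e, int f, int g, int h, int i, int j)
{
  return a + b + c + d + e + f + g + h + i + j;
}
```

If the stacked arguments are addressed through a register that the split
prologue did not set up, the sum comes out wrong whenever __morestack is
not taken.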



Re: [PATCH] BRIG frontend: request for a global review

2017-01-25 Thread Jakub Jelinek
On Wed, Jan 25, 2017 at 11:00:50AM +0100, Thomas Schwinge wrote:
> Hi!
> 
> On Tue, 24 Jan 2017 13:52:10 +0100, Martin Jambor  wrote:
> > [BRIG front end]
> 
> "contrib/gcc_update" needs to be updated for "libhsail-rt".
> 
> 
> Here is a patch to fix some Autotools issues in libhsail-rt (currently
> testing); OK for trunk?
> 
> commit 00d64708323f74191ce5a39b223bca92295fc606
> Author: Thomas Schwinge 
> Date:   Wed Jan 25 10:33:56 2017 +0100
> 
> libhsail-rt: Fix some Autotools issues
> 
> * Makefile.am (ACLOCAL_AMFLAGS): Set to "-I .. -I ../config".
> * configure.ac: Don't instantiate AC_CONFIG_MACRO_DIR.
> * config.h.in: Remove stale file.
> * Makefile.in: Regenerate.
> * aclocal.m4: Regenerate.
> * configure: Regenerate.

Note, lots of this changed in r244895, so your patch doesn't apply any
longer.  Still removing AC_CONFIG_MACRO_DIR, removing config.h.in and
regenerating whatever is affected by that (most likely just configure)
is in order.

Jakub


[PATCH][ARM][PR target/78945] Fix libatomic on armv7-m

2017-01-25 Thread Szabolcs Nagy
ARM libatomic inline asm uses sel, uadd8, uadd16 instructions
which are only available if __ARM_FEATURE_SIMD32 is defined.

libatomic/
2017-01-25  Szabolcs Nagy  

PR target/78945
* config/arm/exch_n.c (libat_exchange): Check __ARM_FEATURE_SIMD32.
diff --git a/libatomic/config/arm/exch_n.c b/libatomic/config/arm/exch_n.c
index 991f19d..685cb95 100644
--- a/libatomic/config/arm/exch_n.c
+++ b/libatomic/config/arm/exch_n.c
@@ -29,7 +29,7 @@
 /* When using STREX to implement sub-word exchange, we can do much better
than the compiler by using the APSR.GE and APSR.C flags.  */
 
-#if !DONE && HAVE_STREX && !HAVE_STREXBH && N == 2
+#if !DONE && __ARM_FEATURE_SIMD32 && HAVE_STREX && !HAVE_STREXBH && N == 2
 UTYPE
 SIZE(libat_exchange) (UTYPE *mptr, UTYPE newval, int smodel)
 {
@@ -79,7 +79,7 @@ SIZE(libat_exchange) (UTYPE *mptr, UTYPE newval, int smodel)
 #endif /* !HAVE_STREXBH && N == 2 */
 
 
-#if !DONE && HAVE_STREX && !HAVE_STREXBH && N == 1
+#if !DONE && __ARM_FEATURE_SIMD32 && HAVE_STREX && !HAVE_STREXBH && N == 1
 UTYPE
 SIZE(libat_exchange) (UTYPE *mptr, UTYPE newval, int smodel)
 {
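
As an aside, the guard used in the patch relies on the ACLE feature-test
macro __ARM_FEATURE_SIMD32.  A minimal sketch of the same pattern in
isolation (the helper name is made up for illustration; this is not
libatomic code):

```c
#include <assert.h>

/* Return 1 when the target advertises the 32-bit SIMD instructions
   (sel, uadd8, uadd16, ...) through the ACLE macro, 0 otherwise.  An
   undefined macro evaluates to 0 inside #if, which is why the patch can
   test __ARM_FEATURE_SIMD32 directly; the defined() check here is only
   for clarity.  */
static int
have_simd32 (void)
{
#if defined (__ARM_FEATURE_SIMD32) && __ARM_FEATURE_SIMD32
  return 1;
#else
  return 0;
#endif
}
```

On targets where the macro is not defined, the inline-asm variants are
skipped and the generic implementation is used instead.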


[PATCH] Fix PR72850

2017-01-25 Thread Richard Biener

This changes the testcase back to its original form; it had been
adjusted for the new threading passes, but those were later tamed down
by a cost change.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-01-25  Richard Biener  

PR testsuite/72850
* gcc.dg/tree-ssa/pr69270-3.c: Change back expected outcome
to what we had before adding the threading passes.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c   (revision 244891)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c   (working copy)
@@ -3,7 +3,7 @@
 
 /* We're looking for a constant argument a PHI node.  There
should only be one if we unpropagate correctly.  */
-/* { dg-final { scan-tree-dump-times ", 1" 4 "uncprop1"} } */
+/* { dg-final { scan-tree-dump-times ", 1" 1 "uncprop1"} } */
 
 typedef long unsigned int size_t;
 typedef union gimple_statement_d *gimple;


Re: [PATCH], PR 79212: Fix ICE when compiling fortran test with openmp

2017-01-25 Thread Jakub Jelinek
On Tue, Jan 24, 2017 at 05:27:26PM +, David Sherwood wrote:
> I have a patch to fix the following openmp issue:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79212
> 
> Writing openmp directives in a certain way in fortran programs can lead to
> the following assert:
> 
> internal compiler error: in maybe_lookup_decl_in_outer_ctx, at omp-low.c:4134
> 0xa941e6 maybe_lookup_decl_in_outer_ctx
>     /work/davshe01/oban-work-shoji/src/gcc/gcc/omp-low.c:4134
> 0xa9cadc scan_sharing_clauses
>     /work/davshe01/oban-work-shoji/src/gcc/gcc/omp-low.c:1975
> 
> Tested:
> aarch64 - No regressions in gcc/testsuite/fortran.dg, gcc/testsuite/gcc.dg,
> gcc/testsuite/g++.dg or libgomp/testsuite
> 
> Will do a full test run before submitting.
> 
> Good to go?
> David Sherwood.
> 
> ChangeLog:
> 
> 2017-01-24  David Sherwood  
> 
>     PR middle-end/79212

This line should be indented by a single tab.

>     gcc/
>     * gimplify.c (omp_notice_variable): Add GOVD_SEEN flag to variables 
> in all contexts.

These too.

>     gcc/testsuite/
>     * gfortran.dg/gomp/sharing-4.f90: New test.
> 

--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7147,12 +7147,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
   && (TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl
   != INTEGER_CST))
{
- splay_tree_node n2;
  tree t = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)));
  gcc_assert (DECL_P (t));
- n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) t);
- if (n2)
-   n2->value |= GOVD_SEEN;
+ omp_notice_variable (ctx, t, true);
}
 }

As this is conditional, if the decl won't be in ctx->variables splay tree,
bad things will happen (omp_notice_variable could complain loudly, or make
it shared etc. or whatever the default is).  So perhaps it should be
instead:
  splay_tree_node n2;
  tree t = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)));
  gcc_assert (DECL_P (t));
  n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) t);
  if (n2)
omp_notice_variable (ctx, t, true);

Ok for trunk with those changes.

Jakub


[wwwdocs] changes.html - Fortran changes

2017-01-25 Thread Martin Liška
Hello.

The following patch documents the DO loop changes made for the upcoming GCC 7.1.

Thanks for feedback,
Martin
Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.40
diff --unified -r1.40 changes.html
--- htdocs/gcc-7/changes.html	25 Jan 2017 10:10:56 -	1.40
+++ htdocs/gcc-7/changes.html	25 Jan 2017 13:09:28 -
@@ -362,6 +362,27 @@
 derived-type variables.
   
 
+  
+DO loops with step equal to 1 or -1 generate faster code as they do not
+have a loop preheader.  New warning -Wundefined-do-loop
+warns when a loop iterates either to HUGE(i) (with step equal
+to 1), or to -HUGE(i) (with step equal to -1). Apart from
+that the invalid behaviour can be caught during run-time of a program with
+-fcheck=do:
+
+program test
+  implicit none
+  integer(1) :: i
+  do i = -HUGE(i)+10, -HUGE(i)-1, -1
+print *, i
+  end do
+end program test
+
+At line 8 of file /home/marxin/Programming/gcc/gcc/testsuite/gfortran.dg/do_check_12.f90
+Fortran runtime error: Loop iterates infinitely
+
+  
+
 
 
 


[wwwdocs] changes.html - PGO and GCOV changes

2017-01-25 Thread Martin Liška
Hello.

The following patch documents the changes in PGO and GCOV made for the upcoming
GCC 7.1.

Thanks for feedback,
Martin
--- /tmp/wwwdocs/htdocs/gcc-7/changes.html	2017-01-25 11:10:56.0 +0100
+++ htdocs/gcc-7/changes.html	2017-01-25 14:48:56.257587082 +0100
@@ -630,6 +630,18 @@
 
   GCC has gained an internal unit-testing framework, allowing for
 more detailed testing of its implementation details.
+
+  Profile-guided optimization (PGO) instrumentation, as well as test coverage (GCOV),
+  can newly instrument constructors (functions marked with __attribute__((constructor))),
+  destructors and C++ constructors (and destructors) of classes that are used
+  as a type of a global variable.
+  
+  A new option -fprofile-update=atomic prevents creation of corrupted
+  profiles created during instrumentation run (-fprofile-generate)
+  of an application.  Downside of the option is a speed penalty.  Providing
+  -pthread on command line would result in selection of atomic
+  profile updating (when supported by a target).
+  
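
For readers unfamiliar with the attribute mentioned above, here is a small
sketch of the kind of function that can now be instrumented.  The names are
made up for illustration.  Building it with -fprofile-generate or --coverage
(optionally adding -fprofile-update=atomic) is how one would exercise the
new behaviour.

```c
#include <assert.h>

static int init_count;

/* Runs before main.  */
__attribute__ ((constructor)) static void
setup (void)
{
  init_count++;
}

/* Runs after main returns or exit is called.  */
__attribute__ ((destructor)) static void
teardown (void)
{
  init_count--;
}

/* Return 1 if the constructor has already run exactly once.  */
static int
constructor_has_run (void)
{
  return init_count == 1;
}
```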
 
 
 

[wwwdocs] changes.html - document -fsanitize-address-use-after-scope

2017-01-25 Thread Martin Liška
Hello.

The following patch documents the new option -fsanitize-address-use-after-scope,
which was added for the upcoming GCC 7.1.

Thanks for feedback,
Martin
--- /tmp/wwwdocs/htdocs/gcc-7/changes.html	2017-01-25 11:10:56.0 +0100
+++ htdocs/gcc-7/changes.html	2017-01-25 15:44:54.441943930 +0100
@@ -47,6 +47,55 @@
   It can be enabled by using the -fstore-merging option and is
   enabled by default at -Os and the -O2 optimization
   level or higher.
+  AddressSanitizer gained a new sanitization option, -fsanitize-address-use-after-scope,
+  which enables sanitization of variables whose address is taken and used after a scope where the
+  variable is defined:
+  
+int
+main (int argc, char **argv)
+{
+  char *ptr;
+{
+  char my_char;
+  ptr = &my_char;
+}
+
+  *ptr = 123;
+  return *ptr;
+}
+
+==28882==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fffb8dba990 at pc 0x004006d5 bp 0x7fffb8dba960 sp 0x7fffb8dba958
+WRITE of size 1 at 0x7fffb8dba990 thread T0
+#0 0x4006d4 in main /tmp/use-after-scope-1.c:10
+#1 0x7f9c71943290 in __libc_start_main (/lib64/libc.so.6+0x20290)
+#2 0x400739 in _start (/tmp/a.out+0x400739)
+
+Address 0x7fffb8dba990 is located in stack of thread T0 at offset 32 in frame
+#0 0x40067f in main /tmp/use-after-scope-1.c:3
+
+  This frame has 1 object(s):
+[32, 33) 'my_char' <== Memory access at offset 32 is inside this variable
+  
+
+  Compared to the LLVM compiler, where the option already exists,
+  the implementation in the GCC compiler has a couple of improvements and advantages:
+  
+  Complex uses of gotos and case labels are properly handled and should not
+  report any false positives or false negatives.
+  
+  Shadow memory poisoning (and unpoisoning) is optimized out in common situations
+  where the call is not needed.
+  
+  C++ temporaries are sanitized.
+  Using -O2 optimization level (and above) rewrites variables of a GIMPLE
+  type that are rewritten into SSA.  This removes shadow memory usage and
+  results in faster code.
+  Sanitization can handle invalid memory stores that are optimized out
+  by the LLVM compiler when using an optimization level.
+  
+
+  
+
 
 
 


Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Bernd Schmidt

On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:

The test is supposed to test the generation of the vsel instruction.
I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
VSEL isn't actually available on Cortex-A5, it's just enabled by the
-mfpu=fp-armv8 option.
A more realistic configuration would target an ARMv8-A CPU like the
Cortex-A57.


Ok, let me know if there's anything else you need from my side.


Bernd



Re: [PATCH, rs6000] Fix for entries in table of overloaded built-in functions

2017-01-25 Thread Bill Schmidt
On Tue, 2017-01-24 at 10:09 -0800, Carl E. Love wrote:
> On Tue, 2017-01-24 at 11:08 -0600, Segher Boessenkool wrote:
> > On Tue, Jan 24, 2017 at 08:28:37AM -0800, Carl E. Love wrote:
> > > The following patch fixes an issue with the entries in the table of
> > > built-in functions.  All of the entries for a given built-in, must occur
> > > in the table as a single block of entries.  Otherwise the code that
> > > searches the table for a given built-in definition will stop looking
> > > once it reaches the end of the initial block of definitions for that
> > > built-in function and subsequent definitions for that built-in will
> > > never be checked.  This issue currently occurs with the
> > > ALTIVEC_BUILTIN_VEC_PACKS and P8V_BUILTIN_VEC_VGBBD built-ins.  The
> > > patch simply moves the existing entries so the definition for a given
> > > built-in are all together in the same block of entries.
> > 
> > Do we need a separate testcase to check for this?  Or do those specific
> > builtins need better testcases?  Or was the bug obvious already?
> 
> I have a list of built-ins that need to have support and test cases
> added.  I found the issue with the ALTIVEC_BUILTIN_VEC_PACKS when I
> tried to add support for the built-ins:
> 
>   vector signed int vec_packs (vector signed long long x, vector signed long long y);
>   vector unsigned int vec_packs (vector unsigned long long x, vector unsigned long long y);
> 
> which were in my to do list.  What I found was the support for vec_packs
> is all there but I don't find any test cases for these built-ins.  At
> this point, I do plan to add the vec_pack test cases as part of my work
> to add the support for the other built-ins on my list.  I have the patch
> in my patch series with the others that need adding.  Currently holding
> off on posting patches since we are only supposed to be posting bug
> fixes at the moment.
> 
> Once the bug for the ALTIVEC_BUILTIN_VEC_PACKS built-in was found, I
> wrote a perl script to scan through the entire table looking for the
> issue with any other built-in functions.  The script found the issue
> with the P8V_BUILTIN_VEC_VGBBD built-in.  My list of built-ins to add
> doesn't include anything for vec_vgbbd.  
> 
> It would be easy for me to add the test cases for the vec_packs()
> built-ins to this patch if you would like.
> 
> I just took a look at the vec_vgbbd() built-in.  I grep'd for vgbbd and
> found the following two testcases in
> gcc/testsuite/gcc.target/powerpc/p8vector-builtin-4.c:
> 
> typedef vector signed char vc_sign;
> typedef vector unsigned char vc_uns;
>
> vc_sign vc_gbb_2 (vc_sign a)
> {
>   return vec_vgbbd (a);
> }
>
> vc_uns vc_gbb_3 (vc_uns a)
> {
>   return vec_vgbbd (a);
> }
> 
> which correspond to the built-in entries in rs6000-c.c which I didn't move
> 
>   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
>     RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
>   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
>     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
>
> I don't see any tests for the two built-in entries in rs6000-c.c which the 
> patch moves, i.e.
> 
>   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
>     RS6000_BTI_V16QI, 0, 0, 0 },
>   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
>     RS6000_BTI_unsigned_V16QI, 0, 0, 0 },
> 
> I tried a quick test of adding the following to the test file
> p8vector-builtin-4.c for these entries:
>
> vc_sign vc_gbb_4 (void)
> {

Re: [wwwdocs] changes.html - document -fsanitize-address-use-after-scope

2017-01-25 Thread Kyrill Tkachov

Hi Martin,

On 25/01/17 14:54, Martin Liška wrote:

Hello.

Following patch documents new option -fsanitize-address-use-after-scope which 
was done for upcoming GCC 7.1.

Thanks for feedback,
Martin


+  Using -O2 optimization level (and above) rewrites variables of a 
GIMPLE
+  type that are rewritten into SSA.  This removes shadow memory usage and
+  results in faster code.

I believe the changes page is targeted towards end users rather than GCC 
developers
and the above description wouldn't make much sense to them. Maybe better to say:
"Using -O2 optimization level and above improves shadow memory usage over LLVM" 
?

Kyrill


Re: [wwwdocs] changes.html - document -fsanitize-address-use-after-scope

2017-01-25 Thread Jakub Jelinek
On Wed, Jan 25, 2017 at 03:00:19PM +, Kyrill Tkachov wrote:
> Hi Martin,
> 
> On 25/01/17 14:54, Martin Liška wrote:
> > Hello.
> > 
> > Following patch documents new option -fsanitize-address-use-after-scope 
> > which was done for upcoming GCC 7.1.
> > 
> > Thanks for feedback,
> > Martin
> 
> +  Using -O2 optimization level (and above) rewrites variables of a 
> GIMPLE
> +  type that are rewritten into SSA.  This removes shadow memory usage and
> +  results in faster code.
> 
> I believe the changes page is targeted towards end users rather than GCC 
> developers
> and the above description wouldn't make much sense to them. Maybe better to 
> say:
> "Using -O2 optimization level and above improves shadow memory usage over 
> LLVM" ?

It isn't even correct, we only rewrite vars into SSA that aren't address
taken except for the implicit address taking by ASAN_MARK.  It is just an
implementation detail, I think we just should leave it out, it is up to users
to compare our and LLVM -fsanitize=address performance and what it can
report if they want.  What you should mention is that
-fsanitize-address-use-after-scope is on by default with
-fsanitize=address, but not with -fsanitize=kernel-address.

Jakub


[PATCH] PR libstdc++/70607 make proj(T) and conj(T) return complex

2017-01-25 Thread Jonathan Wakely

We implemented DR 1137 which changed the return types of proj and conj
on scalars, but then before the final C++11 standard DR 1522 reverted
that change, and we never implemented it. That means since GCC 4.5.0
we've been shipping non-conforming proj and conj functions.

This fixes the return types to match the standard, and adds some
missing constexpr on the real and imag overloads for scalars.

Since we're reverting the change from DR 1137 there's no reason to
name the test after that DR, so I'm moving it back to its original
name.

I'm also changing a test's { target c++14 } to c++11. It was
originally written to test std::complex in C++14 mode when the default
was gnu++98, so needed an explicit -std=gnu++14. Now that is the
default and so isn't needed, but there's no reason the test can't also
be used in C++11 mode.
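
As a rough illustration of the conforming behaviour described above (a sketch, not part of the patch; the helper name conj_of_scalar is made up for this example), the return types required after DR 1522 can be checked at compile time.  It assumes a C++11 standard library that already implements the reverted wording, such as libstdc++ with this change applied:

```cpp
#include <complex>
#include <type_traits>

// After DR 1522, conj and proj applied to a scalar return std::complex of
// the promoted type rather than the scalar type itself.
static_assert(std::is_same<decltype(std::conj(1.0)),
                           std::complex<double>>::value,
              "conj(double) should return complex<double>");
static_assert(std::is_same<decltype(std::proj(1.0f)),
                           std::complex<float>>::value,
              "proj(float) should return complex<float>");
// Integer arguments are promoted to double.
static_assert(std::is_same<decltype(std::conj(1)),
                           std::complex<double>>::value,
              "conj(int) should return complex<double>");

// Hypothetical helper: the conjugate of a real number has the same real
// part and a zero imaginary part.
inline std::complex<double> conj_of_scalar(double x)
{
  return std::conj(x);
}
```

With the pre-patch headers the static_asserts would be expected to fail, because conj and proj on scalars returned the promoted scalar type.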

PR libstdc++/61791
PR libstdc++/70607
* include/std/complex (real(T), imag(T)): Add _GLIBCXX_CONSTEXPR.
(proj(T), conj(T)): Change return types per DR 1522.
* include/tr1/complex (conj): Remove overloads and use std::conj.
* testsuite/26_numerics/complex/dr781_dr1137.cc: Rename to...
* testsuite/26_numerics/complex/dr781.cc: ... this, and update.
* testsuite/26_numerics/complex/value_operations/constexpr2.cc: Test
real(T) and imag(T). Allow testing for C++11 too.

Tested powerpc64le-linux, committed to trunk.

commit 4fb1b36a299577fdf65008a9793fbaf66204f6a2
Author: Jonathan Wakely 
Date:   Wed Jan 25 10:24:43 2017 +

PR libstdc++/70607 make proj(T) and conj(T) return complex

PR libstdc++/61791
PR libstdc++/70607
* include/std/complex (real(T), imag(T)): Add _GLIBCXX_CONSTEXPR.
(proj(T), conj(T)): Change return types per DR 1522.
* include/tr1/complex (conj): Remove overloads and use std::conj.
* testsuite/26_numerics/complex/dr781_dr1137.cc: Rename to...
* testsuite/26_numerics/complex/dr781.cc: ... this, and update.
* testsuite/26_numerics/complex/value_operations/constexpr2.cc: Test
real(T) and imag(T). Allow testing for C++11 too.

diff --git a/libstdc++-v3/include/std/complex b/libstdc++-v3/include/std/complex
index 12b6e41..6342c98 100644
--- a/libstdc++-v3/include/std/complex
+++ b/libstdc++-v3/include/std/complex
@@ -1840,7 +1840,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp>
-inline typename __gnu_cxx::__promote<_Tp>::__type
+_GLIBCXX_CONSTEXPR inline typename __gnu_cxx::__promote<_Tp>::__type
 imag(_Tp)
 { return _Tp(); }
 
@@ -1853,7 +1853,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   template<typename _Tp>
-inline typename __gnu_cxx::__promote<_Tp>::__type
+_GLIBCXX_CONSTEXPR inline typename __gnu_cxx::__promote<_Tp>::__type
 real(_Tp __x)
 { return __x; }
 
@@ -1921,16 +1921,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return __complex_proj(__z); }
 #endif
 
-  // DR 1137.
   template<typename _Tp>
-inline typename __gnu_cxx::__promote<_Tp>::__type
+inline std::complex<typename __gnu_cxx::__promote<_Tp>::__type>
 proj(_Tp __x)
-{ return __x; }
+{
+  typedef typename __gnu_cxx::__promote<_Tp>::__type __type;
+  return std::proj(std::complex<__type>(__x));
+}
 
   template<typename _Tp>
-inline typename __gnu_cxx::__promote<_Tp>::__type
+inline std::complex<typename __gnu_cxx::__promote<_Tp>::__type>
 conj(_Tp __x)
-{ return __x; }
+{
+  typedef typename __gnu_cxx::__promote<_Tp>::__type __type;
+  return std::complex<__type>(__x, -__type());
+}
 
 _GLIBCXX_END_NAMESPACE_VERSION
 
diff --git a/libstdc++-v3/include/tr1/complex b/libstdc++-v3/include/tr1/complex
index 8624e55..06f9ab0 100644
--- a/libstdc++-v3/include/tr1/complex
+++ b/libstdc++-v3/include/tr1/complex
@@ -371,17 +371,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   using std::arg;
-
-  template<typename _Tp>
-inline std::complex<_Tp>
-conj(const std::complex<_Tp>& __z)
-{ return std::conj(__z); }  
-
-  template<typename _Tp>
-inline std::complex<typename __gnu_cxx::__promote<_Tp>::__type>
-conj(_Tp __x)
-{ return __x; }
-
+  using std::conj;
   using std::imag;
   using std::norm;
   using std::polar;
diff --git a/libstdc++-v3/testsuite/26_numerics/complex/dr781.cc 
b/libstdc++-v3/testsuite/26_numerics/complex/dr781.cc
new file mode 100644
index 000..3fb6cd1
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/complex/dr781.cc
@@ -0,0 +1,81 @@
+// { dg-do run { target c++11 } }
+// 2008-05-22  Paolo Carlini  
+//
+// Copyright (C) 2008-2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public Licen

[PATCH] Fix "classe" typo in C++ Dialect Options docs

2017-01-25 Thread Jonathan Wakely

* doc/invoke.texi (C++ Dialect Options): Fix typo.

Committed as obvious.

commit 8d4ebdf7bfffefb077a28174aed5cb13e89cb90e
Author: Jonathan Wakely 
Date:   Wed Jan 25 14:30:12 2017 +

Fix "classe" typo in C++ Dialect Options docs

* doc/invoke.texi (C++ Dialect Options): Fix typo.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6a42193..d388d01 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3084,7 +3084,7 @@ classes that indirectly use multiple inheritance.
 
 @item -Wvirtual-inheritance
 @opindex Wvirtual-inheritance
-Warn when a class is defined with a virtual direct base classe.  Some
+Warn when a class is defined with a virtual direct base class.  Some
 coding rules disallow multiple inheritance, and this may be used to
 enforce that rule.  The warning is inactive inside a system header file,
 such as the STL, so one can still use the STL.  One may also define


Re: [PATCH] Fix "classe" typo in C++ Dialect Options docs

2017-01-25 Thread Jonathan Wakely

On 25/01/17 15:07 +, Jonathan Wakely wrote:

* doc/invoke.texi (C++ Dialect Options): Fix typo.

Committed as obvious.


Oops, this was only meant for gcc-patches, not the libstdc++ list,
sorry.



Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Christophe Lyon
On 25 January 2017 at 15:55, Bernd Schmidt  wrote:
> On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:
>>
>> The test is supposed to test the generation of the vsel instruction.
>> I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
>> VSEL isn't actually available on Cortex-A5, it's just enabled by the
>> -mfpu=fp-armv8 option.
>> A more realistic configuration would target an ARMv8-A CPU like the
>> Cortex-A57.
>
>
> Ok, let me know if there's anything else you need from my side.
>
Kyrill,

How about the attached patch?

I've added dg-require-effective-target arm_arch_v8a_ok to make sure
it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57
to make sure the intended instructions are present (in case at some
point add-options-for-arm-arch-v8a activates costs/arch variant that
would imply not generating vsel anymore).

I've noticed there are other tests adding arm_v8_vfp and not making
sure to select an appropriate cpu. As a follow-up patch?

And I checked that my patch makes the tests pass again even
when configuring --with-cpu=cortex-a5.

Thanks,

Christophe

>
> Bernd
>
gcc/testsuite/ChangeLog:

2017-01-25  Christophe Lyon  

* gcc.target/arm/vseleqdf.c: Require arm_arch_v8a_ok, add
-mcpu=cortex-a57.
* gcc.target/arm/vseleqsf.c: Likewise.
* gcc.target/arm/vselgedf.c: Likewise.
* gcc.target/arm/vselgesf.c: Likewise.
* gcc.target/arm/vselgtdf.c: Likewise.
* gcc.target/arm/vselgtsf.c: Likewise.
* gcc.target/arm/vselledf.c: Likewise.
* gcc.target/arm/vsellesf.c: Likewise.
* gcc.target/arm/vselltdf.c: Likewise.
* gcc.target/arm/vselltsf.c: Likewise.
* gcc.target/arm/vselnedf.c: Likewise.
* gcc.target/arm/vselnesf.c: Likewise.
* gcc.target/arm/vselvcdf.c: Likewise.
* gcc.target/arm/vselvcsf.c: Likewise.
* gcc.target/arm/vselvsdf.c: Likewise.
* gcc.target/arm/vselvssf.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/vseleqdf.c 
b/gcc/testsuite/gcc.target/arm/vseleqdf.c
index 86e147b..64d5784 100644
--- a/gcc/testsuite/gcc.target/arm/vseleqdf.c
+++ b/gcc/testsuite/gcc.target/arm/vseleqdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok } */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vseleqsf.c 
b/gcc/testsuite/gcc.target/arm/vseleqsf.c
index 120f44b..b052704 100644
--- a/gcc/testsuite/gcc.target/arm/vseleqsf.c
+++ b/gcc/testsuite/gcc.target/arm/vseleqsf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok } */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselgedf.c 
b/gcc/testsuite/gcc.target/arm/vselgedf.c
index cea08d1..e10508f 100644
--- a/gcc/testsuite/gcc.target/arm/vselgedf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgedf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok } */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselgesf.c 
b/gcc/testsuite/gcc.target/arm/vselgesf.c
index 86f2a04..645cf5d 100644
--- a/gcc/testsuite/gcc.target/arm/vselgesf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgesf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok } */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselgtdf.c 
b/gcc/testsuite/gcc.target/arm/vselgtdf.c
index 2c4a6ba..741b9a8 100644
--- a/gcc/testsuite/gcc.target/arm/vselgtdf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgtdf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok } */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 double
diff --git a/gcc/testsuite/gcc.target/arm/vselgtsf.c 
b/gcc/testsuite/gcc.target/arm/vselgtsf.c
index 388e74c..3042c5b 100644
--- a/gcc/testsuite/gcc.target/arm/vselgtsf.c
+++ b/gcc/testsuite/gcc.target/arm/vselgtsf.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v8a_ok } */
 /* { dg-require-effective-target arm_v8_vfp_ok } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
 /* { dg-add-options arm_v8_vfp } */
 
 float
diff --git a/gcc/testsuite/gcc.target/arm/vselledf.c 
b/gcc/testsuite/gcc.target/arm/vselledf.c
index 088dc04

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Kyrill Tkachov


On 25/01/17 15:20, Christophe Lyon wrote:

On 25 January 2017 at 15:55, Bernd Schmidt  wrote:

On 01/25/2017 10:18 AM, Kyrill Tkachov wrote:

The test is supposed to test the generation of the vsel instruction.
I believe adding an -mcpu=cortex-a57 to the testcases would be best, as
VSEL isn't actually available on Cortex-A5, it's just enabled by the
-mfpu=fp-armv8 option.
A more realistic configuration would target an ARMv8-A CPU like the
Cortex-A57.


Ok, let me know if there's anything else you need from my side.


Kyrill,

How about the attached patch?


Yes, thanks Christophe.


I've added dg-require-effective-target arm_arch_v8a_ok to make sure
it's legitimate to request an armv8-class core, but force -mcpu=cortex-a57
to make sure the intended instructions are present (in case at some
point add-options-for-arm-arch-v8a activates costs/arch variant that
would imply not generating vsel anymore).

I've noticed there are other tests adding arm_v8_vfp and not making
sure to select an appropriate cpu. As a follow-up patch?


I wouldn't want to do that too much in the testsuite.
In the VSEL tests we have a C-level idiom (?: construct) that we expect
the optimisers to transform into a conditional select instruction that may or
may not be a win on some cores.

In some of those other tests I suspect we want to generate the instruction for
all tunings.  I.e. I'd expect the rounding tests (__builtin_floor/trunc etc) to
always generate VRINT* when the -mfpu allows it, regardless of the CPU tuning.

Kyrill


And I checked that my patch makes the tests pass again even
when configuring --with-cpu=cortex-a5.




Thanks,

Christophe


Bernd





Re: [wwwdocs] changes.html - PGO and GCOV changes

2017-01-25 Thread Martin Liška
Hello.

Following patch adds what was said in the changes file to our documentation.

Thanks,
Martin
From 0da7f4d9a2a895e63271e9dc870814c6c7e3f419 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 25 Jan 2017 16:41:23 +0100
Subject: [PATCH] Enhance doc for -fprofile-arcs

gcc/ChangeLog:

2017-01-25  Martin Liska  

	* doc/invoke.texi (-fprofile-arcs): Document profiling support
	for {cd}tors and C++ {cd}tors.
---
 gcc/doc/invoke.texi | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6a42193d106..223a0aed7af 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10584,7 +10584,12 @@ linking.
 @opindex fprofile-arcs
 Add code so that program flow @dfn{arcs} are instrumented.  During
 execution the program records how many times each branch and call is
-executed and how many times it is taken or returns.  When the compiled
+executed and how many times it is taken or returns.  On targets that support
+constructors with priority support, the profiling properly handles constructors,
+destructors and C++ constructors (and destructors) of classes which are used
+as a type of a global variable.
+
+When the compiled
 program exits it saves this data to a file called
 @file{@var{auxname}.gcda} for each source file.  The data may be used for
 profile-directed optimizations (@option{-fbranch-probabilities}), or for
-- 
2.11.0
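
For readers unfamiliar with the terminology in the documentation change above, here is a minimal sketch of the two kinds of startup code it mentions: a function marked with the GCC constructor attribute (with an explicit priority) and a C++ constructor of a class used as the type of a global variable.  All names below are hypothetical and chosen only for illustration, and the sketch assumes a target where GCC supports constructor priorities:

```cpp
// Number of startup hooks that have run; statically zero-initialized,
// so it is valid before any of the hooks below execute.
int startup_events = 0;

// GCC extension: this function runs before main().  Priorities 0-100 are
// reserved for the implementation, so 101 is the earliest user priority.
__attribute__((constructor(101))) void early_init()
{
  ++startup_events;
}

// A class whose constructor and destructor run at program startup and
// shutdown because an object of this type is a global variable.
struct Tracker
{
  Tracker() { ++startup_events; }
  ~Tracker() {}
};

Tracker global_tracker;

// Reports how many startup hooks ran before the caller.
int startup_event_count()
{
  return startup_events;
}
```

When called from main(), startup_event_count() would be expected to report both hooks.  Whether the profile counters for such code are recorded is what the new documentation text is about.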



Re: [PATCH PR78559][RFC]Proposed fix

2017-01-25 Thread Segher Boessenkool
Hi!

I was worried this patch would prevent too many other optimisations,
so I looked into better options.  I didn't find any.  I tested the
effects of the patch on 31 architectures (building GCC and then Linux
with it; 6 errored out building the kernel).  There were exactly zero
differences in generated code.

The patch is fine for mainline.  Thanks Bin!


Segher


On Thu, Dec 01, 2016 at 09:47:51AM +, Bin Cheng wrote:
> 2016-12-01  Bin Cheng  
> 
>   PR rtl-optimization/78559
>   * combine.c (try_combine): Discard REG_EQUAL and REG_EQUIV for
>   other_insn in combine.

> diff --git a/gcc/combine.c b/gcc/combine.c
> index 22fb7a9..93b0901 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -4138,7 +4138,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>PATTERN (undobuf.other_insn)))
> ||(REG_NOTE_KIND (note) == REG_UNUSED
>&& !reg_set_p (XEXP (note, 0),
> - PATTERN (undobuf.other_insn))))
> + PATTERN (undobuf.other_insn)))
> +   || REG_NOTE_KIND (note) == REG_EQUAL
> +   || REG_NOTE_KIND (note) == REG_EQUIV)
>   remove_note (undobuf.other_insn, note);
>   }


Re: [PATCH] BRIG frontend: request for a global review

2017-01-25 Thread Thomas Schwinge
Hi!

On Wed, 25 Jan 2017 13:21:13 +0100, Jakub Jelinek  wrote:
> On Wed, Jan 25, 2017 at 11:00:50AM +0100, Thomas Schwinge wrote:
> > On Tue, 24 Jan 2017 13:52:10 +0100, Martin Jambor  wrote:
> > > [BRIG front end]

$ git grep --cached libbrig
gcc/brig/config-lang.in:target_libs="target-libbrig target-libhsail-rt"

What is "libbrig"; should we remove that (as far as I can tell?) stale
reference?


$ git show 55a56509bb4ae0c844c27f0679a22844bed3a3c5 -- libhsail-rt/README | filterdiff
--- /dev/null
+++ libhsail-rt/README
@@ -0,0 +1,4 @@
+Run autoconf2.64 && automake-1.11  to regenerate the buildfiles.
+You might need to manually tweak the minor automake version number
+in configure.ac and aclocal.m4 (search for 1.11.6) in case your
+local 1.11 minor version doesn't match. 
\ No newline at end of file

I don't understand that "manually tweak" comment -- you should just
install/build the right versions, and run "PATH=[...]:$PATH autoreconf",
which is the same for all GCC subdirectories.

Instead, the README file should contain a note what the "libhsail-rt"
directory is about.


$ git show 55a56509bb4ae0c844c27f0679a22844bed3a3c5 -- gcc/builtin-types.def | filterdiff --hunks=1
diff --git gcc/builtin-types.def gcc/builtin-types.def
index 91745b4..ee6d052 100644
--- gcc/builtin-types.def
+++ gcc/builtin-types.def
@@ -67,7 +67,10 @@ DEF_PRIMITIVE_TYPE (BT_LONGLONG, 
long_long_integer_type_node)
 DEF_PRIMITIVE_TYPE (BT_ULONGLONG, long_long_unsigned_type_node)
 DEF_PRIMITIVE_TYPE (BT_INTMAX, intmax_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINTMAX, uintmax_type_node)
-DEF_PRIMITIVE_TYPE (BT_UINT16, uint16_type_node)
+DEF_PRIMITIVE_TYPE (BT_INT8, signed_char_type_node)
+DEF_PRIMITIVE_TYPE (BT_INT16, short_integer_type_node)
+DEF_PRIMITIVE_TYPE (BT_UINT8, char_type_node)
+DEF_PRIMITIVE_TYPE (BT_UINT16, short_unsigned_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINT32, uint32_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINT64, uint64_type_node)
 DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 
1))

Is that change alright?  For instance, uint16_type_node is still used
elsewhere.  Some of these intN/uintN type_nodes apparently don't exist as
global_trees; should they, and then be referred to here instead of the
C-like type_nodes?


The "News" section on , and
 should also be updated, I guess?
:-)


By the way, see  "Questionable
-Wmisleading-indentation diagnostic in HSAIL-Tools" for a build problem
with HSAILasm that I ran into.  With that resolved (trivial), I'm
reporting from "gcc/testsuite/brig/brig.sum": "# of expected passes 95".

Just one concern there is output like:

[...]
PASS: brig.dg/test/gimple/mem.hsail (test for excess errors)
PASS: mem.hsail.brig scan-tree-dump original "__args;[\n ]+d0 ="
PASS: mem.hsail.brig scan-tree-dump original "\\(__args \\+ 8\\);[\n ]+d2 ="
[...]

..., that is, the "scan-tree-dump"s don't print the full filename of the
test case.  But that problem supposedly isn't specific to the BRIG test
cases.  (I may look into that later.)


> > "contrib/gcc_update" needs to be updated for "libhsail-rt".

Done.

I suppose that contrib/update-copyright.py also needs to be updated?  (I
never looked into that, so don't know.)

> > Here is a patch to fix some Autotools issues in libhsail-rt (currently
> > testing); OK for trunk?

> Note, lots of this changed in r244895, so your patch doesn't apply any
> longer.  Still removing AC_CONFIG_MACRO_DIR, removing config.h.in and
> regenerating whatever is affected by that (most likely just configure)
> is in order.

Committed to trunk in r244902:

commit c8cd62c4e211f2e2bfabaf25a64842004e611797
Author: tschwinge 
Date:   Wed Jan 25 15:38:01 2017 +

libhsail-rt: Fix some Autotools issues

contrib/
* gcc_update (files_and_dependencies): Care for "libhsail-rt".

libhsail-rt/
* configure.ac: Don't instantiate AC_CONFIG_MACRO_DIR.
* configure: Regenerate.

libhsail-rt/
* config.h.in: Remove stale file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244902 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 contrib/ChangeLog|   4 +
 contrib/gcc_update   |   4 +
 libhsail-rt/ChangeLog|  13 +++
 libhsail-rt/config.h.in  | 217 ---
 libhsail-rt/configure|   6 +-
 libhsail-rt/configure.ac |   2 -
 6 files changed, 23 insertions(+), 223 deletions(-)

diff --git contrib/ChangeLog contrib/ChangeLog
index d429beb..2f862fa 100644
--- contrib/ChangeLog
+++ contrib/ChangeLog
@@ -1,3 +1,7 @@
+2017-01-25  Thomas Schwinge  
+
+   * gcc_update (files_and_dependencies): Care for "libhsail-rt".
+
 2017-01-23  Gerald Pfeifer  
 
* patch_tester.sh (TESTLOGS): Remove
diff --git con

Re: [PATCH PR78559][RFC]Proposed fix

2017-01-25 Thread Bin.Cheng
On Wed, Jan 25, 2017 at 3:56 PM, Segher Boessenkool
 wrote:
> Hi!
>
> I was worried this patch would prevent too many other optimisations,
> so I looked into better options.  I didn't find any.  I tested the
> effects of the patch on 31 architectures (building GCC and then Linux
Thanks very much for this, that's a lot of testing work.  I will
revise the patch to explain the reason for the change, as well as its
impact.

Thanks,
bin
> with it; 6 errored out building the kernel).  There were exactly zero
> differences in generated code.
>
> The patch is fine for mainline.  Thanks Bin!
>
>
> Segher
>
>
> On Thu, Dec 01, 2016 at 09:47:51AM +, Bin Cheng wrote:
>> 2016-12-01  Bin Cheng  
>>
>>   PR rtl-optimization/78559
>>   * combine.c (try_combine): Discard REG_EQUAL and REG_EQUIV for
>>   other_insn in combine.
>
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 22fb7a9..93b0901 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -4138,7 +4138,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
>> rtx_insn *i0,
>>PATTERN (undobuf.other_insn)))
>> ||(REG_NOTE_KIND (note) == REG_UNUSED
>>&& !reg_set_p (XEXP (note, 0),
>> - PATTERN (undobuf.other_insn))))
>> + PATTERN (undobuf.other_insn)))
>> +   || REG_NOTE_KIND (note) == REG_EQUAL
>> +   || REG_NOTE_KIND (note) == REG_EQUIV)
>>   remove_note (undobuf.other_insn, note);
>>   }


Re: [PATCH PR78559][RFC]Proposed fix

2017-01-25 Thread Segher Boessenkool
On Wed, Jan 25, 2017 at 04:08:54PM +, Bin.Cheng wrote:
> > I was worried this patch would prevent too many other optimisations,
> > so I looked into better options.  I didn't find any.  I tested the
> > effects of the patch on 31 architectures (building GCC and then Linux
> Thanks very much for this, that's a lot of testing work.

The power of scripting ;-)  Of course it doesn't help if you do not
have enough free disk space, and other yak shaving, sigh.

Analysing the differences is mostly manual of course.  Luckily there
weren't any.  I also tested a few other changes at the same time; if
I would have known this one was easy in the end...  Oh well.

> I will
> revise the patch to explain the reason for the change, as well as its
> impact.

Yes please.  Make sure to mention the PR, it helps finding the mail
threads and commits a lot.


Segher


[PATCH PR71437]Prefer symbolic range bound if the var doesn't have useful range.

2017-01-25 Thread Bin Cheng
Hi,
As analyzed in PR71437, this is a missed PRE issue caused by missed jump threading,
which in turn is caused by inaccurate VRP information.  In function
extract_range_for_var_from_comparison_expr, we compute the range for variable "a"
under the condition that a comparison like "a <= limit" is true.  It extracts limit's
range information and sets the range [MIN, limit_vr->max] for var.  This is inaccurate
when limit_vr->max is MAX: in that case the final range computed is [MIN, MAX], which
is VARYING.  In fact, the symbolic range [MIN, limit] is better here.  This patch fixes
PR71437 by making that change.  It also handles the ">=" case.
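
The choice described above can be sketched in isolation.  The following is a simplified stand-in, not the tree-vrp.c code: IntRange, UpperBound and upper_bound_for_le are hypothetical names, ranges are plain ints, and anti-ranges are left out.  It only shows when the upper bound for "a <= limit" would stay symbolic:

```cpp
#include <climits>
#include <string>

// Hypothetical, simplified value range of the limit: [min, max].
struct IntRange
{
  int min;
  int max;
};

// Upper bound chosen for a variable known to satisfy "a <= limit".
struct UpperBound
{
  bool symbolic;     // true: the bound is the limit itself, by name
  std::string name;  // name of the limit when symbolic
  int constant;      // constant bound when not symbolic
};

UpperBound upper_bound_for_le(const std::string& limit_name,
                              const IntRange* limit_range)
{
  // No range information for the limit: keep the bound symbolic.
  if (limit_range == nullptr)
    return UpperBound{true, limit_name, 0};

  // The limit may reach the type maximum and is not a single value, so
  // [MIN, limit_range->max] would be the whole type (VARYING).  The
  // symbolic bound [MIN, limit] carries more information.
  if (limit_range->max == INT_MAX && limit_range->min != limit_range->max)
    return UpperBound{true, limit_name, 0};

  // Otherwise the constant maximum of the limit is a useful bound.
  return UpperBound{false, std::string(), limit_range->max};
}
```

The ">=" case would mirror this with the type minimum and the lower bound.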

Bootstrap and test on x86_64 and AArch64 finished.  All tests are OK except test
gcc.dg/tree-ssa/pr31521.c.  I further investigated it and believe it's another missed
optimization in VRP.  Basically, operand_less_p is weak in handling symbolic value
ranges.  Given the value ranges below:
x: [1, INF+]
a: [-INF, x - 1]
b: [0, INF+]
It doesn't know that "x - 1 < INF+" must be true, thus (intersect a b) is [0, x - 1].
I believe there may be other places in which symbolic value ranges are not handled
properly.  So any comment?

Thanks,
bin

2017-01-24  Bin Cheng  

PR tree-optimization/71437
* tree-vrp.c (extract_range_for_var_from_comparison_expr): Prefer
symbolic range form if limit has no useful range information.

gcc/testsuite/ChangeLog
2017-01-24  Bin Cheng  

PR tree-optimization/71437
* gcc.dg/tree-ssa/pr71437.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71437.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71437.c
new file mode 100644
index 000..66a5405
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71437.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-ffast-math -O3 -fdump-tree-vrp1-details" } */
+
+int I = 50, J = 50;
+int S, L;
+const int *pL;
+const int *pS;
+
+void bar (float, float);
+
+void foo (int K)
+{
+  int k, i, j;
+  static float LD, SD;
+  for (k = 0 ; k < K; k++)
+{
+for( i = 0 ; i < ( I - 1 ) ; i++ )
+{
+if( ( L < pL[i+1] ) && ( L >= pL[i] ) )
+  break ;
+}
+
+if( i == ( I - 1 ) )
+  L = pL[i] ;
+LD = (float)( L - pL[i] ) /
+(float)( pL[i + 1] - pL[i] ) ;
+
+for( j = 0 ; j < ( J-1 ) ; j++ )
+{
+if( ( S < pS[j+1] ) && ( S >= pS[j] ) )
+  break ;
+}
+
+if( j == ( J - 1 ) )
+  S = pS[j] ;
+SD = (float)( S - pS[j] ) /
+ (float)( pS[j + 1] - pS[j] ) ;
+
+   bar (LD, SD);
+}
+}
+/* { dg-final { scan-tree-dump-times "Threaded jump " 2 "vrp1" } } */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index ac37d3f..fc3f9a9 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1663,7 +1663,10 @@ extract_range_for_var_from_comparison_expr (tree var, 
enum tree_code cond_code,
 {
   min = TYPE_MIN_VALUE (type);
 
-  if (limit_vr == NULL || limit_vr->type == VR_ANTI_RANGE)
+  if (limit_vr == NULL || limit_vr->type == VR_ANTI_RANGE
+ || (limit_vr->type == VR_RANGE
+ && vrp_val_is_max (limit_vr->max)
+ && compare_values (limit_vr->min, limit_vr->max) != 0))
max = limit;
   else
{
@@ -1703,7 +1706,10 @@ extract_range_for_var_from_comparison_expr (tree var, 
enum tree_code cond_code,
 {
   max = TYPE_MAX_VALUE (type);
 
-  if (limit_vr == NULL || limit_vr->type == VR_ANTI_RANGE)
+  if (limit_vr == NULL || limit_vr->type == VR_ANTI_RANGE
+ || (limit_vr->type == VR_RANGE
+ && vrp_val_is_min (limit_vr->min)
+ && compare_values (limit_vr->min, limit_vr->max) != 0))
min = limit;
   else
{


[PATCH AARCH64]XFAIL gcc.target/aarch64/ldp_vec_64_1.c

2017-01-25 Thread Bin Cheng
Hi,
Test gcc.target/aarch64/ldp_vec_64_1.c fails because we don't choose the
[base+offset] addressing mode in IVOPT on AArch64.  Given that the auto-increment
addressing mode is disabled in IVOPT on AArch64, we can't really test the
addressing mode.  I may try to enable it only for small loops in GCC 8, so this
patch xfails the case for the moment.  I also filed PR79213 for tracking.

Test result checked.  Is it OK?

Thanks,
bin

gcc/testsuite/ChangeLog
2017-01-23  Bin Cheng  

* gcc.target/aarch64/ldp_vec_64_1.c: Xfail.

diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c b/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
index 62213f3..59cf914 100644
--- a/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
@@ -13,4 +13,6 @@ foo (int32x2_t *foo, int32x2_t *bar)
 foo[i] = bar[i] + bar[i + 1];
 }
 
-/* { dg-final { scan-assembler "ldp\td\[0-9\]+, d\[0-9\]" } } */
+/* Xfail for now since IVOPT doesn't choose [base+offset] addressing mode.
+   See PR79213.  */
+/* { dg-final { scan-assembler "ldp\td\[0-9\]+, d\[0-9\]" { xfail *-*-* } } } */


[PATCH] restore pedantic warning on flexible array members (c++/71290)

2017-01-25 Thread Martin Sebor

The improvements to the handling of flexible array members in
C++ in GCC 6 inadvertently removed the pedantic warnings GCC
used to issue for their declarations.  The attached patch
restores them.

Martin
PR c++/71290 - [6/7 Regression] Flexible array member is not diagnosed with -pedantic

gcc/cp/ChangeLog:

	PR c++/71290
	* decl.c (grokdeclarator): Warn on flexible array members.

gcc/testsuite/ChangeLog:

	PR c++/71290
	* g++.dg/ext/flexarray-mangle-2.C: Adjust.
	* g++.dg/ext/flexarray-mangle.C: Adjust.
	* g++.dg/ext/flexarray-subst.C: Adjust.
	* g++.dg/ext/flexary10.C: Adjust.
	* g++.dg/ext/flexary11.C: Adjust.
	* g++.dg/ext/flexary14.C: Adjust.
	* g++.dg/ext/flexary16.C: Adjust.
	* g++.dg/ext/flexary18.C: Adjust.
	* g++.dg/ext/flexary19.C: Adjust.
	* g++.dg/ext/flexary7.C: Adjust.
	* g++.dg/ext/pr71290.C: New test.

Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(revision 244844)
+++ gcc/cp/decl.c	(working copy)
@@ -11798,6 +11798,17 @@ grokdeclarator (const cp_declarator *declarator,
 	  }
 	else 
 	  {
+		/* Array is a flexible member.  */
+		if (in_system_header_at (input_location))
+		  /* Do not warn about flexible array members in system headers
+		     because glibc uses them.  */;
+		else if (name)
+		  pedwarn (input_location, OPT_Wpedantic,
+			   "ISO C++ forbids flexible array member %<%s%>", name);
+		else
+		  pedwarn (input_location, OPT_Wpedantic,
+			   "ISO C++ forbids flexible array members");
+
 		/* Flexible array member has a null domain.  */
 		type = build_cplus_array_type (TREE_TYPE (type), NULL_TREE);
 	  }
Index: gcc/testsuite/g++.dg/ext/flexarray-mangle-2.C
===
--- gcc/testsuite/g++.dg/ext/flexarray-mangle-2.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexarray-mangle-2.C	(working copy)
@@ -1,9 +1,10 @@
 // PR c++/69277 - [6 Regression] ICE mangling a flexible array member
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wno-error=pedantic" }
 
 struct A {
   int n;
-  char a [];
+  char a[];   // { dg-warning "forbids flexible array member" }
 };
 
 // Declare but do not define function templates.
Index: gcc/testsuite/g++.dg/ext/flexarray-mangle.C
===
--- gcc/testsuite/g++.dg/ext/flexarray-mangle.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexarray-mangle.C	(working copy)
@@ -1,9 +1,10 @@
 // PR c++/69277 - [6 Regression] ICE mangling a flexible array member
 // { dg-do compile }
+// { dg-additional-options "-Wno-error=pedantic" }
 
 struct A {
   int n;
-  char a [];
+  char a[];   // { dg-warning "forbids flexible array member" }
 };
 
 // Declare but do not define function templates.
Index: gcc/testsuite/g++.dg/ext/flexarray-subst.C
===
--- gcc/testsuite/g++.dg/ext/flexarray-subst.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexarray-subst.C	(working copy)
@@ -1,8 +1,12 @@
 // PR c++/69251 - [6 Regression] ICE (segmentation fault) in unify_array_domain
 // on i686-linux-gnu
 // { dg-do compile }
+// { dg-additional-options "-Wno-error=pedantic" }
 
-struct A { int n; char a[]; };
+struct A {
+  int n;
+  char a[];   // { dg-warning "forbids flexible array member" }
+};
 
 template 
 struct B;
Index: gcc/testsuite/g++.dg/ext/flexary10.C
===
--- gcc/testsuite/g++.dg/ext/flexary10.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexary10.C	(working copy)
@@ -4,7 +4,7 @@
 
 struct A {
   int n;
-  int a [];
+  int a[];  // { dg-warning "forbids flexible array member" }
 };
 
 struct A foo (void)
Index: gcc/testsuite/g++.dg/ext/flexary11.C
===
--- gcc/testsuite/g++.dg/ext/flexary11.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexary11.C	(working copy)
@@ -4,7 +4,7 @@
 
 struct A {
   int n;
-  char a [];
+  char a[];   // { dg-error "forbids flexible array member" }
 };
 
 void f ()
Index: gcc/testsuite/g++.dg/ext/flexary14.C
===
--- gcc/testsuite/g++.dg/ext/flexary14.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexary14.C	(working copy)
@@ -9,7 +9,9 @@ struct A { typedef int X; };
 
 template  int foo (T&, typename A::X = 0) { return 0; }
 
-struct B { int n, a[]; };
+struct B {
+  int n, a[]; // { dg-error "forbids flexible array member" }
+};
 
 void bar (B *b)
 {
Index: gcc/testsuite/g++.dg/ext/flexary16.C
===
--- gcc/testsuite/g++.dg/ext/flexary16.C	(revision 244844)
+++ gcc/testsuite/g++.dg/ext/flexary16.C	(working copy)
@@ -1,6 +1,7 @@
 // PR c++/71147 - [6 Regression] Flexible array member wrongly rejected
 //   in template
 // { dg-do compile }
+// { dg-options "-Wpedantic -Wno-error=pedantic" }
 
 template 
 struct co

Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]

2017-01-25 Thread Jeff Law

On 01/25/2017 03:34 AM, Richard Biener wrote:

On Tue, Jan 24, 2017 at 4:05 PM, Jeff Law  wrote:

On 01/24/2017 07:29 AM, Marc Glisse wrote:


On Tue, 24 Jan 2017, Richard Biener wrote:


That was my thought as well, but AFAICT we only call into match.pd
from VRP if we changed the insn.



Yes - there were thoughts to change that (but it comes at an expense).
Basically we'd like to re-fold stmts that indirectly use stmts we
changed.  We certainly don't want to re-fold everything all the time.



VRP is kind of a special case, every variable for which it finds a
new/improved range could be considered changed, since it may trigger
some extra transformation in match.pd (same for CCP and the nonzero
mask).


But that would assume that match.pd is relying on range information and
could thus use the improved range information.  *If* match.pd is using the
range information generated by VRP, it's not terribly pervasive.

But waiting until forwprop3 means we're leaving a ton of useless blocks and
statements in the IL for this testcase, and likely other code using
std::vec.

Perhaps rather than open-coding a fix in VRP I could have VRP call into
match.pd slightly more aggressively (say just for gimple_cond).  That may be
enough to capture the effects much earlier in the pipeline without trying to
fold *everything*.


Sure, the only disadvantage of doing it in VRP (in vrp_fold_stmt) is that you
may end up doing it twice.

Once per VRP pass doesn't seem excessive.

If we simplify in VRP with a valueizer that walks up the ASSERT_EXPRs, 
then VRP1 will simplify the two key conditionals.  The first DOM pass is 
then able to clean up the whole mess.  But that valueizer runs afoul of 
maybe_set_nonzero_bits's assumptions for an unrelated testcase (pr60482).


maybe_set_nonzero_bits has restrictions on the number of uses of an 
SSA_NAME.  folding with a valueizer that walks the ASSERT_EXPR chain has 
a side effect of copy propagating through ASSERT_EXPRs.  So for the 
pr60482 testcase we end up with 3 uses of "n_12" rather than the 
expected 2.  That in turn causes us to avoid aggressively clearing bits 
in the non-zero bitmask of n_12.  That in turn causes us to fail to 
eliminate a conditional, which in turn causes us to need a loop epilogue 
for vectorization.  Ugh.


If we fold in VRP1 without walking up the ASSERT_EXPRs, we transform 
just the first conditional in VRP1.  A goodly amount of simplification 
is still done in the first DOM pass, but not all of it.


forwprop3 then transforms the second conditional which PRE is then able 
to optimize away.  That's early enough to allow sinking of the arithmetic.


The first DOM pass still cleaned up most of the crud early so we're 
avoiding useless work.  The final result is the same as with the 
ASSERT_EXPR walking valueizer.  That seems like a reasonable compromise.


Spinning that version...

jeff
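
As a rough illustration of the "A + B CMP A -> A CMP' CST'" simplification
named in the subject, the sketch below shows one instance for unsigned
operands with a constant B.  The constant 16 and the function names are
assumptions made for the example; the actual match.pd patterns are more
general.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define CST 16u	/* Assumed constant, chosen only for illustration.  */

/* Original form: true exactly when the unsigned addition wraps.  */
static bool
wraps_original (unsigned a)
{
  return a + CST < a;
}

/* Simplified form: one comparison against an adjusted constant.  */
static bool
wraps_simplified (unsigned a)
{
  return a > UINT_MAX - CST;
}
```

Both functions agree for every input, since a + CST wraps exactly when a
exceeds UINT_MAX - CST; the second form needs no addition at all.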






Re: [PATCH] restore pedantic warning on flexible array members (c++/71290)

2017-01-25 Thread Jason Merrill
OK.

On Wed, Jan 25, 2017 at 12:02 PM, Martin Sebor  wrote:
> The improvements to the handling of flexible array members in
> C++ in GCC 6 inadvertently removed the pedantic warnings GCC
> used to issue for their declarations.  The attached patch
> restores it.
>
> Martin


Re: [PATCH AARCH64]XFAIL gcc.target/aarch64/ldp_vec_64_1.c

2017-01-25 Thread Richard Earnshaw (lists)
On 25/01/17 16:49, Bin Cheng wrote:
> Hi,
> Test gcc.target/aarch64/ldp_vec_64_1.c because we don't choose [base+offset] 
> addressing mode in IVOPT
> on AArch64.  Given auto-increment addressing mode is disabled in IVOPT on 
> AArch64, we can't really test
>  the addressing mode.  I may try to enable it only for small loops in GCC8, 
> so this patch xfail the case at the
>  moment.  Also I filed PR79213 for tracking.
> 
> Test result checked.  Is it OK?
> 
> Thanks,
> bin
> 
> gcc/testsuite/ChangeLog
> 2017-01-23  Bin Cheng  
> 
>   * gcc.target/aarch64/ldp_vec_64_1.c: Xfail.
> 
> 
> xfail-1.txt
> 
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c 
> b/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
> index 62213f3..59cf914 100644
> --- a/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
> @@ -13,4 +13,6 @@ foo (int32x2_t *foo, int32x2_t *bar)
>  foo[i] = bar[i] + bar[i + 1];
>  }
>  
> -/* { dg-final { scan-assembler "ldp\td\[0-9\]+, d\[0-9\]" } } */
> +/* Xfail for now since IVOPT doesn't choose [base+offset] addressing mode.
> +   See PR79213.  */
> +/* { dg-final { scan-assembler "ldp\td\[0-9\]+, d\[0-9\]" { xfail *-*-* } } 
> } */
> 

OK.


Enable jump threading on paths meeting hot paths

2017-01-25 Thread Jan Hubicka
Hi,
this patch modifies the profitable_jump_thread_path heuristics by enabling
code expansion when the threaded path contains at least one hot path.
The basic idea is that while we do not decrease the instruction count on the
non-duplicated path, we reduce the number of entry edges, and this path
separation may enable later optimizations.

We could try to be more careful about when the optimization is enabled, but
that is hard to do and I don't think the number of cold paths that meet hot
paths is large enough for it to matter much.
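
To make the trade-off concrete, here is a hand-written sketch of what jump
threading does to a function in which two paths meet and a condition is then
re-tested.  The second function is what the code looks like after the shared
block has been duplicated by hand; both function names are hypothetical, and
this is not output from the pass.

```c
#include <assert.h>

/* Before threading: both arms meet at a shared block, then FLAG is
   tested again even though each arm already knows its value.  */
static int
work_unthreaded (int x, int *log)
{
  int flag;
  if (x > 0)
    {
      *log += 1;
      flag = 1;
    }
  else
    {
      *log += 2;
      flag = 0;
    }
  *log += 10;		/* Shared block entered from two edges.  */
  if (flag)
    return x * 2;
  return -x;
}

/* After threading: the shared block is duplicated on each path, so the
   second test disappears and each copy has a single entry edge.  */
static int
work_threaded (int x, int *log)
{
  if (x > 0)
    {
      *log += 1;
      *log += 10;
      return x * 2;
    }
  *log += 2;
  *log += 10;
  return -x;
}
```

The duplicated version is larger, which is the code expansion the
heuristics have to weigh against the simpler control flow.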

Bootstrapped/regtested x86_64-linux, OK?
* tree-ssa-threadbackward.c (profitable_jump_thread_path): Adjust
hot/cold heuristics.

* gcc.dg/tree-ssa/pr77445-2.c: Update testcase.
Index: tree-ssa-threadbackward.c
===
--- tree-ssa-threadbackward.c   (revision 244732)
+++ tree-ssa-threadbackward.c   (working copy)
@@ -159,6 +159,7 @@ profitable_jump_thread_path (vec= PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATH_INSNS))
{
Index: testsuite/gcc.dg/tree-ssa/pr77445-2.c
===
--- testsuite/gcc.dg/tree-ssa/pr77445-2.c   (revision 244746)
+++ testsuite/gcc.dg/tree-ssa/pr77445-2.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-details-blocks-stats" } */
+/* { dg-options "-O2 -fdump-tree-thread1-details-blocks-stats -fdump-tree-thread2-details-blocks-stats -fdump-tree-thread3-details-blocks-stats -fdump-tree-thread4-details-blocks-stats" } */
 typedef enum STATES {
START=0,
INVALID,
@@ -121,3 +121,7 @@ enum STATES FMS( u8 **in , u32 *transiti
increase much.  */
 /* { dg-final { scan-tree-dump "Jumps threaded: 1[1-9]" "thread1" } } */
 /* { dg-final { scan-tree-dump-times "Invalid sum" 2 "thread1" } } */
+/* { dg-final { scan-tree-dump-not "not considered" "thread1" } } */
+/* { dg-final { scan-tree-dump-not "not considered" "thread2" } } */
+/* { dg-final { scan-tree-dump-not "not considered" "thread3" } } */
+/* { dg-final { scan-tree-dump-not "not considered" "thread4" } } */
Index: testsuite/gcc.dg/tree-ssa/threadbackward-1.c
===
--- testsuite/gcc.dg/tree-ssa/threadbackward-1.c(revision 0)
+++ testsuite/gcc.dg/tree-ssa/threadbackward-1.c(working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ethread" } */
+char *c;
+int t()
+{
+  for (int i=0;i<5000;i++)
+c[i]=i;
+}
+/* { dg-final { scan-tree-dump-times "Registering FSM jump thread" 1 "ethread"} } */


Re: [PATCH 0/5] OpenMP/PTX: improve correctness in SIMD regions

2017-01-25 Thread Alexander Monakov
Hi,

Here's a different approach that doesn't introduce indirection for privatized
variables at all, and keeps dependencies obvious in the IR, but, on the flip
side, requires mentioning all subfields of privatized structures in a few
places.

For each privatized variable, add it to the list of outputs of an asm at
function entry, and to the lists of inputs+outputs at SIMT region entry.  That
is, if original code has

  void foo()
  {
    int var1, var2;
    ...
    #pragma omp private(var1, var2)
    for (...) { ... }
  }

lower it to

void foo()
{
  int priv_var1, priv_var2;
  asm ("// non-transparent initialization" : "=r"(priv_var1), "=r"(priv_var2));

  ...

  void *simtrec;
  asm ("// simt entry" : "=r"(simtrec), "+r"(priv_var1), "+r"(priv_var2));

  for (...) { ... }

  asm ("// simt exit" : : "r"(simtrec) : "r"(priv_var1), "r"(priv_var2));
}

Note that technically adding privatized vars just to the clobbers list at region
exit is not enough since the compiler is free to move a write past that
downwards (with any subsequent reads).  But I'm not sure if that should be a
real concern.  Apart from that, I think this introduces the required data
dependencies, and nothing more.
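
As a small aside on the mechanism, the sketch below shows how an asm operand
list alone creates a dependency the compiler cannot see through.  It assumes
GNU C extended asm and is not the proposed lowering itself; `opaque_int` is a
hypothetical name.

```c
#include <assert.h>

/* Return VALUE unchanged, but route it through an empty asm with a
   read-write register operand.  The compiler must assume the asm may
   have modified it, so it cannot fold or forward the original value
   across this point.  Requires GNU C extended asm.  */
static int
opaque_int (int value)
{
  __asm__ volatile ("" : "+r" (value));
  return value;
}
```

No instructions are emitted for the asm body; only the constraints
matter, which is the property the approach above relies on.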

For composite privatized objects (arrays, structures), we'd have to recursively
walk them and emit individual fields/array entries in asm arguments.

At some point in the pass pipeline we'd have to remove the first asm and replace
the other two with IFN_GOMP_SIMT_ENTER/EXIT, as in the earlier approach (Jakub
suggested ompdevlow, as I remember).

Does this look ok?

Thanks.
Alexander


Re: [PATCH, rs6000] Fix for entries in table of overloaded built-in functions

2017-01-25 Thread Carl E. Love
Bill:


> > I don't see any tests for the two built-in entries in rs6000-c.c which the
> > patch moves, i.e.
> >
> >   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
> >     RS6000_BTI_V16QI, 0, 0, 0 },
> >   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
> >     RS6000_BTI_unsigned_V16QI, 0, 0, 0 },
> >

> 
> Those two entries look bogus to me, and they should just be removed, not
> moved.  I have no idea where they came from.  I suspect they were
> place-holders at one time that snuck into the code by accident.
> 
> The relevant API interface listed in the ELFv2 ABI is vec_gb, which
> should support only one interface:
> 
> vector unsigned char vec_gb (vector unsigned char);
> 
> So please remove the two bogus interfaces, and make sure we have support
> for the vec_gb interface in your GCC 8 patch list.  Thanks!

Taking this off list.

Bill, sorry I missed your email this morning before I committed the patch
that moved the vec_vgbbd entries.  I agree the two vec_vgbbd entries look
bogus.  There is a test in
gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
for the vec_gb() interface you mentioned from the ABI that covers this
case.

I will create and test a patch to remove the bogus entries.  I will then
roll it into a single patch that fixes the vec_packs entries and adds
the missing vec_packs tests.  I will then backport the single patch to
GCC 5 and GCC 6.  I will post the backported patches to the list in a
week or so, assuming no issues arise with the changes to mainline.

Does that all sound reasonable?

   Carl Love 



Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Segher Boessenkool
On Fri, Jan 20, 2017 at 01:24:15PM -0600, Segher Boessenkool wrote:
> On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote:
> > So, when looking for situations where we have only one condition, we can 
> > try to undo the conversion of a plain REG into a condition, on the 
> > grounds that this is probably less helpful.
> > 
> > This seems to cure the testcase, but Segher also has a patch in the PR 
> > that looks like a good and more direct approach. IMO both should be 
> > applied. This one was bootstrapped and tested on x86_64-linux. Ok?
> 
> My patch does not cure all problems, it simply simplifies things a bit
> better; but the same is true for your patch if I read it correctly.
> 
> Okay for trunk, and I'll do my half as well.  Thanks,

It turns out my patch (see the PR) causes (or at least triggers)
miscompilations on tilegx.  I will drop it for now.

Longer term we will have to fix this whole if_then_else_cond business,
maybe make it less expensive and/or more effective as well.


Segher


[committed] pedwarn on lambda templates (PR c++/77914)

2017-01-25 Thread Jakub Jelinek
Hi!

As mentioned in the PR, lambda templates made it into neither C++14 nor
C++17 (only generic lambdas with auto arguments did).  This patch pedwarns
on them with -pedantic{,-errors}.

Bootstrapped/regtested on x86_64-linux and i686-linux, acked by Jason in the
PR, committed to trunk.

2017-01-25  Jakub Jelinek  

PR c++/77914
* parser.c (cp_parser_lambda_declarator_opt): Pedwarn with
OPT_Wpedantic on lambda templates for -std=c++14 and higher.

* g++.dg/cpp1y/lambda-generic-77914.C: New test.
* g++.dg/cpp1y/lambda-generic-dep.C: Add -pedantic to dg-options,
expect a warning.
* g++.dg/cpp1y/lambda-generic-x.C: Add -Wpedantic to dg-options,
expect warnings.
* g++.dg/cpp1y/lambda-generic-mixed.C: Add empty dg-options.
* g++.dg/cpp1y/pr59636.C: Likewise.
* g++.dg/cpp1y/pr60190.C: Likewise.

--- gcc/cp/parser.c.jj  2017-01-25 19:23:46.240197073 +0100
+++ gcc/cp/parser.c 2017-01-25 19:26:17.397274647 +0100
@@ -10174,6 +10174,9 @@ cp_parser_lambda_declarator_opt (cp_pars
pedwarn (parser->lexer->next_token->location, 0,
 "lambda templates are only available with "
 "-std=c++14 or -std=gnu++14");
+  else
+   pedwarn (parser->lexer->next_token->location, OPT_Wpedantic,
+"ISO C++ does not support lambda templates");
 
   cp_lexer_consume_token (parser->lexer);
 
--- gcc/testsuite/g++.dg/cpp1y/lambda-generic-77914.C.jj	2017-01-25 19:26:17.398274634 +0100
+++ gcc/testsuite/g++.dg/cpp1y/lambda-generic-77914.C	2017-01-25 21:41:40.964310527 +0100
@@ -0,0 +1,9 @@
+// PR c++/77914
+// { dg-do compile { target c++14 } }
+
+int
+main ()
+{
+  auto l = []  () {};  // { dg-error "does not support lambda templates" }
+  l.operator ()  ();
+}
--- gcc/testsuite/g++.dg/cpp1y/lambda-generic-dep.C.jj	2014-09-25 15:02:34.340817869 +0200
+++ gcc/testsuite/g++.dg/cpp1y/lambda-generic-dep.C	2017-01-25 19:26:17.400274609 +0100
@@ -1,5 +1,6 @@
 // Generic lambda type dependence test part from N3690 5.1.2.12
 // { dg-do compile { target c++14 } }
+// { dg-options "-pedantic" }
 
 void f(int, const int (&)[2] = {}) { } // #1
 void f(const int&, const int (&)[1]) { } // #2
@@ -26,7 +27,7 @@ struct S {
 
 int main()
 {
-  auto f = []  (T const& s) mutable {
+  auto f = []  (T const& s) mutable {  // { dg-warning "does not support lambda templates" }
 typename T::N x;
 return x.test ();
   };
--- gcc/testsuite/g++.dg/cpp1y/lambda-generic-x.C.jj	2014-09-25 15:02:34.352817644 +0200
+++ gcc/testsuite/g++.dg/cpp1y/lambda-generic-x.C	2017-01-25 19:26:17.400274609 +0100
@@ -1,21 +1,22 @@
 // Explicit generic lambda test from N3690 5.1.2.5
 // { dg-do compile { target c++14 } }
+// { dg-options "-Wpedantic" }
 
 #include 
 
 int main()
 {
-   auto glambda = []  (A a, B&& b) { return a < b; };
+   auto glambda = []  (A a, B&& b) { return a < b; };  // { dg-warning "does not support lambda templates" }
bool b = glambda(3, 3.14); // OK
-   auto vglambda = []  (P printer) {
+   auto vglambda = []  (P printer) {   // { dg-warning "does not support lambda templates" }
  return [=]  (T&& ... ts) { // OK: ts is a function parameter pack
-   printer(std::forward(ts)...);
+   printer(std::forward(ts)...); // { dg-warning "does not support lambda templates" "" { target *-*-* } .-1 }
return [=]() {
  printer(ts ...);
};
  };
};
-   auto p = vglambda( []  (A v1, B v2, C v3)
  { std::cout << v1 << v2 << v3; } );
--- gcc/testsuite/g++.dg/cpp1y/lambda-generic-mixed.C.jj	2014-09-25 15:02:34.348817719 +0200
+++ gcc/testsuite/g++.dg/cpp1y/lambda-generic-mixed.C	2017-01-25 19:26:17.401274596 +0100
@@ -1,5 +1,6 @@
 // Mixed explicit and implicit generic lambda test.
 // { dg-do compile { target c++14 } }
+// { dg-options "" }
 
 int main()
 {
--- gcc/testsuite/g++.dg/cpp1y/pr59636.C.jj	2014-09-25 15:02:34.0 +0200
+++ gcc/testsuite/g++.dg/cpp1y/pr59636.C	2017-01-25 21:42:29.946690283 +0100
@@ -1,4 +1,5 @@
 // PR c++/59636
 // { dg-do compile { target c++14 } }
+// { dg-options "" }
 
 auto f = []() { return []<>() {}; };  // { dg-error "expected identifier" }
--- gcc/testsuite/g++.dg/cpp1y/pr60190.C.jj	2014-09-25 15:02:34.0 +0200
+++ gcc/testsuite/g++.dg/cpp1y/pr60190.C	2017-01-25 21:42:54.058384967 +0100
@@ -1,4 +1,5 @@
 // PR c++/60190
 // { dg-do compile { target c++14 } }
+// { dg-options "" }
 
 auto f = []() -> int() {}; // { dg-error "returning a function|expected" }

Jakub


[PATCH] Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465)

2017-01-25 Thread Jakub Jelinek
Hi!

This patch adds a small optimization: if %st and %st(1) are results of
memory loads and we need to exchange them, it is shorter (and on older
machines probably also cheaper) to swap the two loads.
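
The equivalence the patch relies on can be checked with a toy model of the
x87 register stack.  This is only a sketch for illustration: the struct and
the helpers `fld` and `fxch1` are hypothetical stand-ins for the
instructions, not code from reg-stack.c.

```c
#include <assert.h>

/* Toy model of the x87 register stack; st[0] is the top (%st).  */
struct fpstack
{
  double st[8];
  int depth;
};

/* Model of fld: push V, moving existing entries one slot down.  */
static void
fld (struct fpstack *s, double v)
{
  assert (s->depth < 8);
  for (int i = s->depth; i > 0; i--)
    s->st[i] = s->st[i - 1];
  s->st[0] = v;
  s->depth++;
}

/* Model of fxch %st(1): exchange the two topmost entries.  */
static void
fxch1 (struct fpstack *s)
{
  assert (s->depth >= 2);
  double t = s->st[0];
  s->st[0] = s->st[1];
  s->st[1] = t;
}
```

Running fld a; fld b; fxch1 on one stack and fld b; fld a on another
leaves both in the same state, which is why the exchange can be dropped
when the two loads are independent.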

This triggers 26783 times in the x86_64-linux and i686-linux bootstraps+regtests.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Initially I thought I should also adjust debug insns in between i1 and i2
if they refer to i387 stack registers, but a further look at regstack shows
that nothing does that and that such debug insns are wrong-debug for
multiple reasons.  In particular, emit_swap_insn alone emits the
fxch right after i1 or at the start of the bb, without trying to adjust
debug insns in between that and insn (we should swap %st with the other reg
in those).  More importantly, at least if the debug regnos actually mean
what they stand for in the assembly (i.e. the first one %st, the second one
%st(1) etc.), then we would need to ensure that every insn that pops or
pushes anything onto the i387 stack updates all DWARF expressions
that refer to those registers.  If that is really the case, the easiest
solution might be either to reset all debug insns that refer to the %st
to %st(7) registers (easy), or to replace all those registers in debug insns
with debug temporaries (8 debug temporaries for each pre-regstack register)
and then on i387 pushes or pops emit debug insns for those temporaries,
changing how they are mapped to the actual hw registers, and let
var-tracking.c handle the rest.  I guess I should file a PR about this.

2017-01-25  Jakub Jelinek  

PR target/70465
* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxch %st(1);
emit fld b; fld a; if possible.

* gcc.target/i386/pr70465.c: New test.

--- gcc/reg-stack.c.jj  2017-01-25 11:38:54.924320927 +0100
+++ gcc/reg-stack.c 2017-01-25 12:12:08.459045376 +0100
@@ -887,6 +887,52 @@ emit_swap_insn (rtx_insn *insn, stack_pt
  && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG
  && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX)
return;
+
+  if (REG_P (i1dest)
+ && REGNO (i1dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i1set))
+ && !side_effects_p (SET_SRC (i1set))
+ && hard_regno == FIRST_STACK_REG + 1
+ && i1 != BB_HEAD (current_block))
+   {
+ rtx_insn *i2 = NULL;
+ rtx i2set;
+ rtx_insn *tmp = PREV_INSN (i1);
+ rtx_insn *limit = PREV_INSN (BB_HEAD (current_block));
+ while (tmp != limit)
+   {
+ if (LABEL_P (tmp)
+ || CALL_P (tmp)
+ || NOTE_INSN_BASIC_BLOCK_P (tmp)
+ || (NONJUMP_INSN_P (tmp)
+ && stack_regs_mentioned (tmp)))
+   {
+ i2 = tmp;
+ break;
+   }
+ tmp = PREV_INSN (tmp);
+   }
+ if (i2 != NULL_RTX
+ && (i2set = single_set (i2)) != NULL_RTX)
+   {
+ /* Instead of fld a; fld b; fxch %st(1); just
+use fld b; fld a; if possible.  */
+ rtx i2dest = *get_true_reg (&SET_DEST (i2set));
+ if (REG_P (i2dest)
+ && REGNO (i2dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i2set))
+ && !side_effects_p (SET_SRC (i2set))
+ && !modified_between_p (SET_SRC (i1set), i2, i1))
+   {
+ remove_insn (i1);
+ SET_PREV_INSN (i1) = NULL_RTX;
+ SET_NEXT_INSN (i1) = NULL_RTX;
+ set_block_for_insn (i1, NULL);
+ emit_insn_before (i1, i2);
+ return;
+   }
+   }
+   }
 }
 
   /* Avoid emitting the swap if this is the first register stack insn
--- gcc/testsuite/gcc.target/i386/pr70465.c.jj	2017-01-25 12:24:25.183041154 +0100
+++ gcc/testsuite/gcc.target/i386/pr70465.c	2017-01-25 12:23:59.0 +0100
@@ -0,0 +1,12 @@
+/* PR target/70465 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mfpmath=387 -fomit-frame-pointer" } */
+/* { dg-final { scan-assembler-not "fxch\t%st.1" } } */
+
+double
+atan2 (double y, double x)
+{
+  double res = 0.0;
+  asm ("fpatan" : "=t" (res) : "u" (y), "0" (x) : "st(1)");
+  return res;
+}

Jakub


[C++ PATCH] Reject lambda closure types in decompositions (PR c++/78896)

2017-01-25 Thread Jakub Jelinek
Hi!

As discussed in the PR, while lambda closure types are class types, what
those class types actually contain is implementation-dependent, so
allowing them to be decomposed is just weird.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-01-25  Jakub Jelinek  

PR c++/78896
* decl.c (cp_finish_decomp): Disallow memberwise decomposition of
lambda expressions.

* g++.dg/cpp1z/decomp24.C: New test.

--- gcc/cp/decl.c.jj2017-01-25 17:17:51.0 +0100
+++ gcc/cp/decl.c   2017-01-25 19:16:19.439879509 +0100
@@ -7562,6 +7562,11 @@ cp_finish_decomp (tree decl, tree first,
   error_at (loc, "cannot decompose non-array non-class type %qT", type);
   goto error_out;
 }
+  else if (LAMBDA_TYPE_P (type))
+{
+  error_at (loc, "cannot decompose lambda closure type %qT", type);
+  goto error_out;
+}
   else
 {
   tree btype = find_decomp_class_base (loc, type, NULL_TREE);
--- gcc/testsuite/g++.dg/cpp1z/decomp24.C.jj	2017-01-25 19:19:42.536296515 +0100
+++ gcc/testsuite/g++.dg/cpp1z/decomp24.C	2017-01-25 19:19:20.0 +0100
@@ -0,0 +1,11 @@
+// PR c++/78896
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+int
+foo ()
+{
+  int a {10};
+  auto [b] { [&a](){} };   // { dg-error "cannot decompose lambda closure type" }
+  return b - a;   // { dg-warning "decomposition declaration only available with" "" { target c++14_down } .-1 }
+}

Jakub


[PATCH] prevent -Wno-system-headers from suppressing -Wstringop-overflow (PR 79214)

2017-01-25 Thread Martin Sebor

While putting together examples for the GCC 7 changes document
I noticed that a few of the buffer overflow warnings issued by
-Wstringop-overflow are defeated by Glibc's macros for string
manipulation functions like strncat and strncpy.
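
For context, a common source of such overflows is passing the size of the
whole destination as the bound to strncat, although the bound limits only the
number of characters appended.  The sketch below shows a bound that accounts
for what is already in the buffer; `append_bounded` is a hypothetical helper,
not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Append SRC to the string in DST, whose total capacity is DSTSIZE
   bytes including the terminating NUL, truncating if necessary.
   Hypothetical helper, for illustration only.  */
static void
append_bounded (char *dst, size_t dstsize, const char *src)
{
  size_t used = strlen (dst);
  assert (used < dstsize);
  /* strncat appends at most the given number of characters and then
     a NUL, so leave room for the existing text and the terminator.  */
  strncat (dst, src, dstsize - used - 1);
}
```

Calling strncat (dst, src, dstsize) instead would let it write past the
end of DST whenever DST is not empty and SRC is long enough.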

While testing my fix I also noticed that I had missed a couple
of functions when implementing the warning: memmove and stpcpy.

The attached patch adds handlers for those and fixes the three
bugs below I raised for these omissions.

Is this patch okay for trunk?

PR preprocessor/79214 - -Wno-system-headers defeats strncat buffer
  overflow warnings
PR middle-end/79222 - missing -Wstringop-overflow= on a stpcpy overflow
PR middle-end/79223 - missing -Wstringop-overflow on a memmove overflow

Martin
PR preprocessor/79214 - -Wno-system-headers defeats strncat buffer overflow warnings
PR middle-end/79222 - missing -Wstringop-overflow= on a stpcpy overflow
PR middle-end/79223 - missing -Wstringop-overflow on a memmove overflow

gcc/ChangeLog:

	PR preprocessor/79214
	PR middle-end/79222
	PR middle-end/79223
	* builtins.c (check_sizes): Add inlining context and issue
	warnings even when -Wno-system-headers is set.
	(check_strncat_sizes): Same.
	(expand_builtin_strncat): Same.
	(expand_builtin_memmove): New function.
	(expand_builtin_stpncpy): Same.
	(expand_builtin): Handle memmove and stpncpy.

gcc/testsuite/ChangeLog:

	PR preprocessor/79214
	PR middle-end/79222
	PR middle-end/79223
	* gcc.dg/pr79214.c: New test.
	* gcc.dg/pr79214.h: New test header.
	* gcc.dg/pr79222.c: New test.
	* gcc.dg/pr79223.c: New test.
	* gcc.dg/pr78138.c: Adjust.

Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 244844)
+++ gcc/builtins.c	(working copy)
@@ -121,6 +121,7 @@ static rtx builtin_memcpy_read_str (void *, HOST_W
 static rtx expand_builtin_memcpy (tree, rtx);
 static rtx expand_builtin_memcpy_with_bounds (tree, rtx);
 static rtx expand_builtin_memcpy_args (tree, tree, tree, rtx, tree);
+static rtx expand_builtin_memmove (tree, rtx);
 static rtx expand_builtin_mempcpy (tree, rtx, machine_mode);
 static rtx expand_builtin_mempcpy_with_bounds (tree, rtx, machine_mode);
 static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx,
@@ -129,6 +130,7 @@ static rtx expand_builtin_strcat (tree, rtx);
 static rtx expand_builtin_strcpy (tree, rtx);
 static rtx expand_builtin_strcpy_args (tree, tree, rtx);
 static rtx expand_builtin_stpcpy (tree, rtx, machine_mode);
+static rtx expand_builtin_stpncpy (tree, rtx);
 static rtx expand_builtin_strncat (tree, rtx);
 static rtx expand_builtin_strncpy (tree, rtx);
 static rtx builtin_memset_gen_str (void *, HOST_WIDE_INT, machine_mode);
@@ -3123,6 +3125,7 @@ check_sizes (int opt, tree exp, tree size, tree ma
   if (range[0] && tree_int_cst_lt (maxobjsize, range[0]))
 {
   location_t loc = tree_nonartificial_location (exp);
+  loc = expansion_point_location_if_in_system_header (loc);
 
   if (range[0] == range[1])
 	warning_at (loc, opt,
@@ -3155,10 +3158,11 @@ check_sizes (int opt, tree exp, tree size, tree ma
 	  unsigned HOST_WIDE_INT uwir0 = tree_to_uhwi (range[0]);
 
 	  location_t loc = tree_nonartificial_location (exp);
+	  loc = expansion_point_location_if_in_system_header (loc);
 
 	  if (at_least_one)
 	warning_at (loc, opt,
-			"%K%qD: writing at least %wu byte into a region "
+			"%K%qD writing at least %wu byte into a region "
 			"of size %wu overflows the destination",
 			exp, get_callee_fndecl (exp), uwir0,
 			tree_to_uhwi (objsize));
@@ -3165,7 +3169,7 @@ check_sizes (int opt, tree exp, tree size, tree ma
 	  else if (range[0] == range[1])
 	warning_at (loc, opt,
 			(uwir0 == 1
-			 ? G_("%K%qD: writing %wu byte into a region "
+			 ? G_("%K%qD writing %wu byte into a region "
 			  "of size %wu overflows the destination")
 			 : G_("%K%qD writing %wu bytes into a region "
 			  "of size %wu overflows the destination")),
@@ -3173,7 +3177,7 @@ check_sizes (int opt, tree exp, tree size, tree ma
 			tree_to_uhwi (objsize));
 	  else
 	warning_at (loc, opt,
-			"%K%qD: writing between %wu and %wu bytes "
+			"%K%qD writing between %wu and %wu bytes "
 			"into a region of size %wu overflows "
 			"the destination",
 			exp, get_callee_fndecl (exp), uwir0,
@@ -3194,6 +3198,7 @@ check_sizes (int opt, tree exp, tree size, tree ma
   if (range[0] && objsize && tree_fits_uhwi_p (objsize))
 	{
 	  location_t loc = tree_nonartificial_location (exp);
+	  loc = expansion_point_location_if_in_system_header (loc);
 
 	  if (tree_int_cst_lt (maxobjsize, range[0]))
 	{
@@ -3302,6 +3307,24 @@ expand_builtin_memcpy (tree exp, rtx target)
   return expand_builtin_memcpy_args (dest, src, len, target, exp);
 }
 
+/* Check a call EXP to the memmove built-in for validity.
+   Return NULL_RTX on both success and failure.  */
+
+static rtx
+expand_builtin_memmove (tree exp, rtx)
+{
+  if (!validate_arglist (exp,
+ 			 POINTER_TYPE, POINTER_TYPE, INT

Re: [PATCH] Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465)

2017-01-25 Thread Jeff Law

On 01/25/2017 02:06 PM, Jakub Jelinek wrote:

Hi!

This patch adds a little optimization, if %st and %st(1) are results of
memory loads and we need to exchange them, it is shorter (and on older
machines probably also cheaper) to swap the two loads.

This triggers 26783 in x86_64-linux and i686-linux bootstraps+regtests.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Initially I thought I should also adjust debug insns in between i1 and i2
if they refer to i387 stack registers, but further look at regstack shows
that nothing does that and that such debug insns are wrong-debug for
multiple reasons.  In particular, e.g. emit_swap_insn alone emits the
fxch right after i1 or at the start of bb, without trying to adjust
debug insns in between that and insn (we should swap %st with the other reg
in those).  More importantly, at least if the debug regnos actually mean
what they stand in the assembly (i.e. the first one %st, the second one
%st(1) etc.), then we would need to ensure that every insn that pops or
pushes anything onto the i387 stack should update all DWARF expressions
that refer to those registers.  If that is really the case, the easiest
solution for this might be either to reset all debug insns that refer to %st
to %st(7) registers (easy), or replace all those registers in debug insns
with debug temporaries (8 debug temporaries for each pre-regstack register)
and then on i387 pushes or pops emit debug insns for those temporaries,
changing how they are mapped to the actual hw registers and let
var-tracking.c handle the rest.  I guess I should file a PR about this.

2017-01-25  Jakub Jelinek  

PR target/70465
* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1);
emit fld b; fld a; if possible.

* gcc.target/i386/pr70465.c: New test.
So please comment on the general approach you're taking here.  I have a 
pretty good sense of what you're doing, mostly because I pondered 
something similar.  But I doubt others coming across the code would see 
the overall structure as quickly.





--- gcc/reg-stack.c.jj  2017-01-25 11:38:54.924320927 +0100
+++ gcc/reg-stack.c 2017-01-25 12:12:08.459045376 +0100
@@ -887,6 +887,52 @@ emit_swap_insn (rtx_insn *insn, stack_pt
  && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG
  && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX)
return;
+
+  if (REG_P (i1dest)
+ && REGNO (i1dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i1set))
+ && !side_effects_p (SET_SRC (i1set))
+ && hard_regno == FIRST_STACK_REG + 1
+ && i1 != BB_HEAD (current_block))
So I'd bring that inner comment to before this conditional.  It gives 
the motivation for the entire block of code.




+   {
+ rtx_insn *i2 = NULL;
+ rtx i2set;
+ rtx_insn *tmp = PREV_INSN (i1);
+ rtx_insn *limit = PREV_INSN (BB_HEAD (current_block));
+ while (tmp != limit)
+   {
+ if (LABEL_P (tmp)
+ || CALL_P (tmp)
+ || NOTE_INSN_BASIC_BLOCK_P (tmp)
+ || (NONJUMP_INSN_P (tmp)
+ && stack_regs_mentioned (tmp)))
+   {
+ i2 = tmp;
+ break;
+   }
+ tmp = PREV_INSN (tmp);
+   }
So a comment before the while loop.  You're basically looking for the 
previous insn (relative to I1) that involves stack regs within the same 
block.  It might also be worth noting that I1 is known to push a value 
onto the FP register stack.





+ if (i2 != NULL_RTX
+ && (i2set = single_set (i2)) != NULL_RTX)
+   {
+ /* Instead of fld a; fld b; fxch %st(1); just
+use fld b; fld a; if possible.  */
+ rtx i2dest = *get_true_reg (&SET_DEST (i2set));
+ if (REG_P (i2dest)
+ && REGNO (i2dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i2set))
+ && !side_effects_p (SET_SRC (i2set))
+ && !modified_between_p (SET_SRC (i1set), i2, i1))
And here we're trying to verify that the insn found above (I2) is 
pushing another value onto the FP register stack and that the value in 
I2 is not modified between I1 and I2.


So ISTM with just some comment improvements, this is fine for the trunk.

jeff


[PATCH,rs6000] Remove invalid P8V_BUILTIN_VEC_VGBBD entries

2017-01-25 Thread Carl E. Love
GCC Maintainers:

After further discussion of the two P8V_BUILTIN_VGBBD built-ins that do
not take any arguments, it was determined they should just be removed as
they are not valid.

The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
with no regressions.

Is the patch OK for trunk?  

   Carl Love
-

gcc/ChangeLog:

2017-01-24  Carl Love  

* config/rs6000/rs6000-c (altivec_overloaded_builtins): Remove
bogus entries for the P8V_BUILTIN_VEC_VGBBD built-ins
---
 gcc/config/rs6000/rs6000-c.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 1466c0c..cda0da8 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -4789,10 +4789,6 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
 RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
-  { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
-RS6000_BTI_V16QI, 0, 0, 0 },
-  { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
-RS6000_BTI_unsigned_V16QI, 0, 0, 0 },
   
   { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
 RS6000_BTI_V16QI, RS6000_BTI_V4SI,
-- 
1.9.1





Re: [PATCH] Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465)

2017-01-25 Thread Jakub Jelinek
On Wed, Jan 25, 2017 at 02:43:34PM -0700, Jeff Law wrote:
> > 2017-01-25  Jakub Jelinek  
> > 
> > PR target/70465
> > * reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1);
> > emit fld b; fld a; if possible.
> > 
> > * gcc.target/i386/pr70465.c: New test.
> So please comment on the general approach you're taking here.  I have a
> pretty good sense of what you're doing, mostly because I pondered something
> similar.  But I doubt others coming across the code would see the overall
> structure as quickly.

Does the following updated patch explain it sufficiently?

> > + if (i2 != NULL_RTX
> > + && (i2set = single_set (i2)) != NULL_RTX)
> > +   {
> > + /* Instead of fld a; fld b; fxch %st(1); just
> > +use fld b; fld a; if possible.  */
> > + rtx i2dest = *get_true_reg (&SET_DEST (i2set));
> > + if (REG_P (i2dest)
> > + && REGNO (i2dest) == FIRST_STACK_REG
> > + && MEM_P (SET_SRC (i2set))
> > + && !side_effects_p (SET_SRC (i2set))
> > + && !modified_between_p (SET_SRC (i1set), i2, i1))
> And here we're trying to verify that the insn found above (I2) is pushing
> another value onto the FP register stack and that the value in I2 is not
> modified between I1 and I2.

No, that last call (as I've tried to explain in the new comment) wants
to ensure that there are no stores in between i2 and i1 that might
alias with the second load's memory (then it would be invalid to move it
before i2) and that the address of the memory doesn't depend on something
set after i2.
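
For readers following along, the effect of that check can be sketched with a toy model of the x87 register stack.  This is only an illustration under stated assumptions, not code from reg-stack.c: struct fpstack and the helpers fld, fxch1, run_original, run_swapped and demo_swap are hypothetical names, and the "store in between" is modelled as a plain assignment through a pointer that aliases b.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model of the x87 register stack: st[0] is %st, st[1] is %st(1).
   Hypothetical; not a GCC data structure.  */
struct fpstack
{
  double st[8];
};

/* fld: push VALUE, moving every existing entry one slot down.  */
void
fld (struct fpstack *s, double value)
{
  memmove (&s->st[1], &s->st[0], 7 * sizeof (double));
  s->st[0] = value;
}

/* fxch %st(1): exchange the two topmost entries.  */
void
fxch1 (struct fpstack *s)
{
  double tmp = s->st[0];
  s->st[0] = s->st[1];
  s->st[1] = tmp;
}

bool
same_top_two (const struct fpstack *x, const struct fpstack *y)
{
  return x->st[0] == y->st[0] && x->st[1] == y->st[1];
}

/* Original order: fld a; [optional store to *b]; fld b; fxch %st(1).  */
struct fpstack
run_original (const double *a, double *b, bool store_between, double stored)
{
  struct fpstack s = { { 0 } };
  fld (&s, *a);
  if (store_between)
    *b = stored;
  fld (&s, *b);
  fxch1 (&s);
  return s;
}

/* Swapped order: fld b; fld a; [optional store to *b].  The load of b
   has been moved above the store, which is what the check must prevent.  */
struct fpstack
run_swapped (const double *a, double *b, bool store_between, double stored)
{
  struct fpstack s = { { 0 } };
  fld (&s, *b);
  fld (&s, *a);
  if (store_between)
    *b = stored;
  return s;
}

void
demo_swap (void)
{
  double a = 1.5, b = 2.5;

  /* With nothing in between, both orders leave the same %st and %st(1).  */
  struct fpstack orig = run_original (&a, &b, false, 0.0);
  struct fpstack swapped = run_swapped (&a, &b, false, 0.0);
  assert (same_top_two (&orig, &swapped));

  /* With an aliasing store in between, the swapped order loads the old
     value of b, so the two sequences are no longer equivalent.  */
  orig = run_original (&a, &b, true, 9.0);
  b = 2.5;
  swapped = run_swapped (&a, &b, true, 9.0);
  assert (!same_top_two (&orig, &swapped));
}
```

In this model the second case is the situation !modified_between_p (SET_SRC (i1set), i2, i1) is meant to rule out: the load being hoisted must read the same memory contents at its new position.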

2017-01-25  Jakub Jelinek  

PR target/70465
* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxchg %st(1);
emit fld b; fld a; if possible.

* gcc.target/i386/pr70465.c: New test.

--- gcc/reg-stack.c.jj  2017-01-25 17:17:46.086112137 +0100
+++ gcc/reg-stack.c 2017-01-25 22:58:00.403259702 +0100
@@ -887,6 +887,77 @@ emit_swap_insn (rtx_insn *insn, stack_pt
  && REG_P (i1src) && REGNO (i1src) == FIRST_STACK_REG
  && find_regno_note (i1, REG_DEAD, FIRST_STACK_REG) == NULL_RTX)
return;
+
+  /* Instead of
+  fld a
+  fld b
+  fxch %st(1)
+just use
+  fld b
+  fld a
+ if possible.  */
+
+  if (REG_P (i1dest)
+ && REGNO (i1dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i1set))
+ && !side_effects_p (SET_SRC (i1set))
+ && hard_regno == FIRST_STACK_REG + 1
+ && i1 != BB_HEAD (current_block))
+   {
+ /* i1 is the last insn that involves stack regs before insn, and
+is known to be a load without other side-effects, i.e. fld b
+in the above comment.  */
+ rtx_insn *i2 = NULL;
+ rtx i2set;
+ rtx_insn *tmp = PREV_INSN (i1);
+ rtx_insn *limit = PREV_INSN (BB_HEAD (current_block));
+ /* Find the previous insn involving stack regs, but don't pass a
+block boundary.  */
+ while (tmp != limit)
+   {
+ if (LABEL_P (tmp)
+ || CALL_P (tmp)
+ || NOTE_INSN_BASIC_BLOCK_P (tmp)
+ || (NONJUMP_INSN_P (tmp)
+ && stack_regs_mentioned (tmp)))
+   {
+ i2 = tmp;
+ break;
+   }
+ tmp = PREV_INSN (tmp);
+   }
+ if (i2 != NULL_RTX
+ && (i2set = single_set (i2)) != NULL_RTX)
+   {
+ rtx i2dest = *get_true_reg (&SET_DEST (i2set));
+ /* If the last two insns before insn that involve
+stack regs are loads, where the latter (i1)
+pushes onto the register stack and thus
+moves the value from the first load (i2) from
+%st to %st(1), consider swapping them.  */
+ if (REG_P (i2dest)
+ && REGNO (i2dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i2set))
+ /* Ensure i2 doesn't have other side-effects.  */
+ && !side_effects_p (SET_SRC (i2set))
+ /* And that the two instructions can actually be
+swapped, i.e. there shouldn't be any stores
+in between i2 and i1 that might alias with
+the i1 memory, and the memory address can't
+use registers set in between i2 and i1.  */
+ && !modified_between_p (SET_SRC (i1set), i2, i1))
+   {
+ /* Move i1 (fld b above) right before i2 (fld a
+above).  */
+ remove_insn (i1);
+ SET_PREV_INSN (i1) = NULL_RTX;
+ SET_NEXT_INSN (i1) = NULL_RTX;
+ set_block_for_insn (i1, NULL);
+ emit_insn_before (i1, i2);
+ return;
+   }
+   }
+   }
 }
 
   /* Avoid emitting

Merge from trunk to gccgo branch

2017-01-25 Thread Ian Lance Taylor
I merged trunk revision 244906 to the gccgo branch.

Ian


Re: [PATCH] c++/78771 ICE with inheriting ctor

2017-01-25 Thread Jason Merrill
On Wed, Jan 11, 2017 at 10:53 AM, Nathan Sidwell  wrote:
> On 01/04/2017 12:53 AM, Jason Merrill wrote:
>
>> Hmm, that seems like where the problem is.  We shouldn't try to
>> instantiate the inheriting constructor until we've already chosen the
>> base constructor; in the new model the inheriting constructor is just an
>> implementation detail.
>
> Oh what fun.  This testcase behaves differently for C++17, C++11
> -fnew-inheriting-ctors and C++11 -fno-new-inheriting-ctors compilation
> modes.
>
> Firstly, unpatched G++ is fine in C++17 mode, because:
>   /* In C++17, "If the initializer expression is a prvalue and the
>  cv-unqualified version of the source type is the same class as the
> class
>  of the destination, the initializer expression is used to initialize
> the
>  destination object."  Handle that here to avoid doing overload
>  resolution.  */
> and inside that we have:
>
>   /* FIXME P0135 doesn't say how to handle direct initialization from a
>  type with a suitable conversion operator.  Let's handle it like
>  copy-initialization, but allowing explict conversions.  */
>
> That conversion lookup short-circuits the subsequent overload resolution
> that would otherwise explode.
>
> Otherwise, with -fnew-inheriting-ctors, you are indeed correct.  There needs
> to be a call to strip_inheriting_ctors in deduce_inheriting_ctor.

That doesn't seem quite right; in deducing the inheriting ctor we are
interested in what it actually calls, so we don't want to strip.  I
was thinking about changing when we do that deduction: we shouldn't be
calling deduce_inheriting_ctor until we actually know we're calling
this inheriting ctor.  I was thinking that would mean removing the
code in fn_type_unification with the comment

  /* After doing deduction with the inherited constructor, actually
return an
 instantiation of the inheriting constructor.  */

and then looking up the inheriting constructor somehow in
build_over_call.  But that gets to be a big change.

Something smaller would be moving the call to deduce_inheriting_ctor
to build_over_call; we can get away with that because calling is the
only way to refer to a constructor. What do you think of this
approach?
commit 56586a488a27f2d5b502bd35aaec7225d0fb1d31
Author: Jason Merrill 
Date:   Wed Jan 25 16:52:28 2017 -0500

deduce-late

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index a78e1a9..99c51f3 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7581,6 +7581,11 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
joust (cand, w->loser, 1, complain);
 }
 
+  /* OK, we're actually calling this inherited constructor; set its deletedness
+ appropriately.  */
+  if (DECL_INHERITED_CTOR (fn))
+deduce_inheriting_ctor (fn);
+
   /* Make =delete work with SFINAE.  */
   if (DECL_DELETED_FN (fn) && !(complain & tf_error))
 return error_mark_node;
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index b7c26a1..03a9730 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1197,8 +1197,6 @@ add_method (tree type, tree method, tree using_decl)
  SET_DECL_INHERITED_CTOR
(fn, ovl_cons (DECL_INHERITED_CTOR (method),
   DECL_INHERITED_CTOR (fn)));
- /* Adjust deletedness and such.  */
- deduce_inheriting_ctor (fn);
  /* And discard the new one.  */
  return false;
}
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 5b366f0..e80b806 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1855,6 +1855,7 @@ explain_implicit_non_constexpr (tree decl)
 void
 deduce_inheriting_ctor (tree decl)
 {
+  decl = DECL_ORIGIN (decl);
   gcc_assert (DECL_INHERITED_CTOR (decl));
   tree spec;
   bool trivial, constexpr_, deleted;
@@ -1868,6 +1869,13 @@ deduce_inheriting_ctor (tree decl)
 deleted = true;
   DECL_DELETED_FN (decl) = deleted;
   TREE_TYPE (decl) = build_exception_variant (TREE_TYPE (decl), spec);
+
+  tree clone;
+  FOR_EACH_CLONE (clone, decl)
+{
+  DECL_DELETED_FN (clone) = deleted;
+  TREE_TYPE (clone) = build_exception_variant (TREE_TYPE (clone), spec);
+}
 }
 
 /* Implicitly declare the special function indicated by KIND, as a
@@ -1968,10 +1976,10 @@ implicitly_declare_fn (special_function_kind kind, tree 
type,
 
   bool trivial_p = false;
 
-  if (inherited_ctor && TREE_CODE (inherited_ctor) == TEMPLATE_DECL)
+  if (inherited_ctor)
 {
-  /* For an inheriting constructor template, just copy these flags from
-the inherited constructor template for now.  */
+  /* For an inheriting constructor, just copy these flags from the
+inherited constructor until deduce_inheriting_ctor.  */
   raises = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (inherited_ctor));
   deleted_p = DECL_DELETED_FN (inherited_ctor);
   constexpr_p = DE

Re: [C++ PATCH] Reject lambda closure types in decompositions (PR c++/78896)

2017-01-25 Thread Jason Merrill
OK.

On Wed, Jan 25, 2017 at 4:07 PM, Jakub Jelinek  wrote:
> Hi!
>
> As discussed in the PR, while lambda closure types are class types, it
> is implementation dependent on what those class types actually contain,
> allowing that to be decomposed is just weird.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-01-25  Jakub Jelinek  
>
> PR c++/78896
> * decl.c (cp_finish_decomp): Disallow memberwise decomposition of
> lambda expressions.
>
> * g++.dg/cpp1z/decomp24.C: New test.
>
> --- gcc/cp/decl.c.jj2017-01-25 17:17:51.0 +0100
> +++ gcc/cp/decl.c   2017-01-25 19:16:19.439879509 +0100
> @@ -7562,6 +7562,11 @@ cp_finish_decomp (tree decl, tree first,
>error_at (loc, "cannot decompose non-array non-class type %qT", type);
>goto error_out;
>  }
> +  else if (LAMBDA_TYPE_P (type))
> +{
> +  error_at (loc, "cannot decompose lambda closure type %qT", type);
> +  goto error_out;
> +}
>else
>  {
>tree btype = find_decomp_class_base (loc, type, NULL_TREE);
> --- gcc/testsuite/g++.dg/cpp1z/decomp24.C.jj2017-01-25 19:19:42.536296515 
> +0100
> +++ gcc/testsuite/g++.dg/cpp1z/decomp24.C   2017-01-25 19:19:20.0 
> +0100
> @@ -0,0 +1,11 @@
> +// PR c++/78896
> +// { dg-do compile { target c++11 } }
> +// { dg-options "" }
> +
> +int
> +foo ()
> +{
> +  int a {10};
> +  auto [b] { [&a](){} };   // { dg-error "cannot decompose lambda 
> closure type" }
> +  return b - a;// { dg-warning "decomposition 
> declaration only available with" "" { target c++14_down } .-1 }
> +}
>
> Jakub


[PATCH, rs6000] Fix PR79160 (vsx-elemrev-4.c)

2017-01-25 Thread Bill Schmidt
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79160 records that
gcc.target/powerpc/vsx-elemrev-4.c fails on powerpc64 big-endian.  The
test was developed just prior to the introduction of D-form memory
access instructions lxv and stxv, so it relied on output of X-form
instructions lxvx and stxvx.  Either would be acceptable, but with the
introduction of the D-form instructions, they are preferred by the code
generator when they apply.  I've changed the test case to accept either
the D-form or the X-form instructions.

Tested adn veriried on powerpc64-unknown-linux-gnu.  Ok for trunk?

Thanks,
Bill


2017-01-25  Bill Schmidt  

* gcc.target/powerpc/vsx-elemrev-4.c: Change expected code
generation to accept D-form memory accesses.


Index: gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c
===
--- gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c(revision 244824)
+++ gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c(working copy)
@@ -3,8 +3,8 @@
 /* { dg-options "-mcpu=power9 -O0" } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-skip-if "" { powerpc*-*-aix* } { "*" } { "" } } */
-/* { dg-final { scan-assembler-times "lxvx" 40 } } */
-/* { dg-final { scan-assembler-times "stxvx" 40 } } */
+/* { dg-final { scan-assembler-times "lxv" 40 } } */
+/* { dg-final { scan-assembler-times "stxv" 40 } } */
 
 #include 
 




Re: [PATCH 4/5] distinguish likely and unlikely results (PR 78703)

2017-01-25 Thread Jeff Law

On 01/22/2017 04:53 PM, Martin Sebor wrote:

The attached patch adds the concept of likely and unlikely results
of formatted functions to improve the quality of diagnostics (reduce
false positives and negatives) while at the same time allowing for
rare cases such as a multibyte decimal point in floating point output
or excessive width or precision bounds (when width and precision range
support is added, in the final patch of the series).  To this end,
this patch makes the following changes:

1) Eliminate the format_result exact byte counter (since its value
can be determined from the other counters) and introduce the likely
and unlikely counters.  The likely counter is considered (along
with the min and max counters) when deciding whether a directive
that may write too many bytes should be diagnosed.  The unlikely
counter is used as the worst case scenario by the return value
optimization but not to trigger diagnostics.

2) Eliminate the format_result::bounded flag since its value can
be derived from the counters.  This simplifies the warning logic.

3) Introduce the fmtresult::adjust_for_width_or_precision() function
and factor code that handled width and precision in individual type-
specific format_xxx handlers out of those handlers and into it.
This reduces code duplication, avoiding subtle inconsistencies.

4) Introduce the type_max_digits() function to compute the likely
amount of output for integer directives with unknown (and thus
possibly unlimited) precision and width.  This reduces the rate
of false positives.

5) Consolidate the min_bytes_remaining() function into
bytes_remaining() to handle all counters consistently.  This again
reduces code duplication and subtle inconsistencies.

6) Complete the consolidation of handling sequences of plain format
characters with format specifications (that start with %) and remove
the add_bytes function.

7) Introduce the should_warn_p() function and move the logic
to determine whether the result of a format directive should be
diagnosed out of format_directive and into it.

8) Update diagnostics to make use of the new counters.
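
To make the idea of a range of result sizes concrete, here is a small sketch that is not taken from the patch.  It measures how many bytes a plain %d directive produces for a few values, which is the kind of minimum and maximum the counters have to cover when the argument is unknown.  The names formatted_length and demo_lengths are made up for illustration, and how the pass actually picks its likely and unlikely values is not modelled here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Number of bytes "%d" produces for VALUE, not counting the
   terminating nul.  Hypothetical helper for illustration only.  */
int
formatted_length (int value)
{
  /* With a null buffer and zero size, snprintf only computes the length.  */
  int n = snprintf (NULL, 0, "%d", value);
  assert (n >= 0);
  return n;
}

void
demo_lengths (void)
{
  /* Shortest possible output: a single digit.  */
  assert (formatted_length (0) == 1);

  /* A sign adds one byte.  */
  assert (formatted_length (-1) == 2);

  /* The most negative value needs the most bytes: all the digits of
     INT_MAX plus the minus sign.  */
  assert (formatted_length (INT_MIN) == formatted_length (INT_MAX) + 1);
}
```

A directive whose argument is an unknown int therefore writes anywhere between formatted_length (0) and formatted_length (INT_MIN) bytes.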

gcc-78703-4.diff


commit c9a95d19eb307b7df06c1285325b23746ddbc738
Author: Martin Sebor 
Date:   Sun Jan 22 11:48:13 2017 -0700

2017-01-22  Martin Sebor  

* gimple-ssa-sprintf.c (struct result_range): Add likely and
unlikely counters.
(struct format_result): Replace number_chars, number_chars_min,
and number_chars_max with a single member of struct result_range.
Remove bounded.
(format_result::operator+=): Adjust.
(struct fmtresult): Remove bounded.  Handle likely and unlikely
counters.
(fmtresult::adjust_for_width_or_precision): New function.
(fmtresult::type_max_digits): New function.
(bytes_remaining): Handle likely and unlikely counters.
(min_bytes_remaining): Remove.
(format_percent): Simplify.
(format_integer, format_floating): Set likely and unlikely counters.
(get_string_length, format_character, format_string): Same.
(format_plain, should_warn_p): New function.
(maybe_warn): Call should_warn_p.  Update diagnostic messages
and handle those for all directives, including plain strings.
(format_directive): Handle likely and unlikely counters.
Remove unnecessary quoting from diagnostics.  Add an informational
note.
(add_bytes): Remove.
(pass_sprintf_length::compute_format_length): Simplify.
(try_substitute_return_value): Handle likely and unlikely counters.

gcc/testsuite/
* gcc.dg/tree-ssa/builtin-snprintf-warn-2.c: Adjust.
* gcc.dg/tree-ssa/builtin-sprintf-2.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-5.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-4.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-6.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-9.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf.c: Same.

So I see the introduction of many

if (const OP object) expressions

Can you please fix those as an independent patch after #4 and #5 are 
installed on the trunk?  Consider that patch pre-approved, but please 
post it here for the historical record.


I think a regexp of paren followed by a constant would probably take you 
to them pretty quickly.


I'm going to trust that the actual computations, particularly in 
format_* are correct.  You know this stuff far better than I.


I probably would have tried to break this down further, but in the 
interests of getting this wrapped up, I've just slogged my way through 
it.  For the future, I would have tried to break each format_XXX patch 
out.  Should_warn_p and the simplifications it enabled would hav

Re: [PATCH] restore pedantic warning on flexible array members (c++/71290)

2017-01-25 Thread Jakub Jelinek
On Wed, Jan 25, 2017 at 10:02:23AM -0700, Martin Sebor wrote:
> --- gcc/cp/decl.c (revision 244844)
> +++ gcc/cp/decl.c (working copy)
> @@ -11798,6 +11798,17 @@ grokdeclarator (const cp_declarator *declarator,
> }
>   else 
> {
> + /* Array is a flexible member.  */
> + if (in_system_header_at (input_location))
> +   /* Do not warn flexible them in system headers because glibc
> +  uses them.  */;

The comment is weird.  Did you mean warn about them, or warn about
flexible array members or something similar?

Jakub


Re: [PATCH] restore pedantic warning on flexible array members (c++/71290)

2017-01-25 Thread Martin Sebor

On 01/25/2017 04:53 PM, Jakub Jelinek wrote:

On Wed, Jan 25, 2017 at 10:02:23AM -0700, Martin Sebor wrote:

--- gcc/cp/decl.c   (revision 244844)
+++ gcc/cp/decl.c   (working copy)
@@ -11798,6 +11798,17 @@ grokdeclarator (const cp_declarator *declarator,
  }
else
  {
+   /* Array is a flexible member.  */
+   if (in_system_header_at (input_location))
+ /* Do not warn flexible them in system headers because glibc
+uses them.  */;


The comment is weird.  Did you mean warn about them, or warn about
flexible array members or something similar?


It sure is.  I must have mangled it while copying the whole block
from the 5 branch.  I just fixed it.  Thanks for pointing it out!

Martin


Re: [PATCH] PR libstdc++/79190 add fallback aligned_alloc implementation

2017-01-25 Thread Jakub Jelinek
On Tue, Jan 24, 2017 at 06:33:51PM +, Jonathan Wakely wrote:
> --- a/libstdc++-v3/libsupc++/new_opa.cc
> +++ b/libstdc++-v3/libsupc++/new_opa.cc
> @@ -55,9 +55,30 @@ extern "C" void *memalign(std::size_t boundary, 
> std::size_t size);
>  #endif
>  #define aligned_alloc memalign
>  #else
> -// The C library doesn't provide any aligned allocation functions, declare
> -// aligned_alloc and get a link failure if aligned new is used.
> -extern "C" void *aligned_alloc(std::size_t, std::size_t);
> +// This is a modified version of code from gcc/config/i386/gmm_malloc.h
> +static inline void*
> +aligned_alloc (std::size_t al, std::size_t sz)
> +{
> +  // Alignment must be a power of two.
> +  if (al & (al - 1))
> +return nullptr;
> +  else if (!sz)
> +return nullptr;
> +
> +  // We need extra bytes to store the original value returned by malloc.
> +  if (al < sizeof(void*))
> +al = sizeof(void*);
> +  void* const malloc_ptr = malloc(sz + al);
> +  if (!malloc_ptr)
> +return nullptr;
> +  // Align to the requested value, leaving room for the original malloc 
> value.
> +  void* const aligned_ptr = (void *) (((size_t) malloc_ptr + al) & -al);

Shouldn't this be cast to uintptr_t rather than size_t?  On some targets
that is not the same thing, I think e.g. on m32c:
grep 'SIZE_TYPE\|UINTPTR_TYPE\|POINTER_SIZE\|INT_TYPE_SIZE' config/m32c/*
config/m32c/m32c.h:#define POINTER_SIZE (TARGET_A16 ? 16 : 32)
config/m32c/m32c.h:#define INT_TYPE_SIZE 16
config/m32c/m32c.h:#undef UINTPTR_TYPE
config/m32c/m32c.h:#define UINTPTR_TYPE (TARGET_A16 ? "unsigned int" : "long 
unsigned int")
config/m32c/m32c.h:#undef  SIZE_TYPE
config/m32c/m32c.h:#define SIZE_TYPE "unsigned int"
which means e.g. for -mcpu=m32c pointers are 24-bit, integers/size_t are
16-bit and uintptr_t is 32-bit, so if you cast a pointer to size_t, you'll
lose the upper 8 bits.
Also, for the arguments you use std::size_t, but not here, shouldn't that
be std::uintptr_t then?
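
A minimal C sketch of the suggestion, assuming the target provides uintptr_t (it is an optional type in the C standard).  This is not the libstdc++ code: fallback_aligned_alloc and fallback_aligned_free are hypothetical names, and the overflow check on sz + al is an addition of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fallback; returns SZ bytes aligned to AL, or NULL.  */
void *
fallback_aligned_alloc (size_t al, size_t sz)
{
  /* Alignment must be a nonzero power of two; reject empty requests.  */
  if (al == 0 || (al & (al - 1)) != 0 || sz == 0)
    return NULL;

  /* We need extra bytes to store the original value returned by malloc.  */
  if (al < sizeof (void *))
    al = sizeof (void *);
  if (sz > SIZE_MAX - al)
    return NULL;

  void *const malloc_ptr = malloc (sz + al);
  if (!malloc_ptr)
    return NULL;

  /* Do the address arithmetic in uintptr_t, which can hold any object
     pointer, instead of size_t, which may be narrower than a pointer.  */
  uintptr_t addr = ((uintptr_t) malloc_ptr + al) & ~((uintptr_t) al - 1);
  assert (addr % al == 0);
  void *const aligned_ptr = (void *) addr;

  /* Store the original malloc value just below the aligned block.  */
  ((void **) aligned_ptr)[-1] = malloc_ptr;
  return aligned_ptr;
}

/* Release a block obtained from fallback_aligned_alloc.  */
void
fallback_aligned_free (void *ptr)
{
  if (ptr)
    free (((void **) ptr)[-1]);
}
```

On a target where size_t is 16 bits and pointers are wider, the same expression written with a size_t cast would drop the upper address bits before masking, which is the failure described above.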

> +
> +  // Store the original malloc value where it can be found by operator 
> delete.
> +  ((void **) aligned_ptr)[-1] = malloc_ptr;
> +
> +  return aligned_ptr;
> +}
>  #endif
>  #endif
>  

Jakub


[PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings

2017-01-25 Thread Jeff Law
As has been discussed extensively, we're not doing a good job at 
simplifying overflow tests, particularly those which collapse down to an 
EQ/NE test.


x + -1 > x  -> x == 0
x + -1 < x  -> x != 0
x + 1 < x   -> x == -1U
x + 1 > x   -> x != -1U
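
As a sanity check on the four equivalences above, here is a short sketch for unsigned (wrapping) arithmetic.  It is not part of the patch; identities_hold and check_overflow_identities are illustrative names, and x + -1 is written as x - 1, which is the same value for an unsigned operand.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Return true if all four overflow-test equivalences hold for X.  */
bool
identities_hold (unsigned x)
{
  return ((x - 1 > x) == (x == 0))
	 && ((x - 1 < x) == (x != 0))
	 && ((x + 1 < x) == (x == UINT_MAX))
	 && ((x + 1 > x) == (x != UINT_MAX));
}

/* Check the boundary values, where the wraparound actually happens,
   and a few ordinary ones.  */
void
check_overflow_identities (void)
{
  const unsigned samples[] = { 0, 1, 2, 12345, UINT_MAX - 1, UINT_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (identities_hold (samples[i]));
}
```

Only x == 0 makes x - 1 wrap and only x == UINT_MAX makes x + 1 wrap, which is why each comparison collapses to an equality test against a single constant.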

The simplifications allow us to propagate a constant for X into one ARM 
of the associated IF/ELSE construct.  For C++ std::vector operations 
those propagations can eliminate lots of unnecessary code.


Those propagations also eliminate (by way of removing unnecessary code) 
false positive warnings for memset calls that come from std::vector 
operations.


This patch does two things.

1. It adds special case patterns to the A+CST CMP A pattern for cases 
where CST is 1 or -1 where the result turns into A EQ/NE 0 or A EQ/NE 
-1U.  These special patterns are applied regardless of the single_use 
status of the expression.


2. It adds a call to fold_stmt in simplify_cond_using_ranges.  This 
allows VRP to transform the code early and the first DOM pass to often 
see the simplified conditional and thus optimize better, rather than 
waiting for forwprop3 to simplify the conditional and the last DOM pass 
to optimize the code.


Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?

PR tree-optimization/79095
* match.pd (A + CST CMP A): Add special cases for CST of 1 or -1.
* tree-vrp.c (simplify_cond_using_ranges): Accept GSI rather than 
statement.
Callers changed.  Just fold the conditional if no other simplifications
were possible.

PR tree-optimization/79095
* g++.dg/pr79095: New test.
* gcc.c-torture/execute/arith-1.c: Test additional cases.

diff --git a/gcc/match.pd b/gcc/match.pd
index 7b96800..8178e9c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3019,15 +3019,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
ADD_OVERFLOW detection in tree-ssa-math-opts.c.
A + CST CMP A  ->  A CMP' CST' */
 (for cmp (lt le ge gt)
+ out_eqneq_zero (ne ne eq eq)
+ out_eqneq_m1 (eq eq ne ne)
  out (gt gt le le)
  (simplify
   (cmp:c (plus@2 @0 INTEGER_CST@1) @0)
   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
&& TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+   && wi::eq_p (@1, 1))
+   (out_eqneq_m1 @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
+  (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED)); })
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+   && wi::eq_p (@1, -1))
+   (out_eqneq_zero @0 { fold_convert (TREE_TYPE (@0), integer_zero_node) ; })
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
&& wi::ne_p (@1, 0)
&& single_use (@2))
(out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
-  (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }
+  (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))
 
 /* To detect overflow in unsigned A - B, A < B is simpler than A - B > A.
However, the detection logic for SUB_OVERFLOW in tree-ssa-math-opts.c
diff --git a/gcc/testsuite/g++.dg/pr79095.C b/gcc/testsuite/g++.dg/pr79095.C
new file mode 100644
index 000..edf3739
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr79095.C
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized -O2" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+struct S {
+  int *p0, *p1, *p2;
+
+  size_t size () const { return p1 - p0; }
+
+  void f (size_t n) {
+if (n > size ())   // can't happen because
+  foo (n - size ());   //   n is in [1, MIN(size() - 1, 3)]
+else if (n < size ())
+  bar (p0 + n);
+  }
+
+  void foo (size_t n)
+  {
+size_t left = (size_t)(p2 - p1);
+if (left >= n)
+  __builtin_memset (p2, 0, n * sizeof *p2); /*  { dg-bogus "maximum 
object" "false warning" } */
+  }
+
+  void bar (int*);
+};
+
+void f (S &s)
+{
+  size_t n = s.size ();
+  if (n > 1 && n < 5)
+s.f (n - 1);
+}
+
+
diff --git a/gcc/testsuite/gcc.c-torture/execute/arith-1.c 
b/gcc/testsuite/gcc.c-torture/execute/arith-1.c
index 58df322..6168d77 100644
--- a/gcc/testsuite/gcc.c-torture/execute/arith-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/arith-1.c
@@ -7,9 +7,41 @@ sat_add (unsigned i)
   return ret;
 }
 
+unsigned
+sat_add2 (unsigned i)
+{
+  unsigned ret = i + 1;
+  if (ret > i)
+return ret;
+  return i;
+}
+
+unsigned
+sat_add3 (unsigned i)
+{
+  unsigned ret = i - 1;
+  if (ret > i)
+ret = i;
+  return ret;
+}
+
+unsigned
+sat_add4 (unsigned i)
+{
+  unsigned ret = i - 1;
+  if (ret < i)
+return ret;
+  return i;
+}
 main ()
 {
   if (sat_add (~0U) != ~0U)
 abort ();
+  if (sat_add2 (~0U) != ~0U)
+abort ();
+  if (sat_add3 (0U) != 0U)
+abort ();
+  if (sat_add4 (0U) != 0U)
+abort ();
   exit (0);
 }
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d7d7a0d..b4b6d8a 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -9550,8 +9550,9 @@ range_fits_type_p (value_range *vr, unsigned 
dest_precision, signo

[PATCH], Fix PR target/79179, wrong constraint for power9

2017-01-25 Thread Michael Meissner
This patch fixes the wrong constraint (o instead of wY) for a STXSD instruction
on a vector extract operation that is optimizd to do a store if the element
being extracted is in the right position.

I have built a compiler with/with out the patch on a little endian power8
system.  The runs finished with no regressions.  Can I check it into trunk?

[gcc]
2017-01-25  Michael Meissner  

PR target/79179
* config/rs6000/vsx.md (vsx_extract__store): Use wY
constraint instead of o for the stxsd instruction.

[gcc/testsuite]
2017-01-25  Michael Meissner  

PR target/79179
* gcc.target/powerpc/pr79179.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 244906)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -2323,7 +2323,7 @@ (define_insn_and_split "*vsx_extract__store"
-  [(set (match_operand: 0 "memory_operand" "=m,Z,o")
+  [(set (match_operand: 0 "memory_operand" "=m,Z,wY")
(vec_select:
 (match_operand:VSX_D 1 "register_operand" "d,wv,wb")
 (parallel [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD")])))]
Index: gcc/testsuite/gcc.target/powerpc/pr79179.c
===
--- gcc/testsuite/gcc.target/powerpc/pr79179.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr79179.c  (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do assemble { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O3" } */
+
+/* Compile with -O3 -mcpu=power9.  It originally generated
+
+stxsd 12,1(9)
+
+   which is illegal.  */
+
+#pragma pack(1)
+struct {
+signed : 1;
+unsigned long a;
+} b;
+
+void c(void)
+{
+b.a = 0;
+for (; b.a <= 45; b.a = (long)b.a + 1)
+;
+}


Re: [PATCH,rs6000] Remove invalid P8V_BUILTIN_VEC_VGBBD entries

2017-01-25 Thread Segher Boessenkool
On Wed, Jan 25, 2017 at 01:52:34PM -0800, Carl E. Love wrote:
> After further discussion of the two P8V_BUILTIN_VGBBD built-ins that do
> not take any arguments, it was determined they should just be removed as
> they are not valid.
> 
> The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
> with no regressions.
> 
> Is the patch OK for trunk?  

Yes, thanks!


Segher


> 2017-01-24  Carl Love  
> 
> * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
> bogus entries for the P8V_BUILTIN_VEC_VGBBD built-ins.


Re: [PATCH, rs6000] Fix PR79160 (vsx-elemrev-4.c)

2017-01-25 Thread Segher Boessenkool
On Wed, Jan 25, 2017 at 04:41:06PM -0600, Bill Schmidt wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79160 records that
> gcc.target/powerpc/vsx-elemrev-4.c fails on powerpc64 big-endian.  The
> test was developed just prior to the introduction of D-form memory
> access instructions lxv and stxv, so it relied on output of X-form
> instructions lxvx and stxvx.  Either would be acceptable, but with the
> introduction of the D-form instructions, they are preferred by the code
> generator when they apply.  I've changed the test case to accept either
> the D-form or the X-form instructions.
> 
> Tested adn veriried on powerpc64-unknown-linux-gnu.  Ok for trunk?

I hope you did proofread the patch better than your email ;-)

> --- gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c  (revision 244824)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c  (working copy)
> @@ -3,8 +3,8 @@
>  /* { dg-options "-mcpu=power9 -O0" } */
>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>  /* { dg-skip-if "" { powerpc*-*-aix* } { "*" } { "" } } */
> -/* { dg-final { scan-assembler-times "lxvx" 40 } } */
> -/* { dg-final { scan-assembler-times "stxvx" 40 } } */
> +/* { dg-final { scan-assembler-times "lxv" 40 } } */
> +/* { dg-final { scan-assembler-times "stxv" 40 } } */

Please add a comment saying this is meant to match either lxv or lxvx,
etc.  Okay with that added.  Thanks,


Segher
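
The reason the shorter pattern accepts both forms is plain substring
matching: "lxv" also occurs inside "lxvx", and "stxv" inside "stxvx".  The
sketch below mimics that with a simple substring counter.  It is only a
stand-in written for illustration (count_matches is a made-up name); the real
scan-assembler-times directive counts regular-expression matches in the
generated assembly.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for a scan with a plain-string pattern:
   count the non-overlapping occurrences of NEEDLE in TEXT.  */
static size_t count_matches(const char *text, const char *needle)
{
    size_t count = 0;
    size_t len = strlen(needle);

    if (len == 0)
        return 0;

    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle))
        count++;

    return count;
}
```

For an assembly fragment containing one lxv and one lxvx, counting "lxv"
gives 2 while counting "lxvx" gives 1, which is why the relaxed pattern
passes whichever form the code generator picks.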


Re: [PATCH] Use fld b; fld a; instead of fld a; fld b; fxch %st(1) in reg-stack (PR target/70465)

2017-01-25 Thread Jeff Law

On 01/25/2017 03:01 PM, Jakub Jelinek wrote:

On Wed, Jan 25, 2017 at 02:43:34PM -0700, Jeff Law wrote:

2017-01-25  Jakub Jelinek  

PR target/70465
* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxch %st(1);
emit fld b; fld a; if possible.

* gcc.target/i386/pr70465.c: New test.

So please comment on the general approach you're taking here.  I have a
pretty good sense of what you're doing, mostly because I pondered something
similar.  But I doubt others coming across the code would see the overall
structure as quickly.


Does the following updated patch explain it sufficiently?


+ if (i2 != NULL_RTX
+ && (i2set = single_set (i2)) != NULL_RTX)
+   {
+ /* Instead of fld a; fld b; fxch %st(1); just
+use fld b; fld a; if possible.  */
+ rtx i2dest = *get_true_reg (&SET_DEST (i2set));
+ if (REG_P (i2dest)
+ && REGNO (i2dest) == FIRST_STACK_REG
+ && MEM_P (SET_SRC (i2set))
+ && !side_effects_p (SET_SRC (i2set))
+ && !modified_between_p (SET_SRC (i1set), i2, i1))

And here we're trying to verify that the insn found above (I2) is pushing
another value onto the FP register stack and that the value in I2 is not
modified between I1 and I2.


No, that last call (as I've tried to explain in the new comment) wants
to ensure that there are no stores in between i2 and i1 that might
alias with the second load's memory (then it would be invalid to move it
before i2) and that the address of the memory doesn't depend on something
set after i2.

2017-01-25  Jakub Jelinek  

PR target/70465
* reg-stack.c (emit_swap_insn): Instead of fld a; fld b; fxch %st(1);
emit fld b; fld a; if possible.

* gcc.target/i386/pr70465.c: New test.

Yes. Thanks for the comments.  Ok for the trunk.

jeff
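
The equivalence the patch relies on can be seen in a toy model of the x87
register stack.  This is only a sketch for illustration: struct fpstack, fld
and fxch1 are made-up names, not reg-stack.c code, and the model ignores the
conditions discussed above (both loads come from memory without side effects,
and nothing between the two insns stores to or changes the address of the
second load's memory).

```c
#include <stddef.h>

/* Toy model of the x87 register stack: st[0] plays the role of %st(0).
   The caller must keep depth below 8 before pushing.  */
struct fpstack {
    double st[8];
    size_t depth;
};

/* Model of "fld v": push V, moving every existing entry one slot down.  */
static void fld(struct fpstack *s, double v)
{
    for (size_t i = s->depth; i > 0; i--)
        s->st[i] = s->st[i - 1];
    s->st[0] = v;
    s->depth++;
}

/* Model of "fxch %st(1)": swap the two topmost entries.  */
static void fxch1(struct fpstack *s)
{
    double tmp = s->st[0];
    s->st[0] = s->st[1];
    s->st[1] = tmp;
}
```

Starting from an empty stack, fld a; fld b; fxch1 leaves a in st[0] and b in
st[1], which is exactly what fld b; fld a produces with one instruction fewer.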



Re: [PATCH], Fix PR target/79179, wrong constraint for power9

2017-01-25 Thread Segher Boessenkool
On Wed, Jan 25, 2017 at 08:03:04PM -0500, Michael Meissner wrote:
> This patch fixes the wrong constraint (o instead of wY) for a STXSD instruction
> on a vector extract operation that is optimized to do a store if the element
> being extracted is in the right position.
> 
> I have built a compiler with and without the patch on a little endian power8
> system.  The runs finished with no regressions.  Can I check it into trunk?

Yes please.  Thanks,


Segher


> 2017-01-25  Michael Meissner  
> 
>   PR target/79179
>   * config/rs6000/vsx.md (vsx_extract_<mode>_store): Use wY
>   constraint instead of o for the stxsd instruction.
> 
> [gcc/testsuite]
> 2017-01-25  Michael Meissner  
> 
>   PR target/79179
>   * gcc.target/powerpc/pr79179.c: New test.


Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings

2017-01-25 Thread Marc Glisse

On Wed, 25 Jan 2017, Jeff Law wrote:

As has been discussed extensively, we're not doing a good job at simplifying 
overflow tests, particularly those which collapse down to an EQ/NE test.


x + -1 > x  -> x == 0
x + -1 < x  -> x != 0
x + 1 < x   -> x == -1U
x + 1 > x   -> x != -1U
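
The four rewrites above hold for unsigned wrap-around arithmetic, which can
be checked directly in C.  The function below is just a sanity-check sketch
(rewrites_agree is a name invented for this note, not something from the
patch); it compares each original test against its simplified form for one
value of x.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when, for this X, each overflow test on the left of the
   list above gives the same answer as its simplified EQ/NE form.
   Unsigned arithmetic wraps, so x + -1u is UINT_MAX when x is 0 and
   x + 1u is 0 when x is UINT_MAX (that is, -1U).  */
static bool rewrites_agree(unsigned x)
{
    return ((x + -1u > x) == (x == 0))
        && ((x + -1u < x) == (x != 0))
        && ((x + 1u < x) == (x == UINT_MAX))
        && ((x + 1u > x) == (x != UINT_MAX));
}
```

Calling it on the boundary values 0, 1, UINT_MAX - 1 and UINT_MAX covers the
only inputs where the wrapped result differs from the mathematical one.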

The simplifications allow us to propagate a constant for X into one ARM of 
the associated IF/ELSE construct.  For C++ std::vector operations those 
propagations can eliminate lots of unnecessary code.


Those propagations also eliminate (by way of removing unnecessary code) false 
positive warnings for memset calls that come from std::vector operations.


This patch does two things.

1. It adds special case patterns to the A+CST CMP A pattern for cases where 
CST is 1 or -1 where the result turns into A EQ/NE 0 or A EQ/NE -1U.  These 
special patterns are applied regardless of the single_use status of the 
expression.


2. It adds a call to fold_stmt in simplify_cond_using_ranges.  This allows 
VRP to transform the code early and the first DOM pass to often see the 
simplified conditional and thus optimize better, rather than waiting for 
forwprop3 to simplify the conditional and the last DOM pass to optimize the 
code.


Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?


I assume this causes a regression for code like

unsigned f(unsigned a){
  unsigned b=a+1;
  if(b<a)return 42;
  return b;
}

? On the other hand, the optimization is already very fragile; if I write 
b<=a (which is equivalent since 1 != 0), it doesn't apply.


We currently get
        addl    $1, %edi
        movl    $42, %eax
        cmovnc  %edi, %eax
or almost as good with b==0
        movl    %edi, %eax
        movl    $42, %edx
        addl    $1, %eax
        cmove   %edx, %eax
while with a==-1 we have the redundant comparison
        leal    1(%rdi), %eax
        cmpl    $-1, %edi
        movl    $42, %edx
        cmove   %edx, %eax

Simplifying x + 1 < x to x + 1 == 0 might not be enough to simplify your 
examples though I guess?
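
For reference, the three spellings of the test that the assembly listings
above correspond to can be written side by side.  This is a sketch under the
assumption, suggested by the listings, that the function returns 42 when
a + 1 wraps and the incremented value otherwise; the names f_lt, f_zero and
f_max are invented here.

```c
#include <limits.h>

/* Overflow test written as b < a (the form that yields add + cmovnc).  */
static unsigned f_lt(unsigned a)
{
    unsigned b = a + 1;
    if (b < a)
        return 42;
    return b;
}

/* Same test written against the result: b == 0.  */
static unsigned f_zero(unsigned a)
{
    unsigned b = a + 1;
    if (b == 0)
        return 42;
    return b;
}

/* Same test written against the input: a == -1U.  */
static unsigned f_max(unsigned a)
{
    unsigned b = a + 1;
    if (a == UINT_MAX)
        return 42;
    return b;
}
```

All three return the same value for every input, so the difference discussed
here is purely about which form lets the backend reuse the flags set by the
addition.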


--
Marc Glisse


Re: [PATCH] BRIG frontend: request for a global review

2017-01-25 Thread Pekka Jääskeläinen
On Wed, Jan 25, 2017 at 6:07 PM, Thomas Schwinge wrote:
> Hi!
>
> On Wed, 25 Jan 2017 13:21:13 +0100, Jakub Jelinek  wrote:
>> On Wed, Jan 25, 2017 at 11:00:50AM +0100, Thomas Schwinge wrote:
>> > On Tue, 24 Jan 2017 13:52:10 +0100, Martin Jambor  wrote:
>> > > [BRIG front end]
>
> $ git grep --cached libbrig
> gcc/brig/config-lang.in:target_libs="target-libbrig target-libhsail-rt"
>
> What is "libbrig"; should we remove that (as far as I can tell?) stale
> reference?

Yes, a leftover that can be removed.

> $ git show 55a56509bb4ae0c844c27f0679a22844bed3a3c5 -- libhsail-rt/README | filterdiff
> --- /dev/null
> +++ libhsail-rt/README
> @@ -0,0 +1,4 @@
> +Run autoconf2.64 && automake-1.11  to regenerate the buildfiles.
> +You might need to manually tweak the minor automake version number
> +in configure.ac and aclocal.m4 (search for 1.11.6) in case your
> +local 1.11 minor version doesn't match.
> \ No newline at end of file
>
> I don't understand that "manually tweak" comment -- you should just
> install/build the right versions, and run "PATH=[...]:$PATH autoreconf",
> which is the same for all GCC subdirectories.

OK. I'll remove that. IIRC, I had some difficulties with getting the
exact minor versions of autotools working together, and found out that
the minor version didn't matter here, so left this as a note.

> Instead, the README file should contain a note what the "libhsail-rt"
> directory is about.

OK, I will add a note.

> $ git show 55a56509bb4ae0c844c27f0679a22844bed3a3c5 -- gcc/builtin-types.def | filterdiff --hunks=1
> diff --git gcc/builtin-types.def gcc/builtin-types.def
> index 91745b4..ee6d052 100644
> --- gcc/builtin-types.def
> +++ gcc/builtin-types.def
> @@ -67,7 +67,10 @@ DEF_PRIMITIVE_TYPE (BT_LONGLONG, long_long_integer_type_node)
>  DEF_PRIMITIVE_TYPE (BT_ULONGLONG, long_long_unsigned_type_node)
>  DEF_PRIMITIVE_TYPE (BT_INTMAX, intmax_type_node)
>  DEF_PRIMITIVE_TYPE (BT_UINTMAX, uintmax_type_node)
> -DEF_PRIMITIVE_TYPE (BT_UINT16, uint16_type_node)
> +DEF_PRIMITIVE_TYPE (BT_INT8, signed_char_type_node)
> +DEF_PRIMITIVE_TYPE (BT_INT16, short_integer_type_node)
> +DEF_PRIMITIVE_TYPE (BT_UINT8, char_type_node)
> +DEF_PRIMITIVE_TYPE (BT_UINT16, short_unsigned_type_node)
>  DEF_PRIMITIVE_TYPE (BT_UINT32, uint32_type_node)
>  DEF_PRIMITIVE_TYPE (BT_UINT64, uint64_type_node)
>  DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 1))
>
> Is that change alright?  For instance, uint16_type_node is still used
> elsewhere.  Some of these intN/uintN type_nodes apparently don't exist as
> global_trees; should they, and then be referred to here instead of the
> C-like type_nodes?

Yes, it makes sense. I will fix and test this.

> The "News" section on , and
>  should also be updated, I guess?
> :-)

Yes, of course. I will provide text.

> I suppose that contrib/update-copyright.py also needs to be updated?  (I
> never looked into that, so don't know.)

Does it? The files are (c) FSF now. What should I do here exactly?

BR,
Pekka