date:20180416

[patch][i386] PR84331.

2018-04-16 Thread Makhotina, Olga

Hi,

This patch fixes g++.dg/ext/mv16.C test with -march=native.

gcc/
PR target/84331
* gcc/config.gcc: Support "skylake".
* gcc/config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_SKYLAKE.
* gcc/config/i386/i386.c (m_SKYLAKE): Define.
(processor_target_table): Add "skylake".
(ix86_option_override_internal): Add "skylake".
(get_builtin_code_for_version): Handle PROCESSOR_SKYLAKE,
PROCESSOR_CANNONLAKE.
(get_builtin_code_for_version): Fix priority for 
PROCESSOR_ICELAKE_CLIENT,
PROCESSOR_ICELAKE_SERVER, PROCESSOR_SKYLAKE-AVX512.
* gcc/config/i386/i386.h (processor_costs): Define TARGET_SKYLAKE.
(processor_type): Add PROCESSOR_SKYLAKE.

gcc/testsuite/
PR target/84331
* gcc/testsuite/gcc.target/i386/funcspec-56.inc: Test arch=skylake.

Is it OK?

Thanks.


0001-fix-mv16.c.patch
Description: 0001-fix-mv16.c.patch

[PATCH] DWARF sort longer dirs before shorter ones in directory table.

2018-04-16 Thread Mark Wielaard

When gcc dwarf2out generates the .debug_line table itself (for example
when generating one for a split DWARF .dwo) it uses natural sorting for
the directory table. Longer directory paths come before shorter directory
paths with the same prefix. This causes the files in the line table to
pick the shorter dir. Creating slightly ineffecient line tables because
the longer directory paths will never be used.

Fix this by changing file_info_cmp () to pick longer directory prefixes
before shorter ones. We still sort files (the compilation unit) without
any directory path before all entries with a directory path, so they
will still use dir entry 0 (the working directory).

A hello.c program would get the following dir and line table before:

Directory table:
 /opt/local/install/gcc/lib/gcc/x86_64-pc-linux-gnu/8.0.1/include
 /usr/include
 /usr/include/bits

File name table:
 Entry Dir   Time  Size  Name
 1 0 0 0 hello.c
 2 1 0 0 stddef.h
 3 2 0 0 bits/types.h
 4 2 0 0 libio.h
 5 2 0 0 stdio.h
 6 2 0 0 bits/sys_errlist.h

Note that the last directory table entry is never used.
After this patch it looks as follows:

Directory table:
 /opt/local/install/gcc/lib/gcc/x86_64-pc-linux-gnu/8.0.1/include
 /usr/include/bits
 /usr/include

File name table:
 Entry Dir   Time  Size  Name
 1 0 0 0 hello.c
 2 1 0 0 stddef.h
 3 2 0 0 types.h
 4 3 0 0 libio.h
 5 3 0 0 stdio.h
 6 2 0 0 sys_errlist.h

Which is similar to what gas would output.

gcc/ChangeLog:

* dwarf2out.c (file_info_cmp): Sort longer dir prefixes before
shorter ones.
---

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 620e669..d3d925d 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -11955,7 +11955,9 @@ file_info_cmp (const void *p1, const void *p2)
  we return consistent values to qsort since some will get confused if
  we return the same value when identical operands are passed in opposite
  orders.  So if neither has a directory, return 0 and otherwise return
- 1 or -1 depending on which one has the directory.  */
+ 1 or -1 depending on which one has the directory.  We want the one with
+ the directory to sort after the one without, so all no directory files
+ are at the start (normally only the compilation unit file).  */
   if ((s1->path == s1->fname || s2->path == s2->fname))
 return (s2->path == s2->fname) - (s1->path == s1->fname);
 
@@ -11966,11 +11968,12 @@ file_info_cmp (const void *p1, const void *p2)
 {
   ++cp1;
   ++cp2;
-  /* Reached the end of the first path?  If so, handle like above.  */
+  /* Reached the end of the first path?  If so, handle like above,
+but now we want longer directory prefixes before shorter ones.  */
   if ((cp1 == (const unsigned char *) s1->fname)
  || (cp2 == (const unsigned char *) s2->fname))
-   return ((cp2 == (const unsigned char *) s2->fname)
-   - (cp1 == (const unsigned char *) s1->fname));
+   return ((cp1 == (const unsigned char *) s1->fname)
+   - (cp2 == (const unsigned char *) s2->fname));
 
   /* Character of current path component the same?  */
   else if (*cp1 != *cp2)

Re: [patch][i386] PR84331.

2018-04-16 Thread Uros Bizjak

On Mon, Apr 16, 2018 at 9:07 AM, Makhotina, Olga
 wrote:
> Hi,
>
> This patch fixes g++.dg/ext/mv16.C test with -march=native.
>
> gcc/
> PR target/84331
> * gcc/config.gcc: Support "skylake".
> * gcc/config/i386/i386-c.c (ix86_target_macros_internal): Handle
> PROCESSOR_SKYLAKE.
> * gcc/config/i386/i386.c (m_SKYLAKE): Define.
> (processor_target_table): Add "skylake".
> (ix86_option_override_internal): Add "skylake".
> (get_builtin_code_for_version): Handle PROCESSOR_SKYLAKE,
> PROCESSOR_CANNONLAKE.
> (get_builtin_code_for_version): Fix priority for 
> PROCESSOR_ICELAKE_CLIENT,
> PROCESSOR_ICELAKE_SERVER, PROCESSOR_SKYLAKE-AVX512.
> * gcc/config/i386/i386.h (processor_costs): Define TARGET_SKYLAKE.
> (processor_type): Add PROCESSOR_SKYLAKE.
>
> gcc/testsuite/
> PR target/84331
> * gcc/testsuite/gcc.target/i386/funcspec-56.inc: Test arch=skylake.
>
> Is it OK?

OK.

Thanks,
Uros.

[PATCH] Make Filesystem TS tests pass in C++17 mode

2018-04-16 Thread Jonathan Wakely

The  header defaults to using std::filesystem in C++17
mode. The Filesystem TS tests need to define the macro that causes
std::experimental::filesystem to be used instead.

This fixes the experimental/filesystem FAILs seen at:
https://gcc.gnu.org/ml/gcc-testresults/2018-04/msg00986.html

Tested powerp64le-linux, committed to trunk.
commit 7a758b36da1f741955bf5fe3893dc32262bd4f23
Author: Jonathan Wakely 
Date:   Sun Apr 15 20:07:44 2018 +0100

Make Filesystem TS tests pass in C++17 mode

The  header defaults to using std::filesystem in C++17
mode. The Filesystem TS tests need to define the macro that causes
std::experimental::filesystem to be used instead.

* testsuite/experimental/filesystem/file_status/1.cc: Add
-DUSE_FILESYSTEM_TS to dg-options.
* testsuite/experimental/filesystem/iterators/directory_iterator.cc:
Likewise.
* testsuite/experimental/filesystem/iterators/pop.cc: Likewise.
* testsuite/experimental/filesystem/iterators/
recursive_directory_iterator.cc: Likewise.
* testsuite/experimental/filesystem/operations/absolute.cc: 
Likewise.
* testsuite/experimental/filesystem/operations/canonical.cc: 
Likewise.
* testsuite/experimental/filesystem/operations/copy.cc: Likewise.
* testsuite/experimental/filesystem/operations/copy_file.cc: 
Likewise.
* 
testsuite/experimental/filesystem/operations/create_directories.cc:
Likewise.
* testsuite/experimental/filesystem/operations/create_directory.cc:
Likewise.
* testsuite/experimental/filesystem/operations/create_symlink.cc:
Likewise.
* testsuite/experimental/filesystem/operations/current_path.cc:
Likewise.
* testsuite/experimental/filesystem/operations/equivalent.cc: 
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc: Likewise.
* testsuite/experimental/filesystem/operations/file_size.cc: 
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc: 
Likewise.
* testsuite/experimental/filesystem/operations/last_write_time.cc:
Likewise.
* testsuite/experimental/filesystem/operations/permissions.cc:
Likewise.
* testsuite/experimental/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove.cc: Likewise.
* testsuite/experimental/filesystem/operations/remove_all.cc: 
Likewise.
* testsuite/experimental/filesystem/operations/status.cc: Likewise.
* 
testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/path/append/path.cc: Likewise.
* testsuite/experimental/filesystem/path/assign/assign.cc: Likewise.
* testsuite/experimental/filesystem/path/assign/copy.cc: Likewise.
* testsuite/experimental/filesystem/path/compare/compare.cc: 
Likewise.
* testsuite/experimental/filesystem/path/compare/path.cc: Likewise.
* testsuite/experimental/filesystem/path/compare/strings.cc: 
Likewise.
* testsuite/experimental/filesystem/path/concat/path.cc: Likewise.
* testsuite/experimental/filesystem/path/concat/strings.cc: 
Likewise.
* testsuite/experimental/filesystem/path/construct/copy.cc: 
Likewise.
* testsuite/experimental/filesystem/path/construct/default.cc:
Likewise.
* testsuite/experimental/filesystem/path/construct/locale.cc: 
Likewise.
* testsuite/experimental/filesystem/path/construct/range.cc: 
Likewise.
* testsuite/experimental/filesystem/path/construct/string_view.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/extension.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/filename.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/parent_path.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/relative_path.cc:
Likewise.
* 
testsuite/experimental/filesystem/path/decompose/root_directory.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/root_name.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/root_path.cc:
Likewise.
* testsuite/experimental/filesystem/path/decompose/stem.cc: 
Likewise.
* testsuite/experimental/filesystem/path/generic/generic_string.cc:
Likewise.
* testsuite/experimental/filesystem/path/itr/traversal.cc: Likewise.
* testsuite/experimental/filesystem/path/modifiers/clear.cc: 
Likewise.
* 
testsuite/experimen

Re: [PATCH] DWARF sort longer dirs before shorter ones in directory table.

2018-04-16 Thread Jakub Jelinek

On Mon, Apr 16, 2018 at 09:34:17AM +0200, Mark Wielaard wrote:
> gcc/ChangeLog:
> 
>   * dwarf2out.c (file_info_cmp): Sort longer dir prefixes before
>   shorter ones.

Ok, thanks.

Jakub

Re: [PATCH i386: Check error_mark_node in multiversioning

2018-04-16 Thread Uros Bizjak

On Sun, Apr 15, 2018 at 12:55 PM, H.J. Lu  wrote:
> Since CET is applied to the whole program, it is correct to disallow
> -fcf-protection=full without -mcet.  But compiler shouldn't crash.

I don't think this is correct approach. If CET appleis to the whole
program, then it shouldn't be affected by target attributes (similar
how -m64 can't be switched by a target attribute).

Uros.

> OK for trunk?
>
> H.J.
> 
> gcc/
>
> PR target/85403
> * config/i386/i386.c (get_builtin_code_for_version): Check
> error_mark_node.
>
> gcc/testsuite/
>
> PR target/85403
> * g++.dg/ext/mv1.C: Compile with -fcf-protection=none.
> * g++.dg/ext/mv14.C: Likewise.
> * g++.dg/ext/mv15.C: Likewise.
> * g++.dg/ext/mv16.C: Likewise.
> * g++.dg/ext/mv17.C: Likewise.
> * g++.dg/ext/mv18.C: Likewise.
> * g++.dg/ext/mv19.C: Likewise.
> * g++.dg/ext/mv20.C: Likewise.
> * g++.dg/ext/mv21.C: Likewise.
> * g++.dg/ext/mv22.C: Likewise.
> * g++.dg/ext/mv23.C: Likewise.
> * g++.dg/ext/mv26.C: Likewise.
> * g++.dg/ext/mv6.C: Likewise.
> * g++.dg/ext/mvc1.C: Likewise.
> * gcc.target/i386/cet-notrack-icf-1.c: Likewise.
> * gcc.target/i386/cet-notrack-icf-3.c: Likewise.
> * gcc.target/i386/cet-property-2.c: Likewise.
> * gcc.target/i386/mvc1.c: Likewise.
> * gcc.target/i386/mvc10.c: Likewise.
> * gcc.target/i386/mvc11.c: Likewise.
> * gcc.target/i386/mvc6.c: Likewise.
> * gcc.target/i386/mvc7.c: Likewise.
> * gcc.target/i386/mvc8.c: Likewise.
> * gcc.target/i386/mvc9.c: Likewise.
> * gcc.target/i386/pr81213.c: Likewise.
> * gcc.target/i386/pr81214.c: Likewise.
> * gcc.target/i386/sse-26.c: Likewise.
> * gcc.target/i386/pr85403.c: New test.
> ---
>  gcc/config/i386/i386.c|  2 ++
>  gcc/testsuite/g++.dg/ext/mv1.C|  2 +-
>  gcc/testsuite/g++.dg/ext/mv14.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv15.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv16.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv17.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv18.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv19.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv20.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv21.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv22.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv23.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv26.C   |  1 +
>  gcc/testsuite/g++.dg/ext/mv6.C|  2 +-
>  gcc/testsuite/g++.dg/ext/mvc1.C   |  1 +
>  gcc/testsuite/gcc.target/i386/cet-notrack-icf-1.c |  2 +-
>  gcc/testsuite/gcc.target/i386/cet-notrack-icf-3.c |  2 +-
>  gcc/testsuite/gcc.target/i386/cet-property-2.c|  2 +-
>  gcc/testsuite/gcc.target/i386/mvc1.c  |  1 +
>  gcc/testsuite/gcc.target/i386/mvc10.c |  1 +
>  gcc/testsuite/gcc.target/i386/mvc11.c |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc6.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc7.c  |  1 +
>  gcc/testsuite/gcc.target/i386/mvc8.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc9.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr81213.c   |  1 +
>  gcc/testsuite/gcc.target/i386/pr81214.c   |  1 +
>  gcc/testsuite/gcc.target/i386/pr85403.c   | 10 ++
>  gcc/testsuite/gcc.target/i386/sse-26.c|  2 +-
>  29 files changed, 39 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr85403.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 6fa5b0add02..8a73fc0d316 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -32344,6 +32344,8 @@ get_builtin_code_for_version (tree decl, tree 
> *predicate_list)
>   &global_options_set);
>
>gcc_assert (target_node);
> +  if (target_node == error_mark_node)
> +   return 0;
>new_target = TREE_TARGET_OPTION (target_node);
>gcc_assert (new_target);
>
> diff --git a/gcc/testsuite/g++.dg/ext/mv1.C b/gcc/testsuite/g++.dg/ext/mv1.C
> index 4eedbff7b23..af81f3aa056 100644
> --- a/gcc/testsuite/g++.dg/ext/mv1.C
> +++ b/gcc/testsuite/g++.dg/ext/mv1.C
> @@ -1,7 +1,7 @@
>  /* Test case to check if Multiversioning works.  */
>  /* { dg-do run { target i?86-*-* x86_64-*-* } } */
>  /* { dg-require-ifunc "" }  */
> -/* { dg-options "-O2 -fPIC" } */
> +/* { dg-options "-O2 -fPIC -fcf-protection=none" } */
>
>  #include 
>
> diff --git a/gcc/testsuite/g++.dg/ext/mv14.C b/gcc/testsuite/g++.dg/ext/mv14.C
> index 1e7a1619698..82b153b7ebb 100644
> --- a/gcc/testsuite/g++.dg/ext/mv14.C
> +++ b/gcc/testsuite/g++.dg/ext/mv14.C
> @@ -1,7 +1,7 @@
>

Re: [PATCH i386: Check error_mark_node in multiversioning

2018-04-16 Thread Uros Bizjak

On Sun, Apr 15, 2018 at 12:55 PM, H.J. Lu  wrote:
> Since CET is applied to the whole program, it is correct to disallow
> -fcf-protection=full without -mcet.  But compiler shouldn't crash.
>
> OK for trunk?
>
> H.J.
> 
> gcc/
>
> PR target/85403
> * config/i386/i386.c (get_builtin_code_for_version): Check
> error_mark_node.
>
> gcc/testsuite/
>
> PR target/85403
> * g++.dg/ext/mv1.C: Compile with -fcf-protection=none.
> * g++.dg/ext/mv14.C: Likewise.
> * g++.dg/ext/mv15.C: Likewise.
> * g++.dg/ext/mv16.C: Likewise.
> * g++.dg/ext/mv17.C: Likewise.
> * g++.dg/ext/mv18.C: Likewise.
> * g++.dg/ext/mv19.C: Likewise.
> * g++.dg/ext/mv20.C: Likewise.
> * g++.dg/ext/mv21.C: Likewise.
> * g++.dg/ext/mv22.C: Likewise.
> * g++.dg/ext/mv23.C: Likewise.
> * g++.dg/ext/mv26.C: Likewise.
> * g++.dg/ext/mv6.C: Likewise.
> * g++.dg/ext/mvc1.C: Likewise.
> * gcc.target/i386/cet-notrack-icf-1.c: Likewise.
> * gcc.target/i386/cet-notrack-icf-3.c: Likewise.
> * gcc.target/i386/cet-property-2.c: Likewise.
> * gcc.target/i386/mvc1.c: Likewise.
> * gcc.target/i386/mvc10.c: Likewise.
> * gcc.target/i386/mvc11.c: Likewise.
> * gcc.target/i386/mvc6.c: Likewise.
> * gcc.target/i386/mvc7.c: Likewise.
> * gcc.target/i386/mvc8.c: Likewise.
> * gcc.target/i386/mvc9.c: Likewise.
> * gcc.target/i386/pr81213.c: Likewise.
> * gcc.target/i386/pr81214.c: Likewise.
> * gcc.target/i386/sse-26.c: Likewise.
> * gcc.target/i386/pr85403.c: New test.

i386.c error-recovery change and pr85403.c testcase are OK. Please do
not change other tests.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.c|  2 ++
>  gcc/testsuite/g++.dg/ext/mv1.C|  2 +-
>  gcc/testsuite/g++.dg/ext/mv14.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv15.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv16.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv17.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv18.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv19.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv20.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv21.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv22.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv23.C   |  2 +-
>  gcc/testsuite/g++.dg/ext/mv26.C   |  1 +
>  gcc/testsuite/g++.dg/ext/mv6.C|  2 +-
>  gcc/testsuite/g++.dg/ext/mvc1.C   |  1 +
>  gcc/testsuite/gcc.target/i386/cet-notrack-icf-1.c |  2 +-
>  gcc/testsuite/gcc.target/i386/cet-notrack-icf-3.c |  2 +-
>  gcc/testsuite/gcc.target/i386/cet-property-2.c|  2 +-
>  gcc/testsuite/gcc.target/i386/mvc1.c  |  1 +
>  gcc/testsuite/gcc.target/i386/mvc10.c |  1 +
>  gcc/testsuite/gcc.target/i386/mvc11.c |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc6.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc7.c  |  1 +
>  gcc/testsuite/gcc.target/i386/mvc8.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc9.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr81213.c   |  1 +
>  gcc/testsuite/gcc.target/i386/pr81214.c   |  1 +
>  gcc/testsuite/gcc.target/i386/pr85403.c   | 10 ++
>  gcc/testsuite/gcc.target/i386/sse-26.c|  2 +-
>  29 files changed, 39 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr85403.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 6fa5b0add02..8a73fc0d316 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -32344,6 +32344,8 @@ get_builtin_code_for_version (tree decl, tree 
> *predicate_list)
>   &global_options_set);
>
>gcc_assert (target_node);
> +  if (target_node == error_mark_node)
> +   return 0;
>new_target = TREE_TARGET_OPTION (target_node);
>gcc_assert (new_target);
>
> diff --git a/gcc/testsuite/g++.dg/ext/mv1.C b/gcc/testsuite/g++.dg/ext/mv1.C
> index 4eedbff7b23..af81f3aa056 100644
> --- a/gcc/testsuite/g++.dg/ext/mv1.C
> +++ b/gcc/testsuite/g++.dg/ext/mv1.C
> @@ -1,7 +1,7 @@
>  /* Test case to check if Multiversioning works.  */
>  /* { dg-do run { target i?86-*-* x86_64-*-* } } */
>  /* { dg-require-ifunc "" }  */
> -/* { dg-options "-O2 -fPIC" } */
> +/* { dg-options "-O2 -fPIC -fcf-protection=none" } */
>
>  #include 
>
> diff --git a/gcc/testsuite/g++.dg/ext/mv14.C b/gcc/testsuite/g++.dg/ext/mv14.C
> index 1e7a1619698..82b153b7ebb 100644
> --- a/gcc/testsuite/g++.dg/ext/mv14.C
> +++ b/gcc/testsuite/g++.dg/ext/mv14.C
> @@ -1,7 +1,7 @@
>  /* Test case to check if Multiversioning works.  */
>  /* { dg-do run { target i?8

[PATCH, openacc, PR85411] Move GOMP_OPENACC_DIM parsing out of nvptx plugin

2018-04-16 Thread Tom de Vries


Hi,

this patch moves the parsing of the GOMP_OPENACC_DIM environment 
variable from the nvptx target plugin to the libgomp library.


The variable is not part of the OpenACC standard, but it is specific for 
the gcc implementation of OpenACC, so it makes sense to share the part 
handling the parsing, rather than having each target plugin duplicate it.


Build on x86_64 with nvptx accelerator and reg-tested libgomp.

OK for stage1?

Thanks,
- Tom
[openacc] Move GOMP_OPENACC_DIM parsing out of nvptx plugin

2018-04-15  Tom de Vries  

	PR libgomp/85411
	* plugin/plugin-nvptx.c (notify_var): Remove no longer used function.
	(nvptx_exec): Move parsing of
	GOMP_OPENACC_DIM ...
	* env.c (parse_gomp_openacc_dim): ... here.  New function.
	(initialize_env): Call parse_gomp_openacc_dim.
	(goacc_default_dims): Define.
	* libgomp.h (goacc_default_dims): Declare.
	* oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): New function.
	* oacc-plugin.h (GOMP_PLUGIN_acc_default_dim): Declare.
	* libgomp.map: New version "GOMP_PLUGIN_1.2". Add
	GOMP_PLUGIN_acc_default_dim.
	* testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/loop-default.h: New test.

---
 libgomp/env.c  |  32 +
 libgomp/libgomp.h  |   2 +
 libgomp/libgomp.map|   5 +
 libgomp/oacc-plugin.c  |  11 ++
 libgomp/oacc-plugin.h  |   1 +
 libgomp/plugin/plugin-nvptx.c  |  38 +-
 .../loop-default-runtime.c |  16 +++
 .../libgomp.oacc-c-c++-common/loop-default.h   | 144 +
 8 files changed, 213 insertions(+), 36 deletions(-)

diff --git a/libgomp/env.c b/libgomp/env.c
index 871a3e4..18c90bb 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -90,6 +90,7 @@ int gomp_debug_var;
 unsigned int gomp_num_teams_var;
 char *goacc_device_type;
 int goacc_device_num;
+int goacc_default_dims[GOMP_DIM_MAX];
 
 #ifndef LIBGOMP_OFFLOADED_ONLY
 
@@ -1066,6 +1067,36 @@ parse_acc_device_type (void)
 }
 
 static void
+parse_gomp_openacc_dim (void)
+{
+  /* The syntax is the same as for the -fopenacc-dim compilation option.  */
+  const char *var_name = "GOMP_OPENACC_DIM";
+  const char *env_var = getenv (var_name);
+  if (!env_var)
+return;
+
+  const char *pos = env_var;
+  int i;
+  for (i = 0; *pos && i != GOMP_DIM_MAX; i++)
+{
+  if (i && *pos++ != ':')
+	break;
+
+  if (*pos == ':')
+	continue;
+
+  const char *eptr;
+  errno = 0;
+  long val = strtol (pos, (char **)&eptr, 10);
+  if (errno || val < 0 || (unsigned)val != val)
+	break;
+
+  goacc_default_dims[i] = (int)val;
+  pos = eptr;
+}
+}
+
+static void
 handle_omp_display_env (unsigned long stacksize, int wait_policy)
 {
   const char *env;
@@ -1336,6 +1367,7 @@ initialize_env (void)
 goacc_device_num = 0;
 
   parse_acc_device_type ();
+  parse_gomp_openacc_dim ();
 
   goacc_runtime_initialize ();
 }
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index d659cd2..10ea894 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -44,6 +44,7 @@
 #include "config.h"
 #include "gstdint.h"
 #include "libgomp-plugin.h"
+#include "gomp-constants.h"
 
 #ifdef HAVE_PTHREAD_H
 #include 
@@ -367,6 +368,7 @@ extern unsigned int gomp_num_teams_var;
 extern int gomp_debug_var;
 extern int goacc_device_num;
 extern char *goacc_device_type;
+extern int goacc_default_dims[GOMP_DIM_MAX];
 
 enum gomp_task_kind
 {
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index f9044ae..8752348 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -420,3 +420,8 @@ GOMP_PLUGIN_1.1 {
   global:
 	GOMP_PLUGIN_target_task_completion;
 } GOMP_PLUGIN_1.0;
+
+GOMP_PLUGIN_1.2 {
+  global:
+	GOMP_PLUGIN_acc_default_dim;
+} GOMP_PLUGIN_1.1;
diff --git a/libgomp/oacc-plugin.c b/libgomp/oacc-plugin.c
index 475f357..c04db90 100644
--- a/libgomp/oacc-plugin.c
+++ b/libgomp/oacc-plugin.c
@@ -49,3 +49,14 @@ GOMP_PLUGIN_acc_thread (void)
   struct goacc_thread *thr = goacc_thread ();
   return thr ? thr->target_tls : NULL;
 }
+
+int
+GOMP_PLUGIN_acc_default_dim (unsigned int i)
+{
+  if (i >= GOMP_DIM_MAX)
+{
+  gomp_fatal ("invalid dimension argument: %d", i);
+  return -1;
+}
+  return goacc_default_dims[i];
+}
diff --git a/libgomp/oacc-plugin.h b/libgomp/oacc-plugin.h
index ae152aa..0a183bb 100644
--- a/libgomp/oacc-plugin.h
+++ b/libgomp/oacc-plugin.h
@@ -29,5 +29,6 @@
 
 extern void GOMP_PLUGIN_async_unmap_vars (void *, int);
 extern void *GOMP_PLUGIN_acc_thread (void);
+extern int GOMP_PLUGIN_acc_default_dim (unsigned int);
 
 #endif
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 9ae6095..365b45e 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -867,15 +867,6 @@ nvptx_get_num_devices (void)
   return n;
 }
 
-static void
-notify_var (const

Patch ping

2018-04-16 Thread Jakub Jelinek

Hi!

I'd like to ping the

  http://gcc.gnu.org/ml/gcc-patches/2018-04/msg00414.html
  PR85281 - assorted -masm=intel fixes

patch.  Thanks.

Jakub

Re: [PATCH] Fix __builtin_cpu_supports (PR target/84945)

2018-04-16 Thread Thomas Schwinge

Hi!

On Mon, 19 Mar 2018 20:35:56 +0100, Jakub Jelinek  wrote:
> --- libgcc/config/i386/cpuinfo.c.jj   2018-03-15 09:10:20.870075051 +0100
> +++ libgcc/config/i386/cpuinfo.c  2018-03-19 16:13:25.059481079 +0100
> @@ -231,78 +238,81 @@ get_available_features (unsigned int ecx
>unsigned int ext_level;
>  
>unsigned int features = 0;
> +  unsigned int features2 = 0;
>  
> +#define set_feature(f) \
> +  if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
>if (edx & bit_CMOV)
> -features |= (1 << FEATURE_CMOV);
> +set_feature (FEATURE_CMOV);

[...]/libgcc/config/i386/cpuinfo.c: In function 'get_available_features':
[...]/libgcc/config/i386/cpuinfo.c:278:60: warning: left shift count is 
negative [-Wshift-count-negative]
   if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
^~
[...]/libgcc/config/i386/cpuinfo.c:281:5: note: in expansion of macro 
'set_feature'
 set_feature (FEATURE_CMOV);
 ^~~
[...]/libgcc/config/i386/cpuinfo.c:280:6: warning: suggest explicit braces 
to avoid ambiguous 'else' [-Wdangling-else]
   if (edx & bit_CMOV)
  ^
[Many more.]


Grüße
 Thomas

Re: [PATCH] Fix libgcc/config/i386/cpuinfo.c warnings (PR target/84945)

2018-04-16 Thread Uros Bizjak

On Mon, Apr 16, 2018 at 1:11 PM, Jakub Jelinek  wrote:
> On Mon, Apr 16, 2018 at 12:50:29PM +0200, Thomas Schwinge wrote:
>> > +#define set_feature(f) \
>> > +  if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
>> >if (edx & bit_CMOV)
>> > -features |= (1 << FEATURE_CMOV);
>> > +set_feature (FEATURE_CMOV);
>>
>> [...]/libgcc/config/i386/cpuinfo.c: In function 'get_available_features':
>> [...]/libgcc/config/i386/cpuinfo.c:278:60: warning: left shift count is 
>> negative [-Wshift-count-negative]
>>if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
>> ^~
>> [...]/libgcc/config/i386/cpuinfo.c:281:5: note: in expansion of macro 
>> 'set_feature'
>>  set_feature (FEATURE_CMOV);
>>  ^~~
>> [...]/libgcc/config/i386/cpuinfo.c:280:6: warning: suggest explicit 
>> braces to avoid ambiguous 'else' [-Wdangling-else]
>>if (edx & bit_CMOV)
>>   ^
>> [Many more.]
>
> Oops, missed these.  All of the warnings are false positives (warn on dead
> code) or style warning (-Wdangling-else), not actual code bugs.
>
> The following patch fixes all the warnings without changing code generation,
> we are using set_feature with constant arguments only, so everything is
> folded properly anyway.
>
> Ok for trunk?
>
> 2018-04-16  Jakub Jelinek  
>
> PR target/84945
> * config/i386/cpuinfo.c (set_feature): Wrap into do while (0) to avoid
> -Wdangling-else warnings.  Mask shift counts to avoid
> -Wshift-count-negative and -Wshift-count-overflow false positives.

OK.

Thanks,
Uros.

> --- libgcc/config/i386/cpuinfo.c.jj 2018-03-30 20:37:37.683185248 +0200
> +++ libgcc/config/i386/cpuinfo.c2018-04-16 13:04:45.239490344 +0200
> @@ -275,7 +275,14 @@ get_available_features (unsigned int ecx
>  }
>
>  #define set_feature(f) \
> -  if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
> +  do   \
> +{  \
> +  if (f < 32)  \
> +   features |= (1U << (f & 31));   \
> +  else \
> +   features2 |= (1U << ((f - 32) & 31));   \
> +}  \
> +  while (0)
>
>if (edx & bit_CMOV)
>  set_feature (FEATURE_CMOV);
>
>
> Jakub

[PATCH] Fix libgcc/config/i386/cpuinfo.c warnings (PR target/84945)

2018-04-16 Thread Jakub Jelinek

On Mon, Apr 16, 2018 at 12:50:29PM +0200, Thomas Schwinge wrote:
> > +#define set_feature(f) \
> > +  if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
> >if (edx & bit_CMOV)
> > -features |= (1 << FEATURE_CMOV);
> > +set_feature (FEATURE_CMOV);
> 
> [...]/libgcc/config/i386/cpuinfo.c: In function 'get_available_features':
> [...]/libgcc/config/i386/cpuinfo.c:278:60: warning: left shift count is 
> negative [-Wshift-count-negative]
>if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
> ^~
> [...]/libgcc/config/i386/cpuinfo.c:281:5: note: in expansion of macro 
> 'set_feature'
>  set_feature (FEATURE_CMOV);
>  ^~~
> [...]/libgcc/config/i386/cpuinfo.c:280:6: warning: suggest explicit 
> braces to avoid ambiguous 'else' [-Wdangling-else]
>if (edx & bit_CMOV)
>   ^
> [Many more.]

Oops, missed these.  All of the warnings are false positives (warn on dead
code) or style warning (-Wdangling-else), not actual code bugs.

The following patch fixes all the warnings without changing code generation,
we are using set_feature with constant arguments only, so everything is
folded properly anyway.

Ok for trunk?

2018-04-16  Jakub Jelinek  

PR target/84945
* config/i386/cpuinfo.c (set_feature): Wrap into do while (0) to avoid
-Wdangling-else warnings.  Mask shift counts to avoid
-Wshift-count-negative and -Wshift-count-overflow false positives.

--- libgcc/config/i386/cpuinfo.c.jj 2018-03-30 20:37:37.683185248 +0200
+++ libgcc/config/i386/cpuinfo.c2018-04-16 13:04:45.239490344 +0200
@@ -275,7 +275,14 @@ get_available_features (unsigned int ecx
 }
 
 #define set_feature(f) \
-  if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32))
+  do   \
+{  \
+  if (f < 32)  \
+   features |= (1U << (f & 31));   \
+  else \
+   features2 |= (1U << ((f - 32) & 31));   \
+}  \
+  while (0)
 
   if (edx & bit_CMOV)
 set_feature (FEATURE_CMOV);


Jakub

RE: [wwwdocs] [COMMITTED] ARC gcc8 changes entry

2018-04-16 Thread Claudiu Zissulescu

Done. Thank you,
Claudiu

> -Original Message-
> From: Bernhard Reutner-Fischer [mailto:rep.dot@gmail.com]
> Sent: Thursday, April 12, 2018 8:40 PM
> To: gcc-patches@gcc.gnu.org; Claudiu Zissulescu
> ; gcc-patches@gcc.gnu.org
> Cc: Gerald Pfeifer ; Francois Bedard
> 
> Subject: Re: [wwwdocs] [COMMITTED] ARC gcc8 changes entry
> 
> On 11 April 2018 13:05:52 CEST, Claudiu Zissulescu
>  wrote:
> >Hi,
> >
> >Please find the ARC's gcc8 changes entry section as committed to
> >wwwdocs.
> 
> s/qualifer/qualifier/
> 
> thanks,

Re: [PATCH] Make redirection only for target_clones: V2 (PR ipa/85329).

2018-04-16 Thread Martin Liška

Hi.

I'm sending V3 which we did with Honza. It's similar to V2, but properly makes
FUNCTION_DECL local for default implementation.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
Tests on ppc64le are running.

Ready to be installed?
Martin
>From 77b48cfad59dd24a5c068bfb32b1059535ae1f75 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Sat, 14 Apr 2018 09:55:35 +0200
Subject: [PATCH] Make redirection only for target_clones: V3 (PR ipa/85329).

2018-04-16  Martin Liska  

	* multiple_target.c (create_dispatcher_calls): Set apostrophes
	for target_clone error message.  Make default implementation
clone to be a local declaration.
	(separate_attrs): Add new argument and check for an emptry
	string.
	(expand_target_clones): Handle it.
	(ipa_target_clone): Make redirection just for target_clones
	functions.

gcc/testsuite/ChangeLog:

2018-04-16  Martin Liska  

	* g++.dg/ext/pr85329-2.C: New test.
	* g++.dg/ext/pr85329.C: New test.
	* gcc.target/i386/mvc12.c: New test.
---
 gcc/multiple_target.c | 55 ++-
 gcc/testsuite/g++.dg/ext/pr85329-2.C  | 22 ++
 gcc/testsuite/g++.dg/ext/pr85329.C| 19 
 gcc/testsuite/gcc.target/i386/mvc12.c | 11 +++
 4 files changed, 93 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/pr85329-2.C
 create mode 100644 gcc/testsuite/g++.dg/ext/pr85329.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc12.c

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index b006a5ab6ec..a1fe09a5983 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -88,7 +88,7 @@ create_dispatcher_calls (struct cgraph_node *node)
   if (!idecl)
 {
   error_at (DECL_SOURCE_LOCATION (node->decl),
-		"default target_clones attribute was not set");
+		"default % attribute was not set");
   return;
 }
 
@@ -161,10 +161,25 @@ create_dispatcher_calls (struct cgraph_node *node)
 	}
 }
 
-  TREE_PUBLIC (node->decl) = 0;
   symtab->change_decl_assembler_name (node->decl,
   clone_function_name (node->decl,
 			   "default"));
+
+  /* FIXME: copy of cgraph_node::make_local that should be cleaned up
+	in next stage1.  */
+  node->make_decl_local ();
+  node->set_section (NULL);
+  node->set_comdat_group (NULL);
+  node->externally_visible = false;
+  node->forced_by_abi = false;
+  node->set_section (NULL);
+  node->unique_name = ((node->resolution == LDPR_PREVAILING_DEF_IRONLY
+			|| node->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP)
+		   && !flag_incremental_link);
+  node->resolution = LDPR_PREVAILING_DEF_IRONLY;
+
+  DECL_ARTIFICIAL (node->decl) = 1;
+  node->force_output = true;
 }
 
 /* Return length of attribute names string,
@@ -216,26 +231,30 @@ get_attr_str (tree arglist, char *attr_str)
 }
 
 /* Return number of attributes separated by comma and put them into ARGS.
-   If there is no DEFAULT attribute return -1.  */
+   If there is no DEFAULT attribute return -1.  If there is an empty
+   string in attribute return -2.  */
 
 static int
-separate_attrs (char *attr_str, char **attrs)
+separate_attrs (char *attr_str, char **attrs, int attrnum)
 {
   int i = 0;
-  bool has_default = false;
+  int default_count = 0;
 
   for (char *attr = strtok (attr_str, ",");
attr != NULL; attr = strtok (NULL, ","))
 {
   if (strcmp (attr, "default") == 0)
 	{
-	  has_default = true;
+	  default_count++;
 	  continue;
 	}
   attrs[i++] = attr;
 }
-  if (!has_default)
+  if (default_count == 0)
 return -1;
+  else if (i + default_count < attrnum)
+return -2;
+
   return i;
 }
 
@@ -321,7 +340,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
 {
   warning_at (DECL_SOURCE_LOCATION (node->decl),
 		  0,
-		  "single target_clones attribute is ignored");
+		  "single % attribute is ignored");
   return false;
 }
 
@@ -345,7 +364,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   int attrnum = get_attr_str (arglist, attr_str);
   char **attrs = XNEWVEC (char *, attrnum);
 
-  attrnum = separate_attrs (attr_str, attrs);
+  attrnum = separate_attrs (attr_str, attrs, attrnum);
   if (attrnum == -1)
 {
   error_at (DECL_SOURCE_LOCATION (node->decl),
@@ -354,6 +373,14 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   XDELETEVEC (attr_str);
   return false;
 }
+  else if (attrnum == -2)
+{
+  error_at (DECL_SOURCE_LOCATION (node->decl),
+		"an empty string cannot be in % attribute");
+  XDELETEVEC (attrs);
+  XDELETEVEC (attr_str);
+  return false;
+}
 
   cgraph_function_version_info *decl1_v = NULL;
   cgraph_function_version_info *decl2_v = NULL;
@@ -427,14 +454,14 @@ static unsigned int
 ipa_target_clone (void)
 {
   struct cgraph_node *node;
+  auto_vec to_dispatch;
 
-  bool target_clone_pass = false;
   FOR_EACH_FUNCTION (node)
-target_clone_pass |= expand_target_clones (node, node->defin

Re: [PATCH] Make redirection only for target_clones: V2 (PR ipa/85329).

2018-04-16 Thread Jan Hubicka

> 2018-04-16  Martin Liska  
> 
>   * multiple_target.c (create_dispatcher_calls): Set apostrophes
>   for target_clone error message.  Make default implementation
> clone to be a local declaration.
>   (separate_attrs): Add new argument and check for an emptry
>   string.
>   (expand_target_clones): Handle it.
>   (ipa_target_clone): Make redirection just for target_clones
>   functions.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-04-16  Martin Liska  
> 
>   * g++.dg/ext/pr85329-2.C: New test.
>   * g++.dg/ext/pr85329.C: New test.
>   * gcc.target/i386/mvc12.c: New test.
> ---
>  gcc/multiple_target.c | 55 
> ++-
>  gcc/testsuite/g++.dg/ext/pr85329-2.C  | 22 ++
>  gcc/testsuite/g++.dg/ext/pr85329.C| 19 
>  gcc/testsuite/gcc.target/i386/mvc12.c | 11 +++
>  4 files changed, 93 insertions(+), 14 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/pr85329-2.C
>  create mode 100644 gcc/testsuite/g++.dg/ext/pr85329.C
>  create mode 100644 gcc/testsuite/gcc.target/i386/mvc12.c
> 
> diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
> index b006a5ab6ec..a1fe09a5983 100644
> --- a/gcc/multiple_target.c
> +++ b/gcc/multiple_target.c
> @@ -88,7 +88,7 @@ create_dispatcher_calls (struct cgraph_node *node)
>if (!idecl)
>  {
>error_at (DECL_SOURCE_LOCATION (node->decl),
> - "default target_clones attribute was not set");
> + "default % attribute was not set");
>return;
>  }
>  
> @@ -161,10 +161,25 @@ create_dispatcher_calls (struct cgraph_node *node)
>   }
>  }
>  
> -  TREE_PUBLIC (node->decl) = 0;
>symtab->change_decl_assembler_name (node->decl,
> clone_function_name (node->decl,
>  "default"));
> +
> +  /* FIXME: copy of cgraph_node::make_local that should be cleaned up
> + in next stage1.  */
> +  node->make_decl_local ();
> +  node->set_section (NULL);
> +  node->set_comdat_group (NULL);
> +  node->externally_visible = false;
> +  node->forced_by_abi = false;
> +  node->set_section (NULL);

Probably no need to do this twice (next stage1 we can fix that in make_local as 
well)

> +  node->unique_name = ((node->resolution == LDPR_PREVAILING_DEF_IRONLY
> + || node->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP)
> +&& !flag_incremental_link);
> +  node->resolution = LDPR_PREVAILING_DEF_IRONLY;
> +
> +  DECL_ARTIFICIAL (node->decl) = 1;
> +  node->force_output = true;

And we don't really need force_output, but that is for next stage1 as well.

OK and thanks for working on this!
Honza

Re: [PATCH, openacc, PR85411] Move GOMP_OPENACC_DIM parsing out of nvptx plugin

2018-04-16 Thread Jakub Jelinek

On Mon, Apr 16, 2018 at 11:41:35AM +0200, Tom de Vries wrote:
> Hi,
> 
> this patch moves the parsing of the GOMP_OPENACC_DIM environment variable
> from the nvptx target plugin to the libgomp library.
> 
> The variable is not part of the OpenACC standard, but it is specific for the
> gcc implementation of OpenACC, so it makes sense to share the part handling
> the parsing, rather than having each target plugin duplicate it.
> 
> Build on x86_64 with nvptx accelerator and reg-tested libgomp.
> 
> OK for stage1?

Ok.

> [openacc] Move GOMP_OPENACC_DIM parsing out of nvptx plugin
> 
> 2018-04-15  Tom de Vries  
> 
>   PR libgomp/85411
>   * plugin/plugin-nvptx.c (notify_var): Remove no longer used function.
>   (nvptx_exec): Move parsing of
>   GOMP_OPENACC_DIM ...
>   * env.c (parse_gomp_openacc_dim): ... here.  New function.
>   (initialize_env): Call parse_gomp_openacc_dim.
>   (goacc_default_dims): Define.
>   * libgomp.h (goacc_default_dims): Declare.
>   * oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): New function.
>   * oacc-plugin.h (GOMP_PLUGIN_acc_default_dim): Declare.
>   * libgomp.map: New version "GOMP_PLUGIN_1.2". Add
>   GOMP_PLUGIN_acc_default_dim.
>   * testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c: New test.
>   * testsuite/libgomp.oacc-c-c++-common/loop-default.h: New test.

Jakub

Re: [PATCH, rs6000] Fix PR85080

2018-04-16 Thread Segher Boessenkool

Hi Bill,

On Sun, Apr 15, 2018 at 09:41:04PM -0500, Bill Schmidt wrote:
> PR85080 identifies a test case that started failing last year when
> an improvement was made to the vectorizer.  The failure turns out to
> be appropriate.  The test used to not expect the loop in the first
> function to be vectorized, because the cost of potentially unaligned
> loads made this unprofitable.  However, with Power8 and later hardware,
> this is no longer the case.  This patch adjusts the test to only check
> its results for targets that have inefficient unaligned loads.

> --- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr37194.c   
> (revision 259389)
> +++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr37194.c   
> (working copy)
> @@ -22,6 +22,6 @@ ggSpectrum_Set20(float * data, float d)
>data[i] = d;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" 
> } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" 
> { target { ! vect_hw_misalign } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
> { ! vect_hw_misalign } } } } */

Alternatively you could skip the whole test if vect_hw_misalign.

But this is fine.  Okay for trunk (and backports if needed/wanted).
Thanks!


Segher

Re: [PATCH] [PR c++/80290] recycle tinst garbage sooner

2018-04-16 Thread Jason Merrill

On Fri, Apr 13, 2018, 5:19 PM Alexandre Oliva  wrote:

> tinst_level objects are created during template instantiation, and
> they're most often quite short-lived, but since there's no intervening
> garbage collection, they accumulate throughout the pass while most by
> far could be recycled after a short while.  The original testcase in
> PR80290, for example, creates almost 55 million tinst_level objects,
> all but 10 thousand of them without intervening GC, but we don't need
> more than 284 of them live at a time.
>
> Furthermore, in many cases, TREE_LIST objects are created to stand for
> the decl in tinst_level.  In most cases, those can be recycled as soon
> as the tinst_level object is recycled; in some relatively common
> cases, they are modified and reused in up to 3 tinst_level objects.
> In the same testcase, TREE_LISTs are used in all but 3 thousand of the
> tinst_level objects, and we don't need more than 89 tinst_level
> objects with TREE_LISTs live at a time.  Furthermore, all but 2
> thousand of those are created without intervening GC.
>
> This patch arranges for tinst_level objects to be refcounted (I've
> seen as many as 20 live references to a single tinst_level object in
> my testing), and for pending_template, tinst_level and TREE_LIST
> objects that can be recycled to be added to freelists; that's much
> faster than ggc_free()ing them and reallocating them shortly
> thereafter.  In fact, the introduction of such freelists is what makes
> this entire patch lighter-weight, when it comes to memory use, and
> faster.  With refcounting alone, the total memory footprint was still
> about the same, and we spent more time.
>
> In order to further reduce memory use, I've arranged for us to create
> TREE_LISTs lazily, only at points that really need them (when printing
> error messages).  This grows tinst_level by an additional pointer, but
> since a TREE_LIST holds a lot more than an extra pointer, and
> tinst_levels with TREE_LISTs used to be allocated tens of thousands of
> times more often than plain decl ones, we still save memory overall.
>
> I was still not quite happy about growing memory use in cases that
> used template classes but not template overload resolution, so I
> changed the type of the errors field to unsigned short, from int.
> With that change, in_system_header_p and refcount move into the same
> word or half-word that used to hold errors, releasing a full word,
> bringing tinst_level back to its original size, now without any
> padding.
>
> The errors field is only used to test whether we've had any errors
> since the expansion of some template, to skip the expansion of further
> templates.  If we get 2^16 errors or more, it is probably reasonable
> to refrain from expanding further templates, even if we would expand
> them before this change.
>
> With these changes, compile time for the original testcase at -O0,
> with default checking enabled, is cut down by some 3.7%, total GCable
> memory allocation is cut down by almost 20%, and total memory use (as
> reported by GNU time as maxresident) is cut down by slightly over 15%.
>

Great!

Regstrapped on i686- and x86_64-linux-gnu.  Ok to install?
> (See below for an incremental patch that introduces some statistics
> collection and reporting, that shows how I collected some of the data
> above.  That one is not meant for inclusion)
>
> for  gcc/cp/ChangeLog
>
> PR c++/80290
> * cp-tree.h (struct tinst_level): Split decl into tldcl and
> targs.  Add private split_list_p, tree_list_p, and not_list_p
> inline const predicates; to_list private member function
> declaration; list_p, get_node, and get_decl_maybe accessors,
> and refcount private data member.  Narrow errors to unsigned
> short.
> * error.c (print_instantiation_full_context): Use new
> accessors.
> (print_instantiation_partial_context_line): Likewise.
> * mangle.c (mangle_decl_string): Likewise.
> * pt.c (freelist): New template class.
> (tree_list_freelist_head): New var.
> (tree_list_freelist): New fn, along with specializations.
> (tinst_level_freelist_head): New var.
> (pending_template_freelist_head): Likewise.
> (tinst_level_freelist, pending_template_freelist): New fns.
> (tinst_level::to_list): Define.
> (inc_refcount_use, dec_refcount_use): New fns for tinst_level.
> (set_refcount_ptr): New template fn.
> (add_pending_template): Adjust for refcounting, freelists and
> new accessors.
> (neglectable_inst_p): Take a NULL d as a non-DECL.
> (limit_bad_template_recursion): Use new accessors.
> (push_tinst_level): New overload to create the list.
> (push_tinst_level_loc): Make it static, split decl into two
> args, adjust tests and initialization to cope with split
> lists, use freelist, adjust for refcounting.
> (push_tinst_level_lo

[PATCH] Fix ICE with single element vector

2018-04-16 Thread Andreas Krebbel

I did run into an ICE with a single element vector triggered by
dividing the number of elements by 2 with exact_div here:

tree-vect-data-refs.c:5132

  else
{
  /* If length is not equal to 3 then only power of 2 is supported.  */
  gcc_assert (pow2p_hwi (count));
  poly_uint64 nelt = GET_MODE_NUNITS (mode);

  /* The encoding has 2 interleaved stepped patterns.  */
  vec_perm_builder sel (nelt, 2, 3);
  sel.quick_grow (6);
  for (i = 0; i < 3; i++)
{
  sel[i * 2] = i;
  sel[i * 2 + 1] = i + nelt;
}
  vec_perm_indices indices (sel, 2, nelt);
  if (can_vec_perm_const_p (mode, indices))
{
  for (i = 0; i < 6; i++)
sel[i] += exact_div (nelt, 2);<-
  indices.new_vector (sel, 2, nelt);
  if (can_vec_perm_const_p (mode, indices))
return true;
}
}

The patch adds a check to prevent this.

Ok?

-Andreas-

gcc/ChangeLog:

2018-04-16  Andreas Krebbel  

* tree-vect-data-refs.c (vect_grouped_store_supported): Exit for
single element vectors.
---
 gcc/tree-vect-data-refs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 161a886..01e28ca 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -5135,6 +5135,9 @@ vect_grouped_store_supported (tree vectype, unsigned 
HOST_WIDE_INT count)
  gcc_assert (pow2p_hwi (count));
  poly_uint64 nelt = GET_MODE_NUNITS (mode);
 
+ if (maybe_eq (nelt, 1U))
+   return false;
+
  /* The encoding has 2 interleaved stepped patterns.  */
  vec_perm_builder sel (nelt, 2, 3);
  sel.quick_grow (6);
-- 
2.9.1

Re: [PATCH] PR 83402 Fix ICE for vec_splat_s8, vec_splat_s16, vec_splat_s32 builtins

2018-04-16 Thread Carl Love

On Fri, 2018-04-13 at 17:53 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 13, 2018 at 03:27:40PM -0700, Carl Love wrote:
> > On Fri, 2018-04-13 at 16:54 -0500, Segher Boessenkool wrote:
> > > On Fri, Apr 13, 2018 at 09:49:25AM -0700, Carl Love wrote:
> > > > diff --git a/gcc/config/rs6000/rs6000.c
> > > > b/gcc/config/rs6000/rs6000.c
> > > > index a0c9b5c..855be43 100644
> > > > --- a/gcc/config/rs6000/rs6000.c
> > > > +++ b/gcc/config/rs6000/rs6000.c
> > > > @@ -16576,8 +16576,9 @@ rs6000_gimple_fold_builtin
> > > > (gimple_stmt_iterator *gsi)
> > > >    {
> > > >      arg0 = gimple_call_arg (stmt, 0);
> > > >      lhs = gimple_call_lhs (stmt);
> > > > -    /* Only fold the vec_splat_*() if arg0 is
> > > > constant.  */
> > > > -    if (TREE_CODE (arg0) != INTEGER_CST)
> > > > +    /* Only fold the vec_splat_*() if arg0 is a 5-bit
> > > > constant.  */
> > > > +    if (TREE_CODE (arg0) != INTEGER_CST
> > > > +    || TREE_INT_CST_LOW (arg0) & ~0x1f)
> > > >        return false;
> > > 
> > > Should this test for *signed* 5-bit constants only?
> > > 
> > >if (TREE_CODE (arg0) != INTEGER_CST
> > >    || !IN_RANGE (TREE_INT_CST_LOW (arg0), -16, 15))
> > 
> > The buitins for the unsigned vec_splat_u[8, 16,32]  are actually
> > mapped
> > to their corresponding signed version vec_splat_s[8, 16,32] in
> > altivec.h.  Both the vec_splat_u[8, 16,32] and vec_splat_s[8,
> > 16,32]
> > builtins will get you to the case statement
> > 
> > /* flavors of vec_splat_[us]{8,16,32}.  */
> > case ALTIVEC_BUILTIN_VSPLTISB:
> > case ALTIVEC_BUILTIN_VSPLTISH:
> > case ALTIVEC_BUILTIN_VSPLTISW:
> > 
> > under which the change is being made.  So technically arg0 could be
> > a
> > signed 5-bit or an unsiged 5-bit value.  Either way the 5-bit value
> > is
> > splatted into the vector with sign extension to 8/16/32 bits.  So I
> > would argue that the IN_RANGE test for signed values is maybe
> > misleading as arg0 could represent a signed or unsigned
> > value.  Both
> > tests, the one from the patch or Segher's suggestion, are really
> > just
> > looking to see if any of the upper bits are 1.  From a functional
> > standpoint, I don't see any difference and feel either one would do
> > the
> > job.  Am I missing anything?
> 
> But then the & ~0x1f test is not good either, it does not work for
> values
> -16..-1 ?
> 
> You cannot handle signed and unsigned exactly the same for the test
> for
> allowed values.
> 
Segher:

The VSPLTIS[S|H|W] can only be used if the value in arg0 is a bit
pattern consisting of at most 5-bits.  The issue is the field in the
vector splat instruction is only five bits wide.  So the value of
TREE_INT_CST_LOW (arg0) & ~0x1f) will be non-zero if the value requires
more then 5-bits to represent.  The if statement will thus evaluate to
TRUE and thus the code returns false indicating we can not do the
folding since the constant requires more then a 5-bit field to
represent.  I don't think we really care how the 5-bits are interpreted
as a signed or unsigned value just that the value can be represented by
at most 5-bits.  

The test

if (TREE_CODE (arg0) != INTEGER_CST
 || TREE_INT_CST_LOW (arg0) & ~0x1f)
       return false;

is really checking if the value is not an integer or the value is not a
5-bit constant.  It seems to me this should work fine for signed and
unsigned 5-bit numbers.

   Carl

Re: [PATCH] PR 85075, Fix PowerPC float182/ibm128 types and mangling

2018-04-16 Thread Segher Boessenkool

Hi!

Thank you for working on this.

On Sun, Apr 15, 2018 at 03:50:44PM -0400, Michael Meissner wrote:
> PR target/85075 shows that there are some problems with the types for the 3
> 128-bit floating point types on the PowerPC:
> 
>   __float128  (and _Float128 in C, IEEE 128-bit)
>   __ieee128   (IBM extended double)

(You mean __ibm128, right?)

>   long double (either IEEE 128-bit or IBM extended double)

> In developing this patch, Segher noticed that the mangling for __float128
> violated the mangling standards.  This patch includes a change to use a more
> official mangling (u10__float128).

To use a mangling that works *at all*, yeah.

> This means that GCC 8 will not be able to
> link with C++ functions that pass or return __float128 values compiled with 
> GCC
> 6 or GCC 7.  I have put in a warning if __float128 is mangled.  The warning 
> can
> be silenced by using -Wno-psabi.

It's a note, not a warning (so -Werror won't fail builds, etc.)

> In addition, when I built this on big endian system, the changes exposed a
> latent bug with the way __builtin_packlongdouble was done when it tried to
> support the first argument overlapping with the result.  I have removed the
> code to support overlapping input/output for this builtin.  I imagine that we
> will need to add __builtin_packieee128 and __builtin_unpackieee128 as well in
> the future (or make __builtin_{,un}packlongdouble support all three types).

Please send this part as a separate patch.  It will need backports, too.

> The manglings that are now used are:
> 
> For -mabi=ieeelongdouble:
> 
>   __float128  "u10__float128"
>   __ibm128"u8__ibm128"
>   long double "u9__ieee128"
> 
> For -mabi=ibmlongdouble:
> 
>   __float128  "u10__float128"
>   __ibm128"u8__ibm128"
>   long double "g"
> 
> For -mlong-double-64:
> 
>   __float128  "u10__float128"
>   __ibm128"u8__ibm128"
>   long double "e"

If __float128 is the same type as _Float128, as which of those should
it be mangled?  The C++ ABI has "DF" for _FloatN, but the demangler uses
that same code for fixed points types.  Ugh.

I don't think mangling "long double" as "u9__ieee128" works well with for
example GDB.

Cc:ing Joseph Myers; Joseph, do you have any advice here?


Segher

Re: [PATCH] Improve IPA-CP handling of self-recursive calls

2018-04-16 Thread H.J. Lu

On Wed, Apr 11, 2018 at 2:20 AM, Jan Hubicka  wrote:
>>
>> 2018-04-08  Martin Jambor  
>>
>> PR ipa/84149
>> * ipa-cp.c (propagate_vals_across_pass_through): Expand comment.
>> (cgraph_edge_brings_value_p): New parameter dest_val, check if it
>> is
>> not the same as the source val.
>> (cgraph_edge_brings_value_p): New parameter.
>> (gather_edges_for_value): Pass destination value to
>> cgraph_edge_brings_value_p.
>> (perhaps_add_new_callers): Likewise.
>> (get_info_about_necessary_edges): Likewise and exclude values
>> brought
>> only by self-recursive edges.
>> (create_specialized_node): Redirect only clones of self-calling
>> edges.
>> (+self_recursive_pass_through_p): New function.
>> (find_more_scalar_values_for_callers_subset): Use it.
>> (find_aggregate_values_for_callers_subset): Likewise.
>> (known_aggs_to_agg_replacement_list): Removed.
>> (decide_whether_version_node): Re-calculate known constants for
>> all
>> remaining context clones.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85421


-- 
H.J.

PING^3: [GCC 6] PATCH: Backport -mindirect-branch= patches

2018-04-16 Thread H.J. Lu

On Mon, Apr 2, 2018 at 5:06 AM, H.J. Lu  wrote:
> On Mon, Mar 26, 2018 at 4:04 AM, H.J. Lu  wrote:
>> On Mon, Mar 19, 2018 at 10:04 AM, H.J. Lu  wrote:

 Here are GCC 6 patches to backport all -mindirect-branch= patches.
 OK for GCC 6.

>>>
>>
>> PING:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2018-03/msg00884.html
>>
>
> PING.
>

PING.


-- 
H.J.

Re: [PATCH] Make redirection only for target_clones: V2 (PR ipa/85329).

2018-04-16 Thread Jakub Jelinek

On Mon, Apr 16, 2018 at 04:01:18PM +0200, Martin Liška wrote:
> >From 77b48cfad59dd24a5c068bfb32b1059535ae1f75 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Sat, 14 Apr 2018 09:55:35 +0200
> Subject: [PATCH] Make redirection only for target_clones: V3 (PR ipa/85329).
> 
> 2018-04-16  Martin Liska  
> 

Missing PR line.

>   * multiple_target.c (create_dispatcher_calls): Set apostrophes
>   for target_clone error message.  Make default implementation
> clone to be a local declaration.
>   (separate_attrs): Add new argument and check for an emptry

s/emptry/empty/

>   string.
>   (expand_target_clones): Handle it.
>   (ipa_target_clone): Make redirection just for target_clones
>   functions.

Jakub

[PATCH], PR target/85358, Fix ibm128 being converter to float128 on PowerPC ISA 3.0 (power9)

2018-04-16 Thread Michael Meissner

As I was working on PR target/85075 (to flesh some bugs with IEEE 128-bit
support on the PowerPC, particularly with switching the default of long
double), I noticed that for explicit IBM extended double, the compiler was
converting the __ibm128 type to an IEEE 128-bit type, because those types had
hardware support in ISA 3.0 (power9).

Unfortunately, IBM extended double (a pair of doubles meant to give you more
mantissa precision but not more expoenent range), is not completely
representable in IEEE 128-bit floating point.  There are likely some exposed
corner cases involved, and the bottom bits might not be the same.

While I would prefer IBM extended double to disappear entirely, I do think we
need to deal with it for a few versions still to come.

I tried to come up with machine dependent ways to prevent the automatic
widening from occuring, but I couldn't get anything to work reliably.  This
patch adds a new target hook that says whether the automatic widening between
two modes should be allowed.  The default hook says to allow all of the default
widenings to occur, while the PowerPC override says IBM extended double does
not widen to IEEE 128-bit and vice versa.

Given we are in stage4 right now, it is not the time to add new hooks, but here
is the patch.  If it doesn't go into GCC 8, is it acceptable for being put
early into GCC 9 with a backport before 8.2 comes out?

I have tested this patch using bootstrap builds on a power8 little system, a
power7 big endian system (both 32-bit and 64-bit), and an x86_64 bit system
with no regressions.  

[gcc]
2018-04-13  Michael Meissner  

PR target/85358
* target.def (default_widening_p): New target hook to say whether
default widening between modes should be done.
* targhooks.h (hook_bool_mode_mode_bool_true): New declaration.
* targhooks.c (hook_bool_mode_mode_bool_true): New default target
hook.
* optabs.c (expand_binop): Before doing default widening, check
whether the backend allows the widening.
(expand_twoval_unop): Likewise.
(expand_twoval_binop): Likewise.
(widen_leading): Likewise.
(widen_bswap): Likewise.
(expand_unop): Likewise.
* cse.c (cse_insn): Likewise.
* combine.c (simplify_comparison): Likewise.
* var-tracking.c (prepare_call_arguments): Likewise.
* config/rs6000/rs6000.c (TARGET_DEFAULT_WIDENING_P): Define
target hook to prevent IBM extended double and IEEE 128-bit
floating point from being converted to each by default.
(rs6000_default_widening_p): Likewise.
* doc/tm.texi (TARGET_DEFAULT_WIDENING_P): Document the new
default widening hook.
* doc/tm.texi.in (TARGET_DEFAULT_WIDENING_P): Likewise.

[gcc/testsuite]
2018-04-13  Michael Meissner  

PR target/85358
* gcc.target/powerpc/pr85358.c: New test to make sure __ibm128
does not widen to __float128 on ISA 3.0 systems.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/target.def
===
--- gcc/target.def  (revision 259376)
+++ gcc/target.def  (working copy)
@@ -3486,6 +3486,13 @@ If this hook allows @code{val} to have a
  hook_bool_mode_uhwi_false)
 
 DEFHOOK
+(default_widening_p,
+ "Return true if GCC can automatically widen from @var{from_mode} to\n\
+@var{to_mode}.  Conversions are unsigned if @var{unsigned_p} is true.",
+ bool, (machine_mode, machine_mode, bool),
+ hook_bool_mode_mode_bool_true)
+
+DEFHOOK
 (libgcc_floating_mode_supported_p,
  "Define this to return nonzero if libgcc provides support for the \n\
 floating-point mode @var{mode}, which is known to pass \n\
Index: gcc/targhooks.c
===
--- gcc/targhooks.c (revision 259376)
+++ gcc/targhooks.c (working copy)
@@ -2336,4 +2336,12 @@ default_select_early_remat_modes (sbitma
 {
 }
 
+bool
+hook_bool_mode_mode_bool_true (machine_mode from_mode ATTRIBUTE_UNUSED,
+  machine_mode to_mode ATTRIBUTE_UNUSED,
+  bool unsigned_p ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 #include "gt-targhooks.h"
Index: gcc/targhooks.h
===
--- gcc/targhooks.h (revision 259376)
+++ gcc/targhooks.h (working copy)
@@ -288,5 +288,6 @@ extern enum flt_eval_method
 default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED);
 extern bool default_stack_clash_protection_final_dynamic_probe (rtx);
 extern void default_select_early_remat_modes (sbitmap);
+extern bool hook_bool_mode_mode_bool_true (machine_mode, machine_mode, bool);
 
 #endif /* GCC_TARGHOOKS_H */
Index: gcc/optabs.c
===
--- gcc/optabs.c(r

Re: [PATCH] Handle empty infinite loops in OpenACC for PR84955

2018-04-16 Thread Tom de Vries


On 04/12/2018 08:58 PM, Jakub Jelinek wrote:

On Thu, Apr 12, 2018 at 11:39:43AM -0700, Cesar Philippidis wrote:

Strange. I didn't observe any regressions when I tested it. But, then
again, I was testing against revision

r259092 | jason | 2018-04-04 09:42:55 -0700 (Wed, 04 Apr 2018) | 4 lines

which is over a week old. I'll revert that patch for now, and revisit
this issue in stage1.


You should have kept the omp-expand.c chunk, that is correct and shouldn't
cause issues.


Committed as attached (with correct changelog entry, the previously 
committed patch had an incorrect one).


I've filed the lto ICE (described as "the second problem" in this 
thread) as PR85422 - "[openacc] ICE at cfgloop.c:468: segfault in 
flow_loops_find".


[ When changing dg-do compile to link in the test-cases in this patch, 
we run into PR85422, which shows that in the reverted patch the 
problematic fix was actually not exercised by the test-cases. ]


Thanks,
- Tom
[openacc] Fix ICE when compiling tile loop containing infinite loop

2018-04-16  Cesar Philippidis  
	Tom de Vries  

	PR middle-end/84955
	* omp-expand.c (expand_oacc_for): Add dummy false branch for
	tiled basic blocks without omp continue statements.

	* testsuite/libgomp.oacc-c-c++-common/pr84955.c: New test.
	* testsuite/libgomp.oacc-fortran/pr84955.f90: New test.

---
 gcc/omp-expand.c  |  8 
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c | 15 +++
 libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90| 13 +
 3 files changed, 36 insertions(+)

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index bb20490..c7d30ea 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -5439,6 +5439,14 @@ expand_oacc_for (struct omp_region *region, struct omp_for_data *fd)
 
 	  split->flags ^= EDGE_FALLTHRU | EDGE_TRUE_VALUE;
 
+	  /* Add a dummy exit for the tiled block when cont_bb is missing.  */
+	  if (cont_bb == NULL)
+	{
+	  edge e = make_edge (body_bb, exit_bb, EDGE_FALSE_VALUE);
+	  e->probability = profile_probability::even ();
+	  split->probability = profile_probability::even ();
+	}
+
 	  /* Initialize the user's loop vars.  */
 	  gsi = gsi_start_bb (elem_body_bb);
 	  expand_oacc_collapse_vars (fd, true, &gsi, counts, e_offset);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c
new file mode 100644
index 000..e528faa
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c
@@ -0,0 +1,15 @@
+/* { dg-do compile }  */
+
+int
+main (void)
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+for (j = 1; j < 10; j++)
+  for (;;)
+	;
+
+  return i + j;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90
new file mode 100644
index 000..dc85865
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90
@@ -0,0 +1,13 @@
+! { dg-do compile }
+
+subroutine s
+   integer :: i, j
+   !$acc parallel loop tile(2,3)
+   do i = 1, 10
+  do j = 1, 10
+ do
+ end do
+  end do
+   end do
+  !$acc end parallel loop
+end subroutine s

Re: PING^3: [GCC 6] PATCH: Backport -mindirect-branch= patches

2018-04-16 Thread Jan Hubicka

> On Mon, Apr 2, 2018 at 5:06 AM, H.J. Lu  wrote:
> > On Mon, Mar 26, 2018 at 4:04 AM, H.J. Lu  wrote:
> >> On Mon, Mar 19, 2018 at 10:04 AM, H.J. Lu  wrote:
> 
>  Here are GCC 6 patches to backport all -mindirect-branch= patches.
>  OK for GCC 6.
> 
> >>>
> >>
> >> PING:
> >>
> >> https://gcc.gnu.org/ml/gcc-patches/2018-03/msg00884.html
> >>
> >
> > PING.
> >
> 
> PING.

OK,
Honza
> 
> 
> -- 
> H.J.

[PATCH], Fix PR target/85424, PowerPC __builtin_packlongdouble bug discovered in PR target/85075

2018-04-16 Thread Michael Meissner

With the code changes in the patch for PR target/85075, I noticed that the
PowerPC big-endian build stopped in building big endian, 32-bit libgcc when
configured for power8.  The issue was this latent bug.  Segher asked me to
re-submit the bug separately, and I'm doing this for PR target/85424.

This is the patch that I did in PR target/85075.  For that set, I did bootstrap
builds for big endian power7 and little endian power8 (along with a previous
build on a big endian power8 system).

This patch will need to be back ported to GCC 7.  I suspect that it may not be
needed in GCC 6, since that revision used RELOAD instead of LRA.  But given it
is hard to replicate the bug, I will also back port it to GCC 6.

Can I apply this to GCC 8, and after a waiting period, apply it to GCC 7 and
GCC 6?

2018-04-15  Michael Meissner  

PR target/85424
* config/rs6000/rs6000.md (pack): Do not try handle a pack
where the inputs overlap with the output.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 259376)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -13934,16 +13934,14 @@ (define_insn_and_split "unpack_nod
(set_attr "length" "4")])
 
 (define_insn_and_split "pack"
-  [(set (match_operand:FMOVE128 0 "register_operand" "=d,&d")
+  [(set (match_operand:FMOVE128 0 "register_operand" "=&d")
(unspec:FMOVE128
-[(match_operand: 1 "register_operand" "0,d")
- (match_operand: 2 "register_operand" "d,d")]
+[(match_operand: 1 "register_operand" "d")
+ (match_operand: 2 "register_operand" "d")]
 UNSPEC_PACK_128BIT))]
   "FLOAT128_2REG_P (mode)"
-  "@
-   fmr %L0,%2
-   #"
-  "&& reload_completed && REGNO (operands[0]) != REGNO (operands[1])"
+  "#"
+  "&& reload_completed"
   [(set (match_dup 3) (match_dup 1))
(set (match_dup 4) (match_dup 2))]
 {
@@ -13956,8 +13954,8 @@ (define_insn_and_split "pack"
   operands[3] = gen_rtx_REG (mode, dest_hi);
   operands[4] = gen_rtx_REG (mode, dest_lo);
 }
-  [(set_attr "type" "fpsimple,fp")
-   (set_attr "length" "4,8")])
+  [(set_attr "type" "fp")
+   (set_attr "length" "8")])
 
 (define_insn "unpack"
   [(set (match_operand:DI 0 "register_operand" "=wa,wa")

Re: [PATCH] PR 85075, Fix PowerPC float182/ibm128 types and mangling

2018-04-16 Thread Michael Meissner

On Mon, Apr 16, 2018 at 11:53:13AM -0500, Segher Boessenkool wrote:
> Hi!
> 
> Thank you for working on this.
> 
> On Sun, Apr 15, 2018 at 03:50:44PM -0400, Michael Meissner wrote:
> > PR target/85075 shows that there are some problems with the types for the 3
> > 128-bit floating point types on the PowerPC:
> > 
> > __float128  (and _Float128 in C, IEEE 128-bit)
> > __ieee128   (IBM extended double)
> 
> (You mean __ibm128, right?)

Yes.

> > long double (either IEEE 128-bit or IBM extended double)
> 
> > In developing this patch, Segher noticed that the mangling for __float128
> > violated the mangling standards.  This patch includes a change to use a more
> > official mangling (u10__float128).
> 
> To use a mangling that works *at all*, yeah.
> 
> > This means that GCC 8 will not be able to
> > link with C++ functions that pass or return __float128 values compiled with 
> > GCC
> > 6 or GCC 7.  I have put in a warning if __float128 is mangled.  The warning 
> > can
> > be silenced by using -Wno-psabi.
> 
> It's a note, not a warning (so -Werror won't fail builds, etc.)
> 
> > In addition, when I built this on big endian system, the changes exposed a
> > latent bug with the way __builtin_packlongdouble was done when it tried to
> > support the first argument overlapping with the result.  I have removed the
> > code to support overlapping input/output for this builtin.  I imagine that 
> > we
> > will need to add __builtin_packieee128 and __builtin_unpackieee128 as well 
> > in
> > the future (or make __builtin_{,un}packlongdouble support all three types).
> 
> Please send this part as a separate patch.  It will need backports, too.

I opened up PR 85424, and submitted this patch:
https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00767.html

Here is the PR 85075 patch without the rs6000.md bits:

[gcc]
2018-04-16  Michael Meissner  

PR target/85075
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): __ibm128 is
now a separate type, don't #define __ibm128 as long double.
* config/rs6000/rs6000.c (rs6000_init_builtins): Make __ibm128 a
separate type on systems that support IEEE 128-bit floating point.
(rs6000_mangle_type): Use separate manglings for __ibm128 and
__float128.  Change __float128 mangling from U10__float128 to
u10__float128.  Issue a warning that the mangling has changed in
GCC 8.

[gcc/testsuite]
2018-04-16  Michael Meissner  

PR target/85075
* g++.dg/pr85075-1.C: New tests.  Make sure that __float128,
__ibm128, and long double are different types, and that you can
mix them in templates and with overloaded functions.  Test all 3
different long dobule variants (IEEE 128 bit, IBM 128 bit, and 64
bit).
* g++.dg/pr85075-2.C: Likewise.
* g++.dg/pr85075-3.C: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 259376)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -617,8 +617,6 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
 builtin_define ("__RSQRTEF__");
   if (TARGET_FLOAT128_TYPE)
 builtin_define ("__FLOAT128_TYPE__");
-  if (TARGET_LONG_DOUBLE_128 && FLOAT128_IBM_P (TFmode))
-builtin_define ("__ibm128=long double");
 #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
   builtin_define ("__BUILTIN_CPU_SUPPORTS__");
 #endif
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 259376)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -16994,28 +16994,22 @@ rs6000_init_builtins (void)
  For IEEE 128-bit floating point, always create the type __ieee128.  If the
  user used -mfloat128, rs6000-c.c will create a define from __float128 to
  __ieee128.  */
-  if (TARGET_LONG_DOUBLE_128 && FLOAT128_IEEE_P (TFmode))
+  if (TARGET_FLOAT128_TYPE)
 {
   ibm128_float_type_node = make_node (REAL_TYPE);
   TYPE_PRECISION (ibm128_float_type_node) = 128;
   SET_TYPE_MODE (ibm128_float_type_node, IFmode);
   layout_type (ibm128_float_type_node);
-
   lang_hooks.types.register_builtin_type (ibm128_float_type_node,
  "__ibm128");
-}
-  else
-ibm128_float_type_node = long_double_type_node;
 
-  if (TARGET_FLOAT128_TYPE)
-{
   ieee128_float_type_node = float128_type_node;
   lang_hooks.types.register_builtin_type (ieee128_float_type_node,
  "__ieee128");
 }
 
   else
-ieee128_float_type_node = long_double_type_node;
+ieee128_float_type_node = ibm128_float_type_node = long_double_type_node;
 
   /* Initialize the modes for builtin_function_type, mapping

Re: [C++ PATCH] Fix constexpr handling of &x->y (PR c++/84463)

2018-04-16 Thread Jakub Jelinek

On Mon, Apr 16, 2018 at 09:28:43PM +0200, Jakub Jelinek wrote:
> On the following new testcase we emit 2 different constexpr errors
> because of premature folding, where the PR44100 hack which is supposed
> to fold expressions like &((S *)0)->f or
> &((S *)24)->f folds all the &x->y expressions if x is TREE_CONSTANT
> into (some type)(x + cst) where what we were actually trying to access
> is lost.
> 
> The following patch limits the offsetof-like expression hack to expressions
> where maybe_constant_value of val's operand is INTEGER_CST, or e.g.
> a cast of INTEGER_CST to some pointer type.  This way we don't regress
> e.g. init/struct2.C, but don't mess up with x is e.g. some constexpr
> variable initialized to address of something.  Or should it avoid
> maybe_constant_value and just handle the literal INTEGER_CST and cast
> thereof?  We wouldn't handle &((S *)(24 + 8))->f that way though...

Or shall we move this folding to cp_fold instead of cp_build_addr_expr_1
(while keeping it limited to INTEGER_CST pointers)?

Jakub

[PATCH] Fix gen_lowpart_if_possible (PR middle-end/85414)

2018-04-16 Thread Jakub Jelinek

Hi!

The following testcase FAILs, because cse_local sees
(zero_extend:TI (subreg/s/v:DI (reg:TI ...) 0))
inside of REG_EQUAL note, and simplify-rtx.c attempts to optimize it.
case ZERO_EXTEND:
  /* Check for a zero extension of a subreg of a promoted
 variable, where the promotion is zero-extended, and the
 target mode is the same as the variable's promotion.  */
  if (GET_CODE (op) == SUBREG
  && SUBREG_PROMOTED_VAR_P (op)
  && SUBREG_PROMOTED_UNSIGNED_P (op)
  && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (op
{
  temp = rtl_hooks.gen_lowpart_no_emit (mode, op);
  if (temp)
return temp;
}
This calls gen_lowpart_common which fails, because:
  /* MODE must occupy no more of the underlying registers than X.  */
  poly_uint64 regsize = REGMODE_NATURAL_SIZE (innermode);
  unsigned int mregs, xregs;
  if (!can_div_away_from_zero_p (msize, regsize, &mregs)
  || !can_div_away_from_zero_p (xsize, regsize, &xregs)
  || mregs > xregs)
return 0;
and so gen_lowpart_if_possible emits (subreg:TI (subreg/s/v:DI (reg:TI ...) 0) 
0)
which is invalid, unsurprisingly other passes have issues with it and DF in
the middle of reload discovers use of an unsaved call saved register that
way.

The following patch fixes it by never creating a subreg of subreg, other
spots avoid it similarly.  Bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

Perhaps in GCC9 we might consider relaxing the above gen_lowpart_common
mregs/xregs restriction, say if the inner operand is a lowpart subreg of
a register with mode having at least xregs regs, allow it; not really sure
if it is desirable though.

2018-04-16  Jakub Jelinek  

PR middle-end/85414
* rtlhooks.c (gen_lowpart_if_possible): Don't call gen_lowpart_SUBREG
on a SUBREG.

* gcc.dg/pr85414.c: New test.

--- gcc/rtlhooks.c.jj   2018-01-03 10:19:55.338533987 +0100
+++ gcc/rtlhooks.c  2018-04-16 15:16:54.257664102 +0200
@@ -123,9 +123,9 @@ gen_lowpart_if_possible (machine_mode mo
 
   return new_rtx;
 }
-  else if (mode != GET_MODE (x) && GET_MODE (x) != VOIDmode
+  else if (mode != GET_MODE (x) && GET_MODE (x) != VOIDmode && !SUBREG_P (x)
   && validate_subreg (mode, GET_MODE (x), x,
-   subreg_lowpart_offset (mode, GET_MODE (x
+  subreg_lowpart_offset (mode, GET_MODE (x
 return gen_lowpart_SUBREG (mode, x);
   else
 return 0;
--- gcc/testsuite/gcc.dg/pr85414.c.jj   2018-04-16 15:32:42.119137550 +0200
+++ gcc/testsuite/gcc.dg/pr85414.c  2018-04-16 15:31:39.332103633 +0200
@@ -0,0 +1,10 @@
+/* PR middle-end/85414 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-Og -fgcse -Wno-uninitialized" } */
+
+int
+foo (void)
+{
+  unsigned __int128 c;
+  return __builtin_mul_overflow_p (59, -c, (short) 0);
+}

Jakub

Protect from comma operator overload

2018-04-16 Thread François Dumont


Hi

    While working on something else on libstdc++ I started having a 
test failing because of the missing comma overload protection in 
deque.tcc. So I looked for other similar places in the code and here is 
a patch to fix the places I found.


    Let me know if it is still time to commit.

François

diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc
index fe96330..1337394 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -297,7 +297,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 		std::input_iterator_tag)
   {
 iterator __cur = begin();
-for (; __first != __last && __cur != end(); ++__cur, ++__first)
+for (; __first != __last && __cur != end(); ++__cur, (void)++__first)
   *__cur = *__first;
 if (__first == __last)
   _M_erase_at_end(__cur);
diff --git a/libstdc++-v3/include/bits/list.tcc b/libstdc++-v3/include/bits/list.tcc
index 22538c9..e90d957 100644
--- a/libstdc++-v3/include/bits/list.tcc
+++ b/libstdc++-v3/include/bits/list.tcc
@@ -312,7 +312,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 iterator __first1 = begin();
 iterator __last1 = end();
 for (; __first1 != __last1 && __first2 != __last2;
-	 ++__first1, ++__first2)
+	 ++__first1, (void)++__first2)
   *__first1 = *__first2;
 if (__first2 == __last2)
   erase(__first1, __last1);
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 05e9ff2..4b52c17 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -1229,7 +1229,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 		  std::input_iterator_tag)
 	{
 	  iterator __cur = begin();
-	  for (; __first != __last && __cur != end(); ++__cur, ++__first)
+	  for (; __first != __last && __cur != end(); ++__cur, (void)++__first)
 	*__cur = *__first;
 	  if (__first == __last)
 	_M_erase_at_end(__cur);
diff --git a/libstdc++-v3/include/bits/vector.tcc b/libstdc++-v3/include/bits/vector.tcc
index 2516e35..f33da03 100644
--- a/libstdc++-v3/include/bits/vector.tcc
+++ b/libstdc++-v3/include/bits/vector.tcc
@@ -273,7 +273,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 	pointer __cur(this->_M_impl._M_start);
 	for (; __first != __last && __cur != this->_M_impl._M_finish;
-	 ++__cur, ++__first)
+	 ++__cur, (void)++__first)
 	  *__cur = *__first;
 	if (__first == __last)
 	  _M_erase_at_end(__cur);

Re: [PATCH], PR target/85358, Fix ibm128 being converter to float128 on PowerPC ISA 3.0 (power9)

2018-04-16 Thread Segher Boessenkool

Hi!

On Mon, Apr 16, 2018 at 01:41:29PM -0400, Michael Meissner wrote:
> As I was working on PR target/85075 (to flesh some bugs with IEEE 128-bit
> support on the PowerPC, particularly with switching the default of long
> double), I noticed that for explicit IBM extended double, the compiler was
> converting the __ibm128 type to an IEEE 128-bit type, because those types had
> hardware support in ISA 3.0 (power9).
> 
> Unfortunately, IBM extended double (a pair of doubles meant to give you more
> mantissa precision but not more expoenent range), is not completely
> representable in IEEE 128-bit floating point.  There are likely some exposed
> corner cases involved, and the bottom bits might not be the same.
> 
> While I would prefer IBM extended double to disappear entirely, I do think we
> need to deal with it for a few versions still to come.

There are many subtargets that still need it, and nothing to change
that has been planned.

> I tried to come up with machine dependent ways to prevent the automatic
> widening from occuring, but I couldn't get anything to work reliably.  This
> patch adds a new target hook that says whether the automatic widening between
> two modes should be allowed.  The default hook says to allow all of the 
> default
> widenings to occur, while the PowerPC override says IBM extended double does
> not widen to IEEE 128-bit and vice versa.
> 
> Given we are in stage4 right now, it is not the time to add new hooks, but 
> here
> is the patch.  If it doesn't go into GCC 8, is it acceptable for being put
> early into GCC 9 with a backport before 8.2 comes out?
> 
> I have tested this patch using bootstrap builds on a power8 little system, a
> power7 big endian system (both 32-bit and 64-bit), and an x86_64 bit system
> with no regressions.  

Can't you just set up things such that GET_MODE_WIDER_TYPE does not
return ieee128 for ibm128?  With maybe a new hook yes, if that is
necessary.

> --- gcc/target.def(revision 259376)
> +++ gcc/target.def(working copy)
> @@ -3486,6 +3486,13 @@ If this hook allows @code{val} to have a
>   hook_bool_mode_uhwi_false)
>  
>  DEFHOOK
> +(default_widening_p,
> + "Return true if GCC can automatically widen from @var{from_mode} to\n\
> +@var{to_mode}.  Conversions are unsigned if @var{unsigned_p} is true.",
> + bool, (machine_mode, machine_mode, bool),
> + hook_bool_mode_mode_bool_true)

This should have those names (from_mode, to_mode, unsigned_p) in the
prototype then, for the generated documentation to make sense.


Segher

[libgomp, testsuite] Use dg-set-target-env-var instead of setenv

2018-04-16 Thread Tom de Vries


Hi,

given the stage1 approval for "[openacc, PR85411] Move GOMP_OPENACC_DIM 
parsing out of nvptx plugin", setenv for GOMP_OPENACC_DIM won't work on 
trunk anymore, so there's no sense in using it on the og7 branch.


Committed as attached.

[ We may want to commit that patch to og7 as well, but given the extra 
symbol added to the plugin interface, I'm not sure about timing. ]


Thanks,
- Tom
[libgomp, testsuite] Use dg-set-target-env-var instead of setenv

2018-04-16  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: Use
	dg-set-target-env-var instead of setenv.
	* testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c: Same.

---
 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c | 4 +---
 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c
index ec3ca52..6c479e4 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c
@@ -2,14 +2,12 @@
 /* This code uses nvptx inline assembly guarded with acc_on_device, which is
not optimized away at -O0, and then confuses the target assembler.
{ dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+/* { dg-set-target-env-var "GOMP_OPENACC_DIM" "8:8" } */
 
 #include "loop-default.h"
-#include 
 
 int main ()
 {
   /* Environment should be ignored.  */
-  setenv ("GOMP_OPENACC_DIM", "8:8",  1);
-  
   return test_1 (16, 16, 32);
 }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c
index e61203d..eb00d32 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c
@@ -2,13 +2,11 @@
 /* This code uses nvptx inline assembly guarded with acc_on_device, which is
not optimized away at -O0, and then confuses the target assembler.
{ dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+/* { dg-set-target-env-var "GOMP_OPENACC_DIM" "8:16" } */
 
 #include "loop-default.h"
-#include 
 
 int main ()
 {
-  setenv ("GOMP_OPENACC_DIM", "8:16", 1);
-
   return test_1 (8, 16, 32);
 }

[og7] Backport "[nvptx] Add exit after call to noreturn function"

2018-04-16 Thread Tom de Vries


Hi,

while investigating PR85381 - "[og7, nvptx, openacc] parallel-loop-1.c 
fails with default vector length 128", I ran into PR 80035/81069.


I've backported the fix to the og7 branch.

Thanks,
- Tom
Backport "[nvptx] Add exit after call to noreturn function"

2018-04-16  Tom de Vries  

	backport from trunk:
	2017-09-25  Tom de Vries  

	PR target/80035
	PR target/81069
	* config/nvptx/nvptx.c (nvptx_output_call_insn): Add exit after call to
	noreturn function.

---
 gcc/config/nvptx/nvptx.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 9d011eb..8c478c8 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -2387,11 +2387,14 @@ nvptx_output_call_insn (rtx_insn *insn, rtx result, rtx callee)
   fprintf (asm_out_file, ";\n");
 
   if (find_reg_note (insn, REG_NORETURN, NULL))
-/* No return functions confuse the PTX JIT, as it doesn't realize
-   the flow control barrier they imply.  It can seg fault if it
-   encounters what looks like an unexitable loop.  Emit a trailing
-   trap, which it does grok.  */
-fprintf (asm_out_file, "\t\ttrap; // (noreturn)\n");
+{
+  /* No return functions confuse the PTX JIT, as it doesn't realize
+	 the flow control barrier they imply.  It can seg fault if it
+	 encounters what looks like an unexitable loop.  Emit a trailing
+	 trap and exit, which it does grok.  */
+  fprintf (asm_out_file, "\t\ttrap; // (noreturn)\n");
+  fprintf (asm_out_file, "\t\texit; // (noreturn)\n");
+}
 
   if (result)
 {

Re: Protect from comma operator overload

2018-04-16 Thread Jonathan Wakely

On 16 April 2018 at 20:52, François Dumont wrote:
> Hi
>
> While working on something else on libstdc++ I started having a test
> failing because of the missing comma overload protection in deque.tcc. So I
> looked for other similar places in the code and here is a patch to fix the
> places I found.
>
> Let me know if it is still time to commit.

The changes look right, but please add new tests to demonstrate the
code that used to fail.

You can use  because the iterator types defined
in there have deleted comma operators that should cause errors in
these places.

Re: Protect from comma operator overload

2018-04-16 Thread Jonathan Wakely

On 16 April 2018 at 21:08, Jonathan Wakely wrote:
> On 16 April 2018 at 20:52, François Dumont wrote:
>> Hi
>>
>> While working on something else on libstdc++ I started having a test
>> failing because of the missing comma overload protection in deque.tcc. So I
>> looked for other similar places in the code and here is a patch to fix the
>> places I found.
>>
>> Let me know if it is still time to commit.
>
> The changes look right, but please add new tests to demonstrate the
> code that used to fail.
>
> You can use  because the iterator types defined
> in there have deleted comma operators that should cause errors in
> these places.

Something like this (but in four separate tests):

#include 
#include 
#include 
#include 

int main()
{
  using namespace __gnu_test;
  int a[1] = {};
  test_container t(a, a+1);

  std::deque d;
  d.assign(t.begin(), t.end());

  std::list l;
  l.assign(t.begin(), t.end());

  std::vector v;
  v.assign(t.begin(), t.end());

  std::vector b;
  b.assign(t.begin(), t.end());
}


Given how rare it is for real code to overload the comma operator, and
that nobody has reported these bugs, I think this can wait for after
GCC 8.

Re: [PATCH] [PR c++/80290] recycle tinst garbage sooner

2018-04-16 Thread Alexandre Oliva

On Apr 16, 2018, Jason Merrill  wrote:

> On Fri, Apr 13, 2018, 5:19 PM Alexandre Oliva  wrote:
>> +  tree get_node () const {
>> +if (!split_list_p ()) return tldcl;
>> +else return const_cast (this)->to_list ();
>> +  }

> This looks like a dangerous violation of const correctness, since it does
> in fact modify the object.

Well...  We know the object's actual type is not const, since we
allocate all objects of this type from GCable memory.  Conceptually, the
object is not modified, we're just lazily performing an expensive
rearrangement of the internal representation that we would have done
upfront if we weren't trying to save that expense.  This would be a
perfect case for mutable, if only gengtype didn't get confused by it.

OTOH, we could drop a tiny amount of constification in error.c and avoid
the problem altogether, as in the following incremental patch:

 gcc/cp/cp-tree.h |4 ++--
 gcc/cp/error.c   |2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index e9d9bab879bc..26a50ac136dd 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5909,9 +5909,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
   /* Return the original node; if it's a split list, make it a
  TREE_LIST first, so that it can be returned as a single tree
  object.  */
-  tree get_node () const {
+  tree get_node () {
 if (!split_list_p ()) return tldcl;
-else return const_cast (this)->to_list ();
+else return to_list ();
   }

   /* Return the original node if it's a DECL or a TREE_LIST, but do
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 983ffdfedbb2..95b8b84f34ab 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3475,7 +3475,7 @@ print_instantiation_full_context (diagnostic_context 
*context)

 static void
 print_instantiation_partial_context_line (diagnostic_context *context,
- const struct tinst_level *t,
+ struct tinst_level *t,
  location_t loc, bool recursive_p)
 {
   if (loc == UNKNOWN_LOCATION)

Does this incremental change make it ok (pending retesting), or should I
look into adding mutable support to gengtype, or something else?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Re: [PATCH] PR 83402 Fix ICE for vec_splat_s8, vec_splat_s16, vec_splat_s32 builtins

2018-04-16 Thread Segher Boessenkool

On Mon, Apr 16, 2018 at 09:26:13AM -0700, Carl Love wrote:
> > But then the & ~0x1f test is not good either, it does not work for
> > values
> > -16..-1 ?
> > 
> > You cannot handle signed and unsigned exactly the same for the test
> > for
> > allowed values.

> The test
> 
> if (TREE_CODE (arg0) != INTEGER_CST
>    || TREE_INT_CST_LOW (arg0) & ~0x1f)
>      return false;
> 
> is really checking if the value is not an integer or the value is not a
> 5-bit constant.  It seems to me this should work fine for signed and
> unsigned 5-bit numbers.

All negative numbers (-16..-1 here) have bits outside 0x1f set (all bits
outside it).


Segher

Re: [PATCH], Fix PR target/85424, PowerPC __builtin_packlongdouble bug discovered in PR target/85075

2018-04-16 Thread Segher Boessenkool

On Mon, Apr 16, 2018 at 03:14:25PM -0400, Michael Meissner wrote:
> With the code changes in the patch for PR target/85075, I noticed that the
> PowerPC big-endian build stopped in building big endian, 32-bit libgcc when
> configured for power8.  The issue was this latent bug.  Segher asked me to
> re-submit the bug separately, and I'm doing this for PR target/85424.
> 
> This is the patch that I did in PR target/85075.  For that set, I did 
> bootstrap
> builds for big endian power7 and little endian power8 (along with a previous
> build on a big endian power8 system).
> 
> This patch will need to be back ported to GCC 7.  I suspect that it may not be
> needed in GCC 6, since that revision used RELOAD instead of LRA.  But given it
> is hard to replicate the bug, I will also back port it to GCC 6.
> 
> Can I apply this to GCC 8, and after a waiting period, apply it to GCC 7 and
> GCC 6?

This looks fine.  Yes please.  Thanks!


Segher


> 2018-04-15  Michael Meissner  
> 
>   PR target/85424
>   * config/rs6000/rs6000.md (pack): Do not try handle a pack
>   where the inputs overlap with the output.

Re: PING^4: [PATCH] Use dlsym to check if libdl is needed for plugin

2018-04-16 Thread Alexandre Oliva

On Apr  3, 2018, "H.J. Lu"  wrote:

> On Mon, Mar 26, 2018 at 4:10 AM, H.J. Lu  wrote:
>> On Wed, Mar 14, 2018 at 4:41 AM, H.J. Lu  wrote:
>>> On Wed, Feb 21, 2018 at 3:02 AM, H.J. Lu  wrote:
 On Wed, Oct 18, 2017 at 5:25 PM, H.J. Lu  wrote:
> * plugins.m4 (AC_PLUGINS): Use dlsym to check if libdl is needed.

This is ok, thanks

>>> PING.

>> PING.

> PING.

Sorry about the delay.  Next time, I suggest supplying a link to the
patch in the archives.  Speaking only for myself, I can say it would
have kept me on track a few times I set out to search for and review
this one patch, got distracted by something else and ended up not
reviewing it ;-)

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

[C++ PATCH] Fix constexpr handling of &x->y (PR c++/84463)

2018-04-16 Thread Jakub Jelinek

Hi!

On the following new testcase we emit 2 different constexpr errors
because of premature folding, where the PR44100 hack which is supposed
to fold expressions like &((S *)0)->f or
&((S *)24)->f folds all the &x->y expressions if x is TREE_CONSTANT
into (some type)(x + cst) where what we were actually trying to access
is lost.

The following patch limits the offsetof-like expression hack to expressions
where maybe_constant_value of val's operand is INTEGER_CST, or e.g.
a cast of INTEGER_CST to some pointer type.  This way we don't regress
e.g. init/struct2.C, but don't mess up with x is e.g. some constexpr
variable initialized to address of something.  Or should it avoid
maybe_constant_value and just handle the literal INTEGER_CST and cast
thereof?  We wouldn't handle &((S *)(24 + 8))->f that way though...

The patches causes a small regression on constexpr-nullptr-1.C, it tries
to compare (outside of constexpr contexts because it is undefined there)
null with non-equality comparison with &x->y and was expecting it to be
folded even when not optimizing, which is what doesn't happen anymore.
We optimize it in the GIMPLE passes fine, so I've just added -O1.

Earlier version of this patch has been bootstrapped/regtested on
x86_64-linux and i686-linux, is this one ok for trunk if it passes
bootstrap/regtest?

2018-04-16  Jakub Jelinek  

PR c++/84463
* typeck.c (cp_build_addr_expr_1): Don't use fold_offsetof_1 hack
if val's operand does not fold into INTEGER_CST possibly wrapped
with conversions.

* g++.dg/cpp0x/constexpr-nullptr-1.C: Add -O1 to dg-options.
* g++.dg/cpp0x/constexpr-84463.C: New test.

--- gcc/cp/typeck.c.jj  2018-04-16 18:11:54.784378158 +0200
+++ gcc/cp/typeck.c 2018-04-16 21:03:15.674152875 +0200
@@ -5902,8 +5902,13 @@ cp_build_addr_expr_1 (tree arg, bool str
   && INDIRECT_REF_P (val)
   && TREE_CONSTANT (TREE_OPERAND (val, 0)))
 {
-  tree type = build_pointer_type (argtype);
-  return fold_convert (type, fold_offsetof_1 (arg));
+  tree t = maybe_constant_value (TREE_OPERAND (val, 0));
+  STRIP_NOPS (t);
+  if (TREE_CODE (t) == INTEGER_CST)
+   {
+ tree type = build_pointer_type (argtype);
+ return fold_convert (type, fold_offsetof_1 (arg));
+   }
 }
 
   /* Handle complex lvalues (when permitted)
--- gcc/testsuite/g++.dg/cpp0x/constexpr-nullptr-1.C.jj 2016-09-30 
09:42:13.778446684 +0200
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-nullptr-1.C2018-04-16 
21:13:06.057428435 +0200
@@ -6,7 +6,7 @@
 // c++/67376 on gcc-patches for additional background.
 
 // { dg-do compile { target c++11 } }
-// { dg-options "-fdelete-null-pointer-checks -fdump-tree-optimized" }
+// { dg-options "-O1 -fdelete-null-pointer-checks -fdump-tree-optimized" }
 
 // Runtime assert.  Used for potentially invalid expressions.
 #define RA(e)  ((e) ? (void)0 : __builtin_abort ())
--- gcc/testsuite/g++.dg/cpp0x/constexpr-84463.C.jj 2018-04-16 
21:02:16.282128997 +0200
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-84463.C2018-04-16 
21:02:16.282128997 +0200
@@ -0,0 +1,22 @@
+// PR c++/84463
+// { dg-do compile { target c++11 } }
+
+struct S { int r; const unsigned char s[5]; };
+static constexpr S a[] = { { 0, "abcd" } };
+struct T { const unsigned char s[5]; };
+static constexpr T b[] = { { "abcd" } };
+
+constexpr int
+foo (const unsigned char *x)
+{
+  return x[0];
+}
+
+constexpr static const S *j = &a[0];
+constexpr static const int k = j->s[0];
+constexpr static int l = foo (a[0].s);
+constexpr static int m = foo (j->s);
+constexpr static const T *n = &b[0];
+constexpr static const int o = n->s[0];
+constexpr static int p = foo (b[0].s);
+constexpr static int q = foo (n->s);

Jakub

Re: [C++ PATCH] Fix constexpr handling of &x->y (PR c++/84463)

2018-04-16 Thread Jason Merrill

On Mon, Apr 16, 2018, 1:31 PM Jakub Jelinek  wrote:

> On Mon, Apr 16, 2018 at 09:28:43PM +0200, Jakub Jelinek wrote:
> > On the following new testcase we emit 2 different constexpr errors
> > because of premature folding, where the PR44100 hack which is supposed
> > to fold expressions like &((S *)0)->f or
> > &((S *)24)->f folds all the &x->y expressions if x is TREE_CONSTANT
> > into (some type)(x + cst) where what we were actually trying to access
> > is lost.
> >
> > The following patch limits the offsetof-like expression hack to
> expressions
> > where maybe_constant_value of val's operand is INTEGER_CST, or e.g.
> > a cast of INTEGER_CST to some pointer type.  This way we don't regress
> > e.g. init/struct2.C, but don't mess up with x is e.g. some constexpr
> > variable initialized to address of something.  Or should it avoid
> > maybe_constant_value and just handle the literal INTEGER_CST and cast
> > thereof?  We wouldn't handle &((S *)(24 + 8))->f that way though...
>
> Or shall we move this folding to cp_fold instead of cp_build_addr_expr_1
> (while keeping it limited to INTEGER_CST pointers)?


Yes, I think that would be better.


>

Re: [PATCH] [PR c++/80290] recycle tinst garbage sooner

2018-04-16 Thread Jason Merrill

On Mon, Apr 16, 2018, 3:20 PM Alexandre Oliva  wrote:

> On Apr 16, 2018, Jason Merrill  wrote:
>
> > On Fri, Apr 13, 2018, 5:19 PM Alexandre Oliva  wrote:
> >> +  tree get_node () const {
> >> +if (!split_list_p ()) return tldcl;
> >> +else return const_cast (this)->to_list ();
> >> +  }
>
> > This looks like a dangerous violation of const correctness, since it does
> > in fact modify the object.
>
> Well...  We know the object's actual type is not const, since we
> allocate all objects of this type from GCable memory.  Conceptually, the
> object is not modified, we're just lazily performing an expensive
> rearrangement of the internal representation that we would have done
> upfront if we weren't trying to save that expense.  This would be a
> perfect case for mutable, if only gengtype didn't get confused by it.
>
>
> OTOH, we could drop a tiny amount of constification in error.c and avoid
> the problem altogether, as in the following incremental patch:
>
>  gcc/cp/cp-tree.h |4 ++--
>  gcc/cp/error.c   |2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index e9d9bab879bc..26a50ac136dd 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -5909,9 +5909,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
>/* Return the original node; if it's a split list, make it a
>   TREE_LIST first, so that it can be returned as a single tree
>   object.  */
> -  tree get_node () const {
> +  tree get_node () {
>  if (!split_list_p ()) return tldcl;
> -else return const_cast (this)->to_list ();
> +else return to_list ();
>}
>
>/* Return the original node if it's a DECL or a TREE_LIST, but do
> diff --git a/gcc/cp/error.c b/gcc/cp/error.c
> index 983ffdfedbb2..95b8b84f34ab 100644
> --- a/gcc/cp/error.c
> +++ b/gcc/cp/error.c
> @@ -3475,7 +3475,7 @@ print_instantiation_full_context (diagnostic_context
> *context)
>
>  static void
>  print_instantiation_partial_context_line (diagnostic_context *context,
> - const struct tinst_level *t,
> + struct tinst_level *t,
>   location_t loc, bool recursive_p)
>  {
>if (loc == UNKNOWN_LOCATION)
>
>
> Does this incremental change make it ok (pending retesting), or should I
> look into adding mutable support to gengtype, or something else?
>

This change looks good. One other nit: get_decl_maybe can also return a
list, so the name seems wrong. And this line

+  if (obj->list_p () && obj->get_decl_maybe ())

could be

  if (tree_list_p ())

I think, and then get_decl_maybe wouldn't need to return a list anymore?

Jason

-- 
> Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
>

Re: [PATCH] [PR c++/80290] recycle tinst garbage sooner

2018-04-16 Thread Alexandre Oliva

On Apr 16, 2018, Jason Merrill  wrote:

> This change looks good. One other nit: get_decl_maybe can also return a
> list, so the name seems wrong.

Uhh, it's "give me the decl if you got one, but it's ok if you give me a
list" (the latter is what makes it _maybe).  It replaces what used to be
called just 'decl'.  Maybe wrong is a bit too strong, but...  do you
have any suggestion?  get_decl_or_tree_list_but_dont_build_the_list is a
bit too long IMHO ;-)

> And this line

> +  if (obj->list_p () && obj->get_decl_maybe ())

> could be

>   if (tree_list_p ())

It could if we made tree_list_p public.  It's private because that's an
internal implementation detail, though it's one that happens to leak
through the combination of calls above.

> I think, and then get_decl_maybe wouldn't need to return a list anymore?

That's backwards.  It doesn't need to; the point of get_decl_maybe is to
make as simple a test as possible and return fast, without having to
look in the "decl" to find out what it is.  I even considered making it
return tldcl unconditionally, but that would just cause trouble for the
various pieces of code that used to compare tinst_level::decls for
equality, and would now get false positives out of split lists sharing
the same tmpl decl in the first element of the list.  Testing targs in
get_decl_maybe to avoid returning a partial former decl rules out this
case and is as cheap as it gets.

Do we need more detailed comments, besides a name change?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Re: [PATCH] [PR c++/80290] recycle tinst garbage sooner

2018-04-16 Thread Jason Merrill

On Mon, Apr 16, 2018 at 9:45 PM, Alexandre Oliva  wrote:
> On Apr 16, 2018, Jason Merrill  wrote:
>
>> This change looks good. One other nit: get_decl_maybe can also return a
>> list, so the name seems wrong.
>
> Uhh, it's "give me the decl if you got one, but it's ok if you give me a
> list" (the latter is what makes it _maybe).  It replaces what used to be
> called just 'decl'.  Maybe wrong is a bit too strong, but...  do you
> have any suggestion?  get_decl_or_tree_list_but_dont_build_the_list is a
> bit too long IMHO ;-)

I was thinking maybe_get_node, since it returns either the same as
get_node or null (and maybe usually comes early in a function name).

>> And this line
>
>> +  if (obj->list_p () && obj->get_decl_maybe ())
>
>> could be
>
>>   if (tree_list_p ())
>
> It could if we made tree_list_p public.  It's private because that's an
> internal implementation detail, though it's one that happens to leak
> through the combination of calls above.

I suppose better would be for

+  if (obj->list_p () && obj->get_decl_maybe ())
+   tree_list_freelist ().free (obj->get_node ());

to be e.g.

  obj->maybe_free_list();

Or put the list handling in a destructor for tinst_level, and have
freelist use placement new and explicit destruction.

Jason

[PATCH] Support bitfields in Wodr machinery (PR lto/85405).

2018-04-16 Thread Martin Liška

Hi.

This is Honza's ODR warning patch that I've just tested. He approved that.
Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Martin

gcc/ChangeLog:

2018-04-16  Jan Hubicka  

PR lto/85405
* ipa-devirt.c (odr_types_equivalent_p): Handle bit fields.

gcc/testsuite/ChangeLog:

2018-04-16  Martin Liska  

PR lto/85405
* g++.dg/lto/pr85405_0.C: New test.
* g++.dg/lto/pr85405_1.C: New test.
---
 gcc/ipa-devirt.c | 11 +--
 gcc/testsuite/g++.dg/lto/pr85405_0.C | 18 ++
 gcc/testsuite/g++.dg/lto/pr85405_1.C |  9 +
 3 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr85405_0.C
 create mode 100644 gcc/testsuite/g++.dg/lto/pr85405_1.C


diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index fa9380cce80..5da0f72d14f 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -1587,8 +1587,15 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, bool *warned,
  "in another translation unit"));
 		return false;
 		  }
-		gcc_assert (DECL_NONADDRESSABLE_P (f1)
-			== DECL_NONADDRESSABLE_P (f2));
+		if (DECL_BIT_FIELD (f1) != DECL_BIT_FIELD (f2))
+		  {
+		warn_odr (t1, t2, f1, f2, warn, warned,
+			  G_ ("one field is bitfield while other is not "));
+		return false;
+		  }
+		else
+		  gcc_assert (DECL_NONADDRESSABLE_P (f1)
+			  == DECL_NONADDRESSABLE_P (f2));
 	  }
 
 	/* If one aggregate has more fields than the other, they
diff --git a/gcc/testsuite/g++.dg/lto/pr85405_0.C b/gcc/testsuite/g++.dg/lto/pr85405_0.C
new file mode 100644
index 000..1a41d81099c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr85405_0.C
@@ -0,0 +1,18 @@
+// { dg-lto-do link }
+// { dg-lto-options {{-fPIC -shared -flto}} }
+
+class VclReferenceBase { // { dg-lto-warning "7: type 'struct VclReferenceBase' violates the C\\+\\+ One Definition Rule" }
+  int mnRefCnt;
+  bool mbDisposed : 1;
+  virtual ~VclReferenceBase();
+};
+class a;
+class b {
+  a &e;
+  bool c();
+};
+class B {
+  VclReferenceBase d;
+};
+class a : B {};
+bool b::c() { return false; }
diff --git a/gcc/testsuite/g++.dg/lto/pr85405_1.C b/gcc/testsuite/g++.dg/lto/pr85405_1.C
new file mode 100644
index 000..78606185624
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr85405_1.C
@@ -0,0 +1,9 @@
+class VclReferenceBase {
+  int mnRefCnt;
+  bool mbDisposed;
+
+protected:
+  virtual ~VclReferenceBase();
+};
+class : VclReferenceBase {
+} a;

Re: Patch ping

2018-04-16 Thread Kirill Yukhin

Hello Jakub
On 16 апр 10:12, Jakub Jelinek wrote:
> Hi!
> 
> I'd like to ping the
> 
>   http://gcc.gnu.org/ml/gcc-patches/2018-04/msg00414.html
>   PR85281 - assorted -masm=intel fixes
Sorry, for missing that. Patch is OK as far as tests beginning to pass.

--
Thanks, K
> 
> patch.  Thanks.
> 
>   Jakub

[PR 85421] Call expand_all_artificial_thunks in ipa-cp if necessary

2018-04-16 Thread Martin Jambor

Hi,

my fix from last week caused an ICE described in PR 85421 because I did
not read my own comment and forgot that after calls to
redirect_callee_duplicating_thunks, one must finish the work by calling
expand_all_artificial_thunks, otherwise we end up with unanalyzed thunks
and hit some assert sooner or later, like we did in this bug.

Fixed with the patch below, bootstrapped, LTO-bootstrapped and tested on
x86_64-linux.  OK for trunk?

I'm sorry for introducing an ICE so late in stage4,

Martin


2018-04-17  Martin Jambor  

PR ipa/85421
* ipa-cp.c (create_specialized_node): Call
expand_all_artificial_thunks if necessary.
* testsuite/g++.dg/ipa/pr85421.C: New test.

2018-04-08  Martin Jambor  
---
 gcc/ipa-cp.c   |   3 +
 gcc/testsuite/g++.dg/ipa/pr85421.C | 131 +
 2 files changed, 134 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr85421.C

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b2627ffd05f..4e0e20af409 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3863,6 +3863,7 @@ create_specialized_node (struct cgraph_node *node,
   new_node = node->create_virtual_clone (callers, replace_trees,
 args_to_skip, "constprop");
 
+  bool have_self_recursive_calls = !self_recursive_calls.is_empty ();
   for (unsigned j = 0; j < self_recursive_calls.length (); j++)
 {
   cgraph_edge *cs = next_edge_clone[self_recursive_calls[j]->uid];
@@ -3870,6 +3871,8 @@ create_specialized_node (struct cgraph_node *node,
   gcc_assert (cs->caller == new_node);
   cs->redirect_callee_duplicating_thunks (new_node);
 }
+  if (have_self_recursive_calls)
+new_node->expand_all_artificial_thunks ();
 
   ipa_set_node_agg_value_chain (new_node, aggvals);
   for (av = aggvals; av; av = av->next)
diff --git a/gcc/testsuite/g++.dg/ipa/pr85421.C 
b/gcc/testsuite/g++.dg/ipa/pr85421.C
new file mode 100644
index 000..517d99ae8f4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr85421.C
@@ -0,0 +1,131 @@
+// { dg-do compile }
+// { dg-options "-O3 -std=gnu++1y -w" }
+
+namespace {
+template  b c(b);
+template  class> struct f {
+  using g = d;
+};
+template  class aa> using h = f;
+template  class aa>
+using i = typename h::g;
+template  struct j { typedef b k; };
+} // namespace
+namespace l {
+template  class m {
+public:
+  typedef b k;
+};
+} // namespace l
+namespace a {
+template  using n = l::m;
+template  class ac : public n {};
+struct s {
+  template  using ad = typename b::e;
+};
+template  struct p : s {
+  typedef typename o::k k;
+  using ag = i;
+};
+} // namespace a
+namespace l {
+template  struct t : a::p {};
+} // namespace l
+namespace a {
+template  struct al {
+  template  static void an(am ao, am) { c(*ao); }
+};
+template  void aq(am ao, am ap) {
+  typedef typename j::k ar;
+  al<__has_trivial_destructor(ar)>::an(ao, ap);
+}
+namespace {
+typedef char au;
+}
+} // namespace a
+typedef char av;
+typedef int aw;
+typedef av ay;
+typedef aw az;
+namespace a {
+template  struct ba {
+  typedef typename l::t::ag ag;
+  struct {
+ag bb;
+ag bc;
+  } bd;
+};
+template > class be : ba {
+  typedef ba bf;
+  typedef typename bf::ag ag;
+
+public:
+  void bh() { bi(this->bd.bb); }
+  void bi(ag bj) { aq(bj, this->bd.bc); }
+};
+} // namespace a
+namespace bk {
+enum bl {};
+enum bn { bo };
+class q {
+public:
+  static a::au bp(bn);
+  static bool bq(a::au *br, bn g) { return bs(br, g); }
+  static bl bs(a::au *br, bn g) {
+if (br) {
+  auto bt = bp(g);
+  if (bt)
+return bl();
+}
+  }
+};
+template  class bu {};
+} // namespace bk
+namespace bv {
+namespace bw {
+class bx;
+}
+} // namespace bv
+namespace bk {
+enum by { bz };
+struct ca;
+class cb {
+public:
+  class cc {
+  public:
+virtual void cd(by) = 0;
+  };
+  virtual bu e();
+  cc *cf;
+};
+class cg {
+public:
+  ~cg() { q::bq(ch, bo); }
+  a::au *ch;
+};
+class ci {
+  cg cj;
+};
+namespace ck {
+enum cl : ay;
+}
+class r : ci {};
+class cn {
+public:
+  ck::cl co();
+};
+by cp(ck::cl);
+class cq : cb, cb::cc {
+  bu ce(bv::bw::bx &, az) noexcept;
+  void cd(by);
+  void cr(bv::bw::bx &, az, cb::cc *) noexcept;
+  cn cs;
+  a::be ct;
+};
+} // namespace bk
+using bv::bw::bx;
+namespace bk {
+bu cq::ce(bx &, az) noexcept { ct.bh(); }
+void cq::cr(bx &, az, cb::cc *) noexcept { cd(bz); }
+void cq::cd(by) { cf->cd(cp(cs.co())); }
+} // namespace bk
-- 
2.16.3

49 matches

Mail list logo