Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-15 Thread Richard Biener
On Sat, Apr 13, 2019 at 8:48 PM Thomas Koenig  wrote:
>
> Hello world,
>
> the attached patch fixes a 8/9 regression where _def_init, an internal
> Fortran variable containing only zeros, was placed into the .rodata
> section. This led to a large increase in executable size.
>
> There should be no impact on other languages because the change to
> varasm.c is guarded by lang_GNU_Fortran ().
>
> Regarding the test case: I did find one other test which checks
> for .bss, so I suppose this is safe.
>
> Regression-tested with a full test (--enable-languages=all and
> make -j64 -k check) on POWER9.
>
> I would like to apply it to both affected branches.
>
> OK for the general and the Fortran part?

This won't work with LTO.  Note we have the issue in the middle-end as well
since we promote variables we see are not written to to TREE_READONLY.
This can be seen with (the somewhat artificial...):

int a[1024*1024] = { 0 };

int __attribute__((noinline)) foo() { return *(volatile int *)a; }

int main()
{
  return foo ();
}

where without -flto a gets placed into .bss while with -flto it
gets into .rodata.  So I believe we should add a DECL flag
specifying whether for section placement we can "ignore"
TREE_READONLY.  We'd initialize that with the original
state of TREE_READONLY so that the R/O promotion doesn't
change section placement.  Also the Fortran FE can then
simply set this flag on variables that may live in .bss.

There are 14 unused bits in tree_decl_with_vis so a
patch for the middle-end part could look like the attached
(w/o solving the LTO issue yet).

Of course adding sth like a .robss section would be nice.

Richard.

> Regards
>
> Thomas
>
> 2019-04-13  Thomas Koenig  
>
>  PR fortran/84487
>  * trans-decl.c (gfc_get_symbol_decl): Mark _def_init as
>  artificial.
>
> 2019-04-13  Thomas Koenig  
>
>  PR fortran/84487
>  * varasm.c (bss_initializer_p): If we are compiling Fortran, the
>  decl is artificial and it has a size larger than 255, it can be
>  put into BSS.
>
> 2019-04-13  Thomas Koenig  
>
>  PR fortran/84487
>  * gfortran.dg/def_init_1.f90: New test.
>
>
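
Editorial illustration, not part of the thread: the section-placement idea
in the message above can be sketched as a toy model. Everything below is
hypothetical (struct toy_decl, its fields and the toy_* helpers are invented
for this sketch and are not GCC data structures or APIs); it only shows why
remembering the original read-only state keeps a later read-only promotion
from moving an all-zero variable out of .bss.

```c
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a declaration; none of these fields are GCC's.  */
struct toy_decl
{
  bool readonly;       /* models TREE_READONLY, may be promoted later */
  bool orig_readonly;  /* hypothetical flag: read-only state as the
                          front end originally set it */
  bool all_zero_init;  /* initializer consists only of zeros */
};

/* Models the middle-end promoting a never-written variable.  */
static void
toy_promote_readonly (struct toy_decl *d)
{
  d->readonly = true;
}

/* Placement that looks at the (possibly promoted) read-only bit.  */
static const char *
toy_section_naive (const struct toy_decl *d)
{
  if (d->readonly)
    return ".rodata";
  return d->all_zero_init ? ".bss" : ".data";
}

/* Placement that looks only at the original read-only state, so the
   promotion cannot change the chosen section.  */
static const char *
toy_section_with_flag (const struct toy_decl *d)
{
  if (d->orig_readonly)
    return ".rodata";
  return d->all_zero_init ? ".bss" : ".data";
}
```

In this model a zero-initialized, originally writable variable moves from
.bss to .rodata under the naive rule once it is promoted, while the
flag-based rule keeps it in .bss.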




Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).

2019-04-15 Thread Martin Liška
On 4/12/19 4:12 PM, H.J. Lu wrote:
> On Fri, Apr 12, 2019 at 4:41 AM Martin Liška  wrote:
>>
>> On 4/11/19 6:30 PM, H.J. Lu wrote:
>>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška  wrote:

 Hi.

 The patch is adding missing AVX512 ISAs for target and target_clone
 attributes.

 Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

 Ready to be installed?
 Thanks,
 Martin

 gcc/ChangeLog:

 2019-04-10  Martin Liska  

 PR target/89929
 * config/i386/i386.c (get_builtin_code_for_version): Add
 support for missing AVX512 ISAs.

 gcc/testsuite/ChangeLog:

 2019-04-10  Martin Liska  

 PR target/89929
 * g++.target/i386/mv28.C: New test.
 * gcc.target/i386/mvc14.c: New test.
 ---
  gcc/config/i386/i386.c| 34 ++-
  gcc/testsuite/g++.target/i386/mv28.C  | 30 +++
  gcc/testsuite/gcc.target/i386/mvc14.c | 16 +
  3 files changed, 79 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
  create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c


>>>
>>
>> Hi.
>>
>>> Since any ISAs beyond AVX512F may be enabled individually, we
>>> can't simply assign priorities to them.   For GFNI, we can have
>>>
>>> 1. GFNI
>>> 2.  GFNI + AVX
>>> 3.  GFNI + AVX512F
>>> 4. GFNI + AVX512F + AVX512VL
>>
>> Makes sense to me! I'm considering syntax extension where one would be
>> able to come up with a priority. Eg.
>>
>> __attribute__((target("gfni,avx512bw", priority((3)
>>
>> Without that the ISA combinations are probably not comparable in a 
>> reasonable way.
>>
>>>
>>> For this code,  GFNI + AVX512BW is ignored:
>>>
>>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii
>>> __attribute__((target("gfni")))
>>> int foo(int i) {
>>> return 1;
>>> }
>>> __attribute__((target("gfni,avx512bw")))
>>> int foo(int i) {
>>> return 4;
>>> }
>>> __attribute__((target("default")))
>>> int foo(int i) {
>>> return 3;
>>> }
>>> int bar ()
>>> {
>>> return foo(2);
>>> }
>>
>> For 'target' attribute it works for me:
>>
>> 1) $ cat z.c && ./xg++ -B. z.c -c
>> #include <x86intrin.h>
>> volatile __m512i x1, x2;
>> volatile __mmask64 m64;
>>
>> __attribute__((target("gfni")))
>> int foo(int i) {
>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>> return 1;
>> }
>> __attribute__((target("gfni,avx512bw")))
>> int foo(int i) {
>> return 4;
>> }
>> __attribute__((target("default")))
>> int foo(int i) {
>>   return 3;
>> }
>> int bar ()
>> {
>> return foo(2);
>> }
>> In file included from ./include/immintrin.h:117,
>>  from ./include/x86intrin.h:32,
>>  from z.c:1:
>> z.c: In function ‘int foo(int)’:
>> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa option 
>> -m32 -mgfni -mavx512f
>> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>>   |  ^~~~
>> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment has 
>> changed in GCC 4.6
>>
>> 2) $ cat z.c && ./xg++ -B. z.c -c
>> #include <x86intrin.h>
>> volatile __m512i x1, x2;
>> volatile __mmask64 m64;
>>
>> __attribute__((target("gfni")))
>> int foo(int i) {
>> return 1;
>> }
>> __attribute__((target("gfni,avx512bw")))
>> int foo(int i) {
>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>> return 4;
>> }
>> __attribute__((target("default")))
>> int foo(int i) {
>>   return 3;
>> }
>> int bar ()
>> {
>> return foo(2);
>> }
>>
>> [OK]
>>
>> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ?
> 
> It does look odd.

Then let me take a look at this.

> 
>> Similar applies to target_clone attribute where we'll have to come up with
>> a syntax that will allow multiple ISA to be combined. Something like:
>>
>> __attribute__((target_clones("gfni+avx512bw")))
>>
>> ? Priorities can be maybe implemented by order?
>>
> 
> I am thinking -misa=processor which will enable ISAs for
> processor.  It differs from -march=.  -misa= doesn't set
> -mtune.
> 

Well, isn't that what we currently support, e.g.:

$ cat mvc11.c && gcc mvc11.c -c
__attribute__((target_clones("arch=sandybridge", "arch=cascadelake", 
"default"))) int
foo (void)
{
  return 0;
}

int
main ()
{
  foo ();
}

If so, we could add a new warning telling users that for AVX512* one should
use 'arch=xyz' instead?

Thanks,
Martin
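
Editorial illustration, not part of the thread: the ordering problem raised
earlier in this message (ISA combinations that cannot be ranked against each
other) can be sketched with bitmasks. The TOY_ISA_* names and
toy_isa_sets_comparable are invented for this sketch and are not GCC
identifiers; the model also treats every ISA bit as independent and ignores
implications between ISAs (for example, that AVX512F builds on AVX).

```c
#include <stdbool.h>

/* Hypothetical ISA bits, one per feature.  */
enum
{
  TOY_ISA_GFNI     = 1u << 0,
  TOY_ISA_AVX      = 1u << 1,
  TOY_ISA_AVX512F  = 1u << 2,
  TOY_ISA_AVX512VL = 1u << 3,
  TOY_ISA_AVX512BW = 1u << 4
};

/* Return true and store -1, 0 or 1 in *ORDER when one feature set
   contains the other (A smaller, equal, A larger).  Return false when
   neither set contains the other, i.e. no natural priority exists.  */
static bool
toy_isa_sets_comparable (unsigned a, unsigned b, int *order)
{
  if (a == b)
    {
      *order = 0;
      return true;
    }
  if ((a & b) == a)   /* A is a strict subset of B.  */
    {
      *order = -1;
      return true;
    }
  if ((a & b) == b)   /* B is a strict subset of A.  */
    {
      *order = 1;
      return true;
    }
  return false;
}
```

Under these assumptions "gfni" ranks below "gfni,avx512bw", but
"gfni,avx" and "gfni,avx512bw" are incomparable, which is the case where
an explicit priority (or an arch= selection) would be needed.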


Re: [PATCH] Reset proper type on vector types (PR middle-end/88587).

2019-04-15 Thread Richard Biener
On Mon, Apr 15, 2019 at 8:48 AM Martin Liška  wrote:
>
> Hi.
>
> Apparently, there's another PR:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90083
>
> May I backport the patch to GCC-8 branch?

Hmm, it isn't a regression, right?  But it only
affects multi-versioning, so yes, go ahead.
Might as well consider GCC 7 then - do you have
an overall idea of the state of the MV stuff on branches?
IIRC you've done most of the "fixes"?

Richard.

> Thanks,
> Martin


[PATCH] Filter out LTO in config/bootstrap-lto-lean.mk.

2019-04-15 Thread Martin Liška
Hi.

The patch is fixing bootstrap-lto-lean.mk where with PGO LTO was
wrongly used in STAGEtrain.

Tested on openSUSE gcc9 package, I'm attaching build log:
https://drive.google.com/file/d/17sxGf_x_VaUekPk2SHI9joIXg1BR5-dY/view?usp=sharing

Ready to be installed?
Thanks,
Martin

config/ChangeLog:

2019-04-15  Martin Liska  

* bootstrap-lto-lean.mk: Filter out -flto in STAGEtrain_CFLAGS.
---
 config/bootstrap-lto-lean.mk | 1 +
 1 file changed, 1 insertion(+)


diff --git a/config/bootstrap-lto-lean.mk b/config/bootstrap-lto-lean.mk
index ee36f6fe544..79cea50a4c6 100644
--- a/config/bootstrap-lto-lean.mk
+++ b/config/bootstrap-lto-lean.mk
@@ -2,6 +2,7 @@
 # Otherwise, LTO is used in only stage3.
 
 STAGE3_CFLAGS += -flto=jobserver
+override STAGEtrain_CFLAGS := $(filter-out -flto=jobserver,$(STAGEtrain_CFLAGS))
 STAGEtrain_GENERATOR_CFLAGS += -flto=jobserver
 STAGEfeedback_CFLAGS += -flto=jobserver
 



Re: [PATCH] Reset proper type on vector types (PR middle-end/88587).

2019-04-15 Thread Martin Liška
On 4/15/19 9:27 AM, Richard Biener wrote:
> On Mon, Apr 15, 2019 at 8:48 AM Martin Liška  wrote:
>>
>> Hi.
>>
>> Apparently, there's another PR:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90083
>>
>> May I backport the patch to GCC-8 branch?
> 
> Hmm, it isn't a regression, right?  But it only
> affects multi-versioning, so yes, go ahead.

No, it's not. The issue is very old.

> Might as well consider GCC 7 then - do you have
> an overall idea of the state of the MV stuff on branches?

Well, I've made quite some changes to target_clone pass
(multiple_target.c). Thus I would ignore GCC-7 if possible.

Martin

> IIRC you've done most of the "fixes"?
> 
> Richard.
> 
>> Thanks,
>> Martin



[PATCH committed] [Bug tree-optimization/90020] [7/8 regression] -O2 -Os x86-64 wrong code generated for GNU Emacs

2019-04-15 Thread Dominique d'Humières
Author: dominiq
Date: Mon Apr 15 07:56:43 2019
New Revision: 270360

 URL: https://gcc.gnu.org/viewcvs?rev=270360&root=gcc&view=rev
Log:
2019-04-15 Dominique d'Humieres 
PR tree-optimization/90020
* gcc.dg/torture/pr90020.c: Add linker options for darwin.

--- trunk/gcc/testsuite/gcc.dg/torture/pr90020.c  2019/04/15 07:39:20  270359
+++ trunk/gcc/testsuite/gcc.dg/torture/pr90020.c  2019/04/15 07:56:43  270360
@@ -1,5 +1,7 @@
 /* { dg-do run } */
 /* { dg-require-weak "" } */
+/* { dg-additional-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } } */
+/* { dg-additional-options "-Wl,-flat_namespace" { target *-*-darwin[89]* } } */
 
 void __attribute__((noinline,noclone))
 check (int i)

Dominique

Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.

2019-04-15 Thread Dominique d'Humières
Hi Paul,

I have found another glitch with -m32 and -O1 or -Os, but not with other values:

% gfc /opt/gcc/_clean/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_4.f90 -m32 -O
% ./a.out
 FAIL
Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
STOP 1

This looks tricky: if I add a line

  print *, x

before

  if (any (abs (x - [1.,20.,3.,40.,5.,60.]) > 1.e-6)) stop 2

the test succeeds!-(

Also, don't you want pr89844 to be solved?

TIA

Dominique


> Le 11 avr. 2019 à 16:44, Paul Richard Thomas  
> a écrit :
> 
> Hi Dominique,
> 
> Yes indeed - I used int(kind(loc(res))) to achieve the same effect.
> 
> I am looking for but failing to find a similar problem for PR89846.
> Tomorrow I turn my attention to an incorrect cast in the compiler.
> 
> Regards
> 
> Paul



New template for 'gcc' made available

2019-04-15 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'gcc' has been made available
to the language teams for translation.  It is archived as:

https://translationproject.org/POT-files/gcc-9.1-b20190414.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

https://gcc.gnu.org/pub/gcc/snapshots/9-20190414/gcc-9-20190414.tar.xz

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




New Spanish PO file for 'gcc' (version 9.1-b20190414)

2019-04-15 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/es.po

(This file, 'gcc-9.1-b20190414.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




AW: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.

2019-04-15 Thread Bader, Reinhold
Dear Paul, 

mostly looks good. Apart from a regression with optional arguments reported as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90093 
all other  test cases I have now execute correctly.

Cheers
Reinhold

> -Ursprüngliche Nachricht-
> Von: Paul Richard Thomas 
> Gesendet: Sonntag, 14. April 2019 20:16
> An: Thomas Koenig 
> Cc: Gilles Gouaillardet ; Bader, Reinhold
> ; fort...@gcc.gnu.org; gcc-patches  patc...@gcc.gnu.org>
> Betreff: Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.
> 
> Hi Thomas,
> 
> Thanks a lot. Committed as revision 270353.
> 
> I was determined not to repeat the PDT experience, which is still nagging at
> me. That has to be the next major gfc project, I guess.
> 
> Regards
> 
> Paul
> 
> On Sun, 14 Apr 2019 at 18:08, Thomas Koenig 
> wrote:
> >
> > Hi Paul,
> >
> >
> > > Please find attached the updated patch, which fixes the problem with
> > > -m32 in PR90022, eliminates the temporary creation for INTENT(IN)
> > > dummies and fixes PR89846.
> > >
> > > While it looks like it should be intrusive because of its size, I
> > > believe that the patch is still safe for trunk since it is hidden
> > > behind tests for CFI descriptors.
> > >
> > > Bootstraps and regtests on FC29/x86_64 - OK for trunk?
> >
> > OK.
> >
> > If we're going into the gcc 9 release with an implementation of the C
> > interop features, it will be better with fewer bugs :-)
> >
> > Thanks a lot for working on it!
> >
> > Regards
> >
> > Thomas
> 
> 
> 
> --
> "If you can't explain it simply, you don't understand it well enough"
> - Albert Einstein




[PATCH] Fix PR90074

2019-04-15 Thread Richard Biener


I am testing the following patch to fix wrong-debug created by
loop-distribution simply dropping debug stmts on the floor,
making earlier ones with bogus values live.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-15  Richard Biener  

PR debug/90074
* tree-loop-distribution.c (destroy_loop): Preserve correct
debug info.

* gcc.dg/guality/pr90074.c: New testcase.

Index: gcc/tree-loop-distribution.c
===
--- gcc/tree-loop-distribution.c(revision 270358)
+++ gcc/tree-loop-distribution.c(working copy)
@@ -1094,12 +1094,8 @@ destroy_loop (struct loop *loop)
 
   bbs = get_loop_body_in_dom_order (loop);
 
-  redirect_edge_pred (exit, src);
-  exit->flags &= ~(EDGE_TRUE_VALUE|EDGE_FALSE_VALUE);
-  exit->flags |= EDGE_FALLTHRU;
-  cancel_loop_tree (loop);
-  rescan_loop_exit (exit, false, true);
-
+  gimple_stmt_iterator dst_gsi = gsi_after_labels (exit->dest);
+  bool safe_p = single_pred_p (exit->dest);
   i = nbbs;
   do
 {
@@ -1116,14 +1112,45 @@ destroy_loop (struct loop *loop)
  if (virtual_operand_p (gimple_phi_result (phi)))
mark_virtual_phi_result_for_renaming (phi);
}
-  for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);
-  gsi_next (&gsi))
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);)
{
  gimple *stmt = gsi_stmt (gsi);
  tree vdef = gimple_vdef (stmt);
  if (vdef && TREE_CODE (vdef) == SSA_NAME)
mark_virtual_operand_for_renaming (vdef);
+ /* Also move and eventually reset debug stmts.  We can leave
+constant values in place in case the stmt dominates the exit.
+???  Non-constant values from the last iteration can be
+replaced with final values if we can compute them.  */
+ if (gimple_debug_bind_p (stmt))
+   {
+ tree val = gimple_debug_bind_get_value (stmt);
+ gsi_move_before (&gsi, &dst_gsi);
+ if (val
+ && (!safe_p
+ || !is_gimple_min_invariant (val)
+ || !dominated_by_p (CDI_DOMINATORS, exit->src, bbs[i])))
+   {
+ gimple_debug_bind_reset_value (stmt);
+ update_stmt (stmt);
+   }
+   }
+ else
+   gsi_next (&gsi);
}
+}
+  while (i != 0);
+
+  redirect_edge_pred (exit, src);
+  exit->flags &= ~(EDGE_TRUE_VALUE|EDGE_FALSE_VALUE);
+  exit->flags |= EDGE_FALLTHRU;
+  cancel_loop_tree (loop);
+  rescan_loop_exit (exit, false, true);
+
+  i = nbbs;
+  do
+{
+  --i;
   delete_basic_block (bbs[i]);
 }
   while (i != 0);
Index: gcc/testsuite/gcc.dg/guality/pr90074.c
===
--- gcc/testsuite/gcc.dg/guality/pr90074.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/guality/pr90074.c  (working copy)
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+void __attribute__((noinline))
+optimize_me_not ()
+{
+  __asm__ volatile ("" : : : "memory");
+}
+char a;
+short b[7][1];
+int main()
+{
+  int i, c;
+  a = 0;
+  i = 0;
+  for (; i < 7; i++) {
+  c = 0;
+  for (; c < 1; c++)
+   b[i][c] = 0;
+  }
+  /* i may very well be optimized out, so we cannot test for i == 7.
+ Instead test i + 1 which will make the test UNSUPPORTED if i
+ is optimized out.  Since the test previously had wrong debug
+ with i == 0 this is acceptable.  Optimally we'd produce a
+ debug stmt for the final value of the loop during loop distribution
+ which would fix the UNSUPPORTED cases.
+ c is optimized out at -Og for no obvious reason.  */
+  optimize_me_not(); /* { dg-final { gdb-test . "i + 1" "8" } } */
+/* { dg-final { gdb-test .-1 "c + 1" "2" } } */
+  return 0;
+}


[PATCH] Fix PR90071

2019-04-15 Thread Richard Biener


The following fixes reassoc leaking abnormals into rewritten
condition chains.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-15  Richard Biener  

PR tree-optimization/90071
* tree-ssa-reassoc.c (init_range_entry): Do not pick up
abnormal operands from def stmts.

* gcc.dg/torture/pr90071.c: New testcase.

Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 270358)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -2143,7 +2143,8 @@ init_range_entry (struct range_entry *r,
  exp_type = boolean_type_node;
}
 
-  if (TREE_CODE (arg0) != SSA_NAME)
+  if (TREE_CODE (arg0) != SSA_NAME
+ || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg0))
break;
   loc = gimple_location (stmt);
   switch (code)
Index: gcc/testsuite/gcc.dg/torture/pr90071.c
===
--- gcc/testsuite/gcc.dg/torture/pr90071.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr90071.c  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+
+int a;
+static int b;
+
+void
+foo ()
+{
+  int d;
+  int e = (int) (__INTPTR_TYPE__) &&f;
+  void *g = &&h;
+h: ++e;
+   if (a)
+ i: goto *g;
+   for (;;)
+ {
+   e = 0;
+   if (b)
+goto i;
+ }
+f:
+   goto *({ d || e < 0 || e >= 2; });
+   &e;
+}


Re: [PATCH] Fix PR88936

2019-04-15 Thread Richard Biener
On Fri, 12 Apr 2019, Richard Biener wrote:

> On Fri, 12 Apr 2019, Richard Biener wrote:
> 
> > On Fri, 12 Apr 2019, Michael Matz wrote:
> > 
> > > Hi,
> > > 
> > > On Fri, 12 Apr 2019, Richard Biener wrote:
> > > 
> > > > > You miss PARM_DECLs and RESULT_DECLs, i.e. it's probably better to 
> > > > > factor 
> > > > > out tree.c:auto_var_in_fn_p and place the new auto_var_p in tree.c as 
> > > > > well.
> > > > 
> > > > Hmm, I left the above unchanged from a different variant of the patch
> > > > where for some reason I do not remember I explicitly decided
> > > > parameters and results are not affected...
> > > 
> > > Even if that were the case the function is sufficiently general (also its 
> > > name) that it should be generic infrastructure, not hidden away in 
> > > structalias.
> > 
> > It was not fully equivalent, but yes.  So - like the following?
> > I think checking DECL_CONTEXT isn't necessary given the 
> > !DECL_EXTERNAL/STATIC checks.
> > 
> > Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> 
> Aww, hits
> 
> /space/rguenther/src/svn/trunk/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/zero_length_subarrays.c:33:1:
> internal compiler error: in fold_builtin_alloca_with_align, at tree-ssa-ccp.c:2186
> 0x6d7e45 fold_builtin_alloca_with_align
> 
> have to look/think about this.

I have applied the following variant after testing on 
x86_64-unknown-linux-gnu.

Richard.

2019-04-15  Richard Biener  

PR ipa/88936
* tree.h (auto_var_p): Declare.
* tree.c (auto_var_p): New function, split out from ...
(auto_var_in_fn_p): ... here.
* tree-ssa-structalias.c (struct variable_info): Add shadow_var_uid
member.
(new_var_info): Initialize it.
(set_uids_in_ptset): Also set the shadow variable uid if required.
(ipa_pta_execute): Postprocess points-to solutions assigning
shadow variable uids for locals that may reach their containing
function recursively.
* tree-ssa-ccp.c (fold_builtin_alloca_with_align): Do not
assert but instead check whether the points-to solution is
a singleton.

* gcc.dg/torture/pr88936-1.c: New testcase.
* gcc.dg/torture/pr88936-2.c: Likewise.
* gcc.dg/torture/pr88936-3.c: Likewise.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 270306)
+++ gcc/tree.c  (working copy)
@@ -9268,17 +9268,25 @@ get_type_static_bounds (const_tree type,
 }
 }
 
+/* Return true if VAR is an automatic variable.  */
+
+bool
+auto_var_p (const_tree var)
+{
+  return ((((VAR_P (var) && ! DECL_EXTERNAL (var))
+   || TREE_CODE (var) == PARM_DECL)
+  && ! TREE_STATIC (var))
+ || TREE_CODE (var) == RESULT_DECL);
+}
+
 /* Return true if VAR is an automatic variable defined in function FN.  */
 
 bool
 auto_var_in_fn_p (const_tree var, const_tree fn)
 {
   return (DECL_P (var) && DECL_CONTEXT (var) == fn
- && ((((VAR_P (var) && ! DECL_EXTERNAL (var))
-   || TREE_CODE (var) == PARM_DECL)
-  && ! TREE_STATIC (var))
- || TREE_CODE (var) == LABEL_DECL
- || TREE_CODE (var) == RESULT_DECL));
+ && (auto_var_p (var)
+ || TREE_CODE (var) == LABEL_DECL));
 }
 
 /* Subprogram of following function.  Called by walk_tree.
Index: gcc/tree.h
===
--- gcc/tree.h  (revision 270306)
+++ gcc/tree.h  (working copy)
@@ -4893,6 +4893,7 @@ extern bool stdarg_p (const_tree);
 extern bool prototype_p (const_tree);
 extern bool is_typedef_decl (const_tree x);
 extern bool typedef_variant_p (const_tree);
+extern bool auto_var_p (const_tree);
 extern bool auto_var_in_fn_p (const_tree, const_tree);
 extern tree build_low_bits_mask (tree, unsigned);
 extern bool tree_nop_conversion_p (const_tree, const_tree);
Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 270306)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -299,6 +299,11 @@ struct variable_info
   /* Full size of the base variable, in bits.  */
   unsigned HOST_WIDE_INT fullsize;
 
+  /* In IPA mode the shadow UID in case the variable needs to be duplicated in
+ the final points-to solution because it reaches its containing
+ function recursively.  Zero if none is needed.  */
+  unsigned int shadow_var_uid;
+
   /* Name of this variable */
   const char *name;
 
@@ -397,6 +402,7 @@ new_var_info (tree t, const char *name,
   ret->solution = BITMAP_ALLOC (&pta_obstack);
   ret->oldsolution = NULL;
   ret->next = 0;
+  ret->shadow_var_uid = 0;
   ret->head = ret->id;
 
   stats.total_vars++;
@@ -6452,6 +6458,16 @@ set_uids_in_ptset (bitmap into, bitmap f
  && (TREE_STATIC (vi->decl) || DECL_EXTERNAL (vi->decl))
  && ! decl_binds_to_current_def_p (vi->decl))
p

Re: [PATCH] Filter out LTO in config/bootstrap-lto-lean.mk.

2019-04-15 Thread Richard Biener
On Mon, Apr 15, 2019 at 9:46 AM Martin Liška  wrote:
>
> Hi.
>
> The patch is fixing bootstrap-lto-lean.mk where with PGO LTO was
> wrongly used in STAGEtrain.
>
> Tested on openSUSE gcc9 package, I'm attaching build log.
>
> Ready to be installed?

I wonder why 'override' is necessary given before we include the build-config
.mk fragment we do

STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))

I suppose you checked w/o override and it didn't work?  Or is the issue
that you have to use := here to get the previous addition to STAGE3_CFLAGS
resolved?

A make expert might want to chime in here.

Maybe a simpler solution is to do

STAGEtrain_CFLAGS := $(filter-out -fchecking=1,$(STAGE3_CFLAGS))

instead of the '=' assignment in the toplevel Makefile to not cause
build-config fragments changing the values of derived flags?
(if, then consistently for all, of course).

Richard.

> Thanks,
> Martin
>
> config/ChangeLog:
>
> 2019-04-15  Martin Liska  
>
> * bootstrap-lto-lean.mk: Filter out -flto in STAGEtrain_CFLAGS.
> ---
>  config/bootstrap-lto-lean.mk | 1 +
>  1 file changed, 1 insertion(+)
>
>


Re: [PATCH] Filter out LTO in config/bootstrap-lto-lean.mk.

2019-04-15 Thread Martin Liška
On 4/15/19 12:23 PM, Richard Biener wrote:
> On Mon, Apr 15, 2019 at 9:46 AM Martin Liška  wrote:
>>
>> Hi.
>>
>> The patch is fixing bootstrap-lto-lean.mk where with PGO LTO was
>> wrongly used in STAGEtrain.
>>
>> Tested on openSUSE gcc9 package, I'm attaching build log.
>>
>> Ready to be installed?
> 
> I wonder why 'override' is necessary given before we include the build-config
> .mk fragment we do
> 
> STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
> 
> I suppose you checked w/o override and it didn't work?  Or is the issue
> that you have to use := here to get the previous addition to STAGE3_CFLAGS
> resolved?

Fails due to:
[   16s] + setarch x86_64 -R make profiledbootstrap 'STAGE1_CFLAGS=-g -O2' 
'BOOT_CFLAGS=-O2 -D_FORTIFY_SOURCE=2 -funwind-tables 
-fasynchronous-unwind-tables -fstack-clash-protection -g -U_FORTIFY_SOURCE' 
-j160
[   16s] ../config/bootstrap-lto-lean.mk:5: *** Recursive variable 
'STAGEtrain_CFLAGS' references itself (eventually).  Stop.

> 
> A make expert might want to chime in here.
> 
> Maybe a simpler solution is to do
> 
> STAGEtrain_CFLAGS := $(filter-out -fchecking=1,$(STAGE3_CFLAGS))

This one will work of course. I would wait for some time and we can eventually
take this change.

Martin

> 
> instead of the '=' assignment in the toplevel Makefile to not cause
> build-config fragments changing the values of derived flags?
> (if, then consistently for all, of course).
> 
> Richard.
> 
>> Thanks,
>> Martin
>>
>> config/ChangeLog:
>>
>> 2019-04-15  Martin Liska  
>>
>> * bootstrap-lto-lean.mk: Filter out -flto in STAGEtrain_CFLAGS.
>> ---
>>  config/bootstrap-lto-lean.mk | 1 +
>>  1 file changed, 1 insertion(+)
>>
>>



Re: [PR86438] avoid too-long shift in test

2019-04-15 Thread Andrew Stubbs

On 12/04/2019 02:42, Alexandre Oliva wrote:

The test fell back to long long and long when __int128 is not
available, but it assumed sizeof(long) < sizeof(long long) because of
a shift count that would be out of range for a long long if their
widths are the same.  Fixed by splitting it up into two shifts.

Tested on x86_64-linux-gnu, -m64 and -m32.  Hopefully Andrew and/or John
David will let me know if it fails to fix the problem on the platforms
in which they've observed it.  Thanks for the report, sorry it took me
so long to get to it.  I'm going to install this as obvious, unless
there are objections in the next few days.


Confirmed; the test now passes for amdgcn.

Andrew
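
Editorial illustration, not part of the thread: the fix described above
(splitting one possibly too-long shift into two shorter ones) can be
sketched as follows. The function name shift_by_long_width is invented for
this sketch and the actual test in the PR is not reproduced here. The point
is that a single shift of an unsigned long long by the bit width of long is
undefined when both types have the same width, whereas two shifts by half
that width are each in range.

```c
#include <limits.h>

/* Shift X left by the full bit width of 'long', done as two half-width
   shifts.  Each shift count is at most half the width of
   'unsigned long long', so both shifts are well defined even when
   sizeof (long) == sizeof (long long).  */
static unsigned long long
shift_by_long_width (unsigned long long x)
{
  const unsigned half = sizeof (unsigned long) * CHAR_BIT / 2;
  return (x << half) << half;
}
```

When long is narrower than long long, this moves the value past the low
long-sized part; when the widths are equal, all bits are shifted out and
the result is zero, with no undefined behaviour in either case.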


Re: [PATCH] combine: Count auto_inc properly (PR89794)

2019-04-15 Thread Segher Boessenkool
On Sun, Apr 14, 2019 at 09:51:39AM +, Segher Boessenkool wrote:
> The code that checks if an auto-increment from i0 or i1 is not lost is
> a bit shaky.  The code to check the same for i2 is non-existent, and
> cannot be implemented in a similar way at all.  So, this patch counts
> all auto-increments, and makes sure we end up with the same number as
> we started with.  This works because we still have a check that we
> will not duplicate any.
> 
> We should do this some better way, but not while we are in stage 4.
> 
> Tested on powerpc64-linux {-m32,-m64}; also tested manually on the Arm
> testcase.

I added a missing "static", and added the testcase, as attached.
Committing it now.
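
Editorial illustration, not part of the thread: the counting approach
described above can be sketched on a toy representation. The toy_* names
and the flat array of codes are invented for this sketch; GCC's combine
pass works on RTL patterns and uses for_each_inc_dec, as in the patch
below. The sketch only shows the invariant: count auto-increments in the
original instructions, count them again in the result, and reject the
combination if the numbers differ.

```c
#include <stddef.h>

/* Hypothetical expression codes standing in for RTL codes.  */
enum toy_code { TOY_REG, TOY_PLUS, TOY_PRE_INC, TOY_POST_INC };

typedef int (*toy_inc_fn) (enum toy_code code, void *arg);

/* Callback: bump the counter that ARG points to.  */
static int
toy_count_inc (enum toy_code code, void *arg)
{
  (void) code;
  ++*(int *) arg;
  return 0;
}

/* Call FN once for every auto-increment found in PAT[0..N-1].  */
static void
toy_for_each_inc (const enum toy_code *pat, size_t n,
                  toy_inc_fn fn, void *arg)
{
  for (size_t i = 0; i < n; i++)
    if (pat[i] == TOY_PRE_INC || pat[i] == TOY_POST_INC)
      fn (pat[i], arg);
}

/* Return nonzero if the combined pattern has exactly as many
   auto-increments as the original instructions had.  */
static int
toy_combination_keeps_incs (const enum toy_code *before, size_t n_before,
                            const enum toy_code *after, size_t n_after)
{
  int count_before = 0, count_after = 0;
  toy_for_each_inc (before, n_before, toy_count_inc, &count_before);
  toy_for_each_inc (after, n_after, toy_count_inc, &count_after);
  return count_before == count_after;
}
```

As the message notes, equal counts are only sufficient because a separate
check already rules out duplicating an auto-increment.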


Subject: [PATCH] combine: Count auto_inc properly (PR89794)

The code that checks if an auto-increment from i0 or i1 is not lost is
a bit shaky.  The code to check the same for i2 is non-existent, and
cannot be implemented in a similar way at all.  So, this patch counts
all auto-increments, and makes sure we end up with the same number as
we started with.  This works because we still have a check that we
will not duplicate any.


2019-04-15  Segher Boessenkool  

PR rtl-optimization/89794
* combine.c (count_auto_inc): New function.
(try_combine): Count how many auto_inc expressions there were in the
original instructions.  Ensure we have the same number in the new
instructions.  Remove the code that tried to ensure auto_inc side
effects on i1 and i0 are not lost.

gcc/testsuite/
PR rtl-optimization/89794
* gcc.dg/torture/pr89794.c: New testcase.

---
 gcc/combine.c  | 60 --
 gcc/testsuite/gcc.dg/torture/pr89794.c | 24 ++
 2 files changed, 66 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr89794.c

diff --git a/gcc/combine.c b/gcc/combine.c
index f681345..07bd0cf 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2667,6 +2667,16 @@ combine_remove_reg_equal_equiv_notes_for_regno (unsigned int regno)
 }
 }
 
+/* Callback function to count autoincs.  */
+
+static int
+count_auto_inc (rtx, rtx, rtx, rtx, rtx, void *arg)
+{
+  (*((int *) arg))++;
+
+  return 0;
+}
+
 /* Try to combine the insns I0, I1 and I2 into I3.
Here I0, I1 and I2 appear earlier than I3.
I0 and I1 can be zero; then we combine just I2 into I3, or I1 and I2 into
@@ -2732,6 +2742,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   int split_i2i3 = 0;
   int changed_i3_dest = 0;
   bool i2_was_move = false, i3_was_move = false;
+  int n_auto_inc = 0;
 
   int maxreg;
   rtx_insn *temp_insn;
@@ -3236,6 +3247,16 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   return 0;
 }
 
+  /* Count how many auto_inc expressions there were in the original insns;
+ we need to have the same number in the resulting patterns.  */
+
+  if (i0)
+for_each_inc_dec (PATTERN (i0), count_auto_inc, &n_auto_inc);
+  if (i1)
+for_each_inc_dec (PATTERN (i1), count_auto_inc, &n_auto_inc);
+  for_each_inc_dec (PATTERN (i2), count_auto_inc, &n_auto_inc);
+  for_each_inc_dec (PATTERN (i3), count_auto_inc, &n_auto_inc);
+
   /* If the set in I2 needs to be kept around, we must make a copy of
  PATTERN (I2), so that when we substitute I1SRC for I1DEST in
  PATTERN (I2), we are only substituting for the original I1DEST, not into
@@ -3439,18 +3460,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 
   if (i1 && GET_CODE (newpat) != CLOBBER)
 {
-  /* Check that an autoincrement side-effect on I1 has not been lost.
-This happens if I1DEST is mentioned in I2 and dies there, and
-has disappeared from the new pattern.  */
-  if ((FIND_REG_INC_NOTE (i1, NULL_RTX) != 0
-  && i1_feeds_i2_n
-  && dead_or_set_p (i2, i1dest)
-  && !reg_overlap_mentioned_p (i1dest, newpat))
-  /* Before we can do this substitution, we must redo the test done
- above (see detailed comments there) that ensures I1DEST isn't
- mentioned in any SETs in NEWPAT that are field assignments.  */
- || !combinable_i3pat (NULL, &newpat, i1dest, NULL_RTX, NULL_RTX,
-   0, 0, 0))
+  /* Before we can do this substitution, we must redo the test done
+above (see detailed comments there) that ensures I1DEST isn't
+mentioned in any SETs in NEWPAT that are field assignments.  */
+  if (!combinable_i3pat (NULL, &newpat, i1dest, NULL_RTX, NULL_RTX,
+0, 0, 0))
{
  undo_all ();
  return 0;
@@ -3480,12 +3494,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
 
   if (i0 && GET_CODE (newpat) != CLOBBER)
 {
-  if ((FIND_REG_INC_NOTE (i0, NULL_RTX) != 0
-  && ((i0_feeds_i2_n && dead_or_set_p (i2, i0dest))
-  || (i0_feeds_i1_n && 

Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-15 Thread Jan Hubicka
> 
> This won't work with LTO.  Note we have the issue in the middle-end as well
> since we promote variables we see are not written to to TREE_READONLY.
> This can be seen with (the somewhat artificial...):
> 
> int a[1024*1024] = { 0 };
> 
> int __attribute__((noinline)) foo() { return *(volatile int *)a; }
> 
> int main()
> {
>   return foo ();
> }
> 
> where without -flto a gets placed into .bss while with -flto it
> gets into .rodata.  So I believe we should add a DECL flag
> specifying whether for section placement we can "ignore"
> TREE_READONLY.  We'd initialize that with the original
> state of TREE_READONLY so that the R/O promotion doesn't
> change section placement.  Also the Fortran FE can then
> simply set this flag on variables that may live in .bss.
> 
> There are 14 unused bits in tree_decl_with_vis so a
> patch for the middle-end part could look like the attached
> (w/o solving the LTO issue yet).
> 
> Of course adding sth like a .robss section would be nice.

Yep, but I think what you propose works well in practice (I am not sure
whether we are forced to put const-declared variables into read-only memory,
or whether we can always do this as a binary size optimization). The patch
looks fine to me.  Would it be possible to place the flag in
varpool_node rather than TREE? It is a lot easier to manage flags
there.
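
For illustration only, the proposed flag can be modelled in a few lines of C.
This is a toy sketch and not GCC code: the struct, its field names and
place_variable() are all hypothetical, and the real decision is made in
varasm.c on tree nodes (or, as suggested above, possibly on varpool nodes).
The point is just that section placement consults the read-only state as
originally declared, so a later R/O promotion cannot move a zero-initialized
variable from .bss to .rodata.

```c
#include <stdbool.h>

/* Hypothetical section kinds for this sketch.  */
enum section_kind { SEC_BSS, SEC_DATA, SEC_RODATA };

/* Toy model of a variable; none of these names exist in GCC.  */
struct var_model
{
  bool readonly;          /* models TREE_READONLY, may be set by promotion */
  bool readonly_at_decl;  /* hypothetical flag: read-only as first declared */
  bool zero_initialized;  /* the initializer is all zeros */
};

/* Pick a section.  A zero-initialized variable that was not declared
   read-only stays in .bss even if it was later promoted to read-only.  */
static enum section_kind
place_variable (const struct var_model *v)
{
  if (v->zero_initialized && !v->readonly_at_decl)
    return SEC_BSS;
  return v->readonly ? SEC_RODATA : SEC_DATA;
}
```

In this model the array from the quoted example (zero-initialized, not
declared const, promoted under -flto) still lands in .bss.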

Honza
> 
> Richard.
> 
> > Regards
> >
> > Thomas
> >
> > 2019-04-13  Thomas Koenig  
> >
> >  PR fortran/84487
> >  * trans-decl.c (gfc_get_symbol_decl): Mark _def_init as
> >  artificial.
> >
> > 2019-04-13  Thomas Koenig  
> >
> >  PR fortran/84487
> >  * varasm.c (bss_initializer_p): If we are compiling Fortran, the
> >  decl is artifical and it has a size larger than 255, it can be
> >  put into BSS.
> >
> > 2019-04-13  Thomas Koenig  
> >
> >  PR fortran/84487
> >  * gfortran.dg/def_init_1.f90: New test.
> >
> >




Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-15 Thread Florian Weimer
* Richard Biener:

> Of course adding sth like a .robss section would be nice.

I think this is strictly a link editor issue because a read-only PT_LOAD
directive with a memory size larger than the file size already produces
read-only zero pages, without requiring a file allocation.
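
A minimal sketch of that arithmetic, assuming a simplified stand-in for the
ELF program header (the struct and the SK_ names are hypothetical; the two
constant values are the ones the ELF specification assigns to PT_LOAD and
PF_W):

```c
#include <stdint.h>

/* Local stand-ins for PT_LOAD and PF_W.  */
#define SK_PT_LOAD 1u
#define SK_PF_W    2u

/* Only the program-header fields that matter for this argument.  */
struct sk_phdr
{
  uint32_t type;    /* segment type */
  uint32_t flags;   /* permission bits */
  uint64_t filesz;  /* bytes backed by the file */
  uint64_t memsz;   /* bytes occupied in memory */
};

/* Number of zero bytes the loader supplies beyond the file contents
   for a non-writable PT_LOAD segment, or 0 if the segment does not
   qualify.  */
static uint64_t
sk_readonly_zero_fill (const struct sk_phdr *ph)
{
  if (ph->type != SK_PT_LOAD || (ph->flags & SK_PF_W) != 0)
    return 0;
  return ph->memsz > ph->filesz ? ph->memsz - ph->filesz : 0;
}
```

A read-only segment with a 4 KiB file size and a memory size 4 MiB larger
would thus get 4 MiB of zero pages without any file allocation.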

Thanks,
Florian


Re: Fix false -Wodr warnings

2019-04-15 Thread Richard Biener
On Sun, Apr 14, 2019 at 10:59 PM Jan Hubicka  wrote:
>
> Hi,
> this patch fixes a false warning that is output when different -std
> settings are used. In this case the C++ FE produces the same declaration in
> different representations which differ by zero-sized fields only.
> The patch makes them be ignored (and I checked that we ignore them for
> canonical type merging too).
>
> Bootstrapped/regtested x86_64-linux, committed.

The testcase is bogus

WARNING: lto.exp does not support dg-do
WARNING: lto.exp does not support dg-options in primary source file


> Honza
>
> PR lto/89358
> * g++.dg/lto/pr89358_0.C: New testcase.
> * g++.dg/lto/pr89358_1.C: New testcase.
> * ipa-devirt.c (skip_in_fields_list_p): New.
> (odr_types_equivalent_p): Use it.
> Index: testsuite/g++.dg/lto/pr89358_0.C
> ===
> --- testsuite/g++.dg/lto/pr89358_0.C(nonexistent)
> +++ testsuite/g++.dg/lto/pr89358_0.C(working copy)
> @@ -0,0 +1,11 @@
> +/* { dg-do link } */
> +/* { dg-options "-std=c++17"  } */
> +#include <map>
> +
> +extern void test();
> +
> +int main()
> +{
> +std::map m;
> +test();
> +}
> Index: testsuite/g++.dg/lto/pr89358_1.C
> ===
> --- testsuite/g++.dg/lto/pr89358_1.C(nonexistent)
> +++ testsuite/g++.dg/lto/pr89358_1.C(working copy)
> @@ -0,0 +1,7 @@
> +/* { dg-options "-std=c++14"  } */
> +#include <map>
> +
> +void test()
> +{
> +std::map m;
> +}
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c(revision 270324)
> +++ ipa-devirt.c(working copy)
> @@ -1282,6 +1282,24 @@ warn_types_mismatch (tree t1, tree t2, l
>  inform (loc_t2, "the incompatible type is defined here");
>  }
>
> +/* Return true if T should be ignored in TYPE_FIELDS for ODR comparsion.  */
> +
> +static bool
> +skip_in_fields_list_p (tree t)
> +{
> +  if (TREE_CODE (t) != FIELD_DECL)
> +return true;
> +  /* C++ FE introduces zero sized fields depending on -std setting, see
> + PR89358.  */
> +  if (DECL_SIZE (t)
> +  && integer_zerop (DECL_SIZE (t))
> +  && DECL_ARTIFICIAL (t)
> +  && DECL_IGNORED_P (t)
> +  && !DECL_NAME (t))
> +return true;
> +  return false;
> +}
> +
>  /* Compare T1 and T2, report ODR violations if WARN is true and set
> WARNED to true if anything is reported.  Return true if types match.
> If true is returned, the types are also compatible in the sense of
> @@ -1548,9 +1566,9 @@ odr_types_equivalent_p (tree t1, tree t2
>  f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2))
>   {
> /* Skip non-fields.  */
> -   while (f1 && TREE_CODE (f1) != FIELD_DECL)
> +   while (f1 && skip_in_fields_list_p (f1))
>   f1 = TREE_CHAIN (f1);
> -   while (f2 && TREE_CODE (f2) != FIELD_DECL)
> +   while (f2 && skip_in_fields_list_p (f2))
>   f2 = TREE_CHAIN (f2);
> if (!f1 || !f2)
>   break;


Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.

2019-04-15 Thread Paul Richard Thomas
Dear Dominique, Gilles and Reinhold,

Thank you for your rapid feedback. We might even get a reasonably
functional ISO Fortran binding in place for 9-branch release :-)  On
your remaining nits:

(i) ISO_Fortran_binding_4.f90 -m32 -O1/Os looks awful. I will take a
look, though.

(ii) pr89844 being fixed by an earlier patch led me to give it lower
priority. I will look to see whether another testcase is required to
nail it down.

(iii) I will take a look at 90093 - it should be straightforward. I
do not regard it as being a regression, however, since the arguments
were not being correctly handled until now - i.e. they were not converted
from cfi to gfc descriptors.

Cheers

Paul

On Mon, 15 Apr 2019 at 10:27, Bader, Reinhold  wrote:
>
> Dear Paul,
>
> mostly looks good. Apart from a regression with optional arguments reported as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90093
> all other  test cases I have now execute correctly.
>
> Cheers
> Reinhold
>
> > -Original Message-
> > From: Paul Richard Thomas 
> > Sent: Sunday, 14 April 2019 20:16
> > To: Thomas Koenig 
> > Cc: Gilles Gouaillardet ; Bader, Reinhold
> > ; fort...@gcc.gnu.org; gcc-patches  > patc...@gcc.gnu.org>
> > Subject: Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.
> >
> > Hi Thomas,
> >
> > Thanks a lot. Committed as revision 270353.
> >
> > I was determined not to repeat the PDT experience, which is still nagging at
> > me. That has to be the next major gfc project, I guess.
> >
> > Regards
> >
> > Paul
> >
> > On Sun, 14 Apr 2019 at 18:08, Thomas Koenig 
> > wrote:
> > >
> > > Hi Paul,
> > >
> > >
> > > > Please find attached the updated patch, which fixes the problem with
> > > > -m32 in PR90022, eliminates the temporary creation for INTENT(IN)
> > > > dummies and fixes PR89846.
> > > >
> > > > While it looks like it should be intrusive because of its size, I
> > > > believe that the patch is still safe for trunk since it is hidden
> > > > behind tests for CFI descriptors.
> > > >
> > > > Bootstraps and regtests on FC29/x86_64 - OK for trunk?
> > >
> > > OK.
> > >
> > > If we're going into the gcc 9 release with an implementation of the C
> > > interop features, it will be better with fewer bugs :-)
> > >
> > > Thanks a lot for working on it!
> > >
> > > Regards
> > >
> > > Thomas
> >
> >
> >
> > --
> > "If you can't explain it simply, you don't understand it well enough"
> > - Albert Einstein



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).

2019-04-15 Thread Martin Liška
On 4/12/19 4:12 PM, H.J. Lu wrote:
> On Fri, Apr 12, 2019 at 4:41 AM Martin Liška  wrote:
>>
>> On 4/11/19 6:30 PM, H.J. Lu wrote:
>>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška  wrote:

>>>> Hi.
>>>>
>>>> The patch is adding missing AVX512 ISAs for target and target_clone
>>>> attributes.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>> Thanks,
>>>> Martin
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2019-04-10  Martin Liska  
>>>>
>>>> PR target/89929
>>>> * config/i386/i386.c (get_builtin_code_for_version): Add
>>>> support for missing AVX512 ISAs.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2019-04-10  Martin Liska  
>>>>
>>>> PR target/89929
>>>> * g++.target/i386/mv28.C: New test.
>>>> * gcc.target/i386/mvc14.c: New test.
>>>> ---
>>>>  gcc/config/i386/i386.c| 34 ++-
>>>>  gcc/testsuite/g++.target/i386/mv28.C  | 30 +++
>>>>  gcc/testsuite/gcc.target/i386/mvc14.c | 16 +
>>>>  3 files changed, 79 insertions(+), 1 deletion(-)
>>>>  create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
>>>>  create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c


>>>
>>
>> Hi.
>>
>>> Since any ISAs beyond AVX512F may be enabled individually, we
>>> can't simply assign priorities to them.   For GFNI, we can have
>>>
>>> 1. GFNI
>>> 2.  GFNI + AVX
>>> 3.  GFNI + AVX512F
>>> 4. GFNI + AVX512F + AVX512VL
>>
>> Makes sense to me! I'm considering syntax extension where one would be
>> able to come up with a priority. Eg.
>>
>> __attribute__((target("gfni,avx512bw", priority((3)
>>
>> Without that the ISA combinations are probably not comparable in a 
>> reasonable way.
>>
>>>
>>> For this code,  GFNI + AVX512BW is ignored:
>>>
>>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii
>>> __attribute__((target("gfni")))
>>> int foo(int i) {
>>> return 1;
>>> }
>>> __attribute__((target("gfni,avx512bw")))
>>> int foo(int i) {
>>> return 4;
>>> }
>>> __attribute__((target("default")))
>>> int foo(int i) {
>>> return 3;
>>> }
>>> int bar ()
>>> {
>>> return foo(2);
>>> }
>>
>> For 'target' attribute it works for me:
>>
>> 1) $ cat z.c && ./xg++ -B. z.c -c
>> #include <x86intrin.h>
>> volatile __m512i x1, x2;
>> volatile __mmask64 m64;
>>
>> __attribute__((target("gfni")))
>> int foo(int i) {
>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>> return 1;
>> }
>> __attribute__((target("gfni,avx512bw")))
>> int foo(int i) {
>> return 4;
>> }
>> __attribute__((target("default")))
>> int foo(int i) {
>>   return 3;
>> }
>> int bar ()
>> {
>> return foo(2);
>> }
>> In file included from ./include/immintrin.h:117,
>>  from ./include/x86intrin.h:32,
>>  from z.c:1:
>> z.c: In function ‘int foo(int)’:
>> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa option 
>> -m32 -mgfni -mavx512f
>> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>>   |  ^~~~
>> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment has 
>> changed in GCC 4.6
>>
>> 2) $ cat z.c && ./xg++ -B. z.c -c
>> #include <x86intrin.h>
>> volatile __m512i x1, x2;
>> volatile __mmask64 m64;
>>
>> __attribute__((target("gfni")))
>> int foo(int i) {
>> return 1;
>> }
>> __attribute__((target("gfni,avx512bw")))
>> int foo(int i) {
>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>> return 4;
>> }
>> __attribute__((target("default")))
>> int foo(int i) {
>>   return 3;
>> }
>> int bar ()
>> {
>> return foo(2);
>> }
>>
>> [OK]
>>
>> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ?
> 
> It does look odd.

I've just created a PR for that:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90096

Martin

> 
>> Similar applies to target_clone attribute where we'll have to come up with
>> a syntax that will allow multiple ISA to be combined. Something like:
>>
>> __attribute__((target_clones("gfni+avx512bw")))
>>
>> ? Priorities can be maybe implemented by order?
>>
> 
> I am thinking -misa=processor which will enable ISAs for
> processor.  It differs from -march=.  -misa= doesn't set
> -mtune.
> 



Re: [PATCH] Fix up RTL DCE find_call_stack_args (PR rtl-optimization/89965)

2019-04-15 Thread Michael Matz
Hi,

On Fri, 12 Apr 2019, Jeff Law wrote:

> > I don't think this follows. Imagine a pure foo tailcalling a pure bar.
> > To make the tailcall, foo may need to change some of its argument slots
> > to pass new arguments to bar.
> I'd claim that a pure/const call can't tail call another function as
> that would potentially modify the argument slots.

I still don't think that what you want follows.  Imagine this:

  int foo (int i) { ++i; return i; }

To claim that this function is anything other than const+pure seems weird 
(in fact this function doesn't access anything that must lie in memory at 
all).  Now take your run-of-the-mill ABI that passes arguments on the stack, and 
an only slightly bad code generator, i.e. -O0 on i386.  You will get a 
modification of the argument slot:

foo:
        pushl   %ebp
        movl    %esp, %ebp
        addl    $1, 8(%ebp)
        movl    8(%ebp), %eax
        popl    %ebp
        ret

So, if anything then the ownership of argument slots is a property of the 
psABI.  And while we may have been through this discussion a couple times 
over the years, I'm pretty sure that at least I consistently argued to 
declare all psABIs that leave argument slot ownerships with the callers 
(after the call actually happens) to be seriously broken^Wmisguided (and 
yes, also because it can prevent tail calls that otherwise would be 
perfectly valid).


Ciao,
Michael.


Re: [PATCH] fix ICEs in c-attribs.c (PR 88383, 89288, 89798, 89797)

2019-04-15 Thread Christophe Lyon
On Sat, 13 Apr 2019 at 00:38, Martin Sebor  wrote:
>
> On 4/12/19 3:42 PM, Jakub Jelinek wrote:
> > On Fri, Apr 12, 2019 at 10:45:25AM -0600, Jeff Law wrote:
> >>> gcc/ChangeLog:
> >>>
> >>> PR c/89797
> >>> * targhooks.c (default_vector_alignment): Avoid assuming
> >>> argument fits in SHWI.
> >>> * tree.h (TYPE_VECTOR_SUBPARTS): Avoid sign overflow in
> >>> a shift expression.
> >>>
> >>> gcc/c-family/ChangeLog:
> >>>
> >>> PR c/88383
> >>> PR c/89288
> >>> PR c/89798
> >>> PR c/89797
> >>> * c-attribs.c (type_valid_for_vector_size): Detect excessively
> >>> large sizes.
> ...
> >
> > Has the patch been tested at all?
>
> A few times.  The c-attribs.c change above didn't make it into
> the commit.

Hi,
Even with r270331, I'm still seeing the ICE on aarch64 (actually with
trunk @r270370)

Is there still some commit missing?

Thanks,

Christophe

> Martin


[PATCH] Tweak LIM MEM improvements to fix PR56049

2019-04-15 Thread Richard Biener


It turns out solving this long-standing optimization regression
is now easy by exploiting implementation details in how we
canonicalize refs in LIM.  This allows us to properly identify
MEM[(integer(kind=4)[64] *)&a][0] and MEM[(c_char * {ref-all})&a]
as the same, applying store-motion to an initialization (non-)loop
and thereby eliminating it.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

I didn't go further trying to exploit alias subset relationship
instead but the alias-set zero case is obvious enough to be
correct.

Richard.

2019-04-15  Richard Biener  

PR tree-optimization/56049
* tree-ssa-loop-im.c (mem_ref_hasher::equal): Elide alias-set
equality check if alias-set zero will prevail.

* gfortran.dg/pr56049.f90: New testcase.

Index: gcc/tree-ssa-loop-im.c
===
--- gcc/tree-ssa-loop-im.c  (revision 270366)
+++ gcc/tree-ssa-loop-im.c  (working copy)
@@ -178,7 +178,17 @@ mem_ref_hasher::equal (const im_mem_ref
&& known_eq (mem1->mem.size, obj2->size)
&& known_eq (mem1->mem.max_size, obj2->max_size)
&& mem1->mem.volatile_p == obj2->volatile_p
-   && mem1->mem.ref_alias_set == obj2->ref_alias_set
+   && (mem1->mem.ref_alias_set == obj2->ref_alias_set
+   /* We are not canonicalizing alias-sets but for the
+  special-case we didn't canonicalize yet and the
+  incoming ref is a alias-set zero MEM we pick
+  the correct one already.  */
+   || (!mem1->ref_canonical
+   && (TREE_CODE (obj2->ref) == MEM_REF
+   || TREE_CODE (obj2->ref) == TARGET_MEM_REF)
+   && obj2->ref_alias_set == 0)
+   /* Likewise if there's a canonical ref with alias-set zero.  */
+   || (mem1->ref_canonical && mem1->mem.ref_alias_set == 0))
&& types_compatible_p (TREE_TYPE (mem1->mem.ref),
   TREE_TYPE (obj2->ref)));
   else
Index: gcc/testsuite/gfortran.dg/pr56049.f90
===
--- gcc/testsuite/gfortran.dg/pr56049.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr56049.f90   (working copy)
@@ -0,0 +1,29 @@
+! { dg-do compile }
+! { dg-options "-O3 -fdump-tree-optimized" }
+
+program inline
+
+integer i
+integer a(8,8), b(8,8)
+
+a = 0
+do i = 1, 1000
+call add(b, a, 1)
+a = b
+end do
+
+print *, a
+
+contains
+
+subroutine add(b, a, o)
+integer, intent(inout) :: b(8,8)
+integer, intent(in) :: a(8,8), o
+b = a + o
+end subroutine add
+
+end program inline
+
+! Check there's no loop left, just two bb 2 in two functions.
+! { dg-final { scan-tree-dump-times "" 2 "optimized" } }
+! { dg-final { scan-tree-dump-times "" 2 "optimized" } }


Re: [PATCH] fix ICEs in c-attribs.c (PR 88383, 89288, 89798, 89797)

2019-04-15 Thread Jeff Law
On 4/15/19 7:12 AM, Christophe Lyon wrote:
> On Sat, 13 Apr 2019 at 00:38, Martin Sebor  wrote:
>>
>> On 4/12/19 3:42 PM, Jakub Jelinek wrote:
>>> On Fri, Apr 12, 2019 at 10:45:25AM -0600, Jeff Law wrote:
> gcc/ChangeLog:
>
> PR c/89797
> * targhooks.c (default_vector_alignment): Avoid assuming
> argument fits in SHWI.
> * tree.h (TYPE_VECTOR_SUBPARTS): Avoid sign overflow in
> a shift expression.
>
> gcc/c-family/ChangeLog:
>
> PR c/88383
> PR c/89288
> PR c/89798
> PR c/89797
> * c-attribs.c (type_valid_for_vector_size): Detect excessively
> large sizes.
>> ...
>>>
>>> Has the patch been tested at all?
>>
>> A few times.  The c-attribs.c change above didn't make it into
>> the commit.
> 
> Hi,
> Even with r270331, I'm still seeing the ICE on aarch64 (actually with
> trunk @r270370)
> 
> Is there still some commit missing?
> 
Or perhaps something else is broken.  My tester flagged these on aarch64:


> New tests that FAIL (4 tests):
> 
> gcc.dg/attr-vector_size.c (internal compiler error)
> gcc.dg/attr-vector_size.c (test for excess errors)
> gcc.dg/attr-vector_size.c LP64 (test for errors, line 33)
> gcc.dg/attr-vector_size.c LP64 (test for errors, line 60)

The rest of the tests passed.  It could well be something different
about the aarch64 port.  Seems like a bit of debugging is advisable.

jeff


Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).

2019-04-15 Thread H.J. Lu
On Mon, Apr 15, 2019 at 12:26 AM Martin Liška  wrote:
>
> On 4/12/19 4:12 PM, H.J. Lu wrote:
> > On Fri, Apr 12, 2019 at 4:41 AM Martin Liška  wrote:
> >>
> >> On 4/11/19 6:30 PM, H.J. Lu wrote:
> >>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška  wrote:
> 
> >>>> Hi.
> >>>>
> >>>> The patch is adding missing AVX512 ISAs for target and target_clone
> >>>> attributes.
> >>>>
> >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>>>
> >>>> Ready to be installed?
> >>>> Thanks,
> >>>> Martin
> >>>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>> 2019-04-10  Martin Liska  
> >>>>
> >>>> PR target/89929
> >>>> * config/i386/i386.c (get_builtin_code_for_version): Add
> >>>> support for missing AVX512 ISAs.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>> 2019-04-10  Martin Liska  
> >>>>
> >>>> PR target/89929
> >>>> * g++.target/i386/mv28.C: New test.
> >>>> * gcc.target/i386/mvc14.c: New test.
> >>>> ---
> >>>>  gcc/config/i386/i386.c| 34 ++-
> >>>>  gcc/testsuite/g++.target/i386/mv28.C  | 30 +++
> >>>>  gcc/testsuite/gcc.target/i386/mvc14.c | 16 +
> >>>>  3 files changed, 79 insertions(+), 1 deletion(-)
> >>>>  create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
> >>>>  create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c
> 
> 
> >>>
> >>
> >> Hi.
> >>
> >>> Since any ISAs beyond AVX512F may be enabled individually, we
> >>> can't simply assign priorities to them.   For GFNI, we can have
> >>>
> >>> 1. GFNI
> >>> 2.  GFNI + AVX
> >>> 3.  GFNI + AVX512F
> >>> 4. GFNI + AVX512F + AVX512VL
> >>
> >> Makes sense to me! I'm considering syntax extension where one would be
> >> able to come up with a priority. Eg.
> >>
> >> __attribute__((target("gfni,avx512bw", priority((3)
> >>
> >> Without that the ISA combinations are probably not comparable in a 
> >> reasonable way.
> >>
> >>>
> >>> For this code,  GFNI + AVX512BW is ignored:
> >>>
> >>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii
> >>> __attribute__((target("gfni")))
> >>> int foo(int i) {
> >>> return 1;
> >>> }
> >>> __attribute__((target("gfni,avx512bw")))
> >>> int foo(int i) {
> >>> return 4;
> >>> }
> >>> __attribute__((target("default")))
> >>> int foo(int i) {
> >>> return 3;
> >>> }
> >>> int bar ()
> >>> {
> >>> return foo(2);
> >>> }
> >>
> >> For 'target' attribute it works for me:
> >>
> >> 1) $ cat z.c && ./xg++ -B. z.c -c
> >> #include <x86intrin.h>
> >> volatile __m512i x1, x2;
> >> volatile __mmask64 m64;
> >>
> >> __attribute__((target("gfni")))
> >> int foo(int i) {
> >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
> >> return 1;
> >> }
> >> __attribute__((target("gfni,avx512bw")))
> >> int foo(int i) {
> >> return 4;
> >> }
> >> __attribute__((target("default")))
> >> int foo(int i) {
> >>   return 3;
> >> }
> >> int bar ()
> >> {
> >> return foo(2);
> >> }
> >> In file included from ./include/immintrin.h:117,
> >>  from ./include/x86intrin.h:32,
> >>  from z.c:1:
> >> z.c: In function ‘int foo(int)’:
> >> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa option 
> >> -m32 -mgfni -mavx512f
> >> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
> >>   |  ^~~~
> >> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment has 
> >> changed in GCC 4.6
> >>
> >> 2) $ cat z.c && ./xg++ -B. z.c -c
> >> #include <x86intrin.h>
> >> volatile __m512i x1, x2;
> >> volatile __mmask64 m64;
> >>
> >> __attribute__((target("gfni")))
> >> int foo(int i) {
> >> return 1;
> >> }
> >> __attribute__((target("gfni,avx512bw")))
> >> int foo(int i) {
> >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
> >> return 4;
> >> }
> >> __attribute__((target("default")))
> >> int foo(int i) {
> >>   return 3;
> >> }
> >> int bar ()
> >> {
> >> return foo(2);
> >> }
> >>
> >> [OK]
> >>
> >> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ?
> >
> > It does look odd.
>
> Then let me take a look at this.
>
> >
> >> Similar applies to target_clone attribute where we'll have to come up with
> >> a syntax that will allow multiple ISA to be combined. Something like:
> >>
> >> __attribute__((target_clones("gfni+avx512bw")))
> >>
> >> ? Priorities can be maybe implemented by order?
> >>
> >
> > I am thinking -misa=processor which will enable ISAs for
> > processor.  It differs from -march=.  -misa= doesn't set
> > -mtune.
> >
>
> Well, isn't that what we currently support, e.g.:
>
> $ cat mvc11.c && gcc mvc11.c -c
> __attribute__((target_clones("arch=sandybridge", "arch=cascadelake", 
> "default"))) int
> foo (void)
> {
>   return 0;
> }
>
> int
> main ()
> {
>   foo ();
> }
>
> If so, we can provide a new warning that will tell that for AVX512* on should 
> use 'arch=xyz'
> instead?
>

1. We don't have one option to enable AVX512F and AVX512CD, whic

[aarch64][RFA/RFC][rtl-optimization/87763] Add new movk pattern for aarch64

2019-04-15 Thread Jeff Law

Here's my attempt to fix the movk regression on bz 87763.

I still wonder if addressing some of these issues in combine is a better
long term solution, but in the immediate term I think backend patterns
are going to have to be the way to go.

This introduces a new insn_and_split that matches a movk via the
ior..and form.

We rewrite it back into the zero-extract form once operands0 and
operands1 match.  This allows insn fusion in the scheduler to work as it
expects the zero-extract form.
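
As a rough illustration of why the two forms are interchangeable, here is a
small C sketch. It is not part of the patch: both function names are
hypothetical, and it assumes the accepted AND masks are exactly those that
clear one aligned 16-bit field of a 64-bit value, with the IOR constant having
no bits outside that field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether (x & mask) | imm is a 16-bit
   field insertion, and if so report the field position and value.  */
static bool
movk_from_and_ior (uint64_t mask, uint64_t imm,
                   unsigned *shift, uint16_t *field)
{
  /* The inserted bits must lie entirely inside the cleared field.  */
  if ((mask & imm) != 0)
    return false;
  for (unsigned s = 0; s < 64; s += 16)
    if (mask == ~((uint64_t) 0xffff << s))
      {
        *shift = s;
        *field = (uint16_t) (imm >> s);
        return true;
      }
  return false;
}

/* Model of the field insertion: replace the 16 bits at SHIFT with
   FIELD and keep every other bit of X.  */
static uint64_t
apply_movk (uint64_t x, unsigned shift, uint16_t field)
{
  return (x & ~((uint64_t) 0xffff << shift)) | ((uint64_t) field << shift);
}
```

For any x, when movk_from_and_ior succeeds, apply_movk (x, shift, field)
gives the same value as (x & mask) | imm, which is the equivalence the split
relies on.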

While I have bootstrapped this on aarch64 and aarch64_be, I haven't done
anything with ILP32.

On aarch64 I have also run this through a regression test cycle where it
fixes the movk regression identified in bz87763.


Thoughts?  If we're generally happy with this direction I can look to
tackle the insv_1 and insv_2 regressions in a similar manner.

Jeff


* config/aarch64/aarch64.md: Add new pattern matching movk field
insertion via (and (ior ...)).

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index ab8786a933e..109694f9ef0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1161,6 +1161,54 @@
   [(set_attr "type" "mov_imm")]
 )
 
+;; This is for the combiner to use to encourage creation of
+;; bitfield insertions using movk.
+;;
+;; We rewrite back into a movk bitfield insertion to make sched
+;; fusion happy the first chance we get where the appropriate
+;; operands match.  After LRA they should always match.
+(define_insn_and_split ""
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+   (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+ (match_operand:GPI 2 "const_int_operand" "n"))
+(match_operand:GPI 3 "const_int_operand" "n")))]
+  "((UINTVAL (operands[2]) == 0x
+ || UINTVAL (operands[2]) == 0x
+ || UINTVAL (operands[2]) == 0x
+ || UINTVAL (operands[2]) == 0x)
+&& (UINTVAL (operands[2]) & UINTVAL (operands[3])) == 0)"
+  "#"
+  "&& rtx_equal_p (operands[0], operands[1])"
+  [(set (zero_extract: (match_dup 0)
+(const_int 16)
+(match_dup 2))
+   (match_dup 3))]
+  "{
+ if (UINTVAL (operands[2]) == 0x)
+   {
+ operands[2] = GEN_INT (0);
+ operands[3] = GEN_INT (UINTVAL (operands[3]) & 0x);
+   }
+ else if (UINTVAL (operands[2]) == 0x)
+   {
+ operands[2] = GEN_INT (16);
+ operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 16) & 0x);
+   }
+ else if (UINTVAL (operands[2]) == 0x)
+   {
+ operands[2] = GEN_INT (32);
+ operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 32) & 0x);
+   }
+ else if (UINTVAL (operands[2]) == 0x)
+   {
+ operands[2] = GEN_INT (48);
+ operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 48) & 0x);
+   }
+ else
+   gcc_unreachable ();
+   }"
+)
+
 (define_expand "movti"
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
(match_operand:TI 1 "general_operand" ""))]


Re: [PATCH v2] Fix __patchable_function_entries section flags

2019-04-15 Thread Joao Moreira




On 4/12/19 1:19 PM, Jeff Law wrote:
> On 4/11/19 11:18 AM, Joao Moreira wrote:
>> When -fpatchable-function-entry is used, gcc places nops on the
>> prologue of each compiled function and creates a section named
>> __patchable_function_entries which holds relocation entries for the
>> positions in which the nops were placed. As is, gcc creates this
>> section without the proper section flags, causing crashes in the
>> compiled program during its load.
>>
>> Given the above, fix the problem by creating the section with the
>> SECTION_WRITE and SECTION_RELRO flags.
>>
>> The problem was noticed while compiling glibc with
>> -fpatchable-function-entry compiler flag. After applying the patch,
>> this issue was solved.
>>
>> This was also tested on x86-64 arch without visible problems under
>> the gcc standard tests.
>>
>> 2019-04-10  Joao Moreira  
>>
>> * targhooks.c (default_print_patchable_function_entry): Emit
>> __patchable_function_entries section with writable flags to allow
>> relocation resolution.
>
> OK.  Do you have write access to the GCC repo?

No.

Tks,
Joao.

> jeff




[PATCH wwwdocs] Mention GNU Tools Cauldron in the News section

2019-04-15 Thread Simon Marchi
Hi,

Here is a patch that adds a mention of the 2019 Cauldron, similar to the entries
for the previous editions.

Thanks,

Simon


Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.1125
diff -u -r1.1125 index.html
--- index.html  29 Mar 2019 12:28:15 -  1.1125
+++ index.html  15 Apr 2019 16:39:00 -
@@ -54,6 +54,10 @@
 News
 

+https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron 
2019
+[2019-04-15]
+Held in Montréal, Canada, September 13-15 2019
+
 GCC 8.3 released
 [2019-02-22]
 


Re: [PATCH] (RFA tree-tailcall) PR c++/82081 - tail call optimization breaks noexcept

2019-04-15 Thread Andrew Pinski
On Sun, Apr 14, 2019 at 11:50 PM Richard Biener
 wrote:
>
> On Sat, Apr 13, 2019 at 12:34 AM Jeff Law  wrote:
> >
> > On 4/12/19 3:24 PM, Jason Merrill wrote:
> > > If a noexcept function calls a function that might throw, doing the tail
> > > call optimization means that an exception thrown in the called function
> > > will propagate out, breaking the noexcept specification.  So we need to
> > > prevent the optimization in that case.
> > >
> > > Tested x86_64-pc-linux-gnu.  OK for trunk or hold for GCC 10?  This isn't 
> > > a
> > > regression, but it is a straightforward fix for a wrong-code bug.
> > >
> > >   * tree-tailcall.c (find_tail_calls): Don't turn a call from a
> > >   nothrow function to a might-throw function into a tail call.
> > I'd go on the trunk.  It's a wrong-code issue, what we're doing is just
> > plain wrong.  One could even make a case for backporting to the branches.
>
> Hmm, how's this different from adding another indirection?  That is,
> I don't understand why the tailcall is the issue here, shouldn't unwind
> still stop at the noexcept caller?  Thus, isn't this wrong CFI instead?

After the tail call the noexcept caller is no longer on the stack, so the
unwinder does not see it.  The issue is not a tail call from a normal function
to a noexcept one, but rather a tail call inside a noexcept caller to a
normal function.

>
> Of course I know too little about this.
>
> Btw, doesn't your check also prevent tail/sibling calls when
> the caller wraps it into a try { } catch (...) {}?  Or does unwind
> not work in that case either?
>
> Btw, I'd like to see a runtime testcase that fails.

There is one in the bug report.  Though it would not work for the
testsuite.  It should not be hard to change it to be one that works
for the testsuite.

Thanks,
Andrew Pinski

>
> Richard.
>
> > jeff
> >
> > ps.  I'm a bit surprised it hasn't been reported until now.


Re: [PATCH wwwdocs] Mention GNU Tools Cauldron in the News section

2019-04-15 Thread Simon Marchi
On 2019-04-15 12:42 p.m., Simon Marchi wrote:
> Hi,
> 
> Here is a patch that adds a mention of the 2019 Cauldron, similar to the 
> entries
> for the previous editions.
> 
> Thanks,
> 
> Simon
> 
> 
> Index: index.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
> retrieving revision 1.1125
> diff -u -r1.1125 index.html
> --- index.html29 Mar 2019 12:28:15 -  1.1125
> +++ index.html15 Apr 2019 16:39:00 -
> @@ -54,6 +54,10 @@
>  News
>  
> 
> +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron 
> 2019
> +[2019-04-15]
> +Held in Montréal, Canada, September 13-15 2019
> +
>  GCC 8.3 released
>  [2019-02-22]
>  
> 

Actually, it would be better to use the same dates as are written on the wiki
(12-15), so please consider the patch below instead.

Also, please note that I don't have push access on GCC, so if somebody could
push the patch for me, once it's approved, I would appreciate it.  Thanks!



Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.1125
diff -u -r1.1125 index.html
--- index.html  29 Mar 2019 12:28:15 -  1.1125
+++ index.html  15 Apr 2019 17:34:48 -
@@ -54,6 +54,10 @@
 News
 

+https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron 
2019
+[2019-04-15]
+Held in Montréal, Canada, September 12-15 2019
+
 GCC 8.3 released
 [2019-02-22]
 


Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs

2019-04-15 Thread Segher Boessenkool
On Mon, Apr 15, 2019 at 01:54:11PM +0200, Florian Weimer wrote:
> * Richard Biener:
> 
> > Of course adding sth like a .robss section would be nice.
> 
> I think this is strictly a link editor issue because a read-only PT_LOAD
> directive with a memory size larger than the file size already produces
> read-only zero pages, without requiring a file allocation.

But .rodata normally is not the last thing in its segment (the .eh*
things are after it, and those are usually not all zero).


Segher


Re: [C++ Patch/RFC] PR 89900 ("[9 Regression] ICE: Segmentation fault (in check_instantiated_arg)")

2019-04-15 Thread Paolo Carlini

Hi,

On 12/04/19 20:29, Jason Merrill wrote:

On 4/11/19 11:20 AM, Paolo Carlini wrote:

Hi,

over the last few days I spent some time on this regression, which at 
first seemed just a minor error-recovery issue, but then I noticed 
that very slightly tweaking the original testcase uncovered a pretty 
serious ICE on valid:


template void
fk (XE..., int/*SW*/);

void
w9 (void)
{
   fk (0);
}

The regression has to do with the changes committed by Jason for 
c++/86932, in particular with the condition in coerce_template_parms:


    if (template_parameter_pack_p (TREE_VALUE (parm))
   && (arg || !(complain & tf_partial))
   && !(arg && ARGUMENT_PACK_P (arg)))

which has the additional (arg || !(complain & tf_partial)) false for 
the present testcase, thus the null arg is not changed into an empty 
pack, thus later instantiate_template calls check_instantiated_args 
which finds it still null and crashes. Now, likely some additional 
analysis is in order, but for sure there is an important difference 
between the testcase which came with c++/86932 and the above: 
non-type vs type template parameter pack. It seems to me that the 
kind of problem fixed in c++/86932 cannot occur with type packs, 
because it boils down to a reference to a previous parm (full 
disclosure: the comments and logic in fixed_parameter_pack_p helped 
me a lot here). Thus I had the idea of simply restricting the scope 
of the new condition above by adding an || TREE_CODE (TREE_VALUE 
(parm)) == TYPE_DECL, which definitely leads to a clean testsuite and 
a proper behavior on the new testcases, AFAICS. I'm attaching what I 
tested on x86_64-linux.


I think the important property here is that it's non-terminal, not 
that it's a type pack.  We can't deduce anything for a non-terminal 
pack, so we should go ahead and make an empty pack.
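
To illustrate the point about non-terminal packs, here is a minimal
sketch.  The template parameter lists of the testcases in this thread
were lost in the archive, so the declaration below is an assumption,
and the names leading_pack_size and check_leading_pack are hypothetical.
A function parameter pack that is not last in the parameter list is a
non-deduced context, so with one explicit template argument the pack
should end up empty:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical declaration, not taken from the PR's testcases.
// Ts... appears before the trailing int, so it is a non-terminal
// function parameter pack: nothing can be deduced for it and it is
// taken to be an empty pack.
template <typename First, typename... Ts>
std::size_t leading_pack_size(Ts..., int)
{
  return sizeof...(Ts);
}

// One explicit template argument (First = int), no arguments for Ts.
inline void check_leading_pack()
{
  assert(leading_pack_size<int>(0) == 0);
}
```

A compiler that handles this case correctly accepts the call and
treats it as leading_pack_size<int>(int).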


I see.

Then what about something bolder, like the below? Instead of fiddling 
with the details of coerce_template_parms - how it handles the explicit 
arguments - in fn_type_unification we deal with both parameter_pack == 
true and false in the same way when targ == NULL_TREE, thus we set 
incomplete. Then, for the new testcases, since incomplete is true, there 
is no jump to the deduced label and type_unification_real takes care of 
making the empty pack - the same happens already when there are no 
explicit arguments. Tested x86_64-linux. I also checked quite a few 
other variants of the tests but nothing new, nothing interesting, showed 
up...


Thanks, Paolo.


Index: cp/pt.c
===
--- cp/pt.c (revision 270364)
+++ cp/pt.c (working copy)
@@ -20176,21 +20176,17 @@ fn_type_unification (tree fn,
   parameter_pack = TEMPLATE_PARM_PARAMETER_PACK (parm);
 }
 
- if (!parameter_pack && targ == NULL_TREE)
+ if (targ == NULL_TREE)
/* No explicit argument for this template parameter.  */
incomplete = true;
-
-  if (parameter_pack && pack_deducible_p (parm, fn))
+ else if (parameter_pack && pack_deducible_p (parm, fn))
 {
   /* Mark the argument pack as "incomplete". We could
  still deduce more arguments during unification.
 We remove this mark in type_unification_real.  */
-  if (targ)
-{
-  ARGUMENT_PACK_INCOMPLETE_P(targ) = 1;
-  ARGUMENT_PACK_EXPLICIT_ARGS (targ) 
-= ARGUMENT_PACK_ARGS (targ);
-}
+ ARGUMENT_PACK_INCOMPLETE_P(targ) = 1;
+ ARGUMENT_PACK_EXPLICIT_ARGS (targ)
+   = ARGUMENT_PACK_ARGS (targ);
 
   /* We have some incomplete argument packs.  */
   incomplete = true;
Index: testsuite/g++.dg/cpp0x/pr89900-1.C
===
--- testsuite/g++.dg/cpp0x/pr89900-1.C  (nonexistent)
+++ testsuite/g++.dg/cpp0x/pr89900-1.C  (working copy)
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template void
+fk (XE..., SW);  // { dg-error "12:.SW. has not been declared" }
+
+void
+w9 (void)
+{
+  fk (0);
+}
Index: testsuite/g++.dg/cpp0x/pr89900-2.C
===
--- testsuite/g++.dg/cpp0x/pr89900-2.C  (nonexistent)
+++ testsuite/g++.dg/cpp0x/pr89900-2.C  (working copy)
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template void
+fk (XE..., int);
+
+void
+w9 (void)
+{
+  fk (0);
+}
Index: testsuite/g++.dg/cpp0x/pr89900-3.C
===
--- testsuite/g++.dg/cpp0x/pr89900-3.C  (nonexistent)
+++ testsuite/g++.dg/cpp0x/pr89900-3.C  (working copy)
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template void
+fk (XE..., SW);  // { dg-error "12:.SW. has not been declared" }
+
+void
+w9 (void)
+{
+  fk (0);
+}
Index: testsuite/g++.dg/cpp0x/p

[PATCH, libphobos] Committed merge with upstream druntime

2019-04-15 Thread Iain Buclaw
Hi,

This patch merges the libdruntime library with upstream druntime 70b9fea6.

Backports fixes in the extern(C) bindings for the Solaris/SPARC port.

Bootstrapped and regression tested on x86_64-linux-gnu and i386-pc-solaris2.11.

Committed to trunk as r270372.

-- 
Iain
---
diff --git a/libphobos/libdruntime/MERGE b/libphobos/libdruntime/MERGE
index a7bbd3da964..dd5f621082f 100644
--- a/libphobos/libdruntime/MERGE
+++ b/libphobos/libdruntime/MERGE
@@ -1,4 +1,4 @@
-175bf5fc69d26fec60d533ff77f7e915fd5bb468
+70b9fea60246e63d936ad6826b1b48b6e0f1de8f
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/druntime repository.
diff --git a/libphobos/libdruntime/core/sys/posix/ucontext.d b/libphobos/libdruntime/core/sys/posix/ucontext.d
index 52b16864917..6200bfc3fe2 100644
--- a/libphobos/libdruntime/core/sys/posix/ucontext.d
+++ b/libphobos/libdruntime/core/sys/posix/ucontext.d
@@ -25,6 +25,10 @@ nothrow:
 
 version (RISCV32) version = RISCV_Any;
 version (RISCV64) version = RISCV_Any;
+version (SPARC)   version = SPARC_Any;
+version (SPARC64) version = SPARC_Any;
+version (X86) version = X86_Any;
+version (X86_64)  version = X86_Any;
 
 //
 // XOpen (XSI)
@@ -1029,6 +1033,8 @@ else version (DragonFlyBSD)
 }
 else version (Solaris)
 {
+private import core.stdc.stdint;
+
 alias uint[4] upad128_t;
 
 version (SPARC64)
@@ -1127,10 +1133,13 @@ else version (Solaris)
 }
 else version (X86_64)
 {
-union _u_st
+private
 {
-ushort[5]   fpr_16;
-upad128_t   __fpr_pad;
+union _u_st
+{
+ushort[5]   fpr_16;
+upad128_t   __fpr_pad;
+}
 }
 
 struct fpregset_t
@@ -1189,20 +1198,94 @@ else version (Solaris)
 else
 static assert(0, "unimplemented");
 
-struct mcontext_t
+version (SPARC_Any)
 {
-gregset_t   gregs;
-fpregset_t  fpregs;
+private
+{
+struct rwindow
+{
+greg_t[8] rw_local;
+greg_t[8] rw_in;
+}
+
+struct gwindows_t
+{
+int wbcnt;
+greg_t[31] *spbuf;
+rwindow[31] wbuf;
+}
+
+struct xrs_t
+{
+uint xrs_id;
+caddr_t  xrs_ptr;
+}
+
+struct cxrs_t
+{
+uint cxrs_id;
+caddr_t  cxrs_ptr;
+}
+
+alias int64_t[16] asrset_t;
+}
+
+struct mcontext_t
+{
+gregset_tgregs;
+gwindows_t   *gwins;
+fpregset_t   fpregs;
+xrs_txrs;
+version (SPARC64)
+{
+asrset_t asrs;
+cxrs_t   cxrs;
+c_long[2] filler;
+}
+else version (SPARC)
+{
+cxrs_t   cxrs;
+c_long[17] filler;
+}
+}
+}
+else version (X86_Any)
+{
+private
+{
+struct xrs_t
+{
+uint xrs_id;
+caddr_t  xrs_ptr;
+}
+}
+
+struct mcontext_t
+{
+gregset_t   gregs;
+fpregset_t  fpregs;
+}
 }
 
 struct ucontext_t
 {
-c_ulong  uc_flags;
+version (SPARC_Any)
+uintuc_flags;
+else version (X86_Any)
+c_ulong uc_flags;
 ucontext_t  *uc_link;
 sigset_tuc_sigmask;
 stack_t uc_stack;
 mcontext_t  uc_mcontext;
-c_long[5]   uc_filler;
+version (SPARC64)
+c_long[4]  uc_filler;
+else version (SPARC)
+c_long[23] uc_filler;
+else version (X86_Any)
+{
+xrs_t  uc_xrs;
+c_long[3]  uc_filler;
+}
 }
 }
 else version (CRuntime_UClibc)
@@ -1399,7 +1482,20 @@ int  swapcontext(ucontext_t*, in ucontext_t*);
 static if ( is( ucontext_t ) )
 {
 int  getcontext(ucontext_t*);
-void makecontext(ucontext_t*, void function(), int, ...);
+
+version (Solaris)
+{
+version (SPARC_Any)
+{
+void __makecontext_v2(ucontext_t*, void function(), int, ...);
+alias makecontext = __makecontext_v2;
+}
+else
+void makecontext(ucontext_t*, void function(), int, ...);
+}
+else
+void makecontext(ucontext_t*, void function(), int, ...);
+
 int  setcontext(in ucontext_t*);
 int  swapcontext(ucontext_t*, in ucontext_t*);
 }
diff --git a/libphobos/libdruntime/core/sys/solaris/link.d b/libphobos/libdruntime/core/sys/solaris/link.d
index c3e75de481e..2d908b12184 100644
--- a/libphobos/libdruntime/core/sys/solaris/link.d
+++ b/libphobos/libdrunt

[PATCH rs6000] Fix PR target/84369: gcc.dg/sms-10.c fails on Power9

2019-04-15 Thread Pat Haugen
As pointed out in the PR, the test is failing because a store->load dependency 
is reporting zero cost. Fixed by leaving existing costs as is (i.e. cost for 
update forms), and just adding a simple bypass for store->load dependencies.

Bootstrap/regtest on powerpc64le (Power9) with no new regressions and testcase 
now passing. Also ran cpu2006/cpu2017 benchmark comparisons with no notable 
differences. Ok for trunk?

-Pat


2019-04-15  Pat Haugen  

PR target/84369
* config/rs6000/power9.md: Add store forwarding bypass.

Index: gcc/config/rs6000/power9.md
===
--- gcc/config/rs6000/power9.md (revision 270261)
+++ gcc/config/rs6000/power9.md (working copy)
@@ -236,6 +236,9 @@ (define_insn_reservation "power9-vecstor
(eq_attr "cpu" "power9"))
   "DU_super_power9,LSU_pair_power9")
 
+; Store forwarding latency is 6
+(define_bypass 6 "power9-*store*" "power9-*load*")
+
 (define_insn_reservation "power9-larx" 4
   (and (eq_attr "type" "load_l")
(eq_attr "cpu" "power9"))



[committed] Fix various microblaze-linux failures

2019-04-15 Thread Jeff Law
microblaze testing in my tester has occasionally been failing
Warray-bounds-40 and Wstringop-overflow-9.  I finally took a little peek
because these occasional failures show up as a regression against the
prior run.

It looks like the microblaze backend is trying to inline a move of
SIZE_MAX bytes.  Ugh.  Not surprisingly the problem is the target bits
treating the size as a signed integer in a comparison.

Fixing this is pretty simple thankfully.  I didn't audit the entire
port, just microblaze_expand_block_move.

Here's what I'm installing on the trunk -- it basically ensures we treat
the size and alignment as unsigned values.  It also fixes errors with
string-large-1.c.   
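
As a rough illustration of the signedness problem, not the backend code
itself, the sketch below compares a SIZE_MAX-like length against a small
limit in both readings.  kMaxMoveBytes and the two helper functions are
hypothetical stand-ins, and 64-bit fixed-width types are assumed in
place of HOST_WIDE_INT:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the backend's MAX_MOVE_BYTES limit.
constexpr int kMaxMoveBytes = 8;

// Signed reading: an all-ones length is seen as -1 on the usual
// two's complement hosts, so it passes the "small enough" check.
inline bool small_enough_signed(std::uint64_t length)
{
  const std::int64_t bytes = static_cast<std::int64_t>(length);
  return bytes <= kMaxMoveBytes;
}

// Unsigned reading: the same length is a huge value and is rejected.
inline bool small_enough_unsigned(std::uint64_t length)
{
  return length <= static_cast<std::uint64_t>(kMaxMoveBytes);
}
```

With UINT64_MAX as the length, the signed helper accepts the move and
the unsigned helper rejects it, while ordinary small lengths behave the
same in both.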

Jeff



* config/microblaze/microblaze.c (microblaze_expand_block_move): Treat
size and alignment as unsigned.

diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c
index 70910fd1dde..55c1becf975 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -1258,8 +1258,8 @@ microblaze_expand_block_move (rtx dest, rtx src, rtx 
length, rtx align_rtx)
 
   if (GET_CODE (length) == CONST_INT)
 {
-  HOST_WIDE_INT bytes = INTVAL (length);
-  int align = INTVAL (align_rtx);
+  unsigned HOST_WIDE_INT bytes = UINTVAL (length);
+  unsigned int align = UINTVAL (align_rtx);
 
   if (align > UNITS_PER_WORD)
{
@@ -1267,7 +1267,7 @@ microblaze_expand_block_move (rtx dest, rtx src, rtx 
length, rtx align_rtx)
}
   else if (align < UNITS_PER_WORD)
{
- if (INTVAL (length) <= MAX_MOVE_BYTES)
+ if (UINTVAL (length) <= MAX_MOVE_BYTES)
{
  move_by_pieces (dest, src, bytes, align, RETURN_BEGIN);
  return true;
@@ -1276,14 +1276,14 @@ microblaze_expand_block_move (rtx dest, rtx src, rtx 
length, rtx align_rtx)
return false;
}
 
-  if (INTVAL (length) <= 2 * MAX_MOVE_BYTES)
+  if (UINTVAL (length) <= 2 * MAX_MOVE_BYTES)
{
- microblaze_block_move_straight (dest, src, INTVAL (length));
+ microblaze_block_move_straight (dest, src, UINTVAL (length));
  return true;
}
   else if (optimize)
{
- microblaze_block_move_loop (dest, src, INTVAL (length));
+ microblaze_block_move_loop (dest, src, UINTVAL (length));
  return true;
}
 }


New French PO file for 'gcc' (version 9.1-b20190414)

2019-04-15 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

https://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-9.1-b20190414.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH wwwdocs] Mention GNU Tools Cauldron in the News section

2019-04-15 Thread Eric Gallager
On 4/15/19, Simon Marchi  wrote:
> Hi,
>
> Here is a patch that adds a mention of the 2019 Cauldron, similar to the
> entries for the previous editions.
>
> Thanks,
>
> Simon
>
>
> Index: index.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
> retrieving revision 1.1125
> diff -u -r1.1125 index.html
> --- index.html29 Mar 2019 12:28:15 -  1.1125
> +++ index.html15 Apr 2019 16:39:00 -
> @@ -54,6 +54,10 @@
>  News
>  
>
> +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools
> Cauldron 2019
> +[2019-04-15]
> +Held in Montréal, Canada, September 13-15 2019
> +

Hey Montréal, I might actually be able to go this year! How do I register?

>  GCC 8.3 released
>  [2019-02-22]
>  
>

Eric Gallager


[PATCH, libphobos] Committed fix configure test for backtrace-supported.h

2019-04-15 Thread Iain Buclaw
Hi,

When porting/testing the D front-end to FreeBSD, I noticed that the
check for backtrace support returned false during the configuration of
libphobos.

The use of += assignment in the configure test was the reason why: it
is a shell extension that not every /bin/sh accepts.  That has now been
corrected.

Bootstrapped and regression tested on x86_64-linux-gnu and x86_64-freebsd11.2.

Committed to trunk as r270377.

-- 
Iain
---
libphobos/ChangeLog:

2019-04-16  Iain Buclaw  

* config.h.in: Regenerate.
* configure: Regenerate.
* m4/druntime/libraries.m4 (DRUNTIME_LIBRARIES_BACKTRACE): Set
CPPFLAGS correctly for backtrace support test.

---
diff --git a/libphobos/config.h.in b/libphobos/config.h.in
index 19266b3b5e4..0249849c890 100644
--- a/libphobos/config.h.in
+++ b/libphobos/config.h.in
@@ -54,3 +54,35 @@
 
 /* Define to 1 if you have the ANSI C header files. */
 #undef STDC_HEADERS
+
+/* Enable extensions on AIX 3, Interix.  */
+#ifndef _ALL_SOURCE
+# undef _ALL_SOURCE
+#endif
+/* Enable GNU extensions on systems that have them.  */
+#ifndef _GNU_SOURCE
+# undef _GNU_SOURCE
+#endif
+/* Enable threading extensions on Solaris.  */
+#ifndef _POSIX_PTHREAD_SEMANTICS
+# undef _POSIX_PTHREAD_SEMANTICS
+#endif
+/* Enable extensions on HP NonStop.  */
+#ifndef _TANDEM_SOURCE
+# undef _TANDEM_SOURCE
+#endif
+/* Enable general extensions on Solaris.  */
+#ifndef __EXTENSIONS__
+# undef __EXTENSIONS__
+#endif
+
+
+/* Define to 1 if on MINIX. */
+#undef _MINIX
+
+/* Define to 2 if the system does not provide POSIX.1 features except with
+   this defined. */
+#undef _POSIX_1_SOURCE
+
+/* Define to 1 if you need to in order for `stat' and other things to work. */
+#undef _POSIX_SOURCE
diff --git a/libphobos/configure b/libphobos/configure
index 87e4e4a7c9b..8079a73527d 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -14838,7 +14838,7 @@ fi
 LIBBACKTRACE=../../libbacktrace/libbacktrace.la
 
 gdc_save_CPPFLAGS=$CPPFLAGS
-CPPFLAGS+=" -I../libbacktrace "
+CPPFLAGS="$CPPFLAGS -I../libbacktrace "
 
 ac_fn_c_check_header_mongrel "$LINENO" "backtrace-supported.h" "ac_cv_header_backtrace_supported_h" "$ac_includes_default"
 if test "x$ac_cv_header_backtrace_supported_h" = xyes; then :
diff --git a/libphobos/m4/druntime/libraries.m4 b/libphobos/m4/druntime/libraries.m4
index 6e81fd99e4b..a7aab4dd88b 100644
--- a/libphobos/m4/druntime/libraries.m4
+++ b/libphobos/m4/druntime/libraries.m4
@@ -178,7 +178,7 @@ AC_DEFUN([DRUNTIME_LIBRARIES_BACKTRACE],
 LIBBACKTRACE=../../libbacktrace/libbacktrace.la
 
 gdc_save_CPPFLAGS=$CPPFLAGS
-CPPFLAGS+=" -I../libbacktrace "
+CPPFLAGS="$CPPFLAGS -I../libbacktrace "
 
 AC_CHECK_HEADER(backtrace-supported.h, have_libbacktrace_h=true,
   have_libbacktrace_h=false)