Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
On Sat, Apr 13, 2019 at 8:48 PM Thomas Koenig wrote: > > Hello world, > > the attached patch fixes a 8/9 regression where _def_init, an internal > Fortran variable containing only zeros, was placed into the .rodata > section. This led to a large increase in executable size. > > There should be no impact on other languages because the change to > varasm.c is guarded by lang_GNU_Fortran (). > > Regarding the test case: I did find one other test which checks > for .bss, so I suppose this is safe. > > Regression-tested with a full test (--enable-languages=all and > make -j64 -k check) on POWER9. > > I would like to apply it to both affected branches. > > OK for the general and the Fortran part? This won't work with LTO. Note we have the issue in the middle-end as well since we promote variables we see are not written to to TREE_READONLY. This can be seen with (the somewhat artificial...): int a[1024*1024] = { 0 }; int __attribute__((noinline)) foo() { return *(volatile int *)a; } int main() { return foo (); } where without -flto a gets placed into .bss while with -flto it gets into .rodata. So I believe we should add a DECL flag specifying whether for section placement we can "ignore" TREE_READONLY. We'd initialize that with the original state of TREE_READONLY so that the R/O promotion doesn't change section placement. Also the Fortran FE can then simply set this flag on variables that may live in .bss. There are 14 unused bits in tree_decl_with_vis so a patch for the middle-end part could look like the attached (w/o solving the LTO issue yet). Of course adding sth like a .robss section would be nice. Richard. > Regards > > Thomas > > 2019-04-13 Thomas Koenig > > PR fortran/84487 > * trans-decl.c (gfc_get_symbol_decl): Mark _def_init as > artificial. > > 2019-04-13 Thomas Koenig > > PR fortran/84487 > * varasm.c (bss_initializer_p): If we are compiling Fortran, the > decl is artifical and it has a size larger than 255, it can be > put into BSS. 
> > 2019-04-13 Thomas Koenig > > PR fortran/84487 > * gfortran.dg/def_init_1.f90: New test.
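For reference, the test case quoted in the reply above can be laid out as a compilable C file. This is only a sketch: `main` is left out so the fragment can be linked into any driver, and the section placement (.bss without -flto, .rodata with -flto) is the behaviour reported in the mail, not something the code checks by itself.

```c
/* Zero-initialized and never written to.  According to the mail above,
   GCC places this in .bss without -flto, but promotes it to
   TREE_READONLY and places it in .rodata with -flto.  */
int a[1024 * 1024] = { 0 };

/* The volatile read keeps the access to 'a' from being optimized away;
   noinline keeps the access in its own function.  */
int __attribute__ ((noinline))
foo (void)
{
  return *(volatile int *) a;
}
```

One way to compare the two placements, assuming binutils is installed, is to build the file once with and once without -flto and look at the section sizes with `size -A`.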
Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).
On 4/12/19 4:12 PM, H.J. Lu wrote: > On Fri, Apr 12, 2019 at 4:41 AM Martin Liška wrote: >> >> On 4/11/19 6:30 PM, H.J. Lu wrote: >>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška wrote: Hi. The patch is adding missing AVX512 ISAs for target and target_clone attributes. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin gcc/ChangeLog: 2019-04-10 Martin Liska PR target/89929 * config/i386/i386.c (get_builtin_code_for_version): Add support for missing AVX512 ISAs. gcc/testsuite/ChangeLog: 2019-04-10 Martin Liska PR target/89929 * g++.target/i386/mv28.C: New test. * gcc.target/i386/mvc14.c: New test. --- gcc/config/i386/i386.c| 34 ++- gcc/testsuite/g++.target/i386/mv28.C | 30 +++ gcc/testsuite/gcc.target/i386/mvc14.c | 16 + 3 files changed, 79 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.target/i386/mv28.C create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c >>> >> >> Hi. >> >>> Since any ISAs beyond AVX512F may be enabled individually, we >>> can't simply assign priorities to them. For GFNI, we can have >>> >>> 1. GFNI >>> 2. GFNI + AVX >>> 3. GFNI + AVX512F >>> 4. GFNI + AVX512F + AVX512VL >> >> Makes sense to me! I'm considering syntax extension where one would be >> able to come up with a priority. Eg. >> >> __attribute__((target("gfni,avx512bw", priority((3) >> >> Without that the ISA combinations are probably not comparable in a >> reasonable way. >> >>> >>> For this code, GFNI + AVX512BW is ignored: >>> >>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii >>> __attribute__((target("gfni"))) >>> int foo(int i) { >>> return 1; >>> } >>> __attribute__((target("gfni,avx512bw"))) >>> int foo(int i) { >>> return 4; >>> } >>> __attribute__((target("default"))) >>> int foo(int i) { >>> return 3; >>> } >>> int bar () >>> { >>> return foo(2); >>> } >> >> For 'target' attribute it works for me: >> >> 1) $ cat z.c && ./xg++ -B. 
z.c -c >> #include >> volatile __m512i x1, x2; >> volatile __mmask64 m64; >> >> __attribute__((target("gfni"))) >> int foo(int i) { >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); >> return 1; >> } >> __attribute__((target("gfni,avx512bw"))) >> int foo(int i) { >> return 4; >> } >> __attribute__((target("default"))) >> int foo(int i) { >> return 3; >> } >> int bar () >> { >> return foo(2); >> } >> In file included from ./include/immintrin.h:117, >> from ./include/x86intrin.h:32, >> from z.c:1: >> z.c: In function ‘int foo(int)’: >> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa option >> -m32 -mgfni -mavx512f >> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); >> | ^~~~ >> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment has >> changed in GCC 4.6 >> >> 2) $ cat z.c && ./xg++ -B. z.c -c >> #include >> volatile __m512i x1, x2; >> volatile __mmask64 m64; >> >> __attribute__((target("gfni"))) >> int foo(int i) { >> return 1; >> } >> __attribute__((target("gfni,avx512bw"))) >> int foo(int i) { >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); >> return 4; >> } >> __attribute__((target("default"))) >> int foo(int i) { >> return 3; >> } >> int bar () >> { >> return foo(2); >> } >> >> [OK] >> >> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ? > > It does look odd. Then let me take a look at this. > >> Similar applies to target_clone attribute where we'll have to come up with >> a syntax that will allow multiple ISA to be combined. Something like: >> >> __attribute__((target_clones("gfni+avx512bw"))) >> >> ? Priorities can be maybe implemented by order? >> > > I am thinking -misa=processor which will enable ISAs for > processor. It differs from -march=. -misa= doesn't set > -mtune. 
> Well, isn't that what we currently support, e.g.: $ cat mvc11.c && gcc mvc11.c -c __attribute__((target_clones("arch=sandybridge", "arch=cascadelake", "default"))) int foo (void) { return 0; } int main () { foo (); } If so, we can provide a new warning that will tell that for AVX512* on should use 'arch=xyz' instead? Thanks, Martin
Re: [PATCH] Reset proper type on vector types (PR middle-end/88587).
On Mon, Apr 15, 2019 at 8:48 AM Martin Liška wrote: > > Hi. > > Apparently, there's one another PR: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90083 > > May I backport the patch to GCC-8 branch? Hmm, it isn't a regression, right? But it only affects multi-versioning, so yes, go ahead. Might as well consider GCC 7 then - do you have an overall idea of the state of the MV stuff on branches? IIRC you've done most of the "fixes"? Richard. > Thanks, > Martin
[PATCH] Filter out LTO in config/bootstrap-lto-lean.mk.
Hi.

The patch fixes bootstrap-lto-lean.mk, where LTO was wrongly used in
STAGEtrain when building with PGO.

Tested on the openSUSE gcc9 package; the build log is at:
https://drive.google.com/file/d/17sxGf_x_VaUekPk2SHI9joIXg1BR5-dY/view?usp=sharing

Ready to be installed?
Thanks,
Martin

config/ChangeLog:

2019-04-15 Martin Liska

* bootstrap-lto-lean.mk: Filter out -flto in STAGEtrain_CFLAGS.
---
 config/bootstrap-lto-lean.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config/bootstrap-lto-lean.mk b/config/bootstrap-lto-lean.mk
index ee36f6fe544..79cea50a4c6 100644
--- a/config/bootstrap-lto-lean.mk
+++ b/config/bootstrap-lto-lean.mk
@@ -2,6 +2,7 @@
 # Otherwise, LTO is used in only stage3.
 STAGE3_CFLAGS += -flto=jobserver
+override STAGEtrain_CFLAGS := $(filter-out -flto=jobserver,$(STAGEtrain_CFLAGS))
 STAGEtrain_GENERATOR_CFLAGS += -flto=jobserver
 STAGEfeedback_CFLAGS += -flto=jobserver
Re: [PATCH] Reset proper type on vector types (PR middle-end/88587).
On 4/15/19 9:27 AM, Richard Biener wrote: > On Mon, Apr 15, 2019 at 8:48 AM Martin Liška wrote: >> >> Hi. >> >> Apparently, there's one another PR: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90083 >> >> May I backport the patch to GCC-8 branch? > > Hmm, it isn't a regression, right? But it only > affects multi-versioning, so yes, go ahead. No, it's not. The issue is very old. > Might as well consider GCC 7 then - do you have > an overall idea of the state of the MV stuff on branches? Well, I've made quite some changes to target_clone pass (multiple_target.c). Thus I would ignore GCC-7 if possible. Martin > IIRC you've done most of the "fixes"? > > Richard. > >> Thanks, >> Martin
[PATCH committed] [Bug tree-optimization/90020] [7/8 regression] -O2 -Os x86-64 wrong code generated for GNU Emacs
Author: dominiq Date: Mon Apr 15 07:56:43 2019 New Revision: 270360 URL: https://gcc.gnu.org/viewcvs?rev=270360&root=gcc&view=rev Log: 2019-04-15 Dominique d'Humieres PR tree-optimization/90020 * gcc.dg/torture/pr90020.c: Add linker options for darwin. --- trunk/gcc/testsuite/gcc.dg/torture/pr90020.c2019/04/15 07:39:20 270359 +++ trunk/gcc/testsuite/gcc.dg/torture/pr90020.c2019/04/15 07:56:43 270360 @@ -1,5 +1,7 @@ /* { dg-do run } */ /* { dg-require-weak "" } */ +/* { dg-additional-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } } */ +/* { dg-additional-options "-Wl,-flat_namespace" { target *-*-darwin[89]* } } */ void __attribute__((noinline,noclone)) check (int i) Dominique
Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.
Hi Paul,

I have found another glitch with -m32 and -O1 or -Os, but not with other optimization levels:

% gfc /opt/gcc/_clean/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_4.f90 -m32 -O
% ./a.out
FAIL
Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
STOP 1

This looks tricky: if I add a line

print *, x

before

if (any (abs (x - [1.,20.,3.,40.,5.,60.]) > 1.e-6)) stop 2

the test succeeds!-(

Also, don't you want pr89844 to be solved?

TIA

Dominique

> On 11 Apr 2019, at 16:44, Paul Richard Thomas wrote:
>
> Hi Dominique,
>
> Yes indeed - I used int(kind(loc(res))) to achieve the same effect.
>
> I am looking for but failing to find a similar problem for PR89846.
> Tomorrow I turn my attention to an incorrect cast in the compiler.
>
> Regards
>
> Paul
New template for 'gcc' made available
Hello, gentle maintainer. This is a message from the Translation Project robot. (If you have any questions, send them to .) A new POT file for textual domain 'gcc' has been made available to the language teams for translation. It is archived as: https://translationproject.org/POT-files/gcc-9.1-b20190414.pot Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. Below is the URL which has been provided to the translators of your package. Please inform the translation coordinator, at the address at the bottom, if this information is not current: https://gcc.gnu.org/pub/gcc/snapshots/9-20190414/gcc-9-20190414.tar.xz Translated PO files will later be automatically e-mailed to you. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
New Spanish PO file for 'gcc' (version 9.1-b20190414)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Spanish team of translators. The file is available at: https://translationproject.org/latest/gcc/es.po (This file, 'gcc-9.1-b20190414.es.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.
Dear Paul,

mostly looks good. Apart from a regression with optional arguments, reported as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90093
all other test cases I have now execute correctly.

Cheers
Reinhold

> -Original Message-
> From: Paul Richard Thomas
> Sent: Sunday, 14 April 2019 20:16
> To: Thomas Koenig
> Cc: Gilles Gouaillardet ; Bader, Reinhold ; fort...@gcc.gnu.org; gcc-patches patc...@gcc.gnu.org>
> Subject: Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.
>
> Hi Thomas,
>
> Thanks a lot. Committed as revision 270353.
>
> I was determined not to repeat the PDT experience, which is still nagging at
> me. That has to be the next major gfc project, I guess.
>
> Regards
>
> Paul
>
> On Sun, 14 Apr 2019 at 18:08, Thomas Koenig wrote:
> >
> > Hi Paul,
> >
> > > Please find attached the updated patch, which fixes the problem with
> > > -m32 in PR90022, eliminates the temporary creation for INTENT(IN)
> > > dummies and fixes PR89846.
> > >
> > > While it looks like it should be intrusive because of its size, I
> > > believe that the patch is still safe for trunk since it is hidden
> > > behind tests for CFI descriptors.
> > >
> > > Bootstraps and regtests on FC29/x86_64 - OK for trunk?
> >
> > OK.
> >
> > If we're going into the gcc 9 release with an implementation of the C
> > interop features, it will be better with fewer bugs :-)
> >
> > Thanks a lot for working on it!
> >
> > Regards
> >
> > Thomas
>
> --
> "If you can't explain it simply, you don't understand it well enough"
> - Albert Einstein
[PATCH] Fix PR90074
I am testing the following patch to fix wrong debug info created by loop distribution, which simply drops debug stmts on the floor, making earlier ones with bogus values live. Bootstrap & regtest running on x86_64-unknown-linux-gnu. Richard. 2019-04-15 Richard Biener PR debug/90074 * tree-loop-distribution.c (destroy_loop): Preserve correct debug info. * gcc.dg/guality/pr90074.c: New testcase. Index: gcc/tree-loop-distribution.c === --- gcc/tree-loop-distribution.c(revision 270358) +++ gcc/tree-loop-distribution.c(working copy) @@ -1094,12 +1094,8 @@ destroy_loop (struct loop *loop) bbs = get_loop_body_in_dom_order (loop); - redirect_edge_pred (exit, src); - exit->flags &= ~(EDGE_TRUE_VALUE|EDGE_FALSE_VALUE); - exit->flags |= EDGE_FALLTHRU; - cancel_loop_tree (loop); - rescan_loop_exit (exit, false, true); - + gimple_stmt_iterator dst_gsi = gsi_after_labels (exit->dest); + bool safe_p = single_pred_p (exit->dest); i = nbbs; do { @@ -1116,14 +1112,45 @@ destroy_loop (struct loop *loop) if (virtual_operand_p (gimple_phi_result (phi))) mark_virtual_phi_result_for_renaming (phi); } - for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi); - gsi_next (&gsi)) + for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi);) { gimple *stmt = gsi_stmt (gsi); tree vdef = gimple_vdef (stmt); if (vdef && TREE_CODE (vdef) == SSA_NAME) mark_virtual_operand_for_renaming (vdef); + /* Also move and eventually reset debug stmts. We can leave +constant values in place in case the stmt dominates the exit. +??? Non-constant values from the last iteration can be +replaced with final values if we can compute them. 
*/ + if (gimple_debug_bind_p (stmt)) + { + tree val = gimple_debug_bind_get_value (stmt); + gsi_move_before (&gsi, &dst_gsi); + if (val + && (!safe_p + || !is_gimple_min_invariant (val) + || !dominated_by_p (CDI_DOMINATORS, exit->src, bbs[i]))) + { + gimple_debug_bind_reset_value (stmt); + update_stmt (stmt); + } + } + else + gsi_next (&gsi); } +} + while (i != 0); + + redirect_edge_pred (exit, src); + exit->flags &= ~(EDGE_TRUE_VALUE|EDGE_FALSE_VALUE); + exit->flags |= EDGE_FALLTHRU; + cancel_loop_tree (loop); + rescan_loop_exit (exit, false, true); + + i = nbbs; + do +{ + --i; delete_basic_block (bbs[i]); } while (i != 0); Index: gcc/testsuite/gcc.dg/guality/pr90074.c === --- gcc/testsuite/gcc.dg/guality/pr90074.c (nonexistent) +++ gcc/testsuite/gcc.dg/guality/pr90074.c (working copy) @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-g" } */ + +void __attribute__((noinline)) +optimize_me_not () +{ + __asm__ volatile ("" : : : "memory"); +} +char a; +short b[7][1]; +int main() +{ + int i, c; + a = 0; + i = 0; + for (; i < 7; i++) { + c = 0; + for (; c < 1; c++) + b[i][c] = 0; + } + /* i may very well be optimized out, so we cannot test for i == 7. + Instead test i + 1 which will make the test UNSUPPORTED if i + is optimized out. Since the test previously had wrong debug + with i == 0 this is acceptable. Optimally we'd produce a + debug stmt for the final value of the loop during loop distribution + which would fix the UNSUPPORTED cases. + c is optimized out at -Og for no obvious reason. */ + optimize_me_not(); /* { dg-final { gdb-test . "i + 1" "8" } } */ +/* { dg-final { gdb-test .-1 "c + 1" "2" } } */ + return 0; +}
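The reset condition in the patch above can be read as a single predicate. The following is a minimal sketch of that predicate, assuming plain boolean inputs in place of the GIMPLE queries; `toy_must_reset_debug_value` and its parameter names are hypothetical and are not part of GCC.

```c
#include <stdbool.h>

/* Toy model of the test applied to each debug bind that is moved to the
   loop exit destination.  The parameters stand in for:
     has_value                 - gimple_debug_bind_get_value (stmt) != NULL
     exit_dest_single_pred     - safe_p, i.e. single_pred_p (exit->dest)
     value_is_invariant        - is_gimple_min_invariant (val)
     block_dominates_exit_src  - dominated_by_p (CDI_DOMINATORS,
                                                 exit->src, bbs[i])
   The bound value is kept only when it is a constant, the exit block has
   a single predecessor and the stmt's block dominates the exit source;
   in every other case with a value, the value is reset.  */
static bool
toy_must_reset_debug_value (bool has_value,
                            bool exit_dest_single_pred,
                            bool value_is_invariant,
                            bool block_dominates_exit_src)
{
  return has_value
         && (!exit_dest_single_pred
             || !value_is_invariant
             || !block_dominates_exit_src);
}
```

A debug bind that already has no value is moved but needs no reset, which is why the predicate is false whenever `has_value` is false.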
[PATCH] Fix PR90071
The following fixes reassoc leaking abnormals into rewritten condition chains. Bootstrap / regtest running on x86_64-unknown-linux-gnu. Richard. 2019-04-15 Richard Biener PR tree-optimization/90071 * tree-ssa-reassoc.c (init_range_entry): Do not pick up abnormal operands from def stmts. * gcc.dg/torture/pr90071.c: New testcase. Index: gcc/tree-ssa-reassoc.c === --- gcc/tree-ssa-reassoc.c (revision 270358) +++ gcc/tree-ssa-reassoc.c (working copy) @@ -2143,7 +2143,8 @@ init_range_entry (struct range_entry *r, exp_type = boolean_type_node; } - if (TREE_CODE (arg0) != SSA_NAME) + if (TREE_CODE (arg0) != SSA_NAME + || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg0)) break; loc = gimple_location (stmt); switch (code) Index: gcc/testsuite/gcc.dg/torture/pr90071.c === --- gcc/testsuite/gcc.dg/torture/pr90071.c (nonexistent) +++ gcc/testsuite/gcc.dg/torture/pr90071.c (working copy) @@ -0,0 +1,24 @@ +/* { dg-do compile } */ + +int a; +static int b; + +void +foo () +{ + int d; + int e = (int) (__INTPTR_TYPE__) &&f; + void *g = &&h; +h: ++e; + if (a) + i: goto *g; + for (;;) + { + e = 0; + if (b) +goto i; + } +f: + goto *({ d || e < 0 || e >= 2; }); + &e; +}
Re: [PATCH] Fix PR88936
On Fri, 12 Apr 2019, Richard Biener wrote: > On Fri, 12 Apr 2019, Richard Biener wrote: > > > On Fri, 12 Apr 2019, Michael Matz wrote: > > > > > Hi, > > > > > > On Fri, 12 Apr 2019, Richard Biener wrote: > > > > > > > > You miss PARM_DECLs and RESULT_DECLs, i.e. it's probably better to > > > > > factor > > > > > out tree.c:auto_var_in_fn_p and place the new auto_var_p in tree.c as > > > > > well. > > > > > > > > Hmm, I left the above unchanged from a different variant of the patch > > > > where for some reason I do not remember I explicitely decided > > > > parameters and results are not affected... > > > > > > Even if that were the case the function is sufficiently general (also its > > > name) that it should be generic infrastructure, not hidden away in > > > structalias. > > > > It was not fully equivalent, but yes. So - like the following? > > I think checking DECL_CONTEXT isn't necessary given the > > !DECL_EXTERNAL/STATIC checks. > > > > Bootstrap / regtest running on x86_64-unknown-linux-gnu. > > Aww, hits > > /space/rguenther/src/svn/trunk/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/zero_length_subarrays.c:33:1: > > internal compiler error: in fold_builtin_alloca_with_align, at > tree-ssa-ccp.c:2186^M > 0x6d7e45 fold_builtin_alloca_with_align^M > > have to look/think about this. I have applied the following variant after testing on x86_64-unknown-linux-gnu. Richard. 2019-04-15 Richard Biener PR ipa/88936 * tree.h (auto_var_p): Declare. * tree.c (auto_var_p): New function, split out from ... (auto_var_in_fn_p): ... here. * tree-ssa-structalias.c (struct variable_info): Add shadow_var_uid member. (new_var_info): Initialize it. (set_uids_in_ptset): Also set the shadow variable uid if required. (ipa_pta_execute): Postprocess points-to solutions assigning shadow variable uids for locals that may reach their containing function recursively. 
* tree-ssa-ccp.c (fold_builtin_alloca_with_align): Do not assert but instead check whether the points-to solution is a singleton. * gcc.dg/torture/pr88936-1.c: New testcase. * gcc.dg/torture/pr88936-2.c: Likewise. * gcc.dg/torture/pr88936-3.c: Likewise. Index: gcc/tree.c === --- gcc/tree.c (revision 270306) +++ gcc/tree.c (working copy) @@ -9268,17 +9268,25 @@ get_type_static_bounds (const_tree type, } } +/* Return true if VAR is an automatic variable. */ + +bool +auto_var_p (const_tree var) +{ + return VAR_P (var) && ! DECL_EXTERNAL (var)) + || TREE_CODE (var) == PARM_DECL) + && ! TREE_STATIC (var)) + || TREE_CODE (var) == RESULT_DECL); +} + /* Return true if VAR is an automatic variable defined in function FN. */ bool auto_var_in_fn_p (const_tree var, const_tree fn) { return (DECL_P (var) && DECL_CONTEXT (var) == fn - && VAR_P (var) && ! DECL_EXTERNAL (var)) - || TREE_CODE (var) == PARM_DECL) - && ! TREE_STATIC (var)) - || TREE_CODE (var) == LABEL_DECL - || TREE_CODE (var) == RESULT_DECL)); + && (auto_var_p (var) + || TREE_CODE (var) == LABEL_DECL)); } /* Subprogram of following function. Called by walk_tree. Index: gcc/tree.h === --- gcc/tree.h (revision 270306) +++ gcc/tree.h (working copy) @@ -4893,6 +4893,7 @@ extern bool stdarg_p (const_tree); extern bool prototype_p (const_tree); extern bool is_typedef_decl (const_tree x); extern bool typedef_variant_p (const_tree); +extern bool auto_var_p (const_tree); extern bool auto_var_in_fn_p (const_tree, const_tree); extern tree build_low_bits_mask (tree, unsigned); extern bool tree_nop_conversion_p (const_tree, const_tree); Index: gcc/tree-ssa-structalias.c === --- gcc/tree-ssa-structalias.c (revision 270306) +++ gcc/tree-ssa-structalias.c (working copy) @@ -299,6 +299,11 @@ struct variable_info /* Full size of the base variable, in bits. 
*/ unsigned HOST_WIDE_INT fullsize; + /* In IPA mode the shadow UID in case the variable needs to be duplicated in + the final points-to solution because it reaches its containing + function recursively. Zero if none is needed. */ + unsigned int shadow_var_uid; + /* Name of this variable */ const char *name; @@ -397,6 +402,7 @@ new_var_info (tree t, const char *name, ret->solution = BITMAP_ALLOC (&pta_obstack); ret->oldsolution = NULL; ret->next = 0; + ret->shadow_var_uid = 0; ret->head = ret->id; stats.total_vars++; @@ -6452,6 +6458,16 @@ set_uids_in_ptset (bitmap into, bitmap f && (TREE_STATIC (vi->decl) || DECL_EXTERNAL (vi->decl)) && ! decl_binds_to_current_def_p (vi->decl)) p
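The flattened diff above makes the grouping in the new auto_var_p hard to follow, so here is a small stand-alone model of the two predicates. It is a sketch under assumptions: `struct toy_decl`, the `TOY_*` codes and the `toy_*` functions are hypothetical stand-ins for GCC's tree nodes and macros, and the grouping is the one implied by the closing parentheses that survive in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes used in the patch.  */
enum toy_code
{
  TOY_VAR_DECL,
  TOY_PARM_DECL,
  TOY_RESULT_DECL,
  TOY_LABEL_DECL
};

/* Hypothetical stand-in for a declaration node.  */
struct toy_decl
{
  enum toy_code code;
  bool is_external;     /* models DECL_EXTERNAL */
  bool is_static;       /* models TREE_STATIC */
  const void *context;  /* models DECL_CONTEXT */
};

/* Model of auto_var_p: a non-external variable or a parameter that is
   not static, or any result decl.  */
static bool
toy_auto_var_p (const struct toy_decl *var)
{
  return ((((var->code == TOY_VAR_DECL && !var->is_external)
            || var->code == TOY_PARM_DECL)
           && !var->is_static)
          || var->code == TOY_RESULT_DECL);
}

/* Model of auto_var_in_fn_p after the split: same context as FN, and
   either an automatic variable or a label.  */
static bool
toy_auto_var_in_fn_p (const struct toy_decl *var, const void *fn)
{
  return (var->context == fn
          && (toy_auto_var_p (var) || var->code == TOY_LABEL_DECL));
}
```

The point of the split is visible in the model: the context-free part can be reused on its own, while labels remain special to the per-function variant.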
Re: [PATCH] Filter out LTO in config/bootstrap-lto-lean.mk.
On Mon, Apr 15, 2019 at 9:46 AM Martin Liška wrote: > > Hi. > > The patch is fixing bootstrap-lto-lean.mk where with PGO LTO was > wrongly used in STAGEtrain. > > Tested on openSUSE gcc9 package, I'm attaching build log. > > Ready to be installed? I wonder why 'override' is necessary given before we include the build-config .mk fragment we do STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) I suppose you checked w/o override and it didn't work? Or ist the issue that you have to use := here to get the previous addition to STAGE3_CFLAGS resolved? A make expert might want to chime in here. Maybe a simpler solution is to do STAGEtrain_CFLAGS := $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) instead of the '=' assignment in the toplevel Makefile to not cause build-config fragments changing the values of derived flags? (if, then consistently for all, of course). Richard. > Thanks, > Martin > > config/ChangeLog: > > 2019-04-15 Martin Liska > > * bootstrap-lto-lean.mk: Filter out -flto in STAGEtrain_CFLAGS. > --- > config/bootstrap-lto-lean.mk | 1 + > 1 file changed, 1 insertion(+) > >
Re: [PATCH] Filter out LTO in config/bootstrap-lto-lean.mk.
On 4/15/19 12:23 PM, Richard Biener wrote: > On Mon, Apr 15, 2019 at 9:46 AM Martin Liška wrote: >> >> Hi. >> >> The patch is fixing bootstrap-lto-lean.mk where with PGO LTO was >> wrongly used in STAGEtrain. >> >> Tested on openSUSE gcc9 package, I'm attaching build log. >> >> Ready to be installed? > > I wonder why 'override' is necessary given before we include the build-config > .mk fragment we do > > STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) > > I suppose you checked w/o override and it didn't work? Or ist the issue > that you have to use := here to get the previous addition to STAGE3_CFLAGS > resolved? Fails due to: [ 16s] + setarch x86_64 -R make profiledbootstrap 'STAGE1_CFLAGS=-g -O2' 'BOOT_CFLAGS=-O2 -D_FORTIFY_SOURCE=2 -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g -U_FORTIFY_SOURCE' -j160 [ 16s] ../config/bootstrap-lto-lean.mk:5: *** Recursive variable 'STAGEtrain_CFLAGS' references itself (eventually). Stop. > > A make expert might want to chime in here. > > Maybe a simpler solution is to do > > STAGEtrain_CFLAGS := $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) This one will work of course. I would wait for some time and we can eventually take this change. Martin > > instead of the '=' assignment in the toplevel Makefile to not cause > build-config fragments changing the values of derived flags? > (if, then consistently for all, of course). > > Richard. > >> Thanks, >> Martin >> >> config/ChangeLog: >> >> 2019-04-15 Martin Liska >> >> * bootstrap-lto-lean.mk: Filter out -flto in STAGEtrain_CFLAGS. >> --- >> config/bootstrap-lto-lean.mk | 1 + >> 1 file changed, 1 insertion(+) >> >>
Re: [PR86438] avoid too-long shift in test
On 12/04/2019 02:42, Alexandre Oliva wrote: The test fell back to long long and long when __int128 is not available, but it assumed sizeof(long) < sizeof(long long) because of a shift count that would be out of range for a long long if their widths are the same. Fixed by splitting it up into two shifts. Tested on x86_64-linux-gnu, -m64 and -m32. Hopefully Andrew and/or John David will let me know if it fails to fix the problem on the platforms in which they've observed it. Thanks for the report, sorry it took me so long to get to it. I'm going to install this as obvious, unless there are objections in the next few days. Confirmed; the test now passes for amdgcn. Andrew
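The fix described above (one shift split into two) can be illustrated in C. This is only a sketch of the general technique, not the code from the PR86438 test: `toy_shift_by_long_width` is a hypothetical helper, and it assumes the width of long is even, which holds on the usual targets.

```c
#include <limits.h>

/* Shift V left by the width of 'long'.  Doing this with one shift is
   undefined when long and long long have the same width, because the
   shift count then equals the width of the shifted type.  Two shifts by
   half the width each are always in range.  */
static unsigned long long
toy_shift_by_long_width (unsigned long long v)
{
  const unsigned int half = (unsigned int) (sizeof (long) * CHAR_BIT / 2);
  return (v << half) << half;
}
```

Where long is narrower than long long the result equals the single full-width shift; where the widths are equal, every bit is shifted out and the result is a well-defined zero.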
Re: [PATCH] combine: Count auto_inc properly (PR89794)
On Sun, Apr 14, 2019 at 09:51:39AM +, Segher Boessenkool wrote: > The code that checks if an auto-increment from i0 or i1 is not lost is > a bit shaky. The code to check the same for i2 is non-existent, and > cannot be implemented in a similar way at all. So, this patch counts > all auto-increments, and makes sure we end up with the same number as > we started with. This works because we still have a check that we > will not duplicate any. > > We should do this some better way, but not while we are in stage 4. > > Tested on powerpc64-linux {-m32,-m64}; also tested manually on the Arm > testcase. I added a missing "static", and added the testcase, as attached. Committing it now. Subject: [PATCH] combine: Count auto_inc properly (PR89794) The code that checks if an auto-increment from i0 or i1 is not lost is a bit shaky. The code to check the same for i2 is non-existent, and cannot be implemented in a similar way at all. So, this patch counts all auto-increments, and makes sure we end up with the same number as we started with. This works because we still have a check that we will not duplicate any. 2019-04-15 Segher Boessenkool PR rtl-optimization/89794 * combine.c (count_auto_inc): New function. (try_combine): Count how many auto_inc expressions there were in the original instructions. Ensure we have the same number in the new instructions. Remove the code that tried to ensure auto_inc side effects on i1 and i0 are not lost. gcc/testsuite/ PR rtl-optimization/89794 * gcc.dg/torture/pr89794.c: New testcase. --- gcc/combine.c | 60 -- gcc/testsuite/gcc.dg/torture/pr89794.c | 24 ++ 2 files changed, 66 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr89794.c diff --git a/gcc/combine.c b/gcc/combine.c index f681345..07bd0cf 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -2667,6 +2667,16 @@ combine_remove_reg_equal_equiv_notes_for_regno (unsigned int regno) } } +/* Callback function to count autoincs. 
*/ + +static int +count_auto_inc (rtx, rtx, rtx, rtx, rtx, void *arg) +{ + (*((int *) arg))++; + + return 0; +} + /* Try to combine the insns I0, I1 and I2 into I3. Here I0, I1 and I2 appear earlier than I3. I0 and I1 can be zero; then we combine just I2 into I3, or I1 and I2 into @@ -2732,6 +2742,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, int split_i2i3 = 0; int changed_i3_dest = 0; bool i2_was_move = false, i3_was_move = false; + int n_auto_inc = 0; int maxreg; rtx_insn *temp_insn; @@ -3236,6 +3247,16 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, return 0; } + /* Count how many auto_inc expressions there were in the original insns; + we need to have the same number in the resulting patterns. */ + + if (i0) +for_each_inc_dec (PATTERN (i0), count_auto_inc, &n_auto_inc); + if (i1) +for_each_inc_dec (PATTERN (i1), count_auto_inc, &n_auto_inc); + for_each_inc_dec (PATTERN (i2), count_auto_inc, &n_auto_inc); + for_each_inc_dec (PATTERN (i3), count_auto_inc, &n_auto_inc); + /* If the set in I2 needs to be kept around, we must make a copy of PATTERN (I2), so that when we substitute I1SRC for I1DEST in PATTERN (I2), we are only substituting for the original I1DEST, not into @@ -3439,18 +3460,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, if (i1 && GET_CODE (newpat) != CLOBBER) { - /* Check that an autoincrement side-effect on I1 has not been lost. -This happens if I1DEST is mentioned in I2 and dies there, and -has disappeared from the new pattern. */ - if ((FIND_REG_INC_NOTE (i1, NULL_RTX) != 0 - && i1_feeds_i2_n - && dead_or_set_p (i2, i1dest) - && !reg_overlap_mentioned_p (i1dest, newpat)) - /* Before we can do this substitution, we must redo the test done - above (see detailed comments there) that ensures I1DEST isn't - mentioned in any SETs in NEWPAT that are field assignments. 
*/ - || !combinable_i3pat (NULL, &newpat, i1dest, NULL_RTX, NULL_RTX, - 0, 0, 0)) + /* Before we can do this substitution, we must redo the test done +above (see detailed comments there) that ensures I1DEST isn't +mentioned in any SETs in NEWPAT that are field assignments. */ + if (!combinable_i3pat (NULL, &newpat, i1dest, NULL_RTX, NULL_RTX, +0, 0, 0)) { undo_all (); return 0; @@ -3480,12 +3494,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, if (i0 && GET_CODE (newpat) != CLOBBER) { - if ((FIND_REG_INC_NOTE (i0, NULL_RTX) != 0 - && ((i0_feeds_i2_n && dead_or_set_p (i2, i0dest)) - || (i0_feeds_i1_n &&
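The counting approach in the patch above (count the auto-increments in the original insns, then require the same number in the result) can be sketched outside GCC. The model below is an assumption-laden illustration: the `toy_*` names are hypothetical, a pattern is reduced to a flat array of codes, and the walker only mimics the callback style of `for_each_inc_dec`, not its real signature.

```c
#include <stddef.h>

/* Hypothetical operand codes; only the last two are auto-increments.  */
enum toy_rtx_code { TOY_REG, TOY_MEM, TOY_POST_INC, TOY_PRE_DEC };

typedef int (*toy_inc_dec_fn) (enum toy_rtx_code code, void *arg);

/* Call FN on every auto-increment in PAT; stop early if FN returns
   nonzero and pass that value back to the caller.  */
static int
toy_for_each_inc_dec (const enum toy_rtx_code *pat, size_t n,
                      toy_inc_dec_fn fn, void *arg)
{
  for (size_t i = 0; i < n; i++)
    if (pat[i] == TOY_POST_INC || pat[i] == TOY_PRE_DEC)
      {
        int r = fn (pat[i], arg);
        if (r != 0)
          return r;
      }
  return 0;
}

/* Callback in the style of count_auto_inc: bump the int behind ARG.  */
static int
toy_count_auto_inc (enum toy_rtx_code code, void *arg)
{
  (void) code;
  (*(int *) arg)++;
  return 0;
}

/* Number of auto-increments in PAT.  */
static int
toy_count (const enum toy_rtx_code *pat, size_t n)
{
  int count = 0;
  toy_for_each_inc_dec (pat, n, toy_count_auto_inc, &count);
  return count;
}

/* A combination is acceptable only if no auto-increment was lost or
   duplicated, i.e. the counts before and after agree.  */
static int
toy_combination_ok (const enum toy_rtx_code *before, size_t n_before,
                    const enum toy_rtx_code *after, size_t n_after)
{
  return toy_count (before, n_before) == toy_count (after, n_after);
}
```

As the commit message notes, comparing counts is sufficient only because a separate check already prevents duplicating an auto-increment; the model does not reproduce that check.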
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
> > This won't work with LTO. Note we have the issue in the middle-end as well > since we promote variables we see are not written to to TREE_READONLY. > This can be seen with (the somewhat artificial...): > > int a[1024*1024] = { 0 }; > > int __attribute__((noinline)) foo() { return *(volatile int *)a; } > > int main() > { > return foo (); > } > > where without -flto a gets placed into .bss while with -flto it > gets into .rodata. So I believe we should add a DECL flag > specifying whether for section placement we can "ignore" > TREE_READONLY. We'd initialize that with the original > state of TREE_READONLY so that the R/O promotion doesn't > change section placement. Also the Fortran FE can then > simply set this flag on variables that may live in .bss. > > There are 14 unused bits in tree_decl_with_vis so a > patch for the middle-end part could look like the attached > (w/o solving the LTO issue yet). > > Of course adding sth like a .robss section would be nice. Yep, but I think what you propose works well in practice (I am not sure whether we are forced to put const-declared variables into read-only memory, or whether we can always do this as a binary size optimization). The patch looks fine to me. Would it be possible to place the flags into varpool_node rather than TREE? It is a lot easier to manage flags there. Honza > > Richard. > > > Regards > > > > Thomas > > > > 2019-04-13 Thomas Koenig > > > > PR fortran/84487 > > * trans-decl.c (gfc_get_symbol_decl): Mark _def_init as > > artificial. > > > > 2019-04-13 Thomas Koenig > > > > PR fortran/84487 > > * varasm.c (bss_initializer_p): If we are compiling Fortran, the > > decl is artificial and it has a size larger than 255, it can be > > put into BSS. > > > > 2019-04-13 Thomas Koenig > > > > PR fortran/84487 > > * gfortran.dg/def_init_1.f90: New test. > > > >
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
* Richard Biener: > Of course adding sth like a .robss section would be nice. I think this is strictly a link editor issue because a read-only PT_LOAD directive with a memory size larger than the file size already produces read-only zero pages, without requiring a file allocation. Thanks, Florian
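For readers following the thread, here is a minimal sketch of the section-placement difference under discussion. The names zeros_rw, zeros_ro and sum are made up for illustration, and where each array ends up depends on the toolchain and options, so the comments describe typical behaviour, not a guarantee.

```c
#include <stddef.h>

#define N (1024 * 1024)

/* Writable and all zeros: toolchains typically place this in .bss,
   which occupies no space in the executable file. */
int zeros_rw[N] = { 0 };

/* Read-only and all zeros: typically placed in .rodata, where the
   zeros are stored in the file (the size increase in the PR). */
const int zeros_ro[N] = { 0 };

/* Reads every element so that neither array is trivially dropped. */
long sum(const int *p, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

Comparing the section sizes of the resulting object with a tool such as size or objdump -h should show which array contributes to the file size; both arrays behave identically at run time.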
Re: Fix false -Wodr warnings
On Sun, Apr 14, 2019 at 10:59 PM Jan Hubicka wrote: > > Hi, > this patch fixes false warning that is output when different -std > settings are used. In this case C++ FE produces same declaration in > different representations which differ by 0 sized fileds only. > The patch makes them to be ignored (and I checked we ignore them for > canonical type merging too) > > Bootstrapped/regtested x86_64-linux, comitted. The testcase is bogus WARNING: lto.exp does not support dg-do WARNING: lto.exp does not support dg-options in primary source file > Honza > > PR lto/89358 > * g++.dg/lto/pr89358_0.C: New testcase. > * g++.dg/lto/pr89358_1.C: New testcase. > * ipa-devirt.c (skip_in_fields_list_p): New. > (odr_types_equivalent_p): Use it. > Index: testsuite/g++.dg/lto/pr89358_0.C > === > --- testsuite/g++.dg/lto/pr89358_0.C(nonexistent) > +++ testsuite/g++.dg/lto/pr89358_0.C(working copy) > @@ -0,0 +1,11 @@ > +/* { dg-do link } */ > +/* { dg-options "-std=c++17" } */ > +#include > + > +extern void test(); > + > +int main() > +{ > +std::map m; > +test(); > +} > Index: testsuite/g++.dg/lto/pr89358_1.C > === > --- testsuite/g++.dg/lto/pr89358_1.C(nonexistent) > +++ testsuite/g++.dg/lto/pr89358_1.C(working copy) > @@ -0,0 +1,7 @@ > +/* { dg-options "-std=c++14" } */ > +#include > + > +void test() > +{ > +std::map m; > +} > Index: ipa-devirt.c > === > --- ipa-devirt.c(revision 270324) > +++ ipa-devirt.c(working copy) > @@ -1282,6 +1282,24 @@ warn_types_mismatch (tree t1, tree t2, l > inform (loc_t2, "the incompatible type is defined here"); > } > > +/* Return true if T should be ignored in TYPE_FIELDS for ODR comparsion. */ > + > +static bool > +skip_in_fields_list_p (tree t) > +{ > + if (TREE_CODE (t) != FIELD_DECL) > +return true; > + /* C++ FE introduces zero sized fields depending on -std setting, see > + PR89358. 
*/ > + if (DECL_SIZE (t) > + && integer_zerop (DECL_SIZE (t)) > + && DECL_ARTIFICIAL (t) > + && DECL_IGNORED_P (t) > + && !DECL_NAME (t)) > +return true; > + return false; > +} > + > /* Compare T1 and T2, report ODR violations if WARN is true and set > WARNED to true if anything is reported. Return true if types match. > If true is returned, the types are also compatible in the sense of > @@ -1548,9 +1566,9 @@ odr_types_equivalent_p (tree t1, tree t2 > f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2)) > { > /* Skip non-fields. */ > - while (f1 && TREE_CODE (f1) != FIELD_DECL) > + while (f1 && skip_in_fields_list_p (f1)) > f1 = TREE_CHAIN (f1); > - while (f2 && TREE_CODE (f2) != FIELD_DECL) > + while (f2 && skip_in_fields_list_p (f2)) > f2 = TREE_CHAIN (f2); > if (!f1 || !f2) > break;
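The idea behind skip_in_fields_list_p can be sketched outside of GCC as follows. This is a simplified illustration, not the ipa-devirt.c code: struct field and all function names are hypothetical stand-ins for tree nodes, the DECL_IGNORED_P check is left out, and only sizes are compared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a field declaration; these are
   not GCC's tree structures. */
struct field {
    const char *name;       /* NULL for unnamed fields */
    size_t size;            /* size in bits */
    bool artificial;        /* generated by the compiler */
    struct field *next;
};

/* Zero-sized, compiler-generated, unnamed fields do not affect layout,
   so they are ignored when comparing two field lists. */
bool skip_field_p(const struct field *f)
{
    return f->size == 0 && f->artificial && f->name == NULL;
}

/* Advance to the next field that takes part in the comparison. */
const struct field *next_relevant(const struct field *f)
{
    while (f && skip_field_p(f))
        f = f->next;
    return f;
}

/* Walk both lists in lockstep, skipping ignorable fields on each side. */
bool field_lists_equivalent(const struct field *a, const struct field *b)
{
    for (;;) {
        a = next_relevant(a);
        b = next_relevant(b);
        if (!a || !b)
            return a == b;  /* equivalent only if both lists ended */
        if (a->size != b->size)
            return false;
        a = a->next;
        b = b->next;
    }
}
```

With this shape, a type that gains an empty artificial member under one -std setting still compares equal to the same type built without it.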
Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes.
Dear Dominique, Gilles and Reinhold, Thank you for your rapid feedback. We might even get a reasonably functional ISO Fortran binding in place for 9-branch release :-) On your remaining nits: (i) ISO_Fortran_binding_4.f90 -m32 -O1/Os looks awful. I will take a look, though. (ii) pr89844 being fixed by an earlier patch led me to give it lower priority. I will look to see whether another testcase is required to nail it down. (iii) I will take a look at 90093 - it should be straightforward. I do not regard it as being a regression, however, since the arguments were not being correctly handled until now - i.e. were not converted from cfi to gfc descriptors. Cheers Paul On Mon, 15 Apr 2019 at 10:27, Bader, Reinhold wrote: > > Dear Paul, > > mostly looks good. Apart from a regression with optional arguments reported as > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90093 > all other test cases I have now execute correctly. > > Cheers > Reinhold > > > -Original Message- > > From: Paul Richard Thomas > > Sent: Sunday, 14 April 2019 20:16 > > To: Thomas Koenig > > Cc: Gilles Gouaillardet ; Bader, Reinhold > > ; fort...@gcc.gnu.org; gcc-patches > patc...@gcc.gnu.org> > > Subject: Re: [Patch, fortran] PRs 89843 and 90022 - C Fortran Interop fixes. > > > > Hi Thomas, > > > > Thanks a lot. Committed as revision 270353. > > > > I was determined not to repeat the PDT experience, which is still nagging at > > me. That has to be the next major gfc project, I guess. > > > > Regards > > > > Paul > > > > On Sun, 14 Apr 2019 at 18:08, Thomas Koenig > > wrote: > > > > > > Hi Paul, > > > > > > > > > > Please find attached the updated patch, which fixes the problem with > > > > -m32 in PR90022, eliminates the temporary creation for INTENT(IN) > > > > dummies and fixes PR89846. > > > > > > > > While it looks like it should be intrusive because of its size, I > > > > believe that the patch is still safe for trunk since it is hidden > > > > behind tests for CFI descriptors.
> > > > > > > > Bootstraps and regtests on FC29/x86_64 - OK for trunk? > > > > > > OK. > > > > > > I we're going into the gcc 9 release with an implementation of the C > > > interop features, it will be better with fewer bugs :-) > > > > > > Thanks a lot for working on it! > > > > > > Regards > > > > > > Thomas > > > > > > > > -- > > "If you can't explain it simply, you don't understand it well enough" > > - Albert Einstein -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).
On 4/12/19 4:12 PM, H.J. Lu wrote: > On Fri, Apr 12, 2019 at 4:41 AM Martin Liška wrote: >> >> On 4/11/19 6:30 PM, H.J. Lu wrote: >>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška wrote: Hi. The patch is adding missing AVX512 ISAs for target and target_clone attributes. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin gcc/ChangeLog: 2019-04-10 Martin Liska PR target/89929 * config/i386/i386.c (get_builtin_code_for_version): Add support for missing AVX512 ISAs. gcc/testsuite/ChangeLog: 2019-04-10 Martin Liska PR target/89929 * g++.target/i386/mv28.C: New test. * gcc.target/i386/mvc14.c: New test. --- gcc/config/i386/i386.c| 34 ++- gcc/testsuite/g++.target/i386/mv28.C | 30 +++ gcc/testsuite/gcc.target/i386/mvc14.c | 16 + 3 files changed, 79 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.target/i386/mv28.C create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c >>> >> >> Hi. >> >>> Since any ISAs beyond AVX512F may be enabled individually, we >>> can't simply assign priorities to them. For GFNI, we can have >>> >>> 1. GFNI >>> 2. GFNI + AVX >>> 3. GFNI + AVX512F >>> 4. GFNI + AVX512F + AVX512VL >> >> Makes sense to me! I'm considering syntax extension where one would be >> able to come up with a priority. Eg. >> >> __attribute__((target("gfni,avx512bw", priority((3) >> >> Without that the ISA combinations are probably not comparable in a >> reasonable way. >> >>> >>> For this code, GFNI + AVX512BW is ignored: >>> >>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii >>> __attribute__((target("gfni"))) >>> int foo(int i) { >>> return 1; >>> } >>> __attribute__((target("gfni,avx512bw"))) >>> int foo(int i) { >>> return 4; >>> } >>> __attribute__((target("default"))) >>> int foo(int i) { >>> return 3; >>> } >>> int bar () >>> { >>> return foo(2); >>> } >> >> For 'target' attribute it works for me: >> >> 1) $ cat z.c && ./xg++ -B. 
z.c -c >> #include >> volatile __m512i x1, x2; >> volatile __mmask64 m64; >> >> __attribute__((target("gfni"))) >> int foo(int i) { >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); >> return 1; >> } >> __attribute__((target("gfni,avx512bw"))) >> int foo(int i) { >> return 4; >> } >> __attribute__((target("default"))) >> int foo(int i) { >> return 3; >> } >> int bar () >> { >> return foo(2); >> } >> In file included from ./include/immintrin.h:117, >> from ./include/x86intrin.h:32, >> from z.c:1: >> z.c: In function ‘int foo(int)’: >> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa option >> -m32 -mgfni -mavx512f >> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); >> | ^~~~ >> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment has >> changed in GCC 4.6 >> >> 2) $ cat z.c && ./xg++ -B. z.c -c >> #include >> volatile __m512i x1, x2; >> volatile __mmask64 m64; >> >> __attribute__((target("gfni"))) >> int foo(int i) { >> return 1; >> } >> __attribute__((target("gfni,avx512bw"))) >> int foo(int i) { >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); >> return 4; >> } >> __attribute__((target("default"))) >> int foo(int i) { >> return 3; >> } >> int bar () >> { >> return foo(2); >> } >> >> [OK] >> >> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ? > > It does look odd. I've just created a PR for that: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90096 Martin > >> Similar applies to target_clone attribute where we'll have to come up with >> a syntax that will allow multiple ISA to be combined. Something like: >> >> __attribute__((target_clones("gfni+avx512bw"))) >> >> ? Priorities can be maybe implemented by order? >> > > I am thinking -misa=processor which will enable ISAs for > processor. It differs from -march=. -misa= doesn't set > -mtune. >
Re: [PATCH] Fix up RTL DCE find_call_stack_args (PR rtl-optimization/89965)
Hi,

On Fri, 12 Apr 2019, Jeff Law wrote:

> > I don't think this follows. Imagine a pure foo tailcalling a pure bar.
> > To make the tailcall, foo may need to change some of its argument slots
> > to pass new arguments to bar.
> I'd claim that a pure/const call can't tail call another function as
> that would potentially modify the argument slots.

I still don't think that what you want follows. Imagine this:

  int foo (int i) { ++i; return i; }

To claim that this function is anything other than const+pure seems weird (in fact this function doesn't access anything that must lie in memory at all). Now take your run-of-the-mill ABI that passes arguments on the stack, and an only slightly bad code generator, i.e. -O0 on i386. You will get a modification of the argument slot:

  foo:
          pushl   %ebp
          movl    %esp, %ebp
          addl    $1, 8(%ebp)
          movl    8(%ebp), %eax
          popl    %ebp
          ret

So, if anything, the ownership of argument slots is a property of the psABI. And while we may have been through this discussion a couple of times over the years, I'm pretty sure that at least I consistently argued to declare all psABIs that leave argument slot ownership with the callers (after the call actually happens) to be seriously broken^Wmisguided (and yes, also because it can prevent tail calls that otherwise would be perfectly valid).

Ciao,
Michael.
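The source-level side of this argument can be sketched in C. The const attribute and the helper caller_value_after_call are additions for illustration and do not come from the thread; the sketch only shows that the increment acts on the callee's copy of the argument, whatever the generated code does to the stack slot.

```c
/* Annotated as const for illustration: the function reads and writes
   no memory that is visible to its caller. */
__attribute__((const)) int foo(int i)
{
    ++i;            /* modifies only the callee's copy of the argument */
    return i;
}

/* The caller's variable keeps its value after the call, even if the
   callee overwrote the argument slot it was passed in. */
int caller_value_after_call(int x)
{
    int result = foo(x);
    (void)result;
    return x;
}
```

Whether the slot itself may be reused by the caller after the call is exactly the psABI ownership question raised above; the C semantics do not answer it.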
Re: [PATCH] fix ICEs in c-attribs.c (PR 88383, 89288, 89798, 89797)
On Sat, 13 Apr 2019 at 00:38, Martin Sebor wrote: > > On 4/12/19 3:42 PM, Jakub Jelinek wrote: > > On Fri, Apr 12, 2019 at 10:45:25AM -0600, Jeff Law wrote: > >>> gcc/ChangeLog: > >>> > >>> PR c/89797 > >>> * targhooks.c (default_vector_alignment): Avoid assuming > >>> argument fits in SHWI. > >>> * tree.h (TYPE_VECTOR_SUBPARTS): Avoid sign overflow in > >>> a shift expression. > >>> > >>> gcc/c-family/ChangeLog: > >>> > >>> PR c/88383 > >>> PR c/89288 > >>> PR c/89798 > >>> PR c/89797 > >>> * c-attribs.c (type_valid_for_vector_size): Detect excessively > >>> large sizes. > ... > > > > Has the patch been tested at all? > > A few times. The c-attribs.c change above didn't make it into > the commit. Hi, Even with r270331, I'm still seeing the ICE on aarch64 (actually with trunk @r270370) Is there still some commit missing? Thanks, Christophe > Martin
[PATCH] Tweak LIM MEM improvements to fix PR56049
It turns out solving this long-standing optimization regression is now easy by exploiting implementation details in how we canonicalize refs in LIM. This allows us to properly identify MEM[(integer(kind=4)[64] *)&a][0] and MEM[(c_char * {ref-all})&a] as the same, applying store-motion to an initialization (non-)loop thereby eliminating it. Bootstrap & regtest running on x86_64-unknown-linux-gnu. I didn't go further trying to exploit alias subset relationship instead but the alias-set zero case is obvious enough to be correct. Richard. 2019-04-15 Richard Biener PR tree-optimization/56049 * tree-ssa-loop-im.c (mem_ref_hasher::equal): Elide alias-set equality check if alias-set zero will prevail. * gfortran.dg/pr56049.f90: New testcase. Index: gcc/tree-ssa-loop-im.c === --- gcc/tree-ssa-loop-im.c (revision 270366) +++ gcc/tree-ssa-loop-im.c (working copy) @@ -178,7 +178,17 @@ mem_ref_hasher::equal (const im_mem_ref && known_eq (mem1->mem.size, obj2->size) && known_eq (mem1->mem.max_size, obj2->max_size) && mem1->mem.volatile_p == obj2->volatile_p - && mem1->mem.ref_alias_set == obj2->ref_alias_set + && (mem1->mem.ref_alias_set == obj2->ref_alias_set + /* We are not canonicalizing alias-sets but for the + special-case we didn't canonicalize yet and the + incoming ref is a alias-set zero MEM we pick + the correct one already. */ + || (!mem1->ref_canonical + && (TREE_CODE (obj2->ref) == MEM_REF + || TREE_CODE (obj2->ref) == TARGET_MEM_REF) + && obj2->ref_alias_set == 0) + /* Likewise if there's a canonical ref with alias-set zero. */ + || (mem1->ref_canonical && mem1->mem.ref_alias_set == 0)) && types_compatible_p (TREE_TYPE (mem1->mem.ref), TREE_TYPE (obj2->ref))); else Index: gcc/testsuite/gfortran.dg/pr56049.f90 === --- gcc/testsuite/gfortran.dg/pr56049.f90 (nonexistent) +++ gcc/testsuite/gfortran.dg/pr56049.f90 (working copy) @@ -0,0 +1,29 @@ +! { dg-do compile } +!
{ dg-options "-O3 -fdump-tree-optimized" } + +program inline + +integer i +integer a(8,8), b(8,8) + +a = 0 +do i = 1, 1000 +call add(b, a, 1) +a = b +end do + +print *, a + +contains + +subroutine add(b, a, o) +integer, intent(inout) :: b(8,8) +integer, intent(in) :: a(8,8), o +b = a + o +end subroutine add + +end program inline + +! Check there's no loop left, just two bb 2 in two functions. +! { dg-final { scan-tree-dump-times "" 2 "optimized" } } +! { dg-final { scan-tree-dump-times "" 2 "optimized" } }
Re: [PATCH] fix ICEs in c-attribs.c (PR 88383, 89288, 89798, 89797)
On 4/15/19 7:12 AM, Christophe Lyon wrote: > On Sat, 13 Apr 2019 at 00:38, Martin Sebor wrote: >> >> On 4/12/19 3:42 PM, Jakub Jelinek wrote: >>> On Fri, Apr 12, 2019 at 10:45:25AM -0600, Jeff Law wrote: > gcc/ChangeLog: > > PR c/89797 > * targhooks.c (default_vector_alignment): Avoid assuming > argument fits in SHWI. > * tree.h (TYPE_VECTOR_SUBPARTS): Avoid sign overflow in > a shift expression. > > gcc/c-family/ChangeLog: > > PR c/88383 > PR c/89288 > PR c/89798 > PR c/89797 > * c-attribs.c (type_valid_for_vector_size): Detect excessively > large sizes. >> ... >>> >>> Has the patch been tested at all? >> >> A few times. The c-attribs.c change above didn't make it into >> the commit. > > Hi, > Even with r270331, I'm still seeing the ICE on aarch64 (actually with > trunk @r270370) > > Is there still some commit missing? > Or perhaps something else is broken. My tester flagged these on aarch64: > New tests that FAIL (4 tests): > > gcc.dg/attr-vector_size.c (internal compiler error) > gcc.dg/attr-vector_size.c (test for excess errors) > gcc.dg/attr-vector_size.c LP64 (test for errors, line 33) > gcc.dg/attr-vector_size.c LP64 (test for errors, line 60) The rest of the tests passed. It could well be something different about the aarch64 port. Seems like a bit of debugging is advisable. jeff
Re: [PATCH] Add support for missing AVX512* ISAs (PR target/89929).
On Mon, Apr 15, 2019 at 12:26 AM Martin Liška wrote: > > On 4/12/19 4:12 PM, H.J. Lu wrote: > > On Fri, Apr 12, 2019 at 4:41 AM Martin Liška wrote: > >> > >> On 4/11/19 6:30 PM, H.J. Lu wrote: > >>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška wrote: > > Hi. > > The patch is adding missing AVX512 ISAs for target and target_clone > attributes. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > 2019-04-10 Martin Liska > > PR target/89929 > * config/i386/i386.c (get_builtin_code_for_version): Add > support for missing AVX512 ISAs. > > gcc/testsuite/ChangeLog: > > 2019-04-10 Martin Liska > > PR target/89929 > * g++.target/i386/mv28.C: New test. > * gcc.target/i386/mvc14.c: New test. > --- > gcc/config/i386/i386.c| 34 ++- > gcc/testsuite/g++.target/i386/mv28.C | 30 +++ > gcc/testsuite/gcc.target/i386/mvc14.c | 16 + > 3 files changed, 79 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/g++.target/i386/mv28.C > create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c > > > >>> > >> > >> Hi. > >> > >>> Since any ISAs beyond AVX512F may be enabled individually, we > >>> can't simply assign priorities to them. For GFNI, we can have > >>> > >>> 1. GFNI > >>> 2. GFNI + AVX > >>> 3. GFNI + AVX512F > >>> 4. GFNI + AVX512F + AVX512VL > >> > >> Makes sense to me! I'm considering syntax extension where one would be > >> able to come up with a priority. Eg. > >> > >> __attribute__((target("gfni,avx512bw", priority((3) > >> > >> Without that the ISA combinations are probably not comparable in a > >> reasonable way. 
> >> > >>> > >>> For this code, GFNI + AVX512BW is ignored: > >>> > >>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii > >>> __attribute__((target("gfni"))) > >>> int foo(int i) { > >>> return 1; > >>> } > >>> __attribute__((target("gfni,avx512bw"))) > >>> int foo(int i) { > >>> return 4; > >>> } > >>> __attribute__((target("default"))) > >>> int foo(int i) { > >>> return 3; > >>> } > >>> int bar () > >>> { > >>> return foo(2); > >>> } > >> > >> For 'target' attribute it works for me: > >> > >> 1) $ cat z.c && ./xg++ -B. z.c -c > >> #include > >> volatile __m512i x1, x2; > >> volatile __mmask64 m64; > >> > >> __attribute__((target("gfni"))) > >> int foo(int i) { > >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > >> return 1; > >> } > >> __attribute__((target("gfni,avx512bw"))) > >> int foo(int i) { > >> return 4; > >> } > >> __attribute__((target("default"))) > >> int foo(int i) { > >> return 3; > >> } > >> int bar () > >> { > >> return foo(2); > >> } > >> In file included from ./include/immintrin.h:117, > >> from ./include/x86intrin.h:32, > >> from z.c:1: > >> z.c: In function ‘int foo(int)’: > >> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa option > >> -m32 -mgfni -mavx512f > >> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > >> | ^~~~ > >> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment has > >> changed in GCC 4.6 > >> > >> 2) $ cat z.c && ./xg++ -B. z.c -c > >> #include > >> volatile __m512i x1, x2; > >> volatile __mmask64 m64; > >> > >> __attribute__((target("gfni"))) > >> int foo(int i) { > >> return 1; > >> } > >> __attribute__((target("gfni,avx512bw"))) > >> int foo(int i) { > >> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > >> return 4; > >> } > >> __attribute__((target("default"))) > >> int foo(int i) { > >> return 3; > >> } > >> int bar () > >> { > >> return foo(2); > >> } > >> > >> [OK] > >> > >> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ? > > > > It does look odd. 
> > Then let me take a look at this. > > > > >> Similar applies to target_clone attribute where we'll have to come up with > >> a syntax that will allow multiple ISA to be combined. Something like: > >> > >> __attribute__((target_clones("gfni+avx512bw"))) > >> > >> ? Priorities can be maybe implemented by order? > >> > > > > I am thinking -misa=processor which will enable ISAs for > > processor. It differs from -march=. -misa= doesn't set > > -mtune. > > > > Well, isn't that what we currently support, e.g.: > > $ cat mvc11.c && gcc mvc11.c -c > __attribute__((target_clones("arch=sandybridge", "arch=cascadelake", > "default"))) int > foo (void) > { > return 0; > } > > int > main () > { > foo (); > } > > If so, we can provide a new warning that will tell that for AVX512* on should > use 'arch=xyz' > instead? > 1. We don't have one option to enable AVX512F and AVX512CD, whic
[aarch64][RFA/RFC][rtl-optimization/87763] Add new movk pattern for aarch64
Here's my attempt to fix the movk regression on bz 87763. I still wonder if addressing some of these issues in combine is a better long term solution, but in the immediate term I think backend patterns are going to have to be the way to go. This introduces a new insn_and_split that matches a movk via the ior..and form. We rewrite it back into the zero-extract form once operands0 and operands1 match. This allows insn fusion in the scheduler to work as it expects the zero-extract form. While I have bootstrapped this on aarch64 and aarch64_be, I haven't done anything with ILP32. On aarch64 I have also run this through a regression test cycle where it fixes the movk regression identified in bz87763. Thoughts? If we're generally happy with this direction I can look to tackle the insv_1 and insv_2 regressions in a similar manner. Jeff * config/aarch64/aarch64.md: Add new pattern matching movk field insertion via (and (ior ...)). diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index ab8786a933e..109694f9ef0 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1161,6 +1161,54 @@ [(set_attr "type" "mov_imm")] ) +;; This is for the combiner to use to encourage creation of +;; bitfield insertions using movk. +;; +;; We rewrite back into a movk bitfield insertion to make sched +;; fusion happy the first chance we get where the appropriate +;; operands match. After LRA they should always match. 
+(define_insn_and_split "" + [(set (match_operand:GPI 0 "register_operand" "=r") + (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0") + (match_operand:GPI 2 "const_int_operand" "n")) +(match_operand:GPI 3 "const_int_operand" "n")))] + "((UINTVAL (operands[2]) == 0x + || UINTVAL (operands[2]) == 0x + || UINTVAL (operands[2]) == 0x + || UINTVAL (operands[2]) == 0x) +&& (UINTVAL (operands[2]) & UINTVAL (operands[3])) == 0)" + "#" + "&& rtx_equal_p (operands[0], operands[1])" + [(set (zero_extract: (match_dup 0) +(const_int 16) +(match_dup 2)) + (match_dup 3))] + "{ + if (UINTVAL (operands[2]) == 0x) + { + operands[2] = GEN_INT (0); + operands[3] = GEN_INT (UINTVAL (operands[3]) & 0x); + } + else if (UINTVAL (operands[2]) == 0x) + { + operands[2] = GEN_INT (16); + operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 16) & 0x); + } + else if (UINTVAL (operands[2]) == 0x) + { + operands[2] = GEN_INT (32); + operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 32) & 0x); + } + else if (UINTVAL (operands[2]) == 0x) + { + operands[2] = GEN_INT (48); + operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 48) & 0x); + } + else + gcc_unreachable (); + }" +) + (define_expand "movti" [(set (match_operand:TI 0 "nonimmediate_operand" "") (match_operand:TI 1 "general_operand" ""))]
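To make the equivalence between the ior..and form and a movk field insertion concrete, here is a small C sketch. The function names are made up, and the masks are an assumption: the hex constants in the pattern as archived here are cut off, so the sketch takes them to be "all ones except one aligned 16-bit field", which is what MOVK's semantics suggest.

```c
#include <stdint.h>

/* Field insertion as MOVK performs it: replace the 16-bit field at bit
   offset SHIFT (0, 16, 32 or 48) with IMM and keep all other bits. */
uint64_t movk(uint64_t x, uint16_t imm, unsigned shift)
{
    uint64_t field = (uint64_t)0xffff << shift;
    return (x & ~field) | ((uint64_t)imm << shift);
}

/* The ior/and form seen by combine: AND with a mask that clears one
   field, then IOR a constant whose bits lie only inside that field
   (the pattern's condition that mask & constant == 0). */
uint64_t ior_and(uint64_t x, uint64_t keep_mask, uint64_t bits)
{
    return (x & keep_mask) | bits;
}
```

When keep_mask clears exactly one aligned 16-bit field, ior_and computes the same value as movk with the matching shift and immediate, which is the rewrite the split performs once operands 0 and 1 are the same register.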
Re: [PATCH v2] Fix __patchable_function_entries section flags
On 4/12/19 1:19 PM, Jeff Law wrote:
> On 4/11/19 11:18 AM, Joao Moreira wrote:
>> When -fpatchable-function-entry is used, gcc places nops in the
>> prologue of each compiled function and creates a section named
>> __patchable_function_entries which holds relocation entries for the
>> positions in which the nops were placed. As is, gcc creates this
>> section without the proper section flags, causing crashes in the
>> compiled program during its load.
>>
>> Given the above, fix the problem by creating the section with the
>> SECTION_WRITE and SECTION_RELRO flags.
>>
>> The problem was noticed while compiling glibc with the
>> -fpatchable-function-entry compiler flag. After applying the patch,
>> this issue was solved.
>>
>> This was also tested on x86-64 arch without visible problems under
>> the gcc standard tests.
>>
>> 2019-04-10 Joao Moreira
>>
>> * targhooks.c (default_print_patchable_function_entry): Emit
>> __patchable_function_entries section with writable flags to allow
>> relocation resolution.
> OK. Do you have write access to the GCC repo?

No.

Tks,
Joao.

> jeff
[PATCH wwwdocs] Mention GNU Tools Cauldron in the News section
Hi, Here is a patch that adds a mention of the 2019 Cauldron, similar to the entries for the previous editions. Thanks, Simon Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.1125 diff -u -r1.1125 index.html --- index.html 29 Mar 2019 12:28:15 - 1.1125 +++ index.html 15 Apr 2019 16:39:00 - @@ -54,6 +54,10 @@ News +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron 2019 +[2019-04-15] +Held in Montréal, Canada, September 13-15 2019 + GCC 8.3 released [2019-02-22]
Re: [PATCH] (RFA tree-tailcall) PR c++/82081 - tail call optimization breaks noexcept
On Sun, Apr 14, 2019 at 11:50 PM Richard Biener wrote: > > On Sat, Apr 13, 2019 at 12:34 AM Jeff Law wrote: > > > > On 4/12/19 3:24 PM, Jason Merrill wrote: > > > If a noexcept function calls a function that might throw, doing the tail > > > call optimization means that an exception thrown in the called function > > > will propagate out, breaking the noexcept specification. So we need to > > > prevent the optimization in that case. > > > > > > Tested x86_64-pc-linux-gnu. OK for trunk or hold for GCC 10? This isn't > > > a > > > regression, but it is a straightforward fix for a wrong-code bug. > > > > > > * tree-tailcall.c (find_tail_calls): Don't turn a call from a > > > nothrow function to a might-throw function into a tail call. > > I'd go on the trunk. It's a wrong-code issue, what we're doing is just > > plain wrong. One could even make a case for backporting to the branches. > > Hmm, how's this different from adding another indirection? That is, > I don't understand why the tailcall is the issue here, shouldn't unwind > still stop at the noexcept caller? Thus, isn't this wrong CFI instead? After the tail call, the noexcept caller is no longer on the stack, so the unwinder does not see it. The issue is not a tail call from a normal function to a noexcept function, but a tail call made inside a noexcept caller to a normal function that may throw. > > Of course I know too little about this. > > Btw, doesn't your check also prevent tail/sibling calls when > the caller wraps it into a try { } catch (...) {}? Or does unwind > not work in that case either? > > Btw, I'd like to see a runtime testcase that fails. There is one in the bug report, though it would not work for the testsuite as is. It should not be hard to change it into one that does. Thanks, Andrew Pinski > > Richard. > > > jeff > > > > ps. I'm a bit surprised it hasn't been reported until now.
Re: [PATCH wwwdocs] Mention GNU Tools Cauldron in the News section
On 2019-04-15 12:42 p.m., Simon Marchi wrote: > Hi, > > Here is a patch that adds a mention of the 2019 Cauldron, similar to the > entries > for the previous editions. > > Thanks, > > Simon > > > Index: index.html > === > RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v > retrieving revision 1.1125 > diff -u -r1.1125 index.html > --- index.html29 Mar 2019 12:28:15 - 1.1125 > +++ index.html15 Apr 2019 16:39:00 - > @@ -54,6 +54,10 @@ > News > > > +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron > 2019 > +[2019-04-15] > +Held in Montréal, Canada, September 13-15 2019 > + > GCC 8.3 released > [2019-02-22] > > Actually, it would be better to use the same dates as are written on the wiki (12-15), so please consider the patch below instead. Also, please note that I don't have push access on GCC, so if somebody could push the patch for me, once it's approved, I would appreciate it. Thanks! Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.1125 diff -u -r1.1125 index.html --- index.html 29 Mar 2019 12:28:15 - 1.1125 +++ index.html 15 Apr 2019 17:34:48 - @@ -54,6 +54,10 @@ News +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron 2019 +[2019-04-15] +Held in Montréal, Canada, September 12-15 2019 + GCC 8.3 released [2019-02-22]
Re: [patch] Fix PR 84487, large rodata increase in tonto and other programs
On Mon, Apr 15, 2019 at 01:54:11PM +0200, Florian Weimer wrote: > * Richard Biener: > > > Of course adding sth like a .robss section would be nice. > > I think this is strictly a link editor issue because a read-only PT_LOAD > directive with a memory size larger than the file size already produces > read-only zero pages, without requiring a file allocation. But .rodata normally is not the last thing in its segment (the .eh* things are after it, and those are usually not all zero). Segher
Re: [C++ Patch/RFC] PR 89900 ("[9 Regression] ICE: Segmentation fault (in check_instantiated_arg)")
Hi, On 12/04/19 20:29, Jason Merrill wrote: On 4/11/19 11:20 AM, Paolo Carlini wrote: Hi, over the last few days I spent some time on this regression, which at first seemed just a minor error-recovery issue, but then I noticed that very slightly tweeking the original testcase uncovered a pretty serious ICE on valid: template void fk (XE..., int/*SW*/); void w9 (void) { fk (0); } The regression has to do with the changes committed by Jason for c++/86932, in particular with the condition in coerce_template_parms: if (template_parameter_pack_p (TREE_VALUE (parm)) && (arg || !(complain & tf_partial)) && !(arg && ARGUMENT_PACK_P (arg))) which has the additional (arg || !complain & tf_partial)) false for the present testcase, thus the null arg is not changed into an empty pack, thus later instantiate_template calls check_instantiated_args which finds it still null and crashes. Now, likely some additional analysis is in order, but for sure there is an important difference between the testcase which came with c++/86932 and the above: non-type vs type template parameter pack. It seems to me that the kind of problem fixed in c++/86932 cannot occur with type packs, because it boils down to a reference to a previous parm (full disclosure: the comments and logic in fixed_parameter_pack_p helped me a lot here). Thus I had the idea of simply restricting the scope of the new condition above by adding an || TREE_CODE (TREE_VALUE (parm)) == TYPE_DECL, which definitely leads to a clean testsuite and a proper behavior on the new testcases, AFAICS. I'm attaching what I tested on x86_64-linux. I think the important property here is that it's non-terminal, not that it's a type pack. We can't deduce anything for a non-terminal pack, so we should go ahead and make an empty pack. I see. Then what about something bolder, like the below? 
Instead of fiddling with the details of coerce_template_parms - how it
handles the explicit arguments - in fn_type_unification we deal with both
parameter_pack == true and false in the same way when targ == NULL_TREE,
thus we set incomplete. Then, for the new testcases, since incomplete is
true, there is no jump to the deduced label and type_unification_real
takes care of making the empty pack - the same happens already when there
are no explicit arguments.

Tested x86_64-linux. I also checked quite a few other variants of the
tests but nothing new, nothing interesting, showed up...

Thanks, Paolo.

Index: cp/pt.c
===
--- cp/pt.c (revision 270364)
+++ cp/pt.c (working copy)
@@ -20176,21 +20176,17 @@ fn_type_unification (tree fn,
           parameter_pack = TEMPLATE_PARM_PARAMETER_PACK (parm);
         }
 
-      if (!parameter_pack && targ == NULL_TREE)
+      if (targ == NULL_TREE)
         /* No explicit argument for this template parameter. */
         incomplete = true;
-
-      if (parameter_pack && pack_deducible_p (parm, fn))
+      else if (parameter_pack && pack_deducible_p (parm, fn))
         {
           /* Mark the argument pack as "incomplete". We could
              still deduce more arguments during unification.
              We remove this mark in type_unification_real. */
-          if (targ)
-            {
-              ARGUMENT_PACK_INCOMPLETE_P(targ) = 1;
-              ARGUMENT_PACK_EXPLICIT_ARGS (targ)
-                = ARGUMENT_PACK_ARGS (targ);
-            }
+          ARGUMENT_PACK_INCOMPLETE_P(targ) = 1;
+          ARGUMENT_PACK_EXPLICIT_ARGS (targ)
+            = ARGUMENT_PACK_ARGS (targ);
 
           /* We have some incomplete argument packs. */
           incomplete = true;
Index: testsuite/g++.dg/cpp0x/pr89900-1.C
===
--- testsuite/g++.dg/cpp0x/pr89900-1.C (nonexistent)
+++ testsuite/g++.dg/cpp0x/pr89900-1.C (working copy)
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template void
+fk (XE..., SW); // { dg-error "12:.SW. has not been declared" }
+
+void
+w9 (void)
+{
+  fk (0);
+}
Index: testsuite/g++.dg/cpp0x/pr89900-2.C
===
--- testsuite/g++.dg/cpp0x/pr89900-2.C (nonexistent)
+++ testsuite/g++.dg/cpp0x/pr89900-2.C (working copy)
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template void
+fk (XE..., int);
+
+void
+w9 (void)
+{
+  fk (0);
+}
Index: testsuite/g++.dg/cpp0x/pr89900-3.C
===
--- testsuite/g++.dg/cpp0x/pr89900-3.C (nonexistent)
+++ testsuite/g++.dg/cpp0x/pr89900-3.C (working copy)
@@ -0,0 +1,10 @@
+// { dg-do compile { target c++11 } }
+
+template void
+fk (XE..., SW); // { dg-error "12:.SW. has not been declared" }
+
+void
+w9 (void)
+{
+  fk (0);
+}
Index: testsuite/g++.dg/cpp0x/p
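For readers following the thread, the behavior under discussion (a non-terminal function parameter pack cannot be deduced, so it ends up as an empty pack unless explicit arguments are given) can be sketched in standalone C++11. This is only an illustration: the template header of the testcases did not survive in this archive, so the `template <typename... XE>` header below is an assumption, and `check_pack_sizes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// Assumed shape of the testcase: a type pack XE that is NOT the last
// function parameter, followed by a fixed int parameter.
template <typename... XE>
constexpr std::size_t fk(XE..., int) {
  // Report how many types the pack ended up holding.
  return sizeof...(XE);
}

// Hypothetical helper exercising both situations from the thread.
inline void check_pack_sizes() {
  // No explicit arguments: the non-terminal pack is a non-deduced
  // context, so XE becomes the empty pack and 0 binds to the int.
  assert(fk(0) == 0);

  // Explicit argument: XE is {int}; nothing more is deduced for it.
  assert(fk<int>(1, 0) == 1);
}
```

Calling `check_pack_sizes()` from `main` should pass silently on a conforming compiler; it says nothing about the internal ICE itself, only about the language rule the patch relies on.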
[PATCH, libphobos] Committed merge with upstream druntime
Hi, This patch merges the libdruntime library with upstream druntime 70b9fea6. Backports fixes in the extern(C) bindings for the Solaris/SPARC port. Bootstrapped and regression tested on x86_64-linux-gnu and i386-pc-solaris2.11. Committed to trunk as r270372. -- Iain --- diff --git a/libphobos/libdruntime/MERGE b/libphobos/libdruntime/MERGE index a7bbd3da964..dd5f621082f 100644 --- a/libphobos/libdruntime/MERGE +++ b/libphobos/libdruntime/MERGE @@ -1,4 +1,4 @@ -175bf5fc69d26fec60d533ff77f7e915fd5bb468 +70b9fea60246e63d936ad6826b1b48b6e0f1de8f The first line of this file holds the git revision number of the last merge done from the dlang/druntime repository. diff --git a/libphobos/libdruntime/core/sys/posix/ucontext.d b/libphobos/libdruntime/core/sys/posix/ucontext.d index 52b16864917..6200bfc3fe2 100644 --- a/libphobos/libdruntime/core/sys/posix/ucontext.d +++ b/libphobos/libdruntime/core/sys/posix/ucontext.d @@ -25,6 +25,10 @@ nothrow: version (RISCV32) version = RISCV_Any; version (RISCV64) version = RISCV_Any; +version (SPARC) version = SPARC_Any; +version (SPARC64) version = SPARC_Any; +version (X86) version = X86_Any; +version (X86_64) version = X86_Any; // // XOpen (XSI) @@ -1029,6 +1033,8 @@ else version (DragonFlyBSD) } else version (Solaris) { +private import core.stdc.stdint; + alias uint[4] upad128_t; version (SPARC64) @@ -1127,10 +1133,13 @@ else version (Solaris) } else version (X86_64) { -union _u_st +private { -ushort[5] fpr_16; -upad128_t __fpr_pad; +union _u_st +{ +ushort[5] fpr_16; +upad128_t __fpr_pad; +} } struct fpregset_t @@ -1189,20 +1198,94 @@ else version (Solaris) else static assert(0, "unimplemented"); -struct mcontext_t +version (SPARC_Any) { -gregset_t gregs; -fpregset_t fpregs; +private +{ +struct rwindow +{ +greg_t[8] rw_local; +greg_t[8] rw_in; +} + +struct gwindows_t +{ +int wbcnt; +greg_t[31] *spbuf; +rwindow[31] wbuf; +} + +struct xrs_t +{ +uint xrs_id; +caddr_t xrs_ptr; +} + +struct cxrs_t +{ +uint cxrs_id; +caddr_t cxrs_ptr; +} 
+ +alias int64_t[16] asrset_t; +} + +struct mcontext_t +{ +gregset_tgregs; +gwindows_t *gwins; +fpregset_t fpregs; +xrs_txrs; +version (SPARC64) +{ +asrset_t asrs; +cxrs_t cxrs; +c_long[2] filler; +} +else version (SPARC) +{ +cxrs_t cxrs; +c_long[17] filler; +} +} +} +else version (X86_Any) +{ +private +{ +struct xrs_t +{ +uint xrs_id; +caddr_t xrs_ptr; +} +} + +struct mcontext_t +{ +gregset_t gregs; +fpregset_t fpregs; +} } struct ucontext_t { -c_ulong uc_flags; +version (SPARC_Any) +uintuc_flags; +else version (X86_Any) +c_ulong uc_flags; ucontext_t *uc_link; sigset_tuc_sigmask; stack_t uc_stack; mcontext_t uc_mcontext; -c_long[5] uc_filler; +version (SPARC64) +c_long[4] uc_filler; +else version (SPARC) +c_long[23] uc_filler; +else version (X86_Any) +{ +xrs_t uc_xrs; +c_long[3] uc_filler; +} } } else version (CRuntime_UClibc) @@ -1399,7 +1482,20 @@ int swapcontext(ucontext_t*, in ucontext_t*); static if ( is( ucontext_t ) ) { int getcontext(ucontext_t*); -void makecontext(ucontext_t*, void function(), int, ...); + +version (Solaris) +{ +version (SPARC_Any) +{ +void __makecontext_v2(ucontext_t*, void function(), int, ...); +alias makecontext = __makecontext_v2; +} +else +void makecontext(ucontext_t*, void function(), int, ...); +} +else +void makecontext(ucontext_t*, void function(), int, ...); + int setcontext(in ucontext_t*); int swapcontext(ucontext_t*, in ucontext_t*); } diff --git a/libphobos/libdruntime/core/sys/solaris/link.d b/libphobos/libdruntime/core/sys/solaris/link.d index c3e75de481e..2d908b12184 100644 --- a/libphobos/libdruntime/core/sys/solaris/link.d +++ b/libphobos/libdrunt
[PATCH rs6000] Fix PR target/84369: gcc.dg/sms-10.c fails on Power9
As pointed out in the PR, the test is failing because a store->load
dependency is reporting zero cost. Fixed by leaving existing costs as is
(i.e. cost for update forms), and just adding a simple bypass for
store->load dependencies.

Bootstrap/regtest on powerpc64le (Power9) with no new regressions and
testcase now passing. Also ran cpu2006/cpu2017 benchmark comparisons with
no notable differences. Ok for trunk?

-Pat

2019-04-15  Pat Haugen

    PR target/84369
    * config/rs6000/power9.md: Add store forwarding bypass.

Index: gcc/config/rs6000/power9.md
===
--- gcc/config/rs6000/power9.md (revision 270261)
+++ gcc/config/rs6000/power9.md (working copy)
@@ -236,6 +236,9 @@ (define_insn_reservation "power9-vecstor
        (eq_attr "cpu" "power9"))
   "DU_super_power9,LSU_pair_power9")
 
+; Store forwarding latency is 6
+(define_bypass 6 "power9-*store*" "power9-*load*")
+
 (define_insn_reservation "power9-larx" 4
   (and (eq_attr "type" "load_l")
        (eq_attr "cpu" "power9"))
[committed] Fix various microblaze-linux failures
microblaze testing in my tester has occasionally been failing
Warray-bounds-40 and Wstringop-overflow-9. I finally took a little peek
because these occasional failures show up as a regression against the
prior run.

It looks like the microblaze backend is trying to inline a move of
SIZE_MAX bytes. Ugh. Not surprisingly the problem is the target bits
treating the size as a signed integer in a comparison.

Fixing this is pretty simple thankfully. I didn't audit the entire port,
just microblaze_expand_block_move.

Here's what I'm installing on the trunk -- it basically ensures we treat
the size and alignment as unsigned values. It also fixes errors with
string-large-1.c.

Jeff

    * config/microblaze/microblaze.c (microblaze_expand_block_move): Treat
    size and alignment as unsigned.

diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c
index 70910fd1dde..55c1becf975 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -1258,8 +1258,8 @@ microblaze_expand_block_move (rtx dest, rtx src, rtx length, rtx align_rtx)
 
   if (GET_CODE (length) == CONST_INT)
     {
-      HOST_WIDE_INT bytes = INTVAL (length);
-      int align = INTVAL (align_rtx);
+      unsigned HOST_WIDE_INT bytes = UINTVAL (length);
+      unsigned int align = UINTVAL (align_rtx);
 
       if (align > UNITS_PER_WORD)
         {
@@ -1267,7 +1267,7 @@ microblaze_expand_block_move (rtx dest, rtx src, rtx length, rtx align_rtx)
         }
       else if (align < UNITS_PER_WORD)
         {
-          if (INTVAL (length) <= MAX_MOVE_BYTES)
+          if (UINTVAL (length) <= MAX_MOVE_BYTES)
             {
               move_by_pieces (dest, src, bytes, align, RETURN_BEGIN);
               return true;
@@ -1276,14 +1276,14 @@ microblaze_expand_block_move (rtx dest, rtx src, rtx length, rtx align_rtx)
           return false;
         }
 
-      if (INTVAL (length) <= 2 * MAX_MOVE_BYTES)
+      if (UINTVAL (length) <= 2 * MAX_MOVE_BYTES)
         {
-          microblaze_block_move_straight (dest, src, INTVAL (length));
+          microblaze_block_move_straight (dest, src, UINTVAL (length));
           return true;
         }
       else if (optimize)
         {
-          microblaze_block_move_loop (dest, src, INTVAL (length));
+          microblaze_block_move_loop (dest, src, UINTVAL (length));
           return true;
         }
     }
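The failure mode described above, a SIZE_MAX length passing a "small enough to inline" check because it is compared as a signed value, can be reproduced outside GCC with a small sketch. Everything here is illustrative: `kMaxMoveBytes` is a made-up stand-in for the port's MAX_MOVE_BYTES (whose real value is not shown in this mail), and the two functions only mimic the shape of the old and new comparisons, not the backend code itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical threshold standing in for the backend's MAX_MOVE_BYTES.
constexpr std::int64_t kMaxMoveBytes = 8;

// Shape of the old check: the length is read back as a signed value,
// so an all-ones length such as SIZE_MAX turns into -1.
inline bool small_enough_signed(std::uint64_t length) {
  const std::int64_t bytes = static_cast<std::int64_t>(length);
  return bytes <= kMaxMoveBytes;
}

// Shape of the fixed check: the length stays unsigned throughout.
inline bool small_enough_unsigned(std::uint64_t length) {
  return length <= static_cast<std::uint64_t>(kMaxMoveBytes);
}

// Hypothetical helper showing where the two checks disagree.
inline void check_block_move_limits() {
  // Ordinary small sizes behave the same either way.
  assert(small_enough_signed(4) && small_enough_unsigned(4));

  // A huge length wrongly looks tiny when viewed as signed...
  assert(small_enough_signed(UINT64_MAX));
  // ...and is correctly rejected when compared as unsigned.
  assert(!small_enough_unsigned(UINT64_MAX));
}
```

The sketch assumes a two's-complement target, which is what every mainstream compiler provides and what C++20 requires.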
New French PO file for 'gcc' (version 9.1-b20190414)
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted by the
French team of translators. The file is available at:

    https://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-9.1-b20190414.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

    https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below. The tarball may be just a pretest or a
snapshot, it does not even have to compile. It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

    https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.
Re: [PATCH wwwdocs] Mention GNU Tools Cauldron in the News section
On 4/15/19, Simon Marchi wrote:
> Hi,
>
> Here is a patch that adds a mention of the 2019 Cauldron, similar to the
> entries for the previous editions.
>
> Thanks,
>
> Simon
>
>
> Index: index.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
> retrieving revision 1.1125
> diff -u -r1.1125 index.html
> --- index.html 29 Mar 2019 12:28:15 - 1.1125
> +++ index.html 15 Apr 2019 16:39:00 -
> @@ -54,6 +54,10 @@
> News
>
>
> +https://gcc.gnu.org/wiki/cauldron2019";>GNU Tools Cauldron 2019
> +[2019-04-15]
> +Held in Montréal, Canada, September 13-15 2019
> +

Hey Montréal, I might actually be able to go this year! How do I register?

> GCC 8.3 released
> [2019-02-22]
>
>

Eric Gallager
[PATCH, libphobos] Committed fix configure test for backtrace-supported.h
Hi,

When porting/testing the D front-end to FreeBSD, I noticed that backtrace
supported returned false during the configuration of libphobos. The use
of += assignment in the configure test was the reason why, and now that's
been corrected.

Bootstrapped and regression tested on x86_64-linux-gnu and
x86_64-freebsd11.2.

Committed to trunk as r270377.

--
Iain
---
libphobos/ChangeLog:

2019-04-16  Iain Buclaw

    * config.h.in: Regenerate.
    * configure: Regenerate.
    * m4/druntime/libraries.m4 (DRUNTIME_LIBRARIES_BACKTRACE): Set
    CPPFLAGS correctly for backtrace support test.
---
diff --git a/libphobos/config.h.in b/libphobos/config.h.in
index 19266b3b5e4..0249849c890 100644
--- a/libphobos/config.h.in
+++ b/libphobos/config.h.in
@@ -54,3 +54,35 @@
 
 /* Define to 1 if you have the ANSI C header files. */
 #undef STDC_HEADERS
+
+/* Enable extensions on AIX 3, Interix. */
+#ifndef _ALL_SOURCE
+# undef _ALL_SOURCE
+#endif
+/* Enable GNU extensions on systems that have them. */
+#ifndef _GNU_SOURCE
+# undef _GNU_SOURCE
+#endif
+/* Enable threading extensions on Solaris. */
+#ifndef _POSIX_PTHREAD_SEMANTICS
+# undef _POSIX_PTHREAD_SEMANTICS
+#endif
+/* Enable extensions on HP NonStop. */
+#ifndef _TANDEM_SOURCE
+# undef _TANDEM_SOURCE
+#endif
+/* Enable general extensions on Solaris. */
+#ifndef __EXTENSIONS__
+# undef __EXTENSIONS__
+#endif
+
+
+/* Define to 1 if on MINIX. */
+#undef _MINIX
+
+/* Define to 2 if the system does not provide POSIX.1 features except with
+   this defined. */
+#undef _POSIX_1_SOURCE
+
+/* Define to 1 if you need to in order for `stat' and other things to work. */
+#undef _POSIX_SOURCE
diff --git a/libphobos/configure b/libphobos/configure
index 87e4e4a7c9b..8079a73527d 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -14838,7 +14838,7 @@ fi
     LIBBACKTRACE=../../libbacktrace/libbacktrace.la
 
     gdc_save_CPPFLAGS=$CPPFLAGS
-    CPPFLAGS+=" -I../libbacktrace "
+    CPPFLAGS="$CPPFLAGS -I../libbacktrace "
 
     ac_fn_c_check_header_mongrel "$LINENO" "backtrace-supported.h" "ac_cv_header_backtrace_supported_h" "$ac_includes_default"
 if test "x$ac_cv_header_backtrace_supported_h" = xyes; then :
diff --git a/libphobos/m4/druntime/libraries.m4 b/libphobos/m4/druntime/libraries.m4
index 6e81fd99e4b..a7aab4dd88b 100644
--- a/libphobos/m4/druntime/libraries.m4
+++ b/libphobos/m4/druntime/libraries.m4
@@ -178,7 +178,7 @@ AC_DEFUN([DRUNTIME_LIBRARIES_BACKTRACE],
     LIBBACKTRACE=../../libbacktrace/libbacktrace.la
 
     gdc_save_CPPFLAGS=$CPPFLAGS
-    CPPFLAGS+=" -I../libbacktrace "
+    CPPFLAGS="$CPPFLAGS -I../libbacktrace "
 
     AC_CHECK_HEADER(backtrace-supported.h, have_libbacktrace_h=true,
       have_libbacktrace_h=false)