Re: [PATCH 2/5] completely_scalarize arrays as well as records
On 27 August 2015 at 17:43, Alan Lawrence wrote: > Martin Jambor wrote: >> >> First, I would be much >> happier if you added a proper comment to scalarize_elem function which >> you forgot completely. The name is not very descriptive and it has >> quite few parameters too. >> >> Second, this patch should also fix PR 67283. It would be great if you >> could verify that and add it to the changelog when committing if that >> is indeed the case. > > Thanks for pointing both of those out. I've added a comment to scalarize_elem, > deleted the bogus comment in the new test, and yes I can confirm that the > patch > fixes PR 67283 on x86_64, and also AArch64 if > --param sra-max-scalarization-size-Ospeed is passed. (I've not added any > testcase specifically taken from that PR, however.) > > Pushed as r277265. Actually, it is r227265. Since that commit, I've noticed that g++.dg/torture/pr64312.C fails at -O1 in my config, reporting "virtual memory exhaustion" (arm* targets). I run my validations under ulimit -v 10GB, which already seems large enough. Do we consider this a bug? Christophe. > --- > gcc/testsuite/gcc.dg/tree-ssa/sra-15.c | 37 > gcc/tree-sra.c | 149 > ++--- > 2 files changed, 138 insertions(+), 48 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-15.c > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c > b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c > new file mode 100644 > index 000..a22062e > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c > @@ -0,0 +1,37 @@ > +/* Verify that SRA total scalarization works on records containing arrays. 
> */ > +/* { dg-do run } */ > +/* { dg-options "-O1 -fdump-tree-release_ssa --param > sra-max-scalarization-size-Ospeed=32" } */ > + > +extern void abort (void); > + > +struct S > +{ > + char c; > + unsigned short f[2][2]; > + int i; > + unsigned short f3, f4; > +}; > + > + > +int __attribute__ ((noinline)) > +foo (struct S *p) > +{ > + struct S l; > + > + l = *p; > + l.i++; > + l.f[1][0] += 3; > + *p = l; > +} > + > +int > +main (int argc, char **argv) > +{ > + struct S a = {0, { {5, 7}, {9, 11} }, 4, 0, 0}; > + foo (&a); > + if (a.i != 5 || a.f[1][0] != 12) > +abort (); > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "l;" 0 "release_ssa" } } */ > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 8b3a0ad..3caf84a 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -915,73 +915,126 @@ create_access (tree expr, gimple stmt, bool write) > } > > > -/* Return true iff TYPE is a RECORD_TYPE with fields that are either of > gimple > - register types or (recursively) records with only these two kinds of > fields. > - It also returns false if any of these records contains a bit-field. */ > +/* Return true iff TYPE is scalarizable - i.e. a RECORD_TYPE or ARRAY_TYPE > with > + fields that are either of gimple register types (excluding bit-fields) > + or (recursively) scalarizable types. 
*/ > > static bool > -type_consists_of_records_p (tree type) > +scalarizable_type_p (tree type) > { > - tree fld; > + gcc_assert (!is_gimple_reg_type (type)); > > - if (TREE_CODE (type) != RECORD_TYPE) > -return false; > + switch (TREE_CODE (type)) > + { > + case RECORD_TYPE: > +for (tree fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld)) > + if (TREE_CODE (fld) == FIELD_DECL) > + { > + tree ft = TREE_TYPE (fld); > > - for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld)) > -if (TREE_CODE (fld) == FIELD_DECL) > - { > - tree ft = TREE_TYPE (fld); > + if (DECL_BIT_FIELD (fld)) > + return false; > > - if (DECL_BIT_FIELD (fld)) > - return false; > + if (!is_gimple_reg_type (ft) > + && !scalarizable_type_p (ft)) > + return false; > + } > > - if (!is_gimple_reg_type (ft) > - && !type_consists_of_records_p (ft)) > - return false; > - } > +return true; > > - return true; > + case ARRAY_TYPE: > +{ > + tree elem = TREE_TYPE (type); > + if (DECL_P (elem) && DECL_BIT_FIELD (elem)) > + return false; > + if (!is_gimple_reg_type (elem) > +&& !scalarizable_type_p (elem)) > + return false; > + return true; > +} > + default: > +return false; > + } > } > > -/* Create total_scalarization accesses for all scalar type fields in DECL > that > - must be of a RECORD_TYPE conforming to type_consists_of_records_p. BASE > - must be the top-most VAR_DECL representing the variable, OFFSET must be > the > - offset of DECL within BASE. REF must be the memory reference expression > for > - the given decl. */ > +static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree); > + > +/* Create total_scalarization accesses for all scalar fields of a member > + of type DECL_TYPE conforming to scalarizable_type_p. BASE > + must b
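The recursion in scalarizable_type_p quoted above can be sketched outside GCC. What follows is only a toy model under stated assumptions: ToyType, ToyField, ToyKind and every toy_* function are hypothetical stand-ins, not GCC's tree API, and array bounds are not modelled at all. It is meant to show the shape of the check: records recurse over their fields and reject bit-fields, arrays recurse into their element type, and everything else is rejected.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins for GCC's type trees; none of these names exist in GCC.
enum class ToyKind { Scalar, Record, Array, Other };

struct ToyType;
using TypePtr = std::shared_ptr<const ToyType>;

struct ToyField {
  TypePtr type;
  bool is_bit_field;
};

struct ToyType {
  ToyKind kind = ToyKind::Other;
  std::vector<ToyField> fields;  // used when kind == Record
  TypePtr element;               // used when kind == Array
};

bool toy_scalarizable_p(const ToyType &t);

// A member is acceptable if it is a scalar or itself scalarizable.
static bool toy_member_ok(const ToyType &t) {
  return t.kind == ToyKind::Scalar || toy_scalarizable_p(t);
}

// Same shape as the patched predicate: only aggregates can qualify.
bool toy_scalarizable_p(const ToyType &t) {
  switch (t.kind) {
    case ToyKind::Record:
      for (const ToyField &f : t.fields) {
        if (f.is_bit_field) return false;
        if (!toy_member_ok(*f.type)) return false;
      }
      return true;
    case ToyKind::Array:
      return toy_member_ok(*t.element);
    default:
      return false;
  }
}

TypePtr toy_scalar() {
  auto t = std::make_shared<ToyType>();
  t->kind = ToyKind::Scalar;
  return t;
}

TypePtr toy_array(TypePtr element) {
  auto t = std::make_shared<ToyType>();
  t->kind = ToyKind::Array;
  t->element = element;
  return t;
}

TypePtr toy_record(std::vector<ToyField> fields) {
  auto t = std::make_shared<ToyType>();
  t->kind = ToyKind::Record;
  t->fields = fields;
  return t;
}

// Rough model of struct S from sra-15.c (array sizes are ignored).
TypePtr toy_struct_S() {
  TypePtr f = toy_array(toy_array(toy_scalar()));  // unsigned short f[2][2]
  return toy_record({{toy_scalar(), false},        // char c
                     {f, false},                   // f
                     {toy_scalar(), false},        // int i
                     {toy_scalar(), false},        // unsigned short f3
                     {toy_scalar(), false}});      // unsigned short f4
}
```

In this model toy_struct_S() qualifies, while a record containing a single bit-field member does not.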
Re: C++ delayed folding branch review
2015-08-28 4:11 GMT+02:00 Jason Merrill : > On 08/27/2015 02:12 PM, Kai Tietz wrote: >> >> + else if (TREE_CODE (type) == VECTOR_TYPE) >> +{ >> + if (TREE_CODE (arg1) == VECTOR_CST >> + && code == NOP_EXPR >> + && TYPE_VECTOR_SUBPARTS (type) == VECTOR_CST_NELTS (arg1)) >> + { >> + tree r = copy_node (arg1); >> + TREE_TYPE (arg1) = type; >> + return r; >> + } >> +} > > > I would drop the check on 'code' and add a check that > > TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (TREE_TYPE (arg1)) > > Does that still pass? Yes, it still passes. Checking for the main variant here seems more robust. I'll commit it to the branch and run complete regression testing on it. > Jason Kai
Re: [PATCH] Guard early debug generation with !seen_errors ()
On Thu, 27 Aug 2015, Richard Biener wrote: > > This is already done for the TYPE_DECL case, the following avoids > trying to generate any debug after we've seen errors for the > calls to early_global_decl. > > Bootstrapped on x86_64-unknown-linux-gnu, testing and gdb testing in > progress. The following fixes early LTO debug fallout. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2015-08-28 Richard Biener * cgraphunit.c (symbol_table::compile): Move early debug generation and finish... (symbol_table::finalize_compilation_unit): ... back here and add a !seen_error () guard. Index: gcc/cgraphunit.c === --- gcc/cgraphunit.c(revision 227258) +++ gcc/cgraphunit.c(working copy) @@ -2314,16 +2314,6 @@ symbol_table::compile (void) symtab_node::verify_symtab_nodes (); #endif - /* Emit early debug for reachable functions, and by consequence, - locally scoped symbols. */ - struct cgraph_node *cnode; - FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (cnode) -(*debug_hooks->early_global_decl) (cnode->decl); - - /* Clean up anything that needs cleaning up after initial debug - generation. */ - (*debug_hooks->early_finish) (); - timevar_push (TV_CGRAPHOPT); if (pre_ipa_mem_report) { @@ -2492,6 +2482,19 @@ symbol_table::finalize_compilation_unit /* Gimplify and lower thunks. */ analyze_functions (/*first_time=*/false); + if (!seen_error ()) +{ + /* Emit early debug for reachable functions, and by consequence, +locally scoped symbols. */ + struct cgraph_node *cnode; + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (cnode) + (*debug_hooks->early_global_decl) (cnode->decl); + + /* Clean up anything that needs cleaning up after initial debug +generation. */ + (*debug_hooks->early_finish) (); +} + /* Finally drive the pass manager. */ compile ();
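The reordering in the patch above can be pictured with a small sketch. It is a toy model only: ToyCompilation and toy_run are hypothetical names, and the log strings merely stand in for the real phases. It shows the early-debug step running between analysis and the pass manager, and only when no error has been seen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the finalize step; not GCC's symbol_table.
struct ToyCompilation {
  int error_count = 0;
  std::vector<std::string> functions;
  std::vector<std::string> log;

  bool seen_error() const { return error_count > 0; }

  void finalize() {
    log.push_back("analyze");
    if (!seen_error()) {
      // Early debug is only attempted for error-free input.
      for (const std::string &fn : functions)
        log.push_back("early-debug:" + fn);
      log.push_back("early-finish");
    }
    log.push_back("compile");
  }
};

// Runs the toy pipeline on two functions with the given error count.
std::vector<std::string> toy_run(int errors) {
  ToyCompilation c;
  c.error_count = errors;
  c.functions = {"f", "g"};
  c.finalize();
  return c.log;
}
```

With no errors the log has five entries; with one error the two early-debug entries and the early-finish entry are skipped.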
Re: [PATCH 2/5] completely_scalarize arrays as well as records
On Fri, 28 Aug 2015, Christophe Lyon wrote: > On 27 August 2015 at 17:43, Alan Lawrence wrote: > > Martin Jambor wrote: > >> > >> First, I would be much > >> happier if you added a proper comment to scalarize_elem function which > >> you forgot completely. The name is not very descriptive and it has > >> quite few parameters too. > >> > >> Second, this patch should also fix PR 67283. It would be great if you > >> could verify that and add it to the changelog when committing if that > >> is indeed the case. > > > > Thanks for pointing both of those out. I've added a comment to > > scalarize_elem, > > deleted the bogus comment in the new test, and yes I can confirm that the > > patch > > fixes PR 67283 on x86_64, and also AArch64 if > > --param sra-max-scalarization-size-Ospeed is passed. (I've not added any > > testcase specifically taken from that PR, however.) > > > > Pushed as r277265. > > Actually, is r227265. > > Since since commit I've noticed that > g++.dg/torture/pr64312.C > fails at -O1 in my config, saying "virtual memory exhaustion" (arm* targets) > I run my validations under ulimit -v 10GB, which seems already large enough. > > Do we consider this a bug? Sure we do. You have to investigate this (I guess we run into some endless looping/recursing that eats memory somewhere). Richard.
Re: [ARC] Cleanup A5 references
Ping Original submission: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01199.html Thanks, Claudiu On Thu, Aug 20, 2015 at 1:42 PM, Claudiu Zissulescu wrote: > This patch cleans up the references to obsolete A5 processor. > > Can this be committed? > > Thanks, > Claudiu > > > 2015-08-20 Claudiu Zissulescu > > * common/config/arc/arc-common.c, config/arc/arc-opts.h, > config/arc/arc.c, config/arc/arc.h, config/arc/arc.md, > config/arc/arc.opt, config/arc/constraints.md, > config/arc/t-arc-newlib: Remove references to A5. >
Re: [patch,libgfortran] Fix configure test for weakref support
ping**2 Given that it’s a configury-patch, I think what it needs is real exposure on unusual targets more than formal review, so I intend to commit it in 48 hours unless someone objects in the meantime. Then I’ll be around to fix things if anything falls apart… FX > On 14 August 2015 at 17:06, FX wrote: > > This patch fixes PR 47571 > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47571) by providing an adequate > configure check (with AC_LINK_IFELSE) for weakref (which we use in > intrinsics/system_clock.c). > > Tested by building libgfortran on x86_64-pc-linux-gnu, where SUPPORTS_WEAKREF > gets defined to 1 as it should, and x86_64-apple-darwin14, where > SUPPORTS_WEAKREF gets defined to 0 as it should. Then regtested on both > targets. > > OK to commit to trunk? > > FX 2015-08-14 Francois-Xavier Coudert PR libfortran/47571 * acinclude.m4 (LIBGFOR_GTHREAD_WEAK): Remove. (LIBGFOR_CHECK_WEAKREF): New test. * configure.ac: Call LIBGFOR_CHECK_WEAKREF instead of LIBGFOR_GTHREAD_WEAK. * config.h.in: Regenerate. * configure: Regenerate. * intrinsics/system_clock.c: Use SUPPORTS_WEAKREF instead of SUPPORTS_WEAK and GTHREAD_USE_WEAK.
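For readers unfamiliar with weakref, here is a minimal sketch of the kind of construct a link test for it has to exercise. It is an illustration, not the contents of the patch: toy_optional_hook is a hypothetical symbol that nothing defines, and the code assumes a GCC-compatible compiler on an ELF target such as x86_64 Linux. As the message notes, Darwin does not support this, so the sketch is not expected to link there.

```cpp
#include <cassert>

// Weak reference to a symbol that may or may not be present at link time.
// "toy_optional_hook" is a made-up name; no object file defines it here.
static int toy_hook_ref(void) __attribute__((weakref("toy_optional_hook")));

// Calls the hook if the linker resolved it, otherwise returns the fallback.
int call_hook_or(int fallback) {
  return toy_hook_ref ? toy_hook_ref() : fallback;
}
```

Because the hook is left undefined, the reference resolves to a null address and call_hook_or returns its fallback. Whether this links at all is exactly what a configure-time link check would find out.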
Re: [PATCH 2/5] completely_scalarize arrays as well as records
On 28 August 2015 at 09:48, Richard Biener wrote: > On Fri, 28 Aug 2015, Christophe Lyon wrote: > >> On 27 August 2015 at 17:43, Alan Lawrence wrote: >> > Martin Jambor wrote: >> >> >> >> First, I would be much >> >> happier if you added a proper comment to scalarize_elem function which >> >> you forgot completely. The name is not very descriptive and it has >> >> quite few parameters too. >> >> >> >> Second, this patch should also fix PR 67283. It would be great if you >> >> could verify that and add it to the changelog when committing if that >> >> is indeed the case. >> > >> > Thanks for pointing both of those out. I've added a comment to >> > scalarize_elem, >> > deleted the bogus comment in the new test, and yes I can confirm that the >> > patch >> > fixes PR 67283 on x86_64, and also AArch64 if >> > --param sra-max-scalarization-size-Ospeed is passed. (I've not added any >> > testcase specifically taken from that PR, however.) >> > >> > Pushed as r277265. >> >> Actually, is r227265. >> >> Since since commit I've noticed that >> g++.dg/torture/pr64312.C >> fails at -O1 in my config, saying "virtual memory exhaustion" (arm* targets) >> I run my validations under ulimit -v 10GB, which seems already large enough. >> >> Do we consider this a bug? > > Sure we do. You have to investigate this (I guess we run into some > endless looping/recursing that eats memory somewhere). > I asked because I assumed that Alan saw it pass in his configuration. > Richard.
Re: [PATCH 2/5] completely_scalarize arrays as well as records
On Fri, 28 Aug 2015, Christophe Lyon wrote: > On 28 August 2015 at 09:48, Richard Biener wrote: > > On Fri, 28 Aug 2015, Christophe Lyon wrote: > > > >> On 27 August 2015 at 17:43, Alan Lawrence wrote: > >> > Martin Jambor wrote: > >> >> > >> >> First, I would be much > >> >> happier if you added a proper comment to scalarize_elem function which > >> >> you forgot completely. The name is not very descriptive and it has > >> >> quite few parameters too. > >> >> > >> >> Second, this patch should also fix PR 67283. It would be great if you > >> >> could verify that and add it to the changelog when committing if that > >> >> is indeed the case. > >> > > >> > Thanks for pointing both of those out. I've added a comment to > >> > scalarize_elem, > >> > deleted the bogus comment in the new test, and yes I can confirm that > >> > the patch > >> > fixes PR 67283 on x86_64, and also AArch64 if > >> > --param sra-max-scalarization-size-Ospeed is passed. (I've not added any > >> > testcase specifically taken from that PR, however.) > >> > > >> > Pushed as r277265. > >> > >> Actually, is r227265. > >> > >> Since since commit I've noticed that > >> g++.dg/torture/pr64312.C > >> fails at -O1 in my config, saying "virtual memory exhaustion" (arm* > >> targets) > >> I run my validations under ulimit -v 10GB, which seems already large > >> enough. > >> > >> Do we consider this a bug? > > > > Sure we do. You have to investigate this (I guess we run into some > > endless looping/recursing that eats memory somewhere). > > > > I asked because I assumed that Alan saw it pass in his configuration. Well, it should still be investigated - whether you caused it or not ;) It's a bug. Richard.
[PATCH] Fix LTO early-debug ICE wrt abstract origin node streaming
The following fixes

Program received signal SIGSEGV, Segmentation fault.
0x00af8edd in lto_get_decl_name_mapping (decl_data=0x0,
    name=0x768dcbd0 "bar")
    at /space/rguenther/src/svn/trunk/gcc/lto-section-in.c:349
349 htab_t renaming_hash_table = decl_data->renaming_hash_table;
(gdb) bt
#0 0x00af8edd in lto_get_decl_name_mapping (decl_data=0x0,
    name=0x768dcbd0 "bar")
    at /space/rguenther/src/svn/trunk/gcc/lto-section-in.c:349
#1 0x00af41b9 in copy_function_or_variable (node=0x76abfb80)
    at /space/rguenther/src/svn/trunk/gcc/lto-streamer-out.c:2233
#2 0x00af4701 in lto_output ()
    at /space/rguenther/src/svn/trunk/gcc/lto-streamer-out.c:2328
#3 0x00b6d909 in write_lto ()
    at /space/rguenther/src/svn/trunk/gcc/passes.c:2411
#4 0x00b6e062 in ipa_write_optimization_summaries (encoder=0x2100ab0)
    at /space/rguenther/src/svn/trunk/gcc/passes.c:2615

which I see quite often because I now stream DECL_ABSTRACT_ORIGIN from the early compile. Often the abstract origins have their bodies removed as unreachable, but the LTRANS boundary computation happily re-creates the cgraph nodes and tries to stream the bodies, which obviously fails.

Below are two possible fixes (and in fact I believe that with early LTO debug we do _not_ need to put abstract origins into the LTRANS boundary at all - the abstract instance is available from the early debug via the abstract origin decl).

Any preference? I'm LTO bootstrapping the first one because it looks bogus to use get_create here (did we add this for dwarf2out ICEs with LTO?)

Maybe we want the cgraph_node::remove hunk as well and assert in cgraph_node::create that we are not creating cgraph nodes for DECL_ABSTRACT_P decls...

Thanks,
Richard.

2015-08-28 Richard Biener

* lto-cgraph.c (compute_ltrans_boundary): Only put abstract origin nodes into the ltrans boundary if their body is still available.
Index: trunk/gcc/lto-cgraph.c === --- trunk.orig/gcc/lto-cgraph.c 2015-08-13 13:14:09.116378573 +0200 +++ trunk/gcc/lto-cgraph.c 2015-08-28 10:24:05.130397851 +0200 @@ -899,9 +899,12 @@ compute_ltrans_boundary (lto_symtab_enco if (DECL_ABSTRACT_ORIGIN (node->decl)) { struct cgraph_node *origin_node - = cgraph_node::get_create (DECL_ABSTRACT_ORIGIN (node->decl)); - origin_node->used_as_abstract_origin = true; - add_node_to (encoder, origin_node, true); + = cgraph_node::get (DECL_ABSTRACT_ORIGIN (node->decl)); + if (origin_node) + { + origin_node->used_as_abstract_origin = true; + add_node_to (encoder, origin_node, true); + } } } for (lsei = lsei_start_variable_in_partition (in_encoder); 2015-08-28 Richard Biener * cgraph.c (cgraph_node::remove): If the node was used as abstract origin mark the decl as abstract. * lto-cgraph.c (compute_ltrans_boundary): Do not put abstract nodes in the ltrans boundary. Index: trunk/gcc/cgraph.c === --- trunk.orig/gcc/cgraph.c 2015-08-13 13:13:56.942263721 +0200 +++ trunk/gcc/cgraph.c 2015-08-28 10:03:06.841844184 +0200 @@ -1840,6 +1840,9 @@ cgraph_node::remove (void) lto_file_data = NULL; } + if (used_as_abstract_origin) +DECL_ABSTRACT_P (decl) = 1; + decl = NULL; if (call_site_hash) { Index: trunk/gcc/lto-cgraph.c === --- trunk.orig/gcc/lto-cgraph.c 2015-08-13 13:14:09.116378573 +0200 +++ trunk/gcc/lto-cgraph.c 2015-08-28 10:12:13.944867635 +0200 @@ -896,10 +896,12 @@ compute_ltrans_boundary (lto_symtab_enco lto_set_symtab_encoder_in_partition (encoder, node); create_references (encoder, node); /* For proper debug info, we need to ship the origins, too. */ - if (DECL_ABSTRACT_ORIGIN (node->decl)) + if (DECL_ABSTRACT_ORIGIN (node->decl) + /* ??? But we might have removed it! */ + && ! 
DECL_ABSTRACT_P (DECL_ABSTRACT_ORIGIN (node->decl))) { struct cgraph_node *origin_node - = cgraph_node::get_create (DECL_ABSTRACT_ORIGIN (node->decl)); + = cgraph_node::get (DECL_ABSTRACT_ORIGIN (node->decl)); origin_node->used_as_abstract_origin = true; add_node_to (encoder, origin_node, true); }
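The difference between a get_create-style lookup and a plain get, which both proposed fixes rely on, has a close analogue in the standard library. The sketch below is only that analogy: ToyNode, ToySymtab and the toy_* functions are hypothetical, and a std::map stands in for the symbol table. A lookup that creates on miss quietly resurrects a removed node, while a lookup that returns null lets the caller skip it, as the first fix does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy symbol table; not GCC's cgraph.
struct ToyNode {
  bool used_as_abstract_origin = false;
};
using ToySymtab = std::map<std::string, ToyNode>;

// Analogue of a get_create lookup: a missing node is silently (re)created.
ToyNode &toy_get_create(ToySymtab &tab, const std::string &name) {
  return tab[name];
}

// Analogue of a plain get: a missing node yields nullptr.
ToyNode *toy_get(ToySymtab &tab, const std::string &name) {
  auto it = tab.find(name);
  return it == tab.end() ? nullptr : &it->second;
}

// Shape of the first proposed fix: only touch the origin if it still exists.
bool toy_add_origin_to_boundary(ToySymtab &tab, const std::string &origin) {
  if (ToyNode *node = toy_get(tab, origin)) {
    node->used_as_abstract_origin = true;
    return true;
  }
  return false;
}
```

The second variant in the message calls the plain get without a null check, so in this analogy it would depend on the preceding DECL_ABSTRACT_P test to filter out removed nodes.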
[COMMITTED][AArch64] Rename SYMBOL_SMALL_GOTTPREL to SYMBOL_SMALL_TLSIE
SYMBOL_SMALL_GOTTPREL is for the TLS IE model, but it is the only symbol name that does not follow the naming convention SYMBOL_[code model]_TLS[tls model]. This patch fixes that. Committed as obvious. 2015-08-28 Jiong Wang gcc/ * config/aarch64/aarch64-protos.h (aarch64_symbol_context): Rename SYMBOL_SMALL_GOTTPREL to SYMBOL_SMALL_TLSIE. (aarch64_symbol_type): Likewise. * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Likewise. (aarch64_expand_mov_immediate): Likewise. (aarch64_print_operand): Likewise. (aarch64_classify_tls_symbol): Likewise. -- Regards, Jiong Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 227293) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,14 @@ +2015-08-28 Jiong Wang + + * config/aarch64/aarch64-protos.h (aarch64_symbol_context): Rename + SYMBOL_SMALL_GOTTPREL to SYMBOL_SMALL_TLSIE. + (aarch64_symbol_type): Likewise. + * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): + Likewise. + (aarch64_expand_mov_immediate): Likewise. + (aarch64_print_operand): Likewise. + (aarch64_classify_tls_symbol): Likewise. 
+ 2015-08-28 Richard Biener * cgraphunit.c (symbol_table::compile): Move early debug generation Index: gcc/config/aarch64/aarch64-protos.h === --- gcc/config/aarch64/aarch64-protos.h (revision 227293) +++ gcc/config/aarch64/aarch64-protos.h (working copy) @@ -73,7 +73,7 @@ SYMBOL_SMALL_TLSGD SYMBOL_SMALL_TLSDESC - SYMBOL_SMALL_GOTTPREL + SYMBOL_SMALL_TLSIE SYMBOL_TINY_TLSIE SYMBOL_TLSLE12 SYMBOL_TLSLE24 @@ -112,7 +112,7 @@ SYMBOL_SMALL_GOT_4G, SYMBOL_SMALL_TLSGD, SYMBOL_SMALL_TLSDESC, - SYMBOL_SMALL_GOTTPREL, + SYMBOL_SMALL_TLSIE, SYMBOL_TINY_ABSOLUTE, SYMBOL_TINY_GOT, SYMBOL_TINY_TLSIE, Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c (revision 227293) +++ gcc/config/aarch64/aarch64.c (working copy) @@ -1103,7 +1103,7 @@ return; } -case SYMBOL_SMALL_GOTTPREL: +case SYMBOL_SMALL_TLSIE: { /* In ILP32, the mode of dest can be either SImode or DImode, while the got entry is always of SImode size. The mode of @@ -1737,7 +1737,7 @@ case SYMBOL_SMALL_TLSGD: case SYMBOL_SMALL_TLSDESC: -case SYMBOL_SMALL_GOTTPREL: + case SYMBOL_SMALL_TLSIE: case SYMBOL_SMALL_GOT_28K: case SYMBOL_SMALL_GOT_4G: case SYMBOL_TINY_GOT: @@ -4623,7 +4623,7 @@ asm_fprintf (asm_out_file, ":tlsdesc:"); break; - case SYMBOL_SMALL_GOTTPREL: + case SYMBOL_SMALL_TLSIE: asm_fprintf (asm_out_file, ":gottprel:"); break; @@ -4656,7 +4656,7 @@ asm_fprintf (asm_out_file, ":tlsdesc_lo12:"); break; - case SYMBOL_SMALL_GOTTPREL: + case SYMBOL_SMALL_TLSIE: asm_fprintf (asm_out_file, ":gottprel_lo12:"); break; @@ -8787,7 +8787,7 @@ case AARCH64_CMODEL_TINY_PIC: return SYMBOL_TINY_TLSIE; default: - return SYMBOL_SMALL_GOTTPREL; + return SYMBOL_SMALL_TLSIE; } case TLS_MODEL_LOCAL_EXEC:
Re: [PATCH] Fix LTO early-debug ICE wrt abstract origin node streaming
On Fri, 28 Aug 2015, Richard Biener wrote: > > The following fixes > > Program received signal SIGSEGV, Segmentation fault. > 0x00af8edd in lto_get_decl_name_mapping (decl_data=0x0, > name=0x768dcbd0 "bar") > at /space/rguenther/src/svn/trunk/gcc/lto-section-in.c:349 > 349 htab_t renaming_hash_table = decl_data->renaming_hash_table; > (gdb) bt > #0 0x00af8edd in lto_get_decl_name_mapping (decl_data=0x0, > name=0x768dcbd0 "bar") > at /space/rguenther/src/svn/trunk/gcc/lto-section-in.c:349 > #1 0x00af41b9 in copy_function_or_variable (node=0x76abfb80) > at /space/rguenther/src/svn/trunk/gcc/lto-streamer-out.c:2233 > #2 0x00af4701 in lto_output () > at /space/rguenther/src/svn/trunk/gcc/lto-streamer-out.c:2328 > #3 0x00b6d909 in write_lto () > at /space/rguenther/src/svn/trunk/gcc/passes.c:2411 > #4 0x00b6e062 in ipa_write_optimization_summaries > (encoder=0x2100ab0) > at /space/rguenther/src/svn/trunk/gcc/passes.c:2615 > > I see quite often because I now stream DECL_ABSTRACT_ORIGIN from > the early compile. Often the abstract origins have their bodies > removed as unreachable but LTRANS boundary compute happily > re-creates the cgraph nodes and tries to stream the bodies. Which > obviously fails. > > Below are two possible fixes (and in fact I believe that with > early LTO debug we do _not_ need to put abstract origins into the > LTRANS boundary at all - the abstract instance is available from > the early debug via the abstract origin decl). > > Any preference? I'm LTO bootstrapping the first one because it > looks bogus to use get_create here (did we add this for dwarf2out > ICEs with LTO?) 
Hum, indeed the first patch fails LTO bootstrap with

/space/rguenther/src/svn/trunk/gcc/genhooks.c:58:1: internal compiler error: in gen_inlined_subroutine_die, at dwarf2out.c:19946
 }
 ^
0x85464e gen_inlined_subroutine_die
    /space/rguenther/src/svn/trunk/gcc/dwarf2out.c:19946
0x854bc5 gen_block_die
    /space/rguenther/src/svn/trunk/gcc/dwarf2out.c:21002

  /* Make sure any inlined functions are known to be inlineable. */
  gcc_checking_assert (DECL_ABSTRACT_P (decl)
                       || cgraph_function_possibly_inlined_p (decl));

eventually solved by the DECL_ABSTRACT_P setting below. Trying... (though I don't think we set used_as_abstract_origin when just inlining).

With that it fails in check_die.

For early LTO debug I'm simply trying to kill the code streaming abstract origins into the LTRANS boundary. We're going to hit the above assert anyway though. Hmm, actually we don't. For

int a;
int __attribute__((noinline)) baz (int y) { return y; }
static int foo (int x) { return baz (x + 1); }
int main () { return foo (a); }

we get with -O -flto -g and early LTO debug

(gdb) start
Temporary breakpoint 1, main () at t.c:4
4 int main () { return foo (a); }
(gdb) si
0x004004ee in foo (x=, x=) at t.c:4
4 int main () { return foo (a); }
(gdb) si
baz (y=y@entry=1) at t.c:2
2 int __attribute__((noinline)) baz (int y) { return y; }
(gdb) fin
Run till exit from #0 baz (y=y@entry=1) at t.c:2
main (argc=, argv=) at t.c:4
4 int main (int argc, char **argv) { return foo (a); }
Value returned is $3 = 1

odd that gdb prints 'x' twice here.
We have

 <2><200>: Abbrev Number: 4 (DW_TAG_inlined_subroutine)
    <201>   DW_AT_abstract_origin: <0x235>
    <205>   DW_AT_low_pc      : 0x4004e9
    <20d>   DW_AT_high_pc     : 0xa
    <215>   DW_AT_call_file   : 1
    <216>   DW_AT_call_line   : 4
 <3><217>: Abbrev Number: 5 (DW_TAG_formal_parameter)
    <218>   DW_AT_abstract_origin: <0x23f>
    <21c>   DW_AT_location    : 0x0 (location list)

 <1><235>: Abbrev Number: 8 (DW_TAG_subprogram)
    <236>   DW_AT_abstract_origin: <0x18f>
    <23a>   DW_AT_inline      : 1 (inlined)
    <23b>   DW_AT_sibling     : <0x245>
 <2><23f>: Abbrev Number: 9 (DW_TAG_formal_parameter)
    <240>   DW_AT_abstract_origin: <0x19e>

 <1><18f>: Abbrev Number: 5 (DW_TAG_subprogram)
    <190>   DW_AT_name        : foo
    <194>   DW_AT_decl_file   : 1
    <195>   DW_AT_decl_line   : 3
    <196>   DW_AT_prototyped  : 1
    <196>   DW_AT_type        : <0x17d>
    <19a>   DW_AT_sibling     : <0x1a8>
 <2><19e>: Abbrev Number: 6 (DW_TAG_formal_parameter)
    <19f>   DW_AT_name        : x
    <1a1>   DW_AT_decl_file   : 1
    <1a2>   DW_AT_decl_line   : 3
    <1a3>   DW_AT_type        : <0x17d>

I would have expected either one or three 'x' ... (if the DIE children of each abstract origin add up). Maybe gdb just doesn't expect the two-level indirection (yeah - I can remove that for the extra cost of referring to the early CU directly).

Well. At least it doesn't seem to ICE on the assert even though 'foo' is gone. Ah, cgraph_function_possibly_inlined_p simply returns true... for whatever reason it doesn't do so with trunk.

Anyway, not pursuing anything on trunk yet but only with early-LTO-debu
Re: [Patch] Add to the libgfortran/newlib bodge to "detect" ftruncate support in ARM/AArch64/SH
On Tue, Aug 25, 2015 at 03:44:05PM +0100, FX wrote: > > 2015-08-25 James Greenhalgh > > > > * configure.ac: Auto-detect newlib function support unless we > > know there are issues when configuring for a host. > > * configure: Regenerate. > > Thanks for CC’ing the fortran list. > > Given that this is newlib-specific code, even though it’s in libgfortran > configury, you should decide and commit what’s best. I don’t think we have > any newlib expert in the Fortran maintainers. > > Wait for 48 hours to see if anyone else objects, though. OK, it has been 48 hours and I haven't seen any objections. The newlib patch has now been committed. I agree with Marcus' suggestion that we put the more comprehensive patch (which requires the newlib fix) on trunk and my original patch (which does not) on the release branches. I'll go ahead with that later today. Thanks, James
Remove redundant use of REG_CLASS_NAMES macros
Hi. This patch removes the static reg_class_names array from the print_translated_classes and print_unform_and_important_classes functions. The global reg_class_names array is used instead. Bootstrapped and reg-tested on x86_64-unknown-linux-gnu and powerpc64le-unknown-linux-gnu. OK for trunk? 2015-08-21 Anatoly Sokolov * ira.c (print_unform_and_important_classes, print_translated_classes): Remove reg_class_names static array. Index: gcc/ira.c === --- gcc/ira.c (revision 226990) +++ gcc/ira.c (working copy) @@ -1378,7 +1378,6 @@ static void print_unform_and_important_classes (FILE *f) { - static const char *const reg_class_names[] = REG_CLASS_NAMES; int i, cl; fprintf (f, "Uniform classes:\n"); @@ -1403,7 +1402,6 @@ enum reg_class *class_translate = (pressure_p ? ira_pressure_class_translate : ira_allocno_class_translate); - static const char *const reg_class_names[] = REG_CLASS_NAMES; int i; fprintf (f, "%s classes:\n", pressure_p ? "Pressure" : "Allocno"); Anatoly Sokolov.
Re: [PATCH 2/5] completely_scalarize arrays as well as records
Christophe Lyon wrote: > I asked because I assumed that Alan saw it pass in his configuration. Bah. No - I have now discovered a problem in my C++ testsuite setup that was causing a large number of tests not to be executed. I see the problem too now; investigating. --Alan
Remove redundant test for global_regs
Hi. fixed_reg_set contains all fixed and global registers. This patch replaces the test "fixed_regs[r] || global_regs[r]" with "TEST_HARD_REG_BIT (fixed_reg_set, r)". Bootstrapped and reg-tested on x86_64-unknown-linux-gnu and powerpc64le-unknown-linux-gnu. OK for trunk? 2015-08-24 Anatoly Sokolov * cse.c (FIXED_REGNO_P): Don't check global_regs. Use fixed_reg_set instead of fixed_regs. * df-problems.c (can_move_insns_across): Ditto. * postreload.c (reload_combine_recognize_pattern): Ditto. * recog.c (peep2_find_free_register): Ditto. * regcprop.c (copy_value): Ditto. * regrename.c (check_new_reg_p, rename_chains): Ditto. * sel-sched.c (init_regs_for_mode, mark_unavailable_hard_regs): Ditto. Index: gcc/cse.c === --- gcc/cse.c (revision 226953) +++ gcc/cse.c (working copy) @@ -463,7 +463,7 @@ A reg wins if it is either the frame pointer or designated as fixed. */ #define FIXED_REGNO_P(N) \ ((N) == FRAME_POINTER_REGNUM || (N) == HARD_FRAME_POINTER_REGNUM \ - || fixed_regs[N] || global_regs[N]) + || TEST_HARD_REG_BIT (fixed_reg_set, N)) /* Compute cost of X, as stored in the `cost' field of a table_elt. Fixed hard registers and pointers into the frame are the cheapest with a cost Index: gcc/df-problems.c === --- gcc/df-problems.c (revision 226953) +++ gcc/df-problems.c (working copy) @@ -3871,8 +3871,7 @@ EXECUTE_IF_SET_IN_BITMAP (merge_set, 0, i, bi) { if (i < FIRST_PSEUDO_REGISTER - && ! fixed_regs[i] - && ! global_regs[i]) + && ! 
TEST_HARD_REG_BIT (fixed_reg_set, i)) { fail = 1; break; Index: gcc/postreload.c === --- gcc/postreload.c(revision 226953) +++ gcc/postreload.c(working copy) @@ -1144,7 +1144,7 @@ && reg_state[i].store_ruid <= reg_state[regno].use_ruid && (call_used_regs[i] || df_regs_ever_live_p (i)) && (!frame_pointer_needed || i != HARD_FRAME_POINTER_REGNUM) - && !fixed_regs[i] && !global_regs[i] + && !TEST_HARD_REG_BIT (fixed_reg_set, i) && hard_regno_nregs[i][GET_MODE (reg)] == 1 && targetm.hard_regno_scratch_ok (i)) { Index: gcc/recog.c === --- gcc/recog.c (revision 226953) +++ gcc/recog.c (working copy) @@ -3162,17 +3162,11 @@ for (j = 0; success && j < hard_regno_nregs[regno][mode]; j++) { /* Don't allocate fixed registers. */ - if (fixed_regs[regno + j]) + if (TEST_HARD_REG_BIT (fixed_reg_set, regno + j)) { success = 0; break; } - /* Don't allocate global registers. */ - if (global_regs[regno + j]) - { - success = 0; - break; - } /* Make sure the register is of the right class. */ if (! TEST_HARD_REG_BIT (reg_class_contents[cl], regno + j)) { Index: gcc/regcprop.c === --- gcc/regcprop.c (revision 226953) +++ gcc/regcprop.c (working copy) @@ -315,7 +315,7 @@ /* Do not propagate copies to fixed or global registers, patterns can be relying to see particular fixed register or users can expect the chosen global register in asm. */ - if (fixed_regs[dr] || global_regs[dr]) + if (TEST_HARD_REG_BIT (fixed_reg_set, dr)) return; /* If SRC and DEST overlap, don't record anything. */ Index: gcc/regrename.c === --- gcc/regrename.c (revision 226953) +++ gcc/regrename.c (working copy) @@ -311,8 +311,7 @@ for (i = nregs - 1; i >= 0; --i) if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i) - || fixed_regs[new_reg + i] - || global_regs[new_reg + i] + || TEST_HARD_REG_BIT (fixed_reg_set, new_reg + i) /* Can't use regs which aren't saved by the prologue. */ || (! df_regs_ever_live_p (new_reg + i) && ! 
call_used_regs[new_reg + i]) @@ -440,7 +439,7 @@ if (this_head->cannot_rename) continue; - if (fixed_regs[reg] || global_regs[reg] + if (TEST_HARD_REG_BIT (fixed_reg_set, reg) || (!HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed && reg == HARD_FRAME_POINTER_REGNUM) || (HARD_FRAME_POINTER_REGNUM && frame_pointer_needed Index: gcc/sel-sched.c === --- gcc/sel-sched.c (revision 226953) +++ gcc/sel-sched.c (working copy) @@ -1089,8 +1089,7 @@ nregs = hard_regno_nr
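The substitution this patch makes throughout can be checked on a toy model. The sketch below assumes, as the message states, that the set is the union of the fixed and global registers. kToyRegs, ToyRegInfo and the toy_* functions are hypothetical names, and a std::bitset stands in for GCC's HARD_REG_SET.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kToyRegs = 8;  // toy register count, not a GCC value

struct ToyRegInfo {
  bool fixed_regs[kToyRegs];
  bool global_regs[kToyRegs];
  std::bitset<kToyRegs> fixed_reg_set;  // assumed union of the two arrays
};

// Builds the toy register info and precomputes the union once.
ToyRegInfo make_toy_reg_info(const bool (&fixed)[kToyRegs],
                             const bool (&global)[kToyRegs]) {
  ToyRegInfo info{};
  for (std::size_t r = 0; r < kToyRegs; ++r) {
    info.fixed_regs[r] = fixed[r];
    info.global_regs[r] = global[r];
    if (fixed[r] || global[r]) info.fixed_reg_set.set(r);
  }
  return info;
}

// Old form of the test: two array lookups.
bool toy_old_check(const ToyRegInfo &info, std::size_t r) {
  return info.fixed_regs[r] || info.global_regs[r];
}

// New form of the test: one bit test on the precomputed set.
bool toy_new_check(const ToyRegInfo &info, std::size_t r) {
  return info.fixed_reg_set.test(r);
}

// True if both forms agree for every register.
bool toy_checks_agree(const ToyRegInfo &info) {
  for (std::size_t r = 0; r < kToyRegs; ++r)
    if (toy_old_check(info, r) != toy_new_check(info, r)) return false;
  return true;
}
```

The two forms agree by construction here. In GCC itself the equivalence depends on the set being kept in sync with both arrays, which is the premise of the patch.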
[PATCH COMMITTED] MAINTAINERS (Write After Approval): Add myself.
FYI. ChangeLog: 2015-08-28 David Sherwood * MAINTAINERS: Add myself.
Re: C++ delayed folding branch review
2015-08-28 9:19 GMT+02:00 Kai Tietz : > 2015-08-28 4:11 GMT+02:00 Jason Merrill : >> On 08/27/2015 02:12 PM, Kai Tietz wrote: >>> >>> + else if (TREE_CODE (type) == VECTOR_TYPE) >>> +{ >>> + if (TREE_CODE (arg1) == VECTOR_CST >>> + && code == NOP_EXPR >>> + && TYPE_VECTOR_SUBPARTS (type) == VECTOR_CST_NELTS (arg1)) >>> + { >>> + tree r = copy_node (arg1); >>> + TREE_TYPE (arg1) = type; >>> + return r; >>> + } >>> +} >> >> >> I would drop the check on 'code' and add a check that >> >> TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (TREE_TYPE (arg1)) >> >> Does that still pass? > > Yes, is still passes. To check here for main-variant seems to be more > robust. I commit it to branch, and will do complete > regression-testing for it. Completed regression-testing. No new regressions. Kai
Re: Move remaining flag_unsafe_math_optimizations using simplify and match
On Fri, Aug 28, 2015 at 7:33 AM, Marc Glisse wrote:
> On Thu, 27 Aug 2015, Andreas Schwab wrote:
>
>> "Hurugalawadi, Naveen" writes:
>>
>>> * fold-const.c (fold_binary_loc) : Move Optimize
>>> root(x)*root(y) as root(x*y) to match.pd.
>>> Move Optimize expN(x)*expN(y) as expN(x+y) to match.pd.
>>> Move Optimize pow(x,y)*pow(x,z) as pow(x,y+z) to match.pd.
>>> Move Optimize a/root(b/c) into a*root(c/b) to match.pd.
>>> Move Optimize x/expN(y) into x*expN(-y) to match.pd.
>>>
>>> * match.pd (mult (root:s @0) (root:s @1)): New simplifier.
>>> (mult (POW:s @0 @1) (POW:s @0 @2)) : New simplifier.
>>> (mult (exps:s @0) (exps:s @1)) : New simplifier.
>>> (rdiv @0 (root:s (rdiv:s @1 @2))) : New simplifier.
>>> (rdiv @0 (exps:s @1)) : New simplifier.
>>
>>
>> FAIL: gcc.dg/builtins-11.c (test for excess errors)
>> Excess errors:
>> builtins-11.c:(.text+0x52): undefined reference to `link_error'
>
>
> Indeed, generic-match ends up testing for sqrt(x)*sqrt(y) before testing for
> sqrt(x)@1*@1, so it simplifies it to sqrt(x*x)->abs(x) instead of plain x.
> Changing the genmatch machinery to better respect the order of patterns in
> match.pd might not be trivial without sacrificing performance, a workaround
> might be to add a special pattern sqrt(x)*sqrt(x), or to disable some
> patterns for GENERIC so CSE has a chance to run first.

Hum, it _does_ respect ordering (well, it is supposed to). But indeed
I can see it does not. Bah.

Mind opening a bugreport for this?

Thanks,
Richard.

> --
> Marc Glisse
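To make the difference between the two results concrete, here is a minimal sketch. It is an illustration only, not code from match.pd or from builtins-11.c, and all three function names are made up. It models the result of each rewrite as a plain function and shows that they agree for non-negative x but differ for negative x, which is presumably why the order in which the patterns are tried matters.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins; none of these names exist in GCC.

// Result when the generic root(x)*root(y) -> root(x*y) rule fires first:
// sqrt(x*x), which then simplifies to abs(x).
double via_product_rule(double x) { return std::fabs(x); }

// Result when the specific sqrt(x)*sqrt(x) -> x rule fires: plain x.
double via_square_rule(double x) { return x; }

// The original expression, evaluated without any simplification.
double unsimplified(double x) { return std::sqrt(x) * std::sqrt(x); }
```

For x = 4.0 all three agree. For x = -4.0 the unsimplified form is NaN, the product rule gives 4.0 and the square rule gives -4.0.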
[PATCH] Fix c++/67371 (issues with throw in constexpr)
As PR67371 shows, gcc currently rejects all throw statements in
constant-expressions, even when they are never executed.

Fix by simply allowing THROW_EXPR in potential_constant_expression_1.

One drawback is that we now accept some ill-formed cases, but they
fall under the "no diagnostic required" rule in the standard, e.g.:

constexpr int f1() {
  throw;
  return 0;
}

or

constexpr void f2() {
  throw;
}

Tested on ppc64le.
OK for trunk?

Thanks.

	PR c++/67371
	* constexpr.c (potential_constant_expression_1): Allow THROW_EXPR.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 1eacb8be9a44..34c503ab2bc4 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -4043,6 +4043,7 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict,
     case BREAK_STMT:
     case CONTINUE_STMT:
     case REQUIRES_EXPR:
+    case THROW_EXPR:
       return true;
 
     case AGGR_INIT_EXPR:
@@ -4324,7 +4325,6 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict,
     case VEC_NEW_EXPR:
     case DELETE_EXPR:
     case VEC_DELETE_EXPR:
-    case THROW_EXPR:
     case OMP_ATOMIC:
     case OMP_ATOMIC_READ:
     case OMP_ATOMIC_CAPTURE_OLD:
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-throw.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-throw.C
new file mode 100644
index ..09a3e618f8a2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-throw.C
@@ -0,0 +1,17 @@
+// { dg-do compile { target c++14 } }
+constexpr void f() {
+  if (false)
+    throw;
+}
+
+constexpr int fun(int n) {
+  switch (n) {
+  case 0:
+    return 1;
+  default:
+    throw; // { dg-error "not a constant-expression" }
+  }
+}
+
+static_assert(fun(0), "");
+static_assert(fun(1), ""); // { dg-error "non-constant condition" }
--
Markus
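As a further illustration of the kind of code this is meant to allow, here is a small sketch that is not part of the patch or its testcase. The helper name checked_index is hypothetical. It assumes a C++14 compiler that accepts a throw statement on a path that is never evaluated during constant evaluation.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: returns i when it is a valid index, throws otherwise.
constexpr int checked_index(int i, int size) {
  if (i < 0 || i >= size)
    throw std::out_of_range("index out of range");  // not reached for valid input
  return i;
}

// Valid arguments never reach the throw, so this call is a constant expression.
static_assert(checked_index(2, 4) == 2, "usable at compile time");
```

Calling checked_index(7, 4) in a context that requires a constant expression would still be rejected, while the same call at run time throws std::out_of_range.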
Re: Remove redundant use of REG_CLASS_NAMES macros
On 08/28/2015 06:07 AM, Anatoliy Sokolov wrote:
> Hi.
>
> This patch removes the static reg_class_names array from the
> print_translated_classes and print_unform_and_important_classes functions.
> The global reg_class_names array is used instead.
>
> Bootstrapped and reg-tested on x86_64-unknown-linux-gnu and
> powerpc64le-unknown-linux-gnu.
>
> OK for trunk?
>
> 2015-08-21  Anatoly Sokolov
>
> 	* ira.c (print_unform_and_important_classes,
> 	print_translated_classes): Remove reg_class_names static array.
>
> Index: gcc/ira.c
> ===
> --- gcc/ira.c	(revision 226990)
> +++ gcc/ira.c	(working copy)
> @@ -1378,7 +1378,6 @@
>  static void
>  print_unform_and_important_classes (FILE *f)
>  {
> -  static const char *const reg_class_names[] = REG_CLASS_NAMES;
>    int i, cl;
>
>    fprintf (f, "Uniform classes:\n");
> @@ -1403,7 +1402,6 @@
>    enum reg_class *class_translate = (pressure_p
>                                       ? ira_pressure_class_translate
>                                       : ira_allocno_class_translate);
> -  static const char *const reg_class_names[] = REG_CLASS_NAMES;
>    int i;
>
>    fprintf (f, "%s classes:\n", pressure_p ? "Pressure" : "Allocno");

OK. Thanks for removing the artifacts.

I also see a typo. It should be uniform instead of unform in
print_unform_and_important_classes. Could you change it too?

Thanks again.
Re: [PATCH 2/5] completely_scalarize arrays as well as records
On Fri, 28 Aug 2015, Alan Lawrence wrote:

> Christophe Lyon wrote:
> >
> > I asked because I assumed that Alan saw it pass in his configuration.
>
>
> Bah. No - I now discover a problem in my C++ testsuite setup that was causing
> a large number of tests to not be executed. I see the problem too now,
> investigating

Btw, your patch broke Ada:

+===GNAT BUG DETECTED==+
| 6.0.0 20150828 (experimental) (x86_64-pc-linux-gnu) GCC error: |
| in completely_scalarize, at tree-sra.c:996 |
| Error detected around ../rts/a-coorse.ads:46:24 |

    case ARRAY_TYPE:
      {
        tree elemtype = TREE_TYPE (decl_type);
        tree elem_size = TYPE_SIZE (elemtype);
        gcc_assert (elem_size && tree_fits_uhwi_p (elem_size));
        int el_size = tree_to_uhwi (elem_size);
        gcc_assert (el_size);

        tree minidx = TYPE_MIN_VALUE (TYPE_DOMAIN (decl_type));
        tree maxidx = TYPE_MAX_VALUE (TYPE_DOMAIN (decl_type));
        gcc_assert (TREE_CODE (minidx) == INTEGER_CST
                    && TREE_CODE (maxidx) == INTEGER_CST);

obviously you missed VLAs. min/max value can also be NULL.

Richard.
RE: [PATCH] MIPS: Add the lo register to the clobber list in the madd-8.c and msub-8.c testcases
> Yes, this looks OK. Committed as SVN 227299. Regards, Andrew
Re: [PATCH 2/5] completely_scalarize arrays as well as records
Richard Biener wrote:
> On Fri, 28 Aug 2015, Alan Lawrence wrote:
>
>> Christophe Lyon wrote:
>>>
>>> I asked because I assumed that Alan saw it pass in his configuration.
>>
>> Bah. No - I now discover a problem in my C++ testsuite setup that was causing
>> a large number of tests to not be executed. I see the problem too now,
>> investigating
>
> Btw, your patch broke Ada:
>
> +===GNAT BUG DETECTED==+
> | 6.0.0 20150828 (experimental) (x86_64-pc-linux-gnu) GCC error: |
> | in completely_scalarize, at tree-sra.c:996 |
> | Error detected around ../rts/a-coorse.ads:46:24 |
>
>     case ARRAY_TYPE:
>       {
>         tree elemtype = TREE_TYPE (decl_type);
>         tree elem_size = TYPE_SIZE (elemtype);
>         gcc_assert (elem_size && tree_fits_uhwi_p (elem_size));
>         int el_size = tree_to_uhwi (elem_size);
>         gcc_assert (el_size);
>
>         tree minidx = TYPE_MIN_VALUE (TYPE_DOMAIN (decl_type));
>         tree maxidx = TYPE_MAX_VALUE (TYPE_DOMAIN (decl_type));
>         gcc_assert (TREE_CODE (minidx) == INTEGER_CST
>                     && TREE_CODE (maxidx) == INTEGER_CST);
>
> obviously you missed VLAs. min/max value can also be NULL.
>
> Richard.

Right. I think VLAs are the problem with pr64312.C also. I'm testing a fix
(that declares arrays with any of these properties as unscalarizable).

Monday is a bank holiday in UK and so I expect to get back to you on Tuesday.

--Alan
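The idea of rejecting such arrays up front, rather than asserting on their bounds, can be sketched with a toy model. This is not the actual fix and does not use GCC's tree API. The types BoundKind, Bound and ArrayDomain and the predicate domain_scalarizable_p are all hypothetical, standing in for a domain whose min/max may be missing (NULL) or not a compile-time constant (as with VLAs).

```cpp
#include <cassert>

// Toy model of an array index bound. In GCC the bounds are trees that may
// be NULL or non-constant; here each bound is one of three kinds.
enum class BoundKind { Missing, Runtime, Constant };

struct Bound {
  BoundKind kind;
  long value;  // meaningful only when kind == BoundKind::Constant
};

struct ArrayDomain {
  Bound min;
  Bound max;
};

// Hypothetical predicate: an array qualifies only when both bounds are
// known constants. Anything else is reported as unscalarizable instead of
// tripping an assertion later.
bool domain_scalarizable_p(const ArrayDomain &d) {
  return d.min.kind == BoundKind::Constant
         && d.max.kind == BoundKind::Constant;
}
```

With a check of this shape, a caller would skip total scalarization for VLA-like domains rather than reach the gcc_assert quoted above.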
[PATCH] Tidy tree-ssa-dom.c: Use dom_valueize more.
The code in the dom_valueize function is duplicated a number of times; so, call
the function.

Also remove a comment in lookup_avail_expr re const_and_copies, describing one
of said duplicates, that looks like it was superseded in r87787.

Bootstrapped + check-gcc on x86-none-linux-gnu.

gcc/ChangeLog:

	* tree-ssa-dom.c (record_equivalences_from_phis,
	record_equivalences_from_stmt, optimize_stmt): Use dom_valueize.
	(lookup_avail_expr): Likewise, and remove comment and unused temp.
---
 gcc/tree-ssa-dom.c | 31 ---
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index a7d42bc..5a9a8aa 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1577,12 +1577,7 @@ record_equivalences_from_phis (basic_block bb)
           if (lhs == t)
             continue;
 
-          /* Valueize t.  */
-          if (TREE_CODE (t) == SSA_NAME)
-            {
-              tree tmp = SSA_NAME_VALUE (t);
-              t = tmp ? tmp : t;
-            }
+          t = dom_valueize (t);
 
           /* If we have not processed an alternative yet, then set
              RHS to this alternative.  */
@@ -2160,12 +2155,7 @@ record_equivalences_from_stmt (gimple stmt, int may_optimize_p)
       && (TREE_CODE (rhs) == SSA_NAME
           || is_gimple_min_invariant (rhs)))
     {
-      /* Valueize rhs.  */
-      if (TREE_CODE (rhs) == SSA_NAME)
-        {
-          tree tmp = SSA_NAME_VALUE (rhs);
-          rhs = tmp ? tmp : rhs;
-        }
+      rhs = dom_valueize (rhs);
 
       if (dump_file && (dump_flags & TDF_DETAILS))
         {
@@ -2442,12 +2432,7 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator si)
           tree rhs = gimple_assign_rhs1 (stmt);
           tree cached_lhs;
           gassign *new_stmt;
-          if (TREE_CODE (rhs) == SSA_NAME)
-            {
-              tree tem = SSA_NAME_VALUE (rhs);
-              if (tem)
-                rhs = tem;
-            }
+          rhs = dom_valueize (rhs);
           /* Build a new statement with the RHS and LHS exchanged.  */
           if (TREE_CODE (rhs) == SSA_NAME)
             {
@@ -2569,7 +2554,6 @@ lookup_avail_expr (gimple stmt, bool insert)
 {
   expr_hash_elt **slot;
   tree lhs;
-  tree temp;
   struct expr_hash_elt element;
 
   /* Get LHS of phi, assignment, or call; else NULL_TREE.  */
@@ -2664,14 +2648,7 @@ lookup_avail_expr (gimple stmt, bool insert)
      definition of another variable.  */
   lhs = (*slot)->lhs;
 
-  /* See if the LHS appears in the CONST_AND_COPIES table.  If it does, then
-     use the value from the const_and_copies table.  */
-  if (TREE_CODE (lhs) == SSA_NAME)
-    {
-      temp = SSA_NAME_VALUE (lhs);
-      if (temp)
-        lhs = temp;
-    }
+  lhs = dom_valueize (lhs);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
--
1.8.3
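The pattern being factored out is small enough to model in isolation. The sketch below is a toy and not the real dom_valueize. Node and valueize are hypothetical names, with the fields standing in for TREE_CODE (t) == SSA_NAME and SSA_NAME_VALUE (t) as used in the removed snippets.

```cpp
#include <cassert>

// Toy stand-in for a GCC tree node; not the real data structure.
struct Node {
  bool is_ssa_name;  // models TREE_CODE (t) == SSA_NAME
  Node *value;       // models SSA_NAME_VALUE (t); may be null
};

// Models the duplicated logic: return the recorded value of an SSA name
// when there is one, otherwise return the operand unchanged.
Node *valueize(Node *t) {
  if (t->is_ssa_name && t->value)
    return t->value;
  return t;
}
```

Each of the removed blocks in the diff has this shape, which is why a single call can replace them without changing behaviour in this model.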
[gomp4.1] document more structures in libgomp.h
More boring patches in an effort to make sense of it all. Does this match your understanding? If it does, OK for branch? commit 3b7ffc815a8e163391c913196160354348348945 Author: Aldy Hernandez Date: Fri Aug 28 07:19:51 2015 -0700 * libgomp.h: Document gomp_task_depend_entry, gomp_task, gomp_taskgroup. *task.c (gomp_task_run_pre): Add comments. (gomp_task_run_post_handle_dependers): Same. (GOMP_taskwait): Same. diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index f855813..2357357 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -279,9 +279,11 @@ struct htab; struct gomp_task_depend_entry { + /* Address of dependency. */ void *addr; struct gomp_task_depend_entry *next; struct gomp_task_depend_entry *prev; + /* Task that provides the depdency in ADDR. */ struct gomp_task *task; /* Depend entry is of type "IN". */ bool is_in; @@ -312,10 +314,14 @@ struct gomp_taskwait struct gomp_task { struct gomp_task *parent; + /* Children of this task. Siblings are chained by + NEXT/PREV_CHILD fields below. */ struct gomp_task *children; struct gomp_task *next_child; struct gomp_task *prev_child; + /* Next task in TASK_QUEUE list in `struct gomp_team'. */ struct gomp_task *next_queue; + /* Prev task in TASK_QUEUE list in `struct gomp_team'. */ struct gomp_task *prev_queue; /* Next task in the current taskgroup. */ struct gomp_task *next_taskgroup; @@ -323,10 +329,13 @@ struct gomp_task struct gomp_task *prev_taskgroup; /* Taskgroup this task belongs in. */ struct gomp_taskgroup *taskgroup; + /* Tasks that depend on this task. */ struct gomp_dependers_vec *dependers; struct htab *depend_hash; struct gomp_taskwait *taskwait; + /* Number of items in DEPEND. */ size_t depend_count; + /* Number of tasks in the DEPENDERS field above. 
*/ size_t num_dependees; struct gomp_task_icv icv; void (*fn) (void *); @@ -335,13 +344,20 @@ struct gomp_task bool in_tied_task; bool final_task; bool copy_ctors_done; + /* Set for undeferred tasks with unsatisfied dependencies which + block further execution of their parent until the dependencies + are satisfied. */ bool parent_depends_on; + /* Dependencies provided and/or needed for this task. DEPEND_COUNT + is the number of items available. */ struct gomp_task_depend_entry depend[]; }; struct gomp_taskgroup { struct gomp_taskgroup *prev; + /* List of tasks that belong in this taskgroup. Tasks are chained + by next/prev_taskgroup within the gomp_task. */ struct gomp_task *children; bool in_taskgroup_wait; bool cancelled; @@ -411,6 +427,8 @@ struct gomp_team struct gomp_work_share work_shares[8]; gomp_mutex_t task_lock; + /* Scheduled tasks. Chain fields are next/prev_queue within a + gomp_task. */ struct gomp_task *task_queue; /* Number of all GOMP_TASK_{WAITING,TIED} tasks in the team. */ unsigned int task_count; diff --git a/libgomp/task.c b/libgomp/task.c index aa7ae4d..179e0fa 100644 --- a/libgomp/task.c +++ b/libgomp/task.c @@ -463,14 +463,26 @@ gomp_task_run_pre (struct gomp_task *child_task, struct gomp_task *parent, if (parent->children == child_task) parent->children = child_task->next_child; + /* If the current task (child_task) is at the top of the +parent's last_parent_depends_on, it's about to be removed +from it. Adjust last_parent_depends_on appropriately. */ if (__builtin_expect (child_task->parent_depends_on, 0) && parent->taskwait->last_parent_depends_on == child_task) { + /* The last_parent_depends_on list was built with all +parent_depends_on entries linked to the prev_child. Grab +the next last_parent_depends_on head from this prev_child if +available... 
*/ if (child_task->prev_child->kind == GOMP_TASK_WAITING && child_task->prev_child->parent_depends_on) parent->taskwait->last_parent_depends_on = child_task->prev_child; else - parent->taskwait->last_parent_depends_on = NULL; + { + /* ...otherwise, there are no more parent_depends_on +entries waiting to run. In which case, clear the +list. */ + parent->taskwait->last_parent_depends_on = NULL; + } } } @@ -529,6 +541,11 @@ gomp_task_run_post_handle_depend_hash (struct gomp_task *child_task) } } +/* After CHILD_TASK has been run, adjust the various task queues to + give higher priority to the tasks that depend on CHILD_TASK. + + TEAM is the team to which CHILD_TASK belongs to. */ + static size_t gomp_task_run_post_handle_dependers (struct gomp_task *child_task, struct gomp_team *team) @@ -552,7 +569,7 @@ gomp_task_run_post_handle_dependers (struct gomp_task *child_task,
Re: [gomp4.1] document more structures in libgomp.h
On Fri, Aug 28, 2015 at 07:21:46AM -0700, Aldy Hernandez wrote:
> 	* libgomp.h: Document gomp_task_depend_entry, gomp_task,
> 	gomp_taskgroup.
> 	*task.c (gomp_task_run_pre): Add comments.

Missing space before task.c.

> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -279,9 +279,11 @@ struct htab;
>
>  struct gomp_task_depend_entry
>  {
> +  /* Address of dependency.  */
>    void *addr;
>    struct gomp_task_depend_entry *next;
>    struct gomp_task_depend_entry *prev;
> +  /* Task that provides the depdency in ADDR.  */

Typo.

>    struct gomp_task *task;
>    /* Depend entry is of type "IN".  */
>    bool is_in;
> @@ -312,10 +314,14 @@ struct gomp_taskwait
>  struct gomp_task
>  {
>    struct gomp_task *parent;
> +  /* Children of this task.  Siblings are chained by
> +     NEXT/PREV_CHILD fields below.  */

I think it would be better to say here that it is a circular list,
and how the siblings are ordered in the circular list.

>  struct gomp_taskgroup
>  {
>    struct gomp_taskgroup *prev;
> +  /* List of tasks that belong in this taskgroup.  Tasks are chained
> +     by next/prev_taskgroup within the gomp_task.  */

Again, perhaps mention also that it is a circular list and how the items
of the circular list are sorted.

> +  /* Scheduled tasks.  Chain fields are next/prev_queue within a
> +     gomp_task.  */

Similarly.

>    struct gomp_task *task_queue;

	Jakub
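For readers unfamiliar with the layout being discussed, a circular doubly linked list chained through next/prev fields can be sketched as follows. This is a generic illustration, not libgomp code. Task, push_front and unlink_task are hypothetical names, and the ordering rules libgomp applies to its lists are not modelled.

```cpp
#include <cassert>

// Toy task node; the real struct gomp_task has several such next/prev pairs.
struct Task {
  int id;
  Task *next;
  Task *prev;
};

// Insert t at the front of the circular list whose head pointer is head.
void push_front(Task *&head, Task *t) {
  if (head) {
    t->next = head;
    t->prev = head->prev;
    head->prev->next = t;
    head->prev = t;
  } else {
    // A single element points at itself in both directions.
    t->next = t;
    t->prev = t;
  }
  head = t;
}

// Unlink t from the list; head becomes null when the list empties.
void unlink_task(Task *&head, Task *t) {
  t->prev->next = t->next;
  t->next->prev = t->prev;
  if (head == t)
    head = (t->next == t) ? nullptr : t->next;
}
```

Because the list is circular, the last element is always head->prev, so both ends are reachable in constant time from a single head pointer.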
[HSA] Add support for unordered comparison codes
Hi,

I've committed the following simple patch to the branch to add missing
support for unordered comparison codes.

Martin

2015-08-28  Martin Jambor

	* hsa-gen.c (gen_hsa_cmp_insn_from_gimple): Add unordered comparison
	codes.
	(gen_hsa_insns_for_operation_assignment): Likewise.

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 213b564..92df7e4 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -2236,6 +2236,31 @@ gen_hsa_cmp_insn_from_gimple (enum tree_code code, tree lhs, tree rhs,
     case NE_EXPR:
       compare = BRIG_COMPARE_NE;
       break;
+    case UNORDERED_EXPR:
+      compare = BRIG_COMPARE_NAN;
+      break;
+    case ORDERED_EXPR:
+      compare = BRIG_COMPARE_NUM;
+      break;
+    case UNLT_EXPR:
+      compare = BRIG_COMPARE_LTU;
+      break;
+    case UNLE_EXPR:
+      compare = BRIG_COMPARE_LEU;
+      break;
+    case UNGT_EXPR:
+      compare = BRIG_COMPARE_GTU;
+      break;
+    case UNGE_EXPR:
+      compare = BRIG_COMPARE_GEU;
+      break;
+    case UNEQ_EXPR:
+      compare = BRIG_COMPARE_EQU;
+      break;
+    case LTGT_EXPR:
+      compare = BRIG_COMPARE_NEU;
+      break;
+
     default:
       sorry ("Support for HSA does not implement comparison tree code %s\n",
              get_tree_code_name (code));
@@ -2494,6 +2519,14 @@ gen_hsa_insns_for_operation_assignment (gimple assign, hsa_bb *hbb,
     case GE_EXPR:
     case EQ_EXPR:
     case NE_EXPR:
+    case UNORDERED_EXPR:
+    case ORDERED_EXPR:
+    case UNLT_EXPR:
+    case UNLE_EXPR:
+    case UNGT_EXPR:
+    case UNGE_EXPR:
+    case UNEQ_EXPR:
+    case LTGT_EXPR:
       {
         hsa_op_reg *dest = hsa_reg_for_gimple_ssa (gimple_assign_lhs (assign),
                                                    ssa_map);
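As background on what the tree codes named in the patch mean at the source level, here is a minimal sketch assuming IEEE NaN semantics. It models only the GENERIC comparison codes and says nothing about the BRIG side. The function names and the kNaN constant are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

const double kNaN = std::numeric_limits<double>::quiet_NaN();

// UNORDERED_EXPR: true when at least one operand is a NaN.
bool unordered(double a, double b) { return std::isunordered(a, b); }

// ORDERED_EXPR: true when neither operand is a NaN.
bool ordered(double a, double b) { return !std::isunordered(a, b); }

// UNLT/UNLE/UNGT/UNGE/UNEQ: "unordered or" the usual relation.
bool unlt(double a, double b) { return std::isunordered(a, b) || a < b; }
bool unle(double a, double b) { return std::isunordered(a, b) || a <= b; }
bool ungt(double a, double b) { return std::isunordered(a, b) || a > b; }
bool unge(double a, double b) { return std::isunordered(a, b) || a >= b; }
bool uneq(double a, double b) { return std::isunordered(a, b) || a == b; }

// LTGT_EXPR: less than or greater than, so false when either operand is NaN.
bool ltgt(double a, double b) { return std::islessgreater(a, b); }
```

The ordinary relational operators return false when a NaN is involved, whereas the UN* forms return true in that case, which is why they need their own comparison codes.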
[hsa] Create a special omp statement for gpu kernels
Hi, the patch I below that I have committed to the branch adds a special gimple statement code in which GPU statements can survive between lowering and expansion and which makes sure that even statements which pertain to the kernel loop but lowering puts them in front of the loop are picked up by expansion and put into a separate function. Thanks, Martin 2015-08-28 Martin Jambor * omp-low.c (expand_omp_for_kernel): Do not insert return statement. (expand_target_kernel_body): Handle kernels encapsulated in GIMPLE_OMP_GPUKERNEL statements. (lower_omp_target): Lower kernel code into a new GIMPLE_OMP_GPUKERNEL statement. * gimple.def (GIMPLE_OMP_GPUKERNEL): New code. * gimple.c (gimple_build_omp_gpukernel): New function. (gimple_copy): Handle GIMPLE_OMP_GPUKERNEL case. * gimple-low.c (lower_stmt): Likewise. * gimple-pretty-print.c (dump_gimple_omp_block): Likewise. (pp_gimple_stmt_1): Likewise. * gimple.h (gimple_build_omp_gpukernel): Declare. (gimple_has_substatements): Handle GIMPLE_OMP_GPUKERNEL case. (CASE_GIMPLE_OMP): Likewise. Index: gcc/gimple.def === --- gcc/gimple.def (revision 227279) +++ gcc/gimple.def (working copy) @@ -375,6 +375,10 @@ DEFGSCODE(GIMPLE_OMP_TARGET, "gimple_omp CLAUSES is an OMP_CLAUSE chain holding the associated clauses. */ DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_SINGLE_LAYOUT) +/* GIMPLE_OMP_GPUKERNEL represents a parallel loop lowered for execution + on a GPU. It is an artificial statement created by omp lowering. */ +DEFGSCODE(GIMPLE_OMP_GPUKERNEL, "gimple_omp_gpukernel", GSS_OMP) + /* GIMPLE_PREDICT specifies a hint for branch prediction. PREDICT is one of the predictors from predict.def. Index: gcc/gimple.c === --- gcc/gimple.c(revision 227279) +++ gcc/gimple.c(working copy) @@ -959,6 +959,19 @@ gimple_build_omp_master (gimple_seq body return p; } +/* Build a GIMPLE_OMP_GPUKERNEL statement. + + BODY is the sequence of statements to be executed by the kernel. 
*/ + +gimple +gimple_build_omp_gpukernel (gimple_seq body) +{ + gimple p = gimple_alloc (GIMPLE_OMP_GPUKERNEL, 0); + if (body) +gimple_omp_set_body (p, body); + + return p; +} /* Build a GIMPLE_OMP_TASKGROUP statement. @@ -1798,6 +1811,7 @@ gimple_copy (gimple stmt) case GIMPLE_OMP_MASTER: case GIMPLE_OMP_TASKGROUP: case GIMPLE_OMP_ORDERED: + case GIMPLE_OMP_GPUKERNEL: copy_omp_body: new_seq = gimple_seq_copy (gimple_omp_body (stmt)); gimple_omp_set_body (copy, new_seq); Index: gcc/gimple.h === --- gcc/gimple.h(revision 227279) +++ gcc/gimple.h(working copy) @@ -1435,6 +1435,7 @@ gomp_task *gimple_build_omp_task (gimple tree, tree); gimple gimple_build_omp_section (gimple_seq); gimple gimple_build_omp_master (gimple_seq); +gimple gimple_build_omp_gpukernel (gimple_seq); gimple gimple_build_omp_taskgroup (gimple_seq); gomp_continue *gimple_build_omp_continue (tree, tree); gimple gimple_build_omp_ordered (gimple_seq); @@ -1691,6 +1692,7 @@ gimple_has_substatements (gimple g) case GIMPLE_OMP_TARGET: case GIMPLE_OMP_TEAMS: case GIMPLE_OMP_CRITICAL: +case GIMPLE_OMP_GPUKERNEL: case GIMPLE_WITH_CLEANUP_EXPR: case GIMPLE_TRANSACTION: return true; @@ -5879,7 +5881,8 @@ gimple_return_set_retbnd (gimple gs, tre case GIMPLE_OMP_RETURN:\ case GIMPLE_OMP_ATOMIC_LOAD: \ case GIMPLE_OMP_ATOMIC_STORE: \ -case GIMPLE_OMP_CONTINUE +case GIMPLE_OMP_CONTINUE: \ +case GIMPLE_OMP_GPUKERNEL static inline bool is_gimple_omp (const_gimple stmt) Index: gcc/gimple-pretty-print.c === --- gcc/gimple-pretty-print.c (revision 227279) +++ gcc/gimple-pretty-print.c (working copy) @@ -1486,6 +1486,9 @@ dump_gimple_omp_block (pretty_printer *b case GIMPLE_OMP_SECTION: pp_string (buffer, "#pragma omp section"); break; + case GIMPLE_OMP_GPUKERNEL: + pp_string (buffer, "#pragma omp gpukernel"); + break; default: gcc_unreachable (); } @@ -2240,6 +2243,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer case GIMPLE_OMP_TASKGROUP: case GIMPLE_OMP_ORDERED: case GIMPLE_OMP_SECTION: +case GIMPLE_OMP_GPUKERNEL: 
dump_gimple_omp_block (buffer, gs, spc, flags); break; Index: gcc/gimple-low.c === --- gcc/gimple-low.c(revision 227279) +++ gcc/gimple-low.c(working copy) @@ -366,6 +366,7 @@
[PATCH] [ARM, Callgraph] Fix PR67280: function incorrectly marked as nothrow
Hi This patch is an attempt to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67280. I have written up an analysis of the bug there. When cgraph_node::create_wrapper() updates the callgraph for the new function, it sets the can_throw_external flag to false, even when wrapping a function which can throw. This causes the ipa-pure-const phase to mark the wrapper function as nothrow which results in incorrect unwinding tables. (more details on bugzilla) The attached patch addresses the problem in cgraph_node::create_wrapper(). A slightly more general approach would be to change symbol_table::create_edge() so that it checks TREE_NOTHROW(callee->decl) when call_stmt is NULL. This patch passed make check with no new regressions on gcc-5-branch on arm-linux-gnueabihf using qemu. I will do a bootstrap on ARM hardware over the weekend. Do I also need to test x86_64? I plan to add a test case, but it seems it's worth getting review for the approach in the mean time. Thanks, Charles gcc/ChangeLog: 2015-08-28 Charles Baylis * cgraphunit.c (cgraph_node::create_wrapper): Set can_throw_external in new callgraph edge. 0001-fix-up-can_throw_external-in-cgraph_node-create_wrap.patch Description: application/download
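A source-level analogy may help show why a wrapper must not be assumed nothrow. This sketch is not taken from the PR or the patch; it does not model the callgraph edge flag or the unwind tables, and may_throw and wrapper are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical callee that can throw.
int may_throw(int x) {
  if (x < 0)
    throw std::runtime_error("negative");
  return x;
}

// A thin forwarding wrapper, loosely analogous to what create_wrapper builds.
// Since the call it contains can throw, the wrapper can throw too.
int wrapper(int x) { return may_throw(x); }

// The language agrees: a call to wrapper is a potentially-throwing expression.
static_assert(!noexcept(wrapper(0)), "wrapper is potentially throwing");
```

If a compiler wrongly concluded that wrapper could not throw, an exception leaving may_throw would have no correct unwind information for that frame, which is the kind of failure the message describes.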
Re: [PATCH 2/5] completely_scalarize arrays as well as records
Alan Lawrence wrote:
> Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix
> (that declares arrays with any of these properties as unscalarizable).
>
> Monday is a bank holiday in UK and so I expect to get back to you on Tuesday.
>
> --Alan

In the meantime I've reverted the patch pending further testing on x86, aarch64
and arm.

--Alan
Re: [Patch] Add to the libgfortran/newlib bodge to "detect" ftruncate support in ARM/AArch64/SH
On Fri, Aug 28, 2015 at 10:40:31AM +0100, James Greenhalgh wrote: > On Tue, Aug 25, 2015 at 03:44:05PM +0100, FX wrote: > > > 2015-08-25 James Greenhalgh > > > > > > * configure.ac: Auto-detect newlib function support unless we > > > know there are issues when configuring for a host. > > > * configure: Regenerate. > > > > Thanks for CC’ing the fortran list. > > > > Given that this is newlib-specific code, even though it’s in libgfortran > > configury, you should decide and commit what’s best. I don’t think we have > > any newlib expert in the Fortran maintainers. > > > > Wait for 48 hours to see if anyone else objects, though. > > OK, it has been 48 hours and I haven't seen any objections. The newlib > patch has now been committed. > > I agree with Marcus' suggestion that we put the more comprehensive patch > (which requires the newlib fix) on trunk and my original patch (which does > not) on the release branches. > > I'll go ahead with that later today. Now in place on trunk (r227301), gcc-5-branch (r227302) and gcc-4_9-branch (r227304). Give me a shout if you see issues in your build systems. Thanks, James
Re: [patch] fix bootstrap if no ISL is available.
On Thu, Aug 27, 2015 at 3:19 PM, Andreas Tobler wrote: > Hi all, > > I think this is obvious? Yes. Sorry for missing testing without ISL. I have committed a similar patch just after I got pinged that my patch broke builds without ISL. Sebastian > > Thanks, > Andreas > > 2015-08-27 Andreas Tobler > > * toplev.c (process_options): Remove flag_loop_block, > flag_loop_interchange, flag_loop_strip_mine. > > > Index: toplev.c > === > --- toplev.c(revision 227279) > +++ toplev.c(working copy) > @@ -1317,9 +1317,6 @@ > #ifndef HAVE_isl >if (flag_graphite >|| flag_graphite_identity > - || flag_loop_block > - || flag_loop_interchange > - || flag_loop_strip_mine >|| flag_loop_parallelize_all) > sorry ("Graphite loop optimizations cannot be used (ISL is not > available)" >"(-fgraphite, -fgraphite-identity, -floop-block, "
RE: [PATCH] MIPS: If a test in the MIPS testsuite requires standard library support check the sysroot supports the required test options.
> I had some comments on this that I hadn't got round to posting. The fix in > this patch is not general enough as the missing header problem comes in > two (related) forms: > > 1) Using the new MTI and IMG sysroot layout we can end up with GCC looking >for headers in a sysroot that simply does not exist. The current patch >handles this. > 2) Using any sysroot layout (i.e. a simple mips-linux-gnu) it is possible >for the stdlib.h header to be found but the ABI dependent gnu-stubs >header may not be installed depending on soft/hard nan1985/nan2008. > > The test for stdlib.h needs to therefore verify that preprocessing succeeds > rather than just testing for an error relating to stdlib.h. This could be > done by adding a further option to mips_preprocess to indicate the processor > output should go to a file and that the caller wants the messages emitted > by the compiler instead. > > A second issue is that you have added (REQUIRES_STDLIB) to too many tests. > You only need to add it to tests that request a compiler option (via > dg-options) that could potentially lead to forcing soft/hard nan1985/nan2008 > directly or indirectly. So -mips32r6 implies nan2008 so you need it -mips32r5 > implies nan1985 so you need it. There are at least two tests which don't > need the option but you need to check them all so we don't run the check > needlessly. The updated patch and ChangeLog that addresses Matthew's comments is below. Ok to commit? Regards, Andrew testsuite/ * gcc.target/mips/loongson-simd.c (dg-options): Add (REQUIRES_STDLIB). 
* gcc.target/mips/loongson-shift-count-truncated-1.c: Ditto * gcc.target/mips/mips-3d-1.c: Ditto * gcc.target/mips/mips-3d-2.c: Ditto * gcc.target/mips/mips-3d-3.c: Ditto * gcc.target/mips/mips-3d-4.c: Ditto * gcc.target/mips/mips-3d-5.c: Ditto * gcc.target/mips/mips-3d-6.c: Ditto * gcc.target/mips/mips-3d-7.c: Ditto * gcc.target/mips/mips-3d-8.c: Ditto * gcc.target/mips/mips-3d-9.c: Ditto * gcc.target/mips/mips-ps-1.c: Ditto * gcc.target/mips/mips-ps-2.c: Ditto * gcc.target/mips/mips-ps-3.c: Ditto * gcc.target/mips/mips-ps-4.c: Ditto * gcc.target/mips/mips-ps-6.c: Ditto * gcc.target/mips/mips16-attributes.c: Ditto * gcc.target/mips/mips32-dsp-run.c: Ditto * gcc.target/mips/mips32-dsp.c: Ditto * gcc.target/mips/save-restore-1.c: Ditto * gcc.target/mips/mips.exp (mips_option_groups): Add stdlib. (mips_preprocess): Add ignore_output argument that when set will not return the pre-processed output. (mips_arch_info): Update arguments for the call to mips_preprocess. (mips-dg-init): Ditto. (mips-dg-options): Check if a test having test option (REQUIRES_STDLIB) has the required sysroot support for the current test options. diff --git a/gcc/testsuite/gcc.target/mips/loongson-shift-count-truncated-1.c b/gcc/testsuite/gcc.target/mips/loongson-shift-count-truncated-1.c index f57a18c..baed48c 100644 --- a/gcc/testsuite/gcc.target/mips/loongson-shift-count-truncated-1.c +++ b/gcc/testsuite/gcc.target/mips/loongson-shift-count-truncated-1.c @@ -4,7 +4,7 @@ /* loongson.h does not handle or check for MIPS16ness. There doesn't seem any good reason for it to, given that the Loongson processors do not support MIPS16. */ -/* { dg-options "isa=loongson -mhard-float -mno-mips16" } */ +/* { dg-options "isa=loongson -mhard-float -mno-mips16 (REQUIRES_STDLIB)" } */ /* See PR 52155. 
*/ /* { dg-options "isa=loongson -mhard-float -mno-mips16 -mlong64" { mips*-*-elf* && ilp32 } } */ diff --git a/gcc/testsuite/gcc.target/mips/loongson-simd.c b/gcc/testsuite/gcc.target/mips/loongson-simd.c index 6d2ceb6..f263b43 100644 --- a/gcc/testsuite/gcc.target/mips/loongson-simd.c +++ b/gcc/testsuite/gcc.target/mips/loongson-simd.c @@ -26,7 +26,7 @@ along with GCC; see the file COPYING3. If not see because inclusion of some system headers e.g. stdint.h will fail due to not finding stubs-o32_hard.h. */ /* { dg-require-effective-target mips_nanlegacy } */ -/* { dg-options "isa=loongson -mhard-float -mno-micromips -mno-mips16 -flax-vector-conversions" } */ +/* { dg-options "isa=loongson -mhard-float -mno-micromips -mno-mips16 -flax-vector-conversions (REQUIRES_STDLIB)" } */ #include "loongson.h" #include diff --git a/gcc/testsuite/gcc.target/mips/mips-3d-1.c b/gcc/testsuite/gcc.target/mips/mips-3d-1.c index f11ffc5..3a3318d 100644 --- a/gcc/testsuite/gcc.target/mips/mips-3d-1.c +++ b/gcc/testsuite/gcc.target/mips/mips-3d-1.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-mips3d" } */ +/* { dg-options "-mips3d (REQUIRES_STDLIB)" } */ /* Test MIPS-3D builtin functions */ #include diff --git a/gcc/testsuite/gcc.target/mips/mips-3d-2.c b/gcc/testsuite/gcc.target/mips/mips-3d-2.c index b04c3bf..3464ed4 100644 ---
[PATCH] PR bootstrap/67385: READELF_FOR_TARGET isn't used in gcc configure
Similar to as, ld, nm and objdump, gcc configure should check $READELF_FOR_TARGET for readelf. OK for trunk? H.J. --- PR bootstrap/67385 * configure.ac (gcc_cv_readelf): Check $READELF_FOR_TARGET. * configure: Regenerated. --- gcc/configure| 6 -- gcc/configure.ac | 4 +++- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/gcc/configure b/gcc/configure index 0d31383..4d16140 100755 --- a/gcc/configure +++ b/gcc/configure @@ -22232,9 +22232,11 @@ if test -f $gcc_cv_binutils_srcdir/configure.ac \ gcc_cv_readelf=../binutils/readelf$build_exeext elif test -x readelf$build_exeext; then gcc_cv_readelf=./readelf$build_exeext +elif ( set dummy $READELF_FOR_TARGET; test -x $2 ); then +gcc_cv_readelf="$READELF_FOR_TARGET" else -# Extract the first word of "readelf", so it can be a program name with args. -set dummy readelf; ac_word=$2 +# Extract the first word of "$READELF_FOR_TARGET", so it can be a program name with args. +set dummy $READELF_FOR_TARGET; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_path_gcc_cv_readelf+set}" = set; then : diff --git a/gcc/configure.ac b/gcc/configure.ac index 846651d..81aba21 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -2367,8 +2367,10 @@ if test -f $gcc_cv_binutils_srcdir/configure.ac \ gcc_cv_readelf=../binutils/readelf$build_exeext elif test -x readelf$build_exeext; then gcc_cv_readelf=./readelf$build_exeext +elif ( set dummy $READELF_FOR_TARGET; test -x $[2] ); then +gcc_cv_readelf="$READELF_FOR_TARGET" else -AC_PATH_PROG(gcc_cv_readelf, readelf) +AC_PATH_PROG(gcc_cv_readelf, $READELF_FOR_TARGET) fi]) AC_MSG_CHECKING(what readelf to use) -- 2.4.3
Re: [gomp4.1] document more structures in libgomp.h
On 08/28/2015 07:31 AM, Jakub Jelinek wrote: On Fri, Aug 28, 2015 at 07:21:46AM -0700, Aldy Hernandez wrote: * libgomp.h: Document gomp_task_depend_entry, gomp_task, gomp_taskgroup. *task.c (gomp_task_run_pre): Add comments. Missing space before task.c. --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -279,9 +279,11 @@ struct htab; struct gomp_task_depend_entry { + /* Address of dependency. */ void *addr; struct gomp_task_depend_entry *next; struct gomp_task_depend_entry *prev; + /* Task that provides the depdency in ADDR. */ Typo. struct gomp_task *task; /* Depend entry is of type "IN". */ bool is_in; @@ -312,10 +314,14 @@ struct gomp_taskwait struct gomp_task { struct gomp_task *parent; + /* Children of this task. Siblings are chained by + NEXT/PREV_CHILD fields below. */ I think it would be better to say here that it is a circular list, and how the siblings are ordered in the circular list. struct gomp_taskgroup { struct gomp_taskgroup *prev; + /* List of tasks that belong in this taskgroup. Tasks are chained + by next/prev_taskgroup within the gomp_task. */ Again, perhaps mention also that it is a circular list and how the items of the circular list are sorted. + /* Scheduled tasks. Chain fields are next/prev_queue within a + gomp_task. */ Similarly. struct gomp_task *task_queue; Jakub How about this? Aldy commit 9647978b5a05dfb851e2b61704187dce71ffe379 Author: Aldy Hernandez Date: Fri Aug 28 07:19:51 2015 -0700 * libgomp.h: Document gomp_task_depend_entry, gomp_task, gomp_taskgroup. *task.c (gomp_task_run_pre): Add comments. (gomp_task_run_post_handle_dependers): Same. (GOMP_taskwait): Same. diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index f855813..7babf04 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -279,9 +279,11 @@ struct htab; struct gomp_task_depend_entry { + /* Address of dependency. */ void *addr; struct gomp_task_depend_entry *next; struct gomp_task_depend_entry *prev; + /* Task that provides the dependency in ADDR. 
*/ struct gomp_task *task; /* Depend entry is of type "IN". */ bool is_in; @@ -311,22 +313,35 @@ struct gomp_taskwait struct gomp_task { + /* Parent circular list. See children description below. */ struct gomp_task *parent; + /* Circular list representing the children of this task. + + In this list we first have parent_depends_on ready to run tasks, + then !parent_depends_on ready to run tasks, then already running + tasks, and finally the rest of the tasks. */ struct gomp_task *children; struct gomp_task *next_child; struct gomp_task *prev_child; + /* Circular task_queue in `struct gomp_team'. + + GOMP_TASK_WAITING tasks come before GOMP_TASK_TIED tasks. */ struct gomp_task *next_queue; struct gomp_task *prev_queue; - /* Next task in the current taskgroup. */ + /* Circular queue in gomp_taskgroup->children. + + GOMP_TASK_WAITING tasks come before GOMP_TASK_TIED tasks. */ struct gomp_task *next_taskgroup; - /* Previous task in the current taskgroup. */ struct gomp_task *prev_taskgroup; /* Taskgroup this task belongs in. */ struct gomp_taskgroup *taskgroup; + /* Tasks that depend on this task. */ struct gomp_dependers_vec *dependers; struct htab *depend_hash; struct gomp_taskwait *taskwait; + /* Number of items in DEPEND. */ size_t depend_count; + /* Number of tasks in the DEPENDERS field above. */ size_t num_dependees; struct gomp_task_icv icv; void (*fn) (void *); @@ -335,13 +350,23 @@ struct gomp_task bool in_tied_task; bool final_task; bool copy_ctors_done; + /* Set for undeferred tasks with unsatisfied dependencies which + block further execution of their parent until the dependencies + are satisfied. */ bool parent_depends_on; + /* Dependencies provided and/or needed for this task. DEPEND_COUNT + is the number of items available. */ struct gomp_task_depend_entry depend[]; }; struct gomp_taskgroup { struct gomp_taskgroup *prev; + /* Circular list of tasks that belong in this taskgroup. 
+ + Tasks are chained by next/prev_taskgroup within gomp_task, and + are sorted by GOMP_TASK_WAITING tasks, and then GOMP_TASK_TIED + tasks. */ struct gomp_task *children; bool in_taskgroup_wait; bool cancelled; @@ -411,6 +436,8 @@ struct gomp_team struct gomp_work_share work_shares[8]; gomp_mutex_t task_lock; + /* Scheduled tasks. Chain fields are next/prev_queue within a + gomp_task. */ struct gomp_task *task_queue; /* Number of all GOMP_TASK_{WAITING,TIED} tasks in the team. */ unsigned int task_count; diff --git a/libgomp/task.c b/libgomp/task.c index aa7ae4d..179e0fa 100644 --- a/libgomp/task.c +++ b/libgomp/task.c @@ -463,14 +463,26 @@ gomp_task_run_pre (struct gomp_task *child_task
[gomp4] check for compatible parallelism with acc routines
This patch teaches omplower to report any incompatible parallelism when
using routines. I also fixed a minor bug involving reductions inside
routines and removed a dead variable inside execute_oacc_transform which
caused a build warning.

There are two scenarios involving acc routines that need checking:

  1. calls to routines
  2. acc loops inside routines

For both of these cases, I'm utilizing the routine dimensions associated
with the 'oacc function' attribute.

A couple of libgomp test cases were clearly bogus. E.g., you cannot have
a gang loop inside a worker routine, nor can you call a vector routine
from a vector loop. This patch corrects those tests, too.

I encountered one ambiguity in the spec involving the seq loop clause.
The spec says that seq loops are supposed to be executed sequentially by
a single thread. I'm not sure whether that implies that a seq loop cannot
be embedded into a gang/worker/vector loop, or if a gwv loop can nest
inside a seq loop. E.g.

  #pragma acc loop gang
  for (...)
    {
  #pragma acc loop seq
      for (...)
    }

and

  #pragma acc loop seq
  for (...)
    {
  #pragma acc loop gang
      for (...)
    }

Right now, gcc is permitting both of these loops. I.e., only the seq loop
itself is executing in a non-partitioned mode. Julian inquired about this
in the openacc technical list a while ago, but I don't think he got a
response.

This patch has been applied to gomp-4_0-branch.

Cesar

2015-08-28  Cesar Philippidis

	gcc/
	* omp-low.c (extract_oacc_routine_gwv): New function.
	(build_outer_var_ref): Handle refs inside acc routines.
	(scan_omp_for): Check nested parallelism inside acc routines.
	(scan_omp_1_stmt): Check for compatible parallelism when calling
	routines.
	(execute_oacc_transform): Remove dead variable.

	gcc/testsuite/
	* c-c++-common/goacc/routine-6.c: New test.
	* c-c++-common/goacc/routine-7.c: New test.
	* gfortran.dg/goacc/routine-4.f90: New test.
	* gfortran.dg/goacc/routine-5.f90: New test.
libgomp/ * testsuite/libgomp.oacc-c-c++-common/routine-4.c: Fix calls to acc routines. * testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise. * testsuite/libgomp.oacc-fortran/vector-routine.f90: Likewise. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 4312a60..e8d7513 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -415,6 +415,35 @@ is_combined_parallel (struct omp_region *region) return region->is_combined_parallel; } +/* Return the gang, worker and vector attributes from associated with + FNDECL. Returns a GOMP_DIM for the lowest level of parallelism beginning + with GOMP_DIM_GANG, or -1 if the routine is a SEQ. Otherwise, return 0 if + the FNDECL is not an acc routine. +*/ + +static int +extract_oacc_routine_gwv (tree fndecl) +{ + tree attrs = get_oacc_fn_attrib (fndecl); + tree pos; + unsigned gwv = 0; + int i; + int ret = 0; + + if (attrs != NULL_TREE) +{ + for (i = 0, pos = TREE_VALUE (attrs); + gwv == 0 && i != GOMP_DIM_MAX; + i++, pos = TREE_CHAIN (pos)) + if (TREE_PURPOSE (pos) != boolean_false_node) + return 1 << i; + + ret = -1; +} + + return ret; +} + /* Extract the header elements of parallel loop FOR_STMT and store them into *FD. */ @@ -1227,7 +1256,8 @@ build_outer_var_ref (tree var, omp_context *ctx) else x = lookup_decl (var, ctx->outer); } - else if (is_reference (var) || is_oacc_parallel (ctx)) + else if (is_reference (var) || is_oacc_parallel (ctx) + || extract_oacc_routine_gwv (current_function_decl) != 0) /* This can happen with orphaned constructs. If var is reference, it is possible it is shared and as such valid. 
*/ x = var; @@ -2578,9 +2608,16 @@ scan_omp_for (gomp_for *stmt, omp_context *outer_ctx) bool gwv_clause = false; bool auto_clause = false; bool seq_clause = false; + int gwv_routine = 0; if (outer_ctx) outer_type = gimple_code (outer_ctx->stmt); + else +{ + gwv_routine = extract_oacc_routine_gwv (current_function_decl); + if (gwv_routine > 0) + gwv_routine = gwv_routine >> 1; +} ctx = new_omp_context (stmt, outer_ctx); @@ -2699,6 +2736,12 @@ scan_omp_for (gomp_for *stmt, omp_context *outer_ctx) && ctx->gwv_this > ctx->gwv_below) error_at (gimple_location (stmt), "gang, worker and vector must occur in this order in a loop nest"); + else if (!outer_ctx && ctx->gwv_this != 0 && gwv_routine != 0 + && ((ffs (ctx->gwv_this) <= gwv_routine) + || gwv_routine < 0)) + error_at (gimple_location (stmt), + "invalid parallelism inside acc routine"); + if (outer_ctx && outer_type == GIMPLE_OMP_FOR) outer_ctx->gwv_below |= ctx->gwv_below; } @@ -3287,6 +3330,16 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p, default: break; } + else if (ctx && is_gimple_omp_oacc (ctx->stmt) + && !is_oacc_parallel (ctx)) + { + /* Is this a call to an acc routine? */ + int gwv = extract_oacc_routine_gwv (fnd
Re: [gomp4.1] document more structures in libgomp.h
On Fri, Aug 28, 2015 at 08:53:44AM -0700, Aldy Hernandez wrote:
> @@ -311,22 +313,35 @@ struct gomp_taskwait
>
>  struct gomp_task
>  {
> +  /* Parent circular list.  See children description below.  */
>    struct gomp_task *parent;
> +  /* Circular list representing the children of this task.
> +
> +     In this list we first have parent_depends_on ready to run tasks,
> +     then !parent_depends_on ready to run tasks, then already running
> +     tasks, and finally the rest of the tasks.  */

I don't think we have ", and finally the rest of the tasks", so please
leave that out. Ok with that change.

	Jakub
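The circular children list that the new comments describe can be sketched in a few lines of C. This is only an illustration under stated assumptions: struct task is a hypothetical, cut-down stand-in for struct gomp_task, and task_add_child / task_remove_child are made-up helper names, not functions from libgomp. Inserting at the head models "this task should be found first" (as the comment says of parent_depends_on tasks); inserting at the tail models everything else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct gomp_task: only the
   fields needed to show the circular children list.  */
struct task
{
  struct task *parent;
  struct task *children;    /* Head of the circular list, or NULL.  */
  struct task *next_child;
  struct task *prev_child;
  int id;
};

/* Link CHILD into PARENT's circular list.  With AT_HEAD nonzero the
   child becomes the new head; otherwise it ends up at the tail.  */
static void
task_add_child (struct task *parent, struct task *child, int at_head)
{
  child->parent = parent;
  if (parent->children == NULL)
    {
      /* A single element points at itself in both directions.  */
      child->next_child = child;
      child->prev_child = child;
      parent->children = child;
      return;
    }

  /* Splice CHILD between the current tail and the current head.  */
  struct task *head = parent->children;
  child->next_child = head;
  child->prev_child = head->prev_child;
  head->prev_child->next_child = child;
  head->prev_child = child;
  if (at_head)
    parent->children = child;
}

/* Unlink CHILD from its parent's list, fixing up the head pointer.  */
static void
task_remove_child (struct task *child)
{
  struct task *parent = child->parent;
  assert (parent != NULL);
  if (child->next_child == child)
    parent->children = NULL;
  else
    {
      child->prev_child->next_child = child->next_child;
      child->next_child->prev_child = child->prev_child;
      if (parent->children == child)
        parent->children = child->next_child;
    }
  child->next_child = NULL;
  child->prev_child = NULL;
  child->parent = NULL;
}
```

Because the list is circular, the tail is always reachable as head->prev_child, so both head and tail insertion are constant time without a separate tail pointer.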
Re: [Patch, libstdc++] Add specific error message into exceptions
On 27/08/15 22:18 -0700, Tim Shen wrote: Bootstrapped and tested. Thanks! -- Regards, Tim Shen commit 53c1caff442e97a18652ec8b1d984341168fd98d Author: Tim Shen Date: Thu Aug 27 21:42:40 2015 -0700 PR libstdc++/67361 * include/bits/regex_error.h: Add __throw_regex_error that supports string. * include/bits/regex_automaton.h: Add more specific exception messages. * include/bits/regex_automaton.tcc: Likewise. * include/bits/regex_compiler.h: Likewise. * include/bits/regex_compiler.tcc: Likewise. * include/bits/regex_scanner.h: Likewise. * include/bits/regex_scanner.tcc: Likewise. Nice, thanks for doing this! @@ -158,10 +159,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // _M_paren_stack is {1, 3}, for incomplete "(a.." and "(c..". At this // time, "\\2" is valid, but "\\1" and "\\3" are not. if (__index >= _M_subexpr_count) - __throw_regex_error(regex_constants::error_backref); + __throw_regex_error( + regex_constants::error_backref, + "Back-reference index exceeds current sub-expression count."); for (auto __it : this->_M_paren_stack) if (__index == __it) - __throw_regex_error(regex_constants::error_backref); + __throw_regex_error( + regex_constants::error_backref, + "Back-reference refered to an opened sub-expression."); Should be "referred". And one of the other strings in another throw says "befoer". 
diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h index 0cb0c04..12ffabe 100644 --- a/libstdc++-v3/include/bits/regex_compiler.h +++ b/libstdc++-v3/include/bits/regex_compiler.h @@ -397,7 +397,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION auto __st = _M_traits.lookup_collatename(__s.data(), __s.data() + __s.size()); if (__st.empty()) - __throw_regex_error(regex_constants::error_collate); + __throw_regex_error(regex_constants::error_collate, + string("Invalid collate element: ")); _M_char_set.push_back(_M_translator._M_translate(__st[0])); #ifdef _GLIBCXX_DEBUG _M_is_ready = false; There seems to be no need to construct a std::string here, just pass a const char* (see below). Also, this string ends in a colon, whereas most end in a period. Any reason for the difference? @@ -411,7 +412,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION auto __st = _M_traits.lookup_collatename(__s.data(), __s.data() + __s.size()); if (__st.empty()) - __throw_regex_error(regex_constants::error_collate); + __throw_regex_error(regex_constants::error_collate, + string("Invalid equivalence class.")); __st = _M_traits.transform_primary(__st.data(), __st.data() + __st.size()); _M_equiv_set.push_back(__st); Just pass const char*. @@ -428,7 +430,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __s.data() + __s.size(), __icase); if (__mask == 0) - __throw_regex_error(regex_constants::error_ctype); + __throw_regex_error(regex_constants::error_collate, + string("Invalid character class.")); if (!__neg) _M_class_set |= __mask; else Ditto. @@ -442,7 +445,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_make_range(_CharT __l, _CharT __r) { if (__l > __r) - __throw_regex_error(regex_constants::error_range); + __throw_regex_error(regex_constants::error_range, + string("Invalid range in bracket expression.")); _M_range_set.push_back(make_pair(_M_translator._M_transform(__l), _M_translator._M_transform(__r))); #ifdef _GLIBCXX_DEBUG Ditto. 
diff --git a/libstdc++-v3/include/bits/regex_compiler.tcc b/libstdc++-v3/include/bits/regex_compiler.tcc index 9a62311..019ca42 100644 --- a/libstdc++-v3/include/bits/regex_compiler.tcc +++ b/libstdc++-v3/include/bits/regex_compiler.tcc @@ -77,16 +77,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_traits(_M_nfa->_M_traits), _M_ctype(std::use_facet<_CtypeT>(__loc)) { - _StateSeqT __r(*_M_nfa, _M_nfa->_M_start()); - __r._M_append(_M_nfa->_M_insert_subexpr_begin()); - this->_M_disjunction(); - if (!_M_match_token(_ScannerT::_S_token_eof)) - __throw_regex_error(regex_constants::error_paren); - __r._M_append(_M_pop()); - _GLIBCXX_DEBUG_ASSERT(_M_stack.empty()); - __r._M_append(_M_nfa->_M_insert_subexpr_end()); - __r._M_append(_M_nfa->_M_insert_accept()); - _M_nfa->_M_eliminate_dummy(); + __try + { + _StateSeqT __r(*_M_nfa, _M_nfa->_M_start()); +
[PATCH][lto/66752] Fix missed FSM jump thread
It's taken a month to get back to this, but it's time to re-install this
patch onto the trunk with a minor update.

This patch gives the FSM jump threading code the opportunity to find known
values when we have a condition like (x != 0). Previously it just allowed
a naked SSA_NAME (which is what appears in a SWITCH_EXPR). By handling
(x != 0) the FSM bits can thread through some COND_EXPRs. Basically given
(x != 0), we just ask the FSM bits to do their thing on (x) and the right
things just happen.

Which brings us to what changed between this and the prior version of this
patch. When we ask the FSM bits to look up the value of an SSA_NAME that
appears in a COND_EXPR or SWITCH_EXPR, we must ask for the SSA_NAME from
the *original* expression -- without any simplifications/substitutions.
The reason for this is those simplifications/substitutions may have been
made using a context sensitive equivalence that is not guaranteed to hold
on whatever paths the FSM threader finds.

The SWITCH_EXPR handling code handled this correctly, and I can recall
discussing the issue with Sebastian. However, the issue slipped my mind
when I did the extension to handle COND_EXPRs. This led to the PPC
bootstrap failures we saw when this patch was originally installed.

Sadly, I've made multiple attempts to reduce the bootstrap failure to a
reasonable testcase for the regression suite without success. GCC sources
are getting harder and harder to turn into execution tests ;(

Anyway, this patch includes a fix for that issue. In fact, recovering the
original operands for the comparison is the only change I've made.

Bootstrapped and regression tested on x86_64-linux and ppc64-linux.
Installed on the trunk.
Jeff diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 09d4a6d..de7f367 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,15 @@ +2015-08-28 Jeff Law + + PR lto/66752 + * tree-ssa-threadedge.c (simplify_conrol_stmt_condition): If we are + unable to find X NE 0 in the tables, return X as the simplified + condition. + (fsm_find_control_statement_thread_paths): If nodes in NEXT_PATH are + in VISISTED_BBS, then return failure. Else add nodes from NEXT_PATH + to VISISTED_BBS. + * tree-ssa-threadupdate.c (duplicate_thread_path): Fix up edge flags + after removing the control flow statement and unnecessary edges. + 2015-08-28 Alan Lawrence Revert: diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 388417a..8d4c3f6 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,10 @@ +2015-08-28 Jeff Law + + PR lto/66752 + * gcc.dg/tree-ssa/pr66752-2.c: New test. + * gcc.dg/torture/pr66752-1.c: New test + * g++.dg/torture/pr66752-2.C: New test. + 2015-08-28 Alan Lawrence Revert: 2015-08-27 Alan Lawrence diff --git a/gcc/testsuite/g++.dg/torture/pr66752-2.C b/gcc/testsuite/g++.dg/torture/pr66752-2.C new file mode 100644 index 000..96d3fe9 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr66752-2.C @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +extern "C" +{ + typedef struct _IO_FILE FILE; + extern int fprintf (FILE * __restrict __stream, + const char *__restrict __format, ...); +} +typedef union tree_node *tree; +class ipa_polymorphic_call_context +{ +}; +class ipcp_value_base +{ +}; +template < typename valtype > class ipcp_value:public ipcp_value_base +{ +public:valtype value; + ipcp_value *next; +}; + +template < typename valtype > class ipcp_lattice +{ +public:ipcp_value < valtype > *values; + void print (FILE * f, bool dump_sources, bool dump_benefits); +}; + +class ipcp_param_lattices +{ +public:ipcp_lattice < tree > itself; + ipcp_lattice < ipa_polymorphic_call_context > ctxlat; +}; +template < typename valtype > void ipcp_lattice < 
valtype >::print (FILE * f, + bool + dump_sources, + bool + dump_benefits) +{ + ipcp_value < valtype > *val; + bool prev = false; + for (val = values; val; val = val->next) +{ + if (dump_benefits && prev) + fprintf (f, " "); + else if (!dump_benefits && prev) + fprintf (f, ", "); + else + prev = true; + if (dump_sources) + fprintf (f, "]"); + if (dump_benefits) + fprintf (f, "shit"); +} +} + +void +print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits) +{ + struct ipcp_param_lattices *plats; + plats->ctxlat.print (f, dump_sources, dump_benefits); +} diff --git a/gcc/testsuite/gcc.dg/torture/pr66752-1.c b/gcc/testsuite/gcc.dg/torture/pr66752-1.c new file mode 100644 index 000..a742555 --- /dev/null +++ b/gcc/testsuite/gc
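As a rough illustration of the kind of source the description above is about, here is a small state machine whose state is tested with a condition of the form (x != 0). It is a sketch only: count_words is a made-up function, not one of the testcases in the patch, and nothing here claims that a particular GCC version threads this loop. The point is just that the value of in_word is known on each path that reaches the test, which is the information a jump threader would try to exploit.

```c
#include <assert.h>

/* Counts space-separated words using a two-state machine kept in
   IN_WORD (0 = between words, 1 = inside a word).  */
static int
count_words (const char *s)
{
  int in_word = 0;
  int count = 0;

  for (; *s != '\0'; s++)
    {
      if (*s == ' ')
        in_word = 0;
      else
        {
          /* Coming around the loop from the branch that stored 0, this
             condition is known to be false; coming from the path that
             stored 1 (or took the continue), it is known to be true.  */
          if (in_word != 0)
            continue;   /* Still inside the same word.  */
          count++;
          in_word = 1;
        }
    }
  return count;
}
```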
[PATCH, libbacktrace] SPU does not support fcntl
Hello,

one more compatibility problem showed up with libbacktrace on SPU:
we do not have the fcntl routine. This does not cause a failure when
building the library, but it does when linking the final executable.

Fixed by disabling have_fcntl for SPU (just as is already done for
mingw).

Tested on x86_64-linux and spu-elf.

OK for mainline?

Bye,
Ulrich

ChangeLog:

	* configure.ac: For spu-*-* targets, set have_fcntl to no.
	* configure: Regenerate.

Index: libbacktrace/configure.ac
===
*** libbacktrace/configure.ac (revision 227304)
--- libbacktrace/configure.ac (working copy)
*** fi
*** 325,330
--- 331,337
  if test -n "${with_target_subdir}"; then
    case "${host}" in
    *-*-mingw*) have_fcntl=no ;;
+   spu-*-*) have_fcntl=no ;;
    *) have_fcntl=yes ;;
    esac
  else

--
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com
[PATCH, libbacktrace] SPU does not support __sync or __atomic
Hello,

this is the (hopefully) last compatibility problem with libbacktrace
on SPU: we do not have either the __sync or the __atomic routines
(since the SPU is a fundamentally single-threaded target).

There are configure.ac checks for both of these functions, but for
cross-compilation, the code assumes they are always available. This
patch adds explicit checks for the SPU target and disables those
functions for that target, just as is done with other checks.

However, the resulting source does not build, since the fallback
(abort) defines in internal.h cause compile warnings (made into
errors due to -Werror). There are two problems:

- "variable set but not used" for some variables used as arguments
  to the backtrace_atomic_... routines. Fixed by adding dummy uses
  of the arguments to the fallback definitions of those macros.

- "right-hand operand of comma expression has no effect" for two
  cases where code ignored the return value of
  __sync_bool_compare_and_swap with a fallback definition of
  (abort (), 1).

I was unable to find a solution for the second problem solely by
modifying the fallback definition. There doesn't appear to be a way to
do so using regular macros. Turning the fallback into an inline
function doesn't work since it is a type-generic primitive. I guess a
statement expression might work, but I'm not sure if GNU extensions
are OK here.

So I ended up with actually adding (void) casts to the two places
where this happens. It seems to me explicitly indicating that it is OK
to ignore the return value of __sync_bool_compare_and_swap in those
places may be useful anyway.

Of course I'd be happy for alternative suggestions on how to fix this.

Tested on x86_64-linux and spu-elf.

OK for mainline?

Bye,
Ulrich

ChangeLog:

	* configure.ac: For spu-*-* targets, set libbacktrace_cv_sys_sync
	and libbacktrace_cv_sys_atomic to no.
	* configure: Regenerate.
* internals.h [!HAVE_ATOMIC_FUNCTIONS, !HAVE_SYNC_FUNCTIONS] (backtrace_atomic_load_pointer, backtrace_atomic_load_int, backtrace_atomic_store_pointer, backtrace_atomic_store_size_t, backtrace_atomic_store_int): Add dummy uses of arguments. * elf.c (backtrace_initialize): Explicitly cast unused return value of __sync_bool_compare_and_swap to void. * pecoff.c (backtrace_initialize): Likewise. Index: libbacktrace/configure.ac === *** libbacktrace/configure.ac (revision 227304) --- libbacktrace/configure.ac (working copy) *** AC_SUBST(PIC_FLAG) *** 172,178 AC_CACHE_CHECK([__sync extensions], [libbacktrace_cv_sys_sync], [if test -n "${with_target_subdir}"; then !libbacktrace_cv_sys_sync=yes else AC_LINK_IFELSE( [AC_LANG_PROGRAM([int i;], --- 172,181 AC_CACHE_CHECK([__sync extensions], [libbacktrace_cv_sys_sync], [if test -n "${with_target_subdir}"; then !case "${host}" in !spu-*-*) libbacktrace_cv_sys_sync=no ;; !*) libbacktrace_cv_sys_sync=yes ;; !esac else AC_LINK_IFELSE( [AC_LANG_PROGRAM([int i;], *** AC_SUBST(BACKTRACE_SUPPORTS_THREADS) *** 194,200 AC_CACHE_CHECK([__atomic extensions], [libbacktrace_cv_sys_atomic], [if test -n "${with_target_subdir}"; then !libbacktrace_cv_sys_atomic=yes else AC_LINK_IFELSE( [AC_LANG_PROGRAM([int i;], --- 197,206 AC_CACHE_CHECK([__atomic extensions], [libbacktrace_cv_sys_atomic], [if test -n "${with_target_subdir}"; then !case "${host}" in !spu-*-*) libbacktrace_cv_sys_atomic=no ;; !*) libbacktrace_cv_sys_atomic=yes ;; !esac else AC_LINK_IFELSE( [AC_LANG_PROGRAM([int i;], Index: libbacktrace/internal.h === *** libbacktrace/internal.h (revision 227304) --- libbacktrace/internal.h (working copy) *** extern void backtrace_atomic_store_int ( *** 99,109 /* We have neither the sync nor the atomic functions. These will never be called. */ ! #define backtrace_atomic_load_pointer(p) (abort(), (void *) NULL) ! #define backtrace_atomic_load_int(p) (abort(), 0) ! #define backtrace_atomic_store_pointer(p, v) abort() ! 
#define backtrace_atomic_store_size_t(p, v) abort() ! #define backtrace_atomic_store_int(p, v) abort() #endif /* !defined (HAVE_SYNC_FUNCTIONS) */ #endif /* !defined (HAVE_ATOMIC_FUNCTIONS) */ --- 99,109 /* We have neither the sync nor the atomic functions. These will never be called. */ ! #define backtrace_atomic_load_pointer(p) ((void)(p), abort(), (void *) NULL) ! #define backtrace_atomic_load_int(p) ((void)(p), abort(), 0) ! #define backtrace_atomic_store_pointer(p, v) ((void)(p), (void)(v), abort()) ! #define backtrace_atomic_store_size_t(p, v) ((void)(p), (void)(v), abort()) ! #define backtrace_atomic_store_int(p, v) ((void)(p), (void)(v), abort()) #endif /* !defi
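The two warning fixes described in this message can be tried out in isolation. The following is a minimal sketch, not the libbacktrace code: stub_abort is a hypothetical stand-in for abort () so that the example can run to completion, and fallback_store_int / fallback_cas are made-up macro names modelled on the fallback definitions in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for abort (): counts calls instead of
   terminating, so the sketch can be executed.  */
static int stub_abort_calls;

static int
stub_abort (void)
{
  stub_abort_calls++;
  return 0;
}

/* Fallback "store": both arguments are evaluated as void expressions,
   so a variable that is only ever passed to this macro still counts as
   used and does not trigger "variable set but not used".  */
#define fallback_store_int(p, v) \
  ((void) (p), (void) (v), (void) stub_abort ())

/* Fallback compare-and-swap that yields a value, in the style of
   (abort (), 1).  */
#define fallback_cas(p, o, n) \
  ((void) (p), (void) (o), (void) (n), stub_abort (), 1)

static int
demo (void)
{
  int slot = 0;
  int value = 42;   /* Only ever used as a macro argument.  */

  fallback_store_int (&slot, value);

  /* The result is deliberately ignored.  The explicit cast documents
     that and avoids the "right-hand operand of comma expression has no
     effect" warning at the call site.  */
  (void) fallback_cas (&slot, 0, value);

  return slot;
}
```

The cast at the call site is needed because a plain macro cannot tell whether its value is going to be used; only the caller knows that.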
[PATCH][GCC] Algorithmic optimization in match and simplify
Two new algorithmic optimisations:

1. ((X op0 C0) op1 C1) op2 C2)
   with op0 = {&, >>, <<}, op1 = {|,^}, op2 = {|,^} and op1 != op2
   zero_mask has 1's for all bits that are sure to be 0 in (X op0 C0)
   and 0's otherwise.
   if (op1 == '^') C1 &= ~C2 (Only changed if actually emitted)
   if ((C1 & ~zero_mask) == 0) then emit (X op0 C0) op2 (C1 op2 C2)
   if ((C2 & ~zero_mask) == 0) then emit (X op0 C0) op1 (C1 op2 C2)

2. (X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1)

This patch does two algorithmic optimisations that target patterns like:
(((x << 24) | 0x00FF) ^ 0xFF00) and ((x ^ 0x4002) >> 1) | 0x8000.

The transformation uses the knowledge of which bits are zero after
operations like (X {&,<<,(unsigned)>>}) to combine constants, reducing
run-time operations. The two examples above would be transformed into
(X << 24) ^ 0x and (X >> 1) ^ 0xa001 respectively.

gcc/ChangeLog:

2015-08-03  Andre Vieira

	* match.pd: Added new patterns:
	((X {&,<<,>>} C0) {|,^} C1) {^,|} C2)
	(X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1)

gcc/testsuite/ChangeLog:

2015-08-03  Andre Vieira

	* gcc.dg/tree-ssa/forwprop-33.c: New test.

From 15f86df5b3561edf26ae79cedbe160fd46596fd9 Mon Sep 17 00:00:00 2001
From: Andre Simoes Dias Vieira
Date: Wed, 26 Aug 2015 16:27:31 +0100
Subject: [PATCH] algorithmic optimization

---
 gcc/match.pd                                | 70 +
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c | 42 +
 2 files changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c

diff --git a/gcc/match.pd b/gcc/match.pd
index eb0ba9d10a9b8ca66c23c56da0678477379daf80..3d9a8f52713e8dfb2189aad76bce709c924fa286 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -708,6 +708,76 @@ along with GCC; see the file COPYING3. 
If not see && tree_nop_conversion_p (type, TREE_TYPE (@1))) (convert (bit_and (bit_not @1) @0 +/* (X bit_op C0) rshift C1 -> (X rshift C0) bit_op (C0 rshift C1) */ +(for bit_op (bit_ior bit_xor bit_and) +(simplify + (rshift (bit_op:c @0 INTEGER_CST@1) INTEGER_CST@2) + (bit_op + (rshift @0 @2) + { wide_int_to_tree (type, wi::rshift (@1, @2, TYPE_SIGN (type))); }))) + +/* (X bit_op C0) lshift C1 -> (X lshift C0) bit_op (C0 lshift C1) */ +(for bit_op (bit_ior bit_xor bit_and) +(simplify + (lshift (bit_op:c @0 INTEGER_CST@1) INTEGER_CST@2) + (bit_op + (lshift @0 @2) + { wide_int_to_tree (type, wi::lshift (@1, @2)); }))) + + +/* ((X op0 C0) op1 C1) op2 C2) +with op0 = {&, >>, <<}, op1 = {|,^}, op2 = {|,^} and op1 != op2 +zero_mask has 1's for all bits that are sure to be 0 in (X op0 C0) +and 0's otherwise. +if (op1 == '^') C1 &= ~C2; +if ((C1 & ~zero_mask) == 0) then emit (X op0 C0) op2 (C1 op2 C2) +if ((C2 & ~zero_mask) == 0) then emit (X op0 C0) op1 (C1 op2 C2) +*/ +(for op0 (rshift rshift lshift lshift bit_and bit_and) + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor) + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior) +(simplify + (op2:c + (op1:c + (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3) + (with + { +unsigned int prec = TYPE_PRECISION (type); +wide_int zero_mask = wi::zero (prec); +wide_int C0 = wide_int_storage (@1); +wide_int C1 = wide_int_storage (@2); +wide_int C2 = wide_int_storage (@3); +wide_int cst_emit; + +if (op0 == BIT_AND_EXPR) + { + zero_mask = wide_int_storage (wi::neg (@1)); + } +else if (op0 == LSHIFT_EXPR && wi::fits_uhwi_p (@1)) + { + zero_mask = wide_int_storage (wi::mask (C0.to_uhwi (), false, prec)); + } +else if (op0 == RSHIFT_EXPR && TYPE_UNSIGNED (type) && wi::fits_uhwi_p (@1)) + { + unsigned HOST_WIDE_INT m = prec - C0.to_uhwi (); + zero_mask = wide_int_storage (wi::mask (m, true, prec)); + } + +if (op1 == BIT_XOR_EXPR) + { + C1 = wi::bit_and_not (C1,C2); + cst_emit = wi::bit_or (C1, C2); + } +else + { + cst_emit = 
wi::bit_xor (C1, C2); + } + } + (if ((C1 & ~zero_mask) == 0) + (op2 (op0 @0 @1) { wide_int_to_tree (type, cst_emit); }) + (if ((C2 & ~zero_mask) == 0) +(op1 (op0 @0 @1) { wide_int_to_tree (type, cst_emit); })) + /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ (simplify (pointer_plus (pointer_plus:s @0 @1) @3) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c new file mode 100644 index ..984d8b37a01defe0e6852070a7dfa7ace5027c51 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-33.c @@ -0,0 +1,42 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-forwprop1" } */ + +unsigned short +foo (unsigned short a) +{ + a ^= 0x4002; + a >>= 1; + a |= 0x8000; /* Simplify to ((a >> 1) ^ 0xa001). */ + return a; +} + +unsigned short +bar (unsigned short a) +{ + a |= 0x4002; + a <<= 1; + a ^= 0x0001; /* Simp
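The second example from the description, ((x ^ 0x4002) >> 1) | 0x8000 becoming (x >> 1) ^ 0xa001, is small enough to check exhaustively. The sketch below does that in plain C; it assumes unsigned short is 16 bits wide, and the function names are made up for the illustration. The reasoning it checks: shifting the xor constant gives 0x2001, and since bit 15 is always zero after an unsigned 16-bit right shift by one, the final | 0x8000 behaves like ^ 0x8000, so the constants combine to 0x2001 ^ 0x8000 = 0xa001.

```c
#include <assert.h>

/* Straightforward form, as in the proposed test: three operations.  */
static unsigned short
foo_orig (unsigned short a)
{
  a ^= 0x4002;
  a >>= 1;
  a |= 0x8000;
  return a;
}

/* Folded form: one shift and one xor.  */
static unsigned short
foo_folded (unsigned short a)
{
  return (unsigned short) ((a >> 1) ^ 0xa001);
}

/* Returns the number of 16-bit inputs on which the two forms differ.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x <= 0xffffu; x++)
    if (foo_orig ((unsigned short) x) != foo_folded ((unsigned short) x))
      bad++;
  return bad;
}
```

An exhaustive check like this only confirms the arithmetic identity for one type and one set of constants; it says nothing about whether the match.pd patterns themselves fire.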
Re: Openacc launch API
On 08/25/15 09:29, Nathan Sidwell wrote:

> I did rename the GOACC_parallel entry point to GOACC_parallel_keyed and
> provide a forwarding function. However, as the mkoffload data is
> incompatible, this is probably overkill. I've had to increment the (just
> committed) version number to detect the change in data representation.
> So any attempt to run an old binary with a new libgomp will fail at the
> loading point. We could simply keep the same 'GOACC_parallel' name and
> not need any new symbols. WDYT?

I'm coming to the conclusion that just keeping the original
'GOACC_parallel' name is the way to go. As I said above, we cannot support
backwards compatibility on the offload data, so the only remaining case is
someone building an openacc program for running on the host. As I said at
the cauldron, I think the set of users that cared enough about openacc to
try gcc 5 but don't care enough to recompile their programs for gcc 6 is
the empty set.

Jakub?

nathan
Re: [gomp4] check for compatible parallelism with acc routines
On 08/28/15 11:56, Cesar Philippidis wrote:

> Right now, gcc is permitting both of these loops. I.e., only the seq loop
> itself is executing in a non-partitioned mode. Julian inquired about this
> in the openacc technical list a while ago, but I don't think he got a
> response.
>
> This patch has been applied to gomp-4_0-branch.

thanks!
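The loop-inside-routine check from the patch under discussion can be restated as a small standalone predicate. This is a sketch of the logic as it reads in the quoted scan_omp_for hunk, not GCC code: the LOOP_* mask values and both function names are assumptions made for the example, and the ROUTINE_* values follow the return convention of the quoted extract_oacc_routine_gwv (1 << dimension, -1 for seq, 0 for "not a routine").

```c
#include <assert.h>

/* Assumed bit masks for the loop clauses (hypothetical values).  */
enum { LOOP_GANG = 1, LOOP_WORKER = 2, LOOP_VECTOR = 4 };

/* Routine levels in the convention of the quoted
   extract_oacc_routine_gwv.  */
enum
{
  ROUTINE_SEQ = -1,
  ROUTINE_NONE = 0,
  ROUTINE_GANG = 1,
  ROUTINE_WORKER = 2,
  ROUTINE_VECTOR = 4
};

/* 1-based index of the lowest set bit, or 0 if MASK is 0 (the same
   contract as POSIX ffs).  */
static int
lowest_bit_index (unsigned mask)
{
  int idx = 0;
  while (mask != 0)
    {
      idx++;
      if (mask & 1u)
        return idx;
      mask >>= 1;
    }
  return 0;
}

/* Nonzero if an orphaned loop with clause mask LOOP_MASK may appear
   inside a routine whose level is ROUTINE.  */
static int
loop_allowed_in_routine (unsigned loop_mask, int routine)
{
  /* gang -> 0, worker -> 1, vector -> 2; seq stays negative.  */
  int level = routine > 0 ? routine >> 1 : routine;

  /* No partitioning requested, or nothing to check against.  Level 0
     covers both gang routines and non-routines.  */
  if (loop_mask == 0 || level == 0)
    return 1;

  /* A seq routine may not contain gang/worker/vector loops.  */
  if (level < 0)
    return 0;

  /* The outermost clause on the loop must be at or below the
     routine's own level of parallelism.  */
  return lowest_bit_index (loop_mask) > level;
}
```

For instance, a gang loop inside a worker routine is rejected, which matches the "clearly bogus" test case mentioned in the original patch description.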
Re: Openacc launch API
On Fri, Aug 28, 2015 at 01:29:51PM -0400, Nathan Sidwell wrote:
> On 08/25/15 09:29, Nathan Sidwell wrote:
>
> > I did rename the GOACC_parallel entry point to GOACC_parallel_keyed and
> > provide a forwarding function. However, as the mkoffload data is
> > incompatible, this is probably overkill. I've had to increment the (just
> > committed) version number to detect the change in data representation.
> > So any attempt to run an old binary with a new libgomp will fail at the
> > loading point. We could simply keep the same 'GOACC_parallel' name and
> > not need any new symbols. WDYT?
>
> I'm coming to the conclusion that just keeping the original
> 'GOACC_parallel' name is the way to go. As I said above, we cannot support
> backwards compatibility on the offload data, so the only remaining case is
> someone building an openacc program for running on the host. As I said at
> the cauldron, I think the set of users that cared enough about openacc to
> try gcc 5 but don't care enough to recompile their programs for gcc 6 is
> the empty set.

It is ok if for the GCC 5 compiled programs we always fall back to host,
but IMNSHO we really should keep at least that host fallback working.
We'll have new names for the OpenMP target entry points too.

	Jakub
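The "new entry point plus forwarding function" arrangement being debated here looks roughly like the following. It is a sketch with invented names and a deliberately tiny signature: launch_keyed and launch_legacy are hypothetical, and the real GOACC_parallel entry points take a different and much longer argument list. The old name stays exported and simply runs the region on the host, which is the fallback behaviour asked for in the reply above.

```c
#include <assert.h>

/* Counts how often the new entry point ran (for the illustration).  */
static int keyed_calls;

/* Hypothetical new entry point.  Calling FN directly stands in for
   "execute the region on the host".  */
void
launch_keyed (void (*fn) (void *), void *data)
{
  keyed_calls++;
  fn (data);
}

/* Old name kept as a thin forwarder, so binaries built against the
   previous interface still link and still run on the host.  */
void
launch_legacy (void (*fn) (void *), void *data)
{
  launch_keyed (fn, data);
}

/* Example region body: increments the int that DATA points to.  */
static void
bump (void *data)
{
  ++*(int *) data;
}
```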
Re: [PATCH][GCC] Algorithmic optimization in match and simplify
(not a review, I haven't even read the whole patch)

On Fri, 28 Aug 2015, Andre Vieira wrote:

> 2015-08-03  Andre Vieira
>
> 	* match.pd: Added new patterns:
> 	((X {&,<<,>>} C0) {|,^} C1) {^,|} C2)
> 	(X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1)
>
> +(for op0 (rshift rshift lshift lshift bit_and bit_and)
> + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor)
> + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior)

You can nest for-loops, it seems clearer as:

(for op0 (rshift lshift bit_and)
 (for op1 (bit_ior bit_xor)
      op2 (bit_xor bit_ior)

> +(simplify
> + (op2:c
> +  (op1:c
> +   (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3)

I suspect you will want more :s (single_use) and less :c (canonicalization
should put constants in second position).

> + C1 = wi::bit_and_not (C1,C2);

Space after ','.

Having wide_int_storage in many places is surprising, I can't find similar
code anywhere else in gcc.

--
Marc Glisse
[gomp4.1] WIP: Structure element mapping support
Hi! Here is my current WIP on further structure element mapping support (so, structure element {pointer,reference to pointer,reference to array} based array sections, start of C++ support (still need to add tests for template instantiation and verify it works properly)). I have still pending questions on mapping of references (other than array sections) and structure element references pending, hope they will be responded to soon and will be able to commit this next week. --- gcc/gimplify.c.jj 2015-08-24 14:32:06.0 +0200 +++ gcc/gimplify.c 2015-08-28 19:18:15.551860807 +0200 @@ -6203,6 +6203,7 @@ gimplify_scan_omp_clauses (tree *list_p, struct gimplify_omp_ctx *ctx, *outer_ctx; tree c; hash_map *struct_map_to_clause = NULL; + tree *orig_list_p = list_p; ctx = new_omp_context (region_type); outer_ctx = ctx->outer_context; @@ -6443,13 +6444,31 @@ gimplify_scan_omp_clauses (tree *list_p, } if (!DECL_P (decl)) { + tree d = decl, *pd; + if (TREE_CODE (d) == ARRAY_REF) + { + while (TREE_CODE (d) == ARRAY_REF) + d = TREE_OPERAND (d, 0); + if (TREE_CODE (d) == COMPONENT_REF + && TREE_CODE (TREE_TYPE (d)) == ARRAY_TYPE) + decl = d; + } + pd = &OMP_CLAUSE_DECL (c); + if (d == decl + && TREE_CODE (decl) == INDIRECT_REF + && TREE_CODE (TREE_OPERAND (decl, 0)) == COMPONENT_REF + && (TREE_CODE (TREE_TYPE (TREE_OPERAND (decl, 0))) + == REFERENCE_TYPE)) + { + pd = &TREE_OPERAND (decl, 0); + decl = TREE_OPERAND (decl, 0); + } if (TREE_CODE (decl) == COMPONENT_REF) { while (TREE_CODE (decl) == COMPONENT_REF) decl = TREE_OPERAND (decl, 0); } - if (gimplify_expr (&OMP_CLAUSE_DECL (c), pre_p, -NULL, is_gimple_lvalue, fb_lvalue) + if (gimplify_expr (pd, pre_p, NULL, is_gimple_lvalue, fb_lvalue) == GS_ERROR) { remove = true; @@ -6478,18 +6497,48 @@ gimplify_scan_omp_clauses (tree *list_p, HOST_WIDE_INT bitsize, bitpos; machine_mode mode; int unsignedp, volatilep = 0; - tree base - = get_inner_reference (OMP_CLAUSE_DECL (c), &bitsize, - &bitpos, &offset, &mode, &unsignedp, - &volatilep, 
false); + tree base = OMP_CLAUSE_DECL (c); + while (TREE_CODE (base) == ARRAY_REF) + base = TREE_OPERAND (base, 0); + if (TREE_CODE (base) == INDIRECT_REF) + base = TREE_OPERAND (base, 0); + base = get_inner_reference (base, &bitsize, &bitpos, &offset, + &mode, &unsignedp, + &volatilep, false); gcc_assert (base == decl && (offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST)); splay_tree_node n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl); - if (n == NULL || (n->value & GOVD_MAP) == 0) + bool ptr = (OMP_CLAUSE_MAP_KIND (c) + == GOMP_MAP_FIRSTPRIVATE_POINTER); + if (n == NULL || (n->value & (ptr ? GOVD_PRIVATE + : GOVD_MAP)) == 0) { + if (ptr) + { + tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_PRIVATE); + OMP_CLAUSE_DECL (c2) = decl; + OMP_CLAUSE_CHAIN (c2) = *orig_list_p; + *orig_list_p = c2; + if (struct_map_to_clause == NULL) + struct_map_to_clause = new hash_map; + tree *osc; + if (n == NULL || (n->value & GOVD_MAP) == 0) + osc = NULL; + else + osc = struct_map_to_clause->get (decl); + if (osc == NULL) + struct_map_to_clause->put (decl, + tree_cons (NULL_TREE, + c, NULL_TREE)); + else + *osc = tree_cons (*osc, c, NULL_TREE); + flags = GOVD_PRIVATE | GOVD_
Re: [Patch, libstdc++] Add specific error message into exceptions
On Fri, Aug 28, 2015 at 8:59 AM, Jonathan Wakely wrote: > There seems to be no need to construct a std::string here, just pass a > const char* (see below). To be honest, I wasn't really considering performance, since I already consider exceptions slow :P. But yes, we can do fewer allocations. > I wonder if we want to make this more efficient by adding a private > member to regex_error that would allow information to be appended to > the string, rather than creating a new regex_error with a new string. I can add a helper function to _Scanner that constructs the exception object only once. Functions that can't access this helper can use a return value for error handling. > I suggest adding another overload that takes a const char* rather than > std::string. The reason is that when using the new ABI this function > will take a std::__cxx11::string, so calling it will allocate memory > for the string data, then that string is passed to the regex_error > constructor which has to convert it internally to an old std::string, > which has to allocate a second time. First, to make it clear: due to _M_get_location_string(), we need dynamic allocation. So is it good to have an owned raw pointer stored in runtime_error, pointing to a heap allocated char chunk, which will be deallocated in regex_error's dtor? -- Regards, Tim Shen
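For readers following the thread, here is a minimal sketch of how the message under discussion becomes visible to user code. It uses only the standard std::regex_error interface (code() and what()). The RegexCompileResult struct and try_compile helper are hypothetical names made up for this illustration, and the exact what() text is implementation-specific, so it should not be relied upon.

```cpp
#include <regex>
#include <string>

// Outcome of attempting to compile a pattern (hypothetical helper type).
struct RegexCompileResult {
  bool failed;                            // true if std::regex_error was thrown
  std::regex_constants::error_type code;  // meaningful only when failed is true
  std::string message;                    // copy of what(); wording varies by library
};

// Try to build a std::regex and capture the error details instead of
// letting the exception propagate.
RegexCompileResult try_compile(const std::string& pattern) {
  try {
    std::regex re(pattern);
    (void)re;  // only the success or failure of construction matters here
    return {false, std::regex_constants::error_type(), std::string()};
  } catch (const std::regex_error& e) {
    return {true, e.code(), e.what()};
  }
}
```

A caller could print try_compile("(a").message to see whichever text the library attached to the unbalanced-parenthesis error.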
Re: Openacc launch API
On 08/28/15 13:36, Jakub Jelinek wrote: On Fri, Aug 28, 2015 at 01:29:51PM -0400, Nathan Sidwell wrote: On 08/25/15 09:29, Nathan Sidwell wrote: I did rename the GOACC_parallel entry point to GOACC_parallel_keyed and provide a forwarding function. However, as the mkoffload data is incompatible, this is probably overkill. I've had to increment the (just committed) version number to detect the change in data representation. So any attempt to run an old binary with a new libgomp will fail at the loading point. We could simply keep the same 'GOACC_parallel' name and not need any new symbols. WDYT? I'm coming to the conclusion that just keeping the original 'GOACC_parallel' name is the way to go. As I said above, we cannot support backwards compatibility on the offload data, so the only remaining case is someone building an openacc program for running on the host. As I said at the cauldron, I think the set of users that cared enough about openacc to try gcc 5 but don't care enough to recompile their programs for gcc 6 is the empty set. It is ok if for the GCC 5 compiled programs we always fallback to host, but IMNSHO we really should keep at least that host fallback working. Is that approval for the patch as I posted it? nathan
[wwwdocs] Adjust three links to gccupc.org
...which got broken over time. Applied. Gerald Index: projects/gupc.html === RCS file: /cvs/gcc/wwwdocs/htdocs/projects/gupc.html,v retrieving revision 1.8 diff -u -r1.8 gupc.html --- projects/gupc.html 29 Jun 2014 11:31:33 - 1.8 +++ projects/gupc.html 28 Aug 2015 19:36:52 - @@ -84,7 +84,7 @@ Download The latest release of GUPC can be downloaded from http://www.gccupc.org/downloads/upc-downloads";>gccupc.org. +href="http://www.gccupc.org/download";>gccupc.org. Alternatively, read-only SVN access to the GUPC branch can be used to @@ -97,13 +97,13 @@ For a list of configuration switches that you can use to build GUPC, consult the GUPC -http://gccupc.org/gcc-upc-info/gcc-upc-configuration";> +http://gccupc.org/gnu-upc-info/gnu-upc-install-from-source-code";> configuration page. For a quick summary of the switches used to compile and link a UPC program, consult the GUPC -http://gccupc.org/gcc-upc-info/gcc-upc-man-page";> +http://gccupc.org/gnu-upc-info/gnu-upc-compile-options";> manual page.
jit bit missing?
These seem to be missing? I stenciled this up by copy and pasting… I did try and run the test suite, but you didn’t add a -L$(objdir) to the flags to pick up the library that was built. You incorrectly test the installed library, which has no relation to the library that should be tested. I’ve installed that library, but, it can’t be found without a -L, as I didn’t install it in any standard place. Anyway, here is what I saw: Executing on host: c++ /home/mrs/net/gcc/gcc/testsuite/../jit/docs/examples/tut01-hello-world.cc -fno-diagnostics-show-caret -fdiagnostics-color=never -fmessage-length=0 -I/home/mrs/net/gcc/gcc/testsuite/../jit -lgccjit -g -Wall -Werror -Wl,--export-dynamic -lm -o tut01-hello-world.cc.exe (timeout = 300) spawn c++ /home/mrs/net/gcc/gcc/testsuite/../jit/docs/examples/tut01-hello-world.cc -fno-diagnostics-show-caret -fdiagnostics-color=never -fmessage-length=0 -I/home/mrs/net/gcc/gcc/testsuite/../jit -lgccjit -g -Wall -Werror -Wl,--export-dynamic -lm -o tut01-hello-world.cc.exe /usr/bin/ld: cannot find -lgccjit collect2: error: ld returned 1 exit status compiler exited with status 1 I hacked in a -L to get the built library found and ran the jit testsuite, no failures. The test suite at -j30 is, well, lopsided. 2 that take forever and the rest finish pretty quickly. Ok? jit-1.diffs Description: Binary data
Re: [PING] Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function
The second question is about your suggestion to consolidate the code into mark_rvalue_use. The problem I'm running into there is that mark_rvalue_use is called for calls to builtins as well as for other uses and doesn't have enough context to tell one from the other. Ah, true. But special-casing call uses is still fewer places than special-casing all non-call uses. Sorry it's taken me so long to get back to this. Changing the patch to issue the diagnostic in mark_rvalue_use instead of anywhere in expr.c or call.c caused a false positive for calls to builtin functions, and a number of false negatives (e.g., for the address-of expression and for reinterpret_cast(builtin)). To avoid the false positive I added a new argument to both mark_rvalue_use and decay_conversion (which calls mark_rvalue_use when called from build_addr_func) to avoid diagnosing calls to builtins. To avoid the false negatives, we still need to retain some calls to cp_reject_gcc_builtin from functions other than mark_rvalue_use. I also removed the DECL_IS_GCC_BUILTIN macro introduced in the first patch and replaced its uses with a somewhat simplified expression that's the same between the C and C++ front ends. Finally, I merged the c_ and cp_reject_gcc_builtin functions into a single reject_gcc_builtin function and simplified the logic to avoid testing for operators new and delete by name. I removed from the C++ version the checks for the type substitution flags since (IIUC) they don't come into play here (all uses of the builtins can be diagnosed, regardless of the type substitution context). I ran into one regression in the gcc.dg/lto/pr54702_1.c test. The file takes the address of malloc without declaring it, and after calling it first. The code is invalid but GCC compiles it due to a bug. I raised it in c/67386 -- missing diagnostic on a use of an undeclared function, and suppressed the error by adding a declaration to the test. 
I mention it here because the error that GCC issues otherwise (without the declaration) is one about __builtin_malloc not being directly called. I'm sure the error will change to "'malloc' undeclared" once PR67386 is fixed but until then, in this case, the error is rather cryptic. I haven't spent too much time trying to fix it (it seems to have to do with the undeclared malloc being treated as a builtin without a library fallback). Let me know if this is closer to what you were suggesting or if you would like to see some other changes (or prefer the original approach). I tested the patch by bootstrapping on x86_64 and on powerpc64le and running tests on the latter. I see the following difference in the test results Thanks Martin gcc/ChangeLog 2015-08-27 Martin Sebor PR c/66516 * doc/extend.texi (Other Builtins): Document when the address of a builtin function can be taken. gcc/c-family/ChangeLog 2015-08-27 Martin Sebor PR c/66516 * c-common.h (reject_gcc_builtin): Declare new function. * c-common.c (reject_gcc_builtin): Define it. gcc/c/ChangeLog 2015-08-27 Martin Sebor PR c/66516 * c/c-typeck.c (convert_arguments, parser_build_unary_op) (build_conditional_expr, c_cast_expr, convert_for_assignment) (build_binary_op, _objc_common_truthvalue_conversion): Call reject_gcc_builtin. gcc/cp/ChangeLog 2015-08-27 Martin Sebor PR c/66516 * cp/cp-tree.h (mark_rvalue_use, decay_conversion): Add new argument(s). * cp/expr.c (mark_rvalue_use): Use new argument. * cp/call.c (build_addr_func): Call decay_conversion with new argument. * cp/pt.c (convert_template_argument): Call reject_gcc_builtin. * cp/typeck.c (decay_conversion): Use new argument. gcc/testsuite/ChangeLog 2015-08-27 Martin Sebor PR c/66516 * g++.dg/addr_builtin-1.C: New test. * gcc.dg/addr_builtin-1.c: New test. * gcc.dg/lto/pr54702_1.c: Declare malloc before taking its address. 
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 7691035..8fda350 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -12882,4 +12882,38 @@ pointer_to_zero_sized_aggr_p (tree t) return (TYPE_SIZE (t) && integer_zerop (TYPE_SIZE (t))); } +/* For an EXPR of a FUNCTION_TYPE that references a GCC built-in function + with no library fallback or for an ADDR_EXPR whose operand is such type + issues an error pointing to the location LOC. + Returns true when the expression has been diagnosed and false + otherwise. */ +bool +reject_gcc_builtin (const_tree expr, location_t loc /* = UNKNOWN_LOCATION */) +{ + if (TREE_CODE (expr) == ADDR_EXPR) +expr = TREE_OPERAND (expr, 0); + + if (TREE_TYPE (expr) + && TREE_CODE (TREE_TYPE (expr)) == FUNCTION_TYPE + && DECL_P (expr) + /* The intersection of DECL_BUILT_IN and DECL_IS_BUILTIN avoids + false positives for user-declared built-ins such as abs or + strlen, and for C++ operators new and delete. */ + && DECL_BUILT_IN (expr) + && DECL_IS_BUILTIN (expr) + && !DECL_ASSEMBLER_NAME_SET_P (expr)) +{ +
Re: [PING] Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function
On Fri, 28 Aug 2015, Martin Sebor wrote: > I ran into one regression in the gcc.dg/lto/pr54702_1.c test. > The file takes the address of malloc without declaring it, and > after calling it first. The code is invalid but GCC compiles it > due to a bug. I raised it in c/67386 -- missing diagnostic on > a use of an undeclared function, and suppressed the error by But that PR isn't a bug - the code is working exactly as it's meant to (an implicit declaration acts exactly like an explicit declaration "int func ();" in the nearest containing scope). The declaration has an incompatible type, it's true, but GCC deliberately allows that with a warning. What if (a) you use a built-in function that returns int, instead of malloc, and (b) use -std=gnu89, so the implicit declaration isn't even an extension? Then you have something that's completely valid, including taking the address of the implicitly declared function. -- Joseph S. Myers jos...@codesourcery.com
[fortran,committed] Emit direct calls to malloc() and free()
For the MALLOC and FREE Cray pointer-related intrinsics, we used to emit calls to _gfortran_malloc and _gfortran_free, which would in turn call the libc routines. The attached patch makes us directly emit the calls to the BUILT_IN_MALLOC and BUILT_IN_FREE. Committed as revision 227311 after regtesting on x86_64-apple-darwin15. I have updated the wiki ABI cleanup page. FX z.diff Description: Binary data
Re: [PING] Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function
On 08/28/2015 02:09 PM, Joseph Myers wrote: On Fri, 28 Aug 2015, Martin Sebor wrote: I ran into one regression in the gcc.dg/lto/pr54702_1.c test. The file takes the address of malloc without declaring it, and after calling it first. The code is invalid but GCC compiles it due to a bug. I raised it in c/67386 -- missing diagnostic on a use of an undeclared function, and suppressed the error by But that PR isn't a bug - the code is working exactly as it's meant to (an implicit declaration acts exactly like an explicit declaration "int func ();" in the nearest containing scope). The declaration has an incompatible type, it's true, but GCC deliberately allows that with a warning. What if (a) you use a built-in function that returns int, instead of malloc, and (b) use -std=gnu89, so the implicit declaration isn't even an extension? Then you have something that's completely valid, including taking the address of the implicitly declared function. In that case the patched GCC issues an error for taking the address of the undeclared function as the test case below shows. I was aware of the C90 implicit declaration rule but I interpreted it as saying that the injected declaration is only in effect for the call expression. Since no other tests broke, I assumed the one that did was buggy. Anyway, after testing a few other compilers it looks like they all also extend the implicit declaration through the rest of the scope, so the patch will need further tweaking to allow this corner case. The problem is that DECL_IS_BUILTIN(expr) returns true for an implicitly declared builtin function with a library fallback but false for one that's been declared explicitly. I'll either have to find some other test to determine that the implicitly declared function has a fallback or fix whatever is causing the macro to return the wrong value. 
Martin

$ cat t.c && /build/gcc-66516/gcc/xgcc -B /build/gcc-66516/gcc -std=gnu89 t.c
int (*p)(int);

void foo (void) {
    p = abs;
}

int bar (void) {
    int n = abs (0);
    p = abs;
    return n;
}
t.c: In function ‘foo’:
t.c:4:9: error: ‘abs’ undeclared (first use in this function)
     p = abs;
         ^
t.c:4:9: note: each undeclared identifier is reported only once for each function it appears in
t.c: In function ‘bar’:
t.c:9:5: error: builtin function ‘abs’ must be directly called
     p = abs;
     ^
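For contrast with the gnu89 corner case above, the following sketch shows the uncontroversial situation in which a declaration is visible: abs then names an ordinary library function whose address may be taken. This is C++ rather than the C of the test case, and apply_abs is a hypothetical name chosen for the example; it is not taken from the patch or its tests.

```cpp
#include <cstdlib>

// With <cstdlib> included, std::abs is explicitly declared and has a library
// definition to fall back on, so forming a pointer to it is valid.
int apply_abs(int value) {
  int (*fn)(int) = std::abs;  // the target type selects the int overload
  return fn(value);           // indirect call through the function pointer
}
```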
Re: [PATCH, libbacktrace] SPU does not support __sync or __atomic
On Fri, Aug 28, 2015 at 9:54 AM, Ulrich Weigand wrote: > > this is the (hopefully) last compatibility problem with libbacktrace on SPU: > we do not have either the __sync or the __atomic routines (since the SPU > is a fundamentally single-threaded target). I guess I don't understand. These are GCC intrinsic functions that ought to be supported on every target. There are many processors other than the SPU that are fundamentally single-threaded. That doesn't mean they can't support these functions; it just means that their implementation is trivial. When I look at the docs for the __sync and __atomic functions I don't see anything saying "these functions are only available on some targets." Ian
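To illustrate the point that a single-threaded implementation of these operations is trivial, here is a rough sketch. The first function uses GCC's real __sync_bool_compare_and_swap builtin; the second is a hypothetical plain-code equivalent that would only be correct on a target where no other thread or interrupt handler can touch the variable. Neither function is taken from libbacktrace or from any SPU port.

```cpp
// Atomically set *lock to 1 if it is currently 0, using the GCC builtin.
inline bool acquire_with_builtin(int* lock) {
  return __sync_bool_compare_and_swap(lock, 0, 1);
}

// Hypothetical fallback: with exactly one thread of execution nothing can
// interleave between the test and the store, so ordinary loads and stores
// give the same observable result.
inline bool acquire_single_threaded(int* lock) {
  if (*lock != 0)
    return false;  // already held
  *lock = 1;
  return true;
}
```

On a multi-threaded target only the first version is safe; the second merely shows what the operation reduces to when concurrency is impossible.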
Re: [PATCH, libbacktrace] SPU does not support fcntl
On Fri, Aug 28, 2015 at 9:53 AM, Ulrich Weigand wrote: > > * configure.ac: For spu-*-* targets, set have_fcntl to no. > * configure: Regenerate. This is OK. Thanks. Ian
Re: [PATCH] Fix c++/67371 (issues with throw in constexpr)
On 08/28/2015 08:00 AM, Markus Trippelsdorf wrote: As PR67371 shows, gcc currently rejects all throw statements in constant-expressions, even when they are never executed. Fix by simply allowing THROW_EXPR in potential_constant_expression_1. One drawback is that we now accept some ill-formed cases, but they fall under the "no diagnostic required" rule in the standard, e.g.: I think we can do better. The handling of IF_STMT in potential_constant_expression_1 currently returns false if either the then or the else clause is problematic, but instead it should return true if either of them is OK (or empty). We could try to analyze the body of a SWITCH_STMT more closely, but if you don't want to try it's fine if we just assume that the body is OK. Jason
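The kind of code at issue can be sketched as follows, assuming a compiler in C++14 mode or later that includes a fix along these lines. The function name require_positive is hypothetical and not the PR's actual test case. The throw statement sits in a branch that is not taken for a positive argument, so the call can still be a constant expression.

```cpp
// Needs -std=c++14 or later: statements such as if are allowed in
// constexpr function bodies only since C++14.
constexpr int require_positive(int x) {
  if (x <= 0)
    throw 1;  // reached only for non-positive arguments
  return x;
}

// Evaluated at compile time; the throw is never executed for x == 3.
static_assert(require_positive(3) == 3, "usable in a constant expression");
```

Calling require_positive(-1) at run time still throws, and using it where a constant expression is required is rejected, because that evaluation does reach the throw.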
Re: [Patch, libstdc++] Add specific error message into exceptions
On Fri, Aug 28, 2015 at 11:23 AM, Tim Shen wrote: > So is it good to have an owned raw pointer stored in runtime_error, > pointing to a heap allocated char chunk, which will be deallocated in > regex_error's dtor? I just put a string member into regex_error, completely ignoring the storage in std::runtime_error. Also used rethrow to keep stack frames. -- Regards, Tim Shen commit 36e7845b251eb1b2eeea76e22264acad1cab6355 Author: Tim Shen Date: Thu Aug 27 21:42:40 2015 -0700 PR libstdc++/67361 * include/bits/regex_error.h: Add __throw_regex_error that supports string. * include/bits/regex_automaton.h: Add more specific exception messages. * include/bits/regex_automaton.tcc: Likewise. * include/bits/regex_compiler.h: Likewise. * include/bits/regex_compiler.tcc: Likewise. * include/bits/regex_scanner.h: Likewise. * include/bits/regex_scanner.tcc: Likewise. diff --git a/libstdc++-v3/include/bits/regex_automaton.h b/libstdc++-v3/include/bits/regex_automaton.h index b6ab307..1f672ee 100644 --- a/libstdc++-v3/include/bits/regex_automaton.h +++ b/libstdc++-v3/include/bits/regex_automaton.h @@ -327,7 +327,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { this->push_back(std::move(__s)); if (this->size() > _GLIBCXX_REGEX_STATE_LIMIT) - __throw_regex_error(regex_constants::error_space); + __throw_regex_error( + regex_constants::error_space, + "Number of NFA states exceeds limit. 
Please use shorter regex " + "string, or use smaller brace expression, or make " + "_GLIBCXX_REGEX_STATE_LIMIT larger."); return this->size()-1; } diff --git a/libstdc++-v3/include/bits/regex_automaton.tcc b/libstdc++-v3/include/bits/regex_automaton.tcc index cecc407..4eeeac5 100644 --- a/libstdc++-v3/include/bits/regex_automaton.tcc +++ b/libstdc++-v3/include/bits/regex_automaton.tcc @@ -149,7 +149,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _NFA<_TraitsT>::_M_insert_backref(size_t __index) { if (this->_M_flags & regex_constants::__polynomial) - __throw_regex_error(regex_constants::error_complexity); + __throw_regex_error(regex_constants::error_complexity, + "Unexpected back-reference in polynomial mode."); // To figure out whether a backref is valid, a stack is used to store // unfinished sub-expressions. For example, when parsing // "(a(b)(c\\1(d)))" at '\\1', _M_subexpr_count is 3, indicating that 3 @@ -158,10 +159,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // _M_paren_stack is {1, 3}, for incomplete "(a.." and "(c..". At this // time, "\\2" is valid, but "\\1" and "\\3" are not. 
if (__index >= _M_subexpr_count) - __throw_regex_error(regex_constants::error_backref); + __throw_regex_error( + regex_constants::error_backref, + "Back-reference index exceeds current sub-expression count."); for (auto __it : this->_M_paren_stack) if (__index == __it) - __throw_regex_error(regex_constants::error_backref); + __throw_regex_error( + regex_constants::error_backref, + "Back-reference referred to an opened sub-expression."); this->_M_has_backref = true; _StateT __tmp(_S_opcode_backref); __tmp._M_backref_index = __index; diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h index 0cb0c04..da44d42 100644 --- a/libstdc++-v3/include/bits/regex_compiler.h +++ b/libstdc++-v3/include/bits/regex_compiler.h @@ -397,7 +397,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION auto __st = _M_traits.lookup_collatename(__s.data(), __s.data() + __s.size()); if (__st.empty()) - __throw_regex_error(regex_constants::error_collate); + __throw_regex_error(regex_constants::error_collate, + "Invalid collate element."); _M_char_set.push_back(_M_translator._M_translate(__st[0])); #ifdef _GLIBCXX_DEBUG _M_is_ready = false; @@ -411,7 +412,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION auto __st = _M_traits.lookup_collatename(__s.data(), __s.data() + __s.size()); if (__st.empty()) - __throw_regex_error(regex_constants::error_collate); + __throw_regex_error(regex_constants::error_collate, + "Invalid equivalence class."); __st = _M_traits.transform_primary(__st.data(), __st.data() + __st.size()); _M_equiv_set.push_back(__st); @@ -428,7 +430,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __s.data() + __s.size(), __icase); if (__mask == 0) - __throw_regex_error(regex_constants::error_ctype); + __throw
Re: [c++-delayed-folding] fold_simple
On 08/27/2015 05:21 AM, Kai Tietz wrote: 2015-08-27 4:57 GMT+02:00 Jason Merrill : Why does fold_simple fold so many patterns? I thought we wanted something that would just fold conversions and negations of constant values. Yes, the initial variant handled far fewer patterns. But we actually need, for functions such as build_vec_init in init.c, a simple routine to perform basic constant-value arithmetic (sizeof * / + - trunc, etc.) to avoid calling maybe_constant_value. Also, for overflow diagnostics we want at least to resolve such simple patterns for constant values only. We could change those calls to use maybe_constant_value instead, but the overhead (and some of its folding) goes much further than working on constant values only (as fold_simple does). For build_vec_init, since whether maxindex is constant has semantic meaning, I think we want maybe_constant_value. I think we also want it for overflow warnings, to get better diagnostics. Jason
[patch, libgfortran] PR67367 Program crashes on READ
I found that in buf_read, where raw_read is called, no checks for errors were being made. raw_read returns the number of bytes read or an error code. In the test case, an error occurs and we proceed to use the resulting error code as if it were the number of bytes read. The attached patch fixes this. Regression tested on x86_64. New test case provided. OK for trunk? Regards, Jerry 2015-08-28 Jerry DeLisle PR libgfortran/67367 * io/unix.c (buf_read): Check for error condition and if found return the error code. Index: unix.c === --- unix.c (revision 227314) +++ unix.c (working copy) @@ -529,16 +529,26 @@ buf_read (unix_stream * s, void * buf, ssize_t nby if (to_read <= BUFFER_SIZE/2) { did_read = raw_read (s, s->buffer, BUFFER_SIZE); - s->physical_offset += did_read; - s->active = did_read; - did_read = (did_read > to_read) ? to_read : did_read; - memcpy (p, s->buffer, did_read); + if (likely (did_read >= 0)) + { + s->physical_offset += did_read; + s->active = did_read; + did_read = (did_read > to_read) ? to_read : did_read; + memcpy (p, s->buffer, did_read); + } + else + return did_read; } else { did_read = raw_read (s, p, to_read); - s->physical_offset += did_read; - s->active = 0; + if (likely (did_read >= 0)) + { + s->physical_offset += did_read; + s->active = 0; + } + else + return did_read; } nbyte = did_read + nread; } ! { dg-do run } ! PR67367 program bug implicit none character(len=1) :: c character(len=256) :: message integer ios call system('[ -d junko.dir ] || mkdir junko.dir') open(unit=10, file='junko.dir',iostat=ios,action='read',access='stream') if (ios.ne.0) call abort read(10, iostat=ios) c if (ios.ne.21) call abort call system('rmdir junko.dir') end program bug
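The pattern the patch applies can be sketched outside libgfortran as follows. This is an illustration only: StreamState and read_checked are hypothetical names, and POSIX read() stands in for raw_read, so the error is signalled by a negative return value here rather than by whatever convention raw_read uses internally. The point is simply that the result must be checked before it is used as a byte count.

```cpp
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stand-in for the bookkeeping fields updated in the patch.
struct StreamState {
  off_t physical_offset;  // how far the underlying descriptor has advanced
  ssize_t active;         // number of valid bytes currently buffered
};

// Read up to `want` bytes into `buf`. On failure, return the negative result
// unchanged and leave the bookkeeping untouched, instead of treating an
// error value as a byte count.
ssize_t read_checked(int fd, StreamState* s, char* buf, std::size_t want) {
  ssize_t did_read = read(fd, buf, want);
  if (did_read < 0)
    return did_read;  // propagate the error to the caller
  s->physical_offset += did_read;
  s->active = did_read;
  return did_read;
}
```

Without the check, a failed read would move physical_offset backwards and leave active negative, which is the sort of corruption the original report describes.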