Re: [PATCH] PowerPC VLE port
On 09/08/12 00:52, David Edelsohn wrote:
> This patch contains a lot of unnecessary, gratuitous changes in addition to being very invasive. It was not edited and cleaned sufficiently before posting. It has too much of a negative impact on the current PowerPC port.
>
> The patch is not going to be accepted in its current form.

Could you explain in more detail what you find unsatisfactory? Thanks.

nathan
--
Nathan Sidwell
Re: [AArch64] allow 16 bytes constants in constant pool.
On 06/09/12 15:30, Marcus Shawcroft wrote:
> Relax the logic that prevents TFmode constants being addressed in the constant pool.
>
> 2012-09-06 Marcus Shawcroft
>
> * config/aarch64/aarch64.c (aarch64_classify_address): Allow 16 byte modes in constant pool.

I've just committed this patch to the aarch64 branch.

/Marcus
Re: [AArch64] support long double exceptions and rounding mode
On 06/09/12 15:34, Marcus Shawcroft wrote:
> Enable the raising of exceptions in long double soft float and add support for rounding modes.
>
> 2012-09-06 Marcus Shawcroft
>
> * config/aarch64/sfp-machine.h (FP_EX_INVALID, FP_EX_DIVZERO)
> (FP_EX_OVERFLOW, FP_EX_UNDERFLOW, FP_EX_INEXACT)
> (FP_HANDLE_EXCEPTIONS, FP_RND_NEAREST, FP_RND_ZERO, FP_RND_PINF)
> (FP_RND_MINF, _FP_DECL_EX, FP_INIT_ROUNDMODE, FP_ROUNDMODE): New.

I've just committed this patch to the aarch64 branch.

/Marcus
Re: [AArch64] Implement section anchors
James, I've committed your patch. /Marcus On 6 September 2012 15:38, Richard Earnshaw wrote: > On 06/09/12 15:31, James Greenhalgh wrote: >> >> Hi, >> >> This patch implements section anchors for the AArch64 port. >> >> OK for aarch64-branch? >> >> Regards, >> James Greenhalgh >> >> -- >> 2012-09-06 James Greenhalgh >> Richard Earnshaw >> >> * common/config/aarch64/aarch64-common.c >> (aarch_option_optimization_table): New. >> (TARGET_OPTION_OPTIMIZATION_TABLE): Define. >> * gcc/config.gcc ([aarch64] target_has_targetm_common): Set to yes. >> * gcc/config/aarch64/aarch64-elf.h (ASM_OUTPUT_DEF): New definition. >> * gcc/config/aarch64/aarch64.c (TARGET_MIN_ANCHOR_OFFSET): Define. >> (TARGET_MAX_ANCHOR_OFFSET): Likewise. >> >> > > OK. > > R. >
Re: [PATCH, libstdc++] Add proper OpenBSD support
On 10 September 2012 07:34, Mark Kettenis wrote:
>> Date: Sun, 9 Sep 2012 21:07:39 +0100
>> From: Jonathan Wakely
>>
>> On 4 September 2012 20:26, Mark Kettenis wrote:
>> > Fixes a few testcases. Mostly based on the existing
>> > NetBSD/FreeBSD/Darwin code.
>> >
>> > 2012-09-04 Mark Kettenis
>> >
>> > * configure.host (*-*-openbsd*) Set cpu_include_dir.
>> > * config/os/bsd/openbsd/ctype_base.h: New file.
>> > * config/os/bsd/openbsd/ctype_configure_char.cc: New file.
>> > * config/os/bsd/openbsd/ctype_inline.h: New file.
>> > * config/os/bsd/openbsd/os_defines.h: New file.
>>
>> This patch is OK, thanks. Do you want me to commit it for you?
>
> Yes please.

It occurs to me now that the patch changes the size of ctype_base::mask, from the generic unsigned to char. I assume the OpenBSD system compiler uses char? How long has that change been present in the OpenBSD source tree?

I'm not sure whether or not it's better to change the size of that type in GCC 4.8, which would break compatibility with previous versions of the FSF sources but provide compatibility with the OpenBSD system compiler. My guess would be that most people on OpenBSD are using the system compiler, not upstream FSF sources.

>> It shouldn't stop the patch going in, but I assume that this test
>> fails on OpenBSD even with your patch applied?
>>
>> #include <locale>
>> #include <cassert>
>>
>> class gnu_ctype: public std::ctype<wchar_t> { };
>>
>> int main()
>> {
>> gnu_ctype gctype;
>>
>> assert(gctype.is(std::ctype_base::xdigit, L'a'));
>> }
>
> Interestingly enough, it doesn't fail without my diff. But it does
> fail for OpenBSD's system compiler (GCC 4.2.1 with a lot of local
> modifications). As far as I can determine this is the result of
> ctype_base::mask being an 8-bit integer type which doesn't go well
> with the generic ctype_members.cc implementation. Probably need to
> have an OpenBSD-specific implementation just like newlib. Looking
> into that now.
See http://gcc.gnu.org/PR51772 (the original description gets the cause wrong, see comment 3 for the real problem)
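To make the concern about an 8-bit ctype_base::mask concrete, here is a minimal sketch of how a classification bit can be lost when the mask type is narrowed. The bit values and the names kDigit, kXDigit and survives are invented for illustration; they are not the constants used by libstdc++ or OpenBSD, and this is not a reproduction of the PR51772 failure. It only shows the general truncation hazard when generic code assumes a wider mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification bits, NOT the real libstdc++/OpenBSD values.
// Assumption: the generic code wants at least one bit above bit 7.
constexpr unsigned kDigit  = 1u << 2;
constexpr unsigned kXDigit = 1u << 8;   // does not fit in an 8-bit mask

using wide_mask   = unsigned;           // the generic mask type
using narrow_mask = std::uint8_t;       // a char-sized mask type

// Store `classification` in a value of type Mask (which may truncate it)
// and report whether `bit` is still set afterwards.
template <typename Mask>
constexpr bool survives(unsigned classification, unsigned bit)
{
    return (static_cast<unsigned>(static_cast<Mask>(classification)) & bit) != 0;
}

static_assert(survives<wide_mask>(kDigit | kXDigit, kXDigit), "kept in unsigned");
static_assert(!survives<narrow_mask>(kDigit | kXDigit, kXDigit), "lost in 8 bits");
static_assert(survives<narrow_mask>(kDigit | kXDigit, kDigit), "low bits still fit");
```

With a narrow mask, a query for the high bit can never succeed, which is the kind of mismatch a target-specific implementation would have to avoid.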
LTO partitioning reorg 3/n - remove some hacks and handle vars/functions more regularly
Hi,
this patch makes variable and cgraph handling more alike so that code can be shared in the future. The basic idea is to sort symbols into three categories:

1) external symbols, which go only into the boundary, and only if they are used;
2) partitioned symbols, which go into one partition chosen by the particular partitioning algorithm;
3) non-partitioned symbols (COMDATs that are not keyed by the C++ ABI and not used by object files, weakrefs, constant pool). These symbols are output only if used and are duplicated into every partition that uses them.

1) is easy to identify by DECL_EXTERNAL_P, for 2) we have partition_symbol_p, and 3) is the case where the first two tests fail. I will clean up the APIs in a followup.

The purpose of this patch is to remove hacks that have accumulated in the code as it evolved and that I believe are not needed (or rather were fixing symptoms rather than bugs). One was the handling of COMDAT, where I had managed to convince myself that I need to ship it into every partition even if it is keyed by the C++ ABI. This should not be true. The second is the somewhat convoluted handling of aliases. This comes from a time when we made no distinction between aliases and weakrefs but still had the non-same-body alias path around.

I tested the patch by bootstrapping and regtesting on x86_64-linux and also by compiling mozilla/Qt/webkit with LTO. Committed.

Honza

	* lto-cgraph.c (compute_ltrans_boundary): Do not care about aliases.
	* lto-partition.c (partition_symbol_p): Forward declare.
	(add_references_to_partition): Reimplement using partition_symbol_p.
	(add_aliases_to_partition): Break out from add_references_to_partition;
	reimplement using partition_symbol_p.
	(add_cgraph_node_to_partition_1): Handle callees using partition_symbol_p;
	add sanity checks.
	(add_varpool_node_to_partition): Use add_aliases_to_partition.
	(partition_varpool_node_p): Do not special case aliases.
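The three-way categorization described above can be sketched as follows. This is only an illustration: Symbol, SymbolClass, classify and add_on_reference are hypothetical names, and the two boolean fields stand in for the answers that the external-declaration test and partition_symbol_p would give in GCC.

```cpp
#include <cassert>

// Hypothetical stand-in for a symbol table entry; the fields are
// illustrative and do not correspond to GCC's symtab_node.
struct Symbol {
    bool is_external;     // declared here, defined elsewhere
    bool is_partitioned;  // what a partition_symbol_p-style test would answer
};

enum class SymbolClass { External, Partitioned, Duplicated };

// Category 3 is simply "neither of the first two tests holds".
constexpr SymbolClass classify(const Symbol &s)
{
    return s.is_external    ? SymbolClass::External
         : s.is_partitioned ? SymbolClass::Partitioned
                            : SymbolClass::Duplicated;
}

// Only duplicated symbols are pulled into a partition when something in
// that partition references them, and only if they are not there already.
constexpr bool add_on_reference(const Symbol &s, bool already_in_partition)
{
    return classify(s) == SymbolClass::Duplicated && !already_in_partition;
}

static_assert(classify(Symbol{true, false}) == SymbolClass::External, "");
static_assert(classify(Symbol{false, true}) == SymbolClass::Partitioned, "");
static_assert(classify(Symbol{false, false}) == SymbolClass::Duplicated, "");
```

The add_on_reference helper has the same shape as the early `continue` in the patched add_references_to_partition: external, partitioned, and already-present symbols are skipped, everything else is added.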
Index: lto-cgraph.c === --- lto-cgraph.c(revision 191113) +++ lto-cgraph.c(working copy) @@ -730,8 +730,6 @@ compute_ltrans_boundary (lto_symtab_enco lto_set_symtab_encoder_encode_initializer (encoder, vnode); add_references (encoder, &vnode->symbol.ref_list); } - else if (vnode->alias || vnode->alias_of) - add_references (encoder, &vnode->symbol.ref_list); } } Index: lto/lto-partition.c === --- lto/lto-partition.c (revision 191113) +++ lto/lto-partition.c (working copy) @@ -35,6 +35,7 @@ VEC(ltrans_partition, heap) *ltrans_part static void add_cgraph_node_to_partition (ltrans_partition part, struct cgraph_node *node); static void add_varpool_node_to_partition (ltrans_partition part, struct varpool_node *vnode); +static bool partition_symbol_p (symtab_node node); /* Create new partition with name NAME. */ static ltrans_partition @@ -62,8 +63,8 @@ free_ltrans_partitions (void) VEC_free (ltrans_partition, heap, ltrans_partitions); } -/* See all references that go to comdat objects and bring them into partition too. - Also see all aliases of the newly added entry and bring them, too. */ +/* Add all referenced symbols referenced by REFS that are not external and not + partitioned into PART. 
*/ static void add_references_to_partition (ltrans_partition part, struct ipa_ref_list *refs) { @@ -71,46 +72,38 @@ add_references_to_partition (ltrans_part struct ipa_ref *ref; for (i = 0; ipa_ref_list_reference_iterate (refs, i, ref); i++) { - if (symtab_function_p (ref->referred) - && (DECL_COMDAT (cgraph_function_node (ipa_ref_node (ref), - NULL)->symbol.decl) - || (ref->use == IPA_REF_ALIAS - && lookup_attribute - ("weakref", DECL_ATTRIBUTES (ref->referred->symbol.decl - && !lto_symtab_encoder_in_partition_p (part->encoder, ref->referred)) + if (DECL_EXTERNAL (ref->referred->symbol.decl) + || partition_symbol_p (ref->referred) + || lto_symtab_encoder_in_partition_p (part->encoder, ref->referred)) + continue; + if (symtab_function_p (ref->referred)) add_cgraph_node_to_partition (part, ipa_ref_node (ref)); else -if (symtab_variable_p (ref->referred) - && (DECL_COMDAT (ref->referred->symbol.decl) - || DECL_EXTERNAL (ref->referred->symbol.decl) - || (ref->use == IPA_REF_ALIAS - && lookup_attribute -("weakref", - DECL_ATTRIBUTES (ref->referred->symbol.decl - && !lto_symtab_encoder_in_partition_p (part->encoder, ref->referred)) - add_varpool_node_to_partition (part, ipa_ref_varpool_node (ref)); + add_varpool_node_to_partition (part, ipa_ref_varpool_node
Re: [patch] rewrite another failing data race test (gcc.dg/pr52558-2.c)
On Fri, Sep 7, 2012 at 8:05 PM, Aldy Hernandez wrote:
> This is the same thing as gcc.dg/pr52558-1.c, but in this case I had to
> tweak the testcase a bit because optimization passes after LIM are smart
> enough to remove the condition altogether, thus never triggering the test.
> Interestingly, GCC can figure out what's going on when the condition is
> "l < 1234", but not when it is "l != 4".
>
> Luckily, the original PR (PR52558) was testing "l != 4", so now the test
> looks exactly like what the PR writer had.
>
> Tested on x86-64 Linux by running with and without --param
> allow-store-data-races=0, and by visual inspection of the assembly.
>
> OK?

Ok.

Thanks,
Richard.
Re: [patch] fix gcc.dg/tm/reg-promotion.c
On Fri, Sep 7, 2012 at 8:20 PM, Aldy Hernandez wrote: > This is a bit different, in that we don't currently have an infrastructure > to test transactional memory code within the simulate-thread framework. > Luckily for this test, the LIM pass has an actual dump message when it fails > to hoist a value due to its presence in a transaction. > > Eventually/ideally, we should have a mechanism for testing races of > transactionally executed code. > > OK? Ok. Thanks, Richard.
Re: [PATCH] Fix PR54515
On Fri, Sep 7, 2012 at 11:01 PM, Markus Trippelsdorf wrote: > Here the problem is that get_base_address() can return NULL_TREE and > this later leads to a segfault. Fix by checking that the return value is > valid. > gcc-4.6 and 4.7 are also affected. > > Please commit if this looks OK. > Thanks. Hmm, we call the function on VIEW_CONVERT_EXPR(0)[0] which should have been folded to a constant. And get_base_address should just return the constant tree instead of returning NULL (it does return a plethora of base object kinds already). Your patch looks ok for the branches where I'll install it and come up with sth else for trunk. Thanks, Richard. > Tested on x86_64-pc-linux-gnu > > 2012-09-07 Markus Trippelsdorf > > PR middle-end/54515 > * tree-sra.c (disqualify_base_of_expr): Check for possible > NULL_TREE returned by get_base_address() > > * g++.dg/tree-ssa/pr54515.C: new testcase > > diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr54515.C > b/gcc/testsuite/g++.dg/tree-ssa/pr54515.C > new file mode 100644 > index 000..11ed468 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/tree-ssa/pr54515.C > @@ -0,0 +1,19 @@ > +// { dg-do compile } > +// { dg-options "-O2" } > + > +template < typename T > T h2le (T) > +{ > +T a; > +unsigned short &b = a; > +short c = 0; > +unsigned char (&d)[2] = reinterpret_cast < unsigned char (&)[2] > (c); > +unsigned char (&e)[2] = reinterpret_cast < unsigned char (&)[2] > (b); > +e[0] = d[0]; > +return a; > +} > + > +void > +bar () > +{ > +h2le ((unsigned short) 0); > +} > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index aafaa15..2bb92e9 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -984,7 +984,8 @@ static void > disqualify_base_of_expr (tree t, const char *reason) > { >t = get_base_address (t); > - if (sra_mode == SRA_MODE_EARLY_IPA > + if (t > + && sra_mode == SRA_MODE_EARLY_IPA >&& TREE_CODE (t) == MEM_REF) > t = get_ssa_base_param (TREE_OPERAND (t, 0)); > > -- > Markus
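The shape of the fix in the tree-sra.c hunk above, testing the returned pointer before looking at its code, can be sketched in isolation. Node, Code, find_base and resolve are hypothetical names for illustration only; they are not GCC's tree API.

```cpp
#include <cassert>

// Hypothetical miniature of a tree node; not GCC's `tree`.
enum class Code { MemRef, Decl };

struct Node {
    Code  code;
    Node *base;   // may be null when there is nothing underneath
};

// Stand-in for a lookup that can fail, as get_base_address can.
Node *find_base(Node *n)
{
    return n ? n->base : nullptr;
}

// Returns the node that would be disqualified, or nullptr if there is none.
Node *resolve(Node *n, bool early_ipa)
{
    Node *t = find_base(n);
    // `t &&` comes first: && short-circuits, so t->code is never
    // evaluated when the lookup returned null.
    if (t && early_ipa && t->code == Code::MemRef)
        t = t->base;   // look through the memory reference
    return t;
}
```

Putting the null test first in the condition is what makes the later `t->code` access safe; callers still have to cope with a null result.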
Re: [PATCH] Combine location with block using block_locations
On Sun, Sep 9, 2012 at 12:26 AM, Dehao Chen wrote: > Hi, Diego, > > Thanks a lot for the review. I've updated the patch. > > This patch is large and may easily break builds because it reserves > more complete information for TREE_BLOCK as well as gimple_block (may > trigger bugs that was hided when these info are unavailable). I've > done more rigorous testing to ensure that most bugs are caught before > checking in. > > * Sync to the head and retest all gcc testsuite. > * Port the patch to google-4_7 branch to retest all gcc testsuite, as > well as build many large applications. > > Through these tests, I've found two additional bugs that was omitted > in the original implementation. A new patch is attached (patch.txt) to > fix these problems. After this fix, all gcc testsuites pass for both > trunk and google-4_7 branch. I've also copy pasted the new fixes > (lto.c and tree-cfg.c) below. Now I'd say this patch is in good shape. > But it may not be perfect. I'll look into build failures as soon as it > arises. > > Richard and Diego, could you help me take a look at the following two fixes? > > Thanks, > Dehao > > New fixes: > --- gcc/lto/lto.c (revision 191083) > +++ gcc/lto/lto.c (working copy) > @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t) > { >enum tree_code code = TREE_CODE (t); >LTO_NO_PREVAIL (TREE_TYPE (t)); > - if (CODE_CONTAINS_STRUCT (code, TS_COMMON)) > -LTO_NO_PREVAIL (TREE_CHAIN (t)); That change is odd. Can you show us how it breaks? >if (DECL_P (t)) > { >LTO_NO_PREVAIL (DECL_NAME (t)); > > Index: gcc/tree-cfg.c > === > --- gcc/tree-cfg.c (revision 191083) > +++ gcc/tree-cfg.c (working copy) > @@ -5980,9 +5974,21 @@ move_stmt_op (tree *tp, int *walk_subtrees, void * >tree t = *tp; > >if (EXPR_P (t)) > -/* We should never have TREE_BLOCK set on non-statements. 
*/ > -gcc_assert (!TREE_BLOCK (t)); > - > +{ > + tree block = TREE_BLOCK (t); > + if (p->orig_block == NULL_TREE > + || block == p->orig_block > + || block == NULL_TREE) > + TREE_SET_BLOCK (t, p->new_block); > +#ifdef ENABLE_CHECKING > + else if (block != p->new_block) > + { > + while (block && block != p->orig_block) > + block = BLOCK_SUPERCONTEXT (block); > + gcc_assert (block); > + } > +#endif I think what this means is that TREE_BLOCK on non-stmts are meaningless (thus only gimple_block is interesting on GIMPLE, not BLOCKs on trees). So instead of setting a BLOCK in some cases you should clear BLOCK if it happens to be set, or alternatively, only re-set it if there was a block associated with it. Richard. > +} >else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME) > { >if (TREE_CODE (t) == SSA_NAME) > > Whole patch: > gcc/ChangeLog: > 2012-09-08 Dehao Chen > > * toplev.c (general_init): Init block_locations. > * tree.c (tree_set_block): New. > (tree_block): Change to use LOCATION_BLOCK. > * tree.h (TREE_SET_BLOCK): New. > * final.c (reemit_insn_block_notes): Change to use LOCATION_BLOCK. > (final_start_function): Likewise. > * input.c (expand_location_1): Likewise. > * input.h (LOCATION_LOCUS): New. > (LOCATION_BLOCK): New. > (IS_UNKNOWN_LOCATION): New. > * fold-const.c (expr_location_or): Change to use new location. > * reorg.c (emit_delay_sequence): Likewise. > (try_merge_delay_insns): Likewise. > * modulo-sched.c (dump_insn_location): Likewise. > * lto-streamer-out.c (lto_output_location_bitpack): Likewise. > * jump.c (rtx_renumbered_equal_p): Likewise. > * ifcvt.c (noce_try_move): Likewise. > (noce_try_store_flag): Likewise. > (noce_try_store_flag_constants): Likewise. > (noce_try_addcc): Likewise. > (noce_try_store_flag_mask): Likewise. > (noce_try_cmove): Likewise. > (noce_try_cmove_arith): Likewise. > (noce_try_minmax): Likewise. > (noce_try_abs): Likewise. > (noce_try_sign_mask): Likewise. > (noce_try_bitop): Likewise. > (noce_process_if_block): Likewise. 
> (cond_move_process_if_block): Likewise. > (find_cond_trap): Likewise. > * dwarf2out.c (add_src_coords_attributes): Likewise. > * expr.c (expand_expr_real): Likewise. > * tree-parloops.c (create_loop_fn): Likewise. > * recog.c (peep2_attempt): Likewise. > * function.c (free_after_compilation): Likewise. > (expand_function_end): Likewise. > (set_insn_locations): Likewise. > (thread_prologue_and_epilogue_insns): Likewise. > * print-rtl.c (print_rtx): Likewise. > * profile.c (branch_prob): Likewise. > * trans-mem.c (ipa_tm_scan_irr_block): Likewise. > * gimplify.c (gimplify_call_expr): Likewise. >
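The ENABLE_CHECKING branch in the tree-cfg.c hunk quoted above walks up the chain of enclosing blocks and asserts that the original block is reached. A minimal sketch of that ancestry walk, using a hypothetical Block type with a single parent pointer in place of BLOCK_SUPERCONTEXT, might look like this:

```cpp
#include <cassert>

// Hypothetical lexical block with a link to its enclosing block;
// `super` plays the role of BLOCK_SUPERCONTEXT.
struct Block {
    Block *super;
};

// True if `orig` is `block` itself or one of its enclosing blocks.
bool enclosed_by(const Block *block, const Block *orig)
{
    while (block && block != orig)
        block = block->super;
    // The walk stopped either at `orig` (non-null) or ran off the
    // outermost block (null).
    return block != nullptr;
}
```

In the quoted diff the equivalent of a false result trips gcc_assert; the sketch returns a bool so that the condition can be tested without aborting.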
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Sun, Sep 9, 2012 at 8:02 PM, Iyer, Balaji V wrote:
> Hello Joseph,
> Here is an updated patch. I think I have fixed all the changes you
> and others have mentioned. Please let me know if everything looks OK. Thanks
> again for doing the review!

The ChangeLog still mentions the vectorizer looking at flag_cilk_plus. Also, this patch does not contain a single testcase, so none of its code paths are exercised when bootstrapping and testing. You do not mention how this feature works from an ABI perspective.

I don't think this is anywhere near ready to go in. Please split it into at least two parts. Part one would be to make the vectorizer support vectorizing functions with the "elemental function" attribute by assuming that a vectorized variant with documented mangling exists. That feature should work regardless of whether Cilk+ is enabled or not. The mangling needs to be documented alongside the documentation for the "elemental function" attribute. Part two would be the rest, possibly re-implemented in the way that was suggested.

Thanks,
Richard.

> Sincerely,
>
> Balaji V. Iyer.
>
> Here are the fixed ChangeLog entries:
>
> gcc/ChangeLog
> 2012-09-09 Balaji V. Iyer
>
> * attribs.c (is_elem_fn_attribute_p): New function.
> (decl_attributes): Added a check for Elemental function attribute when
> Cilk Plus is enabled.
> * cgraphunit.c (cgraph_decide_is_function_needed): Added a check for
> cloned elemental function when Cilk Plus is enabled.
> (cgraph_add_new_function): When Cilk Plus is enabled we call
> cgraph_get_create_node.
> (cgraph_analyze_functions): Added a check if the function call is a
> cloned elemental function when Cilk Plus is enabled.
> * cilkplus.h: New file.
> * elem-function-common.c: Likewise.
> * config/i386/i386.c (ix86_cilkplus_map_proc_to_attr): New function.
> (TARGET_CILKPLUS_BUILTIN_MAP_PROCESSOR_TO_ATTR): New define.
> * expr.c (expand_expr_real_1): Added a check if Cilk Plus is enabled.
> * function.h (struct function): Added elem_fn_already_cloned field.
> * gimplify.c (gimplify_function_tree): Added a check if Cilk Plus is
> enabled and if the function is an elemental function. If so, then call
> the function to clone elemental function.
> * langhooks.c (lhd_elem_fn_create_fn): New function.
> * langhooks-def.h (LANG_HOOKS_CILKPLUS): New define.
> (LANG_HOOK_DECLS): Added LANG_HOOKS_CILKPLUS field.
> * langhooks.h (struct lang_hooks_for_cilkplus): New struct.
> (struct lang_hooks): Added a field called cilkplus.
> * target.def (TARGET_CILKPLUS): New hook vector.
> (builtin_map_processor_to_attr): New target hook def.
> * targhooks.c (default_builtin_map_processor_to_attr): New function.
> * doc/tm.texi: Regenerated.
> * doc/tm.texi.in (TARGET_CILKPLUS_BUILTIN_MAP_PROCESSOR_TO_ATTR): Documented
> new hook.
> * tree.h (tree_function_decl): Added a new field called
> elem_fn_already_cloned.
> (DECL_ELEM_FN_ALREADY_CLONED): New define.
> * tree-data-ref.c (find_data_references_in_stmt): Added a check for
> an elemental function call when Cilk Plus is enabled.
> * tree-inline.c (elem_fn_copy_arguments_for_versioning): New function.
> (initialize_elem_fn_cfun): Likewise.
> (tree_elem_fn_versioning): Likewise.
> * tree-vect-stmts.c (vect_get_vec_def_for_operand): Check parm type for
> an elemental function when Cilk Plus is enabled and set data definition
> accordingly.
> (elem_fn_vect_get_vec_def_for_operand): New function.
> (vect_finish_stmt_generation): Added a check for elemental function.
> (vectorizable_function): Check if the function call is a Cilk Plus
> elemental function. If so, then insert the appropriate mangled name.
> (vectorizable_call): Eliminate the argument requirement when Cilk Plus
> is enabled for vectorization. Also, set the appropriate data def. for
> an elemental function call.
> (elem_fn_linear_init_vector): New function.
> * tree.c (build_elem_fn_linear_vector): Likewise.
> > gcc/c-family/ChangeLog > 2012-09-09 Balaji V. Iyer > > * c-common.c (struct c_common_attribute_table): Added vector > attribute for Cilk Plus elemental function. > (handle_vector_atribute): New function. > * c-cpp-elem-function.c: New file. > * c.opt (-fcilkplus): Added new flag. > > gcc/c/ChangeLog > 2012-09-09 Balaji V. Iyer > > * c-decl.c (bind): Added a check for non NULL scope. > * c-parser.c (c_parser_declaration_or_fndef): Added a check if Cilk > Plus defined. If so, then we save the arguments for a fun
[Patch,avr,committed]: Fix PR54536
http://gcc.gnu.org/viewcvs?view=revision&revision=191132
http://gcc.gnu.org/viewcvs?view=revision&revision=191133
http://gcc.gnu.org/viewcvs?view=revision&revision=191134

Committed these changes as an obvious fix for a typo in avr-devices.c / avr-mcus.def: at90usb1287 had the wrong library_name "usb1286" instead of "usb1287".

The bug is not critical because at90usb1286/7 have the same architecture and crtusb128[67].o are the same.

Johann

	PR target/54536
	* config/avr/avr-mcus.def (at90usb1287): Set LIBRARY_NAME to "usb1287".
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Sun, 9 Sep 2012, Iyer, Balaji V wrote: > Here is an updated patch. I think I have fixed all the changes you > and others have mentioned. Please let me know if everything looks OK. > Thanks again for doing the review! Has the user documentation for this feature been posted? For patch review we really need a self-contained submission that for any feature implemented includes not just the implementation but the testcases and the documentation. I think the testsuite patch also needs reworking to make it easy to add support for new architectures. I think you need to revisit your split into 22 patches and arrange things based primarily on features. If the changes for a feature are so big they can't be posted in one message, you should still always post all the patches for that feature together (implementation, documentation, testcases) - even if not all parts have changed in a particular revision. -- Joseph S. Myers jos...@codesourcery.com
Re: combine vec_perm_expr with constructor
On Sat, Sep 8, 2012 at 9:14 AM, Marc Glisse wrote: > On Mon, 3 Sep 2012, Richard Guenther wrote: > >> You do work above and then bail late here. Always do early exists early >> to reduce useless compile-time. > > [...] > >> You need to verify that fold_ternary returns something that is valid >> GIMPLE. >> fold () in general happily returns trees that are in the need of >> re-gimplification. >> You expect a CONSTRUCTOR or VECTOR_CST here, so you should check >> for that. > > > Hello, > > here is a new version of the patch, again tested on x86_64-linux-gnu. Ok. Thanks, Richard. > 2012-09-08 Marc Glisse > > > gcc/ > * tree-ssa-forwprop.c (simplify_permutation): Handle CONSTRUCTOR. > > gcc/testsuite/ > * gcc.dg/tree-ssa/forwprop-20.c: New testcase. > > > -- > Marc Glisse > > Index: gcc/testsuite/gcc.dg/tree-ssa/forwprop-20.c > === > --- gcc/testsuite/gcc.dg/tree-ssa/forwprop-20.c (revision 0) > +++ gcc/testsuite/gcc.dg/tree-ssa/forwprop-20.c (revision 0) > @@ -0,0 +1,70 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target double64 } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > + > +#include > + > +/* All of these optimizations happen for unsupported vector modes as a > + consequence of the lowering pass. We need to test with a vector mode > + that is supported by default on at least some architectures, or make > + the test target specific so we can pass a flag like -mavx. 
*/ > + > +typedef double vecf __attribute__ ((vector_size (2 * sizeof (double; > +typedef int64_t veci __attribute__ ((vector_size (2 * sizeof (int64_t; > + > +void f (double d, vecf* r) > +{ > + vecf x = { -d, 5 }; > + vecf y = { 1, 4 }; > + veci m = { 2, 0 }; > + *r = __builtin_shuffle (x, y, m); // { 1, -d } > +} > + > +void g (float d, vecf* r) > +{ > + vecf x = { d, 5 }; > + vecf y = { 1, 4 }; > + veci m = { 2, 1 }; > + *r = __builtin_shuffle (x, y, m); // { 1, 5 } > +} > + > +void h (double d, vecf* r) > +{ > + vecf x = { d + 1, 5 }; > + vecf y = { 1 , 4 }; > + veci m = { 2 , 0 }; > + *r = __builtin_shuffle (y, x, m); // { d + 1, 1 } > +} > + > +void i (float d, vecf* r) > +{ > + vecf x = { d, 5 }; > + veci m = { 1, 0 }; > + *r = __builtin_shuffle (x, m); // { 5, d } > +} > + > +void j (vecf* r) > +{ > + vecf y = { 1, 2 }; > + veci m = { 0, 0 }; > + *r = __builtin_shuffle (y, m); // { 1, 1 } > +} > + > +void k (vecf* r) > +{ > + vecf x = { 3, 4 }; > + vecf y = { 1, 2 }; > + veci m = { 3, 0 }; > + *r = __builtin_shuffle (x, y, m); // { 2, 3 } > +} > + > +void l (double d, vecf* r) > +{ > + vecf x = { -d, 5 }; > + vecf y = { d, 4 }; > + veci m = { 2, 0 }; > + *r = __builtin_shuffle (x, y, m); // { d, -d } > +} > + > +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */ > +/* { dg-final { cleanup-tree-dump "optimized" } } */ > > Property changes on: gcc/testsuite/gcc.dg/tree-ssa/forwprop-20.c > ___ > Added: svn:eol-style >+ native > Added: svn:keywords >+ Author Date Id Revision URL > > Index: gcc/tree-ssa-forwprop.c > === > --- gcc/tree-ssa-forwprop.c (revision 191082) > +++ gcc/tree-ssa-forwprop.c (working copy) > @@ -2599,75 +2599,134 @@ is_combined_permutation_identity (tree m >if (j == i) > maybe_identity2 = false; >else if (j == i + nelts) > maybe_identity1 = false; >else > return 0; > } >return maybe_identity1 ? 1 : maybe_identity2 ? 2 : 0; > } > > -/* Combine two shuffles in a row. 
Returns 1 if there were any changes > - made, 2 if cfg-cleanup needs to run. Else it returns 0. */ > +/* Combine a shuffle with its arguments. Returns 1 if there were any > + changes made, 2 if cfg-cleanup needs to run. Else it returns 0. */ > > static int > simplify_permutation (gimple_stmt_iterator *gsi) > { >gimple stmt = gsi_stmt (*gsi); >gimple def_stmt; > - tree op0, op1, op2, op3; > - enum tree_code code = gimple_assign_rhs_code (stmt); > - enum tree_code code2; > + tree op0, op1, op2, op3, arg0, arg1; > + enum tree_code code; > > - gcc_checking_assert (code == VEC_PERM_EXPR); > + gcc_checking_assert (gimple_assign_rhs_code (stmt) == VEC_PERM_EXPR); > >op0 = gimple_assign_rhs1 (stmt); >op1 = gimple_assign_rhs2 (stmt); >op2 = gimple_assign_rhs3 (stmt); > > - if (TREE_CODE (op0) != SSA_NAME) > -return 0; > - >if (TREE_CODE (op2) != VECTOR_CST) > return 0; > > - if (op0 != op1) > -return 0; > + if (TREE_CODE (op0) == VECTOR_CST) > +{ > + code = VECTOR_CST; > + arg0 = op0; > +} > + else if (TREE_CODE (op0) == SSA_NAME) > +{ > + def_stmt = SSA_NAME_DEF_STMT (op0); > + if (!def_stmt || !is_gimple_assign (def_stmt) > + || !can_propagate_from (def_stmt)) > + return 0;
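The expected results in the comments of the forwprop-20.c testcase above follow from the selection rule of two-operand __builtin_shuffle: element i of the result is taken from the concatenation of the two inputs at index mask[i], reduced modulo twice the vector length. Below is a scalar model of that rule as a sketch; shuffle2 is a made-up helper operating on std::array, not a GCC interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of the two-operand __builtin_shuffle (x, y, mask) for
// N-element vectors: indices 0..N-1 select from x, N..2N-1 select from y,
// and mask elements are taken modulo 2*N.
template <typename T, std::size_t N>
std::array<T, N> shuffle2(const std::array<T, N> &x,
                          const std::array<T, N> &y,
                          const std::array<unsigned, N> &mask)
{
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
        const std::size_t idx = mask[i] % (2 * N);
        result[i] = idx < N ? x[idx] : y[idx - N];
    }
    return result;
}
```

For example, with x = { 3, 4 }, y = { 1, 2 } and mask = { 3, 0 }, as in function k of the testcase, the model selects y[1] and then x[0]. When both inputs are constants or constructors, every selected element is known at compile time, which is what lets the VEC_PERM_EXPR be folded away.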
Re: combine BIT_FIELD_REF and VEC_PERM_EXPR
On Sat, Sep 8, 2012 at 10:17 AM, Marc Glisse wrote: > On Mon, 3 Sep 2012, Richard Guenther wrote: > >> Please do the early outs where you compute the arguments. Thus, right >> after getting op0 in this case or right after computing n for the n != 1 >> check. >> >> I think you need to verify that the type of 'op' is actually the element >> type >> of op0. The BIT_FIELD_REF can happily access elements two and three >> of { 1, 2, 3, 4 } as a long for example. See the BIT_FIELD_REF foldings >> in fold-const.c. > > > On Mon, 3 Sep 2012, Richard Guenther wrote: > >> If you use fold_build3 you need to check that the result is in expected >> form >> (a is_gimple_invariant or an SSA_NAME). > > > I first tried this: > > + if (TREE_CODE (tem) != SSA_NAME > + && TREE_CODE (tem) != BIT_FIELD_REF > + && !is_gimple_min_invariant (tem)) > + return false; > > but then I thought that fold_stmt probably does the right thing (?) so I > switched to it. I should indeed. > >>> Now that I look at this line, I wonder if I am missing some unshare_expr >>> for >>> p and/or op1. >> >> >> If either is a CONSTRUCTOR and its def stmt is not removed and it survives >> into tem then yes ... > > > I added an unconditional unshare_expr. I guess it would be possible to look > at if the permutation is only used once and in that case maybe not call > unshare_expr (?) and call remove_prop_source_from_use at the end, but it > gets a bit complicated and I don't think it helps compared to waiting for > the next DCE pass. > > Please also add handling of code == CONSTRUCTOR. >>> >>> >>> The cases I tried were already handled by fre1. I can add code for >>> constructor, but I'll need to look for a testcase first. Can that go to a >>> different patch? >> >> >> Yes. > > > This is currently handled by FRE. The only testcase I have that reaches > forwprop is bit_field_ref (vec_perm_expr (constructor)) and will require me > to disable this patch so I can test that one... 
> > > New version of the patch (I really think I already regtested it, but I'll do > it again to be sure): > > 2012-09-08 Marc Glisse > > > gcc/ > * tree-ssa-forwprop.c (simplify_bitfield): New function. > (ssa_forward_propagate_and_combine): Call it. > > gcc/testsuite/ > * gcc.dg/tree-ssa/forwprop-21.c: New testcase. > > -- > Marc Glisse > Index: tree-ssa-forwprop.c > === > --- tree-ssa-forwprop.c (revision 191089) > +++ tree-ssa-forwprop.c (working copy) > @@ -2567,20 +2567,92 @@ combine_conversions (gimple_stmt_iterato > gimple_assign_set_rhs_code (stmt, CONVERT_EXPR); > update_stmt (stmt); > return remove_prop_source_from_use (op0) ? 2 : 1; > } > } > } > >return 0; > } > > +/* Combine an element access with a shuffle. Returns true if there were > + any changes made, else it returns false. */ > + > +static bool > +simplify_bitfield (gimple_stmt_iterator *gsi) simplify_bitfield_ref Ok with that change. Thanks, Richard. > +{ > + gimple stmt = gsi_stmt (*gsi); > + gimple def_stmt; > + tree op, op0, op1, op2; > + tree elem_type; > + unsigned idx, n, size; > + enum tree_code code; > + > + op = gimple_assign_rhs1 (stmt); > + gcc_checking_assert (TREE_CODE (op) == BIT_FIELD_REF); > + > + op0 = TREE_OPERAND (op, 0); > + if (TREE_CODE (op0) != SSA_NAME > + || TREE_CODE (TREE_TYPE (op0)) != VECTOR_TYPE) > +return false; > + > + elem_type = TREE_TYPE (TREE_TYPE (op0)); > + if (TREE_TYPE (op) != elem_type) > +return false; > + > + size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type)); > + op1 = TREE_OPERAND (op, 1); > + n = TREE_INT_CST_LOW (op1) / size; > + if (n != 1) > +return false; > + > + def_stmt = SSA_NAME_DEF_STMT (op0); > + if (!def_stmt || !is_gimple_assign (def_stmt) > + || !can_propagate_from (def_stmt)) > +return false; > + > + op2 = TREE_OPERAND (op, 2); > + idx = TREE_INT_CST_LOW (op2) / size; > + > + code = gimple_assign_rhs_code (def_stmt); > + > + if (code == VEC_PERM_EXPR) > +{ > + tree p, m, index, tem; > + unsigned nelts; > + m = gimple_assign_rhs3 
(def_stmt); > + if (TREE_CODE (m) != VECTOR_CST) > + return false; > + nelts = VECTOR_CST_NELTS (m); > + idx = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx)); > + idx %= 2 * nelts; > + if (idx < nelts) > + { > + p = gimple_assign_rhs1 (def_stmt); > + } > + else > + { > + p = gimple_assign_rhs2 (def_stmt); > + idx -= nelts; > + } > + index = build_int_cst (TREE_TYPE (TREE_TYPE (m)), idx * size); > + tem = build3 (BIT_FIELD_REF, TREE_TYPE (op), > +unshare_expr (p), op1, index); > + gimple_assign_set_rhs1 (stmt, tem); > + fold_stmt (gsi); > + update_stmt (gsi_stmt (*gsi)); > + return true; > +} > + > + return false; > +} > + > /* De
Re: [Patch contrib] check_GNU_style: remove tmp file
On 9 September 2012 12:46, Gerald Pfeifer wrote:
> On Mon, 3 Sep 2012, Christophe Lyon wrote:
>> check_GNU_style.sh currently leaves a temporary file in the current
>> directory. This patch removes it upon exit.
>>
>> Christophe.
>>
>> 2012-09-03 Christophe Lyon
>>
>> * check_GNU_style.sh: Remove temporary file upon exit.
>
> Shouldn't this also be removed upon abort?
>
> See contrib/warn_summary, for an example,
>
> Gerald

Good point. Here is a new version, catching the same signals as warn_summary.

Christophe.

Attachment: check-gnu-style.patch
Re: [SH] PR 54089 - Improve software dynamic shifts
Oleg Endo wrote: > This patch does two things... > > 1) The dynamic shift cost is set to be the same if HW dynamic shifts are > available. This improves code size for SH2A a little (-2 KByte on CSiBE > for -m2a-single -O2). > > 2) Improve code around library function calls for software dynamic > shifts (logical right + left shifts only for now). > For this I had to change the implementations of ashlsi3 and lshrsi3 in > lib1funcs.S, but the changes are backwards compatible with older > binaries. Due to the additional branch insn in the dyn shift functions > they might be one or two cycles slower than the original, but this > reduces the amount of clobbered regs and cuts 9.5 KByte in the CSiBE set > (-m2 -ml -O2), which seems more beneficial to do on average. > > Tested on rev. 190990 with > make -k check RUNTESTFLAGS="--target_board=sh-sim > \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" > > and no new failures except for this one on SH2: > > FAIL: gcc.dg/pr28402.c scan-assembler-not __[a-z]*si3 > > The reason for this is that now the middle-end will expand DImode shifts > as SImode shifts instead of a DImode shift library call, because it sees > the new SImode dynamic library call shift patterns for SH2. I will have > a look at this issue later to see if it is beneficial to do special > handling of DImode shifts on SH2. > > OK to install? OK. Regards, kaz
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
>-Original Message- >From: Joseph Myers [mailto:jos...@codesourcery.com] >Sent: Monday, September 10, 2012 7:23 AM >To: Iyer, Balaji V >Cc: gcc-patches@gcc.gnu.org; Aldy Hernandez (al...@redhat.com); Jeff Law; >r...@redhat.com >Subject: RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22) > >On Sun, 9 Sep 2012, Iyer, Balaji V wrote: > >> Here is an updated patch. I think I have fixed all the changes you >> and others have mentioned. Please let me know if everything looks OK. >> Thanks again for doing the review! > >Has the user documentation for this feature been posted? For patch review we >really need a self-contained submission that for any feature implemented >includes not just the implementation but the testcases and the documentation. >I >think the testsuite patch also needs reworking to make it easy to add support >for >new architectures. I think you had some changes for the test cases and I am currently working on it. > >I think you need to revisit your split into 22 patches and arrange things based >primarily on features. If the changes for a feature are so big they can't be >posted >in one message, you should still always post all the patches for that feature >together (implementation, documentation, >testcases) - even if not all parts have changed in a particular revision. So, I assume it is OK for me to include testsuites with the code-changes? I included them separately because I remember someone in the mailing list saying the patch size must be small and one logical way is to put test cases separately from the code-changes. > >-- >Joseph S. Myers >jos...@codesourcery.com
Re: VxWorks Patches Back from the Dead!
On 09/09/12 08:54, rbmj wrote: > Just because I *love* bothering everyone with emails... I don't mind, as long as you don't expect me to do anything until I'm certain you've stabilized the patch ;) I'm glad you rolled it up into one patch, because I was eventually going to ask you to do that. Thank you. Cheers - Bruce > I've made a few changes and squashed everything into a single patch for ease > of application. The commit message is inside the patch, but here's the > suggested ChangeLog: > > configure.ac: add --enable-libstdcxx option > configure: regenerate > > [gcc] > gcov-io.c (gcov_open): Pass mode to open() unconditionally > > [fixincludes] > fixinc.in: Added ability to skip machine_name > inclhack.def (AAB_vxworks_assert): Added fix > inclhack.def (AAB_vxworks_regs_vxtypes): Added fix > inclhack.def (AAB_vxworks_stdint): Added fix > inclhack.def (AAB_vxworks_unistd): Added fix > inclhack.def (vxworks_ioctl_macro): Added fix > inclhack.def (vxworks_mkdir_macro): Added fix > inclhack.def (vxworks_regs): Added fix > inclhack.def (vxworks_write_const): Added fix > fixincl.x: Regenerate > mkfixinc.sh: Removed vxworks from list of no-op fixinc targets > > [libstdc++-v3] > config/os/vxworks/os_defines.h: #define'd NOMINMAX > > Thanks, > > Robert Mason >
[SH] Add simple_return pattern
This patch implements the simple_return pattern to enable -fshrink-wrap on SH. It also cleans up some redundancies in expand_epilogue (called twice, from the "return" and "epilogue" patterns) and the sh_expand_epilogue parameter type. No regressions with sh-superh-elf and sh4-linux gcc testsuites. Thanks Christian 2012-08-29 Christian Bruel * config/sh/sh-protos.h (sh_need_epilogue): Delete. * config/sh/sh.c (sh_need_epilogue): Delete. (sh_need_epilogue_known): Delete. (sh_output_function_epilogue): Remove sh_need_epilogue_known. * config/sh/sh.md (any_return): New iterator and optab. (simple_return): Define. (return): Check epilogue_completed. (epilogue): Use inline return rtl. (sh_expand_epilogue): Cleanup parameters boolean type. Index: gcc/config/sh/sh-protos.h === --- gcc/config/sh/sh-protos.h (revision 191129) +++ gcc/config/sh/sh-protos.h (working copy) @@ -117,7 +117,6 @@ extern int sh_media_register_for_return (void); extern void sh_expand_prologue (void); extern void sh_expand_epilogue (bool); -extern bool sh_need_epilogue (void); extern void sh_set_return_address (rtx, rtx); extern int initial_elimination_offset (int, int); extern bool fldi_ok (void); Index: gcc/config/sh/sh.c === --- gcc/config/sh/sh.c (revision 191129) +++ gcc/config/sh/sh.c (working copy) @@ -7901,22 +7901,6 @@ static int sh_need_epilogue_known = 0; -bool -sh_need_epilogue (void) -{ - if (! sh_need_epilogue_known) -{ - rtx epilogue; - - start_sequence (); - sh_expand_epilogue (0); - epilogue = get_insns (); - end_sequence (); - sh_need_epilogue_known = (epilogue == NULL ? -1 : 1); -} - return sh_need_epilogue_known > 0; -} - /* Emit code to change the current function's return address to RA. TEMP is available as a scratch register, if needed. 
*/ @@ -7996,7 +7980,6 @@ sh_output_function_epilogue (FILE *file ATTRIBUTE_UNUSED, HOST_WIDE_INT size ATTRIBUTE_UNUSED) { - sh_need_epilogue_known = 0; } static rtx Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 191129) +++ gcc/config/sh/sh.md (working copy) @@ -177,6 +177,10 @@ (UNSPECV_EH_RETURN 12) ]) +(define_code_iterator any_return [return simple_return]) +(define_code_attr optab [(return "return") + (simple_return "simple_return")]) + ;; - ;; Attributes ;; - @@ -9280,7 +9284,7 @@ [(return)] "" { - sh_expand_epilogue (1); + sh_expand_epilogue (true); if (TARGET_SHCOMPACT) { rtx insn, set; @@ -10099,9 +10103,13 @@ } [(set_attr "type" "load_media")]) +(define_expand "simple_return" + [(simple_return)] + "") + (define_expand "return" - [(return)] - "reload_completed && ! sh_need_epilogue ()" + [(simple_return)] + "reload_completed && epilogue_completed" { if (TARGET_SHMEDIA) { @@ -10117,8 +10125,8 @@ } }) -(define_insn "*return_i" - [(return)] +(define_insn "*_i" + [(any_return)] "TARGET_SH1 && ! (TARGET_SHCOMPACT && (crtl->args.info.call_cookie & CALL_COOKIE_RET_TRAMP (1))) @@ -10244,19 +10252,12 @@ (define_expand "prologue" [(const_int 0)] "" -{ - sh_expand_prologue (); - DONE; -}) + "sh_expand_prologue (); DONE;") (define_expand "epilogue" [(return)] "" -{ - sh_expand_epilogue (0); - emit_jump_insn (gen_return ()); - DONE; -}) + "sh_expand_epilogue (false);") (define_expand "eh_return" [(use (match_operand 0 "register_operand" ""))]
Re: [PATCH] PowerPC VLE port
On 09/07/2012 07:52 PM, David Edelsohn wrote: This patch contains a lot of unnecessary, gratuitous changes in addition to being very invasive. It was not edited and cleaned sufficiently before posting. It has too much of a negative impact on the current PowerPC port. The patch is not going to be accepted in its current form. David, What are your thoughts on how to move forward? -- Jim Lemke Mentor Graphics / CodeSourcery Orillia Ontario, +1-613-963-1073
[PATCH] Fix PR54520
The following fixes PR54520 - we were not updating bb->loop_father for all basic-blocks converted to "pre-header" blocks during jump threading. Fixed as follows. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-09-10 Richard Guenther * tree-ssa-threadupdate.c (def_split_header_continue_p): Properly consider sub-loops. Index: gcc/tree-ssa-threadupdate.c === *** gcc/tree-ssa-threadupdate.c (revision 191129) --- gcc/tree-ssa-threadupdate.c (working copy) *** static bool *** 846,853 def_split_header_continue_p (const_basic_block bb, const void *data) { const_basic_block new_header = (const_basic_block) data; ! return (bb->loop_father == new_header->loop_father ! && bb != new_header); } /* Thread jumps through the header of LOOP. Returns true if cfg changes. --- 846,854 def_split_header_continue_p (const_basic_block bb, const void *data) { const_basic_block new_header = (const_basic_block) data; ! return (bb != new_header ! && (loop_depth (bb->loop_father) ! >= loop_depth (new_header->loop_father))); } /* Thread jumps through the header of LOOP. Returns true if cfg changes. *** thread_through_loop_header (struct loop *** 1031,1040 nblocks = dfs_enumerate_from (header, 0, def_split_header_continue_p, bblocks, loop->num_nodes, tgt_bb); for (i = 0; i < nblocks; i++) ! { ! remove_bb_from_loops (bblocks[i]); ! add_bb_to_loop (bblocks[i], loop_outer (loop)); ! } free (bblocks); /* If the new header has multiple latches mark it so. */ --- 1032,1042 nblocks = dfs_enumerate_from (header, 0, def_split_header_continue_p, bblocks, loop->num_nodes, tgt_bb); for (i = 0; i < nblocks; i++) ! if (bblocks[i]->loop_father == loop) ! { ! remove_bb_from_loops (bblocks[i]); ! add_bb_to_loop (bblocks[i], loop_outer (loop)); ! } free (bblocks); /* If the new header has multiple latches mark it so. 
*/ Index: gcc/testsuite/gcc.dg/torture/pr54520.c === *** gcc/testsuite/gcc.dg/torture/pr54520.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr54520.c (working copy) *** *** 0 --- 1,15 + /* { dg-do compile } */ + + char *a; + void + fn1 () + { + char *p = a; + while (p && *p != '\0') + { + while (*p == '\t') + *p++ = '\0'; + if (*p != '\0') + p = 0; + } + }
C++ PATCH for c++/54506 (wrong implicit move)
This area of the standard is in flux, but what we were doing was definitely wrong. The proposed resolution for issue 1402 says that if a move constructor would call a non-trivial non-move constructor for a subobject, it is not implicitly declared. We might end up dropping that provision entirely, but in any case we should allow moving via template constructor as well as non-template. Tested x86_64-pc-linux-gnu, applying to trunk and 4.7 (since it only affects C++11 mode). commit 37c8977bb82c984645795a9992fe6658841e2d35 Author: Jason Merrill Date: Mon Sep 10 09:22:37 2012 -0400 PR c++/54506 * decl.c (move_signature_fn_p): Split out from move_fn_p. * method.c (process_subob_fn): Use it. * cp-tree.h: Declare it. diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 3e0fc3f..3c55ba4 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -5066,6 +5066,7 @@ extern tree build_ptrmem_type (tree, tree); extern tree build_this_parm (tree, cp_cv_quals); extern int copy_fn_p(const_tree); extern bool move_fn_p (const_tree); +extern bool move_signature_fn_p (const_tree); extern tree get_scope_of_declarator (const cp_declarator *); extern void grok_special_member_properties (tree); extern int grok_ctor_properties (const_tree, const_tree); diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 7655f78..e34092d 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -10859,10 +10859,6 @@ copy_fn_p (const_tree d) bool move_fn_p (const_tree d) { - tree args; - tree arg_type; - bool result = false; - gcc_assert (DECL_FUNCTION_MEMBER_P (d)); if (cxx_dialect == cxx98) @@ -10872,12 +10868,29 @@ move_fn_p (const_tree d) if (TREE_CODE (d) == TEMPLATE_DECL || (DECL_TEMPLATE_INFO (d) && DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (d -/* Instantiations of template member functions are never copy +/* Instantiations of template member functions are never move functions. 
Note that member functions of templated classes are represented as template functions internally, and we must - accept those as copy functions. */ + accept those as move functions. */ return 0; + return move_signature_fn_p (d); +} + +/* D is a constructor or overloaded `operator='. + + Then, this function returns true when D has the same signature as a move + constructor or move assignment operator (because either it is such a + ctor/op= or it is a template specialization with the same signature), + false otherwise. */ + +bool +move_signature_fn_p (const_tree d) +{ + tree args; + tree arg_type; + bool result = false; + args = FUNCTION_FIRST_USER_PARMTYPE (d); if (!args) return 0; diff --git a/gcc/cp/method.c b/gcc/cp/method.c index c21ae15..a42ed60 100644 --- a/gcc/cp/method.c +++ b/gcc/cp/method.c @@ -947,9 +947,10 @@ process_subob_fn (tree fn, bool move_p, tree *spec_p, bool *trivial_p, } } - /* Core 1402: A non-trivial copy op suppresses the implicit + /* Core 1402: A non-trivial non-move ctor suppresses the implicit declaration of the move ctor/op=. */ - if (no_implicit_p && move_p && !move_fn_p (fn) && !trivial_fn_p (fn)) + if (no_implicit_p && move_p && !move_signature_fn_p (fn) + && !trivial_fn_p (fn)) *no_implicit_p = true; if (constexpr_p && !DECL_DECLARED_CONSTEXPR_P (fn)) diff --git a/gcc/testsuite/g++.dg/cpp0x/implicit14.C b/gcc/testsuite/g++.dg/cpp0x/implicit14.C new file mode 100644 index 000..8a56244 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/implicit14.C @@ -0,0 +1,26 @@ +// PR c++/54506 +// { dg-do compile { target c++11 } } + +template +struct A +{ + A() {} + + A(A const volatile &&) = delete; + A &operator =(A const volatile &&) = delete; + + template A(A &&) {} + template A &operator =(A &&) { return *this; } +}; + +struct B +{ + A a; + B() = default; +}; + +int main() +{ + B b = B(); + b = B(); +}
[google] Fix exception in unroller code size heuristics (issue6498112)
Fix divide by zero error. Passes bootstrap and regression tests. Ok for google branches? Teresa 2012-09-10 Teresa Johnson * loop-unroll.c (code_size_limit_factor): Index: loop-unroll.c === --- loop-unroll.c (revision 191138) +++ loop-unroll.c (working copy) @@ -223,7 +223,8 @@ code_size_limit_factor(struct loop *loop) /* Next, set the value of the codesize-based unroll factor divisor which in most loops will need to be set to a value that will reduce or eliminate unrolling/peeling. */ - if (profile_info->num_hot_counters < size_threshold * 2) + if (profile_info->num_hot_counters < size_threshold * 2 + && loop->header->count > 0) { /* For applications that are less than twice the codesize limit, allow limited unrolling for very hot loops. */ -- This patch is available for review at http://codereview.appspot.com/6498112
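The hunk above shows only the added guard; the division it protects is not visible in the quoted context. The following is a minimal sketch of the same defensive pattern, assuming the guarded branch divides by the header execution count. The function name, the parameters and the fallback value are all hypothetical and do not come from loop-unroll.c.

```c
/* Hypothetical illustration, not the code from loop-unroll.c.
   Return HOT_TOTAL / HEADER_COUNT, or FALLBACK when HEADER_COUNT is not
   positive, so that a zero profile count never reaches the division.  */
static long
ratio_or_fallback (long hot_total, long header_count, long fallback)
{
  if (header_count <= 0)
    return fallback;

  return hot_total / header_count;
}
```

Putting the count test into the same condition that selects the dividing branch, as the patch does, has the same effect: a loop whose header count is zero simply takes the other path.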
Re: [google] Fix exception in unroller code size heuristics (issue6498112)
On Mon, Sep 10, 2012 at 10:16 AM, Teresa Johnson wrote: > 2012-09-10 Teresa Johnson > > * loop-unroll.c (code_size_limit_factor): OK. Diego.
[PATCH] Fix PR54492
Richard found some N^2 behavior in SLSR that has to be suppressed. Searching for the best possible basis is overkill when there are hundreds of thousands of possibilities. This patch constrains the search to "good enough" in such cases. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no regressions. Ok for trunk? Thanks, Bill 2012-08-10 Bill Schmidt * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit the time spent searching for a basis. Index: gcc/gimple-ssa-strength-reduction.c === --- gcc/gimple-ssa-strength-reduction.c (revision 191135) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -353,10 +353,14 @@ find_basis_for_candidate (slsr_cand_t c) cand_chain_t chain; slsr_cand_t basis = NULL; + // Limit potential of N^2 behavior for long candidate chains. + int iters = 0; + const int MAX_ITERS = 50; + mapping_key.base_expr = c->base_expr; chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); - for (; chain; chain = chain->next) + for (; chain && iters < MAX_ITERS; chain = chain->next, ++iters) { slsr_cand_t one_basis = chain->cand;
Re: [PATCH] Fix PR54492
On Mon, 10 Sep 2012, William J. Schmidt wrote: > Richard found some N^2 behavior in SLSR that has to be suppressed. > Searching for the best possible basis is overkill when there are > hundreds of thousands of possibilities. This patch constrains the > search to "good enough" in such cases. > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no > regressions. Ok for trunk? Hm, rather than stopping the search, can we stop adding new candidates instead so the list never grows that long? If that's not easy the patch is ok as-is. Thanks, Richard. > Thanks, > Bill > > > 2012-08-10 Bill Schmidt > > * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit > the time spent searching for a basis. > > > Index: gcc/gimple-ssa-strength-reduction.c > === > --- gcc/gimple-ssa-strength-reduction.c (revision 191135) > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > @@ -353,10 +353,14 @@ find_basis_for_candidate (slsr_cand_t c) >cand_chain_t chain; >slsr_cand_t basis = NULL; > > + // Limit potential of N^2 behavior for long candidate chains. > + int iters = 0; > + const int MAX_ITERS = 50; > + >mapping_key.base_expr = c->base_expr; >chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); > > - for (; chain; chain = chain->next) > + for (; chain && iters < MAX_ITERS; chain = chain->next, ++iters) > { >slsr_cand_t one_basis = chain->cand; > > > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Fix PR54492
On Mon, Sep 10, 2012 at 04:45:24PM +0200, Richard Guenther wrote: > On Mon, 10 Sep 2012, William J. Schmidt wrote: > > > Richard found some N^2 behavior in SLSR that has to be suppressed. > > Searching for the best possible basis is overkill when there are > > hundreds of thousands of possibilities. This patch constrains the > > search to "good enough" in such cases. > > > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no > > regressions. Ok for trunk? > > Hm, rather than stopping the search, can we stop adding new candidates > instead so the list never grows that long? If that's not easy > the patch is ok as-is. Don't we want a param for that, or is a hardcoded magic constant fine here? > > 2012-08-10 Bill Schmidt > > > > * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit > > the time spent searching for a basis. > > > > > > Index: gcc/gimple-ssa-strength-reduction.c > > === > > --- gcc/gimple-ssa-strength-reduction.c (revision 191135) > > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > > @@ -353,10 +353,14 @@ find_basis_for_candidate (slsr_cand_t c) > >cand_chain_t chain; > >slsr_cand_t basis = NULL; > > > > + // Limit potential of N^2 behavior for long candidate chains. > > + int iters = 0; > > + const int MAX_ITERS = 50; > > + > >mapping_key.base_expr = c->base_expr; > >chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); > > > > - for (; chain; chain = chain->next) > > + for (; chain && iters < MAX_ITERS; chain = chain->next, ++iters) > > { > >slsr_cand_t one_basis = chain->cand; Jakub
Re: [PATCH] Fix PR54492
On Mon, 2012-09-10 at 16:45 +0200, Richard Guenther wrote: > On Mon, 10 Sep 2012, William J. Schmidt wrote: > > > Richard found some N^2 behavior in SLSR that has to be suppressed. > > Searching for the best possible basis is overkill when there are > > hundreds of thousands of possibilities. This patch constrains the > > search to "good enough" in such cases. > > > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no > > regressions. Ok for trunk? > > Hm, rather than stopping the search, can we stop adding new candidates > instead so the list never grows that long? If that's not easy > the patch is ok as-is. I think this way is probably better. Right now the potential bases are organized as a stack with new ones added to the front and considered first. To disable it there would require adding state to keep a count, and then we would only be looking at the most distant ones. This way the 50 most recently added potential bases (most likely to be local) are considered. Thanks, Bill > > Thanks, > Richard. > > > Thanks, > > Bill > > > > > > 2012-08-10 Bill Schmidt > > > > * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit > > the time spent searching for a basis. > > > > > > Index: gcc/gimple-ssa-strength-reduction.c > > === > > --- gcc/gimple-ssa-strength-reduction.c (revision 191135) > > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > > @@ -353,10 +353,14 @@ find_basis_for_candidate (slsr_cand_t c) > >cand_chain_t chain; > >slsr_cand_t basis = NULL; > > > > + // Limit potential of N^2 behavior for long candidate chains. > > + int iters = 0; > > + const int MAX_ITERS = 50; > > + > >mapping_key.base_expr = c->base_expr; > >chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); > > > > - for (; chain; chain = chain->next) > > + for (; chain && iters < MAX_ITERS; chain = chain->next, ++iters) > > { > >slsr_cand_t one_basis = chain->cand; > > > > > > > > >
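Bill's argument above is that the chain behaves like a stack, so capping the walk keeps the most recently added entries in play. A simplified standalone sketch of that idea follows. The `struct cand` type, the `score` field and both function names are hypothetical; the real code compares candidates by criteria that are not shown in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-in for a chain entry; SCORE replaces whatever
   criteria the real pass uses to prefer one basis over another.  */
struct cand
{
  int score;
  struct cand *next;
};

/* Push C on the front of the chain, so newer entries are visited first.  */
static struct cand *
push_front (struct cand *head, struct cand *c)
{
  c->next = head;
  return c;
}

/* Return the best-scoring entry among at most MAX_ITERS entries from
   the front of the chain, or NULL if the chain is empty.  */
static struct cand *
find_best_bounded (struct cand *head, int max_iters)
{
  struct cand *best = NULL;
  struct cand *c;
  int iters = 0;

  for (c = head; c && iters < max_iters; c = c->next, ++iters)
    if (!best || c->score > best->score)
      best = c;

  return best;
}
```

With the bound in place, an older entry with a better score can be missed. That is the "good enough" trade-off described in the original submission: the cost per lookup stays constant however long the chain grows.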
Re: [PATCH] Fix PR54492
On Mon, 10 Sep 2012, Jakub Jelinek wrote: > On Mon, Sep 10, 2012 at 04:45:24PM +0200, Richard Guenther wrote: > > On Mon, 10 Sep 2012, William J. Schmidt wrote: > > > > > Richard found some N^2 behavior in SLSR that has to be suppressed. > > > Searching for the best possible basis is overkill when there are > > > hundreds of thousands of possibilities. This patch constrains the > > > search to "good enough" in such cases. > > > > > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no > > > regressions. Ok for trunk? > > > > Hm, rather than stopping the search, can we stop adding new candidates > > instead so the list never grows that long? If that's not easy > > the patch is ok as-is. > > Don't we want a param for that, or is a hardcoded magic constant fine here? I suppose a param for it would be nice. Richard. > > > 2012-08-10 Bill Schmidt > > > > > > * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit > > > the time spent searching for a basis. > > > > > > > > > Index: gcc/gimple-ssa-strength-reduction.c > > > === > > > --- gcc/gimple-ssa-strength-reduction.c (revision 191135) > > > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > > > @@ -353,10 +353,14 @@ find_basis_for_candidate (slsr_cand_t c) > > >cand_chain_t chain; > > >slsr_cand_t basis = NULL; > > > > > > + // Limit potential of N^2 behavior for long candidate chains. > > > + int iters = 0; > > > + const int MAX_ITERS = 50; > > > + > > >mapping_key.base_expr = c->base_expr; > > >chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); > > > > > > - for (; chain; chain = chain->next) > > > + for (; chain && iters < MAX_ITERS; chain = chain->next, ++iters) > > > { > > >slsr_cand_t one_basis = chain->cand; > > Jakub > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
[Patch][AArch64] Expand binary operations' constant operands for neon intrinsics.
Hi, This patch expands an Advanced SIMD intrinsic's operand into a constant operand only if the predicate allows it. Regression-tested on aarch64-none-elf. OK for aarch64-branch? Thanks, Tejas Belagod ARM. Changelog 2012-09-10 Tejas Belagod gcc/ * config/aarch64/aarch64.c (aarch64_simd_expand_builtin): Expand binary operations' constant operand only if the predicate allows it.diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 04cc48a..731f369 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1215,13 +1215,17 @@ aarch64_simd_expand_builtin (int fcode, tree exp, rtx target) case AARCH64_SIMD_BINOP: { - bool op1_const_int_p - = CONST_INT_P (expand_normal (CALL_EXPR_ARG (exp, 1))); - return aarch64_simd_expand_args (target, icode, 1, exp, -SIMD_ARG_COPY_TO_REG, -op1_const_int_p ? SIMD_ARG_CONSTANT -: SIMD_ARG_COPY_TO_REG, -SIMD_ARG_STOP); +rtx arg2 = expand_normal (CALL_EXPR_ARG (exp, 1)); +/* Handle constants only if the predicate allows it. */ + bool op1_const_int_p = + (CONST_INT_P (arg2) + && (*insn_data[icode].operand[2].predicate) + (arg2, insn_data[icode].operand[2].mode)); + return aarch64_simd_expand_args + (target, icode, 1, exp, + SIMD_ARG_COPY_TO_REG, + op1_const_int_p ? SIMD_ARG_CONSTANT : SIMD_ARG_COPY_TO_REG, + SIMD_ARG_STOP); } case AARCH64_SIMD_TERNOP:
[Patch][AArch64] Split a move of Q-reg vectors contained in general regs.
Hi, This patch fixes the mov pattern to split a move between general regs that contain a Q-reg vector value. Regression-tested on aarch64-none-elf. OK for aarch64-branch? Thanks, Tejas Belagod ARM. Changelog: 2012-09-10 Tejas Belagod gcc/ * config/aarch64/aarch64-simd.md (*aarch64_simd_mov): Split Q-reg vector value move contained in general registers.diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d3f8ef2..1113b06 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -443,7 +443,7 @@ case 2: return "orr\t%0., %1., %1."; case 3: return "umov\t%0, %1.d[0]\;umov\t%H0, %1.d[1]"; case 4: return "ins\t%0.d[0], %1\;ins\t%0.d[1], %H1"; - case 5: return "mov\t%0, %1;mov\t%H0, %H1"; + case 5: return "#"; case 6: { int is_valid; @@ -475,6 +475,27 @@ (set_attr "length" "4,4,4,8,8,8,4")] ) +(define_split + [(set (match_operand:VQ 0 "register_operand" "") + (match_operand:VQ 1 "register_operand" ""))] + "TARGET_SIMD && reload_completed + && GP_REGNUM_P (REGNO (operands[0])) + && GP_REGNUM_P (REGNO (operands[1]))" + [(set (match_dup 0) (match_dup 1)) + (set (match_dup 2) (match_dup 3))] +{ + int rdest = REGNO (operands[0]); + int rsrc = REGNO (operands[1]); + rtx dest[2], src[2]; + + dest[0] = gen_rtx_REG (DImode, rdest); + src[0] = gen_rtx_REG (DImode, rsrc); + dest[1] = gen_rtx_REG (DImode, rdest + 1); + src[1] = gen_rtx_REG (DImode, rsrc + 1); + + aarch64_simd_disambiguate_copy (operands, dest, src, 2); +}) + (define_insn "orn3" [(set (match_operand:VDQ 0 "register_operand" "=w") (ior:VDQ (not:VDQ (match_operand:VDQ 1 "register_operand" "w"))
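The split above turns one 128-bit move between general register pairs into two DImode moves and leaves the ordering to aarch64_simd_disambiguate_copy, whose body is not shown in this message. The sketch below illustrates why ordering matters when the pairs overlap. It assumes a register file modelled as a plain array; the function name is hypothetical and this is not the GCC helper.

```c
/* Hypothetical model, not aarch64_simd_disambiguate_copy.  Copy the
   register pair (SRC, SRC + 1) to (DEST, DEST + 1) inside REGS without
   overwriting a source half before it has been read.  */
static void
copy_reg_pair (long *regs, int dest, int src)
{
  if (dest == src + 1)
    {
      /* The low destination aliases the high source, so the high half
         has to be moved first.  */
      regs[dest + 1] = regs[src + 1];
      regs[dest] = regs[src];
    }
  else
    {
      /* Either there is no overlap, or the high destination aliases the
         low source; low-half-first is safe in both cases.  */
      regs[dest] = regs[src];
      regs[dest + 1] = regs[src + 1];
    }
}
```

Emitting the two moves in a fixed order would clobber one source half in the overlapping case, which is presumably why the split defers to a helper instead of always emitting low then high.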
[Patch][AArch64] Tighten predicate for CMP pattern.
Hi, This patch tightens the predicate for the CMP pattern. It makes it restrictive to accept reg or zero as prescribed by the architecture. Regression-tested on aarch64-none-elf. OK for aarch64-branch? Thanks, Tejas Belagod ARM. PS: This patch applies over vldn-vstn.txt sent out earlier. Changelog: 2012-09-10 Tejas Belagod gcc/ * config/aarch64/aarch64-simd.md (aarch64_cm): Tighten predicate for operand 2 of the compare pattern to accept register or zero. * config/aarch64/predicates.md (aarch64_simd_reg_or_zero): New.diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d3f8ef2..50114aa 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2670,7 +2670,7 @@ [(set (match_operand: 0 "register_operand" "=w,w") (unspec: [(match_operand:VSDQ_I_DI 1 "register_operand" "w,w") - (match_operand:VSDQ_I_DI 2 "nonmemory_operand" "w,Z")] + (match_operand:VSDQ_I_DI 2 "aarch64_simd_reg_or_zero" "w,Z")] VCMP_S))] "TARGET_SIMD" "@ diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index 328e5cf..f40ab56 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -265,3 +265,10 @@ { return aarch64_simd_shift_imm_p (op, mode, false); }) + +(define_predicate "aarch64_simd_reg_or_zero" + (and (match_code "reg,subreg,const_int,const_vector") + (ior (match_operand 0 "register_operand") + (ior (match_test "op == const0_rtx") +(match_test "aarch64_simd_imm_zero_p (op, mode)") +
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Mon, 10 Sep 2012, Iyer, Balaji V wrote: > So, I assume it is OK for me to include testsuites with the > code-changes? I included them separately because I remember someone in > the mailing list saying the patch size must be small and one logical way > is to put test cases separately from the code-changes. I believe the message size limit for gcc-patches is 400 kB, and this patch is way below that. But if the self-contained unit of changes exceeds 400 kB, you should still post the whole thing at once in multiple messages so that a matched set of patches that could go in together is available for review. -- Joseph S. Myers jos...@codesourcery.com
[Patch][AArch64] Move immediate into Advanced SIMD scalar.
Hi, This patch adds support for moving an immediate DImode value into an AdvSIMD scalar D register, i.e. movi Dd, #imm. Regression-tested on aarch64-none-elf. OK for aarch64-branch? Thanks, Tejas Belagod. ARM. Changelog: 2012-09-10 Tejas Belagod gcc/ * config/aarch64/aarch64-protos.h (aarch64_simd_imm_scalar_p): Declare. * config/aarch64/aarch64.c (aarch64_simd_imm_scalar_p): New. * config/aarch64/aarch64.md (*movdi_aarch64): Add alternative for moving valid scalar immediate into an Advanced SIMD D-register. * config/aarch64/constraints.md (Dd): New. diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index afb8b1e..e6d35e4 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -178,6 +178,7 @@ bool aarch64_pad_arg_upward (enum machine_mode, const_tree); bool aarch64_pad_reg_upward (enum machine_mode, const_tree, bool); bool aarch64_regno_ok_for_base_p (int, bool); bool aarch64_regno_ok_for_index_p (int, bool); +bool aarch64_simd_imm_scalar_p (rtx x, enum machine_mode mode); bool aarch64_simd_imm_zero_p (rtx, enum machine_mode); bool aarch64_simd_shift_imm_p (rtx, enum machine_mode, bool); bool aarch64_symbolic_address_p (rtx); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 10af252..b90be6d 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -6508,6 +6508,23 @@ aarch64_simd_imm_zero_p (rtx x, enum machine_mode mode) return true; } +bool +aarch64_simd_imm_scalar_p (rtx x, enum machine_mode mode ATTRIBUTE_UNUSED) +{ + HOST_WIDE_INT imm = INTVAL (x); + int i; + + for (i = 0; i < 8; i++) +{ + unsigned int byte = imm & 0xff; + if (byte != 0xff && byte != 0) + return false; + imm >>= 8; +} + + return true; +} + /* Return a const_int vector of VAL. 
*/ rtx aarch64_simd_gen_const_vector_dup (enum machine_mode mode, int val) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 8f52ed4..78a71fe 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -950,8 +950,8 @@ ) (define_insn "*movdi_aarch64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,m, r, r, *w, r,*w") - (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,m,rZ,Usa,Ush,rZ,*w,*w"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,m, r, r, *w, r,*w,w") + (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,m,rZ,Usa,Ush,rZ,*w,*w,Dd"))] "(register_operand (operands[0], DImode) || aarch64_reg_or_zero (operands[1], DImode))" "@ @@ -965,10 +965,11 @@ adrp\\t%x0, %A1 fmov\\t%d0, %x1 fmov\\t%x0, %d1 - fmov\\t%d0, %d1" - [(set_attr "v8type" "move,move,move,alu,load1,store1,adr,adr,fmov,fmov,fmov") + fmov\\t%d0, %d1 + movi\\t%d0, %1" + [(set_attr "v8type" "move,move,move,alu,load1,store1,adr,adr,fmov,fmov,fmov,fmov") (set_attr "mode" "DI") - (set_attr "fp" "*,*,*,*,*,*,*,*,yes,yes,yes")] + (set_attr "fp" "*,*,*,*,*,*,*,*,yes,yes,yes,yes")] ) (define_insn "insv_imm" diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 267b0b8..fe61307 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -159,3 +159,9 @@ A constraint that matches vector of immediate zero." (and (match_code "const_vector") (match_test "aarch64_simd_imm_zero_p (op, GET_MODE (op))"))) + +(define_constraint "Dd" + "@internal + A constraint that matches an immediate operand valid for AdvSIMD scalar." + (and (match_code "const_int") + (match_test "aarch64_simd_imm_scalar_p (op, GET_MODE (op))")))
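The test in aarch64_simd_imm_scalar_p accepts a 64-bit immediate only when each of its eight bytes is either 0x00 or 0xff. A standalone sketch of the same byte check follows, using `uint64_t` in place of HOST_WIDE_INT and rtx. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical standalone version of the byte test in the patch:
   return true if every byte of IMM is either 0x00 or 0xff.  */
static bool
imm_bytes_all_zero_or_ones (uint64_t imm)
{
  int i;

  for (i = 0; i < 8; i++)
    {
      unsigned int byte = imm & 0xff;

      if (byte != 0xff && byte != 0)
        return false;

      /* IMM is unsigned, so the shift brings in zero bits.  */
      imm >>= 8;
    }

  return true;
}
```

For example, 0xff00ff0000ffff00 passes the test, while 0x1 and 0xff00ff0000ffff80 do not.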
[Patch][AArch64] Fix vfmaq_lane_f64.
Hi,

This patch fixes the vfmaq_lane_f64 () AdvSIMD intrinsic.

Regression-tested on aarch64-none-elf.

OK for aarch64-branch?

Thanks,
Tejas Belagod.
ARM.

Changelog:

2012-09-10  Tejas Belagod

gcc/
	* config/aarch64/arm_neon.h (vfmaq_lane_f64): Fix prototype and
	assembler template accordingly.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index de3a2f2..54eb29c 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -7859,15 +7859,16 @@ vfmaq_f64 (float64x2_t a, float64x2_t b, float64x2_t c)
        result; \
      })
 
-#define vfmaq_lane_f64(a, b, c) \
+#define vfmaq_lane_f64(a, b, c, d) \
   __extension__ \
     ({ \
+       float64x2_t c_ = (c); \
        float64x2_t b_ = (b); \
        float64x2_t a_ = (a); \
        float64x2_t result; \
-       __asm__ ("fmla %0.2d,%1.2d,%2.d[%3]" \
+       __asm__ ("fmla %0.2d,%2.2d,%3.d[%4]" \
                 : "=w"(result) \
-                : "w"(a_), "w"(b_), "i"(c) \
+                : "0"(a_), "w"(b_), "w"(c_), "i"(d) \
                 : /* No clobbers */); \
        result; \
      })
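
As a reading aid, the operation the corrected template is meant to perform can be written in scalar C. This is a sketch under assumptions: the two-element struct f64x2 is a hypothetical stand-in for float64x2_t, fmla_lane_ref is an illustrative name and not an arm_neon.h function, and the model ignores the single rounding of a real fused multiply-add. It mirrors the operand roles in the fixed asm: the accumulator a is tied to the result, and each element of b is multiplied by lane d of c.

```c
#include <assert.h>

/* Hypothetical stand-in for float64x2_t: two double lanes.  */
typedef struct { double lane[2]; } f64x2;

/* Scalar model of "fmla Vd.2d, Vn.2d, Vm.d[lane]" as used by the fixed
   vfmaq_lane_f64 macro: result = a + b * c[lane], element by element.  */
f64x2
fmla_lane_ref (f64x2 a, f64x2 b, f64x2 c, int lane)
{
  f64x2 result = a;   /* the "0" constraint: the accumulator starts as a */
  int i;

  assert (lane >= 0 && lane < 2);   /* the "i" operand must name a valid lane */
  for (i = 0; i < 2; i++)
    result.lane[i] += b.lane[i] * c.lane[lane];

  return result;
}
```

With the old three-argument macro there was no separate multiplicand vector and the accumulator was not tied to the output, so the model above could not be expressed at all.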
[Patch][AArch64] Implement vmovq_n_f64.
Hi,

This patch adds the missing intrinsic vmovq_n_f64().

OK?

Thanks,
Tejas Belagod
ARM.

Changelog:

2012-09-10  Tejas Belagod

gcc/
	* config/aarch64/arm_neon.h (vmovq_n_f64): Add.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index e7dadf9..cf8b676 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -11753,6 +11753,12 @@ vmovq_n_f32 (float32_t a)
   return result;
 }
 
+__extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
+vmovq_n_f64 (float64_t a)
+{
+  return (float64x2_t) {a, a};
+}
+
 __extension__ static __inline poly8x16_t __attribute__ ((__always_inline__))
 vmovq_n_p8 (uint32_t a)
 {
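
The added intrinsic duplicates a scalar into both lanes with a vector compound literal. A minimal sketch of the same idiom outside arm_neon.h, assuming a compiler that supports the GNU vector_size attribute and vector subscripting (GCC or Clang); the type v2df and the function dup_f64 are hypothetical names chosen for illustration:

```c
/* Hypothetical 128-bit vector of two doubles, built with the GNU
   vector extension rather than the arm_neon.h typedefs.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Broadcast A into both lanes, in the way vmovq_n_f64 does in the patch.  */
static inline v2df
dup_f64 (double a)
{
  return (v2df) {a, a};
}
```

Because no inline asm is involved, the compiler is free to choose how to materialise the duplicated value.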
[Patch][AArch64] Fix Narrowing high shifts.
Hi,

The attached patch has fixes to assembler templates for rshrn2 and shrn2.

OK?

Thanks,
Tejas Belagod.
ARM.

Changelog:

2012-09-10  Tejas Belagod

gcc/
	* config/aarch64/arm_neon.h (vrshrn_high_n_s16, vrshrn_high_n_s32,
	vrshrn_high_n_s64, vrshrn_high_n_u16, vrshrn_high_n_u32,
	vrshrn_high_n_u64, vshrn_high_n_s16, vshrn_high_n_s32,
	vshrn_high_n_s32, vshrn_high_n_s64, vshrn_high_n_u16, vshrn_high_n_u32,
	vshrn_high_n_u64): Fix template to reference correct operands.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 46abaf6..a4b2e78 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -15334,7 +15334,7 @@ vrndqp_f64 (float64x2_t a)
        int8x8_t a_ = (a); \
        int8x16_t result = vcombine_s8 \
                             (a_, vcreate_s8 (UINT64_C (0x0))); \
-       __asm__ ("rshrn2 %0.16b,%2.8h,#%3" \
+       __asm__ ("rshrn2 %0.16b,%1.8h,#%2" \
                 : "+w"(result) \
                 : "w"(b_), "i"(c) \
                 : /* No clobbers */); \
@@ -15348,7 +15348,7 @@ vrndqp_f64 (float64x2_t a)
        int16x4_t a_ = (a); \
        int16x8_t result = vcombine_s16 \
                             (a_, vcreate_s16 (UINT64_C (0x0))); \
-       __asm__ ("rshrn2 %0.8h,%2.4s,#%3" \
+       __asm__ ("rshrn2 %0.8h,%1.4s,#%2" \
                 : "+w"(result) \
                 : "w"(b_), "i"(c) \
                 : /* No clobbers */); \
@@ -15362,7 +15362,7 @@ vrndqp_f64 (float64x2_t a)
        int32x2_t a_ = (a); \
        int32x4_t result = vcombine_s32 \
                             (a_, vcreate_s32 (UINT64_C (0x0))); \
-       __asm__ ("rshrn2 %0.4s,%2.2d,#%3" \
+       __asm__ ("rshrn2 %0.4s,%1.2d,#%2" \
                 : "+w"(result) \
                 : "w"(b_), "i"(c) \
                 : /* No clobbers */); \
@@ -15376,7 +15376,7 @@ vrndqp_f64 (float64x2_t a)
        uint8x8_t a_ = (a); \
        uint8x16_t result = vcombine_u8 \
                              (a_, vcreate_u8 (UINT64_C (0x0))); \
-       __asm__ ("rshrn2 %0.16b,%2.8h,#%3" \
+       __asm__ ("rshrn2 %0.16b,%1.8h,#%2" \
                 : "+w"(result) \
                 : "w"(b_), "i"(c) \
                 : /* No clobbers */); \
@@ -15390,7 +15390,7 @@ vrndqp_f64 (float64x2_t a)
        uint16x4_t a_ = (a); \
        uint16x8_t result = vcombine_u16 \
                              (a_, vcreate_u16 (UINT64_C (0x0))); \
-       __asm__ ("rshrn2 %0.8h,%2.4s,#%3" \
+       __asm__ ("rshrn2 %0.8h,%1.4s,#%2" \
                 : "+w"(result) \
                 : "w"(b_), "i"(c) \
                 : /* No clobbers */); \
@@ -15404,7 +15404,7 @@ vrndqp_f64 (float64x2_t a)
        uint32x2_t a_ = (a); \
        uint32x4_t result = vcombine_u32 \
                              (a_, vcreate_u32 (UINT64_C (0x0))); \
-       __asm__ ("rshrn2 %0.4s,%2.2d,#%3" \
+       __asm__ ("rshrn2 %0.4s,%1.2d,#%2" \
                 : "+w"(result) \
                 : "w"(b_), "i"(c) \
                 : /* No clobbers */); \
@@ -16088,7 +16088,7 @@ vrsubhn_u64 (uint64x2_t a, uint64x2_t b)
        int8x8_t a_ = (a); \
        int8x16_t result = vcombine_s8 \
                             (a_, vcreate_s8 (UINT64_C (0x0))); \
-       __asm__ ("shrn2 %0.16b,%2.8h,#%3" \
+       __asm__ ("shrn2 %0.16b,%1.8h,#
Re: [PATCH] Combine location with block using block_locations
On Mon, Sep 10, 2012 at 3:01 AM, Richard Guenther wrote:
> On Sun, Sep 9, 2012 at 12:26 AM, Dehao Chen wrote:
>> Hi, Diego,
>>
>> Thanks a lot for the review. I've updated the patch.
>>
>> This patch is large and may easily break builds because it reserves
>> more complete information for TREE_BLOCK as well as gimple_block (may
>> trigger bugs that were hidden when this info is unavailable). I've
>> done more rigorous testing to ensure that most bugs are caught before
>> checking in.
>>
>> * Sync to the head and retest all gcc testsuite.
>> * Port the patch to google-4_7 branch to retest all gcc testsuite, as
>> well as build many large applications.
>>
>> Through these tests, I've found two additional bugs that were omitted
>> in the original implementation. A new patch is attached (patch.txt) to
>> fix these problems. After this fix, all gcc testsuites pass for both
>> trunk and google-4_7 branch. I've also copy pasted the new fixes
>> (lto.c and tree-cfg.c) below. Now I'd say this patch is in good shape.
>> But it may not be perfect. I'll look into build failures as soon as
>> they arise.
>>
>> Richard and Diego, could you help me take a look at the following two fixes?
>>
>> Thanks,
>> Dehao
>>
>> New fixes:
>> --- gcc/lto/lto.c (revision 191083)
>> +++ gcc/lto/lto.c (working copy)
>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t)
>>  {
>>    enum tree_code code = TREE_CODE (t);
>>    LTO_NO_PREVAIL (TREE_TYPE (t));
>> -  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
>> -    LTO_NO_PREVAIL (TREE_CHAIN (t));
>
> That change is odd. Can you show us how it breaks?

This will break the LTO build of gcc.c-torture/execute/pr38051.c

There is a data structure like:

union { long int l; char c[sizeof (long int)]; } u;

Once the block info is reserved for this, it'll reserve this data
structure. And inside this data structure, there is a VAR_DECL. Thus the
LTO_NO_PREVAIL assertion is not satisfied here for TREE_CHAIN (t).

>
>>    if (DECL_P (t))
>>      {
>>        LTO_NO_PREVAIL (DECL_NAME (t));
>>
>> Index: gcc/tree-cfg.c
>> ===
>> --- gcc/tree-cfg.c (revision 191083)
>> +++ gcc/tree-cfg.c (working copy)
>> @@ -5980,9 +5974,21 @@ move_stmt_op (tree *tp, int *walk_subtrees, void *
>>    tree t = *tp;
>>
>>    if (EXPR_P (t))
>> -    /* We should never have TREE_BLOCK set on non-statements. */
>> -    gcc_assert (!TREE_BLOCK (t));
>> -
>> +    {
>> +      tree block = TREE_BLOCK (t);
>> +      if (p->orig_block == NULL_TREE
>> +          || block == p->orig_block
>> +          || block == NULL_TREE)
>> +        TREE_SET_BLOCK (t, p->new_block);
>> +#ifdef ENABLE_CHECKING
>> +      else if (block != p->new_block)
>> +        {
>> +          while (block && block != p->orig_block)
>> +            block = BLOCK_SUPERCONTEXT (block);
>> +          gcc_assert (block);
>> +        }
>> +#endif
>
> I think what this means is that TREE_BLOCK on non-stmts are meaningless
> (thus only gimple_block is interesting on GIMPLE, not BLOCKs on trees).
>
> So instead of setting a BLOCK in some cases you should clear BLOCK
> if it happens to be set, or alternatively, only re-set it if there was
> a block associated with it.

Yeah, makes sense. New change:

@@ -5980,9 +5974,10 @@
   tree t = *tp;
 
   if (EXPR_P (t))
-    /* We should never have TREE_BLOCK set on non-statements. */
-    gcc_assert (!TREE_BLOCK (t));
-
+    {
+      if (TREE_BLOCK (t))
+        TREE_SET_BLOCK (t, p->new_block);
+    }
   else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME)
     {
       if (TREE_CODE (t) == SSA_NAME)

Thanks,
Dehao

>
> Richard.
>
>> +    }
>>    else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME)
>>      {
>>        if (TREE_CODE (t) == SSA_NAME)
>>
>> Whole patch:
>> gcc/ChangeLog:
>> 2012-09-08 Dehao Chen
>>
>> * toplev.c (general_init): Init block_locations.
>> * tree.c (tree_set_block): New.
>> (tree_block): Change to use LOCATION_BLOCK.
>> * tree.h (TREE_SET_BLOCK): New.
>> * final.c (reemit_insn_block_notes): Change to use LOCATION_BLOCK.
>> (final_start_function): Likewise.
>> * input.c (expand_location_1): Likewise.
>> * input.h (LOCATION_LOCUS): New.
>> (LOCATION_BLOCK): New.
>> (IS_UNKNOWN_LOCATION): New.
>> * fold-const.c (expr_location_or): Change to use new location.
>> * reorg.c (emit_delay_sequence): Likewise.
>> (try_merge_delay_insns): Likewise.
>> * modulo-sched.c (dump_insn_location): Likewise.
>> * lto-streamer-out.c (lto_output_location_bitpack): Likewise.
>> * jump.c (rtx_renumbered_equal_p): Likewise.
>> * ifcvt.c (noce_try_move): Likewise.
>> (noce_try_store_flag): Likewise.
>> (noce_try_store_flag_constants): Likewise.
>> (noce_try_addcc): Likewise.
>> (noce_try_store_flag_mask): Likewise.
>> (noce_try_cmove): Likewise.
>> (noce_try_cmove_arith): L
[Patch][AArch64] Implement TARGET_SHIFT_TRUNCATION_MASK.
Hi,

The attached patch implements the TARGET_SHIFT_TRUNCATION_MASK target hook.

Regression-tested on aarch64-none-elf.

OK for aarch64-branch?

Thanks,
Tejas Belagod
ARM.

PS: This patch applies over vldn-vstn.txt sent earlier.

Changelog:

2012-09-10  Tejas Belagod

gcc/
	* config/aarch64/aarch64.c (aarch64_shift_truncation_mask): Define.
	(TARGET_SHIFT_TRUNCATION_MASK): Define.
	* config/aarch64/aarch64.h (SHIFT_COUNT_TRUNCATED): Conditionalize on
	TARGET_SIMD.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 20b23d2..7952530 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6677,6 +6677,14 @@ aarch64_simd_attr_length_move (rtx insn)
   return 4;
 }
 
+static unsigned HOST_WIDE_INT
+aarch64_shift_truncation_mask (enum machine_mode mode)
+{
+  return
+    (aarch64_vector_mode_supported_p (mode)
+     || aarch64_vect_struct_mode_p (mode)) ? 0 : (GET_MODE_BITSIZE (mode) - 1);
+}
+
 #ifndef TLS_SECTION_ASM_FLAG
 #define TLS_SECTION_ASM_FLAG 'T'
 #endif
@@ -6930,6 +6938,9 @@ aarch64_c_mode_for_suffix (char suffix)
 #undef TARGET_SECONDARY_RELOAD
 #define TARGET_SECONDARY_RELOAD aarch64_secondary_reload
 
+#undef TARGET_SHIFT_TRUNCATION_MASK
+#define TARGET_SHIFT_TRUNCATION_MASK aarch64_shift_truncation_mask
+
 #undef TARGET_SETUP_INCOMING_VARARGS
 #define TARGET_SETUP_INCOMING_VARARGS aarch64_setup_incoming_varargs
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 28cafa9..8dfcd44 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -786,7 +786,7 @@ enum aarch64_builtins
    : 0)
 
 
-#define SHIFT_COUNT_TRUNCATED 1
+#define SHIFT_COUNT_TRUNCATED !TARGET_SIMD
 
 /* Callee only saves lower 64-bits of a 128-bit register. Tell the
    compiler the callee clobbers the top 64-bits when restoring the
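
To make the hook's return value concrete: a non-zero mask tells the middle end that shift counts are implicitly ANDed with that mask, while 0 promises nothing. The sketch below models that contract in plain C under assumptions: mode widths are passed as integers and a boolean stands in for the two vector-mode predicates, and both function names are hypothetical rather than GCC interfaces.

```c
#include <stdbool.h>

/* Hypothetical model of the hook in the patch: scalar modes truncate the
   shift count to the mode width (mask = bits - 1); vector and vector
   structure modes make no promise (mask = 0).  */
unsigned long
shift_truncation_mask (unsigned int mode_bits, bool vector_mode_p)
{
  return vector_mode_p ? 0 : (unsigned long) mode_bits - 1;
}

/* How a consumer might use the mask: with a non-zero mask only the low
   bits of COUNT matter, so COUNT & MASK is an equivalent count.  With a
   zero mask the count must be left alone.  */
unsigned long
effective_shift_count (unsigned long count, unsigned long mask)
{
  return mask ? (count & mask) : count;
}
```

Under this model a 64-bit scalar shift by 65 may be treated as a shift by 1, whereas the same simplification is not allowed for a vector mode.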
Re: [PATCH, AArch64] Allow symbol+offset even if not being used for memory access
On 09/06/2012 10:19 AM, Ian Bolton wrote: > Based on that, and assuming I remove the constraints on the > pattern, would you say the patch is worthy of commit? Can you send me the test case you were looking at for this? r~
Re: [PATCH] Fix PR54492
On Mon, 2012-09-10 at 16:56 +0200, Richard Guenther wrote:
> On Mon, 10 Sep 2012, Jakub Jelinek wrote:
>
> > On Mon, Sep 10, 2012 at 04:45:24PM +0200, Richard Guenther wrote:
> > > On Mon, 10 Sep 2012, William J. Schmidt wrote:
> > >
> > > > Richard found some N^2 behavior in SLSR that has to be suppressed.
> > > > Searching for the best possible basis is overkill when there are
> > > > hundreds of thousands of possibilities. This patch constrains the
> > > > search to "good enough" in such cases.
> > > >
> > > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no
> > > > regressions. Ok for trunk?
> > >
> > > Hm, rather than stopping the search, can we stop adding new candidates
> > > instead so the list never grows that long? If that's not easy
> > > the patch is ok as-is.
> >
> > Don't we want a param for that, or is a hardcoded magic constant fine here?
>
> I suppose a param for it would be nice.

OK, I'll get a param in place and get back to you. Thanks...

Bill

>
> Richard.
>
> > > > 2012-08-10 Bill Schmidt
> > > >
> > > > * gimple-ssa-strength-reduction.c (find_basis_for_candidate):
> > > > Limit the time spent searching for a basis.
> > > >
> > > >
> > > > Index: gcc/gimple-ssa-strength-reduction.c
> > > > ===
> > > > --- gcc/gimple-ssa-strength-reduction.c (revision 191135)
> > > > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > > > @@ -353,10 +353,14 @@ find_basis_for_candidate (slsr_cand_t c)
> > > >    cand_chain_t chain;
> > > >    slsr_cand_t basis = NULL;
> > > >
> > > > +  // Limit potential of N^2 behavior for long candidate chains.
> > > > +  int iters = 0;
> > > > +  const int MAX_ITERS = 50;
> > > > +
> > > >    mapping_key.base_expr = c->base_expr;
> > > >    chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key);
> > > >
> > > > -  for (; chain; chain = chain->next)
> > > > +  for (; chain && iters < MAX_ITERS; chain = chain->next, ++iters)
> > > >      {
> > > >        slsr_cand_t one_basis = chain->cand;
> >
> > Jakub
> >
>
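
The idea in the quoted hunk, walking only a bounded prefix of a candidate chain, can be shown in isolation. This is a sketch under assumptions, not the SLSR code: struct cand_chain here is a hypothetical two-field list, "score" stands in for whatever makes one basis preferable, and the cap is passed as an argument in the spirit of the param suggested in the thread.

```c
#include <stddef.h>

/* Hypothetical candidate chain; "score" is an illustrative stand-in for
   the quality of a basis.  */
struct cand_chain
{
  int score;
  struct cand_chain *next;
};

/* Return the best-scoring entry among the first MAX_ITERS links of CHAIN,
   or NULL if no link was examined.  Capping the walk bounds the cost of
   each lookup, so N lookups over an N-long chain no longer cost N^2.  */
const struct cand_chain *
find_basis_bounded (const struct cand_chain *chain, int max_iters)
{
  const struct cand_chain *best = NULL;
  int iters = 0;

  for (; chain && iters < max_iters; chain = chain->next, ++iters)
    if (best == NULL || chain->score > best->score)
      best = chain;

  return best;
}
```

The trade-off is the one described in the original message: a better basis further down the chain can be missed, and the result is only "good enough".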
Re: [Patch ARM] implement bswap16
On 7 September 2012 17:28, Richard Earnshaw wrote:
>
> Ah, sigh! I'd forgotten about the cond-exec issue. That makes things
> a little awkward, since we also have to deal with the fact that thumb1
> does not support predication. The solution, unfortunately, is thus a
> bit more involved.
>
Sorry if your suggestion makes me ask a few more questions :-)

> What we need are two patterns (although currently it looks like we've got two,
> in reality the predication means there are three), which need to read:
What is the advantage of the version you propose?
I mean there are already two explicit patterns, your proposal does not
really bring factorization since we end up with two patterns.

> (define_insn "*arm_revsh"
>   [(set (match_operand:SI 0 "s_register_operand" "=l,l,r")
>         (sign_extend:SI (bswap:HI (match_operand:HI 1
> "s_register_operand" "l,l,r"]
>   "arm_arch6"
>   "@
>    revsh\t%0, %1
>    revsh%?\t%0, %1
>    revsh%?\t%0, %1"
Why do we have to keep room for the predicate here? (%?) Doesn't this
pattern match only in unconditional cases?

BTW, I didn't manage to have GCC generate conditional revsh. I merely
added an "if (y)" guard before calling builtin_bswap16, but this
didn't turn into a conditional revsh.

>   [(set_attr "arch" "t1,t2,32")
>    (set_attr "length" "2,2,4")]

> (define_insn "*arm_revsh_cond"
>   [(cond_exec (match_operator 2 "arm_comparison_operator"
>                [(match_operand 3 "cc_register" "") (const_int 0)])
>      (set (match_operand:SI 0 "s_register_operand" "=l,r")
>           (sign_extend:SI (bswap:HI (match_operand:HI 1
> "s_register_operand" "l,r")]
>   "TARGET32_BIT && arm_arch6"
>   "revsh%?\t%0, %1"
>   [(set_attr "arch" "t2,*")
>    (set_attr "length" "2,4")])
>
> Note that this removes the "predicable" attribute as we now handle this
> manually rather than with the auto-generation.
>
> Sorry, this has turned out to be more complex than I originally realised.

I understand that this is also applicable to the existing arm_rev and
thumb1_rev patterns for 32 bit swaps. I'd like to understand the
rationale & implications of your proposal.

Thanks

Christophe.
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On 09/07/2012 02:00 PM, Iyer, Balaji V wrote:
> So, if I am understanding this correctly, there is no way to have the
> vectorization turned on/off on a function by function basis? I don't
> mind if it is turned off for -O0, but would like it be turned on/off
> for anything > -O1.

There's probably no reason that we can't enable vectorization on a loop
by loop basis. Given that we're keeping a bit attached to the loop
itself for #pragma simd anyway.

This ought not be terribly difficult to arrange...

r~
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On 09/07/2012 12:31 PM, Iyer, Balaji V wrote:
> I hope I have not mistaken your question, but to clarify the
> elemental function's definition and body is visible to all passes
> after the invocation of gimplify_function_tree (). It is also visible
> for the LTO optimization.

If that's the case, what's the point in defining an external ABI and
defining what __attribute__((vector)) placed on a function declaration
means?

r~
Re: [PATCH 5/6] Thread pointer built-in functions, xtensa [PING]
On Sun, Sep 9, 2012 at 11:31 PM, Chung-Lin Tang wrote:
> On 2012/8/28 04:15, Chung-Lin Tang wrote:
>> On 12/7/12 2:52, Chung-Lin Tang wrote:
>> Xtensa parts updated to use MD pattern.
>>
>> Thanks,
>> Chung-Lin
>>
>> * config/xtensa/xtensa.md (get_thread_pointersi): Renamed from
>> load_tp.
>> (set_thread_pointersi): Renamed from set_tp.
>> * config/xtensa/xtensa.c
>> (xtensa_legitimize_tls_address): Change gen_load_tp calls to
>> gen_get_thread_pointersi.
>> (xtensa_builtin): Remove XTENSA_BUILTIN_THREAD_POINTER and
>> XTENSA_BUILTIN_SET_THREAD_POINTER.
>> (xtensa_init_builtins): Remove __builtin_thread_pointer,
>> __builtin_set_thread_pointer machine-specific builtins.
>> (xtensa_fold_builtin): Remove XTENSA_BUILTIN_THREAD_POINTER,
>> XTENSA_BUILTIN_SET_THREAD_POINTER cases.
>> (xtensa_expand_builtin): Remove XTENSA_BUILTIN_THREAD_POINTER,
>> XTENSA_BUILTIN_SET_THREAD_POINTER cases.
>>
>
> Ping.
>

This is OK for xtensa.
RE: [PATCH, AArch64] Allow symbol+offset even if not being used for memory access
> Can you send me the test case you were looking at for this?

See attached. (Most of it is superfluous, but the point is that we are
not using the address to do a memory access.)

Cheers,
Ian

constant-test1.c
Description: Binary data
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
>-Original Message-
>From: Richard Henderson [mailto:r...@redhat.com]
>Sent: Monday, September 10, 2012 12:03 PM
>To: Iyer, Balaji V
>Cc: Richard Guenther; gcc-patches@gcc.gnu.org; Gabriel Dos Reis; Aldy
>Hernandez (al...@redhat.com); Jeff Law
>Subject: Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
>
>On 09/07/2012 12:31 PM, Iyer, Balaji V wrote:
>> I hope I have not mistaken your question, but to clarify the elemental
>> function's definition and body is visible to all passes after the
>> invocation of gimplify_function_tree (). It is also visible for the
>> LTO optimization.
>
>If that's the case, what's the point in defining an external ABI and defining
>what __attribute__((vector)) placed on a function declaration means?

When you have __attribute__((vector)) you are asking the compiler to
create a vector AND a scalar version of the function. The advantage is
that if the function is used, for example, in 2 loops where 1 can be
vectorized and another cannot, the vectorizable loop won't suffer (i.e.
suffer from being not-vectorized).

Thanks,

Balaji V. Iyer.

>
>
>r~
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
>-Original Message-
>From: Richard Henderson [mailto:r...@redhat.com]
>Sent: Monday, September 10, 2012 12:01 PM
>To: Iyer, Balaji V
>Cc: Jakub Jelinek; Andi Kleen; Richard Guenther; gcc-patches@gcc.gnu.org;
>Gabriel Dos Reis; Aldy Hernandez (al...@redhat.com); Jeff Law
>Subject: Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
>
>On 09/07/2012 02:00 PM, Iyer, Balaji V wrote:
>> So, if I am understanding this correctly, there is no way to have the
>> vectorization turned on/off on a function by function basis? I don't
>> mind if it is turned off for -O0, but would like it be turned on/off
>> for anything > -O1.
>
>There's probably no reason that we can't enable vectorization on a loop by loop
>basis. Given that we're keeping a bit attached to the loop itself for #pragma
>simd anyway.
>
>This ought not be terribly difficult to arrange...

Can you please help me get a start on how this can be done? From what I
understand (please correct me if I am wrong), this requires rearranging
and duplicating a lot of passes and can potentially open up a lot of
bugs.

>
>
>r~
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On 09/10/2012 09:11 AM, Iyer, Balaji V wrote:
> Can you please help me get a start on how this can be done? From
> what I understand (please correct me if I am wrong), this requires
> rearranging and duplicating a lot of passes and can potentially open
> up to a lot of bugs.

Certainly not duplicating passes. And probably not even rearranging them.

The important parts are:

(1) Having a bit in "struct loop" that indicates the special semantics
    you have for #pragma simd. I don't know if maybe all loops inside
    an elemental function are so automatically marked?

(2) Have bits in "struct function" that summarize the contents of the
    bit from "struct loop", for all loops in the function. Note that
    this bit would need to be updated during inlining.

(3) Change the "gate" predicates for the relevant function to also
    check the bit from "struct function". In some cases the pass might
    need to run globally (perhaps if-conversion?) and in some cases the
    pass might be able to restrict work to specific loops (e.g. the
    vectorizer), skipping loops for which the optimization is not
    enabled.

r~
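
The three steps above can be sketched in a few lines of C. Everything here is an assumption made for illustration: loop_info and function_info are hypothetical, heavily simplified stand-ins for GCC's "struct loop" and "struct function", the field names are invented, and flag_vectorize models a global option rather than any real GCC variable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct loop".  */
struct loop_info
{
  bool force_vect;            /* (1) set for a loop marked with #pragma simd */
};

/* Hypothetical stand-in for "struct function".  */
struct function_info
{
  struct loop_info *loops;
  size_t num_loops;
  bool has_force_vect_loops;  /* (2) summary of the per-loop bits */
};

/* (2) Recompute the per-function summary.  Per the discussion, something
   like this would also have to happen when inlining adds loops.  */
void
update_function_summary (struct function_info *fn)
{
  size_t i;

  fn->has_force_vect_loops = false;
  for (i = 0; i < fn->num_loops; i++)
    if (fn->loops[i].force_vect)
      fn->has_force_vect_loops = true;
}

/* (3) Pass gate: run when the optimization is enabled globally or when
   some loop in the function asks for it.  */
bool
gate_vectorize (const struct function_info *fn, bool flag_vectorize)
{
  return flag_vectorize || fn->has_force_vect_loops;
}

/* (3) Per-loop filter for passes that can restrict their work.  */
bool
should_vectorize_loop (const struct loop_info *loop, bool flag_vectorize)
{
  return flag_vectorize || loop->force_vect;
}
```

The point of the summary bit is that the gate can be answered without walking the loop tree, while the per-loop bit lets a pass such as the vectorizer skip unmarked loops once it is running.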
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On 09/10/2012 09:09 AM, Iyer, Balaji V wrote:
>> >If that's the case, what's the point in defining an external ABI and
>> >defining what
>> >__attribute__((vector)) placed on a function declaration means?
> When you have __attribute__((vector)) you are asking the compiler to
> create a vector AND a scalar version of the function. The advantage
> is that if the function is used, for example, in 2 loops where 1 can
> be vectorized and another cannot, the vectorizable loop won't suffer
> (i.e. suffer from being not-vectorized).

You've totally mis-understood my point.

Whether or not the compiler creates a clone COULD BE totally up to the
compiler, based on whether or not vectorization is enabled, whether the
loop has been analyzed such that vectorization may proceed, or indeed
the phase of the moon.

But in order for that to happen, the clone must be totally private to
the module for which we are generating code (in the LTO sense, this is
the entire program or dll; without LTO, this is just the object file).
It means that we never attempt to generate clones for functions for
which the body of the function is not visible.

On the other hand, if you insist on assuming a clone exists merely
because a declaration bears an attribute, then you must address ALL of
the problems with respect to defining a stable ABI in the face of
different cpu revisions, different ISAs, and different vector lengths.

I've not seen you address ANY of these problems, despite having the
problem pointed out multiple times.

r~
Re: [PATCH] Combine location with block using block_locations
Hi,
I was curious how the patch behaves memory-wise when compiling Mozilla.
It actually crashes on:

(gdb) bt
#0  0x7fab8cd70945 in raise () from /lib64/libc.so.6
#1  0x7fab8cd71f21 in abort () from /lib64/libc.so.6
#2  0x00b52330 in linemap_location_from_macro_expansion_p (set=0x7805, location=30725) at ../../libcpp/line-map.c:952
#3  0x00b526fc in linemap_lookup (set=0x7fab8dc34000, line=0) at ../../libcpp/line-map.c:644
#4  0x00776745 in maybe_unwind_expanded_macro_loc (where=0, diagnostic=, context=) at ../../gcc/tree-diagnostic.c:113
#5  virt_loc_aware_diagnostic_finalizer (context=0x11b8a80, diagnostic=0x7fff4d8adf90) at ../../gcc/tree-diagnostic.c:282
#6  0x00b4aa80 in diagnostic_report_diagnostic (context=0x11b8a80, diagnostic=0x7fff4d8adf90) at ../../gcc/diagnostic.c:652
#7  0x00b4acd6 in internal_error (gmsgid=) at ../../gcc/diagnostic.c:957
#8  0x007555c0 in crash_signal (signo=11) at ../../gcc/toplev.c:335
#9
#10 0x00b526e8 in linemap_lookup (set=0x7fab8dc34000, line=4294967295) at ../../libcpp/line-map.c:643
#11 0x00b530fa in linemap_location_in_system_header_p (set=0x7fab8dc34000, location=4294967295) at ../../libcpp/line-map.c:916
#12 0x00b4a8b2 in diagnostic_report_diagnostic (context=0x11b8a80, diagnostic=0x7fff4d8ae620) at ../../gcc/diagnostic.c:513
#13 0x00b4b462 in warning_at (location=, opt=0, gmsgid=) at ../../gcc/diagnostic.c:805
#14 0x00699679 in lto_symtab_merge_decls_2 (diagnosed_p=, slot=) at ../../gcc/lto-symtab.c:574
#15 lto_symtab_merge_decls_1 (slot=, data=) at ../../gcc/lto-symtab.c:691
#16 0x00bd32e8 in htab_traverse_noresize (htab=, callback=0x698ed0 , info=0x0) at ../../libiberty/hashtab.c:784
#17 0x004e2630 in read_cgraph_and_symbols (nfiles=2849, fnames=) at ../../gcc/lto/lto.c:1824
#18 0x004e2b75 in lto_main () at ../../gcc/lto/lto.c:2107

It seems that warning_at is not really able to look up the position.

Honza
Re: status of -fstack-protector-strong?
Hi, ping, could anyone take a look at this patch?

Thanks,
-Han

On Fri, Sep 7, 2012 at 4:07 PM, Kees Cook wrote:
>
> Hi,
>
> I'm curious about the status of this patch:
> http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00974.html
>
> Chrome OS uses this, and the Ubuntu Security Team has expressed
> interest in it as well. What's needed to land this in gcc?
>
> Thanks!
>
> -Kees
>
> --
> Kees Cook
> Chrome OS Security

--
Han Shen | Software Engineer | shen...@google.com | +1-650-440-3330
Re: [PATCH] Enable bbro for -Os
> All other comments are accepted.
>
> The updated patch is attached. Is it OK?

As you probably gathered, I had missed that Steven and Richard had
already commented on your patch before posting my message. Sorry about
that...

I think that the patch is interesting because, even if it doesn't
exactly implement what the comment in gate_handle_reorder_blocks was
talking about, it fixes code layout regressions without increasing the
code size (and even decreasing it).

So, assuming that Steven and Richard don't strongly oppose, I think the
patch is OK modulo the following nits:

+   The above description is for the full algorithm, which is used when the
+   function is optimized for speed. When the function is optimized for size,
+   in order to reduce long jump and connect more fall through edges, the

long jumps... bb-reorder.c uses "fallthru edges" consistently.

+   algorithm is modified as follows:
+   (1) Break long trace to short ones. The trace is broken at a block, which
+   has multi-predecessors/successors during finding traces.

long traces... A trace is broken at a block that has multiple
predecessors/successors during trace discovery.

+   (2) Ignore the edge probability and frequency for fall through edges.

fallthru

+   (3) Keep its original order when there is no chance to fall through. bbro

Keep the original order of blocks... We rely on the results of cfg_cleanup

+   bases on the result of cfg_cleanup, which does lots of optimizations on cfg.
+   So the order is expected to be kept if no fall through.
+
+   To implement the change for code size optimization, block's index is
+   selected as the key and all traces are found in one round.

+  /* If the best destination has multiple successors or predecessors,
+     don't allow it to be added when optimizing for size. This makes
+     sure predecessors with smaller index handled before the best
+     destination. It breaks long trace and reduces long jumps.

missing "are" before "handled"

+     After removing the best edge, the final result will be ABCD/ACBD.
+     It does not add jump compared with the previous order. But it
+     reduce the possibility of long jump. */

Double space before "But".

+  if (optimize_function_for_size_p (cfun))
+    {
+      e_index = src_index_p ? e->src->index : e->dest->index;
+      b_index = src_index_p ? cur_best_edge->src->index
+                            : cur_best_edge->dest->index;
+      /* The smaller one is better to keep the original order. */
+      return b_index > e_index;
+    }

Trailing space after the last parenthesis.

+  /* If dest has multiple predecessors, skip it. We expect
+     that one predecessor with smaller index connect with it
+     later. */

connects

+  /* Only connect Trace n with Trace n + 1. It is conservative
+     to keep the order as close as possible to the original order.
+     It also helps to reduce long jump. */

long jumps

Thanks for working on this.

--
Eric Botcazou
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Mon, Sep 10, 2012 at 09:30:15AM -0700, Richard Henderson wrote:
> On 09/10/2012 09:11 AM, Iyer, Balaji V wrote:
> > Can you please help me get a start on how this can be done? From
> > what I understand (please correct me if I am wrong), this requires
> > rearranging and duplicating a lot of passes and can potentially open
> > up to a lot of bugs.
>
> Certainly not duplicating passes. And probably not even rearranging them.

It would be great if unrolling was also done loop by loop in a similar
way. I often wanted that (only enable it for some loop, not the whole
file). And a lot of other compilers have pragmas for this, just not gcc.

As I understand vectorization needs some unrolling anyways?

-Andi
Re: [patch, mips] New mips triplet for multilib linux builds
On Sat, 2012-09-08 at 13:50 +0100, Richard Sandiford wrote:
> Should add BASE_DRIVER_SELF_SPECS too. OK with that change, thanks.
> And thanks for your patience.
>
> Richard

I added BASE_DRIVER_SELF_SPECS and did the checkin. Thanks for all your
help and advice.

Steve Ellcey
sell...@mips.com
Re: [Patch ARM] implement bswap16
On 10/09/12 16:40, Christophe Lyon wrote:
> On 7 September 2012 17:28, Richard Earnshaw wrote:
>>
>> Ah, sigh! I'd forgotten about the cond-exec issue. That makes things
>> a little awkward, since we also have to deal with the fact that thumb1
>> does not support predication. The solution, unfortunately, is thus a
>> bit more involved.
>>
> Sorry if your suggestion makes me ask a few more questions :-)
>

No problem :-)

>> What we need are two patterns (although currently it looks like we've got
>> two,
>> in reality the predication means there are three), which need to read:
> What is the advantage of the version you propose?
> I mean there are already two explicit patterns, your proposal does not
> really bring factorization since we end up with two patterns.
>

The code generated by recog would have to recognize three possible
patterns otherwise. Predication effectively goes through the insn list
and generates additional patterns for each predicable insn. You never
see them, but they're in there somewhere...

It's relatively minor but it does lead to a slightly simpler
recognizer, which should mean a smaller, faster compiler.

>> (define_insn "*arm_revsh"
>>   [(set (match_operand:SI 0 "s_register_operand" "=l,l,r")
>>         (sign_extend:SI (bswap:HI (match_operand:HI 1
>> "s_register_operand" "l,l,r"]
>>   "arm_arch6"
>>   "@
>>    revsh\t%0, %1
>>    revsh%?\t%0, %1
>>    revsh%?\t%0, %1"
> Why do we have to keep room for the predicate here? (%?) Doesn't this
> pattern match only in unconditional cases?
>

Because the ARM back-end has a very late conditionalizer pass that can
also generate conditional execution. It very rarely kicks in these
days, but if the predication rules are in there you could end up with
an instruction that the compiler thought was conditionally executed
being always run. That would be bad^TM.

> BTW, I didn't manage to have GCC generate conditional revsh. I merely
> added an "if (y)" guard before calling builtin_bswap16, but this
> didn't turn into a conditional revsh.
>
>>   [(set_attr "arch" "t1,t2,32")
>>    (set_attr "length" "2,2,4")]
>
>
>
>> (define_insn "*arm_revsh_cond"
>>   [(cond_exec (match_operator 2 "arm_comparison_operator"
>>                [(match_operand 3 "cc_register" "") (const_int 0)])
>>      (set (match_operand:SI 0 "s_register_operand" "=l,r")
>>           (sign_extend:SI (bswap:HI (match_operand:HI 1
>> "s_register_operand" "l,r")]
>>   "TARGET32_BIT && arm_arch6"
>>   "revsh%?\t%0, %1"
>>   [(set_attr "arch" "t2,*")
>>    (set_attr "length" "2,4")])
>>
>> Note that this removes the "predicable" attribute as we now handle this
>> manually rather than with the auto-generation.
>>
>> Sorry, this has turned out to be more complex than I originally realised.
>
> I understand that this is also applicable to the existing arm_rev and
> thumb1_rev patterns for 32 bit swaps. I'd like to understand the
> rationale & implications of your proposal.
>
> Thanks
>
> Christophe.
>

R.
Re: [PATCH] Combine location with block using block_locations
Thanks for helping test this. I'll try to build Mozilla to check the memory consumption as well as find new bugs. Dehao On Tue, Sep 11, 2012 at 12:41 AM, Jan Hubicka wrote: > Hi, > I was curious how the patch behaves memory-wise on compiling Mozilla. It > actually crashes on: > (gdb) bt > #0 0x7fab8cd70945 in raise () from /lib64/libc.so.6 > #1 0x7fab8cd71f21 in abort () from /lib64/libc.so.6 > #2 0x00b52330 in linemap_location_from_macro_expansion_p > (set=0x7805, location=30725) at ../../libcpp/line-map.c:952 > #3 0x00b526fc in linemap_lookup (set=0x7fab8dc34000, line=0) at > ../../libcpp/line-map.c:644 > #4 0x00776745 in maybe_unwind_expanded_macro_loc (where=0, > diagnostic=, context=) at > ../../gcc/tree-diagnostic.c:113 > #5 virt_loc_aware_diagnostic_finalizer (context=0x11b8a80, > diagnostic=0x7fff4d8adf90) at ../../gcc/tree-diagnostic.c:282 > #6 0x00b4aa80 in diagnostic_report_diagnostic (context=0x11b8a80, > diagnostic=0x7fff4d8adf90) at ../../gcc/diagnostic.c:652 > #7 0x00b4acd6 in internal_error (gmsgid=) at > ../../gcc/diagnostic.c:957 > #8 0x007555c0 in crash_signal (signo=11) at ../../gcc/toplev.c:335 > #9 > #10 0x00b526e8 in linemap_lookup (set=0x7fab8dc34000, > line=4294967295) at ../../libcpp/line-map.c:643 > #11 0x00b530fa in linemap_location_in_system_header_p > (set=0x7fab8dc34000, location=4294967295) at ../../libcpp/line-map.c:916 > #12 0x00b4a8b2 in diagnostic_report_diagnostic (context=0x11b8a80, > diagnostic=0x7fff4d8ae620) at ../../gcc/diagnostic.c:513 > #13 0x00b4b462 in warning_at (location=, opt=0, > gmsgid=) at ../../gcc/diagnostic.c:805 > #14 0x00699679 in lto_symtab_merge_decls_2 (diagnosed_p= out>, slot=) at ../../gcc/lto-symtab.c:574 > #15 lto_symtab_merge_decls_1 (slot=, data=) at > ../../gcc/lto-symtab.c:691 > #16 0x00bd32e8 in htab_traverse_noresize (htab=, > callback=0x698ed0 , info=0x0) at > ../../libiberty/hashtab.c:784 > #17 0x004e2630 in read_cgraph_and_symbols (nfiles=2849, > fnames=) at ../../gcc/lto/lto.c:1824 > #18
0x004e2b75 in lto_main () at ../../gcc/lto/lto.c:2107 > > It seems that warning_at is not really able to look up the position. > > Honza
Re: [PATCH] Fix PR54492
Here's the revised patch with a param. Bootstrapped and tested in the same manner. Ok for trunk? Thanks, Bill 2012-08-10 Bill Schmidt * doc/invoke.texi (max-slsr-cand-scan): New description. * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit the time spent searching for a basis. * params.def (PARAM_MAX_SLSR_CANDIDATE_SCAN): New param. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 191135) +++ gcc/doc/invoke.texi (working copy) @@ -9407,6 +9407,11 @@ having a regular register file and accurate regist See @file{haifa-sched.c} in the GCC sources for more details. The default choice depends on the target. + +@item max-slsr-cand-scan +Set the maximum number of existing candidates that will be considered when +seeking a basis for a new straight-line strength reduction candidate. + @end table @end table Index: gcc/gimple-ssa-strength-reduction.c === --- gcc/gimple-ssa-strength-reduction.c (revision 191135) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3. If not see #include "domwalk.h" #include "pointer-set.h" #include "expmed.h" +#include "params.h" /* Information about a strength reduction candidate. Each statement in the candidate table represents an expression of one of the @@ -353,10 +354,14 @@ find_basis_for_candidate (slsr_cand_t c) cand_chain_t chain; slsr_cand_t basis = NULL; + // Limit potential of N^2 behavior for long candidate chains. 
+ int iters = 0; + int max_iters = PARAM_VALUE (PARAM_MAX_SLSR_CANDIDATE_SCAN); + mapping_key.base_expr = c->base_expr; chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); - for (; chain; chain = chain->next) + for (; chain && iters < max_iters; chain = chain->next, ++iters) { slsr_cand_t one_basis = chain->cand; Index: gcc/params.def === --- gcc/params.def (revision 191135) +++ gcc/params.def (working copy) @@ -973,6 +973,13 @@ DEFPARAM (PARAM_SCHED_PRESSURE_ALGORITHM, "Which -fsched-pressure algorithm to apply", 1, 1, 2) +/* Maximum length of candidate scans in straight-line strength reduction. */ +DEFPARAM (PARAM_MAX_SLSR_CANDIDATE_SCAN, + "max-slsr-cand-scan", + "Maximum length of candidate scans for straight-line " + "strength reduction", + 50, 1, 99) + /* Local variables: mode:c
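For readers skimming the diff, here is a minimal standalone sketch of the bounded-scan idea, not the GCC code itself. The struct chain_node type and the find_bounded function are hypothetical stand-ins for cand_chain_t and find_basis_for_candidate, and the match test is simplified to an integer comparison (the real pass compares candidates and keeps the best basis rather than returning the first hit). The max_iters argument plays the role of PARAM_VALUE (PARAM_MAX_SLSR_CANDIDATE_SCAN), which the patch defaults to 50.

```c
#include <stddef.h>

/* Hypothetical stand-in for GCC's candidate chain; not the real type.  */
struct chain_node
{
  int cand;
  struct chain_node *next;
};

/* Walk CHAIN looking for WANTED, but give up after MAX_ITERS nodes so
   that a very long chain cannot cause quadratic behaviour overall.
   Returns the matching node, or NULL if none was seen within budget.  */
const struct chain_node *
find_bounded (const struct chain_node *chain, int wanted, int max_iters)
{
  int iters = 0;

  /* Same loop shape as the patched for-statement: the chain pointer
     and the iteration counter advance together.  */
  for (; chain && iters < max_iters; chain = chain->next, ++iters)
    if (chain->cand == wanted)
      return chain;

  return NULL;
}
```

The trade-off is that a usable basis sitting beyond the limit is simply not found, so the result may be less optimized code, but compile time stays bounded per candidate.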
Re: VxWorks Patches Back from the Dead!
On 9/10/2012 9:35 AM, Bruce Korb wrote: On 09/09/12 08:54, rbmj wrote: Just because I *love* bothering everyone with emails... I don't mind, as long as you don't expect me to do anything until I'm certain you've stabilized the patch ;) I'm glad you rolled it up into one patch, because I was eventually going to ask you to do that. Thank you. I keep thinking everything is stable, but then something changes (bitrot? something elsewhere in GCC? I don't know) and I have to regroup. Sorry for changing everything 10 times - please bear with me. At this point, I've recompiled with different settings about 10 times and it hasn't broken itself yet. I've tried in a different VM and it works there too. So *hopefully* it should be good. On the other hand, I've read this on the website: Don't mix together changes made for different reasons. Send them individually. Ideally, each change you send should be impossible to subdivide into parts that we might want to consider separately, because each of its parts gets its motivation from the other parts "impossible to subdivide into parts" seems like one patch per fixincludes rule (am I looking at the wrong level of granularity here?). At the same time, it's a pain in the rear to worry about 12 different commits (especially when I'm making changes, I git rebase a TON). I'm also not sure about practicality of this approach in terms of the amount of work it creates on all ends. Unless cosmic rays break everything again, that should be all. Thanks, Robert Mason
[alpha] Fix GPREL16 relocation error building glibc
There's an access within glibc wherein we've forced an important global variable into the .sdata section (so that accesses to its members can use 16-bit relocations), but an array access gets constant-folded such that we produce an offset well well outside of a 16-bit range. A test case that must be visually inspected looks like the following. I'm not certain how to turn this into a portable link-time test. But considering that other ports also handle SYMBOL_REF_SMALL_DATA, it does seem like something portable would be nice. Ideas? Or should I just drop this in as an alpha specific test? extern int x[10] __attribute__((visibility("hidden"), section(".sdata"))); int foo(int y) { return x[y-10]; } Committed the patch itself to mainline and 4.7 branch. r~ diff --git a/gcc/config/alpha/predicates.md b/gcc/config/alpha/predicates.md index 598742f..0a1885b 100644 --- a/gcc/config/alpha/predicates.md +++ b/gcc/config/alpha/predicates.md @@ -328,26 +328,50 @@ (define_predicate "small_symbolic_operand" (match_code "const,symbol_ref") { + HOST_WIDE_INT ofs = 0, max_ofs = 0; + if (! TARGET_SMALL_DATA) -return 0; +return false; if (GET_CODE (op) == CONST && GET_CODE (XEXP (op, 0)) == PLUS && CONST_INT_P (XEXP (XEXP (op, 0), 1))) -op = XEXP (XEXP (op, 0), 0); +{ + ofs = INTVAL (XEXP (XEXP (op, 0), 1)); + op = XEXP (XEXP (op, 0), 0); +} if (GET_CODE (op) != SYMBOL_REF) -return 0; +return false; /* ??? There's no encode_section_info equivalent for the rtl constant pool, so SYMBOL_FLAG_SMALL never gets set. 
*/ if (CONSTANT_POOL_ADDRESS_P (op)) -return GET_MODE_SIZE (get_pool_mode (op)) <= g_switch_value; +{ + max_ofs = GET_MODE_SIZE (get_pool_mode (op)); + if (max_ofs > g_switch_value) + return false; +} + else if (SYMBOL_REF_LOCAL_P (op) + && SYMBOL_REF_SMALL_P (op) + && !SYMBOL_REF_WEAK (op) + && !SYMBOL_REF_TLS_MODEL (op)) +{ + if (SYMBOL_REF_DECL (op)) +max_ofs = tree_low_cst (DECL_SIZE_UNIT (SYMBOL_REF_DECL (op)), 1); +} + else +return false; - return (SYMBOL_REF_LOCAL_P (op) - && SYMBOL_REF_SMALL_P (op) - && !SYMBOL_REF_WEAK (op) - && !SYMBOL_REF_TLS_MODEL (op)); + /* Given that we know that the GP is always 8 byte aligned, we can + always adjust by 7 without overflowing. */ + if (max_ofs < 8) +max_ofs = 8; + + /* Since we know this is an object in a small data section, we know the + entire section is addressable via GP. We don't know where the section + boundaries are, but we know the entire object is within. */ + return IN_RANGE (ofs, 0, max_ofs - 1); }) ;; Return true if OP is a SYMBOL_REF or CONST referencing a variable
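As an illustration only, the final range check in the new predicate can be modelled outside the compiler roughly as below. The function name offset_within_object and the use of plain long values are assumptions made for this sketch; the real predicate works on rtx operands with HOST_WIDE_INT and the inclusive IN_RANGE macro.

```c
#include <stdbool.h>

/* Hypothetical model of the tail of small_symbolic_operand.
   OFS is the constant offset added to the symbol; OBJ_SIZE is the
   size of the object in bytes, or 0 when the size is not known.  */
bool
offset_within_object (long ofs, long obj_size)
{
  long max_ofs = obj_size;

  /* The GP is always 8-byte aligned, so adjusting by up to 7 can
     never overflow the 16-bit relocation.  */
  if (max_ofs < 8)
    max_ofs = 8;

  /* Equivalent to IN_RANGE (ofs, 0, max_ofs - 1), which is inclusive
     at both ends.  */
  return ofs >= 0 && ofs <= max_ofs - 1;
}
```

For the test case above, and assuming a 4-byte int, x[y-10] folds to a symbol offset of -40 on a 40-byte object, so offset_within_object (-40, 40) is false and the operand is no longer treated as a small-data reference.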
Re: [PATCH] Add option for dumping to stderr (issue6190057)
Ping. Thanks, Sharad Sharad On Wed, Sep 5, 2012 at 10:34 AM, Sharad Singhai wrote: > Ping. > > Thanks, > Sharad > > Sharad > > > > > On Fri, Aug 24, 2012 at 1:06 AM, Sharad Singhai wrote: >> >> Sorry about the delay. Please see comments inline. >> >> On Wed, Jul 4, 2012 at 6:33 AM, Richard Guenther >> wrote: >> > On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai >> > wrote: >> >> Apologies for the spam. Attempting to resend the patch after shrinking >> >> it. >> >> >> >> I have updated the attached patch to use a new dump message >> >> classification system for the vectorizer. It currently uses four >> >> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION, >> >> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the >> >> vectorizer passes and have converted each call to fprintf (dump_file, >> >> ) to a message classification matching in spirit. Most often, it >> >> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well. >> >> >> >> For example, the following >> >> >> >> if (vect_print_dump_info (REPORT_DETAILS)) >> >> { >> >> fprintf (vect_dump, "niters for prolog loop: "); >> >> print_generic_expr (vect_dump, iters, TDF_SLIM); >> >> } >> >> >> >> gets converted to >> >> >> >> if (dump_kind (MSG_OPTIMIZED_LOCATIONS)) >> >> { >> >> dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location, >> >> "niters for prolog loop: "); >> >> dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters); >> >> } >> >> >> >> The asymmetry between the first printf and the second is due to the >> >> fact that 'vect_print_dump_info (xxx)' prints the location as a >> >> "side-effect". To preserve the original intent somewhat, I have >> >> converted the first call within a dump sequence to a dump_printf_loc >> >> (xxx) which prints the location while the subsequence calls within the >> >> same conditional get converted to the corresponding plain variants. >> > >> > Ok, that looks reasonable. 
>> > >> >> I considered removing the support for alternate dump file, but ended >> >> up preserving it instead since it is needed for setting the alternate >> >> dump file to stderr for the case when -fopt-info is given but no dump >> >> file is available. >> >> >> >> The following invocation >> >> g++ ... -ftree-vectorize -fopt-info=4 >> >> >> >> dumps *all* available information to stderr. Currently, the opt-info >> >> level is common to all passes, i.e., a pass can't specify if wants a >> >> different level of diagnostic info. This can be added as an >> >> enhancement with a suitable syntax for selecting passes. >> >> >> >> I haven't fixed up the documentation/tests but wanted to get some >> >> feedback about the current state of patch before doing that. >> > >> > Some comments / questions. >> > >> > + if (dump_file && (dump_kind & opt_info_flags)) >> > +{ >> > + dump_loc (dump_kind, dump_file, loc); >> > + print_generic_expr (dump_file, t, dump_flags | extra_dump_flags); >> > +} >> > + >> > + if (alt_dump_file && (dump_kind & opt_info_flags)) >> > +{ >> > >> > you always test dump_kind against the same opt_info_flags variable. >> > I would have thought that the alternate dump file has a different >> > opt_info_flags >> > setting so I can have -fdump-tree-vect-details -fopt-info=1. Am I >> > missing >> > something? >> >> It was an oversight on my part. I have since fixed this. There are two >> separate flags corresponding to the two types of dump files, >> >> pflags ==> pass private dump file >> alt_flags ==> opt-info dump file >> >> > If I do >> > >> >> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo >> > >> > what will foo contain afterwards? I think you need to document the >> > behavior >> > when such redirection is used with the compiler-driver feature of >> > handling >> > multiple translation units. 
Especially the difference (or not >> > difference) to >> > >> >> gcc file1.c -O3 -fdump-tree-vectorize=foo >> >> gcc file2.c -O3 -fdump-tree-vectorize=foo >> >> Yes, the dump file gets overwritten during each invocation. I have >> noted this in the documentation. >> >> > I suppose we do not want to append to foo (but eventually support that >> > with some alternate syntax? Like -fdump-tree-vectorize=+foo?) >> >> Yes, I agree. We could define a new syntax as you suggested for >> appending to a dump file. However, this feature can wait for a >> separate patch. >> >> > + >> > +static void >> > +set_opt_info (int opt_info_level) >> > +{ >> > >> > This function needs a comment. How do the dump flags translate to >> > the opt-info flags? Is this documented anywhere? I only see >> > >> > +/* Different types of dump classifications. */ >> > +enum dump_msg_kind { >> > + MSG_NONE = 1 << 0, >> > + MSG_OPTIMIZED_LOCATIONS = 1 << 1, >> > + MSG_UNOPTIMIZED_LOCATIONS= 1 << 2, >> > + MSG_MISSED_OPTIMIZATION = 1 << 3, >> > + MSG_NOTE = 1 << 4 >> > +}; >> >> Yes, my mapping was s
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
Dear All, Please find attached a new attempt at the patch for PR46897. It now uses temporaries to overcome the side effects that Mikael pointed out. The resulting code can be quite profligate: infant0 = new_child() produces ASSIGN main:da@0 new_child[[()]] ASSIGN main:da@1 main:infant0 ASSIGN main:da@2 main:infant0 ASSIGN main:infant0 main:da@0 ASSIGN main:da@3 main:da@1 % parent ASSIGN main:da@4 main:da@1 % parent CALL assign0 ((main:da@3 % foo) (main:da@0 % parent % foo)) ASSIGN main:da@1 % parent % foo main:da@3 % foo ASSIGN main:infant0 % parent main:da@1 % parent It could be simplified, I suspect but I do not believe that it is worth any more effort for what is, after all, well off the beaten track. The comments in resolve.c explain how the patch works. Bootstrapped and regtested on FC9/x86_64 - OK for trunk? Cheers Paul 2012-09-10 Alessandro Fanfarillo Paul Thomas PR fortran/46897 * gfortran.h : Add bit field 'defined_assign_comp' to symbol_attribute structure. Add primitive for gfc_add_full_array_ref. * expr.c (gfc_add_full_array_ref): New function. (gfc_lval_expr_from_sym): Call new function. * resolve.c (add_comp_ref): New function. (build_assignment): New function. (get_temp_from_expr): New function (add_code_to_chain): New function (generate_component_assignments): New function that calls all the above new functions. (resolve_code): Call generate_component_assignments. 2012-09-10 Alessandro Fanfarillo Paul Thomas PR fortran/46897 * gfortran.dg/defined_assignment_1.f90: New test. * gfortran.dg/defined_assignment_2.f90: New test. * gfortran.dg/defined_assignment_3.f90: New test. On 14/08/2012, Paul Richard Thomas wrote: > Mikael, > > On 14 August 2012 10:42, Mikael Morin wrote: >> On 14/08/2012 07:03, Paul Richard Thomas wrote: However, if we do it before, we also overwrite components to be assigned with a typebound call, and this can have some side effects as the LHS's argument can be INTENT(INOUT). 
>>> >>> This might be so but it is what the standard dictates should >>> happen isn't it? >>> >> It dictates that the components should be assigned one by one (by either >> defined or intrinsic assignment), which I don't see as strictly >> equivalent to a whole structure assignment followed by typebound calls >> (for components needing it). > > Hmmm. That's true. ***sigh*** > > I'll put it right. > > Cheers > > Paul > Index: gcc/fortran/gfortran.h === *** gcc/fortran/gfortran.h (revision 191115) --- gcc/fortran/gfortran.h (working copy) *** typedef struct *** 786,794 /* The symbol is a derived type with allocatable components, pointer components or private components, procedure pointer components, possibly nested. zero_comp is true if the derived type has no ! component at all. */ unsigned alloc_comp:1, pointer_comp:1, proc_pointer_comp:1, ! private_comp:1, zero_comp:1, coarray_comp:1, lock_comp:1; /* This is a temporary selector for SELECT TYPE. */ unsigned select_type_temporary:1; --- 786,796 /* The symbol is a derived type with allocatable components, pointer components or private components, procedure pointer components, possibly nested. zero_comp is true if the derived type has no ! component at all. defined_assign_comp is true if the derived ! type or an ancestor has a typebound defined assignment. */ unsigned alloc_comp:1, pointer_comp:1, proc_pointer_comp:1, ! private_comp:1, zero_comp:1, coarray_comp:1, lock_comp:1, ! defined_assign_comp:1; /* This is a temporary selector for SELECT TYPE. 
*/ unsigned select_type_temporary:1; *** gfc_try gfc_check_assign_symbol (gfc_sym *** 2761,2766 --- 2763,2769 bool gfc_has_default_initializer (gfc_symbol *); gfc_expr *gfc_default_initializer (gfc_typespec *); gfc_expr *gfc_get_variable_expr (gfc_symtree *); + void gfc_add_full_array_ref (gfc_expr *, gfc_array_spec *); gfc_expr * gfc_lval_expr_from_sym (gfc_symbol *); gfc_array_spec *gfc_get_full_arrayspec_from_expr (gfc_expr *expr); Index: gcc/fortran/expr.c === *** gcc/fortran/expr.c (revision 191115) --- gcc/fortran/expr.c (working copy) *** gfc_get_variable_expr (gfc_symtree *var) *** 3878,3883 --- 3878,3910 } + /* Adds a full array reference to an expression, as needed. */ + + void + gfc_add_full_array_ref (gfc_expr *e, gfc_array_spec *as) + { + gfc_ref *ref; + for (ref = e->ref; ref; ref = ref->next) + if (!ref->next) + break; + if (ref) + { + ref->next = gfc_get_ref (); + ref = ref->next; + } + else + { + e
Re: [PATCH, AArch64] Allow symbol+offset even if not being used for memory access
On 09/10/2012 09:09 AM, Ian Bolton wrote: >> Can you send me the test case you were looking at for this? > > See attached. (Most of it is superfluous, but the point is that > we are not using the address to do a memory access.) Ok. Having dug a bit deeper I think the main problem is that you're working against yourself by not handling this pattern right from the beginning. You have split the address incorrectly to begin and are now trying to recover after the fact. The following patch seems to do the trick for me, producing > (insn 6 5 7 (set (reg:DI 81) > (high:DI (const:DI (plus:DI (symbol_ref:DI ("arr") [flags 0x80] > ) > (const_int 12 [0xc]) z.c:8 -1 > (nil)) > > (insn 7 6 8 (set (reg:DI 80) > (lo_sum:DI (reg:DI 81) > (const:DI (plus:DI (symbol_ref:DI ("arr") [flags 0x80] 0x7f9bae1105f0 arr>) > (const_int 12 [0xc]) z.c:8 -1 > (expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("arr") [flags > 0x80] ) > (const_int 12 [0xc]))) > (nil))) right from the .150.expand dump. I'll leave it to you to fully regression test and commit the patch as appropriate. ;-) r~ Index: aarch64.md === --- aarch64.md (revision 191152) +++ aarch64.md (working copy) @@ -2840,7 +2840,7 @@ (lo_sum:DI (match_operand:DI 1 "register_operand" "r") (match_operand 2 "aarch64_valid_symref" "S")))] "" - "add\\t%0, %1, :lo12:%2" + "add\\t%0, %1, :lo12:%a2" [(set_attr "v8type" "alu") (set_attr "mode" "DI")] Index: aarch64.c === --- aarch64.c (revision 191152) +++ aarch64.c (working copy) @@ -652,43 +652,57 @@ unsigned HOST_WIDE_INT val; bool subtargets; rtx subtarget; - rtx base, offset; int one_match, zero_match; gcc_assert (mode == SImode || mode == DImode); - /* If we have (const (plus symbol offset)), and that expression cannot - be forced into memory, load the symbol first and add in the offset. */ - split_const (imm, &base, &offset); - if (offset != const0_rtx - && (targetm.cannot_force_const_mem (mode, imm) - || (can_create_pseudo_p ( + /* Check on what type of symbol it is. 
*/ + if (GET_CODE (imm) == SYMBOL_REF + || GET_CODE (imm) == LABEL_REF + || GET_CODE (imm) == CONST) { - base = aarch64_force_temporary (dest, base); - aarch64_emit_move (dest, aarch64_add_offset (mode, NULL, base, INTVAL (offset))); - return; -} + rtx mem, base, offset; + enum aarch64_symbol_type sty; - /* Check on what type of symbol it is. */ - if (GET_CODE (base) == SYMBOL_REF || GET_CODE (base) == LABEL_REF) -{ - rtx mem; - switch (aarch64_classify_symbol (base, SYMBOL_CONTEXT_ADR)) + /* If we have (const (plus symbol offset)), separate out the offset +before we start classifying the symbol. */ + split_const (imm, &base, &offset); + + sty = aarch64_classify_symbol (base, SYMBOL_CONTEXT_ADR); + switch (sty) { case SYMBOL_FORCE_TO_MEM: - mem = force_const_mem (mode, imm); + if (offset != const0_rtx + && targetm.cannot_force_const_mem (mode, imm)) + { + gcc_assert(can_create_pseudo_p ()); + base = aarch64_force_temporary (dest, base); + base = aarch64_add_offset (mode, NULL, base, INTVAL (offset)); + aarch64_emit_move (dest, base); + return; + } + mem = force_const_mem (mode, imm); gcc_assert (mem); emit_insn (gen_rtx_SET (VOIDmode, dest, mem)); return; -case SYMBOL_SMALL_TLSGD: -case SYMBOL_SMALL_TLSDESC: -case SYMBOL_SMALL_GOTTPREL: -case SYMBOL_SMALL_TPREL: +case SYMBOL_SMALL_TLSGD: +case SYMBOL_SMALL_TLSDESC: +case SYMBOL_SMALL_GOTTPREL: case SYMBOL_SMALL_GOT: + if (offset != const0_rtx) + { + gcc_assert(can_create_pseudo_p ()); + base = aarch64_force_temporary (dest, base); + base = aarch64_add_offset (mode, NULL, base, INTVAL (offset)); + aarch64_emit_move (dest, base); + return; + } + /* FALLTHRU */ + +case SYMBOL_SMALL_TPREL: case SYMBOL_SMALL_ABSOLUTE: - aarch64_load_symref_appropriately - (dest, imm, aarch64_classify_symbol (base, SYMBOL_CONTEXT_ADR)); + aarch64_load_symref_appropriately (dest, imm, sty); return; default: @@ -696,7 +710,7 @@ } } - if ((CONST_INT_P (imm) && aarch64_move_imm (INTVAL (imm), mode))) + if (CONST_INT_P (imm) && 
aarch64_move_imm (INTVAL (imm), mode)) { emit_insn (gen_rtx_SET (VOIDmode, dest, imm)); return;
Re: [PATCH] Use -lgcc in libgcc_so linker script
On Sun, Sep 9, 2012 at 12:29 PM, Andreas Schwab wrote: > > PR target/46191 > * config/t-slibgcc-libgcc (SHLIB_MAKE_SOLINK): Use -lgcc instead > of libgcc.a. This is OK. Thanks. Ian
Re: VxWorks Patches Back from the Dead!
Hi, On Mon, Sep 10, 2012 at 10:48 AM, rbmj wrote: > On the other hand, I've read this on the website: > >> Don't mix together changes made for different reasons. Send them >> individually. Ideally, each change you send should be impossible to >> subdivide into > > parts that we might want to consider separately, because each of its parts > gets its motivation from the other parts OTOH, this is a fairly cohesive set of patches. A single project. Even if, strictly speaking, each include fix is entirely separate from the others (by the design of fixincludes), I see them as a cohesive set that ought to be in a single commit. Fixes to fixes for fixincludes have been very infrequent. > ... At the same > time, it's a pain in the rear to worry about 12 different commits I'm into comforting one's derriere. > Unless cosmic rays break everything again, that should be all. :) OK. I'll push it on your behalf once the other bits have been approved by their approvers. Cheers - Bruce
Re: [PATCH] Enable bbro for -Os
On 09/06/2012 02:56 AM, Zhenqiang Chen wrote: > + (3) Keep its original order when there is no chance to fall through. bbro > + bases on the result of cfg_cleanup, which does lots of optimizations on > cfg. > + So the order is expected to be kept if no fall through. Thanks for doing this. Our kernel guys have been asking for something like this for quite a while. I am curious about the case of no fall through. Especially about using that opportunity to sort cold blocks to the end of the function. I'm thinking here of stuff like switch with a default: abort(), or asm goto with an explicitly annotated cold path. r~
Re: [PATCH, libstdc++] Add proper OpenBSD support
> > On 10 September 2012 07:34, Mark Kettenis wrote: > >> Date: Sun, 9 Sep 2012 21:07:39 +0100 > >> From: Jonathan Wakely > >> > >> On 4 September 2012 20:26, Mark Kettenis wrote: > >> > Fixes a few testcases. Mostly based on the existing > >> > NetBSD/FreeBSD/Darwin code. > >> > > >> > 2012-09-04 Mark Kettenis > >> > > >> > * configure.host (*-*-openbsd*) Set cpu_include_dir. > >> > * config/os/bsd/openbsd/ctype_base.h: New file. > >> > * config/os/bsd/openbsd/ctype_configure_char.cc: New file. > >> > * config/os/bsd/openbsd/ctype_inline.h: New file. > >> > * config/os/bsd/openbsd/os_defines.h: New file. > >> > >> This patch is OK, thanks. Do you want me to commit it for you? > > > > Yes please. > > It occurs to me now that the patch changes the size of > ctype_base::mask, from the generic unsigned to char. I assume the > OpenBSD system compiler uses char? How long has that change been > present in the OpenBSD source tree? Yes, the system compile uses char and has been doing so since mid-2005. > I'm not sure whether or not it's better to change the size of that > type in GCC 4.8, which would break compatibility with previous > versions of the FSF sources but provide compatibility with the OpenBSD > system compiler. My guess would be that most people on OpenBSD are > using the system compiler not upstream FSF sources. Indeed. People either use the system compiler or install one from ports/packages. Given the sorry state of OpenBSD support in the FSF source tree (barely buildable) I think binary compatibility with the system compiler is more important. > >> It shouoldn't stop the patch going in, but I assume that this test > >> fails on OpenBSD even with your patch applied? > >> > >> #include > >> #include > >> > >> class gnu_ctype: public std::ctype { }; > >> > >> int main() > >> { > >> gnu_ctype gctype; > >> > >> assert(gctype.is(std::ctype_base::xdigit, L'a')); > >> } > > > > Interestingly enough, it doesn't fail without my diff. 
But it does > > fail for OpenBSD's system compiler (GCC 4.2.1 with a lot of local > > modifications). As far as I can determine this is the result of > > ctype_base::mask being an 8-bit integer type which doesn't go well > > with the generic ctype_members.cc implementation. Probably need to > > have an OpenBSD-specific implementation just like newlib. Looking > > into that now. > > See http://gcc.gnu.org/PR51772 (the original description gets the > cause wrong, see comment 3 for the real problem) Right! Using the newlib locale model on OpenBSD fixes the problem, and seems to fix a couple of test cases in the g++ testsuite as well. 2012-09-10 Mark Kettenis * acinclude.m4 (GLIBCXX_ENABLE_CLOCALE): Use newlib locale model for OpenBSD. * configure: Regenerated. Index: acinclude.m4 === --- acinclude.m4(revision 191120) +++ acinclude.m4(working copy) @@ -1836,6 +1836,9 @@ darwin* | freebsd*) enable_clocale_flag=darwin ;; + openbsd*) + enable_clocale_flag=newlib + ;; *) if test x"$with_newlib" = x"yes"; then enable_clocale_flag=newlib
Re: [ping][PATCH] Power: Reorder a sign-extend RTL pattern for readability
On Sat, 8 Sep 2012, David Edelsohn wrote: > 2012-08-10 Maciej W. Rozycki > > gcc/ > * config/rs6000/rs6000.md: Move a splitter next to its insn. > > This patch is okay. Yes, the splitter should not have been separated > from the basic pattern. Thanks for helping to clean up the port. Applied now, thanks for your review. Maciej
Re: [PATCH,i386] fma4 addition for bdver2
On 09/09/2012 10:23 PM, Gopalasubramanian, Ganesh wrote: > 2012-09-05 Ganesh Gopalasubramanian > > * config/i386/i386.md : Comments on fma4 instruction > selection reflect requirement on register pressure based > cost model. > > * config/i386/driver-i386.c (host_detect_local_cpu): fma4 > flag is set-reset as informed by the cpuid flag. > > * config/i386/i386.c (processor_alias_table): fma4 > flag is enabled for bdver2. Ok. r~
[Google 4.7] Fix is_static in gdb index generation (issue6498114)
The enclosed patch fixes a reversed conditional when calculating the is_static flag while generating gdb index--is_static is the opposite of the DW_AT_external flag. OK for google 4.7? Sterling 2012-09-10 Sterling Augustine * dwarf2out.c (output_pubname): Correct conditional. Index: dwarf2out.c === --- dwarf2out.c (revision 191084) +++ dwarf2out.c (working copy) @@ -9436,7 +9436,7 @@ output_pubname (dw_offset die_offset, pubname_entry *entry) { dw_die_ref die = entry->die; - int is_static = get_AT_flag (die, DW_AT_external) ? 1 : 0; + int is_static = get_AT_flag (die, DW_AT_external) ? 0 : 1; dw2_asm_output_data (DWARF_OFFSET_SIZE, die_offset, "DIE offset"); -- This patch is available for review at http://codereview.appspot.com/6498114
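To spell out the inverted sense of the flag, here is a trivial sketch. The helper name is_static_from_external is hypothetical; in dwarf2out.c the input comes from get_AT_flag (die, DW_AT_external).

```c
/* Hypothetical helper: HAS_DW_AT_EXTERNAL is nonzero when the DIE
   carries DW_AT_external.  A symbol is static in the gdb index
   exactly when it is not external.  */
int
is_static_from_external (int has_dw_at_external)
{
  return has_dw_at_external ? 0 : 1;
}
```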
Re: [Google 4.7] Fix is_static in gdb index generation (issue6498114)
> The enclosed patch fixes a reversed conditional when calculating the > is_static flag while generating gdb index--is_static is the opposite > of the DW_AT_external flag. > > OK for google 4.7? > > Sterling > > 2012-09-10 Sterling Augustine > > * dwarf2out.c (output_pubname): Correct conditional. OK for google/gcc-4_7. -cary
Re: [SH] Add simple_return pattern
On Mon, 2012-09-10 at 15:51 +0200, Christian Bruel wrote: > This patch implements the simple_return pattern to enable -fshrink-wrap > on SH. It also clean up some redundancies for expand_epilogue (called > twice from the "return" and "epilogue" patterns and the > sh_expand_prologue parameter type. > > No regressions with sh-superh-elf and sh4-linux gcc testsuites. > > Thanks > > Christian > Regarding the iterators, maybe it's better to put them in config/sh/iterators.md. The optab code attr is not needed in this case, "" is sufficient. How about the attached patch instead? BTW, I'm now also testing the modified attached patch and your previous newlib related patch. Cheers, Oleg Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 191161) +++ gcc/config/sh/sh.md (working copy) @@ -9335,7 +9335,7 @@ [(return)] "" { - sh_expand_epilogue (1); + sh_expand_epilogue (true); if (TARGET_SHCOMPACT) { rtx insn, set; @@ -10154,9 +10154,12 @@ } [(set_attr "type" "load_media")]) +(define_expand "simple_return" + [(simple_return)]) + (define_expand "return" - [(return)] - "reload_completed && ! sh_need_epilogue ()" + [(simple_return)] + "reload_completed && epilogue_completed" { if (TARGET_SHMEDIA) { @@ -10172,8 +10175,8 @@ } }) -(define_insn "*return_i" - [(return)] +(define_insn "*_i" + [(RETURN)] "TARGET_SH1 && ! 
(TARGET_SHCOMPACT && (crtl->args.info.call_cookie & CALL_COOKIE_RET_TRAMP (1))) @@ -10299,19 +10302,12 @@ (define_expand "prologue" [(const_int 0)] "" -{ - sh_expand_prologue (); - DONE; -}) + "sh_expand_prologue (); DONE;") (define_expand "epilogue" [(return)] "" -{ - sh_expand_epilogue (0); - emit_jump_insn (gen_return ()); - DONE; -}) + "sh_expand_epilogue (false);") (define_expand "eh_return" [(use (match_operand 0 "register_operand" ""))] Index: gcc/config/sh/iterators.md === --- gcc/config/sh/iterators.md (revision 191161) +++ gcc/config/sh/iterators.md (working copy) @@ -34,3 +34,5 @@ (define_mode_attr disp04 [(QI "K04") (HI "K05")]) (define_mode_attr disp12 [(QI "K12") (HI "K13")]) +;; Code iterator for return codes. +(define_code_iterator RETURN [return simple_return]) Index: gcc/config/sh/sh-protos.h === --- gcc/config/sh/sh-protos.h (revision 191161) +++ gcc/config/sh/sh-protos.h (working copy) @@ -117,7 +117,6 @@ extern int sh_media_register_for_return (void); extern void sh_expand_prologue (void); extern void sh_expand_epilogue (bool); -extern bool sh_need_epilogue (void); extern void sh_set_return_address (rtx, rtx); extern int initial_elimination_offset (int, int); extern bool fldi_ok (void); Index: gcc/config/sh/sh.c === --- gcc/config/sh/sh.c (revision 191161) +++ gcc/config/sh/sh.c (working copy) @@ -7901,22 +7901,6 @@ static int sh_need_epilogue_known = 0; -bool -sh_need_epilogue (void) -{ - if (! sh_need_epilogue_known) -{ - rtx epilogue; - - start_sequence (); - sh_expand_epilogue (0); - epilogue = get_insns (); - end_sequence (); - sh_need_epilogue_known = (epilogue == NULL ? -1 : 1); -} - return sh_need_epilogue_known > 0; -} - /* Emit code to change the current function's return address to RA. TEMP is available as a scratch register, if needed. */ @@ -7996,7 +7980,6 @@ sh_output_function_epilogue (FILE *file ATTRIBUTE_UNUSED, HOST_WIDE_INT size ATTRIBUTE_UNUSED) { - sh_need_epilogue_known = 0; } static rtx
Re: [Google 4.7] Fix is_static in gdb index generation (issue6498114)
On Mon, Sep 10, 2012 at 3:05 PM, Cary Coutant wrote:
>> The enclosed patch fixes a reversed conditional when calculating the
>> is_static flag while generating gdb index--is_static is the opposite
>> of the DW_AT_external flag.
>>
>> OK for google 4.7?
>>
>> Sterling
>>
>> 2012-09-10 Sterling Augustine
>>
>> * dwarf2out.c (output_pubname): Correct conditional.
>
> OK for google/gcc-4_7.
>
> -cary

Committed as revision 191163.
[C++ Patch] PR 54541 / 54542
Hi,

I'm finishing (in the C++ library testsuite now) testing this
straightforward patch for a couple of "standard" SFINAE issues
(interestingly, however, 54541 is a regression in mainline).

Ok for mainline?

Thanks,
Paolo.

PS: I suspect that the require_complete_type in convert_arg_to_ellipsis
should also be require_complete_type_sfinae

/cp
2012-09-11 Paolo Carlini

PR c++/54541
PR c++/54542
* call.c (build_cxx_call): Add tsubst_flags_t parameter, use
require_complete_type_sfinae.
(build_op_delete_call, build_over_call): Adjust.
* typeck.c (build_x_compound_expr_from_vec): Add tsubst_flags_t
parameter.
(cp_build_function_call_vec): Adjust.
* init.c (build_new_1): Likewise.
* rtti.c (throw_bad_cast, throw_bad_typeid, build_dynamic_cast_1):
Likewise.
* optimize.c (build_delete_destructor_body): Likewise.
* cp-tree.h: Adjust declarations.

/testsuite
2012-09-11 Paolo Carlini

PR c++/54541
PR c++/54542
* g++.dg/cpp0x/sfinae40.C: New.
* g++.dg/cpp0x/sfinae41.C: Likewise.

Index: testsuite/g++.dg/cpp0x/sfinae40.C
===
--- testsuite/g++.dg/cpp0x/sfinae40.C (revision 0)
+++ testsuite/g++.dg/cpp0x/sfinae40.C (revision 0)
@@ -0,0 +1,21 @@
+// PR c++/54541
+// { dg-do compile { target c++11 } }
+
+template T&& declval();
+
+struct X;
+
+X f(int);
+
+template
+void g(decltype((void)f(declval())) *)
+{}
+
+template
+void g(...)
+{}
+
+int main()
+{
+  g(0);
+}
Index: testsuite/g++.dg/cpp0x/sfinae41.C
===
--- testsuite/g++.dg/cpp0x/sfinae41.C (revision 0)
+++ testsuite/g++.dg/cpp0x/sfinae41.C (revision 0)
@@ -0,0 +1,17 @@
+// PR c++/54542
+// { dg-do compile { target c++11 } }
+
+template
+void f(decltype(new T(1, 2)) *)
+{
+  T(1, 2);
+}
+
+template
+void f(...)
+{}
+
+int main()
+{
+  f(0);
+}
Index: cp/typeck.c
===
--- cp/typeck.c (revision 191162)
+++ cp/typeck.c (working copy)
@@ -3373,7 +3373,7 @@ cp_build_function_call_vec (tree function, VEC(tre
      null parameters. */
   check_function_arguments (fntype, nargs, argarray);
 
-  ret = build_cxx_call (function, nargs, argarray);
+  ret = build_cxx_call (function, nargs, argarray, complain);
 
   if (allocated != NULL)
     release_tree_vector (allocated);
@@ -5719,7 +5719,8 @@ build_x_compound_expr_from_list (tree list, expr_l
 /* Like build_x_compound_expr_from_list, but using a VEC. */
 
 tree
-build_x_compound_expr_from_vec (VEC(tree,gc) *vec, const char *msg)
+build_x_compound_expr_from_vec (VEC(tree,gc) *vec, const char *msg,
+                                tsubst_flags_t complain)
 {
   if (VEC_empty (tree, vec))
     return NULL_TREE;
@@ -5732,14 +5733,19 @@ tree
       tree t;
 
       if (msg != NULL)
-        permerror (input_location,
-                   "%s expression list treated as compound expression",
-                   msg);
+        {
+          if (complain & tf_error)
+            permerror (input_location,
+                       "%s expression list treated as compound expression",
+                       msg);
+          else
+            return error_mark_node;
+        }
 
       expr = VEC_index (tree, vec, 0);
       for (ix = 1; VEC_iterate (tree, vec, ix, t); ++ix)
         expr = build_x_compound_expr (EXPR_LOCATION (t), expr,
-                                      t, tf_warning_or_error);
+                                      t, complain);
 
       return expr;
     }
Index: cp/optimize.c
===
--- cp/optimize.c (revision 191162)
+++ cp/optimize.c (working copy)
@@ -128,7 +128,8 @@ build_delete_destructor_body (tree delete_dtor, tr
 
   /* Call the corresponding complete destructor. */
   gcc_assert (complete_dtor);
-  call_dtor = build_cxx_call (complete_dtor, 1, &parm);
+  call_dtor = build_cxx_call (complete_dtor, 1, &parm,
+                              tf_warning_or_error);
   add_stmt (call_dtor);
 
   add_stmt (build_stmt (0, LABEL_EXPR, cdtor_label));
Index: cp/init.c
===
--- cp/init.c (revision 191162)
+++ cp/init.c (working copy)
@@ -2740,7 +2740,8 @@ build_new_1 (VEC(tree,gc) **placement, tree type,
           /* We are processing something like `new int (10)', which
              means allocate an int, and initialize it with 10. */
 
-          ie = build_x_compound_expr_from_vec (*init, "new initializer");
+          ie = build_x_compound_expr_from_vec (*init, "new initializer",
+                                               complain);
           init_expr = cp_build_modify_expr (init_expr, INIT_EXPR,
                                             ie, complain);
         }
Index: cp/rtti.c
[patch] PR54149: fix data race in LIM pass
In this failing testcase the LIM pass writes to g_13 regardless of the
initial value of g_13, which is the test protecting the write. This
causes an incorrect store data race wrt both the C++ memory model and
transactional memory (the latter if the store occurs inside of a
transaction).

The problem here is that the ``lsm_flag'' temporary should only be set
to true on the code paths where we actually set the original global. As
it stands, we are setting lsm_flag to true for reads or writes.

Fixed by only setting lsm_flag=1 when the original code path has a
write.

Tested on x86-64 Linux.

OK for trunk?

PR middle-end/54149
* tree-ssa-loop-im.c (execute_sm_if_changed_flag_set): Only set
flag for writes.

diff --git a/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c b/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c
new file mode 100644
index 000..735cf5b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c
@@ -0,0 +1,54 @@
+/* { dg-do link } */
+/* { dg-options "--param allow-store-data-races=0 -O" } */
+/* { dg-final { simulate-thread } } */
+
+#include
+#include
+
+#include "simulate-thread.h"
+
+/* PR 54139 */
+/* Test that speculative stores do not happen for --param
+   allow-store-data-races=0. */
+
+int g_13=1, insns=1;
+
+__attribute__((noinline))
+void simulate_thread_main()
+{
+  int l_245;
+
+  /* Since g_13 is unilaterally set positive above, there should be
+     no store to g_13 below. */
+  for (l_245 = 0; l_245 <= 1; l_245 += 1)
+    for (; g_13 <= 0; g_13 = 1)
+      ;
+}
+
+int main()
+{
+  simulate_thread_main ();
+  simulate_thread_done ();
+  return 0;
+}
+
+void simulate_thread_other_threads ()
+{
+  ++g_13;
+  ++insns;
+}
+
+int simulate_thread_step_verify ()
+{
+  return 0;
+}
+
+int simulate_thread_final_verify ()
+{
+  if (g_13 != insns)
+    {
+      printf("FAIL: g_13 was incorrectly cached\n");
+      return 1;
+    }
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 0f61631..c5fc324 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2113,9 +2113,14 @@ execute_sm_if_changed_flag_set (struct loop *loop, mem_ref_p ref)
       gimple_stmt_iterator gsi;
       gimple stmt;
 
-      gsi = gsi_for_stmt (loc->stmt);
-      stmt = gimple_build_assign (flag, boolean_true_node);
-      gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+      /* Only set the flag for writes. */
+      if (is_gimple_assign (loc->stmt)
+          && gimple_assign_lhs (loc->stmt) == *loc->ref)
+        {
+          gsi = gsi_for_stmt (loc->stmt);
+          stmt = gimple_build_assign (flag, boolean_true_node);
+          gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+        }
     }
   VEC_free (mem_ref_loc_p, heap, locs);
   return flag;
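
For readers who want the race spelled out, the effect of the flag can be sketched in plain C++. This is a hand-written model of the loop after store motion, not GCC output: the names run_loop, flag_on_reads, g and stores are made up for illustration, and only lsm_flag is taken from the description above.

```cpp
#include <cassert>

int g = 1;       // stands in for g_13 from the testcase
int stores = 0;  // counts stores to g made by this sketch

// Hand-written model of the loop after store motion (an assumption
// about its shape, not actual compiler output).  With
// flag_on_reads == true the flag is also set where g is only read,
// which models the behaviour described as the bug; with false it is
// set only on the path that wrote to g in the original loop.
void run_loop(bool flag_on_reads)
{
  int tmp = g;            // load of g hoisted out of the loop
  bool lsm_flag = false;

  for (int i = 0; i <= 1; ++i)
    {
      if (flag_on_reads)
        lsm_flag = true;  // set for the read in the inner condition
      for (; tmp <= 0; tmp = 1)
        lsm_flag = true;  // this path reaches the write "g = 1"
    }

  if (lsm_flag)           // conditional store back to the global
    {
      g = tmp;
      ++stores;
    }
}
```

With g initially positive, run_loop(true) still stores to g once, which is the store another thread could race with; run_loop(false) performs no store unless the inner loop body actually ran.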
[patch] Expand SJLJ exceptions as tablejump/casesi
On Wed, Apr 18, 2012 at 2:44 PM, Richard Henderson wrote:
> On 04/18/2012 05:39 AM, Jan Hubicka wrote:
>> Well, if SJLJ lowering happens as gimple pass somewhere near the end of
>> gimple
>> queue, this should not be problem at all. (and implementation would be
>> cleaner)
>
> If you can find a clean way of separating sjlj expansion from dw2 expansion,
> please do. But there's a lot of code shared between the two. I see nothing
> wrong with always expanding via tablejump.

Hello,

This patch makes it so... well, almost anyway. With this patch, all
SJLJ exception dispatch tables with more than 5 case labels are
expanded as a tablejump or a casesi.

With the existing code:

* a 1-case dispatch is expanded as a simple jump by code in except.c
* a >1-case dispatch is built as a GIMPLE_SWITCH and expanded via
  expand_case as a decision tree if there are fewer than
  case_values_threshold() cases, or a tablejump or casesi for
  everything else

If my patch is OK, this changes to:

* a 1-case dispatch is expanded as a simple jump by code in except.c
  (unchanged)
* a >1-case dispatch is built as a VEC of CASE_LABEL nodes and expanded
  via expand_sjlj_dispatch_table as a decrement-chain if there are
  fewer than 5 cases, or a tablejump or casesi for everything else

Bootstrapped&tested on powerpc64-unknown-linux-gnu, and
multilib-bootstrapped on x86_64-unknown-linux-gnu with java enabled and
with --enable-sjlj-exceptions.

OK for trunk?

Ciao!
Steven

sjlj_tablejump.diff
Description: Binary data
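
As a rough illustration of the two dispatch shapes mentioned above, here is a small C++ sketch. It assumes "decrement-chain" means "test the index against zero, otherwise decrement and test again"; that reading, the function names, and the use of integers in place of landing-pad labels are all assumptions made for the example, not taken from the attached patch.

```cpp
#include <cassert>

// Decrement-chain dispatch for four cases: compare with zero, and if
// that fails decrement the index and compare again.  The returned
// integer stands in for the landing pad that would be jumped to.
// Assumes 0 <= index <= 3.
int dispatch_chain(int index)
{
  if (index == 0)
    return 0;
  if (--index == 0)
    return 1;
  if (--index == 0)
    return 2;
  return 3;               // last case needs no comparison
}

// Table dispatch: a single indexed lookup, whatever the case count.
// Assumes 0 <= index <= 3.
int dispatch_table(int index)
{
  static const int landing_pad[] = { 0, 1, 2, 3 };
  return landing_pad[index];
}
```

Both functions pick the same target for every valid index; the chain costs up to one compare per case, which is why it only makes sense for a handful of cases.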
Re: [PATCH 1-2/12 ] New configure option --enable-espf=(all|ssp|pie|no)
On Friday 07 September 2012 18.52.11 you wrote:
> On Fri, 7 Sep 2012, Magnus Granberg wrote:
> > * Makefile.in Add
> > -fno-stack-protector when
> >
> > needed for espf.
>
> Toplevel Makefile.in is a generated file. You need to patch Makefile.def
> or Makefile.tpl and regenerate Makefile.in.
>
> I'm surprised this passes bootstrap, since I wouldn't expect bootstrap to
> avoid -Wformat-security warnings, and all the previous patch submissions I
> recall to avoid such warnings have been incorrect (you can't just change
> error (msg) to error ("%s", msg) when the reason the code is written how
> it is is that no-argument formats such as %< and %> may appear in msg and
> need interpreting).

I have updated the Makefile and configure patch, and it now bootstraps
with --enable-werror; I did not have that enabled last time. I have a
new ChangeLog too. Thank you for the help.

Gentoo Hardened Project
Magnus Granberg

2012-09-10 Magnus Granberg
Kees Cook

gcc/doc/
* invoke.texi Add notes to -Wformat, -Wformat-security, -O2,
-fstack-protector, -fPIE and -pie for espf.
* install.texi Add new configure options

2012-08-26 Magnus Granberg
Kees Cook

gcc/testsuite
* gcc.dg/charset/builtin2.c Add -Wno-format when effective_target
is espf.
* gcc.dg/format/format.exp Likewise.
* gcc.dg/pr30473.c Likewise.
* gcc.dg/pr38902.c Likewise.
* gcc.dg/ipa/ipa-sra-1.c Likewise.
* gcc.dg/torture/tls/tls-test.c Likewise.
* g++.dg/abi/pragma-pack1.C Likewise.
* g++.dg/cpp0x/constexpr-tuple.C Likewise.
* lib/target-supports.exp Add check_effective_target_espf.
* gcc.c-torture/execute/memset-1.x New file
* gcc.c-torture/execute/vprintf-chk-1.x Likewise.
* gcc.c-torture/execute/vfprintf-chk-1.x Likewise.
* gcc.dg/stack-usage-1.c Add -fno-stack-protector when
effective_target is espf.
* gcc.dg/superblock.c Likewise.
* gcc.dg/20021014-1.c Add -fno-PIE when effective_target is espf.
* gcc.dg/nest.c Likewise.
* gcc.dg/nested-func-4.c Likewise.
* gcc.dg/pr32450.c Likewise.
* gcc.dg/pr43643.c Likewise.
* g++.dg/other/anon5.C Likewise.
* g++.old-deja/g++.law/profile1.C Likewise.
* gcc.dg/tree-ssa/ssa-store-ccp-3.c Skip the test.

2012-08-27 Magnus Granberg
Kees Cook

gcc/testsuite/
PR 39537
* g++.dg/ext/align1.C Remove printf
* g++.old-deja/g++.law/operators28.C Fix format-string/type.
* gcc.dg/torture/matrix-2.c Likewise.
* gcc.dg/packed-vla.c Likewise.
* g++.dg/opt/alias2.C Likewise.
* g++.old-deja/g++.abi/vbase1.C Likewise.
* g++.old-deja/g++.brendan/template8.C Likewise.
* g++.old-deja/g++.eh/ptr1.C Likewise.
* g++.old-deja/g++.jason/access23.C Likewise.
* g++.old-deja/g++.law/cvt8.C Likewise.
* g++.old-deja/g++.mike/net35.C Likewise.
* g++.old-deja/g++.mike/offset1.C Likewise.
* g++.old-deja/g++.mike/p12306.C Likewise.
* g++.old-deja/g++.mike/p3579.C Likewise.
* g++.old-deja/g++.mike/p3708a.C Likewise.
* g++.old-deja/g++.mike/p3708b.C Likewise.
* g++.old-deja/g++.mike/p3708.C Likewise.
* g++.old-deja/g++.mike/p646.C Likewise.
* g++.old-deja/g++.mike/p710.C Likewise.
* g++.old-deja/g++.mike/p789a.C Likewise.
* g++.old-deja/g++.mike/pmf2.C Likewise.
* g++.old-deja/g++.mike/temp.C Likewise.
* g++.old-deja/g++.other/temporary1.C Likewise.
* g++.old-deja/g++.other/virtual8.C Likewise.
* g++.old-deja/g++.pt/memtemp23.C
Re: [PATCH 3-4/12 ] New configure option --enable-espf=(all|ssp|pie|no)
On Friday 07 September 2012 18.43.59 you wrote:
> On Fri, 7 Sep 2012, Magnus Granberg wrote:
> > --- a/gcc/config/linux.h 2011-07-07 17:38:34.0 +0200
> > +++ b/gcc/config/linux.h 2012-07-09 14:24:08.599281404 +0200
>
> I see nothing related specifically to Linux rather than other targets that
> may use GNU userspace, so I think all this belongs in gnu-user.h.
>
> > --- a/gcc/config/i386/linux.h 2011-06-03 20:30:39.0 +0200
>
> Likewise.
>
> > +#if def ENABLE_ESPF
>
> Stray space inside #ifdef.

I have updated the patch and moved the changes from linux*.h to
gnu-user*.h. Thank you for the hints.

Gentoo Hardened Project
Magnus Granberg

--- a/gcc/config/gnu-user.h 2011-04-28 18:49:49.0 +0200
+++ b/gcc/config/gnu-user.h 2012-09-08 18:22:41.020729353 +0200
@@ -98,3 +98,31 @@ see the files COPYING3 and COPYING.RUNTI
 #define TARGET_C99_FUNCTIONS 1
 
 #define TARGET_HAS_SINCOS 1
+
+#ifdef ENABLE_ESPF
+#ifdef ENABLE_ESPF_PIE
+#define ESPF_GCC_PIE_SPEC \
+"%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE| \
+shared|static|nostdlib|nostartfiles:;:-fPIE -pie}"
+#else
+#define ESPF_GCC_PIE_SPEC ""
+#endif
+#ifdef ENABLE_ESPF_SSP
+#define ESPF_GCC_SSP_SPEC \
+"%{nostdlib|nodefaultlibs|fno-stack-protector| \
+fstack-protector|fstack-protector-all:;:-fstack-protector}"
+#else
+#define ESPF_GCC_SSP_SPEC ""
+#endif
+#ifdef ENABLE_ESPF_FORTIFY
+#define ESPF_CPP_UNIQUE_OPTIONS_SPEC \
+"%{D_FORTIFY_SOURCE|D_FORTIFY_SOURCE=*|U_FORTIFY_SOURCE:;:-D_FORTIFY_SOURCE=2}"
+#else
+#define ESPF_CPP_UNIQUE_OPTIONS_SPEC ""
+#endif
+#define ESPF_DRIVER_SELF_SPECS \
+ESPF_GCC_PIE_SPEC, \
+ESPF_GCC_SSP_SPEC
+#define ESPF_EXTRA_SPECS \
+{ "espf_cpp_unique_options", ESPF_CPP_UNIQUE_OPTIONS_SPEC }
+#endif
--- a/gcc/config/i386/gnu-user.h 2011-05-05 14:32:50.0 +0200
+++ b/gcc/config/i386/gnu-user.h 2012-07-09 14:28:38.726289455 +0200
@@ -93,9 +93,16 @@ along with GCC; see the file COPYING3.
 "--32 %{!mno-sse2avx:%{mavx:-msse2avx}} %{msse2avx:%{!mavx:-msse2avx}}"
 
 #undef SUBTARGET_EXTRA_SPECS
+#ifdef ENABLE_ESPF
 #define SUBTARGET_EXTRA_SPECS \
 { "link_emulation", GNU_USER_LINK_EMULATION },\
- { "dynamic_linker", GNU_USER_DYNAMIC_LINKER }
+ { "dynamic_linker", GNU_USER_DYNAMIC_LINKER }, \
+ ESPF_EXTRA_SPECS
+#else
+#define SUBTARGET_EXTRA_SPECS \
+ { "link_emulation", GNU_USER_LINK_EMULATION },\
+ { "dynamic_linker", GNU_USER_DYNAMIC_LINKER }
+#endif
 
 #undef LINK_SPEC
 #define LINK_SPEC "-m %(link_emulation) %{shared:-shared} \
@@ -202,3 +159,7 @@ along with GCC; see the file COPYING3.
 #define TARGET_CAN_SPLIT_STACK
 #define TARGET_THREAD_SPLIT_STACK_OFFSET 0x30
 #endif
+
+#ifdef ENABLE_ESPF
+#define DRIVER_SELF_SPECS ESPF_DRIVER_SELF_SPECS
+#endif
--- gcc-4.8-20120302/gcc/config/i386/gnu-user64.h 2012-06-30 00:21:30.0 +0200
+++ gcc-4.8-20120302-work/gcc/config/i386/gnu-user64.h 2012-09-08 18:14:03.683713936 +0200
@@ -94,3 +94,7 @@ see the files COPYING3 and COPYING.RUNTI
 
 #undef WCHAR_TYPE
 #define WCHAR_TYPE (TARGET_LP64 ? "int" : "long int")
+
+#ifdef ENABLE_ESPF
+#define DRIVER_SELF_SPECS ESPF_DRIVER_SELF_SPECS
+#endif
--- a/gcc/config/i386/i386.h 2011-11-24 23:11:12.0 +0100
+++ b/gcc/config/i386/i386.h 2012-07-09 14:21:24.575276517 +0200
@@ -617,13 +617,16 @@ enum target_cpu_default
    Do not define this macro if it does not need to do anything. */
 
 #ifndef SUBTARGET_EXTRA_SPECS
+#ifdef ENABLE_ESPF
+#define SUBTARGET_EXTRA_SPECS ESPF_EXTRA_SPECS
+#else
 #define SUBTARGET_EXTRA_SPECS
 #endif
+#endif
 
 #define EXTRA_SPECS \
 { "cc1_cpu", CC1_CPU_SPEC }, \
 SUBTARGET_EXTRA_SPECS
-
 
 /* Set the value of FLT_EVAL_METHOD in float.h. When using only the
    FPU, assume that the fpcw is set to extended precision; when using
Re: [PATCH 8/12 ] New configure option --enable-espf=(all|ssp|pie|no)
On Friday 07 September 2012 18.41.29 Joseph S. Myers wrote:
> On Fri, 7 Sep 2012, Magnus Granberg wrote:
> > +NOTE: With configure --enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}is
>
> @emph{Note:} (existing style). @option{--enable-espf}.
>
> > +this option enabled by default for C, C++, ObjC, ObjC++.
> > +To disable, use @option{-Wformat=0}.
>
> -Wno-format rather than -Wformat=0.
>
> The same comments apply several times in the patch.
>
> > +@option{-shared}, @option{-nodefaultlibs}, nor @option{static} are found.
>
> @option{-static} (missing '-'). Likewise elsewhere in the patch.

I have updated the patch. Thank you for the hints.

Gentoo Hardened Project
Magnus Granberg

--- a/gcc/doc/invoke.texi 2012-03-01 10:57:59.0 +0100
+++ b/gcc/doc/invoke.texi 2012-07-30 00:57:03.766847851 +0200
@@ -3216,6 +3216,11 @@ aspects of format checking, the options
 @option{-Wformat-nonliteral}, @option{-Wformat-security}, and
 @option{-Wformat=2} are available, but are not included in @option{-Wall}.
 
+@emph{Note:} (existing style).
+With @option{--enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}}this option is
+enabled by default for C, C++, ObjC, ObjC++. To disable, use
+@option{-Wno-format}.
+
 @item -Wformat-y2k
 @opindex Wformat-y2k
 @opindex Wno-format-y2k
@@ -3269,6 +3273,13 @@ currently a subset of what @option{-Wfor
 in future warnings may be added to @option{-Wformat-security} that are not
 included in @option{-Wformat-nonliteral}.)
 
+@emph{Note:} (existing style).
+With @option{--enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}}this option is
+enabled by default for C, C++, ObjC, ObjC++. To disable, use
+@option{-Wno-format-security}, or disable all format warnings
+with @option{-Wno-format}. To make format security warnings fatal,
+specify @option{-Werror=format-security}.
+
 @item -Wformat=2
 @opindex Wformat=2
 @opindex Wno-format=2
@@ -6229,6 +6239,14 @@ also turns on the following optimization
 Please note the warning under @option{-fgcse} about
 invoking @option{-O2} on programs that use computed gotos.
 
+@emph{Note:} (existing style).
+With @option{--enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}},
+@option{-D_FORTIFY_SOURCE=2} is set by default, and is activated
+when @option{-O} is set to 2 or higher. This enables additional
+compile-time and run-time checks for several libc functions.
+To disable, specify either @option{-U_FORTIFY_SOURCE} or
+@option{-D_FORTIFY_SOURCE=0}.
+
 @item -O3
 @opindex O3
 Optimize yet more. @option{-O3} turns on all optimizations specified
@@ -8475,6 +8492,13 @@ functions with buffers larger than 8 byt
 when a function is entered and then checked when the function exits.
 If a guard check fails, an error message is printed and the program exits.
 
+@emph{Note:} (existing style).
+With @option{--enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}} this option
+is enabled by default for C, C++, ObjC, ObjC++, if none of
+@option{-fno-stack-protector}, @option{-nostdlib},
+@option{-fno-stack-protector-all}, @option{nodefaultlibs},
+nor @option{-ffreestanding} are found.
+
 @item -fstack-protector-all
 @opindex fstack-protector-all
 Like @option{-fstack-protector} except that all functions are protected.
@@ -9457,6 +9480,13 @@ For predictable results, you must also s
 that were used to generate code (@option{-fpie}, @option{-fPIE},
 or model suboptions) when you specify this option.
 
+@emph{Note:} (existing style).
+With @option{--enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}} this option is
+enabled by default for C, C++, ObjC, ObjC++, if none of @option{-fno-PIE},
+@option{-fno-pie}, @option{-fPIC}, @option{-fpic}, @option{-fno-PIC},
+@option{-fno-pic}, @option{-nostdlib}, @option{-nostartfiles},
+@option{-shared}, @option{-nodefaultlibs}, nor @option{-static} are found.
+
 @item -rdynamic
 @opindex rdynamic
 Pass the flag @option{-export-dynamic} to the ELF linker, on targets
@@ -19125,6 +19154,13 @@ used during linking.
 @code{__pie__} and @code{__PIE__}. The macros have the value 1
 for @option{-fpie} and 2 for @option{-fPIE}.
 
+@emph{Note:} (existing style).
+With @option{--enable-espf=@r{[}all@r{|}ssp@r{|}pie@r{]}} this option is
+enabled by default for C, C++, ObjC, ObjC++, if none of @option{-fno-PIE},
+@option{-fno-pie}, @option{-fPIC}, @option{-fpic}, @option{-fno-PIC},
+@option{-fno-pic}, @option{-nostdlib}, @option{-nostartfiles},
+@option{-shared}, @option{-nodefaultlibs}, nor @option{-static} are found.
+
 @item -fno-jump-tables
 @opindex fno-jump-tables
 Do not use jump tables for switch statements even where it would be
--- a/gcc/doc/install.texi 2012-03-02 10:37:30.0 +0100
+++ b/gcc/doc/install.texi 2012-07-23 18:05:14.160784593 +0200
@@ -1392,6 +1392,18 @@ do a @samp{make -C gcc gnatlib_and_tools
 Specify that the run-time libraries for stack smashing protection
 should not be built.
 
+@item --enable-espf=@var{list}
+Will turn on some compiler and preprosessor options as default.
+@option{-D_FORTIFY_SOURCE=2}, @option{-Wformat} and
+@option{-Wformat-se
C++ PATCH for c++/54538 (lambda mangling)
This bug was introduced by the fix for 53783; the change from
tsubst_copy to tsubst messed up handling of FIELD_DECLs, because tsubst
of a FIELD_DECL always creates a new one. Fixed by limiting the 53783
change to FUNCTION_DECLs.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit d61bd5ca5d9e57ed3bb2b82cacbcbc110f874349
Author: Jason Merrill
Date: Mon Sep 10 15:26:44 2012 -0400

    PR c++/54538
    PR c++/53783
    * pt.c (tsubst_copy_and_build) [LAMBDA_EXPR]: Go back to using RECUR
    for LAMBDA_EXPR_EXTRA_SCOPE except for function scope.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index cde83f2..a875528 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14199,8 +14199,18 @@ tsubst_copy_and_build (tree t,
 	LAMBDA_EXPR_MUTABLE_P (r) = LAMBDA_EXPR_MUTABLE_P (t);
 	LAMBDA_EXPR_DISCRIMINATOR (r)
 	  = (LAMBDA_EXPR_DISCRIMINATOR (t));
-	LAMBDA_EXPR_EXTRA_SCOPE (r)
-	  = tsubst (LAMBDA_EXPR_EXTRA_SCOPE (t), args, complain, in_decl);
+	/* For a function scope, we want to use tsubst so that we don't
+	   complain about referring to an auto function before its return
+	   type has been deduced. Otherwise, we want to use tsubst_copy so
+	   that we look up the existing field/parameter/variable rather
+	   than build a new one. */
+	tree scope = LAMBDA_EXPR_EXTRA_SCOPE (t);
+	if (scope && TREE_CODE (scope) == FUNCTION_DECL)
+	  scope = tsubst (LAMBDA_EXPR_EXTRA_SCOPE (t), args,
+			  complain, in_decl);
+	else
+	  scope = RECUR (scope);
+	LAMBDA_EXPR_EXTRA_SCOPE (r) = scope;
 	LAMBDA_EXPR_RETURN_TYPE (r)
 	  = tsubst (LAMBDA_EXPR_RETURN_TYPE (t), args, complain, in_decl);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle4.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle4.C
new file mode 100644
index 000..0d37637
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle4.C
@@ -0,0 +1,13 @@
+// PR c++/54538
+// { dg-do compile { target c++11 } }
+
+template
+struct A
+{
+  // { dg-final { scan-assembler "_ZNK1AIcE1pMUlvE_cvPFvvEEv" } }
+  // { dg-final { scan-assembler "_ZNK1AIiE1pMUlvE_cvPFvvEEv" } }
+  void (*p)() = []{};
+};
+
+A a1;
+A a2;
Re: [C++ Patch] PR 54541 / 54542
OK.

> PS: I suspect that the require_complete_type in convert_arg_to_ellipsis
> should also be require_complete_type_sfinae

Agreed.

Jason
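
For context, the kind of SFINAE case covered by PR 54542 can be sketched as below. This is a minimal illustration, not the testcase from the patch: the names pick and Pair are made up, and it assumes the interesting instantiation is T = int, for which `new T(1, 2)` is ill-formed and the first overload should be silently discarded.

```cpp
#include <cassert>

// Viable only when `new T(1, 2)` is well-formed for T; otherwise
// substitution fails and this overload is dropped without an error.
template <typename T>
int pick(decltype(new T(1, 2)) *)
{
  return 1;
}

// Fallback that is always viable but loses overload resolution
// whenever the first overload survives substitution.
template <typename T>
int pick(...)
{
  return 0;
}

// Hypothetical type with a two-argument constructor, so that
// `new Pair(1, 2)` is well-formed.
struct Pair
{
  Pair(int, int) {}
};
```

With a conforming compiler, `pick<int>(nullptr)` selects the variadic fallback and `pick<Pair>(nullptr)` selects the first overload.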
Re: status of -fstack-protector-strong?
On 08.09.2012 01:07, Kees Cook wrote:
> Hi,
>
> I'm curious about the status of this patch:
> http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00974.html
>
> Chrome OS uses this, and the Ubuntu Security Team has expressed
> interest in it as well. What's needed to land this in gcc?

I don't see any statement about testsuite results and possible
regressions with this patch (not only on x86 platforms).

Matthias
Re: [SH] Add simple_return pattern
Christian Bruel wrote:
> This patch implements the simple_return pattern to enable -fshrink-wrap
> on SH. It also clean up some redundancies for expand_epilogue (called
> twice from the "return" and "epilogue" patterns and the
> sh_expand_prologue parameter type.
>
> No regressions with sh-superh-elf and sh4-linux gcc testsuites.

With the patch + revision 191106, I've got a new failure:

FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE (internal compiler error)

for sh4-unknown-linux-gnu. My testsuite/gcc/gcc.log says

/exp/ldroot/dodes/xsh-gcc/gcc/xgcc -B/exp/ldroot/dodes/xsh-gcc/gcc/ /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c -fno-diagnostics-show-caret -O2 -freorder-blocks-and-partition -fprofile-use -D_PROFILE_USE -lm -o /exp/ldroot/dodes/xsh-gcc/gcc/testsuite/gcc/bb-reorg.x02
/exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c: In function 'main':
/exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1: error: EDGE_CROSSING missing across section boundary
/exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1: internal compiler error: verify_flow_info failed
Please submit a full bug report,

Regards,
kaz
[RFC / Patch] PR 54403
Hi,

I just had a quick look at this PR. For:

template class Foo
{
  bool m_barbar;
  void Bar()
  {
    auto bar = [this]() { if (!m_barbar) { } };
  }
};

we ICE with a segfault in lvalue_kind, at line 147, because for ref we
have an INDIRECT_REF with a null TREE_TYPE:

    case INDIRECT_REF:
    case ARROW_EXPR:
    case ARRAY_REF:
    case PARM_DECL:
    case RESULT_DECL:
      if (TREE_CODE (TREE_TYPE (ref)) != METHOD_TYPE)   // 147
        return clk_ordinary;
      break;

I noticed that elsewhere in the function we handle a null TREE_TYPE, and
I quickly tried the trivial attached patchlet, which works for the
testcase in that lvalue_kind ends up simply returning clk_none.

Does it make sense to you?

Thanks!
Paolo.

/
Index: tree.c
===
--- tree.c (revision 191169)
+++ tree.c (working copy)
@@ -144,7 +144,7 @@ lvalue_kind (const_tree ref)
     case ARRAY_REF:
     case PARM_DECL:
     case RESULT_DECL:
-      if (TREE_CODE (TREE_TYPE (ref)) != METHOD_TYPE)
+      if (TREE_TYPE (ref) && TREE_CODE (TREE_TYPE (ref)) != METHOD_TYPE)
         return clk_ordinary;
       break;
 
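
The shape of the guard can be shown outside GCC with a few stand-in types. Everything here is hypothetical: Type, Expr and classify are simplified placeholders for GCC's tree nodes and lvalue_kind, and only the enumerator names are borrowed from the message above.

```cpp
#include <cassert>

enum TypeCode { METHOD_TYPE, OTHER_TYPE };
enum LvalueKind { clk_none, clk_ordinary };

// Stand-in for a type node.
struct Type
{
  TypeCode code;
};

// Stand-in for an expression node whose type may still be unknown
// (null), as happens for the INDIRECT_REF in the report.
struct Expr
{
  const Type *type;
};

// The null check short-circuits before the dereference, so an
// expression without a type falls through to clk_none instead of
// crashing.
LvalueKind classify(const Expr &ref)
{
  if (ref.type && ref.type->code != METHOD_TYPE)
    return clk_ordinary;
  return clk_none;
}
```

Whether clk_none is the right answer for the real INDIRECT_REF is exactly the question the message asks; the sketch only shows that the guarded test no longer dereferences a null pointer.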
Re: [PATCH] PowerPC VLE port
On Mon, Sep 10, 2012 at 5:28 PM, Maciej W. Rozycki wrote:
> David,
>
>> The %c print_operand modifier was added by Aldy for a pattern that he
>> added in 2004 and removed the same year. However, he did not remove
>> the modifier.
>
> Indeed -- introduced with r80876 and then removed in r84775 -- thanks for
> digging out the history behind this code.
> So here's a new change to discard the case altogether, along with a
> leftover comment from r80876 that should have been removed in r84775 too.
>
> OK to apply?
>
> 2012-09-10 Maciej W. Rozycki
>
> gcc/
> * config/rs6000/rs6000.c (print_operand) <'c'>: Remove.
> * config/rs6000/spe.md: Remove a leftover comment.

Okay.

Thanks, David
Re: [PATCH] Fix PR 54362 (COND_EXPR not understood by ITM)
On Tue, Sep 4, 2012 at 12:20 AM, Andrew Pinski wrote:
> Hi,
> The problem here is that trans-mem.c does not take into account that
> COND_EXPR can happen for pointers. This patch modifies
> thread_private_new_memory to handle COND_EXPR as it can handle PHI
> nodes. The testcase is a modified version of memopt-12.c but with a
> loop which both LIM and if-convert can change the conditional to a
> COND_EXPR.
>
> I found this problem when I was producing a pass which does a full
> if-convert before expanding (well changing the last phi-opt pass) and
> it produces COND_EXPRs and memopt-12.c started to fail.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Applied after approval from RTH offline.

Thanks,
Andrew

>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * trans-mem.c (thread_private_new_memory): Handle COND_EXPR also.
>
> testsuite/ChangeLog:
> * gcc.dg/tm/memopt-16.c: New testcase.
Re: Remove unnecessary VEC function overloads.
On Mon, Sep 10, 2012 at 4:52 PM, Diego Novillo wrote:
>
> Ian, could you commit the changes in go/gofrontend?

Done. Actually, it looks like you already committed them, but I
brought the master repo up to date.

Ian

> 2012-09-10 Diego Novillo
>
> * vec.h (vec_t::quick_push): Remove overload that accepts 'T *'.
> Update all users.
> (vec_t::safe_push): Likewise.
> (vec_t::quick_insert): Likewise.
> (vec_t::lower_bound): Likewise.
> (vec_t::safe_insert): Likewise.
> (vec_t::replace): Change second argument to 'T &'.
Re: [C++ Patch] for c++/54537
Oops, not sure how I tested that change initially, or I must be blind,
because it triggers an error in tr1/cmath about pow. I'll see what I
can do...

2012/9/10 Jason Merrill :
> OK.
>
> Jason

--
Fabien
[Fortran, Patch] PR 54225 - Fix ice-on-invalid-code with "*" in array refs
This patch fixes a GCC 4.7/4.8 regression for invalid code.

Built and regtested on x86-64-linux.
OK for the trunk and 4.7?

Tobias

2012-09-11 Tobias Burnus

PR fortran/54225
* array.c (match_subscript, gfc_match_array_ref): Fix
diagnostic of coarray's '*'.

2012-09-11 Tobias Burnus

PR fortran/54225
* gfortran.dg/coarray_10.f90: Update dg-error.
* gfortran.dg/coarray_28.f90: New.
* gfortran.dg/array_section_3.f90: New.

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index 07fecd8..44ec72e 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -91,9 +91,7 @@ match_subscript (gfc_array_ref *ar, int init, bool match_star)
   else if (!star)
     m = gfc_match_expr (&ar->start[i]);
 
-  if (m == MATCH_NO && gfc_match_char ('*') == MATCH_YES)
-    return MATCH_NO;
-  else if (m == MATCH_NO)
+  if (m == MATCH_NO)
     gfc_error ("Expected array subscript at %C");
   if (m != MATCH_YES)
     return MATCH_ERROR;
@@ -224,7 +222,7 @@ coarray:
 
   for (ar->codimen = 0; ar->codimen + ar->dimen < GFC_MAX_DIMENSIONS; ar->codimen++)
     {
-      m = match_subscript (ar, init, ar->codimen == (corank - 1));
+      m = match_subscript (ar, init, true);
       if (m == MATCH_ERROR)
         return MATCH_ERROR;
 
@@ -255,6 +253,13 @@ coarray:
           gfc_error ("Invalid form of coarray reference at %C");
           return MATCH_ERROR;
         }
+      else if (ar->dimen_type[ar->codimen + ar->dimen] == DIMEN_STAR)
+        {
+          gfc_error ("Unexpected '*' for codimension %d of %d at %C",
+                     ar->codimen + 1, corank);
+          return MATCH_ERROR;
+        }
+
       if (ar->codimen >= corank)
         {
           gfc_error ("Invalid codimension %d at %C, only %d codimensions exist",
diff --git a/gcc/testsuite/gfortran.dg/coarray_10.f90 b/gcc/testsuite/gfortran.dg/coarray_10.f90
index 99f5782..78abb5a 100644
--- a/gcc/testsuite/gfortran.dg/coarray_10.f90
+++ b/gcc/testsuite/gfortran.dg/coarray_10.f90
@@ -30,12 +30,12 @@ end subroutine this_image_check
 subroutine rank_mismatch()
   implicit none
   integer,allocatable :: A(:)[:,:,:,:]
-  allocate(A(1)[1,1,1:*]) ! { dg-error "Unexpected ... for codimension" }
+  allocate(A(1)[1,1,1:*]) ! { dg-error "Too few codimensions" }
   allocate(A(1)[1,1,1,1,1,*]) ! { dg-error "Invalid codimension 5" }
   allocate(A(1)[1,1,1,*])
   allocate(A(1)[1,1]) ! { dg-error "Too few codimensions" }
   allocate(A(1)[1,*]) ! { dg-error "Too few codimensions" }
-  allocate(A(1)[1,1:*]) ! { dg-error "Unexpected ... for codimension" }
+  allocate(A(1)[1,1:*]) ! { dg-error "Too few codimensions" }
 
   A(1)[1,1,1] = 1 ! { dg-error "Too few codimensions" }
   A(1)[1,1,1,1,1,1] = 1 ! { dg-error "Invalid codimension 5" }
@@ -48,5 +48,5 @@ end subroutine rank_mismatch
 subroutine rank_mismatch2()
   implicit none
   integer, allocatable:: A(:)[:,:,:]
-  allocate(A(1)[7:8,4:*]) ! { dg-error "Unexpected .*. for codimension 2 of 3" }
+  allocate(A(1)[7:8,4:*]) ! { dg-error "Too few codimensions" }
 end subroutine rank_mismatch2
--- /dev/null 2012-09-11 07:30:33.339725680 +0200
+++ gcc/gcc/testsuite/gfortran.dg/array_section_3.f90 2012-09-10 18:29:37.0 +0200
@@ -0,0 +1,12 @@
+! { dg-do compile }
+!
+! PR fortran/54225
+!
+! Contributed by robb wu
+!
+program test
+  implicit none
+  real :: A(2,3)
+
+  print *, A(1, *) ! { dg-error 'Expected array subscript' }
+end program
--- /dev/null 2012-09-11 07:30:33.339725680 +0200
+++ gcc/gcc/testsuite/gfortran.dg/coarray_28.f90 2012-09-11 07:51:24.0 +0200
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=single" }
+!
+! PR fortran/54225
+!
+
+integer, allocatable :: a[:,:]
+
+allocate (a[*,4]) ! { dg-error "Unexpected '.' for codimension 1 of 2" }
+end
Re: [SH] Add simple_return pattern
On 09/11/2012 12:28 AM, Oleg Endo wrote:
> On Mon, 2012-09-10 at 15:51 +0200, Christian Bruel wrote:
>> This patch implements the simple_return pattern to enable -fshrink-wrap
>> on SH. It also clean up some redundancies for expand_epilogue (called
>> twice from the "return" and "epilogue" patterns and the
>> sh_expand_prologue parameter type.
>>
>> No regressions with sh-superh-elf and sh4-linux gcc testsuites.
>>
>> Thanks
>>
>> Christian
>>
>
> Regarding the iterators, maybe it's better to put them in
> config/sh/iterators.md. The optab code attr is not needed in this case,
> "<code>" is sufficient. How about the attached patch instead?

Yes, there is this new iterators.md file. I'm moving the iterator
there. Will resend.

Thanks

Christian