Re: [PATCH] Allow dg-skip-if to use compiler flags specified through set_board_info cflags
On Tue, Aug 21, 2012 at 03:08:43PM -0700, Mike Stump wrote: > On Aug 11, 2012, at 10:39 AM, Senthil Kumar Selvaraj wrote: > > This patch allows cflags set in board config files using > > "set_board_info cflags" to be used in the selectors of > > dg-skip-if and other dejagnu commands that use the check-flags > > proc. > > Ok. I forgot to mention that I don't have commit access - would you mind comitting the patch to trunk for me? Regards Senthil
Re: RFA: LM32: Fix building libgcc
Hi Sébastien, After applying your patch, I get an assertion failure in emit_return_into_block, at function.c:5600 when trying to compile anything (line is "gcc_assert (ANY_RETURN_P (pat));"). Any ideas? Doh - I should have checked further. The problem is the return_internal pattern. emit_return_into_block() expects that in a return insn that contains more than one piece of RTL, the first piece of RTL will be (return). This is not (currently) true for the lm23. Fixing that let the build proceed as far as truncdfsf2.c in libgcc, This then fails with a return insn that does not match the return pattern. This turns out to be because the "return" pattern should really be an expander that will work when conditions are not being tested. The simplest way to fix this was to create an expander and rename "return" to "basic_return". Next there is seg-fault building ffssi2 in newlib. I do not have a fix for this one yet, but at least libgcc builds now. OK to apply ? Cheers Nick gcc/ChangeLog 2012-08-21 Nick Clifton * config/lm32/lm32.md (return_internal): Place the (return) as the first element in the vector. (return): Changed into an expander. (basic_return): Renamed version of previous return pattern. Index: gcc/config/lm32/lm32.md === --- gcc/config/lm32/lm32.md (revision 190564) +++ gcc/config/lm32/lm32.md (working copy) @@ -629,16 +629,22 @@ ) (define_insn "return_internal" - [(use (match_operand:SI 0 "register_operand" "r")) - (return)] + [(return) + (use (match_operand:SI 0 "register_operand" "r"))] "" "b%0" [(set_attr "type" "uibranch")] ) -(define_insn "return" +(define_expand "return" [(return)] "lm32_can_use_return ()" + "lm32_expand_epilogue (); DONE;" +) + +(define_insn "basic_return" + [(return)] + "lm32_can_use_return ()" "ret" [(set_attr "type" "uibranch")] )
[SH] PR 54089 - Logical right shifts
Hello, This adapts SH's logical right shift patterns to work/look the same way as left shift patterns. It mainly fixes a few issues with dynamic shift insn selection. Tested on rev 190580 with make -k check RUNTESTFLAGS="--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" and no new failures. OK? Cheers, Oleg ChangeLog: PR target/54089 * config/sh/predicates (p27_rshift_count_operand, not_p27_rshift_count_operand): New predicates. * config/sh/sh.c (sh_ashlsi_clobbers_t_reg_p, sh_lshrsi_clobbers_t_reg_p, sh_dynamicalize_shift_p): Handle special case when shift amount is 31. (gen_ashift): Emit gen_shlr instead of gen_lshrsi3_m. * config/sh/sh.md (ashlsi3_d): Set type to 'dyn_shift' instead of 'arith'. (ashlsi_c): Rename to shll. Adapt calls to gen_ashlsi_c throughout the file. (lshrsi3): Remove clobber from expander. Use shift_count_operand instead of nonmemory_operand predicate for second operand. Add handling of case lshrsi3_n_clobbers_t. (lshrsi3_k): Use p27_rshift_count_operand for second operand. (lshrsi3_d): Make insn_and_split. Split dynamic shift to constant shift sequences if beneficial. (lshrsi3_n): Make insn_and_split. Split constant shift sequence to dynamic shift if beneficial. (lshrsi3_n_clobbers_t): New insn_and_split. (lshrsi3_m): Delete. testsuite/ChangeLog: PR target/54089 * gcc.target/sh/pr54089-2.c: New. Index: gcc/config/sh/predicates.md === --- gcc/config/sh/predicates.md (revision 190580) +++ gcc/config/sh/predicates.md (working copy) @@ -825,6 +825,8 @@ return arith_reg_operand (op, mode); }) +;; Predicates for matching operands that are constant shift +;; amounts 1, 2, 8, 16. (define_predicate "p27_shift_count_operand" (and (match_code "const_int") (match_test "satisfies_constraint_P27 (op)"))) @@ -833,6 +835,19 @@ (and (match_code "const_int") (match_test "! satisfies_constraint_P27 (op)"))) +;; For right shifts the constant 1 is a special case because the shlr insn +;; clobbers the T_REG and is handled by the T_REG clobbering version of the +;; insn, which is also used for non-P27 shift sequences. +(define_predicate "p27_rshift_count_operand" + (and (match_code "const_int") + (match_test "satisfies_constraint_P27 (op)") + (match_test "! satisfies_constraint_M (op)"))) + +(define_predicate "not_p27_rshift_count_operand" + (and (match_code "const_int") + (ior (match_test "! satisfies_constraint_P27 (op)") + (match_test "satisfies_constraint_M (op)" + ;; TODO: Add a comment here. (define_predicate "shift_operator" Index: gcc/config/sh/sh.c === --- gcc/config/sh/sh.c (revision 190580) +++ gcc/config/sh/sh.c (working copy) @@ -2840,6 +2840,11 @@ { 4, { 16, 8, 2, 2 }, 0 }, { 4, { 16, -1, -2, 16 }, ASHL_CLOBBERS_T }, { 3, { 16, -2, 16 }, 0 }, + + /* For a right shift by 31 a 2 insn shll-movt sequence can be used. + For a left shift by 31 a 2 insn and-rotl sequences can be used. + However, the shift-and combiner code needs this entry here to be in + terms of real shift insns. */ { 3, { 16, -1, 16 }, ASHL_CLOBBERS_T } }; @@ -2888,7 +2893,14 @@ sh_ashlsi_clobbers_t_reg_p (rtx shift_amount) { gcc_assert (CONST_INT_P (shift_amount)); - return (ashl_lshr_seq[INTVAL (shift_amount) & 31].clobbers_t + + const int shift_amount_i = INTVAL (shift_amount) & 31; + + /* Special case for shift count of 31: use and-rotl sequence. */ + if (shift_amount_i == 31) +return true; + + return (ashl_lshr_seq[shift_amount_i].clobbers_t & ASHL_CLOBBERS_T) != 0; } @@ -2896,10 +2908,39 @@ sh_lshrsi_clobbers_t_reg_p (rtx shift_amount) { gcc_assert (CONST_INT_P (shift_amount)); - return (ashl_lshr_seq[INTVAL (shift_amount) & 31].clobbers_t + + const int shift_amount_i = INTVAL (shift_amount) & 31; + + /* Special case for shift count of 31: use shll-movt sequence. */ + if (shift_amount_i == 31) +return true; + + return (ashl_lshr_seq[shift_amount_i].clobbers_t & LSHR_CLOBBERS_T) != 0; } +/* Return true if it is potentially beneficial to use a dynamic shift + instruction (shad / shar) instead of a combination of 1/2/8/16 + shift instructions for the specified shift count. + If dynamic shifts are not available, always return false. */ +bool +sh_dynamicalize_shift_p (rtx count) +{ + gcc_assert (CONST_INT_P (count)); + + const int shift_amount_i = INTVAL (count) & 31; + int insn_count; + + /* For left and right shifts, there are shorter 2 insn sequences for + shift amounts of 31. */ + if (shift_amount_i == 31) +insn_count = 2; + else +insn_count = ashl_lshr_seq[shift_amount_i].insn_
Re: [patch] two more bitmap obstacks
On Tue, Aug 21, 2012 at 9:26 PM, Steven Bosscher wrote: > Hello, > > Two more bitmap obstacks, this time in tree-ssa-coalesce.c. > > The advantage isn't so much in having the bitmaps on the non-default > obstack, but more in that the bitmaps can be free'ed all at once by > simply releasing the obstack. > > Bootstrapped&tested on x86_64-unknown-linux-gnu. OK for trunk? Ok. Thanks, Richard. > Ciao! > Steven
Re: [PATCH] fix wrong-code bug for -fstrict-volatile-bitfields
On Tue, Aug 21, 2012 at 10:20 PM, Sandra Loosemore wrote: > This patch is a followup to the addition of support for > -fstrict-volatile-bitfields (required by the ARM EABI); see this thread > > http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01889.html > > for discussion of the original patch. > > That patch only addressed the behavior when extracting the value of a > volatile bit field, but the same problems affect storing values into a > volatile bit field (or a field of a packed structure, which is effectively > implemented as a bit field). This patch makes the code for bitfield stores > mirror that for bitfield loads. > > Although the fix is in target-independent code, it's really for ARM; hence > the test case, which (without this patch) generates wrong code. Code to > determine the access width was correctly preserving the user-specified > width, but was incorrectly falling through to code that assumes word mode. > > As well as regression testing on arm-none-eabi, I've bootstrapped and > regression-tested this patch on x86_64 Linux. Earlier versions of this > patch have also been present in our local 4.5 and 4.6 GCC trees for some > time, so it's been well-tested on a variety of other platforms. OK to check > in on mainline? + bool packedp = false; + + if (TREE_CODE(to) == COMPONENT_REF + && (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (to, 0))) + || (TREE_CODE (TREE_OPERAND (to, 1)) == FIELD_DECL + && DECL_PACKED (TREE_OPERAND (to, 1) + packedp = true; note that this is not a reliable way of determining if a field is packed (yes, the existing code is broken in the same way). I realize that you are only using this for diagnostic purposes, still something worth mentioning. I dislike passing around the packedp flag throughout the expander and to warn inside the expander. First, the flag has a name that can be confusing (it does not conservatively flag all packed accesses). Second, why don't you warn for this from inside the frontend when the bitfield access is generated? This way the diagnostic won't depend on optimization levels and you avoid uglifying the expander. Thanks, Richard. > -Sandra > > > 2012-08-21 Paul Brook > Joseph Myers > Sandra Loosemore > > gcc/ > * expr.h (store_bit_field): Add packedp parameter to prototype. > * expmed.c (store_bit_field, store_bit_field_1): Add packedp > parameter. Adjust all callers. > (warn_misaligned_bitfield): New function, split from > extract_fixed_bit_field. > (store_fixed_bit_field): Add packedp parameter. Fix wrong-code > behavior for the combination of misaligned bitfield and > -fstrict-volatile-bitfields. Use warn_misaligned_bitfield. > (extract_fixed_bit_field): Use warn_misaligned_bitfield. > * expr.c: Adjust calls to store_bit_field. > (expand_assignment): Identify accesses to packed structures. > (store_field): Add packedp parameter. Adjust callers. > * calls.c: Adjust calls to store_bit_field. > * ifcvt.c: Likewise. > * config/s390/s390.c: Likewise. > > gcc/testsuite/ > * gcc.target/arm/volatile-bitfields-5.c: New test case. >
Re: [PATCH] Fix some leaks and one uninitialized var read
On Tue, Aug 21, 2012 at 10:33 PM, Jakub Jelinek wrote: > Hi! > > The recent change in find_assert_locations from XCNEWVEC to XNEWVEC > caused a valgrind warning, because bb_rpo[ENTRY_BLOCK] used to > be accessed, but was never initialized. > > Fixed by ignoring edges from ENTRY_BLOCK altogether. > > The rest are a couple of memory leak fixes. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. > 2012-08-21 Jakub Jelinek > > * tree-vrp.c (find_assert_locations): Skip also edges > from the entry block. > > * tree-vect-loop-manip.c (slpeel_make_loop_iterate_ntimes): Call > free_stmt_vec_info on orig_cond after gsi_removing it. > * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Always > free body_cost_vec vector. > (vect_analyze_data_refs): If gather is unsuccessful, > free_data_ref (dr). > * tree-inline.c (tree_function_versioning): Free > old_transforms_to_apply vector. > > --- gcc/tree-vrp.c.jj 2012-08-20 20:56:01.0 +0200 > +++ gcc/tree-vrp.c 2012-08-21 12:15:32.501753048 +0200 > @@ -5596,7 +5596,7 @@ find_assert_locations (void) > FOR_EACH_EDGE (e, ei, bb->preds) > { > int pred = e->src->index; > - if (e->flags & EDGE_DFS_BACK) > + if ((e->flags & EDGE_DFS_BACK) || pred == ENTRY_BLOCK) > continue; > > if (!live[pred]) > --- gcc/tree-vect-loop-manip.c.jj 2012-08-15 10:55:24.0 +0200 > +++ gcc/tree-vect-loop-manip.c 2012-08-21 15:01:02.600750196 +0200 > @@ -788,6 +788,7 @@ slpeel_make_loop_iterate_ntimes (struct > >/* Remove old loop exit test: */ >gsi_remove (&loop_cond_gsi, true); > + free_stmt_vec_info (orig_cond); > >loop_loc = find_loop_location (loop); >if (dump_file && (dump_flags & TDF_DETAILS)) > --- gcc/tree-vect-data-refs.c.jj2012-08-20 11:09:45.0 +0200 > +++ gcc/tree-vect-data-refs.c 2012-08-21 16:32:13.631428796 +0200 > @@ -1934,10 +1934,9 @@ vect_enhance_data_refs_alignment (loop_v > gcc_assert (stat); >return stat; > } > - else > - VEC_free (stmt_info_for_cost, heap, body_cost_vec); > } > > + VEC_free (stmt_info_for_cost, heap, body_cost_vec); > >/* (2) Versioning to force alignment. */ > > @@ -3313,6 +3312,8 @@ vect_analyze_data_refs (loop_vec_info lo > gather = false; > if (!gather) > { > + STMT_VINFO_DATA_REF (stmt_info) = NULL; > + free_data_ref (dr); > if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) > { > fprintf (vect_dump, > --- gcc/tree-inline.c.jj2012-08-15 10:55:33.0 +0200 > +++ gcc/tree-inline.c 2012-08-21 17:28:24.181069515 +0200 > @@ -5089,6 +5089,7 @@ tree_function_versioning (tree old_decl, >VEC_index (ipa_opt_pass, > old_transforms_to_apply, > i)); > + VEC_free (ipa_opt_pass, heap, old_transforms_to_apply); > } > >id.copy_decl = copy_decl_no_change; > > Jakub
Re: [PATCH] Fix some leaks and one uninitialized var read
On Tue, Aug 21, 2012 at 10:33 PM, Jakub Jelinek wrote: > Hi! > > The recent change in find_assert_locations from XCNEWVEC to XNEWVEC > caused a valgrind warning, because bb_rpo[ENTRY_BLOCK] used to > be accessed, but was never initialized. Sorry for this breakage. I looked at each of the changes carefully to convince myself that XNEWVEC instead of XCNEWVEC wouldn't introduce this kind of problem, but I overlooked that pre_and_rev_post_order_compute is called with include_entry_exit==false. Ciao! Steven
Re: [Patch, Fortran, committed] Free loop and gfc_ss data
Hello, On 22/08/2012 07:56, Tobias Burnus wrote: > Committed as Rev. 190586 after successful regtesting. > > That's the version I also had attached to > http://gcc.gnu.org/ml/fortran/2012-08/msg00118.html; as written there: I have one minor comment about it. See below. > > "The patch is incomplete, e.g. "argss" of gfc_conv_procedure_call is not > (or not always) freed. Ditto for rss of gfc_trans_assignment_1; >From glancing at the code, I don't see why rss is not freed by gfc_cleanup_loop for the latter. > Index: gcc/fortran/trans-expr.c > === > --- gcc/fortran/trans-expr.c (Revision 190585) > +++ gcc/fortran/trans-expr.c (Arbeitskopie) > @@ -6770,6 +6771,7 @@ gfc_trans_arrayfunc_assign (gfc_expr * expr1, gfc_ >if (!expr2->value.function.isym) > { > realloc_lhs_loop_for_fcn_call (&se, &expr1->where, &ss, &loop); > + gfc_cleanup_loop (&loop); > ss->is_alloc_lhs = 1; > } >else This takes care of freeing ss along the way, but ss should be freed outside of the conditional too. Mikael
[PATCH] Fix another leak and one uninitialized var read
Hi! Another leak and another uninitialized var. Testing in progress, ok for trunk if it passes? 2012-08-22 Jakub Jelinek * tree-vect-loop.c (vect_transform_loop): Initialize check_profitability to false. * tree-predcom.c (try_combine_chains): Free the worklist vector at the end. --- gcc/tree-vect-loop.c.jj 2012-08-21 22:48:30.0 +0200 +++ gcc/tree-vect-loop.c2012-08-22 11:35:54.901413507 +0200 @@ -5277,7 +5277,7 @@ vect_transform_loop (loop_vec_info loop_ gimple_seq pattern_def_seq = NULL; gimple_stmt_iterator pattern_def_si = gsi_none (); bool transform_pattern_stmt = false; - bool check_profitability; + bool check_profitability = false; int th; if (vect_print_dump_info (REPORT_DETAILS)) --- gcc/tree-predcom.c.jj 2012-08-10 12:57:38.0 +0200 +++ gcc/tree-predcom.c 2012-08-22 11:21:23.314423304 +0200 @@ -2331,6 +2331,8 @@ try_combine_chains (VEC (chain_p, heap) } } } + + VEC_free (chain_p, heap, worklist); } /* Prepare initializers for CHAIN in LOOP. Returns false if this is Jakub
[patch] alias.c: propagate information in RPO order on the CFG
Hello, This both speeds up and improves RTL alias analysis by propagating the alias chains information in a visit in topological order of the CFG. Perhaps unsurprisingly, this is motivated again by PR middle-end/54146, where I noticed that the loop in init_alias_analysis terminated because the iteration count was greater than MAX_ALIAS_LOOP_PASSES (==10), even though the maximum loop depth in the test case is only 3 so that the minimum number of iterations required to fully propagate all information is only 5. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. OK if it passes? (BTW the diff is with -bw because the inner loop needed re-indenting, but there are no functional change in it.) Ciao! Steven * alias.c (MAX_ALIAS_LOOP_PASSES): Update comment with rationale, or rather a lack thereof. (init_alias_analysis): Propagate the latest information across the CFG in topological order to propagate as far as possible in each iteration. Ignore debug insns. Index: alias.c === --- alias.c (revision 190588) +++ alias.c (working copy) @@ -168,7 +168,10 @@ static void memory_modified_1 (rtx, cons #define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X))) /* Cap the number of passes we make over the insns propagating alias - information through set chains. 10 is a completely arbitrary choice. */ + information through set chains. + ??? 10 is a completely arbitrary choice. This should be based on the + maximum loop depth in the CFG, but we do not have this information + available (even if current_loops _is_ available). */ #define MAX_ALIAS_LOOP_PASSES 10 /* reg_base_value[N] gives an address to which register N is related. @@ -2764,6 +2767,8 @@ init_alias_analysis (void) int i; unsigned int ui; rtx insn, val; + int rpo_cnt; + int *rpo; timevar_push (TV_ALIAS_ANALYSIS); @@ -2786,6 +2791,9 @@ init_alias_analysis (void) "constant" information from the previous pass to propagate alias information through another level of assignments. + The propagation is done on the CFG in reverse post-order, to + propagate things forward as far as possible in each iteration. + This could get expensive if the assignment chains are long. Maybe we should throttle the number of iterations, possibly based on the optimization level or flag_expensive_optimizations. @@ -2801,6 +2809,9 @@ init_alias_analysis (void) The state of the arrays for the set chain in question does not matter since the program has undefined behavior. */ + rpo = XNEWVEC (int, n_basic_blocks); + rpo_cnt = pre_and_rev_post_order_compute (NULL, rpo, false); + pass = 0; do { @@ -2833,9 +2844,12 @@ init_alias_analysis (void) FIRST_PSEUDO_REGISTER * sizeof (rtx)); /* Walk the insns adding values to the new_reg_base_value array. */ - for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) + for (i = 0; i < rpo_cnt; i++) { - if (INSN_P (insn)) + basic_block bb = BASIC_BLOCK (rpo[i]); + FOR_BB_INSNS (bb, insn) + { + if (NONDEBUG_INSN_P (insn)) { rtx note, set; @@ -2924,7 +2938,9 @@ init_alias_analysis (void) } } } +} while (changed && ++pass < MAX_ALIAS_LOOP_PASSES); + XDELETEVEC (rpo); /* Fill in the remaining entries. */ FOR_EACH_VEC_ELT (rtx, reg_known_value, i, val)
Re: [PATCH] Fix another leak and one uninitialized var read
On Wed, Aug 22, 2012 at 11:38 AM, Jakub Jelinek wrote: > Hi! > > Another leak and another uninitialized var. > Testing in progress, ok for trunk if it passes? Ok. Thanks, Richard. > 2012-08-22 Jakub Jelinek > > * tree-vect-loop.c (vect_transform_loop): Initialize > check_profitability to false. > > * tree-predcom.c (try_combine_chains): Free the worklist vector > at the end. > > --- gcc/tree-vect-loop.c.jj 2012-08-21 22:48:30.0 +0200 > +++ gcc/tree-vect-loop.c2012-08-22 11:35:54.901413507 +0200 > @@ -5277,7 +5277,7 @@ vect_transform_loop (loop_vec_info loop_ >gimple_seq pattern_def_seq = NULL; >gimple_stmt_iterator pattern_def_si = gsi_none (); >bool transform_pattern_stmt = false; > - bool check_profitability; > + bool check_profitability = false; >int th; > >if (vect_print_dump_info (REPORT_DETAILS)) > --- gcc/tree-predcom.c.jj 2012-08-10 12:57:38.0 +0200 > +++ gcc/tree-predcom.c 2012-08-22 11:21:23.314423304 +0200 > @@ -2331,6 +2331,8 @@ try_combine_chains (VEC (chain_p, heap) > } > } > } > + > + VEC_free (chain_p, heap, worklist); > } > > /* Prepare initializers for CHAIN in LOOP. Returns false if this is > > Jakub
Re: [Patch ARM] Update the test case to differ movs and lsrs for ARM mode and non-ARM mode
On 22/08/12 02:20, Terry Guo wrote: > Hi, > > Due to the impact of ARM UAL, the Thumb1 and Thumb2 mode use LSRS > instruction while the ARM mode uses MOVS instruction. So the following case > is updated accordingly. Is it OK to trunk? > > BR, > Terry > > 2012-08-21 Terry Guo > > * gcc.target/arm/combine-movs.c: Check movs for ARM mode > and lsrs for other mode. > This can't be right. Thumb1 doesn't use unified syntax. R. > diff --git a/gcc/testsuite/gcc.target/arm/combine-movs.c > b/gcc/testsuite/gcc.target/arm/combine-movs.c > index 4209a33..fbef9df 100644 > --- a/gcc/testsuite/gcc.target/arm/combine-movs.c > +++ b/gcc/testsuite/gcc.target/arm/combine-movs.c > @@ -1,5 +1,4 @@ > /* { dg-do compile } */ > -/* { dg-skip-if "" { arm_thumb1 } } */ > /* { dg-options "-O" } */ > > void foo (unsigned long r[], unsigned int d) > @@ -9,4 +8,5 @@ void foo (unsigned long r[], unsigned int d) > r[i] = 0; > } > > -/* { dg-final { scan-assembler "movs\tr\[0-9\]" } } */ > +/* { dg-final { scan-assembler "movs\tr\[0-9\]" { target arm_nothumb } } } > */ > +/* { dg-final { scan-assembler "lsrs\tr\[0-9\]" { target { ! arm_nothumb } > } } } */ > > > >
Re: [patch] alias.c: propagate information in RPO order on the CFG
On Wed, Aug 22, 2012 at 11:53 AM, Steven Bosscher wrote: > Hello, > > This both speeds up and improves RTL alias analysis by propagating the > alias chains information in a visit in topological order of the CFG. > > Perhaps unsurprisingly, this is motivated again by PR > middle-end/54146, where I noticed that the loop in init_alias_analysis > terminated because the iteration count was greater than > MAX_ALIAS_LOOP_PASSES (==10), even though the maximum loop depth in > the test case is only 3 so that the minimum number of iterations > required to fully propagate all information is only 5. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. OK if it > passes? > (BTW the diff is with -bw because the inner loop needed re-indenting, > but there are no functional change in it.) Ok. Thanks, Richard. > Ciao! > Steven > > * alias.c (MAX_ALIAS_LOOP_PASSES): Update comment with rationale, > or rather a lack thereof. > (init_alias_analysis): Propagate the latest information across > the CFG in topological order to propagate as far as possible in > each iteration. Ignore debug insns. > > Index: alias.c > === > --- alias.c (revision 190588) > +++ alias.c (working copy) > @@ -168,7 +168,10 @@ static void memory_modified_1 (rtx, cons > #define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X))) > > /* Cap the number of passes we make over the insns propagating alias > - information through set chains. 10 is a completely arbitrary choice. */ > + information through set chains. > + ??? 10 is a completely arbitrary choice. This should be based on the > + maximum loop depth in the CFG, but we do not have this information > + available (even if current_loops _is_ available). */ > #define MAX_ALIAS_LOOP_PASSES 10 > > /* reg_base_value[N] gives an address to which register N is related. > @@ -2764,6 +2767,8 @@ init_alias_analysis (void) >int i; >unsigned int ui; >rtx insn, val; > + int rpo_cnt; > + int *rpo; > >timevar_push (TV_ALIAS_ANALYSIS); > > @@ -2786,6 +2791,9 @@ init_alias_analysis (void) > "constant" information from the previous pass to propagate alias > information through another level of assignments. > > + The propagation is done on the CFG in reverse post-order, to > + propagate things forward as far as possible in each iteration. > + > This could get expensive if the assignment chains are long. Maybe > we should throttle the number of iterations, possibly based on > the optimization level or flag_expensive_optimizations. > @@ -2801,6 +2809,9 @@ init_alias_analysis (void) > The state of the arrays for the set chain in question does not matter > since the program has undefined behavior. */ > > + rpo = XNEWVEC (int, n_basic_blocks); > + rpo_cnt = pre_and_rev_post_order_compute (NULL, rpo, false); > + >pass = 0; >do > { > @@ -2833,9 +2844,12 @@ init_alias_analysis (void) > FIRST_PSEUDO_REGISTER * sizeof (rtx)); > >/* Walk the insns adding values to the new_reg_base_value array. */ > - for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) > + for (i = 0; i < rpo_cnt; i++) > { > - if (INSN_P (insn)) > + basic_block bb = BASIC_BLOCK (rpo[i]); > + FOR_BB_INSNS (bb, insn) > + { > + if (NONDEBUG_INSN_P (insn)) > { > rtx note, set; > > @@ -2924,7 +2938,9 @@ init_alias_analysis (void) > } > } > } > +} >while (changed && ++pass < MAX_ALIAS_LOOP_PASSES); > + XDELETEVEC (rpo); > >/* Fill in the remaining entries. */ >FOR_EACH_VEC_ELT (rtx, reg_known_value, i, val)
RE: [Patch ARM] Update the test case to differ movs and lsrs for ARM mode and non-ARM mode
> > > > Due to the impact of ARM UAL, the Thumb1 and Thumb2 mode use LSRS > > instruction while the ARM mode uses MOVS instruction. So the > following case > > is updated accordingly. Is it OK to trunk? > > > > BR, > > Terry > > > > 2012-08-21 Terry Guo > > > > * gcc.target/arm/combine-movs.c: Check movs for ARM mode > > and lsrs for other mode. > > > > This can't be right. Thumb1 doesn't use unified syntax. > > R. > oops. You are right. Sorry for making such obvious mistake. Here is patch updated to distinguish ARM and Thumb2. Tested for Thumb1, Thumb2 and ARM modes. No regression. Is it OK? BR, Terry 2012-08-21 Terry Guo * gcc.target/arm/combine-movs.c: Check movs for ARM mode and lsrs for Thumb2 mode. diff --git a/gcc/testsuite/gcc.target/arm/combine-movs.c b/gcc/testsuite/gcc.target/arm/combine-movs.c index 4209a33..3e36033 100644 --- a/gcc/testsuite/gcc.target/arm/combine-movs.c +++ b/gcc/testsuite/gcc.target/arm/combine-movs.c @@ -9,4 +9,5 @@ void foo (unsigned long r[], unsigned int d) r[i] = 0; } -/* { dg-final { scan-assembler "movs\tr\[0-9\]" } } */ +/* { dg-final { scan-assembler "lsrs\tr\[0-9\]" { target arm_thumb2_ok } } } */ +/* { dg-final { scan-assembler "movs\tr\[0-9\]" { target { ! arm_thumb2_ok } } } } */
Fix double_int overflow in VRP PLUS_EXPR
Hello, when I adapted VRP PLUS_EXPR handling for __int128, I missed one place where double_int can overflow. Note that I have no idea if that helps for bug 54317, but that's where I noticed the issue. 2012-08-21 Marc Glisse PR tree-optimization/54317 gcc/ * tree-vrp.c (extract_range_from_binary_expr_1): Test for double_int overflow. Remove dead tests. gcc/testsuite/ * gcc.dg/tree-ssa/vrp79.c: New testcase. -- Marc GlisseIndex: testsuite/gcc.dg/tree-ssa/vrp79.c === --- testsuite/gcc.dg/tree-ssa/vrp79.c (revision 0) +++ testsuite/gcc.dg/tree-ssa/vrp79.c (revision 0) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#ifdef __SIZEOF_INT128__ +typedef unsigned __int128 NT; +#else +typedef unsigned long long NT; +#endif + +extern void do_not_go_away (); + +void f (NT x, NT y) +{ + NT n = 1; + n <<= (8 * sizeof (NT) - 1); + if (x > n) return; + if (y > n) return; + NT z = x + y; + if (z == 42) do_not_go_away (); +} + +/* { dg-final { scan-tree-dump "do_not_go_away" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/gcc.dg/tree-ssa/vrp79.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: tree-vrp.c === --- tree-vrp.c (revision 190590) +++ tree-vrp.c (working copy) @@ -2467,32 +2467,35 @@ extract_range_from_binary_expr_1 (value_ && max_ovf == 1) { /* Underflow and overflow, drop to VR_VARYING. */ set_value_range_to_varying (vr); return; } else { /* Min underflow or max overflow. The range kind changes to VR_ANTI_RANGE. */ + bool covers = false; double_int tem = tmin; gcc_assert ((min_ovf == -1 && max_ovf == 0) || (max_ovf == 1 && min_ovf == 0)); type = VR_ANTI_RANGE; tmin = double_int_add (tmax, double_int_one); + if (double_int_cmp (tmin, tmax, uns) < 0) + covers = true; tmax = double_int_add (tem, double_int_minus_one); + if (double_int_cmp (tmax, tem, uns) > 0) + covers = true; /* If the anti-range would cover nothing, drop to varying. Likewise if the anti-range bounds are outside of the types values. */ - if (double_int_cmp (tmin, tmax, uns) > 0 - || double_int_cmp (tmin, type_min, uns) < 0 - || double_int_cmp (tmax, type_max, uns) > 0) + if (covers || double_int_cmp (tmin, tmax, uns) > 0) { set_value_range_to_varying (vr); return; } min = double_int_to_tree (expr_type, tmin); max = double_int_to_tree (expr_type, tmax); } } else {
[PATCH] Fix half of PR46590, limit alias queries in SCCVN
This finally limits the number of alias queries we do when looking for redundant loads/stores from inside SCCVN (thus FRE and PRE). Previously the number was bound roughly by O(n_loads * n_stores) while now it is bound by O(n_loads) with the constant factor --param sccvn-max-alias-queries-per-access. The patch should not make any difference for any real-world testcase but SCCVN compile-time for half of the artificial testcase in PR46590 goes down from 230s to 13s. I'm not sure limiting the walks more is necessary (we hit this issue only for machine-generated code like this - where other passes have similar issues, like loop header copying for this particular case) Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-08-22 Richard Guenther PR tree-optimization/46590 * tree-ssa-alias.h (get_continuation_for_phi): Add alias query counter output argument. (walk_non_aliased_vuses): Add alias query counter argument to the walker callback. * tree-ssa-alias.c (maybe_skip_until): Add alias query counter output argument and count alias queries. (get_continuation_for_phi_1): Likewise. (get_continuation_for_phi): Likewise. (walk_non_aliased_vuses): Add alias query counter argument to the walker callback and allow it to abort the walk by returning -1. * tree-ssa-pre.c (translate_vuse_through_block): Adjust. * tree-ssa-sccvn.c (vn_reference_lookup_2): Add alias query counter parmeter, abort walk if that is bigger than --param sccvn-max-alias-queries-per-access. * params.def (sccvn-max-alias-queries-per-access): New param. * doc/invoke.texi (sccvn-max-alias-queries-per-access): Document. Index: gcc/tree-ssa-alias.c === *** gcc/tree-ssa-alias.c.orig 2012-08-21 16:46:21.0 +0200 --- gcc/tree-ssa-alias.c2012-08-22 13:38:45.502016867 +0200 *** stmt_kills_ref_p (gimple stmt, tree ref) *** 1929,1935 static bool maybe_skip_until (gimple phi, tree target, ao_ref *ref, ! tree vuse, bitmap *visited) { basic_block bb = gimple_bb (phi); --- 1929,1935 static bool maybe_skip_until (gimple phi, tree target, ao_ref *ref, ! tree vuse, unsigned int *cnt, bitmap *visited) { basic_block bb = gimple_bb (phi); *** maybe_skip_until (gimple phi, tree targe *** 1948,1962 /* An already visited PHI node ends the walk successfully. */ if (bitmap_bit_p (*visited, SSA_NAME_VERSION (PHI_RESULT (def_stmt return true; ! vuse = get_continuation_for_phi (def_stmt, ref, visited); if (!vuse) return false; continue; } ! /* A clobbering statement or the end of the IL ends it failing. */ ! else if (gimple_nop_p (def_stmt) ! || stmt_may_clobber_ref_p_1 (def_stmt, ref)) return false; /* If we reach a new basic-block see if we already skipped it in a previous walk that ended successfully. */ if (gimple_bb (def_stmt) != bb) --- 1948,1967 /* An already visited PHI node ends the walk successfully. */ if (bitmap_bit_p (*visited, SSA_NAME_VERSION (PHI_RESULT (def_stmt return true; ! vuse = get_continuation_for_phi (def_stmt, ref, cnt, visited); if (!vuse) return false; continue; } ! else if (gimple_nop_p (def_stmt)) return false; + else + { + /* A clobbering statement or the end of the IL ends it failing. */ + ++*cnt; + if (stmt_may_clobber_ref_p_1 (def_stmt, ref)) + return false; + } /* If we reach a new basic-block see if we already skipped it in a previous walk that ended successfully. */ if (gimple_bb (def_stmt) != bb) *** maybe_skip_until (gimple phi, tree targe *** 1976,1982 static tree get_continuation_for_phi_1 (gimple phi, tree arg0, tree arg1, ! ao_ref *ref, bitmap *visited) { gimple def0 = SSA_NAME_DEF_STMT (arg0); gimple def1 = SSA_NAME_DEF_STMT (arg1); --- 1981,1987 static tree get_continuation_for_phi_1 (gimple phi, tree arg0, tree arg1, ! ao_ref *ref, unsigned int *cnt, bitmap *visited) { gimple def0 = SSA_NAME_DEF_STMT (arg0); gimple def1 = SSA_NAME_DEF_STMT (arg1); *** get_continuation_for_phi_1 (gimple phi, *** 1989,2002 && dominated_by_p (CDI_DOMINATORS, gimple_bb (def1), gimple_bb (def0 { ! if (maybe_skip_until (phi, arg0, ref, arg1, visited)) return arg0; } else if (gimple_nop_p (def1) || dominated_by_p (CDI_DOMINATORS,
Re: Fix double_int overflow in VRP PLUS_EXPR
On Wed, Aug 22, 2012 at 1:55 PM, Marc Glisse wrote: > Hello, > > when I adapted VRP PLUS_EXPR handling for __int128, I missed one place where > double_int can overflow. Note that I have no idea if that helps for bug > 54317, but that's where I noticed the issue. Ok. Thanks, Richard. > 2012-08-21 Marc Glisse > > PR tree-optimization/54317 > > gcc/ > * tree-vrp.c (extract_range_from_binary_expr_1): Test for > double_int overflow. > Remove dead tests. > > gcc/testsuite/ > * gcc.dg/tree-ssa/vrp79.c: New testcase. > > -- > Marc Glisse > Index: testsuite/gcc.dg/tree-ssa/vrp79.c > === > --- testsuite/gcc.dg/tree-ssa/vrp79.c (revision 0) > +++ testsuite/gcc.dg/tree-ssa/vrp79.c (revision 0) > @@ -0,0 +1,23 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > + > +#ifdef __SIZEOF_INT128__ > +typedef unsigned __int128 NT; > +#else > +typedef unsigned long long NT; > +#endif > + > +extern void do_not_go_away (); > + > +void f (NT x, NT y) > +{ > + NT n = 1; > + n <<= (8 * sizeof (NT) - 1); > + if (x > n) return; > + if (y > n) return; > + NT z = x + y; > + if (z == 42) do_not_go_away (); > +} > + > +/* { dg-final { scan-tree-dump "do_not_go_away" "optimized" } } */ > +/* { dg-final { cleanup-tree-dump "optimized" } } */ > > Property changes on: testsuite/gcc.dg/tree-ssa/vrp79.c > ___ > Added: svn:keywords >+ Author Date Id Revision URL > Added: svn:eol-style >+ native > > Index: tree-vrp.c > === > --- tree-vrp.c (revision 190590) > +++ tree-vrp.c (working copy) > @@ -2467,32 +2467,35 @@ extract_range_from_binary_expr_1 (value_ >&& max_ovf == 1) > { > /* Underflow and overflow, drop to VR_VARYING. */ > set_value_range_to_varying (vr); > return; > } > else > { > /* Min underflow or max overflow. The range kind > changes to VR_ANTI_RANGE. */ > + bool covers = false; > double_int tem = tmin; > gcc_assert ((min_ovf == -1 && max_ovf == 0) > || (max_ovf == 1 && min_ovf == 0)); > type = VR_ANTI_RANGE; > tmin = double_int_add (tmax, double_int_one); > + if (double_int_cmp (tmin, tmax, uns) < 0) > + covers = true; > tmax = double_int_add (tem, double_int_minus_one); > + if (double_int_cmp (tmax, tem, uns) > 0) > + covers = true; > /* If the anti-range would cover nothing, drop to varying. > Likewise if the anti-range bounds are outside of the > types values. */ > - if (double_int_cmp (tmin, tmax, uns) > 0 > - || double_int_cmp (tmin, type_min, uns) < 0 > - || double_int_cmp (tmax, type_max, uns) > 0) > + if (covers || double_int_cmp (tmin, tmax, uns) > 0) > { > set_value_range_to_varying (vr); > return; > } > min = double_int_to_tree (expr_type, tmin); > max = double_int_to_tree (expr_type, tmax); > } > } > else > { >
Re: Fix double_int overflow in VRP PLUS_EXPR
On Wed, Aug 22, 2012 at 02:19:19PM +0200, Richard Guenther wrote: > > 2012-08-21 Marc Glisse > > > > PR tree-optimization/54317 > > > > gcc/ > > * tree-vrp.c (extract_range_from_binary_expr_1): Test for > > double_int overflow. > > Remove dead tests. > > > > gcc/testsuite/ > > * gcc.dg/tree-ssa/vrp79.c: New testcase. > > > > -- > > + n <<= (8 * sizeof (NT) - 1); Better use __CHAR_BIT__ instead of 8 here... Jakub
Re: Fix double_int overflow in VRP PLUS_EXPR
On Wed, 22 Aug 2012, Jakub Jelinek wrote: 2012-08-21 Marc Glisse + n <<= (8 * sizeof (NT) - 1); Better use __CHAR_BIT__ instead of 8 here... Ok, I committed the __CHAR_BIT__ version (I just compiled that one file manually with old and new compilers, I didn't restart the testsuite, I hope the manual test is enough to detect any typo that could break things). -- Marc Glisse
Re: Fix double_int overflow in VRP PLUS_EXPR
On Wed, Aug 22, 2012 at 02:34:01PM +0200, Marc Glisse wrote: > On Wed, 22 Aug 2012, Jakub Jelinek wrote: > > >>>2012-08-21 Marc Glisse > >>>+ n <<= (8 * sizeof (NT) - 1); > > > >Better use __CHAR_BIT__ instead of 8 here... > > Ok, I committed the __CHAR_BIT__ version (I just compiled that one > file manually with old and new compilers, I didn't restart the > testsuite, I hope the manual test is enough to detect any typo that > could break things). Probably. You could have used: make check-gcc RUNTESTFLAGS=tree-ssa.exp=vrp79.c to test just that single testcase (with both old and new compilers). Jakub
[C++ patch] PR 20420
Hi, yesterday I had a look to this old PR and noticed that we are almost doing already the right thing for the original testcase: we are for classes, but we ICE (something recent) for enums. The latter issue seems easy to fix: handle specially enums at the beginning of supplement_binding_1 only when the comment says, that is in templates (otherwise we hit an assert in dependent_type_p). Then, Comment #4 in the audit trail added a case of duplicate using declaration which we should also reject, involving VAR_DECLs at function scope (C++11 7.3.3/10). Seems also easy to fix, by checking the return value of validate_nonmember_using_decl at the beginning of do_local_using_decl and erroring out if it's null. Tested x86_64-linux. Thanks, Paolo. /// /cp 2012-08-22 Paolo Carlini PR c++/20420 * name-lookup.c (supplement_binding_1): Handle specially enums only in class templates. (do_local_using_decl): Enforce 7.3.3/10 about duplicate using declarations at function scope. /testsuite 2012-08-22 Paolo Carlini PR c++/20420 * g++.dg/lookup/using53.C: New. Index: testsuite/g++.dg/lookup/using53.C === --- testsuite/g++.dg/lookup/using53.C (revision 0) +++ testsuite/g++.dg/lookup/using53.C (revision 0) @@ -0,0 +1,31 @@ +// PR c++/20420 + +class B +{ +protected: + enum E { E1, E2, E3 }; + struct S { int i; E e; }; +}; + +class D : private B +{ +public: + using B::E; // { dg-message "previous" } + using B::S; // { dg-message "previous" } + +private: + enum E {};// { dg-error "conflicts" } + struct S {}; // { dg-error "conflicts" } +}; + +namespace N +{ + int i; +} + +void +f () +{ + using N::i; + using N::i; // { dg-error "declared" } +} Index: cp/name-lookup.c === --- cp/name-lookup.c(revision 190590) +++ cp/name-lookup.c(working copy) @@ -441,7 +441,8 @@ supplement_binding_1 (cxx_binding *binding, tree d template in order to handle late matching of underlying type on an opaque-enum-declaration followed by an enum-specifier. */ - || (TREE_CODE (TREE_TYPE (target_decl)) == ENUMERAL_TYPE + || (processing_template_decl + && TREE_CODE (TREE_TYPE (target_decl)) == ENUMERAL_TYPE && TREE_CODE (TREE_TYPE (target_bval)) == ENUMERAL_TYPE && (dependent_type_p (ENUM_UNDERLYING_TYPE (TREE_TYPE (target_decl))) @@ -2581,7 +2582,12 @@ do_local_using_decl (tree decl, tree scope, tree n decl = validate_nonmember_using_decl (decl, scope, name); if (decl == NULL_TREE) -return; +{ + /* C++11 7.3.3/10. */ + if (TREE_CODE (orig_decl) == VAR_DECL) + error ("%qD is already declared in this scope", name); + return; +} if (building_stmt_list_p () && at_function_scope_p ())
Re: Fix double_int overflow in VRP PLUS_EXPR
On Wed, 22 Aug 2012, Jakub Jelinek wrote: On Wed, Aug 22, 2012 at 02:34:01PM +0200, Marc Glisse wrote: Ok, I committed the __CHAR_BIT__ version (I just compiled that one file manually with old and new compilers, I didn't restart the testsuite, I hope the manual test is enough to detect any typo that could break things). Probably. You could have used: make check-gcc RUNTESTFLAGS=tree-ssa.exp=vrp79.c to test just that single testcase (with both old and new compilers). Thanks, I keep forgetting the details of that syntax. I just ran it for peace of mind, and it was happy. -- Marc Glisse
Re: Fix Solaris 9/x86 bootstrap
Richard Guenther writes: > Doesn't that belong in system.h instead? And removed from rtl.h? Fine with me. The following patch does just that, moving the #undefs as far down in system.h as seemed reasonable. Bootstrapped without regressions on i386-pc-solaris2.10 and x86_64-unknown-linux-gnu; i386-pc-solaris2.9 build just started running the testsuite. Ok for mainline? Rainer 2012-08-22 Rainer Orth * rtl.h (FFS, FLOAT, ABS, PC): Don't undef. * system.h (FFS, FLOAT, ABS, PC): Undef. # HG changeset patch # Parent 4cd1f4bc7b3c2901830dde3d7ace137718eae0e4 Fix Solaris 9/x86 bootstrap * rtl.h (FFS, FLOAT, ABS, PC): Don't undef. * system.h (FFS, FLOAT, ABS, PC): Undef. diff --git a/gcc/rtl.h b/gcc/rtl.h --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -33,11 +33,6 @@ along with GCC; see the file COPYING3. #include "hashtab.h" #include "flags.h" -#undef FFS /* Some systems predefine this symbol; don't let it interfere. */ -#undef FLOAT /* Likewise. */ -#undef ABS /* Likewise. */ -#undef PC /* Likewise. */ - /* Value used by some passes to "recognize" noop moves as valid instructions. */ #define NOOP_MOVE_INSN_CODE INT_MAX diff --git a/gcc/system.h b/gcc/system.h --- a/gcc/system.h +++ b/gcc/system.h @@ -638,6 +638,11 @@ extern int vsnprintf(char *, size_t, con /* Get libiberty declarations. */ #include "libiberty.h" +#undef FFS /* Some systems predefine this symbol; don't let it interfere. */ +#undef FLOAT /* Likewise. */ +#undef ABS /* Likewise. */ +#undef PC /* Likewise. */ + /* Provide a default for the HOST_BIT_BUCKET. This suffices for POSIX-like hosts. */ -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH][RFC] Virtual operands in loop-closed SSA form
While we should already be in loop-closed SSA form for virtual operands most of the time (because we have a virtual use at the return statement) and loop-closed SSA form for virtuals is cheap (we only have a single virtual operand now) the following makes sure that a loop-closed PHI node for virtuals does exist. Nobody makes use of the fact but ISTR code that has code explicitely dealing with the situation that virtuals are _not_ in loop-closed SSA form. Testing pending. Richard. 2012-08-22 Richard Guenther * tree-ssa-loop-manip.c (add_exit_phis_var): Allow virtual operands. (find_uses_to_rename_use): Likewise. (find_uses_to_rename_bb): Likewise. (find_uses_to_rename_stmt): Walk over all operands. Index: gcc/tree-ssa-loop-manip.c === --- gcc/tree-ssa-loop-manip.c (revision 190590) +++ gcc/tree-ssa-loop-manip.c (working copy) @@ -303,8 +303,7 @@ add_exit_phis_var (tree var, bitmap use_ basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (var)); bitmap live_exits = BITMAP_ALLOC (&loop_renamer_obstack); - gcc_checking_assert (! virtual_operand_p (var)); - gcc_assert (! bitmap_bit_p (use_blocks, def_bb->index)); + gcc_checking_assert (! bitmap_bit_p (use_blocks, def_bb->index)); compute_live_loop_exits (live_exits, use_blocks, loop_exits, def_bb); @@ -367,10 +366,6 @@ find_uses_to_rename_use (basic_block bb, if (TREE_CODE (use) != SSA_NAME) return; - /* We don't need to keep virtual operands in loop-closed form. */ - if (virtual_operand_p (use)) -return; - ver = SSA_NAME_VERSION (use); def_bb = gimple_bb (SSA_NAME_DEF_STMT (use)); if (!def_bb) @@ -408,7 +403,7 @@ find_uses_to_rename_stmt (gimple stmt, b if (is_gimple_debug (stmt)) return; - FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_USE) + FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_USES) find_uses_to_rename_use (bb, var, use_blocks, need_phis); } @@ -428,9 +423,8 @@ find_uses_to_rename_bb (basic_block bb, for (bsi = gsi_start_phis (e->dest); !gsi_end_p (bsi); gsi_next (&bsi)) { gimple phi = gsi_stmt (bsi); - if (! virtual_operand_p (gimple_phi_result (phi))) - find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e), - use_blocks, need_phis); + find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e), +use_blocks, need_phis); } for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
Re: [PATCH] Set current_function_decl in {push,pop}_cfun and push_struct_function
Hi, On Tue, Aug 21, 2012 at 01:30:47PM +0200, Richard Guenther wrote: > On Tue, Aug 21, 2012 at 1:27 PM, Martin Jambor wrote: > > On Wed, Aug 15, 2012 at 05:21:04PM +0200, Martin Jambor wrote: > >> Hi, > >> > >> On Fri, Aug 10, 2012 at 04:57:41PM +0200, Eric Botcazou wrote: > >> > > - ada/gcc-interface/utils.c:rest_of_subprog_body_compilation calls > >> > > dump_function which in turns calls dump_function_to_file which calls > >> > > push_cfun. But Ada front end has its idea of the > >> > > current_function_decl and there is no cfun which is an inconsistency > >> > > which makes push_cfun assert fail. I "solved" it by temporarily > >> > > setting current_function_decl to NULL_TREE. It's just dumping and I > >> > > thought that dump_function should be considered middle-end and thus > >> > > middle-end invariants should apply. > >> > > >> > If you think that calling dump_function from > >> > rest_of_subprog_body_compilation > >> > is a layering violation, I don't have a problem with replacing it with a > >> > more > >> > "manual" scheme like the one in c-family/c-gimplify.c:c_genericize, > >> > provided > >> > that this yields roughly the same output. > >> > >> Richi suggested on IRC that I remove the push/pop_cfun calls from > >> dump_function_to_file. The only problem seems to be > >> dump_histograms_for_stmt > > > > Yesterday I actually tried and it is not the only problem. Another > > one is dump_function_to_file->dump_bb->maybe_hot_bb_p which uses cfun > > to read profile_status. There may be others, this one just blew up > > first when I set cfun to NULL. And in future someone is quite likely > > to need cfun to dump something new too. > > > > At the same time, re-implementing dumping > > c-family/c-gimplify.c:c_genericize when dump_function suffices seems > > ugly to me. > > > > So I am going to declare dump_function a front-end interface and use > > set_cfun in my original patch in dump_function_to_file like we do in > > other such functions. > > > > I hope that will be OK. Thanks, > > Setting cfun has side-effects of switching target stuff which might have > code-generation side-effects because of implementation issues we have > with target/optimize attributes. So I don't think cfun should be changed > just for dumping. > > Can you instead just set current_function_decl and access > struct function via DECL_STRUCT_FUNCTION in the dumpers then? > After all, it it is a front-end interface, the frontend way of saying > "this is the current function" is to set current_function_decl, not the > middle-end cfun. > Like the following? Tested and bootstrapped on x86_64-linux, it does help avoid the ada hunk in my previous patch. OK for trunk? Thanks, Martin 2012-08-21 Martin Jambor * predict.c (maybe_hot_frequency_p): New parameter fun. Use its decl instead of current_function_decl, use profile_status_for_function and ENTRY_BLOCK_PTR_FOR_FUNCTION with fun instead of their cfun variants. (maybe_hot_count_p): New parameter fun, use profile_status_for_function instead of its cfun_variant. (maybe_hot_bb_p): New parameter fun, checking-assert it, pass it to all callees. (maybe_hot_edge_p): Pass cfun to maybe_hot_count_p and maybe_hot_frequency_p. (probably_never_executed_bb_p): New parameter fun, use its decl instead of current_function_decl. (optimize_bb_for_size_p): Pass cfun to maybe_hot_bb_p. (rtl_profile_for_bb): Likewise. (compute_function_frequency): Pass cfun to maybe_hot_bb_p and probably_never_executed_bb_p. * tree-ssa-operands.c (ssa_operands_active): New operator fun. Use it instead of cfun. (update_stmt_operands): Pass cfun as an argument of ssa_operands_active. (swap_tree_operands): Likewise. * gimple-iterator.c (update_modified_stmt): Likewise. (update_modified_stmts): Likewise. * tree-flow-inline.h (delink_stmt_imm_use): Likewise. * tree-ssa.c (delete_tree_ssa): Likewise. * bb-reorder.c (bb_to_key): Pass cfun to probably_never_executed_bb_p. (push_to_next_round_p): Likewise. (find_rarely_executed_basic_blocks_and_crossing_edges ): Likewise. * cfg.c: Inlude tree.h. (check_bb_profile): Use profile_status_for_function, EXIT_BLOCK_PTR_FOR_FUNCTION and ENTRY_BLOCK_PTR_FOR_FUNCTION with DECL_STRUCT_FUNCTION (current_function_decl) instead of their cfun variants. (dump_bb_info): Pass DECL_STRUCT_FUNCTION (current_function_decl) to maybe_hot_bb_p and probably_never_executed_bb_p. * gimple-pretty-print.c (gimple_dump_bb_buff): Checking-assert that DECL_STRUCT_FUNCTION (current_function_decl) is not NULL. Pass it to dump_histograms_for_stmt. (dump_gimple_mem_ops): Pass DECL_STRUCT_FUNCTION (current_function_decl) as an argument to dump_gimple_mem_ops.
[PATCH] Fix memory leak in Fortran FE with init_emit
Hi! init_function_start is these days supposed to be called only at the start of RTL expansion, it shouldn't be called much earlier, because then we leak e.g. the memory allocated by init_emit. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-08-22 Jakub Jelinek * trans-decl.c (trans_function_start, generate_coarray_init, create_main_function, gfc_generate_constructors): Call allocate_struct_function instead of init_function_start. --- gcc/fortran/trans-decl.c.jj 2012-08-13 14:36:51.0 +0200 +++ gcc/fortran/trans-decl.c2012-08-22 12:18:29.963953958 +0200 @@ -2265,7 +2265,8 @@ trans_function_start (gfc_symbol * sym) /* Create RTL for function definition. */ make_decl_rtl (fndecl); - init_function_start (fndecl); + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); + allocate_struct_function (fndecl, false); /* function.c requires a push at the start of the function. */ pushlevel (); @@ -4408,7 +4409,8 @@ generate_coarray_init (gfc_namespace * n rest_of_decl_compilation (fndecl, 0, 0); make_decl_rtl (fndecl); - init_function_start (fndecl); + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); + allocate_struct_function (fndecl, false); pushlevel (); gfc_init_block (&caf_init_block); @@ -4982,7 +4984,8 @@ create_main_function (tree fndecl) rest_of_decl_compilation (ftn_main, 1, 0); make_decl_rtl (ftn_main); - init_function_start (ftn_main); + gcc_checking_assert (DECL_STRUCT_FUNCTION (ftn_main) == NULL); + allocate_struct_function (ftn_main, false); pushlevel (); gfc_init_block (&body); @@ -5537,7 +5540,8 @@ gfc_generate_constructors (void) make_decl_rtl (fndecl); - init_function_start (fndecl); + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); + allocate_struct_function (fndecl, false); pushlevel (); Jakub
[PATCH] Some CH and into-SSA TLC
For PR46590 we spend a lot of time doing incremental SSA update. The following makes sure that time isn't increased by overeager asserts. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-08-22 Richard Guenther * tree-ssa-loop-ch.c (copy_loop_headers): Remove redundant checking. * tree-into-ssa.c (initialize_flags_in_bb): Use gcc_checking_assert instead of gcc_assert. (mark_block_for_update): Likewise. (add_new_name_mapping): Likewise. (mark_def_sites): Likewise. (insert_phi_nodes_for): Likewise. (rewrite_debug_stmt_uses): Likewise. (rewrite_stmt): Likewise. (maybe_register_def): Likewise. (rewrite_update_phi_arguments): Likewise. (rewrite_update_enter_block): Likewise. (mark_def_interesting): Likewise. (prepare_def_site_for): Likewise. (insert_updated_phi_nodes_for): Likewise. Index: gcc/tree-ssa-loop-ch.c === --- gcc/tree-ssa-loop-ch.c (revision 190590) +++ gcc/tree-ssa-loop-ch.c (working copy) @@ -143,10 +143,6 @@ copy_loop_headers (void) return 0; } -#ifdef ENABLE_CHECKING - verify_loop_structure (); -#endif - bbs = XNEWVEC (basic_block, n_basic_blocks); copied_bbs = XNEWVEC (basic_block, n_basic_blocks); bbs_size = n_basic_blocks; Index: gcc/tree-into-ssa.c === --- gcc/tree-into-ssa.c (revision 190590) +++ gcc/tree-into-ssa.c (working copy) @@ -426,7 +426,7 @@ initialize_flags_in_bb (basic_block bb) /* We are going to use the operand cache API, such as SET_USE, SET_DEF, and FOR_EACH_IMM_USE_FAST. The operand cache for each statement should be up-to-date. */ - gcc_assert (!gimple_modified_p (stmt)); + gcc_checking_assert (!gimple_modified_p (stmt)); set_rewrite_uses (stmt, false); set_register_defs (stmt, false); } @@ -437,7 +437,7 @@ initialize_flags_in_bb (basic_block bb) static void mark_block_for_update (basic_block bb) { - gcc_assert (blocks_to_update != NULL); + gcc_checking_assert (blocks_to_update != NULL); if (!bitmap_set_bit (blocks_to_update, bb->index)) return; initialize_flags_in_bb (bb); @@ -588,7 +588,8 @@ static void add_new_name_mapping (tree new_tree, tree old) { /* OLD and NEW_TREE must be different SSA names for the same symbol. */ - gcc_assert (new_tree != old && SSA_NAME_VAR (new_tree) == SSA_NAME_VAR (old)); + gcc_checking_assert (new_tree != old + && SSA_NAME_VAR (new_tree) == SSA_NAME_VAR (old)); /* We may need to grow NEW_SSA_NAMES and OLD_SSA_NAMES because our caller may have created new names since the set was created. */ @@ -639,7 +640,7 @@ mark_def_sites (basic_block bb, gimple s form, force an operand scan on every statement. */ update_stmt (stmt); - gcc_assert (blocks_to_update == NULL); + gcc_checking_assert (blocks_to_update == NULL); set_register_defs (stmt, false); set_rewrite_uses (stmt, false); @@ -648,7 +649,7 @@ mark_def_sites (basic_block bb, gimple s FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE) { tree sym = USE_FROM_PTR (use_p); - gcc_assert (DECL_P (sym)); + gcc_checking_assert (DECL_P (sym)); set_rewrite_uses (stmt, true); } if (rewrite_uses_p (stmt)) @@ -661,7 +662,7 @@ mark_def_sites (basic_block bb, gimple s FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_ALL_USES) { tree sym = USE_FROM_PTR (use_p); - gcc_assert (DECL_P (sym)); + gcc_checking_assert (DECL_P (sym)); if (!bitmap_bit_p (kills, DECL_UID (sym))) set_livein_block (sym, bb); set_rewrite_uses (stmt, true); @@ -671,7 +672,7 @@ mark_def_sites (basic_block bb, gimple s each def to the set of killed symbols. */ FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS) { - gcc_assert (DECL_P (def)); + gcc_checking_assert (DECL_P (def)); set_def_block (def, bb, false); bitmap_set_bit (kills, DECL_UID (def)); set_register_defs (stmt, true); @@ -960,10 +961,7 @@ insert_phi_nodes_for (tree var, bitmap p gimple phi; basic_block bb; bitmap_iterator bi; - struct def_blocks_d *def_map; - - def_map = find_def_blocks_for (var); - gcc_assert (def_map); + struct def_blocks_d *def_map = find_def_blocks_for (var); /* Remove the blocks where we already have PHI nodes for VAR. */ bitmap_and_compl_into (phi_insertion_points, def_map->phi_blocks); @@ -990,7 +988,7 @@ insert_phi_nodes_for (tree var, bitmap p edge_iterator ei; tree new_lhs; - gcc_assert (update_p); + gcc_checking_assert (update_p); new_lhs = duplicate_ssa_name (var, NULL); phi = create_phi_node (new_lhs, bb); add_new_name_mapping (new_lhs, var); @@
Re: [PATCH] Fix memory leak in Fortran FE with init_emit
On 08/22/2012 03:03 PM, Jakub Jelinek wrote: init_function_start is these days supposed to be called only at the start of RTL expansion, it shouldn't be called much earlier, because then we leak e.g. the memory allocated by init_emit. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. That way it makes much more sense how init_function_start works. Thanks for the patch! Tobias 2012-08-22 Jakub Jelinek * trans-decl.c (trans_function_start, generate_coarray_init, create_main_function, gfc_generate_constructors): Call allocate_struct_function instead of init_function_start. --- gcc/fortran/trans-decl.c.jj 2012-08-13 14:36:51.0 +0200 +++ gcc/fortran/trans-decl.c2012-08-22 12:18:29.963953958 +0200 @@ -2265,7 +2265,8 @@ trans_function_start (gfc_symbol * sym) /* Create RTL for function definition. */ make_decl_rtl (fndecl); - init_function_start (fndecl); + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); + allocate_struct_function (fndecl, false); /* function.c requires a push at the start of the function. */ pushlevel (); @@ -4408,7 +4409,8 @@ generate_coarray_init (gfc_namespace * n rest_of_decl_compilation (fndecl, 0, 0); make_decl_rtl (fndecl); - init_function_start (fndecl); + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); + allocate_struct_function (fndecl, false); pushlevel (); gfc_init_block (&caf_init_block); @@ -4982,7 +4984,8 @@ create_main_function (tree fndecl) rest_of_decl_compilation (ftn_main, 1, 0); make_decl_rtl (ftn_main); - init_function_start (ftn_main); + gcc_checking_assert (DECL_STRUCT_FUNCTION (ftn_main) == NULL); + allocate_struct_function (ftn_main, false); pushlevel (); gfc_init_block (&body); @@ -5537,7 +5540,8 @@ gfc_generate_constructors (void) make_decl_rtl (fndecl); - init_function_start (fndecl); + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); + allocate_struct_function (fndecl, false); pushlevel ();
Re: [PATCH] Fix memory leak in Fortran FE with init_emit
On Wed, Aug 22, 2012 at 3:03 PM, Jakub Jelinek wrote: > Hi! > > init_function_start is these days supposed to be called only at the start of > RTL expansion, it shouldn't be called much earlier, because then we leak > e.g. the memory allocated by init_emit. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Looks ok, but please move the assert into allocate_struct_function instead. Richard. > 2012-08-22 Jakub Jelinek > > * trans-decl.c (trans_function_start, generate_coarray_init, > create_main_function, gfc_generate_constructors): Call > allocate_struct_function instead of init_function_start. > > --- gcc/fortran/trans-decl.c.jj 2012-08-13 14:36:51.0 +0200 > +++ gcc/fortran/trans-decl.c2012-08-22 12:18:29.963953958 +0200 > @@ -2265,7 +2265,8 @@ trans_function_start (gfc_symbol * sym) >/* Create RTL for function definition. */ >make_decl_rtl (fndecl); > > - init_function_start (fndecl); > + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); > + allocate_struct_function (fndecl, false); > >/* function.c requires a push at the start of the function. */ >pushlevel (); > @@ -4408,7 +4409,8 @@ generate_coarray_init (gfc_namespace * n > >rest_of_decl_compilation (fndecl, 0, 0); >make_decl_rtl (fndecl); > - init_function_start (fndecl); > + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); > + allocate_struct_function (fndecl, false); > >pushlevel (); >gfc_init_block (&caf_init_block); > @@ -4982,7 +4984,8 @@ create_main_function (tree fndecl) > >rest_of_decl_compilation (ftn_main, 1, 0); >make_decl_rtl (ftn_main); > - init_function_start (ftn_main); > + gcc_checking_assert (DECL_STRUCT_FUNCTION (ftn_main) == NULL); > + allocate_struct_function (ftn_main, false); >pushlevel (); > >gfc_init_block (&body); > @@ -5537,7 +5540,8 @@ gfc_generate_constructors (void) > >make_decl_rtl (fndecl); > > - init_function_start (fndecl); > + gcc_checking_assert (DECL_STRUCT_FUNCTION (fndecl) == NULL); > + allocate_struct_function (fndecl, false); > >pushlevel (); > > > Jakub
Re: [PATCH] Set current_function_decl in {push,pop}_cfun and push_struct_function
On Wed, Aug 22, 2012 at 3:04 PM, Martin Jambor wrote: > Hi, > > On Tue, Aug 21, 2012 at 01:30:47PM +0200, Richard Guenther wrote: >> On Tue, Aug 21, 2012 at 1:27 PM, Martin Jambor wrote: >> > On Wed, Aug 15, 2012 at 05:21:04PM +0200, Martin Jambor wrote: >> >> Hi, >> >> >> >> On Fri, Aug 10, 2012 at 04:57:41PM +0200, Eric Botcazou wrote: >> >> > > - ada/gcc-interface/utils.c:rest_of_subprog_body_compilation calls >> >> > > dump_function which in turns calls dump_function_to_file which calls >> >> > > push_cfun. But Ada front end has its idea of the >> >> > > current_function_decl and there is no cfun which is an inconsistency >> >> > > which makes push_cfun assert fail. I "solved" it by temporarily >> >> > > setting current_function_decl to NULL_TREE. It's just dumping and I >> >> > > thought that dump_function should be considered middle-end and thus >> >> > > middle-end invariants should apply. >> >> > >> >> > If you think that calling dump_function from >> >> > rest_of_subprog_body_compilation >> >> > is a layering violation, I don't have a problem with replacing it with >> >> > a more >> >> > "manual" scheme like the one in c-family/c-gimplify.c:c_genericize, >> >> > provided >> >> > that this yields roughly the same output. >> >> >> >> Richi suggested on IRC that I remove the push/pop_cfun calls from >> >> dump_function_to_file. The only problem seems to be >> >> dump_histograms_for_stmt >> > >> > Yesterday I actually tried and it is not the only problem. Another >> > one is dump_function_to_file->dump_bb->maybe_hot_bb_p which uses cfun >> > to read profile_status. There may be others, this one just blew up >> > first when I set cfun to NULL. And in future someone is quite likely >> > to need cfun to dump something new too. >> > >> > At the same time, re-implementing dumping >> > c-family/c-gimplify.c:c_genericize when dump_function suffices seems >> > ugly to me. >> > >> > So I am going to declare dump_function a front-end interface and use >> > set_cfun in my original patch in dump_function_to_file like we do in >> > other such functions. >> > >> > I hope that will be OK. Thanks, >> >> Setting cfun has side-effects of switching target stuff which might have >> code-generation side-effects because of implementation issues we have >> with target/optimize attributes. So I don't think cfun should be changed >> just for dumping. >> >> Can you instead just set current_function_decl and access >> struct function via DECL_STRUCT_FUNCTION in the dumpers then? >> After all, it it is a front-end interface, the frontend way of saying >> "this is the current function" is to set current_function_decl, not the >> middle-end cfun. >> > > Like the following? Tested and bootstrapped on x86_64-linux, it does > help avoid the ada hunk in my previous patch. OK for trunk? Ok if nobody complains - btw, I think you miss to restore current_function_decl here: > *** /tmp/HcgoTd_tree-cfg.c Wed Aug 22 15:02:30 2012 > --- gcc/tree-cfg.c Wed Aug 22 11:53:02 2012 > *** move_sese_region_to_fn (struct function > *** 6632,6650 > */ > > void > ! dump_function_to_file (tree fn, FILE *file, int flags) > { > ! tree arg, var; > struct function *dsf; > bool ignore_topmost_bind = false, any_var = false; > basic_block bb; > tree chain; > ! bool tmclone = TREE_CODE (fn) == FUNCTION_DECL && decl_is_tm_clone (fn); > > ! fprintf (file, "%s %s(", current_function_name (), > ! tmclone ? "[tm-clone] " : ""); > > ! arg = DECL_ARGUMENTS (fn); > while (arg) > { > print_generic_expr (file, TREE_TYPE (arg), dump_flags); > --- 6632,6652 > */ > > void > ! dump_function_to_file (tree fndecl, FILE *file, int flags) > { > ! tree arg, var, old_current_fndecl = current_function_decl; > struct function *dsf; > bool ignore_topmost_bind = false, any_var = false; > basic_block bb; > tree chain; > ! bool tmclone = (TREE_CODE (fndecl) == FUNCTION_DECL > ! && decl_is_tm_clone (fndecl)); > ! struct function *fun = DECL_STRUCT_FUNCTION (fndecl); > > ! current_function_decl = fndecl; > ! fprintf (file, "%s %s(", function_name (fun), tmclone ? "[tm-clone] " : > ""); > > ! arg = DECL_ARGUMENTS (fndecl); > while (arg) > { > print_generic_expr (file, TREE_TYPE (arg), dump_flags); > *** dump_function_to_file (tree fn, FILE *fi > *** 6659,6689 > fprintf (file, ")\n"); > > if (flags & TDF_VERBOSE) > ! print_node (file, "", fn, 2); > > ! dsf = DECL_STRUCT_FUNCTION (fn); > if (dsf && (flags & TDF_EH)) > dump_eh_tree (file, dsf); > > ! if (flags & TDF_RAW && !gimple_has_body_p (fn)) > { > ! dump_node (fn, TDF_SLIM | flags, file); > return; > } > > - /* Switch CFUN to point to FN. */ > - push_cfun (DECL_STRUCT_FUNCTION (fn)); > - > /* When GIMPLE is lowered, the variables are no longer available in >BI
Re: [patch] Backport fixes for PR54146 to GCC 4.7
On Wed, Aug 22, 2012 at 12:57 AM, Steven Bosscher wrote: > Hello, > > This patch back-ports most of the changes made to resolve the worst of > PR54146 on trunk to GCC 4.7. This PR is basically an accumulation of > various compiler speed and memory usage regressions, and all fixes are > almost trivial, so I think it's fair and safe to back-port these > changes. > > Bootstrapped&tested (gcc-4_7-branch, of course) on > x86_64-unknown-linux-gnu, and on powerpc64-unknown-linux-gnu. OK? Ok with me, please give other RMs a chance to object. Thanks, Richard. > Ciao! > Steven
[PATCH] more efficient mersenne_twister_engine::discard
The discard member function of the mersenne_twister_engine class is unnecessarily inefficient. It currently discard elements one-by-one. It is possible to discard with higher granularity by discarding the entire internal buffer. The attached patch implements this. To avoid duplication a new internal member function is introduced which generates a new set of bits for the internal buffer. The operator() is changed to use it. 2012-08-22 Ulrich Drepper * include/bits/random.h (mersenne_twister_engine): Don't inline discard here. New member function _M_gen_rand. * include/bits/random.tcc (mersenne_twister_engine<>::_M_gen_rand): New function. Extracted from operator(). (mersenne_twister_engine<>::discard): New implementation which skips in large steps. (mersenne_twister_engine<>::operator()): Use _M_gen_rand. d-mersenne-discard Description: Binary data
Re: [patch] Backport fixes for PR54146 to GCC 4.7
On Wed, Aug 22, 2012 at 03:41:08PM +0200, Richard Guenther wrote: > On Wed, Aug 22, 2012 at 12:57 AM, Steven Bosscher > wrote: > > Hello, > > > > This patch back-ports most of the changes made to resolve the worst of > > PR54146 on trunk to GCC 4.7. This PR is basically an accumulation of > > various compiler speed and memory usage regressions, and all fixes are > > almost trivial, so I think it's fair and safe to back-port these > > changes. > > > > Bootstrapped&tested (gcc-4_7-branch, of course) on > > x86_64-unknown-linux-gnu, and on powerpc64-unknown-linux-gnu. OK? > > Ok with me, please give other RMs a chance to object. To me the patch looks way too long, changing too many things, to be suitable for the release branch. Jakub
[PATCH] Make TREE_NOTHROW use the base.nothrow_flag again
Hello, While working on something else, I noticed that debug_tree (vec), when vec is a TREE_VEC, was crashing because TREE_NOTHROW was asserting that its argument is not a TREE_VEC, so print_node would crash. It turned out that TREE_NOTHROW was accidentally modified by this change set: commit 87d8f7b67c6a36c37e48e298f26e693520099b1e Author: rguenth Date: Tue Aug 21 10:03:38 2012 + 2012-08-21 Richard Guenther cp/ * cp-tree.h (TREE_INDIRECT_USING): Use TREE_LANG_FLAG_0 accessor. (ATTR_IS_DEPENDENT): Likewise. (ARGUMENT_PACK_INCOMPLETE_P): Use TREE_ADDRESSABLE instead of TREE_LANG_FLAG_0 on TREE_VECs. * tree.h (struct tree_base): Add union to make it possible to re-use the upper 4 bytes for tree codes that do not need as many flags as others. Move visited and default_def_flag to common bits section in exchange for saturating_flag and unsigned_flag. Add SSA name version and tree vec length fields here. (struct tree_vec): Remove length field here. (struct tree_ssa_name): Remove version field here. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@190558 138bc75d-0d04-0410-961f-82ee72b054a4 Richard Guenther pre-approved the reverting of the TREE_NOTHROW change line. Tested on x86_64-unknown-linux-gnu against master and applied to trunk. gcc/ * tree.h (TREE_NOTHROW): Use the base.nothrow_flag. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@190595 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog |4 gcc/tree.h|2 +- 2 files changed, 5 insertions(+), 1 deletions(-) diff --git a/gcc/tree.h b/gcc/tree.h index f332fdd..bca0576 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1305,7 +1305,7 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, throw an exception. In a CALL_EXPR, nonzero means the call cannot throw. We can't easily check the node type here as the C++ frontend also uses this flag (for AGGR_INIT_EXPR). */ -#define TREE_NOTHROW(NODE) (TREE_NOT_CHECK (NODE, TREE_VEC)->base.nothrow_flag) +#define TREE_NOTHROW(NODE) ((NODE)->base.nothrow_flag) /* In a CALL_EXPR, means that it's safe to use the target of the call expansion as the return slot for a call that returns in memory. */ -- Dodji
Re: [C++ patch] PR 20420
On 08/22/2012 08:43 AM, Paolo Carlini wrote: yesterday I had a look to this old PR and noticed that we are almost doing already the right thing for the original testcase: we are for classes, but we ICE (something recent) for enums. The latter issue seems easy to fix: handle specially enums at the beginning of supplement_binding_1 only when the comment says, that is in templates (otherwise we hit an assert in dependent_type_p). Which assert? The underlying type ought to be set by the time we see a using, and it shouldn't be a TEMPLATE_TYPE_PARM. Then, Comment #4 in the audit trail added a case of duplicate using declaration which we should also reject, involving VAR_DECLs at function scope (C++11 7.3.3/10). Seems also easy to fix, by checking the return value of validate_nonmember_using_decl at the beginning of do_local_using_decl and erroring out if it's null. Let's put the error in validate_nonmember_using_decl with the other errors. Jason
Re: [PATCH] more efficient mersenne_twister_engine::discard
On 08/22/2012 03:43 PM, Ulrich Drepper wrote: The discard member function of the mersenne_twister_engine class is unnecessarily inefficient. It currently discard elements one-by-one. It is possible to discard with higher granularity by discarding the entire internal buffer. The attached patch implements this. To avoid duplication a new internal member function is introduced which generates a new set of bits for the internal buffer. The operator() is changed to use it. Great, thanks. By the way, I suspect that other discard member functions could be also improved (nobody touched the code since the initial implementation when all where some sort of trivial conforming implementation) Paolo.
Re: [C++ patch] PR 20420
Hi, On 08/22/2012 04:09 PM, Jason Merrill wrote: On 08/22/2012 08:43 AM, Paolo Carlini wrote: yesterday I had a look to this old PR and noticed that we are almost doing already the right thing for the original testcase: we are for classes, but we ICE (something recent) for enums. The latter issue seems easy to fix: handle specially enums at the beginning of supplement_binding_1 only when the comment says, that is in templates (otherwise we hit an assert in dependent_type_p). Which assert? The underlying type ought to be set by the time we see a using, and it shouldn't be a TEMPLATE_TYPE_PARM. Here: if (!processing_template_decl) { /* If we are not processing a template, then nobody should be providing us with a dependent type. */ gcc_assert (type); gcc_assert (TREE_CODE (type) != TEMPLATE_TYPE_PARM || is_auto (type)); return false; } type is NULL_TREE. Makes sense? Then, Comment #4 in the audit trail added a case of duplicate using declaration which we should also reject, involving VAR_DECLs at function scope (C++11 7.3.3/10). Seems also easy to fix, by checking the return value of validate_nonmember_using_decl at the beginning of do_local_using_decl and erroring out if it's null. Let's put the error in validate_nonmember_using_decl with the other errors. Ok. Indeed, I wanted to do that, but was annoyed that we cannot assume we are in a function, because validate_nonmember_using_decl is also called by do_toplevel_using_decl. I suppose I should use at_function_scope_p () then. Paolo.
Re: [Patch ARM] Update the test case to differ movs and lsrs for ARM mode and non-ARM mode
On 22/08/12 12:16, Terry Guo wrote: > >>> >>> Due to the impact of ARM UAL, the Thumb1 and Thumb2 mode use LSRS >>> instruction while the ARM mode uses MOVS instruction. So the >> following case >>> is updated accordingly. Is it OK to trunk? >>> >>> BR, >>> Terry >>> >>> 2012-08-21 Terry Guo >>> >>> * gcc.target/arm/combine-movs.c: Check movs for ARM mode >>> and lsrs for other mode. >>> >> >> This can't be right. Thumb1 doesn't use unified syntax. >> >> R. >> > > oops. You are right. Sorry for making such obvious mistake. > Here is patch updated to distinguish ARM and Thumb2. > Tested for Thumb1, Thumb2 and ARM modes. No regression. > > Is it OK? > > BR, > Terry > > 2012-08-21 Terry Guo > > * gcc.target/arm/combine-movs.c: Check movs for ARM mode > and lsrs for Thumb2 mode. > > OK. R.
Re: [C++ patch] PR 20420
Hi. On 08/22/2012 04:14 PM, Paolo Carlini wrote: Hi, On 08/22/2012 04:09 PM, Jason Merrill wrote: On 08/22/2012 08:43 AM, Paolo Carlini wrote: yesterday I had a look to this old PR and noticed that we are almost doing already the right thing for the original testcase: we are for classes, but we ICE (something recent) for enums. The latter issue seems easy to fix: handle specially enums at the beginning of supplement_binding_1 only when the comment says, that is in templates (otherwise we hit an assert in dependent_type_p). Which assert? The underlying type ought to be set by the time we see a using, and it shouldn't be a TEMPLATE_TYPE_PARM. Here: if (!processing_template_decl) { /* If we are not processing a template, then nobody should be providing us with a dependent type. */ gcc_assert (type); gcc_assert (TREE_CODE (type) != TEMPLATE_TYPE_PARM || is_auto (type)); return false; } type is NULL_TREE. Like this: #0 dependent_type_p (type=0x0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/pt.c:19020 #1 0x0073301f in supplement_binding_1 (binding=0x7651c7f8, decl=0x76521cf0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:448 #2 0x00733a65 in supplement_binding (binding=0x7651c7f8, decl=0x76521cf0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:568 #3 0x0073e4e0 in push_class_level_binding_1 (name=0x76534948, x=0x76521cf0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:3162 #4 0x0073e545 in push_class_level_binding (name=0x76534948, x=0x76521cf0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:3180 #5 0x0073d753 in pushdecl_class_level (x=0x76521cf0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:2914 #6 0x006d9538 in finish_member_declaration (decl=0x76521cf0) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/semantics.c:2668 #7 0x007469c2 in pushtag_1 (name=0x76534948, type=0x76535498, scope=ts_current) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:5787 #8 0x00746d78 in pushtag (name=0x76534948, type=0x76535498, scope=ts_current) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/name-lookup.c:5853 #9 0x00570021 in start_enum (name=0x76534948, enumtype=0x76535498, underlying_type=0x0, scoped_enum_p=false, is_new=0x7fffd3f7) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/decl.c:12146 #10 0x00658dfc in cp_parser_enum_specifier (parser=0x76534b00) at /scratch/Gcc/svn-dirs/trunk/gcc/cp/parser.c:14509 Paolo.
Re: [PATCH, MIPS] add new peephole for 74k dspr2
On 08/19/2012 11:22 AM, Richard Sandiford wrote: Not sure whether a peephole is the right choice here. In practice, I'd imagine these opportunities would only come from a DImode move of $0 into a doubleword register, so we could simply emit the pattern in mips_split_doubleword_move. That would also allow us to use it for plain HI and LO. It wasn't obvious from the patch why it was restricted to the DSP extension registers. Please also add a scan-assembler test. How is this version of the fix? -Sandra 2012-08-22 Sandra Loosemore gcc/ * mips.c (mips_split_doubleword_move): Use mult instruction to zero-initialize accumulator. gcc/testsuite/ * gcc.target/mips/mips32-dsp-accinit.c: New. Index: gcc/config/mips/mips.c === --- gcc/config/mips/mips.c (revision 190463) +++ gcc/config/mips/mips.c (working copy) @@ -4158,6 +4158,14 @@ mips_split_doubleword_move (rtx dest, rt else emit_insn (gen_mfhisi_di (mips_subword (dest, true), src)); } + else if (!TARGET_64BIT && !TARGET_MIPS16 && ISA_HAS_DSP_MULT + && src == const0_rtx + && REG_P (dest) && ACC_REG_P (REGNO (dest))) +/* Zero-initialize accumulator using "mult $dest,$0,$0" instead + of a mthi/mtlo pair. */ +emit_insn (gen_mulsidi3_32bit (dest, + gen_rtx_REG (SImode, GP_REG_FIRST), + gen_rtx_REG (SImode, GP_REG_FIRST))); else { /* The operation can be split into two normal moves. Decide in Index: gcc/testsuite/gcc.target/mips/mips32-dsp-accinit.c === --- gcc/testsuite/gcc.target/mips/mips32-dsp-accinit.c (revision 0) +++ gcc/testsuite/gcc.target/mips/mips32-dsp-accinit.c (revision 0) @@ -0,0 +1,16 @@ +/* { dg-options "-O2 -march=74kc -mgp32" } */ + +/* Check that the zero-initialization of the accumulator feeding into + the madd is done by means of a mult instruction instead of mthi/mtlo. */ + +NOMIPS16 long long f (int n, int *v, int m) +{ + long long result = 0; + int i; + + for (i = 0; i < n; i++) +result = __builtin_mips_madd (result, v[i], m); + return result; +} + +/* { dg-final { scan-assembler "mult\t\\\$ac.,\\\$0,\\\$0" } } */
PATCH: Default to 64-bit long double for Bionic/x86
Hi, Long double is the same as double in Bionic. This patch 1. Add -mlong-double-80: 80-bit long double Enabled for Linux by default. 2. Add -mlong-double-64: 64-bit long double Predefine a new C/C++ macro, __LONG_DOUBLE_64__. Enabled for Android by default. __float80 can still be used for 80-bit long double. Tested on Linux/x86 and Android/x86. OK to install? Thanks. H.J. gcc/ 2012-08-22 H.J. Lu * doc/invoke.texi: Document -mlong-double-64/-mlong-double-80. * config/i386/i386.c (flag_opts): Add -mlong-double-64. (TARGET_HAS_BIONIC): Default long double to 64-bit for Bionic. * config/i386/i386.h (LONG_DOUBLE_TYPE_SIZE): Use 64 if TARGET_LONG_DOUBLE_64 is true. (LIBGCC2_LONG_DOUBLE_TYPE_SIZE): New macro. (WIDEST_HARDWARE_FP_SIZE): Defined to 80. * config/i386/i386.opt (mlong-double-80): New option. (mlong-double-64): Likewise. * config/i386/i386-c.c (ix86_target_macros): Define __LONG_DOUBLE_64__ for TARGET_LONG_DOUBLE_64. gcc/testsuite/ 2012-08-22 H.J. Lu * gcc.target/i386/long-double-64-1.c: New file. * gcc.target/i386/long-double-64-2.c: Likewise. * gcc.target/i386/long-double-64-3.c: Likewise. * gcc.target/i386/long-double-64-4.c: Likewise. * gcc.target/i386/long-double-80-1.c: Likewise. * gcc.target/i386/long-double-80-2.c: Likewise. * gcc.target/i386/long-double-80-3.c: Likewise. * gcc.target/i386/long-double-80-4.c: Likewise. * gcc.target/i386/long-double-80-5.c: Likewise. * gcc.target/i386/long-double-80-6.c: Likewise. * gcc.target/i386/long-double-80-7.c: Likewise. libgcc/ 2012-08-22 H.J. Lu * config/i386/t-linux (HOST_LIBGCC2_CFLAGS): New. diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c index d00e0ba..edd64ff 100644 --- a/gcc/config/i386/i386-c.c +++ b/gcc/config/i386/i386-c.c @@ -409,6 +409,9 @@ ix86_target_macros (void) builtin_define_std ("i386"); } + if (TARGET_LONG_DOUBLE_64) +cpp_define (parse_in, "__LONG_DOUBLE_64__"); + cpp_define_formatted (parse_in, "__ATOMIC_HLE_ACQUIRE=%d", IX86_HLE_ACQUIRE); cpp_define_formatted (parse_in, "__ATOMIC_HLE_RELEASE=%d", IX86_HLE_RELEASE); diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index a6fc45b..da931ee 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -2786,6 +2786,7 @@ ix86_target_string (HOST_WIDE_INT isa, int flags, const char *arch, static struct ix86_target_opts flag_opts[] = { { "-m128bit-long-double", MASK_128BIT_LONG_DOUBLE }, +{ "-mlong-double-64", MASK_LONG_DOUBLE_64 }, { "-m80387", MASK_80387 }, { "-maccumulate-outgoing-args",MASK_ACCUMULATE_OUTGOING_ARGS }, { "-malign-double",MASK_ALIGN_DOUBLE }, @@ -4084,6 +4085,11 @@ ix86_option_override_internal (bool main_args_p) else if (target_flags_explicit & MASK_RECIP) recip_mask &= ~(RECIP_MASK_ALL & ~recip_mask_explicit); + /* Default long double to 64-bit for Bionic. */ + if (TARGET_HAS_BIONIC + && !(target_flags_explicit & MASK_LONG_DOUBLE_64)) +target_flags |= MASK_LONG_DOUBLE_64; + /* Save the initial options in case the user does function specific options. */ if (main_args_p) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 11f79e3..3a41a43 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -671,9 +671,17 @@ enum target_cpu_default #define LONG_LONG_TYPE_SIZE 64 #define FLOAT_TYPE_SIZE 32 #define DOUBLE_TYPE_SIZE 64 -#define LONG_DOUBLE_TYPE_SIZE 80 +#define LONG_DOUBLE_TYPE_SIZE (TARGET_LONG_DOUBLE_64 ? 64 : 80) -#define WIDEST_HARDWARE_FP_SIZE LONG_DOUBLE_TYPE_SIZE +/* Define this to set long double type size to use in libgcc2.c, which can + not depend on target_flags. */ +#ifdef __LONG_DOUBLE_64__ +#define LIBGCC2_LONG_DOUBLE_TYPE_SIZE 64 +#else +#define LIBGCC2_LONG_DOUBLE_TYPE_SIZE 80 +#endif + +#define WIDEST_HARDWARE_FP_SIZE 80 #if defined (TARGET_BI_ARCH) || TARGET_64BIT_DEFAULT #define MAX_BITS_PER_WORD 64 diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index e4f78f3..6a38994 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -86,6 +86,14 @@ m96bit-long-double Target RejectNegative Report InverseMask(128BIT_LONG_DOUBLE) Save sizeof(long double) is 12 +mlong-double-80 +Target Report RejectNegative InverseMask(LONG_DOUBLE_64) Save +Use 80-bit long double + +mlong-double-64 +Target Report RejectNegative Mask(LONG_DOUBLE_64) Save +Use 64-bit long double + maccumulate-outgoing-args Target Report Mask(ACCUMULATE_OUTGOING_ARGS) Save Reserve space for outgoing arguments in the function prologue diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index ae22ca9..0957979 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -
Re: [C++ patch] PR 20420
. thus, in short, what is happening is that, for this testcase: class B { protected: enum E { E1, E2, E3 }; }; class D : private B { public: using B::E; private: enum E { }; }; we parse the new declaration enum E { }; and we reach supplement_binding_1 before setting the underlying type of the new declaration. The old declaration is fine, would not ICE dependent_type_p. Paolo.
[PATCH] Remove some mark_virtual_operands_for_renaming calls
Inside the vectorizer they are not necessary (stmt update will make sure they are marked for renaming - sth I'm going to improve with a follwup). Also BB vectorization doesn't need to update SSA form after vectorizing each BB. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-08-22 Richard Guenther * tree-vectorizer.c (vectorize_loops): Do not call mark_virtual_operands_for_renaming. * tree-vect-slp.c (vect_slp_transform_bb): Likewise. Do not update SSA form here. Index: gcc/tree-vectorizer.c === *** gcc/tree-vectorizer.c (revision 190590) --- gcc/tree-vectorizer.c (working copy) *** vectorize_loops (void) *** 223,230 /* --- Finalize. --- */ - mark_virtual_operands_for_renaming (cfun); - for (i = 1; i < vect_loops_num; i++) { loop_vec_info loop_vinfo; --- 223,228 Index: gcc/tree-vect-slp.c === *** gcc/tree-vect-slp.c (revision 190590) --- gcc/tree-vect-slp.c (working copy) *** vect_slp_transform_bb (basic_block bb) *** 3159,3170 } } - mark_virtual_operands_for_renaming (cfun); - /* The memory tags and pointers in vectorized statements need to - have their SSA forms updated. FIXME, why can't this be delayed - until all the loops have been transformed? */ - update_ssa (TODO_update_ssa); - if (vect_print_dump_info (REPORT_DETAILS)) fprintf (vect_dump, "BASIC BLOCK VECTORIZED\n"); --- 3159,3164
Re: [C++ patch] PR 20420
On 08/22/2012 10:55 AM, Paolo Carlini wrote: . thus, in short, what is happening is that, for this testcase: class B { protected: enum E { E1, E2, E3 }; }; class D : private B { public: using B::E; private: enum E { }; }; we parse the new declaration enum E { }; and we reach supplement_binding_1 before setting the underlying type of the new declaration. The old declaration is fine, would not ICE dependent_type_p. So with your change would we still ICE if D were a template? It seems like what we should be checking for is null underlying type. Jason
Re: [C++ patch] PR 20420
Hi again, On 08/22/2012 05:13 PM, Jason Merrill wrote: On 08/22/2012 10:55 AM, Paolo Carlini wrote: . thus, in short, what is happening is that, for this testcase: class B { protected: enum E { E1, E2, E3 }; }; class D : private B { public: using B::E; private: enum E { }; }; we parse the new declaration enum E { }; and we reach supplement_binding_1 before setting the underlying type of the new declaration. The old declaration is fine, would not ICE dependent_type_p. So with your change would we still ICE if D were a template? It seems like what we should be checking for is null underlying type. Good question ;) Yesterday night I double checked with this: template class BT { protected: enum E { E1, E2, E3 }; struct S { int i; E e; }; }; template class DT : private BT { public: using BT::E; using BT::S; private: enum E {}; struct S {}; }; template class DT; and we handle it correctly (note: we error out *only* at instantiation time). At this point, let me know, I could either add to the testcase a templated variant like the above (see attached), or rework the code to explicitly check the underlying type (I would add locals hosting the TREE_TYPEs to shorten a bit things, etc). The below passes testing, anyway. Thanks, Paolo. /// Index: testsuite/g++.dg/lookup/using53.C === --- testsuite/g++.dg/lookup/using53.C (revision 0) +++ testsuite/g++.dg/lookup/using53.C (revision 0) @@ -0,0 +1,53 @@ +// PR c++/20420 + +class B +{ +protected: + enum E { E1, E2, E3 }; + struct S { int i; E e; }; +}; + +class D : private B +{ +public: + using B::E; // { dg-message "previous" } + using B::S; // { dg-message "previous" } + +private: + enum E {};// { dg-error "conflicts" } + struct S {}; // { dg-error "conflicts" } +}; + +template +class BT +{ +protected: + enum E { E1, E2, E3 }; + struct S { int i; E e; }; +}; + +template +class DT : private BT +{ +public: + using BT::E; // { dg-message "previous" } + using BT::S; // { dg-message "previous" } + +private: + enum E {};// { dg-error "conflicts" } + struct S {}; // { dg-error "conflicts" } +}; + +template class DT; + +namespace N +{ + int i; +} + +void +f () +{ + using N::i; + using N::i; // { dg-error "declared" } +} Index: cp/name-lookup.c === --- cp/name-lookup.c(revision 190590) +++ cp/name-lookup.c(working copy) @@ -441,7 +441,8 @@ supplement_binding_1 (cxx_binding *binding, tree d template in order to handle late matching of underlying type on an opaque-enum-declaration followed by an enum-specifier. */ - || (TREE_CODE (TREE_TYPE (target_decl)) == ENUMERAL_TYPE + || (processing_template_decl + && TREE_CODE (TREE_TYPE (target_decl)) == ENUMERAL_TYPE && TREE_CODE (TREE_TYPE (target_bval)) == ENUMERAL_TYPE && (dependent_type_p (ENUM_UNDERLYING_TYPE (TREE_TYPE (target_decl))) @@ -2420,7 +2421,15 @@ validate_nonmember_using_decl (tree decl, tree sco gcc_assert (DECL_P (decl)); /* Make a USING_DECL. */ - return push_using_decl (scope, name); + tree using_decl = push_using_decl (scope, name); + + if (using_decl == NULL_TREE + && at_function_scope_p () + && TREE_CODE (decl) == VAR_DECL) +/* C++11 7.3.3/10. */ +error ("%qD is already declared in this scope", name); + + return using_decl; } /* Process local and global using-declarations. */
Request for help with the scalarizer
Dear all, first, a question to Mikael (and others knowing the scalarizer): How to properly fix the following: implicit none REAL qss(3) REAL, ALLOCATABLE :: qj(:,:) INTEGER :: qcount qss(:)=qj(:,qcount) end For that one calls gfc_cleanup_loop (&loop) - and in gfc_free_ss: case GFC_SS_SECTION: for (n = 0; n < ss->dimen; n++) { if (ss_info->data.array.subscript[ss->dim[n]]) gfc_free_ss_chain (ss_info->data.array.subscript[ss->dim[n]]); } The problem is: (gdb) p ss->dimen $8 = 1 (gdb) p ss->dim[0] $9 = 0 (gdb) p ss->info->data.array.subscript $10 = {0x0, 0x15f37f0, 0x0, 0x0, 0x0, 0x0, 0x0} The question is now whether ss->dim[0] should be 1 instead of 0, then the bug is in gfc_walk_array_ref's AR_SECTION: -> DIMEN_ELEMENT handling. Or whether the gfc_free_ss handling is wrong. A brute-force method would be to walk all MAX_DIMENSION elements of ss->info->data.array.subscript. Secondly, I tried to to fix all gfc_ss mem leaks (including PR54350, which I accidentally introduced). The attached patch works nicely for the test suite (except for realloc_on_assign_*.f90 aka PR54350), it also fixes the leaks in some real-world test files. And it compiles nearly all polyhedron examples. However: It fails to compile rnflow of Polyhedron 2005. Namely, one enters an endless loop in gfc_conv_ss_startstride with the following backtrace. Obviously, one has freed too much memory to early. Namely: ss = gfc_walk_expr (expr1); gfc_conv_array_parameter (&se, expr1, ss, false, NULL, NULL, NULL); realloc_lhs_loop_for_fcn_call (&se, &expr1->where, &ss, &loop); With the current patch, gfc_conv_array_parameter always frees "ss"; before, it only freed ss by calling gfc_conv_expr_descriptor. How to solve that? A partial ss freeing is rather bad as one cannot detect whether "ss" has been freed or not. One solution would be that gfc_conv_expr_descriptor no longer frees the memory - i.e. the caller has to do the duty. That's probably the most invasive patch, but at least it makes the code clearer. Suggestions? #0 gfc_conv_ss_startstride (loop=0x7fffd8a0) at /projects/tob/gcc-git/gcc/gcc/fortran/trans-array.c:3861 #1 0x0063b105 in realloc_lhs_loop_for_fcn_call (loop=0x7fffd8a0, ss=, where=0x183a990, se=0x7fffd850) at /projects/tob/gcc-git/gcc/gcc/fortran/trans-expr.c:6597 #2 gfc_trans_arrayfunc_assign (expr1=0x183a940, expr2=expr2@entry=0x183e080) at /projects/tob/gcc-git/gcc/gcc/fortran/trans-expr.c:6778 #3 0x0063c4c2 in gfc_trans_assignment (expr1=0x183a940, expr2=0x183e080, init_flag=, dealloc=) at /projects/tob/gcc-git/gcc/gcc/fortran/trans-expr.c:7441 #4 0x00602be2 in trans_code (code=0x183ee10, cond=0x0) at /projects/tob/gcc-git/gcc/gcc/fortran/trans.c:1312 #5 0x00629937 in gfc_generate_function_code (ns=out>) at /projects/tob/gcc-git/gcc/gcc/fortran/trans-decl.c:5346 Tobias diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 8c254dd..08f5a38 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -6443,6 +6443,7 @@ gfc_conv_expr_descriptor (gfc_se * se, gfc_expr * expr, gfc_ss * ss) if (expr->ts.type == BT_CHARACTER) se->string_length = gfc_get_expr_charlen (expr); + gfc_free_ss (ss); return; } break; @@ -6477,6 +6478,7 @@ gfc_conv_expr_descriptor (gfc_se * se, gfc_expr * expr, gfc_ss * ss) gcc_assert (se->ss == ss); se->expr = gfc_build_addr_expr (NULL_TREE, se->expr); gfc_conv_expr (se, expr); + gfc_free_ss (ss); return; } @@ -6986,6 +6988,7 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, gfc_ss * ss, bool g77, se->expr = gfc_build_addr_expr (NULL_TREE, tmp); if (size) array_parameter_size (tmp, expr, size); + gfc_free_ss (ss); return; } @@ -6996,6 +6999,8 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, gfc_ss * ss, bool g77, gfc_conv_expr_descriptor (se, expr, ss); tmp = se->expr; } + else + gfc_free_ss (ss); if (size) array_parameter_size (tmp, expr, size); se->expr = gfc_conv_array_data (tmp); diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index cfb0862..28f8d28 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -3534,7 +3534,6 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym, gfc_add_block_to_block (&block, &loop.post); gfc_add_block_to_block (&block, &fptrse.post); gfc_cleanup_loop (&loop); - gfc_free_ss (ss); gfc_add_modify (&block, offset, fold_build1_loc (input_location, NEGATE_EXPR, @@ -4040,28 +4039,34 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, if (e->expr_type == EXPR_VARIABLE && is_subref_array (e)) - /* The actual argument is a component reference to an - array of derived types. In this case, the argument - is convert
Re: [patch] Backport fixes for PR54146 to GCC 4.7
On Wed, Aug 22, 2012 at 3:44 PM, Jakub Jelinek wrote: > To me the patch looks way too long, changing too many things, to be suitable > for the release branch. Most of the changes are only to allocate bitmaps on dedicated obstacks. It makes the patch long, but the changes are not really significant except for bringing down the memory footprint (e.g. for c-common.c this patch halves the peak memory usage in top). Only the tree-ssa-loop-manip.c changes are the more complex bit, actual algorithm changes. The rewrite of add_exit_phis() makes an O(b_basic_blocks^2) algorithm linear, but I can see why this may be considered unsuitable for a release branch. I could simplyleave these changes out and only apply the bitmap_obstack bits . I think it's important that GCC 4.7 gets less memory-hungry. This is the first GCC with a usable LTO, but it currently consumes way too much memory. My patches, and those from Richard and you to plug leaks in various places (which I plan to test and propose for back-porting also), make GCC 4.7 a better compiler with almost no risk. Would it be acceptable for you if I apply the patches one at a time (after testing them individually first)? Ciao! Steven
Re: [PATCH][RFC] Virtual operands in loop-closed SSA form
On Wed, Aug 22, 2012 at 3:01 PM, Richard Guenther wrote: > > While we should already be in loop-closed SSA form for virtual > operands most of the time (because we have a virtual use at > the return statement) and loop-closed SSA form for virtuals > is cheap (we only have a single virtual operand now) the following > makes sure that a loop-closed PHI node for virtuals does exist. Make sense. I think it would be good to add an explanation of what this means in the comment before rewrite_into_loop_closed_ssa, because "liveness" of a memory reference isn't as obvious as that of an ssa register. Did you try this with the header-copying change from PR46590 to make it do only TODO_update_ssa_no_phi? Ciao! Steven
Re: PATCH: Default to 64-bit long double for Bionic/x86
On Wed, Aug 22, 2012 at 4:41 PM, H.J. Lu wrote: > Long double is the same as double in Bionic. This patch > > 1. Add -mlong-double-80: > 80-bit long double > Enabled for Linux by default. > 2. Add -mlong-double-64: > 64-bit long double > Predefine a new C/C++ macro, __LONG_DOUBLE_64__. > Enabled for Android by default. > > __float80 can still be used for 80-bit long double. Tested on Linux/x86 > and Android/x86. OK to install? > > 2012-08-22 H.J. Lu > > * doc/invoke.texi: Document -mlong-double-64/-mlong-double-80. > > * config/i386/i386.c (flag_opts): Add -mlong-double-64. > (TARGET_HAS_BIONIC): Default long double to 64-bit for Bionic. > > * config/i386/i386.h (LONG_DOUBLE_TYPE_SIZE): Use 64 if > TARGET_LONG_DOUBLE_64 is true. > (LIBGCC2_LONG_DOUBLE_TYPE_SIZE): New macro. > (WIDEST_HARDWARE_FP_SIZE): Defined to 80. > > * config/i386/i386.opt (mlong-double-80): New option. > (mlong-double-64): Likewise. > > * config/i386/i386-c.c (ix86_target_macros): Define > __LONG_DOUBLE_64__ for TARGET_LONG_DOUBLE_64. > > gcc/testsuite/ > > 2012-08-22 H.J. Lu > > * gcc.target/i386/long-double-64-1.c: New file. > * gcc.target/i386/long-double-64-2.c: Likewise. > * gcc.target/i386/long-double-64-3.c: Likewise. > * gcc.target/i386/long-double-64-4.c: Likewise. > * gcc.target/i386/long-double-80-1.c: Likewise. > * gcc.target/i386/long-double-80-2.c: Likewise. > * gcc.target/i386/long-double-80-3.c: Likewise. > * gcc.target/i386/long-double-80-4.c: Likewise. > * gcc.target/i386/long-double-80-5.c: Likewise. > * gcc.target/i386/long-double-80-6.c: Likewise. > * gcc.target/i386/long-double-80-7.c: Likewise. > > libgcc/ > > 2012-08-22 H.J. Lu > > * config/i386/t-linux (HOST_LIBGCC2_CFLAGS): New. OK. Thanks, Uros.
Re: PATCH: PR target/54347: REAL_VALUE_TO_TARGET_LONG_DOUBLE shouldn't be used in i386
On Tue, Aug 21, 2012 at 5:05 PM, H.J. Lu wrote: > long double may not be 80-bit on i386. We can't use > REAL_VALUE_TO_TARGET_LONG_DOUBLE for XFmode. This patch replaces > REAL_VALUE_TO_TARGET_LONG_DOUBLE with real_to_target. OK to install? > > Thanks. > > H.J. > --- > 2012-08-21 H.J. Lu > > PR target/54347 > * config/i386/i386.c (ix86_split_to_parts): Replace > REAL_VALUE_TO_TARGET_LONG_DOUBLE with real_to_target. OK. Thanks, Uros.
Re: [PATCH v2, rtl-optimization]: Fix PR46829, ICE in spill_failure with -fschedule-insns [was: Fixing instability of -fschedule-insns for x86]
On Tue, Aug 21, 2012 at 1:34 AM, Oleg Endo wrote: > On Sat, 2012-08-18 at 10:23 +0200, Uros Bizjak wrote: >> On Sat, Aug 18, 2012 at 10:14 AM, Uros Bizjak wrote: >> >> > After discussion with Oleg, it looks that it is enough to prevent >> > wrong registers in the output of the (multi-output) insn pattern. As >> > far as inputs are concerned, combine already handles limited reload >> > classes in the right way. The problem with x86 is, that reload tried >> > to fix output operand of the multi-output ins, where scheduler already >> > moved load of ax register before this insn. >> > >> > Version 2 of the patch now handles only output operands. Also, >> > handling of empty constraints was fixed. >> >> ... but here we fail testcase from PR46843... so we HAVE to check >> input operands. > > Hm, I'm curious ... how would that work for patterns such as > > (define_insn "*addc" > [(set (match_operand:SI 0 "arith_reg_dest" "=r") > (plus:SI (plus:SI > (match_operand:SI 1 "arith_reg_operand" "%0") > (match_operand:SI 2 "arith_reg_or_0_operand" "r")) > (match_operand:SI 3 "t_reg_operand" ""))) >(clobber (reg:SI T_REG))] > "TARGET_SH1" > "addc %2,%0" > [(set_attr "type" "arith")]) > > ... where the predicate "arith_reg_or_0_operand" allows either > "const_int 0" or an "arith_reg_operand", but the constraint "r" tells > reload to load the constant into a register for this insn. > Probably those kind of patterns that rely on this behavior would need to > be rewritten as insn_and_split to do the constant loading 'manually'? I think that we have to introduce a target hook that would "approve" the combined insn. This way, every target can analyse the combined insn and handle it in the most appropriate way. Uros.
Re: Request for help with the scalarizer
On 22/08/2012 19:19, Tobias Burnus wrote: > Dear all, > > first, a question to Mikael (and others knowing the scalarizer): How to > properly fix the following: > > implicit none > REAL qss(3) > REAL, ALLOCATABLE :: qj(:,:) > INTEGER :: qcount > qss(:)=qj(:,qcount) > end > > For that one calls gfc_cleanup_loop (&loop) - and in gfc_free_ss: > > case GFC_SS_SECTION: > for (n = 0; n < ss->dimen; n++) > { > if (ss_info->data.array.subscript[ss->dim[n]]) > gfc_free_ss_chain (ss_info->data.array.subscript[ss->dim[n]]); > } > > The problem is: > > (gdb) p ss->dimen > $8 = 1 that's fine: rank 1 array > (gdb) p ss->dim[0] > $9 = 0 that's fine: the first dimension of the array subreference is the first dimension of the full array. > (gdb) p ss->info->data.array.subscript > $10 = {0x0, 0x15f37f0, 0x0, 0x0, 0x0, 0x0, 0x0} that's fine: there is a (scalar) subscript ss in dimension 1 corresponding to `qcount', and nothing in dimension 0 as there is no vector subscript in that dimension. > > The question is now whether ss->dim[0] should be 1 instead of 0, then > the bug is in gfc_walk_array_ref's AR_SECTION: -> DIMEN_ELEMENT > handling. No, it's fine as is. > Or whether the gfc_free_ss handling is wrong. A brute-force > method would be to walk all MAX_DIMENSION elements of > ss->info->data.array.subscript. Yes, that's the way it should be. For example gfc_add_loop_ss_code has already one "brute force" loop about subscripts: case GFC_SS_SECTION: /* Add the expressions for scalar and vector subscripts. */ for (n = 0; n < GFC_MAX_DIMENSIONS; n++) if (info->subscript[n]) gfc_add_loop_ss_code (loop, info->subscript[n], true, where); > > > Secondly, I tried to to fix all gfc_ss mem leaks (including PR54350, > which I accidentally introduced). > > The attached patch works nicely for the test suite (except for > realloc_on_assign_*.f90 aka PR54350), it also fixes the leaks in some > real-world test files. And it compiles nearly all polyhedron examples. > > However: It fails to compile rnflow of Polyhedron 2005. Namely, one > enters an endless loop in gfc_conv_ss_startstride with the following > backtrace. Obviously, one has freed too much memory to early. Namely: > > ss = gfc_walk_expr (expr1); > gfc_conv_array_parameter (&se, expr1, ss, false, NULL, NULL, NULL); > realloc_lhs_loop_for_fcn_call (&se, &expr1->where, &ss, &loop); > > With the current patch, gfc_conv_array_parameter always frees "ss"; > before, it only freed ss by calling gfc_conv_expr_descriptor. > > > How to solve that? A partial ss freeing is rather bad as one cannot > detect whether "ss" has been freed or not. One solution would be that > gfc_conv_expr_descriptor no longer frees the memory - i.e. the caller > has to do the duty. That's probably the most invasive patch, but at > least it makes the code clearer. In general, I prefer having allocation and deallocation at the same scope for clarity. In gfc_conv_expr_descriptor, it makes some sense to throw away ss once it has been used for a loop, so it would be better to have the allocation happen there too. I have reviewed gfc_conv_expr_descriptor calls, and unless I missed some, they all follow the same pattern: ss = gfc_walk_expr (expr); ... gfc_init_se (&se, ...); ... gfc_conv_expr_descriptor (&se, expr, ss); This shows that ss is redundant with expr, so I think the walking should be internal to gfc_conv_expr_descriptor, the ss argument should be removed, and then gfc_conv_array_parameter doesn't need ss any more. I believe it would solve your problem, and make allocation and cleanup happen in the same function, which is nice. I don't think it would avoid invasive changes though. Mikael
PATCH: PR driver/54335: -dm doesn't work
Hi, -dm hasn't worked for a long time, at least dating back to GCC 3.4. This patch removes -dm and puts back -da, which was removed by accident. OK to install? Thanks. H.J. --- 2012-08-22 H.J. Lu PR driver/54335 * doc/invoke.texi: Add -da and remove -dm. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index ae22ca9..e2feb6d 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -5610,7 +5610,9 @@ Dump after live range splitting. @opindex fdump-rtl-dfinish These dumps are defined but always produce empty files. -@item -fdump-rtl-all +@item -da +@itemx -fdump-rtl-all +@opindex da @opindex fdump-rtl-all Produce all the dumps listed above. @@ -5627,11 +5629,6 @@ normal output. @opindex dH Produce a core dump whenever an error occurs. -@item -dm -@opindex dm -Print statistics on memory usage, at the end of the run, to -standard error. - @item -dp @opindex dp Annotate the assembler output with a comment indicating which
[patch, fortran] Reduce Explosion of noise from -Wall
Am 22.08.2012 11:18, schrieb Tobias Burnus: Regarding -Wcompare-real, I wonder whether it makes sense to either ignore comparisions against zero or to put them into a different flag (-Wcompare-real-zero); Here is a patch to not warn for comparisons against zero. Regression-tested, also tested with make info and make dvi. OK for trunk? Thomas 012-08-22 Thomas König PR fortran/54298 * invoke.texi: Document that -Wcompare-reals does not warn for comparison against zero. * resolve.c (is_real_zero): New function. (is_complex_zero): New function. (resolve_operator): If the comparison is against real or complex zero, don't warn about equality/inequality comparisions. Document how to suppress the warning in the error message. 2012-08-22 Thomas König PR fortran/54298 * gfortran.dg/compare_real_2.f90: New test. Index: invoke.texi === --- invoke.texi (Revision 190541) +++ invoke.texi (Arbeitskopie) @@ -939,7 +939,8 @@ allocatable variable; this includes scalars and de @item -Wcompare-reals @opindex @code{Wcompare-reals} Warn when comparing real or complex types for equality or inequality. -Enabled by @option{-Wall}. +Comparisons against zero are not warned about. Enabled by +@option{-Wall}. @item -Wtarget-lifetime @opindex @code{Wtargt-lifetime} Index: resolve.c === --- resolve.c (Revision 190541) +++ resolve.c (Arbeitskopie) @@ -3876,7 +3876,27 @@ compare_shapes (gfc_expr *op1, gfc_expr *op2) return t; } +/* Check if an expr is a zero REAL constant. */ +static bool +is_real_zero (gfc_expr *e) +{ + return e->ts.type == BT_REAL && e->expr_type == EXPR_CONSTANT +&& mpfr_sgn (e->value.real) == 0; + +} + +/* Check if an expr is a zero COMPLEX constant. */ + +static bool +is_complex_zero (gfc_expr *e) +{ + return e->ts.type == BT_COMPLEX && e->expr_type == EXPR_CONSTANT +&& mpfr_sgn (mpc_realref (e->value.complex)) == 0 +&& mpfr_sgn (mpc_imagref (e->value.complex)) == 0; + +} + /* Resolve an operator expression node. This can involve replacing the operation with a user defined function call. */ @@ -4047,11 +4067,16 @@ resolve_operator (gfc_expr *e) { const char *msg; + /* Comparisons against zero are usually legitimate, so let's not warn. */ + if (is_real_zero (op1) || is_real_zero (op2) + || is_complex_zero (op1) || is_complex_zero (op2)) + break; + if (op == INTRINSIC_EQ || op == INTRINSIC_EQ_OS) - msg = "Equality comparison for %s at %L"; + msg = "Equality comparison for %s at %L [use -Wno-compare-reals to suppress]"; else - msg = "Inequality comparison for %s at %L"; - + msg = "Inequality comparison for %s at %L [use -Wno-compare-reals to suppress]"; + gfc_warning (msg, gfc_typename (&op1->ts), &op1->where); } } ! { dg-do compile } ! { dg-options "-Wcompare-reals" } program main real :: a complex :: c read (*,*) a read (*,*) c if (a .eq. 0.) print *,"foo" if (0. == a) print *,"foo" if (a .eq. 0) print *,"foo" if (0 == a) print *,"foo" if (a .ne. 0.0) print *,"foo" if (0.0 /= a) print *,"foo" if (a .ne. 0) print *,"foo" if (0 /= a) print *,"foo" if (c .eq. (0., 0.)) print *,"foo" if ((0., 0.) == a) print *,"foo" if (c .ne. (0., 0.)) print *,"foo" if ((0., 2.11) /= c) print *,"foo" ! { dg-warning "Inequality comparison for COMPLEX" } if ((2.11, 0.) /= c) print *,"foo" ! { dg-warning "Inequality comparison for COMPLEX" } end program main
[PATCH] Small change for cloog.m4 configuration file
Hi. I've been having to make this small change to the 'configure' script when building on sparc-sun-solaris2.10 to accomodate the shell executing the script. Without the change, I get an error message like so: configure: test: unknown operator == With the double equals being replaced by a single equal, the configure script executes and GCC builds properly. I think this code was added to the file in early July. The patch below fixes the source of the problem in config/cloog.m4. Thanks. Art Haas diff --git a/config/cloog.m4 b/config/cloog.m4 index 270cf8f..5193f4e 100644 --- a/config/cloog.m4 +++ b/config/cloog.m4 @@ -65,7 +65,7 @@ AC_DEFUN([CLOOG_INIT_FLAGS], fi dnl If no --with-cloog flag was specified and there is in-tree ClooG dnl source, set up flags to use that. - if test "x${clooginc}" == x && test "x${clooglibs}" == x \ + if test "x${clooginc}" = x && test "x${clooglibs}" = x \ && test -d ${srcdir}/cloog; then clooglibs='-L$$r/$(HOST_SUBDIR)/cloog/'"$lt_cv_objdir"' ' clooginc='-I$$r/$(HOST_SUBDIR)/cloog/include -I$$s/cloog/include -I'${srcdir}'/cloog/include '
Re: [PATCH] fix wrong-code bug for -fstrict-volatile-bitfields
> + bool packedp = false; > + > + if (TREE_CODE(to) == COMPONENT_REF > + && (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (to, 0))) > + || (TREE_CODE (TREE_OPERAND (to, 1)) == FIELD_DECL > + && DECL_PACKED (TREE_OPERAND (to, 1) > + packedp = true; > > note that this is not a reliable way of determining if a field is packed > (yes, the existing code is broken in the same way). I realize that you > are only using this for diagnostic purposes, still something worth > mentioning. Indeed, ideally DECL_PACKED shouldn't be used outside stor-layout.c and dwarf2out.c in the middle-end. > I dislike passing around the packedp flag throughout the expander and to > warn inside the expander. First, the flag has a name that can be confusing > (it does not conservatively flag all packed accesses). Second, why don't > you warn for this from inside the frontend when the bitfield access is > generated? This way the diagnostic won't depend on optimization levels > and you avoid uglifying the expander. Seconded. If the -fstrict-volatile-bitfields support is still incomplete, let's take this opportunity to clean it up. Testing DECL_PACKED or issuing a warning from the RTL expander is to be avoided. -- Eric Botcazou
Re: [SH] PR 54089 - Logical right shifts
Oleg Endo wrote: > This adapts SH's logical right shift patterns to work/look the same way > as left shift patterns. It mainly fixes a few issues with dynamic shift > insn selection. > Tested on rev 190580 with > make -k check RUNTESTFLAGS="--target_board=sh-sim > \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" > > and no new failures. > OK? OK. Regards, kaz
Re: [C++ patch] PR 20420
Hi again, On 08/22/2012 05:13 PM, Jason Merrill wrote: On 08/22/2012 10:55 AM, Paolo Carlini wrote: . thus, in short, what is happening is that, for this testcase: class B { protected: enum E { E1, E2, E3 }; }; class D : private B { public: using B::E; private: enum E { }; }; we parse the new declaration enum E { }; and we reach supplement_binding_1 before setting the underlying type of the new declaration. The old declaration is fine, would not ICE dependent_type_p. So with your change would we still ICE if D were a template? It seems like what we should be checking for is null underlying type. In the meanwhile, I tried to actually write a patch doing away with the processing_template_decl check but failed. The problem is: if, as in the below patch "3", I don't call dependent_type_p when the underlying type is null and the check overall fails, I see regressions for testcases like forw_enum7.C, because when templates are involved we actually want to call dependent_type_p and return true; if, as in the below "4", a null underlying type means that overall the check succeeds, then we don't error out anymore for the new testcases, we don't ICE but we don't have the diagnostics either. I see the outcome of these experiments as an indication that we want the processing_template_decl check... Paolo. // Index: cp/name-lookup.c === --- cp/name-lookup.c(revision 190590) +++ cp/name-lookup.c(working copy) @@ -433,20 +433,25 @@ supplement_binding_1 (cxx_binding *binding, tree d bool ok = true; tree target_bval = strip_using_decl (bval); tree target_decl = strip_using_decl (decl); + tree target_bval_type = TREE_TYPE (target_bval); + tree target_decl_type = TREE_TYPE (target_decl); - if (TREE_CODE (target_decl) == TYPE_DECL && DECL_ARTIFICIAL (target_decl) + if (TREE_CODE (target_decl) == TYPE_DECL + && DECL_ARTIFICIAL (target_decl) && target_decl != target_bval && (TREE_CODE (target_bval) != TYPE_DECL /* We allow pushing an enum multiple times in a class template in order to handle late matching of underlying type on an opaque-enum-declaration followed by an enum-specifier. */ - || (TREE_CODE (TREE_TYPE (target_decl)) == ENUMERAL_TYPE - && TREE_CODE (TREE_TYPE (target_bval)) == ENUMERAL_TYPE - && (dependent_type_p (ENUM_UNDERLYING_TYPE - (TREE_TYPE (target_decl))) - || dependent_type_p (ENUM_UNDERLYING_TYPE - (TREE_TYPE (target_bval))) + || (TREE_CODE (target_decl_type) == ENUMERAL_TYPE + && TREE_CODE (target_bval_type) == ENUMERAL_TYPE + && ((ENUM_UNDERLYING_TYPE (target_decl_type) + && dependent_type_p (ENUM_UNDERLYING_TYPE + (target_decl_type))) + || (ENUM_UNDERLYING_TYPE (target_bval_type) + && dependent_type_p (ENUM_UNDERLYING_TYPE + (target_bval_type))) /* The new name is the type name. */ binding->type = decl; else if (/* TARGET_BVAL is null when push_class_level_binding moves @@ -469,8 +474,7 @@ supplement_binding_1 (cxx_binding *binding, tree d && DECL_ARTIFICIAL (target_bval) && target_decl != target_bval && (TREE_CODE (target_decl) != TYPE_DECL - || same_type_p (TREE_TYPE (target_decl), - TREE_TYPE (target_bval + || same_type_p (target_decl_type, target_bval_type))) { /* The old binding was a type name. It was placed in VALUE field because it was thought, at the point it was @@ -485,11 +489,11 @@ supplement_binding_1 (cxx_binding *binding, tree d && TREE_CODE (target_decl) == TYPE_DECL && DECL_NAME (target_decl) == DECL_NAME (target_bval) && binding->scope->kind != sk_class - && (same_type_p (TREE_TYPE (target_decl), TREE_TYPE (target_bval)) + && (same_type_p (target_decl_type, target_bval_type) /* If either type involves template parameters, we must wait until instantiation. */ - || uses_template_parms (TREE_TYPE (target_decl)) - || uses_template_parms (TREE_TYPE (target_bval + || uses_template_parms (target_decl_type) + || uses_template_parms (target_bval_type))) /* We have two typedef-names, both naming the same type to have the same name. In general, this is OK because of: Index: cp/name-lookup.c === --- cp/name-lookup.c(revision 190590) +++ cp/name-lookup.c(working copy) @@ -433,20 +433,25 @@ supplement_binding_1 (cxx_binding *binding, tree d bool ok = tru
[patch] rs6000: plug a leak
Hello Bill, This patch plugs a leak in rs6000.c:rs6000_density_test(). You have to free the array that get_loop_body returns. Noticed while going over all uses of get_loop_body (it's a common mistake to leak the return array). Patch is completely untested because I don't know when/how this function is used. You've added this function: 2012-07-31 Bill Schmidt <...> * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Revise costs for vec_perm and vec_promote_demote down to more natural values. (struct _rs6000_cost_data): New data structure. -->(rs6000_density_test): New function so I suppose you know what it's for and how to test this patch :-) Could you test this for me and commit it if nothing strange happens? Thanks, Ciao! Steven Index: config/rs6000/rs6000.c === --- config/rs6000/rs6000.c (revision 190601) +++ config/rs6000/rs6000.c (working copy) @@ -3509,6 +3509,7 @@ rs6000_density_test (rs6000_cost_data *d not_vec_cost++; } } + free (bbs); density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);
Re: [PATCH] Fix ARM constant-pool layout calculations under -falign-labels
Hi Richard, You never responded to this. Is there something wrong with this fix? Can you address whether it's sufficient for align_loops > align_labels and such cases that Julian Brown raised? A patch against the current trunk is below. Thanks, Roland On Wed, Aug 1, 2012 at 2:43 PM, Roland McGrath wrote: > Using e.g. -falign-labels=16 on ARM can confuse the constant-pool layout > code such that it places pool entries too far away from their referring > instructions. This change seems to fix it. > > I don't have a small test case, only a large one, which I haven't actually > tried to get to reproduce on any vanilla ARM target. But the logic of the > change seems straightforward and sound. gcc/ 2012-08-22 Roland McGrath * config/arm/arm.c (get_label_padding): Use align_labels as minimum. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 2805b7c..586d094 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -12409,6 +12409,7 @@ get_label_padding (rtx label) HOST_WIDE_INT align, min_insn_size; align = 1 << label_to_alignment (label); + align = MAX (align, align_labels); min_insn_size = TARGET_THUMB ? 2 : 4; return align > min_insn_size ? align - min_insn_size : 0; }
Re: [patch] rs6000: plug a leak
On Thu, 2012-08-23 at 00:53 +0200, Steven Bosscher wrote: > Hello Bill, > > This patch plugs a leak in rs6000.c:rs6000_density_test(). You have to > free the array that get_loop_body returns. Noticed while going over > all uses of get_loop_body (it's a common mistake to leak the return > array). > > Patch is completely untested because I don't know when/how this > function is used. You've added this function: > > 2012-07-31 Bill Schmidt <...> > > * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Revise > costs for vec_perm and vec_promote_demote down to more natural values. > (struct _rs6000_cost_data): New data structure. > -->(rs6000_density_test): New function > > so I suppose you know what it's for and how to test this patch :-) > > Could you test this for me and commit it if nothing strange happens? Sure thing! Thanks for catching this. Bill > > Thanks, > > Ciao! > Steven > > > > Index: config/rs6000/rs6000.c > === > --- config/rs6000/rs6000.c (revision 190601) > +++ config/rs6000/rs6000.c (working copy) > @@ -3509,6 +3509,7 @@ rs6000_density_test (rs6000_cost_data *d > not_vec_cost++; > } > } > + free (bbs); > >density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost); >
VxWorks Patches Back from the Dead!
Hello Everyone, I have ten patches which are approved or obvious but waiting on commit, each of which is attached to this email. Feel free to consider this a ping, HOWEVER, they are rebased onto the latest trunk so they're no longer stale. Additionally, I updated the commit messages and with complete proposed changelog entries. They all (inside the patch file) have a link to their approval message and the original proposal message. Note that ANY patch that touches fixincludes has a dependency on the first patch (0001...), as fixincludes WILL NOT RUN without this patch on a vxworks platform. Sorry for the long latency period but I did not have access to a computer for a substantial period of time over the summer. I've just been able to get everything back up and running. -- Robert Mason >From 991c277c1be41dff96b474a1e8bb9d297042ac70 Mon Sep 17 00:00:00 2001 From: rbmj Date: Wed, 30 May 2012 08:16:57 -0400 Subject: [PATCH 01/10] Add ability to skip the machine_name fixincludes fix. On some platforms, machine_name is overzealous, or even breaks things. This patch adds the functionality to skip the machine_name 'fix' by giving it an empty macro list. It also removes vxworks from the list of no-op fix platforms so that fixincludes will run on that platform (it needs it). Proposed by Robert Mason: http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00383.html Approved by Nathan Sidwell: http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00549.html Approved by Bruce Korb: http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00552.html Changes: [gcc] configure.ac: Add target-based selection of new exported variable skip_machine_name_fix configure: Regenerate Makefile.in: add check for SKIP_MACHINE_NAME_FIX [fixincludes] mkfixinc.sh: Remove vxworks from the list of no-op fixincludes platforms as it needs some fixes --- fixincludes/mkfixinc.sh |1 - gcc/Makefile.in | 15 +++ gcc/configure.ac| 14 ++ 3 files changed, 25 insertions(+), 5 deletions(-) diff --git a/fixincludes/mkfixinc.sh b/fixincludes/mkfixinc.sh index 89e8ab7..6653fed 100755 --- a/fixincludes/mkfixinc.sh +++ b/fixincludes/mkfixinc.sh @@ -15,7 +15,6 @@ case $machine in i?86-*-mingw32* | \ x86_64-*-mingw32* | \ i?86-*-interix* | \ -*-*-vxworks* | \ powerpc-*-eabisim* | \ powerpc-*-eabi*| \ powerpc-*-rtems* | \ diff --git a/gcc/Makefile.in b/gcc/Makefile.in index dddffb6..31b36eb 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -237,6 +237,9 @@ LINKER = $(CC) LINKER_FLAGS = $(CFLAGS) endif +# Whether or not to run the machine_name fixincludes fix +SKIP_MACHINE_NAME_FIX = @skip_machine_name_fix@ + # --- # Programs which operate on the build machine # --- @@ -4045,10 +4048,14 @@ install-gcc-tooldir: macro_list: s-macro_list; @true s-macro_list : $(GCC_PASSES) - echo | $(GCC_FOR_TARGET) -E -dM - | \ - sed -n -e 's/^#define \([^_][a-zA-Z0-9_]*\).*/\1/p' \ --e 's/^#define \(_[^_A-Z][a-zA-Z0-9_]*\).*/\1/p' | \ - sort -u > tmp-macro_list + @if test "$(SKIP_MACHINE_NAME_FIX)" != "yes" ; then \ + echo | $(GCC_FOR_TARGET) -E -dM - | \ + sed -n -e 's/^#define \([^_][a-zA-Z0-9_]*\).*/\1/p' \ + -e 's/^#define \(_[^_A-Z][a-zA-Z0-9_]*\).*/\1/p' | \ + sort -u > tmp-macro_list ; \ + else \ + echo > tmp-macro_list ; \ + fi $(SHELL) $(srcdir)/../move-if-change tmp-macro_list macro_list $(STAMP) s-macro_list diff --git a/gcc/configure.ac b/gcc/configure.ac index 7042c91..f5ff61c 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -5121,6 +5121,20 @@ if test x"${LINKER_HASH_STYLE}" != x; then [The linker hash style]) fi +# Check whether to enable the fixincludes machine_name hack on this platform +case "${target}" in +*-*-vxworks*) +skip_machine_name_fix="yes" +;; +*) +# Note that some platforms have fixincludes disabled by default so +# this will make no difference +skip_machine_name_fix="no" +;; +esac +AC_SUBST(skip_machine_name_fix) + + # Configure the subdirectories # AC_CONFIG_SUBDIRS($subdirs) -- 1.7.10.4 >From 5dd13cf6ae3cc76df8e51cbabf9ac6d7a879b880 Mon Sep 17 00:00:00 2001 From: rbmj Date: Mon, 4 Jun 2012 13:18:10 -0400 Subject: [PATCH 02/10] Added assert fixinclude hack for VxWorks. VxWorks's assert.h relies on adjacent string tokens being joined, and uses macros for some of the strings (e.g. __FILE__). However, it does not put a space after the end quote and before the macro, so instead of replacing the macro, gcc >= 4.7.x thinks it's a user-defined literal token, and since the lookup obviously fails, compilation of libstd
[google/gcc-4_7] Fix ICE in should_move_die_to_comdat
This patch is for the google/gcc-4_7 branch. If a function contains a local typedef of an anonymous structure, GCC will generate a typedef DIE in the function, where the typedef DIE points to a structure_type DIE at the top level. That structure_type DIE, if it's a non-POD, can contain ctor and dtor definitions. This causes an assertion in should_move_die_to_comdat to fail, as we have up to now assumed that this could never happen. Perhaps a better fix would be to fix the generation of the structure_type DIE somehow, but GDB seems happy with what we're generating, and we don't want to extract this type into a comdat type unit anyway, so this patch fixes things up so that we just skip the type instead of asserting. I'll explore other possible fixes for trunk, but this should be good for google/gcc-4_7. Tested with build and validate, and by compiling the failing test case. Google ref b/6669962. 2012-08-22 Cary Coutant * gcc/dwarf2out.c (should_move_die_to_comdat): A type definition can contain a subprogram definition, but don't move it to a comdat unit. Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 190603) +++ gcc/dwarf2out.c (working copy) @@ -7289,14 +7289,13 @@ should_move_die_to_comdat (dw_die_ref di case DW_TAG_structure_type: case DW_TAG_enumeration_type: case DW_TAG_union_type: - /* Don't move declarations, inlined instances, or types nested in a -subprogram. */ + /* Don't move declarations, inlined instances, types nested in a +subprogram, or types that contain subprogram definitions. */ if (is_declaration_die (die) || get_AT (die, DW_AT_abstract_origin) - || is_nested_in_subprogram (die)) + || is_nested_in_subprogram (die) + || contains_subprogram_definition (die)) return 0; - /* A type definition should never contain a subprogram definition. */ - gcc_assert (!contains_subprogram_definition (die)); return 1; case DW_TAG_array_type: case DW_TAG_interface_type:
Go patch committed: Remove unsafe.Pointer handling from type assertions
The Go frontend had some old code for special handling of unsafe.Pointer for type assertions. That is not part of the current language, and would actually break some otherwise valid programs. This patch removes the special handling, and fixes one piece of gccgo-specific code in libgo that used it. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian diff -r 4cd85453373c go/expressions.cc --- a/go/expressions.cc Mon Aug 20 22:28:53 2012 -0700 +++ b/go/expressions.cc Wed Aug 22 17:08:07 2012 -0700 @@ -12914,26 +12914,8 @@ void Type_guard_expression::do_check_types(Gogo*) { - // 6g permits using a type guard with unsafe.pointer; we are - // compatible. Type* expr_type = this->expr_->type(); - if (expr_type->is_unsafe_pointer_type()) -{ - if (this->type_->points_to() == NULL - && (this->type_->integer_type() == NULL - || (this->type_->forwarded() - != Type::lookup_integer_type("uintptr" - this->report_error(_("invalid unsafe.Pointer conversion")); -} - else if (this->type_->is_unsafe_pointer_type()) -{ - if (expr_type->points_to() == NULL - && (expr_type->integer_type() == NULL - || (expr_type->forwarded() - != Type::lookup_integer_type("uintptr" - this->report_error(_("invalid unsafe.Pointer conversion")); -} - else if (expr_type->interface_type() == NULL) + if (expr_type->interface_type() == NULL) { if (!expr_type->is_error() && !this->type_->is_error()) this->report_error(_("type assertion only valid for interface types")); @@ -12966,23 +12948,10 @@ tree Type_guard_expression::do_get_tree(Translate_context* context) { - Gogo* gogo = context->gogo(); tree expr_tree = this->expr_->get_tree(context); if (expr_tree == error_mark_node) return error_mark_node; - Type* expr_type = this->expr_->type(); - if ((this->type_->is_unsafe_pointer_type() - && (expr_type->points_to() != NULL - || expr_type->integer_type() != NULL)) - || (expr_type->is_unsafe_pointer_type() - && this->type_->points_to() != NULL)) -return convert_to_pointer(type_to_tree(this->type_->get_backend(gogo)), - expr_tree); - else if (expr_type->is_unsafe_pointer_type() - && this->type_->integer_type() != NULL) -return convert_to_integer(type_to_tree(this->type_->get_backend(gogo)), - expr_tree); - else if (this->type_->interface_type() != NULL) + if (this->type_->interface_type() != NULL) return Expression::convert_interface_to_interface(context, this->type_, this->expr_->type(), expr_tree, true, diff -r 4cd85453373c libgo/go/os/dir.go --- a/libgo/go/os/dir.go Mon Aug 20 22:28:53 2012 -0700 +++ b/libgo/go/os/dir.go Wed Aug 22 17:08:07 2012 -0700 @@ -49,7 +49,7 @@ file.dirinfo.dir = r } - entry_dirent := unsafe.Pointer(&file.dirinfo.buf[0]).(*syscall.Dirent) + entry_dirent := (*syscall.Dirent)(unsafe.Pointer(&file.dirinfo.buf[0])) size := n if size < 0 {
Re: VxWorks Patches Back from the Dead!
On 08/22/12 17:05, rbmj wrote: Hello Everyone, I have ten patches which are approved or obvious but waiting on commit The include fixing stuff looks fine to me. However I think it might be simpler to tweak mkfixinc.sh to sed '/if test -s .{MACRO_LIST}/s/$/ && false/' \ ${srcdir}/fixinc.in > ${target} for vxworks rather than all that configury rigmarole. That would eliminate changes to gcc/configure.ac and gcc/Makefile.in. Basically degrading the entire first patch to this: $ svn diff mkfixinc.sh Index: mkfixinc.sh === 1 == '-u' 2 == '-L' 3 == 'mkfixinc.sh (revision 190448)' 4 == '-L' 5 == 'mkfixinc.sh (working copy)' 6 == '.svn/text-base/mkfixinc.sh.svn-base' 7 == 'mkfixinc.sh' --- mkfixinc.sh (revision 190448) +++ mkfixinc.sh (working copy) @@ -15,7 +15,6 @@ i?86-*-mingw32* | \ x86_64-*-mingw32* | \ i?86-*-interix* | \ -*-*-vxworks* | \ powerpc-*-eabisim* | \ powerpc-*-eabi*| \ powerpc-*-rtems* | \ @@ -26,6 +25,11 @@ (echo "#! /bin/sh" ; echo "exit 0" ) > ${target} ;; +*-*-vxworks* ) +sed '/if test -s .{MACRO_LIST}/s/$/ && false/' \ +${srcdir}/fixinc.in > ${target} +;; + *) cat < ${srcdir}/fixinc.in > ${target} || exit 1 ;;
Another ping: Reorganized documentation for warnings -- attempt 2
I hope this can get looked at, thanks. http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01208.html
Re: [google/gcc-4_7] Fix ICE in should_move_die_to_comdat
On Wed, Aug 22, 2012 at 5:11 PM, Cary Coutant wrote: > This patch is for the google/gcc-4_7 branch. Approved for google/gcc-4_7 branch. Thanks, -- Paul Pluzhnikov
Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)
On Tue, Aug 21, 2012 at 11:18 PM, Jan Hubicka wrote: >> On Tue, Aug 21, 2012 at 6:56 PM, Jan Hubicka wrote: >> >> > I can go ahead with the histogram approach. There is some roundoff >> >> > error from the working set scaling approach that can affect different >> >> > merging orders as you note, although I think this only really affects >> >> > the >> >> > small counter values. The other place where saving/merging the histogram >> > >> > Do you have any intuition on why simple maximalization merging (that is >> > safe wrt >> > ordering) would be bad idea? >> >> When you say "maximalization merging" are you talking about the >> histogram merging approach I mentioned a few emails back (my response >> on Aug 19) where we assume the same relative order of hotness in the >> counters between runs, and accumulate the counter values in the >> histogram in that order? > > I was speaking of BB (counter) counts only here and considering the simplest > strategy: maximalize the BB counts in corresponding buckets and sum the > cummlative > values in the corresponding buckets. I'm concerned about inaccuracies arising when multiple runs have different sizes and therefore counters are in different buckets in the corresponding histograms. We would end up in a situation where the merged histogram has more "counters" in it than there are actual counters in the program. > > The strategy of scaling is more sensible, but it will also be sensitive to > order > of train runs, i.e. it will give unstable results with parallel make. Regarding the approach of merging histograms by matching up counters in the same order and apportioning out the histogram cumulative values evenly between counters in the same bucket before merging, I don't think the ordering will have a huge effect. The reason is that counters we end up matching together and accumulating will stay in the same relative order, and the min counter values are simply summed to compute the new bucket index and new min counter value to record. For the cumulative value, since the incoming cumulative values will be apportioned evenly between the counters coming from the same bucket when computing the new cumulative value (which is then just another add), there could be some slight differences from different round off errors when histograms are merged in different orders. But I would expect that to result in only small differences in the cumulative values in the histogram. Is that acceptable? >> > OK, so I guess we went through >> > 1) two pass updating with race in between pases. >> > 2) two pass updating with first pass updating countes and second having >> > race only for summary update. >> > (i.e. no races for counters) >> > 3) two pass updating with flocking (and some way to handle detected >> > deadlocks) >> > 4) one pass updating with histogram merging + maximalization of working >> > set. >> > (we do not realy need to scale the buckets, we can simply merge the >> > histograms >> > and then mutiply them by nruns before comparing to actual counters. >> >> By merging the histograms (and accumulating the counter values stored >> there as we merge), I don't think we need to multiply the counter >> values by nruns, do we? > > With the simple approach we need, but... >> >> > This assumes that working sets of all runs are about the same, but >> > should work >> > resonably in practice I think. >> > >> > I guess 3/4 are acceptable WRT bootstrap reproducibility. >> > >> > I have no experience with flocking large number of files and portability of >> > this solution i.e. to Windows. If you think that 2) would be too >> > inaccurate >> > in practice and 3) has chance to be portable, we could go for this. It >> > will >> > solve the precision problems and will also work for LIPO summaries. >> > I would be curious about effect on profiledbootstrap time of this if you >> > implement >> > it. >> >> I'm hoping that 2) will be accurate enough in practice, but it will >> need some investigation. > > Perhaps weighting all pros/cons you are right here. I think getting results > consistent across different order of runs is more important than the > possibility of race on the last update and 4) is probably getting quite a lot > of GIGO issues. > > We can implement more consistent locking mechanizm incrementally if this turns > out to be issue. > > It is posible to actually detect the race: at first round we are updating the > nruns. In the second round we can see if nruns has changed. If so, we have > problem since someone has already updated the file and will update the > histograms. > > I would suggest skiping update when nruns has changes, so the summary at least > match the counters in current file accurately (even though it will be off for > the whole program since whoever updated the file may have not seen all our > updates, just part of them). Just to reduce changes that the trashed run is > the most important one. Theoretically we could
Re: [PATCH, MIPS] fix MIPS16 jump table overflow
On 08/21/2012 02:23 PM, Richard Sandiford wrote: Would be nice to add a compile test for -mabi=64 just to make sure that Pmode == DImode works. A copy of an existing test like code-readable-1.c would be fine. I'm having problems with this part -- it seems like every combination of options with -mabi=64 I've tried with code-readable-1.c complains about something-or-another being incompatible. The closest I've come is "-mabi=64 -march=mips64 -msoft-float", which is accepted by the mipsisa32r2-sde-elf build I'm using for testing, but when I add -O to that I get an unrelated ICE: code-readable-1.c: In function 'bar': code-readable-1.c:25:1: error: unrecognizable insn: (insn 17 5 18 2 (set (reg/i:DI 2 $2) (high:DI (const:DI (unspec:DI [ (symbol_ref:DI ("k") [flags 0x40] 0xf72d14ac k>) ] 229 code-readable-1.c:25 -1 (nil)) code-readable-1.c:25:1: internal compiler error: in extract_insn, at recog.c:2125 And, even without -O, that combination of options gives a "sorry" on the mips-linux-gnu build I also have around. Here's the revised patch that addresses the other problems you pointed out. Is this part OK, at least? It passes regression testing. -Sandra 2012-08-22 Julian Brown Sandra Loosemore gcc/ * config/mips/mips.md (UNSPEC_CASESI_DISPATCH): New. (MIPS16_T_REGNUM): New constant. (tablejump): Don't use for MIPS16_SHORT_JUMP_TABLES. (casesi): New. (casesi_internal_mips16_): New. * config/mips/mips.c (mips16_split_long_branches): Adjust test to ignore casesi jump tables. * config/mips/mips.h (TARGET_MIPS16_SHORT_JUMP_TABLES): Update comment. (CASE_VECTOR_MODE): Use SImode unconditionally. (CASE_VECTOR_SHORTEN_MODE): Define. (ASM_OUTPUT_ADDR_DIFF_ELT): Output word-sized addr_diff_elts when necessary for MIPS16_SHORT_JUMP_TABLES. gcc/testsuite/ * gcc.target/mips/code-readable-1.c: Add -O to options. Index: gcc/config/mips/mips.md === --- gcc/config/mips/mips.md (revision 190463) +++ gcc/config/mips/mips.md (working copy) @@ -134,10 +134,14 @@ ;; Used in a call expression in place of args_size. It's present for PIC ;; indirect calls where it contains args_size and the function symbol. UNSPEC_CALL_ATTR + + ;; MIPS16 casesi jump table dispatch. + UNSPEC_CASESI_DISPATCH ]) (define_constants [(TLS_GET_TP_REGNUM 3) + (MIPS16_T_REGNUM 24) (PIC_FUNCTION_ADDR_REGNUM 25) (RETURN_ADDR_REGNUM 31) (CPRESTORE_SLOT_REGNUM 76) @@ -5904,14 +5908,9 @@ [(set (pc) (match_operand 0 "register_operand")) (use (label_ref (match_operand 1 "")))] - "" + "!TARGET_MIPS16_SHORT_JUMP_TABLES" { - if (TARGET_MIPS16_SHORT_JUMP_TABLES) -operands[0] = expand_binop (Pmode, add_optab, -convert_to_mode (Pmode, operands[0], false), -gen_rtx_LABEL_REF (Pmode, operands[1]), -0, 0, OPTAB_WIDEN); - else if (TARGET_GPWORD) + if (TARGET_GPWORD) operands[0] = expand_binop (Pmode, add_optab, operands[0], pic_offset_table_rtx, 0, 0, OPTAB_WIDEN); else if (TARGET_RTP_PIC) @@ -5937,6 +5936,94 @@ [(set_attr "type" "jump") (set_attr "mode" "none")]) +;; For MIPS16, we don't know whether a given jump table will use short or +;; word-sized offsets until late in compilation, when we are able to determine +;; the sizes of the insns which comprise the containing function. This +;; necessitates the use of the casesi rather than the tablejump pattern, since +;; the latter tries to calculate the index of the offset to jump through early +;; in compilation, i.e. at expand time, when nothing is known about the +;; eventual function layout. + +(define_expand "casesi" + [(match_operand:SI 0 "register_operand" "") ; index to jump on + (match_operand:SI 1 "const_int_operand" "") ; lower bound + (match_operand:SI 2 "const_int_operand" "") ; total range + (match_operand 3 "" "") ; table label + (match_operand 4 "" "")] ; out of range label + "TARGET_MIPS16_SHORT_JUMP_TABLES" +{ + if (operands[1] != const0_rtx) +{ + rtx reg = gen_reg_rtx (SImode); + rtx offset = gen_int_mode (-INTVAL (operands[1]), SImode); + + if (!arith_operand (offset, SImode)) +offset = force_reg (SImode, offset); + + emit_insn (gen_addsi3 (reg, operands[0], offset)); + operands[0] = reg; +} + + if (!arith_operand (operands[0], SImode)) +operands[0] = force_reg (SImode, operands[0]); + + operands[2] = GEN_INT (INTVAL (operands[2]) + 1); + + emit_jump_insn (PMODE_INSN (gen_casesi_internal_mips16, + (operands[0], operands[2], + operands[3], operands[4]))); + + DONE; +}) + +(define_insn "casesi_internal_mips16_" + [(set (pc) + (if_then_else + (leu (match_operand:SI 0 "register_operand" "d") + (match_operand:SI 1 "ari
Re: VxWorks Patches Back from the Dead!
On 8/22/2012 8:52 PM, Bruce Korb wrote: On 08/22/12 17:05, rbmj wrote: Hello Everyone, I have ten patches which are approved or obvious but waiting on commit The include fixing stuff looks fine to me. However I think it might be simpler to tweak mkfixinc.sh to sed '/if test -s .{MACRO_LIST}/s/$/ && false/' \ ${srcdir}/fixinc.in > ${target} for vxworks rather than all that configury rigmarole. That would eliminate changes to gcc/configure.ac and gcc/Makefile.in. I didn't even think of that. I was probably doing all my fixes in the Makefile because that was where I was looking with the original problem. However, I'm looking at the result of that sed command, and I don't think that the && is going to work, as when I run that sed command I just get whitespace there. IIRC, sed handles '&' differently so it probably needs to be escaped. I'll see if I get a chance to test that later on. Basically degrading the entire first patch to this: $ svn diff mkfixinc.sh Index: mkfixinc.sh === 1 == '-u' 2 == '-L' 3 == 'mkfixinc.sh(revision 190448)' 4 == '-L' 5 == 'mkfixinc.sh(working copy)' 6 == '.svn/text-base/mkfixinc.sh.svn-base' 7 == 'mkfixinc.sh' --- mkfixinc.sh(revision 190448) +++ mkfixinc.sh(working copy) @@ -15,7 +15,6 @@ i?86-*-mingw32* | \ x86_64-*-mingw32* | \ i?86-*-interix* | \ -*-*-vxworks* | \ powerpc-*-eabisim* | \ powerpc-*-eabi*| \ powerpc-*-rtems* | \ @@ -26,6 +25,11 @@ (echo "#! /bin/sh" ; echo "exit 0" ) > ${target} ;; +*-*-vxworks* ) +sed '/if test -s .{MACRO_LIST}/s/$/ && false/' \ +${srcdir}/fixinc.in > ${target} +;; + *) cat < ${srcdir}/fixinc.in > ${target} || exit 1 ;; Assuming that when I test everything works and I can get a successful build that works for me. Probably won't get a chance to test as I don't have very much time until the weekend and right now there's another regression in rs6000 (issue with one of the macros) that's breaking that I have to fix before I get to that part of the build process. Thanks for your input. That's certainly a simpler fix. -- Robert Mason
Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)
> > I'm concerned about inaccuracies arising when multiple runs have > different sizes and therefore counters are in different buckets in the > corresponding histograms. We would end up in a situation where the > merged histogram has more "counters" in it than there are actual > counters in the program. Yes, it is very inaccurate. > > > > > The strategy of scaling is more sensible, but it will also be sensitive to > > order > > of train runs, i.e. it will give unstable results with parallel make. > > Regarding the approach of merging histograms by matching up counters > in the same order and apportioning out the histogram cumulative values > evenly between counters in the same bucket before merging, I don't > think the ordering will have a huge effect. The reason is that > counters we end up matching together and accumulating will stay in the > same relative order, and the min counter values are simply summed to > compute the new bucket index and new min counter value to record. For > the cumulative value, since the incoming cumulative values will be > apportioned evenly between the counters coming from the same bucket > when computing the new cumulative value (which is then just another > add), there could be some slight differences from different round off > errors when histograms are merged in different orders. But I would > expect that to result in only small differences in the cumulative > values in the histogram. Is that acceptable? Yes, I think it will be small difference and it will give resonable results in practice. I am only concerned about reproducibility of the bootstraps that... > Ok, I think that makes sense. > > If it is ok with you, I'd prefer to go with a 3 step approach: > 1) Implement some form of histogram merging (one of the options we > discussed above). (Your approach #4). > 2) Implement approach #2 (restructure into 2 passes with counters > being updated accurately and a possible race on the histogram > computation). > 3) Investigate and implement either approach #3 (some kind of global > lock) or the race detection you describe above. Yes, this sounds like good plan. > > > > > We will still have problems with bootstrap: the summary of libbackend.a will > > depend on what of compiler binaries has executed last, since cc1 will have > > different summary from cc1plus becuase frontend is different. I would give > > extra points to voluteer who changes libgcov to do multiple summaries for > > different programs as intended originally (if you search archives in > > 2001-2003, > > somewhere are patches that changes libgcov from not merging at all to > > merging > > always. I guess changing it to merge just when program checksum match is not > > that hard). > > In this case, I assume that currently the merge will abort because the > # counters, etc don't match? We only abort when number of counters in current file does not match (i.e. the gcda file is from difference compilation). We accept when the overall program is different, i.e. libbackend.a is linked into multiple binaries. Since the histogram summing is global across all profiled libraries, we will end up with different histograms depending on what program using given library was executed last. To get this stable I think it is only possible to go with #3 (i.e. program wide global locking) and get different records for different programs (i.e. do checksum of all objects libgcov sees and merge only with matching checksums). In coverage.c we then can either choose the proper record whendoing LTO and knowing what program we are just compiling or do hitogram merging in your way but after sorting the entries into some canonical order (i.e. increasing checksum). So we can track this incrementally. I guess if you go with step 1) as you describe and we can move ahead. Other thing I have patch for is to make LTO ipa-profile to produce more accurate histograms based on real instruction counts (here we can read in all gcda files, turn them into CFG profiles, and do our own summary). I suppose that once you merge in your patches I can update this and we will have two mechanizms so we will also have data on how your implementation diverges from precise solution. Thanks! Honza > > Thanks, > Teresa > > > > > Thanks, > > Honza > >> > >> Thanks, > >> Teresa > >> > >> > > >> > Honza > >> >> > >> >> David > >> >> > >> >> > >> >> > >> >> > > >> >> > Thanks, > >> >> > Teresa > >> >> > > >> >> > > > >> >> >> > >> > >> >> >> > >> > >> >> >> > >> > 2) Do we plan to add some features in near future that will > >> >> >> anyway require global locking? > >> >> >> > >> > I guess LIPO itself does not count since it streams its > >> >> >> > >> > data > >> >> >> into independent file as you > >> >> >> > >> > mentioned earlier and locking LIPO file is not that hard. > >> >> >> > >> > Does LIPO stream everything into that common file, or > >> >> >> > >> > does it > >> >> >> use combination of gcda files > >> >> >> > >> >
Re: [patch, fortran] Reduce Explosion of noise from -Wall
On Wed, Aug 22, 2012 at 10:47:34PM +0200, Thomas Koenig wrote: > Am 22.08.2012 11:18, schrieb Tobias Burnus: > > >Regarding -Wcompare-real, I wonder whether it makes sense to either > >ignore comparisions against zero or to put them into a different flag > >(-Wcompare-real-zero); > > Here is a patch to not warn for comparisons against zero. > Regression-tested, also tested with make info and make dvi. > > OK for trunk? > I would rather see the -Wcompare-real option removed from -Wall. The patch does nothing for the 158 warnings for comparison with ONE in BLAS. This, of course, is just my opinion and you're moe than welcomed to ignore it. -- Steve
Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support
On 08/15/2012 03:43 AM, Dodji Seketeli wrote: Or we could just require people to put the attribute in the right place (or one of the right places) if they want it to apply to the decl. That is, either at the beginning of the declaration statement or after the declarator-id. Just to make sure I understand, you mean, for tagged types? Because e.g for scalars, I feel like having is within the spirit of [dcl.spec]/1 const int [[some-attr]] foo = 10; (with [[some-attr]] applying to the integer type) right? Yes, in this declaration [[some-attr]] applies to the type. Whether or not that makes sense depends on the attribute and the type. Specifically, it doesn't ever make sense for classes, and I'm resistant to doing it for any types until we clarify how attributes participate in the type system. C++11 doesn't clarify that because it doesn't have any attributes that can be used in this context. I think we should start out with a fairly strict interpretation of the rules, and possibly relax them later, rather than try to relax them now. For unused though, I am not sure how to do that in an appropriate manner. An idea? What does it mean to say that int is unused? It seems meaningless to me. OK, in handle_unused_attribute, I prevent 'unused' from applying to types, when used in the c+11 attribute syntax. Well, it does make sense to specify 'unused' in a class definition (i.e. when ATTR_FLAG_TYPE_IN_PLACE is set), it just doesn't make sense to apply it to a type-specifier. + if (TREE_CODE (*node) != UNION_TYPE) +{ + if (flags & ATTR_FLAG_CXX11) + { + /* transparent_union is being used as a c++11 attribute. In +this mode we force it to apply to a union types only. */ + warning (OPT_Wattributes, + "attribute %qE applies to union types only", name); + return NULL_TREE; + } +} I think it would be better to disable looking through the TYPE_DECL in this case and then share the warning with the normal case. +check_cxx_fundamental_alignment_constraints (tree node, In this function it would be nice to print out the requested and maximum alignments in the error. And I wonder if we want to offer this as an optional warning for GNU attribute syntax. + else if (flags & ATTR_FLAG_CXX11 + && DECL_ALIGN (decl) > (1U << i) * BITS_PER_UNIT) +/* C++-11 [dcl.align/4]: + [When multiple alignment-specifiers are specified for an + entity, the alignment requirement shall be set to the + strictest specified alignment]. */ +*no_add_attrs = true; This seems like a good change in general; I'd be inclined to drop the check for C++11 syntax. If there are multiple conflicting alignments specified for a declaration, the only things that make sense are to choose the largest alignment or give an error; the current behavior of choosing the most recently-specified alignment is just broken. + if (cxx11_attribute_p (declspecs->attributes)) + warning_at (declspecs->locations[ds_attribute], + OPT_Wattributes, + "ignoring attribute applying to missing declarator"); Does this distinguish between attributes before the class-specifier (which appertain to any declarations) and attributes after the class-specifier (which appertain to the type as used in any declarations)? I think it would be helpful to clarify that the attributes apply to declarators because they're at the beginning of the declaration, and keep the inform call for this case as well. + if (std_attrs != NULL_TREE) + { + /* Attributes should be parsed as part of the the + declaration, so let's un-parse them. */ + cp_lexer_rollback_tokens (parser->lexer); + std_attrs = NULL_TREE; + } Huh? Attributes before a label appertain to the label, not the labeled statement. And wouldn't this cause us to parse the label in an infinite loop? +However, in G++ we cannot apply attributes to an already +built *tagged* type, as that breaks some invariants. So, +for tagged types, to comply with [dcl.spec]/1 which says: + +[The attribute-specifier-seq affects the type only for + the declaration it appears in, not other declarations + involving the same type] + + we are going to apply the attributes to the declaration + that is carrying this type, just like what is done for GNU + attributes. decl_attributes then knows how to apply the + attributes to the type of that declaration only. For + non-tagged types, we are going to just apply the + attribute. */ As discussed above, we don't want to do this. The attributes appertain to the type; for tagged types, that doesn't work, but that just means the user's (new) code is wrong, not that we should
Re: [C++ patch] PR 20420
On 08/22/2012 12:01 PM, Paolo Carlini wrote: At this point, let me know, I could either add to the testcase a templated variant like the above (see attached), or rework the code to explicitly check the underlying type (I would add locals hosting the TREE_TYPEs to shorten a bit things, etc). The below passes testing, anyway. OK. Jason
Re: [PATCH] fix wrong-code bug for -fstrict-volatile-bitfields
On 08/22/2012 03:27 PM, Eric Botcazou wrote: + bool packedp = false; + + if (TREE_CODE(to) == COMPONENT_REF +&& (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (to, 0))) + || (TREE_CODE (TREE_OPERAND (to, 1)) == FIELD_DECL +&& DECL_PACKED (TREE_OPERAND (to, 1) + packedp = true; note that this is not a reliable way of determining if a field is packed (yes, the existing code is broken in the same way). I realize that you are only using this for diagnostic purposes, still something worth mentioning. Indeed, ideally DECL_PACKED shouldn't be used outside stor-layout.c and dwarf2out.c in the middle-end. I dislike passing around the packedp flag throughout the expander and to warn inside the expander. First, the flag has a name that can be confusing (it does not conservatively flag all packed accesses). Second, why don't you warn for this from inside the frontend when the bitfield access is generated? This way the diagnostic won't depend on optimization levels and you avoid uglifying the expander. Seconded. If the -fstrict-volatile-bitfields support is still incomplete, let's take this opportunity to clean it up. Testing DECL_PACKED or issuing a warning from the RTL expander is to be avoided. Well, I'm willing to give it a shot, but from a pragmatic point of view I've only got about a week left to wind up the current batch of patches I've got pending before I'm reassigned to a new project. Can one of you give a more specific outline of how this should be fixed, to increase the chances that I can actually implement an acceptable solution in the time available? In particular, the packedp flag that's being passed down is *not* just used for diagnostics; it changes code generation, and this code is scary enough that I'm worried about preserving the current behavior (which IIUC is required by the ARM EABI) if this is restructured. If only the diagnostics should be moved elsewhere, where, exactly? Or is there just a better test that can be used in the existing places to determine whether a field is packed? -Sandra the confused
Go patch committed: comparisons are untyped boolean
The Go language was changed slightly so that comparisons return an untyped boolean value rather than the named type "bool". This patch implements that change in gccgo. One of the tests in the testsuite changed as well. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian Index: gcc/go/gofrontend/expressions.h === --- gcc/go/gofrontend/expressions.h (revision 190404) +++ gcc/go/gofrontend/expressions.h (working copy) @@ -623,9 +623,9 @@ class Expression // Return a tree implementing the comparison LHS_TREE OP RHS_TREE. // TYPE is the type of both sides. static tree - comparison_tree(Translate_context*, Operator op, Type* left_type, - tree left_tree, Type* right_type, tree right_tree, - Location); + comparison_tree(Translate_context*, Type* result_type, Operator op, + Type* left_type, tree left_tree, Type* right_type, + tree right_tree, Location); // Return a tree for the multi-precision integer VAL in TYPE. static tree @@ -1149,7 +1149,7 @@ class Binary_expression : public Express Binary_expression(Operator op, Expression* left, Expression* right, Location location) : Expression(EXPRESSION_BINARY, location), - op_(op), left_(left), right_(right) + op_(op), left_(left), right_(right), type_(NULL) { } // Return the operator. @@ -1280,6 +1280,8 @@ class Binary_expression : public Express Expression* left_; // The right hand side operand. Expression* right_; + // The type of a comparison operation. + Type* type_; }; // A call expression. The go statement needs to dig inside this. Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc (revision 190608) +++ gcc/go/gofrontend/expressions.cc (working copy) @@ -5075,7 +5075,7 @@ Binary_expression::do_lower(Gogo* gogo, &right_nc, location, &result)) return this; - return Expression::make_cast(Type::lookup_bool_type(), + return Expression::make_cast(Type::make_boolean_type(), Expression::make_boolean(result, location), location); @@ -5106,10 +5106,7 @@ Binary_expression::do_lower(Gogo* gogo, { int cmp = left_string.compare(right_string); bool r = Binary_expression::cmp_to_bool(op, cmp); - return Expression::make_cast(Type::lookup_bool_type(), - Expression::make_boolean(r, -location), - location); + return Expression::make_boolean(r, location); } } } @@ -5327,15 +5324,15 @@ Binary_expression::do_type() switch (this->op_) { -case OPERATOR_OROR: -case OPERATOR_ANDAND: case OPERATOR_EQEQ: case OPERATOR_NOTEQ: case OPERATOR_LT: case OPERATOR_LE: case OPERATOR_GT: case OPERATOR_GE: - return Type::lookup_bool_type(); + if (this->type_ == NULL) + this->type_ = Type::make_boolean_type(); + return this->type_; case OPERATOR_PLUS: case OPERATOR_MINUS: @@ -5346,6 +5343,8 @@ Binary_expression::do_type() case OPERATOR_MOD: case OPERATOR_AND: case OPERATOR_BITCLEAR: +case OPERATOR_OROR: +case OPERATOR_ANDAND: { Type* type; if (!Binary_expression::operation_type(this->op_, @@ -5453,6 +5452,16 @@ Binary_expression::do_determine_type(con } this->right_->determine_type(&subcontext); + + if (is_comparison) +{ + if (this->type_ != NULL && !this->type_->is_abstract()) + ; + else if (context->type != NULL && context->type->is_boolean_type()) + this->type_ = context->type; + else if (!context->may_be_abstract) + this->type_ = Type::lookup_bool_type(); +} } // Report an error if the binary operator OP does not support TYPE. @@ -5664,7 +5673,7 @@ Binary_expression::do_get_tree(Translate case OPERATOR_LE: case OPERATOR_GT: case OPERATOR_GE: - return Expression::comparison_tree(context, this->op_, + return Expression::comparison_tree(context, this->type_, this->op_, this->left_->type(), left, this->right_->type(), right, this->location()); @@ -6125,8 +6134,8 @@ Expression::make_binary(Operator op, Exp // Implement a comparison. tree -Expression::comparison_tree(Translate_context* context, Operator op, - Type* left_type, tree left_tree, +Expression::comparison_tree(Translate_context* context, Type* result_type, + Operator op, Type* left_type, tree left_tree, Type* right_type, tree right_tree, Location location) { @@ -6367,7 +6376,13 @@ Expression::comparison_tree(Translate_co if (left_tree == error_mark_node || right_tree == error_mark_node) return error_mark_node; - tree ret = fold_build2(code, boolean_type_node, left_tree, right_tree); + tree result_type_tree; + if (result_type == NULL) +result_type_tree = boolean_type_node; + else
Re: [Fortran] PR37336 - FIINAL patch [1/n]: Implement the finalization wrapper subroutine
* PING * On August 19, 2012, Tobias Burnus wrote: Dear all, attached is a slightly updated patch: * Call finalizers of nonallocatable, nonpointer components * Generate FINAL wrapper for abstract types which have a finalizer. (The allocatable components are deallocated in the first type (abstract or not) which has a finalizer, i.e. abstract + finalizer or first nonabstract type.) I had to disable some resolve warning; I did so by introducing an attr.artificial. I used it to also fix PR 51632, where we errored out for __def_init and __copy where there were coarray components. Build and regtested on x86-64-linux. OK for the trunk? Tobias
Re: [PATCH, MIPS] fix MIPS16 jump table overflow
Sandra Loosemore writes: > 2012-08-22 Julian Brown > Sandra Loosemore > > gcc/ > * config/mips/mips.md > (UNSPEC_CASESI_DISPATCH): New. > (MIPS16_T_REGNUM): New constant. > (tablejump): Don't use for MIPS16_SHORT_JUMP_TABLES. > (casesi): New. > (casesi_internal_mips16_): New. > * config/mips/mips.c (mips16_split_long_branches): Adjust test > to ignore casesi jump tables. > * config/mips/mips.h (TARGET_MIPS16_SHORT_JUMP_TABLES): Update > comment. > (CASE_VECTOR_MODE): Use SImode unconditionally. > (CASE_VECTOR_SHORTEN_MODE): Define. > (ASM_OUTPUT_ADDR_DIFF_ELT): Output word-sized addr_diff_elts > when necessary for MIPS16_SHORT_JUMP_TABLES. > > gcc/testsuite/ > * gcc.target/mips/code-readable-1.c: Add -O to options. OK, thanks. Richard