RE: [PATCH Version 2][RFA]Improving register pressure directed hoist
> -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On > Behalf Of Bin Cheng > Sent: Tuesday, November 06, 2012 9:04 PM > To: 'Jeff Law' > Cc: gcc-patches@gcc.gnu.org > Subject: RE: [PATCH Version 2][RFA]Improving register pressure directed hoist > > > > > -Original Message- > > From: Jeff Law [mailto:l...@redhat.com] > > Sent: Tuesday, November 06, 2012 4:51 AM > > To: Bin Cheng > > Cc: gcc-patches@gcc.gnu.org > > Subject: Re: [PATCH Version 2][RFA]Improving register pressure > > directed > hoist > > > > On 11/02/2012 02:34 AM, Bin Cheng wrote: > > >> > > > Also I don't understand why the bogus patch can catch more hoist > > > opportunities and improve code size, so please help if you have any > > > idea about this. > > Well, perturbing the code, particularly in a way which is supposed to > change > > the amount of register pressure is certainly going to affect the > allocators > > and reload. > > > > It shouldn't be that hard to look at results which differ between the > > two patches and analyze why they're different. > I will try to investigate this issue. > > > > > > > > > 2012-11-02 Bin Cheng > > > > > > * gcse.c: (struct bb_data): Add new fields, old_pressure, live_in > > > and backup. > > > (calculate_bb_reg_pressure): Initialize live_in and backup. > > > (update_bb_reg_pressure): New. > > > (should_hoist_expr_to_dom): Add new parameter from. > > > Monitor the change of reg pressure and use it to drive hoisting. > > > (hoist_code): Update LIVE and reg pressure information. > > > > > > gcc/testsuite/ChangeLog > > > 2012-11-02 Bin Cheng > > > > > > * gcc.dg/hoist-register-pressure-3.c: New test. > > > > > > > > > +/* Update register pressure for BB when hoisting an expression from > > > + instruction FROM, if live ranges of inputs are shrunk. Also > > > + maintain live_in information if live range of register referred > > > + in FROM is shrunk. > > > + > > > + Return 0 if register pressure doesn't change, otherwise return > > > + the number by which register pressure is decreased. > > > + > > > + NOTE: Register pressure won't be increased in this function. */ > > > + > > > +static int > > > +update_bb_reg_pressure (basic_block bb, rtx from, > > > + enum reg_class pressure_class, int nregs) { > > > + rtx dreg, insn; > > > + basic_block succ_bb; > > > + df_ref *op, op_ref; > > > + edge succ; > > > + edge_iterator ei; > > > + int decreased_pressure = 0; > > > + > > > + for (op = DF_INSN_USES (from); *op; op++) > > > +{ > > > + dreg = DF_REF_REAL_REG (*op); > > > + /* The live range of register is shrunk only if it isn't: > > > + 1. referred on any path from the end of this block to EXIT, or > > > + 2. referred by insns other than FROM in this block. */ > > > + FOR_EACH_EDGE (succ, ei, bb->succs) > > > + { > > > + succ_bb = succ->dest; > > > + if (succ_bb == EXIT_BLOCK_PTR) > > > + continue; > > > + > > > + if (bitmap_bit_p (BB_DATA (succ_bb)->live_in, REGNO (dreg))) > > > + break; > > > + } > > > + if (succ != NULL) > > > + continue; > > > + > > > + op_ref = DF_REG_USE_CHAIN (REGNO (dreg)); > > > + for (; op_ref; op_ref = DF_REF_NEXT_REG (op_ref)) > > > + { > > > + if (!DF_REF_INSN_INFO (op_ref)) > > > + continue; > > > + > > > + insn = DF_REF_INSN (op_ref); > > > + if (BLOCK_FOR_INSN (insn) == bb > > > + && NONDEBUG_INSN_P (insn) && insn != from) > > > + break; > > > + } > > > + > > > + /* Decrease register pressure and update live_in information for > > > + this block. 
*/ > > > + if (!op_ref) > > > + { > > > + decreased_pressure += nregs; > > > + BB_DATA (bb)->max_reg_pressure[pressure_class] -= nregs; > > > + bitmap_clear_bit (BB_DATA (bb)->live_in, REGNO (dreg)); > > > + } > > > +} > > > + return decreased_pressure; > > So we're looking to see if any of the registers used in FROM are used > after > > from. If none are used, then we decrease the register pressure by > > nregs > which > > appears to be a property of the the registers *set* in FROM. > > Is seems like there's some inconsistency here. Or have I > > misunderstood something? > > > > I'm not sure how much it matters in practice, except perhaps for > conversions > > and the like where the source and destination operands are different > modes. > > Agreed, I missed the inconsistence of register_class/number between input and > output. I will fix this issue and measure the effect once I get back to office. > Hi Jeff, I modified the patch updating reg_class/num of input registers, rather than output register. The code size info collected shows no obvious change from last patch. Bootstrapped on x86, tested on x86-linux and arm-none-eabi/Thumb1. Is it OK? Thanks 2012-11-08 Bin Cheng * gcse.c: (struct bb_data): Add new fields, old_pressure, live_in and backup. (calculate_bb_reg_pressure): Initializ
Re: [PATCH] Vtable pointer verification, gcc changes (patch 2 of 2)
On 11/05/2012 06:48 PM, Caroline Tice wrote: As requested, I have split the original patch into two parts: GCC changes and runtime library changes. The attached patch is for the gcc changes. Out of curiosity, what's the primary source of wrong vtable values you expect? Use-after-free issues, heap spraying, or something else? -- Florian Weimer / Red Hat Product Security Team
Re: [PATCH] Fix fold reassociation (PR c++/55137)
> The first (C++) testcase is rejected since my SIZEOF_EXPR folding deferral > changes, the problem is that > -1 + (int) (sizeof (int) - 1) > is first changed into > -1 + (int) ((unsigned) sizeof (int) + UINT_MAX) > and then fold_binary_loc reassociates it in int type into > (int) sizeof (int) + [-2(overflow)], thus introducing overflow where > there hasn't been originally. maybe_const_value then refuses to fold it > into constant due to that. I've fixed that by the moving the TYPE_OVERFLOW > check from before the operation to a check whether the whole reassociation > doesn't introduce overflows (while previously it was testing solely > for overflow introduced on fold_converted lit0/lit1 being combined > together, not e.g. when overflow was introduced by fold_converting lit0 > resp. lit1, or for minus_lit{0,1} constants). FWIW, no objections by me. > Looking at that lead me to the second testcase below, which I hope is > valid C, the overflow happens there in unsigned type, thus with defined > wrapping, then there is implementation defined? conversion to signed type > (but all our targets are two's complement), and associate_trees: > was reassociating it in signed type, thus introducing signed overflow > where there wasn't before. Fixed by doing the reassociation in the unsigned > type if (at least) one of the operands is of unsigned type. I guess the natural question is then: if we start to change the type of the operation, why not always reassociate in the unsigned version of the type? int foo (int x, int y) { return (x + 1) + (int) (y + 1); } int bar (int x, unsigned int y) { return (x + 1) + (int) (y + 1); } -- Eric Botcazou
Re: [PING^2] [C++ PATCH] Add overflow checking to __cxa_vec_new[23]
On 11/06/2012 05:01 PM, Florian Weimer wrote: On 11/06/2012 04:55 PM, Jason Merrill wrote: On 11/05/2012 12:52 PM, Florian Weimer wrote: +// Avoid use of none-overridable new/delete operators in shared Typo: that should be "non-overridable" Jason Thanks, this patch fixes both instances. I figured I could commit this as obvious, so I did that in r193326. -- Florian Weimer / Red Hat Product Security Team
Re: [PING^2] [C++ PATCH] Add overflow checking to __cxa_vec_new[23]
On 11/08/2012 10:54 AM, Florian Weimer wrote: I figured I could commit this as obvious, so I did that in r193326. Of course. Thanks again! Paolo. PS: as you may have noticed, I adjusted your new code to not throw but instead abort when __EXCEPTIONS is not defined: in general, we want the library to build also with -fno-exceptions, and throw doesn't even compile in that case. Of course such a library isn't strictly conforming, but I know quite a few users want the option.
Re: [PATCH] Fix fold reassociation (PR c++/55137)
On Thu, Nov 08, 2012 at 10:37:00AM +0100, Eric Botcazou wrote: > I guess the natural question is then: if we start to change the type of the > operation, why not always reassociate in the unsigned version of the type? > > int > foo (int x, int y) > { > return (x + 1) + (int) (y + 1); > } I was afraid of preventing optimizations based on the assumption that the operations don't overflow. Perhaps we could do that just if with signed atype a TREE_OVERFLOW would be introduced (i.e. when my current patch returns NULL: + /* Don't introduce overflows through reassociation. */ + if (!any_overflows + && ((lit0 && TREE_OVERFLOW (lit0)) + || (minus_lit0 && TREE_OVERFLOW (minus_lit0 + return NULL_TREE; it could instead for INTEGER_TYPE_P (atype) do atype = build_nonstandard_integer_type (TYPE_PRECISION (type), 1); and retry (can be done also by atype = build_nonstandard_integer_type (TYPE_PRECISION (type), 1); return fold_convert_loc (loc, type, fold_binary_loc (loc, code, atype, fold_convert (atype, arg0), fold_convert (atype, arg1))); ). E.g. for int foo (int x, int y) { return (x + __INT_MAX__) + (int) (y + __INT_MAX__); } where __INT_MAX__ + __INT_MAX__ overflows, but if say x and y are (- __INT_MAX__ / 4 * 3), then there is no overflow originally. But we'd give up on this already because there are two different variables. So perhaps better int foo (int x) { return (x + __INT_MAX__) + (int) (x + __INT_MAX__); } And similarly the retry with unsigned type could be done for the var0 && var1 case where it sets ok = false;. > > int > bar (int x, unsigned int y) > { > return (x + 1) + (int) (y + 1); > } Jakub
Re: [patch] PR55191 - ICE in tree-ssa-pre.c due to missing fake edge for an infinite loop
On Mon, Nov 05, 2012 at 10:59:45PM +0100, Steven Bosscher wrote: > I hadn't expected that cfganal.c's reverse-CFG DFS would actually > depend on the order of the basic blocks. The attached patch fixes that > small oversight... > > Bootstrapped&tested on powerpc64-unknown-linux-gnu. OK for trunk? > > Ciao! > Steven > Please put the PR number into the ChangeLog entry. > * cfganal.c (connect_infinite_loops_to_exit): Call dfs_deadend > from here. > (flow_dfs_compute_reverse_execute): Don't call it here. Ok. > @@ -0,0 +1,14 @@ > +/* PR tree-optimization/55191 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +int a, b; > + > +void f(void) > +{ > + b = a || b; > + > + for(a = 0; a < 2; a++); > + while(1); > +} > + No need for the trailing line. Jakub
Re: [trans-mem][rfc] Improvements to uninstrumented code paths
On Wed, 2012-11-07 at 15:01 -0800, Richard Henderson wrote: > I wrote the second of these patches first, and I'm uncertain about the > desirability of the first of the patches. > > While working on the uninstrumented code path bulk patch, I noticed that > nested transactions within the copy of the outermost transaction I assume "within" refers to within the same static scope of the function? > were > not being processed for an uninstrumented code path, and so were only > receiving an instrumented path. This is clearly less than ideal when > considering HTM. Yes, it would be better if nested transactions would have an uninstrumented code path too. > Now, it seemed to me that if we're already in an uninstrumented code > path, we're extremely likely to want to stay in one. This is certainly > true for HTM, as well as when we've selected the serialirr method. I > can't think off hand of any other reason we'd be on the uninstrumented > code path. > > Therefore the second patch arranges for all nested transactions in the > uninstrumented path to _only_ have uninstrumented paths themselves. This is only possible if the nested transaction cannot abort. If it potentially can, then we must have an instrumented code path for it. > While reviewing the results of this second patch in detail, I noticed > that nested transactions on the instrumented code paths were not > generating both instrumented and uninstrumented code paths. My first > reaction was that this was a bug, and so I wrote the first patch to > fix it. > > But as I was reviewing the patch to write the changelog, I began to > wonder whether the same logic concerning the original instrumented/ > uninstrumented choice applies as well to the instrumented path. > > Is it ever likely that we'd choose an uninstrumented path for a > nested transaction, given that we're already executing the instrumented > path for an outer transaction? It could be a performance benefit in cases like __transaction_relaxed { // something if (foo) { unsafe_action(); transaction_relaxed { // lots of code } } } Here, we go serial at unsafe_action and can then cause less overall slowdown when running "lots of code" as uninstrumented (but it's just avoiding the per-memory-access function calls). If the txn can acquire the serial lock and commit afterwards, then it doesn't have to restart at the outermost txn. I can't say whether such cases would appear often enough in practice to offset the code size increase (and thus potential slowdowns) caused by having uninstrumented code paths available too. > It now seems to me that the only time we'd switch from instrumented > to uninstrumented code path would be during a transaction restart, > after having selected to retry with a serialirr method. > > Which means that I should apply the second patch only, > > Thoughts? Another thing to consider is whether flat nesting could be used: for nested transactions that have hasNoAbort set, we can elide txn begin/commit calls and just embed the nested txn body in the enclosing txn. This would be a benefit on uninstrumented/uninstrumented code path pairs because the nested txn can't do anything but keep running. For instrumented/instrumented (with hasNoAbort set), we could also flatten but there is a performance trade-off: the TM runtime lib could do closed nesting on data conflicts, so that it tries to roll back only the inner txn on a conflict. The potential benefit of such a scheme depends a lot on the workload, and libitm currently also flattens txns that do not abort. 
OTOH, the overhead of the begin/commit calls for nested txns should also be relatively small compared to overall STM/instrumented overheads. So, to summarize: - If the txn could abort, then it must have instrumented and could have uninstrumented. - If the nested txn does not abort, then: - For uninstrumented / uninstrumented nesting, flat nesting makes sense (and thus we don't need instrumented for the nested). - For instrumented / instrumented, we could flatten, but unsure whether that significantly improves performance. - Nested uninstrumented could be useful with enclosing nested, but perhaps this doesn't happen often enough to offset the code size increase. Thoughts? Torvald
Re: [trans-mem][rfc] Improvements to uninstrumented code paths
On Wed, 2012-11-07 at 15:08 -0800, Andi Kleen wrote: > Richard Henderson writes: > > > > Is it ever likely that we'd choose an uninstrumented path for a > > nested transaction, given that we're already executing the instrumented > > path for an outer transaction? > > I don't see why not. A small inner transaction may well succeed > in HTM, even if the big outer one does not. It could succeed, but this would require a HyTM scheme: when the enclosing txn is a SW txn synchronizing with other SW txns, then the nested HW txn would have to synchronize with those other SW txns too. So, if the STM has a serialized commit phase or something like that, then you can run the uninstrumented code path from within HW txns (because the HyTM part can be done on begin). But for all HyTMs that do need instrumentation, it wouldn't work.
Re: [v3] Fix profile mode failures
On 11/08/2012 02:58 AM, Paolo Carlini wrote: On 11/08/2012 02:37 AM, Jonathan Wakely wrote: Bah, it's nothing to do with me, the profile-mode list should never have worked! I'm testing this overnight. Ah! Thanks! By the way, I'm also seeing this: WARNING: 23_containers/unordered_multimap/insert/55028-debug.cc compilation failed to produce executable but in this case the issue is even more trivial: we are explicitly passing -D_GLIBCXX_DEBUG - which lately we are supposed to not do - and it conflicts with profile-mode. I'll fix it. Paolo.
Re: [trans-mem][rfc] Improvements to uninstrumented code paths
On Thu, 2012-11-08 at 13:33 +0100, Torvald Riegel wrote: > On Wed, 2012-11-07 at 15:01 -0800, Richard Henderson wrote: > > I wrote the second of these patches first, and I'm uncertain about the > > desirability of the first of the patches. > > > > While working on the uninstrumented code path bulk patch, I noticed that > > nested transactions within the copy of the outermost transaction > > I assume "within" refers to within the same static scope of the > function? > > > were > > not being processed for an uninstrumented code path, and so were only > > receiving an instrumented path. This is clearly less than ideal when > > considering HTM. > > Yes, it would be better if nested transactions would have an > uninstrumented code path too. > > > Now, it seemed to me that if we're already in an uninstrumented code > > path, we're extremely likely to want to stay in one. This is certainly > > true for HTM, as well as when we've selected the serialirr method. I > > can't think off hand of any other reason we'd be on the uninstrumented > > code path. > > > > Therefore the second patch arranges for all nested transactions in the > > uninstrumented path to _only_ have uninstrumented paths themselves. > > This is only possible if the nested transaction cannot abort. If it > potentially can, then we must have an instrumented code path for it. > > > While reviewing the results of this second patch in detail, I noticed > > that nested transactions on the instrumented code paths were not > > generating both instrumented and uninstrumented code paths. My first > > reaction was that this was a bug, and so I wrote the first patch to > > fix it. > > > > But as I was reviewing the patch to write the changelog, I began to > > wonder whether the same logic concerning the original instrumented/ > > uninstrumented choice applies as well to the instrumented path. > > > > Is it ever likely that we'd choose an uninstrumented path for a > > nested transaction, given that we're already executing the instrumented > > path for an outer transaction? > > It could be a performance benefit in cases like > > __transaction_relaxed { > // something > if (foo) { > unsafe_action(); > transaction_relaxed { > // lots of code > } > } > } > > Here, we go serial at unsafe_action and can then cause less overall > slowdown when running "lots of code" as uninstrumented (but it's just > avoiding the per-memory-access function calls). If the txn can acquire > the serial lock and commit afterwards, then it doesn't have to restart > at the outermost txn. > > I can't say whether such cases would appear often enough in practice to > offset the code size increase (and thus potential slowdowns) caused by > having uninstrumented code paths available too. > > > It now seems to me that the only time we'd switch from instrumented > > to uninstrumented code path would be during a transaction restart, > > after having selected to retry with a serialirr method. > > > > Which means that I should apply the second patch only, > > > > Thoughts? > > Another thing to consider is whether flat nesting could be used: for > nested transactions that have hasNoAbort set, we can elide txn > begin/commit calls and just embed the nested txn body in the enclosing > txn. > This would be a benefit on uninstrumented/uninstrumented code path pairs > because the nested txn can't do anything but keep running. 
For > instrumented/instrumented (with hasNoAbort set), we could also flatten > but there is a performance trade-off: the TM runtime lib could do closed > nesting on data conflicts, so that it tries to roll back only the inner > txn on a conflict. The potential benefit of such a scheme depends a lot > on the workload, and libitm currently also flattens txns that do not > abort. OTOH, the overhead of the begin/commit calls for nested txns > should also be relatively small compared to overall STM/instrumented > overheads. > > So, to summarize: > - If the txn could abort, then it must have instrumented and could have > uninstrumented. > - If the nested txn does not abort, then: > - For uninstrumented / uninstrumented nesting, flat nesting makes > sense (and thus we don't need instrumented for the nested). > - For instrumented / instrumented, we could flatten, but unsure > whether that significantly improves performance. > - Nested uninstrumented could be useful with enclosing nested, but > perhaps this doesn't happen often enough to offset the code size > increase. One addition for cases like: __transaction_atomic { __transaction_atomic { /* ... */ } if (foo) __transaction_cancel; } Here, the enclosing txn may abort. In this case, we must have instrumented code paths for all nested txns too, and we could have uninstrumented code paths for them (to allow for running those with HTMs that can abort txns). Torvald
Re: [PATCH] Make IPA-CP work on aggregates
Hi, On Wed, Nov 07, 2012 at 03:55:16PM +0100, Jan Hubicka wrote: > > On Wed, Nov 07, 2012 at 03:39:15PM +0100, Martin Jambor wrote: > > > another bootstrap finishes. I'm not sure if it would be OK to commit > > > it now, given it is stage3, though. OTOH, I think it would be worth > > > it. > > > > I'm ok with getting that in now from RM POV, but not familiar with the > > code enough to review it. So, if somebody acks it (Honza?), it can be > > added. > > > > > 2012-11-07 Martin Jambor > > > > > > * ipa-prop.c (determine_known_aggregate_parts): Skip writes to > > > different declarations when tracking writes to a declaration. > > > > > > Index: src/gcc/ipa-prop.c > > > === > > > --- src.orig/gcc/ipa-prop.c > > > +++ src/gcc/ipa-prop.c > > > @@ -1318,7 +1318,12 @@ determine_known_aggregate_parts (gimple > > > break; > > > } > > >else if (lhs_base != arg_base) > > > - break; > > > + { > > > + if (DECL_P (lhs_base)) > > > + continue; > > > + else > > > + break; > > > + } > > OK, so the point of patch is to not stop on writes to decls while looking > for value the field is initialized to? > > It looks OK. Yes, when we are looking for writes to a decl and come accross a write to another decl, we just skip it instead of bailing out completely. The reason my original implementation missed it is that I relied on stmt_may_clobber_ref_p_1 to determine it could not clobber the decl but apprently it does not... perhaps that is a bug? Anyway, I added a testcase to the above and this is what I am about to commit. Thanks a lot, Martin 2012-11-07 Martin Jambor * ipa-prop.c (determine_known_aggregate_parts): Skip writes to different declarations when tracking writes to a declaration. * gfortran.dg/ipcp-array-1.f90: New test. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -1318,7 +1318,12 @@ determine_known_aggregate_parts (gimple break; } else if (lhs_base != arg_base) - break; + { + if (DECL_P (lhs_base)) + continue; + else + break; + } if (lhs_offset + lhs_size < arg_offset || lhs_offset >= (arg_offset + arg_size)) Index: src/gcc/testsuite/gfortran.dg/ipcp-array-1.f90 === --- /dev/null +++ src/gcc/testsuite/gfortran.dg/ipcp-array-1.f90 @@ -0,0 +1,19 @@ +! { dg-do compile } +! { dg-options "-O2 -fdump-ipa-cp-details -fno-inline -fdump-tree-optimized" } + +subroutine bar (a, b, n) + integer :: a(n), b(n) + call foo (a, b) +contains +subroutine foo (a, b) + integer :: a(:), b(:) + a = b +end subroutine +end + +! { dg-final { scan-ipa-dump "Creating a specialized node of foo" "cp" } } +! { dg-final { scan-ipa-dump-times "Aggregate replacements\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=\[^=\]*=" 2 "cp" } } +! { dg-final { cleanup-ipa-dump "cp" } } +! { dg-final { scan-tree-dump-not "stride;" "optimized" } } +! { dg-final { scan-tree-dump-not "lbound;" "optimized" } } +! { dg-final { cleanup-tree-dump "optimized" } }
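As a concrete illustration of the case the hunk above now handles, here is a minimal sketch in C/C++ (the names desc, callee and caller are made up; the real trigger is the Fortran array-descriptor testcase in the patch): while scanning the statements before the call for stores into the aggregate argument, a store into an unrelated declaration previously stopped the scan, whereas with the patch it is simply skipped.

struct desc { int stride; int lbound; };

extern void callee (struct desc *a);

void
caller (void)
{
  struct desc a;
  int other;

  a.stride = 1;      /* known constant part of 'a' at the call site */
  other = 42;        /* write to a different decl: now skipped, previously fatal */
  a.lbound = 0;      /* still recorded as a known aggregate part */
  callee (&a);
  (void) other;
}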
Re: [PATCH] Make IPA-CP work on aggregates
> Hi, > > On Wed, Nov 07, 2012 at 03:55:16PM +0100, Jan Hubicka wrote: > > > On Wed, Nov 07, 2012 at 03:39:15PM +0100, Martin Jambor wrote: > > > > another bootstrap finishes. I'm not sure if it would be OK to commit > > > > it now, given it is stage3, though. OTOH, I think it would be worth > > > > it. > > > > > > I'm ok with getting that in now from RM POV, but not familiar with the > > > code enough to review it. So, if somebody acks it (Honza?), it can be > > > added. > > > > > > > 2012-11-07 Martin Jambor > > > > > > > > * ipa-prop.c (determine_known_aggregate_parts): Skip writes to > > > > different declarations when tracking writes to a declaration. > > > > > > > > Index: src/gcc/ipa-prop.c > > > > === > > > > --- src.orig/gcc/ipa-prop.c > > > > +++ src/gcc/ipa-prop.c > > > > @@ -1318,7 +1318,12 @@ determine_known_aggregate_parts (gimple > > > > break; > > > > } > > > >else if (lhs_base != arg_base) > > > > - break; > > > > + { > > > > + if (DECL_P (lhs_base)) > > > > + continue; > > > > + else > > > > + break; > > > > + } > > > > OK, so the point of patch is to not stop on writes to decls while looking > > for value the field is initialized to? > > > > It looks OK. > > Yes, when we are looking for writes to a decl and come accross a write > to another decl, we just skip it instead of bailing out completely. > The reason my original implementation missed it is that I relied on > stmt_may_clobber_ref_p_1 to determine it could not clobber the decl > but apprently it does not... perhaps that is a bug? > > Anyway, I added a testcase to the above and this is what I am about to > commit. > > Thanks a lot, > > Martin > > > 2012-11-07 Martin Jambor > > * ipa-prop.c (determine_known_aggregate_parts): Skip writes to > different declarations when tracking writes to a declaration. > > * gfortran.dg/ipcp-array-1.f90: New test. Thanks, it looks good. I think the code should also be extended to handle var=const_var i.e. when you arrive to var_decl with const_value_known_p or const_decl. I suppose this will need to look into constructors then... Honza
Re: [PATCH] Vtable pointer verification, gcc changes (patch 2 of 2)
Most likely use-after-free issues, but any memory use bug lays the program open to these attacks. -- Caroline Tice cmt...@google.com On Thu, Nov 8, 2012 at 1:36 AM, Florian Weimer wrote: > On 11/05/2012 06:48 PM, Caroline Tice wrote: > >> As requested, I have split the original patch into two parts: GCC >> changes and runtime library changes. The attached patch is for the >> gcc changes. > > > Out of curiosity, what's the primary source of wrong vtable values you > expect? Use-after-free issues, heap spraying, or something else? > > -- > Florian Weimer / Red Hat Product Security Team
Re: patch fixing a test for PR55151
On Wed, 7 Nov 2012, Vladimir Makarov wrote: > On 12-11-07 5:27 PM, H.J. Lu wrote: > > You should check !ia32 target: > > > > /* { dg-do compile { target { ! { ia32 } } } } */ > > > > > Thanks, H.J. I've just fixed it. > > Index: testsuite/ChangeLog > === > --- testsuite/ChangeLog (revision 193316) > +++ testsuite/ChangeLog (working copy) > @@ -1,3 +1,8 @@ > +2012-11-07 Vladimir Makarov > + > + PR rtl-optimization/55151 > + * gcc.dg/pr55151.c: Use ia32 instead of x86_64. > + Lots of constraints there that might not be suitable for all machines. Unless it's expected to pass (almost) everywhere, it should move to gcc.target/i386. If it *is* expected to pass everywhere, at least add /* { dg-require-effective-target fpic } */ or similar (can be done in the target expression). brgds, H-P
Re: patch to fix constant math - first small patch
Joseph, Here is a revised patch with the change you asked for. There have been no other comments. May I commit it? Kenny On 10/05/2012 08:14 PM, Joseph S. Myers wrote: On Fri, 5 Oct 2012, Kenneth Zadeck wrote: +# define HOST_HALF_WIDE_INT_PRINT "h" This may cause problems on hosts not supporting %hd (MinGW?), and there's no real need for using "h" here given the promotion of short to int; you can just use "" (rather than e.g. needing special handling in xm-mingw32.h like is done for HOST_LONG_LONG_FORMAT). diff --git a/gcc/hwint.h b/gcc/hwint.h index ca47148..87371b5 100644 --- a/gcc/hwint.h +++ b/gcc/hwint.h @@ -77,6 +77,40 @@ extern char sizeof_long_long_must_be_8[sizeof(long long) == 8 ? 1 : -1]; # endif #endif +/* Print support for half a host wide int. */ +#define HOST_BITS_PER_HALF_WIDE_INT (HOST_BITS_PER_WIDE_INT / 2) +#if HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_LONG +# define HOST_HALF_WIDE_INT long +# define HOST_HALF_WIDE_INT_PRINT HOST_LONG_FORMAT +# define HOST_HALF_WIDE_INT_PRINT_C "L" +# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d" +# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C +# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u" +# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x" +# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x" +#elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_INT +# define HOST_HALF_WIDE_INT int +# define HOST_HALF_WIDE_INT_PRINT "" +# define HOST_HALF_WIDE_INT_PRINT_C "" +# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d" +# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C +# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u" +# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x" +# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x" +#elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_SHORT +# define HOST_HALF_WIDE_INT short +# define HOST_HALF_WIDE_INT_PRINT "" +# define HOST_HALF_WIDE_INT_PRINT_C "" +# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d" +# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C +# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u" +# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x" +# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x" +#else +#error Please add support for HOST_HALF_WIDE_INT +#endif + + #define HOST_WIDE_INT_1 HOST_WIDE_INT_C(1) /* This is a magic identifier which allows GCC to figure out the type @@ -94,9 +128,13 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__; # if HOST_BITS_PER_WIDE_INT == 64 # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x" +# define HOST_WIDE_INT_PRINT_PADDED_HEX \ + "%016" HOST_LONG_FORMAT "x" # else # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ "0x%" HOST_LONG_FORMAT "x%08" HOST_LONG_FORMAT "x" +# define HOST_WIDE_INT_PRINT_PADDED_HEX \ + "%08" HOST_LONG_FORMAT "x" # endif #else # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT @@ -104,6 +142,8 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__; /* We can assume that 'long long' is at least 64 bits. 
*/ # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x" +# define HOST_WIDE_INT_PRINT_PADDED_HEX \ +"%016" HOST_LONG_LONG_FORMAT "x" #endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */ #define HOST_WIDE_INT_PRINT_DEC "%" HOST_WIDE_INT_PRINT "d" @@ -277,4 +317,32 @@ extern HOST_WIDE_INT pos_mul_hwi (HOST_WIDE_INT, HOST_WIDE_INT); extern HOST_WIDE_INT mul_hwi (HOST_WIDE_INT, HOST_WIDE_INT); extern HOST_WIDE_INT least_common_multiple (HOST_WIDE_INT, HOST_WIDE_INT); +/* Sign extend SRC starting from PREC. */ + +static inline HOST_WIDE_INT +sext_hwi (HOST_WIDE_INT src, int prec) +{ + if (prec == HOST_BITS_PER_WIDE_INT) +return src; + else +{ + int shift = HOST_BITS_PER_WIDE_INT - (prec & (HOST_BITS_PER_WIDE_INT - 1)); + return (src << shift) >> shift; +} +} + +/* Zero extend SRC starting from PREC. */ + +static inline HOST_WIDE_INT +zext_hwi (HOST_WIDE_INT src, int prec) +{ + if (prec == HOST_BITS_PER_WIDE_INT) +return src; + else +return src & (((HOST_WIDE_INT)1 + << (prec & (HOST_BITS_PER_WIDE_INT - 1))) - 1); +} + + + #endif /* ! GCC_HWINT_H */ 2012-10-5 Kenneth Zadeck * hwint.h (HOST_BITS_PER_HALF_WIDE_INT, HOST_HALF_WIDE_INT, HOST_HALF_WIDE_INT_PRINT, HOST_HALF_WIDE_INT_PRINT_C, HOST_HALF_WIDE_INT_PRINT_DEC, HOST_HALF_WIDE_INT_PRINT_DEC_C, HOST_HALF_WIDE_INT_PRINT_UNSIGNED, HOST_HALF_WIDE_INT_PRINT_HEX, HOST_HALF_WIDE_INT_PRINT_HEX_PURE): New symbols. (sext_hwi, zext_hwi): New functions.
Re: [PATCH] Cleanup last_location and update input_location in ipa_prop
ping... On Mon, Nov 5, 2012 at 5:20 PM, Dehao Chen wrote: > > Hi, > > This is a patch to do some obvious cleanup and setting correct > input_location in ipa_prop (because it invokes gimplification > routines). > > Bootstrapped and passed gcc regression tests. > > Is it okay for trunk? > > Thanks, > Dehao > > gcc/ChangeLog: > 2010-11-05 Dehao Chen > > * ipa-prop.c (ipa_modify_call_arguments): Set input_location so that > gimplification routines can have right location. > * emit-rtl.c (last_location): Remove unused variable.
Re: [PATCH] Do not dump location for compare_debug
ping... On Wed, Oct 31, 2012 at 5:26 PM, Dehao Chen wrote: > Hi, > > When -fcompare_debug is used, what we really want to do is to compare > instructions between the -g version and -gtoggle version. However, > current dump file still contains the source line in its rtl dump. This > patch changes to only dump rtl without dumping its source info. > > Bootstrapped and passed gcc regression test on x86_64. > > Is it okay for trunk? > > Thanks, > Dehao > > 2012-10-31 Dehao Chen > > * final.c (rest_of_clean_state): Dump the final rtl without location.
[PATCH] Remove inline keyword from rs6000 legitimate_indirect_address_p (PR target/54308)
Hi! We have in rs6000-protos.h extern bool legitimate_indirect_address_p (rtx, int); and in rs6000.c inline bool legitimate_indirect_address_p (rtx x, int strict) { ... } and in predicates.md call this function. That works fine in C (both -fgnu89-inline mode and C99), the function is inlined within rs6000.c but an out of line copy is still emitted and predicates.md can thus reference it. But in C++, if compiled with optimizations, without -fkeep-inline-functions and the compiler inlines all calls to that function, it doesn't have to emit anything. Thus, either we have the option to move the definition of legitimate_indirect_address_p into a header (but rs6000-protos.h doesn't look like a correct spot to define inline functions), or we need to drop inline keyword and force that way an out of line copy. Ok for trunk? 2012-11-08 Jakub Jelinek PR target/54308 * config/rs6000/rs6000.c (legitimate_indirect_address_p): Remove inline keyword. --- gcc/config/rs6000/rs6000.c.jj 2012-10-31 09:20:44.0 +0100 +++ gcc/config/rs6000/rs6000.c 2012-11-08 18:33:34.950408593 +0100 @@ -5465,7 +5465,7 @@ avoiding_indexed_address_p (enum machine return (TARGET_AVOID_XFORM && VECTOR_MEM_NONE_P (mode)); } -inline bool +bool legitimate_indirect_address_p (rtx x, int strict) { return GET_CODE (x) == REG && INT_REG_OK_FOR_BASE_P (x, strict); Jakub
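For readers less familiar with the C versus C++ inline semantics described above, a minimal two-file sketch (avail, caller and other_caller are made-up names standing in for legitimate_indirect_address_p, rs6000.c and the generated insn-preds.c; the behaviour described is what GCC of this era does, not a language guarantee):

/* a.c / a.cc -- stands in for rs6000.c */
extern int avail (int);                    /* prototype, as in rs6000-protos.h */

inline int avail (int x) { return x > 0; }

int caller (int x) { return avail (x); }   /* the only call in this file */

/* b.c / b.cc -- stands in for insn-preds.c, generated from predicates.md */
extern int avail (int);

int other_caller (int x) { return avail (x); }

Built as C (either -fgnu89-inline or C99, thanks to the non-inline extern prototype), an out-of-line copy of avail is emitted from the first file and the link succeeds. Built as C++ with -O2, the call in caller is inlined, no out-of-line copy has to be emitted, and linking the second object can fail with an undefined reference, hence the choice between dropping the inline keyword and moving the definition into a header.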
Re: patch to fix constant math - second small patch
I have added the proper doc. OK to commit? Kenny On 10/08/2012 05:06 AM, Richard Guenther wrote: On Sat, Oct 6, 2012 at 12:48 AM, Kenneth Zadeck wrote: This patch adds machinery to genmodes.c so that largest possible sizes of various data structures can be determined at gcc build time. These functions create 3 symbols that are available in insn-modes.h: MAX_BITSIZE_MODE_INT - the bitsize of the largest int. MAX_BITSIZE_MODE_PARTIAL_INT - the bitsize of the largest partial int. MAX_BITSIZE_MODE_ANY_INT - the largest bitsize of any kind of int. Ok. Please document these macros in rtl.texi. Richard. 2012-11-8 Kenneth Zadeck * genmodes.c (emit_max_int): New function. (emit_insn_modes_h): Added call to emit_max_function. * doc/rtl.texi (MAX_BITSIZE_MODE_INT, MAX_BITSIZE_MODE_PARTIAL_INT, MAX_BITSIZE_MODE_ANY_INT, MAX_BITSIZE_MODE_ANY_MODE): Added doc. diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi index 07c480d..6842cb8 100644 --- a/gcc/doc/rtl.texi +++ b/gcc/doc/rtl.texi @@ -1458,6 +1458,28 @@ Returns the number of units contained in a mode, i.e., Returns the narrowest mode in mode class @var{c}. @end table +The following 4 variables are defined on every target. They can be +used to allocate buffers that are guaranteed to be large enough to +hold any value that can be represented on the target. + +@table @code +@findex MAX_BITSIZE_MODE_INT +@item MAX_BITSIZE_MODE_INT +The bitsize of the largest integer mode defined on the target. + +@findex MAX_BITSIZE_MODE_PARTIAL_INT +@item MAX_BITSIZE_MODE_PARTIAL_INT +The bitsize of the largest partial integer mode defined on the target. + +@findex MAX_BITSIZE_MODE_ANY_INT +@item MAX_BITSIZE_MODE_ANY_INT +The maximum of MAX_BITSIZE_MODE_INT and MAX_BITSIZE_MODE_PARTIAL_INT. + +@findex MAX_BITSIZE_MODE_ANY_MODE +@item MAX_BITSIZE_MODE_ANY_MODE +The bitsize of the largest mode on the target. +@end table + @findex byte_mode @findex word_mode The global variables @code{byte_mode} and @code{word_mode} contain modes diff --git a/gcc/genmodes.c b/gcc/genmodes.c index d0095c3..3e63cc7 100644 --- a/gcc/genmodes.c +++ b/gcc/genmodes.c @@ -849,6 +849,38 @@ calc_wider_mode (void) #define print_closer() puts ("};") +/* Compute the max bitsize of some of the classes of integers. It may + be that there are needs for the other integer classes, and this + code is easy to extend. */ +static void +emit_max_int (void) +{ + unsigned int max, mmax; + struct mode_data *i; + int j; + + puts (""); + for (max = 1, i = modes[MODE_INT]; i; i = i->next) +if (max < i->bytesize) + max = i->bytesize; + printf ("#define MAX_BITSIZE_MODE_INT %d*BITS_PER_UNIT\n", max); + mmax = max; + for (max = 1, i = modes[MODE_PARTIAL_INT]; i; i = i->next) +if (max < i->bytesize) + max = i->bytesize; + printf ("#define MAX_BITSIZE_MODE_PARTIAL_INT %d*BITS_PER_UNIT\n", max); + if (max > mmax) +mmax = max; + printf ("#define MAX_BITSIZE_MODE_ANY_INT %d*BITS_PER_UNIT\n", mmax); + + mmax = 0; + for (j = 0; j < MAX_MODE_CLASS; j++) +for (i = modes[j]; i; i = i->next) + if (mmax < i->bytesize) + mmax = i->bytesize; + printf ("#define MAX_BITSIZE_MODE_ANY_MODE %d*BITS_PER_UNIT\n", mmax); +} + static void emit_insn_modes_h (void) { @@ -913,6 +945,7 @@ enum machine_mode\n{"); #endif printf ("#define CONST_MODE_IBIT%s\n", adj_ibit ? "" : " const"); printf ("#define CONST_MODE_FBIT%s\n", adj_fbit ? 
"" : " const"); + emit_max_int (); puts ("\ \n\ #endif /* insn-modes.h */"); diff --git a/gcc/machmode.def b/gcc/machmode.def index 631015f..7186cb4 100644 --- a/gcc/machmode.def +++ b/gcc/machmode.def @@ -180,8 +180,11 @@ RANDOM_MODE (BLK); FRACTIONAL_INT_MODE (BI, 1, 1); /* Basic integer modes. We go up to TI in generic code (128 bits). - The name OI is reserved for a 256-bit type (needed by some back ends). - FIXME TI shouldn't be generically available either. */ + TImode is needed here because the some front ends now genericly + support __int128. If the front ends decide to generically support + larger types, then corresponding modes must be added here. The + name OI is reserved for a 256-bit type (needed by some back ends). +*/ INT_MODE (QI, 1); INT_MODE (HI, 2); INT_MODE (SI, 4);
Re: [PATCH,RX] Support Bit Manipulation on Memory Operands
Now looking at the patch past the first hunk... > +(define_insn "*iorbset_reg" > + [(set (match_operand:SI 0 "register_operand" "+r") > + (ior:SI (match_dup 0) > + (match_operand 1 "const_int_operand" "Intso")))] > + "satisfies_constraint_Intso (operands[1])" > + "bset\\t%D1,%0" > + [(set_attr "length" "3")] > +) While this and the new *xorbnot_reg and *andbclr_reg patterns are correct, I question whether you really want to add them. I think they should be folded in as alternatives in the parent ior/xor/and patterns. Otherwise reload will not make the correct choices. Note that they cannot be added to the version of the ior/xor/and patterns that use the flags register. > +(define_insn "*bset" > + [(set (zero_extract:SI > + (match_operand:QI 0 "rx_restricted_mem_operand" "+Q") > + (const_int 1) > + (match_operand 1 "rx_constbit_operand" "Uint03")) > + (const_int 1))] > + "" > + "bset\\t%1,%0.B" > + [(set_attr "length" "3")] > +) This and the *bclr pattern that you add seem to be redundant with the existing *insv_imm pattern. Why did you add them? > +(define_insn "*insv_mem_imm" > + [(set (zero_extract:SI > + (match_operand 0 "rx_restricted_mem_operand" "+Q") > + (const_int 1) > + (match_operand 1 "nonmemory_operand" "ri")) > + (match_dup 0))] > + "" > +{ > + if (INTVAL (operands[2]) & 1) > +return "bset\t%1, %0.B"; > + else > +return "bclr\t%1, %0.B"; > +} > + [(set_attr "length" "3")] Huh? You reference operand 2, which does not exist. And a similar comment applies: {bitclr_in_memory,bitset_in_memory} are redundant with this insv_mem_imm pattern. >/* We only handle single-bit inserts. */ >if (!CONST_INT_P (operands[1]) || INTVAL (operands[1]) != 1) > FAIL; > > + if (GET_CODE (operands [0]) == MEM > + && (!CONST_INT_P (operands[2]) > + || !CONST_INT_P (operands[3]) > + || (INTVAL (operands[2]) < 0 || INTVAL (operands[2]) > 7))) > + FAIL; > + > + if (GET_CODE (operands [0]) == MEM > + && satisfies_constraint_Intu1 (operands[1]) This check is redundant with the first test quoted above. > + && satisfies_constraint_Uint03 (operands[2])) > +{ > + if (satisfies_constraint_Intu0 (operands[3])) Rather than referencing satisfies_constraint_* here, it would be better to follow the form of the rest of the function. Code could be shared then. There is a problem here, however. The argument to the insv is an SImode operand. The instruction takes a QImode operand. You cannot just pretend the two are the same; you need to apply some transform. The code as written is wrong for -mbig-endian-data. Now, suppose you do transform SImode to QImode. Then I will question whether this is efficient. Once you force the value to remain in memory, and in a mode for which you have no arithmetic operations, you cripple the rtl optimizers, and in particular, combine. Have you tested this change with things other than microbenchmarks? My suggestion for insv, and these sorts of patterns, is to merge them so that you allow SImode values in registers or memory until after reload is complete. And only then split the pattern, applying the transform on the memory operand to create a QImode reference. It's quite a bit more work than what you have here, which is why I didn't do it myself when I rearranged all of this code in 2011. r~
Re: PR fortran/51727: make module files reproducible, question on C++ in gcc
Hi all, On 2012-10-14 23:35, Janne Blomqvist wrote: On Sat, Oct 13, 2012 at 4:26 PM, Tobias Schlüter wrote: - Personally, I'd prefer the C++ version; the C++ standard library is widely used and documented, and using it in favour of rolling our own is IMHO a good idea. After my attempts at addressing Jakub's complaint about error handling in the STL (which, I might add, is explicitly allowed by the coding standard) went unanswered for two weeks despite two pings, I committed the C-only version after re-testing. It's my first commit by means of git, please alert me to any problems. Cheers, - Tobi
Re: [PATCH] Remove inline keyword from rs6000 legitimate_indirect_address_p (PR target/54308)
On Thu, Nov 8, 2012 at 12:42 PM, Jakub Jelinek wrote: > We have in rs6000-protos.h > extern bool legitimate_indirect_address_p (rtx, int); > and in rs6000.c > inline bool > legitimate_indirect_address_p (rtx x, int strict) { ... } > and in predicates.md call this function. That works fine in C > (both -fgnu89-inline mode and C99), the function is inlined within rs6000.c > but an out of line copy is still emitted and predicates.md can thus > reference it. But in C++, if compiled with optimizations, without > -fkeep-inline-functions and the compiler inlines all calls to that function, > it doesn't have to emit anything. Thus, either we have the option to > move the definition of legitimate_indirect_address_p into a header (but > rs6000-protos.h doesn't look like a correct spot to define inline > functions), or we need to drop inline keyword and force that way > an out of line copy. Ok for trunk? > > 2012-11-08 Jakub Jelinek > > PR target/54308 > * config/rs6000/rs6000.c (legitimate_indirect_address_p): Remove > inline keyword. Okay. Thanks, David
[PATCH] Attribute for unused warning for variables of non-trivial types
Hello, could somebody please review http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55203 (patch also attached)? The patch implements an attribute for marking types for which gcc cannot on its own issue warnings about unused variables (e.g. because the ctor is external), but for which such a warning might be useful anyway (e.g. std::string). Thank you. -- Lubos Lunak l.lu...@suse.cz diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 34cfb98..d9e6725 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -375,6 +375,7 @@ static tree handle_optimize_attribute (tree *, tree, tree, int, bool *); static tree ignore_attribute (tree *, tree, tree, int, bool *); static tree handle_no_split_stack_attribute (tree *, tree, tree, int, bool *); static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *); +static tree handle_warn_unused_attribute (tree *, tree, tree, int, bool *); static void check_function_nonnull (tree, int, tree *); static void check_nonnull_arg (void *, tree, unsigned HOST_WIDE_INT); @@ -738,6 +739,8 @@ const struct attribute_spec c_common_attribute_table[] = The name contains space to prevent its usage in source code. */ { "fn spec", 1, 1, false, true, true, handle_fnspec_attribute, false }, + { "warn_unused",0, 0, false, false, false, + handle_warn_unused_attribute, false }, { NULL, 0, 0, false, false, false, NULL, false } }; @@ -7908,6 +7911,27 @@ handle_fnspec_attribute (tree *node ATTRIBUTE_UNUSED, tree ARG_UNUSED (name), return NULL_TREE; } +/* Handle a "warn_unused" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_warn_unused_attribute (tree *node, tree name, + tree args ATTRIBUTE_UNUSED, + int flags ATTRIBUTE_UNUSED, bool *no_add_attrs) +{ + if (TYPE_P(*node)) +/* Do nothing else, just set the attribute. We'll get at + it later with lookup_attribute. */ +; + else +{ + warning (OPT_Wattributes, "%qE attribute ignored", name); + *no_add_attrs = true; +} + + return NULL_TREE; +} + /* Handle a "returns_twice" attribute; arguments as in struct attribute_spec.handler. */ diff --git a/gcc/cp/call.c b/gcc/cp/call.c index e21049b..0f5fb63 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -6894,7 +6894,12 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain) /* FIXME handle trivial default constructor, too. */ if (!already_used) -mark_used (fn); +{ + /* Constructors and destructor of warn_unused types do not make them used. */ + if (( !DECL_CONSTRUCTOR_P(fn) && !DECL_DESTRUCTOR_P(fn)) + || !lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (TYPE_METHOD_BASETYPE (TREE_TYPE (fn) +mark_used (fn); +} if (DECL_VINDEX (fn) && (flags & LOOKUP_NONVIRTUAL) == 0) { diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index d25aa80..e0548e5 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -632,7 +632,8 @@ poplevel (int keep, int reverse, int functionbody) && DECL_NAME (decl) && ! DECL_ARTIFICIAL (decl) && TREE_TYPE (decl) != error_mark_node && (!CLASS_TYPE_P (TREE_TYPE (decl)) - || !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl + || !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl)) + || lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (TREE_TYPE (decl) { if (! 
TREE_USED (decl)) warning (OPT_Wunused_variable, "unused variable %q+D", decl); diff --git a/gcc/cp/init.c b/gcc/cp/init.c index 0446038..38b15a7 100644 --- a/gcc/cp/init.c +++ b/gcc/cp/init.c @@ -1512,8 +1512,12 @@ build_aggr_init (tree exp, tree init, int flags, tsubst_flags_t complain) } if (TREE_CODE (exp) == VAR_DECL || TREE_CODE (exp) == PARM_DECL) -/* Just know that we've seen something for this node. */ -TREE_USED (exp) = 1; +{ + /* Just know that we've seen something for this node. + Merely creating a warn_unused aggregate doesn't make it used though. */ + if( !lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (type))) +TREE_USED (exp) = 1; +} is_global = begin_init_stmts (&stmt_expr, &compound_stmt); destroy_temps = stmts_are_full_exprs_p (); diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 6bf929a..177b3ee 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -15631,6 +15631,19 @@ only be applied to classes declared within an @code{extern "Java"} block. Calls to methods declared in this interface will be dispatched using GCJ's interface table mechanism, instead of regular virtual table dispatch. +@item warn_unused +@cindex @code{warn_unused} attribute + +For C++ types with non-trivial constructors and/or destructors it may be +difficult or impossible for the compiler to find out whether a variable +of this type is truly unused if it is not referenced. This type attribute +informs the compiler tha
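To make the intended behaviour of the proposed attribute concrete, a hypothetical usage sketch (the class Token and function f are made-up names, not taken from the patch or its testsuite):

struct __attribute__((warn_unused)) Token
{
  Token ();     /* non-trivial constructor, defined in another TU */
  ~Token ();    /* non-trivial destructor, defined in another TU */
};

void
f ()
{
  Token t;      /* without the attribute there is no -Wunused-variable warning,
                   because the constructor/destructor calls count as uses;
                   with the patch and the attribute: "unused variable 't'" */
}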
[PATCH] Fix -grecord-dwarf-switches with PCH (PR debug/53145)
Hi! This PR reports a problem on some PCH tests due to -grecord-dwarf-switches, sometimes .debug_str section is emitted in different order with PCH than without. I can't reproduce the issue unless -fno-eliminate-unused-debug-types, but anyway, I hope this patch should cure it. The problem is that without PCH dwarf2out_finish wasn't replacing the producer string with the right one, while with PCH it would be (older one would come from the header file compilation restored from PCH file) and thus the debug_str_hash table contained one extra string (while not referenced and thus not emitted could change the order in which htab_traverse traversed the hash table). With -feliminate-unused-debug-types the hash table is htab_empty cleared afterwards and repopulated, but guess if one is unlucky enough the one extra entry before htab_empty might result in bigger hash table and thus different layout of the hash table. Fixed by initially setting DW_AT_producer to an empty string and unconditionally fixing it up at dwarf2out_finish time. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-11-07 Jakub Jelinek PR debug/53145 * dwarf2out.c (gen_compile_unit_die): Don't call gen_producer_string here, instead add "" if producer_string is NULL. (dwarf2out_finish): Call gen_producer_string here, unconditionally decrease refcount of the old indirect string and set val_str to find_AT_string result. --- gcc/dwarf2out.c.jj 2012-11-07 08:42:30.0 +0100 +++ gcc/dwarf2out.c 2012-11-07 18:59:38.667571248 +0100 @@ -18867,9 +18867,7 @@ gen_compile_unit_die (const char *filena add_comp_dir_attribute (die); } - if (producer_string == NULL) -producer_string = gen_producer_string (); - add_AT_string (die, DW_AT_producer, producer_string); + add_AT_string (die, DW_AT_producer, producer_string ? producer_string : ""); /* If our producer is LTO try to figure out a common language to use from the global list of translation units. */ @@ -23217,13 +23215,12 @@ dwarf2out_finish (const char *filename) dw_die_ref main_comp_unit_die; /* PCH might result in DW_AT_producer string being restored from the - header compilation, fix it up if needed. */ + header compilation, so always fill it with empty string initially + and overwrite only here. */ dw_attr_ref producer = get_AT (comp_unit_die (), DW_AT_producer); - if (strcmp (AT_string (producer), producer_string) != 0) -{ - struct indirect_string_node *node = find_AT_string (producer_string); - producer->dw_attr_val.v.val_str = node; -} + producer_string = gen_producer_string (); + producer->dw_attr_val.v.val_str->refcount--; + producer->dw_attr_val.v.val_str = find_AT_string (producer_string); gen_scheduled_generic_parms_dies (); gen_remaining_tmpl_value_param_die_attribute (); Jakub
[PATCH] Fix up assembly thunks with -gstabs+ (PR debug/54499)
Hi! Cary added recently a debug_hooks->source_line call for asm thunks, but that breaks e.g. for -gstabs+, which emits a relocation relative to whatever label is emitted in the begin_prologue debug hook, so obviously doesn't work well if called before the begin_prologue debug hook. Fixed by reverting that change and instead setting up proper prologue_location, so that final_start_function called by output_mi_thunk when calling the debug_hooks->begin_prologue hook passes the right line/file. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-11-08 Jakub Jelinek PR debug/54499 * cgraphunit.c (assemble_thunk): Don't call source_line debug hook here, instead call insn_locations_{init,finalize} and initialize prologue_location. * g++.dg/debug/pr54499.C: New test. --- gcc/cgraphunit.c.jj 2012-11-06 09:03:53.0 +0100 +++ gcc/cgraphunit.c2012-11-08 13:16:48.010938659 +0100 @@ -1413,16 +1413,16 @@ assemble_thunk (struct cgraph_node *node DECL_INITIAL (thunk_fndecl) = fn_block; init_function_start (thunk_fndecl); cfun->is_thunk = 1; + insn_locations_init (); + set_curr_insn_location (DECL_SOURCE_LOCATION (thunk_fndecl)); + prologue_location = curr_insn_location (); assemble_start_function (thunk_fndecl, fnname); - (*debug_hooks->source_line) (DECL_SOURCE_LINE (thunk_fndecl), - DECL_SOURCE_FILE (thunk_fndecl), - /* discriminator */ 0, - /* is_stmt */ 1); targetm.asm_out.output_mi_thunk (asm_out_file, thunk_fndecl, fixed_offset, virtual_value, alias); assemble_end_function (thunk_fndecl, fnname); + insn_locations_finalize (); init_insn_lengths (); free_after_compilation (cfun); set_cfun (NULL); --- gcc/testsuite/g++.dg/debug/pr54499.C.jj 2012-11-08 13:23:19.906740003 +0100 +++ gcc/testsuite/g++.dg/debug/pr54499.C2012-11-08 13:23:55.189531409 +0100 @@ -0,0 +1,22 @@ +// PR debug/54499 +// { dg-do assemble } + +struct S1 +{ + virtual void f () = 0; +}; + +struct S2 +{ + virtual ~S2 () { } +}; + +struct S3 : public S1, public S2 +{ + void f (); +}; + +void +S3::f () +{ +} Jakub
[PATCH] Fix make_range_step with -fwrapv (PR tree-optimization/55236)
Hi! With -fwrapv the range test optimization miscompiles the following testcase (both inter-bb version in 4.8+ in first function and the pure fold-const.c one since 3.4). The problem is that with -fwrapv and signed type -x in [-,b] doesn't imply x in [-b,-] and -x in [a,-] doesn't imply x in [-,-a], as -INT_MIN is INT_MIN and thus if INT_MIN was in the range, after negation it needs to be in the range too (and similarly if it wasn't, it can't be there). Turns out that if we make sure both low and high are non-NULL (i.e. replace - with INT_MIN resp. INT_MAX), normalize: will do the right thing that it does also for unsigned operation - if it detects the high bound is lower than low bound, it adjusts the range by inverting it. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (what about 4.7 where bar is miscompiled too)? Perhaps for PLUS_EXPR/MINUS_EXPR and signed -fwrapv type we could do the same instead of return NULL_TREE; that we do right now (added by Kazu for PR23518). 2012-11-08 Jakub Jelinek PR tree-optimization/55236 * fold-const.c (make_range_step) : For -fwrapv and signed ARG0_TYPE, force low and high to be non-NULL. * gcc.dg/pr55236.c: New test. --- gcc/fold-const.c.jj 2012-11-07 11:24:34.0 +0100 +++ gcc/fold-const.c2012-11-08 16:54:38.978339040 +0100 @@ -3880,6 +3880,17 @@ make_range_step (location_t loc, enum tr return arg0; case NEGATE_EXPR: + /* If flag_wrapv and ARG0_TYPE is signed, make sure +low and high are non-NULL, then normalize will DTRT. */ + if (!TYPE_UNSIGNED (arg0_type) + && !TYPE_OVERFLOW_UNDEFINED (arg0_type)) + { + if (low == NULL_TREE) + low = TYPE_MIN_VALUE (arg0_type); + if (high == NULL_TREE) + high = TYPE_MAX_VALUE (arg0_type); + } + /* (-x) IN [a,b] -> x in [-b, -a] */ n_low = range_binop (MINUS_EXPR, exp_type, build_int_cst (exp_type, 0), --- gcc/testsuite/gcc.dg/pr55236.c.jj 2012-11-08 16:40:49.171781137 +0100 +++ gcc/testsuite/gcc.dg/pr55236.c 2012-11-08 16:41:20.044578919 +0100 @@ -0,0 +1,31 @@ +/* PR tree-optimization/55236 */ +/* { dg-do run } */ +/* { dg-options "-O2 -fwrapv" } */ + +extern void abort (); + +__attribute__((noinline, noclone)) void +foo (int i) +{ + if (i > 0) +abort (); + i = -i; + if (i < 0) +return; + abort (); +} + +__attribute__((noinline, noclone)) void +bar (int i) +{ + if (i > 0 || (-i) >= 0) +abort (); +} + +int +main () +{ + foo (-__INT_MAX__ - 1); + bar (-__INT_MAX__ - 1); + return 0; +} Jakub
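The INT_MIN corner case driving the fix can be checked with a small stand-alone program (an editorial illustration, not part of the patch; it must be compiled with -fwrapv so that negating INT_MIN is well defined and wraps):

#include <assert.h>
#include <limits.h>

int
main ()
{
  int i = INT_MIN;
  assert (!(-i >= 0));   /* -i wraps back to INT_MIN, which is still negative */
  assert (i <= 0);       /* ...yet i itself lies in (-inf, 0] */
  /* So "-i in [0, +inf)" and "i in (-inf, 0]" are not equivalent once INT_MIN
     is involved, which is exactly the inference the unpatched NEGATE_EXPR
     handling in make_range_step relied on.  */
  return 0;
}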
Re: [PATCH] Fix up assembly thunks with -gstabs+ (PR debug/54499)
On 2012-11-08 11:55, Jakub Jelinek wrote: > 2012-11-08 Jakub Jelinek > > PR debug/54499 > * cgraphunit.c (assemble_thunk): Don't call source_line debug hook > here, instead call insn_locations_{init,finalize} and initialize > prologue_location. > > * g++.dg/debug/pr54499.C: New test. I recall mentioning at the time that I thought Cary's approach was error-prone. Ok. r~
Re: [PATCH] PR 54789: Fix driver error when GCC_COMPARE_DEBUG is defined
On Fri, Oct 05, 2012 at 06:49:25AM +0400, Dmitry Gorbachev wrote: > This patch simplifies process_command a bit by using save_switch and > sets the "known" switch field to "true". > > > gcc/ChangeLog: > > 2012-10-05 Dmitry Gorbachev > > PR driver/54789 > * gcc.c (process_command): Use save_switch for synthesized > -fcompare-debug=* option; mark the switch as known. Applied after bootstrap/regtested on x86_64-linux and i686-linux and testing with GCC_COMPARE_DEBUG in environment. Jakub
Re: [PATCH] Fix -grecord-gcc-switches with PCH (PR debug/53145)
On 2012-11-08 11:49, Jakub Jelinek wrote: > 2012-11-07 Jakub Jelinek > > PR debug/53145 > * dwarf2out.c (gen_compile_unit_die): Don't call gen_producer_string > here, instead add "" if producer_string is NULL. > (dwarf2out_finish): Call gen_producer_string here, unconditionally > decrease refcount of the old indirect string and set val_str to > find_AT_string result. Ok. r~
Re: [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
Attached patch applied to trunk and 4.7 branch. 2012-11-08 François Dumont PR libstdc++/54075 * include/bits/hashtable.h (_Hashtable<>::rehash): Reset hash policy state if no rehash. * testsuite/23_containers/unordered_set/modifiers/reserve.cc (test02): New. François On 11/08/2012 01:58 AM, Jonathan Wakely wrote: On 7 November 2012 22:02, François Dumont wrote: Ok to commit ? If so, where ? That patch is OK for trunk and 4.7, thanks. Index: include/bits/hashtable.h === --- include/bits/hashtable.h (revision 193258) +++ include/bits/hashtable.h (working copy) @@ -1597,6 +1597,9 @@ // level. _M_rehash_policy._M_prev_resize = 0; } + else + // No rehash, restore previous state to keep a consistent state. + _M_rehash_policy._M_reset(__saved_state); } template Set; + Set s; + s.reserve(N); + s.reserve(N); + + std::size_t bkts = s.bucket_count(); + for (int i = 0; i != N; ++i) +{ + s.insert(i); + // As long as we insert less than the reserved number of elements we + // shouldn't experiment any rehash. + VERIFY( s.bucket_count() == bkts ); +} +} + int main() { test01(); + test02(); return 0; }
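For readers who want to reproduce the behaviour the new test02 checks, the standalone sketch below restates it outside the testsuite harness (the element count N and the element type are arbitrary choices); it assumes a libstdc++ that contains the fix above:

/* After reserve(N) -- even when called twice -- inserting up to N
   elements must not trigger a rehash, so the bucket count stays put.  */
#include <cassert>
#include <cstddef>
#include <unordered_set>

int main ()
{
  const int N = 1000;                 /* arbitrary */
  std::unordered_set<int> s;
  s.reserve (N);
  s.reserve (N);                      /* second reserve must not corrupt the policy state */

  const std::size_t bkts = s.bucket_count ();
  for (int i = 0; i != N; ++i)
    {
      s.insert (i);
      assert (s.bucket_count () == bkts);   /* no rehash below the reserved size */
    }
  return 0;
}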
RE: [RFC] New feature to reuse one multilib among different targets
On Thu, 8 Nov 2012, Terry Guo wrote: > To convert such statements to data structure used by multilib_raw, I > refactor codes in genmultilib into two functions combo_to_dir and The "function" keyword for creating shell functions is not POSIX, and I don't know if we ensure that $SHELL is a shell supporting functions. (It's documented that CONFIG_SHELL may need to be set to a POSIX shell if /bin/sh isn't sufficient, but does that feed through to the value of SHELL used to run this script?) > 2012-11-08 Terry Guo > > * genmultilib (combo_to_dir): New function. > (options_output): New function. > (MULTILIB_REUSE): New argument. > * Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE. > * gcc.c (multilib_reuse): New spec. > (set_multilib_dir): Use multilib_reuse. Documentation changes need mentioning in ChangeLog entries. -- Joseph S. Myers jos...@codesourcery.com
Re: [C++ Patch] for c++/11750
On 08/13/2012 04:49 PM, Fabien Chêne wrote: + /* Optimize away vtable lookup if we know that this +function can't be overridden. We need to check if +the context and the instance type are the same, +actually FN migth be defined in a different class Typo: "might" +type because of a using-declaration. In this case, we +do not want to perform a non-virtual call. */ if (DECL_VINDEX (fn) && ! (flags & LOOKUP_NONVIRTUAL) + && same_type_p (DECL_CONTEXT (fn), TREE_TYPE (instance)) && resolves_to_fixed_type_p (instance, 0)) This should be same_type_ignoring_top_level_qualifiers_p. OK with that change. Jason
[PR 55238] More careful pass-through handling in find_aggregate_values_for_callers_subset
Hi, when writing find_aggregate_values_for_callers_subset, I omitted a test that is present in propagate_aggs_accross_jump_function and that lead to triggering an assert in the former. Fixed by moving the test into a separate predicate function in the following patch. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2012-11-08 Martin Jambor PR tree-optimization/55238 * ipa-cp.c (agg_pass_through_permissible_p): New function. (propagate_aggs_accross_jump_function): Use it. (find_aggregate_values_for_callers_subset): Likewise and relax an assert. * testsuite/gcc.dg/torture/pr55238.c: New test. Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -1312,6 +1312,19 @@ merge_aggregate_lattices (struct cgraph_ return ret; } +/* Determine whether there is anything to propagate FROM SRC_PLATS through a + pass-through JFUNC and if so, whether it has conform and conforms to the + rules about propagating values passed by reference. */ + +static bool +agg_pass_through_permissible_p (struct ipcp_param_lattices *src_plats, + struct ipa_jump_func *jfunc) +{ + return src_plats->aggs +&& (!src_plats->aggs_by_ref + || ipa_get_jf_pass_through_agg_preserved (jfunc)); +} + /* Propagate scalar values across jump function JFUNC that is associated with edge CS and put the values into DEST_LAT. */ @@ -1333,9 +1346,7 @@ propagate_aggs_accross_jump_function (st struct ipcp_param_lattices *src_plats; src_plats = ipa_get_parm_lattices (caller_info, src_idx); - if (src_plats->aggs - && (!src_plats->aggs_by_ref - || ipa_get_jf_pass_through_agg_preserved (jfunc))) + if (agg_pass_through_permissible_p (src_plats, jfunc)) { /* Currently we do not produce clobber aggregate jump functions, replace with merging when we do. */ @@ -2893,23 +2904,33 @@ find_aggregate_values_for_callers_subset if (caller_info->ipcp_orig_node) { - if (!inter) - inter = agg_replacements_to_vector (cs->caller, 0); - else - intersect_with_agg_replacements (cs->caller, src_idx, -&inter, 0); + struct cgraph_node *orig_node = caller_info->ipcp_orig_node; + struct ipcp_param_lattices *orig_plats; + orig_plats = ipa_get_parm_lattices (IPA_NODE_REF (orig_node), + src_idx); + if (agg_pass_through_permissible_p (orig_plats, jfunc)) + { + if (!inter) + inter = agg_replacements_to_vector (cs->caller, 0); + else + intersect_with_agg_replacements (cs->caller, src_idx, +&inter, 0); + } } else { struct ipcp_param_lattices *src_plats; src_plats = ipa_get_parm_lattices (caller_info, src_idx); - /* Currently we do not produce clobber aggregate jump -functions, adjust when we do. */ - gcc_checking_assert (!jfunc->agg.items); - if (!inter) - inter = copy_plats_to_inter (src_plats, 0); - else - intersect_with_plats (src_plats, &inter, 0); + if (agg_pass_through_permissible_p (src_plats, jfunc)) + { + /* Currently we do not produce clobber aggregate jump +functions, adjust when we do. */ + gcc_checking_assert (!jfunc->agg.items); + if (!inter) + inter = copy_plats_to_inter (src_plats, 0); + else + intersect_with_plats (src_plats, &inter, 0); + } } } else if (jfunc->type == IPA_JF_ANCESTOR @@ -2933,7 +2954,7 @@ find_aggregate_values_for_callers_subset src_plats = ipa_get_parm_lattices (caller_info, src_idx);; /* Currently we do not produce clobber aggregate jump functions, adjust when we do. 
*/ - gcc_checking_assert (!jfunc->agg.items); + gcc_checking_assert (!src_plats->aggs || !jfunc->agg.items); if (!inter) inter = copy_plats_to_inter (src_plats, delta); else Index: src/gcc/testsuite/gcc.dg/torture/pr55238.c === --- /dev/null +++ src/gcc/testsuite/gcc.dg/torture/pr55238.c @@ -0,0 +1,44 @@ +/* { dg-do compile
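For readers not steeped in IPA-CP internals, the generic sketch below (purely illustrative -- it is not the PR testcase, which is truncated above, and the names are made up) shows the aggregate pass-through shape the new predicate reasons about: a structure passed by reference is handed straight through to another call, and its known contents may only be propagated when the by-reference aggregate is preserved:

/* 'middle' passes its by-reference aggregate straight through to 'use';
   that call edge carries the pass-through jump function the predicate checks.  */
struct data { int size; int scale; };

static int use (const struct data *d)    { return d->size * d->scale; }
static int middle (const struct data *d) { return use (d); }  /* aggregate pass-through */

int entry (void)
{
  struct data d = { 4, 8 };   /* constants IPA-CP may try to propagate into use() */
  return middle (&d);
}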
Go patch committed: Update index.go in testsuite
This patch to the Go testsuite updates index.go to the version currently in mainline. This required adding some new support to the Go testsuite driver. This fixes PR go/55228. Ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian Index: go.test/test/index0.go === --- go.test/test/index0.go (revision 0) +++ go.test/test/index0.go (revision 0) @@ -0,0 +1,12 @@ +// runoutput ./index.go + +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Generate test of index and slice bounds checks. +// The output is compiled and run. + +package main + +const pass = 0 Index: go.test/test/index1.go === --- go.test/test/index1.go (revision 0) +++ go.test/test/index1.go (revision 0) @@ -0,0 +1,12 @@ +// errorcheckoutput ./index.go + +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Generate test of index and slice bounds checks. +// The output is error checked. + +package main + +const pass = 1 Index: go.test/test/index2.go === --- go.test/test/index2.go (revision 0) +++ go.test/test/index2.go (revision 0) @@ -0,0 +1,12 @@ +// errorcheckoutput ./index.go + +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Generate test of index and slice bounds checks. +// The output is error checked. + +package main + +const pass = 2 Index: go.test/test/index.go === --- go.test/test/index.go (revision 193259) +++ go.test/test/index.go (working copy) @@ -1,27 +1,19 @@ -// $G $D/$F.go && $L $F.$A && -// ./$A.out -pass 0 >tmp.go && $G tmp.go && $L -o $A.out1 tmp.$A && ./$A.out1 && -// ./$A.out -pass 1 >tmp.go && errchk $G -e tmp.go && -// ./$A.out -pass 2 >tmp.go && errchk $G -e tmp.go -// rm -f tmp.go $A.out1 - -// NOTE: This test is not run by 'run.go' and so not run by all.bash. -// To run this test you must use the ./run shell script. +// skip // Copyright 2010 The Go Authors. All rights reserved. // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. // Generate test of index and slice bounds checks. -// The output is compiled and run. +// The actual tests are index0.go, index1.go, index2.go. package main import ( "bufio" - "flag" "fmt" "os" - "runtime" + "unsafe" ) const prolog = ` @@ -155,14 +147,13 @@ func bug() { func main() { ` -// Passes: +// pass variable set in index[012].go // 0 - dynamic checks // 1 - static checks of invalid constants (cannot assign to types) // 2 - static checks of array bounds -var pass = flag.Int("pass", 0, "which test (0,1,2)") func testExpr(b *bufio.Writer, expr string) { - if *pass == 0 { + if pass == 0 { fmt.Fprintf(b, "\ttest(func(){use(%s)}, %q)\n", expr, expr) } else { fmt.Fprintf(b, "\tuse(%s) // ERROR \"index|overflow\"\n", expr) @@ -172,12 +163,10 @@ func testExpr(b *bufio.Writer, expr stri func main() { b := bufio.NewWriter(os.Stdout) - flag.Parse() - - if *pass == 0 { - fmt.Fprint(b, "// $G $D/$F.go && $L $F.$A && ./$A.out\n\n") + if pass == 0 { + fmt.Fprint(b, "// run\n\n") } else { - fmt.Fprint(b, "// errchk $G -e $D/$F.go\n\n") + fmt.Fprint(b, "// errorcheck\n\n") } fmt.Fprint(b, prolog) @@ -225,9 +214,10 @@ func main() { // the next pass from running. // So run it as a separate check. 
thisPass = 1 - } else if a == "s" && n == "" && (i == "i64big" || i == "i64bigger") && runtime.GOARCH == "amd64" { -// On amd64, these huge numbers do fit in an int, so they are not -// rejected at compile time. + } else if a == "s" && n == "" && (i == "i64big" || i == "i64bigger") && unsafe.Sizeof(int(0)) > 4 { +// If int is 64 bits, these huge +// numbers do fit in an int, so they +// are not rejected at compile time. thisPass = 0 } else { thisPass = 2 @@ -240,7 +230,7 @@ func main() { } // Only print the test case if it is appropriate for this pass. - if thisPass == *pass { + if thisPass == pass { pae := p+a+e+big cni := c+n+i Index: go.test/go-test.exp === --- go.test/go-test.exp (revision 193259) +++ go.test/go-test.exp (working copy) @@ -449,15 +449,23 @@ proc go-gc-tests { } { # This is a vanilla compile and link test. set dg-do-what-default "link" go-dg-runtest $test "-w $DEFAULT_GOCFLAGS" - } elseif { $test_line == "// runoutput" \ + } elseif { [string match "// runoutput*" $test_line] \ || ($test_line ==
[PATCH] Handle abortTransaction with RTM
On 2012-11-08 15:32, Torvald Riegel wrote: > On Thu, 2012-11-08 at 15:38 +, Lai, Konrad wrote: >> The xabort argument is the only "escape" currently allowed in RTM. So it is >> not possible to use a separate Boolean in memory. > > No, that's not what I meant. The boolean would be used in libitm's > htm_abort(), which the architecture-specific code (eg, > config/x86/target.h) would then change into whatever the particular HTM > uses as abort reasons (eg, true would become 0xff). That's just to keep > as much of libitm portable as possible, nothing more. > >> [Roman just posted an example in >> http://software.intel.com/en-us/blogs/2012/11/06/exploring-intel-transactional-synchronization-extensions-with-intel-software >> ] >> >> I don't know "any" large code that uses cancel. Someone claim there is one >> in STAMP, and one got speed up if it was removed. I think this discussion >> potentially explains why. > > The abort in STAMP is bogus. They didn't instrument all of the code (if > you use the manual instrumentation), and the abort is just there to > "catch" race conditions and other inconsistencies. > > My suggestions for next steps are to move the begin stuff to assembly. > After that, we can go for the abort branch, if removing it really makes > a non-negligible performance difference. I believe this is the sort of patch that Torvald was talking about for handling abortTransaction with RTM. Andi, can you please test? r~ diff --git a/libitm/beginend.cc b/libitm/beginend.cc index 4369946..8f5c4ef 100644 --- a/libitm/beginend.cc +++ b/libitm/beginend.cc @@ -166,18 +166,16 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb) GTM_fatal("pr_undoLogCode not supported"); #if defined(USE_HTM_FASTPATH) && !defined(HTM_CUSTOM_FASTPATH) - // HTM fastpath. Only chosen in the absence of transaction_cancel to allow - // using an uninstrumented code path. - // The fastpath is enabled only by dispatch_htm's method group, which uses - // serial-mode methods as fallback. Serial-mode transactions cannot execute - // concurrently with HW transactions because the latter monitor the serial - // lock's writer flag and thus abort if another thread is or becomes a - // serial transaction. Therefore, if the fastpath is enabled, then a - // transaction is not executing as a HW transaction iff the serial lock is - // write-locked. This allows us to use htm_fastpath and the serial lock's - // writer flag to reliable determine whether the current thread runs a HW - // transaction, and thus we do not need to maintain this information in - // per-thread state. + // HTM fastpath. The fastpath is enabled only by dispatch_htm's method + // group, which uses serial-mode methods as fallback. Serial-mode + // transactions cannot execute concurrently with HW transactions because + // the latter monitor the serial lock's writer flag and thus abort if + // another thread is or becomes a serial transaction. Therefore, if the + // fastpath is enabled, then a transaction is not executing as a HW + // transaction iff the serial lock is write-locked. This allows us to + // use htm_fastpath and the serial lock's writer flag to reliable determine + // whether the current thread runs a HW transaction, and thus we do not + // need to maintain this information in per-thread state. // If an uninstrumented code path is not available, we can still run // instrumented code from a HW transaction because the HTM fastpath kicks // in early in both begin and commit, and the transaction is not canceled. 
@@ -187,7 +185,7 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb) // indeed in serial mode, and HW transactions should never need serial mode // for any internal changes (e.g., they never abort visibly to the STM code // and thus do not trigger the standard retry handling). - if (likely(htm_fastpath && (prop & pr_hasNoAbort))) + if (likely(htm_fastpath)) { for (uint32_t t = htm_fastpath; t; t--) { @@ -198,33 +196,41 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb) // Monitor the writer flag in the serial-mode lock, and abort // if there is an active or waiting serial-mode transaction. if (unlikely(serial_lock.is_write_locked())) - htm_abort(); + htm_abort_retry(); else // We do not need to set a_saveLiveVariables because of HTM. return (prop & pr_uninstrumentedCode) ? a_runUninstrumentedCode : a_runInstrumentedCode; } - // The transaction has aborted. Don't retry if it's unlikely that + // The transaction has aborted. Retry if it's likely that // retrying the transaction will be successful. - if (!htm_abort_should_retry(ret)) - break; - // Wait until any concurrent serial-mode transactions ha
Re: [PATCH] Fix up assembly thunks with -gstabs+ (PR debug/54499)
jakub> 2012-11-08 Jakub Jelinek jakub> jakub> PR debug/54499 jakub> * cgraphunit.c (assemble_thunk): Don't call source_line debug hook jakub> here, instead call insn_locations_{init,finalize} and initialize jakub> prologue_location. jakub> jakub> * g++.dg/debug/pr54499.C: New test. Thanks, Jakub, this does look like a better fix. I had tried something like this, but it didn't work -- I was missing the _init and _finalize calls, and I didn't set prologue_location either. rth> I recall mentioning at the time that I thought Cary's approach was error-prone. Well, yes, you did, but then you backed off after I tested an ARM compiler: On Mon, Aug 6, 2012 at 2:53 PM, Richard Henderson wrote: > On 08/06/2012 02:45 PM, Cary Coutant wrote: >> Do you still have concerns about the patch? > > Nope. I'd mis-remembered from whence things get finalized. > > Revised patch is ok. I do agree with you, though, that it would still be better to treat thunks as first-class objects. -cary
Re: [PATCH] Handle abortTransaction with RTM
Richard Henderson writes: > > static inline void > -htm_abort () > +htm_abort_retry () > { > // ??? According to a yet unpublished ABI rule, 0xff is reserved and > // supposed to signal a busy lock. Source: andi.kl...@intel.com > _xabort(0xff); > } > > +static inline void > +htm_abort_cancel () > +{ > + // ??? What's the unpublished ABI rule for this, Andi? There is none for cancel, just for lock-is-locked (0xfe) and lock-busy (0xff). The convention is just for easier abort profiling. The profiler (perf) can display this abort code, and it's far easier to understand if common situations have their standard code. But you can always make up your own too. -Andi -- a...@linux.intel.com -- Speaking for myself only
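To see the convention from user code rather than from inside libitm, the sketch below (illustrative only, not from the patch; it needs -mrtm and the RTM intrinsics declared in <immintrin.h>, and the lock_word/take_lock_slow names are invented) shows the usual elision pattern in which an explicit _xabort code such as 0xff tells a profiler why the transaction gave up; the matching unlock/_xend path is omitted:

#include <immintrin.h>

extern volatile int lock_word;   /* 0 = free, 1 = held -- illustrative only */
void take_lock_slow (void);      /* non-transactional fallback -- illustrative only */

void enter_critical_section (void)
{
  unsigned int status = _xbegin ();
  if (status == _XBEGIN_STARTED)
    {
      /* Reading the lock word puts it into the read set, so a later
         writer aborts us -- the same trick libitm plays with the
         serial lock's writer flag.  */
      if (lock_word != 0)
        _xabort (0xff);          /* conventional "lock busy" code */
      return;                    /* keep speculating; _xend happens on unlock */
    }

  /* Aborted: the explicit code can be recovered for diagnostics.  */
  if ((status & _XABORT_EXPLICIT) && _XABORT_CODE (status) == 0xff)
    {
      /* somebody held the lock; just take the slow path */
    }
  take_lock_slow ();
}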
Re: [C++11] PR54413 Option for turning off compiler extensions for numeric literals.
On 11/07/2012 10:29 AM, Jakub Jelinek wrote: On Wed, Nov 07, 2012 at 10:22:57AM -0500, Jason Merrill wrote: I thought about that. We'd need some machinery that would allow cpp to query what has been declared already. Or alternately, always treat them as user-defined in C++ mode and have the front end decide to use the built-in interpretation if no literal operator is declared. Yeah, I think that would be best. Jakub OK, this passes x86_64-linux. This is my first rodeo with *.texi. I just have one flag: -fext-numeric-literals. It leaves gnu extension suffixes for std=gnu++*, and std=c++98. I sets the flag off (use suffixes for user-defined literals) for std=c++11+. Ed libcpp 2012-11-09 Ed Smith-Rowland <3dw...@verizon.net> PR c++/54413 * include/cpplib.h (cpp_interpret_float_suffix): Add cpp_reader* arg. (cpp_interpret_int_suffix): Add cpp_reader* arg. * init.c (cpp_create_reader): Iitialize new flags. * expr.c (interpret_float_suffix): Use new flags. (cpp_interpret_float_suffix): Add cpp_reader* arg. (interpret_int_suffix): Use new flags. (cpp_interpret_int_suffix): Add cpp_reader* arg. (cpp_classify_number): Adjust calls to interpret_x_suffix. gcc/c-family 2012-11-09 Ed Smith-Rowland <3dw...@verizon.net> PR c++/54413 * c-opts.c (c_common_handle_option): Set new flags. * c.opt: Describe new flags. gcc/cp 2012-11-09 Ed Smith-Rowland <3dw...@verizon.net> PR c++/54413 * decl.c (grokfndecl): Adjust calls to interpret_x_suffix. gcc/testsuite 2012-11-09 Ed Smith-Rowland <3dw...@verizon.net> PR c++/54413 * g++.dg/cpp0x/gnu_fext-numeric-literals.C: New. * g++.dg/cpp0x/std_fext-numeric-literals.C: New. * g++.dg/cpp0x/gnu_fno-ext-numeric-literals.C: New. * g++.dg/cpp0x/std_fno-ext-numeric-literals.C: New. gcc 2012-11-09 Ed Smith-Rowland <3dw...@verizon.net> PR c++/54413 * doc/invoke.texi: Document f[no-]ext-numeric-literals flag. Index: libcpp/include/cpplib.h === --- libcpp/include/cpplib.h (revision 193296) +++ libcpp/include/cpplib.h (working copy) @@ -431,6 +431,10 @@ ud-suffix which does not beging with an underscore. */ unsigned char warn_literal_suffix; + /* Nonzero means interpret imaginary, fixed-point, or other gnu extension + literal number suffixes as user-defined literal number suffixes. */ + unsigned char ext_numeric_literals; + /* Holds the name of the target (execution) character set. */ const char *narrow_charset; @@ -854,10 +858,12 @@ const char **, source_location); /* Return the classification flags for a float suffix. */ -extern unsigned int cpp_interpret_float_suffix (const char *, size_t); +extern unsigned int cpp_interpret_float_suffix (cpp_reader *, const char *, + size_t); /* Return the classification flags for an int suffix. */ -extern unsigned int cpp_interpret_int_suffix (const char *, size_t); +extern unsigned int cpp_interpret_int_suffix (cpp_reader *, const char *, + size_t); /* Evaluate a token classified as category CPP_N_INTEGER. */ extern cpp_num cpp_interpret_integer (cpp_reader *, const cpp_token *, Index: libcpp/init.c === --- libcpp/init.c (revision 193296) +++ libcpp/init.c (working copy) @@ -182,6 +182,7 @@ CPP_OPTION (pfile, track_macro_expansion) = 2; CPP_OPTION (pfile, warn_normalize) = normalized_C; CPP_OPTION (pfile, warn_literal_suffix) = 1; + CPP_OPTION (pfile, ext_numeric_literals) = 1; /* Default CPP arithmetic to something sensible for the host for the benefit of dumb users like fix-header. 
*/ Index: libcpp/expr.c === --- libcpp/expr.c (revision 193296) +++ libcpp/expr.c (working copy) @@ -61,8 +61,8 @@ static cpp_num parse_defined (cpp_reader *); static cpp_num eval_token (cpp_reader *, const cpp_token *, source_location); static struct op *reduce (cpp_reader *, struct op *, enum cpp_ttype); -static unsigned int interpret_float_suffix (const uchar *, size_t); -static unsigned int interpret_int_suffix (const uchar *, size_t); +static unsigned int interpret_float_suffix (cpp_reader *, const uchar *, size_t); +static unsigned int interpret_int_suffix (cpp_reader *, const uchar *, size_t); static void check_promotion (cpp_reader *, const struct op *); /* Token type abuse to create unary plus and minus operators. */ @@ -87,7 +87,7 @@ length LEN, possibly zero. Returns 0 for an invalid suffix, or a flag vector describing the suffix. */ static unsigned int -interpret_float_suffix (const uchar *s, size_t len) +interpret_float_suffi
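The user-visible effect of the new flag is easiest to see through a user-defined literal. The snippet below is only an illustration (the _km suffix is invented and underscore-prefixed, so it is a plain user-defined literal under every mode); with the new -fno-ext-numeric-literals default for -std=c++11, GNU numeric suffixes written without an underscore are routed to this same operator"" lookup instead of being consumed as extensions:

/* Build with -std=c++11.  */
#include <iostream>

constexpr long double operator"" _km (long double v) { return v * 1000.0L; }

int main ()
{
  long double metres = 1.5_km;   /* calls operator"" _km, yielding 1500 */
  std::cout << metres << '\n';
  return 0;
}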
[PATCH] Fix array access beyond bounds in test cases
Hi, I found that some test cases access arrays beyond their bounds. I looked up their originating bugzillas and found that the test cases for pr22506 and pr34005 were likely to be typos since the original test cases in the report do not have this problem. For pr31227 however, I am inclined to think that the test case is incorrect. I am not sure what the test case verifies since the array accesses are obviously beyond bounds. Regards, Siddhesh testsuite/ChangeLog: 2012-11-09 Siddhesh Poyarekar * gcc.dg/Warray-bounds-3.c (bar): Keep array access within bounds for ABDAY, DAY, ABMON, MON, AM_PM. * gcc.dg/vect/pr22506.c (foo): Reduce loop iterations to within array bounds. * gcc.dg/vect/pr34005.c (XdmcpUnwrap): Likewise. diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-3.c b/gcc/testsuite/gcc.dg/Warray-bounds-3.c index 19cdb8e..773f463 100644 --- a/gcc/testsuite/gcc.dg/Warray-bounds-3.c +++ b/gcc/testsuite/gcc.dg/Warray-bounds-3.c @@ -34,31 +34,31 @@ bar (struct S *time) iov[1].iov_base = (void *) "def"; iov[1].iov_len = 3; - for (cnt = 0; cnt <= 7; ++cnt) + for (cnt = 0; cnt < 7; ++cnt) { iov[2 + cnt].iov_base = (void *) (time->abday[cnt] ?: ""); iov[2 + cnt].iov_len = strlen (iov[2 + cnt].iov_base) + 1; } - for (; cnt <= 14; ++cnt) + for (; cnt < 14; ++cnt) { iov[2 + cnt].iov_base = (void *) (time->day[cnt - 7] ?: ""); iov[2 + cnt].iov_len = strlen (iov[2 + cnt].iov_base) + 1; } - for (; cnt <= 26; ++cnt) + for (; cnt < 26; ++cnt) { iov[2 + cnt].iov_base = (void *) (time->abmon[cnt - 14] ?: ""); iov[2 + cnt].iov_len = strlen (iov[2 + cnt].iov_base) + 1; } - for (; cnt <= 38; ++cnt) + for (; cnt < 38; ++cnt) { iov[2 + cnt].iov_base = (void *) (time->mon[cnt - 26] ?: ""); iov[2 + cnt].iov_len = strlen (iov[2 + cnt].iov_base) + 1; } - for (; cnt <= 40; ++cnt) + for (; cnt < 40; ++cnt) { iov[2 + cnt].iov_base = (void *) (time->am_pm[cnt - 38] ?: ""); iov[2 + cnt].iov_len = strlen (iov[2 + cnt].iov_base) + 1; diff --git a/gcc/testsuite/gcc.dg/vect/pr22506.c b/gcc/testsuite/gcc.dg/vect/pr22506.c index 5a2d749..a618e32 100644 --- a/gcc/testsuite/gcc.dg/vect/pr22506.c +++ b/gcc/testsuite/gcc.dg/vect/pr22506.c @@ -7,8 +7,8 @@ void foo() { int i; -for (i=0; i<5; ++i) x[i]=0; -for (i=0; i<4; ++i) ; +for (i=0; i<3; ++i) x[i]=0; +for (i=0; i<2; ++i) ; } /* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/pr34005.c b/gcc/testsuite/gcc.dg/vect/pr34005.c index 813f950..86c3832 100644 --- a/gcc/testsuite/gcc.dg/vect/pr34005.c +++ b/gcc/testsuite/gcc.dg/vect/pr34005.c @@ -8,7 +8,8 @@ void XdmcpUnwrap (unsigned char *output, int k) int i; unsigned char blocks[2][8]; k = (k == 0) ? 1 : 0; - for (i = 0; i < 32; i++) + + for (i = 0; i < 8; i++) output[i] = blocks[k][i]; }
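The off-by-one being removed is the textbook pattern; isolated from the locale plumbing it looks like the sketch below (the array contents are illustrative, matching the 7-element abday case):

/* With a 7-element array the valid indices are 0..6, so "cnt <= 7"
   reads one element past the end; the fixed loops use "<" instead.  */
static const char *abday[7] = { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };

int first_letters (void)
{
  int total = 0;
  for (int cnt = 0; cnt < 7; ++cnt)   /* was: cnt <= 7 -- out of bounds */
    total += abday[cnt][0];
  return total;
}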
RE: [RFC] New feature to reuse one multilib among different targets
> -Original Message- > From: Joseph Myers [mailto:jos...@codesourcery.com] > Sent: Friday, November 09, 2012 5:10 AM > To: Terry Guo > Cc: gcc-patches@gcc.gnu.org > Subject: RE: [RFC] New feature to reuse one multilib among different > targets > > On Thu, 8 Nov 2012, Terry Guo wrote: > > > To convert such statements to data structure used by multilib_raw, I > > refactor codes in genmultilib into two functions combo_to_dir and > > The "function" keyword for creating shell functions is not POSIX, and I > don't know if we ensure that $SHELL is a shell supporting functions. > (It's documented that CONFIG_SHELL may need to be set to a POSIX shell > if /bin/sh isn't sufficient, but does that feed through to the value of > SHELL used to run this script?) > You are right that we should make the script POSIX-compliant. This v3 patch removed "function" and "local", which are not part of the POSIX standard. I also verified that CONFIG_SHELL is passed to this script with the value "/bin/sh". I checked the new genmultilib script with the command "checkbashisms --posix genmultilib" on Ubuntu; no warnings or errors were reported. [...] > > Documentation changes need mentioning in ChangeLog entries. > Added them to the following ChangeLog. BR, Terry 2012-11-09 Terry Guo * genmultilib (combo_to_dir): New function. (options_output): New function. (MULTILIB_REUSE): New argument. * Makefile.in (s-mlib): Add a new argument MULTILIB_REUSE. * gcc.c (multilib_reuse): New spec. (set_multilib_dir): Use multilib_reuse. * doc/fragments.texi: Mention MULTILIB_REUSE. multilib-reuse-v3.patch Description: Binary data