Re: fwprop and CSE const anchor opt
Thank you very much. This was very informative.

Richard Sandiford writes:
> If we have an instruction:
>
>   A: (set (reg Z) (plus (reg X) (const_int 0xdeadbeef)))
>
> we will need to use something like:
>
>      (set (reg Y) (const_int 0xdead))
>      (set (reg Y) (ior (reg Y) (const_int 0xbeef)))
>   B: (set (reg Z) (plus (reg X) (reg Y)))
>
> But if A is in a loop, the Y loads can be hoisted, and the cost
> of A is effectively the same as the cost of B. In other words,
> the (il)legitimacy of the constant operand doesn't really matter.

My guess is that, A not being a recognizable insn, this is relevant at RTL
expansion. Is this correct?

> In summary, the current costs generally work because:
>
>   (a) We _usually_ only apply costs to arbitrary instructions
>       (rather than candidate instruction patterns) before
>       loop optimisation.

I don't think I understand this point. I see the part that the cost is
typically queried before loop optimization, but I don't understand the
distinction between "arbitrary instructions" and "candidate instruction
patterns". Can you please explain the difference?

>   (b) It doesn't matter what we return for invalid candidate
>       instruction patterns, because recog will reject them anyway.
>
> So I suppose my next question is: are you seeing this problem with cse1
> or cse2? The reasoning behind the zero cost might still be valid for
> REG_EQUAL notes in cse1. However, it's probably not right for cse2,
> which runs after loop hoisting.

I am seeing it with both, so at least for cse2 we could do it with this.

> Perhaps we could add some kind of context parameter to rtx_costs
> to choose between the hoisting and non-hoisting cost. As well as
> helping with your case, it could let us use the non-hoisting cost
> before loop optimisation in cases where the insn isn't going to
> go in a loop. The drawback is that we then have to replicate
> even more of the .md file in rtx_costs.
>
> Alternatively, perhaps we could just assume that rtx_costs always
> returns the hoisted cost when optimising for speed, in which case
> I think your alternative solution would be theoretically correct
> (i.e. not a hack ;)).

OK, I think I am going to propose this in the patch then. It might still be
interesting to experiment with providing more context to rtx_costs.

> E.g. suppose we're deciding how to implement an in-loop multiplication.
> We calculate the cost of a multiplication instruction vs. the cost of a
> shift/add sequence, but we don't consider whether any of the
> backend-specific shift/add set-up instructions could be hoisted. This
> would lead to us using multiplication insns in cases where we don't
> want to.
>
> (This was one of the most common situations in which the zero cost
> helped.)

I am not sure I understand this. Why would we decide to hoist suboperations
of a multiplication? If it is loop-variant then even the suboperations are
loop-variant, whereas if it is loop-invariant then we can hoist the whole
operation. What am I missing?

Adam
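Putting the hoisting point in source terms, a minimal made-up example where
the expensive constant is loop-invariant (function name and target
assumptions are for illustration only, assuming 0xdeadbeef is not a
legitimate add immediate):

  long sum_plus_const (long *x, long n)
  {
    long s = 0;
    for (long i = 0; i < n; i++)
      s += x[i] + 0xdeadbeefL;  /* the two-insn synthesis of the constant
                                   is loop-invariant and can be hoisted,
                                   so inside the loop A costs the same
                                   as B */
    return s;
  }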
Re: [SOLVED] Re: stdint.h type information needed
Joseph S. Myers wrote:
> On Fri, 3 Apr 2009, Dave Korn wrote:
>
>> Got it: the key is that the types we use in our stdint.h target files
>> have to match the exact wording used at the top of
>> c_common_nodes_and_builtins:
>
> This requirement is documented in tm.texi (under SIZE_TYPE, to which the
> documentation of the other target macros refers).

Ah, silly me. I just read through the patch at that URL you posted at the
head of the thread, and that hunk is not present there. Thank you.

cheers,
DaveK
Re: Intermittent/non-reproducible gcc testsuite failures
On Tue, 2009-04-07 at 23:45 -0700, Michael Eager wrote:
> I'm running the gcc test suite on powerpc-unknown-eabisim
> on the trunk and I get results which are different from
> one run to the next. When I run the failing tests by
> hand, all pass. Mike Stein also has noted that some of
> the tests are intermittent failures.
> Does anyone have any suggestions on how to get one of
> these tests to fail consistently, or a different approach
> to finding the cause of the intermittent failures?

There are two (or three?) sources of intermittency here, and one of them is
the testsuite harness itself. One thing you could do to try and narrow it
down is to run *just* the test cases you are seeing intermittency in, using
RUNTESTFLAGS=foo.exp=bar.c, a dozen times and see if the results are stable.

Cheers,
Ben
Re: Intermittent/non-reproducible gcc testsuite failures
Michael Eager wrote:
> Does anyone have any suggestions on how to get one of
> these tests to fail consistently, or a different approach
> to finding the cause of the intermittent failures?

Perhaps hack the testsuite to run the tests under gdb, setting a breakpoint
on abort() that causes it to dump core? Then at least you'd be able to get
a backtrace and see some state post-mortem. Of course, being a Heisenbug,
it'll probably stop appearing when run under a debugger!

cheers,
DaveK
Re: IRIX stdint patch
Hi Tom,

> All of IRIX 6.x is not equal and stdint.h was not added until IRIX 6.5.

it occurred to me after I sent my mail.

> Please consider extending the patch to wrap stdint.h for any IRIX < 6.5
> and not just IRIX 5.

I don't have access to IRIX 6.2 any longer, but I'm pretty sure stdint.h
was not in 6.5.x proper. It seems it was added in IDF 1.3 only and never
was part of the base OS, so I'll assume it is available in every 6.5.x,
since I'm pretty sure you can't build or use GCC without the compiler_dev
package.

	Rainer

-
Rainer Orth, Faculty of Technology, Bielefeld University
Re: First PPL 0.10.1 release candidate
Dave Korn wrote:
> Roberto Bagnara wrote:
>> We have uploaded the first PPL 0.10.1 release candidate to
>> Please report any problem you may encounter to ppl-devel
>
> Hi Roberto and team,
>
> I am sorry to report some problems encountered.

I am pleased to report some problems resolved :)

> Target: i686-pc-cygwin, cygwin-1.7.0-42, gcc-4.3.2, ppl configured with
> --enable-shared --disable-static, no -fexceptions, no --enable-cxx.
>
> FAILs: 100%.

Target: i686-pc-cygwin, cygwin-1.7.0-42, gcc-4.3.2, ppl configured with
--enable-shared --disable-static, no -fexceptions, no --enable-cxx.

PASSes: 100%. :)

> I don't know if this is purely a problem of the testsuite
> framework, or if this represents a real problem in the library itself,

It was caused by a packaging glitch in the Cygwin distro, which can be
resolved upstream.

> leftover executables after the testsuite run all fail with exit code 128.
> I ran a few under gdb and they all said
>
>   Program exited with code 0200.
>
> so I guess this is a deliberate exit rather than a SEGV or other crash.

Apparently this is also one way in which a missing dependent DLL can
manifest.

cheers,
DaveK
Re: Syntactic sugar to access arrays and hashes in Objective-C
On Saturday, 2009-03-21, at 11:59 +0100, John Holdsworth wrote:
> I was wondering if it would be a useful extension to Objective-C to
> expand the [] operator to support array and hash references to NSArray
> and NSDictionary classes directly, to greatly improve the readability of
> code:

I'm not an ObjC front end maintainer and have no authority, but one issue I
would have with this feature in gcc proper is that the ObjC front end would
have to learn about the semantics of "NSArray" and "NSDictionary", which
are actually not part of the language but part of an external library.

Now gcc already supports the -fconstant-string-class option as one way to
embed knowledge about an external library into an executable. But I would
not like adding options with all these semantics from a foreign library
into the language implementation. Maybe this could be done more elegantly
with the plugin infrastructure that is currently being added:
http://gcc.gnu.org/wiki/plugins

Cheers,
David
Re: Intermittent/non-reproducible gcc testsuite failures
Dave Korn wrote:
> Michael Eager wrote:
>> Does anyone have any suggestions on how to get one of
>> these tests to fail consistently, or a different approach
>> to finding the cause of the intermittent failures?
>
> Perhaps hack the testsuite to run the tests under gdb, setting a
> breakpoint on abort() that causes it to dump core? Then at least you'd
> be able to get a backtrace and see some state post-mortem.

Thanks, Ben and Dave. I'll give it a try.

> Of course, being a Heisenbug, it'll probably stop appearing when run
> under a debugger!

Ah, yes. Hiding in the shadows.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
My plans on EH infrastructure
Hi,
while looking into problems of the current and pretty-ipa's inlining
heuristics implementation, I noticed that we have relatively important
problems with EH overhead that confuse the inliner's code size metrics.
Looking deeper into the EH problems, I think the main issues are:

 - The inliner tends to produce an exponential amount of destructor calls
   on programs with high abstraction penalty (pooma/boost/DLV all seem to
   have this). Every scope block such as { class a a; ... } contains an
   implicit cleanup. The destructor of class a is called twice, once for
   the end of the block, once for the cleanup. Now it is common for
   destructors to call destructors of contained objects and local
   iterators; this all happens in the same way, causing EH cleanup code
   that often exceeds the main function body.
 - Not inlining destructors in cleanups causes otherwise fully inlined
   objects to miss SRA/store sinking and other optimizations.
 - We tend to block optimization because a variable along the hot path has
   an abnormal PHI. Interestingly, this usually limits us to creating more
   copies, PHIs, conditionals and BBs that get almost completely cleaned up
   later in RTL land, but I don't think that is a good excuse to ignore
   these ;)
 - Cleanups calling destructors are often fully removable, but we don't do
   that very well. We remove completely dead destructors by means of the
   new local-pureconst pass. As soon as a destructor has some stores or
   loops, we realize the fact that they are dead very late in compilation.
   Also destructors calling delete() on some fields are not simplified.
   This is partly doable via Martin's IPA-SRA.

I thus started to look into improvements in the EH machinery and
elimination of code in the cleanups.

What is already done on mainline:

 - New ehcleanup pass: when an EH handler becomes empty by the local
   pureconst+DCE+DSE+complete unrolling combo, we can safely eliminate it.
   This further eliminates local nothrow region handlers, which are
   numerous.
 - There is a new local pureconst pass done during early optimization. It
   tries to prove stuff nothrow and pure. This leads to DCEing the
   destructors before they are accounted in inline metrics.
 - Some of the bits from RTL EH lowering are gone now. More should come.
   We really have here a partial transition from RTL EH code to gimple.
   This will hopefully simplify the RTL EH code to the basic part of
   adding landing pads.

What is already done on pretty-ipa:

 - The inliner heuristics can be a bit smarter about code size estimates
   of how code will look after inlining. In particular, MUST_NOT_THROW
   receivers are most likely optimized away or commonized and can be
   ignored. Also all the FILTER_EXPR/OBJ_REF_EXPR sets/reads are probably
   going to disappear after lowering.

What I have an implementation for and would like to push to pretty-ipa,
and later to mainline after some more evaluation:

 - EH edge redirection. This is conceptually an easy thing to do. When an
   EH edge needs to be redirected and it is the only edge to a BB, we just
   update the label in the tree of EH handlers. When it is not the only
   edge, we walk from the throwing statement's region up to the outermost
   handler region in the EH tree and duplicate everything on the way. This
   breaks some assumptions the EH code has. In particular, one EH handler
   can do RESX into another handler, and handlers can share labels. These
   assumptions are made in except.c, but they don't have to be: the EH
   tree is really just an on-the-side representation of the decision tree
   of what happens after an exception, and is lowered to such a form for
   dwarf. There is no need to know what happens after the EH is delivered.

While I don't expect immediate improvements in C++ codegen,
I benchmarked it on the C++ testers and there is some improvement in the
libstdc++ tests, and a little improvement in boost's wave and cwchessboard.
Partly this is because all the testcases we have do pretty much no EH
except for what is done implicitly by cleanup regions. The testcases
improving in libstdc++ do have try/catch constructs in them.

We probably should try to find some new benchmarks for the C++ testsuite
to cover this area, as well as cases where it is not the best solution to
inline everything completely. Probably something like mozilla would do the
job, but I don't know if there is an easy-to-compile benchmark application
for this. Our C++ testsuite is pretty much a testsuite for a specific kind
of C++ coding requiring the inliner to do a lot of work. This is not
surprising, because it was constructed with inliner tuning in mind.

I think this is an important infrastructure bit. In particular, it seems
to me that since this leaves us with abnormal edges produced only by
setjmp/longjmp, nonlocal labels and computed goto, and because of computed
goto factoring, only longjmp/setjmp edges are really unsplittable, and
thus we can get rid of abnormal phi handling and teach out-of-ssa to
insert conditionals into a
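As a concrete (made-up) illustration of the implicit-cleanup duplication
described above, the destructor body ends up on both exit paths of the
scope:

  struct A
  {
    ~A () { /* in real code, runs destructors of contained objects */ }
  };

  void may_throw () { throw 0; }  // placeholder for any throwing call

  void f ()
  {
    A a;           // scope block with implicit cleanup region
    may_throw ();  // if this throws, the EH cleanup path also runs a.~A()
  }                // the normal path runs a.~A() here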
Re: My plans on EH infrastructure
On Wed, 8 Apr 2009, Jan Hubicka wrote:

> Some remaining issues:
> - FILTER_EXPR/OBJ_REF_EXPR is currently handled in a quite dangerous way.
>   Original Rth's code made them quite 100% volatile. Now we can PRE them.
>   The FILTER_EXPR/OBJ_REF_EXPR are really hard registers in the RTL world
>   that are set by the EH runtime on EH edges from the throwing statement
>   to the handler (not at RESX edges), and they are used by pre-landing
>   pads and also by RESX code.
>
>   It would be more precise if the RESX instruction took the
>   FILTER_EXPR/OBJ_REF_EXPR values as arguments (since it really uses the
>   values) so we can kill the magicness of the sets. The problem is that I
>   don't think there is a good SSA representation for a register that
>   implicitly changes over edges.
>
>   Take the example
>
>   call (); EH edge to handler 1
>
>   receiver 1:
>      tmp1 = filter_expr;
>      tmp2 = obj_ref_expr;
>      call2 (); EH edge to handler 2
>   label1:
>      filter_expr = tmp1
>      obj_ref_expr = tmp2
>      resx (tmp1, tmp2)
>
>   handler 2:
>      tmp3 = filter_expr;
>      tmp4 = obj_ref_expr;
>      if (conditional)
>        goto label1;
>      else
>        filter_expr = tmp3
>        obj_ref_expr = tmp4
>        resx (tmp3, tmp4);
>
>   In this case tmp1 != tmp3 and tmp2 != tmp4, and thus it is invalid to
>   optimize the second resx to (tmp1, tmp3). There is nothing to model
>   this. I wonder if FILTER_EXPR/OBJ_REF_EXPR can't be best handled by
>   just being volatile to the majority of optimizations (i.e. assumed to
>   change value all the time) and handling this in copyprop/PRE as a
>   special case, invalidating the value across EH edges?

Hm, what we have now is indeed somewhat dangerous. Correct would be to make
them global volatile variables and thus have volatile loads and stores. I
don't see any easy way of killing the values on EH edges - but certainly we
could teach the VN to do this. So in the end non-volatileness might work as
well, if the loaded values are properly used by sth.

I can have a look into the current state in the VN if you have a testcase
that produces the above CFG, for example.

> - The nature of code duplication in between cleanup at end of block and
>   cleanup in EH actually brings a lot of tail merging possibilities.
>   I wonder if we can handle this somehow effectively on SSA. In the RTL
>   world crossjumping would catch these cases if it was not almost 100%
>   ineffective due to the fact that we hardly re-use the same register
>   for temporaries.
>
>   I wonder if there is a reasonable SSA optimization that would have a
>   similar effect as tail merging here, or if we want to implement tail
>   merging on gimple.
> - Can we somehow conclude that a structure being destructed dies after
>   the destructors, so all writes to it are dead already in early
>   optimizations? That would allow a lot more DSE and cheaper inlining.
> - It is possible to make EH edges redirectable on RTL too. I wonder
>   if it is worth the effort however.
> - We ought to be able to prove finiteness of simple loops so we can
>   DCE more early and won't rely on full loop unrolling to get rid of
>   empty loops originally initializing dead arrays.

I wonder if CD-DCE should not catch this, but I see that for

  for(i=0;i<4;++i)
    ;

we do

  Marking useful stmt: if (i_1 <= 3)

which is already a problem if we want to DCE the loop. Can we mark the
controlling predicate necessary somehow only if we mark a stmt necessary
in the BBs it controls? Otherwise there is the empty loop removal pass ...

Richard.
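For reference, the loop under discussion as a self-contained unit (the
dead array is made up to give the loop a body):

  int f (void)
  {
    int a[4];
    for (int i = 0; i < 4; ++i)
      a[i] = 0;  /* 'a' is dead and the loop is provably finite, so the
                    whole loop should be removable by DCE */
    return 0;
  }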
Re: My plans on EH infrastructure
On Wed, 8 Apr 2009, Richard Guenther wrote:

> On Wed, 8 Apr 2009, Jan Hubicka wrote:
>> - The nature of code duplication in between cleanup at end of block and
>>   cleanup in EH actually brings a lot of tail merging possibilities.
>>   I wonder if we can handle this somehow effectively on SSA. In the RTL
>>   world crossjumping would catch these cases if it was not almost 100%
>>   ineffective due to the fact that we hardly re-use the same register
>>   for temporaries.
>>
>>   I wonder if there is a reasonable SSA optimization that would have a
>>   similar effect as tail merging here, or if we want to implement tail
>>   merging on gimple.
>> - Can we somehow conclude that a structure being destructed dies after
>>   the destructors, so all writes to it are dead already in early
>>   optimizations? That would allow a lot more DSE and cheaper inlining.
>> - It is possible to make EH edges redirectable on RTL too. I wonder
>>   if it is worth the effort however.
>> - We ought to be able to prove finiteness of simple loops so we can
>>   DCE more early and won't rely on full loop unrolling to get rid of
>>   empty loops originally initializing dead arrays.
>
> I wonder if CD-DCE should not catch this, but I see that for
>
>   for(i=0;i<4;++i)
>     ;
>
> we do
>
>   Marking useful stmt: if (i_1 <= 3)
>
> which is already a problem if we want to DCE the loop. Can we
> mark the controlling predicate necessary somehow only if we
> mark a stmt necessary in the BBs it controls?

Ah, it can do it but ...

  /* Prevent the loops from being removed.  We must keep the infinite
     loops, and we currently do not have a means to recognize the finite
     ones.  */
  FOR_EACH_BB (bb)
    {
      edge_iterator ei;
      FOR_EACH_EDGE (e, ei, bb->succs)
	if (e->flags & EDGE_DFS_BACK)
	  mark_control_dependent_edges_necessary (e->dest, el);
    }

thus what we could do is initialize loops and use number of iterations
analysis here and mark only the back edges of those where we do not know
the number of iterations. Of course that is both expensive.

Richard.
Re: My plans on EH infrastructure
> On Wed, 8 Apr 2009, Richard Guenther wrote:
> [...]
>
> Ah, it can do it but ...
>
>   /* Prevent the loops from being removed.  We must keep the infinite
>      loops, and we currently do not have a means to recognize the finite
>      ones.  */
>   FOR_EACH_BB (bb)
>     {
>       edge_iterator ei;
>       FOR_EACH_EDGE (e, ei, bb->succs)
>         if (e->flags & EDGE_DFS_BACK)
>           mark_control_dependent_edges_necessary (e->dest, el);
>     }

Yes, this is what I referred to.
I plan to write a simple predicate that will do similar analysis as the
number of iterations code, just be more relaxed.

For instance

  for (i=0; i<=max; i++)

can be optimized out for a signed counter, relying on the fact that the
overflow would have to happen in between max...INT_MAX, together with loops
of style

  for (i=0; i<max; i++)

in unsigned, and for MAX known and safely away from the end of the type we
should prove finiteness in most common cases.

I was wondering if loops of form

  for (i=0; ; i++)
    a[i]

can be assumed finite because eventually a[i] would get to unallocated
memory otherwise. This is however similar to infinite recursion

  a()
  {
    ...nonlooping...
    a();
  }

which will either overflow the stack or be finite. We previously concluded
here that it is invalid to optimize out such a function call..

> thus what we could do is initialize loops and use number of iterations
> analysis here and mark only the back edges of those where we do not know
> the number of iterations. Of course that is both expensive.

How far did we get with persistence of loop structures? This is probably a
bit we can stick somewhere and make DCE and pureconst use it.

Honza
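The two shapes as self-contained functions (names made up for
illustration):

  int count_signed (int max)
  {
    int n = 0;
    for (int i = 0; i <= max; i++)  /* finite: signed overflow past
                                       INT_MAX is undefined */
      n++;
    return n;
  }

  unsigned count_unsigned (unsigned max)
  {
    unsigned n = 0;
    for (unsigned i = 0; i < max; i++)  /* finite for every value of max */
      n++;
    return n;
  }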
Re: My plans on EH infrastructure
On Wed, 8 Apr 2009, Jan Hubicka wrote:

> [...]
> Yes, this is what I referred to.
> I plan to write a simple predicate that will do similar analysis as the
> number of iterations code, just be more relaxed.
>
> For instance
>
>   for (i=0; i<=max; i++)
>
> can be optimized out for a signed counter, relying on the fact that the
> overflow would have to happen in between max...INT_MAX, together with
> loops of style
>
>   for (i=0; i<max; i++)
>
> in unsigned, and for MAX known and safely away from the end of the type
> we should prove finiteness in most common cases.
>
> I was wondering if loops of form
>
>   for (i=0; ; i++)
>     a[i]
>
> can be assumed finite because eventually a[i] would get to unallocated
> memory otherwise. This is however similar to infinite recursion
>
>   a()
>   {
>     ...nonlooping...
>     a();
>   }
>
> which will either overflow the stack or be finite. We previously
> concluded here that it is invalid to optimize out such a function call..
>
>> thus what we could do is initialize loops and use number of iterations
>> analysis here and mark only the back edges of those where we do not
>> know the number of iterations. Of course that is both expensive.
>
> How far did we get with persistence of loop structures? This is probably
> a bit we can stick somewhere and make DCE and pureconst use it.

I think the infrastructure is there but nobody tried to keep it "alive"
yet ;)

Richard.
Re: My plans on EH infrastructure
> For instance
>
>   for (i=0; i<=max; i++)
>
> can be optimized out for a signed counter, relying on the fact that the
> overflow would have to happen in between max...INT_MAX, together with
> loops of style
>
>   for (i=0; i<max; i++)
>
> in unsigned, and for MAX known and safely away from the end of the type
> we should prove finiteness in most common cases.

Please handle also i!=max conditions. I have a patch to derive precise
value ranges for simple induction variables instead of punting to
[0, INT_MAX). With the patch VRP likes to turn i<max conditions into
i!=max ones.
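For reference, the i != max shape that a finiteness predicate would then
have to accept (made-up function):

  void clear (int *a, int max)
  {
    for (int i = 0; i != max; i++)  /* the exit condition VRP may produce
                                       from an original i < max check */
      a[i] = 0;
  }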
Re: My plans on EH infrastructure
> On Wed, 8 Apr 2009, Jan Hubicka wrote:
> [...]
>
> Hm, what we have now is indeed somewhat dangerous. Correct would be
> to make them global volatile variables and thus have volatile loads
> and stores.

Yep, but we definitely do want to DCE them at least (since we emit a lot of
redundant load+store pairs during lowering to handle the case of possibly
throwing code inside a cleanup). Having a way to PRE them would be nice.

I have code for early EH lowering. Try-catch or exception-allowed regions
basically expand to something like

  switch (filter_expr)
    {
    case 5: goto catch_region_1;
    case 7: goto catch_region_2;
    default: resx;
    }

Thus I think this is best done somewhere in the middle of
all_optimization_passes. Doing so is already a win, because most often RESX
expands to a simple goto and we merge blocks, and also because we use
switch expansion instead of an ugly sequence of ifs.

In fact I forgot to put one item in my list of experimental stuff, and it
is lowering of those trivial RESXes early (pre-inlining). It makes
ehcleanup matching more interesting, since we can see a sequence of EH
regions merged together instead of a single EH region, but it also saves
thousands of BBs on tramp.

Of course, it is possible when expanding the try...catch receiver to look
into incoming copies of still-live FILTER_EXPRs and use the temporaries
instead of relying on the SSA optimizers to work this out. But it is not a
100% trivial analysis.

> I don't see any easy way of killing the values on EH edges - but
> certainly we could teach the VN to do this. So in the end
> non-volatileness might work as well, if the loaded values are properly
> used by sth.
>
> I can have a look into the current state in the VN if you have a
> testcase that produces the above CFG for example.

It should be a try-catch inside the catch of a try-catch. I will try to
write one ;)

Honza
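A guess at the shape of such a testcase (may_throw is a made-up placeholder
for any call the compiler cannot prove nothrow):

  void may_throw ();

  void f ()
  {
    try {
      may_throw ();        // EH edge to handler 1
    } catch (...) {        // handler 1
      try {
        may_throw ();      // EH edge to handler 2
      } catch (...) {      // handler 2
        throw;             // RESX out of handler 2
      }
      throw;               // RESX out of handler 1
    }
  }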
Re: Fixing the pre-pass scheduler on x86 (Bug 38403)
Steven Bosscher wrote:
> On Wed, Apr 8, 2009 at 5:19 AM, Vladimir Makarov wrote:
>> I've been working on register-pressure sensitive insn scheduling the
>> last two months and I hope to submit this work for gcc 4.5. I am also
>> implementing a mode in the insn scheduler to do only live range
>> shrinkage.
>
> Is all of this still necessary if the selective scheduler (with register
> renaming) is made to work on i686/x86_64 after reload?

That is a really interesting question, Steven. I thought about this for a
few months (since last fall). Here are the points that resulted in me
starting my work on the haifa scheduler:

1. The selective scheduler works only for Itanium now. As far as I know
there are some plans to make it work on PPC, but there are no plans to make
it work for other architectures for now.

2. My understanding is that (register-pressure sensitive insn scheduling +
RA + 2nd insn scheduling) is not equal to (RA + selective scheduling with
register renaming) from the point of view of the potential performance
results. In the first case, the pressure-sensitive insn scheduler can
improve RA by live-range shrinkage. In the 2nd case that is already
impossible. It could be improved by a 2nd RA, but RA is more time-consuming
than scheduling now. In general this chicken-and-egg problem could be
solved by iterating the two contradictory passes, insn scheduling + RA (or
RA + insn scheduling). It is a matter of practicality when to stop these
iterations.

3. My current understanding is that the selective scheduler is overkill
for architectures with few registers. In other words, I don't think it
will give better performance for such architectures. On the other hand, it
is much slower than the haifa scheduler because of its complexity. The
haifa scheduler before RA already adds 3-5% compile time; the selective
scheduler would add 5-7% more compile time without performance improvement
for the mainstream architectures x86/x86_64. I think that is intolerable.

I think problems #1 and #2 could be resolved by investing significant
resources. I don't think problem #3 could be resolved. I'd like to be
wrong, and will accept this if somebody shows me. But they will need my
implementation of the register-pressure sensitive insn scheduler in any
case to prove it.

Although I'd wish to see one insn scheduler, as many other people would, I
now believe that the haifa scheduler and the selective scheduler (by the
way, they share a lot of common code) will coexist. One could be used for
architectures with few registers and the other for architectures with
massive fine-grain parallelism which need a lot of explicit registers. We
could still remove the EBB insn scheduler, because the selective scheduler
works much better.

I hope I answered your question, which is difficult for me because I
always supported the selective-scheduling project and really wanted to see
it as a single insn scheduler.
Re: Fixing the pre-pass scheduler on x86 (Bug 38403)
Vladimir Makarov wrote:
> Steven Bosscher wrote:
>> Is all of this still necessary if the selective scheduler (with
>> register renaming) is made to work on i686/x86_64 after reload?
>
> That is a really interesting question, Steven. [...]
>
> 1. The selective scheduler works only for Itanium now. As far as I know
> there are some plans to make it work on PPC, but there are no plans to
> make it work for other architectures for now.

There are some patches for fixing sel-sched on PPC that I need to ping,
thanks for reminding me :) They were too late for regression-only mode,
and we didn't get access to modern PPCs as we hoped, so this was not an
issue earlier. Btw, when we submitted the scheduler, it did work on x86-64
(compile farm); I don't know whether this is still the case.

> 2. [...]
>
> 3. My current understanding is that the selective scheduler is overkill
> for architectures with few registers. [...] The haifa scheduler before
> RA already adds 3-5% compile time; the selective scheduler would add
> 5-7% more compile time without performance improvement for the
> mainstream architectures x86/x86_64. I think that is intolerable.

We still plan to do some speedups to the sel-sched code within the 4.5
timeframe, mainly to the dependence handling code. But even after that, I
agree that for out-of-order architectures the selective scheduler will
probably be overkill. Besides, register renaming itself would remain an
expensive operation, because it needs to scan all paths along which an
insn was moved to prove that the new register can be used, though we have
invested a lot of time in speeding up this process via various caching
mechanisms.

On ia64, register renaming is useful for pipelining loops. Also, we have
tried to limit register renaming for the 1st-pass selective scheduler via
tracking register pressure and having a cutoff for that, but it didn't
work out very well on ia64, so I agree that much more RA knowledge should
be brought in for this task.

Hope this helps. Vlad, Steven, thanks for caring.

Andrey
Support question for GCC
GNU,

Is there an end-of-support date for GCC version 4.3.0? I'm assisting a
customer here at NAWCWD China Lake, California, to register software in our
Navy database, DADMS. In order to use (or purchase) software, it must be
approved in DADMS. One of the things we must show is proof of vendor
support for the particular version. Normally with commercial (COTS)
software, the manufacturer has an end-of-support or end-of-life date, but
I'm not sure how this works for free or open source software. I looked at
the below links, but could not find the answer. Any help would be
appreciated.

Thanks,
Gary Nordvall
Code 722000D
NAWCWD China Lake, CA 93555
760-939-2059
gary.nordv...@navy.mil

-----Original Message-----
From: Jeanne Rasata via RT [mailto:i...@fsf.org]
Sent: Wednesday, April 08, 2009 8:55
To: Nordvall, Gary
Subject: [gnu.org #427714] Support question for GCC

Hello, Gary,

> [gary.nordv...@navy.mil - Wed Apr 08 11:49:11 2009]:
> Is there an end-of-support date for GCC version 4.3.0? I'm assisting
> a customer here at NAWCWD China Lake, California, to register software
> in our Navy database, DADMS. In order to use (or purchase) software,
> it must be approved in DADMS. One of the things we must show is proof
> of vendor support for the particular version. Normally with commercial
> (COTS) software, the manufacturer has an end-of-support or end-of-life
> date, but I'm not sure how this would work with free or open source
> software. Any help would be appreciated.

I'm sorry, but as this is only a general contact address, I cannot properly
answer technical questions such as yours. The best I can do is refer you to
the GCC Manual at http://gcc.gnu.org/onlinedocs/ and the Frequently Asked
Questions at http://gcc.gnu.org/faq.html. If neither of those provides an
answer to your question, please contact the GCC users' help list. You can
learn more about it at http://gcc.gnu.org/lists.html.

I am sorry that I couldn't be of more help.

Sincerely,
--
Jeanne Rasata
Program Assistant
Free Software Foundation

---
Have we been helpful to you today? Would you like to help the FSF continue
to spread the word about software freedom? You too can become a member!
Learn more at: http://donate.fsf.org
Re: Need some help with fixincludes.
Hi!

I have made some progress with your help. I have fixed the sed part:
(1) there were missing 's's in the scripts (at first I did not notice it,
then I did not know if I was supposed to provide it); (2) I have replaced
[ \t]+ by [ \t][ \t]* to get:

--- ../_gcc_clean/fixincludes/inclhack.def	2009-03-31 22:37:57.0 +0200
+++ fixincludes/inclhack.def	2009-04-07 22:28:11.0 +0200
@@ -1023,6 +1023,33 @@

 /*
+ * Fix stdint.h header on Darwin.
+ */
+fix = {
+    hackname = darwin_stdint;
+    mach = "*-*-darwin*";
+    files = stdint.h;
+    sed = "s/#define[ \t][ \t]*INTPTR_MIN[ \t][ \t]*INT64_MIN/#define INTPTR_MIN ((intptr_t) INT64_MIN)/";
+    sed = "s/#define[ \t][ \t]*INTPTR_MIN[ \t][ \t]*INT32_MIN/#define INTPTR_MIN ((intptr_t) INT32_MIN)/";
+    sed = "s/#define[ \t][ \t]*INTPTR_MAX[ \t][ \t]*INT64_MAX/#define INTPTR_MAX ((intptr_t) INT64_MAX)/";
+    sed = "s/#define[ \t][ \t]*INTPTR_MAX[ \t][ \t]*INT32_MAX/#define INTPTR_MAX ((intptr_t) INT32_MAX)/";
+    sed = "s/#define[ \t][ \t]*UINTPTR_MAX[ \t][ \t]*INT64_MAX/#define UINTPTR_MAX ((uintptr_t) INT64_MAX)/";
+    sed = "s/#define[ \t][ \t]*UINTPTR_MAX[ \t][ \t]*INT32_MAX/#define UINTPTR_MAX ((uintptr_t) INT32_MAX)/";
+    sed = "s/#define[ \t][ \t]*SIZE_MAX[ \t][ \t]*INT32_MAX/#define SIZE_MAX ((size_t) INT32_MAX)/";
+    sed = "s/#define[ \t][ \t]*SIZE_MAX[ \t][ \t]*INT64_MAX/#define SIZE_MAX ((size_t) INT64_MAX)/";
+    sed = "s/#define[ \t][ \t]*UINT8_C(v)[ \t][ \t]*(v ## U)/#define UINT8_C(v) (v)/";
+    sed = "s/#define[ \t][ \t]*UINT16_C(v)[ \t][ \t]*(v ## U)/#define UINT16_C(v) (v)/";
+    test_text = "#define INTPTR_MIN	INT64_MIN\n"
+                "#define INTPTR_MAX	INT64_MAX\n"
+                "#define UINTPTR_MAX	UINT64_MAX\n"
+                "#define SIZE_MAX	UINT64_MAX\n"
+                "#define UINT8_C(v)	(v ## U)\n"
+                "#define UINT16_C(v)	(v ## U)\n";
+};
+
+
 /*
  * Fix on Digital UNIX V4.0:
  * It contains a prototype for a DEC C internal asm() function,
  * clashing with gcc's asm keyword.  So protect this with __DECC.

Now when I run 'make check' in fixincludes I get:

*** /opt/gcc/gcc-4.5-work/fixincludes/tests/base/stdint.h	Wed Apr  1 17:37:21 2009
***************
*** 9,25 ****
- #if defined( DARWIN_STDINT_CHECK )
- #define INTPTR_MIN ((intptr_t) INT64_MIN)
- #define INTPTR_MAX ((intptr_t) INT64_MAX)
- #define UINTPTR_MAX UINT64_MAX
- #define SIZE_MAX __SIZE_MAX__
- #define UINT8_C(v) (v)
- #define UINT16_C(v) (v)
-
- #endif  /* DARWIN_STDINT_CHECK */
-
-
  #if defined( IRIX_STDINT_C99_CHECK )
  #if 0
  #error This header file is to be used only for c99 mode compilations
--- 9,14 ----

If I understand the relevant part of Joseph's answer, I have to make some
change in fixincludes/tests/base/stdint.h by replacing it with
fixincludes/tests/res/stdint.h. Is this correct?

From the first part of Joseph's answer, I understand that the *intptr_t
casts in the if block are not permitted, isn't it? Question: are *intptr_t
the right types?

Thanks,
Dominique
[lto] Merge from mainline @145637
This merge simplified some code in the streamer, as we no longer need to
worry about memory tags. It also exposed a couple of bugs in EH handling
(we were assuming that shared EH regions always occurred when the original
region was before the aliases in the EH table) and in statement streaming
(we were not clearing out pointer fields in the tuple).

There is a new bug exposed by alias-improvements where alias_may_ref_p()
returns false on two field references inside the same union. This exposes a
problem in get_alias_set(), which for gimple is returning two different
values for union references. This causes a few execution failures (9) in
check-gcc. I will be fixing those separately.

2009-04-08  Diego Novillo

	Mainline merge @145637.
	* configure.ac (acx_pkgversion): Update revision merge string.
	* configure: Regenerate.

2009-04-08  Diego Novillo

	* gimple.h (gimple_reset_mem_ops): Remove.  Update all users.
	* ipa-cp.c (ipcp_ltrans_cloning_candidate_p): Remove.
	Update all users.
	(ipcp_cloning_candidate_p): Do not check for LTRANS.
	(ipcp_update_callgraph): Revert handling of cloned caller nodes.
	(graph_gate_cp): Add FIXME lto note.
	* lto-function-in.c (input_expr_operand): Do not call build7.
	Remove handling of NAME_MEMORY_TAG and SYMBOL_MEMORY_TAG.
	(input_eh_region): Return NULL for LTO_eh_table_shared_region.
	(fixup_eh_region_pointers): Setup region sharing using AKA
	bitmap sets.
	(input_gimple_stmt): Clear tuple fields with pointer values.
	Mark the statement modified.
	(input_function): Call update_ssa to update SSA on .MEM.
	(input_tree_operand): Remove handling of NAME_MEMORY_TAG and
	SYMBOL_MEMORY_TAG.
	* lto-function-out.c (output_eh_region): Do not output the
	region number for LTO_eh_table_shared_region.
	(output_expr_operand): Remove handling of NAME_MEMORY_TAG and
	SYMBOL_MEMORY_TAG.
	(output_bb): Do not write PHI nodes for .MEM.
	(output_tree_with_context): Remove handling of NAME_MEMORY_TAG
	and SYMBOL_MEMORY_TAG.
	* lto-tree-flags.def: Likewise.
	* tree-into-ssa.c: Call bitmap_obstack_initialize.

--- gimple.h	2009/04/07 01:10:39	1.1
+++ gimple.h	2009/04/07 15:22:08
@@ -1520,33 +1520,6 @@ gimple_references_memory_p (gimple stmt)
 }
 
-
-/* Reset all the memory operand vectors in STMT.  Note that this makes
-   no attempt at freeing the existing vectors, it simply clears them.
-   It is meant to be used when reading GIMPLE from a file.  */
-
-static inline void
-gimple_reset_mem_ops (gimple stmt)
-{
-  if (!gimple_has_ops (stmt))
-    return;
-
-  gimple_set_modified (stmt, true);
-
-  stmt->gsops.opbase.addresses_taken = NULL;
-  stmt->gsops.opbase.def_ops = NULL;
-  stmt->gsops.opbase.use_ops = NULL;
-
-  if (gimple_has_mem_ops (stmt))
-    {
-      stmt->gsmem.membase.vdef_ops = NULL;
-      stmt->gsmem.membase.vuse_ops = NULL;
-      stmt->gsmem.membase.stores = NULL;
-      stmt->gsmem.membase.loads = NULL;
-    }
-}
-
-
 /* Return the subcode for OMP statement S.  */
 
 static inline unsigned
--- ipa-cp.c	2009/04/07 01:10:39	1.1
+++ ipa-cp.c	2009/04/07 01:42:53
@@ -379,33 +379,6 @@ ipcp_print_all_lattices (FILE * f)
 }
 
-/* Return true if this NODE may be cloned in ltrans.
-
-   FIXME lto: returns false if any caller of NODE is a clone, described
-   in http://gcc.gnu.org/ml/gcc/2009-02/msg00297.html; this extra check
-   should be deleted if the underlying issue is resolved.  */
-
-static bool
-ipcp_ltrans_cloning_candidate_p (struct cgraph_node *node)
-{
-  struct cgraph_edge *e;
-
-  /* Check callers of this node to see if any is a clone.  */
-  for (e = node->callers; e; e = e->next_caller)
-    {
-      if (cgraph_is_clone_node (e->caller))
-	break;
-    }
-  if (e)
-    {
-      if (dump_file)
-	fprintf (dump_file, "Not considering %s for cloning; has a clone caller.\n",
-		 cgraph_node_name (node));
-      return false;
-    }
-  return true;
-}
-
 /* Return true if this NODE is viable candidate for cloning.  */
 static bool
 ipcp_cloning_candidate_p (struct cgraph_node *node)
@@ -422,11 +395,6 @@ ipcp_cloning_candidate_p (struct cgraph_
   if (!node->needed || !node->analyzed)
     return false;
 
-  /* If reading ltrans, we want an extra check here.
-     FIXME lto: see ipcp_ltrans_cloning_candidate_p above for details.  */
-  if (flag_ltrans && !ipcp_ltrans_cloning_candidate_p (node))
-    return false;
-
   if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
     {
       if (dump_file)
@@ -970,7 +938,6 @@ ipcp_update_callgraph (void)
       struct ipa_node_params *info = IPA_NODE_REF (orig_node);
       int i, count = ipa_get_param_count (info);
       struct cgraph_edge *cs, *next;
-      gimple *call_stmt_map;
 
       for (i = 0; i < count; i++)
	{
@@ -989,27 +95
Re: My plans on EH infrastructure
On Wed, Apr 8, 2009 at 5:19 PM, Richard Guenther wrote:
>> - We ought to be able to prove finiteness of simple loops so we can
>>   DCE more early and won't rely on full loop unrolling to get rid of
>>   empty loops originally initializing dead arrays.
>
> I wonder if CD-DCE should not catch this, but I see that for
>
>   for(i=0;i<4;++i)
>     ;
>
> we do
>
>   Marking useful stmt: if (i_1 <= 3)
>
> which is already a problem if we want to DCE the loop. Can we
> mark the controlling predicate necessary somehow only if we
> mark a stmt necessary in the BBs it controls?

This is just what CD-DCE is supposed to do. If it doesn't work, then CD-DCE
is broken.

Ciao!
Steven
Re: Support question for GCC
On Wed, Apr 8, 2009 at 6:50 PM, Nordvall, Gary wrote:
> GNU,
>
> Is there an end-of-support date for GCC version 4.3.0? I'm assisting a
> customer here at NAWCWD China Lake, California, to register software in
> our Navy database, DADMS. In order to use (or purchase) software, it must
> be approved in DADMS. One of the things we must show is proof of vendor
> support for the particular version. Normally with commercial (COTS)
> software, the manufacturer has an end-of-support or end-of-life date,
> but I'm not sure how this works for free or open source software. I
> looked at the below links, but could not find the answer. Any help would
> be appreciated.

The GCC developer community does not offer support in such a sense. A
proper contact would be a re-distributor which can offer you support
contracts, like for example the major Linux vendors.

Note that 4.3.0 is no longer supported by the GCC developer community, but
you will be asked to reproduce an issue with GCC version 4.3.3, which is
the current bugfix release of the GCC 4.3 series. The GCC 4.3 series will
likely be end-of-lifed at the point when GCC 4.5.0 is released.

Richard.
Re: Support question for GCC
On Wed, 2009-04-08 at 19:40 +0200, Richard Guenther wrote:
> On Wed, Apr 8, 2009 at 6:50 PM, Nordvall, Gary wrote:
>> Is there an end-of-support date for GCC version 4.3.0? [...]
>
> The GCC developer community does not offer support in such a sense.
> A proper contact would be a re-distributor which can offer you support
> contracts, like for example the major Linux vendors.

See the service directory for GNU software at
http://www.fsf.org/resources/service/

Janis
Re: My plans on EH infrastructure
Naive user question: is this going to improve the efficiency of throwing
exceptions, at least in the restricted cases where:
- the catch clause contains the throw (after inlining);
- interprocedural analysis can connect the throwing spot and the
  corresponding catch clause?

See my old PR 6588. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6588

--
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
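The first case is essentially the PR 6588 micro-benchmark shape; a minimal
sketch (the int payload is arbitrary):

  int f (int i)
  {
    try {
      throw i;        // thrower and handler end up in the same function
    } catch (int v) {
      return v;       // ideally reachable without the generic unwinder
    }
    return -1;        // not reached
  }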
Re: My plans on EH infrastructure
Sylvain Pion wrote:
> Naive user question: is this going to improve the efficiency
> of throwing exceptions, at least in the restricted cases where:
> - the catch clause contains the throw (after inlining);

I meant the try block, sorry.

> - interprocedural analysis can connect the throwing spot and
>   the corresponding catch clause?
>
> See my old PR 6588. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6588

--
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
Re: Support question for GCC
> Normally with Commercial (COTS) software, the manufacturer has an
> end-of-support or end-of-life date, but I'm not sure how this works
> for free or open source software.

Pretty much the same way, except that the company providing support is not
the "manufacturer". For both proprietary (what you call "commercial")
software and free or open source software, if you want support for the
software, you must contract with some company to provide it.

In the case of proprietary software, this is almost always the
"manufacturer" of the software, and the contract is often part of the
contract where you purchase the software. For free or open source software,
you can choose between a number of companies that provide support for the
software. Each company will have different policies regarding how much
support they provide for each version of the software and when they no
longer support that version. This question should be directed at the
company you've chosen to maintain your software.
Re: My plans on EH infrastructure
> Sylvain Pion wrote:
>> Naive user question: is this going to improve the efficiency
>> of throwing exceptions, at least in the restricted cases where:

There is little improvement already via EH cleanup: at least cleanups/catch
regions that turn out to be empty are now eliminated and do not slow down
the unwinding process. I have no idea how much performance it adds, but
cleanups that optimize to nothing are quite common, so it might show up in
code whose performance is bound by EH delivery speed (I don't expect that
to be that common, right?)

>> - the catch clause contains the throw (after inlining);
>
> I meant the try block, sorry.

"Inlining" the throw() call is another item I forgot on my list. My EH-fu
is not exactly on par here. I see no reason why we can't convert a throw()
call into a FILTER_EXPR/OBJ_REF_EXPR set and RESX, where the RESX will be
turned into a direct goto. It is also something I wanted to experiment
with. This has potential to improve C++ code, especially by turning
functions leaf and cheaper for inlining. Definitely on tramp there are
such cases.

One thing I worry about is that this effectively makes it impossible to
breakpoint on throw, which might be a common thing C++ users want to see
working. This can probably be partly fixed by emitting proper dwarf for an
inlined function around the RESX, but GDB won't use it at the moment for
breakpointing anyway.

We also should get profile estimates more precise here by knowing that
certain calls must throw. We are completely wrong here by basically
assuming that EH is never taken.

>> - interprocedural analysis can connect the throwing spot and
>>   the corresponding catch clause?

One can work out interprocedurally where the exception will be caught, but
what can one do with this info? Translating it to a nonlocal goto is
overkill because of the collateral damage the nonlocal goto infrastructure
would bring on the non-throwing path. If the EH delivery mechanism has to
be used, we are pretty much going as far as we can here: in the throwing
function we work out that there is no local catch block, and in the
callers similarly.

The pure-const pass can probably be improved to, in addition to marking
NOTHROW functions, also work out the list of types of exceptions. Does
this have a chance to help?

Also, if you are aware of an EH-heavy codebase that can be nicely
benchmarked, it would be interesting to consider adding it to our C++
benchmark suite. We don't cover this aspect at all. I would be interested
in real code that has non-trivial EH that is not bound by EH delivery
performance, but also to see how much the EH code slows down the non-EH
paths.

Honza

>> See my old PR 6588. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6588
>
> --
> Sylvain Pion
> INRIA Sophia-Antipolis
> Geometrica Project-Team
> CGAL, http://cgal.org/
Re: Support question for GCC
On Wed, Apr 08, 2009 at 09:50:33AM -0700, Nordvall, Gary wrote:
> Is there an end-of-support date for GCC version 4.3.0?

(In this answer I'm speaking for myself only, not for GNU or GCC.)

In one sense it already ended; there were three minor bug-fix releases
after 4.3.0 (ending with the current 4.3.3), and I would recommend against
any organization using any GCC release numbered as x.y.0, with a final
zero, in production. To do so is to accept experimental features and to
decline the bug fixes that come out promptly after major releases. See
http://gcc.gnu.org/develop.html to see how this works. In another sense
there never was any support; it's free software, after all. On the other
hand, if you need formal support, there is no shortage of qualified
contractors that could provide it.

> I'm assisting a customer here at NAWCWD China Lake, California, to
> register software in our Navy database, DADMS. In order to use (or
> purchase) software, it must be approved in DADMS. One of the things we
> must show is proof of vendor support for the particular version.

I would suggest finding out what processes the US Navy already has in
place for dealing with free/open source software. Surely you aren't the
first to have the task of putting this particular square peg in the round
hole.
Re: Debugging gcc front end
HI,

I wanna say thanks to everyone that helped me. My problem was finding the
correct path and the command line. I will expose what I did, to help
anyone who may have the same problem.

I copied the files to my local path:

  $ cp debug /usr/local/bin/
  $ cp debugx /usr/local/bin/

after which I just ran debugx (tip: the third arg refers to my gcc binary;
if you put only gcc, it will debug the gcc installed in the system):

  $ debugx cc1 local/bin/gcc File.c

To finish, I added the break point:

  (gdb) b toplev_main

(In my case, I was in emacs.)

Thanks again,
GPuglia

On Mon, Apr 6, 2009 at 8:11 PM, Dave Korn wrote:
> Guilherme Puglia wrote:
>> Hi!
>>
>> or better, hello again! I posted a question (with my class friend
>> Eduardo) about 2 or 3 weeks ago. My question was about the grammar
>> which the gcc C front end uses.
>
> Yeh! Hi again you guys :)
>
>> To solve my problem I want to debug the C front end. I was trying to
>> debug the gcc main function, toplev_main. Unfortunately, I can't
>> insert a break point at this line.
>>
>> I saw the site http://gcc.gnu.org/wiki/DebuggingGCC and
>> http://gcc.gnu.org/ml/gcc/2004-03/msg01195.html
>>
>> But I don't understand where "my path" is, and how to insert the break
>> point.
>
> Ok, sounds like you're not much familiar with using the shell? When it
> says your path, it means the $PATH variable used to look for
> command-line executables. About setting the breakpoint: if you're using
> gdb, the command would be "b toplev_main", it's all in the manual.
>
> The main thing I'm not sure if you're fully aware of is that when you
> run "gcc" at the command-line you're just invoking a simple driver
> program that then invokes the preprocessor, compiler, assembler and
> linker as separate processes, so you don't want to debug gcc itself.
>
> If you add "-v" to the gcc command-line, you will see (among other debug
> info) the separate individual commands it issues to run these other
> processes. You'll see that the first two commands both invoke "cc1" (or
> "cc1plus" in C++) - the first time, with the "-E" option to do the
> preprocessing on the source file, the second time without "-E" to
> process the generated .i file. It's that second command-line that you
> want to actually run under the debugger to inspect the compilation
> process.
>
> cheers,
> DaveK

--
"Never memorize something that you can look up."
- Albert Einstein
Re: My plans on EH infrastructure
> I was wondering if loops of form
>
>   for (i=0; ; i++)
>     a[i]
>
> can be assumed finite because eventually a[i] would get to unallocated
> memory otherwise. This is however similar to infinite recursion

Correct me if I'm wrong, but this is definitely wrong... Assuming a 64-bit
processor and i to be a 32-bit int (e.g. x86-64), it is possible to
allocate a 4GB array of chars, and such a loop is definitely OK. It is
somewhat similar to saying that this loop:

  int array[256];
  for (unsigned char i = 0; ; ++i)
    array[i] = XXX;

is finite, which is wrong.
Re: My plans on EH infrastructure
Jan Hubicka a écrit : Sylvain Pion a écrit : Naive user question : is this going to improve the efficiency of throwing exceptions, at least in the restricted cases of : There is little improvement already via EH cleanup: at least cleanups/catch regions that turns out to be empty are now eliminated and does not slow down unwinding process. It will probably not help for the synthetic benchmark case of PR 6588 (try { throw(0); } catch(...) {}), since this case has nothing to be cleaned up, right? Not that I care about this synthetic case, but even if the case which has nothing to clean up nor unwinding is already too slow... I have no idea how much performance it adds, but cleanups that optimize to nothing are quite common so it might show up in code that is bound in performance by EH delivery speed (I don't expect this is that common, right?) Indeed, it's probably not common. However, IMO, the reason it's not more common is probably that programmers realize that they need to avoid using exceptions where they could be more natural to use (in terms of programming style) only when they hit the performance penalty. And then, experienced programmers ban them because they know they can't count on compilers to get reasonnable efficiency. EH throwing is so costly (2 cycles minimum reported in PR 6588) that, in some cases, even if it's exceptional, like a 10^-4 probability of throwing, you will see it show up on the profile. Having EH delivery at reasonnable speed would really open up the design space : it would allow to use a non-intrusive way of reporting "exceptional" return values in many more places. By non-intrusive, I mean that you don't need to change the return type of your functions and manually propagate your exceptional return values (or worse, use global or thread-local variables). So, in practice, EH is probably used for std::bad_alloc and such really rare situations, but I think there are lots of other useful places it could be used if it was faster. ( I know at least one, but people might think it's too specialized ;) ) - the catch clause contains the throw (after inlining). I meant the try block, sorry. "inlining" throw() call is another item I forgot on my list. My EH-fu is not exactly on par here. I see no reason why we can't convert throw() call into FILTER_EXPR/OBJ_REF_EXPR set and RESX when RESX will be turned into direct goto. It is also something I wanted to experiment with. A few GCC developers have put some (now a bit old) comments in PR 6588 around this. This has potential to improve C++ code especially by turning functions leaf and cheaper for inlining. Definitly on tramp there are such cases. One thing I worry about is that this effectivly makes it impossible to breakpoint on throw that might be common thing C++ users want to see working. This can probably be partly fixed by emitting proper dwarf for inlined function around the RESX, but GDB won't use it at a moment for breakpointing anyway. Right, breakpointing on throw is useful for debugging, and some option like -g probably needs to preserve this behavior. We also should get profile estimates more precise here by knowing that certain calls must throw. We are completely wrong here by basically assuming that EH is never taken. Indeed... At least throw() throws with a good probability :) I'm not sure how far this would go, but it can't hurt to model the program's behavior more correctly. - interprocedural analysis can connect the throwing spot and the corresponding catch clause. ? 
> One can work out interprocedurally where the exception will be caught,
> but what can one do about this info? Translating it to a nonlocal goto
> is overkill because of the collateral damage the nonlocal goto
> infrastructure would bring on the non-throwing path. If the EH
> delivery mechanism has to be used, we are pretty much going as far as
> we can here. In the throwing function we work out that there is no
> local catch block, and in the callers similarly.

RTH has put a comment in PR 6588 hinting that the reason it's so slow
might be that the EH delivery mechanism is very general; in particular,
it handles cross-language/library EH propagation. So, I was thinking
that, in the case where interprocedural analysis knows the full path,
and it does not go through unknown libraries (and LTO might even
increase the number of such cases in the future), a less general way of
delivering the exceptions could be used, which would be faster because
it would not bother checking the cases that are known to be impossible,
like going through external libraries/languages.

Maybe that's too naive, because I don't know what happens behind
throw() in enough detail to understand what makes it so slow. (My naive
view of stack unwinding is that you just have to find a stack pointer
per call level, make sure you run the destructors in order, and do a
few type comparisons to find the right catch clause, so how can this be
so slow is a co
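To make the intrusive-vs-non-intrusive distinction above concrete, here
is a minimal sketch (the function names and bodies are invented for
illustration; they are not from the thread or any real codebase):

  // Illustrative only: parse_intrusive and parse_eh are invented names.
  #include <stdexcept>

  // Intrusive style: the natural return type is displaced by an error
  // code, and every caller has to check and propagate it by hand.
  int parse_intrusive (const char *s, int *out)
  {
    if (!s)
      return -1;                 // error code, propagated manually
    *out = 42;                   // placeholder for real work
    return 0;
  }

  // Non-intrusive style: the signature keeps its natural return type;
  // the exceptional path goes through EH delivery instead.
  int parse_eh (const char *s)
  {
    if (!s)
      throw std::invalid_argument ("null input");
    return 42;                   // placeholder for real work
  }

Whether the second style is viable in hot paths is exactly the delivery
cost question discussed above.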
Re: fwprop and CSE const anchor opt
Adam Nemet writes:
> Richard Sandiford writes:
>> If we have an instruction:
>>
>>   A: (set (reg Z) (plus (reg X) (const_int 0xdeadbeef)))
>>
>> we will need to use something like:
>>
>>      (set (reg Y) (const_int 0xdead))
>>      (set (reg Y) (ior (reg Y) (const_int 0xbeef)))
>>   B: (set (reg Z) (plus (reg X) (reg Y)))
>>
>> But if A is in a loop, the Y loads can be hoisted, and the cost
>> of A is effectively the same as the cost of B. In other words,
>> the (il)legitimacy of the constant operand doesn't really matter.
>
> My guess is that A not being a recognizable insn, this is relevant at
> RTL expansion. Is this correct?

Yeah. It might happen elsewhere too, in conjunction with things like
emit_move_insn.

>> In summary, the current costs generally work because:
>>
>> (a) We _usually_ only apply costs to arbitrary instructions
>>     (rather than candidate instruction patterns) before
>>     loop optimisation.
>
> I don't think I understand this point. I see the part that the cost is
> typically queried before loop optimization, but I don't understand the
> distinction between "arbitrary instructions" and "candidate
> instruction patterns". Can you please explain the difference?

Well, I suppose I should have said something like "arbitrary pattern"
or "arbitrary expression" rather than "arbitrary instruction". I just
mean "taking the cost of something without trying to recognise the
associated instructions". And by "taking the cost of candidate
instruction patterns" I mean "taking the cost of something and also
recognising it".

>> E.g. suppose we're deciding how to implement an in-loop
>> multiplication. We calculate the cost of a multiplication instruction
>> vs. the cost of a shift/add sequence, but we don't consider whether
>> any of the backend-specific shift/add set-up instructions could be
>> hoisted. This would lead to us using multiplication insns in cases
>> where we don't want to.
>>
>> (This was one of the most common situations in which the zero cost
>> helped.)
>
> I am not sure I understand this. Why would we decide to hoist
> suboperations of a multiplication? If it is loop-variant then even the
> suboperations are loop-variant, whereas if it is loop-invariant then
> we can hoist the whole operation. What am I missing?

You're probably right. I might be misremembering what the motivating
case was.

Richard
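For concreteness, the multiply-vs-shift/add choice at the source level
(a hedged illustration; the constant 10 is arbitrary, not the
motivating case):

  /* x * 10 can be strength-reduced to (x << 3) + (x << 1), that is,
     8*x + 2*x.  The cost model compares the multiply insn against the
     shift/add sequence; the question above is whether any backend
     set-up insns for the latter could additionally be hoisted.  */
  unsigned mul10_mult (unsigned x)  { return x * 10u; }
  unsigned mul10_shift (unsigned x) { return (x << 3) + (x << 1); }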
Fwd: Objective-C and C99 strict aliasing
I've long wondered how GCC deals with C99 strict-aliasing rules when
compiling Objective-C code. There's no language spec for Objective-C,
other than the written prose description of the language that Apple
provides (which, until recently, had been virtually unmodified since
its NeXT origins), so there's no definitive source to turn to for help
in answering these kinds of questions. I recently had some time to dig
into the compiler to try to find an answer. Keep in mind I'm no expert
on GCC's internals.

The problem is roughly this: how do C99's strict-aliasing rules
interact with pointers to Objective-C objects? Complicating matters,
how do those rules apply to the complexities that object-oriented
polymorphism causes? As an example:

  id object;
  NSString *string;
  NSMutableString *mutableString;

Objective-C (and object-oriented principles) say the following are
legal:

  object = string;
  object = mutableString;
  string = object;
  string = mutableString;
  mutableString = object;

And the following results in 'warning: assignment from distinct
Objective-C type', which is expected:

  mutableString = string;

There are really two distinct ways of looking at this problem: what is
permitted under Objective-C and object-oriented principles (where there
doesn't seem to be any problem), and what is permitted under C,
specifically C99 and its strict-aliasing rules. Without a language spec
to guide us, we need to make some reasonable assumptions at some point.

I've never seen a 'standards grade language specification' definition
of what a 'class' (ie, NSString in the above) is. Is it a genuinely
opaque type that exists outside of C-defined 'types'? Or is it really
just syntactic sugar for a C struct? I've always subscribed to the
syntactic-sugar definition, as this is how GCC represents things
internally, and the way that every Objective-C compiler I'm aware of
has done things.

Working under the assumption that it really is just syntactic sugar,
this would seem to interact rather poorly with C99's 'new' type-based
strict-aliasing rules. In fact, I have a hard time reconciling
Objective-C's free-wheeling type-punning ways with C99's
strict-aliasing rules, with the possible exception of recasting
Objective-C in terms of C's unions.

When I went looking through the compiler sources to see how it managed
this problem, I was unable to find anything that dealt with it. Is that
really the case, or have I missed something?

Objective-C defines 'c_common_get_alias_set' as its language-specific
alias set manager. c_common_get_alias_set() seems(?) to only implement
C's strict-aliasing rules, with no provisions for Objective-C's needs.
To test this, I added the following to c_common_get_alias_set to see
what happens:

  if (((c_language == clk_objc) || (c_language == clk_objcxx))
      && ((TYPE_LANG_SPECIFIC (t) && (TYPE_LANG_SPECIFIC (t)->objc_info))
          || objc_is_object_ptr (t)))
    {
      warning (OPT_Wstrict_aliasing,
               "Caught and returning 'can alias anything' for objc type");
      return 0;
    }

right before the following line:

  if (c_language != clk_c || flag_isoc99)

Compiling with -O2 -Wstrict-aliasing produces an awful lot of 'Caught..'
messages. Assuming that ivar access is really just syntactic sugar for
self->IVAR, it would seem that there can be times where strict aliasing
causes the compiler to generate "bad code", for some extremely
complicated definition of 'the correct thing to do' in the absence of a
standards-grade language specification.
Can anyone with a much better understanding of GCC's internals comment on this?
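For reference, the classic shape of the strict-aliasing hazard being
discussed, in plain C (a hedged sketch with nothing Objective-C-specific
about it):

  /* With -O2 -fstrict-aliasing the compiler may assume that a float *
     and an int * never alias, because the pointed-to types differ.  If
     f and i actually point to the same memory, the returned value may
     be the cached 1.0f rather than the bits just stored through i.  */
  float update (float *f, int *i)
  {
    *f = 1.0f;   /* write through float * */
    *i = 0;      /* write through int *; assumed not to alias *f */
    return *f;   /* may be folded to 1.0f */
  }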
Re: My plans on EH infrastructure
> Jan Hubicka a écrit :
>>> Sylvain Pion a écrit :
>>>> Naive user question : is this going to improve the efficiency
>>>> of throwing exceptions, at least in the restricted cases of :
>>
>> There is little improvement already via EH cleanup: at least
>> cleanups/catch regions that turn out to be empty are now eliminated
>> and do not slow down the unwinding process.
>
> It will probably not help for the synthetic benchmark case of PR 6588
> (try { throw(0); } catch(...) {}), since this case has nothing to be
> cleaned up, right? Not that I care about this synthetic case, but

Right.

> even the case which has nothing to clean up nor unwind is already too
> slow...
>
> EH throwing is so costly (2 cycles minimum reported in PR 6588) that,
> in some cases, even if it's exceptional, like a 10^-4 probability of
> throwing, you will see it show up on the profile.
>
> Having EH delivery at reasonable speed would really open up the design
> space: it would allow using a non-intrusive way of reporting
> "exceptional" return values in many more places. By non-intrusive, I
> mean that you don't need to change the return type of your functions
> and manually propagate your exceptional return values (or worse, use
> global or thread-local variables).

Problem here is that the EH delivery mechanism is part of the C++ ABI.
The mechanism is indeed complicated. I am sure the runtime can be
optimized, but probably not made something like a thousand times faster
than it is now. So if we start optimizing special cases of EH to be
fast, it will not allow users to use it in performance-critical stuff,
since they will never know if they hit the fast or the slow path and
will only get surprises.

We can implement an interprocedural pass that will identify EH that is
delivered locally in the unit and use an alternative EH delivery
mechanism, but it would be a lot of effort to implement a faster
variant of the runtime that works on as many targets as GCC has, and
the benefits would IMO be quite low.

Honza
Re: [cond-optab] svn branch created, looking for reviews for the "cleanup" parts
On Tue, 2009-04-07 at 12:32 +0200, Paolo Bonzini wrote:
> Thanks, this leaves out:
>
> r145593: http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00545.html (i386)
> r145594: http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00545.html (s390)
> r145597, r145598, r145599:
> http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00947.html (gen*)
> r145603: http://gcc.gnu.org/ml/gcc-patches/2009-04/msg00496.html (gen*, RTL)
> r145655: http://gcc.gnu.org/ml/gcc-patches/2009-04/msg00492.html (testsuite)

What's the point of the compile-only tests? It would be much more
useful to have tests that execute and check that the results are
correct for a variety of input values.

Janis
Re: Fwd: Objective-C and C99 strict aliasing
John Engelhart writes:
> Objective-C defines 'c_common_get_alias_set' as its language specific
> alias set manager. c_common_get_alias_set() seems(?) to only
> implement C's strict aliasing rules, with no provisions for
> Objective-C's needs.
>
> Can anyone with a much better understanding of GCC's internals comment
> on this?

I think you are correct about what is happening today. Since I don't
know anything about Objective C, I don't know what should happen
instead. I would guess that you would want to set up a tree of alias
sets, using record_alias_subset.

Ian
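A hedged sketch of what that suggestion might look like
(new_alias_set and record_alias_subset are real GCC internal
functions; the idea of one superset for 'id' with a subset per class,
and the caching strategy, are assumptions, not an actual patch):

  /* Sketch only: a real implementation would cache one alias set per
     class type instead of creating a fresh one on every call.  */
  static alias_set_type objc_object_alias_set = -1;

  static alias_set_type
  objc_get_class_alias_set (void)
  {
    if (objc_object_alias_set == -1)
      objc_object_alias_set = new_alias_set ();  /* superset for 'id' */

    alias_set_type class_set = new_alias_set (); /* e.g. NSString * */
    /* Accesses through 'id' may alias accesses through the class.  */
    record_alias_subset (objc_object_alias_set, class_set);
    return class_set;
  }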
Re: [cond-optab] svn branch created, looking for reviews for the "cleanup" parts
On Tue, Apr 7, 2009 at 11:03 AM, Ramana Radhakrishnan wrote:
>> To aid testing, I'd like people to help bootstrapping bootstrappable
>> targets -- arm, alpha, ia64, pa, s390, x86_64.
>
> I'm bootstrapping the branch on an arm-linux-gnueabi target.

Bootstrap on arm-linux-gnueabi completed. Regression tests for C and
C++ are still running. This is as of revision 145659, because the
tester on the compile farm can be slightly slow.

cheers
Ramana
Re: My plans on EH infrastructure
Jan Hubicka a écrit :
>> EH throwing is so costly (2 cycles minimum reported in PR 6588) that,
>> in some cases, even if it's exceptional, like a 10^-4 probability of
>> throwing, you will see it show up on the profile.
>>
>> Having EH delivery at reasonable speed would really open up the
>> design space: it would allow using a non-intrusive way of reporting
>> "exceptional" return values in many more places. By non-intrusive, I
>> mean that you don't need to change the return type of your functions
>> and manually propagate your exceptional return values (or worse, use
>> global or thread-local variables).
>
> Problem here is that the EH delivery mechanism is part of the C++ ABI.
> The mechanism is indeed complicated. I am sure the runtime can be
> optimized, but probably not made something like a thousand times
> faster than it is now.

In PR 6588, I had mentioned that, at the time, Sun CC was 10 times
faster than GCC. That would already be something.

If I remember correctly, for the application that motivated the PR, the
cost of exceptions compared to propagating an error code through the
return type of functions was about 10%. If we cut the cost of exception
delivery by a factor of 10, this becomes only a 1% overhead overall,
which might be lost in the noise, and therefore exceptions might become
a viable option.

> So if we start optimizing special cases of EH to be fast, it will not
> allow users to use it in performance-critical stuff, since they will
> never know if they hit the fast or the slow path and will only get
> surprises.

Maybe, but for exceptions which are relatively local, say, inside a
given library, the user can assume that GCC has switched to the "local
ABI" with fast internal exceptions, since he may have compiled the
library as one translation unit, so he may be able to control the
possible scope of the exceptions. And so he may be able to make a good
guess about what the costs will be. My application has this property,
but again, maybe it's only me.

> We can implement an interprocedural pass that will identify EH that is
> delivered locally in the unit and use an alternative EH delivery
> mechanism, but it would be a lot of effort to implement a faster
> variant of the runtime that works on as many targets as GCC has, and
> the benefits would IMO be quite low.

Possibly.

-- 
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
x86_64-apple-darwin libjava build broken on gcc 4.4 branch
Unfortunately this hasn't been tested for a while, but it appears that
the x86_64-apple-darwin target can no longer build java. I am seeing...

checking build system type... x86_64-apple-darwin10
checking host system type... x86_64-apple-darwin10
checking target system type... x86_64-apple-darwin10
checking for a BSD-compatible install... /usr/bin/install -c
checking whether ln works... yes
checking whether ln -s works... yes
checking for x86_64-apple-darwin10-gcc... no
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for x86_64-apple-darwin10-g++... no
checking for x86_64-apple-darwin10-c++... no
checking for x86_64-apple-darwin10-gpp... no
checking for x86_64-apple-darwin10-aCC... no
checking for x86_64-apple-darwin10-CC... no
checking for x86_64-apple-darwin10-cxx... no
checking for x86_64-apple-darwin10-cc++... no
checking for x86_64-apple-darwin10-cl... no
checking for x86_64-apple-darwin10-FCC... no
checking for x86_64-apple-darwin10-KCC... no
checking for x86_64-apple-darwin10-RCC... no
checking for x86_64-apple-darwin10-xlC_r... no
checking for x86_64-apple-darwin10-xlC... no
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for x86_64-apple-darwin10-gnatbind... no
checking for gnatbind... no
checking for x86_64-apple-darwin10-gnatmake... no
checking for gnatmake... no
checking whether compiler driver understands Ada... no
checking how to compare bootstrapped objects... cmp --ignore-initial=16 $$f1 $$f2
checking for correct version of gmp.h... yes
checking for correct version of mpfr.h... yes
checking for version 0.10 of PPL... yes
checking for correct version of CLooG... yes
The following languages will be built: c,c++,fortran,java,objc
*** This configuration is not supported in the following subdirectories:
     target-libmudflap target-libffi target-zlib target-libjava
     target-libada gnattools target-boehm-gc
    (Any other directories should still work fine.)

from...

../gcc-4.4-20090407/configure --prefix=/sw --prefix=/sw/lib/gcc4.4 \
  --mandir=/sw/share/man --infodir=/sw/share/info \
  --enable-languages=c,c++,fortran,objc,java \
  --with-gmp=/sw --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw \
  --with-system-zlib --x-includes=/usr/X11R6/include \
  --x-libraries=/usr/X11R6/lib --disable-libjava-multilib \
  --build=x86_64-apple-darwin10 --host=x86_64-apple-darwin10 \
  --target=x86_64-apple-darwin10

The changes to make this work were checked in at...

r142370 | andreast | 2008-12-02 13:05:24 -0500 (Tue, 02 Dec 2008) | 5 lines

2008-12-02  Andreas Tobler
            Jack Howarth

        * config/i386/t-darwin64: Add m32 multilib support.

r142369 | andreast | 2008-12-02 13:04:30 -0500 (Tue, 02 Dec 2008) | 6 lines

2008-12-02  Jack Howarth

        * configure.ac: Expand to darwin10 and later.
        * configure: Regenerate.
        * testsuite/lib/libjava.exp: Expand to darwin10 and later.

r142368 | andreast | 2008-12-02 13:03:26 -0500 (Tue, 02 Dec 2008) | 5 lines

2008-12-02  Jack Howarth

        * testsuite/gcc.dg/darwin-comm.c: Expand to darwin10 and later.

r142367 | andreast | 2008-12-02 13:01:57 -0500 (Tue, 02 Dec 2008) | 5 lines

2008-12-02  Jack Howarth

        * configure.ac: Expand to darwin10 and later.
        * configure: Regenerate.
Re: x86_64-apple-darwin libjava build broken on gcc 4.4 branch
I see one place where the breakage may have occurred...

http://gcc.gnu.org/viewcvs/branches/gcc-4_4-branch/configure.ac?r1=144881&r2=144887

--- trunk/configure.ac	2009/03/16 13:23:13	144881
+++ trunk/configure.ac	2009/03/16 17:02:02	144887
@@ -446,11 +446,11 @@
   *-*-chorusos)
     noconfigdirs="$noconfigdirs target-newlib target-libgloss ${libgcj}"
     ;;
-  powerpc-*-darwin* | x86_64-*-darwin[[912]]*)
+  powerpc-*-darwin*)
     noconfigdirs="$noconfigdirs ld gas gdb gprof"
     noconfigdirs="$noconfigdirs sim target-rda"
     ;;
-  i[[3456789]]86-*-darwin*)
+  i[[3456789]]86-*-darwin* | x86_64-*-darwin9*)
     noconfigdirs="$noconfigdirs ld gas gprof"
     noconfigdirs="$noconfigdirs sim target-rda"

The use of darwin[[912]]* has been removed, which will break the
libjava build on darwin10.

Jack
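If that is the cause, a plausible (completely untested) fix would be to
widen the surviving pattern again so that darwin10 and later are
matched, e.g.:

  -  i[[3456789]]86-*-darwin* | x86_64-*-darwin9*)
  +  i[[3456789]]86-*-darwin* | x86_64-*-darwin[[912]]*)

since darwin[[912]]* covers darwin9 as well as the darwin1x and
darwin2x versions.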
Re: [cond-optab] svn branch created, looking for reviews for the "cleanup" parts
>> r145655: http://gcc.gnu.org/ml/gcc-patches/2009-04/msg00492.html (testsuite)
>
> What's the point of the compile-only tests? It would be much more
> useful to have tests that execute and check that the results are
> correct for a variety of input values.

That's true. I went for compile-only because it was easier to generate
them and because I am aiming at equal assembly output (so execution
tests are not very important to me). I may work in the future on
automatically generated execution tests like those, but this is what I
needed now, and I figured it was better to post these than nothing.
But I see your point.

Paolo
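For example, a minimal execution test along those lines might look like
this (hypothetical file name and contents, not part of the committed
testsuite):

  /* Hypothetical sketch of an execution test, e.g.
     gcc.dg/cond-optab-run-1.c.  */
  /* { dg-do run } */
  /* { dg-options "-O2" } */

  extern void abort (void);

  int __attribute__ ((noinline))
  lt (int a, int b)
  {
    return a < b;
  }

  int
  main (void)
  {
    if (!lt (-1, 1) || lt (1, -1) || lt (0, 0))
      abort ();
    return 0;
  }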