Re: Designs for better debug info in GCC
Hi, On Wed, 7 Nov 2007, Alexandre Oliva wrote: > > x and y at the appropriate part. Whatever holds 'x' at a point (SSA > > name, pseudo or mem) will also mention that it holds 'c'. At a later > > point whichever holds 'y' will also mention it holds 'c'. > > I.e., there will be two parallel locations throughout the entire > function that hold the value of 'c'. No. For some PC locations the location of 'c' will happen to be the same as the one holding 'x', and for a different set of PC locations it will be the one also holding 'y'. The request "what's in 'c'" from a debugger only makes sense when done from a certain program counter. Depending on that, the location of 'c' will be different. In the case from above both locations might exist in parallel throughout the entire function, but they don't hold 'c' in parallel. > Something like: > > f(int x /* but also c */, int y /* but also c */) { /* other vars */ "int x /* but also c */, int y /* but also c */" implies that x == y already, at which point the compiler will most probably have allocated just one place for x and y (and c) anyway ... > do_something_with(x, ...); // doesn't touch x or y > do_something_else_with(y, ...); // doesn't touch x or y > > Now, what will you get if you 'print c' in the debugger (or if any > other debug info evaluator needs to tell what the value of user > variable c is) at a point within do_something_with(c,...) or > do_something_else_with(c)? ... so the answer would be "whatever is in that common place for x, y and c". If the compiler did not allocate one place for x and y the answer still would be "whatever is in the place of 'y'", because that value is live, unlike 'x'. 
> Now consider that f is inlined into the following code: > > int g(point2d p) { > /* lots of code */ > f(p.x, p.y); > /* more code */ > f(p.y, p.x); > /* even more code */ > } > > g gets fully scalarized, so, before inlining, we have: > > int g(point2d p) { > int p$x = p.x, int p$y = p.y; > /* lots of code */ > f(p$x, p$y); > /* more code */ > f(p$y, p$x); > /* even more code */ > } > > after inlining of f, we end up with: > > int g(point2d p) { > int p$x = p.x, int p$y = p.y; > /* lots of code */ > { int f()::x.1 /* but also f()::c.1 */ = p$x, f()::y.1 /* but also f()::c.1 > */ = p$y; Here you punt. How come that f::c is actually set to p$x? I don't see any assignment and in fact no declaration for c in f. If you had one _that_ would be the place were the connection between p$x and 'c' would have been made and everything would fall in place. > { /* other vars */ > do_something_with(f()::x.1, ...); // doesn't touch x or y > do_something_else_with(f()::y.1, ...); // doesn't touch x or y > } } > /* more code */ > { int f()::x.2 /* but also f()::c.2 */ = p$x, f()::y.2 /* but also f()::c.2 > */ = p$y; > { /* other vars */ > do_something_with(f()::x.2, ...); // doesn't touch x or y > do_something_else_with(f()::y.2, ...); // doesn't touch x or y > } } > /* even more code */ > } > > then, we further optimize g and get: > > int g(point2d p) { > int p$x /* but also f()::x.1, f()::c.1, f()::y.2, f()::c.2 */ = p.x; > int p$y /* but also f()::y.1, f()::c.1, f()::x.2, f()::c.2 */ = p.y; > /* lots of code */ > { { /* other vars */ > do_something_with(p$x, ...); // doesn't touch x or y > do_something_else_with(p$y, ...); // doesn't touch x or y > } } > /* more code */ > { { /* other vars */ > do_something_with(p$y, ...); // doesn't touch x or y > do_something_else_with(p$x, ...); // doesn't touch x or y > } } > /* even more code */ > } > > and now, if you try to resolve the variable name 'c' to a location or > a value within any of the occurrences of do_something_*with(), what 
do > you get? What ranges do you generate for each of the variables > involved? It's not possible that p$x _and_ p$y are f()::c.1 at the same time, so the above examples are all somehow invalid. Except if p$x and p$y are somehow the same value, and if that's the case it's enough and exactly correct if the range of f()::c.1 covers the whole body of your function 'g' referring to exactly the one location of f()::c.1, f()::c.2, p$x and p$y. > Unfortunately, this mapping is not biunivocal. The chosen > representation is fundamentally lossy. What's fundamentally lossy are transformations done by the compiler. E.g. in this simple case: int f(int y) { int x = 2 * y; return x + 2; } If the compiler forward-props 2*y into the single use and simplifies: return (y+1)*2; then the value 2*y is never actually calculated anymore, not in any register, not in any local variable, nowhere. There's no way debug information could generally rectify this loss of information. As DWARF is capable to encode complete expressions it would be possible in this case to express it, because the inverse of the above function is easily determine
RE: Progress on GCC plugins ?
On 07 November 2007 17:52, David Edelsohn wrote: > > The concern is the many forms of shim layers that possibly could > be written more easily with a plug-in framework. I wonder if we could adapt some kind of privsep model, so that once the compiler has finished init, it gives away its rights and can't even open files or create pipes or sockets any more. The gcc.c driver could remain at full user privs and take care of opening everything (a bit like I guess it already does when you're using -pipe?) and cc1 and pals could all be hobbled. Or, how about we define plugins in an interpreted or bytecoded language and don't allow any IO whatsoever? Oo if we were really clever we could maybe define an interface that's entirely non-enumerable: it calls out to hooks, providing them with just enough information and interfaces to do the job they have to do, but we don't make it possible to derive information about the overall AST because we don't provide any way to know 'what's out there'. #1 seems like there might too easily be loopholes even assuming it can actually be made to work, that is to say that it would be very hard to really prevent data escaping from the process boundary using the unix fs perms model. Maybe the more modern unices with finer-grained acls and kernel object permissions would be able to make it work, but I think that the lesson of chroot jails is that they're to prevent fat-finger errors, not determined security attackers. #3 isn't necessarily possible either. It would probably need serious maths formalisms and proofs to define. Zero-knowledge secret problem sharing based plugins, anyone? #2 might well be a goer. It looks pretty plausible. cheers, DaveK [carefully avoiding the IANAL debate :-)] -- Can't think of a witty .sigline today
Re: Designs for better debug info in GCC
On 11/8/07, Mark Mitchell <[EMAIL PROTECTED]> wrote: > Ian Lance Taylor wrote: > > > At one time, gcc actually provided better debugging of optimized code > > than any other compiler, though I don't know if that is still true. > > Optimized gcc code is still debuggable today. I do it all the time. > > (For me poor support for debugging C++ is a much bigger issue, though > > I think that is an issue more with gdb than with gcc.) > > I think we all agree that providing better debugging of optimized code > is a priori a good thing. So, as I see it, this thread is focused on > what internal representation we might use for that. > > I don't know that there's an abstract right answer to whether something > NOTE-like or something on the side is better. There are problems with > both approaches. We know the NOTE/DEBUG_INSN thing is going to break, > from experience; we also know the on-the-side thing is going to be hard > to maintain. I think we're going to find out once both approaches are implemented to the point that they reasonably do what they want to do. So I'm fine to defer this decision up to that point (or the point where we start the fighting on which approach will get merged). > Alexandre has clearly thought about this a lot. I'd like to start by > capturing the functional changes that we want to make to GCC's debug > output -- not the changes that we want in the debug experience, or > changes that we need in GDB, but the changes in the generated DWARF. > > For example, I'm thinking of a series of function test cases. Ignore > the substance of this example -- I'm making it up! -- I'm just trying to > capture the form. > > === > int main () { int i; i = 3; return i; } > > When optimizing, "i" is optimized away. The debug info for "i" right > before the return statement says "i has been optimized away", but not > what its value is. I think it should say that the value is "3". To do > that, we need to emit a DW_Now_My_Value_is_3 tag for "i". 
> === > > Now, how is whatever representation we pick going to get us that? Is > the Oliva representation sufficient? What about the Guenther/Matz > representation? Independently of the representation, what algorithms > are we going to use to track whatever we need to track as the optimizers > remove, insert, duplicate, and reorder code? For the example above, the representation we use on the tree level cannot attach a name to '3' (since obviously '3' is not an SSA_NAME). But this is fixable if we think it is worthwhile. > Until we all know what we're trying to do, I don't see how we can make a > good decision about the representation. Clearly, in the abstract, we > can represent data either on-the-side or in the instruction stream, but > until we know what output we want, I'm not sure how we can pick. That's true. I was also thinking about how to properly do testcases for both kinds of infrastructure. At the moment I scan tree/rtl dumps for the names I want to preserve, but ultimately it would be nice to be able to run gdb testcases in the gcc tree to also verify 'correctness' of the information we produce (and not just existence of some information). Richard.
Re: Progress on GCC plugins ?
Ian Lance Taylor wrote: > > More deeply, I think his concern is misplaced. I think that gcc has > already demonstrated that the only widely used compilers are free > software. Proprietary compilers don't keep up over time, outside of > niche markets. Hooking proprietary code into gcc, one way or another, > is just going to create a dead end for the people who do it. > Certainly it's not a good thing, and certainly it would be preferable > to prevent it if possible. But it is not the worst possible thing > that could happen; it is merely a cost. > > I won't enumerate the benefits of plugins here. But it is clear to me > that the benefits outweigh the costs. I won't add anything to Ian's comments because I fully agree with him, but I would like to clarify what I expect from GCC. First of all, I come from the verification community and I'm currently very interested in '(automatic) code verification'. What our community is really looking for right now is a tool that provides an abstract (i.e. simplified) model of program sources. And actually, GCC does this internally. Indeed, it can provide a gimplified CFG of C, C++, Java, ... source files and much other information, which surpasses tools such as CIL (C Intermediate Language, http://manju.cs.berkeley.edu/cil/) because of GCC's numerous front ends. What I would like to have is just some means to extract the last step before the translation into RTL (usually 'final_cleanup'). Surely, having the possibility to decorate the code with some extra information would be a bonus, but we can actually work around its absence without any problem, considering the benefits we get from not having to maintain all the front ends. What the 'plugin scheme' provides is actually much more than we need, because it goes deep inside GCC and allows *modification* of GCC internals (and therefore might break a lot of things, I guess), which (in my humble opinion) is the real problem here. 
Maybe defining some standardized outputs (very similar to -fdump-tree-*) targeted for static-analysis and model-checking tools would be enough as the main problem for us is just to get GCC to export its intermediate internal representations of the program in a way which is reliable and stable from one version to another. For the software I'm thinking of I really just need the final CFG and maybe a summary of the IPA would help as well. Regards -- Emmanuel Fleury If you don't get everything you want, think of the things you don't get that you don't want. -- Oscar Wilde
Re: Progress on GCC plugins ?
* Robert Dewar: > Tom Tromey wrote: > >> First, aren't we already in this situation? There are at least 2 >> compilers out there that re-use parts of GCC by serializing trees and >> then reading them into a different back end. > > It's not obvious to me that this is consistent with the GPL .. It requires a fairly expansionist view of copyright to make it inconsistent. The FSF has made similar claims in the past (for instance, that all Emacs Lisp code must be GPLed), but I'm not convinced that arguing for more Draconian copyright laws helps the free software cause. Copyleft requires compromises in this area, of course, but some deals should be off limits.
Re: Progress on GCC plugins ?
Florian Weimer wrote: * Robert Dewar: Tom Tromey wrote: First, aren't we already in this situation? There are at least 2 compilers out there that re-use parts of GCC by serializing trees and then reading them into a different back end. It's not obvious to me that this is consistent with the GPL .. It requires a fairly expansionist view of copyright to make it inconsistent. I disagree (based significantly on the proceedings of the Intergraph vs Bentley trial, which unfortunately are not easily published anywhere), but I think this thread should not go too much farther here, it is really off topic.
Re: Reload using a live register to reload into
Hi, > > (call_insn:HI 91 270 92 5 cor_h.c:129 (parallel [ > >(set (reg:SI 1 $c1) > >(call (mem:SI (symbol_ref:SI > > ("DotProductWithoutShift") [flags 0x41] > DotProductWithoutShift>) [0 S4 A32]) > >(const_int 0 [0x0]))) > >(use (const_int 0 [0x0])) > >(clobber (reg:SI 31 $link)) > >]) 42 {*call_value_direct} (expr_list:REG_DEAD (reg:SI 4 $c4) > >(expr_list:REG_DEAD (reg:SI 3 $c3 [ ivtmp.103 ]) > >(expr_list:REG_DEAD (reg:SI 2 $c2 [ h ]) > >(nil > >(expr_list:REG_DEP_TRUE (use (reg:SI 4 $c4)) > >(expr_list:REG_DEP_TRUE (use (reg:SI 3 $c3 [ ivtmp.103 ])) > >(expr_list:REG_DEP_TRUE (use (reg:SI 2 $c2 [ h ])) > >(expr_list:REG_DEP_TRUE (use (reg:SI 1 $c1 [ ivtmp.101 ])) > >(nil)) > > I don't think so, it should be in dead_or_set, the value contained in $c1 dies > in the insn. Yes, after going through the code more closely, I concur. The problem is that $c1 isn't in live_throughout, but at the point before the call insn it is live. Therefore, if an instruction is inserted before the call insn, as is done when a caller save instruction is inserted (by caller-save.c), and if this doesn't kill $c1, then its live_throughout should have the bit for $c1 set. This doesn't happen because, while inserting the caller save insn, its live_throughout is simply set to the live_throughout of the call insn + the registers marked with REG_DEAD notes in the call insn. However, since $c1 is an argument to the call, it is used by the call_insn and is marked REG_DEP_TRUE (Read after Write). Shouldn't regs in REG_DEP_TRUE be added to live_throughout? My suspicion is that the LOG_LINKS are not always up-to-date, so would it be better to use DF_INSN_UID_USES? Thanks in advance, Pranav
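The set arithmetic described in this message can be sketched with plain bitmasks. This is only a toy model, not code from caller-save.c or reload: the type `regset`, the struct `call_summary` and the functions below are hypothetical names, and the assumption that the call's own live_throughout set is empty is made up for illustration. It shows why $c1 is missing from the inserted insn's set under the behaviour described above, and what adding the call's used registers would change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per hard register; bit N stands for $cN. */
typedef uint32_t regset;
#define REG(n) ((regset)1u << (n))

/* Hypothetical summary of the call insn from the dump above. */
struct call_summary {
    regset live_throughout; /* live both before and after the call */
    regset reg_dead;        /* registers carrying REG_DEAD notes */
    regset uses;            /* registers the call reads (its arguments) */
};

/* Behaviour described in the message: the inserted caller-save insn gets
   the call's live_throughout plus the REG_DEAD registers. */
static regset save_live_current(const struct call_summary *c)
{
    assert(c != NULL);
    return c->live_throughout | c->reg_dead;
}

/* Variant asked about in the message: also add the registers used by the call. */
static regset save_live_proposed(const struct call_summary *c)
{
    assert(c != NULL);
    return c->live_throughout | c->reg_dead | c->uses;
}

/* The call in the dump: $c2..$c4 die, $c1..$c4 are used.
   live_throughout = 0 is an assumption for this sketch. */
static struct call_summary example_call(void)
{
    struct call_summary c;
    c.live_throughout = 0;
    c.reg_dead = REG(2) | REG(3) | REG(4);
    c.uses = REG(1) | REG(2) | REG(3) | REG(4);
    return c;
}
```

In this model $c1 is absent from `save_live_current` because it is both used and set by the call, so it carries no REG_DEAD note; `save_live_proposed` picks it up through the use list.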
Re: Designs for better debug info in GCC
My general feelings on this subject: 1. I don't think we should care much about the ability to *SET* values of variables in optimized code. You can definitely do without that. So if a variable exists in two places, no problem, just register one of them. 2. It is much more important to have reasonable debugging for most users than the last mile of optimization. For me we should ensure that -O1 is still reasonably debuggable. The switch to GCC 4, at least in the Ada context, has significantly degraded -O1 debugging. I have found for instance that when debugging the GNAT compiler itself, -O1 used to be perfectly fine, but now far too many arguments and variables disappear. 3. The quality of code at -O0 is really terrible compared to the competition (at least in the case of Ada), and large scale programs are just too big at -O0 to be practical (there is a big difference between a 50 megabyte image and a 100 megabyte image). So we really cannot rely on using -O0 for debugging. At -O1 we are more than competitive for performance with competing compilers. 4. In any case, most users really prefer to test and debug at the same optimization level that they will use for delivery. As noted above, -O0 is seldom practical for delivery (furthermore the voluminous extra code makes certification at the object level more work). -O1 is a fine compromise from a performance point of view, but needs to be debuggable. 5. Among our users we have relatively few who care about even a factor of 2 in performance, and VERY few who care about 10%. On the other hand we have lots of customers who definitely have severe problems with the lack of debuggability of -O1 code. 6. We have talked sometimes about a -Od level or somesuch that would be fully debuggable. That's an interesting idea, but I think in practice it is more reasonable to try to ensure good debugging at -O1. Optimizations that significantly interfere with debugging should be moved to -O2. 
I think it is fine for -O2 to mean "optimize the heck out of the program, I really care about the last ounce of optimization, and I know debuggability will suffer."
Re: Designs for better debug info in GCC
Hi, On Thu, 8 Nov 2007, Robert Dewar wrote: > significantly degraded -O1 debugging. I have found for > instance that debugging the GNAT compiler itself, -O1 > used to be perfectly fine, but now far too many arguments > and variables disappear. Yes. That problem is addressed by Alexandre's approach and by ours. If you want to be really sure no arguments disappear (necessary for instance for meaningful use of systemtap) you also need to inhibit some transformations, which can be done under a certain option (which might or might not be on by default for -O1). > 3. The quality of code at -O0 is really terrible compared > to the competition (at least in the case of Ada), and > large scale programs are just too big at -O0 to be > practical (there is a big difference between a 50 > megabyte image and a 100 megabyte image). This is a problem on its own. We're planning to work on this sometime during the next months, i.e. improve code quality at -O0 at least to the point it was at in the 3.x line of GCC. Ciao, Michael.
Re: Designs for better debug info in GCC
On Thu, Nov 08, 2007 at 08:59:18AM -0500, Robert Dewar wrote: > 2. It is much more important to have reasonable debugging > for most users than the last mile of optimization. For me > we should ensure that -O1 is still reasonably debuggable. > The switch to GCC 4, at least in the Ada context, has > significantly degraded -O1 debugging. I have found for > instance that debugging the GNAT compiler itself, -O1 > used to be perfectly fine, but now far too many arguments > and variables disappear. > With gcc 3.4, I can debug binutils at -O1 and -O2 in some cases. But with gcc 4, I have to use -O0 if I want to do any serious debugging on binutils. H.J.
Re: Designs for better debug info in GCC
On Nov 8, 2007, Michael Matz <[EMAIL PROTECTED]> wrote: > Hi, > On Wed, 7 Nov 2007, Alexandre Oliva wrote: >> > x and y at the appropriate part. Whatever holds 'x' at a point (SSA >> > name, pseudo or mem) will also mention that it holds 'c'. At a later >> > point whichever holds 'y' will also mention in holds 'c' . >> >> I.e., there will be two parallel locations throughout the entire >> function that hold the value of 'c'. > No. For some PC locations the location of 'c' will happen to be the same > as the one holding 'x', and for a different set of PC locations it will be > the one also holding 'y'. So we're in agreement. What you say is how it ought to be done, what I did was to point out that the representation proposed by richi will be unable to do the right thing. >> f(int x /* but also c */, int y /* but also c */) { /* other vars */ > "int x /* but also c */, int y /* but also c */" implies that x == y > already No, per the posted design (assuming I understood it correctly) it just implies that, at some point in the program, an assignment 'c = x' was optimized away, and that at some other point in the program, an assignment 'c = y' was optimized away. >> do_something_with(x, ...); // doesn't touch x or y >> do_something_else_with(y, ...); // doesn't touch x or y >> >> Now, what will you get if you 'print c' in the debugger (or if any >> other debug info evaluator needs to tell what the value of user >> variable c is) at a point within do_something_with(c,...) or >> do_something_else_with(c)? > ... so the answer would be "whatever is in that common place for x,y and > c". And once we removed the incorrect assumption you made, that 'x == y', what do you get? > How come that f::c is actually set to p$x? It was in the original source code, was it not? p$x was passed to f() as x, and then x was copied to c. > I don't see any assignment and in fact no declaration for c in f. 
> If you had one _that_ would be the place were the connection between > p$x and 'c' would have been made and everything would fall in place. Since there is a declaration of c in the original source-level f (the only one that matters, as far as debug information is concerned), can you please expand on how you'd get everything to fall in place? > It's not possible that p$x _and_ p$y are f()::c.1 at the same time, Exactly > so the above examples are all somehow invalid. It's the bitmap debug info representation that makes them nonsensical. > int f(int y) { > int x = 2 * y; > return x + 2; > } > If the compiler forward-props 2*y into the single use and simplifies: > return (y+1)*2; > then the value 2*y is never actually calculated anymore, not in any > register, not in any local variable, nowhere. There's no way debug > information could generally rectify this loss of information. Actually, while y is live, debug information could encode that x is 2*y, even if the value is not computed at run time. So your statement is quite an exaggeration. > In case of more complicated expressions that's not possible anymore > and you lose. Yep. If the value is unavailable, debug information should say so, rather than pointing at something else. > Forcing some values life is possible, But undesirable. I'm not trying to do that. Actually, I'm working hard to make sure it doesn't happen. > So, our mapping is as accurate as your's. Not at all, and you made that point yourself, twice, in a single e-mail. > It seems in your branch you also force some values life IIUC. Nope. Any values that are forced live by debug annotations are bugs to be fixed. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
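The point made in this message, that debug information could describe x as 2*y while y is live even though x is never computed at run time, can be illustrated with a small postfix evaluator. This is a sketch only: the opcodes `OP_REG`, `OP_LIT`, `OP_MUL` and `OP_PLUS` are hypothetical names loosely modelled on DWARF expression operators, not the real DWARF encoding, and placing y in register 0 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, loosely modelled on DWARF expression operators. */
enum op_kind { OP_REG, OP_LIT, OP_MUL, OP_PLUS };

struct op {
    enum op_kind kind;
    long arg;           /* register number for OP_REG, literal for OP_LIT */
};

#define STACK_MAX 16

/* Evaluate a postfix expression against a register file.
   Returns 0 and stores the value in *result on success,
   -1 if the expression is malformed. */
static int eval_expr(const struct op *ops, size_t n,
                     const long *regs, size_t nregs, long *result)
{
    long stack[STACK_MAX];
    size_t sp = 0;

    assert(ops != NULL && regs != NULL && result != NULL);
    for (size_t i = 0; i < n; i++) {
        switch (ops[i].kind) {
        case OP_REG:
            if (ops[i].arg < 0 || (size_t)ops[i].arg >= nregs || sp == STACK_MAX)
                return -1;
            stack[sp++] = regs[ops[i].arg];
            break;
        case OP_LIT:
            if (sp == STACK_MAX)
                return -1;
            stack[sp++] = ops[i].arg;
            break;
        case OP_MUL:
        case OP_PLUS: {
            long a, b;
            if (sp < 2)
                return -1;
            b = stack[--sp];
            a = stack[--sp];
            stack[sp++] = (ops[i].kind == OP_MUL) ? a * b : a + b;
            break;
        }
        default:
            return -1;
        }
    }
    if (sp != 1)
        return -1;
    *result = stack[0];
    return 0;
}

/* "x is 2 * y", with y assumed to live in register 0. */
static const struct op x_expr[] = {
    { OP_REG, 0 }, { OP_LIT, 2 }, { OP_MUL, 0 }
};
```

With register 0 holding 5, evaluating `x_expr` yields 10, which is the value x would have had in the unoptimized program. Once y is no longer live, no such expression can be evaluated and the variable has to be reported as unavailable.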
Re: Designs for better debug info in GCC
Alexandre Oliva <[EMAIL PROTECTED]> writes: > On Nov 7, 2007, Ian Lance Taylor <[EMAIL PROTECTED]> wrote: > > >> Does it really matter? Do we compromise standards compliance (and so > >> violently, while at that) in any aspect of the compiler? > > > What standards are you talking about? > > Debug information standards such as DWARF-3. ... > Incorrectness in the compiler output is always a bug. No matter how > hard it is to implement, or how resource-intensive the solution is, > arguing that we've made a trade-off and decided to generate wrong > output for this case is a clever decision. I'm sorry, I've thought about it, but I don't buy this argument. I'm certainly willing to talk about improving debug information for optimized code, and clearly it is more important to more people than I initially thought. However, I don't think your arguments that this is an issue comparable to code correctness are valid. Incorrect generated code is a fatal problem in a compiler. Incorrect debugging information is a quality of implementation issue. > >> > We've fixed many many bugs and misoptimizations over the years due to > >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake > >> > we've made in the past. > >> > >> That's a valid concern. However, per this reasoning, we might as well > >> push every operand in our IL to separate representations, because > >> there have been so many bugs and misoptimizations over the years, > >> especially when the representation didn't make transformations > >> trivially correct. > > > Please don't use strawman arguments. > > It's not, really. A reference to an object within a debug stmt or > insn is very much like any other operand, in that most optimizer > passes must keep them up to date. If you argue for pushing them > outside the IL, why would any other operands be different? I think you misread me. I didn't argue for pushing debugging information outside the IL. 
I argued against a specific implementation--DEBUG_INSN--based on our experience with similar implementations. Ian
Re: Designs for better debug info in GCC
On Nov 8, 2007, Robert Dewar <[EMAIL PROTECTED]> wrote: > My general feelings on this subject: > 1. I don't think we should care much about the ability to > *SET* values of variables in optimized code. Indeed. We should care about correctness of debug information, and then this ability will come naturally ;-) > 3. The quality of code at -O0 is really terrible That's a feature, no? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Designs for better debug info in GCC
On Nov 8, 2007, Michael Matz <[EMAIL PROTECTED]> wrote: > If you want to be really sure no arguments disappear (necessary for > instance for meaningful use of systemtap) you also need to inhibit > some transformations, I'm not aware of any situations in which we must force an argument not to disappear. All of the problems I'm aware of are those in which the argument is there, we're just missing debug information for it. If you have information about needs for preserving arguments that are actually dead, please send it my way. > This is a problem on its own. We're planning to work on this sometime > during the next months, i.e. improve code quality at -O0 at least to > the point it was at in the 3.x line of GCC. Aah, I guess the problem here is all the gimple-introduced temps, right? That our current -O0 is more like -O-1? :-) -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Designs for better debug info in GCC
Alexandre Oliva wrote: > On Nov 7, 2007, Mark Mitchell <[EMAIL PROTECTED]> wrote: > >> Until we all know what we're trying to do > > Here's what I am trying to do: I think these are laudable goals, but you didn't really provide the information I wanted. In particular, what I'd like to drill down from goals (like "ensure that, for every user variable for which we emit debug information, the information is correct") to concrete problems. I think that most of the goals boil down to making sure that, at any point in the program, the debug information for a variable meets the following criteria: (a) if the variable has not been optimized away, gives the location where that variable's current value can be found, or (b) if the variable has been optimized away, and the value is not a constant, says that the value is not available, or (c) if the variable has been optimized away, but is a constant, says what the constant value is Is that right? (Note "at any point" above; it might be that the variable is present in r0 for a while, and then optimized away, and then present at *0xdeadbeef for a while, and then has the constant value 7.) If so, how are you proposing to accomplish that? It's easy enough to design a representation (whether in the instruction stream, or on the side) that says "from instruction A to instruction B, the value is in this location". So, I don't think we need to worry about that just yet. But, how are we going to track this information? Algorithmically, what needs to change in the compiler to maintain this state? For example, we need some way for an optimization pass to tell the rest of the compiler that a variable was completely eliminated. (Perhaps, for example, because all uses of the variable were eliminated.) So, maybe we need a debug_var_eliminated API. Then, every pass that blows away variables can call this function, which can make whatever notations on the VAR_DECL are required. 
I'm not claiming that's the right approach, but I'd like to understand the plan at that kind of level. What changes will need to be made throughout the compiler to keep track of the state? -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
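The "from instruction A to instruction B, the value is in this location" representation mentioned in this message can be sketched as a table of PC ranges. This is an illustrative model only: `loc_kind`, `loc_range` and `lookup` are hypothetical names, the PC values are invented, and nothing here reflects how GCC or DWARF actually stores location lists. The table follows the example in the message: the variable is in r0 for a while, then optimized away, then at *0xdeadbeef, then the constant 7.

```c
#include <assert.h>
#include <stddef.h>

/* The cases from criteria (a) to (c) in the message. */
enum loc_kind { LOC_UNAVAILABLE, LOC_REGISTER, LOC_MEMORY, LOC_CONSTANT };

struct loc_range {
    unsigned long start;  /* first PC covered (inclusive) */
    unsigned long end;    /* first PC no longer covered (exclusive) */
    enum loc_kind kind;
    unsigned long value;  /* register number, address, or constant */
};

/* Return the entry covering pc, or an LOC_UNAVAILABLE entry if no range
   covers it (the variable is optimized away and not a constant there). */
static struct loc_range lookup(const struct loc_range *ranges, size_t n,
                               unsigned long pc)
{
    struct loc_range none = { pc, pc, LOC_UNAVAILABLE, 0 };

    for (size_t i = 0; i < n; i++) {
        assert(ranges[i].start < ranges[i].end);
        if (pc >= ranges[i].start && pc < ranges[i].end)
            return ranges[i];
    }
    return none;
}

/* Made-up PC ranges for the example in the message. */
static const struct loc_range var_ranges[] = {
    { 0x100, 0x120, LOC_REGISTER, 0 },            /* in r0 */
    /* 0x120..0x140: no entry, value not available */
    { 0x140, 0x160, LOC_MEMORY,   0xdeadbeefUL }, /* at *0xdeadbeef */
    { 0x160, 0x180, LOC_CONSTANT, 7 },            /* constant 7 */
};
```

A lookup at PC 0x130 falls in the gap and reports the variable as unavailable rather than pointing at a stale location. The open question in the message is not this table but how each optimization pass keeps the information behind it up to date.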
Re: Designs for better debug info in GCC
David Daney wrote: >> (a) if the variable has not been optimized away, gives the location >> where that variable's current value can be found, or >> (b) if the variable has been optimized away, and the value is not a >> constant, says that the value is not available, or > > Perhaps if the variable has been optimized away *but* it is possible to > calculate its value by examining the state of the program, then we can > emit the expression needed to calculate its value in the debugging > information as well. Yes, that's a good addition. To be clear, I'm not trying to set the goals here; I'm just trying to make sure we have a clear set of objectives and a plan to get there. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Designs for better debug info in GCC
On Thu, Nov 08, 2007 at 02:36:57PM -0200, Alexandre Oliva wrote: > > 3. The quality of code at -O0 is really terrible > > That's a feature, no? Actually it's a misfeature, in that it's worse than it needs to be, and it's worse in ways that increase the time required to produce it (since a larger volume of code then has to be handled by the back end, assembler, and linker). Debugging would be just as easy and natural if -O0 only made sure that values of variables are written out to memory at positions where the user can set a breakpoint; the code doesn't need to preserve every operation exactly as written, or read variables in from memory that are already in registers. Kind of an -O0.5 would be more desirable than what we have now.
Re: Designs for better debug info in GCC
Alexandre Oliva wrote: > On Nov 8, 2007, Robert Dewar <[EMAIL PROTECTED]> wrote: >> My general feelings on this subject: >> 1. I don't think we should care much about the ability to >> *SET* values of variables in optimized code. > Indeed. We should care about correctness of debug information, and > then this ability will come naturally ;-) Not really, there are optimizations that will still allow reading the value of a variable, but not setting it, and I think it is just fine to do these optimizations. For instance if we have b = a; the optimizer may not do a copy, it may simply know that b and a values are in the same place. This does not stand in the way of reading the value, but it does make it impossible to write a or b. Similarly, if the optimizer does test replacement, and knows that the value of a can be obtained by evaluating some expression, the debugger can read the value, but may not be able to set it. >> 3. The quality of code at -O0 is really terrible > That's a feature, no?
Re: Designs for better debug info in GCC
Alexandre Oliva <[EMAIL PROTECTED]> writes: > So... The compiler is outputting code that tells other tools where to > look for certain variables at run time, but it's putting incorrect > information there. How can you possibly argue that this is not a code > correctness issue? I don't see any point to going around this point again, so I'll just note that I disagree. > >> >> > We've fixed many many bugs and misoptimizations over the years due to > >> >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake > >> >> > we've made in the past. > >> >> > >> >> That's a valid concern. However, per this reasoning, we might as well > >> >> push every operand in our IL to separate representations, because > >> >> there have been so many bugs and misoptimizations over the years, > >> >> especially when the representation didn't make transformations > >> >> trivially correct. > >> > >> > Please don't use strawman arguments. > >> > >> It's not, really. A reference to an object within a debug stmt or > >> insn is very much like any other operand, in that most optimizer > >> passes must keep them up to date. If you argue for pushing them > >> outside the IL, why would any other operands be different? > > > I think you misread me. I didn't argue for pushing debugging > > information outside the IL. I argued against a specific > > implementation--DEBUG_INSN--based on our experience with similar > > implementations. > > Do you remember any other notes that contained actual rtx expressions > and expected optimization passes to keep them accurate? No. > Do you think > we'd gain anything by moving them to a separate, out-of-line > representation? I don't know. I don't see such a proposal on the table, and I don't have one myself, so I don't know how to evaluate it. Ian
New branches for ix86 backporting
I created two new branches to allow companies that create ix86 processors such as AMD and Intel to backport the changes for these new chipsets into GCC 4.1 and 4.2 branches. It is expected that any changes in this branch will be put into the mainline first and then backported. When 4.3 branches, I will make a branch for that. I and H. J. will be maintainers of the branch, and I would imagine we will do merges from the official 4.2/4.1 branches. svn+ssh://gcc.gnu.org/svn/gcc/branches/ix86/gcc-4_1-branch svn+ssh://gcc.gnu.org/svn/gcc/branches/ix86/gcc-4_2-branch -- Michael Meissner, AMD 90 Central Street, MS 83-29, Boxborough, MA, 01719, USA [EMAIL PROTECTED]
Re: Designs for better debug info in GCC
On Nov 8, 2007, Mark Mitchell <[EMAIL PROTECTED]> wrote:

> Alexandre Oliva wrote:
>> On Nov 7, 2007, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>>
>>> Until we all know what we're trying to do
>>
>> Here's what I am trying to do:

> I think these are laudable goals, but you didn't really provide the
> information I wanted.

Oh, you didn't want goals. Design and implementation plans more detailed than http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html, I suppose. Ok, let's see...

1. introduce, early in compilation (when entering SSA), annotations that map user-level variables whose location may vary throughout their lifetime to implementation-level variables or expressions at every point of assignment and PHI joins.

2. keep those annotations accurate throughout compilation, without letting them interfere with optimizations, but making sure they are kept up-to-date or marked untrackable.

3. in var-tracking, starting from the expressions in the annotations and their equivalent expressions computed with a dataflow-globalized cse analysis, emit traditional var-tracking var_location notes for all variables.

For variables that didn't start out as gimple regs, the current debug info behavior should be preserved.

> I think that most of the goals boil down to making sure that, at any
> point in the program, the debug information for a variable meets the
> following criteria:
> (a) if the variable has not been optimized away, gives the location
> where that variable's current value can be found, or
> (b) if the variable has been optimized away, and the value is not a
> constant, says that the value is not available, or
> (c) if the variable has been optimized away, but is a constant, says
> what the constant value is

Yes, except that instead of constant and constant value, I'd put it as 'computable expression from other live values'. And I'd say "locations" rather than just "location".

> But, how are we going to track this information? Algorithmically, what
> needs to change in the compiler to maintain this state?

Most optimization passes must already update uses of gimple or pseudo regs they modify, so these will be taken care of automatically (which is why I chose this representation).

Optimization passes that move assignments to an earlier point in the program don't need any modification. Those that move them to a later point will often move them past their debug notes. This means the debug notes need updating, but it also means that, in the absence of fixes, the debug notes most likely will stand in the way of the transformation, so testing that the debug notes don't change optimization behavior ought to catch these.

Transformations that copy or move blocks will retain the annotations, so this should "just work".

Transformations that delete blocks might be a bit of a problem, if they delete important debug annotations. So far, the only case I've noticed of such behavior is in ifcvt, in which an if-then-assign-else-assign set of blocks is turned into a single if-then-else assignment. This particular case is covered by the PHI statement that is placed in the entry point of the block that joins the then and the else.

On architectures that support longer blocks with conditional-execution of arbitrary instructions (arm, ia64), I'm not sure how to handle the debug notes. It seems to me that, with the current design, the variable may be regarded as untrackable after the first conditional assignment within the combined blocks, but at the join point there will be the debug annotation corresponding to the PHI join that will take care of getting a correct location for the variable again.

I don't have plans in place for any other kind of situation, but it appears to me that the notion of using assignments and joins as fixed points is solid, and I'm pretty sure any surprises can be overcome.
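To make the ifcvt case above concrete, here is a source-level picture of the two shapes involved. It is a sketch only: the function names are hypothetical, and the real transformation happens on GCC's internal representation rather than on C source.

```c
#include <assert.h>

/* Before if-conversion: v is assigned in two separate blocks, and a PHI
   at the join point merges the two values. */
int pick_branchy(int cond, int a, int b)
{
    int v;
    if (cond)
        v = a;
    else
        v = b;
    return v;
}

/* After if-conversion (conceptually): the two blocks are gone and a
   single conditional assignment remains. */
int pick_converted(int cond, int a, int b)
{
    int v = cond ? a : b;
    return v;
}

/* The transformation must not change behaviour for any input. */
void check_equivalent(int cond, int a, int b)
{
    assert(pick_branchy(cond, a, b) == pick_converted(cond, a, b));
}
```

Annotations attached to the two deleted assignments disappear with their blocks, but an annotation at the join still says where v lives from that point on, which is the situation the message describes as covered by the PHI.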
Of course software pipelining and other kinds of loop transformations will yield debug information that's not exactly easy to grasp, but this would be true of any representation. When the compiler messes too much with the code, there's very little one can do to make execution resemble that of sequential execution. I'm also thinking debug info consumers would probably enjoy some means to tell a point at which all side effects present in a certain source line have been completed. But these are mostly orthogonal issues, so I won't delve into them right now. > For example, we need some way for an optimization pass to tell the > rest of the compiler that a variable was completely eliminated. In the design I'm proposing, there's no need for anything explicit like this. This would require global information, which is undesirable, especially for optimizers that operate locally. What they'd have to do when they throw away a value that a debug annotation relies on is to replace that value with something equivalent, if they can, or to mark that particular annotation as untrackable. Then, if all annotations
Re: Designs for better debug info in GCC
Ian Lance Taylor wrote:
> Alexandre Oliva <[EMAIL PROTECTED]> writes:
>> So... The compiler is outputting code that tells other tools where to look for certain variables at run time, but it's putting incorrect information there. How can you possibly argue that this is not a code correctness issue?
> I don't see any point to going around this point again, so I'll just note that I disagree.

Well I very much agree. If you are writing certified code, then a number of evidence producing tools rely on the debugging information, and it is a problem if this information is incorrect.
Re: Designs for better debug info in GCC
First off I would like to say I did not want to reply but I guess I am going to because of some false information spreading around about what GCC as a compiler is.

On 11/7/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote:
> I'm personally getting numerous requests for debug information
> correctness and better completeness from debug info consumers such as
> gdb, frysk and systemtap. GCC's eagerness to inline functions, even
> ones never declared as inline, and its eagerness to corrupt the
> meta-information associated with them, causes these tools to
> malfunction in very many situations. And it's all GCC's fault, for
> generating code that is not standards-compliant in the
> meta-information sections of its output.

I have to ask, do you want an optimizing compiler or one which generates full debugging information? Because there are trade-offs here, really. The reason behind the extra inlining is that it improves code generation. I don't know about you, but in some areas of coding people need the extra speed/size reductions that inlining of non-user-marked functions provides. I have plenty of code which needs the speed that the extra inlining provides (remember some developers don't want to change the code that much to have the optimizing compiler do its work). Remember DWARF3 is not really a standard about meta-information; it just mentions how it is represented if it exists.

-- Pinski
Re: Designs for better debug info in GCC
Mark Mitchell wrote:
> Alexandre Oliva wrote:
>> On Nov 7, 2007, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>>> Until we all know what we're trying to do
>> Here's what I am trying to do:
> I think these are laudable goals, but you didn't really provide the information I wanted. In particular, I'd like to drill down from goals (like "ensure that, for every user variable for which we emit debug information, the information is correct") to concrete problems. I think that most of the goals boil down to making sure that, at any point in the program, the debug information for a variable meets the following criteria:
> (a) if the variable has not been optimized away, gives the location where that variable's current value can be found, or
> (b) if the variable has been optimized away, and the value is not a constant, says that the value is not available, or

Perhaps if the variable has been optimized away *but* it is possible to calculate its value by examining the state of the program, then we can emit the expression needed to calculate its value in the debugging information as well. I may be missing something, but it seems that may be part of Alexandre's plan as well.

> (c) if the variable has been optimized away, but is a constant, says what the constant value is
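The three criteria, plus the "computable expression" case suggested in the reply, can be written down as a small tagged union. This is a minimal sketch under stated assumptions: every name below is hypothetical, and it is neither GCC's internal representation nor a DWARF encoding.

```c
#include <assert.h>

/* What the debug information says about one variable at one program point. */
enum value_kind {
    VK_LOCATION,     /* (a) value lives in a known place                  */
    VK_UNAVAILABLE,  /* (b) optimized away, value cannot be recovered     */
    VK_CONSTANT,     /* (c) optimized away, but known to be a constant    */
    VK_EXPRESSION    /* extension: computable from other live values      */
};

struct value_info {
    enum value_kind kind;
    union {
        int reg;            /* VK_LOCATION: hypothetical register number  */
        long constant;      /* VK_CONSTANT: the constant value            */
        const char *expr;   /* VK_EXPRESSION: textual form, e.g. "i * 4"  */
    } u;
};

/* A debugger can show a value in every case except VK_UNAVAILABLE. */
int is_readable(const struct value_info *v)
{
    assert(v != 0);
    return v->kind != VK_UNAVAILABLE;
}

/* Only a variable with real storage can be assigned from the debugger. */
int is_writable(const struct value_info *v)
{
    assert(v != 0);
    return v->kind == VK_LOCATION;
}
```

Under this model the debated difference is small: case (c) is just the special case of an expression with no inputs, and only case (a) supports setting a value.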
Re: Designs for better debug info in GCC
On Nov 8, 2007, "Andrew Pinski" <[EMAIL PROTECTED]> wrote: > On 11/7/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote: >> I'm personally getting numerous requests for debug information >> correctness and better completeness from debug info consumers such as >> gdb, frysk and systemtap. GCC's eagerness to inline functions, even >> ones never declared as inline, and its eagerness to corrupt the >> meta-information associated with them, causes these tools to >> malfunction in very many situations. And it's all GCC's fault, for >> generating code that is not standards-compliant in the >> meta-information sections of its output. > I have to ask, do you want an optimizing compiler or one which > generates full debugging information I want both. That's the whole point of this project I'm in. > Because there are trade off here really. For a superficial look at the problem, they might look like trade-offs. But the assumption that it's impossible to get both is incorrect. It takes work, but it's not impossible. > The reason behind the extra inlining is because it > improves code generation. I don't see how you got the impression that I might be arguing against the inlining, as it looks like you did. I'm not. I'm arguing against the corruption of meta-information associated with them. That's just laziness on our part. > Remember dwarf3 is not really a standards about meta-information, it > just mentions how it represented if it exists. That's what meta-information is. One of the problems is that we often fail to represent information that does exist. A more serious problem is that we often represent such information incorrectly, making it seem like things that don't exist do, and that things are at different locations from those in which they actually are. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Progress on GCC plugins ?
Ian Lance Taylor wrote: > It seems very likely that it would be possible to write a plugin which > would generate IR to be fed into a proprietary backend. > > More deeply, I think his concern is misplaced. For the record, I agree on both points. There are some ways in which plugins might make getting proprietary code in the loop easier. For example, if someone plugs Guile into the plugin interface, then it might be easy to write a pile of LISP code to poke at GCC. I understand that there are differing opinions on whether or not these kinds of indirections affect the GPL. But, that's the kind of thing that the FSF gets to think about. But, as you say, the whole idea of the GPL is that people can go build what they want -- including plugin frameworks to GCC that might or might not allow them to interface with proprietary code. If it's really worth it to them, then they will. The FSF's best defense is to make it not worth it by providing enough valuable free software. Anyhow, in practical terms, debating this here probably will have zero impact on the outcome. The ball is in RMS' court, and SC members (including myself) have made many of the arguments that have been made in this thread. If people want to influence the FSF, the best approach is probably going to be sending mail directly to RMS, not discussing it on this list. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Designs for better debug info in GCC
Andrew Pinski wrote:
> I have to ask, do you want an optimizing compiler or one which generates full debugging information

Both! I would like modes which do the following

a) reasonable amount of optimization that does not interfere too much with debugging. The old GCC 3 -O1 was a close approximation to this (certainly a closer approximation than the current -O1).

b) all possible optimizations even if debuggability is compromised

That's a perfectly reasonable request, and we used to be pretty close to having it, but now -O1 has really degraded as a solution to a). Yes, it's somewhat more efficient, but I suspect that the small minority of those interested in the last bit of performance are using -O2 anyway, so I doubt many people get much benefit from the improved performance of -O1 code. On the other hand lots of people are negatively affected by the degrading of debugging in -O1 mode.

> Because there are trade off here really. The reason behind the extra inlining is because it improves code generation. I don't know about you but in some area of coding, they need the extra speed/size reductions that inlining of non user marked functions. I have plenty of code which needs the speed help that the extra inlining helps (remember some developers don't want to change the code that much to have the optimizing compiler do its work).

Obviously you don't want a lot of inlining unless the debugger can handle inlining properly if your interest is in being able to debug!

> Remember dwarf3 is not really a standards about meta-information, it just mentions how it represented if it exists.

But consumers want a debugger that works, without having to take the hit of huge volumes of code at -O0

> -- Pinski
Re: Designs for better debug info in GCC
On Nov 8, 2007, Ian Lance Taylor <[EMAIL PROTECTED]> wrote: > However, I don't think your arguments that this is > an issue comparable to code correctness are valid. It *is* code correctness. Say, if the linker emitted incorrect addresses in an executable, but the kernel and dynamic loader didn't rely on those addresses, would it not still be a bug in the linker? And then, if those tools started relying on those addresses and exposed the problem, would it be right to tell them they must not rely on them because they were broken in the past and we don't feel like correcting the linker? So... The compiler is outputting code that tells other tools where to look for certain variables at run time, but it's putting incorrect information there. How can you possibly argue that this is not a code correctness issue? > Incorrect generated code is a fatal problem in a compiler. > Incorrect debugging information is a quality of implementation > issue. Incomplete debugging information is a quality of implementation, just like missed optimizations. Incorrect compiler output is a bug. Claiming it's not just because tools you happen to rely on don't care about that piece of information won't make it any less of a bug. It may make it a less important bug for some time, but it's still a bug. >> >> > We've fixed many many bugs and misoptimizations over the years due to >> >> > NOTEs. I'm concerned that adding DEBUG_INSN in RTL repeats a mistake >> >> > we've made in the past. >> >> >> >> That's a valid concern. However, per this reasoning, we might as well >> >> push every operand in our IL to separate representations, because >> >> there have been so many bugs and misoptimizations over the years, >> >> especially when the representation didn't make transformations >> >> trivially correct. >> >> > Please don't use strawman arguments. >> >> It's not, really. 
A reference to an object within a debug stmt or >> insn is very much like any other operand, in that most optimizer >> passes must keep them up to date. If you argue for pushing them >> outside the IL, why would any other operands be different? > I think you misread me. I didn't argue for pushing debugging > information outside the IL. I argued against a specific > implementation--DEBUG_INSN--based on our experience with similar > implementations. Do you remember any other notes that contained actual rtx expressions and expected optimization passes to keep them accurate? All notes (as in matching NOTE_P) I remember didn't really contain rtx expressions. The first exception I remember is VAR_LOCATION, and this one explicitly does *not* want to be updated, for it's generated so late in the process. Conversely, REG_NOTES do contain rtx, and they often have to be updated, so that's the right representation for them. Do you think we'd gain anything by moving them to a separate, out-of-line representation? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: Reload using a live register to reload into
> This doesnt happen because while inserting the caller save insn, its > live_throughout is simply set to the live_throughout of the call insn > + the registers marked with REG_DEAD notes in the call insn. Ouch. Relying on REG_DEAD notes to get complete liveness info is a no-no: /* The value in REG dies in this insn (i.e., it is not needed past this insn). If REG is set in this insn, the REG_DEAD note may, but need not, be omitted. */ REG_NOTE (DEAD) This problem was apparently introduced a long time ago: http://gcc.gnu.org/ml/gcc-cvs/1999-12/msg00212.html > However since $c1 is an argument to the call it is used by the call_insn > and is marked REG_DEP_TRUE ( Read after Write). Yes, and it's because it is also set in the insn (as return value register) that it doesn't get the REG_DEAD note; the 3 other argument registers do. > Shouldnt regs in REG_DEP_TRUE be added to live_throughout. My suspicion is > that the LOG_LINKS are not always up-to-date, therefore will it be > better to use DF_INSN_UID_USES ? DF is supposed to be out of the game at this point, it has handed over the control since global.c:build_insn_chain as far as liveness info is concerned. The REG_DEP_TRUE are somewhat misleading, it's an artifact in the dump. What you're seeing are the contents of CALL_INSN_FUNCTION_USAGE, which are always correct, so a solution to your problem would be to scan it for uses of registers in caller-save.c:insert_one_insn. Of course this wouldn't plug the hole entirely but would very likely be sufficient in practice. This problem doesn't seem to have already been reported, probably because the compiler is not supposed to be using the return value register for reloading, unless seriously under pressure. Do you define REG_ALLOC_ORDER for your port? If so, in what position is $c1? -- Eric Botcazou
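The liveness gap discussed above can be modelled with plain bitmasks. This is a toy sketch, not GCC code: the type, macro and function names are hypothetical, and the real fix would scan CALL_INSN_FUNCTION_USAGE inside caller-save.c as suggested in the message.

```c
#include <assert.h>

typedef unsigned int regset;          /* toy register set: one bit per hard register */
#define TOY_REG(n) (1u << (n))

/* Behaviour described in the quoted report: the inserted save insn gets the
   call's live_throughout set plus the registers carrying REG_DEAD notes. */
regset live_from_dead_notes(regset call_live_throughout, regset dead_notes)
{
    return call_live_throughout | dead_notes;
}

/* Suggested variant: also add every register the call itself uses, so an
   argument register that has no REG_DEAD note is still treated as live. */
regset live_with_call_uses(regset call_live_throughout, regset dead_notes,
                           regset call_uses)
{
    regset live = live_from_dead_notes(call_live_throughout, dead_notes)
                  | call_uses;
    assert((live & call_uses) == call_uses);   /* no used register is lost */
    return live;
}
```

If register 1 stands for $c1 (an argument that is also the return value register, hence without a REG_DEAD note) and registers 2 to 4 stand for the other three arguments, the first function leaves register 1 out of the live set and the second includes it.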
Re: Designs for better debug info in GCC
I think both sides are talking past each other, partially because two different goals are in mind.

IMHO, there are two extremes when it comes to so-called debugging of optimized code. One camp wants full debuggability (let's call them the debuggability crowd), which means they want to know the value of any valid program state anywhere, and want to set breakpoints anywhere and even be able to change the program state anywhere, as if there were an assignment at the point where the debugger stopped the program. This camp still wants better performance (like everyone else), but they don't want to sacrifice debuggability for performance, because they rely on it.

The other camp is the performance crowd, who want the absolute best performance but still want as much debug information as possible. Most people fall in this camp and this is what gcc has implemented. This camp doesn't want to change the code so that they can get better debugging information.

Of course, the real world is somewhere in between, but in practice, most people fall in the latter group (aka performance crowd).

Alexandre's proposal would make it possible to make the debuggability crowd happy at some unknown compile-time/runtime cost and maintenance cost. Richard's proposal (from what I can understand) would make the performance crowd happy, since it would be less costly to implement than Alexandre's and would provide incrementally better debugging information than the current state, but it doesn't seem that it would make the debuggability crowd happy (or at least the extremists among the debuggability crowd).

So I think the difference in opinion isn't so much about whether Alexandre's proposal is good or bad, but rather about whether we aim to make the debuggability crowd happy, the performance crowd happy, or both. Ideally we should serve both groups of users, but there's a non-trivial ongoing maintenance cost for having two different approaches.
So I'd like to ask both Alexandre and Richard whether they each can satisfy the other camp, that is, Alexandre to come up with a way to tweak his proposal so that it is possible to keep the compile time cost comparable to what is right now with similar or better debug information, and with reasonable maintenance cost, and Richard whether his proposal can satisfy the debuggability crowd. Of course, another possible opinion would be to ignore the debuggability crowd on the ground that they are not important or big. I personally think it's a mistake to do so, but you may disagree on that point.

Seongbae

-- #pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";
Re: Designs for better debug info in GCC
Seongbae Park (박성배) wrote:
> Most people fall in this camp and this is what gcc has implemented. This camp doesn't want to change the code so that they can get better debugging information.

This is definitely not the case. At least among our users, very few fall into this camp. But in any case I think we all agree that there should be a mode in which this is the emphasis.

> Of course, the real world is somewhere in between, but in practice, most people fall in the latter group (aka performance crowd).

You must live in a strange world, after all think about it, lots of people find Java quite fine, even though it throws away a lot of performance.

> Of course, another possible opinion would be to ignore the debuggability crowd on the ground that they are not important or big.

Actually I think big serious users with programs in the millions of lines category are much more likely to be in the "debuggability" crowd.