gcc@gcc.gnu.org
2008/8/14 Daniel Berlin <[EMAIL PROTECTED]>: > 1. You can't assume VUSE's are must-aliases. The fact that there is a > vuse for something does not imply it is must-used, it implies it is > may-used. > > We do not differentiate may-use from must-use in our alias system. You > can do some trivial must-use analysis if you like (by computing > cardinality of points-to set as either single or multiple and > propagating/meeting it in the right place). > > Must-use is actually quite rare. Then, is it impossible to distinguish the following testcase and the one from my previous mail with the current infrastructure? extern void foo (int *); extern void bar (int); void baz (void) { int i; if (i) /* { dg-warning "is used uninitialized" "uninit i warning" } */ bar (i); foo (&i); } > 2. " if (!gimple_references_memory_p (def)) > + return; > +" > Is nonsensical the SSA_NAME_DEF_STMT of a vuse must contain a vdef, > and thus must access memory. Two things here. 1) The case I am trying to war about is: # BLOCK 2 freq:1 # PRED: ENTRY [100.0%] (fallthru,exec) [/home/manuel/src/trunk/gcc/testsuite/gcc.dg/uninit-B.c : 12] # VUSE { iD.1951 } i.0D.1952_1 = iD.1951; [/home/manuel/src/trunk/gcc/testsuite/gcc.dg/uninit-B.c : 12] if (i.0D.1952_1 != 0) The def_stmt of i.0 is precisely that one. There is no vdef there. 2) I use that test to return early if the def_stmt of "t" does not reference memory. t is just a SSA_NAME (like i.0 above), I do not know whether its def_stmt has a VUSE like the above or not. I guess the test is redundant since SINGLE_SSA_USE_OPERAND will return NULL anyway. Is that what yo mean? Cheers, Manuel.
Better GCC diagnostics
Let's try to focus on what needs to be done looking for specific features (or fixes) and how we could do it: A) Printing the input expression instead of re-constructing it. As Joseph explained, this will fix the problems that Aldy mentioned (PR3544[123] and PR35742) and this requires: 1) For non-preprocessed expr we need at least two locations per expr (beg/end). This will require changes on the build_* functions to handle multiple locations. 1b) For each preprocessed token, we would need to keep two locations: one for the preprocessed location and another for the original location. As Joseph pointed out, ideally we should be able to find a way to track this with a single location_t object so we do not need 4 locations per expr. 2) Changes in the parser to pass down the correct locations to the build_* functions. 3) A location(s) -> source strings interface and machinery. Ideally, this should be more or less independent of CPP, so CPP (through the diagnostics machinery) calls into this when needed and not the other way around. This can be implemented in several ways: a) Keeping the CPP buffers in memory and having in line-maps pointers directly into the buffers contents. This is easy and fast but potentially memory consuming. Care to handle charsets, tabs, etc must be taken into account. Factoring out anything useful from libcpp would help to implement this. b) Re-open the file and fseek. This is not trivial since we need to do it fast but still do all character conversions that we did when libcpp opened it the first time. This is approximately what Clang (LLVM) does and it seems they can do it very fast by keeping a cache of buffers ever reopened. I think that thanks to our line-maps implementation, we can do the seeking quite more efficiently in terms of computation time. However, opening files is quite embedded into CPP, so that would need to be factored out so we can avoid any unnecessary CPP stuff when reopening but still do it *properly* and *efficiently*. 4) Changes in the diagnostics machinery to extract locations from expr and print a string from a source file instead of re-constructing things. 5) Handle locations during folding or avoid aggressive folding in the front-ends. 6) Handle locations during optimisation or update middle-end diagnostics to not rely in perfect location information. This probably means not using %qE, not column info, and similar limitations. Some trade-off must be investigated. B) Printing accurate column information. This requires: *) Preprocessed/original locations in a single location_t. Similar as (A.1b) above. *) Changes in the parser to pass down the correct locations to diagnostics machinery. Similar to (A.2) above. B.1) Changes in the testsuite to enable testing column numbers. C) Consistent diagnostics. This requires: C.1) Make CPP use the diagnostics machinery. This will fix part of PR7263 and other similar bugs where there is a mismatch between the diagnostics machinery and CPP's own diagnostics machinery. *) Preprocessed/original locations in a single location_t. This will avoid different behaviour when a token comes from a macro expansion. Similar as (A.1b) above. D) Printing Ranges. This requires: *) Printing accurate column information. Similar to (B) above. *) A location(s) -> source strings interface and machinery. Similar to (A.3) above. *) Changes in the parser to pass down ranges. Similar to (A.2) above. D.1) Changes in the testsuite to enable testing ranges. D.2) Changes in the diagnostics machinery to handle ranges. E) Caret diagnostics. This requires: *) Printing accurate column information. Similar to (B) above. *) A location(s) -> source strings interface and machinery. Similar to (A.3) above. E.1) Changes in the diagnostics machinery to print the source line and a caret. I have copied this in the wiki so anyone can update it or add comments: http://gcc.gnu.org/wiki/Better_Diagnostics I have some patches to make the diagnostic functions take explicit locations and I hope to send them soon. Apart from those, I personally don't have any specific plans to address any of the points above in the near future because of lack of free time and I still have a long queue of some trivial patches that I would like to get rid of before we enter in regression-only mode. Cheers, Manuel.
Re: Better GCC diagnostics
"Manuel López-Ibáñez" <[EMAIL PROTECTED]> writes: > Let's try to focus on what needs to be done looking for specific > features (or fixes) and how we could do it: Thanks for writing this up. > A) Printing the input expression instead of re-constructing it. As >Joseph explained, this will fix the problems that Aldy mentioned >(PR3544[123] and PR35742) and this requires: > > 1) For non-preprocessed expr we need at least two locations per expr > (beg/end). This will require changes on the build_* functions to > handle multiple locations. This is probably obvious, but can you outline why we need two locations for each expression? The tools with which I am familiar only print a single caret. What would use the two locations for? > b) Re-open the file and fseek. This is not trivial since we need > to do it fast but still do all character conversions that we > did when libcpp opened it the first time. This is > approximately what Clang (LLVM) does and it seems they can do > it very fast by keeping a cache of buffers ever reopened. I > think that thanks to our line-maps implementation, we can do > the seeking quite more efficiently in terms of computation > time. However, opening files is quite embedded into CPP, so > that would need to be factored out so we can avoid any > unnecessary CPP stuff when reopening but still do it > *properly* and *efficiently*. If we are going to reopen the file, then why do we need to record the locations in the preprocessed token stream? If we keep, for each source line, the file offset in the file of the start of that source line, then I think that printing the line from the source file would be pretty fast. That would not be free but it would be much cheaper than keeping the entire input file. Various optimizations are possible--e.g., keep the file offset for every 16th line. Conversely, perhaps we could record the line number and the charset conversion state at each 4096th byte in the file; that would let us quickly find the page in the file which contains the line. Ian
gcc@gcc.gnu.org
On Fri, Aug 15, 2008 at 8:06 AM, Manuel López-Ibáñez <[EMAIL PROTECTED]> wrote: > 2008/8/14 Daniel Berlin <[EMAIL PROTECTED]>: >> 1. You can't assume VUSE's are must-aliases. The fact that there is a >> vuse for something does not imply it is must-used, it implies it is >> may-used. >> >> We do not differentiate may-use from must-use in our alias system. You >> can do some trivial must-use analysis if you like (by computing >> cardinality of points-to set as either single or multiple and >> propagating/meeting it in the right place). >> >> Must-use is actually quite rare. > > Then, is it impossible to distinguish the following testcase and the > one from my previous mail with the current infrastructure? If by current you mean "code that already exists", then yes :) You could write code to do further analysis, but with the existing code, it will not work. > >> 2. " if (!gimple_references_memory_p (def)) >> + return; >> +" >> Is nonsensical the SSA_NAME_DEF_STMT of a vuse must contain a vdef, >> and thus must access memory. > > Two things here. > > 1) The case I am trying to war about is: > > # BLOCK 2 freq:1 > # PRED: ENTRY [100.0%] (fallthru,exec) > [/home/manuel/src/trunk/gcc/testsuite/gcc.dg/uninit-B.c : 12] # VUSE > { iD.1951 } > i.0D.1952_1 = iD.1951; > [/home/manuel/src/trunk/gcc/testsuite/gcc.dg/uninit-B.c : 12] if > (i.0D.1952_1 != 0) > > The def_stmt of i.0 is precisely that one. There is no vdef there. Sure but this is a default def, which are special, and do nothing anyway. > > 2) I use that test to return early if the def_stmt of "t" does not > reference memory. t is just a SSA_NAME (like i.0 above), I do not know > whether its def_stmt has a VUSE like the above or not. I guess the > test is redundant since SINGLE_SSA_USE_OPERAND will return NULL > anyway. Is that what yo mean? > No, i mean any SSA_NAME_DEF_STMT for a vuse that is not a default_def will reference memory.
gcc@gcc.gnu.org
On Fri, Aug 15, 2008 at 10:58 AM, Daniel Berlin <[EMAIL PROTECTED]> wrote: > On Fri, Aug 15, 2008 at 8:06 AM, Manuel López-Ibáñez > <[EMAIL PROTECTED]> wrote: >> 2008/8/14 Daniel Berlin <[EMAIL PROTECTED]>: >>> 1. You can't assume VUSE's are must-aliases. The fact that there is a >>> vuse for something does not imply it is must-used, it implies it is >>> may-used. >>> >>> We do not differentiate may-use from must-use in our alias system. You >>> can do some trivial must-use analysis if you like (by computing >>> cardinality of points-to set as either single or multiple and >>> propagating/meeting it in the right place). >>> >>> Must-use is actually quite rare. >> >> Then, is it impossible to distinguish the following testcase and the >> one from my previous mail with the current infrastructure? > > If by current you mean "code that already exists", then yes :) > You could write code to do further analysis, but with the existing > code, it will not work. FWIW, it is actually worse than the cases you have posited so far. Consider the following simple case (which are different from yours in that the conditionals are not dependent on maybe uninitialized variables), where you will miss an obvious warning. extern int foo(int *); extern int bar(int); int main(int argc, char **argv) { int a; if (argc) foo (&a) /* VUSE of a will be a phi node, but it still may be used uninitialized. */ bar(a); } Realistically, to get good results, you would have to track may-use vs must-use and also propagate where the default def is being used when the default_def is not from a parameter. (noticing that the a is a must-use there and comes from a phi node whose arguments contain the default def would prove it is uninitialized along some path) --Dan
Re: Better GCC diagnostics
2008/8/15 Ian Lance Taylor <[EMAIL PROTECTED]>: > "Manuel López-Ibáñez" <[EMAIL PROTECTED]> writes: > >> A) Printing the input expression instead of re-constructing it. As >>Joseph explained, this will fix the problems that Aldy mentioned >>(PR3544[123] and PR35742) and this requires: >> >> 1) For non-preprocessed expr we need at least two locations per expr >> (beg/end). This will require changes on the build_* functions to >> handle multiple locations. > > This is probably obvious, but can you outline why we need two > locations for each expression? The tools with which I am familiar > only print a single caret. What would use the two locations for? This has nothing to do with caret diagnostics. This is an orthogonal issue that would share some infrastructure as Joseph explained. If you do warning("called object %qE is not a function", expr); for ({break;})(); we currently try to re-construct expr and that fails in some cases (see the PRs referenced). #'goto_expr' not supported by pp_c_expression#'bug.c: In function 'foo': bug.c:4: error: called object is not a function The alternative is to print whatever we parsed when building expr. To do that we would need to have begin/end locations for expr, and then do a location_t->const char * translation and print whatever is between those two pointers: bug.c:4: error: called object '({break;})' is not a function Is it clear now? If so, I will update the wiki to put this example. >> b) Re-open the file and fseek. This is not trivial since we need >> to do it fast but still do all character conversions that we >> did when libcpp opened it the first time. This is >> approximately what Clang (LLVM) does and it seems they can do >> it very fast by keeping a cache of buffers ever reopened. I >> think that thanks to our line-maps implementation, we can do >> the seeking quite more efficiently in terms of computation >> time. However, opening files is quite embedded into CPP, so >> that would need to be factored out so we can avoid any >> unnecessary CPP stuff when reopening but still do it >> *properly* and *efficiently*. > > If we are going to reopen the file, then why do we need to record the > locations in the preprocessed token stream? Because for some diagnostics we want to give the warnings in the instantiation point not in the macro definition point. Moreover, this is what we currently do, so if we don't want to change the current behaviour, we need to track both locations. Example /*header.h*/ #pragma GCC system_header #define BIG 0x1b27da572ef3cd86ULL /* file.c */ #include "pr7263.h" __extension__ unsigned long long bar () { return BIG; } We print a diagnostic at file.c for the expansion of BIG. However, since we do not have the original location we cannot check that the token comes from a system header, and we do not suppress the warning. There are more subtle bugs that arise from not having the original location available. See PR36478. BTW, Clang takes into account both locations when printing diagnostics. > If we keep, for each source line, the file offset in the file of the > start of that source line, then I think that printing the line from > the source file would be pretty fast. That would not be free but it > would be much cheaper than keeping the entire input file. Various Cheaper in terms of memory. It cannot be cheaper in terms of compilation time than a direct pointer to the already opened buffer for each line-map. > optimizations are possible--e.g., keep the file offset for every 16th > line. Conversely, perhaps we could record the line number and the > charset conversion state at each 4096th byte in the file; that would > let us quickly find the page in the file which contains the line. I am not sure how you plan such approach to interact with mapped-locations. I think that having an offset for each line-map and then seeking until you find the correct position would be fine for an initial implementation. More complex setups could be tested against this basic implementation. And any optimization done here could be done with the buffer already opened, so yes, cheaper in terms of memory maybe but not cheaper in terms of compilation time. If this is abstracted enough both approaches could perhaps coexist and share the optimizations: while the front-end is working (where most of the diagnostics come from) keep the buffers around, when going into middle-end free them and if we need to give a diagnostic from the middle-end, reopen and seek. But all this relies on someone factoring file opening and charset conversion out of CCP first. Once that is done, we could do whatever strategy or both or something else. Cheers, Manuel.
gcc@gcc.gnu.org
2008/8/15 Daniel Berlin <[EMAIL PROTECTED]>: > On Fri, Aug 15, 2008 at 10:58 AM, Daniel Berlin <[EMAIL PROTECTED]> wrote: >> On Fri, Aug 15, 2008 at 8:06 AM, Manuel López-Ibáñez >> <[EMAIL PROTECTED]> wrote: >>> 2008/8/14 Daniel Berlin <[EMAIL PROTECTED]>: 1. You can't assume VUSE's are must-aliases. The fact that there is a vuse for something does not imply it is must-used, it implies it is may-used. We do not differentiate may-use from must-use in our alias system. You can do some trivial must-use analysis if you like (by computing cardinality of points-to set as either single or multiple and propagating/meeting it in the right place). Must-use is actually quite rare. >>> >>> Then, is it impossible to distinguish the following testcase and the >>> one from my previous mail with the current infrastructure? >> >> If by current you mean "code that already exists", then yes :) >> You could write code to do further analysis, but with the existing >> code, it will not work. > > FWIW, it is actually worse than the cases you have posited so far. > > Consider the following simple case (which are different from yours in > that the conditionals are not dependent on maybe uninitialized > variables), where you will miss an obvious warning. > > extern int foo(int *); > extern int bar(int); > int main(int argc, char **argv) > { > int a; > > if (argc) > foo (&a) > /* VUSE of a will be a phi node, but it still may be used uninitialized. */ > bar(a); > } > > > Realistically, to get good results, you would have to track may-use vs > must-use and also propagate where the default def is being used when > the default_def is not from a parameter. > The problem in the original testcase is that the default def of variable 'c' is a VUSE in a statement that does not even use c. # BLOCK 2 freq:1 # PRED: ENTRY [100.0%] (fallthru,exec) [/home/manuel/src/trunk/gcc/builtins.c : 11095] # VUSE { cD.68618 } D.68627_3 = validate_argD.45737 (s1D.68612_2(D), 10); Moreover, if you check fold_builtin_strchr in builtins.c, it is clear that there is no path along which c is used uninitialized. Cheers, Manuel.
Re: Better GCC diagnostics
D) Printing Ranges. This requires: *) Printing accurate column information. Similar to (B) above. *) A location(s) -> source strings interface and machinery. Similar to (A.3) above. Ranges also require some way to get the end of a token (in addition to its beginning). For example, a range for: X + some_long\ _ident??/ ifier The range should start at "X" and end at "r". This is just a location like any other, but requires passing down like the begin loc. You might instead decide to do some fuzzy matching or something, but clang at least gets this right. This is important for other clients of the loc info, e.g. refactoring clients. -Chris
Re: Better GCC diagnostics
2008/8/15 Chris Lattner <[EMAIL PROTECTED]>: >> D) Printing Ranges. This requires: >> >> *) Printing accurate column information. Similar to (B) above. >> >> *) A location(s) -> source strings interface and machinery. Similar >> to (A.3) above. > > Ranges also require some way to get the end of a token (in addition to its > beginning). For example, a range for: > > X + some_long\ > _ident??/ > ifier > > The range should start at "X" and end at "r". This is just a location like > any other, but requires passing down like the begin loc. You might instead > decide to do some fuzzy matching or something, but clang at least gets this > right. This is important for other clients of the loc info, e.g. > refactoring clients. Oh yes. Well, there is a lot of fine-tunning to do but I think that would be covered by A.1 and the binary_op expression would have at least two locations begin/end pointing to X and r. If we are able to print ({break;}), in the example I gave earlier, then we will be able to print nice ranges most of the time. Cheers, Manuel.
Re: Better GCC diagnostics
On Fri, 15 Aug 2008, Manuel López-Ibáñez wrote: > 5) Handle locations during folding or avoid aggressive folding in > the front-ends. I plan to delay folding for C (beyond some folding for expressions of constants) for 4.5, probably in October. (I do not plan to do anything for C++, and the folding will still happen before gimplification, though I believe that in principle most of what fold-const does would be better done on language-independent IR after gimplification.) Note that for correct ranges you need to handle locations for 0 and (0) and ((0)) and so on, so either need duplicate trees for constants and parenthesised expressions or need to keep the locations outside the trees. It's possible you can keep them in the c_expr structures or other local variables, depending on whether you ever need locations for both an expression and deeply nested subexpressions of it. -- Joseph S. Myers [EMAIL PROTECTED]
gcc@gcc.gnu.org
On Fri, Aug 15, 2008 at 11:31 AM, Manuel López-Ibáñez <[EMAIL PROTECTED]> wrote: > 2008/8/15 Daniel Berlin <[EMAIL PROTECTED]>: >> On Fri, Aug 15, 2008 at 10:58 AM, Daniel Berlin <[EMAIL PROTECTED]> wrote: >>> On Fri, Aug 15, 2008 at 8:06 AM, Manuel López-Ibáñez >>> <[EMAIL PROTECTED]> wrote: 2008/8/14 Daniel Berlin <[EMAIL PROTECTED]>: > 1. You can't assume VUSE's are must-aliases. The fact that there is a > vuse for something does not imply it is must-used, it implies it is > may-used. > > We do not differentiate may-use from must-use in our alias system. You > can do some trivial must-use analysis if you like (by computing > cardinality of points-to set as either single or multiple and > propagating/meeting it in the right place). > > Must-use is actually quite rare. Then, is it impossible to distinguish the following testcase and the one from my previous mail with the current infrastructure? >>> >>> If by current you mean "code that already exists", then yes :) >>> You could write code to do further analysis, but with the existing >>> code, it will not work. >> >> FWIW, it is actually worse than the cases you have posited so far. >> >> Consider the following simple case (which are different from yours in >> that the conditionals are not dependent on maybe uninitialized >> variables), where you will miss an obvious warning. >> >> extern int foo(int *); >> extern int bar(int); >> int main(int argc, char **argv) >> { >> int a; >> >> if (argc) >> foo (&a) >> /* VUSE of a will be a phi node, but it still may be used uninitialized. */ >> bar(a); >> } >> >> >> Realistically, to get good results, you would have to track may-use vs >> must-use and also propagate where the default def is being used when >> the default_def is not from a parameter. >> > > The problem in the original testcase is that the default def of > variable 'c' is a VUSE in a statement that does not even use c. > It may-uses c, as we've been through. > # BLOCK 2 freq:1 > # PRED: ENTRY [100.0%] (fallthru,exec) > [/home/manuel/src/trunk/gcc/builtins.c : 11095] # VUSE > { cD.68618 } > D.68627_3 = validate_argD.45737 (s1D.68612_2(D), 10); > > Moreover, if you check fold_builtin_strchr in builtins.c, it is clear > that there is no path along which c is used uninitialized. This is not a default def. cD.68618_34(D) is the default def. if you look at default_def (c) it will be a NOP_EXPR statement.
prevent optimisation of variables on SSA
Hi all, I currently introduce a temporary variable on SSA form but it does not survive the SSA optimisation passes. (Not even the simple ones triggered with -O1). Since the temporary variable is considered to be a gimple register it is not possible to set a VUSE or VDEF for it. (I tried and it breaks the SSA properties). Are there any mechanisms to work around this restriction and set the VDEF/VUSE anyway or is there a way to tell the SSA optimisers not to touch this variable (or SSA_NAME)? Thank you very much for sharing your thoughts! Regards, Martin
Re: prevent optimisation of variables on SSA
On Fri, Aug 15, 2008 at 7:00 PM, Martin Schindewolf <[EMAIL PROTECTED]> wrote: > Hi all, > > I currently introduce a temporary variable on SSA form but it > does not survive the SSA optimisation passes. (Not even the > simple ones triggered with -O1). Since the temporary variable > is considered to be a gimple register it is not possible to > set a VUSE or VDEF for it. (I tried and it breaks the SSA > properties). Are there any mechanisms to work around this > restriction and set the VDEF/VUSE anyway or is there a way to > tell the SSA optimisers not to touch this variable (or > SSA_NAME)? > Thank you very much for sharing your thoughts! You need to better explain what you are trying to do. Richard.
Re: Bootstrap broken on x86_64-linux
On Thu, 2008-08-14 at 14:41 -0700, H.J. Lu wrote: > It should be fixed now. Thanks a lot for the quick fix. My problem is that I don't have access to a machine with GFC_REAL_16 and working autoconf2.59, so possible problems in cut&paste code tend to be hidden from me. I'll make special mention of this the next time I submit a patch containing #ifdef HAVE_GFC_REAL_16, to make sure that somebody tests this before commit. Thomas
Re: Better GCC diagnostics
"Manuel López-Ibáñez" <[EMAIL PROTECTED]> writes: > 2008/8/15 Ian Lance Taylor <[EMAIL PROTECTED]>: >> "Manuel López-Ibáñez" <[EMAIL PROTECTED]> writes: >> >>> A) Printing the input expression instead of re-constructing it. As >>>Joseph explained, this will fix the problems that Aldy mentioned >>>(PR3544[123] and PR35742) and this requires: >>> >>> 1) For non-preprocessed expr we need at least two locations per expr >>> (beg/end). This will require changes on the build_* functions to >>> handle multiple locations. >> >> This is probably obvious, but can you outline why we need two >> locations for each expression? The tools with which I am familiar >> only print a single caret. What would use the two locations for? > > This has nothing to do with caret diagnostics. This is an orthogonal > issue that would share some infrastructure as Joseph explained. If you > do > > warning("called object %qE is not a function", expr); > > for > > ({break;})(); > > we currently try to re-construct expr and that fails in some cases > (see the PRs referenced). > > #'goto_expr' not supported by pp_c_expression#'bug.c: In function 'foo': > bug.c:4: error: called object is not a function > > The alternative is to print whatever we parsed when building expr. To > do that we would need to have begin/end locations for expr, and then > do a location_t->const char * translation and print whatever is > between those two pointers: > > bug.c:4: error: called object '({break;})' is not a function > > > Is it clear now? If so, I will update the wiki to put this example. That is clear. Thanks. I personally would be perfectly happy if the compiler said bug.c:4.COLUMN: error: called object is not a function That is, fixing the compiler to includes parts of the source code in the error message itself is, for me, of considerably lower priority than fixing the compiler to generate good column numbers. >>> b) Re-open the file and fseek. This is not trivial since we need >>> to do it fast but still do all character conversions that we >>> did when libcpp opened it the first time. This is >>> approximately what Clang (LLVM) does and it seems they can do >>> it very fast by keeping a cache of buffers ever reopened. I >>> think that thanks to our line-maps implementation, we can do >>> the seeking quite more efficiently in terms of computation >>> time. However, opening files is quite embedded into CPP, so >>> that would need to be factored out so we can avoid any >>> unnecessary CPP stuff when reopening but still do it >>> *properly* and *efficiently*. >> >> If we are going to reopen the file, then why do we need to record the >> locations in the preprocessed token stream? > > Because for some diagnostics we want to give the warnings in the > instantiation point not in the macro definition point. Moreover, this > is what we currently do, so if we don't want to change the current > behaviour, we need to track both locations. > > Example > > /*header.h*/ > #pragma GCC system_header > #define BIG 0x1b27da572ef3cd86ULL > > /* file.c */ > #include "pr7263.h" > __extension__ unsigned long long > bar () > { > return BIG; > } > > We print a diagnostic at file.c for the expansion of BIG. However, > since we do not have the original location we cannot check that the > token comes from a system header, and we do not suppress the warning. > There are more subtle bugs that arise from not having the original > location available. See PR36478. > > BTW, Clang takes into account both locations when printing diagnostics. Perhaps I misunderstand what you mean by recording the location in the preprocessed token stream. You evidently do not mean getting column numbers for the preprocessed code. You mean that when a preprocessor macro is expanded, we should record both the location where the macro is used, and also some sort of reference to the macro so that we know the location where the macro was defined. Is that right? >> If we keep, for each source line, the file offset in the file of the >> start of that source line, then I think that printing the line from >> the source file would be pretty fast. That would not be free but it >> would be much cheaper than keeping the entire input file. Various > > Cheaper in terms of memory. It cannot be cheaper in terms of > compilation time than a direct pointer to the already opened buffer > for each line-map. Except that we know that increased memory use leads to increased compile time. Diagnostic printing can't be slow, but it's also not on the critical path. Most code does not generate diagnostics. So there is a balance to be struck. Ian
Re: Better GCC diagnostics
2008/8/15 Ian Lance Taylor <[EMAIL PROTECTED]>: >> >> BTW, Clang takes into account both locations when printing diagnostics. > > Perhaps I misunderstand what you mean by recording the location in the > preprocessed token stream. You evidently do not mean getting column > numbers for the preprocessed code. You mean that when a preprocessor > macro is expanded, we should record both the location where the macro > is used, and also some sort of reference to the macro so that we know > the location where the macro was defined. Is that right? > I don't see the difference. We do currently assign to the preprocessed code the location of the macro call. So we get column numbers to the preprocessed code. In addition, we should record the location were the macro was defined. In the previous example, we should be able to obtain somehow two locations for 0x1b27da572ef3cd86ULL, one for its position in header.h and another for its final position in file.c. Whether we record two locations explicitly or we keep some lookup table of macro expansions/definitions would depend on what is found to be the most efficient implementation. I don't know which approach Clang follows but in our case it would be better to use the mapped-location infrastructure to track this for us in a single source_location number. Cheers, Manuel.
Re: Better GCC diagnostics
On Fri, 15 Aug 2008, Ian Lance Taylor wrote: > Perhaps I misunderstand what you mean by recording the location in the > preprocessed token stream. You evidently do not mean getting column > numbers for the preprocessed code. You mean that when a preprocessor > macro is expanded, we should record both the location where the macro > is used, and also some sort of reference to the macro so that we know > the location where the macro was defined. Is that right? My use case for locations in preprocessed code is tracking down diagnostics arising in the expansions of complicated macros. I find that's the main case where GCC's existing diagnostics are unhelpful; in that case I would like an option to show the relevant fragment of preprocessed source that causes the diagnostic. There are also easy heuristics to decide whether the preprocessed source is likely to be more useful than the non-preprocessed source. Indexes into a table of tokens would be one possibility; C++ already creates such a table in the course of lexing the whole input up front. -- Joseph S. Myers [EMAIL PROTECTED]
gcc-4.4-20080815 is now available
Snapshot gcc-4.4-20080815 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20080815/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.4 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 139138 You'll find: gcc-4.4-20080815.tar.bz2 Complete GCC (includes all of below) gcc-core-4.4-20080815.tar.bz2 C front end and core compiler gcc-ada-4.4-20080815.tar.bz2 Ada front end and runtime gcc-fortran-4.4-20080815.tar.bz2 Fortran front end and runtime gcc-g++-4.4-20080815.tar.bz2 C++ front end and runtime gcc-java-4.4-20080815.tar.bz2 Java front end and runtime gcc-objc-4.4-20080815.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.4-20080815.tar.bz2The GCC testsuite Diffs from 4.4-20080808 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.4 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: broken FE diagnostics wrt complex expressions
On Wed, Aug 13, 2008 at 12:52 PM, Aldy Hernandez <[EMAIL PROTECTED]> wrote: > > It seems to me that the only approach here would be to provide caret > diagnostics, because reconstructing the original sources from GENERIC > seems like a loosing proposition. Hi Aldy, I agree with your analysis. > > But, is this slew of work even worth it? For the long term benefit, yes. > I for one think that the > aforementioned PRs should at least be marked down considerably. 3 of > them are P2s-- and I think they should be some Pn+5, and/or mark them as > an enhancement request. > > Are there any thoughts on this (the PRs, the caret diagnostics, plan of > attack, etc?). I'm kind of busy at the moment, and had put off work on caret diagnostics. I've seen mail suggesting that some folks are working on it... > > Aldy >
Re: broken FE diagnostics wrt complex expressions
On Wed, Aug 13, 2008 at 2:16 PM, Joseph S. Myers <[EMAIL PROTECTED]> wrote: > On Wed, 13 Aug 2008, Aldy Hernandez wrote: > > I think it would certainly be reasonable to print for > anything unsupported instead of broken diagnostics, and to reclassify all > such bugs as wishlist requests for certain complex expressions to be > better supported in diagnostics. That makes sense to me. -- Gaby
Re: [PATCH] caret diagnostics (was: broken FE diagnostics wrt complex expressions)
On Thu, Aug 14, 2008 at 7:39 AM, Joseph S. Myers <[EMAIL PROTECTED]> wrote: > On Thu, 14 Aug 2008, Manuel López-Ibáñez wrote: > > I don't think the option should necessarily just be boolean; once choice > that may make sense would be caret diagnostics for the first diagnostic > from an input file only, to avoid blowing up the output size when one > mistake causes a cascade of diagnostics. (This is a matter of designing > the option as e.g. -fdiagnostics-show-caret={no,yes,first} rather than as > -f/fno-, not a matter of needing such a feature implemented in the first > version going on trunk.) Yes, -diagnostics-show-caret= is more flexible. > > -- > Joseph S. Myers > [EMAIL PROTECTED]
Re: [PATCH] caret diagnostics (was: broken FE diagnostics wrt complex expressions)
On Thu, Aug 14, 2008 at 12:14 PM, Aldy Hernandez <[EMAIL PROTECTED]> wrote: >> * In the near future, make -fdiagnostics-show-caret the default at >> least while in experimental mode or at least during stages1 and 2. >> When making a release -fno-diagnostics-show-caret would be the >> default. Do this through a configure option that sets the default. >> >> * In the far away future, review the defaults and get rid of the >> configure option. > > To be honest, I don't mind either way. Either your option or Joseph's > is fine. I am however interested in getting the caret diagnostics in, > and then I can work on location information that will benefit everyone. I'm in favor of getting -fdiagnostics-show-caret=no by default in this release, and enable people like you to get useful stuff done. That gives us time to iron out outstanding bugs for the next release (and making it the default). -- Gaby > > Aldy >
Re: [PATCH] caret diagnostics
On Thu, Aug 14, 2008 at 11:52 AM, Tom Tromey <[EMAIL PROTECTED]> wrote: >> "Joseph" == Joseph S Myers <[EMAIL PROTECTED]> writes: > I'd like to see carets on by default as part of a major release -- say > GCC 5.0. (First mention!!) 100% agreed. -- Gaby