Tags out of gcc
Hi All,

I have this brainstorm which I'd like to get some feedback on.

I reckon it's a bad idea to make source browsing info with a separate program like cscope or etags. I reckon it's the compiler's job. Why?

(1) Because only the compiler can do it authoritatively; after all, it decides what's in your executable.
(2) Because doing it properly involves following #ifdefs and #includes, which the compiler already does.
(3) Because the types of tag you want are defined by the language.
(4) Because there's very little to it after parsing, which the compiler is already doing.

I imagine doing it for C++ and outputting cscope format, which is reasonably expressive and popular.

I have no idea how hard it would be, but if I can bug people for help I'd be willing to give it a shot. Is there any interest in something like that?

Adrian.

PS: I'm not talking about local text completion here. I'm talking about finding your way around a huge project where cscope etc. give 50 different definitions of any important function because they can't follow the #ifdefs.
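To make the #ifdef point concrete, here is a minimal sketch. `USE_FAST_PATH` and `transport_name` are hypothetical names invented for this illustration. A text-based indexer that does not evaluate the condition would typically list both definitions, while the compiler only ever sees one of them:

```cpp
#include <string>

// Hypothetical configuration macro, made up for this example.
#define USE_FAST_PATH 1

#if USE_FAST_PATH
// The only definition the compiler sees with the setting above.
std::string transport_name() { return "fast"; }
#else
// Never compiled here, but a purely textual scan still finds it.
std::string transport_name() { return "portable"; }
#endif
```

Multiply this by every configuration switch in a large project and you get the long lists of candidate definitions described above.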
Re: Tags out of gcc
> I reckon it's a bad idea to make source browsing info with a separate
> program like cscope or etags. I reckon it's the compiler's job.

One of the issues with source browsing is that you want to be able to do it in the presence of syntax errors. That can make it harder for the compiler to do it, since it's usually not doing a robust parse in the presence of errors.
Re: Tags out of gcc
Well, it seems to be able to report a lot of syntax errors even if they're close together, so it must be getting back on its feet fairly quickly. I don't know how that works. Maybe it just scoots along to the next semicolon, or maybe you explicitly have productions like "if (syntax error) { ... }".

What I also don't know is what the parser outputs if there's an error. Can it say "he tried to define bool foo() at line 123 but the body was erroneous", or does it just print the error message and forget there was ever an attempt to define foo?

On 4 October 2014 18:17, Richard Kenner wrote:
>> I reckon it's a bad idea to make source browsing info with a separate
>> program like cscope or etags. I reckon it's the compiler's job.
>
> One of the issues with source browsing is that you want to be able to do
> it in the presence of syntax errors. That can make it harder for the
> compiler to do it since it's usually not doing a robust parse in the
> presence of errors.
Re: Tags out of gcc
> Well it seems to be able to report a lot of syntax errors even if
> they're close together, so it must be getting back on its feet fairly
> quickly. I don't know how that works. Maybe it just scoots along to
> the next semicolon or maybe you explicitly have productions like "if
> (syntax error) { ... }".
>
> What I also don't know is what the parser outputs if there's an error.
> Can it say "he tried to define bool foo() at line 123 but the body was
> erroneous", or does it just stdout the error message and forget there
> was ever an attempt to define foo?

You're missing my point by getting too deep into details. I'm making a more general point, which is that the parser of a compiler and a source browser have two different purposes. The purpose of the former is primarily to produce a parse tree of a correct program and secondarily to produce as many error messages as possible for an incorrect program. The purpose of the latter is to try to figure out as much as it can about the semantic meaning of what may be program fragments, and to be completely uncaring about the presence or absence of errors.

Although there is indeed significant commonality between these two purposes, there are very significant differences as well. For example, a compiler usually won't look at things such as indentation and whitespace at all (except maybe when deciding what message to give for errors, but I think only the Ada front end does this). A high-quality source browser, on the other hand, would rely more on indentation than on the exact parse, because in a program that is still being written the indentation is usually more likely to identify semantic constructs than a parse based on the tokens in the file.
RE: Tags out of gcc
> I imagine doing it for c++ and outputting cscope format which is
> reasonably expressive and popular.
>
> I have no idea how hard it would be, but if I can bug people for help
> I'd be willing to give it a shot.

There are two ways to do this with GCC. One is trivial and one is hard, but the hard one will likely give better results than the trivial one.

The trivial one is that you build a plugin (https://gcc.gnu.org/onlinedocs/gccint/Plugins.html) and hook it at PLUGIN_FINISH_DECL (and perhaps also at PLUGIN_FINISH_TYPE, not sure about that). You can then run the plugin in the same command that compiles your code. However, this approach has some limitations. It will not handle preprocessor macros. You'll need to add new plugin hooks to GCC (which I think would be welcome). And it may not work well in the presence of compilation errors. It will also be slower than it really needs to be (although perhaps faster than etags? The GCC parser is very optimized...).

The hard approach is that you contribute to the effort to make GCC more modular, so that you can call the functions in the C++ parser that you really need while ignoring the rest of the compiler. Then you will be able to build a stand-alone program that does what you want without requiring a complete gcc. The way to do this is to join the GCC project, create a branch and try to build a prototype that doesn't break the compiler and allows you to achieve what you want. Then, propose to merge your changes to the main development branch.

I would suggest starting with the trivial approach, getting used to GCC development, and then thinking about what it would take to do the hard approach.

Cheers,

Manuel.
Re: Tags out of gcc
On 4 October 2014 15:47, Manuel López-Ibáñez wrote:
> The trivial one is that you build a plugin
> (https://gcc.gnu.org/onlinedocs/gccint/Plugins.html) and hook it at
> PLUGIN_FINISH_DECL (and perhaps also at PLUGIN_FINISH_TYPE, not sure
> about that). You can then run the plugin in the same command that
> compiles your code.

Does that hook get called for uninstantiated templates?

A source browser should index C++ templates where they are defined, not only if they are used.
Re: Tags out of gcc
On 4 October 2014 21:07, Jonathan Wakely wrote:
> On 4 October 2014 15:47, Manuel López-Ibáñez wrote:
>> The trivial one is that you build a plugin
>> (https://gcc.gnu.org/onlinedocs/gccint/Plugins.html) and hook it at
>> PLUGIN_FINISH_DECL (and perhaps also at PLUGIN_FINISH_TYPE, not sure
>> about that). You can then run the plugin in the same command that
>> compiles your code.
>
> Does that hook get called for uninstantiated templates?
>
> A source browser should index C++ templates where they are defined,
> not only if they are used.

Maybe not, but it is a matter of adding more hooks, which seems easy enough. In any case, the advice is the same: Join us and together we'll conquer the galaxy, ah no, that's not it. It is: join GCC development and propose changes that enable what you want to do.

Nonetheless, if I wanted to try this idea, I would start with the hooks that are there already, so I wouldn't even need to modify GCC. Then I would test whether the result is fast enough, and then think about adding more hooks to catch anything that is missing.

Cheers,

Manuel.
Re: Tags out of gcc
At first sight, I prefer the hooks approach. Not just cos I'm a noob (although that is a compelling reason in itself) but also because it happens during the main compile. A separate invocation could have different flags, so it wouldn't be authoritative anymore.

But it absolutely has to follow the preprocessor, so how do I do that? I'm a bit surprised about that being a problem, cos when I look at preprocessor output it looks very convenient: I get one big file, but it's full of clues as to where it all came from. Perhaps I have to hook those clues.

Adrian