On 8/18/06, Roel Meeuws <[EMAIL PROTECTED]> wrote:
In order to build a metrication tool I need a frontend that can provide me with an abstract syntax tree containing information on all actual language constructs in the code, and also a CFG representation. I reckon GCC has these capabilities, and I was wondering if any of you could tell me whether it is possible to use just GCC's frontend. Furthermore, where should I start? How do I extract the frontend from GCC? Which of the intermediate GCC representations could I use, and are they documented? I would like to thank you in advance for any help you can give me.
Right, so you want a count of source-level constructs, and basically something similar at the lower levels...

If you're going to do source-level metrics, you will have to instrument the front ends. All front ends, perhaps with the exception of Ada and C++, lower their input pretty quickly to a level where you can no longer distinguish, e.g., a for-loop from a while-loop, if that is something you're interested in. Depending on which language you'll be analyzing, or rather how many of them, I'd suggest you either instrument the parser for your metrics, or forget about source-level constructs and look at lower-level information only.

As for CFG work, you should probably write a tree pass and insert it at some point in the pass schedule (see passes.c). Depending on how close you want to stay to the original source code, you could put the pass early or late. If you put it late, you can analyze the optimized representation. In any case, you'll find that gcc builds a CFG pretty early on for GIMPLE (gcc's three-address, high-level intermediate representation), but this happens _after_ the front ends are done, and _after_ lowering to GIMPLE.

You can usually only find documentation on the front ends in the source code, but the gcc online documentation can guide you a bit there. So your first step would be to look at the GCC internals documentation at http://gcc.gnu.org/onlinedocs/. You'll want to work on GIMPLE (as opposed to RTL), which is reasonably well documented, again in the GCC internals documentation. And if you get stuck after looking for a while, you'll usually find someone helpful on this list.

You may also want to look at the GCC wiki (http://gcc.gnu.org/wiki/) and the Introspector Project (http://introspector.sourceforge.net/).

Hope this helps.

Gr.
Steven
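
A small experiment may make the point about early lowering concrete. The sketch below is not from the message above: the file name loops.c and the function names are made up for illustration. It writes the same computation once as a for-loop and once as a while-loop. Compiling it with the dump options -fdump-tree-gimple and -fdump-tree-cfg should leave dump files next to the object file; the exact dump file names and their contents vary between GCC versions.

```c
/* loops.c: hypothetical test input, not part of GCC.
 *
 * Suggested experiment (dump file names differ between GCC versions):
 *   gcc -c -fdump-tree-gimple -fdump-tree-cfg loops.c
 */

/* Sum of 0 .. n-1, written with a for-loop. */
int sum_for(int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += i;
    return total;
}

/* The same computation, written with a while-loop. */
int sum_while(int n)
{
    int total = 0;
    int i = 0;

    while (i < n) {
        total += i;
        i++;
    }
    return total;
}
```

With the C front end, both functions would be expected to show up in the dumps as labels and conditional jumps of much the same shape, which is why a pass that runs on GIMPLE cannot count for-loops and while-loops separately.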