I've been working on a new plugin for GCC, which supports embedding Python within GCC, exposing GCC's internal data structures as Python objects and classes.
The plugin links against libpython, and (I hope) allows you to invoke arbitrary Python scripts from inside a compile. My aim is to allow people to write GCC "plugins" as Python scripts, and to make it much easier to prototype new GCC features (Python is great for doing this kind of thing).

The plugin is Free Software, licensed under the GPLv3 (or later).

The code can be seen here:
  http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=summary
and the website for the plugin is the Trac instance here:
  https://fedorahosted.org/gcc-python-plugin/

The documentation is in the "docs" subdirectory (using sphinx). You can see a pre-built HTML version of the docs here:
  http://readthedocs.org/docs/gcc-python-plugin/en/latest/index.html

It's still at the "experimental proof-of-concept" stage; expect crashes and tracebacks (I'm new to the insides of GCC, and I may have misunderstood things. I'm entirely ignoring the garbage collector, and I've also used a few entrypoints that aren't yet exposed in the plugin headers).

It's already possible to use this to add additional compiler errors/warnings, e.g. domain-specific checks, or static analysis. One of my goals for this is to "teach" GCC about the common mistakes people make when writing extensions for CPython [1], but it could be used
  - e.g. to teach GCC about GTK's reference-counting semantics,
  - to check locking in the Linux kernel
  - to check signal-safety in APIs, etc
  - rapid prototyping

Other ideas include visualizations of code structure. There are handy methods for plotting control flow graphs (using graphviz), showing the source code interleaved with GCC's internal representation, such as the one here:
  http://readthedocs.org/docs/gcc-python-plugin/en/latest/cfg.html

It could also be used to build a more general static-analysis tool.
The CPython API checker has the beginnings of this:

Example output:

  test.c: In function ‘leaky’:
  test.c:21:10: error: leak of PyObject* reference acquired at call to PyList_New at test.c:21 [-fpermissive]
    test.c:22: taking False path at     if (!list)
      test.c:24: reaching here     item = PyLong_FromLong(42);
    test.c:27: taking True path at     if (!item)
    test.c:21: returning NULL

Numerous caveats right now (e.g. how I deal with loops is really dubious). It's disabled for now within the source tree (I need to fix my selftests to pass again...)

It perhaps could be generalized to do e.g. {malloc, FILE*, fd} leaks, array bounds checking, int overflow, etc, but obviously that's a far bigger task.

So far, I'm just doing a limited form of "abstract interpretation" (or, at least, based on my understanding of that term), dealing with explicit finite prefixes of traces of execution, tracking abstract values (e.g. NULL-ptr vs non-NULL-ptr) and stopping when the trace loops (which is just an easy way to guarantee termination, not a good one, but for my use-case is good enough, I hope. Plus it ought to make it easier to generate highly-readable error messages).

Thanks to Red Hat for allowing me to devote a substantial chunk of $DAYJOB to this over the last couple of months.

I hope this will be helpful to both the GCC and Python communities.

Dave

[1] see
  http://readthedocs.org/docs/gcc-python-plugin/en/latest/cpychecker.html
and
  https://fedoraproject.org/wiki/Features/StaticAnalysisOfCPythonExtensions
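
P.S. To make the trace-based approach described above a bit more concrete, here is a minimal, self-contained sketch in plain Python. It is not the checker's actual code and it does not use the plugin's API: the CFG encoding, the `explore` function and the node numbers 28, 30 and 31 are all made up for illustration, loosely modelled on the `leaky` example. It enumerates finite trace prefixes, tracks one abstract value per pointer (NULL vs non-NULL), and abandons a trace as soon as it revisits a node.

```python
NULL, NON_NULL = "NULL", "non-NULL"

# Hypothetical toy CFG, keyed by (made-up) source line number.
# Each entry is a tuple whose first element names the operation.
CFG = {
    21: ("alloc", "list", 22),          # list = PyList_New(0): may fail
    22: ("if_null", "list", 23, 24),    # if (!list)
    23: ("return", None),               # return NULL
    24: ("alloc", "item", 27),          # item = PyLong_FromLong(42): may fail
    27: ("if_null", "item", 28, 30),    # if (!item)
    28: ("return", None),               # return NULL (list is still owned here)
    30: ("release", "item", 31),        # assume ownership of item is handed off
    31: ("return", "list"),             # return list
}


def explore(cfg, entry):
    """Walk every finite trace prefix from `entry`.

    Returns a list of (variable, trace) pairs, one for each owned
    non-NULL reference that is neither released nor returned.
    """
    leaks = []
    # Each work item is (node, abstract state, trace so far).
    stack = [(entry, {}, [])]
    while stack:
        node, state, trace = stack.pop()
        if node in trace:
            # The trace loops: give up on it. Crude, but guarantees termination.
            continue
        trace = trace + [node]
        op = cfg[node]
        kind = op[0]
        if kind == "alloc":
            # The call may fail or succeed, so fork into both abstract values.
            _, var, nxt = op
            for value in (NULL, NON_NULL):
                stack.append((nxt, {**state, var: value}, trace))
        elif kind == "if_null":
            # Only follow the branch consistent with the abstract value.
            _, var, if_null, if_non_null = op
            nxt = if_null if state[var] == NULL else if_non_null
            stack.append((nxt, state, trace))
        elif kind == "release":
            # The reference is no longer owned: stop tracking it.
            _, var, nxt = op
            remaining = {k: v for k, v in state.items() if k != var}
            stack.append((nxt, remaining, trace))
        elif kind == "return":
            _, returned = op
            for var in sorted(state):
                if state[var] == NON_NULL and var != returned:
                    leaks.append((var, trace))
    return leaks


for var, trace in explore(CFG, 21):
    print("leak of", var, "along", " -> ".join(str(n) for n in trace))
```

On this toy graph the only report is for `list`, on the path where the second allocation fails. Keeping the whole trace alongside each abstract state is what makes it easy to print a step-by-step explanation like the one in the example output, at the cost of possibly exploring many paths.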