On Thu, 2012-03-29 at 11:10 +0100, mark florisson wrote:

Thanks for CCing me; various comments inline below.

> On 29 March 2012 04:28, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> > On 03/28/2012 07:58 PM, Philip Herron wrote:
> >>
> >> Hey all
> >>
> >> I have implemented a very crude and simplistic and very badly programmed
> >> version of a pxd generator i think i understand what we're after now
> >> but i would appreciate if you look over what i did to make sure i have
> >> grasped the basic idea for now:

[...snip example sources...]

> >> We run gcc -fplugin=./python.so -fplugin-arg-python-script=walk.py test.c

FWIW, the plugin has a helper script, so that you ought to be able to
simply run:

  ./gcc-with-python walk.py test.c

(paths permitting)

My primary use-case for the plugin is my libcpychecker code which
implements static analysis of refcount-handling, and for that I have
another helper script "gcc-with-cpychecker" that invokes my code so
that you can simply run:

  ./gcc-with-cpychecker -I/usr/include/python2.7 test.c

So you might want to do something similar for the pxd generation.

[...snip sample output...]

> > Another slight complication is that you should ideally turn
> >
> > #define FOO 3
> > #define BAR 4
> >
> > into
> >
> > cdef extern from "foo.h":
> >     enum:
> >         FOO
> >         BAR
> >
> > so you need to hook in before the preprocessor and after the preprocessor
> > and dig out different stuff.
>
> David, I'm CCing you as this might be of interest to you.

Very much so - thanks! (Hi everyone!)

FWIW, I happened to see Dag's earlier email via a google search, and
added the Cython idea to the list of "Ideas for using the GCC plugin"
here:
http://gcc-python-plugin.readthedocs.org/en/latest/getting-involved.html#ideas-for-using-the-plugin

> I think the current GCC plugin support doesn't allow you to do much
> with the preprocessor, it operates entirely after the C preprocessor
> has run.

So far, yes. I haven't explored GCC's C frontend as much as I have the
stages that follow.
The C preprocessor does run in-process; I don't know yet to what extent
it's amenable to hacking via a GCC plugin. I believe that aspects of
its integration may have been rewritten somewhat in GCC 4.7 (some of my
colleagues tried to improve the line-numbering capture in the presence
of macros).

> So to support macros we have to consider that for this to
> work the gcc plugin may have to be extended, which uses C to extend
> GCC and Python, so it also requires knowledge of the CPython C API.

Yes; I'd expect you to have to go digging into the guts of the GCC C
preprocessor implementation, using GDB. I don't know yet how feasible
it is to get at the data from a plugin: it might be anywhere from
"easy" to "impossible". You might need to get a patch into GCC to
expose the necessary information (if so, that would probably be worthy
of a GSoC slot, I think).

One issue is that although GCC has an API for plugins to use to
register themselves, it doesn't yet have an official API for plugins to
use for doing anything else, so we're somewhat at the mercy of future
GCC developments (hopefully Python will make it easier to survive
future internal interface changes though).

BTW, the Python plugin's API isn't 100% frozen yet: I still reserve the
right to tweak things if appropriate (I've only done this occasionally
though, and when I do I go through all the code I know of to
double-check whether I'm about to break something).

> David, would you mind elaborating why C was used for this project and
> not (partially) Cython, and would it be possible to extend the plugin
> with Cython?

I did initially try using Cython: see early commits here:
http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=commitdiff;h=4d62721d519008c325d7369f1330dc09080c0b51
http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=commitdiff;h=9b5145955c823453404c49e4b295e8c739c5ff44
but GCC internals are just too, err, "baroque" (that's a euphemism): it
makes very heavy use of the C preprocessor (e.g.
*all* field accesses go through an access macro; there are
garbage-collection annotations throughout); many of the types are
declared by repeatedly #include-ing .def files using macro definitions
to expand the contents in a variety of ways.

> > Then what happens if you have
> >
> > #ifdef FOO
> > #define BAR 3
> > #else
> > #define BAR 4
> > #endif
> >
> > ??
> >
> > I'm not saying it is hard, but perhaps no longer completely trivial :-)

Yeah. I have no idea to what extent the C preprocessor stuff is exposed
internally, and if the branching logic is preserved in any way that's
usable.

[...snip...]

> > Does gccgo use the C ABI so that Cython could call it? If so, go for it!
> >
> > (Fortran is actually very much in use in the Cython userbase and would get a
> > lot more "customers" than Go, but if you have more of a CS background or
> > similar I can see why you wouldn't be so interested in Fortran. I didn't
> > believe people were still using Fortran either until I started doing
> > astrophysics, and suddenly it seems to be the default tool everybody uses
> > for everything.)

I downloaded Philip's script from
http://mail.python.org/pipermail/cython-devel/attachments/20120329/cdeb9453/attachment.py

It's running immediately before "free_lang_data", which is the first
interprocedural "whole-file" optimization pass, after some per-function
passes have been run. You can see a map of the passes here:
http://gcc-python-plugin.readthedocs.org/en/latest/tables-of-passes.html

[See also
http://gcc-python-plugin.readthedocs.org/en/latest/callbacks.html#gcc.PLUGIN_PASS_EXECUTION
for notes on how the sample code I showed Dag at PyCon works]

So my guess is that this code can be run for *all* languages that GCC
can handle: all of the language frontends feed in data near the top of
that map, so in theory this ought to work for Fortran, C++, Go, etc.

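For reference, a rough sketch of hooking that point in the pass
pipeline (this is not Philip's actual script). Assumptions: the pass
shows up under the name "*free_lang_data", and
gcc.get_translation_units() plus unit.block.vars yield the top-level
declarations. make_pass_callback and dump_declarations are hypothetical
names chosen for illustration.

```python
try:
    import gcc  # provided by the plugin; only importable inside GCC
except ImportError:
    gcc = None  # lets the filtering logic below be exercised outside GCC

# Assumption: the pass is reported under this name (with a leading '*').
TARGET_PASS = "*free_lang_data"


def make_pass_callback(handler, target=TARGET_PASS):
    """Return a callback that calls handler() only when `target` runs."""
    def on_pass_execution(optpass, fun, *extra, **kwargs):
        # PLUGIN_PASS_EXECUTION fires for every pass; filter by name.
        if optpass.name == target:
            handler()
    return on_pass_execution


def dump_declarations():
    """Hypothetical handler: print each top-level decl and its file."""
    for unit in gcc.get_translation_units():
        for decl in unit.block.vars:
            # Fall back to a placeholder if a decl has no location.
            filename = getattr(decl.location, "file", "<unknown>")
            print("%s: %s" % (filename, decl.name))


if gcc is not None:
    gcc.register_callback(gcc.PLUGIN_PASS_EXECUTION,
                          make_pass_callback(dump_declarations))
```

Keeping the name filter separate from the handler means the filter can
be tried out in a plain Python interpreter, without GCC in the loop.
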
Having said that, I've been trying to get my libcpychecker code running
on C++ and I keep running into subtle differences in the exact data the
frontends generate: e.g. the C++ frontend seems to add Nop statements
for empty functions, whereas the C frontend doesn't; type declarations
get hidden inside namespace objects in the C++ frontend; etc.

BTW, some stylistic nits on Philip's script:

  * don't match types based on strings: cf.:
        if T == "<type 'gcc.FunctionDecl'>":
    instead, use isinstance:
        if isinstance(decl.type, gcc.FunctionDecl):
    so that you're not relying on repr() or str(), and so you match
    subclasses, not just one class

  * "decl_location_get_file (decl)" jumps through lots of hoops to get
    at the filename of a decl.location by parsing the repr(). But you
    can simply look at the decl.location.file attribute:
    http://gcc-python-plugin.readthedocs.org/en/latest/basics.html#gcc.Location.file

  * similar considerations apply to decl_identifier_node_to_string();
    have a look at the dir() of the object (and if something is not
    documented, file a bug, or a patch!).

Hope this is helpful; good luck!
Dave

_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel