Re: Defining a common plugin machinery
Sorry, I think this bounced twice.

Hugh Leather wrote:

Hi All,

Thanks, Grigori, for mentioning my plugin system, libplugin, which can be found at http://libplugin.sourceforge.net/. I have been meaning to release it, but finding the time to finish off the documentation and upload all the newest code to SourceForge has been difficult (both code and docs on SourceForge are some months out of date).

The plugin system was built to support MilePost goals for GCC, as we need to be able to capture events during compilation as well as change compilation behaviour.

Here are some of the features of the system:

*Application agnostic.*
The plugin system is not GCC specific but can be used to make any application plugin aware. Plugin management is handled through a shared library which GCC, or any other application, can link to.

I think if GCC took a similar approach then it would benefit from the exposure the system received elsewhere, and the wider community would also have access to a professionally built plugin system. As the plugin system becomes more powerful, GCC reaps the rewards without having to change a line of code.

The other, huge advantage of this, together with the design I'll describe below, is that GCC only has ~10 lines of plugin code, to initialise the library. The rest is working out how to refactor GCC to make it extensible. This way GCC won't be cluttered with nasty plugin stuff that obscures the meaning of the code.

Finally, plugins for different applications can coexist. For example, we might have some plugins for the linker, some for the driver, some for the compiler and some that work in any of those.

*Eclipse Inspired.*
I've taken inspiration from the excellent plugin system for the Eclipse IDE, http://www.eclipse.org. It has proved very successful, so it seemed like a good starting point.
*What it is.*

* Each plugin has an XML file describing it
* Plugins have ExtensionPoints that other plugins can extend
* Plugins can have shared libraries
* Requires libxml2, libffi, libltdl

An ExtensionPoint is one of the fundamental parts of the system. It provides the links between plugins. Each ExtensionPoint is really just an object with one method:

    bool extend( ExtensionPoint* self, Plugin* extendingPlugin, xmlNodePtr specification )

This method tells the ExtensionPoint that some other plugin is extending it and gives it the XML that plugin uses. The ExtensionPoint can do whatever it likes with that XML. It might contain symbols pointing to functions it should use, or it might be markup for logging text. It could be a list of unroll factors, one for each function, or a list of passes to apply to a particular function. You can describe anything in XML.

*An Example.*
Maybe that's a bit confusing, so here's an example. Suppose we have a plugin which offers a logging or message service. It would have a plugin specification in XML like this:

That says it's a plugin for GCC, it has id "simple-message", and it uses a certain shared library. It also says it has an extension point called "simple-message.print" and gives the function to call when anyone extends that extension point. This function is called "simpleMessage_extend" and is in the shared library the plugin specified. It looks like this:

    bool simpleMessage_extend( ExtensionPoint* self, Plugin* extendingPlugin, xmlNodePtr specification )
    {
        printf( "%s\n", xmlNodeGetContent( specification ));
        return TRUE;
    }

It simply prints the text content of any plugin that extends it. Another plugin might come along later and have this as its specification:

Hello,World!

Hopefully that little guy is clear: it prints "Hello, World!"

Now the plugin system has taken care of all the dependency management, only required plugins are loaded, etc.
All the appropriate extension points are created (only those used) and extensions are applied. There's also a plugin lifecycle allowing things to happen at appropriate times. We could have had our plugins exchange code, remember things until later, do anything in fact. If you can describe it in XML then the plugin system lets you do it.

*Ease of Use with Events and JoinPoints*
Such simple extension points provide all the power you ever need, but not the ease of use. So, the system also lets you do common things with almost no code in GCC, just a slight refactoring and a tiny description in XML. The most common things people want from a plugin system are to be able to listen to events or to replace behaviours. I'll show you how events are added here. Suppose GCC (or another plug
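
As a rough illustration of the extension-point idea described in this message, here is a minimal sketch in C. It is not libplugin's code: the XML node is replaced by a plain string so that the sketch needs no libxml2, and every name in it (`SketchPlugin`, `SketchExtensionPoint`, `sketch_message_extend`, `sketch_apply_extension`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; libplugin's real types are not shown in the post. */
typedef struct SketchPlugin {
    const char *id;
} SketchPlugin;

typedef struct SketchExtensionPoint SketchExtensionPoint;
struct SketchExtensionPoint {
    const char *id;
    /* The single method: called once for each extension applied to this point. */
    bool (*extend)(SketchExtensionPoint *self,
                   SketchPlugin *extendingPlugin,
                   const char *specification);
    int timesExtended;      /* bookkeeping for this sketch only */
    char lastMessage[64];   /* copy of the last specification seen */
};

/* Plays the role of simpleMessage_extend: print what the extender supplied. */
static bool sketch_message_extend(SketchExtensionPoint *self,
                                  SketchPlugin *extendingPlugin,
                                  const char *specification)
{
    printf("[%s] %s\n", extendingPlugin->id, specification);
    snprintf(self->lastMessage, sizeof self->lastMessage, "%s", specification);
    self->timesExtended++;
    return true;
}

/* What a plugin manager might do when it finds an extension for a point. */
static bool sketch_apply_extension(SketchExtensionPoint *point,
                                   SketchPlugin *extender,
                                   const char *specification)
{
    assert(point != NULL && point->extend != NULL);
    return point->extend(point, extender, specification);
}
```

A caller would build a `SketchExtensionPoint` whose `extend` member is `sketch_message_extend` and pass it, an extending plugin and the text "Hello, World!" to `sketch_apply_extension`. The point owner alone decides what the specification means, which is the property the message emphasises.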
Re: Defining a common plugin machinery
Aye up all,

I've now been reading through some of the list archive. Some of the posts were about how to tell GCC which plugins to load. I thought I'd tell you how libplugin does it.

First there is a plugin path. This tells the system where plugin XML specifications can be found (each plugin needs one specification file). The path can be set by environment variable and/or command line argument. The application can also add to the plugin path itself, so GCC could include plugins from its own installation directory (although it doesn't at the moment).

The plugin system will look at every specification file in the plugin path. Those files must be well formed XML with processing directives saying that the plugin is for the current application. A directive can be for only GCC 4.3.1, or for any GCC 4.X.X, or for any application; a few of the provided plugins work with all applications, like logging support.

For the patch I have for GCC, only the compiler is plugin aware, not the driver, linker, etc. We could easily have processing directives for each so that you could define a plugin that worked on any set of the compiler applications. A plugin with both the processing directives below would work on both the linker and the driver, but not anything else:

Plugins can be marked as lazy or eager (eager by default). Lazy plugins aren't loaded unless explicitly asked for or unless needed by another plugin. This allows users to set up (or the compiler to set up) default plugins. You could, for example, remove all passes from GCC and distribute them as plugins with a small number being required (not that you should).

The user can specify a list of plugins with an environment variable (GCC_PLUGINS=id,id) and/or a command line argument (-plugins id,id). These are just comma separated lists of plugin ids (actually they can be glob patterns, too). Every plugin with such a matching id is marked as eager if it isn't already. The system fails if a requested plugin can't be found.
So, we start loading all eager plugins. This means setting up their extension points, loading their libraries, executing lifecycle methods, etc. If any loaded plugin needs a lazy plugin, that lazy plugin is marked eager and will also be loaded.

Plugins 'need' each other by either:

* Having an explicit 'requires' element in their specification.
* Extending an extension point from the other plugin. This can be either by 'extension' elements in the specifications or by code (for example in lifecycle methods or, well, pretty much anything).

This means that you don't need to know which plugins provide the extension points you want; the system just handles it for you. This also means that you can have one plugin which loads up lots of other plugins. If we had all non-essential passes as plugins, for example, one plugin called "O3" could load up a certain number of them and set parameters - I'm not suggesting we do that, though :-)

Finally, you can specify arguments to plugins. This can be via command line (-plugin-var id=value;id=value) and/or environment variable (GCC_PLUGIN_VAR=id=value;id=value). Plugin XML specifications can directly use these arguments, specify their own, use expansion over them, etc. Plugin XML files can also use other sources of variables, such as any environment variable, with variable names like "env.PATH". Plugin shared libraries also have an API to access these arguments. The variables also have an escaping syntax so that characters like '=' and ';' can be represented.

Cheers,

Hugh.
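
The eager/lazy loading rule described in this message can be sketched in C as follows. This is only an illustration under stated assumptions: the types and functions (`SketchPluginSpec`, `sketch_find`, `sketch_resolve_eager`) are hypothetical, dependencies are given as explicit id lists rather than being discovered from XML, and nothing here is taken from libplugin's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_REQUIRES 4

/* Hypothetical, simplified view of one plugin specification. */
typedef struct SketchPluginSpec {
    const char *id;
    bool eager;                                     /* false means lazy */
    const char *requires_ids[SKETCH_MAX_REQUIRES];  /* unused entries are NULL */
} SketchPluginSpec;

static SketchPluginSpec *sketch_find(SketchPluginSpec *specs, size_t n,
                                     const char *id)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(specs[i].id, id) == 0)
            return &specs[i];
    return NULL;
}

/* Promote every lazy plugin that an eager plugin needs, repeating until
 * nothing changes.  Returns the number of plugins that would be loaded,
 * or -1 if a required plugin cannot be found. */
static int sketch_resolve_eager(SketchPluginSpec *specs, size_t n)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            if (!specs[i].eager)
                continue;
            for (size_t r = 0;
                 r < SKETCH_MAX_REQUIRES && specs[i].requires_ids[r] != NULL;
                 r++) {
                SketchPluginSpec *dep =
                    sketch_find(specs, n, specs[i].requires_ids[r]);
                if (dep == NULL)
                    return -1;          /* requested plugin is missing */
                if (!dep->eager) {
                    dep->eager = true;  /* a lazy plugin becomes eager */
                    changed = true;
                }
            }
        }
    }
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (specs[i].eager)
            count++;
    return count;
}
```

With an eager "O3" plugin that requires two lazy pass plugins, one of which requires a third, all four end up eager while an unrelated lazy plugin stays unloaded.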
Re: Defining a common plugin machinery
' gate and execute function. This allows you to turn passes on or off, find out what happened or to completely change the behaviour of a pass. BTW, the pass manager also creates names for those passes which don't already have them.

There are also join points around execute_one_pass and execute_one_ipa_transform_pass (I'm still on 4.3.1). These allow you to find out what happened to each pass, rather than having to listen to the events of individual passes. You can also change the way those functions work.

*Next gcc-pass-manager*
Also allows you to add passes. First, you can just add to the managed passes without putting a pass into the compilation order. Or you can add one after or before another pass. At the moment this happens only to the first occurrence of the other pass. This is one thing I don't like. You can also remove passes - again I'm not happy with this yet.

The above control the default pass ordering. You can also set up particular pass orders for certain functions. I'm still not happy with it and it doesn't do IPA passes (though I think I can handle that).

... ...

One of the things I'm not happy with is due to the ability to have multiple copies of a pass in the pass tree. The other is the tree flattening I do for extension point gcc-pass-manager.set-pass-order. I need to think about it for a while.

Note that the above XML format is for convenience. You could write your own code and replace how passes are done completely if you want.

*Licensing*
I don't know anything about licensing, but we could do something similar to the approach that Joern suggested. We could only load plugins that included the GPL or other approved OSS licence at the top of the file. The plugin would then declare that it and everything it used was good. I don't think people could avoid that declaration. Maybe I'm wrong.

What do you all think? Is this interesting?

Cheers,

Hugh.
Basile STARYNKEVITCH wrote:

Hugh Leather wrote:

Aye up all, I've now been reading through some of the list archive. Some of the posts were about how to tell GCC which plugins to load. I thought I'd tell you how libplugin does it.

Thanks for the nice explanation. I'm not sure I understand exactly how libplugin deals with adding passes; apparently, the entire pass manager (i.e. gcc/passes.c) has been rewritten or enhanced. Also, I did not understand the exact conceptual differences between libplugin & other proposals. Apparently libplugin is much more ambitious.

So we now have many plugin proposals & experiments. However, we do know that there are some legal/political/license issues on these points (with the GCC community rightly wanting as hard as possible to avoid proprietary plugins), that some interaction seems to happen (notably between Steering Committee & FSF), and that the work is going slowly (because of lack of resource & labor & funding? at FSF).

My perception is that the issues are not mostly technical, but still political (and probably, as Ian Taylor mentioned in http://gcc.gnu.org/ml/gcc/2008-09/msg00442.html, a lack of lawyers or other human resources at FSF, which cost much more than any reasonable person could afford individually).

I actually might not understand why exactly plugins are not permitted by the current GCC licenses. What I don't understand is

* what exactly do we call a plugin? I feel (but I am not a lawyer) that (on Linux) it is any *.so file which is fed to dlopen. I'm not able to point out which parts of the GCC license prohibit that (I actually hope that nothing prohibits it right now, if the *.so is compiled from GPLv3-ed FSF copyrighted code; the MELT branch is doing exactly that right now).
* will the runtime license be working for Christmas 2008? [some messages made me think not, as it is too much lawyer work; other messages made me a bit more optimistic; I really am confused].
Of course, I don't want any hard date, but I am in the absolute darkness on the actual work already done on improving the runtime license, and even more on what needs to be fixed. Also, I have no idea of the work involved in writing new licenses (I only know that the GPLv3 effort lasted much more than one year). Did I say that I am not a lawyer, and not understanding even the basic principles of US laws (or perhaps even Fre
Re: Defining a common plugin machinery
Hi Brendon,

Thanks for reading all this :-) Comments in line.

Brendon Costa wrote:

I have notes inline below, following is my summary of libplugin from what i understand of your posts:

* It exists as a framework that works with GCC now
* It uses xml files to define plugins (Allows making new plugins as combinations of others without making a new shared library, i.e. just create an xml file that describes the plugin)
* It handles issues with inter-dependencies between plugins
* It uses a "push" framework, where function pointers are replaced/chained in the original application rather than explicit calls to plugins (Provides more extensibility in an application that makes heavy use of function pointers, but produces a less explicit set of entry points or hooks for plugins)
* Currently it provides automatic loading of plugins without specific user request
* It already has a framework for allowing plugins to interact with the pass manager

If you can think of any other points to summarize the features it might be helpful as you are closer to it.

The issues i see with this framework:

* it seems to provide a lot of features that we may not necessarily need (That should be up for discussion)

Yes, it's entirely possible that it has more features than most people will need. Some have been taken from Eclipse, since they've spent some years fleshing out the problems with a major plugin system; some come from AOP. If we only ever see there being <10 entry points (i.e. not very fine grained) and not having cooperating plugins, then libplugin is definitely overkill.

* plugin entry points are not well defined but can be "any function pointer call"

Achh, hoist by my own petard! Part of my design goal was to make the changes almost invisible in GCC. I thought it might make it more acceptable to people. The difference between plugin-GCC and normal GCC would be 10 lines in toplev.c.

Two solutions come to mind. One would be to provide an empty define for documentation purposes.
Something like #define EXTENSION? The other is to say, well, entry points are pretty well defined in the plugin XMLs. Again, this is the approach taken by Eclipse and seems to work well. Actually, I have another which I describe later when I talk about enforcing application plugins.

It is also possible to have entry points that aren't events, join-points or lists; those are just the ones that the system makes easy for you.

Some questions:

* How does the framework interact with the compile command line arguments?

I wrote a bit about that in a previous email, http://gcc.gnu.org/ml/gcc/2008-10/msg00011.html. I didn't put all the details in. For example, at the moment, the plugin system gets to see the command line before any other part of GCC. This allows plugins to alter the command line themselves. Secondly, though I was thinking of making this optional, the plugin system removes all plugin command line options from the command line before GCC gets to see it.

* Does this work on platforms that don't support -rdynamic or can it be modified to do so in the future?

It uses ltdl. I think that can be made to statically load DLLs on those systems. You'd then have all the capabilities your base libraries provide + anything you can describe in XML only. I have no idea how practical this would be. Personally, I think plugins on those systems might be more trouble than they're worth. I've never used a compiler on one though (only compiled for them on my PC), so I don't really know much about it.

I've only been developing on Linux. I don't know how much work it would be to port to other machines.

Oh, I also use libffi for events and join-points. The system would certainly work without events and join-points, it just wouldn't be as fun. If you leave it in then it limits libplugin to systems with libffi.

Hugh Leather wrote:

*Separating Plugin system from application*
Libplugin ships as a library.
Apart from a few lines of code in toplev.c, the only other changes to GCC will be refactorings and maybe calling a few functions through pointers. As i understand the difference between the pull vs push, a plugin will load, and then modify existing function pointers in GCC to insert its own code and chain the existing code to be called after it. Is this correct? Yes, although really the plugin system does it on the plugin's behalf. We also have a 'non-enforced' distinction between plugins. There are those which use symbols in the application, making application events and join-points and those which provide their own shared libraries (some do both). So, some plugins make bits of the application extensible, some provide additional services, or both. It would be relatively easy to enforce that distinction. We could insist that no plugin is able to make bits of the a
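
The "push" model discussed in this exchange, where a call made through a function pointer is redirected and the previous target is chained, might look roughly like the C sketch below. All names (`sketch_unroll_factor`, `sketch_install_plugin` and so on) are hypothetical, the heuristic is invented for illustration, and in libplugin the installation step would be performed by the plugin system on the plugin's behalf rather than by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Application side (hypothetical): a heuristic reached through a pointer,
 * so that it can be replaced without touching its callers. */
static int sketch_default_unroll_factor(int loop_size)
{
    return loop_size < 16 ? 4 : 1;
}

int (*sketch_unroll_factor)(int loop_size) = sketch_default_unroll_factor;

/* Plugin side: remember the previous implementation and chain to it. */
static int (*sketch_previous_unroll_factor)(int loop_size) = NULL;

static int sketch_plugin_unroll_factor(int loop_size)
{
    if (loop_size == 8)
        return 8;                                   /* override one case */
    return sketch_previous_unroll_factor(loop_size); /* defer otherwise */
}

/* Swap the pointer; done by hand here, by the plugin system in the post. */
static void sketch_install_plugin(void)
{
    assert(sketch_previous_unroll_factor == NULL);  /* install only once */
    sketch_previous_unroll_factor = sketch_unroll_factor;
    sketch_unroll_factor = sketch_plugin_unroll_factor;
}
```

Callers keep writing `sketch_unroll_factor(n)` before and after installation, which is why the application needs so little plugin-specific code; the cost, as Brendon notes, is that the set of entry points is less explicit.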
Re: Defining a common plugin machinery
Aye up all,

I think the env var solution is easier for people to use and immediately understand. There would be nothing to stop those people who don't like env vars from using the shell wrapper approach. Why not allow both?

Are you sure about this style of event/callback mechanism? It seems to scale poorly. Isn't it likely to be a bit inefficient, too? Through this approach, plugins can't cooperate; they can't easily define their own events and it feels too early to prevent that.

It looks like it's trying to replicate the abstraction of calling a function with something worse. What I mean is that I think what people would like to write is:

GCC side:

    fireMyEvent( 10, "hello" );

Plugin side:

    void handleMyEvent( int n, const char* msg )
    {
        // use args
    }

But now they have to write:

GCC side:

    // Add to enum
    enum plugin_event {
        PLUGIN_MY_EVENT
    };

    // Create struct for args
    typedef struct my_event_args {
        int n;
        const char* msg;
    } my_event_args;

    // Call it:
    my_event_args args = { 10, "hello" };
    plugin_callback( PLUGIN_MY_EVENT, &args );

Plugin side:

    void handleMyEvent( enum plugin_event event, void* data, void* registration_data )
    {
        if( event == PLUGIN_MY_EVENT ) {
            my_event_args* args = ( my_event_args* )data;
            // Use args
        }
    }

Which seems a bit ugly to me. Although, it does have the advantage of being easy to implement on the GCC side.

And, if they're replacing a heuristic and need a return value then even more lines of code are needed on both sides. How would this style work for replacing heuristics?

Cheers,

Hugh.

Grigori Fursin wrote:

Personally I'm against the env var idea as it would make it harder to figure out what's going on. I think someone mentioned that the same effect could be achieved using spec files.

Ian mentioned the idea of creating small wrapper scripts with the names: gcc/g++ etc which just call the real gcc/g++... adding the necessary command line args. These can then just be put earlier in the search path.
I currently use the env var method in my project, but I think the wrapper script idea is a bit nicer than using env vars personally, so i will likely change to that soon. That's right. It's a nicer solution. We just already have environment variables in our ICI implementation, but it can be useful if we will one day switch to the common plugin system without support for env variables ... Cheers, Grigori Brendon.
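
On the question raised above about replacing heuristics with the enum-plus-`void*` style, one possible answer is to carry the return value in the argument struct. The C sketch below only illustrates that idea; the registry, the event name and every `sketch_` identifier are hypothetical and are not taken from any of the proposed GCC patches.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_plugin_event { SKETCH_EVENT_UNROLL_FACTOR, SKETCH_EVENT_COUNT };

typedef void (*sketch_callback)(enum sketch_plugin_event event,
                                void *data, void *registration_data);

/* One handler per event keeps the sketch short. */
static struct {
    sketch_callback fn;
    void *registration_data;
} sketch_handlers[SKETCH_EVENT_COUNT];

static void sketch_register_callback(enum sketch_plugin_event event,
                                     sketch_callback fn,
                                     void *registration_data)
{
    assert(event < SKETCH_EVENT_COUNT);
    sketch_handlers[event].fn = fn;
    sketch_handlers[event].registration_data = registration_data;
}

static void sketch_plugin_callback(enum sketch_plugin_event event, void *data)
{
    if (sketch_handlers[event].fn != NULL)
        sketch_handlers[event].fn(event, data,
                                  sketch_handlers[event].registration_data);
}

/* In/out argument struct: the "return value" travels back in `factor`. */
typedef struct sketch_unroll_args {
    int loop_size;   /* input */
    int factor;      /* output, valid only if handled != 0 */
    int handled;     /* set by a plugin that replaces the heuristic */
} sketch_unroll_args;

/* Compiler side: ask plugins first, fall back to the built-in answer. */
static int sketch_choose_unroll_factor(int loop_size)
{
    sketch_unroll_args args = { loop_size, 0, 0 };
    sketch_plugin_callback(SKETCH_EVENT_UNROLL_FACTOR, &args);
    return args.handled ? args.factor : 2;
}

/* Plugin side: force the factor passed in at registration time. */
static void sketch_handle_event(enum sketch_plugin_event event,
                                void *data, void *registration_data)
{
    if (event == SKETCH_EVENT_UNROLL_FACTOR) {
        sketch_unroll_args *args = data;
        args->factor = *(int *)registration_data;
        args->handled = 1;
    }
}
```

This works, but it shows the extra lines the message complains about: an output field, a "handled" flag and a cast on the plugin side, none of which the compiler can type-check.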
Re: Defining a common plugin machinery
Aye up,

I like the idea of signals. The chaining code can be automatically generated by FFI - the code to do it is pretty trivial.

Also, if instead of having a single struct with all the signals in, you could have a hashtable of signals referenced by a string id, then plugins could define their own and be able to cooperate.

Cheers,

Hugh.

Brendon Costa wrote:

Sounds like you're almost in need of a generic data marshalling interface here.

Why do we need the complication of data marshaling? I don't see why we need to define that all plugin hooks have the same function interface as currently proposed, i.e. a single void*. This makes a lot of work marshaling data both as parameters and from return values. This is already done for us by the language (though i may have misunderstood the train of thought here).

I will propose the start of a new idea. This needs to be fleshed out a lot but it would be good to get some feedback. I will use the following terminology borrowed from Qt:

signal: Is a uniquely identified "hook" to which zero or more slots are added. (I.e. Caller)
slot: Is a function implementation say in a plugin. This is added to a linked list for the specified signal. (I.e. Callee)

The main concepts in this plugin hook definition are:

* Signals can define any type of function pointer so can return values and accept any parameters without special data marshaling
* Each signal is uniquely identified as a member variable in a struct called Hooks
* A signal is implemented as a linked list where each node has a reference to a slot that has been connected to the signal
* A slot is a function pointer and a unique string identifier

This differs a bit from the Qt definition but i find it helpful to describe the entities.

Important things to note: Multiple plugins are "chained" one after the other. I.e. It is the responsibility of the plugin author to call any plugins that follow it in the list.
This gives the plugin authors a bit more control over how their plugins inter-operate with other plugins, however it would be STRONGLY recommended that they follow a standard procedure and just call the next plugin after they have done their work.

Basically, the idea is to provide the following structure and then most of the work will involve manipulation of the linked lists. I.e. Querying existing items in the LL, inserting new items before/after existing items, removing items from the LL.

This is not a proposed end product. It is just to propose an idea. There are a few disadvantages with the way it is implemented right now:

* Too much boilerplate code for each signal definition
* The idea of chaining calls means the responsibility of calling the next plugin ends up with the plugin developer which could be bad if a plugin developer does not take due care, however it also provides them with more flexibility (not sure if that is necessary).

Now, i have NO experience with the current pass manager in GCC, but would the passes be able to be managed using this same framework assuming that each pass is given a unique identifier?

Thanks,

Brendon.

    #include <stddef.h>
    #include <stdio.h>

    /* GCC : Code */
    struct Hooks {
        /* Define the blah signal. */
        struct BlahFPWrap {
            const char* name;
            int (*fp)(struct BlahFPWrap* self, int i, char c, float f);
            void* data;
            struct BlahFPWrap* next;
            struct BlahFPWrap* prev;
        }* blah;

        /* Define the foo signal. */
        struct FooFPWrap {
            const char* name;
            void (*fp)(struct FooFPWrap* self);
            void* data;
            struct FooFPWrap* next;
            struct FooFPWrap* prev;
        }* foo;
    };

    /* Initialised by main */
    struct Hooks hooks;

    void SomeFunc1(void)
    {
        /* Call plugin hook: blah */
        int result = (!hooks.blah ? 0 : hooks.blah->fp(hooks.blah, 3, 'c', 2.0f));
        /* ... do stuff with result ... */
        (void)result;
    }

    void SomeFunc2(void)
    {
        /* Call plugin hook: foo */
        if (hooks.foo) hooks.foo->fp(hooks.foo);
    }

    void PlgInit(struct Hooks* h);

    int main()
    {
        hooks.blah = NULL;
        hooks.foo = NULL;
        PlgInit(&hooks);
        return 0;
    }

    /* Writeme... (both macros are still empty placeholders) */
    #define PLUGIN_INSERT_BEFORE(Hooks, Struct, Hook, FuncPtr, Before, SlotName)
    #define PLUGIN_REMOVE(Hooks, SlotName)

    /* In plugin */
    #define PLUGIN_NAME "myplg"

    static void MyFoo(struct FooFPWrap* self)
    {
        printf("MyFoo\n");
        if (self->next) self->next->fp(self->next);
    }

    /* Returns int to match the fp member of struct BlahFPWrap. */
    static int MyBlah(struct BlahFPWrap* self, int i, char c, float f)
    {
        printf("MyBlah\n");
        return self->next ? self->next->fp(self->next, i, c, f) : 0;
    }

    void PlgInit(struct Hooks* h)
    {
        PLUGIN_INSERT_BEFORE(h, struct BlahFPWrap, blah, &MyBlah, NULL, PLUGIN_NAME "_MyBlah");
        PLUGIN_INSERT_BEFORE(h, struct FooFPWrap, foo, &MyFoo, NULL, PLUGIN_NAME "_MyFoo");
    }

    void PlgShut(struct Hooks* h)
    {
        PLUGIN_REMOVE(h, PLUGIN_NAME "_MyBlah");
        PLUGIN_REMOVE(h, PLUGIN_NAME "_MyFoo");
    }
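
Hugh's suggestion at the top of this message, signals looked up by string id so that plugins can define their own, could be sketched in C as below. This is an assumption-laden illustration: a small linear table stands in for the hashtable, all slots share one function type (the FFI-generated chaining mentioned in the message is not attempted), and every `Sketch`/`sketch_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A slot: a named function plus a link to the next slot in the chain. */
typedef struct SketchSlot SketchSlot;
struct SketchSlot {
    const char *name;
    int (*fp)(SketchSlot *self, int value);
    SketchSlot *next;
};

/* A signal: a string id and the head of its slot list. */
typedef struct SketchSignal {
    const char *id;
    SketchSlot *head;
} SketchSignal;

#define SKETCH_MAX_SIGNALS 8
static SketchSignal sketch_signals[SKETCH_MAX_SIGNALS];
static size_t sketch_signal_count;

/* Find a signal by id, creating it on first use.  A linear table stands
 * in for the hashtable suggested in the message. */
static SketchSignal *sketch_signal(const char *id)
{
    for (size_t i = 0; i < sketch_signal_count; i++)
        if (strcmp(sketch_signals[i].id, id) == 0)
            return &sketch_signals[i];
    assert(sketch_signal_count < SKETCH_MAX_SIGNALS);
    sketch_signals[sketch_signal_count].id = id;
    sketch_signals[sketch_signal_count].head = NULL;
    return &sketch_signals[sketch_signal_count++];
}

/* Connect a slot at the front of the chain for the given signal. */
static void sketch_connect_front(const char *id, SketchSlot *slot)
{
    SketchSignal *sig = sketch_signal(id);
    slot->next = sig->head;
    sig->head = slot;
}

/* Emit: call the first slot, which is responsible for chaining onwards. */
static int sketch_emit(const char *id, int value, int fallback)
{
    SketchSignal *sig = sketch_signal(id);
    return sig->head ? sig->head->fp(sig->head, value) : fallback;
}

/* Two example slots that do their work and then call the next slot. */
static int sketch_add_one(SketchSlot *self, int value)
{
    value += 1;
    return self->next ? self->next->fp(self->next, value) : value;
}

static int sketch_double(SketchSlot *self, int value)
{
    value *= 2;
    return self->next ? self->next->fp(self->next, value) : value;
}
```

Because the signal table is keyed by strings, a plugin can emit or connect to an id such as "myplg.cost" that the application never declared. The trade-off against Brendon's struct of typed members is that the compiler no longer checks each signal's signature.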
How to insert functions?
Hi,

I am trying to add a new destructor function to object files I compile. I'm doing this to instrument programs and then, once the program has finished I want to print out the statistics I've gathered.

So, just before pass 'remove_useless_stmts' is called on each function I try to create a static destructor which will print stats for the current function (and obviously, I don't do it on the functions just created).

Below I've put a simpler form, which should work for a one function program. Calling createFnDecl() creates a new function called __my_new_function. It should print "It worked!" when the program exits.

Now, with compiler flag -O0, this works just fine, but with -O1 or above it seg faults the compiler. Can anyone tell me what I'm doing wrong?

Cheers,

Hugh.

    static tree createFnBody()
    {
        tree arg;
        tree argList;
        tree putsFn;
        tree call;
        tree bind;
        char msg[] = "It worked!";

        bind = build3( BIND_EXPR, void_type_node, NULL_TREE, NULL_TREE, NULL_TREE );
        TREE_SIDE_EFFECTS( bind ) = 1;

        putsFn = implicit_built_in_decls[ BUILT_IN_PUTS ];
        arg = build_string_literal( sizeof( msg ) + 1, msg );
        argList = tree_cons( NULL_TREE, arg, NULL_TREE );
        call = build_function_call_expr( putsFn, argList );

        append_to_statement_list( call, &BIND_EXPR_BODY( bind ));
        return bind;
    }

    static tree createFnDecl()
    {
        tree callTypes;
        tree fnName;
        tree fnDecl;
        tree t;
        struct function* oldCfun = cfun;

        fnName = get_identifier( "__my_new_function" );
        callTypes = build_function_type_list( void_type_node, NULL_TREE );
        fnDecl = build_decl( FUNCTION_DECL, fnName, callTypes );
        fnDecl = lang_hooks.decls.pushdecl( fnDecl );

        TREE_STATIC( fnDecl ) = 1;
        TREE_USED( fnDecl ) = 1;
        DECL_ARTIFICIAL( fnDecl ) = 1;
        DECL_IGNORED_P( fnDecl ) = 0;
        TREE_PUBLIC( fnDecl ) = 0;
        DECL_UNINLINABLE( fnDecl ) = 1;
        DECL_EXTERNAL( fnDecl ) = 0;
        DECL_CONTEXT( fnDecl ) = NULL_TREE;
        DECL_INITIAL( fnDecl ) = make_node ( BLOCK );
        DECL_STATIC_DESTRUCTOR( fnDecl ) = 1;
        DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT( fnDecl ) = 1;

        t = build_decl( RESULT_DECL, NULL_TREE, void_type_node );
        DECL_ARTIFICIAL( t ) = 1;
        DECL_IGNORED_P( t ) = 1;
        DECL_RESULT( fnDecl ) = t;

        DECL_SAVED_TREE( fnDecl ) = createFnBody();

        allocate_struct_function( fnDecl );
        cgraph_add_new_function( fnDecl );

        cfun = oldCfun;
        return fnDecl;
    }
Re: How to insert functions?
Hi Andrew,

Yes, I did a bit. It segfaults at cgraphunit.c:cgraph_expand_all_functions:1323

    node->lowered = DECL_STRUCT_FUNCTION (node->decl)->cfg != NULL;

It seems that the node->decl has been nulled by the time it gets here. It definitely isn't NULL after leaving my code. Can you think of anything that might do that?

I figured I must be doing something pretty wrong and that there must be tons of examples for this kind of thing, but I haven't been able to find one.

Cheers for the help,

Hugh.

Andrew Haley wrote:

Hugh Leather wrote:

Hi, I am trying to add a new destructor function to object files I compile. I'm doing this to instrument programs and then, once the program has finished I want to print out the statistics I've gathered.

So, just before pass 'remove_useless_stmts' is called on each function I try to create a static destructor which will print stats for the current function (and obviously, I don't do it on the functions just created).

Below I've put a simpler form, which should work for a one function program. Calling createFnDecl() creates a new function called __my_new_function. It should print "It worked!" when the program exits.

Now, with compiler flag -O0, this works just fine, but with -O1 or above it seg faults the compiler. Can anyone tell me what I'm doing wrong?

Nothing obvious to me. Did you debug the point at which the segfault happened?

Andrew.