Re: Calling convention for Intel APX extension
On 27.07.23 at 15:43, Michael Matz wrote:

> I've recently submitted a patch that adds some attributes that basically
> say "these-and-those regs aren't clobbered by this function" (I did them
> for not clobbered xmm8-15). Something similar could be used for the new
> GPRs as well. Then it would be a matter of ensuring that the interesting
> functions are marked with that attributes (and then of course do the
> necessary call-save/restore).

Interesting.

Taking this a bit further: The compiler knows which registers it used
(and which ones might get clobbered by called functions) and could
generate such information automatically and embed it in the assembly
file, and the assembler could, in turn, put it into the object file.

A linker (or LTO) could then check this and elide save/restore pairs
where they are not needed.

Now, I know that removing instructions during linking is a dangerous
business, and is a source of hard-to-find and rare bugs (the worst kind)
if not done right; a bullet-proof algorithm would be needed for that.

It would probably be impossible for calls into shared libraries, since
the saved registers might change from version to version. It also would
probably not work for virtual member functions which are not found by
devirtualization.

Still, potential gains could be substantial, and it could have an effect
which could come close to inlining, while actually saving space instead
of using extra.

Comments?
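As a rough illustration of the bookkeeping this would need (not an existing GCC or binutils mechanism): if each function carried a summary of the registers it may clobber, the set a caller has to save around a call is the intersection of what is live across the call with what the callee may overwrite. The sketch below assumes a hypothetical one-bit-per-register mask and a hypothetical `ClobberSummary` record. The register numbering and the caller-saved set are inputs, not claims about the actual APX ABI. An unresolved callee (shared library, virtual call that was not devirtualized) falls back to the full caller-saved set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: one bit per general-purpose register, bit N = register N.
using RegMask = std::uint32_t;

inline RegMask reg_bit(unsigned reg_no) {
  assert(reg_no < 32);  // the mask only has room for 32 registers
  return RegMask{1} << reg_no;
}

// Hypothetical per-function summary, as a compiler might record it in the
// object file for the linker (or LTO) to consult.
struct ClobberSummary {
  RegMask clobbered;  // registers the function or its callees may overwrite
  bool known;         // false when the call target cannot be resolved
};

// Registers the caller must save and restore around one call site.
// With a known callee, only caller-saved registers that it may actually
// clobber matter; with an unknown callee, assume the whole caller-saved set.
inline RegMask regs_to_save(RegMask live_across_call,
                            RegMask abi_caller_saved,
                            const ClobberSummary &callee) {
  const RegMask may_clobber =
      callee.known ? (callee.clobbered & abi_caller_saved) : abi_caller_saved;
  return live_across_call & may_clobber;
}
```

For example, with registers 16 and 18 live across a call and a callee summary that lists only register 16, `regs_to_save` returns just the bit for register 16; with an unknown callee it returns both bits.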
Re: Update and Questions on CPython Extension Module -fanalyzer plugin development
[...]
> As noted in our chat earlier, I don't think we can easily make these
> work. Looking at CPython's implementation: PyList_Type's initializer
> here:
> https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
> initializes tp_flags with the flags, but:
> (a) we don't see that code when compiling a user's extension module
> (b) even if we did, PyList_Type is non-const, so the analyzer has to
> assume that tp_flags could have been written to since it was
> initialized
>
> In theory we could specialcase such lookups, so that, say, a plugin
> could register assumptions into the analyzer about the value of bits
> within (PyList_Type.tp_flags).
>
> However, this seems like a future feature.

I agree that it is more appropriate as a future feature.

Recently, in preparation for a patch, I have been focusing on
migrating as much of our plugin-specific functionality as possible,
which is currently scattered across core analyzer files for
convenience, into the plugin itself. Specifically, I am currently
trying to transfer the code related to stashing Python-specific types
and global variables into analyzer_cpython_plugin.c. This approach has
three main benefits, some of which I believe we have previously
discussed:

1) We only need to search for these values when initializing our
plugin, instead of every time the analyzer is enabled.
2) We can extend the values that we stash by modifying only our
plugin, avoiding changes to core analyzer files such as
analyzer-language.cc, which seems a safer and more resilient approach.
3) Future analyzer plugins will have an easier time stashing values
relevant to their respective projects.

Let me know if my concerns or reasons appear unfounded.

My initial approach involved adding a hook to the end of
ana::on_finish_translation_unit which calls the relevant
stashing-related callbacks registered during plugin initialization.
Here's a rough sketch:

void
on_finish_translation_unit (const translation_unit &tu)
{
  // ... existing code
  stash_named_constants (the_logger.get_logger (), tu);

  do_finish_translation_unit_callbacks(the_logger.get_logger (), tu);
}

Inside do_finish_translation_unit_callbacks we have a loop like so:

  for (auto& callback : finish_translation_unit_callbacks)
  {
    callback(logger, tu);
  }

Where finish_translation_unit_callbacks is a vector defined as follows:

typedef void (*finish_translation_unit_callback) (logger *, const translation_unit &);
vec<finish_translation_unit_callback> *finish_translation_unit_callbacks;

To register a callback, we use:

void
register_finish_translation_unit_callback (
    finish_translation_unit_callback callback)
{
  if (!finish_translation_unit_callbacks)
    vec_alloc (finish_translation_unit_callbacks, 1);
  finish_translation_unit_callbacks->safe_push (callback);
}

And finally, from our plugin (or any other plugin), we can register
callbacks like so:

ana::register_finish_translation_unit_callback (&stash_named_types);
ana::register_finish_translation_unit_callback (&stash_global_vars);

However, on_finish_translation_unit runs before plugin initialization
occurs, so, unfortunately, we would be registering our callbacks after
on_finish_translation_unit with this method. As a workaround, I tried
saving the translation unit like this:

void
on_finish_translation_unit (const translation_unit &tu)
{
  // ... existing code
  stash_named_constants (the_logger.get_logger (), tu);

  saved_tu = &tu;
}

Then in our plugin:

ana::register_finish_translation_unit_callback (&stash_named_types);
ana::register_finish_translation_unit_callback (&stash_global_vars);
ana::do_finish_translation_unit_callbacks();

With do_finish_translation_unit_callbacks passing the saved_tu to the
callbacks.

Unfortunately, with this method, we encounter a segmentation fault when
trying to call the lookup functions within translation_unit at the time
of plugin initialization, even though the translation unit appears to be
stored correctly. So it seems like the solution may not be quite so
simple.

I'm currently investigating this issue, but if there's an obvious
solution that I might be missing or any general suggestions, please let
me know!

Thanks as always,
Eric
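For reference, the registration scheme described above can be reduced to a standalone sketch. It is not the analyzer's code: it uses std::vector and empty stand-in types in place of GCC's vec, ana::logger and ana::translation_unit, and the return value of the dispatch function is an addition made only so the behaviour can be observed. It illustrates the ordering constraint under discussion: a callback registered after the hook has already fired is never run for that translation unit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for the analyzer's types (assumptions for this sketch only).
struct logger {};
struct translation_unit {
  std::string name;
};

using finish_translation_unit_callback =
    void (*)(logger *, const translation_unit &);

// A function-local static sidesteps static-initialization-order problems
// and removes the need for a separate "allocate on first use" check.
inline std::vector<finish_translation_unit_callback> &callbacks() {
  static std::vector<finish_translation_unit_callback> registered;
  return registered;
}

inline void
register_finish_translation_unit_callback(finish_translation_unit_callback cb) {
  assert(cb != nullptr);
  callbacks().push_back(cb);
}

// Invokes every callback registered so far and returns how many ran.
// The translation unit is only borrowed for the duration of this call.
inline std::size_t
do_finish_translation_unit_callbacks(logger *lgr, const translation_unit &tu) {
  for (finish_translation_unit_callback cb : callbacks())
    cb(lgr, tu);
  return callbacks().size();
}
```

Calling `do_finish_translation_unit_callbacks` before any registration runs nothing, which is the situation that arises if the hook fires before the plugin has registered its callbacks.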
gcc-14-20230730 is now available
Snapshot gcc-14-20230730 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/14-20230730/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 14 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch master revision c9434ea40e20584a44a0b6fc8659ee983d5f2dd2

You'll find:

 gcc-14-20230730.tar.xz               Complete GCC

  SHA256=c6e4de606800be0b70dfc5592420d6d5918b95e47df7b457423bd6f63cf409ba
  SHA1=2802aa38149358c551b016c47bd415853aeacd7f

Diffs from 14-20230723 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-14
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: Update and Questions on CPython Extension Module -fanalyzer plugin development
On Sun, 2023-07-30 at 13:52 -0400, Eric Feng wrote:
> [...]
> > As noted in our chat earlier, I don't think we can easily make
> > these work. Looking at CPython's implementation: PyList_Type's
> > initializer here:
> > https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
> > initializes tp_flags with the flags, but:
> > (a) we don't see that code when compiling a user's extension module
> > (b) even if we did, PyList_Type is non-const, so the analyzer has
> > to assume that tp_flags could have been written to since it was
> > initialized
> >
> > In theory we could specialcase such lookups, so that, say, a plugin
> > could register assumptions into the analyzer about the value of
> > bits within (PyList_Type.tp_flags).
> >
> > However, this seems like a future feature.
>
> I agree that it is more appropriate as a future feature.
>
> Recently, in preparation for a patch, I have been focusing on
> migrating as much of our plugin-specific functionality as possible,
> which is currently scattered across core analyzer files for
> convenience, into the plugin itself. Specifically, I am currently
> trying to transfer the code related to stashing Python-specific types
> and global variables into analyzer_cpython_plugin.c. This approach
> has three main benefits, among which some I believe we have
> previously discussed:
>
> 1) We only need to search for these values when initializing our
> plugin, instead of every time the analyzer is enabled.
> 2) We can extend the values that we stash by modifying only our
> plugin, avoiding changes to core analyzer files such as
> analyzer-language.cc, which seems a safer and more resilient
> approach.
> 3) Future analyzer plugins will have an easier time stashing values
> relevant to their respective projects.

Sounds good, though I don't mind if the initial version of your patch
adds CPython-specific stuff to the core, if there are unexpected hurdles
in converting things to be more purely plugin based.

>
> Let me know if my concerns or reasons appear unfounded.
>
> My initial approach involved adding a hook to the end of
> ana::on_finish_translation_unit which calls the relevant
> stashing-related callbacks registered during plugin initialization.
> Here's a rough sketch:
>
> void
> on_finish_translation_unit (const translation_unit &tu)
> {
>   // ... existing code
>   stash_named_constants (the_logger.get_logger (), tu);
>
>   do_finish_translation_unit_callbacks(the_logger.get_logger (), tu);
> }
>
> Inside do_finish_translation_unit_callbacks we have a loop like so:
>
>   for (auto& callback : finish_translation_unit_callbacks)
>   {
>     callback(logger, tu);
>   }
>
> Where finish_translation_unit_callbacks is a vector defined as
> follows:
> typedef void (*finish_translation_unit_callback) (logger *, const
> translation_unit &);
> vec<finish_translation_unit_callback>
> *finish_translation_unit_callbacks;

Seems reasonable.

>
> To register a callback, we use:
>
> void
> register_finish_translation_unit_callback (
>     finish_translation_unit_callback callback)
> {
>   if (!finish_translation_unit_callbacks)
>     vec_alloc (finish_translation_unit_callbacks, 1);
>   finish_translation_unit_callbacks->safe_push (callback);
> }
>
> And finally, from our plugin (or any other plugin), we can register
> callbacks like so:
> ana::register_finish_translation_unit_callback (&stash_named_types);
> ana::register_finish_translation_unit_callback (&stash_global_vars);
>
> However, on_finish_translation_unit runs before plugin initialization
> occurs, so, unfortunately, we would be registering our callbacks
> after on_finish_translation_unit with this method.

Really?
I thought the plugin_init callback is called from initialize_plugins,
which is called from toplev::main fairly early on; I thought
on_finish_translation_unit is called from deep within do_compile, which
is called later on from toplev::main.

What happens if you put breakpoints on both the plugin_init hook and on
on_finish_translation_unit, and have a look at the backtrace at each?

Note that this is the "plugin_init" code, not the PLUGIN_ANALYZER_INIT
callback. The latter *is* called after on_finish_translation_unit, when
the analyzer runs. You'll need to put your code in the former.

> As a workaround, I tried
> saving the translation unit like this:
>
> void
> on_finish_translation_unit (const translation_unit &tu)
> {
>   // ... existing code
>   stash_named_constants (the_logger.get_logger (), tu);
>
>   saved_tu = &tu;
> }

That's not going to work; the "tu" is a reference to an on-stack object,
i.e. essentially a pointer to a temporary on the stack. If saved_tu is a
pointer, then it's going to be pointing at garbage when the function
returns; if it's an object, then it's going to take a copy of just the
base class, which isn't going to be usable either ("object slicing").

>
> Then in our plugin:
> ana::register_finish_translation_unit_callback (&stash_named_types);
> ana::register_fi