Re: Calling convention for Intel APX extension

2023-07-30 Thread Thomas Koenig via Gcc

On 27.07.23 at 15:43, Michael Matz wrote:


> I've recently submitted a patch that adds some attributes that basically
> say "these-and-those regs aren't clobbered by this function" (I did them
> for not clobbered xmm8-15).  Something similar could be used for the new
> GPRs as well.  Then it would be a matter of ensuring that the interesting
> functions are marked with those attributes (and then of course do the
> necessary call-save/restore).


Interesting.

Taking this a bit further: The compiler knows which registers it used
(and which ones might get clobbered by called functions) and could
generate such information automatically and embed it in the assembly
file, and the assembler could, in turn, put it into the object file.

A linker (or LTO) could then check this and elide save/restore pairs
where they are not needed.
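
To make the idea concrete, here is a minimal sketch of the check such a
linker pass might perform.  Everything in it is an assumption for
illustration: the clobber_info record, the save_restore_pair type and the
needed_pairs function are hypothetical names, not part of any existing
object-file format or linker, and register numbers are assumed to fit in
a 32-bit mask.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-function record, as it might be embedded in an object
// file.  Bit i set means "register i may be clobbered by this function or
// by something it calls".
struct clobber_info
{
  std::uint32_t clobbered;
};

// Hypothetical description of one save/restore pair that the caller's
// compiler emitted around a call.
struct save_restore_pair
{
  unsigned reg;   // register number preserved across the call (0..31)
};

// Return the pairs that have to stay: those whose register the callee may
// actually clobber.  Any pair not returned is a candidate for elision.
std::vector<save_restore_pair>
needed_pairs (const std::vector<save_restore_pair> &pairs,
              const clobber_info &callee)
{
  std::vector<save_restore_pair> kept;
  for (const save_restore_pair &p : pairs)
    if (callee.clobbered & (std::uint32_t{1} << p.reg))
      kept.push_back (p);
  return kept;
}
```

For a callee that only clobbers r16 and r18, a caller that saved r16
through r19 would keep two of its four pairs.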

Now, I know that removing instructions during linking is a dangerous
business, and is a source of hard-to-find and rare bugs (the worst kind)
if not done right; a bullet-proof algorithm would be needed for that.

It would probably be impossible for calls into shared libraries, since
the saved registers might change from version to version.  It also would
probably not work for virtual member functions which are not found by
devirtualization.

Still, potential gains could be substantial, and it could have an
effect which could come close to inlining, while actually saving space
instead of using extra.

Comments?


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-30 Thread Eric Feng via Gcc
[...]
> As noted in our chat earlier, I don't think we can easily make these
> work.  Looking at CPython's implementation: PyList_Type's initializer
> here:
> https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
> initializes tp_flags with the flags, but:
> (a) we don't see that code when compiling a user's extension module
> (b) even if we did, PyList_Type is non-const, so the analyzer has to
> assume that tp_flags could have been written to since it was
> initialized
>
> In theory we could special-case such lookups, so that, say, a plugin
> could register assumptions into the analyzer about the value of bits
> within (PyList_Type.tp_flags).
>
> However, this seems like a future feature.

I agree that it is more appropriate as a future feature.

Recently, in preparation for a patch, I have been focusing on
migrating as much of our plugin-specific functionality as possible,
which is currently scattered across core analyzer files for
convenience, into the plugin itself. Specifically, I am currently
trying to transfer the code related to stashing Python-specific types
and global variables into analyzer_cpython_plugin.c. This approach has
three main benefits, some of which I believe we have previously
discussed:

1) We only need to search for these values when initializing our
plugin, instead of every time the analyzer is enabled.
2) We can extend the values that we stash by modifying only our
plugin, avoiding changes to core analyzer files such as
analyzer-language.cc, which seems a safer and more resilient approach.
3) Future analyzer plugins will have an easier time stashing values
relevant to their respective projects.

Let me know if my concerns or reasons appear unfounded.

My initial approach involved adding a hook to the end of
ana::on_finish_translation_unit which calls the relevant
stashing-related callbacks registered during plugin initialization.
Here's a rough sketch:

void
on_finish_translation_unit (const translation_unit &tu)
{
  // ... existing code
  stash_named_constants (the_logger.get_logger (), tu);

  do_finish_translation_unit_callbacks(the_logger.get_logger (), tu);
}

Inside do_finish_translation_unit_callbacks we have a loop like so:

for (auto& callback : finish_translation_unit_callbacks)
{
    callback(logger, tu);
}

Where finish_translation_unit_callbacks is a vector defined as follows:
typedef void (*finish_translation_unit_callback) (logger *, const
translation_unit &);
vec<finish_translation_unit_callback> *finish_translation_unit_callbacks;

To register a callback, we use:

void
register_finish_translation_unit_callback (
    finish_translation_unit_callback callback)
{
  if (!finish_translation_unit_callbacks)
    vec_alloc (finish_translation_unit_callbacks, 1);
  finish_translation_unit_callbacks->safe_push (callback);
}

And finally, from our plugin (or any other plugin), we can register
callbacks like so:
ana::register_finish_translation_unit_callback (&stash_named_types);
ana::register_finish_translation_unit_callback (&stash_global_vars);
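
For anyone who wants to try the registration scheme outside of GCC, here
is a self-contained sketch of the same pattern.  It is only an
approximation: logger and translation_unit are empty stand-ins for the
analyzer's real types, std::vector replaces GCC's vec (so the lazy
vec_alloc is not needed), and the two stash functions and stash_log are
toy placeholders that merely record that they ran.

```cpp
#include <string>
#include <vector>

// Stand-ins for the analyzer's types; the real ones live in GCC.
struct logger {};
struct translation_unit { std::string name; };

typedef void (*finish_translation_unit_callback) (logger *,
                                                  const translation_unit &);

// std::vector is used here instead of GCC's vec.
static std::vector<finish_translation_unit_callback>
  finish_translation_unit_callbacks;

void
register_finish_translation_unit_callback (
    finish_translation_unit_callback callback)
{
  finish_translation_unit_callbacks.push_back (callback);
}

// Invoke every registered callback, in registration order.
void
do_finish_translation_unit_callbacks (logger *log, const translation_unit &tu)
{
  for (finish_translation_unit_callback callback
       : finish_translation_unit_callbacks)
    callback (log, tu);
}

// Toy callbacks: they only append a line to stash_log.
static std::vector<std::string> stash_log;

void
stash_named_types (logger *, const translation_unit &tu)
{
  stash_log.push_back ("types:" + tu.name);
}

void
stash_global_vars (logger *, const translation_unit &tu)
{
  stash_log.push_back ("vars:" + tu.name);
}
```

Registering both callbacks and then calling
do_finish_translation_unit_callbacks once leaves two entries in
stash_log, in registration order.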

However, on_finish_translation_unit runs before plugin initialization
occurs, so, unfortunately, we would be registering our callbacks after
on_finish_translation_unit with this method. As a workaround, I tried
saving the translation unit like this:

void
on_finish_translation_unit (const translation_unit &tu)
{
  // ... existing code
  stash_named_constants (the_logger.get_logger (), tu);

  saved_tu = &tu;
}

Then in our plugin:
ana::register_finish_translation_unit_callback (&stash_named_types);
ana::register_finish_translation_unit_callback (&stash_global_vars);
ana::do_finish_translation_unit_callbacks ();

With do_finish_translation_unit_callbacks passing saved_tu to the callbacks.

Unfortunately, with this method, it seems like we encounter a
segmentation fault when trying to call the lookup functions within
translation_unit at the time of plugin initialization, even though the
translation unit is stored correctly. So it seems like the solution
may not be quite so simple.

I'm currently investigating this issue, but if there's an obvious
solution that I might be missing or any general suggestions, please
let me know!

Thanks as always,
Eric


gcc-14-20230730 is now available

2023-07-30 Thread GCC Administrator via Gcc
Snapshot gcc-14-20230730 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/14-20230730/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 14 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch master 
revision c9434ea40e20584a44a0b6fc8659ee983d5f2dd2

You'll find:

 gcc-14-20230730.tar.xz   Complete GCC

  SHA256=c6e4de606800be0b70dfc5592420d6d5918b95e47df7b457423bd6f63cf409ba
  SHA1=2802aa38149358c551b016c47bd415853aeacd7f

Diffs from 14-20230723 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-14
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

2023-07-30 Thread David Malcolm via Gcc
On Sun, 2023-07-30 at 13:52 -0400, Eric Feng wrote:
> [...]
> > As noted in our chat earlier, I don't think we can easily make
> > these
> > work.  Looking at CPython's implementation: PyList_Type's
> > initializer
> > here:
> > https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
> > initializes tp_flags with the flags, but:
> > (a) we don't see that code when compiling a user's extension module
> > (b) even if we did, PyList_Type is non-const, so the analyzer has
> > to
> > assume that tp_flags could have been written to since it was
> > initialized
> > 
> > In theory we could special-case such lookups, so that, say, a plugin
> > could register assumptions into the analyzer about the value of
> > bits
> > within (PyList_Type.tp_flags).
> > 
> > However, this seems like a future feature.
> 
> I agree that it is more appropriate as a future feature.
> 
> Recently, in preparation for a patch, I have been focusing on
> migrating as much of our plugin-specific functionality as possible,
> which is currently scattered across core analyzer files for
> convenience, into the plugin itself. Specifically, I am currently
> trying to transfer the code related to stashing Python-specific types
> and global variables into analyzer_cpython_plugin.c. This approach
> has
> three main benefits, some of which I believe we have previously
> discussed:
> 
> 1) We only need to search for these values when initializing our
> plugin, instead of every time the analyzer is enabled.
> 2) We can extend the values that we stash by modifying only our
> plugin, avoiding changes to core analyzer files such as
> analyzer-language.cc, which seems a safer and more resilient
> approach.
> 3) Future analyzer plugins will have an easier time stashing values
> relevant to their respective projects.

Sounds good, though I don't mind if the initial version of your patch
adds CPython-specific stuff to the core, if there are unexpected
hurdles in converting things to be more purely plugin based.

> 
> Let me know if my concerns or reasons appear unfounded.
> 
> My initial approach involved adding a hook to the end of
> ana::on_finish_translation_unit which calls the relevant
> stashing-related callbacks registered during plugin initialization.
> Here's a rough sketch:
> 
> void
> on_finish_translation_unit (const translation_unit &tu)
> {
>   // ... existing code
>   stash_named_constants (the_logger.get_logger (), tu);
> 
>   do_finish_translation_unit_callbacks(the_logger.get_logger (), tu);
> }
> 
> Inside do_finish_translation_unit_callbacks we have a loop like so:
> 
> for (auto& callback : finish_translation_unit_callbacks)
> {
>     callback(logger, tu);
> }
> 
> Where finish_translation_unit_callbacks is a vector defined as
> follows:
> typedef void (*finish_translation_unit_callback) (logger *, const
> translation_unit &);
> vec<finish_translation_unit_callback>
> *finish_translation_unit_callbacks;

Seems reasonable.

> 
> To register a callback, we use:
> 
> void
> register_finish_translation_unit_callback (
>     finish_translation_unit_callback callback)
> {
>   if (!finish_translation_unit_callbacks)
>     vec_alloc (finish_translation_unit_callbacks, 1);
>   finish_translation_unit_callbacks->safe_push (callback);
> }
> 
> And finally, from our plugin (or any other plugin), we can register
> callbacks like so:
> ana::register_finish_translation_unit_callback (&stash_named_types);
> ana::register_finish_translation_unit_callback (&stash_global_vars);
> 
> However, on_finish_translation_unit runs before plugin initialization
> occurs, so, unfortunately, we would be registering our callbacks
> after
> on_finish_translation_unit with this method.

Really?  I thought the plugin_init callback is called from
initialize_plugins, which is called from toplev::main fairly early on;
I thought on_finish_translation_unit is called from deep within
do_compile, which is called later on from toplev::main.

What happens if you put breakpoints on both the plugin_init hook and on
on_finish_translation_unit, and have a look at the backtrace at each?

Note that this is the "plugin_init" code, not the PLUGIN_ANALYZER_INIT
callback.  The latter *is* called after on_finish_translation_unit,
when the analyzer runs.  You'll need to put your code in the former.
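
To illustrate the ordering, here is a toy model.  It is not GCC code:
toy_compiler and its members are made-up names, and the three event
strings simply mirror the sequence described above (plugin_init first,
then on_finish_translation_unit, then PLUGIN_ANALYZER_INIT).

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy model of the ordering described above; none of this is GCC code.
struct toy_compiler
{
  std::vector<std::function<void ()>> finish_tu_callbacks;
  std::vector<std::string> events;

  // Run the three phases in order, registering the stash callback either
  // in the early hook (plugin_init) or in the late one.
  void run (bool register_in_plugin_init)
  {
    auto do_register = [this] {
      finish_tu_callbacks.push_back ([this] { events.push_back ("stash"); });
    };

    events.push_back ("plugin_init");
    if (register_in_plugin_init)
      do_register ();

    events.push_back ("on_finish_translation_unit");
    for (auto &cb : finish_tu_callbacks)
      cb ();

    events.push_back ("PLUGIN_ANALYZER_INIT");
    if (!register_in_plugin_init)
      do_register ();   // too late: the callbacks have already been run
  }
};
```

With run (true) the "stash" event shows up right after
"on_finish_translation_unit"; with run (false) it never appears, because
the callback is registered only after the loop has already run.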


>  As a workaround, I tried
> saving the translation unit like this:
> 
> void
> on_finish_translation_unit (const translation_unit &tu)
> {
>   // ... existing code
>   stash_named_constants (the_logger.get_logger (), tu);
> 
>   saved_tu = &tu;
> }

That's not going to work; the "tu" is a reference to an on-stack
object, i.e. essentially a pointer to a temporary on the stack.  If
saved_tu is a pointer, then it's going to be pointing at garbage when
the function returns; if it's an object, then it's going to take a copy
of just the base class, which isn't going to be usable either ("object
slicing").
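
As a small standalone illustration of the slicing half of that (the
dangling-pointer half is undefined behavior, so it can't be demonstrated
safely), consider the sketch below.  The class names are hypothetical
and much simpler than the analyzer's real translation_unit hierarchy.

```cpp
#include <string>

// Hypothetical base class standing in for the interface the analyzer sees.
struct translation_unit_base
{
  virtual ~translation_unit_base () {}
  virtual std::string lookup () const { return "base: nothing to look up"; }
};

// Hypothetical frontend-specific subclass that does the real work.
struct c_translation_unit : translation_unit_base
{
  std::string lookup () const override { return "derived: found it"; }
};

// Copying through a base-class reference keeps only the base part.
std::string
lookup_via_copy (const translation_unit_base &tu)
{
  translation_unit_base sliced = tu;   // object slicing happens here
  return sliced.lookup ();
}

// Using the reference while the object is still alive dispatches to the
// derived class as expected.
std::string
lookup_via_reference (const translation_unit_base &tu)
{
  return tu.lookup ();
}
```

Passing a c_translation_unit to lookup_via_reference reaches the derived
implementation, whereas lookup_via_copy only ever sees the base-class
behavior.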

> 
> Then in our plugin:
> ana::register_finish_translation_unit_callback (&stash_named_types);
> ana::register_fi