On Thu, Oct 09, 2025 at 10:25:45PM +0200, Josef Melcr wrote:
> On 10/9/25 12:58, Tobias Burnus wrote:
> > Still, I think enabling it always with -O2 - or at least when
> > 'static' and only called once makes sense.
> Enabling it by default for static functions only called once is a good
> start. It makes a lot of sense and it will also allow me to gauge the
> complexity of solving the rest.
It doesn't necessarily have to be called just once, but it really needs
to be static, never called directly, and only passed as a callback to
functions with an explicit or implicit GNU callback attribute (because
those attributes guarantee that the callee to which the callback is
passed doesn't leak it anywhere where something else could call it).
A simple
  void foo (void (*) (void *), void *);
function won't do; it could store the callback in some void (*) (void *)
variable, e.g. in some other TU, and the other TU could call that function
later on as many times as it wants with whatever arguments.
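To make the escape concrete, here is a minimal sketch of such a foo. All names other than foo (saved_fn, saved_data, call_saved_with, record, demo_escape) are made up for illustration, and everything is put in one file only so it compiles on its own; in the scenario above the saving and the later call would live in a different TU.

```c
#include <stddef.h>

/* Hypothetical state, imagined to live in some other TU.  */
static void (*saved_fn) (void *);
static void *saved_data;

/* Same prototype as in the mail.  Without a callback attribute nothing
   prevents the callee from stashing the pointer.  */
void
foo (void (*fn) (void *), void *data)
{
  saved_fn = fn;
  saved_data = data;
  fn (data);
}

/* Hypothetical later use of the leaked pointer with another argument.  */
void
call_saved_with (void *other)
{
  if (saved_fn != NULL)
    saved_fn (other);
}

static int last_seen;

/* The callback; it only records the value it was handed.  */
static void
record (void *p)
{
  last_seen = *(int *) p;
}

int
demo_escape (void)
{
  int a = 42, b = 7;
  foo (record, &a);      /* The only call the optimizer sees passes 42.  */
  call_saved_with (&b);  /* The leaked pointer is invoked with 7.  */
  return last_seen;
}
```

If 42 had been propagated into record based on the single visible call to foo, the second invocation would be miscompiled.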
If there are multiple callers, one can change those too even without
cloning, just restricted to what all the callers agree on.
E.g. if all callers pass value 42 to some field in the pointed-to
structure, that can be propagated; if you have e.g. ranges, one needs to
pass the union of all the ranges, etc.
Even for OpenMP/OpenACC you have no guarantee you'll have a single caller.
Consider:
void bar (int);

[[gnu::always_inline]] static inline void
foo (int x)
{
  #pragma omp parallel
  bar (x);
}

[[gnu::always_inline]] static inline void
qux (int x)
{
  #pragma omp parallel
  bar (x);
}

void
baz ()
{
  foo (42);
  foo (42);
  foo (42);
  qux (15);
  qux (22);
}
Yet you could propagate 42 into .omp_data_i_2(D)->x in foo._omp_fn.0,
because all callers agree on that. You can't do that for qux._omp_fn.0;
all you could do is propagate a range like int [15, 22] or
int [15, 15][22, 22].
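The merging over all callers can be sketched roughly like this. It is not how GCC's value ranges are implemented; struct value_range, merge_caller_values and range_is_single_constant are hypothetical names, and the sketch only computes the coarser of the two ranges mentioned above (the convex hull, not the [15, 15][22, 22] form).

```c
#include <limits.h>

/* Hypothetical closed interval [lo, hi]; lo > hi means empty.  */
struct value_range
{
  int lo;
  int hi;
};

/* Union (as a convex hull) of the values passed by all callers.  */
struct value_range
merge_caller_values (const int *vals, int n)
{
  struct value_range r = { INT_MAX, INT_MIN };
  for (int i = 0; i < n; i++)
    {
      if (vals[i] < r.lo)
        r.lo = vals[i];
      if (vals[i] > r.hi)
        r.hi = vals[i];
    }
  return r;
}

/* A constant can be propagated only if every caller passed the same
   value.  */
int
range_is_single_constant (struct value_range r)
{
  return r.lo == r.hi;
}
```

Feeding it the three foo arguments {42, 42, 42} gives [42, 42], which is a single constant, while the qux arguments {15, 22} give [15, 22], which is not.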
Note, even if it is static, you can't modify the function if it has the
[[gnu::used]] attribute (i.e. the DECL_PRESERVE_P flag set on it).
In that case e.g. inline asm might be referring to it.
BTW, for the .ASSUME ifn, which would be really nice to get handled as
well, the function can also have multiple callers due to inlining, but in
that case the function is magic and not actually emitted into assembly.
And while e.g. constant propagation into the function might help, the
most important would be IPA-SRA.
If it agrees on all callers (or clones the magic function), in that case
it doesn't even have to keep a single pointer argument pointing to some
structure; it can change the number of arguments to the function, and
ideally split all the ones passing structures into the actually used
scalar elements.
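Written out by hand, the kind of signature change meant here looks roughly as follows. This is only an illustration of the before/after shape, not what GCC emits for .ASSUME; struct assume_data and both function names are hypothetical.

```c
/* Hypothetical data block; only x is actually used by the callee.  */
struct assume_data
{
  int x;
  int unused_a;
  int unused_b;
};

/* Before: a single pointer argument pointing to the structure.  */
int
assumed_before (const struct assume_data *d)
{
  return d->x > 0;
}

/* After: the structure argument is split into the one scalar element
   that is actually used, so callers no longer need to build the
   structure in memory at all.  */
int
assumed_after (int x)
{
  return x > 0;
}
```

Both variants compute the same result for the same x; the difference is only in what every caller has to set up.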
Jakub