Hi,

On Mon, Oct 06 2025, Tobias Burnus wrote:
> Hi Josef, Martin & Honza, hello world,
>
>
> Josef Melcr wrote:
>> this is the fourth version of this patch
>
> I have some questions below - as a mere user.
>
> Disclaimer: I presume that there are good reasons for the
> current behavior and I know that there is always room for
> improvement - and my puzzled questions shouldn't delay
> the landing of the patch. (Additionally, I have no idea
> how GCC's IPA internally works and I only glanced at the
> patch. Still, I am a bit puzzled.)
>
> * * *
>
> I tried the following test case:
> ------------------
> int f()
> {
>    int x = 1;
>    int res;
>    #pragma omp parallel
>      if (x > 0)
>        res = 55;
>      else
>        res = 33;
>    return res;
> }
> ------------------
>
> But, using gcc -fopenmp -O3 -fdump-tree-optimized, I still see
> the condition in the dump/assembly.
>
> What puzzled me a bit with the examples is that all of them use
> LTO. Shouldn't the optimization always be in place? In particular,
> the generated f._omp_fn.0 is/should be static, i.e. issues like code
> size growth or use outside of the TU shouldn't apply?
>
> If I add 'int main() { return f(); }' and compile with -flto, I see
> that the condition is gone.
>
> Secondly, as the function is static, I would have expected that -O2
> (implies -fipa-cp) is enough; however, I see that -fipa-cp-clone
> (implied by -O3) is required. - I understand that internally cloning
> is easier, but I wonder whether we could still enable this with
> -O2, given that the function is static and has only a single caller.

The function that is being cloned is the outlined artificial function
containing the OMP parallel region.  That is static all right but it
also appears to be addressable, because its address is taken right there
in the callback-carrying call.  And so currently IPA-CP does not treat
it as under control and so is only willing to modify a clone.
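
To illustrate what "addressable" means here, below is a rough hand-written
sketch of the shape of the code after OMP expansion.  It is not the actual
lowered GIMPLE: struct omp_data, f_omp_fn_0, run_parallel and f_lowered are
made-up names, and run_parallel is only a stand-in for the libgomp entry
point (it runs the body once on the calling thread).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the block of variables shared with the region. */
struct omp_data { int x; int res; };

/* Stand-in for the outlined region (GCC names the real one f._omp_fn.0). */
static void f_omp_fn_0(void *p)
{
    struct omp_data *d = p;
    if (d->x > 0)
        d->res = 55;
    else
        d->res = 33;
}

/* Hypothetical stand-in for the runtime call that carries the callback.
   It simply invokes the body once; the real runtime starts a thread team. */
static void run_parallel(void (*fn)(void *), void *data)
{
    assert(fn != NULL);
    fn(data);
}

int f_lowered(void)
{
    struct omp_data d = { .x = 1, .res = 0 };
    /* The address of the static function is taken here; there is no
       direct call to it anywhere in the translation unit. */
    run_parallel(f_omp_fn_0, &d);
    return d.res;
}
```

The only use of f_omp_fn_0 is the address passed to run_parallel, which is
why the function counts as address-taken even though it is static.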

I hope that replacing all uses of node->local in ipa-cp.cc with a new
predicate would also allow propagation in this particular case.  In
addition to node->local nodes, the predicate would accept nodes with
1) a static decl (we may need to stream this for LTO?), 2) just one
callback-edge caller and 3) just one reference.

Or we can change the meaning of the local flag to the above but that may
bubble up in a surprising way elsewhere.
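
As a toy model of the predicate I have in mind (plain C, not GCC code:
struct toy_node and its fields are hypothetical and do not mirror
cgraph_node, and reading "just one callback-edge caller" as "exactly one
caller, and it is a callback edge" is an assumption):

```c
#include <stdbool.h>

/* Toy model only; every field here is hypothetical. */
struct toy_node {
    bool local;                   /* models the existing node->local flag */
    bool is_static;               /* the decl is static */
    int  num_callers;             /* total number of caller edges */
    bool sole_caller_is_callback; /* the single caller is a callback edge */
    int  num_references;          /* number of references to the node */
};

/* Returns true if propagation could modify the node itself instead of
   only a clone, under the three conditions listed above. */
bool toy_under_control_p(const struct toy_node *n)
{
    if (n->local)
        return true;
    return n->is_static
           && n->num_callers == 1
           && n->sole_caller_is_callback
           && n->num_references == 1;
}
```

For the outlined function in the test case above, all three conditions would
hold in this model, whereas a second reference to it would make the predicate
fail.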

Martin


>
> But also if it weren't static (or had multiple callers), for -O3 I'd
> expect the cloning in this case (assuming that it is profitable by
> whatever measure the compiler has).
>
> Thanks again for the patch and to all others involved in reviewing and
> creating it.
>
> Tobias
