Re: [PATCH v4] ipa, cgraph: Enable constant propagation to OpenMP kernels

Josef Melcr Sat, 18 Oct 2025 11:11:45 -0700

Hi,

On 10/6/25 15:51, Tobias Burnus wrote:

Hi Josef, Martin & Honza, hello world,



Josef Melcr wrote:

this is the fourth version of this patch


I have some questions below - as a mere user.

Disclaimer: I presume that there are good reasons for the
current behavior and I know that there is always room for
improvement - and my puzzled questions shouldn't delay
the landing of the patch. (Additionally, I have no idea
how GCC's IPA internally works and I only glanced at the
patch. Still, am a bit puzzled.)

* * *

I tried the following test case:
------------------
int f()
{
  int x = 1;
  int res;
  #pragma omp parallel
    if (x > 0)
      res = 55;
    else
      res = 33;
  return res;
}
------------------

But, using gcc -fopenmp -O3 -fdump-tree-optimized, I still see
the condition in the dump/assembly.

What puzzled me a bit with the examples is that all of them use
LTO. Shouldn't the optimization not always be in place? In particular,
the generated f._omp_fn.0 is/should be static, i.e. issues like code
side growth or use outside of the TU shouldn't apply?

If I add 'int main() { return f(); }' and compile with -flto, I see
that the condition is gone.

I remember running into the exact same issue when I was first developingthe testcases. It was right around the time we discovered that ipa-cpheuristics were broken, so I just assumed it was something related tothat and just stuck with LTO. The real reason is in the functionipa_compute_jump_functions_for_bb:


-----

      /* We do not need to bother analyzing calls to unknown functions
         unless they may become known during lto/whopr.  */
      if (!callee->definition && !flag_lto
          && !gimple_call_fnspec (cs->call_stmt).known_p ())
        continue;

-----

Since the definition of GOMP_parallel isn't known, the call doesn't getanalyzed and the callback edge is never created. LTO breaks thiscondition, which is why enabling it yields the expected result. If youextend the condition with something like "and the function doesn't havea callback attribute", the edge does get constructed and the ifstatement does get optimized away. We could certainly add such a check,it makes sense and I don't think that people would object.

Secondly, as the function is static, I would have expected that -O2
(implies -fipa-cp) is enough; however, I see that -fipa-cp-clone
(implied by -O3) is required. - I understand that internally cloning
is easier, but I wonder whether we it could still enable this with
-O2, given that the func is static and has a single caller, only.

But also if it weren't static (or had multiple caller), for -O3 I'd
expect the cloning in this case (assuming that it is profitable by
whatever measure the compiler has).

Do you suggest cloning with -O2 or editing the function inline? I amnot sure how difficult either option would be to implement, though justenabling cloning with -O2 shouldn't be too difficult. Cloning makessense in this case, since the original function will be cleaned upanyway, meaning there would be very little code growth, if any. Implementing it through a new compilation flag enabled at -O2 by default(as suggested by Martin) would probably be the best.

Thanks again for the patch and all others involved in reviewing
creating, it.

Tobias

On a side note, Tobias noticed that I am missing an initializer inattr-callback.cc on line 361, which results in a warning and a failedbuild. I remember fixing this in my bootstrap build, but I forgot tofix it in my dev build, from which I dumped the patch. Should I resendthe patch?


Best regards,

Josef

Re: [PATCH v4] ipa, cgraph: Enable constant propagation to OpenMP kernels

Reply via email to