https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46032

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #13)
> > > So there's no longer a path in the call graph from main to main._omp_fn.
> > > Perhaps a dummy body for GOMP_parallel could fix that.
> > 
> > Hm?  The IPA PTA "solution" was to tell IPA PTA that the call to
> > GOMP_parallel
> > doesn't make .omp_data_o escape.
> > 
> > The attached patch doesn't work because it only patches GOMP_parallel_start,
> > not GOMP_parallel.
> > 
> > Of course it would be even better to teach IPA PTA that GOMP_parallel
> > is really invoking main._omp_fn.0 with a &.omp_data_o.1 argument.
> > 
> > How many different ways of IL do we get doing this kind of indirect
> > function invocations?
> 
> Other IPA propagators like IPA-CP probably also would like to know this.
> 
> I see various builtins taking a OMPFN argument in omp-builtins.def.  If we
> assume the GOMP runtime itself is "transparent" then do we know how the
> builtins end up calling the actual implementation function?

GOMP_parallel* calls the ompfn function (first argument) with the second
argument as its only argument.  That second argument is a pointer to a
structure that is filled in before the GOMP_parallel* call and is dead
(marked with a clobber) after it.  The callback function can be called just
once or several times (once in each thread).
Then there is GOMP_task*, which takes one or two callback functions.
If there is just one (the other one is NULL), the first callback function
(1st argument) is called either with the second argument as its only argument,
or with a pointer to a memory block that was filled with memcpy from the
second argument.  If the third argument (second callback) is non-NULL, then
that callback is called instead of the memcpy, and the two pointers can point
to two different structures.
GOMP_target is another case, but there is often a cross-device boundary in
between the two, so it is much harder to model that for IPA-PTA etc. purposes.
So, schematically, GOMP_parallel* (fn1, data1, ...) performs:
if (somecond)
  for (...)
    pthread_create (..., fn1, data1);
fn1 (data1);
if (somecond)
  for (...)
    pthread_join (...);
and GOMP_task (fn1, data1, fn2, ...) performs:
if (fn2 == 0 && somecond1)
  fn1 (data1);
else
  {
    char *buf = malloc (...); // or alloca/vla
    if (fn2 == 0)
      memcpy (buf, data1, ...);
    else
      fn2 (buf, data1);
    fn1 (buf);
  }
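
For IPA purposes, the two schemes described above could be approximated by
"transparent" bodies along the following lines.  This is only a sketch, not
libgomp code: model_parallel, model_task, struct omp_data, body and
copy_double_step are hypothetical names, threads are replaced by plain
sequential calls, the task is run immediately rather than queued, and the
(dst, src) argument order of the copy callback is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Outlined region body, e.g. main._omp_fn.0.  */
typedef void (*ompfn_t) (void *);
/* Task copy callback; the (dst, src) order is an assumption.  */
typedef void (*cpyfn_t) (void *, void *);

/* Hypothetical model of GOMP_parallel* (fn1, data1, ...).  Each extra
   thread would run fn1 (data1); here those runs are plain calls.  The
   calling thread always runs fn1 (data1) once itself.  */
static void
model_parallel (ompfn_t fn1, void *data1, unsigned extra_threads)
{
  for (unsigned i = 0; i < extra_threads; i++)
    fn1 (data1);
  fn1 (data1);
}

/* Hypothetical model of GOMP_task (fn1, data1, fn2, ...).  The flag
   run_in_place stands in for "somecond1"; the task is executed
   immediately instead of being deferred.  */
static void
model_task (ompfn_t fn1, void *data1, cpyfn_t fn2, size_t size,
            int run_in_place)
{
  if (fn2 == NULL && run_in_place)
    {
      fn1 (data1);
      return;
    }

  char *buf = malloc (size);
  assert (buf != NULL);
  if (fn2 == NULL)
    memcpy (buf, data1, size);   /* plain byte copy of the data block */
  else
    fn2 (buf, data1);            /* user-provided copy instead of memcpy */
  fn1 (buf);                     /* callback sees the copy, not data1 */
  free (buf);
}

/* Example payload, standing in for .omp_data_o.  */
struct omp_data
{
  int *counter;
  int step;
};

/* Example region body: bump the shared counter by step.  */
static void
body (void *p)
{
  struct omp_data *d = p;
  *d->counter += d->step;
}

/* Example copy callback: the copy need not be byte-identical to the
   source, which is why the two pointers can refer to different
   structures.  */
static void
copy_double_step (void *dst, void *src)
{
  struct omp_data *d = dst;
  const struct omp_data *s = src;
  d->counter = s->counter;
  d->step = 2 * s->step;
}
```

The point for IPA-PTA is visible in both models: data1 reaches fn1 either
directly or through a copy, so anything data1 points to is reachable from the
callback even though the data block itself need not escape any further.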
