Re: [RFC][WIP Patch] OpenMP map with iterator + Fortran OpenMP deep mapping / custom allocator (+ Fortran co_reduce)

Jakub Jelinek via Fortran Mon, 06 Dec 2021 07:19:20 -0800

On Mon, Dec 06, 2021 at 03:00:30PM +0100, Tobias Burnus wrote:
> This is a RFC/WIP patch about:
> 
> (A) OpenMP (C/C++/Fortran)
>    omp target map(iterator(i=n:m),to : x(i))
> 
> (B) Fortran:
> (1)   omp target map(to : dt_var, class_var)
> (2)   omp parallel allocator(my_alloc) firstprivate(class_var)
> (3)  call co_reduce(dt_coarray, my_func)
> 
> The problem with (A) is that the number of iterations is not
> countable at compile time, so the entries cannot easily be added to
> the array used to call GOMP_target_ext.
> 
> The problem with (B) is that dt_var can have allocatable components,
> which complicates things, and the number of elements is not known at
> compile time with recursive types, nor with polymorphic types, as it
> depends on the recursion depth and the dynamic type, respectively.


I think there is no reason why the 3 arrays passed to GOMP_target_ext
need to be constant size. The same holds for target data and
target {enter,exit} data and, because this affects to and from clauses
as well, for target update.
We can allocate them as a VLA or from the heap as well.
I guess the only complication with using __builtin_alloca_with_align
would be target data, where the construct body could be using alloca
and we wouldn't want to silently free those allocas at the end of the
construct. I bet we already have that problem, though, whenever we
privatize variable length variables on constructs whose body isn't
outlined into a new function, and outlining a body into a new function
will also break alloca across the boundaries.
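
To make the heap variant concrete, here is a minimal sketch in C of
filling three parallel arrays whose length is only known at runtime.
Everything in it is an assumption for illustration: the struct, the
helper names and the SKETCH_MAP_TO value are hypothetical and are not
what the compiler or libgomp actually emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a "to" map kind; not libgomp's real encoding.  */
enum { SKETCH_MAP_TO = 1 };

/* The three parallel arrays plus their runtime length (illustrative names).  */
struct map_arrays
{
  size_t mapnum;
  void **hostaddrs;
  size_t *sizes;
  unsigned short *kinds;
};

/* Expand map(iterator(i=lo:hi), to : x[i]) into hi - lo entries.
   Returns 0 on success, -1 if an allocation fails.  */
static int
expand_iterator_map (struct map_arrays *m, double *x, size_t lo, size_t hi)
{
  assert (lo <= hi);
  size_t n = hi - lo;
  size_t cap = n ? n : 1;	/* Avoid a zero-sized malloc.  */

  m->mapnum = n;
  m->hostaddrs = malloc (cap * sizeof *m->hostaddrs);
  m->sizes = malloc (cap * sizeof *m->sizes);
  m->kinds = malloc (cap * sizeof *m->kinds);
  if (!m->hostaddrs || !m->sizes || !m->kinds)
    {
      free (m->hostaddrs);
      free (m->sizes);
      free (m->kinds);
      return -1;
    }

  /* One entry per iterator value, all with the same size and kind.  */
  for (size_t k = 0; k < n; k++)
    {
      m->hostaddrs[k] = &x[lo + k];
      m->sizes[k] = sizeof x[0];
      m->kinds[k] = SKETCH_MAP_TO;
    }
  return 0;
}

/* Release the arrays once the construct is done with them.  */
static void
free_map_arrays (struct map_arrays *m)
{
  free (m->hostaddrs);
  free (m->sizes);
  free (m->kinds);
  m->mapnum = 0;
}
```

Because the storage comes from malloc rather than alloca, freeing it at
the end of the construct does not touch any alloca done in the construct
body, at the cost of an explicit free on every exit path.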

We do a lot of sorting of the map clauses, especially during
gimplification. One question is whether it is ok to sort the whole map
clause with iterator as one clause, or if we'd need to do the sorting at
runtime.
With arbitrary lvalue expressions, the clauses with iterator
don't need to be just map(iterator(i=0:n),to : x[i]) but can be e.g.
map(iterator(i=0:n), tofrom : i == 0 ? a : i == 1 ? b : c[i - 2])
etc. (at least in C++, in C I think ?: doesn't give lvalues), or
*(i == 0 ? &a : i == 1 ? &b : &c[i - 2]) otherwise. I hope that is ok,
though: it isn't much different from such lvalue expressions when i
isn't an iterator but, say, a function parameter or some other variable.
I think we only map the value in that case and don't really remap the
vars etc. (but sure, for map(iterator(i=0:n), to : foo(i).a[i].b[i]) we
should follow the rules for []s and .).
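
As a small illustration of the C spelling, the sketch below selects an
address with ?: and dereferences the result, which gives an lvalue even
where the plain conditional expression does not. The variables a, b, c
and the helper iter_lvalue are hypothetical, and OpenMP is left out so
the snippet stands alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical objects that the iterator expression selects between.  */
static int a, b, c[4];

/* Address designated by iterator value i, i.e. the operand of the
   dereference in *(i == 0 ? &a : i == 1 ? &b : &c[i - 2]).  */
static int *
iter_lvalue (size_t i)
{
  assert (i < 2 + sizeof c / sizeof c[0]);
  return i == 0 ? &a : i == 1 ? &b : &c[i - 2];
}
```

Writing *iter_lvalue (1) = 7 stores to b, whereas in C an assignment to
(i == 0 ? a : b) is rejected because the conditional expression is not
an lvalue there.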

So, I wouldn't be really afraid of going into dynamic allocation of the
arrays if the count isn't compile time constant.

Another thing is that it would be nice to optimize some of the most
common cases, where some mappings could be described in more compact
ways. That wouldn't be solely about the iterator clause, but would also
matter when we start properly implementing all the mapping nastiness of
5.0 and beyond, like mapping of references, or the declare mapper stuff
etc.
So if we come up with something like the array descriptors Fortran has,
to describe the mapping of some possibly non-contiguous multidimensional
array with strides etc. in a single map element, that will be nice. But
I'd prefer not to outline complex expressions from map clauses into a
separate function each: they can use many variables etc. from the parent
function, and calling those as callbacks would be too ugly.
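
For the descriptor idea, a rough sketch in C of what a single map
element describing a strided section could carry. The layout, the names
and the rank limit are all assumptions made up for this example; this is
neither gfortran's descriptor nor an existing libgomp interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on the number of dimensions.  */
enum { SKETCH_MAX_RANK = 7 };

/* One dimension: number of elements and distance between them in bytes.  */
struct sketch_dim
{
  size_t extent;
  ptrdiff_t stride;
};

/* Hypothetical descriptor for a possibly non-contiguous array section.  */
struct sketch_desc
{
  void *base;		/* Address of the first selected element.  */
  size_t elem_size;	/* Size of one element in bytes.  */
  int rank;		/* Number of dimensions in use.  */
  struct sketch_dim dim[SKETCH_MAX_RANK];
};

/* Total number of elements the descriptor covers.  */
static size_t
desc_num_elems (const struct sketch_desc *d)
{
  size_t n = 1;
  for (int r = 0; r < d->rank; r++)
    n *= d->dim[r].extent;
  return n;
}

/* Host address of the element at index idx[0 .. rank-1].  */
static void *
desc_elem_addr (const struct sketch_desc *d, const size_t *idx)
{
  char *p = d->base;
  for (int r = 0; r < d->rank; r++)
    {
      assert (idx[r] < d->dim[r].extent);
      p += (ptrdiff_t) idx[r] * d->dim[r].stride;
    }
  return p;
}
```

A runtime could walk such a descriptor to copy exactly the selected
elements, so one map entry would stand in for what would otherwise be
one entry per contiguous piece, and no callback into the parent function
is needed.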

        Jakub
