Hi Jakub, thanks for the prompt review.

On 2020/10/23 8:13 PM, Jakub Jelinek wrote:
In general, upon encountering a construct, we can't statically determine and 
insert alloc/release maps
for each element of a structure variable, since we don't really know which 
region of the structure is
currently mapped or not, hence this probably can't be properly implemented in 
the compiler.

Instead this patch tries to do the equivalent in the runtime: I've modified the 
handling of the
(GOMP_MAP_STRUCT, <field-map1>, <field-map2>, ...) sequence to:

   (1) Create just a single splay_tree_key to represent the entire structure's 
mapped-region
       (all element target_var_desc's now reference this same key instead of 
creating their own), and
I'm not sure that is what we want.  If we create just a single
splay_tree_key spanning the whole structure mapped region, then we can't
diagnose various mapping errors.  E.g. if I have:
void bar (struct S *);
struct S { int a, b, c, d, e; };
void foo (struct S s)
{
   #pragma omp target data map(tofrom: s.b, s.d)
   #pragma omp target map (s.b, s.c)
   bar (&s);
}
then target data maps the &s.b to &s.d + 1 region of the struct, but s.c
wasn't mapped and so the target region's mapping should fail, even when it
is in the middle of the mapped region.

Are you really sure this is what we want? I don't quite see anything harmful
about implicitly mapping "middle fields" like s.c, in fact the corresponding
memory is actually "mapped" anyways.

The structure mapping wording was written in a way to give implementations a
choice, either map the whole struct (space inefficient), or the region from
the first to last element in the struct the needs mapping (what GCC
implements, also space inefficient, but less so), or only map the fields
individually and somehow remap all uses of the struct in the region (I think
that is only theoretically possible if one can analyze the whole target
region and rewrite anything that could refer to it in there).

That seems to imply that rejecting "middle fields" are not really required
behavior.

So, I'd think instead of having just one splay_tree_key, we need multiple
(we could merge adjacent ones though if we want) but we need some way to tie
them together (e.g. represent them as one master entry (perhaps the first one) 
and
slaves entries and use the refcount of the master entry for all of them.

I did think of that route before, but it's just too complex and unwieldly 
compared
to an elegant solution like using one single splay_tree_key. I can try to think
more about a "composite-key" like design, but please reconsider the current 
patch.
It's already very close to the 5.0 spec, with what you mention not really "that"
large an issue.

Thanks,
Chung-Lin

Reply via email to