Re: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]

2023-02-28 Thread Thomas Schwinge
Hi!

I'm currently reviewing 'gomp_copy_host2dev', 'ephemeral' in a different
context, and a question came up here;
commit r13-706-g49d1a2f91325fa8cc011149e27e5093a988b3a49
"OpenMP: Handle descriptors in target's firstprivate [PR104949]":

On 2022-05-11T19:33:00+0200, Tobias Burnus  wrote:
> this patch handles (for target regions)
>firstprivate(array_descriptor)
> by not only firstprivatizing the descriptor but also the data
> it points to. This is done by turning it in omp-low.cc the clause
> into
>firstprivate(descr) firstprivate(descr.data)
> and then attaching the latter to the former. That works by
> adding an 'attach' after the last firstprivate (and checking
> for it in libgomp). The attached-to device address for a
> previous (here: the first) firstprivate is obtained by returning
> the device address inside the hostaddrs[i] alias omp_arr array,
> i.e. the compiler generates:
>omp_arr.1 = &descr;  /* firstprivate */
>omp_arr.2 = descr.data;  /* firstprivate */
>omp_arr.3 = &omp_arr.1;  /* attach; bias: &desc.data-&desc */
> and libgomp then knows that the device address is in the
> pointer.

> Note: The code is not active for OpenACC. The existing code uses, e.g.,
> 'goto oacc_firstprivate' – thus, the new code would be
> partially active. I went for making it completely inactive for OpenACC
> by adding one '!is_gimple_omp_oacc'.

ACK.

> I bet that a deep copy would be
> also useful for OpenACC, but I have neither checked what the current
> code does nor what the OpenACC spec says about this.

Instead of adding corresponding handling to the OpenACC 'firstprivate'
special code paths later on, I suggest that we first address known issues
with OpenACC 'firstprivate' -- which probably may largely be achieved by
in fact removing those 'goto oacc_firstprivate's and other special code
paths?  For example, see 
"OpenACC 'firstprivate' clause: initial value".

That means, the following code currently isn't active for OpenACC, and
given that OpenMP 'target' doesn't do asynchronous device execution
(meaning: not in the way/implementation of OpenACC 'async'), it thus
doesn't care about the 'ephemeral' argument to 'gomp_copy_host2dev', but
still, for correctness (and once that code gets used for OpenACC):

> OpenMP: Handle descriptors in target's firstprivate [PR104949]
>
> For allocatable/pointer arrays, a firstprivate to a device
> not only needs to privatize the descriptor but also the actual
> data. This is implemented as:
>   firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x)
> where the address of x in device memory is saved in hostaddrs[i]
> by libgomp and the middle end actually passes hostaddrs[i]' to
> attach.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1350,7 +1350,24 @@ gomp_map_vars_internal (struct gomp_device_descr 
> *devicep,
>   gomp_copy_host2dev (devicep, aq,
>   (void *) (tgt->tgt_start + tgt_size),
>   (void *) hostaddrs[i], len, false, cbufp);

Here, passing 'ephemeral <- false' is correct, as 'h <- hostaddrs[i]'
points to non-ephemeral data.

> + /* Save device address in hostaddr to permit latter availablity
> +when doing a deep-firstprivate with pointer attach.  */
> + hostaddrs[i] = (void *) (tgt->tgt_start + tgt_size);

Here, we modify 'hostaddrs[i]' (itself -- *not* the data that the
original 'hostaddrs[i]' points to), so the above 'gomp_copy_host2dev'
with 'ephemeral <- false' is still correct, right?

>   tgt_size += len;
> +
> + /* If followed by GOMP_MAP_ATTACH, pointer assign this
> +firstprivate to hostaddrs[i+1], which is assumed to contain a
> +device address.  */
> + if (i + 1 < mapnum
> + && (GOMP_MAP_ATTACH
> + == (typemask & get_kind (short_mapkind, kinds, i+1
> +   {
> + uintptr_t target = (uintptr_t) hostaddrs[i];
> + void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1];
> + gomp_copy_host2dev (devicep, aq, devptr, &target,
> + sizeof (void *), false, cbufp);

However, 'h <- &target' here points to data in the local frame
('target'), which potentially goes out of scope before an asynchronous
'gomp_copy_host2dev' has completed.  Thus, don't we have to pass here
'ephemeral <- true' instead of 'ephemeral <- false'?  Or, actually
instead of '&target', pass '&hostaddrs[i]', which then again points to
non-ephemeral data?  Is the latter safe to do, or are we potentially
further down the line modifying the data that '&hostaddrs[i]' points to?
(I got a bit lost in the use of 'hostaddrs[i]' here.)

> + ++i;
> +   }
>   continue;
> case GOMP_MAP_FIRSTPRIVATE_INT:
> case GOMP_MAP_ZERO_LEN_ARRAY_SECTION:
> @@ -2517,6 +2534,11 @@ 

[Patch] OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546]

2023-02-28 Thread Tobias Burnus

(That's a[11/12/13 Regression] P2 regression) The problem is that an is-present
check on the receiver side (inside the target region) does not make much
sense; the != NULL check needs to be done before the GOMP_target_ext but
it *is* already there. (Having the check inside the target region failed
with an ICE.) Additionally, I encountered an issue for 'void *' alias
'type(c_ptr)'. That's on the Fortran-language side represented as
derived type with private component(s), but then mapped to 'void*'. In
any case, it lead to 'map(to: p) map(pset: p)' and the former will be at
some point get TYPE_UNIT_SIZE(TREE_TYPE(*p)) but '*p' is void, giving an
ICE in omp-lower ... OK for mainline – and later backport to 12 + 11?
Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546]

For is_device_ptr, optional checks should only be done before calling
libgomp, afterwards they are NULL either because of absent or, by
chance, because it is unallocated or unassociated (for pointers/allocatables).

Additionally, it fixes an issue with explicit mapping for 'type(c_ptr)'.

	PR middle-end/108546

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of
	type(C_ptr) variables.

gcc/ChangeLog:

	* omp-low.cc (lower_omp_target): Remove optional handling
	on the receiver side, i.e. inside target (data), for
	use_device_ptr.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/is_device_ptr-3.f90: New test.
	* testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test.

 gcc/fortran/trans-openmp.cc|  4 +-
 gcc/omp-low.cc |  3 +-
 .../testsuite/libgomp.fortran/is_device_ptr-3.f90  | 46 +++
 .../libgomp.fortran/use_device_ptr-optional-4.f90  | 53 ++
 4 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 2d16f3be8ea..84c0184f48e 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3152,7 +3152,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			   || GFC_DECL_CRAY_POINTEE (decl)
 			   || GFC_DESCRIPTOR_TYPE_P
 	 (TREE_TYPE (TREE_TYPE (decl)))
-			   || n->sym->ts.type == BT_DERIVED))
+			   || (n->sym->ts.type == BT_DERIVED
+   && (n->sym->ts.u.derived->ts.f90_type
+   != BT_VOID
 		{
 		  tree orig_decl = decl;
 
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index fef41a013ec..9757592c635 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -13942,7 +13942,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  }
 	tree present;
 	present = ((do_optional_check
-			&& OMP_CLAUSE_CODE (c) != OMP_CLAUSE_HAS_DEVICE_ADDR)
+			&& OMP_CLAUSE_CODE (c) != OMP_CLAUSE_HAS_DEVICE_ADDR
+			&& OMP_CLAUSE_CODE (c) != OMP_CLAUSE_IS_DEVICE_PTR)
 		   ? omp_check_optional_argument (OMP_CLAUSE_DECL (c), true)
 		   : NULL_TREE);
 	if (present)
diff --git a/libgomp/testsuite/libgomp.fortran/is_device_ptr-3.f90 b/libgomp/testsuite/libgomp.fortran/is_device_ptr-3.f90
new file mode 100644
index 000..ab9f00ebecb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/is_device_ptr-3.f90
@@ -0,0 +1,46 @@
+module m
+   use iso_c_binding
+   implicit none
+contains
+   subroutine s(x,y,z)
+  type(c_ptr), optional :: x
+  integer, pointer, optional :: y
+  integer, allocatable, optional :: z
+  logical is_present, is_null
+  is_present = present(x)
+  if (is_present) &
+is_null = .not. c_associated(x)
+
+  !$omp target is_device_ptr(x) has_device_addr(y) has_device_addr(z)
+if (is_present) then
+  if (is_null) then
+if (c_associated(x)) stop 1
+if (associated(y)) stop 2
+if (allocated(z)) stop 3
+  else
+if (.not. c_associated(x, c_loc(y))) stop 4
+if (y /= 7) stop 5
+if (z /= 9) stop 6
+  end if
+end if
+  !$omp end target
+   end
+end
+
+use m
+implicit none
+integer, pointer :: p
+integer, allocatable :: a
+p => null()
+call s()
+!$omp target data map(p,a) use_device_addr(p,a)
+  call s(c_null_ptr, p, a)
+!$omp end target data
+allocate(p,a)
+p = 7
+a = 9
+!$omp target data map(p,a) use_device_addr(p,a)
+  call s(c_loc(p), p, a)
+!$omp end target data
+deallocate(p,a)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-4.f90 b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-4.f90
new file mode 100644
index 000..b2a5c314685
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-4.f90
@@ -0,0 +1,53 @@
+! PR middle-end/108