[PING] [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct

2022-01-10 Thread Marcel Vollweiler

Hi,

I'd like to ping the patch for the OpenMP 'has_device_addr' clause on
the target construct:

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585361.html


Thanks
Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct

2022-02-02 Thread Marcel Vollweiler

Hi Jakub,


+case OMP_CLAUSE_HAS_DEVICE_ADDR:
+  t = OMP_CLAUSE_DECL (c);
+  if (TREE_CODE (t) == TREE_LIST)
+{
+  if (handle_omp_array_sections (c, ort))
+remove = true;
+  else
+{
+  t = OMP_CLAUSE_DECL (c);
+  while (TREE_CODE (t) == ARRAY_REF)
+t = TREE_OPERAND (t, 0);
+}
+}
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR)
+bitmap_set_bit (&is_on_device_head, DECL_UID (t));


Why the OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR check?
There is no goto into this block nor fallthru into it, and
handle_omp_array_sections better shouldn't change OMP_CLAUSE_CODE.


Good point. Removed.




   goto check_dup_generic;

+case OMP_CLAUSE_HAS_DEVICE_ADDR:
+  t = OMP_CLAUSE_DECL (c);
+  if (TREE_CODE (t) == TREE_LIST)
+if (handle_omp_array_sections (c, ort))
+  remove = true;
+else
+  {
+t = OMP_CLAUSE_DECL (c);
+while (TREE_CODE (t) == ARRAY_REF)
+  t = TREE_OPERAND (t, 0);
+  }
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR)
+bitmap_set_bit (&is_on_device_head, DECL_UID (t));


Likewise.


Removed.




+  if (VAR_P (t) || TREE_CODE (t) == PARM_DECL)
+cxx_mark_addressable (t);
+  goto check_dup_generic_t;
+
 case OMP_CLAUSE_USE_DEVICE_ADDR:
   field_ok = true;
   t = OMP_CLAUSE_DECL (c);



--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1391,7 +1391,8 @@ enum
OMP_LIST_USE_DEVICE_PTR,
OMP_LIST_USE_DEVICE_ADDR,
OMP_LIST_NONTEMPORAL,
-  OMP_LIST_NUM
+  OMP_LIST_HAS_DEVICE_ADDR,
+  OMP_LIST_NUM  /* must be the last  */


Capital M and . at the end.


Changed.




@@ -2077,6 +2078,12 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
 }
   break;
 case 'h':
+  if ((mask & OMP_CLAUSE_HAS_DEVICE_ADDR)
+  && gfc_match_omp_variable_list
+   ("has_device_addr (",
+&c->lists[OMP_LIST_HAS_DEVICE_ADDR], false, NULL, NULL,
+ true) == MATCH_YES)


Formatting, true should be IMO below &c->lists.


Corrected the formatting.




+continue;
   if ((mask & OMP_CLAUSE_HINT)
   && (m = gfc_match_dupl_check (!c->hint, "hint", true, &c->hint))
  != MATCH_NO)
@@ -2850,7 +2857,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
   if ((mask & OMP_CLAUSE_USE_DEVICE_ADDR)
   && gfc_match_omp_variable_list
("use_device_addr (",
-&c->lists[OMP_LIST_USE_DEVICE_ADDR], false) == MATCH_YES)
+&c->lists[OMP_LIST_USE_DEVICE_ADDR], false, NULL, NULL,
+ true) == MATCH_YES)


Likewise.


Corrected.




--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1910,7 +1910,17 @@ gfc_trans_omp_variable_list (enum omp_clause_code code,
 tree t = gfc_trans_omp_variable (namelist->sym, declare_simd);
 if (t != error_mark_node)
   {
-tree node = build_omp_clause (input_location, code);
+tree node;
+/* For HAS_DEVICE_ADDR of an array descriptor, firstprivatize the
+   descriptor such that the bounds are available; its data component
+   is unmodified; it is handled as device address inside target. */
+if (code == OMP_CLAUSE_HAS_DEVICE_ADDR
+&& (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (t))
+|| (POINTER_TYPE_P (TREE_TYPE (t))
+&& GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (t))
+  node = build_omp_clause (input_location, OMP_CLAUSE_FIRSTPRIVATE);


Not sure about the above,


This is needed for allocatable arrays and array pointers to ensure that
not only the (array) data is (already) present on the device but also
the array descriptor. Otherwise the test cases
target-has-device-addr-2.f90, target-has-device-addr-3.f90 (because of
variable "c") and target-has-device-addr-4.f90 (also because of variable
"c") won't work.




--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -10024,6 +10024,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
   flags = GOVD_EXPLICIT;
   goto do_add;

+case OMP_CLAUSE_HAS_DEVICE_ADDR:
+  decl = OMP_CLAUSE_DECL (c);
+  if (TREE_CODE (decl) == ARRAY_REF)
+{
+  flags = GOVD_FIRSTPRIVATE | GOVD_EXPLICIT;
+  while (TREE_CODE (decl) == ARRAY_REF)
+decl = TREE_OPERAND (decl, 0);
+  goto do_add_decl;


but this looks weird.
If decl after stripping the ARRAY_REFs is a var with pointer type, sure,
firstprivatizing it is the way to go.
But it can be also a variable with ARRAY_TYPE, can't it?  Something like:
   int a[64];
   #pragma omp target data map(a) use_device_addr(a)
   {
 #pragma omp target has_device_addr(a[3:16])
 a[3] = 1;
   }
and 

[PATCH] OpenMP, libgomp: Add new runtime routines omp_target_memcpy_async and omp_target_memcpy_rect_async

2022-02-21 Thread Marcel Vollweiler

Hi,

This patch adds two new OpenMP runtime routines: omp_target_memcpy_async and
omp_target_memcpy_rect_async. Both functions are introduced in OpenMP 5.1 as
asynchronous variants of omp_target_memcpy and omp_target_memcpy_rect.

In contrast to the synchronous variants, the asynchronous functions have two
additional function parameters to allow the specification of task dependences:

   int depobj_count
   omp_depend_t *depobj_list

   integer(c_int), value :: depobj_count
   integer(omp_depend_kind), optional :: depobj_list(*)

The implementation splits the synchronous functions into two parts: (a) check
and (b) copy. Then (a) is used in the asynchronous functions for the sequential
part, and the actual copy process (b) is executed in a new created task. The
sequential part (a) takes into account the requirements for the return values:

"The routine returns zero if successful. Otherwise, it returns a non-zero
value." (omp_target_memcpy_async, OpenMP 5.1 spec, section 3.8.7)

"An application can determine the number of inclusive dimensions supported by an
implementation by passing NULL pointers (or C_NULL_PTR, for Fortran) for both
dst and src. The routine returns the number of dimensions supported by the
implementation for the specified device numbers. No copy operation is
performed." (omp_target_memcpy_rect_async, OpenMP 5.1 spec, section 3.8.8)

Due to asynchronicity an error is thrown if the asynchronous memcpy is not
successful (in contrast to the synchronous functions which use a return
value unequal to zero).

The patch was tested on x86_64-linux with nvptx and amdgcn offloading and with
PowerPC with nvptx offloading. All with no regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, libgomp: Add new runtime routines omp_target_memcpy_async and
omp_target_memcpy_rect_async.

This patch adds two new OpenMP runtime routines: omp_target_memcpy_async and
omp_target_memcpy_rect_async. Both functions are introduced in OpenMP 5.1 as
asynchronous variants of omp_target_memcpy and omp_target_memcpy_rect.

In contrast to the synchronous variants, the asynchronous functions have two
additional function parameters to allow the specification of task dependences:

int depobj_count
omp_depend_t *depobj_list

integer(c_int), value :: depobj_count
integer(omp_depend_kind), optional :: depobj_list(*)

The implementation splits the synchronous functions into two parts: (a) check
and (b) copy. Then (a) is used in the asynchronous functions for the sequential
part, and the actual copy process (b) is executed in a new created task. The
sequential part (a) takes into account the requirements for the return values: 

"The routine returns zero if successful. Otherwise, it returns a non-zero
value." (omp_target_memcpy_async, OpenMP 5.1 spec, section 3.8.7)

"An application can determine the number of inclusive dimensions supported by an
implementation by passing NULL pointers (or C_NULL_PTR, for Fortran) for both
dst and src. The routine returns the number of dimensions supported by the
implementation for the specified device numbers. No copy operation is
performed." (omp_target_memcpy_rect_async, OpenMP 5.1 spec, section 3.8.8)

Due to asynchronicity an error is thrown if the asynchronous memcpy is not
successful (in contrast to the synchronous functions which use a return
value unequal to zero).

gcc/ChangeLog:

* omp-low.cc (omp_runtime_api_call): Added target_memcpy_async and
target_memcpy_rect_async to omp_runtime_apis array.

libgomp/ChangeLog:

* libgomp.map: Added omp_target_memcpy_async and
omp_target_memcpy_rect_async.
* libgomp.texi: Both functions are now supported.
* omp.h.in: Added omp_target_memcpy_async and
omp_target_memcpy_rect_async.
* omp_lib.f90.in: Added interfaces for both new functions.
* omp_lib.h.in: Likewise.
* target.c (omp_target_memcpy): Restructured into check and copy part.
(omp_target_memcpy_check): New helper function for omp_target_memcpy and
omp_target_memcpy_async that checks requirements.
(omp_target_memcpy_copy): New helper function for omp_target_memcpy and
omp_target_memcpy_async that performs the memcpy.
(omp_target_memcpy_async_helper): New helper function that is used in
omp_target_memcpy_async for the asynchronous task.
(omp_target_memcpy_async): Added.
(omp_target_memcpy_rect): Restructured into check and copy part.
(omp_target_memcpy_rect_check): New helper function for
omp_target_memcpy_rect and omp_target_memcpy_rect_async that checks
requirements.
(omp_target_memcpy_rect_copy): New helper function for
omp_target_m

[PATCH] OpenMP, libgomp: Add new runtime routine omp_get_mapped_ptr.

2022-03-04 Thread Marcel Vollweiler

Hi,

This patch adds the OpenMP runtime routine "omp_get_mapped_ptr" which was
introduced in OpenMP 5.1 (specification section 3.8.11):

"The omp_get_mapped_ptr routine returns the device pointer that is associated
with a host pointer for a given device."

"The device_num argument must be greater than or equal to zero and less than or
equal to the result of omp_get_num_devices()."

"A call to this routine for a pointer that is not NULL (or C_NULL_PTR, for
Fortran) and does not have an associated pointer on the given device results in
a NULL pointer."

"The routine returns NULL (or C_NULL_PTR, for Fortran) if unsuccessful.
Otherwise it returns the device pointer, which is ptr if device_num is the value
returned by omp_get_initial_device()."

Implementation and tests were added for C/C++ and Fortran.

There is a small inconvenience considering zero-length arrays as list items of
the "target map" construct: it seems that zero-length arrays are not associated
correctly there, such that omp_get_mapped_ptr returns NULL instead of the
associated device pointer - in contrast to the situation where a device pointer
is associated with the host pointer via omp_target_associate_ptr.
However, the result for omp_get_mapped_ptr is consistent with
omp_target_is_present (which returns 0, i.e. "not present") in this situation.

The patch was tested on x86_64-linux with nvptx and amdgcn offloading. All with
no regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, libgomp: Add new runtime routine omp_get_mapped_ptr.

libgomp/ChangeLog:

* libgomp.map: Added omp_get_mapped_ptr.
* libgomp.texi: Tagged omp_get_mapped_ptr as supported.
* omp.h.in: Added omp_get_mapped_ptr.
* omp_lib.f90.in: Added interface for omp_get_mapped_ptr.
* omp_lib.h.in: Likewise.
* target.c (omp_get_mapped_ptr): Added implementation of
omp_get_mapped_ptr.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-1.c: New test.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-2.c: New test.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-3.c: New test.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-4.c: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-1.f90: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-2.f90: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-3.f90: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-4.f90: New test.

diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..00a4858 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -224,6 +224,7 @@ OMP_5.1 {
omp_set_teams_thread_limit_8_;
omp_get_teams_thread_limit;
omp_get_teams_thread_limit_;
+   omp_get_mapped_ptr;
 } OMP_5.0.2;
 
 GOMP_1.0 {
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..c163b56 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -314,7 +314,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{omp_target_is_accessible} runtime routine @tab N @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
   runtime routines @tab N @tab
-@item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
+@item @code{omp_get_mapped_ptr} runtime routine @tab Y @tab
 @item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
   @code{omp_aligned_calloc} runtime routines @tab Y @tab
 @item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..18d0152 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,7 @@ extern int omp_target_memcpy_rect (void *, const void *, 
__SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern void *omp_get_mapped_ptr (const void *, int) __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..506f15c 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,15 @@
   end function omp_target_disassociate_ptr
 end interface
 
+interface
+  function omp_get_mapped_ptr (ptr, device_num) bind(c)
+use, intrinsic :: iso_c_binding, only : c_ptr, c_int
+type(c_ptr) :: omp_get_mapped_ptr
+type(c_ptr), value :: ptr
+integer(c_int), value :: device_num
+  end function omp_get_mapped_ptr
+end

Re: [PATCH] OpenMP, libgomp: Add new runtime routine omp_get_mapped_ptr.

2022-03-10 Thread Marcel Vollweiler

Hi Jakub,

This is an update to the patch from Tue Mar 8:

https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591343.html

I just added "get_mapped_ptr" to the "omp_runtime_apis" array in omp-low.cc and
replaced "omp_get_num_devices" by "gomp_get_num_devices" in target.c.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, libgomp: Add new runtime routine omp_get_mapped_ptr.

gcc/ChangeLog:

* omp-low.cc (omp_runtime_api_call): Added get_mapped_ptr to
omp_runtime_apis array.

libgomp/ChangeLog:

* libgomp.map: Added omp_get_mapped_ptr.
* libgomp.texi: Tagged omp_get_mapped_ptr as supported.
* omp.h.in: Added omp_get_mapped_ptr.
* omp_lib.f90.in: Added interface for omp_get_mapped_ptr.
* omp_lib.h.in: Likewise.
* target.c (omp_get_mapped_ptr): Added implementation of
omp_get_mapped_ptr.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-1.c: New test.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-2.c: New test.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-3.c: New test.
* testsuite/libgomp.c-c++-common/get-mapped-ptr-4.c: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-1.f90: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-2.f90: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-3.f90: New test.
* testsuite/libgomp.fortran/get-mapped-ptr-4.f90: New test.

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 77176ef..02a0f72 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -3962,6 +3962,7 @@ omp_runtime_api_call (const_tree fndecl)
   "target_is_present",
   "target_memcpy",
   "target_memcpy_rect",
+  "get_mapped_ptr",
   NULL,
   /* Now omp_* calls that are available as omp_* and omp_*_; however, the
 DECL_NAME is always omp_* without tailing underscore.  */
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..608a54c 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@ OMP_5.1 {
omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
+OMP_5.1.1 {
+  global:
+   omp_get_mapped_ptr;
+} OMP_5.1;
+
 GOMP_1.0 {
   global:
GOMP_atomic_end;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..c163b56 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -314,7 +314,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{omp_target_is_accessible} runtime routine @tab N @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
   runtime routines @tab N @tab
-@item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
+@item @code{omp_get_mapped_ptr} runtime routine @tab Y @tab
 @item @code{omp_calloc}, @code{omp_realloc}, @code{omp_aligned_alloc} and
   @code{omp_aligned_calloc} runtime routines @tab Y @tab
 @item @code{omp_alloctrait_key_t} enum: @code{omp_atv_serialized} added,
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..18d0152 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,7 @@ extern int omp_target_memcpy_rect (void *, const void *, 
__SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern void *omp_get_mapped_ptr (const void *, int) __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..506f15c 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,15 @@
   end function omp_target_disassociate_ptr
 end interface
 
+interface
+  function omp_get_mapped_ptr (ptr, device_num) bind(c)
+use, intrinsic :: iso_c_binding, only : c_ptr, c_int
+type(c_ptr) :: omp_get_mapped_ptr
+type(c_ptr), value :: ptr
+integer(c_int), value :: device_num
+  end function omp_get_mapped_ptr
+end interface
+
 #if _OPENMP >= 201811
 !GCC$ ATTRIBUTES DEPRECATED :: omp_get_nested, omp_set_nested
 #endif
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index ff857a4..0f48510 100644
--- a/libgomp/omp_lib.h.in
+++ b/libgomp/omp_lib.h.in
@@ -416,3 +416,12 @@
   integer(c_int), value :: device_num
 end function omp_target_disassociate_ptr
   end interface
+
+  interface
+function omp_get_mapped_ptr (ptr, device_num) bind(c)
+  use, intrinsic :: iso_c_binding, only : c_ptr, c_int
+  type(c_ptr) :: omp_get_mapped_ptr
+  t

[Patch] OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

2022-03-11 Thread Marcel Vollweiler

Hi,

This patch adds the OpenMP runtime routine "omp_target_is_accessible" which was
introduced in OpenMP 5.1 (specification section 3.8.4):

"The omp_target_is_accessible routine tests whether host memory is accessible
from a given device."

"This routine returns true if the storage of size bytes starting at the address
given by ptr is accessible from device device_num. Otherwise, it returns false."

"The value of ptr must be a valid host pointer or NULL (or C_NULL_PTR, for
Fortran). The device_num argument must be greater than or equal to zero and less
than or equal to the result of omp_get_num_devices()."

"When called from within a target region the effect is unspecified."

Currently, the only way of accessing host memory on a non-host device is via
shared memory. This will change with unified shared memory (usm) that was
recently submitted but not yet approved/committed. A follow-up patch for
omp_target_is_accessible is planned considering usm when available. The current
patch handles the basic implementation for C/C++ and Fortran and includes
comments pointing to usm.

Although not explicitly specified in the OpenMP 5.1 standard, the implemented
function returns "true" if the given device_num is equal to
"omp_get_num_devices" (i.e. the host) as it is expected that host memory can be
accessed from the host device.

The patch was tested on x86_64-linux and PowerPC, both with nvptx offloading.
All with no regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

gcc/ChangeLog:

* omp-low.cc (omp_runtime_api_call): Added target_is_accessible to
omp_runtime_apis array.

libgomp/ChangeLog:

* libgomp.map: Added omp_target_is_accessible.
* libgomp.texi: Tagged omp_target_is_accessible as supported.
* omp.h.in: Added omp_target_is_accessible.
* omp_lib.f90.in: Added interface for omp_target_is_accessible.
* omp_lib.h.in: Likewise.
* target.c (omp_target_is_accessible): Added implementation of
omp_target_is_accessible.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: New test.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: New test.

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 77176ef..bf38fad 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -3959,6 +3959,7 @@ omp_runtime_api_call (const_tree fndecl)
   "target_associate_ptr",
   "target_disassociate_ptr",
   "target_free",
+  "target_is_accessible",
   "target_is_present",
   "target_memcpy",
   "target_memcpy_rect",
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..1764380 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@ OMP_5.1 {
omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
+OMP_5.1.1 {
+  global:
+   omp_target_is_accessible;
+} OMP_5.1;
+
 GOMP_1.0 {
   global:
GOMP_atomic_end;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..58e432c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -311,7 +311,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
   @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
   routines @tab Y @tab
-@item @code{omp_target_is_accessible} runtime routine @tab N @tab
+@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
   runtime routines @tab N @tab
 @item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..1ec7415 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,8 @@ extern int omp_target_memcpy_rect (void *, const void *, 
__SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern int omp_target_is_accessible (const void *, __SIZE_TYPE__, int)
+  __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..f369507 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,16 @@
   end function omp_target_disassociate_ptr
 end interface
 
+interface
+  function omp_target_is_accessible (ptr, size, device_num) bind(c)
+use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+inte

Re: [Patch] OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

2022-03-14 Thread Marcel Vollweiler

Hi Tobias,


Minor remark to the test:

On 11.03.22 13:30, Marcel Vollweiler wrote:

+  int d = omp_get_default_device ();

...

+  int shared_mem = 0;
+  #pragma omp target map (alloc: shared_mem) device (d)
+shared_mem = 1;
+  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
+__builtin_abort ();


I wonder whether it makes sense to do instead
   for (d = 0; d <= omp_get_num_devices(); ++d)
instead of just
   d = omp_get_default_device();
given that we have already found once in a while bugs when testing more
than just the default device - be it because devices differed or because
'0' was special.

In particular, I could image having at the same time two or three devices
available of type intelmic + gcn + nvptx, possibly mixing shared memory,
nonshared memory and semi-shared memory*


Good hint, thanks. I updated the C(++) and Fortran tests accordingly and
attached the updated patch.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

gcc/ChangeLog:

* omp-low.cc (omp_runtime_api_call): Added target_is_accessible to
omp_runtime_apis array.

libgomp/ChangeLog:

* libgomp.map: Added omp_target_is_accessible.
* libgomp.texi: Tagged omp_target_is_accessible as supported.
* omp.h.in: Added omp_target_is_accessible.
* omp_lib.f90.in: Added interface for omp_target_is_accessible.
* omp_lib.h.in: Likewise.
* target.c (omp_target_is_accessible): Added implementation of
omp_target_is_accessible.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: New test.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: New test.

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 77176ef..bf38fad 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -3959,6 +3959,7 @@ omp_runtime_api_call (const_tree fndecl)
   "target_associate_ptr",
   "target_disassociate_ptr",
   "target_free",
+  "target_is_accessible",
   "target_is_present",
   "target_memcpy",
   "target_memcpy_rect",
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..1764380 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@ OMP_5.1 {
omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
+OMP_5.1.1 {
+  global:
+   omp_target_is_accessible;
+} OMP_5.1;
+
 GOMP_1.0 {
   global:
GOMP_atomic_end;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..58e432c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -311,7 +311,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
   @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
   routines @tab Y @tab
-@item @code{omp_target_is_accessible} runtime routine @tab N @tab
+@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
   runtime routines @tab N @tab
 @item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..1ec7415 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,8 @@ extern int omp_target_memcpy_rect (void *, const void *, 
__SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern int omp_target_is_accessible (const void *, __SIZE_TYPE__, int)
+  __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..f369507 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,16 @@
   end function omp_target_disassociate_ptr
 end interface
 
+interface
+  function omp_target_is_accessible (ptr, size, device_num) bind(c)
+use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+integer(c_int) :: omp_target_is_accessible
+type(c_ptr), value :: ptr
+integer(c_size_t), value :: size
+integer(c_int), value :: device_num
+  end function omp_target_is_accessible
+end interface
+
 #if _OPENMP >= 201811
 !GCC$ ATTRIBUTES DEPRECATED :: omp_get_nested, omp_set_nested
 #endif
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index ff857a4..5ea0366 100644
--- a/l

[PATCH] OpenMP, Fortran: Bugfix for omp_set_num_teams.

2022-03-15 Thread Marcel Vollweiler

Hi,

This patch fixes a small bug for omp_set_num_teams in fortran.c.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, Fortran: Bugfix for omp_set_num_teams.

This patch fixes a small bug in the omp_set_num_teams implementation.

libgomp/ChangeLog:

* fortran.c (omp_set_num_teams_8_): Fix bug.

diff --git a/libgomp/fortran.c b/libgomp/fortran.c
index 8c1cfd1..d984ce5 100644
--- a/libgomp/fortran.c
+++ b/libgomp/fortran.c
@@ -491,7 +491,7 @@ omp_set_num_teams_ (const int32_t *num_teams)
 void
 omp_set_num_teams_8_ (const int64_t *num_teams)
 {
-  omp_set_max_active_levels (TO_INT (*num_teams));
+  omp_set_num_teams (TO_INT (*num_teams));
 }
 
 int32_t


Re: [PATCH] OpenMP, Fortran: Bugfix for omp_set_num_teams.

2022-03-16 Thread Marcel Vollweiler

Hi Jakub,


! { dg-do run }
! { dg-additional-options "-fdefault-integer-8" }

program set_num_teams_8
   use omp_lib
   omp_set_num_teams (42)
   if (omp_get_num_teams () .ne. 42) stop 1
end program


I modified your suggested test case a bit:

program set_num_teams_8
  use omp_lib
  use, intrinsic :: iso_fortran_env
  integer(int64) :: x
  x = 42
  call omp_set_num_teams (x)
  if (omp_get_max_teams () .ne. 42) stop 1
end program

I tested it with/without the fix and the test passed/failed as expected.

Hope, that's ok?

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, Fortran: Bugfix for omp_set_num_teams.

This patch fixes a small bug in the omp_set_num_teams implementation.

libgomp/ChangeLog:

* fortran.c (omp_set_num_teams_8_): Fix bug.
* testsuite/libgomp.fortran/icv-8.f90: New test.

diff --git a/libgomp/fortran.c b/libgomp/fortran.c
index 8c1cfd1..d984ce5 100644
--- a/libgomp/fortran.c
+++ b/libgomp/fortran.c
@@ -491,7 +491,7 @@ omp_set_num_teams_ (const int32_t *num_teams)
 void
 omp_set_num_teams_8_ (const int64_t *num_teams)
 {
-  omp_set_max_active_levels (TO_INT (*num_teams));
+  omp_set_num_teams (TO_INT (*num_teams));
 }
 
 int32_t
diff --git a/libgomp/testsuite/libgomp.fortran/icv-8.f90 
b/libgomp/testsuite/libgomp.fortran/icv-8.f90
new file mode 100644
index 000..9478c15
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/icv-8.f90
@@ -0,0 +1,10 @@
+! This tests 'set_num_teams_8' function.
+
+program set_num_teams_8
+  use omp_lib
+  use, intrinsic :: iso_fortran_env
+  integer(int64) :: x
+  x = 42
+  call omp_set_num_teams (x)
+  if (omp_get_max_teams () .ne. 42) stop 1
+end program


Re: [Patch] OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

2022-05-06 Thread Marcel Vollweiler

Hi Jakub,

Am 05.05.2022 um 11:33 schrieb Jakub Jelinek:

On Mon, Mar 14, 2022 at 04:42:14PM +0100, Marcel Vollweiler wrote:

--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@ OMP_5.1 {
 omp_get_teams_thread_limit_;
  } OMP_5.0.2;

+OMP_5.1.1 {
+  global:
+omp_target_is_accessible;
+} OMP_5.1;
+


You've already added another OMP_5.1.1 symbol, so this hunk will need to be
adjusted.  Keep the names in there alphabetically sorted.


Adjusted.


--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,16 @@
end function omp_target_disassociate_ptr
  end interface

+interface
+  function omp_target_is_accessible (ptr, size, device_num) bind(c)
+use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+integer(c_int) :: omp_target_is_accessible


The function returning integer(c_int) rather than logical seems like
a screw up in the standard, but too late to fix that :(.


--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3666,6 +3666,24 @@ omp_target_disassociate_ptr (const void *ptr, int 
device_num)
  }

  int
+omp_target_is_accessible (const void *ptr, size_t size, int device_num)
+{
+  if (device_num < 0 || device_num > gomp_get_num_devices ())
+return false;
+
+  if (device_num == gomp_get_num_devices ())
+return true;
+
+  struct gomp_device_descr *devicep = resolve_device (device_num);
+  if (devicep == NULL)
+return false;
+
+  /* TODO: Unified shared memory must be handled when available.  */
+
+  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;


I guess for now it is reasonable, but I wonder if even without
GOMP_OFFLOAD_CAP_SHARED_MEM one can't for CUDA or GCN allocate host
memory (not all, but just some subset) that will be accessible on the
device (I bet that means accessible through the same address on the host and
device, aka partial shared mem).


Currently, I am only aware of

(a) physically shared memory which is used for some architectures where CPU and
GPU are close together (handled via GOMP_OFFLOAD_CAP_SHARED_MEM) and
(b) unified shared memory as being more a logical memory sharing via managed
memory (using sth. like cudaMallocManaged).

For (b) I will submit a follow up patch very soon that depends on the submitted
but not yet approved/committed usm patches:
   https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591349.html



So, ok for trunk.

OT, tried to look how libomptarget implements it and they don't at least
on llvm-project trunk, but while looking at that, noticed that for
omp_target_is_present they do return false from omp_target_is_present
while we return true.  It is unclear if NULL has corresponding storage
on the device (NULL always corresponds to NULL on the device) or not.


That's indeed an interesting point. I am not sure whether returning "true" for a
given NULL pointer is the desired behaviour for omp_target_is_present. For the
host that might be ok (for whatever reason) but for offload devices this implies
that NULL is actually mapped to some address on the device (as far as I
understand the definition):

"The omp_target_is_present routine tests whether a host pointer refers to
storage that is mapped to a given device."

I don't know if such a "NULL mapping" is valid/useful.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[PATCH] OpenMP, libgomp: Handle unified shared memory in omp_target_is_accessible.

2022-05-06 Thread Marcel Vollweiler

Hi,

This is a follow up patch of the patch that adds the OpenMP runtime routine
omp_target_is_accessible:

   https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591601.html

It considers now also unified shared memory (usm) that was submitted recently
(but not yet approved/committed):
   https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591349.html

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP, libgomp: Handle unified shared memory in omp_target_is_accessible.

libgomp/ChangeLog:

* target.c (omp_target_is_accessible): Handle unified shared memory.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Updated.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: Updated.
* testsuite/libgomp.c-c++-common/target-is-accessible-2.c: New test.
* testsuite/libgomp.fortran/target-is-accessible-2.f90: New test.

diff --git a/libgomp/target.c b/libgomp/target.c
index 74a031f..e6d00c5 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3909,9 +3909,13 @@ omp_target_is_accessible (const void *ptr, size_t size, 
int device_num)
   if (devicep == NULL)
 return false;
 
-  /* TODO: Unified shared memory must be handled when available.  */
+  if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+return true;
 
-  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;
+  if (devicep->is_usm_ptr_func && devicep->is_usm_ptr_func ((void *) ptr))
+return true;
+
+  return false;
 }
 
 int
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c 
b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
index 7c2cf62..e3f494b 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
@@ -23,23 +23,28 @@ main ()
   if (omp_target_is_accessible (p, sizeof (int), n + 1))
 __builtin_abort ();
 
-  /* Currently, a host pointer is accessible if the device supports shared
- memory or omp_target_is_accessible is executed on the host. This
- test case must be adapted when unified shared memory is avialable.  */
   int a[128];
   for (int d = 0; d <= omp_get_num_devices (); d++)
 {
+  /* SHARED_MEM is 1 if and only if host and device share the same memory.
+OMP_TARGET_IS_ACCESSIBLE should not return 0 for shared memory.  */
   int shared_mem = 0;
   #pragma omp target map (alloc: shared_mem) device (d)
shared_mem = 1;
-  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
+
+  if (shared_mem && !omp_target_is_accessible (p, sizeof (int), d))
+   __builtin_abort ();
+
+  /* USM is disabled by default.  Hence OMP_TARGET_IS_ACCESSIBLE should
+return 0 if shared_mem is false.  */
+  if (!shared_mem && omp_target_is_accessible (p, sizeof (int), d))
__builtin_abort ();
 
-  if (omp_target_is_accessible (a, 128 * sizeof (int), d) != shared_mem)
+  if (shared_mem && !omp_target_is_accessible (a, 128 * sizeof (int), d))
__builtin_abort ();
 
   for (int i = 0; i < 128; i++)
-   if (omp_target_is_accessible (&a[i], sizeof (int), d) != shared_mem)
+   if (shared_mem && !omp_target_is_accessible (&a[i], sizeof (int), d))
  __builtin_abort ();
 }
 
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-2.c 
b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-2.c
new file mode 100644
index 000..24af51f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-2.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-skip-if "USM is only implemented for nvptx." { ! offload_target_nvptx 
} } */
+
+#include 
+#include 
+
+#pragma omp requires unified_shared_memory
+
+int
+main ()
+{
+  int *a = (int *) omp_alloc (sizeof(int), ompx_unified_shared_mem_alloc);
+  if (!a)
+__builtin_abort ();
+
+  for (int d = 0; d <= omp_get_num_devices (); d++)
+if (!omp_target_is_accessible (a, sizeof (int), d))
+  __builtin_abort ();
+
+  omp_free(a, ompx_unified_shared_mem_alloc);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90 
b/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
index 2611855..015f74a 100644
--- a/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
@@ -1,3 +1,5 @@
+! { dg-do run }
+
 program main
   use omp_lib
   use iso_c_binding
@@ -25,24 +27,28 @@ program main
   if (omp_target_is_accessible (p, c_sizeof (d), n + 1) /= 0) &
 stop 4
 
-  ! Currently, a host pointer is accessible if the device supports shared
-  ! memory or omp_target_is_accessible is executed on the host. This
-  ! test case must be adapted w

Re: [PATCH] OpenMP, libgomp: Add new runtime routines omp_target_memcpy_async and omp_target_memcpy_rect_async

2022-05-19 Thread Marcel Vollweiler

Hi Jakub,

Am 17.05.2022 um 20:08 schrieb Jakub Jelinek:

On Tue, May 17, 2022 at 11:57:02AM +0200, Marcel Vollweiler wrote:

More importantly, I have no idea how this can work when you pass arg_size 0
and arg_align 0.  The s variable is in the current function frame, with
arg_size 0 nothing is really copied to the generated task.
arg_size should be sizeof (memcpy_t) and arg_align __alignof__ (memcpy_t)
(well, struct omp_target_memcpy_data).


The copy function of GOMP_task ("cpyfn") is not used here (set to NULL) and thus
also arg_size and arg_align are set to 0 since they are related to cpyfn if I
understand it correctly.


No, arg_size and arg_align are for all (explicit) tasks the size and
alignment of the arguments.  For an included task (one executed by the
encountering thread) we indeed use data directly instead of allocating
arg_size arg_align aligned bytes and copying data to it.  But when we create
a deferred task (that is the only thing that actually can be asynchronous), we
allocate struct gomp_task together with memory for the data (arg_size bytes
aligned to arg_align).  If cpyfn, we invoke that copy function (from source
data to the destination buffer), otherwise memcpy.  cpyfn is a callback that
will do memcpy for parts that need bitwise copy and copy construction /
whatever else is needed for other data.
Looking at your patch, you call GOMP_task always with if_clause = false,
that means it is always included task (like with #pragma omp task if(0)),
but that also means calling GOMP_task doesn't bring any advantages and it is
not asynchronous.
If you called it with if_clause = true, like what #pragma omp task would do,
then the arg_size = 0 and arg_align = 0 would make it not work at all,
so after fixing if_clause, you need to supply sizeof (s) and __alignof__ (s).


Good explanation, thanks. Changed accordingly.




Also, it would be nice to avoid GOMP_task for the depobj_count == 0 case
at least sometimes (but perhaps that can be done incrementally) and instead
use some CUDA etc. asynchronous copy APIs.  We don't really need to wait
for anything in that case, and from OpenMP POV all we need to make sure is
that barrier/taskwait/taskgroup end will know about these "tasks" and
wait for them.  So, it can be implemented more like #pragma omp target nowait
instead of #pragma omp task that calls the synchronous omp_target_memcpy.
Though, maybe that is how it should be implemented always, something like
gomp_create_target_task and its caller.  We already use that single routine
for multiple purposes (target nowait as well as target enter/exit data
nowait), so just telling it somehow that it shouldn't do mapping/unmapping
and perhaps target execution and instead copying would be nice.


I dont't see/understand the advantage using gomp_create_target_task over
GOMP_task. Whether the task waits for dependencies
("gomp_task_maybe_wait_for_dependencies") depends on GOMP_TASK_FLAG_DEPEND which
is only set if depobj_count > 0 and depobj_list != NULL. Thus, there shouldn't
be any waiting in case of depobj_count == 0? Additionally, in both functions a
new thread is created - independently of dependencies.


GOMP_task never creates a new thread.
gomp_create_target_task can create (but just once) an unshackeled thread
that runs on the side, doesn't do normal OpenMP user work and just polls the
offloading device and performs unmapping or whatever is needed to finish a
nowait offloaded task.

The disadvantage of GOMP_task is:
1) if you call say omp_target_memcpy_async from outside of parallel, it will
not be actually asynchronous even if you call GOMP_task with if_clause = 
true
2) if you call it from inside of parallel, it might be scheduled only when
some host thread is ready for work (e.g. when reaching #pragma omp barrier,
implicit barrier, #pragma omp taskwait etc.), so even when the offloading
device is unused but host has lots of work to do, it might take quite a
while before starting the work, and then one of the OpenMP host threads
will be blocked waiting for the copying to be done

gomp_create_target_task doesn't have these disadvantages, it can fire off the
copying right away and then just needs to be able to figure out when it
finished (either the unshackeled thread polls the device, or some other way
how to find out that it finished; but OpenMP certainly needs to know that,
because user code can say #pragma omp taskwait for it, or it should be
complete at the end of a taskgroup, or at the end of #pragma omp barrier
or implicit barrier etc.).

Anyway, I guess it is ok to use GOMP_task in the initial patch and change it
later, but if_clause = false and 0, 0 for arg_{size,align} are definitely
wrong.


Agreed. Thanks for the details.




+int
+omp_target_memcpy (void *dst, const void *src, size_t length, size_t 
dst_offset,
+   size_t src_offset, int dst_device_num, int src_device_num)
+{
+  struc

[PATCH] Fortran/OpenMP: Add support for 'close' in map clause

2021-05-20 Thread Marcel Vollweiler

Hi,

This patch adds handling for the map-type-modifier 'close' in the map
clause in the Fortran parser (for C and C++ parsers the changes were
already committed).

'close' was introduced with OpenMP 5.0: "The close map-type-modifier is
a hint to the runtime to allocate memory close to the target device."
In OpenMP 5.0 'close' can be used beside/together with 'always' in a
list of map-type-modifiers.

This patch also considers the optional commas in the modifier list,
which the old code did not (although the comma after 'always' was
already optional in OpenMP 4.5).

Marcel
-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran/OpenMP: Add support for 'close' in map clause

gcc/fortran/ChangeLog: 

* openmp.c (gfc_match_omp_clauses): Support map-type-modifier 'close'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/map-6.f90: New test.
* gfortran.dg/gomp/map-7.f90: New test.
* gfortran.dg/gomp/map-8.f90: New test.

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7eeabff..bec852a 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1710,10 +1710,21 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  && gfc_match ("map ( ") == MATCH_YES)
{
  locus old_loc2 = gfc_current_locus;
- bool always = false;
+
+ int always = 0;
+ int close = 0;
+ for (;;)
+   {
+ if (gfc_match ("always ") == MATCH_YES)
+   always++;
+ else if (gfc_match ("close ") == MATCH_YES)
+   close++;
+ else
+   break;
+ gfc_match (", ");
+   }
+
  gfc_omp_map_op map_op = OMP_MAP_TOFROM;
- if (gfc_match ("always , ") == MATCH_YES)
-   always = true;
  if (gfc_match ("alloc : ") == MATCH_YES)
map_op = OMP_MAP_ALLOC;
  else if (gfc_match ("tofrom : ") == MATCH_YES)
@@ -1726,11 +1737,24 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
map_op = OMP_MAP_RELEASE;
  else if (gfc_match ("delete : ") == MATCH_YES)
map_op = OMP_MAP_DELETE;
- else if (always)
+ else
{
  gfc_current_locus = old_loc2;
- always = false;
+ always = 0;
+ close = 0;
}
+
+ if (always > 1)
+   {
+ gfc_error ("too many % modifiers at %C");
+ break;
+   }
+ if (close > 1)
+   {
+ gfc_error ("too many % modifiers at %C");
+ break;
+   }
+
  head = NULL;
  if (gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_MAP],
   false, NULL, &head,
@@ -1741,8 +1765,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
n->u.map_op = map_op;
  continue;
}
- else
-   gfc_current_locus = old_loc;
+ gfc_current_locus = old_loc;
+ break;
}
  if ((mask & OMP_CLAUSE_MERGEABLE) && !c->mergeable
  && gfc_match ("mergeable") == MATCH_YES)
diff --git a/gcc/testsuite/gfortran.dg/gomp/map-6.f90 
b/gcc/testsuite/gfortran.dg/gomp/map-6.f90
new file mode 100644
index 000..309f845
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/map-6.f90
@@ -0,0 +1,50 @@
+! { dg-additional-options "-fdump-tree-original" }
+
+implicit none
+
+integer :: a, b, b1, b2, b3, b4, b5, b6
+
+!$omp target map(a)
+!$omp end target
+
+!$omp target map(to : a)
+!$omp end target
+
+!$omp target map(always to: a)
+!$omp end target
+!$omp target map(always, to: a)
+!$omp end target
+!$omp target map(close to: a)
+!$omp end target
+!$omp target map(close, to: a)
+!$omp end target
+
+!$omp target map(close always to:b1)
+!$omp end target
+!$omp target map(close, always to:b2)
+!$omp end target
+!$omp target map(close, always, to:b3)
+!$omp end target
+!$omp target map(always close to:b4)
+!$omp end target
+!$omp target map(always, close to:b5)
+!$omp end target
+!$omp target map(always, close, to:b6)
+!$omp end target
+
+
+!$omp target map (always to : a) map (close to : b)
+!$omp end target
+
+end
+
+! { dg-final { scan-tree-dump-not "map\\(\[^\n\r)]*close\[^\n\r)]*to:" 
"original" } }
+
+! { dg-final { scan-tree-dump-times "#pragma omp target map\\(always,to:" 9 
"original" } }
+
+! { dg-final { scan-tree-dump "#pragma omp target map\\(always,to:b1\\)" 
"original" } }
+! { dg-final { scan-tree-dump "#pragma omp target map\\(always,to:b2\\)" 
"original" } }
+! { dg-final { scan-tree-dump "#pragma omp target map\\(alway

Re: [PATCH] Fortran/OpenMP: Add support for 'close' in map clause

2021-05-20 Thread Marcel Vollweiler

Hi Jakub,

Am 20.05.2021 um 10:57 schrieb Jakub Jelinek:

On Thu, May 20, 2021 at 10:47:52AM +0200, Marcel Vollweiler wrote:

--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1710,10 +1710,21 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
   && gfc_match ("map ( ") == MATCH_YES)
 {
   locus old_loc2 = gfc_current_locus;
-  bool always = false;
+
+  int always = 0;
+  int close = 0;


The vertical space should be after the 3 variable declarations
rather than in between 1 and 2.


Changed.




+  for (;;)
+{
+  if (gfc_match ("always ") == MATCH_YES)
+always++;
+  else if (gfc_match ("close ") == MATCH_YES)
+close++;
+  else
+break;
+  gfc_match (", ");
+}
+
   gfc_omp_map_op map_op = OMP_MAP_TOFROM;
-  if (gfc_match ("always , ") == MATCH_YES)
-always = true;
   if (gfc_match ("alloc : ") == MATCH_YES)
 map_op = OMP_MAP_ALLOC;
   else if (gfc_match ("tofrom : ") == MATCH_YES)
@@ -1726,11 +1737,24 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
 map_op = OMP_MAP_RELEASE;
   else if (gfc_match ("delete : ") == MATCH_YES)
 map_op = OMP_MAP_DELETE;
-  else if (always)
+  else
 {
   gfc_current_locus = old_loc2;
-  always = false;
+  always = 0;
+  close = 0;
 }
+
+  if (always > 1)
+{
+  gfc_error ("too many % modifiers at %C");
+  break;
+}
+  if (close > 1)
+{
+  gfc_error ("too many % modifiers at %C");
+  break;


I think it would be nice to show the locus of the second always or close
modifier.  Could the loop above remember that locus when always++ == 1
(or ++always == 2) and similarly for close and use it when printing the
error?


Good point. I changed the loop and the error messages accordingly.


And similarly to the C/C++ patch, better use always_modifier and
close_modifier as the names of the variables, as close is a function and
could be defined as macro.


Changed.



  Jakub



Thanks!

Marcel
-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran/OpenMP: Add support for 'close' in map clause

gcc/fortran/ChangeLog: 

* openmp.c (gfc_match_omp_clauses): Support map-type-modifier 'close'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/map-6.f90: New test.
* gfortran.dg/gomp/map-7.f90: New test.
* gfortran.dg/gomp/map-8.f90: New test.

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7eeabff..f8d198e 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1710,27 +1710,62 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  && gfc_match ("map ( ") == MATCH_YES)
{
  locus old_loc2 = gfc_current_locus;
- bool always = false;
+ int always_modifier = 0;
+ int close_modifier = 0;
+ locus second_always_locus;
+ locus second_close_locus;
+
+ for (;;)
+   {
+ locus current_locus = gfc_current_locus;
+ if (gfc_match ("always ") == MATCH_YES)
+   {
+ if (always_modifier++ == 1)
+   second_always_locus = current_locus;
+   }
+ else if (gfc_match ("close ") == MATCH_YES)
+   {
+ if (close_modifier++ == 1)
+   second_close_locus = current_locus;
+   }
+ else
+   break;
+ gfc_match (", ");
+   }
+
  gfc_omp_map_op map_op = OMP_MAP_TOFROM;
- if (gfc_match ("always , ") == MATCH_YES)
-   always = true;
  if (gfc_match ("alloc : ") == MATCH_YES)
map_op = OMP_MAP_ALLOC;
  else if (gfc_match ("tofrom : ") == MATCH_YES)
-   map_op = always ? OMP_MAP_ALWAYS_TOFROM : OMP_MAP_TOFROM;
+   map_op = always_modifier ? OMP_MAP_ALWAYS_TOFROM : 
OMP_MAP_TOFROM;
  else if (gfc_match ("to : ") == MATCH_YES)
-   map_op = always ? OMP_MAP_ALWAYS_TO : OMP_MAP_TO;
+   map_op = always_modifier ? OMP_MAP_ALWAYS_TO : OMP_MAP_TO;
  else if (gfc_match ("from : ") == MATCH_YES)
-   map

[Patch] C, C++, Fortran, OpenMP: Add support for device-modifiers for 'omp target device'

2021-07-07 Thread Marcel Vollweiler

This patch adds device-modifiers to the device clause:

   #pragma omp target device ([ device-modifier :] integer-expression)

where device-modifier is either 'ancestor' or 'device_num'.

The 'device_num' case

   #pragma omp target device (device_num : integer-expression)

is treated in the same way as

   #pragma omp target device (integer-expression)

before.

For the 'ancestor' case

   #pragma omp target device (ancestor: integer-expression)

a message 'sorry, not yet implemented' is output.


-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
OpenMP: Add support for device-modifiers for 'omp target device'

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_clause_device): Add support for 
device-modifiers for 'omp target device'.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_clause_device): Add support for 
device-modifiers for 'omp target device'.

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_clauses): Add support for 
device-modifiers for 'omp target device'.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/target-device-1.c: New test.
* c-c++-common/gomp/target-device-2.c: New test.
* gfortran.dg/gomp/target-device-1.f90: New test.
* gfortran.dg/gomp/target-device-2.f90: New test.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c..defc52d 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -15864,37 +15864,117 @@ c_parser_omp_clause_map (c_parser *parser, tree list)
 }
 
 /* OpenMP 4.0:
-   device ( expression ) */
+   device ( expression )
+
+   OpenMP 5.0:
+   device ( [device-modifier :] integer-expression )
+
+   device-modifier:
+ ancestor | device_num */
 
 static tree
 c_parser_omp_clause_device (c_parser *parser, tree list)
 {
   location_t clause_loc = c_parser_peek_token (parser)->location;
+  location_t expr_loc;
+  c_expr expr;
+  tree c, t;
+
   matching_parens parens;
-  if (parens.require_open (parser))
+  if (!parens.require_open (parser))
+return list;
+
+  int pos = 1;
+  int pos_colon = 0;
+  while (c_parser_peek_nth_token_raw (parser, pos)->type == CPP_NAME
+|| c_parser_peek_nth_token_raw (parser, pos)->type == CPP_COLON
+|| c_parser_peek_nth_token_raw (parser, pos)->type == CPP_COMMA)
 {
-  location_t expr_loc = c_parser_peek_token (parser)->location;
-  c_expr expr = c_parser_expr_no_commas (parser, NULL);
-  expr = convert_lvalue_to_rvalue (expr_loc, expr, false, true);
-  tree c, t = expr.value;
-  t = c_fully_fold (t, false, NULL);
+  if (c_parser_peek_nth_token_raw (parser, pos)->type == CPP_COLON)
+   {
+ pos_colon = pos;
+ break;
+   }
+  pos++;
+}
 
-  parens.skip_until_found_close (parser);
+  const char *err_msg;
+  if (pos_colon == 1)
+{
+  err_msg = "expected device-modifier % or %";
+  goto invalid_kind;
+}
 
-  if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
+  if (pos_colon > 1)
+{
+  if (c_parser_peek_nth_token_raw (parser, 1)->type == CPP_NAME)
{
- c_parser_error (parser, "expected integer expression");
- return list;
+ c_token *tok = c_parser_peek_token (parser);
+ const char *p = IDENTIFIER_POINTER (tok->value);
+ if (strcmp ("ancestor", p) == 0)
+   {
+ if (pos_colon > 2)
+   {
+ err_msg = "expected only one device-modifier % or "
+   "%";
+ goto invalid_kind;
+   }
+
+ sorry_at (tok->location, "% not yet supported");
+ c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
+ return list;
+   }
+ else if (strcmp ("device_num", p) == 0)
+   {
+ if (pos_colon > 2)
+   {
+ err_msg = "expected only one device-modifier % or "
+   "%";
+ goto invalid_kind;
+   }
+ c_parser_consume_token (parser);
+ c_parser_peek_token (parser);
+ c_parser_consume_token (parser);
+   }
+ else
+   {
+ err_msg = "expected device-modifier % or "
+   "%";
+ goto invalid_kind;
+   }
+   }
+  else
+   {
+ err_msg = "expected device-modifier % or %";
+ goto invalid_kind;
}
+}
 
-  check_no_duplicate_clause (list, OMP_CLAUSE_DEVICE, "device");
+  expr_loc = c_parser_peek_token (parser)->location;
+  expr = c_parser_expr_no_commas (parser, NULL);
+  expr = convert_lvalue_to_rvalue (expr_loc, expr, false, true);
+  c, t = expr.value;
+  t = c_fully_fold (t, false, NULL);
 
-  c = build_omp_clause (clause_loc, OMP_CLAUSE_DEVICE);
-  OMP_CLAUSE_DEVICE_ID (c) = t;
-  OMP_CLAUSE_CHAIN (c) = list;
-  list = c;

Re: [Patch v2] C, C++, Fortran, OpenMP: Add support for device-modifiers for 'omp target device'

2021-08-25 Thread Marcel Vollweiler

Hi Jakub,

I applied all your suggested changes and checked for no test regressions
on x86_64-linux with nvptx offloading. The revised patch is attached.

Do you think that it's ok to commit the code?

Thanks,

Marcel

Am 23.08.2021 um 19:47 schrieb Jakub Jelinek:

On Fri, Aug 20, 2021 at 09:18:32PM +0200, Marcel Vollweiler wrote:


--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -15864,37 +15864,81 @@ c_parser_omp_clause_map (c_parser *parser, tree list)
  }

  /* OpenMP 4.0:
-   device ( expression ) */
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
   device ( expression )


Please remove all the >>>>>s.

+
+   OpenMP 5.0:
+   device ( [device-modifier :] integer-expression )
+
+   device-modifier:
+ ancestor | device_num */




+  /* A requires directive with the reverse_offload clause must be
+  specified.  */
+  if ((omp_requires_mask & OMP_REQUIRES_REVERSE_OFFLOAD) == 0)
+{
+  c_parser_error (parser, "a % directive with the "
+  "% clause must be "
+  "specified");


[BI think this diagnostics is confusing, it tells the user that it has to
do something but doesn't tell why.  It is also not a parser error.
So I think it should be instead
error_at (tok->location, "% device modifier not "
 "preceded by % directive "
 "with % clause");


+  parens.skip_until_found_close (parser);
+  return list;
+}
+  ancestor = true;
+}



+  if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
+{
+  c_parser_error (parser, "expected integer expression");
+  return list;
  }

+  check_no_duplicate_clause (list, OMP_CLAUSE_DEVICE, "device");
+
+  c = build_omp_clause (clause_loc, OMP_CLAUSE_DEVICE);
+
+  OMP_CLAUSE_DEVICE_ID (c) = t;
+  OMP_CLAUSE_CHAIN (c) = list;
+  OMP_CLAUSE_DEVICE_ANCESTOR (c) = ancestor;
+
+  list = c;
return list;
  }

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 5349ef1..b4d8d81 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -15139,6 +15139,22 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
 case OMP_CLAUSE_COLLAPSE:
 case OMP_CLAUSE_FINAL:
 case OMP_CLAUSE_DEVICE:
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEVICE
+  && OMP_CLAUSE_DEVICE_ANCESTOR (c))
+{
+  t = OMP_CLAUSE_DEVICE_ID (c);
+  if (TREE_CODE (t) == INTEGER_CST
+  && wi::to_widest (t) != 1)
+{
+  error_at (OMP_CLAUSE_LOCATION (c),
+"the % clause expression must evaluate to "
+"%<1%>");
+  remove = true;
+  break;
+}
+}
+  /* FALLTHRU */


For the C FE, I'd suggest to move this to the c_parser_omp_clause_device
routine like other similar checking is done there too.  And you can use
if (TREE_CODE (t) == INTEGER_CST && !integer_onep (t))

+  error_at (tok->location, "a % directive with the "



+   "% clause must be "
+   "specified");


See above.


@@ -38562,6 +38601,7 @@ cp_parser_omp_clause_device (cp_parser *parser, tree 
list,
c = build_omp_clause (location, OMP_CLAUSE_DEVICE);
OMP_CLAUSE_DEVICE_ID (c) = t;
OMP_CLAUSE_CHAIN (c) = list;
+  OMP_CLAUSE_DEVICE_ANCESTOR (c) = ancestor;


But in C++ the INTEGER_CST checking shouldn't be done here, because
the argument could be type or value dependent.


--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7334,6 +7334,15 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
 "% id must be integral");
   remove = true;
 }
+  else if (OMP_CLAUSE_DEVICE_ANCESTOR (c)
+   && TREE_CODE (t) == INTEGER_CST
+   && wi::to_widest (t) != 1)


!integer_onep (t)


+  if (!(gfc_current_ns->omp_requires & OMP_REQ_REVERSE_OFFLOAD))
+{
+  gfc_error ("a % directive with the "
+ "% clause must be "
+ "specified at %C");


See above.


+  else if (gfc_match ("%e )", &c->device) == MATCH_YES)
+{
+}
+  else


Better != MATCH_YES and drop the {} else ?


+{
+  gfc_error (&q

Re: [Patch v2] C, C++, Fortran, OpenMP: Add support for device-modifiers for 'omp target device'

2021-09-02 Thread Marcel Vollweiler



Am 01.09.2021 um 11:02 schrieb Jakub Jelinek:

On Wed, Sep 01, 2021 at 09:06:31AM +0200, Christophe Lyon wrote:

   * gfortran.dg/gomp/target-device-ancestor-4.f90: New test.




The last new test fails on aarch64:
  /gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90:7:15: Error:
Sorry, 'reverse_offload' clause at (1) on REQUIRES directive is not yet
supported
compiler exited with status 1
PASS: gfortran.dg/gomp/target-device-ancestor-4.f90   -O   (test for
errors, line 7)
XFAIL: gfortran.dg/gomp/target-device-ancestor-4.f90   -O  sorry,
unimplemented: 'ancestor' not yet supported (test for warnings, line 9)
PASS: gfortran.dg/gomp/target-device-ancestor-4.f90   -O  (test for excess
errors)
gfortran.dg/gomp/target-device-ancestor-4.f90   -O  : dump file does not
exist
UNRESOLVED: gfortran.dg/gomp/target-device-ancestor-4.f90   -O
scan-tree-dump original "pragma omp target [^\n\r)]*device\\(ancestor:1\\)"


It is UNRESOLVED everywhere.  Unlike the C/C++ FEs that emit the original
dump even if there are errors/sorry during parsing, the Fortran FE doesn't
do that.
So I think either the dg-final should be xfailed or removed for now.


To xfail dg-final does not seem to work with a missing dump (it results
in UNRESOLVED as before). Instead I commented out dg-final with "TODO"
similar to other tests and hope that this is ok?



  Jakub



Marcel

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/target-device-ancestor-4.f90: Comment out dg-final 
to avoid
 UNRESOLVED.

diff --git a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90 
b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90
index 540b3d0..63872fa 100644
--- a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90
@@ -11,4 +11,4 @@
 
 end
 
-! { dg-final { scan-tree-dump "pragma omp target 
\[^\n\r)]*device\\(ancestor:1\\)" "original" } }
+! TODO: dg-final { scan-tree-dump-times "pragma omp target 
\[^\n\r)]*device\\(ancestor:1\\)" 1 "original" } }


[Patch] C, C++, Fortran, OpenMP: Add support for 'flush seq_cst' construct

2021-09-06 Thread Marcel Vollweiler

Hi,

this patch adds support for the 'seq_cst' memory order clause on the
'flush' directive which was introduced in OpenMP 5.1 (p.275ff of the
OpenMP 5.1 Specification):

"If neither memory-order-clause nor a list appears on the flush
construct then the behavior is as if memory-order-clause is seq_cst.

A flush construct with the seq_cst clause, executed on a given thread,
operates as if all data storage blocks that are accessible to the thread
are flushed by a strong flush operation.

...

An implementation may implement a flush construct with a list by
ignoring the list and treating it the same as a flush construct with the
seq_cst clause."

I am not completely sure about the correct memory model specification:
"MEMMODEL_SYNC_SEQ_CST" vs. "MEMMODEL_SEQ_CST".
As "MEMMODEL_SYNC_SEQ_CST" is already used for flush without a clause
(that should behave in the same way than using seq_cst), see
expand_builtin_sync_synchronize in gcc/builtins.c, and regarding the
discussion in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697 I found
it appropriate to use "MEMMODEL_SYNC_SEQ_CST" in order to guarantee a
strong flush.

I tested on x86_64-linux with nvptx offloading with no regressions.

Marcel



-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
C, C++, Fortran, OpenMP: Add support for 'flush seq_cst' construct.

This patch adds support for the 'seq_cst' memory order clause on the 'flush'
directive which was introduced in OpenMP 5.1.

gcc/c-family/ChangeLog:

* c-omp.c (c_finish_omp_flush): Handle MEMMODEL_SEQ_CST.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_flush): Parse 'seq_cst' clause on 'flush' 
directive.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_flush): Parse 'seq_cst' clause on 'flush'
directive.
* semantics.c (finish_omp_flush): Handle MEMMODEL_SEQ_CST.

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_flush): Parse 'seq_cst' clause on 'flush'
directive.
* trans-openmp.c (gfc_trans_omp_flush): Handle OMP_MEMORDER_SEQ_CST.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/flush-1.c: Add test case for 'seq_cst'.
* c-c++-common/gomp/flush-2.c: Add test case for 'seq_cst'.
* g++.dg/gomp/attrs-1.C:  Adapt test to handle all flush clauses.
* gfortran.dg/gomp/flush-1.f90:  Add test case for 'seq_cst'.
* gfortran.dg/gomp/flush-2.f90:  Add test case for 'seq_cst'.

diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 18de7e4..4b95fc1 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -606,7 +606,7 @@ c_finish_omp_flush (location_t loc, int mo)
 {
   tree x;
 
-  if (mo == MEMMODEL_LAST)
+  if (mo == MEMMODEL_LAST || mo == MEMMODEL_SEQ_CST)
 {
   x = builtin_decl_explicit (BUILT_IN_SYNC_SYNCHRONIZE);
   x = build_call_expr_loc (loc, x, 0);
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 3b1d10f..4d074ec 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -18339,7 +18339,9 @@ c_parser_omp_flush (c_parser *parser)
   const char *p
= IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
 
-  if (!strcmp (p, "acq_rel"))
+  if (!strcmp (p, "seq_cst"))
+   mo = MEMMODEL_SEQ_CST;
+  else if (!strcmp (p, "acq_rel"))
mo = MEMMODEL_ACQ_REL;
   else if (!strcmp (p, "release"))
mo = MEMMODEL_RELEASE;
@@ -18347,7 +18349,8 @@ c_parser_omp_flush (c_parser *parser)
mo = MEMMODEL_ACQUIRE;
   else
error_at (c_parser_peek_token (parser)->location,
- "expected %, % or %");
+ "expected %, %, % or "
+ "%");
   c_parser_consume_token (parser);
 }
   if (c_parser_next_token_is (parser, CPP_OPEN_PAREN))
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index ea71f9c..f9c2c8a 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -40742,7 +40742,9 @@ cp_parser_omp_flush (cp_parser *parser, cp_token 
*pragma_tok)
 {
   tree id = cp_lexer_peek_token (parser->lexer)->u.value;
   const char *p = IDENTIFIER_POINTER (id);
-  if (!strcmp (p, "acq_rel"))
+  if (!strcmp (p, "seq_cst"))
+   mo = MEMMODEL_SEQ_CST;
+  else if (!strcmp (p, "acq_rel"))
mo = MEMMODEL_ACQ_REL;
   else if (!strcmp (p, "release"))
mo = MEMMODEL_RELEASE;
@@ -40750,7 +40752,8 @@ cp_parser_omp_flush (cp_parser *parser, cp_token 
*pragma_tok)
mo = MEMMODEL_ACQUIRE;
   else
error_at (cp_lexer_peek_token (parser->lexer)->location,
- "expected %, % or %");
+ "expected %, %, % or "
+ "%");
   cp_lexer_consume_token (parser->lexer);
 }
   if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN))
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index f4b042f..8b78e89

[Patch] libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

2021-10-11 Thread Marcel Vollweiler

Hi,

The variable omp_atv_sequential was replaced by omp_atv_serialized in
OpenMP 5.1. This was already implemented by Jakub (C/C++, commit
ea82325afec) and Tobias (Fortran, commit fff15bad1ab).

This patch adds two tests to check if omp_atv_serialized is available
(one test for C/C++ and one for Fortran). Besides that
omp_atv_sequential is marked as deprecated in C/C++ and Fortran for
OpenMP 5.1.

The patch was tested on x86_64-linux and powerpc64le-linux with nvptx
offloading and on x86_64-linux with amdgcn offloading with no regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

The variable omp_atv_sequential was replaced by omp_atv_serialized in OpenMP
5.1. This was already implemented by Jakub (C/C++, commit ea82325afec) and
Tobias (Fortran, commit fff15bad1ab).

This patch adds two tests to check if omp_atv_serialized is available (one test
for C/C++ and one for Fortran). Besides that omp_atv_sequential is marked as
deprecated in C/C++ and Fortran for OpenMP 5.1.

libgomp/ChangeLog:

* allocator.c (omp_init_allocator): Replace omp_atv_sequential with
omp_atv_serialized.
* omp.h.in: Add deprecated flag for omp_atv_sequential.
* omp_lib.f90.in: Add deprecated flag for omp_atv_sequential.
* testsuite/libgomp.c-c++-common/alloc-10.c: New test.
* testsuite/libgomp.fortran/alloc-12.f90: New test.

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index dce600f..deebb6a 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -82,7 +82,7 @@ omp_init_allocator (omp_memspace_handle_t memspace, int 
ntraits,
break;
  case omp_atv_contended:
  case omp_atv_uncontended:
- case omp_atv_sequential:
+ case omp_atv_serialized:
  case omp_atv_private:
data.sync_hint = traits[i].value;
break;
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index d75ee13..e57e192 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -157,7 +157,7 @@ typedef enum omp_alloctrait_value_t
   omp_atv_contended = 3,
   omp_atv_uncontended = 4,
   omp_atv_serialized = 5,
-  omp_atv_sequential = omp_atv_serialized,
+  omp_atv_sequential __GOMP_DEPRECATED_5_1 = omp_atv_serialized,
   omp_atv_private = 6,
   omp_atv_all = 7,
   omp_atv_thread = 8,
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index 1063eee..57766b5 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -810,7 +810,7 @@
 #endif
 
 #if _OPENMP >= 202011
-!GCC$ ATTRIBUTES DEPRECATED :: omp_proc_bind_master
+!GCC$ ATTRIBUTES DEPRECATED :: omp_proc_bind_master, omp_atv_sequential
 #endif
 
   end module omp_lib
diff --git a/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c 
b/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c
new file mode 100644
index 000..742c64a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c
@@ -0,0 +1,25 @@
+#include 
+#include 
+#include 
+
+const omp_alloctrait_t traits[]
+= { { omp_atk_alignment, 64 },
+{ omp_atk_sync_hint, omp_atv_serialized },
+{ omp_atk_fallback, omp_atv_null_fb } };
+
+int
+main ()
+{
+  omp_allocator_handle_t a;
+  int *volatile p;
+  a = omp_init_allocator (omp_default_mem_space, 3, traits);
+  if (a == omp_null_allocator)
+abort ();
+  p = (int *) omp_alloc (3072, a);
+  if uintptr_t) p) % 64) != 0)
+abort ();
+  p[0] = 1;
+  p[3071 / sizeof (int)] = 2;
+  omp_free (p, a);
+  omp_destroy_allocator (a);
+}
\ No newline at end of file
diff --git a/libgomp/testsuite/libgomp.fortran/alloc-12.f90 
b/libgomp/testsuite/libgomp.fortran/alloc-12.f90
new file mode 100644
index 000..3d10959
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/alloc-12.f90
@@ -0,0 +1,28 @@
+! { dg-additional-options "-Wall -Wextra" }
+program main
+  use omp_lib
+  use ISO_C_Binding
+  implicit none (external, type)
+  type(c_ptr) :: p
+  integer, pointer, contiguous :: ip(:)
+  type (omp_alloctrait) :: traits(3)
+  integer (omp_allocator_handle_kind) :: a
+  integer (c_ptrdiff_t) :: iptr
+
+  traits = [omp_alloctrait (omp_atk_alignment, 64), &
+omp_alloctrait (omp_atk_fallback, omp_atv_null_fb), &
+omp_alloctrait (omp_atk_sync_hint, omp_atv_serialized)]
+  a = omp_init_allocator (omp_default_mem_space, 3, traits)
+  if (a == omp_null_allocator) stop 1
+
+  p = omp_alloc (3 * c_sizeof (0), a)
+  if (.not. c_associated (p)) stop 2
+  call c_f_pointer (p, ip, [3])
+  if (mod (TRANSFER (p, iptr), 64) /= 0) &
+stop 3
+  ip(1) = 1
+  ip(2) = 2
+  ip(3) = 3
+  call omp_free (p, a)
+  call omp_destroy_allocator (a)
+end program main


Re: [Patch] libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

2021-10-11 Thread Marcel Vollweiler

Hi Jakub,

Am 11.10.2021 um 11:49 schrieb Jakub Jelinek:

On Mon, Oct 11, 2021 at 11:40:54AM +0200, Marcel Vollweiler wrote:

libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

The variable omp_atv_sequential was replaced by omp_atv_serialized in OpenMP
5.1. This was already implemented by Jakub (C/C++, commit ea82325afec) and
Tobias (Fortran, commit fff15bad1ab).

This patch adds two tests to check if omp_atv_serialized is available (one test
for C/C++ and one for Fortran). Besides that omp_atv_sequential is marked as
deprecated in C/C++ and Fortran for OpenMP 5.1.

libgomp/ChangeLog:

 * allocator.c (omp_init_allocator): Replace omp_atv_sequential with
 omp_atv_serialized.
 * omp.h.in: Add deprecated flag for omp_atv_sequential.
 * omp_lib.f90.in: Add deprecated flag for omp_atv_sequential.
 * testsuite/libgomp.c-c++-common/alloc-10.c: New test.
 * testsuite/libgomp.fortran/alloc-12.f90: New test.


LGTM, except one nit.


--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c
+}
\ No newline at end of file


Please make sure the file ends with a newline before committing.


Changed :)



  Jakub



Thanks,
Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

The variable omp_atv_sequential was replaced by omp_atv_serialized in OpenMP
5.1. This was already implemented by Jakub (C/C++, commit ea82325afec) and
Tobias (Fortran, commit fff15bad1ab).

This patch adds two tests to check if omp_atv_serialized is available (one test
for C/C++ and one for Fortran). Besides that omp_atv_sequential is marked as
deprecated in C/C++ and Fortran for OpenMP 5.1.

libgomp/ChangeLog:

* allocator.c (omp_init_allocator): Replace omp_atv_sequential with
omp_atv_serialized.
* omp.h.in: Add deprecated flag for omp_atv_sequential.
* omp_lib.f90.in: Add deprecated flag for omp_atv_sequential.
* testsuite/libgomp.c-c++-common/alloc-10.c: New test.
* testsuite/libgomp.fortran/alloc-12.f90: New test.

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index dce600f..deebb6a 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -82,7 +82,7 @@ omp_init_allocator (omp_memspace_handle_t memspace, int 
ntraits,
break;
  case omp_atv_contended:
  case omp_atv_uncontended:
- case omp_atv_sequential:
+ case omp_atv_serialized:
  case omp_atv_private:
data.sync_hint = traits[i].value;
break;
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index d75ee13..e57e192 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -157,7 +157,7 @@ typedef enum omp_alloctrait_value_t
   omp_atv_contended = 3,
   omp_atv_uncontended = 4,
   omp_atv_serialized = 5,
-  omp_atv_sequential = omp_atv_serialized,
+  omp_atv_sequential __GOMP_DEPRECATED_5_1 = omp_atv_serialized,
   omp_atv_private = 6,
   omp_atv_all = 7,
   omp_atv_thread = 8,
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index 1063eee..57766b5 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -810,7 +810,7 @@
 #endif
 
 #if _OPENMP >= 202011
-!GCC$ ATTRIBUTES DEPRECATED :: omp_proc_bind_master
+!GCC$ ATTRIBUTES DEPRECATED :: omp_proc_bind_master, omp_atv_sequential
 #endif
 
   end module omp_lib
diff --git a/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c 
b/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c
new file mode 100644
index 000..01ae150d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/alloc-10.c
@@ -0,0 +1,25 @@
+#include 
+#include 
+#include 
+
+const omp_alloctrait_t traits[]
+= { { omp_atk_alignment, 64 },
+{ omp_atk_sync_hint, omp_atv_serialized },
+{ omp_atk_fallback, omp_atv_null_fb } };
+
+int
+main ()
+{
+  omp_allocator_handle_t a;
+  int *volatile p;
+  a = omp_init_allocator (omp_default_mem_space, 3, traits);
+  if (a == omp_null_allocator)
+abort ();
+  p = (int *) omp_alloc (3072, a);
+  if uintptr_t) p) % 64) != 0)
+abort ();
+  p[0] = 1;
+  p[3071 / sizeof (int)] = 2;
+  omp_free (p, a);
+  omp_destroy_allocator (a);
+}
diff --git a/libgomp/testsuite/libgomp.fortran/alloc-12.f90 
b/libgomp/testsuite/libgomp.fortran/alloc-12.f90
new file mode 100644
index 000..3d10959
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/alloc-12.f90
@@ -0,0 +1,28 @@
+! { dg-additional-options "-Wall -Wextra" }
+program main
+  use omp_lib
+  use ISO_C_Binding
+  implicit none (external, type)
+  type(c_ptr) :: p
+  integer, pointer, contiguous :: ip(:)
+  type (omp_alloctrait) :: traits(3)
+  integer (omp_allocator_handle_kind) :: a
+  integer (c_ptrdiff_t) :: iptr
+
+  traits = [omp_alloctrait (omp_

Re: [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct

2021-11-24 Thread Marcel Vollweiler

Hi Jakub,

this is again a new version of the 'has_device_addr' patch. It includes
further minor changes in the C/C++ part and in addition the Fortran
implementation.

Tested on x86_64-linux with nvptx offloading with no regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct.

This patch adds the 'has_device_addr' clause to the OpenMP 'target' construct
which was introduced in OpenMP 5.1 (OpenMP API 5.1 specification pp. 197ff):

has_device_addr(list)

"The has_device_addr clause indicates that its list items already have device
addresses and therefore they may be directly accessed from a target device.
If the device address of a list item is not for the device on which the target
region executes, accessing the list item inside the region results in
unspecified behavior. The list items may include array sections." (p. 200)

"A list item may not be specified in both an is_device_ptr clause and a
has_device_addr clause on the directive." (p. 202)

"A list item that appears in an is_device_ptr or a has_device_addr clause must
not be specified in any data-sharing attribute clause on the same target
construct." (p. 203)

gcc/c-family/ChangeLog:

* c-omp.c (c_omp_split_clauses): Add OMP_CLAUSE_HAS_DEVICE_ADDR case.
* c-pragma.h (enum pragma_kind): Add 5.1 in comment.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_clause_name): Parse 'has_device_addr' clause.
(c_parser_omp_variable_list): Handle array sections.
(c_parser_omp_clause_has_device_addr): Added.
(c_parser_omp_all_clauses): Add PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR case.
(c_parser_omp_target_exit_data): Add HAS_DEVICE_ADDR to OMP_CLAUSE_MASK.
* c-typeck.c (handle_omp_array_sections): Handle clause restrictions.
(c_finish_omp_clauses): Handle array sections.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_clause_name): Parse 'has_device_addr' clause.
(cp_parser_omp_var_list_no_open): Handle array sections.
(cp_parser_omp_all_clauses): Add PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR case.
(cp_parser_omp_target_update): Add HAS_DEVICE_ADDR to OMP_CLAUSE_MASK.
* pt.c (tsubst_omp_clauses): Add cases for OMP_CLAUSE_HAS_DEVICE_ADDR.
* semantics.c (handle_omp_array_sections): Handle clause restrictions.
(finish_omp_clauses): Handle array sections.

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses): Added OMP_LIST_HAS_DEVICE_ADDR
case.
* gfortran.h: Added OMP_LIST_HAS_DEVICE_ADDR.
* openmp.c (enum omp_mask1): Added OMP_CLAUSE_HAS_DEVICE_ADDR.
(gfc_match_omp_clauses): Parse HAS_DEVICE_ADDR clause.
(resolve_omp_clauses): Same.
* trans-openmp.c (gfc_trans_omp_variable_list): Added 
OMP_LIST_HAS_DEVICE_ADDR case.
(gfc_trans_omp_clauses): Firstprivatize of array descriptors.

gcc/ChangeLog:

* gimplify.c (gimplify_scan_omp_clauses): Add OMP_CLAUSE_HAS_DEVICE_ADDR
cases
and handle array sections.
(gimplify_adjust_omp_clauses): Add OMP_CLAUSE_HAS_DEVICE_ADDR case.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_HAS_DEVICE_ADDR.
(lower_omp_target): Same.
* tree-core.h (enum omp_clause_code): Same.
* tree-nested.c (convert_nonlocal_omp_clauses): Same.
(convert_local_omp_clauses): Same.
* tree-pretty-print.c (dump_omp_clause): Same.
* tree.c: Same.

libgomp/ChangeLog:

* libgomp.texi: Updated entry for HAS_DEVICE_ADDR.
* target.c (copy_firstprivate_data): Copy only if host address is not
NULL.
* testsuite/libgomp.c++/target-has-device-addr-2.C: New test.
* testsuite/libgomp.c++/target-has-device-addr-4.C: New test.
* testsuite/libgomp.c-c++-common/target-has-device-addr-1.c: New test.
* testsuite/libgomp.c/target-has-device-addr-3.c: New test.
* testsuite/libgomp.fortran/target-has-device-addr-1.f90: New test.
* testsuite/libgomp.fortran/target-has-device-addr-2.f90: New test.
* testsuite/libgomp.fortran/target-has-device-addr-3.f90: New test.
* testsuite/libgomp.fortran/target-has-device-addr-4.f90: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/clauses-1.c: Added has_device_addr to test cases.
* g++.dg/gomp/attrs-1.C: Added has_device_addr to test cases.
* g++.dg/gomp/attrs-2.C: Added has_device_addr to test cases.
* c-c++-common/gomp/target-has-device-addr-1.c: New test.
* c-c++-common/gomp/target-has-device-addr-2.c: New test.
* c-c++-common/gomp/target-is-devi