[PATCH 0/5] openmp: Handle pinned and unified shared memory.

2022-03-08 Thread Hafiz Abid Qadeer
This patch series adds support for unified shared memory (USM) and pinned
memory. The support in libgomp is for nvptx offloading only.  A new
command line option -foffload-memory allows the user to choose either USM
or pinned memory. USM can also be enabled using the requires construct.

When USM is in use, calls to memory allocation functions like malloc are
changed to omp_alloc with the appropriate allocator.  No transformations are
required for pinned memory, which is implemented using mlockall and so is
only available on Linux.

Andrew Stubbs (4):
  openmp: Add -foffload-memory
  openmp: allow requires unified_shared_memory
  openmp, nvptx: ompx_unified_shared_mem_alloc
  openmp: -foffload-memory=pinned

Hafiz Abid Qadeer (1):
  openmp: Use libgomp memory allocation functions with unified shared memory.

 gcc/c/c-parser.cc                            |  13 +-
 gcc/common.opt                               |  16 ++
 gcc/coretypes.h                              |   7 +
 gcc/cp/parser.cc                             |  13 +-
 gcc/doc/invoke.texi                          |  16 +-
 gcc/fortran/openmp.cc                        |  10 +-
 gcc/omp-low.cc                               | 220 ++
 gcc/passes.def                               |   1 +
 .../c-c++-common/gomp/alloc-pinned-1.c       |  28 +++
 gcc/testsuite/c-c++-common/gomp/usm-1.c      |   4 +
 gcc/testsuite/c-c++-common/gomp/usm-2.c      |  34 +++
 gcc/testsuite/c-c++-common/gomp/usm-3.c      |  32 +++
 gcc/testsuite/g++.dg/gomp/usm-1.C            |  32 +++
 gcc/testsuite/g++.dg/gomp/usm-2.C            |  30 +++
 gcc/testsuite/g++.dg/gomp/usm-3.C            |  38 +++
 gcc/testsuite/gfortran.dg/gomp/usm-1.f90     |   6 +
 gcc/testsuite/gfortran.dg/gomp/usm-2.f90     |  16 ++
 gcc/testsuite/gfortran.dg/gomp/usm-3.f90     |  13 ++
 gcc/tree-pass.h                              |   1 +
 libgomp/allocator.c                          |  13 +-
 libgomp/config/linux/allocator.c             |  70 --
 libgomp/config/nvptx/allocator.c             |   6 +
 libgomp/libgomp-plugin.h                     |   3 +
 libgomp/libgomp.h                            |   6 +
 libgomp/libgomp.map                          |   5 +
 libgomp/omp.h.in                             |   4 +
 libgomp/omp_lib.f90.in                       |   8 +
 libgomp/plugin/plugin-nvptx.c                |  45 +++-
 libgomp/target.c                             |  70 ++
 libgomp/testsuite/libgomp.c++/usm-1.C        |  54 +
 libgomp/testsuite/libgomp.c/alloc-pinned-7.c |  66 ++
 libgomp/testsuite/libgomp.c/usm-1.c          |  24 ++
 libgomp/testsuite/libgomp.c/usm-2.c          |  32 +++
 libgomp/testsuite/libgomp.c/usm-3.c          |  35 +++
 libgomp/testsuite/libgomp.c/usm-4.c          |  36 +++
 libgomp/testsuite/libgomp.c/usm-5.c          |  28 +++
 libgomp/testsuite/libgomp.c/usm-6.c          |  70 ++
 37 files changed, 1075 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/usm-1.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/usm-2.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/usm-3.c
 create mode 100644 gcc/testsuite/g++.dg/gomp/usm-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/usm-2.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/usm-3.C
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/usm-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/usm-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/usm-3.f90
 create mode 100644 libgomp/testsuite/libgomp.c++/usm-1.C
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-7.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-6.c

-- 
2.25.1



[PATCH 1/5] openmp: Add -foffload-memory

2022-03-08 Thread Hafiz Abid Qadeer
From: Andrew Stubbs 

Add a new option.  It will be used in follow-up patches.

gcc/ChangeLog:

* common.opt: Add -foffload-memory and its enum values.
* coretypes.h (enum offload_memory): New.
* doc/invoke.texi: Document -foffload-memory.
---
 gcc/common.opt      | 16
 gcc/coretypes.h     |  7 +++
 gcc/doc/invoke.texi | 16 +++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 8b6513de47c..17426523e23 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2182,6 +2182,22 @@ Enum(offload_abi) String(ilp32) Value(OFFLOAD_ABI_ILP32)
 EnumValue
 Enum(offload_abi) String(lp64) Value(OFFLOAD_ABI_LP64)
 
+foffload-memory=
+Common Joined RejectNegative Enum(offload_memory) Var(flag_offload_memory) Init(OFFLOAD_MEMORY_NONE)
+-foffload-memory=[none|unified|pinned] Use an offload memory optimization.
+
+Enum
+Name(offload_memory) Type(enum offload_memory) UnknownError(Unknown offload memory option %qs)
+
+EnumValue
+Enum(offload_memory) String(none) Value(OFFLOAD_MEMORY_NONE)
+
+EnumValue
+Enum(offload_memory) String(unified) Value(OFFLOAD_MEMORY_UNIFIED)
+
+EnumValue
+Enum(offload_memory) String(pinned) Value(OFFLOAD_MEMORY_PINNED)
+
 fomit-frame-pointer
 Common Var(flag_omit_frame_pointer) Optimization
 When possible do not generate stack frames.
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 08b9ac9094c..dd52d5bb113 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -206,6 +206,13 @@ enum offload_abi {
   OFFLOAD_ABI_ILP32
 };
 
+/* Types of memory optimization for an offload device.  */
+enum offload_memory {
+  OFFLOAD_MEMORY_NONE,
+  OFFLOAD_MEMORY_UNIFIED,
+  OFFLOAD_MEMORY_PINNED
+};
+
 /* Types of profile update methods.  */
 enum profile_update {
   PROFILE_UPDATE_SINGLE,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 248ed534aee..d16019fc8c3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -202,7 +202,7 @@ in the following sections.
 -fno-builtin  -fno-builtin-@var{function}  -fcond-mismatch @gol
 -ffreestanding  -fgimple  -fgnu-tm  -fgnu89-inline  -fhosted @gol
 -flax-vector-conversions  -fms-extensions @gol
--foffload=@var{arg}  -foffload-options=@var{arg} @gol
+-foffload=@var{arg}  -foffload-options=@var{arg} -foffload-memory=@var{arg} @gol
 -fopenacc  -fopenacc-dim=@var{geom} @gol
 -fopenmp  -fopenmp-simd @gol
 -fpermitted-flt-eval-methods=@var{standard} @gol
@@ -2694,6 +2694,20 @@ Typical command lines are
 -foffload-options=amdgcn-amdhsa=-march=gfx906 -foffload-options=-lm
 @end smallexample
 
+@item -foffload-memory=none
+@itemx -foffload-memory=unified
+@itemx -foffload-memory=pinned
+@opindex foffload-memory
+@cindex OpenMP offloading memory modes
+Enable a memory optimization mode to use with OpenMP.  The default behavior,
+@option{-foffload-memory=none}, is to do nothing special (unless enabled via
+a requires directive in the code).  @option{-foffload-memory=unified} is
+equivalent to @code{#pragma omp requires unified_shared_memory}.
+@option{-foffload-memory=pinned} forces all host memory to be pinned (this
+mode may require the user to increase the ulimit setting for locked memory).
+All translation units must select the same setting to avoid undefined
+behavior.
+
 @item -fopenacc
 @opindex fopenacc
 @cindex OpenACC accelerator programming
-- 
2.25.1



[PATCH 2/5] openmp: allow requires unified_shared_memory

2022-03-08 Thread Hafiz Abid Qadeer
From: Andrew Stubbs 

This is the front-end portion of the Unified Shared Memory implementation.
It removes the "sorry, unimplemented" message in C, C++, and Fortran, and sets
flag_offload_memory, but is otherwise inactive for now.

It also checks that -foffload-memory isn't set to an incompatible mode.
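
The compatibility check can be sketched in isolation as follows.  This is
only an illustration that mirrors the logic visible in the parser hunks
below; the enum is copied from patch 1/5, while the helper name
apply_requires_usm is hypothetical and the real code reports the problem
through error_at rather than stderr.

```c
#include <stdbool.h>
#include <stdio.h>

/* Mirrors the enum added to gcc/coretypes.h in patch 1/5.  */
enum offload_memory {
  OFFLOAD_MEMORY_NONE,
  OFFLOAD_MEMORY_UNIFIED,
  OFFLOAD_MEMORY_PINNED
};

/* Hypothetical helper (not part of the patch): apply a
   "requires unified_shared_memory" clause to the current mode.
   Returns false if the clause conflicts with the selected option.
   As in the parser hunks, the mode ends up "unified" either way.  */
bool
apply_requires_usm (enum offload_memory *mode)
{
  bool compatible = (*mode == OFFLOAD_MEMORY_UNIFIED
                     || *mode == OFFLOAD_MEMORY_NONE);
  if (!compatible)
    fprintf (stderr, "unified_shared_memory is incompatible with the "
                     "selected -foffload-memory option\n");
  *mode = OFFLOAD_MEMORY_UNIFIED;
  return compatible;
}
```

With the mode initially "none" or "unified" the clause is accepted; with
"pinned" it is diagnosed.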

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_requires): Allow "requires
  unified_shared_memory".

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_requires): Allow "requires
unified_shared_memory".

gcc/fortran/ChangeLog:

* openmp.cc (gfc_match_omp_requires): Allow "requires
unified_shared_memory".

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/usm-1.c: New test.
* gfortran.dg/gomp/usm-1.f90: New test.
---
 gcc/c/c-parser.cc                        | 13 -
 gcc/cp/parser.cc                         | 13 -
 gcc/fortran/openmp.cc                    | 10 +-
 gcc/testsuite/c-c++-common/gomp/usm-1.c  |  4
 gcc/testsuite/gfortran.dg/gomp/usm-1.f90 |  6 ++
 5 files changed, 43 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/usm-1.c
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/usm-1.f90

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 84deac04c44..dc834158d1c 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -22542,7 +22542,16 @@ c_parser_omp_requires (c_parser *parser)
  if (!strcmp (p, "unified_address"))
this_req = OMP_REQUIRES_UNIFIED_ADDRESS;
  else if (!strcmp (p, "unified_shared_memory"))
+ {
this_req = OMP_REQUIRES_UNIFIED_SHARED_MEMORY;
+
+   if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
+   && flag_offload_memory != OFFLOAD_MEMORY_NONE)
+ error_at (cloc,
+   "unified_shared_memory is incompatible with the "
+   "selected -foffload-memory option");
+   flag_offload_memory = OFFLOAD_MEMORY_UNIFIED;
+ }
  else if (!strcmp (p, "dynamic_allocators"))
this_req = OMP_REQUIRES_DYNAMIC_ALLOCATORS;
  else if (!strcmp (p, "reverse_offload"))
@@ -22609,7 +22618,9 @@ c_parser_omp_requires (c_parser *parser)
  c_parser_skip_to_pragma_eol (parser, false);
  return;
}
- if (p && this_req != OMP_REQUIRES_DYNAMIC_ALLOCATORS)
+ if (p
+ && this_req != OMP_REQUIRES_DYNAMIC_ALLOCATORS
+ && this_req != OMP_REQUIRES_UNIFIED_SHARED_MEMORY)
sorry_at (cloc, "%qs clause on %<requires%> directive not "
"supported yet", p);
  if (p)
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 03d99aba13e..ba263152aaf 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -46464,7 +46464,16 @@ cp_parser_omp_requires (cp_parser *parser, cp_token *pragma_tok)
  if (!strcmp (p, "unified_address"))
this_req = OMP_REQUIRES_UNIFIED_ADDRESS;
  else if (!strcmp (p, "unified_shared_memory"))
+ {
this_req = OMP_REQUIRES_UNIFIED_SHARED_MEMORY;
+
+   if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
+   && flag_offload_memory != OFFLOAD_MEMORY_NONE)
+ error_at (cloc,
+   "unified_shared_memory is incompatible with the "
+   "selected -foffload-memory option");
+   flag_offload_memory = OFFLOAD_MEMORY_UNIFIED;
+ }
  else if (!strcmp (p, "dynamic_allocators"))
this_req = OMP_REQUIRES_DYNAMIC_ALLOCATORS;
  else if (!strcmp (p, "reverse_offload"))
@@ -46537,7 +46546,9 @@ cp_parser_omp_requires (cp_parser *parser, cp_token *pragma_tok)
  cp_parser_skip_to_pragma_eol (parser, pragma_tok);
  return false;
}
- if (p && this_req != OMP_REQUIRES_DYNAMIC_ALLOCATORS)
+ if (p
+ && this_req != OMP_REQUIRES_DYNAMIC_ALLOCATORS
+ && this_req != OMP_REQUIRES_UNIFIED_SHARED_MEMORY)
sorry_at (cloc, "%qs clause on %<requires%> directive not "
"supported yet", p);
  if (p)
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 16cd03a3d67..1f434857719 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic.h"
 #include "gomp-constants.h"
 #include "target-memory.h"  /* For gfc_encode_character.  */
+#include "options.h"
 
 /* Match an end of OpenMP directive.  End of OpenMP directive is optional
whitespace, followed by '\n' or comment '!'.  */
@@ -5373,6 +5374,12 @@ gfc_match_omp_requires (void)
  requires_clause = OMP_REQ_UNIFIED_SHARED_MEMORY;
  if (requires_clauses & OMP_REQ_UNIFIED_SHARED_MEMORY)
goto duplicate_clause;
+
+ if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
+ && flag_offload_memory != OFFLOAD_M

[PATCH 3/5] openmp, nvptx: ompx_unified_shared_mem_alloc

2022-03-08 Thread Hafiz Abid Qadeer
From: Andrew Stubbs 

This adds support for using CUDA Managed Memory with omp_alloc.  It will be
used as the underpinnings for "requires unified_shared_memory" in a later
patch.

There are two new predefined allocators, ompx_unified_shared_mem_alloc and
ompx_host_mem_alloc, plus corresponding memory spaces, which can be used to
allocate memory in the "managed" space and explicitly on the host (it is
intended that "malloc" will be intercepted by the compiler).

The nvptx plugin is modified to make the necessary CUDA calls, and libgomp
is modified to switch to shared-memory mode for USM allocated mappings.
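
As a rough illustration of the "shared-memory mode" idea: when a host
pointer already lives in managed memory, the mapping step can reuse the same
address on the device and skip the copy.  Everything in this sketch (the
range tracker, the struct and the sketch_* names) is a hypothetical stand-in
and not libgomp's gomp_map_vars_internal; the real query goes through the
plugin's GOMP_OFFLOAD_is_usm_ptr hook.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the plugin's USM-pointer query: a single
   tracked address range instead of asking the CUDA driver.  */
struct usm_range { uintptr_t start; size_t len; };
static struct usm_range tracked_usm = { 0, 0 };

void
sketch_track_usm (void *start, size_t len)
{
  tracked_usm.start = (uintptr_t) start;
  tracked_usm.len = len;
}

bool
sketch_is_usm_ptr (const void *p)
{
  uintptr_t a = (uintptr_t) p;
  return tracked_usm.len != 0
         && a >= tracked_usm.start
         && a - tracked_usm.start < tracked_usm.len;
}

struct sketch_mapping { void *host; void *device; bool copied; };

/* DEVICE_BUF stands in for separately allocated device memory.  */
struct sketch_mapping
sketch_map (void *host, void *device_buf)
{
  struct sketch_mapping m = { host, NULL, false };
  if (sketch_is_usm_ptr (host))
    m.device = host;           /* Shared: same address, no copy.  */
  else
    {
      m.device = device_buf;   /* Would be followed by a host-to-device copy.  */
      m.copied = true;
    }
  return m;
}
```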

libgomp/ChangeLog:

* allocator.c (omp_max_predefined_alloc): Update.
(omp_aligned_alloc): Don't fallback ompx_host_mem_alloc.
(omp_aligned_calloc): Likewise.
(omp_realloc): Likewise.
* config/linux/allocator.c (linux_memspace_alloc): Handle USM.
(linux_memspace_calloc): Handle USM.
(linux_memspace_free): Handle USM.
(linux_memspace_realloc): Handle USM.
* config/nvptx/allocator.c (nvptx_memspace_alloc): Reject
ompx_host_mem_alloc.
(nvptx_memspace_calloc): Likewise.
(nvptx_memspace_realloc): Likewise.
* libgomp-plugin.h (GOMP_OFFLOAD_usm_alloc): New prototype.
(GOMP_OFFLOAD_usm_free): New prototype.
(GOMP_OFFLOAD_is_usm_ptr): New prototype.
* libgomp.h (gomp_usm_alloc): New prototype.
(gomp_usm_free): New prototype.
(gomp_is_usm_ptr): New prototype.
(struct gomp_device_descr): Add USM functions.
* omp.h.in (omp_memspace_handle_t): Add ompx_unified_shared_mem_space
and ompx_host_mem_space.
(omp_allocator_handle_t): Add ompx_unified_shared_mem_alloc and
ompx_host_mem_alloc.
* omp_lib.f90.in: Likewise.
* plugin/plugin-nvptx.c (nvptx_alloc): Add "usm" parameter.
Call cuMemAllocManaged as appropriate.
(GOMP_OFFLOAD_alloc): Move internals to ...
(GOMP_OFFLOAD_alloc_1): ... this, and add usm parameter.
(GOMP_OFFLOAD_usm_alloc): New function.
(GOMP_OFFLOAD_usm_free): New function.
(GOMP_OFFLOAD_is_usm_ptr): New function.
* target.c (gomp_map_vars_internal): Add USM support.
(gomp_usm_alloc): New function.
(gomp_usm_free): New function.
(gomp_load_plugin_for_device): New function.
* testsuite/libgomp.c/usm-1.c: New test.
* testsuite/libgomp.c/usm-2.c: New test.
* testsuite/libgomp.c/usm-3.c: New test.
* testsuite/libgomp.c/usm-4.c: New test.
* testsuite/libgomp.c/usm-5.c: New test.
---
 libgomp/allocator.c                 | 13 --
 libgomp/config/linux/allocator.c    | 48
 libgomp/config/nvptx/allocator.c    |  6 +++
 libgomp/libgomp-plugin.h            |  3 ++
 libgomp/libgomp.h                   |  6 +++
 libgomp/omp.h.in                    |  4 ++
 libgomp/omp_lib.f90.in              |  8
 libgomp/plugin/plugin-nvptx.c       | 45 ---
 libgomp/target.c                    | 70 +
 libgomp/testsuite/libgomp.c/usm-1.c | 24 ++
 libgomp/testsuite/libgomp.c/usm-2.c | 32 +
 libgomp/testsuite/libgomp.c/usm-3.c | 35 +++
 libgomp/testsuite/libgomp.c/usm-4.c | 36 +++
 libgomp/testsuite/libgomp.c/usm-5.c | 28
 14 files changed, 330 insertions(+), 28 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/usm-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/usm-5.c

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index 000ccc2dd9c..18045dbe0c4 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -32,7 +32,7 @@
 #include 
 #include 
 
-#define omp_max_predefined_alloc ompx_pinned_mem_alloc
+#define omp_max_predefined_alloc ompx_host_mem_alloc
 
 /* These macros may be overridden in config//allocator.c.  */
 #ifndef MEMSPACE_ALLOC
@@ -68,6 +68,8 @@ static const omp_memspace_handle_t predefined_alloc_mapping[] = {
   omp_low_lat_mem_space,   /* omp_pteam_mem_alloc. */
   omp_low_lat_mem_space,   /* omp_thread_mem_alloc. */
   omp_default_mem_space,   /* ompx_pinned_mem_alloc. */
+  ompx_unified_shared_mem_space,  /* ompx_unified_shared_mem_alloc. */
+  ompx_host_mem_space, /* ompx_host_mem_alloc.  */
 };
 
 struct omp_allocator_data
@@ -367,7 +369,8 @@ fail:
   int fallback = (allocator_data
  ? allocator_data->fallback
  : (allocator == omp_default_mem_alloc
-|| allocator == ompx_pinned_mem_alloc)
+|| allocator == ompx_pinned_mem_alloc
+|| allocator == ompx_host_mem_alloc)
  ? omp_atv_null_fb
  : omp_atv_default_mem_fb);
   switch (fallback)
@@ -597,7 +600,8 @@ fail:
   

[PATCH 4/5] openmp: Use libgomp memory allocation functions with unified shared memory.

2022-03-08 Thread Hafiz Abid Qadeer
This patch changes calls to malloc/free/calloc/realloc and operator new to
the memory allocation functions in libgomp with
allocator=ompx_unified_shared_mem_alloc.  This helps existing code benefit
from unified shared memory.  libgomp does the correct thing with all
the mapping constructs, and there are no memory copies if the pointer points
to unified shared memory.

We only replace the replaceable operator new, not class-member or placement
new.
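
To make the effect concrete, here is a source-level picture of the rewrite.
The pass itself works on GIMPLE, so this is only an illustration:
sketch_omp_alloc is a hypothetical stand-in for libgomp's omp_alloc (so the
example builds without the patched library), and the value 10 is the
allocator handle that the patch hard-codes for
ompx_unified_shared_mem_alloc.

```c
#include <stdint.h>
#include <stdlib.h>

/* The patch hard-codes the allocator handle value 10 for
   ompx_unified_shared_mem_alloc.  */
#define SKETCH_USM_ALLOCATOR ((uintptr_t) 10)

/* Records the allocator passed to the stand-in, for inspection.  */
uintptr_t last_allocator_seen;

/* Hypothetical stand-in for omp_alloc: it only records the requested
   allocator and falls back to plain malloc.  */
void *
sketch_omp_alloc (size_t size, uintptr_t allocator)
{
  last_allocator_seen = allocator;
  return malloc (size);
}

/* What the user wrote.  */
int *
make_buffer_before (size_t n)
{
  return malloc (n * sizeof (int));
}

/* What the call conceptually becomes once USM is enabled.  */
int *
make_buffer_after (size_t n)
{
  return sketch_omp_alloc (n * sizeof (int), SKETCH_USM_ALLOCATOR);
}
```

The size argument is passed through unchanged; only the callee and the extra
allocator argument differ.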

gcc/ChangeLog:

* omp-low.cc (usm_transform): New function.
(make_pass_usm_transform): Likewise.
(class pass_usm_transform): New.
* passes.def: Add pass_usm_transform.
* tree-pass.h (make_pass_usm_transform): New declaration.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/usm-2.c: New test.
* c-c++-common/gomp/usm-3.c: New test.
* g++.dg/gomp/usm-1.C: New test.
* g++.dg/gomp/usm-2.C: New test.
* g++.dg/gomp/usm-3.C: New test.
* gfortran.dg/gomp/usm-2.f90: New test.
* gfortran.dg/gomp/usm-3.f90: New test.

libgomp/ChangeLog:

* testsuite/libgomp.c/usm-6.c: New test.
* testsuite/libgomp.c++/usm-1.C: Likewise.
---
 gcc/omp-low.cc                           | 152 +++
 gcc/passes.def                           |   1 +
 gcc/testsuite/c-c++-common/gomp/usm-2.c  |  34 +
 gcc/testsuite/c-c++-common/gomp/usm-3.c  |  32 +
 gcc/testsuite/g++.dg/gomp/usm-1.C        |  32 +
 gcc/testsuite/g++.dg/gomp/usm-2.C        |  30 +
 gcc/testsuite/g++.dg/gomp/usm-3.C        |  38 ++
 gcc/testsuite/gfortran.dg/gomp/usm-2.f90 |  16 +++
 gcc/testsuite/gfortran.dg/gomp/usm-3.f90 |  13 ++
 gcc/tree-pass.h                          |   1 +
 libgomp/testsuite/libgomp.c++/usm-1.C    |  54
 libgomp/testsuite/libgomp.c/usm-6.c      |  70 +++
 12 files changed, 473 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/usm-2.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/usm-3.c
 create mode 100644 gcc/testsuite/g++.dg/gomp/usm-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/usm-2.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/usm-3.C
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/usm-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/usm-3.f90
 create mode 100644 libgomp/testsuite/libgomp.c++/usm-1.C
 create mode 100644 libgomp/testsuite/libgomp.c/usm-6.c

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 5ce3a50709a..ec08d59f676 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14849,6 +14849,158 @@ make_pass_diagnose_omp_blocks (gcc::context *ctxt)
 {
   return new pass_diagnose_omp_blocks (ctxt);
 }
+
+/* Provide transformation required for using unified shared memory
+   by replacing calls to standard memory allocation functions with
+   function provided by the libgomp.  */
+
+static tree
+usm_transform (gimple_stmt_iterator *gsi_p, bool *,
+  struct walk_stmt_info *wi)
+{
+  gimple *stmt = gsi_stmt (*gsi_p);
+  /* ompx_unified_shared_mem_alloc is 10.  */
+  const unsigned int unified_shared_mem_alloc = 10;
+
+  switch (gimple_code (stmt))
+{
+case GIMPLE_CALL:
+  {
+   gcall *gs = as_a <gcall *> (stmt);
+   tree fndecl = gimple_call_fndecl (gs);
+   if (fndecl)
+ {
+   tree allocator = build_int_cst (pointer_sized_int_node,
+   unified_shared_mem_alloc);
+   const char *name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
+   if ((strcmp (name, "malloc") == 0)
+|| (fndecl_built_in_p (fndecl, BUILT_IN_NORMAL)
+&& DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MALLOC)
+|| DECL_IS_REPLACEABLE_OPERATOR_NEW_P (fndecl))
+ {
+ tree omp_alloc_type
+   = build_function_type_list (ptr_type_node, size_type_node,
+   pointer_sized_int_node,
+   NULL_TREE);
+   tree repl = build_fn_decl ("omp_alloc", omp_alloc_type);
+   tree size = gimple_call_arg (gs, 0);
+   gimple *g = gimple_build_call (repl, 2, size, allocator);
+   gimple_call_set_lhs (g, gimple_call_lhs (gs));
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi_p, g, true);
+ }
+   else if ((strcmp (name, "calloc") == 0)
+ || (fndecl_built_in_p (fndecl, BUILT_IN_NORMAL)
+ && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CALLOC))
+ {
+   tree omp_calloc_type
+ = build_function_type_list (ptr_type_node, size_type_node,
+ size_type_node,
+ pointer_sized_int_node,
+ NULL_TREE);
+   tree repl = build_fn_decl ("omp_calloc", omp_calloc_type);
+

[PATCH 5/5] openmp: -foffload-memory=pinned

2022-03-08 Thread Hafiz Abid Qadeer
From: Andrew Stubbs 

Implement the -foffload-memory=pinned option such that libgomp is
instructed to enable fully-pinned memory at start-up.  The option is
intended to provide a performance boost to certain offload programs without
modifying the code.

This feature only works on Linux, at present, and simply calls mlockall to
enable always-on memory pinning.  It requires that the locked-memory ulimit
is set high enough to accommodate all of the program's memory usage.
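
For readers unfamiliar with mlockall, a minimal stand-alone sketch of the
runtime side might look like this.  It is not the libgomp implementation:
the helper name is hypothetical, and the flag combination
MCL_CURRENT | MCL_FUTURE is an assumption about what "always-on" pinning
needs.  Printing RLIMIT_MEMLOCK is only there to show which ulimit matters.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper, not libgomp code: try to pin all current and
   future pages of the process.  Returns 0 on success, -1 on failure.  */
int
sketch_enable_pinned_mode (void)
{
  struct rlimit lim;
  if (getrlimit (RLIMIT_MEMLOCK, &lim) == 0 && lim.rlim_cur != RLIM_INFINITY)
    fprintf (stderr, "locked-memory soft limit: %llu bytes\n",
             (unsigned long long) lim.rlim_cur);

  if (mlockall (MCL_CURRENT | MCL_FUTURE) != 0)
    {
      fprintf (stderr, "failed to pin all memory: %s (ulimit too low?)\n",
               strerror (errno));
      return -1;
    }
  return 0;
}
```

If the call fails, raising the limit with "ulimit -l" (or running with
suitable privileges) is the usual remedy.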

In this mode the ompx_pinned_mem_alloc allocator's own pinning is disabled,
as it is not needed and may conflict.

gcc/ChangeLog:

* omp-low.cc (omp_enable_pinned_mode): New function.
(execute_lower_omp): Call omp_enable_pinned_mode.

libgomp/ChangeLog:

* config/linux/allocator.c (always_pinned_mode): New variable.
(GOMP_enable_pinned_mode): New function.
(linux_memspace_alloc): Disable pinning when always_pinned_mode set.
(linux_memspace_calloc): Likewise.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.map (GOMP_5.1.1): New version space with
GOMP_enable_pinned_mode.
* testsuite/libgomp.c/alloc-pinned-7.c: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/alloc-pinned-1.c: New test.
---
 gcc/omp-low.cc                               | 68 +++
 .../c-c++-common/gomp/alloc-pinned-1.c       | 28
 libgomp/config/linux/allocator.c             | 26 +++
 libgomp/libgomp.map                          |  5 ++
 libgomp/testsuite/libgomp.c/alloc-pinned-7.c | 66 ++
 5 files changed, 193 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-7.c

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index ec08d59f676..ce21b3bd6f8 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14441,6 +14441,70 @@ lower_omp (gimple_seq *body, omp_context *ctx)
   input_location = saved_location;
 }
 
+/* Emit a constructor function to enable -foffload-memory=pinned
+   at runtime.  Libgomp handles the OS mode setting, but we need to trigger
+   it by calling GOMP_enable_pinned_mode before the program proper runs.  */
+
+static void
+omp_enable_pinned_mode ()
+{
+  static bool visited = false;
+  if (visited)
+return;
+  visited = true;
+
+  /* Create a new function like this:
+
+   static void __attribute__((constructor))
+   __set_pinned_mode ()
+   {
+GOMP_enable_pinned_mode ();
+   }
+  */
+
+  tree name = get_identifier ("__set_pinned_mode");
+  tree voidfntype = build_function_type_list (void_type_node, NULL_TREE);
+  tree decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, name, voidfntype);
+
+  TREE_STATIC (decl) = 1;
+  TREE_USED (decl) = 1;
+  DECL_ARTIFICIAL (decl) = 1;
+  DECL_IGNORED_P (decl) = 0;
+  TREE_PUBLIC (decl) = 0;
+  DECL_UNINLINABLE (decl) = 1;
+  DECL_EXTERNAL (decl) = 0;
+  DECL_CONTEXT (decl) = NULL_TREE;
+  DECL_INITIAL (decl) = make_node (BLOCK);
+  BLOCK_SUPERCONTEXT (DECL_INITIAL (decl)) = decl;
+  DECL_STATIC_CONSTRUCTOR (decl) = 1;
+  DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("constructor"),
+ NULL_TREE, NULL_TREE);
+
+  tree t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE,
+  void_type_node);
+  DECL_ARTIFICIAL (t) = 1;
+  DECL_IGNORED_P (t) = 1;
+  DECL_CONTEXT (t) = decl;
+  DECL_RESULT (decl) = t;
+
+  push_struct_function (decl);
+  init_tree_ssa (cfun);
+
+  tree callname = get_identifier ("GOMP_enable_pinned_mode");
+  tree calldecl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, callname,
+ voidfntype);
+  gcall *call = gimple_build_call (calldecl, 0);
+
+  gimple_seq seq = NULL;
+  gimple_seq_add_stmt (&seq, call);
+  gimple_set_body (decl, gimple_build_bind (NULL_TREE, seq, NULL));
+
+  cfun->function_end_locus = UNKNOWN_LOCATION;
+  cfun->curr_properties |= PROP_gimple_any;
+  pop_cfun ();
+  cgraph_node::add_new_function (decl, true);
+}
+
 /* Main entry point.  */
 
 static unsigned int
@@ -14497,6 +14561,10 @@ execute_lower_omp (void)
   for (auto task_stmt : task_cpyfns)
 finalize_task_copyfn (task_stmt);
   task_cpyfns.release ();
+
+  if (flag_offload_memory == OFFLOAD_MEMORY_PINNED)
+omp_enable_pinned_mode ();
+
   return 0;
 }
 
diff --git a/gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c b/gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c
new file mode 100644
index 000..e0e08019bff
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-additional-options "-foffload-memory=pinned" } */
+/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */
+
+#if __cplusplus
+#define EXTERNC extern "C"
+#else
+#define EXTERNC
+#endif
+
+/* Intercept the libgomp initialization call to check it happens.  */
+
+int good = 0;
+
+EXTERNC void
+GOMP_enable_pinned_m

[Patch] Fortran: OpenMP/OpenACC avoid uninit access in size calc for mapping

2022-03-08 Thread Tobias Burnus

Hi Thomas & Jakub,

found when working on the deep-mapping patch* with OpenMP code
(and part of that patch) but it already shows up in an existing
OpenACC testcase. I think it makes sense to fix it already for GCC 12.

Problem: Also for unallocated allocatables, their size was
calculated - the 'if(desc.data == NULL)' check was only added
for pointers.
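
The idea of the fix, written out in C against a deliberately simplified
descriptor: the bounds are only read once the data pointer is known to be
non-NULL.  The struct and function names are hypothetical; gfortran's real
array descriptor has more fields and the actual change is in the tree
generation shown in the patch.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified rank-1 array descriptor.  */
struct sketch_desc {
  void *data;          /* NULL when unallocated or disassociated.  */
  ptrdiff_t lbound;
  ptrdiff_t ubound;
};

/* Size in bytes to map.  The bounds are undefined for an unallocated
   array, so they are only read when DATA is non-NULL.  */
size_t
sketch_map_size (const struct sketch_desc *d, size_t elem_size)
{
  if (d->data == NULL)
    return 0;
  ptrdiff_t extent = d->ubound - d->lbound + 1;
  return extent > 0 ? (size_t) extent * elem_size : 0;
}
```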

Result after the patch: When compiling with -O (which is the default
for goacc.exp), the warning now disappears. Thus, I now use '-O0'
and the previous "is uninitialized" is now "may be uninitialized".

Unrelated to the patch and the testcase, I added some
'allocate'**/'if(allocated())' to the testcase - as otherwise
uninit vars would be accessed. (Not relevant for the warning
or the patch - but I prefer no invalid code in testcases,
if it can be avoided.)

OK for mainline?

Tobias
* https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591144.html

** I am actually not sure whether 'acc update(b)' will map a previously
allocated variable - or whether it should. But that's
unrelated to this bug fix. See also: https://gcc.gnu.org/PR96668
for the re-mapping in OpenMP (works for arrays but not scalars).
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran: OpenMP/OpenACC avoid uninit access in size calc for mapping

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses, gfc_omp_finish_clause):
	Obtain size for mapping only if allocatable array is allocated.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/array-with-dt-1.f90: Run with -O0 and
	update dg-warning.
	* gfortran.dg/goacc/pr93464.f90: Likewise.

 gcc/fortran/trans-openmp.cc |  6 --
 gcc/testsuite/gfortran.dg/goacc/array-with-dt-1.f90 | 12 +---
 gcc/testsuite/gfortran.dg/goacc/pr93464.f90 |  8 
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 4d56a771349..fad76a4791f 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -1597,7 +1597,8 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p, bool openacc)
   tree size = create_tmp_var (gfc_array_index_type);
   tree elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));
   elemsz = fold_convert (gfc_array_index_type, elemsz);
-  if (GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER
+  if (GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_ALLOCATABLE
+	  || GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER
 	  || GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER_CONT)
 	{
 	  stmtblock_t cond_block;
@@ -3208,7 +3209,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 
 		  /* We have to check for n->sym->attr.dimension because
 			 of scalar coarrays.  */
-		  if (n->sym->attr.pointer && n->sym->attr.dimension)
+		  if ((n->sym->attr.pointer || n->sym->attr.allocatable)
+			  && n->sym->attr.dimension)
 			{
 			  stmtblock_t cond_block;
 			  tree size
diff --git a/gcc/testsuite/gfortran.dg/goacc/array-with-dt-1.f90 b/gcc/testsuite/gfortran.dg/goacc/array-with-dt-1.f90
index 136e42acd59..f6880238c89 100644
--- a/gcc/testsuite/gfortran.dg/goacc/array-with-dt-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/array-with-dt-1.f90
@@ -1,4 +1,4 @@
-! { dg-additional-options -Wuninitialized }
+! { dg-additional-options "-Wuninitialized -O0" }
 
 type t
integer, allocatable :: A(:,:)
@@ -8,9 +8,15 @@ type(t), allocatable :: b(:)
 ! { dg-note {'b' declared here} {} { target *-*-* } .-1 }
 
 !$acc update host(b)
-! { dg-warning {'b\.dim\[0\]\.ubound' is used uninitialized} {} { target *-*-* } .-1 }
-! { dg-warning {'b\.dim\[0\]\.lbound' is used uninitialized} {} { target *-*-* } .-2 }
+! { dg-warning {'b\.dim\[0\]\.ubound' may be used uninitialized} {} { target *-*-* } .-1 }
+! { dg-warning {'b\.dim\[0\]\.lbound' may be used uninitialized} {} { target *-*-* } .-2 }
+
+allocate(b(1))
+!$acc update host(b)
 !$acc update host(b(:))
+
+!$acc update host(b(1)%A)
+allocate(b(1)%A(1,1))
 !$acc update host(b(1)%A)
 !$acc update host(b(1)%A(:,:))
 end
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr93464.f90 b/gcc/testsuite/gfortran.dg/goacc/pr93464.f90
index c92f1d3d8b2..18531abdf77 100644
--- a/gcc/testsuite/gfortran.dg/goacc/pr93464.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/pr93464.f90
@@ -2,17 +2,17 @@
 !
 ! Contributed by G. Steinmetz
 
-! { dg-additional-options -Wuninitialized }
+! { dg-additional-options "-Wuninitialized -O0" }
 
 program p
character :: c(2) = 'a'
character, allocatable :: z(:)
! { dg-note {'z' declared here} {} { target *-*-* } .-1 }
!$acc parallel
-   ! { dg-warning {'z\.dim\[0\]\.ubound' is used uninitialized} {} { target *-*-* } .-1 }
-   ! { dg-warning {'z\.dim\[0\]\.lbound' is used uninitialized} {} { targ

[Patch] Fortran: Fix CLASS handling in SIZEOF intrinsic

2022-03-08 Thread Tobias Burnus

Fix SIZEOF handling.

I have to admit that I do understand what the current code does,
but do not understand what the previous code did. However, it
still passes the testsuite - and also some code which did ICE
now compiles :-)

While writing the testcase, I did find two issues:
* Passing a CLASS to TYPE(*),dimension(..) will have an
  elem_len of the declared type and not of the dynamic type.
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104844
* var%class_array(1,1)%array will have size(...) == 0
  instead of size(... % array).
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104845

OK for mainline? (Unless you want to hold off until GCC 13)

Tobias

Fortran: Fix CLASS handling in SIZEOF intrinsic

gcc/fortran/ChangeLog:

	* trans-intrinsic.cc (gfc_conv_intrinsic_sizeof): Fix CLASS handling.

gcc/testsuite/ChangeLog:

	* gfortran.dg/sizeof_6.f90: New test.

 gcc/fortran/trans-intrinsic.cc |  16 +-
 gcc/testsuite/gfortran.dg/sizeof_6.f90 | 437 +
 2 files changed, 446 insertions(+), 7 deletions(-)

diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index e680de1dbd1..2249723540d 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -8099,12 +8099,14 @@ gfc_conv_intrinsic_sizeof (gfc_se *se, gfc_expr *expr)
 	 class object.  The class object may be a non-pointer object, e.g.
 	 located on the stack, or a memory location pointed to, e.g. a
 	 parameter, i.e., an indirect_ref.  */
-  if (arg->rank < 0
-	  || (arg->rank > 0 && !VAR_P (argse.expr)
-	  && ((INDIRECT_REF_P (TREE_OPERAND (argse.expr, 0))
-		   && GFC_DECL_CLASS (TREE_OPERAND (
-	TREE_OPERAND (argse.expr, 0), 0)))
-		  || GFC_DECL_CLASS (TREE_OPERAND (argse.expr, 0)
+  if (POINTER_TYPE_P (TREE_TYPE (argse.expr))
+	  && GFC_CLASS_TYPE_P (TREE_TYPE (TREE_TYPE (argse.expr
+	byte_size
+	  = gfc_class_vtab_size_get (build_fold_indirect_ref (argse.expr));
+  else if (GFC_CLASS_TYPE_P (TREE_TYPE (argse.expr)))
+	byte_size = gfc_class_vtab_size_get (argse.expr);
+  else if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (argse.expr))
+	   && TREE_CODE (argse.expr) == COMPONENT_REF)
 	byte_size = gfc_class_vtab_size_get (TREE_OPERAND (argse.expr, 0));
   else if (arg->rank > 0
 	   || (arg->rank == 0
@@ -8114,7 +8116,7 @@ gfc_conv_intrinsic_sizeof (gfc_se *se, gfc_expr *expr)
 	byte_size = gfc_class_vtab_size_get (
 	  GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
   else
-	byte_size = gfc_class_vtab_size_get (argse.expr);
+	gcc_unreachable ();
 }
   else
 {
diff --git a/gcc/testsuite/gfortran.dg/sizeof_6.f90 b/gcc/testsuite/gfortran.dg/sizeof_6.f90
new file mode 100644
index 00000000000..21b57350dc3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/sizeof_6.f90
@@ -0,0 +1,437 @@
+! { dg-do run }
+!
+! Check that sizeof is properly handled
+!
+use iso_c_binding
+implicit none (type, external)
+
+type t
+  integer, allocatable :: a(:,:,:), aa
+  integer :: b(5), c
+end type t
+
+type t2
+   class(t), allocatable :: d(:,:), e
+end type t2
+
+type, extends(t2) :: t2e
+  integer :: q(7), z
+end type t2e
+
+type t3
+   class(t2), allocatable :: ct2, ct2a(:,:,:)
+   type(t2), allocatable :: tt2, tt2a(:,:,:)
+   integer, allocatable :: ii, iia(:,:,:)
+end type t3
+
+type(t3) :: var, vara(5)
+type(t3), allocatable :: avar, avara(:)
+class(t3), allocatable :: cvar, cvara(:)
+type(t2), allocatable :: ax, axa(:,:,:)
+class(t2), allocatable :: cx, cxa(:,:,:)
+
+integer(c_size_t) :: n
+
+allocate (t3 :: avar, avara(5))
+allocate (t3 :: cvar, cvara(5))
+
+n = sizeof(var)
+
+! Assume alignment plays no tricks and system has 32bit/64bit.
+! If needed change
+if (n /= 376 .and. n /= 200) error stop
+
+if (n /= sizeof(avar)) error stop
+if (n /= sizeof(cvar)) error stop
+if (n * 5 /= sizeof(vara)) error stop
+if (n * 5 /= sizeof(avara)) error stop
+if (n * 5 /= sizeof(cvara)) error stop
+
+if (n /= sz_ar(var,var,var,var)) error stop
+if (n /= sz_s(var,var)) error stop
+if (n /= sz_t3(var,var,var,var)) error stop
+if (n /= sz_ar(avar,avar,avar,avar)) error stop
+if (n /= sz_s(avar,avar)) error stop
+if (n /= sz_t3(avar,avar,avar,avar)) error stop
+if (n /= sz_t3_at(avar,avar)) error stop
+if (n /= sz_ar(cvar,cvar,cvar,cvar)) error stop
+if (n /= sz_s(cvar,cvar)) error stop
+if (n /= sz_t3(cvar,cvar,cvar,cvar)) error stop
+if (n /= sz_t3_a(cvar,cvar)) error stop
+
+if (n*5 /= sz_ar(vara,vara,vara,vara)) error stop
+if (n*5 /= sz_r1(vara,vara,vara,vara)) error stop
+if (n*5 /= sz_t3(vara,vara,vara,vara)) error stop
+if (n*5 /= sz_ar(avara,avara,avara,avara)) error stop
+if (n*5 /= sz_r1(avara,avara,avara,avara)) error stop
+if (n*5 /= sz_t3(avara,avara,avara,avara))

Re: [Patch] Fortran: Fix gfc_maybe_dereference_var [PR104430]

2022-03-08 Thread Tobias Burnus

Hi Harald,

On 07.03.22 20:58, Harald Anlauf wrote:

> I think there are other PRs which profit from this fix.
> Can you please have a look at PR99585, and in particular
> the link in comment#0?  ;-)


Good pointer – the testcase looks nearly identical and it is indeed fixed.

I have additionally included it in the same testcase file. (See the
attached patch for the commit.)

Thanks,

Tobias

PS: Could I ask you to review my two pending patches (NULL and SIZEOF)? ;-)

PPS: I have lost track a bit while working on other things. Are there
patches pending review?

PPPS: I think someone still has to deal with the approved and pending
patches by José ...
-
commit c0134b7383992aab5c1a91440dbdd8fbb747169c
Author: Tobias Burnus 
Date:   Mon Mar 7 22:11:33 2022 +0100

Fortran: Fix gfc_maybe_dereference_var [PR104430][PR99585]

PR fortran/99585
PR fortran/104430

gcc/fortran/ChangeLog:

* trans-expr.cc (conv_parent_component_references): Fix comment;
simplify comparison.
(gfc_maybe_dereference_var): Avoid dereferencing a nonpointer.

gcc/testsuite/ChangeLog:

* gfortran.dg/class_result_10.f90: New test.

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index c9d9a916c28..71d037101d4 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -2805,9 +2805,9 @@ conv_parent_component_references (gfc_se * se, gfc_ref * ref)
   dt = ref->u.c.sym;
   c = ref->u.c.component;
 
-  /* Return if the component is in the parent type.  */
+  /* Return if the component is in this type, i.e. not in the parent type.  */
   for (cmp = dt->components; cmp; cmp = cmp->next)
-if (strcmp (c->name, cmp->name) == 0)
+if (c == cmp)
   return;
 
   /* Build a gfc_ref to recursively call gfc_conv_component_ref.  */
@@ -2867,6 +2867,8 @@ tree
 gfc_maybe_dereference_var (gfc_symbol *sym, tree var, bool descriptor_only_p,
 			   bool is_classarray)
 {
+  if (!POINTER_TYPE_P (TREE_TYPE (var)))
+return var;
   if (is_CFI_desc (sym, NULL))
 return build_fold_indirect_ref_loc (input_location, var);
 
diff --git a/gcc/testsuite/gfortran.dg/class_result_10.f90 b/gcc/testsuite/gfortran.dg/class_result_10.f90
new file mode 100644
index 00000000000..a4d29ab9c1d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_result_10.f90
@@ -0,0 +1,52 @@
+! { dg-do run }
+
+
+! PR fortran/99585
+
+module m2
+  type t
+ class(*), pointer :: bar(:)
+  end type
+  type t2
+ class(t), allocatable :: my(:)
+  end type t2
+contains
+  function f (x, y) result(z)
+class(t) :: x(:)
+class(t) :: y(size(x(1)%bar))
+type(t)  :: z(size(x(1)%bar))
+  end
+  function g (x) result(z)
+class(t) :: x(:)
+type(t)  :: z(size(x(1)%bar))
+  end
+  subroutine s ()
+class(t2), allocatable :: a(:), b(:), c(:), d(:)
+class(t2), pointer :: p(:)
+c(1)%my = f (a(1)%my, b(1)%my)
+d(1)%my = g (p(1)%my)
+  end
+end
+
+! Contributed by  G. Steinmetz:
+! PR fortran/104430
+
+module m
+   type t
+  integer :: a
+   end type
+contains
+   function f(x) result(z)
+  class(t) :: x(:)
+  type(t) :: z(size(x%a))
+  z%a = 42
+   end
+end
+program p
+   use m
+   class(t), allocatable :: y(:), z(:)
+   allocate (y(32))
+   z = f(y)
+   if (size(z) /= 32) stop 1
+   if (any (z%a /= 42)) stop 2
+end


[PATCH, committed] PR fortran/104811 - maxloc/minloc cannot accept character arguments without `dim` optional argument

2022-03-08 Thread Harald Anlauf via Fortran
Dear all,

frontend-optimization of MINLOC/MAXLOC tries to generate code for rank-1
arrays that may be expanded inline later and optimized.  Except when the
argument is a character array...

As there is even a comment in trans-intrinsic.cc that we will call a
library function for character arguments anyway, we had better punt here.
The attached obvious patch does this and was pre-approved by Thomas in
the PR.
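
To illustrate the punt described above, here is a minimal, self-contained
C++ sketch of the guard. It is not the gfortran code: Expr, Call and
can_frontend_optimize are hypothetical stand-ins for gfc_expr and the check
in optimize_minmaxloc, and only the BT_CHARACTER test mirrors the actual
one-line change.

```cpp
// Hypothetical, simplified stand-ins for gfortran's expression types;
// the real gfc_expr carries many more fields.
enum BasicType { BT_INTEGER, BT_REAL, BT_CHARACTER };

struct Expr {
  BasicType type;  // models expr->ts.type
  int rank;        // models expr->rank
};

struct Call {
  int rank;           // rank of the MINLOC/MAXLOC result
  const Expr *array;  // first actual argument, may be null
};

// Returns true when the call may be rewritten for later inline expansion,
// false when the pass should leave it alone so that the library function
// is called instead.
constexpr bool can_frontend_optimize(const Call &fn) {
  return !(fn.rank != 1
           || fn.array == nullptr
           || fn.array->type == BT_CHARACTER  // punt on character arrays
           || fn.array->rank != 1);
}
```

With this model, a rank-1 integer array passes the check, while a rank-1
character array is rejected and would fall through to the library call.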

Regtested on x86_64-pc-linux-gnu and pushed to mainline as

https://gcc.gnu.org/g:e3e369dad6cbecb1b490b3f3b154c600fba5a6f3

As this is a wrong-code issue, I'd like to backport this to 11-branch.

Thanks,
Harald

From e3e369dad6cbecb1b490b3f3b154c600fba5a6f3 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 8 Mar 2022 21:47:04 +0100
Subject: [PATCH] Fortran: do not frontend-optimize MINLOC/MAXLOC for character
 arrays

gcc/fortran/ChangeLog:

	PR fortran/104811
	* frontend-passes.cc (optimize_minmaxloc): Do not attempt
	frontend-optimization of MINLOC/MAXLOC for character arrays, as
	there is no suitable code yet for inline expansion.

gcc/testsuite/ChangeLog:

	PR fortran/104811
	* gfortran.dg/minmaxloc_16.f90: New test.
---
 gcc/fortran/frontend-passes.cc |  1 +
 gcc/testsuite/gfortran.dg/minmaxloc_16.f90 | 14 ++
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_16.f90

diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc
index 4033f27df99..5eba6345145 100644
--- a/gcc/fortran/frontend-passes.cc
+++ b/gcc/fortran/frontend-passes.cc
@@ -2276,6 +2276,7 @@ optimize_minmaxloc (gfc_expr **e)
   if (fn->rank != 1
   || fn->value.function.actual == NULL
   || fn->value.function.actual->expr == NULL
+  || fn->value.function.actual->expr->ts.type == BT_CHARACTER
   || fn->value.function.actual->expr->rank != 1)
 return;

diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_16.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_16.f90
new file mode 100644
index 00000000000..099248df2e3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_16.f90
@@ -0,0 +1,14 @@
+! { dg-do run }
+! { dg-options "-fdump-tree-original" }
+! PR fortran/104811
+! Frontend-optimization mis-optimized minloc/maxloc of character arrays
+
+program p
+  character(1) :: str(3)
+  str = ["a", "c", "a"]
+  if (any (maxloc (str) /= 2)) stop 1
+  if (minloc (str,dim=1) /= 1) stop 2
+end
+
+! { dg-final { scan-tree-dump-times "_gfortran_maxloc0_4_s1" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_minloc2_4_s1" 1 "original" } }
--
2.34.1



Re: [Patch] Fortran: Fix gfc_conv_gfc_desc_to_cfi_desc with NULL [PR104126]

2022-03-08 Thread Harald Anlauf via Fortran

Hi Tobias,

On 07.03.22 at 15:16, Tobias Burnus wrote:

> Pre-remark: Related to NULL, there are some accepts-invalid issues not
> addressed in this patch. See https://gcc.gnu.org/PR104819
>
> This patch fixes an ICE (12 regression) with NULL() that has no MOLD
> argument.


the patch does fix the ICE.  But given your short pre-remark:
are you saying that the testcase is invalid, and with the patch
we silently accept it now?

(The testcase compiles with Intel, but triggers a funny bug in
crayftn, which made me read 16.9.144 to learn more about the
tricks of NULL.  But I tend to think this case is valid.)


> OK for mainline?


LGTM.

Thanks for the patch!

Harald


Tobias




Re: [Patch] Fortran: Fix gfc_maybe_dereference_var [PR104430]

2022-03-08 Thread Harald Anlauf via Fortran

Hi Tobias,

On 08.03.22 at 21:19, Tobias Burnus wrote:

> PS: Could I ask you to review my two pending patches (NULL and SIZEOF)? ;-)


I just approved the former one, but rather hope that Paul or Mikael
or somebody else would jump in on the other one.


> PPS: I have lost track a bit while working on other things. Are there
> patches pending review?
>
> PPPS: I think someone still has to deal with the approved and pending
> patches by José ...


What prevented them from being processed after approval?

Cheers,
Harald


Re: [Patch] Fortran: Fix gfc_conv_gfc_desc_to_cfi_desc with NULL [PR104126]

2022-03-08 Thread Tobias Burnus

Hi Harald,

On 08.03.22 22:44, Harald Anlauf wrote:

> On 07.03.22 at 15:16, Tobias Burnus wrote:
>
>> Pre-remark: Related to NULL, there are some accepts-invalid issues not
>> addressed in this patch. See https://gcc.gnu.org/PR104819
>>
>> This patch fixes an ICE (12 regression) with NULL() that has no MOLD
>> argument.
>
> the patch does fix the ICE.  But given your short pre-remark:
> are you saying that the testcase is invalid, and with the patch
> we silently accept it now?


Sorry for being confusing. I also believe the testcase of the just
committed patch is valid Fortran.

However, when fixing this PR, I was looking at the spec – and saw that
GCC accepts invalid code using NULL(), which is not diagnosed. Those
issues are orthogonal to this patch, except that the accepts-invalid
issues also are about NULL().

Thanks for the review!

Tobias
