Re: [PATCH v3] libgfortran: Replace mutex with rwlock

2023-04-20 Thread Zhu, Lipeng via Fortran

Hi Bernhard,

Thanks for your questions and suggestions.
The rwlock could allow multiple threads to have concurrent read-only 
access to the cache/unit list, only a single writer is allowed.

Write lock will not be acquired until all read lock are released.
And I didn't change the mutex scope when refactor the code, only make a 
more fine-grained distinction for the read/write cache/unit list.


I complete the comment according to your template, I will insert the 
comment in the source code in next version patch with other refinement 
by your suggestions.

"
Now we did not get a unit in cache and unit list, so we need to create a
new unit, and update it to cache and unit list.
Prior to update the cache and list, we need to release all read locks,
and then immediately to acquire write lock, thus ensure the exclusive
update to the cache and unit list.
Either way, we will manipulate the cache and/or the unit list so we must
take a write lock now.
We don't take the write bit in *addition* to the read lock because:
1. It will needlessly complicate releasing the respective lock;
2. By separate the read/write lock, it will greatly reduce the
contention at the read part, while write part is not always necessary or
most unlikely once the unit hit in cache;
3. We try to balance the implementation complexity and the performance
gains that fit into current cases we observed.
"

Best Regards,
Zhu, Lipeng

On 1/1/1970 8:00 AM, Bernhard Reutner-Fischer wrote:

On 19 April 2023 09:06:28 CEST, Lipeng Zhu via Fortran  
wrote:

This patch try to introduce the rwlock and split the read/write to
unit_root tree and unit_cache with rwlock instead of the mutex to
increase CPU efficiency. In the get_gfc_unit function, the percentage
to step into the insert_unit function is around 30%, in most instances,
we can get the unit in the phase of reading the unit_cache or unit_root
tree. So split the read/write phase by rwlock would be an approach to
make it more parallel.

BTW, the IPC metrics can gain around 9x in our test server with 220
cores. The benchmark we used is https://github.com/rwesson/NEAT




+#define RD_TO_WRLOCK(rwlock) \
+  RWUNLOCK (rwlock);\
+  WRLOCK (rwlock);
+#endif
+




diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c index
82664dc5f98..4312c5f36de 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c



@@ -329,7 +335,7 @@ get_gfc_unit (int n, int do_create)
   int c, created = 0;

   NOTE ("Unit n=%d, do_create = %d", n, do_create);
-  LOCK (&unit_lock);
+  RDLOCK (&unit_rwlock);

retry:
   for (c = 0; c < CACHE_SIZE; c++)
@@ -350,6 +356,7 @@ retry:
   if (c == 0)
break;
 }
+  RD_TO_WRLOCK (&unit_rwlock);


So I'm trying to convince myself why it's safe to unlock and only then take the 
write lock.

Can you please elaborate/confirm why that's ok?

I wouldn't mind a comment like
We can release the unit and cache read lock now. We might have to allocate a 
(locked) unit, below in
do_create.
Either way, we will manipulate the cache and/or the unit list so we have to 
take a write lock now.

We don't take the write bit in *addition* to the read lock because..

(that needlessly complicates releasing the respective locks / it triggers too 
much contention when we..
/ ...?)

thanks,



   if (p == NULL && do_create)
 {
@@ -368,8 +375,8 @@ retry:
   if (created)
 {
   /* Newly created units have their lock held already
-from insert_unit.  Just unlock UNIT_LOCK and return.  */
-  UNLOCK (&unit_lock);
+from insert_unit.  Just unlock UNIT_RWLOCK and return.  */
+  RWUNLOCK (&unit_rwlock);
   return p;
 }

@@ -380,7 +387,7 @@ found:
   if (! TRYLOCK (&p->lock))
{
  /* assert (p->closed == 0); */
- UNLOCK (&unit_lock);
+ RWUNLOCK (&unit_rwlock);
  return p;
}

@@ -388,14 +395,14 @@ found:
 }


-  UNLOCK (&unit_lock);
+  RWUNLOCK (&unit_rwlock);

   if (p != NULL && (p->child_dtio == 0))
 {
   LOCK (&p->lock);
   if (p->closed)
{
- LOCK (&unit_lock);
+ WRLOCK (&unit_rwlock);
  UNLOCK (&p->lock);
  if (predec_waiting_locked (p) == 0)
destroy_unit_mutex (p);
@@ -593,8 +600,8 @@ init_units (void)
#endif
#endif

-#ifndef __GTHREAD_MUTEX_INIT
-  __GTHREAD_MUTEX_INIT_FUNCTION (&unit_lock);
+#if (!defined(__GTHREAD_RWLOCK_INIT) &&
+!defined(__GTHREAD_MUTEX_INIT))
+  __GTHREAD_MUTEX_INIT_FUNCTION (&unit_rwlock);
#endif

   if (sizeof (max_offset) == 8)




[PATCH] Fortran: function results never have the ALLOCATABLE attribute [PR109500]

2023-04-20 Thread Harald Anlauf via Fortran
Dear all,

Fortran 2018 added a clarification that the *result* of a function
whose result *variable* has the ALLOCATABLE attribute is a *value*
that itself does not have the ALLOCATABLE attribute.

For those interested: there was a thread on the J3 mailing list
some time ago (for links see the PR).

The patch which implements a related check was co-authored with
Steve and regtested by him.  Testcase verified against NAG.

OK for mainline (gcc-14)?

Thanks,
Harald & Steve

From 2cebc8f9e7b399b7747c9ad0392831de91851b5b Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 20 Apr 2023 21:47:34 +0200
Subject: [PATCH] Fortran: function results never have the ALLOCATABLE
 attribute [PR109500]

Fortran 2018 8.5.3 (ALLOCATABLE attribute) explains in Note 1 that the
result of referencing a function whose result variable has the ALLOCATABLE
attribute is a value that does not itself have the ALLOCATABLE attribute.

gcc/fortran/ChangeLog:

	PR fortran/109500
	* interface.cc (gfc_compare_actual_formal): Reject allocatable
	functions being used as actual argument for allocable dummy.

gcc/testsuite/ChangeLog:

	PR fortran/109500
	* gfortran.dg/allocatable_function_11.f90: New test.

Co-authored-by: Steven G. Kargl 
---
 gcc/fortran/interface.cc  | 12 +++
 .../gfortran.dg/allocatable_function_11.f90   | 36 +++
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/allocatable_function_11.f90

diff --git a/gcc/fortran/interface.cc b/gcc/fortran/interface.cc
index e9843e9549c..968ee193c07 100644
--- a/gcc/fortran/interface.cc
+++ b/gcc/fortran/interface.cc
@@ -3638,6 +3638,18 @@ gfc_compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal,
 	  goto match;
 	}

+  if (a->expr->expr_type == EXPR_FUNCTION
+	  && a->expr->value.function.esym
+	  && f->sym->attr.allocatable)
+	{
+	  if (where)
+	gfc_error ("Actual argument for %qs at %L is a function result "
+		   "and the dummy argument is ALLOCATABLE",
+		   f->sym->name, &a->expr->where);
+	  ok = false;
+	  goto match;
+	}
+
   /* Check intent = OUT/INOUT for definable actual argument.  */
   if (!in_statement_function
 	  && (f->sym->attr.intent == INTENT_OUT
diff --git a/gcc/testsuite/gfortran.dg/allocatable_function_11.f90 b/gcc/testsuite/gfortran.dg/allocatable_function_11.f90
new file mode 100644
index 000..1a2831e186f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/allocatable_function_11.f90
@@ -0,0 +1,36 @@
+! { dg-do compile }
+! PR fortran/109500 - check F2018:8.5.3 Note 1
+!
+! The result of referencing a function whose result variable has the
+! ALLOCATABLE attribute is a value that does not itself have the
+! ALLOCATABLE attribute.
+
+program main
+  implicit none
+  integer, allocatable  :: p
+  procedure(f), pointer :: pp
+  pp => f
+  p = f()
+  print *, allocated (p)
+  print *, is_allocated (p)
+  print *, is_allocated (f())  ! { dg-error "is a function result" }
+  print *, is_allocated (pp()) ! { dg-error "is a function result" }
+  call s (p)
+  call s (f())  ! { dg-error "is a function result" }
+  call s (pp()) ! { dg-error "is a function result" }
+
+contains
+  subroutine s(p)
+integer, allocatable :: p
+  end subroutine s
+
+  function f()
+integer, allocatable :: f
+allocate (f, source=42)
+  end function
+
+  logical function is_allocated(p)
+integer, allocatable :: p
+is_allocated = allocated(p)
+  end function
+end program
--
2.35.3