possible bug in Windows version from Gfortran 11.3.0 when using omp_set_num_threads

2024-10-06 Thread John Campbell
I would like to report a problem I have identified with "call
omp_set_num_threads (n)", which appears on Windows 10 with Gfortran
versions 11.3.0, 12.3.0 and 14.1.0.

Prior versions (9.2.0, 10.2.0 and 11.1.0) run to completion.

The reproducer program below does not exit, but hangs after the last print
statement. (I wonder if there are excess threads outside the END DO?)

 

I have experienced this problem for a while now, mainly with small
demonstration programs on https://fortran-lang.discourse.group, but have
now identified the cause of the program hanging.

 

I have listed a simple reproducer below, which exhibits the problem if "call
omp_set_num_threads (4)" is included. I am not aware that this usage of
omp_set_num_threads has been made obsolete?

 

I can confirm that the bug is not evident in equation.com's Gfortran 11.1.0
and earlier, but is present from Gfortran 11.3.0.

 

I am unsure where the bug is located. It could be in either Gfortran or
Equation.com's Windows thread interface modifications.

 

Could others run the test example below to help confirm where the problem
lies?

 

Either the program runs to completion (ok) or hangs after "print..."
reporting the calculation result.

 

To confirm this problem, I have a Windows test batch file that first selects
the Gfortran version, then compiles and runs the test.

I have 2 batch files:

set_gcc.bat, which updates the path for the required gfortran
version. I store different versions in C:\Program Files
(x86)\gcc_eq\gcc_xx.y.0 (not provided)

do_test.bat, which selects the compiler, compiles the code example and
runs the test (listed below; the first set of compiler options is from
earlier tests, but the second also shows the error)

 

{{ do_test.bat }}

call set_gcc %1

set program=test4

set vec=-fimplicit-none -O3 -march=native -ffast-math -fopenmp -fstack-arrays
set vec=-O2 -march=native -fopenmp

del %program%.exe
gfortran %program%.f90 %vec% -o %program%.exe
%program%

 

My latest test program is below, but probably could be further simplified.

{{ test4.f90 }}

program test

!  small working version of OpenMP program hanging on Win 10 with
!  Gfortran 11.3.0 +

   use iso_fortran_env
   implicit none

   integer, parameter :: num = 1000
   real    :: A(num)
   real    :: RA
   integer :: i

   write (*,*) 'Vern : ',compiler_version ()
   write (*,*) 'Opts : ',compiler_options ()

!  omit this call to run to completion with recent Gfortran versions
   call omp_set_num_threads (4)

   write ( *,*) 'Test n=',num
   A = 1
   ra = 0

 !$OMP PARALLEL DO private (i) shared (A), REDUCTION (+: RA)
   do i = 1, size(A)
     RA = RA + A(i)**2
   end do
 !$OMP END PARALLEL DO

   RA = sqrt (RA)
   print*,RA,' OpenMP', sqrt(real(num))

!  Program hangs here but does not exit on recent Gfortran versions
!  Gfortran versions 11.3.0 and later hang, however if
!  "call omp_set_num_threads (4)" is omitted they run to completion.
!  Gfortran versions 11.1.0 and earlier run to completion.

end program test

!++++
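
For anyone trying to narrow this down, one variation that might be worth trying is sketched below. It is only an assumption that it helps isolate the problem, and it has not been run against the affected compilers: the program name test_clause is made up, "use omp_lib" is added, and the thread count is requested with the NUM_THREADS clause instead of calling omp_set_num_threads.

```fortran
program test_clause

!  Hypothetical variant of test4.f90 (not part of the original report):
!  the thread count is requested with the NUM_THREADS clause on the
!  directive instead of a call to omp_set_num_threads.

   use omp_lib, only : omp_get_max_threads
   implicit none

   integer, parameter :: num = 1000
   real    :: a(num)
   real    :: ra
   integer :: i

   write (*,*) 'Max threads : ', omp_get_max_threads ()

   a  = 1
   ra = 0

 !$OMP PARALLEL DO num_threads (4) private (i) shared (a) reduction (+: ra)
   do i = 1, size (a)
     ra = ra + a(i)**2
   end do
 !$OMP END PARALLEL DO

   ra = sqrt (ra)
   print *, ra, ' OpenMP', sqrt (real (num))

end program test_clause
```

If this variant exits normally where test4.f90 hangs, that would point towards the omp_set_num_threads call path rather than the parallel region itself.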

 

Could others please confirm whether they can reproduce this problem?

 

Regards,

 

John Campbell

 



RE: DO CONCURRENT with LOCAL / LOCAL_INIT [was: [PATCH] Fortran: Added support for locality specs in DO CONCURRENT (Fortran 2018/23)]

2025-01-26 Thread John Campbell
Would it be easier to treat "DO CONCURRENT with LOCAL / LOCAL_INIT" as a 
special case of !$OMP PARALLEL DO, and so reuse the existing implementation?
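
A minimal sketch of the correspondence this suggestion relies on, with made-up names (local_vs_private, x, y_omp, y_dc, t): LOCAL plays roughly the role of PRIVATE, and LOCAL_INIT would roughly correspond to FIRSTPRIVATE. It assumes a compiler that accepts the Fortran 2018 LOCAL specifier, and it says nothing about how the compiler should implement either loop.

```fortran
program local_vs_private

   implicit none

   integer, parameter :: n = 8
   real    :: x(n), y_omp(n), y_dc(n)
   real    :: t
   integer :: i

   x = [(real (i), i = 1, n)]

!  OpenMP form: each thread gets its own private copy of t
 !$OMP PARALLEL DO private (i, t) shared (x, y_omp)
   do i = 1, n
     t = 2.0 * x(i)
     y_omp(i) = t + 1.0
   end do
 !$OMP END PARALLEL DO

!  Fortran 2018 form: each iteration gets its own local t
   do concurrent (i = 1:n) local (t)
     t = 2.0 * x(i)
     y_dc(i) = t + 1.0
   end do

!  Both loops perform the same arithmetic, so the results must match
   if (any (y_omp /= y_dc)) error stop 'results differ'
   print *, 'both loops agree'

end program local_vs_private
```

One difference to keep in mind is that DO CONCURRENT only asserts that the iterations are independent and does not require them to run in parallel.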

John Campbell

-Original Message-
From: Damian Rouson  
Sent: Sunday, 26 January 2025 9:38 AM
To: Tobias Burnus 
Cc: Jerry D ; Andre Vehreschild ; 
fortran@gcc.gnu.org
Subject: Re: DO CONCURRENT with LOCAL / LOCAL_INIT [was: [PATCH] Fortran: Added 
support for locality specs in DO CONCURRENT (Fortran 2018/23)]

In case it helps anyone to see how significant the reduction in complexity can 
be with support for locality specifiers, we have Fortran 2008, 2018, and 2023 
versions of the "do concurrent" construct that lies at the heart of our 
neural-network training code here:

https://github.com/BerkeleyLab/fiats/blob/25457d8e390f59f3191d6fcd0d5a609f173059d6/src/fiats/neural_network_s.F90#L899

The 2023 version has one do concurrent statement. Without reduce locality, 
we expand it into 7 statements if we have to fall back to 2018 (including 
the "end block" that is much lower in the code), and into 10 statements if 
we have to fall all the way back to 2008 without local or reduce.  Although 
there's a minor win associated with supporting local, there's a much bigger 
reduction in code complexity that comes with supporting both local and 
reduce.
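
As a rough illustration of that kind of expansion (this is not the fiats code; the names reduce_fallback, x, partial, t and t_local are invented), the sketch below writes the same sum of squares once with Fortran 2023 locality specifiers and once in a Fortran 2008 style, using a block for the local variable and a temporary array in place of reduce. It assumes a compiler that accepts both LOCAL and REDUCE.

```fortran
program reduce_fallback

   implicit none

   integer, parameter :: n = 100
   real    :: x(n), partial(n)
   real    :: t, s_2023, s_2008
   integer :: i

   x = 1.0

!  Fortran 2023: locality and reduction stated on the loop header
   s_2023 = 0.0
   do concurrent (i = 1:n) local (t) reduce (+: s_2023)
     t = x(i)**2
     s_2023 = s_2023 + t
   end do

!  Fortran 2008 style fallback: a block supplies the per-iteration
!  variable, and per-iteration results are summed after the loop
   do concurrent (i = 1:n)
     block
       real :: t_local
       t_local = x(i)**2
       partial(i) = t_local
     end block
   end do
   s_2008 = sum (partial)

   if (abs (s_2023 - s_2008) > 1.0e-4) error stop 'sums differ'
   print *, s_2023, s_2008

end program reduce_fallback
```

Even in this toy case the fallback needs an extra array, a block and a separate summation, which grows quickly with the number of local and reduced variables.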

Damian