On 7/4/25 19:51, Jerry D wrote:
On 7/4/25 5:12 AM, Andre Vehreschild wrote:
Hi all,
the attached patches go on top of the other 6 caf_shmem coarray patches and
fix missing includes, esp. on non-Linux systems. I have tested this on
FreeBSD, which is very time-consuming because it is fully virtualized on my
system.
Regtests ok on x86_64-pc-linux-gnu and aarch64-unknown-freebsd14.3. Ok for
mainline?
Thanks to Steve for bringing these deficiencies to my attention.
Regards,
Andre
So far,
$ export GFORTRAN_NUM_IMAGES=9
$ rm *.mod
$ gfc -fcoarray=lib random-weather.f90 -lcaf_shmem
$ ./a.out
Decomposition information on image 6 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 1 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 4 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 5 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 9 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 3 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 8 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 2 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
Decomposition information on image 7 : there are 9 * 1 slabs; the slabs are 8 * 70 grid cells in size.
.
.
.
Time 3600 Image 4 PS= 99925.0391 T= 301.282928 U= -51.2542686 V= 24.3605309 W= -0.296301365 Q= 1.48258626E-03
Time 3600 Image 9 PS= 99899.3047 T= 299.897095 U= 62.8683090 V= -57.9342270 W= 0.445489585 Q= 1.90666097E-03
Time 3600 Image 1 PS= 99966.7656 T= 300.011597 U= -1.93229961 V= -118.892410 W= -6.45965934E-02 Q= 2.03774264E-03
Time 3600 Image 7 PS= 100015.938 T= 300.066162 U= -17.6038494 V= -0.982973158 W= 7.21789524E-02 Q= 2.17592530E-03
Time 3600 Image 2 PS= 100003.477 T= 300.078522 U= -2.38964367 V= -18.8026981 W= -0.179861650 Q= 1.99834118E-03
Time 3600 Image 5 PS= 100077.422 T= 300.781494 U= -16.6273994 V= -101.607895 W= 0.361649722 Q= 1.74388883E-03
Time 3600 Image 3 PS= 100002.391 T= 299.708862 U= 18.6304798 V= 0.391739845 W= 2.24014949E-02 Q= 1.96914421E-03
Time 3600 Image 8 PS= 100074.359 T= 299.516235 U= -55.1445618 V= 68.3090286 W= -0.537869334 Q= 2.32057413E-03
Time 3600 Image 6 PS= 99976.4453 T= 300.221924 U= -1.62557888 V= 1.44226456 W= 0.201509774 Q= 1.97460176E-03
$
real 0m0.066s
user 0m0.337s
sys 0m0.107s
Definitely much faster than mpich. I also oversubscribed, setting the number
of images to 30, and that ran as well.
I still need to build OpenCoarrays using this gfortran-16 and make sure
it passes its tests with mpich. I will then try to test each case in
the OpenCoarrays test suite with -lcaf_shmem and see whether they all
work.
Any ideas on how to stress test this? I only have 32 GB of memory here
and would like to see how a longer-running program does.
echo ' &config nxglobal=1000, nyglobal=1000, nzglobal=100 / ' | time mpirun -np 16 --mca btl self,vader ./a.out
works on my 128 GB system using the libcoarray+openmpi combination,
but it surely doesn't need all that memory.
During the weekend I'll try to come up with a "randomization" scheme
that gives guaranteed results independent of slight differences in
timing (the current use of "call random_number" surely won't do that),
so that runs using different libraries will at least give the same
results if the implementations are correct.
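
One possible shape for such a scheme, sketched here in Python rather than
Fortran purely for brevity: derive each "random" value from a hash of the
global grid index and the time step, so that it does not depend on which
image computes it or in what order. This is only an illustration under
stated assumptions, not what random-weather.f90 does; the names mix64,
cell_random and fill_field, the seed, and the slab split are all
hypothetical. The 72 * 70 grid matches the 9 slabs of 8 * 70 cells above.

```python
MASK64 = (1 << 64) - 1  # keep arithmetic within 64 bits


def mix64(x: int) -> int:
    """SplitMix64-style scrambling step on a 64-bit integer."""
    x = (x + 0x9E3779B97F4A7C15) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def cell_random(seed: int, i: int, j: int, k: int, step: int) -> float:
    """Value in [0, 1) that depends only on the seed, the global
    cell index (i, j, k) and the time step, never on call order."""
    h = mix64(seed & MASK64)
    for v in (i, j, k, step):
        h = mix64(h ^ (v & MASK64))
    return (h >> 11) / float(1 << 53)  # top 53 bits -> double in [0, 1)


def fill_field(nx: int, ny: int, n_images: int, step: int, seed: int = 12345):
    """Simulate n_images images, each filling its own slab of x-indices.
    The slab split here is an assumption made for the illustration."""
    field = {}
    for image in range(1, n_images + 1):
        lo = (image - 1) * nx // n_images
        hi = image * nx // n_images
        for i in range(lo, hi):
            for j in range(ny):
                field[(i, j)] = cell_random(seed, i, j, 0, step)
    return field


if __name__ == "__main__":
    one_image = fill_field(72, 70, 1, 3600)
    nine_images = fill_field(72, 70, 9, 3600)
    # The field is the same regardless of how the grid is decomposed.
    print(one_image == nine_images)
```

Because every value is a pure function of (seed, i, j, k, step), two correct
coarray libraries would have to produce the same field for the same
configuration, whatever the number of images or their relative timing.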
Kind regards,
--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands