Source: openmpi
Version: 3.1.3-9
Severity: important
SUMMARY: mpifort (openmpi 3.1.3-9) infinite loop on hurd while reading from
stdin
DESCRIPTION:
On GNU/Hurd, a few-line Fortran program that reads an integer from
stdin and prints it to stdout, and makes no MPI calls, runs
as expected when compiled with gfortran (Debian 8.2.0-14). When compiled
with mpifort (openmpi 3.1.3-9) and started with mpirun, it instead uses
about 10-30% of cpu in an apparently infinite loop (killed after more
than 900 seconds).
CONTEXT:
$ uname -a
GNU exodar 0.9 GNU-Mach 1.8+git20181103-486-dbg/Hurd-0.9 i686-AT386 GNU
$ cat /proc/hostinfo
Basic info:
max_cpus = 1 /* max number of cpus possible */
avail_cpus = 1 /* number of cpus now available */
memory_size = 3221151744 /* size of memory in bytes */
cpu_type = 19 /* cpu type */
cpu_subtype = 1 /* cpu subtype */
Scheduling info:
min_timeout = 10 /* minimum timeout in milliseconds */
min_quantum = 100 /* minimum quantum in milliseconds */
Load info:
avenrun[3] = { 1.57, 1.57, 1.43 }
mach_factor[3] = { 0.15, 0.14, 0.09 }
$ dpkg -l |egrep "openmpi|pmix|gfortran|gcc|mpifort|libc[-0]"
ii gcc 4:8.2.0-2 hurd-i386
GNU C compiler
ii gcc-8 8.2.0-14 hurd-i386
GNU C compiler
ii gcc-8-base:hurd-i386 8.2.0-14 hurd-i386
GCC, the GNU Compiler Collection (base package)
ii gfortran 4:8.2.0-2 hurd-i386
GNU Fortran 95 compiler
ii gfortran-8 8.2.0-14 hurd-i386
GNU Fortran compiler
ii libc-bin 2.28-5 hurd-i386
GNU C Library: Binaries
ii libc-dev-bin 2.28-5 hurd-i386
GNU C Library: Development binaries
ii libc-l10n 2.28-5 all
GNU C Library: localization files
ii libc0.3:hurd-i386 2.28-5 hurd-i386
GNU C Library: Shared libraries
ii libc0.3-dev:hurd-i386 2.28-5 hurd-i386
GNU C Library: Development Libraries and Header Files
ii libgcc-8-dev:hurd-i386 8.2.0-14 hurd-i386
GCC support library (development files)
ii libgcc1:hurd-i386 1:8.2.0-14 hurd-i386
GCC support library
ii libgfortran-8-dev:hurd-i386 8.2.0-14 hurd-i386
Runtime library for GNU Fortran applications (development files)
ii libgfortran5:hurd-i386 8.2.0-14 hurd-i386
Runtime library for GNU Fortran applications
ii libopenmpi-dev:hurd-i386 3.1.3-9 hurd-i386
high performance message passing library -- header files
ii libopenmpi3:hurd-i386 3.1.3-9 hurd-i386
high performance message passing library -- shared library
ii libpmix2:hurd-i386 3.1.0~rc4-1 hurd-i386
Process Management Interface (Exascale) library
ii openmpi-bin 3.1.3-9 hurd-i386
high performance message passing library -- binaries
ii openmpi-common 3.1.3-9 all
high performance message passing library -- common files
$ mpifort --showme
gfortran -I/usr/lib/i386-gnu/openmpi/include -pthread
-I/usr/lib/i386-gnu/openmpi/lib -Wl,--enable-new-dtags
-L/usr/lib/i386-gnu/openmpi/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr
-lmpi_mpifh -lmpi
$ mpifort --version
GNU Fortran (Debian 8.2.0-14) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gfortran --version
GNU Fortran (Debian 8.2.0-14) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
REPRODUCE THE BUG:
$ cat hurd_fortran_minimal.f90
program openmpi_hurd_bug
i_myid = 178
print*,'i_myid = ',i_myid,' Please type in an integer and hit return.'
read(*,*) icase
print*,'Your integer is ', icase, ' Bye.'
end program openmpi_hurd_bug
$ cat infile
414
(1) gfortran
$ gfortran src/hurd_fortran_minimal.f90 && ./a.out < infile
i_myid = 178 Please type in an integer and hit return.
Your integer is 414 Bye.
Compiled without errors or warnings; printed out the integer 414 as expected.
(2) mpifort
Exodar has only one cpu, a test script for openmpi should be able to run
on a single-processor machine, and automated tests should not ssh to any
external machines. The '--mca plm_rsh_agent /bin/false' option is therefore
used here, so that no remote shell is used.
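The same MCA parameter can also be set through the environment rather than
on the mpirun command line, since Open MPI reads MCA parameters from
variables named OMPI_MCA_<parameter>. A minimal sketch, assuming a POSIX
shell; the commented mpirun line uses the binary and input file named in
this report and is shown only as an illustration:

```shell
# Set the rsh agent to /bin/false for every later mpirun started from this
# shell, equivalent to passing '--mca plm_rsh_agent /bin/false' each time.
export OMPI_MCA_plm_rsh_agent=/bin/false

# The option can then be dropped from the command line, e.g.:
# mpirun -n 1 ./a.out < infile
```

This may be convenient in a test script that calls mpirun several times.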
Compile with mpifort and run in the foreground. The program does not appear
to stop on its own; after interrupting it with ^C, the terminal output is:
$ mpifort src/hurd_fortran_minimal.f90 && mpirun -n 1 --mca plm_rsh_agent /bin/false ./a.out < infile
i_myid = 178 Please type in an integer and hit return.
^CAt line 4 of file src/hurd_fortran_minimal.f90 (unit = 5, file = 'stdin')
Fortran runtime error: End of file
Error termination. Backtrace:
#0 0x11f874b
#1 0x11f925e
#2 0x11f98f6
#3 0x13bb11c
#4 0x13b3d79
#5 0x13b5520
#6 0x13ba8c3
#7 0x803089a
#8 0x8030996
#9 0x164735c
#10 0x80306c0
#11 0xffffffff
Running this in the background instead, with a log file:
$ mpirun -n 1 --mca plm_rsh_agent /bin/false ./a.out < infile > log.1 2>&1 &
The background job uses about 10-30% of cpu according to 'top' and shows
no sign of ever stopping, which suggests an infinite loop. This binary
ran for over 900 seconds, using about 30-40% of cpu, before I killed it.
According to the program code, I would expect it to read one integer from
one line of the input file and then print it to stdout. In other words, a
program with no MPI calls should behave the same way whether it is
compiled with mpifort or with gfortran.
During this apparently infinite loop, the log file contains:
i_myid = 178 Please type in an integer and hit return.
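For automated tests, a run that never stops can be bounded with a time
limit so that the hang is reported rather than blocking the test suite. A
minimal sketch, assuming GNU coreutils 'timeout' is installed (it exits
with status 124 when the limit is reached); 'run_bounded' is a hypothetical
helper name, not part of openmpi:

```shell
# Hypothetical helper: run a command under a time limit and report the result.
# Usage: run_bounded SECONDS COMMAND [ARGS...]
run_bounded() {
    rb_limit=$1
    shift
    # Capture the exit status without tripping 'set -e' in a calling script.
    timeout "$rb_limit" "$@" && rb_status=0 || rb_status=$?
    if [ "$rb_status" -eq 124 ]; then
        # GNU timeout returns 124 when it had to stop the command.
        echo "TIMEOUT after ${rb_limit}s"
    else
        echo "exit status $rb_status"
    fi
    return "$rb_status"
}

# Example with the command from this report (assumes ./a.out and infile exist):
# run_bounded 60 mpirun -n 1 --mca plm_rsh_agent /bin/false ./a.out < infile > log.1 2>&1
```

With the behaviour described in this report, the mpirun case would be
expected to end in the TIMEOUT branch, while the gfortran-only binary
should report exit status 0.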
HYPOTHESIS:
This test is run under an schroot running sid, so it might be related
to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=494046 -
"openmpi-bin: Doesn't work in a chroot environment".
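Since the chroot is the suspected cause, it may help to record in the test
log whether a given run is inside a schroot session, so that results inside
and outside the chroot can be compared. A minimal sketch, assuming the
SCHROOT_SESSION_ID variable that schroot(1) documents as being set inside a
session; 'in_schroot' is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if this shell appears to run inside schroot.
# Assumption: schroot exports SCHROOT_SESSION_ID inside the session.
in_schroot() {
    [ -n "${SCHROOT_SESSION_ID:-}" ]
}

if in_schroot; then
    echo "inside schroot session: $SCHROOT_SESSION_ID"
else
    echo "not inside a schroot session"
fi
```

This only detects schroot itself; a plain chroot entered by other means
would not set this variable.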