The attached program is a straightforward implementation of matrix inversion via the Gauss-Jordan algorithm. The explicitly parallelized version (not attached) produces exactly the same result on every run, regardless of the number of threads. The version parallelized via OpenMP produces incorrect results for sufficiently large matrices when run with two or more threads. An example:

$ OMP_NUM_THREADS=1 ./matinv_openmp 180
error = 4.75689e-14; epsilon = 2.22045e-16; error / (epsilon * n) = 1.19018
Error within bounds.
$ OMP_NUM_THREADS=2 ./matinv_openmp 180
error = nan; epsilon = 2.22045e-16; error / (epsilon * n) = nan
Error out of bounds.

I compiled this program as follows:

$ ~/gcc-4.2.3/bin/gcc -Wall -fopenmp -g matinv_openmp.c -static -o matinv_openmp -lm
$ ~/gcc-4.2.3/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: /home/bart/software/gcc-4.2.3/configure --disable-nls --enable-threads=posix --enable-tls --prefix=/home/bart/gcc-4.2.3
Thread model: posix
gcc version 4.2.3

-- 
           Summary: OpenMP: Incorrect result when run with two or more threads
           Product: gcc
           Version: 4.2.3
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: libgomp
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bart dot vanassche at gmail dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35517