Thanks everyone for your help, all of which was useful. Martin: I am glad that I am not the only one who is (now) disillusioned about parallel computations.
Jesse On Monday, May 5th, 2025 at 4:25 AM, Viechtbauer, Wolfgang (NP) <wolfgang.viechtba...@maastrichtuniversity.nl> wrote: > > > A relevant thread from a few years ago where this was discussed: > > https://www.stat.math.ethz.ch/pipermail/r-help/2023-August/477904.html > > I generally use: > > export OPENBLAS_NUM_THREADS=1 > export MKL_NUM_THREADS=1 > > since in my experience the biggest performance gains come from switching to > OpenBLAS / MKL in the first place. Using their multithreading capabilities > tends to yield diminishing gains in comparison. > > For MKL, you could try: > > export MKL_THREADING_LAYER=GNU > > or > > export MKL_THREADING_LAYER=sequential > > when using multiple threads to see if this avoids the issue. > > But if you are using explicit parallelization (as I often do), then you would > want to avoid the multithreading in the math libs anyway. > > Best, > Wolfgang > > > -----Original Message----- > > From: R-help r-help-boun...@r-project.org On Behalf Of Martin Maechler > > Sent: Monday, May 5, 2025 10:50 > > To: smallepsilon smallepsi...@proton.me > > Cc: r-help@r-project.org; peter dalgaard pda...@gmail.com > > Subject: Re: [R] non-reproducible eigen() output with MKL > > > > > > > > > smallepsilon via R-help > > > > > > > on Sun, 04 May 2025 18:11:57 +0000 writes: > > > > > Peter, The eigenvalues are not identical(), but are > > > all.equal(). When n is 20, the crossproduct is > > > (numerically) a diagonal matrix with +-1 on the > > > diagonal. When n is 50, this is not the case, but that > > > could be an issue of nearly identical eigenvalues. > > > > > Is there no way within R to require that the sequence of > > > operations be the same for identical calls? > > > > As Peter Dalgaard mentioned, you have the problem of underlying > > system libraries that try to be fast and hence parallelize > > computations, notably in this (and man similar calls), MKL uses > > parallelized BLAS and/or LAPACK .. and that's what eigen() in R > > (and Julia, python, matlab, ..) all use. > > > > And parallelization is a killer of (strict / bit-level) > > reproducibility, as you have just experienced. > > > > If you know how to tell your OS / that you want only one > > parallel "thread" / process / ... > > you get back to reproducible (and slower) computations. > > > > > The problem arose originally in a package test in which I wanted to > > > verify that two ways of specifying something led to the > > > execution of exactly the same calculations. The best proxy > > > for this seemed to be to use identical() on the outputs, > > > but if the same line of code (that should not involve an > > > RNG) can lead to different results, this approach is > > > doomed, yes? It is not absolutely critical that the > > > outputs be identical(), but it would be much more > > > reassuring than all.equal(). > > > > I understand and agree. > > > > When I first became aware of the irreprodicibility of parallel > > computations, only a few years ago, I was quite shocked and deillusionized.. > > > > > Thanks, Jesse > > > > Martin Maechler > > ETH Zurich and R Core team > > > > > On Sunday, May 4th, 2025 at 12:27 PM, peter dalgaard > > > pda...@gmail.com wrote: > > >> > > >> > > >> Have you looked more closely at the differences? > > >> Eigenvectors are only determined up to a sign change, so > > >> different platforms often give results that differ by > > >> sign. If you use a multitasking numerical library, the > > >> same can happen within platform because the exact > > >> sequence of computations differs between calls. > > >> > > >> You could check > > >> > > >> a) that e1$values and e2$values are the same b) that the > > >> crossprod(e1$vectors, e2$vectors) is a diagonal matrix > > >> with 1 or -1 in the diagonal. (This might fail if you > > >> have eigenvalues that are almost identical, though.) > > >> > > >> -pd > > >> > > >> > On 4 May 2025, at 17.57, smallepsilon via R-help > > >> r-help@r-project.org wrote: > > >> > > > >> > I am using MKL with R 4.5.0 on Linux, and eigen() is > > >> producing different results with identical > > >> calls. Specifically, when I run the code below, I get > > >> "FALSE" from identical(). Just in case it had something > > >> to do with a random number generator, I put identical > > >> set.seed() calls immediately before each eigen() call, > > >> but that did not help. Has anyone seen this behavior > > >> before? > > >> > > > >> > (When n is 5, the identical() call almost always > > >> returns "TRUE". As n increases, the proportion of FALSE > > >> results increases, and it is nearly always FALSE when n > > >> is 50.) > > >> > > > >> > Jesse > > >> > > > >> > *** > > >> > > > >> > n <- 50 > set.seed(20250504) > Sigma <- rWishart(1, df > > >> = n, Sigma = diag(n))[,,1] > e1 <- eigen(Sigma) > e2 <- > > >> eigen(Sigma) > identical(e1, e2) > > >> > > >> -- > > >> Peter Dalgaard, Professor, Center for Statistics, > > >> Copenhagen Business SchoolSolbjerg Plads 3, 2000 > > >> Frederiksberg, Denmark Phone: (+45)38153501 Office: A > > >> 4.23 Email: pd....@cbs.dk Priv: pda...@gmail.com ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.