Thanks everyone for your help, all of which was useful. Martin: I am glad that 
I am not the only one who is (now) disillusioned about parallel computations.

Jesse



On Monday, May 5th, 2025 at 4:25 AM, Viechtbauer, Wolfgang (NP) 
<wolfgang.viechtba...@maastrichtuniversity.nl> wrote:

> 
> 
> A relevant thread from a few years ago where this was discussed:
> 
> https://www.stat.math.ethz.ch/pipermail/r-help/2023-August/477904.html
> 
> I generally use:
> 
> export OPENBLAS_NUM_THREADS=1
> export MKL_NUM_THREADS=1
> 
> since in my experience the biggest performance gains come from switching to 
> OpenBLAS / MKL in the first place. Using their multithreading capabilities 
> tends to yield diminishing gains in comparison.
> 
> For MKL, you could try:
> 
> export MKL_THREADING_LAYER=GNU
> 
> or
> 
> export MKL_THREADING_LAYER=sequential
> 
> when using multiple threads to see if this avoids the issue.
> 
> But if you are using explicit parallelization (as I often do), then you would 
> want to avoid the multithreading in the math libs anyway.
> 
> Best,
> Wolfgang
> 
> > -----Original Message-----
> > From: R-help r-help-boun...@r-project.org On Behalf Of Martin Maechler
> > Sent: Monday, May 5, 2025 10:50
> > To: smallepsilon smallepsi...@proton.me
> > Cc: r-help@r-project.org; peter dalgaard pda...@gmail.com
> > Subject: Re: [R] non-reproducible eigen() output with MKL
> > 
> > > > > > > smallepsilon via R-help
> > > > > > > on Sun, 04 May 2025 18:11:57 +0000 writes:
> > 
> > > Peter, The eigenvalues are not identical(), but are
> > > all.equal(). When n is 20, the crossproduct is
> > > (numerically) a diagonal matrix with +-1 on the
> > > diagonal. When n is 50, this is not the case, but that
> > > could be an issue of nearly identical eigenvalues.
> > 
> > > Is there no way within R to require that the sequence of
> > > operations be the same for identical calls?
> > 
> > As Peter Dalgaard mentioned, you have the problem of underlying
> > system libraries that try to be fast and hence parallelize
> > computations, notably in this (and man similar calls), MKL uses
> > parallelized BLAS and/or LAPACK .. and that's what eigen() in R
> > (and Julia, python, matlab, ..) all use.
> > 
> > And parallelization is a killer of (strict / bit-level)
> > reproducibility, as you have just experienced.
> > 
> > If you know how to tell your OS / that you want only one
> > parallel "thread" / process / ...
> > you get back to reproducible (and slower) computations.
> > 
> > > The problem arose originally in a package test in which I wanted to
> > > verify that two ways of specifying something led to the
> > > execution of exactly the same calculations. The best proxy
> > > for this seemed to be to use identical() on the outputs,
> > > but if the same line of code (that should not involve an
> > > RNG) can lead to different results, this approach is
> > > doomed, yes? It is not absolutely critical that the
> > > outputs be identical(), but it would be much more
> > > reassuring than all.equal().
> > 
> > I understand and agree.
> > 
> > When I first became aware of the irreprodicibility of parallel
> > computations, only a few years ago, I was quite shocked and deillusionized..
> > 
> > > Thanks, Jesse
> > 
> > Martin Maechler
> > ETH Zurich and R Core team
> > 
> > > On Sunday, May 4th, 2025 at 12:27 PM, peter dalgaard
> > > pda...@gmail.com wrote:
> > >>
> > >>
> > >> Have you looked more closely at the differences?
> > >> Eigenvectors are only determined up to a sign change, so
> > >> different platforms often give results that differ by
> > >> sign. If you use a multitasking numerical library, the
> > >> same can happen within platform because the exact
> > >> sequence of computations differs between calls.
> > >>
> > >> You could check
> > >>
> > >> a) that e1$values and e2$values are the same b) that the
> > >> crossprod(e1$vectors, e2$vectors) is a diagonal matrix
> > >> with 1 or -1 in the diagonal. (This might fail if you
> > >> have eigenvalues that are almost identical, though.)
> > >>
> > >> -pd
> > >>
> > >> > On 4 May 2025, at 17.57, smallepsilon via R-help
> > >> r-help@r-project.org wrote:
> > >> >
> > >> > I am using MKL with R 4.5.0 on Linux, and eigen() is
> > >> producing different results with identical
> > >> calls. Specifically, when I run the code below, I get
> > >> "FALSE" from identical(). Just in case it had something
> > >> to do with a random number generator, I put identical
> > >> set.seed() calls immediately before each eigen() call,
> > >> but that did not help. Has anyone seen this behavior
> > >> before?
> > >> >
> > >> > (When n is 5, the identical() call almost always
> > >> returns "TRUE". As n increases, the proportion of FALSE
> > >> results increases, and it is nearly always FALSE when n
> > >> is 50.)
> > >> >
> > >> > Jesse
> > >> >
> > >> > ***
> > >> >
> > >> > n <- 50 > set.seed(20250504) > Sigma <- rWishart(1, df
> > >> = n, Sigma = diag(n))[,,1] > e1 <- eigen(Sigma) > e2 <-
> > >> eigen(Sigma) > identical(e1, e2)
> > >>
> > >> --
> > >> Peter Dalgaard, Professor, Center for Statistics,
> > >> Copenhagen Business SchoolSolbjerg Plads 3, 2000
> > >> Frederiksberg, Denmark Phone: (+45)38153501 Office: A
> > >> 4.23 Email: pd....@cbs.dk Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to