On Fri, 3 Mar 2006, Douglas Bates wrote:

> I have been timing a particular model fit using lmer on several
> different computers and came up with a peculiar result - the model fit
> is considerably slower on a dual-core Athlon 64 using Goto's
> multithreaded BLAS than on a single-core processor.

Is there a Goto BLAS tuned for that chip?  I can only see one tuned for an
(unspecified) Opteron.  L1 and L2 cache sizes do sometimes matter a lot
for tuned BLAS, and (according to the AMD site I just looked up) the X2
3800+ only has a 512Kb per core L2 cache.  Opterons have a 1Mb L2 cache.

Also, the very large system time seen in the dual-core run is typical of
what I see when pthreads is not working right, and I suggest you try a
limit of one thread (see the R-admin manual).  On our dual-processor
Opteron 248 that ran in 44 secs instead of 328.

> Here is the timing on a single-core Athlon 64 3000+ running under
> today's R-devel with version 0.995-5 of the Matrix package.
>
>> library(Matrix)
>> data(star, package = 'mlmRev')
>> system.time(fm1 <- lmer(math~gr+sx+eth+cltype+(yrs|id)+(1|tch)+(yrs|sch),
>> star, control = list(nit=0,grad=0,msV=1)))
> [1] 43.10 3.78 48.41 0.00 0.00
>
>
> (If you run the timing yourself and don't want to see the iteration
> output, take the msV=1 out of the control list.  I keep it in there so
> I can monitor the progress.)
>
> If I time the same model fit on a dual-core Athlon 64 X2 3800+ with
> the same version of R, BLAS and Matrix package, the timing ends up
> with something like
>
> 90 140 235 0 0 ....

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel