>>>>> "PJ" == Paul Johnson <pauljoh...@gmail.com>
>>>>>     on Fri, 14 Dec 2012 01:01:19 -0600 writes:

PJ> On Thu, Dec 13, 2012 at 9:01 PM, Yi (Alice) Wang <yi.w...@unsw.edu.au> wrote:
>> I have also encountered a similar problem. My mvabund package runs much
>> faster on linux/OSX than on windows with both R/2.15.1 and R/2.15.2. For
>> example, with mvabund_3.6.3 and R/2.15.2,
>> system.time(example(anova.manyglm))
>>

PJ> Hi, Alice

PJ> You have a different problem than I do.

PJ> The change from R-2.15.1 to R-2.15.2 makes the program slower on all
PJ> platforms. The slowdown that emerges in R-2.15.2 on all types of
PJ> hardware concerns me.

PJ> It only seemed like a "Windows is better" issue when all the Windows
PJ> users who tested my program were using R-2.15.0 or R-2.15.1. As soon
PJ> as they update R, then they have the slowdown as well.

Paul, I'm pretty sure you are right that it is not just your package.
Rather, the NEWS for R 2.15.2 contain

  • The included LAPACK has been updated to 3.4.1, with some patches
    from the current SVN sources. (_Inter alia_, this resolves
    PR#14692.)

and as I got from your e-mails --- yes, a reproducible example
(without package Amelia) would have been (and would still be)
really enlightening --- indeed, "the default tolerance" (in a vague
sense) of detecting (near)singularity may well have been tightened in
the newer LAPACK.

Martin

>> on OSX returns
>>
>>    user  system elapsed
>>   3.351   0.006   3.381
>>
>> but on windows 7 it returns
>>
>>    user  system elapsed
>>   13.13    0.00   13.14
>>
>> I also used svd frequently in my c code though by calling the gsl functions
>> only. In my memory, I think the comp time difference is not that significant
>> with earlier R versions. So maybe it is worth an investigation?
>>
>> Many thanks,
>> Yi Wang
>>
>>
>> On Thu, Dec 13, 2012 at 5:33 PM, Uwe Ligges
>> <lig...@statistik.tu-dortmund.de> wrote:
>>>
>>> Long message, but as far as I can see, this is not about base R but the
>>> contributed package Amelia: Please discuss possible improvements with its
>>> maintainer.
>>>
>>> Best,
>>> Uwe Ligges
>>>
>>>
>>> On 12.12.2012 19:14, Paul Johnson wrote:
>>>>
>>>> Speaking of optimization and speeding up R calculations...
>>>>
>>>> I mentioned last week I want to speed up calculation of generalized
>>>> inverses. On Debian Wheezy with R-2.15.2, I see a huge speedup using a
>>>> souped up generalized inverse algorithm published by
>>>>
>>>> V. N. Katsikis, D. Pappas, Fast computing of the Moore-Penrose inverse
>>>> matrix, Electronic Journal of Linear Algebra,
>>>> 17(2008), 637-650.
>>>>
>>>> I was so delighted to see the computation time drop on my Debian
>>>> system that I boasted to the Windows users and gave them a test case.
>>>> They answered back "there's no benefits, plus Windows is faster than
>>>> Linux".
>>>>
>>>> That sent me off on a bit of a goose chase, but I think I'm beginning
>>>> to understand the situation. I believe R-2.15.2 introduced a tighter
>>>> requirement for precision, thus triggering longer-lasting calculations
>>>> in many example scripts. Better algorithms can avoid some of that
>>>> slowdown, as you see in this test case.
>>>>
>>>> Here is the test code you can run to see:
>>>>
>>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1.R
>>>>
>>>> It downloads a data file from that same directory and then runs some
>>>> multiple imputations with the Amelia package.
>>>>
>>>> Here's the output from my computer
>>>>
>>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1.Rout
>>>>
>>>> That includes the profile of the calculations that depend on the
>>>> ordinary generalized inverse algorithm based on svd and the new one.
>>>>
>>>> See? The KP algorithm is faster. And just as accurate as
>>>> Amelia:::mpinv or MASS::ginv (for details on that, please review my
>>>> notes in http://pj.freefaculty.org/scraps/profile/qrginv.R).
>>>>
>>>> So I asked Windows users for more detailed feedback, including
>>>> sessionInfo(), and I noticed that my proposed algorithm is not faster
>>>> on Windows--WITH OLD R!
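
For anyone who wants to test the "just as accurate" claim quoted above without Amelia, one possible check is the four Penrose conditions, which any Moore-Penrose inverse G of A should satisfy. The sketch below is only an illustration: penrose_ok() and its tolerance are hypothetical names and choices, not taken from qrginv.R, and MASS::ginv merely stands in for whichever candidate inverse is being tested. The test matrix gets an all-zero column because that is the problem case Paul mentions.

```r
## Sketch only: penrose_ok() is a hypothetical helper, not from qrginv.R.
## It checks the four Penrose conditions for a candidate inverse G of A.
penrose_ok <- function(A, G, tol = 1e-8) {
    ## "close" means equal up to a tolerance scaled by the entries of X
    close <- function(X, Y) max(abs(X - Y)) <= tol * max(1, max(abs(X)))
    AG <- A %*% G
    GA <- G %*% A
    c(AGA          = close(AG %*% A, A),   # A G A = A
      GAG          = close(GA %*% G, G),   # G A G = G
      AG_symmetric = close(AG, t(AG)),     # (A G)' = A G
      GA_symmetric = close(GA, t(GA)))     # (G A)' = G A
}

set.seed(1)
A <- cbind(matrix(rnorm(20), 5, 4), 0)     # last column is all zeros
checks <- penrose_ok(A, MASS::ginv(A))     # swap in any other candidate here
print(checks)
```

Replacing MASS::ginv(A) with another implementation gives a quick pass/fail comparison on the same matrix.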
>>>>
>>>> Here's the script output with R-2.15.0, shows no speedup from the
>>>> KPginv algorithm
>>>>
>>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1-Windows.Rout
>>>>
>>>> On the same machine, I updated to R-2.15.2, and we see the same
>>>> speedup from the KPginv algorithm
>>>>
>>>>
>>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1-CRMDA02-WinR2.15.2.Rout
>>>>
>>>> After that, I realized it is an R version change, not an OS
>>>> difference, I was a bit relieved.
>>>>
>>>> What causes the difference in this case? In the Amelia code, they try
>>>> to avoid doing the generalized inverse by using the ordinary solve(),
>>>> and if that fails, then they do the generalized inverse. In R 2.15.0,
>>>> the near singularity of the matrix is ignored, but not in R 2.15.2.
>>>> The ordinary solve is failing almost all the time, thus triggering the
>>>> use of the svd based generalized inverse. Which is slower.
>>>>
>>>> The Katsikis and Pappas 2008 algorithm is the fastest one I've found
>>>> after translating from Matlab to R. It is not so universally
>>>> applicable as svd based methods, it will fail if there are linearly
>>>> dependent columns. However, it does tolerate columns of all zeros,
>>>> which seems to be the problem case in the particular application I am
>>>> testing.
>>>>
>>>> I tried very hard to get the newer algorithm described here to go as
>>>> fast, but it is way way slower, at least in the implementations I
>>>> tried:
>>>> ## KPP
>>>> ## Vasilios N. Katsikis, Dimitrios Pappas, Athanassios Petralias. "An
>>>> improved method for
>>>> ## the computation of the Moore Penrose inverse matrix," Applied
>>>> ## Mathematics and Computation, 2011
>>>>
>>>> The notes on that are in the qrginv.R file linked above.
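
The "try solve(), fall back to an svd based generalized inverse" pattern described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the Amelia source: svd_ginv() and solve_or_ginv() are hypothetical names, and the svd part follows the general approach of MASS::ginv. The second test matrix has an all-zero column, so solve() signals an error and the fallback is used.

```r
## Sketch only: hypothetical helper names, not the Amelia code.

## Moore-Penrose inverse via svd, dropping singular values that are
## negligible relative to the largest one.
svd_ginv <- function(X, tol = sqrt(.Machine$double.eps)) {
    s <- svd(X)
    keep <- s$d > tol * s$d[1L]
    if (!any(keep)) return(matrix(0, ncol(X), nrow(X)))
    s$v[, keep, drop = FALSE] %*%
        ((1 / s$d[keep]) * t(s$u[, keep, drop = FALSE]))
}

## Cheap route first; use the svd route only when solve() signals an error.
solve_or_ginv <- function(A) {
    tryCatch(solve(A), error = function(e) svd_ginv(A))
}

A_ok  <- matrix(c(2, 1, 1, 3), 2, 2)     # nonsingular
A_bad <- matrix(c(1, 2, 0, 0), 2, 2)     # second column is all zeros

inv_ok  <- solve_or_ginv(A_ok)           # comes from solve()
inv_bad <- solve_or_ginv(A_bad)          # comes from the svd fallback
```

If a newer LAPACK makes solve() reject more near-singular matrices, the error branch runs more often, which is one way the slowdown described above could arise.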
>>>>
>>>> The fact that I can't make that newer KPP algorithm go faster,
>>>> although the authors show it can go faster in Matlab, leads me to a
>>>> bunch of other questions and possibly the need to implement all of
>>>> this in C with LAPACK or EIGEN or something like that, but at this
>>>> point, I've got to return to my normal job. If somebody is good at
>>>> R's .Call interface and can make a pure C implementation of KPP.
>>>>
>>>> I think the key thing is that with R-2.15.2, there is an svd-related
>>>> bottleneck in the multiple imputation algorithms in Amelia. The
>>>> replacement version of the function Amelia:::mpinv does reclaim a 30%
>>>> time saving, while generating imputations that are identical, so far
>>>> as I can tell.
>>>>
>>>> pj
>>>>
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> --
>> Dr. Wang, Yi (Alice)
>> Research Assistant Professor
>> Institute of Computational and Theoretical Studies
>> Department of Computer Science
>> Faculty of Science
>> Hong Kong Baptist University
>> Kowloon Tong, Hong Kong
>> Email: yiw...@comp.hkbu.edu.hk
>> Tel: +852-3411-2789
>> Web: http://www.icts.hkbu.edu.hk/yiwang/public/
>>

PJ> --
PJ> Paul E. Johnson
PJ> Professor, Political Science      Assoc. Director
PJ> 1541 Lilac Lane, Room 504         Center for Research Methods
PJ> University of Kansas              University of Kansas
PJ> http://pj.freefaculty.org         http://quant.ku.edu