Note that using solve can be numerically unstable for certain problems. On Fri, Feb 20, 2009 at 6:50 AM, Kenn Konstabel <lebats...@gmail.com> wrote: > Decyphering formulas seems to be the most time consuming part of lm: > > mylm1 <- function(formula, data) { > # not perfect but works > F <- model.frame(formula,data) > y <- model.response(F) > mt <- attr(F, "terms") > x <- model.matrix(mt,F) > coefs <- solve(crossprod(x), crossprod(x,y)) > coefs > } > > mylm2 <- function(x, y, intercept=TRUE) { > if(!is.matrix(x)) x <- as.matrix(x) > if(intercept) x <- cbind(1,x) > if(!is.matrix(y)) y <- as.matrix(y) > solve(crossprod(x), crossprod(x,y)) > } > >> system.time(for(i in 1:1000) mylm2(EuStockMarkets[,-1], >> EuStockMarkets[,"DAX"])) > user system elapsed > 6.43 0.00 6.53 >> system.time(for(i in 1:1000) mylm1(DAX~., EuStockMarkets)) > user system elapsed > 16.19 0.00 16.23 >> system.time(for(i in 1:1000) lm(DAX~., EuStockMarkets)) > user system elapsed > 21.43 0.00 21.44 > > So if you need to save time, I'd suggest something close to mylm2 rather > than mylm1. > > Kenn > > > > On Thu, Feb 19, 2009 at 8:04 PM, Gabor Grothendieck > <ggrothendi...@gmail.com> wrote: >> >> On Thu, Feb 19, 2009 at 8:30 AM, Esmail Bonakdarian <esmail...@gmail.com> >> wrote: >> >> > Thanks for the suggestions, I'll have to see if I can figure out how to >> > convert the relatively simple call to lm with an equation and the data >> > file >> > to the functions you mention (or if that's even feasible). >> >> X <- model.matrix(formula, data) >> >> will calculate the X matrix for you. >> >> > >> > Not an expert in statistics myself, I am mostly concentrating on the >> > programming aspects of R. Problem is that I suspect my colleagues who >> > are providing some guidance with the stats end are not quite experts >> > themselves, and certainly new to R. >> > >> > Cheers, >> > >> > Esmail >> > >> > Kenn Konstabel wrote: >> >> >> >> lm does lots of computations, some of which you may never need. If >> >> speed >> >> really matters, you might want to compute only those things you will >> >> really >> >> use. If you only need coefficients, then using %*%, solve and crossprod >> >> will >> >> be remarkably faster than lm >> >> >> >> # repeating someone else's example >> >> # lm(DAX~., EuStockMarkets) >> >> >> >> y <- EuStockMarkets[,"DAX"] >> >> x <- EuStockMarkets >> >> x[,1]<-1 >> >> colnames(x)[1] <- "Intercept" >> >> >> >> lm(y ~ x-1) >> >> solve(crossprod(x), t(x))%*%y # probably this can be done more >> >> efficiently >> >> >> >> # and a naive timing >> >> >> >> > system.time( for(i in 1:1000) lm(y ~ x-1)) >> >> user system elapsed >> >> 14.64 0.33 32.69 >> >> > system.time(for(i in 1:1000) solve(crossprod(x), crossprod(x,y)) ) >> >> user system elapsed >> >> 0.36 0.00 0.36 >> >> >> >> >> >> Also lsfit() is a bit quicker than lm or lm.fit. >> >> >> >> Regards, >> >> Kenn >> > >> > ______________________________________________ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > >
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.