Decyphering formulas seems to be the most time consuming part of lm: mylm1 <- function(formula, data) { # not perfect but works F <- model.frame(formula,data) y <- model.response(F) mt <- attr(F, "terms") x <- model.matrix(mt,F) coefs <- solve(crossprod(x), crossprod(x,y)) coefs }
mylm2 <- function(x, y, intercept=TRUE) { if(!is.matrix(x)) x <- as.matrix(x) if(intercept) x <- cbind(1,x) if(!is.matrix(y)) y <- as.matrix(y) solve(crossprod(x), crossprod(x,y)) } > system.time(for(i in 1:1000) mylm2(EuStockMarkets[,-1], EuStockMarkets[,"DAX"])) user system elapsed 6.43 0.00 6.53 > system.time(for(i in 1:1000) mylm1(DAX~., EuStockMarkets)) user system elapsed 16.19 0.00 16.23 > system.time(for(i in 1:1000) lm(DAX~., EuStockMarkets)) user system elapsed 21.43 0.00 21.44 So if you need to save time, I'd suggest something close to mylm2 rather than mylm1. Kenn On Thu, Feb 19, 2009 at 8:04 PM, Gabor Grothendieck <ggrothendi...@gmail.com > wrote: > On Thu, Feb 19, 2009 at 8:30 AM, Esmail Bonakdarian <esmail...@gmail.com> > wrote: > > Thanks for the suggestions, I'll have to see if I can figure out how to > > convert the relatively simple call to lm with an equation and the data > file > > to the functions you mention (or if that's even feasible). > > X <- model.matrix(formula, data) > > will calculate the X matrix for you. > > > > > Not an expert in statistics myself, I am mostly concentrating on the > > programming aspects of R. Problem is that I suspect my colleagues who > > are providing some guidance with the stats end are not quite experts > > themselves, and certainly new to R. > > > > Cheers, > > > > Esmail > > > > Kenn Konstabel wrote: > >> > >> lm does lots of computations, some of which you may never need. If speed > >> really matters, you might want to compute only those things you will > really > >> use. If you only need coefficients, then using %*%, solve and crossprod > will > >> be remarkably faster than lm > >> > >> # repeating someone else's example > >> # lm(DAX~., EuStockMarkets) > >> > >> y <- EuStockMarkets[,"DAX"] > >> x <- EuStockMarkets > >> x[,1]<-1 > >> colnames(x)[1] <- "Intercept" > >> > >> lm(y ~ x-1) > >> solve(crossprod(x), t(x))%*%y # probably this can be done more > >> efficiently > >> > >> # and a naive timing > >> > >> > system.time( for(i in 1:1000) lm(y ~ x-1)) > >> user system elapsed > >> 14.64 0.33 32.69 > >> > system.time(for(i in 1:1000) solve(crossprod(x), crossprod(x,y)) ) > >> user system elapsed > >> 0.36 0.00 0.36 > >> > >> > >> Also lsfit() is a bit quicker than lm or lm.fit. > >> > >> Regards, > >> Kenn > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.