This page will tell you how to delete rows:

https://stat.ethz.ch/pipermail/r-help/2007-November/146220.html

I have used lm() with matrices, however, there maybe a more efficient method.  
I don't have the need for matrix functions, often.  Hopefully, someone can 
direct you, if there is a more efficient method.


~Nicole Ford
Ph.D. Student
Graduate Assistant/ Instructor
Department of Government and International Affairs
University of South Florida
office: SOC 012M


Sent from my iPhone

On Feb 26, 2013, at 11:41 AM, Jie <jimmycl...@gmail.com> wrote:

> Hi All,
> 
> I have millions of regression lines to fit. So I am looking for the
> most efficient approach in R.
> 
> Details:
> I have a large desing matrix X. The dimension is n by p.
> Each time when fitting the model, select rows from this matrix X and
> form a new design matrix, called  X_current.
> There is another binary matrix M, with dim m by n, and each row is a
> 1*n vector. It helps to determin X_current.
> For instance, if the first row of M is (1,0,1,0,1,..,1), then I remove
> the 2nd and 4th row from X, and use this submatrix as the design
> matrix to fit the model.
> if the second row is (0,0,0,1,...,1), then I use X[4:n,] as input
> matrix to do the regression,
> etc.
> 
> I would like to extract a single element of estimated parameter beta,
> and the SSE from the results.
> 
> So far what I know is to use these functions which performs better than lm():
> 1. lsfit()  # this works faster than lm(), but have to extract the
> results without using ls.print(print=F), maybe I/O slows down. And not
> better than the other two approaches
> 2. directly use solve(X'X) and the formula from score equation to get
> the results  # not sure how well for almost singular or wired matrices
> 3. directly use qr() with LAPACK or qr.solve()  and the same formula
> # qr decomposion is reliable but takes a little more time than solve()
> 
> Below is a piece of code for demonstration and time comparision. I do
> not know the best way to generate random non-singular design matrix,
> so I add try() to ignore errors.
> I did not include apply/sapply/lapply, seems they can save the memory
> but in my case do not improve. Thank you.
> 
> #----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> M = matrix( NA, nrow = 10^4, ncol = 200)
> for (i in 1:10^4)
> {
>    M[i,] = sample(c(0,1), size=200, replace = T)
> }
> 
> X = cbind( rnorm(200,1,10) ,1:200, sample(1:10^5,200,T))
> Y = rnorm(200)
> 
> 
> 
> # loop version
> resu1=matrix(NA,nrow=10^4, ncol=2)
> resu2=matrix(NA,nrow=10^4, ncol=2)
> resu3=matrix(NA,nrow=10^4, ncol=2)
> resu4=matrix(NA,nrow=10^4, ncol=2)
> 
> 
> t1 = proc.time()
> for (i in 1:10^4)
> {
>    fit.ls = lsfit( x = X[M[i,],], y = Y[M[i,]], intercept=F)
>    resu1[i,] = c( fit.ls$coeff[2], sum((fit.ls$res)^2))
> }
> t2 = proc.time()
> for (i in 1:10^4)
> {
>    hatM = t(X[M[i,],])%*%X[M[i,],]
>    try( beta <- ( solve( hatM ) %*% t( X[M[i,],] )%*%Y[M[i,]] ), silent = T)
>    try( varr <- sum((Y[M[i,]]-X[M[i,],]%*%beta)^2), silent = T)
>    try( resu2[i, ] <- c( beta[2], varr ), silent = T)
> }
> t3 = proc.time()
> 
> for (i in 1:10^4)
> {
>    hatM = t(X[M[i,],])%*%X[M[i,],]]
>    try( beta <- ( qr.solve( hatM ) %*% t( X[M[i,],] )%*%Y[M[i,]] ), silent = 
> T)
>    try( varr <- sum(( Y[M[i,]]-X[M[i,],]%*%beta)^2 ), silent = T)
>    try( resu2[i, ] <- c( beta[2], varr )), silent = T)
> }
> t4 = proc.time()
> 
> (t2-t1)[3]
> (t3-t2)[3]
> (t4-t3)[3]
> 
> 
> #----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to