Hi Ira,
I tried ?lapply(). It looks like it slightly edges out the ?for() loop.
For example:
set.seed(435)
m1 <- matrix(rnorm(2000*30), ncol=30)
m2 <- matrix(rnorm(2000*30), ncol= 30)
corsP<-vector()
system.time({for(i in 1:2000) corsP[i] = cor(m1[i,], m2[i,])})
# user system elapsed
# 0.124 0.000 0.122
system.time({corsP2<- unlist(lapply(1:2000,function(i) cor(m1[i,],m2[i,])))})
# user system elapsed
# 0.108 0.000 0.110
identical(corsP,corsP2)
#[1] TRUE
system.time(corsP3<- diag(cor(t(m1),t(m2))))
# user system elapsed
# 0.272 0.004 0.276
mNew<- rbind(m1,m2)
indx<-rep(seq(nrow(mNew)/2),2)
system.time({corsP4<- tapply(seq_along(indx),list(indx),FUN=function(x)
cor(t(mNew[x,]),t(mNew[x,]))[2])})
# user system elapsed
# 0.156 0.000 0.160
attr(corsP4,"dimnames")<- NULL
all.equal(corsP,as.vector(corsP4))
#[1] TRUE
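One further option, sketched here and not timed against the versions above, is to compute the row-wise Pearson correlations directly from centered rows. That avoids both the explicit loop and the full 2000 x 2000 matrix. It assumes the two matrices have the same dimensions and contain no NA values. rowCors is a made-up helper name, not a base R function.

```r
# Row-wise Pearson correlation from centered rows.
# rowCors is a hypothetical helper; it assumes equal dimensions and no NAs.
rowCors <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)))
  # Subtracting a length-nrow vector recycles down the columns,
  # so every element of row i has the mean of row i removed.
  ac <- a - rowMeans(a)
  bc <- b - rowMeans(b)
  rowSums(ac * bc) / sqrt(rowSums(ac^2) * rowSums(bc^2))
}

set.seed(435)
m1 <- matrix(rnorm(2000 * 30), ncol = 30)
m2 <- matrix(rnorm(2000 * 30), ncol = 30)

corsP5 <- rowCors(m1, m2)

# Reference values from calling cor() one row at a time
corsLoop <- vapply(seq_len(nrow(m1)),
                   function(i) cor(m1[i, ], m2[i, ]),
                   numeric(1))
all.equal(corsP5, corsLoop)
```

Wrapping either call in system.time() on your own data would show whether this is actually faster there.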
A.K.
________________________________
From: Ira Sharenow <[email protected]>
To: arun <[email protected]>
Sent: Monday, September 23, 2013 5:45 PM
Subject: Re: Correlate rows of 2 matrices
Arun,
What department are you in? Are you on LinkedIn?
The loop takes about a second. I do not know how to use lapply/sapply with more
than one object and a function of two variables such as cor().
When there are 2,000 columns it cannot be right to compute 4,000,000
correlations in order to use the 2,000 that are along the diagonal.
Ira
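On the question of using sapply/lapply with two objects and a two-argument function such as cor(): a minimal sketch of two common patterns follows, using the small 3 x 10 example from the start of the thread. The names cors1, cors2, rows1 and rows2 are only illustrative.

```r
set.seed(22)
m1 <- matrix(rnorm(30), nrow = 3)
m2 <- matrix(rnorm(30), nrow = 3)

# Pattern 1: iterate over row indices. The anonymous function
# looks up both matrices in the enclosing environment.
cors1 <- sapply(seq_len(nrow(m1)), function(i) cor(m1[i, ], m2[i, ]))

# Pattern 2: split each matrix into a list of rows, then let mapply()
# walk the two lists in parallel and pass one row from each to cor().
rows1 <- split(m1, row(m1))
rows2 <- split(m2, row(m2))
cors2 <- unname(mapply(cor, rows1, rows2))

all.equal(cors1, cors2)
```

Both patterns compute only one correlation per row pair, so neither builds the full cross-correlation matrix.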
On 9/23/2013 2:12 PM, arun wrote:
Ira,
I work as a postdoc at Wayne State Univ. in Detroit. I didn't check the
speed of ?diag(). It could be a bit slower because it first computes the whole
correlation matrix and then takes the diagonal elements. In that respect, the
loop will save time. It would be worth checking whether ?lapply() improves the
speed compared to ?for().
Arun
________________________________
From: Ira Sharenow <[email protected]>
To: arun <[email protected]>
Sent: Monday, September 23, 2013 4:42 PM
Subject: Re: Correlate rows of 2 matrices
Arun,
On a contract, I work for this San Francisco firm. But I work from home.
http://www.manifoldpartners.com/Home.html
How about yourself? Where are you located?
Incidentally, for my large matrix, in addition to computing the Pearson
correlation matrix with use = "pairwise.complete.obs" (85 seconds), I also have
to do Spearman calculations. The code ran for 27 minutes. I only need about
2000 correlations, but I am computing 2000 * 2000 correlations. Using a loop
reduced the time to about 1 second.
Please note that this initial data set is one of the smaller ones I will be
working on.
Ira
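As a rough illustration of the loop approach with missing values, the sketch below computes only the row-wise Pearson and Spearman correlations, passing use = "pairwise.complete.obs" to each cor() call. The data are simulated and the helper name rowCorBy is made up for this example; the actual data set and loop from the thread are not shown, so treat this as an assumption about what such a loop might look like.

```r
set.seed(1)
m1 <- matrix(rnorm(200 * 30), ncol = 30)
m2 <- matrix(rnorm(200 * 30), ncol = 30)
m1[sample(length(m1), 100)] <- NA   # simulated missing values, for illustration only

# rowCorBy is a hypothetical helper: one cor() call per pair of rows,
# dropping positions where either row has an NA.
rowCorBy <- function(a, b, method) {
  vapply(seq_len(nrow(a)),
         function(i) cor(a[i, ], b[i, ], method = method,
                         use = "pairwise.complete.obs"),
         numeric(1))
}

corsPearson  <- rowCorBy(m1, m2, "pearson")
corsSpearman <- rowCorBy(m1, m2, "spearman")

# Cross-check against the diagonal of the full matrix (small case only)
all.equal(corsPearson,
          diag(cor(t(m1), t(m2), use = "pairwise.complete.obs")))
```

The work here grows with the number of rows, whereas the diag(cor(...)) form grows with the square of the number of rows.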
On 9/23/2013 11:54 AM, arun wrote:
Hi Ira,
Glad it worked for you. I would also choose the one you selected.
BTW, where do you work?
Regards,
Arun
________________________________
From: Ira Sharenow <[email protected]>
To: arun <[email protected]>
Sent: Monday, September 23, 2013 2:47 PM
Subject: Re: Correlate rows of 2 matrices
Arun,
Thanks for your help. I am very impressed with your ability to string together
functions in order to achieve a desired result. On the other hand I prefer
simplicity and I will have to explain my code to my boss who might have to
eventually modify my code after I’ve moved on. I decided to go with your first
option. It worked quite well.
diag(cor(t(m1),t(m2)))
Thanks again.
Ira
On 9/22/2013 6:57 PM, Ira Sharenow wrote:
>Arun,
>
>I have a new problem for you. I have two data frames (or matrices) and row by
>row I want to take the correlations. So if I have a 3 row by 10 column matrix,
>I would produce 3 correlations. Is there a way to merge the matrices and then
>use some sort of split? Ideas/solutions much appreciated.
>
>set.seed(22)
>m1 = matrix(rnorm(30), nrow = 3)
>m2 = matrix(rnorm(30), nrow = 3)
>corsP = numeric(3)
>for(i in 1:3) corsP[i] = cor(m1[i,], m2[i,])
>corsP
>[1] -0.50865019 -0.27760046 0.01423144
>
>Thanks. Ira
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.