Let me fix a couple of typos in that email:

Hi All:

Let's say I have two data frames (Condition1 and Condition2) with about 12,000 and 16,000 rows respectively, one column each. The entries contain dates.

I'd like to calculate, for each possible pair of dates (that is, Condition1[1:12,000] and Condition2[1:16,000]), the number of days between the two dates in the pair. The result should be a 12,000 by 16,000 matrix, which I'll call M. The only purpose of building M is to create a histogram of all the values it contains.

Example:

Condition1 <- data.frame('dates' = rep(c('2001-02-10','1998-03-14'),6000))
Condition2 <- data.frame('dates' = rep(c('2003-07-06','2007-03-11'),8000))

First, my instinct is to try to vectorize the operation. I tried this by expanding each vector into a matrix of repeated vectors (I'd then just subtract the two resulting matrices to get M). Both attempts failed with the following error:

> expandedCondition1 <- matrix(rep(Condition1[[1]], nrow(Condition2)), byrow=TRUE, ncol=nrow(Condition1))
Error: cannot allocate vector of size 732.4 Mb
> expandedCondition2 <- matrix(rep(Condition2[[1]], nrow(Condition1)), byrow=FALSE, nrow=nrow(Condition2))
Error: cannot allocate vector of size 732.4 Mb

Since these matrices seem to be too large, I'm wondering whether there's a better way to call hist without actually building the matrix.

I'd greatly appreciate any ideas!

Best,
Jonathan

On Mon, Feb 15, 2010 at 8:19 PM, Jonathan <jonsle...@gmail.com> wrote:
> Hi All:
>
> Let's say I have two dataframes (Condition1 and Condition2); each
> being on the order of 12,000 and 16,000 rows; 1 column. The entries
> contain dates.
>
> I'd like to calculate, for each possible pair of dates (that is:
> Condition1[1:10,000] and Condition2[1:10,000], the number of days
> difference between the dates in the pair. The result should be a
> matrix 12,000 by 16,000. Really, what I need is a histogram of all
> the values in this matrix.
>
> Ex):
> Condition1 <- data.frame('dates' = rep(c('2001-02-10','1998-03-14'),6000))
> Condition2 <- data.frame('dates' = rep(c('2003-07-06','2007-03-11'),8000))
>
> First, my instinct is to try and vectorize the operation. I tried
> this by expanding each vector into a matrix of repeated vectors (I'd
> then just subtract the two). I got the following error:
>
>> expandedCondition1 <- matrix(rep(Condition1[[1]], nrow(Condition2)),
>> byrow=TRUE, ncol=nrow(Condition1))
> Error: cannot allocate vector of size 732.4 Mb
>> expandedCondition2 <- matrix(rep(Condition2[[1]], nrow(Condition1)),
>> byrow=FALSE, nrow=nrow(Condition2))
> Error: cannot allocate vector of size 732.4 Mb
>
> Since it seems these matrices are too large, I'm wondering whether
> there's a better way to call a hist command without actually building
> the said matrix..
>
> I'd greatly appreciate any ideas!
>
> Best,
> Jonathan
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
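
One possible way to get the histogram without ever holding the full matrix is sketched below. It is not from the thread, only an illustration under a few assumptions: the dates parse with as.Date, differences are counted in whole days as Condition2 minus Condition1, and `chunk` is a hypothetical block size chosen here for illustration. The idea is to walk through Condition1 in blocks, compute the differences for one block at a time, and add them to a running vector of per-day counts.

```r
# Example data from the post; stringsAsFactors = FALSE keeps the dates as text.
Condition1 <- data.frame(dates = rep(c("2001-02-10", "1998-03-14"), 6000),
                         stringsAsFactors = FALSE)
Condition2 <- data.frame(dates = rep(c("2003-07-06", "2007-03-11"), 8000),
                         stringsAsFactors = FALSE)

# Represent each date as an integer day number so differences are whole days.
d1 <- as.integer(as.Date(Condition1$dates))
d2 <- as.integer(as.Date(Condition2$dates))

# Every possible difference d2 - d1 lies between lo and hi.
lo <- min(d2) - max(d1)
hi <- max(d2) - min(d1)

# counts[k] holds the number of pairs whose difference is lo + k - 1.
# A numeric vector avoids integer overflow for very large pair counts.
counts <- numeric(hi - lo + 1)

# Assumption: a block of 500 rows is small enough for the available memory.
chunk <- 500
for (start in seq(1, length(d1), by = chunk)) {
  idx <- start:min(start + chunk - 1, length(d1))
  # Differences for this block only: length(idx) x length(d2) values.
  diffs <- outer(d1[idx], d2, function(a, b) b - a)
  # Shift the differences to 1-based bin indices and accumulate the counts.
  counts <- counts + tabulate(as.vector(diffs) - lo + 1L, nbins = length(counts))
}

# Draw the distribution from the accumulated counts (one spike per day value).
nonzero <- counts > 0
plot((lo:hi)[nonzero], counts[nonzero], type = "h",
     xlab = "Days between dates", ylab = "Number of pairs")
```

With this layout the largest temporary object has chunk * length(d2) entries instead of 12,000 * 16,000. If wider bins are wanted, the per-day counts can be summed over groups of days afterwards. When the dates repeat as heavily as in the example, tabulating the unique dates in each data frame first would cut the work down much further.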