On Jan 8, 2008 12:34 AM, suman Duvvuru <[EMAIL PROTECTED]> wrote: > Hello,
> I have a dataset with 20,000 variables, and I would like to compute a
> Pearson correlation matrix, which will be 20000 x 20000. The cor()
> function doesn't work in this case due to memory problems. If you have
> any ideas regarding a feasible way to compute correlations on such a
> huge dataset, please help me out.

Considering that a single copy of such a matrix, stored as a dense
matrix, is about 3 GB

> 20000^2 * 8 / (2^20)
[1] 3051.8

I'm not surprised that you run into memory problems. Perhaps it is time
to look at the forest instead of the trees. What would you do with such
a matrix if you were able to calculate and store it?

> Please feel free to share your memory handling techniques in R.
>
> Thanks,
> Suman
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
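If the full matrix really is needed, one workaround (a sketch, not something proposed in the thread; the function name and block size are my own) is to compute the correlation matrix in column blocks, so that only a p-by-block slice has to be held in memory at a time. Each slice could then be written to disk (or summarized) before the next is computed:

```r
# Sketch: block-wise Pearson correlations to limit peak memory use.
# 'x' is the n-by-p data matrix; 'block' (an assumed tuning value)
# controls how many columns are correlated per pass.
block_cor <- function(x, block = 1000L) {
  p <- ncol(x)
  starts <- seq(1L, p, by = block)
  res <- vector("list", length(starts))
  for (i in seq_along(starts)) {
    cols <- starts[i]:min(starts[i] + block - 1L, p)
    # p-by-length(cols) slice of the full correlation matrix;
    # in practice, save each slice to disk here instead of keeping it
    res[[i]] <- cor(x, x[, cols, drop = FALSE])
  }
  res
}
```

Binding the slices back together with cbind() would reproduce cor(x), so the block size only trades memory for passes over the data; it does not change the result.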