Re: [R] Correlation of huge matrix saved as binary file

2012-03-03 Thread Thomas Lumley
On Sat, Mar 3, 2012 at 2:36 PM, Peter Langfelder wrote:
> 3. Instead of calculating the correlations one-by-one, calculate them in small blocks (if you have enough memory and you run a 64-bit R). With 900M rows, you will only be able to put a 900Mx2 into an R object, but if you have two suc
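
The block idea quoted above can be sketched roughly as follows. This is only an illustration, written in Python rather than R, and not code from the thread: `read_columns` is a hypothetical loader standing in for whatever mmap-based access is used to pull a few columns from the binary file, and `pearson` and `blockwise_cor` are made-up names. It assumes no missing values and no constant columns.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def blockwise_cor(read_columns, n_cols, block_size):
    """Fill an n_cols x n_cols correlation matrix one pair of column blocks
    at a time, so only two blocks of columns are held in memory at once.

    read_columns(indices) is a hypothetical loader that returns the
    requested columns as a list of sequences.
    """
    cor = [[1.0] * n_cols for _ in range(n_cols)]  # diagonal stays 1.0
    starts = list(range(0, n_cols, block_size))
    for i0 in starts:
        idx_i = list(range(i0, min(i0 + block_size, n_cols)))
        block_i = read_columns(idx_i)
        for j0 in starts:
            if j0 < i0:
                continue  # lower triangle is filled by symmetry
            idx_j = list(range(j0, min(j0 + block_size, n_cols)))
            # Reuse the block already in memory on the diagonal.
            block_j = block_i if j0 == i0 else read_columns(idx_j)
            for a, col_a in zip(idx_i, block_i):
                for b, col_b in zip(idx_j, block_j):
                    if a < b:
                        r = pearson(col_a, col_b)
                        cor[a][b] = r
                        cor[b][a] = r
    return cor
```

With a small in-memory list of columns `data`, a call such as `blockwise_cor(lambda idx: [data[k] for k in idx], len(data), 2)` exercises the same loop structure. Each off-diagonal block is read again for every row block, so a larger `block_size` trades memory for fewer passes over the file.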

Re: [R] Correlation of huge matrix saved as binary file

2012-03-02 Thread Peter Langfelder
I don't think you can speed it up by a whole lot... but you can try a few things, especially if you don't have missing data in the matrix (which you probably don't). The main question is what takes most of the time: the API calls or the cor() call? If it's cor, here's what you can try: 1. Pre-stan

[R] Correlation of huge matrix saved as binary file

2012-03-02 Thread Bryo
Hi, I have a 900,000,000*9,000 matrix where I need to calculate the correlation between all entries along the smaller dimension, thus creating a 9k*9k correlation matrix. This matrix is too big to be loaded into R, and is saved as a binary file. To access the data in the file I use mmap and some a