I'm not an expert useR, but I asked a similar question on Stack Overflow that might give you some new ideas:
http://stackoverflow.com/questions/17458556/how-can-i-speed-up-this-sapply-for-cross-checking-samples

On Thu, Aug 15, 2013 at 2:30 AM, Praveen Surendran <ps...@medschl.cam.ac.uk> wrote:

> Dear Doran, Bert and Roger,
>
> Thank you for attending to my query and for your valuable responses.
>
> The task is slightly more complex. Here's the real case... I have genetic
> variation data (40,000 single nucleotide polymorphisms) from 90,000
> individuals. This makes the 90,000 (samples) rows/columns of the matrix and
> 40,000 (SNPs) rows/columns of the matrix. Matrix data are genetic
> variations with values 0, 1, 2 or 3, where 0 is missing. There will be very
> few individuals with missing data.
>
> The task is to identify the relatedness between these 90,000 individuals
> using their genetic data (0, 1, 2 or 3). These values need to be
> standardised before matrix multiplication. This will make the matrix much
> larger compared to the 0/1/2/3 matrix, and most of these will be real
> numbers with decimals.
>
> Bert, I will not be doing a 90,000 x 40,000 %*% 40,000 x 90,000. The plan
> is to load this 90000 x 40000 matrix into R, then standardise and multiply
> this in batches of 90,000 samples against 500 samples using these 40,000
> variants, and process these in parallel to get 90,000 x 90,000 comparisons.
> Does that sort of clarify the situation?
>
> I tried loading a 90,000 x 40,000 matrix as a matrix in R this morning on
> the cluster with specifications described in my previous e-mail. This
> crashed due to memory overflow. I am trying for possibilities
>
> Any comments or thoughts will be greatly appreciated.
>
> Regards,
>
> Praveen.
>
> -----Original Message-----
> From: Roger Koenker [mailto:rkoen...@illinois.edu]
> Sent: 14 August 2013 23:06
> To: Praveen Surendran
> Cc: r-help@r-project.org
> Subject: Re: [R] Matrix Multiplication using R.
>
> In the event that these are moderately sparse matrices, you could try
> Matrix or SparseM.
>
> Roger Koenker
> rkoen...@illinois.edu
>
> On Aug 14, 2013, at 10:40 AM, Praveen Surendran wrote:
>
> > Dear all,
> >
> > I am exploring ways to perform multiplication of a 90000 x 40000 matrix
> > with its transpose.
> > As expected, even a 40000 x 100 %*% 100x40000 didn't work on my
> > desktop... giving the error "Error: cannot allocate vector of length
> > 1600000000"
> >
> > However, I am trying to run this on one node (64GB RAM; 2.60 GHz
> > processor) of a high performance computing cluster.
> > Appreciate if anyone has any comments on whether it's advisable to
> > perform a matrix multiplication of this size using R and also on any
> > better ways to handle this task.
> >
> > Kind Regards,
> >
> > Praveen.
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
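
For what it's worth, here is a rough sketch in R of the batched plan described in the quoted message: standardise the genotype matrix once, then multiply all samples against one batch of samples at a time. It uses toy dimensions so it runs anywhere. The names gib, standardise, block and rel are made up for illustration, and treating missing calls (0) as the SNP mean after standardising is my assumption, not something stated in the thread. The gib() helper only does the arithmetic on storage: as doubles, a 90,000 x 40,000 matrix needs about 27 GiB and a 90,000 x 90,000 result about 60 GiB, so neither fits comfortably on a 64GB node alongside working copies.

```r
# Size of a dense numeric matrix in GiB (8 bytes per double).
gib <- function(rows, cols, bytes = 8) rows * cols * bytes / 2^30
gib(90000, 40000)  # standardised genotype matrix
gib(90000, 90000)  # full sample-by-sample result
gib(90000, 500)    # one 500-sample slab of the result

# Toy data: the real case is 90,000 samples x 40,000 SNPs.
set.seed(1)
n_samples <- 200
n_snps    <- 50
block     <- 64    # batch size; the thread proposes 500

# Genotypes coded 1/2/3, with 0 meaning "missing".
geno <- matrix(sample(0:3, n_samples * n_snps, replace = TRUE,
                      prob = c(0.02, 0.30, 0.40, 0.28)),
               nrow = n_samples, ncol = n_snps)

# Centre and scale each SNP (column). Missing calls are set to NA first,
# so they are ignored by scale(), and then set to 0, i.e. the column mean.
standardise <- function(g) {
  g[g == 0] <- NA
  z <- scale(g)
  z[is.na(z)] <- 0   # also catches NaN from constant columns
  z
}
z <- standardise(geno)

# Build the result one slab of columns at a time:
# (all samples x SNPs) %*% t(batch x SNPs) gives an n_samples x batch block.
rel <- matrix(0, n_samples, n_samples)
for (s in seq(1, n_samples, by = block)) {
  idx <- s:min(s + block - 1, n_samples)
  rel[, idx] <- tcrossprod(z, z[idx, , drop = FALSE]) / n_snps
}

# Same answer as the one-shot product, which is only feasible at toy size.
all.equal(rel, tcrossprod(z) / n_snps)
```

At full size you would not keep rel in memory as above. Each slab could be written to disk or summarised as soon as it is computed, and the slabs are independent, so they can be handed to separate workers.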