You will probably need a 64-bit version of R, but you are running on Windows and I think there is only a beta version available there.
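For a sense of scale, here is a back-of-envelope check of the vector sizes involved. It is only a sketch: it assumes R's usual 4 bytes per integer element and 8 bytes per double element, ignores the small per-vector header, and uses a helper name (bytes_to_mb) that is made up for illustration.

```r
# R reports "Mb" in allocation errors in units of 2^20 bytes.
bytes_to_mb <- function(bytes) bytes / 2^20

# Length of rep(1:940, 100000) from the quoted ks.test call.
n_ks <- 940 * 100000
ks_mb <- round(bytes_to_mb(n_ks * 4), 1)    # integer vector: 4 bytes per element
print(ks_mb)

# "About 90 million observations", stored as doubles.
n_big <- 90e6
big_mb <- round(bytes_to_mb(n_big * 8), 1)  # double vector: 8 bytes per element
print(big_mb)
```

The first figure comes out at 358.6, the same size as the allocation that failed in the quoted ks.test error. A few working copies of vectors this large add up quickly, which is where a 32-bit address space becomes the limit.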
I ran this on my 32-bit Windows version to generate 1M indices into your sample space; you could use these to pull the data from a database, or there is a package for mapping data into shared memory.

> z <- sample(90e6,1e6)
> gc()
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 127539  3.5     350000   9.4   350000   9.4
Vcells 583209  4.5   38486022 293.7 45583534 347.8
>

Notice that this used roughly 347 MB to create the sequence 1:90M to sample from. So you will have to manage space carefully or go to 64-bit and buy memory.

On Tue, Mar 9, 2010 at 4:28 PM, Jonathan <jonsle...@gmail.com> wrote:

> Furthermore, I am not even able to take a sample of my large vector
> (which does exist somehow and is in memory):
>
> > sampleOfBigVector <- c(range(myBigVector),sample(myBigVector, 1000))
> Error: cannot allocate vector of size 718.0 Mb
>
> I guess I don't know what else I can do now, except find some cluster
> with a lot of memory to run this code on (presumably I'd be able to
> allocate those vectors then)?
>
> Jonathan
>
> On Tue, Mar 9, 2010 at 4:11 PM, Jonathan <jonsle...@gmail.com> wrote:
> > Hi R-help,
> > I am interested in comparing two vectors of data
> > observations to see if they come from the same distribution (and have
> > settled on the Kolmogorov-Smirnov test to do this).
> >
> > I'd prefer to use all my data points, but computationally speaking,
> > this is proving to be troublesome due to the size of my vectors (the
> > larger of the two is about 90 million observations). I suppose I
> > could take a smaller sample of points from that large vector to use as
> > input in my ks-test, but I want to see if I can avoid doing that, in
> > favor of including all of the data.
> >
> > Code:
> >> result <- ks.test(rep(1:940,100000),rep(1:940,800))
> > Error: cannot allocate vector of size 358.6 Mb
> >
> > Any ideas?
> >
> > OS: Windows 7 64-bit, R ver. 2.10.1, Memory: 4 gb
> >
> > Best,
> > Jonathan
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
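
The index-sampling idea from the reply above could be sketched roughly as follows. This is a minimal illustration, not a tested recipe for the 90-million-point case: big_vec and small_vec are hypothetical stand-ins (small simulated vectors) for the poster's two data vectors, and the sample size of 1000 simply mirrors the one in the quoted attempt.

```r
# Hypothetical stand-ins for the two data vectors; the real ones would
# come from the poster's own data (or a database query, as suggested).
set.seed(1)
big_vec   <- rnorm(2e5)   # stands in for the ~90 million observation vector
small_vec <- rnorm(5e4)   # stands in for the smaller vector

# Draw indices (without replacement) rather than values, so only the
# sampled elements of the large vector need to be copied.
idx <- sample.int(length(big_vec), size = 1000)
big_sample <- big_vec[idx]

# Two-sample Kolmogorov-Smirnov test: subsample vs. the smaller vector.
result <- ks.test(big_sample, small_vec)
print(result$p.value)
```

With heavily tied data such as rep(1:940, ...), ks.test may warn about ties, so the p-value should be read with that in mind.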