Furthermore, I am not even able to take a sample of my large vector (which does exist somehow and is in memory):
> sampleOfBigVector <- c(range(myBigVector),sample(myBigVector, 1000))
Error: cannot allocate vector of size 718.0 Mb

I guess I don't know what else I can do now, except find some cluster with a lot of memory to run this code on (presumably I'd be able to allocate those vectors then)?

Jonathan

On Tue, Mar 9, 2010 at 4:11 PM, Jonathan <jonsle...@gmail.com> wrote:
> Hi R-help,
>
> I am interested in comparing two vectors of data observations to see
> if they come from the same distribution (and have settled on the
> Kolmogorov-Smirnov test to do this).
>
> I'd prefer to use all my data points, but computationally speaking,
> this is proving to be troublesome due to the size of my vectors (the
> larger of the two is about 90 million observations). I suppose I
> could take a smaller sample of points from that large vector to use as
> input in my ks-test, but I want to see if I can avoid doing that, in
> favor of including all of the data.
>
> Code:
> > result <- ks.test(rep(1:940,100000),rep(1:940,800))
> Error: cannot allocate vector of size 358.6 Mb
>
> Any ideas?
>
> OS: Windows 7 64-bit, R ver. 2.10.1, Memory: 4 GB
>
> Best,
> Jonathan
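
One possible way around the allocation errors, offered only as a sketch: the example data contain just 940 distinct values, so the two-sample Kolmogorov-Smirnov statistic D can be computed from per-value counts without ever building the full vectors. The helper name ks_stat_from_counts and its arguments are hypothetical (they are not part of R's stats package), and the function returns only D, not a p-value. It assumes the counts for each distinct value are already available or can be accumulated separately.

```r
# Hypothetical helper: two-sample KS statistic from value counts.
# values_x / values_y: distinct observed values in each sample
# counts_x / counts_y: how many times each of those values occurs
ks_stat_from_counts <- function(values_x, counts_x, values_y, counts_y) {
  # Common, sorted grid of every value seen in either sample
  all_values <- sort(unique(c(values_x, values_y)))

  # Counts aligned to the common grid (zero where a value is absent)
  cx <- numeric(length(all_values))
  cy <- numeric(length(all_values))
  cx[match(values_x, all_values)] <- counts_x
  cy[match(values_y, all_values)] <- counts_y

  # Empirical CDFs evaluated at each distinct value
  Fx <- cumsum(cx) / sum(cx)
  Fy <- cumsum(cy) / sum(cy)

  # D is the largest absolute gap between the two empirical CDFs
  max(abs(Fx - Fy))
}

# Same data as ks.test(rep(1:940, 100000), rep(1:940, 800)),
# described by counts instead of materialized vectors
d <- ks_stat_from_counts(1:940, rep(100000, 940), 1:940, rep(800, 940))
print(d)
```

Both samples in this example put equal weight on each of the values 1 to 940, so their empirical CDFs coincide and D should be essentially zero. With this many ties, any p-value derived from D would need care, since the usual KS distribution theory assumes continuous data.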