> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan
> Sent: Tuesday, March 09, 2010 1:28 PM
> To: r-help
> Subject: Re: [R] ks.test; memory problems
>
> Furthermore, I am not even able to take a sample of my large vector
> (which does exist somehow and is in memory):
>
> > sampleOfBigVector <- c(range(myBigVector),sample(myBigVector, 1000))
> Error: cannot allocate vector of size 718.0 Mb

Add the argument replace=TRUE to the call to sample() to save space
(presumably the space is used to check for duplicates in the sample).
Sampling with replacement is unlikely to make a difference in this case.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> I guess I don't know what else I can do now, except find some cluster
> with a lot of memory to run this code on (presumably I'd be able to
> allocate those vectors then)?
>
> Jonathan
>
>
> On Tue, Mar 9, 2010 at 4:11 PM, Jonathan <jonsle...@gmail.com> wrote:
> > Hi R-help,
> >      I am interested in comparing two vectors of data
> > observations to see if they come from the same distribution (and have
> > settled on the Kolmogorov-Smirnov test to do this).
> >
> > I'd prefer to use all my data points, but computationally speaking,
> > this is proving to be troublesome due to the size of my vectors (the
> > larger of the two is about 90 million observations). I suppose I
> > could take a smaller sample of points from that large vector to use as
> > input in my ks-test, but I want to see if I can avoid doing that, in
> > favor of including all of the data.
> >
> > Code:
> >> result <- ks.test(rep(1:940,100000),rep(1:940,800))
> > Error: cannot allocate vector of size 358.6 Mb
> >
> > Any ideas?
> >
> > OS: Windows 7 64-bit, R ver. 2.10.1, Memory: 4 GB
> >
> > Best,
> > Jonathan
> >
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
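
A minimal sketch of the replace=TRUE suggestion from the reply above. The vectors here are small stand-ins (assumptions for illustration, not the poster's real data, which is about 90 million observations long), and mySmallVector is a hypothetical name for the second vector. Whether replace=TRUE actually avoids the large allocation depends on the R version, so treat this as something to try rather than a guaranteed fix.

```r
# Stand-in data: same shape as the vectors in the thread, but much shorter
# so the sketch runs anywhere. Both vector sizes are assumptions.
set.seed(1)
myBigVector   <- rep(1:940, 1000)
mySmallVector <- rep(1:940, 800)   # hypothetical name for the second vector

# Draw 1000 values with replacement, so sample() does not need to track
# which elements have already been drawn. range() keeps the min and max
# in the sample, as in the original call.
sampleOfBigVector <- c(range(myBigVector),
                       sample(myBigVector, 1000, replace = TRUE))

# Data like this is full of ties, which can make ks.test() emit a warning;
# it is suppressed here only to keep the sketch quiet.
result <- suppressWarnings(ks.test(sampleOfBigVector, mySmallVector))
print(result$p.value)
```

With only 1000 draws from tens of millions of observations, repeated draws of the same element are rare, which is why sampling with replacement is unlikely to change the outcome of the test.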