You will probably need a 64-bit version of R. You are running on Windows,
though, and I think only a beta 64-bit build is available there.

I ran this on my 32-bit Windows version to generate 1M indices into your
sample space; you could use these indices to pull the data from a database,
or there is a package for mapping data into shared memory.

> z <- sample(90e6,1e6)
> gc()
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 127539  3.5     350000   9.4   350000   9.4
Vcells 583209  4.5   38486022 293.7 45583534 347.8
>
Notice that this used about 348 Mb (the "max used" column) to create the
sequence 1:90M to sample from.  So you will have to manage space carefully
or go to 64-bit and buy more memory.
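
As a rough sketch of what "managing space carefully" could look like (this is
my assumption of one workable approach, not something from the session above):
the 348 Mb is almost entirely the integer sequence 1:90M at 4 bytes per
element, and you can avoid building it by drawing the indices from runif()
instead. The catch is that this samples WITH replacement, so a few duplicate
indices are possible.

```r
# Sketch only: draw k indices in 1..n without first building the vector 1:n.
# Assumption: sampling WITH replacement is acceptable here.
n <- 90e6   # size of the sample space, as in the example above
k <- 1e6    # number of indices wanted

# Approximate size of the integer sequence 1:n (4 bytes per element), in Mb
seq_mb <- n * 4 / 2^20

# runif() returns values strictly between 0 and 1, so ceiling(u * n)
# always lands in 1..n; only k values are ever held in memory.
set.seed(1)   # only to make the sketch reproducible
idx <- as.integer(ceiling(runif(k) * n))

cat("sequence 1:n would need about", round(seq_mb, 1), "Mb\n")
cat("index vector needs about", round(k * 4 / 2^20, 1), "Mb\n")
```

The resulting idx can then be used the same way as z above, e.g. to subset
the big vector or to select rows from a database.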

On Tue, Mar 9, 2010 at 4:28 PM, Jonathan <jonsle...@gmail.com> wrote:

> Furthermore, I am not even able to take a sample of my large vector
> (which does exist somehow and is in memory):
>
> > sampleOfBigVector <- c(range(myBigVector),sample(myBigVector, 1000))
> Error: cannot allocate vector of size 718.0 Mb
>
>
> I guess I don't know what else I can do now, except find some cluster
> with a lot of memory to run this code on (presumably I'd be able to
> allocate those vectors then)?
>
> Jonathan
>
>
> On Tue, Mar 9, 2010 at 4:11 PM, Jonathan <jonsle...@gmail.com> wrote:
> > Hi R-help,
> >    I am interested in comparing two vectors of data
> > observations to see if they come from the same distribution (and have
> > settled on the Kolmogorov-Smirnov test to do this).
> >
> > I'd prefer to use all my data points, but computationally speaking,
> > this is proving to be troublesome due to the size of my vectors (the
> > larger of the two is about 90 million observations).  I suppose I
> > could take a smaller sample of points from that large vector to use as
> > input in my ks-test, but I want to see if I can avoid doing that, in
> > favor of including all of the data.
> >
> > Code:
> >> result <- ks.test(rep(1:940,100000),rep(1:940,800))
> > Error: cannot allocate vector of size 358.6 Mb
> >
> > Any ideas?
> >
> > OS: Windows 7 64-bit, R ver. 2.10.1, Memory: 4 gb
> >
> > Best,
> > Jonathan
> >
> >
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


