Hi all,

I have an interesting project coming up, but the datasets are way bigger
than anything I've used before with R. I'll end up with a dataset with about
45,000,000 records, each with 3 columns. I'll probably want to add some more
columns for analysis if I can. My client can't deal with such big files, so
she is sending the data to me in chunks of "only" 4,500,000 records at a
time. The question is: can I safely combine all 45 million records into a
single data frame or matrix in R? It should be under 2 gigabytes total, but I
don't want the whole import process to fail after, say, 20 million records.
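
For scale, here is a rough sketch of what I have in mind; it has not been run
on the real data. It assumes every column is numeric and stored as a double
(8 bytes per value), and that each chunk arrives as a headerless CSV file.
The function names estimate_gib and import_chunks, and the file names in the
commented example, are made up for illustration. The idea is to allocate the
full matrix once and fill it chunk by chunk, so an overflow shows up as a
"subscript out of bounds" error instead of the object growing silently.

```r
# Rough in-memory size, assuming each value is a double (8 bytes).
estimate_gib <- function(n_records, n_cols, bytes_per_value = 8) {
  n_records * n_cols * bytes_per_value / 2^30
}

estimate_gib(45e6, 3)  # roughly 1 GiB for 45 million rows x 3 columns

# Read chunk files one at a time into a preallocated numeric matrix.
# Assumption: headerless CSV files whose columns are all numeric.
import_chunks <- function(files, n_total, n_cols) {
  out <- matrix(NA_real_, nrow = n_total, ncol = n_cols)
  next_row <- 1
  for (f in files) {
    chunk <- as.matrix(read.csv(f, header = FALSE,
                                colClasses = rep("numeric", n_cols)))
    rows <- seq.int(next_row, length.out = nrow(chunk))
    out[rows, ] <- chunk          # errors if the chunks exceed n_total rows
    next_row <- next_row + nrow(chunk)
  }
  out
}

# Hypothetical usage with ten chunk files of 4.5 million rows each:
# files <- sprintf("chunk_%02d.csv", 1:10)
# big   <- import_chunks(files, n_total = 45e6, n_cols = 3)
```

At 45 million rows by 3 columns the matrix has 135 million cells, which is
below the 2^31 - 1 element mark, and each extra double column would add
roughly another third of a GiB.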

I'm running Windows 7 Home Premium, 64-bit version, with 8 GB of RAM and
plenty of hard drive space, but I'm planning to upgrade the hardware just for
this project, so that it will complete in my lifetime.

Thanks for any help!

rich

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
