Hello,

I'm working with a 22 GB dataset with ~100 million observations and ~40
variables. It's stored in SQLite, and I use the RSQLite package to load
it into memory. Loading the full population, even for only a few
variables, can be very slow, and I was wondering whether there are best
practices for managing large datasets when doing analysis in R. Is
there an alternative file format / relational database in which I
should be storing the data?
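
For reference, this is roughly how I load the data now; the file path,
table, and column names below are placeholders:

    library(DBI)
    library(RSQLite)

    ## Connect to the on-disk SQLite database (path is a placeholder)
    con <- dbConnect(RSQLite::SQLite(), "data.sqlite")

    ## Pull only the variables I need -- still slow at this scale
    ## ("obs", "var1", "var2" are placeholder names)
    dat <- dbGetQuery(con, "SELECT var1, var2 FROM obs")

    dbDisconnect(con)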

Best,

James
-- 
James F. Mahon III, Ph.D. Candidate
Harvard University
Tel: (857) 209-8438
Fax: (270) 813-3498
Web: http://www.people.fas.harvard.edu/~jmahon/

