Dear R helpers:

I am a newbie to R and have a question related to cleaning large data frames
in R.

So far, I have been using SAS for data cleaning because my data sets are
relatively large (multiple files, each of which can be as large as 5-10 GB).
I am not a fan of SAS at all and am eager to move my data cleaning tasks
into R completely.

It seems to me that there are three options: SQL, ff, or filehash. I do not
want to learn SQL, so my question is about ff and filehash.

Specifically:

(1) For merging two large data frames, which is better, ff or filehash?
(2) For reshaping a large data frame (say, from long to wide or the
reverse), which is better, ff or filehash?

If you can provide examples, that would be even better.

Many thanks in advance.

-Sean

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
