Dear R helpers: I am a newbie to R and have a question related to cleaning large data frames in R.
So far, I have been using SAS for data cleaning because my data sets are relatively large (multiple files, each of which can be as large as 5-10 GB). I am not a fan of SAS at all and am eager to move my data cleaning tasks into R completely. It seems to me there are three options: SQL, ff, or filehash. I do not want to learn SQL, so my question is mainly about ff and filehash. Specifically:

(1) For merging two large data frames, which is better, ff or filehash?
(2) For reshaping a large data frame (say, from long to wide or the opposite), which is better, ff or filehash?

If you can provide examples, that would be even better. Many thanks in advance.

-Sean