I've been training randomForest models on 7 million rows of data (41 features). Here's an example call:
    myModel <- randomForest(RESPONSE ~ ., data=mydata, ntree=50, maxnodes=30)

I assumed that with only 50 trees and 30 terminal nodes the memory footprint of "myModel" would be small, but it's 65 MB in a dump file. The object seems to be holding all sorts of predicted, actual, and vote data from the training process. What if I just want the forest and nothing else? I want a small dump file that I can load later and make predictions from quickly. The forest by itself shouldn't be all that large... Does anyone know how to strip the object down to just what's needed for prediction going forward?
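Something along these lines is what I have in mind (an untested sketch; based on str(myModel) I'm assuming predict() only needs $forest, $type, $classes, and $terms, and that the components scaling with the 7 million training rows are $predicted, $votes, $oob.times, and $y -- please correct me if that's wrong):

    library(randomForest)

    ## copy the fitted model and drop the per-row training output,
    ## keeping $forest, $type, $classes, and $terms for predict()
    slimModel <- myModel
    slimModel$predicted <- NULL   # fitted/OOB prediction per training row
    slimModel$votes     <- NULL   # OOB vote matrix (one row per observation)
    slimModel$oob.times <- NULL   # OOB counts, one per training row
    slimModel$y         <- NULL   # copy of the response vector
    slimModel$err.rate  <- NULL
    slimModel$confusion <- NULL
    slimModel$proximity <- NULL

    ## save the slimmed-down object and reload it later for prediction
    save(slimModel, file="slimModel.RData")
    ## load("slimModel.RData")
    ## preds <- predict(slimModel, newdata=newData)

Is that a safe thing to do, or is there a supported way to get the same effect?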