I've been training randomForest models on 7 million rows of data (41
features). Here's an example call:

library(randomForest)
myModel <- randomForest(RESPONSE ~ ., data = mydata, ntree = 50, maxnodes = 30)

I thought that with only 50 trees and at most 30 terminal nodes per tree, the
memory footprint of "myModel" would surely be small. But it's 65 MB when saved
to a dump file. The object seems to be holding all sorts of predicted, actual,
and vote data from the training process.
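
For what it's worth, here's roughly how I've been checking which components
account for the bulk of the object (just my own poking around with
object.size() from utils, nothing authoritative):

# approximate size of each component of the fitted object, largest first
sizes <- sapply(myModel, object.size)
sort(sizes, decreasing = TRUE)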

What if I just want the forest and nothing else? I want a tiny dump file that
I can load later to make predictions quickly. The forest by itself shouldn't
be all that large...

Does anyone know how to strip this sucker down to just what I need for making
predictions going forward?
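
For illustration, something along these lines is what I have in mind; the
components I'm dropping here are my guess at what is only needed during
training, and I haven't verified that predict() still works on the slimmed
object:

slimModel <- myModel
# drop per-observation training output that (I assume) isn't needed for prediction
slimModel$predicted <- NULL
slimModel$votes <- NULL
slimModel$oob.times <- NULL
slimModel$y <- NULL
# save just the slimmed object; reload later with readRDS()
saveRDS(slimModel, "myModel_slim.rds")

That would keep myModel$forest (the part predict.randomForest actually
traverses, if I understand the object structure correctly) while throwing
away the large per-row vectors.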
