Note that this isn't exactly what I recommended. If you look at the example in the help page for combine(), you'll see that it combines RF objects trained on the same data; i.e., instead of growing one RF with 500 trees, you can grow five RFs with 100 trees each on the same data and combine them into one 500-tree RF.
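For illustration, here is a minimal sketch of that pattern. It is not the code from the combine() help page; it assumes the randomForest package is installed and uses the built-in iris data as a stand-in for your own data. The object names (forests, rf_all, rf_small) and the nodesize value of 20 are made up for the example. The last part shows the nodesize alternative for limiting tree size.

```r
library(randomForest)

set.seed(42)

# Grow five forests of 100 trees each, all on the SAME data.
# (iris is only a stand-in here; substitute your own data frame.)
forests <- lapply(1:5, function(i) {
  randomForest(Species ~ ., data = iris, ntree = 100)
})

# Merge the five forests into one 500-tree forest.
rf_all <- do.call(randomForest::combine, forests)
rf_all$ntree

# Alternative way to keep trees small: raise nodesize (the minimum
# size of terminal nodes) instead of shrinking the training sample.
# The value 20 is arbitrary and only for illustration.
rf_small <- randomForest(Species ~ ., data = iris, ntree = 100, nodesize = 20)

# treesize() gives the number of terminal nodes in each tree.
mean(treesize(forests[[1]]))
mean(treesize(rf_small))
```

Each element of forests could just as well be grown in a separate R process or on a separate machine and combined afterwards. Keep in mind that the combined object no longer carries the out-of-bag error estimates of the individual forests, so check prediction performance on held-out data.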
The way you are using combine() is basically using sample size to limit tree size, which you can do by playing with the nodesize argument in randomForest() as I suggested previously. Either way is fine as long as you don't see prediction performance degrading.

Andy

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of apresley
> Sent: Tuesday, January 04, 2011 6:30 PM
> To: r-help@r-project.org
> Subject: Re: [R] randomForest speed improvements
>
> Andy,
>
> Thanks for the reply. I had no idea I could combine them back ... that actually will work pretty well. We can have several "worker threads" load up the RF's on different machines and/or cores, and then re-assemble them. RMPI might be an option down the road, but would be a bit of overhead for us now.
>
> Using the method of combine() ... I was able to drastically reduce the amount of time to build randomForest objects. IE, using about 25,000 rows (6 columns), it takes maybe 5 minutes on my laptop. Using 5 randomForest objects (each with 5k rows), and then combining them, takes < 1 minute.
>
> --
> Anthony
> --
> View this message in context: http://r.789695.n4.nabble.com/randomForest-speed-improvements-tp3172523p3174621.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.