From: Liaw, Andy
>
> Note that that isn't exactly what I recommended. If you look at the
> example in the help page for combine(), you'll see that it is combining
> RF objects trained on the same data; i.e., instead of having one RF with
> 500 trees, you can combine five RFs trained on the same data with 100
> trees each into one 500-tree RF.
>
> The way you are using combine() is basically using sample size to limit
> tree size, which you can do by playing with the nodesize argument in
> randomForest() as I suggested previously. Either way is fine as long as
> you don't see prediction performance degrading.
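For reference, here is a minimal sketch of the same-data use of combine() described in the quoted text. It follows the pattern in the combine() help page. The iris data, the seed, and the names forests and rf_all are illustrative assumptions, not something from the thread.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only for reproducibility of the sketch

# Grow five forests of 100 trees each, all on the SAME data
# (iris stands in for the real data set).
# norm.votes = FALSE keeps raw vote counts, as in the combine() help example.
forests <- lapply(1:5, function(i) {
  randomForest(Species ~ ., data = iris, ntree = 100, norm.votes = FALSE)
})

# Merge the five 100-tree forests into one forest.
rf_all <- do.call(randomForest::combine, forests)

# The combined object should report the total number of trees.
print(rf_all$ntree)
```

Each of the five fits is independent, so they could be run on separate cores or machines before being combined. Note that the combined object does not keep the out-of-bag error rate or confusion matrix of the individual forests.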
I should also mention that another way to do something similar is to use
the sampsize argument in randomForest(). For example, if you call
randomForest() with sampsize=500, it will randomly draw 500 data points
to grow each tree. This way you don't even need to run the RFs
separately and combine them.

Andy

> Andy
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of apresley
> > Sent: Tuesday, January 04, 2011 6:30 PM
> > To: r-help@r-project.org
> > Subject: Re: [R] randomForest speed improvements
> >
> > Andy,
> >
> > Thanks for the reply. I had no idea I could combine them back ... that
> > actually will work pretty well. We can have several "worker threads"
> > load up the RF's on different machines and/or cores, and then
> > re-assemble them. RMPI might be an option down the road, but would be
> > a bit of overhead for us now.
> >
> > Using the method of combine() ... I was able to drastically reduce the
> > amount of time to build randomForest objects. IE, using about 25,000
> > rows (6 columns), it takes maybe 5 minutes on my laptop. Using 5
> > randomForest objects (each with 5k rows), and then combining them,
> > takes < 1 minute.
> >
> > --
> > Anthony
> > --
> > View this message in context:
> > http://r.789695.n4.nabble.com/randomForest-speed-improvements-tp3172523p3174621.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
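A minimal sketch of the sampsize approach mentioned at the top of this message. The simulated data (5,000 rows, 6 predictors), the seed, ntree = 100, and the names x, y and rf are assumptions made for illustration. Only sampsize=500 comes from the message itself.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only for reproducibility of the sketch

# Simulated stand-in data: 5,000 rows, 6 numeric predictors, 2-class response.
n <- 5000
x <- as.data.frame(matrix(rnorm(n * 6), ncol = 6))
y <- factor(x$V1 + x$V2 + rnorm(n) > 0)

# One call, no combine(): each tree is grown on a random draw of 500 rows,
# which keeps every tree small and the fit fast.
rf <- randomForest(x, y, ntree = 100, sampsize = 500)

# Number of terminal nodes per tree; bounded by the 500 rows each tree sees.
summary(treesize(rf))
```

Because every tree still draws from the full data set, this keeps a single forest object while limiting tree size in much the same way as training on 5k-row subsets.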