I'm attempting to fit boosted regression trees to a censored response using IPCW weighting. I've implemented this with two libraries, mboost and gbm, which I believe should yield comparably performing models. That is not the case, however: mboost performs much better, which seems odd. The issue matters because the output of this regression needs to go into a production system, and mboost doesn't even expose the fitted ensembles.
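For context, here is a minimal base-R sketch of how IPCW weights like my t.ipcw are typically constructed: each uncensored observation is weighted by 1/G(T-), where G is the Kaplan-Meier estimate of the censoring survival function, and censored observations get weight 0. The data below is simulated (not my actual tdata), and the sketch ignores ties for simplicity:

```r
set.seed(1)
n <- 200
T.event <- rexp(n, 0.1)                    # latent event times (simulated)
T.cens  <- rexp(n, 0.05)                   # censoring times (simulated)
Y       <- pmin(T.event, T.cens)           # observed follow-up time
delta   <- as.numeric(T.event <= T.cens)   # 1 = event observed, 0 = censored

# Kaplan-Meier estimate of the censoring survival function G(t):
# treat censoring as the "event" of interest (status = 1 - delta)
ord    <- order(Y)
c.ord  <- (1 - delta)[ord]                 # censoring indicator, time-ordered
n.risk <- n - seq_len(n) + 1               # number at risk at each time
G       <- cumprod(1 - c.ord / n.risk)     # G just after each ordered time
G.minus <- c(1, head(G, -1))               # G(t-) just before each ordered time

# IPCW weight: delta_i / G(Y_i-); censored observations get weight 0
t.ipcw <- numeric(n)
t.ipcw[ord] <- ifelse(c.ord == 0, 1 / G.minus, 0)
```

These weights are then passed via the weights= argument to either fitting function, as in the code below.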
# default params for blackboost are a gaussian loss and a maxdepth of 2
m.mboost = blackboost(Y ~ X1 + X2, data=tdata, weights=t.ipcw,
                      control=boost_control(mstop=100))
m.gbm = gbm(Y ~ X1 + X2, data=tdata, weights=t.ipcw,
            distribution="gaussian", interaction.depth=2,
            bag.fraction=1, n.trees=2500)

# compare the IPCW-weighted squared loss of the two fits
sum((predict(m.mboost, newdata=tdata) - tdata$Y)^2 * t.ipcw) <
  sum((predict(m.gbm, newdata=tdata, n.trees=2500) - tdata$Y)^2 * t.ipcw)
# TRUE: mboost with 100 trees reduces the loss of gbm with 100 trees
# by 20%, and that of gbm with 2500 trees by 5%

The documentation says blackboost essentially does the same thing as gbm, so any ideas on what could be driving this large difference in performance?

--
View this message in context: http://r.789695.n4.nabble.com/mboost-vs-gbm-tp4637518.html
Sent from the R help mailing list archive at Nabble.com.