On Tue, 02-Mar-2010 at 02:43PM -0500, Liaw, Andy wrote:
|> In most implementations of boosting, and for that matter, single tree,
|> the first variable wins when there are ties. In randomForest the
|> variables are sampled, and thus not tested in the same order from one
|> node to the next, so the variables are more likely to "share the
|> glory".
That still doesn't explain why with gbm, two identical variables will
"share the glory" (approximately equally).
On Tue, Mar 2, 2010 at 2:43 PM, Liaw, Andy wrote:
> In most implementations of boosting, and for that matter, single tree,
> the first variable wins when there are ties.
They must be in a union :-)
>> What happens if there's a third?
If they were P perfectly correlated predictors, the importance of each
should be diluted to 1/P of its single-copy value.
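Extending the dilution argument to P copies: if each split on the shared signal credits one of the P perfectly correlated copies chosen at random, each copy's expected share of the importance is 1/P. A toy base-R simulation of that expectation (not an actual boosting fit):

```r
set.seed(42)
P <- 3            # number of perfectly correlated copies
n_splits <- 1e4   # splits where this signal is selected
gain <- 1         # importance credit per split (arbitrary units)

## Each split credits one copy, chosen uniformly at random among the P.
chosen <- sample(P, n_splits, replace = TRUE)
credit <- tabulate(chosen, nbins = P) * gain

round(credit / sum(credit), 2)  # each copy ends up with roughly 1/P
```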
In most implementations of boosting, and for that matter, single tree,
the first variable wins when there are ties. In randomForest the
variables are sampled, and thus not tested in the same order from one
node to the next, so the variables are more likely to "share the
glory".
Best,
Andy
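The tie-breaking mechanism Andy describes can be sketched in a few lines of base R (a toy split-selection loop, not the gbm or randomForest internals): when all candidates are scanned in order and the first maximum wins, a duplicated predictor never gets credit; when only a random subset of candidates is scored at each node, either copy can win.

```r
## Toy split scores for three candidate predictors; V1 and V2 are
## identical, so they tie at the maximum.
scores <- c(V1 = 0.8, V2 = 0.8, V3 = 0.5)

## Deterministic rule (most boosting / single-tree implementations):
## which.max() returns the FIRST index at a tie, so V1 always wins.
deterministic_winner <- names(scores)[which.max(scores)]

## randomForest-style rule: score only a random subset (mtry) of the
## candidates at each node, so V1 and V2 win in roughly equal shares.
set.seed(1)
sampled_winner <- function() {
  mtry <- 2
  cand <- sample(names(scores), mtry)
  cand[which.max(scores[cand])]
}
wins <- table(replicate(1000, sampled_winner()))
```

With the deterministic rule V2 never wins; with candidate sampling, V1 and V2 split the wins about evenly and "share the glory".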
On Mon, 01-Mar-2010 at 12:01PM -0500, Max Kuhn wrote:
|> In theory, the choice between two perfectly correlated predictors is
|> random. Therefore, the importance should be "diluted" by half.
|> However, this is implementation dependent.
|>
|> For example, run this:
|>
|> set.seed(1)
|> n <- 100
|> p <- 10
|> data <- as.data.frame(matrix(rnorm(n*(p-1)), nrow = n))
In theory, the choice between two perfectly correlated predictors is
random. Therefore, the importance should be "diluted" by half.
However, this is implementation dependent.
For example, run this:
set.seed(1)
n <- 100
p <- 10
data <- as.data.frame(matrix(rnorm(n*(p-1)), nrow = n))
dat
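The example above is cut off in the archive. A hedged reconstruction of where it appears to be heading, given that only p-1 = 9 columns are random: the tenth predictor is presumably a perfect copy of an existing one. The duplicated column, the outcome, and the gbm() settings below are my guesses, not the original code:

```r
library(gbm)

set.seed(1)
n <- 100
p <- 10
data <- as.data.frame(matrix(rnorm(n * (p - 1)), nrow = n))
## Guess: make the p-th predictor a perfect copy of the first.
data$V10 <- data$V1
## Guess: an outcome driven by the duplicated signal.
data$y <- 2 * data$V1 + rnorm(n)

fit <- gbm(y ~ ., data = data, distribution = "gaussian",
           n.trees = 500, interaction.depth = 2, verbose = FALSE)

## summary.gbm() reports relative influence; V1 and V10 should
## roughly split the credit between them, illustrating the dilution.
summary(fit, plotit = FALSE)
```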
Dear R users,
I'm trying to understand how correlated predictors impact the Relative
Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman
described with single decision trees (referring to Breiman's CART
algorithm), the relative importance measure is augmented by a strate