Well, the explanation under "importance" says that for regression the first
column ("%IncMSE")
is the mean decrease in accuracy and the second ("IncNodePurity") the
mean decrease in MSE.
That does not make much sense to me.
I do not know what "%IncMSE" stands for. All right, MSE = "mean square
error", but of what exactly,
since it is listed next to the variables used for prediction?
And what does "%Inc" refer to? Percent increase? But then, a percent
increase in the mean square error?
Why would I want to increase the mean square error?
If, as I assume, "IncNodePurity" stands for increase in node
impurity... why would I want to increase the node impurity?
It would really help a lot to know exactly how these two values are
calculated and what they stand for. Neither is clear.
Thanks
Mareike
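
For readers of the archive, a rough sketch of the two ideas being asked about. As far as the help page for importance() describes it, the first measure comes from permuting one predictor in the out-of-bag data and recording how much the MSE goes up (averaged over trees and normalized by the standard deviation of the differences), and the second is the total decrease in node impurity, measured by residual sum of squares, from splits on that variable. The code below is not randomForest's implementation: it is plain Python rather than R, it uses a single fixed prediction function instead of a forest, it skips the averaging and normalization, and all function names are made up for illustration.

```python
import random


def mse(model, X, y):
    """Mean squared error of model(row) against the targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_mse_increase(model, X, y, column, rng):
    """Increase in MSE after shuffling one predictor column.

    X is a list of rows (each row a list); X itself is not modified.
    A large increase means the model relied on that predictor.
    """
    baseline = mse(model, X, y)
    shuffled = [row[column] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:column] + [v] + row[column + 1:]
              for row, v in zip(X, shuffled)]
    return mse(model, X_perm, y) - baseline


def rss(values):
    """Residual sum of squares around the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_rss_decrease(x, y, threshold):
    """Decrease in RSS from splitting one node at x <= threshold."""
    left = [t for v, t in zip(x, y) if v <= threshold]
    right = [t for v, t in zip(x, y) if v > threshold]
    return rss(y) - rss(left) - rss(right)


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data (assumption): y depends on column 0 only.
    X = [[float(i), float(i % 7)] for i in range(50)]
    y = [2.0 * row[0] for row in X]
    model = lambda row: 2.0 * row[0]  # stand-in for a fitted model
    print(permutation_mse_increase(model, X, y, 0, rng))
    print(permutation_mse_increase(model, X, y, 1, rng))
    print(split_rss_decrease([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0], 2))
```

Read this way, nobody "wants" to increase the MSE: the increase is what happens when a predictor is deliberately scrambled, so a larger value marks a more important variable. Likewise the second measure counts how much splits on a variable reduce the impurity, so larger again means more important.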
Liaw, Andy wrote:
I would have thought that the help page for importance() is an (the?) obvious
place to look...
If that description is not clear, please let me know which part isn't clear to
you.
Andy
From: Mareike Lies
I am trying to use the randomForest package to perform regression.
The variable importance estimates are given as: "%IncMSE"
and
"IncNodePurity"
Can anyone explain to me what these refer to and how they are calculated?
I found a lot of information on variable importance measures for
classification problems, but nothing on regression.
Thanks a lot.
Mareike
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.