Well, the explanation on the "importance" help page says that for regression the first column ("%IncMSE") is the mean decrease in accuracy and the second ("IncNodePurity") the mean decrease in MSE.
That does not make much sense to me.
I do not know what "%IncMSE" stands for. All right, "MSE" = "mean square error", but of what exactly,
since it is listed next to the variables used for prediction?
And what does "%Inc" refer to? Percent increase? But then, a percent increase in the mean square error?
Why would I want to increase the mean square error?
If, as I assume, "IncNodePurity" stands for increase in node impurity... why would I want to increase the node impurity? It would really help a lot to know exactly how these two values are calculated and what they stand for. Neither is clear to me.

Thanks
Mareike
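
A possible reading of "%IncMSE", sketched under assumptions: the importance() help page describes a permutation measure, in which the MSE on out-of-bag data is recorded, one predictor is shuffled, and the MSE is recorded again. The increase in MSE is how much worse the predictions get without that variable's information, so a larger value would mean a more important variable; nobody wants to increase the MSE, it is only measured. The Python sketch below illustrates the idea only. It is not the randomForest code: the data, the stand-in predict() function and the name permutation_increase are made up for illustration, and the real measure is, as far as the help page says, computed per tree, averaged over all trees, and normalized by the standard deviation of the differences.

```python
import random

def mse(predict, rows, targets):
    """Mean squared error of predict() over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_increase(predict, rows, targets, column, rng):
    """Increase in MSE after shuffling the values of one predictor column."""
    baseline = mse(predict, rows, targets)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    # Rebuild the rows with only the chosen column replaced by shuffled values.
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mse(predict, permuted, targets) - baseline

# Hypothetical data: y depends on x1 only, x2 is pure noise.
rng = random.Random(0)
rows = [[rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(200)]
targets = [3.0 * x1 + rng.gauss(0, 1) for x1, _ in rows]

def predict(row):
    """Stand-in for a fitted model (assumption: it has learned y = 3 * x1)."""
    return 3.0 * row[0]

inc_x1 = permutation_increase(predict, rows, targets, 0, rng)
inc_x2 = permutation_increase(predict, rows, targets, 1, rng)
print("increase in MSE after permuting x1:", inc_x1)
print("increase in MSE after permuting x2:", inc_x2)
```

Shuffling x1 destroys the relationship the model relies on, so the MSE goes up; shuffling x2 changes nothing, because the stand-in model never uses it.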



Liaw, Andy wrote:
I would have thought that the help page for importance() is an (the?) obvious 
place to look...

If that description is not clear, please let me know which part isn't clear to 
you.

Andy

From: Mareike Lies
I am trying to use the randomForest package to perform regression.
The variable importance estimates are given as "%IncMSE" and "IncNodePurity".
Can anyone explain to me what these refer to and how they are calculated?
I found a lot of information on variable importance measures for classification problems, but nothing on regression.

Thanks a lot.
Mareike
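
A note on "IncNodePurity", again only as a sketch under assumptions: the importance() help page describes it as the total decrease in node impurities from splitting on the variable, averaged over all trees, with impurity measured by the residual sum of squares (RSS) for regression. The Python snippet below shows that quantity for one split of one node; the toy data and the function names rss and rss_decrease are invented for illustration and are not part of the package.

```python
def rss(values):
    """Residual sum of squares around the node mean (0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def rss_decrease(x, y, threshold):
    """RSS of the parent node minus the RSS of its two children."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return rss(y) - (rss(left) + rss(right))

# Hypothetical node: the response jumps from 1 to 5 between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]

good_split = rss_decrease(x, y, 3)  # separates the two groups exactly
poor_split = rss_decrease(x, y, 1)  # leaves the groups mixed in one child
print("RSS decrease, split at x <= 3:", good_split)
print("RSS decrease, split at x <= 1:", poor_split)
```

Summing such decreases over every node that splits on a given variable, and averaging over the trees, would give a value of this kind: a variable whose splits make the child nodes much more homogeneous collects a large total.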

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
