[R] confusion matrix in randomForest

Miklos Kiss Sun, 20 Jul 2008 00:01:30 -0700

I have a question on the output generated by randomForest in classification
mode, specifically, the confusion matrix.  The confusion matrix lists the
various classes and how the forest classified each one, plus the
classification error.  Are these numbers essentially averages over all the
trees in the forest?  If so, is there a way I can get the standard deviation
values out of the randomForest, or do I have to evaluate each tree
individually?  By way of illustration, let me show the confusion matrix
using the iris data.  The output below shows that the forest correctly
classified 47 versicolor irises, but that is the result for the forest as
a whole.  I doubt that every individual tree also classifies exactly 47
versicolor irises correctly, and the same goes for the class.error value.
Not every tree will produce those exact values, right?
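
A minimal sketch of how the per-tree spread could be inspected, assuming the
randomForest package is installed.  As far as the package documentation
describes it, the printed confusion matrix comes from out-of-bag (OOB)
majority votes rather than from averaging per-tree matrices, so the sketch
refits the forest with keep.inbag = TRUE and scores each tree only on the
rows it did not see.  The seed and the names pred, oob and tree.acc are
illustrative choices, and the numbers will not match the output below
exactly.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
iris.rf <- randomForest(Species ~ ., data = iris,
                        keep.forest = TRUE, keep.inbag = TRUE)

# predict.all = TRUE also returns every tree's own prediction:
# pred$individual is an n x ntree matrix of predicted class labels.
pred <- predict(iris.rf, newdata = iris, predict.all = TRUE)

# inbag[i, k] is 0 when row i was out of bag for tree k.
oob <- iris.rf$inbag == 0

# For each tree: share of its OOB versicolor rows classified correctly.
is.vers <- iris$Species == "versicolor"
tree.acc <- sapply(seq_len(iris.rf$ntree), function(k) {
  rows <- is.vers & oob[, k]
  mean(pred$individual[rows, k] == "versicolor")
})

summary(tree.acc)            # spread across the individual trees
sd(tree.acc, na.rm = TRUE)   # standard deviation across trees
```

Each tree sees a different bootstrap sample, so the per-tree values vary
and are not expected to equal the forest-level figure.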


But this raises another question.  For this example, I used the entire data
set to generate the forest, so I assume that the confusion matrix is based
on OOB data.  If I instead created a training set and evaluated the trees
individually on a separate test set, I could presumably get averages and
standard deviations of the error rate.
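
A rough sketch of that train/test idea, assuming the randomForest package
and an arbitrary 100/50 split of iris.  The split size, the seed and the
names train.idx, tree.err and forest.err are illustrative assumptions, not
part of the original run.

```r
library(randomForest)

set.seed(42)                              # arbitrary seed for the split
train.idx <- sample(nrow(iris), 100)      # hypothetical 100/50 split
train <- iris[train.idx, ]
test  <- iris[-train.idx, ]

rf <- randomForest(Species ~ ., data = train, ntree = 500)

# Per-tree predictions on the held-out rows (n_test x ntree matrix).
pred <- predict(rf, newdata = test, predict.all = TRUE)

# Error rate of each tree on the test set; the truth vector is recycled
# down each column of the prediction matrix.
tree.err <- colMeans(pred$individual != as.character(test$Species))

# Error rate of the forest's majority vote, for comparison.
forest.err <- mean(as.character(pred$aggregate) != as.character(test$Species))

c(mean = mean(tree.err), sd = sd(tree.err), forest = forest.err)
```

The mean of the per-tree errors is generally not the same quantity as the
forest error, because the forest reports the majority vote of all trees.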

Any thoughts?  Thanks in advance.

-Miklos Z. Kiss

> print(iris.rf)
Call:
 randomForest(formula = Species ~ ., data = iris, importance = TRUE,
     keep.forest = TRUE)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2

        OOB estimate of  error rate: 5.33%
Confusion matrix:
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
versicolor      0         47         3        0.06
virginica       0          5        45        0.10
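
For reference, the class.error column and the OOB error rate can be
recomputed from the counts alone.  The sketch below retypes the printed
counts by hand (rows are the true classes, columns the predicted ones);
the names cm, class.error and oob.error are illustrative.

```r
# Counts copied from the printed confusion matrix above.
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(50,  0,  0,
                0, 47,  3,
                0,  5, 45),
             nrow = 3, byrow = TRUE,
             dimnames = list(actual = classes, predicted = classes))

# Per-class error: share of each true class that was misclassified.
class.error <- 1 - diag(cm) / rowSums(cm)

# Overall error: all off-diagonal counts over the total, 8 / 150.
oob.error <- 1 - sum(diag(cm)) / sum(cm)

round(class.error, 2)       # 0.00 0.06 0.10
round(100 * oob.error, 2)   # 5.33
```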
