Hi!

I'm using the Naive Bayes classifier provided by the e1071 package
(http://cran.r-project.org/web/packages/e1071) and I've noticed that the
predict function behaves differently when the factor levels of the columns
used for prediction differ from those of the columns used for fitting. From
inspecting predict.naiveBayes, I came to the conclusion that this is due
to the conversion of factors to their internal codes by the data.matrix
function. For example, consider the following piece of R code:

> library(mlbench)
> library(e1071)
> data(HouseVotes84)
> model <- naiveBayes(Class ~ ., data = HouseVotes84)
> head(HouseVotes84)
       Class   V1 V2 V3   V4   V5 V6 V7 V8 V9 V10  V11  V12 V13 V14 V15  V16
1 republican    n  y  n    y    y  y  n  n  n   y <NA>    y   y   y   n    y
2 republican    n  y  n    y    y  y  n  n  n   n    n    y   y   y   n <NA>
3   democrat <NA>  y  y <NA>    y  y  n  n  n   n    y    n   y   y   n    n
4   democrat    n  y  y    n <NA>  y  n  n  n   n    y    n   y   n   n    y
5   democrat    y  y  y    n    y  y  n  n  n   n    y <NA>   y   y   y    y
6   democrat    n  y  y    n    y  y  n  n  n   n    n    n   y   y   y    y
> predict(model, HouseVotes84[1,-1])
[1] republican
Levels: democrat republican
> new.data <- data.frame(V1="n", V2="y", V3="n", V4="y", V5="y", V6="y",
V7="n", V8="n", V9="n", V10="y", V11=NA_character_, V12="y", V13="y",
V14="y", V15="n", V16="y", stringsAsFactors=TRUE)
> predict(model, new.data)
[1] democrat
Levels: democrat republican
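
The suspected mechanism can be reproduced with base R alone. The following
is only a minimal sketch: the variable names (train.col, new.col and so on)
are made up for illustration and do not come from e1071. It shows that
data.matrix assigns a different integer code to the same value "y" depending
on which levels the factor happens to have.

```r
# Factor as it would appear in the training data: two levels, "n" and "y".
train.col <- factor(c("n", "y", "y"), levels = c("n", "y"))

# Factor as built from a one-row data frame: only the observed level exists.
new.col <- factor("y")

# data.matrix() replaces each factor by its internal integer codes.
train.codes <- data.matrix(data.frame(V = train.col))
new.codes   <- data.matrix(data.frame(V = new.col))

train.codes[2, "V"]   # "y" is the second level here, so its code is 2
new.codes[1, "V"]     # "y" is the only level here, so its code is 1
```

If the codes are then used to index the conditional probability tables of
the fitted model, a "y" in the new data would be looked up as if it were an
"n", which would be consistent with the differing predictions above.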

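Both calls use the same values (new.data reproduces the first row of
HouseVotes84), yet the predicted classes differ. I haven't used other
classification methods in R, so I'm unsure whether this is what is expected
from the predict function. Is this a bug or the expected behavior?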
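
A possible workaround, assuming the diagnosis above is right, is to rebuild
every factor column of the new data with the level set used for fitting.
The helper below is hypothetical (align.levels is not part of e1071 or base
R), and the small ref data frame only stands in for the training data so
that the sketch runs without mlbench.

```r
# Hypothetical helper (not part of e1071): give every factor column of
# `new` the same level set as the column of the same name in `ref`.
align.levels <- function(new, ref) {
  for (col in intersect(names(new), names(ref))) {
    if (is.factor(ref[[col]])) {
      new[[col]] <- factor(as.character(new[[col]]),
                           levels = levels(ref[[col]]))
    }
  }
  new
}

# Small stand-in for the training data: both columns have levels "n", "y".
ref <- data.frame(V1 = factor(c("n", "y")), V2 = factor(c("y", "n")))

# One-row data frame whose factors carry only the single level "y".
new <- data.frame(V1 = "y", V2 = "y", stringsAsFactors = TRUE)

before <- data.matrix(new)                     # codes from one-level factors
after  <- data.matrix(align.levels(new, ref))  # codes from training levels
```

With the data from the example, the call would presumably be
predict(model, align.levels(new.data, HouseVotes84[, -1])); whether that
is the intended usage of predict.naiveBayes is exactly the open question.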

Thanks!

--
Joao.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
