Hello,

I am working with the naive Bayes function in library(e1071). The function calls are:

    transactions.train.nb = naiveBayes(as.factor(DealerID) ~
        as.factor(Manufacturer) + as.factor(RangeDesc) +
        as.factor(BodyType) + as.factor(FuelType) +
        as.factor(PaintColour) + as.factor(TransmissionType) +
        as.factor(Mileage) + as.factor(Registration),
        data=transactions.train, na.action=na.omit)

where transactions.train is a data frame of 2032 rows by 14 columns, and

    transactions.test.nb = predict(transactions.train.nb,
        transactions.test[,-1], type='raw')

An example of the result is:

    View(transactions.test.nb)

Reduced results shown (columns are DealerID values):

           188      225      229      270      273
    1 0.000984 0.000492 0.000492 0.000492 0.001476
    2 0.000984 0.000492 0.000492 0.000492 0.001476
    3 0.000984 0.000492 0.000492 0.000492 0.001476
    4 0.000984 0.000492 0.000492 0.000492 0.001476
    5 0.000984 0.000492 0.000492 0.000492 0.001476

I am struggling to understand why the returned probabilities are identical in every row of each column, as I was expecting them to differ: a given DealerID should have a different probability in row 1 than in row 2. Each row does sum to 1.

transactions.train represents 67% of the full set of data.

I’ve tried introducing Laplace smoothing, and experimented with increasing and decreasing the number of variables used to generate the training naiveBayes object, but as of yet I can’t figure it out.

Could anybody help?

Kind regards,
Phil

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.