Hello,

 
I am working with the naïve Bayes function in library(e1071).


 
The function calls are:

transactions.train.nb = naiveBayes(as.factor(DealerID) ~
                                     as.factor(Manufacturer) +
                                     as.factor(RangeDesc) +
                                     as.factor(BodyType) +
                                     as.factor(FuelType) +
                                     as.factor(PaintColour) +
                                     as.factor(TransmissionType) +
                                     as.factor(Mileage) +
                                     as.factor(Registration),
                                   data=transactions.train,
                                   na.action=na.omit)


 
where transactions.train is a data frame with 2032 rows and 14 columns.


 
and


 
transactions.test.nb = predict(transactions.train.nb, transactions.test[,-1],
                               type='raw')


 
An example of the result:

View(transactions.test.nb)


 
Reduced results shown:

        188       225       229       270       273
1  0.000984  0.000492  0.000492  0.000492  0.001476
2  0.000984  0.000492  0.000492  0.000492  0.001476
3  0.000984  0.000492  0.000492  0.000492  0.001476
4  0.000984  0.000492  0.000492  0.000492  0.001476
5  0.000984  0.000492  0.000492  0.000492  0.001476


 
I am struggling to understand why the returned probabilities are the same in
every row of each column, as I was hoping for them to be different.

A given DealerID should have a different probability for row 1 than for row 2.
Each row does sum to 1.
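One observation that may help narrow this down: the values shown look like simple class frequencies out of 2032 training rows (1/2032 is about 0.000492, 2/2032 about 0.000984, 3/2032 about 0.001476). A possible check, sketched here on made-up data, is whether every predicted row just equals the prior for DealerID. The data frame `toy` and its values are hypothetical stand-ins, not the real transactions data, and the explanation hinted at in the comments is an assumption to be verified rather than a confirmed diagnosis.

```r
library(e1071)

# Hypothetical toy data standing in for transactions.train
toy <- data.frame(
  DealerID = c("188", "188", "225", "229", "270", "273", "273", "273"),
  FuelType = c("Petrol", "Diesel", "Petrol", "Diesel",
               "Petrol", "Diesel", "Petrol", "Diesel")
)

# Same pattern as in the call above: as.factor() inside the formula
fit <- naiveBayes(as.factor(DealerID) ~ as.factor(FuelType), data = toy)

# Names under which the conditional probability tables are stored;
# worth comparing with the column names of the data given to predict()
names(fit$tables)

pred <- predict(fit, toy[, -1, drop = FALSE], type = "raw")

# Class priors estimated directly from the training data
priors <- prop.table(table(toy$DealerID))

# Largest absolute difference between any predicted row and the priors;
# a value near zero would suggest the predictors are not being used
max(abs(sweep(pred, 2, as.numeric(priors))))
```

If the difference is essentially zero on the real data too, the model would appear to be ignoring the predictor columns at prediction time.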


 
transactions.train represents 67% of the full data set.

I've tried introducing Laplace smoothing, and experimented with increasing and
decreasing the number of predictors used to fit the naiveBayes object, but as
yet I can't figure it out. Could anybody help?
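One variant that might be worth trying, as a sketch only: convert the columns to factors in the data frame itself and use plain column names in the formula, so that the names stored in the fitted object also exist in the data passed to predict(). The data frame `toy` and its values below are hypothetical, and whether this changes the result on the real data is an assumption still to be tested.

```r
library(e1071)

# Hypothetical toy data; values are made up for illustration
toy <- data.frame(
  DealerID = c("188", "188", "225", "225", "273", "273"),
  FuelType = c("Petrol", "Petrol", "Diesel", "Diesel", "Petrol", "Diesel"),
  BodyType = c("Hatch", "Hatch", "Saloon", "Saloon", "Hatch", "Saloon")
)

# Convert every column to a factor in the data frame, not in the formula
toy[] <- lapply(toy, as.factor)

# Plain column names in the formula; laplace = 1 adds Laplace smoothing
fit <- naiveBayes(DealerID ~ FuelType + BodyType, data = toy, laplace = 1)

# Tables are now stored under the plain column names
names(fit$tables)

pred <- predict(fit, toy[, -1], type = "raw")

# Number of distinct predicted rows; more than one means the
# probabilities vary with the predictor values
nrow(unique(round(pred, 6)))
```

With this approach it would presumably also matter that the test data use the same factor levels as the training data.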


 
Kind regards,

Phil



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
