[R] Question regarding Naive Bayes
PHILIP GLADWIN
philipgladwin at btinternet.com
Wed Nov 30 17:59:46 CET 2016
Hello,
I am working with the naiveBayes function in library(e1071).
The function calls are:
transactions.train.nb = naiveBayes(as.factor(DealerID) ~
    as.factor(Manufacturer)
    + as.factor(RangeDesc)
    + as.factor(BodyType)
    + as.factor(FuelType)
    + as.factor(PaintColour)
    + as.factor(TransmissionType)
    + as.factor(Mileage)
    + as.factor(Registration),
    data = transactions.train,
    na.action = na.omit)
where transactions.train is a data frame with 2032 rows and 14 columns,
and
transactions.test.nb = predict(transactions.train.nb,transactions.test[,-1], type='raw')
An example of the result:
View(transactions.test.nb)
Reduced results shown:
       188      225      229      270      273
1 0.000984 0.000492 0.000492 0.000492 0.001476
2 0.000984 0.000492 0.000492 0.000492 0.001476
3 0.000984 0.000492 0.000492 0.000492 0.001476
4 0.000984 0.000492 0.000492 0.000492 0.001476
5 0.000984 0.000492 0.000492 0.000492 0.001476
I am struggling to understand why the returned probabilities are identical in every row of each column, as I was hoping they would differ from row to row.
A given DealerID should have a different probability in row 1 than in row 2. Each row does sum to 1.
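A quick arithmetic check on the figures above, sketched in base R: each reported value is very close to a whole multiple of 1/2032, where 2032 is the number of training rows. That would be consistent with the output being the class priors (the share of training rows per DealerID) rather than probabilities conditioned on the predictors. This is only an observation about the numbers, not a confirmed diagnosis; the variable names below are made up for the illustration.

```r
# Number of rows in transactions.train, as stated in the post
n_train <- 2032

# Probabilities copied from the first row of the reduced results
reported <- c("188" = 0.000984, "225" = 0.000492, "229" = 0.000492,
              "270" = 0.000492, "273" = 0.001476)

# Implied number of training rows per DealerID if the values are priors
implied_counts <- round(reported * n_train)
print(implied_counts)   # 2 1 1 1 3

# Gap between each reported value and count / n_train
# (small enough to be explained by rounding to six decimals)
print(abs(reported - implied_counts / n_train))
```

If the same comparison holds against prop.table(table(transactions.train$DealerID)) on the real data, the predictors are probably not contributing to the prediction at all.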
transactions.train represents 67% of the full set of data.
I've tried introducing Laplace smoothing, and experimented with increasing and decreasing the number of predictors used to fit the naiveBayes object,
but as yet I can't figure it out. Could anybody help?
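One variation that may be worth trying, sketched here on a tiny made-up data frame: store the columns as factors first and use bare column names in the formula. The assumption, which should be checked against the e1071 source, is that predict() matches the model's predictor names to the column names of newdata, so a predictor named "as.factor(Manufacturer)" would find no matching column. The toy data frame and its values are hypothetical and stand in for transactions.train.

```r
library(e1071)

# Toy stand-in for transactions.train (hypothetical values, not the real data)
toy <- data.frame(
  DealerID     = c("188", "188", "225", "225", "229", "229"),
  Manufacturer = c("Ford", "Ford", "BMW", "BMW", "Audi", "Audi"),
  FuelType     = c("Petrol", "Diesel", "Petrol", "Diesel", "Petrol", "Petrol"),
  stringsAsFactors = TRUE   # columns are factors before the model is fitted
)

# Bare column names in the formula, no as.factor() wrappers
fit <- naiveBayes(DealerID ~ Manufacturer + FuelType, data = toy, laplace = 1)

# Predictor names stored in the fitted model
print(names(fit$tables))

# Class priors, for comparison with the predicted probabilities
priors <- prop.table(table(toy$DealerID))
print(priors)

# Raw class probabilities for each row
pred <- predict(fit, toy[, -1], type = "raw")
print(pred)
```

If the rows of pred differ from one another and from priors, the same change (converting the columns of transactions.train and transactions.test to factors up front) may be worth applying to the real call.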
Kind regards,
Phil