[R] classification

array chip Thu, 07 Apr 2011 11:13:56 -0700

Dear all, this is not a pure R question; it is really about how to set up a 
multinomial logistic regression model for multi-class classification. I 
would really appreciate it if any of you could share your thoughts and 
recommendations.


Let's say we have a 3-class classification problem with classes A, B and C. I 
have a certain number of samples, and for each sample I have 3 variables (Xa, 
Xb and Xc). The trick here is that these 3 variables measure how likely the 
sample is to belong to class A, B and C respectively, i.e., Xa for class A, Xb 
for class B and Xc for class C. For a given sample i, we can simply make a 
rough prediction based on the values of Xa, Xb and Xc. For example:

 for sample 1, Xa=10, Xb=50, Xc=15, so I would most likely predict sample 1 
as class "B".
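
The rough rule described here (pick the class whose variable is largest) could 
be sketched in R as follows. This is only an illustration: the object names 
(scores, classes, rough_pred) are made up, the first row uses the sample 1 
values, and the second row is just another example row.

```r
# Example scores: one row per sample, one column per class-specific variable
scores <- data.frame(
  Xa = c(10, 8),
  Xb = c(50, 4),
  Xc = c(15, 6),
  row.names = c("sample 1", "sample 2")
)

# Class labels, in the same order as the score columns (an assumption)
classes <- c("A", "B", "C")

# For each row, take the class whose score is the largest;
# ties are resolved in favour of the first column
rough_pred <- classes[max.col(as.matrix(scores), ties.method = "first")]
names(rough_pred) <- rownames(scores)
rough_pred
```

Note that this rule only makes sense if Xa, Xb and Xc are on a comparable 
scale, which is assumed here.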

Then I have another set of variables, Ya, Yb and Yc, that do the same kind of thing.

I can construct a dataset as below:
           Xa   Xb    Xc     Ya   Yb   Yc  class
sample 1   10   50    15    0.2  0.8  0.1   B
sample 2   8    4     6     0.7  0.5  0.3   A
:
:


and then fit a model: fit<-multinom(class~Xa+Xb+Xc+Ya+Yb+Yc)
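
For reference, a self-contained version of that fit might look like the sketch 
below. It assumes multinom() from the nnet package. The data are simulated 
purely for illustration (they are not the real data), and the helper 
make_score and the sample size are made up.

```r
library(nnet)  # provides multinom()

set.seed(1)
n <- 150
truth <- factor(sample(c("A", "B", "C"), n, replace = TRUE))

# Simulated scores (an assumption, not real data): each score is higher on
# average when the sample belongs to the matching class
make_score <- function(cls, scale) scale * ((truth == cls) + rnorm(n, sd = 0.8))

dat <- data.frame(
  Xa = make_score("A", 20), Xb = make_score("B", 20), Xc = make_score("C", 20),
  Ya = make_score("A", 1),  Yb = make_score("B", 1),  Yc = make_score("C", 1),
  class = truth
)

# Same formula as above, with the data frame passed explicitly
fit <- multinom(class ~ Xa + Xb + Xc + Ya + Yb + Yc, data = dat, trace = FALSE)

# Fitted class probabilities and predicted classes for the first few rows
head(predict(fit, type = "probs"))
head(predict(fit, type = "class"))
```

Passing data = dat keeps the formula variables tied to one data frame, and 
trace = FALSE only silences the optimizer's iteration log.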

But my understanding is that this model does not work by simply looking at 
each row of the data and picking the class that has the best Xs and/or Ys. In 
leave-one-out cross-validation, it sometimes picks a class that is clearly not 
the winner when I compare across the Xs and Ys.
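
One way to see how often the two approaches disagree is to run the 
leave-one-out loop explicitly and cross-tabulate the results. The sketch below 
does this on simulated data (again an assumption, not the real data), using 
only the X variables to keep it short. The names sim, rowwise and loo are made 
up. Some disagreement is to be expected, since multinom() estimates a separate 
intercept and weight for each variable from the training rows rather than 
comparing Xa, Xb and Xc directly within a row.

```r
library(nnet)  # provides multinom()

set.seed(2)
n <- 60
truth <- factor(sample(c("A", "B", "C"), n, replace = TRUE))

# Simulated noisy scores (an assumption): higher on average for the true class
sim <- function(cls) 10 * (truth == cls) + rnorm(n, sd = 8)
dat <- data.frame(Xa = sim("A"), Xb = sim("B"), Xc = sim("C"), class = truth)

# Row-wise rule: pick the class with the largest X in each row
rowwise <- c("A", "B", "C")[
  max.col(as.matrix(dat[, c("Xa", "Xb", "Xc")]), ties.method = "first")
]

# Leave-one-out: refit without row i, then predict the class of row i
loo <- character(n)
for (i in seq_len(n)) {
  fit_i <- multinom(class ~ Xa + Xb + Xc, data = dat[-i, ], trace = FALSE)
  loo[i] <- as.character(
    predict(fit_i, newdata = dat[i, , drop = FALSE], type = "class")
  )
}

# Cross-tabulate the two sets of predictions and report the disagreement rate
table(rowwise = rowwise, multinom_loo = loo)
mean(rowwise != loo)
```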

I would greatly appreciate it if anyone could suggest a more sensible way to 
construct the data and/or a different way of thinking about the problem 
altogether.

John

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
