Hi,
On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote:
> Hi,
> The answers to my previous question about nominal variables have led
> me to a more important question.
> What is the "best practice" way to feed nominal variables to an SVM?
> For example:
> color = ("red", "blue", "green")
> I could translate that into an index so I wind up with
> color = (1,2,3)
> But my concern is that the SVM will now treat the values as
> numeric, on an ordered range, rather than as discrete conditions.
> Another thought would be to create 3 binary variables from the
> single color variable, so I have:
> red = (0,1)
> blue = (0,1)
> green = (0,1)
> An example fed to the SVM would have one positive and two negative
> values to indicate the color value,
> i.e. for a blue example:
> red = 0, blue = 1, green = 0
Do it this way: one 0/1 indicator column per color.
So if the features for your examples were color and height,
your "feature matrix" for N examples would be N x 4
(columns: red, blue, green, height):
0,1,0,15 # blue object, height 15
1,0,0,10 # red object, height 10
0,0,1,5 # green object, height 5
...
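If it helps, here is a minimal sketch of building such a matrix in base R with model.matrix(). The data frame name `examples` and the column names `color` and `height` are made up for illustration; they just mirror the three rows above. This is one way to do it, not the only one.

```r
# Toy data mirroring the three rows above. The names `examples`, `color`,
# and `height` are illustrative assumptions, not from any real data set.
examples <- data.frame(
  color  = factor(c("blue", "red", "green"),
                  levels = c("red", "blue", "green")),
  height = c(15, 10, 5)
)

# "- 1" drops the intercept, so every level of `color` gets its own
# 0/1 indicator column instead of one level being absorbed as a baseline.
X <- model.matrix(~ color + height - 1, data = examples)
print(X)
```

The resulting columns are named colorred, colorblue, colorgreen, and height. With the (1,2,3) coding, red and green sit 2 apart while red and blue sit 1 apart; with indicator columns, any two different colors differ in exactly two columns, so no ordering is implied.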
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help