Hi,

Thanks for the continued support.

I've been working on this all night, and have learned some things:

1) Since I'm really committed to using an SVM, I need to skip the examples with missing data. My training set has approximately 22,000 examples, of which about 500 have missing values, so skipping them does not cost much.

2) I believe the heart of my problem is the behavior of the scale function. If I pass scale a single value, or a vector of values that are all 0, it returns NaN (presumably because the standard deviation is 0, so the division becomes 0/0). I am scaling data by group, and some groups have all 0 in some columns. So even though I start with "clean" data containing no NA values, I wind up with some after the scale operations. I just posted a separate message asking for help on this.

3) R is forcing me to look at details of the experiment that were never considered in RapidMiner (RM). In fact, I'm quite suspicious as to how RM is handling these issues since they are hidden within the "black box" of their GUI.

4) The learning curve is steep, but worth it!!
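
To make points 1 and 2 concrete, here is a minimal sketch on toy data. The data frame `df`, its columns `g`, `x` and `y`, and the helper `safe_scale` are all made up for illustration and are not my real data or code. Mapping a constant column to 0 is only one possible workaround, not necessarily the right one for every experiment.

```r
# Hypothetical toy data: `g` is the grouping column, `x` and `y` are features.
df <- data.frame(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, 2, 3, 0, 0, 0),      # all zeros within group "b"
  y = c(10, NA, 30, 5, 6, 7)    # one missing value
)

# Point 1: drop rows with any missing value before training.
clean <- df[complete.cases(df), ]

# Point 2: a constant column has zero spread, so scale() ends up
# computing 0/0 and returns NaN.
zeros_scaled <- as.vector(scale(c(0, 0, 0)))

# Hypothetical helper: standardize as usual, but map a constant
# (or single-value) column to 0 instead of NaN.
safe_scale <- function(v) {
  s <- sd(v)
  if (is.na(s) || s == 0) return(rep(0, length(v)))
  (v - mean(v)) / s
}

# Scale each feature within its group; ave() keeps the row order.
for (col in c("x", "y")) {
  clean[[col]] <- ave(clean[[col]], clean$g, FUN = safe_scale)
}
```

After this, `clean` should contain no NA or NaN values, and the all-zero column in group "b" stays at 0.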

If there were an R class in Los Angeles, I'd sign up right away...

Thanks again for all the help.

-N

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.