Re: [R] SVM probability output variation

Steve Lianoglou Wed, 21 Oct 2009 11:56:33 -0700

Howdy,

On Oct 21, 2009, at 1:05 PM, Anders Carlsson wrote:
<snip>

Yes, exactly that. In your example, though, the variation seems to be a lot smaller. I'm guessing that has to do with the data.
If I instead output the decision values, the whole procedure is fully reproducible, i.e. the exact same values are returned when I retrain the model.


By the decision values, you mean the predicted labels, right?

I have no idea how the probabilities are calculated, but it seems to be in this step that the differences arise. In my case, I feel a bit hesitant to use them when they differ that much between runs (15% or so)...

I'd find that a bit disconcerting, too. Can you give a sample of your data + the code you're using that can reproduce this example?


Warning: Brainstorming Below

If I were to calculate probabilities for my class labels, I'd make the probability some function of the example's distance from the decision boundary.

Now, if your decision boundary isn't changing from run to run (and I guess it really shouldn't be, since the SVM returns the maximum margin classifier (which is, by definition, unique, right?)), it's hard to imagine why these probabilities would change, either ...

... unless you're holding out different subsets of your data during training, or perhaps have a different value for your penalty (cost) parameter when building the model. I believe you said that you're actually training the same exact model each time, though, right?


Anyway, I see the help page for ?svm says this, if it helps:

"The probability model for classification fits a logistic distribution using maximum likelihood to the decision values of all binary classifiers, and computes the a-posteriori class probabilities for the multi-class problem using quadratic optimization"
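A rough sketch of what "fits a logistic distribution using maximum likelihood to the decision values" could look like for a single binary classifier, written in plain Python with synthetic data. Everything here (function names, the gradient-descent fit, the 80% subsets) is an assumption for illustration, not the e1071 or libsvm code. It also shows one way the probabilities could vary while the decision values stay fixed: if the logistic fit is done on randomly chosen subsets or folds of the data (an assumption worth checking against the actual implementation), the fitted parameters differ from run to run.

```python
import math
import random

def fit_sigmoid(decision_values, labels, steps=2000, lr=0.5):
    """Fit P(y=1 | f) = 1 / (1 + exp(a*f + b)) by gradient descent
    on the negative log-likelihood. labels are 0 or 1."""
    a, b = 0.0, 0.0
    n = len(decision_values)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for f, t in zip(decision_values, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # With z = a*f + b, d(NLL)/dz = t - p.
            grad_a += (t - p) * f
            grad_b += (t - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def predict(f, a, b):
    """Probability of the positive class for decision value f."""
    return 1.0 / (1.0 + math.exp(a * f + b))

# Synthetic, fixed "decision values": positives centred at +1, negatives at -1.
rng = random.Random(0)
labels = [1] * 100 + [0] * 100
values = [rng.gauss(1.0 if t == 1 else -1.0, 1.0) for t in labels]

def fit_on_random_subset(seed, frac=0.8):
    """Fit the sigmoid on a random subset; the seed stands in for one 'run'."""
    idx = random.Random(seed).sample(range(len(values)), int(frac * len(values)))
    return fit_sigmoid([values[i] for i in idx], [labels[i] for i in idx])

a1, b1 = fit_on_random_subset(seed=1)
a2, b2 = fit_on_random_subset(seed=2)

# Same decision value, two slightly different probability estimates.
p1 = predict(0.5, a1, b1)
p2 = predict(0.5, a2, b2)
print(p1, p2)
```

If something like this is what happens internally, the decision values would be fully reproducible while the probabilities would wobble with the random split, which matches the symptom described above.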


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
