Hi Jack,

Regarding 1) and 2): they are telling you the same thing. I recommend you read the first sections of the article; it is very well written and clear. There you will read about duality.
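In short, the problem the solver actually optimizes is the dual, which in the standard soft-margin form (this is textbook material, covered e.g. in the Bennett and Campbell paper cited below, not anything specific to e1071) reads

\[
\max_{\alpha}\; \sum_i \alpha_i \;-\; \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, x_i^{\top} x_j
\qquad \text{s.t.} \qquad 0 \le \alpha_i \le C, \quad \sum_i \alpha_i y_i = 0 .
\]

The support vectors are exactly the training points with \( \alpha_i > 0 \). Since each \( \alpha_i \) is capped at C, roughly speaking a small C forces the weight to be spread across many points (many SVs), while a large C lets a few points carry it (fewer SVs).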
To 3): I interpret the scatter plots like this: "Increasing the value of C (...) forces the creation of a more accurate model", and a more accurate model is built by adding more SVs (until we get a convex hull of the data).
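If it helps to see the effect numerically rather than in the plots, something along these lines should print the SV count for each cost (an untested sketch using the iris data that ships with R, not your breast-cancer data):

library(e1071)

## two-class subset of iris (versicolor vs. virginica), so a linear
## C-classification SVM applies
d <- subset(iris, Species != "setosa")
d$Species <- factor(d$Species)  # drop the unused factor level

## fit one model per cost and report how many support vectors it keeps
for (cost in 10^(-3:3)) {
  m <- svm(Species ~ Petal.Length + Petal.Width, data = d,
           type = "C-classification", kernel = "linear", cost = cost)
  cat("cost:", cost, " #SV:", nrow(m$SV), "\n")
}

You should see the count shrink as the cost grows, just as in the model1/model2/model3 output below.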
Hope it helps.

Regards,
Pau

2010/7/14 Jack Luo <jluo.rh...@gmail.com>

> Pau,
>
> Thanks a lot for your email, I found it very helpful. Please see below
> for my reply, thanks.
>
> -Jack
>
> On Wed, Jul 14, 2010 at 10:36 AM, Pau Carrio Gaspar
> <paucar...@gmail.com> wrote:
>
>> Hello Jack,
>>
>> 1) Why did you think that "larger C is more prone to overfitting than
>> smaller C"?
>
> *There is a statement at the link http://www.dtreg.com/svm.htm:
>
> "To allow some flexibility in separating the categories, SVM models have
> a cost parameter, C, that controls the trade off between allowing training
> errors and forcing rigid margins. It creates a soft margin that permits
> some misclassifications. Increasing the value of C increases the cost of
> misclassifying points and forces the creation of a more accurate model
> that may not generalize well."
>
> My understanding is that this means larger C may not generalize well
> (i.e. is prone to overfitting).*
>
>> 2) If you look at the formulation of the quadratic programming problem,
>> you will see that C rules the error of the "cutting plane" (and the
>> overfitting). Therefore, for high C you allow the "cutting plane" to cut
>> the set worse, so the SVM needs fewer points to build it. A proper
>> explanation is in Kristin P. Bennett and Colin Campbell, "Support Vector
>> Machines: Hype or Hallelujah?", SIGKDD Explorations 2(2), 2000, 1-13.
>> http://www.idi.ntnu.no/emner/it3704/lectures/papers/Bennett_2000_Support.pdf
>
> *Could you be more specific about this? I don't quite understand.*
>
>> 3) You might find these plots useful:
>>
>> library(e1071)
>>
>> ## 20 points in two classes: columns 1-2 are the coordinates,
>> ## column 3 is the class label
>> m1 <- matrix(c(
>>   0, 0, 0, 1, 1, 2, 1, 2, 3, 2, 3, 3, 0, 1, 2, 3, 0, 1, 2, 3,
>>   1, 2, 3, 2, 3, 3, 0, 0, 0, 1, 1, 2, 4, 4, 4, 4, 0, 1, 2, 3,
>>   1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1
>> ), ncol = 3)
>>
>> Y <- m1[, 3]
>> X <- m1[, 1:2]
>> df <- data.frame(X, Y)
>>
>> ## one panel per cost: green/blue mark the two classes, red marks the
>> ## support vectors of the fitted model
>> par(mfcol = c(4, 2))
>> for (cost in c(1e-3, 1e-2, 1e-1, 1e0, 1e+1, 1e+2, 1e+3)) {
>>   model.svm <- svm(Y ~ ., data = df, type = "C-classification",
>>                    kernel = "linear", cost = cost, scale = FALSE)
>>   plot(x = 0, ylim = c(0, 5), xlim = c(0, 3),
>>        main = paste("cost:", cost, "#SV:", nrow(model.svm$SV)))
>>   points(m1[m1[, 3] > 0, 1], m1[m1[, 3] > 0, 2], pch = 3, col = "green")
>>   points(m1[m1[, 3] < 0, 1], m1[m1[, 3] < 0, 2], pch = 4, col = "blue")
>>   points(model.svm$SV[, 1], model.svm$SV[, 2], pch = 18, col = "red")
>> }
>
> *Thanks a lot for the code, I really appreciate it. I've run it, but I am
> not sure how I should interpret the scatter plots, although it is obvious
> that the number of SVs decreases as the cost increases.*
>
>> Regards,
>> Pau
>>
>> 2010/7/14 Jack Luo <jluo.rh...@gmail.com>
>>
>>> Hi,
>>>
>>> I have a question about the parameter C (cost) in the svm function in
>>> e1071. I thought larger C was more prone to overfitting than smaller C,
>>> and hence would lead to more support vectors. However, using the
>>> Wisconsin breast cancer example at the link
>>> http://planatscher.net/svmtut/svmtut.html
>>> I found that the largest cost has the fewest support vectors, which is
>>> contrary to what I expected. Please see the scripts below. Am I
>>> misunderstanding something here?
>>>
>>> Thanks a bunch,
>>>
>>> -Jack
>>>
>>> > model1 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 0.01)
>>> > model2 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 1)
>>> > model3 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 100)
>>> > model1
>>>
>>> Call:
>>> svm.default(x = databctrain, y = classesbctrain, kernel = "linear",
>>>     cost = 0.01)
>>>
>>> Parameters:
>>>    SVM-Type:  C-classification
>>>  SVM-Kernel:  linear
>>>        cost:  0.01
>>>       gamma:  0.1111111
>>>
>>> Number of Support Vectors:  99
>>>
>>> > model2
>>>
>>> Call:
>>> svm.default(x = databctrain, y = classesbctrain, kernel = "linear",
>>>     cost = 1)
>>>
>>> Parameters:
>>>    SVM-Type:  C-classification
>>>  SVM-Kernel:  linear
>>>        cost:  1
>>>       gamma:  0.1111111
>>>
>>> Number of Support Vectors:  46
>>>
>>> > model3
>>>
>>> Call:
>>> svm.default(x = databctrain, y = classesbctrain, kernel = "linear",
>>>     cost = 100)
>>>
>>> Parameters:
>>>    SVM-Type:  C-classification
>>>  SVM-Kernel:  linear
>>>        cost:  100
>>>       gamma:  0.1111111
>>>
>>> Number of Support Vectors:  44