Hi,

I have a question about the parameter C (cost) in the svm function in e1071. I thought a larger C is more prone to overfitting than a smaller C, and hence leads to more support vectors. However, using the Wisconsin breast cancer example at http://planatscher.net/svmtut/svmtut.html I found that the largest cost has the fewest support vectors, which is contrary to what I expected. Please see the script below. Am I misunderstanding something here?
Thanks a bunch,
-Jack

> model1 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 0.01)
> model2 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 1)
> model3 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 100)
> model1

Call:
svm.default(x = databctrain, y = classesbctrain, kernel = "linear", cost = 0.01)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  0.01
      gamma:  0.1111111

Number of Support Vectors:  99

> model2

Call:
svm.default(x = databctrain, y = classesbctrain, kernel = "linear", cost = 1)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  1
      gamma:  0.1111111

Number of Support Vectors:  46

> model3

Call:
svm.default(x = databctrain, y = classesbctrain, kernel = "linear", cost = 100)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  100
      gamma:  0.1111111

Number of Support Vectors:  44

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
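
In case it helps anyone rerun the comparison above without the tutorial files, here is a minimal, self-contained sketch. It is only an illustration under stated assumptions: it needs the e1071 package, and it substitutes two classes of the built-in iris data for databctrain/classesbctrain, so the counts it prints will not match the ones in this post. The names two_class, costs and n_sv are made up for the example.

```r
# Sketch only: iris stands in for the tutorial's breast cancer data (assumption).
library(e1071)

# Keep two of the three species so the problem is binary, like the original.
two_class <- droplevels(iris[iris$Species != "setosa", ])
x <- as.matrix(two_class[, 1:4])
y <- two_class$Species

# Same cost values as in the post.
costs <- c(0.01, 1, 100)

# Fit one linear C-classification SVM per cost and record the total number
# of support vectors, stored in the tot.nSV component of the fitted object.
n_sv <- sapply(costs, function(cost) {
  fit <- svm(x, y, kernel = "linear", cost = cost)
  fit$tot.nSV
})

print(data.frame(cost = costs, support_vectors = n_sv))
```

Replacing x and y with databctrain and classesbctrain should give the same kind of table for the breast cancer data, assuming those objects are loaded as in the tutorial.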