Why exactly do you want to "stabilize" your results? If it's in preparation for publication/classroom demo/etc., certainly resetting the seed before each run (and hence getting the same sample() output) will make your results exactly reproducible. However, if you are looking for a clearer picture of the true efficacy of your svm and there's no real underlying order to the data set (i.e., not a time series), then a straight sample() seems better to me.
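To make the trade-off concrete, here is a minimal base-R sketch of the two options. (`myData` here is just a toy stand-in for your data, and the actual svm fit is replaced by a placeholder so the snippet runs on its own without e1071 -- you would substitute your tune.svm()/svm() calls where indicated.)

```r
## Toy stand-in for myData (90 rows, a numeric column and a factor to classify)
myData <- data.frame(x = rnorm(90), Factor = factor(rep(c("a", "b"), 45)))

## Option 1: reproducibility -- resetting the seed gives the identical split,
## and hence identical svm results, on every run
set.seed(23)
testindex1 <- sample(1:nrow(myData), trunc(nrow(myData) / 3))
set.seed(23)
testindex2 <- sample(1:nrow(myData), trunc(nrow(myData) / 3))
identical(testindex1, testindex2)  # TRUE

## Option 2: repeated plain sample() runs -- estimate how much performance
## varies across splits, which tells you about the svm itself
accs <- replicate(100, {
  testindex <- sample(1:nrow(myData), trunc(nrow(myData) / 3))
  ## Here you would fit the svm on myData[-testindex, ] and return its
  ## accuracy on myData[testindex, ]. As a placeholder that runs without
  ## e1071, return a split-dependent quantity instead:
  mean(myData$Factor[testindex] == "a")
})
c(mean = mean(accs), sd = sd(accs))
```

A large sd in option 2 is the "widely varying performance" you describe, and looking at which observations drive it gets at the informative-points question below.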
I'm not particularly well read on the svm literature, but it sounds like you are worried by the widely varying performance of the svm itself. If that's the case, it seems (to me at least) that certain data points are strongly informative, and it might be a more interesting question to look into which ones those are.

I guess my answer, as a total non-savant in the field, is that it depends on your goal: repeated runs with sample() will give you more information about the strength of the svm, while setting the seed will give you reproducibility. Importance sampling might be of interest, particularly if it could be tied to the information content of each data point, and a quick skim of the MC variance reduction literature might provide some fun insights. I'm not entirely sure how you mean to bootstrap the act of setting the seed (a randomly set seed seems to be the same as not setting a seed at all), but that might give you a nice middle ground.

Sorry this can't be of more help,

Michael

On Mon, Sep 26, 2011 at 6:32 PM, Riccardo G-Mail <ric.rom...@gmail.com> wrote:

> Hi, I'm working with support vector machines for classification, and I
> have a problem with the accuracy of prediction.
>
> I divided my data set into train (1/3 of the entire data set) and test
> (2/3 of the data set) using the "sample" function. Each time I fit the
> svm model I obtain a different result, depending on the output of the
> "sample" function. I would like to "stabilize" the performance of my
> analysis. To do this I used the "set.seed" function. Is there a better
> way to do this? Should I perform a bootstrap on my work-flow (sample
> and svm)?
>
> Here is an example of my workflow:
>
> ### not to run
> index <- 1:nrow(myData)
> set.seed(23)
> testindex <- sample(index, trunc(length(index)/3))
> testset <- myData[testindex, ]
> trainset <- myData[-testindex, ]
>
> tune.svm()
> svm.model <- svm(Factor ~ ., data = myData, cost = from tune.svm,
>                  gamma = from tune.svm, cross = 10, subset = testset)
> summary(svm.model)
> predict(svm.model, testset)
>
> Best
> Riccardo
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.