Re: [R] Random Forest & Cross Validation

2011-02-27 Thread ronzhao
Thanks to you all! Now I got it!

Re: [R] Random Forest & Cross Validation

2011-02-24 Thread Liaw, Andy
> If you want to get honest estimates of accuracy, you should repeat the feature selection within the resampling (not the test set). You will get different lists each time, but that's the point. Right now you are not capturing that uncertainty, which is why the OOB and test set results differ so much.

Re: [R] Random Forest & Cross Validation

2011-02-22 Thread mxkuhn
If you want to get honest estimates of accuracy, you should repeat the feature selection within the resampling (not the test set). You will get different lists each time, but that's the point. Right now you are not capturing that uncertainty, which is why the OOB and test set results differ so much.
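Max's suggestion can be sketched in R roughly as follows. This is only an illustration, assuming the randomForest package; the predictor matrix `x`, response `y`, fold count, and the "top 10 by importance" cutoff are all hypothetical, not from the thread. The key point is that the feature selection happens inside each fold, on that fold's training portion only:

```r
library(randomForest)

## hypothetical data standing in for the GWAS predictors and outcome
set.seed(1)
n <- 200; p <- 50
x <- data.frame(matrix(rnorm(n * p), nrow = n))
y <- factor(sample(c("A", "B"), n, replace = TRUE))

folds <- sample(rep(1:5, length.out = n))   # 5-fold CV assignment
err <- numeric(5)
for (f in 1:5) {
  tr <- folds != f
  ## feature selection is redone inside each fold,
  ## using only that fold's training portion
  fit0 <- randomForest(x[tr, ], y[tr], ntree = 200)
  top  <- order(importance(fit0)[, 1], decreasing = TRUE)[1:10]
  ## refit on the selected features, then score the held-out fold
  fit  <- randomForest(x[tr, top, drop = FALSE], y[tr], ntree = 200)
  err[f] <- mean(predict(fit, x[!tr, top, drop = FALSE]) != y[!tr])
}
mean(err)  # CV estimate that includes the selection's variability
```

Because the selected list can change from fold to fold, the averaged error reflects the uncertainty of the selection step itself, which is exactly what a single selection on the full training set fails to capture.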

Re: [R] Random Forest & Cross Validation

2011-02-22 Thread ronzhao
Thanks, Max. Yes, I did some feature selection in the training set. Basically, I selected the top 1000 SNPs based on OOB error and grew the forest using the training set, then used the test set to validate the forest. But if I did the same thing in the test set, the top SNPs would be different from those in the training set.
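For reference, the "top SNPs on the training set" step being described looks roughly like the sketch below, assuming the randomForest package; the object names (`snps`, `status`) and the top-10 cutoff are hypothetical stand-ins for the top-1000 selection in the thread:

```r
library(randomForest)

## hypothetical SNP matrix (genotypes coded 0/1/2) and case status
set.seed(1)
n <- 100
snps   <- data.frame(matrix(sample(0:2, n * 30, replace = TRUE), nrow = n))
status <- factor(sample(c("case", "control"), n, replace = TRUE))

## rank SNPs by variable importance from a forest grown on the training set
fit <- randomForest(snps, status, ntree = 500)
imp <- importance(fit)                      # mean decrease in Gini by default
top <- order(imp[, 1], decreasing = TRUE)[1:10]
snps_top <- snps[, top, drop = FALSE]       # reduced predictor set
```

Done once on the full training set, this is exactly the selection Max is warning about: the ranking is itself a modeling step, so it has to sit inside the resampling for the error estimate to be honest.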

Re: [R] Random Forest & Cross Validation

2011-02-20 Thread Max Kuhn
> I am using the randomForest package to do some prediction on GWAS data. I first split the data into training and testing sets (70% vs 30%), then used the training set to grow the trees (ntree=10). It looks like the OOB error in the training set is good (<10%). However, it is not very good for the testing set.

[R] Random Forest & Cross Validation

2011-02-19 Thread ronzhao
Hi, I am using the randomForest package to do some prediction on GWAS data. I first split the data into training and testing sets (70% vs 30%), then used the training set to grow the trees (ntree=10). It looks like the OOB error in the training set is good (<10%). However, it is not very good for the testing set.
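The setup described above can be sketched as follows, assuming the randomForest package; the data (`x`, `y`) are hypothetical, and `ntree` is raised from 10 to a more typical 500, since 10 trees usually gives an unstable forest and a noisy OOB estimate:

```r
library(randomForest)

## hypothetical SNP-style predictors and a binary outcome
set.seed(1)
n <- 200
x <- data.frame(matrix(sample(0:2, n * 50, replace = TRUE), nrow = n))
y <- factor(sample(c("case", "control"), n, replace = TRUE))

## 70/30 train/test split
train <- sample(n, floor(0.7 * n))
fit <- randomForest(x[train, ], y[train], ntree = 500)

## OOB error on the training data vs. error on the held-out 30%
oob_err  <- fit$err.rate[fit$ntree, "OOB"]
test_err <- mean(predict(fit, x[-train, ]) != y[-train])
c(oob = oob_err, test = test_err)
```

When a feature-selection step has already been run on the training data, as in the rest of the thread, the OOB error from this fit is no longer an honest estimate, and a gap between `oob_err` and `test_err` is exactly the symptom discussed above.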