Thanks to you all!
Now I got it!
Subject: Re: [R] Random Forest & Cross Validation
If you want to get honest estimates of accuracy, you should repeat the feature
selection within the resampling (not the test set). You will get different
lists each time, but that's the point. Right now you are not capturing that
uncertainty, which is why the OOB and test-set results differ so much.
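A minimal sketch of that idea with randomForest, assuming a SNP matrix x and a
factor outcome y (hypothetical objects, not from the original post); the fold
count, ntree, and the 1000-SNP cutoff are illustrative choices:

library(randomForest)

set.seed(1)
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(x)))
cv_err <- numeric(k)

for (i in seq_len(k)) {
  tr <- folds != i
  ## Redo the feature selection inside each fold: rank SNPs by importance
  ## from a forest grown on the training part of the fold only.
  rf_rank <- randomForest(x[tr, ], y[tr], ntree = 500, importance = TRUE)
  top <- order(importance(rf_rank, type = 1), decreasing = TRUE)[1:1000]
  ## Refit on the selected SNPs and score the held-out fold.
  rf_fit <- randomForest(x[tr, top], y[tr], ntree = 500)
  cv_err[i] <- mean(predict(rf_fit, x[!tr, top]) != y[!tr])
}
mean(cv_err)  # selection is repeated in every fold, so its uncertainty is included

Because the ranking is recomputed in every fold, the averaged error reflects the
variability of the SNP lists as well as of the forests.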
Thanks, Max.
Yes, I did some feature selection in the training set. Basically, I selected
the top 1000 SNPs based on OOB error and grew the forest using the training
set, then used the test set to validate the forest.
But if I do the same thing in the test set, the top SNPs would be different
than those selected in the training set.
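For comparison, the workflow described above (selection done once on the
training set, validation on the held-out 30%) might look roughly like this;
importance is used here as a stand-in for ranking SNPs by OOB error, and x and
y are the same hypothetical objects as in the sketch above:

library(randomForest)

set.seed(1)
in_train <- sample(nrow(x), size = round(0.7 * nrow(x)))

## Feature selection done once, on the training set only.
rf_all <- randomForest(x[in_train, ], y[in_train], ntree = 500, importance = TRUE)
top1000 <- order(importance(rf_all, type = 1), decreasing = TRUE)[1:1000]

## Grow the forest on the selected SNPs, then validate on the 30% test set.
rf_top <- randomForest(x[in_train, top1000], y[in_train], ntree = 500)
mean(predict(rf_top, x[-in_train, top1000]) != y[-in_train])  # test-set error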
Hi,
I am using the randomForest package to do some prediction on GWAS data. I
first split the data into training and test sets (70% vs 30%), then used the
training set to grow the trees (ntree=10). It looks like the OOB error in the
training set is good (<10%). However, it is not very good for the test set.
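A bare-bones version of that setup (70/30 split, OOB error on the training data
versus error on the held-out 30%), again with hypothetical x and y, and
ntree = 10 as in the post:

library(randomForest)

set.seed(1)
in_train <- sample(nrow(x), size = round(0.7 * nrow(x)))

rf <- randomForest(x[in_train, ], y[in_train], ntree = 10)
oob_err  <- rf$err.rate[rf$ntree, "OOB"]                       # OOB estimate from training
test_err <- mean(predict(rf, x[-in_train, ]) != y[-in_train])  # error on the 30% test set
c(oob = oob_err, test = test_err)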