Dear Researchers,
I am using random forest (RF) regression to analyze several metrics extracted
from images. I am tuning the RF in a loop over ranges of mtry, nodesize and
ntree (called "tree" below), and selecting the combination with the lowest
out-of-bag (OOB) MSE:
mtry from 1 to 5
nodesize from 1 to 10
tree from 1 to 500
I am using this paper as a reference:
Palmer, D. S., O'Boyle, N. M., Glen, R. C., & Mitchell, J. B. O. (2007).
Random Forest Models To Predict Aqueous Solubility. Journal of Chemical
Information and Modeling, 47, 150-158.
My problem is the following. Using data(airquality), the tuning parameters
with the lowest RMSE are:
> print(result.mtry.df[result.mtry.df$RMSE == min(result.mtry.df$RMSE),])
RMSE = 15.44751
MSE = 238.6257
mtry = 3
nodesize = 5
tree = 35
The number of trees is very low compared with what I have read in several
publications. The second-lowest value comes from a parameter set with tree = 1:
> print(head(result.mtry.df[with(result.mtry.df, order(MSE)), ]))
          RMSE      MSE mtry nodesize tree
12035 15.44751 238.6257    3        5   35
18001 15.44861 238.6595    4        7    1
7018  16.02354 256.7539    2        5   18
20031 16.02536 256.8121    5        1   31
11037 16.02862 256.9165    3        3   37
11612 16.05162 257.6544    3        4  112
I am wondering whether my setup is wrong or whether there are aspects I have
not considered.
Thanks for your attention, and thanks in advance for any suggestions and help.
Gianni
library(randomForest)
data(airquality)
set.seed(131)
# drop rows with missing values
MyOzone <- data.frame(na.omit(airquality))
str(MyOzone)
# tuning grid
My.mtry <- 1:5
My.nodesize <- 1:10
My.tree <- 1:500
# fit one forest per (mtry, nodesize, ntree) combination and record its OOB error
result.mtry <- list()
for (i in seq_along(My.mtry)) {
  result.nodesize <- list()
  for (m in seq_along(My.nodesize)) {
    result.tree <- list()
    for (l in seq_along(My.tree)) {
      # column 1 (Ozone) is the response; the remaining columns are predictors
      ozone.rf <- randomForest(MyOzone[, -1], MyOzone[, 1],
                               mtry = My.mtry[i],
                               nodesize = My.nodesize[m],
                               ntree = My.tree[l], importance = TRUE)
      # the OOB MSE of the full forest is the last element of ozone.rf$mse
      result.tree[[l]] <- data.frame(
        RMSE = sqrt(ozone.rf$mse[ozone.rf$ntree]),
        MSE = ozone.rf$mse[ozone.rf$ntree],
        mtry = My.mtry[i],
        nodesize = My.nodesize[m],
        tree = My.tree[l])
    }
    result.nodesize[[m]] <- do.call(rbind, result.tree)
  }
  result.mtry[[i]] <- do.call(rbind, result.nodesize)
}
result.mtry.df <- do.call(rbind, result.mtry)
print(result.mtry.df[result.mtry.df$RMSE == min(result.mtry.df$RMSE),])
#           RMSE      MSE mtry nodesize tree
# 12035 15.44751 238.6257    3        5   35
head(result.mtry.df[with(result.mtry.df, order(RMSE)), ])
#           RMSE      MSE mtry nodesize tree
# 12035 15.44751 238.6257    3        5   35
# 18001 15.44861 238.6595    4        7    1
# 7018  16.02354 256.7539    2        5   18
# 20031 16.02536 256.8121    5        1   31
# 11037 16.02862 256.9165    3        3   37
# 11612 16.05162 257.6544    3        4  112
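A possible cross-check of the tree-count result, as a minimal sketch only: for a regression forest, randomForest stores the OOB MSE after each added tree in the mse component, so one fit with ntree = 500 gives the error at every forest size from 1 to 500. The values mtry = 3 and nodesize = 5 are taken from the best row above. The names rf and oob are introduced here for illustration. The numbers will not match the loop above exactly, because each forest there was grown from a different random state.

```r
library(randomForest)

data(airquality)
MyOzone <- na.omit(airquality)

set.seed(131)
# One forest with the largest tree count. rf$mse[k] is the OOB MSE
# of the first k trees, so the whole curve comes from a single fit.
rf <- randomForest(MyOzone[, -1], MyOzone[, 1],
                   mtry = 3, nodesize = 5, ntree = 500)

# Hypothetical helper table: OOB error as a function of forest size
oob <- data.frame(tree = seq_len(rf$ntree),
                  MSE  = rf$mse,
                  RMSE = sqrt(rf$mse))

# OOB error at a few forest sizes, including tree = 35 from the table above
print(oob[c(1, 35, 100, 500), ])
```

Comparing the rows for small and large forest sizes, or rerunning with a different seed, gives a rough idea of how much of the difference between tree = 35 and tree = 500 is random variation in the OOB estimate.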