
Re: [R] Random Forest weighting

2008-12-05 Thread Raghu Naik

Andy, thanks for your email. I understand that by default, sampsize uses the response we are classifying (the behavior variable) as the strata variable. I could then set sampsize=c(no=89, yes=11). I implemented that, but I got a 99% classification error rate on the yes class. When I oversam

Re: [R] Random Forest weighting

2008-12-04 Thread Liaw, Andy

If I understand your situation correctly, you may be able to make use of the "strata" and "sampsize" arguments in randomForest() to get bootstrap samples that resemble the original data distribution. They allow you to specify stratified samples using the "strata" variable. Best, Andy From: Ragh
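A minimal sketch of the stratified sampling described above, for illustration only. The data frame `dat`, its predictors `x1` and `x2`, and the 890/110 class split are made up here and are not from the original thread; only `strata`, `sampsize`, and the `sampsize=c(no=89, yes=11)` setting come from the discussion.

```r
library(randomForest)

set.seed(42)

# Hypothetical stand-in data: 890 "no" and 110 "yes" records (not the poster's data).
behavior <- factor(rep(c("no", "yes"), times = c(890, 110)),
                   levels = c("no", "yes"))
dat <- data.frame(
  x1 = rnorm(1000, mean = ifelse(behavior == "yes", 1, 0)),  # weak signal
  x2 = rnorm(1000),                                          # pure noise
  behavior = behavior
)

# Each tree is grown on a sample stratified by `strata`, drawing 89 "no"
# and 11 "yes" cases. The sampsize entries are taken in the order of the
# factor levels of `strata`; the names are only there for readability.
fit <- randomForest(
  behavior ~ x1 + x2,
  data     = dat,
  strata   = dat$behavior,
  sampsize = c(no = 89, yes = 11),
  ntree    = 200
)

# Out-of-bag confusion matrix, with the per-class error in the last column.
print(fit$confusion)
```

Because these per-tree samples keep the original class proportions, a high error rate on the rare class would not be surprising. The `class.error` column makes it easy to compare against other `sampsize` settings.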

[R] Random Forest weighting

2008-12-03 Thread Raghu Naik

Folks, I have a question about weighting in Random Forest (RF). I know that several earlier emails to this list have raised this issue, but I did not find an answer to my question. I am working on a dataset (dataset1) of 4 million records that can be reduced to a dataset (dataset2) of a