Hi all,

I've been using the randomForest package on a dataset (described below), and
my problem is this: even though I specify proximity=TRUE in the call, I get a
NULL proximity matrix. Any thoughts on why that might happen?

Unfortunately I can't post my dataset, which is particularly problematic
here since I believe that's where the problem is. So I'll try to give as
detailed an account as I can.

The outcome is binary and highly skewed, with the positive outcome being 1.5%
of the data.
The dataset has ~7000 observations and 200 predictors. These are either
2-level factors or continuous variables. The data are extremely sparse.

Here is my call:

# I pass a balanced sample to each tree, to deal with the skewed outcome.
rf <- randomForest(y ~ ., data = train, ntree = 800, replace = TRUE,
                   sampsize = c(112, 112), proximilty = TRUE)
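
Since the real data can't be shared, here is a minimal reproducible sketch of the same kind of call. It makes several assumptions: the built-in iris data, reduced to two classes, stands in for the real dataset, and ntree and sampsize are shrunk to fit it. It compares the argument name exactly as typed in the call above (proximilty) with the documented spelling (proximity). randomForest() accepts extra arguments through `...`, so a misspelled argument name may be ignored without an error.

```r
library(randomForest)
set.seed(1)

# Stand-in data (an assumption): iris reduced to a two-class outcome
d <- droplevels(iris[iris$Species != "setosa", ])

# Argument name spelled as in the call above
rf_typo <- randomForest(Species ~ ., data = d, ntree = 50,
                        sampsize = c(20, 20), proximilty = TRUE)
is.null(rf_typo$proximity)

# Argument name spelled as documented
rf_ok <- randomForest(Species ~ ., data = d, ntree = 50,
                      sampsize = c(20, 20), proximity = TRUE)
dim(rf_ok$proximity)
```

If the first check is TRUE and the second gives an n-by-n matrix, the NULL result comes from the argument name rather than from the data.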



Any ideas on why I'm getting a NULL proximity matrix, or how to fix it?

Thanks!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.