Thanks Peter.
Indeed by setting a seed the two results are similar.
I am self-studying and wanted to make sure I understood the concept of
OOB samples and how reliable the performance metrics calculated on them are.
It seems I did get it. That's good :)
On 4/11/21 6:34 AM, Peter Langfel
I think the only thing you are doing wrong is not setting the random
seed (set.seed()) so your results are not reproducible. Depending on
the random sample used to select the training and test sets, you get
slightly varying accuracy for both, sometimes one is better and
sometimes the other.
HTH,
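For reference, here is a small self-contained sketch of the same point (built-in iris data rather than the original poster's data): with a fixed seed the OOB accuracy and the accuracy on a held-out 30% land close together, and without set.seed() both numbers move a little from run to run.

library(randomForest)
set.seed(123)                      # fixes both the 70/30 split and the forest's random sampling
idx   <- sample(nrow(iris), 0.7 * nrow(iris))
train <- iris[idx, ]
test  <- iris[-idx, ]
rf <- randomForest(Species ~ ., data = train)
oob_acc  <- 1 - rf$err.rate[rf$ntree, "OOB"]          # OOB accuracy, computed on the training data
test_acc <- mean(predict(rf, test) == test$Species)   # accuracy on the held-out 30%
c(oob = oob_acc, test = test_acc)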
Hi ML,
For random forest, I thought that the out-of-bag performance should be
the same as (or at least very similar to) the performance calculated on a
separate test set.
But this does not seem to be the case.
In the following code, the accuracy computed on the out-of-bag samples is
77.81%, while
Hi everyone. I'm using a random forest in R to successfully perform a
classification on a dichotomous DV in a dataset that has 29 IVs of
type double and approximately 285,000 records. I ran my model on a
70/30 train/test split of the original dataset.
I'm trying to use the rfUtilities packa
Dear R user,
I tried randomForest and had an issue with the prediction. The training and test
data come from two separate csv files. However, I end up with all NA for the prediction.
Y is the continuous variable; the X variables are factors, continuous variables, and
variables with a numeric 0/1 indication for Y/N (not covert
Hi all,
I have built a Random Forest using the caret package; however, I don't understand
how the splits are labeled in the trees. My dataset contains the frequencies of the
words in the speeches of the people:
'data.frame': 499 obs. of 608 variables:
$ alright : num 1 0 0 0 0 0 0 1 2 1 ...
$ bad : n
This is explained in the "Details" section of the help page for partialPlot.
Best
Andy
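(Paraphrasing the Details section from memory, so do double-check ?partialPlot: for classification the y axis is not a probability but a centered log-vote scale, roughly f(x) = log p_k(x) - (1/K) * sum_j log p_j(x), where p_j is the fraction of votes for class j and k is the class being plotted. Values near 0 mean no pull either way, positive values favor the plotted class, and the scale is not restricted to [-1, 1]; the -1 and 1 in the linked plot are just the range that particular data produced.)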
Hi,
To evaluate the partial influence of a factor with a random forest whose
response is OK/NOK, I'm using partialPlot, with the x axis being the factor axis and
the y axis between -1 and 1. What do this -1 and 1 mean?
An example:
https://www.dropbox.com/s/4b92lqxi3592r0d/Captura.JPG?dl=0
Thanks
I would suggest that you post instead on stats.stackexchange.com.
This forum is mostly about R programming issues, not statistics
(admittedly, the intersection is nonempty, but ...) That stackexchange
forum is more about statistics.
You might also consider a bioconductor forum, as this appears t
Hi all,
I'd like to use random forest regression to say something about the
importance of a set of genes (binary) for schizophrenia-related behavior
(continuous measure). I am still reading up on this technique, but would
already really appreciate any feedback on whether my approach is valid.
So...
On Jun 22, 2015, at 10:46 AM, synapse 123 wrote:
I'm not sure why that should be a problem for RandomSurvivalForests. It's not
Hi
I wanted to know if I can use Random Forest in R for time to event data. I
cannot use Random Survival Forest since my data is not censored. Any
suggestions.
Thanks
Azi
Can you post your memory profile and code?
Dear All,
I am a bit concerned about the memory consumption of randomForest in
caret.
This seems to be due to the fact that the option keep.forest=FALSE does
not work in caret.
Does anybody know a workaround for that?
Many thanks
Lorenzo
Hello R Experts,
I want to make sure I understand how the strata, sampsize and replace
parameters work so I can confidently perform downsampling on a dataset I'm
working with.
My main question is when the documentation talks about how each of these
parameters (strata, sampsize, replace) works
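While you wait for an answer, here is how I read those three arguments, as a sketch on made-up data rather than your dataset (so treat it as my understanding, not gospel):

library(randomForest)
set.seed(1)
# toy imbalanced data: 900 "no" vs 100 "yes"
d <- data.frame(y  = factor(rep(c("no", "yes"), c(900, 100))),
                x1 = rnorm(1000), x2 = rnorm(1000))
rf <- randomForest(y ~ ., data = d,
                   strata   = d$y,                      # stratify each tree's sample by the class label
                   sampsize = c(no = 100, yes = 100),   # per-tree draw: 100 from each stratum (downsampling "no")
                   replace  = TRUE)                     # sample within each stratum with replacement
rf$confusion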
Thanks for the reply... Actually you answered my question. I just wanted to know
how people use it...
Hi,
This is not an R question, so really not appropriate for the list.
The answer depends on what "worth it" means to you.
There are many applications:
http://scholar.google.com/scholar?hl=en&q=%22random+forest%22&btnG=&as_sdt=1%2C39&as_sdtp=
Sarah
On Fri, Jun 20, 2014 at 10:12 AM, Li, Yan wr
Hi All,
Is anyone using random forest for predicting? Some people claimed that it will
give a more accurate result than a decision tree. But considering it builds 500 (by
default) full trees, is it worth using random forest to predict instead of a
decision tree? What are typical applications of this algo
Hi all,
I've been using the randomForest package on a dataset (described later) and
my problem is: even though I specify proximity= TRUE in the call I get a
NULL proximity matrix. Any thoughts on why that may happen?
Unfortunately I can't post my dataset, which is particularly problematic
here si
Dear All,
I am a bit puzzled.
I am developing a random forest model.
The data is large and it involves hundreds of predictors, but the code I
have written is relatively simple.
After training my random forest model, I apply it on some new data set to
carry out some prediction, as you can see be
Your question doesn't seem to be specifically related to either R or random
forest; instead, it is about how to assign weights to training
observations.
On Sun, Mar 24, 2013 at 6:43 AM, Lorenzo Isella wrote:
Dear All,
I am using randomForest to predict the final selling price of some items.
As it often happens, I have a lot of (noisy) historical data, but the
question is not so much about data cleaning.
The dataset for which I need to carry out some predictions consists of fairly
recent sales or even some
Andrew,
That did the trick.
Thank you.
Dan
After you subset the data, did you redeclare the factor? If not then R
still thinks it has the potential for all those levels.
TRAINSET$JOBTITLE <- factor(TRAINSET$JOBTITLE)
I hope this helps
Andrew
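A small addition in case it is useful: droplevels() (in base R since 2.12, if I remember right) does the same thing for every factor column at once, so you do not have to re-factor each one by hand:

TRAINSET <- droplevels(TRAINSET)   # drops unused levels in all factor columns
str(TRAINSET$JOBTITLE)             # should now show only the 4 levels that actually occur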
On Tuesday, January 15, 2013, Lopez, Dan wrote:
> Hi,
>
> Can someone please offer me some guid
Hi,
Can someone please offer me some guidance?
I imported some data. One of the columns, called "JOBTITLE", was imported as a
factor column with 416 levels.
I subset the data in such a way that only 4 levels have data in "JOBTITLE" and
tried running randomForest but it complained a
Hello,
Regarding imbalanced data: when using sampsize correction to balance
imbalanced data in Random Forests, what are the implications of the
algorithm no longer using a bootstrapped sample? For instance, if I set
sampsize to 25, 25 for a binary response, in a dataset of N=800, how does
randomf
Sorry, the previous post was not right.
I want to know the difference between the following two methods of random forest.
1. epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree =
300,xtest = NULL, ytest = NULL,replace = T, proximity =F)
2. epiG.rf <-randomForest(x = data, y = data$gam
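For what it is worth, the two interfaces are meant to fit the same model; the difference is only how the response and predictors are handed over (and the formula method applies na.action). One thing to watch in your second call: x = data still contains gamma itself, so the response would also be used as a predictor. Sketch, using the names from your post:

# formula interface: response on the left, every other column as a predictor
epiG.rf <- randomForest(gamma ~ ., data = data, ntree = 300, na.action = na.fail)

# x/y interface: the response column has to be removed from x by hand
epiG.rf <- randomForest(x = subset(data, select = -gamma), y = data$gamma, ntree = 300)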
Hi all,
Can someone tell me the difference between the following two formulas?
1. epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree =
300,xtest = NULL, ytest = NULL,replace = T, proximity =F)
2.epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree =
300,xtest = NU
Dear all,
I have the following data set.
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 alpha beta
1111 111 111 11111alpha beta1
2122 122 12 2 12212alpha beta1
3133 133 13 3 13 313alpha beta1
41
set type="votes" and norm.votes=FALSE, you will get the counts
instead of proportions.
Best,
Andy
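For the training data the fitted object already carries the OOB votes, so (sketched on iris rather than your data) that looks like:

library(randomForest)
set.seed(1)
iris.rf <- randomForest(Species ~ ., data = iris, norm.votes = FALSE)
head(iris.rf$votes)      # OOB vote counts per class for each training row
iris.rf$predicted[1:5]   # the OOB-based classification of the training rows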
Hello,
I have two Random Forest (RF) related questions.
1. How do I view the classifications for the detail data of my training
data (aka trainset) that I used to build the model? I know there is an object
called predicted which I believe is a vector. To view the detail for my testset
I
Does anyone know if there are any special considerations with Random Forest and
correlated fields or rather derived fields?
For example if we are trying to predict who might leave our company to go work
for another company some of the variables we may look at are below (in addition
to others).
Have you tried checking the memory limit?
You may want to check
memory.limit()
Although in most cases you can extend the limit to 4000.
Also, as David mentioned, try to run only R and force-stop the others.
Best Regards,
Bhupendrasinh Thakre
Sent from my iPhone
On Aug 30, 2012, at 10:02 AM, David Win
Hi all,
I am trying out random forest in the party package but am getting an error
saying "cannot allocate vector of size 564". What could be the problem? The
code is below:
>data.controls <- cforest_unbiased(ntree=1000, mtry=3)
> data.cforest <- cforest(class ~x1+x2+x3, data = Score,
> contro
I'm using the partial plot function in the Random Forest package
(randomForest).
I want to be able to control the region of values it chooses for the x.var
variable (instead of going from 0 to 10, I want to go from 0 to 1000).
The problem I'm having is that it seems the only method to do that
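One workaround I have used (sketched on airquality, since I do not have your data): partialPlot() picks its evaluation points from the range of x.var in pred.data, so you can steer the region by steering pred.data, and the x/y values it returns (invisibly) can be re-plotted with whatever xlim you like:

library(randomForest)
aq <- na.omit(airquality)
rf <- randomForest(Ozone ~ ., data = aq)
sub <- subset(aq, Wind >= 0 & Wind <= 15)    # rows covering the region of interest
pp  <- partialPlot(rf, pred.data = sub, x.var = "Wind", plot = FALSE)
plot(pp$x, pp$y, type = "l", xlim = c(0, 15),
     xlab = "Wind", ylab = "partial dependence on Wind")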
Hello,
I am aware of the fact that the combine() function in the Random Forest package
of R is meant to combine forests built from the same training set, but is there
any way to combine trees built on different training sets? Both the training
datasets used contain the same variables and classe
You should be able to use the Rgui menu to install packages.
Andy
Hi,
I have installed R version 2.14 on Windows 7. I want to use the
randomForest package. I installed Rtools and MiKTeX 2.9, but I am not
able to read the description file and it is also not possible to build the
package. When I give the command R CMD INSTALL --build
randomForest on Windows, it shows
Bill, thanks so much. I left off the as.matrix and it worked! I really
appreciate the help.
On Dec 15, 2011, at 2:39 PM, Lost in R wrote:
After checking the original data in Excel for blanks and running
summary(cm3)
to identify any null values in my data, I'm unable to identify any
instances.
Yet when I attempted to use the data in Random Forest, I get the
following
error. Is ther
The data set I attached was just those 10 lines. It was only meant to show
any possible obvious mistake I may have made. The real set has the 4498 lines
of data.
On Dec 16, 2011, at 12:20 PM, Lost in R wrote:
I've also attached here a sample of my data in Excel. I'm thinking it
It? What is "it"?
must be
a problem with a character, but can't figure it out. Is there a list
somewhere of characters to avoid in R?
Thanks,
Mike
What exactly is your problem with this file? The file that you sent
had 10 lines of what appeared to be data and 4489 lines with just
commas which would read in as NAs. When you do an 'str' you get:
> str(x)
'data.frame': 4498 obs. of 195 variables:
$ Good_Bad : Factor w/ 3
I've also attached here a sample of my data in Excel. I'm thinking it must be
a problem with a character, but can't figure it out. Is there a list
somewhere of characters to avoid in R?
Thanks,
Mike
http://r.789695.n4.nabble.com/file/n4205479/Sample_Data_Set.csv
Sample_Data_Set.csv
Thanks Michael - that was a help. I got rid of the "," in my numbers and the
"%" which were making many of the numeric variables FACTORS. It appears that
I made all of those revisions, but I am still getting the same error.
Attached is the str() output; if anyone could shed some light it would be
mu
Use str() on your object and attach the result. For even faster help, use
dput() on a *small* sample of your data to make the problem reproducible.
My guess is that there are characters or, less likely, factors lurking about...
Michael
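e.g. something along these lines usually flags the offending columns quickly (cm3 being the data frame from the earlier post):

sapply(cm3, class)            # any "character" or unexpected "factor" columns?
colSums(is.na(cm3))           # NA count per column (blank cells often come in as NA)
which(!complete.cases(cm3))   # rows that contain at least one NA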
On Dec 15, 2011, at 2:39 PM, Lost in R
wrote:
> After c
After checking the original data in Excel for blanks and running summary(cm3)
to identify any null values in my data, I'm unable to identify any instances.
Yet when I attempt to use the data in Random Forest, I get the following
error. Is there something that Random Forest is reading as null which
Hi All,
I want to do Random Forest classification. I installed R and the randomForest
classifier package for R,
but I don't know how to use it.
Is there any open-source remote sensing application which does RF
classification on satellite images?
Does anyone have a random forest classification example?
Any languag
I'm analyzing data using Random Forest Regression. For some of the
species I am analyzing, the percent variation explained is negative.
Could you please explain to me what that means? If you need more
information, please let me know. Thank you.
Sincerely,
Rachel Unger
Thanks to you all!
Now I got it!
art adding steps
such as feature selections, all bets are off.
Andy
If you want to get honest estimates of accuracy, you should repeat the feature
selection within the resampling (not the test set). You will get different
lists each time, but that's the point. Right now you are not capturing that
uncertainty which is why the oob and test set results differ so mu
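In caret that idea is what rfe() automates: the ranking/selection is redone inside every resample, so the resampled performance already pays for the selection step. A rough self-contained sketch on simulated data (your SNP matrix and phenotype would go in place of x and y):

library(caret)
set.seed(1)
x <- data.frame(matrix(rnorm(200 * 20), nrow = 200))                # stand-in for the SNP matrix
y <- factor(ifelse(x$X1 + rnorm(200) > 0, "case", "control"))       # stand-in phenotype
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)  # RF-based ranking, refit per fold
fit  <- rfe(x, y, sizes = c(2, 5, 10), rfeControl = ctrl)           # subset sizes to evaluate
fit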
Thanks, Max.
Yes, I did some feature selection in the training set. Basically, I
selected the top 1000 SNPs based on OOB error and grew the forest using the
training set, then used the test set to validate the forest grown.
But if I do the same thing in the test set, the top SNPs would be different th
Hi,
I am using randomForest package to do some prediction job on GWAS data. I
first split the data into training and testing sets (70% vs 30%), then
used the training set to grow the trees (ntree=10). It looks like the OOB
error in the training set is good (<10%). However, it is not very good for th
orever", which nececitate the need to find the
optimal number of iterations. You don't need that with RF.
Thanks Max and Andy. If the Random Forest is always giving an AUC of 1, isn't
it overfitting? If not, how do you differentiate this from
overfitting? I believe random forests are claimed to never overfit (from the
following link).
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home
Ravishankar,
> I used Random Forest with a couple of data sets I had to predict for binary
> response. In all the cases, the AUC of the training set is coming to be 1.
> Is this always the case with random forests? Can someone please clarify
> this?
This is pretty typical for this model.
> I hav
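i.e. the resubstitution AUC of 1 is expected and not very informative; an honest number comes from the OOB votes (or a held-out set). A sketch with pure-noise data and the pROC package:

library(randomForest)
library(pROC)
set.seed(1)
d  <- data.frame(y = factor(rep(0:1, each = 250)), matrix(rnorm(500 * 10), nrow = 500))
rf <- randomForest(y ~ ., data = d)
auc(roc(d$y, predict(rf, d, type = "prob")[, "1"]))   # resubstitution AUC: typically near 1
auc(roc(d$y, rf$votes[, "1"]))                        # OOB AUC: near 0.5 here, as it should be for noise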
Guys,
I used Random Forest with a couple of data sets I had to predict for binary
response. In all the cases, the AUC of the training set is coming to be 1.
Is this always the case with random forests? Can someone please clarify
this?
I have given a simple example, first using logistic regressi
Max,
Thanks. Yes, what you said is exactly what I am looking for, i.e. the first tree
fits using data from sites A & B, then predicts on C (and so on).
Does that mean that if I:
1. pass this list as index into trainControl
> tmpSiteList
[[1]]
[1] 1 2 3 4 5 6 7
[[2]]
[1] 1 2 3 8 9 10
[[3]]
[1] 4 5
The index indicates which samples should go into the training set.
However, you are using out of bag sampling, so it would use the whole
training set and return the OOB error (instead of the error estimates
that would be produced by resampling via the index).
Which do you want? OOB estimates or ot
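If what you want is "fit on sites A+B, assess on C" (and so on), then the index/indexOut route is roughly the following; this is a sketch on made-up data, with site, x1, x2 and y standing in for your own columns:

library(caret)
set.seed(1)
dat <- data.frame(site = rep(c("A", "B", "C"), each = 30),
                  x1 = rnorm(90), x2 = rnorm(90),
                  y  = factor(sample(c("OK", "NOK"), 90, replace = TRUE)))
# index = rows used for TRAINING in each resample; indexOut = rows held out for assessment
trainIdx <- lapply(unique(dat$site), function(s) which(dat$site != s))
testIdx  <- lapply(unique(dat$site), function(s) which(dat$site == s))
names(trainIdx) <- names(testIdx) <- paste0("holdout_", unique(dat$site))
ctrl <- trainControl(method = "cv", index = trainIdx, indexOut = testIdx)
fit  <- train(y ~ x1 + x2, data = dat, method = "rf", trControl = ctrl)
fit$resample   # one performance row per held-out site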
Thanks for all the help.
I had tried using the "index" in caret to try to dictate which rows of the
sample would be used to build each tree in the RF (e.g. use all data
from sites A and B for training, hold out all data from site C for testing, etc.).
However, after running, when I cross-checked
Hi all,
I had struggled to get "strata" in randomForest to work on this.
Can I get randomForest, for each of its trees, to take ALL samples from some
strata to build the tree, while leaving some strata TOTALLY untouched as OOB?
e.g. below, how can I tell RF to,
- for tree 1 in the forest, to use
Thank you very much for this suggestion, I was not aware of this package.
Apart from this, is suggestion 2 (changing nodesize attribute) a good way to
go? Experimenting with sampsize (suggestion 4) has yielded promising
results.
Kind regards,
Andreas Béguin
2010/5/24 Gabor Grothendieck
> You co
You could also try the Boruta package for variable selection.
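(For anyone finding this in the archives, a minimal call looks roughly like this, with iris standing in for the real data:)

library(Boruta)
set.seed(1)
b <- Boruta(Species ~ ., data = iris)   # repeated RF importance runs; flags each variable as confirmed/tentative/rejected
print(b)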
Dear R-help list members,
I have a statistical question regarding the Random Forest function (RF) as
applied to ecological prediction of species presences and absences.
RF seems to perform very well for prediction of species ranges or
prevalences. However, the problem with my dataset is a high de
Thanks for providing the code that allows me to reproduce the problem.
It looks like the prediction routine for some reason returns "0" as
prediction for some trees, thus causing the problem observed. I'll look
into it.
Andy
Hi,
Thank you for your replies.
As for the prediction length, I run this code:
"
library(arules)
data(AdultUCI)
AdultUCI$workclass<-factor(AdultUCI$workclass, levels =
c(levels(AdultUCI$workclass), "UNKNOWN"))
AdultUCI$workclass[is.na(AdultUCI$workclass)]<-"UNKNOWN"
AdultUCI$occupation<-factor(Adu
From: Dror
>
> Hi,
> I'm working with randomForest package and i have 2 questions:
> 1. how can i drop a specific tree from the forest?
Answered in another post.
> 2. i'm trying to get the voting of each tree in a prediction
> datum using the
> folowing code
> > pr<-predict(RF,NewData,type="
From: Dror
>
> Hi,
> I need help with the randomForest prediction. i run the folowing code:
>
> > iris.rf <- randomForest(Species ~ ., data=iris,
> > importance=TRUE,keep.forest=TRUE, proximity=TRUE)
> > pr<-predict(iris.rf,iris,predict.all=T)
> > iris.rf$votes[53,]
> setosa versicolor virg
Hi,
I need help with the randomForest prediction. I run the following code:
> iris.rf <- randomForest(Species ~ ., data=iris,
> importance=TRUE,keep.forest=TRUE, proximity=TRUE)
> pr<-predict(iris.rf,iris,predict.all=T)
> iris.rf$votes[53,]
setosa versicolor virginica
0.000 0.8074866
Hi,
I'm working with the randomForest package and I have 2 questions:
1. How can I drop a specific tree from the forest?
2. I'm trying to get the vote of each tree for a prediction datum using the
following code
> pr<-predict(RF,NewData,type="prob",predict.all=TRUE)
my forest has 300 trees and I ge
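For question 2, with predict.all=TRUE the per-tree predictions come back in the $individual component (one column per tree), so the votes for a single observation can be tabulated like this (iris used as a stand-in):

library(randomForest)
set.seed(1)
RF <- randomForest(Species ~ ., data = iris, ntree = 300)
pr <- predict(RF, iris, predict.all = TRUE)
dim(pr$individual)          # n observations x 300 trees
table(pr$individual[1, ])   # how the 300 trees voted for observation 1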
From: Dror
>
> Hi,
> i'm using randomForest package and i have 2 questions:
> 1. Can i drop one tree from an RF object?
Yes.
> 2. i have a 300 trees forest, but when i use the predict
> function on new
> data (with predict.all=TRUE) i get only 270 votes. did i do
> something wrong?
Try to fol
Hi,
I'm using the randomForest package and I have 2 questions:
1. Can I drop one tree from an RF object?
2. I have a 300-tree forest, but when I use the predict function on new
data (with predict.all=TRUE) I get only 270 votes. Did I do something wrong?
Thanks
> these plots from different predictor variables, but not the absolute
> range. Hope that helps.
>
> Andy
>
Hi everybody,
I used random forest regression to explain the patterns of species
richness and a bunch of climate variables (e.g. Temperature,
precipitation, etc.) All are continuos variables. My results are
really interesting and my model explained 96,7% of the variance.
Now I am trying to take
Hi,
Are you looking for variable selection? If that is the case then you
can use LASSO, elastic net, or sparse PLS regression methods, which encourage
variable selection. PCA does not select variables, as you get all your
variables in the PCs. You could use sparse PCA instead.
Regards
Alex
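For example, with the glmnet package (just a sketch on simulated data; your genes/predictors would replace x):

library(glmnet)
set.seed(1)
x <- matrix(rnorm(100 * 40), nrow = 100)   # 40 candidate variables
y <- x[, 1] - 2 * x[, 2] + rnorm(100)      # only the first two actually matter
cvfit <- cv.glmnet(x, y, alpha = 1)        # alpha = 1 is the LASSO penalty
coef(cvfit, s = "lambda.min")              # variables with non-zero coefficients are the "selected" ones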
On Wed, Jun 24, 2
Hi
I am trying to explore the use of random forests for regression to
identify the important environmental/microclimate variables involved in
predicting the abundance of a species in different habitats; there are
approx. 40 variables and between 200 and 500 data points depending on the
dataset. I
estimate of MSE.
HTH,
Andy
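(For the archives: as far as I understand the code, the "% Var explained" column is a pseudo-R-squared computed from the OOB mean squared error, roughly 1 - MSE_oob / Var(y), which is also why it can come out negative when the OOB MSE is larger than the variance of the response.)

library(randomForest)
set.seed(1)
aq <- na.omit(airquality)
rf <- randomForest(Ozone ~ ., data = aq)
tail(rf$mse, 1)         # OOB mean squared error after the last tree
100 * tail(rf$rsq, 1)   # the "% Var explained" figure that print(rf) and do.trace report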
Hi all (and Andy!),
When running a randomForest run in R, I get the last part of an output
(with do.trace=T) that looks like this:
1993 | 0.04606 130.43 |
1994 | 0.04605 130.40 |
1995 | 0.04605 130.43 |
1996 | 0.04605 130.43 |
1997 | 0.04606 130.44 |
1998 | 0.04607 130.47 |
1
Read ?importance, especially the "scale" argument.
Andy
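In code the difference is the scaling: the stored component holds the raw measures, while the extractor divides the permutation importances by their standard errors unless scale = FALSE (sketch on iris, so please verify on your own object):

library(randomForest)
set.seed(1)
iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE)
iris.rf$importance                     # raw measures as stored in the object
importance(iris.rf)                    # scale = TRUE by default: divided by the importance SDs
all.equal(importance(iris.rf, scale = FALSE), iris.rf$importance)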
Hello,
I have an object of Random Forest : iris.rf (importance = TRUE).
What is the difference between "iris.rf$importance" and "importance(iris.rf)"?
Thank you in advance,
Best,
Li GUO
randomForest output is based on predict(iris.rf) whereas the
code shown below uses predict(iris.rf, iris). See ?predict.randomForest
for an explanation.
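i.e. the two confusion matrices answer different questions, which a two-liner makes visible:

library(randomForest)
set.seed(1)
iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE)
table(iris$Species, predict(iris.rf))         # OOB predictions: matches the printed confusion matrix
table(iris$Species, predict(iris.rf, iris))   # resubstitution: usually looks (too) perfect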
On Thu, Feb 26, 2009 at 11:10 AM, Li GUO wrote:
> Dear R users,
>
> I have a question on the confusion matrix generated by function randomFores
Dear R users,
I have a question on the confusion matrix generated by function randomForest.
I used the entire data
set to generate the forest, for example:
> print(iris.rf)
Call:
randomForest(formula = Species ~ ., data = iris, importance = TRUE,
keep.forest = TRUE)
confusion
Andy,
Thanks for your email.
I understand that by default, the sampsize variable will use the behavior
variable that we are classifying as the strata variable.
Then, I could set sampsize=c(no=89, yes=11). I implemented that but I got
99% classification error rate on the yes value. When I oversam