Re: [R] Random Forest: OOB performance = test set performance?

2021-04-11 Thread thebudget72
Thanks Peter. Indeed by setting a seed the two results are similar. I am self-studying and wanted to make sure I understood the concept of OOB samples and how "reliable" the performance metrics calculated on them were. It seems I did get it. That's good :) On 4/11/21 6:34 AM, Peter Langfel

Re: [R] Random Forest: OOB performance = test set performance?

2021-04-10 Thread Peter Langfelder
I think the only thing you are doing wrong is not setting the random seed (set.seed()) so your results are not reproducible. Depending on the random sample used to select the training and test sets, you get slightly varying accuracy for both, sometimes one is better and sometimes the other. HTH,
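
A minimal sketch of the seeding advice above, in base R only. The iris data, the 70/30 split, and the seed value 2021 are assumptions for illustration and are not taken from the thread; the point is only that fixing the seed before the random split makes the split, and therefore the reported accuracies, reproducible.

```r
# Hypothetical illustration: iris stands in for the poster's data.
n <- nrow(iris)

# Fix the seed, then draw a 70% training sample.
set.seed(2021)                                   # arbitrary seed value
train_idx_a <- sample(n, size = round(0.7 * n))

# Resetting the same seed reproduces exactly the same split.
set.seed(2021)
train_idx_b <- sample(n, size = round(0.7 * n))

identical(train_idx_a, train_idx_b)              # TRUE

# The remaining rows form the test set.
test_idx <- setdiff(seq_len(n), train_idx_a)
```

Without the set.seed() calls, each run would draw a different split, which is one plausible source of the small OOB versus test-set differences discussed here.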

[R] Random Forest: OOB performance = test set performance?

2021-04-10 Thread thebudget72
Hi ML, For random forest, I thought that the out-of-bag performance should be the same as (or at least very similar to) the performance calculated on a separate test set. But this does not seem to be the case. In the following code, the accuracy computed on the out-of-bag sample is 77.81%, while

[R] random forest significance testing tools

2020-05-10 Thread Tom Woolman
Hi everyone. I'm using a random forest in R to successfully perform a classification on a dichotomous DV in a dataset that has 29 IVs of type double and approximately 285,000 records. I ran my model on a 70/30 train/test split of the original dataset. I'm trying to use the rfUtilities packa

[R] Random Forest prediction all NA

2018-10-24 Thread Chen, Lang
Dear R user, I tried randomForest and had an issue with the prediction. The training and test data come from two separate csv files. However, I end up with all NA for the prediction. Y is a continuous variable; the X variables are factors, continuous variables, and variables with a numeric 0/1 indication of Y/N (not covert

[R] Random Forest tree labels

2018-01-04 Thread Elahe chalabi via R-help
Hi all, I have built a Random Forest using Caret package, however, I don't understand how the splits are labeled in trees. My dataset contains the frequency of the words in the speeches of the people: 'data.frame': 499 obs. of 608 variables: $ alright : num 1 0 0 0 0 0 0 1 2 1 ... $ bad : n

Re: [R] Random Forest classification

2016-04-18 Thread Liaw, Andy
This is explained in the "Details" section of the help page for partialPlot. Best Andy > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jesús Para > Fernández > Sent: Tuesday, April 12, 2016 1:17 AM > To: r-help@r-project.

[R] Random Forest classification

2016-04-11 Thread Jesús Para Fernández
Hi, To evaluate the partial influence of a factor with a random forest whose response is OK/NOK, I'm using partialPlot; the x axis is the factor axis and the y axis runs between -1 and 1. What do -1 and 1 mean? An example: https://www.dropbox.com/s/4b92lqxi3592r0d/Captura.JPG?dl=0 Thanks

Re: [R] Random forest regression: feedback on general approach and possible issues

2015-12-04 Thread Bert Gunter
I would suggest that you post instead on stats.stackexchange.com . This forum is mostly about R programming issues, not statistics (admittedly, the intersection is nonempty, but ...) That stackexchange forum is more about statistics. You might also consider a bioconductor forum, as this appears t

[R] Random forest regression: feedback on general approach and possible issues

2015-12-04 Thread Johannes Klene
Hi all, I'd like to use random forest regression to say something about the importance of a set of genes (binary) for schizophrenia-related behavior (continuous measure). I am still reading up on this technique, but would already really appreciate any feedback on whether my approach is valid. So...

Re: [R] Random Forest -

2015-06-22 Thread David Winsemius
On Jun 22, 2015, at 10:46 AM, synapse 123 wrote: > Hi > I wanted to know if I can use Random Forest in R for time to event data. I > cannot use Random Survival Forest since my data is not censored. Any > suggestions. > I'm not sure why that should be a problem for RandomSurvivalForests. It's not

[R] Random Forest -

2015-06-22 Thread synapse 123
Hi I wanted to know if I can use Random Forest in R for time to event data. I cannot use Random Survival Forest since my data is not censored. Any suggestions. Thanks Azi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list

Re: [R] Random Forest in Caret

2015-04-22 Thread Suzen, Mehmet
Can you post your memory profile and code?

[R] Random Forest in Caret

2015-04-22 Thread Lorenzo Isella
Dear All, I am a bit concerned about the memory consumption of randomForest in caret. This seems to be due to the fact that the option keep.forest=FALSE does not work in caret. Does anybody know a workaround for that? Many thanks Lorenzo

[R] Random Forest - Strata and sampsize and replace

2014-11-18 Thread Lopez, Dan
Hello R Experts, I want to make sure I understand how the strata, sampsize and replace parameters work so I can confidently perform downsampling on a dataset I'm working with. My main question is when the documentation talks about how each of these parameters (strata, sampsize, replace) works

Re: [R] random forest application

2014-06-20 Thread Li, Yan
Thanks for the reply... Actually you answered my question. I just want to know how people use it... -Original Message- From: Sarah Goslee [mailto:sarah.gos...@gmail.com] Sent: Friday, June 20, 2014 11:31 AM To: Li, Yan Cc: r-help@r-project.org Subject: Re: [R] random forest application

Re: [R] random forest application

2014-06-20 Thread Sarah Goslee
Hi, This is not an R question, so really not appropriate for the list. The answer depends on what "worth it" means to you. There are many applications: http://scholar.google.com/scholar?hl=en&q=%22random+forest%22&btnG=&as_sdt=1%2C39&as_sdtp= Sarah On Fri, Jun 20, 2014 at 10:12 AM, Li, Yan wr

[R] random forest application

2014-06-20 Thread Li, Yan
Hi All, Is anyone using random forest for prediction? Some people claim that it gives more accurate results than a decision tree. But considering that it builds 500 (by default) full trees, is it worth using random forest to predict instead of a decision tree? What typical applications of this algo

[R] Random forest proximity measure

2014-05-22 Thread Maggie Makar
Hi all, I've been using the randomForest package on a dataset (described later) and my problem is: even though I specify proximity= TRUE in the call I get a NULL proximity matrix. Any thoughts on why that may happen? Unfortunately I can't post my dataset, which is particularly problematic here si

Re: [R] Random Forest, Variable Mismatch

2014-02-15 Thread Peter Langfelder
On Sat, Feb 15, 2014 at 8:43 AM, Lorenzo Isella wrote: > Dear All, > I am a bit puzzled. > I am developing a random forest model. > The data is large and it involves hundreds of predictors, but the code I have > written is relatively simple. > After training my random forest model, I apply it on so

[R] Random Forest, Variable Mismatch

2014-02-15 Thread Lorenzo Isella
Dear All, I am a bit puzzled. I am developing a random forest model. The data is large and it involves hundreds of predictors, but the code I have written is relatively simple. After training my random forest model, I apply it on some new data set to carry out some prediction, as you can see be

Re: [R] Random Forest, Giving More Importance to Some Data

2013-03-24 Thread Wensui Liu
Your question doesn't seem to be specifically related to either R or random forest. Instead, it is about how to assign weights to training observations. On Sun, Mar 24, 2013 at 6:43 AM, Lorenzo Isella wrote: > Dear All, > I am using randomForest to predict the final selling price of some items. > A

[R] Random Forest, Giving More Importance to Some Data

2013-03-24 Thread Lorenzo Isella
Dear All, I am using randomForest to predict the final selling price of some items. As it often happens, I have a lot of (noisy) historical data, but the question is not so much about data cleaning. The dataset for which I need to carry out some predictions are fairly recent sales or even some

Re: [R] Random Forest Error for Factor to Character column

2013-01-15 Thread Lopez, Dan
Andrew, That did the trick. Thank you. Dan From: Andrew Robinson [mailto:mensuration...@gmail.com] Sent: Monday, January 14, 2013 6:06 PM To: Lopez, Dan Cc: R help (r-help@r-project.org) Subject: Re: [R] Random Forest Error for Factor to Character column After you subset the data, did you

Re: [R] Random Forest Error for Factor to Character column

2013-01-14 Thread Andrew Robinson
After you subset the data, did you redeclare the factor? If not then R still thinks it has the potential for all those levels. TRAINSET$JOBTITLE <- factor(TRAINSET$JOBTITLE) I hope this helps Andrew On Tuesday, January 15, 2013, Lopez, Dan wrote: > Hi, > > Can someone please offer me some guid
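
A small sketch of the fix described above, using base R only. The data frame and the job titles are hypothetical stand-ins for the poster's imported column; only the factor() re-declaration comes from the reply. It shows that subsetting keeps every original level and that re-declaring the factor drops the empty ones.

```r
# Hypothetical stand-in for the imported JOBTITLE column.
jobs <- data.frame(
  JOBTITLE = factor(c("analyst", "clerk", "manager", "clerk"))
)

# Subsetting keeps all original levels, even those with no rows left.
trainset <- jobs[jobs$JOBTITLE != "manager", , drop = FALSE]
levels_before <- nlevels(trainset$JOBTITLE)   # 3: "manager" is still a level

# Re-declaring the factor drops the unused levels.
trainset$JOBTITLE <- factor(trainset$JOBTITLE)
levels_after <- nlevels(trainset$JOBTITLE)    # 2

levels(trainset$JOBTITLE)
```

Base R's droplevels() applied to the whole data frame is an alternative when several factor columns are affected.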

[R] Random Forest Error for Factor to Character column

2013-01-14 Thread Lopez, Dan
Hi, Can someone please offer me some guidance? I imported some data. One of the columns called "JOBTITLE" when imported was imported as a factor column with 416 levels. I subset the data in such a way that only 4 levels have data in "JOBTITLE" and tried running randomForest but it complained a

[R] Random Forest imbalanced data and partial plots

2012-10-23 Thread Valerie Steen
Hello, Regarding imbalanced data: When using the sampsize correction to balance imbalanced data in Random Forests, what are the implications of the algorithm no longer using a bootstrapped sample? For instance, if I set sampsize to 25, 25 for a binary response, in a dataset of N=800, how does randomf

Re: [R] random forest

2012-10-21 Thread Gyanendra Pokharel
Sorry, the previous was not the right post. I want to know the difference between the following two methods of random forest. 1. epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree = 300,xtest = NULL, ytest = NULL,replace = T, proximity =F) 2. epiG.rf <-randomForest(x = data,,y = data$gam

[R] random forest

2012-10-21 Thread Gyanendra Pokharel
Hi all, Can someone tell me the difference between the following two formulas? 1. epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree = 300,xtest = NULL, ytest = NULL,replace = T, proximity =F) 2. epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree = 300,xtest = NU

Re: [R] Random Forest for multiple categorical variables

2012-10-17 Thread Liaw, Andy
t: Tuesday, October 16, 2012 10:47 PM To: R-help@r-project.org Subject: [R] Random Forest for multiple categorical variables Dear all, I have the following data set. V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 alpha beta 1111 111 111 11111alpha beta1 2122 12

[R] Random Forest for multiple categorical variables

2012-10-16 Thread Gyanendra Pokharel
Dear all, I have the following data set. V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 alpha beta 1111 111 111 11111alpha beta1 2122 122 12 2 12212alpha beta1 3133 133 13 3 13 313alpha beta1 41

Re: [R] Random Forest - Extract

2012-10-03 Thread Liaw, Andy
set type="votes" and norm.votes=FALSE, you will get the counts instead of proportions. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lopez, Dan Sent: Wednesday, September 26, 2012 9:05 PM To: R help (r-help@r-pr

[R] Random Forest - Extract

2012-09-26 Thread Lopez, Dan
Hello, I have two Random Forest (RF) related questions. 1. How do I view the classifications for the detail data of my training data (aka trainset) that I used to build the model? I know there is an object called predicted which I believe is a vector. To view the detail for my testset I

[R] Random Forest and Correlated Fields

2012-09-14 Thread Lopez, Dan
Does anyone know if there are any special considerations with Random Forest and correlated fields or rather derived fields? For example if we are trying to predict who might leave our company to go work for another company some of the variables we may look at are below (in addition to others).

Re: [R] random forest using party package

2012-08-30 Thread Bhupendrasinh Thakre
Have you tried checking the memory limit? You may want to check memory.limit(), although in most cases you can extend the limit to 4000. Also, as David mentioned, try to run only R and force stop other programs. Best Regards, Bhupendrasinh Thakre Sent from my iPhone On Aug 30, 2012, at 10:02 AM, David Win

Re: [R] random forest using party package

2012-08-30 Thread David Winsemius
On Aug 30, 2012, at 4:02 AM, mushira wrote: Hi all, I am trying out random forest with the party package but am getting an error saying: "cannot allocate vector of size 564." What would be the problem? The code is below: data.controls <- cforest_unbiased(ntree=1000, mtry=3) data.cforest

[R] random forest using party package

2012-08-30 Thread mushira
Hi all, I am trying out random forest with the party package but am getting an error saying: "cannot allocate vector of size 564." What would be the problem? The code is below: > data.controls <- cforest_unbiased(ntree=1000, mtry=3) > data.cforest <- cforest(class ~x1+x2+x3, data = Score, > contro

[R] Random Forest Partial Dependence Plot

2012-06-21 Thread Namit Setia
I'm using the partial plot function in the Random Forest package (randomForest). I want to be able to control the region of values it chooses for the x.var variable (instead of going from 0 to 10, I want to go from 0 to 1000). The problem I'm having is that it seems the only method to do that

Re: [R] Random Forest Classification_ForestCombination

2012-05-29 Thread Liaw, Andy
Sent: Wednesday, May 23, 2012 1:51 PM To: r-help@R-project.org Subject: [R] Random Forest Classification_ForestCombination Hello, I am aware of the fact that the combine() function in the Random Forest package of R is meant to combine forests built from the same training set, but is there any

[R] Random Forest Classification_ForestCombination

2012-05-23 Thread Nikita Desai
Hello, I am aware of the fact that the combine() function in the Random Forest package of R is meant to combine forests built from the same training set, but is there any way to combine trees built on different training sets? Both the training datasets used contain the same variables and classe

Re: [R] Random Forest Package

2012-02-01 Thread Liaw, Andy
You should be able to use the Rgui menu to install packages. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Niratha > Sent: Wednesday, February 01, 2012 5:16 AM > To: r-help@r-project.org > Subject:

[R] Random Forest Package

2012-02-01 Thread Niratha
Hi, I have installed R version 2.14 on Windows 7. I want to use the randomForest package. I installed Rtools and MikTex 2.9, but I am not able to read the description file, and it is also not possible to build the package. When I give this command in Windows: R CMD IINSTALL --build randomForest it shows

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-20 Thread Lost in R
Bill thanks so much. I left off the as.matrix and it worked! I really appreciate the help. -- View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4218240.html Sent from the R help mailing list archive at Nabble.com.

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread William Dunlap
ay, December 16, 2011 2:55 PM > To: r-help@r-project.org > Subject: Re: [R] Random Forest Reading N/A's, I don't see them > > The data set I attached was just those 10 lines. It was only meant to show > any possible obvious mistake I may have made. The real set has the 4

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread David Winsemius
On Dec 15, 2011, at 2:39 PM, Lost in R wrote: After checking the original data in Excel for blanks and running summary(cm3) to identify any null values in my data, I'm unable to identify any instances. Yet when I attempted to use the data in Random Forest, I get the following error. Is ther

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread Lost in R
The data set I attached was just those 10 lines. It was only meant to show any possible obvious mistake I may have made. The real set has the 4498 lines of data. -- View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Reading-N-A-s-I-don-t-see-them-tp4201546p4206630.html Sent

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread David Winsemius
On Dec 16, 2011, at 12:20 PM, Lost in R wrote: I've also attached here a sample of my data in Excel. I'm thinking it It? What is "it"? must be a problem with a character, but can't figure it out. Is there a list somewhere of characters to avoid in R? Thanks, Mike http://r.789695.n4.nabble

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread jim holtman
What exactly is your problem with this file? The file that you sent had 10 lines of what appeared to be data and 4489 lines with just commas which would read in as NAs. When you do an 'str' you get: > str(x) 'data.frame': 4498 obs. of 195 variables: $ Good_Bad : Factor w/ 3

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-16 Thread Lost in R
I've also attached here a sample of my data in Excel. I'm thinking it must be a problem with a character, but can't figure it out. Is there a list somewhere of characters to avoid in R? Thanks, Mike http://r.789695.n4.nabble.com/file/n4205479/Sample_Data_Set.csv Sample_Data_Set.csv -- View this

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-15 Thread Lost in R
Thanks Michael - That was a help. I got rid of the "," in my numbers and the "%", which were making many of the numeric variables FACTORS. It appears that I made all of those revisions, but I'm still getting the same error. Attached is the str() output; if anyone could shed some light it would be mu

Re: [R] Random Forest Reading N/A's, I don't see them

2011-12-15 Thread R. Michael Weylandt
Use str() on your object and attach the result. For even faster help, use dput() on a *small* sample of your data to make the problem reproducible. My guess is that there are characters or, less likely, factors lurking about... Michael On Dec 15, 2011, at 2:39 PM, Lost in R wrote: > After c

[R] Random Forest Reading N/A's, I don't see them

2011-12-15 Thread Lost in R
After checking the original data in Excel for blanks and running summary(cm3) to identify any null values in my data, I'm unable to identify any instances. Yet when I attempted to use the data in Random Forest, I get the following error. Is there something that Random Forest is reading as null which

Re: [R] Random Forest Classification

2011-10-26 Thread Steve_Friedman

[R] Random Forest Classification

2011-10-26 Thread Mohammed Rashad
Hi All, I want to do Random Forest classification. I installed R and the randomForest classifier package for R but don't know how to use it. Is there any open source remote sensing application which does RF classification on satellite images? Does anyone have a random forest classification example? Any languag

Re: [R] Random Forest

2011-05-24 Thread Peter Langfelder
On Tue, May 24, 2011 at 3:18 PM, Unger, Rachel wrote: > I'm analyzing data using Random Forest Regression.  For some of the > species I am analyzing, the percent variation explained is negative. > Could you please explain to me what that means?  If you need more > information, please let me know.

[R] Random Forest

2011-05-24 Thread Unger, Rachel
I'm analyzing data using Random Forest Regression. For some of the species I am analyzing, the percent variation explained is negative. Could you please explain to me what that means? If you need more information, please let me know. Thank you. Sincerely, Rachel Unger [[altern

Re: [R] Random Forest & Cross Validation

2011-02-27 Thread ronzhao
Thanks to you all! Now I got it! -- View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Cross-Validation-tp3314777p3327384.html Sent from the R help mailing list archive at Nabble.com.

Re: [R] Random Forest & Cross Validation

2011-02-24 Thread Liaw, Andy
art adding steps such as feature selections, all bets are off. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of mxkuhn > Sent: Tuesday, February 22, 2011 7:17 PM > To: ronzhao > Cc: r-help@r-project.o

Re: [R] Random Forest & Cross Validation

2011-02-22 Thread mxkuhn
If you want to get honest estimates of accuracy, you should repeat the feature selection within the resampling (not the test set). You will get different lists each time, but that's the point. Right now you are not capturing that uncertainty which is why the oob and test set results differ so mu

Re: [R] Random Forest & Cross Validation

2011-02-22 Thread ronzhao
Thanks, Max. Yes, I did some feature selections in the training set. Basically, I selected the top 1000 SNPs based on OOB error and grow the forest using training set, then using the test set to validate the forest grown. But if I do the same thing in test set, the top SNPs would be different th

Re: [R] Random Forest & Cross Validation

2011-02-20 Thread Max Kuhn
> I am using randomForest package to do some prediction job on GWAS data. I > firstly split the data into training and testing set (70% vs 30%), then > using training set to grow the trees (ntree=10). It looks that the OOB > error in training set is good (<10%). However, it is not very good for

[R] Random Forest & Cross Validation

2011-02-19 Thread ronzhao
Hi, I am using the randomForest package to do some prediction on GWAS data. I first split the data into training and testing sets (70% vs 30%), then used the training set to grow the trees (ntree=10). The OOB error on the training set looks good (<10%). However, it is not very good for th

Re: [R] Random Forest AUC

2010-10-24 Thread Liaw, Andy
> From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Claudia Beleites > Sent: Saturday, October 23, 2010 3:39 PM > To: r-help@r-project.org > Subject: Re: [R] Random Forest AUC > > Dear List, > > Just curiosity (disclaimer: I never used

Re: [R] Random Forest AUC

2010-10-23 Thread Claudia Beleites
orever", which necessitate the need to find the optimal number of iterations. You don't need that with RF. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of vioravis Sent: Saturday, October 23, 2010 12:15 AM To: r-help@r-

Re: [R] Random Forest AUC

2010-10-23 Thread Changbin Du
", which nececitate the need to find the > > optimal number of iterations. You don't need that with RF. > > > >> -Original Message- > >> From: r-help-boun...@r-project.org > >> [mailto:r-help-boun...@r-project.org] On Behalf

Re: [R] Random Forest AUC

2010-10-23 Thread mxkuhn
tate the need to find the > optimal number of iterations. You don't need that with RF. > >> -Original Message- >> From: r-help-boun...@r-project.org >> [mailto:r-help-boun...@r-project.org] On Behalf Of vioravis >> Sent: Saturday, October 23, 20

Re: [R] Random Forest AUC

2010-10-23 Thread Liaw, Andy
ind the optimal number of iterations. You don't need that with RF. > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of vioravis > Sent: Saturday, October 23, 2010 12:15 AM > To: r-help@r-project.org &

Re: [R] Random Forest AUC

2010-10-22 Thread vioravis
Thanks Max and Andy. If the Random Forest is always giving an AUC of 1, isn't it over fitting??? If not, how do you differentiate this from over fitting??? I believe Random forests are claimed to never over fit (from the following link). http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home

Re: [R] Random Forest AUC

2010-10-22 Thread Liaw, Andy
> Sent: Friday, October 22, 2010 1:20 AM > To: r-help@r-project.org > Subject: [R] Random Forest AUC > > > Guys, > > I used Random Forest with a couple of data sets I had to > predict for binary > response. In all the cases, the AUC of the training set is > co

Re: [R] Random Forest AUC

2010-10-22 Thread Max Kuhn
Ravishankar, > I used Random Forest with a couple of data sets I had to predict for binary > response. In all the cases, the AUC of the training set is coming to be 1. > Is this always the case with random forests? Can someone please clarify > this? This is pretty typical for this model. > I hav

[R] Random Forest AUC

2010-10-21 Thread vioravis
Guys, I used Random Forest with a couple of data sets I had to predict for binary response. In all the cases, the AUC of the training set is coming to be 1. Is this always the case with random forests? Can someone please clarify this? I have given a simple example, first using logistic regressi

Re: [R] Random Forest - Strata

2010-07-28 Thread Coll
Max, Thanks. Yes, what you said is exactly what I am looking for, i.e. the first tree fits using data from sites A&B, then predicts on C (and so on). Does that mean if I : 1. pass this list as index into trainControl > tmpSiteList [[1]] [1] 1 2 3 4 5 6 7 [[2]] [1] 1 2 3 8 9 10 [[3]] [1] 4 5

Re: [R] Random Forest - Strata

2010-07-27 Thread Max Kuhn
The index indicates which samples should go into the training set. However, you are using out of bag sampling, so it would use the whole training set and return the OOB error (instead of the error estimates that would be produced by resampling via the index). Which do you want? OOB estimates or ot

Re: [R] Random Forest - Strata

2010-07-27 Thread Coll
Thanks for all the help. I had tried using the "index" in caret to try to dictate which rows of the sample would be used in each of the tree building in RF. (e.g. use all data from A B site for training, hold out all data from C site for testing etc) However after running, when I cross-checked

Re: [R] Random Forest - Strata

2010-07-21 Thread mxkuhn
; > Message: 44 > Date: Tue, 20 Jul 2010 08:48:04 -0700 (PDT) > From: Coll > To: r-help@r-project.org > Subject: [R] Random Forest - Strata > Message-ID: <1279640884553-2295731.p...@n4.nabble.com> > Content-Type: text/plain; charset=us-ascii > > > Hi all, &g

Re: [R] Random Forest - Strata

2010-07-21 Thread Tim Howard
g Subject: [R] Random Forest - Strata Message-ID: <1279640884553-2295731.p...@n4.nabble.com> Content-Type: text/plain; charset=us-ascii Hi all, Had struggled in getting "Strata" in randomForest to work on this. Can I get randomForest for each of its TREE, to get ALL sample from

[R] Random Forest - Strata

2010-07-20 Thread Coll
Hi all, Had struggled in getting "Strata" in randomForest to work on this. Can I get randomForest for each of its TREE, to get ALL sample from some strata to build tree, while leaving some strata TOTALLY untouched as oob? e.g. in below, how I can tell RF to, - for tree 1 in the forest, to use

Re: [R] Random Forest for Ecological Prediction under presence of Spatial Autocorrelation

2010-05-25 Thread Andreas Béguin
Thank you very much for this suggestion, I was not aware of this package. Apart from this, is suggestion 2 (changing nodesize attribute) a good way to go? Experimenting with sampsize (suggestion 4) has yielded promising results. Kind regards, Andreas Béguin 2010/5/24 Gabor Grothendieck > You co

Re: [R] Random Forest for Ecological Prediction under presence of Spatial Autocorrelation

2010-05-24 Thread Gabor Grothendieck
You could also try the Boruta package for variable selection. 2010/5/24 Andreas Béguin : > Dear R-help list members, > > I have a statistical question regarding the Random Forest function (RF) as > applied to ecological prediction of species presences and absences. > > RF seems to perform very wel

[R] Random Forest for Ecological Prediction under presence of Spatial Autocorrelation

2010-05-24 Thread Andreas Béguin
Dear R-help list members, I have a statistical question regarding the Random Forest function (RF) as applied to ecological prediction of species presences and absences. RF seems to perform very well for prediction of species ranges or prevalences. However, the problem with my dataset is a high de

Re: [R] Random Forest

2010-03-10 Thread Liaw, Andy
Thanks for providing the code that allows me to reproduce the problem. It looks like the prediction routine for some reason returns "0" as prediction for some trees, thus causing the problem observed. I'll look into it. Andy From: Dror > > Hi, > Thank you for your replies > as for the predic

Re: [R] Random Forest

2010-03-10 Thread Dror
Hi, Thank you for your replies as for the prediction length, i run this code: " library(arules) data(AdultUCI) AdultUCI$workclass<-factor(AdultUCI$workclass, levels = c(levels(AdultUCI$workclass), "UNKNOWN")) AdultUCI$workclass[is.na(AdultUCI$workclass)]<-"UNKNOWN" AdultUCI$occupation<-factor(Adu

Re: [R] Random Forest

2010-03-01 Thread Liaw, Andy
From: Dror > > Hi, > I'm working with randomForest package and i have 2 questions: > 1. how can i drop a specific tree from the forest? Answered in another post. > 2. i'm trying to get the voting of each tree in a prediction > datum using the > following code > > pr<-predict(RF,NewData,type="

Re: [R] Random Forest prediction questions

2010-03-01 Thread Liaw, Andy
From: Dror > > Hi, > I need help with the randomForest prediction. i run the following code: > > > iris.rf <- randomForest(Species ~ ., data=iris, > > importance=TRUE,keep.forest=TRUE, proximity=TRUE) > > pr<-predict(iris.rf,iris,predict.all=T) > > iris.rf$votes[53,] > setosa versicolor virg

[R] Random Forest prediction questions

2010-03-01 Thread Dror
Hi, I need help with the randomForest prediction. i run the following code: > iris.rf <- randomForest(Species ~ ., data=iris, > importance=TRUE,keep.forest=TRUE, proximity=TRUE) > pr<-predict(iris.rf,iris,predict.all=T) > iris.rf$votes[53,] setosa versicolor virginica 0.000 0.8074866

Re: [R] Random Forest

2010-02-27 Thread Dror
Hi, I'm working with randomForest package and i have 2 questions: 1. how can i drop a specific tree from the forest? 2. i'm trying to get the voting of each tree in a prediction datum using the following code > pr<-predict(RF,NewData,type="prob",predict.all=TRUE) my forest has 300 trees and i ge

Re: [R] Random Forest

2010-02-16 Thread Liaw, Andy
From: Dror > > Hi, > i'm using randomForest package and i have 2 questions: > 1. Can i drop one tree from an RF object? Yes. > 2. i have a 300 trees forest, but when i use the predict > function on new > data (with predict.all=TRUE) i get only 270 votes. did i do > something wrong? Try to fol

[R] Random Forest

2010-02-16 Thread Dror
Hi, i'm using randomForest package and i have 2 questions: 1. Can i drop one tree from an RF object? 2. i have a 300 trees forest, but when i use the predict function on new data (with predict.all=TRUE) i get only 270 votes. did i do something wrong? Thanks -- View this message in context: http

Re: [R] Random Forest - partial dependence plot

2009-10-20 Thread Carlos M. Zambrana-Torrelio
> these plots from different predictor variables, but not the absolute > range.  Hope that helps. > > Andy > >> -Original Message- >> From: r-help-boun...@r-project.org >> [mailto:r-help-boun...@r-project.org] On Behalf Of Carlos M. >> Zambrana-Torrelio

Re: [R] Random Forest - partial dependence plot

2009-10-20 Thread Liaw, Andy
ject.org] On Behalf Of Carlos M. > Zambrana-Torrelio > Sent: Monday, October 19, 2009 3:47 PM > To: r-help@r-project.org > Subject: [R] Random Forest - partial dependence plot > > Hi everybody, > > I used random forest regression to explain the patterns of species > richn

[R] Random Forest - partial dependence plot

2009-10-19 Thread Carlos M. Zambrana-Torrelio
Hi everybody, I used random forest regression to explain the patterns of species richness and a bunch of climate variables (e.g. temperature, precipitation, etc.). All are continuous variables. My results are really interesting and my model explained 96,7% of the variance. Now I am trying to take

Re: [R] Random Forest Variable Importance Interpretation

2009-07-06 Thread Alex Roy
Hi, Are you looking for variable selection? If this is the case then you can use LASSO, Elastic net, or Sparse PLS regression methods, which encourage variable selection. PCA does not select variables as you get all your variables in the PCs. You can use sparse PCA. Regards Alex On Wed, Jun 24, 2

[R] Random Forest Variable Importance Interpretation

2009-06-24 Thread lara harrup (IAH-P)
Hi I am trying to explore the use of random forests for regression to identify the important environmental/microclimate variables involved in predicting the abundance of a species in different habitats. There are approximately 40 variables and between 200 and 500 data points depending on the dataset. I

Re: [R] Random Forest % Variation vs Psuedo-R^2?

2009-06-08 Thread Liaw, Andy
estimate of MSE. HTH, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Ryan Harrigan > Sent: Sunday, June 07, 2009 9:38 PM > To: r-help@r-project.org > Subject: [R] Random Forest % Variation vs Psuedo-R^2

[R] Random Forest % Variation vs Psuedo-R^2?

2009-06-07 Thread Ryan Harrigan
Hi all (and Andy!), When running a randomForest run in R, I get the last part of an output (with do.trace=T) that looks like this:
    1993 | 0.04606 130.43 |
    1994 | 0.04605 130.40 |
    1995 | 0.04605 130.43 |
    1996 | 0.04605 130.43 |
    1997 | 0.04606 130.44 |
    1998 | 0.04607 130.47 |
    1

Re: [R] Random Forest Variable Importance

2009-03-27 Thread Liaw, Andy
Read ?importance, especially the "scale" argument. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Li GUO > Sent: Friday, March 27, 2009 1:24 PM > To: r-help@r-project.org > Subject
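
A hedged sketch of what the "scale" argument changes, assuming the randomForest package is installed. The iris model and the seed are illustrative and not taken from the thread. To my understanding, the $importance component stores the raw measures, while importance() by default divides the permutation-based measures by their standard errors, so the two differ unless scale = FALSE is passed.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, for reproducibility only
iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE)

raw      <- iris.rf$importance                 # stored, unscaled measures
scaled   <- importance(iris.rf)                # default: scale = TRUE
unscaled <- importance(iris.rf, scale = FALSE) # extractor without scaling

# With scaling switched off, the extractor returns the stored matrix.
same_when_unscaled <- isTRUE(all.equal(unscaled, raw))

# The Gini-based column is not affected by scaling.
gini_unchanged <- isTRUE(all.equal(scaled[, "MeanDecreaseGini"],
                                   raw[, "MeanDecreaseGini"]))
```

Comparing `scaled` and `raw` side by side shows which columns the default scaling touches.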

[R] Random Forest Variable Importance

2009-03-27 Thread Li GUO
Hello, I have an object of Random Forest : iris.rf (importance = TRUE). What is the difference between "iris.rf$importance" and "importance(iris.rf)"? Thank you in advance, Best, Li GUO

Re: [R] Random Forest confusion matrix

2009-02-26 Thread Gabor Grothendieck
randomForest output is based on predict(iris.rf) whereas the code shown below uses predict(iris.rf, iris). See ?predict.randomForest for an explanation. On Thu, Feb 26, 2009 at 11:10 AM, Li GUO wrote: > Dear R users, > > I have a question on the confusion matrix generated by function randomFores
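
The distinction in this reply can be sketched as follows, assuming the randomForest package is installed; the seed is arbitrary and the iris model mirrors the one in the question. Calling predict() without new data returns the out-of-bag predictions that the printed confusion matrix is built from, whereas passing the training data back in lets every tree vote on rows it was trained on.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, for reproducibility only
iris.rf <- randomForest(Species ~ ., data = iris)

# No newdata: out-of-bag predictions (each row is predicted only by
# trees that did not see it during training).
oob_pred <- predict(iris.rf)
oob_table <- table(observed = iris$Species, predicted = oob_pred)

# newdata = training data: all trees vote, including those that saw the row,
# so this table is typically much more optimistic.
resub_pred <- predict(iris.rf, iris)
resub_table <- table(observed = iris$Species, predicted = resub_pred)

oob_table
resub_table
```

The out-of-bag table should match the confusion matrix shown by print(iris.rf), which is why the two calls in the original question disagree.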

[R] Random Forest confusion matrix

2009-02-26 Thread Li GUO
Dear R users, I have a question on the confusion matrix generated by function randomForest. I used the entire data set to generate the forest, for example: > print(iris.rf) Call: randomForest(formula = Species ~ ., data = iris, importance = TRUE, keep.forest = TRUE) confusion

Re: [R] Random Forest weighting

2008-12-05 Thread Raghu Naik
Andy, Thanks for your email. I understand that by default, the sampsize variable will use the behavior variable that we are classifying as the strata variable. Then, I could set sampsize=c(no=89, yes=11). I implemented that but I got 99% classification error rate on the yes value. When I oversam
