Many thanks for your help. Sorry for the delayed reply; I was away. Regarding the OOB error: sorry, it was a typo.
As for the voting, I was wondering whether there is a function that gives the prediction of each case by each tree. Is there a function that produces the rules for each tree? If I have a new case and want to predict which class it belongs to, how can I do that? Should I look at each tree and then tally the votes? Or are there predictive rules that I can use? I cannot make that prediction from the results that the votes give me...

Also, why do randomization and combining the predictions from the trees significantly improve the overall predictive accuracy?

Thanks a lot,
Chrysanthi

2009/4/13 Liaw, Andy <andy_l...@merck.com>

> I really don't understand what you don't understand. Do you know how a
> tree forms a prediction? If not, it may be a good idea to learn about that
> first. The code runs prediction of each case through all trees in the
> forest and that's how the votes are formed.
>
> [For OOB predictions, only predictions from trees for which the case is
> out-of-bag are counted. That's why you may get odd-ball vote fractions even
> when you grow 100 trees and expect the votes to be in seq(0, 1, by=0.01).]
>
> 100% - 2.34% = 97.66%, not 76.6% (I can only assume you had a typo).
>
> Cheers,
> Andy
>
> ------------------------------
> *From:* Chrysanthi A. [mailto:chrys...@gmail.com]
> *Sent:* Monday, April 13, 2009 9:44 AM
> *To:* Liaw, Andy
> *Cc:* r-help@r-project.org
> *Subject:* Re: [R] help with random forest package
>
> But how does it estimate that voting output? How does it get the 85.7% for
> all the trees?
>
> Regarding the prediction accuracy. If I have OOB error = 2.34, then the
> prediction accuracy will be equal to 76.6%, right?
>
> Many thanks,
>
> Chrysanthi.
>
> 2009/4/13 Liaw, Andy <andy_l...@merck.com>
>
>> RF forms prediction by voting. Note that each row in the output sums to
>> 1.
>> It says 85.7% of the trees classified the first case as "healthy" and
>> the other 14.3% of the trees "unhealthy". The majority (in two-class cases
>> like this one) wins, so the prediction is "healthy".
>>
>> You can take 1 - OOB error rate as the estimate of prediction accuracy (if
>> you have not selected variables, e.g., using variable importance, in
>> building the final RF model).
>>
>> Andy
>>
>> ------------------------------
>> *From:* Chrysanthi A. [mailto:chrys...@gmail.com]
>> *Sent:* Friday, April 10, 2009 10:44 AM
>> *To:* Liaw, Andy
>> *Cc:* r-help@r-project.org
>> *Subject:* Re: [R] help with random forest package
>>
>> Hi,
>>
>> To be honest, I cannot really understand the meaning of the votes.
>> For example, with five samples and two classes, what do the numbers
>> below mean?
>>
>>      healthy  unhealthy
>> 1 0.85714286 0.14285714
>> 2 0.92857143 0.07142857
>> 3 0.90000000 0.10000000
>> 4 0.92857143 0.07142857
>> 5 0.84615385 0.15384615
>>
>> Suppose now, having the classification, I have an unknown sample.
>> According to the results that I've got, how can I predict which class it
>> belongs to? Do the votes give that prediction to us?
>>
>> Also, the error is reported in the "OOB estimate of error rate", right?
>> For example, if we have OOB estimate of error rate: 2.34%, can we say that
>> the prediction accuracy is approx. 97.7%? How can we estimate the prediction
>> accuracy?
>>
>> Thanks a lot,
>>
>> Chrysanthi.
>>
>> 2009/4/8 Liaw, Andy <andy_l...@merck.com>
>>
>>> I'm not quite sure what you're asking. RF predicts by classifying the
>>> new observation using all trees in the forest, and takes the plurality
>>> vote. The predict() method for randomForest objects does that for you.
>>> The getTree() function shows you what each individual tree is like (not
>>> visually, just the underlying representation of the tree).
>>>
>>> Andy
>>>
>>> ------------------------------
>>> *From:* Chrysanthi A. [mailto:chrys...@gmail.com]
>>> *Sent:* Wednesday, April 08, 2009 2:56 PM
>>> *To:* Liaw, Andy
>>> *Cc:* r-help@r-project.org
>>> *Subject:* Re: [R] help with random forest package
>>>
>>> Many thanks for the reply.
>>>
>>> So, having extracted the votes, how can we work out the classification
>>> result? If I want to predict which class an unknown sample belongs to,
>>> what is the rule that will give me that?
>>>
>>> Thanks a lot,
>>>
>>> Chrysanthi.
>>>
>>> 2009/4/8 Liaw, Andy <andy_l...@merck.com>
>>>
>>>> The source code of the whole package is available on CRAN. All packages
>>>> are submitted to CRAN in source form.
>>>>
>>>> There's no "rule" per se that gives the final prediction, as the final
>>>> prediction is the result of the plurality vote by all trees in the forest.
>>>>
>>>> You may want to look at the varUsed() and getTree() functions.
>>>>
>>>> Andy
>>>>
>>>> From: Chrysanthi A.
>>>> > Hello,
>>>> >
>>>> > I am a PhD student in Bioinformatics and I am using the Random Forest
>>>> > package to classify my data, but I have some questions.
>>>> > Is there a function to visualize the trees, so as to get the rules?
>>>> > Also, could you please provide me with the code of the "randomForest"
>>>> > function, as I would like to see how it works. I was wondering if I
>>>> > can get the classification having the most votes over all the trees in
>>>> > the forest (the final rules that will give me the final classification).
>>>> > Also, is it possible to get a vector with the attributes that are
>>>> > selected for each node during the construction of each tree? I mean
>>>> > that I would like to know the m<<M variables that are selected at each
>>>> > node out of the M input attributes. Are they selected randomly? Can
>>>> > the same variable be selected in subsequent nodes?
>>>> >
>>>> > Thanks a lot,
>>>> >
>>>> > Chrysanthi.
>>>> >
>>>> > [[alternative HTML version deleted]]
>>>> >
>>>> > ______________________________________________
>>>> > R-help@r-project.org mailing list
>>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>>> > PLEASE do read the posting guide
>>>> > http://www.R-project.org/posting-guide.html
>>>> > and provide commented, minimal, self-contained, reproducible code.
>>>> >
>>>> Notice: This e-mail message, together with any attachments, contains
>>>> information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station,
>>>> New Jersey, USA 08889), and/or its affiliates (which may be known
>>>> outside the United States as Merck Frosst, Merck Sharp & Dohme or
>>>> MSD and in Japan, as Banyu - direct contact information for affiliates
>>>> is available at http://www.merck.com/contact/contacts.html) that may be
>>>> confidential, proprietary copyrighted and/or legally privileged. It is
>>>> intended solely for the use of the individual or entity named on this
>>>> message. If you are not the intended recipient, and have received this
>>>> message in error, please notify us immediately by reply e-mail and
>>>> then delete it from your system.
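A follow-up sketch for the questions at the top of this thread (the prediction of each case by each tree, predicting a new case, and where the vote fractions come from). This is not code from the thread, only a minimal illustration under stated assumptions: the randomForest package is installed, the built-in iris data stands in for the poster's data (which is not shown), and the seed, the 100/50 split, and all object names are arbitrary choices made for the example. It uses predict() with predict.all = TRUE and type = "prob", plus the getTree() and varUsed() functions that Andy mentions.

```r
# Sketch only: iris is a stand-in for the poster's data, which is not shown.
library(randomForest)

set.seed(42)                           # arbitrary seed, for reproducibility
train_idx <- sample(nrow(iris), 100)   # hypothetical split: 100 training cases
train     <- iris[train_idx, ]
new_cases <- iris[-train_idx, ]        # the remaining 50 play the "unknown" samples

rf <- randomForest(Species ~ ., data = train, ntree = 100)

# Predicted class for each new case: the class with the most votes over all trees.
pred_class <- predict(rf, newdata = new_cases)

# Fraction of trees voting for each class; every row sums to 1.
pred_votes <- predict(rf, newdata = new_cases, type = "prob")

# Prediction of every new case by every individual tree.
# $individual has one row per case and one column per tree;
# $aggregate is the voted prediction.
per_tree <- predict(rf, newdata = new_cases, predict.all = TRUE)

# Tally the per-tree predictions by hand for the first new case.
table(per_tree$individual[1, ])

# OOB votes for the training cases, and the number of trees for which
# each training case was out-of-bag (the denominator of its OOB votes).
head(rf$votes)
head(rf$oob.times)

# OOB error rate after the last tree, and the implied accuracy estimate.
oob_error <- rf$err.rate[rf$ntree, "OOB"]
1 - oob_error

# Underlying representation of the first tree, and how many times each
# predictor was used for a split across the forest.
getTree(rf, k = 1, labelVar = TRUE)
varUsed(rf, count = TRUE)

# One possible reading of the vote fractions quoted in the thread: small
# OOB tree counts, e.g. 12 of 14, 13 of 14, 9 of 10 and 11 of 13 trees.
c(12 / 14, 13 / 14, 9 / 10, 11 / 13)
```

There are no separate "rules" to apply to a new case: predict() runs it through every tree and returns the plurality class, which is what Andy describes above. The per-tree matrix is only needed if you want to inspect or tally the individual votes yourself.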