Hi Andy,

Thanks so much for your reply!

In the paper "Classification and regression by randomForest", on the first page, it says "the random forest estimates the importance of a variable by looking at how much prediction error increases when the variable is permuted..."

In the help document of randomForest, variable importance is measured as the total decrease in node impurities. Shouldn't it be the total *increase* in node impurities? If it is the total decrease in node impurities, doesn't that contradict the paper?

Also, in fit$importance, what is the meaning of the first two columns (0 and 1)?

> fit$importance
                         0           1 MeanDecreaseAccuracy MeanDecreaseGini
CT            0.0022352025 0.003829344         0.0030311246         5.184427
DP            0.0069461974 0.016387520         0.0116650960        15.440624
DY            0.0141150255 0.026031690         0.0200603555        19.901538
FC            0.0024279188 0.005158945         0.0037948155         5.527078
NE            0.0352705133 0.070503233         0.0527718526        46.278504
NW            0.0256059127 0.034433862         0.0299981496        26.440402
QT            0.0037228694 0.008181262         0.0059571350         9.308828
SK            0.0048187014 0.008895719         0.0068609174        10.662129
TA            0.0042134249 0.011746533         0.0079851331        12.878367
WC            0.0177155268 0.014981440         0.0163366320        14.240232
WD            0.0232972311 0.034083695         0.0286702065        25.335182
WG            0.0328547215 0.053142508         0.0429480441        30.663749
WW            0.0093983693 0.006377956         0.0078681474         7.250101
YG            0.0051691399 0.007338639         0.0062618144        11.084111
num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
position      0.0025375614 0.011566496         0.0070255302        10.070505
freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
freq_intra    0.0009449492 0.001943090         0.0014431451         2.611950
log_hypo      0.0004514713 0.001366561         0.0009096419         1.736749
acid_per      0.0125815445 0.023360179         0.0179634375        21.131681
base_per      0.0070077737 0.012196570         0.0096129124        13.675893
charge_per    0.0095668425 0.024125997         0.0168345956        20.969665
hydrophob_per 0.0185736697 0.031941513         0.0252200036        25.994903
polar_per     0.0169369327 0.023633413         0.0202776247        20.890415

On Thu, Apr 29, 2010 at 5:22 AM, Liaw, Andy
<andy_l...@merck.com> wrote:

> Please see the "Detail" section of the help page for the importance()
> function in the randomForest package, and let me know which part of it you
> do not understand.
>
> For boosting, you need to read its documentation and decide for yourself if
> its importance measure is at all comparable to the two in RF.
>
> Andy
>
> ------------------------------
> *From:* Changbin Du [mailto:changb...@gmail.com]
> *Sent:* Wednesday, April 28, 2010 8:58 PM
> *To:* Liaw, Andy
> *Cc:* r-help@r-project.org
> *Subject:* variable importance in Random Forest
>
> Hi, Dear Andy,
>
> I ran randomForest in R and got the following results in variable
> importance:
>
> What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini?
>
> I found they are raw values; they are not scaled to 1, right?
>
> Which column is most similar to the variable rel.influence in Boosting?
>
> Thanks so much!
>
> > fit$importance
>                          0           1 MeanDecreaseAccuracy MeanDecreaseGini
> CT            0.0022352025 0.003829344         0.0030311246         5.184427
> DP            0.0069461974 0.016387520         0.0116650960        15.440624
> DY            0.0141150255 0.026031690         0.0200603555        19.901538
> FC            0.0024279188 0.005158945         0.0037948155         5.527078
> NE            0.0352705133 0.070503233         0.0527718526        46.278504
> NW            0.0256059127 0.034433862         0.0299981496        26.440402
> QT            0.0037228694 0.008181262         0.0059571350         9.308828
> SK            0.0048187014 0.008895719         0.0068609174        10.662129
> TA            0.0042134249 0.011746533         0.0079851331        12.878367
> WC            0.0177155268 0.014981440         0.0163366320        14.240232
> WD            0.0232972311 0.034083695         0.0286702065        25.335182
> WG            0.0328547215 0.053142508         0.0429480441        30.663749
> WW            0.0093983693 0.006377956         0.0078681474         7.250101
> YG            0.0051691399 0.007338639         0.0062618144        11.084111
> num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
> num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
> position      0.0025375614 0.011566496         0.0070255302        10.070505
> freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
> freq_intra    0.0009449492 0.001943090         0.0014431451         2.611950
> log_hypo      0.0004514713 0.001366561         0.0009096419         1.736749
> acid_per      0.0125815445 0.023360179         0.0179634375        21.131681
> base_per      0.0070077737 0.012196570         0.0096129124        13.675893
> charge_per    0.0095668425 0.024125997         0.0168345956        20.969665
> hydrophob_per 0.0185736697 0.031941513         0.0252200036        25.994903
> polar_per     0.0169369327 0.023633413         0.0202776247        20.890415
>
> --
> Sincerely,
> Changbin
> --