Hi Professor Brian, Thanks for your reply.

I think there are many statisticians here, and the question is at least partly R related, so I hope someone can help me. I have done a simple test using a sample CSV data set, which I can post if needed.

donut <- read.csv(file = "D:/donut.csv", header = TRUE)
donut[["color"]] <- as.factor(donut[["color"]])
donut[["shape"]] <- as.factor(donut[["shape"]])
donut[["k"]] <- as.factor(donut[["k"]])
donut[["k0"]] <- as.factor(donut[["k0"]])
donut[["bias"]] <- as.factor(donut[["bias"]])
lr <- glm(color ~ shape + x + y, family = binomial, data = donut)
summary(lr)

Call:
glm(formula = color ~ shape + x + y, family = binomial, data = donut)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.1079  -0.9476   0.5086   0.7518   1.4079

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.53010    1.65500   1.529   0.1263
shape22      0.05628    1.54990   0.036   0.9710
shape23     -0.74568    1.44813  -0.515   0.6066
shape24     -2.61896    1.38016  -1.898   0.0578 .
shape25     -2.07648    1.32818  -1.563   0.1180
x           -0.45885    1.52863  -0.300   0.7640
y           -0.59311    1.46999  -0.403   0.6866
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 50.446  on 39  degrees of freedom
Residual deviance: 42.473  on 33  degrees of freedom
AIC: 56.473

Number of Fisher Scoring iterations: 4

In the Coefficients section, is Pr(>|z|) the p-value for that variable? I also have a few other questions:

1. How do I determine the predictive power of each variable?
2. How do I determine the overall performance of the fitted model, and what is the difference between "Deviance Residuals" and "Residual deviance"?
3. How do I compare "Null deviance" and "Residual deviance"?
4. What does AIC mean, and how should I use this measure?
5. What does the "Signif. codes" section mean?
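
As a rough sketch for questions 3 and 4, the deviance and AIC figures can be related using only the numbers printed in the summary above. This assumes the response is ungrouped binary data, so that the residual deviance equals minus twice the log-likelihood, and it treats the drop in deviance as approximately chi-squared, which is only an approximation with 40 observations. The variable names are made up for illustration.

```r
# Numbers copied from the summary(lr) output above.
null_dev  <- 50.446; null_df  <- 39
resid_dev <- 42.473; resid_df <- 33

# Drop in deviance from the intercept-only model to the fitted model,
# and the number of extra coefficients used to achieve it.
dev_drop <- null_dev - resid_dev   # 7.973
df_drop  <- null_df - resid_df     # 6

# Approximate likelihood-ratio test: a small p-value would suggest that
# the predictors jointly improve on the intercept-only model.
p_value <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)

# For ungrouped binary data (assumed here):
# AIC = residual deviance + 2 * (number of estimated coefficients).
n_coef <- 7                        # rows in the Coefficients table
aic <- resid_dev + 2 * n_coef      # 56.473, matching the printed AIC

print(c(dev_drop = dev_drop, df_drop = df_drop, p_value = p_value, aic = aic))
```

Here the p-value falls between 0.20 and 0.25, so under these assumptions the drop in deviance is not large relative to the six extra coefficients. With the fitted object at hand, anova(lr, test = "Chisq") gives a term-by-term version of this comparison and AIC(lr) returns the AIC directly.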

Regards,

Xiaobo Gu

On Mon, Jun 6, 2011 at 9:59 PM, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
> On Mon, 6 Jun 2011, Xiaobo Gu wrote:
>
>> Hi,
>>
>> I am trying glm with family = binomial to do binary logistic
>> regression, but how can I assess the accuracy of the fitted model?
>> The summary method prints a lot of information about the returned
>> object, such as the coefficients. Statistics is not my speciality,
>> so can you share some rules of thumb for examining the fitted model
>> from a practical perspective?
>
> It depends entirely on why you did the fit. People have written whole books
> on assessing the performance of classification procedures such as binary
> logistic regression. For example, the residual deviance is closely related
> to log-probability scoring: for some purposes that is a good performance
> measure, for others (e.g. when you are going to threshold the predicted
> probabilities) it can be very misleading.
>
> In short, you need statistical advice, not R advice (the purpose of this
> list).
>
>>
>> Regards,
>>
>> Xiaobo Gu
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>