Hi Vito (question to the authors of MASS below)

2008/6/30 vito muggeo <[EMAIL PROTECTED]>:
> Dear Haubo,
> many thanks for your reply.
> Yes, you are right: by scaling the "score", I get the same results.
>
> However, it sounds strange to me. I understand that the SE and/or t-ratio
> of the intercepts depend on the scale of the explanatory variable, but I
> do not understand why the t-ratio of the slope also depends on the scale.
> In classical regression models (LM, GLM, ...) the scale of the covariate
> does not affect the t-ratio! As far as I know, classical references (e.g.
> Agresti, 2002) do not discuss such a point. Could you suggest an
> additional reference? Namely, why should I center the explanatory
> variables?
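You are right that in lm() and glm() the t-ratio of the slope does not
depend on the scale of the covariate. A quick sketch (assuming the grade
and score vectors from your original post, quoted at the bottom of this
message) confirms this for the binary version of the model used further
below:

  ## Slope row of the coefficient table for three equivalent codings of
  ## score; the "z value" column is the same (up to numerical precision).
  coef(summary(glm(grade > 3 ~ score, family = binomial)))[2, ]
  coef(summary(glm(grade > 3 ~ I(score - mean(score)), family = binomial)))[2, ]
  coef(summary(glm(grade > 3 ~ I(score / 100), family = binomial)))[2, ]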
The wrong standard errors are a computational issue. Centering is a
statistical and (sometimes) a computational issue. The t-value is just the
ratio of the estimate to its standard error, so when the program does not
give you the correct standard error, the t-value is naturally also wrong.
It is often a good idea to center covariates for inferential purposes, and
a very good reference is Venables' "Exegeses on Linear Models":
http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

I find it natural to expect polr to give the correct standard error without
centering the covariate. I do not know why it does not, but perhaps it has
something to do with the scaling of the parameters passed to the BFGS
optimizer in optim. Optimization of the intercepts seems to be performed on
a transformed scale, something like

  summary(fm1 <- polr(factor(grade) ~ score))
  cumsum(c(fm1$zeta[1], exp(fm1$zeta[-1])))

which seems to distort the whole variance-covariance matrix of the
parameters and therefore also gives a wrong standard error for the score
parameter. Optimization is much better scaled if you do

  summary(fm2 <- polr(factor(grade) ~ 1 + I(score - median(score))))
  cumsum(c(fm2$zeta[1], exp(fm2$zeta[-1])))

This is merely a guess, and perhaps the authors of MASS could provide some
clarification?

One other thing that puzzles me is that the estimate/standard-error ratio
is reported as a t-value rather than a z-value. Likelihood theory (as I
know it) suggests a z-value rather than a t-value. A z-value is also what
is reported in the special case of ordinary logistic regression,

  summary(glm(grade > 3 ~ score, family = binomial))

as well as by lrm in package Design.
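For what it is worth, approximate p-values can then be computed by hand
from the summary table. The following is only a sketch, using the centered
fit fm2 from above and referring the reported "t value" column to a
standard normal (the usual large-sample Wald approximation):

  ## Extract the coefficient table (columns: Value, Std. Error, t value)
  ## and append normal-based two-sided p-values.
  ctab <- coef(summary(fm2))
  pvals <- 2 * pnorm(-abs(ctab[, "t value"]))
  cbind(ctab, "p value" = pvals)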
> Of course profile-lik based CIs may be very useful to this aim.. but this
> is another story..

I agree, but the topic is closely related to standard errors ;-)

Best
Rune

>
> many thanks again,
> vito
>
> Rune Haubo wrote:
>>
>> Dear Vito
>>
>> No, you are not wrong, but you should center score prior to model
>> estimation:
>>
>> summary(fm1 <- polr(factor(grade) ~ I(score - mean(score))))
>>
>> which gives the same standard errors as does lrm. Now the intercepts
>> refer to the mean score rather than a potentially unrealistic score
>> of 0.
>>
>> You can always check whether the standard errors are appropriate by
>> plotting the profile likelihood and the normal approximation (based on
>> the standard errors):
>>
>> x <- seq(-.043 - .04, -.043 + .04, len = 20)
>> dev <- double(20)
>> for(i in 1:20)
>>   dev[i] <- polr(factor(grade) ~ 1 +
>>     offset(x[i] * (score - median(score))))$deviance
>>
>> yy <- spline(x, dev, n = 100, method = "natural")
>> plot(yy[[1]], exp(-(yy[[2]] - min(yy[[2]])) / 2), type = "l",
>>      xlab = "Score", ylab = "Normalized profile likelihood")
>> ## Add normal approximation:
>> lines(yy[[1]], exp(-(yy[[1]] - coef(fm1))^2 / (2 * vcov(fm1)[1, 1])),
>>       lty = 2, col = "red")
>>
>> ## Add confidence limits:
>> level <- c(0.95, 0.99)
>> lim <- sapply(level, function(x) exp(-qchisq(x, df = 1) / 2))
>> abline(h = lim, col = "grey")
>>
>> ## Add confidence interval points:
>> points(confint(fm1), rep(lim[1], 2), pch = 3)
>>
>> So these standard errors are the most appropriate you can get, but
>> because the profile likelihood is not symmetric, you could consider
>> reporting the asymmetric profile likelihood confidence interval
>> instead.
>>
>> Hope this helps
>> Rune
>>
>> 2008/6/30 vito muggeo <[EMAIL PROTECTED]>:
>>>
>>> Dear all,
>>> It appears that MASS::polr() and Design::lrm() return the same point
>>> estimates but different standard errors when fitting proportional
>>> odds models:
>>>
>>> grade <- c(4,4,2,4,3,2,3,1,3,3,2,2,3,3,2,4,2,4,5,2,1,4,1,2,5,3,4,2,2,1)
>>> score <- c(525,533,545,582,581,576,572,609,559,543,576,525,574,582,574,
>>>            471,595,557,557,584,599,517,649,584,463,591,488,563,553,549)
>>>
>>> library(MASS)
>>> library(Design)
>>>
>>> summary(polr(factor(grade) ~ score))
>>> lrm(factor(grade) ~ score)
>>>
>>> It seems to me that the results from lrm() are right..
>>>
>>> Am I wrong?
>>> Thanks,
>>> vito
>>>
>>> --
>>> ====================================
>>> Vito M.R. Muggeo
>>> Dip.to Sc Statist e Matem `Vianelli'
>>> Università di Palermo
>>> viale delle Scienze, edificio 13
>>> 90128 Palermo - ITALY
>>> tel: 091 6626240
>>> fax: 091 485726/485612
>>> http://dssm.unipa.it/vmuggeo
>>
>
> --
> ====================================
> Vito M.R. Muggeo
> Dip.to Sc Statist e Matem `Vianelli'
> Università di Palermo
> viale delle Scienze, edificio 13
> 90128 Palermo - ITALY
> tel: 091 6626240
> fax: 091 485726/485612
> http://dssm.unipa.it/vmuggeo
> ====================================
>

--
Rune Haubo Bojesen Christensen
Master Student, M.Sc.
Phone: (+45) 30 26 45 54

Informatics and Mathematical Modelling
Section for Statistics
Technical University of Denmark, Build. 321,
DK-2800 Kgs. Lyngby, Denmark

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.