[cc'ing back to r-help again -- I do this so the answers can be archived and viewed by others]
On 12-02-02 02:41 PM, Sally Luo wrote:
> Prof. Bolker,
>
> Thanks for your quick reply and detailed explanation.
>
> I also ran the unrestricted model using
>   glmfit <- glm(y~x1+x2+x3+x4+x5+x6+x7+x8,
>                 family=binomial(link="probit"), data=d).
> However, the results I got from glm and mle2 (both for the
> unrestricted model) are not very similar (please see below). In your
> earlier example, both glm and mle2 produce almost the same estimation
> results. I just hope to figure out what might cause the discrepancy
> in the estimation results I've got.
>
> > coef(summary(glmfit))
>                  Estimate   Std. Error    z value     Pr(>|z|)
> (Intercept) -0.853900059 0.2464179864 -3.4652505 5.297377e-04
> x1           1.627125691 0.3076174699  5.2894450 1.226881e-07
> x2          -0.092716326 0.5229866504 -0.1772824 8.592866e-01
> x3          -3.301509522 0.9169991843 -3.6003407 3.178004e-04
> x4           7.187483436 2.2135961171  3.2469715 1.166401e-03
> x5          -0.002544181 0.0112740324 -0.2256673 8.214602e-01
> x6           6.978374268 2.2347939216  3.1226030 1.792594e-03
> x7          -0.009832379 0.0113807583 -0.8639476 3.876167e-01
> x8          -0.001252075 0.0002304789 -5.4324941 5.557178e-08
>
> > coef(summary(mle2fit))
>                          Estimate  Std. Error     z value        Pr(z)
> pprobit.(Intercept) -0.603492668 0.230117071  -2.6225463 8.727541e-03
> pprobit.x1           1.645984346 0.288479906   5.7057158 1.158552e-08
> pprobit.x2          -0.157361533 0.523048376  -0.3008546 7.635253e-01
> pprobit.x3          -3.935203692 0.932692587  -4.2191862 2.451857e-05
> pprobit.x4           7.512701611 0.062911076 119.4177885 0.000000e+00
> pprobit.x5          -0.001475556 0.011525137  -0.1280293 8.981258e-01
> pprobit.x6           7.399355063 0.018372749 402.7353318 0.000000e+00
> pprobit.x7          -0.010113008 0.011647725  -0.8682388 3.852636e-01
> pprobit.x8          -0.001650021 0.000244997  -6.7348622 1.640854e-11

My best guess is that you are running into optimization problems. The
big advantage of glm() is that it uses a special-purpose optimization
method (iteratively reweighted least squares) that is generally much
more robust/reliable than general-purpose nonlinear optimizers such as
nlminb. If there is a GLM fitting routine coded in R somewhere that
someone has adapted to work with box constraints, it will probably
perform better than mle2.

Some general suggestions for troubleshooting this (see the sketch
after this list):

 * Check the log-likelihoods returned by the two methods. If they are
   very close (say within 0.01 likelihood units), then the issue is
   that you just have a very flat goodness-of-fit surface, and the two
   sets of coefficients give fits that are in practice
   indistinguishable.
 * If possible, try starting each approach (glm(), mle2()) from the
   solution found by the other (it's a little bit of a pain to get the
   syntax right here) and see whether each gets stuck right where it
   started or whether both converge to one answer or the other.
 * If you were using one of the optimizing methods from optim() (rather
   than nlminb), e.g. L-BFGS-B, I would suggest you try using parscale
   to rescale the parameters to have approximately equal magnitudes
   near the solution. This apparently isn't possible with nlminb, but
   you could try optimizer="optim" (the default) with method="L-BFGS-B"
   and see how you do (although L-BFGS-B is often a bit finicky).
   Alternatively, you can try optimizer="optimx", in which case you
   have a larger variety of box-constrained optimizers to choose from
   (you have to install the optimx package and take a look at its
   documentation). Alternatively, you can scale your input variables
   (e.g. use scale() on your input matrix to get zero-centered, sd 1
   variables), although you would then have to adjust your lower and
   upper bounds accordingly.
 * It's a bit more work, but you may be able to unpack this a bit and
   provide analytical derivatives (also sketched below). That would
   help a lot.

In short: you are entering the quagmire of numerical optimization
methods. I have learned most of this stuff by trial and error -- can
anyone on the list suggest a good/friendly introduction? (Press et al.,
Numerical Recipes; Givens and Hoeting's Computational Statistics book
looks good, although I haven't read it ...)

  Ben Bolker
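
[Editor's note: a minimal sketch of the suggestions above, assuming
the 'glmfit', 'mle2fit', and data frame 'd' objects from this thread.
The object names mle2_restart, mle2_lbfgsb, nll, gr, and fit are
illustrative; as Ben notes, the exact syntax for vector-valued
starting values in mle2 may need tweaking.]

library(bbmle)

## 1. Compare log-likelihoods: if the two fits are within ~0.01
##    log-likelihood units, the surface is just flat near the optimum.
logLik(glmfit)
logLik(mle2fit)

## 2. Restart mle2 from the glm solution (conversely, glm() can be
##    started from the mle2 solution via its own 'start' argument).
##    If a vector start is rejected, fall back to a scalar start.
mle2_restart <- mle2(y ~ dbinom(pnorm(pprobit), size = 1),
                     parameters = list(pprobit ~ x1+x2+x3+x4+x5+x6+x7+x8),
                     start = list(pprobit = unname(coef(glmfit))),
                     data = d)

## 3. L-BFGS-B via optim(), with parscale rescaling the parameters to
##    comparable magnitudes near the (approximate) solution.
mle2_lbfgsb <- mle2(y ~ dbinom(pnorm(pprobit), size = 1),
                    parameters = list(pprobit ~ x1+x2+x3+x4+x5+x6+x7+x8),
                    start = list(pprobit = unname(coef(glmfit))),
                    method = "L-BFGS-B",
                    lower = c(-Inf, -1, -1, -1, rep(-Inf, 5)),
                    upper = c( Inf,  1,  1,  1, rep( Inf, 5)),
                    control = list(parscale = abs(coef(glmfit))),
                    data = d)

## 4. Hand-rolled negative log-likelihood with an analytical gradient,
##    passed directly to optim(); hessian=TRUE returns the observed
##    information matrix for standard errors (valid only when the
##    optimum is in the interior, not on a boundary).
X <- model.matrix(~ x1+x2+x3+x4+x5+x6+x7+x8, data = d)
nll <- function(beta) {
    p <- pnorm(drop(X %*% beta))
    -sum(dbinom(d$y, size = 1, prob = p, log = TRUE))
}
gr <- function(beta) {
    eta <- drop(X %*% beta)
    p   <- pnorm(eta)
    ## d(-logLik)/d(beta) for a binary probit model
    -drop(t(X) %*% ((d$y - p) * dnorm(eta) / (p * (1 - p))))
}
fit <- optim(coef(glmfit), nll, gr, method = "L-BFGS-B",
             lower = c(-Inf, -1, -1, -1, rep(-Inf, 5)),
             upper = c( Inf,  1,  1,  1, rep( Inf, 5)),
             hessian = TRUE)
se <- sqrt(diag(solve(fit$hessian)))  # Wald SEs; unreliable on a boundary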

> On Thu, Feb 2, 2012 at 1:12 PM, Ben Bolker <bbol...@gmail.com> wrote:
>
>> [cc'ing back to r-help]
>>
>> On 12-02-02 01:56 PM, Sally Luo wrote:
>>
>>> I tried to adapt your code to my model and got the results as below.
>>> I don't know how to fix the warning messages. It says "rearrange the
>>> lower (or upper) bounds to match 'start'".
>>
>> The warning is overly conservative in this case. I should work on
>> engineering the package so that it handles this better. You can
>> disregard them.
>>
>> In answer to your previous questions:
>>
>> * "size" refers to the number of trials per observation (1, if you
>>   have binary data)
>> * you've got the form of the lower and upper bounds right.
>> * you've got the formula in 'parameters' right -- this builds a
>>   linear model (using R's model.matrix) on the probit scale based on
>>   the 8 predictors
>>
>>> And two of the estimates for my restricted parameters are on the
>>> boundary. The warning message says the variance-covariance
>>> calculations may be unreliable. Those parameters are the ones of
>>> interest to my study. Can I still make inferences using the
>>> p-values reported by mle2 in this case?
>>
>> That's quite tricky unfortunately, and it isn't a problem that's
>> specific to the bbmle package. The basic issue is that the whole
>> derivation of the multivariate normal sampling distribution of the
>> maximum likelihood estimator depends on the maximum likelihood being
>> an interior local maximum (and hence having a negative-definite
>> Hessian, or a positive-definite information matrix), which is untrue
>> on the boundary -- the Wikipedia article on maximum likelihood
>> mentions this issue, for example
>> http://en.wikipedia.org/wiki/Maximum_likelihood
>>
>> Perhaps someone here can suggest an approach (although it gets
>> outside the scope of "R help"), or you can ask on
>> http://stats.stackexchange.com ...
>>
>>> Thanks for your help. ---- Sally
>>>
>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>> +               parameters=list(pprobit~x1+x2+x3+x4+x5+x6+x7+x8),
>>> +               start=list(pprobit=0),
>>> +               optimizer="nlminb",
>>> +               lower=c(-Inf,-1,-1,-1,-Inf,-Inf,-Inf,-Inf,-Inf),
>>> +               upper=c(Inf,1,1,1,Inf,Inf,Inf,Inf,Inf),
>>> +               data=d)
>>>
>>> Warning messages:
>>> 1: In fix_order(call$lower, "lower bounds", -Inf) :
>>>   lower bounds not named: rearranging to match 'start'
>>> 2: In fix_order(call$upper, "upper bounds", Inf) :
>>>   upper bounds not named: rearranging to match 'start'
>>> 3: In mle2(y ~ dbinom(pnorm(pprobit), size = 1), parameters =
>>>   list(pprobit ~ :
>>>   some parameters are on the boundary: variance-covariance
>>>   calculations based on Hessian may be unreliable
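
[Editor's note: to make the model.matrix point above concrete -- a
minimal sketch of how a 'parameters' formula like pprobit ~ x1 + x2
expands, using a small hypothetical data frame (not from the original
post). mle2 builds the probit-scale linear predictor the same way:]

## model.matrix() turns the formula into a design matrix with an
## intercept column, so pprobit ~ x1 + x2 implies
##   pprobit = b0 + b1*x1 + b2*x2
set.seed(1)
dd   <- data.frame(x1 = rnorm(5), x2 = rnorm(5))  # hypothetical data
X    <- model.matrix(~ x1 + x2, data = dd)        # 5 x 3 design matrix
beta <- c(-0.5, 1, 2)              # (Intercept), x1, x2 coefficients
eta  <- drop(X %*% beta)           # probit-scale linear predictor
p    <- pnorm(eta)                 # per-observation success probability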

>>> On Wed, Feb 1, 2012 at 11:16 PM, Sally Luo <shali...@gmail.com> wrote:
>>>
>>>> Prof. Bolker,
>>>>
>>>> Thanks a lot for your reply.
>>>>
>>>> In my model, I have 9 explanatory variables and I need to restrict
>>>> the range of parameters 2-4 to (-1,1). I tried to modify the
>>>> univariate probit example you gave in your reply; however, I could
>>>> not get it to work.
>>>>
>>>> Specifically, I am not sure what 'pprobit' represents in your code.
>>>> How should I code this part if I have more than one variable?
>>>>
>>>> Also, does "size" refer to the number of parameters?
>>>>
>>>> Since only 3 parameters need to be restricted in my model, should I
>>>> write lower=c(-Inf,-1,-1,-1,-Inf,-Inf,-Inf,-Inf,-Inf) and
>>>> upper=c(Inf,1,1,1,Inf,Inf,Inf,Inf,Inf)?
>>>>
>>>> Thanks again for your kind help.
>>>>
>>>> Best,
>>>>
>>>> Sally
>>>>
>>>> On Wed, Feb 1, 2012 at 7:19 AM, Ben Bolker <bbol...@gmail.com> wrote:
>>>>
>>>>> Sally Luo <shali623 <at> gmail.com> writes:
>>>>>
>>>>>> Dear R helpers,
>>>>>>
>>>>>> I need to estimate a probit model with box constraints placed on
>>>>>> several of the model parameters. I have the following two
>>>>>> questions:
>>>>>>
>>>>>> 1) How are the standard errors calculated in
>>>>>> glm(family=binomial(link="probit"))? I ran a typical probit
>>>>>> model using the glm probit link and the nlminb function with my
>>>>>> own coding of the log-likelihood, separately. As nlminb does not
>>>>>> produce the Hessian matrix, I used hessian() (from numDeriv) to
>>>>>> calculate it. However, the standard errors calculated using the
>>>>>> hessian function are quite different from the ones generated by
>>>>>> the glm function, although the parameter estimates are very
>>>>>> close. I was wondering what makes this difference in the
>>>>>> estimation of standard errors and how this computation is
>>>>>> carried out in glm.
>>>>>>
>>>>>> 2) Does anyone know how to estimate a constrained probit model
>>>>>> in R (to be specific, I need to restrict the range of three
>>>>>> parameters to [-1,1])? Among the optimization functions, so far
>>>>>> nlminb and spg work for my problem, but neither produces a
>>>>>> Hessian matrix. As I mentioned above, if I use the hessian
>>>>>> function and calculate standard errors manually, the standard
>>>>>> errors seem not right.
>>>>>
>>>>> I'm a little biased, but I think the bbmle package is the easiest
>>>>> way to get this done -- it provides convenient wrappers for a
>>>>> range of optimizers including nlminb. I would warn, however, that
>>>>> you should be very careful interpreting the meaning of the
>>>>> Hessian matrix if some of your parameters lie on the boundary of
>>>>> the feasible space ...
>>>>>
>>>>> set.seed(101)
>>>>> x <- runif(100)
>>>>> p <- pnorm(1+3*x)                  # true probit-scale model
>>>>> y <- rbinom(100,prob=p,size=1)     # binary (Bernoulli) responses
>>>>> d <- data.frame(x,y)
>>>>> glmfit <- glm(y~x,family=binomial(link="probit"),data=d)
>>>>> coef(summary(glmfit))
>>>>>
>>>>> library(bbmle)
>>>>> ## same model via mle2: pprobit is the probit-scale linear
>>>>> ## predictor, modeled as a linear function of x
>>>>> mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>>>                 parameters=list(pprobit~x),
>>>>>                 start=list(pprobit=0),
>>>>>                 data=d)
>>>>> coef(summary(mle2fit))
>>>>>
>>>>> ## now add constraints (lower bounds on intercept and slope)
>>>>> mle2fit_constr <- mle2(y~dbinom(pnorm(pprobit),size=1),
>>>>>                        parameters=list(pprobit~x),
>>>>>                        start=list(pprobit=0),
>>>>>                        optimizer="nlminb",
>>>>>                        lower=c(2,0.15),
>>>>>                        data=d)
>>>>>
>>>>> coef(summary(mle2fit_constr))

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.