Re: [R] Correcting for overdispersion

peter dalgaard Mon, 09 Jul 2012 13:35:46 -0700

On Jul 9, 2012, at 21:08 , Lawrence, Adaku wrote:

> Hello,
> Thanks for getting back to me. I was of the impression that once the res. 
> var. is larger than the df then the data was overdispersed and as such the 
> model was not a best fit. Is this true?

Not without qualification. There are various schools of thought, but if you 
ask me, overdispersion models are used a bit too often without proper 
attention to what they actually mean. Sometimes the effect is (unwittingly) to 
paper over systematic lack of fit in the model (judging by your residuals, 
that's not likely the case here, though). 

To use such models you should have evidence of lack of fit and/or a plausible 
reason for the extra variation. 

Re. evidence, you have a deviance of 7.31 on 4 df which corresponds to a p 
value of 0.12 in the asymptotic chi-square distribution. So, not exactly 
convincing; also, you need to consider whether the expected counts are large 
enough for the asymptotics to hold.
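
The p value above can be reproduced from the residual deviance and degrees of 
freedom printed in the quoted summary output. A minimal sketch; the variable 
names are only illustrative:

```r
# Residual deviance and df, copied from the quoted summary(model) output
dev <- 7.311
df  <- 4

# Upper-tail probability in the asymptotic chi-square distribution
p_value <- pchisq(dev, df, lower.tail = FALSE)
round(p_value, 2)   # 0.12

# Crude deviance-based dispersion estimate (deviance / df).
# Note: summary() of a quasibinomial fit uses the Pearson statistic instead,
# so its reported dispersion will generally differ somewhat from this value.
disp_dev <- dev / df
disp_dev
```

This is only the asymptotic test; it says nothing about whether the expected 
counts are large enough for the chi-square approximation to be trusted.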

Re. plausibility, you should ask yourself whether there is good reason to 
have an extra random effect operating at the level of individual binomial 
distributions. This could be the case if you have an experiment of the sort 
where you give, say, a dose of pesticide to containers of 50 flies, and count 
the dead ones. In that case, there could be effects of getting the dose 
slightly wrong, the temperature of the container, and whatnot. If on the other 
hand, you inject a batch of rats with a dose from a randomly chosen vial, each 
of which contains a carefully and individually measured-out dose, then it could 
be quite hard to think of a reason for something increasing or decreasing the 
probability for all rats at the same dose.
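
To illustrate the fly-container scenario, here is a small simulation sketch. 
All numbers in it (2000 containers, a kill probability of 0.3, a random effect 
with sd 0.5 on the logit scale) are arbitrary assumptions chosen for 
illustration, not values from this thread:

```r
set.seed(1)

n_containers <- 2000   # hypothetical number of containers at one nominal dose
n_flies      <- 50     # flies per container, as in the example above
p_nominal    <- 0.3    # assumed kill probability at the nominal dose

# (a) Pure binomial: every container has exactly the same probability
dead_fixed <- rbinom(n_containers, n_flies, p_nominal)

# (b) Container-level random effect on the logit scale
#     (sd = 0.5 is an arbitrary illustrative value)
p_varying  <- plogis(qlogis(p_nominal) + rnorm(n_containers, sd = 0.5))
dead_extra <- rbinom(n_containers, n_flies, p_varying)

# Ratio of the observed variance to the binomial variance n * p * (1 - p)
binom_var   <- n_flies * p_nominal * (1 - p_nominal)
ratio_fixed <- var(dead_fixed) / binom_var   # close to 1
ratio_extra <- var(dead_extra) / binom_var   # clearly larger than 1
c(fixed = ratio_fixed, extra = ratio_extra)
```

In case (b) the extra variation has a concrete source, namely the probability 
itself differing between containers, which is the kind of mechanism one would 
want to be able to name before fitting an overdispersed model.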

That being said, as far as I can tell, there's no problem in principle with 
using dose.p on an overdispersed model, because it only depends on vcov(obj). 
An overdispersion parameter based on 4 df is the most worrying bit.
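
A sketch of what that looks like in practice. The data below are invented for 
illustration (the original conc, dead and n were not posted), so only the 
mechanics carry over: dose.p() from MASS gives the same point estimate for the 
binomial and quasibinomial fits, and the SE is scaled by the square root of the 
estimated dispersion because vcov() is.

```r
library(MASS)  # provides dose.p()

# Hypothetical dose-response data (NOT the poster's data, which was not given)
conc <- c(1e4, 1e5, 3e5, 1e6, 3e6, 1e7)
n    <- rep(50, 6)
dead <- c(2, 9, 14, 30, 33, 46)
y    <- cbind(dead, n - dead)

fit_bin   <- glm(y ~ log(conc), family = binomial)
fit_quasi <- glm(y ~ log(conc), family = quasibinomial)

ld_bin   <- dose.p(fit_bin,   p = 0.5)
ld_quasi <- dose.p(fit_quasi, p = 0.5)

# Same estimate of log(LD50); SE inflated by sqrt(dispersion)
phi      <- summary(fit_quasi)$dispersion
se_bin   <- as.numeric(attr(ld_bin,   "SE"))
se_quasi <- as.numeric(attr(ld_quasi, "SE"))
c(log_ld50 = as.numeric(ld_quasi), se_bin = se_bin,
  se_quasi = se_quasi, sqrt_phi = sqrt(phi))

# How imprecise is a dispersion estimate on 4 df? If the scaled statistic
# were exactly chi-square on 4 df, estimate / true value would fall in this
# range 95% of the time (roughly 0.12 to 2.8).
disp_range <- qchisq(c(0.025, 0.975), df = 4) / 4
disp_range

# Wald interval for the LD50 using the numbers in the quoted transcript.
# dose.p() works on the log(conc) scale here, so the limits are
# back-transformed with exp(); the same SE must enter both limits.
est <- 13.664152
se  <- 0.4703530
ci_transcript <- exp(est + c(-1, 1) * 1.96 * se)
ci_transcript
```

Note that the lower limit in the quoted transcript was computed with 
0.04703530 rather than 0.4703530, which is why it sits so close to the point 
estimate; with the SE as printed by dose.p the interval is considerably wider 
on the lower side.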

-pd

> Here is an example of the output from R:
> Call:
> glm(formula = y ~ log(conc), family = binomial)
> Deviance Residuals: 
>        1         2         3         4         5         6  
>  0.54568   1.08474   0.04561  -2.00959   0.05772   1.33891  
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)    
> (Intercept) -5.52815    0.85916  -6.434 1.24e-10 ***
> log(conc)    0.40457    0.05938   6.813 9.56e-12 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> (Dispersion parameter for binomial family taken to be 1)
>     Null deviance: 78.811  on 5  degrees of freedom
> Residual deviance:  7.311  on 4  degrees of freedom
> AIC: 30.45
> Number of Fisher Scoring iterations: 4
> > 
> > xv<-seq(min(log(conc)-1),max(log(conc)+1),0.01)
> > lines(xv,predict(model,list(conc=exp(xv)),type="response"))
> > 
> > dose.p(model,p=c(0.10,0.25,0.5,0.75,0.90))
>                Dose        SE
> p = 0.10:  8.233179 0.9810446
> p = 0.25: 10.948665 0.6580127
> p = 0.50: 13.664152 0.4703530
> p = 0.75: 16.379638 0.5720159
> p = 0.90: 19.095125 0.8665399
> > exp(13.664152)
> [1] 859539.4
> > exp(13.664152+(1.96*0.4703530))
> [1] 2160918
> > exp(13.664152-(1.96*0.04703530))
> [1] 783842
> BW
> Adaku
> ________________________________________
> From: peter dalgaard [pda...@gmail.com]
> Sent: 09 July 2012 20:03
> To: Lawrence, Adaku
> Cc: r-help@r-project.org
> Subject: Re: [R] Correcting for overdispersion
> 
> On Jul 9, 2012, at 20:23 , Lawrence, Adaku wrote:
> 
> > Hello,
> >
> > I am trying to determine LD50 and LD95 using dose.p in MASS however some of 
> > the Residual variance is larger than the degrees of freedom. Please can 
> > anyone help with any advice as to how i can correct for this?
> 
> Er, in what sense is that a problem? Your code is not reproducible; at least 
> some output to look at might help.
> 
> -pd
> 
> >
> > Here is the model as inputted into R
> >
> >
> >
> > y<-cbind(dead,n-dead)
> >
> > model<-glm(y~log(conc),binomial)
> > summary(model)
> >
> > xv<-seq(min(log(conc)-1),max(log(conc)+1),0.01)
> > lines(xv,predict(model,list(conc=exp(xv)),type="response"))
> >
> > dose.p(model,p=c(0.10,0.25,0.5,0.75,0.90,0.95))
> >
> >
> >
> > Thanks
> >
> > Adaku
> >
> >
> >
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd....@cbs.dk  Priv: pda...@gmail.com

