Had several instances of similar issues with weighted least squares and weighted logistic regression recently. It is important to remember that the log-likelihoods for the weighted models include a term that is a function of the n weights, and that this term is not incorporated in the deviances (or sums of squares, for least-squares regression).
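Brian's point can be sketched numerically (Python here, purely as a self-contained arithmetic check; the grouped counts and fitted probabilities below are made up for illustration): the binomial log-likelihood carries a log C(n, k) term that depends only on the counts/weights, and that term cancels out of the deviance, so -deviance/2 recovers the log-likelihood only up to a weight-dependent constant.

```python
import math

def log_choose(n, k):
    # log of the binomial coefficient C(n, k), via lgamma
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def binom_loglik(k, n, p):
    # log-likelihood of k successes in n trials at probability p
    return log_choose(n, k) + k * math.log(p) + (n - k) * math.log(1 - p)

# Made-up grouped binomial data: k successes out of n trials per group,
# with fitted probabilities mu from some hypothetical model.
k  = [3, 1, 4]
n  = [10, 8, 12]
mu = [0.28, 0.15, 0.33]

ll = sum(binom_loglik(ki, ni, mi) for ki, ni, mi in zip(k, n, mu))

# Saturated model: each group gets its own p_i = k_i / n_i.
ll_sat = sum(binom_loglik(ki, ni, ki / ni) for ki, ni in zip(k, n))

deviance = 2 * (ll_sat - ll)

# -deviance/2 equals ll - ll_sat, not ll; ll_sat carries the
# log C(n, k) terms, a constant in the counts that the deviance drops.
print(-deviance / 2 - ll)   # equals -ll_sat, not zero
```

The constant cancels when comparing two models on the same data, which is why deviance *differences* are still fine; the absolute values are what shift.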
Brian

Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818
email: brian_c...@usgs.gov
tel: 970 226-9326

From: Andrew Miles <rstuff.mi...@gmail.com>
To: Mark Leeds <marklee...@gmail.com>
Cc: r-help@r-project.org
Date: 05/31/2012 04:09 PM
Subject: Re: [R] Higher log-likelihood in null vs. fitted model
Sent by: r-help-boun...@r-project.org

Interesting. If you take the deviances printed by the fitted model (including the null deviance) and back-engineer them into log-likelihoods, you get results identical to Stata's.

> m$deviance*(-.5)
[1] -390.9304
> m$null.deviance*(-.5)
[1] -393.064

However, using the deviance to calculate the AIC by hand does not reproduce the AIC that the model returns.

> m$deviance + 2*2
[1] 785.8608
> m$aic
[1] 764.3816

The difference seems to lie in how glm.fit() calculates the AIC it returns, which logLik() then uses to return the log-likelihood, as Mark indicated. The function is:

binomial()$aic
function (y, n, mu, wt, dev)
{
    m <- if (any(n > 1)) n else wt
    -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m * y),
        round(m), mu, log = TRUE))
}

glm(), on the other hand, calculates the deviance from the sum of the squared deviance residuals. This gives me a sense of *how* log-likelihoods calculated from the deviance and from the model AIC could differ (they are simply computed in different ways), but it is still not clear to me *why* the two approaches differ. And they only differ when I use weights with the data.

More importantly, it makes me wonder which is more reliable. In my case, the deviance-residuals approach seems more reliable in that (a) it gives a sensible result (a model with a predictor has a higher log-likelihood than the null model), and (b) it matches an independent estimate from Stata. But is that always the case? And if so, why use the other approach at all?
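The arithmetic inside binomial()$aic can be mirrored outside R (a rough Python port written for illustration, not taken verbatim from R's sources; the outcomes, weights, and fitted probabilities are made up): with 0/1 outcomes and non-integer prior weights, the round(m * y) and round(m) calls mean the computation is no longer the plain weighted Bernoulli log-likelihood that the deviance is built from, which is why the two quantities diverge only in the weighted case.

```python
import math

def dbinom_log(k, n, p):
    # like R's dbinom(k, n, p, log = TRUE)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def binomial_aic(y, n, mu, wt):
    # Rough port of R's binomial()$aic: if any n > 1, the binomial sizes n
    # play the role of m; otherwise the prior weights wt do.
    m = n if any(ni > 1 for ni in n) else wt
    return -2 * sum((wi / mi) * dbinom_log(round(mi * yi), round(mi), mui)
                    for yi, mi, mui, wi in zip(y, m, mu, wt) if mi > 0)

# 0/1 outcomes with non-integer case weights (made-up values):
y  = [1, 0, 0, 1]
n  = [1, 1, 1, 1]
wt = [1.3, 0.8, 2.1, 0.9]
mu = [0.30, 0.25, 0.20, 0.35]

# What m$aic is built from (glm then adds 2 * rank):
aic_part = binomial_aic(y, n, mu, wt)

# The plain weighted Bernoulli kernel that the deviance is built from:
dev_part = -2 * sum(wi * (yi * math.log(mui) + (1 - yi) * math.log(1 - mui))
                    for yi, mui, wi in zip(y, mu, wt))

# With integer weights these agree; the round() calls make them diverge here.
print(aic_part, dev_part)
```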
On Thu, May 31, 2012 at 9:55 AM, Mark Leeds <marklee...@gmail.com> wrote:

> Hi Duncan: I don't know if the following helps, but I checked the code,
> and logLik defines the log-likelihood as (p - glmobject$aic/2), where p
> is glmobject$rank. So the reason the likelihood is lower is that, in the
> null model, it ends up being (1 - glmobject$aic/2), and in the fitted
> model it ends up being (2 - glmobject$aic/2).
>
> So
>
> 2 - 764.4/2 = -380.2 and
> 1 - 761.9/2 = -379.95 (close enough for govt work)
>
> So that's where the numbers are coming from, but it really depends on
> how AIC is defined. Likelihoods should not involve degrees of freedom
> (at least not where they make the likelihood smaller, as in the example
> above), so maybe backing the likelihood out of the AIC is the issue?
> (AIC = -2 * likelihood + 2p, so p - AIC/2 = likelihood.) AIC is a
> function of the likelihood, but, as far as I know, the likelihood is not
> a function of the AIC.
>
> Thanks for any insight.
>
> On Thu, May 31, 2012 at 9:26 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>
>> On 12-05-31 8:53 AM, Andrew Miles wrote:
>>
>>> Two related questions.
>>>
>>> First, I am fitting a model with a single predictor, and then a null
>>> model with only the intercept. In theory, the fitted model should
>>> have a higher log-likelihood than the null model, but that does not
>>> happen. See the output below. My first question is, how can this
>>> happen?
>>
>> I suspect you'll need to give sample data before anyone can really
>> help with this.
>>
>>>> m
>>>
>>> Call:  glm(formula = school ~ sv_conform, family = binomial, data = dat,
>>>     weights = weight)
>>>
>>> Coefficients:
>>> (Intercept)   sv_conform
>>>     -2.5430       0.2122
>>>
>>> Degrees of Freedom: 1488 Total (i.e. Null);  1487 Residual
>>> Null Deviance:      786.1
>>> Residual Deviance: 781.9        AIC: 764.4
>>>
>>>> null
>>>
>>> Call:  glm(formula = school ~ 1, family = binomial, data = dat,
>>>     weights = weight)
>>>
>>> Coefficients:
>>> (Intercept)
>>>      -2.532
>>>
>>> Degrees of Freedom: 1488 Total (i.e. Null);  1488 Residual
>>> Null Deviance:      786.1
>>> Residual Deviance: 786.1        AIC: 761.9
>>>
>>>> logLik(m); logLik(null)
>>>
>>> 'log Lik.' -380.1908 (df=2)
>>> 'log Lik.' -379.9327 (df=1)
>>>
>>> My second question grows out of the first. I ran the same two models
>>> on the same data in Stata and got identical coefficients. However,
>>> the log-likelihoods were different from the ones I got in R, and
>>> followed my expectations; that is, the null model has a lower
>>> log-likelihood than the fitted model. See the Stata model comparison
>>> below. So my question is, why do identical models fit in R and Stata
>>> have different log-likelihoods?
>>
>> That's easier: they use different base measures. The likelihood is
>> only defined up to a multiplicative constant, so the log-likelihoods
>> can have an arbitrary constant added to them and still be valid. But I
>> would have expected both models to use the same base measure, so the
>> differences in log-likelihood should match.
>>
>> Duncan Murdoch
>>
>>> -----------------------------------------------------------------------------
>>>        Model |      Obs    ll(null)   ll(model)     df       AIC        BIC
>>> -------------+---------------------------------------------------------------
>>>         mod1 |     1489    -393.064   -390.9304      2    785.8608   796.4725
>>>         null |     1489    -393.064    -393.064      1    788.1279   793.4338
>>>
>>> Thanks in advance for any input or references.
>>>
>>> Andrew Miles
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
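Pulling the thread's numbers together (taking the quoted figures at face value): Stata's log-likelihoods are -deviance/2, R's logLik() is p - aic/2 exactly as Mark found, and with weights those two starting points are genuinely different quantities, not merely different roundings. A quick arithmetic check:

```python
# Figures quoted in the thread
m_deviance      = 781.8608    # m$deviance (since m$deviance + 2*2 = 785.8608)
m_null_deviance = 786.1280    # m$null.deviance (since -0.5 * it = -393.064)
m_aic           = 764.3816    # m$aic
loglik_m        = -380.1908   # logLik(m), df = 2

# Stata's log-likelihoods are just -deviance/2:
assert abs(-m_deviance / 2 - (-390.9304)) < 1e-4
assert abs(-m_null_deviance / 2 - (-393.0640)) < 1e-4

# Stata's AIC is then -2*ll + 2*df:
assert abs(-2 * (-390.9304) + 2 * 2 - 785.8608) < 1e-4

# R's logLik() is instead backed out of glm's AIC (Mark's p - aic/2):
assert abs((2 - m_aic / 2) - loglik_m) < 1e-4

# The disagreement exists because m$aic is not m$deviance + 2*2 here:
assert (m_deviance + 4) - m_aic > 10
print("all consistent")
```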