Had several instances of similar issues with weighted least squares and weighted logistic regression recently. It is important to remember that the log-likelihoods for the weighted models include a term that is a function of the n weights, and that this term is not incorporated in the deviances (or sums of squares, for least-squares regression).
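Brian's point can be sketched numerically (Python here, purely as a self-contained arithmetic check; the grouped counts and fitted probabilities below are made up for illustration): the binomial log-likelihood carries a log C(n, k) term that depends only on the counts/weights, and that term cancels out of the deviance, so -deviance/2 recovers the log-likelihood only up to a weight-dependent constant.

```python
import math

def log_choose(n, k):
    # log of the binomial coefficient C(n, k), via lgamma
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def binom_loglik(k, n, p):
    # log-likelihood of k successes in n trials at probability p
    return log_choose(n, k) + k * math.log(p) + (n - k) * math.log(1 - p)

# Made-up grouped binomial data: k successes out of n trials per group,
# with fitted probabilities mu from some hypothetical model.
k  = [3, 1, 4]
n  = [10, 8, 12]
mu = [0.28, 0.15, 0.33]

ll = sum(binom_loglik(ki, ni, mi) for ki, ni, mi in zip(k, n, mu))

# Saturated model: each group gets its own p_i = k_i / n_i.
ll_sat = sum(binom_loglik(ki, ni, ki / ni) for ki, ni in zip(k, n))

deviance = 2 * (ll_sat - ll)

# -deviance/2 equals ll - ll_sat, not ll; ll_sat carries the
# log C(n, k) terms, a constant in the counts that the deviance drops.
print(-deviance / 2 - ll)   # equals -ll_sat, not zero
```

The constant cancels when comparing two models on the same data, which is why deviance *differences* are still fine; the absolute values are what shift.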
Brian

Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818
email: brian_c...@usgs.gov
tel: 970 226-9326

From: Andrew Miles <rstuff.mi...@gmail.com>
To: Mark Leeds <marklee...@gmail.com>
Cc: r-help@r-project.org
Date: 05/31/2012 04:09 PM
Subject: Re: [R] Higher log-likelihood in null vs. fitted model
Sent by: r-help-boun...@r-project.org

Interesting. If you take the deviances printed by the fitted model (including the null deviance) and back-engineer them into log-likelihoods, you get results identical to Stata's.

> m$deviance*(-.5)
[1] -390.9304
> m$null.deviance*(-.5)
[1] -393.064

However, using the deviance to calculate the AIC by hand does not reproduce the AIC that the model returns.

> m$deviance + 2*2
[1] 785.8608
> m$aic
[1] 764.3816

The difference seems to lie in how glm.fit() calculates the AIC it returns, which logLik() then uses to return the log-likelihood, as Mark indicated. The function is:

binomial()$aic
function (y, n, mu, wt, dev)
{
    m <- if (any(n > 1)) n else wt
    -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m * y),
        round(m), mu, log = TRUE))
}

glm(), on the other hand, calculates the deviance from the sum of the squared deviance residuals. This gives me a sense of *how* log-likelihoods calculated from the deviance and from the model AIC could differ (they are simply computed in different ways), but it is still not clear to me *why* the two approaches differ. And they only differ when I use weights with the data.

More importantly, it makes me wonder which is more reliable. In my case, the deviance-residuals approach seems more reliable in that (a) it gives a sensible result (a model with a predictor has a higher log-likelihood than the null model), and (b) it matches an independent estimate from Stata. But is that always the case? And if so, why use the other approach at all?
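The arithmetic inside binomial()$aic can be mirrored outside R (a rough Python port written for illustration, not taken verbatim from R's sources; the outcomes, weights, and fitted probabilities are made up): with 0/1 outcomes and non-integer prior weights, the round(m * y) and round(m) calls mean the computation is no longer the plain weighted Bernoulli log-likelihood that the deviance is built from, which is why the two quantities diverge only in the weighted case.

```python
import math

def dbinom_log(k, n, p):
    # like R's dbinom(k, n, p, log = TRUE)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def binomial_aic(y, n, mu, wt):
    # Rough port of R's binomial()$aic: if any n > 1, the binomial sizes n
    # play the role of m; otherwise the prior weights wt do.
    m = n if any(ni > 1 for ni in n) else wt
    return -2 * sum((wi / mi) * dbinom_log(round(mi * yi), round(mi), mui)
                    for yi, mi, mui, wi in zip(y, m, mu, wt) if mi > 0)

# 0/1 outcomes with non-integer case weights (made-up values):
y  = [1, 0, 0, 1]
n  = [1, 1, 1, 1]
wt = [1.3, 0.8, 2.1, 0.9]
mu = [0.30, 0.25, 0.20, 0.35]

# What m$aic is built from (glm then adds 2 * rank):
aic_part = binomial_aic(y, n, mu, wt)

# The plain weighted Bernoulli kernel that the deviance is built from:
dev_part = -2 * sum(wi * (yi * math.log(mui) + (1 - yi) * math.log(1 - mui))
                    for yi, mui, wi in zip(y, mu, wt))

# With integer weights these agree; the round() calls make them diverge here.
print(aic_part, dev_part)
```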
On Thu, May 31, 2012 at 9:55 AM, Mark Leeds <marklee...@gmail.com> wrote:

> Hi Duncan: I don't know if the following helps, but I checked the code,
> and logLik defines the log-likelihood as (p - glmobject$aic/2), where p
> is glmobject$rank. So the reason the likelihood is lower is that, in the
> null model, it ends up being (1 - glmobject$aic/2), and in the fitted
> model it ends up being (2 - glmobject$aic/2).
>
> So
>
> 2 - 764.4/2 = -380.2 and
> 1 - 761.9/2 = -379.95 (close enough for govt work)
>
> So that's where the numbers are coming from, but it really depends on
> how AIC is defined. Likelihoods should not involve degrees of freedom
> (at least not where they make the likelihood smaller, as in the example
> above), so maybe backing the likelihood out of the AIC is the issue?
> (AIC = -2 * likelihood + 2p, so p - AIC/2 = likelihood.) AIC is a
> function of the likelihood, but, as far as I know, the likelihood is not
> a function of the AIC.
>
> Thanks for any insight.
>
> On Thu, May 31, 2012 at 9:26 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>
>> On 12-05-31 8:53 AM, Andrew Miles wrote:
>>
>>> Two related questions.
>>>
>>> First, I am fitting a model with a single predictor, and then a null
>>> model with only the intercept. In theory, the fitted model should
>>> have a higher log-likelihood than the null model, but that does not
>>> happen. See the output below. My first question is, how can this
>>> happen?
>>
>> I suspect you'll need to give sample data before anyone can really
>> help with this.
>>
>>>> m
>>>
>>> Call:  glm(formula = school ~ sv_conform, family = binomial, data = dat,
>>>     weights = weight)
>>>
>>> Coefficients:
>>> (Intercept)   sv_conform
>>>     -2.5430       0.2122
>>>
>>> Degrees of Freedom: 1488 Total (i.e. Null);  1487 Residual
>>> Null Deviance:      786.1
>>> Residual Deviance: 781.9        AIC: 764.4
>>>
>>>> null
>>>
>>> Call:  glm(formula = school ~ 1, family = binomial, data = dat,
>>>     weights = weight)
>>>
>>> Coefficients:
>>> (Intercept)
>>>      -2.532
>>>
>>> Degrees of Freedom: 1488 Total (i.e. Null);  1488 Residual
>>> Null Deviance:      786.1
>>> Residual Deviance: 786.1        AIC: 761.9
>>>
>>>> logLik(m); logLik(null)
>>>
>>> 'log Lik.' -380.1908 (df=2)
>>> 'log Lik.' -379.9327 (df=1)
>>>
>>> My second question grows out of the first. I ran the same two models
>>> on the same data in Stata and got identical coefficients. However,
>>> the log-likelihoods were different from the ones I got in R, and
>>> followed my expectations; that is, the null model has a lower
>>> log-likelihood than the fitted model. See the Stata model comparison
>>> below. So my question is, why do identical models fit in R and Stata
>>> have different log-likelihoods?
>>
>> That's easier: they use different base measures. The likelihood is
>> only defined up to a multiplicative constant, so the log-likelihoods
>> can have an arbitrary constant added to them and still be valid. But I
>> would have expected both models to use the same base measure, so the
>> differences in log-likelihood should match.
>>
>> Duncan Murdoch
>>
>>> -----------------------------------------------------------------------------
>>>        Model |      Obs    ll(null)   ll(model)     df       AIC        BIC
>>> -------------+---------------------------------------------------------------
>>>         mod1 |     1489    -393.064   -390.9304      2    785.8608   796.4725
>>>         null |     1489    -393.064    -393.064      1    788.1279   793.4338
>>>
>>> Thanks in advance for any input or references.
>>>
>>> Andrew Miles
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
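Pulling the thread's numbers together (taking the quoted figures at face value): Stata's log-likelihoods are -deviance/2, R's logLik() is p - aic/2 exactly as Mark found, and with weights those two starting points are genuinely different quantities, not merely different roundings. A quick arithmetic check:

```python
# Figures quoted in the thread
m_deviance      = 781.8608    # m$deviance (since m$deviance + 2*2 = 785.8608)
m_null_deviance = 786.1280    # m$null.deviance (since -0.5 * it = -393.064)
m_aic           = 764.3816    # m$aic
loglik_m        = -380.1908   # logLik(m), df = 2

# Stata's log-likelihoods are just -deviance/2:
assert abs(-m_deviance / 2 - (-390.9304)) < 1e-4
assert abs(-m_null_deviance / 2 - (-393.0640)) < 1e-4

# Stata's AIC is then -2*ll + 2*df:
assert abs(-2 * (-390.9304) + 2 * 2 - 785.8608) < 1e-4

# R's logLik() is instead backed out of glm's AIC (Mark's p - aic/2):
assert abs((2 - m_aic / 2) - loglik_m) < 1e-4

# The disagreement exists because m$aic is not m$deviance + 2*2 here:
assert (m_deviance + 4) - m_aic > 10
print("all consistent")
```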