In running the following example of a loglinear model for the Titanic data,
I was surprised to see NaN reported for the
Pearson chisq

> loglin(Titanic, margin=list(1:3, 4))
2 iterations: deviation 2.273737e-13
$lrt
[1] 671.9622

$pearson
[1] NaN

$df
[1] 15

$margin
$margin[[1]]
[1] "Class" "Sex"   "Age"

$margin[[2]]
[1] "Survived"


Tracing it back, this occurs because there are zeros in the fitted/expected frequencies
for children among the Crew.

> # get fitted (expected) values
> fitted <- loglin(Titanic, margin=list(1:3, 4), fit=TRUE)$fit
2 iterations: deviation 2.273737e-13
> fitted[Class="Crew",,Age="Child",]
        Survived
Sex      No Yes
  Male    0   0
  Female  0   0

I certainly understand the difference between sampling
zeros and structural zeros, and this distinction seems properly implemented in loglin() via the start= argument, but only in the calculation of Pearson chisq, not for LRT. I think this is a code bug, but if there is a reason for the difference, it should be documented in the help for
loglin.

Another probable bug is that the calculation of of the LRT chisq also takes zero fitted values into account, while the calculation of the Pearson chisq does not, and leads to the NaN
result for my example.
It occurs in the following portion of the code for loglin:

    fit <- z$fit
    attributes(fit) <- attributes(table)
    observed <- as.vector(table[start > 0])
    expected <- as.vector(fit[start > 0])
    pearson <- sum((observed - expected)^2/expected)
    observed <- as.vector(table[table * fit > 0])
    expected <- as.vector(fit[table * fit > 0])
    lrt <- 2 * sum(observed * log(observed/expected))

I don't understand the reasons for the different calculations of observed & expected
for pearson & lrt.

FWIW, below is how I calculate these in my mosaics.sas program

start chisq(obs, fit);
   *-- Find Pearson and likelihood ratio chisquares;
   gf = sum ( (obs - fit)##2 / ( fit + (fit=0) ) );
   lr = 2 # sum ( obs # log ( (obs+(obs=0)) / (fit + (fit=0)) ) );
   return (gf // lr);
   finish;

-Michael




--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to