On 08/25/2010 05:17 PM, francogrex wrote:
>
> Hi I'm having different outputs from GLM when using a "condensed" table
>
> V1 V2 V3 Present Absent
>  0  0  0       3     12
>  0  0  1       0      0
>  0  1  0       0      0
>  0  1  1       1      0
>  1  0  0       7     20
>  1  0  1       0      0
>  1  1  0       3      0
>  1  1  1       6      0
>
> resp=cbind(Present, Absent)
> glm(resp~V1+V2+V3+I(V1*V2*V3),family=binomial)
>> Deviance Residuals:
> [1] 0 0 0 0 0 0 0 0
> etc and also coefficients...
>
> And when using the same but "expanded" table
>
>     V1 V2 V3 condition (1 present 0 absent)
> Id1  1  0  0  1
> id2  1  1  1  1
> ... etc
> glm(condition~V1+V2+V3+I(V1*V2*V3),family=binomial)
>> Deviance Residuals:
>        Min         1Q     Median         3Q        Max
> -0.7747317 -0.7747317 -0.6680472  0.0001315  1.7941226
> and also coefficients are different from above.
>
> What could I be doing wrong?
>

Not necessarily anything. Anything technical, that is.

You have 3 uninformative combinations where the total is zero, which leaves 5 informative groups, and the model has 5 parameters. This is quite likely to generate a perfect fit to the aggregated data. With the groups having zeros in the "absent" category, the fit has probably diverged, so some coefficients are numerically large. Refitting with individual data will likely give slightly different coefficients, since the result depends on how far the iterations got on the way to infinity.

With the aggregated data, a perfect fit gives residuals of zero, but with individual data, the 0's and 1's give negative and positive residuals. Try

    z <- rep(0:1, 5)
    zz <- cbind(5, 5)
    summary(glm(z ~ 1, binomial))
    summary(glm(zz ~ 1, binomial))

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
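
The point about aggregated versus individual data can be checked with a slightly larger sketch. It uses only the two rows of the table in the question that contain both outcomes (V1 = 0 and V1 = 1, with V2 = V3 = 0). The reduced model with V1 alone is an assumption made here so that the fit converges cleanly; it is not the model from the question. The names agg, ind, fit_agg and fit_ind are made up for the illustration.

```r
# Two rows from the table in the question (V2 = V3 = 0 in both).
agg <- data.frame(V1 = c(0, 1), Present = c(3, 7), Absent = c(12, 20))

# Expand to one row per individual: 1 = present, 0 = absent.
ind <- data.frame(
  V1 = rep(agg$V1, times = agg$Present + agg$Absent),
  condition = unlist(Map(function(p, a) rep(c(1, 0), times = c(p, a)),
                         agg$Present, agg$Absent))
)

# Same model, fitted to the condensed and to the expanded data.
fit_agg <- glm(cbind(Present, Absent) ~ V1, family = binomial, data = agg)
fit_ind <- glm(condition ~ V1, family = binomial, data = ind)

# The coefficients agree up to numerical tolerance ...
print(cbind(aggregated = coef(fit_agg), individual = coef(fit_ind)))

# ... but the deviance residuals do not. Two groups and two parameters
# give a perfect fit to the aggregated data, so those residuals are
# (numerically) zero. The individual 0/1 responses can never equal a
# fitted probability strictly between 0 and 1, so their residuals are
# negative for the 0's and positive for the 1's.
res_agg <- residuals(fit_agg, type = "deviance")
res_ind <- residuals(fit_ind, type = "deviance")
print(res_agg)
print(summary(res_ind))
```

In the question itself, the groups with zero "absent" counts additionally push some coefficients towards infinity, so there the two fits can also stop at somewhat different coefficient values.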