On May 7, 2012, at 19:39 , Bert Gunter wrote: > 1. As this is a statistical, rather than an R issue, you would do > better posting on a statistical help site like stats.stackexchange.com > (although some generous soul here may respond). > > 2. You would also probably do better consulting with a local > statistical resource if available, as it is difficult to explain such > issues remotely. > > Cheers, > Bert
Well, in this case it is pretty obvious that there are no positive outcomes in the 1996 cohort (log odds essentially -Inf); as this is the base level, the Wald tests are completely unreliable (Hauck-Donner effect). The LRT is 1369-1283=86 which is quite consistent with chisq.test(). -pd > > On Mon, May 7, 2012 at 10:05 AM, lincoln <misen...@hotmail.com> wrote: >> Hi all, >> >> I can't find the error in the binomial GLM I have done. I want to use that >> because there are more than one explanatory variables (all categorical) and >> a binary response variable. >> This is how my data set looks like: >>> str(data) >> 'data.frame': 1004 obs. of 5 variables: >> $ site : int 0 0 0 0 0 0 0 0 0 0 ... >> $ sex : Factor w/ 2 levels "0","1": NA NA NA NA 1 NA NA NA NA NA ... >> $ age : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ... >> $ cohort: Factor w/ 11 levels "1996","2000",..: 11 11 11 11 11 11 11 11 11 >> 11 ... >> $ birth : Factor w/ 3 levels "5","6","7": 3 3 2 2 2 2 2 2 2 2 ... >> >> I know that, particularly for one level of variable "cohort" (2004 value), >> it should be a strong effect of variable "cohort" on variable "site" so I do >> a Chi square test that confirms the null hypothesis there is a difference in >> sites on the way "cohort" is distributed: >> >>> (chisq.test(data$site,data$cohort)) >> >> Pearson's Chi-squared test >> >> data: data$site and data$cohort >> X-squared = 82.6016, df = 10, *p-value = 1.549e-13* >> >> Mensajes de aviso perdidos >> In chisq.test(data$site, data$cohort) : >> Chi-squared approximation may be incorrect >> >> >> >> >> After that, I have tried to use a binomial GLM with all the explanatory >> variables but I couldn't find any significance of any variable, neither >> cohort, and for this reason I tried to use only cohort as predictor and I >> get this: >> >> >>> BinomialGlm <- glm(site ~ cohort, data=data,binomial) >>> summary(BinomialGlm) >> >> Call: >> glm(formula = site ~ cohort, family = binomial, data = data) >> >> Deviance Residuals: >> Min 1Q Median 3Q Max >> -1.9239 -0.9365 -0.9365 1.3584 1.6651 >> >> Coefficients: >> Estimate Std. Error z value Pr(>|z|) >> (Intercept) -12.57 324.74 -0.039 0.969 >> cohort2000 11.47 324.75 0.035 0.972 >> cohort2001 13.82 324.74 0.043 0.966 >> cohort2002 12.97 324.74 0.040 0.968 >> cohort2003 13.66 324.74 0.042 0.966 >> *cohort2004 14.25 324.74 0.044 0.965* >> cohort2006 12.21 324.74 0.038 0.970 >> cohort2007 11.81 324.74 0.036 0.971 >> cohort2008 12.41 324.74 0.038 0.970 >> cohort2009 12.15 324.74 0.037 0.970 >> cohort2010 11.97 324.74 0.037 0.971 >> >> (Dispersion parameter for binomial family taken to be 1) >> >> Null deviance: 1369.3 on 1003 degrees of freedom >> Residual deviance: 1283.7 on 993 degrees of freedom >> AIC: 1305.7 >> >> Number of Fisher Scoring iterations: 11 >> >> >> >> >> I tired to use simple GLM (gaussian family) and I get results that are more >> logicals: >> >>> GaussGlm <- glm(site ~ cohort, data=data) >>> summary(GaussGlm) >> >> Call: >> glm(formula = site ~ cohort, data = data) >> >> Deviance Residuals: >> Min 1Q Median 3Q Max >> -0.8429 -0.3550 -0.3550 0.6025 0.7500 >> >> Coefficients: >> Estimate Std. Error t value Pr(>|t|) >> (Intercept) 5.740e-14 4.762e-01 0.000 1.0000 >> cohort2000 2.500e-01 5.324e-01 0.470 0.6388 >> cohort2001 7.778e-01 5.020e-01 1.549 0.1216 >> cohort2002 6.000e-01 4.880e-01 1.230 0.2192 >> cohort2003 7.500e-01 4.861e-01 1.543 0.1231 >> *cohort2004 8.429e-01 4.796e-01 1.757 0.0792 .* >> cohort2006 4.118e-01 4.832e-01 0.852 0.3943 >> cohort2007 3.204e-01 4.785e-01 0.670 0.5033 >> cohort2008 4.600e-01 4.786e-01 0.961 0.3367 >> cohort2009 3.975e-01 4.772e-01 0.833 0.4051 >> cohort2010 3.550e-01 4.768e-01 0.745 0.4567 >> --- >> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 >> >> (Dispersion parameter for gaussian family taken to be 0.2267955) >> >> Null deviance: 245.40 on 1003 degrees of freedom >> Residual deviance: 225.21 on 993 degrees of freedom >> AIC: 1372.5 >> >> Number of Fisher Scoring iterations: 2 >> >> >> >> What is going on? Any suggestion/commentary? >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/Can-t-find-the-error-in-a-Binomial-GLM-I-am-doing-please-help-tp4615340.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > > > -- > > Bert Gunter > Genentech Nonclinical Biostatistics > > Internal Contact Info: > Phone: 467-7374 > Website: > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd....@cbs.dk Priv: pda...@gmail.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.