On 07-Mar-09 10:57:17, Thomas Lumley wrote:
> On Fri, 6 Mar 2009, joris meys wrote:
>> Dear all,
>>
>> I have a dataset where the interaction is more than obvious,
>> but I was asked to give a p-value, so I ran a logistic regression
>> using glm. Very funny, in the outcome the interaction term is NOT
>> significant, although that's completely counterintuitive. There
>> are 3 variables: spot (binary response), constr (gene construct)
>> and vernalized (growth conditions). Only for the FLC construct
>> after vernalization, the chance of spots should be lower. So in
>> the model one would suspect the interaction term to be significant.
>>
>> Yet, only the two main terms are significant here. Can it be my
>> data is too sparse to use these models? Am I using the wrong method?
>
> The point estimate for the interaction term is large: 1.79, or an
> odds ratio of nearly 6.
>
> The data are very strongly overdispersed (variance is 45 times larger
> than it should be), so they don't fit a binomial model well. If you
> used a quasibinomial model you would get no statistical significance
> for any of the terms.
>
> I would say the problem is partly a combination of the overdispersion
> and the sample size. It doesn't help that the situation appears to be
> a difference between the FLC:yes cell and the other three cells, a
> difference that is spread out over the three parameters.
>
> -thomas
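
A minimal sketch of how the two figures quoted above (the interaction estimate and the dispersion) could be reproduced in R. It assumes the counts from the original post are entered as a data frame with numeric 'spot' and 'Freq' columns; that layout, and the object names fit_bin, fit_quasi, interaction_est and dispersion, are assumptions made for illustration and not part of the thread.

```r
# Counts from the original post, one row per (spot, constr, vernalized) cell.
testdata <- data.frame(
  spot       = rep(0:1, times = 4),
  constr     = factor(rep(c("FLC", "FLC", "free", "free"), times = 2)),
  vernalized = factor(rep(c("no", "yes"), each = 4)),
  Freq       = c(3, 42, 1, 44, 27, 20, 3, 42)
)

# Binomial fit with the cell counts as weights, as in the original post.
fit_bin <- glm(spot ~ constr * vernalized, weights = Freq,
               data = testdata, family = binomial)

# Interaction on the log-odds scale and as an odds ratio.
interaction_est <- coef(fit_bin)["constrfree:vernalizedyes"]
print(interaction_est)
print(exp(interaction_est))

# Same model with a free dispersion parameter.
fit_quasi <- glm(spot ~ constr * vernalized, weights = Freq,
                 data = testdata, family = quasibinomial)
dispersion <- summary(fit_quasi)$dispersion
print(dispersion)
print(coef(summary(fit_quasi)))
```

Note that the dispersion here is estimated from eight weighted rows (4 residual degrees of freedom), so its value depends on this particular way of storing the data.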

The following way of looking at it may be helpful. Display the data
as two 2x2 tables (one for each level of 'constr'):

                  Spot                            Spot
constr="FLC"     1    0         constr="free"    1    0
--------------+---------+---    --------------+---------+---
Vern = "yes": | 20   27 | 47    Vern = "yes": | 42    3 | 45
Vern = "no" : | 42    3 | 45    Vern = "no" : | 44    1 | 45
--------------+---------+---    --------------+---------+---
              | 62   30 | 92                  | 86    4 | 90

It seems clear that, in the constr="free" table, there is a close
approximation to no information about the relationship between
'vernalized' and 'spot'. Given the margins, even the most extreme
possible tables (by col: (45,41)/(0,4) and (41,45)/(4,0)) each have
probability 0.058 of occurring. The other possibilities have
probabilities 0.250, 0.384, 0.250.

On the other hand, the constr="FLC" table shows a very marked
association between 'vernalized' and 'spot'.

But, given that there is not much information in the "free" table,
you are not going to find an interaction between 'constr' and
'vernalized'. (You could try out the glm() for each of the possible
"free" tables, given the margins).

So, in my view, the aetiology of the symptoms is hypospotification
in the "free" lifestyle ...

Treatment: Increase your intake of "free"! Then you may get enough
information about association in that case.

Ted.
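
The probabilities quoted above, and the suggestion to refit glm() for the possible "free" tables, can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: the names n_yes, n_no, n_spot0, x_yes, p_table, interaction_for and refits are hypothetical, the FLC counts are held at their observed values, and only the "free" tables with no empty cell (1, 2 or 3 spot-free plants in the "yes" row) are refitted, since empty cells make the logistic fit degenerate.

```r
# Margins of the constr = "free" table: 45 plants per row,
# 4 plants without spots (spot = 0) in total.
n_yes   <- 45
n_no    <- 45
n_spot0 <- 4

# x_yes = how many of the 4 spot-free plants sit in the "yes" row.
x_yes <- 0:n_spot0

# Hypergeometric probability of each table, conditional on the margins.
p_table <- dhyper(x_yes, m = n_yes, n = n_no, k = n_spot0)
names(p_table) <- paste0("x=", x_yes)
print(round(p_table, 3))

# Refit the interaction model with the "free" cells replaced by the
# table that has x spot-free plants in the "yes" row.
interaction_for <- function(x) {
  d <- data.frame(
    spot       = rep(0:1, times = 4),
    constr     = factor(rep(c("FLC", "FLC", "free", "free"), times = 2)),
    vernalized = factor(rep(c("no", "yes"), each = 4)),
    Freq       = c(3, 42, n_spot0 - x, n_no - (n_spot0 - x),
                   27, 20, x, n_yes - x)
  )
  fit <- glm(spot ~ constr * vernalized, weights = Freq,
             data = d, family = binomial)
  coef(summary(fit))["constrfree:vernalizedyes",
                     c("Estimate", "Pr(>|z|)")]
}

# x = 3 is the observed table; x = 1 and x = 2 are the alternatives.
refits <- sapply(1:3, interaction_for)
colnames(refits) <- paste0("k=", 1:3)
print(refits)
```

Comparing the columns of 'refits' shows how much the interaction estimate and its p-value move across "free" tables that are all about equally plausible given the margins.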

>> # data generation
>> # (built as a data frame so that spot and Freq stay numeric;
>> # a character matrix would turn every column into text)
>> testdata <- data.frame(
>>   spot       = rep(0:1, times = 4),
>>   constr     = factor(rep(c("FLC", "FLC", "free", "free"), times = 2)),
>>   vernalized = factor(rep(c("no", "yes"), each = 4)),
>>   Freq       = c(3, 42, 1, 44, 27, 20, 3, 42)
>> )
>>
>> # model
>> T0fit <- glm(spot ~ constr * vernalized, weights = Freq, data = testdata,
>>              family = "binomial")
>> anova(T0fit)
>>
>> Kind regards
>> Joris
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> Thomas Lumley                   Assoc. Professor, Biostatistics
> tlum...@u.washington.edu        University of Washington, Seattle

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Mar-09                                       Time: 14:14:06
------------------------------ XFMail ------------------------------