dear list, i just found this post in the archive:
---------------- On 23-Apr-05 Bill.Venables at csiro.au <https://stat.ethz.ch/mailman/listinfo/r-help> wrote: >:* -----Original Message-----*>:* From: r-help-bounces at stat.math.ethz.ch ><https://stat.ethz.ch/mailman/listinfo/r-help> *>:* [mailto:r-help-bounces at >stat.math.ethz.ch <https://stat.ethz.ch/mailman/listinfo/r-help>] On Behalf Of >*>:* michael watson (IAH-C)*>:* [...]*>:* I have a highly significant >interaction term. In the context*>:* of the experiment, this makes sense. I >can visualise the data *>:* graphically, and sure enough I can see that both >factors have*>:* different effects on the data DEPENDING on what the value >of*>:* the other factor is. *>:* *>:* I explain this all to my colleague - >and she asks "but which*>:* ones are different?" This is best illustrated with >an example.*>:* We have either infected | uninfected, and vaccinated | >unvaccinated*>:* (the two factors).*>:* We're measuring expression of a gene. >Graphically, in the*>:* infected group, vaccination makes expression go up. In >the*>:* uninfected group, vaccination makes expression go down. In*>:* bo! th the vaccinated and unvaccinated groups, infection makes*>:* expression go down, but it goes down further in unvaccinated*>:* than it does in vaccinated.*>:* *>:* So from a statistical point of view, I can see exactly why*>:* the interaction term is significant, but what my colleage*>:* wants to know is that WITHIN the vaccinated group, does*>:* infection decrease expression significantly? And within*>:* the unvaccinated group, does infection decrease expression*>:* significantly? Etc etc etc Can I get this information from*>:* the output of the ANOVA, or do I carry out a separate*>:* test on e.g. just the vaccinated group? (seems a cop out to me)*>* *>* No, you can't get this kind of specific information out of the anova*>* table and yes, anova tables *are* a bit of a cop out. (I sometimes *>* think they should only be allowed between consenting adults in*>* private.)* I think the "cop out" Michael Watson was referring to means going back to basics and doing a separate analysis on each group (though no doubt using the Res SS from the AoV table). Not that I disagree with your comment: I sometimes think that anova tables are often passed round between adults in order to induce consent which might otherwise have been withheld. >* What you are asking for is a non-standard, but perfectly*>* reasonable >partition of the degrees of freedom between the*>* classes of a single factor >with four levels got by pairing*>* up the levels of vaccination and >innoculation. Of course you*>* can get this information, but you have to do a >bit of work*>* for it. * It seems to me that this is a wrapper for "separate analysis on each group"! >* Before I give the example which I don't expect too many people*>* to read >entirely, let me issue a little challenge, namely to*>* write tools to >automate a generalized version of the procedure*>* below.* [technical setup snipped] >>* contrasts(dat$vac_inf) <- ginv(m)*>>* gm <- aov(y ~ vac_inf, dat)*>>* >>summary(gm)*>* Df Sum Sq Mean Sq F value Pr(>F)*>* vac_inf >>3 12.1294 4.0431 7.348 0.04190*>* Residuals 4 2.2009 0.5502*>* *>* >>This doesn't tell us too much other than there are differences,*>* probably. >>Now to specify the partition:*>* *>>* summary(gm, *>* >>split = list(vac_inf = list("- vs +|N" = 1, *>* >> "- vs +|Y" = 2)))*>* Df Sum Sq Mean Sq F value >>Pr(>F)*>* vac_inf 3 12.1294 4.0431 7.3480 0.04190*>* >>vac_inf: - vs +|N 1 7.9928 7.9928 14.5262 0.01892*>* vac_inf: - vs +|Y >>1 3.7863 3.7863 6.8813 0.05860*>* Residuals 4 2.2009 0.5502* Wow, Bill! Dazzling. This is like watching a rabbit hop into a hat, and fly out as a dove. I must study this syntax. But where can I find out about the "split" argument to "summary"? I've found the *function* split, whose effect is similar, but I've wandered around the "summary", "summary.lm" etc. forest for a while without locating the *argument*. My naive ("cop-out") approach would have been on the lines of (without setting up the contrast matrix): summary(aov(y~vac*inf,data=dat)) Df Sum Sq Mean Sq F value Pr(>F) vac 1 0.3502 0.3502 0.6364 0.46968 inf 1 11.3908 11.3908 20.7017 0.01042 * vac:inf 1 0.3884 0.3884 0.7058 0.44812 Residuals 4 2.2009 0.5502 so we get the 2.2009 on 4 df SS for redisuals with mean SS 0.5502. Then I would do: mNp<-mean(y[(vac=="N")&(inf=="+")]) mNm<-mean(y[(vac=="N")&(inf=="-")]) mYp<-mean(y[(vac=="Y")&(inf=="+")]) mYm<-mean(y[(vac=="Y")&(inf=="-")]) c( mYp, mYm, mNp, mNm ) ##[1] 2.4990492 0.5532018 2.5212655 -0.3058972 c( mYp-mYm, mNp-mNm ) ##[1] 1.945847 2.827163 after which: 1-pt(((mYp-mYm)/sqrt(0.5502)),4) ##[1] 0.02929801 1-pt(((mNp-mNm)/sqrt(0.5502)),4) ##[1] 0.009458266 give you 1-sided t-tests, and 1-pf(((mYp-mYm)/sqrt(0.5502))^2,1,4) ##[1] 0.05859602 1-pf(((mNp-mNm)/sqrt(0.5502))^2,1,4) ##[1] 0.01891653 give you F-tests (equivalent to 2-sided t-tests) which agree with Bill's results above. And, in this case, presenting the results as mean differences shows that the effect of infection appears to differ between vaccinated and unvaccinated groups; but a simple test shows this not to be significant: 1-pf( (sqrt(1/2)*((mYp-mYm)-(mNp-mNm))/sqrt(0.5502))^2, 1,4) ##[1] 0.4481097 As I said above, this would be my naive approach to this particular case (and to any like it), and I would expect to be able to explain all this in simple terms to a colleague who was asking the sort of questions you have reported. Or is it the case that offering an anova table is needed, in order to evoke consent to the results by virtue of the familiar format? >* As expected, infection changes the mean for both vaccinated and*>* >unvaccinated, as we arranged when we generated the data.* The challenge to generalise is interesting. However, as implied above, I'm already impressed by Bill's footwork in R for this simple case, and it might be some time before I'm fluent enough in that sort of thing to deal with more complicated cases, let alone the general one. For users like myself, a syntax whose terms are closer to ordinary language would be more approachable. Something on the lines of summary(aov(y ~ vac*inf), by=inf, within=vac) which would present a table similar to Bill's above (by inf within the different levels of vac). Intriguing. The challenge is tempting ... ! Best wishes, Ted. -------- This way for simple main effects testing looks reasonable to me. I just wanted to know whether the user should have applied a multiple comparison correction in this case (bonferroni, FDR, etc)? cheers Jake [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.