Dear R users,

I am having trouble obtaining the same results for a nested ANOVA with two fixed factors when using the lm and aov functions.
The formulas are:

> e1 = aov(y ~ x/z)
> e2 = lm(y ~ x/z)

> summary(e1)
             Df Sum Sq Mean Sq F value    Pr(>F)
x            47  260.0     5.5 18.0088 < 2.2e-16 ***
x:z         195  169.6     0.9  2.8318 < 2.2e-16 ***
Residuals 14425 4430.3     0.3
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
2 observations deleted due to missingness

For e2:

Residual standard error: 0.5542 on 14425 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: 0.08839,  Adjusted R-squared: 0.07309
F-statistic: 5.779 on 242 and 14425 DF,  p-value: < 2.2e-16

I prefer to use lm because I want to know the difference between the first group (the control) and all the other levels through the regression coefficients. The same is true for the levels of the nested factor within each level of the main factor. Since I am fairly new to running linear models in R, I am not sure what causes this problem; it also seems that lm does not provide the decomposition of MS into MS(x) and MS(z) and the corresponding F-test statistics. (Is it possible to estimate them from the lm output?)

Finally, a few words about the dataset: the main factor x has 48 levels, each repeated from 60 to 540 times, and represents different patients. The nested factor z has 9 levels, but not all of them occur within every level of x. Although the levels of the nested factor are independent between levels of the main factor (i.e. they are samples taken from different tissues of each patient), given the large size of the dataset I was advised on this forum to use the same encoding of the levels of z within each level of x. I am not sure whether this influences the QR decomposition and leads to the differences I observe.

I would greatly appreciate your help, as after reading the help pages I still cannot understand the cause of the lm vs aov discrepancy.
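One way to check whether the two fits really differ is sketched below on simulated data. This is not the dataset from this post: the design sizes, the seed, and the object names d, tab_aov, tab_lm and f_pooled are assumptions made purely for illustration. The sketch compares summary() of an aov fit with anova() of the matching lm fit, which returns a sequential ANOVA table. It also checks that the single F-statistic printed by summary() of an lm fit pools all model terms, which would be consistent with the 242 = 47 + 195 numerator degrees of freedom reported above.

```r
# Simulated, balanced nested layout (illustrative only, not the real data):
# 4 levels of x, 3 levels of z within each x, 5 replicates per cell.
set.seed(1)
d <- expand.grid(rep = 1:5, z = factor(1:3), x = factor(1:4))
d$y <- rnorm(nrow(d), mean = as.integer(d$x))

e1 <- aov(y ~ x / z, data = d)
e2 <- lm(y ~ x / z, data = d)

# Per-term decomposition: summary() of the aov fit and anova() of the lm fit
tab_aov <- summary(e1)[[1]]
tab_lm  <- anova(e2)
print(tab_aov)
print(tab_lm)

# The overall F from summary(lm) tests x and x:z jointly:
# pooled model SS / pooled model df, divided by the residual mean square.
f_overall <- summary(e2)$fstatistic
f_pooled  <- (sum(tab_lm[["Sum Sq"]][1:2]) / sum(tab_lm[["Df"]][1:2])) /
  tab_lm[["Mean Sq"]][3]
print(c(overall = unname(f_overall["value"]), pooled = f_pooled))
```

If the per-term rows of the two tables agree on the real data as well, the apparent discrepancy would come only from summary() of the lm fit reporting one joint test rather than one test per term.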
The dataset with three factors can be downloaded from http://www.compbio.group.cam.ac.uk/Resources/Sergii_temp/example.RData

Thank you,
Sergii

----------------------------------------------
Sergii Ivakhno
PhD student
Computational Biology Group
Cancer Research UK Cambridge Research Institute
Li Ka Shing Centre
Robinson Way
Cambridge CB2 0RE
England
+44 (0)1223 404293 (O)
+44 (0)1223 404128 (F)
http://www.compbio.group.cam.ac.uk
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.