Dear R users,

I am having trouble obtaining the same results for a nested ANOVA with two
fixed factors when using the lm and aov functions.

The formulas are:

> e1=aov(y~x/z)

> e2=lm(y~x/z)

 

> summary(e1)
               Df Sum Sq Mean Sq F value    Pr(>F)
x              47  260.0     5.5 18.0088 < 2.2e-16 ***
x:z           195  169.6     0.9  2.8318 < 2.2e-16 ***
Residuals   14425 4430.3     0.3
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
2 observations deleted due to missingness

 

For e2, the end of the summary(e2) output reads:

Residual standard error: 0.5542 on 14425 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared: 0.08839,    Adjusted R-squared: 0.07309
F-statistic: 5.779 on 242 and 14425 DF,  p-value: < 2.2e-16
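
As a sanity check on the two printouts, the summary figures reported for e2
can be recomputed from the rows of the e1 table. The sketch below uses only
the rounded numbers shown above, so agreement can only be expected to about
three significant digits; the variable names are made up for illustration.

```r
# Rounded values copied from the summary(e1) table above
ss_x   <- 260.0;  df_x   <- 47
ss_xz  <- 169.6;  df_xz  <- 195
ss_res <- 4430.3; df_res <- 14425

ms_res    <- ss_res / df_res                             # residual mean square
f_overall <- ((ss_x + ss_xz) / (df_x + df_xz)) / ms_res  # whole-model F on 242 and 14425 DF
r_squared <- (ss_x + ss_xz) / (ss_x + ss_xz + ss_res)    # multiple R-squared
rse       <- sqrt(ms_res)                                # residual standard error

round(c(F = f_overall, R2 = r_squared, RSE = rse), 4)
```

If these values line up with the e2 output, the single F-statistic in the lm
summary would be a test of x and x:z pooled together rather than a different
result from the per-term tests in the aov table.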

 

 

 

I prefer to use lm because, in my case, I want to estimate the differences
between the first (control) group and all the other groups through the
regression coefficients. The same is true for the levels of the nested factor
within each level of the main factor.
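
To make the intended comparison concrete, here is a minimal sketch on made-up
data. It assumes R's default treatment contrasts, under which each coefficient
is the difference from the reference (first) level. The data frame d and the
level names "control", "A" and "B" are hypothetical and not taken from my
dataset.

```r
set.seed(3)
d <- data.frame(x = factor(rep(c("control", "A", "B"), each = 8)))
d$y <- rnorm(nrow(d))

# Make "control" the reference level so coefficients are differences from it
d$x <- relevel(d$x, ref = "control")

fit <- lm(y ~ x, data = d)
group_means <- sapply(split(d$y, d$x), mean)   # named vector: control, A, B

coef(fit)                                      # intercept = control mean; xA, xB = differences
group_means[-1] - group_means[["control"]]     # the same differences computed directly
```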

 

Since I am fairly new to running linear models in R, I am not sure what
causes this problem. It also seems that lm does not provide the decomposition
of the mean squares into MS(x) and MS(z) with the corresponding F-test
statistics. (Is it possible to obtain them from the lm output?)
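
For what it is worth, anova() called on an lm object prints a sequential
ANOVA table with one row per model term, which can be compared with the aov
summary. A minimal sketch on simulated data follows; the data frame d, its
sizes (4 patients, 3 tissues, 10 replicates) and the object names are made up
for illustration and are not my real dataset.

```r
set.seed(1)
d <- expand.grid(rep = 1:10, z = factor(1:3), x = factor(1:4))
d$y <- rnorm(nrow(d)) + 0.2 * as.integer(d$x)

fit_aov <- aov(y ~ x/z, data = d)
fit_lm  <- lm(y ~ x/z, data = d)

tab_aov <- summary(fit_aov)[[1]]   # ANOVA table produced from the aov fit
tab_lm  <- anova(fit_lm)           # sequential ANOVA table produced from the lm fit

print(tab_lm)                      # rows for x, x:z and Residuals

# Compare the two decompositions column by column
all.equal(tab_aov[["Sum Sq"]],  tab_lm[["Sum Sq"]])
all.equal(tab_aov[["F value"]], tab_lm[["F value"]])
```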

 

Finally, a few words about the dataset: the main factor x has 48 levels, each
repeated between 60 and 540 times, and represents different patients. The
nested factor z has 9 levels, but not all of them occur within every level of
x. Although the levels of the nested factor are independent between levels of
the main factor (i.e. they are samples taken from different tissues of each
patient), given the large size of the dataset I was advised on this forum to
use the same encoding of the levels of z at each level of x. I am not sure
whether this influences the QR decomposition and leads to the differences
that I observe.
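
To show the two encodings side by side, here is a minimal sketch on made-up
data in which one tissue is missing for one patient. The data frame d, the
labels p1 and p2, and the column cell are hypothetical. As far as I
understand, sharing the z labels across patients leaves an all-zero design
column for the empty cell, which lm reports as an NA coefficient, while the
fitted values should be the same under either encoding.

```r
set.seed(2)
d <- expand.grid(rep = 1:5, z = factor(1:3), x = factor(c("p1", "p2")))
d <- d[!(d$x == "p2" & d$z == "3"), ]   # tissue 3 was never sampled for patient p2
d$y <- rnorm(nrow(d))

# Encoding 1: tissue labels shared across patients, nesting expressed in the formula
fit_shared <- lm(y ~ x/z, data = d)

# Encoding 2: one unique label per observed patient-tissue cell
d$cell <- interaction(d$x, d$z, drop = TRUE)
fit_cell <- lm(y ~ cell, data = d)

coef(fit_shared)                                    # look for NA at the empty cell
all.equal(unname(fitted(fit_shared)), unname(fitted(fit_cell)))
c(df.residual(fit_shared), df.residual(fit_cell))   # residual df for each fit
```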

I would greatly appreciate your help, as even after reading the help pages I
still cannot understand the cause of the lm vs aov discrepancy.

The dataset with three factors can be downloaded from

http://www.compbio.group.cam.ac.uk/Resources/Sergii_temp/example.RData 

 

Thank you,

Sergii  

 
 
  
----------------------------------------------
Sergii Ivakhno

PhD student

Computational Biology Group
Cancer Research UK Cambridge Research Institute
Li Ka Shing Centre
Robinson Way
Cambridge CB2 0RE
England

+44 (0)1223 404293 (O)
+44 (0)1223 404128 (F)

http://www.compbio.group.cam.ac.uk/



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
