Re: [R] [package-car:Anova] extracting residuals from Anova for Type II/III Repeated Measures ?

John Fox Fri, 20 Feb 2009 14:48:38 -0800

Dear Tal,

I didn't have time to look at all this yesterday.


Since aov() doesn't do what I typically want to do, I guess I've not paid
much attention to it recently. I can see, however, that you appear to have
specified the error strata incorrectly, since (given your desire to compare
to Anova) the within-block factors are nested within blocks. Something like

> npk.aovE <- aov(value ~  N*P*K + Error(block/N*P*K), npk.long)

should be closer to what you want, and in fact produces all of the sums of
squares, but doesn't put all of the error terms together with the
corresponding terms; thus, you get, e.g., the test for N but not for P and
K, even though the SSs and error SSs for the latter are in the table. By
permuting N, P, and K, you can get the other F tests. I suspect that this
has to do with the sequential approach taken by aov() but someone else more
familiar with how it works will have to fill in the details. I wonder,
though, whether you've read the sections in Statistical Models in S and MASS
referenced in the help file for aov.

> summary(npk.aovE) 

Error: block
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  5 153.147  30.629               

Error: P
  Df Sum Sq Mean Sq
P  1 16.803  16.803

Error: K
  Df Sum Sq Mean Sq
K  1 190.40  190.40

Error: block:N
          Df Sum Sq Mean Sq F value   Pr(>F)   
N          1 378.56  378.56  38.614 0.001577 **
Residuals  5  49.02    9.80                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: block:P
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  5 19.1317  3.8263               

Error: block:K
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  5 24.4933  4.8987               

Error: P:K
    Df  Sum Sq Mean Sq
P:K  1 0.96333 0.96333

Error: block:N:P
          Df Sum Sq Mean Sq F value  Pr(>F)  
N:P        1 42.563  42.563  8.6888 0.03197 *
Residuals  5 24.493   4.899                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: block:N:K
          Df Sum Sq Mean Sq F value  Pr(>F)   
N:K        1 66.270  66.270  17.320 0.00881 **
Residuals  5 19.132   3.826                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: block:P:K
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5 49.018   9.804               

Error: block:N:P:K
          Df  Sum Sq Mean Sq F value Pr(>F)
N:P:K      1  74.003  74.003  2.4161 0.1808
Residuals  5 153.147  30.629               


John

> -----Original Message-----
> From: Tal Galili [mailto:tal.gal...@gmail.com]
> Sent: February-19-09 2:23 AM
> To: John Fox
> Cc: r-help@r-project.org
> Subject: Re: [R] [package-car:Anova] extracting residuals from Anova for
Type
> II/III Repeated Measures ?
> 
> Hello John, thank you for the fast reply.
> 
> Thanks to your answer I was able to reproduce the "between" residuals by
> simply writing:
> sum(residuals(mod.ok)^2)
> 
> But I must admit that how to obtain the "within" was beyond me, so some
> advice here would be of great help.
> (Also, it might be worth adding these residuals to the standard Anova
output,
> since I believe some researchers using the package could find it useful
when
> reporting there results)
> 
> 
> By working with the Anova package, I stumbled into a conflict between the
> Anova and the aov outputs.
> The toy data used was reassembled from the (?aov) example (and filled in
for
> balance, and contr.sum contrasts where assigned).
> When I compared the Anova (on SS type III) and the aov F (And p.value)
> results - I found significant differences between them (though the SS
where
> identical).
> If I understood correctly, for balanced designs both test should have
yielded
> the same results. any ideas on what the source of the difference might be?
> 
> Here is the code:
> 
> 
> 
> 
> ## get the data:
> utils::data(npk, package="MASS")
> library(reshape) # for data manipulation
> colnames(npk)[5] <- "value" # so that the data set will read proparly in
> "reshape"
> 
> npk.wide = cast(block ~  N +P+ K, data = npk)
> 
> # now we fill in the NA's, to get a balanced design
> is.na(npk.wide) #it has na's. we'll fill them with that factor's
combination
> mean, so to keep some significance around
> combination.mean <- apply(npk.wide, 2, function(x){mean(x,na.rm = T)})
> for(i in c(2:9))
> { npk.wide[is.na(npk.wide)[,i],i] = combination.mean[i-1] }
> is.na(npk.wide) #no na's anymore
> 
> # recreating a balanced npk for the aov
> npk.long <- melt(npk.wide)
> 
> head(npk.long,4)
> #         block    value N P K
> #X0_0_0       1 46.80000 0 0 0
> #X0_0_0.1     2 51.43333 0 0 0
> #X0_0_0.2     3 51.43333 0 0 0
> #X0_0_0.3     4 51.43333 0 0 0
> head(npk.wide,4)
> #  block    0_0_0 0_0_1    0_1_0 0_1_1    1_0_0    1_0_1    1_1_0    1_1_1
> #1     1 46.80000  52.0 54.33333  49.5 63.76667 57.00000 62.80000 54.36667
> #2     2 51.43333  55.5 56.00000  50.5 59.80000 54.66667 57.93333 58.50000
> #3     3 51.43333  55.0 62.80000  50.5 69.50000 54.66667 57.93333 55.80000
> #4     4 51.43333  45.5 44.20000  50.5 62.00000 54.66667 57.93333 48.80000
> 
> 
> # npk.wide$block # it is a factor - good.
> 
> 
> # change into contr.sum
> for(i in c(1,3:5)) npk.long[,i] <- C(npk.long[,i] , "contr.sum")
> npk.wide[,1] <- C(npk.wide[,1] , "contr.sum")
> 
> 
> 
> ## as a test, not particularly sensible statistically
> npk.aovE <- aov(value~  N*P*K + Error(block), npk.long)
> npk.aovE
> summary(npk.aovE)
> 
> ## output
> Error: block
>           Df  Sum Sq Mean Sq F value Pr(>F)
> Residuals  5 153.147  30.629
> 
> Error: Within
>           Df Sum Sq Mean Sq F value    Pr(>F)
> N          1 378.56  378.56 39.1502 3.546e-07 ***
> P          1  16.80   16.80  1.7378   0.19599
> K          1 190.40  190.40 19.6911 8.657e-05 ***
> N:P        1  42.56   42.56  4.4018   0.04319 *
> N:K        1  66.27   66.27  6.8535   0.01298 *
> P:K        1   0.96    0.96  0.0996   0.75415
> N:P:K      1  74.00   74.00  7.6533   0.00899 **
> Residuals 35 338.43    9.67
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> ## END output
> 
> # the residuals SS
> # 338.43 + 153.147
> # ==
> sum(residuals(lm(value~  N*P*K , npk.long))^2)
> #  491.58
> 
> 
> 
> idata  <- cast(N+P+K  ~ . , data = npk.long)[,c(1:3)]
> idata
> for(i in 1:3) idata[,i] <- C(idata[,i] , "contr.sum")
> 
> mod.ok <- lm(as.matrix(npk.wide[,-1]) ~  1, data = npk.wide)
> av.ok <- Anova(mod.ok, idata=idata, idesign=~ N*P*K, type = "III")
> summary(av.ok, multivariate=FALSE)
> 
> 
> ## output - with different F and p.value results
> Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
> 
>                 SS num Df Error SS den Df         F    Pr(>F)
> (Intercept) 144541      1      153      5 4719.0302 1.238e-08 ***
> N              379      1       49      5   38.6145  0.001577 **
> P               17      1       19      5    4.3915  0.090257 .
> K              190      1       24      5   38.8684  0.001554 **
> N:P             43      1       24      5    8.6888  0.031971 *
> N:K             66      1       19      5   17.3195  0.008810 **
> P:K              1      1       49      5    0.0983  0.766580
> N:P:K           74      1      153      5    2.4161  0.180810
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> ## End output
> 
> 
> # the SS residuals are the same thou
> sum(residuals(mod.ok)^2)
> #  491.58
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Thu, Feb 19, 2009 at 12:24 AM, John Fox <j...@mcmaster.ca> wrote:
> 
> 
>       Dear Tal,
> 
>       I suppose that the "between" residuals would be obtained, for your
> example,
>       by residuals(mod.ok). I'm not sure what the "within" residuals are.
You
>       could apply the transformation for each within-subject effect to the
> matrix
>       of residuals to get residuals for that effect -- is that what you
had
> in
>       mind? A list of transformations is in the element $P of the
Anova.mlm
>       object.
> 
>       Regards,
>        John
> 
>       ------------------------------
>       John Fox, Professor
>       Department of Sociology
>       McMaster University
>       Hamilton, Ontario, Canada
>       web: socserv.mcmaster.ca/jfox
> 
> 
> 
>       > -----Original Message-----
>       > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org]
>       On
>       > Behalf Of Tal Galili
>       > Sent: February-18-09 4:04 PM
>       > To: r-help@r-project.org
>       > Subject: [R] [package-car:Anova] extracting residuals from Anova
for
> Type
>       > II/III Repeated Measures ?
>       >
>       > Hello dear R members.
>       > I have been learning the Anova syntax in order to perform an SS
type
> III
>       > Anova with repeated measures designs (thank you Prof. John Fox!)
>       > And another question came up: where/what are the (between/within)
>       residuals
>       > for my model?
>       >
>       >
>       >
>       > ############  Play code:
>       >
>       >
>       > phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5,
> 5)),
>       >     levels=c("pretest", "posttest", "followup"))
>       > hour <- ordered(rep(1:5, 3))
>       > idata <- data.frame(phase, hour)
>       > idata
>       >
>       > mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
>       >                      post.1, post.2, post.3, post.4, post.5,
>       >                      fup.1, fup.2, fup.3, fup.4, fup.5) ~
>       >  treatment*gender,
>       >                 data=OBrienKaiser)
>       > av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour)
>       >
>       >
>       > summary(av.ok, multivariate=FALSE)
>       >
>       > ## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
>       > ##
>       > ##                                  SS num Df Error SS den Df
F
>       >  Pr(>F)
>       > ## treatment                   211.286      2  228.056     10
4.6323
>       >  0.037687
>       > ## gender                       58.286      1  228.056     10
2.5558
>       >  0.140974
>       > ## treatment:gender            130.241      2  228.056     10
2.8555
>       >  0.104469
>       > ## phase                       167.500      2   80.278     20
20.8651
>       > 1.274e-05
>       > ## treatment:phase              78.668      4   80.278     20
4.8997
>       >  0.006426
>       > ## gender:phase                  1.668      2   80.278     20
0.2078
>       >  0.814130
>       > ## treatment:gender:phase       10.221      4   80.278     20
0.6366
>       >  0.642369
>       > ## hour                        106.292      4   62.500     40
17.0067
>       > 3.191e-08
>       > ## treatment:hour                1.161      8   62.500     40
0.0929
>       >  0.999257
>       > ## gender:hour                   2.559      4   62.500     40
0.4094
>       >  0.800772
>       > ## treatment:gender:hour         7.755      8   62.500     40
0.6204
>       >  0.755484
>       > ## phase:hour                   11.083      8   96.167     80
1.1525
>       >  0.338317
>       > ## treatment:phase:hour          6.262     16   96.167     80
0.3256
>       >  0.992814
>       > ## gender:phase:hour             6.636      8   96.167     80
0.6900
>       >  0.699124
>       > ## treatment:gender:phase:hour  14.155     16   96.167     80
0.7359
>       >  0.749562
>       >
>       >
>       >
>       >
>       >
>       >
>       >
>       >
>       >
>       > --
>       > ----------------------------------------------
>       >
>       >
>       > My contact information:
>       > Tal Galili
>       > Phone number: 972-50-3373767
>       > FaceBook: Tal Galili
>       > My Blogs:
>       > www.talgalili.com
>       > www.biostatistics.co.il
>       >
> 
>       >       [[alternative HTML version deleted]]
>       >
>       > ______________________________________________
>       > R-help@r-project.org mailing list
>       > https://stat.ethz.ch/mailman/listinfo/r-help
>       > PLEASE do read the posting guide
>       http://www.R-project.org/posting-guide.html
>       > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> 
> 
> 
> 
> --
> ----------------------------------------------
> 
> 
> My contact information:
> Tal Galili
> Phone number: 972-50-3373767
> FaceBook: Tal Galili
> My Blogs:
> www.talgalili.com
> www.biostatistics.co.il
> 
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [package-car:Anova] extracting residuals from Anova for Type II/III Repeated Measures ?

Reply via email to