Hello John, thank you for the fast reply.
Thanks to your answer I was able to reproduce the "between" residuals by
simply writing:
sum(residuals(mod.ok)^2)
But I must admit that how to obtain the "within" was beyond me, so some
advice here would be of great help.
(Also, it might be worth adding these residuals to the standard Anova
output, since I believe some researchers using the package could find it
useful when reporting there results)
By working with the Anova package, I stumbled into a conflict between the
Anova and the aov outputs.
The toy data used was reassembled from the (?aov) example (and filled in for
balance, and contr.sum contrasts where assigned).
When I compared the Anova (on SS type III) and the aov F (And p.value)
results - I found significant differences between them (though the SS where
identical).
If I understood correctly, for balanced designs both test should have
yielded the same results. any ideas on what the source of the difference
might be?
Here is the code:
## get the data:
utils::data(npk, package="MASS")
library(reshape) # for data manipulation
colnames(npk)[5] <- "value" # so that the data set will read proparly in
"reshape"
npk.wide = cast(block ~ N +P+ K, data = npk)
# now we fill in the NA's, to get a balanced design
is.na(npk.wide) #it has na's. we'll fill them with that factor's combination
mean, so to keep some significance around
combination.mean <- apply(npk.wide, 2, function(x){mean(x,na.rm = T)})
for(i in c(2:9))
{ npk.wide[is.na(npk.wide)[,i],i] = combination.mean[i-1] }
is.na(npk.wide) #no na's anymore
# recreating a balanced npk for the aov
npk.long <- melt(npk.wide)
head(npk.long,4)
# block value N P K
#X0_0_0 1 46.80000 0 0 0
#X0_0_0.1 2 51.43333 0 0 0
#X0_0_0.2 3 51.43333 0 0 0
#X0_0_0.3 4 51.43333 0 0 0
head(npk.wide,4)
# block 0_0_0 0_0_1 0_1_0 0_1_1 1_0_0 1_0_1 1_1_0 1_1_1
#1 1 46.80000 52.0 54.33333 49.5 63.76667 57.00000 62.80000 54.36667
#2 2 51.43333 55.5 56.00000 50.5 59.80000 54.66667 57.93333 58.50000
#3 3 51.43333 55.0 62.80000 50.5 69.50000 54.66667 57.93333 55.80000
#4 4 51.43333 45.5 44.20000 50.5 62.00000 54.66667 57.93333 48.80000
# npk.wide$block # it is a factor - good.
# change into contr.sum
for(i in c(1,3:5)) npk.long[,i] <- C(npk.long[,i] , "contr.sum")
npk.wide[,1] <- C(npk.wide[,1] , "contr.sum")
## as a test, not particularly sensible statistically
npk.aovE <- aov(value~ N*P*K + Error(block), npk.long)
npk.aovE
summary(npk.aovE)
## output
Error: block
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 5 153.147 30.629
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
N 1 378.56 378.56 39.1502 3.546e-07 ***
P 1 16.80 16.80 1.7378 0.19599
K 1 190.40 190.40 19.6911 8.657e-05 ***
N:P 1 42.56 42.56 4.4018 0.04319 *
N:K 1 66.27 66.27 6.8535 0.01298 *
P:K 1 0.96 0.96 0.0996 0.75415
N:P:K 1 74.00 74.00 7.6533 0.00899 **
Residuals 35 338.43 9.67
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## END output
# the residuals SS
# 338.43 + 153.147
# ==
sum(residuals(lm(value~ N*P*K , npk.long))^2)
# 491.58
idata <- cast(N+P+K ~ . , data = npk.long)[,c(1:3)]
idata
for(i in 1:3) idata[,i] <- C(idata[,i] , "contr.sum")
mod.ok <- lm(as.matrix(npk.wide[,-1]) ~ 1, data = npk.wide)
av.ok <- Anova(mod.ok, idata=idata, idesign=~ N*P*K, type = "III")
summary(av.ok, multivariate=FALSE)
## output - with different F and p.value results
Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
SS num Df Error SS den Df F Pr(>F)
(Intercept) 144541 1 153 5 4719.0302 1.238e-08 ***
N 379 1 49 5 38.6145 0.001577 **
P 17 1 19 5 4.3915 0.090257 .
K 190 1 24 5 38.8684 0.001554 **
N:P 43 1 24 5 8.6888 0.031971 *
N:K 66 1 19 5 17.3195 0.008810 **
P:K 1 1 49 5 0.0983 0.766580
N:P:K 74 1 153 5 2.4161 0.180810
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## End output
# the SS residuals are the same thou
sum(residuals(mod.ok)^2)
# 491.58
On Thu, Feb 19, 2009 at 12:24 AM, John Fox <[email protected]> wrote:
> Dear Tal,
>
> I suppose that the "between" residuals would be obtained, for your example,
> by residuals(mod.ok). I'm not sure what the "within" residuals are. You
> could apply the transformation for each within-subject effect to the matrix
> of residuals to get residuals for that effect -- is that what you had in
> mind? A list of transformations is in the element $P of the Anova.mlm
> object.
>
> Regards,
> John
>
> ------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
>
>
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> On
> > Behalf Of Tal Galili
> > Sent: February-18-09 4:04 PM
> > To: [email protected]
> > Subject: [R] [package-car:Anova] extracting residuals from Anova for Type
> > II/III Repeated Measures ?
> >
> > Hello dear R members.
> > I have been learning the Anova syntax in order to perform an SS type III
> > Anova with repeated measures designs (thank you Prof. John Fox!)
> > And another question came up: where/what are the (between/within)
> residuals
> > for my model?
> >
> >
> >
> > ############ Play code:
> >
> >
> > phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)),
> > levels=c("pretest", "posttest", "followup"))
> > hour <- ordered(rep(1:5, 3))
> > idata <- data.frame(phase, hour)
> > idata
> >
> > mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
> > post.1, post.2, post.3, post.4, post.5,
> > fup.1, fup.2, fup.3, fup.4, fup.5) ~
> > treatment*gender,
> > data=OBrienKaiser)
> > av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour)
> >
> >
> > summary(av.ok, multivariate=FALSE)
> >
> > ## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
> > ##
> > ## SS num Df Error SS den Df F
> > Pr(>F)
> > ## treatment 211.286 2 228.056 10 4.6323
> > 0.037687
> > ## gender 58.286 1 228.056 10 2.5558
> > 0.140974
> > ## treatment:gender 130.241 2 228.056 10 2.8555
> > 0.104469
> > ## phase 167.500 2 80.278 20 20.8651
> > 1.274e-05
> > ## treatment:phase 78.668 4 80.278 20 4.8997
> > 0.006426
> > ## gender:phase 1.668 2 80.278 20 0.2078
> > 0.814130
> > ## treatment:gender:phase 10.221 4 80.278 20 0.6366
> > 0.642369
> > ## hour 106.292 4 62.500 40 17.0067
> > 3.191e-08
> > ## treatment:hour 1.161 8 62.500 40 0.0929
> > 0.999257
> > ## gender:hour 2.559 4 62.500 40 0.4094
> > 0.800772
> > ## treatment:gender:hour 7.755 8 62.500 40 0.6204
> > 0.755484
> > ## phase:hour 11.083 8 96.167 80 1.1525
> > 0.338317
> > ## treatment:phase:hour 6.262 16 96.167 80 0.3256
> > 0.992814
> > ## gender:phase:hour 6.636 8 96.167 80 0.6900
> > 0.699124
> > ## treatment:gender:phase:hour 14.155 16 96.167 80 0.7359
> > 0.749562
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > --
> > ----------------------------------------------
> >
> >
> > My contact information:
> > Tal Galili
> > Phone number: 972-50-3373767
> > FaceBook: Tal Galili
> > My Blogs:
> > www.talgalili.com
> > www.biostatistics.co.il
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
--
----------------------------------------------
My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
www.talgalili.com
www.biostatistics.co.il
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.