Hello John, thanks for your reply and correction. I apologies for my crude mistake in applying the aov (now I have learned better). I hope to get a hold of "Statistical Models in S", but I don't predict it could easily happen in the near future.
Also, I would be very happy if you could supply me with some more directions as to how to obtain the "within" residuals (such as reported from the aov summary), since I am not sure how to proceed with that. With regards, Tal On Sat, Feb 21, 2009 at 12:41 AM, John Fox <j...@mcmaster.ca> wrote: > > Dear Tal, > > I didn't have time to look at all this yesterday. > > Since aov() doesn't do what I typically want to do, I guess I've not paid > much attention to it recently. I can see, however, that you appear to have > specified the error strata incorrectly, since (given your desire to compare > to Anova) the within-block factors are nested within blocks. Something like > > > npk.aovE <- aov(value ~ N*P*K + Error(block/N*P*K), npk.long) > > should be closer to what you want, and in fact produces all of the sums of > squares, but doesn't put all of the error terms together with the > corresponding terms; thus, you get, e.g., the test for N but not for P and > K, even though the SSs and error SSs for the latter are in the table. By > permuting N, P, and K, you can get the other F tests. I suspect that this > has to do with the sequential approach taken by aov() but someone else more > familiar with how it works will have to fill in the details. I wonder, > though, whether you've read the sections in Statistical Models in S and MASS > referenced in the help file for aov. > > > summary(npk.aovE) > > Error: block > Df Sum Sq Mean Sq F value Pr(>F) > Residuals 5 153.147 30.629 > > Error: P > Df Sum Sq Mean Sq > P 1 16.803 16.803 > > Error: K > Df Sum Sq Mean Sq > K 1 190.40 190.40 > > Error: block:N > Df Sum Sq Mean Sq F value Pr(>F) > N 1 378.56 378.56 38.614 0.001577 ** > Residuals 5 49.02 9.80 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > Error: block:P > Df Sum Sq Mean Sq F value Pr(>F) > Residuals 5 19.1317 3.8263 > > Error: block:K > Df Sum Sq Mean Sq F value Pr(>F) > Residuals 5 24.4933 4.8987 > > Error: P:K > Df Sum Sq Mean Sq > P:K 1 0.96333 0.96333 > > Error: block:N:P > Df Sum Sq Mean Sq F value Pr(>F) > N:P 1 42.563 42.563 8.6888 0.03197 * > Residuals 5 24.493 4.899 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > Error: block:N:K > Df Sum Sq Mean Sq F value Pr(>F) > N:K 1 66.270 66.270 17.320 0.00881 ** > Residuals 5 19.132 3.826 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > Error: block:P:K > Df Sum Sq Mean Sq F value Pr(>F) > Residuals 5 49.018 9.804 > > Error: block:N:P:K > Df Sum Sq Mean Sq F value Pr(>F) > N:P:K 1 74.003 74.003 2.4161 0.1808 > Residuals 5 153.147 30.629 > > > John > > > -----Original Message----- > > From: Tal Galili [mailto:tal.gal...@gmail.com] > > Sent: February-19-09 2:23 AM > > To: John Fox > > Cc: r-help@r-project.org > > Subject: Re: [R] [package-car:Anova] extracting residuals from Anova for > Type > > II/III Repeated Measures ? > > > > Hello John, thank you for the fast reply. > > > > Thanks to your answer I was able to reproduce the "between" residuals by > > simply writing: > > sum(residuals(mod.ok)^2) > > > > But I must admit that how to obtain the "within" was beyond me, so some > > advice here would be of great help. > > (Also, it might be worth adding these residuals to the standard Anova > output, > > since I believe some researchers using the package could find it useful > when > > reporting there results) > > > > > > By working with the Anova package, I stumbled into a conflict between the > > Anova and the aov outputs. > > The toy data used was reassembled from the (?aov) example (and filled in > for > > balance, and contr.sum contrasts where assigned). > > When I compared the Anova (on SS type III) and the aov F (And p.value) > > results - I found significant differences between them (though the SS > where > > identical). > > If I understood correctly, for balanced designs both test should have > yielded > > the same results. any ideas on what the source of the difference might be? > > > > Here is the code: > > > > > > > > > > ## get the data: > > utils::data(npk, package="MASS") > > library(reshape) # for data manipulation > > colnames(npk)[5] <- "value" # so that the data set will read proparly in > > "reshape" > > > > npk.wide = cast(block ~ N +P+ K, data = npk) > > > > # now we fill in the NA's, to get a balanced design > > is.na(npk.wide) #it has na's. we'll fill them with that factor's > combination > > mean, so to keep some significance around > > combination.mean <- apply(npk.wide, 2, function(x){mean(x,na.rm = T)}) > > for(i in c(2:9)) > > { npk.wide[is.na(npk.wide)[,i],i] = combination.mean[i-1] } > > is.na(npk.wide) #no na's anymore > > > > # recreating a balanced npk for the aov > > npk.long <- melt(npk.wide) > > > > head(npk.long,4) > > # block value N P K > > #X0_0_0 1 46.80000 0 0 0 > > #X0_0_0.1 2 51.43333 0 0 0 > > #X0_0_0.2 3 51.43333 0 0 0 > > #X0_0_0.3 4 51.43333 0 0 0 > > head(npk.wide,4) > > # block 0_0_0 0_0_1 0_1_0 0_1_1 1_0_0 1_0_1 1_1_0 1_1_1 > > #1 1 46.80000 52.0 54.33333 49.5 63.76667 57.00000 62.80000 54.36667 > > #2 2 51.43333 55.5 56.00000 50.5 59.80000 54.66667 57.93333 58.50000 > > #3 3 51.43333 55.0 62.80000 50.5 69.50000 54.66667 57.93333 55.80000 > > #4 4 51.43333 45.5 44.20000 50.5 62.00000 54.66667 57.93333 48.80000 > > > > > > # npk.wide$block # it is a factor - good. > > > > > > # change into contr.sum > > for(i in c(1,3:5)) npk.long[,i] <- C(npk.long[,i] , "contr.sum") > > npk.wide[,1] <- C(npk.wide[,1] , "contr.sum") > > > > > > > > ## as a test, not particularly sensible statistically > > npk.aovE <- aov(value~ N*P*K + Error(block), npk.long) > > npk.aovE > > summary(npk.aovE) > > > > ## output > > Error: block > > Df Sum Sq Mean Sq F value Pr(>F) > > Residuals 5 153.147 30.629 > > > > Error: Within > > Df Sum Sq Mean Sq F value Pr(>F) > > N 1 378.56 378.56 39.1502 3.546e-07 *** > > P 1 16.80 16.80 1.7378 0.19599 > > K 1 190.40 190.40 19.6911 8.657e-05 *** > > N:P 1 42.56 42.56 4.4018 0.04319 * > > N:K 1 66.27 66.27 6.8535 0.01298 * > > P:K 1 0.96 0.96 0.0996 0.75415 > > N:P:K 1 74.00 74.00 7.6533 0.00899 ** > > Residuals 35 338.43 9.67 > > --- > > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > ## END output > > > > # the residuals SS > > # 338.43 + 153.147 > > # == > > sum(residuals(lm(value~ N*P*K , npk.long))^2) > > # 491.58 > > > > > > > > idata <- cast(N+P+K ~ . , data = npk.long)[,c(1:3)] > > idata > > for(i in 1:3) idata[,i] <- C(idata[,i] , "contr.sum") > > > > mod.ok <- lm(as.matrix(npk.wide[,-1]) ~ 1, data = npk.wide) > > av.ok <- Anova(mod.ok, idata=idata, idesign=~ N*P*K, type = "III") > > summary(av.ok, multivariate=FALSE) > > > > > > ## output - with different F and p.value results > > Univariate Type III Repeated-Measures ANOVA Assuming Sphericity > > > > SS num Df Error SS den Df F Pr(>F) > > (Intercept) 144541 1 153 5 4719.0302 1.238e-08 *** > > N 379 1 49 5 38.6145 0.001577 ** > > P 17 1 19 5 4.3915 0.090257 . > > K 190 1 24 5 38.8684 0.001554 ** > > N:P 43 1 24 5 8.6888 0.031971 * > > N:K 66 1 19 5 17.3195 0.008810 ** > > P:K 1 1 49 5 0.0983 0.766580 > > N:P:K 74 1 153 5 2.4161 0.180810 > > --- > > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > ## End output > > > > > > # the SS residuals are the same thou > > sum(residuals(mod.ok)^2) > > # 491.58 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Feb 19, 2009 at 12:24 AM, John Fox <j...@mcmaster.ca> wrote: > > > > > > Dear Tal, > > > > I suppose that the "between" residuals would be obtained, for your > > example, > > by residuals(mod.ok). I'm not sure what the "within" residuals are. > You > > could apply the transformation for each within-subject effect to the > > matrix > > of residuals to get residuals for that effect -- is that what you > had > > in > > mind? A list of transformations is in the element $P of the > Anova.mlm > > object. > > > > Regards, > > John > > > > ------------------------------ > > John Fox, Professor > > Department of Sociology > > McMaster University > > Hamilton, Ontario, Canada > > web: socserv.mcmaster.ca/jfox > > > > > > > > > -----Original Message----- > > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > > project.org] > > On > > > Behalf Of Tal Galili > > > Sent: February-18-09 4:04 PM > > > To: r-help@r-project.org > > > Subject: [R] [package-car:Anova] extracting residuals from Anova > for > > Type > > > II/III Repeated Measures ? > > > > > > Hello dear R members. > > > I have been learning the Anova syntax in order to perform an SS > type > > III > > > Anova with repeated measures designs (thank you Prof. John Fox!) > > > And another question came up: where/what are the (between/within) > > residuals > > > for my model? > > > > > > > > > > > > ############ Play code: > > > > > > > > > phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, > > 5)), > > > levels=c("pretest", "posttest", "followup")) > > > hour <- ordered(rep(1:5, 3)) > > > idata <- data.frame(phase, hour) > > > idata > > > > > > mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, > > > post.1, post.2, post.3, post.4, post.5, > > > fup.1, fup.2, fup.3, fup.4, fup.5) ~ > > > treatment*gender, > > > data=OBrienKaiser) > > > av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour) > > > > > > > > > summary(av.ok, multivariate=FALSE) > > > > > > ## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity > > > ## > > > ## SS num Df Error SS den Df > F > > > Pr(>F) > > > ## treatment 211.286 2 228.056 10 > 4.6323 > > > 0.037687 > > > ## gender 58.286 1 228.056 10 > 2.5558 > > > 0.140974 > > > ## treatment:gender 130.241 2 228.056 10 > 2.8555 > > > 0.104469 > > > ## phase 167.500 2 80.278 20 > 20.8651 > > > 1.274e-05 > > > ## treatment:phase 78.668 4 80.278 20 > 4.8997 > > > 0.006426 > > > ## gender:phase 1.668 2 80.278 20 > 0.2078 > > > 0.814130 > > > ## treatment:gender:phase 10.221 4 80.278 20 > 0.6366 > > > 0.642369 > > > ## hour 106.292 4 62.500 40 > 17.0067 > > > 3.191e-08 > > > ## treatment:hour 1.161 8 62.500 40 > 0.0929 > > > 0.999257 > > > ## gender:hour 2.559 4 62.500 40 > 0.4094 > > > 0.800772 > > > ## treatment:gender:hour 7.755 8 62.500 40 > 0.6204 > > > 0.755484 > > > ## phase:hour 11.083 8 96.167 80 > 1.1525 > > > 0.338317 > > > ## treatment:phase:hour 6.262 16 96.167 80 > 0.3256 > > > 0.992814 > > > ## gender:phase:hour 6.636 8 96.167 80 > 0.6900 > > > 0.699124 > > > ## treatment:gender:phase:hour 14.155 16 96.167 80 > 0.7359 > > > 0.749562 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > ---------------------------------------------- > > > > > > > > > My contact information: > > > Tal Galili > > > Phone number: 972-50-3373767 > > > FaceBook: Tal Galili > > > My Blogs: > > > www.talgalili.com > > > www.biostatistics.co.il > > > > > > > > [[alternative HTML version deleted]] > > > > > > ______________________________________________ > > > R-help@r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > > > > > > -- > > ---------------------------------------------- > > > > > > My contact information: > > Tal Galili > > Phone number: 972-50-3373767 > > FaceBook: Tal Galili > > My Blogs: > > www.talgalili.com > > www.biostatistics.co.il > > > > > > > -- ---------------------------------------------- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: www.talgalili.com www.biostatistics.co.il ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.