[R] How to interpret an ANOVA result?

Rui Barradas Tue, 15 May 2012 06:55:10 -0700

Hello,

1. There's a function anova(), which you forgot to use.


 > model <- lm(d~pos*site, data=x)
 > anova(model)
Analysis of Variance Table

Response: d
          Df Sum Sq Mean Sq F value   Pr(>F)
pos        1   0.05   0.050  0.0026 0.960045
site       4 636.50 159.125  8.3971 0.003084 **
pos:site   4  59.70  14.925  0.7876 0.558857
Residuals 10 189.50  18.950
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The wafer position and its interaction with site are not significant.
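
As a rough sketch of how that table can be read programmatically (the data
frame is rebuilt with rep() as a compact stand-in for the structure() call
quoted further down; the names tab, f_site and p are only illustrative):

```r
# Rebuild the poster's data; rep() is an assumed shorthand for the
# structure() call in the quoted message and yields the same values.
x <- data.frame(
  d = c(1383, 1377, 1388, 1373, 1386, 1394, 1386, 1393, 1390, 1382,
        1386, 1390, 1395, 1396, 1392, 1395, 1378, 1382, 1379, 1380),
  site = factor(rep(1:5, each = 4)),
  pos  = factor(rep(c("N", "R", "R", "N"), times = 5))
)

model <- lm(d ~ pos * site, data = x)
tab <- anova(model)

# Each F value is the term's mean square over the residual mean square.
f_site <- tab["site", "Mean Sq"] / tab["Residuals", "Mean Sq"]

# Named vector of p-values; the Residuals row has none, so drop the NA.
p <- setNames(tab[["Pr(>F)"]], rownames(tab))
p <- p[!is.na(p)]

# Which terms fall below the conventional 0.05 threshold?
print(p < 0.05)
```

Only the site term should come out below 0.05 here, in line with the
stars in the table.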

2. There is also a function aov(), similar to lm() but with a slightly 
different use.

 > model.aov <- aov(d~pos*site, data=x)
 > summary(model.aov)
            Df Sum Sq Mean Sq F value  Pr(>F)
pos          1    0.0    0.05   0.003 0.96005
site         4  636.5  159.13   8.397 0.00308 **
pos:site     4   59.7   14.93   0.788 0.55886
Residuals   10  189.5   18.95
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The only difference is in the decimals displayed.
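
If you want to convince yourself of that, here is a small sketch that
compares the two tables column by column. It assumes the same data as in
the original post (rebuilt with rep() for brevity); tab.lm, tab.aov and
same are illustrative names.

```r
# Same data as in the quoted message, rebuilt compactly (an assumption
# made for brevity; the values match the structure() call).
x <- data.frame(
  d = c(1383, 1377, 1388, 1373, 1386, 1394, 1386, 1393, 1390, 1382,
        1386, 1390, 1395, 1396, 1392, 1395, 1378, 1382, 1379, 1380),
  site = factor(rep(1:5, each = 4)),
  pos  = factor(rep(c("N", "R", "R", "N"), times = 5))
)

model     <- lm(d ~ pos * site, data = x)
model.aov <- aov(d ~ pos * site, data = x)

tab.lm  <- anova(model)
tab.aov <- summary(model.aov)[[1]]  # first (and only) stratum of the summary

# Compare each column numerically, ignoring print formatting.
cols <- c("Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)")
same <- sapply(cols, function(k) {
  isTRUE(all.equal(as.numeric(tab.lm[[k]]), as.numeric(tab.aov[[k]])))
})
print(same)
```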

3. Back to lm(), you can use confint() to see if the estimates' 95% 
confidence intervals include zero.
If so, they are not statistically significant at the 5% level.

 > confint(model)
                  2.5 %      97.5 %
(Intercept) 1371.141457 1384.858543
posR          -5.199444   14.199444
site2          1.800556   21.199444
site3          2.300556   21.699444
site4          7.300556   26.699444
site5         -8.699444   10.699444
posR:site2   -17.717086    9.717086
posR:site3   -24.217086    3.217086
posR:site4   -19.217086    8.217086
posR:site5   -16.717086   10.717086

And the interpretation is the same.
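
A short sketch of that check, assuming the same data (rebuilt with rep()
for brevity) and using the illustrative names ci, excludes_zero and pvals:

```r
# Same data as in the quoted message, rebuilt compactly (an assumption
# made for brevity; the values match the structure() call).
x <- data.frame(
  d = c(1383, 1377, 1388, 1373, 1386, 1394, 1386, 1393, 1390, 1382,
        1386, 1390, 1395, 1396, 1392, 1395, 1378, 1382, 1379, 1380),
  site = factor(rep(1:5, each = 4)),
  pos  = factor(rep(c("N", "R", "R", "N"), times = 5))
)

model <- lm(d ~ pos * site, data = x)

# An interval excludes zero when both limits lie on the same side of it.
ci <- confint(model)
excludes_zero <- ci[, 1] > 0 | ci[, 2] < 0

# The t-test p-values from summary() give the same yes/no answer at 5%.
pvals <- summary(model)$coefficients[, "Pr(>|t|)"]
print(cbind(excludes_zero, p_below_0.05 = pvals < 0.05))
```

Note that each interval concerns a single coefficient relative to the
baseline cell (site 1, position N), while the F table above tests each
term as a whole.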

(There are books on-line, check out
http://www.r-project.org/

At the bottom left, click Books.)

Hope this helps,

On 15-05-2012 11:00, Robert Latest wrote:
> Date: Mon, 14 May 2012 12:17:03 +0200
> From: Robert Latest <boblat...@gmail.com>
> To: r-help@r-project.org
> Subject: [R] How to interpret an ANOVA result?
> Message-ID: <CAMXbmUQx6EoB1zwiK5vMu0=hddy-v0s_cqnh5puvq+s0xn7...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hello all,
>
> here's a real-world example: I'm measuring a quantity (d) at five sites
> (site1 thru site5) on a silicon wafer. There is a clear site-dependence of
> the measured value. To find out if this is a measurement artifact I
> measured the wafer four times: twice in the normal position (posN),
> and twice rotated by 180 degrees (posR). My data looks like this
> (full, self-contained code at bottom). Note that sites with the same
> number correspond to the same physical location on the wafer (the
> rotation has already been taken into account here).
>
> > head(x)
>       d site pos
> 1 1383    1   N
> 2 1377    1   R
> 3 1388    1   R
> 4 1373    1   N
> 5 1386    2   N
> 6 1394    2   R
>
> > boxplot(d~pos+site)
> This boxplot (see code) already hints at a true site-dependence of the
> measured value (no artifact). OK, so let's do an ANOVA to make this
> more quantitative:
>
> > summary(lm(d ~ site*pos))
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 1378.000      3.078 447.672  < 2e-16 ***
> site2         11.500      4.353   2.642  0.02466 *
> site3         12.000      4.353   2.757  0.02025 *
> site4         17.000      4.353   3.905  0.00294 **
> site5          1.000      4.353   0.230  0.82294
> posR           4.500      4.353   1.034  0.32561
> site2:posR    -4.000      6.156  -0.650  0.53050
> site3:posR   -10.500      6.156  -1.706  0.11890
> site4:posR    -5.500      6.156  -0.893  0.39264
> site5:posR    -3.000      6.156  -0.487  0.63655
>
> Now I think that I see the following:
> - The average of d at site1 in pos. N (first in alphabet) is 1378.
> - Average values for site2, 3, 4 (especially 4) in pos. N deviate
> significantly from site1. For instance, values at site4 are on
> average 17 greater than at site1.
> - The average value at site5 does not differ significantly from site1.
> OK, that was the top part of the result table. Now the bottom part:
> - In reverse position(posR) the average of d at site1 is 4.5 bigger,
> but that's not significant.
> - The average of d at site3:posR is 10.5 smaller than something, but
> smaller than what? And why does this -10.5 deviation have a p-value of
> .1 (not significant) vs the .02 (significant) deviation of 11.5
> (site2, top part)?
>
> Let's see if I can figure that out. Difference between posN and posR
> at site3 is not so big:
> > mean(d[site==3&pos=="R"])-mean(d[site==3&pos=="N"])
> [1] -6
> Is this what makes it insignificant?
>
> Shuffling around the numbers until I get to -10.5:
>
> > mean(d[site==3&pos=="R"])-mean(d[site==3&pos=="N"])-(mean(d[site==1&pos=="R"])-mean(d[site==1&pos=="N"]))
> [1] -10.5
>
> OK, one has to keep track of all the differences and stuff.
>
> So I think I have understood about 80% of this simple example. The
> reason I'm going after this so stubbornly is that I'm at the beginning
> of a DOE which will take several weeks of measuring and will end up
> being analyzed with a big ANOVA (two response and about six
> explanatory variables, some continuous, some factorial). Already in
> the DOE phase I want to understand what I will be doing with the data
> later (this is for a Six Sigma project in an industrial production
> environment, in case anybody wants to know).
>
> Thanks,
> robert
>
> Here's the full dataset:
>
> x<- structure(list(d = c(1383L, 1377L, 1388L, 1373L, 1386L, 1394L,
> 1386L, 1393L, 1390L, 1382L, 1386L, 1390L, 1395L, 1396L, 1392L,
> 1395L, 1378L, 1382L, 1379L, 1380L), site = structure(c(1L, 1L,
> 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L,
> 5L, 5L), .Label = c("1", "2", "3", "4", "5"), class = "factor"),
>      pos = structure(c(1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
>      2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L), .Label = c("N",
>      "R"), class = "factor")), .Names = c("d", "site", "pos"),
>      row.names = c(NA, -20L), class = "data.frame")
> attach(x)
> head(x)
> boxplot(d~pos+site)
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
