On Jun 14, 2012, at 2:17 PM, Alaska_Man wrote:


Dr. Winsemius,
Really quick, a BACI is a Before-After-Control-Impact approach. I have a long time series of sea cucumber density estimates, which are taken at the same location(s) through time. Some are in areas Impacted by sea otters and some are in areas Not Impacted by sea otters (two levels). Each estimate is also coded with a "B" if it is Before the time sea otters showed up at the impacted sites or "A" if after (Impact and Control sites are both coded with BA).

So you probably need a mixed-effects analysis because you have repeated measures (how many?) in the same location. (How many locations?)

BACI analyses suggest an impact if the ANOVA interaction term (BA*Otter) is significant; i.e., changes in sea cucumber density from before to after depend on whether sea otters are present. I log transformed the response to help normalize the data, as it has many zeros.

I'm not sure that makes good sense. I don't think those are structural zeros. Knowing how difficult it is to find sea critters, I think you still have a significant possibility that one or more sea cumcumbers was missed in those sites with measured zero values. For one thing the log of 0 is not a well defined value. For another thing I think it inflates the impact of small numbers on the inferential statistic, And for a third thing, the interpretation of effects gets all confused. You are not really interested in a ratio effect measure, at least I wouldn't. Far preferable would be to use some type of robust statistic to handle the inference issues and keep the estimates on a linear scale, perhaps using bootstrap methods. Davison and Hinkley have pre-post designs in their "practicals".

While shapiro-wilks does not suggest normality, it is a very large data set and it is "approximately" normal based on graphical examination. Again, the data is unbalanced, as there are many more estimates at the control sites and before period.

So you should not be using aov, since that method assumes balance. Regression methods should be appropriate.

With that said... I would like to perform the following pairwise comparisons; B:With Otters v. A:With Otters, B:Without Otters v. A:Without Otters I am performing other ANOVAs with different data and no interaction, where I would like to perform multiple pairwise comparisons between the fivelevels of a single factor. I used the code I provided previously and still received error messages. If I can get this BACI interaction problem figured out, I should manage to adjust it to other models. I recently came accross Dunnett's Modified Tukey Kramer (DTK.test) and it appears to address the same issue of unbalanced data and has a very straightforward script (although I am not sure it lends itself to interactions?). Is this test an appropriate substitute for the glht method? You wrote, "When your factors are both binary, the effect estimates fit nicely into a 2 x 2 table and the consideration of the single contrast added by the interaction is fairly simple" wool=='A' wool=='B' tension=='L' 3.7179 -0.4356+3.7179 tension=='M' -0.6012+3.7179 0.6281+3.7179 This output seems to be what I am looking for; assuming that if the value range for a comparison includes zero, then there is no significant difference?

Those are the predicted levels of log(breaks) at various combinations of wool and tension. You can pretty much always create such a table from the coefficients in a linear model. (Since you used glm() without a family argument you got a linear link.)


 Where did those values come from?

I just read them off the output of print(model) and added the appropriate contrasts to the baseline "Intercept" which applies to the wool=="A" and tension=="L" category . With R's default treatment contrasts, all coefficients are referenced to the Intercept, and so you need to add back each of the coefficients to get estimates for the separate groups.

I hope this helps clear up my problem. If you have concerns about pitfalls with this approach, then I would love to hear them and I can research them outside of this thread.

I would think this should be discussed with your advisor. If s/he thinks its appropriate to get further Internet-mediated advice, then you should go either to stats.stackexchange or the R-SIG-ME mailing list where they have better minds than mine to bring to bear on designs that are hierarchal.

 This is part of a masters thesis and needs to be sound.


Thank you very much for your time.
Sean

Date: Thu, 14 Jun 2012 10:26:58 -0700
From: ml-node+s789695n4633417...@n4.nabble.com
To: seanlars...@hotmail.com
Subject: Re: Tukey Kramer with ANOVA (glm)



        
On Jun 13, 2012, at 7:36 PM, Alaska_Man wrote:


Hello,



I am performing a BACI analysis with ANOVA using the following glm:


I admit I had no idea what a "BACI analysis" might be. Looking it up

it appears to be a cross-over design and my statistical betters have

sternly warned me about this regression briar patch in the past. I'm

especially suspicious of the lack of any statements about the balance

in the sampling in your presentation. (And for that matter the

extremely sketchy statement of design.)




fit1<-glm(log(Cucs_m+1)~(BA*Otter)+BA+Otter+ID+Primary, data=b1)


I'm guessing you do not understand that BA*Otter in an R formula

expands to BA + Otter + BA:Otter


The summary(aov(fit1)) shows significance in the interaction;

however, now I

would like to determine what combinations of BA and Otter are

significantly

different (each factor has two levels). ID and PRIMARY substrates are

categorical and included in the model to help explain some of the

variation

in the data.  The data is unbalanced so I plan on using Tukey Kramer

post

hoc analysis.  Here is how my data is laid out, it is a fairly

substantial

data set:

Editing done on original (although it proved unrevealing.)

Subdistrict  T   Year  Cucs_m  Primary Persistence   Otter

Fishing    BA         ID

109-41,42    9  2010   0.00     sil           3      1

1        A   109-41,42

109-41,42   13  2010   2.75     rck           3      1

1        A   109-41,42

109-41,42   16  2010   2.00     rck           3      0

1        A   109-41,42

109-41,42   18  2010   8.25     rck           3      0

0        B   109-41,42



I am assuming this is an appropriate pairwise comparison analysis

and I

cannot get the code to work with my data.

What does it mean to be doing "pairwise comparisons" on two-level

factor variables?)


I am *unclear how to code it to

work with the interaction*; however, even when I attempt to use it

only for

a single factor, it does not work (see below).



x<-aov(glm(Cucs_m~as.factor(BA),data=cuc))

glht(x, linfct=mcp(BA="Tukey"))

....................................

Error in mcp2matrix(model, linfct = linfct) :

Variable(s) ‘BA’ have been specified in ‘linfct’ but cannot be

found in

‘model’!

I suspect the glht() function is looking for 'as.factor(BA)` in the

model matrix and not finding it. If BA is not already a factor, then

it would make sense to do:


cuc$BA <- factor(cuc$BA)


.... before any analysis. Notice that you get a warning that

performing contrasts in the presence of interactions is something to

be warned about. If you do not know what you are doing here (and your

proposed analysis hints at that possibility), I may have set a trap

for you by solving a syntactic problem but not solving a conceptual

problem.


mod <- glm(log(breaks) ~ wool*tension, data=subset(warpbreaks,

tension %in% c("L","M")))

glht(mod, linfct=mcp(tension="Tukey"))


        General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Linear Hypotheses:

           Estimate

M - L == 0  -0.6012


Warning message:

In mcp2matrix(model, linfct = linfct) :

  covariate interactions found -- default contrast might be

inappropriate

---------------

Looking at the mod-object you see that the "Estimate" above is

actually NOT what you had interest in. You were presumably more

interested in the contrast woolB:tensionM whose coefficient was 0.6281.

----


Coefficients:

   (Intercept)           woolB        tensionM  woolB:tensionM

        3.7179         -0.4356         -0.6012          0.6281

----------


I would have instead done something like this:


mod <- glm(log(breaks) ~ wool*tension, data=subset(warpbreaks,

tension %in% c("L","M")))

mod2 <- glm(log(breaks) ~ wool+tension, data=subset(warpbreaks,

tension %in% c("L","M")))


anova(mod,mod2)

Analysis of Deviance Table


Model 1: log(breaks) ~ wool * tension

Model 2: log(breaks) ~ wool + tension

  Resid. Df Resid. Dev Df Deviance

1        32     4.6235

2        33     5.5113 -1 -0.88777


Now I can say that the addition of an interaction term resulted in a

non-significant improvement in model fit at least when measured on the

log(breaks) scale. (Note: This is quite a different result than one

sees on the untransformed scale where the interaction is highly

significant.) When your factors are both binary, the effect estimates

fit nicely into a 2 x 2 table and the consideration of the single

contrast added by the interaction is fairly simple.


                         wool=='A'                  wool=='B'

 tension=='L'         3.7179                      -0.4356+3.7179

 tension=='M'     -0.6012+3.7179          0.6281+3.7179


Can anyone off suggestions on potential problems with my approach

and/or

script issues?


Why was the log transformation being done? Is the desired outcome a

statement about ratios?


--


David Winsemius, MD

West Hartford, CT


______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


        
        

        

        
        
If you reply to this email, your message will be added to the discussion below:
                
http://r.789695.n4.nabble.com/Tukey-Kramer-with-ANOVA-glm-tp4633314p4633417.html
        
        
                
                To unsubscribe from Tukey Kramer with ANOVA (glm), click here.

                NAML
                                                

--
View this message in context: 
http://r.789695.n4.nabble.com/Tukey-Kramer-with-ANOVA-glm-tp4633314p4633435.html
Sent from the R help mailing list archive at Nabble.com.
        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to