Andy Robertson <ar313 <at> exeter.ac.uk> writes:

> I am using lmer combined with AIC model selection and averaging (in the
> MuMIn package) to try and assess how isotope values (which indicate diet)
> vary within a population of animals.
>
> I have multiple measures from individuals (variable 'Tattoo') and multiple
> individuals within social groups within 4 locations (A, B, C, D). Crucially,
> I am interested in whether there are differences between sexes and age
> classes (variable AGECAT2) and whether this differs with location.
>
> However, whether or not I get a significant sex:location interaction depends
> on which location is my reference level, and I cannot understand why this is
> the case. It seems to be due to the fact that the standard error associated
> with my interactions varies depending on which level is the reference.
>
> Any help or advice would be appreciated,
>
> Andrew Robertson
This is all a little overwhelming. I appreciate that you are trying to be
thorough, but there's an awful lot to look at here ... I will give comments
up to the point where I ran out of time.

> Below is example code of what I am doing, and an example of the model
> summary and model averaging results with location A or location B as the
> reference level.
>
> If A is the reference level...
>
> # full model
> Amodel <- lmer(d15N ~ (AGECAT2 + Sex + Location1 + AGECAT2:Location1 +
>                Sex:Location1 + AGECAT2:Sex + (1|Year) +
>                (1|Location1/Socialgroup/Tattoo)),
>                REML = FALSE, data = nocubs)

Note that you have Location in your model twice, once as a fixed effect and
once as a random effect. This is bound to lead to trouble. If you use
(1|Location1:Socialgroup) and (1|Location1:Socialgroup:Tattoo) you will get
the random effects you want without also incorporating a random effect of
Location1. You could specify the fixed effects as (AGECAT2+Sex+Location1)^2
if you wanted (it would be equivalent to this specification).

> # standardise model
> Amodels <- standardize(Amodel, standardize.y = FALSE)

Is this from the 'rockchalk' package? Do you know that it isn't doing
something funny?

> # dredge models
> summary(model.avg(get.models(Adredge, cumsum(weight) < 0.95)))
>
> Then the average model coefficients indicate no sex by location interaction.
>
> Component models:
>       df  logLik    AICc Delta Weight
> 235   13 -765.33 1557.28  0.00   0.68
> 1235  15 -764.55 1559.91  2.63   0.18
> 3      9 -771.64 1561.57  4.29   0.08
> 12345 17 -763.67 1562.37  5.09   0.05
>
> Term codes:
>  AGECAT2  c.Sex  Location1  AGECAT2:c.Sex  c.Sex:Location1
>        1      2          3              4                5

What is c.Sex? "Centered sex" (e.g. -1 for males and +1 for females)?

In general I think it is a bad idea to model-average sets of models some of
which contain interactions, because (unless the design is perfectly balanced
and the contrasts are set to sum-to-zero contrasts) the meaning of the main
effects changes between models. In a model with an interaction (assuming
sum-to-zero contrasts), a main effect represents the average effect across
groups using equal weights: for example, the main effect of sex would be the
mean of the male and female predictions. In a model without the interaction,
the main effect represents the average across groups weighted by the number
of individuals per group ...

> Model-averaged coefficients:
>                        Estimate Std. Error z value Pr(>|z|)
> (Intercept)            8.673592   0.474524  18.279   <2e-16 ***
> c.Sex                  0.095375   0.452065   0.211    0.833
> Location1B            -3.972882   0.556575   7.138   <2e-16 ***
> Location1C            -3.633331   0.531858   6.831   <2e-16 ***
> Location1D            -3.348665   0.539143   6.211   <2e-16 ***
> c.Sex:Location1B      -0.372653   0.513492   0.726    0.468
> c.Sex:Location1C       0.428299   0.511254   0.838    0.402
> c.Sex:Location1D      -0.757582   0.512586   1.478    0.139
> AGECAT2OLD            -0.179772   0.150842   1.192    0.233
> AGECAT2YEARLING       -0.009596   0.132328   0.073    0.942
> AGECAT2OLD:c.Sex       0.045963   0.296471   0.155    0.877
> AGECAT2YEARLING:c.Sex -0.323985   0.268919   1.205    0.228

You should not test terms involving categorical variables (e.g. sex:location)
by looking at all of the individual parameter z-values, but by comparing
models with and without the term. This gets harder when you are doing model
averaging. In general I would say that model averaging, and
information-theoretic approaches in general, are best for *prediction*, while
good old-fashioned frequentist approaches are best for *hypothesis testing*,
which seems to be what you are trying to do ...
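For example (just a sketch, untested; I am assuming your data frame really is
called nocubs and the variables are named as in your post), you could test the
sex-by-location term as a whole with a likelihood-ratio test on a single
fitted model:

  library(lme4)

  ## fixed effects as in your full model; random effects written out so that
  ## Location1 is not also included as a grouping factor on its own
  full <- lmer(d15N ~ (AGECAT2 + Sex + Location1)^2 +
                   (1 | Year) +
                   (1 | Location1:Socialgroup) +
                   (1 | Location1:Socialgroup:Tattoo),
               data = nocubs, REML = FALSE)

  ## the same model with the sex-by-location term dropped
  reduced <- lmer(d15N ~ AGECAT2 + Sex + Location1 +
                      AGECAT2:Location1 + AGECAT2:Sex +
                      (1 | Year) +
                      (1 | Location1:Socialgroup) +
                      (1 | Location1:Socialgroup:Tattoo),
                  data = nocubs, REML = FALSE)

  ## likelihood-ratio test of sex:location as a whole (3 df here); this
  ## comparison does not depend on which location is the reference level
  anova(reduced, full)

The chi-square from such a comparison can still be somewhat anti-conservative
with small samples (related to the Z-test point below), but at least it tests
the term as a whole rather than one contrast at a time.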
Also note that the summary is giving you the results of Z-tests, which do not
take the finite size of the data set into account.

> And the full model summary looks like this ...
>
> Linear mixed model fit by maximum likelihood
> Formula: d15N ~ (AGECAT2 + Sex + Location1 + AGECAT2:Location1 +
>     Sex:Location1 + AGECAT2:Sex + (1 | Year) + (1 | Location1/Socialgroup/Tattoo))
>    Data: nocubs
>   AIC  BIC logLik deviance REMLdev
>  1568 1670 -761.1     1522    1534
>
> Random effects:
>  Groups                         Name        Variance Std.Dev.
>  Tattoo:(Socialgroup:Location1) (Intercept) 0.35500  0.59582
>  Socialgroup:Location1          (Intercept) 0.35620  0.59682
>  Location1                      (Intercept) 0.00000  0.00000
>  Year                           (Intercept) 0.00000  0.00000
>  Residual                                   0.49584  0.70416

Note here that you're getting zero variance estimates for location and year,
and almost identical variances for the other two random effects (which looks
a little fishy to me, but I can't quite say that it's wrong).

> Number of obs: 608, groups: Tattoo:(Socialgroup:Location1), 132;
> Socialgroup:Location1, 22; Location1, 4; Year, 2

Trying to fit a 4-level (or, even more extremely, a 2-level) factor as a
random effect is almost guaranteed to give you zero variance estimates. I
would strongly consider fitting Location and Year as fixed effects (you can
still include social group within location and individual within social
group as random effects). (See the point above about how to exclude Location
from the random effects.)
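For concreteness, something like this is what I have in mind (again untested,
same assumed variable names; wrap Year in factor() if it is stored as a
number):

  ## Location1 and Year as fixed effects; social group within location and
  ## individual within social group stay as random effects
  Amodel2 <- lmer(d15N ~ (AGECAT2 + Sex + Location1)^2 + factor(Year) +
                      (1 | Location1:Socialgroup) +
                      (1 | Location1:Socialgroup:Tattoo),
                  data = nocubs, REML = FALSE)

With only two years and four locations you lose very little by treating them
as fixed, and you avoid the boundary (zero-variance) estimates you are seeing.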
> Fixed effects:
>                             Estimate Std. Error t value
> (Intercept)                  8.83179    0.52961  16.676
> AGECAT2OLD                  -0.44101    0.41081  -1.074
> AGECAT2YEARLING              0.01805    0.38698   0.047
> SexMale                     -0.11346    0.51239  -0.221
> Location1B                  -3.97880    0.63063  -6.309
> Location1C                  -4.04816    0.60404  -6.702
> Location1D                  -3.36389    0.63304  -5.314
> AGECAT2OLD:Location1B        0.44198    0.54751   0.807
> AGECAT2YEARLING:Location1B  -0.22134    0.52784  -0.419
> AGECAT2OLD:Location1C        0.20684    0.50157   0.412
> AGECAT2YEARLING:Location1C   0.24132    0.47770   0.505
> AGECAT2OLD:Location1D        0.53653    0.52778   1.017
> AGECAT2YEARLING:Location1D   0.51755    0.51038   1.014
> SexMale:Location1B          -0.02442    0.57546  -0.042
> SexMale:Location1C           0.74680    0.58128   1.285
> SexMale:Location1D          -0.41800    0.59505  -0.702
> AGECAT2OLD:SexMale          -0.08907    0.32513  -0.274
> AGECAT2YEARLING:SexMale     -0.40146    0.30409  -1.320
>
> If location B is the reference level, then the average model coefficients
> indicate an age by sex interaction in location C.

??? Do you mean an effect of sex in location C? I don't see where the
interaction with age comes in ... Also note that you seem to have changed
from "c.Sex" (a continuous variable, according to the model summary) to "Sex"
(a factor with "Female" as the first level and "Male" as the second). Is that
responsible for the differences you are seeing?

> Component models:
>       df  logLik    AICc Delta Weight
> 235   13 -765.33 1557.28  0.00   0.68
> 1235  15 -764.55 1559.91  2.63   0.18
> 3      9 -771.64 1561.57  4.29   0.08
> 12345 17 -763.67 1562.37  5.09   0.05
>
> Term codes:
>  AGECAT2  c.Sex  Location2  AGECAT2:c.Sex  c.Sex:Location2
>        1      2          3              4                5
>
> Model-averaged coefficients:
>                        Estimate Std. Error z value Pr(>|z|)
> (Intercept)            4.700710   0.294275  15.974   <2e-16 ***
> c.Sex                 -0.277278   0.248093   1.118   0.2637
> Location2A             3.972882   0.556575   7.138   <2e-16 ***
> Location2C             0.339551   0.379873   0.894   0.3714
> Location2D             0.624217   0.390063   1.600   0.1095
> c.Sex:Location2A       0.372653   0.513492   0.726   0.4680
> c.Sex:Location2C       0.800952   0.345898   2.316   0.0206 *
> c.Sex:Location2D      -0.384929   0.346832   1.110   0.2671
> AGECAT2OLD            -0.179772   0.150842   1.192   0.2333
> AGECAT2YEARLING       -0.009596   0.132328   0.073   0.9422
> AGECAT2OLD:c.Sex       0.045963   0.296471   0.155   0.8768
> AGECAT2YEARLING:c.Sex -0.323985   0.268919   1.205   0.2283

Stopped here ...

In general it's not surprising that the apparent effect, measured the way you
have parameterized it, changes with the parameterization. The parameters mean
different things and are using a different baseline ... A lot of this is
basic (although not easy) stuff about parameterization.
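If you want to convince yourself that this is purely a parameterization
issue, relevel the location factor and refit (a sketch, untested, with the
same assumed names; this assumes Location1 is a factor): the fit is
identical, only the meaning of the individual coefficients changes.

  ## location B as the reference level instead of A
  nocubs$Location2 <- relevel(nocubs$Location1, ref = "B")

  fitA <- lmer(d15N ~ (AGECAT2 + Sex + Location1)^2 +
                   (1 | Year) +
                   (1 | Location1:Socialgroup) +
                   (1 | Location1:Socialgroup:Tattoo),
               data = nocubs, REML = FALSE)

  fitB <- lmer(d15N ~ (AGECAT2 + Sex + Location2)^2 +
                   (1 | Year) +
                   (1 | Location2:Socialgroup) +
                   (1 | Location2:Socialgroup:Tattoo),
               data = nocubs, REML = FALSE)

  logLik(fitA)  ## identical log-likelihoods: it is the same model ...
  logLik(fitB)

  fixef(fitA)   ## ... but each sex:location coefficient compares the sex
  fixef(fitB)   ## difference in that location with the sex difference in the
                ## reference location, so the estimates and standard errors
                ## change when the reference changes

The whole-term likelihood-ratio test from the earlier sketch gives the same
answer under either parameterization, which is the safer way to ask whether
sex effects differ among locations.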