Martijn,
I don't think there is one right answer to this. If you look at things
in the way that one would usually view a smooth model then m2 is both
simpler (lower EDF) and fits better, so is simply a better model (if the
simpler model fits better then why would you not use it?).
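To make the "lower EDF and fits better" comparison concrete, here is a minimal sketch on simulated data. The data frame dat, its columns and the simulation settings are assumptions made up for illustration; they are not Martijn's data, so the numbers will not match his.

```r
library(mgcv)

# Hypothetical simulated data: 20 groups (X), one weak covariate (A)
set.seed(1)
n_grp <- 20
dat <- data.frame(X = factor(rep(seq_len(n_grp), each = 10)))
dat$A <- rnorm(nrow(dat))
b <- rnorm(n_grp)                      # true random intercepts
dat$Y <- 0.1 * dat$A + b[as.integer(dat$X)] + rnorm(nrow(dat))

m1 <- gam(Y ~ s(X, bs = "re"), data = dat)
m2 <- gam(Y ~ A + s(X, bs = "re"), data = dat)

# Total effective degrees of freedom (sum over all coefficients)
edf_m1 <- sum(m1$edf)
edf_m2 <- sum(m2$edf)

# Residual deviance as a simple measure of fit (lower is better)
dev_m1 <- deviance(m1)
dev_m2 <- deviance(m2)

print(c(edf_m1 = edf_m1, edf_m2 = edf_m2))
print(c(dev_m1 = dev_m1, dev_m2 = dev_m2))
```

If edf_m2 is below edf_m1 and dev_m2 is below dev_m1, then m2 is both simpler and better fitting in the sense described above.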
But of course `simpler' depends on whether you view the random effect as
counting for one parameter, or for its `effective degrees of freedom'.
If it's the former then you should probably fit models using
method="ML" and compare via a GLRT (generalized likelihood ratio test)
using the ML score, or simply drop the fixed effect if its p-value
according to anova(m2) is too high.
I would not use anova(m1,m2) in this case, because of the difficulty in
interpreting the random effects as being equivalent to un-penalized
effects with rank equal to the random effect edfs.
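The two options above might be sketched roughly as follows. This assumes simulated data (the data frame dat and its settings are hypothetical) and treats the random effect as one variance parameter present in both models, so that the models differ by a single parameter (the coefficient of A) and the reference distribution is chi-squared with 1 degree of freedom. It relies on the gcv.ubre component of a fitted gam object, which for method="ML" holds the minimized negative log marginal likelihood.

```r
library(mgcv)

# Hypothetical simulated data: 20 groups (X), one weak covariate (A)
set.seed(1)
n_grp <- 20
dat <- data.frame(X = factor(rep(seq_len(n_grp), each = 10)))
dat$A <- rnorm(nrow(dat))
b <- rnorm(n_grp)                      # true random intercepts
dat$Y <- 0.1 * dat$A + b[as.integer(dat$X)] + rnorm(nrow(dat))

# Fit both models by ML so the marginal likelihoods are comparable
# across different fixed-effect structures
m1 <- gam(Y ~ s(X, bs = "re"), data = dat, method = "ML")
m2 <- gam(Y ~ A + s(X, bs = "re"), data = dat, method = "ML")

# Option 1: GLRT on the ML scores (negative log marginal likelihoods)
ml1 <- as.numeric(m1$gcv.ubre)
ml2 <- as.numeric(m2$gcv.ubre)
lr_stat <- 2 * (ml1 - ml2)             # larger means A helps more
p_glrt <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Option 2: p-value of the fixed effect A from the single-model output
print(anova(m2))
p_A <- summary(m2)$p.table["A", 4]     # 4th column holds the p-value

print(c(lr_stat = lr_stat, p_glrt = p_glrt, p_A = p_A))
```

With one extra fixed-effect parameter the two p-values will usually tell a similar story; if A is not supported by either, m1 would be the model to keep.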
best,
Simon
On 11/05/12 17:50, Martijn Wieling wrote:
Dear Simon,
Thanks for your concise reply, this is very helpful.
With respect to my second question, however, I was not entirely clear
- or perhaps I'm misunderstanding your answer. What I meant is:
suppose I have a model with a random effect s(X, bs="re"). Now I want
to test if a certain (fixed-effect) predictor A improves the model.
I therefore compare:
m1 = gam(Y ~ s(X,bs="re"), data=dat)
m2 = gam(Y ~ A + s(X,bs="re"), data=dat)
What I didn't make explicit before is that A in the model summary of
m2 does not reach significance (e.g., p = 0.2). Comparing the models
m1 and m2 shows that m1 is the more complex model (as adding A
decreases the edf invested in the random-effect spline by more than 1),
and m1 is not significantly better than m2. Now my question is, should
I keep m2, even though A is not significant itself? Or should I ignore
the result of anova(m1,m2) anyway, given that this comparison is not
suitable when comparing models including random effects (as you argue
regarding my first question)?
If that is the case and the anova is not usable to compare m1 and m2
due to the random effect parameter, note that the same can occur
without random effects but when a non-linearity is included such as
s(Longitude,Latitude). What then is appropriate: keep m1 (which is
more complex), or use m2 (which has a less complex non-linearity, but
includes an additional non-significant fixed-effect factor)?
With kind regards,
Martijn
--
*******************************************
Martijn Wieling
http://www.martijnwieling.nl
wiel...@gmail.com
+31(0)614108622
*******************************************
University of Groningen
http://www.rug.nl/staff/m.b.wieling
*******************************************
--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603 http://people.bath.ac.uk/sw283