Re: [R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?

Simon Wood Thu, 19 Jul 2012 02:26:54 -0700

Martin,

I've just submitted mgcv_1.7-19 to CRAN, which includes a major upgradeof the p-value computation for random effect terms (and any other smoothterm which can be penalized to zero as part of estimation). The newp-values are still conditional on the smoothing parameter/variancecomponent estimates, but given those estimates are based on the nulldistribution of the restricted likelihood ratio statistic. Basically, ifyou are going to condition on the smoothing parameters, then this is theright thing to do, but if smoothing parameter estimates are veryuncertain, then the p-values may be underestimates (in simulations theunderestimation seems to be rather modest).

In the Gaussian mixed additive model case, the alternative is to use theRLRsim package, while for GLMMs the only way to further improve thingsis by writing code to simulate for the null distribution of the teststatistic...


Thanks for pushing be into action on this!

best,
Simon


On 25/06/12 16:31, Simon Wood wrote:

Martin,

I had a nagging feeling that there must be more to this than my previous
reply suggested, and have been digging a bit further. Basically I would
expect these p-values to not be great if you had nested random effects
(such as main effects + their interaction), but your case looked rather
straightforward...

After some digging it turns out that the p-value for Placename is wrong
in this case, as you suspected. R routine `qr' has a `tol' argument that
by default is set to 1e-7. In the computation of the test statistic for
"re" terms I had called qr without changing this default to the value 0
that is actually appropriate (I should have noticed this, I've been
caught out by it before). With the correct 'tol', Placement is
non-significant. This will be fixed in the next release, but that won't
happen until I've tested the random effects p-value approximations a bit
more thoroughly.

best,
Simon



On 11/06/12 14:56, Martijn Wieling wrote:

Dear Simon,

I ran an additional analysis using bam (mgcv 1.7-17) with three random
intercepts and no non-linearities, and compared these to the results
of lmer (lme4). Using bam results in a significant random intercept
(even though it has a very low edf-value), while the lmer results show
no variance associated to the random intercept of Placename. Should I
drop the random intercept of Placename and if so, how is this apparent
from the results of bam?

Summaries of both models are shown below.

With kind regards,
Martijn

#### l1 = lmer(RefPMIdistMeanLog.c ~ geogamfit + (1|Word) + (1|Key) +
(1|Placename), data=wrddst); print(l1,cor=F)

Linear mixed model fit by REML
Formula: RefPMIdistMeanLog.c ~ geogamfit + (1 | Word) + (1 | Key) + (1
|      Placename)
    Data: wrddst
     AIC    BIC logLik deviance REMLdev
  -44985 -44927  22498   -45009  -44997
Random effects:
  Groups    Name        Variance  Std.Dev.
  Word      (Intercept) 0.0800944 0.283009
  Key       (Intercept) 0.0013641 0.036933
  Placename (Intercept) 0.0000000 0.000000
  Residual              0.0381774 0.195390
Number of obs: 112608, groups: Word, 357; Key, 320; Placename, 40

Fixed effects:
             Estimate Std. Error t value
(Intercept) -0.00342    0.01513   -0.23
geogamfit    0.99249    0.02612   37.99


#### m1 = bam(RefPMIdistMeanLog.c ~ geogamfit + s(Word,bs="re") +
s(Key,bs="re") + s(Placename,bs="re"), data=wrddst,method="REML");
summary(m1,freq=F)

Family: gaussian
Link function: identity

Formula:
RefPMIdistMeanLog.c ~ geogamfit + s(Word, bs = "re") + s(Key,
     bs = "re") + s(Placename, bs = "re")

Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.00342    0.01513  -0.226    0.821
geogamfit    0.99249    0.02612  37.991<2e-16 ***
---
Signif. codes:  0 â***â 0.001 â**â 0.01 â*â 0.05 â.â 0.1 â â 1

Approximate significance of smooth terms:
                    edf Ref.df       F p-value
s(Word)      3.554e+02    347 634.716<2e-16 ***
s(Key)       2.946e+02    316  23.054<2e-16 ***
s(Placename) 1.489e-04     38   7.282<2e-16 ***
---
Signif. codes:  0 â***â 0.001 â**â 0.01 â*â 0.05 â.â 0.1 â â 1

R-sq.(adj) =  0.693   Deviance explained = 69.4%
REML score = -22498  Scale est. = 0.038177  n = 112608


On Wed, May 23, 2012 at 11:30 AM, Simon Wood-4 [via R]
<ml-node+s789695n4631060...@n4.nabble.com>  wrote:

Having looked at this further, I've made some changes in mgcv_1.7-17 to
the p-value computations for terms that can be penalized to zero during
fitting (e.g. s(x,bs="re"), s(x,m=1) etc).

The Wald statistic based p-values from summary.gam and anova.gam (i.e.
what you get from e.g. anova(a) where a is a fitted gam object) are
quite well founded for smooth terms that are non-zero under full
penalization (e.g. a cubic spline is a straight line under full
penalization). For such smooths, an extension of Nychka's (1988) result
on CI's for splines gives a well founded distributional result on which
to base a Wald statistic. However, the Nychka result requires the
smoothing bias to be substantially less than the smoothing estimator
variance, and this will often not be the case if smoothing can actually
penalize a term to zero (to understand why, see argument in appendix of
Marra&  Wood, 2012, Scandinavian Journal of Statistics, 39,53-74).

Simulation testing shows that this theoretical concern has serious
practical consequences. So for terms that can be penalized to zero,
alternative approximations have to be used, and these are now
implemented in mgcv_1.7-17 (see ?summary.gam).

The approximate test performed by anova(a,b) (a and b are fitted "gam"
objects) is less well founded. It is a reasonable approximation when
each smooth term in the models could in principle be well approximated
by an unpenalized term of rank approximately equal to the edf of the
smooth term, but otherwise the p-values produced are likely to be much
too small. In particular simulation testing suggests that the test is
not to be trusted with s(...,bs="re") terms, and can be poor if the
models being compared involve any terms that can be penalized to zero
during fitting. (Although the mechanisms are a little different, this is
similar to the problem we would have if the models were viewed as
regular mixed models and we tried to use a GLRT to test variance
components for equality to zero).

These issues are now documented in ?anova.gam and ?summary.gam...

Simon

On 08/05/12 15:01, Martijn Wieling wrote:

Dear useRs,

I am using mgcv version 1.7-16. When I create a model with a few
non-linear terms and a random intercept for (in my case) country using
s(Country,bs="re"), the representative line in my model (i.e.
approximate significance of smooth terms) for the random intercept
reads:
                          edf       Ref.df     F          p-value
s(Country)       36.127 58.551   0.644    0.982

Can I interpret this as there being no support for a random intercept
for country? However, when I compare the simpler model to the model
including the random intercept, the latter appears to be a significant
improvement.

anova(gam1,gam2,test="F")

Model 1: ....
Model 2: .... + s(BirthNation, bs="re")
    Resid. Df Resid. Dev     Df Deviance      F    Pr(>F)
1    789.44     416.54
2    753.15     373.54 36.292   43.003 2.3891 1.225e-05 ***

I hope somebody could help me in how I should proceed in these
situations. Do I include the random intercept or not?

I also have a related question. When I used to create a mixed-effects
regression model using lmer and included e.g., an interaction in the
fixed-effects structure, I would test if the inclusion of this
interaction was warranted using anova(lmer1,lmer2). It then would show
me that I invested 1 additional df and the resulting (possibly
significant) improvement in fit of my model.

This approach does not seem to work when using gam. In this case an
apparent investment of 1 degree of freedom for the interaction, might
result in an actual decrease of the degrees of freedom invested by the
total model (caused by a decrease of the edf's of splines in the model
with the interaction). In this case, how would I proceed in
determining if the model including the interaction term is better?

With kind regards,
Martijn Wieling

--
*******************************************
Martijn Wieling
http://www.martijnwieling.nl
[hidden email]
+31(0)614108622
*******************************************
University of Groningen
http://www.rug.nl/staff/m.b.wieling

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


________________________________
If you reply to this email, your message will be added to the discussion
below:
http://r.789695.n4.nabble.com/mgcv-inclusion-of-random-intercept-in-model-based-on-p-value-of-smooth-or-anova-tp4617585p4631060.html

To unsubscribe from mgcv: inclusion of random intercept in model -
based on
p-value of smooth or anova?, click here.
NAML



--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?

Reply via email to