I am using gam from the mgcv package to analyze a dataset with 24 entries :

ran  f1     f2   y
1   3000    5   545
1   3000    10  1045
1   10000   5   536
1   10000   10  770
2   3000    5   842
2   3000    10  2042
2   10000   5   615
2   10000   10  1361
3   3000    5   328
3   3000    10  1028
3   10000   5   262
3   10000   10  722
4   3000    5   349
4   3000    10  665
4   10000   5   255
4   10000   10  470
5   3000    5   680
5   3000    10  1510
5   10000   5   499
5   10000   10  1422
6   3000    5   628
6   3000    10  2062
6   10000   5   499
6   10000   10  2158

The data has two fixed effects (f1 and f2) and one random effect (ran). The 
dependent data is y. Because the dependent data y represents counts and is 
overdispersed, I am using a negative binomial model.

The gam model and its summary output is as follows:

library(mgcv)
summary(gam(y ~ f1 * f2 + s(ran, bs = "re"), data = df2, family = nb, method = 
"REML"))

Family: Negative Binomial(27.376)
Link function: log

Formula:
y ~ f1 * f2 + s(ran, bs = "re")

Parametric coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  5.500e+00  3.137e-01  17.533  < 2e-16 ***
f1          -3.421e-05  3.619e-05  -0.945    0.345
f2           1.760e-01  3.355e-02   5.247 1.55e-07 ***
f1:f2        2.665e-07  4.554e-06   0.059    0.953
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
         edf Ref.df Chi.sq p-value
s(ran) 4.726      5  85.66  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.866   Deviance explained = 93.6%
-REML = 185.96  Scale est. = 1         n = 24

The Wald test from summary gives very high significance for f2 (P = 1.55e-07). 
However, when I test the significance of f2 by comparing two different models 
using anova, I get dramatically different results:

anova(gam(y ~ f1 * f2 + s(ran, bs = "re"), data = df2, family = nb, method = 
"ML"),
gam(y ~ f1 + s(ran, bs = "re"), data = df2, family = nb, method = "ML"),
test="Chisq")

Analysis of Deviance Table

Model 1: y ~ f1 * f2 + s(ran, bs = "re")
Model 2: y ~ f1 + s(ran, bs = "re")
  Resid. Df Resid. Dev      Df Deviance Pr(>Chi)
1    14.843     18.340
2    16.652     21.529 -1.8091   -3.188   0.1752

f2 is no longer significant. The models were changed from REML to ML, as 
recommended for evaluation of fixed effects.

If the interaction is preserved, f2 still remains insignificant using anova:

anova(gam(y ~ f1 + f2 + f1:f2 + s(ran, bs = "re"), data = df2, family = nb, 
method = "ML"),
gam(y ~ f1 + f1:f2 + s(ran, bs = "re"), data = df2, family = nb, method = "ML"),
test="Chisq")
Analysis of Deviance Table

Model 1: y ~ f1 + f2 + f1:f2 + s(ran, bs = "re")
Model 2: y ~ f1 + f1:f2 + s(ran, bs = "re")
  Resid. Df Resid. Dev       Df Deviance Pr(>Chi)
1    14.843     18.340
2    15.645     19.194 -0.80159 -0.85391   0.2855

I would be very grateful for advice on which of these approaches is most 
appropriate. Many thanks!

________________________________

UCLA HEALTH SCIENCES IMPORTANT WARNING: This email (and any attachments) is 
only intended for the use of the person or entity to which it is addressed, and 
may contain information that is privileged and confidential. You, the 
recipient, are obligated to maintain it in a safe, secure and confidential 
manner. Unauthorized redisclosure or failure to maintain confidentiality may 
subject you to federal and state penalties. If you are not the intended 
recipient, please immediately notify us by return email, and delete this 
message from your computer.

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to