I'm running lmer repeatedly on artificial data with two fixed factors (called
'gender' and 'stress') and one random factor ('speaker'). Gender is a
between-speaker variable and stress is a within-speaker variable, if that
matters.
Each dataset has 100 rows from each of 20 speakers, 2000 rows in all.
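
For anyone who wants to see the layout without downloading anything, a dataset
of this shape could be simulated roughly as below. This is only a sketch: the
seed, the level labels, the effect sizes, and the speaker standard deviation
are all assumptions made for illustration, not the values behind my own data.

```r
set.seed(1)                      # arbitrary seed (assumption)
n_speakers    <- 20
n_per_speaker <- 100
n_rows        <- n_speakers * n_per_speaker

speaker <- factor(rep(seq_len(n_speakers), each = n_per_speaker))

# Between-speaker factor: every row of a given speaker has the same gender.
speaker_gender <- rep(c("f", "m"), length.out = n_speakers)
gender <- factor(speaker_gender[as.integer(speaker)])

# Within-speaker factor: both levels occur within each speaker.
stress <- factor(rep(c("stressed", "unstressed"), length.out = n_rows))

# Hypothetical effects on the logit scale, plus a random speaker intercept.
speaker_intercept <- rnorm(n_speakers, mean = 0, sd = 0.5)
eta <- -0.2 + 0.3 * (gender == "m") + 0.5 * (stress == "stressed") +
  speaker_intercept[as.integer(speaker)]

outcome <- rbinom(n_rows, size = 1, prob = plogis(eta))
sim <- data.frame(speaker, gender, stress, outcome)
```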

About 5% of the time I get a strange result: the lmer() model with BOTH
fixed factors and the random factor ('gs_s') has a MUCH lower log-likelihood
than the models with ONE fixed factor and the random factor ('g_s' and
's_s'), and also than the glm() model with both fixed factors and no random
factor ('gs').

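This doesn't make much sense to me, since 'gs_s' contains 'g_s' and 's_s' as
special cases and so should fit at least as well as either of them.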

I've placed a dataset on the Web that exhibits this behavior, as follows:

dat <- read.csv("http://www.ling.upenn.edu/~johnson4/strange.csv")

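library(lme4)  # provides lmer()

gs <- glm(outcome ~ gender + stress, family = binomial, data = dat)
g_s <- lmer(outcome ~ gender + (1 | speaker), data = dat, family = binomial)
s_s <- lmer(outcome ~ stress + (1 | speaker), data = dat, family = binomial)
gs_s <- lmer(outcome ~ gender + stress + (1 | speaker), data = dat, family = binomial)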

logLik(gs)         #  -1344 (df=3)
logLik(g_s)        #  -1342 (df=3)
logLik(s_s)        #  -1314 (df=3)
logLik(gs_s)       # -11823 (df=4)

This seems like an error of some kind. The glm() model with both fixed effects
is well-behaved, but lmer() seems to be going haywire when confronted with the
same situation plus the random effect.
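
Since I am fitting many datasets in a loop, I could flag the bad fits
automatically with something like the sketch below. The helper name
flag_bad_fit and the tolerance of 1 log-likelihood unit are my own
assumptions; the numbers are the log-likelihoods reported above.

```r
# Hypothetical helper: TRUE when the full model's log-likelihood falls below
# that of the best nested model by more than `tol`. The tolerance is an
# assumed allowance for approximation error in the fitting routine.
flag_bad_fit <- function(ll_full, ll_nested, tol = 1) {
  ll_full < max(ll_nested) - tol
}

# Log-likelihoods reported above.
ll_nested <- c(g_s = -1342, s_s = -1314)
ll_full   <- -11823  # gs_s

flag_bad_fit(ll_full, ll_nested)  # TRUE
```

In the loop, the same call could take as.numeric(logLik(gs_s)) and the
corresponding values for g_s and s_s in place of the hard-coded numbers.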

Could anyone advise me how to stop this from happening, and/or explain why it
happens?

Thanks very much,
Daniel

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
