Mixed models based on likelihood methods can often handle missing observations within subjects, but they not do well with missing individual elements in the design matrices (think unit nonresponse vs item nonresponse in the survey world). Continuing with the example I recently sent to you
set.seed(0112358) library(nlme) school <- factor(rep(1:20, each=4)) year <- rep(2006:2009, 20) year.c <- year - mean(year) tmt <- sample(0:1, 20, replace = TRUE)[school] math <- rnorm(80, 2 + tmt + .001*year + .0001*tmt*year, 1.5) + rnorm(20)[school] tmt <- factor(tmt) dfr <- data.frame(math, school, tmt, year, year.c) rm(math, school, year, year.c, tmt) # Now create missing observations, # -- note just a single obs for the first school: dfr2 <- dfr[-c(2:6, sample(75, 5)), ] table(dfr2$school) # # school: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 # numobs: 1 2 4 3 4 4 4 4 4 3 3 4 4 4 3 4 4 4 4 4 f3 <- lme(math ~ year.c*tmt, data = dfr2, random=~1|school) intervals(f3) #Approximate 95% confidence intervals # # Fixed effects: # lower est. upper #(Intercept) 3.0705975 3.81227729 4.5539571 #year.c -0.3913985 0.09107008 0.5735386 #tmt1 0.3146311 1.34895504 2.3832790 #year.c:tmt1 -0.4385680 0.20748194 0.8535319 #attr(,"label") #[1] "Fixed effects:" # # Random Effects: # Level: school # lower est. upper #sd((Intercept)) 0.35068 0.7334782 1.534134 # # Within-group standard error: # lower est. upper #1.226240 1.492044 1.815464 Note that even with the missing obs, in this instance we got pretty good estimates of the variance components (parameters: within-group sd = 1.5; between-school sd = 1). However, I had to change the seed from the one I used in the last email because, as is not uncommon with unbalanced data and relatively small sample sizes, there were convergence problems. If you were using classical MoM estimators fitting a mixed model with unbalanced data would lead to very precarious inferences (not that the likelihood inference isn't without it share of pitfalls, but that's another email...) hth, Kingsford On Wed, Oct 21, 2009 at 11:13 AM, Peter Flom <peterflomconsult...@mindspring.com> wrote: > Hello > > Running R2.9.2 on Windows XP > > I am puzzled by the performance of LME in situations where there are missing > data. As I understand it, one of the strengths of this sort of model is how > well it deals with missing data, yet lme requires nonmissing data. > > Thus, > > m1.mod1 <- lme(fixed = math_1 ~ I(year-2007.5)*TFC_, > data = long, > random = ~I(year-2007.5)|schoolnum) > > causes an error in na.fail.default, but adding na.action = "na.omit" makes a > model with no errors. However, if I create that model, i.e. > > m1.mod1 <- lme(fixed = math_1 ~ I(year-2007.5)*TFC_, > data = long, > random = ~I(year-2007.5)|schoolnum, > na.action = "na.omit") > > then the diagnostic plots suggested in Pinheiro & Bates produce errors; e.g. > > plot(m1.mod1, schoolnum~resid(.), abline = 0) > > gives an error "could not find function "NaAct". > > > Searching the archives showed a similar question from 2007, but did not show > any responses. > > Thanks for any help > > Peter > > ) > > Peter L. Flom, PhD > Statistical Consultant > Website: www DOT peterflomconsulting DOT com > Writing; http://www.associatedcontent.com/user/582880/peter_flom.html > Twitter: @peterflom > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.