[I think I've seen this reported before but can't locate it any more. I believe this oddity (glitch? feature?) is behind a query that Jean-Baptiste Ferdy asked a year ago <http://finzi.psych.upenn.edu/Rhelp08/2008-October/176449.html>]
It appears that glmmPQL looks in the global workspace, not within the data frame specified by the "data" argument, for the variables specified in the "form" argument of spatial correlation structures provided to the "correlation" argument. This is potentially confusing/dangerous, because you can easily (1) get an error saying that the variable is not found (the least dangerous and confusing), (2) get an error saying the variable is of the wrong length, if there is a variable with a matching name but the wrong length in the global environment, (3) get completely misled if there is a *different* variable with a matching name in the global environment with the right length (in which case glmmPQL will ignore the element in the data frame you specified and use the one from the global environment). The following patch (to MASS 7.3-3, retrieved from CRAN src/contrib 2.10.0 directory) appears to fix the problem. A quick look suggests that the same version of MASS is in 2.11.0 as well. *** glmmPQL.R.orig 2009-10-20 23:48:38.000000000 -0400 --- glmmPQL.R 2009-10-20 23:51:33.000000000 -0400 *************** *** 39,44 **** --- 39,47 ---- off <- attr(Terms, "offset") if(length(off<- attr(Terms, "offset"))) allvars <- c(allvars, as.character(attr(Terms, "variables"))[off+1]) + ## add variables in correlation formula, if any + if (!missing(correlation) && !is.null(attr(correlation,"formula"))) + allvars <- c(allvars,all.vars(attr(correlation,"formula"))) ## substitute back actual formula (rather than a variable name) Call$fixed <- eval(fixed); Call$random <- eval(random) m$formula <- as.formula(paste("~", paste(allvars, collapse="+"))) code below shows the problem. Ben Bolker ## set up data set.seed(1001) d = data.frame(x = runif(100), y = runif(100), z = rpois(100,2), f = factor(rep(letters[1:20],each=5))) library(MASS) ## without correlation structure: fine glmmPQL(z~1,random=~1|f,data=d,family="poisson") ## with correlation structure, ## but x and y not defined outside of 'd' glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## Error in eval(expr, envir, enclos) : object 'x' not found x = y = runif(10) glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## Error in ... ## variable lengths differ (found for 'f') ## this is what we probably wanted x = d$x y = d$y glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## this is dangerous! x = y = runif(100) glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## proof that glmmPQL really is looking ## outside of the data frame for its ## correlation variables ... x = y = rep(0,100) glmmPQL(z~1,random=~1|f,data=d, correlation=corExp(form=~x+y), family="poisson") ## Error in getCovariate.corSpatial(object, data = data) : ## Cannot have zero distances in "corSpatial" > sessionInfo() R version 2.9.2 (2009-08-24) i486-pc-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets grid methods [8] base other attached packages: [1] nlme_3.1-95 MASS_7.2-49 reshape_0.8.3 plyr_0.1.9 proto_0.3-8 loaded via a namespace (and not attached): [1] ggplot2_0.8.4 lattice_0.17-26 ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel