On 01/16/2009 02:20 PM, VanHezewijk, Brian wrote:
> I've recently encountered an issue when trying to use the predict.glm
> function.
>
> I've gotten into the habit of using the dataframe$variablename method of
> specifying terms in my model statements. I thought this unambiguous
> notation would be acceptable in all situations but it seems models
> written this way are not accepted by the predict function. Perhaps
> others have encountered this problem as well.

<snip>

The bottom line is "don't do that". :-)

When the predict.*() functions look for the variable names, they use the names as specified in the formula that was used in the initial creation of the model object.

As per ?predict.glm:

  Note

  Variables are first looked for in newdata and then searched for in the
  usual way (which will include the environment of the formula used in
  the fit). A warning will be given if the variables found are not of the
  same length as those in newdata if it was supplied.

As per your example, using:

  x <- 1:100
  y <- 2 * x
  orig.df <- data.frame(x1 = x, y1 = y)
  lm1 <- glm(orig.df$y1 ~ orig.df$x1, data = orig.df)
  pred1 <- predict(lm1, newdata = data.frame(x1 = 101:150))

When predict.glm() tries to locate the variable "orig.df$x1" in the data frame passed to 'newdata', it cannot be found. The name of the term in the model is "orig.df$x1", not "x1" as you used above.

Thus, since it cannot find that variable in 'newdata', it begins to look elsewhere for a variable called "orig.df$x1". Guess what? It finds it in the global environment, as a column of the original data frame 'orig.df'.

Since that column has a length of 100 and the data frame that you passed to 'newdata' only has 50 rows, you get a warning rather than the 50 predictions you expected:

  Warning message:
  'newdata' had 50 rows but variable(s) found have 100 rows

There is a "method" to the madness, and good reason why the modeling functions and others that take a formula argument also have a 'data' argument to specify the location of the variables to be used.

HTH,

Marc Schwartz
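
For completeness, here is a minimal sketch of the alternative, assuming the same toy data as in the example: the formula uses bare column names and the 'data' argument supplies them, so predict() can match "x1" in 'newdata'. The names lm2, new.df and pred2 are made up here for illustration and are not from the original post.

```r
# Same toy data as in the example.
x <- 1:100
y <- 2 * x
orig.df <- data.frame(x1 = x, y1 = y)

# Bare column names in the formula; 'data' says where to find them.
lm2 <- glm(y1 ~ x1, data = orig.df)

# 'newdata' has a column whose name matches the term in the formula.
# (new.df is a hypothetical name used only in this sketch.)
new.df <- data.frame(x1 = 101:150)
pred2 <- predict(lm2, newdata = new.df)

# One prediction per row of new.df.
length(pred2)
```

With the formula written this way, the term in the model is simply "x1", so the lookup in 'newdata' succeeds and nothing is pulled in from the global environment.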