Re: [R] predict.lm(...,type="terms") question

David Winsemius Sat, 01 Sep 2012 18:40:49 -0700


On Sep 1, 2012, at 4:33 AM, Rui Barradas wrote:

Hello,
John is right, there seems to be an error in predict.lm. It can bemade to work but if the model is fitted with lm(...,data) then itmesses things up. Apparently predict.lm disregards the data argumentand uses whatever it finds in the global environment.


That is not how I would describe the "problem". See below.

In the examples below, after running David's x <- rnorm(15) example(case 1 is a simplified version of it), case 3 shows the bug, thereare 15 predictions for 4 new values. Then we must backtrack to case2 and also consider it a bug.


Examples:

# ------------------- 1
# David's example simplified, only one regressor.
# And with set.seed()

set.seed(4409)
x <- rnorm(15)
y <- x + rnorm(15)

fit <- lm(y ~ x)
new <- data.frame(y = seq(-3.5, 3.5, 0.5))
predict(fit, newdata = new, type = "terms")
# no error is thrown.

# ------------------- 2
# Original post, 'area' and 'concn' are now 'y' and 'x'
# Remove 'x' and 'y' from the global environment.
rm(x, y)
dat <- data.frame(
   y = c(4875, 8172, 18065, 34555),
   x = c(25, 50, 125, 250))
new <- data.frame(y = c(8172, 10220, 11570, 24150))

model <- lm(y ~ x, data = dat)
predict(model, newdata = new, type = "terms")
# Error in eval(expr, envir, enclos) : object 'x' not found

Why should predict not complain when it is offered a newdata argumentthat does no contain a vector of values for "x"? The whole point ofthe terms method of prediction is to offer estimates for specificvalues of items on the RHS of the formula. The OP seems to havetrouble understanding that point. Putting in a vector with the name ofthe LHS item makes no sense to me. I certainly cannot see that anyparticular behavior for this pathological input is described forpredict.lm in its help page, but throwing an error seems perfectlyreasonable to me.


--
David.


# ------------------- 3
# Original post.
# Do NOT remove 'x' and 'y' from the global environment.
set.seed(4409)
x <- rnorm(15)
y <- x + rnorm(15)

dat <- data.frame(
   y = c(4875, 8172, 18065, 34555),
   x = c(25, 50, 125, 250))
new <- data.frame(y = c(8172, 10220, 11570, 24150))

model <- lm(y ~ x, data = dat)
predict(model, newdata = new, type = "terms")
# Warning message:
# 'newdata' had 4 rows but variable(s) found have 15 rows

#------------------- 4
# Original post, no data argument to lm.

y <- c(4875, 8172, 18065, 34555)
x <- c(25, 50, 125, 250)
new <- data.frame(y = c(8172, 10220, 11570, 24150))

model <- lm(y ~ x)
predict(model, newdata = new, type = "terms")
# no warning or error is thrown, predictions seem allright

I haven't looked at the code for predict.lm, and really don't knowwhere the error is.

Bug report?

Rui Barradas

Em 01-09-2012 06:40, David Winsemius escreveu:

On Aug 31, 2012, at 3:48 PM, jjthaden wrote:
On Aug 30, 2012, at 4:35 PM,  David Windemius wrote:
David said my newdata data frame 'new' must have a column named'area'.It did. Nonetheless predict.lm throws an error with type ="terms" and
newdata = new. I see nothing in the predict.lm documentation that
bars this usage. Is there a bug?
After correcting the error in your definition of the 'area'vector I
get no error from predict.lm().
David.
What error did you correct?
This was the code you offered:

#Ludbrook's data set S1 (except renaming
#his 'x' as 'concn' and his 'y' as 'area')
S1 <- data.frame(
   area = c(2.4,2.6,6.0,6.5,8.9,),
   concn = c(1.1,4,5,8.5,8.5))

There's an extra comma in the "area" vector.
The newdata data frames in my examples have been pretty simple,and have
defined the 'area' vector simply, for instance,
new <- data.frame(area = c(8172, 10220, 11570, 24150))
new
#        area
#    1  8172
#    2 10220
#    3 11570
#    4 24150

Is something wrong with this?
I don't really know .... this appears to be a different problemthan you posed earlier. If you would learn to post with context, wemight not be in the position of trying to read your mind.
Were you able to make predict.lm work with newdata = new and type= "terms"?
Yes. I have no trouble doing so.

>  x <- rnorm(15); x2=rnorm(15)
>  y <- x + x2 +rnorm(15)
>  fit <- lm(y ~ x+x2)
>  new <- data.frame(x2 = seq(-3.5, 3.5, 0.5)  )
>  predict(fit, newdata=new,type="terms", terms="x2")
          x2
1  -4.9230035
2  -4.1850588
3  -3.4471141
4  -2.7091694
5  -1.9712248
6  -1.2332801
7  -0.4953354
8   0.2426093
9   0.9805539
10  1.7184986
11  2.4564433
12  3.1943880
13  3.9323326
14  4.6702773
15  5.4082220
attr(,"constant")
[1] 0.4501911
-John
Sent from the (NOT) R help mailing list (and NOT) archive atNabble.com.
Hmmm ... Nabble ... definitely part of the problem.


David Winsemius, MD
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] predict.lm(...,type="terms") question

Reply via email to