Thanks, Joris,
This at least clarifies exactly where the behavior comes from. I still find
the high-level behavior of 'predict' very counter-intuitive, as the estimated
model contains no coefficients for 'z', but I think we agree on that.
I am not sure how much trouble it would be to improve this behavior, b
Technically it is used as a predictor in the model. The information is
contained in the terms object:
> terms(x ~ . - z, data = d)
x ~ (y + z) - z
attr(,"variables")
list(x, y, z)
attr(,"factors")
  y
x 0
y 1
z 0
attr(,"term.labels")
[1] "y"
attr(,"order")
[1] 1
attr(,"intercept")
[1] 1
attr(,"response")
[1
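The distinction shown in the output above can also be checked programmatically. The following is a minimal sketch, assuming a small hypothetical data frame with the same column names as in this thread (the values are made up for illustration). It compares the term labels, which determine the coefficients, with the variables that the terms object still refers to.

```r
# Hypothetical data; only the column names x, y, z come from the thread.
d <- data.frame(x = c(1.2, 0.4, 2.3, 1.9),
                y = c(0.5, 1.1, 0.3, 0.8),
                z = c("a", "b", "c", "d"))

tt <- terms(x ~ . - z, data = d)

# Terms that end up with a coefficient in the fitted model.
term_labels <- attr(tt, "term.labels")

# Variables still referenced by the expanded formula x ~ (y + z) - z.
model_vars <- all.vars(tt)

print(term_labels)
print(model_vars)
```

So 'z' is dropped from the term labels but remains among the variables of the terms object, which is why it can still play a role when a model frame is built.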
Joris, the point is that 'z' is NOT used as a predictor in the model.
Therefore it should not affect predictions. Also, I find it suspicious that
the error only occurs when the response variable contains missing values and
'z' is unique (I have tested several other cases to confirm this).
-Mark
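For readers who want to try the case described above, here is a rough sketch of a reproduction. It is an assumption-based reconstruction, not the original example from this thread: the data frame, the seed, and the position of the missing value are all made up. The response has one missing value and 'z' takes a unique value in every row. The call to 'predict' is wrapped in tryCatch so the script runs to completion whether or not an error is raised.

```r
set.seed(1)  # arbitrary seed, for repeatability only

# Assumed setup: 'z' is unique per row and the response 'x' has one NA.
d <- data.frame(x = rnorm(12), y = rnorm(12), z = letters[1:12])
d$x[1] <- NA

# Fit with 'z' subtracted from the formula.
m <- lm(x ~ . - z, data = d)

# The fitted model has no coefficient for 'z'.
coef_names <- names(coef(m))
print(coef_names)

# Predict on the original data; capture the error message, if any.
result <- tryCatch(predict(m, newdata = d),
                   error = function(e) conditionMessage(e))
print(result)
```

If 'result' is a character string rather than a numeric vector, 'predict' raised an error and the string is its message.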
On Fri
It's not a bug per se. It's the effect of removing all observations linked
to a certain level in your data frame. So the output of lm() doesn't
contain a coefficient for level a of z, but your new data contains that
level a. With a small addition, this works again:
d <- data.frame(x=rnorm(12),y=rn
Dear R-developers,
In the 'lm' documentation, the '-' operator is only specified to be used
with -1 (to remove the intercept from the model).
However, the documentation also refers to the 'formula' help file, which
indicates that it is possible to subtract any term. Indeed, the following
works wi