Re: [Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Mark van der Loo
Thanks, Joris. This clarifies at least where exactly it comes from. I still find the high-level behavior of 'predict' very counter-intuitive, as the estimated model contains no coefficients in 'z', but I think we agree on that. I am not sure how much trouble it would be to improve this behavior, b

Re: [Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Joris Meys
Technically it is used as a predictor in the model. The information is contained in terms:

> terms(x ~ . - z, data = d)
x ~ (y + z) - z
attr(,"variables")
list(x, y, z)
attr(,"factors")
  y
x 0
y 1
z 0
attr(,"term.labels")
[1] "y"
attr(,"order")
[1] 1
attr(,"intercept")
[1] 1
attr(,"response")
[1
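The distinction drawn here can be checked directly. The following is a minimal sketch, assuming a small made-up data frame `d` (the data from the original post is not shown in full here, so the values below are hypothetical). It compares the "term.labels" attribute, which lists the terms that get coefficients, with the "variables" attribute, which still lists 'z'.

```r
# Hypothetical data, not the data from the original post.
set.seed(1)
d <- data.frame(x = rnorm(6), y = rnorm(6), z = letters[1:6])

# Expand the '.' against the columns of d, then subtract the term z.
tt <- terms(x ~ . - z, data = d)

# Terms that enter the model matrix: only y.
labels_in_model <- attr(tt, "term.labels")

# Variables the terms object still refers to: x, y and z.
vars_referenced <- all.vars(tt)

print(labels_in_model)
print(vars_referenced)
```

So 'z' is dropped as a term but is still recorded as a variable of the formula, which is why it is still looked up when a model frame is built.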

Re: [Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Mark van der Loo
Joris, the point is that 'z' is NOT used as a predictor in the model. Therefore it should not affect predictions. Also, I find it suspicious that the error only occurs when the response variable contains missing values and 'z' is unique (I have tested several other cases to confirm this). -Mark On Fri

Re: [Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Joris Meys
It's not a bug per se. It's the effect of removing all observations linked to a certain level in your data frame. So the output of lm() doesn't contain a coefficient for level a of z, but your new data contains that level a. With a small addition, this works again: d <- data.frame(x=rnorm(12),y=rn
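For readers who want to try this, here is a minimal sketch of the situation described above. It uses hypothetical data rather than the example from the thread, and the workaround shown is not the "small addition" referred to in this message. It is a different approach that keeps 'z' out of the formula altogether. Whether the `x ~ . - z` fit fails in 'predict' may depend on the R version, so that call is wrapped in tryCatch and no particular outcome is claimed.

```r
# Hypothetical data: z has a unique value per row, and the response
# has a missing value in the row that carries level "a".
set.seed(1)
d <- data.frame(x = rnorm(6), y = rnorm(6), z = letters[1:6])
d$x[1] <- NA

# Fit with z subtracted; row 1 is dropped because x is NA there.
m_minus <- lm(x ~ . - z, data = d)

# predict() may or may not fail here depending on the R version,
# so capture an error message instead of stopping.
res_minus <- tryCatch(predict(m_minus, newdata = d),
                      error = function(e) conditionMessage(e))
print(res_minus)

# Alternative: name the predictors explicitly so that z never enters
# the terms object. Prediction then does not look at z at all.
m_plain <- lm(x ~ y, data = d)
p <- predict(m_plain, newdata = d)
print(p)
```

Subsetting the columns before fitting, for example `lm(x ~ ., data = d[c("x", "y")])`, would have the same effect of keeping 'z' out of the model frame.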

[Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Mark van der Loo
Dear R-developers, In the 'lm' documentation, the '-' operator is only specified to be used with -1 (to remove the intercept from the model). However, the documentation also refers to the 'formula' help file, which indicates that it is possible to subtract any term. Indeed, the following works wi