> On Dec 19, 2017, at 11:12 AM, EDUARDO GARCIA PORTUGUES
> <edgar...@est-econ.uc3m.es> wrote:
>
> Dear R-devel list,
>
> I realized that removing a predictor in lm through the "-"'s operator in
> formula() does not affect the complete cases that are considered. A minimal
> example is:
>
> summary(lm(Wind ~ ., data = airquality))
> # 42 observations deleted due to missingness
>
> summary(lm(Wind ~ . - Ozone, data = airquality))
> # still 42 observations deleted due to missingness, even if only 7 are
> # missing for the response and the rest of the predictors
>
> summary(lm(Wind ~ ., data = subset(airquality, select = -Ozone)))
> # 7 observations deleted due to missingness
>
> I find this behaviour somehow striking and I was wondering whether it is
> intended, or whether it would be appropriate to document it in lm's help.
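
A minimal sketch of why the counts in the quoted example come out as they do,
assuming the usual model.frame() machinery that lm() relies on: the "." is
expanded against all columns of the data, and removing a term with "-" takes
it out of the model's terms but, as far as I can tell, not out of the
variables evaluated for the model frame, which is where na.action is applied.
The object names (full, reduced, subsetted) are only illustrative.

```r
# airquality ships with base R (datasets package).
full      <- lm(Wind ~ ., data = airquality)
reduced   <- lm(Wind ~ . - Ozone, data = airquality)
subsetted <- lm(Wind ~ ., data = subset(airquality, select = -Ozone))

# Rows actually used in each fit (total rows minus the deleted counts quoted above).
n_used <- c(full = nobs(full), reduced = nobs(reduced), subsetted = nobs(subsetted))
print(n_used)

# Ozone is not a term of the reduced model ...
ozone_is_term <- "Ozone" %in% attr(terms(reduced), "term.labels")
# ... but it is still a column of the model frame, so rows with NA in Ozone
# are removed by the default na.action before fitting.
ozone_in_frame <- "Ozone" %in% names(model.frame(reduced))
print(c(term = ozone_is_term, frame = ozone_in_frame))
```

If the aim is to refit on every row that is complete for the remaining
variables, dropping the column from the data, as in the third call, is the
form that does so.
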

The behavior in the second instance seems consistent with a desire to compare
models (full versus reduced) based on the same data. Your expectation appears
to be something else, but you have not really explained your rationale for a
different expectation other than to call it "striking". If by "striking" you
mean hitting your head and saying "Of course, I should have thought of that",
then we would be in agreement.

-- 
David.

>
> Any insight on this issue is appreciated.
>
> Best regards,
> --
> Eduardo García Portugués
> Assistant professor
> Department of Statistics
> Carlos III University of Madrid
>
> Office: 7.3.J21 (Leganés)
> Phone: (+34) 91624 8836
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law