[Rd] lm considers removed predictors when finding complete cases
Dear R-devel list,

I realized that removing a predictor in lm through the "-" operator in
formula() does not affect the complete cases that are considered. A minimal
example is:

summary(lm(Wind ~ ., data = airquality))
# 42 observations deleted due to missingness

summary(lm(Wind ~ . - Ozone, data = airquality))
# still 42 observations deleted due to missingness, even if only 7 are
# missing for the response and the rest of the predictors

summary(lm(Wind ~ ., data = subset(airquality, select = -Ozone)))
# 7 observations deleted due to missingness

I find this behaviour somewhat striking and I was wondering whether it is
intended, or whether it would be appropriate to document it in lm's help.

Any insight on this issue is appreciated.

Best regards,
--
Eduardo García Portugués
Assistant professor
Department of Statistics
Carlos III University of Madrid

Office: 7.3.J21 (Leganés)
Phone: (+34) 91624 8836

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] abort
FYI. Gábor

❯ R --vanilla -q
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.4.3
> conditionCall.x <- function(c) strrep("xxx?", 3000)
> stop(structure(list(message="1"), class=c("x","condition")))
zsh: abort R --vanilla -q
Re: [Rd] abort
Thanks; fixed in R-devel and R-patched.

Best,

luke

On Tue, 19 Dec 2017, Gábor Csárdi wrote:

> FYI. Gábor
>
> ❯ R --vanilla -q
> > sessionInfo()
> R version 3.4.3 (2017-11-30)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS Sierra 10.12.6
>
> Matrix products: default
> BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.4.3
> > conditionCall.x <- function(c) strrep("xxx?", 3000)
> > stop(structure(list(message="1"), class=c("x","condition")))
> zsh: abort R --vanilla -q

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone: 319-335-3386
Department of Statistics and        Fax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu
Re: [Rd] lm considers removed predictors when finding complete cases
> On Dec 19, 2017, at 11:12 AM, EDUARDO GARCIA PORTUGUES wrote:
>
> Dear R-devel list,
>
> I realized that removing a predictor in lm through the "-" operator in
> formula() does not affect the complete cases that are considered. A minimal
> example is:
>
> summary(lm(Wind ~ ., data = airquality))
> # 42 observations deleted due to missingness
>
> summary(lm(Wind ~ . - Ozone, data = airquality))
> # still 42 observations deleted due to missingness, even if only 7 are
> # missing for the response and the rest of the predictors
>
> summary(lm(Wind ~ ., data = subset(airquality, select = -Ozone)))
> # 7 observations deleted due to missingness
>
> I find this behaviour somewhat striking and I was wondering whether it is
> intended, or whether it would be appropriate to document it in lm's help.

The behavior in the second instance seems consistent with a desire to
compare models (full versus reduced) based on the same data. Your
expectation appears to be something else, but you have not really explained
your rationale for a different expectation other than to call it
"striking". If by "striking" you mean hitting your head and saying "Of
course, I should have thought of that", then we would be in agreement.

--
David.

>
> Any insight on this issue is appreciated.
>
> Best regards,
> --
> Eduardo García Portugués
> Assistant professor
> Department of Statistics
> Carlos III University of Madrid
>
> Office: 7.3.J21 (Leganés)
> Phone: (+34) 91624 8836

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'
-Gehm's Corollary to Clarke's Third Law
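
For readers following the thread, here is a minimal sketch of the three fits discussed above, using only base R and the airquality data from the original example. The object names (full, reduced, refit) are made up for illustration. The check on the model frame reflects an assumption that is not stated in the thread: the variable removed with "-" still appears to be carried in the model frame, which would explain why its missing values are dropped before fitting.

```r
# airquality ships with base R (datasets package); no extra packages needed.
full    <- lm(Wind ~ ., data = airquality)
reduced <- lm(Wind ~ . - Ozone, data = airquality)
refit   <- lm(Wind ~ ., data = subset(airquality, select = -Ozone))

# Number of rows actually used by each fit (153 rows in airquality).
nobs(full)     # 111, i.e. 42 rows dropped
nobs(reduced)  # 111, same rows as the full model
nobs(refit)    # 146, i.e. only 7 rows dropped

# Assumption being illustrated: Ozone is excluded from the model terms
# but is still a column of the model frame that na.action is applied to.
"Ozone" %in% names(model.frame(reduced))
"Ozone" %in% names(coef(reduced))

# Because full and reduced use the same rows, they can be compared directly.
anova(reduced, full)
```

If the aim is instead to use every row that is complete for the remaining variables, dropping the column from the data before fitting, as in the third call of the original post, appears to be the straightforward route.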