[Rd] Should there be a confint.mlm ?
It seems that confint.default returns an empty data.frame for objects of class mlm. For example: ``` nobs <- 20 set.seed(1234) # some fake data datf <- data.frame(x1=rnorm(nobs),x2=runif(nobs),y1=rnorm(nobs),y2=rnorm(nobs)) fitm <- lm(cbind(y1,y2) ~ x1 + x2,data=datf) confint(fitm) # returns: 2.5 % 97.5 % ``` I have seen proposed workarounds on stackoverflow and elsewhere, but suspect this should be fixed in the stats package. A proposed implementation would be: ``` # I need this to run the code, but stats does not: format.perc <- stats:::format.perc # compute confidence intervals for mlm object. confint.mlm <- function (object, level = 0.95, ...) { cf <- coef(object) ncfs <- as.numeric(cf) a <- (1 - level)/2 a <- c(a, 1 - a) fac <- qt(a, object$df.residual) pct <- format.perc(a, 3) ses <- sqrt(diag(vcov(object))) ci <- ncfs + ses %o% fac setNames(data.frame(ci),pct) } # returning to the example above, confint(fitm) # returns: 2.5 % 97.5 % y1:(Intercept) -1.2261 0.7037 y1:x1 -0.5100 0.2868 y1:x2 -2.7554 0.8736 y2:(Intercept) -0.6980 2.2182 y2:x1 -0.6162 0.5879 y2:x2 -3.9724 1.5114 ``` -- --sep [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Should there be a confint.mlm ?
> steven pav > on Thu, 19 Jul 2018 21:51:07 -0700 writes: > It seems that confint.default returns an empty data.frame > for objects of class mlm. For example: > It seems that confint.default returns an empty data.frame for objects of > class mlm. Not quite: Note that 'mlm' objects are also 'lm' objects, and so it is confint.lm() which is called here and fails. > For example: > > ``` > nobs <- 20 > set.seed(1234) > # some fake data > datf <- > data.frame(x1=rnorm(nobs),x2=runif(nobs),y1=rnorm(nobs),y2=rnorm(nobs)) > fitm <- lm(cbind(y1,y2) ~ x1 + x2,data=datf) > confint(fitm) > # returns: > 2.5 % 97.5 % > ``` > > I have seen proposed workarounds on stackoverflow and elsewhere, but > suspect this should be fixed in the stats package. I agree. It may be nicer to tweak confint.lm() instead though. I'm looking into doing that. > A proposed implementation would be: > > ``` > # I need this to run the code, but stats does not: > format.perc <- stats:::format.perc or better (mainly for esthetical reasons), use environment(confint.mlm) <- asNamespace("stats") after defining confint.mlm [below] > # compute confidence intervals for mlm object. > confint.mlm <- function (object, level = 0.95, ...) { > cf <- coef(object) > ncfs <- as.numeric(cf) > a <- (1 - level)/2 > a <- c(a, 1 - a) > fac <- qt(a, object$df.residual) > pct <- format.perc(a, 3) > ses <- sqrt(diag(vcov(object))) BTW --- and this is a diversion --- This is nice mathematically (and used in other places, also in "base R" I think) but in principle is a waste: Computing a full k x k matrix and then throwing away all but the length-k diagonal ... In the past I had contemplated but never RFC'ed or really implemented a stderr() generic with default method stderr.default <- function(object) sqrt(diag(vcov(object))) but allow non-default methods to be smarter and hence more efficient. > ci <- ncfs + ses %o% fac > setNames(data.frame(ci),pct) > } > > # returning to the example above, > confint(fitm) > # returns: > 2.5 % 97.5 % > y1:(Intercept) -1.2261 0.7037 > y1:x1 -0.5100 0.2868 > y1:x2 -2.7554 0.8736 > y2:(Intercept) -0.6980 2.2182 > y2:x1 -0.6162 0.5879 > y2:x2 -3.9724 1.5114 > ``` I'm looking into a relatively small patch to confint.lm() *instead* of the confint.mlm() above Thank you very much, Steven, for your proposal! I will let you (and the R-devel audience) know the outcome. Best regards, Martin Maechler ETH Zurich and R Core Team > -- > > --sep > > [[alternative HTML version deleted]] > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Model formulas with explicit references
Dear R-Devel, I seem to no longer be able to access the bug-reporting system, so am doing this by e-mail. My report concerns models where variables are explicitly referenced (or is it "dereferenced"?), such as: cars.lm <- lm(mtcars[[1]] ~ factor(mtcars$cyl) + mtcars[["disp"]]) I have found that it is not possible to predict such models with new data. For example: > predict(cars.lm, newdata = mtcars[1:5, ) 12345678 9 10 20.37954 20.37954 26.58543 17.70329 14.91157 18.60448 14.91157 25.52859 25.68971 20.17199 11 12 13 14 15 16 17 18 19 20 20.17199 17.21096 17.21096 17.21096 11.85300 12.18071 12.72688 27.38558 27.46750 27.59312 21 22 23 24 25 26 27 28 29 30 26.25500 16.05853 16.44085 15.18466 13.81922 27.37738 26.24954 26.93772 15.15735 20.78917 31 32 16.52278 26.23042 Warning message: 'newdata' had 5 rows but variables found have 32 rows Instead of returning 5 predictions, it returns the 32 original predicted values. There is a warning message suggesting that something went wrong. This tickled my curiosity, and hance this result: > predict(cars.lm, newdata = data.frame(x = 1:32)) 12345678 9 10 20.37954 20.37954 26.58543 17.70329 14.91157 18.60448 14.91157 25.52859 25.68971 20.17199 11 12 13 14 15 16 17 18 19 20 20.17199 17.21096 17.21096 17.21096 11.85300 12.18071 12.72688 27.38558 27.46750 27.59312 21 22 23 24 25 26 27 28 29 30 26.25500 16.05853 16.44085 15.18466 13.81922 27.37738 26.24954 26.93772 15.15735 20.78917 31 32 16.52278 26.23042 Again, the new data are ignored, but there is no warning message, because the previous warning was based only on a discrepancy with the number of rows and the number of predictions. Indeed, the new data set makes no sense at all in the context of this model. At the root of this behavior is the fact that the model.frame function ignores its data argument with such models. So instead of constructing a new frame based on the new data, it just returns the original model frame. I am not really suggesting that you try to make these things work with models when the formula is like this. Instead, I am hoping that it throws an actual error message rather than just a warning, and that you be a little bit more sophisticated than merely checking the number of rows. Both predict() with newdata provided, and model.frame() with a data argument, should return an informative error message that says that model formulas like this are not supported with new data. Here is what appears to be an easy way to check: > get_all_vars(terms(cars.lm)) Error in eval(inp, data, env) : object 'cyl' not found Thanks Russ Russell V. Lenth - Professor Emeritus Department of Statistics and Actuarial Science The University of Iowa - Iowa City, IA 52242 USA Voice (319)335-0712 (Dept. office) - FAX (319)335-3017 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Library lib.loc Option Ignored for Dependencies
Here's a trick/workaround; if lib.loc is the path to your library, then prior to calling library(), > environment(.libPaths)$.lib.loc <- lib.loc Good day, If there's a library folder of the latest R packages and a particular package from it is loaded using the lib.loc option, the dependencies of that package are still attempted to be loaded from another folder of older packages specified by R_LIBS, which may cause errors about version requirements not being met. The documentation of the library function doesn't explain what the intended result is in such a case, but it could reasonably be expected that R would also load the dependencies from the user-specified lib.loc folder. -- Dario Strbenac University of Sydney Camperdown NSW 2050 Australia __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel