>>>>> Michael Chirico <michaelchiri...@gmail.com>
>>>>>     on Tue, 27 Feb 2018 20:18:34 +0800 writes:
Slightly amended 'Subject': (unimportant mistake: a dgeMatrix is *not* sparse)

MM: modified to commented R code, slightly changed from your post:

## I am attempting to use the lars package with a sparse input feature matrix,
## but the following fails:

library(Matrix)
library(lars)
data(diabetes) # from 'lars'
##UAagghh! not like this -- both attach() *and* as.data.frame() are horrific!
##UA  attach(diabetes)
##UA  x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
x <- as(unclass(diabetes$x), "dgCMatrix")
## 'y' must come from 'diabetes' explicitly, now that attach() is gone:
lars(x, diabetes$y, intercept = FALSE)
## Error in scale.default(x, FALSE, normx) :
##   length of 'scale' must equal the number of columns of 'x'

## More specifically, scale.default fails as called from lars():
normx <- new("dgeMatrix",
             x = c(4, 0, 9, 1, 1, -1, 4, -2, 6, 6)*1e-14, Dim = c(1L, 10L),
             Dimnames = list(NULL,
                             c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
                               "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")))
scale.default(x, center=FALSE, scale = normx)
## Error in scale.default(x, center = FALSE, scale = normx) :
##   length of 'scale' must equal the number of columns of 'x'

    > The problem is that this check fails because is.numeric(normx) is FALSE:

    > if (is.numeric(scale) && length(scale) == nc)

    > So, the error message is misleading. In fact length(scale) is the same as
    > nc.

Correct, twice.

    > At a minimum, the error message needs to be repaired; do we also want to
    > attempt as.numeric(normx) (which I believe would have allowed scale to work
    > in this case)?

It seems sensible to allow both 'center' and 'scale' to only have to
*obey* as.numeric(.) rather than fulfill is.numeric(.).

Though that is not a bug in scale() as its help page has always said that
'center' and 'scale' should either be a logical value or a numeric vector.

For that reason I can really claim a bug in 'lars' which should
really not use

     scale(x, FALSE, normx)

but rather

     scale(x, FALSE, scale = as.numeric(normx))

and then all would work.
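
A minimal sketch of that as.numeric(.) workaround, assuming only that the
Matrix package is installed. The small matrix 'm' is a hypothetical stand-in
for diabetes$x (it is not the 'lars' data), and 'normx' is built the same way
as in the expression quoted from lars, so it ends up as a Matrix object rather
than a plain numeric vector:

```r
library(Matrix)

## Hypothetical 4 x 3 stand-in for the feature matrix (not the diabetes data)
m <- matrix(c(1, 0, 2,
              0, 3, 0,
              4, 0, 5,
              0, 6, 7), nrow = 4, byrow = TRUE)
x <- Matrix(m, sparse = TRUE)          # sparse column-compressed matrix

## Column norms computed as in lars: the product involves a Matrix object,
## so the result is a 1 x 3 Matrix, not a base numeric vector
normx <- sqrt(rep(1, nrow(x)) %*% (x^2))

## Explicit coercion gives scale() the plain numeric vector its help page asks for.
## scale.default() calls as.matrix() on 'x' anyway, so densify explicitly here.
xs <- scale(as.matrix(x), center = FALSE, scale = as.numeric(normx))

colSums(xs^2)                          # each column now has unit Euclidean norm
```

The explicit as.numeric() call does not depend on how lenient scale.default()
is about its 'scale' argument, so it should behave the same whether or not
that function is later relaxed.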
    > -----------------

    > (I'm aware that there's some import issues in lars, as the offending line
    > to create normx *should* work, as is.numeric(sqrt(drop(rep(1, nrow(x)) %*%
    > (x^2)))) is TRUE -- it's simply that lars doesn't import the appropriate S4
    > methods)

    > Michael Chirico

Yes, 'lars' has _not_ been updated since Spring 2013, notably
because its authors have been saying (for rather more than 5
years I think) that one should really use

     require("glmnet")

instead.

Your point is still valid that it would be easy to enhance
base :: scale.default()  so it'd work in more cases.

Thank you for that.  I do plan to consider such a change in
R-devel (planned to become R 3.5.0 in April).

Martin Maechler,
ETH Zurich

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel