Sorry if this is a re-post, not sure if my original message got though. Is it possible to replace c() with .subset()? Or would this throw errors because of something specific in c() or as.matrix()? There's a pretty nice speed up by replacing c() with the .subset().
Ex. ---- library(microbenchmark) diag2 <- function(x,nrow, ncol){ if (is.matrix(x)) { if (nargs() > 1L) stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix") if ((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L)) # replace this part y <- .subset(x,1 + 0L:(m - 1L) * (dim(x)[1L] + 1)) nms <- dimnames(x) if (is.list(nms) && !any(sapply(nms, is.null)) && identical((nm <- nms[[1L]][seq_len(m)]), nms[[2L]][seq_len(m)])) names(y) <- nm return(y) } if (is.array(x) && length(dim(x)) != 1L) stop("'x' is an array, but not one-dimensional.") if (missing(x)) n <- nrow else if (length(x) == 1L && nargs() == 1L) { n <- as.integer(x) x <- 1 } else n <- length(x) if (!missing(nrow)) n <- nrow if (missing(ncol)) ncol <- n } # Tests nc <- 1e04 set.seed(1) m <- matrix(sample(letters,nc^2,replace=TRUE), ncol = nc) runoff <- microbenchmark( diaga <- diag(m), diagb <- diag2(m) ) runoff Unit: microseconds expr min lq mean median uq max neval diaga 429033.896 434186.694 512143.6728 503355.5865 572811.11 656035.584 100 diagb 216.112 251.445 536.8531 688.3595 706.98 2437.921 100 Regards, Steve Bronder Website: stevebronder.com Phone: 412-719-1282 Email: sbron...@stevebronder.com On Thu, May 7, 2015 at 11:49 AM, Steve Bronder <sbron...@stevebronder.com> wrote: > Is it possible to replace c() with .subset()? Example below > > #### > #### > > library(microbenchmark) > > > diag2 <- function(x,nrow, ncol){ > if (is.matrix(x)) { > if (nargs() > 1L) > stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix") > if ((m <- min(dim(x))) == 0L) > return(vector(typeof(x), 0L)) > # replace this part > y <- .subset(x,1 + 0L:(m - 1L) * (dim(x)[1L] + 1)) > nms <- dimnames(x) > if (is.list(nms) && !any(sapply(nms, is.null)) && identical((nm <- > nms[[1L]][seq_len(m)]), > > nms[[2L]][seq_len(m)])) > names(y) <- nm > return(y) > } > if (is.array(x) && length(dim(x)) != 1L) > stop("'x' is an array, but not one-dimensional.") > if (missing(x)) > n <- nrow > else if (length(x) == 1L && nargs() == 1L) { > n <- as.integer(x) > x <- 1 > } > else n <- length(x) > if (!missing(nrow)) > n <- nrow > if (missing(ncol)) > ncol <- n > } > > nc <- 10 > > set.seed(1) > m <- matrix(sample(letters,nc^2,replace=TRUE), ncol = nc) > > > runoff <- microbenchmark( > diaga = diag(m), > diagb = diag2(m) > ) > > Regards, > > Steve Bronder > Website: stevebronder.com > Phone: 412-719-1282 > Email: sbron...@stevebronder.com > > > On Tue, May 5, 2015 at 9:46 AM, <luke-tier...@uiowa.edu> wrote: > >> Looks like the c(x)[...] bit used to be as.matrix(x)[...]. Not sure >> why the change was made many years ago, but this was before names were >> handled explicitly. It would definitely be better to not force the >> duplicate, at least in the case where we are sure c() and [ would not >> dispatch. >> >> Best, >> >> luke >> >> On Mon, 4 May 2015, peter dalgaard wrote: >> >> >>> On 04 May 2015, at 19:59 , franknarf <by.hook...@gmail.com> wrote: >>>> >>>> But I'm still wondering why diag() uses c()...? With it being so slow, >>>> I'd >>>> be inclined to write a qdiag() without the c() and just use that the >>>> next >>>> time I need matrix algebra. Any insight would be appreciated; thanks! >>>> >>> >>> Well, there are two possibilities: Either it is deliberate or it isn't. >>> >>> The latter isn't too unlikely, given that the effect is seen for large >>> matrices. I would appear to be a matter of O(n) (picking out n items) vs. >>> O(n^2) (copying an n x n matrix), but this might drown out in a context >>> involving matrix multiplication and/or inversion, both of which are O(n^3). >>> >>> If it is deliberate, the question is why. There could be devils in the >>> details; notice in particular that c() strips off non-name attributes. >>> However, I'm not aware of a situation where such attributes could cause >>> trouble. >>> >>> -pd >>> >>> >>> >> -- >> Luke Tierney >> Ralph E. Wareham Professor of Mathematical Sciences >> University of Iowa Phone: 319-335-3386 >> Department of Statistics and Fax: 319-335-3017 >> Actuarial Science >> 241 Schaeffer Hall email: luke-tier...@uiowa.edu >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu >> >> >> ______________________________________________ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > [[alternative HTML version deleted]] ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel