Very interesting. You are doing some stuff here that I have never seen. Thank you. I will test it on my real data on Monday and let you know what I find. That cmpfun function looks very useful!
Thanks, Ben On Sat, Mar 10, 2012 at 10:26 AM, Joshua Wiley <jwiley.ps...@gmail.com>wrote: > Hi Ben, > > It seems likely that there are bigger bottle necks in your overall > program/use---have you tried Rprof() to find where things really get > slowed down? In any case, f2() below takes about 70% of the time as > your function in your test data, and 55-65% of the time for a bigger > example I constructed. Rui's function benefits substantially from > byte compiling, but is still slower. As a side benefit, f2() seems to > use less memory than your current implementation. > > Cheers, > > Josh > > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% > ######sample data ############## > vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3, > dimnames = list(c('row1','row2','row3'), c('col1','col2','col3'))) > > indx <- matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow=4, ncol=3) > storage.mode(indx) <- "integer" > > > f <- function(x, i, di = dim(i), dx = dim(x)) { > out <- x[c(i + matrix(0:(dx[1L] - 1L) * dx[1L], nrow = di[1L], ncol > = di[2L], TRUE))] > dim(out) <- di > return(out) > } > > > fun <- function(valdata, inxdata){ > nr <- nrow(inxdata) > nc <- ncol(inxdata) > mat <- matrix(NA, nrow=nr*nc, ncol=2) > i1 <- 1 > i2 <- nr > for(j in 1:nc){ > mat[i1:i2, 1] <- inxdata[, j] > mat[i1:i2, 2] <- rep(j, nr) > i1 <- i1 + nr > i2 <- i2 + nr > } > matrix(valdata[mat], ncol=nc) > } > > require(compiler) > f2 <- cmpfun(f) > fun2 <- cmpfun(fun) > > system.time(for (i in 1:10000) f(vals, indx)) > system.time(for (i in 1:10000) f2(vals, indx)) > system.time(for (i in 1:10000) fun(vals, indx)) > system.time(for (i in 1:10000) fun2(vals, indx)) > system.time(for (i in 1:10000) > > matrix(vals[cbind(c(indx),rep(1:ncol(indx),each=nrow(indx)))],nrow=nrow(indx),ncol=ncol(indx))) > > ## now let's make a bigger test set > set.seed(1) > vals2 <- matrix(sample(LETTERS, 10^7, TRUE), nrow = 10^4) > indx2 <- sapply(1:ncol(vals2), FUN = function(x) sample(10^4, 10^3, TRUE)) > > dim(vals2) > dim(indx2) > > ## the best contenders from round 1 > gold <- > matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2)) > test1 <- f2(vals2, indx2) > all.equal(gold, test1) > > system.time(for (i in 1:20) f2(vals2, indx2)) > system.time(for (i in 1:20) > > matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2))) > > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% > > On Sat, Mar 10, 2012 at 7:48 AM, Ben quant <ccqu...@gmail.com> wrote: > > Thanks for the info. Unfortunately its a little bit slower after one > apples > > to apples test using my big data. Mine: 0.28 seconds. Yours. 0.73 > seconds. > > Not a big deal, but significant when I have to do this 300 to 500 times. > > > > regards, > > > > ben > > > > On Fri, Mar 9, 2012 at 1:23 PM, Rui Barradas <rui1...@sapo.pt> wrote: > > > >> Hello, > >> > >> I don't know if it's the fastest but it's more natural to have an index > >> matrix with two columns only, > >> one for each coordinate. And it's fast. > >> > >> fun <- function(valdata, inxdata){ > >> nr <- nrow(inxdata) > >> nc <- ncol(inxdata) > >> mat <- matrix(NA, nrow=nr*nc, ncol=2) > >> i1 <- 1 > >> i2 <- nr > >> for(j in 1:nc){ > >> mat[i1:i2, 1] <- inxdata[, j] > >> mat[i1:i2, 2] <- rep(j, nr) > >> i1 <- i1 + nr > >> i2 <- i2 + nr > >> } > >> matrix(valdata[mat], ncol=nc) > >> } > >> > >> fun(vals, indx) > >> > >> Rui Barradas > >> > >> > >> -- > >> View this message in context: > >> > http://r.789695.n4.nabble.com/Re-index-values-of-one-matrix-to-another-of-a-different-size-tp4458666p4460575.html > >> Sent from the R help mailing list archive at Nabble.com. > >> > >> ______________________________________________ > >> R-help@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > -- > Joshua Wiley > Ph.D. Student, Health Psychology > Programmer Analyst II, Statistical Consulting Group > University of California, Los Angeles > https://joshuawiley.com/ > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.