Very interesting. You are doing some stuff here that I have never seen.
Thank you. I will test it on my real data on Monday and let you know what I
find. That cmpfun function looks very useful!

Thanks,
Ben

On Sat, Mar 10, 2012 at 10:26 AM, Joshua Wiley <jwiley.ps...@gmail.com>wrote:

> Hi Ben,
>
> It seems likely that there are bigger bottle necks in your overall
> program/use---have you tried Rprof() to find where things really get
> slowed down?  In any case, f2() below takes about 70% of the time as
> your function in your test data, and 55-65% of the time for a bigger
> example I constructed.  Rui's function benefits substantially from
> byte compiling, but is still slower.  As a side benefit, f2() seems to
> use less memory than your current implementation.
>
> Cheers,
>
> Josh
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> ######sample data ##############
> vals <- matrix(LETTERS[1:9], nrow = 3, ncol = 3,
>  dimnames = list(c('row1','row2','row3'), c('col1','col2','col3')))
>
> indx <- matrix(c(1,1,3,3,2,2,2,3,1,2,2,1), nrow=4, ncol=3)
> storage.mode(indx) <- "integer"
>
>
> f <- function(x, i, di = dim(i), dx = dim(x)) {
>  out <- x[c(i + matrix(0:(dx[1L] - 1L) * dx[1L], nrow = di[1L], ncol
> = di[2L], TRUE))]
>  dim(out) <- di
>  return(out)
> }
>
>
> fun <- function(valdata, inxdata){
>        nr <- nrow(inxdata)
>        nc <- ncol(inxdata)
>        mat <- matrix(NA, nrow=nr*nc, ncol=2)
>        i1 <- 1
>        i2 <- nr
>        for(j in 1:nc){
>                mat[i1:i2, 1] <- inxdata[, j]
>                mat[i1:i2, 2] <- rep(j, nr)
>                i1 <- i1 + nr
>                i2 <- i2 + nr
>        }
>        matrix(valdata[mat], ncol=nc)
> }
>
> require(compiler)
> f2 <- cmpfun(f)
> fun2 <- cmpfun(fun)
>
> system.time(for (i in 1:10000) f(vals, indx))
> system.time(for (i in 1:10000) f2(vals, indx))
> system.time(for (i in 1:10000) fun(vals, indx))
> system.time(for (i in 1:10000) fun2(vals, indx))
> system.time(for (i in 1:10000)
>
> matrix(vals[cbind(c(indx),rep(1:ncol(indx),each=nrow(indx)))],nrow=nrow(indx),ncol=ncol(indx)))
>
> ## now let's make a bigger test set
> set.seed(1)
> vals2 <- matrix(sample(LETTERS, 10^7, TRUE), nrow = 10^4)
> indx2 <- sapply(1:ncol(vals2), FUN = function(x) sample(10^4, 10^3, TRUE))
>
> dim(vals2)
> dim(indx2)
>
> ## the best contenders from round 1
> gold <-
> matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2))
> test1 <- f2(vals2, indx2)
> all.equal(gold, test1)
>
> system.time(for (i in 1:20) f2(vals2, indx2))
> system.time(for (i in 1:20)
>
> matrix(vals2[cbind(c(indx2),rep(1:ncol(indx2),each=nrow(indx2)))],nrow=nrow(indx2),ncol=ncol(indx2)))
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> On Sat, Mar 10, 2012 at 7:48 AM, Ben quant <ccqu...@gmail.com> wrote:
> > Thanks for the info. Unfortunately its a little bit slower after one
> apples
> > to apples test using my big data. Mine: 0.28 seconds. Yours. 0.73
> seconds.
> > Not a big deal, but significant when I have to do this 300 to 500 times.
> >
> > regards,
> >
> > ben
> >
> > On Fri, Mar 9, 2012 at 1:23 PM, Rui Barradas <rui1...@sapo.pt> wrote:
> >
> >> Hello,
> >>
> >> I don't know if it's the fastest but it's more natural to have an index
> >> matrix with two columns only,
> >> one for each coordinate. And it's fast.
> >>
> >> fun <- function(valdata, inxdata){
> >>        nr <- nrow(inxdata)
> >>        nc <- ncol(inxdata)
> >>        mat <- matrix(NA, nrow=nr*nc, ncol=2)
> >>        i1 <- 1
> >>        i2 <- nr
> >>        for(j in 1:nc){
> >>                mat[i1:i2, 1] <- inxdata[, j]
> >>                mat[i1:i2, 2] <- rep(j, nr)
> >>                i1 <- i1 + nr
> >>                i2 <- i2 + nr
> >>        }
> >>        matrix(valdata[mat], ncol=nc)
> >> }
> >>
> >> fun(vals, indx)
> >>
> >> Rui Barradas
> >>
> >>
> >> --
> >> View this message in context:
> >>
> http://r.789695.n4.nabble.com/Re-index-values-of-one-matrix-to-another-of-a-different-size-tp4458666p4460575.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >        [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to