On Feb 5, 2010, at 1:50 AM, Bert Gunter wrote:

Folks:

You can make use of matrix subscripting and avoid R level loops and applys
altogether. This will end up being many times faster.

Here's your original code:

Z=matrix(rnorm(20), nrow=4)
index=replicate(4, sample(1:5, 3))
P=4
tmpr=list()
for (i in 1:P)
{
 tmp = Z[i,index[,i]]
 tmpr[[i]]=tmp
}

For clarity, here's the index matrix I got:
index
    [,1] [,2] [,3] [,4]
[1,]    5    1    2    3
[2,]    2    2    4    4
[3,]    1    5    5    5

Here's what I got for tmpr when I used your code:

tmpr
[[1]]
[1] -0.6246316 -0.8695538 -0.4136176

[[2]]
[1]  0.02885345 -1.89837071  0.43195955

[[3]]
[1]  0.2453368 -0.1788287 -0.6620405

[[4]]
[1] -0.87077697 -1.62554371  0.04464793

So the ith component of tmpr is just what the indices in the ith column of
index pick out of the ith row of Z. That is, the first component of tmpr
consists of the (1,5), (1,2), and (1,1) elements of Z. Matrix (in general,
array) indexing -- read the man page for "[" carefully: it's documented in
the "Matrices and Arrays" section -- allows you to "stack" these pairs (for
n-dim arrays, n-tuples) row-wise into a matrix and use this matrix as an
index:

Z[cbind(c(1,1,1),index[,1])]
[1] -0.6246316 -0.8695538 -0.4136176
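
As a small, deterministic illustration of the same idea (the matrix m and the index matrix idx below are made up for this example and are not from the thread), each row of a two-column index matrix names one (row, column) cell to extract:

```r
# Hypothetical 3 x 4 matrix, filled column-wise with 1..12
m <- matrix(1:12, nrow = 3)

# Each row of idx is one (row, column) pair: (1,4), (3,1), (2,2)
idx <- cbind(c(1, 3, 2), c(4, 1, 2))

# A two-column matrix used as the only subscript picks out individual cells
picked <- m[idx]
print(picked)   # the elements m[1,4], m[3,1], m[2,2]
```

The result is a plain vector with one element per row of idx, in the order the pairs were listed.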

So you can do everything at once (making use of R's columnwise storage of
arrays) with:

result <- Z[cbind(as.vector(col(index)), as.vector(index))]

which gives:

[1] -0.62463163 -0.86955383 -0.41361765  0.02885345 -1.89837071  0.43195955  0.24533679
[8] -0.17882867 -0.66204048 -0.87077697 -1.62554371  0.04464793

Note that this vector is the same as: unlist(tmpr). So you can turn it into
a matrix e.g. where column i is the ith component of tmpr by:

dim(result) <- dim(index)
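
One way to convince yourself that the two approaches agree is to run both on the same data and compare. This is only a sketch: the seed is arbitrary and was not part of the original post, so the numbers will differ from those shown above.

```r
set.seed(1)  # arbitrary seed, chosen only to make the check reproducible
Z <- matrix(rnorm(20), nrow = 4)
index <- replicate(4, sample(1:5, 3))
P <- 4

# Loop version from the original post
tmpr <- list()
for (i in 1:P) tmpr[[i]] <- Z[i, index[, i]]

# Matrix-indexing version: col(index) supplies the row of Z,
# the entries of index supply the column of Z
result <- Z[cbind(as.vector(col(index)), as.vector(index))]
stopifnot(identical(result, unlist(tmpr)))

# After reshaping, column i of result is the ith component of tmpr
dim(result) <- dim(index)
stopifnot(identical(result[, 2], tmpr[[2]]))
```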

As I said, for large problems, this should be wayyyyy faster than explicit loops or the hidden (and optimized, but still) loops of apply functions.

Well, twice as fast as the explicit loop, anyway:

> system.time( replicate(10000, {result <- Z[cbind(as.vector(col(index)), as.vector(index))]; dim(result) <- dim(index)} )
+ )
   user  system elapsed
  0.164   0.001   0.171

> system.time( replicate(10000, for (i in 1:P) { tmpr[[i]] <- Z[i,index[,i]] } ) )
  user  system elapsed
 0.267   0.049   0.330

Which was in turn twice as fast as the lapply approach:

> system.time( replicate(10000, tmpr[1:4]<-lapply(1:4, function(i, x, y) {x[i,y[,i]]}, Z, index ) ) )
  user  system elapsed
 0.628   0.015   0.646
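
The timings above are for a 4 x 5 matrix, where fixed overhead dominates. To see how the comparison behaves on a larger problem, something along these lines could be tried. It is a rough sketch: the sizes P, K, and k and the helper names loop_fun and vec_fun are invented for illustration, and the timings will vary by machine, so none are shown.

```r
set.seed(42)                     # arbitrary seed
P <- 2000; K <- 50; k <- 10      # hypothetical sizes: P rows, K columns, k picks per row
Z <- matrix(rnorm(P * K), nrow = P)
index <- replicate(P, sample(1:K, k))   # k x P matrix of column indices

# Explicit loop, as in the original post
loop_fun <- function() {
  out <- vector("list", P)
  for (i in seq_len(P)) out[[i]] <- Z[i, index[, i]]
  out
}

# Matrix indexing, as suggested above
vec_fun <- function() {
  res <- Z[cbind(as.vector(col(index)), as.vector(index))]
  dim(res) <- dim(index)
  res
}

# Both must return the same values before the timings mean anything
stopifnot(identical(unlist(loop_fun()), as.vector(vec_fun())))

t_loop <- system.time(for (r in 1:20) loop_fun())
t_vec  <- system.time(for (r in 1:20) vec_fun())
print(t_loop)
print(t_vec)
```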

--
David.



Bert Gunter
Genentech Nonclinical Statistics




-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org ] On
Behalf Of RICHARD M. HEIBERGER
Sent: Thursday, February 04, 2010 9:10 PM
To: Carrie Li
Cc: r-help
Subject: Re: [R] How do I use "tapply" in this case ?

lapply(1:4, function(i, x, y) {x[i,y[,i]]}, Z, index )  ## reproduces your results

sapply(1:4, function(i, x, y) {x[i,y[,i]]}, Z, index )  ## collapses your list into a set of columns
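
To make the difference between the two calls concrete, here is a small self-contained sketch. It assumes, as in the original loop, that row i of Z is indexed by column i of index; the seed and the helper name pick are not from the thread.

```r
set.seed(2)  # arbitrary seed
Z <- matrix(rnorm(20), nrow = 4)
index <- replicate(4, sample(1:5, 3))

# Hypothetical helper: elements of row i of x selected by column i of y
pick <- function(i, x, y) x[i, y[, i]]

as_list <- lapply(1:4, pick, Z, index)   # list of 4 numeric vectors of length 3
as_cols <- sapply(1:4, pick, Z, index)   # 3 x 4 matrix, one column per list element

stopifnot(is.list(as_list), identical(dim(as_cols), c(3L, 4L)))
stopifnot(identical(as_cols[, 3], as_list[[3]]))
```

sapply only simplifies to a matrix because every component has the same length; with unequal lengths it would return a list, like lapply.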

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
