Le 15/08/10 02:43, david h shanabrook a écrit :
I need to increment cells of a matrix (collusionM).  The indexes to increment 
are in an index (matchIndex). This is sample  code

    library(seqinr)
    library(Matrix)
    x<- "abcabcabc"
    mx<- s2c(x)
    collisionM<- Matrix(0,nrow=10, ncol=10, sparse=TRUE)
    matchIndex<- list()
    rows<- length(mx)
    for (j in 1:(rows)) matchIndex[[j]]<- which(mx[j]==mx)
    for (j in 1:(rows)) collisionM[j,matchIndex[[j]]]<- 
collisionM[j,matchIndex[[j]]] + 1

   Works fine, except with my data (rows=32000) it is too slow.  The problem is 
the second for loop, where it increments the index of the sparse matrix; this 
needs to be rewritten so it is more efficient.  Any ideas?

Hi,

Nice exercise to demonstrate matrix indexing. First let's look at it on a simpler example:

> x <- matrix( 1:16, nr = 4 )
> x
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16
> y <- cbind( c(1,2,3,4), c(1,2,3,4) )
> y
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    3
[4,]    4    4
> x[ y ]
[1]  1  6 11 16

Now we can apply it to your example, below f is your code and g uses matrix indexing:

f <- function(x = "abcabcabc"){
        mx <- s2c(x)
        rows <- length(mx)
        collisionM <- Matrix(0,nrow=rows, ncol=rows, sparse=TRUE)
        matchIndex <- list()
        for (j in 1:(rows)) matchIndex[[j]] <- which(mx[j]==mx)
for (j in 1:(rows)) collisionM[j,matchIndex[[j]]] <- collisionM[j,matchIndex[[j]]] + 1
        collisionM
}

g <- function( x = "abcabcabc"){
        mx <- s2c(x)
        rows <- length(mx)
        collisionM <- Matrix(0,nrow=rows, ncol=rows, sparse=TRUE)
        
        matchIndex <- do.call( rbind, lapply( 1:rows, function(      i){
                cbind( i, which( mx[i] == mx ) )
        } ) )
        collisionM[ matchIndex ] <- collisionM[matchIndex] + 1
        collisionM
}

# first check it works as expected :
> all( f() == g() )
[1] TRUE

# then do some timings
> long <- paste( sample(letters, 1000, replace = TRUE ), collapse = "" )
> system.time( f( x = long ) )
   user  system elapsed
  8.022   1.052   9.074
> system.time( g( x = long ) )
   user  system elapsed
  0.079   0.003   0.082



... and it seems we don't even need indexing at all, we can just create the matrix using sparseMatrix :

h <- function( x = "abcabcabc"){
        mx <- s2c(x)
        rows <- length(mx)
        matchIndex <- do.call( rbind, lapply( 1:rows, function(      i){
                cbind( i, which( mx[i] == mx ) )
        } ) )
        sparseMatrix( matchIndex[,1], matchIndex[,2] )
}

which gives some improvement :

> system.time( h( x = long ) )
   user  system elapsed
  0.048   0.000   0.048

Romain


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/bzoWrs : Rcpp svn revision 2000
|- http://bit.ly/b8VNE2 : Rcpp at LondonR, oct 5th
`- http://bit.ly/aAyra4 : highlight 0.2-2

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to