On 03/16/2012 12:31 PM, William Dunlap wrote:
You didn't show your complete code, but the following may help you speed
things up. Compare a function, f0, structured like your code, with one, f1,
that calls sum once instead of incrementing a counter inside a loop that
runs length(x)-3 times.

f0<- function(x, test.pattern) {
     count<- 0
     for(indx in seq_len(length(x)-3)) {
        if ((x[indx] == test.pattern[1]) && (x[indx+1] == test.pattern[2]) &&
            (x[indx+2] == test.pattern[3])) {
            count<- count + 1
        }
     }
     count
}

f1<- function(x, test.pattern) {
     indx<- seq_len(length(x)-3)
     sum((x[indx] == test.pattern[1]) & (x[indx+1] == test.pattern[2]) &
         (x[indx+2] == test.pattern[3]))
}


# quasi-random sample of 10^7 from {0,...,9}
bin.05<- round((log10(1:10000000)%%1e-3 - log10(1:10000000)%%1e-4) * 1e4)
system.time(print(f0(bin.05, c(2,3,3))))
[1] 3194
    user  system elapsed
   14.35    0.00   14.35
system.time(print(f1(bin.05, c(2,3,3))))
[1] 3194
    user  system elapsed
    0.70    0.21    0.90
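
Note that both f0 and f1 stop at length(x)-3, so the last window of three
elements is never tested. The following is only a minimal sketch of one way
to generalise f1, assuming the pattern may have any length and that
overlapping matches should be counted; count_pattern is a hypothetical name,
not something from the original code.

```r
# Hypothetical helper: count (possibly overlapping) occurrences of
# `pattern` in `x`, using one vectorized comparison per pattern element.
count_pattern <- function(x, pattern) {
    k <- length(pattern)
    n_windows <- length(x) - k + 1          # number of possible start positions
    if (k == 0 || n_windows < 1) return(0L)
    starts <- seq_len(n_windows)
    hit <- rep(TRUE, n_windows)
    for (j in seq_len(k)) {
        # element j of the pattern must match at offset j - 1 from each start
        hit <- hit & (x[starts + j - 1] == pattern[j])
    }
    sum(hit)
}

# The second match ends on the last element, so it is only found
# when the final window is included.
count_pattern(c(2, 3, 3, 1, 2, 3, 3), c(2, 3, 3))   # 2
```

The loop here runs once per pattern element rather than once per element of
x, so the bulk of the work is still done by vectorized comparisons.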

You are probably also slowing things down by doing
     yourList$yourCounts[1]<- yourList$yourCounts[1] + 1
many times instead of
    count<- yourList$yourCounts[1]
once and
    count<- count + 1
many times.  The former evaluates $, [, $<-, and [<- many
times and the $<- and [<- in particular may use a fair bit of time.
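
As a rough sketch of that difference, the two styles might look like the
following. The list contents, the function names slow and fast, and the
iteration count are made up for illustration, and how large the gap is will
depend on your R version.

```r
n <- 1e5
yourList <- list(yourCounts = c(0, 0))

# Slow style: every iteration evaluates $, [, [<- and $<- on the list.
slow <- function(lst, n) {
    for (i in seq_len(n)) {
        lst$yourCounts[1] <- lst$yourCounts[1] + 1
    }
    lst
}

# Faster style: read the element once, update a local variable in the
# loop, and write the result back once at the end.
fast <- function(lst, n) {
    count <- lst$yourCounts[1]
    for (i in seq_len(n)) {
        count <- count + 1
    }
    lst$yourCounts[1] <- count
    lst
}

# Both styles should produce the same list; only the timing differs.
stopifnot(identical(slow(yourList, n), fast(yourList, n)))
system.time(slow(yourList, n))
system.time(fast(yourList, n))
```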


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Walter Anderson
Sent: Friday, March 16, 2012 10:00 AM
To: R Help
Subject: [R] Faster way to implement this search?

I am working on a simulation where I need to count the number of matches
for an arbitrary pattern in a large sequence of binomial factors.  My
current code is

      for(indx in 1:(length(bin.05)-3))
        if ((bin.05[indx] == test.pattern[1]) && (bin.05[indx+1] == test.pattern[2]) &&
            (bin.05[indx+2] == test.pattern[3]))
          return.values$count.match.pattern[1] =
            return.values$count.match.pattern[1] + 1

Since I am running the above code multiple times for each simulation on
sequences of 10,000,000 factors, the code is taking longer than I would
like.  Is there a better (more "R") way of achieving the same answer?

Walter Anderson

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Thank you for this response. That made a huge improvement in my simulation speed!
