Hello,
The function Biostrings::matchPattern can be called with an
algorithm = "boyer-moore" argument.
I've never used it; it turned up in the result of
library(sos)
r1 <- findFn('boyer')
r2 <- findFn('moore')
r1 & r2
I have implemented the Boyer-Moore algorithm a couple of times, the
very first(!) time in 8086 assembly, but I see a difficulty with
your original request.
A Boyer-Moore search for subsequences of character vectors whose
elements all satisfy nchar(x) == 1 should be very easy to implement
using the .Call interface, but for integer vectors I don't see how
to implement the bad character shift table. What would the alphabet be?
The set of 32-bit integers? In that case the table length would be
prohibitive...
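
One possible way around the table size, offered only as a sketch: build the shift table for just the distinct values that occur in the pattern, and give every other value the default shift of length(pattern). The code below uses the Horspool simplification of Boyer-Moore (bad character rule only, keyed on the last element of the current window), not the full algorithm. The name horspool_search is hypothetical, the input is assumed to contain no NA values, and the plain R loop is for illustration; any speed gain would have to come from a .Call version with a hash table.

```r
# Hypothetical helper: Horspool-style search with a sparse shift table.
# Assumes x and pattern contain no NA values.
horspool_search <- function(x, pattern) {
  n <- length(x)
  m <- length(pattern)
  if (m == 0L || m > n) return(integer(0))

  # Sparse "alphabet": only the distinct values in pattern[1:(m - 1)].
  head_vals <- pattern[seq_len(m - 1L)]
  alphabet  <- unique(head_vals)
  # Shift for a value = distance from its last occurrence in head_vals
  # to the end of the pattern, i.e. its position in the reversed head.
  shift <- match(alphabet, rev(head_vals))

  hits <- integer(0)
  pos  <- 1L                      # start of the current window
  while (pos <= n - m + 1L) {
    window <- x[pos:(pos + m - 1L)]
    if (all(window == pattern)) hits <- c(hits, pos)
    # Look up the last element of the window in the sparse table;
    # values absent from the table get the full shift m.
    k    <- match(window[m], alphabet)
    step <- if (is.na(k)) m else shift[k]
    pos  <- pos + step
  }
  hits
}

horspool_search(1:100, c(49, 50, 51))
```

With this layout the table never has more than length(pattern) - 1 entries, whatever the range of the integers.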
Ideas anyone?
Rui Barradas
On 28-08-2012 22:05, Duncan Murdoch wrote:
Is there a function to efficiently search for a subsequence within a
vector?
For example, with
x <- 1:100
I'd like to search for the sequence c(49,50,51), and be told that it
occurs exactly once, starting at location 49. (The items in the
vectors might be numeric or character, and there might be repetitions
within the search pattern or within the vector I'm searching.)
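
For reference, a naive baseline for the search described above can be sketched in plain R as follows. The function name find_subseq is hypothetical, and the sketch assumes neither vector contains NA values; it simply filters candidate start positions one pattern element at a time.

```r
# Hypothetical helper: start positions of `pattern` within `x`.
# Assumes x and pattern contain no NA values.
find_subseq <- function(x, pattern) {
  n <- length(x)
  m <- length(pattern)
  if (m == 0L || m > n) return(integer(0))
  # Candidate starts: positions where the first pattern element matches.
  starts <- which(x[seq_len(n - m + 1L)] == pattern[1L])
  # Narrow the candidates down one pattern element at a time.
  for (k in seq_len(m - 1L)) {
    starts <- starts[x[starts + k] == pattern[k + 1L]]
  }
  starts
}

find_subseq(1:100, c(49, 50, 51))   # 49
```

The same function works for character vectors and reports overlapping matches when the pattern repeats within x.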
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.