On Wed, Feb 15, 2012 at 02:17:35PM +1000, Redding, Matthew wrote:
> Hi All,
>
>
> I've been trawling through the documentation and listserv archives on this
> topic -- but
> as yet have not found a solution. I'm sure this is pretty simple with R, but
> I cannot work out how without
> resorting to ugly nested loops.
>
> As far as I can tell, grep, match, and %in% are not the correct tools.
>
> Question:
> given these vectors --
> patrn <- c(1,2,3,4)
> exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
>
> how do I get the desired answer by finding the occurrences of the pattern and
> returning the starting indices:
> 6, 13, 23
Hi.
A more efficient version of the previous suggestion
is as follows.
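m <- length(patrn)
n <- length(exmpl)
# every position where a full-length match could start
candidate <- seq_len(n - m + 1)
for (i in seq_len(m)) {
    # keep only the starts whose i-th element matches patrn[i]
    candidate <- candidate[patrn[i] == exmpl[candidate + i - 1]]
}
candidate
[1] 6 13 23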
In this solution, the set of candidate starting indices shrinks
with each iteration. If prefixes of the searched pattern are rare
in the data, the set is cut down within the first few iterations,
and the remaining iterations become faster.
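If it is useful, the candidate-filtering idea above could be wrapped
in a small function. This is only a sketch: the name find_pattern is
made up for illustration and is not part of base R or any package,
the early exit and the guard for an empty or too-long pattern are my
additions, and it assumes neither vector contains NA.

```r
# find_pattern is a hypothetical helper name, not an existing R function.
# Assumes patrn and exmpl contain no NA values.
find_pattern <- function(patrn, exmpl) {
    m <- length(patrn)
    n <- length(exmpl)
    # an empty pattern, or one longer than the data, cannot match
    if (m == 0 || m > n) return(integer(0))
    # every position where a full-length match could start
    candidate <- seq_len(n - m + 1)
    for (i in seq_len(m)) {
        # keep only the starts whose i-th element matches patrn[i]
        candidate <- candidate[exmpl[candidate + i - 1] == patrn[i]]
        # stop early once no candidate is left
        if (length(candidate) == 0) break
    }
    candidate
}

patrn <- c(1,2,3,4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
find_pattern(patrn, exmpl)
# [1] 6 13 23
```

A pattern that never occurs returns integer(0) rather than an error.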
Hope this helps.
Petr Savicky.