On Tue, Sep 27, 2011 at 5:51 PM, Marcelo Araya <marcelo...@gmail.com> wrote:
> Hi all
>
> I am analyzing bird song element sequences. I would like to know how I can
> count how many times a given subsequence occurs in a single string sequence.
>
> For example, if I have this single sequence:
>
> ABCABAABABABCAB
>
> and I am looking for the subsequence "ABC", what I need to get here is that
> the subsequence is found twice.
>
> Any idea how I can do this?
gregexpr will return the position and length of multiple matches, and you can feed it a vector. So:

> m="ABC"
> songs=c("ABCABAABABABCAB","ABACAB","ABABCABCBC")
> gregexpr(m,songs)
[[1]]
[1]  1 11
attr(,"match.length")
[1] 3 3

[[2]]
[1] -1
attr(,"match.length")
[1] -1

[[3]]
[1] 3 6
attr(,"match.length")
[1] 3 3

 - in the first item, it was found at posn 1 and 11
 - in the second it wasn't found at all
 - in the third, it was found at posn 3 and 6

So just do some apply-ing to the returned list and count the matches in each element. Watch out for the no-match case: it comes back as a single -1, so its length is 1, not 0. Count the positive positions rather than taking the length directly. Job done!

Barry

PS bonus points for spotting the hidden prog-rock song title.
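
For the counting step, here is a minimal sketch in base R. It assumes the subsequence should be matched literally (hence fixed = TRUE, which is my choice and not in the reply above), and count_matches is just a hypothetical helper name used for illustration.

```r
# Count occurrences of a literal subsequence in each string.
# count_matches is a hypothetical helper name, not part of base R.
count_matches <- function(pattern, x) {
  # fixed = TRUE treats the pattern as a plain string, not a regex
  hits <- gregexpr(pattern, x, fixed = TRUE)
  # A string with no match yields a single -1, so count the positive
  # positions instead of calling length() on each element.
  vapply(hits, function(pos) sum(pos > 0), integer(1))
}

songs <- c("ABCABAABABABCAB", "ABACAB", "ABABCABCBC")
counts <- count_matches("ABC", songs)
print(counts)
```

For the three songs above this should give 2, 0 and 2. Note that gregexpr reports non-overlapping matches only, which is fine for "ABC" but may undercount patterns that can overlap themselves, such as "ABA".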