Where do you get "should" and "expect" from? All the regular expression tools that I am familiar with match only non-overlapping occurrences of a pattern unless you do something extra to ask for overlaps. If you really want to understand what is going on, one of the standard references for regular expressions is "Mastering Regular Expressions" by Jeffrey Friedl. You should really read through that book before passing judgment on the correctness of an implementation.
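To make the default behaviour concrete, here is a minimal sketch using the string from the report. The helper overlapping_starts() is hypothetical (it is not part of base R) and assumes a non-empty pattern; it only illustrates what "doing something extra" could look like when done by hand, by restarting the search one character after each match start instead of after each match end.

```r
# The string from the report: "1122" repeated 10 times (40 characters).
x <- paste(rep("1122", 10), collapse = "")

# Default behaviour: after each match the search resumes just past the
# end of that match, so the matches never overlap.
default_starts <- as.integer(gregexpr("11221122", x)[[1]])

# Hypothetical helper, not a base R function: restart the search one
# character after each match *start*, so overlapping matches are found.
# Assumes `pattern` cannot match the empty string.
overlapping_starts <- function(pattern, text) {
  starts <- integer(0)
  offset <- 0L                          # characters of `text` already skipped
  while (offset < nchar(text)) {
    rest <- substring(text, offset + 1L)
    m <- regexpr(pattern, rest)
    if (m[1] == -1L) break              # no further match
    pos <- offset + as.integer(m[1])    # position in the original string
    starts <- c(starts, pos)
    offset <- pos                       # step past the match start only
  }
  starts
}

overlap_starts <- overlapping_starts("11221122", x)

print(default_starts)   # the five non-overlapping matches from the report
print(overlap_starts)   # nine start positions once overlaps are allowed
```

The manual loop is only there to show where the two counts come from; it calls regexpr() once per match, so it is not meant as an efficient replacement for gregexpr().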
If you want the overlaps, you need a regular expression that matches without consuming the whole of each occurrence. One way to do it with your example is a lookahead, which requires the second "1122" to be present but leaves it available for the next match:

> gregexpr("1122(?=1122)", paste(rep("1122", 10), collapse=""), perl=TRUE)
[[1]]
[1]  1  5  9 13 17 21 25 29 33
attr(,"match.length")
[1] 4 4 4 4 4 4 4 4 4

Each reported match has length 4, but every start position marks the beginning of one of the nine overlapping occurrences of "11221122".

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of rthom...@aecom.yu.edu
> Sent: Friday, December 12, 2008 10:05 AM
> To: r-de...@stat.math.ethz.ch
> Cc: r-b...@r-project.org
> Subject: [Rd] gregexpr - match overlap mishandled (PR#13391)
>
> Full_Name: Reid Thompson
> Version: 2.8.0 RC (2008-10-12 r46696)
> OS: darwin9.5.0
> Submission from: (NULL) (129.98.107.177)
>
>
> the gregexpr() function does NOT return a complete list of global matches as it
> should. this occurs when a pattern matches two overlapping portions of a
> string, only the first match is returned.
>
> the following function call demonstrates this error (although this is not how I
> initially discovered the problem):
> gregexpr("11221122", paste(rep("1122", 10), collapse=""))
>
> instead of returning 9 matches as one would expect, only 5 matches are returned
> . . .
>
> [[1]]
> [1] 1 9 17 25 33
> attr(,"match.length")
> [1] 8 8 8 8 8
>
> you will note, essentially, that the entire first match is then excluded from
> subsequent matching
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel