I like Boris's "Hadley" solution. For the record, I've appended a version that uses regular expressions, the only benefit of which is that it could be generalized to find more-complicated patterns.
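To illustrate what generalizing to a more complicated pattern might look like, here is a minimal sketch that also requires a leading number and a ")" or "." delimiter before the word. The sample vector x, the pattern, and the helper count_matches() are assumptions made up for illustration; they are not the text1 data or anyone's posted solution, and the pattern would likely need tuning for real input.

```r
# Hypothetical sample data (not the text1 vector from this thread)
x <- c("1) Example 1\n2) Example 2\n10) Example 10",
       "1. Example 1. 2. Example 2.",
       "No list items here.")

# Assumed pattern: one or more digits, a ")" or "." delimiter,
# whitespace, then the literal word "Example"
pattern <- "[0-9]+[).][[:space:]]+Example"

# Hypothetical helper: number of non-overlapping matches per string.
# gregexpr() returns -1 for a string with no match, so count only
# positive match positions instead of taking length().
count_matches <- function(strings, pattern) {
  vapply(gregexpr(pattern, strings), function(m) sum(m > 0), integer(1))
}

counts <- count_matches(x, pattern)
print(counts)  # 3 2 0 for the sample data above
```

Counting positive positions, rather than the length of the match vector, keeps the result at 0 for a string with no matches.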
-- Mike

counts <- sapply(text1, function(next_string) {
    # gregexpr() returns -1 when there is no match, so count only
    # positive match positions rather than taking length()
    matches <- gregexpr("Example", next_string)[[1]]
    sum(matches > 0)
}, USE.NAMES=FALSE)

> counts
[1] 5 5 5 5

On Tue, Apr 25, 2017 at 5:33 PM, Boris Steipe <boris.ste...@utoronto.ca> wrote:
> I should add: there's a str_count() function in the stringr package.
>
>   library(stringr)
>   str_count(text1, "Example")
>   # [1] 5 5 5 5
>
> I guess that would be the neater solution.
>
> B.
>
>> On Apr 25, 2017, at 8:23 PM, Boris Steipe <boris.ste...@utoronto.ca> wrote:
>>
>> How about:
>>
>>   unlist(lapply(strsplit(text1, "Example"), function(x) { length(x) - 1 } ))
>>
>> Splitting your string on the five "Examples" in each gives six elements.
>> length(x) - 1 is the number of matches. You can use any regex instead of
>> "Example" if you need to tweak what you are looking for.
>>
>> B.
>>
>>> On Apr 25, 2017, at 8:14 PM, Dan Abner <dan.abne...@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I am looking for a streamlined way of counting the number of enumerated
>>> items in each element of a character vector. For example:
>>>
>>> text1<-c("This is an example.
>>> List 1
>>> 1) Example 1
>>> 2) Example 2
>>> 10) Example 10
>>> List 2
>>> 1) Example 1
>>> 2) Example 2
>>> These have been examples.","This is another example.
>>> List 1
>>> 1. Example 1
>>> 2. Example 2
>>> 10. Example 10
>>> List 2
>>> 1. Example 1
>>> 2. Example 2
>>> These have been examples.","This is a third example. List 1 1) Example 1.
>>> 2) Example 2. 10) Example 10. List 2 1) Example 1. 2) Example 2. These have
>>> been examples."
>>> ,"This is a fourth example. List 1 1. Example 1. 2. Example 2. 10. Example
>>> 10. List 2 Example 1. 2. Example 2. These have been examples.")
>>>
>>> text1
>>>
>>> ===
>>>
>>> I would like the result to be c(5,5,5,5). Notice that sometimes there are
>>> leading hard returns, other times not. Sometimes there are separate lists,
>>> and the same numbers are used in the enumerated items multiple times within
>>> each character string. Sometimes the leading numbers for the enumerated
>>> items exceed single digits. Notice that the delimiter may be ) or a period
>>> (.). If the delimiter is a period and there are hard returns (example 2),
>>> then I expect it will be easy enough to differentiate sentences ending
>>> with a number from enumerated items. However, I imagine it would be much
>>> more difficult to differentiate the two for example 4.
>>>
>>> Any suggestions are appreciated.
>>>
>>> Best,
>>>
>>> Dan
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.