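That is not a very selective regex: [FN] is a character class that matches a single F or N, not the string "FN", and [AD]{2} matches any two consecutive A or D characters anywhere in the identifier. Actually, a long "or" probably is best, but you don't have to type it in directly; you can build it from a vector of prefixes.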
prefixes <- c( "AD", "FN" )
pat <- paste0( "^(", paste( prefixes, collapse="|" ), ")[0-9]{4}$" )
grepl( pat, Identifier )
-- 
Sent from my phone. Please excuse my brevity.

On November 29, 2016 10:37:29 AM PST, Glenn Schultz <glennmschu...@me.com> wrote:
>Hello All,
>
>I have a dataframe of about 1.5 million rows from this dataframe I need
>to filter out identifiers. An example would be 070000-07099,
>AD0000-AD0999, and AL0000-AL9999, FN0000-FN9999. I am using grepl to
>identify those of interest as follows:
>
> grepl("^[FN]|[AD]{2}", Identifier)
>
>The above seems to work in the case of FN and AD. However, there are
>20 such identifiers and there must be a better way to do this than a
>long "or" statement. Ultimately, I would like to filter these out
>using dplyr which I think the first step is to create a vector of
>TRUE/FALSE then filter on TRUE
>
>Any Ideas are appreciated,
>Glenn
>
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
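As a rough check of the pattern built from the prefix vector, the sketch below compares it with the original regex on a handful of identifiers. The sample values and the data frame `df` are made up for illustration and are not from the poster's data; only base R is used, so the dplyr step is mentioned in a comment rather than run.

```r
# Hypothetical sample identifiers, invented for this illustration
Identifier <- c("AD0123", "FN9999", "AL0001", "070001",
                "NX1234", "XDAD12", "FN12345")

# Build the alternation from a vector of prefixes (extend to all 20 as needed)
prefixes <- c("AD", "FN")
pat <- paste0("^(", paste(prefixes, collapse = "|"), ")[0-9]{4}$")

# Original pattern: "starts with F or N" OR "contains two A/D characters in a row"
loose <- grepl("^[FN]|[AD]{2}", Identifier)

# Constructed pattern: one of the listed prefixes followed by exactly four digits
strict <- grepl(pat, Identifier)

# Side-by-side view; rows where the two columns differ are false positives
print(data.frame(Identifier, loose, strict))

# Use the logical vector to subset a (hypothetical) data frame
df <- data.frame(Identifier = Identifier, stringsAsFactors = FALSE)
kept    <- df[strict, , drop = FALSE]   # rows whose identifier matches
dropped <- df[!strict, , drop = FALSE]  # rows with matching identifiers removed

# With dplyr loaded, the equivalent would be roughly:
#   filter(df, grepl(pat, Identifier))    # keep matches
#   filter(df, !grepl(pat, Identifier))   # remove matches
```

Whether to keep or drop the matching rows depends on what "filter out" is meant to do, so both are shown. The `[0-9]{4}$` tail assumes every identifier is a two-character prefix followed by exactly four digits, as in the examples quoted above.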