On Oct 24, 2016 6:05 PM, "Joe Ceradini" <joecerad...@gmail.com> wrote:
>
> Excellent - thanks David!
> Regex syntax never fails to scare the crap out of me :)
>
> David absolutely solved my problem (in record time, no less), so it
> can be put to rest. However, if anyone knows how to accomplish the
> same thing through non base packages, like stringr or stringi, I'd be
> interested in seeing those solutions as well.

Try it, it's easy. I would be very surprised if you can't figure it out.
The stringr vignette is a good place to start.

Best,
Ista

>
> Thanks,
> Joe
>
>
> On Mon, Oct 24, 2016 at 3:42 PM, David Wolfskill <da...@catwhisker.org> wrote:
> >
> > On Mon, Oct 24, 2016 at 03:33:20PM -0600, Joe Ceradini wrote:
> > > R Helpers,
> > >
> > > I would like to extract the entire word beginning with "BT" (or "BT-")
> > > and not anything else in the string. Or, I would like to extract from
> > > BT up until the next space.
> > >
> > > test <- data.frame(x = c("abc", "Sample BT-1501-2E stuff", "Bt-1599-3E stuff"))
> > > test
> > >
> > > So, from test$x I would like to only extract "BT-1501-2E" and "Bt-1599-3E".
> > >
> > > I started with straight grep but of course that is not what I need.
> > > grep("BT", test$x, value = TRUE, ignore.case = TRUE)
> > > "Sample BT-1501-2E stuff" "Bt-1599-3E stuff"
> > >
> > > In a somewhat similar post, the solution involved boundaries or
> > > anchors, but I haven't been able to adapt it to my needs, so I won't
> > > even bother including my boundary attempts :)
> > > http://stackoverflow.com/questions/7227976/using-grep-in-r-to-find-strings-as-whole-words-but-not-strings-as-part-of-words
> > >
> > > If possible, it would also be helpful if something was returned, like
> > > NA, for rows without a "BT" match. So, conceptually, test$x would
> > > return:
> > > NA, "BT-1501-2E", "Bt-1599-3E".
> > >
> > > Thanks!
> > > Joe
> > > ....
> >
> > This is not exactly what you requested, as it returns the original
> > unmodified string when there's no match; I expect you can come up with
> > some code to test for that. It does, however, meet the rest of your
> > requirements -- or so I believe:
> >
> > > test
> >                         x
> > 1                     abc
> > 2 Sample BT-1501-2E stuff
> > 3        Bt-1599-3E stuff
> > > sub("^.*(BT-?\\w*).*$", "\\1", test$x, ignore.case = TRUE, perl = TRUE)
> > [1] "abc"     "BT-1501" "Bt-1599"
> > >
> >
> > Peace,
> > david
> > --
> > David H. Wolfskill da...@catwhisker.org
> > Those who would murder in the name of God or prophet are blasphemous cowards.
> >
> > See http://www.catwhisker.org/~david/publickey.gpg for my public key.
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
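
For anyone finding this thread later: the sub() call quoted above stops at "BT-1501" because \w does not match the second hyphen, and it passes non-matching strings through unchanged. Below is a minimal sketch of one way to get the whole token and NA for non-matches, using regexpr() and regmatches() from base R. The helper name extract_bt and the pattern "\\bBT\\S*" are my own assumptions, not something posted in the thread; the pattern treats "everything from BT up to the next whitespace" as the token, so a word like "btw" would also match. The stringr line at the end is guarded so the block still runs if stringr is not installed.

```r
# Sample data from the original post; keep x as character rather than factor.
test <- data.frame(x = c("abc", "Sample BT-1501-2E stuff", "Bt-1599-3E stuff"),
                   stringsAsFactors = FALSE)

# Hypothetical helper (not from the thread): return the first word that
# starts with "BT" (case-insensitive), or NA when a string has no such word.
extract_bt <- function(x) {
  x <- as.character(x)
  # \\b anchors at a word boundary; \\S* runs up to the next whitespace.
  m <- regexpr("\\bBT\\S*", x, ignore.case = TRUE, perl = TRUE)
  hit <- !is.na(m) & m != -1          # regexpr() returns -1 for no match
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)        # regmatches() returns only the matches
  out
}

extract_bt(test$x)
# [1] NA           "BT-1501-2E" "Bt-1599-3E"

# Roughly equivalent with stringr, if it is installed; str_extract()
# returns NA for elements without a match.
if (requireNamespace("stringr", quietly = TRUE)) {
  stringr::str_extract(test$x, stringr::regex("\\bBT\\S*", ignore_case = TRUE))
}
```

If the tokens can be followed by punctuation rather than a space, \\S* will swallow that too, so a tighter character class such as [[:alnum:]-]* may suit real data better.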