Hi all,
I have the following regular expression problem: I want to find
complete elements of a vector that end in a repeated character but
where the repetition doesn't make up the whole word. That is, for the
vector vec:
vec <- c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")
I would like to get
"baaa"
"bbaa"
"baamm"
From tools where negative lookbehind can involve variable lengths, one
would think this would work:
grep("(?<!(?:\\1|^))(.)\\1{1,}$", vec, perl=T)
But then R doesn't like it that much ... I also know I can get it like this:
whole.word.rep <- grep("^(.)\\1{1,}$", vec, perl=TRUE) # 1 6
rep.at.end <- grep("(.)\\1{1,}$", vec, perl=TRUE) # 1 2 3 5 6
setdiff(rep.at.end, whole.word.rep) # 2 3 5
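Spelled out as a self-contained snippet that returns the elements themselves rather than their indices (just a minimal sketch of the same two-step idea; the names ends.in.rep, all.one.char and wanted are only illustrative):

```r
vec <- c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")

# TRUE where the element ends in a run of two or more identical characters
ends.in.rep <- grepl("(.)\\1+$", vec, perl = TRUE)

# TRUE where the whole element consists of one repeated character
all.one.char <- grepl("^(.)\\1+$", vec, perl = TRUE)

# keep elements that end in a repetition but are not a repetition throughout
wanted <- vec[ends.in.rep & !all.one.char]
print(wanted)  # "baaa" "bbaa" "baamm"
```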
But is there a one-line grep thingy to do this?
Thx for any pointers,
STG