Prof Brian Ripley wrote:
> On Mon, 19 Jan 2009, Rolf Turner wrote:
>
>>
>> On 19/01/2009, at 10:44 AM, Gabor Grothendieck wrote:
>>
>>> Well, that's why it was only provided when you insisted. This is
>>> not what regexp's are good at.
>>>
>>> On Sun, Jan 18, 2009 at 4:35 PM, Rau, Roland <r...@demogr.mpg.de> wrote:
>>>> Thanks! (I have to admit, though, that I expected something simple)
>>
>> It may not be what regexp's are good at, but the grep command in
>> unix/linux does what is required *very* simply via the ``-v'' flag.
>> I conjecture that it would not be difficult to add an argument with
>> similar impact to the grep() function in R.
>
> Indeed. I have often wondered why grep() returned indices, when a
> logical vector would seem more natural in R (and !grep(...) would have
> been all that was needed).
>
> Looking at the code I see it does in fact compute a logical vector,
> just not return it. So adding 'invert' (the long-form of -v is
> --invert) is a job of a very few lines and I have done so for 2.9.0.
>

in fact, it's simpler than that. instead of redundantly distributing the fix over four different lines in character.c, it's enough to apply ^= to the logical vector of matched/unmatched flags in just one place, on the fly, close to the end of the loop over the vector of input strings. see the attached patch.

for consistency, you might want to
- name the internal invert flag 'invert_opt' instead of 'invert';
- apply the same fix to agrep.

it's also trivial to add another argument to grep, say 'logical', which makes grep return a logical vector of the same length as the vector of input strings. this is included in the attached patch as well.

note: i am a novice to R internals, and while using the extended grep i get some mystical warnings that i haven't decoded yet (see the examples below). otherwise the code compiles well and grep works as intended; you'd need to fix the cause of the warnings.

if you want the 'logical' argument, you need to decide how it interacts with 'value'. in the patch, setting 'value' to TRUE resets 'logical' to FALSE, with a warning.

further suggestion: the arguments 'value' and 'logical' could be replaced with a single argument, say 'output', which would take one value from {'indices', 'values', 'logical'}. that might make further extensions easier to implement and maintain.

attached are patches to character.c, names.c, and grep.R; if you tell me which other files need a patch to get rid of the warnings (see below), i'll make one.
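
for illustration only (this is not the attached patch, which works at the C level): a minimal R-level sketch of how the single 'output' argument suggested above could behave. the wrapper name grep_out is hypothetical, and the sketch assumes an R version that provides grepl() in base. the inversion is applied in exactly one place, to the logical vector of match flags, and the three output forms are then derived from that one vector.

```r
# hypothetical wrapper: 'grep_out' and its 'output' argument are illustrative
# names, not part of base R and not taken from the attached patch
grep_out <- function(pattern, x,
                     output = c("indices", "values", "logical"),
                     invert = FALSE, ...) {
  output <- match.arg(output)

  # one TRUE/FALSE flag per input string; the inversion happens here only
  matched <- xor(grepl(pattern, x, ...), invert)

  # every output form is derived from the same logical vector
  switch(output,
         indices = which(matched),
         values  = x[matched],
         logical = matched)
}

s <- c("abc", "bcd", "cde")
grep_out("b", s)                                    # positions of the matches
grep_out("b", s, output = "logical")                # one flag per element of s
grep_out("b", s, output = "values", invert = TRUE)  # the non-matching strings
```

with a single 'output' argument there is no conflict between 'value' and 'logical' to resolve, since match.arg() accepts exactly one of the three choices.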

s = c("abc", "bcd", "cde")

grep("b", s)
# 1 2

grep("b", s, value=TRUE)
# "abc" "bcd"

grep("b", s, logical=TRUE)
# TRUE TRUE FALSE

s[grep("b", s, logical=TRUE)]
# "abc" "bcd"
# Warning: stack imbalance in 'grep', 9 then 10
# Warning: stack imbalance in '.Internal', 8 then 9
# Warning: stack imbalance in '{', 6 then 7

grep("b", s, invert=TRUE)
# 3

grep("b", s, invert=TRUE, value=TRUE)
# "cde"

s[!grep("b", s, logical=TRUE)]
# "cde"
# Warning: stack imbalance in 'grep', 15 then 16
# Warning: stack imbalance in '.Internal', 14 then 15
# Warning: stack imbalance in '{', 12 then 13
# Warning: stack imbalance in '!', 6 then 7
# Warning: stack imbalance in '[', 2 then 3

vQ

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.