On 13/09/2011 12:42 PM, Timothy Bates wrote:
Dear R cognoscenti,
While having NA as a native type is nifty, it is annoying when making binary
choices.
Question: Is there anything bad about writing comparison functions that
behave like %in% (which I love) and ignore NAs?
"%>%" <- function(table, x) {
  # which() drops NAs and returns the positions where the comparison is TRUE
  return(which(table > x))
}
"%<%" <- function(table, x) {
  return(which(table < x))
}
test <- c(NA, 1:4, NA, 5)
test %>% 2
# [1] 4 5 7
test %<% 2
# [1] 2
Why do I want to do this?
Because in coding, I often end up with big chunks looking like this:
((mydataframeName$myvariableName > 2 & !is.na(mydataframeName$myvariableName)) &
 (mydataframeName$myotherVariableName == "male" &
  !is.na(mydataframeName$myotherVariableName)))
Which is much less readable/maintainable/editable than
mydataframeName$myvariableName > 2 & mydataframeName$myotherVariableName == "male"
But ">" returns NA for any comparison involving an NA, so it breaks selection
statements (which can't contain NA) and leaves rows in the data that I want excluded.
If this does not have nasty side-effects, it would be a great addition to GTD*
in R
If anyone knows a shortcut to code the effect I want, I'd love to hear it.
I would suggest subsetting first if you really want to ignore the NAs.
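As a rough sketch of what subsetting first could look like: the data frame `df` and its columns `score` and `sex` below are made-up stand-ins for the poster's data, not anything from the original message. Two base R routes are shown: dropping incomplete rows with complete.cases() before comparing, and using subset(), which treats an NA in the condition as FALSE.

```r
# Hypothetical data standing in for mydataframeName (names are illustrative)
df <- data.frame(
  score = c(NA, 1:4, NA, 5),
  sex   = c("male", "male", NA, "male", "female", "male", "male"),
  stringsAsFactors = FALSE
)

# Route 1: remove rows with NA in the columns of interest, then compare as usual
complete <- df[complete.cases(df[, c("score", "sex")]), ]
sel1 <- complete[complete$score > 2 & complete$sex == "male", ]

# Route 2: subset() drops rows where the condition evaluates to NA
sel2 <- subset(df, score > 2 & sex == "male")
```

Both routes should select the same two rows here (scores 3 and 5, both "male"), without any explicit !is.na() calls in the condition.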
A problem with your suggestion is that since it doesn't return a logical
vector, it will behave quite differently from a standard comparison in
an expression. For example
(a < 5) & (b < 6)
will work (but sometimes generate NAs), but
(a %<% 5) & (b %<% 6)
will not. (You'd need to use the intersect() function.)
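To make the difference concrete, here is a small sketch under assumed data: the vectors `a` and `b` are invented for illustration, and the `%lt%` operator is a hypothetical alternative, not something proposed in the thread. It shows intersect() combining the position vectors, and a variant that returns a logical vector with NA mapped to FALSE so that & still composes.

```r
# Operators from the original post: they return integer positions, not logicals
"%>%" <- function(table, x) which(table > x)
"%<%" <- function(table, x) which(table < x)

# Made-up example vectors
a <- c(NA, 1:4, NA, 5)
b <- c(1, NA, 7, 2, 9, 3, NA)

# (a %<% 5) & (b %<% 6) would apply & to the positions themselves
# (non-zero numbers coerced to TRUE), which is not a row selection.
# intersect() keeps the positions where both conditions hold.
both <- intersect(a %<% 5, b %<% 6)

# Hypothetical alternative: keep the result logical, treating NA as FALSE
"%lt%" <- function(x, y) {
  r <- x < y
  !is.na(r) & r
}
keep <- (a %lt% 5) & (b %lt% 6)
```

With these vectors, `both` and `which(keep)` should pick out the same single position. The logical form has the advantage that it can be used directly inside `[` alongside other conditions.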
Duncan Murdoch
Cheers,
tim
* GTD = Getting Things Done
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help