On Tue, Jun 2, 2009 at 1:18 PM, William Dunlap <[email protected]> wrote:
> %in% is a thin wrapper on a call to match().
Yes, as I mentioned in my email, all this is clearly documented in ?match.
> match() is not a generic function (and is not documented to be one),
> so it treats data.frames as lists, as their underlying representation is a
> list of columns.
Yes, I understand that this is the proximal cause of the current strange
behavior. What I don't understand is why the current behavior is a good
idea.
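
For concreteness, here is a minimal sketch of the behavior under discussion. The data frames df1 and df2 are made up for illustration; the only point is that the result is indexed by column, not by row.

```r
# Two small, made-up data frames that share one identical row (x = 1, y = "a").
df1 <- data.frame(x = c(1, 2), y = c("a", "b"), stringsAsFactors = FALSE)
df2 <- data.frame(x = c(1, 3), y = c("a", "c"), stringsAsFactors = FALSE)

# match() treats each data frame as a list of columns, so the result has
# one entry per column of df1 rather than one per row.
res <- match(df1, df2)
length(res) == ncol(df1)

# %in% is built on match(), so it also answers column by column and
# cannot report that the first row of df1 occurs in df2.
df1 %in% df2
```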
> match is documented to convert lists to character and to then run
> the character version of match on that character data
Yes, this peculiar behavior is documented. What I don't get is its
rationale.
> match does not bail out if the types of the x and table arguments don't
> match
> (that would be undesirable in the integer/numeric mismatch case).
Why would it 'bail out'?
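
For reference, the integer/numeric case from the quoted text looks like the sketch below (variable names are illustrative). The two arguments have different storage types, and match() coerces them to a common type instead of refusing.

```r
x_int   <- 1:3          # stored as integer
tbl_dbl <- c(3, 1, 2)   # stored as double

typeof(x_int)           # "integer"
typeof(tbl_dbl)         # "double"

# The types differ, yet the values are matched as expected.
pos <- match(x_int, tbl_dbl)
pos                     # 2 3 1
```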
> The related functions, duplicated() and unique(), do have
> row-wise data.frame methods. E.g.,
> > duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
> [1] FALSE FALSE FALSE FALSE TRUE
> Perhaps match() ought to have one also....
>
> I think that %in% and is.element() ought to remain calls to match()
> and that if you want them to work row-wise on data.frames then
> match should get a data.frame method.
After all that, it sounds like we agree...!
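
As a rough sketch of what a row-wise version could look like: match_rows() below is a hypothetical helper, not part of base R and not a proposed patch. It assumes both data frames have the same columns in the same order, and that the "\r" separator never occurs in the data.

```r
# Hypothetical helper (not in base R): match rows of x against rows of table.
match_rows <- function(x, table, nomatch = NA_integer_) {
  stopifnot(is.data.frame(x), is.data.frame(table), ncol(x) == ncol(table))
  # Build one character key per row by pasting the columns together.
  # "\r" is an assumed separator that must not appear in the data.
  key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))
  match(key(x), key(table), nomatch = nomatch)
}

df1 <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))
df2 <- data.frame(x = c(3, 1),    y = c("c", "a"))

rows <- match_rows(df1, df2)
rows                    # 2 NA 1
```

Because the rows are compared through character keys, this sketch compares values as they print, so it is only a starting point for a real data.frame method.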
-s