Re: [Rd] duplicates() function

Duncan Murdoch Mon, 11 Apr 2011 11:05:37 -0700

On 08/04/2011 11:39 AM, Joshua Ulrich wrote:

On Fri, Apr 8, 2011 at 10:15 AM, Duncan Murdoch
<murdoch.dun...@gmail.com>  wrote:
>  On 08/04/2011 11:08 AM, Joshua Ulrich wrote:
>>
>>  How about:
>>
>>  y<- rep(NA,length(x))
>>  y[duplicated(x)]<- match(x[duplicated(x)] ,x)
>
>  That's a nice solution for vectors.  Unfortunately for me, I have a matrix
>  (which duplicated() handles by checking whole rows).  So a better example
>  that I should have posted would be
>
>  x<-  cbind(1, c(9,7,9,3,7) )
>
>  and I'd still like the same output
>
For a matrix, could you apply the same strategy used in duplicated()?


y<- rep(NA,NROW(x))
temp<- apply(x, 1, function(x) paste(x, collapse="\r"))
y[duplicated(temp)]<- match(temp[duplicated(temp)], temp)

Since this thread hasn't ended, I will say that I think this solution is
the best I've seen for my specific problem. I was actually surprised
that duplicated() did the string concatenation trick, but since it does,
it makes a lot of sense to do the same in duplicates().
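
For reference, the two suggestions in this thread can be folded into one
function. This is only a minimal sketch: duplicates() is the hypothetical
name used in this thread, not an existing base R function, and the matrix
branch assumes that pasting each row together with "\r" yields a key that
identifies the row (values are compared through their character
representation, and an element that itself contains "\r" could collide).

```r
# Hypothetical helper named after the function requested in this thread;
# it is not part of base R.
duplicates <- function(x) {
  # Matrices: collapse each row into one string key (the "\r" trick).
  # Plain vectors: use the values themselves as keys.
  key <- if (is.matrix(x)) {
    apply(x, 1, function(row) paste(row, collapse = "\r"))
  } else {
    x
  }
  dup <- duplicated(key)
  y <- rep(NA_integer_, length(key))
  # match() gives the index of the first occurrence of each duplicated key.
  y[dup] <- match(key[dup], key)
  y
}

duplicates(c(9, 7, 9, 3, 7))
# [1] NA NA  1 NA  2
duplicates(cbind(1, c(9, 7, 9, 3, 7)))
# [1] NA NA  1 NA  2
```

Both calls give the output asked for in the original post; non-duplicated
elements stay NA.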

I think a good general-purpose solution that worked wherever
duplicated() works would likely be harder, because we don't really have
the right primitives to make it work.
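
To illustrate why each input type needs its own handling, here is a rough
sketch of the same idea for data frames, another input that duplicated()
accepts. The name duplicates_df() is hypothetical, and building keys by
pasting columns is an assumption of this sketch, not a statement about how
duplicated() is implemented for data frames. It inherits the limitation
that rows are compared through their character representation.

```r
# Hypothetical data-frame variant; keys are built column-wise with paste().
duplicates_df <- function(df) {
  # unname() keeps a column called "sep" or "collapse" from being
  # mistaken for an argument of paste().
  key <- do.call(paste, c(unname(as.list(df)), sep = "\r"))
  dup <- duplicated(key)
  y <- rep(NA_integer_, nrow(df))
  # Index of the first row carrying the same key.
  y[dup] <- match(key[dup], key)
  y
}

df <- data.frame(a = 1, b = c(9, 7, 9, 3, 7))
duplicates_df(df)
# [1] NA NA  1 NA  2
```

Lists, arrays with other margins, and the remaining cases that
duplicated() covers would each need their own key construction, which is
where the lack of a suitable primitive shows.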


Duncan Murdoch

>>    duplicated(x)
>
>  [1] FALSE FALSE  TRUE FALSE TRUE
>
>>    duplicates(x)
>
>  [1] NA NA  1 NA  2
>
>
>  Duncan Murdoch
>
>>  --
>>  Joshua Ulrich  |  FOSS Trading: www.fosstrading.com
>>
>>
>>
>>  On Fri, Apr 8, 2011 at 9:59 AM, Duncan Murdoch<murdoch.dun...@gmail.com>
>>    wrote:
>>  >    I need a function which is similar to duplicated(), but instead of
>>  >  returning
>>  >    TRUE/FALSE, returns indices of which element was duplicated.  That is,
>>  >
>>  >>    x<- c(9,7,9,3,7)
>>  >>    duplicated(x)
>>  >    [1] FALSE FALSE  TRUE FALSE TRUE
>>  >
>>  >>    duplicates(x)
>>  >    [1] NA NA  1 NA  2
>>  >
>>  >    (so that I know that element 3 is a duplicate of element 1, and element
>>  >  5 is
>>  >    a duplicate of element 2, whereas the others were not duplicated
>>  >  according
>>  >    to our definition.)
>>  >
>>  >    Is there a simple way to write this function?  I have  an ugly
>>  >    implementation in R that loops over all the values; it would make more
>>  >  sense
>>  >    to redo it in C, if there isn't a simple implementation I missed.
>>  >
>>  >    Duncan Murdoch
>>  >
>>  >    ______________________________________________
>>  >    R-devel@r-project.org mailing list
>>  >    https://stat.ethz.ch/mailman/listinfo/r-devel
>>  >
>
>


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
