No anomaly; you just need to know what it is for before trying to
use it.
Basically, duplicated() works by looking up entries in a hash table (there
is a substantial literature on these; just google it). This will be somewhat more
efficient if you know the number of unique values
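
To make the hash-table idea above concrete, here is a rough sketch in plain R. It is not the C code behind duplicated(); dup_sketch() and its expected_unique argument are made-up names for illustration, and an environment stands in for the hash table. The size hint passed to new.env() is only a loose analogy to nmax, and the sketch assumes non-empty, non-NA character values.

```r
# Hypothetical helper: illustrates hash-based duplicate detection only.
# It is NOT how duplicated.default() is implemented internally.
dup_sketch <- function(x, expected_unique = max(1L, length(x))) {
  # An environment with hash = TRUE acts as a hash set; `size` is just an
  # initial-size hint (assumed here to play a role loosely similar to nmax).
  seen <- new.env(hash = TRUE, size = as.integer(expected_unique))
  out <- logical(length(x))
  for (i in seq_along(x)) {
    key <- as.character(x[[i]])   # assumes non-empty, non-NA keys
    if (exists(key, envir = seen, inherits = FALSE)) {
      out[i] <- TRUE              # value seen before: it is a duplicate
    } else {
      assign(key, TRUE, envir = seen)  # first occurrence: remember it
    }
  }
  out
}

x <- c("a", "b", "a", "c", "b")
dup_sketch(x)
# [1] FALSE FALSE  TRUE FALSE  TRUE
identical(dup_sketch(x), duplicated(x))
# [1] TRUE
```

Each element costs one lookup and at most one store, which is why a reasonable estimate of the number of distinct values can help size the table up front.
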
I'll go just a bit "fer-er." It appears the anomaly -- I hesitate to
call it a bug -- is in the C code for duplicated.default():
> duplicated(letters[1:10],nmax=10)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> duplicated(letters[1:10],nmax=9)
[1] FALSE FALSE FALSE FALSE FAL
Well, you won't like this, but it is kind of wimpily (is that a word?)
documented:
If you check the code of factor(), you will see that nmax appears as
an argument in a call to unique(). ?unique says for nmax, "... see
duplicated". And ?duplicated says:
"If nmax is set too small there is liable