Re: [R] nmax parameter in factor function

2017-06-04 Thread peter dalgaard

No anomaly, it is just that you need to know what it is for, before trying to use it. Basically, duplicated() works by looking up entries in a hash table (for which there is a substantial literature, just google it). This will be somewhat more efficient if you know the number of unique values
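
As a rough illustration of the point about knowing the number of unique values in advance, the sketch below passes nmax as an upper bound to factor(), duplicated() and unique(). The vector x, the seed and the bound of 16 are made up for illustration and are not from the thread; whether any speed-up is noticeable will depend on the data and the R version.

```r
# Illustrative data (an assumption, not from the thread):
# a long character vector with only three distinct values.
set.seed(1)
x <- sample(c("low", "mid", "high"), 1e5, replace = TRUE)

# Default: no hint about the number of unique values.
f_default <- factor(x)

# nmax = 16 is a comfortable upper bound here, since x has 3 unique values.
f_hinted <- factor(x, nmax = 16)

# With an adequate bound the result should not change.
same_factor <- identical(f_default, f_hinted)

# The same argument is accepted by duplicated() and unique().
same_dups <- identical(duplicated(x), duplicated(x, nmax = 16))
same_uniq <- identical(unique(x), unique(x, nmax = 16))
```

The example deliberately uses a bound larger than the true number of unique values; what happens when nmax is too small is the subject of the rest of the thread and is not asserted here.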

Re: [R] nmax parameter in factor function

2017-06-03 Thread Bert Gunter

I'll go just a bit "fer-er." It appears the anomaly -- I hesitate to call it a bug -- is in the C code for duplicated.default():

> duplicated(letters[1:10],nmax=10)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> duplicated(letters[1:10],nmax=9)
 [1] FALSE FALSE FALSE FALSE FAL

Re: [R] nmax parameter in factor function

2017-06-03 Thread Bert Gunter

Well, you won't like this, but it is kind of wimpily (is that a word?) documented: If you check the code of factor(), you will see that nmax appears as an argument in a call to unique(). ?unique says for nmax, "... see duplicated". And ?duplicated says: "If nmax is set too small there is liable
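
The chain described above, from factor() to unique() to duplicated(), can be checked from the R prompt. This is a small sketch, assuming a reasonably recent R in which factor(), unique.default() and duplicated.default() all take an nmax argument; the variable names are only illustrative.

```r
# Does each function in the chain take an nmax argument?
has_nmax <- sapply(
  list(factor = factor,
       unique.default = unique.default,
       duplicated.default = duplicated.default),
  function(f) "nmax" %in% names(formals(f))
)
print(has_nmax)

# Show the lines of factor()'s body that mention nmax.
factor_src <- deparse(body(factor))
nmax_lines <- grep("nmax", factor_src, value = TRUE)
print(nmax_lines)
```

Reading ?unique and ?duplicated alongside this output gives the documented meaning of the argument.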

[R] nmax parameter in factor function

2017-06-03 Thread Ramnik Bansal

I have been trying to understand how the argument 'nmax' works in the 'factor' function. The R documentation states: "Since factors typically have quite a small number of levels, for large vectors x it is helpful to supply nmax as an upper bound on the number of unique values." In the code below what is