> Martin Maechler
> on Tue, 13 Sep 2016 18:33:35 +0200 writes:
> Suharto Anggono Suharto Anggono via R-devel
> on Fri, 2 Sep 2016 16:10:00 + writes:
>> I am basically fine with the change.
>> How about using just the following?
>> if(!is.character(exclude))
>> exclude <- as.vector(exclude, typeof(x)) # may result in NA
>> x <- as.character(x)
>> It looks simpler and is, more or less, equivalent.
> yes, but the current code should be slightly faster..
>> In factor.Rd, in description of argument 'exclude', "(when \code{x} is a
\code{factor} already)" can be removed.
>> A larger change that, I think, is reasonable is entirely removing the
code
>> exclude <- as.vector(exclude, typeof(x)) # may result in NA
>> The explicit coercion of 'exclude' is not necessary.
>> Function 'factor' works without it.
>> The coercion of 'exclude' may lead to a surprise because it "may result
in NA".
>> Example from
https://stat.ethz.ch/pipermail/r-help/2005-April/069053.html :
>> factor(as.integer(c(1,2,3,3,NA)), exclude=NaN)
>> excludes NA.
>> As a bonus, without the coercion of 'exclude', 'exclude' can be a factor
if 'x' is a factor. This part of an example in
https://stat.ethz.ch/pipermail/r-help/2011-April/276274.html works.
>> cc <- c("x","y","NA")
>> ff <- factor(cc)
>> factor(ff,exclude=ff[3])
> Yes, two good reasons for a change. factor() would finally
> behave according to the documentation which has been mentioning
> that 'exclude' could be a factor: ((Until my R-devel changes of a
> few weeks ago, i.e. at least in all recent released versions of R)),
> the help page for factor has said
> || If 'exclude' is used it should either be a factor with the same
> || level set as 'x' or a set of codes for the levels to be excluded.
>> However, the coercion of 'exclude' has been in function 'factor' in R
"forever".
> Indeed: On March 6, 1998, svn rev. 845, when the R source files got a
> '.R' appended, and quite a long time before R 1.0.0,
> the factor function was as short as (but using an .Internal(.) !)
> function (x, levels = sort(unique(x), na.last = TRUE), labels, exclude =
NA,
> ordered = FALSE)
> {
> if (length(x) == 0)
> return(character(0))
> exclude <- as.vector(exclude, typeof(x))
> levels <- levels[is.na(match(levels, exclude))]
> x <- .Internal(factor(match(x, levels), length(levels),
> ordered))
> if (missing(labels))
> levels(x) <- levels
> else levels(x) <- labels
> x
> }
> and already contained that line.
> Nevertheless, simplifying factor() by removing that line (or those
> 2 lines, now!) seems to only have advantages
> I'm not yet committing to anything, but currently would strongly
> consider it .. though *after* the
> ' OP '
> issue has settled a bit.
(Which it has; the decision has been to keep it.)
I have now committed Suharto's proposal above, to entirely drop the
exclude <- as.vector(exclude, typeof(x))
parts
in the factor() function... which has the two advantages
mentioned above and simplifies the code (and documentation).
r71424 | maechler | 2016-09-30 14:38:43 +0200 (Fri, 30 Sep 2016) | 1 line
Changed paths:
M /trunk/doc/NEWS.Rd
M /trunk/src/library/base/R/factor.R
M /trunk/src/library/base/man/factor.Rd
M /trunk/tests/reg-tests-1c.R
simplify factor(), allowing 'exclude= ' as documented in R <= 3.3.x
I do expect some "reaction" in CRAN/Bioconductor package space,
so the final word has not been spoken on this, but the new code
is more aestethic to me.
Thank you for the suggestion,
Martin
>>
>> On Wed, 31/8/16, Martin Maechler wrote:
>> Subject: Re: [Rd] 'droplevels' inappropriate change
>> Cc: "Martin Maechler"
>> Date: Wednesday, 31 August, 2016, 2:51 PM
> Martin Maechler
> on Sat, 27 Aug 2016 18:55:37 +0200 writes:
> Suharto Anggono Suharto Anggono via R-devel
> on Sat, 27 Aug 2016 03:17:32 + writes:
In R devel r71157, 'droplevels' documentation, in "Arguments" section,
says this about argument 'exclude'.
passed to factor(); factor levels which should be excluded from the
result even if present. Note that this was implicitly NA in R <= 3.3.1 which
did drop NA levels even when present in x, contrary to the documentation. The
current default is compatible with x[ , drop=FALSE].
The part
x[ , drop=FALSE]
should be
x[ , drop=TRUE]
>> [[elided Yahoo spam]]
>>> a "typo" by me. .. fixed now.
Saying that 'exclude' is factor levels