Peter,

Thanks for the reply.

If that were the case, then should not the following be allowed to work with
ordered factors?

    > median(factor(c("1", "2", "3"), ordered = TRUE))
    Error in median.default(factor(c("1", "2", "3"), ordered = TRUE)) :
      need numeric data

At least on the surface, if you can lexically order a character vector:

    > median(c("red", "blue", "green"))
    [1] "green"

you can also order a factor, or ordered factor, and if the number of elements
is odd, return a median value.

Regards,

Marc


> On Jan 9, 2020, at 10:46 AM, peter dalgaard <pda...@gmail.com> wrote:
>
> I think median() behaves as designed: As long as the argument can be ordered,
> the "middle observation" makes sense, except when the middle falls between
> two categories, and you can't define an average of the two candidates for a
> median.
>
> The "sick man" would seem to be var(). Notice that it is also inconsistent
> with cov():
>
>> cov(c("1","2","3","4"),c("1","2","3","4") )
> Error in cov(c("1", "2", "3", "4"), c("1", "2", "3", "4")) :
>  is.numeric(x) || is.logical(x) is not TRUE
>> var(c("1","2","3","4"),c("1","2","3","4") )
> [1] 1.666667
>
> -pd
>
>
>> On 9 Jan 2020, at 14:49 , Marc Schwartz via R-devel <r-devel@r-project.org>
>> wrote:
>>
>> Jean-Luc,
>>
>> Please keep the communications on the list, for the benefit of others, now
>> and in the future, via the list archive. I am adding r-devel back here.
>>
>> I can't speak to the rationale in some of these cases. As I noted, it may be
>> (is likely) due to differing authors over time, and there may have been
>> relevant use cases at the time that the code was written, resulting in the
>> various checks. Presumably, the additional checks were not incorporated into
>> the other functions to enforce a level of consistency.
>>
>> We will need to wait for someone from R Core to comment.
>>
>> Regards,
>>
>> Marc
>>
>>> On Jan 9, 2020, at 8:34 AM, Lipatz Jean-Luc <jean-luc.lip...@insee.fr>
>>> wrote:
>>>
>>> Ok, inconsistencies.
>>>
>>> The last test you wrote is a bit strange. I agree that it is useful to warn
>>> about a computation that makes no sense in the case of factors. But why
>>> test data frames? If you go that way using random structures, you can
>>> also try:
>>>
>>>> median(list(1,2),list(3,4),list(4,5))
>>> Error in if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x)))
>>> return(x[FALSE][NA]) :
>>>  argument is not interpretable as logical
>>> In addition: Warning message:
>>> In if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x)))
>>> return(x[FALSE][NA]) :
>>>  the condition has length > 1 and only the first element will be used
>>>
>>> giving a message which, despite its length, doesn't really explain the
>>> reason for the error.
>>>
>>> Why not a test on the arguments, like:
>>> if (!is.numeric(x))
>>>    stop("need numeric data")
>>>
>>>
>>> -----Original Message-----
>>> From: Marc Schwartz <marc_schwa...@me.com>
>>> Sent: Thursday, January 9, 2020 14:19
>>> To: Lipatz Jean-Luc <jean-luc.lip...@insee.fr>
>>> Cc: R-Devel <r-devel@r-project.org>
>>> Subject: Re: [Rd] mean
>>>
>>>
>>>> On Jan 9, 2020, at 7:40 AM, Lipatz Jean-Luc <jean-luc.lip...@insee.fr>
>>>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Is there a reason for the following behaviour?
>>>>> mean(c("1","2","3"))
>>>> [1] NA
>>>> Warning message:
>>>> In mean.default(c("1", "2", "3")) :
>>>>  argument is not numeric or logical: returning NA
>>>>
>>>> But:
>>>>> var(c("1","2","3"))
>>>> [1] 1
>>>>
>>>> And also:
>>>>> median(c("1","2","3"))
>>>> [1] "2"
>>>>
>>>> But:
>>>>> quantile(c("1","2","3"),p=.5)
>>>> Error in (1 - h) * qs[i] :
>>>>  non-numeric argument to binary operator
>>>>
>>>> It sounds like a lack of symmetry.
>>>> Best regards.
>>>>
>>>>
>>>> Jean-Luc LIPATZ
>>>> Insee - Direction générale
>>>> Head of coordination for R development and the implementation of
>>>> alternatives to SAS
>>>
>>>
>>> Hi,
>>>
>>> It would appear, whether by design or just inconsistent implementations,
>>> perhaps by different authors over time, that the checks for whether or not
>>> the input vector is numeric differ across the functions.
>>>
>>> A further inconsistency is for median(), where:
>>>
>>>> median(c("1", "2", "3", "4"))
>>> [1] NA
>>> Warning message:
>>> In mean.default(sort(x, partial = half + 0L:1L)[half + 0L:1L]) :
>>>  argument is not numeric or logical: returning NA
>>>
>>> as a result of there being 4 elements, rather than 3, and the internal
>>> checks in the code, where in the case of the input vector having an even
>>> number of elements, mean() is used:
>>>
>>>    if (n%%2L == 1L)
>>>        sort(x, partial = half)[half]
>>>    else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])
>>>
>>>
>>> Similarly:
>>>
>>>> median(factor(c("1", "2", "3")))
>>> Error in median.default(factor(c("1", "2", "3"))) : need numeric data
>>>
>>> because the input vector is a factor, rather than character, and the
>>> initial check has:
>>>
>>>    if (is.factor(x) || is.data.frame(x))
>>>        stop("need numeric data")
>>>
>>>
>>> Regards,
>>>
>>> Marc Schwartz
>>>
>>>
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
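
P.S. On the ordered-factor question at the top of this message, here is a
minimal sketch of what a "middle observation" for an ordered factor could look
like. The helper name median_ordered() is hypothetical: it is not part of base
R, and nobody in this thread has proposed this exact code. It assumes the
behaviour Peter describes, namely that the median is only defined when the
middle does not fall between two different categories, and it returns NA in
that case instead of trying to average levels.

```r
# Hypothetical helper, not part of base R: median of an ordered factor.
median_ordered <- function(x, na.rm = FALSE) {
  stopifnot(is.ordered(x))

  # NA result that keeps the levels and ordering of the input
  na_value <- factor(NA, levels = levels(x), ordered = TRUE)

  # Same NA policy as median.default(): drop NAs or propagate them
  if (na.rm) x <- x[!is.na(x)] else if (anyNA(x)) return(na_value)

  n <- length(x)
  if (n == 0L) return(na_value)

  # Work on the integer codes, which follow the level ordering
  codes <- sort(as.integer(x))
  half <- (n + 1L) %/% 2L
  mid <- if (n %% 2L == 1L) codes[half] else codes[half + 0L:1L]

  # Even length with two different middle levels: no average can be defined
  if (length(unique(mid)) > 1L) return(na_value)

  factor(levels(x)[mid[1L]], levels = levels(x), ordered = TRUE)
}

median_ordered(factor(c("1", "2", "3"), ordered = TRUE))       # level "2"
median_ordered(factor(c("1", "2", "3", "4"), ordered = TRUE))  # NA
```

With an even number of elements the result is still defined whenever both
middle observations share the same level, so only the genuinely ambiguous case
yields NA.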