------------------ >>>>> Henric Winell <[hidden email]> >>>>> on Wed, 21 Oct 2015 13:43:02 +0200 writes:
> On 2015-10-21 at 07:24, Suharto Anggono Suharto Anggono via R-devel wrote:
>> Marius Hofert-4
>> ------------------------------
>>> On 2015-10-09 at 12:14, Martin Maechler wrote:
>>> I think so: the code above doesn't seem to do the right thing.  Consider
>>> the following example:
>>>
>>> > x <- c(1, 1, 2, 3)
>>> > rank2(x, ties.method = "last")
>>> [1] 1 2 4 3
>>>
>>> That doesn't look right to me -- I had expected
>>>
>>> > rev(sort.list(x, decreasing = TRUE))
>>> [1] 2 1 3 4
>>>
>>
>> Indeed, well spotted, that seems to be correct.
>>
>>>
>>> Henric Winell
>>>
>> ------------------------------
>>
>> In the particular example (of length 4), what is really wanted is the following.
>> ind <- integer(4)
>> ind[sort.list(x, decreasing=TRUE)] <- 4:1
>> ind

> You don't provide the output here, but 'ind' is, of course,

>> ind
> [1] 2 1 3 4

>> The following gives the desired result:
>> sort.list(rev(sort.list(x, decreasing=TRUE)))

> And, again, no output, but

>> sort.list(rev(sort.list(x, decreasing=TRUE)))
> [1] 2 1 3 4

> Why is it necessary to use 'sort.list' on the result from
> 'rev(sort.list(...'?

You can try all kinds of code on this *too* simple example and
do experiments.
But let's approach this a bit more scientifically and hence
systematically:

Look at  rank  {the R function definition} to see that
for the case of no NA's,

  rank(x, ties.method = "first")   ===   sort.list(sort.list(x))

If you assume that to be correct and want to define "last" to be
correct as well (in the sense of being "first"-consistent),
it is clear that

  rank(x, ties.method = "last")   ===   rev(sort.list(sort.list(rev(x))))

must also be correct.  I don't think that *any* of the proposals so
far had a correct version [but the too simplistic examples did
not show the problems].
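[Editorial illustration] The two identities can be tried on input that is less trivial than c(1, 1, 2, 3). What follows is only a minimal sketch: the helper names rank_first and rank_last are made up for illustration, and the comparison against rank() assumes an R version in which ties.method = "last" is available and that x contains no NAs.

```r
## Hypothetical helpers; the names are illustrative, not part of base R.
rank_first <- function(x) sort.list(sort.list(x))
rank_last  <- function(x) rev(sort.list(sort.list(rev(x))))

## Small example where the ties are not adjacent
x <- c(2, 1, 2, 1)
rank_first(x)   # [1] 3 1 4 2  (within ties, ranks increase with position)
rank_last(x)    # [1] 4 2 3 1  (within ties, ranks decrease with position)

## Larger input with many ties and no NAs
set.seed(1)
y <- sample(5L, 20L, replace = TRUE)

## Assumption: this R version's rank() accepts ties.method = "last"
ok_first <- all(rank_first(y) == rank(y, ties.method = "first"))
ok_last  <- all(rank_last(y)  == rank(y, ties.method = "last"))
stopifnot(ok_first, ok_last)
```

The second example is the more informative one: with c(1, 1, 2, 3) many different formulas happen to agree, which is why it cannot distinguish a correct proposal from an incorrect one.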
In the R-devel (R development) version of today, i.e., svn
revision >= 69549, the implementation of ties.method = "last"
uses

    ## == rev(sort.list(sort.list(rev(x)))) :
    if(length(x) == 0) integer(0)
    else { i <- length(x):1L
           sort.list(sort.list(x[i]))[i] },

which is equivalent to using rev() but a bit more efficient.

Martin Maechler,
ETH Zurich

------------------

I'll defend that my code is correct in general.

It all comes from the fact that, if p is a permutation of 1:n,

  { ind <- integer(n); ind[p] <- 1:n; ind }

gives the same result as

  sort.list(p)

You can make sense of it like this. In ind[p] <- 1:n, ind[1] is the
position where p == 1. So, ind[1] is the position of the smallest
element of p. So, it is the first element of sort.list(p). The next
elements follow in the same way.

That's why 'sort.list' is used for ties.method="first" and
ties.method="random" in function 'rank' in R. When p gives the desired
order,

  { ind <- integer(n); ind[p] <- 1:n; ind }

gives the ranks of the original elements based on that order. The
original element in position p[1] has rank 1, the original element in
position p[2] has rank 2, and so on.

Now, I say that rev(sort.list(x, decreasing=TRUE)) gives the desired
order for ties.method="last". In that order, the elements go from
smallest to largest; equal elements are ordered by their positions,
backwards.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
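[Editorial illustration] The permutation argument and the two forms of the "last" computation can be compared directly. This is a sketch under stated assumptions, not the code in R itself: all helper names below are hypothetical, x is assumed to contain no NAs, and sort.list() is assumed to keep ties in their original order when decreasing = TRUE.

```r
## Hypothetical helper: invert a permutation p of 1:n by assignment.
inverse_by_assignment <- function(p) {
  ind <- integer(length(p))
  ind[p] <- seq_along(p)
  ind
}

## Hypothetical names for the three formulations discussed in the thread
last_via_rev   <- function(x) rev(sort.list(sort.list(rev(x))))
last_via_index <- function(x) {
  if (length(x) == 0) integer(0)
  else { i <- length(x):1L; sort.list(sort.list(x[i]))[i] }
}
last_via_order <- function(x) sort.list(rev(sort.list(x, decreasing = TRUE)))

set.seed(42)
p <- sample(10L)                     # a random permutation of 1:10
stopifnot(all(inverse_by_assignment(p) == sort.list(p)))

## An input where the order and the ranks differ
x <- c(2, 3, 1, 1)
rev(sort.list(x, decreasing = TRUE)) # [1] 4 3 1 2  (the order)
last_via_order(x)                    # [1] 3 4 2 1  (the ranks)

## All three formulations agree on input with many ties
y <- sample(5L, 20L, replace = TRUE)
stopifnot(all(last_via_rev(y) == last_via_index(y)),
          all(last_via_rev(y) == last_via_order(y)))
```

For c(1, 1, 2, 3) the order 2 1 3 4 is its own inverse, so the outer sort.list() appears to do nothing; for c(2, 3, 1, 1) the order and the ranks differ, which shows why the outer call is needed.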