On 25/04/2023 at 10:24, Viechtbauer, Wolfgang (NP) wrote:
Hi all,

Posted this many years ago 
(https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html), but either 
this slipped under the radar or my feeble mind is unable to understand what 
xyTable() is doing here and nobody bothered to correct me. I now stumbled again 
across this issue.

x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)
table(x, y, useNA="always")
xyTable(x, y)

Why does xyTable() report that there are NA instances of (2,3)? I could 
understand the logic that the NA could be anything, including a 3, so the 
$number value for (2,3) is therefore unknown, but then the same should apply to 
(2,1), but here $number is 1, so the logic is inconsistent.

I stared at the xyTable code for a while and I suspect this is coming from 
order() using na.last=TRUE by default, but in any case, to me the behavior 
above is surprising.
Not really. The variable 'first' in xyTable() is meant to mark the position where each run of repeated pairs begins. It is then used to keep only those positions from the vector 1:n. Finally, taking diff() of those positions gives the number of repetitions of each pair. However, in your example 'first' contains one NA, so the diff() call produces two NAs: one from the difference with the preceding position and one from the difference with the following position. Hence the result.
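
The steps described above can be traced by hand. The following is only a sketch that replays the relevant lines of xyTable() on the data from the question, leaving out the signif() rounding; the names o, xs, ys and pos are introduced here for illustration and do not appear in xyTable().

```r
# Data from the original question
x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)
n <- length(x)

# Sort the pairs; order() puts the NA in y last within x == 2
o  <- order(x, y)
xs <- x[o]
ys <- y[o]

# TRUE where a new pair starts; the comparison with NA yields NA
first <- c(TRUE, (xs[-1L] != xs[-n]) | (ys[-1L] != ys[-n]))
first
# TRUE TRUE TRUE TRUE NA TRUE

# Indexing with a logical NA yields NA in the result
pos <- (1L:n)[first]
pos
# 1 2 3 4 NA 6

# The single NA position contaminates both neighbouring differences
number <- diff(c(pos, n + 1L))
number
# 1 1 1 NA NA 1
```

The fourth entry of number belongs to the pair (2,3) and the fifth to (2,NA), which is why (2,3) is reported with an NA count while (2,1) is not.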

Here is a slightly modified version of the xyTable code that handles NA too.

xyTableNA <- function (x, y = NULL, digits)
{
    x <- xy.coords(x, y, setLab = FALSE)
    y <- signif(x$y, digits = digits)
    x <- signif(x$x, digits = digits)
    n <- length(x)
    number <- if (n > 0) {
        orderxy <- order(x, y)
        x <- x[orderxy]
        y <- y[orderxy]
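        # TRUE where a new (x, y) pair starts; NA where a comparison involves NA
        first <- c(TRUE, (x[-1L] != x[-n]) | (y[-1L] != y[-n]))
        # a switch between NA and non-NA in x or y also starts a new pair
        firstNA <- c(TRUE, xor(is.na(x[-1L]), is.na(x[-n])) | xor(is.na(y[-1L]), is.na(y[-n])))
        first[firstNA] <- TRUE
        # a remaining NA means neighbours share an NA coordinate and do not otherwise differ
        first[is.na(first)] <- FALSE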
        x <- x[first]
        y <- y[first]
        diff(c((1L:n)[first], n + 1L))
    }
    else integer()
    list(x = x, y = y, number = number)
}
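
One way to sanity-check the counts returned by a function like the one above is to compare them with table(), which can treat NA as a level of its own. This is only a sketch of such a cross-check on the data from the question; the name tab is introduced here for illustration.

```r
# Data from the original question
x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)

# Cross-tabulate, keeping NA as a separate level where it occurs
tab <- as.data.frame(table(x = x, y = y, useNA = "ifany"),
                     stringsAsFactors = FALSE)

# Keep only the (x, y) combinations that actually occur
tab <- tab[tab$Freq > 0, ]
tab
```

Each of the six observed pairs, including (2,NA), should appear once here, so an NA-aware xyTable variant would be expected to return a $number of 1 for every pair in this example.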

Best,
Serguei.


Best,
Wolfgang

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
