Nice! Would this be worth considering either as a permanent fix to xyTable() (to me, the function currently behaves in a rather unexpected, if not to say buggy, manner) or via an argument (for backwards compatibility)?
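
For what it is worth, here is a quick sketch of how I read the corrected version, run on my example. The name xyTableFixed and the explicit digits = 6 in the call are just my placeholders for illustration; the body is your posted code with the one corrected line, not anything that exists in grDevices.

```r
# Sketch only: Serguei's posted function with the corrected line applied.
# 'xyTableFixed' is a hypothetical name chosen for this illustration.
xyTableFixed <- function(x, y = NULL, digits)
{
    x <- xy.coords(x, y, setLab = FALSE)
    y <- signif(x$y, digits = digits)
    x <- signif(x$x, digits = digits)
    n <- length(x)
    number <- if (n > 0) {
        orderxy <- order(x, y)          # NAs are sorted last within ties
        x <- x[orderxy]
        y <- y[orderxy]
        # TRUE where a new (x, y) pair starts; NA where a comparison hits an NA
        first <- c(TRUE, (x[-1L] != x[-n]) | (y[-1L] != y[-n]))
        # a switch between NA and non-NA also starts a new pair
        firstNA <- c(TRUE, xor(is.na(x[-1L]), is.na(x[-n])) |
                           xor(is.na(y[-1L]), is.na(y[-n])))
        first[firstNA] <- TRUE
        # any remaining NA is treated as a repeat of the previous pair
        first[is.na(first)] <- FALSE
        x <- x[first]
        y <- y[first]
        diff(c((1L:n)[first], n + 1L))  # run lengths = counts per pair
    }
    else integer()
    list(x = x, y = y, number = number)
}

x <- c(1, 1, 2, 2, 2, 3)
y <- c(1, 2, 1, 3, NA, 3)
res <- xyTableFixed(x, y, digits = 6)   # digits passed explicitly (my choice)
res$number                              # 1 1 1 1 1 1
```

With this, (2,NA) shows up as its own pair with a count of 1, which matches what table(x, y, useNA="always") reports for that cell.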

Best,
Wolfgang

>-----Original Message-----
>From: Serguei Sokol [mailto:so...@insa-toulouse.fr]
>Sent: Tuesday, 25 April, 2023 11:35
>To: Viechtbauer, Wolfgang (NP); r-devel@r-project.org
>Subject: Re: [Rd] xyTable(x,y) versus table(x,y) with NAs
>
>I correct myself. Obviously, the line
>
>first[is.na(first) | isFALSE(first)] <- FALSE
>
>should read
>
>first[is.na(first)] <- FALSE
>
>Serguei.
>
>On 25/04/2023 at 11:30, Serguei Sokol wrote:
>> On 25/04/2023 at 10:24, Viechtbauer, Wolfgang (NP) wrote:
>>> Hi all,
>>>
>>> I posted this many years ago
>>> (https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html),
>>> but either this slipped under the radar or my feeble mind is unable
>>> to understand what xyTable() is doing here and nobody bothered to
>>> correct me. I have now stumbled across this issue again.
>>>
>>> x <- c(1, 1, 2, 2, 2, 3)
>>> y <- c(1, 2, 1, 3, NA, 3)
>>> table(x, y, useNA="always")
>>> xyTable(x, y)
>>>
>>> Why does xyTable() report that there are NA instances of (2,3)? I
>>> could understand the logic that the NA could be anything, including a
>>> 3, so the $number value for (2,3) is therefore unknown, but then the
>>> same should apply to (2,1), but here $number is 1, so the logic is
>>> then inconsistent.
>>>
>>> I stared at the xyTable code for a while and I suspect this is coming
>>> from order() using na.last=TRUE by default, but in any case, to me
>>> the behavior above is surprising.
>> Not really. The variable 'first' in xyTable() is supposed to detect
>> the positions of the first values in sequences of repeated pairs. It
>> is then used to retain only their indices in a vector of type 1:n.
>> Finally, by taking diff(), the number of repeats of each pair is
>> obtained. However, as 'first' will contain one NA for your example,
>> the diff() call will produce two NAs, from the differences with the
>> preceding and the following index. Hence the result.
>>
>> Here is a slightly modified version of the xyTable code that handles NA too.
>>
>> xyTableNA <- function (x, y = NULL, digits)
>> {
>>     x <- xy.coords(x, y, setLab = FALSE)
>>     y <- signif(x$y, digits = digits)
>>     x <- signif(x$x, digits = digits)
>>     n <- length(x)
>>     number <- if (n > 0) {
>>         orderxy <- order(x, y)
>>         x <- x[orderxy]
>>         y <- y[orderxy]
>>         first <- c(TRUE, (x[-1L] != x[-n]) | (y[-1L] != y[-n]))
>>         firstNA <- c(TRUE, xor(is.na(x[-1L]), is.na(x[-n])) |
>>                            xor(is.na(y[-1L]), is.na(y[-n])))
>>         first[firstNA] <- TRUE
>>         first[is.na(first) | isFALSE(first)] <- FALSE
>>         x <- x[first]
>>         y <- y[first]
>>         diff(c((1L:n)[first], n + 1L))
>>     }
>>     else integer()
>>     list(x = x, y = y, number = number)
>> }
>>
>> Best,
>> Serguei.
>>>
>>> Best,
>>> Wolfgang

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel