>>>>> Martin Maechler >>>>> on Thu, 12 Dec 2019 17:20:47 +0100 writes:
>>>>> Karolis Koncevičius >>>>> on Mon, 9 Dec 2019 23:43:36 +0200 writes: >> So I tried adding Infinity support for all cases. And it >> is (as could be expected) more complicated than I >> thought. > "Of course !" Thank you, Karolis, in any case! >> It is easy to add Inf support for the test. The problems >> start with conf.int=TRUE. >> Currently confidence intervals are computed via >> `uniroot()` and, in the case of infinities, we are >> computationally looking for roots over infinite interval >> which results in an error. I suspect this is the reason >> Inf values were removed in the first place. > Maybe. It's still wrong to be done "up front". I'm sure > 98% (or so ;-) of all calls to wilcox.test() do *not* use > conf.int = TRUE >> Just a note, I found a few more errors/inconsistencies >> when requesting confidence intervals with paired=TRUE >> (due to Infinities being left in). >> Current error in Inf-Inf scenario: >> wilcox.test(c(1,2,Inf), c(4,8,Inf), paired=TRUE, >> conf.int=TRUE) Error in if (ZEROES) x <- x[x != 0] : >> missing value where TRUE/FALSE needed > Good catch .. notably as it also happens when > conf.int=FALSE as by default. My version of wilcox.test() > now does give the same as when the to 'Inf' are left away. >> NaN confidence intervals: >> wilcox.test(c(1:9,Inf), c(21:28,Inf,30), paired=TRUE, >> conf.int=TRUE) >> Wilcoxon signed rank test with continuity correction >> data: c(1:9, Inf) and c(21:28, Inf, 30) V = 9.5, p-value >> = 0.0586 alternative hypothesis: true location shift is >> not equal to 0 0 percent confidence interval: NaN NaN >> sample estimates: midrange NaN > I don't see a big problem here. The NaN's in some sense > show the best that can be computed with this data. > Statistic and P-value, but no conf.int. >> The easiest "fix" for consistency would be to simply >> remove Infinity support from the paired=TRUE case. > I strongly disagree. We are not pruning good > functionality just for some definition of consistency. >> But going with the more desirable approach of adding >> Infinity support for non-paired cases - it is currently >> not clear to me what confidence intervals and >> pseudomedian should be. Specially when Infinities are on >> both sides. > I deem that not to be a big deal. They are not defined > given the default formulas and that is reflected by NA / > NaN in those parts of the result. >> Regards, Karolis Koncevičius. > But I have also spent a few hours now on > wilcox.test.default() behavior notably also looking at the > "rounding" / "machine precision" situation, and also on > your remark that the 'method: ...' does not indicate well > enough what was computed. > In my (not yet committed) but hereby proposed enhancement > of wilcox.test(), I have a new argument, 'digits.rank = > Inf' (the default 'Inf' corresponding to the current > behavior) with help page documentation: > digits.rank: a number; if finite, ‘rank(signif(r, > digits.rank))’ will be used to compute ranks for the test > statistic instead of (the default) ‘rank(r)’. > and then in 'Details :' > For stability reasons, it may be advisable to use > rounded data or to set ‘digits.rank = 7’, say, such that > determination of ties does not depend on very small > numeric differences (see the example). > and then in 'Examples: ' > ## accuracy in ties determination via 'digits.rank': > wilcox.test( 4:2, 3:1, paired=TRUE) # Warning: cannot > compute exact p-value with ties wilcox.test((4:2)/10, > (3:1)/10, paired=TRUE) # no ties => *no* warning > wilcox.test((4:2)/10, (3:1)/10, paired=TRUE, digits.rank = > 9) # same ties as (4:2, 3:1) > ---------------------- > Lastly, I propose to replace "test" by "exact test" in the > 'method' component (and print out) in case exact > computations were used. This information should be part > of the returned "htest" object, and not only visible from > the arguments and warnings that are printed during the > computations. This last change is in some sense the "most > back-incompatible" change of these, because many > wilcox.test() printouts would slightly change, e.g., >> w0 <- wilcox.test( 1:5, 4*(0:4), paired=TRUE) > Wilcoxon signed rank exact test > data: 1:5 and 4 * (0:4) V = 1, p-value = 0.125 > alternative hypothesis: true location shift is not equal > to 0 > where before (in R <= 3.6.x) it is just > Wilcoxon signed rank test > data: ......... ............... ............... > but I think R 4.0.0 is a good occasion for such a change. > Constructive feedback on all this is very welcome! Martin ... none ... I "assume" this means everybody likes the idea ;-) Anyway, now comitted to R-devel (for R 4.0.0), svn rev 77569 (in 'NEW FEATURES'). Martin >> On 2019-12-07 23:18, Karolis Koncevičius wrote: >>> Thank you for a fast response. Nice to see this mailing >>> list being so alive. >>> >>> Regarding Inf issue: I agree with your assessment that >>> Inf should not be removed. The code gave me an >>> impression that Inf values were intentionally removed >>> (since is.finite() was used everywhere, except for >>> paired case). I will try to adjust my patch according to >>> your feedback. >>> >>> One more thing: it seems like you assumed that issues >>> 2:4 are all related to machine precision, which is not >>> the case - only 2nd issue is. Just wanted to draw this >>> to your attention, in case you might have some feedback >>> and guidelines about issues 3 and 4 as well. >>> >>> >>> >>> On 2019-12-07 21:59, Martin Maechler wrote: >>>>>>>>> Karolis Koncevičius on Sat, 7 Dec 2019 20:55:36 >>>>>>>>> +0200 writes: >>>> >>>> > Hello, > Writing to share some things I've found >>>> about wilcox.test() that seem a > a bit inconsistent. >>>> >>>> > 1. Inf values are not removed if paired=TRUE >>>> >>>> > # returns different results (Inf is removed): > >>>> wilcox.test(c(1,2,3,4), c(0,9,8,7)) > >>>> wilcox.test(c(1,2,3,4), c(0,9,8,Inf)) >>>> >>>> > # returns the same result (Inf is left as value with >>>> highest rank): > wilcox.test(c(1,2,3,4), c(0,9,8,7), >>>> paired=TRUE) > wilcox.test(c(1,2,3,4), c(0,9,8,Inf), >>>> paired=TRUE) >>>> >>>> Now which of the two cases do you consider correct ? >>>> >>>> IHMO, the 2nd one is correct: it is exactly one >>>> property of the *robustness* of the wilcoxon test and >>>> very desirable that any (positive) outlier is treated >>>> the same as just "the largest value" and the test >>>> statistic (and hence the p-value) should not change. >>>> >>>> So I think the first case is wrong, notably if >>>> modified, (such that the last y is the overall maximal >>>> value (slightly larger sample): >>>> >>>>> wilcox.test(1:7, 1/8+ c(9:4, 12)) >>>> >>>> Wilcoxon rank sum test >>>> >>>> data: 1:7 and 1/8 + c(9:4, 12) W = 6, p-value = 0.01748 >>>> alternative hypothesis: true location shift is not >>>> equal to 0 >>>> >>>>> wilcox.test(1:7, 1/8+ c(9:4, 10000)) >>>> >>>> Wilcoxon rank sum test >>>> >>>> data: 1:7 and 1/8 + c(9:4, 10000) W = 6, p-value = >>>> 0.01748 alternative hypothesis: true location shift is >>>> not equal to 0 >>>> >>>>> wilcox.test(1:7, 1/8+ c(9:4, Inf)) >>>> >>>> Wilcoxon rank sum test >>>> >>>> data: 1:7 and 1/8 + c(9:4, Inf) W = 6, p-value = >>>> 0.03497 alternative hypothesis: true location shift is >>>> not equal to 0 >>>> >>>> The Inf case should definitely give the same as the >>>> 10'000 case. That's exactly one property of a robust >>>> statistic. >>>> >>>> Thank you, Karolis, this is pretty embarrassing to only >>>> be detected now after 25+ years of R in use ... >>>> >>>> The correct fix starts with replacing the is.finite() >>>> by !is.na() and keep the 'Inf' in the rank >>>> computations... (but then probably also deal with the >>>> case of more than one Inf, notably the Inf - Inf >>>> "exception" which is not triggered by your example...) >>>> >>>> >>>> --- >>>> >>>> Ben addressed the "rounding" / numerical issues >>>> unavoidable for the other problems. >>>> >>>> > 2. tolerance issues with paired=TRUE. >>>> >>>> > wilcox.test(c(4, 3, 2), c(3, 2, 1), paired=TRUE) > # >>>> ... > # Warning: cannot compute exact p-value with >>>> ties >>>> >>>> > wilcox.test(c(0.4,0.3,0.2), c(0.3,0.2,0.1), >>>> paired=TRUE) > # ... > # no warning >>>> >>>> > 3. Always 'x observations are missing' when >>>> paired=TRUE >>>> >>>> > wilcox.test(c(1,2), c(NA_integer_,NA_integer_), >>>> paired=TRUE) > # ... > # Error: not enough (finite) >>>> 'x' observations >>>> >>>> > 4. No indication if normal approximation was used: >>>> >>>> > # different numbers, but same "method" name > >>>> wilcox.test(rnorm(10), exact=FALSE, correct=FALSE) > >>>> wilcox.test(rnorm(10), exact=TRUE, correct=FALSE) >>>> >>>> >>>> > From all of these I am pretty sure the 1st one is >>>> likely unintended, > so attaching a small patch to >>>> adjust it. Can also try patching others if > consensus >>>> is reached that the behavioiur has to be modified. >>>> >>>> > Kind regards, > Karolis Koncevičius. >>>> ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel