>>>>> Karolis Koncevičius
>>>>> on Fri, 21 Apr 2023 11:32:41 +0300 writes:
> Hello,
> Today I was investigating ks.test() with two numerical arguments (x and
> y) and was left a bit confused about the policy behind handling ties.
> I might be missing something, so sorry in advance, but here is what
> confuses me:
> The documentation states: "The presence of ties always generates a
> warning, since continuous distributions do not generate them"
Indeed, that has not been correct for quite a while now, I think.
The current default is `exact = NULL`, and that is turned into
TRUE in certain circumstances, notably for all(*) small-data
situations.
--
*) The help page gives details.
> But when I run a test with ties there is no warning:
> ks.test(1:4, 4:7)
and indeed the printed output explicitly says that the *exact*
test was used.
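For illustration, a small sketch from a session (the exact wording of the
method strings may differ slightly across R versions):

  res <- ks.test(1:4, 4:7)    # exact = NULL, the default
  res$method                  # names the *exact* two-sample test
  ks.test(1:4, 4:7, exact = FALSE)$method  # asymptotic version; warns about ties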
> However, when I specify that I do not want an exact test, there appears
> a warning saying that the computation will be approximate:
> ks.test(1:4, 4:7, exact=FALSE)
> # Warning: p-value will be approximate in the presence of ties
> But doesn’t specifying exact=FALSE already make the test approximate?
Yes, but I think the idea is that you'd look twice and see that in
this case it is recommended to also use simulate.p.value = TRUE.
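Something along these lines, i.e., requesting a Monte Carlo p-value
instead of the plain asymptotic one (just a sketch, using the
simulate.p.value and B arguments of recent ks.test() versions):

  ks.test(1:4, 4:7, simulate.p.value = TRUE, B = 10000)  # B = number of replicates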
> I tried inspecting the source code for guidance but also was left a bit
> puzzled. In ks.test.R, under the if(is.numeric(y)) clause, there is a
> variable called TIES that is set and changed, but is never used anywhere.
> Here are examples:
> line 55 TIES <- FALSE
> line 61 TIES <- TRUE
> line 74 if (TIES)
> line 75 z <- w
> But later this z variable is not used as a variable in the code. It looks
> to me that this TIES variable can be deleted without affecting anything
> else.
That is correct. It is indeed a leftover from before the recent
improvements and the introduction of psmirnov().
[TIES *is* used in the other branch of the same ks.test.default() function.]
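(To check this yourself in a recent R, where ks.test() is generic, the
body of the non-exported default method can be inspected directly:

  body(stats:::ks.test.default)

which shows both branches and where TIES is actually used.)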
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel