On Wed, 28 Oct 2009, David Croll wrote:

Dear Achim,

let me thank you for this assurance!


The sample size is too large (~4000 observations per group) to solve this problem exactly. The error message could maybe be improved, but the message is clear: This is too large to deal with.

However, this is not a problem. With several thousand observations, standard normal approximations should work sufficiently well. And if you don't believe it, you can look at approximate solutions that draw a sufficiently large number of permutations. Both are easily available when using wilcox_test() in "coin", as the startup message of "exactRankTests" suggests.
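For illustration only, here is a minimal sketch of the normal approximation on data of roughly this size. It uses simulated, heavily tied data (an assumption, not the dataset from this thread) and base R's wilcox.test() as a stand-in; wilcox_test() in "coin" offers the same kind of choice through its distribution argument.

```r
# Simulated data (an assumption, not the poster's dataset): two groups of
# 4000 observations on a coarse 1-5 scale, so tied ranks are everywhere.
set.seed(1)
n <- 4000
group_1 <- sample(1:5, n, replace = TRUE)
group_2 <- sample(1:5, n, replace = TRUE,
                  prob = c(0.18, 0.20, 0.20, 0.20, 0.22))

# exact = FALSE requests the normal approximation. The variance of the
# rank-sum statistic is adjusted for ties, so ties do not invalidate it.
# correct = FALSE switches off the continuity correction.
res <- wilcox.test(group_1, group_2, exact = FALSE, correct = FALSE)
res$statistic
res$p.value
```

With thousands of observations per group this runs instantly, whereas the exact computation is what triggers the error discussed above.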

But in my dataset there are many, many tied ranks between group_1 and group_2 and the other ones. I wanted to use the exact procedure because I read the approximate solution would not give me exact p values in case of tied ranks...

Well, normal asymptotics will work anyway.

Am I paranoid, or am I in search of an exactness statistics cannot deliver?

Paranoid. ;-) With your sample size all solutions should be sufficiently close to exact.

Well, then I'll try permutation tests, and thank you again!

wilcox.exact() is a permutation test (as is wilcox.test if there are no ties). However, not all permutations have to be computed explicitly, which is why it is also feasible for moderately sized samples. For samples as large as yours, the computational burden is too large even for wilcox.exact(). The usual solution is to look not at all permutations but at a large random sample of them; you can make the approximation as precise as you want by increasing the number of permutations. And I'm fairly certain (but haven't tried) that the result of the approximate permutation test will be almost the same as the asymptotic permutation test.
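A rough sketch of that idea in base R, in case it helps. The data are simulated (an assumption, not your dataset), B is an arbitrary choice, and this hand-rolled loop only illustrates the principle; it is not what "coin" does internally.

```r
# Simulated, heavily tied data (an assumption for illustration).
set.seed(1)
n <- 4000
group_1 <- sample(1:5, n, replace = TRUE)
group_2 <- sample(1:5, n, replace = TRUE,
                  prob = c(0.18, 0.20, 0.20, 0.20, 0.22))

r   <- rank(c(group_1, group_2))   # midranks, so ties are handled
n1  <- length(group_1)
N   <- length(r)
obs <- sum(r[seq_len(n1)])         # observed rank sum of group 1
mu  <- n1 * (N + 1) / 2            # its expectation under the null

# Instead of enumerating all permutations, draw B random relabellings:
# picking n1 ranks without replacement is one random permutation.
B    <- 2000                       # hypothetical choice; larger B = more precise
perm <- replicate(B, sum(sample(r, n1)))

# Two-sided Monte Carlo p-value (the +1 counts the observed arrangement).
p_mc <- (1 + sum(abs(perm - mu) >= abs(obs - mu))) / (B + 1)

# Normal approximation for comparison.
p_asym <- wilcox.test(group_1, group_2, exact = FALSE, correct = FALSE)$p.value
c(monte_carlo = p_mc, asymptotic = p_asym)
```

The Monte Carlo error of p_mc is roughly sqrt(p * (1 - p) / B), so increasing B tightens the estimate at the cost of run time.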

hth,
Z

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
