> Performance statistics are interesting. If we assume the two populations
> have a total of `m` members, then this implementation runs slightly slower
> for m < 20, and much slower for 50 < m < 100. However, this implementation
> works significantly *faster* for m > 200. The breakpoint is precisely when
> each population has a size of 50: `qwilcox(0.5, 50, 50)` runs in 8
> microseconds in the current version, but `qwilcox(0.5, 50, 51)` runs in 5
> milliseconds. The new version runs in roughly 1 millisecond for both. This
> is probably because of internal logic that requires many more `free/calloc`
> calls if either population is larger than `WILCOX_MAX`, which is set to 50,
> and also because `cwilcox_sigma` has to be evaluated, which is slightly
> more demanding since it uses `k%d`.
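For reference, a minimal sketch of how timings like those quoted above could be reproduced. This assumes the `microbenchmark` package (not part of base R; `system.time` also works but is too coarse at the microsecond scale):

```r
## Hypothetical timing check for the qwilcox() breakpoint around m = n = 50.
## microbenchmark is an external CRAN package, not base R.
library(microbenchmark)

microbenchmark(
  at_max   = qwilcox(0.5, 50, 50),  # both populations at WILCOX_MAX = 50
  over_max = qwilcox(0.5, 50, 51),  # one population just over WILCOX_MAX
  times = 100L
)
```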
There is a tradeoff here between memory usage and execution time. I am not a heavy user of the U test, but I think the typical use case does not involve several hundred tests in a session, so execution time is (my 2 cents) less important. If R crashes, however, even a single execution is a problem.

The takeaway is probably this: we should implement both approaches in the code and leave the choice to the user. If time is important, memory is not an issue, and m and n are low, go for the "traditional approach"; otherwise, use my formula. See the sketch below for what such an interface could look like.

PS (@Aidan): I applied for a Bugzilla account two days ago and have not heard back from them. My spam folder is also empty. Is that normal, or should I do something?
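To make the dual-approach suggestion concrete, here is a minimal sketch of a user-facing switch. The `method` argument, `qwilcox2`, and the two backend functions are hypothetical names introduced purely for illustration, not existing R API:

```r
## Hypothetical wrapper: neither qwilcox_recursive() nor qwilcox_sigma()
## exists in R. They stand in for the current dynamic-programming code
## and the proposed cwilcox_sigma-based formula, respectively.
qwilcox2 <- function(p, m, n, method = c("auto", "traditional", "sigma")) {
  method <- match.arg(method)
  if (method == "auto") {
    ## heuristic from the discussion above: the traditional approach is
    ## faster for small populations, the sigma formula for large ones
    method <- if (m <= 50 && n <= 50) "traditional" else "sigma"
  }
  switch(method,
         traditional = qwilcox_recursive(p, m, n),
         sigma       = qwilcox_sigma(p, m, n))
}
```

Defaulting to "auto" would keep current behavior fast for small m, n while avoiding the memory blow-up for large populations, and users with strong preferences could still force either backend.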