Re: [R] Mann-Whitney by group

Henric (Nilsson) Winell Sun, 22 Jul 2012 13:28:38 -0700

On 2012-07-17 05:13, R. Michael Weylandt wrote:

On Mon, Jul 16, 2012 at 3:39 PM, Oxenstierna <david.chert...@gmail.com> wrote:

lapply(thing, function(x) x[['p.value']]) --works very well, thank you.

Not to be a chore, but I'm interested in comparing the results of
wilcox.test--and the methodology we've employed so far--with the results and
methodology of wilcox_test (library("coin")).  So, I'd like to compare

There should not be any differences between the p-values obtained using'wilcox.test' and 'wilcox_test' in the asymptotic case. However, thelatter function allows you to use the exact null distribution even inthe presence of ties, or use an Monte Carlo approximation of the exactnull distribution. Using the approximately exact null distribution isparticularly helpful when the asymptotics doesn't work well, say, largebut unbalanced data, and/or the exact computations are too time consuming.

groups 5 and 6 across A through H using wilcox_test, and then I'd like to
extract the p-values.  Going through the same methodology as above, but
replacing wilcox.test with wilcox_test has failed, and so has the p.value
extraction method:  lapply(thing, function(x) x[['p.value']]) .

I believe the latter failure has to do with the fact that the coin package
has a built-in class and built-in extraction method (pvalue() to extract and
class "IndependenceTest"), but I don't know how to work around it.  For
example, for a single comparison:  wilcox_test(A~Group, Dtb) works fine, and
pvalue(wilcox.test(A~Group, Dtb)) extracts the p-value.

So, any ideas about how to compare groups 5 and 6 across A through H using
wilcox_test?

Well, since you're doing multiple tests here (A, C, ..., H vs Group) youshould consider adjusting for multiplicity and 'coin' allows you to dothat easily and efficiently. A multivariate Wilcoxon rank-sum test canbe constructed using


> set.seed(711109) # for reproducibility
> it <- independence_test(A + C + D + E + F + G + H ~ Group, data = Dtb,
                          ytrafo = function(data)
                              trafo(data, numeric_trafo = rank),
                          distribution = approximate(B = 100000))

approximating the exact null distribution using 100,000 Monte Carloreplicates.

Step-down adjusted p-values taking account of the dependence structurebetween the test statistics and possible discreteness in the nulldistribution are available through


> (psd <- pvalue(it, "step-down"))
        A       C       D       E       F       G       H
5 0.08598 0.08598 0.08598 0.20018 0.08598 0.08598 0.34182

and using the development version of 'coin', available at R-Forge, wecan get the unadjusted p-values from


> (pu <- pvalue(it, "unadjusted"))
        A       C       D       E       F       G       H
5 0.02894 0.02894 0.02894 0.11512 0.02894 0.02894 0.34182


If we look at the ratio of step-down adjusted and unadjusted p-values,

> psd / pu
         A        C        D        E        F        G H
5 2.970974 2.970974 2.970974 1.738881 2.970974 2.970974 1

we can see that this type of adjustment is pretty powerful compared tostep-down methods like Bonferroni-Holm that doesn't take account of thecorrelation nor the discreteness


> p.adjust(pu) / pu
  A C D E F G H
5 7 7 7 2 7 7 1


HTH,
Henric


I think there are a few things at play here.

1) coin uses so-called S4 objects, so `[[` style subsetting isn't
going to work. The "right way" is, as you have found to use the
pvalue() function.

2) It looks like you need to use the formula intervace for
wilcox_test. Unfortunately, this makes things a little more
complicated as you'll need to construct the formula programmatically.

A one liner looks something like this.

lapply(LETTERS[1:8], function(x)
pvalue(wilcox_test(as.formula(paste(x, "~ Group ")), Dtb)))

Where lapply loops over the letters A,B, etc. and makes the string `A
~ Group`, converts it to a formula, passes that to wilcox_test, then
gets the pvalue and returns it.

In two lines you could do:

thing <- lapply(LETTERS[1:8], function(x)
wilcox_test(as.formula(paste(x, "~ Group")), Dtb))
thing2 <- lapply(thing, pvalue)

Where thing has all the test result objects, and thing2 collects the pvalues.

Hope this helps,
Michael

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Mann-Whitney by group

Reply via email to