Hello,
There is a test for equality of IQRs, due to Westenberg (1948), that can be
found online if one really looks. It starts by pooling the two samples into
one and computing the 1st and 3rd quartiles of the pool. Then a three-column
table is built in which the rows correspond to the samples. The middle
column holds the counts between the pooled quartiles and the two side
columns hold the counts outside them. The two side columns are collapsed
into one and a Fisher exact test is conducted on the resulting 2x2 table.
R code could be:
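iqr.test <- function(x, y){
    # Quartiles of the pooled sample
    qq <- quantile(c(x, y), probs = c(0.25, 0.75))
    # a, c: counts strictly between the pooled quartiles; b, d: counts outside
    a <- sum(qq[1] < x & x < qq[2])
    b <- length(x) - a
    c <- sum(qq[1] < y & y < qq[2])
    d <- length(y) - c
    # Rows are the samples, columns are inside/outside
    m <- matrix(c(a, c, b, d), ncol = 2)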
    # Two-sided p-value from Fisher's exact test on the 2x2 table
    p.value <- fisher.test(m)$p.value
    # Assemble an "htest" object so the result prints like other R tests
    data.name <- deparse(substitute(x))
    data.name <- paste(data.name, ", ", deparse(substitute(y)), sep="")
    method <- "Westenberg-Mood test for IQR equality"
    alternative <- "the IQRs are not equal"
    ht <- list(
        p.value = p.value,
        method = method,
        alternative = alternative,
        data.name = data.name
    )
    class(ht) <- "htest"
    ht
}
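# Small simulation: normal versus chi-squared samples of random sizes
n <- 1e3
pv <- numeric(n)
set.seed(2319)
for(i in 1:n){
    x <- rnorm(sample(20:30, 1), 4, 1)
    y <- rchisq(sample(20:40, 1), df=4)
    pv[i] <- iqr.test(x, y)$p.value
}
sum(pv < 0.05)/n # rejection rate at the 5% level (0.8 in the run originally reported)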
Hope this helps,
Rui Barradas
On 14-07-2012 09:01, peter dalgaard wrote:
On Jul 14, 2012, at 08:16 , Prof Brian Ripley wrote:
On 13/07/2012 21:37, Greg Snow wrote:
A permutation test may be appropriate:
Yes, it may, but precisely which one is unclear. You are testing whether the
two samples have an identical distribution, whereas I took the question to be a
test of differences in dispersion, with differences in location allowed.
I do not think this can be solved without further assumptions. E.g., people
often replace the two-sample t-test by the two-sample Wilcoxon test as a test
of differences in location, not realizing that the latter is also sensitive to
other aspects of the difference (e.g. both dispersion and shape).
(Brian knows this, of course, but I thought it useful to insert a little
quibbling.)
"Sensitive" is perhaps a little misleading here. The test statistic in the
Wilcoxon test is essentially an estimate of the probability that a random
observation in one group is bigger than a random observation in the other
group. It isn't hard to imagine a situation where that quantity is unaffected
by a dispersion change, so the test is not sensitive in the sense that it can
detect dispersion changes between sufficiently large samples.
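To illustrate the point about the Wilcoxon statistic, here is a minimal sketch in R. The simulated data and the variable names are my own assumptions, not part of the thread. It rescales the statistic returned by wilcox.test() into an estimate of P(X > Y) and compares it with a direct pairwise count, for two samples with the same centre but very different spread.

```r
# Sketch only: W / (n_x * n_y) estimates P(X > Y).
# The data below are simulated for illustration.
set.seed(1)
x <- rnorm(200, mean = 0, sd = 1)
y <- rnorm(300, mean = 0, sd = 5)   # same centre, five times the spread

# W counts the pairs (i, j) with x[i] > y[j] (ties would count one half)
w <- wilcox.test(x, y)$statistic
p_hat <- unname(w) / (length(x) * length(y))

# Direct pairwise count for comparison (no ties with continuous data)
p_direct <- mean(outer(x, y, ">"))

p_hat
p_direct
```

Both numbers estimate the same probability, which is 1/2 for these two symmetric distributions whatever the ratio of the standard deviations, so the statistic carries no information about the change in spread in this setup.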
However, the point is that p values _rely on_ the null hypothesis that two
distributions are exactly the same. This is mostly uncontroversial if you are
testing for an irrelevant grouping, but if you need confidence intervals for
the difference, you are implicitly assuming a location-shift model.
The same thing is true for permutation tests in general: You need to be rather
careful about which assumptions allow you to interchange things.
Asymptotically, the distribution of the IQR depends on the values of the
density at the true quartiles. These could be different in the two groups, and
easily completely unrelated to those of a pooled sample.
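For reference, the standard large-sample result behind this remark can be written out. The notation is mine, not from the thread: n is the sample size of one group, q_1 and q_3 are its true quartiles, and f is its density, assumed continuous and positive at both quartiles.

```latex
\operatorname{Var}\bigl(\widehat{\mathrm{IQR}}\bigr) \approx
\frac{1}{n}\left[
  \frac{3}{16\, f(q_1)^2}
  + \frac{3}{16\, f(q_3)^2}
  - \frac{1}{8\, f(q_1)\, f(q_3)}
\right]
```

The first two terms are the asymptotic variances of the two sample quartiles and the last is minus twice their covariance, so the precision of the IQR in each group is governed by that group's own density values at its own quartiles.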
I think that I would suggest finding an error estimate for the IQR (or maybe
log IQR) in each group separately, perhaps by bootstrapping, and then compare
between groups with an asymptotic z test. The main caveat is whether you have
sufficiently large sample sizes for asymptotics to hold.
Peter D.
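A minimal sketch of the suggestion above, under my own assumptions: a plain nonparametric bootstrap for the standard error of log(IQR) in each group, followed by a two-sided z test on the difference. The function names boot_log_iqr and iqr_z_test, the number of resamples, and the simulated data are all hypothetical, and the caveat about sample size applies in full.

```r
# Hypothetical helper: estimate and bootstrap standard error of log(IQR)
boot_log_iqr <- function(v, B = 2000) {
    reps <- replicate(B, log(IQR(sample(v, replace = TRUE))))
    c(est = log(IQR(v)), se = sd(reps))
}

# Hypothetical helper: asymptotic z test comparing log(IQR) between groups
iqr_z_test <- function(x, y, B = 2000) {
    bx <- boot_log_iqr(x, B)
    by <- boot_log_iqr(y, B)
    z <- (bx["est"] - by["est"]) / sqrt(bx["se"]^2 + by["se"]^2)
    list(
        ratio = unname(exp(bx["est"] - by["est"])),  # IQR(x) / IQR(y)
        z = unname(z),
        p.value = unname(2 * pnorm(-abs(z)))         # two-sided
    )
}

# Simulated example: different locations, threefold difference in spread
set.seed(42)
x <- rnorm(200, 4, 1)
y <- rnorm(250, 0, 3)
res <- iqr_z_test(x, y)
res
```

Because each group is resampled separately, nothing here requires the two locations or shapes to agree, which is the reason for preferring it to pooling.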
I nearly suggested (yesterday) doing the permutation test on differences from
medians in the two groups. But really this is off-topic for R-help and needs
interaction with a knowledgeable statistician to refine the question.
1. compute the ratio of the 2 IQR values (or other comparison of interest)
2. combine the data from the 2 samples into 1 pool, then randomly
split into 2 groups (matching sample sizes of original) and compute
the ratio of the IQR values for the 2 new samples.
3. repeat #2 a bunch of times (like for a total of 999 random splits)
and combine with the original value.
4. (optional, but strongly suggested) plot a histogram of all the
ratios and place a reference line of the original ratio on the plot.
5. calculate the proportion of ratios that are as extreme or more
extreme than the original; this is the (approximate) p-value.
I think it is an 'exact' (but random) p-value.
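The five steps above can be sketched in R as follows. This is only an illustration under stated assumptions: the function names perm_iqr_test and plot_perm and the simulated data are hypothetical, and "as extreme or more extreme" is taken here to mean at least as far from 1 on the log scale, which is my choice and not something the steps specify. As noted earlier in the thread, pooling tests whether the two distributions are identical, so a difference in location can also lead to rejection.

```r
# Hypothetical helper implementing steps 1, 2, 3 and 5
perm_iqr_test <- function(x, y, n_perm = 999) {
    observed <- IQR(x) / IQR(y)                 # step 1
    pooled <- c(x, y)
    n_x <- length(x)
    perm <- replicate(n_perm, {                 # steps 2 and 3
        idx <- sample(length(pooled), n_x)      # random split, same sizes
        IQR(pooled[idx]) / IQR(pooled[-idx])
    })
    ratios <- c(observed, perm)                 # original value included
    # step 5: two-sided, extremeness measured as distance from 0 on log scale
    p.value <- mean(abs(log(ratios)) >= abs(log(observed)))
    list(observed = observed, ratios = ratios, p.value = p.value)
}

# Hypothetical helper for step 4: histogram with the observed ratio marked
plot_perm <- function(res) {
    hist(log(res$ratios), main = "Permutation distribution of log IQR ratio",
         xlab = "log ratio of IQRs")
    abline(v = log(res$observed), col = "red")
}

# Simulated example: same location, different spread
set.seed(2012)
x <- rnorm(30, 4, 1)
y <- rnorm(40, 4, 3)
res <- perm_iqr_test(x, y)
res$p.value
```

Calling plot_perm(res) draws the plot from step 4. With 999 random splits plus the original value, the smallest p-value this can return is 1/1000.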
On Fri, Jul 13, 2012 at 5:32 AM, Schaber, Jörg
<joerg.scha...@med.ovgu.de> wrote:
Hi,
I have two non-normal distributions and use interquartile ranges as a
dispersion measure.
Now I am looking for a test of whether the interquartile ranges of
the two distributions are significantly different.
Any idea?
Thanks,
joerg
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.