Re: [R] significance test interquartile ranges

Rui Barradas Sat, 14 Jul 2012 10:37:48 -0700

Hello,

Em 14-07-2012 13:08, peter dalgaard escreveu:


On Jul 14, 2012, at 12:25 , Rui Barradas wrote:

Hello,

There's a test for iqr equality, of Westenberg (1948), that can be found 
on-line if one really looks. It starts creating a 1 sample pool from the two 
samples and computing the 1st and 3rd quartiles. Then a three column table 
where the rows correspond to the samples is built. The middle column is the 
counts between the quartiles and the side ones to the outsides. These columns 
are collapsed into one and a Fisher exact test is conducted on the 2x2 
resulting table.



That's just wrong, is it not? Just because things were suggested by someone 
semi-famous, it doesn't mean that they actually work...

Take two normal distributions, equal in size,  with a sufficiently large 
difference between the means, so that there is no material overlap. The 
quartiles of the pooled sample will then be the medians of the original 
samples, and the test will be that one sample has the same number above its 
median as the other has below its median.

If it weren't for the pooling business, I'd say that it was a sane test for 
equality of quartiles, but not for the IQR.

Right, thank you! It forced me to pay more attention to what I wasreading. The "test is aimed at differences in scale only, presuming nodifference in location"

http://www.stat.ncsu.edu/information/library/mimeo.archive/ISMS_1986_1499.pdf

The original can be found at
http://www.dwc.knaw.nl/DL/publications/PU00018486.pdf

If we subtract the median of each sample to each of them, the mediansbecome zero but the IQRs remain as they were. In my simulation I hadchosen samples from distributions with equal mean, and that point passedunnoticed.

The code should then be slightly revised. I'll repost it because therewas a typo in the 'method' member of the returned list


iqr.test <- function(x, y){
        data.name <- deparse(substitute(x))
        data.name <- paste(data.name, ", ", deparse(substitute(y)), sep="")
        x <- x - median(x)
        y <- y - median(y)
        qq <- quantile(c(x, y), prob = c(0.25, 0.75))
        a <- sum(qq[1] < x & x < qq[2])
        b <- length(x) - a
        c <- sum(qq[1] < y & y < qq[2])
        d <- length(y) - b
        m <- matrix(c(a, c, b, d), ncol = 2)
        numer <- sum(lfactorial(c(margin.table(m, 1), margin.table(m, 2))))
        denom <- sum(lfactorial(c(a, b, c, d, sum(m))))
        p.value <- 2*exp(numer - denom)
        method <- "Westenberg-Mood test for IQR equality"
        alternative <- "the IQRs are not equal"
        ht <- list(
                p.value = p.value,
                method = method,
                alternative = alternative,
                data.name = data.name
        )
        class(ht) <- "htest"
        ht
}

Rui Barradas


R code could be:


iqr.test <- function(x, y){
        qq <- quantile(c(x, y), prob = c(0.25, 0.75))
        a <- sum(qq[1] < x & x < qq[2])
        b <- length(x) - a
        c <- sum(qq[1] < y & y < qq[2])
        d <- length(y) - b
        m <- matrix(c(a, c, b, d), ncol = 2)
        numer <- sum(lfactorial(c(margin.table(m, 1), margin.table(m, 2))))
        denom <- sum(lfactorial(c(a, b, c, d, sum(m))))
        p.value <- 2*exp(numer - denom)
        data.name <- deparse(substitute(x))
        data.name <- paste(data.name, ", ", deparse(substitute(y)), sep="")
        method <- "Westenberg-Mood test for IQR range equality"
        alternative <- "the IQRs are not equal"
        ht <- list(
                p.value = p.value,
                method = method,
                alternative = alternative,
                data.name = data.name
        )
        class(ht) <- "htest"
        ht
}

n <- 1e3
pv <- numeric(n)
set.seed(2319)
for(i in 1:n){
        x <- rnorm(sample(20:30, 1), 4, 1)
        y <- rchisq(sample(20:40, 1), df=4)
        pv[i] <- iqr.test(x, y)$p.value
}

sum(pv < 0.05)/n  # 0.8



To wit:

iqr.test(rnorm(100), rnorm(100,10,3))


        Westenberg-Mood test for IQR range equality

data:  rnorm(100), rnorm(100, 10, 3)
p-value = 0.2248
alternative hypothesis: the IQRs are not equal

replicate(10,iqr.test(rnorm(100), rnorm(100,10,3))$p.value)

  [1] 0.2248312 0.2248312 0.2248312 0.2248312 0.2248312 0.2248312 0.2248312
  [8] 0.2248312 0.2248312 0.2248312


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] significance test interquartile ranges

Reply via email to