[Rd] kendall tau correlation test for ties: Potential error (PR#8076)

2005-08-18 Thread dkoschuetzki
Full_Name: Dirk Koschuetzki
Version: 2.1.1
OS: source code
Submission from: (NULL) (194.94.136.34)


Hello,

From the source code (R-2.1.1, file: .../R-2.1.1/src/library/stats/R/)

**
cor.test.default <-
function(x, y, alternative = c("two.sided", "less", "greater"),
 method = c("pearson", "kendall", "spearman"), exact = NULL,
 conf.level = 0.95, ...)
{
alternative <- match.arg(alternative)
method <- match.arg(method)
DNAME <- paste(deparse(substitute(x)), "and", deparse(substitute(y)))

if(length(x) != length(y))
stop("'x' and 'y' must have the same length")
OK <- complete.cases(x, y)
x <- x[OK]
y <- y[OK]
n <- length(x)

PVAL <- NULL
NVAL <- 0
conf.int <- FALSE

if(method == "pearson") {
# (Pearson branch omitted)
}
else {
if(n < 2)
stop("not enough finite observations")
PARAMETER <- NULL
TIES <- (min(length(unique(x)), length(unique(y))) < n)
if(method == "kendall") {
method <- "Kendall's rank correlation tau"
names(NVAL) <- "tau"
r <- cor(x,y, method = "kendall")
ESTIMATE <- c(tau = r)

if(!is.finite(ESTIMATE)) {  # all x or all y the same
ESTIMATE[] <- NA
STATISTIC <- c(T = NA)
PVAL <- NA
}
else {
if(is.null(exact))
exact <- (n < 50)
if(exact && !TIES) {
q <- round((r + 1) * n * (n - 1) / 4)
pkendall <- function(q, n) {
.C("pkendall",
   length(q),
   p = as.double(q),
   as.integer(n),
   PACKAGE = "stats")$p
}
PVAL <-
switch(alternative,
   "two.sided" = {
   if(q > n * (n - 1) / 4)
   p <- 1 - pkendall(q - 1, n)
   else
   p <- pkendall(q, n)
   min(2 * p, 1)
   },
   "greater" = 1 - pkendall(q - 1, n),
   "less" = pkendall(q, n))
STATISTIC <- c(T = q)
} else {
STATISTIC <- c(z = r / sqrt((4 * n + 10) / (9 * n*(n-1))))
p <- pnorm(STATISTIC)
if(exact && TIES)
warning("Cannot compute exact p-value with ties")
}
}
} else {
# (Spearman branch omitted)
}
}

if(is.null(PVAL)) # for "pearson" only, currently
PVAL <- switch(alternative,
   "less" = p,
   "greater" = 1 - p,
   "two.sided" = 2 * min(p, 1 - p))

RVAL <- list(statistic = STATISTIC,
 parameter = PARAMETER,
 p.value = as.numeric(PVAL),
 estimate = ESTIMATE,
 null.value = NVAL,
 alternative = alternative,
 method = method,
 data.name = DNAME)
if(conf.int)
RVAL <- c(RVAL, list(conf.int = cint))
class(RVAL) <- "htest"
RVAL
}
*

Please look at the computation of the p-value for Kendall's tau. There is an
assignment to "p" right above the warning, but PVAL is not set in that branch.
At the bottom of the function, a comment says that the conversion of "p" into
PVAL applies to the Pearson case only.

The problem is: 
* Either the comment is wrong because the modification should be done with
kendall too, or
* The variable PVAL has to be assigned in the kendall block.
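
To make the two readings concrete, here is a minimal sketch that mirrors only
the normal-approximation branch quoted above. The helper name
kendall_normal_pvals and the sample data are made up for illustration; this is
not the code in the stats package, and other versions of cor.test may compute
the statistic differently. It shows the raw "p" from pnorm() next to the values
the final switch() would produce if it applies to the Kendall case as well.

```r
# Hypothetical helper: reproduces the quoted tie/normal-approximation branch
# and then applies the final switch() for all three alternatives.
kendall_normal_pvals <- function(x, y) {
  ok <- complete.cases(x, y)
  x <- x[ok]
  y <- y[ok]
  n <- length(x)

  r <- cor(x, y, method = "kendall")
  z <- r / sqrt((4 * n + 10) / (9 * n * (n - 1)))
  p <- pnorm(z)  # lower-tail probability; PVAL is still NULL at this point

  # What the switch() at the bottom of the function would compute from "p"
  c(raw.p     = p,
    less      = p,
    greater   = 1 - p,
    two.sided = 2 * min(p, 1 - p))
}

# Made-up data with ties in both x and y
x <- c(1, 2, 2, 3, 4, 4, 5)
y <- c(2, 1, 3, 3, 5, 4, 6)

pv <- kendall_normal_pvals(x, y)
print(pv)
```

If the final switch() did not apply to the Kendall case, the lower-tail value
raw.p would have no way of reaching PVAL, whichever alternative was requested.
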

I hope this is clear so far.

Please send me some comments, because I'm not sure whether my observation is
correct. I am currently trying to work out the significance in the biserial
case, which of course relies heavily on the tied case.

Cheers,
Dirk

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] kendall tau correlation test for ties: Potential error (PR#8076)

2005-08-24 Thread dkoschuetzki
Hello,

On Thu, 18 Aug 2005 15:07:07 +0200, Peter Dalgaard  
<[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] writes:

>> The problem is:
>> * Either the comment is wrong because the modification should be done with
>> kendall too, or
>> * The variable PVAL has to be assigned in the kendall block.
>
> I think it is the comment that is wrong. [...]

Thanks for the clarification. I think I got the untied case and I think  
that I have an understanding of the problems of the tied one by now.  
Thanks for your comments and many thanks for the great R system. I use it  
as my daily working environment and I'm very happy with it. Thanks!

Dirk
