The stats:::cor.test.default function has these tests near the start:

    if (length(x) != length(y))
        stop("'x' and 'y' must have the same length")
    if (!is.numeric(x))
        stop("'x' must be a numeric vector")
    if (!is.numeric(y))
        stop("'y' must be a numeric vector")

I'd like to suggest moving the first test (the length check) to last place, after the two is.numeric() tests, which would make some user errors easier to diagnose.  For example, if I misspell one of the column names, I get

  df <- data.frame(x = 1:10, y = 1:10)
  cor.test(df$X, df$y)
  #> Error in cor.test.default(df$X, df$y): 'x' and 'y' must have the same length

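because df$X is NULL, which has length 0 while df$y has length 10, so the length test fails before the is.numeric() tests are reached.  It would be more obvious what went wrong if the error said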

  Error in cor.test.default(df$X, df$y):  'x' must be a numeric vector
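
To illustrate the effect of the reordering, here is a minimal sketch.  The function check_xy() is a hypothetical stand-in that contains only the three tests in the proposed order; it is not the actual stats:::cor.test.default source.

```r
# Hypothetical helper: the three tests from cor.test.default,
# with the length check moved to last place.
check_xy <- function(x, y) {
    if (!is.numeric(x))
        stop("'x' must be a numeric vector")
    if (!is.numeric(y))
        stop("'y' must be a numeric vector")
    if (length(x) != length(y))
        stop("'x' and 'y' must have the same length")
    invisible(TRUE)
}

df <- data.frame(x = 1:10, y = 1:10)

# df$X is NULL (misspelled column), so is.numeric() fails first.
msg <- tryCatch(check_xy(df$X, df$y),
                error = function(e) conditionMessage(e))
print(msg)
```

With this order, the misspelled column is reported as a problem with 'x' rather than as a length mismatch.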

Duncan Murdoch

P.S. An even more friendly error message would give the actual expression for x instead, i.e.

  Error in cor.test.default(df$X, df$y):  'df$X' is not a numeric vector

but that's not the style of error used in most stats functions.
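
For what it's worth, one way such a message could be produced is with substitute().  This is only a sketch: check_numeric() is a hypothetical helper, not anything in stats, and deparse1() assumes R >= 4.0.0.

```r
# Hypothetical helper: report the caller's expression in the error message.
check_numeric <- function(x) {
    # Capture the unevaluated argument expression, e.g. df$X.
    expr <- deparse1(substitute(x))
    if (!is.numeric(x))
        stop(sprintf("'%s' is not a numeric vector", expr), call. = FALSE)
    invisible(TRUE)
}

df <- data.frame(x = 1:10, y = 1:10)

msg2 <- tryCatch(check_numeric(df$X),
                 error = function(e) conditionMessage(e))
print(msg2)
```

The message then names the expression the user actually typed, which points directly at the misspelled column.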

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
