I am submitting two options for addressing bug 16719 (kruskal.test documentation for formula):
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16719
disallow-character.diff changes the documentation and the error message to indicate that the group must be a numeric vector or a factor.

allow-character.diff changes the kruskal.test functions to convert character vectors to factors; the documentation is updated accordingly.

I tested the updated functions with the examples in example.R, which is based on the examples in the bug report.

If there is interest in applying either patch, especially the latter, I would first like to test the change on many existing programs that call kruskal.test, to see whether it causes any regressions. Is there a good place to look for programs that use particular R functions?

I am having trouble building R, so I have so far tested these changes only by patching revision 74631 (SVN head) and sourcing the resulting kruskal.test.R in R 3.4.1 on OpenBSD 6.2. I have therefore not tested the R documentation files.
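For the regression check, one possible way to find call sites in a local collection of R scripts is sketched below. It is only a sketch: find_calls and the demo files are hypothetical names made up for illustration, and it finds only direct calls, not calls made through do.call("kruskal.test", ...) or similar.

```r
# Hypothetical helper: report the file and line of each direct call to `fn`.
find_calls <- function(files, fn = "kruskal.test") {
    hits <- lapply(files, function(f) {
        # Skip files that do not parse instead of aborting the whole scan.
        exprs <- tryCatch(parse(f, keep.source = TRUE),
                          error = function(e) NULL)
        if (is.null(exprs)) return(NULL)
        pd <- utils::getParseData(exprs)
        if (is.null(pd)) return(NULL)
        pd <- pd[pd$token == "SYMBOL_FUNCTION_CALL" & pd$text == fn, ]
        if (nrow(pd) == 0L) return(NULL)
        data.frame(file = f, line = pd$line1, stringsAsFactors = FALSE)
    })
    do.call(rbind, Filter(Negate(is.null), hits))
}

# Demo on throwaway files (assumed content, for illustration only).
demo_dir <- tempfile("scripts")
dir.create(demo_dir)
writeLines(c("x <- 1", "kruskal.test(mpg ~ cyl, mtcars)"),
           file.path(demo_dir, "a.R"))
writeLines("stats::kruskal.test(list(1:3, 4:6))", file.path(demo_dir, "b.R"))
writeLines("t.test(1:10)", file.path(demo_dir, "c.R"))

found <- find_calls(list.files(demo_dir, pattern = "\\.[Rr]$",
                               full.names = TRUE))
print(found)
```

Using the parse data rather than a plain text search avoids matching comments and strings, at the cost of skipping files that fail to parse.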
Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R  (revision 74631)
+++ src/library/stats/R/kruskal.test.R  (working copy)
@@ -46,7 +46,10 @@
         x <- x[OK]
         g <- g[OK]
         if (!all(is.finite(g)))
-            stop("all group levels must be finite")
+            if (is.character(g))
+                stop("all group levels must be finite; convert group to a factor")
+            else
+                stop("all group levels must be finite")
         g <- factor(g)
         k <- nlevels(g)
         if (k < 2L)
Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd  (revision 74631)
+++ src/library/stats/man/kruskal.test.Rd  (working copy)
@@ -22,11 +22,12 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
     vectors.  Non-numeric elements of a list will be coerced, with a
     warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a numeric vector or factor object giving the group for the
     corresponding elements of \code{x}.  Ignored with a warning if
     \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-    \code{response} gives the data values and \code{group} a vector or
+    \code{response} gives the data values and \code{group}
+    a numeric vector or
     factor of the corresponding groups.}
   \item{data}{an optional matrix or data frame (or similar: see
     \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +53,8 @@
   list, use \code{kruskal.test(list(x, ...))}.
 
   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector or factor object of the same length as \code{x}
+  giving
   the group for the corresponding elements of \code{x}.
 }
 \value{
Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R  (revision 74631)
+++ src/library/stats/R/kruskal.test.R  (working copy)
@@ -45,7 +45,7 @@
         OK <- complete.cases(x, g)
         x <- x[OK]
         g <- g[OK]
-        if (!all(is.finite(g)))
+        if (!is.character(g) & !all(is.finite(g)))
             stop("all group levels must be finite")
         g <- factor(g)
         k <- nlevels(g)
Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd  (revision 74631)
+++ src/library/stats/man/kruskal.test.Rd  (working copy)
@@ -22,11 +22,13 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
     vectors.  Non-numeric elements of a list will be coerced, with a
     warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a character vector, numeric vector, or factor
+    giving the group for the
     corresponding elements of \code{x}.  Ignored with a warning if
     \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-    \code{response} gives the data values and \code{group} a vector or
+    \code{response} gives the data values and \code{group} a
+    character vector, numeric vector, or
     factor of the corresponding groups.}
   \item{data}{an optional matrix or data frame (or similar: see
     \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +54,8 @@
   list, use \code{kruskal.test(list(x, ...))}.
 
   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector, character vector, or factor of the same length
+  as \code{x} giving
   the group for the corresponding elements of \code{x}.
 }
 \value{
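To make the effect of the changed condition in allow-character.diff easier to see in isolation, here is a minimal standalone sketch of the same group check. validate_groups is a hypothetical name used only for illustration; it is not part of the stats package. The sketch uses && where the patch uses &; for these length-one operands the two should behave the same, apart from && skipping the is.finite call for character input.

```r
# Hypothetical helper mirroring the group check in allow-character.diff.
validate_groups <- function(g) {
    # Character groups skip the finiteness check; other types must be finite.
    if (!is.character(g) && !all(is.finite(g)))
        stop("all group levels must be finite")
    factor(g)
}

chr_groups <- validate_groups(rep(letters[1:2], c(16, 16)))  # accepted
num_groups <- validate_groups(rep(1:2, c(16, 16)))           # accepted

# A non-finite numeric group is still rejected.
inf_result <- try(validate_groups(rep(c(8, Inf), c(16, 16))), silent = TRUE)
```

Here chr_groups and num_groups are factors with two levels each, while inf_result is a "try-error" object carrying the original error message.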
source('kruskal.test.R')
help(kruskal.test)

data(mtcars)
mtcars$type <- rep(letters[1:2], c(16, 16))
is.vector(mtcars$type)
## TRUE
with(mtcars, kruskal.test(mpg, type))
## Error in kruskal.test.default(c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, :
##   all group levels must be finite
kruskal.test(mpg ~ type, mtcars)
## Error in kruskal.test.default(c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, :
##   all group levels must be finite

mtcars$type <- rep(1:2, c(16, 16))
kruskal.test(mpg ~ type, mtcars) # works

mtcars$type <- factor(mtcars$type)
kruskal.test(mpg ~ type, mtcars) # works

mtcars$type <- rep(c(8, Inf), c(16, 16))
kruskal.test(mpg ~ type, mtcars) # should fail
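For comparison, the workaround that the new error message in disallow-character.diff points to can be sketched as follows: wrap the character group in factor() at the call site. This sketch does not source the patched kruskal.test.R, so it uses whichever kruskal.test the running R provides.

```r
data(mtcars)
mtcars$type <- rep(letters[1:2], c(16, 16))

# Convert the character group to a factor explicitly in both interfaces.
res_formula <- kruskal.test(mpg ~ factor(type), data = mtcars)
res_default <- with(mtcars, kruskal.test(mpg, factor(type)))

print(res_formula)
```

Both calls should report the same test statistic, since the formula method passes the model frame columns on to the default method.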
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel