I submit a couple of options for addressing bug 16719 (kruskal.test
documentation for formula):
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16719

disallow-character.diff changes the documentation and the error message
to indicate that the group must be a numeric vector or a factor;
character vectors are still rejected.
allow-character.diff changes the kruskal.test functions to convert
character vectors to factors; the documentation is updated accordingly.

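For context, the current failure seems to come from the is.finite()
check itself: is.finite() is FALSE for every element of a character
vector, so the guard fires before factor(g) is reached. A minimal sketch
using only base R (the variable names g and f are mine, for
illustration):

```r
# Character group labels, as in the bug report's example.
g <- rep(letters[1:2], c(16, 16))

# is.finite() is FALSE for every element of a character vector,
# so the guard !all(is.finite(g)) triggers the error.
all(is.finite(g))

# After explicit conversion the guard passes: factors are stored as
# integer codes, which is.finite() treats as finite.
f <- factor(g)
all(is.finite(f))
nlevels(f)

# Workaround that needs neither patch: convert the group explicitly.
res <- kruskal.test(mtcars$mpg, f)
print(res)
```

The explicit factor() call is what disallow-character.diff asks the user
to do, and what allow-character.diff does on the user's behalf.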
I tested the updated functions with the examples in example.R, which
are based on the examples in the bug report.

If there is interest in applying either patch, especially the latter,
I would first like to test the change against many existing programs
that call kruskal.test, to see whether it causes any regressions. Is
there a good place to look for programs that use particular R functions?

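As a rough sketch of the kind of regression check I have in mind, the
following compares the unpatched function with a patched one on inputs
that already work today. kruskal_patched is a hypothetical stand-in that
I define here for illustration; in practice it would be the kruskal.test
obtained by sourcing the patched kruskal.test.R.

```r
# Hypothetical stand-in for the patched function: it converts character
# groups to factors and otherwise defers to stats::kruskal.test.
kruskal_patched <- function(x, g, ...) {
    if (is.character(g)) g <- factor(g)
    stats::kruskal.test(x, g, ...)
}

# Inputs that already work in unpatched R; their results must not change.
cases <- list(
    numeric_groups = list(x = mtcars$mpg, g = mtcars$cyl),
    factor_groups  = list(x = mtcars$mpg, g = factor(mtcars$gear))
)

# TRUE when the test statistic and the p-value agree for one case.
same_result <- function(case) {
    old <- stats::kruskal.test(case$x, case$g)
    new <- kruskal_patched(case$x, case$g)
    isTRUE(all.equal(unname(old$statistic), unname(new$statistic))) &&
        isTRUE(all.equal(old$p.value, new$p.value))
}

results <- vapply(cases, same_result, logical(1))
print(results)
```

Calls collected from existing programs could be added to the cases list
in the same form.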
I am having trouble building R, so I have so far tested these changes
only by patching revision 74631 (SVN head) and sourcing the resulting
kruskal.test.R in R 3.4.1 on OpenBSD 6.2. I have therefore not tested
the R documentation files.

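It may be possible to check the Rd changes without a full build, using
the tools package that ships with R. The sketch below runs on a small
stand-in Rd file written to a temporary path (the file contents and the
name demo are made up for illustration); I assume the same calls would
work when pointed at the patched src/library/stats/man/kruskal.test.Rd,
but I have not tried that.

```r
# Write a tiny stand-in Rd file (hypothetical content, not the real
# kruskal.test.Rd) so that the sketch is self-contained.
rd_path <- tempfile(fileext = ".Rd")
writeLines(c(
    "\\name{demo}",
    "\\alias{demo}",
    "\\title{Demo}",
    "\\description{A stand-in help page.}",
    "\\usage{demo(g)}",
    "\\arguments{",
    "  \\item{g}{a character vector, numeric vector, or factor.}",
    "}"
), rd_path)

# Parse the file; syntax errors in the Rd markup surface here.
parsed <- tools::parse_Rd(rd_path)

# Run the Rd checks; any problems are collected as messages.
chk <- tools::checkRd(parsed)
print(chk)

# Render the page as plain text to eyeball the result.
tools::Rd2txt(parsed)
```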
disallow-character.diff:

Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R (revision 74631)
+++ src/library/stats/R/kruskal.test.R (working copy)
@@ -46,7 +46,10 @@
x <- x[OK]
g <- g[OK]
if (!all(is.finite(g)))
- stop("all group levels must be finite")
+ if (is.character(g))
+ stop("all group levels must be finite; convert group to a factor")
+ else
+ stop("all group levels must be finite")
g <- factor(g)
k <- nlevels(g)
if (k < 2L)
Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd (revision 74631)
+++ src/library/stats/man/kruskal.test.Rd (working copy)
@@ -22,11 +22,12 @@
\item{x}{a numeric vector of data values, or a list of numeric data
vectors. Non-numeric elements of a list will be coerced, with a
warning.}
- \item{g}{a vector or factor object giving the group for the
+ \item{g}{a numeric vector or factor object giving the group for the
corresponding elements of \code{x}. Ignored with a warning if
\code{x} is a list.}
\item{formula}{a formula of the form \code{response ~ group} where
- \code{response} gives the data values and \code{group} a vector or
+ \code{response} gives the data values and \code{group}
+ a numeric vector or
factor of the corresponding groups.}
\item{data}{an optional matrix or data frame (or similar: see
\code{\link{model.frame}}) containing the variables in the
@@ -52,7 +53,8 @@
list, use \code{kruskal.test(list(x, ...))}.
Otherwise, \code{x} must be a numeric data vector, and \code{g} must
- be a vector or factor object of the same length as \code{x} giving
+ be a numeric vector or factor object of the same length as \code{x}
+ giving
the group for the corresponding elements of \code{x}.
}
\value{

allow-character.diff:

Index: src/library/stats/R/kruskal.test.R
===================================================================
--- src/library/stats/R/kruskal.test.R (revision 74631)
+++ src/library/stats/R/kruskal.test.R (working copy)
@@ -45,7 +45,7 @@
OK <- complete.cases(x, g)
x <- x[OK]
g <- g[OK]
- if (!all(is.finite(g)))
+ if (!is.character(g) && !all(is.finite(g)))
stop("all group levels must be finite")
g <- factor(g)
k <- nlevels(g)
Index: src/library/stats/man/kruskal.test.Rd
===================================================================
--- src/library/stats/man/kruskal.test.Rd (revision 74631)
+++ src/library/stats/man/kruskal.test.Rd (working copy)
@@ -22,11 +22,13 @@
\item{x}{a numeric vector of data values, or a list of numeric data
vectors. Non-numeric elements of a list will be coerced, with a
warning.}
- \item{g}{a vector or factor object giving the group for the
+ \item{g}{a character vector, numeric vector, or factor
+ giving the group for the
corresponding elements of \code{x}. Ignored with a warning if
\code{x} is a list.}
\item{formula}{a formula of the form \code{response ~ group} where
- \code{response} gives the data values and \code{group} a vector or
+ \code{response} gives the data values and \code{group} a
+ character vector, numeric vector, or
factor of the corresponding groups.}
\item{data}{an optional matrix or data frame (or similar: see
\code{\link{model.frame}}) containing the variables in the
@@ -52,7 +54,8 @@
list, use \code{kruskal.test(list(x, ...))}.
Otherwise, \code{x} must be a numeric data vector, and \code{g} must
- be a vector or factor object of the same length as \code{x} giving
+ be a numeric vector, character vector, or factor of the same length
+ as \code{x} giving
the group for the corresponding elements of \code{x}.
}
\value{

example.R:

source('kruskal.test.R')
help(kruskal.test)
data(mtcars)
mtcars$type <- rep(letters[1:2], c(16, 16))
is.vector(mtcars$type) ## TRUE
with(mtcars, kruskal.test(mpg, type))
## Error in kruskal.test.default(c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, :
## all group levels must be finite
kruskal.test(mpg ~ type, mtcars)
## Error in kruskal.test.default(c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, :
## all group levels must be finite
mtcars$type <- rep(1:2, c(16, 16))
kruskal.test(mpg ~ type, mtcars) # works
mtcars$type <- factor(mtcars$type)
kruskal.test(mpg ~ type, mtcars) # works
mtcars$type <- rep(c(8, Inf), c(16, 16))
kruskal.test(mpg ~ type, mtcars) # should fail
______________________________________________
R-devel mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel