With

> R.version.string
[1] "R Under development (unstable) (2013-01-26 r61752)"

'split.default' recycles a short factor when 'x' is unclassed, but not when 'x' is an instance of a class:

> split(1:5, 1:2)
$`1`
[1] 1 3 5

$`2`
[1] 2 4

Warning message:
In split.default(1:5, 1:2) :
  data length is not a multiple of split variable
> x = structure(1:5, class="A")
> split(x, 1:2)
$`1`
[1] 1

$`2`
[1] 2

Also, this is inconsistent with split<-, which does have recycling

> split(x, 1:2) <- 1:2
Warning message:
In split.default(seq_along(x), f, drop = drop, ...) :
  data length is not a multiple of split variable
> x
[1] 1 2 1 2 1
attr(,"class")
[1] "A"

A solution is to change a call to seq_along(f) toward the end of split.default to seq_along(x).
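
The difference between the two index vectors can be sketched outside split.default. This is an illustration only, not part of the patch: it uses the public split() on plain integers in place of the .Internal call, and the names ind_f, ind_x, y_f and y_x are made up for the example.

```r
# Sketch only: mimic the index computation in split.default using the
# public split() on plain integers rather than .Internal(split(...)).
x <- structure(1:5, class = "A")
f <- factor(1:2)

# Indices taken along f: only the first length(f) positions are grouped.
ind_f <- split(seq_along(f), f)

# Indices taken along x: f is recycled over all of x (with the usual
# "not a multiple" warning, suppressed here).
ind_x <- suppressWarnings(split(seq_along(x), f))

# Subset x by each index set, as the for() loop in split.default does.
y_f <- lapply(ind_f, function(i) x[i])
y_x <- lapply(ind_x, function(i) x[i])
```

With seq_along(f) the elements of x beyond length(f) never appear in any group, whereas with seq_along(x) every element is assigned to a level, matching the unclassed case.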

@@ -32,7 +32,7 @@
     lf <- levels(f)
     y <- vector("list", length(lf))
     names(y) <- lf
-    ind <- .Internal(split(seq_along(f), f))
+    ind <- .Internal(split(seq_along(x), f))
     for(k in lf) y[[k]] <- x[ind[[k]]]
     y
 }

Maybe a little harder to argue is the following. In split.default, for a class that one might wish to give factor-like behaviour, e.g.,

    Rle = setClass("Rle", representation(values="integer", lengths="integer"))
    f = Rle(values=1:2, lengths=2:3)

the code

    if (is.list(f))
        f <- interaction(f, drop = drop, sep = sep)
    else if (drop || !is.factor(f))
        f <- factor(f)

requires that one make factor a generic and develop a factor.Rle method. This contradicts the documentation

       f: a ‘factor’ in the sense that ‘as.factor(f)’ defines the
          grouping, or a list of such factors in which case their
          interaction is used for the grouping.

and perhaps the more common (?) pattern of coercion using as.*. A solution is to make as.factor a generic and revise the code above to use something like

    if (is.list(f))
        f <- interaction(f, drop = drop, sep = sep)
    else if (!is.factor(f))
        f <- as.factor(f)
    else if (drop)
        f <- factor(f)

One then gets split behaviour if there is an as.factor.Rle method

    as.factor.Rle <- function(x, ...)
        factor(rep(x@values, x@lengths), levels=unique(x@values))
    setAs("Rle", "factor", function(from) as.factor.Rle(from))

These more elaborate changes are in the attached diff.

Martin
-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

split.diff.tar.gz
Description: GNU Zip compressed data