sapply() stems from S / S+ times and hence has a long tradition. In spite of that, I think that it should be enhanced...
As the subject mentions, sapply() produces a matrix in cases where the list
components of the lapply(.) results are of the same length (and ...).
However, it unfortunately "stops there".

E.g., if you *nest* two sapply() calls where the inner one produces a matrix,
very often the logical behavior would be for the outer sapply() to stack these
matrices into an array of rank 3  ["array rank"(x) := length(dim(x))].
However, it does not do that, e.g., in this artificial example

  p0 <- function(...) paste(..., sep="")
  myF <- function(x,y) {
      stopifnot(length(x) <= 3)
      x <- rep(x, length.out=3)
      ny <- length(y)
      r <- outer(x,y)
      dimnames(r) <- list(p0("r",1:3), p0("C", seq_len(ny)))
      r
  }

and

  > (v <- structure(10*(5:8), names=LETTERS[1:4]))
   A  B  C  D
  50 60 70 80

If we let sapply() not simplify, we see the list of same-size matrices it produces:

  > sapply(v, myF, y = 2*(1:5), simplify=FALSE)
  $A
      C1  C2  C3  C4  C5
  r1 100 200 300 400 500
  r2 100 200 300 400 500
  r3 100 200 300 400 500

  $B
      C1  C2  C3  C4  C5
  r1 120 240 360 480 600
  r2 120 240 360 480 600
  r3 120 240 360 480 600

  $C
      C1  C2  C3  C4  C5
  r1 140 280 420 560 700
  r2 140 280 420 560 700
  r3 140 280 420 560 700

  $D
      C1  C2  C3  C4  C5
  r1 160 320 480 640 800
  r2 160 320 480 640 800
  r3 160 320 480 640 800

However, quite deceptively

  > sapply(v, myF, y = 2*(1:5))
          A   B   C   D
   [1,] 100 120 140 160
   [2,] 100 120 140 160
   [3,] 100 120 140 160
   [4,] 200 240 280 320
   [5,] 200 240 280 320
   [6,] 200 240 280 320
   [7,] 300 360 420 480
   [8,] 300 360 420 480
   [9,] 300 360 420 480
  [10,] 400 480 560 640
  [11,] 400 480 560 640
  [12,] 400 480 560 640
  [13,] 500 600 700 800
  [14,] 500 600 700 800
  [15,] 500 600 700 800

My proposal -- implemented and "make check" tested -- is to add an optional
argument 'ARRAY' which allows

  > sapply(v, myF, y = 2*(1:5), ARRAY=TRUE)
  , , A

      C1  C2  C3  C4  C5
  r1 100 200 300 400 500
  r2 100 200 300 400 500
  r3 100 200 300 400 500

  , , B

      C1  C2  C3  C4  C5
  r1 120 240 360 480 600
  r2 120 240 360 480 600
  r3 120 240 360 480 600

  , , C

      C1  C2  C3  C4  C5
  r1 140 280 420 560 700
  r2 140 280 420 560 700
  r3 140 280 420 560 700

  , , D

      C1  C2  C3  C4  C5
  r1 160 320 480 640 800
  r2 160 320 480 640 800
  r3 160 320 480 640 800

-----------

In the best of all worlds, the default would be 'ARRAY = TRUE', but of course,
given the long-standing different behavior, that seems much too "risky", and
my proposal includes remaining back-compatible with default ARRAY = FALSE.

Martin Maechler, ETH Zurich

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
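
For readers who want the rank-3 result without the proposed 'ARRAY' argument, the stacking described above can be approximated by hand. The following is only a minimal sketch: stack3() is a hypothetical helper name introduced here for illustration, not part of base R and not the implementation behind the proposal. It assumes every list component is a matrix with the same dim().

```r
# Hypothetical helper (an assumption for illustration, not the proposed patch):
# stack a list of equally sized matrices into an array of rank 3.
stack3 <- function(lst) {
    d <- dim(lst[[1]])
    stopifnot(length(d) == 2,
              all(vapply(lst, function(m) identical(dim(m), d), NA)))
    dn <- dimnames(lst[[1]])
    if (is.null(dn)) dn <- list(NULL, NULL)   # no row/column names to carry over
    array(unlist(lst, use.names = FALSE),
          dim = c(d, length(lst)),
          dimnames = c(dn, list(names(lst))))
}

# Definitions repeated from the post so that the block is self-contained
p0 <- function(...) paste(..., sep="")
myF <- function(x,y) {
    stopifnot(length(x) <= 3)
    x <- rep(x, length.out=3)
    ny <- length(y)
    r <- outer(x,y)
    dimnames(r) <- list(p0("r",1:3), p0("C", seq_len(ny)))
    r
}
v <- structure(10*(5:8), names=LETTERS[1:4])

# Keep the unsimplified list, then stack it along a third dimension
a <- stack3(sapply(v, myF, y = 2*(1:5), simplify=FALSE))
dim(a)               # 3 5 4
a["r2", "C3", "B"]   # 360
```

The element values are the same as in the 15 x 4 matrix that plain sapply() returns; only the dim and dimnames attributes differ, which is why the row and column names survive here but not in the simplified matrix.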