On Dec 1, 2010, at 2:39 AM, Martin Maechler wrote:

> sapply() stems from S / S+ times and hence has a long tradition.
> In spite of that I think that it should be enhanced...
>
> As the subject mentions, sapply() produces a matrix in cases
> where the list components of the lapply(.) results are of the
> same length (and ...).
> However, it unfortunately "stops there".
> E.g., if you *nest* two sapply() calls where the inner one
> produces a matrix, very often the logical behavior would be for
> the outer sapply() to stack these matrices into an array of
> rank 3 ["array rank"(x) := length(dim(x))].
> However it does not do that, e.g., an artificial example
>
> p0 <- function(...) paste(..., sep="")
> myF <- function(x,y) {
>     stopifnot(length(x) <= 3)
>     x <- rep(x, length.out=3)
>     ny <- length(y)
>     r <- outer(x,y)
>     dimnames(r) <- list(p0("r",1:3), p0("C", seq_len(ny)))
>     r
> }
>
> and
>
>> (v <- structure(10*(5:8), names=LETTERS[1:4]))
>  A  B  C  D
> 50 60 70 80
>
> if we let sapply() not simplify, we see the list of same size
> matrices it produces:
>
>> sapply(v, myF, y = 2*(1:5), simplify=FALSE)
> $A
>     C1  C2  C3  C4  C5
> r1 100 200 300 400 500
> r2 100 200 300 400 500
> r3 100 200 300 400 500
>
> $B
>     C1  C2  C3  C4  C5
> r1 120 240 360 480 600
> r2 120 240 360 480 600
> r3 120 240 360 480 600
>
> $C
>     C1  C2  C3  C4  C5
> r1 140 280 420 560 700
> r2 140 280 420 560 700
> r3 140 280 420 560 700
>
> $D
>     C1  C2  C3  C4  C5
> r1 160 320 480 640 800
> r2 160 320 480 640 800
> r3 160 320 480 640 800
>
> However, quite deceptively
>
>> sapply(v, myF, y = 2*(1:5))
>         A   B   C   D
>  [1,] 100 120 140 160
>  [2,] 100 120 140 160
>  [3,] 100 120 140 160
>  [4,] 200 240 280 320
>  [5,] 200 240 280 320
>  [6,] 200 240 280 320
>  [7,] 300 360 420 480
>  [8,] 300 360 420 480
>  [9,] 300 360 420 480
> [10,] 400 480 560 640
> [11,] 400 480 560 640
> [12,] 400 480 560 640
> [13,] 500 600 700 800
> [14,] 500 600 700 800
> [15,] 500 600 700 800
>
>
> My proposal -- implemented and "make check" tested --
> is to add an optional argument 'ARRAY'
> which allows
>
>> sapply(v, myF, y = 2*(1:5), ARRAY=TRUE)
> , , A
>
>     C1  C2  C3  C4  C5
> r1 100 200 300 400 500
> r2 100 200 300 400 500
> r3 100 200 300 400 500
>
> , , B
>
>     C1  C2  C3  C4  C5
> r1 120 240 360 480 600
> r2 120 240 360 480 600
> r3 120 240 360 480 600
>
> , , C
>
>     C1  C2  C3  C4  C5
> r1 140 280 420 560 700
> r2 140 280 420 560 700
> r3 140 280 420 560 700
>
> , , D
>
>     C1  C2  C3  C4  C5
> r1 160 320 480 640 800
> r2 160 320 480 640 800
> r3 160 320 480 640 800
>
>>
> -----------
>
> In the best of all worlds, the default would be 'ARRAY = TRUE',
> but of course, given the long-standing different behavior,
> it seems much too "risky", and my proposal includes remaining
> back-compatible with default ARRAY = FALSE.
>
> Martin Maechler,
> ETH Zurich
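
As a point of comparison, the rank-3 result shown in the quoted message
can be sketched with base functions alone, without the proposed 'ARRAY'
argument. This is only an illustration under stated assumptions: the
helper name stack_matrices is hypothetical (it is not in base R and it is
not Martin's implementation), while p0, myF and v are copied from the
quoted example.

```r
# p0, myF and v as in the quoted example
p0 <- function(...) paste(..., sep = "")
myF <- function(x, y) {
    stopifnot(length(x) <= 3)
    x <- rep(x, length.out = 3)
    ny <- length(y)
    r <- outer(x, y)
    dimnames(r) <- list(p0("r", 1:3), p0("C", seq_len(ny)))
    r
}
v <- structure(10 * (5:8), names = LETTERS[1:4])

# Hypothetical helper (an assumption for illustration, not base R):
# stack a list of equally sized matrices into an array of rank 3.
stack_matrices <- function(lst) {
    stopifnot(length(lst) > 0)
    d <- dim(lst[[1]])
    # every component must be a matrix with the same dimensions
    same <- vapply(lst,
                   function(m) is.matrix(m) && identical(dim(m), d),
                   logical(1))
    stopifnot(all(same))
    dn <- dimnames(lst[[1]])
    if (is.null(dn)) dn <- list(NULL, NULL)
    # unlist() concatenates the matrices column by column, in list order,
    # which is exactly the storage order of a d[1] x d[2] x n array
    array(unlist(lst, use.names = FALSE),
          dim = c(d, length(lst)),
          dimnames = c(dn, list(names(lst))))
}

res <- stack_matrices(lapply(v, myF, y = 2 * (1:5)))
dim(res)        # 3 5 4
res[, , "B"]    # the 3 x 5 matrix for v[["B"]]
```

The 15 x 4 matrix that sapply() returns by default holds the same values
with the first two dimensions collapsed into one, so the information is
all there; only the dim and dimnames attributes differ.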

Seems to me to be a reasonable proposal, Martin, obviously with the
proviso that the current default behavior is unaltered, as you note.

Regards,

Marc

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel