On 10/30/19 04:29, Martin Maechler wrote:
>>>>>> Gabriel Becker
>>>>>>     on Tue, 29 Oct 2019 12:43:15 -0700 writes:
>
> > Hi all,
> >
> > So I've started working on this and I ran into something that I didn't
> > know, namely that for x a multi-dimensional (2+) array, head(x) and tail(x)
> > ignore dimension completely, treat x as an atomic vector, and return an
> > (unclassed) atomic vector:
>
> Well, that's (3+), not "2+".
>
> But I did write (on Sep 17 in this thread!)
>
> > The current source for head() and tail() and all their methods
> > in utils is just 83 lines of code {file utils/R/head.R minus
> > the initial mostly copyright comments}.
>
> and if you've ever looked at these few dozen lines of R code, you'll
> have seen that we just added two simple utilities with a few
> reasonably simple methods. Treating non-matrix (i.e. non-2d)
> arrays as vectors is typically not unreasonable in R, but
> indeed with your proposals (in this thread), such non-2d arrays
> should be treated differently, either via new head.array() /
> tail.array() methods (or -- only if it can be done more nicely -- by
> the default method).
>
> Note however the following historical quirk:
>
> > sapply(setNames(,1:5), function(K) inherits(array(pi, dim=1:K), "array"))
>     1     2     3     4     5
>  TRUE FALSE  TRUE  TRUE  TRUE
>
> (Is this something we should consider changing for R 4.0.0 -- to
> have it TRUE also for 2d-arrays aka matrix objects ??)
That would be awesome! More generally, I wonder how feasible it would be
to fix all these inheritance quirks where inherits(x, "something"),
is(x, "something"), and is.something(x) disagree. They've been such a
nuisance for so many years...

Thanks,
H.

>
> The consequence of that is that
> currently, "often" foo.matrix is just a copy of foo.array in
> the case the latter exists:
> "base" examples: foo in {unique, duplicated, anyDuplicated}.
>
> So I propose you change the current head.matrix and tail.matrix to
> head.array and tail.array
> (and then have head.matrix <- head.array etc., at least if the
> above quirk must remain, or remains (which I currently guess to
> be the case)).
>
> > > x = array(100, c(4, 5, 5))
> > > dim(x)
> > [1] 4 5 5
> > > head(x, 1)
> > [1] 100
> > > class(head(x))
> > [1] "numeric"
> >
> > (For a 1d array, it does return another 1d array).
> >
> > When extending head/tail to understand multiple dimensions as discussed in
> > this thread, then, should the behavior for 2+d arrays be explicitly
> > retained, or should head and tail do the analogous thing (with a
> > head(<2d array>) behaving the same as head(<matrix>), which honestly is
> > what I expected to already be happening)?
> >
> > Are people using/relying on this behavior in their code, and if so, why/for
> > what?
> >
> > Even more generally, one way forward is to have the default methods check
> > for dimensions, and use length if it is null:
> >
> > tail.default <- tail.data.frame <- function(x, n = 6L, ...)
> > {
> >     if(any(n == 0))
> >         stop("n must be non-zero or unspecified for all dimensions")
> >     if(!is.null(dim(x)))
> >         dimsx <- dim(x)
> >     else
> >         dimsx <- length(x)
> >
> >     ## this returns a list of vectors of indices in each
> >     ## dimension, regardless of length of the n
> >     ## argument
> >     sel <- lapply(seq_along(dimsx), function(i) {
> >         dxi <- dimsx[i]
> >         ## select all indices (full dim) if not specified
> >         ni <- if(length(n) >= i) n[i] else dxi
> >         ## handle negative ns
> >         ni <- if (ni < 0L) max(dxi + ni, 0L) else min(ni, dxi)
> >         seq.int(to = dxi, length.out = ni)
> >     })
> >     args <- c(list(x), sel, drop = FALSE)
> >     do.call("[", args)
> > }
> >
> > I think this precludes the need for a separate data.frame method at all,
> > actually, though (I would think) tail.data.frame would still be defined and
> > exported for backwards compatibility. (The matrix method has some extra
> > bits, so my current conception of it is still separate, though it might not
> > NEED to be.)
> >
> > The question then becomes, should head/tail always return something with
> > the same dimensionality (number of dims) it got, or should data.frame and
> > matrix be special-cased in this regard, as they are now?
> >
> > What are people's thoughts?
> >
> > ~G
> >
> > [[alternative HTML version deleted]]
>

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
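
For anyone who wants to try the tail.default proposal quoted in this message without masking the real utils methods, here is a minimal sketch. The name tail_nd() is hypothetical (it is not part of utils or of the proposal); the body just follows the quoted logic, and the three calls at the end are only illustrations.

```r
## Hypothetical helper: tail_nd() is NOT part of utils. It wraps the logic
## of the quoted tail.default proposal under a different name so that the
## real head()/tail() methods stay untouched.
tail_nd <- function(x, n = 6L, ...) {
    if (any(n == 0))
        stop("n must be non-zero or unspecified for all dimensions")
    ## use dim() when present, otherwise treat x as one-dimensional
    dimsx <- if (!is.null(dim(x))) dim(x) else length(x)

    ## one index vector per dimension, however long n is
    sel <- lapply(seq_along(dimsx), function(i) {
        dxi <- dimsx[i]
        ## dimensions beyond length(n) are kept in full
        ni <- if (length(n) >= i) n[i] else dxi
        ## negative n means "all but the first -n" along this dimension
        ni <- if (ni < 0L) max(dxi + ni, 0L) else min(ni, dxi)
        seq.int(to = dxi, length.out = ni)
    })
    ## drop = FALSE keeps the number of dimensions of the input
    do.call("[", c(list(x), sel, drop = FALSE))
}

x <- array(seq_len(4 * 5 * 5), c(4, 5, 5))
dim(tail_nd(x, 1))          # trims the first dimension only
dim(tail_nd(x, c(2, -3)))   # last 2 rows, all but the first 3 columns
tail_nd(1:10, 3)            # plain vectors fall back to length()
```

Because of drop = FALSE, the result keeps as many dimensions as the input, which is one possible answer to the "same dimensionality" question raised in the quoted text; special-casing data.frame and matrix would need extra code on top of this.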