>>>>> Gabriel Becker >>>>> on Sat, 2 Nov 2019 12:40:16 -0700 writes:
[....................]

In the meantime, Gabe had worked quite a bit and provided a patch
proposal at R's bugzilla, PR#17652, i.e., here

    https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17652

A few days ago, I committed a (slightly simplified) version of
that to R-devel (svn rev 77462) with NEWS entry

    * head(x, n) and tail() default and other S3 methods notably for
      _vector_ n, e.g. to get a "corner" of a matrix, also extended
      for array's of higher dimension,
      thanks to the patch proposal by Gabe Becker in PR#16764.

(which contains a *wrong* PR number that I've corrected in the
meantime)

A day or so later, CRAN alerted me to the fact that this change
breaks the checks of some CRAN packages, about 30 by now, it seems.

There were at least two principal reasons. One of them is that
data frame subsetting has been somewhat surprising in R, without
being documented as such, *and* some packages have inadvertently
relied on this peculiarity -- which was in turn inadvertently
changed by r77462.

In short, head(<data frame>) kept extraneous attributes because
d[i, ] keeps those attributes ... for data frames.

I will amend the head() and tail() methods to remain back
compatible (as much as sensible) for now, but here's what I've
found about subsetting, i.e., the behavior of the (partly C code
internal) `[` methods in R:

 1) For a data frame d, d[i, ] differs from d[i,j], as the
    former keeps (extra) attributes,

 2) For a matrix, neither form of indexing keeps (extra)
    attributes.

Here's some simple reproducible R code exhibiting the claim:

##==== Data frame subsetting (vs. matrix, array) "with extra attributes": =====

## data frame w/ a (non-standard) attribute:
str(treeS <- structure(trees, foo = "bar"))

chkMat <- function(M) {
    stopifnot(nzchar(Mfoo <- attr(M, "foo")),
              length(d <- dim(M)) == 2,
              (n <- d[1]) >= 6, d[2] >= 3)
    ## n = nrow(M)
    stopifnot(exprs = {
        # attribute is kept
        if(inherits(M, "data.frame")) {
            identical( attr(M[ 1:3    , ] , "foo") , "bar") &&
            identical( attr(M[(n-2):n , ] , "foo") , "bar")
        } else { ## matrix
            is.null  ( attr(M[ 1:3    , ] , "foo")) &&
            is.null  ( attr(M[(n-2):n , ] , "foo"))
        }
        ## OTOH, [i,j]-indexing of data frames *does* drop "other" attributes:
        inherits(print(t.ij <- M[(n-2):n, 2:3] ), class(M))
        ## now, the "foo" attribute of M[i,j] is gone!
        is.null(attr(t.ij, "foo"))
    })
}
chkMat(treeS)
chkMat(as.matrix(treeS))

-------

And (to repeat), currently

    head(d, n)  is the same as  d[1:n , ]

when n >= 1 and length(n) == 1, and this equality is relied upon
by CRAN package code out there .. and hence I'll keep it with the
"generalized" head() & tail() in R-devel.

Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
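
For readers who have not yet tried the vector-n form of head()
mentioned in the NEWS entry, here is a minimal sketch. It assumes an
R version that already contains the change (R-devel at r77462 or
later); the objects `m` and `a` are made up purely for illustration
and are not part of the patch.

```r
## A small 4 x 5 matrix, purely illustrative
m <- matrix(1:20, nrow = 4, ncol = 5)

## vector n: first 2 rows and first 3 columns, i.e. the top-left "corner"
corner <- head(m, c(2L, 3L))
stopifnot(identical(corner, m[1:2, 1:3]))

## an NA in n[i] selects everything along that dimension
stopifnot(identical(head(m, c(2L, NA)), m[1:2, ]))

## a scalar n still works as before: the first n rows, all columns
stopifnot(identical(head(m, 2L), m[1:2, , drop = FALSE]))

## the same idea for a 3-d array: one n[i] per dimension
a <- array(1:24, dim = c(2, 3, 4))
sub <- head(a, c(1L, 2L, 2L))
stopifnot(identical(dim(sub), c(1L, 2L, 2L)))
```

The checks use stopifnot() in the same spirit as chkMat() above, so
the snippet stays silent when the behavior is as described and stops
with an error otherwise.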