This originally came up in this dplyr issue: https://github.com/tidyverse/dplyr/issues/5745
Where `tibble::column_to_rownames()` failed because it eventually checks
`.row_names_info(.data) > 0L` to see if there are automatic row names,
which is in line with the documentation that Kevin pointed out: "type = 1
the latter with a negative sign for ‘automatic’ row names."

Davis

On Tue, Feb 16, 2021 at 2:29 PM Bill Dunlap <williamwdun...@gmail.com> wrote:

> as.matrix.data.frame does not take the absolute value of that number:
>
> > dPos <- structure(list(X=101:103,201:203),class="data.frame",row.names=c(NA_integer_,+3L))
> > dNeg <- structure(list(X=101:103,201:203),class="data.frame",row.names=c(NA_integer_,-3L))
> > rownames(as.matrix(dPos))
> [1] "1" "2" "3"
> > rownames(as.matrix(dNeg))
> NULL
>
> -Bill
>
> On Tue, Feb 16, 2021 at 11:06 AM Kevin Ushey <kevinus...@gmail.com> wrote:
> >
> > Strictly speaking, I don't think this is a "corrupt" representation,
> > given that any APIs used to access that internal representation will
> > call abs() on the row count encoded within. At least, as far as I can
> > tell, there aren't any adverse downstream effects from having the row
> > names attribute encoded with this particular internal representation.
> >
> > On the other hand, the documentation in ?.row_names_info states, for
> > the 'type' argument:
> >
> >     integer. Currently type = 0 returns the internal "row.names" attribute
> >     (possibly NULL), type = 2 the number of rows implied by the attribute,
> >     and type = 1 the latter with a negative sign for ‘automatic’ row
> >     names.
> >
> > so one could argue that it's incorrect in light of that documentation
> > (the row names are "automatic", but the row count is not marked with a
> > negative sign). Or perhaps this is a different "type" of internal
> > automatic row name, since it was generated from an already-existing
> > integer sequence rather than "automatically" in a call to
> > data.frame().
> >
> > Kevin
> >
> > On Sun, Feb 14, 2021 at 6:51 AM Davis Vaughan <da...@rstudio.com> wrote:
> > >
> > > Hi all,
> > >
> > > I believe that the internal row names object created at this line in
> > > `row_names_gets()` should be using `-n`, not `n`.
> > > https://github.com/wch/r-source/blob/b30641d3f58703bbeafee101f983b6b263b7f27d/src/main/attrib.c#L71
> > >
> > > This can currently generate corrupt internal row names when using
> > > `attributes<-` or `structure()`, which calls `attributes<-`.
> > >
> > > # internal row names are typically `c(NA, -n)`
> > > df <- data.frame(x = 1:3)
> > > .row_names_info(df, type = 0L)
> > > #> [1] NA -3
> > >
> > > # using `attributes()` materializes their non-internal form
> > > attrs <- attributes(df)
> > > attrs
> > > #> $names
> > > #> [1] "x"
> > > #>
> > > #> $class
> > > #> [1] "data.frame"
> > > #>
> > > #> $row.names
> > > #> [1] 1 2 3
> > >
> > > # let's make a data frame from scratch with `attributes<-`
> > > data <- list(x = 1:3)
> > > attributes(data) <- attrs
> > >
> > > # oh no!
> > > .row_names_info(data, type = 0L)
> > > #> [1] NA 3
> > >
> > > # Note: Must have `nrow(df) > 2` to demonstrate this bug, as otherwise
> > > # internal row names are not attempted to be created in the C level
> > > # `row_names_gets()`
> > >
> > > Thanks,
> > > Davis
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
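
For reference, a minimal sketch of what the three `type` values of `.row_names_info()` return for automatic versus explicit row names, which is the distinction the `> 0L` check mentioned above relies on. The objects `df_auto`, `df_named` and the helper `has_real_rownames()` are made up for illustration and are not taken from tibble or dplyr; the `attributes<-` case from the thread is left out on purpose, since its result may depend on the R version.

```r
# Illustrative data frames (hypothetical names, not from the thread).
df_auto  <- data.frame(x = 1:3)                                # automatic row names
df_named <- data.frame(x = 1:3, row.names = c("a", "b", "c"))  # explicit row names

# type = 0L: the internal "row.names" attribute
auto_internal  <- .row_names_info(df_auto, type = 0L)   # compact form c(NA, -3L)
named_internal <- .row_names_info(df_named, type = 0L)  # the character vector itself

# type = 1L (the default): row count, negative when row names are automatic
auto_signed  <- .row_names_info(df_auto, type = 1L)
named_signed <- .row_names_info(df_named, type = 1L)

# type = 2L: row count without the sign
auto_n <- .row_names_info(df_auto, type = 2L)

# A check of the kind described in the thread (assumed shape, not tibble's code):
# TRUE only when the row names are not automatic.
has_real_rownames <- function(d) .row_names_info(d) > 0L
has_real_rownames(df_auto)
has_real_rownames(df_named)
```

A check written this way treats a positive count as "real" row names, which is why a compact `c(NA, n)` with positive `n` is misread by it.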