>>>>> William Dunlap via R-devel <r-devel@r-project.org>
>>>>>     on Wed, 13 Jan 2016 13:46:05 -0800 writes:

> as.data.frame methods behave inconsistently when they are given a row.name
> argument of the wrong length. The matrix method silently ignores row.names
> if it has the wrong length and the numeric, integer, and character methods
> do not bother to check and thus make an illegal data.frame.
>
> > as.data.frame(matrix(1:6,nrow=3), row.names=c("One","Two"))
>   V1 V2
> 1  1  4
> 2  2  5
> 3  3  6
> > as.data.frame(1:3, row.names=c("One","Two"))
>     1:3
> One   1
> Two   2
> Warning message:
> In format.data.frame(x, digits = digits, na.encode = FALSE) :
>   corrupt data frame: columns will be truncated or padded with NAs
> > as.data.frame(c("a","b","c"), row.names=c("One","Two"))
>     c("a", "b", "c")
> One                a
> Two                b
> Warning message:
> In format.data.frame(x, digits = digits, na.encode = FALSE) :
>   corrupt data frame: columns will be truncated or padded with NAs

As I said yesterday, I want to "fix" this in R.

As Paul Grosu mentioned, the buggy -- too tolerant -- behavior is in the
as.data.frame.vector() method, whereas as.data.frame.matrix() simply
drops wrong row.names and uses default row names in that case.

This leaves (at least) two ways to change it:

 1) The *.matrix compatible one: simply forget wrong 'row.names'.
 2) Wrong row.names are a user error.

Now, '1)' would be more in line with the matrix method, but really feels
wrong, because it does not catch the user error and silently disregards an
explicitly specified argument.

For '2)' I propose a fix which will only *warn* about the wrong
'row.names' for now (so code which has implicitly relied on the wrong
behavior continues to work, but with a warning):

  > as.data.frame(1:3, row.names=c("One","Two"))
    1:3
  1   1
  2   2
  3   3
  Warning message:
  In as.data.frame.integer(1:3, row.names = c("One", "Two")) :
    'row.names' is not a character vector of length 3 -- omitting it. Will be an error!
  >

This will give new warnings in packages, and package authors can fix
these before the above eventually becomes an error.

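For illustration only, here is a minimal sketch of the kind of check
described above. The helpers check_row_names() and vector_to_df() are
hypothetical names made up for this example; this is not the actual change
to as.data.frame.vector(), just one way the "warn and fall back to default
row names" behavior could look:

```r
# Hypothetical helper (not part of base R): validate 'row.names' against
# the expected number of rows. On a length mismatch, warn and return NULL
# so that the caller falls back to default row names.
check_row_names <- function(row.names, nrows) {
  if (is.null(row.names)) return(NULL)
  if (length(row.names) != nrows) {
    warning(sprintf(
      "'row.names' is not a character vector of length %d -- omitting it.",
      nrows), call. = FALSE)
    return(NULL)
  }
  as.character(row.names)
}

# Hypothetical wrapper: build a one-column data frame from an atomic
# vector 'x', using 'row.names' only if it has the right length.
vector_to_df <- function(x, row.names = NULL) {
  rn <- check_row_names(row.names, length(x))
  data.frame(x = x, row.names = rn, stringsAsFactors = FALSE)
}
```

Under these assumptions, vector_to_df(1:3, row.names = c("One","Two"))
returns a three-row data frame with default row names and signals the
warning, while a row.names vector of length 3 is used as given.
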
The remaining question is whether the as.data.frame.matrix() method should
also produce the same warning about illegal row.names.

Interestingly, the *model.matrix* method does produce an error even now
when row.names of the wrong length are specified:

  > ff <- log(Volume) ~ log(Height) + log(Girth)
  > m <- model.frame(ff, trees)
  > mat <- model.matrix(ff, m)
  > data.frame(mat, row.names = paste0("r", 1:30))
  Error in data.frame(mat, row.names = paste0("r", 1:30)) :
    row names supplied are of the wrong length
  >

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel