On 29/04/2009 6:41 PM, Steven McKinney wrote:
Hi useRs,

A recent coding infelicity along these lines
yielded a corrupt data frame.

foo <- matrix(1:12, nrow = 3)
bar <- data.frame(foo)
bar$NewCol <- foo[foo[, 1] == 4, 4]
bar
lapply(bar, length)




> foo <- matrix(1:12, nrow = 3)
> bar <- data.frame(foo)
> bar$NewCol <- foo[foo[, 1] == 4, 4]
> bar
  X1 X2 X3 X4 NewCol
1  1  4  7 10   <NA>
2  2  5  8 11   <NA>
3  3  6  9 12   <NA>
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs
> lapply(bar, length)
$X1
[1] 3

$X2
[1] 3

$X3
[1] 3

$X4
[1] 3

$NewCol
[1] 0


Is this a bug in the data.frame machinery?
If an attempt is made to add a new column
to a data frame, and the new object does
not have length = number of rows of data frame,
or cannot be made to have such length via recycling,
shouldn't an error be thrown?

Instead in this example I end up with a
"corrupt data frame" having one zero-length column.


Should this be reported as a bug, or did I misinterpret
the documentation?

I don't think "$" uses any data.frame machinery. You are working at a lower level.

If you had added the new column using

bar <- data.frame(bar, NewCol=foo[foo[, 1] == 4, 4])

you would have seen the error:

Error in data.frame(bar, NewCol = foo[foo[, 1] == 4, 4]) :
  arguments imply differing number of rows: 3, 0
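
In the same spirit, the length check can be done by hand before a column is assigned. What follows is only a minimal sketch: add_column is a hypothetical helper, not part of base R, and the rule it applies (the length must equal the number of rows or divide it evenly) is an assumption about what "can be recycled" should mean here.

```r
# Hypothetical helper (not in base R): add a column only if its length
# equals the number of rows, or divides it evenly so it can be recycled.
add_column <- function(df, name, value) {
  n <- nrow(df)
  len <- length(value)
  if (len != n && (len == 0L || n %% len != 0L)) {
    stop(sprintf("cannot add column '%s': length %d does not fit %d rows",
                 name, len, n))
  }
  # Recycle explicitly so the stored column always has n elements.
  df[[name]] <- rep(value, length.out = n)
  df
}

foo <- matrix(1:12, nrow = 3)
bar <- data.frame(foo)

# A length-3 value is accepted.
bar <- add_column(bar, "NewCol", foo[, 4])

# The zero-length value from the original example is rejected;
# the error message is captured instead of stopping the script.
res <- tryCatch(add_column(bar, "Bad", foo[foo[, 1] == 4, 4]),
                error = function(e) conditionMessage(e))
```

Here res holds the error message rather than a data frame, and bar is left with its four original columns plus NewCol.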

But since you treated it as a list, R let you go ahead and create something that was labelled as a data.frame but wasn't a valid one. This is one of the reasons some people prefer S4 methods: it's easier to protect against people who mislabel things.
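
To make the "labelled as a data.frame but wasn't" point concrete, here is a rough sketch that builds such an object purely at list level and then checks it. The helper is_consistent is hypothetical, not a base R function. Because the behaviour of "$<-" on data frames may differ between R versions, the sketch drops the class, modifies the plain list, and puts the class back.

```r
foo <- matrix(1:12, nrow = 3)
bar <- data.frame(foo)

# Work on the underlying list, then relabel it as a data frame.
# Nothing checks the column lengths at this level.
bad <- unclass(bar)
bad$NewCol <- integer(0)
class(bad) <- "data.frame"

# Hypothetical check: every column must have as many elements (or rows)
# as the data frame claims to have rows.
is_consistent <- function(df) {
  all(vapply(unclass(df), NROW, integer(1)) == nrow(df))
}

ok_bar <- is_consistent(bar)
ok_bad <- is_consistent(bad)
```

With these definitions ok_bar is TRUE and ok_bad is FALSE, since NewCol has length 0 while the row names still describe 3 rows.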

Duncan Murdoch





sessionInfo()
R version 2.9.0 (2009-04-17) powerpc-apple-darwin8.11.1

locale:
en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] nlme_3.1-90

loaded via a namespace (and not attached):
[1] grid_2.9.0 lattice_0.17-22 tools_2.9.0

This also occurs on a Windows box with R 2.8.1.



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. V5Z 1L3
Canada

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
