At 17:25 27.03.2007 +0200, Martin Maechler wrote:
> >>>>> "Herve" == Herve Pages <[EMAIL PROTECTED]>
> >>>>>     on Mon, 26 Mar 2007 20:48:33 -0700 writes:
>
>    Herve> Hi,
>
>    >> dd <- data.frame(A=c("b","c","a"), B=3:1)
>    >> dd
>    Herve>   A B
>    Herve> 1 b 3
>    Herve> 2 c 2
>    Herve> 3 a 1
>
>    >> unlist(dd)
>    Herve> A1 A2 A3 B1 B2 B3
>    Herve>  2  3  1  3  2  1
>
>    Herve> Someone else might get something different. It all
>    Herve> depends on the values of its 'stringsAsFactors' option:
>
>yes, and I don't like that (last) fact either.
>IMO, an option should never be allowed to influence such a basic
>function as data.frame().
>
>I know I would have had time earlier to start discussing this,
>but for some (probably good) reasons, I didn't get to it at the
>time.
>As Andy comments, everything is behaving as it should / is documented,
>including the 'stringsAsFactors' option;
>but personally, I really would want to consider changing
>the default for data.frame()'s stringsAsFactors back (as
>pre-R-2.4.0) to 'TRUE' instead of default.stringsAsFactors()
>which is a smart version of getOption("stringsAsFactors").
>I find it ok ("acceptable") if it's influencing read.table()
>but feel differently for data.frame().
>
>Martin
>

Martin!

I see the problem with options influencing "such a basic function as
data.frame()", but in my view the difficulty starts earlier. As I
understand it, data.frame() is _the_ basic way to store empirical
source data in R, and I found the earlier default behaviour,
converting character variables to factors, problematic. If that
conversion were only an internal process, invisible to the user, I
would not mind; but putting a character variable into a data frame
and getting a factor out of it is somewhat disturbing.

A naive user like me was especially confused by the fact that I could
read an SPSS file with spss.get (default: charfactor=FALSE) and get a
character variable in the resulting data.frame as a character
variable, but when I put it into a different data.frame it changed to
a factor.

I would wish for a data.frame() function that behaves as a "data
container", with the idea of rows (= cases) and columns (= variables),
but without changing the mode/class of the objects.

Heinz

>
>    >> dd2 <- data.frame(A=c("b","c","a"), B=3:1,
>    >>                   stringsAsFactors=FALSE)
>    >> dd2
>    Herve>   A B
>    Herve> 1 b 3
>    Herve> 2 c 2
>    Herve> 3 a 1
>
>    >> unlist(dd2)
>    Herve>  A1  A2  A3  B1  B2  B3
>    Herve> "b" "c" "a" "3" "2" "1"
>
>    Herve> Same thing with as.character:
>
>    >> as.character(dd)
>    Herve> [1] "c(2, 3, 1)" "c(3, 2, 1)"
>    >> as.character(dd2)
>    Herve> [1] "c(\"b\", \"c\", \"a\")" "c(3, 2, 1)"
>
>    Herve> Bug or "feature"?
>
>    Herve> Note that as.character applied directly on dd$A
>    Herve> doesn't have this "feature":
>
>    >> as.character(dd$A)
>    Herve> [1] "b" "c" "a"
>    >> as.character(dd2$A)
>    Herve> [1] "b" "c" "a"
>
>    Herve> Cheers, H.
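
For anyone following the thread, the conversion under discussion can be
reproduced with a minimal sketch like the one below. It passes
stringsAsFactors explicitly, so the result should not depend on the
session option or on the default of the R version in use, and it also
shows base R's I() as one way to keep a character column untouched. The
object names (x, d_fac, d_chr, d_asis) are made up for illustration and
do not come from the original posts.

```r
x <- c("b", "c", "a")

# Explicit arguments, so nothing here depends on the
# 'stringsAsFactors' option or on the version's default.
d_fac <- data.frame(A = x, B = 3:1, stringsAsFactors = TRUE)
d_chr <- data.frame(A = x, B = 3:1, stringsAsFactors = FALSE)

# I() marks a column "as is": data.frame() does not convert it,
# even when stringsAsFactors = TRUE.
d_asis <- data.frame(A = I(x), B = 3:1, stringsAsFactors = TRUE)

is.factor(d_fac$A)      # TRUE
is.character(d_chr$A)   # TRUE
is.character(d_asis$A)  # TRUE

# unlist() on the factor version yields the integer level codes,
# on the character version it yields strings.
unlist(d_fac)
unlist(d_chr)
```

Wrapping a column in I() is only a per-column workaround; it is not the
"data container" default asked for above.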

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel