On Jul 30, 2013, at 9:01 AM, Mathieu Basille wrote:

> Dear list,
>
> Here is a simple example in which the behaviour of 'format' does not make
> sense to me. I have read the documentation and searched the archives, but
> nothing pointed me in the right direction to understand this behaviour. Let's
> start with a simple data frame:
>
> df1 <- data.frame(x = rnorm(110000), y = rnorm(110000), id = 1:110000)
>
> Let's now create a new variable 'id2' which is the character representation
> of 'id'. Note that I use 'scientific = FALSE' to ensure that long numbers
> such as 100,000 are not formatted using their scientific representation (in
> this case 1e+05):
>
> df1$id2 <- apply(df1, 1, function(dfi) format(dfi["id"], scientific = FALSE))
>
> Let's have a look at part of the result:
>
> df1$id2[99990:100010]
> [1] "99990" "99991" "99992" "99993" "99994" "99995" "99996"
> [8] "99997" "99998" "99999" "100000" "100001" "100002" "100003"
> [15] "100004" "100005" "100006" "100007" "100008" "100009" "100010"

Some formatting processes are carried out by system functions. In this case I am unable to reproduce with the same code on Mac OS 10.7.5 / R 3.0.1 Patched:

> df1$id2[99990:100010]
[1] "99990" "99991" "99992" "99993" "99994" "99995" "99996" "99997"
[9] "99998" "99999" "100000" "100001" "100002" "100003" "100004" "100005"
[17] "100006" "100007" "100008" "100009" "100010"

(I did notice that generation of the id2 variable seemed to take an inordinately long time.)

-- 
David.

>
> So far, so good. Let's now play with the 'digits' option:
>
> options(digits = 4)
> df2 <- data.frame(x = rnorm(110000), y = rnorm(110000), id = 1:110000)
> df2$id2 <- apply(df2, 1, function(dfi) format(dfi["id"], scientific = FALSE))
> df2$id2[99990:100010]
> [1] "99990" "99991" "99992" "99993" "99994" " 99995" " 99996"
> [8] " 99997" " 99998" " 99999" "100000" "100001" "100002" "100003"
> [15] "100004" "100005" "100006" "100007" "100008" "100009" "100010"
>
> Notice the extra leading space from 99995 to 99999? To make sure it only
> happened there:
>
> df2$id2[which(df1$id2 != df2$id2)]
> [1] " 99995" " 99996" " 99997" " 99998" " 99999"
>
> And just to make sure it only occurs in an 'apply' call, here is the same
> directly on a numeric vector:
>
> id2 <- format(1:110000, scientific = FALSE)
> id2[99990:100010]
> [1] " 99990" " 99991" " 99992" " 99993" " 99994" " 99995" " 99996"
> [8] " 99997" " 99998" " 99999" "100000" "100001" "100002" "100003"
> [15] "100004" "100005" "100006" "100007" "100008" "100009" "100010"
>
> Here the leading spaces are for every number, which makes sense to me. Is
> there anything I'm misinterpreting in the behaviour of 'format'?
> Thanks in advance for any hint,
> Mathieu.
>
>
> PS: Some background for this question. It all comes from a Rmd document, that
> knitr consistently failed to process, while the R code was fine using batch
> or interactive R.
> knitr uses 'options(digits = 4)' as opposed to
> 'options(digits = 7)' by default in R, which made one of my functions throw an
> error with knitr, but not with batch or interactive R. I managed to solve the
> problem using 'trim = TRUE' in 'format', but I still do not understand what's
> going on...
> If you're interested, see here for more details on the original problem:
> http://stackoverflow.com/questions/17866230/knitr-vs-interactive-r-behaviour/17872176
>
>
> --
>
> ~$ whoami
> Mathieu Basille, PhD
>
> ~$ locate --details
> University of Florida \\
> Fort Lauderdale Research and Education Center
> (+1) 954-577-6314
> http://ase-research.org/basille
>
> ~$ fortune
> « Le tout est de tout dire, et je manque de mots
> Et je manque de temps, et je manque d'audace. »
> -- Paul Éluard
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
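
The 'trim = TRUE' workaround mentioned in the PS can be sketched roughly as below. This is only an illustration and not code from the thread: the small data frame 'df' and the column names 'id2' and 'id3' are made up for the example, and the 'sprintf' variant is an alternative that was not discussed above. Calling 'format' once on the whole column, rather than row by row through 'apply', may also help with the slow generation of 'id2' noted earlier.

```r
# Save the current options and switch to the 'digits' setting that the
# original post reports for knitr.
old <- options(digits = 4)

# Small stand-in for the data frame in the original post (hypothetical).
df <- data.frame(id = 99990:100010)

# One vectorised call on the column instead of apply() over rows;
# trim = TRUE suppresses the padding to a common width.
df$id2 <- format(df$id, scientific = FALSE, trim = TRUE)

# Alternative for integer ids: "%d" does not pad unless a width is given.
df$id3 <- sprintf("%d", df$id)

# Check whether any value in either column contains a space.
any(grepl(" ", df$id2, fixed = TRUE))
any(grepl(" ", df$id3, fixed = TRUE))

# Restore the previous settings.
options(old)
```

The design choice here is simply to avoid depending on the common-width padding at all, so the result should not vary with 'options(digits)'.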