Minor point and probably not relevant to the speed issue, but df() is the density function for the F distribution, so I have (recently) stopped using it for referring to data.frames.
Sean On 30 October 2013 23:32, Gabriel Becker <gmbec...@ucdavis.edu> wrote: > Hadley, > > As far as I can tell from a quick look, it is because implicit printing > uses a different mechanism which does a fair bit more work. > > >From comments in print.c in the R sources: > > * print.default() -> do_printdefault (with call tree below) > * > * auto-printing -> PrintValueEnv > * -> PrintValueRec > * -> call print() for objects > * Note that auto-printing does not call print.default. > * PrintValue, R_PV are similar to auto-printing. > > PrintValueEnv includes, among other things, checks for functions, S4 > objects, and s3 objects before constructing (in C code) an R call to print > for S3 objects and show for S4 objects and evaluating it using Rf_eval. So > there is an extra trip to the R evaluator. > > I imagine that extra work is where the hangup is but that is a > slightly-informed guess as I haven't done any detailed timings or checks. > > Basically my understanding of the processes is as follows: > > print(df) > print call is evaluated, S3 dispatch happens, print.default in C is called, > result printed to terminal, print call returns > > df > expression "df" evaluated, auto-print initiated, type of object returned by > expression is determined, print call is constructed in C code, print call > is evaluated in C code, THEN all the stuff above happens. > > I dunno if that helps or not as I can't speak to how to change/fix it atm. > > ~G > > > > On Wed, Oct 30, 2013 at 3:22 PM, Hadley Wickham <h.wick...@gmail.com> wrote: > >> Hi all, >> >> Can anyone help me understand why an implicit print (i.e. just typing >> df at the console), is so much slower than an explicit print (i.e. >> print(df)) in the example below? I see the difference in both Rstudio >> and in a terminal. >> >> # Construct large df as quickly as possible >> dummy <- 1:18e6 >> df <- lapply(1:10, function(x) dummy) >> names(df) <- letters[1:10] >> class(df) <- c("myobj", "data.frame") >> attr(df, "row.names") <- .set_row_names(18e6) >> >> print.myobj <- function(x, ...) { >> print.data.frame(head(x, 2)) >> } >> >> start <- proc.time(); df; flush.console(); proc.time() - start >> # user system elapsed >> # 0.408 0.557 0.965 >> start <- proc.time(); print(df); flush.console(); proc.time() - start >> # user system elapsed >> # 0.019 0.002 0.020 >> >> sessionInfo() >> # R version 3.0.2 (2013-09-25) >> # Platform: x86_64-apple-darwin10.8.0 (64-bit) >> # >> # locale: >> # [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 >> # >> # attached base packages: >> # [1] stats graphics grDevices utils datasets methods base >> >> Thanks! >> >> Hadley >> >> -- >> Chief Scientist, RStudio >> http://had.co.nz/ >> >> ______________________________________________ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > > > -- > Gabriel Becker > Graduate Student > Statistics Department > University of California, Davis > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel