Dear Dennis, David, Jeff, and Denes,

Thanks for your help and comments.  The simple approach seems good enough
for my work.
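
A minimal sketch of one simple approach along the lines discussed in this thread. The scen1, scen2, and scen3 objects and the make_scen() helper are made-up stand-ins for the real scenario outputs, and the sketch assumes each pop.inf.r component is a list of single numbers, so it is flattened with unlist() before the data frame is built.

```r
# Hypothetical stand-ins for the real scenario outputs (illustration only)
set.seed(1)
make_scen <- function(n = 5) {
  list(aql = 0.01, pop.inf.r = as.list(runif(n)))
}
scen1 <- make_scen()
scen2 <- make_scen()
scen3 <- make_scen()

# Collect the scenarios in one list instead of looking them up by name
scens <- list(scen1, scen2, scen3)

# Pull the same component out of each scenario and flatten it to a vector
cols <- lapply(scens, function(s) unlist(s[["pop.inf.r"]]))
names(cols) <- paste0("pop.inf.r", seq_along(cols))

# One column per scenario
new.df <- as.data.frame(cols)
str(new.df)
```

Keeping the scenarios in a list means the same lines work unchanged for any number of scenarios.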

Best,

Steve

On Wed, Dec 17, 2014 at 5:46 AM, Dénes Tóth <toth.de...@ttk.mta.hu> wrote:
>
> Dear Jeff,
>
> On 12/17/2014 01:46 AM, Jeff Newmiller wrote:
>
>> You are chasing ghosts of performance past, Denes.
>>
>
> In terms of memory efficiency, yes. In terms of CPU time, there can be
> a significant difference; see below.
>
>
>> The data.frame function causes no problems, and if it is used then the
>> OP would not need to presume they know the internal structure of the
>> data frame. See below. (I am using R 3.1.2.)
>>
>>
>> a1 <- list(x = rnorm(1e6), y = rnorm(1e6))
>> a2 <- list(x = rnorm(1e6), y = rnorm(1e6))
>> a3 <- list(x = rnorm(1e6), y = rnorm(1e6))
>>
>> # get names of the objects
>> out_names <- ls(pattern="a[[:digit:]]$")
>>
>> # amount of memory allocated
>> gc(reset=TRUE)
>>
>> # Explicitly call data frame
>> out2 <- data.frame( a1=a1[["x"]], a2=a2[["x"]], a3=a3[["x"]] )
>>
>> # No copying.
>> gc()
>>
>> # Your suggested retrieval method
>> out3a <- lapply( lapply( out_names, get ), "[[", "x" )
>> names( out3a ) <- out_names
>> # The "obvious" way to finish the job works fine.
>> out3 <- do.call( data.frame, out3a )
>>
>
> BTW, the even more "obvious" as.data.frame() produces the same result
> with an even more intuitive interface.
>
> However, for lists with a larger number of elements the transformation to
> a data.frame can be pretty slow. In the toy example, we created only a
> three-element list. Let's increase it a little bit.
>
> ---
>
> # this is not even that large
> datlen <- 1e2
> listlen <- 1e5
>
> # create a toy list
> mylist <- matrix(seq_len(datlen * listlen),
>                  nrow = datlen, ncol = listlen)
> mylist <- lapply(1:ncol(mylist), function(i) mylist[, i])
> names(mylist) <- paste0("V", seq_len(listlen))
>
>
> # define the more efficient function ---
> # note that I put class(x) first so that setattr does not
> # modify the attributes of the original input (see ?setattr,
> # you have to be careful)
> setAttrib <- function(x) {
>     class(x) <- "data.frame"
>     data.table::setattr(x, "row.names", seq_along(x[[1]]))
>     x
> }
>
> # benchmarking
> # (we do not need microbenchmark here, the differences are
> # extremely large) - on my machine, 9.4 sec, 8.1 sec vs 0.15 sec
> gc(reset=TRUE)
> system.time(df1 <- do.call(data.frame, mylist))
> gc()
> system.time(df2 <- as.data.frame(mylist))
> gc()
> system.time(df3 <- setAttrib(mylist))
> gc()
>
> # check results
> identical(df1, df2)
> identical(df1, df3)
>
> ----
>
> Of course for small datasets, one should use the built-in and safe
> functions (either do.call or as.data.frame). BTW, for the original
> three-element list, these are even faster than the workaround.
>
> All the best,
>   Denes
>
>> # No copying... well, you do end up with a new list in out3, but the
>> # data itself doesn't get copied.
>> gc()
>>
>>
>> On Tue, 16 Dec 2014, Dénes Tóth wrote:
>>
>>  On 12/16/2014 06:06 PM, SH wrote:
>>>
>>>> Dear List,
>>>>
>>>> I hope this posting is not redundant.  I have several list outputs with
>>>> the same components.  I ran a function under several scenarios (e.g.,
>>>> scen1, scen2, scen3, ..., scenN).  I would like to extract the same
>>>> component from each and combine them into a data frame.  For example,
>>>> pop.inf.r1 <- scen1[['pop.inf.r']]
>>>> pop.inf.r2 <- scen2[['pop.inf.r']]
>>>> pop.inf.r3 <- scen3[['pop.inf.r']]
>>>> ...
>>>> pop.inf.rN <- scenN[['pop.inf.r']]
>>>> new.df <- data.frame(pop.inf.r1, pop.inf.r2, pop.inf.r3,...,pop.inf.rN)
>>>>
>>>> My final output would be 'new.df'.  Could you help me do that
>>>> efficiently?
>>>>
>>>
>>> If efficiency is of concern, do not use data.frame() but create a list
>>> and add the required attributes with data.table::setattr (the setattr
>>> function of the data.table package). (You can also consider creating a
>>> data.table instead of a data.frame.)
>>>
>>> # some largish lists
>>> a1 <- list(x = rnorm(1e6), y = rnorm(1e6))
>>> a2 <- list(x = rnorm(1e6), y = rnorm(1e6))
>>> a3 <- list(x = rnorm(1e6), y = rnorm(1e6))
>>>
>>> # amount of memory allocated
>>> gc(reset=TRUE)
>>>
>>> # get names of the objects
>>> out_names <- ls(pattern="a[[:digit:]]$")
>>>
>>> # create a list
>>> out <- lapply(lapply(out_names, get), "[[", "x")
>>>
>>> # note that no copying occurred
>>> gc()
>>>
>>> # decorate the list
>>> data.table::setattr(out, "names", out_names)
>>> data.table::setattr(out, "row.names", seq_along(out[[1]]))
>>> class(out) <- "data.frame"
>>>
>>> # still no copy
>>> gc()
>>>
>>> # output
>>> head(out)
>>>
>>>
>>> HTH,
>>>  Denes
>>>
>>>
>>>
>>>> Thanks in advance,
>>>>
>>>> Steve
>>>>
>>>> P.S.:  Below are some examples of summary outputs.
>>>>
>>>>
>>>> > summary(scen1)
>>>>                  Length Class  Mode
>>>> aql                1   -none- numeric
>>>> rql                1   -none- numeric
>>>> alpha              1   -none- numeric
>>>> beta               1   -none- numeric
>>>> n.sim              1   -none- numeric
>>>> N                  1   -none- numeric
>>>> n.sample           1   -none- numeric
>>>> n.acc              1   -none- numeric
>>>> lot.inf.r          1   -none- numeric
>>>> pop.inf.n       2000   -none- list
>>>> pop.inf.r       2000   -none- list
>>>> pop.decision.t1 2000   -none- list
>>>> pop.decision.t2 2000   -none- list
>>>> sp.inf.n        2000   -none- list
>>>> sp.inf.r        2000   -none- list
>>>> sp.decision     2000   -none- list
>>>>
>>>> > summary(scen2)
>>>>                  Length Class  Mode
>>>> aql                1   -none- numeric
>>>> rql                1   -none- numeric
>>>> alpha              1   -none- numeric
>>>> beta               1   -none- numeric
>>>> n.sim              1   -none- numeric
>>>> N                  1   -none- numeric
>>>> n.sample           1   -none- numeric
>>>> n.acc              1   -none- numeric
>>>> lot.inf.r          1   -none- numeric
>>>> pop.inf.n       2000   -none- list
>>>> pop.inf.r       2000   -none- list
>>>> pop.decision.t1 2000   -none- list
>>>> pop.decision.t2 2000   -none- list
>>>> sp.inf.n        2000   -none- list
>>>> sp.inf.r        2000   -none- list
>>>> sp.decision     2000   -none- list
>>>>
>>>> > summary(scen3)
>>>>                  Length Class  Mode
>>>> aql                1   -none- numeric
>>>> rql                1   -none- numeric
>>>> alpha              1   -none- numeric
>>>> beta               1   -none- numeric
>>>> n.sim              1   -none- numeric
>>>> N                  1   -none- numeric
>>>> n.sample           1   -none- numeric
>>>> n.acc              1   -none- numeric
>>>> lot.inf.r          1   -none- numeric
>>>> pop.inf.n       2000   -none- list
>>>> pop.inf.r       2000   -none- list
>>>> pop.decision.t1 2000   -none- list
>>>> pop.decision.t2 2000   -none- list
>>>> sp.inf.n        2000   -none- list
>>>> sp.inf.r        2000   -none- list
>>>> sp.decision     2000   -none- list
>>>>
>>>>     [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------
>>
>
