On 07/12/2010 01:45 PM, cstrato wrote:
> Dear all,
>
> With great interest I followed the discussion:
> https://stat.ethz.ch/pipermail/r-devel/2010-July/057901.html
> since I have currently a similar problem:
>
> In a new R session (using xterm) I am importing a simple table
> "Hu6800_ann.txt" which has a size of 754KB only:
>
>> ann <- read.delim("Hu6800_ann.txt")
>> dim(ann)
> [1] 7129 11
>
>
> When I call "object.size(ann)" the estimated memory used to store "ann"
> is already 2MB:
>
>> object.size(ann)
> 2034784 bytes
>
>
> Now I call "split()" and check the estimated memory used which turns out
> to be 3.3GB:
>
>> u2p <- split(ann[,"ProbesetID"],ann[,"UNIT_ID"])
>> object.size(u2p)
> 3323768120 bytes

I guess things improve with stringsAsFactors=FALSE in read.delim?

Martin

>
> During the R session I am running "top" in another xterm and can see
> that the memory usage of R increases to about 550MB RSIZE.
>
>
> Now I do:
>
>> object.size(unlist(u2p))
> 894056 bytes
>
> It takes about 3 minutes to complete this call and the memory usage of R
> increases to about 1.3GB RSIZE. Furthermore, during evaluation of this
> function the free RAM of my Mac decreases to less than 8MB free PhysMem,
> until it needs to swap memory. When finished, free PhysMem is 734MB but
> the size of R increased to 577MB RSIZE.
>
> Doing "split(ann[,"ProbesetID"],ann[,"UNIT_ID"],drop=TRUE)" did not
> change the object.size, only processing was faster and it did use less
> memory on my Mac.
>
> Do you have any idea what the reason for this behavior is?
> Why is the size of list "u2p" so large?
> Do I make any mistake?
>
>
> Here is my sessionInfo on a MacBook Pro with 2GB RAM:
>
>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> Best regards
> Christian
> _._._._._._._._._._._._._._._._._._
> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
> V.i.e.n.n.a A.u.s.t.r.i.a
> e.m.a.i.l: cstrato at aon.at
> _._._._._._._._._._._._._._._._._._
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
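
A minimal sketch of why stringsAsFactors=FALSE would likely help, using synthetic data rather than the original "Hu6800_ann.txt" (the IDs, the table size and the variable names below are made up for illustration). With the read.delim default of that R version, a text column such as ProbesetID is read as a factor. split() on a factor returns one factor per group, each keeping the full set of levels, and object.size() adds up those levels once per list element. This is an interpretation of the reported numbers, not something verified against the original file.

```r
# Synthetic stand-in for the annotation table (hypothetical IDs and size).
n <- 1000
ann <- data.frame(
  ProbesetID = factor(sprintf("probe_%05d_at", seq_len(n))),
  UNIT_ID    = seq_len(n)
)

# Splitting a factor: every list element is itself a factor
# that still carries all n levels, even if it holds one value.
u2p_factor <- split(ann[, "ProbesetID"], ann[, "UNIT_ID"])
levels_per_element <- nlevels(u2p_factor[[1]])
values_per_element <- length(u2p_factor[[1]])

# Splitting the same IDs as plain character strings instead.
u2p_char <- split(as.character(ann[, "ProbesetID"]), ann[, "UNIT_ID"])

# object.size() estimates for both variants, in bytes.
size_factor <- as.numeric(object.size(u2p_factor))
size_char   <- as.numeric(object.size(u2p_char))

cat("levels per element:", levels_per_element, "\n")
cat("values per element:", values_per_element, "\n")
cat("size ratio factor/character:", size_factor / size_char, "\n")
```

If this reading is right, converting the column with as.character() before split(), or reading the file with stringsAsFactors=FALSE, should keep the list small. Note also that object.size() is only an estimate: as far as I know it does not account for sharing between the elements of a list, so the factor version may occupy less real memory than the figure it reports.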