On 05/01/2013 11:20 AM, David Kulp wrote:
I'm using refClass for a complex multi-directional tree structure with
possibly hundreds of thousands of nodes.  The refClass design is very
impressive and I'd love to use it, but I've found that refClass instances
are very large and slow to create.  For example, below are a RefClass and a
normal S4 class.  The RefClass requires about 4KB per instance vs 500B for
the S4 class, based on adding the Ncells and Vcells of used memory reported
by gc().  Instantiation is also more than twice as slow for a RefClass.  (R
2.14.2)

Anyone have thoughts on this and whether there's any hope for improving
resources on either front?

Hi David -- not necessarily helpful, but creating a few large objects is always better than creating many small ones in R, so perhaps re-conceptualize your data structure? As a rough analogy, instead of constructing a graph as a large number of 'Node' instances each pointing to one another, a graph could be represented as a data.frame containing columns of 'from' and 'to' indexes (a neighbour-edge list, i.e. a few large objects) or as an adjacency matrix. One would also implement creation and update of the few large objects in an R-friendly (vectorized) way.
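
A minimal sketch of the edge-list idea, using base R only. The data and the helper names (`children`, `parent_of`) are made up for illustration and are not from any package; the point is only that lookups and updates work on whole vectors rather than on one object per node.

```r
# Hypothetical example data: a small tree stored as one data.frame of
# edges, instead of one object per node.
edges <- data.frame(from = c(1L, 1L, 2L, 2L), to = c(2L, 3L, 4L, 5L))

# Hypothetical helper: children of a single node, via a vectorized comparison.
children <- function(edges, id) edges$to[edges$from == id]

# Hypothetical helper: parents of many nodes at once; NA marks a node
# that never appears in the 'to' column (for example the root).
parent_of <- function(edges, ids) edges$from[match(ids, edges$to)]

# Adding several edges is a single rbind(), not a loop over new objects.
edges <- rbind(edges, data.frame(from = c(3L, 3L), to = c(6L, 7L)))

# The same information as an adjacency matrix, filled in one step by
# indexing with a two-column matrix of (row, column) positions.
n <- max(edges$from, edges$to)
adj <- matrix(0L, nrow = n, ncol = n)
adj[cbind(edges$from, edges$to)] <- 1L
```

Whether the edge list or the matrix is the better fit depends on how sparse the structure is; for a tree with 100,000s of nodes the dense matrix would be far too large, so the edge list (or a sparse matrix) is the more plausible choice.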

Perhaps there are existing packages that already model the data you're interested in? If your multi-directional tree can be represented as a graph, then perhaps

  http://bioconductor.org/packages/release/bioc/html/graph.html

including facilities in the Boost graph library (RBGL, on the Bioconductor web site, too) or the igraph package can be put to use.

Martin


I wonder what others are doing.  I've been thinking about lightweight
alternative implementations, but nothing particularly elegant has come to
mind, yet!

Thanks!


simple <- setRefClass('simple', fields = list(a = "character", b = "numeric"))
gc()
system.time(simple.list <- lapply(1:100000, function(i) {
  simple$new(a = 'foo', b = i)
}))
gc()

setClass('simple2', representation(a = "character", b = "numeric"))
setMethod("initialize", "simple2", function(.Object, a, b) {
  .Object@a <- a
  .Object@b <- b
  .Object
})

gc()
system.time(simple2.list <- lapply(1:100000, function(i) {
  new('simple2', a = 'foo', b = i)
}))
gc()
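
One way the per-instance figures could be derived from gc() is sketched below. This is an assumption about the measurement, not necessarily the exact procedure used for the numbers above: the helpers `used_mb` and `bytes_per_instance` are hypothetical names, and the result is only approximate because gc() reports memory in Mb rounded to one decimal place.

```r
library(methods)  # for setClass()/new() when run non-interactively

# Hypothetical helper: total memory in use in Mb, i.e. the Ncells and
# Vcells rows summed over the "(Mb)" column next to "used" in gc() output.
used_mb <- function() sum(gc()[, 2])

# Hypothetical helper: approximate bytes per object produced by `make`,
# from the difference in used memory before and after creating n objects.
bytes_per_instance <- function(make, n = 20000) {
  before <- used_mb()
  objs <- lapply(seq_len(n), make)  # kept alive until after the second gc()
  after <- used_mb()
  (after - before) * 1024^2 / n
}

setClass("simple2", representation(a = "character", b = "numeric"))
s4_bytes <- bytes_per_instance(function(i) new("simple2", a = "foo", b = i))
```

The same helper could be passed a function that calls `simple$new(...)` to get the corresponding estimate for the RefClass.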

______________________________________________ R-help@r-project.org mailing
list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting
guide http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
