[R] Size of a refClass instance

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Fri May 3 15:47:43 CEST 2013


Interesting conclusion. Alternatively, that particular representation of your object model may simply not be computationally efficient. The discrepancy may be less pronounced in C++, but even there you may find that large numbers of small objects use memory and CPU time less efficiently than vector processing does. I read the point of Martin's response as "Don't confuse your mental model of the solution with its implementation".
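
As a rough illustration of what "vector processing" could mean here, the following is a minimal sketch, not anyone's actual implementation. It assumes the tree can be stored as a single data.frame with one row per node. The column names (id, parent, kind, value), the binary-heap parent layout, and the handlers list are all invented for the example.

```r
# Hypothetical layout: one row per node instead of one object per node.
n  <- 100000L
id <- seq_len(n)
nodes <- data.frame(
  id     = id,
  parent = c(NA_integer_, id[-1] %/% 2L),   # assumed binary-heap layout; root has no parent
  kind   = ifelse(id %% 2L == 0L, "even", "odd"),  # stand-in for the node "sub-class"
  value  = as.numeric(id),
  stringsAsFactors = FALSE
)

# Children of every node in one pass: a list keyed by parent id.
children <- split(nodes$id, nodes$parent)

# Number of children per node, without looping over nodes
# (the root's NA parent is dropped first).
n_children <- tabulate(nodes$parent[-1], nbins = n)

# Per-kind behaviour applied to whole groups at once,
# rather than one method call per node.
handlers <- list(
  even = function(v) v * 2,
  odd  = function(v) -v
)
result <- numeric(n)
for (k in names(handlers)) {
  sel <- nodes$kind == k
  result[sel] <- handlers[[k]](nodes$value[sel])
}

# Total footprint of the whole structure, for comparison with per-object costs.
print(object.size(nodes), units = "Kb")
```

The loop runs once per kind, not once per node, so the specialized behaviour is kept while the data stay in a few large vectors.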
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

David Kulp <dkulp at fiksu.com> wrote:

>Good tip.  Thanks Morgan.
>I agree that a different structure might (necessarily) be in order.  I
>wanted to create a tree where nodes in a tree were of different derived
>sub-classes -- possibly holding more data and behaving polymorphically.
>OO programming seemed ideal for this: lots of small things with
>specialized behavior -- but this isn't R's strength.
>
>On May 2, 2013, at 4:57 PM, Martin Morgan wrote:
>
>> On 05/01/2013 11:20 AM, David Kulp wrote:
>>> I'm using refClass for a complex multi-directional tree structure with
>>> possibly 100,000s of nodes.  The refClass design is very impressive and I'd
>>> love to use it, but I've found that refClass instances are very large and
>>> creation time is slow.  For example, below is a RefClass and a normal S4
>>> class.  The RefClass requires about 4KB per instance vs 500B for the S4
>>> class -- based on adding the Ncells and Vcells of used memory reported by
>>> gc().  And instantiation is more than twice as slow for a RefClass.  (R
>>> 2.14.2)
>>>
>>> Anyone have thoughts on this and whether there's any hope for improving
>>> resources on either front?
>> 
>> Hi David -- not necessarily helpful but creating a few large objects is
>> always better than creating many small ones in R, so perhaps
>> re-conceptualize your data structure? As a rough analogy, instead of
>> constructing a graph as a large number of 'Node' instances each pointing
>> to one another, a graph could be represented as a data.frame containing
>> columns of 'from' and 'to' indexes (neighbour-edge list, a few large
>> objects) or as an adjacency matrix. One would also implement creation and
>> update of the few large objects in an R-friendly (vectorized) way.
>> 
>> Perhaps there are existing packages that already model the data you're
>> interested in? If your multi-directional tree can be represented as a
>> graph, then perhaps
>>
>>  http://bioconductor.org/packages/release/bioc/html/graph.html
>>
>> including facilities in the Boost graph library (RBGL, on the Bioconductor
>> web site, too) or the igraph package can be put to use.
>> 
>> Martin
>> 
>>> 
>>> I wonder what others are doing.  I've been thinking about lightweight
>>> alternative implementations, but nothing particularly elegant has come to
>>> mind, yet!
>>> 
>>> Thanks!
>>> 
>>> 
>>> simple <- setRefClass('simple',
>>>     fields = list(a = "character", b = "numeric"))
>>> gc()
>>> system.time(simple.list <- lapply(1:100000, function(i) {
>>>     simple$new(a = 'foo', b = i)
>>> }))
>>> gc()
>>>
>>> setClass('simple2', representation(a = "character", b = "numeric"))
>>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>>     .Object@a <- a
>>>     .Object@b <- b
>>>     .Object
>>> })
>>>
>>> gc()
>>> system.time(simple2.list <- lapply(1:100000, function(i) {
>>>     new('simple2', a = 'foo', b = i)
>>> }))
>>> gc()
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> 
>> -- 
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>> 
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>


