[R] How to: initialize, setValidity, copy-constructor

Martin Morgan mtmorgan at fhcrc.org
Fri Jul 10 15:30:56 CEST 2009


Renaud Gaujoux wrote:
> Thanks Martin,
> 
> I'll try that. One question about memory though, just to be sure. When
> one does:
> 
> # create new instance of A
> a <- new('A', ...)
> # create new instance of B based on a
> b <- new('B', a, b=...)
> 
> Will two instances of A be created (using twice the memory) or will
> there be a single instance and a reference to it (own by b the instance
> of B)?

Some insight can be had from compiling R with memory profiling enabled
(see the R installation and administration guide). tracemem then tags
its argument and reports when it is duplicated

> setClass("A", representation=representation(a="numeric"))
[1] "A"
> setClass("B", contains="A", representation=representation(b="numeric"))
[1] "B"
> tracemem(x <- 1:5)                      # tag x for profiling
[1] "<0x1c95c8f0>"
> tracemem(a <- new("A", a=x))            # two copies of x
tracemem[0x1c95c8f0 -> 0x1c95c9c8]: initialize initialize new
tracemem[0x1c95c9c8 -> 0x1c95caa0]: .Call slot<- initialize initialize new
[1] "<0x1c7a3050>"
> tracemem(b <- new("B", a))              # copy a, probably 'deep'
tracemem[0x1c7a3050 -> 0x1c77e8e8]: initialize initialize new
[1] "<0x1c761990>"
> bb1 <- initialize(b, a)                 # copy a, copy b twice
tracemem[0x1c7a3050 -> 0x1c712478]: initialize initialize
tracemem[0x1c761990 -> 0x1c6eb9a0]: initialize initialize
tracemem[0x1c6eb9a0 -> 0x1c6dfc78]: asMethod as<- initialize initialize
> bb2 <- initialize(b, b=x)               # copy b, copy x twice
tracemem[0x1c95c8f0 -> 0x1bf041a0]: initialize initialize
tracemem[0x1c761990 -> 0x1c676a50]: initialize initialize
tracemem[0x1bf041a0 -> 0x1c5ff240]: .Call slot<- initialize initialize

I don't know what to make of this, though, in terms of practical advice
-- data.frame(x=x) is reported to create three copies of x, for instance.

Martin

> 
> Thanks.
> 
> Martin Morgan wrote:
>> Hi Renaud --
>>
>> Renaud Gaujoux <renaud at mancala.cbio.uct.ac.za> writes:
>>
>>  
>>> Hello list,
>>>
>>> I'm having troubles setting up a basic calss hierarchy with S4.
>>> Here is a simplified schema of what I'd like to do:
>>>
>>> - Two classes: A and B that extends A
>>> - Ensure that the slots added by B are consistent with the slots of A
>>> - Let A initialize itself (I'm not supposed to know the internal
>>>   cooking of A)
>>> - By default set the slots of B based on the slots that A initialized
>>>
>>> Another question is: what is the recommended way of implementing a
>>> copy-constructor in R?
>>>
>>> I know that all of this is easily done in C++. The constructor of each
>>> class is called recursively back-up to the root class. Validity checks
>>> can be performed after/during associated
>>> initialization. Copy-constructor are basics in C++.
>>>
>>> Here below is a piece of code that I thought would work (but it does
>>> not... therefore my post), what's wrong with it?
>>> I think the main issue is when is the validity check performed: why is
>>> it performed before the end of the initialize method?
>>>     
>>
>> loosely, new("B", ...) calls initialize(prototypeOfB, ...).
>> initialize,B-method uses callNextMethod(), so initialize,A-method sees
>> as .Object the value prototypeOfB.  If initialize,A-method is
>> well-behaved, it'll call initialize,ANY-method, which also sees
>> prototypeOfB. You'll see that, when the ... argument is not empty
>>
>>   getMethod(initialize, "ANY")
>>
>> eventually calls validObject, in this case on prototypeOfB. Hence what
>> you are seeing, an 'early' check on the validity of B.
>>
>> There are many creative ways around this in initialize,B-method, e.g.,
>> assigning B slots before callNextMethod(), or explicitly creating a
>> new instance of A from appropriate supplied arguments (in
>> initialize,B-method, name arguments meant to initialize B slots and
>> pass ... to the A constructor) and using that to initialize B, etc.
>>
>> The approach I find most palatable (not meant to be real code) is to
>> have a constructor
>>
>>   B <- function(x, y, z, ...) {
>>      # do all the work to map x, y, z into slots of A, B (or an
>>      # instance of A and slots of B), then...
>>      new("B", a=, b=, ...) # or new("B", instanceOfA, b=, ...)
>>   }
>>
>> and avoid writing explicit initialize methods.
>> Oddly enough, this solution leads to a copy constructor, viz.,
>>
>>   initialize(instanceOfB, b=)
>>
>> I'm not sure that this really does anything more than move the
>> 'pattern' from the initialize method to the constructor.
>>
>> Martin
>>
>>  
>>> Thank you for your help.
>>> Renaud
>>>
>>> # define class A with a single numeric slot
>>> setClass('A', representation(a='numeric'))
>>>
>>> # define class B that extends class A, adding another numeric slot
>>> setClass('B', representation('A', b='numeric'))
>>> # we want for example to ensure that slots a and b have the same length
>>> setValidity('B',
>>> function(object){
>>> cat("*** B::validate ***\n")
>>> print(object)
>>> cat("*****************\n")
>>> if( length(object at a) != length(object at b) ) return('Inconsistent
>>> lengths')
>>> TRUE
>>> }
>>> )
>>> # As a default behaviour if b is not provided, we want slot b to be
>>>   equal to slot a
>>> setMethod('initialize', 'B',
>>> function(.Object, b, ...){
>>> cat("*** B::initialize ***\n")
>>> print(.Object)
>>>
>>> # Let the superclass (A) initialize itself via callNextMethod
>>> # I thought it would only do that: initialize and optionnaly validate
>>>   the class A part of the object
>>> #But it generates an ERROR: apparently it calls the validation method
>>> of B,
>>> # before leaving me a chance to set slot b to a valid value
>>> .Object <- callNextMethod(.Object, ...)
>>>
>>> # now deal with the class B part of the object
>>> cat("*** Test missing b ***\n")
>>> if( missing(b) ){
>>> cat("*** b is MISSING ***\n")
>>> b <- .Object at a
>>> }
>>>
>>> # set slot b
>>> .Object at b <- b
>>>
>>> .Object
>>> }
>>> )
>>>
>>> ### Testing
>>>
>>> # empty A: OK
>>> aObj <- new('A')
>>> aObj
>>>
>>> # class A with some data: OK
>>> aObj <- new('A', a=c(1,2) )
>>> aObj
>>>
>>> # empty B: OK
>>> bObj <- new('B')
>>> bObj
>>>
>>> # initialize B setting the slot of class A: ERROR
>>> bObj <- new('B', a=c(1,2))
>>>
>>> # initialize B setting only the slot class B: OK!! Whereas it produces
>>>   a non valid object.
>>> bObj <- new('B', b=c(1,2))
>>> bObj
>>>
>>> ######### RESULTS:
>>>
>>>  > # empty A: OK
>>>  > aObj <- new('A')
>>>  > aObj
>>> An object of class “A”
>>> Slot "a":
>>> numeric(0)
>>>
>>>  >
>>>  > # class A with some data: OK
>>>  > aObj <- new('A', a=c(1,2) )
>>>  > aObj
>>> An object of class “A”
>>> Slot "a":
>>> [1] 1 2
>>>
>>>  >
>>>  > # empty B: OK
>>>  > bObj <- new('B')
>>> *** B::initialize ***
>>> An object of class “B”
>>> Slot "b":
>>> numeric(0)
>>>
>>> Slot "a":
>>> numeric(0)
>>>
>>> *** Test missing b ***
>>> *** b is MISSING ***
>>>  > bObj
>>> An object of class “B”
>>> Slot "b":
>>> numeric(0)
>>>
>>> Slot "a":
>>> numeric(0)
>>>
>>>  >
>>>  > # initialize B setting the slot of class A: ERROR
>>>  > bObj <- new('B', a=c(1,2))
>>> *** B::initialize ***
>>> An object of class “B”
>>> Slot "b":
>>> numeric(0)
>>>
>>> Slot "a":
>>> numeric(0)
>>>
>>> *** B::validate ***
>>> An object of class “B”
>>> Slot "b":
>>> numeric(0)
>>>
>>> Slot "a":
>>> [1] 1 2
>>>
>>> *****************
>>> Error in validObject(.Object) :
>>> invalid class "B" object: Inconsistent lengths
>>>  >
>>>  > # initialize B setting only the slot class B: OK!! Whereas it
>>>      creates a non valid object.
>>>  > bObj <- new('B', b=c(1,2))
>>> *** B::initialize ***
>>> An object of class “B”
>>> Slot "b":
>>> numeric(0)
>>>
>>> Slot "a":
>>> numeric(0)
>>>
>>> *** Test missing b ***
>>>  > bObj
>>> An object of class “B”
>>> Slot "b":
>>> [1] 1 2
>>>
>>> Slot "a":
>>> numeric(0)
>>>
>>>
>>>
>>> -----------------------------
>>>  > sessionInfo()
>>> R version 2.9.1 (2009-06-26)
>>> x86_64-pc-linux-gnu
>>>
>>> locale:
>>> LC_CTYPE=en_ZA.UTF-8;LC_NUMERIC=C;LC_TIME=en_ZA.UTF-8;LC_COLLATE=en_ZA.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_ZA.UTF-8;LC_PAPER=en_ZA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_ZA.UTF-8;LC_IDENTIFICATION=C
>>>
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices datasets utils methods base
>>>
>>> other attached packages:
>>> [1] Biobase_2.4.1
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>     
>>
>>   
>




More information about the R-help mailing list