[Rd] Problems initializing an extended S4 class

Mon Mar 3 11:51:14 CET 2008

Hi Martin,

Many thanks for the reply, that really helps a lot. Some follow-up 
observations below.

Martin Morgan wrote:
> Hi Jim --
> 
> I think your problems have to do with the way 'initialize' and
> 'validObject' work together. At some point in it's code, validObject
> converts an instance 'b' of a derived class (e.g., B) into that of the
> base class (e.g., A). It does this, by calling a = new("A"), and then
> copying the relevant slots from the instance of b to new a. So...
> 

With your comments as context, I stepped more carefully through relevant 
source code in the methods package. As you indicated, my trouble arises 
in the 'validObject' function. In particular, it's when 'as' is called 
to coerce the nascent B object into its superclass A. [For reference, 
it's in line 456 of the SClasses.R in the R-2.6.2 source tarball.]

This led me to try defining a simple B -> A coercion method:
setAs("B", "A", function(from, to) new(to, x=from at x))

Indeed, after doing this, I can successfully create an instance of B.

It still seems peculiar that the initialize and validity methods are 
both called twice for A when B is instantiated -- particularly given 
that the 'a' object itself was already initialized and checked for 
validity. There may well be a good reason for all this, but I still 
wonder if I'm doing something inefficiently.

In any case, although I haven't tested it, I'm optimistic that defining 
'coerce' methods in this fashion will also solve the problem in my real 
code without my needing to redesign the initialize and validity methods.

>> 1. Why is initialize invoked *twice* for A during instantiation of B?
> 
> ...the initialize method for A gets invoked twice, once when B is
> being created originally, and once when B is being checked for
> validity and is being converted to A.
> 
>> 2. The second time initialize is invoked for A, it appears .Object at x is 
>> only bound to its implicit prototype value of numeric(0). Why? Of 
> 
> ...and this is because of how A is being constructed from B -- with a
> called to new("A"), with no additional arguments.
> 
> Practically, this means that your classes have to be valid when
> created with new("A"). This imposes constraints on the prototype,and
> on the initialize method, and in the end I've found myself writing
> 'constructors' that process user-friendly arguments into a format that
> can be passed to 'new' as slots, e.g.,
> 
> A <- function(x=numeric(0), ...) {
>     x <- ifelse(log(x)<0, 1, x)
>     new("A", x=x, ...)
> }
> 
> In some ways I think this is the right thing to do anyway (a layer of
> abstraction between the user and implementation), even if the
> motivation might not be so pure.

Agreed, and I actually do have such user-level constructors. However, 
precisely in the interests of abstraction, I'd prefer not to expose the 
innards of my initialize methods in these functions. I admit this is 
partly on purity grounds, but more practical issues are (1) I may have 
several variants of the user-level constructors and don't want to repeat 
the common initialization code in each one, and (2) the operations I 
have in my initialize methods are only sensible for valid objects (and 
thus not meaningfully executed before 'new' is called). Hopefully this 
is all in the true spirit of the S4 object system...

Thanks again for the clear and helpful reply!

Cheers,
Jim

> Martin
> 
> 
> Jim Regetz <regetz at nceas.ucsb.edu> writes:
> 
>> Hi all,
>>
>> I am having trouble extending S4 classes in cases where I'm using both 
>> validity and initialize methods. I've read as much S4 information as I 
>> can find, but I've yet to have that "a-ha" moment.
>>
>> In my application, I am using validity methods to guard against 
>> inappropriate input data that my code has no way of handling, and 
>> initialize methods to detect and deal with fixable problems. As a toy 
>> example, consider classes A and B and associated methods as defined 
>> below. I use a validity method for A to complain about negative values, 
>> and an initialize method to "correct" small input values. B should 
>> simply extend A by adding an extra slot. The example is contrived, but 
>> it illustrates a key behavior that I also encounter in my real code:
>>
>> setClass("A", representation(x="numeric"))
>> setClass("B", representation(y="character"), contains="A")
>>
>> setValidity("A", function(object) {
>>      message("start validate A")
>>      retval <- NULL
>>      if (any(object at x<0))
>>          retval <- c(retval, "x must be positive")
>>      message("done validate A")
>>      if(is.null(retval)) return(TRUE) else return(retval)
>> })
>>
>> setMethod("initialize", "A", function(.Object, ...) {
>>      message("start init A")
>>      .Object <- callNextMethod()
>>      x <- .Object at x
>>      .Object at x <- ifelse(log(x)<0, 1, x)
>>      message("done init A")
>>      .Object
>> })
>>
>> setMethod("initialize", "B", function(.Object, ...) {
>>      message("start init B")
>>      callNextMethod()
>> })
>>
>>
>> # Creating an instance of A works just as I would expect
>>> a <- new("A", x=c(0.5, 2))
>> start init A
>> start validate A
>> done validate A
>> done init A
>>> a
>> An object of class “A”
>> Slot "x":
>> [1] 1 2
>>
>> # But subsequently creating a derived B object fails!
>>> new("B", a, y="foo")
>> start init B
>> start init A
>> start validate A
>> start init A
>> Error in checkSlotAssignment(object, name, value) :
>>    assignment of an object of class "logical" is not valid for slot "x"
>> in an object of class "A"; is(value, "numeric") is not TRUE
>>
>> The two things I haven't quite figured out are:
>>
>> 1. Why is initialize invoked *twice* for A during instantiation of B?
>>
>> 2. The second time initialize is invoked for A, it appears .Object at x is 
>> only bound to its implicit prototype value of numeric(0). Why? Of 
>> course, this leads to an error because the ifelse expression 
>> subsequently evaluates to logical(0) rather than a numeric vector as 
>> intended. Again, this is a contrived example, but a very real problem in 
>> my code.
>>
>> I suppose I could define a prototype for A that I know won't break my 
>> initialize method, but that seems inelegant and hard to maintain. Is 
>> there a better way to code this so that I can reliably instantiate B 
>> using a valid A object? Hopefully I've just got something wrong in the 
>> formals for my initialize methods or in my use of getNextMethod(), but 
>> I've had no luck trying some alternatives -- and ultimately I'd prefer 
>> to better understand the underlying behavior rather than stumble onto 
>> something that merely appears to work.
>>
>> I'd be grateful for any suggestions...
>>
>> Thanks,
>> Jim
>>
>> ------------------------------
>> James Regetz, Ph.D.
>> Scientific Programmer/Analyst
>> National Center for Ecological Analysis & Synthesis
>> 735 State St, Suite 300
>> Santa Barbara, CA 93101
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>