[Rd] What does is() mean?

Mon Mar 17 12:01:41 MET 2003

Roger Koenker wrote:
> 
> Suppose you have a class, say sex, for lack of a better example, and
> you are tempted, in defining the behavior of the call,
> 
>         is(x,"sex")
> 
> to check whether certain basic features are satisfied, not to just trust the claim
> that x is specified to be of class "sex". 

Well, it depends.  Sometimes (in dispatching methods, e.g.) you don't
want to ask too deeply about what the object REALLY is, but would just
like to (excuse the expression in the context) "get on with it", given a
basic assertion.

That's fundamentally what is() does:  it simply checks the class
inheritance structure for class(x).
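
To make that concrete, here is a minimal sketch. The classes "person" and "employee" are hypothetical, made up only for illustration. There is no validity method, so is() answers purely from the inheritance structure and never looks at the contents of the slots.

```r
library(methods)

# Hypothetical classes, for illustration only
setClass("person", representation(name = "character"))
setClass("employee", contains = "person",
         representation(salary = "numeric"))

# A nonsensical salary; nothing here checks it
e <- new("employee", name = "A", salary = -1)

is(e, "employee")  # TRUE: the class of the object itself
is(e, "person")    # TRUE: "employee" extends "person"
is(e, "numeric")   # FALSE: no such inheritance
```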

The function defined to poke more deeply into the issue is
validObject():

	validObject(x)

checks as much as possible into whether x is a valid object of its
class.  Some checks are built in (slots of the right classes, etc.). 
Others if needed can be incorporated in the class definitions (argument
validity=) or via setValidity().  Validity checking uses inheritance, so
validity methods for contained classes will be applied.
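
A small sketch of both routes, assuming two hypothetical classes "base" and "derived" (the names and slots are invented for illustration). "base" gets its check through the validity= argument and "derived" through setValidity(). validObject() on a "derived" object then applies both checks.

```r
library(methods)

# Validity supplied in the class definition
setClass("base", representation(id = "character"),
         validity = function(object)
           if (length(object@id) == 1) TRUE
           else "'id' must be a single string")

setClass("derived", contains = "base",
         representation(x = "numeric", y = "numeric"))

# Validity supplied afterwards, via setValidity()
setValidity("derived", function(object)
  if (length(object@x) == length(object@y)) TRUE
  else "'x' and 'y' must have the same length")

# Helper (hypothetical) that returns "valid" or the error message
check <- function(object)
  tryCatch({ validObject(object); "valid" },
           error = function(e) conditionMessage(e))

obj <- new("derived")     # no slots supplied; fill them in by hand
obj@id <- "a"
obj@x <- c(1, 2, 3)
obj@y <- c(1, 2)
msg1 <- check(obj)        # fails the check defined for "derived"

obj@y <- c(1, 2, 3)
obj@id <- character(0)
msg2 <- check(obj)        # fails the check inherited from "base"

obj@id <- "a"
msg3 <- check(obj)        # passes both checks
```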

> Without delving into details
> further sanity checking of the structure of the object is sometimes prudent to
> avoid subsequent nonsense.  This checking could be built into the is method,
> but the documentation for "new" suggests that one might alternatively want to
> define a method, "initialize" that was used to create objects of class "sex"
> and not let people just create such objects willy-nilly.  My questions are these:
> 
> 1.  Am I correct in thinking that initialize is the right way to handle this?

Well, about the best there is, for a single attack.  See comment to your
point 3.

> 
> 2.  Are there examples of the initialize strategy, beyond the one given in
> the "new" documentation?

I believe some of the packages in Bioconductor use initialize methods. 
Others??

> 
> 3.  Are there efficiency issues that one should be cautious about?

Indeed.  The "obvious" strategy is:  if the class has a validity method,
direct or inherited, then initialize() should invoke it.  The default
initialize method does not (in either R or S-Plus).

Should that be changed?  Logically, I would say yes:  if the class
designer specified a validity method, it should not be possible to use
new() to create invalid objects.  But there is an efficiency penalty.

At the moment, you need to build your own initialize() method.  That's
not all bad--in the process you may also make the arguments to
initialize() reasonable names instead of ..., or otherwise get beyond
the notion of just supplying slot names in calls to new().

Here's a simple example (which I'll add to the initialize
documentation).  The validity method requires a single string for the
"id" slot.

setClass("a", representation(x = "numeric", id = "character"),
         validity = function(object)
             if(length(object@id) == 1) TRUE else
             "Expected a single string as the \"id\" slot")

and the initialize method calls validObject.

setMethod("initialize", "a", function(.Object, x = numeric(), id = "<>") {
  .Object@x <- x
  .Object@id <- id
  validObject(.Object)
  .Object})

With that definition, you get a check with new():

R> new("a", x=1:10, id=character())
Error in validObject(.Object) : Invalid "a" object: Expected a single
string as the "id" slot

[A couple of details for those interested.  The default values in the
initialize() method above are important.  Otherwise, simple calls such
as new("a") will fail.

Also, R (but not currently S-Plus) has a function callNextMethod() that
looks good for writing initialize methods.  It often is, but there is a
current bug that requires you to supply all the arguments to
callNextMethod in this case, contrary to the documentation.  With luck,
the bug will be fixed.  Meanwhile, the way to use callNextMethod in
initialize() is like this:

R> setMethod("initialize", "a", function(.Object, ...) {
+ x <- callNextMethod(.Object, ...)
+ validObject(x)
+ x})

]

> 
> 4.  Is there any way to enforce the Foucaultian imperative that any new
> "sex" object has to pass through the initialize phase?

Well, as someone might have said, it depends what you mean by "new".

The above mechanism pretty much ensures that objects will be valid when
created.

But it doesn't prevent some code from doing:
  x@id <- character(0)

So, one might define methods for "@<-" that included validity checking. 
Sometimes, though, (as in checking that two slots have the same length,
e.g.) it may take some care not to create invalid objects temporarily: 
the discipline of always creating the objects through a call to new()
will usually work, but once again with some slight efficiency penalty.
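
A minimal sketch of the gap, reusing the class "a" from the example earlier in this message. It does not define a "@<-" method; it only shows that a direct slot assignment goes through unchecked by the validity method, and that an explicit call to validObject() afterwards catches the problem. The variable names are just for illustration.

```r
library(methods)

setClass("a", representation(x = "numeric", id = "character"),
         validity = function(object)
           if (length(object@id) == 1) TRUE
           else "Expected a single string as the \"id\" slot")

obj <- new("a", x = 1, id = "ok")

# Accepted: the new value is still a character vector, and the
# validity method is not consulted on slot assignment
obj@id <- character(0)
stillAnA <- is(obj, "a")

# An explicit check afterwards reports the invalid slot
msg <- tryCatch({ validObject(obj); "valid" },
                error = function(e) conditionMessage(e))
```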

The whole area of valid objects is one that all of us interested folks
should discuss.

It would be nice to have some more "real" examples.

John

> 
> Oh, for those simple days of yesteryore when the peccadilloes of the president
> caused no harm, and the Dow was over 10,000...
> 
> Apologies in advance if this moment of sexistential doubt offends anyone.
> 
> url:    www.econ.uiuc.edu       Roger Koenker           Dept. of Economics UCL,
> email   rkoenker at uiuc.edu       Department of Economics Drayton House,
> vox:    217-333-4558            University of Illinois  30 Gorden St,
> fax:    217-244-6678            Champaign, IL 61820     London,WC1H 0AX, UK
>                                                         vox:    020-7679-5838
> 

-- 
John M. Chambers                  jmc at bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-2681
700 Mountain Avenue, Room 2C-282  fax:    (908)582-3340
Murray Hill, NJ  07974            web: http://www.cs.bell-labs.com/~jmc