[R-pkg-devel] S3 length method behavior

Martin Maechler maechler at stat.math.ethz.ch
Wed Feb 3 08:52:24 CET 2016


>>>>> Barry Rowlingson <b.rowlingson at lancaster.ac.uk>
>>>>>     on Tue, 2 Feb 2016 17:23:46 +0000 writes:

    > On Tue, Feb 2, 2016 at 3:28 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
    >> I've found that it's a very bad idea to provide length or names
    >> methods for just this reason.

well, not quite, see below ..

    >>> After looking
    >>> for memory leaks and other errors I finally noticed that the str() on the
    >>> object of myClass looked odd. It returned something like this:
    >>> 
    >>> List of 82
    >>> $ file  : chr "my/file/location"
    >>> $ handle:<externalptr>
    >>> $ NA:
    >>> Error in object[[i]] : subscript out of bounds


    >>> My questions are, then, whether this behavior makes sense and what to do
    >>> about it. If I define my own str() method, will that fix it? I think I am
    >>> just misunderstanding what is going on with the methods I have defined.
    >>> Hopefully, someone can offer some clarity.

    > Defining a str on your class will at least fix the out of bounds error:

    > Create a trivial S3 class:

    >> z=list(1,22)
    >> class(z)="foo"

    > length method looks at the second element:

    >> length.foo=function(x){x[[2]]}
    >> length(z)
    > [1] 22

    > and str barfs:

    >> str(z)
    > List of 22
    > $ : num 1
    > $ : num 22
    > $ :Error in object[[i]] : subscript out of bounds

    > Define a str method:

    >> str.foo=function(object,...){
    >>   # loop over the underlying list, so length.foo is bypassed
    >>   for(i in seq_along(unclass(object))){str(unclass(object)[[i]])}}
    >> str(z)
    > num 1
    > num 22

    > BUT... the real problem is that S3 classes are seriously informal and
    > there's no concept of what methods you need to define on a class
    > because there's no concept of an "interface" that new classes have to
    > conform to. So stuff breaks, seemingly at random, and via action at a
    > distance. Somewhere something is going to expect z[[1]] to
    > z[[length(z)]] to exist, which is what the default str is doing...

Indeed.
Still, it can also be advantageous to define such methods *consistently*.

By *consistency*, I mean that at least

- names(obj) either returns NULL or a character vector of
  length length(obj), and that as Barry mentions,
- obj[[ i ]]  is meaningful  for (i in  seq_along(obj))
           [yes, seq_along(.) automatically works with your length() method !]

- often you'd also want  obj[ i ]  to work consistently
  (sometimes identically to `[[`)

I'd say that oftentimes it may be easier (and more "rewarding")
to define such `[` and `[[` methods for your class anyway.

As author of str(), I'll declare the design(*) of str() to be such
that with these methods (length, names, `[`, `[[`) defined
consistently, str.default(obj) already works sensibly.  
The alternative is indeed to define your own str() method.  

You will often want one of the two, because e.g.,
  str( list( <obj1>, <obj2> ) )
or similar things should work too.
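
A minimal sketch of what "consistently" could look like in practice. The
class name "msglist", its constructor new_msglist() and the field names are
hypothetical, chosen only for illustration; the point is that length(),
names(), `[[` and `[` all agree on what the "elements" of the object are.

```r
# Hypothetical class: the "elements" are the messages held in the
# internal field `msgs`; `file` is metadata and not an element.
new_msglist <- function(file, msgs) {
  structure(list(file = file, msgs = msgs), class = "msglist")
}

# length(): number of messages, not number of internal fields
length.msglist <- function(x) length(unclass(x)$msgs)

# names(): NULL or a character vector of length length(x)
names.msglist <- function(x) names(unclass(x)$msgs)

# `[[`: the i-th message, meaningful for every i in seq_along(x)
`[[.msglist` <- function(x, i, ...) unclass(x)$msgs[[i]]

# `[`: a smaller object of the same class
`[.msglist` <- function(x, i, ...) {
  y <- unclass(x)
  new_msglist(y$file, y$msgs[i])
}

obj <- new_msglist("my/file/location",
                   list(a = "hello", b = "world", c = "!"))

# The consistency conditions listed above
stopifnot(is.null(names(obj)) || length(names(obj)) == length(obj))
stopifnot(all(vapply(seq_along(obj),
                     function(i) is.character(obj[[i]]),
                     logical(1))))

# With the methods agreeing, the default str() can walk the object
str(obj)
```

The same object can then also be nested, as in str(list(obj, obj)), without
a dedicated str() method.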

--
(*) to be honest, str() grew and developed largely for historical
    reasons, so the above is more of an "implementation principle"

    > +1 on Hadley - don't override any basic R structural methods, create
    > new ones with new names. You can make them more meaningful too. For
    > your example, maybe "messageCount(myObject)"?

    > Barry
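
The alternative suggested above, a new generic with its own name, might be
sketched as follows. The object layout and the field name `messages` are
assumptions made for illustration; only the name messageCount comes from the
suggestion itself.

```r
# Hypothetical object: a plain list with a class attribute
obj <- structure(
  list(file = "my/file/location",
       messages = list("m1", "m2", "m3")),
  class = "myClass"
)

# A new generic instead of a length() method
messageCount <- function(x, ...) UseMethod("messageCount")
messageCount.myClass <- function(x, ...) length(x$messages)

messageCount(obj)  # 3: the number of messages
length(obj)        # 2: still the number of list components
```

Because length() is left alone, code that walks obj[[1]] to
obj[[length(obj)]], such as the default str(), keeps seeing an ordinary list.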


