[Bioc-devel] Biobase class versions

Martin Morgan mtmorgan at fhcrc.org
Wed May 17 21:52:30 CEST 2006


Bioconductor developers!

As anticipated last week, the data classes in the most recent svn
version of Biobase have now been changed so that they carry version
information. The hope is that this will make future changes to these
classes as transparent as possible.

* Affected classes

The affected classes are

eSet and its derivatives (ExpressionSet, SnpSet, and MultiSet)
exprSet
AnnotatedDataFrame
annotatedDataset, MIAME, phenoData

These classes now 'contain' either 'Versioned' or
'VersionedBiobase'. Previous instances (e.g., stored to disk) can be
updated to the new version using steps such as

> data(sample.ExpressionSet)
> sample.ExpressionSet <- updateObject(sample.ExpressionSet)
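
For objects saved to disk, the same steps can be wrapped around load()
and save(). The following is only a sketch: the temporary file stands in
for a real .rda file, and the loop assumes that every object in the file
should be passed through updateObject (objects without a specific
method are expected to come back unchanged).

```r
library(Biobase)

## Stand-in for an existing file on disk (hypothetical location).
data(sample.ExpressionSet)
rda <- tempfile(fileext = ".rda")
save(sample.ExpressionSet, file = rda)

## Load into a separate environment so the workspace is not overwritten.
env <- new.env()
load(rda, envir = env)

## Update each stored object in place.
for (name in ls(env))
    assign(name, updateObject(get(name, envir = env)), envir = env)

## Write the updated objects back to the same file.
save(list = ls(env), file = rda, envir = env)
```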

* How do you benefit?

The new information available to developers can be accessed as

> classVersion(sample.ExpressionSet)
            R       Biobase          eSet ExpressionSet 
      "2.4.0"     "1.11.10"       "1.0.0"       "1.0.0" 
> isVersioned(sample.ExpressionSet) # FALSE before update
[1] TRUE
> isCurrent(sample.ExpressionSet)
            R       Biobase          eSet ExpressionSet 
         TRUE         FALSE          TRUE          TRUE

You'll notice that the version information for this object is a named
list. The first two elements indicate the version of R and Biobase
used to create the object. The latter two elements are contained in
the class prototype, and the class prototype is consulted to see if
the instance of an object is 'current'. These lists can be subsetted
in the usual way, e.g.,

> isCurrent(sample.ExpressionSet)[c("eSet", "ExpressionSet")]
         eSet ExpressionSet 
         TRUE          TRUE 
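
The subsetting above could be folded into a small helper that decides
whether an object should be passed to updateObject. This is a sketch
only; needsUpdate is a hypothetical name and not part of Biobase, and
the default of checking just the object's own class is an assumption.

```r
library(Biobase)

## Hypothetical helper: TRUE if 'object' carries no version information,
## or if any of the named class versions is not current.
needsUpdate <- function(object, classes = class(object)) {
    if (!isVersioned(object))
        return(TRUE)
    current <- isCurrent(object)[classes]
    !isTRUE(all(current))
}

data(sample.ExpressionSet)
needsUpdate(sample.ExpressionSet, c("eSet", "ExpressionSet"))
```

The R and Biobase entries are deliberately left out of the default
check, since an object created under an earlier Biobase can still have
an up-to-date class structure.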

Versioned classes and updateObject and related methods simplify the
long-term maintenance of data objects. Take 'MultiSet' as an
example. This is a new class, and might undergo changes in its
structure at some point in the future. When these changes are
introduced, the developer will change the version number of the class
in its prototype (the last line, below):

setClass("MultiSet",
         contains = "eSet",
         prototype = prototype(
           new("VersionedBiobase",
               versions=c(classVersion("eSet"), MultiSet="1.0.1"))))

and add code to update to the new version

setMethod("updateObject", signature(object="MultiSet"),
          function(object, ..., verbose=FALSE) {
              if (verbose) message("updateObject(object = 'MultiSet')")
              object <- callNextMethod()
              if (isCurrent(object)["MultiSet"]) return(object)
              ## Create an updated instance.
              if (!isVersioned(object))
                  ## Radical surgery -- create a new, up-to-date instance
                  new("MultiSet",
                      assayData = updateObject(assayData(object)),
                      phenoData = updateObject(phenoData(object)),
                      experimentData = updateObject(experimentData(object)),
                      annotation = updateObject(annotation(object)))
              else {
                  ## Make minor changes, and update version by consulting class definition
                  classVersion(object)["MultiSet"] <-
                      classVersion("MultiSet")["MultiSet"]
                  object
              }
          })

For an existing MultiSet instance 'obj' created before the change,
updateObject then returns a new, enhanced object:

> classVersion(updateObject(obj))
        R   Biobase      eSet  MultiSet 
  "2.4.0" "1.11.11"   "1.0.0"   "1.0.1" 

As in the example, versioning helps in choosing which modifications to
perform -- minor changes for a slightly out-of-date object, radical
surgery for something more ancient. Version information might also be
used in methods, where changing class representation might facilitate
more efficient routines.
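
As a sketch of using version information inside a routine, the code
below branches on the class version. The class 'Toy', its slot 'x', the
function 'smallest', and the rule that 'x' is stored sorted from
version 1.0.1 onward are all hypothetical and serve only to illustrate
the idea.

```r
library(Biobase)

## Hypothetical class: from version 1.0.1 on, 'x' is assumed to be
## stored in increasing order.
setClass("Toy",
         representation(x = "numeric"),
         contains = "Versioned",
         prototype = prototype(
           new("Versioned", versions = c(Toy = "1.0.1"))))

## Pick the cheaper routine when the representation allows it.
smallest <- function(object) {
    if (classVersion(object)["Toy"] < "1.0.1")
        min(object@x)    # older instances: no ordering guarantee
    else
        object@x[1]      # newer instances: first element is the minimum
}

## Simulate an instance created under the older class version.
old <- new("Toy", x = c(3, 1, 2))
classVersion(old)["Toy"] <- "1.0.0"
smallest(old)
```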

* Versioned versus VersionedBiobase

The information on R and Biobase versions is present in eSet derived
classes because eSet contains VersionedBiobase. On the other hand,
phenoData contains Versioned, and has only information about its own
class version.

> classVersion(new("phenoData"))
phenoData 
  "1.0.0" 

The rationale for this is that phenoData is and will likely remain
relatively simple, and details about R and Biobase are probably
irrelevant to its use. On the other hand, some aspects of eSet and the
algorithms that operate on it are more cutting edge and more subject
to changes in R or Biobase. Knowing the version of R and Biobase used
to create an instance might provide valuable debugging information.
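
The difference can be seen directly by creating one instance of each
base class. This is a small sketch; 'Example' is an arbitrary,
hypothetical version label.

```r
library(Biobase)

## Same version label, two different base classes.
plain <- new("Versioned", versions = c(Example = "1.0.0"))
full  <- new("VersionedBiobase", versions = c(Example = "1.0.0"))

names(classVersion(plain))   # only the label supplied above
names(classVersion(full))    # additionally records R and Biobase
```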

* Adding Versioned information to your own classes

The key to versioning your own classes is to define your class to
'contain' Versioned or VersionedBiobase, and to add the version
information in the prototype. See the examples in the Versioned help
page.
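
A minimal sketch along the lines of those examples follows. The class
'Readings' and its slot 'values' are hypothetical; the updateObject
method mirrors the MultiSet method shown earlier, and the older
instance is simulated by setting its version by hand.

```r
library(Biobase)

## Hypothetical class: contains VersionedBiobase and records its own
## version in the prototype.
setClass("Readings",
         representation(values = "numeric"),
         contains = "VersionedBiobase",
         prototype = prototype(
           new("VersionedBiobase", versions = c(Readings = "1.0.1"))))

setMethod("updateObject", signature(object = "Readings"),
          function(object, ..., verbose = FALSE) {
              if (verbose) message("updateObject(object = 'Readings')")
              object <- callNextMethod()
              if (isCurrent(object)["Readings"]) return(object)
              if (!isVersioned(object)) {
                  ## No version information: build a fresh instance.
                  new("Readings", values = object@values)
              } else {
                  ## Minor change: bump the version from the class definition.
                  classVersion(object)["Readings"] <-
                      classVersion("Readings")["Readings"]
                  object
              }
          })

## Simulate an instance created under an older class version.
old <- new("Readings", values = c(1, 2, 3))
classVersion(old)["Readings"] <- "1.0.0"

updated <- updateObject(old)
isCurrent(updated)["Readings"]
```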

It is also possible to add arbitrary information to particular
instances, though these might not persist through updateObject.

> classVersion(obj)["MyID"] <- "0.0.1"
> classVersion(obj)
        R   Biobase      eSet  MultiSet      MyID 
  "2.4.0" "1.11.11"   "1.0.0"   "1.0.0"   "0.0.1" 
> classVersion(updateObject(obj))
        R   Biobase      eSet  MultiSet      MyID 
  "2.4.0" "1.11.11"   "1.0.0"   "1.0.1"   "0.0.1" 

There is considerable documentation about these classes and methods in
Biobase. I'll be watching for anomalies that these changes produce,
and am eager to respond to any questions or issues that arise.

Martin
-- 
Bioconductor


