[Bioc-devel] Biobase / eSet changes for this release

Rafael A Irizarry ririzarr at jhsph.edu
Thu Apr 13 15:17:26 CEST 2006


Looks good.

One comment:
Typically, a SnpSet will have more than just the allele calls and p-values.
It will also have a copy number estimate and its measure of
uncertainty. How hard is it to add these?
You would need one more array just like the one you used for the
genotype calls.
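
To make the request concrete, here is a rough base R sketch of the layout I have in mind: four matrices with the same dimensions, SNPs in rows and samples in columns. The names copyNumber and cnConfidence are only suggestions, and the SNP and sample labels are placeholders, not anything that exists in Biobase.

```r
# Placeholder SNP and sample labels, for illustration only.
snps <- c("snp1", "snp2", "snp3")
samples <- c("A", "B")

# Helper that builds an empty matrix with the shared dimensions.
blank <- function(fill) {
  matrix(fill, nrow = length(snps), ncol = length(samples),
         dimnames = list(snps, samples))
}

# 'call' and 'callProbability' are the required elements;
# 'copyNumber' and 'cnConfidence' are the hypothetical additions.
assay <- list(call = blank(NA_integer_),
              callProbability = blank(NA_real_),
              copyNumber = blank(NA_real_),
              cnConfidence = blank(NA_real_))

# All elements should share the dimensions of 'call'.
same_dims <- all(vapply(assay,
                        function(m) identical(dim(m), dim(assay$call)),
                        logical(1)))
```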


Martin Morgan wrote:

>Biobase/eSet developers,
>Here is a brief summary of the version of eSet to be included in
>this release; the code builds and checks without error, though missing
>documentation (to be corrected within the week) means that there are
>still warning messages during check.  The most recent changes are in
>There is one very recent change, to the overall class structure, that
>we agonized over a great deal before making at the last moment.  We
>recognize that this is very unfortunate timing, and that it will cause
>needless work for bioconductors; we will help out as much as possible.
>There are three major changes:
>
>1. Change in class structure.
>eSet -- VIRTUAL
>  ExpressionSet
>  SnpSet
>  (TilingSet -- not implemented)
>The main functionality of eSet is to coordinate assayData, phenoData,
>experimentData, and the annotation.  eSet is also a generalized
>container, with high-throughput data stored in the assayData
>slot. eSet is a VIRTUAL class; if you want to store and manipulate a
>consistent set of elements in the assay data slot you should create a
>subclass of eSet. An example of how to do this is below.
>ExpressionSet requires that the assayData slot contain matrix element
>'exprs'; other elements (of dimension identical to exprs) are
>permitted. as(exprSet, "ExpressionSet") coerces exprSet objects to
>ExpressionSet, perhaps issuing warnings if ambiguities arise.
>obj <- as(sample.exprSet, "ExpressionSet")
>SnpSet is meant to contain SNP data in a manner analogous to
>ExpressionSet; 'call' and 'callProbability' are required assayData
>elements providing information on the call and a statement of
>confidence in the call. The exact structure of these matrices is not
>specified, but the idea is that 'call' encodes diploid genotypes.
>
>2. Change in assayData storage
>The assayData slot is an AssayData class union of 'list' and
>'environment'; as a class union, there is no 'initialize'
>method. Instead, the list or environment can be populated with
>elements using a call to assayDataNew(...).
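
As a rough illustration of what a class union of 'list' and 'environment' means in S4, here is a sketch using only the methods package. AssayDataSketch is a made-up name chosen so that it does not collide with the real AssayData class; this is not the Biobase definition.

```r
library(methods)

# Hypothetical stand-in for a union of 'list' and 'environment'.
setClassUnion("AssayDataSketch", c("list", "environment"))

# Both member types are instances of the union; other types are not.
list_ok <- is(list(exprs = matrix(0, 2, 2)), "AssayDataSketch")
env_ok <- is(new.env(), "AssayDataSketch")
int_ok <- is(1:3, "AssayDataSketch")

# A class union is virtual, so it cannot be instantiated directly.
can_new <- tryCatch({
  new("AssayDataSketch")
  TRUE
}, error = function(e) FALSE)
```

Because the union cannot be instantiated with new(), a constructor function in the spirit of assayDataNew is the natural way to build a populated list or environment.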
>An innovation is the storageMode method, which can be used to change
>how elements in assayData are stored. In particular the storageMode
>can be 'lockedEnvironment', and indeed this is the default. An
>environment is locked in the sense that new elements cannot be added
>to the environment, and existing elements cannot be changed. This
>means that the pass-by-reference semantics of environments will not
>catch users off-guard:
>obj <- as(sample.exprSet,"ExpressionSet") # default: lockedEnvironment
>storageMode(obj) <- "environment"
>obj1 <- obj
>exprs(obj1) <- exprs(obj1)[1:10,1:5]
>dims(obj) # yikes! obj exprs dimensions changed!
>obj <- as(sample.exprSet,"ExpressionSet") # default: lockedEnvironment
>storageMode(obj) <- "environment"
>obj1 <- obj
>exprs(obj1) <- log(exprs(obj))
>identical(exprs(obj1),exprs(obj)) # TRUE: yikes again!
>obj <- as(sample.exprSet,"ExpressionSet") # default: lockedEnvironment
>obj1 <- obj
>exprs(obj1) <- log(exprs(obj1))
>identical(exprs(obj1),exprs(obj)) # FALSE: good!
>Note that attempts to directly change slots in locked environments
>cause an error
>>assayData(obj1)$exprs <- NULL
>Error: cannot change value of a locked binding. 
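
The same behaviour can be reproduced in base R without Biobase. This is a minimal sketch, assuming that a locked storage mode amounts to lockEnvironment() with bindings = TRUE; the variable names env and alias are illustrative.

```r
# An environment holding one matrix, then locked together with its bindings.
env <- new.env()
assign("exprs", matrix(1:6, nrow = 2), envir = env)
lockEnvironment(env, bindings = TRUE)

# Assignment does not copy an environment: both names refer to the same one.
alias <- env

# Changing an existing binding fails.
changed <- tryCatch({
  assign("exprs", NULL, envir = env)
  TRUE
}, error = function(e) FALSE)

# Adding a new binding fails too, through either name.
added <- tryCatch({
  assign("se", matrix(0, 2, 3), envir = alias)
  TRUE
}, error = function(e) FALSE)
```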
>The setReplaceMethod for exprs (and assayData) succeeds by performing
>a deep copy of the entire environment. Because this is very
>inefficient, the recommended paradigm to update an element in a
>lockedEnvironment is to extract it, make many changes, and then
>reassign it, e.g.,
>ex <- exprs(obj1)
># many changes, ex <- log(ex), ...
>exprs(obj1) <- ex
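
To show why one reassignment is cheaper than many, here is a base R sketch of what a copy-on-replace might look like. copy_env_with is a hypothetical helper written for this illustration, not the Biobase implementation.

```r
# Hypothetical helper: copy every element of a locked environment into a
# fresh one, replace one element, and lock the result.
copy_env_with <- function(env, name, value) {
  out <- new.env(parent = parent.env(env))
  for (nm in ls(env, all.names = TRUE)) {
    assign(nm, get(nm, envir = env), envir = out)
  }
  assign(name, value, envir = out)
  lockEnvironment(out, bindings = TRUE)
  out
}

# A locked environment standing in for assayData.
old <- new.env()
assign("exprs", matrix(c(1, 10, 100, 1000), nrow = 2), envir = old)
lockEnvironment(old, bindings = TRUE)

# Extract once, make the changes, reassign once: a single full copy.
ex <- get("exprs", envir = old)
ex <- log10(ex)
updated <- copy_env_with(old, "exprs", ex)
```

Each call to the helper walks every element of the environment, so making the changes on the extracted matrix and reassigning once keeps the number of full copies to one.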
>lockedEnvironment offers some efficiency in copying objects, because
>the environment is not copied during function calls. This is not
>completely satisfactory, though
>func <- function(assayData) # good: contents of env will not be copied
>  max(exprs(assayData)) # not so good: exprs copied from environment
>
>3. Changes in other slots
>Other slots have been changed to treat variable metadata more
>efficiently (in the AnnotatedDataFrame class of slot phenoData) and to
>simplify the type of data stored as experimentData. These changes are
>mostly in line with the web discussions.
>In making these changes, I have tried not to break the existing
>interface beyond what is necessary for the new functionality (e.g.,
>pData still returns the 'data' part of phenoData). One difference,
>though, is that the methods dim, ncol, etc. return a vector of
>dimensions reflecting the shared dimensionality of the assayData
>members; dims returns an array of dimensions of each element.
>These changes affect eSets; any difficulties you might have with
>exprSet probably reflect changes made several months ago to validity
>Please let me know of any feedback,
>The original 'sample.eSet' contains four elements in the assayData
>slot: R, G, Rb, Gb. To derive a class from eSet for this data, create
>a class, and provide initialization and validation
>methods. Optionally, update previous eSet data structures to your new
>class. For instance,
>setClass("SwirlSet", contains="eSet")
>setMethod("initialize", "SwirlSet",
>          function(.Object,
>                   phenoData = new("AnnotatedDataFrame"),
>                   experimentData = new("MIAME"),
>                   annotation = character(),
>                   R = new("matrix"),
>                   G = new("matrix"),
>                   Rb = new("matrix"),
>                   Gb = new("matrix"),
>                   ... ) {
>            callNextMethod(.Object,
>                           assayData = assayDataNew(
>                             R=R, G=G, Rb=Rb, Gb=Gb,
>                             ...),
>                           phenoData = phenoData,
>                           experimentData = experimentData,
>                           annotation = annotation)
>          })
>setValidity("SwirlSet", function(object) {
>  assayDataValidMembers(assayData(object), c("R", "G", "Rb", "Gb"))
>})
>obj <- updateOldESet(sample.eSet,"SwirlSet")