[BioC] ExpressionSet or MAList

Gordon K Smyth smyth at wehi.EDU.AU
Tue May 6 10:08:31 CEST 2008


This is somewhat tangential to the discussion, because ExpressionSet and 
MAList can both store annotation, but I thought it might be interesting to 
explain why I view the ability to store more than one column of probe 
annotation in a microarray data object as essential. There are many 
reasons, including:

1. If the data object is subsetted frequently, the annotation should be 
subsetted along with it, so that the two stay in step.

2. I want to be able to come back to an analysis years afterwards and 
repeat it exactly, including the annotation, rather than being completely 
dependent on a constantly changing annotation package.  This is part of 
reproducible research as I see it.  Of course I also want to be able to 
update the annotation, but in a controlled way.

3. Some applications require annotation, for example the limma 
controlStatus() function.

4. I am frequently presented with academic arrays for which no single 
annotation column of unique probe identifiers is provided.  Instead 
several columns may be needed to identify the probe.  People who haven't 
had this experience are fortunate.

In general, the need to work with annotation as an associated data.frame 
is greater with "messier" microarray platforms such as academic two-colour 
cDNA arrays and with once-off custom platforms.
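
To make points 1 and 4 concrete, here is a minimal base-R sketch. It is 
not limma's implementation; the column names (Block, Row, Column, Name), 
the values and the "blank" label are all hypothetical. It only shows a 
multi-column annotation data.frame being kept in step with the data when 
probes are dropped, and a composite probe identifier being built from 
several columns when no single column is unique.

```r
# Hypothetical probe annotation for a small two-colour array. No single
# column is unique, so several columns together identify a probe.
genes <- data.frame(
  Block  = c(1, 1, 2, 2),
  Row    = c(1, 2, 1, 2),
  Column = c(1, 1, 1, 1),
  Name   = c("geneA", "blank", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Hypothetical log-ratios: one row per probe, one column per array.
M <- matrix(c(0.5, 0.0, 0.4, -1.2,
              0.7, 0.1, 0.3, -0.9),
            nrow = 4, dimnames = list(NULL, c("array1", "array2")))

# Point 4: build a composite identifier from several annotation columns
# and check that it is unique across probes.
probe_key <- paste(genes$Block, genes$Row, genes$Column, sep = ".")
stopifnot(anyDuplicated(probe_key) == 0)

# Point 1: apply the same row index to the data and to the annotation,
# so the two cannot drift out of step.
keep      <- genes$Name != "blank"
M_sub     <- M[keep, , drop = FALSE]
genes_sub <- genes[keep, , drop = FALSE]
```

An MAList stores a data.frame like this in its genes component, and 
subsetting the object by rows subsets that data.frame at the same time, 
so the bookkeeping in the last three lines is done for you.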

Gordon


> Date: Wed, 30 Apr 2008 10:10:21 -0700
> From: Martin Morgan <mtmorgan at fhcrc.org>
> Subject: Re: [BioC] ExpressionSet or MAList
> To: Daniel Brewer <daniel.brewer at icr.ac.uk>
> Cc: bioconductor at stat.math.ethz.ch
>
> Daniel Brewer <daniel.brewer at icr.ac.uk> writes:
>
>> To my mind MAList stores the annotation with the dataset which I feel is
>
> Storing annotations with the object can be a bad thing if the
> annotations are the same, because then there are effectively different
> variants of the same annotation, one for each object. These will
> inevitably drift apart, leading to confusion. There is also a memory
> use issue.
>
> That said, annotations can be added to ExpressionSet, specifically
> using featureData to store an AnnotatedDataFrame (data.frame +
> annotation on column labels).
>
>> an advantage whereas ExpressionSet is the base implementation for many
>> libraries.


