[BioC] designing an eSet derived object

Martin Morgan mtmorgan at fhcrc.org
Fri Nov 5 18:33:40 CET 2010


On 11/05/2010 05:02 AM, Wolfgang RAFFELSBERGER wrote:
> Dear list,
> 

> basically I'm trying to design an object to contain the following
> microarray-data
> 1) "gxIndData": microarray-data normalized in parallel by (an
> array-dependent) number of n methods plus the corresponding
> expression-calls (again, <= n methods),
> 2) "gxAvData": derived values (replicate-averages, SEMs, etc),
> 3) gene/spot annotation,
> 4) sample-description,
> 5) various supplementary information (parameters, notes, versions, etc.)
> 
> Overall, this is a somewhat modified/extended version of the Biobase
> eSet concept, and I'm trying to figure out whether I can use the
> Biobase eSet directly. That way I hope to maintain a decent level of
> compatibility with other Bioconductor methods and allow code reuse.
> 
> Now I'd like to store the various sections of 1) and 2) as separate
> lists of n matrices of values to keep things organized.
> 
> Following section 5 ("Extending eSet") of the vignette "Biobase
> development and the new eSet", I defined a new class that extends
> 'eSet'. But as soon as I put anything other than matrices into
> 'AssayData', I get an error message (see code below), no matter
> whether these are plain lists or custom objects. I suppose this means
> that I would have to store all matrices (up to 10 * 6 methods = 60
> matrices) without further organization at the level of 'AssayData'.

eSet requires that all AssayData elements are two-dimensional with
identical dimensions, so a list-of-matrices would not work.
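
A minimal sketch of that constraint, assuming toy matrices and made-up
element names ('rma', 'mas5'): matrices with identical dimensions are
accepted by assayDataNew(), whereas list elements are what produced the
"invalid dimensions" error quoted further down with the Biobase version
used in this thread. The failing call is wrapped in tryCatch() so the
outcome is only reported, not assumed.

```r
library(Biobase)

# Toy data, made up for illustration: 4 features x 3 samples.
m <- matrix(as.numeric(1:12), nrow = 4)

# Two matrices with identical dimensions are accepted as AssayData.
ok <- assayDataNew(rma = m, mas5 = 2 * m)

# Lists have no dim(); capture whatever happens instead of stopping.
res <- tryCatch(assayDataNew(A = list(), B = list()),
                error = function(e) conditionMessage(e))
print(res)
```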

> However, I'd like to keep at least one (in my case preferably two)
> additional levels of hierarchy to keep the data organized.
> 
> So, finally I would like to integrate two new classes for 1) and 2)
> at the level of the assayData slot of my modified/new eSet.
> 
> Does this mean that this is not possible and that I cannot use 'eSet'
> for my purposes? Do I have to create a new class that is roughly
> equivalent to 'eSet' but ultimately incompatible with it?
> 
> Any suggestions or hints?

One possibility, if this is for your own use and not as the foundation
for a package, is to use NChannelSet, where each method is a 'channel'.
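
A rough sketch of the NChannelSet idea, assuming two toy matrices and
hypothetical channel names 'rma' and 'mas5' standing in for the
normalization methods; real data would also carry phenoData and
featureData.

```r
library(Biobase)

# Toy data, made up for illustration: 4 features x 3 samples per method.
rma  <- matrix(as.numeric(1:12), nrow = 4)
mas5 <- 2 * rma

# Each normalization method becomes one channel (names are hypothetical).
obj <- NChannelSet(rma = rma, mas5 = mas5)

channelNames(obj)                    # the available channels
es  <- channel(obj, "rma")           # one channel as an ExpressionSet
sub <- selectChannels(obj, "mas5")   # subset, still an NChannelSet
```

Extracting a channel as an ExpressionSet is what keeps this layout usable
with code written for ExpressionSet.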

Another possibility is to create a class that extends eSet with a slot
containing, e.g., an AnnotatedDataFrame with columns describing the
AssayData, and a method to query the slot / select the appropriate
assayData elements.
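
One way this could look, as a sketch only: the class name 'gxSetSketch',
the slot name 'assayInfo', the generic 'assaysByMethod' and the
"<method>.<kind>" element names are all hypothetical, and the data are
toy values.

```r
library(Biobase)

# Hypothetical class: eSet plus a slot describing each assayData element.
setClass("gxSetSketch", contains = "eSet",
         representation(assayInfo = "AnnotatedDataFrame"))

# Peel off the extra slot and let eSet's initialize handle the rest.
setMethod("initialize", "gxSetSketch",
    function(.Object, ..., assayInfo = new("AnnotatedDataFrame")) {
        .Object <- callNextMethod(.Object, ...)
        .Object@assayInfo <- assayInfo
        .Object
    })

# Query method: return the assayData elements belonging to one method.
setGeneric("assaysByMethod",
    function(object, method) standardGeneric("assaysByMethod"))
setMethod("assaysByMethod", "gxSetSketch", function(object, method) {
    info <- pData(object@assayInfo)
    nms <- rownames(info)[info$method == method]
    sapply(nms, function(nm) assayDataElement(object, nm), simplify = FALSE)
})

# Toy data: 4 features x 3 samples; one row of 'info' per assayData element.
m <- matrix(as.numeric(1:12), nrow = 4)
info <- AnnotatedDataFrame(data.frame(
    method = c("rma", "rma", "mas5"),
    kind   = c("SI", "call", "SI"),
    row.names = c("rma.SI", "rma.call", "mas5.SI"),
    stringsAsFactors = FALSE))

obj <- new("gxSetSketch",
           assayData = assayDataNew(rma.SI = m, rma.call = m > 6,
                                    mas5.SI = 2 * m),
           assayInfo = info)
names(assaysByMethod(obj, "rma"))
```

The matrices stay flat in assayData, so the eSet validity checks still
apply; the grouping lives only in the descriptive slot. A validity method
checking that the row names of 'assayInfo' match the assayData element
names would be a natural addition.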

And perhaps what you really have is more a list (or a list of lists) of
ExpressionSets, each element of the list carrying additional information.
An approach here would use the IRanges 'SimpleList' infrastructure, e.g.,

> lst = SimpleList(a=new("ExpressionSet"), b=new("ExpressionSet"))
> elementMetadata(lst) = DataFrame(method=c("A", "B"))
> lst[elementMetadata(lst)$method == "A"]
SimpleList of length 1
names(1): a
> lst[elementMetadata(lst)$method == "A"][[1]]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 0 features, 0 samples
  element names: exprs
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation:

Martin

> 
> Thanks in advance,
> wolfgang
> 
> ##
> 
>  require(Biobase)
>  setClass("gxSet", contains = "eSet")
>  setMethod("initialize", "gxSet", function(.Object, A=new("list"),B=new("list"),...) {
>    callNextMethod(.Object, A=A,B=B,  ...) })
>  new("gxSet")
>  ## produces :
>  Error in function (storage.mode = c("lockedEnvironment", "environment",  :
>    'AssayData' elements with invalid dimensions: 'A' 'B'
> 
> 
>  ## ideally I'd like to use
>  setClass("gxIndData",representation(SIdata="list",SIcall="list"))
>  setClass("gxAvData",representation(avSI="list",expressed="list",SEM="list", conCall="list",
>    FC="list",FiltFin="list",FiltSI="list",FiltOther="list"))
>  setClass("gxSet", contains = "eSet")
> 
>  setMethod("initialize","gxSet", function(.Object,
>    assayData=assayDataNew(IndData=IndData,AvData=AvData),
>    IndData=new("gxIndData"), AvData=new("gxAvData"),...) {
>    if(!missing(assayData) && any(!missing(IndData), !missing(AvData))) {
>      warning("using 'assayData'; ignoring 'IndData', 'AvData'") }
>    callNextMethod(.Object, assayData = assayData, ...)
>  })
> 
>  new("gxSet")
>  ## produces :
>  Error in assayDataNew(IndData = IndData, AvData = AvData) :
>    'AssayData' elements with invalid dimensions: 'AvData' 'IndData'
> 
> 
>  ## the alternative: an eSet-like but independent and incompatible object
>  setClass("gxSet",representation(IndData="gxIndData",AvData="gxAvData",phenoData="AnnotatedDataFrame",featureData="AnnotatedDataFrame",
>   experimentData="MIAME",annotation="character",protocolData="AnnotatedDataFrame",notes="list"))
> 
> 
> 
> ## for completeness:
> sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: i386-pc-mingw32/i386 (32-bit)
> 
> locale:
> [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
> [4] LC_NUMERIC=C                   LC_TIME=French_France.1252
> 
> attached base packages:
> [1] grDevices datasets  splines   graphics  stats     tcltk     utils     methods   base
> 
> other attached packages:
> [1] affy_1.28.0     Biobase_2.10.0  svSocket_0.9-50 TinnR_1.0.3     R2HTML_2.2      Hmisc_3.8-3     survival_2.35-8
> 
> loaded via a namespace (and not attached):
> [1] affyio_1.18.0         cluster_1.13.1        grid_2.12.0           lattice_0.19-13       preprocessCore_1.12.0
> [6] svMisc_0.9-60         tools_2.12.0
> 
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> Wolfgang Raffelsberger, PhD
> Laboratoire de BioInformatique et Génomique Intégratives
> IGBMC,
> 1 rue Laurent Fries,  67404 Illkirch  Strasbourg,  France
> Tel (+33) 388 65 3300         Fax (+33) 388 65 3276
> wolfgang.raffelsberger @ igbmc.fr
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


