[Bioc-devel] eSet for aCGH data

Vincent Carey 525-2265 stvjc at channing.harvard.edu
Fri Oct 5 15:10:00 CEST 2007


> I guess it would indeed be easiest to define a separate eSet subclass
> for the different 'stages' of aCGH data, being raw, normalized,
> segmented, called and regions. For this the ExpressionSet class can be
> used with virtually no changes, except for perhaps some methods. Or
> perhaps a single class, with a slot to define the type of data contained
> in the class and thus the way methods behave.
>
> Vincent, how does your cghSet class differ from the ExpressionSet class,
> and why?

For the "how", it would be best to see the code in Neve2006/R. Briefly,
cghSet contains eSet. There is a method "logRatios" that just grabs
the exprs element of the assayData. The package data component
neveCGHmatch is an instance of cghSet, and it has a man page. The
vignette gives some indications of how to work with the featureData
component of that structure.
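
To make the "how" concrete, here is a minimal sketch of a class shaped
like the description above. It is not the Neve2006 source: the class and
method names follow this message, but the toy matrix, the chrom/kb
annotation columns and the object names are invented for illustration.

```r
library(Biobase)

# Hypothetical re-creation for illustration only; see Neve2006/R for the
# actual class and method definitions.
setClass("cghSet", contains = "eSet")

setGeneric("logRatios", function(object) standardGeneric("logRatios"))
setMethod("logRatios", "cghSet", function(object) {
  # just grab the exprs element of the assayData
  assayDataElement(object, "exprs")
})

# Toy data (invented): 4 clones x 3 samples
lr <- matrix(seq(-0.5, 0.6, length.out = 12), nrow = 4,
             dimnames = list(paste0("clone", 1:4), paste0("s", 1:3)))
cloneInfo <- data.frame(chrom = c(1, 1, 2, 2), kb = c(100, 200, 50, 150),
                        row.names = rownames(lr))
sampleInfo <- data.frame(subtype = c("A", "B", "A"),
                         row.names = colnames(lr))

x <- new("cghSet",
         assayData = assayDataNew("lockedEnvironment", exprs = lr),
         phenoData = AnnotatedDataFrame(sampleInfo),
         featureData = AnnotatedDataFrame(cloneInfo))

# featureData travels with the assay data, so clones can be selected
# by their annotation
chr1 <- x[fData(x)$chrom == 1, ]
dim(logRatios(chr1))
```

The subsetting at the end is the kind of featureData use the vignette
hints at: clone annotation and log ratios stay in step because eSet
handles the bookkeeping.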

The real reason for Neve2006 is to define cghExSet, which contains eSet
but adds the slots cghAssays (an AssayData instance) and cloneMeta
(an AnnotatedDataFrame instance). The purpose of cghExSet is to have
a container for the Neve 2006 data, which combine expression and aCGH
data on the same samples. There have been some recommendations for
improvements from Martin Morgan that are awaiting implementation. I
am taking a very conservative (with respect to programming effort)
approach to this development because I am not a direct user of CGH
data and I have no real use cases.
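
As a rough sketch of that layout (again not the Neve2006 source), the
two extra slots could be declared as below. The slot names and their
classes come from the description above; the element name "logRatios"
inside cghAssays, the toy matrices and the sample annotation are
assumptions made up for the example.

```r
library(Biobase)

# Hypothetical re-creation for illustration only; the real cghExSet is
# defined in Neve2006/R.
setClass("cghExSet", contains = "eSet",
         representation(cghAssays = "AssayData",
                        cloneMeta = "AnnotatedDataFrame"))

# Toy data (invented): 5 genes and 4 clones measured on the same 3 samples
samples <- paste0("s", 1:3)
ex <- matrix(rnorm(15), nrow = 5,
             dimnames = list(paste0("gene", 1:5), samples))
cgh <- matrix(rnorm(12), nrow = 4,
              dimnames = list(paste0("clone", 1:4), samples))
cloneInfo <- data.frame(chrom = c(1, 1, 2, 2), row.names = rownames(cgh))
sampleInfo <- data.frame(subtype = c("A", "B", "A"), row.names = samples)

both <- new("cghExSet",
            # expression data go in the usual eSet assayData slot
            assayData = assayDataNew("lockedEnvironment", exprs = ex),
            phenoData = AnnotatedDataFrame(sampleInfo),
            # aCGH data and clone annotation go in the added slots
            cghAssays = assayDataNew("lockedEnvironment", logRatios = cgh),
            cloneMeta = AnnotatedDataFrame(cloneInfo))

# Both assays should refer to the same samples, in the same order
stopifnot(identical(sampleNames(both),
                    colnames(both@cghAssays[["logRatios"]])))
```

Nothing in this sketch enforces that the two assays share samples; a
validity method checking the sample names would be a natural addition.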

For the "why", I would say that we should extend eSet, not ExpressionSet,
to represent these data, which are conceptually distinct from expression
measures. But the design details for cghSet need to address use
cases. Presumably these will go beyond what is in the Neve2006 vignette,
and if they involve a series of classes like cghRawBatch, cghNorm and
cghSeg, for example, the stuff in Neve2006 may be irrelevant. cghSet as
I defined it could be discarded, or it could be regarded as a suitable
container for "cooked" aCGH results that uses the eSet infrastructure
appropriately. In my superficial review of aCGH-related software
in Bioconductor, I found nothing that used S4 to couple the sample
information closely to the assay data results. I feel that whatever
we do should allow this coupling at the earliest possible stage.
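
One way a stagewise series might look, purely as a sketch: each stage is
its own eSet subclass, and moving from one stage to the next carries the
sample and clone annotation along. The class names cghNorm and cghSeg
are taken from the examples above; toySegment, the running-median
smoother standing in for a real segmentation method, and the toy data
are all hypothetical.

```r
library(Biobase)

# Hypothetical stage classes; nothing like this exists in Neve2006.
setClass("cghNorm", contains = "eSet")
setClass("cghSeg", contains = "eSet")

# Placeholder "segmentation": a running median along the clones within
# each sample (clones assumed to be in genomic order). It stands in for
# a real segmentation algorithm.
toySegment <- function(x) {
  lr <- assayDataElement(x, "exprs")
  seg <- apply(lr, 2, function(v) as.numeric(stats::runmed(v, k = 3)))
  dimnames(seg) <- dimnames(lr)
  new("cghSeg",
      assayData = assayDataNew("lockedEnvironment", exprs = seg),
      phenoData = phenoData(x),      # sample information carried over
      featureData = featureData(x))  # clone annotation carried over
}

# Toy normalized data (invented): 6 clones x 3 samples
lr <- matrix(rnorm(18), nrow = 6,
             dimnames = list(paste0("clone", 1:6), paste0("s", 1:3)))
sampleInfo <- data.frame(subtype = c("A", "B", "A"),
                         row.names = colnames(lr))
normed <- new("cghNorm",
              assayData = assayDataNew("lockedEnvironment", exprs = lr),
              phenoData = AnnotatedDataFrame(sampleInfo))

segd <- toySegment(normed)

# The sample information is still attached after the stage change, so
# samples can be selected by phenotype at any stage
segA <- segd[, pData(segd)$subtype == "A"]
sampleNames(segA)
```

Whether separate classes per stage are worth the effort depends on the
use cases; the point here is only that the phenoData coupling is
available from the first stage onward.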

>
> Sjoerd
>
> -----Original Message-----
> From: Vincent Carey 525-2265 [mailto:stvjc at channing.harvard.edu]
> Sent: Thursday, October 04, 2007 16:14
> To: Vosse, S.J.
> CC: bioc-devel at stat.math.ethz.ch
> Subject: Re: [Bioc-devel] eSet for aCGH data
>
>
> > Dear all,
> >
> >
> >
> > first let me thank Martin Morgan and James MacDonald for their answers
> > to my question about the exprSet class on the Bioconductor mailing
> > list. They have been very helpful.
> >
> >
> >
> > I am thinking of adapting or extending the eSet class, or probably
> > ExpressionSet, to contain aCGH data for our package CGHcall. My
> > question is whether a similar class for aCGH data already exists, or
> > whether anyone has been working on it or has thoughts on the subject.
> > The class would be the same as ExpressionSet, except that it would
> > need slots for raw data, normalized data, segmented data, called data
> > and regions data
> > (http://la-press.com/cr_data/files/f_CIN-3-Wiel-et-al_96.pdf).
>
> You might have a look at the Neve2006 package, which is only in the
> devel experiment data branch (currently labeled 2.1 on the web site).
>
> I did not push this package into release because of the lack of
> consensus on the preferred representation of aCGH data. We have a
> number of packages, like aCGH, DNAcopy and snapCGH, that use their own
> representations.
>
> cghSet and cghExSet are defined in Neve2006. cghExSet confronts the
> problem of managing expression and CGH data obtained on the same
> samples.
>
> I question whether you should have a class that manages raw,
> normalized and segmented data together. We have used stagewise
> representations in the expression domain, with containers like
> oligoBatch for the raw intensities and ExpressionSet devoted to the
> expression level quantifications that will be analyzed downstream.
> Binding data together at various levels of processing may have some
> benefits, but also many costs if the data are voluminous.
>
> For what you mention, it seems that the normalized and/or called data
> and the regions data would be in the AssayData and featureData eSet
> slots, respectively.
>
> Once we get some consensus among multiple developers/users in place, it
> is likely that a central eSet derivative class devoted to aCGH data
> would be defined in Biobase (or some relevant, primarily core-maintained
> package) for interested developers to use.
>
> We can start a page on the developer's wiki devoted to this topic
> if there is sufficient interest.
> >
> >
> >
> > Sjoerd Vosse
> >
> >
> >
> > _______________________________________________
> > Bioc-devel at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
>

