[Bioc-devel] RFC: eSet with two color data
Wolfgang Huber
huber at ebi.ac.uk
Wed Mar 21 14:56:09 CET 2007
Dear all,
I hope that this question is not too tedious for those who have already
thought hard about it, but I am not aware of consensus and good
documentation in Biobase on this topic:
How can we best represent preprocessed, normalised data from a set of
two- (or n-) colour arrays in an eSet-like structure? I would like to
keep the intensity information of each channel, and not reduce to
M-values, since that loses information.
I see two options:
A) in an ExpressionSet-derivative called e.g. "ExpressionSetWithColors"
with ncol = n times the number of arrays, and with mandatory phenoData
columns named e.g. "arrayID" and "dye".
B) in an eSet-derivative with ncol = the number of arrays, and n
congruent matrices in the assayData slot.
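To make the two layouts concrete, here is a minimal sketch in Python with plain lists and dicts rather than Biobase classes. The feature, array and dye names and all intensity values are made-up toy data, and the helper `to_option_a` is hypothetical, not an existing function. It only illustrates that B (n congruent matrices) can be flattened into A (one wide matrix plus "arrayID" and "dye" annotation) without losing any intensity.

```python
# Hypothetical toy data: 3 features, 2 arrays, 2 dyes (all names illustrative).
features = ["f1", "f2", "f3"]
arrays = ["array1", "array2"]
dyes = ["Cy3", "Cy5"]

# Option B: one features-by-arrays matrix per channel, all of the same shape.
option_b = {
    "Cy3": [[10.0, 11.0], [20.0, 21.0], [30.0, 31.0]],
    "Cy5": [[12.0, 13.0], [22.0, 23.0], [32.0, 33.0]],
}


def to_option_a(channels, arrays, dyes):
    """Flatten option B into option A.

    Returns a single matrix with len(arrays) * len(dyes) columns and a
    list of per-column annotation records with the keys "arrayID" and "dye".
    """
    # One annotation record per column, ordered array by array, dye by dye.
    pheno = [{"arrayID": a, "dye": d} for a in arrays for d in dyes]
    n_features = len(channels[dyes[0]])
    # Build each row in the same column order as the annotation records.
    exprs = [
        [channels[d][i][j] for j in range(len(arrays)) for d in dyes]
        for i in range(n_features)
    ]
    return exprs, pheno


exprs_a, pheno_a = to_option_a(option_b, arrays, dyes)
```

In this sketch the column order of `exprs_a` matches the row order of `pheno_a`, which is the same invariant an ExpressionSet keeps between its expression matrix and its phenoData.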
Currently I prefer A, because
- most of the infrastructure is already there and the additional work is
little
- in B, the interpretation of the phenoData columns gets mushy because
some columns will refer to the arrays, others to one particular sample
of the n hybridised to each array, and we need additional infrastructure
to resolve that.
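The phenoData point can be sketched in the same way, again in plain Python with invented covariates ("treatment" as a per-sample variable, "batch" as a per-array variable). The "variable.dye" naming and the `scope_b` table are assumptions made for illustration, standing in for the additional infrastructure that B would need. They are not an existing Biobase convention.

```python
# Option A: one annotation row per column, i.e. per (array, dye) sample.
# Every variable has the same meaning in every row.
pheno_a = [
    {"arrayID": "array1", "dye": "Cy3", "treatment": "control", "batch": "b1"},
    {"arrayID": "array1", "dye": "Cy5", "treatment": "treated", "batch": "b1"},
    {"arrayID": "array2", "dye": "Cy3", "treatment": "treated", "batch": "b2"},
    {"arrayID": "array2", "dye": "Cy5", "treatment": "control", "batch": "b2"},
]

# Option B: one annotation row per array. Per-sample variables have to be
# split by channel (hypothetical "variable.dye" naming), and a separate
# table must record what each column refers to.
pheno_b = [
    {"arrayID": "array1", "batch": "b1",
     "treatment.Cy3": "control", "treatment.Cy5": "treated"},
    {"arrayID": "array2", "batch": "b2",
     "treatment.Cy3": "treated", "treatment.Cy5": "control"},
]
scope_b = {
    "arrayID": "array",
    "batch": "array",
    "treatment.Cy3": "Cy3",
    "treatment.Cy5": "Cy5",
}


def treatment_a(pheno, array_id, dye):
    """Option A lookup: filter rows, then read one ordinary column."""
    for row in pheno:
        if row["arrayID"] == array_id and row["dye"] == dye:
            return row["treatment"]
    raise KeyError((array_id, dye))


def treatment_b(pheno, scope, array_id, dye):
    """Option B lookup: find the array row, then resolve the channel column."""
    column = "treatment." + dye
    if scope.get(column) != dye:
        raise KeyError(column)
    for row in pheno:
        if row["arrayID"] == array_id:
            return row[column]
    raise KeyError(array_id)
```

Both lookups return the same answer, but only the B version needs the extra scope table and a naming rule for per-channel columns.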
Is there anything that someone can point out that I am not aware of?
Also, on a different topic: do we already have an ontology in place
somewhere for control features (e.g. empty features, features measuring
a known spike-in ratio)?
Best wishes
Wolfgang
------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber