Wolfgang Huber huber at ebi.ac.uk
Wed Mar 21 14:56:09 CET 2007

Dear all,

I hope that this question is not too tedious for those who have already
thought hard about it, but I am not aware of consensus and good
documentation in Biobase on this topic:

How can we best represent preprocessed, normalised data from a set of
two- (or n-) colour arrays in an eSet-like structure? I would like to
keep the intensity information of each channel, and not reduce to
M-values, since that loses information.
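
To make the information loss concrete, here is a minimal sketch. It
assumes the usual two-colour definitions M = log2(R/G) and
A = (log2(R) + log2(G)) / 2. The intensities are made-up numbers, and
the function names are purely illustrative, not part of Biobase.

```python
import math

def m_value(red, green):
    """Log-ratio M = log2(red / green)."""
    return math.log2(red / green)

def a_value(red, green):
    """Average log-intensity A = (log2(red) + log2(green)) / 2."""
    return 0.5 * (math.log2(red) + math.log2(green))

# Hypothetical (red, green) intensities for two spots; illustrative only.
bright = (8192.0, 4096.0)
dim = (128.0, 64.0)

m_bright, m_dim = m_value(*bright), m_value(*dim)
a_bright, a_dim = a_value(*bright), a_value(*dim)

# Both spots share the same M, so M alone cannot tell them apart;
# only the per-channel intensities (or A) reveal the difference.
print(m_bright, m_dim)
print(a_bright, a_dim)
```

Keeping both channels (or, equivalently, M together with A) preserves
the overall intensity that M by itself discards.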

I see two options:

A) in an ExpressionSet-derivative called e.g. "ExpressionSetWithColors"
with ncol = n times the number of arrays, and with mandatory phenoData
columns named e.g. "arrayID" and "dye".

B) in an eSet-derivative with ncol = the number of arrays, and n
congruent matrices in the assayData slot.
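
The two layouts can be sketched in a language-neutral way as follows.
This is only a toy model in plain Python, not Biobase code: the
feature, array and dye names, the numbers, and the helper functions
b_to_a and a_to_b are all assumptions made for illustration. It shows
that A and B carry the same measurements and differ only in how the
columns and their annotation are organised.

```python
# Hypothetical toy data: 3 features, 2 arrays, 2 dyes.
arrays = ["array1", "array2"]
dyes = ["Cy3", "Cy5"]

# Option B: one features-by-arrays matrix per channel (n congruent matrices).
layout_b = {
    "Cy3": [[10.0, 11.0], [20.0, 21.0], [30.0, 31.0]],
    "Cy5": [[12.0, 13.0], [22.0, 23.0], [32.0, 33.0]],
}

def b_to_a(layout_b, arrays, dyes):
    """Flatten option B into option A: a single matrix with one column
    per (array, dye) pair, plus one annotation row per column."""
    pheno = [{"arrayID": a, "dye": d} for a in arrays for d in dyes]
    n_features = len(layout_b[dyes[0]])
    exprs = [
        [layout_b[p["dye"]][i][arrays.index(p["arrayID"])] for p in pheno]
        for i in range(n_features)
    ]
    return exprs, pheno

def a_to_b(exprs, pheno, arrays, dyes):
    """Rebuild the per-channel matrices from the option A layout."""
    layout = {d: [[None] * len(arrays) for _ in exprs] for d in dyes}
    for j, p in enumerate(pheno):
        for i, row in enumerate(exprs):
            layout[p["dye"]][i][arrays.index(p["arrayID"])] = row[j]
    return layout

exprs_a, pheno_a = b_to_a(layout_b, arrays, dyes)
round_trip = a_to_b(exprs_a, pheno_a, arrays, dyes)
```

In this sketch option A has ncol = 4 (2 dyes times 2 arrays), and the
"arrayID" and "dye" annotation columns are what make the round trip
back to option B possible.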

Currently I prefer A, because
- most of the infrastructure is already there and the additional work is
- in B, the interpretation of the phenoData columns gets mushy because
some columns will refer to the arrays, others to one particular sample
of the n hybridised to each array, and we need additional infrastructure
to resolve that.
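
The annotation issue in B can be sketched as follows. Again this is a
toy model in plain Python under stated assumptions: the column names
("scanDate", "treatment.Cy3", "treatment.Cy5"), the values, and the
helper to_option_a are all hypothetical. The point is that with one
annotation row per array, some extra mapping has to say which columns
describe the array and which describe the sample in a given channel.

```python
# Option B: one annotation row per array (hypothetical columns and values).
pheno_b = [
    {"arrayID": "array1", "scanDate": "2007-03-01",
     "treatment.Cy3": "control", "treatment.Cy5": "drug"},
    {"arrayID": "array2", "scanDate": "2007-03-02",
     "treatment.Cy3": "drug", "treatment.Cy5": "control"},
]

# Extra metadata option B would need: which columns are array-level,
# and which per-dye columns belong to the same sample-level variable.
array_level = ["scanDate"]
sample_level = {"treatment": {"Cy3": "treatment.Cy3", "Cy5": "treatment.Cy5"}}

def to_option_a(pheno_b, array_level, sample_level, dyes):
    """Expand to one row per (array, dye), as in option A, where every
    column has a single, unambiguous meaning per row."""
    rows = []
    for row in pheno_b:
        for dye in dyes:
            new = {"arrayID": row["arrayID"], "dye": dye}
            for col in array_level:
                new[col] = row[col]          # copied to both channels
            for name, per_dye in sample_level.items():
                new[name] = row[per_dye[dye]]  # picked per channel
            rows.append(new)
    return rows

pheno_a = to_option_a(pheno_b, array_level, sample_level, ["Cy3", "Cy5"])
```

In the option A form, array-level values are simply repeated across
the rows of one array, so no separate mapping is required to interpret
a column.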

Is there anything that someone can point out that I am not aware of?

Also (on a different topic): do we already have an ontology in place
somewhere for control features (e.g. empty features, features measuring
a known spike-in ratio)?

Best wishes

Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

