[BioC] problem to compare two series of arrays hybridized with two different protocols
Stephen Henderson
s.henderson at ucl.ac.uk
Fri Feb 17 12:29:37 CET 2006
MDS and PCA are exploratory analyses, the latter accentuating differences. If you are trying to make a diagnostic then I would be pretty worried if your signatures were too specific to a very narrow protocol: they might be biased. Bear in mind that the bias you describe is not in the biological samples (the real target) but in the protocol.
Try quantile or loess normalization, then train on one series and test on the other. Then try pooling both series and cross-validating.
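As a rough illustration of the quantile step, here is a minimal sketch in plain Python (not Bioconductor code). It assumes each array is a list of per-gene intensities of equal length; the function name `quantile_normalize` and the toy values are made up for illustration, and ties are simply broken by sort order.

```python
def quantile_normalize(columns):
    """Force every array (column) to share the same empirical distribution.

    `columns` is a list of arrays, each a list of per-gene intensities of
    equal length. Ties are broken by sort order (a simplification).
    """
    n_genes = len(columns[0])
    if any(len(col) != n_genes for col in columns):
        raise ValueError("all arrays must measure the same genes")

    # Sort each array, then average across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[r] for col in sorted_cols) / len(columns)
        for r in range(n_genes)
    ]

    normalized = []
    for col in columns:
        # Gene indices ordered from lowest to highest intensity.
        order = sorted(range(n_genes), key=lambda g: col[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            # Replace each value by the mean of its rank across arrays.
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: two arrays from one protocol, one brighter array
# from the other protocol.
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [9.0, 6.0, 7.5, 8.0],
]
normalized = quantile_normalize(arrays)
for col in normalized:
    print(col)
```

After this step every array contains the same set of values, only in a different gene order. That removes array-wide intensity differences, but a protocol effect that shifts individual genes differently would be expected to survive it.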
By using two slightly differently biased datasets you are probably ensuring that the signature is not over-fit and is robust enough for practical use. No?
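The two checks suggested above (train on one series and test on the other, then pool and cross-validate) could be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for whatever signature method is actually used, the data are tiny invented two-gene profiles with a constant offset playing the role of the protocol effect, and all function names are hypothetical.

```python
def centroid(samples):
    """Per-gene mean of a list of expression profiles."""
    n = len(samples)
    return [sum(s[g] for s in samples) / n for g in range(len(samples[0]))]


def fit_centroids(samples, labels):
    """Return one centroid per class label."""
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    return {y: centroid(group) for y, group in by_class.items()}


def predict(centroids, sample):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, sample))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def accuracy(centroids, samples, labels):
    hits = sum(predict(centroids, s) == y for s, y in zip(samples, labels))
    return hits / len(samples)


def leave_one_out_accuracy(samples, labels):
    """Refit on all samples but one, predict the held-out one, repeat."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit_centroids(train_x, train_y)
        hits += predict(model, samples[i]) == labels[i]
    return hits / len(samples)


# Invented data: series 2 is series-1-like data shifted by +0.5 per gene.
series1_x = [[1.0, 5.0], [1.2, 4.8], [0.8, 5.1],
             [3.0, 2.0], [3.2, 2.2], [2.9, 1.8]]
series1_y = ["healthy"] * 3 + ["cancer"] * 3
series2_x = [[1.5, 5.5], [1.6, 5.3], [3.5, 2.5], [3.6, 2.4]]
series2_y = ["healthy"] * 2 + ["cancer"] * 2

# 1) Train on the first protocol, test on the second.
model = fit_centroids(series1_x, series1_y)
cross_acc = accuracy(model, series2_x, series2_y)

# 2) Pool both series and cross-validate.
pooled_acc = leave_one_out_accuracy(series1_x + series2_x,
                                    series1_y + series2_y)

print("cross-protocol accuracy:", cross_acc)
print("pooled leave-one-out accuracy:", pooled_acc)
```

On real data any gene selection used to build the signature would need to be repeated inside each cross-validation fold, otherwise the pooled estimate is likely to be optimistic.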
Stephen Henderson
Wolfson Inst. for Biomedical Research
Cruciform Bldg., Gower Street
University College London
United Kingdom, WC1E 6BT
+44 (0)207 679 6827
-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Malick.PAYE at eu.biomerieux.com
Sent: 16 February 2006 17:01
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] problem to compare two series of arrays hybridized with two different protocols
Hello all,
I work for an in vitro diagnostic company and we are interested in
analysing microarray gene expression data. We are trying to identify a molecular
signature to discriminate two populations (healthy and cancer).
We collected 100 samples, hybridized them with a given protocol,
and identified a molecular signature based on these data.
To assess the performance of this signature, we then collected
100 new samples and hybridized them with an "upgraded" version of the first
protocol.
According to the biologists there is no big difference between the two
protocols.
But when comparing the two sets of arrays (first and second protocol) with
classical exploratory techniques (MDS, PCA) we can see that there is a clear
difference between the two series.
I tried to explain to the biologists that the two series are quite different,
that normalizing the data will not undo the change of protocol, and
that consequently this is not a normalization problem.
I proposed to the biologists that we take a few samples and hybridize each
of them with both protocols, to see whether there is a
relationship that explains the difference between the two protocols.
My questions are:
How can I best analyse these 10 arrays (5 samples x 2 protocols)?
Is there a way to make the two series comparable (I tried to
normalize the data with quantiles and invariant set, but we still see the
two groups)?
Is it reasonable to combine the two series to try to identify a new
signature?
Any help will be greatly appreciated,
Thanks in advance,
Malick,
Malick Paye | bioMérieux | Biomathematician
Phone: (+33)4 78 87 70 97 | Fax: (+33)4 78 87 53 40
[Parc Polytec, 5 Rue des Berges, 38004 Cedex 01 Grenoble, France]