[BioC] FW: duplicates, technical and biological replicates + dividing a microarray into two parts

Staninska, Ana, Dr. ana.staninska at helmholtz-muenchen.de
Thu Feb 4 12:47:28 CET 2010


Thank you very much, 

Best, 
Ana
________________________________________
From: Claus Mayer [claus at bioss.ac.uk]
Sent: Wednesday, February 03, 2010 4:09 PM
To: Staninska, Ana, Dr.; Naomi Altman; bioconductor at stat.math.ethz.ch
Subject: RE: [BioC] FW: duplicates, technical and biological replicates + dividing a microarray into two parts

Dear Ana,

To give Naomi some rest, perhaps I can help with some answers:


>
> 1. Now that you mention it, I can see that the within-array variability
> should be smaller than the technical variability,
> but I cannot understand why treating them as the same should be less
> statistically valid than averaging the duplicate spots.
> How could one judge what is statistically more valid? Could you maybe tell
> me where I could read more about this,
> so I will know more and I won't make the same mistakes again?

If you treat within-array variability the same as between-array variability,
you give every replicated spot on an array the same weight as an extra
reading from another array. Now imagine the ideal case of very low
within-array variability: the replicates on the same array give more or less
identical results, i.e. these spots add no information, but you would treat
them as if they did. As an example, imagine you have three replicated spots
on each of 2 arrays. Array 1 gives you the values 2.9, 3.0, 3.1; Array 2
gives you 6.9, 7.0, 7.1. If you average within arrays you reduce the data to
3 and 7, with an overall average of 5 and a standard error (the number that
measures how well you estimate the overall average) of 2. If you take all 6
values and treat them as independent replicates you end up with the same
mean of 5, but the standard error drops to about 0.9. This means that by
neglecting the different types of variation you create the false impression
of a more precise result.
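
The arithmetic in this example can be checked with a few lines of code. The
sketch below is in Python purely for illustration (the thread itself concerns
limma in R); the helper name standard_error and the variable names are
hypothetical, and it assumes the usual definition of the standard error as
the sample standard deviation divided by the square root of n.

```python
import math
import statistics


def standard_error(values):
    """Sample standard deviation (n - 1 denominator) divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Three replicated spots on each of two arrays (values from the example).
array1 = [2.9, 3.0, 3.1]
array2 = [6.9, 7.0, 7.1]

# Approach 1: average the duplicate spots within each array first,
# so each array contributes a single value.
array_means = [statistics.mean(array1), statistics.mean(array2)]
se_averaged = standard_error(array_means)

# Approach 2: wrongly treat all six spots as independent replicates.
pooled = array1 + array2
se_pooled = standard_error(pooled)

print(f"mean (averaged within arrays): {statistics.mean(array_means):.2f}")
print(f"mean (all spots pooled):       {statistics.mean(pooled):.2f}")
print(f"SE   (averaged within arrays): {se_averaged:.2f}")
print(f"SE   (all spots pooled):       {se_pooled:.2f}")
```

Both approaches give the same mean, but pooling the spots reports a standard
error less than half the size, although no additional arrays were measured.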

>
> 2. I should probably have mentioned before that the correlation between my
> duplicate spots (calculated with the duplicateCorrelation function in limma)
> is in the range (0.5, 0.6), and the correlation between my technical
> replicates is in the range (-0.3, -0.2).
> So I think the duplicate spots are not well correlated, and by averaging
> them we will lose valuable information.

The first question to answer is: why is the correlation so poor? It most
likely indicates poor array quality. The information you get from the
duplicates is not of biological interest; it only tells you something about
the within-array variability. In that sense it is valuable information, but
not as far as the quantity of biological interest (gene expression) is
concerned.

> If I do averaging of the spots, should I do it before or after
> normalization?

I would always do that after normalization.

Hope that helps a bit to understand it all.

Claus


