[BioC] Question regarding handling technical replicates for Affy arrays

Tue May 30 14:58:08 CEST 2006

Dear Noel,

If you average the technical replicates, the averages have somewhat 
less variance than the data for the samples with only 1 array.  So, 
probably you are best off doing 2 analyses - using each of the two 
replicate arrays in turn.  This may lead to slightly different "top 
gene" lists, but it will be better than selecting the "best" array and 
not understanding how this affects your analysis.  Since you appear to 
have adequate biological replication, the two results should be very 
similar.  You 
should take a careful gene by gene look at any genes that show up as 
statistically significant in one of the analyses but not the other.
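
A minimal sketch of that two-run comparison in base R, assuming the 40 
arrays are in the order of the biolrep vector quoted below, so that 
arrays 5 and 6 are the two hybridizations of the same RNA.  The helper 
names keep_one_replicate and compare_hits and the gene IDs are made up 
for illustration; in practice the two gene lists would come from 
whatever per-run analysis you use.

```r
# Hypothetical helper: column indices for one analysis run, keeping only
# one of the technical-replicate arrays and all unreplicated arrays.
keep_one_replicate <- function(n_arrays, replicate_cols, keep) {
  stopifnot(keep %in% replicate_cols)
  setdiff(seq_len(n_arrays), setdiff(replicate_cols, keep))
}

# Hypothetical helper: split the significant genes from two runs into
# those found by both runs and those found by only one of them.
compare_hits <- function(hits_a, hits_b) {
  list(
    both   = intersect(hits_a, hits_b),
    only_a = setdiff(hits_a, hits_b),
    only_b = setdiff(hits_b, hits_a)
  )
}

# Assumption: arrays 5 and 6 are the technical replicates.
cols_run1 <- keep_one_replicate(40, c(5, 6), keep = 5)  # drops array 6
cols_run2 <- keep_one_replicate(40, c(5, 6), keep = 6)  # drops array 5

# Made-up gene IDs standing in for the significant genes of each run.
cmp <- compare_hits(c("g1", "g2", "g3"), c("g2", "g3", "g4"))
cmp$only_a  # significant only in run 1: look at these gene by gene
cmp$only_b  # significant only in run 2: likewise
```

Each run would then be fitted on the expression matrix restricted to 
cols_run1 or cols_run2, with the targets table subset the same way.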

--Naomi

At 08:53 PM 5/29/2006, Gordon K Smyth wrote:
>Dear Noel,
>
>I'm sorry I didn't fully appreciate from your first email that you
>have only one replicate, because you didn't give the whole biolrep
>vector.  A single replicate is simply not enough to estimate the
>technical-replicate variance component.  You need at least two.  That
>is the reason why duplicateCorrelation() returns a NaN answer.
>
>I guess that most people would average the technical replicates or
>would choose the "best" one.  It isn't likely to make a lot of
>difference.  There's no perfect solution because this isn't a perfect
>experimental design.
>
>Best wishes
>Gordon
>
> > Date: Mon, 29 May 2006 01:20:10 -0700 (PDT)
> > From: "noel0925 at sbcglobal.net" <noel0925 at sbcglobal.net>
> > Subject: Re: [BioC] Question regarding handling technical replicates
> >       for     Affy arrays
> > To: bioconductor at stat.math.ethz.ch
> >
> >
> > Hi Gordon,
> >
> > Thank you for your reply.
> >
> > Actually, I have looked at the consensus correlation
> > and I obtain [1] NaN. This doesn't seem sensible.
> >
> > Perhaps I have specified the biological replicates
> > incorrectly. The description of dupcor states that
> > "Typically the blocks are biological replicates and
> > the repeated observations are technical replicates."
> > As such, I thought that it made sense to create a
> > vector of the replicates as follows:
> >
> >
> > biolrep <- c(1,2,3,4,5,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
> >              21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39)
> >
> > Thus, there are 39 DIFFERENT RNA samples and 1 sample
> > which is replicated (hybed to two different arrays).
> > Where the number 5 is repeated twice since it is the
> > only sample for which there is a technical replicate.
> > Samples 1-20 are of RNA1, samples 21-30 are RNA2, and
> > samples 31-40 are RNA3.
> >
> > Have I specified the biological replicates properly?
> > The "biolrep" examples I have seen in the literature
> > confused me a bit, since they seem to specify both a
> > block of biological replicates and technical reps
> > within those blocks. But the cases given are for
> > two-color arrays. For example, in section 23.5 of the
> > Limma book chapter, the first example is for the case
> > where two wt and two mut mice from the same strain are
> > compared using two arrays for each pair, so that the
> > 1st and 2nd arrays, and the 3rd and 4th, are technical
> > reps. So here,
> > biolrep <- c(1,1,2,2).
> >
> > This is different however from the Affy data I
> > describe since the 3 different genotypes are all on
> > separate arrays rather than both wt and mut on the
> > same array.
> >
> > If I do:
> > biolrep <- c(1,2,3,4,5,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
> >              21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39)
> >
> > then corfit$consensus yields NaN. Though the
> > biological reps are not explicitly defined here, I
> > would assume they are inferred from
> > f <- factor(targets$Target, levels = c("RNA1", "RNA2", "RNA3")).
> >
> >
> > If I do:
> > biolrep <- c(rep(1,20), rep(2,10), rep(3,10))
> > then corfit$consensus yields Inf and this does not
> > indicate which arrays are technical reps.
> >
> > Any further insight you could offer would be great.
> > Thanks very much,
> >
> > Noelle
> >
>

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111