# How to average over duplicate spots [Was: Re: [BioC] technical replicates and spots in limma]

Gordon Smyth smyth at wehi.edu.au
Thu Apr 21 10:21:23 CEST 2005

```I really hesitate to explain how to average over duplicate spots using limma,
because it is not something I generally recommend. It is however quite easy:

[Suppose that MA contains your normalized log-ratios and, optionally, spot]
quality weights. Suppose that there are 'ndups' duplicates at spacing
'spacing'. You can average over duplicates by

fit1 <- lmFit(MA, design=diag(ncol(MA)), ndups=ndups, spacing=spacing,
correlation=0)

Now the averaged log-ratios are in

Y <- fit1$coef

and the consolidated spot quality weights are in

w <- 1/fit1$stdev.unscaled^2

Now you can fit any model you like to the averaged log-ratios, e.g.,

fit <- lmFit(Y, design, weights=w)
etc
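
# Editorial sketch (not part of the original message): a small simulated
# check of the averaging step above. It assumes the limma package is
# installed. The matrix M, the toy sizes, and the layout ndups=2, spacing=1
# (duplicates in adjacent rows) are invented for illustration.
library(limma)
set.seed(1)
ngenes <- 5; ndups <- 2; spacing <- 1; narrays <- 4
M <- matrix(rnorm(ngenes*ndups*narrays), ngenes*ndups, narrays)
fit1 <- lmFit(M, design=diag(narrays), ndups=ndups, spacing=spacing,
              correlation=0)
Y <- fit1$coefficients           # one row per gene, one column per array
w <- 1/fit1$stdev.unscaled^2     # consolidated weights
# With no spot weights supplied, each row of Y should be the plain mean of
# the two adjacent duplicate rows of M, and the entries of w should all
# equal ndups. A manual average to compare against:
Ymanual <- (M[seq(1, nrow(M), by=2), ] + M[seq(2, nrow(M), by=2), ])/2
all.equal(unname(Y), Ymanual)
range(w)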

At 05:22 PM 19/04/2005, Ron Ophir wrote:
> >>>> "Gordon K Smyth" <smyth at wehi.EDU.AU> 04/18/05 2:24 PM >>>
> >> Date: Sun, 17 Apr 2005 17:24:14 +0300
> >> From: "Ron Ophir" <ron.ophir at weizmann.ac.il>
> >> Subject: [BioC] technical replicates and spots in limma
> >> To: <bioconductor at stat.math.ethz.ch>
> >>
> >> Dear limma experts,
> >> I have direct experiments with two biological replicates and two
> >> technical replicates. On each array, spots are printed in 4 replicates.
> >> In the duplicateCorrelation help it is written that "At this time it is
> >> not possible to estimate correlations between duplicate spots and
> >> between technical replicates simultaneously."
> >> The question is: is it possible to average over both technical and spot
> >> replicates, but not simultaneously, and if yes, then how?
> >> If not, which should I drop from the least-squares analysis: technical
> >> sample replicates or spot replicates?
> >
> >The between spot correlation is usually in the range 0.5-0.9.
> >Correlations between technical replicates are usually not so strong,
> >seldom higher than around 0.2-0.3 and often less.
> >
> >If you're going to ignore one of these correlations, it should be the
> >technical replication.  If you're going to average over one of the
> >replicate structures, it should be over the replicate spots.
>
>Thanks. Does averaging over spot replicates using duplicateCorrelation()
>assume equal spacing between the replicates' coordinates, or can I give a
>vector of spot locations, as with block for technical replicates?
>If the latter is not possible, are the following commands what
>should be done:
>spotRep<-as.factor(c(1,1,2,2,3,1,1,3,3,2,2,3,...))
>vvRaw$R<-unlist(by(vvRaw$R,spotRep,mean))
>vvRaw$G<-unlist(by(vvRaw$G,spotRep,mean))
>vvRaw$Rb<-unlist(by(vvRaw$Rb,spotRep,mean))
>vvRaw$Gb<-unlist(by(vvRaw$Gb,spotRep,mean))

No, this won't work. Apart from anything else, the averaging should occur
on the M-values rather than raw R, G, Rb and Gb intensities. See above.
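
# Editorial sketch (not part of the original message): why the scale matters.
# The intensities below are invented, background correction is left out, and
# only base R is used. Averaging the M-values of duplicate spots is not the
# same as forming one log-ratio from averaged R and G intensities.
R <- c(1000, 4000)       # red intensities of two duplicate spots
G <- c(500, 8000)        # green intensities of the same two spots
M <- log2(R/G)           # per-spot log-ratios: 1 and -1
mean(M)                  # average of the M-values: 0
log2(mean(R)/mean(G))    # log-ratio of averaged intensities: about -0.77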

Gordon

>Ron
>
> >
> >The measurement error is often larger than the biological variation,
> >so that treating the technical replicates as biological replicates is
> >often not as bad as it sounds.  This is what I would usually do, having
> >checked the between technical rep correlation is not large.
> >
> >Gordon
> >
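
# Editorial sketch (not part of the original message): one possible way to
# check the between-technical-replicate correlation before deciding to
# ignore it. It assumes limma and statmod (which duplicateCorrelation()
# relies on) are installed. The simulated data, the one-column design and
# the pairing in techblock are all invented for illustration.
library(limma)
set.seed(2)
Y <- matrix(rnorm(200*4), 200, 4)    # stand-in for averaged log-ratios
design <- matrix(1, 4, 1)            # every array estimates the same log-ratio
techblock <- c(1, 1, 2, 2)           # arrays 1-2 and 3-4 are technical pairs
corfit <- duplicateCorrelation(Y, design, ndups=1, block=techblock)
corfit$consensus.correlation
# If this value is small, treating the four arrays as independent
# replicates is the simpler option:
fit <- lmFit(Y, design)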