[BioC] dye swaps of technical replicates and variable numbers of replicate spots

Thu Aug 21 13:07:38 MEST 2003

At 01:17 AM 21/08/2003, Ramon Diaz-Uriarte wrote:
>Thanks for the comment. But I am a little bit troubled by one aspect:
>
>On Wednesday 20 August 2003 04:13, Gordon Smyth wrote:
> > >
> > >   Efect R1 R2 R3
> > >1      0  1  0  0
> > >2      1  1  0  0
> > >3      0  0  1  0
> > >4      1  0  1  0
> > >5      0  0  0  1
> > >6      1  0  0  1
> > >
> > > > lm.series(data, design)
> >
> > You're actually overparametrized here. (I hadn't turned my brain on
> > properly in my previous email.) The differences between the three
> > biological replicates contribute only two degrees of freedom, not three,
> > and your first effect will not estimate what you want. You need something
> >
> > equivalent to like:
> >  > design
> >
> >    Efect R2 R3
> > 1     -1   0   0
> > 2      1   0   0
> > 3     -1   1   0
> > 4      1   1   0
> > 5     -1   0   1
> > 6      1   0   1
> >
> > The three coefficients will represent: (1) The RNA comparison that you're
> > interested in, (2) The difference between biological replicate 2 versus 1,
> > and (3) The difference between biological replicate 3 versus 1.
> >
>
>Yes, I see now that it is overparametrized (and the lme model I wrote was 
>also
>overparametrized). However, what troubles me is that, since this is a model
>without intercept, depending on how I set up the last two columns of the
>design matrix (the ones I don't really care about) does change the sigma with
>lm.series.
>
>I can see, numerically, how this happens via the hat matrix (sum sq. error =
>Y' (I-H) Y) because the hat matrix is different between the
>parameterizations. But that is not what we want, because whether we compare
>replicates 1vs 2 and 1 vs 3 or 1 vs 2 and 2 vs 3 is irrelevant. Where am I
>getting lost?

Yes, OK, you're right.

I haven't tried out any technical replicate analyses myself yet and I 
haven't worked out how the details will work. It would appear that one 
can't do it with a fixed effect analysis at the log-ratio level. So you're 
on your own ... good luck!

>This is an example:
>
>mini.data.l <- mini.data <- rnorm(6)
>dim(mini.data.l) <- c(1, 6)
>
>d1 <- cbind(Efect = rep(c(-1, 1), 3),
>             R2 = c(0, 0, 1, 1, 0, 0),
>             R3 = c(0, 0, 0, 0, 1, 1))
>
>d2 <- cbind(Efect = rep(c(-1, 1), 3),
>             R1 = c(1, 1, 0, 0, 0, 0),
>             R3 = c(0, 0, 0, 0, 1, 1))
>
>d3 <- cbind(Efect = rep(c(-1, 1), 3),
>             R1 = c(1, 1, 0, 0, 0, 0),
>             R2 = c(0, 0, 1, 1, 0, 0))
>
>lm.series(mini.data.l, d1)
>lm.series(mini.data.l, d2)
>lm.series(mini.data.l, d3)
>
>#### Now with lme:
>tech.reps <- factor(c(1, 1, 2, 2, 3, 3))
>summary(lme(mini.data ~ -1 + d1[, 1], random = ~ 1|tech.reps))
>## The sigma tends to be very similar to that of
>d.large <- cbind(Efect = rep(c(-1, 1), 3),
>             R1 = c(1, 1, 0, 0, 0, 0),
>             R2 = c(0, 0, 1, 1, 0, 0),
>             R3 = c(0, 0, 0, 0, 1, 1))
>lm.series(mini.data.l, d.large)$sigma #why?

When I ran this example the sigma from lm.series (1.6) was not all all the 
same as the sigma from lme (1.4).

>## Note how the lm.series are equivalent, respectively, to:
>summary(lm(mini.data ~ -1 + d1))
>summary(lm(mini.data ~ -1 + d2))
>summary(lm(mini.data ~ -1 + d3))
>
>
>Best,
>
>Ramón

Gordon

>--
>Ramón Díaz-Uriarte
>Bioinformatics Unit
>Centro Nacional de Investigaciones Oncológicas (CNIO)
>(Spanish National Cancer Center)
>Melchor Fernández Almagro, 3
>28029 Madrid (Spain)
>Fax: +-34-91-224-6972
>Phone: +-34-91-224-6900
>
>http://bioinfo.cnio.es/~rdiaz