[BioC] dye swaps of technical replicates and variable numbers of replicate spots

Wed Aug 20 18:17:41 MEST 2003

Thanks for the comment. But I am a little bit troubled by one aspect:

On Wednesday 20 August 2003 04:13, Gordon Smyth wrote:
> >
> >   Efect R1 R2 R3
> >1      0  1  0  0
> >2      1  1  0  0
> >3      0  0  1  0
> >4      1  0  1  0
> >5      0  0  0  1
> >6      1  0  0  1
> >
> > > lm.series(data, design)
>
> You're actually overparametrized here. (I hadn't turned my brain on
> properly in my previous email.) The differences between the three
> biological replicates contribute only two degrees of freedom, not three,
> and your first effect will not estimate what you want. You need something
>
> equivalent to like:
>  > design
>
>    Efect R2 R3
> 1     -1   0   0
> 2      1   0   0
> 3     -1   1   0
> 4      1   1   0
> 5     -1   0   1
> 6      1   0   1
>
> The three coefficients will represent: (1) The RNA comparison that you're
> interested in, (2) The difference between biological replicate 2 versus 1,
> and (3) The difference between biological replicate 3 versus 1.
>

Yes, I see now that it is overparametrized (and the lme model I wrote was also 
overparametrized). However, what troubles me is that, since this is a model 
without intercept, depending on how I set up the last two columns of the 
design matrix (the ones I don't really care about) does change the sigma with 
lm.series.

I can see, numerically, how this happens via the hat matrix (sum sq. error = 
Y' (I-H) Y) because the hat matrix is different between the 
parameterizations. But that is not what we want, because whether we compare 
replicates 1vs 2 and 1 vs 3 or 1 vs 2 and 2 vs 3 is irrelevant. Where am I 
getting lost?

This is an example:

mini.data.l <- mini.data <- rnorm(6)
dim(mini.data.l) <- c(1, 6)

d1 <- cbind(Efect = rep(c(-1, 1), 3),
            R2 = c(0, 0, 1, 1, 0, 0),
            R3 = c(0, 0, 0, 0, 1, 1))

d2 <- cbind(Efect = rep(c(-1, 1), 3),
            R1 = c(1, 1, 0, 0, 0, 0),
            R3 = c(0, 0, 0, 0, 1, 1))

d3 <- cbind(Efect = rep(c(-1, 1), 3),
            R1 = c(1, 1, 0, 0, 0, 0),
            R2 = c(0, 0, 1, 1, 0, 0))

lm.series(mini.data.l, d1)
lm.series(mini.data.l, d2)
lm.series(mini.data.l, d3)

#### Now with lme:
tech.reps <- factor(c(1, 1, 2, 2, 3, 3))
summary(lme(mini.data ~ -1 + d1[, 1], random = ~ 1|tech.reps))
## The sigma tends to be very similar to that of
d.large <- cbind(Efect = rep(c(-1, 1), 3),
            R1 = c(1, 1, 0, 0, 0, 0),
            R2 = c(0, 0, 1, 1, 0, 0),
	    R3 = c(0, 0, 0, 0, 1, 1))
lm.series(mini.data.l, d.large)$sigma #why?

## Note how the lm.series are equivalent, respectively, to:
summary(lm(mini.data ~ -1 + d1))
summary(lm(mini.data ~ -1 + d2))
summary(lm(mini.data ~ -1 + d3))

Best,

Ramón

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://bioinfo.cnio.es/~rdiaz