[BioC] lmFit object limma

Tue Jul 25 14:24:39 CEST 2006

Dear limma experts,

I am analyzing the data set given to me and described by these the
column names of my MA object:

> colnames(MA.hyp)
[1] "../ampcon/mev/C0-_1stround_vs_C60-_1stround_13263536.mev"
[2] "../ampcon/mev/C0-_2ndround_vs_C60-_2ndround_13263534.mev"
[3] "../ampcon/mev/C60-_1stround_vs_C0-_1stround_13263533.mev"
[4] "../ampcon/mev/C60-_2ndround_vs_C0-_2ndround_13263531.mev"
[5] "../licl/mev/C0-_vs_C60-_13260944.mev"
[6] "../licl/mev/C60-_vs_C0-_13260945.mev"

Slide 1 has RNA from sample C0- and C60- after 1 round of linear
amplification. Slide 3 is the corresponding dyeswap.

Slide 2 has RNA from from sample C0- and C60 after 2 rounds of linear
amplification. Slide 4 is the corresponding dyeswap.

Slide 5 has total RNA from sample C0- and C60- (i.e. now amplification).
Slide 6 is the corresponding dye-swap.

Aims:
------
My motivation is to see how the log ratio of intensities is preserved
with each protocol. I am following a paper by Nygaard et al 2003
entitled "Obtaining reliable information from minute amounts of RNA
using cDNA microarrays" BMC Genomics 2002(3). In this paper they
performed two investigations:

a) Multiple hypothesis testing of log ratios to identify those genes
whose log ratios were not preserved in each protocol
b) Mixed effects modelling to quantify noise terms. 

I hope to do both but my initial problem concerns a). Initially I have
set up a design matrix with three protocol effects:

> hyp.design
  round_1 round_2 non_amp
1       1       0       0
2       0       1       0
3      -1       0       0
4       0      -1       0
5       0       0       1
6       0       0      -1

I fit the model thus:

>fit.hyp<-lmFit(MA.hyp,design=hyp.design)

Which gives me estimates of the protocol effect. I would like to perform
a t-test **CAVEAT APPROACHING!** I realize that this is not the best way
to perform the analysis due to the inherent problems with ordinary t-
statistics, but my adviser would like to see how the analysis compares
with the Nygaard paper. So specific questions relating to problem a)
are:

1) How do I carry out the t-test on a per-gene basis given the mean
protocol effect available from the fit object. I can see that the limma
guide has a way of obtaining the t-statistics but I'm not really sure
how to do the test on a per gene basis.

2) I look at the stdev.unscaled and it is the same for each protocol. Is
this to be expected? Sorry, my stats knowledge is not great.  

 round_1   round_2   non_amp
0.7071068 0.7071068 0.7071068

3) What does the stdev.scaled and the sigma attributes relate to?

In addition, I thought that modelling as separate channels might be more
applicable i.e create a design matrix like this:

> design.sc
   C0.1 C60.1 C0.2 C60.2 C0.NA C60.NA
1     1     0    0     0     0      0
2     0     1    0     0     0      0
3     0     0    1     0     0      0
4     0     0    0     1     0      0
5     0     1    0     0     0      0
6     1     0    0     0     0      0
7     0     1    0     0     0      0
8     0     0    1     0     0      0
9     0     0    0     0     1      0
10    0     0    0     0     0      1
11    0     0    0     0     0      1
12    0     0    0     0     1      0

 And then fitting the contrasts such as C0.1-C0.NA and using the
moderated t-statistics to test. I realize that this would not test ratio
preservation, but would provide a measure of protocol dependent effects
on each channel. I'd appreciate any thoughts, especially since I am not
a statistician !

Kind regards

Andy