[BioC] limma design question
Jenny Drnevich
drnevich at illinois.edu
Fri Dec 5 21:32:22 CET 2008
Hi Gordon,
I've been out for a while and finally read your detailed reply.
Thanks so much - it really helps clarify things for me!!
Cheers,
Jenny
At 05:55 PM 11/27/2008, Gordon K Smyth wrote:
>Hi Jenny,
>
>Should blocks be fixed (in the design matrix) or treated as random
>(and hence enter the covariance matrix as correlations)? This question
>has a long history in mathematical statistics, so long that you can
>be sure that the answer is somewhat subtle.
>
>Neither approach is right or wrong. The random approach makes more
>assumptions and allows you, in some circumstances, to extract more
>information. The limma approach with duplicateCorrelation() etc.
>makes even more assumptions than classical random effects models.
>If the blocks are treated as fixed, then treatments can only be
>compared within blocks. If blocks are treated as random, then it is
>possible to compare treatments between blocks as well as within.
>
>So the first key issue is whether treatment comparisons are made
>between blocks or within blocks.
>
>Suppose you do an experiment on random samples of subjects from two
>groups, in which each subject is subjected to several tests. The
>subjects are blocks. The total sum of squares can be divided into
>between-subject and within-subject sums of squares. In other words,
>the information in the data can be divided into a between-subject
>error stratum and a within-subject error stratum.
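The between/within split described above can be checked numerically. The following is only a minimal sketch in plain Python, not anything taken from limma; the subject names and values are hypothetical and chosen purely for illustration.

```python
# Hypothetical data: 3 subjects (blocks), each measured on the same 2 tests.
data = {
    "s1": [5.0, 7.0],
    "s2": [8.0, 10.0],
    "s3": [2.0, 6.0],
}


def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


all_values = [y for ys in data.values() for y in ys]
grand_mean = mean(all_values)

# Total sum of squares: every observation around the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

# Between-subject stratum: subject means around the grand mean,
# weighted by the number of observations per subject.
ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in data.values())

# Within-subject stratum: observations around their own subject mean.
ss_within = sum((y - mean(ys)) ** 2 for ys in data.values() for y in ys)

print("total  :", ss_total)
print("between:", ss_between)
print("within :", ss_within)
```

The two strata add up to the total (up to floating-point rounding), which is the sense in which the information in the data divides into a between-subject part and a within-subject part.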
>
>Suppose you want to compare the two groups. All the information is
>in the between-subject error stratum. You cannot do any statistical
>test unless you treat the subjects as random.
>
>Suppose now you want to compare the treatments. If the experiment
>is balanced (all subjects do all tests), then all the information
>about the treatments is in the within-block stratum. So you may as
>well treat the subjects as fixed effects (as, for example, is done
>in a paired t-test).
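To illustrate the balanced case, here is a small sketch (plain Python, hypothetical numbers) showing that a paired t statistic depends only on within-subject differences. Adding arbitrary subject-level offsets, standing in for block effects, leaves it unchanged. The function name `paired_t` and all values are assumptions made for this example.

```python
import math


def paired_t(first, second):
    """Paired t statistic computed from within-subject differences."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = sum(diffs) / n
    var = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)
    return d_bar / math.sqrt(var / n)


# Hypothetical balanced data: 4 subjects, each measured under both treatments.
control = [5.0, 8.0, 2.0, 6.0]
treated = [7.0, 9.0, 5.0, 7.5]

# Large, arbitrary subject-level offsets applied to both measurements
# of each subject (a stand-in for block effects).
offsets = [100.0, -50.0, 30.0, 0.0]
control_shifted = [y + o for y, o in zip(control, offsets)]
treated_shifted = [y + o for y, o in zip(treated, offsets)]

t_original = paired_t(control, treated)
t_shifted = paired_t(control_shifted, treated_shifted)

print("t without block offsets:", t_original)
print("t with block offsets   :", t_shifted)
```

Because the offsets cancel in each difference, the between-subject variation contributes nothing to the treatment comparison here, which is why fixed subject effects lose no information in a balanced design.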
>
>If the experiment is unbalanced (each subject does only a subset of
>the tests, or subjects do tests different numbers of times), then
>you can extract more information about the treatment comparisons
>from the between-subject error stratum. To do this, you have to
>treat the blocks as random.
>
>The second key issue to consider is whether it makes sense
>scientifically to treat the blocks as random. If there are only two
>or three blocks, then there is little to be gained by treating them
>as random. If the blocks have large unpredictable effects, then it
>is much safer to treat them as fixed. If you want to make specific
>conclusions about each of the blocks, then it doesn't make sense to
>treat them as random. In general, random is natural if there are
>lots of blocks with relatively small effects that are not of
>interest in themselves. Sometimes you can go either way.
>
>Hope this helps
>Gordon
>
>On Tue, 25 Nov 2008, Jenny Drnevich wrote:
>
>>Hi Jim,
>>
>>I've seen you suggest this way to account for blocks by fitting
>>extra columns in the design matrix before. I'm just wondering how
>>this differs from the suggestion in the limma vignette (Section 8.2
>>Technical Replication) to use duplicateCorrelation() to determine
>>the average correlation between blocks. I know they are not
>>mathematically equivalent; the coefficients for the treatment
>>groups are slightly different, they use different DF, and the
>>p-values tend to be larger using the duplicateCorrelation() method
>>(at least for the one experiment I'm using). So, is one more
>>"correct" than the other? Or are blocks of technical replicates
>>different somehow than blocks of patients or cell lines, etc.?
>>
>>Thanks,
>>Jenny
>
>Jenny Drnevich, Ph.D.
>
>Functional Genomics Bioinformatics Specialist
>W.M. Keck Center for Comparative and Functional Genomics
>Roy J. Carver Biotechnology Center
>University of Illinois, Urbana-Champaign
>
>330 ERML
>1201 W. Gregory Dr.
>Urbana, IL 61801
>USA
>
>ph: 217-244-7355
>fax: 217-265-5066
>e-mail: drnevich at illinois.edu