[BioC] limma design question

Tue Nov 25 22:40:34 CET 2008

Hi Jenny,

The way I understand it, the approach I suggested is simply a 
fixed-effects model: each patient gets its own coefficient in the design 
matrix, and the residual variance is assumed to be the same for all of 
the groups.

If you instead compute the intra-block correlation using 
duplicateCorrelation() and pass it to lmFit(), you are in effect fitting 
a mixed linear model in which the block is a random effect, so arrays 
from the same block are allowed to be correlated.

I don't know if one is more correct than the other. Certainly the fixed 
effects model makes more assumptions. I think you can use 
duplicateCorrelation() to see if the intra-group correlation is high, 
which would argue for fitting a mixed linear model instead.
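
As a rough sketch of the two approaches side by side (not a definitive 
analysis): the factors below are the ones from this thread, but the 
expression matrix y is simulated and purely hypothetical, standing in for 
the real log-ratios. Note that duplicateCorrelation() needs the statmod 
package to be installed.

```r
library(limma)  # duplicateCorrelation() also relies on the statmod package

# Sample type on Cy5 and patient (sibship) for each of the 9 arrays
Cy5     <- factor(c("B","ACA","N","ACA","N","ACA","N","B","ACA"))
sibship <- factor(rep(c(12,15,16,17), c(2,2,2,3)))

# ASSUMPTION: simulated log-ratios (100 genes x 9 arrays) with a
# per-patient shift, used only so that the example runs
set.seed(1)
patient.effect <- matrix(rnorm(100 * 4), nrow = 100)
y <- matrix(rnorm(100 * 9), nrow = 100) + patient.effect[, as.integer(sibship)]

# (1) Fixed-effects blocking: patient columns go into the design matrix
design.fixed <- model.matrix(~0 + Cy5 + sibship)
cont.fixed <- makeContrasts(BvsN   = Cy5B - Cy5N,
                            ACAvsN = Cy5ACA - Cy5N,
                            ACAvsB = Cy5ACA - Cy5B,
                            levels = design.fixed)
fit.fixed <- eBayes(contrasts.fit(lmFit(y, design.fixed), cont.fixed))

# (2) Random-effects blocking: patient is passed as 'block', not as columns
design.rand <- model.matrix(~0 + Cy5)
corfit <- duplicateCorrelation(y, design.rand, block = sibship)
corfit$consensus.correlation  # the estimated intra-patient correlation
cont.rand <- makeContrasts(BvsN   = Cy5B - Cy5N,
                           ACAvsN = Cy5ACA - Cy5N,
                           ACAvsB = Cy5ACA - Cy5B,
                           levels = design.rand)
fit.rand <- lmFit(y, design.rand, block = sibship,
                  correlation = corfit$consensus.correlation)
fit.rand <- eBayes(contrasts.fit(fit.rand, cont.rand))

# Compare the two fits for one contrast
topTable(fit.fixed, coef = "BvsN")
topTable(fit.rand,  coef = "BvsN")
```

The first fit estimates six coefficients from nine arrays and the second 
only three, so the two approaches leave different residual degrees of 
freedom, which is one reason the p-values will not agree.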

Best,

Jim

Jenny Drnevich wrote:
> Hi Jim,
> 
> I've seen you suggest this way to account for blocks by fitting extra 
> columns in the design matrix before. I'm just wondering how this differs 
> from the suggestion in the limma vignette (Section 8.2 Technical 
> Replication) to use duplicateCorrelation() to determine the average 
> correlation between blocks. I know they are not mathematically 
> equivalent; the coefficients for the treatment groups are slightly 
> different, they use different DF, and the p-values tend to be larger 
> using the duplicateCorrelation() method (at least for the one experiment 
> I'm using). So, is one more "correct" than the other? Or are blocks of 
> technical replicates different somehow than blocks of patients or cell 
> lines, etc.?
> 
> Thanks,
> Jenny
> 
> At 08:05 AM 11/25/2008, James W. MacDonald wrote:
>> Hi Adrian,
>>
>> Adrian Johnson wrote:
>>> Dear group,
>>> I am sorry to ask another design-related question. The data are from SMD.
>>> Two or three different samples were obtained from each patient.
>>> For example:
>>> from Patient 1 - (A) normal tissue, (B) inflamed tissue, and (C)
>>> cancer tissue were extracted;
>>> from Patient 2 - only (A) normal tissue and (B) cancer tissue were
>>> extracted, and so on.
>>> A universal reference sample was hybridized on the green channel.
>>> This is both a paired design and a reference design. The limma manual
>>> describes examples that each cover only one specific design.
>>
>> Yes, but the 'limma User's Guide' also notes that the reference design 
>> is pretty much the same as a one-color analysis, but that you have to 
>> account for dye-swaps. Since you don't have dye-swaps, then it _is_ 
>> the same as a one-color analysis. The only wrinkle here is that you 
>> have blocked data (which is also covered in the limma User's Guide).
>>
>> If you had doubts, you could have approached this iteratively. First 
>> let's see what limma thinks you should be using:
>>
>> > modelMatrix(targets, ref="Ref")
>> Found unique target names:
>>  ACA B N Ref
>>       ACA B N
>>  [1,]   0 1 0
>>  [2,]   1 0 0
>>  [3,]   0 0 1
>>  [4,]   1 0 0
>>  [5,]   0 0 1
>>  [6,]   1 0 0
>>  [7,]   0 0 1
>>  [8,]   0 1 0
>>  [9,]   1 0 0
>>
>> So this is a pretty simple model matrix, but it doesn't account for 
>> the blocks.
>>
>> > Cy5=factor(c("B","ACA","N","ACA","N","ACA","N","B","ACA"))
>> > sibship=factor(rep(c(12,15,16,17), c(2,2,2,3)))
>> > model.matrix(~0 + Cy5 + sibship)
>>   Cy5ACA Cy5B Cy5N sibship15 sibship16 sibship17
>> 1      0    1    0         0         0         0
>> 2      1    0    0         0         0         0
>> 3      0    0    1         1         0         0
>> 4      1    0    0         1         0         0
>> 5      0    0    1         0         1         0
>> 6      1    0    0         0         1         0
>> 7      0    0    1         0         0         1
>> 8      0    1    0         0         0         1
>> 9      1    0    0         0         0         1
>>
>> Now this is identical to the above, but with three extra columns to 
>> capture the sib-specific means. Note that you could have simply added 
>> the three extra columns for the sibs to the previous model matrix.
>>
>> Also note that your contrast matrix will have to have 6 rows (with the 
>> last three being all zeros).
>>
>> Best,
>>
>> Jim
>>
>>
>>> I do not know how to combine two different designs.
>>> My targets file:
>>> FileName        Cy3     Cy5     SibShip (patient)
>>> 61453.xls       Ref     B       12
>>> 61454.xls       Ref     ACA     12
>>> 61459.xls       Ref     N       15
>>> 61460.xls       Ref     ACA     15
>>> 61461.xls       Ref     N       16
>>> 61462.xls       Ref     ACA     16
>>> 61463.xls       Ref     N       17
>>> 61464.xls       Ref     B       17
>>> 61465.xls       Ref     ACA     17
>>>
>>> I want to identify B vs N, ACA vs N, and ACA vs B.
>>> How can I get a design matrix for this type of design?
>>> This is one of those studies of rare cancers (from 2003).
>>> Unfortunately, this is a public dataset (published in Oncogene) where
>>> the experiments were done using the Stanford Microarray Database.
>>> Thank you in advance.
>>> Adrian.
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> -- 
>> James W. MacDonald, M.S.
>> Biostatistician
>> Hildebrandt Lab
>> 8220D MSRB III
>> 1150 W. Medical Center Drive
>> Ann Arbor MI 48109-0646
>> 734-936-8662
> 
> Jenny Drnevich, Ph.D.
> 
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
> 
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
> 
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662