[BioC] limma design question
James W. MacDonald
jmacdon at med.umich.edu
Tue Nov 25 22:40:34 CET 2008
Hi Jenny,
The way I understand it, the difference is that the way I suggested is
simply a fixed-effects model, where we assume that the variance is
constant for all of the groups.
If you compute the intra-group correlation using duplicateCorrelation(),
you will then fit a mixed linear model that treats the groups as a random
effect, allowing observations within the same group to be correlated.
I don't know if one is more correct than the other. Certainly the fixed
effects model makes more assumptions. I think you can use
duplicateCorrelation() to see if the intra-group correlation is high,
which would argue for fitting a mixed linear model instead.
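
As a rough sketch of the two options discussed above, the R code below fits
both models with limma. The expression matrix `y`, the seed, and the object
names are made up for illustration (simulated data, not the real arrays);
only the Cy5/sibship layout comes from the targets file quoted further down.
It also assumes the statmod package is installed, which
duplicateCorrelation() needs.

```r
library(limma)

set.seed(1)

# Treatment and patient layout, copied from the quoted targets file
Cy5     <- factor(c("B", "ACA", "N", "ACA", "N", "ACA", "N", "B", "ACA"))
sibship <- factor(rep(c(12, 15, 16, 17), c(2, 2, 2, 3)))

# Hypothetical data: 200 genes x 9 arrays, with a patient-specific shift
# so that arrays from the same patient are positively correlated
n_genes <- 200
patient_shift <- matrix(rnorm(n_genes * nlevels(sibship)), nrow = n_genes)
y <- matrix(rnorm(n_genes * 9, sd = 0.5), nrow = n_genes) +
  patient_shift[, as.integer(sibship)]

# Option 1: patient as a fixed effect (extra columns in the design matrix)
design_fixed <- model.matrix(~0 + Cy5 + sibship)
fit_fixed <- lmFit(y, design_fixed)

# Option 2: patient as a random effect; estimate one consensus
# within-patient correlation and pass it to lmFit()
design_block <- model.matrix(~0 + Cy5)
corfit <- duplicateCorrelation(y, design_block, block = sibship)
corfit$consensus.correlation
fit_block <- lmFit(y, design_block, block = sibship,
                   correlation = corfit$consensus.correlation)

# The two fits leave different residual degrees of freedom
fit_fixed$df.residual[1]
fit_block$df.residual[1]
```

With this layout the fixed-effects fit estimates six coefficients from nine
arrays and the blocked fit only three, so the residual degrees of freedom
differ. That is one reason the two approaches do not give identical results.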
Best,
Jim
Jenny Drnevich wrote:
> Hi Jim,
>
> I've seen you suggest this way to account for blocks by fitting extra
> columns in the design matrix before. I'm just wondering how this differs
> from the suggestion in the limma vignette (Section 8.2 Technical
> Replication) to use duplicateCorrelation() to determine the average
> correlation between blocks. I know they are not mathematically
> equivalent; the coefficients for the treatment groups are slightly
> different, they use different DF, and the p-values tend to be larger
> using the duplicateCorrelation() method (at least for the one experiment
> I'm using). So, is one more "correct" than the other? Or are blocks of
> technical replicates different somehow than blocks of patients or cell
> lines, etc.?
>
> Thanks,
> Jenny
>
> At 08:05 AM 11/25/2008, James W. MacDonald wrote:
>> Hi Adrian,
>>
>> Adrian Johnson wrote:
>>> Dear group,
>>> I am sorry to ask a design-related question again. The data are from SMD.
>>> Two or three different samples were obtained from each patient.
>>> Say:
>>> from patient 1 - (A) normal tissue, (B) inflamed tissue and (C)
>>> cancer tissue were extracted;
>>> from patient 2 - only (A) normal tissue and (B) cancer tissue were
>>> extracted, and likewise for the others.
>>> A universal reference sample was hybridized on the green channel.
>>> This is both a paired design and a reference design. The limma manual
>>> describes examples unique to one specific design.
>>
>> Yes, but the 'limma User's Guide' also notes that the reference design
>> is pretty much the same as a one-color analysis, but that you have to
>> account for dye-swaps. Since you don't have dye-swaps, then it _is_
>> the same as a one-color analysis. The only wrinkle here is that you
>> have blocked data (which is also covered in the limma User's Guide).
>>
>> If you had doubts, you could have approached this iteratively. First
>> let's see what limma thinks you should be using:
>>
>> > modelMatrix(targets, ref="Ref")
>> Found unique target names:
>>  ACA B N Ref
>>      ACA B N
>> [1,]   0 1 0
>> [2,]   1 0 0
>> [3,]   0 0 1
>> [4,]   1 0 0
>> [5,]   0 0 1
>> [6,]   1 0 0
>> [7,]   0 0 1
>> [8,]   0 1 0
>> [9,]   1 0 0
>>
>> So this is a pretty simple model matrix, but it doesn't account for
>> the blocks.
>>
>> > Cy5=factor(c("B","ACA","N","ACA","N","ACA","N","B","ACA"))
>> > sibship=factor(rep(c(12,15,16,17), c(2,2,2,3)))
>> > model.matrix(~0 + Cy5 + sibship)
>>   Cy5ACA Cy5B Cy5N sibship15 sibship16 sibship17
>> 1      0    1    0         0         0         0
>> 2      1    0    0         0         0         0
>> 3      0    0    1         1         0         0
>> 4      1    0    0         1         0         0
>> 5      0    0    1         0         1         0
>> 6      1    0    0         0         1         0
>> 7      0    0    1         0         0         1
>> 8      0    1    0         0         0         1
>> 9      1    0    0         0         0         1
>>
>> Now this is identical to the above, but with three extra columns to
>> capture the sib-specific means. Note that you could have simply added
>> the three extra columns for the sibs to the previous model matrix.
>>
>> Also note that your contrast matrix will have to have 6 rows (with the
>> last three being all zeros).
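
A minimal sketch of that contrast matrix, for the three comparisons asked
about in the original question (BvsN, ACAvsN, ACAvsB). The short column
names and the object names `design`, `design2` and `contrast.matrix` are
choices made here for illustration, not part of the original messages.

```r
library(limma)

Cy5     <- factor(c("B", "ACA", "N", "ACA", "N", "ACA", "N", "B", "ACA"))
sibship <- factor(rep(c(12, 15, 16, 17), c(2, 2, 2, 3)))

# Design with the three extra sibship columns
design <- model.matrix(~0 + Cy5 + sibship)

# Same matrix, built by appending the sibship columns to the
# treatment-only design (dropping the sibship intercept column)
design2 <- cbind(model.matrix(~0 + Cy5), model.matrix(~sibship)[, -1])
all(unname(design) == unname(design2))

# Short column names that are easy to use in makeContrasts() expressions
colnames(design) <- c("ACA", "B", "N", "sib15", "sib16", "sib17")

# Contrasts involve only the treatment columns; the sibship rows stay zero
contrast.matrix <- makeContrasts(
  BvsN   = B - N,
  ACAvsN = ACA - N,
  ACAvsB = ACA - B,
  levels = design
)
contrast.matrix
```

The result has one row per design column (six in all) and one column per
comparison, and can be passed to contrasts.fit() after lmFit().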
>>
>> Best,
>>
>> Jim
>>
>>
>>> I do not know how to combine two different designs.
>>> My targets file:
>>> FileName   Cy3  Cy5  SibShip (patient)
>>> 61453.xls  Ref  B    12
>>> 61454.xls  Ref  ACA  12
>>> 61459.xls  Ref  N    15
>>> 61460.xls  Ref  ACA  15
>>> 61461.xls  Ref  N    16
>>> 61462.xls  Ref  ACA  16
>>> 61463.xls  Ref  N    17
>>> 61464.xls  Ref  B    17
>>> 61465.xls  Ref  ACA  17
>>>
>>> I want to identify BvsN, ACAvsN, ACAvsB.
>>> How could I get a design matrix for this type of design?
>>> This is one of those studies where rare cancers were studied (in
>>> 2003).
>>> Unfortunately, this is a public dataset (published in Oncogene) where
>>> the experiments were done using the Stanford Microarray Database.
>>> Thank you in advance.
>>> Adrian.
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Hildebrandt Lab
>> 8220D MSRB III
>> 1150 W. Medical Center Drive
>> Ann Arbor MI 48109-0646
>> 734-936-8662
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
>
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu