[BioC] limma, voom, weights: factorial versus nested interaction results
Reiner Schulz
reiner.schulz at kcl.ac.uk
Fri Jan 4 09:22:20 CET 2013
Thank you very much, Gordon.
Out of curiosity, why did you develop the voom method when there is
edgeR? Are there good, perhaps data-dependent, reasons for using one
over the other?
Best, Reiner
On 03/01/13 23:59, Gordon K Smyth wrote:
> Dear Reiner,
>
> You don't mention contrasts.fit(), but I assume that you are using that
> function to get equivalent quantities from the two design matrices.
>
> If I am guessing your problem correctly, then the issue is that
> contrasts.fit() gives results that are not exact in the presence of voom
> weights. See the warning note on the help page for contrasts.fit.
>
> You will get the exact correct t-statistic if the contrast you want is a
> coefficient of your design matrix. You should also get exact correct
> results when using the one way layout design formula ~0+cond, but you
> may get an approximate test statistic if you use a contrast from the
> nested formula. The fold changes are exact in all cases.
>
> There was a thread about this a few years ago, see:
>
> https://www.stat.math.ethz.ch/pipermail/bioconductor/2009-December/030947.html
>
>
> Basically I have promised to change contrasts.fit() to give exact
> results in all cases, even though this will make it much slower in some
> cases. I have not so far found the time to do it, but will definitely
> do so by the time we publish the voom method.
>
> Best wishes
> Gordon
>
> ---------------------------------------------
> Professor Gordon K Smyth,
> Bioinformatics Division,
> Walter and Eliza Hall Institute of Medical Research,
> 1G Royal Parade, Parkville, Vic 3052, Australia.
> http://www.statsci.org/smyth
>
>
>
>> Date: Wed, 2 Jan 2013 13:31:58 +0100
>> From: Reiner Schulz <reiner.schulz at kcl.ac.uk>
>> To: <bioconductor at r-project.org>
>> Subject: [BioC] limma, voom, weights: factorial versus nested
>> interaction results
>>
>> Dear all,
>>
>> Am analysing an RNA-seq experiment with a two-way factorial design, and
>> am observing different results for two contrasts of interest depending
>> on whether I use a factorial model matrix or a nested interaction model
>> matrix. I only observe this if the data are pre-processed w/ voom:
>>> counts.voom= voom( ...
>> i.e., when there are observational level weights passed to lmFit. If I
>> only pass counts.voom$E (no observational level weights) to lmFit, the
>> results for the two contrasts of interest are identical for the
>> factorial and nested interaction model matrices.
>> Is this to be expected, and if so, what is the explanation?
>>
>> Much appreciated, Reiner
>>
>> PS: some more details ...
>>
>>> strain
>> [1] MUT MUT MUT MUT MUT MUT MUT MUT MUT MUT MUT MUT MUT WT WT WT WT
>> WT WT
>> [20] WT WT WT WT WT WT WT
>> Levels: WT MUT
>>> treat
>> [1] U T2 T1 U T2 T1 U U T2 T1 U T2 T1 U T2 T1 U T2 T1 U U T2
>> T1 U T2
>> [26] T1
>> Levels: U T1 T2
>>
>>> cond= factor( paste( sep='.', strain, treat), lev=c('WT.U', 'MUT.U',
>> 'WT.T1', 'MUT.T1', 'WT.T2', 'MUT.T2'))
>>> design.matrix= model.matrix( ~0 + cond)
>>> design.matrix
>> WT.U MUT.U WT.T1 MUT.T1 WT.T2 MUT.T2
>> 1 0 1 0 0 0 0
>> 2 0 0 0 0 0 1
>> 3 0 0 0 1 0 0
>> 4 0 1 0 0 0 0
>> 5 0 0 0 0 0 1
>> 6 0 0 0 1 0 0
>> 7 0 1 0 0 0 0
>> 8 0 1 0 0 0 0
>> 9 0 0 0 0 0 1
>> 10 0 0 0 1 0 0
>> 11 0 1 0 0 0 0
>> 12 0 0 0 0 0 1
>> 13 0 0 0 1 0 0
>> 14 1 0 0 0 0 0
>> 15 0 0 0 0 1 0
>> 16 0 0 1 0 0 0
>> 17 1 0 0 0 0 0
>> 18 0 0 0 0 1 0
>> 19 0 0 1 0 0 0
>> 20 1 0 0 0 0 0
>> 21 1 0 0 0 0 0
>> 22 0 0 0 0 1 0
>> 23 0 0 1 0 0 0
>> 24 1 0 0 0 0 0
>> 25 0 0 0 0 1 0
>> 26 0 0 1 0 0 0
>>
>>> design.matrix.alt= model.matrix( ~strain + strain:treat)
>>> design.matrix.alt
>> Intercept MUT WT:T1 MUT:T1 WT:T2 MUT:T2
>> 1 1 1 0 0 0 0
>> 2 1 1 0 0 0 1
>> 3 1 1 0 1 0 0
>> 4 1 1 0 0 0 0
>> 5 1 1 0 0 0 1
>> 6 1 1 0 1 0 0
>> 7 1 1 0 0 0 0
>> 8 1 1 0 0 0 0
>> 9 1 1 0 0 0 1
>> 10 1 1 0 1 0 0
>> 11 1 1 0 0 0 0
>> 12 1 1 0 0 0 1
>> 13 1 1 0 1 0 0
>> 14 1 0 0 0 0 0
>> 15 1 0 0 0 1 0
>> 16 1 0 1 0 0 0
>> 17 1 0 0 0 0 0
>> 18 1 0 0 0 1 0
>> 19 1 0 1 0 0 0
>> 20 1 0 0 0 0 0
>> 21 1 0 0 0 0 0
>> 22 1 0 0 0 1 0
>> 23 1 0 1 0 0 0
>> 24 1 0 0 0 0 0
>> 25 1 0 0 0 1 0
>> 26 1 0 1 0 0 0
>>
>> Observed equalities (as expected):
>> MUT == MUT.U - WT.U
>> WT:T1 == WT.T1 - WT.U
>> MUT:T1 == MUT.T1 - MUT.U
>> WT:T2 == WT.T2 - WT.U
>> MUT:T2 == MUT.T2 - MUT.U
>> MUT:T1 - WT:T1 == (MUT.T1 - MUT.U) - (WT.T1 - WT.U)
>> MUT:T2 - WT:T2 == (MUT.T2 - MUT.U) - (WT.T2 - WT.U)
>>
>> Observed inequalities (only w/ voom pre-processed data; unexpected):
>> WT:T1 - WT:T2 != (WT.T1 - WT.U) - (WT.T2 - WT.U)
>> MUT:T1 - MUT:T2 != (MUT.T1 - MUT.U) - (MUT.T2 - MUT.U)
>>
>> --
>>
>
> ______________________________________________________________________
> The information in this email is confidential and intended solely for
> the addressee.
> You must not disclose, forward, print or use it without the permission
> of the sender.
> ______________________________________________________________________
>
--
Reiner Schulz, Ph.D.
Senior Lecturer in Bioinformatics and Epigenomics
Dept. of Medical and Molecular Genetics
King's College London
Guy's Hospital, 8th floor Tower Wing
London SE1 9RT
https://atlas.genetics.kcl.ac.uk/~rschulz
More information about the Bioconductor
mailing list