[BioC] Fitting linear model with different design matrix for each gene in limma

Ryan rct at thompsonclan.org
Thu Dec 19 04:05:59 CET 2013


If you think that limma's duplicateCorrelation is the correct method to 
use except for the requirement to use multiple genes, then I would 
recommend you read the source of that function and incorporate its 
approach into your gene-by-gene pipeline. It looks like it uses 
mixedModel2Fit from the statmod package.

On Wed Dec 18 17:30:34 2013, Can Cenik wrote:
> Thanks a lot for the quick answer.
>
> However, lm cannot incorporate the fact that there are multiple measurements of the same subject as well as measurements from different subjects for a given factor level (REF or PLUS). I am leaning towards using a linear mixed effects model incorporating the factor of interest as a fixed effect and subject as a random effect. I think I can use lme from nlme package for this. Would you have any recommendations about which of the two models to use:
>
> fit <- lm(data$expr ~ data$Factor + data$Subject)
> vs
> fit2 <- lme(expr ~ Factor, data, random = ~ 1 | subject)
>
> Thanks again!
>
> ----- Original Message -----
> From: "Ryan" <rct at thompsonclan.org>
> To: "Elif Sarinay [guest]" <guest at bioconductor.org>
> Cc: bioconductor at r-project.org, ccenik at stanford.edu
> Sent: Wednesday, December 18, 2013 4:06:09 PM
> Subject: Re: [BioC] Fitting linear model with different design matrix for each gene in limma
>
> I'm pretty sure the post that you link to gives you your answer: if you
> want to fit a different model for each gene, then you should simply
> call "lm" on each gene individually instead of lmFit. You won't be able
> to get the empirical Bayes squeezing that limma performs, of course.
> However, I can't think of an experimental design where doing this makes
> logical sense. Are you sure you don't need to make a single design
> matrix with one coefficient for each "REF vs PLUS" partitioning, and
> then fit this single design for all genes using lmFit?
>
> On Wed Dec 18 11:55:57 2013, Elif Sarinay [guest] wrote:
>>
>> Hi,
>>
>> I have expression data from a large number of subjects with replicates. I am interested in fitting a linear model to understand the effect of a simple factor with two levels on the expression of each gene. I realize that if my design is as follows:
>> Subject Factor
>> 1       REF
>> 1       REF
>> 2       REF
>> 2       REF
>> 3       REF
>> 3       REF
>> 4       PLUS
>> 4       PLUS
>> 5       PLUS
>> 5       PLUS
>> ...
>>
>> I can fit a linear model with lmFit using
>> design <- model.matrix(~targets$Factor)
>> corfit <- duplicateCorrelation(data,design,block=targets$Subject)
>> lmFit (data, design, weights=weights, block=targets$Subject, correlation=corfit$consensus)
>>
>> However, in my case the design is different for each gene. In other words, the subjects that belong to the "REF" or "PLUS" level of the factor changes for each gene. I am wondering if there is any possibility to include the changing design for each gene.
>>
>> I tried applying lmFit to each gene separately using the correct design with an appropriate apply function. However, I found Dr. Smyth's answer to an earlier question (https://stat.ethz.ch/pipermail/bioconductor/2010-August/034794.html) suggesting that this is not a good alternative.
>>
>>
>> Thanks for any advice
>>
>>    -- output of sessionInfo():
>>
>> R version 2.14.1 (2011-12-22)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] nlme_3.1-103 edgeR_2.4.6  limma_3.10.3
>>
>> loaded via a namespace (and not attached):
>> [1] grid_2.14.1    lattice_0.20-6 sva_3.0.3      tools_2.14.1
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list