[R] Question on LIMMA analysis with covariates and some missing data

Wed Dec 3 11:58:32 CET 2014

Hello,

I have a dataset of asthma patients for which white blood cell gene 
expression was measured with one-color Affymetrix microarrays (N ~ 500; 
asthma status is a factor with 4 levels: control, moderate, severe, 
severe & smokers).

I also have an extensive related clinical dataset, but it has many missing 
values (for example, our controls don't have asthma exacerbation counts).

Our goal is to find differentially expressed genes (DEGs) between asthma 
groups, but we suspect that some of those clinical variables influence 
gene expression, so we want to include them as covariates in the model.

Now the question: can LIMMA handle missing data in the covariates and 
produce accurately corrected p-values for the genes?

The model matrix is constructed like so (example with age and sex as 
covariates):

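# Microarray data is in 'data' variable; 'age' and 'sex' are per-sample vectors.
# 'group' is an assumed name for the vector of per-sample asthma labels.
library(limma)
asthma <- factor(group,
                 levels = c("Control", "Moderate", "Severe", "SevereSmokers"))
design <- model.matrix(~0 + asthma + age + sex)
# Strip the "asthma" prefix so the contrasts can use the plain level names
colnames(design) <- sub("^asthma", "", colnames(design))
contrast.matrix <- makeContrasts(Control-Moderate, Control-Severe,
                                 Control-SevereSmokers, levels = design)
fit <- lmFit(data, design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)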
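
To make the missing-data concern concrete, here is a small base-R check on made-up data. Everything in it is hypothetical (the 'pheno' data frame, the sample names S1..S8, the fake 'expr' matrix); it only illustrates that, with the default na.action, model.matrix() silently drops any sample with an NA in a covariate, so I assume the expression matrix would have to be restricted to the same samples before calling lmFit():

```r
# Toy data only: 'pheno', its columns and the sample names are hypothetical.
pheno <- data.frame(
  asthma = factor(c("Control", "Control", "Moderate", "Moderate",
                    "Severe", "Severe", "SevereSmokers", "SevereSmokers")),
  age    = c(34, 41, NA, 52, 47, 60, 38, 55),
  sex    = factor(c("F", "M", "M", "F", NA, "M", "F", "M")),
  row.names = paste0("S", 1:8)
)

# Fake expression matrix: 10 genes x 8 samples, columns named like pheno rows.
set.seed(1)
expr <- matrix(rnorm(10 * 8), nrow = 10,
               dimnames = list(paste0("gene", 1:10), rownames(pheno)))

# With the default options(na.action = "na.omit"), model.matrix() drops
# every sample that has an NA in any covariate used in the formula.
design <- model.matrix(~ 0 + asthma + age + sex, data = pheno)
dim(design)

# Samples that survived, and the ones that were dropped.
kept    <- rownames(design)
dropped <- setdiff(rownames(pheno), kept)

# Subset the expression matrix so its columns match the design rows.
expr_cc <- expr[, kept, drop = FALSE]
stopifnot(ncol(expr_cc) == nrow(design))
```

In this toy case two of the eight samples are lost, one for a missing age and one for a missing sex. With many covariates, each with its own pattern of missing values, the number of complete cases could shrink quickly, which is exactly what worries me.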

Many thanks,

Bertrand
-- 
Bertrand De Meulder
Researcher
European Institute for Systems Biology and Medicine <http://www.eisbm.org>
Campus Charles Mérieux - Université de Lyon
CNRS - UCBL - ENS
E-mail: bdemeulder at eisbm.org
Office: +33(0)4 37 28 74 41

Office:
Université Claude Bernard
3e étage plot 2
50 Avenue Tony Garnier
69366 Lyon cedex 07
France

Laboratory:
LyonBioPôle - Centre d'Infectiologie
2e étage Bât. Domilyon
321 Avenue Jean Jaurès
69007 Lyon
France
