[R-sig-ME] Mixed model (with interaction) for gene expression and iteration
Paolo Innocenti
paolo.innocenti at ebc.uu.se
Tue Jun 2 14:53:42 CEST 2009
Dear Douglas and list,
I am thinking about fitting a mixed model for a microarray experiment
using lme4, since other specific software seems not suitable for my
purposes. I'll briefly describe the model and kindly ask for suggestions
on the model and the workflow I can use to get useful results.
My response variable Y is gene expression levels for a given gene (say
g_i) from 120 samples.
The factor I want to include are:
Sex: fixed, two levels, M/F.
Line: 15 randomly picked genotypes from a large outbred population.
I am interested in:
- if the gene is differentially expressed in the 2 sexes (effect of
sex), in the 15 lines (effect of line) and the interaction of the two.
- the variance component of line = how much of the variance is due to
the genotype
- the variance component of the interaction = the genetic variation for
sexual dimorphism.
Reading a bit of this mailing list, I came up with these three models:
m1 <- lmer(Y1 ~ sex + (1|line) + (1|sex:line))
or
m2 <- lmer(Y1 ~ sex + (sex|line))
or
m3 <- lmer(Y1 ~ sex + (0 + sex|line))
Which should all be the same model (and indeed they have all the same
residuals) but different parametrization (see self-contained example
below).
Now, in the light of my needs (see above), which model makes it easier
to extract the components I need? Also, do they make different
assumptions - as different levels of independency among levels of random
factors?
I will need to be able to extract the variance component values in an
iterative process (i have 18.000 genes): is VarCorr() the way to go?
VarCorr(m1)$'sex:line'[1]
VarCorr(m1)$'line'[1]
Last two question: what is the easier way to assess, in an iterative
process, normality of residuals, and what is a sensible way to assess
significant differential expression of genes (since I guess I can't get
p-values and then apply FDR correction?)
Thanks a lot for reading so far and I'll be grateful for any kind of help.
paolo
Self-contained example:
Y1 <- as.numeric(
c("11.6625","11.3243","11.7819","11.5032","11.7578","11.9379","11.8491",
"11.9035","11.2042","11.0344","11.5137","11.1995","11.6327","11.7392",
"11.9869","11.6955","11.5631","11.7159","11.8435","11.5814","12.0756",
"12.3428","12.3342","11.9883","11.6067","11.6102","11.6517","11.4444",
"11.9567","12.0478","11.9683","11.8207","11.5860","11.6028","11.6522",
"11.6775","12.3570","12.2266","12.2444","12.1369","11.2573","11.4577",
"11.4432","11.2994","11.8486","11.9258","11.9864","11.9964","11.2806",
"11.2527","11.3672","11.0791","11.9501","11.7223","11.9825","11.8114",
"11.6116","11.4284","11.3731","11.6942","12.2153","12.0101","12.2150",
"12.1932","11.5617","11.3761","11.4813","11.7503","11.9889","12.1530",
"12.3299","12.4436","11.4566","11.4614","11.5527","11.3513","11.9764",
"11.8810","12.0999","11.9083","11.4870","11.6764","11.3973","11.4507",
"12.1141","11.9906","12.1118","11.9728","11.3382","11.4146","11.4590",
"11.2527","12.1101","12.0448","12.2191","11.8317","11.3982","11.3555",
"11.3897","11.7731","11.9749","11.8666","12.1984","12.0350","11.4642",
"11.4509","11.5552","11.4346","12.0714","11.7136","11.9019","11.8158",
"11.3132","11.3121","11.1612","11.2073","11.6658","11.7879","11.7847",
"11.5300"))
sex <- factor(rep(c("F","M"), 15, each=4))
line <- factor(rep(1:15, each=8))
m1 <- lmer(Y1 ~ sex + (1|line) + (1|sex:line))
m2 <- lmer(Y1 ~ sex + (sex|line))
m3 <- lmer(Y1 ~ sex + (0 + sex|line))
VarCorr(m1)$'sex:line'[1]
VarCorr(m1)$'line'[1]
Output:
>> m1
> Linear mixed model fit by REML
> Formula: Y1 ~ sex + (1 | line) + (1 | sex:line)
> AIC BIC logLik deviance REMLdev
> -91.13 -77.2 50.57 -111.1 -101.1
> Random effects:
> Groups Name Variance Std.Dev.
> sex:line (Intercept) 0.0023237 0.048205
> line (Intercept) 0.0169393 0.130151
> Residual 0.0168238 0.129707
> Number of obs: 120, groups: sex:line, 30; line, 15
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 11.45977 0.03955 289.72
> sexM 0.52992 0.02951 17.96
>
> Correlation of Fixed Effects:
> (Intr)
> sexM -0.373
>> m2
> Linear mixed model fit by REML
> Formula: Y1 ~ sex + (sex | line)
> AIC BIC logLik deviance REMLdev
> -90 -73.27 51 -112.1 -102
> Random effects:
> Groups Name Variance Std.Dev. Corr
> line (Intercept) 0.0152993 0.123691
> sexM 0.0046474 0.068172 0.194
> Residual 0.0168238 0.129707
> Number of obs: 120, groups: line, 15
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 11.45977 0.03606 317.8
> sexM 0.52992 0.02951 18.0
>
> Correlation of Fixed Effects:
> (Intr)
> sexM -0.161
>> m3
> Linear mixed model fit by REML
> Formula: Y1 ~ sex + (0 + sex | line)
> AIC BIC logLik deviance REMLdev
> -90 -73.27 51 -112.1 -102
> Random effects:
> Groups Name Variance Std.Dev. Corr
> line sexF 0.015299 0.12369
> sexM 0.023227 0.15240 0.899
> Residual 0.016824 0.12971
> Number of obs: 120, groups: line, 15
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 11.45977 0.03606 317.8
> sexM 0.52991 0.02951 18.0
>
> Correlation of Fixed Effects:
> (Intr)
> sexM -0.161
--
Paolo Innocenti
Department of Animal Ecology, EBC
Uppsala University
Norbyvägen 18D
75236 Uppsala, Sweden
More information about the R-sig-mixed-models
mailing list