[R-sig-ME] standard error and statistical significance in lmer versus lm
dakotajudo at mac.com
Fri Jan 2 22:23:10 CET 2015
My comments are strictly about the computations, as I understand them, and not about the appropriateness of the models.
First, I want to be clear that I am commenting on your concern about having two different answers. My thinking is that you have two different answers because you are not asking the same question in each case.
What you seem to be asking is, when using either lm or lmer, is the significance genotype:week interaction different?
But what you are really asking is, if I include probeset in the model, is the significance genotype:week interaction different?
That’s because the model specify for lmer includes probeset, but the model you specify for lm does not. These are two different statistical models, and you would expect different inferences about the significance of genotype:week, regardless of which computational engine you use.
I have not doubt that you are not interested in the specific effect of each gene. Nonetheless, by including probeset in the formula you pass to lmer, you are getting estimates for each effect; they’re just not being reported in the summary for lmer objects. IIRC, you would use ranef to view the for each gene by week (that’s a best guess, I’d really have to see the data to be sure about the interpretation)
You can also model probeset as a random effect using lm. Including a random effect in a linear model does not change the nature of the effect, it simply changes how the effects are estimated - in this case, ordinary least squares vs maximum likelihood. The effective difference is how you would compute F values in the resulting AOV table. aov() in R will only run F-tests against residual error; if you have other random effects in the model you need to compute F values manually.
My suggestion is that you run, on a subset of your data, both
aov(lm(values ~ week * genotype))
aov(lm(values ~ week * genotype + week:probesets))
then compare the residual error to the residual error of
lmer(values ~ week * genotype + (week | probesets))
I suspect the second lm model to be very similar to lmer, and that you might rest assured that you are not getting two different answers. I think your concern arises from not comparing like things.
> On Jan 2, 2015, at 12:16 PM, Vincenzo Lagani <vlagani at ics.forth.gr> wrote:
> Dear Peter,
> thanks for your suggestion. Unfortunately, even on my most powerful machine I could not fit the models you suggested: R returns a memory error, due to the inability of allocating ~60GB of memory (more than 18000 probesets in my data).
> Moreover, I am not totally sure if the model you suggested is suitable for my application. While I fully agree with you that it would be interesting to observe what happens when the variability due to the probesets is explicitly modeled, I must say that I am not interested in the specific effect of each single gene. Moreover, the set of genes I am using can be considered as a (more or less) random sample from the set of all mouse genes. Thus, it seems to me that the term "probesets" should be definitely modeled as a random term.
> Thanks again and kind regards,
> On 1/2/2015 4:34 PM, Peter Claussen wrote:
>> If I’m reading this correctly, I don’t think you’re comparing the same models, lm vs lmer
>> Using lm, you have two models
>> values ~ week + genotype
>> values ~ week * genotype
>> Using lmer, you have two different models.
>> week + genotype + (week | probesets)
>> week * genotype + (week | probesets)
>> If you were to include (week:probesets) in your lm models, and compare
>> values ~ week + genotype + week:probesets
>> values ~ week * genotype + week:probesets
>> would that clarify the confusion about the different results?
>>> On Dec 26, 2014, at 1:11 PM, Vincenzo Lagani <vlagani at ics.forth.gr> wrote:
>>> Dear all,
>>> I am modelling gene expression data with mixed models using lme4. My
>>> goal is to assess whether gene expression globally decreases or
>>> increases with time.
>>> Specifically, the data consist in whole expression profiles measured at
>>> five different time points in two different strains of mice. For each
>>> time point and each strain there are three replicates. The data are not
>>> longitudinal, i.e., different mice are used at each time point. We had
>>> to remove one profile from a time point because it was not matching our
>>> quality criteria, so the data became not properly balanced.
>>> On these data I am fitting the following model:
>>> lme4Model.full <- lmer(values ~ week * genotype + (week | probesets),
>>> data = dataset, REML = FALSE)
>>> where 'values' stands for the gene expression, 'week' is a scaled
>>> numeric reporting the age of the mice in weeks, and 'genotype' is a
>>> factor with two levels representing the two different genetic
>>> backgrounds. Each single gene is let having its own random intercept and
>>> What puzzles me is that when I compare this model with its simpler,
>>> non-mixed version:
>>> lmModel.full <- lm(values ~ week * genotype, data = dataset)
>>> I obtain the same coefficients but different standard errors (see
>>> below). Furthermore, while the interaction coefficient is not
>>> significant in the simple linear model, it becomes highly significant in
>>> the mixed model, at least according to these ANOVA tests:
>>> lmModel <- lm(values ~ week + genotype, data = dataset)
>>> lme4Model <- lmer(values ~ week + genotype + (week | probesets), data =
>>> dataset, REML = FALSE)
>>>> anova(lmModel.full, lmModel)
>>> Analysis of Variance Table
>>> Model 1: values ~ week * genotype
>>> Model 2: values ~ week + genotype
>>> Res.Df RSS Df Sum of Sq F Pr(>F)
>>> 1 414341 2162664
>>> 2 414342 2162666 -1 -2.4697 0.4732 0.4915
>>>> anova(lme4Model.full, lme4Model)
>>> Data: dataset
>>> lme4Model: values ~ week + genotype + (week | probesets)
>>> lme4Model.full: values ~ week * genotype + (week | probesets)
>>> Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
>>> lme4Model 7 72376 72452 -36181 72362
>>> lme4Model.full 8 72325 72413 -36155 72309 52.472 1 4.364e-13 ***
>>> Assessing whether the interaction coefficient is significant is actually
>>> the aim of my study, and having two totally different answers confuses
>>> me. My understanding is that the mixed model better catches the variance
>>> structure of the data and thus it is able to better estimate the
>>> standard errors and p-values of the coefficients. Is this correct? In
>>> other words, can I confidently claim that the p-values obtained from the
>>> mixed models are "the correct ones" and that the interaction term is
>>> actually significant?
>>> My apologies if this issue has been already posted on this list. Despite
>>> having seen multiple posts here on similar topics, I have not been able
>>> to find an answer to these questions.
>>> Thanks in advance for your help. Any suggestion is very welcome.
>>> Linear mixed model fit by maximum likelihood ['lmerMod']
>>> Formula: values ~ week * genotype + (week | probesets)
>>> Data: dataset
>>> AIC BIC logLik deviance df.resid
>>> 72325.3 72412.8 -36154.7 72309.3 414337
>>> Scaled residuals:
>>> Min 1Q Median 3Q Max
>>> -13.2622 -0.5246 0.0128 0.5270 23.3456
>>> Random effects:
>>> Groups Name Variance Std.Dev. Corr
>>> probesets (Intercept) 5.167338 2.27318
>>> week 0.005029 0.07092 -0.23
>>> Residual 0.047063 0.21694
>>> Number of obs: 414345, groups: probesets, 18015
>>> Fixed effects:
>>> Estimate Std. Error t value
>>> (Intercept) 6.5727983 0.0169414 388.0
>>> week -0.0151336 0.0006823 -22.2
>>> genotypeCSB 0.2405300 0.0007121 337.8
>>> week:genotypeCSB 0.0050430 0.0006962 7.2
>>> Correlation of Fixed Effects:
>>> (Intr) week gntCSB
>>> week -0.177
>>> genotypeCSB -0.015 -0.028
>>> wk:gntypCSB -0.001 -0.392 -0.054
>>> lm(formula = values ~ week * genotype, data = dataset)
>>> Min 1Q Median 3Q Max
>>> -5.7560 -1.9881 0.0083 1.6823 7.6407
>>> Estimate Std. Error t value Pr(>|t|)
>>> (Intercept) 6.572798 0.004407 1491.384 < 2e-16 ***
>>> week -0.015134 0.004547 -3.329 0.000873 ***
>>> genotypeCSB 0.240530 0.007500 32.072 < 2e-16 ***
>>> week:genotypeCSB 0.005043 0.007331 0.688 0.491537
>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>> Residual standard error: 2.285 on 414341 degrees of freedom
>>> Multiple R-squared: 0.002491, Adjusted R-squared: 0.002484
>>> F-statistic: 344.9 on 3 and 414341 DF, p-value: < 2.2e-16
>>> [[alternative HTML version deleted]]
>>> R-sig-mixed-models at r-project.org mailing list
> R-sig-mixed-models at r-project.org mailing list
More information about the R-sig-mixed-models