[R-sig-ME] correlated random effects in lmer and false convergence
Douglas Bates
bates at stat.wisc.edu
Tue Jun 22 04:42:45 CEST 2010
On Mon, Jun 21, 2010 at 5:04 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> A quick response as I must catch a bus soon.
>
> On Mon, Jun 21, 2010 at 4:18 PM, Ailene Kane <ailene at u.washington.edu> wrote:
>> Dear Dr. Bates and/or other expert lmer users:
>>
>> We are having problems fitting a linear mixed effects model using the lmer()
>> function in the lme4 package, and are hoping that you will be able (and
>> willing!) to answer some of our questions. We have read several chapters in
>> your book on lmer, and other online resources which have helped resolve many
>> of our questions, but our data set is pretty complex and we wanted to make
>> very sure that we are interpreting the output (and some error messages that
>> we get) correctly. Thank you so much for your time!
>>
>> As background, we are exploring the influence of climatic factors (e.g.
>> snow, temperature, etc) on annual tree growth (“rwi”= ring width index, a
>> detrended index for tree growth that ranges from ~-1 to +2). We are
>> evaluating the fit of 29 different models, including various climate
>> variables and combinations of climate variables as fixed effects (e.g. in
>> the model below, GST=Growing season Temperature, and GPT=growing season
>> precipitation). Note: the climate variables have been standardized by
>> subtracting the mean and dividing by the standard deviation. There are 2
>> random effects: “ind” (the individual tree; 20 trees were cored at each
>> location and they differ in sensitivity to climate) and “yr” (the year of
>> tree growth - this is sort of like a block effect, because you could have
>> years with similar temperature values). A subset of our dataset is attached
>> as a .csv, and some sample code and output is listed below (and in the
>> attached file).
>>
>> We have the following questions:
>>
>> 1) We originally thought that test13 and test13b were the same thing:
>>
>> test13<-lmer(rwi~GST*GPT+(0+GST*GPT|ind)+(1|yrs),control=list(maxIter=5000))
>> test13b<-lmer(rwi~GST*GPT+(0+GST|ind)+(0+GPT|ind)+(0+GST:GPT|ind)+(1|yrs),control=list(maxIter=5000))
>
> No, they are not the same as you have seen in your later
> investigation. The first model generates a vector-valued random
> effect for each level of ind with possible correlation in the
> variance-covariance matrix. In the second model the components of the
> random effects are independent.
>
> It seems to me that there is very little variation associated with the
> levels of ind. I enclose a transcript of fitting a series of models
> ending up with one that doesn't have any random effects for ind and it
> seems to compare well to the others.
>
> I must go to the bus. I'll explain about the model-building strategy later.
>> However, we recently noticed that they result in differences in the
>> correlations between random effects, and different AIC values (output is
>> below and can be generated by the code below, if you are interested). In
>> test13, the random effects on ind for GST, GPT, and their interaction are
>> perfectly correlated (1.00). In test13b the random effects are NOT
>> PERFECTLY correlated (r= 0.6349112). Does this mean that test13 ASSUMES
>> correlated random effects (because of the way that it is coded)? We
>> definitely don't want this and wanted to make sure that we understood
>> correctly how to code the model appropriately. We are comparing the 29
>> potential models to a null model and using the delta AIC (in part) to
>> evaluate the best-fit model. The null model is coded as follows:
When describing the distribution of the random effects you must be
careful to distinguish the unconditional distribution and the
conditional distribution, given the data and the values of the model
parameters. The model is described in terms of the unconditional
distribution whereas the values provided by the ranef extractor are
from the conditional distribution.
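For example, with your fitted object (a minimal sketch; VarCorr and
ranef are the standard lme4 extractors for the two distributions):

VarCorr(test13)     ## parameters of the unconditional distribution
str(ranef(test13))  ## conditional modes, given the data and the estimates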
The unconditional distribution is a multivariate Gaussian (or
"normal") distribution with mean zero and a parameterized
variance-covariance matrix. That matrix is generated from the
random-effects terms in the model formula. Each such term is of the
form (expr|grp) where expr is a linear model expression and grp is the
"grouping factor". This is usually the name of a factor like ind or
yr but could also be an expression, as long as it evaluates to a
factor.
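For instance (a sketch only; 'site' and 'treedat' are hypothetical
names, not objects from your data):

library(lme4)
## grouping factor named directly
lmer(rwi ~ GST + (1 | ind) + (1 | yrs), data = treedat)
## grouping factor given as an expression (an interaction of factors)
## that evaluates to a factor
lmer(rwi ~ GST + (1 | site:ind) + (1 | yrs), data = treedat)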
The variance-covariance matrix of the unconditional distribution is
block diagonal with one block for each term. Thus the random effects
for different terms are independent in this distribution, even when
their grouping factors are the same. When there is more than one
random effect per level of the factor, as in the term (0 + GST * GPT |
ind), which generates 3 random effects for each individual, these
random effects can be correlated in the unconditional distribution.
Thus that term requires 6 variance-covariance parameters (3 variances
and 3 covariances) to be estimated, while the other specification,
(0 + GST|ind) + (0 + GPT|ind) + (0 + GST:GPT|ind), requires only 3
variance parameters to be estimated.
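Concretely (a sketch, with 'treedat' standing in for your attached
data set):

library(lme4)
## correlated specification: 3 variances + 3 covariances = 6 parameters
fit.corr <- lmer(rwi ~ GST * GPT + (0 + GST * GPT | ind) + (1 | yrs),
                 data = treedat)
## independent specification: 3 variance parameters only
fit.indep <- lmer(rwi ~ GST * GPT + (0 + GST | ind) + (0 + GPT | ind) +
                  (0 + GST:GPT | ind) + (1 | yrs), data = treedat)
VarCorr(fit.corr)   ## one 3 x 3 block for ind, with Corr columns
VarCorr(fit.indep)  ## three separate 1 x 1 blocks for ind, no correlations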
The maximum likelihood or REML estimates seek a balance between
complexity of the model and fidelity to the data. Given the way that
the complexity of the model is measured, a model with a singular
variance-covariance matrix is considered ideal. If the random effects
don't have much explanatory power, as in this model, the variance
parameters will be driven to zero or the correlations to +1 or -1.
Here the random effects with respect to ind don't contribute that much
to the model fit and you are better off without them.
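You can see the symptom directly in your test13 fit (a sketch; the
correlation matrix is stored as an attribute of each block returned
by VarCorr):

vc <- VarCorr(test13)
vc                           ## the Corr entries of 1.000 for ind are the symptom
attr(vc$ind, "correlation")  ## the same correlation matrix on its own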
There is always a question of how to go about building a statistical
model for your data. Sometimes we take a forward selection approach,
starting with a simple model and adding terms as indicated and testing
whether they have sufficient explanatory power to warrant retaining
them. Sometimes we start with a complex model and eliminate terms
that seem redundant. I have noticed a tendency for those trained in
areas like ecology, forestry and agriculture to start with the complex
model and eliminate terms that are not contributing. This works fine
in some situations like well-balanced agricultural field trials. You
can fit very complex models to balanced designs because the
contributions of various terms lie in orthogonal subspaces of the
response space, resulting in independent estimators for their
coefficients. Observational data from fields like ecology are often
highly imbalanced, making it difficult to fit the complex model. In
those cases it is probably better to start simple and build up the
model rather than trying to fit all possible terms and see which ones
make the cut. Of course, you may not be able to convince the referees
on your paper that this approach makes sense.
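As one concrete step of the simpler, forward approach (a sketch;
'treedat' again stands in for your data, and note that the
likelihood-ratio p-value is conservative when a variance is tested on
its boundary of zero):

library(lme4)
## fit with and without the ind random effects, using ML (REML = FALSE)
## so that the log-likelihoods are comparable
m0 <- lmer(rwi ~ GST * GPT + (1 | yrs), data = treedat, REML = FALSE)
m1 <- lmer(rwi ~ GST * GPT + (0 + GST | ind) + (0 + GPT | ind) +
           (0 + GST:GPT | ind) + (1 | yrs), data = treedat, REML = FALSE)
anova(m0, m1)   ## likelihood-ratio test and AIC/BIC comparison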
>> test0<-lmer(rwi~1+(0+1|ind)+(1|yrs),control=list(maxIter=5000))
>>
>> 2) We have been getting a warning message for some models (for example,
>> test13): “Warning message: In mer_finalize(ans) : singular convergence (7).”
>> We have explored our data and get this error message only for complex
>> models (e.g. 2 climate variables with an interaction), and mainly at
>> locations where tree growth (of individual trees) is not correlated to any
>> of our climate variables. Do you think this interpretation makes sense, in
>> which case we can conclude that the model fits poorly, indicating that other
>> factors besides climate influence growth? Or, is there something else that
>> you think we should do to avoid singular convergence? Again, the output
>> generated by the code (test13b) is below
That message about singular convergence comes from the nlminb
optimizer that is used in the released versions of lme4. In the
development version, called lme4a and available only through R-forge
in source form, I have switched to another optimizer, called bobyqa,
because it seems to provide better results and is generally
faster than nlminb. As you indicate, the convergence problems are
characteristic of complex models and often indicate a model that is
more complex than is warranted by the data.
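In current CRAN releases of lme4 (version 1.0 and later, which
postdate this exchange) the optimizer can be selected directly; a
sketch, with 'treedat' again standing in for the data:

library(lme4)
test13 <- lmer(rwi ~ GST * GPT + (0 + GST * GPT | ind) + (1 | yrs),
               data = treedat,
               control = lmerControl(optimizer = "bobyqa"))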
>> and attached, in case you are interested.
>> Thank you so much for your help!
>>
>> Best wishes,
>>
>> Ailene
>>
>> R OUTPUT FROM ATTACHED CODE:
>>> summary(test13)
>> Linear mixed model fit by REML
>> Formula: rwi ~ GST * GPT + (0 + GST * GPT | ind) + (1 | yrs)
>> AIC BIC logLik deviance REMLdev
>> -528.4 -463 276.2 -577 -552.4
>> Random effects:
>> Groups Name Variance Std.Dev. Corr
>> yrs (Intercept) 0.02764111 0.166256
>> ind GST 0.00033501 0.018303
>> GPT 0.00029068 0.017049 1.000
>> GST:GPT 0.00023243 0.015246 1.000 1.000
>> Residual 0.03587474 0.189406
>> Number of obs: 1727, groups: yrs, 94; ind, 20
>> Fixed effects:
>> Estimate Std. Error t value
>> (Intercept) 0.999112 0.018777 53.21
>> GST 0.071152 0.019373 3.67
>> GPT 0.036372 0.019266 1.89
>> GST:GPT -0.004302 0.019494 -0.22
>> Correlation of Fixed Effects:
>> (Intr) GST GPT
>> GST 0.033
>> GPT -0.024 0.340
>> GST:GPT 0.320 0.140 -0.032
>>> summary(test13b)
>> Linear mixed model fit by REML
>> Formula: rwi ~ GST * GPT + (0 + GST | ind) + (0 + GPT | ind) + (0 + GST:GPT
>> | ind) + (1 | yrs)
>> AIC BIC logLik deviance REMLdev
>> -522.7 -473.6 270.4 -565.3 -540.7
>> Random effects:
>> Groups Name Variance Std.Dev.
>> yrs (Intercept) 2.7626e-02 0.1662108
>> ind GST:GPT 8.7606e-05 0.0093598
>> ind GPT 3.2627e-05 0.0057120
>> ind GST 2.3199e-05 0.0048165
>> Residual 3.6397e-02 0.1907813
>> Number of obs: 1727, groups: yrs, 94; ind, 20
>> Fixed effects:
>> Estimate Std. Error t value
>> (Intercept) 0.999101 0.018778 53.21
>> GST 0.071329 0.018967 3.76
>> GPT 0.036529 0.018928 1.93
>> GST:GPT -0.004107 0.019310 -0.21
>> Correlation of Fixed Effects:
>> (Intr) GST GPT
>> GST 0.034
>> GPT -0.024 0.310
>> GST:GPT 0.323 0.106 -0.069
>> Ailene Kane Ettinger
>> PhD Candidate
>> Biology Department
>> University of Washington
>> Box 351800
>> Seattle, Washington 98195-1800
>> ailene at u.washington.edu